Skip to content
Projects
Groups
Snippets
Help
Loading...
Help
Submit feedback
Contribute to GitLab
Sign in
Toggle navigation
haskell-gargantext
Project
Project
Details
Activity
Releases
Cycle Analytics
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Charts
Issues
147
Issues
147
List
Board
Labels
Milestones
Merge Requests
10
Merge Requests
10
CI / CD
CI / CD
Pipelines
Jobs
Schedules
Charts
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Charts
Create a new issue
Jobs
Commits
Issue Boards
Open sidebar
gargantext
haskell-gargantext
Commits
ab1b7497
Unverified
Commit
ab1b7497
authored
Jun 06, 2019
by
Nicolas Pouillard
Browse files
Options
Browse Files
Download
Email Patches
Plain Diff
Eleve...
parent
da8e963b
Pipeline
#440
canceled with stage
Changes
1
Pipelines
1
Show whitespace changes
Inline
Side-by-side
Showing
1 changed file
with
38 additions
and
19 deletions
+38
-19
Eleve.hs
src/Gargantext/Text/Eleve.hs
+38
-19
No files found.
src/Gargantext/Text/Eleve.hs
View file @
ab1b7497
...
...
@@ -80,8 +80,8 @@ makeLenses ''I
type
ModEntropy
i
o
e
=
(
e
->
e
)
->
i
->
o
set
NormEntrop
y
::
ModEntropy
e
(
I
e
)
e
set
NormEntrop
y
f
e
=
I
e
(
f
e
)
set
_autonom
y
::
ModEntropy
e
(
I
e
)
e
set
_autonom
y
f
e
=
I
e
(
f
e
)
data
StartStop
=
Start
|
Stop
deriving
(
Ord
,
Eq
,
Show
)
...
...
@@ -94,14 +94,20 @@ isTerminal :: Token -> Bool
isTerminal
(
Terminal
_
)
=
True
isTerminal
(
NonTerminal
_
)
=
False
toToken
::
Int
->
[
Text
]
->
[
Token
]
toToken
n
xs
=
Terminal
Start
:
(
NonTerminal
<$>
xs
)
<>
[
Terminal
Stop
]
parseToken
::
Text
->
Token
parseToken
"<start>"
=
Terminal
Start
parseToken
"<stop>"
=
Terminal
Stop
parseToken
t
=
NonTerminal
t
unToken
::
[
Token
]
->
[
Text
]
unToken
=
map
f
toToken
::
[
Text
]
->
[
Token
]
toToken
xs
=
Terminal
Start
:
(
NonTerminal
<$>
xs
)
<>
[
Terminal
Stop
]
printToken
::
Token
->
Text
printToken
=
f
where
f
(
NonTerminal
x
)
=
x
f
(
Terminal
_
)
=
""
f
(
Terminal
Start
)
=
"<start>"
f
(
Terminal
Stop
)
=
"<stop>"
------------------------------------------------------------------------
...
...
@@ -318,15 +324,18 @@ split inE t0 ts =
mainEleve
::
Int
->
[[
Text
]]
->
[[[
Text
]]]
mainEleve
_
_
=
[]
{-
mainEleve n input = map
unToken
. split identity (t :: Trie Token Double) <$> inp
mainEleve n input = map
(map printToken)
. split identity (t :: Trie Token Double) <$> inp
where
inp = toToken
(n - 1)
<$> input
inp = toToken <$> input
t = buildTrie $ L.concat $ chunkAlong n 1 <$> inp
-}
sim
::
Entropy
e
=>
e
->
e
->
Bool
sim
x
y
=
x
==
y
||
(
P
.
isNaN
x
&&
P
.
isNaN
y
)
chunkAlongEleve
::
Int
->
[
a
]
->
[[
a
]]
chunkAlongEleve
n
xs
=
L
.
take
n
<$>
L
.
tails
xs
testEleve
::
e
~
Double
=>
Bool
->
Int
->
[
Text
]
->
[(
Text
,
Int
,
e
,
e
,
e
,
e
,
e
)]
->
IO
Bool
testEleve
debug
n
output
checks
=
do
let
...
...
@@ -339,9 +348,9 @@ testEleve debug n output checks = do
]
]
-}
--res = map
unToken
. split identity fwd <$> inp
--res = map
unToken
. split info_norm_entropy' nt' <$> inp
res
=
map
unToken
.
split
info_autonomy
nt
<$>
inp
--res = map
(map printToken)
. split identity fwd <$> inp
--res = map
(map printToken)
. split info_norm_entropy' nt' <$> inp
res
=
map
(
map
printToken
)
.
split
info_autonomy
nt
<$>
inp
when
debug
$
do
P
.
putStrLn
(
show
input
)
-- mapM_ (P.putStrLn . show) pss
...
...
@@ -357,11 +366,11 @@ testEleve debug n output checks = do
out
=
T
.
words
<$>
output
expected
=
fmap
(
T
.
splitOn
"-"
)
<$>
out
input
=
(
T
.
splitOn
"-"
=<<
)
<$>
out
inp
=
toToken
(
n
-
1
)
<$>
input
t
=
buildTrie
$
L
.
concat
$
chunkAlong
(
n
+
1
)
1
<$>
inp
-- nt = normalizeEntropy identity set
NormEntrop
y (fwd :: Trie Token Double)
inp
=
toToken
<$>
input
t
=
buildTrie
$
L
.
concat
$
chunkAlong
Eleve
(
n
+
2
)
<$>
inp
-- nt = normalizeEntropy identity set
_autonom
y (fwd :: Trie Token Double)
-- nt = normalizeEntropy' info_entropy (\f -> info_norm_entropy' %~ f) nt
nt
=
normalizeEntropy
identity
set
NormEntrop
y
t
nt
=
normalizeEntropy
identity
set
_autonom
y
t
check
f
msg
x
y
=
if
f
x
y
...
...
@@ -369,7 +378,7 @@ testEleve debug n output checks = do
else
P
.
putStrLn
$
" FAIL "
<>
msg
<>
" "
<>
show
x
<>
" /= "
<>
show
y
checker
(
ngram
,
count
,
entropy
,
_ev
,
autonomy
,
bwd_entropy
,
fwd_entropy
)
=
do
let
ns
=
NonTerminal
<$>
T
.
words
ngram
let
ns
=
parseToken
<$>
T
.
words
ngram
t'
=
findTrie
ns
nt
P
.
putStrLn
$
" "
<>
T
.
unpack
ngram
<>
":"
check
(
==
)
"count"
count
(
_node_count
(
_fwd
t'
))
...
...
@@ -400,7 +409,17 @@ example6 = ["le-petit chat"
checks0
,
checks2
::
[(
Text
,
Int
,
Double
,
Double
,
Double
,
Double
,
Double
)]
checks0
=
[(
"
\
ue
02
b New"
,
1
,
nan
,
nan
,
nan
,
nan
,
0.0
),
(
"New York"
,
3
,
1.584962500721156
,
1.584962500721156
,
1.414213562373095
,
nan
,
1.584962500721156
),
(
"York is"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
),
(
"is New"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
),
(
"New York"
,
3
,
1.584962500721156
,
1.584962500721156
,
1.414213562373095
,
nan
,
1.584962500721156
),
(
"York and"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
),
(
"and New"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
),
(
"New York"
,
3
,
1.584962500721156
,
1.584962500721156
,
1.414213562373095
,
nan
,
1.584962500721156
),
(
"York
\
ue
02
d"
,
1
,
nan
,
nan
,
nan
,
nan
,
nan
)]
checks0
=
[(
"<start> New"
,
1
,
nan
,
nan
,
nan
,
nan
,
0.0
)
,(
"New York"
,
3
,
1.584962500721156
,
1.584962500721156
,
1.414213562373095
,
nan
,
1.584962500721156
)
,(
"York is"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
)
,(
"is New"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
)
,(
"New York"
,
3
,
1.584962500721156
,
1.584962500721156
,
1.414213562373095
,
nan
,
1.584962500721156
)
,(
"York and"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
)
,(
"and New"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
)
,(
"New York"
,
3
,
1.584962500721156
,
1.584962500721156
,
1.414213562373095
,
nan
,
1.584962500721156
)
,(
"York <stop>"
,
1
,
nan
,
nan
,
nan
,
nan
,
nan
)
]
checks2
=
[(
"to be"
,
3
,
1.2516291673878228
,
1.2516291673878228
,
1.5535694744293167
,
nan
,
0.9182958340544896
)
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment