Skip to content
Projects
Groups
Snippets
Help
Loading...
Help
Submit feedback
Contribute to GitLab
Sign in
Toggle navigation
H
haskell-gargantext
Project
Project
Details
Activity
Releases
Cycle Analytics
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Charts
Issues
0
Issues
0
List
Board
Labels
Milestones
Merge Requests
0
Merge Requests
0
CI / CD
CI / CD
Pipelines
Jobs
Schedules
Charts
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Charts
Create a new issue
Jobs
Commits
Issue Boards
Open sidebar
Przemyslaw Kaminski
haskell-gargantext
Commits
ab1b7497
Unverified
Commit
ab1b7497
authored
Jun 06, 2019
by
Nicolas Pouillard
Browse files
Options
Browse Files
Download
Email Patches
Plain Diff
Eleve...
parent
da8e963b
Changes
1
Hide whitespace changes
Inline
Side-by-side
Showing
1 changed file
with
38 additions
and
19 deletions
+38
-19
Eleve.hs
src/Gargantext/Text/Eleve.hs
+38
-19
No files found.
src/Gargantext/Text/Eleve.hs
View file @
ab1b7497
...
...
@@ -80,8 +80,8 @@ makeLenses ''I
type
ModEntropy
i
o
e
=
(
e
->
e
)
->
i
->
o
set
NormEntrop
y
::
ModEntropy
e
(
I
e
)
e
set
NormEntrop
y
f
e
=
I
e
(
f
e
)
set
_autonom
y
::
ModEntropy
e
(
I
e
)
e
set
_autonom
y
f
e
=
I
e
(
f
e
)
data
StartStop
=
Start
|
Stop
deriving
(
Ord
,
Eq
,
Show
)
...
...
@@ -94,14 +94,20 @@ isTerminal :: Token -> Bool
isTerminal
(
Terminal
_
)
=
True
isTerminal
(
NonTerminal
_
)
=
False
toToken
::
Int
->
[
Text
]
->
[
Token
]
toToken
n
xs
=
Terminal
Start
:
(
NonTerminal
<$>
xs
)
<>
[
Terminal
Stop
]
parseToken
::
Text
->
Token
parseToken
"<start>"
=
Terminal
Start
parseToken
"<stop>"
=
Terminal
Stop
parseToken
t
=
NonTerminal
t
unToken
::
[
Token
]
->
[
Text
]
unToken
=
map
f
toToken
::
[
Text
]
->
[
Token
]
toToken
xs
=
Terminal
Start
:
(
NonTerminal
<$>
xs
)
<>
[
Terminal
Stop
]
printToken
::
Token
->
Text
printToken
=
f
where
f
(
NonTerminal
x
)
=
x
f
(
Terminal
_
)
=
""
f
(
NonTerminal
x
)
=
x
f
(
Terminal
Start
)
=
"<start>"
f
(
Terminal
Stop
)
=
"<stop>"
------------------------------------------------------------------------
...
...
@@ -318,15 +324,18 @@ split inE t0 ts =
mainEleve
::
Int
->
[[
Text
]]
->
[[[
Text
]]]
mainEleve
_
_
=
[]
{-
mainEleve n input = map
unToken
. split identity (t :: Trie Token Double) <$> inp
mainEleve n input = map
(map printToken)
. split identity (t :: Trie Token Double) <$> inp
where
inp = toToken
(n - 1)
<$> input
inp = toToken <$> input
t = buildTrie $ L.concat $ chunkAlong n 1 <$> inp
-}
sim
::
Entropy
e
=>
e
->
e
->
Bool
sim
x
y
=
x
==
y
||
(
P
.
isNaN
x
&&
P
.
isNaN
y
)
chunkAlongEleve
::
Int
->
[
a
]
->
[[
a
]]
chunkAlongEleve
n
xs
=
L
.
take
n
<$>
L
.
tails
xs
testEleve
::
e
~
Double
=>
Bool
->
Int
->
[
Text
]
->
[(
Text
,
Int
,
e
,
e
,
e
,
e
,
e
)]
->
IO
Bool
testEleve
debug
n
output
checks
=
do
let
...
...
@@ -339,9 +348,9 @@ testEleve debug n output checks = do
]
]
-}
--res = map
unToken
. split identity fwd <$> inp
--res = map
unToken
. split info_norm_entropy' nt' <$> inp
res
=
map
unToken
.
split
info_autonomy
nt
<$>
inp
--res = map
(map printToken)
. split identity fwd <$> inp
--res = map
(map printToken)
. split info_norm_entropy' nt' <$> inp
res
=
map
(
map
printToken
)
.
split
info_autonomy
nt
<$>
inp
when
debug
$
do
P
.
putStrLn
(
show
input
)
-- mapM_ (P.putStrLn . show) pss
...
...
@@ -357,11 +366,11 @@ testEleve debug n output checks = do
out
=
T
.
words
<$>
output
expected
=
fmap
(
T
.
splitOn
"-"
)
<$>
out
input
=
(
T
.
splitOn
"-"
=<<
)
<$>
out
inp
=
toToken
(
n
-
1
)
<$>
input
t
=
buildTrie
$
L
.
concat
$
chunkAlong
(
n
+
1
)
1
<$>
inp
-- nt = normalizeEntropy identity set
NormEntrop
y (fwd :: Trie Token Double)
inp
=
toToken
<$>
input
t
=
buildTrie
$
L
.
concat
$
chunkAlong
Eleve
(
n
+
2
)
<$>
inp
-- nt = normalizeEntropy identity set
_autonom
y (fwd :: Trie Token Double)
-- nt = normalizeEntropy' info_entropy (\f -> info_norm_entropy' %~ f) nt
nt
=
normalizeEntropy
identity
set
NormEntrop
y
t
nt
=
normalizeEntropy
identity
set
_autonom
y
t
check
f
msg
x
y
=
if
f
x
y
...
...
@@ -369,7 +378,7 @@ testEleve debug n output checks = do
else
P
.
putStrLn
$
" FAIL "
<>
msg
<>
" "
<>
show
x
<>
" /= "
<>
show
y
checker
(
ngram
,
count
,
entropy
,
_ev
,
autonomy
,
bwd_entropy
,
fwd_entropy
)
=
do
let
ns
=
NonTerminal
<$>
T
.
words
ngram
let
ns
=
parseToken
<$>
T
.
words
ngram
t'
=
findTrie
ns
nt
P
.
putStrLn
$
" "
<>
T
.
unpack
ngram
<>
":"
check
(
==
)
"count"
count
(
_node_count
(
_fwd
t'
))
...
...
@@ -400,7 +409,17 @@ example6 = ["le-petit chat"
checks0
,
checks2
::
[(
Text
,
Int
,
Double
,
Double
,
Double
,
Double
,
Double
)]
checks0
=
[(
"
\
ue
02
b New"
,
1
,
nan
,
nan
,
nan
,
nan
,
0.0
),
(
"New York"
,
3
,
1.584962500721156
,
1.584962500721156
,
1.414213562373095
,
nan
,
1.584962500721156
),
(
"York is"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
),
(
"is New"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
),
(
"New York"
,
3
,
1.584962500721156
,
1.584962500721156
,
1.414213562373095
,
nan
,
1.584962500721156
),
(
"York and"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
),
(
"and New"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
),
(
"New York"
,
3
,
1.584962500721156
,
1.584962500721156
,
1.414213562373095
,
nan
,
1.584962500721156
),
(
"York
\
ue
02
d"
,
1
,
nan
,
nan
,
nan
,
nan
,
nan
)]
checks0
=
[(
"<start> New"
,
1
,
nan
,
nan
,
nan
,
nan
,
0.0
)
,(
"New York"
,
3
,
1.584962500721156
,
1.584962500721156
,
1.414213562373095
,
nan
,
1.584962500721156
)
,(
"York is"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
)
,(
"is New"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
)
,(
"New York"
,
3
,
1.584962500721156
,
1.584962500721156
,
1.414213562373095
,
nan
,
1.584962500721156
)
,(
"York and"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
)
,(
"and New"
,
1
,
0.0
,
nan
,
nan
,
nan
,
0.0
)
,(
"New York"
,
3
,
1.584962500721156
,
1.584962500721156
,
1.414213562373095
,
nan
,
1.584962500721156
)
,(
"York <stop>"
,
1
,
nan
,
nan
,
nan
,
nan
,
nan
)
]
checks2
=
[(
"to be"
,
3
,
1.2516291673878228
,
1.2516291673878228
,
1.5535694744293167
,
nan
,
0.9182958340544896
)
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment