Przemyslaw Kaminski / haskell-gargantext
Commit 14246fa5, authored Mar 22, 2022 by Alexandre Delanoë
[FIX] NLP API + group revert
parent 996fd394
Showing 4 changed files with 56 additions and 40 deletions
src/Gargantext/Core/Text/Terms/Multi.hs        +17 -8
src/Gargantext/Core/Text/Terms/Multi/Group.hs  +2 -1
src/Gargantext/Core/Types.hs                   +31 -25
src/Gargantext/Utils/JohnSnowNLP.hs            +6 -6
src/Gargantext/Core/Text/Terms/Multi.hs
@@ -12,7 +12,7 @@ Multi-terms are ngrams where n > 1.
 -}
 
-module Gargantext.Core.Text.Terms.Multi (multiterms, multiterms_rake)
+module Gargantext.Core.Text.Terms.Multi (multiterms, multiterms_rake, tokenTagsWith)
   where
 
 import Data.Text hiding (map, group, filter, concat)
@@ -28,6 +28,11 @@ import qualified Gargantext.Core.Text.Terms.Multi.Lang.En as En
 import qualified Gargantext.Core.Text.Terms.Multi.Lang.Fr as Fr
 
 import Gargantext.Core.Text.Terms.Multi.RAKE (multiterms_rake)
+import qualified Gargantext.Utils.JohnSnowNLP as JohnSnow
+
+-------------------------------------------------------------------
+type NLP_API = Lang -> Text -> IO PosSentences
 
 -------------------------------------------------------------------
 -- To be removed
@@ -37,21 +42,25 @@ multiterms = multiterms' tokenTag2terms
 multiterms' :: (TokenTag -> a) -> Lang -> Text -> IO [a]
 multiterms' f lang txt = concat
                    <$> map (map f)
-                   <$> map (filter (\t -> _my_token_pos t == Just NP))
+                   <$> map (filter (\t -> _my_token_pos t == Just NP))
                    <$> tokenTags lang txt
 
 -------------------------------------------------------------------
 tokenTag2terms :: TokenTag -> Terms
 tokenTag2terms (TokenTag ws t _ _) = Terms ws t
 
 tokenTags :: Lang -> Text -> IO [[TokenTag]]
-tokenTags lang s = map (groupTokens lang) <$> tokenTags' lang s
+tokenTags EN txt = tokenTagsWith EN txt corenlp
+tokenTags FR txt = tokenTagsWith FR txt JohnSnow.nlp
+tokenTags _  _   = panic "[G.C.T.T.Multi] NLP API not implemented yet"
+
+tokenTagsWith :: Lang -> Text -> NLP_API -> IO [[TokenTag]]
+tokenTagsWith lang txt nlp = map (groupTokens lang)
+                          <$> map tokens2tokensTags
+                          <$> map _sentenceTokens
+                          <$> _sentences
+                          <$> nlp lang txt
 
-tokenTags' :: Lang -> Text -> IO [[TokenTag]]
-tokenTags' lang t =  map tokens2tokensTags
-                  <$> map _sentenceTokens
-                  <$> _sentences
-                  <$> corenlp lang t
 
 ---- | This function analyses and groups (or not) ngrams according to
 ---- specific grammars of each language.
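Review note: the hunk above replaces the hard-wired corenlp call with a backend passed as an NLP_API argument, chosen per language. The following is a minimal sketch of that dispatch pattern, not the project's code. Lang, PosSentences, fakeCoreNLP, fakeJohnSnow, tagsWith and tags are hypothetical stand-ins, String replaces Text, and unsupported languages return Left here, whereas the commit calls panic.

```haskell
module Main where

-- Hypothetical stand-ins; the real Lang and PosSentences live in Gargantext.
data Lang = EN | FR | DE deriving (Show, Eq)

newtype PosSentences = PosSentences [[String]] deriving (Show, Eq)

-- Same shape as the NLP_API synonym in the diff, with String in place of Text.
type NLPApi = Lang -> String -> IO PosSentences

-- Fake backends (assumptions): split on whitespace and tag the origin.
fakeCoreNLP :: NLPApi
fakeCoreNLP _ txt = pure (PosSentences [map ("corenlp:" ++) (words txt)])

fakeJohnSnow :: NLPApi
fakeJohnSnow _ txt = pure (PosSentences [map ("johnsnow:" ++) (words txt)])

-- Shared pipeline, parameterised by the backend (mirrors tokenTagsWith).
tagsWith :: Lang -> String -> NLPApi -> IO [[String]]
tagsWith lang txt nlp = unwrap <$> nlp lang txt
  where
    unwrap (PosSentences ss) = ss

-- Per-language dispatch (mirrors tokenTags); Left instead of panic.
tags :: Lang -> String -> IO (Either String [[String]])
tags EN txt = Right <$> tagsWith EN txt fakeCoreNLP
tags FR txt = Right <$> tagsWith FR txt fakeJohnSnow
tags l  _   = pure (Left ("NLP API not implemented for " ++ show l))

main :: IO ()
main = do
  tags EN "big data"         >>= print
  tags FR "donnees massives" >>= print
  tags DE "grosse Daten"     >>= print
```

Passing the backend as an argument keeps the post-processing chain in one place, which is presumably why tokenTagsWith is also added to the export list.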
src/Gargantext/Core/Text/Terms/Multi/Group.hs
@@ -23,7 +23,8 @@ import Gargantext.Prelude
 group2 :: POS -> POS -> [TokenTag] -> [TokenTag]
 group2 p1 p2 (x@(TokenTag _ _ (Just p1') _):y@(TokenTag _ _ (Just p2') _):z) =
   if (p1 == p1') && (p2 == p2')
-     then (x : y : group2 p1 p2 (x<>y : z))
+     then group2 p1 p2 (x<>y : z)
+     -- then (x : y : group2 p1 p2 (x<>y : z))
      else (x : group2 p1 p2 (y:z))
 group2 p1 p2 (x@(TokenTag _ _ Nothing _):y) = (x: group2 p1 p2 y)
 group2 _ _ [x@(TokenTag _ _ (Just _) _)] = [x]
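Review note: the effect of this one-line revert is easiest to see on a tiny input. The sketch below uses a simplified token type and is only an illustration. Tok, merge and the keepParts flag are hypothetical, and merge stands in for the project's `<>` on TokenTag, whose exact behaviour is not shown in this diff.

```haskell
module Main where

-- Hypothetical, simplified stand-ins for the project's POS and TokenTag.
data POS = NP | JJ | VB deriving (Show, Eq)

data Tok = Tok { tokWords :: [String], tokPos :: Maybe POS }
  deriving (Show, Eq)

-- Assumed merge: concatenate the words and keep the second token's tag.
merge :: Tok -> Tok -> Tok
merge (Tok w1 _) (Tok w2 p2) = Tok (w1 ++ w2) p2

-- keepParts = True  : variant removed by the commit (emit x, y and the merge)
-- keepParts = False : variant restored by the commit (emit only the merge)
group2 :: Bool -> POS -> POS -> [Tok] -> [Tok]
group2 keepParts p1 p2 (x@(Tok _ (Just p1')) : y@(Tok _ (Just p2')) : z)
  | p1 == p1' && p2 == p2' =
      let rest = group2 keepParts p1 p2 (merge x y : z)
      in if keepParts then x : y : rest else rest
  | otherwise = x : group2 keepParts p1 p2 (y : z)
group2 keepParts p1 p2 (x : rest) = x : group2 keepParts p1 p2 rest
group2 _ _ _ [] = []

sample :: [Tok]
sample =
  [ Tok ["big"]  (Just JJ)
  , Tok ["data"] (Just NP)
  , Tok ["is"]   (Just VB)
  ]

render :: [Tok] -> [String]
render = map (unwords . tokWords)

main :: IO ()
main = do
  print (render (group2 True  JJ NP sample))  -- ["big","data","big data","is"]
  print (render (group2 False JJ NP sample))  -- ["big data","is"]
```

With the restored behaviour the adjective and noun survive only as the merged term, so downstream filters on NP no longer also see the bare noun.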
src/Gargantext/Core/Types.hs
@@ -72,7 +72,7 @@ data POS = NP
          | JJ  | VB
          | CC  | IN | DT
          | ADV
-         | NoPos
+         | NotFound { not_found :: [Char] }
   deriving (Show, Generic, Eq, Ord)
 ------------------------------------------------------------------------
 -- https://pythonprogramming.net/part-of-speech-tagging-nltk-tutorial/
@@ -80,32 +80,38 @@ instance FromJSON POS where
   parseJSON = withText "String" (\x -> pure (pos $ unpack x))
     where
       pos :: [Char] -> POS
-      pos "ADJ" = JJ
-      pos "CC"  = CC
-      pos "DT"  = DT
-      pos "IN"  = IN
-      pos "JJ"  = JJ
-      pos "JJR" = JJ
-      pos "JJS" = JJ
-      pos "NC"  = NP
-      pos "NN"  = NP
-      pos "NNS" = NP
-      pos "NNP" = NP
+      pos "ADJ"   = JJ
+      pos "CC"    = CC
+      pos "CCONJ" = CC
+      pos "DT"    = DT
+      pos "DET"   = DT
+      pos "IN"    = IN
+      pos "JJ"    = JJ
+      pos "JJR"   = JJ
+      pos "JJS"   = JJ
+      pos "NC"    = NP
+      pos "NN"    = NP
+      pos "NOUN"  = NP
+      pos "NNS"   = NP
+      pos "NNP"   = NP
       pos "NNPS" = NP
-      pos "NP"  = NP
-      pos "VB"  = VB
-      pos "VBD" = VB
-      pos "VBG" = VB
-      pos "VBN" = VB
-      pos "VBP" = VB
-      pos "VBZ" = VB
-      pos "RB"  = ADV
-      pos "RBR" = ADV
-      pos "RBS" = ADV
-      pos "WRB" = ADV
+      pos "NP"    = NP
+      pos "VB"    = VB
+      pos "VERB"  = VB
+      pos "VBD"   = VB
+      pos "VBG"   = VB
+      pos "VBN"   = VB
+      pos "VBP"   = VB
+      pos "VBZ"   = VB
+      pos "RB"    = ADV
+      pos "ADV"   = ADV
+      pos "RBR"   = ADV
+      pos "RBS"   = ADV
+      pos "WRB"   = ADV
       -- French specific
-      pos "P"   = IN
-      pos _     = NoPos
+      pos "P"     = IN
+      pos "PUNCT" = IN
+      pos x       = NotFound x
 
 instance ToJSON POS
 instance Hashable POS
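Review note: replacing NoPos with NotFound means an unrecognised tag is kept rather than discarded, so gaps in the mapping can be listed. The sketch below shows that idea on a subset of the table. It is an assumption-laden illustration: the POS type is trimmed (no Generic or JSON instances), String replaces [Char], and unknownTags is a hypothetical helper that does not appear in the commit.

```haskell
module Main where

-- Trimmed copy of the POS type from the diff.
data POS = NP | JJ | VB | CC | IN | DT | ADV
         | NotFound { not_found :: String }
  deriving (Show, Eq, Ord)

-- A subset of the mapping from the diff; the last clause keeps the raw tag.
pos :: String -> POS
pos "ADJ"   = JJ
pos "CCONJ" = CC
pos "DET"   = DT
pos "NN"    = NP
pos "NOUN"  = NP
pos "VERB"  = VB
pos "ADV"   = ADV
pos x       = NotFound x

-- Hypothetical helper: list the tags that fell through to NotFound.
unknownTags :: [String] -> [String]
unknownTags ts = [t | NotFound t <- map pos ts]

main :: IO ()
main = print (unknownTags ["NOUN", "PROPN", "ADJ", "AUX"])  -- ["PROPN","AUX"]
```

A report like this makes it straightforward to spot which tags from a new backend still need an entry in the table.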
src/Gargantext/Utils/JohnSnowNLP.hs
 {-|
-Module      : Gargantext.Utils.JohnSnowNLP
-Description : PosTagging module using Stanford java REST API
+Module      : Gargantext.Utils.JohnSnow
+Description : John Snow NLP API connexion
 Copyright   : (c) CNRS, 2017
 License     : AGPL + CECILL v3
 Maintainer  : team@gargantext.org
@@ -181,14 +181,14 @@ waitForJsTask jsTask = wait' 0
 getPosTagAndLems :: Lang -> Text -> IO PosSentences
 getPosTagAndLems l t = do
-  jsPosTask <- jsRequest t (JSPOS l)
+  jsPosTask   <- jsRequest t (JSPOS l)
   jsLemmaTask <- jsRequest t (JSLemma l)
 
   -- wait for both tasks
   jsPos <- waitForJsTask jsPosTask
   jsLemma <- waitForJsTask jsLemmaTask
 
-  printDebug "[getPosTagAndLems] sentences" $ jsAsyncTaskResponseToSentences jsPos jsLemma
-
-  pure $ PosSentences []
+  pure $ jsAsyncTaskResponseToSentences jsPos jsLemma
+
+nlp :: Lang -> Text -> IO PosSentences
+nlp = getPosTagAndLems