haskell-gargantext
Commit 4e21f839 authored Sep 03, 2024 by Yoelis Acourt
configure CoreNLP tokenization to group hyphenated words
parent 238628a4
Showing 1 changed file with 2 additions and 2 deletions
src/Gargantext/Core/Text/Terms/Multi/PosTagging.hs
@@ -82,7 +82,7 @@ corenlp' :: ( FromJSON a
              => URI -> Lang -> p -> IO (Response a)
 corenlp' uri lang txt = do
   req <- parseRequest $
-         "POST " <> show (uri { uriQuery = "?properties=" <> (BSL.unpack $ encode $ toJSON $ Map.fromList properties) })
+         "POST " <> show (uri { uriQuery = "?properties=" <> BSL.unpack (encode $ toJSON $ Map.fromList properties) })
   -- curl -XPOST 'http://localhost:9000/?properties=%7B%22annotators%22:%20%22tokenize,ssplit,pos,ner%22,%20%22outputFormat%22:%20%22json%22%7D' -d 'hello world, hello' | jq .
   -- printDebug "[corenlp] sending body" $ (cs txt :: ByteString)
   catch (httpJSON $ setRequestBodyLBS (cs txt) req) $ \e ->
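The first hunk reads as a pure syntax change: `(BSL.unpack $ encode $ ...)` and `BSL.unpack (encode $ ...)` apply the same functions to the same argument. A minimal sketch for checking this outside the project, assuming the `aeson`, `bytestring`, `containers`, and `text` packages are available. The `properties` list below is a hypothetical stand-in, since the real one is defined in a part of `corenlp'` that this diff does not show.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import           Data.Aeson (encode, toJSON)
import qualified Data.ByteString.Lazy.Char8 as BSL
import qualified Data.Map.Strict as Map
import           Data.Text (Text)

-- Hypothetical stand-in for the `properties` list used by corenlp'.
properties :: [(Text, Text)]
properties =
  [ ("annotators", "tokenize,ssplit,pos,ner")
  , ("outputFormat", "json")
  ]

-- Query string as written before the commit.
oldQuery :: String
oldQuery = "?properties=" <> (BSL.unpack $ encode $ toJSON $ Map.fromList properties)

-- Query string as written after the commit.
newQuery :: String
newQuery = "?properties=" <> BSL.unpack (encode $ toJSON $ Map.fromList properties)

main :: IO ()
main = do
  putStrLn newQuery               -- the JSON object appended to "?properties="
  print (oldQuery == newQuery)    -- True: both spellings build the same string
```

Note that this sketch only compares the two spellings; it does not percent-encode the JSON the way the `curl` comment in the source does.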
@@ -97,7 +97,7 @@ corenlp' uri lang txt = do
     properties_ :: [(Text, Text)]
     properties_ = case lang of
     -- TODO: Add: Aeson.encode $ Aeson.toJSON $ Map.fromList [()] instead of these hardcoded JSON strings
-      EN -> [ ("annotators", "tokenize,ssplit,pos,ner") ]
+      EN -> [ ("annotators", "tokenize,ssplit,pos,ner"), ("tokenize.options", "splitHyphenated=false") ]
       FR -> [ ("annotators", "tokenize,ssplit,pos,lemma,ner")
       -- , ("parse.model", "edu/stanford/nlp/models/lexparser/frenchFactored.ser.gz")
             , ("pos.model", "edu/stanford/nlp/models/pos-tagger/models/french.tagger")
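For readers unfamiliar with the option, the sketch below illustrates what grouping hyphenated words means for the token stream. It is a toy whitespace tokenizer written for this note, not CoreNLP's tokenizer; `tokenize` and its `Bool` flag are hypothetical names that only mimic the intent of `splitHyphenated`.

```haskell
module Main where

import Data.List (groupBy)

-- | Toy tokenizer (illustration only, not CoreNLP's algorithm).
-- When the flag is True, each hyphen becomes its own token;
-- when it is False, a hyphenated word stays in one token.
tokenize :: Bool -> String -> [String]
tokenize splitHyphenated = concatMap splitWord . words
  where
    splitWord w
      | splitHyphenated = groupBy sameSegment w
      | otherwise       = [w]
    -- Two characters share a segment only if neither of them is a hyphen.
    sameSegment a b = a /= '-' && b /= '-'

main :: IO ()
main = do
  print (tokenize True  "a well-known state-of-the-art model")
  -- ["a","well","-","known","state","-","of","-","the","-","art","model"]
  print (tokenize False "a well-known state-of-the-art model")
  -- ["a","well-known","state-of-the-art","model"]
```

With the setting from this commit, a term such as `state-of-the-art` would presumably reach the POS tagger as a single token instead of several, which matches the commit title.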
Przemyslaw Kaminski @cgenie mentioned in commit 5660aec07ec5a0a0a5468f440092c1a8f57a864e · Oct 08, 2024