Skip to content
Projects
Groups
Snippets
Help
Loading...
Help
Submit feedback
Contribute to GitLab
Sign in
Toggle navigation
haskell-gargantext
Project
Project
Details
Activity
Releases
Cycle Analytics
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Charts
Issues
160
Issues
160
List
Board
Labels
Milestones
Merge Requests
14
Merge Requests
14
CI / CD
CI / CD
Pipelines
Jobs
Schedules
Charts
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Charts
Create a new issue
Jobs
Commits
Issue Boards
Open sidebar
gargantext
haskell-gargantext
Commits
b1aae86b
Commit
b1aae86b
authored
Sep 04, 2023
by
Alfredo Di Napoli
Browse files
Options
Browse Files
Download
Email Patches
Plain Diff
Generalise addDocumentsToHyperCorpus
parent
223d0d09
Pipeline
#4557
passed with stages
in 38 minutes and 47 seconds
Changes
2
Pipelines
1
Hide whitespace changes
Inline
Side-by-side
Showing
2 changed files
with
24 additions
and
20 deletions
+24
-20
DocumentUpload.hs
src/Gargantext/API/Node/DocumentUpload.hs
+5
-4
Flow.hs
src/Gargantext/Database/Action/Flow.hs
+19
-16
No files found.
src/Gargantext/API/Node/DocumentUpload.hs
View file @
b1aae86b
...
...
@@ -16,17 +16,18 @@ import Gargantext.API.Admin.EnvTypes (GargJob(..), Env)
import
Gargantext.API.Admin.Orchestrator.Types
(
JobLog
(
..
),
AsyncJobs
)
import
Gargantext.API.Prelude
import
Gargantext.Core
(
Lang
(
..
))
import
Gargantext.Core.NLP
(
nlpServerGet
)
import
Gargantext.Core.Text.Corpus.Parsers.Date
(
dateSplit
)
import
Gargantext.Core.Utils.Prefix
(
unCapitalize
,
dropPrefix
)
import
Gargantext.Database.Action.Flow.Types
import
Gargantext.Core.Text.Terms
(
TermType
(
..
))
import
Gargantext.Core.Utils.Prefix
(
unCapitalize
,
dropPrefix
)
import
Gargantext.Database.Action.Flow
(
addDocumentsToHyperCorpus
)
import
Gargantext.Database.Action.Flow.Types
import
Gargantext.Database.Admin.Types.Hyperdata.Corpus
import
Gargantext.Database.Admin.Types.Hyperdata.Document
(
HyperdataDocument
(
..
))
import
Gargantext.Database.Admin.Types.Node
import
Gargantext.Database.Query.Table.Node
(
getClosestParentIdByType'
)
import
Gargantext.Prelude
import
Gargantext.Utils.Jobs
(
serveJobsAPI
,
MonadJobStatus
(
..
))
import
Gargantext.Core.NLP
(
nlpServerGet
)
data
DocumentUpload
=
DocumentUpload
...
...
@@ -118,4 +119,4 @@ documentUpload nId doc = do
let
lang
=
EN
ncs
<-
view
$
nlpServerGet
lang
addDocumentsToHyperCorpus
ncs
Nothing
(
Multi
lang
)
cId
[
hd
]
addDocumentsToHyperCorpus
ncs
(
Nothing
::
Maybe
HyperdataCorpus
)
(
Multi
lang
)
cId
[
hd
]
src/Gargantext/Database/Action/Flow.hs
View file @
b1aae86b
...
...
@@ -288,33 +288,36 @@ flow :: forall env err m a c.
flow
c
u
cn
la
mfslw
(
mLength
,
docsC
)
jobHandle
=
do
(
_userId
,
userCorpusId
,
listId
)
<-
createNodes
u
cn
c
-- TODO if public insertMasterDocs else insertUserDocs
runConduit
$
zipSources
(
yieldMany
[
1
..
])
docsC
nlpServer
<-
view
$
nlpServerGet
(
_tt_lang
la
)
runConduit
$
zipSources
(
yieldMany
([
1
..
]
::
[
Int
]))
docsC
.|
CList
.
chunksOf
100
.|
mapM_C
(
\
docs
->
void
$
insertDocs'
docs
>>=
Doc
.
add
userCorpusId
)
.|
mapM_C
(
addDocumentsWithProgress
nlpServer
userCorpusId
)
.|
sinkNull
$
(
logLocM
)
DEBUG
"Calling flowCorpusUser"
flowCorpusUser
(
la
^.
tt_lang
)
u
userCorpusId
listId
c
mfslw
where
insertDocs'
::
[(
Integer
,
a
)]
->
m
[
NodeId
]
insertDocs'
[]
=
pure
[]
insertDocs'
docs
=
do
ncs
<-
view
$
nlpServerGet
(
_tt_lang
la
)
$
(
logLocM
)
DEBUG
$
T
.
pack
$
"calling insertDoc, ([idx], mLength) = "
<>
show
(
fst
<$>
docs
,
mLength
)
ids
<-
insertMasterDocs
ncs
c
la
(
snd
<$>
docs
)
addDocumentsWithProgress
::
NLPServerConfig
->
CorpusId
->
[(
Int
,
a
)]
->
m
()
addDocumentsWithProgress
nlpServer
userCorpusId
docsChunk
=
do
$
(
logLocM
)
DEBUG
$
T
.
pack
$
"calling insertDoc, ([idx], mLength) = "
<>
show
(
fst
<$>
docsChunk
,
mLength
)
docs
<-
addDocumentsToHyperCorpus
nlpServer
c
la
userCorpusId
(
map
snd
docsChunk
)
markProgress
(
length
docs
)
jobHandle
pure
ids
-- | Given a list of corpus documents and a 'NodeId' identifying the 'CorpusId', adds
-- the given documents to the corpus. Returns the Ids of the inserted documents.
addDocumentsToHyperCorpus
::
(
DbCmd'
env
err
m
,
HasNodeError
err
)
=>
NLPServerConfig
->
Maybe
HyperdataCorpus
->
TermType
Lang
->
CorpusId
->
[
HyperdataDocument
]
->
m
[
DocId
]
addDocumentsToHyperCorpus
::
(
DbCmd'
env
err
m
,
HasNodeError
err
,
FlowCorpus
document
,
MkCorpus
corpus
)
=>
NLPServerConfig
->
Maybe
corpus
->
TermType
Lang
->
CorpusId
->
[
document
]
->
m
[
DocId
]
addDocumentsToHyperCorpus
ncs
mb_hyper
la
corpusId
docs
=
do
ids
<-
insertMasterDocs
ncs
mb_hyper
la
docs
void
$
Doc
.
add
corpusId
ids
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment