Przemyslaw Kaminski / haskell-gargantext / Commits

Commit 261af659, authored Feb 14, 2022 by Przemyslaw Kaminski

Merge branch 'dev' into 104-dev-john-snow-nlp

Parents: 0d2c688f, 92316028
Pipeline #2474 failed (in 0 seconds)

Showing 35 changed files with 733 additions and 166 deletions (+733 -166)
CHANGELOG.md (+42 -0)
bin/gargantext-client/Auth.hs (+44 -0)
bin/gargantext-client/Core.hs (+24 -0)
bin/gargantext-client/Main.hs (+24 -0)
bin/gargantext-client/Options.hs (+14 -0)
bin/gargantext-client/Script.hs (+45 -0)
bin/gargantext-client/Tracking.hs (+68 -0)
bin/gargantext-upgrade/Main.hs (+29 -0)
cabal.project (+7 -15)
devops/postgres/create (+18 -11)
devops/postgres/extensions.sql (+3 -0)
devops/postgres/upgrade/0.0.5.5.2.sql (+1 -0)
devops/postgres/upgrade/0.0.5.5.6.sql (+2 -0)
package.yaml (+1 -1)
src/Gargantext/API/Admin/Settings.hs (+1 -1)
src/Gargantext/API/Client.hs (+22 -17)
src/Gargantext/API/Ngrams.hs (+7 -6)
src/Gargantext/API/Ngrams/List.hs (+11 -11)
src/Gargantext/API/Ngrams/Types.hs (+1 -1)
src/Gargantext/API/Node/Update.hs (+40 -2)
src/Gargantext/Core/NodeStory.hs (+3 -3)
src/Gargantext/Core/Text/List.hs (+7 -5)
src/Gargantext/Database/Action/Flow.hs (+6 -1)
src/Gargantext/Database/Action/Metrics.hs (+180 -17)
src/Gargantext/Database/Action/Metrics/NgramsByContext.hs (+29 -21)
src/Gargantext/Database/Action/Search.hs (+20 -17)
src/Gargantext/Database/Admin/Trigger/ContextNodeNgrams.hs (+3 -3)
src/Gargantext/Database/Admin/Trigger/NodesContexts.hs (+4 -4)
src/Gargantext/Database/Prelude.hs (+16 -0)
src/Gargantext/Database/Query/Facet.hs (+2 -19)
src/Gargantext/Database/Query/Table/Ngrams.hs (+32 -5)
src/Gargantext/Database/Query/Table/Node/Document/Add.hs (+4 -3)
src/Gargantext/Database/Query/Table/NodeNodeNgrams.hs (+3 -0)
src/Gargantext/Database/Schema/Ngrams.hs (+19 -2)
stack.yaml (+1 -1)
CHANGELOG.md

## Version 0.0.5.5.7
* [FRONT][FIX] NgramsTable Cache search.

## Version 0.0.5.5.6
* [BACK][FIX] ./bin/psql gargantext.ini < devops/postgres/upgrade/0.0.5.5.6.sql
* [FRONT] Fix NodeType list show (Nodes options)

## Version 0.0.5.5.5
* [FRONT] Fix Graph Explorer search ngrams
* [FRONT] Fix NodeType list show (main Nodes)

## Version 0.0.5.5.4
* [BACK][OPTIM] NgramsTable scores
* [BACK] bin/client script to analyze backend performance and reproduce bugs
* [FRONT] Add Language selection

## Version 0.0.5.5.3
* [BACK] Add a max limit for the other lists.

## Version 0.0.5.5.2
* [BACK][OPTIM] Index on node_node_ngrams to speed up ngrams table score
  queries. Please execute the upgrade SQL script
  devops/postgres/upgrade/0.0.5.5.2.sql

## Version 0.0.5.5.1
* [BACK] Fix Graph Explorer search with selected ngrams
* [FRONT] Clean CSS

## Version 0.0.5.5
* [FRONT] Visio frame removed; a new tab is used instead (which works)
* [BACK] Scores on the docs view fixed

## Version 0.0.5.3
* [FRONT] SSL local option

## Version 0.0.5.2
* [QUAL] Scores in the Ngrams Table are fixed during the workflow, and the
  user can refresh them if needed.

## Version 0.0.5.1
* [OPTIM] Upgrade fix with indexes and score counts

## Version 0.0.5
* [OPTIM][DATABASE] Upgrade schema: contexts moved into the new contexts
  table, which requires a version bump.
...
...
bin/gargantext-client/Auth.hs (new file, 100644)

module Auth where

import Core
import Options

import Control.Monad.IO.Class
import Data.Text.Encoding (encodeUtf8)
import Options.Generic
import Servant.Client
import qualified Servant.Auth.Client as SA

import Gargantext.API.Client
import qualified Gargantext.API.Admin.Auth.Types as Auth
import qualified Gargantext.Core.Types.Individu as Auth
import qualified Gargantext.Database.Admin.Types.Node as Node

-- | Authenticate and use the resulting Token to perform
--   auth-restricted actions
withAuthToken
  :: ClientOpts -- ^ source of user/pass data
  -> (SA.Token -> Node.NodeId -> ClientM a) -- ^ do something once authenticated
  -> ClientM a
withAuthToken opts act
  -- both user and password CLI arguments passed
  | Helpful (Just usr) <- user opts
  , Helpful (Just pw)  <- pass opts = do
      authRes <- postAuth (Auth.AuthRequest usr (Auth.GargPassword pw))
      case Auth._authRes_valid authRes of
        -- authentication failed; this function critically needs it to
        -- be able to run the action, so we abort
        Nothing -> problem $
          "invalid auth response: " ++
            maybe "" (show . Auth._authInv_message) (Auth._authRes_inval authRes)
        -- authentication went through, we can run the action
        Just (Auth.AuthValid tok tree_id) -> do
          let tok' = SA.Token (encodeUtf8 tok)
          whenVerbose opts $ do
            liftIO . putStrLn $
              "[Debug] Authenticated: token=" ++ show tok ++
              ", tree_id=" ++ show tree_id
          act tok' tree_id
  -- user and/or pass CLI arguments not passed
  | otherwise = problem "auth-protected actions require --user and --pass"
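The credential check above is plain pattern guards over the option fields: the action only runs when both guards bind. A minimal stand-alone sketch of the same dispatch, with the options-generic Helpful wrapper replaced by plain Maybe and with illustrative names (Opts, withCreds) that are not from the codebase:

```haskell
-- Demo of the pattern-guard dispatch used by withAuthToken:
-- both credentials present -> run the action, otherwise fail loudly.
data Opts = Opts { user :: Maybe String, pass :: Maybe String }

withCreds :: Opts -> (String -> String -> Either String a) -> Either String a
withCreds opts act
  | Just u <- user opts
  , Just p <- pass opts = act u p
  | otherwise = Left "auth-protected actions require --user and --pass"

main :: IO ()
main = do
  print (withCreds (Opts (Just "alice") (Just "pw")) (\u _ -> Right u))
  -- Right "alice"
  print (withCreds (Opts Nothing Nothing) (\u _ -> Right u))
  -- Left "auth-protected actions require --user and --pass"
```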
bin/gargantext-client/Core.hs (new file, 100644)

module Core (problem, whenVerbose) where

import Options
import Options.Generic

import Control.Exception
import Control.Monad
import Control.Monad.Catch
import Servant.Client

newtype GargClientException = GCE String

instance Show GargClientException where
  show (GCE s) = "Garg client exception: " ++ s

instance Exception GargClientException

-- | Abort with a message
problem :: String -> ClientM a
problem = throwM . GCE

-- | Only run the given computation when the @--verbose@ flag is
--   passed.
whenVerbose :: Monad m => ClientOpts -> m () -> m ()
whenVerbose opts act = when (unHelpful $ verbose opts) act
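whenVerbose is just Control.Monad.when applied to the unwrapped flag, so debug output vanishes entirely when --verbose is off. A self-contained sketch of the same gate, with the Helpful wrapper replaced by a plain Bool (an assumed simplification; Opts here is illustrative):

```haskell
import Control.Monad (when)

data Opts = Opts { verbose :: Bool }

-- Run the action only when the verbose flag is set, as in Core.hs.
whenVerbose :: Monad m => Opts -> m () -> m ()
whenVerbose opts act = when (verbose opts) act

main :: IO ()
main = do
  whenVerbose (Opts True)  (putStrLn "[Debug] shown")
  whenVerbose (Opts False) (putStrLn "[Debug] never printed")
```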
bin/gargantext-client/Main.hs (new file, 100644)

module Main where

import Control.Monad
import Network.HTTP.Client
import Options.Generic
import Servant.Client

import Options
import Script (script)

main :: IO ()
main = do
  -- we parse CLI options
  opts@(ClientOpts (Helpful uri) _ _ (Helpful verb)) <- getRecord "Gargantext client"
  mgr <- newManager defaultManagerSettings
  burl <- parseBaseUrl uri
  when verb $ do
    putStrLn $ "[Debug] user: " ++ maybe "<none>" show (unHelpful $ user opts)
    putStrLn $ "[Debug] backend: " ++ show burl
  -- we run 'script' from the Script module, reporting potential errors
  res <- runClientM (script opts) (mkClientEnv mgr burl)
  case res of
    Left err -> putStrLn $ "[Client error] " ++ show err
    Right a  -> print a
bin/gargantext-client/Options.hs (new file, 100644)

{-# LANGUAGE TypeOperators #-}
module Options where

import Options.Generic

-- | Some general options to be specified on the command line.
data ClientOpts = ClientOpts
  { url     :: String     <?> "URL to gargantext backend"
  , user    :: Maybe Text <?> "(optional) username for auth-restricted actions"
  , pass    :: Maybe Text <?> "(optional) password for auth-restricted actions"
  , verbose :: Bool       <?> "Enable verbose output"
  } deriving (Generic, Show)

instance ParseRecord ClientOpts
bin/gargantext-client/Script.hs (new file, 100644)

module Script (script) where

import Control.Monad.IO.Class
import Gargantext.API.Client
import Servant.Client

import Auth
import Core
import Options
import Tracking

-- | An example script. Tweak, rebuild and re-run the executable to see the
--   effect of your changes. You can hit any gargantext endpoint in the body
--   of 'script' using the many (many!) client functions exposed by the
--   'Gargantext.API.Client' module.
--
--   Don't forget to pass @--user@ and @--pass@ if you're using 'withAuthToken'.
script :: ClientOpts -> ClientM ()
script opts = do
  -- we start by asking the backend for its version
  ver <- getBackendVersion
  liftIO . putStrLn $ "Backend version: " ++ show ver
  -- next we authenticate using the credentials given on the command line
  -- (through --user and --pass), erroring out loudly if the auth creds don't
  -- go through, running the continuation otherwise.
  withAuthToken opts $ \tok userNode -> do
    liftIO . putStrLn $ "user node: " ++ show userNode
    steps <-
      -- we run a few client computations while tracking some EKG metrics
      -- (any RTS stats or routing-related data), which means that we sample
      -- the metrics at the beginning, the end, and in between each pair of
      -- steps.
      tracking opts ["rts.gc.bytes_allocated"]
        [ ("get roots", do
            roots <- getRoots tok
            liftIO . putStrLn $ "roots: " ++ show roots
          )
        , ("get user node detail", do
            userNodeDetail <- getNode tok userNode
            liftIO . putStrLn $ "user node details: " ++ show userNodeDetail
          )
        ]
    -- we pretty print the values we sampled for all metrics and the
    -- results of all the steps
    whenVerbose opts (ppTracked steps)
bin/gargantext-client/Tracking.hs (new file, 100644)

{-# LANGUAGE TupleSections #-}
module Tracking
  ( tracking
  , ppTracked
  , EkgMetric
  , Step
  ) where

import Core
import Options

import Control.Monad.IO.Class
import Data.List (intersperse)
import Data.Text (Text)
import Servant.Client
import System.Metrics.Json (Value)

import Gargantext.API.Client
import qualified Data.Text as T

-- | e.g @["rts", "gc", "bytes_allocated"]@
type EkgMetric = [Text]
-- | Any textual description of a step
type Step = Text

-- | Track EKG metrics before/after running a bunch of computations
--   that can talk to the backend.
tracking
  :: ClientOpts
  -> [Text] -- ^ e.g @["rts.gc.bytes_allocated"]@
  -> [(Step, ClientM a)]
  -> ClientM [Either [(EkgMetric, Value)] (Step, a)]
-- no steps, nothing to do
tracking _ _ [] = return []
-- no metrics to track, we just run the steps
tracking _ [] steps = traverse runStep steps
-- metrics to track: we intersperse metric fetching and steps,
-- starting and ending with metric fetching
tracking opts ms' steps = mix (Left <$> fetchMetrics) (map runStep steps)
  where
    fetchMetrics :: ClientM [(EkgMetric, Value)]
    fetchMetrics = flip traverse ms $ \metric -> do
      whenVerbose opts $
        liftIO . putStrLn $
          "[Debug] metric to track: " ++ T.unpack (T.intercalate "." metric)
      dat <- (metric,) <$> getMetricSample metric
      whenVerbose opts $
        liftIO . putStrLn $ "[Debug] metric pulled: " ++ show dat
      return dat
    mix :: ClientM a -> [ClientM a] -> ClientM [a]
    mix x xs = sequence $ [x] ++ intersperse x xs ++ [x]
    ms = map (T.splitOn ".") ms'

-- | A trivial function to print results of steps and sampled metrics
ppTracked :: Show a => [Either [(EkgMetric, Value)] (Step, a)] -> ClientM ()
ppTracked [] = return ()
ppTracked (Right (step, a) : rest) = do
  liftIO . putStrLn $
    "[step: " ++ T.unpack step ++ "] returned: " ++ show a
  ppTracked rest
ppTracked (Left ms : rest) = do
  liftIO . putStrLn $ unlines
    [ T.unpack (T.intercalate "." metric) ++ " = " ++ show val
    | (metric, val) <- ms
    ]
  ppTracked rest

runStep :: (Step, ClientM a) -> ClientM (Either e (Step, a))
runStep (step, act) = Right . (step,) <$> act
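The interleaving that tracking performs is entirely contained in its mix helper: sequence over the sample action, the steps interspersed with the sample, and a final sample. The same function runs unchanged in any monad, so its behaviour can be checked stand-alone with pure IO actions standing in for ClientM:

```haskell
import Data.List (intersperse)

-- Same definition as the mix helper in Tracking.hs.
mix :: Monad m => m a -> [m a] -> m [a]
mix x xs = sequence $ [x] ++ intersperse x xs ++ [x]

main :: IO ()
main = do
  -- With 2 steps the pattern is: sample, step, sample, step, sample,
  -- i.e. metrics are fetched before, between, and after the steps.
  out <- mix (pure "sample") [pure "step1", pure "step2"]
  print out -- ["sample","step1","sample","step2","sample"]
```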
bin/gargantext-upgrade/Main.hs

@@ -252,6 +252,35 @@ sqlSchema = do
DROP TRIGGER if EXISTS trigger_insert_count ON nodes_nodes;
-- Indexes needed to speed up the deletes
-- Trigger for constraint node_ngrams_node_id_fkey
CREATE INDEX IF NOT EXISTS node_ngrams_node_id_idx ON public.node_ngrams USING btree (node_id);
-- Trigger for constraint node_node_ngrams2_node_id_fkey
CREATE INDEX IF NOT EXISTS node_node_ngrams2_node_id_idx ON public.node_node_ngrams2 USING btree (node_id);
-- Trigger for constraint node_node_ngrams_node1_id_fkey
CREATE INDEX IF NOT EXISTS node_node_ngrams_node1_id_idx ON public.node_node_ngrams USING btree (node1_id);
-- Trigger for constraint node_node_ngrams_node2_id_fkey
CREATE INDEX IF NOT EXISTS node_node_ngrams_node2_id_idx ON public.node_node_ngrams USING btree (node2_id);
-- Trigger for constraint nodes_nodes_node1_id_fkey
CREATE INDEX IF NOT EXISTS nodes_nodes_node1_id_idx ON public.nodes_nodes USING btree (node1_id);
-- Trigger for constraint nodes_nodes_node2_id_fkey
CREATE INDEX IF NOT EXISTS nodes_nodes_node2_id_idx ON public.nodes_nodes USING btree (node2_id);
-- Trigger for constraint nodes_parent_id_fkey
CREATE INDEX IF NOT EXISTS nodes_parent_id_idx ON public.nodes USING btree (parent_id);
-- Trigger for constraint rights_node_id_fkey
CREATE INDEX IF NOT EXISTS rights_node_id_idx ON public.rights USING btree (node_id);
-- Trigger for constraint nodes_contexts_node_id_fkey
CREATE INDEX IF NOT EXISTS nodes_contexts_node_id_idx ON public.nodes_contexts USING btree (node_id);
-- Trigger for constraint context_node_ngrams_node_id_fkey
CREATE INDEX IF NOT EXISTS context_node_node_id_idx ON public.context_node_ngrams USING btree (node_id);
|]
...
...
cabal.project

packages: .
-- ../servant-job
-- ../ekg-json
-- ../../../code/servant/servant
-- ../../../code/servant/servant-server
-- ../../../code/servant/servant-client-core
-- ../../../code/servant/servant-client
-- ../../../code/servant/servant-auth/servant-auth
-- ../../../code/servant/servant-auth/servant-auth-client
-- ../../../code/servant/servant-auth/servant-auth-server

-allow-newer: base, accelerate, servant, time
+allow-newer: base, accelerate, servant, time, classy-prelude

-- Patches
source-repository-package
...
@@ -20,7 +11,7 @@ source-repository-package
source-repository-package
  type: git
  location: https://github.com/alpmestan/ekg-json.git
-  tag: c7bde4851a7cd41b3f3debf0c57f11bbcb11d698
+  tag: fd7e5d7325939103cd87d0dc592faf644160341c

source-repository-package
  type: git
...
@@ -53,7 +44,7 @@ source-repository-package
source-repository-package
  type: git
  location: https://gitlab.iscpif.fr/gargantext/crawlers/pubmed.git
-  tag: a9d8e08a7ef82f90e29dfaced4071704a3163394
+  tag: 9cdba6423decad5acfacb0f274212fd8723ce734

source-repository-package
  type: git
...
@@ -112,7 +103,7 @@ source-repository-package
source-repository-package
  type: git
  location: https://github.com/delanoe/haskell-opaleye.git
-  tag: d3ab7acd5ede737478763630035aa880f7e34444
+  tag: 756cb90f4ce725463d957bc899d764e0ed73738c

source-repository-package
  type: git
...
@@ -146,4 +137,5 @@ source-repository-package
constraints: unordered-containers==0.2.14.*,
             servant-ekg==0.3.1,
-             time==1.9.3
+             time==1.9.3,
+             stm==2.5.0.1
devops/postgres/create

#!/bin/bash
# sudo su postgres
# postgresql://$USER:$PW@localhost/$DB
PW="C8kdcUrAQy66U"
DB="gargandb1"
USER="gargantua"
INIFILE=$1

#psql -c "CREATE USER \"${USER}\""
#psql -c "ALTER USER \"${USER}\" with PASSWORD '${PW}'"

getter () {
  grep $1 $INIFILE | sed "s/^.*= //"
}

psql -c "DROP DATABASE IF EXISTS \"${DB}\""
createdb "${DB}"
#psql "${DB}" < schema.sql

USER=$(getter "DB_USER")
NAME=$(getter "DB_NAME")
PASS=$(getter "DB_PASS")
HOST=$(getter "DB_HOST")
PORT=$(getter "DB_PORT")

../../bin/psql ../../gargantext.ini < gargandb.dump
psql -c "ALTER DATABASE \"${DB}\" OWNER to \"${USER}\""

#psql -c "CREATE USER \"${USER}\""
#psql -c "ALTER USER \"${USER}\" with PASSWORD '${PW}'"
psql -c "DROP DATABASE IF EXISTS \"${NAME}\""
createdb "${NAME}"
psql "${NAME}" < extensions.sql
#psql "${NAME}" < schema.sql
#../../bin/psql ../../gargantext.ini < gargandb.dump
psql -c "ALTER DATABASE \"${NAME}\" OWNER to \"${USER}\""
devops/postgres/extensions.sql (new file, 100644)

CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE EXTENSION IF NOT EXISTS tsm_system_rows;
devops/postgres/upgrade/0.0.5.5.2.sql (new file, 100644)

CREATE INDEX ON public.node_node_ngrams USING btree (node1_id, node2_id, ngrams_type);
devops/postgres/upgrade/0.0.5.5.6.sql (new file, 100644)

drop trigger trigger_count_insert on node_node_ngrams;
package.yaml

name: gargantext
-version: '0.0.5'
+version: '0.0.5.5.7'
synopsis: Search, map, share
description: Please see README.md
category: Data
...
...
src/Gargantext/API/Admin/Settings.hs

@@ -106,7 +106,7 @@ repoSnapshot repoDir = repoDir <> "/repo.cbor"
repoSaverAction :: RepoDirFilePath -> Serialise a => a -> IO ()
repoSaverAction repoDir a = do
  withTempFile repoDir "tmp-repo.cbor" $ \fp h -> do
-    printDebug "repoSaverAction" fp
+    -- printDebug "repoSaverAction" fp
    L.hPut h $ serialise a
    hClose h
    renameFile fp (repoSnapshot repoDir)
...
...
src/Gargantext/API/Client.hs

{-# OPTIONS_GHC -freduction-depth=0 #-}
{-# OPTIONS_GHC -O0 #-}
module Gargantext.API.Client where
...
@@ -55,17 +55,18 @@ import Servant.Job.Core
import Servant.Job.Types
import System.Metrics.Json (Sample, Value)

-- * actual client functions for individual endpoints

-- * version API
getBackendVersion :: ClientM Text

-- * auth API
postAuth :: AuthRequest -> ClientM AuthResponse

-- * admin api
getRoots :: Token -> ClientM [Node HyperdataUser]
putRoots :: Token -> ClientM Int -- not actually implemented in the backend
deleteNodes :: Token -> [NodeId] -> ClientM Int

-- * node api
getNode :: Token -> NodeId -> ClientM (Node HyperdataAny)
getContext :: Token -> ContextId -> ClientM (Node HyperdataAny)
renameNode :: Token -> NodeId -> RenameNode -> ClientM [Int]
...
@@ -155,7 +156,7 @@ killNodeDocumentUploadAsyncJob :: Token -> NodeId -> JobID 'Unsafe -> Maybe Limi
pollNodeDocumentUploadAsyncJob :: Token -> NodeId -> JobID 'Unsafe -> Maybe Limit -> Maybe Offset -> ClientM (JobStatus 'Safe JobLog)
waitNodeDocumentUploadAsyncJob :: Token -> NodeId -> JobID 'Unsafe -> ClientM (JobOutput JobLog)

--- corpus api
+-- * corpus api
getCorpus :: Token -> CorpusId -> ClientM (Node HyperdataCorpus)
renameCorpus :: Token -> CorpusId -> RenameNode -> ClientM [Int]
postCorpus :: Token -> CorpusId -> PostNode -> ClientM [CorpusId]
...
@@ -244,13 +245,13 @@ killCorpusDocumentUploadAsyncJob :: Token -> CorpusId -> JobID 'Unsafe -> Maybe
pollCorpusDocumentUploadAsyncJob :: Token -> CorpusId -> JobID 'Unsafe -> Maybe Limit -> Maybe Offset -> ClientM (JobStatus 'Safe JobLog)
waitCorpusDocumentUploadAsyncJob :: Token -> CorpusId -> JobID 'Unsafe -> ClientM (JobOutput JobLog)

--- corpus node/node API
+-- * corpus node/node API
getCorpusNodeNode :: Token -> NodeId -> NodeId -> ClientM (Node HyperdataAny)

--- corpus export API
+-- * corpus export API
getCorpusExport :: Token -> CorpusId -> Maybe ListId -> Maybe NgramsType -> ClientM Corpus

-- * annuaire api
getAnnuaire :: Token -> AnnuaireId -> ClientM (Node HyperdataAnnuaire)
renameAnnuaire :: Token -> AnnuaireId -> RenameNode -> ClientM [Int]
postAnnuaire :: Token -> AnnuaireId -> PostNode -> ClientM [AnnuaireId]
...
@@ -338,17 +339,17 @@ killAnnuaireDocumentUploadAsyncJob :: Token -> AnnuaireId -> JobID 'Unsafe -> Ma
pollAnnuaireDocumentUploadAsyncJob :: Token -> AnnuaireId -> JobID 'Unsafe -> Maybe Limit -> Maybe Offset -> ClientM (JobStatus 'Safe JobLog)
waitAnnuaireDocumentUploadAsyncJob :: Token -> AnnuaireId -> JobID 'Unsafe -> ClientM (JobOutput JobLog)

--- contact api
+-- * contact api
postAnnuaireContactAsync :: Token -> AnnuaireId -> ClientM (JobStatus 'Safe JobLog)
postAnnuaireContactAsyncJob :: Token -> AnnuaireId -> JobInput Maybe AddContactParams -> ClientM (JobStatus 'Safe JobLog)
killAnnuaireContactAsyncJob :: Token -> AnnuaireId -> JobID 'Unsafe -> Maybe Limit -> Maybe Offset -> ClientM (JobStatus 'Safe JobLog)
pollAnnuaireContactAsyncJob :: Token -> AnnuaireId -> JobID 'Unsafe -> Maybe Limit -> Maybe Offset -> ClientM (JobStatus 'Safe JobLog)
waitAnnuaireContactAsyncJob :: Token -> AnnuaireId -> JobID 'Unsafe -> ClientM (JobOutput JobLog)

--- contact node/node API
+-- * contact node/node API
getAnnuaireContactNodeNode :: Token -> NodeId -> NodeId -> ClientM (Node HyperdataContact)

--- document ngrams api
+-- * document ngrams api
getDocumentNgramsTable :: Token -> DocId -> TabType -> ListId -> Int -> Maybe Int -> Maybe ListType -> Maybe MinSize -> Maybe MaxSize -> Maybe Ngrams.OrderBy -> Maybe Text -> ClientM (VersionedWithCount NgramsTable)
putDocumentNgramsTable :: Token -> DocId -> TabType -> ListId -> Versioned NgramsTablePatch -> ClientM (Versioned NgramsTablePatch)
postRecomputeDocumentNgramsTableScore :: Token -> DocId -> TabType -> ListId -> ClientM Int
...
@@ -359,15 +360,14 @@ killDocumentNgramsTableAsyncJob :: Token -> DocId -> JobID 'Unsafe -> Maybe Limi
pollDocumentNgramsTableAsyncJob :: Token -> DocId -> JobID 'Unsafe -> Maybe Limit -> Maybe Offset -> ClientM (JobStatus 'Safe JobLog)
waitDocumentNgramsTableAsyncJob :: Token -> DocId -> JobID 'Unsafe -> ClientM (JobOutput JobLog)

--- document export API
+-- * document export API
getDocumentExportJSON :: Token -> DocId -> ClientM DocumentExport.DocumentExport
getDocumentExportCSV :: Token -> DocId -> ClientM Text
--getDocumentExportCSV :: Token -> DocId -> ClientM [DocumentExport.Document]

--- count api
+-- * count api
postCountQuery :: Token -> Query -> ClientM Counts

--- graph api
+-- * graph api
getGraphHyperdata :: Token -> NodeId -> ClientM HyperdataGraphAPI
postGraphAsync :: Token -> NodeId -> ClientM (JobStatus 'Safe JobLog)
postGraphAsyncJob :: Token -> NodeId -> JobInput Maybe () -> ClientM (JobStatus 'Safe JobLog)
...
@@ -382,6 +382,7 @@ postGraphRecomputeVersion :: Token -> NodeId -> ClientM Graph
getTree :: Token -> NodeId -> [NodeType] -> ClientM (Tree NodeTree)
+getTreeFirstLevel :: Token -> NodeId -> [NodeType] -> ClientM (Tree NodeTree)

-- * new corpus API
postNewCorpusWithFormAsync :: Token -> NodeId -> ClientM (JobStatus 'Safe JobLog)
postNewCorpusWithFormAsyncJob :: Token -> NodeId -> JobInput Maybe NewWithForm -> ClientM (JobStatus 'Safe JobLog)
killNewCorpusWithFormAsyncJob :: Token -> NodeId -> JobID 'Unsafe -> Maybe Limit -> Maybe Offset -> ClientM (JobStatus 'Safe JobLog)
...
@@ -394,6 +395,7 @@ killNewCorpusWithQueryAsyncJob :: Token -> NodeId -> JobID 'Unsafe -> Maybe Limi
pollNewCorpusWithQueryAsyncJob :: Token -> NodeId -> JobID 'Unsafe -> Maybe Limit -> Maybe Offset -> ClientM (JobStatus 'Safe JobLog)
waitNewCorpusWithQueryAsyncJob :: Token -> NodeId -> JobID 'Unsafe -> ClientM (JobOutput JobLog)

-- * list API
getList :: Token -> NodeId -> ClientM (Headers '[Header "Content-Disposition" Text] (Map NgramsType (Versioned NgramsTableMap)))
postListJsonUpdateAsync :: Token -> NodeId -> ClientM (JobStatus 'Safe JobLog)
postListJsonUpdateAsyncJob :: Token -> NodeId -> JobInput Maybe WithFile -> ClientM (JobStatus 'Safe JobLog)
...
@@ -407,11 +409,14 @@ killListCsvUpdateAsyncJob :: Token -> NodeId -> JobID 'Unsafe -> Maybe Limit ->
pollListCsvUpdateAsyncJob :: Token -> NodeId -> JobID 'Unsafe -> Maybe Limit -> Maybe Offset -> ClientM (JobStatus 'Safe JobLog)
waitListCsvUpdateAsyncJob :: Token -> NodeId -> JobID 'Unsafe -> ClientM (JobOutput JobLog)

-- * public API
getPublicData :: ClientM [PublicData]
getPublicNodeFile :: NodeId -> ClientM (Headers '[Header "Content-Type" Text] BSResponse)

--- ekg api
+-- * ekg api
-- | get a sample of all metrics
getMetricsSample :: ClientM Sample
-- | open @<backend:port\/ekg\/index.html@ to see a list of metrics
getMetricSample :: [Text] -> ClientM Value

-- * unpacking of client functions to derive all the individual clients
...
...
src/Gargantext/API/Ngrams.hs

@@ -245,7 +245,7 @@ setListNgrams :: HasNodeStory env err m
              -> Map NgramsTerm NgramsRepoElement
              -> m ()
setListNgrams listId ngramsType ns = do
-  printDebug "[setListNgrams]" (listId, ngramsType)
+  -- printDebug "[setListNgrams]" (listId, ngramsType)
  getter <- view hasNodeStory
  var <- liftBase $ (getter ^. nse_getter) [listId]
  liftBase $ modifyMVar_ var $
...
@@ -283,7 +283,7 @@ commitStatePatch :: (HasNodeStory env err m, HasMail env)
                 -> Versioned NgramsStatePatch'
                 -> m (Versioned NgramsStatePatch')
commitStatePatch listId (Versioned p_version p) = do
-  printDebug "[commitStatePatch]" listId
+  -- printDebug "[commitStatePatch]" listId
  var <- getNodeStoryVar [listId]
  vq' <- liftBase $ modifyMVar var $ \ns -> do
    let
...
@@ -523,14 +523,13 @@ getTableNgrams _nType nId tabType listId limit_ offset
        selected_inner roots n = maybe False (`Set.member` roots) (n ^. ne_root)
        ---------------------------------------
-        sortOnOrder Nothing = identity
+        sortOnOrder Nothing = sortOnOrder (Just ScoreDesc)
        sortOnOrder (Just TermAsc)   = List.sortOn $ view ne_ngrams
        sortOnOrder (Just TermDesc)  = List.sortOn $ Down . view ne_ngrams
        sortOnOrder (Just ScoreAsc)  = List.sortOn $ view ne_occurrences
        sortOnOrder (Just ScoreDesc) = List.sortOn $ Down . view ne_occurrences
        ---------------------------------------
        filteredNodes :: Map NgramsTerm NgramsElement -> [NgramsElement]
        filteredNodes tableMap = rootOf <$> list & filter selected_node
          where
...
@@ -562,13 +561,13 @@ getTableNgrams _nType nId tabType listId limit_ offset
        setScores False table = pure table
        setScores True  table = do
          let ngrams_terms = table ^.. each . ne_ngrams
-          printDebug "ngrams_terms" ngrams_terms
+          -- printDebug "ngrams_terms" ngrams_terms
          t1 <- getTime
          occurrences <- getOccByNgramsOnlyFast' nId listId ngramsType ngrams_terms
-          printDebug "occurrences" occurrences
+          -- printDebug "occurrences" occurrences
          t2 <- getTime
          liftBase $ hprint stderr
            ("getTableNgrams/setScores #ngrams=" % int % " time=" % hasTime % "\n")
...
@@ -587,11 +586,13 @@ getTableNgrams _nType nId tabType listId limit_ offset
  let scoresNeeded = needsScores orderBy
  tableMap1 <- getNgramsTableMap listId ngramsType
  t1 <- getTime
  tableMap2 <- tableMap1 & v_data %%~ setScores scoresNeeded
                                    . Map.mapWithKey ngramsElementFromRepo
  fltr <- tableMap2 & v_data %%~ fmap NgramsTable
                              . setScores (not scoresNeeded)
                              . filteredNodes
  let fltrCount = length $ fltr ^. v_data . _NgramsTable
  t2 <- getTime
...
...
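The corrected sortOnOrder above defaults a missing order to ScoreDesc and implements every descending case as sortOn composed with Down. The same pattern works on any sortable projection; a stand-alone sketch (the scores list and field names here are illustrative, not from the codebase):

```haskell
import Data.List (sortOn)
import Data.Ord (Down(..))

scores :: [(String, Int)]
scores = [("a", 3), ("b", 1), ("c", 2)]

main :: IO ()
main = do
  -- ScoreAsc analogue: sort on the projection directly
  print (sortOn snd scores)
  -- ScoreDesc analogue: wrap the projection in Down to reverse the order
  print (sortOn (Down . snd) scores)
```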
src/Gargantext/API/Ngrams/List.hs

@@ -41,7 +41,7 @@ import Gargantext.Database.Admin.Types.Hyperdata.Document
import Gargantext.Database.Admin.Types.Node
import Gargantext.Database.Query.Table.NodeContext (selectDocNodes)
import Gargantext.Database.Schema.Ngrams
-import Gargantext.Database.Schema.Node
+import Gargantext.Database.Schema.Context
import Gargantext.Database.Types (Indexed(..))
import Gargantext.Prelude
import Network.HTTP.Media ((//), (/:))
...
@@ -155,12 +155,12 @@ reIndexWith cId lId nt lts = do
        <$> HashMap.toList
        <$> getTermsWith identity [lId] nt lts
-  -- printDebug "ts" ts
+  printDebug "ts" ts
  -- Taking the ngrams with 0 occurrences only (orphans)
  occs <- getOccByNgramsOnlyFast' cId lId nt ts
-  -- printDebug "occs" occs
+  printDebug "occs" occs
  let orphans = List.concat
              $ map (\t -> case HashMap.lookup t occs of
...
@@ -168,28 +168,28 @@ reIndexWith cId lId nt lts = do
                Just n -> if n <= 1 then [t] else []
                ) ts
-  -- printDebug "orphans" orphans
+  printDebug "orphans" orphans
  -- Get all documents of the corpus
  docs <- selectDocNodes cId
-  -- printDebug "docs length" (List.length docs)
+  printDebug "docs length" (List.length docs)
  -- Checking Text documents where orphans match
  -- TODO Tests here
  let
    ngramsByDoc = map (HashMap.fromList)
                $ map (map (\(k,v) -> (SimpleNgrams (text2ngrams k), v)))
                $ map (\doc -> List.zip
                    (termsInText (buildPatterns $ map (\k -> (Text.splitOn " " $ unNgramsTerm k, [])) orphans)
-                      $ Text.unlines $ catMaybes [ doc ^. node_hyperdata . hd_title
-                                                 , doc ^. node_hyperdata . hd_abstract
+                      $ Text.unlines $ catMaybes [ doc ^. context_hyperdata . hd_title
+                                                 , doc ^. context_hyperdata . hd_abstract
                                                 ])
-                    (List.cycle [Map.fromList $ [(nt, Map.singleton (doc ^. node_id) 1)]]))
-                  (map context2node docs)
+                    (List.cycle [Map.fromList $ [(nt, Map.singleton (doc ^. context_id) 1)]]))
+                  docs
-  -- printDebug "ngramsByDoc" ngramsByDoc
+  printDebug "ngramsByDoc" ngramsByDoc
  -- Saving the indexation in database
  _ <- mapM (saveDocNgramsWith lId) ngramsByDoc
...
...
src/Gargantext/API/Ngrams/Types.hs

@@ -578,7 +578,7 @@ ngramsElementFromRepo
      , _ne_parent = p
      , _ne_children = c
      , _ne_ngrams = ngrams
-      , _ne_occurrences = panic $ "API.Ngrams.Types._ne_occurrences"
+      , _ne_occurrences = 0 -- panic $ "API.Ngrams.Types._ne_occurrences"
{-
-- Here we could use 0 if we want to avoid any `panic`.
-- It will not happen using getTableNgrams if
...
...
src/Gargantext/API/Node/Update.hs

@@ -18,7 +18,7 @@ module Gargantext.API.Node.Update
import Control.Lens (view)
import Data.Aeson
-import Data.Maybe (Maybe(..))
+import Data.Maybe (Maybe(..), fromMaybe)
import Data.Swagger
import GHC.Generics (Generic)
import Gargantext.API.Admin.Orchestrator.Types (JobLog(..), AsyncJobs)
...
@@ -30,7 +30,9 @@ import Gargantext.API.Prelude (GargServer, simuLogs)
import Gargantext.Core.Methods.Distances (GraphMetric(..))
import Gargantext.Core.Types.Main (ListType(..))
import Gargantext.Core.Viz.Graph.API (recomputeGraph)
+import Gargantext.Database.Action.Metrics (updateNgramsOccurrences, updateContextScore)
import Gargantext.Database.Action.Flow.Pairing (pairing)
+import Gargantext.Database.Query.Table.Node (defaultList)
import Gargantext.Database.Action.Flow.Types (FlowCmdM)
import Gargantext.Database.Admin.Types.Node
import Gargantext.Database.Query.Table.Node (getNode)
...
@@ -165,7 +167,40 @@ updateNode _uId lId (UpdateNodeParamsList _mode) logStatus = do
                   }
  _ <- case corpusId of
-    Just cId -> reIndexWith cId lId NgramsTerms (Set.singleton MapTerm)
+    Just cId -> do
+      _ <- reIndexWith cId lId NgramsTerms (Set.singleton MapTerm)
+      _ <- updateNgramsOccurrences cId (Just lId)
+      pure ()
    Nothing -> pure ()

  pure JobLog { _scst_succeeded = Just 3
              , _scst_failed    = Just 0
              , _scst_remaining = Just 0
              , _scst_events    = Just []
              }

updateNode _uId tId (UpdateNodeParamsTexts _mode) logStatus = do
  logStatus JobLog { _scst_succeeded = Just 1
                   , _scst_failed    = Just 0
                   , _scst_remaining = Just 2
                   , _scst_events    = Just []
                   }
  corpusId <- view node_parent_id <$> getNode tId
  lId <- defaultList $ fromMaybe (panic "[G.A.N.Update] updateNode/UpdateNodeParamsTexts: no defaultList") corpusId

  logStatus JobLog { _scst_succeeded = Just 2
                   , _scst_failed    = Just 0
                   , _scst_remaining = Just 1
                   , _scst_events    = Just []
                   }
  _ <- case corpusId of
    Just cId -> do
      _ <- reIndexWith cId lId NgramsTerms (Set.singleton MapTerm)
      _ <- updateNgramsOccurrences cId (Just lId)
      _ <- updateContextScore cId (Just lId)
      -- printDebug "updateContextsScore" (cId, lId, u)
      pure ()
    Nothing -> pure ()

  pure JobLog { _scst_succeeded = Just 3
...
@@ -175,6 +210,9 @@ updateNode _uId lId (UpdateNodeParamsList _mode) logStatus = do
             }

updateNode _uId _nId _p logStatus = do
  simuLogs logStatus 10
...
...
src/Gargantext/Core/NodeStory.hs

@@ -177,8 +177,8 @@ type NodeStoryDir = FilePath
writeNodeStories :: NodeStoryDir -> NodeListStory -> IO ()
writeNodeStories fp nls = do
-  done <- mapM (writeNodeStory fp) $ splitByNode nls
-  printDebug "[writeNodeStories]" done
+  _done <- mapM (writeNodeStory fp) $ splitByNode nls
+  -- printDebug "[writeNodeStories]" done
  pure ()

writeNodeStory :: NodeStoryDir -> (NodeId, NodeListStory) -> IO ()
...
@@ -192,7 +192,7 @@ splitByNode (NodeStory m) =
saverAction' :: Serialise a => NodeStoryDir -> NodeId -> a -> IO ()
saverAction' repoDir nId a = do
  withTempFile repoDir ((cs $ show nId) <> "-tmp-repo.cbor") $ \fp h -> do
-    printDebug "[repoSaverAction]" fp
+    -- printDebug "[repoSaverAction]" fp
    DBL.hPut h $ serialise a
    hClose h
    renameFile fp (nodeStoryPath repoDir nId)
...
...
src/Gargantext/Core/Text/List.hs
...
...
@@ -76,15 +76,16 @@ buildNgramsLists :: ( HasNodeStory env err m
 buildNgramsLists user uCid mCid mfslw gp = do
   ngTerms     <- buildNgramsTermsList user uCid mCid mfslw gp (NgramsTerms, MapListSize 350)
   othersTerms <- mapM (buildNgramsOthersList user uCid mfslw GroupIdentity)
-                      [ (Authors   , MapListSize 9)
-                      , (Sources   , MapListSize 9)
-                      , (Institutes, MapListSize 9)
+                      [ (Authors   , MapListSize 9, MaxListSize 1000)
+                      , (Sources   , MapListSize 9, MaxListSize 1000)
+                      , (Institutes, MapListSize 9, MaxListSize 1000)
                       ]
   pure $ Map.unions $ [ngTerms] <> othersTerms

 data MapListSize = MapListSize { unMapListSize :: !Int }
+data MaxListSize = MaxListSize { unMaxListSize :: !Int }

 buildNgramsOthersList :: ( HasNodeError err
                          , CmdM env err m
...
@@ -95,9 +96,9 @@ buildNgramsOthersList :: ( HasNodeError err
   -> UserCorpusId
   -> Maybe FlowSocialListWith
   -> GroupParams
-  -> (NgramsType, MapListSize)
+  -> (NgramsType, MapListSize, MaxListSize)
   -> m (Map NgramsType [NgramsElement])
-buildNgramsOthersList user uCid mfslw _groupParams (nt, MapListSize mapListSize) = do
+buildNgramsOthersList user uCid mfslw _groupParams (nt, MapListSize mapListSize, MaxListSize maxListSize) = do
   allTerms :: HashMap NgramsTerm (Set NodeId) <- getContextsByNgramsUser uCid nt

   -- PrivateFirst for first developments since Public NodeMode is not implemented yet
...
...
@@ -118,6 +119,7 @@ buildNgramsOthersList user uCid mfslw _groupParams (nt, MapListSize mapListSize)
     listSize = mapListSize - (List.length mapTerms)
     (mapTerms', candiTerms) = both HashMap.fromList
                             $ List.splitAt listSize
+                            $ List.take maxListSize
                             $ List.sortOn (Down . viewScore . snd)
                             $ HashMap.toList tailTerms'
...
...
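The selection pipeline above (sort by descending score, cap the whole list at `maxListSize`, then split the head into map terms and the tail into candidates) can be sketched as a standalone pure function. `splitMapCandidates` is a hypothetical name for illustration; the real code works on `HashMap`s of scored ngrams rather than plain pairs:

```haskell
import Data.List (sortOn)
import Data.Ord (Down(..))

-- Sort by descending score, cap the whole list at maxListSize,
-- then split into (map terms, candidate terms).
splitMapCandidates :: Int -> Int
                   -> [(String, Double)]
                   -> ([(String, Double)], [(String, Double)])
splitMapCandidates listSize maxListSize =
  splitAt listSize . take maxListSize . sortOn (Down . snd)
```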
src/Gargantext/Database/Action/Flow.hs
...
...
@@ -86,6 +86,7 @@ import Gargantext.Database.Action.Flow.Types
 import Gargantext.Database.Action.Flow.Utils (insertDocNgrams, DocumentIdWithNgrams(..))
 import Gargantext.Database.Action.Search (searchDocInDatabase)
 import Gargantext.Database.Admin.Config (userMaster, corpusMasterName)
+import Gargantext.Database.Action.Metrics (updateNgramsOccurrences)
 import Gargantext.Database.Admin.Types.Hyperdata
 import Gargantext.Database.Admin.Types.Node -- (HyperdataDocument(..), NodeType(..), NodeId, UserId, ListId, CorpusId, RootId, MasterCorpusId, MasterUserId)
 import Gargantext.Database.Prelude
...
@@ -281,6 +282,8 @@ flowCorpusUser l user corpusName ctype ids mfslw = do
   -- _ <- mkPhylo  userCorpusId userId
   -- Annuaire Flow
   -- _ <- mkAnnuaire  rootUserId userId
+  _ <- updateNgramsOccurrences userCorpusId (Just listId)
+
   pure userCorpusId
...
...
@@ -320,6 +323,8 @@ saveDocNgramsWith :: ( FlowCmdM env err m)
                   -> m ()
 saveDocNgramsWith lId mapNgramsDocs' = do
   terms2id <- insertExtractedNgrams $ HashMap.keys mapNgramsDocs'
+  printDebug "terms2id" terms2id
+
   let mapNgramsDocs = HashMap.mapKeys extracted2ngrams mapNgramsDocs'

   -- new
...
@@ -327,7 +332,7 @@ saveDocNgramsWith lId mapNgramsDocs' = do
                 $ map (first _ngramsTerms . second Map.keys)
                 $ HashMap.toList mapNgramsDocs
-  -- printDebug "saveDocNgramsWith" mapCgramsId
+  printDebug "saveDocNgramsWith" mapCgramsId
   -- insertDocNgrams
   _return <- insertContextNodeNgrams2
            $ catMaybes [ ContextNodeNgrams2 <$> Just nId
...
src/Gargantext/Database/Action/Metrics.hs
...
...
@@ -10,17 +10,27 @@ Portability : POSIX
 Node API
 -}

 {-# LANGUAGE QuasiQuotes #-}

 module Gargantext.Database.Action.Metrics
   where

+import Database.PostgreSQL.Simple.SqlQQ (sql)
 import Data.HashMap.Strict (HashMap)
+import Data.Map (Map)
+import Data.Set (Set)
+import Database.PostgreSQL.Simple (Query, Only(..))
+import Database.PostgreSQL.Simple.Types (Values(..), QualifiedIdentifier(..))
 import Data.Vector (Vector)
 import Gargantext.Core (HasDBid(toDBid))
 import Gargantext.API.Ngrams.Tools (filterListWithRoot, groupNodesByNgrams, Diagonal(..), getCoocByNgrams, mapTermListRoot, RootTerm, getRepo')
-import Gargantext.API.Ngrams.Types (TabType(..), ngramsTypeFromTabType, NgramsTerm)
-import Gargantext.Core.Text.Metrics (scored, Scored(..), {-localMetrics, toScored-})
-import Gargantext.Database.Prelude (runPGSQuery{-, formatPGSQuery-})
+import Gargantext.API.Ngrams.Types (TabType(..), ngramsTypeFromTabType, NgramsTerm(..))
 import Gargantext.Core.Mail.Types (HasMail)
-import Gargantext.Core.Types (ListType(..), Limit, NodeType(..))
 import Gargantext.Core.NodeStory
+import Gargantext.Core.Text.Metrics (scored, Scored(..), {-localMetrics, toScored-})
+import Database.PostgreSQL.Simple.ToField (toField, Action{-, ToField-})
+import Gargantext.Core.Types (ListType(..), Limit, NodeType(..), ContextId)
+import Gargantext.Database.Prelude (runPGSQuery{-, formatPGSQuery-})
 import Gargantext.Database.Action.Flow.Types (FlowCmdM)
 import Gargantext.Database.Action.Metrics.NgramsByContext (getContextsByNgramsOnlyUser{-, getTficfWith-})
 import Gargantext.Database.Admin.Config (userMaster)
...
@@ -29,6 +39,10 @@ import Gargantext.Database.Query.Table.Node (defaultList)
 import Gargantext.Database.Query.Table.Node.Select
 import Gargantext.Prelude
 import qualified Data.HashMap.Strict as HM
 import qualified Data.Map as Map
+import qualified Data.Set as Set
+import qualified Data.List as List
+import qualified Data.Text as Text

 getMetrics :: FlowCmdM env err m
            => CorpusId -> Maybe ListId -> TabType -> Maybe Limit
...
@@ -46,39 +60,188 @@ getNgramsCooc :: (FlowCmdM env err m)
        , HashMap (NgramsTerm, NgramsTerm) Int
        )
 getNgramsCooc cId maybeListId tabType maybeLimit = do
-  (ngs', ngs) <- getNgrams cId maybeListId tabType
-
-  let take' Nothing  xs = xs
-      take' (Just n) xs = take n xs
-
-  lId  <- defaultList cId
+  lId <- case maybeListId of
+    Nothing   -> defaultList cId
+    Just lId' -> pure lId'
+
+  (ngs', ngs) <- getNgrams lId tabType
+
   lIds <- selectNodesWithUsername NodeList userMaster

   myCooc <- HM.filter (>1) <$> getCoocByNgrams (Diagonal True)
         <$> groupNodesByNgrams ngs
         <$> getContextsByNgramsOnlyUser cId
                                         (lIds <> [lId])
                                         (ngramsTypeFromTabType tabType)
                                         (take' maybeLimit $ HM.keys ngs)
   pure $ (ngs', ngs, myCooc)
------------------------------------------------------------------------
------------------------------------------------------------------------
updateNgramsOccurrences :: (FlowCmdM env err m)
                        => CorpusId -> Maybe ListId
                        -> m ()
updateNgramsOccurrences cId mlId = do
  _ <- mapM (updateNgramsOccurrences' cId mlId Nothing) [Terms, Sources, Authors, Institutes]
  pure ()
-getNgrams :: (HasMail env, HasNodeStory env err m)
-          => CorpusId -> Maybe ListId -> TabType
-          -> m ( HashMap NgramsTerm (ListType, Maybe NgramsTerm)
-               , HashMap NgramsTerm (Maybe RootTerm)
-               )
-getNgrams cId maybeListId tabType = do

updateNgramsOccurrences' :: (FlowCmdM env err m)
                         => CorpusId -> Maybe ListId -> Maybe Limit -> TabType
                         -> m [Int]
updateNgramsOccurrences' cId maybeListId maybeLimit tabType = do
  lId <- case maybeListId of
    Nothing   -> defaultList cId
    Just lId' -> pure lId'

  result <- getNgramsOccurrences cId lId tabType maybeLimit

  let
    toInsert :: [[Action]]
    toInsert = map (\(ngramsTerm, score)
                     -> [ toField cId
                        , toField lId
                        , toField $ unNgramsTerm ngramsTerm
                        , toField $ toDBid $ ngramsTypeFromTabType tabType
                        , toField score
                        ])
             $ HM.toList result

    queryInsert :: Query
    queryInsert = [sql|
      WITH input(corpus_id, list_id, terms, type_id, weight) AS (?)
      INSERT into node_node_ngrams (node1_id, node2_id, ngrams_id, ngrams_type, weight)
      SELECT input.corpus_id, input.list_id, ngrams.id, input.type_id, input.weight FROM input
      JOIN ngrams on ngrams.terms = input.terms
      ON CONFLICT (node1_id, node2_id, ngrams_id, ngrams_type)
      DO UPDATE SET weight = excluded.weight
      RETURNING 1
      |]

  let fields = map (\t -> QualifiedIdentifier Nothing t)
             $ map Text.pack ["int4", "int4", "text", "int4", "int4"]

  map (\(Only a) -> a) <$> runPGSQuery queryInsert (Only $ Values fields toInsert)
------------------------------------------------------------------------
-- Used for scores in Ngrams Table
getNgramsOccurrences :: (FlowCmdM env err m)
                     => CorpusId -> ListId -> TabType -> Maybe Limit
                     -> m (HashMap NgramsTerm Int)
getNgramsOccurrences c l t ml = HM.map Set.size <$> getNgramsContexts c l t ml

getNgramsContexts :: (FlowCmdM env err m)
                  => CorpusId -> ListId -> TabType -> Maybe Limit
                  -> m (HashMap NgramsTerm (Set ContextId))
getNgramsContexts cId lId tabType maybeLimit = do
  (_ngs', ngs) <- getNgrams lId tabType
  lIds <- selectNodesWithUsername NodeList userMaster

  -- TODO maybe add an option to group here
  getContextsByNgramsOnlyUser cId
                              (lIds <> [lId])
                              (ngramsTypeFromTabType tabType)
                              (take' maybeLimit $ HM.keys ngs)
------------------------------------------------------------------------
updateContextScore :: (FlowCmdM env err m)
                   => CorpusId -> Maybe ListId
                   -> m [Int]
updateContextScore cId maybeListId = do
  lId <- case maybeListId of
    Nothing   -> defaultList cId
    Just lId' -> pure lId'

  result <- getContextsNgramsScore cId lId Terms MapTerm Nothing

  let
    toInsert :: [[Action]]
    toInsert = map (\(contextId, score)
                     -> [ toField cId
                        , toField contextId
                        , toField score
                        ])
             $ Map.toList result

    queryInsert :: Query
    queryInsert = [sql|
      WITH input(node_id, context_id, score) AS (?)
      UPDATE nodes_contexts nc
      SET score = input.score
      FROM input
      WHERE nc.node_id = input.node_id
      AND nc.context_id = input.context_id
      RETURNING 1
      |]

  let fields = map (\t -> QualifiedIdentifier Nothing t)
             $ map Text.pack ["int4", "int4", "int4"]

  map (\(Only a) -> a) <$> runPGSQuery queryInsert (Only $ Values fields toInsert)
-- Used for scores in Doc Table
getContextsNgramsScore :: (FlowCmdM env err m)
                       => CorpusId -> ListId -> TabType -> ListType -> Maybe Limit
                       -> m (Map ContextId Int)
getContextsNgramsScore cId lId tabType listType maybeLimit =
  Map.map Set.size <$> getContextsNgrams cId lId tabType listType maybeLimit

getContextsNgrams :: (FlowCmdM env err m)
                  => CorpusId -> ListId -> TabType -> ListType -> Maybe Limit
                  -> m (Map ContextId (Set NgramsTerm))
getContextsNgrams cId lId tabType listType maybeLimit = do
  (ngs', ngs) <- getNgrams lId tabType
  lIds <- selectNodesWithUsername NodeList userMaster

  result <- groupNodesByNgrams ngs <$>
            getContextsByNgramsOnlyUser cId
                                        (lIds <> [lId])
                                        (ngramsTypeFromTabType tabType)
                                        (take' maybeLimit $ HM.keys
                                                          $ HM.filter (\v -> fst v == listType) ngs')
  -- printDebug "getCoocByNgrams" result

  pure $ Map.fromListWith (<>)
       $ List.concat
       $ map (\(ng, contexts) -> List.zip (Set.toList contexts) (List.cycle [Set.singleton ng]))
       $ HM.toList result
------------------------------------------------------------------------
------------------------------------------------------------------------
getNgrams :: (HasMail env, HasNodeStory env err m)
          => ListId -> TabType
          -> m ( HashMap NgramsTerm (ListType, Maybe NgramsTerm)
               , HashMap NgramsTerm (Maybe RootTerm)
               )
getNgrams lId tabType = do
  lists <- mapTermListRoot [lId] (ngramsTypeFromTabType tabType) <$> getRepo' [lId]
  let maybeSyn = HM.unions $ map (\t -> filterListWithRoot t lists)
                             [MapTerm, StopTerm, CandidateTerm]
  pure (lists, maybeSyn)

-- Some useful Tools
take' :: Maybe Int -> [a] -> [a]
take' Nothing  xs = xs
take' (Just n) xs = take n xs
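The last step of `getContextsNgrams` inverts a map from ngram to its set of contexts into a map from context to the set of ngrams that occur in it. A standalone sketch of that inversion (`invertContexts` is a hypothetical name; the real code works on `HashMap`s and uses `zip` against a cycled singleton list, which is equivalent to the comprehension below):

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Turn (ngram -> set of contexts) into (context -> set of ngrams),
-- merging ngram sets for contexts that appear under several ngrams.
invertContexts :: (Ord ctx, Ord ng) => [(ng, Set.Set ctx)] -> Map.Map ctx (Set.Set ng)
invertContexts xs =
  Map.fromListWith (<>)
    [ (c, Set.singleton ng) | (ng, cs) <- xs, c <- Set.toList cs ]
```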
src/Gargantext/Database/Action/Metrics/NgramsByContext.hs
...
...
@@ -16,21 +16,23 @@ Ngrams by node enable contextual metrics.
 module Gargantext.Database.Action.Metrics.NgramsByContext
   where

---- import Debug.Trace (trace)
 --import Data.Map.Strict.Patch (PatchMap, Replace, diff)
 import Data.HashMap.Strict (HashMap)
 import Data.Map (Map)
+import Data.Maybe (catMaybes)
 import Data.Set (Set)
 import Data.Text (Text)
 import Data.Tuple.Extra (first, second, swap)
 import Database.PostgreSQL.Simple.SqlQQ (sql)
 import Database.PostgreSQL.Simple.Types (Values(..), QualifiedIdentifier(..))
+-- import Debug.Trace (trace)
+import Gargantext.Core
 import Gargantext.API.Ngrams.Types (NgramsTerm(..))
-import Gargantext.Core
 import Gargantext.Data.HashMap.Strict.Utils as HM
 import Gargantext.Database.Admin.Types.Node (ListId, CorpusId, NodeId(..), ContextId, MasterCorpusId, NodeType(NodeDocument), UserCorpusId, DocId)
 import Gargantext.Database.Prelude (Cmd, runPGSQuery)
-import Gargantext.Database.Schema.Ngrams (ngramsTypeId, NgramsType(..))
+import Gargantext.Database.Query.Table.Ngrams (selectNgramsId)
+import Gargantext.Database.Schema.Ngrams (ngramsTypeId, NgramsType(..), NgramsId)
 import Gargantext.Prelude
 import qualified Data.HashMap.Strict as HM
 import qualified Data.Map as Map
...
...
@@ -111,37 +113,43 @@ getOccByNgramsOnlyFast' :: CorpusId
                         -> NgramsType
                         -> [NgramsTerm]
                         -> Cmd err (HashMap NgramsTerm Int)
-getOccByNgramsOnlyFast' cId lId nt tms = -- trace (show (cId, lId)) $
-  HM.fromListWith (+) <$> map (second round) <$> run cId lId nt tms
+getOccByNgramsOnlyFast' cId lId nt tms = do -- trace (show (cId, lId)) $
+  mapNgramsIds <- selectNgramsId $ map unNgramsTerm tms
+  HM.fromListWith (+)
+    <$> catMaybes
+    <$> map (\(nId, s) -> (,) <$> (NgramsTerm <$> (Map.lookup nId mapNgramsIds))
+                              <*> (Just $ round s))
+    <$> run cId lId nt (Map.keys mapNgramsIds)
   where
-    fields = [QualifiedIdentifier Nothing "text"]

     run :: CorpusId
         -> ListId
         -> NgramsType
-        -> [NgramsTerm]
-        -> Cmd err [(NgramsTerm, Double)]
-    run cId' lId' nt' tms' = map (first NgramsTerm) <$>
-      runPGSQuery query ( Values fields ((DPS.Only . unNgramsTerm) <$> tms')
+        -> [NgramsId]
+        -> Cmd err [(NgramsId, Double)]
+    run cId' lId' nt' tms' =
+      runPGSQuery query ( Values fields ((DPS.Only) <$> tms')
                         , cId'
                         , lId'
                         , ngramsTypeId nt'
                         )
+    fields = [QualifiedIdentifier Nothing "int4"]

     query :: DPS.Query
     query = [sql|
-      WITH input_rows(terms) AS (?)
-      SELECT ng.terms, nng.weight FROM nodes_contexts nc
-      JOIN node_node_ngrams nng ON nng.node1_id = nc.node_id
-      JOIN ngrams ng ON nng.ngrams_id = ng.id
-      JOIN input_rows ir ON ir.terms = ng.terms
-      WHERE nng.node1_id    = ? -- CorpusId
-        AND nng.node2_id    = ? -- ListId
-        AND nng.ngrams_type = ? -- NgramsTypeId
-        AND nc.category     > 0 -- Not trash
-      GROUP BY ng.terms, nng.weight
+      WITH input_ngrams(id) AS (?)
+      SELECT ngi.id, nng.weight FROM nodes_contexts nc
+      JOIN node_node_ngrams nng ON nng.node1_id = nc.node_id
+      JOIN input_ngrams ngi ON nng.ngrams_id = ngi.id
+      WHERE nng.node1_id    = ?
+        AND nng.node2_id    = ?
+        AND nng.ngrams_type = ?
+        AND nc.category     > 0
+      GROUP BY ngi.id, nng.weight
      |]
selectNgramsOccurrencesOnlyByContextUser_withSample :: HasDBid NodeType
...
...
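The aggregation in `getOccByNgramsOnlyFast'` boils down to rounding each weight and summing the rounded weights per term with `fromListWith (+)`. A dependency-free sketch of that step (`sumOccurrences` is a hypothetical name; the real code uses `HashMap.fromListWith` from unordered-containers, while `containers`' `Map` is used here to keep the sketch self-contained):

```haskell
import qualified Data.Map.Strict as Map

-- Round each weight, then sum the rounded weights per term.
sumOccurrences :: [(String, Double)] -> Map.Map String Int
sumOccurrences = Map.fromListWith (+) . map (fmap round)
```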
src/Gargantext/Database/Action/Search.hs
...
...
@@ -25,8 +25,11 @@ import Gargantext.Database.Query.Facet
 import Gargantext.Database.Query.Filter
 import Gargantext.Database.Query.Join (leftJoin5)
 import Gargantext.Database.Query.Table.Node
+import Gargantext.Database.Query.Table.Context
 import Gargantext.Database.Query.Table.NodeNode
+import Gargantext.Database.Query.Table.NodeContext
 import Gargantext.Database.Schema.Node
+import Gargantext.Database.Schema.Context
 import Gargantext.Prelude
 import Gargantext.Core.Text.Terms.Mono.Stem.En (stemIt)
 import Opaleye hiding (Order)
...
@@ -80,27 +83,27 @@ queryInCorpus :: HasDBid NodeType
               -> Text
               -> O.Select FacetDocRead
 queryInCorpus cId t q = proc () -> do
-  (n, nn) <- joinInCorpus -< ()
-  restrict -< (nn ^. nn_node1_id) .== (toNullable $ pgNodeId cId)
+  (c, nc) <- joinInCorpus -< ()
+  restrict -< (nc ^. nc_node_id) .== (toNullable $ pgNodeId cId)
   restrict -< if t
-    then (nn ^. nn_category) .== (toNullable $ sqlInt4 0)
-    else (nn ^. nn_category) .>= (toNullable $ sqlInt4 1)
-  restrict -< (n ^. ns_search) @@ (sqlTSQuery (unpack q))
-  restrict -< (n ^. ns_typename) .== (sqlInt4 $ toDBid NodeDocument)
-  returnA -< FacetDoc { facetDoc_id         = n ^. ns_id
-                      , facetDoc_created    = n ^. ns_date
-                      , facetDoc_title      = n ^. ns_name
-                      , facetDoc_hyperdata  = n ^. ns_hyperdata
-                      , facetDoc_category   = nn ^. nn_category
-                      , facetDoc_ngramCount = nn ^. nn_score
-                      , facetDoc_score      = nn ^. nn_score
-                      }
+    then (nc ^. nc_category) .== (toNullable $ sqlInt4 0)
+    else (nc ^. nc_category) .>= (toNullable $ sqlInt4 1)
+  restrict -< (c ^. cs_search) @@ (sqlTSQuery (unpack q))
+  restrict -< (c ^. cs_typename) .== (sqlInt4 $ toDBid NodeDocument)
+  returnA -< FacetDoc { facetDoc_id         = c ^. cs_id
+                      , facetDoc_created    = c ^. cs_date
+                      , facetDoc_title      = c ^. cs_name
+                      , facetDoc_hyperdata  = c ^. cs_hyperdata
+                      , facetDoc_category   = nc ^. nc_category
+                      , facetDoc_ngramCount = nc ^. nc_score
+                      , facetDoc_score      = nc ^. nc_score
+                      }

-joinInCorpus :: O.Select (NodeSearchRead, NodeNodeReadNull)
-joinInCorpus = leftJoin queryNodeSearchTable queryNodeNodeTable cond
+joinInCorpus :: O.Select (ContextSearchRead, NodeContextReadNull)
+joinInCorpus = leftJoin queryContextSearchTable queryNodeContextTable cond
   where
-    cond :: (NodeSearchRead, NodeNodeRead) -> Column SqlBool
-    cond (n, nn) = nn ^. nn_node2_id .== _ns_id n
+    cond :: (ContextSearchRead, NodeContextRead) -> Column SqlBool
+    cond (c, nc) = nc ^. nc_context_id .== _cs_id c

 ------------------------------------------------------------------------
 searchInCorpusWithContacts
...
src/Gargantext/Database/Admin/Trigger/ContextNodeNgrams.hs
...
...
@@ -35,7 +35,7 @@ triggerCountInsert = execPGSQuery query (toDBid NodeDocument, toDBid NodeList)
         RETURN NEW;
       END IF;
       IF TG_OP = 'INSERT' THEN
-        INSERT INTO context_node_ngrams (context_id, node_id, ngrams_id, ngrams_type, weight)
+        INSERT INTO node_node_ngrams (node1_id, node2_id, ngrams_id, ngrams_type, weight)
         select n.parent_id, n.id, new0.ngrams_id, new0.ngrams_type, count(*) from NEW as new0
         INNER JOIN contexts n ON n.id = new0.context_id
         INNER JOIN nodes n2 ON n2.id = new0.node_id
...
@@ -43,8 +43,8 @@ triggerCountInsert = execPGSQuery query (toDBid NodeDocument, toDBid NodeList)
         AND n.typename = ?       -- not mandatory
         AND n.parent_id <> n2.id -- not mandatory
         GROUP BY n.parent_id, n.id, new0.ngrams_id, new0.ngrams_type
-        ON CONFLICT (context_id, node_id, ngrams_id, ngrams_type)
-        DO UPDATE set weight = context_node_ngrams.weight + excluded.weight
+        ON CONFLICT (node1_id, node2_id, ngrams_id, ngrams_type)
+        DO UPDATE set weight = node_node_ngrams.weight + excluded.weight
         ;
       END IF;
...
...
src/Gargantext/Database/Admin/Trigger/NodesContexts.hs
...
...
@@ -42,7 +42,7 @@ triggerInsertCount lId = execPGSQuery query (lId, nodeTypeId NodeList)
             , count(*) AS weight
         FROM NEW as new1
           INNER JOIN contexts doc ON doc.id = new1.context_id
-          INNER JOIN nodes lists ON lists.parent_id = lists.parent_id
+          INNER JOIN nodes lists ON lists.parent_id = new1.node_id
           INNER JOIN context_node_ngrams cnn ON cnn.context_id = doc.id
         WHERE lists.id in (?, lists.id)
           AND lists.typename = ?
...
@@ -76,9 +76,9 @@ triggerUpdateAdd lId = execPGSQuery query (lId, nodeTypeId NodeList)
             , cnn.ngrams_type AS ngrams_type
             , count(*) AS fix_count
         FROM NEW as new1
-          INNER JOIN contexts doc ON doc.id = new1.context_id
-          INNER JOIN nodes lists ON new1.node_id = lists.parent_id
-          INNER JOIN context_node_ngrams cnn ON cnn.context_id = doc.id
+          INNER JOIN contexts doc ON doc.id           = new1.context_id
+          INNER JOIN nodes lists ON lists.parent_id   = new1.node_id
+          INNER JOIN context_node_ngrams cnn ON cnn.context_id = doc.id
         WHERE lists.id in (?, lists.id) -- (masterList_id, userLists)
           AND lists.typename = ?
         GROUP BY node1_id, node2_id, ngrams_id, ngrams_type
...
src/Gargantext/Database/Prelude.hs
...
...
@@ -140,6 +140,22 @@ runPGSQuery q a = mkCmd $ \conn -> catch (PGS.query conn q a) (printError conn)
       hPutStrLn stderr q'
       throw (SomeException e)

+{-
+-- TODO
+runPGSQueryFold :: ( CmdM env err m
+                   , PGS.FromRow r
+                   )
+                => PGS.Query -> a -> (a -> r -> IO a) -> m a
+runPGSQueryFold q initialState consume = mkCmd $ \conn -> catch (PGS.fold_ conn initialState consume) (printError conn)
+  where
+    printError c (SomeException e) = do
+      q' <- PGS.formatQuery c q
+      hPutStrLn stderr q'
+      throw (SomeException e)
+-}

 -- | TODO catch error
 runPGSQuery_ :: ( CmdM env err m
                 , PGS.FromRow r
...
src/Gargantext/Database/Query/Facet.hs
...
...
@@ -20,7 +20,7 @@ Portability : POSIX
 module Gargantext.Database.Query.Facet
   ( runViewAuthorsDoc
   , runViewDocuments
-  , viewDocuments'
+  -- , viewDocuments'
   , runCountDocuments
   , filterWith
...
@@ -306,8 +306,7 @@ runViewDocuments cId t o l order query = do
   printDebug "[runViewDocuments] sqlQuery" $ showSql sqlQuery
   runOpaQuery $ filterWith o l order sqlQuery
   where
-    ntId = toDBid NodeDocument
-    sqlQuery = viewDocuments cId t ntId query
+    sqlQuery = viewDocuments cId t (toDBid NodeDocument) query

 runCountDocuments :: HasDBid NodeType => CorpusId -> IsTrash -> Maybe Text -> Cmd err Int
 runCountDocuments cId t mQuery = do
...
...
@@ -331,22 +330,6 @@ viewDocuments cId t ntId mQuery = viewDocumentsQuery cId t ntId mQuery >>> proc
                        , facetDoc_score      = toNullable $ nc ^. nc_score
                        }

-viewDocuments' :: CorpusId
-               -> IsTrash
-               -> NodeTypeId
-               -> Maybe Text
-               -> Select NodeRead
-viewDocuments' cId t ntId mQuery = viewDocumentsQuery cId t ntId mQuery >>> proc (c, _nc) -> do
-  returnA -< Node { _node_id        = _cs_id c
-                  , _node_hash_id   = ""
-                  , _node_typename  = _cs_typename c
-                  , _node_user_id   = _cs_user_id c
-                  , _node_parent_id = -1
-                  , _node_name      = _cs_name c
-                  , _node_date      = _cs_date c
-                  , _node_hyperdata = _cs_hyperdata c
-                  }

 viewDocumentsQuery :: CorpusId
                    -> IsTrash
                    -> NodeTypeId
...
...
src/Gargantext/Database/Query/Table/Ngrams.hs
...
...
@@ -18,27 +18,29 @@ module Gargantext.Database.Query.Table.Ngrams
   , queryNgramsTable
   , selectNgramsByDoc
   , insertNgrams
+  , selectNgramsId
   )
   where

 import Control.Lens ((^.))
 import Data.ByteString.Internal (ByteString)
 import Data.HashMap.Strict (HashMap)
+import Data.Map (Map)
 import Data.Text (Text)
-import qualified Data.HashMap.Strict as HashMap
-import qualified Data.List as List
-import qualified Database.PostgreSQL.Simple as PGS
 import Gargantext.Core.Types
 import Gargantext.Database.Prelude (runOpaQuery, Cmd, formatPGSQuery, runPGSQuery)
 import Gargantext.Database.Query.Join (leftJoin3)
 import Gargantext.Database.Query.Table.ContextNodeNgrams2
+import Gargantext.Database.Query.Table.NodeNgrams (queryNodeNgramsTable)
 import Gargantext.Database.Schema.Ngrams
 import Gargantext.Database.Schema.NodeNgrams
-import Gargantext.Database.Query.Table.NodeNgrams (queryNodeNgramsTable)
 import Gargantext.Database.Schema.Prelude
 import Gargantext.Database.Types
 import Gargantext.Prelude
+import qualified Data.HashMap.Strict as HashMap
+import qualified Data.List as List
+import qualified Data.Map as Map
+import qualified Database.PostgreSQL.Simple as PGS

 queryNgramsTable :: Select NgramsRead
 queryNgramsTable = selectTable ngramsTable
...
...
@@ -106,3 +108,28 @@ queryInsertNgrams = [sql|
     FROM   input_rows
     JOIN   ngrams c USING (terms);     -- columns of unique index
     |]

+--------------------------------------------------------------------------
+selectNgramsId :: [Text] -> Cmd err (Map NgramsId Text)
+selectNgramsId ns =
+  if List.null ns
+     then pure Map.empty
+     else Map.fromList <$> map (\(Indexed i t) -> (i, t)) <$> (selectNgramsId' ns)
+  where
+    selectNgramsId' :: [Text] -> Cmd err [Indexed Int Text]
+    selectNgramsId' ns' = runPGSQuery querySelectNgramsId
+                            (PGS.Only $ Values fields ns')
+      where
+        fields = map (\t -> QualifiedIdentifier Nothing t) ["text"]
+
+    querySelectNgramsId :: PGS.Query
+    querySelectNgramsId = [sql|
+        WITH input_rows(terms) AS (?)
+        SELECT n.id, n.terms
+        FROM ngrams n
+        JOIN input_rows ir ON ir.terms = n.terms
+        GROUP BY n.terms, n.id
+        |]
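`selectNgramsId`'s post-processing simply turns the `(id, terms)` rows into a map keyed by ngrams id. A standalone sketch of that step (`Indexed` and `toIdMap` below are simplified stand-ins for `Gargantext.Database.Types.Indexed` and the inline lambda, introduced only for illustration):

```haskell
import qualified Data.Map.Strict as Map

-- Simplified stand-in for Gargantext.Database.Types.Indexed.
data Indexed i a = Indexed i a

-- Key each indexed row by its id.
toIdMap :: Ord i => [Indexed i a] -> Map.Map i a
toIdMap = Map.fromList . map (\(Indexed i t) -> (i, t))
```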
src/Gargantext/Database/Query/Table/Node/Document/Add.hs
...
...
@@ -48,14 +48,14 @@ add_debug pId ns = formatPGSQuery queryAdd (Only $ Values fields inputData)
 -- | Input Tables: types of the tables
 inputSqlTypes :: [Text]
-inputSqlTypes = ["int4","int4","int4"]
+inputSqlTypes = ["int4","int4","int4","int4"]

 -- | SQL query to add documents
 -- TODO return id of added documents only
 queryAdd :: Query
 queryAdd = [sql|
-  WITH input_rows(node_id,context_id,category) AS (?)
-  INSERT INTO nodes_contexts (node_id, context_id,category)
+  WITH input_rows(node_id,context_id,score,category) AS (?)
+  INSERT INTO nodes_contexts (node_id, context_id,score,category)
   SELECT * FROM input_rows
   ON CONFLICT (node_id, context_id) DO NOTHING -- on unique index
   RETURNING 1
...
@@ -75,6 +75,7 @@ data InputData = InputData { inNode_id :: NodeId
 instance ToRow InputData where
   toRow inputData = [ toField (inNode_id    inputData)
                     , toField (inContext_id inputData)
+                    , toField (0 :: Int)
                     , toField (1 :: Int)
                     ]
src/Gargantext/Database/Query/Table/NodeNodeNgrams.hs
...
...
@@ -54,3 +54,6 @@ insertNodeNodeNgramsW nnnw =
                              , iOnConflict = (Just DoNothing)
                              })
src/Gargantext/Database/Schema/Ngrams.hs
...
...
@@ -11,14 +11,16 @@ Ngrams connection to the Database.
 -}

 {-# LANGUAGE Arrows #-}
 {-# LANGUAGE FunctionalDependencies #-}
-{-# OPTIONS_GHC -fno-warn-orphans #-}
+{-# LANGUAGE QuasiQuotes #-}
+{-# LANGUAGE TemplateHaskell #-}

 module Gargantext.Database.Schema.Ngrams
   where

+import Data.Maybe (fromMaybe)
 import Data.HashMap.Strict (HashMap)
 import Data.Hashable (Hashable)
 import Codec.Serialise (Serialise())
...
...
@@ -32,6 +34,7 @@ import Gargantext.Core.Types (TODO(..), Typed(..))
 import Gargantext.Prelude
 import Servant (FromHttpApiData(..), Proxy(..), ToHttpApiData(..))
 import Text.Read (read)
 import Gargantext.Core (HasDBid(..))
+import Gargantext.Database.Types
 import Gargantext.Database.Schema.Prelude
 import qualified Database.PostgreSQL.Simple as PGS
...
@@ -82,6 +85,7 @@ data NgramsType = Authors | Institutes | Sources | NgramsTerms
 instance Serialise NgramsType

 ngramsTypes :: [NgramsType]
 ngramsTypes = [minBound..]
...
...
@@ -141,6 +145,16 @@ fromNgramsTypeId id = lookup id
                                  | nt <- [minBound .. maxBound] :: [NgramsType]
                                  ]

+unNgramsTypeId :: NgramsTypeId -> Int
+unNgramsTypeId (NgramsTypeId i) = i
+
+toNgramsTypeId :: Int -> NgramsTypeId
+toNgramsTypeId i = NgramsTypeId i
+
+instance HasDBid NgramsType where
+  toDBid   = unNgramsTypeId . ngramsTypeId
+  fromDBid = fromMaybe (panic "NgramsType id not indexed") . fromNgramsTypeId . toNgramsTypeId
+
 ------------------------------------------------------------------------
 ------------------------------------------------------------------------
 -- | TODO put it in Gargantext.Core.Text.Ngrams
------------------------------------------------------------------------
-- | TODO put it in Gargantext.Core.Text.Ngrams
...
...
@@ -160,6 +174,9 @@ instance FromField Ngrams where
       x <- fromField fld mdata
       pure $ text2ngrams x

+instance PGS.ToRow Text where
+  toRow t = [toField t]
+
 text2ngrams :: Text -> Ngrams
 text2ngrams txt = UnsafeNgrams txt' $ length $ splitOn " " txt'
   where
stack.yaml
...
...
@@ -57,7 +57,7 @@ extra-deps:
 - git: https://github.com/delanoe/servant-static-th.git
   commit: 8cb8aaf2962ad44d319fcea48442e4397b3c49e8
 - git: https://github.com/alpmestan/ekg-json.git
-  commit: c7bde4851a7cd41b3f3debf0c57f11bbcb11d698
+  commit: fd7e5d7325939103cd87d0dc592faf644160341c

 # Databases libs
 - git: https://github.com/delanoe/haskell-opaleye.git
...