Skip to content
Projects
Groups
Snippets
Help
Loading...
Help
Submit feedback
Contribute to GitLab
Sign in
Toggle navigation
haskell-gargantext
Project
Project
Details
Activity
Releases
Cycle Analytics
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Charts
Issues
153
Issues
153
List
Board
Labels
Milestones
Merge Requests
12
Merge Requests
12
CI / CD
CI / CD
Pipelines
Jobs
Schedules
Charts
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Charts
Create a new issue
Jobs
Commits
Issue Boards
Open sidebar
gargantext
haskell-gargantext
Commits
dab6f378
Commit
dab6f378
authored
Dec 30, 2020
by
Alexandre Delanoë
Browse files
Options
Browse Files
Download
Plain Diff
[FIX] merge
parents
d3d2b646
804f9027
Pipeline
#1323
failed with stage
Changes
6
Pipelines
1
Hide whitespace changes
Inline
Side-by-side
Showing
6 changed files
with
219 additions
and
56 deletions
+219
-56
package.yaml
package.yaml
+2
-0
pinned-20.09.nix
pinned-20.09.nix
+0
-1
Distributional.hs
...ntext/Core/Methods/Distances/Accelerate/Distributional.hs
+93
-15
Utils.hs
src/Gargantext/Core/Methods/Matrix/Accelerate/Utils.hs
+108
-34
API.hs
src/Gargantext/Core/Viz/Graph/API.hs
+4
-3
stack.yaml
stack.yaml
+12
-3
No files found.
package.yaml
View file @
dab6f378
...
...
@@ -108,6 +108,8 @@ library:
-
SHA
-
Unique
-
accelerate
-
accelerate-utility
-
accelerate-arithmetic
-
aeson
-
aeson-lens
-
aeson-pretty
...
...
pinned-20.09.nix
View file @
dab6f378
...
...
@@ -4,7 +4,6 @@ import (builtins.fetchGit {
# Descriptive name to make the store path easier to identify
name
=
"nixos-20.09"
;
url
=
"https://github.com/nixos/nixpkgs/"
;
# Last commit hash for nixos-unstable
# `git ls-remote https://github.com/nixos/nixpkgs-channels nixos-20.09`
ref
=
"refs/heads/nixos-20.09"
;
rev
=
"19db3e5ea2777daa874563b5986288151f502e27"
;
...
...
src/Gargantext/Core/Methods/Distances/Accelerate/Distributional.hs
View file @
dab6f378
{-|
Module : Gargantext.Core.Methods.Distances.Accelerate.Distributional
Description :
Description :
Copyright : (c) CNRS, 2017-Present
License : AGPL + CECILL v3
Maintainer : team@gargantext.org
Stability : experimental
Portability : POSIX
This module aims at implementig distances of terms context by context is
the same referential of corpus.
Implementation use Accelerate library which enables GPU and CPU computation
See Gargantext.Core.Methods.Graph.Accelerate)
* Distributional Distance metric
__Definition :__ Distributional metric is a relative metric which depends on the
selected list, it represents structural equivalence of mutual information.
__Objective :__ We want to compute with matrices processing the similarity between term $i$ and term $j$ :
distr(i,j)=$\frac{\Sigma_{k \neq i,j} min(\frac{n_{ik}^2}{n_{ii}n_{kk}},\frac{n_{jk}^2}{n_{jj}n_{kk}})}{\Sigma_{k \neq i}\frac{n_{ik}^2}{ n_{ii}n_{kk}}}$
where $n_{ij}$ is the cooccurrence between term $i$ and term $j$
* For a vector V=[$x_1$ ... $x_n$], we note $|V|_1=\Sigma_ix_i$
* operator : .* and ./ cell by cell multiplication and division of the matrix
* operator * is the matrix multiplication
* Matrice M=[$n_{ij}$]$_{i,j}$
* opérateur : Diag(M)=[$n_{ii}$]$_i$ (vecteur)
* Id= identity matrix
* O=[1]$_{i,j}$ (matrice one)
* D(M)=Id .* M
* O * D(M) =[$n_{jj}$]$_{i,j}$
* D(M) * O =[$n_{ii}$]$_{i,j}$
* $V_i=[0~0~0~1~0~0~0]'$ en i
* MI=(M ./ O * D(M)) .* (M / D(M) * O )
* distr(i,j)=$\frac{|min(V'_i * (MI-D(MI)),V'_j * (MI-D(MI)))|_1}{|V'_i.(MI-D(MI))|_1}$
[Specifications written by David Chavalarias on Garg v4 shared NodeWrite, team Pyremiel 2020]
-}
...
...
@@ -30,15 +50,72 @@ import Data.Array.Accelerate.Interpreter (run)
import
Gargantext.Core.Methods.Matrix.Accelerate.Utils
import
qualified
Gargantext.Prelude
as
P
-- | `distributional m` returns the distributional distance between terms each
-- pair of terms as a matrix. The argument m is the matrix $[n_{ij}]_{i,j}$
-- where $n_{ij}$ is the coocccurrence between term $i$ and term $j$.
--
-- ## Basic example with Matrix of size 3:
--
-- >>> theMatrixInt 3
-- Matrix (Z :. 3 :. 3)
-- [ 7, 4, 0,
-- 4, 5, 3,
-- 0, 3, 4]
--
-- >>> distributional $ theMatrixInt 3
-- Matrix (Z :. 3 :. 3)
-- [ 1.0, 0.0, 0.9843749999999999,
-- 0.0, 1.0, 0.0,
-- 1.0, 0.0, 1.0]
--
-- ## Basic example with Matrix of size 4:
--
-- >>> theMatrixInt 4
-- Matrix (Z :. 4 :. 4)
-- [ 4, 1, 2, 1,
-- 1, 4, 0, 0,
-- 2, 0, 3, 3,
-- 1, 0, 3, 3]
--
-- >>> distributional $ theMatrixInt 4
-- Matrix (Z :. 4 :. 4)
-- [ 1.0, 0.0, 0.5714285714285715, 0.8421052631578947,
-- 0.0, 1.0, 1.0, 1.0,
-- 8.333333333333333e-2, 4.6875e-2, 1.0, 0.25,
-- 0.3333333333333333, 5.7692307692307696e-2, 1.0, 1.0]
--
distributional
::
Matrix
Int
->
Matrix
Double
distributional
m'
=
run
result
where
m
=
map
fromIntegral
$
use
m'
n
=
dim
m'
diag_m
=
diag
m
d_1
=
replicate
(
constant
(
Z
:.
n
:.
All
))
diag_m
d_2
=
replicate
(
constant
(
Z
:.
All
:.
n
))
diag_m
mi
=
(
.*
)
((
./
)
m
d_1
)
((
./
)
m
d_2
)
-- w = (.-) mi d_mi
-- The matrix permutations is taken care of below by directly replicating
-- the matrix mi, making the matrix w unneccessary and saving one step.
w_1
=
replicate
(
constant
(
Z
:.
All
:.
n
:.
All
))
mi
w_2
=
replicate
(
constant
(
Z
:.
n
:.
All
:.
All
))
mi
w'
=
zipWith
min
w_1
w_2
-- The matrix ii = [r_{i,j,k}]_{i,j,k} has r_(i,j,k) = 0 if k = i OR k = j
-- and r_(i,j,k) = 1 otherwise (i.e. k /= i AND k /= j).
ii
=
generate
(
constant
(
Z
:.
n
:.
n
:.
n
))
(
lift1
(
\
(
Z
:.
i
:.
j
:.
k
)
->
cond
((
&&
)
((
/=
)
k
i
)
((
/=
)
k
j
))
1
0
))
z_1
=
sum
((
.*
)
w'
ii
)
z_2
=
sum
((
.*
)
w_1
ii
)
result
=
termDivNan
z_1
z_2
-- * Metrics of proximity
-----------------------------------------------------------------------
-- ** Distributional Distance
-- | Distributional Distance metric
--
-- Distributional metric is a relative metric which depends on the
-- selected list, it represents structural equivalence of mutual information.
--
-- The distributional metric P(c) of @i@ and @j@ terms is: \[
-- S_{MI} = \frac {\sum_{k \neq i,j ; MI_{ik} >0}^{} \min(MI_{ik},
...
...
@@ -59,8 +136,9 @@ import qualified Gargantext.Prelude as P
-- Total cooccurrences of terms given a map list of size @m@
-- \[N_{m} = \sum_{i,i \neq i}^{m} \sum_{j, j \neq j}^{m} S_{ij}\]
--
distributional
::
Matrix
Int
->
Matrix
Double
distributional
m
=
-- run {- $ matMiniMax -}
distributional''
::
Matrix
Int
->
Matrix
Double
distributional''
m
=
-- run {- $ matMiniMax -}
run
$
diagNull
n
$
rIJ
n
$
filterWith
0
100
...
...
@@ -107,6 +185,6 @@ rIJ n m = matMiniMax $ divide a b
-- | Test perfermance with this matrix
-- TODO : add this in a benchmark folder
distriTest
::
Int
->
Matrix
Double
distriTest
n
=
distributional
(
theMatrix
n
)
distriTest
n
=
distributional
(
theMatrix
Int
n
)
src/Gargantext/Core/Methods/Matrix/Accelerate/Utils.hs
View file @
dab6f378
...
...
@@ -36,9 +36,96 @@ import Data.Array.Accelerate
import
Data.Array.Accelerate.Interpreter
(
run
)
import
qualified
Gargantext.Prelude
as
P
-- | Matrix cell by cell multiplication
(
.*
)
::
(
Shape
ix
,
Slice
ix
,
Elt
a
,
P
.
Num
(
Exp
a
)
)
=>
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
->
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
->
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
(
.*
)
=
zipWith
(
*
)
(
./
)
::
(
Shape
ix
,
Slice
ix
,
Elt
a
,
P
.
Num
(
Exp
a
)
,
P
.
Fractional
(
Exp
a
)
)
=>
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
->
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
->
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
(
./
)
=
zipWith
(
/
)
-- | Term by term division where divisions by 0 produce 0 rather than NaN.
termDivNan
::
(
Shape
ix
,
Slice
ix
,
Elt
a
,
Eq
a
,
P
.
Num
(
Exp
a
)
,
P
.
Fractional
(
Exp
a
)
)
=>
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
->
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
->
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
termDivNan
=
zipWith
(
\
i
j
->
cond
((
==
)
j
0
)
0
((
/
)
i
j
))
(
.-
)
::
(
Shape
ix
,
Slice
ix
,
Elt
a
,
P
.
Num
(
Exp
a
)
,
P
.
Fractional
(
Exp
a
)
)
=>
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
->
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
->
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
(
.-
)
=
zipWith
(
-
)
(
.+
)
::
(
Shape
ix
,
Slice
ix
,
Elt
a
,
P
.
Num
(
Exp
a
)
,
P
.
Fractional
(
Exp
a
)
)
=>
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
->
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
->
Acc
(
Array
((
ix
:.
Int
)
:.
Int
)
a
)
(
.+
)
=
zipWith
(
+
)
-----------------------------------------------------------------------
runExp
::
Elt
e
=>
Exp
e
->
e
runExp
e
=
indexArray
(
run
(
unit
e
))
Z
matrixOne
::
Num
a
=>
Dim
->
Acc
(
Matrix
a
)
matrixOne
n'
=
ones
where
ones
=
fill
(
index2
n
n
)
1
n
=
constant
n'
matrixIdentity
::
Num
a
=>
Dim
->
Acc
(
Matrix
a
)
matrixIdentity
n'
=
let
zeros
=
fill
(
index2
n
n
)
0
ones
=
fill
(
index1
n
)
1
n
=
constant
n'
in
permute
const
zeros
(
\
(
unindex1
->
i
)
->
index2
i
i
)
ones
matrixEye
::
Num
a
=>
Dim
->
Acc
(
Matrix
a
)
matrixEye
n'
=
let
ones
=
fill
(
index2
n
n
)
1
zeros
=
fill
(
index1
n
)
0
n
=
constant
n'
in
permute
const
ones
(
\
(
unindex1
->
i
)
->
index2
i
i
)
zeros
diagNull
::
Num
a
=>
Dim
->
Acc
(
Matrix
a
)
->
Acc
(
Matrix
a
)
diagNull
n
m
=
zipWith
(
*
)
m
(
matrixEye
n
)
-----------------------------------------------------------------------
_runExp
::
Elt
e
=>
Exp
e
->
e
_runExp
e
=
indexArray
(
run
(
unit
e
))
Z
-----------------------------------------------------------------------
-- | Define a vector
...
...
@@ -89,10 +176,10 @@ dim m = n
-- [ 12.0, 15.0, 18.0,
-- 12.0, 15.0, 18.0,
-- 12.0, 15.0, 18.0]
matSumCol
::
Dim
->
Acc
(
Matrix
Double
)
->
Acc
(
Matrix
Double
)
matSumCol
::
(
Elt
a
,
P
.
Num
(
Exp
a
))
=>
Dim
->
Acc
(
Matrix
a
)
->
Acc
(
Matrix
a
)
matSumCol
r
mat
=
replicate
(
constant
(
Z
:.
(
r
::
Int
)
:.
All
))
$
sum
$
transpose
mat
matSumCol'
::
Matrix
Double
->
Matrix
Double
matSumCol'
::
(
Elt
a
,
P
.
Num
(
Exp
a
))
=>
Matrix
a
->
Matrix
a
matSumCol'
m
=
run
$
matSumCol
n
m'
where
n
=
dim
m
...
...
@@ -164,23 +251,10 @@ filterWith' :: (Elt a, Ord a) => Exp a -> Exp a -> Acc (Matrix a) -> Acc (Matrix
filterWith'
t
v
m
=
map
(
\
x
->
ifThenElse
(
x
>
t
)
x
v
)
m
-- run $ (identityMatrix (DAA.constant (10::Int)) :: DAA.Acc (DAA.Matrix Int)) Matrix (Z :. 10 :. 10)
identityMatrix
::
Num
a
=>
Exp
Int
->
Acc
(
Matrix
a
)
identityMatrix
n
=
let
zeros
=
fill
(
index2
n
n
)
0
ones
=
fill
(
index1
n
)
1
in
permute
const
zeros
(
\
(
unindex1
->
i
)
->
index2
i
i
)
ones
------------------------------------------------------------------------
------------------------------------------------------------------------
eyeMatrix
::
Num
a
=>
Dim
->
Acc
(
Matrix
a
)
eyeMatrix
n'
=
let
ones
=
fill
(
index2
n
n
)
1
zeros
=
fill
(
index1
n
)
0
n
=
constant
n'
in
permute
const
ones
(
\
(
unindex1
->
i
)
->
index2
i
i
)
zeros
-- | TODO use Lenses
data
Direction
=
MatCol
(
Exp
Int
)
|
MatRow
(
Exp
Int
)
|
Diag
...
...
@@ -259,11 +333,6 @@ selfMatrix' m' = run $ selfMatrix n
m = use m'
-}
-------------------------------------------------
diagNull
::
Num
a
=>
Dim
->
Acc
(
Matrix
a
)
->
Acc
(
Matrix
a
)
diagNull
n
m
=
zipWith
(
*
)
m
eye
where
eye
=
eyeMatrix
n
-------------------------------------------------
crossProduct
::
Dim
->
Acc
(
Matrix
Double
)
->
Acc
(
Matrix
Double
)
crossProduct
n
m
=
{-trace (P.show (run m',run m'')) $-}
zipWith
(
*
)
m'
m''
...
...
@@ -296,7 +365,7 @@ cross' :: Matrix Double -> Matrix Double
cross'
mat
=
run
$
cross
n
mat'
where
mat'
=
use
mat
n
=
dim
mat
n
=
dim
mat
{-
...
...
@@ -313,23 +382,28 @@ p_ m = zipWith (/) m (n_ m)
) m
-}
theMatrix
::
Int
->
Matrix
Int
theMatrix
n
=
matrix
n
(
dataMatrix
n
)
theMatrixDouble
::
Int
->
Matrix
Double
theMatrixDouble
n
=
run
$
map
fromIntegral
(
use
$
theMatrixInt
n
)
theMatrixInt
::
Int
->
Matrix
Int
theMatrixInt
n
=
matrix
n
(
dataMatrix
n
)
where
dataMatrix
::
Int
->
[
Int
]
dataMatrix
x
|
(
P
.==
)
x
2
=
[
1
,
1
,
1
,
2
]
|
(
P
.==
)
x
3
=
[
1
,
1
,
2
,
1
,
2
,
3
,
2
,
3
,
4
|
(
P
.==
)
x
3
=
[
7
,
4
,
0
,
4
,
5
,
3
,
0
,
3
,
4
]
|
(
P
.==
)
x
4
=
[
1
,
1
,
2
,
3
,
1
,
2
,
3
,
4
,
2
,
3
,
4
,
5
,
3
,
4
,
5
,
6
|
(
P
.==
)
x
4
=
[
4
,
1
,
2
,
1
,
1
,
4
,
0
,
0
,
2
,
0
,
3
,
3
,
1
,
0
,
3
,
3
]
|
P
.
otherwise
=
P
.
undefined
{-
...
...
src/Gargantext/Core/Viz/Graph/API.hs
View file @
dab6f378
...
...
@@ -91,7 +91,8 @@ getGraph _uId nId = do
-- TODO Distance in Graph params
case
graph
of
Nothing
->
do
graph'
<-
computeGraph
cId
Conditional
NgramsTerms
repo
graph'
<-
computeGraph
cId
Distributional
NgramsTerms
repo
-- graph' <- computeGraph cId Conditional NgramsTerms repo
mt
<-
defaultGraphMetadata
cId
"Title"
repo
let
graph''
=
set
graph_metadata
(
Just
mt
)
graph'
let
hg
=
HyperdataGraphAPI
graph''
camera
...
...
@@ -204,7 +205,7 @@ graphRecompute u n logStatus = do
,
_scst_remaining
=
Just
1
,
_scst_events
=
Just
[]
}
_g
<-
trace
(
show
u
)
$
recomputeGraph
u
n
Conditional
_g
<-
trace
(
show
u
)
$
recomputeGraph
u
n
Distributional
--
Conditional
pure
JobLog
{
_scst_succeeded
=
Just
1
,
_scst_failed
=
Just
0
,
_scst_remaining
=
Just
0
...
...
@@ -239,7 +240,7 @@ graphVersions nId = do
,
gv_repo
=
v
}
recomputeVersions
::
UserId
->
NodeId
->
GargNoServer
Graph
recomputeVersions
uId
nId
=
recomputeGraph
uId
nId
Conditional
recomputeVersions
uId
nId
=
recomputeGraph
uId
nId
Distributional
--
Conditional
------------------------------------------------------------
graphClone
::
UserId
...
...
stack.yaml
View file @
dab6f378
...
...
@@ -7,6 +7,8 @@ packages:
#- 'deps/patches-map'
#- 'deps/servant-job'
#- 'deps/clustering-louvain'
#- 'deps/accelerate'
#- 'deps/accelerate-utility'
docker
:
enable
:
false
...
...
@@ -20,6 +22,7 @@ nix:
shell-file
:
build-shell.nix
allow-newer
:
true
extra-deps
:
# Data Mining Libs
...
...
@@ -71,10 +74,18 @@ extra-deps:
-
git
:
https://gitlab.iscpif.fr/gargantext/clustering-louvain.git
commit
:
7d74f96dfea8e51fbab1793cc0429b2fe741f73d
# Accelerate Linear Algebra and specific instances
# (UndecidableInstances for newer GHC version)
-
git
:
https://gitlab.iscpif.fr/anoe/accelerate.git
commit
:
f5c0e0071ec7b6532f9a9cd3eb33d14f340fbcc9
-
git
:
https://gitlab.iscpif.fr/anoe/accelerate-utility.git
commit
:
83ada76e78ac10d9559af8ed6bd4064ec81308e4
-
accelerate-arithmetic-1.0.0.1@sha256:555639232aa5cad411e89247b27871d09352b987a754230a288c690b6de6d888,2096
# Others dependencies (with stack resolver)
-
KMP-0.2.0.0@sha256:6dfbac03ef00ebd9347234732cb86a40f62ab5a80c0cc6bedb8eb51766f7df28,2562
-
Unique-0.4.7.7@sha256:2269d3528271e25d34542e7c24a4e541e27ec33460e1ea00845da95b82eec6fa,2777
-
accelerate-1.2.0.1@sha256:bb1928efe602545df4043692916ed427c959110cbd678d03c3f9c3be25d1ae88,20112
-
dependent-sum-0.4@sha256:40c705604f52374fb72616e10234635104a626ede737ddde899777b719df120b,1907
-
duckling-0.1.6.1@sha256:dab60953f405b45fe93e1e745f8cc83e5166e1788b1f4999cc06382e131153d8,47147
-
fclabels-2.0.4@sha256:efcc20c6c903d0a59e36eb1cb547a7bbbbba93b6e20b84b06e919c350891beb2,4492
-
full-text-search-0.2.1.4@sha256:81f6df3327e5b604f99b15e78635e5d6ca996e504c21d268a6d751d7d131aa36,6032
...
...
@@ -91,6 +102,4 @@ extra-deps:
-
smtp-mail-0.2.0.0@sha256:b91c81f6dbb41a9ceee8c443385118684ecec55006b77f7d3c0e49cffd2468cf,1211
-
stemmer-0.5.2@sha256:823aec56249ec2619f60a2c0d1384b732894dbbbe642856d337ebfe9629a0efd,4082
-
xmlbf-0.6.1@sha256:57867fcb39e0514d17b3328ff5de8d241a18482fc89bb742d9ed820a6a2a5187,1540
-
dependent-sum-0.4@sha256:40c705604f52374fb72616e10234635104a626ede737ddde899777b719df120b,1907
-
xmlbf-xeno-0.2@sha256:39f70fced6052524c290cf595f114661c721452e65fc3e0953a44e7682a6a6b0,950
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment