Skip to content
Projects
Groups
Snippets
Help
Loading...
Help
Submit feedback
Contribute to GitLab
Sign in
Toggle navigation
haskell-gargantext
Project
Project
Details
Activity
Releases
Cycle Analytics
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Charts
Issues
147
Issues
147
List
Board
Labels
Milestones
Merge Requests
6
Merge Requests
6
CI / CD
CI / CD
Pipelines
Jobs
Schedules
Charts
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Charts
Create a new issue
Jobs
Commits
Issue Boards
Open sidebar
gargantext
haskell-gargantext
Commits
d7fd1875
Commit
d7fd1875
authored
May 02, 2018
by
Alexandre Delanoë
Browse files
Options
Browse Files
Download
Email Patches
Plain Diff
[FEAT/Distances] Accelerate lib for GPU: conditional and distributional. Needs behavioral tests.
parent
46330fdd
Changes
9
Hide whitespace changes
Inline
Side-by-side
Showing
9 changed files
with
215 additions
and
94 deletions
+215
-94
Main.hs
app/Main.hs
+6
-2
package.yaml
package.yaml
+2
-0
Conditional.hs
src/Gargantext/Graph/Distances/Conditional.hs
+46
-24
Distributional.hs
src/Gargantext/Graph/Distances/Distributional.hs
+2
-5
Matrice.hs
src/Gargantext/Graph/Distances/Matrice.hs
+108
-0
Utils.hs
src/Gargantext/Graph/Utils.hs
+4
-20
Parsers.hs
src/Gargantext/Parsers.hs
+32
-32
Prelude.hs
src/Gargantext/Prelude.hs
+4
-3
stack.yaml
stack.yaml
+11
-8
No files found.
app/Main.hs
View file @
d7fd1875
...
...
@@ -26,9 +26,12 @@ import Data.Text (unpack)
import
Gargantext.Prelude
import
Gargantext.API
(
startGargantext
,
startGargantextMock
)
--------------------------------------------------------
-- Tests
import
qualified
Gargantext.Graph.Utils
as
U
import
qualified
Gargantext.Graph.Distances.Conditional
as
C
import
qualified
Gargantext.Graph.Distances.Distributional
as
D
import
qualified
Gargantext.Graph.Distances.Matrice
as
M
--------------------------------------------------------
data
Mode
=
Dev
|
Mock
|
Prod
...
...
@@ -48,8 +51,9 @@ instance ParseRecord (MyOptions Wrapped)
deriving
instance
Show
(
MyOptions
Unwrapped
)
main
::
IO
()
main
=
putStrLn
$
show
$
C
.
conditional
U
.
m1
--main = putStrLn $ show $ map show $ take 10 $ D.distributional U.m1
main
=
do
putStrLn
$
show
$
M
.
conditional
$
M
.
myMat
10
--putStrLn $ show $ M.size' $ M.myMat 100
main'
::
IO
()
main'
=
do
...
...
package.yaml
View file @
d7fd1875
...
...
@@ -37,6 +37,7 @@ library:
-
Gargantext.Graph.Utils
-
Gargantext.Graph.Distances.Conditional
-
Gargantext.Graph.Distances.Distributional
-
Gargantext.Graph.Distances.Matrice
-
Gargantext.Ngrams
-
Gargantext.Ngrams.Analysis
-
Gargantext.Ngrams.TFICF
...
...
@@ -61,6 +62,7 @@ library:
-
Gargantext.Utils.Prefix
dependencies
:
-
QuickCheck
-
accelerate
-
aeson
-
aeson-lens
-
aeson-pretty
...
...
src/Gargantext/Graph/Distances/Conditional.hs
View file @
d7fd1875
...
...
@@ -34,10 +34,55 @@ import qualified Data.Vector as V
import
Gargantext.Prelude
import
Gargantext.Graph.Utils
------------------------------------------------------------------------
------------------------------------------------------------------------
-- | Optimisation issue
toBeOptimized
::
(
Num
a
,
Fractional
a
,
Ord
a
)
=>
Matrix
a
->
Matrix
a
toBeOptimized
m
=
proba
Col
m
------------------------------------------------------------------------
-- | Main Functions
-- Compute the probability from axis
-- x' = x / (sum Col x)
proba
::
(
Num
a
,
Fractional
a
)
=>
Axis
->
Matrix
a
->
Matrix
a
proba
a
m
=
mapOn
a
(
\
c
x
->
x
/
V
.
sum
(
axis
a
c
m
))
m
mapOn
::
Axis
->
(
AxisId
->
a
->
a
)
->
Matrix
a
->
Matrix
a
mapOn
a
f
m
=
V
.
foldl'
f'
m
(
V
.
enumFromTo
1
(
nOf
a
m
))
where
f'
m'
c
=
mapOnly
a
f
c
m'
mapOnly
::
Axis
->
(
AxisId
->
a
->
a
)
->
AxisId
->
Matrix
a
->
Matrix
a
mapOnly
Col
=
mapCol
mapOnly
Row
=
mapRow
mapAll
::
(
a
->
a
)
->
Matrix
a
->
Matrix
a
mapAll
f
m
=
mapOn
Col
(
\
_
->
f
)
m
---------------------------------------------------------------
-- | Compute a distance from axis
-- xs = (sum Col x') - x'
distFromSum
::
(
Num
a
,
Fractional
a
)
=>
Axis
->
Matrix
a
->
Matrix
a
distFromSum
a
m
=
mapOn
a
(
\
c
x
->
V
.
sum
(
axis
a
c
m
)
-
x
)
m
---------------------------------------------------------------
---------------------------------------------------------------
-- | To compute included/excluded or specific/generic scores
opWith
::
(
Fractional
a1
,
Num
a1
)
=>
(
Matrix
a2
->
t
->
Matrix
a1
)
->
Matrix
a2
->
t
->
Matrix
a1
opWith
op
xs
ys
=
mapAll
(
\
x
->
x
/
(
2
*
n
-
1
))
(
xs
`
op
`
ys
)
where
n
=
fromIntegral
$
nOf
Col
xs
---------------------------------------------------------------
-------------------------------------------------------
conditional
::
(
Num
a
,
Fractional
a
,
Ord
a
)
=>
Matrix
a
->
Matrix
a
conditional
m
=
x'
--
filter (threshold m') m'
conditional
m
=
filter
(
threshold
m'
)
m'
where
------------------------------------------------------------------------
-- | Main Operations
...
...
@@ -88,26 +133,3 @@ conditional m = x' -- filter (threshold m') m'
False
->
0
------------------------------------------------------------------------
------------------------------------------------------------------------
-- | Main Functions
-- Compute the probability from axis
-- x' = x / (sum Col x)
proba
::
(
Num
a
,
Fractional
a
)
=>
Axis
->
Matrix
a
->
Matrix
a
proba
a
m
=
mapOn
a
(
\
c
x
->
x
/
V
.
sum
(
axis
a
c
m
))
m
---------------------------------------------------------------
-- | Compute a distance from axis
-- xs = (sum Col x') - x'
distFromSum
::
(
Num
a
,
Fractional
a
)
=>
Axis
->
Matrix
a
->
Matrix
a
distFromSum
a
m
=
mapOn
a
(
\
c
x
->
V
.
sum
(
axis
a
c
m
)
-
x
)
m
---------------------------------------------------------------
---------------------------------------------------------------
-- | To compute included/excluded or specific/generic scores
opWith
::
(
Fractional
a1
,
Num
a1
)
=>
(
Matrix
a2
->
t
->
Matrix
a1
)
->
Matrix
a2
->
t
->
Matrix
a1
opWith
op
xs
ys
=
mapAll
(
\
x
->
x
/
(
2
*
n
-
1
))
(
xs
`
op
`
ys
)
where
n
=
fromIntegral
$
nOf
Col
xs
---------------------------------------------------------------
src/Gargantext/Graph/Distances/Distributional.hs
View file @
d7fd1875
...
...
@@ -7,7 +7,7 @@ Maintainer : team@gargantext.org
Stability : experimental
Portability : POSIX
Motivation and definition of the @
Condi
tional@ distance.
Motivation and definition of the @
Distribu
tional@ distance.
-}
{-# LANGUAGE BangPatterns #-}
...
...
@@ -64,7 +64,7 @@ mi m = matrix c r createMat
where
(
c
,
r
)
=
(
nOf
Col
m
,
nOf
Row
m
)
createMat
(
x
,
y
)
=
doMi
x
y
m
doMi
x
y
m
=
if
x
==
y
then
0
else
(
nonNegative
$
log
(
doMi'
x
y
m
)
)
doMi
x
y
m
=
if
x
==
y
then
0
else
(
max
(
log
(
doMi'
x
y
m
))
0
)
doMi'
x
y
m
=
(
getElem
x
y
m
)
/
(
cross
x
y
m
/
total
m
)
...
...
@@ -79,9 +79,6 @@ ax a i j m = dropAt j' $ axis a i' m
j'
=
mod
r
j
+
1
(
c
,
r
)
=
(
nOf
Col
m
,
nOf
Row
m
)
nonNegative
::
(
Ord
a
,
Num
a
)
=>
a
->
a
nonNegative
x
=
if
x
>
0
then
x
else
0
miniMax
::
(
Ord
a
)
=>
Matrix
a
->
a
miniMax
m
=
V
.
minimum
$
V
.
map
(
\
c
->
V
.
maximum
$
getCol
c
m
)
(
V
.
enumFromTo
1
(
nOf
Col
m
))
...
...
src/Gargantext/Graph/Distances/Matrice.hs
0 → 100644
View file @
d7fd1875
{-|
Module : Gargantext.Graph.Distances.Matrix
Description :
Copyright : (c) CNRS, 2017-Present
License : AGPL + CECILL v3
Maintainer : team@gargantext.org
Stability : experimental
Portability : POSIX
Motivation and definition of the @Conditional@ distance.
-}
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeFamilies #-}
module
Gargantext.Graph.Distances.Matrice
where
import
Data.Array.Accelerate.Data.Bits
import
Data.Array.Accelerate.Interpreter
import
Data.Array.Accelerate
import
Data.Array.Accelerate.Smart
import
Data.Array.Accelerate.Type
import
Data.Array.Accelerate.Array.Sugar
(
fromArr
,
Array
,
Z
)
import
Data.Maybe
(
Maybe
(
Just
))
import
qualified
Gargantext.Prelude
as
P
import
qualified
Data.Array.Accelerate.Array.Representation
as
Repr
matrix
::
Elt
c
=>
Int
->
[
c
]
->
Matrix
c
matrix
n
l
=
fromList
(
Z
:.
n
:.
n
)
l
myMat
::
Int
->
Matrix
Double
myMat
n
=
matrix
n
[
1
..
]
-- | Two ways to get the rank (as documentation)
rank
::
(
Matrix
Double
)
->
Int
rank
m
=
arrayRank
$
arrayShape
m
rank'
::
(
Matrix
Double
)
->
Int
rank'
m
=
n
where
Z
:.
_
:.
n
=
arrayShape
m
-----------------------------------------------------------------------
-- | Conditional Distance
type
Rank
=
Int
proba
::
Rank
->
Acc
(
Matrix
Double
)
->
Acc
(
Matrix
Double
)
proba
r
mat
=
zipWith
(
/
)
mat
(
mkSum
r
mat
)
mkSum
::
Rank
->
Acc
(
Matrix
Double
)
->
Acc
(
Matrix
Double
)
mkSum
r
mat
=
replicate
(
constant
(
Z
:.
(
r
::
Int
)
:.
All
))
$
fold
(
+
)
0
mat
type
Matrix'
a
=
Acc
(
Matrix
a
)
conditional
::
Matrix
Double
->
(
Matrix
Double
,
Matrix
Double
)
conditional
m
=
(
run
$
ie
(
use
m
),
run
$
sg
(
use
m
))
where
r
::
Rank
r
=
rank'
m
xs
::
Matrix'
Double
->
Matrix'
Double
xs
mat
=
zipWith
(
-
)
(
proba
r
mat
)
(
mkSum
r
$
proba
r
mat
)
ys
::
Acc
(
Matrix
Double
)
->
Acc
(
Matrix
Double
)
ys
mat
=
zipWith
(
-
)
(
proba
r
mat
)
(
mkSum
r
$
transpose
$
proba
r
mat
)
ie
::
Matrix'
Double
->
Matrix'
Double
ie
mat
=
map
(
\
x
->
x
/
(
2
*
n
-
1
))
$
zipWith
(
+
)
(
xs
mat
)
(
ys
mat
)
sg
::
Acc
(
Matrix
Double
)
->
Acc
(
Matrix
Double
)
sg
mat
=
map
(
\
x
->
x
/
(
2
*
n
-
1
))
$
zipWith
(
-
)
(
xs
mat
)
(
ys
mat
)
n
::
Exp
Double
n
=
P
.
fromIntegral
r
-- filter with threshold
-----------------------------------------------------------------------
-- | Distributional Distance
distributional
::
Matrix
Double
->
Matrix
Double
distributional
m
=
run
$
filter
$
ri
(
use
m
)
where
n
=
rank
m
filter
m
=
zipWith
(
\
a
b
->
max
a
b
)
m
(
transpose
m
)
--miniMax m = fold minimum $ fold maximum m
ri
mat
=
zipWith
(
/
)
mat1
mat2
where
mat1
=
mkSum
n
$
zipWith
min
(
mi
mat
)
(
mi
$
transpose
mat
)
mat2
=
mkSum
n
mat
mi
m'
=
zipWith
(
\
a
b
->
max
(
log
$
a
/
b
)
0
)
m'
$
zipWith
(
/
)
(
crossProduct
m'
)
(
total
m'
)
total
m''
=
replicate
(
constant
(
Z
:.
n
:.
n
))
$
fold
(
+
)
0
$
fold
(
+
)
0
m''
crossProduct
m
=
zipWith
(
*
)
(
cross
m
)
(
cross
(
transpose
m
))
cross
mat
=
zipWith
(
-
)
(
mkSum
n
mat
)
(
mat
)
src/Gargantext/Graph/Utils.hs
View file @
d7fd1875
...
...
@@ -32,6 +32,7 @@ import qualified Data.List as L
import
Gargantext.Prelude
------------------------------------------------------------------------
-- | Some utils to build the matrix from cooccurrence results
type
Distance
=
Double
type
Cooc
=
Int
...
...
@@ -117,14 +118,13 @@ data Axis = Col | Row
--divide :: Fractional a => Matrix a -> Matrix a -> Matrix a
--divide a b = a `multStd` (div b)
total
::
Num
a
=>
Matrix
a
->
a
total
m
=
V
.
sum
$
V
.
map
(
\
c
->
V
.
sum
(
getCol
c
m
))
(
V
.
enumFromTo
1
(
nOf
Col
m
))
------------------------------------------------------------------------
-- | Matrix functions
type
AxisId
=
Int
total
::
Num
a
=>
Matrix
a
->
a
total
m
=
V
.
sum
$
V
.
map
(
\
c
->
V
.
sum
(
getCol
c
m
))
(
V
.
enumFromTo
1
(
nOf
Col
m
))
nOf
::
Axis
->
Matrix
a
->
Int
nOf
Row
=
nrows
nOf
Col
=
ncols
...
...
@@ -133,21 +133,6 @@ axis :: Axis -> AxisId -> Matrix a -> Vector a
axis
Col
=
getCol
axis
Row
=
getRow
--mapOn' :: Axis -> (a -> a) -> Matrix a -> Matrix a
--mapOn' a f m = foldl' (\m' aId -> mapOn a (aId f) m') m [1.. (nOf a m)]
mapOn
::
Axis
->
(
AxisId
->
a
->
a
)
->
Matrix
a
->
Matrix
a
mapOn
a
f
m
=
V
.
foldl'
f'
m
(
V
.
enumFromTo
1
(
nOf
a
m
))
where
f'
m'
c
=
mapOnly
a
f
c
m'
mapOnly
::
Axis
->
(
AxisId
->
a
->
a
)
->
AxisId
->
Matrix
a
->
Matrix
a
mapOnly
Col
=
mapCol
mapOnly
Row
=
mapRow
mapAll
::
(
a
->
a
)
->
Matrix
a
->
Matrix
a
mapAll
f
m
=
mapOn
Col
(
\
_
->
f
)
m
toListsWithIndex
::
Matrix
a
->
[((
Int
,
Int
),
a
)]
toListsWithIndex
m
=
concat'
$
zip
[
1
..
]
$
map
(
\
c
->
zip
[
1
..
]
c
)
$
toLists
m
...
...
@@ -156,7 +141,6 @@ toListsWithIndex m = concat' $ zip [1..] $ map (\c -> zip [1..] c) $ toLists m
concat'
xs
=
L
.
concat
$
map
(
\
(
x
,
ys
)
->
map
(
\
(
y
,
a
)
->
((
x
,
y
),
a
))
ys
)
xs
-- | For tests only, to be removed
m1
::
Matrix
Double
m1
=
fromList
300
300
[
1
..
]
...
...
src/Gargantext/Parsers.hs
View file @
d7fd1875
...
...
@@ -75,37 +75,37 @@ data FileFormat = WOS -- Implemented (ISI Format)
-- TODO: to debug maybe add the filepath in error message
parse
::
FileFormat
->
FilePath
->
IO
([
ParseError
],
[[(
Text
,
Text
)]])
parse
format
path
=
do
files
<-
case
takeExtension
path
of
".zip"
->
openZip
path
_
->
pure
<$>
DB
.
readFile
path
(
as
,
bs
)
<-
partitionEithers
<$>
mapConcurrently
(
runParser
format
)
files
pure
(
as
,
map
toText
$
concat
bs
)
where
-- TODO : decode with bayesian inference on encodings
toText
=
map
(
\
(
a
,
b
)
->
(
decodeUtf8
a
,
decodeUtf8
b
))
-- | withParser:
-- According the format of the text, choosing the right parser.
-- TODO withParser :: FileFormat -> Parser [Document]
withParser
::
FileFormat
->
Parser
[[(
DB
.
ByteString
,
DB
.
ByteString
)]]
withParser
WOS
=
wosParser
--withParser DOC = docParser
--withParser ODT = odtParser
--withParser XML = xmlParser
--withParser _ = error "[ERROR] Parser not implemented yet"
runParser
::
FileFormat
->
DB
.
ByteString
->
IO
(
Either
String
[[(
DB
.
ByteString
,
DB
.
ByteString
)]])
runParser
format
text
=
pure
$
parseOnly
(
withParser
format
)
text
openZip
::
FilePath
->
IO
[
DB
.
ByteString
]
openZip
fp
=
do
path
<-
resolveFile'
fp
entries
<-
withArchive
path
(
DM
.
keys
<$>
getEntries
)
bs
<-
mapConcurrently
(
\
s
->
withArchive
path
(
getEntry
s
))
entries
pure
bs
--
parse :: FileFormat -> FilePath -> IO ([ParseError], [[(Text, Text)]])
--
parse format path = do
--
files <- case takeExtension path of
--
".zip" -> openZip path
--
_ -> pure <$> DB.readFile path
--
(as, bs) <- partitionEithers <$> mapConcurrently (runParser format) files
--
pure (as, map toText $ concat bs)
--
where
--
-- TODO : decode with bayesian inference on encodings
--
toText = map (\(a,b) -> (decodeUtf8 a, decodeUtf8 b))
--
--
--
--
| withParser:
--
--
According the format of the text, choosing the right parser.
--
--
TODO withParser :: FileFormat -> Parser [Document]
--
withParser :: FileFormat -> Parser [[(DB.ByteString, DB.ByteString)]]
--
withParser WOS = wosParser
--
--
withParser DOC = docParser
--
--
withParser ODT = odtParser
--
--
withParser XML = xmlParser
--
--
withParser _ = error "[ERROR] Parser not implemented yet"
--
--
runParser :: FileFormat -> DB.ByteString
--
-> IO (Either String [[(DB.ByteString, DB.ByteString)]])
--
runParser format text = pure $ parseOnly (withParser format) text
--
--
openZip :: FilePath -> IO [DB.ByteString]
--
openZip fp = do
--
path <- resolveFile' fp
--
entries <- withArchive path (DM.keys <$> getEntries)
--
bs <- mapConcurrently (\s -> withArchive path (getEntry s)) entries
--
pure bs
src/Gargantext/Prelude.hs
View file @
d7fd1875
...
...
@@ -31,19 +31,19 @@ import Data.Maybe (isJust, fromJust, maybe)
import
Protolude
(
Bool
(
True
,
False
),
Int
,
Double
,
Integer
,
Fractional
,
Num
,
Maybe
(
Just
,
Nothing
)
,
Floating
,
Char
,
IO
,
pure
,
(
<$>
),
panic
,
pure
,
(
<$>
),
(
<&>
),
panic
,
head
,
flip
,
Ord
,
Integral
,
Foldable
,
RealFrac
,
Monad
,
filter
,
reverse
,
map
,
zip
,
drop
,
take
,
zipWith
,
sum
,
fromIntegral
,
length
,
fmap
,
foldl
,
foldl'
,
takeWhile
,
sqrt
,
undefined
,
identity
,
abs
,
min
,
max
,
maximum
,
minimum
,
return
,
snd
,
truncate
,
(
+
),
(
*
),
(
/
),
(
-
),
(
.
),
(
$
),
(
**
),
(
^
),
(
<
),
(
>
),
log
,
(
+
),
(
*
),
(
/
),
(
-
),
(
.
),
(
$
),
(
&
),
(
**
),
(
^
),
(
<
),
(
>
),
log
,
Eq
,
(
==
),
(
>=
),
(
<=
),
(
<>
),
(
/=
)
,
(
&&
),
(
||
),
not
,
fst
,
snd
,
toS
,
elem
,
die
,
mod
,
div
,
curry
,
uncurry
,
curry
,
uncurry
,
repeat
)
-- TODO import functions optimized in Utils.Count
...
...
@@ -235,3 +235,4 @@ zipSnd f xs = zip xs (f xs)
unMaybe
::
[
Maybe
a
]
->
[
a
]
unMaybe
=
map
fromJust
.
L
.
filter
isJust
stack.yaml
View file @
d7fd1875
...
...
@@ -2,6 +2,7 @@ flags: {}
extra-package-dbs
:
[]
packages
:
-
.
-
'
/home/alexandre/local/logiciels/haskell/accelerate/accelerate'
allow-newer
:
true
extra-deps
:
...
...
@@ -15,18 +16,20 @@ extra-deps:
-
aeson-lens-0.5.0.0
-
duckling-0.1.3.0
-
extra-1.5.3
-
file-embed-lzma-0
-
haskell-src-exts-1.18.2
-
http-types-0.12.1
-
protolude-0.2
-
servant-0.1
2
.1
-
servant-auth-0.3.
0.1
-
servant-client-0.1
2
.0.1
-
servant-client-core-0.1
2
-
servant-docs-0.11.
1
-
servant-0.1
3.0
.1
-
servant-auth-0.3.
1.0
-
servant-client-0.1
3
.0.1
-
servant-client-core-0.1
3.0.1
-
servant-docs-0.11.
2
-
servant-multipart-0.11.1
-
servant-server-0.12
-
servant-swagger-ui-0.2.3.2.2.8
-
servant-server-0.13.0.1
-
servant-swagger-ui-0.3.0.3.13.2
-
servant-swagger-ui-core-0.3
-
stemmer-0.5.2
-
text-1.2.3.0
-
text-show-3.6.2
resolver
:
lts-1
0
.6
resolver
:
lts-1
1
.6
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment