Commit 762b3416 authored May 07, 2019 by Quentin Lobbé
phylo from wos in progress
parent 2f9f9de6
Showing 2 changed files with 27 additions and 5 deletions
bin/gargantext-phylo/Main.hs (+26 -4)
src/Gargantext/Text/Parsers.hs (+1 -1)
bin/gargantext-phylo/Main.hs
@@ -29,6 +29,7 @@ import GHC.IO (FilePath)
 import Gargantext.Prelude
 import Gargantext.Text.List.CSV (csvGraphTermList)
 import Gargantext.Text.Parsers.CSV (readCsv, csv_title, csv_abstract, csv_publication_year)
+import Gargantext.Text.Parsers (FileFormat(..), parseDocs)
 import Gargantext.Text.Terms.WithList
 import Gargantext.Text.Context (TermList)
@@ -52,24 +53,32 @@ import qualified Data.ByteString.Lazy as L
 -- | Conf | --
 --------------

 type ListPath   = FilePath
 type CorpusPath = FilePath
+data CorpusType = Wos | Csv deriving (Show, Generic)
 type Limit = Int

 data Conf =
-     Conf { corpusPath :: CorpusPath
+     Conf { corpusPath :: CorpusPath
+          , corpusType :: CorpusType
           , listPath   :: ListPath
           , outputPath :: FilePath
+          , phyloName  :: Text
           , limit      :: Limit
           } deriving (Show, Generic)

 instance FromJSON Conf
 instance ToJSON Conf

+instance FromJSON CorpusType
+instance ToJSON CorpusType

 -- | Get the conf from a Json file
 getJson :: FilePath -> IO L.ByteString
 getJson path = L.readFile path

 ---------------
 -- | Parse | --
 ---------------
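As a reading aid for the `Conf` record above, here is a minimal sketch of how such a configuration could be decoded with aeson's generic instances. It assumes the aeson and bytestring packages are available. The JSON values are placeholders, and the type aliases are inlined as `FilePath` and `Int` to keep the example short. With aeson's default options, a constructor-only type such as `CorpusType` is read from a plain JSON string.

```haskell
{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE OverloadedStrings #-}

import           Data.Aeson           (FromJSON, eitherDecode)
import qualified Data.ByteString.Lazy as L
import           Data.Text            (Text)
import           GHC.Generics         (Generic)

-- Same constructors as in the diff; Eq is added here only for testing.
data CorpusType = Wos | Csv deriving (Show, Eq, Generic)

-- Field names follow the diff; the aliases are inlined for brevity.
data Conf = Conf
  { corpusPath :: FilePath
  , corpusType :: CorpusType
  , listPath   :: FilePath
  , outputPath :: FilePath
  , phyloName  :: Text
  , limit      :: Int
  } deriving (Show, Generic)

instance FromJSON CorpusType
instance FromJSON Conf

-- Hypothetical configuration: every path and value is a placeholder.
exampleJson :: L.ByteString
exampleJson =
  "{ \"corpusPath\": \"corpus.csv\", \"corpusType\": \"Csv\", \
  \\"listPath\": \"list.csv\", \"outputPath\": \"out/\", \
  \\"phyloName\": \"demo\", \"limit\": 100 }"

main :: IO ()
main = case eitherDecode exampleJson :: Either String Conf of
  Left err   -> putStrLn ("invalid conf: " ++ err)
  Right conf -> print conf
```

Using `eitherDecode` rather than `decode` keeps the parser's error message, which helps when a required field such as `corpusType` or `phyloName` is missing from an older configuration file.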
@@ -82,12 +91,23 @@ filterTerms patterns (year', doc) = (year',termsInText patterns doc)
 termsInText pats txt = DL.nub $ DL.concat $ map (map unwords) $ extractTermsWithList pats txt

-csvToCorpus :: Int -> FilePath -> IO ([(Int,Text)])
+csvToCorpus :: Int -> CorpusPath -> IO ([(Int,Text)])
 csvToCorpus limit csv = DV.toList
                       . DV.take limit
                       . DV.map (\n -> (csv_publication_year n, (csv_title n) <> " " <> (csv_abstract n)))
                       . snd <$> readCsv csv

+wosToCorpus :: Int -> CorpusPath -> IO ([(Int,Text)])
+wosToCorpus limit path = undefined

+fileToCorpus :: CorpusType -> Int -> CorpusPath -> IO ([(Int,Text)])
+fileToCorpus format limit path = case format of
+  Wos -> wosToCorpus limit path
+  Csv -> csvToCorpus limit path

 parse :: Limit -> CorpusPath -> TermList -> IO [Document]
 parse limit corpus lst = do
   corpus' <- csvToCorpus limit corpus
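The dispatch on `CorpusType` introduced in this hunk can be sketched in isolation. The following is an illustrative, pure version and not the commit's code: `Row`, `rowsToCorpus` and `toCorpus` are hypothetical names, an in-memory list stands in for the CSV reader, and the unimplemented WOS branch returns a `Left` value instead of `undefined` so that the gap is reported rather than crashing at run time.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)

data CorpusType = Wos | Csv deriving (Show, Eq)

-- Hypothetical stand-in for a parsed CSV row (year, title, abstract).
data Row = Row
  { rowYear     :: Int
  , rowTitle    :: Text
  , rowAbstract :: Text
  }

-- Pair each row's year with "title abstract" and keep at most lim entries.
rowsToCorpus :: Int -> [Row] -> [(Int, Text)]
rowsToCorpus lim =
  take lim . map (\r -> (rowYear r, rowTitle r <> " " <> rowAbstract r))

-- Choose a loader from the corpus type; WOS is reported as not implemented.
toCorpus :: CorpusType -> Int -> [Row] -> Either String [(Int, Text)]
toCorpus Csv lim rows = Right (rowsToCorpus lim rows)
toCorpus Wos _   _    = Left "WOS corpus loading is not implemented yet"

main :: IO ()
main = do
  let rows = [Row 2001 "A" "first", Row 2002 "B" "second"]
  print (toCorpus Csv 1 rows)
  print (toCorpus Wos 1 rows)
```

Note that in the hunk itself `parse` still calls `csvToCorpus` directly, so `fileToCorpus` is not yet wired in, which matches the "in progress" commit message.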
@@ -123,7 +143,7 @@ main = do
   putStrLn $ show "--| Build the phylo |--"

-  let query = PhyloQueryBuild "cultural_evolution" "" 5 3 defaultFis [] [] (WeightedLogJaccard $ WLJParams 0.00001 10) 2 (RelatedComponents $ RCParams $ WeightedLogJaccard $ WLJParams 0.5 10)
+  let query = PhyloQueryBuild (phyloName conf) "" 5 3 defaultFis [] [] (WeightedLogJaccard $ WLJParams 0.00001 10) 2 (RelatedComponents $ RCParams $ WeightedLogJaccard $ WLJParams 0.5 10)

   let queryView = PhyloQueryView 2 Merge False 1 [BranchAge] [defaultSmallBranch] [BranchPeakFreq,GroupLabelCooc] (Just (ByBranchAge,Asc)) Json Flat True
@@ -133,4 +153,6 @@ main = do
   putStrLn $ show "--| Export the phylo as a dot graph |--"

-  P.writeFile (outputPath conf) $ dotToString $ viewToDot view
+  let outputFile = (outputPath conf) P.++ (DT.unpack $ phyloName conf) P.++ ".dot"
+  P.writeFile outputFile $ dotToString $ viewToDot view
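The new `outputFile` is built by plain string concatenation, so it yields the intended path only when `outputPath` already ends with a path separator. A possible alternative, sketched below under the assumption that the filepath package is available, joins the parts with `</>` and `<.>`. The helper name `dotFilePath` is hypothetical and is not part of the commit.

```haskell
import           Data.Text       (Text)
import qualified Data.Text       as DT
import           System.FilePath ((</>), (<.>))

-- Build "<dir>/<name>.dot"; (</>) inserts a separator only if dir lacks one.
dotFilePath :: FilePath -> Text -> FilePath
dotFilePath dir name = dir </> DT.unpack name <.> "dot"

main :: IO ()
main = do
  -- Both calls should name the same file, with or without the trailing slash.
  putStrLn (dotFilePath "out"  (DT.pack "demo"))
  putStrLn (dotFilePath "out/" (DT.pack "demo"))
```

Because `<.>` binds more tightly than `</>`, the extension is attached to the file name before the directory is prepended.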
src/Gargantext/Text/Parsers.hs
@@ -71,7 +71,7 @@ type ParseError = String
 -- | According to the format of Input file,
 -- different parser are available.
-data FileFormat = WOS | CsvHalFormat -- | CsvGargV3
+data FileFormat = WOS | CsvHalFormat -- | CsvGargV3
   deriving (Show)

 -- Implemented (ISI Format)