Christian Merten / haskell-gargantext

Commit c0afe078
Authored Oct 07, 2024 by Grégoire Locqueville
Parent: 6c591f21

Reorganize TSV parsing code

Showing 4 changed files with 520 additions and 451 deletions

bin/gargantext-cli/CLI/FilterTermsAndCooc.hs    +2 -2
bin/gargantext-cli/CLI/Phylo/Common.hs          +2 -2
src/Gargantext/Core/Ext/IMTUser.hs              +2 -2
src/Gargantext/Core/Text/Corpus/Parsers/TSV.hs  +514 -445

bin/gargantext-cli/CLI/FilterTermsAndCooc.hs @ c0afe078

@@ -20,7 +20,7 @@ import Data.Tuple.Extra (both)
 import Data.Vector qualified as DV
 import GHC.Generics
 import Gargantext.Core.Text.Context (TermList)
-import Gargantext.Core.Text.Corpus.Parsers.TSV (readTSVFile, tsv_title, tsv_abstract, tsv_publication_year, fromMIntOrDec, defaultYear)
+import Gargantext.Core.Text.Corpus.Parsers.TSV (readTSVFile, unIntOrDec, tsv_title, tsv_abstract, tsv_publication_year, defaultYear)
 import Gargantext.Core.Text.List.Formats.TSV (tsvMapTermList)
 import Gargantext.Core.Text.Metrics.Count (coocOnContexts, Coocs)
 import Gargantext.Core.Text.Terms.WithList (Patterns, buildPatterns, extractTermsWithList)
@@ -52,7 +52,7 @@ filterTermsAndCoocCLI (CorpusFile corpusFile) (TermListFile termListFile) (Outpu
     Right cf -> do
       let corpus = DM.fromListWith (<>)
                  . DV.toList
-                 . DV.map (\n -> (fromMIntOrDec defaultYear $ tsv_publication_year n, [(tsv_title n) <> " " <> (tsv_abstract n)]))
+                 . DV.map (\n -> (maybe defaultYear unIntOrDec $ tsv_publication_year n, [(tsv_title n) <> " " <> (tsv_abstract n)]))
                  . snd $ cf
       -- termListMap :: [Text]

bin/gargantext-cli/CLI/Phylo/Common.hs @ c0afe078

@@ -82,8 +82,8 @@ tsvToDocs parser patterns time path =
       Wos _ -> Prelude.error "tsvToDocs: unimplemented"
       Tsv limit -> Vector.toList
         <$> Vector.take limit
-        <$> Vector.map (\row -> Document (toPhyloDate (Tsv.fromMIntOrDec Tsv.defaultYear $ tsv_publication_year row) (fromMaybe Tsv.defaultMonth $ tsv_publication_month row) (fromMaybe Tsv.defaultDay $ tsv_publication_day row) time)
-                                         (toPhyloDate' (Tsv.fromMIntOrDec Tsv.defaultYear $ tsv_publication_year row) (fromMaybe Tsv.defaultMonth $ tsv_publication_month row) (fromMaybe Tsv.defaultDay $ tsv_publication_day row) time)
+        <$> Vector.map (\row -> Document (toPhyloDate (maybe Tsv.defaultYear Tsv.unIntOrDec $ tsv_publication_year row) (fromMaybe Tsv.defaultMonth $ tsv_publication_month row) (fromMaybe Tsv.defaultDay $ tsv_publication_day row) time)
+                                         (toPhyloDate' (maybe Tsv.defaultYear Tsv.unIntOrDec $ tsv_publication_year row) (fromMaybe Tsv.defaultMonth $ tsv_publication_month row) (fromMaybe Tsv.defaultDay $ tsv_publication_day row) time)
                                          (termsInText patterns $ (tsv_title row) <> " " <> (tsv_abstract row))
                                          Nothing
                                          []
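
Both call sites above swap `fromMIntOrDec defaultYear` for `maybe defaultYear unIntOrDec`. The definitions involved live in TSV.hs, whose diff is not shown on this page, so the following is only a minimal sketch of why the two spellings can be interchangeable. The `IntOrDec` newtype, the body of `fromMIntOrDec`, and the `defaultYear` value are all assumptions made for illustration, not the project's actual code.

```haskell
-- Hypothetical stand-ins: the real definitions are in the collapsed TSV.hs diff.
newtype IntOrDec = IntOrDec Int
  deriving (Show, Eq)

-- Unwrap the underlying Int (assumed shape of the accessor).
unIntOrDec :: IntOrDec -> Int
unIntOrDec (IntOrDec n) = n

-- Assumed shape of the old helper: unwrap the value, or fall back to a default.
fromMIntOrDec :: Int -> Maybe IntOrDec -> Int
fromMIntOrDec def Nothing  = def
fromMIntOrDec _   (Just v) = unIntOrDec v

-- Placeholder value, chosen only for this sketch.
defaultYear :: Int
defaultYear = 2000

-- Old call-site style.
yearOld :: Maybe IntOrDec -> Int
yearOld my = fromMIntOrDec defaultYear my

-- New call-site style: `maybe` from the Prelude does the same job,
-- so the dedicated helper is no longer needed.
yearNew :: Maybe IntOrDec -> Int
yearNew my = maybe defaultYear unIntOrDec my
```

Under these assumptions, `yearOld` and `yearNew` agree on every input: both return the default for `Nothing` and the unwrapped value for `Just`.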

src/Gargantext/Core/Ext/IMTUser.hs @ c0afe078

@@ -22,7 +22,7 @@ import Data.Csv ( (.:), header, decodeByNameWith, FromNamedRecord(..), Header )
 import Data.Text qualified as T
 import Data.Vector (Vector)
 import Data.Vector qualified as Vector
-import Gargantext.Core.Text.Corpus.Parsers.TSV (tsvDecodeOptions, ColumnDelimiter(Tab))
+import Gargantext.Core.Text.Corpus.Parsers.TSV (defaultDecodingOptionsWithDelimiter, ColumnDelimiter(Tab))
 import Gargantext.Database.Admin.Types.Hyperdata.Contact
 import Gargantext.Prelude
 import System.FilePath.Posix (takeExtension)
@@ -119,7 +119,7 @@ readTSVFile_Annuaire' :: FilePath -> IO (Header, Vector IMTUser)
 readTSVFile_Annuaire' = fmap readTsvHalLazyBS' . BL.readFile
   where
     readTsvHalLazyBS' :: BL.ByteString -> (Header, Vector IMTUser)
-    readTsvHalLazyBS' bs = case decodeByNameWith (tsvDecodeOptions Tab) bs of
+    readTsvHalLazyBS' bs = case decodeByNameWith (defaultDecodingOptionsWithDelimiter Tab) bs of
       Left e -> panicTrace (cs e)
       Right rows -> rows

src/Gargantext/Core/Text/Corpus/Parsers/TSV.hs @ c0afe078

This diff is collapsed.