gargantext / haskell-gargantext
Commit 18968540, authored May 05, 2019 by Alexandre Delanoë
[PARSERS] RIS/PRESSE fix title and abstract field.
Parent: d388d621
Pipeline #370 failed
Showing 2 changed files with 57 additions and 37 deletions
src/Gargantext/Text/Parsers/RIS.hs (+4 -3)
src/Gargantext/Text/Parsers/RIS/Presse.hs (+53 -34)
src/Gargantext/Text/Parsers/RIS.hs
@@ -19,7 +19,7 @@ citation programs to exchange data.
 {-# LANGUAGE NoImplicitPrelude #-}
 {-# LANGUAGE OverloadedStrings #-}
-module Gargantext.Text.Parsers.RIS (parser, withField, fieldWith, lines) where
+module Gargantext.Text.Parsers.RIS (parser, onField, fieldWith, lines) where
 import Data.List (lookup)
 import Control.Applicative
@@ -68,7 +68,8 @@ lines = many line
         line = "\n" *> takeTill isEndOfLine
 -------------------------------------------------------------
-withField :: ByteString -> (ByteString -> [(ByteString, ByteString)])
+-- Field for First elem of a Tuple, Key for corresponding Map
+onField :: ByteString -> (ByteString -> [(ByteString, ByteString)])
         -> [(ByteString, ByteString)] -> [(ByteString, ByteString)]
-withField k f m = m <> ( maybe [] f (lookup k m) )
+onField k f m = m <> ( maybe [] f (lookup k m) )
src/Gargantext/Text/Parsers/RIS/Presse.hs
@@ -7,7 +7,7 @@ Maintainer : team@gargantext.org
 Stability   : experimental
 Portability : POSIX
-Presse RIS format parser en enricher.
+Presse RIS format parser for Europresse Database.
 -}
@@ -16,41 +16,60 @@ Presse RIS format parser en enricher.
 module Gargantext.Text.Parsers.RIS.Presse (presseEnrich) where
 import Data.List (lookup)
 import Data.Either (either)
-import Data.Tuple.Extra (first)
+import Data.Tuple.Extra (first, both, uncurry)
 import Data.Attoparsec.ByteString (parseOnly)
-import Data.ByteString (ByteString)
-import Gargantext.Prelude hiding (takeWhile, take)
-import Gargantext.Text.Parsers.RIS (withField)
+import Data.ByteString (ByteString, length)
+import Gargantext.Prelude hiding (takeWhile, take, length)
+import Gargantext.Text.Parsers.RIS (onField)
+import Gargantext.Core (Lang(..))
 import qualified Gargantext.Text.Parsers.Date.Attoparsec as Date
 -------------------------------------------------------------
 -------------------------------------------------------------
 presseEnrich :: [(ByteString, ByteString)] -> [(ByteString, ByteString)]
-presseEnrich = (withField "DA" presseDate)
-             . (withField "LA" presseLang)
-             . (map (first presseFields))
-presseDate :: ByteString -> [(ByteString, ByteString)]
-presseDate str = either (const []) identity $ parseOnly (Date.parserWith "/") str
-presseLang :: ByteString -> [(ByteString, ByteString)]
-presseLang "Français" = [("language", "FR")]
-presseLang "English"  = [("language", "EN")]
-presseLang x          = [("language", x)]
-presseFields :: ByteString -> ByteString
-presseFields champs
-    | champs == "AU" = "authors"
-    | champs == "TI" = "title"
-    | champs == "JF" = "source"
-    | champs == "DI" = "doi"
-    | champs == "UR" = "url"
-    | champs == "N2" = "abstract"
-    | otherwise      = champs
-{-
-fixTitle :: [(ByteString, ByteString)] -> [(ByteString, ByteString)]
-fixTitle ns = ns <> [ti, ab]
+presseEnrich = (onField "DA" parseDate)
+             . (onField "LA" parseLang)
+             . fixFields
+parseDate :: ByteString -> [(ByteString, ByteString)]
+parseDate str = either (const []) identity $ parseOnly (Date.parserWith "/") str
+parseLang :: ByteString -> [(ByteString, ByteString)]
+parseLang "Français" = [(langField, cs $ show FR)]
+parseLang "English"  = [(langField, cs $ show EN)]
+parseLang x          = [(langField, x)]
+langField :: ByteString
+langField = "language"
+fixFields :: [(ByteString, ByteString)] -> [(ByteString, ByteString)]
+fixFields ns = map (first fixFields'') ns
   where
-    ti = case
--}
+    -- | Title is sometimes longer than abstract
+    fixFields'' = case uncurry (>) <$> look'' of
+      Just True -> fixFields' "abstract" "title"
+      _         -> fixFields' "title" "abstract"
+    look'' :: Maybe (Int, Int)
+    look'' = both length <$> look
+    look :: Maybe (ByteString, ByteString)
+    look = (,) <$> lookup "TI" ns <*> lookup "N2" ns
+    fixFields' :: ByteString -> ByteString -> ByteString -> ByteString
+    fixFields' title abstract champs
+      | champs == "AU" = "authors"
+      | champs == "TI" = title
+      | champs == "JF" = "source"
+      | champs == "DI" = "doi"
+      | champs == "UR" = "url"
+      | champs == "N2" = abstract
+      | otherwise      = champs
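
The title/abstract fix in this commit can be read as: when the value under the TI tag is longer than the value under N2, the two field names are swapped. The following is a minimal standalone sketch of that idea, not the project's code. The names `Field` and `renameTitleAbstract` are hypothetical, it uses `Data.ByteString.Char8` and the standard Prelude instead of `Gargantext.Prelude`, and it leaves out the other tag renames (AU, JF, DI, UR).

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import           Data.ByteString.Char8 (ByteString)
import qualified Data.ByteString.Char8 as B

-- Hypothetical alias for one (tag, value) pair of a RIS record.
type Field = (ByteString, ByteString)

-- | Rename the TI and N2 tags to "title" and "abstract".
-- If both tags are present and the TI value is strictly longer than
-- the N2 value, the two names are swapped. All other tags are kept.
renameTitleAbstract :: [Field] -> [Field]
renameTitleAbstract ns = map rename ns
  where
    -- True only when both tags exist and TI is the longer one.
    swapNames :: Bool
    swapNames = case (,) <$> lookup "TI" ns <*> lookup "N2" ns of
      Just (ti, n2) -> B.length ti > B.length n2
      Nothing       -> False

    rename :: Field -> Field
    rename (k, v)
      | k == "TI" = (if swapNames then "abstract" else "title", v)
      | k == "N2" = (if swapNames then "title" else "abstract", v)
      | otherwise = (k, v)

main :: IO ()
main = do
  -- TI holds the long text here, so the names are swapped.
  print (renameTitleAbstract
           [ ("TI", "A long body of text that was stored under the title tag")
           , ("N2", "Short headline") ])
  -- Usual case: TI is shorter than N2, so the names are kept.
  print (renameTitleAbstract
           [ ("TI", "Short headline")
           , ("N2", "A long body of text stored under the abstract tag") ])
```

As in the diff, a record missing either TI or N2 falls through to the default naming, because the comparison is only made when both lookups succeed.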