gargantext / haskell-gargantext
Commit 112ea7af (Verified)
Authored Sep 17, 2021 by Przemyslaw Kaminski
[framewrite] better line parsing
Parent: 267fae44
Pipeline #1850: canceled
Showing 1 changed file with 77 additions and 4 deletions

src/Gargantext/Core/Text/Corpus/Parsers/FrameWrite.hs (+77, -4)
@@ -5,7 +5,8 @@ import Control.Monad (void)
 import Data.Maybe
 import Data.Text
 import Gargantext.Prelude
-import Text.Parsec
+import Prelude (String, (++))
+import Text.Parsec hiding (Line)
 import Text.Parsec.Combinator
 import Text.Parsec.String
@@ -33,7 +34,22 @@ sample =
     , "document contents 2"
     ]

+sampleUnordered =
+  unlines
+    [ "title1"
+    , "title2"
+    , "=="
+    , "document contents 1"
+    , "^@@date: 2021-09-10"
+    , "^@@authors: FirstName1, LastName1; FirstName2, LastName2"
+    , "^@@source: someSource"
+    , "document contents 2"
+    ]
+
 parseSample = parse documentP "sample" (unpack sample)
+parseSampleUnordered = parse documentP "sampleUnordered" (unpack sampleUnordered)
+parseLinesSample = parse documentLinesP "sample" (unpack sample)
+parseLinesSampleUnordered = parse documentLinesP "sampleUnordered" (unpack sampleUnordered)

 data Author =
     Author { firstName :: Text
@@ -48,6 +64,61 @@ data Parsed =
          , contents :: Text }
   deriving (Show)

+emptyParsed =
+  Parsed { title = ""
+         , authors = []
+         , date = Nothing
+         , source = Nothing
+         , contents = "" }
+
+data Line =
+    LAuthors [Author]
+  | LDate Text
+  | LSource Text
+  | LContents Text
+  | LTitle Text
+  deriving (Show)
+
+parseLines :: Text -> Parsed
+parseLines text = foldl f emptyParsed lst
+  where
+    lst = parse documentLinesP "" (unpack text)
+    f (Parsed {..}) (LAuthors as) = Parsed { authors = as, .. }
+    f (Parsed {..}) (LDate d) = Parsed { date = d, .. }
+    f (Parsed {..}) (LSource s) = Parsed { source = s, .. }
+    f (Parsed {..}) (LContents c) = Parsed { contents = contents ++ c, .. }
+    f (Parsed {..}) (LTitle t) = Parsed { title = t, .. }
+
+documentLinesP = do
+  t <- titleP
+  lines <- lineP `sepBy` newline
+  pure $ [LTitle $ pack t] ++ lines
+
+lineP :: Parser Line
+lineP = do
+  choice [ try authorsLineP
+         , try dateLineP
+         , try sourceLineP
+         , contentsLineP ]
+
+authorsLineP = do
+  authors <- authorsP
+  pure $ LAuthors authors
+
+dateLineP = do
+  date <- dateP
+  pure $ LDate $ pack date
+
+sourceLineP = do
+  source <- sourceP
+  pure $ LSource $ pack source
+
+contentsLineP = do
+  contents <- many (noneOf "\n")
+  pure $ LContents $ pack contents
+
+--------------------
+
 documentP = do
   t <- titleP
   a <- optionMaybe authorsP
@@ -77,7 +148,8 @@ authorP = do
   fn <- manyTill anyChar (char ',')
   _ <- many (char ' ')
   --ln <- manyTill anyChar (void (char ';') <|> tokenEnd)
-  ln <- manyTill anyChar (tokenEnd)
+  --ln <- manyTill anyChar (tokenEnd)
+  ln <- many (noneOf "\n")
   pure $ Author { firstName = pack fn
                 , lastName = pack ln }
   -- manyTill anyChar (void (char '\n') <|> eof)
@@ -86,15 +158,16 @@ datePrefixP = do
   many (char ' ')
 dateP :: Parser [Char]
 dateP = try datePrefixP
-        *> manyTill anyChar tokenEnd
+        *> many (noneOf "\n")

 sourcePrefixP = do
   _ <- string "^@@source:"
   many (char ' ')
 sourceP :: Parser [Char]
 sourceP = try sourcePrefixP
-          *> manyTill anyChar tokenEnd
+          *> many (noneOf "\n")

 contentsP :: Parser String
 contentsP = many anyChar

 tokenEnd = void (char '\n') <|> eof
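
Note on the line-oriented approach in this commit: the new code classifies each line independently (`lineP`) and then folds the resulting `Line` values into a `Parsed` record, so metadata lines no longer have to appear in a fixed order. As written in the diff, `parseLines` appears to fold directly over the `Either` returned by `parse`, and it assigns `Text` values to fields that `emptyParsed` initialises with `Nothing`, so it may not type-check as shown. Below is a minimal, self-contained sketch of the same idea that does compile against plain `parsec` and `text`. It is not the author's code: the record is reduced to three fields, authors and the title are omitted, and the names `restOfLine`, `metaP`, `linesP` and `applyLine` are hypothetical helpers introduced only for this illustration. Joining content lines with a newline is also an assumption; the commit concatenates them directly.

```haskell
import           Data.Text (Text)
import qualified Data.Text as T
import           Text.Parsec hiding (Line)
import           Text.Parsec.String (Parser)

-- Simplified stand-ins for the commit's types (authors and title omitted).
data Line = LDate Text | LSource Text | LContents Text
  deriving (Show, Eq)

data Parsed = Parsed
  { date     :: Maybe Text
  , source   :: Maybe Text
  , contents :: Text
  } deriving (Show, Eq)

emptyParsed :: Parsed
emptyParsed = Parsed { date = Nothing, source = Nothing, contents = T.empty }

-- Consume the rest of the current line, leaving the newline in the input.
restOfLine :: Parser String
restOfLine = many (noneOf "\n")

-- A metadata line: a fixed prefix, optional spaces, then the value.
metaP :: String -> (Text -> Line) -> Parser Line
metaP prefix ctor = do
  _ <- string prefix
  _ <- many (char ' ')
  ctor . T.pack <$> restOfLine

-- Try each metadata form first; anything else is a contents line.
lineP :: Parser Line
lineP = choice
  [ try (metaP "^@@date:" LDate)
  , try (metaP "^@@source:" LSource)
  , LContents . T.pack <$> restOfLine
  ]

-- Lines are separated by newlines; the whole input must be consumed.
linesP :: Parser [Line]
linesP = lineP `sepBy` newline <* eof

-- Fold one classified line into the accumulated record.
applyLine :: Parsed -> Line -> Parsed
applyLine p (LDate d)     = p { date = Just d }
applyLine p (LSource s)   = p { source = Just s }
applyLine p (LContents c) = p { contents = appendLine (contents p) c }
  where
    -- Assumption: content lines are joined with a newline.
    appendLine acc new
      | T.null acc = new
      | otherwise  = T.append acc (T.cons '\n' new)

-- Parse first, then fold over the list inside the Either.
parseLines :: Text -> Either ParseError Parsed
parseLines txt = foldl applyLine emptyParsed <$> parse linesP "" (T.unpack txt)
```

Because `restOfLine` stops before the newline instead of consuming it (which is the effect of replacing `manyTill anyChar tokenEnd` with `many (noneOf "\n")` in the commit), the separator is left for `sepBy newline` to handle. Returning `Either ParseError Parsed` keeps parse failures visible to the caller instead of folding over them.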