Przemyslaw Kaminski / haskell-gargantext
Commit 112ea7af (Verified), authored Sep 17, 2021 by Przemyslaw Kaminski
[framewrite] better line parsing
Parent: 267fae44
Showing 1 changed file with 77 additions and 4 deletions (+77 -4)
src/Gargantext/Core/Text/Corpus/Parsers/FrameWrite.hs @ 112ea7af
@@ -5,7 +5,8 @@ import Control.Monad (void)
 import Data.Maybe
 import Data.Text
 import Gargantext.Prelude
-import Text.Parsec
+import Prelude (String, (++))
+import Text.Parsec hiding (Line)
 import Text.Parsec.Combinator
 import Text.Parsec.String
@@ -33,7 +34,22 @@ sample =
     , "document contents 2"
     ]
+sampleUnordered =
+  unlines
+    [ "title1"
+    , "title2"
+    , "=="
+    , "document contents 1"
+    , "^@@date: 2021-09-10"
+    , "^@@authors: FirstName1, LastName1; FirstName2, LastName2"
+    , "^@@source: someSource"
+    , "document contents 2"
+    ]
 parseSample = parse documentP "sample" (unpack sample)
+parseSampleUnordered = parse documentP "sampleUnordered" (unpack sampleUnordered)
+parseLinesSample = parse documentLinesP "sample" (unpack sample)
+parseLinesSampleUnordered = parse documentLinesP "sampleUnordered" (unpack sampleUnordered)
 data Author =
     Author { firstName :: Text
@@ -48,6 +64,61 @@ data Parsed =
          , contents :: Text
          }
   deriving (Show)
+emptyParsed =
+  Parsed { title = ""
+         , authors = []
+         , date = Nothing
+         , source = Nothing
+         , contents = ""
+         }
+data Line =
+    LAuthors [Author]
+  | LDate Text
+  | LSource Text
+  | LContents Text
+  | LTitle Text
+  deriving (Show)
+parseLines :: Text -> Parsed
+parseLines text = foldl f emptyParsed lst
+  where
+    lst = parse documentLinesP "" (unpack text)
+    f (Parsed { .. }) (LAuthors as) = Parsed { authors = as, .. }
+    f (Parsed { .. }) (LDate d) = Parsed { date = d, .. }
+    f (Parsed { .. }) (LSource s) = Parsed { source = s, .. }
+    f (Parsed { .. }) (LContents c) = Parsed { contents = contents ++ c, .. }
+    f (Parsed { .. }) (LTitle t) = Parsed { title = t, .. }
+documentLinesP = do
+  t <- titleP
+  lines <- lineP `sepBy` newline
+  pure $ [LTitle $ pack t] ++ lines
+lineP :: Parser Line
+lineP = do
+  choice [ try authorsLineP
+         , try dateLineP
+         , try sourceLineP
+         , contentsLineP ]
+authorsLineP = do
+  authors <- authorsP
+  pure $ LAuthors authors
+dateLineP = do
+  date <- dateP
+  pure $ LDate $ pack date
+sourceLineP = do
+  source <- sourceP
+  pure $ LSource $ pack source
+contentsLineP = do
+  contents <- many (noneOf "\n")
+  pure $ LContents $ pack contents
+--------------------
 documentP = do
   t <- titleP
   a <- optionMaybe authorsP
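Review note: the hunk above introduces a line-oriented approach, where each line is classified into a `Line` value and the results are folded into a `Parsed` record. As shown, `parseLines` appears to fold directly over the `Either` returned by `parse`, assigns a plain `Text` to `date` and `source` (which `emptyParsed` initialises with `Nothing`), and appends `Text` with the list operator `(++)`, so it reads like work in progress. The following is a minimal, self-contained sketch of the same idea that type-checks against plain `parsec` and `text`. It is not the author's code. The names `restOfLine`, `tagged`, `linesP`, `step`, `parseBody` and `sampleBody` are hypothetical, title handling is omitted, and the author list is kept as raw `Text` because `authorsP` is not visible in this diff.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Data.Text (Text)
import qualified Data.Text as T
import           Text.Parsec (ParseError, char, choice, many, newline,
                              noneOf, parse, sepBy, string, try)
import           Text.Parsec.String (Parser)

-- Reduced stand-ins for the commit's Line and Parsed types (assumptions).
data Line
  = LAuthors Text      -- raw, unsplit author list (simplification)
  | LDate Text
  | LSource Text
  | LContents Text
  deriving (Eq, Show)

data Parsed = Parsed
  { authors  :: Maybe Text
  , date     :: Maybe Text
  , source   :: Maybe Text
  , contents :: Text
  } deriving (Eq, Show)

emptyParsed :: Parsed
emptyParsed = Parsed { authors = Nothing, date = Nothing
                     , source = Nothing, contents = "" }

-- Rest of the current line; the newline itself is left unconsumed.
restOfLine :: Parser Text
restOfLine = T.pack <$> many (noneOf "\n")

-- A metadata line: fixed prefix, optional spaces, then the value.
-- `try` undoes a partial prefix match so the next alternative can run.
tagged :: String -> (Text -> Line) -> Parser Line
tagged prefix ctor =
  try (string prefix *> many (char ' ')) *> (ctor <$> restOfLine)

lineP :: Parser Line
lineP = choice
  [ tagged "^@@authors:" LAuthors
  , tagged "^@@date:"    LDate
  , tagged "^@@source:"  LSource
  , LContents <$> restOfLine
  ]

linesP :: Parser [Line]
linesP = lineP `sepBy` newline

-- Fold one classified line into the accumulated record.
step :: Parsed -> Line -> Parsed
step p (LAuthors a) = p { authors = Just a }
step p (LDate d)    = p { date = Just d }
step p (LSource s)  = p { source = Just s }
step p (LContents c)
  | T.null c  = p                                   -- skip empty lines
  | otherwise = p { contents = contents p <> c <> "\n" }

-- The fold runs inside Either, so a parse error is propagated, not folded.
parseBody :: Text -> Either ParseError Parsed
parseBody txt = foldl step emptyParsed <$> parse linesP "" (T.unpack txt)

-- Body of sampleUnordered from the diff, without the title section.
sampleBody :: Text
sampleBody = T.unlines
  [ "document contents 1"
  , "^@@date: 2021-09-10"
  , "^@@authors: FirstName1, LastName1; FirstName2, LastName2"
  , "^@@source: someSource"
  , "document contents 2"
  ]
```

Loading this in GHCi and evaluating `parseBody sampleBody` should give a `Right` value whose `date` is `Just "2021-09-10"` and whose `contents` holds both content lines, regardless of where the metadata lines appear.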
@@ -77,7 +148,8 @@ authorP = do
   fn <- manyTill anyChar (char ',')
   _ <- many (char ' ')
   --ln <- manyTill anyChar (void (char ';') <|> tokenEnd)
-  ln <- manyTill anyChar (tokenEnd)
+  --ln <- manyTill anyChar (tokenEnd)
+  ln <- many (noneOf "\n")
   pure $ Author { firstName = pack fn
                 , lastName = pack ln }
 -- manyTill anyChar (void (char '\n') <|> eof)
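Review note: replacing `manyTill anyChar tokenEnd` with `many (noneOf "\n")` changes who consumes the line terminator. Presumably this matters because `documentLinesP` separates lines with `sepBy newline`, which needs the newline to still be in the input. The sketch below illustrates the difference. `tokenEnd` is copied from the file, while `oldRest`, `newRest` and `withLeftover` are hypothetical helper names.

```haskell
import Control.Monad (void)
import Text.Parsec (anyChar, char, eof, getInput, many, manyTill,
                    noneOf, parse, (<|>))
import Text.Parsec.String (Parser)

-- As defined in FrameWrite.hs: a newline or end of input.
tokenEnd :: Parser ()
tokenEnd = void (char '\n') <|> eof

-- Old style: reads up to AND including the line terminator.
oldRest :: Parser String
oldRest = manyTill anyChar tokenEnd

-- New style: stops in front of the newline and leaves it in the input.
newRest :: Parser String
newRest = many (noneOf "\n")

-- Run a parser and also report the input it left behind.
withLeftover :: Parser a -> String -> Either String (a, String)
withLeftover p input =
  either (Left . show) Right (parse ((,) <$> p <*> getInput) "" input)

-- withLeftover oldRest "a\nb"  gives  Right ("a", "b")
-- withLeftover newRest "a\nb"  gives  Right ("a", "\nb")
```

With the old form, a following `newline` separator would see `b` instead of `\n` and stop the `sepBy` loop after the first line.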
@@ -86,15 +158,16 @@ datePrefixP = do
   many (char ' ')
 dateP :: Parser [Char]
 dateP = try datePrefixP
-        *> manyTill anyChar tokenEnd
+        *> many (noneOf "\n")
 sourcePrefixP = do
   _ <- string "^@@source:"
   many (char ' ')
 sourceP :: Parser [Char]
 sourceP = try sourcePrefixP
-          *> manyTill anyChar tokenEnd
+          *> many (noneOf "\n")
 contentsP :: Parser String
 contentsP = many anyChar
 tokenEnd = void (char '\n') <|> eof
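Review note: `dateP` and `sourceP` wrap their prefix parsers in `try`, and `lineP` wraps the alternatives again. The reason is that all metadata prefixes start with `^@@`: Parsec's `string` consumes input on a partial match, and `<|>` does not try the next alternative once input has been consumed. A minimal sketch of that behaviour follows. `value`, `withoutTry`, `withTry` and `run` are hypothetical names, not part of the commit.

```haskell
import Text.Parsec (char, many, noneOf, parse, string, try, (<|>))
import Text.Parsec.String (Parser)

-- Optional spaces followed by the rest of the line.
value :: Parser String
value = many (char ' ') *> many (noneOf "\n")

-- No backtracking: a partial match of "^@@date:" consumes "^@@",
-- so the second alternative is never attempted.
withoutTry :: Parser String
withoutTry = (string "^@@date:" *> value)
         <|> (string "^@@source:" *> value)

-- With `try`, a failed prefix rewinds the input before the next alternative.
withTry :: Parser String
withTry = (try (string "^@@date:") *> value)
      <|> (try (string "^@@source:") *> value)

-- Collapse the parse result to Maybe for easy comparison.
run :: Parser String -> String -> Maybe String
run p = either (const Nothing) Just . parse p ""

-- run withoutTry "^@@source: someSource"  gives  Nothing
-- run withTry    "^@@source: someSource"  gives  Just "someSource"
```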