Julien Moutinho / haskell-gargantext
Commit df6f1dde, authored Apr 11, 2023 by Alexandre Delanoë
[FIX] Add more redundancies to texts Notes
Parent: fda25302
Showing 2 changed files with 12 additions and 7 deletions:
gargantext.cabal (+1, -1)
src/Gargantext/Core/Text/Corpus/Parsers/FrameWrite.hs (+11, -6)
gargantext.cabal @ df6f1dde

src/Gargantext/Core/Text/Corpus/Parsers/FrameWrite.hs @ df6f1dde
@@ -218,13 +218,14 @@ dateISOP = do
     rd = read :: [Char] -> Integer
     number = many1 digit
 
-sourcePrefixP :: Parser [Char]
-sourcePrefixP = do
-  _ <- string "source:"
-  many (char ' ')
 sourceP :: Parser [Char]
 sourceP = try sourcePrefixP
         *> many (noneOf "\n")
+  where
+    sourcePrefixP :: Parser [Char]
+    sourcePrefixP = do
+      _ <- string "source:"
+      many (char ' ')
 
 -- contentsP :: Parser String
 -- contentsP = many anyChar
@@ -233,15 +234,19 @@ tokenEnd :: Parser ()
 tokenEnd = void (char '\n') <|> eof
 
 --- MISC Tools
-
+-- Using ChunkAlong here enable redundancies in short corpora of texts
+-- maybe use splitEvery or chunkAlong depending on the size of the whole text
 text2titleParagraphs :: Int -> Text -> [(Text, Text)]
 text2titleParagraphs n = catMaybes
                        . List.map doTitle
-                       . (splitEvery n)
+                       . (chunkAlong n' n)
+                       -- . (splitEvery n)
                        . sentences
                        . DT.intercalate " " -- ". "
                        . List.filter (/= "")
                        . DT.lines
+  where
+    n' = n + (round $ (fromIntegral n) / (2 :: Double))
 
 doTitle :: [Text] -> Maybe (Text, Text)
 doTitle (t:ts) = Just (t, DT.concat ts)
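
To illustrate the switch from splitEvery to chunkAlong in text2titleParagraphs, here is a minimal sketch contrasting non-overlapping chunks with overlapping windows. The functions splitEvery', windows and windowSize are hypothetical stand-ins written for this note, not the project's own splitEvery and chunkAlong. In particular, reading chunkAlong's first argument as the window size and its second as the step is an assumption based on the n' / n naming in the diff, and the stand-in keeps a shorter final window, which the real function may not do.

```haskell
-- Hypothetical stand-in for splitEvery: consecutive, non-overlapping chunks.
splitEvery' :: Int -> [a] -> [[a]]
splitEvery' _ [] = []
splitEvery' n xs
  | n <= 0    = [xs]
  | otherwise = let (h, t) = splitAt n xs in h : splitEvery' n t

-- Hypothetical stand-in for chunkAlong (assumed argument order: size, step).
-- Takes windows of `size` elements, advancing by `step` each time.
-- A final window shorter than `size` is kept rather than dropped.
windows :: Int -> Int -> [a] -> [[a]]
windows size step xs
  | size <= 0 || step <= 0 = [xs]
  | null xs                = []
  | length xs <= size      = [xs]
  | otherwise              = take size xs : windows size step (drop step xs)

-- Window size as computed in the diff: n plus half of n, rounded.
windowSize :: Int -> Int
windowSize n = n + round (fromIntegral n / (2 :: Double))

main :: IO ()
main = do
  let sentenceIds = [1 .. 10] :: [Int]  -- stand-ins for ten sentences
      n           = 4
  -- Non-overlapping: [[1,2,3,4],[5,6,7,8],[9,10]]
  print (splitEvery' n sentenceIds)
  -- Overlapping: [[1,2,3,4,5,6],[5,6,7,8,9,10]]
  print (windows (windowSize n) n sentenceIds)
```

With n = 4 the window size is 6, so the two chunks share sentences 5 and 6; that overlap is presumably the redundancy the commit title refers to. Note that Haskell's round sends halves to the nearest even integer, so n = 5 gives a window of 7 rather than 8.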