Przemyslaw Kaminski / haskell-gargantext
Commit 45c3bb43 authored Jun 12, 2018 by Alexandre Delanoë
[TYPES] Terms and Human Lang types fusion.
parent 15511563

Showing 3 changed files with 15 additions and 12 deletions

src/Gargantext/Pipeline.hs (+7 -4)
src/Gargantext/Text/Metrics.hs (+1 -1)
src/Gargantext/Text/Terms.hs (+7 -7)

src/Gargantext/Pipeline.hs
...
@@ -55,15 +55,18 @@ workflow lang path = do
   -- Text <- IO Text <- FilePath
   text <- readFile path
-  -- context :: Text -> [Text]
   let contexts = splitBy (Sentences 5) text
+  -- Context :: Text -> [Text]
+  -- Contexts = Paragraphs n | Sentences n | Chars n
-  myterms <- extractTerms Mono lang contexts
-  -- myterms <- extractTerms (Mono lang) contexts # filter (\t -> not . elem t stopList)
-  --                                              # groupBy (Stem|GroupList)
+  myterms <- extractTerms (Mono lang) contexts
+  -- myterms # filter (\t -> not . elem t stopList)
+  --         # groupBy (Stem|GroupList)
   printDebug "myterms" (sum $ map length myterms)
   -- Bulding the map list
   -- compute copresences of terms
   -- Cooc = Map (Term, Term) Int
   let myCooc1 = cooc myterms
   printDebug "myCooc1" (M.size myCooc1)
...
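
The comments in the hunk above describe `cooc` as computing co-presences of terms into a `Map (Term, Term) Int`. The following is a minimal sketch of that idea, not the project's implementation: `coocSketch` is a hypothetical name, `Term` is simplified to `String`, and counting each unordered pair of distinct terms once per context is an assumption.

```haskell
module Main where

import           Data.List       (nub, sort)
import qualified Data.Map.Strict as M

-- Simplified stand-ins (assumptions): the real Term and Cooc types
-- are defined elsewhere in Gargantext and may differ.
type Term = String
type Cooc = M.Map (Term, Term) Int

-- For every unordered pair of distinct terms, count the number of
-- contexts in which both terms appear. Pairs are stored with the
-- lexicographically smaller term first.
coocSketch :: [[Term]] -> Cooc
coocSketch contexts = M.fromListWith (+)
  [ ((a, b), 1)
  | ctx    <- contexts
  , let ts = sort (nub ctx)          -- each term counted once per context
  , (i, a) <- zip [0 :: Int ..] ts
  , b      <- drop (i + 1) ts        -- only pairs with a < b
  ]

main :: IO ()
main = do
  let contexts = [["graph", "text", "graph"], ["text", "graph"], ["map"]]
      result   = coocSketch contexts
  print (M.toList result)  -- [(("graph","text"),2)]
  print (M.size result)    -- 1
```

Deduplicating terms inside a context keeps a repeated word from inflating the count; whether the real `cooc` does the same is not visible in this diff.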

src/Gargantext/Text/Metrics.hs
...
@@ -180,7 +180,7 @@ metrics_sentences_Test = metrics_sentences == metrics_sentences'
 -}
 metrics_terms :: IO [[Terms]]
-metrics_terms = mapM (terms MonoMulti EN) $ splitBy (Sentences 0) metrics_text
+metrics_terms = mapM (terms (MonoMulti EN)) $ splitBy (Sentences 0) metrics_text
 -- | Occurrences
 {-
...

src/Gargantext/Text/Terms.hs
...
@@ -42,23 +42,23 @@ import Gargantext.Core.Types
 import Gargantext.Text.Terms.Multi (multiterms)
 import Gargantext.Text.Terms.Mono (monoterms')
-data TermType = Mono | Multi | MonoMulti
+data TermType lang = Mono lang | Multi lang | MonoMulti lang
 -- remove Stop Words
 -- map (filter (\t -> not . elem t)) $
 ------------------------------------------------------------------------
 -- | Sugar to extract terms from text (hiddeng mapM from end user).
-extractTerms :: Traversable t => TermType -> Lang -> t Text -> IO (t [Terms])
-extractTerms termType lang = mapM (terms termType lang)
+extractTerms :: Traversable t => TermType Lang -> t Text -> IO (t [Terms])
+extractTerms termTypeLang = mapM (terms termTypeLang)
 ------------------------------------------------------------------------
 -- | Terms from Text
 -- Mono : mono terms
 -- Multi : multi terms
 -- MonoMulti : mono and multi
 -- TODO : multi terms should exclude mono (intersection is not empty yet)
-terms :: TermType -> Lang -> Text -> IO [Terms]
-terms Mono lang txt = pure $ monoterms' lang txt
-terms Multi lang txt = multiterms lang txt
-terms MonoMulti lang txt = terms Multi lang txt
+terms :: TermType Lang -> Text -> IO [Terms]
+terms (Mono lang) txt = pure $ monoterms' lang txt
+terms (Multi lang) txt = multiterms lang txt
+terms (MonoMulti lang) txt = terms (Multi lang) txt
 ------------------------------------------------------------------------
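
The fusion in this commit moves the language inside the `TermType` constructors, so callers pass one value such as `Mono EN` instead of two separate arguments. The sketch below reproduces that shape in a self-contained form. It is only an illustration: `Lang`, `Terms`, `monoterms'` and `multiterms` are toy stand-ins for the real Gargantext definitions, and `String` replaces `Text` so that only `base` is needed.

```haskell
module Main where

-- Toy stand-ins (assumptions); the real types live in Gargantext.Core.Types.
data Lang = EN | FR
  deriving (Show, Eq)

newtype Terms = Terms [String]
  deriving (Show, Eq)

-- After the commit: the language travels inside the constructor.
data TermType lang = Mono lang | Multi lang | MonoMulti lang
  deriving (Show, Eq)

-- Toy mono-term extractor: one Terms value per whitespace-separated word.
monoterms' :: Lang -> String -> [Terms]
monoterms' _ = map (\w -> Terms [w]) . words

-- Toy multi-term extractor: adjacent word pairs. It returns IO only to
-- mirror the result type that the diff implies for multiterms.
multiterms :: Lang -> String -> IO [Terms]
multiterms _ txt = pure [Terms [a, b] | (a, b) <- zip ws (drop 1 ws)]
  where
    ws = words txt

-- Same clauses as the new version in the diff, with String for Text.
terms :: TermType Lang -> String -> IO [Terms]
terms (Mono      lang) txt = pure $ monoterms' lang txt
terms (Multi     lang) txt = multiterms lang txt
terms (MonoMulti lang) txt = terms (Multi lang) txt

extractTerms :: Traversable t => TermType Lang -> t String -> IO (t [Terms])
extractTerms termTypeLang = mapM (terms termTypeLang)

main :: IO ()
main = do
  ts <- extractTerms (Mono EN) ["a small example", "second context"]
  print (map length ts)  -- [3,2]
```

One consequence of the fused type is that a term type can no longer be passed without its language, and call sites such as `mapM (terms (MonoMulti EN))` need the extra parentheses seen in the Metrics.hs hunk.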