Skip to content
Projects
Groups
Snippets
Help
Loading...
Help
Submit feedback
Contribute to GitLab
Sign in
Toggle navigation
haskell-gargantext
Project
Project
Details
Activity
Releases
Cycle Analytics
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Charts
Issues
191
Issues
191
List
Board
Labels
Milestones
Merge Requests
8
Merge Requests
8
CI / CD
CI / CD
Pipelines
Jobs
Schedules
Charts
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Charts
Create a new issue
Jobs
Commits
Issue Boards
Open sidebar
gargantext
haskell-gargantext
Commits
2eb05b13
Commit
2eb05b13
authored
Sep 08, 2025
by
Alfredo Di Napoli
Browse files
Options
Browse Files
Download
Email Patches
Plain Diff
Try to keep only the roots in searchTableNgrams (breaks tests)
parent
7be89ae5
Changes
1
Hide whitespace changes
Inline
Side-by-side
Showing
1 changed file
with
54 additions
and
49 deletions
+54
-49
Ngrams.hs
src/Gargantext/API/Ngrams.hs
+54
-49
No files found.
src/Gargantext/API/Ngrams.hs
View file @
2eb05b13
...
@@ -114,7 +114,7 @@ import Gargantext.API.Ngrams.Types
...
@@ -114,7 +114,7 @@ import Gargantext.API.Ngrams.Types
import
Gargantext.Core.NodeStory
hiding
(
buildForest
)
import
Gargantext.Core.NodeStory
hiding
(
buildForest
)
import
Gargantext.Core.NodeStory
qualified
as
NodeStory
import
Gargantext.Core.NodeStory
qualified
as
NodeStory
import
Gargantext.Core.Text.Ngrams
(
Ngrams
,
NgramsType
)
import
Gargantext.Core.Text.Ngrams
(
Ngrams
,
NgramsType
)
import
Gargantext.Core.Types
(
ListType
(
..
),
NodeId
,
ListId
,
TODO
,
assertValid
,
ContextId
,
HasValidationError
)
import
Gargantext.Core.Types
(
ListType
(
..
),
NodeId
,
ListId
,
TODO
,
assertValid
,
HasValidationError
)
import
Gargantext.Core.Types.Query
(
Limit
(
..
),
Offset
(
..
),
MinSize
(
..
),
MaxSize
(
..
))
import
Gargantext.Core.Types.Query
(
Limit
(
..
),
Offset
(
..
),
MinSize
(
..
),
MaxSize
(
..
))
import
Gargantext.Database.Action.Metrics.NgramsByContext
(
getOccByNgramsOnlyFast
)
import
Gargantext.Database.Action.Metrics.NgramsByContext
(
getOccByNgramsOnlyFast
)
import
Gargantext.Database.Prelude
import
Gargantext.Database.Prelude
...
@@ -482,58 +482,63 @@ searchTableNgrams :: Versioned (Map NgramsTerm NgramsElement)
...
@@ -482,58 +482,63 @@ searchTableNgrams :: Versioned (Map NgramsTerm NgramsElement)
->
Either
BuildForestError
(
VersionedWithCount
NgramsTable
)
->
Either
BuildForestError
(
VersionedWithCount
NgramsTable
)
searchTableNgrams
versionedTableMap
NgramsSearchQuery
{
..
}
=
searchTableNgrams
versionedTableMap
NgramsSearchQuery
{
..
}
=
let
tableMap
=
versionedTableMap
^.
v_data
let
tableMap
=
versionedTableMap
^.
v_data
in
case
buildForest
tableMap
of
in
case
keepRoots
<$>
buildForest
tableMap
of
Left
err
->
Left
err
Left
err
->
Left
err
Right
fs
->
Right
fs
->
let
forestRoots
=
Set
.
fromList
let
forestRoots
=
filterNgramsNodes
_nsq_listType
_nsq_minSize
_nsq_maxSize
_nsq_searchQuery
$
fs
.
Map
.
elems
.
destroyForest
.
filterNgramsNodes
_nsq_listType
_nsq_minSize
_nsq_maxSize
_nsq_searchQuery
$
fs
tableMapSorted
=
versionedTableMap
tableMapSorted
=
versionedTableMap
&
v_data
.~
(
NgramsTable
.
sortAndPaginate
.
withInners
tableMap
$
forestRoots
)
&
v_data
.~
(
NgramsTable
.
Map
.
elems
.
destroyForest
in
Right
$
toVersionedWithCount
(
Set
.
size
forestRoots
)
tableMapSorted
.
sortAndPaginateForest
_nsq_offset
_nsq_limit
_nsq_orderBy
.
withInnersForest
$
forestRoots
)
in
Right
$
toVersionedWithCount
(
length
forestRoots
)
tableMapSorted
keepRoots
::
Forest
NgramsElement
->
Forest
NgramsElement
keepRoots
=
filter
(
\
(
Node
r
_
)
->
isNothing
(
_ne_root
r
)
||
isNothing
(
_ne_parent
r
))
-- | For each input root, extends its occurrence count with
-- the information found in the subforest.
withInnersForest
::
Forest
NgramsElement
->
Forest
NgramsElement
withInnersForest
=
map
sumSubitemsOccurrences
where
where
sumSubitemsOccurrences
::
Tree
NgramsElement
->
Tree
NgramsElement
-- Sorts the input 'NgramsElement' list.
sumSubitemsOccurrences
(
Node
root
children
)
=
-- /IMPORTANT/: As we might be sorting ngrams in all sorts of language,
let
children'
=
withInnersForest
children
-- some of them might include letters with accents and other unicode symbols,
root'
=
root
{
_ne_occurrences
=
(
_ne_occurrences
root
)
<>
foldMap
(
_ne_occurrences
.
rootLabel
)
children'
}
-- but we need to filter those /diacritics/ out so that the sorting would
in
Node
root'
children'
-- happen in the way users would expect. See ticket #331.
sortOnOrder
::
Maybe
OrderBy
->
([
NgramsElement
]
->
[
NgramsElement
])
sortAndPaginateForest
::
Maybe
Offset
sortOnOrder
Nothing
=
sortOnOrder
(
Just
ScoreDesc
)
->
Limit
sortOnOrder
(
Just
TermAsc
)
=
List
.
sortBy
ngramTermsAscSorter
->
Maybe
OrderBy
sortOnOrder
(
Just
TermDesc
)
=
List
.
sortBy
ngramTermsDescSorter
->
Forest
NgramsElement
sortOnOrder
(
Just
ScoreAsc
)
=
List
.
sortOn
$
view
(
ne_occurrences
.
to
Set
.
size
)
->
Forest
NgramsElement
sortOnOrder
(
Just
ScoreDesc
)
=
List
.
sortOn
$
Down
.
view
(
ne_occurrences
.
to
Set
.
size
)
sortAndPaginateForest
mb_offset
limit
orderBy
xs
=
let
offset'
=
getOffset
$
maybe
0
identity
mb_offset
ngramTermsAscSorter
=
on
unicodeDUCETSorter
(
unNgramsTerm
.
view
ne_ngrams
)
in
take
(
getLimit
limit
)
ngramTermsDescSorter
=
on
(
\
n1
n2
->
unicodeDUCETSorter
n2
n1
)
(
unNgramsTerm
.
view
ne_ngrams
)
.
drop
offset'
.
sortOnOrderForest
orderBy
-- | For each input root, extends its occurrence count with
$
xs
-- the information found in the subitems.
withInners
::
Map
NgramsTerm
NgramsElement
->
Set
NgramsElement
->
Set
NgramsElement
-- Sorts the input 'NgramsElement' list.
withInners
tblMap
roots
=
Set
.
map
addSubitemsOccurrences
roots
-- /IMPORTANT/: As we might be sorting ngrams in all sorts of language,
where
-- some of them might include letters with accents and other unicode symbols,
addSubitemsOccurrences
::
NgramsElement
->
NgramsElement
-- but we need to filter those /diacritics/ out so that the sorting would
addSubitemsOccurrences
e
=
-- happen in the way users would expect. See ticket #331.
e
{
_ne_occurrences
=
foldl'
alterOccurrences
(
e
^.
ne_occurrences
)
(
e
^.
ne_children
)
}
sortOnOrderForest
::
Maybe
OrderBy
->
(
Forest
NgramsElement
->
Forest
NgramsElement
)
sortOnOrderForest
Nothing
=
sortOnOrderForest
(
Just
ScoreDesc
)
alterOccurrences
::
Set
ContextId
->
NgramsTerm
->
Set
ContextId
sortOnOrderForest
(
Just
TermAsc
)
=
List
.
sortBy
(
\
(
Node
t1
_
)
(
Node
t2
_
)
->
ngramTermsAscSorter
t1
t2
)
alterOccurrences
occs
t
=
case
Map
.
lookup
t
tblMap
of
sortOnOrderForest
(
Just
TermDesc
)
=
List
.
sortBy
(
\
(
Node
t1
_
)
(
Node
t2
_
)
->
ngramTermsDescSorter
t1
t2
)
Nothing
->
occs
sortOnOrderForest
(
Just
ScoreAsc
)
=
List
.
sortOn
$
\
(
Node
root
_
)
->
root
^.
(
ne_occurrences
.
to
Set
.
size
)
Just
e'
->
occs
<>
e'
^.
ne_occurrences
sortOnOrderForest
(
Just
ScoreDesc
)
=
List
.
sortOn
$
Down
.
(
\
(
Node
root
_
)
->
root
^.
(
ne_occurrences
.
to
Set
.
size
))
-- | Paginate the results
ngramTermsAscSorter
::
NgramsElement
->
NgramsElement
->
Ordering
sortAndPaginate
::
Set
NgramsElement
->
[
NgramsElement
]
ngramTermsAscSorter
=
on
unicodeDUCETSorter
(
unNgramsTerm
.
view
ne_ngrams
)
sortAndPaginate
xs
=
let
offset'
=
getOffset
$
maybe
0
identity
_nsq_offset
ngramTermsDescSorter
::
NgramsElement
->
NgramsElement
->
Ordering
in
take
(
getLimit
_nsq_limit
)
ngramTermsDescSorter
=
on
(
\
n1
n2
->
unicodeDUCETSorter
n2
n1
)
(
unNgramsTerm
.
view
ne_ngrams
)
.
drop
offset'
.
sortOnOrder
_nsq_orderBy
.
Set
.
toList
$
xs
-- | This function allows sorting two texts via their unicode sorting
-- | This function allows sorting two texts via their unicode sorting
-- (as opposed as the standard lexicographical sorting) by relying on
-- (as opposed as the standard lexicographical sorting) by relying on
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment