Skip to content
Projects
Groups
Snippets
Help
Loading...
Help
Submit feedback
Contribute to GitLab
Sign in
Toggle navigation
haskell-gargantext
Project
Project
Details
Activity
Releases
Cycle Analytics
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Charts
Issues
191
Issues
191
List
Board
Labels
Milestones
Merge Requests
8
Merge Requests
8
CI / CD
CI / CD
Pipelines
Jobs
Schedules
Charts
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Charts
Create a new issue
Jobs
Commits
Issue Boards
Open sidebar
gargantext
haskell-gargantext
Commits
2eb05b13
Commit
2eb05b13
authored
Sep 08, 2025
by
Alfredo Di Napoli
Browse files
Options
Browse Files
Download
Email Patches
Plain Diff
Try to keep only the roots in searchTableNgrams (breaks tests)
parent
7be89ae5
Changes
1
Hide whitespace changes
Inline
Side-by-side
Showing
1 changed file
with
54 additions
and
49 deletions
+54
-49
Ngrams.hs
src/Gargantext/API/Ngrams.hs
+54
-49
No files found.
src/Gargantext/API/Ngrams.hs
View file @
2eb05b13
...
...
@@ -114,7 +114,7 @@ import Gargantext.API.Ngrams.Types
import
Gargantext.Core.NodeStory
hiding
(
buildForest
)
import
Gargantext.Core.NodeStory
qualified
as
NodeStory
import
Gargantext.Core.Text.Ngrams
(
Ngrams
,
NgramsType
)
import
Gargantext.Core.Types
(
ListType
(
..
),
NodeId
,
ListId
,
TODO
,
assertValid
,
ContextId
,
HasValidationError
)
import
Gargantext.Core.Types
(
ListType
(
..
),
NodeId
,
ListId
,
TODO
,
assertValid
,
HasValidationError
)
import
Gargantext.Core.Types.Query
(
Limit
(
..
),
Offset
(
..
),
MinSize
(
..
),
MaxSize
(
..
))
import
Gargantext.Database.Action.Metrics.NgramsByContext
(
getOccByNgramsOnlyFast
)
import
Gargantext.Database.Prelude
...
...
@@ -482,58 +482,63 @@ searchTableNgrams :: Versioned (Map NgramsTerm NgramsElement)
->
Either
BuildForestError
(
VersionedWithCount
NgramsTable
)
searchTableNgrams
versionedTableMap
NgramsSearchQuery
{
..
}
=
let
tableMap
=
versionedTableMap
^.
v_data
in
case
buildForest
tableMap
of
in
case
keepRoots
<$>
buildForest
tableMap
of
Left
err
->
Left
err
Right
fs
->
let
forestRoots
=
Set
.
fromList
.
Map
.
elems
.
destroyForest
.
filterNgramsNodes
_nsq_listType
_nsq_minSize
_nsq_maxSize
_nsq_searchQuery
$
fs
let
forestRoots
=
filterNgramsNodes
_nsq_listType
_nsq_minSize
_nsq_maxSize
_nsq_searchQuery
$
fs
tableMapSorted
=
versionedTableMap
&
v_data
.~
(
NgramsTable
.
sortAndPaginate
.
withInners
tableMap
$
forestRoots
)
in
Right
$
toVersionedWithCount
(
Set
.
size
forestRoots
)
tableMapSorted
&
v_data
.~
(
NgramsTable
.
Map
.
elems
.
destroyForest
.
sortAndPaginateForest
_nsq_offset
_nsq_limit
_nsq_orderBy
.
withInnersForest
$
forestRoots
)
in
Right
$
toVersionedWithCount
(
length
forestRoots
)
tableMapSorted
keepRoots
::
Forest
NgramsElement
->
Forest
NgramsElement
keepRoots
=
filter
(
\
(
Node
r
_
)
->
isNothing
(
_ne_root
r
)
||
isNothing
(
_ne_parent
r
))
-- | For each input root, extends its occurrence count with
-- the information found in the subforest.
withInnersForest
::
Forest
NgramsElement
->
Forest
NgramsElement
withInnersForest
=
map
sumSubitemsOccurrences
where
-- Sorts the input 'NgramsElement' list.
-- /IMPORTANT/: As we might be sorting ngrams in all sorts of language,
-- some of them might include letters with accents and other unicode symbols,
-- but we need to filter those /diacritics/ out so that the sorting would
-- happen in the way users would expect. See ticket #331.
sortOnOrder
::
Maybe
OrderBy
->
([
NgramsElement
]
->
[
NgramsElement
])
sortOnOrder
Nothing
=
sortOnOrder
(
Just
ScoreDesc
)
sortOnOrder
(
Just
TermAsc
)
=
List
.
sortBy
ngramTermsAscSorter
sortOnOrder
(
Just
TermDesc
)
=
List
.
sortBy
ngramTermsDescSorter
sortOnOrder
(
Just
ScoreAsc
)
=
List
.
sortOn
$
view
(
ne_occurrences
.
to
Set
.
size
)
sortOnOrder
(
Just
ScoreDesc
)
=
List
.
sortOn
$
Down
.
view
(
ne_occurrences
.
to
Set
.
size
)
ngramTermsAscSorter
=
on
unicodeDUCETSorter
(
unNgramsTerm
.
view
ne_ngrams
)
ngramTermsDescSorter
=
on
(
\
n1
n2
->
unicodeDUCETSorter
n2
n1
)
(
unNgramsTerm
.
view
ne_ngrams
)
-- | For each input root, extends its occurrence count with
-- the information found in the subitems.
withInners
::
Map
NgramsTerm
NgramsElement
->
Set
NgramsElement
->
Set
NgramsElement
withInners
tblMap
roots
=
Set
.
map
addSubitemsOccurrences
roots
where
addSubitemsOccurrences
::
NgramsElement
->
NgramsElement
addSubitemsOccurrences
e
=
e
{
_ne_occurrences
=
foldl'
alterOccurrences
(
e
^.
ne_occurrences
)
(
e
^.
ne_children
)
}
alterOccurrences
::
Set
ContextId
->
NgramsTerm
->
Set
ContextId
alterOccurrences
occs
t
=
case
Map
.
lookup
t
tblMap
of
Nothing
->
occs
Just
e'
->
occs
<>
e'
^.
ne_occurrences
-- | Paginate the results
sortAndPaginate
::
Set
NgramsElement
->
[
NgramsElement
]
sortAndPaginate
xs
=
let
offset'
=
getOffset
$
maybe
0
identity
_nsq_offset
in
take
(
getLimit
_nsq_limit
)
.
drop
offset'
.
sortOnOrder
_nsq_orderBy
.
Set
.
toList
$
xs
sumSubitemsOccurrences
::
Tree
NgramsElement
->
Tree
NgramsElement
sumSubitemsOccurrences
(
Node
root
children
)
=
let
children'
=
withInnersForest
children
root'
=
root
{
_ne_occurrences
=
(
_ne_occurrences
root
)
<>
foldMap
(
_ne_occurrences
.
rootLabel
)
children'
}
in
Node
root'
children'
sortAndPaginateForest
::
Maybe
Offset
->
Limit
->
Maybe
OrderBy
->
Forest
NgramsElement
->
Forest
NgramsElement
sortAndPaginateForest
mb_offset
limit
orderBy
xs
=
let
offset'
=
getOffset
$
maybe
0
identity
mb_offset
in
take
(
getLimit
limit
)
.
drop
offset'
.
sortOnOrderForest
orderBy
$
xs
-- Sorts the input 'NgramsElement' list.
-- /IMPORTANT/: As we might be sorting ngrams in all sorts of language,
-- some of them might include letters with accents and other unicode symbols,
-- but we need to filter those /diacritics/ out so that the sorting would
-- happen in the way users would expect. See ticket #331.
sortOnOrderForest
::
Maybe
OrderBy
->
(
Forest
NgramsElement
->
Forest
NgramsElement
)
sortOnOrderForest
Nothing
=
sortOnOrderForest
(
Just
ScoreDesc
)
sortOnOrderForest
(
Just
TermAsc
)
=
List
.
sortBy
(
\
(
Node
t1
_
)
(
Node
t2
_
)
->
ngramTermsAscSorter
t1
t2
)
sortOnOrderForest
(
Just
TermDesc
)
=
List
.
sortBy
(
\
(
Node
t1
_
)
(
Node
t2
_
)
->
ngramTermsDescSorter
t1
t2
)
sortOnOrderForest
(
Just
ScoreAsc
)
=
List
.
sortOn
$
\
(
Node
root
_
)
->
root
^.
(
ne_occurrences
.
to
Set
.
size
)
sortOnOrderForest
(
Just
ScoreDesc
)
=
List
.
sortOn
$
Down
.
(
\
(
Node
root
_
)
->
root
^.
(
ne_occurrences
.
to
Set
.
size
))
ngramTermsAscSorter
::
NgramsElement
->
NgramsElement
->
Ordering
ngramTermsAscSorter
=
on
unicodeDUCETSorter
(
unNgramsTerm
.
view
ne_ngrams
)
ngramTermsDescSorter
::
NgramsElement
->
NgramsElement
->
Ordering
ngramTermsDescSorter
=
on
(
\
n1
n2
->
unicodeDUCETSorter
n2
n1
)
(
unNgramsTerm
.
view
ne_ngrams
)
-- | This function allows sorting two texts via their unicode sorting
-- (as opposed as the standard lexicographical sorting) by relying on
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment