Closed
Open
Opened May 30, 2023 by Przemyslaw Kaminski@cgenie

Tokenize ngrams on the backend, don't do it on the frontend

We currently perform ngram analysis on the backend (via CoreNLP; we also use PostgreSQL full-text search). However, there is also custom code on the frontend that is supposed to find ngrams in a given text and highlight them:

https://gitlab.iscpif.fr/gargantext/purescript-gargantext/blob/baaea6e34deb185ae0b1bca0fdecc21a9f210d87/src/Gargantext/Core/NgramsTable/Functions.purs#L145

This code is quite complex and error-prone, and #551 (closed) suggests it may also be slow.

We can use the fact that our documents are immutable.

My suggestion is to compute the ngram positions once, store them in the DB (they stay valid because the text never changes), and just serve the frontend a list like this:

[
  { "from": 10
  , "to": 30
  , "text": "Michael Jackson"
  , "type": "MapTerm"
  ...
  }
]
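
To make the proposal concrete, here is a minimal sketch of how such a list could be computed once per document. It is written in Python purely for illustration (the real backend is not Python), and everything in it is an assumption: the function name `find_term_spans`, the case-insensitive whole-word matching, and the reading of `from`/`to` as half-open character offsets. Only the four field names come from the sample above.

```python
import re
from typing import Dict, List


def find_term_spans(text: str, terms: Dict[str, str]) -> List[dict]:
    """Return every occurrence of each known term in `text`.

    `terms` maps a term's text to its list type (e.g. "MapTerm").
    Offsets are half-open: text[span["from"]:span["to"]] == span["text"].
    This is an assumed convention, not one defined in the issue.
    """
    spans = []
    for term, term_type in terms.items():
        # Whole-word, case-insensitive match; re.escape keeps any
        # punctuation in the term literal.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            spans.append({
                "from": match.start(),
                "to": match.end(),
                "text": match.group(0),
                "type": term_type,
            })
    # Order by start offset; at equal starts, the longest span comes first.
    spans.sort(key=lambda s: (s["from"], -s["to"]))
    return spans


if __name__ == "__main__":
    doc = "Michael Jackson sang; Jackson danced."
    for span in find_term_spans(doc, {"Michael Jackson": "MapTerm"}):
        print(span)
```

Because documents are immutable, a result like this would only need recomputing when the term lists change, not on every page view.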

This way the frontend doesn't care how the ngrams were generated. It just does what a frontend should do: be dumb and show data rather than compute it. I guess this would also give a cleaner split of responsibilities: if highlighting doesn't work, it's the frontend's fault; if terms aren't shown, it's the backend's fault.
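
As a rough illustration of how little the frontend would have to do, the sketch below wraps each served span in a `<mark>` element. It is Python rather than PureScript, and the function name `highlight`, the half-open offset convention, the HTML output format, and the choice to skip spans that overlap an already rendered one are all assumptions made for the example.

```python
from html import escape
from typing import List


def highlight(text: str, spans: List[dict]) -> str:
    """Render `text` as HTML, wrapping each span in <mark class="TYPE">.

    Spans are dicts with "from", "to" and "type" keys; offsets are assumed
    to be half-open character positions into `text`.
    """
    pieces = []
    cursor = 0
    for span in sorted(spans, key=lambda s: (s["from"], -s["to"])):
        if span["from"] < cursor:
            # Overlaps a span that was already rendered: skip it (assumed policy).
            continue
        pieces.append(escape(text[cursor:span["from"]]))
        pieces.append('<mark class="{}">{}</mark>'.format(
            escape(span["type"]),
            escape(text[span["from"]:span["to"]]),
        ))
        cursor = span["to"]
    pieces.append(escape(text[cursor:]))
    return "".join(pieces)


if __name__ == "__main__":
    served = [{"from": 6, "to": 21, "text": "Michael Jackson", "type": "MapTerm"}]
    print(highlight("I saw Michael Jackson.", served))
```

No tokenization or matching happens here; the function only slices the text at the offsets it was given.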

Labels: high, toClose?
Reference: gargantext/purescript-gargantext#553