
Opened Sep 15, 2025 by david Chavalarias (@davidchavalarias)

Failure to import large corpora

Summary

This might be specific to either the IMT instance or the HAL query, but GTX seems to fail to import the full corpus of all IMT publications: https://imt.sub.gargantext.org/#/share/NodeCorpus/132585

There is also an error in the document chart, which suggests that some process was interrupted partway through: the chart shows mostly documents from 2014, which does not reflect the state of the system.

image

Steps to reproduce

The query is: API -> in database: HAL -> filter with organization: IMT: all_IMT

What is the current bug behavior?

The import is stuck at 49257 documents. Relaunching the query does not update the corpus. The estimated final corpus size is about 100k documents.

What is the expected correct behavior?

  • When the API search is relaunched on the node https://imt.sub.gargantext.org/#/share/NodeCorpus/132585, the import should resume and run to completion, at about 100k documents.
  • On the first launch, it should have imported all ~100k documents.
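
To make the expected behavior concrete, here is a minimal sketch of a resumable, paginated import. It is not GarganText's implementation and does not call the HAL API: `resume_import`, `fetch_page`, `make_flaky_source`, the page size, and the failure point are all hypothetical names and values chosen for illustration. The idea is only that an interrupted run returns a checkpoint, and a relaunch continues from it instead of staying stuck.

```python
from typing import Callable, List, Set, Tuple

# (document ids in this page, total number of hits reported by the source)
Page = Tuple[List[str], int]


def resume_import(fetch_page: Callable[[int, int], Page],
                  imported: Set[str],
                  start: int = 0,
                  page_size: int = 1000) -> int:
    """Import pages from offset `start` into `imported`.

    Returns the offset to persist, so that a relaunch can continue from it.
    """
    offset = start
    while True:
        try:
            ids, total = fetch_page(offset, page_size)
        except IOError:
            # Interrupted: keep what was imported and report where to resume.
            return offset
        imported.update(ids)  # a set makes re-importing a page harmless
        offset += len(ids)
        if not ids or offset >= total:
            return offset


def make_flaky_source(total: int, fail_at: int) -> Callable[[int, int], Page]:
    """Hypothetical stand-in for a remote API that fails once at `fail_at`."""
    state = {"failed": False}

    def fetch(offset: int, rows: int) -> Page:
        if offset >= fail_at and not state["failed"]:
            state["failed"] = True
            raise IOError("simulated interruption")
        end = min(offset + rows, total)
        return [f"doc-{i}" for i in range(offset, end)], total

    return fetch


# Illustrative numbers only, loosely mirroring the report (~100k documents).
imported: Set[str] = set()
source = make_flaky_source(total=100_000, fail_at=49_000)

first_checkpoint = resume_import(source, imported)                # first launch, interrupted
final_checkpoint = resume_import(source, imported, first_checkpoint)  # relaunch resumes
```

With this design, the first launch stops at the checkpoint where the interruption happened, and the relaunch fetches only the remaining pages. Whether GarganText stores such a checkpoint per corpus node is an open question for this issue, not something the sketch asserts.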
Edited Sep 15, 2025 by david Chavalarias
Labels: To Do
Reference: gargantext/haskell-gargantext#511