    [worker] fix an unfortunate coincidence of various async issues · 406cd89e
    Przemyslaw Kaminski authored
    This is described in this comment:
    #477 (comment 14458)
    
    I repaste it here, for history:
    
    - the job timeout was only 30 seconds and this was a big zip file, so the job timed out in the worker
    - however, this was added recently: https://gitlab.iscpif.fr/gargantext/haskell-gargantext/blame/dev/src/Gargantext/Database/Action/Flow.hs#L490; because of it, the timeout wasn't caught and the worker happily continued
    - the job (most probably) finished normally
    - the job was then restarted, because the default strategy for timeouts is to restart the job
    - for sending files, we use Postgres large objects, because that keeps our JSONs small (see the sketch after this list)
    - when the job finishes, it deletes the large object for good, so that we don't leave large, unused blob data behind
    - however, that job was restarted, and there was no longer a large object to work on
    - you got an SQL error, but that wasn't the root cause
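
    For background, a minimal sketch of the large-object round trip described above, using postgresql-simple's `Database.PostgreSQL.Simple.LargeObjects`. This is not the actual Flow.hs code; `storeFile` and `cleanupFile` are hypothetical names and the payload handling is simplified.

    ```haskell
    import qualified Data.ByteString as BS
    import Database.PostgreSQL.Simple (Connection, withTransaction)
    import Database.PostgreSQL.Simple.LargeObjects
    import System.IO (IOMode (WriteMode))

    -- Store the file as a large object and return its Oid: the job's JSON
    -- then only carries the small Oid instead of the whole payload.
    storeFile :: Connection -> BS.ByteString -> IO Oid
    storeFile conn payload = withTransaction conn $ do
      oid <- loCreat conn
      fd  <- loOpen conn oid WriteMode
      _   <- loWrite conn fd payload
      loClose conn fd
      pure oid

    -- When the job finishes, delete the large object so no unused blob data
    -- is left behind.  If the same job is restarted afterwards, the Oid no
    -- longer points at anything, which is the failure mode described above.
    cleanupFile :: Connection -> Oid -> IO ()
    cleanupFile conn oid = withTransaction conn (loUnlink conn oid)
    ```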
    
    The solution is:
    - don't catch every exception; instead, carefully handle `Timeout` or `KillWorkerSafely` (see the sketch below)
    - increase the job timeout for file upload
    - change the timeout strategy for file upload to `TSDelete`, i.e. don't retry that job anymore
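
    To illustrate the first point, a minimal sketch of handling only specific exceptions with base's `Control.Exception.catches`; `Timeout` and `KillWorkerSafely` here are stand-ins for the worker's real exception types, and `runJob` is a hypothetical wrapper, not the actual worker code.

    ```haskell
    import Control.Exception (Exception, Handler (..), catches)

    -- Stand-ins for the worker's exception types; the real Timeout and
    -- KillWorkerSafely are defined in the worker code.
    data Timeout = Timeout deriving (Show)
    instance Exception Timeout

    data KillWorkerSafely = KillWorkerSafely deriving (Show)
    instance Exception KillWorkerSafely

    -- Run a job, handling only the exceptions we know how to deal with.
    -- Anything else (e.g. the SQL error from the missing large object)
    -- propagates instead of being silently swallowed.
    runJob :: IO () -> IO ()
    runJob job =
      job `catches`
        [ Handler $ \Timeout          -> putStrLn "job timed out: clean up and report"
        , Handler $ \KillWorkerSafely -> putStrLn "worker shutting down: requeue gracefully"
        ]
    ```

    With `TSDelete`, a timed-out upload job is dropped rather than requeued, so the large-object cleanup can only ever run once per upload.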