    [worker] fix an unfortunate coincidence of various async issues · 406cd89e
    Przemyslaw Kaminski authored
This is described in this comment:
    #477 (comment 14458)
    
I repaste it here, for the record:
    
- the job timeout was only 30 seconds and this was a big zip file, so the job timed out in the worker
- however, this was recently added https://gitlab.iscpif.fr/gargantext/haskell-gargantext/blame/dev/src/Gargantext/Database/Action/Flow.hs#L490 and the timeout wasn't caught there, so the worker happily continued
- the job (most probably) finished normally
- the job was then restarted, because the default strategy for timeouts is to restart the job
- for sending files, we use PostgreSQL large objects, because that keeps our JSONs small
- when the job finishes, it permanently deletes the large object, so that we don't leave large, unused blob data behind
- however, that job was restarted, and there was no longer a large object to work on
- you got an SQL error, but that wasn't the root cause
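
The failure mode in the middle of that timeline can be sketched in Haskell (hypothetical names, not the actual Flow.hs code): wrapping the whole job body in a catch-all handler also traps the asynchronous timeout exception, so the worker reports success for a job that was in fact killed.

```haskell
import Control.Exception (SomeException, throwIO, try)

-- Hypothetical stand-in for the worker's job body; the real code
-- lives in Gargantext.Database.Action.Flow.
runJobSwallowingEverything :: IO () -> IO ()
runJobSwallowingEverything job = do
  result <- try job :: IO (Either SomeException ())
  case result of
    -- A catch-all like this also traps the Timeout thrown by the
    -- job runner, so the worker "continues happily".
    Left _err -> putStrLn "exception swallowed, worker continues"
    Right ()  -> putStrLn "job done"

main :: IO ()
main = runJobSwallowingEverything (throwIO (userError "simulated timeout"))
```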
    
The solution is:
- don't catch every exception blindly; instead, carefully handle `Timeout` and `KillWorkerSafely`
- increase the job timeout for file upload
- change the timeout strategy for file upload to `TSDelete`, i.e. don't retry that job anymore
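
The first bullet can be sketched as follows, with hypothetical exception types standing in for the worker library's real `Timeout` and `KillWorkerSafely`: catch ordinary job failures, but rethrow the control exceptions so the job runner can see the timeout and apply its strategy.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception
  (Exception, SomeException, fromException, throwIO, try)

-- Hypothetical stand-ins for the worker library's control exceptions.
data Timeout = Timeout deriving Show
instance Exception Timeout

data KillWorkerSafely = KillWorkerSafely deriving Show
instance Exception KillWorkerSafely

-- Catch ordinary failures, but rethrow Timeout and KillWorkerSafely
-- so the runner can act on them instead of seeing a "successful" job.
runJobCarefully :: IO a -> IO (Either SomeException a)
runJobCarefully job = do
  result <- try job
  case result of
    Right x -> pure (Right x)
    Left e
      | Just (t :: Timeout)          <- fromException e -> throwIO t
      | Just (k :: KillWorkerSafely) <- fromException e -> throwIO k
      | otherwise -> pure (Left e)  -- ordinary failure: report it, don't crash
```

With the longer timeout and `TSDelete`, a job that still exceeds its budget is dropped rather than restarted, so it can never run a second time against a large object that the first run already deleted.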