Skip to content

  • Projects
  • Groups
  • Snippets
  • Help
    • Loading...
    • Help
    • Submit feedback
    • Contribute to GitLab
  • Sign in
haskell-gargantext
haskell-gargantext
  • Project
    • Project
    • Details
    • Activity
    • Releases
    • Cycle Analytics
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
    • Charts
  • Issues 164
    • Issues 164
    • List
    • Board
    • Labels
    • Milestones
  • Merge Requests 8
    • Merge Requests 8
  • CI / CD
    • CI / CD
    • Pipelines
    • Jobs
    • Schedules
    • Charts
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Members
    • Members
  • Collapse sidebar
  • Activity
  • Graph
  • Charts
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
  • gargantext
  • haskell-gargantexthaskell-gargantext
  • Issues
  • #480

Closed
Open
Opened Jun 16, 2025 by Alfredo Di Napoli@AlfredoDiNapoli
  • Report abuse
  • New issue
Report abuse New issue

Weird behaviour when rolling back Postgres transactions

When working on #476, I had some code written like this:

insertMasterDoc :: ( HasNodeError err
                   , UniqParameters doc
                   , FlowCorpus doc
                   , MkCorpus   c
                   )
                 => GargConfig
                 -> UncommittedNgrams doc
                 -- ^ The ngrams extracted for /all/ the documents
                 -- and indexed by the hash of the given document.
                 -- We can use this map to associate the document
                 -- with the node being created.
                 -> Maybe c
                 -> doc
                 -> DBUpdate err DocId
insertMasterDoc cfg uncommittedNgrams c h = do
  (masterUserId, _, masterCorpusId) <- getOrMkRootWithCorpus cfg MkCorpusUserMaster c
  documentWithId <- insertDoc masterUserId masterCorpusId (toNode masterUserId Nothing h)

  _ <- Doc.add masterCorpusId [_index documentWithId]

  -- TODO
  -- create a corpus with database name (CSV or PubMed)
  -- add documents to the corpus (create node_node link)
  -- this will enable global database monitoring

  case commitNgramsForDocument uncommittedNgrams documentWithId of
    Left failed ->
      -- this allows rollingback the whole tx, and we won't leave dangling nodes behind.
      nodeError $ NodeCreationFailed (InsertDocFailed failed)
    Right ngramsDocsMap -> do
     lId <- getOrMkList masterCorpusId masterUserId
     _ <- saveDocNgramsWith lId ngramsDocsMap
     pure $ contextId2NodeId $ _index documentWithId

The call to nodeError in the middle of that DBUpdate transaction should have the effect to rollback the entire transaction, meaning that we shouldn't create a document node if we cannot extract Ngrams out of it, as that would be stateful (that's an arguable design decision which I think I have changed my mind about, but that's orthogonal to this issue).

Then, I had a test like this:

corpusAddDocBatchSurvival :: TestEnv -> Assertion
corpusAddDocBatchSurvival env = runTestMonad env $ do
  cfg <- view hasConfig
  parentId <- runDBQuery $ getRootId (UserName userMaster)
  [corpus] <- runDBQuery $ getCorporaWithParentId parentId
  let la  = Mono EN
  nlp <- view (nlpServerGet (_tt_lang la))

  oldCount <- docsInCorpusCount

  let docs = [exampleDocument_06, exampleDocument_07, exampleDocument_08 ]

  uNgrams1 <- extractNgramsFromDocument nlp la exampleDocument_06
  uNgrams2 <- extractNgramsFromDocument nlp la exampleDocument_08

  (failures, savedDocs) <- insertMasterDocs cfg (uNgrams1 <> uNgrams2) (Just $ _node_hyperdata corpus) docs

  c' <- docsInCorpusCount
  liftIO $ do
    failures  `shouldBe` [DocumentInsertionError "InternalNodeError Cannot make node due to: Couldn't find the associated ngrams for document with hash \\x666b8e7bfd7c0af37d630e1097791f7ba438a669ecb6d1cb38014edd0b7a2977, therefore the added document won't have any ngrams."]
    c' `shouldBe` oldCount + 2
    savedDocs `shouldBe` [ UnsafeMkNodeId 5, UnsafeMkNodeId 6 ]

With my surprise, this test fails with:

  test/Test/Database/Operations/DocumentSearch.hs:235:15: 
  1) Database, With test user, With test corpus, With test documents, Can survive failures in a batch
       expected: [nodeId-5,nodeId-6]
        but got: [nodeId-5,nodeId-7]

This tests simulates adding 3 documents, the middle one having no ngrams, so the system should persist only the first and the last. To my surprise, it actually persisted all three of them despite the call to nodeError to rollback the transaction.

While looking at the implementation of evalOp in the .Transactional module I thought I discovered the bug: in the case of DBFail we are using throwError, but really we should throw an exception in IO so that the exception handler as part of withTransaction would kick in, rolling back the transaction.

However, after I did the changes, the bug is still there -- I even changed the type of evalOp to make it monomorphic over IO but without luck.

There must be something glaringly obvious (or very subtle) I'm missing as we have tests to assess that a DBTx can rollback and they pass.

Assignee
Assign to
None
Milestone
None
Assign milestone
Time tracking
None
Due date
None
1
Labels
To Do
Assign labels
  • View project labels
Reference: gargantext/haskell-gargantext#480