Weird behaviour when rolling back Postgres transactions
When working on #476, I had some code written like this:
```haskell
insertMasterDoc :: ( HasNodeError err
                   , UniqParameters doc
                   , FlowCorpus doc
                   , MkCorpus c
                   )
                => GargConfig
                -> UncommittedNgrams doc
                -- ^ The ngrams extracted for /all/ the documents
                -- and indexed by the hash of the given document.
                -- We can use this map to associate the document
                -- with the node being created.
                -> Maybe c
                -> doc
                -> DBUpdate err DocId
insertMasterDoc cfg uncommittedNgrams c h = do
  (masterUserId, _, masterCorpusId) <- getOrMkRootWithCorpus cfg MkCorpusUserMaster c
  documentWithId <- insertDoc masterUserId masterCorpusId (toNode masterUserId Nothing h)
  _ <- Doc.add masterCorpusId [_index documentWithId]
  -- TODO
  -- create a corpus with database name (CSV or PubMed)
  -- add documents to the corpus (create node_node link)
  -- this will enable global database monitoring
  case commitNgramsForDocument uncommittedNgrams documentWithId of
    Left failed ->
      -- this allows rolling back the whole tx, and we won't leave dangling nodes behind.
      nodeError $ NodeCreationFailed (InsertDocFailed failed)
    Right ngramsDocsMap -> do
      lId <- getOrMkList masterCorpusId masterUserId
      _ <- saveDocNgramsWith lId ngramsDocsMap
      pure $ contextId2NodeId $ _index documentWithId
```
The call to `nodeError` in the middle of that `DBUpdate` transaction should have the effect of rolling back the entire transaction, meaning that we shouldn't create a document node if we cannot extract ngrams out of it, as that would leave dangling state behind (that's an arguable design decision which I think I have changed my mind about, but that's orthogonal to this issue).
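For context, the behaviour I was relying on is the one postgresql-simple's `withTransaction` gives you: if the inner action throws, the transaction is rolled back before the exception is rethrown. A minimal sketch of that mechanism (the `nodes` table and the use of `fail` are stand-ins for illustration, not Gargantext code):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Exception (SomeException, try)
import Database.PostgreSQL.Simple

demoRollback :: Connection -> IO ()
demoRollback conn = do
  r <- try $ withTransaction conn $ do
         _ <- execute_ conn "INSERT INTO nodes (name) VALUES ('dangling?')"
         fail "ngrams extraction failed"  -- plays the role of 'nodeError'
  case (r :: Either SomeException ()) of
    Left _  -> pure ()  -- the INSERT was rolled back together with the failure
    Right _ -> pure ()  -- not reached: the inner action always throws
```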
Then, I had a test like this:
```haskell
corpusAddDocBatchSurvival :: TestEnv -> Assertion
corpusAddDocBatchSurvival env = runTestMonad env $ do
  cfg      <- view hasConfig
  parentId <- runDBQuery $ getRootId (UserName userMaster)
  [corpus] <- runDBQuery $ getCorporaWithParentId parentId
  let la = Mono EN
  nlp <- view (nlpServerGet (_tt_lang la))
  oldCount <- docsInCorpusCount
  let docs = [exampleDocument_06, exampleDocument_07, exampleDocument_08]
  uNgrams1 <- extractNgramsFromDocument nlp la exampleDocument_06
  uNgrams2 <- extractNgramsFromDocument nlp la exampleDocument_08
  (failures, savedDocs) <- insertMasterDocs cfg (uNgrams1 <> uNgrams2) (Just $ _node_hyperdata corpus) docs
  c' <- docsInCorpusCount
  liftIO $ do
    failures `shouldBe` [DocumentInsertionError "InternalNodeError Cannot make node due to: Couldn't find the associated ngrams for document with hash \\x666b8e7bfd7c0af37d630e1097791f7ba438a669ecb6d1cb38014edd0b7a2977, therefore the added document won't have any ngrams."]
    c' `shouldBe` oldCount + 2
    savedDocs `shouldBe` [ UnsafeMkNodeId 5, UnsafeMkNodeId 6 ]
```
To my surprise, this test fails with:
```
test/Test/Database/Operations/DocumentSearch.hs:235:15:
1) Database, With test user, With test corpus, With test documents, Can survive failures in a batch
       expected: [nodeId-5,nodeId-6]
        but got: [nodeId-5,nodeId-7]
```
This test simulates adding 3 documents, with the middle one having no ngrams, so the system should persist only the first and the last. Instead, it actually persisted all three of them, despite the call to `nodeError` that was supposed to roll back the transaction.
While looking at the implementation of `evalOp` in the `.Transactional` module I thought I had discovered the bug: in the `DBFail` case we were using `throwError`, but really we should throw an exception in `IO`, so that the exception handler installed by `withTransaction` would kick in and roll back the transaction.
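In other words, my hypothesis was that the runner was shaped roughly like this sketch (hypothetical shape and names, not the actual `.Transactional` code), where `throwError` merely returns a `Left` and `withTransaction` happily commits:

```haskell
import Control.Monad.Except (ExceptT, runExceptT, throwError)
import Database.PostgreSQL.Simple (Connection, withTransaction)

-- If the runner looks like this, 'throwError' never raises an IO exception:
-- the failure travels inside the returned 'Either', so 'withTransaction'
-- sees a normally-completing action and issues a COMMIT anyway.
runTxSketch :: Connection -> ExceptT String IO a -> IO (Either String a)
runTxSketch conn body = withTransaction conn (runExceptT body)

failing :: ExceptT String IO ()
failing = throwError "DBFail"  -- just a 'Left'; no rollback is triggered
```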
However, after I made those changes the bug was still there -- I even changed the type of `evalOp` to make it monomorphic over `IO`, but without luck.
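The change was along these lines (illustrative names, not the actual diff): wrap the failure in an `Exception` and `throwIO` it, so that the handler installed by `withTransaction` can catch it and issue a ROLLBACK before rethrowing:

```haskell
import Control.Exception (Exception, throwIO)

-- Hypothetical wrapper for the transactional failure.
newtype DBTxFailure = DBTxFailure String
  deriving Show

instance Exception DBTxFailure

-- Raising the failure in IO (instead of 'throwError') means it propagates
-- as a real exception through 'withTransaction', triggering its rollback.
raiseDBFail :: String -> IO a
raiseDBFail = throwIO . DBTxFailure
```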
There must be something glaringly obvious (or very subtle) that I'm missing, as we have tests asserting that a `DBTx` can roll back, and they pass.