Output of `toPhylo` & co non-deterministic?
This ticket requires more investigation, but while I was working on purescript-gargantext#632 (closed), as part of my regression tests I wanted to add a test checking that the output of `PhyloExport` would stay the same between the rounds of refactoring I was doing.
To do that, I added the following test, which checks the output against a golden file:
```haskell
testPhyloExportExpectedOutput :: Assertion
testPhyloExportExpectedOutput = do
  -- Acquire the config from the golden file.
  expected_e <- JSON.eitherDecodeFileStrict' =<< getDataFileName "test-data/phylo/112828.json"
  case expected_e of
    Left err -> fail err
    Right (pd :: PhyloData) -> do
      let goldenCfg = pd_config pd
      corpusPath' <- getDataFileName "test-data/phylo/GarganText_DocsList-nodeId-112828.csv"
      listPath'   <- getDataFileName "test-data/phylo/GarganText_NgramsList-112829.csv"
      -- Override the paths stored in the golden config with local ones.
      let config = goldenCfg { corpusPath = corpusPath'
                             , listPath   = listPath'
                             , listParser = V3
                             }
      mapList <- fileToList (listParser config) (listPath config)
      corpus  <- fileToDocsDefault (corpusParser config)
                                   (corpusPath config)
                                   [Year 3 1 5, Month 3 1 5, Week 4 2 5]
                                   mapList
      -- Rebuild the phylo from scratch and export it to JSON.
      actual_e <- JSON.parseEither JSON.parseJSON <$> phylo2dot2json (toPhylo $ toPhyloWithoutLink corpus config)
      case actual_e of
        Left err -> fail err
        Right (actual :: GraphData) -> do
          -- Pretty-print both sides with sorted keys so that object key
          -- order cannot cause spurious diffs.
          let prettyConfig = JSON.defConfig { JSON.confCompare = compare }
          let expectedJSON = TE.decodeUtf8 (BL.toStrict $ JSON.encodePretty' prettyConfig $ pd_data pd)
          let actualJSON   = TE.decodeUtf8 (BL.toStrict $ JSON.encodePretty' prettyConfig actual)
          assertBool (show $ ansiWlEditExpr $ ediff' expectedJSON actualJSON) (expectedJSON == actualJSON)
```
To my surprise, this test randomly fails sometimes. At first I thought it was due to the fact that `JSON.encode` doesn't produce objects with sorted keys (aeson backs JSON objects with a `HashMap` from `unordered-containers`, so key order is not stable), but I have mitigated that by using the `encodePretty'` function from `aeson-pretty`, which can produce JSON objects with sorted keys.
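For reference, here is a minimal, self-contained sketch of that mitigation, using only aeson and aeson-pretty (the object and its values here are made up for illustration):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (object, (.=))
import Data.Aeson.Encode.Pretty (Config (..), defConfig, encodePretty')
import qualified Data.ByteString.Lazy.Char8 as BL

-- Encoding with 'confCompare = compare' always emits keys in
-- lexicographic order, regardless of the iteration order of aeson's
-- internal map.
main :: IO ()
main = do
  let val = object [ "zeta" .= (1 :: Int), "alpha" .= (2 :: Int) ]
      cfg = defConfig { confCompare = compare }
  BL.putStrLn (encodePretty' cfg val)
  -- prints:
  -- {
  --     "alpha": 2,
  --     "zeta": 1
  -- }
```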
Despite that, I still get some random failures. Observe this excerpt, which uses the `tree-diff` library to show a diff-like counterexample:
-" \"name\": \"Period20062008\",\n",
+" \"name\": \"Branches peaks\",\n",
" \"nodes\": [\n",
-" 19\n",
+" 18\n",
" ],\n",
" \"nodesep\": \"1\",\n",
" \"overlap\": \"scale\",\n",
" \"phyloBranches\": \"1\",\n",
" \"phyloDocs\": \"72.0\",\n",
" \"phyloFoundations\": \"221\",\n",
" \"phyloGroups\": \"9\",\n",
" \"phyloPeriods\": \"17\",\n",
" \"phyloSources\": \"[]\",\n",
" \"phyloTerms\": \"95\",\n",
" \"phyloTimeScale\": \"year\",\n",
" \"rank\": \"same\",\n",
" \"ranksep\": \"1\",\n",
" \"ratio\": \"fill\",\n",
" \"splines\": \"spline\",\n",
" \"style\": \"filled\"\n",
" },\n",
" {\n",
" \"_gvid\": 1,\n",
" \"bb\": \"0,0,1224.6,2787\",\n",
" \"color\": \"white\",\n",
" \"fontsize\": \"30\",\n",
" \"label\": \"Phylo Name\",\n",
" \"labelloc\": \"t\",\n",
" \"lheight\": \"0.47\",\n",
" \"lp\": \"612.32,2766.2\",\n",
" \"lwidth\": \"2.07\",\n",
-" \"name\": \"Period20072009\",\n",
+" \"name\": \"Period20062008\",\n",
" \"nodes\": [\n",
-" 20\n",
+" 19\n",
" ],\n",
It's a bit hard to read, so let's try with a screenshot (attached to this issue):
I'm not sure exactly how the algorithm is meant to work, but I would have expected it to generate a predictable list of nodes and edges, especially since it's meant to be mostly (completely?) pure.
Now, there are a few aspects to consider here:
- As mentioned, this requires further investigation, but it might explain the intermittent failures we have seen on the tests I added a while ago;
- Important: the two runs seem to contain the same output; it is just somehow wrongly "correlated" (for example, the nodes and labels for a given node are swapped between runs), so it looks like some effectful computation is generating a list in a potentially unpredictable order.
- It would be nice to reduce the test surface further so that we can replicate this with smaller phylos and with code that doesn't go through the export path (right now there are too many variables at play at once); a possible shape for such a test is sketched below.
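To make that last point concrete, here is a hedged sketch of what such a reduced test could look like: it reuses the golden-file pattern from the test above, but asserts on the `Phylo` itself, before any dot/graphviz round-trip, so a failure could no longer be blamed on the export code. `loadSmallFixture` and `small-phylo.golden.json` are made-up names, and I'm assuming `Phylo` has `ToJSON`/`FromJSON` instances:

```haskell
-- Hypothetical sketch (not in the codebase): compare the 'Phylo' produced
-- by the pure pipeline against a golden file directly, keeping
-- 'phylo2dot2json' and graphviz out of the picture. 'loadSmallFixture' is
-- a made-up helper that would load a small corpus and config the same way
-- the test above does.
testToPhyloExpectedOutput :: Assertion
testToPhyloExpectedOutput = do
  (corpus, config) <- loadSmallFixture
  goldenPath <- getDataFileName "test-data/phylo/small-phylo.golden.json"
  golden_e <- JSON.eitherDecodeFileStrict' goldenPath
  case golden_e of
    Left err -> fail err
    Right (golden :: Phylo) -> do
      let actual = toPhylo (toPhyloWithoutLink corpus config)
      -- Sorted-key pretty-printing, as above, so object key order cannot
      -- cause spurious diffs.
      let prettyConfig = JSON.defConfig { JSON.confCompare = compare }
          render x = TE.decodeUtf8 (BL.toStrict (JSON.encodePretty' prettyConfig x))
      assertBool (show $ ansiWlEditExpr $ ediff' (render golden) (render actual))
                 (render golden == render actual)
```

If this smaller test also flickers, the non-determinism lives in `toPhylo`/`toPhyloWithoutLink`; if it never does, the export path becomes the prime suspect.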
@anoe I think it would be nice to do some extra digging on this in due course, as having a strong test suite for Phylo is very important, considering it's one of the primary sources of issues (both frontend- and backend-related).