Output of `toPhylo` & co non-deterministic?
This ticket requires more investigation, but while I was working on purescript-gargantext#632 (closed), as part of my regression tests I wanted to add a test checking that the output of `PhyloExport` would stay the same between the rounds of refactoring I was doing.
To do that, I added the following test, which checks the output against a golden file:
```haskell
testPhyloExportExpectedOutput :: Assertion
testPhyloExportExpectedOutput = do
  -- Acquire the config from the golden file.
  expected_e <- JSON.eitherDecodeFileStrict' =<< getDataFileName "test-data/phylo/112828.json"
  case expected_e of
    Left err -> fail err
    Right (pd :: PhyloData) -> do
      let goldenCfg = pd_config pd
      corpusPath' <- getDataFileName "test-data/phylo/GarganText_DocsList-nodeId-112828.csv"
      listPath'   <- getDataFileName "test-data/phylo/GarganText_NgramsList-112829.csv"
      -- Override the paths stored in the golden config with local ones.
      let config = goldenCfg { corpusPath = corpusPath'
                             , listPath   = listPath'
                             , listParser = V3
                             }
      mapList <- fileToList (listParser config) (listPath config)
      corpus  <- fileToDocsDefault (corpusParser config)
                                   (corpusPath config)
                                   [Year 3 1 5, Month 3 1 5, Week 4 2 5]
                                   mapList
      -- Rebuild the phylo from scratch and export it to JSON.
      actual_e <- JSON.parseEither JSON.parseJSON <$> phylo2dot2json (toPhylo $ toPhyloWithoutLink corpus config)
      case actual_e of
        Left err -> fail err
        Right (actual :: GraphData) -> do
          -- Pretty-print both sides with sorted keys so that object key
          -- order cannot cause spurious diffs.
          let prettyConfig = JSON.defConfig { JSON.confCompare = compare }
          let expectedJSON = TE.decodeUtf8 (BL.toStrict $ JSON.encodePretty' prettyConfig $ pd_data pd)
          let actualJSON   = TE.decodeUtf8 (BL.toStrict $ JSON.encodePretty' prettyConfig actual)
          assertBool (show $ ansiWlEditExpr $ ediff' expectedJSON actualJSON) (expectedJSON == actualJSON)
```
To my surprise, this test randomly fails sometimes. At first I thought it was due to the fact that `JSON.encode` doesn't produce objects with sorted keys (aeson backs JSON objects with a `HashMap` from `unordered-containers`, so key order is not stable), but I have mitigated that by using the `encodePretty'` function from `aeson-pretty`, which can produce JSON objects with sorted keys.
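For reference, here is a minimal, self-contained sketch of that mitigation, using only aeson and aeson-pretty (the object and its values here are made up for illustration):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (object, (.=))
import Data.Aeson.Encode.Pretty (Config (..), defConfig, encodePretty')
import qualified Data.ByteString.Lazy.Char8 as BL

-- Encoding with 'confCompare = compare' always emits keys in
-- lexicographic order, regardless of the iteration order of aeson's
-- internal map.
main :: IO ()
main = do
  let val = object [ "zeta" .= (1 :: Int), "alpha" .= (2 :: Int) ]
      cfg = defConfig { confCompare = compare }
  BL.putStrLn (encodePretty' cfg val)
  -- prints:
  -- {
  --     "alpha": 2,
  --     "zeta": 1
  -- }
```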
Despite that, I still get some random failures. Observe this excerpt, which uses the `tree-diff` library to show a diff-like counterexample:
-" \"name\": \"Period20062008\",\n",
+" \"name\": \"Branches peaks\",\n",
" \"nodes\": [\n",
-" 19\n",
+" 18\n",
" ],\n",
" \"nodesep\": \"1\",\n",
" \"overlap\": \"scale\",\n",
" \"phyloBranches\": \"1\",\n",
" \"phyloDocs\": \"72.0\",\n",
" \"phyloFoundations\": \"221\",\n",
" \"phyloGroups\": \"9\",\n",
" \"phyloPeriods\": \"17\",\n",
" \"phyloSources\": \"[]\",\n",
" \"phyloTerms\": \"95\",\n",
" \"phyloTimeScale\": \"year\",\n",
" \"rank\": \"same\",\n",
" \"ranksep\": \"1\",\n",
" \"ratio\": \"fill\",\n",
" \"splines\": \"spline\",\n",
" \"style\": \"filled\"\n",
" },\n",
" {\n",
" \"_gvid\": 1,\n",
" \"bb\": \"0,0,1224.6,2787\",\n",
" \"color\": \"white\",\n",
" \"fontsize\": \"30\",\n",
" \"label\": \"Phylo Name\",\n",
" \"labelloc\": \"t\",\n",
" \"lheight\": \"0.47\",\n",
" \"lp\": \"612.32,2766.2\",\n",
" \"lwidth\": \"2.07\",\n",
-" \"name\": \"Period20072009\",\n",
+" \"name\": \"Period20062008\",\n",
" \"nodes\": [\n",
-" 20\n",
+" 19\n",
" ],\n",
It's a bit hard to read, so let's try with a screenshot (attached to this issue):
I'm not sure exactly how the algorithm is meant to work, but I would have expected it to generate a predictable list of nodes and edges, especially since it's meant to be mostly (completely?) pure.
Now, there are a few aspects to consider here:
- As mentioned, this requires further investigation, but it might explain the intermittent failures we have seen on the tests I added a while ago;
- Important: the two runs seem to contain the same output; it is just somehow wrongly "correlated" (for example, the nodes and labels for a given node are swapped between runs), so it looks like some effectful computation is generating a list in a potentially unpredictable order.
- It would be nice to reduce the test surface further so that we can replicate this with smaller phylos and with code that doesn't go through the export path (right now there are too many variables at play at once); a possible shape for such a test is sketched below.
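To make that last point concrete, here is a hedged sketch of what such a reduced test could look like: it reuses the golden-file pattern from the test above, but asserts on the `Phylo` itself, before any dot/graphviz round-trip, so a failure could no longer be blamed on the export code. `loadSmallFixture` and `small-phylo.golden.json` are made-up names, and I'm assuming `Phylo` has `ToJSON`/`FromJSON` instances:

```haskell
-- Hypothetical sketch (not in the codebase): compare the 'Phylo' produced
-- by the pure pipeline against a golden file directly, keeping
-- 'phylo2dot2json' and graphviz out of the picture. 'loadSmallFixture' is
-- a made-up helper that would load a small corpus and config the same way
-- the test above does.
testToPhyloExpectedOutput :: Assertion
testToPhyloExpectedOutput = do
  (corpus, config) <- loadSmallFixture
  goldenPath <- getDataFileName "test-data/phylo/small-phylo.golden.json"
  golden_e <- JSON.eitherDecodeFileStrict' goldenPath
  case golden_e of
    Left err -> fail err
    Right (golden :: Phylo) -> do
      let actual = toPhylo (toPhyloWithoutLink corpus config)
      -- Sorted-key pretty-printing, as above, so object key order cannot
      -- cause spurious diffs.
      let prettyConfig = JSON.defConfig { JSON.confCompare = compare }
          render x = TE.decodeUtf8 (BL.toStrict (JSON.encodePretty' prettyConfig x))
      assertBool (show $ ansiWlEditExpr $ ediff' (render golden) (render actual))
                 (render golden == render actual)
```

If this smaller test also flickers, the non-determinism lives in `toPhylo`/`toPhyloWithoutLink`; if it never does, the export path becomes the prime suspect.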
@anoe I think it would be nice to do some extra digging on this in due course, as having a strong test suite for Phylo is very important, considering it's one of the primary sources of issues (both frontend- and backend-related).