Switch to NNG (from Nanomsg), fix (allegedly) Flaky tests
Allegedly fixes #440 (closed).
This MR is a tale of two fixes. The first fix, switching from nanomsg to nng, was motivated by wanting to be 100% sure that test runs are isolated from each other (see the commentary on #440 (closed)). To do so, I wanted the Dispatcher and CE to talk over the IPC protocol during testing, but it turned out that our version of nanomsg didn't work with that protocol. This is why I switched to nng (the moral successor of nanomsg).
Here is where the second story begins: in order to support nng in GGTX, I had to update my own fork of nanomsg with the code necessary to call the nng functions via FFI. While I was at it, I started incorporating various outstanding fixes that had been submitted to the main repo over the years but were never merged.
In particular, once I cherry-picked this fix, my tests became predictable again. My hunch is that the previous version of send wasn't always waiting for the underlying socket to bind completely, giving the "illusion" of the payload being sent when in reality it wasn't. The tests that rely on reading from the destination TChan would then not find a value available in time, triggering the bug.
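
To make the hunch concrete, here is a minimal sketch of the suspected failure mode. It does not use nng or the real bindings: FakeSocket, bindLater, sendNoWait, sendWaiting and receive are all hypothetical names, and the "socket" is just a TVar flag plus a TChan. The assumption being illustrated is that a send issued before the bind has completed reports success but delivers nothing.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.STM
import System.Timeout (timeout)

-- Hypothetical stand-in for a socket (NOT the nng API): anything sent
-- before the endpoint is marked as bound is silently dropped.
data FakeSocket = FakeSocket
  { bound :: TVar Bool
  , inbox :: TChan String
  }

newFakeSocket :: IO FakeSocket
newFakeSocket = FakeSocket <$> newTVarIO False <*> newTChanIO

-- Simulate a bind that only completes after 50 ms.
bindLater :: FakeSocket -> IO ()
bindLater s = do
  _ <- forkIO $ threadDelay 50000 >> atomically (writeTVar (bound s) True)
  pure ()

-- Fire-and-forget send: returns normally even when nothing was delivered.
sendNoWait :: FakeSocket -> String -> IO ()
sendNoWait s msg = atomically $ do
  ok <- readTVar (bound s)
  if ok then writeTChan (inbox s) msg else pure ()

-- Send that blocks (via STM retry) until the bind has completed.
sendWaiting :: FakeSocket -> String -> IO ()
sendWaiting s msg = atomically $ do
  ok <- readTVar (bound s)
  if ok then writeTChan (inbox s) msg else retry

-- What a test would do: wait up to 200 ms for a value on the TChan.
receive :: FakeSocket -> IO (Maybe String)
receive s = timeout 200000 (atomically (readTChan (inbox s)))

main :: IO ()
main = do
  s1 <- newFakeSocket
  bindLater s1
  sendNoWait s1 "job"    -- sent while the bind is still in progress
  receive s1 >>= print

  s2 <- newFakeSocket
  bindLater s2
  sendWaiting s2 "job"   -- waits for the bind before delivering
  receive s2 >>= print
```

With these delays, the first receive should time out because the message was dropped before the bind completed, while the second should see the value. That matches the symptom described above: the sender believes the payload went out, and the test reading from the TChan finds nothing in time.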