- 26 May, 2025 2 commits
-
-
Grégoire Locqueville authored
-
Alfredo Di Napoli authored
-
- 21 May, 2025 1 commit
-
-
Grégoire Locqueville authored
Spinglass functions would output incoherent clusters when the input graph was not connected.
-
- 12 May, 2025 3 commits
-
-
Alfredo Di Napoli authored
The refactored DB API now has a separate building block to create an Opaleye query that counts the number of returned results; we do that via `countRows`, exactly like the previous version. However, I have discovered a small footgun in the Opaleye API: if you have two `Select` statements both calling `countRows` in a chain, the result will always be 1. The inner `countRows` gives you the actual number of results by returning a single row with an integer inside (i.e. the count), but the subsequent (outer) call to `countRows` returns the number of rows of the previous step, which is always going to be one! The bug was that I had left a spurious `countRows` in the query that returns the number of documents needed for the TFICF field, triggering the bug (because then we had `it` ALWAYS equal to 1.0). In the new API, while we cannot prevent the bug at the type level, we can easily do an audit by grepping for `countRows` and making sure we have exactly one instance, i.e. inside `mkOpaCountQuery`.
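The footgun described above can be reproduced without a database. The following is only a toy model, not the Opaleye API: a "query" is represented by the plain list of rows it would return, and `countRowsModel`, `documents`, `correctCount` and `buggyCount` are hypothetical names introduced for illustration.

```haskell
-- Toy model of the double-count footgun. This is NOT the Opaleye API:
-- a "query" is modelled as the plain list of rows it would return.
type Rows a = [a]

-- Like an aggregate COUNT(*): always yields exactly one row holding the count.
countRowsModel :: Rows a -> Rows Int
countRowsModel rows = [length rows]

-- Hypothetical result set of 42 documents.
documents :: Rows String
documents = replicate 42 "doc"

-- Counting once gives the real number of rows.
correctCount :: Rows Int
correctCount = countRowsModel documents

-- Counting twice counts the single row produced by the inner count.
buggyCount :: Rows Int
buggyCount = countRowsModel (countRowsModel documents)
```

In this model `correctCount` is `[42]` while `buggyCount` is `[1]` regardless of how many documents there are, which is why keeping a single counting call site makes the audit easy.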
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
This gigantic commit ports the existing DB operations in GGTX to use the transactional API, meaning that we can now compose DB operations and they will all run in the same Postgres transaction using the same connection, which will eliminate the class of bugs where concurrent DB access might result in an inconsistent state. On top of that, we simplify some parts of the API, as summarised below:
1. The `NodeStoryEnv` management has been greatly simplified: in the new API we don't need an external connection pool to be passed and we don't have to pass IO actions. We can just pass DB operations, so the API now takes mostly pure values.
2. Because our `DBTx` monad can't do arbitrary IO (which is a good thing), we cannot fire Central Exchange notifications immediately. What happens now instead is that we collect the `CEMessage` values to be sent and fire them in the relevant concrete monad after we have finished with the DB transaction. In principle this means a small delay between the DB operation taking place and the notification firing, but in practice the latency should be negligible. Bear in mind that this is typically what we want: if a long DB Tx triggers an error in the middle, we don't want to have sent out CE messages prematurely when the overall operation didn't succeed!
3. There are still a few places in the codebase where we couldn't make things fully compositional with regard to the DBTx API, because we had Servant handlers which mixed DB operations with other effectful IO computations (or other things like the notification from the `MonadJobStatus`). For now we are splitting these functions by manually running the partial DB operations; this is not ideal, but it can be fixed in subsequent merge requests.
4. The `WorkerEnv` doesn't use `IOException` as its `MonadError` anymore; for consistency we can just use `BackendInternalError`, adding an `InternalWorkerError` data constructor that accepts the `IOException` triggered by the Worker monad.
More testing is needed, with particular attention to performance (regressions), but this should hopefully offer a decent baseline.
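The "collect notifications, fire them after the transaction" idea from point 2 can be sketched with a small writer-like monad. This is a simplified model under stated assumptions, not the actual `DBTx` implementation: `DBTxModel`, `queueNotification`, `abortTx` and `runAndNotify` are hypothetical names, and `CEMessage` here is a stand-in wrapping a plain string.

```haskell
-- Hypothetical stand-in for the real notification type.
newtype CEMessage = CEMessage String
  deriving (Eq, Show)

-- A toy transaction: it either fails with an error, or yields a result
-- together with the notifications queued so far. No IO can happen in here.
newtype DBTxModel a = DBTxModel { runDBTxModel :: Either String (a, [CEMessage]) }

instance Functor DBTxModel where
  fmap f (DBTxModel r) = DBTxModel (fmap (\(a, ms) -> (f a, ms)) r)

instance Applicative DBTxModel where
  pure a = DBTxModel (Right (a, []))
  DBTxModel rf <*> DBTxModel ra = DBTxModel $ do
    (f, ms1) <- rf
    (a, ms2) <- ra
    pure (f a, ms1 ++ ms2)

instance Monad DBTxModel where
  DBTxModel r >>= k = DBTxModel $ do
    (a, ms1) <- r
    (b, ms2) <- runDBTxModel (k a)
    pure (b, ms1 ++ ms2)

-- Queue a message instead of sending it right away.
queueNotification :: CEMessage -> DBTxModel ()
queueNotification m = DBTxModel (Right ((), [m]))

-- Abort the whole transaction; anything queued so far is discarded.
abortTx :: String -> DBTxModel a
abortTx err = DBTxModel (Left err)

-- Run the transaction, then send the queued messages only if it succeeded.
runAndNotify :: (CEMessage -> IO ()) -> DBTxModel a -> IO (Either String a)
runAndNotify send tx = case runDBTxModel tx of
  Left err        -> pure (Left err)
  Right (a, msgs) -> mapM_ send msgs >> pure (Right a)

-- Two example transactions (hypothetical payloads).
okTx :: DBTxModel Int
okTx = do
  queueNotification (CEMessage "node 1 updated")
  queueNotification (CEMessage "node 2 updated")
  pure 2

failingTx :: DBTxModel Int
failingTx = do
  queueNotification (CEMessage "node 1 updated")
  abortTx "constraint violation"
```

Running `runAndNotify print okTx` would print both messages after the transaction completes, whereas `runAndNotify print failingTx` sends nothing, because the message queued before the failure is dropped along with the transaction.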
-
- 28 Apr, 2025 3 commits
-
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
-
- 24 Apr, 2025 2 commits
-
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
-
- 07 Apr, 2025 2 commits
-
-
Alfredo Di Napoli authored
This commit introduces/improves the `parseTvsWithDiagnostics` function to parse the input TSV incrementally, collecting errors as we go, and eventually reporting them upstream via the newly added `emitTsvParseWarning` function.
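As a rough illustration of parsing row by row while collecting errors instead of stopping at the first one, here is a minimal sketch using only `base`. It is not the project's parser: `TsvWarning`, `splitOnTab`, `parseRow` and `parseTsvModel` are hypothetical names, and the only check performed is the number of fields per row.

```haskell
import Data.Either (partitionEithers)

-- Hypothetical diagnostic: the 1-based line number and a reason.
data TsvWarning = TsvWarning { warnLine :: Int, warnReason :: String }
  deriving (Eq, Show)

-- Split one line on tab characters.
splitOnTab :: String -> [String]
splitOnTab s = case break (== '\t') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOnTab rest

-- Parse a single row, producing a warning instead of aborting the whole parse.
parseRow :: Int -> (Int, String) -> Either TsvWarning [String]
parseRow expected (n, raw)
  | length fields == expected = Right fields
  | otherwise =
      Left (TsvWarning n ("expected " ++ show expected ++ " fields, got " ++ show (length fields)))
  where
    fields = splitOnTab raw

-- Parse every row, returning the collected warnings alongside the good rows.
parseTsvModel :: Int -> String -> ([TsvWarning], [[String]])
parseTsvModel expected input =
  partitionEithers (map (parseRow expected) (zip [1 ..] (lines input)))
```

With two expected fields, the input `"a\tb\nbroken\nc\td\n"` yields one warning for line 2 and the two well-formed rows; the caller can then report the warnings upstream.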
-
Alfredo Di Napoli authored
-
- 27 Mar, 2025 1 commit
-
-
Alexandre Delanoë authored
-
- 26 Mar, 2025 1 commit
-
-
Alexandre Delanoë authored
-
- 25 Mar, 2025 3 commits
-
-
Alfredo Di Napoli authored
Previously we repeated the same code to initialise all the different loggers. This commit introduces two stock loggers called `ioStdLogger` and `monadicStdLogger` which can be reused many times. It also allows `GGTX_LOG_LEVEL` to take effect during `readConfig`, so that `startupInfo` shows the correct information.
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
-
- 18 Mar, 2025 1 commit
-
-
Alexandre Delanoë authored
-
- 13 Mar, 2025 1 commit
-
-
Alexandre Delanoë authored
-
- 12 Mar, 2025 1 commit
-
-
Alexandre Delanoë authored
-
- 10 Mar, 2025 5 commits
-
-
Alexandre Delanoë authored
-
Alfredo Di Napoli authored
The problem was caused by the improper usage of `delegate_ctlc` when creating the coreNLP process. For a long time I was under the impression this flag was essential to allow child processes to shut down cleanly without leaving zombie threads, but the result here, in the context of the testsuite, was that the coreNLP server received the first Ctrl-C, completely starving the Haskell RTS, which wouldn't receive any; as a result our testsuite would run forever.
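For reference, the `process` library exposes this setting as the `delegate_ctlc` field of `CreateProcess`. The sketch below shows one way to spell out the non-delegating configuration explicitly; the executable name is hypothetical, since the real coreNLP command line is not shown here.

```haskell
import System.Process (CreateProcess (..), proc)

-- Hypothetical executable name; the real coreNLP invocation is not shown in this log.
-- With delegate_ctlc = False the parent process keeps handling Ctrl-C itself
-- instead of handing it over to the child.
coreNlpProcess :: CreateProcess
coreNlpProcess =
  (proc "corenlp-server" [])
    { delegate_ctlc = False }
```

`proc` already defaults this field to `False`, so setting it here only documents the intent.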
-
Alfredo Di Napoli authored
By default, we shouldn't emit debug logs for phylo in production, let alone ones that run within pure code. Debug logs hinder performance, and they don't belong on the production server anyway.
-
Alfredo Di Napoli authored
This commit propagates the correct `LogConfig` to all the places where we were just defaulting to logging everything, which allows us to silence debug logs in tests unless we want them.
-
Alfredo Di Napoli authored
Furthermore, the env var we used to override (in some parts) the logging level has been renamed from `LOG_LEVEL` to `GGTX_LOG_LEVEL`, to avoid the env var `LOG_LEVEL` clashing with some other service. This will eventually allow us to properly override the logging level in the tests, silencing uninteresting output.
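A minimal sketch of how an override through `GGTX_LOG_LEVEL` could be read, assuming a simple four-level scheme. The `LogLevel` type and the `parseLogLevel`, `resolveLogLevel` and `shouldLog` functions are hypothetical and do not reflect the project's actual logging API.

```haskell
import Data.Char (toUpper)
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)

-- Hypothetical set of levels, ordered from most to least verbose.
data LogLevel = DEBUG | INFO | WARNING | ERROR
  deriving (Eq, Ord, Show, Read, Enum, Bounded)

-- Case-insensitive parsing of a level name; Nothing on unknown input.
parseLogLevel :: String -> Maybe LogLevel
parseLogLevel raw = case reads (map toUpper raw) of
  [(lvl, "")] -> Just lvl
  _           -> Nothing

-- Use GGTX_LOG_LEVEL when it is set and valid, otherwise fall back to a default.
resolveLogLevel :: LogLevel -> IO LogLevel
resolveLogLevel def = do
  mRaw <- lookupEnv "GGTX_LOG_LEVEL"
  pure (fromMaybe def (mRaw >>= parseLogLevel))

-- A message is emitted only if its level is at least the configured minimum.
shouldLog :: LogLevel -> LogLevel -> Bool
shouldLog minLevel msgLevel = msgLevel >= minLevel
```

A test run started with `GGTX_LOG_LEVEL=ERROR` would then drop debug messages, since `shouldLog ERROR DEBUG` is `False`.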
-
- 03 Mar, 2025 2 commits
-
-
Alfredo Di Napoli authored
This commit uses the latest version of `gargantext-graph`, now rebranded `gargantext-graph-core`, which allowed us to drop unused dependencies like `accelerate-arithmetic` & co.
-
Alfredo Di Napoli authored
-
- 27 Feb, 2025 11 commits
-
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
We are now around 6 to 7 times slower than the LLVM code.
-
Alfredo Di Napoli authored
This also changes the cabal.project to not pull the accelerate-llvm or the llvm-hs dependencies.
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
This commit starts introducing `massiv` in the codebase, initially for simple functions like `termDivNan`. The main goal is to extend the linear algebra toolkit up to the point where we can implement `distributional` in terms of `massiv` and measure its performance.
-
Alfredo Di Napoli authored
The previous code would sometimes yield a matrix of NaN values, because it attempted to divide the input matrix by its diagonal, which is all zeros when the input matrix is all zeros, resulting in a division by zero.
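The failure mode can be shown on plain lists. This is only an illustration under assumptions: the commit does not say how the matrix is divided by its diagonal or what value the fix returns, so dividing each row by its own diagonal entry and falling back to 0 are both guesses, and `safeDiv` and `divByDiagonal` are hypothetical names.

```haskell
-- Division that yields 0 instead of NaN when the denominator is 0.
-- The fallback value is an assumption; the commit does not state what the fix returns.
safeDiv :: Double -> Double -> Double
safeDiv _ 0 = 0
safeDiv x y = x / y

-- Divide every row of a square matrix by that row's diagonal entry
-- (an assumed reading of "division of the input matrix by its diagonal").
divByDiagonal :: [[Double]] -> [[Double]]
divByDiagonal m = [map (`safeDiv` d) row | (row, d) <- zip m diagonal]
  where
    diagonal = [row !! i | (i, row) <- zip [0 ..] m]
```

For an all-zero input the unguarded `0 / 0` evaluates to NaN for `Double`, while `divByDiagonal` returns an all-zero matrix.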
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
The main idea is to refactor and improve the existing linear algebra functions one function at a time, using reference implementations and benchmarks along the way.
-
- 21 Feb, 2025 1 commit
-
-
Przemyslaw Kaminski authored
-