Commits · 706-dev-graph-parameters-display · gargantext / haskell-gargantext

26 May, 2025 2 commits
- Update project dependencies · 93d91e0c
  Grégoire Locqueville authored 2 weeks ago
  
  93d91e0c
- Add docNgrams test · 97b483d7
  Alfredo Di Napoli authored 2 weeks ago
  
  97b483d7
21 May, 2025 1 commit

Grégoire Locqueville authored 4 weeks ago

Spinglass functions would output incoherent clusters when the input graph
was not connected.

58e1e578

12 May, 2025 3 commits

Fix bug in selectCountDocs · 6ff05ee1

Alfredo Di Napoli authored 1 month ago

The refactored DB API now has a separate building block to create an Opaleye query that
counts the number of returned results; we do that via `countRows`, exactly like the previous version.

However, I have discovered a small footgun in the Opaleye API -- if you have
two `Select` statements both calling countRows in a chain, that will always yield a value of 1,
because the inner `countRows` will give you the actual number of results by returning
a single row with an integer inside (i.e. the count).

However, the subsequent (outer) call to `countRows` will return the number of rows
of the previous step, which is always going to be one!

The cause was a spurious `countRows` that I had left in the query which
returns the number of documents needed for the TFICF field, triggering the bug
(because then we had `it` ALWAYS equal to 1.0).

In the new API, while we cannot prevent the bug at the type level we can
easily do an audit by grepping for `countRows`, making sure we have exactly one instance,
i.e. inside `mkOpaCountQuery`.
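
The footgun can be illustrated without a database. The sketch below is only a pure model: a result set is represented as a plain list, and `countRowsModel` is a hypothetical stand-in for Opaleye's `countRows` (it is not the Opaleye API). It shows why counting a count always gives 1.

```haskell
-- A result set modelled as a list of rows.
-- Assumption: this is a stand-in for an Opaleye `Select`, not the real type.
type Rows a = [a]

-- Model of `countRows`: it always yields exactly one row, holding the count.
countRowsModel :: Rows a -> Rows Int
countRowsModel rows = [length rows]

-- Three hypothetical documents returned by some query.
documents :: Rows String
documents = ["doc-a", "doc-b", "doc-c"]

-- Counting once gives a single row containing the real count.
correctCount :: Rows Int
correctCount = countRowsModel documents

-- Counting twice counts the single row produced by the inner call.
doubleCount :: Rows Int
doubleCount = countRowsModel (countRowsModel documents)
```

In this model `correctCount` holds the number of documents, while `doubleCount` holds 1 regardless of how many documents there are.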

6ff05ee1

Make CLI compile again · b0568522
Alfredo Di Napoli authored 1 month ago

b0568522

Port DB operations to transactional API · 53512f89

Alfredo Di Napoli authored 1 month ago

This gigantic commit ports the existing DB operations in GGTX to use the
transactional API, meaning that we can now compose DB operations and
they will all run in the same Postgres transaction using the same
connection, which will eliminate that class of bugs where concurrent DB
access might result in an inconsistent state.

On top of that, we simplify some parts of the API, for which a summary
is given below:

1. The `NodeStoryEnv` management has been greatly simplified; in the new
   API we don't need an external connection pool to be passed and we
   don't have to pass IO actions, we can just pass DB operations,
   therefore we can greatly simplify the API to just pass mostly pure
   values;

2. Due to the fact that our `DBTx` monad can't do arbitrary IO (which is
   a good thing) we cannot fire Central Exchange notifications
   immediately. Rather, what happens now is that we collect the
   `CEMessage` to be sent and we fire them in the relevant concrete
   monad after we finished with the DB transaction. This means that in
   principle there would be a small delay between the DB operation
   taking place and the notification firing but in practice the latency
   should be negligible and bear in mind this is typically what we want:
   if we have a long DB Tx that triggers an error in the middle we don't
   want to be sending out CE messages prematurely if the overall
   operation didn't succeed!

3. There are still a few places in the codebase where we couldn't make
   things fully compositional with regards to the DBTx API, because we
   had Servant handlers which had DB operations mixed with other IO
   effectful computations (or other things like the notification from
   the `MonadJobStatus`). For now we are splitting these functions by
   manually running the partial DB operations, and while this is not
   ideal it can be fixed in subsequent merge requests.

4. The `WorkerEnv` doesn't use `IOException` as its `MonadError`
   anymore, as for consistency we can just use `BackendInternalError` by
   adding an `InternalWorkerError` data constructor accepting the
   `IOException` triggered by the Worker monad.
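
To make point 2 concrete, here is a minimal sketch of a transaction type that cannot perform IO and only records the notifications to send, with an interpreter that hands them back only when the whole transaction succeeds. All names (`Tx`, `TxF`, `insertRow`, `notify`, `abort`, `runTx`) are hypothetical and chosen for illustration; this is not the GGTX `DBTx` API, and the interpreter is a pure model rather than a Postgres transaction.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- A message we would like to send once the transaction has committed.
newtype Notification = Notification String
  deriving (Eq, Show)

-- The instructions a transaction may contain. There is no constructor for
-- arbitrary IO, so a transaction cannot fire notifications by itself.
data TxF next
  = Insert String next         -- pretend to write a row
  | Notify Notification next   -- queue a notification for later
  | Abort String               -- fail the whole transaction
  deriving Functor

-- A small hand-rolled free monad over TxF.
data Tx a = Pure a | Step (TxF (Tx a))

instance Functor Tx where
  fmap f (Pure a) = Pure (f a)
  fmap f (Step s) = Step (fmap (fmap f) s)

instance Applicative Tx where
  pure = Pure
  Pure f <*> x = fmap f x
  Step s <*> x = Step (fmap (<*> x) s)

instance Monad Tx where
  Pure a >>= k = k a
  Step s >>= k = Step (fmap (>>= k) s)

insertRow :: String -> Tx ()
insertRow r = Step (Insert r (Pure ()))

notify :: String -> Tx ()
notify m = Step (Notify (Notification m) (Pure ()))

abort :: String -> Tx a
abort e = Step (Abort e)

-- Pure interpreter. On failure nothing is returned, so no rows are
-- "committed" and no notification can be sent. On success the caller
-- receives the queued notifications and can fire them in IO afterwards.
runTx :: Tx a -> Either String (a, [String], [Notification])
runTx = go [] []
  where
    go rows msgs (Pure a)               = Right (a, reverse rows, reverse msgs)
    go rows msgs (Step (Insert r next)) = go (r : rows) msgs next
    go rows msgs (Step (Notify n next)) = go rows (n : msgs) next
    go _    _    (Step (Abort e))       = Left e

-- A transaction that succeeds: its notification is handed back to the caller.
okTx :: Tx Int
okTx = do
  insertRow "doc-1"
  notify "node updated"
  insertRow "doc-2"
  pure 2

-- A transaction that fails midway: the queued notification is discarded.
failingTx :: Tx Int
failingTx = do
  insertRow "doc-1"
  notify "node updated"
  abort "constraint violation"
```

Because `failingTx` aborts after queueing its notification, `runTx failingTx` returns only the error, which mirrors the "don't send CE messages prematurely" behaviour described above.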

More testing is needed, with particular attention to performance
(regression) but this should hopefully offer a decent baseline.

53512f89

28 Apr, 2025 3 commits
- Remove unused Gargantext.Database.Transactional.Prelude module · f50fb3ea
  Alfredo Di Napoli authored 1 month ago
  
  f50fb3ea
- Add tests for rollback and RW consistency · c48f3d46
  Alfredo Di Napoli authored 1 month ago
  
  c48f3d46
- Initial simple test for pure queries · 489968f6
  Alfredo Di Napoli authored 1 month ago
  
  489968f6
24 Apr, 2025 2 commits
- Implement DBTx in terms of a 'Free' monad. · 41ad6f5d
  Alfredo Di Napoli authored 1 month ago
  
  41ad6f5d
- Stub out transactional DB API · 1eca6e88
  Alfredo Di Napoli authored 1 month ago
  
  1eca6e88
07 Apr, 2025 2 commits

Implement a proper incremental parser for TSV documents · f18e2dbc

Alfredo Di Napoli authored 2 months ago

This commit introduces/improves the `parseTvsWithDiagnostics`
function to parse the input TSV incrementally, collecting errors
as we go, and eventually reporting them upstream via the newly
added `emitTsvParseWarning` function.
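
As a rough illustration of "collecting errors as we go", the sketch below parses a TSV string line by line, keeps every well-formed row, and records a warning with its line number for each malformed one instead of stopping at the first failure. The names `TsvWarning`, `parseRow` and `parseTsvLenient` are hypothetical, and the only check performed (field count against the header) is an assumption; the real parser in the commit may work quite differently.

```haskell
import Data.Either (partitionEithers)

-- A diagnostic attached to the 1-based line number it came from.
data TsvWarning = TsvWarning { warnLine :: Int, warnMessage :: String }
  deriving (Eq, Show)

-- Split a line on tab characters, keeping empty fields.
splitOnTab :: String -> [String]
splitOnTab s = case break (== '\t') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOnTab rest

-- Accept a row only if it has as many fields as the header.
parseRow :: Int -> (Int, String) -> Either TsvWarning [String]
parseRow expected (n, line)
  | length fields == expected = Right fields
  | otherwise =
      Left (TsvWarning n ("expected " ++ show expected
                          ++ " fields, got " ++ show (length fields)))
  where
    fields = splitOnTab line

-- Parse the whole input, returning all warnings alongside all good rows.
-- The header is line 1, so body rows are numbered from 2.
parseTsvLenient :: String -> ([TsvWarning], [[String]])
parseTsvLenient input = case lines input of
  []              -> ([], [])
  (header : body) ->
    let expected = length (splitOnTab header)
    in  partitionEithers (map (parseRow expected) (zip [2 ..] body))
```

The caller gets both halves of the result and can decide how to report the warnings upstream, while still ingesting the rows that parsed.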

f18e2dbc

Reproduce TSV parsing issue for #380 · 9a26e565
Alfredo Di Napoli authored 2 months ago

9a26e565

27 Mar, 2025 1 commit
- [VERSION] +1 to 0.0.7.4.7 · 5225daf6
  Alexandre Delanoë authored 2 months ago
  
  5225daf6
26 Mar, 2025 1 commit
- [VERSION] +1 to 0.0.7.4.6 · da942cf9
  Alexandre Delanoë authored 2 months ago
  
  da942cf9
25 Mar, 2025 3 commits

unify loggers · 17a4f03a

Alfredo Di Napoli authored 2 months ago

Before we were repeating the same code to initialise all the different
loggers. This commit introduces two stock loggers called `ioStdLogger`
and `monadicStdLogger` which can be reused many times.

It also allows the `GGTX_LOG_LEVEL` to take effect during `readConfig`,
so that the `startupInfo` shows the correct information.

17a4f03a

Add Test module for List API · fd4b99ec
Alfredo Di Napoli authored 2 months ago

fd4b99ec
Add test data for issue-381 · a5b3b3b3
Alfredo Di Napoli authored 2 months ago

a5b3b3b3

18 Mar, 2025 1 commit
- [VERSION] +1 to 0.0.7.4.5.1 · 0ab82ad5
  Alexandre Delanoë authored 2 months ago
  
  0ab82ad5
13 Mar, 2025 1 commit
- [VERSION] +1 to 0.0.7.4.5 · 7f759ab4
  Alexandre Delanoë authored 3 months ago
  
  7f759ab4
12 Mar, 2025 1 commit
- [VERSION] +1 to 0.0.7.4.4 · 3bf63d30
  Alexandre Delanoë authored 3 months ago
  
  3bf63d30
10 Mar, 2025 5 commits

[VERSION] +1 to 0.0.7.4.3 · 44df14d2
Alexandre Delanoë authored 3 months ago

44df14d2

fix(tests): allow ctrl-c to shut down the tests cleanly · 1ad17efd

Alfredo Di Napoli authored 3 months ago

The problem was caused by the improper usage of
`delegate_ctlc` when creating the coreNLP process. For a long
time I was under the impression this flag was essential to allow child
processes to shutdown cleanly without leaving zombie threads, but the
result here in the context of the testsuite was that the coreNLP server
was receiving the first Ctrl-C, completely starving the Haskell RTS,
which wouldn't receive any and as a result our testsuite would be
running forever.
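
For reference, the flag in question is presumably the `delegate_ctlc` field of `CreateProcess` in the `process` package. When it is `True`, the parent ignores Ctrl-C while waiting on the child and lets the child handle it, which suits interactive console programs but not a background server started by a testsuite. A minimal sketch of building a process description with the flag explicitly off follows; `serverProcess` is a hypothetical helper, not the code from this commit.

```haskell
import System.Process (CreateProcess (..), proc)

-- Describe a background server process that should NOT take over Ctrl-C.
-- `proc` already defaults `delegate_ctlc` to False; setting it explicitly
-- documents the intent and guards against someone turning it on later.
serverProcess :: FilePath -> [String] -> CreateProcess
serverProcess exe args = (proc exe args) { delegate_ctlc = False }
```

The resulting value can be passed to `createProcess` or `withCreateProcess` as usual; the executable and its arguments are whatever the caller supplies.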

1ad17efd

Turn on by default no-phylo-debug-logs · 333bfac9

Alfredo Di Napoli authored 3 months ago

By default, we shouldn't run debug logs for phylo in production, let
alone ones that run within pure code.

Debug logs will hinder performance, and showing them on the production
server is not their place anyway.

333bfac9

refactoring(logger): Silence debug logs in tests · c448afb3

Alfredo Di Napoli authored 3 months ago

This commit propagates the correct `LogConfig` in all the
places where we were just defaulting to log everything, and this allows
us to silence debug logs in tests, unless we want them.

c448afb3

refactoring(logging): add log_file and log_config to Toml config · 93f605d5

Alfredo Di Napoli authored 3 months ago

Furthermore, this renames the env var we use to override (in some parts) the
logging level from `LOG_LEVEL` to `GGTX_LOG_LEVEL`, to avoid the env var
`LOG_LEVEL` clashing with some other service.

This will eventually allow us to properly override the logging level in
the tests, silencing uninteresting output.
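
One plausible shape for such an override is sketched below: the level from the config file is used unless `GGTX_LOG_LEVEL` is set to a recognised value. The `LogLevel` type, its constructor names, and both functions are assumptions made for illustration; the actual GGTX types and the accepted spellings of the levels may differ.

```haskell
import Data.Maybe (fromMaybe)
import System.Environment (lookupEnv)
import Text.Read (readMaybe)

-- Hypothetical log levels; the derived Read instance parses e.g. "DEBUG".
data LogLevel = DEBUG | INFO | WARNING | ERROR
  deriving (Eq, Ord, Show, Read, Enum, Bounded)

-- Pure core: prefer a valid override, otherwise keep the configured level.
-- An unset or unparseable override leaves the configured level untouched.
resolveLogLevel :: LogLevel -> Maybe String -> LogLevel
resolveLogLevel configured override =
  fromMaybe configured (override >>= readMaybe)

-- Thin IO wrapper that reads the environment variable named in the commit.
logLevelFromEnv :: LogLevel -> IO LogLevel
logLevelFromEnv configured =
  resolveLogLevel configured <$> lookupEnv "GGTX_LOG_LEVEL"
```

Keeping the decision in a pure function makes it easy to check in tests without touching the process environment.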

93f605d5

03 Mar, 2025 2 commits
- feat(deps): Upgrade to gargantext-graph-core, drop redundant accelerate-related deps · ae0ffe85
  Alfredo Di Napoli authored 3 months ago
```
This commit uses the latest version of `gargantext-graph`, now rebranded
`gargantext-graph-core`, which allowed us to drop unused dependencies
like `accelerate-arithmetic` & co.
```
  ae0ffe85
- feat(ghc): Upgrade GGTX to build with GHC 9.6.6 · 62c0a399
  Alfredo Di Napoli authored 3 months ago
  
  62c0a399
27 Feb, 2025 11 commits
- Code review amendments · 28b288c7
  Alfredo Di Napoli authored 3 months ago
  
  28b288c7
- Attempt to speed up logDistributional2 · b2b68a63
  Alfredo Di Napoli authored 4 months ago
```
We are now around 6/7 times slower than the LLVM code.
```
  b2b68a63
- Drop accelerate-llvm dependency · 0536d1ff
  Alfredo Di Napoli authored 4 months ago
```
This also changes the cabal.project to not pull the accelerate-llvm or
the llvm-hs dependencies.
```
  0536d1ff
- WIP: try to use away intermediate matrixes in distributional · 1909fd3a
  Alfredo Di Napoli authored 4 months ago
  
  1909fd3a
- Try to improve massiv benchmark performance · e2d59228
  Alfredo Di Napoli authored 4 months ago
  
  e2d59228
- Add basic benchmarks · 755d64f8
  Alfredo Di Napoli authored 4 months ago
  
  755d64f8
- WIP: implement distributional in terms of massiv · c9385388
  Alfredo Di Napoli authored 4 months ago
  
  c9385388
- Introduce massiv for small tasks · e124af3e
  Alfredo Di Napoli authored 4 months ago
```
This commit starts introducing `massiv` in the codebase,
initially for simple functions like `termDivNan`. The main
goal is to extend the linear algebra toolkit up to the
point where we can implement `distributional` in terms of
`massiv` and measure its performance.
```
  e124af3e
- Fix division by 0 bug in distributional · 96e57e41
  Alfredo Di Napoli authored 4 months ago
```
The previous code was sometimes yielding a matrix of NaN numbers as
it was attempting the division of the input matrix with the diagonal,
which would be 0 in case of an input matrix of 0, resulting in a
division by 0 error.
```
  96e57e41
- Refactor createIndices · 690629a4
  Alfredo Di Napoli authored 4 months ago
  
  690629a4
- Introduce the Gargantext.Core.LinearAlgebra module · 8f4d901a
  Alfredo Di Napoli authored 4 months ago
```
The main idea is trying to refactor/improve the existing linear algebra
functions one function at a time, using reference implementations and
benchmarks along the way.
```
  8f4d901a
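
Regarding the "Fix division by 0 bug in distributional" commit (96e57e41) above: the failure mode is easy to reproduce on plain lists, since dividing a zero matrix by its zero diagonal gives `0 / 0 = NaN` for every entry. The sketch below guards the division and substitutes 0 when the denominator is 0. That substitution is an assumption about one reasonable fix, the names `safeDiv` and `divByDiagonal` are hypothetical, and the real code operates on matrix libraries rather than nested lists.

```haskell
-- Division that yields 0 instead of NaN or Infinity when the denominator is 0.
-- Assumption: treating x / 0 as 0 is acceptable for the similarity measure.
safeDiv :: Double -> Double -> Double
safeDiv _ 0 = 0
safeDiv x y = x / y

-- Divide every entry of row i by the i-th diagonal element.
-- The matrix is assumed to be square.
divByDiagonal :: [[Double]] -> [[Double]]
divByDiagonal m =
  [ [ safeDiv x d | x <- row ] | (row, d) <- zip m diagonal ]
  where
    diagonal = [ row !! i | (i, row) <- zip [0 ..] m ]
```

With a plain `/` in place of `safeDiv`, an all-zero input would come back as a matrix of NaN values.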
21 Feb, 2025 1 commit
- [export] sqlite export tests · 62d14afd
  Przemyslaw Kaminski authored 3 months ago
  
  62d14afd