- 09 Jun, 2025 4 commits
-
-
Przemyslaw Kaminski authored
We have more ngrams terms now.
-
Przemyslaw Kaminski authored
-
Alfredo Di Napoli authored
Separate ngram extraction from document insertion Closes #473 See merge request !415
-
Przemyslaw Kaminski authored
-
- 06 Jun, 2025 1 commit
-
-
Przemyslaw Kaminski authored
-
- 05 Jun, 2025 7 commits
-
-
Alfredo Di Napoli authored
Now the tests pass again, but crucially `insertMasterDocs` runs in a single atomic DB update, meaning we can roll back cleanly in case disaster strikes.
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
The `ExtractNgrams` typeclass _definition_ imposed a redundant `HasText` constraint, forcing all the _instances_ to have `HasText` defined even when the instance made no use of it. That is overly rigid, and it has been fixed by this commit.
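A minimal sketch of the idea, not the project's actual code: the method names (`hasText`, `extractNgrams`), their types, and the `Keywords` and `Abstract` types are assumptions made for illustration. With no superclass constraint, an instance only needs `HasText` when a caller asks for both.

```haskell
-- Hypothetical class shapes; the real signatures are not shown in this log.
class HasText a where
  hasText :: a -> [String]

-- Before the fix this would have read: class HasText a => ExtractNgrams a
class ExtractNgrams a where
  extractNgrams :: a -> [String]

-- A type that carries ngrams but no free text: no HasText instance needed.
newtype Keywords = Keywords [String]

instance ExtractNgrams Keywords where
  extractNgrams (Keywords ks) = ks

-- A type that has both capabilities.
newtype Abstract = Abstract String

instance HasText Abstract where
  hasText (Abstract s) = [s]

instance ExtractNgrams Abstract where
  extractNgrams (Abstract s) = words s

-- Callers that need both capabilities state both constraints explicitly.
textAndNgrams :: (HasText a, ExtractNgrams a) => a -> ([String], [String])
textAndNgrams x = (hasText x, extractNgrams x)
```

Here `Keywords` compiles without a `HasText` instance, which the superclass constraint would have forbidden.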
-
Przemyslaw Kaminski authored
However, tests don't pass and I'm not sure whether other functionality breaks.
-
Przemyslaw Kaminski authored
Thing is: I sometimes have corenlp/postgres running for my dev env. When I start tests, they fail because they try to start corenlp again on the same port. So what I added was checking for that "address already in use" condition and going on with the tests. CoreNLP is stateless so it shouldn't matter which one we use for tests.
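One way to tolerate an already-running service can be sketched as follows. This is an assumption about the approach, not the code from this commit: the helper name `startOrReuse` is hypothetical, and the sketch assumes the launcher surfaces the port conflict as an `IOError` that `isAlreadyInUseError` recognises.

```haskell
import Control.Exception (tryJust)
import Control.Monad (guard)
import System.IO.Error (alreadyInUseErrorType, isAlreadyInUseError, mkIOError)

-- Run a launcher action. If it fails with "address already in use",
-- return Nothing so the caller can reuse the instance that is running.
-- Any other exception is rethrown unchanged.
startOrReuse :: IO a -> IO (Maybe a)
startOrReuse start = do
  result <- tryJust (guard . isAlreadyInUseError) start
  pure (either (const Nothing) Just result)

-- Simulated launcher that fails the way a second start on the same port might.
simulatedConflict :: IO ()
simulatedConflict =
  ioError (mkIOError alreadyInUseErrorType "bind" Nothing Nothing)
```

Because only the one error kind is swallowed, a genuine startup failure still aborts the test run.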
-
Przemyslaw Kaminski authored
-
Przemyslaw Kaminski authored
-
- 04 Jun, 2025 1 commit
-
-
Alfredo Di Napoli authored
This commit refactors the flow code to generate the ngrams for the master docs separately, and then it "commits" them later after such docs have been associated with a `Node`.
-
- 02 Jun, 2025 4 commits
-
-
Przemyslaw Kaminski authored
-
Przemyslaw Kaminski authored
706 dev graph parameters display See merge request !407
-
Przemyslaw Kaminski authored
-
Karen Konou authored
-
- 29 May, 2025 1 commit
-
-
Przemyslaw Kaminski authored
Fix a bug in `buildPatterns` and friends Closes #395 See merge request !413
-
- 26 May, 2025 10 commits
-
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
The final `T` doesn't add anything. It also moves the `HasText` constraint _outside_ the typeclass definition.
-
Alfredo Di Napoli authored
Update IGraph See merge request !411
-
Alfredo Di Napoli authored
Fixes a bug in the implementation of `buildPatterns`. In particular, when we are building a `Pattern`, we need to do so in a case-insensitive fashion, because `replaceTerms` is later called from `extractTermsWithList`, which casts everything into lowercase due to the use of `monoTextsBySentence`. This means that before this commit, if we tried to search for "Map" in the text "Map is what I use when I'm lost", we wouldn't get a match: the text would be converted into lowercase first (i.e. "map is what i use when i'm lost") and we would then look for the string "Map" in the transformed text, yielding no matches.
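The mismatch can be reproduced in a few lines. This is a simplified sketch using plain `String`; the `Pattern` type, `buildPattern`, and `matchesLowercased` below are hypothetical stand-ins, not the project's `buildPatterns` or `replaceTerms`.

```haskell
import Data.Char (toLower)
import Data.List (isInfixOf)

-- Hypothetical stand-in for the real pattern type.
newtype Pattern = Pattern String

-- Normalise the search term to lowercase when the pattern is built,
-- so it agrees with the lowercased document text.
buildPattern :: String -> Pattern
buildPattern = Pattern . map toLower

-- The document is lowercased before matching, mirroring what the
-- commit message says happens downstream.
matchesLowercased :: Pattern -> String -> Bool
matchesLowercased (Pattern p) doc = p `isInfixOf` map toLower doc
```

With the term left as "Map", the check `"Map" `isInfixOf` map toLower doc` fails on the example sentence, whereas `matchesLowercased (buildPattern "Map") doc` succeeds.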
-
Alfredo Di Napoli authored
Previously the generator was generating all sorts of unicode symbols, which don't play well with things like tab separators, carriage returns and other things. Furthermore, we need to be careful not to use the same symbol set as `isSep` when we generate terms, because we are simulating an ngrams search in a document and ngrams do not contain those separators (i.e. `k2(` is not a valid ngram, but `k2` is).
-
Grégoire Locqueville authored
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
-
- 22 May, 2025 2 commits
-
-
Grégoire Locqueville authored
Our IGraph library was updated to fix [this issue](haskell-igraph#5).
-
Grégoire Locqueville authored
The `filterNodes` function's name and type signature were confusing, so they were changed to make it evident that the filtering function for an entry is applied to its total number of cooccurrences.
-
- 21 May, 2025 9 commits
-
-
Grégoire Locqueville authored
-
Grégoire Locqueville authored
Reduce code duplication and simplify a few things. The changes in this commit were made without relying on any knowledge about what the code means or does; the changes should be understandable even without knowing the project.
-
Grégoire Locqueville authored
-
Grégoire Locqueville authored
-
Grégoire Locqueville authored
There's a filter in place to hide nodes that are not connected to any other node; however, the strict inequality was also filtering out nodes connected to exactly one other node.
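The off-by-one can be illustrated as follows. This is a sketch under assumptions: the node representation, the function names, and the exact threshold used in the real code are not given in this log.

```haskell
-- Hypothetical representation: a node name paired with its neighbour count.
type Node = (String, Int)

-- Buggy variant: the threshold also drops nodes with exactly one neighbour.
keepConnectedBuggy :: [Node] -> [Node]
keepConnectedBuggy = filter (\(_, degree) -> degree > 1)

-- Intended behaviour: drop only nodes with no neighbours at all.
keepConnected :: [Node] -> [Node]
keepConnected = filter (\(_, degree) -> degree > 0)
```

For the input `[("a", 0), ("b", 1), ("c", 2)]`, the buggy variant keeps only `c`, while the corrected one keeps `b` and `c`.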
-
Grégoire Locqueville authored
Removed some dead code, renamed some stuff, made exports explicit
-
Grégoire Locqueville authored
Spinglass functions would output incoherent clusters when the input graph was disconnected.
-
Grégoire Locqueville authored
`occurrencesWith` takes a function parameter to map over the input collection before counting occurrences. Every time it's called, though, it's with `identity`, so I replaced the function with a basic occurrence counting function. Note that if one needs to apply a function to a collection before counting occurrences, they can simply `fmap` over the collection first.
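A plain counting function of this kind might look like the sketch below. The name `occurrences` and its signature are assumptions; the replacement function in the codebase may differ.

```haskell
import Data.Char (toLower)
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

-- Count how many times each element appears in a collection.
occurrences :: (Foldable f, Ord a) => f a -> Map a Int
occurrences = foldr (\x -> Map.insertWith (+) x 1) Map.empty

-- Mapping before counting needs no dedicated parameter:
-- fmap over the collection first, then count.
caseInsensitiveCounts :: String -> Map Char Int
caseInsensitiveCounts = occurrences . fmap toLower
```

The second function shows the `fmap`-first pattern the commit message recommends in place of the old function parameter.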
-
Alfredo Di Napoli authored
Port (almost all) DB operations in GGTX to use the transaction API Closes #466 See merge request !408
-
- 19 May, 2025 1 commit
-
-
Alfredo Di Napoli authored
-