- 12 May, 2025 1 commit
-
-
Alfredo Di Napoli authored
This gigantic commit ports the existing DB operations in GGTX to the transactional API. We can now compose DB operations and they will all run in the same Postgres transaction over the same connection, which eliminates the class of bugs where concurrent DB access could leave the database in an inconsistent state. On top of that, we simplify some parts of the API, as summarised below:

1. The `NodeStoryEnv` management has been greatly simplified. In the new API we don't need to pass an external connection pool or IO actions; we just pass DB operations, so the API now takes mostly pure values.
2. Because our `DBTx` monad can't do arbitrary IO (which is a good thing), we cannot fire Central Exchange notifications immediately. Instead, we collect the `CEMessage`s to be sent and fire them in the relevant concrete monad once the DB transaction has finished. In principle this introduces a small delay between the DB operation taking place and the notification firing, but in practice the latency should be negligible. Bear in mind that this is typically what we want: if a long DB transaction triggers an error in the middle, we don't want to have sent out CE messages prematurely for an operation that didn't succeed.
3. There are still a few places in the codebase where we couldn't make things fully compositional with regard to the `DBTx` API, because we had Servant handlers that mixed DB operations with other effectful IO computations (or other things like the notifications from `MonadJobStatus`). For now we are splitting these functions by manually running the partial DB operations. This is not ideal, but it can be fixed in subsequent merge requests.
4. The `WorkerEnv` doesn't use `IOException` as its `MonadError` error type anymore. For consistency we can just use `BackendInternalError`, adding an `InternalWorkerError` data constructor that accepts the `IOException` triggered by the Worker monad.

More testing is needed, with particular attention to performance (regressions), but this should hopefully offer a decent baseline.
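The deferred-notification idea from point 2 can be illustrated with a toy model. This is only a sketch under stated assumptions: the `DBTx` and `CEMessage` definitions below are simplified stand-ins written for this example (the "database" is an in-memory list of rows), and `insertRow`, `notify`, `abortTx`, `runDBTx` and `demo` are hypothetical names, not the project's actual API.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)

-- Hypothetical stand-in for the real notification type.
newtype CEMessage = CEMessage String
  deriving (Eq, Show)

-- Toy transaction: threads a fake database (a list of rows), accumulates
-- notifications, may abort, and has no way to perform arbitrary IO.
newtype DBTx a = DBTx
  { unDBTx :: [String] -> Either String (a, [String], [CEMessage]) }

instance Functor DBTx where
  fmap f (DBTx g) = DBTx $ \db -> case g db of
    Left e             -> Left e
    Right (a, db', ms) -> Right (f a, db', ms)

instance Applicative DBTx where
  pure a = DBTx $ \db -> Right (a, db, [])
  DBTx f <*> DBTx g = DBTx $ \db -> case f db of
    Left e             -> Left e
    Right (h, db', ms) -> case g db' of
      Left e               -> Left e
      Right (a, db'', ms') -> Right (h a, db'', ms ++ ms')

instance Monad DBTx where
  DBTx g >>= k = DBTx $ \db -> case g db of
    Left e             -> Left e
    Right (a, db', ms) -> case unDBTx (k a) db' of
      Left e               -> Left e
      Right (b, db'', ms') -> Right (b, db'', ms ++ ms')

insertRow :: String -> DBTx ()
insertRow row = DBTx $ \db -> Right ((), db ++ [row], [])

-- Record a notification; nothing is sent at this point.
notify :: CEMessage -> DBTx ()
notify msg = DBTx $ \db -> Right ((), db, [msg])

abortTx :: String -> DBTx a
abortTx err = DBTx $ \_ -> Left err

-- Commit the new state and only then fire the collected messages.
-- On failure the state is left untouched and no message is sent.
runDBTx :: IORef [String] -> (CEMessage -> IO ()) -> DBTx a -> IO (Either String a)
runDBTx dbRef send tx = do
  db <- readIORef dbRef
  case unDBTx tx db of
    Left err -> pure (Left err)
    Right (a, db', msgs) -> do
      writeIORef dbRef db'
      mapM_ send msgs
      pure (Right a)

-- One successful and one failing transaction; returns the final rows and
-- the messages that were actually sent.
demo :: IO ([String], [CEMessage])
demo = do
  dbRef   <- newIORef []
  sentRef <- newIORef []
  let send msg = modifyIORef' sentRef (++ [msg])
  _ <- runDBTx dbRef send (insertRow "node-1" >> notify (CEMessage "update-tree"))
  _ <- runDBTx dbRef send
         (insertRow "node-2" >> notify (CEMessage "lost") >> abortTx "boom" :: DBTx ())
  (,) <$> readIORef dbRef <*> readIORef sentRef
```

Loading this in GHCi and evaluating `demo` shows that the aborted transaction leaves neither a row nor a notification behind, which is the property the commit message argues for.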
-
- 07 Apr, 2025 3 commits
-
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
-
Alfredo Di Napoli authored
This commit introduces/improves the `parseTvsWithDiagnostics` function to parse the input TSV incrementally, collecting errors as we go, and eventually reporting them upstream via the newly added `emitTsvParseWarning` function.
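As a rough illustration of "collecting errors as we go", the sketch below parses a two-column TSV line by line and returns both the warnings and the rows that parsed. It is not the project's implementation: `Row`, `ParseWarning`, `parseRow` and `parseWithDiagnostics` are hypothetical names, and the two-column (title, year) layout is an assumption made for the example.

```haskell
import Data.Either (partitionEithers)
import Text.Read (readMaybe)

-- Hypothetical row shape: a title and a publication year.
data Row = Row { rowTitle :: String, rowYear :: Int }
  deriving (Eq, Show)

-- A diagnostic tied to the 1-based line number it came from.
data ParseWarning = ParseWarning { warnLine :: Int, warnReason :: String }
  deriving (Eq, Show)

-- Split a line on tab characters, keeping empty fields.
splitOnTab :: String -> [String]
splitOnTab s = case break (== '\t') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOnTab rest

-- Parse one line; a failure becomes a warning instead of aborting the parse.
parseRow :: Int -> String -> Either ParseWarning Row
parseRow n line = case splitOnTab line of
  [title, year] -> case readMaybe year of
    Just y  -> Right (Row title y)
    Nothing -> Left (ParseWarning n ("invalid year: " ++ year))
  fields -> Left (ParseWarning n ("expected 2 fields, got " ++ show (length fields)))

-- Parse every line, returning the warnings alongside the good rows.
parseWithDiagnostics :: String -> ([ParseWarning], [Row])
parseWithDiagnostics = partitionEithers . zipWith parseRow [1 ..] . lines
```

A caller could then report each `ParseWarning` upstream while still ingesting the valid rows, instead of rejecting the whole file on the first bad line.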
-
- 17 Feb, 2025 1 commit
-
-
Przemyslaw Kaminski authored
-
- 12 Feb, 2025 1 commit
-
-
Przemyslaw Kaminski authored
-
- 10 Feb, 2025 1 commit
-
-
Przemyslaw Kaminski authored
Related to #444. The rationale behind this is that we don't want to pollute the worker job queue with large file blobs. Instead, upon an API request, we create a pg_largeobject and use that in the worker. After the job is finished (with an error or not), the object is removed.
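The lifecycle described above (store the blob, enqueue only its identifier, always clean up afterwards) can be sketched with an in-memory store standing in for Postgres large objects. Everything here is an assumption made for illustration: `BlobStore`, `createBlob`, `runJob` and the other names are hypothetical, and a real implementation would talk to the database instead of an `IORef`.

```haskell
import Control.Exception (ErrorCall (..), SomeException, finally, throwIO, try)
import Data.IORef (IORef, atomicModifyIORef', modifyIORef', newIORef, readIORef)

type BlobId = Int

-- In-memory stand-in for large-object storage: next free id plus the blobs.
type BlobStore = IORef (BlobId, [(BlobId, String)])

-- Store a payload and hand back the small identifier to put in the job.
createBlob :: BlobStore -> String -> IO BlobId
createBlob store payload = atomicModifyIORef' store $ \(next, blobs) ->
  ((next + 1, (next, payload) : blobs), next)

readBlob :: BlobStore -> BlobId -> IO (Maybe String)
readBlob store bid = lookup bid . snd <$> readIORef store

removeBlob :: BlobStore -> BlobId -> IO ()
removeBlob store bid = modifyIORef' store $ \(next, blobs) ->
  (next, filter ((/= bid) . fst) blobs)

-- Run the job on the stored payload; the blob is removed whether the job
-- succeeds or throws.
runJob :: BlobStore -> BlobId -> (String -> IO a) -> IO a
runJob store bid work =
  (readBlob store bid >>= maybe (throwIO (ErrorCall "missing blob")) work)
    `finally` removeBlob store bid

blobCount :: BlobStore -> IO Int
blobCount store = length . snd <$> readIORef store

-- One successful job and one failing job; returns the first job's result
-- and the number of blobs left behind afterwards.
demo :: IO (Int, Int)
demo = do
  store <- newIORef (0, [])
  okId  <- createBlob store "big file contents"
  n     <- runJob store okId (pure . length)
  badId <- createBlob store "another upload"
  _     <- try (runJob store badId (\_ -> throwIO (ErrorCall "job failed")))
             :: IO (Either SomeException ())
  remaining <- blobCount store
  pure (n, remaining)
```

The point of `finally` is that the queue only ever carries a `BlobId`, and no payload outlives its job even when the job fails.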
-
- 30 Jan, 2025 1 commit
-
-
Przemyslaw Kaminski authored
commit be879b1e
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Thu Jan 30 18:22:44 2025 +0100

    [ngrams] code fixes according to review

    Related MR: !378

commit bf89561b
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Wed Jan 22 21:11:47 2025 +0100

    [test] notification on node move

    Also, some small refactorings.

commit 3d5d74ab
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Wed Jan 22 20:13:44 2025 +0100

    [tests] add notifications func comment, fix core/notifications indent

commit b8ea3af2
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Wed Jan 22 19:13:35 2025 +0100

    [update-project-dependencies]

commit 1217baf4
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Wed Jan 22 19:09:17 2025 +0100

    [tests] notifications: test async notifications for update tree

    Related to #418

commit 874785e9
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Tue Jan 21 06:06:27 2025 +0100

    [refactor] unify Database & ExternalIDs

    These types are the same, except for Database.Empty

    I managed to have backwards compatibility with the frontend format,
    hence the frontend doesn't need any changes.

commit e7b16520
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Tue Jan 21 06:05:57 2025 +0100

    [cabal] upgrade haskell-bee to fix TSRetry and ESRepeat issues

commit ad045ae0
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Mon Jan 20 06:32:49 2025 +0100

    [cabal] upgrade haskell-bee tag

commit b3910bb4
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Tue Jan 14 10:56:12 2025 +0100

    [test] move some Arbitrary instances to Test/Instances.hs

commit bb282d02
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Tue Jan 14 09:17:23 2025 +0100

    [test] WithQuery offline test (with EPO constructor)

commit c0fe2e51
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Tue Jan 14 06:59:45 2025 +0100

    [query] move EPO user/token into the datafield

    This simplifies the WithQuery structure even more

commit 93586adc
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Mon Jan 13 17:45:42 2025 +0100

    [tests] fix WithQuery frontend serialization test

    Also, add WithQuery pubmed test (with api_key)

commit bc29319c
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Mon Jan 13 10:13:15 2025 +0100

    [ngrams] simplify WithQuery json structure

    There is now only a 'datafield' field, no need for duplicated
    'database'.

    Related to #441

commit e6fdbee4
Merge: 95dc32b3 13457ca8
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Fri Jan 10 12:03:59 2025 +0100

    Merge branch 'dev' into 224-dev-understanding-ngrams

commit 95dc32b3
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Tue Jan 7 20:01:11 2025 +0100

    [ngrams] refactor PubMed DB type (to include Maybe APIKey)

commit baa2491f
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Tue Jan 7 18:09:04 2025 +0100

    [refactor] searx search refactoring

commit fcf83bf7
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Tue Jan 7 11:14:03 2025 +0100

    [ngrams] more types annotations

commit 0d8a77c4
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Mon Dec 30 16:15:07 2024 +0100

    [ngrams, test] refactor: Count -> Terms

commit 85f1dffe
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Mon Dec 30 14:35:05 2024 +0100

    [ngrams] refactor opaque Int into TermsWeight newtype

commit a81ea049
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Mon Dec 30 14:34:39 2024 +0100

    [CLI] fix limit removal

    It wasn't used anyways.

commit d1dfbf79
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Mon Dec 30 11:35:41 2024 +0100

    [ngrams] one more simplification in ngramsByDoc'

commit fcb48b8f
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Mon Dec 30 11:33:33 2024 +0100

    [ngrams] some more simplification of ngramsByDoc'

commit ab7c1766
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Mon Dec 30 11:00:19 2024 +0100

    [ngrams, tests] understanding ngramsByDoc better

commit 35c2d0b0
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Mon Dec 23 21:20:29 2024 +0100

    [ngrams] small simplification to docNgrams function

commit 161ac077
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Mon Dec 23 18:35:59 2024 +0100

    [ngrams] annotate types of ngrams algorithms

commit 08c7c91c
Author: Przemysław Kaminski <pk@intrepidus.pl>
Date:   Sat Dec 21 09:45:00 2024 +0100

    [ngrams] improve function documentation, add types, add unit tests

    I want to understand ngrams algorithms better.
-
- 29 Jan, 2025 1 commit
-
-
Przemyslaw Kaminski authored
commit b4755ad5
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Wed Jan 29 11:41:32 2025 +0100

    Code review, part II

    This commit splits the /export (renaming it to just remote) and tuck
    it under the /node hierarchy. The import also lives tucked in the
    /node.

commit 483bd3e5
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Tue Jan 28 09:34:14 2025 +0100

    Code review feedback

    * Rename `exampleS` into `exampleSchema`;
    * Revert commit about the public keys & co;

commit 18d207f0
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Tue Jan 28 09:50:56 2025 +0100

    Revert "Add _env_remote_transfer_keys field"

    This reverts commit 3ea32b50.

commit 9cc5159a
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Jan 20 10:00:53 2025 +0100

    Support transfering of notes

commit 1fe60d75
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Fri Jan 17 08:37:53 2025 +0100

    Refactor exporting and transfering of nodes

commit b39c1805
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Thu Jan 16 09:54:38 2025 +0100

    Preliminary work to transfer notes

commit b2f7a9a8
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Jan 13 15:25:34 2025 +0100

    Chunks the insertion of remote docs

commit 0d4e0554
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Jan 13 14:43:16 2025 +0100

    Move terms updating to separate job as well

commit c62480c7
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Jan 13 12:30:19 2025 +0100

    Proper support for importing documents

commit 6019587c
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Jan 6 15:44:13 2025 +0100

    Initial support for importing ngrams

commit 842b3d36
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Jan 6 15:07:21 2025 +0100

    Support exporting docs and ngrams (but not importing them yet)

commit 98708c2e
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Jan 6 12:32:17 2025 +0100

    Support exporting of tree hierarchies (with proviso)

    Exporting a corpus works, as it also exports its children, but for
    example the docs and terms nodes do not have any associated content.
    This is because those are stored in separate DB tables, and we need
    to find a way to export those as well.

commit c248eaf1
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Jan 6 11:50:14 2025 +0100

    Support trees of export nodes (to be tested)

commit dd2049aa
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Jan 6 11:23:30 2025 +0100

    Add getNodes function to Database.Query.Table.Node

commit c429cbb1
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Jan 6 10:39:37 2025 +0100

    Restrict export of nodes to only a few types

commit c648699e
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Jan 6 09:27:28 2025 +0100

    Update deps again (after rebase)

commit 7337820e
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Dec 16 16:07:49 2024 +0100

    Basic Remote API testing

commit 1eb59c52
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Dec 16 15:31:47 2024 +0100

    Barebone (non-streaming) storage of nodes

commit be5e9faf
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Dec 16 14:20:16 2024 +0100

    Send serialised nodes instead of dummy strings

commit aff15b60
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Dec 16 12:07:49 2024 +0100

    Remove redundant test imports

commit 6d776767
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Dec 16 12:01:55 2024 +0100

    Bolt-on ownership check for /remote/export

commit 58d9fcb0
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Dec 16 11:08:11 2024 +0100

    Proper error handling for remote import and export handlers

commit 23a06d28
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Dec 9 15:37:29 2024 +0100

    Update project deps

commit d5096e40
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Dec 2 17:05:04 2024 +0100

    Make a start on the remote (streaming) endpoints

    It typechecks but it exchange only a very simple string and it
    prints it.

commit 3ea32b50
Author: Alfredo Di Napoli <alfredo.dinapoli@gmail.com>
Date:   Mon Dec 2 11:26:11 2024 +0100

    Add _env_remote_transfer_keys field

    This adds a new randomly-generated pair of (PublicKey, PrivateKey) to
    be later used to send messages between instances.

    It also:

    * Returns a remote transfer pub key inside an AuthResponse
    * Adds pubKey roundtrip test
-
- 14 Jan, 2025 1 commit
-
-
Przemyslaw Kaminski authored
This simplifies the WithQuery structure even more
-
- 13 Jan, 2025 1 commit
-
-
Przemyslaw Kaminski authored
There is now only a 'datafield' field, no need for duplicated 'database'. Related to #441
-
- 07 Jan, 2025 1 commit
-
-
Przemyslaw Kaminski authored
-
- 19 Nov, 2024 1 commit
-
-
Przemyslaw Kaminski authored
Also, some import refactoring
-
- 30 Oct, 2024 1 commit
-
-
Przemyslaw Kaminski authored
-
- 28 Oct, 2024 1 commit
-
-
Przemyslaw Kaminski authored
-
- 19 Sep, 2024 2 commits
-
-
Przemyslaw Kaminski authored
Now everything is in Core/Config
-
Przemyslaw Kaminski authored
I moved HasConfig to Core/Config instead of it being in Database.Prelude
-
- 04 Sep, 2024 1 commit
-
-
Przemyslaw Kaminski authored
-
- 26 Aug, 2024 1 commit
-
-
Przemyslaw Kaminski authored
-
- 30 Jul, 2024 1 commit
-
-
Przemyslaw Kaminski authored
-
- 04 Jul, 2024 1 commit
-
-
Alfredo Di Napoli authored
This commit moves `GargConfig` and the other config-related data structures back into gargantext, so that they can be edited and expanded without needing to worry about the prelude project.
-
- 26 Jun, 2024 1 commit
-
-
Alfredo Di Napoli authored
-
- 07 Jun, 2024 1 commit
-
-
Loïc Chapron authored
-
- 03 Jun, 2024 1 commit
-
-
Alfredo Di Napoli authored
-
- 09 May, 2024 1 commit
-
-
Alfredo Di Napoli authored
This big commit adds a separate module hierarchy for Servant named routes (see https://www.tweag.io/blog/2022-02-24-named-routes/ ) which will make working with servant endpoints more pleasant (especially when it comes to emitted errors). This still doesn't do anything to wire the routes to the concrete handlers.
-
- 09 Apr, 2024 1 commit
-
-
Przemyslaw Kaminski authored
-
- 08 Apr, 2024 1 commit
-
-
Alfredo Di Napoli authored
It forces programmers to think about errors we are logging and reporting to the frontend, because they need to contain no sensitive data.
-
- 25 Mar, 2024 1 commit
-
-
Przemyslaw Kaminski authored
This was (User, Either CorpusName, [CorpusId]) before, but the case of UserMaster doesn't make sense with these parameters, so I rewrote the function to accept only correct datatypes as inputs.
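The general technique, narrowing an input type so that the nonsensical case cannot be passed in, can be sketched as follows. The function this commit changed is not named in this log, so every type and name below (`User`, `LooseInput`, `CorpusInput`, `fromLoose`, `describe`) is a hypothetical stand-in chosen for the example, not the project's real definitions.

```haskell
type CorpusName = String
type CorpusId   = Int

-- Hypothetical user type with a "master" case.
data User = UserMaster | UserNamed String
  deriving (Eq, Show)

-- "Before" shape: every combination type-checks, including the master
-- user paired with parameters that make no sense for it.
type LooseInput = (User, Either CorpusName [CorpusId])

-- "After" shape: only a named owner can be expressed at all.
data CorpusInput = CorpusInput
  { ciOwner  :: String
  , ciCorpus :: Either CorpusName [CorpusId]
  } deriving (Eq, Show)

-- Convert at the boundary, rejecting the case the new type rules out.
fromLoose :: LooseInput -> Maybe CorpusInput
fromLoose (UserMaster, _)          = Nothing
fromLoose (UserNamed name, corpus) = Just (CorpusInput name corpus)

-- A consumer no longer needs a branch for the master user.
describe :: CorpusInput -> String
describe (CorpusInput owner (Left name)) = owner ++ " creates corpus " ++ name
describe (CorpusInput owner (Right ids)) =
  owner ++ " reuses " ++ show (length ids) ++ " corpora"
```

With the narrower type, the invalid combination is rejected once at the edge (or cannot be written at all), rather than being handled defensively inside the function.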
-
- 27 Dec, 2023 1 commit
-
-
Przemyslaw Kaminski authored
This is similar to purescript-gargantext#594 and !205/ but for JSON import.
-
- 04 Dec, 2023 1 commit
-
-
Alfredo Di Napoli authored
-
- 09 Nov, 2023 1 commit
-
-
Przemyslaw Kaminski authored
-
- 12 Oct, 2023 1 commit
-
-
Przemyslaw Kaminski authored
Basically, Gargantext.Prelude exports all of Protolude now.
-
- 06 Oct, 2023 1 commit
-
-
Przemyslaw Kaminski authored
-
- 25 Sep, 2023 1 commit
-
-
Przemyslaw Kaminski authored
-
- 21 Sep, 2023 1 commit
-
-
Przemyslaw Kaminski authored
The issue was that when a search was performed, the initial table didn't have an archive ready, so the first commit resulted in some errors on the frontend. This fix performs the commit after the flow is finished.
-
- 21 Aug, 2023 2 commits
-
-
Alfredo Di Napoli authored
Fixes #258. This commit fixes the issue of the "empty Ngrams". The culprit was a bug in the `arxiv-api` package, which was fixed by gargantext/crawlers/arxiv-api!2. In a nutshell, because the relevant `Conduit` never completed, it blocked the `flow` function, which waited endlessly for the last part of the results to arrive. Hence we never called `flowCorpusUser`, the function responsible for generating the Ngrams.
-
Alfredo Di Napoli authored
-
- 08 Aug, 2023 1 commit
-
-
Przemyslaw Kaminski authored
-
- 28 Jul, 2023 1 commit
-
-
Alfredo Di Napoli authored
-
- 13 Jul, 2023 1 commit
-
-
Alfredo Di Napoli authored
-