Gargantext with Haskell (Backend instance)

Table of Contents

  1. About the project
  2. Installation and development
  3. Use cases
  4. GraphQL
  5. PostgreSQL

About the project

GarganText is a collaborative, decentralized, web-based macro-service platform for the exploration of unstructured texts. It combines tools from natural language processing, text-data-mining bricks, complex network analysis algorithms and interactive data visualization tools to pave the way toward new kinds of interactions with your textual and digital corpora.

This software is free software (free as in "libre" in French), developed by the CNRS Complex Systems Institute of Paris Île-de-France (ISC-PIF) and its partners.

GarganText Project: this repo builds the backend for the frontend, which is built from the purescript-gargantext repo.

Installation and development

Disclaimer: since this project is still in development, this document remains a work in progress. Please report any issues you encounter and help improve this documentation.

Prerequisites

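You must have the following installed:

  • Git
  • Nix (provides the nix-shell command used below)
  • Docker with Docker Compose (used to start the database and NLP containers)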

Building

Clone the projects

Clone the backend (haskell-gargantext), then clone the frontend (purescript-gargantext) into the root of the backend. Stay in the backend root afterwards: the commands below are run from there.

$ git clone https://gitlab.iscpif.fr/gargantext/haskell-gargantext.git
$ cd haskell-gargantext
$ git clone https://gitlab.iscpif.fr/gargantext/purescript-gargantext.git

The Nix shell

In what follows, many commands need to be executed from within the Nix shell. To make that clear, those will be prefixed with n$, but you must not actually type n$ before the commands. To enter a Nix shell, run the following (this will take a moment the first time you run it, be patient):

$ nix-shell

Once you are in a Nix shell, you can run commands like you would in any other shell.

At any point, you can exit a Nix shell and go back to your regular shell by running exit.

If for some reason you do not want to enter a Nix shell, you can still run a command from outside: running the following in a non-Nix shell

$ nix-shell --run "my command"

is equivalent to running my command from within a Nix shell.
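
A small wrapper can make that choice automatically. This is only a sketch: it assumes that nix-shell exports the IN_NIX_SHELL environment variable to the shell it starts, and the name run_in_nix is hypothetical (it is not part of this repository).

```shell
# run_in_nix is a hypothetical helper, not part of this repository.
# Inside a Nix shell it runs the command directly; outside, it delegates
# to `nix-shell --run`.
run_in_nix() {
  if [ -n "${IN_NIX_SHELL:-}" ]; then
    "$@"
  else
    # "$*" joins the arguments with spaces, so arguments that themselves
    # contain spaces or quotes need extra quoting by the caller.
    nix-shell --run "$*"
  fi
}
```

For example, run_in_nix cabal update behaves the same whether or not you are already inside the Nix shell.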

(Optional) Disable optimization flags

If you are developing Gargantext, you might be interested in disabling compiler optimizations. This speeds up compilation, but the compiled program itself will be less efficient.

To disable compiler optimizations, copy the file cabal.project.local_toCopy (which contains the flags that disable optimizations) into cabal.project.local (which will be read by Cabal):

$ cp cabal.project.local_toCopy cabal.project.local

Build the frontend

$ cd purescript-gargantext/
$ ./bin/install
$ cd ..

Build the backend

Note: This project can be built with either stack or cabal. We keep cabal.project up to date, which lets us build with cabal by default, but we also support stack thanks to cabal2stack, which generates a valid stack.yaml from a cabal.project. Because gargantext requires a particular set of system dependencies (C++ libraries, toolchains, etc.), we use nix to set up a sandboxed, isolated environment that provides all of them.

This documentation shows how to build with cabal. For information related to stack, see the note about Stack in the developer documentation.

Depending on your situation, there are several ways to build the project:

  1. Simple build

This will build the project and install the executables gargantext-cli and gargantext-server somewhere on your system. Depending on your Cabal configuration, this is probably ~/.local/bin/ or ~/.cabal/bin/.

From within the Nix shell, run:

n$ cabal update
n$ cabal install

  2. Full build

Same as "simple build" above, but also runs tests and builds documentation.

Just run the install script:

$ ./bin/install

  3. Build and run

Builds and runs the Gargantext server. This has the advantage of letting you run Gargantext without having to know where on your machine the executable is.

Since you will be running Gargantext, you need to have gone through initialization first; see "Initializing and running" below.

From inside a Nix shell:

n$ cabal run gargantext-server -- --toml gargantext-settings.toml --run Prod

Upgrading Haskell packages

We use gargantext.cabal, cabal.project and cabal.project.freeze as the source of truth. From these, we generate the stack.yaml file for those who prefer to use Stack.

Upgrading packages with cabal can sometimes be painful.

Here are some tips:

  • Manually remove entries from your cabal.project.freeze to make the build a bit more "elastic";
  • Lock the hackage-index state in cabal.project, so that the solver won't try to pull newer dependencies;
  • Specify the constraints you want directly when building, e.g. cabal v2-build --constraint tasty==x.y.z.w;
  • Generate another .freeze with cabal v2-freeze once you get the new build to compile (this is good for small, incremental upgrades);
  • Bounds in .cabal are respected, but the .freeze takes priority. You may therefore want to run cabal gen-bounds while your .freeze still exists, then remove the .freeze and try again.
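
Before removing an entry from the freeze file by hand, it can help to see which version is currently pinned. The following is a minimal sketch, assuming the freeze file uses the any.<package> ==<version> form written by cabal v2-freeze; frozen_version is a hypothetical name, not a script from this repository.

```shell
# frozen_version is a hypothetical helper, not part of this repository.
# Usage: frozen_version PACKAGE [FREEZE_FILE]
# Prints the version pinned for PACKAGE, or nothing if it is not pinned.
frozen_version() {
  pkg=$1
  file=${2:-cabal.project.freeze}
  # Assumes lines of the form "any.<package> ==<version>," in the file.
  sed -n "s/^.*any\.${pkg} ==\([0-9.]*\).*/\1/p" "$file"
}
```

Running frozen_version hashable from the backend root would then print the pinned hashable version, if there is one.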

It also helps to build with stack build from time to time. Warnings are displayed whenever stack uses a package version from its LTS snapshot that differs from the one defined in the .cabal file; this is an incentive to update the versions in the .cabal file.

Occasionally, you can get issues with the allow-newer: * constraint from cabal.project. E.g. when I was building with GHC 9.4.7, I had errors with hashable-1.5.0. The solution is:

cabal v2-build --constraint hashable==1.4.3.0

(we don't depend on hashable directly, but allow-newer: * is so liberal that a package that is too new is used).

Overall, it is preferable to specify strict constraints in the gargantext.cabal file. To get an idea of which versions work, you can use stack ls dependencies.

If you want to see the detailed build info for a given dependency:

cabal v2-build -v servant-server

Also, you might use the -Wunused-packages GHC option, to get a warning about unused packages (make sure though you build all targets with cabal v2-build all).

Also, here is a relevant discussion: https://discourse.haskell.org/t/whats-your-workflow-to-update-cabal-dependencies/9475

Initializing and running

Start containers for database and NLP software bricks

$ cd devops/docker
$ docker compose up

The initialization schema should be loaded automatically from devops/postgres/schema.sql.
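
If you start the containers in the background (docker compose up -d), the database may need a few seconds before it accepts connections. A retry loop is one way to wait for it. This is only a sketch: wait_until is a hypothetical name, <container-id> is a placeholder, and the example assumes the pg_isready tool that ships with PostgreSQL is available inside the container.

```shell
# wait_until is a hypothetical helper, not part of this repository.
# Usage: wait_until ATTEMPTS COMMAND [ARGS...]
# Retries COMMAND once per second until it succeeds or ATTEMPTS is reached.
wait_until() {
  attempts=$1
  shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Example (assumes a running container; <container-id> is a placeholder):
#   wait_until 30 docker exec <container-id> pg_isready -U gargantua
```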

Create configuration file

$ cp gargantext-settings.toml_toModify gargantext-settings.toml

NOTE: If you previously used a gargantext.ini file, you can automatically convert it into a gargantext-settings.toml file by running the following from a Nix shell:

n$ cabal v2-run gargantext-cli -- ini

.gitignore excludes this file, so you don't need to worry about committing it by mistake, and you can change the passwords in gargantext-settings.toml safely.
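
Once you have edited gargantext-settings.toml, running the cp command again would overwrite your changes. A guarded copy avoids that. The helper below is a sketch with a hypothetical name (copy_template); it is not a script from this repository.

```shell
# copy_template is a hypothetical helper, not part of this repository.
# Usage: copy_template TEMPLATE TARGET
# Copies TEMPLATE to TARGET unless TARGET already exists, so that an edited
# configuration (with your passwords) is never overwritten.
copy_template() {
  template=$1
  target=$2
  if [ -e "$target" ]; then
    echo "$target already exists, leaving it untouched" >&2
    return 0
  fi
  cp "$template" "$target"
}
```

For example: copy_template gargantext-settings.toml_toModify gargantext-settings.toml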

Create master user

From within the Nix shell:

n$ gargantext-cli init

The master user's name is automatically set to gargantua, but you will be prompted for their password and email address.

Running

Make sure you know where gargantext-server is (probably in ~/.local/bin/ or ~/.cabal/bin/). If the location is in your $PATH, just run:

$ gargantext-server --run Prod --toml gargantext-settings.toml

(If the location is not in your $PATH, just prefix gargantext-server with the path to it.)
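
If you are unsure where the executable ended up, a lookup like the following can help. It is a sketch under the assumption that Cabal installed into one of the two directories mentioned above; find_exe is a hypothetical name.

```shell
# find_exe is a hypothetical helper, not part of this repository.
# Usage: find_exe NAME
# Prints the path of NAME, looking first in $PATH and then in the usual
# Cabal install directories. Fails if NAME is found in neither.
find_exe() {
  name=$1
  if command -v "$name" >/dev/null 2>&1; then
    command -v "$name"
    return 0
  fi
  for dir in "$HOME/.local/bin" "$HOME/.cabal/bin"; do
    if [ -x "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  return 1
}
```

The server could then be started with "$(find_exe gargantext-server)" --run Prod --toml gargantext-settings.toml.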

You might want to use the ./start script: it rebuilds the backend, starts the docker containers, and launches the Gargantext server at once.

Running tests

From within the Nix shell:

n$ cabal v2-test --test-show-details=streaming

Or, from "outside":

$ nix-shell --run "cabal v2-test --test-show-details=streaming"

If you want to run particular tests, use:

cabal v2-test garg-test-tasty --test-show-details=streaming --test-option=--pattern='/job status update and tracking/'
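
Because test patterns usually contain spaces, the whole --test-option argument has to reach cabal as a single word. The wrapper below is a sketch of one way to keep the quoting right; test_matching and the DRY_RUN variable are hypothetical names, not part of this repository.

```shell
# test_matching is a hypothetical helper, not part of this repository.
# Usage: test_matching PATTERN
# Set DRY_RUN=1 to print the cabal arguments (one per line, in brackets)
# instead of running the tests.
test_matching() {
  pattern=$1
  set -- cabal v2-test garg-test-tasty --test-show-details=streaming \
    "--test-option=--pattern=$pattern"
  if [ -n "${DRY_RUN:-}" ]; then
    printf '[%s]\n' "$@"
  else
    "$@"
  fi
}
```

From within the Nix shell, test_matching '/job status update and tracking/' runs the same command as above.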

Working on libraries

When development is needed on libraries (for instance, the HAL crawler in https://gitlab.iscpif.fr/gargantext/crawlers):

  1. Ongoing development (on local repo):
    1. In cabal.project:
      • add ../hal to packages:
      • turn off (temporarily) the hal entry in source-repository-package
    2. When the changes work and tests pass, commit in the hal repo
  2. When changes are committed / merged:
    1. Get the commit hash, and edit cabal.project with the new commit id
    2. Run ./bin/update-project-dependencies
      • you will get an error that the sha256 hashes don't match, so update ./bin/update-project-dependencies with the new sha256 hash
      • run ./bin/update-project-dependencies again (to make sure it has reached a fixed point)

Note: without stack.yaml, we would only have to fix the commit id in cabal.project -> source-repository-package. The sha256 is there to make sure CI reruns the tests.
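
To see in advance whether a digest has changed, you can compute it yourself. The sketch below is generic: this document does not say which file ./bin/update-project-dependencies hashes, so the file you pass in is your own assumption, and sha256_of / sha256_matches are hypothetical names. It relies on the sha256sum tool from GNU coreutils.

```shell
# sha256_of and sha256_matches are hypothetical helpers, not part of this
# repository. Which file the project script hashes is an assumption you
# need to check in ./bin/update-project-dependencies itself.

# Usage: sha256_of FILE
# Prints the SHA-256 digest of FILE as lowercase hex.
sha256_of() {
  sha256sum "$1" | cut -d ' ' -f 1
}

# Usage: sha256_matches FILE EXPECTED_DIGEST
# Succeeds if the digest of FILE equals EXPECTED_DIGEST.
sha256_matches() {
  [ "$(sha256_of "$1")" = "$2" ]
}
```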

Tooling info

Once you get Gargantext to compile and run on your machine, you will likely want the following:

  • Language support (intellisense) in your editor; see docs/editor_setup.md
  • Being able to send commands to the Gargantext server from GHCi; see docs/running_commands.md

Use Cases

Multi-User with Graphical User Interface (Server Mode)

$ ~/.local/bin/stack --docker exec gargantext-server -- --run Prod

Then you can log in with user1 / 1resu

Command Line Mode tools

Simple cooccurrences computation and indexation from a list of Ngrams

$ stack --docker exec gargantext-cli -- CorpusFromGarg.csv ListFromGarg.csv Output.json

Analyzing the ngrams table repo

We store the repository in the repos directory, in the CBOR file format. To decode it to JSON and analyze it, for example with jq, use the following command:

$ cat repos/repo.cbor.v5 | stack exec gargantext-cbor2json | jq .

Documentation

To build documentation, run:

$ stack build --haddock --no-haddock-deps --fast

The generated documentation is placed in .stack-work/dist/x86_64-linux-nix/Cabal-3.2.1.0/doc/html/gargantext.

GraphQL

Some introspection information.

Playground is located at http://localhost:8008/gql

List all GraphQL types in the Playground

{
  __schema {
    types {
      name
    }
  }
}

List details about a type in GraphQL

{
  __type(name: "User") {
    fields {
      name
      description
      type {
        name
      }
    }
  }
}
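
The same queries can also be sent from the command line by wrapping them in a JSON body. This is a sketch: gql_payload is a hypothetical helper, and the curl call in the comment assumes that the server accepts POST requests with a JSON body at the same /gql URL as the Playground, which this document does not confirm.

```shell
# gql_payload is a hypothetical helper, not part of this repository.
# Usage: gql_payload QUERY
# Wraps a GraphQL query in the {"query": "..."} JSON body used for
# GraphQL over HTTP.
gql_payload() {
  # Escape backslashes and double quotes, then turn newlines and tabs
  # into spaces so the query is a valid single-line JSON string.
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' | tr '\n\t' '  ')
  printf '{"query":"%s"}\n' "$escaped"
}

# Example (assumption: the endpoint accepts POST at the Playground URL):
#   curl -s -H 'Content-Type: application/json' \
#     -d "$(gql_payload '{ __schema { types { name } } }')" \
#     http://localhost:8008/gql
```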

PostgreSQL

Upgrading using Docker

https://www.cloudytuts.com/tutorials/docker/how-to-upgrade-postgresql-in-docker-and-kubernetes/

To upgrade PostgreSQL in Docker containers, for example from 11.x to 14.x, first dump the existing data:

$ docker exec -it <container-id> pg_dumpall -U gargantua > 11-db.dump
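
Since the old volume may be removed in the next step, it is worth checking that the dump is usable before going further. A minimal sketch, assuming the dump was produced by pg_dumpall (which writes a "PostgreSQL database cluster dump" comment at the top of its output); check_dump is a hypothetical name.

```shell
# check_dump is a hypothetical helper, not part of this repository.
# Usage: check_dump FILE
# Succeeds if FILE is non-empty and contains the header comment that
# pg_dumpall writes at the top of its output.
check_dump() {
  dump=$1
  [ -s "$dump" ] && grep -q 'PostgreSQL database cluster dump' "$dump"
}
```

For example: check_dump 11-db.dump && echo "dump looks fine". This only checks the header, not that the dump is complete.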

Then, shut down the container and replace the image section in devops/docker/docker-compose.yaml with postgres:14. It is also good practice to create a new volume, say garg-pgdata14, and bind the new container to it. If you want to keep the same volume name, remember to remove the old volume first:

$ docker-compose rm postgres
$ docker volume rm docker_garg-pgdata

Now, start the container and execute:

$ # need to drop the empty DB first, since schema will be created when restoring the dump
$ docker exec -i <new-container-id> dropdb -U gargantua gargandbV5
$ # recreate the db, but empty with no schema
$ docker exec -i <new-container-id> createdb -U gargantua gargandbV5
$ # now we can restore the dump
$ docker exec -i <new-container-id> psql -U gargantua -d gargandbV5 < 11-db.dump

Upgrading using

There is a solution using pg_upgradecluster, but it requires managing both the version 13 and version 14 clusters. Here is a simpler way to upgrade.

First save your data:

$ sudo su postgres
$ pg_dumpall > gargandb.dump

Upgrade postgresql:

$ sudo apt install postgresql-14 postgresql-client-14
$ sudo apt remove --purge postgresql-13

Restore your data:

$ sudo su postgres
$ psql < gargandb.dump

You may need to restore the gargantua password. ALTER ROLE is an SQL statement, so run it from a psql prompt (still as the postgres user):

ALTER ROLE gargantua PASSWORD 'yourPasswordIn_gargantext-settings.toml';

You may also need to change the database connection port to 5433 in your gargantext.ini file.

haskell-language-server

If you want to use haskell-language-server for GHC 9.4.7, install it with ghcup:

ghcup compile hls --version 2.7.0.0 --ghc 9.4.7

https://haskell-language-server.readthedocs.io/en/latest/installation.html