Gargantext with Haskell (Backend instance)

Haskell  Nix  Cabal  Stack  GHC  Docker

Table of Contents

  1. About the project
  2. Installation
  3. Initialization
  4. Launch & develop GarganText
  5. Use cases
  6. GraphQL
  7. PostgreSQL

About the project

GarganText is a collaborative, decentralized, web-based macro-service platform for exploring unstructured texts. It combines natural language processing tools, text-mining bricks, complex network analysis algorithms and interactive data visualization to pave the way toward new kinds of interactions with your textual and digital corpora.

This is free software ("libre" in the French sense), developed by the CNRS Complex Systems Institute of Paris Île-de-France (ISC-PIF) and its partners.

GarganText Project: this repo builds the backend server; the frontend is built from the purescript-gargantext repository.

Installation

Disclaimer: this project is still in development, so this document is a work in progress. If you encounter any issues, please report them and help improve this documentation.

Prerequisites

1. Installation

This project can be built with either Stack or Cabal. For historical reasons, we generate a cabal.project from the stack.yaml, and we do not commit the former to the repo, to have a single "source of truth". However, it's always possible to generate a cabal.project thanks to stack2cabal.

Install Nix

Gargantext requires Nix to provide system dependencies (for example, C libraries), but its use is limited to that. In order to install Nix:

sh <(curl -L https://nixos.org/nix/install) --daemon

Verify the installation is complete with

nix-env --version
nix-env (Nix) 2.16.0

Important: Before building the project with either stack or cabal you need to be in the correct Nix shell, which will fetch all the required system dependencies. To do so, just type:

nix-shell

This will take a bit of time the first time.
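Forgetting to enter the Nix shell before building is an easy mistake, so a build wrapper may want to check for it first. The following is a minimal sketch, not part of this repository: the function names are hypothetical, and it relies on nix-shell exporting the IN_NIX_SHELL environment variable (set to "pure" or "impure").

```shell
# Hypothetical helper: succeed only when running inside nix-shell.
# Assumption: nix-shell exports IN_NIX_SHELL ("pure" or "impure").
in_nix_shell() {
    [ -n "${IN_NIX_SHELL:-}" ]
}

# Hypothetical guard: print a hint on stderr and fail when outside nix-shell.
require_nix_shell() {
    if ! in_nix_shell; then
        echo "Not in a Nix shell: run 'nix-shell' from the repository root first." >&2
        return 1
    fi
}
```

A build script could then start with `require_nix_shell || exit 1` before calling cabal or stack.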

Build: choose cabal (new) or stack (old)

With Cabal (recommended)

First, from inside nix-shell:

cabal update
cabal install

Once you have a valid version of cabal, building requires generating a valid cabal.project. This can be done by installing stack2cabal:

cabal v2-install stack2cabal-1.0.14

And finally:

stack2cabal --no-run-hpack -p '2023-06-25'
cabal v2-build

With Stack

Install Stack (or Haskell Tool Stack):

curl -sSL https://get.haskellstack.org/ | sh

Verify the installation is complete with

stack --version
Version 2.9.1

NOTE: The default build (with optimizations) requires a large amount of RAM (at least 16GB). To avoid long compilation times and heavy swapping, it is recommended to run stack build with the --fast flag, i.e.:

stack build --fast
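The choice between the default build and --fast can be automated from the amount of installed memory. This is only a sketch under stated assumptions: the helper names are hypothetical, the 16GB threshold is taken from the note above, and reading /proc/meminfo only works on Linux.

```shell
# Threshold from the note above: 16 GB expressed in kB (16 * 1024 * 1024).
MIN_KB_FOR_OPTIMIZED=16777216

# Hypothetical helper: print "--fast" when the given amount of memory
# (in kB) is below the threshold, and nothing otherwise.
stack_build_flags() {
    if [ "$1" -lt "$MIN_KB_FOR_OPTIMIZED" ]; then
        echo "--fast"
    fi
}

# Assumption: Linux, where /proc/meminfo contains a "MemTotal: <n> kB" line.
total_mem_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}
```

It could be used as `stack build $(stack_build_flags "$(total_mem_kb)")`. Note that a machine sold as 16GB usually reports slightly less than that in MemTotal, so it would still get --fast, which errs on the safe side.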

Keeping the cabal.project updated with stack.yaml

Simply run:

./bin/update-cabal-project

Initialization

1. Docker-compose will configure your database and some NLP bricks (such as CoreNLP):

# If docker is not installed:
# curl -sSL https://gitlab.iscpif.fr/gargantext/haskell-gargantext/raw/dev/devops/docker/docker-install | sh
cd devops/docker
docker compose up

The initialization schema should be loaded automatically (from devops/postgres/schema.sql).

(Optional) If using stack, then install:
stack install

2. Copy the configuration file:

cp gargantext.ini_toModify gargantext.ini

gargantext.ini is covered by .gitignore, so you can change the passwords in it without risk of adding them to the repository by mistake.
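Running the cp command a second time would overwrite any passwords already edited in gargantext.ini. A minimal sketch of a safer copy, assuming nothing beyond standard POSIX shell tools (the function name is hypothetical and not part of this repository):

```shell
# Hypothetical helper: copy the template to the target path, but leave an
# existing target untouched so edited passwords are never overwritten.
init_config() {
    template=$1
    target=$2
    if [ -e "$target" ]; then
        echo "$target already exists, leaving it untouched." >&2
        return 0
    fi
    cp "$template" "$target"
}
```

From the backend root folder, this would be called as `init_config gargantext.ini_toModify gargantext.ini`.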

3. A first user has to be created:

~/.local/bin/gargantext-init "gargantext.ini"

This creates user1 with the password 1resu.

4. Clone the frontend repository:

From the Backend root folder (haskell-gargantext):

git clone ssh://git@gitlab.iscpif.fr:20022/gargantext/purescript-gargantext.git

 

Launch & develop GarganText

Note: the Cabal method is used by default here.

From the Backend root folder (haskell-gargantext):

./start
# The start script runs the following commands:
# - `docker compose up`, from the devops/docker folder, to start the PostgreSQL container
# - `cabal run gargantext-server -- --ini gargantext.ini --run Prod`, through `nix-shell`, to run the other services

For frontend development and compilation, see the Frontend Readme.md

Use Cases

Multi-User with Graphical User Interface (Server Mode)

~/.local/bin/stack --docker exec gargantext-server -- --ini "gargantext.ini" --run Prod

Then you can log in with user1 / 1resu

Command Line Mode tools

Simple cooccurrences computation and indexation from a list of Ngrams

stack --docker exec gargantext-cli -- CorpusFromGarg.csv ListFromGarg.csv Output.json

Analyzing the ngrams table repo

The ngrams repository is stored in the repos directory, in the CBOR file format. To decode it to JSON and analyze it with, say, jq, use the following command:

cat repos/repo.cbor.v5 | stack exec gargantext-cbor2json | jq .

Documentation

To build documentation, run:

stack build --haddock --no-haddock-deps --fast

The generated HTML is written to .stack-work/dist/x86_64-linux-nix/Cabal-3.2.1.0/doc/html/gargantext.

GraphQL

Some introspection information.

Playground is located at http://localhost:8008/gql

List all GraphQL types in the Playground

{
  __schema {
    types {
      name
    }
  }
}

List details about a type in GraphQL

{
  __type(name: "User") {
    fields {
      name
      description
      type {
        name
      }
    }
  }
}
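Introspection queries can also be sent from the command line instead of the Playground. The sketch below makes assumptions that this README does not confirm: that the server accepts GraphQL queries as a JSON POST at the same URL as the Playground, and that no authentication is needed for introspection. The function names are hypothetical.

```shell
# Assumption: the GraphQL endpoint shares the Playground URL given above.
GQL_URL="http://localhost:8008/gql"

# Wrap a GraphQL query in the {"query": ...} JSON envelope.
# Only safe for queries without double quotes, backslashes or newlines,
# because no JSON escaping is performed.
gql_payload() {
    printf '{"query":"%s"}' "$1"
}

# POST a query with curl and print the raw JSON response.
gql_query() {
    curl -sS -H 'Content-Type: application/json' \
        --data "$(gql_payload "$1")" "$GQL_URL"
}
```

With the server running, `gql_query '{ __schema { types { name } } }' | jq .` would list the type names, provided the assumptions above hold.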

PostgreSQL

Upgrading using Docker

https://www.cloudytuts.com/tutorials/docker/how-to-upgrade-postgresql-in-docker-and-kubernetes/

To upgrade PostgreSQL in Docker containers, for example from 11.x to 14.x, first dump all databases from the running container:

docker exec -it <container-id> pg_dumpall -U gargantua > 11-db.dump
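Because the next steps may remove the old data volume, it is worth checking that the dump is usable first. A minimal sketch, assuming the plain-text output of pg_dumpall starts with its usual "PostgreSQL database cluster dump" comment header (the function name is hypothetical):

```shell
# Hypothetical helper: succeed when the file is non-empty and its first
# lines contain the pg_dumpall header comment.
# Assumption: pg_dumpall plain-text dumps begin with a comment block
# containing "PostgreSQL database cluster dump".
dump_looks_valid() {
    [ -s "$1" ] && head -n 5 "$1" | grep -q 'PostgreSQL database cluster dump'
}
```

For example, `dump_looks_valid 11-db.dump || echo "dump looks wrong, do not remove the volume"`. This only catches empty files and error messages redirected into the dump; it does not prove that the dump is complete.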

Then shut down the container and replace the image entry in devops/docker/docker-compose.yaml with postgres:14. It is also good practice to create a new volume, say garg-pgdata14, and bind the new container to it. If you want to keep the same volume instead, remember to remove the old one first:

docker-compose rm postgres
docker volume rm docker_garg-pgdata

Now, start the container and execute:

# need to drop the empty DB first, since schema will be created when restoring the dump
docker exec -i <new-container-id> dropdb -U gargantua gargandbV5
# recreate the db, but empty with no schema
docker exec -i <new-container-id> createdb -U gargantua gargandbV5
# now we can restore the dump
docker exec -i <new-container-id> psql -U gargantua -d gargandbV5 < 11-db.dump

Upgrading using

There is a solution using pg_upgradecluster, but it requires managing both the version 13 and version 14 clusters. Here is a simpler way to upgrade, based on a dump and restore.

First save your data:

sudo su postgres
pg_dumpall > gargandb.dump

Upgrade postgresql:

sudo apt install postgresql-14 postgresql-client-14
sudo apt remove --purge postgresql-13

Restore your data:

sudo su postgres
psql < gargandb.dump

You may need to reset the gargantua password (from a psql prompt):

ALTER ROLE gargantua PASSWORD 'yourPasswordIn_gargantext.ini';

You may also need to change the database connection port to 5433 in your gargantext.ini file.