Documentation removed after subtree tests. Adding it again.

Commit dcfd1f71 authored Oct 04, 2016 by delanoe
30 changed files
--- a/docs/tools/about.md
+++ b/docs/tools/about.md
--- a/docs/tools/about/credits.md
+++ b/docs/tools/about/credits.md
--- a/docs/tools/about/index.md
+++ b/docs/tools/about/index.md
+#Gargantext
+Welcome to the Gargantext documentation!
--- a/docs/tools/about/license.md
+++ b/docs/tools/about/license.md
--- a/docs/tools/about/release-notes.md
+++ b/docs/tools/about/release-notes.md
--- a/docs/tools/api_urls.md
+++ b/docs/tools/api_urls.md
+List of garg's own JSON API(s) urls
+===================================
+2016-05-27
+### /api/nodes/2
+```
+{
+  "id": 2,
+  "parent_id": 1,
+  "name": "abstract:\"evaporation+loss\"",
+  "typename": "CORPUS"
+}
+```
+------------------------------
+### /api/nodes?pagination_limit=-1
+```
+{
+  "records": [
+    {
+      "id": 9,
+      "parent_id": 2,
+      "name": "A recording evaporimeter",
+      "typename": "DOCUMENT"
+    },
+    (...)
+    {
+      "id": 119,
+      "parent_id": 81,
+      "name": "GRAPH EXPLORER COOC (in:81)",
+      "typename": "COOCCURRENCES"
+    }
+  ],
+  "count": 119,
+  "parameters": {
+      "formated": "json","pagination_limit": -1,
+      "fields": ["id","parent_id","name","typename"],
+      "pagination_offset": 0
+  }
+}
+```
+------------------------------
+### /api/nodes?types[]=CORPUS
+```
+{
+  "records": [
+    {
+      "id": 2,
+      "parent_id": 1,
+      "name": "abstract:\"evaporation+loss\"",
+      "typename": "CORPUS"
+    },
+    (...)
+    {
+      "id": 8181,
+      "parent_id": 1,
+      "name": "abstract:(astrogeology+OR ((space OR spatial) AND planetary) AND geology)",
+      "typename": "CORPUS"
+    }
+  ],
+  "count": 2,
+  "parameters": {
+        "pagination_limit": 10,
+        "types": ["CORPUS"],
+        "formated": "json",
+        "pagination_offset": 0,
+        "fields": ["id","parent_id","name","typename"]
+  }
+}
+```
+------------------------------
+### /api/nodes/5?fields[]=ngrams
+Where <5> is a doc_id or a list_id
+```
+{
+  "ngrams": [
+    [1.0,{"id":2299,"n":1,"terms":"designs"}],
+    [1.0,{"id":1917,"n":1,"terms":"height"}],
+    [1.0,{"id":1755,"n":2,"terms":"higher speeds"}],
+    [1.0,{"id":1940,"n":1,"terms":"cylinders"}],
+    [1.0,{"id":2221,"n":3,"terms":"other synthesized materials"}],
+    (...)
+    [2.0,{"id":1970,"n":1,"terms":"storms"}],
+    [9.0,{"id":1754,"n":2,"terms":"spherical gauges"}],
+    [1.0,{"id":1895,"n":1,"terms":"direction"}],
+    [1.0,{"id":2032,"n":1,"terms":"testing"}],
+    [1.0,{"id":1981,"n":2,"terms":"wind effects"}]
+  ]
+}
+```
+------------------------------
+### /api/nodes/3?fields[]=id&fields[]=hyperdata&fields[]=typename
+```
+{
+  "id": 3,
+  "typename": "DOCUMENT",
+  "hyperdata": {
+    "language_name": "English",
+    "language_iso3": "eng",
+    "language_iso2": "en",
+    "title": "A blabla analysis of laser treated aluminium blablabla",
+    "name": "A blabla analysis of laser treated aluminium blablabla",
+    "authors": "A K. Jain, V.N. Kulkarni, D.K. Sood",
+    "authorsRAW": [
+    {"name": "....", "affiliations": ["... Research Centre,.. 085, Country"]},
+    {"name": "....", "affiliations": ["... Research Centre,.. 086, Country"]}
+    (...)
+    ],
+    "abstract": "Laser processing of materials, being a rapid melt quenching process, quite often produces a surface which is far from being ideally smooth for ion beam analysis. (...)",
+    "genre": ["research-article"],
+    "doi": "10.1016/0029-554X(81)90998-8",
+    "journal": "Nuclear Instruments and Methods In Physics Research",
+    "publication_year": "1981",
+    "publication_date": "1981-01-01 00:00:00",
+    "publication_month": "01",
+    "publication_day": "01",
+    "publication_hour": "00",
+    "publication_minute": "00",
+    "publication_second": "00",
+    "id": "61076EB1178A97939B1C893904C77FB7DA2276D0",
+    "source": "elsevier",
+    "distributor": "istex"
+  }
+}
+```
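The URLs listed above all follow the same pattern: an optional node id in the path, then repeated `fields[]` / `types[]` parameters and an optional `pagination_limit`. The sketch below only builds such URLs with the Python standard library; it sends no request. The helper name `nodes_url` and its signature are hypothetical and are not part of Gargantext.

```python
from urllib.parse import urlencode


def nodes_url(node_id=None, fields=(), types=(), pagination_limit=None):
    """Build a relative /api/nodes URL (hypothetical helper, not a Gargantext API)."""
    path = "/api/nodes" if node_id is None else f"/api/nodes/{node_id}"
    # Array-style parameters are repeated once per value: fields[]=a&fields[]=b
    params = [("fields[]", field) for field in fields]
    params += [("types[]", typename) for typename in types]
    if pagination_limit is not None:
        params.append(("pagination_limit", pagination_limit))
    # safe="[]" keeps the square brackets readable instead of percent-encoding them
    query = urlencode(params, safe="[]")
    return f"{path}?{query}" if query else path


print(nodes_url(2))                                       # /api/nodes/2
print(nodes_url(types=["CORPUS"]))                        # /api/nodes?types[]=CORPUS
print(nodes_url(5, fields=["ngrams"]))                    # /api/nodes/5?fields[]=ngrams
print(nodes_url(pagination_limit=-1))                     # /api/nodes?pagination_limit=-1
```

The resulting path still has to be appended to the address of a running instance and sent with valid credentials.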
+## TODO: continue the list
--- a/docs/tools/automatic_install.md
+++ b/docs/tools/automatic_install.md
+#
--- a/docs/tools/contribute/ngrams/ngram_parsing_flow.dot
+++ b/docs/tools/contribute/ngrams/ngram_parsing_flow.dot
+// dot ngram_parsing_flow.dot -Tpng -o ngram_parsing_flow.png
+digraph ngramflow {
+    edge [fontsize=10] ;
+    label=<<B><U>gargantext.util.toolchain</U></B><BR/>(ngram extraction flow)>;
+    labelloc="t" ;
+    "extracted_ngrams" -> "grouplist" ;
+    "extracted_ngrams" -> "occs+ti_rank" ;
+    "project stoplist (todo)" -> "stoplist" ;
+    "stoplist" -> "mainlist" ;
+    "occs+ti_rank" -> "mainlist" [label="  TI_RANK_LIMIT"];
+    "mainlist" -> "coocs" [label="  COOCS_THRESHOLD"] ;
+    "coocs" -> "specificity" ;
+    "specificity" -> "maplist" [label="MAPLIST_LIMIT\nMONOGRAM_PART"];
+    "mainlist" -> "tfidf" ;
+    "tfidf" -> "explore" [label="doc relations with all map and candidates"];
+    "maplist" -> "explore" ;
+    "grouplist" -> "occs+ti_rank" ;
+    "grouplist" -> "coocs" ;
+    "grouplist" -> "tfidf" ;
+}
--- a/docs/tools/contribute/ngrams/ngram_parsing_flow.png
+++ b/docs/tools/contribute/ngrams/ngram_parsing_flow.png
--- a/docs/tools/contribution-guide.md
+++ b/docs/tools/contribution-guide.md
+#Contribution guide
+## Community
+* [http://gargantext.org/about](http://gargantext.org/about)
+* IRC Chat: (OFTC/FreeNode) #gargantex
+##Tools
+* gogs
+* server access
+* forge
+* gargantext box
+##Gargantext
+* Gargantext box install
+(S.I.R.= Setup Install & Run procedures)
+* Architecture Overview
+* Database Schema Overview
+* Interface design Overview
+##To do:
+* Docs
+* Interface design
+* Parsers/scrapers
+* Computing
+## How to contribute:
+    1. Clone the repo
+    2. Create a new branch <username>-refactoring
+    3. Run the gargantext-box
+    4. Code
+    5. Test
+    6. Commit
+### Example 1: Adding a parser
+* create your new file cern.py in gargantex/scrapers/
+* reference it in gargantex/scrapers/urls.py
+by adding this line:
+import scrapers.cern  as cern
+* reference it in gargantext/constants
+```
+# type 9
+    {   'name': 'Cern',
+        'parser': CernParser,
+        'default_language': 'en',
+    },
+```
+* add an API key in gargantex/settings
+### Example 2: User Interface Design
--- a/docs/tools/contribution-guide/archi.md
+++ b/docs/tools/contribution-guide/archi.md
--- a/docs/tools/contribution-guide/contribution.md
+++ b/docs/tools/contribution-guide/contribution.md
+#Contribution guide
+* A question or a problem? Ask the community
+* Sources
+* Tools
+* Contribution workflow: for contributions, bugs and features
+* Some examples of contributions
+## Community
+Need help? Ask the community
+* [http://gargantext.org/about](http://gargantext.org/about)
+* IRC Chat: (OFTC/FreeNode) #gargantex
+## Source
+Sources are available under the XXX LICENSE
+You can install Gargantext through the [installation procedure](./install.md)
+##Tools
+* gogs
+* forge.iscpif.fr
+* server access
+* gargantext box
+## Contributing: workflow procedure
+Once you have installed and tested Gargantext, you can:
+1. Clone the stable release into your project
+    Note: The current stable release <release_branch> is:  refactoring
+Inside the repo, check out the reference branch and pull the latest changes:
+git checkout <ref_branch>
+git pull
+It is highly recommended to create a generic branch on top of the stable release, such as:
+git checkout -b <username>-<release_branch>
+git pull
+2. Create your project on stable release
+git checkout -b <username>-<release_branch>-<project_name>
+Make your changes and commit them as often as you like:
+git commit -m "foo/bar/1"
+git commit -m "foo/bar/2"
+git push
+To save your local changes, you can merge them into your generic branch <username>-<release_branch>:
+git checkout <username>-<release_branch>
+git pull
+git merge <username>-<release_branch>-<project_name>
+git commit -m "[Merge OK] comment"
+##Technical Overview
+* Interface Overview
+* Database Schema Overview
+* Architecture Overview
+### Example 1: Adding a parser
+### Example 2: User Interface Design
--- a/docs/tools/contribution-guide/db.md
+++ b/docs/tools/contribution-guide/db.md
--- a/docs/tools/contribution-guide/dev.md
+++ b/docs/tools/contribution-guide/dev.md
--- a/docs/tools/contribution-guide/ngrams_lifecycle.md
+++ b/docs/tools/contribution-guide/ngrams_lifecycle.md
+Life cycle of ngram counts
+--------------------------
+### (current design and possible directions) ###
+The process that creates the counts can be split into two levels or stages:
+1.  the initial extraction and the storage of the weight of the
+    ngram-document relation (let's call these nodes "1doc")
+2.  everything else: preparing the aggregated counts for the terms
+    table ("stats"), and for the working tables of the graphs and of the
+    publication search.
+Level 1 could perhaps be called per-document indexing, and level 2 "modeling".
+Note that level 1 deals with **forms**, i.e. bare ngrams (the observed form <=> a character string that is unique after normalization), whereas level 2 deals with richer objects... As processing goes on we still end up with ngrams, but they are:
+  - filtered (we do not compute everything on everything)
+  - typed with the map, stop and main lists (and perhaps soon user
+    "ownlists")...
+  - grouped (what the `+` in the terms table shows, and which could
+    perhaps also be shown on the graph side?)
+In other words, at level 2 we handle **terms** rather than **forms**... they are still ngrams, but enriched by their inclusion in a series of mini models (aggregations and a usage-driven typology of ngrams).
+### Database tables
+Adopting this distinction between forms and terms clarifies when the contents of the tables must be updated. In terms of data structure, the counts are always stored as tuples, which can be summarized as follows:
+-   **1doc**: (doc:node - form:ngr - weight:float) in NodeNgram
+    tables
+-   **occs/gen/spec/tirank**: (measure_type:node - term:ngr -
+    weight:float) in NodeNgram tables
+-   **cooc**: (graph_type:node - term1:ngr - term2:ngr -
+    weight:float) in NodeNgramNgram tables
+-   **tfidf**: (publication_links_type:node - doc:node - term:ngr -
+    correlation:float) in NodeNodeNgram tables.
+Here "type" is the node that carries the nature of the resulting stat, or
+the reference of the graph for cooc, or of the index used by the
+publication search for tfidf.
+There are also relations that hold no counts but are
+essential to build the counts of the others:
+-   map/main/stoplist: (list_type:node - form or term:ngr) in
+    NodeNgram tables
+-   "groups": (mainform:ngr - subform:ngr) in NodeNgramNgram
+    tables.
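As a rough illustration of the tuple shapes listed above, the sketch below models one row of each kind with plain named tuples. The field names, node ids and weights are all hypothetical; only the table names and the column order come from the description.

```python
from collections import namedtuple

# Field names are assumptions; the real tables may name their columns differently.
NodeNgram = namedtuple("NodeNgram", "node_id ngram_id weight")
NodeNgramNgram = namedtuple("NodeNgramNgram", "node_id ngram1_id ngram2_id weight")
NodeNodeNgram = namedtuple("NodeNodeNgram", "node1_id node2_id ngram_id score")

# Hypothetical ids, chosen only to make the rows readable.
DOC, OCCS_NODE, GRAPH_NODE, TFIDF_NODE = 9, 100, 119, 120
STORMS, WIND_EFFECTS = 1970, 1981

rows = {
    # level 1: weight of one form in one document
    "1doc": NodeNgram(DOC, STORMS, 2.0),
    # level 2: aggregated stat, the node carries the kind of measure
    "occs": NodeNgram(OCCS_NODE, STORMS, 5.0),
    # level 2: cooccurrence of two terms, the node is the graph reference
    "cooc": NodeNgramNgram(GRAPH_NODE, STORMS, WIND_EFFECTS, 3.0),
    # level 2: term/document relation used by the publication search
    "tfidf": NodeNodeNgram(TFIDF_NODE, DOC, STORMS, 0.42),
}

for kind, row in rows.items():
    print(kind, row)
```

Note that "1doc" and "occs" share the same shape: only the type of the node in the first column tells them apart.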
+### Update scenarios
+In the course of the "user scenarios", several
+events **modify these counts**:
+1.  terms created by the user (e.g. by
+    selection/addition in the annotation view)
+2.  imports of terms that correspond to forms never indexed in
+    this corpus
+3.  terms ungrouped by the user
+4.  a term moved from the stoplist to the other lists
+5.  any other list change and/or creation of new
+    groups...
+In what follows, A to E refer to events 1 to 5 above.
+A and B are the only steps, apart from the initial extraction, where
+forms are added. Currently A and B are handled right away at
+level 1 (per-document tables): it seems best to re-index the 1doc
+entries as early as possible after A or B. In the annotation
+view, the user expects the highlighting to appear
+immediately on the displayed document. For the import B, it is convenient
+because the list of new terms is at hand, which avoids storing it
+somewhere until a later recalculation.
+The other information updated right away for A and B is membership
+in lists and groups (for B), which requires no computation.
+C, D and E do not affect level 1 (per-document tables) since they
+add no new forms, but they modify
+the lists and groups, and must therefore trigger an
+update of the tfidf (which requires a recalculation) and of the
+coocs on map (applied when a new graph is requested).
+C and D also require an update of the per-term stats
+(occurrences, gen/spec, etc.) since subforms and
+stoplist items do not appear in the stats.
+To sum up, in all cases:
+=> additions to a list or to a group, and any counting of a
+new form in the documents, are handled as soon as the user acts
+=> but the more "advanced" models, i.e. the
+occs, gen and spec stats and the "coocs on map" and
+"tfidf" working tables, must wait for a recalculation.
+Ideally, in the future they would all be updated incrementally
+instead of forcing this recalculation... but for now this is where things stand.
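The summary above can be restated as a small lookup: which updates happen at the time of the user action and which wait for a recalculation. This is only a restatement of the text as a sketch; the function name `update_plan` and the labels are hypothetical and do not exist in the toolchain.

```python
# Events A..E stand for items 1..5 of the list of events above.
ALL_EVENTS = ("A", "B", "C", "D", "E")
ADDS_NEW_FORMS = ("A", "B")   # only these two events introduce new forms


def update_plan(event):
    """Return what is updated immediately and what waits for a recalculation."""
    if event not in ALL_EVENTS:
        raise ValueError(f"unknown event: {event!r}")
    immediate = ["list and group membership"]
    if event in ADDS_NEW_FORMS:
        # new forms are re-indexed in the per-document tables right away
        immediate.insert(0, "1doc re-indexing")
    # in all cases the aggregated models wait for a recalculation
    deferred = ["occs/gen/spec stats", "coocs on map", "tfidf"]
    return {"immediate": immediate, "deferred": deferred}


for event in ALL_EVENTS:
    print(event, update_plan(event))
```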
+### Related functions
+|       | GUI | API action → url | VIEW | SUBROUTINES |
+|-------|-----|------------------|------|-------------|
+| A     | annotations/highlight.js, annotations/ngramlists.js | PUT → api/ngrams, PUT/DEL → api/ngramlists/change | ApiNgrams, ListChange | util.toolchain.ngrams_addition.index_new_ngrams |
+| B     | NGrams_dyna_chart_and_table | POST/PATCH → api/ngramlists/import | CSVLists | util.ngramlists_tools.import_ngramlists, util.ngramlists_tools.merge_ngramlists, util.toolchain.ngrams_addition.index_new_ngrams |
+| C,D,E | NGrams_dyna_chart_and_table | PUT/DEL → api/ngramlists/change, PUT/DEL → api/ngramlists/groups | ListChange, GroupChange | util.toolchain.ngrams_addition.index_new_ngrams |
+Import B was brought back into service a few weeks ago, and I have just
+reconnected A in the annotation view.
--- a/docs/tools/contribution-guide/website.md
+++ b/docs/tools/contribution-guide/website.md
--- a/docs/tools/contribution.md
+++ b/docs/tools/contribution.md
+#Contribution guide
+## Community
+* [http://gargantext.org/about](http://gargantext.org/about)
+* IRC Chat: (OFTC/FreeNode) #gargantex
+##Tools
+* gogs
+* server access
+* gargantext box
+##Gargantext
+* Gargantext box install
+see [install procedure](install.md)
+* Architecture Overview
+* Database Schema Overview
+* Interface design Overview
+##To do:
+* Docs
+* Interface design
+* [Parsers](./overview/parser.md) / [scrapers](./overview/scraper.md)
+* Computing
+## How to contribute:
+    1. Clone the repo
+    2. Create a new branch <username>-refactoring
+    3. Run the gargantext-box
+    4. Code
+    5. Test
+    6. Commit
--- a/docs/tools/corpus_example/pubmed.md5
+++ b/docs/tools/corpus_example/pubmed.md5
+94eb7bdf57557b72dcd1b93a42af044b  pubmed.zip
--- a/docs/tools/demo.md
+++ b/docs/tools/demo.md
--- a/docs/tools/demo/todo.md
+++ b/docs/tools/demo/todo.md
+# API
+Be more careful about authorizations.
+cf. "ng-resource".
+# Projects
+## Overview of all projects
+- re-implement deletion
+## Single project view
+- re-implement deletion
+# Taggers
+Path for data used by taggers should be defined in `gargantext.constants`.
+# Database
+# Sharing
+Here follows a brief description of how sharing could be implemented.
+## Database representation
+The database representation of sharing can be distributed among 4 tables:
+ - `persons`, of which items represent either a user or a group
+ - `relationships` describes the relationships between persons (affiliation
+     of a user to a group, contact between two users, etc.)
+ - `nodes` contains the projects, corpora, documents, etc. to share (they shall
+     inherit the sharing properties from their parents)
+ - `permissions` stores the relations between the tables described
+     above: it consists of only 2 foreign keys, plus an integer
+     between 1 and 3 representing the level of sharing, the start date
+     (when the sharing was set) and the end date (the time
+     at which sharing was removed, `NULL` if it is still active)
+## Python code
+The permission levels should be set in `gargantext.constants`, and defined as:
+```python
+PERMISSION_NONE = 0     # 0b0000
+PERMISSION_READ = 1     # 0b0001
+PERMISSION_WRITE = 3    # 0b0011
+PERMISSION_OWNER = 7    # 0b0111
+```
+The requests to check for permissions (or add new ones) should not be rewritten
+every time. They should be "hidden" within the models:
+ - `Person.owns(node)` returns a boolean
+ - `Person.can_read(node)` returns a boolean
+ - `Person.can_write(node)` returns a boolean
+ - `Person.give_right(node, permission)` gives a right to a given user
+ - `Person.remove_right(node, permission)` removes a right from a given user
+ - `Person.get_nodes(permission[, type])` returns an iterator on the list of
+    nodes on which the person has at least the given permission (optional
+    argument: type of requested node)
+ - `Node.get_persons(permission[, type])` returns an iterator on the list of
+    users who have at least the given permission on the node (optional argument:
+    type of requested persons, such as `USER` or `GROUP`)
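Because each permission level contains the bits of the levels below it, the boolean helpers listed above can be reduced to one bitmask test. The sketch below is a minimal in-memory illustration of that idea, not the model code: the `Person` class, its `rights` dictionary and the `includes` helper are assumptions, and no database is involved.

```python
PERMISSION_NONE = 0     # 0b0000
PERMISSION_READ = 1     # 0b0001
PERMISSION_WRITE = 3    # 0b0011
PERMISSION_OWNER = 7    # 0b0111


def includes(granted, required):
    """True when every bit of `required` is also set in `granted`."""
    return (granted & required) == required


class Person:
    """In-memory stand-in for the proposed model (hypothetical, no database)."""

    def __init__(self, username):
        self.username = username
        self.rights = {}    # node id -> granted permission level

    def give_right(self, node_id, permission):
        current = self.rights.get(node_id, PERMISSION_NONE)
        self.rights[node_id] = current | permission

    def can_read(self, node_id):
        return includes(self.rights.get(node_id, PERMISSION_NONE), PERMISSION_READ)

    def can_write(self, node_id):
        return includes(self.rights.get(node_id, PERMISSION_NONE), PERMISSION_WRITE)

    def owns(self, node_id):
        return includes(self.rights.get(node_id, PERMISSION_NONE), PERMISSION_OWNER)


david = Person("David")
david.give_right(12, PERMISSION_WRITE)
print(david.can_read(12), david.can_write(12), david.owns(12))   # True True False
```

With this encoding, granting `WRITE` implies `READ` and granting `OWNER` implies both, so callers never have to check several levels in turn.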
+## Example
+Let's imagine the `persons` table contains the following data:
+| id | type  | username  |
+|----|-------|-----------|
+| 1  | USER  | David     |
+| 2  | GROUP | C.N.R.S.  |
+| 3  | USER  | Alexandre |
+| 4  | USER  | Untel     |
+| 5  | GROUP | I.S.C.    |
+| 6  | USER  | Bidule    |
+Assume "David" owns the groups "C.N.R.S." and "I.S.C.", "Alexandre" belongs to
+the group "I.S.C.", with "Untel" and "Bidule" belonging to the group "C.N.R.S.".
+"Alexandre" and "David" are in contact.
+The `relationships` table then contains:
+| person1_id | person2_id | type    |
+|------------|------------|---------|
+| 1          | 2          | OWNER   |
+| 1          | 5          | OWNER   |
+| 3          | 5          | MEMBER  |
+| 4          | 2          | MEMBER  |
+| 6          | 2          | MEMBER  |
+| 1          | 3          | CONTACT |
+The `nodes` table is populated as such:
+| id | type     | name                 |
+|----|----------|----------------------|
+| 12 | PROJECT  | My super project     |
+| 13 | CORPUS   | A given corpus       |
+| 13 | CORPUS   | The corpus           |
+| 14 | DOCUMENT | Some document        |
+| 15 | DOCUMENT | Another document     |
+| 16 | DOCUMENT | Yet another document |
+| 17 | DOCUMENT | Last document        |
+| 18 | PROJECT  | Another project      |
+| 19 | PROJECT  | That project         |
+If we want to express that "David" created "My super project" (and its children)
+and wants everyone in "C.N.R.S." to be able to view it, but not modify it,
+`permissions` should contain:
+| person_id | node_id | permission |
+|-----------|---------|------------|
+| 1         | 12      | OWNER      |
+| 2         | 12      | READ       |
+If "David" also wanted "Alexandre" (and no one else) to view and modify "The
+corpus" (and its children), we would have:
+| person_id | node_id | permission |
+|-----------|---------|------------|
+| 1         | 12      | OWNER      |
+| 2         | 12      | READ       |
+| 3         | 13      | WRITE      |
+If "Alexandre" created "That project" and wants "Bidule" (and no one else) to be
+able to view and modify it (and its children), the table should then have:
+| person_id | node_id | permission |
+|-----------|---------|------------|
+| 3         | 19      | OWNER      |
+| 6         | 19      | WRITE      |
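To see how such rows could be resolved into an effective right, the sketch below combines the permissions held by a person, by the groups the person belongs to, and by the parents of the node. It uses names instead of ids and follows the second `permissions` table of the example. Several points are assumptions: group memberships are taken from the prose description, "The corpus" is assumed to be a child of "My super project", and the function name `effective_permission` is hypothetical.

```python
PERMISSION_NONE, PERMISSION_READ, PERMISSION_WRITE, PERMISSION_OWNER = 0, 1, 3, 7

# Group memberships, as stated in the prose of the example.
MEMBER_OF = {
    "Alexandre": {"I.S.C."},
    "Untel": {"C.N.R.S."},
    "Bidule": {"C.N.R.S."},
}

# ASSUMPTION: "The corpus" is a child of "My super project".
PARENT = {"The corpus": "My super project"}

# Second permissions table of the example, with names instead of ids.
PERMISSIONS = {
    ("David", "My super project"): PERMISSION_OWNER,
    ("C.N.R.S.", "My super project"): PERMISSION_READ,
    ("Alexandre", "The corpus"): PERMISSION_WRITE,
}


def effective_permission(person, node):
    """Combine rights held directly, through groups, and on parent nodes."""
    holders = {person} | MEMBER_OF.get(person, set())
    level = PERMISSION_NONE
    while node is not None:
        for holder in holders:
            level |= PERMISSIONS.get((holder, node), PERMISSION_NONE)
        node = PARENT.get(node)     # walk up: children inherit from parents
    return level


print(effective_permission("Untel", "The corpus"))       # 1 (READ, via C.N.R.S.)
print(effective_permission("Alexandre", "The corpus"))   # 3 (WRITE, granted directly)
print(effective_permission("David", "The corpus"))       # 7 (OWNER, via the project)
```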
--- a/docs/tools/demo/tuto.md
+++ b/docs/tools/demo/tuto.md
+#User guide
+1. Login
+run the Gargantext box following the install procedure
+open a web browser at http://127.0.0.1:8000/
+click on Test Gargantext
+login with:
+```
+Login : gargantua
+Password : autnagrag
+```
+2. Create a project
+3. Import an existing corpus
+4. Create corpus from search
+5. Explore stats
+6. Explore graphs
+7. Query
+8. Refine
+* Time periods
+* Nodes
+9. Export
--- a/docs/tools/discover.md
+++ b/docs/tools/discover.md
+#Architecture Overview
+#Database Schema
+#Website
--- a/docs/tools/index.md
+++ b/docs/tools/index.md
+Gargantext is a web platform to explore your corpora using text-mining[...](about.md)
+## Getting started
+* [Install](install.md) the Gargantext box
+* [Take a tour](demo.md) of the different features offered by Gargantext
+##Need some help?
+Ask the community at:
+* [http://gargantext.org/about](http://gargantext.org/about)
+* IRC Chat: (OFTC/FreeNode) #gargantex
+##Want to contribute?
+* take a look at the [architecture overview](overview.md)
+* read the [contribution guide](contribution-guide.md)
+## News
+## Credits and acknowledgments
--- a/docs/tools/install.md
+++ b/docs/tools/install.md
+#Install Instructions for Gargamelle:
+Gargamelle is the Gargantext platform toolbox: a full platform system
+with minimal modules.
+First you need to get the source code to install it.
+The folder will be /srv/gargantext:
+* docs contains all the information on Gargantext
+    /srv/gargantext/docs/
+* install contains all the installation files
+    /srv/gargantext/install/
+Need help?
+See [http://gargantext.org/about](http://gargantext.org/about) and [tools](./contribution_guide.md) for the community
+## Get the source code
+by cloning gargantext into /srv/gargantext
+``` bash
+git clone ssh://gitolite@delanoe.org:1979/gargantext /srv/gargantext \
+        && cd /srv/gargantext \
+        && git fetch origin stable \
+        && git checkout stable
+```
+## Install
+ ```bash
+ # go into the directory
+ user@computer: cd /srv/gargantext/
+ # go inside the installation folder
+ user@computer: cd install
+ # run the installation
+ user@computer: ./install
+ ```
+The installation requires creating a user for Gargantext; you will be asked for:
+```bash
+Username (leave blank to use 'gargantua'):
+#email is not mandatory
+Email address:
+Password:
+Password (again):
+```
+If this step succeeds, you should see:
+```bash
+Superuser created successfully.
+[ ok ] Stopping PostgreSQL 9.5 database server: main.
+```
+## Run
+Once the installation is done, the Gargantext platform will be available at localhost:8000
+To start the Gargantext platform:
+ ``` bash
+ # go into the directory
+ user@computer: cd /srv/gargantext/
+ # start Gargantext
+ user@computer: ./start
+ # press ctrl+d or type exit in the terminal to quit
+ ```
+Then open a Chromium browser and go to localhost:8000
+Click on "Enter Gargantext"
+Log in with the username and password you created
+Enjoy! ;)
--- a/docs/tools/manual_install.md
+++ b/docs/tools/manual_install.md
+* Create user gargantua
+The main user of Gargantext is Gargantua (the role of Pantagruel is coming soon)!
+``` bash
+sudo adduser --disabled-password --gecos "" gargantua
+```
+* Create the directories you need
+In this example, the gargantext package will be installed in /srv/
+``` bash
+for dir in "/srv/gargantext" \
+           "/srv/gargantext_lib" \
+           "/srv/gargantext_static" \
+           "/srv/gargantext_media" \
+           "/srv/env_3-5"; do
+    sudo mkdir -p "$dir"
+    sudo chown gargantua:gargantua "$dir"
+done
+```
+You should see:
+```bash
+$ tree /srv
+/srv
+├── env_3-5
+├── gargantext
+├── gargantext_lib
+├── gargantext_media
+└── gargantext_static
+```
+* Get the main libraries
+Download and uncompress the libraries, then give the main user access to them.
+Please be patient: given the size of the library packages (27 GB),
+this step can take a long time.
+``` bash
+wget http://dl.gargantext.org/gargantext_lib.tar.bz2 \
+&& tar xvjf gargantext_lib.tar.bz2 -C /srv/gargantext_lib \
+&& sudo chown -R gargantua:gargantua /srv/gargantext_lib \
+&& echo "Libs installed"
+```
+* Get the source code of Gargantext
+by cloning the repository of gargantext
+``` bash
+git clone ssh://gitolite@delanoe.org:1979/gargantext /srv/gargantext \
+        && cd /srv/gargantext \
+        && git fetch origin refactoring \
+        && git checkout refactoring
+```
+    TODO(soon): git clone https://gogs.iscpif.fr/gargantext.git
+See the [next steps of installation procedure](install.md#Install)
--- a/docs/tools/overview.md
+++ b/docs/tools/overview.md
+#Architecture Overview
+#Database Schema
+#Website
--- a/docs/tools/overview/parser.md
+++ b/docs/tools/overview/parser.md
+# HOW TO: Reference a new web scraper/API + parser
+## Global scope
+Three main steps:
+- develop and index a parser
+in gargantext.util.parsers
+- develop and index a scraper
+in gargantext.moissonneurs
+- adapt forms for a new source
+in templates and views
+## Reference parser into gargantext website
+gargantext website is stored in gargantext/gargantext
+### reference your new parser in constants.py
+* import your parser l.125
+```
+from gargantext.util.parsers import \
+    EuropressParser, RISParser, PubmedParser, ISIParser, CSVParser, ISTexParser, CernParser
+```
+The parser corresponds to the name of the parser referenced in gargantext/util/parser
+here the name is CernParser
+* index your RESOURCETYPE
+in RESOURCETYPES (l.145) **at the end of the list**
+```
+# type 10
+   {    "name": 'SCOAP (XML MARC21 Format)',
+        "parser": CernParser,
+        "default_language": "en",
+        'accepted_formats':["zip","xml"],
+   },
+```
+    Note that the name here is made of the API_name (SCOAP) + (GENERICFILETYPE FORMAT_XML Format)
+    The naming is complex because it encodes three things:
+        * the name of the API (distinct from the producing organization)
+        * the type of format: XML
+        * the XML standard of this format: MARC21 (cf. CernParser in gargantext/util/parser/Cern.py )
+The default_language corresponds to the default accepted language, which **should load** the corresponding default tagger
+```
+from gargantext.util.taggers import NltkTagger
+```
+    TO DO: load tagger types on demand depending on the languages and the install
+    TO DO: offer a module to download additional parsers
+    TO DO: provide install tagger module scripts inside lib
+The formats correspond to the file types accepted when the file is uploaded through the parsing
+form available in `gargantext/view/pages/projects.py` and
+exposed in `/templates/pages/projects/project.html`
+## reference your parser script
+## add your parser script into folder gargantext/util/parser/
+here my filename was Cern.py
+##declare it into gargantext/util/parser/__init__.py
+from .Cern  import CernParser
+At this step, you will be able to see your parser and add a file with the form
+but nothing will happen
+## the right way to write the parser script
+Three main and only requirements:
+* your parser class should inherit from the base class _Parser()
+`gargantext/gargantext/util/parser/_Parser`
+* your parser class must have a parse method that takes a **file buffer** as input
+* your parser must structure and store data in a variable named **hyperdata_list**
+so that it is properly indexed by the toolchain
+! Be careful with the date format: provide publication_date as a string in the format YYYY-mm-dd HH:MM:SS
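The three requirements above can be sketched as follows. This is not the Gargantext code: `_Parser` here is a local stand-in for the real base class, `TabParser` and its tab-separated input format are invented for the illustration, and the file buffer is assumed to yield text lines.

```python
import io
from datetime import datetime


class _Parser:
    """Local stand-in for the real base class (assumption, not the Gargantext one)."""

    def __init__(self):
        self.hyperdata_list = []


class TabParser(_Parser):
    """Hypothetical parser for lines of the form '<title>TAB<YYYY-mm-dd>'."""

    def parse(self, file):
        for line in file:
            line = line.strip()
            if not line:
                continue
            title, day = line.split("\t")
            # Normalize the date to the expected 'YYYY-mm-dd HH:MM:SS' string
            date = datetime.strptime(day, "%Y-%m-%d").strftime("%Y-%m-%d %H:%M:%S")
            self.hyperdata_list.append({"title": title, "publication_date": date})
        return self.hyperdata_list


parser = TabParser()
parser.parse(io.StringIO("A recording evaporimeter\t1981-01-01\n"))
print(parser.hyperdata_list)
```

Building the date with `strftime` rather than by hand keeps the string in the required format even when the source only provides a day.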
+# Adding a scraper API to offer a search option:
+In progress
+* Add the pop-up question "Do you have a corpus?"
+with a search option in /templates/pages/projects/project.html line 181
+## Reference a scraper (moissonneur) into gargantext
+* adding accepted_formats in constants
+* adding check_file routine in Form check ==> but should inherit from utils/files.py
+that also has the upload size limit check implemented
+# Suggestions for next steps:
+* XML parser MARC21 UNIMARC ...
+* A project type is determined by the first element added, i.e.
+the first element determines the corpus type of all the corpora within the project
--- a/docs/tools/resource.md
+++ b/docs/tools/resource.md
+#resources
+Adding a new source into Gargantext requires a previous declaration
+of the source inside constants.py
+```python
+RESOURCETYPES= [
+{    "type":9, #give a unique type int
+      "name": 'SCOAP [XML]', #resource name as proposed into the add corpus FORM [generic format]
+      "parser": "CernParser", #name of the new parser class inside a CERN.py file (set to None if not implemented)
+      "format": 'MARC21', #specific format
+      'file_formats':["zip","xml"],# accepted file format
+      "crawler": "CernCrawler", #name of the new crawler class inside a CERN.py file (set to None if no Crawler implemented)
+      'default_languages': ['en', 'fr'], #supported default languages of the source
+ },
+ ...
+ ]
+```
+## adding a new parser
+Once you have declared your new parser inside constants.py,
+add your new parser file into /srv/gargantext/util/parser/
+following this naming convention:
+* Filename must be in uppercase without the Parser mention.
+  eg. MailParser => MAIL.py
+* Inside this file the Parser must be named exactly as declared under parser in constants.py
+* Your new parser shall inherit from the base class Parser and provide a parse(filebuffer) method
+```python
+  #!/usr/bin/env python3
+  #filename:/srv/gargantext/util/parser/MAIL.py:
+  from ._Parser import Parser
+  class MailParser(Parser):
+      def parse(self, file):
+          ...
+```
+## adding a new crawler
+Once you have declared your new crawler inside constants.py,
+add your new crawler file into /srv/gargantext/util/crawler/
+following this naming convention:
+* Filename must be in uppercase without the Crawler mention.
+  eg. MailCrawler => MAIL.py
+* Inside this file the Crawler must be named exactly as declared under crawler in constants.py
+* Your new crawler shall inherit from the base class Crawler and provide three methods:
+  * scan_results => ids
+  * sample => yes/no
+  * fetch
+```python
+  #!/usr/bin/env python3
+  #filename:/srv/gargantext/util/crawler/MAIL.py:
+  from ._Crawler import Crawler
+  class MailCrawler(Crawler):
+      def scan_results(self, query):
+          ...
+          self.ids = set()
+      def sample(self, results_nb):
+          ...
+      def fetch(self, ids):
+          ...
+```
--- a/docs/tools/schemas/ngram_parsing_flow.dot
+++ b/docs/tools/schemas/ngram_parsing_flow.dot
+// dot ngram_parsing_flow.dot -Tpng -o ngram_parsing_flow.png
+digraph ngramflow {
+    edge [fontsize=10] ;
+    label=<<B><U>gargantext.util.toolchain</U></B><BR/>(ngram extraction flow)>;
+    labelloc="t" ;
+    "extracted_ngrams" -> "grouplist" ;
+    "extracted_ngrams" -> "occs+tfidfs" ;
+    "main_user_stoplist" -> "stoplist" ;
+    "stoplist" -> "mainlist" ;
+    "occs+tfidfs" -> "mainlist" [label="  TFIDF_LIMIT"];
+    "mainlist" -> "coocs" [label="  COOCS_THRESHOLD"] ;
+    "coocs" -> "specificity" ;
+    "specificity" -> "maplist" [label="MAPLIST_LIMIT\nMONOGRAM_PART"];
+    "maplist" -> "explore" ;
+    "grouplist" -> "maplist" ;
+}
--- a/docs/tools/schemas/ngram_parsing_flow.png
+++ b/docs/tools/schemas/ngram_parsing_flow.png