Commit a26bb0f6 authored by delanoe

Merge branch 'romain-reintegration-graphExplorer' into anoe-graph

parents b2067a0c f5314bfa
# Install Instructions for Gargantext (CNRS):
## Get the source code
Clone gargantext into /srv/gargantext:
``` bash
git clone ssh://gitolite@delanoe.org:1979/gargantext /srv/gargantext \
&& cd /srv/gargantext \
&& git fetch origin stable \
&& git checkout stable
```
The folder will be /srv/gargantext:
* docs contains all the documentation on gargantext:
/srv/gargantext/docs/
* install contains all the installation files:
/srv/gargantext/install/

Prepare your environment and make the initial installation.
Once you have set up and installed the Gargantext box, you can use the ./install/run.sh utility
to load the gargantext web platform and access it through your web browser, as sketched below.
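For example, once setup and install are done, a typical session looks like this (a sketch; the paths assume the default /srv/gargantext layout):
``` bash
# load the gargantext web platform...
cd /srv/gargantext
./install/run.sh
# ...then access it through your browser
chromium http://127.0.0.1:8000/
```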
Help needed?
See [http://gargantext.org/about](http://gargantext.org/about) and [tools](./contribution_guide.md) for the community.
______________________________
Two installation procedures are provided:
1. Semi-automatic installation [EASY]
2. Step-by-step installation [ADVANCED]

Only the semi-automatic installation is covered here; checkout [manual_install](manual_install.md)
to follow the step-by-step procedure.

It goes through four stages:
1. [Prerequisites](#prerequisites)
2. [Setup](#setup)
3. [Install](#install)
4. [Run](#run)
______________________________
# Semi-automatic installation
All the procedure files are located in /srv/gargantext/install/
``` bash
user@computer:$ cd /srv/gargantext/install/
```
## Prerequisites
* A Debian-based OS >= [FIXME]
* At least 35GB free in /srv/ [FIXME]
  * todo: reduce the size of gargantext lib
  * todo: remove lib once docker is configured
  * tip: if /srv/ is too small but you have enough space for the full package elsewhere, you can:
    * resize your partition
    * make a symlink to gargantext_lib (see the sketch after this list)
* A [docker engine installation](https://docs.docker.com/engine/installation/linux/)
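A minimal sketch of the symlink workaround (the /data path is hypothetical; adapt it to wherever you have enough space):
``` bash
# keep the heavy gargantext_lib on a roomy partition, linked from /srv
mv /srv/gargantext_lib /data/gargantext_lib
ln -s /data/gargantext_lib /srv/gargantext_lib
```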
## Setup
Prepare your environment and make the initial setup.
Setup can be done in 2 ways:
* [automatic setup](setup.sh): run the provided setup script (see the example below)
* [manual setup](manual_setup.md): follow this procedure if you want to change some parameters
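The automatic route boils down to running the script from the install folder (a sketch, assuming setup.sh sits in /srv/gargantext/install/):
``` bash
cd /srv/gargantext/install/
./setup.sh
```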
## Install
Two installation procedures are currently proposed:
* the docker way [EASY]
* the debian way [ADVANCED]

### Docker way [EASY]
#### Init setup
This initial step creates a user for the gargantext platform and downloads the additional libs and files.
* Install docker
See the [installation instructions for your distribution](https://docs.docker.com/engine/installation/)
* Build your docker image
This builds the docker image and the gargantext box:
``` bash
cd /srv/gargantext/install/docker/dev
./build
# or, equivalently:
# ID=$(docker build .) && docker run -i -t $ID
```
You should see
```
Successfully built <container_id>
```
#### Install
Once the init step is done:
* Enter the docker environment
``` bash
/srv/gargantext/install/docker/enterGargantextImage
```
* Go to the installation folder
``` bash
root@dockerimage8989809:$ cd /srv/gargantext/install/
```
[HERE] Check whether the postgresql and python configurations are already done upstream when the docker image is created
* Install Python environment
Inside the docker image, execute as root:
``` bash
/srv/gargantext/install/python/configure
```
* Configure PostgreSQL
Inside the docker image, execute as root:
``` bash
/srv/gargantext/install/postgres/configure
```
* Exit the docker
```
exit (or Ctrl+D)
```
[If OK] remove these lines
Install the Gargantext server
* Enter docker container
``` bash
/srv/gargantext/install/docker/enterGargantextImage
```
* Configure the database
Inside the docker container:
``` bash
service postgresql start
#su gargantua
#activate the virtualenv
source /srv/env_3-5/bin/activate
```
You have now entered the virtualenv, as shown by the (env_3-5) prompt:
``` bash
(env_3-5) $ python /srv/gargantext/dbmigrate.py
(env_3-5) $ /srv/gargantext/manage.py makemigrations
(env_3-5) $ /srv/gargantext/manage.py migrate
(env_3-5) $ python /srv/gargantext/dbmigrate.py
#will create the tables, but not hyperdata_nodes
(env_3-5) $ python /srv/gargantext/dbmigrate.py
#will create table hyperdata_nodes
#launch the server a first time to create the first user
(env_3-5) $ /srv/gargantext/manage.py runserver 0.0.0.0:8000
(env_3-5) $ /srv/gargantext/init_accounts.py /srv/gargantext/install/init/account.csv
```
FIXME: dbmigrate needs to be launched several times since tables are
created in alphabetical order (and not in dependency order)
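Until that is fixed, the workaround is simply to rerun the script: each pass creates the tables whose dependencies already exist.
``` bash
(env_3-5) $ python /srv/gargantext/dbmigrate.py
(env_3-5) $ python /srv/gargantext/dbmigrate.py  # repeat until no table is missing
```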
### Debian way [ADVANCED]
See the step-by-step procedure in [manual_install](manual_install.md).

* Exit the docker
```
exit (or Ctrl+D)
```
## Run Gargantext
Enter the docker container:
``` bash
/srv/gargantext/install/docker/enterGargantextImage
```
Inside the docker container:
``` bash
#start Database (postgresql)
service postgresql start
#change to user
su gargantua
#activate the virtualenv
source /srv/env_3-5/bin/activate
#go to gargantext srv
(env_3-5) $ cd /srv/gargantext/
#run the server
(env_3-5) $ ./manage.py runserver 0.0.0.0:8000
```
* Launch a browser
Keep the server running and, outside the docker, launch your browser:
``` bash
chromium http://127.0.0.1:8000/
```
* Click on Test Gargantext
```
Login : gargantua
Password : autnagrag
```
Enjoy :)
See the [User Guide](/demo/tuto.md) for a quick usage example.
......@@ -12,14 +12,16 @@ LISTTYPES = {
'STOPLIST' : UnweightedList,
'MAINLIST' : UnweightedList,
'MAPLIST' : UnweightedList,
'SPECCLUSION' : WeightedList,
'GENCLUSION' : WeightedList,
'OCCURRENCES' : WeightedIndex, # could be WeightedList
'COOCCURRENCES': WeightedMatrix,
'TFIDF-CORPUS' : WeightedIndex,
'TFIDF-GLOBAL' : WeightedIndex,
'TIRANK-LOCAL' : WeightedIndex, # could be WeightedList
'TIRANK-GLOBAL' : WeightedIndex, # could be WeightedList
}
# 'OWNLIST' : UnweightedList, # £TODO use this for any term-level tags
NODETYPES = [
# TODO separate id not array index, read by models.node
......@@ -37,7 +39,7 @@ NODETYPES = [
'COOCCURRENCES', # 9
# scores
'OCCURRENCES', # 10
'SPECCLUSION', # 11
'CVALUE', # 12
'TFIDF-CORPUS', # 13
'TFIDF-GLOBAL', # 14
......@@ -47,6 +49,7 @@ NODETYPES = [
# more scores (sorry!)
'TIRANK-LOCAL', # 16
'TIRANK-GLOBAL', # 17
'GENCLUSION', # 18
]
INDEXED_HYPERDATA = {
......@@ -222,12 +225,16 @@ DEFAULT_RANK_CUTOFF_RATIO = .75 # MAINLIST maximum terms in %
DEFAULT_RANK_HARD_LIMIT = 5000 # MAINLIST maximum terms abs
# (makes COOCS larger ~ O(N²) /!\)
DEFAULT_COOC_THRESHOLD = 3 # inclusive minimum for COOCS coefs
# (makes COOCS more sparse)
DEFAULT_MAPLIST_MAX = 350 # MAPLIST maximum terms
DEFAULT_MAPLIST_MONOGRAMS_RATIO = .2 # quota of monograms in MAPLIST
# (vs multigrams = 1-mono)
DEFAULT_MAPLIST_GENCLUSION_RATIO = .6 # quota of top genclusion in MAPLIST
# (vs top specclusion = 1-gen)
DEFAULT_MAX_NGRAM_LEN = 7 # limit used after POStagging rule
# (initial ngrams number is a power law of this /!\)
......@@ -272,7 +279,7 @@ DOWNLOAD_DIRECTORY = UPLOAD_DIRECTORY
# about batch processing...
BATCH_PARSING_SIZE = 256
BATCH_NGRAMSEXTRACTION_SIZE = 3000 # how many distinct ngrams before INTEGRATE
# Scrapers config
......@@ -282,7 +289,7 @@ QUERY_SIZE_N_DEFAULT = 1000
# Grammar rules for chunking
RULE_JJNN = "{<JJ.*>*<NN.*|>+<JJ.*>*}"
RULE_NPN = "{<JJ.*>*<NN.*>+((<P|IN> <DT>? <JJ.*>* <NN.*>+ <JJ.*>*)|(<JJ.*>))*}"
RULE_TINA = "^((VBD,|VBG,|VBN,|CD.?,|JJ.?,|\?,){0,2}?(N.?.?,|\?,)+?(CD.,)??)\
+?((PREP.?|DET.?,|IN.?,|CC.?,|\?,)((VBD,|VBG,|VBN,|CD.?,|JJ.?,|\?\
,){0,2}?(N.?.?,|\?,)+?)+?)*?$"
......@@ -42,6 +42,9 @@ CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']
CELERY_IMPORTS = ("gargantext.util.toolchain", "graph.cooccurrences")
# garg's custom unittests runner (adapted to our db models)
TEST_RUNNER = 'unittests.framework.GargTestRunner'
# Application definition
INSTALLED_APPS = [
......@@ -123,6 +126,9 @@ DATABASES = {
'PASSWORD': 'C8kdcUrAQy66U',
'HOST': '127.0.0.1',
'PORT': '5432',
'TEST': {
'NAME': 'test_gargandb',
},
}
}
......
......@@ -19,7 +19,7 @@ from gargantext.constants import DEFAULT_CSV_DELIM, DEFAULT_CSV_DELIM_GRO
# import will implement the same text cleaning procedures as toolchain
from gargantext.util.toolchain.parsing import normalize_chars
from gargantext.util.toolchain.ngrams_extraction import normalize_forms
from sqlalchemy.sql import exists
from os import path
......
from gargantext.util.languages import languages
from gargantext.constants import LANGUAGES, DEFAULT_MAX_NGRAM_LEN, RULE_JJNN, RULE_NPN
import nltk
import re
......
......@@ -39,11 +39,11 @@ def do_mainlist(corpus,
# retrieve helper nodes if not provided
if not ranking_scores_id:
ranking_scores_id = session.query(Node.id).filter(
Node.typename == "TFIDF-GLOBAL",
Node.typename == "TIRANK-GLOBAL",
Node.parent_id == corpus.id
).first()
if not ranking_scores_id:
raise ValueError("MAINLIST: TFIDF node needed for mainlist creation")
raise ValueError("MAINLIST: TIRANK node needed for mainlist creation")
if not stoplist_id:
stoplist_id = session.query(Node.id).filter(
......
......@@ -9,37 +9,49 @@ from gargantext.util.db_cache import cache
from gargantext.util.lists import UnweightedList
from sqlalchemy import desc, asc
from gargantext.constants import DEFAULT_MAPLIST_MAX,\
DEFAULT_MAPLIST_GENCLUSION_RATIO,\
DEFAULT_MAPLIST_MONOGRAMS_RATIO
def do_maplist(corpus,
overwrite_id = None,
mainlist_id = None,
specclusion_id = None,
genclusion_id = None,
grouplist_id = None,
limit=DEFAULT_MAPLIST_MAX,
genclusion_part=DEFAULT_MAPLIST_GENCLUSION_RATIO,
monograms_part=DEFAULT_MAPLIST_MONOGRAMS_RATIO
):
'''
According to Genericity/Specificity and mainlist
Parameters:
- mainlist_id (starting point, already cleaned of stoplist terms)
- specclusion_id (ngram inclusion by cooc specificity -- ranking factor)
- genclusion_id (ngram inclusion by cooc genericity -- ranking factor)
- grouplist_id (filtering grouped ones)
- overwrite_id: optional if preexisting MAPLIST node to overwrite
+ 3 params to modulate the terms choice
- limit for the amount of picked terms
- monograms_part: a ratio of terms with only one lexical unit to keep
(multigrams quota = limit * (1-monograms_part))
- genclusion_part: the share of terms taken from the top of the genericity ranking
(speclusion quota = limit * (1-genclusion_part))
'''
if not (mainlist_id and specclusion_id and genclusion_id and grouplist_id):
raise ValueError("Please provide mainlist_id, specclusion_id, genclusion_id and grouplist_id")
quotas = {'topgen':{}, 'topspec':{}}
genclusion_limit = round(limit * genclusion_part)
speclusion_limit = limit - genclusion_limit
quotas['topgen']['monograms'] = round(genclusion_limit * monograms_part)
quotas['topgen']['multigrams'] = genclusion_limit - quotas['topgen']['monograms']
quotas['topspec']['monograms'] = round(speclusion_limit * monograms_part)
quotas['topspec']['multigrams'] = speclusion_limit - quotas['topspec']['monograms']
print("MAPLIST quotas:", quotas)
#dbg = DebugTime('Corpus #%d - computing Miam' % corpus.id)
......@@ -54,11 +66,19 @@ def do_maplist(corpus,
)
ScoreSpec=aliased(NodeNgram)
ScoreGen=aliased(NodeNgram)
# ngram with both ranking factors spec and gen
query = (session.query(
ScoreSpec.ngram_id,
ScoreSpec.weight,
ScoreGen.weight,
Ngram.n
)
.join(Ngram, Ngram.id == ScoreSpec.ngram_id)
.join(ScoreGen, ScoreGen.ngram_id == ScoreSpec.ngram_id)
.filter(ScoreSpec.node_id == specclusion_id)
.filter(ScoreGen.node_id == genclusion_id)
# we want only terms within mainlist
.join(MainlistTable, Ngram.id == MainlistTable.ngram_id)
......@@ -68,36 +88,99 @@ def do_maplist(corpus,
.outerjoin(IsSubform,
IsSubform.c.ngram2_id == ScoreSpec.ngram_id)
.filter(IsSubform.c.ngram2_id == None)
)
# format in scored_ngrams array:
# -------------------------------
# [(37723, 8.428, 14.239, 3 ), etc]
# ngramid wspec wgen nwords
scored_ngrams = query.all()
n_ngrams = len(scored_ngrams)
if n_ngrams == 0:
raise ValueError("No ngrams in cooc table ?")
# results, with same structure as quotas
chosen_ngrams = {
'topgen':{'monograms':[], 'multigrams':[]},
'topspec':{'monograms':[], 'multigrams':[]}
}
# specificity and genericity are rather reverse-correlated
# but occasionally they can have common ngrams (same ngram well ranked in both)
# => we'll use a lookup table to check if we didn't already get it
already_gotten_ngramids = {}
# 2 loops to fill spec-clusion then gen-clusion quotas
# (1st loop uses order from DB, 2nd loop uses our own sort at end of 1st)
for rkr in ['topspec', 'topgen']:
got_enough_mono = False
got_enough_multi = False
all_done = False
i = -1
while((not all_done) and (not (got_enough_mono and got_enough_multi))):
# retrieve sorted ngram n° i
i += 1
(ng_id, wspec, wgen, nwords) = scored_ngrams[i]
# before any continue case, we check the next i for max reached
all_done = (i+1 >= n_ngrams)
if ng_id in already_gotten_ngramids:
continue
# NB: nwords could be replaced by a simple search on r' '
if nwords == 1:
if got_enough_mono:
continue
else:
# add ngram to results and lookup
chosen_ngrams[rkr]['monograms'].append(ng_id)
already_gotten_ngramids[ng_id] = True
# multi
else:
if got_enough_multi:
continue
else:
# add ngram to results and lookup
chosen_ngrams[rkr]['multigrams'].append(ng_id)
already_gotten_ngramids[ng_id] = True
got_enough_mono = (len(chosen_ngrams[rkr]['monograms']) >= quotas[rkr]['monograms'])
got_enough_multi = (len(chosen_ngrams[rkr]['multigrams']) >= quotas[rkr]['multigrams'])
# at the end of the first loop we just need to sort all by the second ranker (gen)
scored_ngrams = sorted(scored_ngrams, key=lambda ng_infos: ng_infos[2], reverse=True)
obtained_spec_mono = len(chosen_ngrams['topspec']['monograms'])
obtained_spec_multi = len(chosen_ngrams['topspec']['multigrams'])
obtained_gen_mono = len(chosen_ngrams['topgen']['monograms'])
obtained_gen_multi = len(chosen_ngrams['topgen']['multigrams'])
obtained_total = obtained_spec_mono \
+ obtained_spec_multi \
+ obtained_gen_mono \
+ obtained_gen_multi
print("MAPLIST: top_spec_monograms =", obtained_spec_mono)
print("MAPLIST: top_spec_multigrams =", obtained_spec_multi)
print("MAPLIST: top_gen_monograms =", obtained_gen_mono)
print("MAPLIST: top_gen_multigrams =", obtained_gen_multi)
print("MAPLIST: kept %i ngrams in total " % obtained_total)
obtained_data = chosen_ngrams['topspec']['monograms'] \
+ chosen_ngrams['topspec']['multigrams'] \
+ chosen_ngrams['topgen']['monograms'] \
+ chosen_ngrams['topgen']['multigrams']
# NEW MAPLIST NODE
# -----------------
# saving the parameters of the analysis in the Node JSON
new_hyperdata = { 'corpus': corpus.id,
'limit' : limit,
'monograms_part' : monograms_part,
'genclusion_part' : genclusion_part,
}
if overwrite_id:
# overwrite pre-existing node
......@@ -118,9 +201,7 @@ def do_maplist(corpus,
the_id = the_maplist.id
# create UnweightedList object and save (=> new NodeNgram rows)
datalist = UnweightedList(obtained_data)
# save
datalist.save(the_id)
......
......@@ -10,8 +10,8 @@ from .ngram_groups import compute_groups
from .metric_tfidf import compute_occs, compute_tfidf_local, compute_ti_ranking
from .list_main import do_mainlist
from .ngram_coocs import compute_coocs
from .metric_specgen import compute_specgen
from .list_map import do_maplist
from .mail_notification import notify_owner
from gargantext.util.db import session
from gargantext.models import Node
......@@ -136,22 +136,26 @@ def parse_extract_indexhyperdata(corpus):
# => used for doc <=> ngram association
# ------------
# -> cooccurrences on mainlist: compute + write (=> Node and NodeNgramNgram)*
coocs = compute_coocs(corpus,
on_list_id = mainlist_id,
groupings_id = group_id,
just_pass_result = True,
diagonal_filter = False) # preserving the diagonal
# (useful for spec/gen)
print('CORPUS #%d: [%s] computed mainlist coocs for specif rank' % (corpus.id, t()))
# -> specclusion/genclusion: compute + write (2 Nodes + 2 lists in NodeNgram)
(spec_id, gen_id) = compute_specgen(corpus,cooc_matrix = coocs)
# no need here for subforms because cooc already counted them in mainform
print('CORPUS #%d: [%s] new spec-clusion node #%i' % (corpus.id, t(), spec_id))
print('CORPUS #%d: [%s] new gen-clusion node #%i' % (corpus.id, t(), gen_id))
# maplist: compute + write (to Node and NodeNgram)
map_id = do_maplist(corpus,
mainlist_id = mainlist_id,
specclusion_id=spec_id,
genclusion_id=gen_id,
grouplist_id=group_id
)
print('CORPUS #%d: [%s] new maplist node #%i' % (corpus.id, t(), map_id))
......@@ -187,7 +191,7 @@ def recount(corpus):
- ndocs
- ti_rank
- coocs
- specclusion/genclusion
- tfidf
NB: no new extraction, no list change, just the metrics
......@@ -208,10 +212,15 @@ def recount(corpus):
old_tirank_id = None
try:
old_spec_id = corpus.children("SPECIFICITY").first().id
old_spec_id = corpus.children("SPECCLUSION").first().id
except:
old_spec_id = None
try:
old_gen_id = corpus.children("GENCLUSION").first().id
except:
old_gen_id = None
try:
old_ltfidf_id = corpus.children("TFIDF-CORPUS").first().id
except:
......@@ -254,11 +263,13 @@ def recount(corpus):
just_pass_result = True)
print('RECOUNT #%d: [%s] updated mainlist coocs for specif rank' % (corpus.id, t()))
# -> specclusion/genclusion: compute + write (=> NodeNodeNgram)
(spec_id, gen_id) = compute_specgen(corpus, cooc_matrix = coocs,
spec_overwrite_id = old_spec_id, gen_overwrite_id = old_gen_id)
print('RECOUNT #%d: [%s] updated spec-clusion node #%i' % (corpus.id, t(), spec_id))
print('RECOUNT #%d: [%s] updated gen-clusion node #%i' % (corpus.id, t(), gen_id))
print('RECOUNT #%d: [%s] FINISHED metric recounts' % (corpus.id, t()))
......
"""
Computes genericity/specificity ("gen/spec-clusion") metrics from the ngram cooccurrence matrix.
+ SAVE => WeightedList => NodeNgram
"""
from gargantext.models import Node, Ngram, NodeNgram, NodeNgramNgram
from gargantext.util.db import session, aliased, func, bulk_insert
from gargantext.util.lists import WeightedList
from collections import defaultdict
from pandas import DataFrame
from numpy import diag
def round3(floating_number):
"""
Rounds a floating number to 3 decimals
Good when we don't need so much detail in the data written to the DB
"""
return float("%.3f" % floating_number)
def compute_specgen(corpus, cooc_id=None, cooc_matrix=None,
spec_overwrite_id = None, gen_overwrite_id = None):
'''
Compute genericity/specificity:
P(j|i) = N(ij) / N(ii)
P(i|j) = N(ij) / N(jj)
Gen(i) = Sum{j} P(j_k|i)
Spec(i) = Sum{j} P(i|j_k)
Gen-clusion(i) = (Spec(i) + Gen(i)) / 2
Spec-clusion(i) = (Spec(i) - Gen(i)) / 2
(NB: the code below drops the constant 1/2 factor, which does not change the ranking)
Parameters:
- cooc_id: mandatory id of a cooccurrences node to use as base
(alternatively cooc_matrix: an in-memory WeightedMatrix)
- spec_overwrite_id: optional preexisting specificity node to overwrite
- gen_overwrite_id: optional preexisting genericity node to overwrite
'''
matrix = defaultdict(lambda : defaultdict(float))
if cooc_id == None and cooc_matrix == None:
raise TypeError("compute_specificity: needs a cooc_id or cooc_matrix param")
elif cooc_id:
cooccurrences = (session.query(NodeNgramNgram)
.filter(NodeNgramNgram.node_id==cooc_id)
)
# no filtering: cooc already filtered on mainlist_id at creation
for cooccurrence in cooccurrences:
matrix[cooccurrence.ngram1_id][cooccurrence.ngram2_id] = cooccurrence.weight
# matrix[cooccurrence.ngram2_id][cooccurrence.ngram1_id] = cooccurrence.weight
elif cooc_matrix:
# copy WeightedMatrix into local matrix structure
for (ngram1_id, ngram2_id) in cooc_matrix.items:
w = cooc_matrix.items[(ngram1_id, ngram2_id)]
# ------- 8< --------------------------------------------
# tempo hack to ignore lines/columns where diagonal == 0
# £TODO find why they exist and then remove this snippet
if (((ngram1_id,ngram1_id) not in cooc_matrix.items) or
((ngram2_id,ngram2_id) not in cooc_matrix.items)):
continue
# ------- 8< --------------------------------------------
matrix[ngram1_id][ngram2_id] = w
nb_ngrams = len(matrix)
print("SPECIFICITY: computing on %i ngrams" % nb_ngrams)
# example corpus (7 docs, 8 nouns)
# --------------------------------
# "The report says that humans are animals."
# "The report says that rivers are full of water."
# "The report says that humans like to make war."
# "The report says that animals must eat food."
# "The report says that animals drink water."
# "The report says that humans like food and water."
# "The report says that grass is food for some animals."
#===========================================================================
cooc_counts = DataFrame(matrix).fillna(0)
# cooc_counts matrix
# ------------------
# animals food grass humans report rivers war water
# animals 4 2 1 1 4 0 0 1
# food 2 3 1 1 3 0 0 1
# grass 1 1 1 0 1 0 0 0
# humans 1 1 0 3 3 0 1 1
# report 4 3 1 3 7 1 1 3
# rivers 0 0 0 0 1 1 0 1
# war 0 0 0 1 1 0 1 0
# water 1 1 0 1 3 1 0 3
#===========================================================================
# conditional p(col|line)
diagonal = list(diag(cooc_counts))
# debug
# print("WARN diag: ", diagonal)
# print("WARN diag: =================== 0 in diagonal ?\n",
# 0 in diagonal ? "what ??? zeros in the diagonal :/" : "ok no zeros",
# "\n===================")
p_col_given_line = cooc_counts / list(diag(cooc_counts))
# p_col_given_line
# ----------------
# animals food grass humans report rivers war water
# animals 1.0 0.7 1.0 0.3 0.6 0.0 0.0 0.3
# food 0.5 1.0 1.0 0.3 0.4 0.0 0.0 0.3
# grass 0.2 0.3 1.0 0.0 0.1 0.0 0.0 0.0
# humans 0.2 0.3 0.0 1.0 0.4 0.0 1.0 0.3
# report 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
# rivers 0.0 0.0 0.0 0.0 0.1 1.0 0.0 0.3
# war 0.0 0.0 0.0 0.3 0.1 0.0 1.0 0.0
# water 0.2 0.3 0.0 0.3 0.4 1.0 0.0 1.0
#===========================================================================
# total per lines (<=> genericity)
Gen = p_col_given_line.sum(axis=1)
# Gen.sort_values(ascending=False)
# ---
# report 8.0
# animals 3.9
# food 3.6
# water 3.3
# humans 3.3
# grass 1.7
# war 1.5
# rivers 1.5
#===========================================================================
# total columnwise (<=> specificity)
Spec = p_col_given_line.sum(axis=0)
# Spec.sort_values(ascending=False)
# ----
# grass 4.0
# food 3.7
# water 3.3
# humans 3.3
# report 3.3
# animals 3.2
# war 3.0
# rivers 3.0
#===========================================================================
# our "inclusion by specificity" metric
Specclusion = Spec-Gen
# Specclusion.sort_values(ascending=False)
# -----------
# grass 1.1
# war 0.8
# rivers 0.8
# food 0.0
# humans -0.0
# water -0.0
# animals -0.3
# report -2.4
#===========================================================================
# our "inclusion by genericity" metric
Genclusion = Spec+Gen
# Genclusion.sort_values(ascending=False)
# -----------
# report 11.3
# food 7.3
# animals 7.2
# water 6.7
# humans 6.7
# grass 5.7
# war 4.5
# rivers 4.5
#===========================================================================
# specificity node
if spec_overwrite_id:
# overwrite pre-existing id
the_spec_id = spec_overwrite_id
session.query(NodeNgram).filter(NodeNgram.node_id==the_spec_id).delete()
session.commit()
else:
specnode = corpus.add_child(
typename = "SPECCLUSION",
name = "Specclusion (in:%s)" % corpus.id
)
session.add(specnode)
session.commit()
the_spec_id = specnode.id
if not Specclusion.empty:
data = WeightedList(
zip( Specclusion.index.tolist()
, [v for v in map(round3, Specclusion.values.tolist())]
)
)
data.save(the_spec_id)
else:
print("WARNING: had no terms in COOCS => empty SPECCLUSION node")
#===========================================================================
# genclusion node
if gen_overwrite_id:
the_gen_id = gen_overwrite_id
session.query(NodeNgram).filter(NodeNgram.node_id==the_gen_id).delete()
session.commit()
else:
gennode = corpus.add_child(
typename = "GENCLUSION",
name = "Genclusion (in:%s)" % corpus.id
)
session.add(gennode)
session.commit()
the_gen_id = gennode.id
if not Genclusion.empty:
data = WeightedList(
zip( Genclusion.index.tolist()
, [v for v in map(round3, Genclusion.values.tolist())]
)
)
data.save(the_gen_id)
else:
print("WARNING: had no terms in COOCS => empty GENCLUSION node")
#===========================================================================
return(the_spec_id, the_gen_id)
"""
Computes a specificity metric from the ngram cooccurrence matrix.
+ SAVE => WeightedList => NodeNgram
"""
from gargantext.models import Node, Ngram, NodeNgram, NodeNgramNgram
from gargantext.util.db import session, aliased, func, bulk_insert
from gargantext.util.lists import WeightedList
from collections import defaultdict
from pandas import DataFrame
import pandas as pd
def compute_specificity(corpus, cooc_id=None, cooc_matrix=None, overwrite_id = None):
'''
Compute the specificity, simple calculus.
Parameters:
- cooc_id: mandatory id of a cooccurrences node to use as base
- overwrite_id: optional preexisting specificity node to overwrite
'''
matrix = defaultdict(lambda : defaultdict(float))
if cooc_id == None and cooc_matrix == None:
raise TypeError("compute_specificity: needs a cooc_id or cooc_matrix param")
elif cooc_id:
cooccurrences = (session.query(NodeNgramNgram)
.filter(NodeNgramNgram.node_id==cooc_id)
)
# no filtering: cooc already filtered on mainlist_id at creation
for cooccurrence in cooccurrences:
matrix[cooccurrence.ngram1_id][cooccurrence.ngram2_id] = cooccurrence.weight
matrix[cooccurrence.ngram2_id][cooccurrence.ngram1_id] = cooccurrence.weight
elif cooc_matrix:
# copy WeightedMatrix into local matrix structure
for (ngram1_id, ngram2_id) in cooc_matrix.items:
w = cooc_matrix.items[(ngram1_id, ngram2_id)]
matrix[ngram1_id][ngram2_id] = w
nb_ngrams = len(matrix)
print("SPECIFICITY: computing on %i ngrams" % nb_ngrams)
x = DataFrame(matrix).fillna(0)
# proba (x/y) ( <= each row divided by its total)
x = x / x.sum(axis=1)
# vectorisation
# d:Matrix => v: Vector (len = nb_ngrams)
# v = d.sum(axis=1) (minus itself)
xs = x.sum(axis=1) - x
ys = x.sum(axis=0) - x
# top inclus ou exclus
#n = ( xs + ys) / (2 * (x.shape[0] - 1))
# top generic or specific (asc is spec, desc is generic)
v = ( xs - ys) / ( 2 * (x.shape[0] - 1))
## d ##
#######
# Grenelle biodiversité kilomètres site élus île
# Grenelle 0 0 4 0 0 0
# biodiversité 0 0 0 0 4 0
# kilomètres 4 0 0 0 4 0
# site 0 0 0 0 4 6
# élus 0 4 4 4 0 0
# île 0 0 0 6 0 0
## d.sum(axis=1) ##
###################
# Grenelle 4
# biodiversité 4
# kilomètres 8
# site 10
# élus 12
# île 6
# temporary result
# ----------------
# for now we use the row sums as the specificity ranking
# (**same** order as with the pre-refactoring formula, but simpler to compute)
# TODO check the mathematical AND semantic coherence of this indicator
#v.sort_values(inplace=True)
# [ ('biodiversité' , 0.333 ),
# ('Grenelle' , 0.5 ),
# ('île' , 0.599 ),
# ('kilomètres' , 1.333 ),
# ('site' , 1.333 ),
# ('élus' , 1.899 ) ]
# ----------------
# specificity node
if overwrite_id:
# overwrite pre-existing id
the_id = overwrite_id
session.query(NodeNgram).filter(NodeNgram.node_id==the_id).delete()
session.commit()
else:
specnode = corpus.add_child(
typename = "SPECIFICITY",
name = "Specif (in:%s)" % corpus.id
)
session.add(specnode)
session.commit()
the_id = specnode.id
# print(v)
pd.options.display.float_format = '${:,.2f}'.format
if not v.empty:
data = WeightedList(
zip( v.index.tolist()
, v.values.tolist()[0]
)
)
data.save(the_id)
else:
print("WARNING: had no terms in COOCS => empty SPECIFICITY node")
return(the_id)
......@@ -18,7 +18,8 @@ def compute_coocs( corpus,
stoplist_id = None,
start = None,
end = None,
symmetry_filter = False,
diagonal_filter = True):
"""
Count how often some extracted terms appear
together in a small context (document)
......@@ -55,6 +56,9 @@ def compute_coocs( corpus,
NB the expected type of parameter value is datetime.datetime
(string is also possible but format must follow
this convention: "2001-01-01" aka "%Y-%m-%d")
- symmetry_filter: prevent calculating where ngram1_id > ngram2_id
- diagonal_filter: prevent calculating where ngram1_id == ngram2_id
(deprecated parameters)
- field1,2: allowed to count other things than ngrams (eg tags) but no use case at present
......@@ -69,7 +73,7 @@ def compute_coocs( corpus,
JOIN nodes_ngrams AS idxb
ON idxa.node_id = idxb.node_id <== that's cooc
---------------------------------
AND idxa.ngram_id <> idxb.ngram_id (diagonal_filter)
AND idxa.node_id = MY_DOC ;
on entire corpus
......@@ -152,16 +156,14 @@ def compute_coocs( corpus,
ucooc
# for debug (2/4)
# , Xngram.terms.label("w_x")
# , Yngram.terms.label("w_y")
)
.join(Yindex, Xindex.node_id == Yindex.node_id ) # <- by definition of cooc
.join(Node, Node.id == Xindex.node_id) # <- b/c within corpus
.filter(Node.parent_id == corpus.id) # <- b/c within corpus
.filter(Node.typename == "DOCUMENT") # <- b/c within corpus
)
# outerjoin the synonyms if needed
......@@ -179,12 +181,12 @@ def compute_coocs( corpus,
.group_by(
Xindex_ngform_id, Yindex_ngform_id # <- what we're counting
# for debug (3/4)
#,"w_x", "w_y"
# ,"w_x", "w_y"
)
# for debug (4/4)
# .join(Xngram, Xngram.id == Xindex_ngform_id)
# .join(Yngram, Yngram.id == Yindex_ngform_id)
.order_by(ucooc)
)
......@@ -192,6 +194,9 @@ def compute_coocs( corpus,
# 4) INPUT FILTERS (reduce N before O(N²))
if on_list_id:
# £TODO use different lists, or one list for x and all ngrams for y
# as that would allow expanding the list to its nearest neighbors (MacLachlan)
# (with a rectangular matrix)
m1 = aliased(NodeNgram)
m2 = aliased(NodeNgram)
......@@ -226,6 +231,10 @@ def compute_coocs( corpus,
)
if diagonal_filter:
# don't compute ngram with itself
coocs_query = coocs_query.filter(Xindex_ngform_id != Yindex_ngform_id)
if start or end:
Time = aliased(NodeHyperdata)
......@@ -268,6 +277,7 @@ def compute_coocs( corpus,
# threshold
# £TODO adjust COOC_THRESHOLD a posteriori:
# ex: sometimes 2 sometimes 4 depending on sparsity
print("COOCS: filtering pairs under threshold:", threshold)
coocs_query = coocs_query.having(ucooc >= threshold)
......
......@@ -77,7 +77,7 @@ def extract_ngrams(corpus, keys=('title', 'abstract', ), do_subngrams = DEFAULT_
continue
# get ngrams
for ngram in ngramsextractor.extract(value):
tokens = tuple(normalize_forms(token[0]) for token in ngram)
if do_subngrams:
# ex tokens = ["very", "cool", "exemple"]
......@@ -90,7 +90,7 @@ def extract_ngrams(corpus, keys=('title', 'abstract', ), do_subngrams = DEFAULT_
subterms = [tokens]
for seqterm in subterms:
ngram = ' '.join(seqterm)
if len(ngram) > 1:
# doc <=> ngram index
nodes_ngrams_count[(document.id, ngram)] += 1
......@@ -118,7 +118,7 @@ def extract_ngrams(corpus, keys=('title', 'abstract', ), do_subngrams = DEFAULT_
raise error
def normalize_forms(term_str, do_lowercase=DEFAULT_ALL_LOWERCASE_FLAG):
"""
Removes unwanted trailing punctuation
AND optionally puts everything to lowercase
......@@ -127,14 +127,14 @@ def normalize_terms(term_str, do_lowercase=DEFAULT_ALL_LOWERCASE_FLAG):
(benefits from normalize_chars upstream so there's less cases to consider)
"""
# print('normalize_forms IN: "%s"' % term_str)
term_str = sub(r'^[-\'",;/%(){}\\\[\]\. ©]+', '', term_str)
term_str = sub(r'[-\'",;/%(){}\\\[\]\. ©]+$', '', term_str)
if do_lowercase:
term_str = term_str.lower()
# print('normalize_forms OUT: "%s"' % term_str)
return term_str
......
......@@ -57,7 +57,7 @@ class CSVLists(APIView):
params in request.GET:
onto_corpus: the corpus whose lists are getting patched
params in request.data:
csvfile: the csv file
/!\ We assume we checked the file size client-side before upload
......
# Gargantext Installation
You will find here a Dockerfile and docker-compose script
that build a development container for Gargantext
along with a PostgreSQL 9.5.X server.
* Install Docker
On your host machine, you need Docker.
[Installation guide details](https://docs.docker.com/engine/installation/#installation)
* Clone the gargantext repository and get the refactoring branch
```
git clone ssh://gitolite@delanoe.org:1979/gargantext /srv/gargantext
cd /srv/gargantext
git fetch origin refactoring
git checkout refactoring
```
* Install additional dependencies into gargantext_lib
```
wget http://dl.gargantext.org/gargantext_lib.tar.bz2 \
&& sudo tar xvjf gargantext_lib.tar.bz2 -o /srv/gargantext_lib \
&& sudo chown -R gargantua:gargantua /srv/gargantext_lib
```
* Developers: create your own branch based on refactoring
see [CHANGELOG](CHANGELOG.md) for migrations and branch names
```
git checkout -b username-refactoring refactoring
```
Build the docker images:
- a database container
- a gargantext container
```
cd /srv/gargantext/install/
docker-compose build
```
Finally, set up the PostgreSQL database with the following commands.
```
docker-compose run web python dbmigrate.py
docker-compose run web ./manage.py makemigrations
docker-compose run web ./manage.py migrate
```
## OS
### Debian Stretch
See install/debian
If you do not have a Debian environment, you need a docker image: install docker and
execute /srv/gargantext/install/docker/dev/install.sh, as sketched below.
All the steps are explained in [docker/dev/install.sh](docker/dev/install.sh) (not automatic yet).
Bug reports are welcome.
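A sketch of that manual run (the script path is the one given above; the script uses sudo internally):
```
cd /srv/gargantext/install/docker/dev
./install.sh
```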
......@@ -26,6 +26,7 @@ ENV PYTHON_ENV /srv/env_3-5
RUN apt-get update && \
apt-get install -y \
apt-utils ca-certificates locales \
python3-dev \
sudo aptitude gcc g++ wget git postgresql-9.5 vim \
build-essential make
......@@ -44,7 +45,7 @@ RUN apt-get update && apt-get install -y \
postgresql-server-dev-9.5 libpq-dev libxml2 \
libxml2-dev xml-core libgfortran-5-dev \
virtualenv python3-virtualenv \
python3.5 \
python3-six python3-numpy python3-setuptools \
# ^for numpy, pandas
python3-numexpr \
......
#!/bin/bash
#######################################################################
# ____ _
# | _ \ ___ ___| | _____ _ __
# | | | |/ _ \ / __| |/ / _ \ '__|
# | |_| | (_) | (__| < __/ |
# |____/ \___/ \___|_|\_\___|_|
#
######################################################################
sudo docker build -t gargantext .
# OR Get the ID of your container
#ID=$(docker build .) && docker run -i -t $ID
# OR
# cd /tmp
# wget http://dl.gargantext.org/gargantext_docker_image.tar \
# && sudo docker import - gargantext:latest < gargantext_docker_image.tar
#!/bin/bash
echo "Adding user gargantua";
sudo adduser --disabled-password --gecos "" gargantua;
echo "Creating the environnement into /srv/";
for dir in "/srv/gargantext" "/srv/gargantext_lib" "/srv/gargantext_static" "/srv/gargantext_media""/srv/env_3-5"; do
sudo mkdir -p $dir ;
sudo chown gargantua:gargantua $dir ;
done;
echo "Downloading the libs. Please be patient!";
wget http://dl.gargantext.org/gargantext_lib.tar.bz2 \
&& tar xvjf gargantext_lib.tar.bz2 -o /srv/gargantext_lib \
&& sudo chown -R gargantua:gargantua /srv/gargantext_lib \
&& echo "Libs installed";
echo 'Install docker'
sudo apt-get install -y docker-engine
echo 'Build gargantext image'
cd /srv/gargantext/install/
./docker/config/build
#Next steps
#install and configure git
#sudo apt-get install -y git
#clone your SSH key
#cp ~/.ssh/id_rsa.pub id_rsa.pub
#clone the repo
#~ git clone ssh://gitolite@delanoe.org:1979/gargantext /srv/gargantext \
#~ && cd /srv/gargantext \
# get on branch
#~ && git fetch origin unstable \
#~ && git checkout unstable \
#~ echo "Currently on /srv/gargantext unstable branch";
#create your own branch
# git checkout -b my-unstable
......@@ -256,7 +256,7 @@
</div>
<!-- Sidebar -->
<div id="leftcolumn">
<div id="sidecolumn">
<div style="text-align: center;">
<a href="http://www.cnrs.fr" target="_blank"><img width="40%" src="https://www.ipmc.cnrs.fr/~duprat/comm/images/logo_cnrs_transparent.gif"></a>
</div>
......
......@@ -149,11 +149,11 @@ function CRUD( list_id , ngram_ids , http_method , callback) {
var div_info = "";
if( $( ".colorgraph_div" ).length>0 )
div_info += '<ul id="colorGraph" class="nav navbar-nav navbar-right">'
div_info += '<ul id="colorGraph" class="nav navbar-nav">'
div_info += ' <li class="dropdown">'
div_info += '<a href="#" class="dropdown-toggle" data-toggle="dropdown">'
div_info += ' <img title="Set Colors" src="/static/img/colors.png" width="20px"><b class="caret"></b></img>'
div_info += ' <img title="Set Colors" src="/static/img/colors.png" width="22px"><b class="caret"></b></img>'
div_info += '</a>'
div_info += ' <ul class="dropdown-menu">'
......@@ -186,11 +186,11 @@ function CRUD( list_id , ngram_ids , http_method , callback) {
div_info = "";
if( $( ".sizegraph_div" ).length>0 )
div_info += '<ul id="sizeGraph" class="nav navbar-nav navbar-right">'
div_info += '<ul id="sizeGraph" class="nav navbar-nav">'
div_info += ' <li class="dropdown">'
div_info += '<a href="#" class="dropdown-toggle" data-toggle="dropdown">'
div_info += ' <img title="Set Sizes" src="/static/img/NodeSize.png" width="20px"><b class="caret"></b></img>'
div_info += ' <img title="Set Sizes" src="/static/img/NodeSize.png" width="18px"><b class="caret"></b></img>'
div_info += '</a>'
div_info += ' <ul class="dropdown-menu">'
......
......@@ -18,14 +18,53 @@
}
.navbar {
margin-bottom:1px;
}
#defaultop{
min-height: 5%;
/*max-height: 10%;*/
text-align: center;
}
#defaultop li.basicitem{
/*font-family: "Helvetica Neue", Helvetica, Arial, sans-serif ;*/
padding-left: .4em;
padding-right: .4em;
padding-bottom: 0;
font-size: 90% ;
}
#defaultop > div {
float: none;
display: inline-block;
text-align: left;
}
#defaultop .nav > li > a {
text-align: center;
padding-top: .4em;
padding-bottom: .2em;
margin-left: auto ;
margin-right: auto ;
}
/*searchnav should get same padding as our .navbar-nav > li > a or bootstrap's*/
#defaultop div#searchnav {
padding-top: 13px;
padding-bottom: 9px;
}
#defaultop .settingslider {
max-width: 80px;
display: inline-block ;
}
#sigma-example {
......@@ -165,9 +204,7 @@
display:inline-block;
border:solid 1px;
/*box-shadow: 0px 0px 0px 1px rgba(0,0,0,0.3); */
border-radius: 6px;
border-color:#BDBDBD;
padding:0px 2px 0px 2px;
margin:1px 0px 1px 0px;
......@@ -367,6 +404,12 @@
padding-left:5%;
}
/* small messages */
p.micromessage{
font-size: 85%;
color: #707070 ;
}
.btn-sm:hover {
font-weight: bold;
}
......@@ -376,7 +419,7 @@
.tab { display: inline-block; zoom:1; *display:inline; background: #eee; border: solid 1px #999; border-bottom: none; -moz-border-radius: 4px 4px 0 0; -webkit-border-radius: 4px 4px 0 0; }
.tab a { font-size: 12px; line-height: 2em; display: block; padding: 0 10px; outline: none; }
.tab a:hover { text-decoration: underline; }
.tab.active { background: #fff; padding-top: 6px; position: relative; top: 3px; border-color: #666; }
.tab a.active { font-weight: bold; }
.tab-container .panel-container { background: #fff; border: solid #666 1px; padding: 10px; -moz-border-radius: 0 4px 4px 4px; -webkit-border-radius: 0 4px 4px 4px; }
.panel-container { margin-bottom: 10px; }
.fsslider {
position: relative;
min-width: 80px;
height: 8px;
display: inline-block;
width: 100%;
......
......@@ -21,14 +21,15 @@ box-shadow: 0px 0px 3px 0px #888888;
}*/
#leftcolumn {
#sidecolumn {
overflow-y: scroll;
margin-right: -300px;
margin-left: 0px;
padding-bottom: 10px;
padding-left: 5px;
right: 0px;
/* this width one is just a first guess...
   (it will be changed in main.js to sidecolumnSize param)
*/
width: 25em;
position: fixed;
height: 100%;
border: 1px #888888 solid;
......
......@@ -30,6 +30,11 @@ var mainfile = ["db.json"];
// getUrlParam.file = window.location.origin+"/"+$("#graphid").html(); // garg exclusive
// var corpusesList = {} // garg exclusive -> corpus comparison
var tagcloud_limit = 50;
// for the css of sidecolumn and canvasLimits size
var sidecolumnSize = "20%"
var current_url = window.location.origin+window.location.pathname+window.location.search
getUrlParam.file = current_url.replace(/projects/g, "api/projects")
......
......@@ -22,45 +22,27 @@
/[$\w]+/g
);
// on window resize
// @param canvasdiv: id of the div (without '#')
function sigmaLimits( canvasdiv ) {
console.log('FUN t.TinawebJS:sigmaLimits') ;
var canvas = document.getElementById(canvasdiv) ;
var sidecolumn = document.getElementById('sidecolumn') ;
var ancho_total = window.innerWidth - sidecolumn.offsetWidth ;
var alto_total = window.innerHeight - sidecolumn.offsetTop ;
// setting new size
canvas.style.width = (ancho_total - 5) + "px" ;
canvas.style.height = (alto_total - 5) + "px" ;
// fyi result
var pw=canvas.offsetWidth;
var ph=canvas.offsetHeight;
console.log("new canvas! w:"+pw+" , h:"+ph) ;
}
SelectionEngine = function() {
console.log('FUN t.TinawebJS:SelectionEngine:new')
// Selection Engine!! finally...
......@@ -381,6 +363,8 @@ SelectionEngine = function() {
TinaWebJS = function ( sigmacanvas ) {
console.log('FUN t.TinawebJS:TinaWebJS:new')
// '#canvasid'
this.sigmacanvas = sigmacanvas;
this.init = function () {
......@@ -392,11 +376,11 @@ TinaWebJS = function ( sigmacanvas ) {
return this.sigmacanvas;
}
this.AdjustSigmaCanvas = function ( canvasdiv ) {
console.log('FUN t.TinawebJS:AdjustSigmaCanvas')
var canvasdiv = "";
if( sigmacanvas ) canvasdiv = sigmacanvas;
else canvasdiv = this.sigmacanvas;
if (! canvasdiv)
// '#canvasid' => 'canvasid'
canvasdiv = sigmacanvas.substring(1);
return sigmaLimits( canvasdiv );
}
......@@ -565,8 +549,8 @@ TinaWebJS = function ( sigmacanvas ) {
// === un/hide leftpanel === //
$("#aUnfold").click(function(e) {
//SHOW sidecolumn
sidebar = $("#sidecolumn");
fullwidth=$('#fixedtop').width();
e.preventDefault();
// $("#wrapper").toggleClass("active");
......@@ -590,7 +574,7 @@ TinaWebJS = function ( sigmacanvas ) {
}, 400);
}
else {
//HIDE sidecolumn
$("#aUnfold").attr("class","leftarrow");
sidebar.animate({
"right" : "-" + sidebar.width() + "px"
......
......@@ -178,7 +178,7 @@ function MainFunction( RES ) {
// [ Initiating Sigma-Canvas ]
var twjs_ = new TinaWebJS('#sigma-example');
print( twjs_.AdjustSigmaCanvas() );
window.onresize = function(){twjs_.AdjustSigmaCanvas()} // TODO: debounce?
// [ / Initiating Sigma-Canvas ]
print("categories: "+categories)
......@@ -357,6 +357,9 @@ function MainFunction( RES ) {
partialGraph.stopForceAtlas2();
}, fa2seconds*1000);
// apply width from settings on left column
document.getElementById('sidecolumn').style.width = sidecolumnSize ;
}
......
......@@ -119,13 +119,14 @@
<a tabindex="-1"
data-url="/projects/{{project.id}}/corpora/{{ corpus.id }}/explorer?field1=ngrams&amp;field2=ngrams&amp;distance=distributional&amp;bridgeness=5" onclick='gotoexplorer(this)' >With distributional distance</a>
</li>
<!--
<li>
<a tabindex="-1"
onclick="javascript:location.href='/projects/{{project.id}}/corpora/{{ corpus.id }}/myGraphs'"
data-target='#' href='#'>My Graphs
</a>
</li>
-->
</ul>
......@@ -213,16 +214,11 @@
</div>
</div>
</div>
{% endif %}
{% endif %}
{% endblock %}
{% block content %}
{% endblock %}
......@@ -235,7 +231,7 @@
<p>
Gargantext
<span class="glyphicon glyphicon-registration-mark" aria-hidden="true"></span>
, version 3.0.3.3,
<a href="http://www.cnrs.fr" target="blank" title="Institution that enables this project.">
Copyrights
<span class="glyphicon glyphicon-copyright-mark" aria-hidden="true"></span>
......
UNIT TESTS
==========
Prerequisite
------------
Running unit tests will involve creating a **temporary test DB** !
+ it implies **CREATEDB permissions** for settings.DATABASES.user
(this has security consequences)
+ for instance in gargantext you would need to run this in psql as postgres:
`# ALTER USER gargantua CREATEDB;`
A "principe de précaution" could be to allow gargantua the CREATEDB rights on the **dev** machines (to be able to run tests) and not give it on the **prod** machines (no testing but more protection just in case).
Usage
------
```
./manage.py test unittests/ -v 2 # in django root container directory
# or for a single module
./manage.py test unittests.tests_010_basic -v 2
```
( `-v 2` is the verbosity level )
Tests
------
1. **tests_010_basic**
2. ** tests ??? **
3. ** tests ??? **
4. ** tests ??? **
5. ** tests ??? **
6. ** tests ??? **
7. **tests_070_routes**
Checks the response types from the app url routes:
- "/"
- "/api/nodes"
- "/api/nodes/<ID>"
GargTestRunner
---------------
Most of the tests will interact with a DB but we don't want to touch the real one so we provide a customized test_runner class in `unittests/framework.py` that creates a test database.
It must be referenced in django's `settings.py` like this:
```
TEST_RUNNER = 'unittests.framework.GargTestRunner'
```
(This way the `./manage.py test` command will be using GargTestRunner.)
Using a DB session
------------------
To emulate a session the way we usually do it in gargantext, our `unittests.framework` also
provides a session object to the test database via `GargTestRunner.testdb_session`
To work correctly, it needs to be read *inside the test setup.*
**Example**
```
from unittests.framework import GargTestRunner
class MyTestRecipes(TestCase):
def setUp(self):
# -------------------------------------
session = GargTestRunner.testdb_session
# -------------------------------------
new_project = Node(
typename = 'PROJECT',
name = "hello i'm a project",
)
session.add(new_project)
session.commit()
```
Accessing the URLS
------------------
Django tests provide a client to browse the urls
**Example**
```
from django.test import Client
class MyTestRecipes(TestCase):
def setUp(self):
self.client = Client()
def test_001_get_front_page(self):
''' get the about page localhost/about '''
# --------------------------------------
the_response = self.client.get('/about')
# --------------------------------------
self.assertEqual(the_response.status_code, 200)
```
Logging in
-----------
Most of our functionalities are only available on login so we provide a fake user at the initialization of the test DB.
His login is 'pcorser' and his password is 'peter'
**Example**
```
from django.test import Client
class MyTestRecipes(TestCase):
def setUp(self):
self.client = Client()
# login ---------------------------------------------------
response = self.client.post(
'/auth/login/',
{'username': 'pcorser', 'password': 'peter'}
)
# ---------------------------------------------------------
def test_002_get_to_a_restricted_page(self):
''' get the projects page /projects '''
the_response = self.client.get('/projects')
self.assertEqual(the_response.status_code, 200)
```
*If you like the adventures of Peter Corser, read the previous album ["Doors"](https://gogs.iscpif.fr/leclaire/doors)* (script M. Leclaire, drawings R. Loth) (available in all good bookshops)
FIXME
-----
url client get will still give read access to original DB ?
cf. http://stackoverflow.com/questions/19714521
cf. http://stackoverflow.com/questions/11046039
cf. test_073_get_api_one_node
"""
A test runner derived from default (DiscoverRunner) but adapted to our custom DB
cf. docs.djangoproject.com/en/1.9/topics/testing/advanced/#using-different-testing-frameworks
cf. gargantext/settings.py => TEST_RUNNER
cf. dbmigrate.py
FIXME url get will still give read access to original DB ?
cf. http://stackoverflow.com/questions/19714521
cf. http://stackoverflow.com/questions/11046039
cf. test_073_get_api_one_node
"""
# basic elements
from django.test.runner import DiscoverRunner, get_unique_databases_and_mirrors
from sqlalchemy import create_engine
from gargantext.settings import DATABASES
# things needed to create a user
from django.contrib.auth.models import User
# here we setup a minimal django so as to load SQLAlchemy models ---------------
# and then be able to import models and Base.metadata.tables
from os import environ
from django import setup
environ.setdefault("DJANGO_SETTINGS_MODULE", "gargantext.settings")
setup() # models can now be imported
from gargantext import models # Base is now filled
from gargantext.util.db import Base # contains metadata.tables
# ------------------------------------------------------------------------------
# things needed to provide a session
from sqlalchemy.orm import sessionmaker, scoped_session
class GargTestRunner(DiscoverRunner):
"""
We use the default test runner but we just add
our own dbmigrate elements at db creation
=> we let django.test.runner do the test db creation + auto migrations
=> we retrieve the test db name from django.test.runner
=> we create a test engine like in gargantext.db.create_engine but with the test db name
=> we create tables for our models like in dbmigrate with the test engine
TODO: list of tables to be created are hard coded in self.models
"""
# we'll also expose a session as GargTestRunner.testdb_session
testdb_session = None
def __init__(self, *args, **kwargs):
# our custom tables to be created (in correct order)
self.models = ['ngrams', 'nodes', 'contacts', 'nodes_nodes', 'nodes_ngrams', 'nodes_nodes_ngrams', 'nodes_ngrams_ngrams', 'nodes_hyperdata']
self.testdb_engine = None
# and execute default django init
super(GargTestRunner, self).__init__(*args, **kwargs)
def setup_databases(self, *args, **kwargs):
"""
Complement the database creation
by our own "models to tables" migration
"""
# default django setup performs base creation + auto migrations
old_config = super(GargTestRunner, self).setup_databases(*args, **kwargs)
# retrieve the testdb_name set by DiscoverRunner
testdb_names = []
for db_infos in get_unique_databases_and_mirrors():
# a key has the form: (IP, port, backend, dbname)
for key in db_infos:
# db_infos[key] has the form (dbname, {'default'})
testdb_names.append(db_infos[key][0])
# /!\ assumes a single database /!\
testdb_name = testdb_names[0]
# now we use a copy of our normal db config...
db_params = DATABASES['default']
# ...just changing the name
db_params['NAME'] = testdb_name
# connect to this test db
testdb_url = 'postgresql+psycopg2://{USER}:{PASSWORD}@{HOST}:{PORT}/{NAME}'.format_map(db_params)
self.testdb_engine = create_engine( testdb_url )
print("TESTDB INIT: opened connection to database **%s**" % db_params['NAME'])
# we retrieve real tables declarations from our loaded Base
sqla_models = (Base.metadata.tables[model_name] for model_name in self.models)
# example: Base.metadata.tables['ngrams']
# ---------------------------------------
# Table('ngrams', Column('id', Integer(), table=<ngrams>, primary_key=True),
# Column('terms', String(length=255), table=<ngrams>),
# Column('n', Integer(), table=<ngrams>),
# schema=None)
# and now creation of each table in our test db (like dbmigrate)
for model in sqla_models:
try:
model.create(self.testdb_engine)
print('TESTDB INIT: created model: `%s`' % model)
except Exception as e:
print('TESTDB INIT ERROR: could not create model: `%s`, %s' % (model, e))
# we also create a session to provide it the way we usually do in garg
# (it's a class based static var to be able to share it with our tests)
GargTestRunner.testdb_session = scoped_session(sessionmaker(bind=self.testdb_engine))
# and let's create a user too otherwise we'll never be able to login
user = User.objects.create_user(username='pcorser', password='peter')
# old_config will be used by DiscoverRunner
# (to remove everything at the end)
return old_config
def teardown_databases(self, old_config, *args, **kwargs):
"""
After all tests
"""
# close the session
GargTestRunner.testdb_session.close()
# free the connection
self.testdb_engine.dispose()
# default django teardown performs destruction of the test base
super(GargTestRunner, self).teardown_databases(old_config, *args, **kwargs)
# snippets if we choose direct model building instead of setup() and Base.metadata.tables[model_name]
# from sqlalchemy.types import Integer, String, DateTime, Text, Boolean, Float
# from gargantext.models.nodes import NodeType
# from gargantext.models.hyperdata import HyperdataKey
# from sqlalchemy.schema import Table, Column, ForeignKey, UniqueConstraint, MetaData
# from sqlalchemy.dialects.postgresql import JSONB, DOUBLE_PRECISION
# from sqlalchemy.ext.mutable import MutableDict, MutableList
# Double = DOUBLE_PRECISION
# sqla_models = [i for i in sqla_models]
# print (sqla_models)
# sqla_models = [Table('ngrams', MetaData(bind=None), Column('id', Integer(), primary_key=True, nullable=False), Column('terms', String(length=255)), Column('n', Integer()), schema=None), Table('nodes', MetaData(bind=None), Column('id', Integer(), primary_key=True, nullable=False), Column('typename', NodeType()), Column('user_id', Integer(), ForeignKey('auth_user.id')), Column('parent_id', Integer(), ForeignKey('nodes.id')), Column('name', String(length=255)), Column('date', DateTime()), Column('hyperdata', JSONB(astext_type=Text())), schema=None), Table('contacts', MetaData(bind=None), Column('id', Integer(), primary_key=True, nullable=False), Column('user1_id', Integer(), primary_key=True, nullable=False), Column('user2_id', Integer(), primary_key=True, nullable=False), Column('is_blocked', Boolean()), Column('date_creation', DateTime()), schema=None), Table('nodes_nodes', MetaData(bind=None), Column('node1_id', Integer(), ForeignKey('nodes.id'), primary_key=True, nullable=False), Column('node2_id', Integer(), ForeignKey('nodes.id'), primary_key=True, nullable=False), Column('score', Float(precision=24)), schema=None), Table('nodes_ngrams', MetaData(bind=None), Column('node_id', Integer(), ForeignKey('nodes.id'), primary_key=True, nullable=False), Column('ngram_id', Integer(), ForeignKey('ngrams.id'), primary_key=True, nullable=False), Column('weight', Float()), schema=None), Table('nodes_nodes_ngrams', MetaData(bind=None), Column('node1_id', Integer(), ForeignKey('nodes.id'), primary_key=True, nullable=False), Column('node2_id', Integer(), ForeignKey('nodes.id'), primary_key=True, nullable=False), Column('ngram_id', Integer(), ForeignKey('ngrams.id'), primary_key=True, nullable=False), Column('score', Float(precision=24)), schema=None), Table('nodes_ngrams_ngrams', MetaData(bind=None), Column('node_id', Integer(), ForeignKey('nodes.id'), primary_key=True, nullable=False), Column('ngram1_id', Integer(), ForeignKey('ngrams.id'), primary_key=True, nullable=False), Column('ngram2_id', Integer(), ForeignKey('ngrams.id'), primary_key=True, nullable=False), Column('weight', Float(precision=24)), schema=None), Table('nodes_hyperdata', MetaData(bind=None), Column('id', Integer(), primary_key=True, nullable=False), Column('node_id', Integer(), ForeignKey('nodes.id')), Column('key', HyperdataKey()), Column('value_int', Integer()), Column('value_flt', DOUBLE_PRECISION()), Column('value_utc', DateTime(timezone=True)), Column('value_str', String(length=255)), Column('value_txt', Text()), schema=None)]
"""
BASIC UNIT TESTS FOR GARGANTEXT IN DJANGO
=========================================
"""
from django.test import TestCase
class NodeTestCase(TestCase):
def setUp(self):
from gargantext.models import nodes
self.node_1000 = nodes.Node(id=1000)
self.new_node = nodes.Node()
def test_010_node_has_id(self):
'''new_node.id'''
self.assertEqual(self.node_1000.id, 1000)
def test_011_node_write(self):
'''write new_node to DB and commit'''
from gargantext.util.db import session
self.assertFalse(self.new_node._sa_instance_state._attached)
session.add(self.new_node)
session.commit()
self.assertTrue(self.new_node._sa_instance_state._attached)
"""
ROUTE UNIT TESTS
================
"""
from django.test import TestCase
from django.test import Client
# to be able to create Nodes
from gargantext.models import Node
# to be able to compare in test_073_get_api_one_node()
from gargantext.constants import NODETYPES
# provides GargTestRunner.testdb_session
from unittests.framework import GargTestRunner
class RoutesChecker(TestCase):
def setUp(self):
"""
Will be run before each test
"""
self.client = Client()
# login with our fake user
response = self.client.post(
'/auth/login/',
{'username': 'pcorser', 'password': 'peter'}
)
print(response.status_code)
session = GargTestRunner.testdb_session
new_project = Node(
typename = 'PROJECT',
name = "hello i'm a project",
)
session.add(new_project)
session.commit()
self.a_node_id = new_project.id
print("created a project with id: %i" % new_project.id)
def test_071_get_front_page(self):
''' get the front page / '''
front_response = self.client.get('/')
self.assertEqual(front_response.status_code, 200)
self.assertIn('text/html', front_response.get('Content-Type'))
# we assume the page will always contain this title
self.assertIn(b'<h1>Gargantext</h1>', front_response.content)
def test_072_get_api_nodes(self):
''' get "/api/nodes" '''
api_response = self.client.get('/api/nodes')
self.assertEqual(api_response.status_code, 200)
# 1) check the type is json
self.assertTrue(api_response.has_header('Content-Type'))
self.assertIn('application/json', api_response.get('Content-Type'))
# 2) let's try to get things in the json
json_content = api_response.json()
json_count = json_content['count']
json_nodes = json_content['records']
self.assertEqual(type(json_count), int)
self.assertEqual(type(json_nodes), list)
print("\ntesting nodecount: %i " % json_count)
def test_073_get_api_one_node(self):
''' get "api/nodes/<node_id>" '''
# we first get one node id by re-running this bit from test_072
a_node_id = self.client.get('/api/nodes').json()['records'][0]['id']
one_node_route = '/api/nodes/%i' % a_node_id
# print("\ntesting node route: %s" % one_node_route)
api_response = self.client.get(one_node_route)
self.assertTrue(api_response.has_header('Content-Type'))
self.assertIn('application/json', api_response.get('Content-Type'))
json_content = api_response.json()
nodetype = json_content['typename']
nodename = json_content['name']
print("\ntesting nodename:", nodename)
print("\ntesting nodetype:", nodetype)
self.assertIn(nodetype, NODETYPES)
# TODO http://localhost:8000/api/nodes?types[]=CORPUS
# £TODO test request.*
# print ("request")
# print ("user.id", request.user.id)
# print ("user.name", request.user.username)
# print ("path", request.path)
# print ("path_info", request.path_info)