Commit f77b1e25 authored by Romain Loth

project conf: separate legends.json for facets + doc

parent b32b6401
......@@ -39,13 +39,16 @@ After the commits of the week of 26-30 June 2017, an easier structure for the
│       │ (examples under twlibs/default_hit_templates)
│       ├─ (etc.)
│       │
│       └─ project_conf.json <= to declare:
│ - node types
│ - colorings/legends of the node attributes
| - the associated databases for queries
| (and the ad hoc result templates)
│       ├─ project_conf.json <= to declare:
│ | - the node types
| | - the associated databases for queries
| | (and the ad hoc result templates)
| |
│       └─ legends.json <= to declare the colorings/legends
| (preprocessing of the node attributes)
|
├── server_menu.json <= list of gexf/json sources per project (optional: used to display a menu)
├── server_menu.json <= list of gexf/json sources per project
| (optional: to display a menu of the graphs)
├── favicon.ico
├── LICENSE
├── README.md
......
......@@ -2,17 +2,22 @@
The directories under `data/` on the app's server are called *project directories* and should contain:
- a graph file (ending in `.gexf` or `.json`)
- a `project_conf.json` to declare nodetypes and optionally link DBs to each graph file.
- optionally the said associated database of documents
- a `project_conf.json` to declare
- types of nodes in the input graph
- optionally linked DBs of documents for each graph file
- optionally the associated DBs of documents themselves
See for example `data/test/project_conf.json` in the project dir `test`.
NB: Remember that the `localfile` input mode cannot open project directories or project configurations.
Remarks:
- The `localfile` input mode can **not** open project directories or project-specific configurations (but the user can open a graph file and visualize it with the default configuration).
- ProjectExplorer also allows another conf file in the project directory, for data attribute preprocessing settings. The file is called `legends.json` and is documented separately under [data facets and legends](https://github.com/moma/ProjectExplorer/blob/master/00.DOCUMENTATION/C-advanced/data_facets_and_legends.json).
------------------------------------------------------
#### Minimal Config
One minimal entry contains, for each graph file of the project dir: a list of expected node types starting with 'node0' (**the nodetypes**)
One minimal example of `project_conf.json` contains, for each graph file of the project dir: a list of expected node types starting with 'node0' (**the nodetypes**)
```json
{
......@@ -190,135 +195,3 @@ An additional variable `${{score}}` is always available in the templating contex
In this last example, we have two nodetypes:
- node0 allows both CSV and twitter relatedDocs tabs.
- node1 allows only the CSV relatedDocs tab.
------------------------------------------------------
#### Configuring facets (node attributes) rendering
Your graph nodes may contain attributes (aka **data facets**) and project_conf can allow you to specify how to use them.
For instance let's assume a node in your gexf input file may contain something like this:
```xml
<node id="99262" label="entreprises">
<attvalues>
<attvalue for="modularity_class" value="3"/>
<attvalue for="age" value="2012"/>
</attvalues>
<viz:size value="100.0"/>
<viz:color r="0" g="173" b="38"/>
</node>
```
The input data here has two attributes: "age" and "modularity_class"
These attributes (attvalues) can be processed at input time to:
- color the nodes in the interface
- create a legend with close values grouped into [statistical bins](https://en.wikipedia.org/wiki/Data_binning) by defining intervals
- replace the attribute name by a human-readable label in the legend and menus
- find a title for each subgroup or class
This processing is the default and will take place anyway if the value `scanAttributes` is true in the global conf (`settings_explorerjs.js`).
But the project conf `project_conf.json` allows us to fine-tune this, by specifying `facets` properties in the node entry for a source in your project:
###### Example 1: gradient coloring and 4 bins
```
"facets": {
"age" : {
"legend": "Date d'entrée dans le corpus", <== label used for legends
"col": "gradient", <== coloring function
"binmode": "samerange", <== binning mode
"n": 4 <== optional: number of bins
}
}
```
Here, `age` is the name of the attribute in the original data.
For the `col` key, the available coloring functions are:
- `cluster`: for attributes describing *classes* (class names or class numbers, contrasted colors)
- `gradient`: for continuous variables (uniform map from light yellow to dark red)
- `heatmap`: for continuous variables (from blue-green to red-brown, centered on a white "neutral" color)
Binning can build the intervals with 3 strategies (`binmode` key):
- `samerange`: constant intervals between each bin (dividing the range into `n` equal intervals)
- `samepop`: constant cardinality inside each class (~ quantiles: dividing the range into `n` intervals with equal population)
- `off`: no binning (each value gets a different color)
###### Example 2: `cluster` coloring
```
"facets": {
"Modularity Class" : {
"legend": "Modules dans le graphe",
"col": "cluster",
"binmode": "off" <== no binning: values are kept intact
}
}
```
Remarks:
- `heatmap` coloring supports at most 24 bins.
- `legend` is optional
- `n` is not needed if `binmode` is off.
- Cluster coloring works best with no binning: each distinct value corresponds to a class and becomes a different color.
###### Real life example 1
```json
{
"ProgrammeDesCandidats.gexf": {
"node0": {
"name": "term",
"reldbs": {...},
"facets": {
"age" : {
"legend": "Date d'entrée dans le corpus",
"col": "gradient",
"binmode": "samerange",
"n": 4
},
"growth_rate" : {
"legend": "Tendances et oubliés de la semaine",
"col": "heatmap",
"binmode": "samepop",
"n": 11
},
"modularity_class" : {
"legend": "Modules dans le graphe",
"col": "cluster",
"binmode": "off"
}
}
}
}
}
```
###### Real life example 2
```json
{
"Maps_S_800.gexf": {
"node0": {
"name": "termsWhitelist",
"reldbs": {...},
"facets":{
"level": {"col": "heatmap" , "binmode": "off" },
"weight": {"col": "heatmap" , "n": 5, "binmode": "samerange" },
"period": {"col": "cluster" , "binmode": "off" },
"in-degree": {"col": "heatmap" , "n": 3, "binmode": "samepop" },
"out-degree": {"col": "heatmap" , "n": 3, "binmode": "samepop" },
"betweeness": {"col": "gradient", "n": 4, "binmode": "samepop" },
"cluster_label": {"col": "cluster" , "binmode": "off" },
"community_orphan":{"col": "cluster" , "binmode": "off" },
"cluster_universal_index": {"col": "cluster" ,"binmode": "off" }
}
}
}
}
```
NB: If an attribute is **not** described in `facets` and `TW.conf.scanAttributes` is true, the attribute will get `"gradient"` coloration by default and its distinct values will be counted:
- if there are few of them (fewer than 15), they won't be binned
- if there are many distinct values, they will be binned into 7 intervals
The global conf keys corresponding to this default behavior are `TWConf.legendBins` and `TWConf.maxDiscreteValues` in `settings_explorerjs.js`
For more information, see the [developer's manual](https://github.com/moma/ProjectExplorer/blob/master/00.DOCUMENTATION/C-advanced/developer_manual.md#exposed-facets-indices)
#### Configuring legends and facets (node attributes rendering)
Your graph nodes may contain attributes (aka **data facets**). By default, they will be colored using a gradient from light yellow to dark red.
**You can specify additional preprocessing, legends and coloring by creating a `legends.json` configuration file in your project directory.**
For instance let's assume a node in your gexf input file may contain something like this:
```xml
<node id="99262" label="entreprises">
<attvalues>
<attvalue for="modularity_class" value="3"/>
<attvalue for="age" value="2012"/>
</attvalues>
<viz:size value="100.0"/>
<viz:color r="0" g="173" b="38"/>
</node>
```
The input data here has two attributes: "age" and "modularity_class"
These attributes (attvalues) can be processed at input time to:
- color the nodes in the interface
- create a legend with close values grouped into [statistical bins](https://en.wikipedia.org/wiki/Data_binning) by defining intervals
- replace the attribute name by a human-readable label in the legend and menus
- find a title for each subgroup or class
This processing is the default and will take place anyway if the value `scanAttributes` is true in the global conf (`settings_explorerjs.js`).
But you can also create an additional conf file under `/data/yourproject/legends.json` to fine-tune the coloring and legends.
###### Example 1: gradient coloring and 4 bins
```
"age" : {
"legend": "Date d'entrée dans le corpus", <== label used for legends
"col": "gradient", <== coloring function
"binmode": "samerange", <== binning mode
"n": 4 <== optional: number of bins
}
```
Here, `age` is the name of the attribute in the original data.
For the `col` key, the available coloring functions are:
- `cluster`: for attributes describing *classes* (class names or class numbers, contrasted colors)
- `gradient`: for continuous variables (uniform map from light yellow to dark red)
- `heatmap`: for continuous variables (from blue-green to red-brown, centered on a white "neutral" color)
Binning can build the intervals with 3 strategies (`binmode` key):
- `samerange`: constant intervals between each bin (dividing the range into `n` equal intervals)
- `samepop`: constant cardinality inside each class (~ quantiles: dividing the range into `n` intervals with equal population)
- `off`: no binning (each value gets a different color)
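To make the difference between the two binning modes concrete, here is a minimal sketch. The helper names `sameRangeBins` and `samePopBins` are hypothetical and this is not ProjectExplorer's own implementation; it only illustrates the two strategies as they are described above.

```javascript
// Illustrative only: hypothetical helpers, not the ProjectExplorer code.

// samerange: split [min, max] into n intervals of equal width
function sameRangeBins(values, n) {
  const min = Math.min(...values)
  const max = Math.max(...values)
  const step = (max - min) / n
  const bins = Array.from({ length: n }, (_, i) => ({
    low: min + i * step,
    high: min + (i + 1) * step,
    members: []
  }))
  for (const v of values) {
    // the maximum value is clamped into the last bin
    const i = step === 0 ? 0 : Math.min(n - 1, Math.floor((v - min) / step))
    bins[i].members.push(v)
  }
  return bins
}

// samepop: sort the values and cut them into n groups of (nearly) equal size
function samePopBins(values, n) {
  const sorted = [...values].sort((a, b) => a - b)
  const bins = []
  for (let i = 0; i < n; i++) {
    const start = Math.floor(i * sorted.length / n)
    const end = Math.floor((i + 1) * sorted.length / n)
    bins.push(sorted.slice(start, end))
  }
  return bins
}

const ages = [2001, 2002, 2003, 2004, 2010, 2011, 2012, 2017]
console.log(sameRangeBins(ages, 4).map(b => b.members.length)) // [ 4, 0, 3, 1 ]
console.log(samePopBins(ages, 4).map(b => b.length))           // [ 2, 2, 2, 2 ]
```

With skewed data, `samerange` can leave some bins empty, while `samepop` keeps the populations balanced at the cost of uneven interval widths.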
###### Example 2: `cluster` coloring
```
"modularity_class" : {
"legend": "Modules dans le graphe",
"col": "cluster",
"binmode": "off" <== no binning: values are kept intact
}
```
Remarks:
- `legend` and `n` are optional
- `n` is not needed if `binmode` is off
- if `binmode` is not off, the default value for `n` is 7
- `cluster` coloring works best with no binning: each distinct value corresponds to a class and becomes a different color.
- `heatmap` coloring supports at most 24 bins.
-----------------------------------------------------
###### Real life example 1
```json
{
"age" : {
"legend": "Date d'entrée dans le corpus",
"col": "gradient",
"binmode": "samerange",
"n": 4
},
"growth_rate" : {
"legend": "Tendances et oubliés de la semaine",
"col": "heatmap",
"binmode": "samepop",
"n": 11
},
"modularity_class" : {
"legend": "Modules dans le graphe",
"col": "cluster",
"binmode": "off"
}
}
```
###### Real life example 2
```json
{
"level": {"col": "heatmap" , "binmode": "off" },
"weight": {"col": "heatmap" , "n": 5, "binmode": "samerange" },
"period": {"col": "cluster" , "binmode": "off" },
"in-degree": {"col": "heatmap" , "n": 11, "binmode": "samepop" },
"out-degree": {"col": "heatmap" , "n": 11, "binmode": "samepop" },
"betweeness": {"col": "gradient", "n": 4, "binmode": "samepop" },
"cluster_label": {"col": "cluster" , "binmode": "off" },
"community_orphan":{"col": "cluster" , "binmode": "off" },
"cluster_universal_index": {"col": "cluster" ,"binmode": "off" }
}
```
NB: If an attribute is **not** described in `legends.json` and `TW.conf.scanAttributes` is true, the attribute will get `"gradient"` coloration by default and its distinct values will be counted:
- if there are few of them (fewer than 15), they won't be binned
- if there are many distinct values, they will be binned into 7 intervals
The global conf keys corresponding to this default behavior are `TWConf.legendBins` and `TWConf.maxDiscreteValues` in `settings_explorerjs.js`
For more information, see the [developer's manual](https://github.com/moma/ProjectExplorer/blob/master/00.DOCUMENTATION/C-advanced/developer_manual.md#exposed-facets-indices)
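The default rule described in the NB above could be sketched as follows. This is an assumption-laden illustration, not the actual code: the function name `defaultFacetOptions` is hypothetical, the mapping of the two thresholds onto `maxDiscreteValues` and `legendBins` is inferred from the text, and the `samerange` mode used for the binned case is a guess, since the text does not say which `binmode` the default applies.

```javascript
// Hypothetical defaults mirroring the values quoted in the text (15 and 7)
const assumedDefaults = { maxDiscreteValues: 15, legendBins: 7 }

// Returns the facet options an undeclared attribute might receive
function defaultFacetOptions(values, conf = assumedDefaults) {
  const nDistinct = new Set(values).size
  if (nDistinct < conf.maxDiscreteValues) {
    // few distinct values: keep them as they are
    return { col: "gradient", binmode: "off" }
  }
  // many distinct values: bin them (binmode here is an assumption)
  return { col: "gradient", binmode: "samerange", n: conf.legendBins }
}

console.log(defaultFacetOptions([1, 2, 2, 3]))
console.log(defaultFacetOptions(Array.from({ length: 20 }, (_, i) => i)))
```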
{
"someOccs": {
"col": "gradient",
"n": 3,
"binmode": "samepop",
"legend": "Test Occurrences"
},
"country": {"col": "cluster" , "legend": "Country", "binmode": "off" },
"myValue": {
"col": "heatmap",
"n": 2,
"binmode": "samerange",
"legend": "Some Important Value"
}
}
......@@ -9,14 +9,6 @@
"template": "bib_details"
},
"twitter": {}
},
"facets": {
"someOccs": {
"col": "gradient",
"n": 3,
"binmode": "samepop",
"legend": "Test Occurrences"
}
}
},
"node1": {
......@@ -27,15 +19,6 @@
"qcols": ["author"],
"template": "bib_details"
}
},
"facets": {
"country": {"col": "cluster" , "legend": "Country", "binmode": "off" },
"myValue": {
"col": "heatmap",
"n": 2,
"binmode": "samerange",
"legend": "Some Important Value"
}
}
}
}
......
......@@ -353,10 +353,8 @@ function mainStartGraph(inFormat, inData, twInstance) {
let srcDirname = pathsplit[1] ;
let srcBasename = pathsplit[2] ;
// try and retrieve associated conf
[optNodeTypes,
optRelDBs,
optProjectFacets] = readProjectConf(srcDirname, srcBasename)
// try and retrieve associated project_conf.json
[optNodeTypes, optRelDBs] = readProjectConf(srcDirname, srcBasename)
// export to globals for getTopPapers and makeRendererFromTemplate
if (optRelDBs) {
......@@ -364,7 +362,10 @@ function mainStartGraph(inFormat, inData, twInstance) {
TW.Project = srcDirname
}
// same for facet options
// try and retrieve associated legends.json
optProjectFacets = readProjectFacetsConf(srcDirname, srcBasename)
// export to globals for facet options (merge with previous defaults)
if (optProjectFacets) {
TW.facetOptions = Object.assign(TW.facetOptions, optProjectFacets)
}
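Note that `Object.assign` performs a shallow merge: it mutates and returns its first argument, and an attribute declared in `legends.json` replaces the whole existing entry for that attribute rather than being merged key by key. A small standalone illustration (the sample objects are made up for this example):

```javascript
// Made-up sample data, only to show Object.assign semantics
const facetOptions = {
  age: { col: "gradient", binmode: "samerange", n: 7, legend: "Age" }
}
const fromLegendsJson = {
  age: { col: "heatmap", binmode: "samepop" },
  country: { col: "cluster", binmode: "off" }
}

const merged = Object.assign(facetOptions, fromLegendsJson)

console.log(merged === facetOptions) // true: the target object is mutated
console.log(merged.age)              // { col: 'heatmap', binmode: 'samepop' }
console.log(merged.age.n)            // undefined: the previous entry was replaced
console.log(merged.country.col)      // 'cluster'
```

If per-key fallbacks are wanted for a given attribute, they have to come from the coloring functions themselves, or from a deeper merge.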
......
......@@ -162,13 +162,10 @@ function readMenu(infofile) {
return [serverMenu, firstProject]
}
// read project_conf.json files in the project for this file
// read project_conf.json files in the project dir for this file
function readProjectConf(projectPath, filePath) {
let declaredNodetypes
let declaredDBConf
let declaredFacetsConf
// ££TODO declaredFacetOptions
let projectConfFile = projectPath + '/project_conf.json'
......@@ -234,25 +231,51 @@ function readProjectConf(projectPath, filePath) {
}
}
}
// optional facets -----------------------
if (confEntry[ndtype].facets) {
if (! declaredFacetsConf) declaredFacetsConf = {}
// POSS store facets conf by type ?
declaredFacetsConf = Object.assign(
declaredFacetsConf,
confEntry[ndtype].facets
)
}
// ----------------------------------------
}
}
}
}
return [declaredNodetypes, declaredDBConf, declaredFacetsConf]
return [declaredNodetypes, declaredDBConf]
}
// read optional legends.json file in the project dir for this file
function readProjectFacetsConf(projectPath, filePath) {
let declaredFacetsConf
let legendConfFile = projectPath + '/legends.json'
if (! linkCheck(legendConfFile)) {
console.log (`no legends.json next to the file ${filePath},
will try using default facet options`)
}
else {
if (TW.conf.debug.logFetchers)
console.info(`attempting to load legends conf ${legendConfFile}`)
var legconfRes = AjaxSync({ url: legendConfFile, datatype:"json" });
if (TW.conf.debug.logFetchers)
console.log('legends conf AjaxSync result legconfRes', legconfRes)
if (! legconfRes['OK']
|| ! legconfRes.data) {
console.warn (`legends.json in ${projectPath} is not valid json: skipped`)
}
else {
// load attributes params as they are
declaredFacetsConf = legconfRes.data
// (each coloring function has own fallbacks and checks on these params)
}
}
return declaredFacetsConf
}
// settings: {norender: Bool}
function cancelSelection (fromTagCloud, settings) {
if (TW.conf.debug.logSelections) { console.log("\t***in cancelSelection"); }
......