Your graph nodes may carry attributes (aka **data facets**), and `project_conf.json` lets you specify how they are used.
For instance, assume a node in your GEXF input file looks like this:
```xml
<node id="99262" label="entreprises">
  <attvalues>
    <attvalue for="modularity_class" value="3"/>
    <attvalue for="age" value="2012"/>
  </attvalues>
  <viz:size value="100.0"/>
  <viz:color r="0" g="173" b="38"/>
</node>
```
The input data here has two attributes: "age" and "modularity_class".
These attributes (attvalues) can be processed at input time to:
- color the nodes in the interface
- create a legend with close values grouped into [statistical bins](https://en.wikipedia.org/wiki/Data_binning) by defining intervals
- replace the attribute name by a human-readable label in the legend and menus
- find a title for each subgroup or class
This processing is the default: it takes place anyway if the value `scanAttributes` is true in the global conf (`settings_explorerjs.js`).
The project conf `project_conf.json` lets you fine-tune it by specifying `facets` properties in the node entry for a source in your project:
###### Example 1: gradient coloring and 4 bins
```
"facets": {
  "age" : {
    "legend": "Date d'entrée dans le corpus",   <== label used for legends
    "col": "gradient",                           <== coloring function
    "binmode": "samerange",                      <== binning mode
    "n": 4                                       <== optional: number of bins
  }
}
```
Here, `age` is the name of the attribute in the original data.
For the `col` key, the available coloring functions are:
- `cluster`: for attributes describing *classes* (class names or class numbers, contrasted colors)
- `gradient`: for continuous variables (uniform map from light yellow to dark red)
- `heatmap`: for continuous variables (from blue-green to red-brown, centered on a white "neutral" color)
Binning can build the intervals with 3 strategies (`binmode` key):
- `samerange`: constant interval width for each bin (dividing the range into `n` equal intervals)
- `samepop`: constant cardinality inside each class (~ quantiles: dividing the range into `n` intervals with equal population)
- `off`: no binning (each value gets a different color)
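The difference between `samerange` and `samepop` is easiest to see on a small sample. The following is only a sketch of the two strategies as described above: the function names `binSameRange` and `binSamePop`, the returned `{low, high, vals}` shape, and the sample values are hypothetical and are not taken from the ProjectExplorer code.

```javascript
// Sketch only: names and return shapes are hypothetical,
// not ProjectExplorer's actual implementation.

// samerange: split [min, max] into n intervals of equal width.
function binSameRange(values, n) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const width = (max - min) / n;
  const bins = Array.from({ length: n }, (_, i) => ({
    low: min + i * width,
    high: i === n - 1 ? max : min + (i + 1) * width,
    vals: []
  }));
  for (const v of values) {
    // the maximum value is clamped into the last bin
    const i = width === 0 ? 0 : Math.min(n - 1, Math.floor((v - min) / width));
    bins[i].vals.push(v);
  }
  return bins;
}

// samepop: sort the values and cut them into n groups of (almost) equal size.
function binSamePop(values, n) {
  const sorted = [...values].sort((a, b) => a - b);
  const bins = [];
  for (let i = 0; i < n; i++) {
    const start = Math.floor(i * sorted.length / n);
    const end = Math.floor((i + 1) * sorted.length / n);
    const vals = sorted.slice(start, end);
    bins.push({ low: vals[0], high: vals[vals.length - 1], vals });
  }
  return bins;
}

// Made-up sample: "age" values with a gap between 2006 and 2011.
const ages = [2001, 2002, 2003, 2004, 2005, 2006, 2011, 2012];
const rangeCounts = binSameRange(ages, 4).map(b => b.vals.length);
const popCounts = binSamePop(ages, 4).map(b => b.vals.length);
console.log(rangeCounts); // populations per bin: 3, 3, 0, 2
console.log(popCounts);   // populations per bin: 2, 2, 2, 2
```

With `samerange` the gap in the data leaves one bin empty, whereas `samepop` keeps every legend item populated at the cost of uneven interval widths.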
###### Example 2: `cluster` coloring
```
"facets": {
  "Modularity Class" : {
    "legend": "Modules dans le graphe",
    "col": "cluster",
    "binmode": "off"         <== no binning: values are kept intact
  }
}
```
Remarks:
- Heatmap coloring supports at most 24 bins.
- `legend` is optional.
- `n` is not needed if `binmode` is `off`.
- Cluster coloring works best with no binning: each distinct value corresponds to a class and becomes a different color.
NB: If an attribute is **not** described in `facets` and `TW.conf.scanAttributes` is true, the attribute gets `"gradient"` coloring by default and its distinct values are counted:
- if there are few of them (fewer than 15), they won't be binned
- if there are many distinct values, they will be binned into 7 intervals
The global conf keys that control this default behavior are `TWConf.legendBins` and `TWConf.maxDiscreteValues` in `settings_explorerjs.js`.
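As a rough illustration of this fallback, the decision can be sketched as below. This is not the project's code: `defaultFacetOptions` and the `binned` field are hypothetical names, the defaults 15 and 7 are simply the values quoted above, and the behavior at exactly 15 distinct values is an assumption. The binning mode used by default is not stated here, so the sketch leaves it out.

```javascript
// Hypothetical helper illustrating the default behavior described above.
// maxDiscreteValues and legendBins mirror the two global conf keys;
// their defaults (15 and 7) are the values quoted in the text.
function defaultFacetOptions(values, maxDiscreteValues = 15, legendBins = 7) {
  const nDistinct = new Set(values).size;
  // Assumption: binning starts strictly above maxDiscreteValues.
  if (nDistinct > maxDiscreteValues) {
    return { col: "gradient", binned: true, n: legendBins };
  }
  // Few distinct values: each one keeps its own legend item.
  return { col: "gradient", binned: false };
}

const fewValues = ["3", "5", "3", "7"];
const manyValues = Array.from({ length: 40 }, (_, i) => String(1980 + i));
console.log(defaultFacetOptions(fewValues));
console.log(defaultFacetOptions(manyValues));
```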
For more information, see the [developer's manual](https://github.com/moma/ProjectExplorer/blob/master/00.DOCUMENTATION/C-advanced/developer_manual.md#exposed-facets-indices)
2. calls [`sigmaUtils`], where the function `FillGraph()` used to be a central point for filtering and preparing properties; now, with 2 and 3, it just creates a filtered copy of the nodes and edges of the currently active types in a new structure that groups them together (POSSIBLE: remove this extra step)
...
...
@@ -46,7 +46,7 @@ This will still evolve but the main steps for any graph initialization messily u
- if the category name is "document" => catSoc (type 1)
-`somenode.attributes`: the `attributes` property is always an object
- any attribute listed in the sourcenode.attributes will be indexed if the TW.scanClusters flag is true
- any attribute listed in the sourcenode.attributes will be indexed if the TW.scanAttributes flag is true
- data type and style of processing (for heatmap, or for classes, etc.) should be stipulated in settings (cf. **data facets** below)
...
...
@@ -107,14 +107,19 @@ The values can be binned or not and can be linked to different color schemes:
- 'samepop': constant cardinality inside each class (~ quantiles)
- 'off' : no binning (each distinct value will be a legend item)
These choices can be specified in the conf `facetOptions` entry.
These choices can be specified in each project_conf.json under the `facets` entry.
If an attribute is **not** described in `facetOptions`, it will get `"gradient"` coloration and will be binned iff it has more distinct values than `maxDiscreteValues`, into `legendBins` intervals.
If an attribute is **not** described in `project_conf.json`, it will get `"gradient"` coloration and will be binned iff it has more distinct values than `maxDiscreteValues`, into `legendBins` intervals.
These indexes are stored in TW.Clusters and provide an access to sets of nodes that have a given value or range of values
- the mapping from attribute values to matching nodes is always in `TW.Clusters.aType.anAttr.invIdx.aClass.nids`
The allowed coloring functions are declared in TW.gui.colorFuns in `environment.js`.
#### Exposed facets indices
A faceted index is an index "value of an attribute" => nodes having this value.
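To make the idea concrete, here is a minimal sketch of such an inverted index for a single attribute. It only borrows the `invIdx` and `nids` names from this manual; the `buildInvIdx` function, the node shape and the sample node ids are made up for illustration and do not reproduce the real indexing code.

```javascript
// Hypothetical sketch: build "attribute value => node ids" for one attribute.
function buildInvIdx(nodes, attrName) {
  // null prototype so that values like "constructor" cannot collide
  const invIdx = Object.create(null);
  for (const node of nodes) {
    const val = node.attributes[attrName];
    if (val === undefined) continue; // this node has no such attribute
    if (!invIdx[val]) invIdx[val] = { nids: [] };
    invIdx[val].nids.push(node.id);
  }
  return invIdx;
}

// Made-up sample nodes (attribute values are strings, as in gexf).
const sampleNodes = [
  { id: "99262", attributes: { modularity_class: "3", age: "2012" } },
  { id: "99263", attributes: { modularity_class: "3", age: "2010" } },
  { id: "99264", attributes: { modularity_class: "5" } }
];
const classIdx = buildInvIdx(sampleNodes, "modularity_class");
console.log(classIdx["3"].nids); // ids of the nodes whose class is "3"
```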
These indexes are stored in the exposed `TW.Facets` variable by parseCustom time and provide an access to sets of nodes that have a given value or range of values
- the mapping from attribute values to matching nodes is always in `TW.Facets.aType.anAttr.invIdx.aClass.nids`
(where aClass is the chosen interval or distinct value)
- the datatype of the observed values is in `TW.Clusters.aType.anAttr.meta`
- the datatype of the observed values is in `TW.Facets.aType.anAttr.meta`
- the source datatype is always string in gexf, but the real type ("vtype") can be numeric
- (i.e. the numeric cast doesn't give NaN, or does so only very rarely over the values)
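The numeric-cast test mentioned in the last point could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `guessVtype` name, the `"num"` / `"str"` return values and the 5% tolerance `maxNaNRatio` are not documented settings.

```javascript
// Hypothetical sketch: decide whether string values are "really" numeric.
// maxNaNRatio is an assumed tolerance, not a documented setting.
function guessVtype(strValues, maxNaNRatio = 0.05) {
  if (strValues.length === 0) return "str";
  let nanCount = 0;
  for (const s of strValues) {
    // Number("") is 0, so blank strings are counted as non-numeric explicitly
    if (s.trim() === "" || Number.isNaN(Number(s))) nanCount++;
  }
  return nanCount / strValues.length <= maxNaNRatio ? "num" : "str";
}

console.log(guessVtype(["2012", "2010", "3"]));  // all values cast cleanly
console.log(guessVtype(["entreprises", "3"]));   // half of the values give NaN
```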