OpenMOLE: a workflow system for massively distributed executions
Romain Reuillon
Institut des Systèmes Complexes, Paris Île-de-France
Complex systems
The interactions matter
Large scale experiments on data / models
Naturally parallel algorithms => leverage parallelism.
- Data processing
- Design of experiments on a space of parameters
- Model / algorithm calibration
- Optimisation / genetic algorithms
- Sensitivity / robustness analysis
Generic solution for data driven parallelism
It implements a workflow formalism for massively distributed algorithms.
It transparently delegates the computational load to massively parallel environments.
A component oriented formalism...
... that scales!
Prototype small
(on your laptop)
Experiment large
(millions of jobs)
Embed an execution component as a black box
C
R
C++
Java
Scala
Scilab
Octave
Python
NetLogo
...
Provide executable commands, inputs and outputs with it.
Describe the way to explore your space of parameters.
Connect the tasks and build a workflow.
Assign execution environments to tasks.
Run and control the workflow execution.
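// Typed variables (prototypes) exchanged between tasks
val i1 = Prototype[Int]("i1")
val i2 = Prototype[Int]("i2")
val j = Prototype[Int]("j")

// Task wrapping the model: reads i1 and i2, writes j
val hello = GroovyTask("hello", "j = Model.compute(i1, i2)")
hello addInput i1
hello addInput i2
hello addOutput j
hello addLib "/path/to/model.jar"

// Design of experiments: the cross product of the two factors
val exploration = ExplorationTask(
  "exploration",
  Factor(i1, 0 to 100 by 2 toDomain) x
  Factor(i2, new UniformIntDistribution take 10)
)

// Run hello for each sampled point, by groups of 10, on the biomed environment
val ex = exploration -< (hello by 10 on biomed) toExecution
ex.start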
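To make the size of this exploration concrete, here is a minimal sketch in plain Scala collections. It does not use the OpenMOLE API: `ExplorationSketch`, `Sample`, `design`, `batches` and the stand-in `compute` are hypothetical names, and it assumes that `x` denotes the cross product of the two factors and that `by 10` groups ten runs into one submitted job.

```scala
import scala.util.Random

object ExplorationSketch {
  // One point of the parameter space: a value for i1 and a value for i2.
  final case class Sample(i1: Int, i2: Int)

  // Hypothetical stand-in for the black-box model shipped in model.jar.
  def compute(i1: Int, i2: Int): Int = math.max(i1, i2)

  // Full-factorial design: every i1 value crossed with every i2 draw.
  def design(seed: Long): Vector[Sample] = {
    val rng = new Random(seed)
    val i1Values = (0 to 100 by 2).toVector        // 51 values: 0, 2, ..., 100
    val i2Values = Vector.fill(10)(rng.nextInt())  // 10 uniformly drawn integers
    for (a <- i1Values; b <- i2Values) yield Sample(a, b)
  }

  // Group the samples into batches of `size`, one batch per submitted job.
  def batches(samples: Vector[Sample], size: Int): Vector[Vector[Sample]] =
    samples.grouped(size).toVector

  def main(args: Array[String]): Unit = {
    val samples = design(seed = 42L)
    val jobs = batches(samples, 10)
    // Locally the batches run one after the other; a grid would run them in parallel.
    val results = jobs.flatMap(_.map(s => compute(s.i1, s.i2)))
    println(s"${samples.size} samples, ${jobs.size} jobs, ${results.size} results")
  }
}
```

Under these assumptions the design holds 51 x 10 = 510 points, submitted as 51 jobs of 10 runs each.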
Download: http://www.openmole.org
Chromosome structuring
C++, 2 days per simulation, 1600 simulations, 8.5 years / CPU
Junier et al., CTCF-mediated transcriptional regulation through cell type-specific chromosome organization in the β-globin locus, Nucleic Acids Research, 2012.
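The order of magnitude of that CPU budget can be checked with a few lines of Scala. This is only an illustrative sketch: `CpuTime`, `cpuYears` and `wallClockDays` are hypothetical helpers, the 800-core figure is an arbitrary assumption, and the wall-clock estimate ignores queueing, transfers and failures.

```scala
object CpuTime {
  val DaysPerYear = 365.25

  // Sequential CPU time, in years, for `runs` simulations of `daysPerRun` days each.
  def cpuYears(runs: Long, daysPerRun: Double): Double =
    runs * daysPerRun / DaysPerYear

  // Idealized wall-clock time, in days, when the runs are spread evenly over `cores` cores.
  def wallClockDays(runs: Long, daysPerRun: Double, cores: Int): Double =
    math.ceil(runs.toDouble / cores) * daysPerRun

  def main(args: Array[String]): Unit = {
    // 1600 runs of 2 days each: 3200 CPU-days, a little under 9 years on one CPU.
    println(f"sequential: ${cpuYears(1600L, 2.0)}%.2f years")
    // On a hypothetical 800 cores, each core runs 2 simulations back to back: 4 days.
    println(f"on 800 cores: ${wallClockDays(1600L, 2.0, 800)}%.1f days")
  }
}
```

The sequential estimate, about 8.8 years, is in line with the figure of roughly 8.5 years quoted above; the gap between years and days is what the distributed execution is meant to close.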
SimTRAP project
NetLogo, 5 minutes per simulation, 100000 simulations, 1 year / CPU
PhD thesis of J. Figuel, Modélisation et simulation des comportements piétonniers dans les espaces de transport – Application aux échanges quai / train de voyageurs.
Simpop project
Scala, 5 minutes per simulation, 360 000 000 simulations, 22 years / CPU
Reuillon et al., Algorithmes évolutionnaires sur grille de calcul pour le calibrage de modèles géographiques, proceedings of France Grilles 2012.
Bioemergences project
C / C++ / Python, web service, image processing, daily production
3-month release cycle
- June 2012: Boundless Bamboo (0.5)
- September 2012: Crazy Coconut (0.6)
- January 2013: Daddy Django (0.7)
- April 2013: Elastic Earth (0.8)
- Scalable and reliable.
- Ergonomic (console and GUI).
- Embeddable.
Coming versions
- Integration of state-of-the-art methods using HPC to study models.
- Place the model at the center.
- OpenMOLE server.
Join us on the users@list.openmole.org mailing list
and on openmole.org