State-of-the-art, scalable model evaluation

Mathieu Leclaire
Jonathan Passerat-Palmbach
Romain Reuillon

Purpose

  • Provide state-of-the-art exploration methods
  • Tackle scalability using distributed computing
  • Stay non-intrusive with respect to the model logic

1 - Model?



Anything you can launch that takes inputs and produces outputs

Zero deployment approach

  • User code is automatically deployed at runtime
  • No prior knowledge of remote environment needed
  • No installation required on any machine

Packaging

  • JVM => transparent
  • Native => packaged with CARE

Packaging with CARE

Applications have dependencies:
  • Shared libraries
  • Packages (Python, R, ...)
  • Low level system calls
  • Environment variables
  • Network ports
  • ...

CARE captures these dependencies and transfers them along with the application from one Linux machine to another
Works with almost any language / platform running on Linux

3 steps

  • Package it with CARE ⇒ it runs on any Linux machine
  • Write your OpenMOLE workflow
  • Click the run button

Work in the project

  • Docker tasks, for Docker-enabled environments
  • In user space: Singularity, udocker, Vagga?

2 - Method?


  • Data reconstruction
  • Parameter estimation
  • Sensitivity analysis
  • Robustness assessment
  • Optimisation
  • Diversity research
  • Hybrid, e.g. optimisation + diversity
The methods are designed to scale, handle stochasticity, and work with any model and any environment.
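Handling stochasticity usually means running the model several times with different random seeds and aggregating the outputs. The block below is a minimal plain-Scala sketch of that idea, not the OpenMOLE API: `noisyModel`, `median` and `replicated` are hypothetical names, and the choice of the median as the aggregate is an assumption.

```scala
import scala.util.Random

object Replication {
  // Hypothetical stochastic model: the output depends on the input and on a seed.
  def noisyModel(x: Double, seed: Long): Double = {
    val rng = new Random(seed)
    x * 2 + rng.nextGaussian() * 0.1
  }

  // Median of a non-empty sequence of values.
  def median(values: Seq[Double]): Double = {
    val sorted = values.sorted
    val n = sorted.size
    if (n % 2 == 1) sorted(n / 2)
    else (sorted(n / 2 - 1) + sorted(n / 2)) / 2
  }

  // Draw one seed per replication from a master generator,
  // evaluate the model once per seed, and aggregate the results.
  def replicated(x: Double, masterSeed: Long, replications: Int): Double = {
    val master = new Random(masterSeed)
    val seeds = Seq.fill(replications)(master.nextLong())
    median(seeds.map(seed => noisyModel(x, seed)))
  }
}
```

Because every replication is independent of the others, the evaluations can be distributed freely; only the aggregation needs all the results.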

Master/slave

Example: genetic algorithm

Genetic Algorithm
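As a rough illustration of a genetic algorithm that fits a master/slave organisation, here is a steady-state loop in plain Scala: each step breeds one offspring, evaluates it, and keeps the best individuals. This is a sequential sketch under stated assumptions, not the OpenMOLE implementation; `fitness`, `mutate` and `evolve` are hypothetical names, and the fitness function is a toy one.

```scala
import scala.util.Random

object SteadyStateGA {
  // Hypothetical fitness to minimise: squared distance to 0.5.
  // In a distributed run, this evaluation is what a slave would compute.
  def fitness(x: Double): Double = (x - 0.5) * (x - 0.5)

  // Gaussian mutation, clamped to the genome bounds [0, 1].
  def mutate(x: Double, rng: Random): Double =
    math.min(1.0, math.max(0.0, x + rng.nextGaussian() * 0.1))

  // Steady state: one offspring per step, then elitist truncation to mu individuals.
  def evolve(mu: Int, steps: Int, seed: Long): Vector[Double] = {
    val rng = new Random(seed)
    var population = Vector.fill(mu)(rng.nextDouble())
    for (_ <- 1 to steps) {
      val parent = population(rng.nextInt(population.size))
      val child = mutate(parent, rng)
      population = (population :+ child).sortBy(fitness).take(mu)
    }
    population
  }
}
```

The master never waits for a whole generation in this scheme, which is why a steady-state loop suits environments where evaluations finish at different times.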

Diversity research

Work in the project

  • Adapt / design methods to explore / understand / simplify the models?
  • Machine learning to speed up the exploration of scenarios in the production platform?

3 - Execution environment?


Prototype Small, Scale for Free

Transparent access

  • Access resources as the user would
  • Use the user's own credentials
  • No preliminary step

Automatic data transfers
(+ Replica management)

File and folder transfers are handled transparently by OpenMOLE

Today

Multi-thread
Delegation through SSH
PBS (over SSH)
SLURM (over SSH)
Condor (over SSH)
SGE (over SSH)
OAR (over SSH)
EGI Grid (through DIRAC)
Ad hoc desktop grid

Grid Computing

4000 cores, but with some technical and legal limitations

In the project

  • Academic cloud?
  • Kubernetes?
  • Commercial Docker engines?

And now: examples!

The terminology



The interface



A workflow

val i   = Val[Double]
val res = Val[Double]

val exploration = ExplorationTask ( i in (0.0 to 10.0 by 1.0) )

val model =
  ScalaTask ("val res = i * 2") set (
    inputs  += i,
    outputs += (i, res)
  )

val env = LocalEnvironment(5)

exploration -< (model on env)
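For readers unfamiliar with the DSL, the workflow above is roughly equivalent to the following plain-Scala sketch: sample `i` from 0.0 to 10.0, apply the model to each value, and run the evaluations on a pool of 5 threads. The names `LocalExploration`, `model` and `run` are hypothetical, and this only mimics the behaviour; it is not how OpenMOLE is implemented.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object LocalExploration {
  // Same computation as the ScalaTask: res = i * 2.
  def model(i: Double): Double = i * 2

  // Evaluate the model for i = 0.0, 1.0, ..., 10.0 on a pool of `threads` threads.
  def run(threads: Int): Seq[(Double, Double)] = {
    val pool = Executors.newFixedThreadPool(threads)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      val sampling = (0 to 10).map(_.toDouble)
      // One independent job per sampled value, each returning (i, res).
      val jobs = sampling.map(i => Future((i, model(i))))
      Await.result(Future.sequence(jobs), 1.minute)
    } finally pool.shutdown()
  }
}
```

Calling `LocalExploration.run(5)` returns the eleven `(i, res)` pairs in sampling order.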

Scale Up!

val i   = Val[Double]
val res = Val[Double]

val exploration = ExplorationTask ( i in (0.0 to 10.0 by 1.0) )

val model =
  ScalaTask ("val res = i * 2") set (
    inputs  += i,
    outputs += (i, res)
  )

val env = EGIEnvironment("biomed")

exploration -< (model on env)

Native code

care -o hello.tgz.bin python hello.py 42 test.txt

val arg = Val[Int]
val output = Val[File]

val pythonTask =
  CARETask(
    workDirectory / "hello.tgz.bin",
    "python hello.py ${arg} output.txt") set (
    inputs += arg,
    outputFiles += ("output.txt", output),
    outputs += arg
  )

val exploration = ExplorationTask(arg in (0 to 10))

val copy = CopyFileHook(output, workDirectory / "hello${arg}.txt")

exploration -< (pythonTask hook copy)

Optimisation

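// Assumed declarations, not shown on the original slide.
val x    = Val[Double]
val y    = Val[Double]
val o1   = Val[Double]
val o2   = Val[Double]
val seed = Val[Long]
// `model` is assumed to be a task taking x, y and seed as inputs and producing o1 and o2.

val optimisation =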
  SteadyStateEvolution(
    algorithm =
      NSGA2(
        mu = 1000,
        genome = Seq(x in (0.0, 1.0), y in (0.0, 1.0)),
        objectives = Seq(o1, o2),
        replication = Replication(seed = seed, max = 100)
      ),
    evaluation = model,
    termination = 10 hours
  )

val cloud = GoogleContainerEngine(....)

(optimisation on cloud)
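With two objectives such as `o1` and `o2`, there is generally no single best solution; NSGA2 ranks candidates by Pareto dominance. The following plain-Scala sketch shows the dominance test and the extraction of the non-dominated front, assuming both objectives are minimised. `Pareto`, `dominates` and `front` are hypothetical names, not part of the OpenMOLE API.

```scala
object Pareto {
  // a dominates b when a is no worse on every objective
  // and strictly better on at least one (minimisation).
  def dominates(a: Seq[Double], b: Seq[Double]): Boolean =
    a.zip(b).forall { case (x, y) => x <= y } &&
      a.zip(b).exists { case (x, y) => x < y }

  // Keep the points that no other point dominates.
  def front(points: Seq[Seq[Double]]): Seq[Seq[Double]] =
    points.filter(p => !points.exists(q => dominates(q, p)))
}
```

For example, among the objective vectors (1, 3), (2, 2) and (3, 3), the first two form the front and the third is dominated by both.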

Integrate with the 4City platform

  • Docker support
  • Environment for commercial Docker engines
  • REST API
  • Something else?

Useful Links

Documentation www.openmole.org
Development version next.openmole.org
Source code github.com/openmole
Market place github.com/openmole-market
Demo instance demo.openmole.org

Thanks!

romain.reuillon@iscpif.fr
mathieu.leclaire@iscpif.fr
j.passerat-palmbach@imperial.ac.uk