OpenMOLE: a Workflow Engine for Distributed Medical Image Analysis


Jonathan Passerat-Palmbach // Mathieu Leclaire // Romain Reuillon // Zehan Wang // Daniel Rueckert

HPC MICCAI workshop, Sept. 14th, 2014


Wang, Z., Bhatia, K.K., Glocker, B., de Marvao, A., Dawes, T., Kazunari, M., Mori, K., Rueckert, D.: Geodesic patch-based segmentation, MICCAI 2014

"14 hours for each abdominal image (with 50 atlases) using 16 CPU cores clocked at 2.8Ghz"

What does OpenMOLE do?

It transparently delegates computational loads to massively parallel environments
It exposes a workflow to describe an experiment

Embed your algorithm(s) as a black box

C
R
C++
Java
Scala
Scilab
Octave
Python
NetLogo
...

Workflow topology

A Task runs an executable;
it receives and produces Variables.
Variables travel from one Task to another by means of Transitions,
which may form Loops.
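The vocabulary above can be illustrated with a toy model in plain Scala. This is only a sketch of the concepts, not OpenMOLE's API: the names `ToyWorkflow`, `Task`, `Variables` and `chain` are hypothetical, and a transition is reduced to plain function composition.

```scala
object ToyWorkflow {
  // Variables are modelled as a simple name -> value map (an assumption).
  type Variables = Map[String, Int]

  // A task is a named function from input variables to output variables.
  final case class Task(name: String, run: Variables => Variables)

  // A transition hands the variables produced by one task to the next one.
  def chain(tasks: Task*): Variables => Variables =
    tasks.foldLeft((vs: Variables) => vs)((acc, t) => acc.andThen(t.run))

  val doubleTask    = Task("double", vs => vs + ("j" -> (vs("i") * 2)))
  val incrementTask = Task("increment", vs => vs + ("k" -> (vs("j") + 1)))

  val workflow: Variables => Variables = chain(doubleTask, incrementTask)

  def main(args: Array[String]): Unit =
    println(workflow(Map("i" -> 20))) // variables i, j and k after both tasks
}
```

Each task only sees the variables handed to it, which is what lets an engine move a task to another machine without changing the workflow description.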

Each task can run on:


multiple threads
remote SSH servers
PBS, SLURM, SGE, Condor, OAR clusters
EGI
DIRAC

Upscaling

Prototype small
Experiment large

OpenMOLE from the GUI

OpenMOLE from a Domain Specific Language


val i1 = Prototype[Int]("i1")
val i2 = Prototype[Int]("i2")
val j = Prototype[Int]("j")

val hello = GroovyTask("hello", "j = Model.compute(i1, i2)")

hello addInput i1
hello addInput i2
hello addOutput j
hello addLib "/path/to/model.jar"

val exploration = ExplorationTask(
    "exploration",
    Factor(i1, 0 to 100 by 2 toDomain) x 
    Factor(i2, new UniformIntDistribution take 10)
  )

val ex = exploration -< (hello by 10 on biomed) toExecution        
ex.start
                        

Can OpenMOLE serve the Medical Image Analysis community?

Yet another Workflow Engine?

  • Not community focused

  • Powerful DSL

  • Wide range of computing environments
  • Zero-deployment approach


Dinov I, Lozev K, Petrosyan P, Liu Z, Eggert P, et al., Neuroimaging Study Designs, Computational Analyses and Data Provenance Using the LONI Pipeline, PLoS ONE, 2010

Exploring the dataset of a Patch-Based Segmentation method in 4 steps

Step 1:

Run Everywhere

  • Static compilation

  • Language specific archive (jar)

  • Yapa (CDE)
    
yapa python ./scripts/abdominalwithreg.py --spatialWeight 7 --queryDilation \
4 --preDtErosion 2 --atlasBoundaryDilation 6 -k 40 --numAtlases 50 \
--patchSize 5 --spatialInfoType coordinates --resLevel 3 --numProcessors 1 \
--dtLabels 0 3 4 7 8 --savePath /vol/bitbucket/jpassera/PatchBasedStandAlone/AbdominalCT/Results \
--transformedImagesPath /vol/bitbucket/jpassera/AbdominalCT/TransformedImages \
--transformedLabelsPath /vol/bitbucket/jpassera/AbdominalCT/TransformedLabels \
--transformedGdtImagesPath /vol/bitbucket/jpassera/AbdominalCT/TransformedGdtImagesPath \
--transformedDtLabelsPath /vol/bitbucket/jpassera/AbdominalCT/TransformedDtLabels \
--fileName nusurgery001.512.nii.gz
    
    

Step 2:

Plan your exploration

  • Input variables of your application

  • Parameters to tune

       
val imageID = Prototype[String]("imageID")
val idsList = List( 1 to 200 by 2, 301 to 519 by 2 ).flatten.map (
                                                   "%03d" format _)

// construct the parameter set
val exploIDsTask = ExplorationTask (
  "exploIDs",
  Factor(
    imageID,
    idsList toDomain
  )
)
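To see what this exploration will iterate over, the ID list can be evaluated on its own in plain Scala, without OpenMOLE. The sketch below reuses the expression from the slide; the wrapper object `ImageIds` is a hypothetical name added for illustration.

```scala
object ImageIds {
  // Same expression as on the slide: odd IDs in [1, 199] and [301, 519],
  // each zero-padded to three digits.
  val idsList: List[String] =
    List(1 to 200 by 2, 301 to 519 by 2).flatten.map("%03d".format(_))

  def main(args: Array[String]): Unit = {
    println(idsList.take(3)) // List(001, 003, 005)
    println(idsList.size)    // 210
  }
}
```

The two ranges contribute 100 and 110 IDs, so the exploration produces 210 values of `imageID`, one job per image.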
    
    

Step 3:

Choose the appropriate execution platform

  • Local execution

  • Cluster

  • Grid

       
DiracAuthentication() = P12Certificate(encrypted,
  "/homes/jpassera/.globus/grid_certificate_uk_2014.p12")

val dirac = DIRACGliteEnvironment("biomed", "https://ccdirac06.in2p3.fr:9178",
  cpuTime = 4 hours)
    
    

Step 4:

Study your results

  • Results downloaded to your computer

  • Write the paper :)

       
val ex = exploIDsTask -< (cTask on dirac hook resultHook) toExecution
ex.start
    
    

Scalability

               
val imageID = Prototype[String]("imageID")
val level   = Prototype[Int]("level")
val idsList = List( 1 to 200 by 2, 301 to 519 by 2 ).flatten.map (
                                                   "%03d" format _)
val levelsList = 1 to 4 toList
        
// construct the parameter set
val exploIDsTask = ExplorationTask (
  "exploIDs",
  Factor(
    imageID,
    idsList toDomain
  ) x 
  Factor(
    level,
    levelsList toDomain
  )    
)
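The `x` operator combines the two factors into a full cross product. A plain-Scala sketch of the same parameter space, without OpenMOLE, shows how quickly the number of jobs grows; the object name `ParameterSpace` and the for-comprehension are illustrative assumptions, not the engine's implementation.

```scala
object ParameterSpace {
  // Image IDs and levels as defined on the slide.
  val idsList: List[String] =
    List(1 to 200 by 2, 301 to 519 by 2).flatten.map("%03d".format(_))
  val levelsList: List[Int] = (1 to 4).toList

  // Every image ID paired with every level.
  val combinations: List[(String, Int)] =
    for {
      id    <- idsList
      level <- levelsList
    } yield (id, level)

  def main(args: Array[String]): Unit = {
    println(combinations.head) // (001,1)
    println(combinations.size) // 840
  }
}
```

Adding one factor with four values turns 210 jobs into 840, while the environment definition stays unchanged.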

       
DiracAuthentication() = P12Certificate(encrypted,
  "/homes/jpassera/.globus/grid_certificate_uk_2014.p12")

val dirac = DIRACGliteEnvironment("biomed", "https://ccdirac06.in2p3.fr:9178",
  cpuTime = 4 hours)
    
    

Summary

  • Workflow engine

  • Suitable for Medical Image Analysis
    • Hundreds of MB of input data per subject
    • Hundreds of individual executions run
    • Equivalent of 2100 CPU-hours crunched in less than 48 hours on the EGI computing grid

  • Wide range of computing environments

  • Expressive workflows (GUI / DSL)

  • Zero-deployment approach
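As a rough reading of the grid figure above, and assuming the 2100 CPU-hours were spread evenly over the 48-hour window (an assumption, since the slides give no per-job timing), the average number of cores kept busy is:

```latex
\[
  \frac{2100\ \text{CPU-hours}}{48\ \text{hours}} \approx 43.75\ \text{cores}
\]
```

Since the run finished in less than 48 hours, this is a lower bound on the average parallelism achieved.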

What's next?

  • Package anywhere, run everywhere

  • Cloud-enabled version

Download: http://www.openmole.org

@OpenMOLE