This is the source code for our OOPSLA 2017 paper, "Monadic composition for deterministic, parallel batch processing".
Our DetFlow system allows for deterministic execution of batch processing workloads like software builds and bioinformatics data pipelines. DetFlow consists of two parts: 1) a parallel coordinator process written in a deterministic language (in our case, Haskell) to support low-overhead deterministic parallelism and 2) a sandbox for black-box binary software that supports legacy programs but runs them sequentially. For a software build, the make process is the coordinator, and each build rule (running gcc, say) runs in a sandbox. DetFlow allows multiple sandbox instances to run in parallel, thereby achieving good scalability on multicore hardware.
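
The coordinator/sandbox split described above can be illustrated with a small sketch. This is not DetFlow code and does not use its API: it is written in Python rather than Haskell for brevity, and the names `run_sandboxed` and `coordinate` are hypothetical. The "sandbox" here is only a plain subprocess call, so it does not enforce determinism the way a real sandbox would; the sketch shows only the shape of the idea, namely that each task runs sequentially on its own, several tasks run in parallel, and the coordinator collects results in a fixed order that does not depend on which task finishes first.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_sandboxed(cmd):
    """Hypothetical stand-in for a sandbox: run one command to completion
    and return its standard output. A real sandbox would also control
    sources of nondeterminism; this placeholder does not."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def coordinate(commands, max_workers=4):
    """Hypothetical coordinator: run independent commands in parallel.
    Executor.map yields results in input order, so the returned list
    does not depend on the order in which the commands finish."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_sandboxed, commands))


if __name__ == "__main__":
    # Each "build rule" is a tiny command; a real build would invoke gcc here.
    rules = [[sys.executable, "-c", f"print({n} * {n})"] for n in range(4)]
    print(coordinate(rules))
```

Collecting results in input order rather than completion order is the design choice worth noting: it keeps the coordinator's output stable across runs even though the scheduling of the parallel tasks is not.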