Framework for running simulations on a cluster computing system

This function allows for simulations to be run in parallel on a cluster computing system (CCS). See the Parallelization vignette for a detailed overview of how CCS parallelization works in SimEngine. run_on_cluster acts as a wrapper for the code in your simulation, organizing the code into three sections, labeled "first" (code that is run once at the start of the simulation), "main" (running the simulation script repeatedly), and "last" (code to process or summarize simulation results). This function is to be used in conjunction with job scheduler software (e.g., Slurm or Oracle Grid Engine) to divide the simulation into tasks that are run in parallel on the CCS. See the Parallelization documentation for a detailed overview of how CCS parallelization works in SimEngine. run)), and "last" (usually code to process or summarize simulation results). This function interacts with cluster job scheduler software (e.g. Slurm or Oracle Grid Engine) to divide parallel tasks over cluster nodes.

Usage

run_on_cluster(first, main, last, cluster_config)

Arguments

first: Code to run at the start of a simulation. This should be a block of code enclosed by curly braces that creates and sets up a simulation object.
main: Code that will run for every simulation replicate. This should be a block of code enclosed by curly braces , and will typically be a single line of code calling the run) function. This code block will have access to the simulation object you created in the 'first' code block, but any changes made here to the simulation object will not be saved.
last: Code that will run after all simulation replicates have been run. This should be a block of code enclosed by curly braces that processes your simulation object (which at this point will contain your results), which may involve calls to summarize, creation of plots, and so on.
cluster_config: A list of configuration options. You must specify either js (the job scheduler you are using) or tid_var (the name of the environment variable that your task ID is stored in); see examples. Run js_support() to see a list of job schedulers that are currently supported. You can optionally also specify dir, which is a character string representing a path to a directory on the CCS; this directory will serve as your working directory and hold your simulation object and all temporary objects created by SimEngine. If unspecified, this defaults to the working directory of the R script that contains your simulation code).

Examples

if (FALSE) { # \dontrun{
# The following code is saved in a file called my_simulation.R:
library(SimEngine)
run_on_cluster(
  first = {
    sim <- new_sim()
    create_data <- function(n) { return(rpois(n=n, lambda=20)) }
    est_lambda <- function(dat, type) {
      if (type=="M") { return(mean(dat)) }
      if (type=="V") { return(var(dat)) }
    }
    sim %<>% set_levels(estimator = c("M","V"), n = c(10,100,1000))
    sim %<>% set_script(function() {
      dat <- create_data(L$n)
      lambda_hat <- est_lambda(dat=dat, type=L$estimator)
      return(list("lambda_hat"=lambda_hat))
    })
    sim %<>% set_config(num_sim=100, n_cores=20)
  },
  main = {
    sim %<>% run()
  },
  last = {
    sim %>% summarize()
  },
  cluster_config = list(js="slurm")
)

# The following code is saved in a file called run_sim.sh:
# #!/bin/bash
# Rscript my_simulation.R

# The following lines of code are run on the CCS head node:
# sbatch --export=sim_run='first' run_sim.sh
# sbatch --export=sim_run='main' --array=1-20 --depend=afterok:101 run_sim.sh
# sbatch --export=sim_run='last' --depend=afterok:102 run_sim.sh
} # }