documentation of array and packed jobs

2019-10-11 16:00:18 +02:00
parent ba94e64988
commit ce72b9e9c2
1 changed files with 109 additions and 0 deletions
--- a/merlin6-slurm/slurm-examples.md
+++ b/merlin6-slurm/slurm-examples.md
@@ -153,3 +153,112 @@ options can be found in the following link: https://slurm.schedmd.com/sbatch.htm

 If you have questions about how to properly execute your jobs, please contact us through merlin-admins@lists.psi.ch. Do not run
 advanced configurations unless you are sure of what you are doing.
+
+## Array Jobs - how to launch a large number of similar jobs
+
+If you need to run a large number of jobs using the same program with systematically varying inputs,
+e.g. a parameter sweep, the easiest way is a **simple array job**:
+
+``` bash
+#!/bin/bash
+#SBATCH --job-name=test-array
+#SBATCH --partition=daily
+#SBATCH --ntasks=1
+#SBATCH --time=08:00:00
+#SBATCH --array=1-8
+
+echo $(date) "I am job number ${SLURM_ARRAY_TASK_ID}"
+srun myprogram config-file-${SLURM_ARRAY_TASK_ID}.dat
+
+```
+
+This will run 8 independent jobs, where each job can use the counter variable `SLURM_ARRAY_TASK_ID`
+to feed the correct input arguments or configuration file to the "myprogram" executable. Each job
+will receive the same set of configurations (e.g. time limit of 8h in the example above).
+
+The jobs are independent, but they will run in parallel (if the cluster resources allow for
+it). The jobs will get JobIDs like {some-number}_1 to {some-number}_8, matching the `--array=1-8` range,
+and each will have its own output file.
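+
+To make the naming concrete, here is a minimal sketch of how the JobID and the per-task output file relate to the task counter. It uses Slurm's `%A` (array master job ID) and `%a` (array task ID) filename patterns; the output file name and the fallback values `12345` and `1` are assumptions chosen for illustration, so that the script can also be dry-run outside Slurm.
+
+``` bash
+#!/bin/bash
+#SBATCH --job-name=test-array
+#SBATCH --array=1-8
+#SBATCH --output=test-array-%A_%a.out   # %A = array master job ID, %a = array task ID
+
+# Print the JobID of this array task and the config file it would read.
+# The fallback values are placeholders for a dry run outside Slurm.
+describe_task() {
+    local job_id="${SLURM_ARRAY_JOB_ID:-12345}"
+    local task_id="${SLURM_ARRAY_TASK_ID:-1}"
+    echo "JobID ${job_id}_${task_id} reads config-file-${task_id}.dat"
+}
+
+describe_task
+```
+
+Without an explicit `--output` option, Slurm names the output files of array jobs `slurm-%A_%a.out`.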
+
+**Note:**
+   * Do not use array jobs for very short tasks, since each array sub job incurs the full overhead of launching an independent Slurm job. For such cases you should use a **packed job** (see below).
+   * If you want to control how many of these jobs can run in parallel, you can use the `#SBATCH --array=1-100%5` syntax. The `%5` specifies
+     that at most 5 sub jobs run at the same time.
+   
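+You can also use an array job to run over all files in a directory by replacing the payload with the lines below. Bash arrays are zero-indexed, so the array range must start at 0 (e.g. `--array=0-7` for 8 files):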
+
+``` bash
+FILES=(/path/to/data/*)
+srun ./myprogram "${FILES[$SLURM_ARRAY_TASK_ID]}"   # quoted so that paths with spaces survive
+```
+
+Or, for a simple case, you can supply the parameters to scan as
+a list (7 values here, so `--array=0-6` covers them):
+
+``` bash
+ARGS=(0.05 0.25 0.5 1 2 5 100)
+srun ./my_program.exe ${ARGS[$SLURM_ARRAY_TASK_ID]}
+```
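+
+Because bash arrays are indexed from 0, an array range that does not match the list silently passes an empty argument to the program. The following is a minimal sketch of a bounds check; the helper name `pick_arg` and the fallback index `0` are assumptions for illustration, and the final `echo` stands in for the real `srun` call.
+
+``` bash
+#!/bin/bash
+#SBATCH --array=0-6      # 7 parameters, indices 0 to 6
+
+ARGS=(0.05 0.25 0.5 1 2 5 100)
+
+# Print the parameter for the given index, or fail if the index is out of range.
+pick_arg() {
+    local idx="$1"
+    if (( idx < 0 || idx >= ${#ARGS[@]} )); then
+        echo "task index ${idx} is outside 0-$(( ${#ARGS[@]} - 1 ))" >&2
+        return 1
+    fi
+    echo "${ARGS[$idx]}"
+}
+
+# Fall back to index 0 so the script can be dry-run outside Slurm.
+arg=$(pick_arg "${SLURM_ARRAY_TASK_ID:-0}") || exit 1
+echo "would run: ./my_program.exe ${arg}"
+```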
+
+## Array jobs for long running tasks with checkpoint files
+
+If you need to run a job for much longer than the queues (partitions) allow, and
+your executable is able to create checkpoints at intervals, you can use this
+strategy:
+
+``` bash
+#!/bin/bash
+#SBATCH --job-name=test-checkpoint
+#SBATCH --partition=general
+#SBATCH --ntasks=1
+#SBATCH --time=7-00:00:00       # each job can run for 7 days
+#SBATCH --cpus-per-task=1
+#SBATCH --array=1-10%1   # Run a 10-job array, one job at a time.
+if test -e checkpointfile; then 
+     # There is a checkpoint file;
+     myprogram --read-checkp checkpointfile
+else
+     # There is no checkpoint file, start a new simulation.
+     myprogram
+fi
+```
+
+The `%1` in the `#SBATCH --array=1-10%1` statement specifies that only one subjob runs at a time, so
+subjob n+1 is started only after subjob n has finished. Each subjob reads the checkpoint file
+if it is present.
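+
+The branch between a fresh start and a resume can be tried out without Slurm. This is a minimal sketch: `build_command` is a hypothetical helper that only prints the command a subjob would run, and `touch` stands in for `myprogram` writing its checkpoint.
+
+``` bash
+#!/bin/bash
+
+# Print the command line a subjob would run, depending on
+# whether the checkpoint file already exists.
+build_command() {
+    local checkpoint="$1"
+    if [[ -e "$checkpoint" ]]; then
+        echo "myprogram --read-checkp $checkpoint"
+    else
+        echo "myprogram"
+    fi
+}
+
+workdir=$(mktemp -d)
+build_command "$workdir/checkpointfile"   # first subjob: no checkpoint yet
+touch "$workdir/checkpointfile"           # stands in for myprogram writing a checkpoint
+build_command "$workdir/checkpointfile"   # later subjobs: resume from the checkpoint
+rm -rf "$workdir"
+```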
+
+
+## Packed jobs - running a large number of short tasks
+
+Since the launching of a Slurm job incurs some overhead, you should not submit each short task as a separate
+Slurm job. Use job packing, i.e. you run the short tasks within the loop of a single Slurm job.
+
+You can launch the short tasks using `srun` with the `--exclusive` switch (not to be confused with the
+switch of the same name used in the `#SBATCH` directives). This switch ensures that only a specified
+number of tasks can run in parallel.
+
+As an example, the following job submission script asks Slurm for
+44 cores (threads) and then runs the `myprog` program 1000 times with
+arguments from 1 to 1000. The `-N1 -n1 -c1 --exclusive` options
+ensure that at any point in time at most 44
+instances are running, each allocated one CPU. You
+can allocate several CPUs or tasks per instance by adapting
+the corresponding parameters.
+   
+``` bash
+#!/bin/bash
+#SBATCH --job-name=test-packed
+#SBATCH --partition=general
+#SBATCH --time=7-00:00:00
+#SBATCH --ntasks=44    # defines the number of parallel tasks
+for i in {1..1000}
+do
+   srun -N1 -n1 -c1 --exclusive ./myprog $i &
+done
+wait
+```
+
+**Note:** The `&` at the end of the `srun` line runs each task in the background, so the script does not block on it.
+The `wait` command then blocks until all background tasks have finished.
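+
+In bash, a bare `wait` returns 0 regardless of how the background tasks ended, so it does not tell you whether any task failed. One way to detect failures is to record each background PID and wait on it individually. This is a minimal sketch under stated assumptions: `run_task` is a hypothetical stand-in for `srun -N1 -n1 -c1 --exclusive ./myprog $i` and is made to fail for the argument 3.
+
+``` bash
+#!/bin/bash
+
+# Stand-in for the real srun call; fails only for argument 3 (illustration).
+run_task() {
+    [[ "$1" -ne 3 ]]
+}
+
+# Start one background task per argument, then print how many failed.
+run_all() {
+    local pids=() failed=0 i pid
+    for i in "$@"; do
+        run_task "$i" &
+        pids+=("$!")
+    done
+    for pid in "${pids[@]}"; do
+        # "wait PID" returns the exit status of that specific task.
+        wait "$pid" || failed=$((failed + 1))
+    done
+    echo "$failed"
+}
+
+echo "failed tasks: $(run_all 1 2 3 4 5)"
+```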
+