diff --git a/pages/merlin6/merlin6-slurm/slurm-examples.md b/pages/merlin6/merlin6-slurm/slurm-examples.md
new file mode 100644
index 0000000..7c60e71
--- /dev/null
+++ b/pages/merlin6/merlin6-slurm/slurm-examples.md
@@ -0,0 +1,159 @@
---
title: Slurm Examples
#tags:
#keywords:
last_updated: 19 June 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/slurm-basics.html
---

## Basic single core job

### Basic single core job - Example 1

```bash
#!/bin/bash
#SBATCH --partition=hourly     # Using 'hourly' will grant higher priority
#SBATCH --constraint=mc       # Use CPU batch system
#SBATCH --ntasks-per-core=1   # Force no Hyper-Threading, will run 1 task per core
#SBATCH --mem-per-cpu=8000    # Double the default memory per cpu
#SBATCH --time=00:30:00       # Define max time job will run
#SBATCH --output=myscript.out # Define your output file
#SBATCH --error=myscript.err  # Define your error file

my_script
```

In this example we run a single core job by defining ``--ntasks-per-core=1`` (which is also the default). Since the default memory per
CPU is 4000MB (in Slurm this is equivalent to the memory per thread) and we are using a single thread per core, the default memory per
CPU should be doubled: using a single thread is always accounted as if the job were using the whole physical core (which has 2 available
hyper-threads), hence we request the memory as if we were using 2 threads.

### Basic single core job - Example 2

```bash
#!/bin/bash
#SBATCH --partition=hourly    # Using 'hourly' will grant higher priority
#SBATCH --constraint=mc       # Use CPU batch system
#SBATCH --ntasks-per-core=1   # Force no Hyper-Threading, will run 1 task per core
#SBATCH --mem=352000          # We want to use the whole memory
#SBATCH --time=00:30:00       # Define max time job will run
#SBATCH --output=myscript.out # Define your output file
#SBATCH --error=myscript.err  # Define your error file

my_script
```

In this example we run a single core job by defining ``--ntasks-per-core=1`` (which is also the default). In addition, we request the
whole memory of a node with ``--mem=352000`` (the maximum memory available per Apollo node). Whenever a job needs more memory than the
default (4000MB per thread), it is very important to specify the amount of memory the job will use. This must be done in order to avoid
conflicts with jobs from other users.

## Basic MPI with hyper-threading

```bash
#!/bin/bash
#SBATCH --partition=hourly    # Using 'hourly' will grant higher priority
#SBATCH --exclusive           # Use the node in exclusive mode
#SBATCH --ntasks=88           # Job will run 88 tasks
#SBATCH --ntasks-per-core=2   # Force Hyper-Threading, will run 2 tasks per core
#SBATCH --constraint=mc       # Use CPU batch system
#SBATCH --time=00:30:00       # Define max time job will run
#SBATCH --output=myscript.out # Define your output file
#SBATCH --error=myscript.err  # Define your error file

module load gcc/8.3.0 openmpi/3.1.3

MPI_script
```

In this example we run a job with 88 tasks. Merlin6 Apollo nodes have 44 cores each, with HT enabled. This means that we can run
2 threads per core, 88 threads in total. We add the option ``--exclusive`` to ensure that the node is used exclusively and no other
jobs run there. Finally, since the default memory per thread is 4000MB, this job can use up to 352000MB of memory in total, which is
the maximum allowed in a single node.
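
If it is unclear how Slurm interpreted a resource request like the ones above, the allocation can be inspected after submission.
A minimal sketch (``myscript.batch`` and ``<jobid>`` are illustrative placeholders):

```bash
# Submit the batch script (illustrative filename)
sbatch myscript.batch            # prints: Submitted batch job <jobid>

# List your queued and running jobs
squeue -u $USER

# Show the nodes, CPUs and memory that Slurm granted to the job
scontrol show job <jobid> | grep -E 'NumNodes|NumCPUs|Mem'
```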

## Basic MPI without hyper-threading

```bash
#!/bin/bash
#SBATCH --partition=hourly    # Using 'hourly' will grant higher priority
#SBATCH --exclusive           # Use the node in exclusive mode
#SBATCH --ntasks=44           # Job will run 44 tasks
#SBATCH --ntasks-per-core=1   # Force no Hyper-Threading, will run 1 task per core
#SBATCH --mem=352000          # Define the whole memory of the node
#SBATCH --constraint=mc       # Use CPU batch system
#SBATCH --time=00:30:00       # Define max time job will run
#SBATCH --output=myscript.out # Define your output file
#SBATCH --error=myscript.err  # Define your error file

module load gcc/8.3.0 openmpi/3.1.3

MPI_script
```

In this example we run a job with 44 tasks and Hyper-Threading is not used. Merlin6 Apollo nodes have 44 cores each, with HT enabled.
However, by defining ``--ntasks-per-core=1`` we force the use of a single thread per core (this is also the default, but it is
recommended to add it explicitly). Each task runs in 1 thread, and each task is assigned to an independent core. We add the option
``--exclusive`` to ensure that the node is used exclusively and no other jobs run there. Finally, since the default memory per thread
is 4000MB and we use only 1 thread per core, we would otherwise get only half of the memory: we have to specify that we will use the
whole memory of the node with the option ``--mem=352000`` (which is the maximum memory available in the node).

## Advanced Slurm Example

Copy-paste the following example into a file called ``myAdvancedTest.batch``:

```bash
#!/bin/bash
#SBATCH --partition=daily  # name of slurm partition to submit
#SBATCH --time=2:00:00     # limit the execution of this job to 2 hours, see sinfo for the max. allowance
#SBATCH --nodes=2          # number of nodes
#SBATCH --ntasks=44        # number of tasks

module load gcc/8.3.0 openmpi/3.1.3
module list

echo "Example no-MPI:" ; hostname        # runs once, on the node where the batch script executes
echo "Example MPI:"    ; mpirun hostname # will print one hostname per task
```

In the above example we specify the options ``--nodes=2`` and ``--ntasks=44``. This means that up to 2 nodes are requested,
and 44 tasks are expected to run. Hence, 44 cores are needed for running that job (we do not specify ``--ntasks-per-core``, so it
defaults to ``1``). Slurm will try to allocate a maximum of 2 nodes, which together provide at least 44 cores.
Since our nodes have 44 cores each, if a node is empty (no other users have running jobs there), the job can land on a single node
(it has enough cores to run 44 tasks).

If we want to ensure that the job uses at least two different nodes (e.g. for boosting CPU frequency, or because the job requires
more memory per core), we should specify additional options.

A good example is ``--ntasks-per-node=22``. This will distribute the 44 tasks equally over the 2 nodes, 22 tasks per node.

```bash
#SBATCH --ntasks-per-node=22
```
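
To verify how the tasks were actually distributed, a quick check can be appended to the job script (a sketch; ``srun`` inherits the
job's allocation, so no extra options are needed):

```bash
# Count how many tasks landed on each node: with --ntasks=44 and
# --ntasks-per-node=22 the output should show 22 tasks on each of the 2 hostnames.
srun hostname | sort | uniq -c
```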
A different example is to specify how much memory per core is needed. For instance, ``--mem-per-cpu=32000`` will reserve
~32000MB per core. Since we have a maximum of 352000MB per Apollo node, Slurm will only be able to allocate 11 cores per node
(32000MB x 11 cores = 352000MB). This means that 4 nodes are needed (at most 11 tasks per node due to the memory definition, and we
need to run 44 tasks), so in this case we have to change to ``--nodes=4`` (or remove ``--nodes``). Alternatively, we can decrease
``--mem-per-cpu`` to a lower value which allows the use of at least 22 cores per node (e.g. with ``16000`` the job should fit on 2 nodes):

```bash
#SBATCH --mem-per-cpu=16000
```

Finally, in order to ensure exclusivity of the nodes, the option ``--exclusive`` can be used (see below). This ensures that
the requested nodes are exclusive to the job (no jobs from other users will run on these nodes, and only completely
free nodes will be allocated).

```bash
#SBATCH --exclusive
```

This can be combined with the previous examples.

More advanced configurations can be defined and combined with the previous examples. More information about advanced
options can be found in the following link: https://slurm.schedmd.com/sbatch.html (or run 'man sbatch').

If you have questions about how to properly execute your jobs, please contact us through merlin-admins@lists.psi.ch. Do not run
advanced configurations unless you are sure of what you are doing.
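
As a closing illustration of how the options above can be combined, the sketch below merges the node, task distribution and memory
settings from the previous snippets; the partition, time limit and application name are illustrative and should be adapted to the
actual job:

```bash
#!/bin/bash
#SBATCH --partition=daily     # illustrative partition; see sinfo for limits
#SBATCH --constraint=mc       # Use CPU batch system
#SBATCH --time=2:00:00        # illustrative time limit
#SBATCH --nodes=2             # spread the job over 2 nodes
#SBATCH --ntasks=44           # total number of tasks
#SBATCH --ntasks-per-node=22  # force an even 22/22 task distribution
#SBATCH --mem-per-cpu=16000   # per-core memory (22 x 16000MB = 352000MB per node)
#SBATCH --output=myscript.out # illustrative output file
#SBATCH --error=myscript.err  # illustrative error file

module load gcc/8.3.0 openmpi/3.1.3

mpirun my_mpi_application     # placeholder for the real MPI executable
```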