---
title: Slurm Examples
#tags:
#keywords:
last_updated: 19 June 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/slurm-basics.html
---

## Basic single core job

### Basic single core job - Example 1

```bash
#!/bin/bash
#SBATCH --partition=hourly      # Using 'hourly' will grant higher priority
#SBATCH --constraint=mc         # Use CPU batch system
#SBATCH --ntasks-per-core=1     # Force no Hyper-Threading, will run 1 task per core
#SBATCH --mem-per-cpu=8000      # Double the default memory per cpu
#SBATCH --time=00:30:00         # Define max time job will run
#SBATCH --output=myscript.out   # Define your output file
#SBATCH --error=myscript.err    # Define your error file

my_script
```

In this example we run a single core job by defining ``--ntasks-per-core=1`` (which is also the default). The default memory per CPU is 4000MB (in Slurm this corresponds to the memory per thread), but since we run a single thread per core we double it: a single thread is always accounted as if the job were using the whole physical core (which provides 2 hyper-threads), so we request the memory as if we were using both threads.

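To try this out, a minimal sketch of how the job could be submitted and monitored, assuming the script above is saved as ``example1.batch`` (the file name is just an assumption):

```bash
# Submit the batch script to Slurm (file name is an example, use your own)
sbatch example1.batch

# List your pending and running jobs
squeue -u $USER

# Once the job has finished, inspect the files defined with --output and --error
cat myscript.out myscript.err
```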

### Basic single core job - Example 2

```bash
#!/bin/bash
#SBATCH --partition=hourly      # Using 'hourly' will grant higher priority
#SBATCH --constraint=mc         # Use CPU batch system
#SBATCH --ntasks-per-core=1     # Force no Hyper-Threading, will run 1 task per core
#SBATCH --mem=352000            # We want to use the whole memory
#SBATCH --time=00:30:00         # Define max time job will run
#SBATCH --output=myscript.out   # Define your output file
#SBATCH --error=myscript.err    # Define your error file

my_script
```

In this example we run a single core job by defining ``--ntasks-per-core=1`` (which is also the default), and we request the whole memory of a node with ``--mem=352000`` (the maximum memory available per Apollo node). Whenever a job needs more memory than the default (4000MB per thread), it is very important to specify the amount of memory the job will use, in order to avoid conflicts with jobs from other users.

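As a rough cross-check of these numbers (an illustration, not part of the original example), the full node memory is simply the default 4000MB per thread multiplied by the 88 hardware threads of an Apollo node:

```bash
# 44 physical cores x 2 hyper-threads = 88 threads per Apollo node
# 88 threads x 4000MB (default memory per thread) = 352000MB, the value passed to --mem
echo $((88 * 4000))   # prints 352000
```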

## Basic MPI with hyper-threading

```bash
#!/bin/bash
#SBATCH --partition=hourly      # Using 'hourly' will grant higher priority
#SBATCH --exclusive             # Use the node in exclusive mode
#SBATCH --ntasks=88             # Job will run 88 tasks
#SBATCH --ntasks-per-core=2     # Force Hyper-Threading, will run 2 tasks per core
#SBATCH --constraint=mc         # Use CPU batch system
#SBATCH --time=00:30:00         # Define max time job will run
#SBATCH --output=myscript.out   # Define your output file
#SBATCH --error=myscript.err    # Define your error file

module load gcc/8.3.0 openmpi/3.1.3

MPI_script
```

In this example we run a job with 88 tasks. Merlin6 Apollo nodes have 44 cores, each with hyper-threading enabled, so we can run 2 threads per core, 88 threads in total. We add the option ``--exclusive`` to ensure that the node is used exclusively and no other jobs run there. Finally, since the default memory per thread is 4000MB, this job can use up to 352000MB of memory, which is the maximum available on a single node.

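To verify the task placement from inside the job, a small sketch of checks that could be appended to the script above (standard Slurm commands and environment variables; not part of the original example):

```bash
# Print the allocation reported by Slurm for this job
echo "Allocated node(s): $SLURM_JOB_NODELIST"
echo "Number of tasks:   $SLURM_NTASKS"

# Launch one 'hostname' per task: with --ntasks=88 this should print 88 lines in total
srun hostname | sort | uniq -c
```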

## Basic MPI without hyper-threading

```bash
#!/bin/bash
#SBATCH --partition=hourly      # Using 'hourly' will grant higher priority
#SBATCH --exclusive             # Use the node in exclusive mode
#SBATCH --ntasks=44             # Job will run 44 tasks
#SBATCH --ntasks-per-core=1     # Force no Hyper-Threading, will run 1 task per core
#SBATCH --mem=352000            # Define the whole memory of the node
#SBATCH --constraint=mc         # Use CPU batch system
#SBATCH --time=00:30:00         # Define max time job will run
#SBATCH --output=myscript.out   # Define your output file
#SBATCH --error=myscript.err    # Define your error file

module load gcc/8.3.0 openmpi/3.1.3

MPI_script
```

In this example we run a job with 44 tasks and hyper-threading is not used. Merlin6 Apollo nodes have 44 cores, each with HT enabled, but by defining ``--ntasks-per-core=1`` we force the use of a single thread per core (this is the default, but it is recommended to state it explicitly). Each task runs in 1 thread, and each task is assigned to an independent core. We add the option ``--exclusive`` to ensure that the node is used exclusively and no other jobs run there. Finally, since the default memory per thread is 4000MB and we use only 1 thread per core, we would otherwise get only half of the memory: we therefore request the whole memory of the node with ``--mem=352000`` (the maximum memory available in the node).

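Similarly, a minimal sketch of how the resources granted to this job could be inspected from inside the batch script (again only an illustration using standard Slurm commands):

```bash
# Show how many CPUs Slurm accounts for on the node and how many tasks were requested
echo "CPUs on node:    $SLURM_CPUS_ON_NODE"
echo "Tasks requested: $SLURM_NTASKS"

# Print the trackable resources (CPUs, memory, nodes) allocated to this job
scontrol show job $SLURM_JOB_ID | grep -i tres
```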

## Advanced Slurm Example

Copy-paste the following example into a file called ``myAdvancedTest.batch``:

```bash
#!/bin/bash
#SBATCH --partition=daily       # name of slurm partition to submit
#SBATCH --time=2:00:00          # limit the execution of this job to 2 hours, see sinfo for the max. allowance
#SBATCH --nodes=2               # number of nodes
#SBATCH --ntasks=44             # number of tasks

module load gcc/8.3.0 openmpi/3.1.3
module list

echo "Example no-MPI:" ; hostname         # runs only on the first node, prints a single hostname
echo "Example MPI:" ; mpirun hostname     # will print one hostname per ntask
```

The above example specifies the options ``--nodes=2`` and ``--ntasks=44``. This means that up to 2 nodes are requested and 44 tasks are expected to run, so 44 cores are needed for the job (we do not specify ``--ntasks-per-core``, so it defaults to ``1``). Slurm will try to allocate at most 2 nodes which together provide at least 44 cores. Since our nodes have 44 cores each, if a node is empty (no other users have jobs running there) the job can land on a single node, as it has enough cores to run all 44 tasks.

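A possible way to check how Slurm actually distributed the job (a sketch; ``<jobid>`` must be replaced by the job ID printed by ``sbatch``):

```bash
# Submit the advanced example and note the job ID printed by sbatch
sbatch myAdvancedTest.batch

# While the job is pending or running, show its state and assigned node(s)
squeue -u $USER

# Show the number of nodes, number of tasks and node list of the job
scontrol show job <jobid> | grep -E 'NumNodes|NumTasks|NodeList'
```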

If you want to ensure that the job uses at least two different nodes (e.g. for boosting CPU frequency, or because the job requires more memory per core), you should specify additional options.

A good example is ``--ntasks-per-node=22``, which distributes the tasks equally over the 2 nodes (22 tasks per node):

```bash
#SBATCH --ntasks-per-node=22
```

A different example is specifying how much memory per core is needed. For instance, ``--mem-per-cpu=32000`` will reserve ~32000MB per core. Since we have a maximum of 352000MB per Apollo node, Slurm will only be able to allocate 11 cores per node (32000MB x 11 cores = 352000MB). This means that 4 nodes are needed (at most 11 tasks per node due to the memory request, while 44 tasks must run), so ``--nodes=4`` has to be set (or ``--nodes`` removed). Alternatively, we can decrease ``--mem-per-cpu`` to a value that lets enough tasks fit per node, e.g. with ``16000`` (16000MB x 22 cores = 352000MB) 2 nodes are sufficient:

```bash
#SBATCH --mem-per-cpu=16000
```

Finally, to ensure exclusivity of the node, the option ``--exclusive`` can be used (see below). This guarantees that the requested nodes are reserved exclusively for the job (no other users' jobs will run on these nodes, and only completely free nodes will be allocated):

```bash
#SBATCH --exclusive
```

This can be combined with the previous examples.

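For instance, a sketch that puts the options discussed above together into one batch script (the values are simply the ones used earlier in this section; adapt them to your own job):

```bash
#!/bin/bash
#SBATCH --partition=daily       # name of slurm partition to submit
#SBATCH --time=2:00:00          # limit the execution of this job to 2 hours
#SBATCH --nodes=2               # number of nodes
#SBATCH --ntasks=44             # number of tasks
#SBATCH --ntasks-per-node=22    # distribute the tasks equally over the 2 nodes
#SBATCH --mem-per-cpu=16000     # 22 tasks x 16000MB = 352000MB, the full node memory
#SBATCH --exclusive             # do not share the allocated nodes with other jobs

module load gcc/8.3.0 openmpi/3.1.3
module list

mpirun hostname                 # will print one hostname per task
```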

More advanced configurations can be defined and combined with the previous examples. More information about the available options can be found at https://slurm.schedmd.com/sbatch.html (or by running ``man sbatch``).

If you have questions about how to properly run your jobs, please contact us through merlin-admins@lists.psi.ch. Do not run advanced configurations unless you are sure of what you are doing.