---
title: Slurm Examples
#tags:
#keywords:
keywords: example, template, examples, templates, running jobs, sbatch
last_updated: 28 June 2019
#summary: ""
summary: "This document shows different template examples for running jobs in the Merlin cluster."
sidebar: merlin6_sidebar
permalink: /merlin6/slurm-examples.html
---

## Single core based job examples

### Example 1

In this example we do not want to use hyper-threading (``--ntasks-per-core=1`` and ``--hint=nomultithread``). In our Merlin6 configuration, the default memory per CPU (in Slurm, this is equivalent to the memory per thread) is 4000MB, but in this example we use a single thread per core. Since we are not using the second thread of the core, we can double the memory available to the single thread to 8000MB. When using a single thread per core, doubling the memory is recommended (however, some applications might not need it).

```bash
#!/bin/bash
#SBATCH --partition=hourly      # Using 'hourly' will grant higher priority
#SBATCH --ntasks-per-core=1     # Force no Hyper-Threading, will run 1 task per core
#SBATCH --hint=nomultithread    # Don't use extra threads with in-core multi-threading
#SBATCH --mem-per-cpu=8000      # Double the default memory per cpu
#SBATCH --time=00:30:00         # Define max time job will run
#SBATCH --output=myscript.out   # Define your output file
#SBATCH --error=myscript.err    # Define your error file

module load $module             # Load any modules required by your program
srun my_script                  # Run your program ('my_script' is a placeholder for your executable)
```

In this example we run a single core job by defining ``--ntasks-per-core=1`` (which is also the default). Since the default memory per CPU is 4000MB (in Slurm, this is equivalent to the memory per thread) and we are using a single thread per core, the default memory per CPU should be doubled: a single thread is always accounted as if the job were using the whole physical core (which has 2 available hyper-threads), hence we want to request the memory as if we were using 2 threads.

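A script like the one above is submitted with ``sbatch``. A minimal usage sketch (``myscript.sh`` is a placeholder file name, not a file referenced elsewhere in this page):

```bash
sbatch myscript.sh    # prints: Submitted batch job <jobid>
squeue -u $USER       # check the state of your queued and running jobs
```
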
### Example 2

In this example we do not want to use hyper-threading (``--ntasks-per-core=1`` and ``--hint=nomultithread``). We want to run a single task, but we need to use all the memory available in the node. For that, we specify that the job will use the whole memory of a node with ``--mem=352000`` (which is the maximum memory available on a single Apollo node). Whenever you want to run a job requiring more memory than the default (4000MB per thread), it is very important to specify the amount of memory that the job will use.

```bash
#!/bin/bash
#SBATCH --partition=hourly      # Using 'hourly' will grant higher priority
#SBATCH --ntasks-per-core=1     # Force no Hyper-Threading, will run 1 task per core
#SBATCH --hint=nomultithread    # Don't use extra threads with in-core multi-threading
#SBATCH --mem=352000            # We want to use the whole memory
#SBATCH --time=00:30:00         # Define max time job will run
#SBATCH --output=myscript.out   # Define your output file
#SBATCH --error=myscript.err    # Define your error file

module load $module             # Load any modules required by your program
srun my_script                  # Run your program ('my_script' is a placeholder for your executable)
```

In this example we run a single core job by defining ``--ntasks-per-core=1`` (which is also the default). In addition, we specify that the job will use the whole memory of a node with ``--mem=352000`` (which is the maximum memory available per Apollo node). Whenever you want to run a job needing more memory than the default (4000MB per thread), it is very important to specify the amount of memory that the job will use. This must be done in order to avoid conflicts with other jobs from other users.

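If you are unsure how much memory a node actually provides, you can ask Slurm for its configured memory before picking a value. A small sketch (``<nodename>`` is a placeholder for a real node name):

```bash
scontrol show node <nodename> | grep -i memory   # the RealMemory field shows the node's configured memory in MB
```
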
## Multi core based job examples

### Example 1: with Hyper-Threading

In this example we run a job that will run 88 tasks. Merlin6 Apollo nodes have 44 cores, each with hyper-threading enabled. This means that we can run 2 threads per core, 88 threads in total. To accomplish that, users should specify ``--ntasks-per-core=2`` and ``--hint=multithread``. In addition, we add the option ``--exclusive`` to ensure that the node usage is exclusive and no other jobs are running there. Finally, notice that the default memory per thread is 4000MB; hence, in total this job can use up to 352000MB of memory, which is the maximum allowed in a single node.

```bash
#!/bin/bash
#SBATCH --partition=hourly      # Using 'hourly' will grant higher priority
#SBATCH --exclusive             # Use the node in exclusive mode
#SBATCH --ntasks=88             # Job will run 88 tasks
#SBATCH --ntasks-per-core=2     # Force Hyper-Threading, will run 2 tasks per core
#SBATCH --hint=multithread      # Use extra threads with in-core multi-threading
#SBATCH --time=00:30:00         # Define max time job will run
#SBATCH --output=myscript.out   # Define your output file
#SBATCH --error=myscript.err    # Define your error file

module load gcc/8.3.0 openmpi/3.1.3

srun MPI_script                 # Run your MPI program ('MPI_script' is a placeholder for your executable)
```

In this example we run a job that will run 88 tasks. Merlin6 Apollo nodes have 44 cores, each with HT enabled. This means that we can run 2 threads per core, 88 threads in total. We add the option ``--exclusive`` to ensure that the node usage is exclusive and no other jobs are running there. Finally, since the default memory per thread is 4000MB, in total this job can use up to 352000MB of memory, which is the maximum allowed in a single node.

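To check how the 88 tasks were actually placed, a quick sanity check can be appended to the job script. A small sketch (any command works; ``hostname`` is just convenient):

```bash
srun hostname | sort | uniq -c   # counts how many of the 88 tasks ran on each node
```
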
### Example 2: without Hyper-Threading

In this example we want to run a job with 44 tasks, and for performance reasons we want to disable hyper-threading. Merlin6 Apollo nodes have 44 cores, each with hyper-threading enabled. To ensure that only 1 thread is used per core, users should specify ``--ntasks-per-core=1`` and ``--hint=nomultithread``. With this configuration, each task will run in 1 thread, and each task will be assigned to an independent core. We add the option ``--exclusive`` to ensure that the node usage is exclusive and no other jobs are running there. Finally, in our Slurm configuration the default memory per thread is 4000MB, but we want to use only 1 thread, which means only half of the memory would be used. If the job requires more memory, users need to increase it either by setting ``--mem=352000`` or, alternatively (not both), by setting ``--mem-per-cpu=8000``.

```bash
#!/bin/bash
#SBATCH --partition=hourly      # Using 'hourly' will grant higher priority
#SBATCH --exclusive             # Use the node in exclusive mode
#SBATCH --ntasks=44             # Job will run 44 tasks
#SBATCH --ntasks-per-core=1     # Force no Hyper-Threading, will run 1 task per core
#SBATCH --hint=nomultithread    # Don't use extra threads with in-core multi-threading
#SBATCH --mem=352000            # Define the whole memory of the node
#SBATCH --time=00:30:00         # Define max time job will run
#SBATCH --output=myscript.out   # Define your output file
#SBATCH --error=myscript.err    # Define your error file

module load gcc/8.3.0 openmpi/3.1.3

srun MPI_script                 # Run your MPI program ('MPI_script' is a placeholder for your executable)
```

In this example we run a job with 44 tasks, and Hyper-Threading will not be used. Merlin6 Apollo nodes have 44 cores, each with HT enabled. However, by defining ``--ntasks-per-core=1`` we force the use of a single thread per core (this is the default if not defined, but it is recommended to add it explicitly). Each task will run in 1 thread, and each task will be assigned to an independent core. We add the option ``--exclusive`` to ensure that the node usage is exclusive and no other jobs are running there. Finally, since the default memory per thread is 4000MB and we use only 1 thread, we want to avoid using only half of the memory: we have to specify that we will use the whole memory of the node with the option ``--mem=352000`` (which is the maximum memory available in the node).

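As mentioned above, the same total can alternatively be requested per CPU rather than per node. A sketch of the alternative directive (use one or the other, never both):

```bash
#SBATCH --mem-per-cpu=8000    # per-CPU alternative to --mem=352000
```
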
## Advanced examples

### Array Jobs: launching a large number of related jobs

If you need to run a large number of jobs based on the same executable with systematically varying inputs, e.g. for a parameter sweep, you can do this most easily in the form of a **simple array job**.

The key part of such a job script passes a different argument to each array task:

```bash
ARGS=(0.05 0.25 0.5 1 2 5 100)

srun ./my_program.exe ${ARGS[$SLURM_ARRAY_TASK_ID]}
```

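The fragment above indexes into the ``ARGS`` array with ``$SLURM_ARRAY_TASK_ID``, which Slurm sets individually for each array task. The matching request in the job header is an ``--array`` range covering the seven parameters. A minimal sketch of such a header (the partition and time limit are placeholders, not values taken from this page):

```bash
#!/bin/bash
#SBATCH --partition=hourly    # placeholder partition
#SBATCH --time=00:30:00       # placeholder time limit
#SBATCH --ntasks=1            # each array task runs one copy of the program
#SBATCH --array=0-6           # spawn 7 array tasks with SLURM_ARRAY_TASK_ID = 0..6
```
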
### Array jobs: running very long tasks with checkpoint files

If you need to run a job for much longer than the queues (partitions) permit, and your executable is able to create checkpoint files, you can use this approach: split the work into an array of subjobs that run one after the other, so that subjob n+1 is only started when subjob n has finished. Each subjob will continue from the checkpoint file if it is present.

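One way to express such a chain is a job array that allows only one array task to run at a time (the ``%1`` throttle in ``--array``). A minimal sketch, where the array range, time limit, and program name are placeholders and the program is assumed to resume from its own checkpoint file:

```bash
#!/bin/bash
#SBATCH --array=1-10%1    # 10 subjobs, but at most 1 running at any time
#SBATCH --time=23:00:00   # placeholder time limit per subjob

srun ./my_program         # placeholder; assumed to write checkpoints and resume from them if present
```
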
### Packed jobs: running a large number of short tasks

Since the launching of a Slurm job incurs some overhead, you should not submit each short task as a separate Slurm job. Use job packing, i.e. you run the short tasks within the loop of a single Slurm job.

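A minimal sketch of such a packed job, assuming 8 single-core tasks and a placeholder program ``./my_task`` (the partition and time limit are also placeholders):

```bash
#!/bin/bash
#SBATCH --partition=hourly    # placeholder partition
#SBATCH --ntasks=8            # number of short tasks packed into this single job
#SBATCH --time=00:30:00       # placeholder time limit

for i in $(seq 1 $SLURM_NTASKS); do
    srun --exclusive -n 1 ./my_task $i &   # start each short task in the background on its own CPU
done
wait                                       # wait for all background tasks to finish
```
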
**Note:** The `&` at the end of the `srun` line is needed so that the script does not block (wait) on each task.
The `wait` command waits for all such background tasks to finish and returns the exit code.

## Hands-On Example

Copy-paste the following example into a file called myAdvancedTest.batch:

```bash
#!/bin/bash
#SBATCH --partition=daily       # name of slurm partition to submit
#SBATCH --time=2:00:00          # limit the execution of this job to 2 hours, see sinfo for the max. allowance
#SBATCH --nodes=2               # number of nodes
#SBATCH --ntasks=44             # number of tasks
#SBATCH --ntasks-per-core=1     # run at most 1 task per core (no Hyper-Threading)
#SBATCH --hint=nomultithread    # Don't use extra threads with in-core multi-threading

module load gcc/9.2.0 openmpi/3.1.5_merlin6
module list

echo "Example no-MPI:" ; hostname        # will print one hostname per node
echo "Example MPI:" ; mpirun hostname    # will print one hostname per ntask
```

In the above example, the options ``--nodes=2`` and ``--ntasks=44`` are specified. This means that up to 2 nodes are requested, and the job is expected to run 44 tasks. Hence, 44 cores are needed for running that job. Slurm will try to allocate a maximum of 2 nodes, both together having at least 44 cores. Since our nodes have 44 cores each, if nodes are empty (no other users have running jobs there), the job can land on a single node (it has enough cores to run 44 tasks).

If you want to ensure that the job uses at least two different nodes (e.g. for boosting CPU frequency, or because the job requires more memory per core), you should specify additional options.

A good example is ``--ntasks-per-node=22``. This will distribute the 44 tasks equally across 2 nodes, 22 tasks per node.

```bash
#SBATCH --ntasks-per-node=22
```

A different example is specifying how much memory per core is needed. For instance, ``--mem-per-cpu=32000`` will reserve ~32000MB per core. Since we have a maximum of 352000MB per Apollo node, Slurm will only be able to allocate 11 cores per node (32000MB x 11 cores = 352000MB). This means that 4 nodes will be needed (a maximum of 11 tasks per node due to the memory definition, and we need to run 44 tasks), so in this case we need to change to ``--nodes=4`` (or remove ``--nodes``). Alternatively, we can decrease ``--mem-per-cpu`` to a lower value that allows more cores per node to be used (e.g. with ``16000``, 22 cores fit into the memory of each node, so 2 nodes are enough for the 44 tasks).

```bash
#SBATCH --mem-per-cpu=16000
```

Finally, in order to ensure exclusivity of the node, the option *--exclusive* can be used (see below). This will ensure that the requested nodes are exclusive to the job (no other users' jobs will interact with these nodes, and only completely free nodes will be allocated).

```bash
#SBATCH --exclusive
```

This can be combined with the previous examples.

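For instance, a header combining the per-node task count, per-core memory, and exclusive options discussed in this section could look as follows (a sketch built only from the values shown above):

```bash
#!/bin/bash
#SBATCH --partition=daily       # name of slurm partition to submit
#SBATCH --time=2:00:00          # limit the execution of this job to 2 hours
#SBATCH --nodes=2               # number of nodes
#SBATCH --ntasks=44             # number of tasks
#SBATCH --ntasks-per-node=22    # spread the 44 tasks evenly over the 2 nodes
#SBATCH --mem-per-cpu=16000     # 22 tasks x 16000MB = 352000MB per node
#SBATCH --exclusive             # do not share the nodes with other jobs
```
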
More advanced configurations can be defined and combined with the previous examples. More information about advanced options can be found at https://slurm.schedmd.com/sbatch.html (or run 'man sbatch').

If you have questions about how to properly execute your jobs, please contact us through merlin-admins@lists.psi.ch. Do not run advanced configurations unless you are sure of what you are doing.