---
title: Slurm Examples
#tags:
#keywords:
keywords: example, template, examples, templates, running jobs, sbatch
last_updated: 28 June 2019
#summary: ""
summary: "This document shows different template examples for running jobs in the Merlin cluster."
sidebar: merlin6_sidebar
permalink: /merlin6/slurm-examples.html
---

## Single core based job examples

### Example 1

In this example we do not want to use hyper-threading (``--ntasks-per-core=1`` and ``--hint=nomultithread``). In our Merlin6 configuration, the default memory per CPU (in Slurm, this is equivalent to the memory per thread) is 4000MB, but in this example we use a single thread per core. Since we are not using the second thread of the core, we can double the memory available to the single thread to 8000MB. When using a single thread per core, doubling the memory is recommended (however, some applications might not need it).

```bash
#!/bin/bash
#SBATCH --partition=hourly      # Using 'hourly' will grant higher priority
#SBATCH --ntasks-per-core=1     # Force no Hyper-Threading, will run 1 task per core
#SBATCH --hint=nomultithread    # Don't use extra threads with in-core multi-threading
#SBATCH --mem-per-cpu=8000      # Double the default memory per cpu
#SBATCH --time=00:30:00         # Define max time job will run
#SBATCH --output=myscript.out   # Define your output file
#SBATCH --error=myscript.err    # Define your error file

module load $module             # Load any modules required by your program
srun my_script                  # Run your program ('my_script' is a placeholder for your executable)
```

In this example we run a single core job by defining ``--ntasks-per-core=1`` (which is also the default). Since the default memory per CPU is 4000MB (in Slurm, this is equivalent to the memory per thread) and we are using a single thread per core, the default memory per CPU should be doubled: a single thread is always accounted as if the job were using the whole physical core (which has 2 available hyper-threads), hence we want to request the memory as if we were using 2 threads.

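A script like the one above is submitted with ``sbatch``. A minimal usage sketch (``myscript.sh`` is a placeholder file name, not a file referenced elsewhere in this page):

```bash
sbatch myscript.sh    # prints: Submitted batch job <jobid>
squeue -u $USER       # check the state of your queued and running jobs
```
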
### Example 2

In this example we do not want to use hyper-threading (``--ntasks-per-core=1`` and ``--hint=nomultithread``). We want to run a single task, but we need to use all the memory available in the node. For that, we specify that the job will use the whole memory of a node with ``--mem=352000`` (which is the maximum memory available on a single Apollo node). Whenever you want to run a job requiring more memory than the default (4000MB per thread), it is very important to specify the amount of memory that the job will use.

```bash
#!/bin/bash
#SBATCH --partition=hourly      # Using 'hourly' will grant higher priority
#SBATCH --ntasks-per-core=1     # Force no Hyper-Threading, will run 1 task per core
#SBATCH --hint=nomultithread    # Don't use extra threads with in-core multi-threading
#SBATCH --mem=352000            # We want to use the whole memory
#SBATCH --time=00:30:00         # Define max time job will run
#SBATCH --output=myscript.out   # Define your output file
#SBATCH --error=myscript.err    # Define your error file

module load $module             # Load any modules required by your program
srun my_script                  # Run your program ('my_script' is a placeholder for your executable)
```

In this example we run a single core job by defining ``--ntasks-per-core=1`` (which is also the default). In addition, we specify that the job will use the whole memory of a node with ``--mem=352000`` (which is the maximum memory available per Apollo node). Whenever you want to run a job needing more memory than the default (4000MB per thread), it is very important to specify the amount of memory that the job will use. This must be done in order to avoid conflicts with other jobs from other users.

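If you are unsure how much memory a node actually provides, you can ask Slurm for its configured memory before picking a value. A small sketch (``<nodename>`` is a placeholder for a real node name):

```bash
scontrol show node <nodename> | grep -i memory   # the RealMemory field shows the node's configured memory in MB
```
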
## Multi core based job examples

### Example 1: with Hyper-Threading

In this example we run a job that will run 88 tasks. Merlin6 Apollo nodes have 44 cores, each with hyper-threading enabled. This means that we can run 2 threads per core, 88 threads in total. To accomplish that, users should specify ``--ntasks-per-core=2`` and ``--hint=multithread``. In addition, we add the option ``--exclusive`` to ensure that the node usage is exclusive and no other jobs are running there. Finally, notice that the default memory per thread is 4000MB; hence, in total this job can use up to 352000MB of memory, which is the maximum allowed in a single node.

```bash
#!/bin/bash
#SBATCH --partition=hourly      # Using 'hourly' will grant higher priority
#SBATCH --exclusive             # Use the node in exclusive mode
#SBATCH --ntasks=88             # Job will run 88 tasks
#SBATCH --ntasks-per-core=2     # Force Hyper-Threading, will run 2 tasks per core
#SBATCH --hint=multithread      # Use extra threads with in-core multi-threading
#SBATCH --time=00:30:00         # Define max time job will run
#SBATCH --output=myscript.out   # Define your output file
#SBATCH --error=myscript.err    # Define your error file

module load gcc/8.3.0 openmpi/3.1.3

srun MPI_script                 # Run your MPI program ('MPI_script' is a placeholder for your executable)
```

In this example we run a job that will run 88 tasks. Merlin6 Apollo nodes have 44 cores, each with HT enabled. This means that we can run 2 threads per core, 88 threads in total. We add the option ``--exclusive`` to ensure that the node usage is exclusive and no other jobs are running there. Finally, since the default memory per thread is 4000MB, in total this job can use up to 352000MB of memory, which is the maximum allowed in a single node.

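To check how the 88 tasks were actually placed, a quick sanity check can be appended to the job script. A small sketch (any command works; ``hostname`` is just convenient):

```bash
srun hostname | sort | uniq -c   # counts how many of the 88 tasks ran on each node
```
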
### Example 2: without Hyper-Threading

In this example we want to run a job with 44 tasks, and for performance reasons we want to disable hyper-threading. Merlin6 Apollo nodes have 44 cores, each with hyper-threading enabled. To ensure that only 1 thread is used per core, users should specify ``--ntasks-per-core=1`` and ``--hint=nomultithread``. With this configuration, each task will run in 1 thread, and each task will be assigned to an independent core. We add the option ``--exclusive`` to ensure that the node usage is exclusive and no other jobs are running there. Finally, in our Slurm configuration the default memory per thread is 4000MB, but we want to use only 1 thread, which means only half of the memory would be used. If the job requires more memory, users need to increase it either by setting ``--mem=352000`` or, alternatively (not both), by setting ``--mem-per-cpu=8000``.

```bash
#!/bin/bash
#SBATCH --partition=hourly      # Using 'hourly' will grant higher priority
#SBATCH --exclusive             # Use the node in exclusive mode
#SBATCH --ntasks=44             # Job will run 44 tasks
#SBATCH --ntasks-per-core=1     # Force no Hyper-Threading, will run 1 task per core
#SBATCH --hint=nomultithread    # Don't use extra threads with in-core multi-threading
#SBATCH --mem=352000            # Define the whole memory of the node
#SBATCH --time=00:30:00         # Define max time job will run
#SBATCH --output=myscript.out   # Define your output file
#SBATCH --error=myscript.err    # Define your error file

module load gcc/8.3.0 openmpi/3.1.3

srun MPI_script                 # Run your MPI program ('MPI_script' is a placeholder for your executable)
```

In this example we run a job with 44 tasks, and Hyper-Threading will not be used. Merlin6 Apollo nodes have 44 cores, each with HT enabled. However, by defining ``--ntasks-per-core=1`` we force the use of a single thread per core (this is the default if not defined, but it is recommended to add it explicitly). Each task will run in 1 thread, and each task will be assigned to an independent core. We add the option ``--exclusive`` to ensure that the node usage is exclusive and no other jobs are running there. Finally, since the default memory per thread is 4000MB and we use only 1 thread, we want to avoid using only half of the memory: we have to specify that we will use the whole memory of the node with the option ``--mem=352000`` (which is the maximum memory available in the node).

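As mentioned above, the same total can alternatively be requested per CPU rather than per node. A sketch of the alternative directive (use one or the other, never both):

```bash
#SBATCH --mem-per-cpu=8000    # per-CPU alternative to --mem=352000
```
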
## Advanced examples

### Array Jobs: launching a large number of related jobs

If you need to run a large number of jobs based on the same executable with systematically varying inputs, e.g. for a parameter sweep, you can do this most easily in the form of a **simple array job**.

The key part of such a job script passes a different argument to each array task:

```bash
ARGS=(0.05 0.25 0.5 1 2 5 100)

srun ./my_program.exe ${ARGS[$SLURM_ARRAY_TASK_ID]}
```

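The fragment above indexes into the ``ARGS`` array with ``$SLURM_ARRAY_TASK_ID``, which Slurm sets individually for each array task. The matching request in the job header is an ``--array`` range covering the seven parameters. A minimal sketch of such a header (the partition and time limit are placeholders, not values taken from this page):

```bash
#!/bin/bash
#SBATCH --partition=hourly    # placeholder partition
#SBATCH --time=00:30:00       # placeholder time limit
#SBATCH --ntasks=1            # each array task runs one copy of the program
#SBATCH --array=0-6           # spawn 7 array tasks with SLURM_ARRAY_TASK_ID = 0..6
```
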
### Array jobs: running very long tasks with checkpoint files

If you need to run a job for much longer than the queues (partitions) permit, and your executable is able to create checkpoint files, you can use this approach: split the work into an array of subjobs that run one after the other, so that subjob n+1 is only started when subjob n has finished. Each subjob will continue from the checkpoint file if it is present.

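One way to express such a chain is a job array that allows only one array task to run at a time (the ``%1`` throttle in ``--array``). A minimal sketch, where the array range, time limit, and program name are placeholders and the program is assumed to resume from its own checkpoint file:

```bash
#!/bin/bash
#SBATCH --array=1-10%1    # 10 subjobs, but at most 1 running at any time
#SBATCH --time=23:00:00   # placeholder time limit per subjob

srun ./my_program         # placeholder; assumed to write checkpoints and resume from them if present
```
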
### Packed jobs: running a large number of short tasks

Since the launching of a Slurm job incurs some overhead, you should not submit each short task as a separate Slurm job. Use job packing, i.e. you run the short tasks within the loop of a single Slurm job.

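A minimal sketch of such a packed job, assuming 8 single-core tasks and a placeholder program ``./my_task`` (the partition and time limit are also placeholders):

```bash
#!/bin/bash
#SBATCH --partition=hourly    # placeholder partition
#SBATCH --ntasks=8            # number of short tasks packed into this single job
#SBATCH --time=00:30:00       # placeholder time limit

for i in $(seq 1 $SLURM_NTASKS); do
    srun --exclusive -n 1 ./my_task $i &   # start each short task in the background on its own CPU
done
wait                                       # wait for all background tasks to finish
```
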
**Note:** The `&` at the end of the `srun` line is needed so that the script does not block (wait) on each task.
The `wait` command waits for all such background tasks to finish and returns the exit code.

## Hands-On Example

Copy-paste the following example into a file called myAdvancedTest.batch:

```bash
#!/bin/bash
#SBATCH --partition=daily       # name of slurm partition to submit
#SBATCH --time=2:00:00          # limit the execution of this job to 2 hours, see sinfo for the max. allowance
#SBATCH --nodes=2               # number of nodes
#SBATCH --ntasks=44             # number of tasks
#SBATCH --ntasks-per-core=1     # run at most 1 task per core (no Hyper-Threading)
#SBATCH --hint=nomultithread    # Don't use extra threads with in-core multi-threading

module load gcc/9.2.0 openmpi/3.1.5_merlin6
module list

echo "Example no-MPI:" ; hostname        # will print one hostname per node
echo "Example MPI:" ; mpirun hostname    # will print one hostname per ntask
```

In the above example, the options ``--nodes=2`` and ``--ntasks=44`` are specified. This means that up to 2 nodes are requested, and the job is expected to run 44 tasks. Hence, 44 cores are needed for running that job. Slurm will try to allocate a maximum of 2 nodes, both together having at least 44 cores. Since our nodes have 44 cores each, if nodes are empty (no other users have running jobs there), the job can land on a single node (it has enough cores to run 44 tasks).

If you want to ensure that the job uses at least two different nodes (e.g. for boosting CPU frequency, or because the job requires more memory per core), you should specify additional options.

A good example is ``--ntasks-per-node=22``. This will distribute the 44 tasks equally across 2 nodes, 22 tasks per node.

```bash
#SBATCH --ntasks-per-node=22
```

A different example is specifying how much memory per core is needed. For instance, ``--mem-per-cpu=32000`` will reserve ~32000MB per core. Since we have a maximum of 352000MB per Apollo node, Slurm will only be able to allocate 11 cores per node (32000MB x 11 cores = 352000MB). This means that 4 nodes will be needed (a maximum of 11 tasks per node due to the memory definition, and we need to run 44 tasks), so in this case we need to change to ``--nodes=4`` (or remove ``--nodes``). Alternatively, we can decrease ``--mem-per-cpu`` to a lower value that allows more cores per node to be used (e.g. with ``16000``, 22 cores fit into the memory of each node, so 2 nodes are enough for the 44 tasks).

```bash
#SBATCH --mem-per-cpu=16000
```

Finally, in order to ensure exclusivity of the node, the option *--exclusive* can be used (see below). This will ensure that the requested nodes are exclusive to the job (no other users' jobs will interact with these nodes, and only completely free nodes will be allocated).

```bash
#SBATCH --exclusive
```

This can be combined with the previous examples.

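For instance, a header combining the per-node task count, per-core memory, and exclusive options discussed in this section could look as follows (a sketch built only from the values shown above):

```bash
#!/bin/bash
#SBATCH --partition=daily       # name of slurm partition to submit
#SBATCH --time=2:00:00          # limit the execution of this job to 2 hours
#SBATCH --nodes=2               # number of nodes
#SBATCH --ntasks=44             # number of tasks
#SBATCH --ntasks-per-node=22    # spread the 44 tasks evenly over the 2 nodes
#SBATCH --mem-per-cpu=16000     # 22 tasks x 16000MB = 352000MB per node
#SBATCH --exclusive             # do not share the nodes with other jobs
```
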
More advanced configurations can be defined and combined with the previous examples. More information about advanced options can be found at https://slurm.schedmd.com/sbatch.html (or run 'man sbatch').

If you have questions about how to properly execute your jobs, please contact us through merlin-admins@lists.psi.ch. Do not run advanced configurations unless you are sure of what you are doing.