From f0bee8ec7e048f5b37cac4377983ea7eeffe00be Mon Sep 17 00:00:00 2001
From: caubet_m
Date: Wed, 19 Jun 2019 16:37:45 +0200
Subject: [PATCH] Finished running jobs

---
 pages/merlin6/merlin6-slurm/running-jobs.md | 81 +++++++++++++++------
 1 file changed, 59 insertions(+), 22 deletions(-)

diff --git a/pages/merlin6/merlin6-slurm/running-jobs.md b/pages/merlin6/merlin6-slurm/running-jobs.md
index 8640e13..13e79f0 100644
--- a/pages/merlin6/merlin6-slurm/running-jobs.md
+++ b/pages/merlin6/merlin6-slurm/running-jobs.md
@@ -10,12 +10,10 @@ permalink: /merlin6/running-jobs.html
 
 ## Commands for running jobs
 
-* ``sbatch``: to submit a batch script to Slurm
-  * ``squeue``: for checking the status of your jobs
-  * ``scancel``: for deleting a job from the queue
+* ``sbatch``: to submit a batch script to Slurm. Use ``squeue`` for checking job status and ``scancel`` for deleting a job from the queue.
 * ``srun``: to run parallel jobs in the batch system
-* ``salloc``: to obtain a Slurm job allocation (a set of nodes), execute command(s), and then release the allocation when the command is finished.
-  * ``salloc`` is equivalent to an interactive run
+* ``salloc``: to obtain a Slurm job allocation (a set of nodes), execute command(s), and then release the allocation when the command is finished.
+This is equivalent to an interactive run.
 
 ## Shared nodes and exclusivity
 
@@ -28,7 +26,7 @@ we fill up first mixed nodes and we ensure that free full resources are availabl
 
 Exclusivity of a node can be set up by specifying the ``--exclusive`` option as follows:
 
-```bash
+```
 #SBATCH --exclusive
 ```
 
@@ -42,7 +40,7 @@ you submit the batch script:
 
 If you want to change the default names, it can be done with the options ``--output`` and ``--error``.
 For example:
 
-```batch
+```
 #SBATCH --output=logs/myJob.%N.%j.out # Generate an output file per hostname and jobid
 #SBATCH --error=logs/myJob.%N.%j.err  # Generate an error file per hostname and jobid
 ```
 
@@ -54,7 +52,7 @@ Use **man sbatch** (``man sbatch | grep -A36 '^filename pattern'``) for getting 
 
 Merlin6 contains 3 partitions for general purpose. These are ``general``, ``daily`` and ``hourly``.
 If no partition is defined, ``general`` will be the default. Partition can be defined with the ``--partition`` option as follows:
 
-```bash
+```
 #SBATCH --partition=  # name of slurm partition to submit. 'general' is the 'default'.
 ```
 
@@ -69,7 +67,7 @@ to run on CPU-based nodes. All users registered in Merlin6 are automatically inc
 
 The following options are mandatory settings that **must be included** in your batch scripts:
 
-```bash
+```
 #SBATCH --constraint=mc  # Always set it to 'mc' for CPU jobs.
 ```
 
@@ -80,10 +78,34 @@ There are some settings that are not mandatory but would be needed or useful to 
 
 * ``--time``: mostly used when you need to specify longer runs in the ``general`` partition, also useful
   for specifying shorter times. This may affect scheduling priorities.
 
-  ```bash
+  ```
   #SBATCH --time=  # Time job needs to run
   ```
 
+### Slurm CPU Template
+
+The following template should be used by any user submitting jobs to CPU nodes:
+
+```
+#!/bin/sh
+#SBATCH --partition=  # Specify 'general' or 'daily' or 'hourly'
+#SBATCH --time=       # Recommended; strongly recommended when using the 'general' partition.
+#SBATCH --output=     # Generate custom output file
+#SBATCH --error=      # Generate custom error file
+#SBATCH --constraint=mc     # You must specify 'mc' for CPU jobs
+#SBATCH --ntasks-per-core=1 # Recommended: one thread per core
+##SBATCH --exclusive        # Uncomment if you need exclusive node usage
+
+## Advanced options example
+##SBATCH --nodes=1            # Uncomment and specify number of nodes to use
+##SBATCH --ntasks=44          # Uncomment and specify number of tasks to use
+##SBATCH --ntasks-per-node=44 # Uncomment and specify number of tasks per node
+##SBATCH --ntasks-per-core=2  # Uncomment and specify number of tasks per core (threads)
+##SBATCH --cpus-per-task=44   # Uncomment and specify the number of cores per task
+```
+
+* Users needing hyper-threading can specify ``--ntasks-per-core=2`` instead. This is not recommended for generic usage.
+
 ## GPU-based Jobs Settings
 
 GPU-based jobs are restricted to BIO users; however, access for PSI users can be requested on demand. Users must belong to
@@ -94,27 +116,42 @@ are automatically registered to the ``merlin6-gpu`` account. Other users should 
 
 The following options are mandatory settings that **must be included** in your batch scripts:
 
-```bash
+```
 #SBATCH --constraint=gpu  # Always set it to 'gpu' for GPU jobs.
 #SBATCH --gres=gpu        # Always set at least this option when using GPUs
 ```
 
-## Slurm GPU Recommended Settings
+### Slurm GPU Recommended Settings
 
 GPUs are also a shared resource. Hence, multiple users can run jobs on a single node, but only one GPU
 per user process must be used. Users can define which GPU resources they need with the ``--gres`` option.
+Valid ``gres`` options are: ``gpu[[:type]:count]`` where ``type=GTX1080|GTX1080Ti`` and ``count=<number of GPUs to use>``.
 This would be according to the following rules:
 
-* All machines except ``merlin-g-001`` have up to 4 GPUs. ``merlin-g-001`` has up to 2 GPUs.
-* Two different NVIDIA models profiles exist: ``GTX1080`` and ``GTX1080Ti``.
-
-Valid ``gres`` options are: ``gpu[[:type]:count]`` where:
-
-* ``type``: can be ``GTX1080`` or ``GTX1080Ti``
-* ``count``: will be the number of GPUs to use
-
 For example:
 
-```batch
-#SBATCH --gres=gpu:GTX1080:4 # Use 4 x GTX1080 GPUs
+```
+#SBATCH --gres=gpu:GTX1080:8 # Use 8 x GTX1080 GPUs
+```
+
+### Slurm GPU Template
+
+The following template should be used by any user submitting jobs to GPU nodes:
+
+```
+#!/bin/sh
+#SBATCH --partition=  # Specify 'general' or 'daily' or 'hourly'
+#SBATCH --time=       # Recommended; strongly recommended when using the 'general' partition.
+#SBATCH --output=     # Generate custom output file
+#SBATCH --error=
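
---

To see how the pieces of the CPU template in this patch fit together, here is a minimal sketch of a filled-in batch script. The concrete values (the ``hourly`` partition choice, the 30-minute time limit, the ``logs/myJob.%N.%j.*`` log pattern, and the trailing ``echo``) are illustrative assumptions, not site defaults:

```shell
#!/bin/sh
# Illustrative example only -- partition, time, and log paths are assumptions
# to be adapted for your own job; they are not site-mandated values.
#SBATCH --partition=hourly            # One of 'general', 'daily', 'hourly'
#SBATCH --time=00:30:00               # 30 minutes of walltime (assumed)
#SBATCH --output=logs/myJob.%N.%j.out # Output file per hostname and jobid
#SBATCH --error=logs/myJob.%N.%j.err  # Error file per hostname and jobid
#SBATCH --constraint=mc               # Mandatory for CPU jobs on Merlin6
#SBATCH --ntasks-per-core=1           # Recommended: one thread per core

# Placeholder payload; replace with your real workload (e.g. an srun command).
echo "Job ${SLURM_JOB_ID:-unknown} running on $(hostname)"
```

Such a script would be submitted with ``sbatch``, monitored with ``squeue`` and removed with ``scancel``, matching the commands listed at the top of the patched page.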