
---
title: Running Jobs
last_updated: 18 June 2019
sidebar: merlin6_sidebar
permalink: /merlin6/running-jobs.html
---

Commands for running jobs

  • sbatch: to submit a batch script to Slurm. Use squeue to check the status of your jobs and scancel to delete a job from the queue (see the example below).
  • srun: to run parallel jobs in the batch system.
  • salloc: to obtain a Slurm job allocation (a set of nodes), execute command(s), and then release the allocation when the commands are finished. This is equivalent to an interactive run.
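
A minimal sketch of this workflow (the script name myjob.batch is only a placeholder):

sbatch myjob.batch    # Submit the batch script; Slurm prints the assigned job ID
squeue -u $USER       # Check the status of your queued and running jobs
scancel <job_id>      # Delete a job from the queue, using the ID printed by sbatch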

Running on Merlin5

The Merlin5 cluster will be available at least until 1st of November 2019. In the meantime, users can keep submitting jobs to the old cluster but they will need to specify a couple of extra options to their scripts.

#SBATCH --clusters=merlin5
#SBATCH --partition=merlin

Adding --clusters=merlin5 sends the jobs to the old Merlin5 computing nodes. In addition, --partition=merlin must be specified in order to use the old Merlin5 partition: the Merlin6 partitions general (default), daily and hourly will not work when submitting to Merlin5.
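
Put together, a minimal Merlin5 batch script could look as follows (the final command is only a placeholder):

#!/bin/sh
#SBATCH --clusters=merlin5   # Submit to the old Merlin5 cluster
#SBATCH --partition=merlin   # Use the Merlin5 'merlin' partition

echo "Running on $(hostname)"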

Shared nodes and exclusivity

The Merlin6 cluster has been designed to allow running MPI/OpenMP processes as well as single-core jobs. To allow this co-existence, nodes are configured in shared mode by default. This means that multiple jobs from multiple users may land on the same node. Users can change this behaviour if they require exclusive usage of nodes.

By default, Slurm will try to allocate jobs on nodes that are already occupied by processes that do not require exclusive usage of a node. In this way, mixed nodes are filled up first, keeping fully free nodes available for MPI/OpenMP jobs.

Exclusive usage of a node can be requested by specifying the --exclusive option as follows:

#SBATCH --exclusive

Output and Errors

By default, Slurm will generate the standard output and standard error files in the directory from which you submit the batch script:

  • standard output will be written into a file slurm-$SLURM_JOB_ID.out.
  • standard error will be written into a file slurm-$SLURM_JOB_ID.err.

If you want to change the default names, you can do so with the --output and --error options. For example:

#SBATCH --output=logs/myJob.%N.%j.out  # Generate an output file per hostname and jobid
#SBATCH --error=logs/myJob.%N.%j.err   # Generate an error file per hostname and jobid

Use man sbatch (for instance, man sbatch | grep -A36 '^filename pattern') to get the full specification of the available filename patterns.
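
For reference, a few commonly used patterns (the log paths are only examples):

#SBATCH --output=logs/%x.%j.out    # %x expands to the job name, %j to the job ID
#SBATCH --error=logs/%u.%j.err     # %u expands to the user name
##SBATCH --output=logs/%A_%a.out   # For job arrays: master job ID and array task index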

Partitions

Merlin6 contains 3 general purpose partitions: general, daily and hourly. If no partition is defined, general will be the default. The partition can be set with the --partition option as follows:

#SBATCH --partition=<general|daily|hourly>  # Partition to use. 'general' is the 'default'.
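
The partition can also be selected at submission time, since options passed on the sbatch command line take precedence over the #SBATCH directives in the script. A quick sketch (the script name is only a placeholder):

sbatch --partition=hourly myjob.batch   # Submit myjob.batch to the 'hourly' partition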

Please check the section [Slurm Configuration#Merlin6 Slurm Partitions] for more information about Merlin6 partition setup.

CPU-based Jobs Settings

CPU-based jobs are available to all PSI users. Users must belong to the merlin6 Slurm account in order to run on the CPU nodes. All users registered in Merlin6 are automatically included in this account.

Slurm CPU Mandatory Settings

The following options are mandatory settings that must be included in your batch scripts:

#SBATCH --constraint=mc   # Always set it to 'mc' for CPU jobs.

Some other settings are not mandatory, but are often needed or useful to specify:

  • --time: mostly used when you need to specify longer runs in the general partition, but also useful for requesting shorter times. This may affect scheduling priority.

    #SBATCH --time=<D-HH:MM:SS>   # Time job needs to run
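
    For example, concrete values in the D-HH:MM:SS format (the second line is a commented-out alternative):

    #SBATCH --time=0-01:00:00     # Example: 1 hour
    ##SBATCH --time=2-00:00:00    # Example: 2 days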
    

Slurm CPU Template

The following template should be used by any user submitting jobs to CPU nodes:

#!/bin/sh
#SBATCH --partition=<general|daily|hourly>  # Specify 'general' or 'daily' or 'hourly'
#SBATCH --time=<D-HH:MM:SS>                 # Strictly recommended when using 'general' partition.
#SBATCH --output=<output_file>              # Generate custom output file
#SBATCH --error=<error_file>                # Generate custom error  file
#SBATCH --constraint=mc                     # You must specify 'mc' when using 'cpu' jobs
#SBATCH --ntasks-per-core=1                 # Recommended one thread per core
##SBATCH --exclusive                        # Uncomment if you need exclusive node usage

## Advanced options example
##SBATCH --nodes=1                          # Uncomment and specify #nodes to use
##SBATCH --ntasks=44                        # Uncomment and specify #tasks to use
##SBATCH --ntasks-per-node=44               # Uncomment and specify #tasks per node
##SBATCH --ntasks-per-core=2                # Uncomment and specify #tasks per core (a.k.a. threads)
##SBATCH --cpus-per-task=44                 # Uncomment and specify the number of cores per task

  • Users needing hyper-threading can specify --ntasks-per-core=2 instead. This is not recommended for general usage.
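
As an illustration only, a filled-in version of this template for a simple multi-task job might look like the following (job command and log paths are placeholders):

#!/bin/sh
#SBATCH --partition=hourly            # Short test job, so 'hourly' fits
#SBATCH --time=00:30:00               # 30 minutes of run time
#SBATCH --output=logs/example.%j.out  # Output file per job ID
#SBATCH --error=logs/example.%j.err   # Error file per job ID
#SBATCH --constraint=mc               # CPU job
#SBATCH --ntasks=44                   # 44 tasks
#SBATCH --ntasks-per-core=1           # One task per core

srun hostname                         # Replace with your actual program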

GPU-based Jobs Settings

GPU-based jobs are restricted to BIO users; however, access for other PSI users can be requested on demand. Users must belong to the merlin6-gpu Slurm account in order to run on the GPU nodes. BIO users belonging to any BIO group are automatically registered in the merlin6-gpu account. Other users should request access from the Merlin6 administrators.

Slurm GPU Mandatory Settings

The following options are mandatory settings that must be included in your batch scripts:

#SBATCH --constraint=gpu   # Always set it to 'gpu' for GPU jobs.
#SBATCH --gres=gpu         # Always set at least this option when using GPUs

GPUs are also a shared resource: multiple users can run jobs on a single node, but each user process must use only one GPU. Users can define which GPU resources they need with the --gres option. The valid specification is gpu[[:type]:count], where type is GTX1080 or GTX1080Ti and count is the number of GPUs to use. For example:

#SBATCH --gres=gpu:GTX1080:8   # Use 8 x GTX1080 GPUs

Slurm GPU Template

The following template should be used by any user submitting jobs to GPU nodes:

#!/bin/sh
#SBATCH --partition=<general|daily|hourly>  # Specify 'general' or 'daily' or 'hourly'
#SBATCH --time=<D-HH:MM:SS>                 # Strictly recommended when using 'general' partition.
#SBATCH --output=<output_file>              # Generate custom output file
#SBATCH --error=<error_file>                # Generate custom error file
#SBATCH --constraint=gpu                    # You must specify 'gpu' for using GPUs
#SBATCH --gres="gpu:<type>:<number_gpus>"   # You should specify at least 'gpu'
#SBATCH --ntasks-per-core=1                 # GPU nodes have hyper-threading disabled
##SBATCH --exclusive                        # Uncomment if you need exclusive node usage

## Advanced options example
##SBATCH --nodes=1                          # Uncomment and specify number of nodes to use
##SBATCH --ntasks=44                        # Uncomment and specify number of tasks to use
##SBATCH --ntasks-per-node=44               # Uncomment and specify number of tasks per node
##SBATCH --cpus-per-task=44                 # Uncomment and specify the number of cores per task
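
Again purely as an illustration, a filled-in GPU script might look like the following (job command and log paths are placeholders):

#!/bin/sh
#SBATCH --partition=hourly             # Short test job, so 'hourly' fits
#SBATCH --time=00:30:00                # 30 minutes of run time
#SBATCH --output=logs/gpu-test.%j.out  # Output file per job ID
#SBATCH --error=logs/gpu-test.%j.err   # Error file per job ID
#SBATCH --constraint=gpu               # GPU job
#SBATCH --gres=gpu:GTX1080:2           # Request 2 x GTX1080 GPUs
#SBATCH --ntasks-per-core=1            # Hyper-threading is disabled on GPU nodes

srun nvidia-smi                        # Replace with your actual GPU program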