---
title: Running Jobs
#tags:
#keywords:
last_updated: 18 June 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/running-jobs.html
---

## Commands for running jobs

* ``sbatch``: to submit a batch script to Slurm. Use ``squeue`` for checking job status and ``scancel`` for deleting a job from the queue.
* ``srun``: to run parallel jobs in the batch system.
* ``salloc``: to obtain a Slurm job allocation (a set of nodes), execute command(s), and then release the allocation when the command is finished. This is equivalent to an interactive run.

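As a quick reference, a typical workflow with these commands could look as follows (the script name ``myjob.sh`` and the job ID ``12345678`` are just placeholders):

```bash
sbatch myjob.sh            # Submit the batch script; Slurm prints the assigned job ID
squeue -u $USER            # Check the status of your pending and running jobs
scancel 12345678           # Remove the job with the given job ID from the queue
salloc -N 1 srun hostname  # Allocate one node interactively and run a command on it
```
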
## Running on Merlin5

The **Merlin5** cluster will be available at least until the 1st of November 2019. In the meantime, users can keep submitting jobs to the old cluster,
but they need to add a couple of extra options to their scripts.

```bash
#SBATCH --clusters=merlin5
#SBATCH --partition=merlin
```

Adding ``--clusters=merlin5`` sends the jobs to the old Merlin5 computing nodes. Also, ``--partition=merlin`` needs to be specified in
order to use the old Merlin5 partitions; ``general`` (*default*), ``daily`` and ``hourly`` will not work when submitting to Merlin5.

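For instance, a complete minimal batch script for the old cluster could look like the sketch below, where the job name, file names and the command are only placeholders:

```bash
#!/bin/sh
#SBATCH --clusters=merlin5        # Submit to the old Merlin5 cluster
#SBATCH --partition=merlin        # Use the old Merlin5 partition
#SBATCH --job-name=test_merlin5   # Placeholder job name
#SBATCH --output=merlin5-%j.out   # Output file containing the job ID

echo "Running on $(hostname)"     # Replace with your actual command
```
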
## Shared nodes and exclusivity

The **Merlin6** cluster has been designed to allow running MPI/OpenMP processes as well as single-core jobs. To allow
co-existence, nodes are configured in shared mode by default. This means that multiple jobs from multiple users may land on the same node.
Users can change this behaviour if they require exclusive usage of nodes.

By default, Slurm tries to allocate jobs on nodes that are already occupied by processes not requiring exclusive usage of a node. In this way,
mixed nodes are filled up first, and entirely free nodes remain available for MPI/OpenMP jobs.

Exclusive usage of a node can be requested by specifying the ``--exclusive`` option as follows:

```bash
#SBATCH --exclusive
```

## Output and Errors

By default, Slurm generates the standard output and standard error files in the directory from which
you submit the batch script:

* standard output will be written to a file ``slurm-$SLURM_JOB_ID.out``.
* standard error will be written to a file ``slurm-$SLURM_JOB_ID.err``.

If you want to change the default names, this can be done with the ``--output`` and ``--error`` options. For example:

```bash
#SBATCH --output=logs/myJob.%N.%j.out   # Generate an output file per hostname and jobid
#SBATCH --error=logs/myJob.%N.%j.err    # Generate an error file per hostname and jobid
```

Use **man sbatch** (``man sbatch | grep -A36 '^filename pattern'``) to get the full specification of **filename patterns**.

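For reference, some of the most commonly used filename patterns are ``%j`` (job ID), ``%N`` (short hostname of the node), ``%u`` (user name) and ``%x`` (job name). A possible combination, purely as an illustration:

```bash
#SBATCH --output=logs/%x.%u.%j.out   # <jobname>.<username>.<jobid>.out in the 'logs' directory
#SBATCH --error=logs/%x.%u.%j.err    # Matching error file for the same job
```
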
## Partitions

Merlin6 contains six general purpose partitions:

* For the CPU these are ``general``, ``daily`` and ``hourly``.
* For the GPU these are ``gpu_general``, ``gpu_daily`` and ``gpu_hourly``.

If no partition is defined, ``general`` will be used as the default. A partition can be selected with the ``--partition`` option as follows:

```bash
#SBATCH --partition=<partition_name>   # Partition to use; 'general' is the default.
```

Please check the section [Slurm Configuration#Merlin6 Slurm Partitions] for more information about the Merlin6 partition setup.

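Note that the partition (like any other ``#SBATCH`` setting) can also be overridden at submission time on the command line, without editing the script; ``myjob.sh`` below is just a placeholder:

```bash
sbatch --partition=hourly myjob.sh   # Command-line options take precedence over '#SBATCH' lines in the script
```
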
## CPU-based Jobs Settings

CPU-based jobs are available to all PSI users. Users must belong to the ``merlin6`` Slurm ``Account`` in order to be able
to run on CPU-based nodes. All users registered in Merlin6 are automatically included in this ``Account``.

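If the account ever needs to be selected explicitly (it should normally be picked up automatically), this can be done with the standard Slurm ``--account`` option, using the account name mentioned above:

```bash
#SBATCH --account=merlin6   # Slurm account for CPU-based jobs
```
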
### Slurm CPU Recommended Settings

Some settings are not mandatory, but they are useful or sometimes needed. These are the following:

* ``--time``: mostly used when you need to specify longer runs in the ``general`` partition, but it is also useful for specifying shorter times. This may affect scheduling priorities.

```bash
#SBATCH --time=<D-HH:MM:SS>   # Time the job needs to run
```

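As a concrete illustration of the ``D-HH:MM:SS`` format, a job expected to run for at most one and a half days would request:

```bash
#SBATCH --time=1-12:00:00   # 1 day and 12 hours of maximum run time
```
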
### Slurm CPU Template

The following template should be used by any user submitting jobs to CPU nodes:

```bash
#!/bin/sh
#SBATCH --partition=<general|daily|hourly>  # Specify 'general', 'daily' or 'hourly'
#SBATCH --time=<D-HH:MM:SS>                 # Strictly recommended when using the 'general' partition
#SBATCH --output=<output_file>              # Generate custom output file
#SBATCH --error=<error_file>                # Generate custom error file
#SBATCH --ntasks-per-core=1                 # Recommended: one thread per core
##SBATCH --exclusive                        # Uncomment if you need exclusive node usage

## Advanced options example
##SBATCH --nodes=1                          # Uncomment and specify the number of nodes to use
##SBATCH --ntasks=44                        # Uncomment and specify the number of tasks to use
##SBATCH --ntasks-per-node=44               # Uncomment and specify the number of tasks per node
##SBATCH --ntasks-per-core=2                # Uncomment and specify the number of tasks per core (a.k.a. threads)
##SBATCH --cpus-per-task=44                 # Uncomment and specify the number of cores per task
```

* Users needing hyper-threading can specify ``--ntasks-per-core=2`` instead. This is not recommended for generic usage.

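Purely as an illustration of the template above, a filled-in script for a hypothetical MPI program ``my_mpi_app`` using one full node could look like this (module names are assumptions and must be adapted to your environment):

```bash
#!/bin/sh
#SBATCH --partition=daily            # Job is expected to finish within one day
#SBATCH --time=0-08:00:00            # 8 hours of maximum run time
#SBATCH --output=logs/mpi.%j.out     # Custom output file containing the job ID
#SBATCH --error=logs/mpi.%j.err      # Custom error file containing the job ID
#SBATCH --nodes=1                    # Use one node
#SBATCH --ntasks=44                  # One MPI task per core of the node
#SBATCH --ntasks-per-core=1          # One thread per core (no hyper-threading)

module load gcc openmpi              # Hypothetical module names; adjust to your environment
srun ./my_mpi_app                    # srun starts the MPI tasks under Slurm
```
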
## GPU-based Jobs Settings

GPU-based jobs are restricted to BIO users; however, access for other PSI users can be requested on demand. Users must belong to
the ``merlin6-gpu`` Slurm ``Account`` in order to be able to run on GPU-based nodes. BIO users belonging to any BIO group
are automatically registered to the ``merlin6-gpu`` account. Other users should request access from the Merlin6 administrators.

### Slurm GPU Mandatory Settings

The following options are mandatory settings that **must be included** in your batch scripts:

```bash
#SBATCH --gres=gpu   # Always set at least this option when using GPUs
```

### Slurm GPU Recommended Settings

GPUs are also a shared resource. Hence, multiple users can run jobs on a single node, but each user process must use only one GPU.
Users can define which GPU resources they need with the ``--gres`` option.
Valid ``gres`` specifications have the form ``gpu[[:type]:count]``, where ``type=GTX1080|GTX1080Ti`` and ``count=<number of GPUs to use>``.
For example:

```bash
#SBATCH --gres=gpu:GTX1080:8   # Use 8 x GTX1080 GPUs
```

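Since the ``type`` field in ``gpu[[:type]:count]`` is optional, a count can also be requested without specifying a GPU model, for example:

```bash
#SBATCH --gres=gpu:2   # Use 2 GPUs of any available type
```
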
### Slurm GPU Template

The following template should be used by any user submitting jobs to GPU nodes:

```bash
#!/bin/sh
#SBATCH --partition=gpu_<general|daily|hourly>  # Specify 'general', 'daily' or 'hourly'
#SBATCH --time=<D-HH:MM:SS>                     # Strictly recommended when using the 'gpu_general' partition
#SBATCH --output=<output_file>                  # Generate custom output file
#SBATCH --error=<error_file>                    # Generate custom error file
#SBATCH --gres="gpu:<type>:<number_gpus>"       # You should specify at least 'gpu'
#SBATCH --ntasks-per-core=1                     # GPU nodes have hyper-threading disabled
##SBATCH --exclusive                            # Uncomment if you need exclusive node usage

## Advanced options example
##SBATCH --nodes=1                              # Uncomment and specify the number of nodes to use
##SBATCH --ntasks=44                            # Uncomment and specify the number of tasks to use
##SBATCH --ntasks-per-node=44                   # Uncomment and specify the number of tasks per node
##SBATCH --cpus-per-task=44                     # Uncomment and specify the number of cores per task
```

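Again purely as an illustration, a filled-in script for a hypothetical GPU program ``my_gpu_app`` requesting two GTX1080 cards could look like this:

```bash
#!/bin/sh
#SBATCH --partition=gpu_hourly       # Job is expected to finish within one hour
#SBATCH --output=logs/gpu.%j.out     # Custom output file containing the job ID
#SBATCH --error=logs/gpu.%j.err      # Custom error file containing the job ID
#SBATCH --gres=gpu:GTX1080:2         # Use 2 x GTX1080 GPUs
#SBATCH --ntasks-per-core=1          # GPU nodes have hyper-threading disabled

srun ./my_gpu_app                    # Replace with your actual command
```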