Doc changes
This commit is contained in:
parent 42d8f38934
commit fcfdbf1344
@ -41,17 +41,17 @@ entries:
|
||||
url: /merlin6/ssh-keys.html
|
||||
- title: Software repository - PModules
|
||||
url: /merlin6/using-modules.html
|
||||
- title: Job Submission
|
||||
- title: Slurm General Documentation
|
||||
folderitems:
|
||||
- title: Slurm Basic Commands
|
||||
url: /merlin6/slurm-basics.html
|
||||
- title: Running Batch Scripts
|
||||
- title: Running Slurm Batch Scripts
|
||||
url: /merlin6/running-jobs.html
|
||||
- title: Running Interactive Jobs
|
||||
- title: Running Slurm Interactive Jobs
|
||||
url: /merlin6/interactive-jobs.html
|
||||
- title: Slurm Examples
|
||||
- title: Slurm Batch Script Examples
|
||||
url: /merlin6/slurm-examples.html
|
||||
- title: Monitoring
|
||||
- title: Slurm Monitoring
|
||||
url: /merlin6/monitoring.html
|
||||
- title: Merlin6 CPU Slurm cluster
|
||||
folderitems:
|
||||
@ -101,16 +101,8 @@ entries:
|
||||
url: /merlin6/ansys-mapdl.html
|
||||
- title: ParaView
|
||||
url: /merlin6/paraview.html
|
||||
- title: Announcements
|
||||
folderitems:
|
||||
- title: Downtimes
|
||||
url: /merlin6/downtimes.html
|
||||
- title: Past Downtimes
|
||||
url: /merlin6/past-downtimes.html
|
||||
- title: Support
|
||||
folderitems:
|
||||
- title: Migrating From Merlin5
|
||||
url: /merlin6/migrating.html
|
||||
- title: Known Problems
|
||||
url: /merlin6/known-problems.html
|
||||
- title: Troubleshooting
|
||||
|
@ -63,13 +63,13 @@ and, if possible, they will preempt running jobs from partitions with lower *Pri
|
||||
|
||||
### Merlin6 GPU Accounts
|
||||
|
||||
Users might need to specify the Slurm account to be used. If no account is specified, the **`merlin`** **account** will be used as default:
|
||||
Users need to ensure that the public **`merlin`** account is specified. If no account option is specified, this account is used by default.
This is mostly relevant for users with multiple Slurm accounts, who may specify a different account by mistake.
|
||||
|
||||
```bash
|
||||
#SBATCH --account=merlin # Possible values: merlin, gwendolen_public, gwendolen
|
||||
```
|
||||
|
||||
Not all accounts can be used on all partitions. This is resumed in the table below:
|
||||
Not all accounts can be used on all partitions. This is summarized in the table below:
|
||||
|
||||
| Slurm Account | Slurm Partitions |
|
||||
|:-------------------: | :--------------: |
|
||||
@ -77,13 +77,17 @@ Not all accounts can be used on all partitions. This is resumed in the table bel
|
||||
| **gwendolen_public** | `gwendolen` |
|
||||
| **gwendolen** | `gwendolen` |
|
||||
|
||||
By default, all users belong to the `merlin` and `gwendolen_public` Slurm accounts.
|
||||
The `gwendolen` **account** is only available for a few set of users, other users must use `gwendolen_public` instead.
|
||||
By default, all users belong to the `merlin` and `gwendolen_public` Slurm accounts. `gwendolen` is a restricted account.
|
||||
|
||||
For running jobs in the `gwendolen` **partition**, users must specify one of the `gwendolen_public` or `gwendolen` accounts.
|
||||
The `merlin` account is not allowed in the `gwendolen` partition.
|
||||
#### The 'gwendolen' accounts
|
||||
|
||||
### GPU specific options
|
||||
For running jobs in the **`gwendolen`** partition, users must specify one of the `gwendolen_public` or `gwendolen` accounts.
|
||||
The `merlin` account is not allowed to use the `gwendolen` partition.
|
||||
|
||||
* The **`gwendolen_public`** account can be used by any Merlin user, and provides restricted resource access to **`gwendolen`** (see the example below).
* The **`gwendolen`** account is restricted to a set of users, and provides full access to **`gwendolen`**.
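
As an illustration, a minimal job header for using **`gwendolen`** through the public account could look as follows (the GPU count and run time are placeholder values, and `my_gpu_program` is a hypothetical executable):

```bash
#!/bin/bash
#SBATCH --cluster=gmerlin6          # GPU cluster
#SBATCH --partition=gwendolen       # The 'gwendolen' partition
#SBATCH --account=gwendolen_public  # Use 'gwendolen' instead only if you belong to the restricted account
#SBATCH --gpus=1                    # Placeholder: adapt the number of GPUs to your needs
#SBATCH --time=01:00:00             # Placeholder: adapt the run time to your needs

srun my_gpu_program                 # 'my_gpu_program' is a hypothetical executable
```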
|
||||
|
||||
### Slurm GPU specific options
|
||||
|
||||
Some options are available when using GPUs. These are detailed here.
|
||||
|
||||
@ -115,6 +119,7 @@ for each Slurm command for further information about it (`man salloc`, `man sbat
|
||||
Below are listed the most common settings:
|
||||
|
||||
```bash
|
||||
#SBATCH --hint=[no]multithread
|
||||
#SBATCH --ntasks=<ntasks>
|
||||
#SBATCH --ntasks-per-gpu=<ntasks>
|
||||
#SBATCH --mem-per-gpu=<size[units]>
|
||||
@ -127,6 +132,17 @@ Below are listed the most common settings:
|
||||
|
||||
Please notice that if `[<type>:]` is defined in one option, then all other GPU options must use it too!
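
For instance, a short sketch (the GPU type `GTX1080` and the numbers are illustrative values taken from the examples in this documentation):

```bash
#SBATCH --gpus=GTX1080:4            # The GPU type is defined here...
#SBATCH --gpus-per-node=GTX1080:2   # ...so it must also be defined in every other GPU option
#SBATCH --mem-per-gpu=16000         # Options without a [<type>:] field are not affected
```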
|
||||
|
||||
#### Dealing with Hyper-Threading
|
||||
|
||||
The **`gmerlin6`** cluster contains the partition `gwendolen`, which has a node with Hyper-Threading enabled.
On that node, one should always specify whether or not to use Hyper-Threading. If not defined, Slurm will
generally use it (exceptions apply). For this machine, Hyper-Threading is generally recommended.
|
||||
|
||||
```bash
|
||||
#SBATCH --hint=multithread # Use extra threads with in-core multi-threading.
|
||||
#SBATCH --hint=nomultithread # Don't use extra threads with in-core multi-threading.
|
||||
```
|
||||
|
||||
## User and job limits
|
||||
|
||||
The GPU cluster has some basic user and job limits in place to ensure that a single user can not abuse the resources and to guarantee a fair usage of the cluster.
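
For example, the QoS limits that implement these restrictions (described below) can be inspected from a login node; this is a simple sketch, and the `format` column names are an assumption (see `man sacctmgr` for the authoritative list):

```bash
sacctmgr show qos                                 # List all QoS and their limits
sacctmgr show qos format=Name,MaxTRES,MaxTRESPU   # Assumed column names for per-job and per-user limits
```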
|
||||
@ -135,8 +151,8 @@ The limits are described below.
|
||||
### Per job limits
|
||||
|
||||
These are limits applying to a single job. In other words, there is a maximum of resources a single job can use.
|
||||
Limits are defined using QoS, and this is usually set at the partition level. Limits are described in the table below with the format: `SlurmQoS(limits)`,
|
||||
(list of possible `SlurmQoS` values can be listed with the command `sacctmgr show qos`):
|
||||
Limits are defined using QoS, and this is usually set at the partition level. Limits are described in the table below with the format: `SlurmQoS(limits)`
|
||||
(possible `SlurmQoS` values can be listed with the command `sacctmgr show qos`):
|
||||
|
||||
| Partition | Slurm Account | Mon-Sun 0h-24h |
|
||||
|:-------------:| :----------------: | :------------------------------------------: |
|
||||
@ -159,8 +175,8 @@ For full access, the `gwendolen` account is needed, and this is restricted to a
|
||||
### Per user limits for GPU partitions
|
||||
|
||||
These limits apply exclusively to users. In other words, there is a maximum of resources a single user can use.
|
||||
Limits are defined using QoS, and this is usually set at the partition level. Limits are described in the table below with the format: `SlurmQoS(limits)`,
|
||||
(list of possible `SlurmQoS` values can be listed with the command `sacctmgr show qos`):
|
||||
Limits are defined using QoS, and this is usually set at the partition level. Limits are described in the table below with the format: `SlurmQoS(limits)`
|
||||
(possible `SlurmQoS` values can be listed with the command `sacctmgr show qos`):
|
||||
|
||||
| Partition | Slurm Account | Mon-Sun 0h-24h |
|
||||
|:-------------:| :----------------: | :---------------------------------------------: |
|
||||
|
@ -23,7 +23,7 @@ The following table show default and maximum resources that can be used per node
|
||||
| merlin-c-[33-45] | 1 core | 16 cores | 1 | 60000 | 10000 |
|
||||
| merlin-c-[46-47] | 1 core | 16 cores | 1 | 124000 | 10000 |
|
||||
|
||||
There is one main difference between the Merlin5 and Merlin6 clusters: Merlin5 is keeping an old configuration which does not
|
||||
There is one *main difference between the Merlin5 and Merlin6 clusters*: Merlin5 is keeping an old configuration which does not
|
||||
consider the memory as a *consumable resource*. Hence, users can *oversubscribe* memory. This might trigger some side-effects, but
|
||||
this legacy configuration has been kept to ensure that old jobs can keep running in the same way they did a few years ago.
|
||||
If you know that this might be a problem for you, please, always use Merlin6 instead.
|
||||
@ -41,7 +41,7 @@ To run jobs in the **`merlin5`** cluster users **must** specify the cluster name
|
||||
#SBATCH --cluster=merlin5
|
||||
```
|
||||
|
||||
### CPU partitions
|
||||
### Merlin5 CPU partitions
|
||||
|
||||
Users might need to specify the Slurm partition. If no partition is specified, it will default to **`merlin`**:
|
||||
|
||||
@ -63,20 +63,18 @@ partitions, Slurm will also attempt first to allocate jobs on partitions with hi
|
||||
**\*\***Jobs submitted to a partition with a higher **PriorityTier** value will be dispatched before pending jobs in partition with lower *PriorityTier* value
|
||||
and, if possible, they will preempt running jobs from partitions with lower *PriorityTier* values.
|
||||
|
||||
The `merlin-long` partition, as it might contain jobs running for up to 21 days, is limited to 4 nodes.
|
||||
The **`merlin-long`** partition **is limited to 4 nodes**, as it might contain jobs running for up to 21 days.
|
||||
|
||||
### Merlin5 CPU Accounts
|
||||
|
||||
Users need to ensure that the **`merlin`** **account** is specified (or no account is specified).
|
||||
This is the unique account available in the **merlin5** cluster.
|
||||
This is mostly needed by users which have multiple Slurm accounts, which may defined by mistake a different account existing in
|
||||
one of the other Merlin clusters (i.e. `merlin6`, `gmerlin6`).
|
||||
Users need to ensure that the public **`merlin`** account is specified. If no account option is specified, this account is used by default.
This is mostly relevant for users with multiple Slurm accounts, who may specify a different account by mistake.
|
||||
|
||||
```bash
|
||||
#SBATCH --account=merlin # Possible values: merlin
|
||||
```
|
||||
|
||||
### Merlin5 CPU specific options
|
||||
### Slurm CPU specific options
|
||||
|
||||
Some options are available when using CPUs. These are detailed here.
|
||||
|
||||
@ -109,124 +107,29 @@ resources from the batch system would drain the entire cluster for fitting the j
|
||||
Hence, there is a need to set up sensible limits and to ensure a fair usage of the resources, trying to optimize the overall efficiency
of the cluster while allowing jobs of different natures and sizes (that is, **single core** based **vs parallel jobs** of different sizes) to run.
|
||||
|
||||
{{site.data.alerts.warning}}Wide limits are provided in the <b>daily</b> and <b>hourly</b> partitions, while for <b>general</b> those limits are
|
||||
more restrictive.
|
||||
<br>However, we kindly ask users to inform the Merlin administrators when there are plans to send big jobs which would require a
|
||||
massive draining of nodes for allocating such jobs. This would apply to jobs requiring the <b>unlimited</b> QoS (see below <i>"Per job limits"</i>)
|
||||
{{site.data.alerts.end}}
|
||||
Since not many users run in the **`merlin5`** cluster, these limits are wider than the ones set in the **`merlin6`** and **`gmerlin6`** clusters.
|
||||
|
||||
{{site.data.alerts.tip}}If you have different requirements, please let us know and we will try to accommodate or propose a solution for you.
|
||||
{{site.data.alerts.end}}
|
||||
### Per job limits
|
||||
|
||||
#### Per job limits
|
||||
These are limits which apply to a single job. In other words, there is a maximum of resources a single job can use. These limits are described in the table below,
|
||||
with the format `SlurmQoS(limits)` (`SlurmQoS` can be listed from the `sacctmgr show qos` command):
|
||||
|
||||
These are limits which apply to a single job. In other words, there is a maximum of resources a single job can use. This is described in the table below,
|
||||
and limits will vary depending on the day of the week and the time (*working* vs *non-working* hours). Limits are shown in format: `SlurmQoS(limits)`,
|
||||
where `SlurmQoS` can be seen with the command `sacctmgr show qos`:
|
||||
| Partition | Mon-Sun 0h-24h | Other limits |
|
||||
|:---------------: | :--------------: | :----------: |
|
||||
| **merlin** | merlin5(cpu=384) | None |
|
||||
| **merlin-long** | merlin5(cpu=384) | Max. 4 nodes |
|
||||
|
||||
| Partition | Mon-Fri 0h-18h | Sun-Thu 18h-0h | From Fri 18h to Mon 0h |
|
||||
|:----------: | :------------------: | :------------: | :---------------------: |
|
||||
| **general** | normal(cpu=704,mem=2750G) | normal(cpu=704,mem=2750G) | normal(cpu=704,mem=2750G) |
|
||||
| **daily** | daytime(cpu=704,mem=2750G) | nighttime(cpu=1408,mem=5500G) | unlimited(cpu=2200,mem=8593.75G) |
|
||||
| **hourly** | unlimited(cpu=2200,mem=8593.75G) | unlimited(cpu=2200,mem=8593.75G) | unlimited(cpu=2200,mem=8593.75G) |
|
||||
By default, by QoS limits, a job can not use more than 384 cores (max CPU per job).
|
||||
However, for `merlin-long` this is even more restricted: there is an extra limit of 4 dedicated nodes for this partition. This is defined
at the partition level, and will override any QoS limit as long as it is more restrictive.
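
As a sketch, a job staying within these per-job limits could request resources as follows (the values are illustrative, not recommendations):

```bash
#SBATCH --cluster=merlin5       # Run in the merlin5 cluster
#SBATCH --partition=merlin      # Or 'merlin-long' (limited to 4 nodes) for jobs of up to 21 days
#SBATCH --ntasks=384            # Illustrative: up to the QoS limit of 384 cores per job
```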
|
||||
|
||||
By default, a job can not use more than 704 cores (max CPU per job). In the same way, memory is also proportionally limited. This is equivalent as
|
||||
running a job using up to 8 nodes at once. This limit applies to the **general** partition (fixed limit) and to the **daily** partition (only during working hours).
|
||||
Limits are softed for the **daily** partition during non working hours, and during the weekend limits are even wider.
|
||||
### Per user limits for CPU partitions
|
||||
|
||||
For the **hourly** partition, **despite running many parallel jobs is something not desirable** (for allocating such jobs it requires massive draining of nodes),
|
||||
wider limits are provided. In order to avoid massive nodes drain in the cluster, for allocating huge jobs, setting per job limits is necessary. Hence, **unlimited** QoS
|
||||
mostly refers to "per user" limits more than to "per job" limits (in other words, users can run any number of hourly jobs, but the job size for such jobs is limited
|
||||
with wide values).
|
||||
No per-user limits are applied through QoS. For the **`merlin`** partition, a single user could fill the whole batch system with jobs (however, the restriction is on the job size, as explained above). For the **`merlin-long`** partition, the 4-node limitation still applies.
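
For instance, one can check one's own current usage of the cluster with a standard Slurm command (a simple sketch):

```bash
squeue --clusters=merlin5 -u $USER   # List your own pending and running jobs in the merlin5 cluster
```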
|
||||
|
||||
#### Per user limits for CPU partitions
|
||||
|
||||
These limits which apply exclusively to users. In other words, there is a maximum of resources a single user can use. This is described in the table below,
|
||||
and limits will vary depending on the day of the week and the time (*working* vs *non-working* hours). Limits are shown in format: `SlurmQoS(limits)`,
|
||||
where `SlurmQoS` can be seen with the command `sacctmgr show qos`:
|
||||
|
||||
| Partition | Mon-Fri 0h-18h | Sun-Thu 18h-0h | From Fri 18h to Mon 0h |
|
||||
|:-----------:| :----------------: | :------------: | :---------------------: |
|
||||
| **general** | normal(cpu=704,mem=2750G) | normal(cpu=704,mem=2750G) | normal(cpu=704,mem=2750G) |
|
||||
| **daily** | daytime(cpu=1408,mem=5500G) | nighttime(cpu=2112,mem=8250G) | unlimited(cpu=6336,mem=24750G) |
|
||||
| **hourly** | unlimited(cpu=6336,mem=24750G) | unlimited(cpu=6336,mem=24750G)| unlimited(cpu=6336,mem=24750G) |
|
||||
|
||||
By default, users can not use more than 704 cores at the same time (max CPU per user). Memory is also proportionally limited in the same way. This is
|
||||
equivalent to 8 exclusive nodes. This limit applies to the **general** partition (fixed limit) and to the **daily** partition (only during working hours).
|
||||
For the **hourly** partition, there are no limits restriction and user limits are removed. Limits are softed for the **daily** partition during non
|
||||
working hours, and during the weekend limits are removed.
|
||||
|
||||
## Merlin6 GPU
|
||||
|
||||
Basic configuration for the **merlin5 GPUs** will be detailed here.
|
||||
For advanced usage, please refer to [Understanding the Slurm configuration (for advanced users)](/merlin5/slurm-configuration.html#understanding-the-slurm-configuration-for-advanced-users)
|
||||
|
||||
### GPU nodes definition
|
||||
|
||||
| Nodes | Def.#CPUs | Max.#CPUs | #Threads | Def.Mem/CPU | Max.Mem/CPU | Max.Mem/Node | Max.Swap | GPU Type | Def.#GPUs | Max.#GPUs |
|
||||
|:------------------:| ---------:| :--------:| :------: | :----------:| :----------:| :-----------:| :-------:| :--------: | :-------: | :-------: |
|
||||
| merlin-g-[001] | 1 core | 8 cores | 1 | 4000 | 102400 | 102400 | 10000 | **GTX1080** | 1 | 2 |
|
||||
| merlin-g-[002-005] | 1 core | 20 cores | 1 | 4000 | 102400 | 102400 | 10000 | **GTX1080** | 1 | 4 |
|
||||
| merlin-g-[006-009] | 1 core | 20 cores | 1 | 4000 | 102400 | 102400 | 10000 | **GTX1080Ti** | 1 | 4 |
|
||||
| merlin-g-[010-013] | 1 core | 20 cores | 1 | 4000 | 102400 | 102400 | 10000 | **RTX2080Ti** | 1 | 4 |
|
||||
|
||||
{{site.data.alerts.tip}}Always check <b>'/etc/slurm/gres.conf'</b> for changes in the GPU type and details of the NUMA node.
|
||||
{{site.data.alerts.end}}
|
||||
|
||||
### GPU partitions
|
||||
|
||||
| GPU Partition | Default Time | Max Time | Max Nodes | Priority | PriorityJobFactor\* |
|
||||
|:-----------------: | :----------: | :------: | :-------: | :------: | :-----------------: |
|
||||
| **<u>gpu</u>** | 1 day | 1 week | 4 | low | 1 |
|
||||
| **gpu-short** | 2 hours | 2 hours | 4 | highest | 1000 |
|
||||
|
||||
\*The **PriorityJobFactor** value will be added to the job priority (*PARTITION* column in `sprio -l` ). In other words, jobs sent to higher priority
|
||||
partitions will usually run first (however, other factors such like **job age** or mainly **fair share** might affect to that decision). For the GPU
|
||||
partitions, Slurm will also attempt first to allocate jobs on partitions with higher priority over partitions with lesser priority.
|
||||
|
||||
### User and job limits
|
||||
|
||||
The GPU cluster contains some basic user and job limits to ensure that a single user can not overabuse the resources and a fair usage of the cluster.
|
||||
The limits are described below.
|
||||
|
||||
#### Per job limits
|
||||
|
||||
These are limits applying to a single job. In other words, there is a maximum of resources a single job can use.
|
||||
Limits are defined using QoS, and this is usually set at the partition level. Limits are described in the table below with the format: `SlurmQoS(limits)`,
|
||||
(list of possible `SlurmQoS` values can be listed with the command `sacctmgr show qos`):
|
||||
|
||||
| Partition | Mon-Sun 0h-24h |
|
||||
|:-------------:| :------------------------------------: |
|
||||
| **gpu** | gpu_week(cpu=40,gres/gpu=8,mem=200G) |
|
||||
| **gpu-short** | gpu_week(cpu=40,gres/gpu=8,mem=200G) |
|
||||
|
||||
With these limits, a single job can not use more than 40 CPUs, more than 8 GPUs or more than 200GB.
|
||||
Any job exceeding such limits will stay in the queue with the message **`QOSMax[Cpu|GRES|Mem]PerJob`**.
|
||||
Since there are no more existing QoS during the week temporary overriding job limits (this happens for instance in the CPU **daily** partition), the job needs to be cancelled, and the requested resources must be adapted according to the above resource limits.
|
||||
|
||||
#### Per user limits for CPU partitions
|
||||
|
||||
These limits apply exclusively to users. In other words, there is a maximum of resources a single user can use.
|
||||
Limits are defined using QoS, and this is usually set at the partition level. Limits are described in the table below with the format: `SlurmQoS(limits)`,
|
||||
(list of possible `SlurmQoS` values can be listed with the command `sacctmgr show qos`):
|
||||
|
||||
| Partition | Mon-Sun 0h-24h |
|
||||
|:-------------:| :---------------------------------------------------------: |
|
||||
| **gpu** | gpu_week(cpu=80,gres/gpu=16,mem=400G) |
|
||||
| **gpu-short** | gpu_week(cpu=80,gres/gpu=16,mem=400G) |
|
||||
|
||||
With these limits, a single user can not use more than 80 CPUs, more than 16 GPUs or more than 400GB.
|
||||
Jobs sent by any user already exceeding such limits will stay in the queue with the message **`QOSMax[Cpu|GRES|Mem]PerUser`**. In that case, job can wait in the queue until some of the running resources are freed.
|
||||
|
||||
Notice that user limits are wider than job limits. In that way, a user can run up to two 8 GPUs based jobs, or up to four 4 GPUs based jobs, etc.
|
||||
Please try to avoid occupying all GPUs of the same type for several hours or multiple days, otherwise it would block other users needing the same
|
||||
type of GPU.
|
||||
|
||||
## Understanding the Slurm configuration (for advanced users)
|
||||
## Advanced Slurm configuration
|
||||
|
||||
Clusters at PSI use the [Slurm Workload Manager](http://slurm.schedmd.com/) as the batch system technology for managing and scheduling jobs.
|
||||
Historically, *Merlin4* and *Merlin5* also used Slurm. In the same way, **Merlin6** has also been configured with this batch system.
|
||||
|
||||
Slurm has been installed in a **multi-clustered** configuration, allowing the integration of multiple clusters in the same batch system.
|
||||
|
||||
To understand the Slurm configuration setup in the cluster, it may sometimes be useful to check the following files:
|
||||
@ -235,6 +138,5 @@ For understanding the Slurm configuration setup in the cluster, sometimes may be
|
||||
* ``/etc/slurm/gres.conf`` - can be found in the GPU nodes, and is also propagated to login nodes and computing nodes for user read access.
|
||||
* ``/etc/slurm/cgroup.conf`` - can be found in the computing nodes, is also propagated to login nodes for user read access.
|
||||
|
||||
The previous configuration files which can be found in the login nodes, correspond exclusively to the **merlin5** cluster configuration files.
|
||||
Configuration files for the old **merlin5** cluster must be checked directly on any of the **merlin5** computing nodes: these are not propagated
|
||||
to the **merlin5** login nodes.
|
||||
The previous configuration files, which can be found in the login nodes, correspond exclusively to the **merlin6** cluster configuration files.
Configuration files for the old **merlin5** cluster or for the **gmerlin6** cluster must be checked directly on any of the **merlin5** or **gmerlin6** computing nodes (for example, by logging in to one of the nodes while a job or an active allocation is running).
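
For instance, a possible (illustrative) way to read such a file without logging in interactively is to run the command through Slurm itself; the cluster and partition names below are taken from this documentation, and an account with access to them is assumed:

```bash
# Print the gmerlin6 Slurm configuration from one of its computing nodes
srun --clusters=gmerlin6 --partition=gpu-short --ntasks=1 cat /etc/slurm/slurm.conf
```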
|
||||
|
@ -1,341 +0,0 @@
|
||||
---
|
||||
title: Running Slurm Scripts
|
||||
#tags:
|
||||
keywords: batch script, slurm, sbatch, srun
|
||||
last_updated: 23 January 2020
|
||||
summary: "This document describes how to run batch scripts in Slurm."
|
||||
sidebar: merlin6_sidebar
|
||||
permalink: /merlin6/running-jobs.html
|
||||
---
|
||||
|
||||
|
||||
## The rules
|
||||
|
||||
Before starting using the cluster, please read the following rules:
|
||||
|
||||
1. Always try to **estimate and** to **define a proper run time** of your jobs:
|
||||
* Use ``--time=<D-HH:MM:SS>`` for that.
|
||||
* This will ease *scheduling* and *backfilling*.
|
||||
* Slurm will schedule efficiently the queued jobs.
|
||||
* For very long runs, please consider using ***[Job Arrays with Checkpointing](/merlin6/running-jobs.html#array-jobs-running-very-long-tasks-with-checkpoint-files)***
|
||||
2. Try to optimize your jobs for running within **one day**. Please, consider the following:
|
||||
* Some software can simply scale up by using more nodes while drastically reducing the run time.
|
||||
* Some software allow to save a specific state, and a second job can start from that state.
|
||||
* ***[Job Arrays with Checkpointing](/merlin6/running-jobs.html#array-jobs-running-very-long-tasks-with-checkpoint-files)*** can help you with that.
|
||||
* Use the **'daily'** partition when you ensure that you can run within one day:
|
||||
* ***'daily'*** **will give you more priority than running in the** ***'general'*** **queue!**
|
||||
3. Is **forbidden** to run **very short jobs**:
|
||||
* Running jobs of few seconds can cause severe problems.
|
||||
* Running very short jobs causes a lot of overhead.
|
||||
* ***Question:*** Is my job a very short job?
|
||||
* ***Answer:*** If it lasts in few seconds or very few minutes, yes.
|
||||
* ***Question:*** How long should my job run?
|
||||
* ***Answer:*** as the *Rule of Thumb*, from 5' would start being ok, from 15' would preferred.
|
||||
* Use ***[Packed Jobs](/merlin6/running-jobs.html#packed-jobs-running-a-large-number-of-short-tasks)*** for running a large number of short tasks.
|
||||
* For short runs lasting in less than 1 hour, please use the **hourly** partition.
|
||||
* ***'hourly'*** **will give you more priority than running in the** ***'daily'*** **queue!**
|
||||
4. Do not submit hundreds of similar jobs!
|
||||
* Use ***[Array Jobs](/merlin6/running-jobs.html#array-jobs-launching-a-large-number-of-related-jobs)*** for gathering jobs instead.
|
||||
|
||||
{{site.data.alerts.tip}}Having a good estimation of the <i>time</i> needed by your jobs, a proper way for running them, and optimizing the jobs to <i>run within one day</i> will contribute to make the system fairly and efficiently used.
|
||||
{{site.data.alerts.end}}
|
||||
|
||||
## Basic commands for running batch scripts
|
||||
|
||||
* Use **``sbatch``** for submitting a batch script to Slurm.
|
||||
* Use **``srun``** for running parallel tasks.
|
||||
* Use **``squeue``** for checking jobs status.
|
||||
* Use **``scancel``** for cancelling/deleting a job from the queue.
|
||||
|
||||
{{site.data.alerts.tip}}Use Linux <b>'man'</b> pages when needed (i.e. <span style="color:orange;">'man sbatch'</span>), mostly for checking the available options for the above commands.
|
||||
{{site.data.alerts.end}}
|
||||
|
||||
## Basic settings
|
||||
|
||||
For a complete list of options and parameters available is recommended to use the **man pages** (i.e. ``man sbatch``, ``man srun``, ``man salloc``).
|
||||
Please, notice that behaviour for some parameters might change depending on the command used when running jobs (in example, ``--exclusive`` behaviour in ``sbatch`` differs from ``srun``).
|
||||
|
||||
In this chapter we show the basic parameters which are usually needed in the Merlin cluster.
|
||||
|
||||
### Common settings
|
||||
|
||||
The following settings are the minimum required for running a job in the Merlin CPU and GPU nodes. Please, consider taking a look to the **man pages** (i.e. `man sbatch`, `man salloc`, `man srun`) for more
|
||||
information about all possible options. Also, do not hesitate to contact us on any questions.
|
||||
|
||||
* **Clusters:** For running jobs in the Merlin6 CPU and GPU nodes, users should to add the following option:
|
||||
```bash
|
||||
#SBATCH --clusters=merlin6
|
||||
```
|
||||
|
||||
Users with proper access, can also use the `merlin5` cluster.
|
||||
* **Partitions:** except when using the *default* partition, one needs to specify the partition:
|
||||
* GPU partitions: ``gpu``, ``gpu-short`` (more details **[Slurm GPU Partitions](/merlin6/slurm-configuration.html#gpu-partitions)**)
|
||||
* CPU partitions: ``general`` (**default** if no partition is specified), ``daily`` and ``hourly`` (more details: **[Slurm CPU Partitions](/merlin6/slurm-configuration.html#cpu-partitions)**)
|
||||
|
||||
Partition can be set as follows:
|
||||
```bash
|
||||
#SBATCH --partition=<partition_name> # Partition to use. 'general' is the 'default'
|
||||
```
|
||||
* **[Optional] Disabling shared nodes**: by default, nodes can share jobs from multiple users, but by ensuring that CPU/Memory/GPU resources are dedicated.
|
||||
One can request exclusive usage of a node (or set of nodes) with the following option:
|
||||
```bash
|
||||
#SBATCH --exclusive # Only if you want a dedicated node
|
||||
```
|
||||
* **Time**: is important to define how long a job should run, according to the reality. This will help Slurm when *scheduling* and *backfilling*, by managing job queues in a more efficient
|
||||
way. This value can never exceed the `MaxTime` of the affected partition. Please review the partition information (`scontrol show partition <partition_name>` or [GPU Partition Configuration](/merlin6/slurm-configuration.html#gpu-partitions)) for
|
||||
`DefaultTime` and `MaxTime` values.
|
||||
```bash
|
||||
#SBATCH --time=<D-HH:MM:SS> # Time job needs to run. Can not exceed the partition `MaxTime`
|
||||
```
|
||||
* **Output and error files**: by default, Slurm script will generate standard output and errors files in the directory from where you submit the batch script:
|
||||
* standard output will be written into a file ``slurm-$SLURM_JOB_ID.out``.
|
||||
* standard error will be written into a file ``slurm-$SLURM_JOB_ID.err``.
|
||||
|
||||
If you want to the default names it can be done with the options ``--output`` and ``--error``. In example:
|
||||
```bash
|
||||
#SBATCH --output=logs/myJob.%N.%j.out # Generate an output file per hostname and jobid
|
||||
#SBATCH --error=logs/myJob.%N.%j.err # Generate an errori file per hostname and jobid
|
||||
```
|
||||
Use **man sbatch** (``man sbatch | grep -A36 '^filename pattern'``) for getting a list specification of **filename patterns**.
|
||||
|
||||
* **Multithreading/No-Multithreading:** Whether a node has or not multithreading depends on the node configuration. By default, HT nodes have HT enabled, but one can ensure this feature with the option `--hint` as follows:
|
||||
```bash
|
||||
#SBATCH --hint=multithread # Use extra threads with in-core multi-threading.
|
||||
#SBATCH --hint=nomultithread # Don't use extra threads with in-core multi-threading.
|
||||
```
|
||||
Consider that, sometimes, depending on your job requirements, you might need also to setup how many `--ntasks-per-core` or `--cpus-per-task` (even other options) in addition to the `--hint` command. Please, contact us in case of doubts.
|
||||
{{site.data.alerts.tip}} In general, <span style="color:orange;"><b>--hint=[no]multithread</b></span> is a mandatory field. On the other hand, <span style="color:orange;"><b>--ntasks-per-core</b></span> is only needed when
|
||||
one needs to define how a task should be handled within a core, and this setting will not be generally used on Hybrid MPI/OpenMP jobs where multiple cores are needed for single tasks.
|
||||
{{site.data.alerts.end}}
|
||||
|
||||
|
||||
### GPU specific settings
|
||||
|
||||
The following settings are required for running on the GPU nodes:
|
||||
|
||||
* **Slurm account**: When using GPUs, users must use the `merlin-gpu` Slurm account. This is done with the ``--account`` setting as follows:
|
||||
```bash
|
||||
#SBATCH --account=merlin-gpu # The account 'merlin-gpu' must be used for GPUs
|
||||
```
|
||||
* **`[Valid until 08.01.2021]` GRES:** Slurm must be aware that the job will use GPUs. This is done with the `--gres` setting, at least, as follows:
|
||||
```bash
|
||||
#SBATCH --gres=gpu # Always set at least this option when using GPUs
|
||||
```
|
||||
This option is still valid as this might be needed by other resources, but for GPUs new options (i.e. `--gpus`, `--mem-per-gpu`) can be used, which provide more flexibility when running on GPUs.
|
||||
* **`[Valid from 08.01.2021]` GPU options (instead of GRES):** Slurm must be aware that the job will use GPUs. New options are available for specifying
|
||||
the GPUs as a consumable resource. These are the following:
|
||||
* `--gpus=[<type>:]<number>` *instead of* (but also in addition with) `--gres=gpu`: specifies the total number of GPUs required for the job.
|
||||
* `--gpus-per-task=[<type>:]<number>`, `--gpus-per-socket=[<type>:]<number>`, `--gpus-per-node=[<type>:]<number>` to specify the number of GPUs per tasks and/or socket and/or node.
|
||||
* `--gpus-per-node=[<type>:]<number>`, `--gpus-per-socket`, `--gpus-per-task`, to specify how many GPUs per node, socket and or tasks need to be allocated.
|
||||
* `--cpus-per-gpu`, to specify the number of CPUs to be used for each GPU.
|
||||
* `--mem-per-gpu`, to specify the amount of memory to be used for each GPU.
|
||||
* Other advanced options (i.e. `--gpu-bind`). Please see **man** pages for **sbatch**/**srun**/**salloc** (i.e. *`man sbatch`*) for further information.
|
||||
Please read below **[GPU advanced settings](/merlin6/running-jobs.html#gpu-advanced-settings)** for further information.
|
||||
* Please, consider that one can specify the GPU `type` in some options. If one needs to specify it, then it must be specified in all options defined in the Slurm job.
|
||||
|
||||
#### GPU advanced settings
|
||||
|
||||
GPUs are also a shared resource. Hence, multiple users can run jobs on a single node, but only one GPU per user process
|
||||
must be used.
|
||||
|
||||
**Until 08.01.2021**, users can define which GPUs resources and *how many per node* they need with the ``--gres`` option.
|
||||
Valid ``gres`` options are: ``gpu[[:type]:count]`` where ``type=GTX1080|GTX1080Ti|RTX2080Ti`` and ``count=<number of gpus requested per node>``. In example:
|
||||
```bash
|
||||
#SBATCH --gres=gpu:GTX1080:4 # Use a node with 4 x GTX1080 GPUs
|
||||
```
|
||||
|
||||
**From 08.01.2021**, `--gres` is not needed anymore (but can still be used), and `--gpus` and related other options should replace it. `--gpus` works in a similar way, but without
|
||||
the need of specifying the `gpu` resource. In oher words, `--gpus` options are: ``[[:type]:count]`` where ``type=GTX1080|GTX1080Ti|RTX2080Ti`` (which is optional) and ``count=<number of gpus to use>``. In example:
|
||||
```bash
|
||||
#SBATCH --gpus=GTX1080:4 # Use 4 GPUs with Type=GTX1080
|
||||
```
|
||||
This setting can use in addition other settings, such like `--gpus-per-node`, in order to accomplish a similar behaviour as with `--gres`.
|
||||
* Please, consider that one can specify the GPU `type` in some of the options. If one needs to specify it, then it must be specified in all options defined in the Slurm job.
|
||||
|
||||
{{site.data.alerts.tip}}Always check <span style="color:orange;"><b>'/etc/slurm/gres.conf'</b></span> for checking available <span style="color:orange;"><i>Types</i></span> and for details of the NUMA node.
|
||||
{{site.data.alerts.end}}
|
||||
|
||||
## Batch script templates
|
||||
|
||||
### CPU-based jobs templates
|
||||
|
||||
The following examples apply to the **Merlin6** cluster.
|
||||
|
||||
#### Nomultithreaded jobs template
|
||||
|
||||
The following template should be used by any user submitting jobs to CPU nodes:
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
#SBATCH --partition=<general|daily|hourly> # Specify 'general' or 'daily' or 'hourly'
|
||||
#SBATCH --time=<D-HH:MM:SS> # Strongly recommended
|
||||
#SBATCH --output=<output_file> # Generate custom output file
|
||||
#SBATCH --error=<error_file> # Generate custom error file
|
||||
#SBATCH --hint=nomultithread # Mandatory for non-multithreaded jobs
|
||||
##SBATCH --exclusive # Uncomment if you need exclusive node usage
|
||||
##SBATCH --ntasks-per-core=1 # Only mandatory for non-multithreaded single tasks
|
||||
|
||||
## Advanced options example
|
||||
##SBATCH --nodes=1 # Uncomment and specify #nodes to use
|
||||
##SBATCH --ntasks=44 # Uncomment and specify #nodes to use
|
||||
##SBATCH --ntasks-per-node=44 # Uncomment and specify #tasks per node
|
||||
##SBATCH --cpus-per-task=44 # Uncomment and specify the number of cores per task
|
||||
```
|
||||
|
||||
#### Multithreaded jobs template
|
||||
|
||||
The following template should be used by any user submitting jobs to CPU nodes:
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
#SBATCH --partition=<general|daily|hourly> # Specify 'general' or 'daily' or 'hourly'
|
||||
#SBATCH --time=<D-HH:MM:SS> # Strongly recommended
|
||||
#SBATCH --output=<output_file> # Generate custom output file
|
||||
#SBATCH --error=<error_file> # Generate custom error file
|
||||
#SBATCH --hint=multithread # Mandatory for multithreaded jobs
|
||||
##SBATCH --exclusive # Uncomment if you need exclusive node usage
|
||||
##SBATCH --ntasks-per-core=2 # Only mandatory for multithreaded single tasks
|
||||
|
||||
## Advanced options example
|
||||
##SBATCH --nodes=1 # Uncomment and specify #nodes to use
|
||||
##SBATCH --ntasks=88 # Uncomment and specify #nodes to use
|
||||
##SBATCH --ntasks-per-node=88 # Uncomment and specify #tasks per node
|
||||
##SBATCH --cpus-per-task=88 # Uncomment and specify the number of cores per task
|
||||
```
|
||||
|
||||
### GPU-based jobs templates
|
||||
|
||||
The following template should be used by any user submitting jobs to GPU nodes:
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
#SBATCH --partition=<gpu|gpu-short> # Specify GPU partition
|
||||
#SBATCH --gpus="<type>:<num_gpus>" # <type> is optional, <num_gpus> is mandatory
|
||||
#SBATCH --time=<D-HH:MM:SS> # Strongly recommended
|
||||
#SBATCH --output=<output_file> # Generate custom output file
|
||||
#SBATCH --error=<error_file # Generate custom error file
|
||||
#SBATCH --account=merlin-gpu # The account 'merlin-gpu' must be used
|
||||
##SBATCH --exclusive # Uncomment if you need exclusive node usage
|
||||
|
||||
## Advanced options example
|
||||
##SBATCH --nodes=1 # Uncomment and specify number of nodes to use
|
||||
##SBATCH --ntasks=1 # Uncomment and specify number of nodes to use
|
||||
##SBATCH --cpus-per-gpu=5 # Uncomment and specify the number of cores per task
|
||||
##SBATCH --mem-per-gpu=16000 # Uncomment and specify the number of cores per task
|
||||
##SBATCH --gpus-per-node=<type>:2 # Uncomment and specify the number of GPUs per node
|
||||
##SBATCH --gpus-per-socket=<type>:2 # Uncomment and specify the number of GPUs per socket
|
||||
##SBATCH --gpus-per-task=<type>:1 # Uncomment and specify the number of GPUs per task
|
||||
```
|
||||
|
||||
## Advanced configurations
|
||||
|
||||
### Array Jobs: launching a large number of related jobs
|
||||
|
||||
If you need to run a large number of jobs based on the same executable with systematically varying inputs,
|
||||
e.g. for a parameter sweep, you can do this most easily in form of a **simple array job**.
|
||||
|
||||
``` bash
|
||||
#!/bin/bash
|
||||
#SBATCH --job-name=test-array
|
||||
#SBATCH --partition=daily
|
||||
#SBATCH --ntasks=1
|
||||
#SBATCH --time=08:00:00
|
||||
#SBATCH --array=1-8
|
||||
|
||||
echo $(date) "I am job number ${SLURM_ARRAY_TASK_ID}"
|
||||
srun myprogram config-file-${SLURM_ARRAY_TASK_ID}.dat
|
||||
|
||||
```
|
||||
|
||||
This will run 8 independent jobs, where each job can use the counter
|
||||
variable `SLURM_ARRAY_TASK_ID` defined by Slurm inside of the job's
|
||||
environment to feed the correct input arguments or configuration file
|
||||
to the "myprogram" executable. Each job will receive the same set of
|
||||
configurations (e.g. time limit of 8h in the example above).
|
||||
|
||||
The jobs are independent, but they will run in parallel (if the cluster resources allow for
|
||||
it). The jobs will get JobIDs like {some-number}_0 to {some-number}_7, and they also will each
|
||||
have their own output file.
|
||||
|
||||
**Note:**
|
||||
* Do not use such jobs if you have very short tasks, since each array sub job will incur the full overhead for launching an independent Slurm job. For such cases you should used a **packed job** (see below).
|
||||
* If you want to control how many of these jobs can run in parallel, you can use the `#SBATCH --array=1-100%5` syntax. The `%5` will define
|
||||
that only 5 sub jobs may ever run in parallel.
|
||||
|
||||
You also can use an array job approach to run over all files in a directory, substituting the payload with
|
||||
|
||||
``` bash
|
||||
FILES=(/path/to/data/*)
|
||||
srun ./myprogram ${FILES[$SLURM_ARRAY_TASK_ID]}
|
||||
```
|
||||
|
||||
Or for a trivial case you could supply the values for a parameter scan in form
|
||||
of a argument list that gets fed to the program using the counter variable.
|
||||
|
||||
``` bash
|
||||
ARGS=(0.05 0.25 0.5 1 2 5 100)
|
||||
srun ./my_program.exe ${ARGS[$SLURM_ARRAY_TASK_ID]}
|
||||
```
|
||||
|
||||
### Array jobs: running very long tasks with checkpoint files
|
||||
|
||||
If you need to run a job for much longer than the queues (partitions) permit, and
|
||||
your executable is able to create checkpoint files, you can use this
|
||||
strategy:
|
||||
|
||||
``` bash
|
||||
#!/bin/bash
|
||||
#SBATCH --job-name=test-checkpoint
|
||||
#SBATCH --partition=general
|
||||
#SBATCH --ntasks=1
|
||||
#SBATCH --time=7-00:00:00 # each job can run for 7 days
|
||||
#SBATCH --cpus-per-task=1
|
||||
#SBATCH --array=1-10%1 # Run a 10-job array, one job at a time.
|
||||
if test -e checkpointfile; then
|
||||
# There is a checkpoint file;
|
||||
myprogram --read-checkp checkpointfile
|
||||
else
|
||||
# There is no checkpoint file, start a new simulation.
|
||||
myprogram
|
||||
fi
|
||||
```
|
||||
|
||||
The `%1` in the `#SBATCH --array=1-10%1` statement defines that only 1 subjob can ever run in parallel, so
|
||||
this will result in subjob n+1 only being started when job n has finished. It will read the checkpoint file
|
||||
if it is present.
|
||||
|
||||
|
||||
### Packed jobs: running a large number of short tasks
|
||||
|
||||
Since the launching of a Slurm job incurs some overhead, you should not submit each short task as a separate
|
||||
Slurm job. Use job packing, i.e. you run the short tasks within the loop of a single Slurm job.
|
||||
|
||||
You can launch the short tasks using `srun` with the `--exclusive` switch (not to be confused with the
|
||||
switch of the same name used in the SBATCH commands). This switch will ensure that only a specified
|
||||
number of tasks can run in parallel.
|
||||
|
||||
As an example, the following job submission script will ask Slurm for
|
||||
44 cores (threads), then it will run the =myprog= program 1000 times with
|
||||
arguments passed from 1 to 1000. But with the =-N1 -n1 -c1
|
||||
--exclusive= option, it will control that at any point in time only 44
|
||||
instances are effectively running, each being allocated one CPU. You
|
||||
can at this point decide to allocate several CPUs or tasks by adapting
|
||||
the corresponding parameters.
|
||||
|
||||
``` bash
|
||||
#! /bin/bash
|
||||
#SBATCH --job-name=test-checkpoint
|
||||
#SBATCH --partition=general
|
||||
#SBATCH --ntasks=1
|
||||
#SBATCH --time=7-00:00:00
|
||||
#SBATCH --ntasks=44 # defines the number of parallel tasks
|
||||
for i in {1..1000}
|
||||
do
|
||||
srun -N1 -n1 -c1 --exclusive ./myprog $i &
|
||||
done
|
||||
wait
|
||||
```
|
||||
|
||||
**Note:** The `&` at the end of the `srun` line is needed to not have the script waiting (blocking).
|
||||
The `wait` command waits for all such background tasks to finish and returns the exit code.
|
||||
|
@ -61,10 +61,8 @@ until the requested resources are allocated.
|
||||
|
||||
When running **``salloc``**, once the resources are allocated, *by default* the user will get
|
||||
a ***new shell on one of the allocated resources*** (if a user has requested few nodes, it will
|
||||
prompt a new shell on the first allocated node). This is thanks to the default ``srun`` command
|
||||
``srun -n1 -N1 --mem-per-cpu=0 --gres=gpu:0 --pty --preserve-env --mpi=none $SHELL`` which will run
|
||||
in the background (users do not need to specify it). However, this behaviour can
|
||||
be changed by running a different command after the **``salloc``** command. In example:
|
||||
prompt a new shell on the first allocated node). However, this behaviour can be changed by adding
|
||||
a shell (`$SHELL`) at the end of the `salloc` command. For example:
|
||||
|
||||
```bash
|
||||
# Typical 'salloc' call
|
||||
@ -142,9 +140,9 @@ how to connect to the **NoMachine** service in the Merlin cluster.
|
||||
|
||||
For other non officially supported graphical access (X11 forwarding):
|
||||
|
||||
* For Linux clients, please follow [{Accessing Merlin -> Accessing from Linux Clients}](/merlin6/connect-from-linux.html)
|
||||
* For Windows clients, please follow [{Accessing Merlin -> Accessing from Windows Clients}](/merlin6/connect-from-windows.html)
|
||||
* For MacOS clients, please follow [{Accessing Merlin -> Accessing from MacOS Clients}](/merlin6/connect-from-macos.html)
|
||||
* For Linux clients, please follow [{How To Use Merlin -> Accessing from Linux Clients}](/merlin6/connect-from-linux.html)
|
||||
* For Windows clients, please follow [{How To Use Merlin -> Accessing from Windows Clients}](/merlin6/connect-from-windows.html)
|
||||
* For MacOS clients, please follow [{How To Use Merlin -> Accessing from MacOS Clients}](/merlin6/connect-from-macos.html)
|
||||
|
||||
### 'srun' with x11 support
|
||||
|
284
pages/merlin6/03-Slurm-General-Documentation/running-jobs.md
Normal file
@ -0,0 +1,284 @@
|
||||
---
|
||||
title: Running Slurm Scripts
|
||||
#tags:
|
||||
keywords: batch script, slurm, sbatch, srun
|
||||
last_updated: 23 January 2020
|
||||
summary: "This document describes how to run batch scripts in Slurm."
|
||||
sidebar: merlin6_sidebar
|
||||
permalink: /merlin6/running-jobs.html
|
||||
---
|
||||
|
||||
|
||||
## The rules
|
||||
|
||||
Before starting using the cluster, please read the following rules:
|
||||
|
||||
1. To ease and improve *scheduling* and *backfilling*, always try to **estimate** and **define a proper run time** for your jobs:
|
||||
* Use ``--time=<D-HH:MM:SS>`` for that.
|
||||
* For very long runs, please consider using ***[Job Arrays with Checkpointing](/merlin6/running-jobs.html#array-jobs-running-very-long-tasks-with-checkpoint-files)***
|
||||
2. Try to optimize your jobs for running at most within **one day**. Please, consider the following:
|
||||
* Some software can simply scale up by using more nodes while drastically reducing the run time.
|
||||
* Some software allows saving a specific state, and a second job can start from that state: ***[Job Arrays with Checkpointing](/merlin6/running-jobs.html#array-jobs-running-very-long-tasks-with-checkpoint-files)*** can help you with that.
|
||||
* Jobs submitted to **`hourly`** get more priority than jobs submitted to **`daily`**: always use **`hourly`** for jobs shorter than 1 hour.
|
||||
* Jobs submitted to **`daily`** get more priority than jobs submitted to **`general`**: always use **`daily`** for jobs shorter than 1 day.
|
||||
3. It is **forbidden** to run **very short jobs**, as they cause a lot of overhead and can also cause severe problems to the main scheduler.
* ***Question:*** Is my job a very short job? ***Answer:*** If it lasts a few seconds or very few minutes, yes.
* ***Question:*** How long should my job run? ***Answer:*** As a *Rule of Thumb*, from 5' it starts being ok, and from 15' it is preferred.
|
||||
* Use ***[Packed Jobs](/merlin6/running-jobs.html#packed-jobs-running-a-large-number-of-short-tasks)*** for running a large number of short tasks.
|
||||
4. Do not submit hundreds of similar jobs!
|
||||
* Use ***[Array Jobs](/merlin6/running-jobs.html#array-jobs-launching-a-large-number-of-related-jobs)*** for gathering jobs instead.
|
||||
|
||||
{{site.data.alerts.tip}}Having a good estimation of the <i>time</i> needed by your jobs, a proper way for running them, and optimizing the jobs to <i>run within one day</i> will contribute to make the system fairly and efficiently used.
|
||||
{{site.data.alerts.end}}
|
||||
|
||||
## Basic commands for running batch scripts
|
||||
|
||||
* Use **``sbatch``** for submitting a batch script to Slurm.
|
||||
* Use **``srun``** for running parallel tasks.
|
||||
* Use **``squeue``** for checking jobs status.
|
||||
* Use **``scancel``** for cancelling/deleting a job from the queue.
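
For example, a typical sequence combining these commands (where `myjob.batch` is a placeholder script name):

```bash
sbatch myjob.batch        # Submit the batch script; Slurm prints the job ID
squeue -u $USER           # Check the status of your own jobs
scancel <job_id>          # Cancel a job, using the job ID printed by sbatch
```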
|
||||
|
||||
{{site.data.alerts.tip}}Use Linux <b>'man'</b> pages when needed (i.e. <span style="color:orange;">'man sbatch'</span>), mostly for checking the available options for the above commands.
|
||||
{{site.data.alerts.end}}
|
||||
|
||||
## Basic settings
|
||||
|
||||
For a complete list of available options and parameters, it is recommended to use the **man pages** (i.e. ``man sbatch``, ``man srun``, ``man salloc``).
Please notice that the behaviour of some parameters might change depending on the command used when running jobs (for example, ``--exclusive`` behaves differently in ``sbatch`` and ``srun``).
|
||||
|
||||
In this chapter we show the basic parameters which are usually needed in the Merlin cluster.
|
||||
|
||||
### Common settings
|
||||
|
||||
The following settings are the minimum required for running a job in the Merlin CPU and GPU nodes. Please consider taking a look at the **man pages** (i.e. `man sbatch`, `man salloc`, `man srun`) for more information about all possible options. Also, do not hesitate to contact us with any questions.
|
||||
|
||||
* **Clusters:** For running jobs in the different Slurm clusters, users should add the following option:
|
||||
```bash
|
||||
#SBATCH --clusters=<cluster_name> # Possible values: merlin5, merlin6, gmerlin6
|
||||
```
|
||||
Refer to the documentation of each cluster ([**`merlin6`**](/merlin6/slurm-configuration.html), [**`gmerlin6`**](/gmerlin6/slurm-configuration.html), [**`merlin5`**](/merlin5/slurm-configuration.html)) for further information.
|
||||
|
||||
* **Partitions:** except when using the *default* partition for each cluster, one needs to specify the partition:
|
||||
```bash
|
||||
#SBATCH --partition=<partition_name> # Check each cluster documentation for possible values
|
||||
```
|
||||
|
||||
Refer to the documentation of each cluster ([**`merlin6`**](/merlin6/slurm-configuration.html), [**`gmerlin6`**](/gmerlin6/slurm-configuration.html), [**`merlin5`**](/merlin5/slurm-configuration.html)) for further information.
|
||||
|
||||
* **[Optional] Disabling shared nodes**: by default, nodes are not exclusive. Hence, multiple users can run jobs on the same node. One can request exclusive node usage with the following option:
|
||||
```bash
|
||||
#SBATCH --exclusive # Only if you want a dedicated node
|
||||
```
|
||||
|
||||
* **Time**: it is important to define how long a job should run, according to its real needs. This will help Slurm when *scheduling* and *backfilling*, and will let Slurm manage job queues in a more efficient way. This value can never exceed the `MaxTime` of the affected partition.
|
||||
```bash
|
||||
#SBATCH --time=<D-HH:MM:SS> # Can not exceed the partition `MaxTime`
|
||||
```
|
||||
Refer to the documentation of each cluster ([**`merlin6`**](/merlin6/slurm-configuration.html), [**`gmerlin6`**](/gmerlin6/slurm-configuration.html), [**`merlin5`**](/merlin5/slurm-configuration.html)) for further information about partition `MaxTime` values.
|
||||
|
||||
* **Output and error files**: by default, Slurm will generate a standard output file (``slurm-%j.out``, where `%j` is the job ID) and a standard error file (``slurm-%j.err``, where `%j` is the job ID) in the directory from where the job was submitted. Users can change the default names with the following options:
|
||||
```bash
|
||||
#SBATCH --output=<filename> # Can include path. Patterns accepted (i.e. %j)
|
||||
#SBATCH --error=<filename> # Can include path. Patterns accepted (i.e. %j)
|
||||
```
|
||||
Use **man sbatch** (``man sbatch | grep -A36 '^filename pattern'``) for getting the full specification of **filename patterns**.
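
For instance, patterns such as `%N` (short hostname) and `%j` (job ID) can be combined:

```bash
#SBATCH --output=logs/myJob.%N.%j.out   # Generate an output file per hostname and job ID
#SBATCH --error=logs/myJob.%N.%j.err    # Generate an error file per hostname and job ID
```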
|
||||
|
||||
* **Enable/Disable Hyper-Threading**: Whether or not a node has Hyper-Threading depends on the node configuration. By default, HT nodes have HT enabled, but one should explicitly specify the desired behaviour with the `--hint` option as follows:
|
||||
```bash
|
||||
#SBATCH --hint=multithread # Use extra threads with in-core multi-threading.
|
||||
#SBATCH --hint=nomultithread # Don't use extra threads with in-core multi-threading.
|
||||
```
|
||||
Refer to the documentation of each cluster ([**`merlin6`**](/merlin6/slurm-configuration.html), [**`gmerlin6`**](/gmerlin6/slurm-configuration.html), [**`merlin5`**](/merlin5/slurm-configuration.html)) for further information about node configuration and Hyper-Threading.
|
||||
Consider that, depending on your job requirements, you might also need to set `--ntasks-per-core` or `--cpus-per-task` (or even other options) in addition to `--hint`. Please contact us in case of doubt.
|
||||
|
||||
{{site.data.alerts.tip}} In general, for the cluster `merlin6` <span style="color:orange;"><b>--hint=[no]multithread</b></span> is a recommended field. On the other hand, <span style="color:orange;"><b>--ntasks-per-core</b></span> is only needed when
|
||||
one needs to define how a task should be handled within a core, and this setting will not be generally used on Hybrid MPI/OpenMP jobs where multiple cores are needed for single tasks.
|
||||
{{site.data.alerts.end}}
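
As a sketch of such a Hybrid MPI/OpenMP case (the task and thread counts are illustrative, matching the 88-thread multithreaded examples below, and `my_hybrid_program` is a hypothetical executable):

```bash
#SBATCH --hint=multithread      # Use Hyper-Threading
#SBATCH --ntasks=8              # Illustrative: 8 MPI tasks
#SBATCH --cpus-per-task=11      # Illustrative: 11 threads per task (8 x 11 = 88 hardware threads)

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun my_hybrid_program          # 'my_hybrid_program' is a hypothetical executable
```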
|
||||
|
||||
## Batch script templates
|
||||
|
||||
### CPU-based jobs templates
|
||||
|
||||
The following examples apply to the **Merlin6** cluster.
|
||||
|
||||
#### Nomultithreaded jobs template
|
||||
|
||||
The following template should be used by any user submitting jobs to the Merlin6 CPU nodes:
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
#SBATCH --cluster=merlin6 # Cluster name
|
||||
#SBATCH --partition=general,daily,hourly # Specify one or multiple partitions
|
||||
#SBATCH --time=<D-HH:MM:SS> # Strongly recommended
|
||||
#SBATCH --output=<output_file> # Generate custom output file
|
||||
#SBATCH --error=<error_file> # Generate custom error file
|
||||
#SBATCH --hint=nomultithread # Mandatory for non-multithreaded jobs
##SBATCH --exclusive # Uncomment if you need exclusive node usage
##SBATCH --ntasks-per-core=1 # Only mandatory for non-multithreaded single tasks
|
||||
|
||||
## Advanced options example
|
||||
##SBATCH --nodes=1 # Uncomment and specify #nodes to use
|
||||
##SBATCH --ntasks=44 # Uncomment and specify #tasks to use
|
||||
##SBATCH --ntasks-per-node=44 # Uncomment and specify #tasks per node
|
||||
##SBATCH --cpus-per-task=44 # Uncomment and specify the number of cores per task
|
||||
```
|
||||
|
||||
#### Multithreaded jobs template
|
||||
|
||||
The following template should be used by any user submitting jobs to the Merlin6 CPU nodes:
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
#SBATCH --cluster=merlin6 # Cluster name
|
||||
#SBATCH --partition=general,daily,hourly # Specify one or multiple partitions
|
||||
#SBATCH --time=<D-HH:MM:SS> # Strongly recommended
|
||||
#SBATCH --output=<output_file> # Generate custom output file
|
||||
#SBATCH --error=<error_file> # Generate custom error file
|
||||
#SBATCH --hint=multithread # Mandatory for multithreaded jobs
|
||||
##SBATCH --exclusive # Uncomment if you need exclusive node usage
|
||||
##SBATCH --ntasks-per-core=2 # Only mandatory for multithreaded single tasks
|
||||
|
||||
## Advanced options example
|
||||
##SBATCH --nodes=1 # Uncomment and specify #nodes to use
|
||||
##SBATCH --ntasks=88 # Uncomment and specify #tasks to use
|
||||
##SBATCH --ntasks-per-node=88 # Uncomment and specify #tasks per node
|
||||
##SBATCH --cpus-per-task=88 # Uncomment and specify the number of cores per task
|
||||
```
|
||||
|
||||
### GPU-based jobs templates
|
||||
|
||||
The following template should be used by any user submitting jobs to GPU nodes:
|
||||
|
||||
```bash
|
||||
#!/bin/bash
|
||||
#SBATCH --cluster=gmerlin6 # Cluster name
|
||||
#SBATCH --partition=gpu,gpu-short,gwendolen # Specify one or multiple partitions
|
||||
#SBATCH --gpus="<type>:<num_gpus>" # <type> is optional, <num_gpus> is mandatory
|
||||
#SBATCH --time=<D-HH:MM:SS> # Strongly recommended
|
||||
#SBATCH --output=<output_file> # Generate custom output file
|
||||
#SBATCH --error=<error_file> # Generate custom error file
|
||||
##SBATCH --exclusive # Uncomment if you need exclusive node usage
|
||||
##SBATCH --account=gwendolen_public # Uncomment if you need to use gwendolen
|
||||
|
||||
## Advanced options example
|
||||
##SBATCH --nodes=1 # Uncomment and specify number of nodes to use
|
||||
##SBATCH --ntasks=1 # Uncomment and specify number of tasks to use
##SBATCH --cpus-per-gpu=5 # Uncomment and specify the number of CPUs per GPU
##SBATCH --mem-per-gpu=16000 # Uncomment and specify the amount of memory per GPU
|
||||
##SBATCH --gpus-per-node=<type>:2 # Uncomment and specify the number of GPUs per node
|
||||
##SBATCH --gpus-per-socket=<type>:2 # Uncomment and specify the number of GPUs per socket
|
||||
##SBATCH --gpus-per-task=<type>:1 # Uncomment and specify the number of GPUs per task
|
||||
```
|
||||
|
||||
## Advanced configurations

### Array Jobs: launching a large number of related jobs

If you need to run a large number of jobs based on the same executable with systematically varying inputs,
e.g. for a parameter sweep, you can do this most easily in the form of a **simple array job**.

``` bash
#!/bin/bash
#SBATCH --job-name=test-array
#SBATCH --partition=daily
#SBATCH --ntasks=1
#SBATCH --time=08:00:00
#SBATCH --array=1-8

echo $(date) "I am job number ${SLURM_ARRAY_TASK_ID}"
srun myprogram config-file-${SLURM_ARRAY_TASK_ID}.dat
```

This will run 8 independent jobs, where each job can use the counter
variable `SLURM_ARRAY_TASK_ID` defined by Slurm inside of the job's
environment to feed the correct input arguments or configuration file
to the "myprogram" executable. Each job will receive the same set of
configurations (e.g. time limit of 8h in the example above).

The jobs are independent, but they will run in parallel (if the cluster resources allow for
it). The jobs will get JobIDs like {some-number}_1 to {some-number}_8, and they will each
have their own output file.

**Note:**
* Do not use such jobs if you have very short tasks, since each array sub job will incur the full overhead for launching an independent Slurm job. For such cases you should use a **packed job** (see below).
* If you want to control how many of these jobs can run in parallel, you can use the `#SBATCH --array=1-100%5` syntax. The `%5` will define
  that only 5 sub jobs may ever run in parallel, as illustrated in the sketch below.

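A minimal sketch of such a throttled array directive (the range and the limit are illustrative values):

```bash
#SBATCH --array=1-100%5   # 100 sub jobs in total, but at most 5 of them running at any time
```
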
You can also use an array job approach to run over all files in a directory, substituting the payload with

``` bash
FILES=(/path/to/data/*)
srun ./myprogram ${FILES[$SLURM_ARRAY_TASK_ID]}
```

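As an illustration, a minimal complete sketch of this approach, assuming a directory with 8 input files (the job name, data path and program name are placeholders):

``` bash
#!/bin/bash
#SBATCH --job-name=files-array   # placeholder job name
#SBATCH --partition=daily
#SBATCH --ntasks=1
#SBATCH --time=01:00:00
#SBATCH --array=0-7              # bash arrays are zero-indexed: 8 files -> indices 0..7

# The glob expands to the same (alphabetically sorted) list in every sub job,
# as long as the directory content does not change while the array is running.
FILES=(/path/to/data/*)
srun ./myprogram ${FILES[$SLURM_ARRAY_TASK_ID]}
```
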
Or, for a trivial case, you could supply the values for a parameter scan in the form
of an argument list that gets fed to the program using the counter variable.

``` bash
ARGS=(0.05 0.25 0.5 1 2 5 100)
srun ./my_program.exe ${ARGS[$SLURM_ARRAY_TASK_ID]}
```

### Array jobs: running very long tasks with checkpoint files

If you need to run a job for much longer than the queues (partitions) permit, and
your executable is able to create checkpoint files, you can use this
strategy:

``` bash
#!/bin/bash
#SBATCH --job-name=test-checkpoint
#SBATCH --partition=general
#SBATCH --ntasks=1
#SBATCH --time=7-00:00:00    # each job can run for 7 days
#SBATCH --cpus-per-task=1
#SBATCH --array=1-10%1       # Run a 10-job array, one job at a time.

if test -e checkpointfile; then
  # There is a checkpoint file; resume from it.
  myprogram --read-checkp checkpointfile
else
  # There is no checkpoint file, start a new simulation.
  myprogram
fi
```

The `%1` in the `#SBATCH --array=1-10%1` statement defines that only 1 subjob can ever run in parallel, so
this will result in subjob n+1 only being started when job n has finished. It will read the checkpoint file
if it is present.

### Packed jobs: running a large number of short tasks

Since the launching of a Slurm job incurs some overhead, you should not submit each short task as a separate
Slurm job. Use job packing, i.e. you run the short tasks within the loop of a single Slurm job.

You can launch the short tasks using `srun` with the `--exclusive` switch (not to be confused with the
switch of the same name used in the SBATCH commands). This switch will ensure that only a specified
number of tasks can run in parallel.

As an example, the following job submission script will ask Slurm for
44 cores (threads), then it will run the `myprog` program 1000 times with
arguments passed from 1 to 1000. But with the `-N1 -n1 -c1 --exclusive`
option, it ensures that at any point in time only 44
instances are effectively running, each being allocated one CPU. You
can at this point decide to allocate several CPUs or tasks by adapting
the corresponding parameters.

``` bash
#!/bin/bash
#SBATCH --job-name=test-packed
#SBATCH --partition=general
#SBATCH --time=7-00:00:00
#SBATCH --ntasks=44          # defines the number of parallel tasks

for i in {1..1000}
do
    srun -N1 -n1 -c1 --exclusive ./myprog $i &
done
wait
```

**Note:** The `&` at the end of the `srun` line is needed so that the script does not block waiting for each task to finish.
The `wait` command waits for all such background tasks to finish and returns the exit code.

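If each short task needs more than one CPU, the same pattern can be adapted by changing the allocation and the `srun` step size accordingly. A minimal sketch, assuming 4 CPUs per task (the numbers are illustrative):

``` bash
#SBATCH --ntasks=11           # 11 tasks can run concurrently ...
#SBATCH --cpus-per-task=4     # ... each one using 4 CPUs (44 CPUs in total)

for i in {1..1000}
do
    srun -N1 -n1 -c4 --exclusive ./myprog $i &
done
wait
```
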
@ -8,20 +8,7 @@ sidebar: merlin6_sidebar
permalink: /merlin6/slurm-configuration.html
---

## About Merlin5 & Merlin6

The new Slurm cluster is called **merlin6**. However, the old Slurm *merlin* cluster will be kept for some time, and it has been renamed to **merlin5**.
This allows users to keep running jobs on the old computing nodes until they have fully migrated their code to the new cluster.

From July 2019, **merlin6** becomes the **default cluster** and any job submitted to Slurm will be submitted to that cluster. Users can keep submitting to
the old *merlin5* computing nodes by using the option ``--cluster=merlin5``.

This documentation only explains the usage of the **merlin6** Slurm cluster.

## Merlin6 CPU

Basic configuration for the **merlin6 CPUs** cluster will be detailed here.
For advanced usage, please refer to [Understanding the Slurm configuration (for advanced users)](/merlin6/slurm-configuration.html#understanding-the-slurm-configuration-for-advanced-users)
This documentation shows basic Slurm configuration and options needed to run jobs in the Merlin6 CPU cluster.

## Merlin6 CPU nodes definition

@ -32,29 +19,63 @@ The following table show default and maximum resources that can be used per node
| merlin-c-[001-024] | 1 core | 44 cores | 2 | 4000 | 352000 | 352000 | 10000 | N/A | N/A |
| merlin-c-[101-124] | 1 core | 44 cores | 2 | 4000 | 352000 | 352000 | 10000 | N/A | N/A |
| merlin-c-[201-224] | 1 core | 44 cores | 2 | 4000 | 352000 | 352000 | 10000 | N/A | N/A |
| merlin-c-[301-306] | 1 core | 44 cores | 2 | 4000 | 352000 | 352000 | 10000 | N/A | N/A |

If nothing is specified, by default each core will use up to 8GB of memory. Memory can be increased with the `--mem=<mem_in_MB>` and
`--mem-per-cpu=<mem_in_MB>` options, and the maximum memory allowed is `Max.Mem/Node`.

In *Merlin6*, Memory is considered a Consumable Resource, as well as the CPU.
In **`merlin6`**, Memory is considered a Consumable Resource, as well as the CPU. Hence, both resources are accounted for when submitting a job,
and by default resources can not be oversubscribed. This is a main difference from the old **`merlin5`** cluster, where only CPUs were accounted for,
and memory was oversubscribed by default.

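For example, a minimal sketch of requesting more memory than the default (the values are illustrative):

```bash
#SBATCH --ntasks=8
#SBATCH --mem-per-cpu=8000   # request 8000MB per allocated CPU
##SBATCH --mem=64000         # alternatively, uncomment to request the total memory per node
```
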
### CPU partitions
{{site.data.alerts.tip}}Always check <b>'/etc/slurm/slurm.conf'</b> for changes in the hardware.
{{site.data.alerts.end}}

## Running jobs in the 'merlin6' cluster

In this chapter we will cover basic settings that users need to specify in order to run jobs in the Merlin6 CPU cluster.

### Merlin6 CPU cluster

To run jobs in the **`merlin6`** cluster, users **can optionally** specify the cluster name in Slurm:

```bash
#SBATCH --cluster=merlin6
```

If no cluster name is specified, any job will by default be submitted to this cluster (as this is the main cluster).
Hence, specifying it is only necessary when dealing with multiple clusters, or when environment
variables that modify the cluster name have been defined.

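The cluster can also be selected directly on the command line when submitting or querying jobs; a short sketch (the script name is a placeholder):

```bash
sbatch --clusters=merlin6 myjob.sh   # submit the batch script to the merlin6 cluster
squeue --clusters=merlin6            # list jobs in the merlin6 cluster
```
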
### Merlin6 CPU partitions

Users might need to specify the Slurm partition. If no partition is specified, it will default to **`general`**:

```bash
#SBATCH --partition=<partition_name>  # Possible <partition_name> values: general, daily, hourly
```

The partition can be specified when submitting a job with the `--partition=<partition_name>` option.
The following *partitions* (also known as *queues*) are configured in Slurm:

| CPU Partition | Default Time | Max Time | Max Nodes | Priority | PriorityJobFactor\* |
|:-----------------: | :----------: | :------: | :-------: | :------: | :-----------------: |
| **<u>general</u>** | 1 day | 1 week | 50 | low | 1 |
| **daily** | 1 day | 1 day | 67 | medium | 500 |
| **hourly** | 1 hour | 1 hour | unlimited | highest | 1000 |
| CPU Partition | Default Time | Max Time | Max Nodes | PriorityJobFactor\* | PriorityTier\*\* |
|:-----------------: | :----------: | :------: | :-------: | :-----------------: | :--------------: |
| **<u>general</u>** | 1 day | 1 week | 50 | 1 | 1 |
| **daily** | 1 day | 1 day | 67 | 500 | 1 |
| **hourly** | 1 hour | 1 hour | unlimited | 1000 | 1 |
| **gfa-asa** | 1 day | 1 week | 11 | 1000 | 1000 |

\*The **PriorityJobFactor** value will be added to the job priority (*PARTITION* column in `sprio -l`). In other words, jobs sent to higher priority
partitions will usually run first (however, other factors such as **job age** or mainly **fair share** might affect that decision). For the GPU
partitions, Slurm will also attempt first to allocate jobs on partitions with higher priority over partitions with lesser priority.

The **general** partition is the *default*: when nothing is specified, jobs will be assigned to that partition by default. General can not have more
than 50 nodes running jobs. For **daily** this limitation is extended to 67 nodes while for **hourly** there are no limits.
**\*\***Jobs submitted to a partition with a higher **PriorityTier** value will be dispatched before pending jobs in partitions with lower *PriorityTier* values
and, if possible, they will preempt running jobs from partitions with lower *PriorityTier* values.

* The **`general`** partition is the **default**. It can not have more than 50 nodes running jobs.
* For **`daily`** this limitation is extended to 67 nodes.
* For **`hourly`** there are no limits.
* **`gfa-asa`** is a **private hidden** partition, belonging to one experiment. **Access is restricted**. However, by agreement with the experiment,
  nodes are usually added to the **`hourly`** partition as extra resources for public use.

{{site.data.alerts.tip}}Jobs which would run for less than one day should always be sent to <b>daily</b>, while jobs that would run for less
than one hour should be sent to <b>hourly</b>. This ensures that you have higher priority than jobs sent to partitions with less priority,
@ -62,6 +83,57 @@ but also because <b>general</b> has limited the number of nodes that can be used
be blocked by long jobs and we can always ensure resources for shorter jobs.
{{site.data.alerts.end}}

### Merlin6 CPU Accounts

Users should ensure that the public **`merlin`** account is specified. If no account option is given, jobs default to this account.
Specifying it explicitly is mostly relevant for users who belong to multiple Slurm accounts and might otherwise submit with a different account by mistake.

```bash
#SBATCH --account=merlin   # Possible values: merlin, gfa-asa
```

Not all accounts can be used on all partitions. This is summarized in the table below:

| Slurm Account | Slurm Partitions |
| :------------------: | :-----------------------------------: |
| **<u>merlin</u>** | `hourly`, `daily`, `general` |
| **gfa-asa** | `gfa-asa`, `hourly`, `daily`, `general` |

#### The 'gfa-asa' private account

Access to the **`gfa-asa`** partition is only possible through the **`gfa-asa`** account. This account **is restricted**
to a specific group of users and is not public.

### Slurm CPU specific options

Some options are specific to CPU-based jobs. These are detailed here.

Alternative Slurm options for CPU-based jobs are also available. Please refer to the **man** pages
of each Slurm command for further information (`man salloc`, `man sbatch`, `man srun`).
The most common settings are listed below:

```bash
#SBATCH --hint=[no]multithread
#SBATCH --ntasks=<ntasks>
#SBATCH --ntasks-per-core=<ntasks>
#SBATCH --ntasks-per-socket=<ntasks>
#SBATCH --ntasks-per-node=<ntasks>
#SBATCH --mem=<size[units]>
#SBATCH --mem-per-cpu=<size[units]>
#SBATCH --cpus-per-task=<ncpus>
#SBATCH --cpu-bind=[{quiet,verbose},]<type>  # only for 'srun' command
```

#### Dealing with Hyper-Threading

The **`merlin6`** cluster contains nodes with Hyper-Threading enabled. One should always specify
whether to use Hyper-Threading or not. If not defined, Slurm will generally use it (exceptions apply).

```bash
#SBATCH --hint=multithread     # Use extra threads with in-core multi-threading.
#SBATCH --hint=nomultithread   # Don't use extra threads with in-core multi-threading.
```

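As an illustration, a minimal sketch of a single-task multithreaded (e.g. OpenMP) job pinned to physical cores only (the partition, time limit and program name are placeholders):

```bash
#!/bin/bash
#SBATCH --partition=hourly
#SBATCH --time=00:30:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=44       # one task using all physical cores of a node
#SBATCH --hint=nomultithread     # do not use the extra hyper-threads

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # one OpenMP thread per allocated core
srun ./myprogram
```
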
### User and job limits

In the CPU cluster we provide some limits which basically apply to jobs and users. The idea behind this is to ensure a fair usage of the resources and to
@ -84,12 +156,10 @@ massive draining of nodes for allocating such jobs. This would apply to jobs req

#### Per job limits

These are limits which apply to a single job. In other words, there is a maximum of resources a single job can use. This is described in the table below,
and limits will vary depending on the day of the week and the time (*working* vs *non-working* hours). Limits are shown in format: `SlurmQoS(limits)`,
where `SlurmQoS` can be seen with the command `sacctmgr show qos`:
These are limits which apply to a single job. In other words, there is a maximum of resources a single job can use. Limits are described in the table below with the format: `SlurmQoS(limits)` (possible `SlurmQoS` values can be listed with the command `sacctmgr show qos`). Some limits will vary depending on the day and time of the week.

| Partition | Mon-Fri 0h-18h | Sun-Thu 18h-0h | From Fri 18h to Mon 0h |
|:----------: | :------------------: | :------------: | :---------------------: |
|:----------: | :------------------------------: | :------------------------------: | :------------------------------: |
| **general** | normal(cpu=704,mem=2750G) | normal(cpu=704,mem=2750G) | normal(cpu=704,mem=2750G) |
| **daily** | daytime(cpu=704,mem=2750G) | nighttime(cpu=1408,mem=5500G) | unlimited(cpu=2200,mem=8593.75G) |
| **hourly** | unlimited(cpu=2200,mem=8593.75G) | unlimited(cpu=2200,mem=8593.75G) | unlimited(cpu=2200,mem=8593.75G) |
@ -105,12 +175,10 @@ with wide values).

#### Per user limits for CPU partitions

These are limits which apply exclusively to users. In other words, there is a maximum of resources a single user can use. This is described in the table below,
and limits will vary depending on the day of the week and the time (*working* vs *non-working* hours). Limits are shown in format: `SlurmQoS(limits)`,
where `SlurmQoS` can be seen with the command `sacctmgr show qos`:
These are limits which apply exclusively to users. In other words, there is a maximum of resources a single user can use. Limits are described in the table below with the format: `SlurmQoS(limits)` (possible `SlurmQoS` values can be listed with the command `sacctmgr show qos`). Some limits will vary depending on the day and time of the week.

| Partition | Mon-Fri 0h-18h | Sun-Thu 18h-0h | From Fri 18h to Mon 0h |
|:-----------:| :----------------: | :------------: | :---------------------: |
|:-----------:| :----------------------------: | :---------------------------: | :----------------------------: |
| **general** | normal(cpu=704,mem=2750G) | normal(cpu=704,mem=2750G) | normal(cpu=704,mem=2750G) |
| **daily** | daytime(cpu=1408,mem=5500G) | nighttime(cpu=2112,mem=8250G) | unlimited(cpu=6336,mem=24750G) |
| **hourly** | unlimited(cpu=6336,mem=24750G) | unlimited(cpu=6336,mem=24750G) | unlimited(cpu=6336,mem=24750G) |