gwendolen_long partition updated to gwendolen-long

2022-02-17 15:35:55 +01:00
parent 4cade03842
commit 4bcde04fea
2 changed files with 10 additions and 10 deletions

View File

@@ -53,7 +53,7 @@ The table below resumes shows all possible partitions available to users:
| `gpu` | 1 day | 1 week | 1 | 1 |
| `gpu-short` | 2 hours | 2 hours | 1000 | 500 |
| `gwendolen` | 30 minutes | 30 minutes | 1000 | 1000 |
-| `gwendolen_long` | 30 minutes | 4 hours | 1 | 1 |
+| `gwendolen-long` | 30 minutes | 4 hours | 1 | 1 |
\*The **PriorityJobFactor** value will be added to the job priority (*PARTITION* column in `sprio -l` ). In other words, jobs sent to higher priority
partitions will usually run first (however, other factors such like **job age** or mainly **fair share** might affect to that decision). For the GPU
@@ -75,15 +75,15 @@ Not all the accounts can be used on all partitions. This is resumed in the table
| Slurm Account | Slurm Partitions |
|:-------------------: | :------------------: |
| **`merlin`** | **`gpu`**,`gpu-short` |
-| `gwendolen` | `gwendolen`,`gwendolen_long` |
+| `gwendolen` | `gwendolen`,`gwendolen-long` |
By default, all users belong to the `merlin` Slurm accounts, and jobs are submitted to the `gpu` partition when no partition is defined.
-Users only need to specify the `gwendolen` account when using the `gwendolen` or `gwendolen_long` partitions, otherwise specifying account is not needed (it will always default to `merlin`).
+Users only need to specify the `gwendolen` account when using the `gwendolen` or `gwendolen-long` partitions, otherwise specifying account is not needed (it will always default to `merlin`).
#### The 'gwendolen' account
-For running jobs in the **`gwendolen`/`gwendolen_long`** partitions, users must specify the **`gwendolen`** account.
+For running jobs in the **`gwendolen`/`gwendolen-long`** partitions, users must specify the **`gwendolen`** account.
The `merlin` account is not allowed to use the Gwendolen partitions.
Gwendolen is restricted to a set of users belonging to the **`unx-gwendolen`** Unix group. If you belong to a project allowed to use **Gwendolen**, or you are a user which would like to have access to it, please request access to the **`unx-gwendolen`** Unix group through [PSI Service Now](https://psi.service-now.com/): the request will be redirected to the responsible of the project (Andreas Adelmann).
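As an illustration of the account requirement above, a job header for the renamed partition might look like the following. This is a minimal sketch rather than the site's official template: the `--gpus` and `--time` values are assumptions chosen for the example, and the final `echo` is only there to show which account and partition Slurm assigned.

```bash
#!/bin/bash
#SBATCH --cluster=gmerlin6           # Cluster name, as in the GPU job template
#SBATCH --account=gwendolen          # Required here: the default merlin account is not allowed
#SBATCH --partition=gwendolen-long   # Renamed partition (formerly gwendolen_long)
#SBATCH --gpus=1                     # Assumed value, for illustration only
#SBATCH --time=04:00:00              # Assumed value; must not exceed the partition maximum

# Slurm exports SLURM_JOB_ACCOUNT and SLURM_JOB_PARTITION inside a running job.
# Outside of Slurm both fall back to "unset", so the script can be dry-run with plain bash.
job_summary="account=${SLURM_JOB_ACCOUNT:-unset} partition=${SLURM_JOB_PARTITION:-unset}"
echo "$job_summary"
```

Jobs sent to the `gpu` or `gpu-short` partitions do not need the `--account` line, since the account defaults to `merlin`.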
@@ -182,7 +182,7 @@ Please, notice that when defining `[<type>:]` once, then all other options must
#### Dealing with Hyper-Threading
-The **`gmerlin6`** cluster contains the partitions `gwendolen` and `gwendolen_long`, which have a node with Hyper-Threading enabled.
+The **`gmerlin6`** cluster contains the partitions `gwendolen` and `gwendolen-long`, which have a node with Hyper-Threading enabled.
In that case, one should always specify whether to use Hyper-Threading or not. If not defined, Slurm will
generally use it (exceptions apply). For this machine, generally HT is recommended.
@@ -207,7 +207,7 @@ Limits are defined using QoS, and this is usually set at the partition level. Li
| **gpu** | **`merlin`** | gpu_week(cpu=40,gres/gpu=8,mem=200G) |
| **gpu-short** | **`merlin`** | gpu_week(cpu=40,gres/gpu=8,mem=200G) |
| **gwendolen** | `gwendolen` | No limits |
-| **gwendolen_long** | `gwendolen` | No limits, active from 9pm to 5:30am |
+| **gwendolen-long** | `gwendolen` | No limits, active from 9pm to 5:30am |
* With the limits in the public `gpu` and `gpu-short` partitions, a single job using the `merlin` acccount
(default account) can not use more than 40 CPUs, more than 8 GPUs or more than 200GB.
@@ -216,10 +216,10 @@ As there are no more existing QoS during the week temporary overriding job limit
instance in the CPU **daily** partition), the job needs to be cancelled, and the requested resources
must be adapted according to the above resource limits.
-* The **gwendolen** and **gwendolen_long** partitions are two special partitions for a **[NVIDIA DGX A100](https://www.nvidia.com/en-us/data-center/dgx-a100/)** machine.
+* The **gwendolen** and **gwendolen-long** partitions are two special partitions for a **[NVIDIA DGX A100](https://www.nvidia.com/en-us/data-center/dgx-a100/)** machine.
Only users belonging to the **`unx-gwendolen`** Unix group can run in these partitions. No limits are applied (machine resources can be completely used).
-* The **`gwendolen_long`** partition is available 24h. However,
+* The **`gwendolen-long`** partition is available 24h. However,
* from 5:30am to 9pm the partition is `down` (jobs can be submitted, but can not run until the partition is set to `active`).
* from 9pm to 5:30am jobs are allowed to run (partition is set to `active`).
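The schedule described in the bullets above can be sketched as a small shell check. The function name `gwendolen_long_is_active` is hypothetical and not part of Slurm or of the cluster tooling. The sketch also assumes that the window is evaluated in the cluster's local time, that 21:00 counts as active and that 05:30 counts as down, since the text does not say which side the boundaries fall on.

```bash
#!/bin/bash

# Hypothetical helper: succeeds (exit status 0) when the given HH:MM time falls
# inside the 21:00-05:30 window in which gwendolen-long jobs are allowed to run.
# With no argument, the current local time is used.
gwendolen_long_is_active() {
    local hhmm="${1:-$(date +%H:%M)}"
    local hour="${hhmm%%:*}" minute="${hhmm##*:}"
    # Force base 10 so that values such as "08" or "09" are not parsed as octal.
    local now=$(( 10#$hour * 60 + 10#$minute ))
    # The window wraps around midnight: late evening OR early morning.
    (( now >= 21 * 60 || now < 5 * 60 + 30 ))
}

if gwendolen_long_is_active; then
    echo "gwendolen-long: jobs may start now"
else
    echo "gwendolen-long: jobs can be submitted but will wait until 9pm"
fi
```

This only mirrors the documented schedule; the partition state reported by Slurm itself remains the authoritative answer.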
@@ -234,7 +234,7 @@ Limits are defined using QoS, and this is usually set at the partition level. Li
| **gpu** | **`merlin`** | gpu_week(cpu=80,gres/gpu=16,mem=400G) |
| **gpu-short** | **`merlin`** | gpu_week(cpu=80,gres/gpu=16,mem=400G) |
| **gwendolen** | `gwendolen` | No limits |
-| **gwendolen_long** | `gwendolen` | No limits, active from 9pm to 5:30am |
+| **gwendolen-long** | `gwendolen` | No limits, active from 9pm to 5:30am |
* With the limits in the public `gpu` and `gpu-short` partitions, a single user can not use more than 80 CPUs, more than 16 GPUs or more than 400GB.
Jobs sent by any user already exceeding such limits will stay in the queue with the message **`QOSMax[Cpu|GRES|Mem]PerUser`**.

View File

@@ -153,7 +153,7 @@ The following template should be used by any user submitting jobs to GPU nodes:
#!/bin/bash
#SBATCH --cluster=gmerlin6 # Cluster name
#SBATCH --partition=gpu,gpu-short # Specify one or multiple partitions, or
-#SBATCH --partition=gwendolen,gwendolen_long # Only for Gwendolen users
+#SBATCH --partition=gwendolen,gwendolen-long # Only for Gwendolen users
#SBATCH --gpus="<type>:<num_gpus>" # <type> is optional, <num_gpus> is mandatory
#SBATCH --time=<D-HH:MM:SS> # Strongly recommended
#SBATCH --output=<output_file> # Generate custom output file
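
Because this change renames the partition, job scripts that still reference `gwendolen_long` would need the same one-line update. The following is a minimal sketch of how that could be done with `sed`. It works on a temporary demo file so that it can be run safely; the file content is an assumption made for the example, not a real user script.

```bash
#!/bin/bash
set -euo pipefail

# Create a throwaway demo script containing the old partition name.
script="$(mktemp)"
printf '%s\n' '#SBATCH --partition=gwendolen,gwendolen_long' > "$script"

# Replace every occurrence of the old name in place, keeping a .bak backup.
# The -i.bak form is accepted by both GNU and BSD sed.
sed -i.bak 's/gwendolen_long/gwendolen-long/g' "$script"

updated="$(cat "$script")"
echo "$updated"
```

To apply the same substitution to an existing job script, point the `sed` command at that file instead of the temporary one and review the result against the `.bak` copy.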