Update docs
All checks were successful
Build and Deploy Documentation / build-and-deploy (push) Successful in 6s

This commit is contained in:
2025-07-24 15:27:11 +02:00
parent 4a5d205950
commit 4b9489888c
7 changed files with 115 additions and 28 deletions


@@ -14,11 +14,12 @@ This documentation shows basic Slurm configuration and options needed to run job
### CPU public partitions
| PartitionName | DefaultTime | MaxTime | Priority | Account | Per Job Limits | Per User Limits |
| -----------------: | -----------: | ----------: | -------: | ---------------: | -----------------: | -----------------: |
| **<u>general</u>** | 1-00:00:00 | 7-00:00:00 | Low | <u>merlin</u> | cpu=1024,mem=1920G | cpu=1024,mem=1920G |
| **daily** | 0-01:00:00 | 1-00:00:00 | Medium | <u>merlin</u> | cpu=1024,mem=1920G | cpu=2048,mem=3840G |
| **hourly** | 0-00:30:00 | 0-01:00:00 | High | <u>merlin</u> | cpu=2048,mem=3840G | cpu=8192,mem=15T |
| PartitionName | DefaultTime | MaxTime | Priority | Account | Per Job Limits | Per User Limits |
| -----------------: | -----------: | ----------: | -------: | ---------------: | --------------------: | --------------------: |
| **<u>general</u>** | 1-00:00:00 | 7-00:00:00 | Low | <u>merlin</u> | cpu=1024,mem=1920G | cpu=1024,mem=1920G |
| **daily** | 0-01:00:00 | 1-00:00:00 | Medium | <u>merlin</u> | cpu=1024,mem=1920G | cpu=2048,mem=3840G |
| **hourly** | 0-00:30:00 | 0-01:00:00 | High | <u>merlin</u> | cpu=2048,mem=3840G | cpu=8192,mem=15T |
| **interactive** | 0-04:00:00 | 0-12:00:00 | Highest | <u>merlin</u> | cpu=16,mem=30G,node=1 | cpu=32,mem=60G,node=1 |
### GPU public partitions
@@ -110,12 +111,13 @@ various QoS definitions applicable to the merlin7 CPU-based cluster. Here:
* `MaxTRES` specifies resource limits per job.
* `MaxTRESPU` specifies resource limits per user.
| Name | MaxTRES | MaxTRESPU | Scope |
| --------------: | -----------------: | -----------------: | ---------------------: |
| **normal** | | | partition |
| **cpu_general** | cpu=1024,mem=1920G | cpu=1024,mem=1920G | <u>user</u>, partition |
| **cpu_daily** | cpu=1024,mem=1920G | cpu=2048,mem=3840G | partition |
| **cpu_hourly** | cpu=2048,mem=3840G | cpu=8192,mem=15T | partition |
| Name | MaxTRES | MaxTRESPU | Scope |
| -------------------: | --------------------: | --------------------: | ---------------------: |
| **normal** | | | partition |
| **cpu_general** | cpu=1024,mem=1920G | cpu=1024,mem=1920G | <u>user</u>, partition |
| **cpu_daily** | cpu=1024,mem=1920G | cpu=2048,mem=3840G | partition |
| **cpu_hourly** | cpu=2048,mem=3840G | cpu=8192,mem=15T | partition |
| **cpu_interactive** | cpu=16,mem=30G,node=1 | cpu=32,mem=60G,node=1 | partition |
Where:
* **`normal` QoS:** This QoS has no limits and is typically applied to partitions that do not require user or job
@@ -127,6 +129,8 @@ Where:
with higher resource needs.
* **`cpu_hourly` QoS:** Offers the least constraints, allowing more resources to be used for the `hourly` partition,
which caters to very short-duration jobs.
* **`cpu_interactive` QoS:** Restricts each job to a single node with at most 16 CPUs and 30G of memory (32 CPUs and 60G per user).
It is intended for interactive allocations (`salloc`, `srun`).
For additional details, refer to the [CPU partitions](/merlin7/slurm-configuration.html#CPU-partitions) section.
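The QoS limits listed above can be compared against the live configuration with `sacctmgr`. The following is a minimal sketch, assuming it is run on a node where the Slurm client tools are installed; the function name `show_qos_limits` is hypothetical and the column widths are arbitrary.

```shell
# Hypothetical helper: list every QoS with its per-job (MaxTRES)
# and per-user (MaxTRESPU) resource limits.
show_qos_limits() {
    sacctmgr show qos format=Name%20,MaxTRES%30,MaxTRESPU%30
}
```

Calling `show_qos_limits` should print one row per QoS, which can then be checked against the table in this section.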
@@ -155,11 +159,12 @@ Always verify partition configurations for potential changes using the <b>'scon
#### CPU public partitions
| PartitionName | DefaultTime | MaxTime | TotalNodes | PriorityJobFactor | PriorityTier | QoS | AllowAccounts |
| -----------------: | -----------: | ----------: | --------: | ----------------: | -----------: | ----------: | -------------: |
| **<u>general</u>** | 1-00:00:00 | 7-00:00:00 | 50 | 1 | 1 | cpu_general | <u>merlin</u> |
| **daily** | 0-01:00:00 | 1-00:00:00 | 62 | 500 | 1 | cpu_daily | <u>merlin</u> |
| **hourly** | 0-00:30:00 | 0-01:00:00 | 77 | 1000 | 1 | cpu_hourly | <u>merlin</u> |
| PartitionName | DefaultTime | MaxTime | TotalNodes | PriorityJobFactor | PriorityTier | QoS | AllowAccounts |
| -----------------: | -----------: | ----------: | --------: | ----------------: | -----------: | --------------: | -------------: |
| **<u>general</u>** | 1-00:00:00 | 7-00:00:00 | 46 | 1 | 1 | cpu_general | <u>merlin</u> |
| **daily** | 0-01:00:00 | 1-00:00:00 | 58 | 500 | 1 | cpu_daily | <u>merlin</u> |
| **hourly** | 0-00:30:00 | 0-01:00:00 | 77 | 1000 | 1 | cpu_hourly | <u>merlin</u> |
| **interactive** | 0-04:00:00 | 0-12:00:00 | 58 | 1 | 2 | cpu_interactive | <u>merlin</u> |
All Merlin users are part of the `merlin` account, which is used as the *default account* when submitting jobs.
Similarly, if no partition is specified, jobs are automatically submitted to the `general` partition by default.
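As an illustration of overriding these defaults, a batch script can name the partition and account explicitly. This is only a sketch: the resource values and the payload are placeholders chosen for the example, not recommendations.

```shell
#!/bin/bash
# Select the daily partition; omitting this line falls back to general.
#SBATCH --partition=daily
# merlin is already the default account; it is stated here only for clarity.
#SBATCH --account=merlin
# Must not exceed the partition's MaxTime (1-00:00:00 for daily).
#SBATCH --time=0-06:00:00
# Illustrative resource request, well below the per-job limits.
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G

# Placeholder payload: replace with the actual workload.
echo "Running on $(hostname)"
```

If `--time` is omitted, the partition's `DefaultTime` applies.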
@@ -174,6 +179,20 @@ The **`hourly`** partition may include private nodes as an additional buffer. Ho
by **`PriorityTier`**, ensures that jobs submitted to private partitions are prioritized and processed first. As a result, access to the
**`hourly`** partition might experience delays in such scenarios.
The **`interactive`** partition is designed specifically for real-time, interactive work. Here are the key characteristics:
* **CPU Oversubscription:** This partition allows CPU oversubscription (configured as `OverSubscribe=FORCE:4`), meaning that up to four
interactive jobs may share the same physical CPU core. This can reduce per-job performance, but it gives fast access for short-term tasks.
* **Highest Scheduling Priority:** The interactive partition has `PriorityTier` 2, while the other public CPU partitions have
`PriorityTier` 1, so its jobs are scheduled ahead of jobs in those partitions.
* **Intended Use:** This partition is ideal for debugging, testing, compiling, short interactive runs, and other activities where
immediate access is important.
{{site.data.alerts.warning}}
Because of CPU sharing, performance on the **`interactive`** partition may not be optimal for compute-intensive tasks.
For long-running or production workloads, use one of the batch partitions (`general`, `daily`, or `hourly`) instead.
{{site.data.alerts.end}}
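A typical way to use the `interactive` partition is to request a shell on a compute node with `srun`. The sketch below wraps that request in a function; the name `interactive_shell` and its default values are hypothetical, chosen to stay within the per-job limits of the partition (`cpu=16,mem=30G,node=1`).

```shell
# Hypothetical helper: open an interactive shell on the interactive partition.
# Optional arguments: number of CPUs, memory, time limit.
interactive_shell() {
    srun --partition=interactive --account=merlin --nodes=1 --ntasks=1 \
        --cpus-per-task="${1:-4}" --mem="${2:-8G}" --time="${3:-01:00:00}" \
        --pty bash
}
```

For example, `interactive_shell 8 16G 02:00:00` asks for 8 CPUs and 16G of memory for two hours. `salloc` accepts the same resource options when an allocation is preferred over a single shell, and `scontrol show partition interactive` can be used to confirm the current partition settings.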
#### CPU private partitions
##### CAS / ASA