vibe #3
@@ -86,9 +86,8 @@ adjusted.
 For MPI-based jobs, where performance generally improves with single-threaded CPUs, this option is recommended.
 In such cases, you should double the **`--mem-per-cpu`** value to account for the reduced number of threads.
 
-{{site.data.alerts.tip}}
-Always verify the Slurm <b>'/var/spool/slurmd/conf-cache/slurm.conf'</b> configuration file for potential changes.
-{{site.data.alerts.end}}
+!!! tip
+    Always verify the Slurm `/var/spool/slurmd/conf-cache/slurm.conf` configuration file for potential changes.
 
 ### User and job limits with QoS
 
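The advice above (single-threaded CPUs for MPI, with `--mem-per-cpu` doubled) can be sketched as a batch script. This is a minimal illustration, assuming the option discussed is `--hint=nomultithread` and using hypothetical task counts, memory values, and binary name; adjust to your site's defaults.

```shell
#!/bin/bash
#SBATCH --ntasks=32                 # hypothetical number of MPI ranks
#SBATCH --hint=nomultithread        # one thread per physical core (assumed to be the option discussed)
#SBATCH --mem-per-cpu=8000M         # doubled from a hypothetical 4000M default, since half the hardware threads are used
#SBATCH --time=04:00:00

# Launch the MPI application across the allocated tasks (placeholder binary)
srun ./my_mpi_app
```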
@@ -132,11 +131,10 @@ Where:
 * **`cpu_interactive` QoS:** Is restricted to one node and a few CPUs only, and is intended to be used when interactive
   allocations are necessary (`salloc`, `srun`).
 
-For additional details, refer to the [CPU partitions](slurm-configuration.md#CPU-partitions) section.
+For additional details, refer to the [CPU partitions](#cpu-partitions) section.
 
-{{site.data.alerts.tip}}
-Always verify QoS definitions for potential changes using the <b>'sacctmgr show qos format="Name%22,MaxTRESPU%35,MaxTRES%35"'</b> command.
-{{site.data.alerts.end}}
+!!! tip
+    Always verify QoS definitions for potential changes using the `sacctmgr show qos format="Name%22,MaxTRESPU%35,MaxTRES%35"` command.
 
 ### CPU partitions
 
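An interactive allocation under the `cpu_interactive` limits might look like the following sketch. CPU count, walltime, and partition name are illustrative assumptions, not site defaults.

```shell
# Hold a small interactive allocation (hypothetical size/walltime), then run inside it
salloc --ntasks=4 --time=01:00:00

# Or launch a single interactive shell directly, without a persistent allocation
srun --ntasks=1 --time=00:30:00 --pty bash
```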
@@ -151,11 +149,10 @@ Key concepts:
   partitions, where applicable.
 * **`QoS`**: Specifies the quality of service associated with a partition. It is used to control and restrict resource availability
   for specific partitions, ensuring that resource allocation aligns with intended usage policies. Detailed explanations of the various
-  QoS settings can be found in the [User and job limits with QoS](/merlin7/slurm-configuration.html#user-and-job-limits-with-qos) section.
+  QoS settings can be found in the [User and job limits with QoS](#user-and-job-limits-with-qos) section.
 
-{{site.data.alerts.tip}}
-Always verify partition configurations for potential changes using the <b>'scontrol show partition'</b> command.
-{{site.data.alerts.end}}
+!!! tip
+    Always verify partition configurations for potential changes using the `scontrol show partition` command.
 
 #### CPU public partitions
 
@@ -169,11 +166,11 @@ Always verify partition configurations for potential changes using the <b>'scon
 All Merlin users are part of the `merlin` account, which is used as the *default account* when submitting jobs.
 Similarly, if no partition is specified, jobs are automatically submitted to the `general` partition by default.
 
-{{site.data.alerts.tip}}
-For jobs running less than one day, submit them to the <b>daily</b> partition.
-For jobs running less than one hour, use the <b>hourly</b> partition.
-These partitions provide higher priority and ensure quicker scheduling compared to <b>general</b>, which has limited node availability.
-{{site.data.alerts.end}}
+!!! tip
+    For jobs running less than one day, submit them to the **daily** partition.
+    For jobs running less than one hour, use the **hourly** partition. These
+    partitions provide higher priority and ensure quicker scheduling compared
+    to **general**, which has limited node availability.
 
 The **`hourly`** partition may include private nodes as an additional buffer. However, the current Slurm partition configuration, governed
 by **`PriorityTier`**, ensures that jobs submitted to private partitions are prioritized and processed first. As a result, access to the
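Choosing a partition by expected walltime can be sketched as below. The script name `job.sh` and the walltimes are hypothetical placeholders.

```shell
# Walltime under one hour: use hourly for the quickest scheduling
sbatch --partition=hourly  --time=00:45:00 job.sh

# Walltime under one day: use daily
sbatch --partition=daily   --time=12:00:00 job.sh

# Longer jobs: general (the default partition, with limited node availability)
sbatch --partition=general --time=3-00:00:00 job.sh
```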
@@ -188,10 +185,10 @@ before any jobs in other partitions.
 * **Intended Use:** This partition is ideal for debugging, testing, compiling, short interactive runs, and other activities where
   immediate access is important.
 
-{{site.data.alerts.warning}}
-Because of CPU sharing, the performance on the **'interactive'** partition may not be optimal for compute-intensive tasks.
-For long-running or production workloads, use a dedicated batch partition instead.
-{{site.data.alerts.end}}
+!!! warning
+    Because of CPU sharing, the performance on the **interactive** partition
+    may not be optimal for compute-intensive tasks. For long-running or
+    production workloads, use a dedicated batch partition instead.
 
 #### CPU private partitions
 
@@ -261,9 +258,8 @@ adjusted.
 For MPI-based jobs, where performance generally improves with single-threaded CPUs, this option is recommended.
 In such cases, you should double the **`--mem-per-cpu`** value to account for the reduced number of threads.
 
-{{site.data.alerts.tip}}
-Always verify the Slurm <b>'/var/spool/slurmd/conf-cache/slurm.conf'</b> configuration file for potential changes.
-{{site.data.alerts.end}}
+!!! tip
+    Always verify the Slurm `/var/spool/slurmd/conf-cache/slurm.conf` configuration file for potential changes.
 
 ### User and job limits with QoS
 
@@ -308,11 +304,10 @@ Where:
 * **`gpu_a100_interactive` & `gpu_gh_interactive` QoS:** Guarantee interactive access to GPU nodes for software compilation and
   small testing.
 
-For additional details, refer to the [GPU partitions](slurm-configuration.md#GPU-partitions) section.
+For additional details, refer to the [GPU partitions](#gpu-partitions) section.
 
-{{site.data.alerts.tip}}
-Always verify QoS definitions for potential changes using the <b>'sacctmgr show qos format="Name%22,MaxTRESPU%35,MaxTRES%35"'</b> command.
-{{site.data.alerts.end}}
+!!! tip
+    Always verify QoS definitions for potential changes using the `sacctmgr show qos format="Name%22,MaxTRESPU%35,MaxTRES%35"` command.
 
 ### GPU partitions
 
@@ -327,11 +322,10 @@ Key concepts:
   partitions, where applicable.
 * **`QoS`**: Specifies the quality of service associated with a partition. It is used to control and restrict resource availability
   for specific partitions, ensuring that resource allocation aligns with intended usage policies. Detailed explanations of the various
-  QoS settings can be found in the [User and job limits with QoS](/merlin7/slurm-configuration.html#user-and-job-limits-with-qos) section.
+  QoS settings can be found in the [User and job limits with QoS](#user-and-job-limits-with-qos) section.
 
-{{site.data.alerts.tip}}
-Always verify partition configurations for potential changes using the <b>'scontrol show partition'</b> command.
-{{site.data.alerts.end}}
+!!! tip
+    Always verify partition configurations for potential changes using the `scontrol show partition` command.
 
 #### A100-based partitions
 
@@ -345,11 +339,12 @@ Always verify partition configurations for potential changes using the <b>'scon
 All Merlin users are part of the `merlin` account, which is used as the *default account* when submitting jobs.
 Similarly, if no partition is specified, jobs are automatically submitted to the `general` partition by default.
 
-{{site.data.alerts.tip}}
-For jobs running less than one day, submit them to the <b>a100-daily</b> partition.
-For jobs running less than one hour, use the <b>a100-hourly</b> partition.
-These partitions provide higher priority and ensure quicker scheduling compared to <b>a100-general</b>, which has limited node availability.
-{{site.data.alerts.end}}
+!!! tip
+    For jobs running less than one day, submit them to the **a100-daily**
+    partition. For jobs running less than one hour, use the **a100-hourly**
+    partition. These partitions provide higher priority and ensure quicker
+    scheduling compared to **a100-general**, which has limited node
+    availability.
 
 #### GH-based partitions
 
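A short GPU job on the A100 partitions might look like this sketch. The GPU count, walltime, and script name `train.sh` are illustrative assumptions; the exact GPU request syntax (`--gpus` vs. `--gres=gpu:N`) depends on the site configuration.

```shell
# Request one A100 GPU for a run under one hour on the a100-hourly partition
sbatch --partition=a100-hourly --time=00:30:00 --gpus=1 train.sh

# Interactive GPU access for compiling and small tests (gpu_a100_interactive QoS)
salloc --partition=a100-hourly --gpus=1 --time=01:00:00
```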
@@ -363,8 +358,8 @@ These partitions provide higher priority and ensure quicker scheduling compared
 All Merlin users are part of the `merlin` account, which is used as the *default account* when submitting jobs.
 Similarly, if no partition is specified, jobs are automatically submitted to the `general` partition by default.
 
-{{site.data.alerts.tip}}
-For jobs running less than one day, submit them to the <b>gh-daily</b> partition.
-For jobs running less than one hour, use the <b>gh-hourly</b> partition.
-These partitions provide higher priority and ensure quicker scheduling compared to <b>gh-general</b>, which has limited node availability.
-{{site.data.alerts.end}}
+!!! tip
+    For jobs running less than one day, submit them to the **gh-daily**
+    partition. For jobs running less than one hour, use the **gh-hourly**
+    partition. These partitions provide higher priority and ensure quicker
+    scheduling compared to **gh-general**, which has limited node availability.