initial formatting changes complete

commit 7db5d0fd05
parent f58c1f57b8
Date: 2026-01-06 16:40:15 +01:00
81 changed files with 805 additions and 1112 deletions

@@ -1,12 +1,4 @@
---
title: Slurm Configuration
#tags:
keywords: configuration, partitions, node definition
last_updated: 20 May 2021
summary: "This document describes a summary of the Merlin5 Slurm configuration."
sidebar: merlin6_sidebar
permalink: /merlin5/slurm-configuration.html
---
# Slurm Configuration
This documentation shows basic Slurm configuration and options needed to run jobs in the Merlin5 cluster.
@@ -28,7 +20,6 @@ consider the memory as a *consumable resource*. Hence, users can *oversubscribe*
this legacy configuration has been kept to ensure that old jobs can keep running in the same way they did a few years ago.
If you know that this might be a problem for you, please, always use Merlin6 instead.
## Running jobs in the 'merlin5' cluster
In this chapter we will cover basic settings that users need to specify in order to run jobs in the Merlin5 CPU cluster.
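
As an illustration of the kind of settings covered in this chapter, below is a minimal batch-script sketch. It assumes the standard Slurm tooling and reuses the `merlin5` cluster and `merlin` partition names from this page; the job name, resource counts, and payload command are placeholders:

```bash
#!/bin/bash
#SBATCH --clusters=merlin5    # target the Merlin5 cluster
#SBATCH --partition=merlin    # general partition (see the QoS table below)
#SBATCH --job-name=example    # placeholder job name
#SBATCH --ntasks=8            # placeholder task count, within the 384-core QoS cap
#SBATCH --time=01:00:00       # placeholder walltime

srun hostname                 # placeholder payload command
```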
@@ -96,11 +87,11 @@ Below are listed the most common settings:
Notice that in **Merlin5** no hyper-threading is available (while in **Merlin6** it is).
Hence, in **Merlin5** there is no need to specify the `--hint` hyper-threading-related options.
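
For contrast, here is a sketch of the `--hint` options that are relevant on Merlin6 (these are standard Slurm flags; pick at most one per job, and omit both on Merlin5 as explained above):

```bash
# Merlin6 only -- not needed on Merlin5, which has no hyper-threading.
# Pick one of the two, depending on whether the application benefits from it:
#SBATCH --hint=multithread      # use extra threads with in-core multithreading
##SBATCH --hint=nomultithread   # disabled alternative: one thread per core
```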
## User and job limits
In the CPU cluster we provide some limits which apply to jobs and users. The idea behind this is to ensure fair usage of the resources and to
avoid overuse of the resources by a single user or job. However, applying limits might affect the overall usage efficiency of the cluster (for example,
pending jobs from a single user while many nodes stay idle due to low overall activity is something that can be seen when user limits are applied).
In the same way, these limits can also be used to improve the efficiency of the cluster (for example, without any job-size limits, a job requesting all
resources from the batch system would drain the entire cluster in order to fit that job, which is undesirable).
@@ -119,7 +110,7 @@ with the format `SlurmQoS(limits)` (`SlurmQoS` can be listed from the `sacctmgr
| **merlin** | merlin5(cpu=384) | None |
| **merlin-long** | merlin5(cpu=384) | Max. 4 nodes |
By default, due to the QoS limits, a job cannot use more than 384 cores (max CPUs per job).
However, for the `merlin-long` partition this is even more restricted: there is an extra limit of 4 dedicated nodes for this partition. This is defined
at the partition level, and it overrides any QoS limit as long as it is more restrictive.
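
To inspect these limits directly, the `sacctmgr` command mentioned above (and `scontrol` for partition-level settings) can be queried. A sketch, assuming the standard Slurm client tools are available; the exact output depends on the site configuration:

```bash
# List the 'merlin5' QoS with its per-job TRES limit (e.g. cpu=384)
sacctmgr show qos merlin5 format=Name,MaxTRES

# Show the partition-level limits, e.g. the 4-node cap on 'merlin-long'
scontrol show partition merlin-long
```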