Removed GPUs from CPUs partition
@@ -8,24 +8,7 @@ sidebar: merlin6_sidebar
permalink: /merlin6/slurm-configuration.html
---

## Using the Slurm batch system

Clusters at PSI use the [Slurm Workload Manager](http://slurm.schedmd.com/) as the batch system for managing and scheduling jobs.
Historically, *Merlin4* and *Merlin5* also used Slurm. In the same way, **Merlin6** has also been configured with this batch system.

Slurm has been installed in a **multi-clustered** configuration, which allows multiple clusters to be integrated into the same batch system.
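For illustration only (this example is not part of the original page), the ``--clusters`` option of the standard Slurm commands can be used to address a specific cluster, or all of them, from a login node:

```bash
# List the partitions of a specific cluster
sinfo --clusters=merlin6

# Show your jobs across all clusters known to this Slurm installation
squeue --clusters=all --user=$USER
```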
To understand how Slurm is configured in the cluster, it can be useful to check the following files:

* ``/etc/slurm/slurm.conf`` - can be found on the login nodes and computing nodes.
* ``/etc/slurm/cgroup.conf`` - can be found on the computing nodes; it is also propagated to the login nodes for user read access.
* ``/etc/slurm/gres.conf`` - can be found on the GPU nodes; it is also propagated to the login nodes and computing nodes for user read access.

The configuration files found on the login nodes correspond exclusively to the **merlin6** cluster.
Configuration files for the old **merlin5** cluster must be checked directly on any of the **merlin5** computing nodes: these are not propagated
to the **merlin6** login nodes.
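As a quick sketch (not taken from the original page), the running configuration can also be queried through Slurm itself instead of reading the files directly:

```bash
# Inspect the merlin6 configuration file available on the login nodes
less /etc/slurm/slurm.conf

# Or ask the Slurm controller for the configuration it is currently running with
scontrol show config | less
```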
### About Merlin5 & Merlin6
## About Merlin5 & Merlin6

The new Slurm cluster is called **merlin6**. However, the old Slurm *merlin* cluster will be kept for some time, and it has been renamed to **merlin5**.
This allows jobs to keep running on the old computing nodes until users have fully migrated their code to the new cluster.
@@ -35,11 +18,11 @@ the old *merlin5* computing nodes by using the option ``--cluster=merlin5``.
This documentation only covers the usage of the **merlin6** Slurm cluster.
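A minimal sketch of how cluster selection looks in practice (the job script name is a placeholder, and **merlin6** is assumed to be the default cluster on the login nodes):

```bash
# Submit to the default cluster (merlin6)
sbatch myjob.sh

# Submit the same script to the old merlin5 cluster instead
sbatch --cluster=merlin5 myjob.sh
```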
### Using Slurm 'merlin6' cluster
## Using Slurm 'merlin6' cluster

Basic usage of the **merlin6** cluster is detailed here. For advanced usage, please refer to the following document: [LINK TO SLURM ADVANCED CONFIG]()
#### Merlin6 Node definition
### Merlin6 Node definition

The following table shows the default and maximum resources that can be used per node:
@@ -56,7 +39,7 @@ and maximum memory allowed is ``Max.Mem/Node``.
In *Merlin6*, memory, like the CPU, is considered a Consumable Resource.
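As an illustrative sketch (the values are placeholders, not site recommendations), both CPUs and memory can therefore be requested explicitly in a job script:

```bash
#!/bin/bash
#SBATCH --ntasks=4            # number of tasks (CPUs) requested
#SBATCH --mem-per-cpu=4000    # memory per CPU in MB, counted against Max.Mem/Node
#SBATCH --time=01:00:00       # requested walltime

# Print the node(s) the tasks landed on
srun hostname
```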
#### Merlin6 Slurm partitions
### Merlin6 Slurm partitions

The partition can be specified when submitting a job with the ``--partition=<partitionname>`` option.
The following *partitions* (also known as *queues*) are configured in Slurm:

@@ -71,7 +54,7 @@ General is the *default*, so when nothing is specified job will be by default as
running jobs. For **daily** this limit is extended to 60 nodes, while for **hourly** there are no limits. Shorter jobs have higher priority than
longer jobs and will therefore generally be scheduled earlier (however, other factors, such as the user's fair-share value, can affect this).
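For illustration only (the script name and time values are placeholders), the partition can be selected either on the command line or inside the job script:

```bash
# Submit a short test job to the hourly partition
sbatch --partition=hourly --time=00:30:00 myjob.sh

# Equivalent directives inside the job script:
#   #SBATCH --partition=daily
#   #SBATCH --time=06:00:00
```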
#### Merlin6 User limits
### Merlin6 User limits

By default, users cannot use more than 704 cores at the same time (Max CPU per user). This is equivalent to 8 exclusive nodes.
This limit applies to the **general** and **daily** partitions. For the **hourly** partition there is no such restriction and user limits are removed.
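As a rough way to check your current usage against this limit (a sketch, not part of the original page), the CPUs allocated to your running jobs can be summed with ``squeue``:

```bash
# Sum the CPUs allocated to your currently running jobs
squeue --user=$USER --states=RUNNING --noheader --format=%C | awk '{sum+=$1} END {print sum, "cores in use"}'
```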