This commit is contained in:
2021-05-20 18:04:54 +02:00
parent 173759bbf0
commit 42d8f38934
10 changed files with 558 additions and 177 deletions

View File

@@ -0,0 +1,27 @@
---
title: Introduction
#tags:
#keywords:
last_updated: 28 June 2019
#summary: "Merlin 6 cluster overview"
sidebar: merlin6_sidebar
permalink: /merlin6/cluster-introduction.html
---
## Slurm clusters
* The new Slurm CPU cluster is called [**`merlin6`**](/merlin6/cluster-introduction.html).
* The new Slurm GPU cluster is called [**`gmerlin6`**](/gmerlin6/cluster-introduction.html).
* The old Slurm *merlin* cluster is still active, and best-effort support is provided.
  The cluster was renamed [**merlin5**](/merlin5/cluster-introduction.html).
Since July 2019, **`merlin6`** has been the **default Slurm cluster**: any job submitted from the login nodes is sent to this cluster unless another cluster is explicitly specified.
* Users can keep submitting to the old *`merlin5`* computing nodes by using the option ``--cluster=merlin5``.
* Users submitting to the **`gmerlin6`** GPU cluster need to specify the option ``--cluster=gmerlin6``.
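For example, cluster selection at submission time looks like this (`myjob.batch` is a placeholder batch script):

```bash
# Default: jobs are sent to the merlin6 CPU cluster
sbatch myjob.batch

# Submit to the legacy merlin5 CPU cluster
sbatch --clusters=merlin5 myjob.batch

# Submit to the gmerlin6 GPU cluster
sbatch --clusters=gmerlin6 myjob.batch
```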
### Slurm 'merlin6'
**CPU nodes** are configured in a **Slurm** cluster called **`merlin6`**, and
this is the _**default Slurm cluster**_. Hence, if no Slurm cluster is
specified (with the `--cluster` option), jobs are sent to this cluster.

View File

@@ -11,7 +11,12 @@ redirect_from:
- /merlin6/index.html
---
## About Merlin6
## The Merlin local HPC cluster
Historically, the local HPC clusters at PSI were named Merlin. Over the years,
multiple generations of Merlin have been deployed.
### Merlin6
Merlin6 is the official PSI local HPC cluster for development and
mission-critical applications, built in 2019. It replaces
@@ -22,25 +27,26 @@ more compute nodes and cluster storage without significant increase of
the costs of the manpower and the operations.
Merlin6 is mostly based on **CPU** resources, but also contains a small amount
of **GPU**-based resources which are mostly used by the BIO experiments.
of **GPU**-based resources which are mostly used by the BIO Division and Deep Learning projects:
* The Merlin6 CPU nodes are in a dedicated Slurm cluster called [**`merlin6`**](/merlin6/slurm-configuration.html).
* This is the default Slurm cluster configured in the login nodes, and any job submitted without the option `--cluster` will be submitted to this cluster.
* The Merlin6 GPU resources are in a dedicated Slurm cluster called [**`gmerlin6`**](/gmerlin6/slurm-configuration.html).
* Users submitting to the **`gmerlin6`** GPU cluster need to specify the option ``--cluster=gmerlin6``.
### Slurm 'merlin6'
### Merlin5
**CPU nodes** are configured in a **Slurm** cluster called **`merlin6`**, and
this is the _**default Slurm cluster**_. Hence, if no Slurm cluster is
specified (with the `--cluster` option), jobs are sent to this cluster.
The old Slurm **CPU** *merlin* cluster is still active and is maintained on a best-effort basis.
* The Merlin5 CPU cluster is called [**merlin5**](/merlin5/slurm-configuration.html).
## Merlin6 Architecture
## Merlin Architecture
### Merlin6 Cluster Architecture Diagram
The following image shows the Slurm architecture design for the Merlin5 & Merlin6 clusters:
![Merlin6 Slurm Architecture Design]({{ "/images/merlin-slurm-architecture.png" }})
### Merlin6 Architecture Diagram
The following image shows the Merlin6 cluster architecture diagram:
![Merlin6 Architecture Diagram]({{ "/images/merlinschema3.png" }})
### Merlin5 + Merlin6 Slurm Cluster Architecture Design
The following image shows the Slurm architecture design for the Merlin5 & Merlin6 clusters:
![Merlin6 Slurm Architecture Design]({{ "/images/merlin-slurm-architecture.png" }})

View File

@@ -2,31 +2,13 @@
title: Accessing Interactive Nodes
#tags:
#keywords:
last_updated: 13 June 2019
last_updated: 20 May 2021
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/interactive.html
---
## Login nodes description
The Merlin6 login nodes are the official machines for accessing the resources of Merlin6.
From these machines, users can submit jobs to the Slurm batch system, run visualization tools, or compile their software.
The Merlin6 login nodes are the following:
| Hostname | SSH | NoMachine | #Cores | #Threads/Core | CPU | Memory | Scratch | Scratch Mountpoint |
| ------------------- | --- | --------- | ------ |:--------:| :-------------------- | ------ | ---------- | :------------------ |
| merlin-l-001.psi.ch | yes | yes | 2 x 22 | 2 | Intel Xeon Gold 6152 | 384GB | 1.8TB NVMe | ``/scratch`` |
| merlin-l-002.psi.ch | yes | yes | 2 x 22 | 2 | Intel Xeon Gold 6142 | 384GB | 1.8TB NVMe | ``/scratch`` |
| merlin-l-01.psi.ch | yes | - | 2 x 16 | 2 | Intel Xeon E5-2697Av4 | 512GB | 100GB SAS | ``/scratch`` |
---
## Remote Access
### SSH Access
## SSH Access
For interactive command shell access, use an SSH client. We recommend enabling SSH X11 forwarding so that you can use graphical
applications (e.g. a text editor; for more performant graphical access, refer to the sections below). X applications are supported
@@ -38,26 +20,37 @@ in the login nodes and X11 forwarding can be used for those users who have prope
* PSI desktop configuration issues must be addressed through **[PSI Service Now](https://psi.service-now.com/psisp)** as an *Incident Request*.
* Ticket will be redirected to the corresponding Desktop support group (Windows, Linux).
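For example, a typical login with X11 forwarding enabled (using one of the login nodes from the hardware table below):

```bash
# Log in to a Merlin6 login node with X11 forwarding enabled
ssh -X $USER@merlin-l-001.psi.ch

# Trusted X11 forwarding (-Y) is often faster, at the cost of weaker isolation
ssh -Y $USER@merlin-l-001.psi.ch
```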
#### Accessing from a Linux client
### Accessing from a Linux client
Refer to [{Accessing Merlin -> Accessing from Linux Clients}](/merlin6/connect-from-linux.html) for **Linux** SSH client and X11 configuration.
Refer to [{How To Use Merlin -> Accessing from Linux Clients}](/merlin6/connect-from-linux.html) for **Linux** SSH client and X11 configuration.
#### Accessing from a Windows client
### Accessing from a Windows client
Refer to [{Accessing Merlin -> Accessing from Windows Clients}](/merlin6/connect-from-windows.html) for **Windows** SSH client and X11 configuration.
Refer to [{How To Use Merlin -> Accessing from Windows Clients}](/merlin6/connect-from-windows.html) for **Windows** SSH client and X11 configuration.
#### Accessing from a MacOS client
### Accessing from a MacOS client
Refer to [{Accessing Merlin -> Accessing from MacOS Clients}](/merlin6/connect-from-macos.html) for **MacOS** SSH client and X11 configuration.
Refer to [{How To Use Merlin -> Accessing from MacOS Clients}](/merlin6/connect-from-macos.html) for **MacOS** SSH client and X11 configuration.
### Graphical access using **NoMachine** client
## NoMachine Remote Desktop Access
X applications are supported in the login nodes and can run efficiently through a **NoMachine** client. This is the officially supported way to run more demanding X applications on Merlin6. The client software can be downloaded from [the Nomachine Website](https://www.nomachine.com/product&p=NoMachine%20Enterprise%20Client).
X applications are supported in the login nodes and can run efficiently through a **NoMachine** client. This is the officially supported way to run more demanding X applications on Merlin6.
* For PSI Windows workstations, this can be installed from the Software Kiosk as 'NX Client'. If you have difficulties installing, please request support through **[PSI Service Now](https://psi.service-now.com/psisp)** as an *Incident Request*.
* For other workstations, the client software can be downloaded from the [Nomachine Website](https://www.nomachine.com/product&p=NoMachine%20Enterprise%20Client).
* Install the NoMachine client locally. For PSI Windows machines, this can be installed from the Software Kiosk as 'NX Client'. If you have difficulties installing, please request support through **[PSI Service Now](https://psi.service-now.com/psisp)** as an *Incident Request*.
* Configure a new connection in NoMachine to either `merlin-l-001.psi.ch` or `merlin-l-002.psi.ch`. The 'NX' protocol is recommended. Login nodes are available from the PSI network or through VPN.
* You can also connect via the Photon Science Division's `rem-acc.psi.ch` jump point. After connecting you will be presented with options to jump to the Merlin login nodes. This can be accessed remotely without VPN.
* NoMachine *client configuration* and *connectivity* for Merlin6 is fully supported by Merlin6 administrators.
* Please contact us through the official channels on any configuration issue with NoMachine.
### Configuring NoMachine
---
Refer to [{How To Use Merlin -> Remote Desktop Access}](/merlin6/nomachine.html) for further instructions on how to configure the NoMachine client and how to access it from PSI and from outside PSI.
## Login nodes hardware description
The Merlin6 login nodes are the official machines for accessing the resources of Merlin6.
From these machines, users can submit jobs to the Slurm batch system, run visualization tools, or compile their software.
The Merlin6 login nodes are the following:
| Hostname | SSH | NoMachine | #Cores | #Threads/Core | CPU | Memory | Scratch | Scratch Mountpoint |
| ------------------- | --- | --------- | ------ |:--------:| :-------------------- | ------ | ---------- | :------------------ |
| merlin-l-001.psi.ch | yes | yes | 2 x 22 | 2 | Intel Xeon Gold 6152 | 384GB | 1.8TB NVMe | ``/scratch`` |
| merlin-l-002.psi.ch | yes | yes | 2 x 22 | 2 | Intel Xeon Gold 6142 | 384GB | 1.8TB NVMe | ``/scratch`` |
| merlin-l-01.psi.ch | yes | - | 2 x 16 | 2 | Intel Xeon E5-2697Av4 | 512GB | 100GB SAS | ``/scratch`` |

View File

@@ -8,48 +8,46 @@ sidebar: merlin6_sidebar
permalink: /merlin6/slurm-access.html
---
## The Merlin6 Slurm batch system
## The Merlin Slurm clusters
Clusters at PSI use the [Slurm Workload Manager](http://slurm.schedmd.com/) as the batch system technology for managing and scheduling jobs.
Historically, *Merlin4* and *Merlin5* also used Slurm. In the same way, **Merlin6** has also been configured with this batch system.
Merlin contains a multi-cluster setup, where multiple Slurm clusters coexist under the same umbrella.
It basically contains the following clusters:
Slurm has been installed in a **multi-clustered** configuration, allowing multiple clusters to be integrated in the same batch system.
* Two different Slurm clusters exist: **merlin5** and **merlin6**.
* **merlin5** is a cluster with very old hardware (out of warranty).
* **merlin5** will exist as long as hardware incidents remain minor and easy to repair (e.g. hard disk replacement).
* **merlin6** is the default cluster when running Slurm commands (e.g. `sinfo`).
* The **Merlin6 Slurm CPU cluster**, which is called [**`merlin6`**](/merlin6/slurm-access.html#merlin6-cpu-cluster-access).
* The **Merlin6 Slurm GPU cluster**, which is called [**`gmerlin6`**](/merlin6/slurm-access.html#merlin6-gpu-cluster-access).
* The *old Merlin5 Slurm CPU cluster*, which is called [**`merlin5`**](/merlin6/slurm-access.html#merlin5-cpu-cluster-access), still supported on a best-effort basis.
Please follow the section **Merlin6 Slurm** for more details about configuration and job submission.
## Accessing the Slurm clusters
### Merlin5 Access
Any job submission must be performed from a **Merlin login node**. Please refer to the [**Accessing the Interactive Nodes documentation**](/merlin6/interactive.html)
for further information about how to access the cluster.
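From a login node, you can quickly check which Slurm clusters are visible (standard Slurm commands):

```bash
# List all Slurm clusters reachable from the login nodes
sacctmgr show clusters format=Cluster,ControlHost

# Show partition status across every cluster
sinfo --clusters=all
```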
Keeping the **merlin5** cluster allows running jobs on the old computing nodes until users have fully migrated their code to the new cluster.
In addition, any job *must be submitted from a high-performance storage area visible to the login nodes and the computing nodes*. The possible storage areas are the following:
* `/data/user`
* `/data/project`
* `/shared-scratch`
Please avoid using `/psi/home` directories for submitting jobs.
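A minimal sketch of submitting from one of these areas (the project directory is a hypothetical example):

```bash
# Submit from a high-performance storage area visible to all nodes
cd /data/user/$USER/myproject   # hypothetical project directory
sbatch myScript.batch
```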
From July 2019, **merlin6** became the **default cluster**. However, users can keep submitting to the old **merlin5** computing nodes by using
the option ``--cluster=merlin5`` together with the corresponding Slurm partition, ``--partition=merlin``. For example:
### Merlin6 CPU cluster access
```bash
#SBATCH --clusters=merlin6
```
The **Merlin6 CPU cluster** (**`merlin6`**) is the default cluster configured in the login nodes. Any job submission will use this cluster by default, unless
the `--cluster` option specifies one of the other existing clusters.
Example of how to run a simple command:
For further information about how to use this cluster, please visit: [**Merlin6 CPU Slurm Cluster documentation**](/merlin6/slurm-configuration.html).
```bash
srun --clusters=merlin5 --partition=merlin hostname
sbatch --clusters=merlin5 --partition=merlin myScript.batch
```
### Merlin6 GPU cluster access
### Merlin6 Access
The **Merlin6 GPU cluster** (**`gmerlin6`**) is visible from the login nodes. However, to submit jobs to this cluster, one needs to specify the option `--cluster=gmerlin6` when submitting a job or allocation.
In order to run jobs on the **Merlin6** cluster, you need to specify the following option in your batch scripts:
For further information about how to use this cluster, please visit: [**Merlin6 GPU Slurm Cluster documentation**](/gmerlin6/slurm-configuration.html).
```bash
#SBATCH --clusters=merlin6
```
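A minimal batch-script sketch for a GPU job (the partition name and GPU count here are assumptions; check the gmerlin6 documentation for the actual partition layout):

```bash
#!/bin/bash
#SBATCH --clusters=gmerlin6    # GPU jobs must target the gmerlin6 cluster
#SBATCH --partition=gpu        # assumption: adjust to an existing gmerlin6 partition
#SBATCH --gres=gpu:2           # request two GPUs
#SBATCH --time=01:00:00

srun nvidia-smi                # show the allocated GPUs
```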
### Merlin5 CPU cluster access
Example of how to run a simple command:
The **Merlin5 CPU cluster** (**`merlin5`**) is visible from the login nodes. However, to submit jobs
to this cluster, one needs to specify the option `--cluster=merlin5` when submitting a job or allocation.
```bash
srun --clusters=merlin6 hostname
sbatch --clusters=merlin6 myScript.batch
```
Using this cluster is in general not recommended; however, it is still available for existing users needing
extra computational resources or longer jobs. Keep in mind that this cluster is only supported on a
**best-effort basis**, and it contains very old hardware and configurations.
For further information about how to use this cluster, please visit the [**Merlin5 CPU Slurm Cluster documentation**](/merlin5/slurm-configuration.html).

View File

@@ -23,7 +23,7 @@ In this documentation is only explained the usage of the **merlin6** Slurm clust
Basic configuration for the **merlin6 CPUs** cluster will be detailed here.
For advanced usage, please refer to [Understanding the Slurm configuration (for advanced users)](/merlin6/slurm-configuration.html#understanding-the-slurm-configuration-for-advanced-users)
### CPU nodes definition
## Merlin6 CPU nodes definition
The following table shows the default and maximum resources that can be used per node:
@@ -120,77 +120,9 @@ equivalent to 8 exclusive nodes. This limit applies to the **general** partition
For the **hourly** partition, there are no limit restrictions and user limits are removed. Limits are softened for the **daily** partition during non-working
hours, and during the weekend limits are removed.
## Merlin6 GPU
Basic configuration for the **merlin6 GPUs** will be detailed here.
For advanced usage, please refer to [Understanding the Slurm configuration (for advanced users)](/merlin6/slurm-configuration.html#understanding-the-slurm-configuration-for-advanced-users)
### GPU nodes definition
| Nodes | Def.#CPUs | Max.#CPUs | #Threads | Def.Mem/CPU (MB) | Max.Mem/CPU (MB) | Max.Mem/Node (MB) | Max.Swap (MB) | GPU Type | Def.#GPUs | Max.#GPUs |
|:------------------:| ---------:| :--------:| :------: | :---------------:| :---------------:| :----------------:| :------------:| :----------: | :-------: | :-------: |
| merlin-g-[001] | 1 core | 8 cores | 1 | 4000 | 102400 | 102400 | 10000 | **GTX1080** | 1 | 2 |
| merlin-g-[002-005] | 1 core | 20 cores | 1 | 4000 | 102400 | 102400 | 10000 | **GTX1080** | 1 | 4 |
| merlin-g-[006-009] | 1 core | 20 cores | 1 | 4000 | 102400 | 102400 | 10000 | **GTX1080Ti** | 1 | 4 |
| merlin-g-[010-013] | 1 core | 20 cores | 1 | 4000 | 102400 | 102400 | 10000 | **RTX2080Ti** | 1 | 4 |
{{site.data.alerts.tip}}Always check <b>'/etc/slurm/gres.conf'</b> for changes in the GPU type and details of the NUMA node.
{{site.data.alerts.end}}
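For example, the GRES definitions can be inspected directly on a node, or queried through Slurm (the node name is just an example from the table above):

```bash
# On a GPU node: inspect the GRES (GPU) definitions, including NUMA affinity
cat /etc/slurm/gres.conf

# From a login node: query the GRES advertised by a specific node
scontrol show node merlin-g-010 | grep -i gres
```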
### GPU partitions
| GPU Partition | Default Time | Max Time | Max Nodes | Priority | PriorityJobFactor\* |
|:-----------------: | :----------: | :------: | :-------: | :------: | :-----------------: |
| **<u>gpu</u>** | 1 day | 1 week | 4 | low | 1 |
| **gpu-short** | 2 hours | 2 hours | 4 | highest | 1000 |
\*The **PriorityJobFactor** value will be added to the job priority (*PARTITION* column in `sprio -l`). In other words, jobs sent to higher-priority
partitions will usually run first (however, other factors such as **job age** or, mainly, **fair share** might affect that decision). For the GPU
partitions, Slurm will also attempt to allocate jobs on partitions with higher priority before partitions with lower priority.
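The per-factor breakdown for pending jobs can be inspected with:

```bash
# Show the priority factors of pending jobs; the PARTITION column
# reflects the PriorityJobFactor contribution
sprio -l
```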
### User and job limits
The GPU cluster enforces some basic user and job limits to prevent a single user from abusing the resources and to ensure a fair usage of the cluster.
The limits are described below.
#### Per job limits
These are limits applying to a single job. In other words, there is a maximum amount of resources a single job can use.
Limits are defined using QoS, which is usually set at the partition level. Limits are described in the table below with the format `SlurmQoS(limits)`
(the list of possible `SlurmQoS` values can be obtained with the command `sacctmgr show qos`):
| Partition | Mon-Sun 0h-24h |
|:-------------:| :------------------------------------: |
| **gpu** | gpu_week(cpu=40,gres/gpu=8,mem=200G) |
| **gpu-short** | gpu_week(cpu=40,gres/gpu=8,mem=200G) |
With these limits, a single job can not use more than 40 CPUs, more than 8 GPUs, or more than 200GB of memory.
Any job exceeding such limits will stay in the queue with the message **`QOSMax[Cpu|GRES|Mem]PerJob`**.
Since no other QoS temporarily overrides these job limits during the week (as happens, for instance, in the CPU **daily** partition), such a job needs to be cancelled and resubmitted with resource requests adapted to the above limits.
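To check the limits attached to a QoS such as `gpu_week`:

```bash
# Show per-job (MaxTRES) and per-user (MaxTRESPU) limits for each QoS
sacctmgr show qos format=Name,MaxTRES,MaxTRESPU
```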
#### Per user limits for GPU partitions
These limits apply exclusively to users. In other words, there is a maximum amount of resources a single user can use.
Limits are defined using QoS, which is usually set at the partition level. Limits are described in the table below with the format `SlurmQoS(limits)`
(the list of possible `SlurmQoS` values can be obtained with the command `sacctmgr show qos`):
| Partition | Mon-Sun 0h-24h |
|:-------------:| :---------------------------------------------------------: |
| **gpu** | gpu_week(cpu=80,gres/gpu=16,mem=400G) |
| **gpu-short** | gpu_week(cpu=80,gres/gpu=16,mem=400G) |
With these limits, a single user can not use more than 80 CPUs, more than 16 GPUs, or more than 400GB of memory.
Jobs sent by any user already exceeding such limits will stay in the queue with the message **`QOSMax[Cpu|GRES|Mem]PerUser`**. In that case, the job will wait in the queue until some of the running resources are freed.
Notice that user limits are wider than job limits. In this way, a user can run up to two 8-GPU jobs, or up to four 4-GPU jobs, etc.
Please try to avoid occupying all GPUs of the same type for several hours or multiple days, as this would block other users needing the same
type of GPU.
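A specific GPU type can be requested explicitly (type names from the node table above), which helps share heavily used types (a sketch; `myScript.batch` is a placeholder):

```bash
# Request two GTX1080Ti GPUs instead of occupying the scarcer RTX2080Ti nodes
sbatch --partition=gpu --gres=gpu:GTX1080Ti:2 myScript.batch
```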
## Understanding the Slurm configuration (for advanced users)
## Advanced Slurm configuration
Clusters at PSI use the [Slurm Workload Manager](http://slurm.schedmd.com/) as the batch system technology for managing and scheduling jobs.
Historically, *Merlin4* and *Merlin5* also used Slurm. In the same way, **Merlin6** has also been configured with this batch system.
Slurm has been installed in a **multi-clustered** configuration, allowing multiple clusters to be integrated in the same batch system.
For understanding the Slurm configuration setup in the cluster, it may sometimes be useful to check the following files:
@@ -200,5 +132,4 @@ For understanding the Slurm configuration setup in the cluster, sometimes may be
* ``/etc/slurm/cgroup.conf`` - found on the computing nodes; it is also propagated to the login nodes for user read access.
The previous configuration files, which can be found on the login nodes, correspond exclusively to the **merlin6** cluster configuration files.
Configuration files for the old **merlin5** cluster must be checked directly on any of the **merlin5** computing nodes: these are not propagated
to the **merlin6** login nodes.
Configuration files for the old **merlin5** cluster or for the **gmerlin6** cluster must be checked directly on any of the **merlin5** or **gmerlin6** computing nodes (for example, by logging in to one of the nodes while a job or an active allocation is running).
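A sketch of that approach for **merlin5** (partition name taken from the submission examples above):

```bash
# Run a one-off command on a merlin5 node to inspect its Slurm configuration
srun --clusters=merlin5 --partition=merlin cat /etc/slurm/slurm.conf | head -n 20
```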