
---
title: Introduction
keywords: introduction, home, welcome, architecture, design
last_updated: 07 September 2022
sidebar: merlin6_sidebar
permalink: /merlin6/introduction.html
redirect_from:
  - /merlin6
  - /merlin6/index.html
---

# The Merlin local HPC cluster

Historically, the local HPC clusters at PSI were named Merlin. Over the years, multiple generations of Merlin have been deployed.

At present, the Merlin local HPC cluster comprises two generations:

  • the old Merlin5 cluster (merlin5 Slurm cluster), and
  • the newest generation, Merlin6, which is divided into two Slurm clusters:
    • merlin6, the Slurm CPU cluster, and
    • gmerlin6, the Slurm GPU cluster.

Access to the different Slurm clusters is possible from the Merlin login nodes, which can be accessed through the SSH protocol or the NoMachine (NX) service.
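As a sketch, a login could look like the following (the hostname below is illustrative; check the official Merlin access documentation for the actual login node names):

```shell
# Log in to a Merlin login node via SSH with your PSI account
# (replace the hostname with one of the real Merlin login nodes)
ssh $USER@merlin-login.psi.ch
```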

The following image shows the Slurm architecture design for the Merlin5 & Merlin6 (CPU & GPU) clusters:

![Merlin6 Slurm Architecture Design]({{ "/images/merlin-slurm-architecture.png" }})

## Merlin6

Merlin6 is the official PSI local HPC cluster for development and mission-critical applications. It was built in 2019 and replaces the Merlin5 cluster.

Merlin6 is designed to be extensible: it is technically possible to add more compute nodes and cluster storage without significantly increasing manpower or operational costs.

Merlin6 contains all the main services needed to run the cluster, including login nodes, storage, compute nodes and other subservices, all connected to the central PSI IT infrastructure.

### CPU and GPU Slurm clusters

The Merlin6 computing nodes are mostly CPU-based. However, the cluster also contains a small number of GPU resources, which are mostly used by the BIO Division and by Deep Learning projects.

These computational resources are split into two different Slurm clusters:

  • The Merlin6 CPU nodes are in a dedicated Slurm cluster called merlin6.
    • This is the default Slurm cluster configured on the login nodes: any job submitted without the --cluster option will be submitted to this cluster.
  • The Merlin6 GPU resources are in a dedicated Slurm cluster called gmerlin6.
    • Users submitting to the gmerlin6 GPU cluster need to specify the option --cluster=gmerlin6.
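A minimal sketch of submitting the same batch script to each cluster (the script name `myjob.batch` and the `--gpus` request are illustrative):

```shell
# Submit to the default CPU cluster (merlin6); no --cluster option needed
sbatch myjob.batch

# Submit to the GPU cluster: the --cluster option must be given explicitly
sbatch --cluster=gmerlin6 --gpus=1 myjob.batch
```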

## Merlin5

The old Merlin5 Slurm CPU cluster is still active and is maintained on a best-effort basis.

Merlin5 contains only compute node resources, in a dedicated Slurm cluster.

  • The Merlin5 CPU cluster is called merlin5.
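Jobs can be sent to the old cluster in the same way as to gmerlin6, by naming it explicitly (the script name below is illustrative):

```shell
# Submit to the legacy Merlin5 CPU cluster
sbatch --cluster=merlin5 myjob.batch
```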