diff --git a/_data/sidebars/merlin6_sidebar.yml b/_data/sidebars/merlin6_sidebar.yml index 88b1331..d724d24 100644 --- a/_data/sidebars/merlin6_sidebar.yml +++ b/_data/sidebars/merlin6_sidebar.yml @@ -5,22 +5,36 @@ entries: - product: Merlin version: 6 folders: - - title: Introduction + - title: Quick Start Guide # URLs for top-level folders are optional. If omitted it is a bit easier to toggle the accordion. #url: /merlin6/introduction.html folderitems: - - title: Introduction - url: /merlin6/introduction.html - title: Code Of Conduct url: /merlin6/code-of-conduct.html - - title: Hardware And Software Description - url: /merlin6/hardware-and-software.html - - title: Accessing Merlin - folderitems: - title: Requesting Accounts url: /merlin6/request-account.html - title: Requesting Projects url: /merlin6/request-project.html + - title: Slurm CPU 'merlin5' + folderitems: + - title: Introduction + url: /merlin5/introduction.html + - title: Hardware And Software Description + url: /merlin5/hardware-and-software.html + - title: Slurm CPU 'merlin6' + folderitems: + - title: Introduction + url: /merlin6/introduction.html + - title: Hardware And Software Description + url: /merlin6/hardware-and-software.html + - title: Slurm GPU 'gmerlin6' + folderitems: + - title: Introduction + url: /gmerlin6/introduction.html + - title: Hardware And Software Description + url: /gmerlin6/hardware-and-software.html + - title: Accessing Merlin + folderitems: - title: Accessing Interactive Nodes url: /merlin6/interactive.html - title: Accessing from a Linux client diff --git a/_data/topnav.yml b/_data/topnav.yml index 5680e7d..fe1d0c7 100644 --- a/_data/topnav.yml +++ b/_data/topnav.yml @@ -22,3 +22,11 @@ topnav_dropdowns: url: /merlin6/use.html - title: User Guide url: /merlin6/user-guide.html + - title: Slurm + folderitems: + - title: Cluster 'merlin5' + url: /merlin5/slurm-cluster.html + - title: Cluster 'merlin6' + url: /gmerlin6/slurm-cluster.html + - title: Cluster 'gmerlin6' + url: /gmerlin6/slurm-cluster.html diff --git a/pages/gmerlin6/introduction.md b/pages/gmerlin6/introduction.md new file mode 100644 index 0000000..6fc3dff --- /dev/null +++ b/pages/gmerlin6/introduction.md @@ -0,0 +1,47 @@ +--- +title: Cluster 'gmerlin6' +#tags: +#keywords: +last_updated: 07 April 2021 +#summary: "GPU Merlin 6 cluster overview" +sidebar: merlin6_sidebar +permalink: /merlin5/introduction.html +redirect_from: + - /gmerlin6 + - /gmerlin6/index.html +--- + +## Slurm 'merlin5' cluster + +**Merlin5** was the old official PSI Local HPC cluster for development and +mission-critical applications which was built in 2016-2017. It was an +extension of the Merlin4 cluster and built from existing hardware due +to a lack of central investment on Local HPC Resources. **Merlin5** was +then replaced by the **[Merlin6](/merlin6/index.html)** cluster in 2019, +with an important central investment of ~1,5M CHF. **Merlin5** was mostly +based on CPU resources, but also contained a small amount of GPU-based +resources which were mostly used by the BIO experiments. + +**Merlin5** has been kept as a **Local HPC [Slurm](https://slurm.schedmd.com/overview.html) cluster**, +called **`merlin5`**. In that way, the old CPU computing nodes are still available as extra computation resources, +and as an extension of the official production **`merlin6`** [Slurm](https://slurm.schedmd.com/overview.html) cluster. 
+ +The old Merlin5 _**login nodes**_, _**GPU nodes**_ and _**storage**_ were fully migrated to the **[Merlin6](/merlin6/index.html)** +cluster, which becomes the **main Local HPC Cluster**. Hence, **[Merlin6](/merlin6/index.html)** +contains the storage which is mounted on the different Merlin HPC [Slurm](https://slurm.schedmd.com/overview.html) Clusters (`merlin5`, `merlin6`, `gmerlin6`). + +### Submitting jobs to 'merlin5' + +To submit jobs to the **`merlin5`** Slurm cluster, it must be done from the **Merlin6** login nodes by using +the option `--clusters=merlin5` on any of the Slurm commands (`sbatch`, `salloc`, `srun`, etc. commands). + +## The Merlin Architecture + +### Multi Non-Federated Cluster Architecture Design: The Merlin cluster + +The following image shows the Slurm architecture design for Merlin cluster. +It contains a multi non-federated cluster setup, with a central Slurm database +and multiple independent clusters (`merlin5`, `merlin6`, `gmerlin6`): + +![Merlin6 Slurm Architecture Design]({{ "/images/merlin-slurm-architecture.png" }}) + diff --git a/pages/merlin5/hardware-and-software-description.md b/pages/merlin5/hardware-and-software-description.md new file mode 100644 index 0000000..bf06ab9 --- /dev/null +++ b/pages/merlin5/hardware-and-software-description.md @@ -0,0 +1,97 @@ +--- +title: Hardware And Software Description +#tags: +#keywords: +last_updated: 09 April 2021 +#summary: "" +sidebar: merlin6_sidebar +permalink: /merlin5/hardware-and-software.html +--- + +## Hardware + +### Computing Nodes + +Merlin5 is built from recycled nodes, and hardware will be decomissioned as soon as it fails (due to expired warranty and age of the cluster). +* Merlin5 is based on the [**HPE c7000 Enclosure**](https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=c04128339) solution, with 16 x [**HPE ProLiant BL460c Gen8**](https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=c04123239) nodes per chassis. +* Connectivity is based on Infiniband **ConnectX-3 QDR-40Gbps** + * 16 internal ports for intra chassis communication + * 2 connected external ports for inter chassis communication and storage access. + +The below table summarizes the hardware setup for the Merlin5 computing nodes: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+**Merlin5 CPU Computing Nodes**
+
+| Chassis | Node             | Processor          | Sockets | Cores | Threads | Scratch | Memory |
+|---------|------------------|--------------------|---------|-------|---------|---------|--------|
+| #0      | merlin-c-[18-30] | Intel Xeon E5-2670 | 2       | 16    | 1       | 50GB    | 64GB   |
+| #0      | merlin-c-[31,32] | Intel Xeon E5-2670 | 2       | 16    | 1       | 50GB    | 128GB  |
+| #1      | merlin-c-[33-45] | Intel Xeon E5-2670 | 2       | 16    | 1       | 50GB    | 64GB   |
+| #1      | merlin-c-[46,47] | Intel Xeon E5-2670 | 2       | 16    | 1       | 50GB    | 128GB  |
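+
+The nodes listed above can be inspected directly through Slurm from the Merlin6 login nodes. The commands below are a minimal sketch: `sinfo`/`scontrol` and the `--clusters=merlin5` option are standard Slurm usage as described in the introduction page, while the node name `merlin-c-18` is just an illustrative pick from the table.
+
+```bash
+# List the merlin5 nodes with their CPU, memory and state information
+sinfo --clusters=merlin5 --Node --long
+
+# Show the full Slurm definition of a single merlin5 compute node
+scontrol --clusters=merlin5 show node merlin-c-18
+```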
+ +### Login Nodes + +The login nodes are part of the **[Merlin6](/merlin6/introduction.html)** HPC cluster, +and are used to compile and to submit jobs to the different ***Merlin Slurm clusters*** (`merlin5`,`merlin6`,`gmerlin6`,etc.). +Please refer to the **[Merlin6 Hardware Documentation](/merlin6/hardware-and-software.html)** for further information. + +### Storage + +The storage is part of the **[Merlin6](/merlin6/introduction.html)** HPC cluster, +and is mounted in all the ***Slurm clusters*** (`merlin5`,`merlin6`,`gmerlin6`,etc.). +Please refer to the **[Merlin6 Hardware Documentation](/merlin6/hardware-and-software.html)** for further information. + +### Network + +Merlin5 cluster connectivity is based on the [Infiniband QDR](https://en.wikipedia.org/wiki/InfiniBand) technology. +This allows fast access with very low latencies to the data as well as running extremely efficient MPI-based jobs. +However, this is an old version of Infiniband which requires older drivers and software can not take advantage of the latest features. + +## Software + +In Merlin5, we try to keep software stack coherency with the main cluster [Merlin6](/merlin6/index.html). + +Due to this, Merlin5 runs: +* [**RedHat Enterprise Linux 7**](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/7.9_release_notes/index) +* [**Slurm**](https://slurm.schedmd.com/), we usually try to keep it up to date with the most recent versions. +* [**GPFS v5**](https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/ibmspectrumscale502_welcome.html) +* [**MLNX_OFED LTS v.4.9-2.2.4.0**](https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed), which is an old version, but required because **ConnectX-3** support has been dropped on newer OFED versions. diff --git a/pages/merlin5/introduction.md b/pages/merlin5/introduction.md new file mode 100644 index 0000000..40ccfab --- /dev/null +++ b/pages/merlin5/introduction.md @@ -0,0 +1,47 @@ +--- +title: Cluster 'merlin5' +#tags: +#keywords: +last_updated: 07 April 2021 +#summary: "Merlin 5 cluster overview" +sidebar: merlin6_sidebar +permalink: /merlin5/introduction.html +redirect_from: + - /merlin5 + - /merlin5/index.html +--- + +## Slurm 'merlin5' cluster + +**Merlin5** was the old official PSI Local HPC cluster for development and +mission-critical applications which was built in 2016-2017. It was an +extension of the Merlin4 cluster and built from existing hardware due +to a lack of central investment on Local HPC Resources. **Merlin5** was +then replaced by the **[Merlin6](/merlin6/index.html)** cluster in 2019, +with an important central investment of ~1,5M CHF. **Merlin5** was mostly +based on CPU resources, but also contained a small amount of GPU-based +resources which were mostly used by the BIO experiments. + +**Merlin5** has been kept as a **Local HPC [Slurm](https://slurm.schedmd.com/overview.html) cluster**, +called **`merlin5`**. In that way, the old CPU computing nodes are still available as extra computation resources, +and as an extension of the official production **`merlin6`** [Slurm](https://slurm.schedmd.com/overview.html) cluster. + +The old Merlin5 _**login nodes**_, _**GPU nodes**_ and _**storage**_ were fully migrated to the **[Merlin6](/merlin6/index.html)** +cluster, which becomes the **main Local HPC Cluster**. Hence, **[Merlin6](/merlin6/index.html)** +contains the storage which is mounted on the different Merlin HPC [Slurm](https://slurm.schedmd.com/overview.html) Clusters (`merlin5`, `merlin6`, `gmerlin6`). 
+ +### Submitting jobs to 'merlin5' + +To submit jobs to the **`merlin5`** Slurm cluster, it must be done from the **Merlin6** login nodes by using +the option `--clusters=merlin5` on any of the Slurm commands (`sbatch`, `salloc`, `srun`, etc. commands). + +## The Merlin Architecture + +### Multi Non-Federated Cluster Architecture Design: The Merlin cluster + +The following image shows the Slurm architecture design for Merlin cluster. +It contains a multi non-federated cluster setup, with a central Slurm database +and multiple independent clusters (`merlin5`, `merlin6`, `gmerlin6`): + +![Merlin6 Slurm Architecture Design]({{ "/images/merlin-slurm-architecture.png" }}) + diff --git a/pages/merlin6/01 introduction/hardware-and-software-description.md b/pages/merlin6/01 introduction/hardware-and-software-description.md index 4439a8a..139757a 100644 --- a/pages/merlin6/01 introduction/hardware-and-software-description.md +++ b/pages/merlin6/01 introduction/hardware-and-software-description.md @@ -8,104 +8,159 @@ sidebar: merlin6_sidebar permalink: /merlin6/hardware-and-software.html --- -# Hardware And Software Description -{: .no_toc } +## Hardware -## Table of contents -{: .no_toc .text-delta } +### Computing Nodes -1. TOC -{:toc} +The new Merlin6 cluster contains a solution based on **four** [**HPE Apollo k6000 Chassis**](https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=a00016641enw) +* *Three* of them contain 24 x [**HP Apollo XL230K Gen10**](https://h20195.www2.hpe.com/v2/GetDocument.aspx?docname=a00016634enw) blades. +* A *fourth* chassis was purchased on 2021 with [**HP Apollo XL230K Gen10**](https://h20195.www2.hpe.com/v2/GetDocument.aspx?docname=a00016634enw) blades dedicated to few experiments. Blades have slighly different components depending on specific project requirements. ---- +The connectivity for the Merlin6 cluster is based on **ConnectX-5 EDR-100Gbps**, and each chassis contains: +* 1 x [HPE Apollo InfiniBand EDR 36-port Unmanaged Switch](https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=a00016643enw) + * 24 internal EDR-100Gbps ports (1 port per blade for internal low latency connectivity) + * 12 external EDR-100Gbps ports (for external for internal low latency connectivity) -## Computing Nodes + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+**Merlin6 CPU Computing Nodes**
+
+| Chassis | Node              | Processor             | Sockets | Cores | Threads | Scratch | Memory |
+|---------|-------------------|-----------------------|---------|-------|---------|---------|--------|
+| #0      | merlin-c-0[01-24] | Intel Xeon Gold 6152  | 2       | 44    | 2       | 1.2TB   | 384GB  |
+| #1      | merlin-c-1[01-24] | Intel Xeon Gold 6152  | 2       | 44    | 2       | 1.2TB   | 384GB  |
+| #2      | merlin-c-2[01-24] | Intel Xeon Gold 6152  | 2       | 44    | 2       | 1.2TB   | 384GB  |
+| #3      | merlin-c-3[01-06] | Intel Xeon Gold 6240R | 2       | 48    | 2       | 1.2TB   | 384GB  |
+| #3      | merlin-c-3[07-12] | Intel Xeon Gold 6240R | 2       | 48    | 2       | 1.2TB   | 768GB  |
+
+Each blade contains a NVMe disk, where up to 300TB are dedicated to the O.S., and ~1.2TB are reserved for local `/scratch`. -The new Merlin6 cluster contains an homogeneous solution based on *three* HP Apollo k6000 systems. Each HP Apollo k6000 chassis contains 22 HP XL320k Gen10 blades. However, -each chassis can contain up to 24 blades, so is possible to upgradew with up to 2 nodes per chassis. +### Login Nodes -Each HP XL320k Gen 10 blade can contain up to two processors of the latest Intel® Xeon® Scalable Processor family. The hardware and software configuration is the following: -* 3 x HP Apollo k6000 chassis systems, each one: - * 22 x [HP Apollo XL230K Gen10](https://h20195.www2.hpe.com/v2/GetDocument.aspx?docname=a00016634enw), each one: - * 2 x *22 core* [Intel® Xeon® Gold 6152 Scalable Processor](https://ark.intel.com/products/120491/Intel-Xeon-Gold-6152-Processor-30-25M-Cache-2-10-GHz-) (2.10-3.70GHz). - * 12 x 32 GB (384 GB in total) of DDR4 memory clocked 2666 MHz. - * Dual Port !InfiniBand !ConnectX-5 EDR-100Gbps (low latency network); one active port per chassis. - * 1 x 1.6TB NVMe SSD Disk - * ~300GB reserved for the O.S. - * ~1.2TB reserved for local fast scratch ``/scratch``. - * Software: - * RedHat Enterprise Linux 7.6 - * [Slurm](https://slurm.schedmd.com/) v18.08 - * [GPFS](https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/ibmspectrumscale502_welcome.html) v5.0.2 - * 1 x [HPE Apollo InfiniBand EDR 36-port Unmanaged Switch](https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=a00016643enw) - * 24 internal EDR-100Gbps ports (1 port per blade for internal low latency connectivity) - * 12 external EDR-100Gbps ports (for external for internal low latency connectivity) ---- +*One old login node* (``merlin-l-01.psi.ch``) is inherit from the previous Merlin5 cluster. Its mainly use is for running some BIO services (`cryosparc`) and for submitting jobs. +*Two new login nodes* (``merlin-l-001.psi.ch``,``merlin-l-002.psi.ch``) with similar configuration to the Merlin6 computing nodes are available for the users. The mainly use +is for compiling software and submitting jobs. -## Login Nodes +The connectivity is based on **ConnectX-5 EDR-100Gbps** for the new login nodes, and **ConnectIB FDR-56Gbps** for the old one. -### merlin-l-0[1,2] + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+**Merlin6 Login Nodes**
+
+| Hardware | Node             | Processor             | Sockets | Cores | Threads | Scratch | Memory |
+|----------|------------------|-----------------------|---------|-------|---------|---------|--------|
+| Old      | merlin-l-01      | Intel Xeon E5-2697AV4 | 2       | 16    | 2       | 100GB   | 512GB  |
+| New      | merlin-l-00[1,2] | Intel Xeon Gold 6152  | 2       | 44    | 2       | 1.8TB   | 384GB  |
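+
+As a short usage sketch: the hostnames are the ones listed above, `job.sh` is a placeholder batch script, and it is assumed that the other clusters are addressed with the same `--clusters` option documented for `merlin5` in the introduction pages.
+
+```bash
+# Log in to one of the Merlin6 login nodes (PSI account required)
+ssh $USER@merlin-l-001.psi.ch
+
+# Submit a batch script to the older 'merlin5' cluster and check its queue
+sbatch --clusters=merlin5 job.sh
+squeue --clusters=merlin5
+```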
-Two login nodes are inherit from the previous Merlin5 cluster: ``merlin-l-01.psi.ch``, ``merlin-l-02.psi.ch``. The hardware and software configuration is the following: - -* 2 x HP DL380 Gen9, each one: - * 2 x *16 core* [Intel® Xeon® Processor E5-2697AV4 Family](https://ark.intel.com/products/91768/Intel-Xeon-Processor-E5-2697A-v4-40M-Cache-2-60-GHz-) (2.60-3.60GHz) - * Hyper-Threading disabled - * 16 x 32 GB (512 GB in total) of DDR4 memory clocked 2400 MHz. - * Dual Port Infiniband !ConnectIB FDR-56Gbps (low latency network). - * Software: - * RedHat Enterprise Linux 7.6 - * [Slurm](https://slurm.schedmd.com/) v18.08 - * [GPFS](https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/ibmspectrumscale502_welcome.html) v5.0.2 - -### merlin-l-00[1,2] - -Two new login nodes are available in the new cluster: ``merlin-l-001.psi.ch``, ``merlin-l-002.psi.ch``. The hardware and software configuration is the following: - -* 2 x HP DL380 Gen10, each one: - * 2 x *22 core* [Intel® Xeon® Gold 6152 Scalable Processor](https://ark.intel.com/products/120491/Intel-Xeon-Gold-6152-Processor-30-25M-Cache-2-10-GHz-) (2.10-3.70GHz). - * Hyper-threading enabled. - * 24 x 16GB (384 GB in total) of DDR4 memory clocked 2666 MHz. - * Dual Port Infiniband !ConnectX-5 EDR-100Gbps (low latency network). - * Software: - * [NoMachine Terminal Server](https://www.nomachine.com/) - * Currently only on: ``merlin-l-001.psi.ch``. - * RedHat Enterprise Linux 7.6 - * [Slurm](https://slurm.schedmd.com/) v18.08 - * [GPFS](https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/ibmspectrumscale502_welcome.html) v5.0.2 (merlin-l-001) v5.0.3 (merlin-l-002) - ---- - -## Storage +### Storage The storage node is based on the [Lenovo Distributed Storage Solution for IBM Spectrum Scale](https://lenovopress.com/lp0626-lenovo-distributed-storage-solution-for-ibm-spectrum-scale-x3650-m5). -The solution is equipped with 334 x 10TB disks providing a useable capacity of 2.316 PiB (2.608PB). THe overall solution can provide a maximum read performance of 20GB/s. -* 1 x Lenovo DSS G240, composed by: - * 2 x ThinkSystem SR650, each one: - * 2 x Dual Port Infiniband ConnectX-5 EDR-100Gbps (low latency network). - * 2 x Dual Port Infiniband ConnectX-4 EDR-100Gbps (low latency network). - * 1 x ThinkSystem RAID 930-8i 2GB Flash PCIe 12Gb Adapter - * 1 x ThinkSystem SR630 - * 1 x Dual Port Infiniband ConnectX-5 EDR-100Gbps (low latency network). - * 1 x Dual Port Infiniband ConnectX-4 EDR-100Gbps (low latency network). - * 4 x Lenovo Storage D3284 High Density Expansion Enclosure, each one: - * Holds 84 x 3.5" hot-swap drive bays in two drawers. Each drawer has three rows of drives, and each row has 14 drives. - * Each drive bay will contain a 10TB Helium 7.2K NL-SAS HDD. - * 2 x Mellanox SB7800 InfiniBand 1U Switch for High Availability and fast access to the storage with very low latency. Each one: - * 36 EDR-100Gbps ports +* 2 x **Lenovo DSS G240** systems, each one composed by 2 IO Nodes **ThinkSystem SR650** mounting 4 x **Lenovo Storage D3284 High Density Expansion** enclosures. +* Each IO node has a connectivity of 400Gbps (4 x EDR 100Gbps ports, 2 of them are **ConnectX-5** and 2 are **ConnectX-4**). ---- +The storage solution is connected to the HPC clusters through 2 x **Mellanox SB7800 InfiniBand 1U Switches** for high availability and load balancing. -## Network +### Network -Merlin6 cluster connectivity is based on the [Infiniband](https://en.wikipedia.org/wiki/InfiniBand) technology. 
This allows fast access with very low latencies to the data as well as running +Merlin6 cluster connectivity is based on the [**Infiniband**](https://en.wikipedia.org/wiki/InfiniBand) technology. This allows fast access with very low latencies to the data as well as running extremely efficient MPI-based jobs: * Connectivity amongst different computing nodes on different chassis ensures up to 1200Gbps of aggregated bandwidth. * Inter connectivity (communication amongst computing nodes in the same chassis) ensures up to 2400Gbps of aggregated bandwidth. * Communication to the storage ensures up to 800Gbps of aggregated bandwidth. Merlin6 cluster currently contains 5 Infiniband Managed switches and 3 Infiniband Unmanaged switches (one per HP Apollo chassis): -* 1 * MSX6710 (FDR) for connecting old GPU nodes, old login nodes and MeG cluster to the Merlin6 cluster (and storage). No High Availability mode possible. -* 2 * MSB7800 (EDR) for connecting Login Nodes, Storage and other nodes in High Availability mode. -* 3 * HP EDR Unmanaged switches, each one embedded to each HP Apollo k6000 chassis solution. -* 2 * MSB7700 (EDR) are the top switches, interconnecting the Apollo unmanaged switches and the managed switches (MSX6710, MSB7800). +* 1 x **MSX6710** (FDR) for connecting old GPU nodes, old login nodes and MeG cluster to the Merlin6 cluster (and storage). No High Availability mode possible. +* 2 x **MSB7800** (EDR) for connecting Login Nodes, Storage and other nodes in High Availability mode. +* 3 x **HP EDR Unmanaged** switches, each one embedded to each HP Apollo k6000 chassis solution. +* 2 x **MSB7700** (EDR) are the top switches, interconnecting the Apollo unmanaged switches and the managed switches (MSX6710, MSB7800). + +## Software + +In Merlin6, we try to keep the latest software stack release to get the latest features and improvements. Due to this, **Merlin6** runs: +* [**RedHat Enterprise Linux 7**](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/7.9_release_notes/index) +* [**Slurm**](https://slurm.schedmd.com/), we usually try to keep it up to date with the most recent versions. +* [**GPFS v5**](https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/ibmspectrumscale502_welcome.html) +* [**MLNX_OFED LTS v.5.2-2.2.0.0 or newer**](https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed) for all **ConnectX-5** or superior cards. + * [MLNX_OFED LTS v.4.9-2.2.4.0](https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed) is installed for remaining **ConnectX-3** and **ConnectIB** cards.
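+
+A quick way to double-check this stack on a Merlin6 node is to query each component directly. This is only a sketch: tool availability and paths (in particular the GPFS `mm*` utilities under `/usr/lpp/mmfs/bin`) depend on the local installation.
+
+```bash
+# Operating system release
+cat /etc/redhat-release
+
+# Slurm version reported by the client tools
+scontrol --version
+
+# Mellanox OFED driver stack version
+ofed_info -s
+
+# GPFS / IBM Spectrum Scale build
+/usr/lpp/mmfs/bin/mmdiag --version
+```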