Update Merlin6 documentation with latest changes

2019-06-13 08:37:44 +02:00
parent e5d490759b
commit caa63db616
2 changed files with 11 additions and 88 deletions
--- a/pages/merlin6-user-guide/contact.md
+++ b/pages/merlin6-user-guide/contact.md
@@ -2,7 +2,7 @@
 layout: default
 title: Contact
 parent: Merlin6 User Guide
-nav_order: 2
+nav_order: 6
 ---

 # Contact 
--- a/pages/merlin6-user-guide/introduction.md
+++ b/pages/merlin6-user-guide/introduction.md
@@ -18,96 +18,19 @@ nav_order: 1

 ## About Merlin6

-Merlin6 is a the official PSI Local HPC cluster for development and mission-critical applications that has been built in 2019. It replaces the Merlin5 cluster.
+Merlin6 is a the official PSI Local HPC cluster for development and 
+mission-critical applications that has been built in 2019. It replaces 
+the Merlin5 cluster.

-Merlin6 is designed to be extensible, so is technically possible to add more compute nodes and cluster storage without significant increase of the costs of the manpower and 
-the operations.
+Merlin6 is designed to be extensible, so is technically possible to add
+more compute nodes and cluster storage without significant increase of 
+the costs of the manpower and the operations.

-Merlin6 is mostly based on CPU resources, but also contains a small amount of GPU-based resources which are mostly used by the BIO experiments.
+Merlin6 is mostly based on CPU resources, but also contains a small amount 
+of GPU-based resources which are mostly used by the BIO experiments.

 ---

-## Hardware & Software Description
+## Merlin6 

-### Computing Nodes
-
-The new Merlin6 cluster contains an homogeneous solution based on *three* HP Apollo k6000 systems. Each HP Apollo k6000 chassis contains 22 HP XL320k Gen10 blades. However,
-each chassis can contain up to 24 blades, so is possible to upgradew with up to 2 nodes per chassis.
-
-Each HP XL320k Gen 10 blade can contain up to two processors of the latest Intel® Xeon® Scalable Processor family. The hardware and software configuration is the following:
-* 3 x HP Apollo k6000 chassis systems, each one:
-    * 22 x [HP Apollo XL230K Gen10](https://h20195.www2.hpe.com/v2/GetDocument.aspx?docname=a00016634enw), each one:
-        * 2 x *22 core* [Intel® Xeon® Gold 6152 Scalable Processor](https://ark.intel.com/products/120491/Intel-Xeon-Gold-6152-Processor-30-25M-Cache-2-10-GHz-) (2.10-3.70GHz).
-        * 12 x 32 GB (384 GB in total) of DDR4 memory clocked 2666 MHz.
-        * Dual Port !InfiniBand !ConnectX-5 EDR-100Gbps (low latency network); one active port per chassis.
-        * 1 x 1.6TB NVMe SSD Disk
-            * ~300GB reserved for the O.S.
-            * ~1.2TB reserved for local fast scratch ``/scratch``.
-        * Software:
-            * RedHat Enterprise Linux 7.6
-            * [Slurm](https://slurm.schedmd.com/) v18.08
-            * [GPFS](https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/ibmspectrumscale502_welcome.html)  v5.0.2
-    * 1 x [HPE Apollo InfiniBand EDR 36-port Unmanaged Switch](https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=a00016643enw)
-        * 24 internal EDR-100Gbps ports (1 port per blade for internal low latency connectivity)
-        * 12 external EDR-100Gbps ports (for external for internal low latency connectivity)
-
-### Login Nodes
-
-Two login nodes are inherit from the previous Merlin5 cluster: ``merlin-l-01.psi.ch``, ``merlin-l-02.psi.ch``. The hardware and software configuration is the following:
-
-* 2 x HP DL380 Gen9, each one:
-    * 2 x *16 core* [Intel® Xeon® Processor E5-2697AV4 Family](https://ark.intel.com/products/91768/Intel-Xeon-Processor-E5-2697A-v4-40M-Cache-2-60-GHz-) (2.60-3.60GHz)
-        * ``merlin-l-01.psi.ch`` hyper-threading disabled
-        * ``merlin-l-02.psi.ch`` hyper-threading enabled
-    * 16 x 32 GB (512 GB in total) of DDR4 memory clocked 2400 MHz.
-    * Dual Port Infiniband !ConnectIB FDR-56Gbps (low latency network).
-    * Software:
-        * RedHat Enterprise Linux 7.6
-        * [Slurm](https://slurm.schedmd.com/) v18.08
-        * [GPFS](https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/ibmspectrumscale502_welcome.html)  v5.0.2
-
-Two new login nodes are available in the new cluster: ``merlin-l-001.psi.ch``, ``merlin-l-002.psi.ch``. The hardware and software configuration is the following:
-
-* 2 x HP DL380 Gen10, each one:
-    * 2 x *22 core* [Intel® Xeon® Gold 6152 Scalable Processor](https://ark.intel.com/products/120491/Intel-Xeon-Gold-6152-Processor-30-25M-Cache-2-10-GHz-) (2.10-3.70GHz).
-        * Hyper-threading disabled.
-    * 24 x 16GB (384 GB in total) of DDR4 memory clocked 2666 MHz.
-    * Dual Port Infiniband !ConnectX-5 EDR-100Gbps (low latency network).
-    * Software:
-        * [NoMachine Terminal Server](https://www.nomachine.com/)
-            * Currently only on: ``merlin-l-001.psi.ch``.
-        * RedHat Enterprise Linux 7.6
-        * [Slurm](https://slurm.schedmd.com/) v18.08
-        * [GPFS](https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/ibmspectrumscale502_welcome.html)  v5.0.2
-
-### Storage
-
-The storage node is based on the [Lenovo Distributed Storage Solution for IBM Spectrum Scale](https://lenovopress.com/lp0626-lenovo-distributed-storage-solution-for-ibm-spectrum-scale-x3650-m5).
-The solution is equipped with 334 x 10TB disks providing a useable capacity of 2.316 PiB (2.608PB). THe overall solution can provide a maximum read performance of 20GB/s.
-* 1 x Lenovo DSS G240, composed by:
-    * 2 x ThinkSystem SR650, each one:
-        * 2 x Dual Port !Infiniband ConnectX-5 EDR-100Gbps (low latency network).
-        * 2 x Dual Port !Infiniband ConnectX-4 EDR-100Gbps (low latency network).
-        * 1 x ThinkSystem RAID 930-8i 2GB Flash PCIe 12Gb Adapter
-    * 1 x ThinkSystem SR630
-        * 1 x Dual Port !Infiniband ConnectX-5 EDR-100Gbps (low latency network).
-        * 1 x Dual Port !Infiniband ConnectX-4 EDR-100Gbps (low latency network).
-    * 4 x Lenovo Storage D3284 High Density Expansion Enclosure, each one:
-        * Holds 84 x 3.5" hot-swap drive bays in two drawers. Each drawer has three rows of drives, and each row has 14 drives.
-        * Each drive bay will contain a 10TB Helium 7.2K NL-SAS HDD.
-    * 2 x Mellanox SB7800 InfiniBand 1U Switch for High Availability and fast access to the storage with very low latency. Each one:
-        * 36 EDR-100Gbps ports
-
-### Networking
-
-Merlin6 cluster connectivity is based on the [Infiniband](https://en.wikipedia.org/wiki/InfiniBand) technology. This allows fast access with very low latencies to the data as well as running
-extremely efficient MPI-based jobs:
-* Connectivity amongst different computing nodes on different chassis ensures up to 1200Gbps of aggregated bandwidth. 
-* Inter connectivity (communication amongst computing nodes in the same chassis) ensures up to 2400Gbps of aggregated bandwidth. 
-* Communication to the storage ensures up to 800Gbps of aggregated bandwidth.
-
-Merlin6 cluster currently contains 5 Infiniband Managed switches and 3 Infiniband Unmanaged switches (one per HP Apollo chassis):
-* 1 * MSX6710 (FDR) for connecting old GPU nodes, old login nodes and MeG cluster to the Merlin6 cluster (and storage). No High Availability mode possible.
-* 2 * MSB7800 (EDR) for connecting Login Nodes, Storage and other nodes in High Availability mode.
-* 3 * HP EDR Unmanaged switches, each one embedded to each HP Apollo k6000 chassis solution.
-* 2 * MSB7700 (EDR) are the top switches, interconnecting the Apollo unmanaged switches and the managed switches (MSX6710, MSB7800).
+![Merlin6 Architecture](/source/images/merlinschema3.png "Merlin6 Architecture")