diff --git a/pages/merlin7/03-Slurm-General-Documentation/slurm-configuration.md b/pages/merlin7/03-Slurm-General-Documentation/slurm-configuration.md
new file mode 100644
index 0000000..38235c2
--- /dev/null
+++ b/pages/merlin7/03-Slurm-General-Documentation/slurm-configuration.md
@@ -0,0 +1,46 @@
+---
+title: Slurm cluster 'merlin7'
+#tags:
+keywords: configuration, partitions, node definition
+#last_updated: 24 May 2023
+summary: "This document summarizes the Merlin7 configuration."
+sidebar: merlin7_sidebar
+permalink: /merlin7/slurm-configuration.html
+---
+
+![Work In Progress](/images/WIP/WIP1.webp){:style="display:block; margin-left:auto; margin-right:auto"}
+
+{{site.data.alerts.warning}}The Merlin7 documentation is Work In Progress.
+Please do not use or rely on this documentation until it becomes official.
+This applies to any page under https://lsm-hpce.gitpages.psi.ch/merlin7/
+{{site.data.alerts.end}}
+
+This documentation shows the basic Slurm configuration and options needed to run jobs on the Merlin7 cluster.
+
+## Infrastructure
+
+### Hardware
+
+The current configuration for the _preproduction_ phase (and likely the production phase) consists of:
+
+* 92 nodes in total for Merlin7:
+  * 2 CPU-only login nodes
+  * 77 CPU-only compute nodes
+  * 5 GPU A100 nodes
+  * 8 GPU Grace Hopper nodes
+
+The specification of the node types is:
+
+| Node | CPU | RAM | GRES | Notes |
+| ---- | --- | --- | ---- | ----- |
+| Multi-core node | _2x_ AMD EPYC 7742 (x86_64 Rome, 64 Cores, 2.25GHz) | 512GB DDR4 3200MHz | | For both the login and CPU-only compute nodes |
+| A100 node | _2x_ AMD EPYC 7713 (x86_64 Milan, 64 Cores, 3.2GHz) | 512GB DDR4 3200MHz | _4x_ NVIDIA A100 (Ampere, 80GB) | |
+| GH node | _2x_ NVIDIA Grace Neoverse-V2 (SBSA ARM 64bit, 144 Cores, 3.1GHz) | _2x_ 480GB DDR5X (CPU + GPU) | _4x_ NVIDIA GH200 (Hopper, 120GB) | |
+
+### Network
+
+The Merlin7 cluster builds on top of HPE/Cray technologies, including a high-performance network fabric called Slingshot. This network fabric is able
+to provide up to 200 Gbit/s throughput between nodes. Further information on Slingshot can be found on .
+
+Through software interfaces like [libfabric](https://ofiwg.github.io/libfabric/) (which is available on Merlin7), applications can leverage the network seamlessly.
+
diff --git a/pages/merlin7/slurm-configuration.md b/pages/merlin7/slurm-configuration.md
deleted file mode 100644
index a59b6c8..0000000
--- a/pages/merlin7/slurm-configuration.md
+++ /dev/null
@@ -1,35 +0,0 @@
----
-title: Slurm cluster 'merlin7'
-#tags:
-keywords: configuration, partitions, node definition
-last_updated: 24 Mai 2023
-summary: "This document describes a summary of the Merlin7 configuration."
-sidebar: merlin7_sidebar
-permalink: /merlin7/slurm-configuration.html
----
-
-![Work In Progress](/images/WIP/WIP1.webp){:style="display:block; margin-left:auto; margin-right:auto"}
-
-{{site.data.alerts.warning}}The Merlin7 documentation is Work In Progress.
-Please do not use or rely on this documentation until this becomes official.
-This applies to any page under https://lsm-hpce.gitpages.psi.ch/merlin7/
-{{site.data.alerts.end}}
-
-This documentation shows basic Slurm configuration and options needed to run jobs in the Merlin7 cluster.
-
-### Infrastructure
-
-#### Hardware
-
-The current configuration for the _preproduction_ phase is made up as:
-
-* nodes for the _PSI-Dev_ development system
-  * 2 CPU-only login nodes
-  * 77 CPU-only compute nodes
-  * 4 GPU nodes
-
-| Node | CPU | RAM | GRES | Notes |
-| ---- | --- | --- | ---- | ----- |
-| Login node | _2x_ AMD EPYC 7742 (x86_64 Rome, 64 Cores, 3.2GHz) | 512GB DRR4 3200Mhz | | |
-| CPU node | _2x_ AMD EPYC 7742 (x86_64 Rome, 64 Cores, 3.2GHz) | 512GB DRR4 3200Mhz | | |
-| GPU node | _2x_ AMD EPYC 7713 (x86_64 Milan, 64 Cores, 3.2GHz) | 512GB DDR4 3200Mhz | _4x_ NVidia A100 (Ampere, 80GB) | |