Update docs
All checks were successful
Build and Deploy Documentation / build-and-deploy (push) Successful in 6s

2025-07-24 15:27:11 +02:00
parent 4a5d205950
commit 4b9489888c
7 changed files with 115 additions and 28 deletions

View File

@@ -13,7 +13,7 @@ entries:
- title: News
url: /news.html
output: web
-- title: Merlin7 HPC Cluster (W.I.P.)
+- title: Merlin7 HPC Cluster
url: /merlin7/introduction.html
output: web
- title: Merlin6 HPC Cluster

View File

@@ -54,6 +54,10 @@ entries:
url: /merlin7/interactive-jobs.html
- title: Slurm Batch Script Examples
url: /merlin7/slurm-examples.html
+- title: Jupyterhub
+folderitems:
+- title: Jupyterhub service
+url: /merlin7/jupyterhub.html
- title: Software Support
folderitems:
- title: PSI Modules

View File

@@ -25,8 +25,8 @@ The specification of the node types is:
| ----: | ------ | --- | --- | ---- |
| Login Nodes | 2 | _2x_ AMD EPYC 7742 (x86_64 Rome, 64 Cores, 2.25GHz) | 512GB DDR4 3200MHz | |
| CPU Nodes | 77 | _2x_ AMD EPYC 7742 (x86_64 Rome, 64 Cores, 2.25GHz) | 512GB DDR4 3200MHz | |
-| A100 GPU Nodes | 8 | _2x_ AMD EPYC 7713 (x86_64 Milan, 64 Cores, 3.2GHz) | 512GB DDR4 3200MHz | 4 x NV_A100 (80GB) |
-| GH GPU Nodes | 5 | _2x_ NVIDIA Grace Neoverse-V2 (SBSA ARM 64bit, 144 Cores, 3.1GHz) | _2x_ 480GB DDR5X (CPU+GPU) | 4 x NV_GH200 (120GB) |
+| A100 GPU Nodes | 5 | _2x_ AMD EPYC 7713 (x86_64 Milan, 64 Cores, 3.2GHz) | 512GB DDR4 3200MHz | 4 x NV_A100 (80GB) |
+| GH GPU Nodes | 3 | _2x_ NVIDIA Grace Neoverse-V2 (SBSA ARM 64bit, 144 Cores, 3.1GHz) | _2x_ 480GB DDR5X (CPU+GPU) | 4 x NV_GH200 (120GB) |
### Network

View File

@@ -14,11 +14,12 @@ This documentation shows basic Slurm configuration and options needed to run job
### CPU public partitions
-| PartitionName | DefaultTime | MaxTime | Priority | Account | Per Job Limits | Per User Limits |
-| -----------------: | -----------: | ----------: | -------: | ---------------: | -----------------: | -----------------: |
-| **<u>general</u>** | 1-00:00:00 | 7-00:00:00 | Low | <u>merlin</u> | cpu=1024,mem=1920G | cpu=1024,mem=1920G |
-| **daily** | 0-01:00:00 | 1-00:00:00 | Medium | <u>merlin</u> | cpu=1024,mem=1920G | cpu=2048,mem=3840G |
-| **hourly** | 0-00:30:00 | 0-01:00:00 | High | <u>merlin</u> | cpu=2048,mem=3840G | cpu=8192,mem=15T |
+| PartitionName | DefaultTime | MaxTime | Priority | Account | Per Job Limits | Per User Limits |
+| -----------------: | -----------: | ----------: | -------: | ---------------: | --------------------: | --------------------: |
+| **<u>general</u>** | 1-00:00:00 | 7-00:00:00 | Low | <u>merlin</u> | cpu=1024,mem=1920G | cpu=1024,mem=1920G |
+| **daily** | 0-01:00:00 | 1-00:00:00 | Medium | <u>merlin</u> | cpu=1024,mem=1920G | cpu=2048,mem=3840G |
+| **hourly** | 0-00:30:00 | 0-01:00:00 | High | <u>merlin</u> | cpu=2048,mem=3840G | cpu=8192,mem=15T |
+| **interactive** | 0-04:00:00 | 0-12:00:00 | Highest | <u>merlin</u> | cpu=16,mem=30G,node=1 | cpu=32,mem=60G,node=1 |
### GPU public partitions
@@ -110,12 +111,13 @@ various QoS definitions applicable to the merlin7 CPU-based cluster. Here:
* `MaxTRES` specifies resource limits per job.
* `MaxTRESPU` specifies resource limits per user.
-| Name | MaxTRES | MaxTRESPU | Scope |
-| --------------: | -----------------: | -----------------: | ---------------------: |
-| **normal** | | | partition |
-| **cpu_general** | cpu=1024,mem=1920G | cpu=1024,mem=1920G | <u>user</u>, partition |
-| **cpu_daily** | cpu=1024,mem=1920G | cpu=2048,mem=3840G | partition |
-| **cpu_hourly** | cpu=2048,mem=3840G | cpu=8192,mem=15T | partition |
+| Name | MaxTRES | MaxTRESPU | Scope |
+| -------------------: | --------------------: | --------------------: | ---------------------: |
+| **normal** | | | partition |
+| **cpu_general** | cpu=1024,mem=1920G | cpu=1024,mem=1920G | <u>user</u>, partition |
+| **cpu_daily** | cpu=1024,mem=1920G | cpu=2048,mem=3840G | partition |
+| **cpu_hourly** | cpu=2048,mem=3840G | cpu=8192,mem=15T | partition |
+| **cpu_interactive** | cpu=16,mem=30G,node=1 | cpu=32,mem=60G,node=1 | partition |
Where:
* **`normal` QoS:** This QoS has no limits and is typically applied to partitions that do not require user or job
@@ -127,6 +129,8 @@ Where:
with higher resource needs.
* **`cpu_hourly` QoS:** Offers the least constraints, allowing more resources to be used for the `hourly` partition,
which caters to very short-duration jobs.
+* **`cpu_interactive` QoS:** Is restricted to a single node and a few CPUs only, and is intended for interactive
+allocations (`salloc`, `srun`).
For additional details, refer to the [CPU partitions](/merlin7/slurm-configuration.html#CPU-partitions) section.
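To verify the QoS limits currently in effect, the Slurm accounting database can be queried directly. A minimal sketch, assuming the standard `sacctmgr` tool is available on the login nodes:
```bash
# List all QoS definitions with their per-job (MaxTRES) and per-user (MaxTRESPU) limits.
sacctmgr show qos format=Name%20,MaxTRES%30,MaxTRESPU%30
```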
@@ -155,11 +159,12 @@ Always verify partition configurations for potential changes using the <b>'scon
#### CPU public partitions
-| PartitionName | DefaultTime | MaxTime | TotalNodes | PriorityJobFactor | PriorityTier | QoS | AllowAccounts |
-| -----------------: | -----------: | ----------: | --------: | ----------------: | -----------: | ----------: | -------------: |
-| **<u>general</u>** | 1-00:00:00 | 7-00:00:00 | 50 | 1 | 1 | cpu_general | <u>merlin</u> |
-| **daily** | 0-01:00:00 | 1-00:00:00 | 62 | 500 | 1 | cpu_daily | <u>merlin</u> |
-| **hourly** | 0-00:30:00 | 0-01:00:00 | 77 | 1000 | 1 | cpu_hourly | <u>merlin</u> |
+| PartitionName | DefaultTime | MaxTime | TotalNodes | PriorityJobFactor | PriorityTier | QoS | AllowAccounts |
+| -----------------: | -----------: | ----------: | --------: | ----------------: | -----------: | --------------: | -------------: |
+| **<u>general</u>** | 1-00:00:00 | 7-00:00:00 | 46 | 1 | 1 | cpu_general | <u>merlin</u> |
+| **daily** | 0-01:00:00 | 1-00:00:00 | 58 | 500 | 1 | cpu_daily | <u>merlin</u> |
+| **hourly** | 0-00:30:00 | 0-01:00:00 | 77 | 1000 | 1 | cpu_hourly | <u>merlin</u> |
+| **interactive** | 0-04:00:00 | 0-12:00:00 | 58 | 1 | 2 | cpu_interactive | <u>merlin</u> |
All Merlin users are part of the `merlin` account, which is used as the *default account* when submitting jobs.
Similarly, if no partition is specified, jobs are automatically submitted to the `general` partition by default.
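As an illustrative sketch (not one of the official examples), a batch script can state these defaults explicitly, which also makes it easy to switch partition or account later:
```bash
#!/bin/bash
#SBATCH --account=merlin     # the default account, stated here for clarity
#SBATCH --partition=general  # the default partition when none is specified
#SBATCH --time=0-01:00:00    # must stay within the partition's MaxTime
#SBATCH --ntasks=1
srun hostname
```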
@@ -174,6 +179,20 @@ The **`hourly`** partition may include private nodes as an additional buffer. Ho
by **`PriorityTier`**, ensures that jobs submitted to private partitions are prioritized and processed first. As a result, access to the
**`hourly`** partition might experience delays in such scenarios.
+The **`interactive`** partition is designed specifically for real-time, interactive work. Here are the key characteristics:
+* **CPU Oversubscription:** This partition allows CPU oversubscription (configured as `FORCE:4`), meaning that up to four interactive
+jobs may share the same physical CPU core. This can impact performance, but enables fast access for short-term tasks.
+* **Highest Scheduling Priority:** Jobs submitted to the interactive partition are always prioritized. They will be scheduled
+before any jobs in other partitions.
+* **Intended Use:** This partition is ideal for debugging, testing, compiling, short interactive runs, and other activities where
+immediate access is important.
+{{site.data.alerts.warning}}
+Because of CPU sharing, the performance on the <b>interactive</b> partition may not be optimal for compute-intensive tasks.
+For long-running or production workloads, use a dedicated batch partition instead.
+{{site.data.alerts.end}}
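A minimal sketch of how such an interactive session might be requested; the resource values below are illustrative and must stay within the per-user limits listed above:
```bash
# Request a small interactive allocation on the 'interactive' partition.
salloc --partition=interactive --ntasks=1 --cpus-per-task=4 --mem=8G --time=0-02:00:00
# Within the allocation, commands run on the compute node through srun:
srun hostname
# The current partition settings can be verified at any time with:
scontrol show partition interactive
```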
#### CPU private partitions
##### CAS / ASA

View File

@@ -8,13 +8,6 @@ sidebar: merlin7_sidebar
permalink: /merlin7/slurm-examples.html
---
-![Work In Progress](/images/WIP/WIP1.webp){:style="display:block; margin-left:auto; margin-right:auto"}
-{{site.data.alerts.warning}}The Merlin7 documentation is <b>Work In Progress</b>.
-Please do not use or rely on this documentation until this becomes official.
-This applies to any page under <b><a href="https://hpce.pages.psi.ch/merlin7/">https://hpce.pages.psi.ch/merlin7/</a></b>
-{{site.data.alerts.end}}
## Single core based job examples
```bash

View File

@@ -0,0 +1,71 @@
---
title: Jupyterhub on Merlin7
#tags:
keywords: jupyterhub, jupyter, jupyterlab, notebook, notebooks
last_updated: 24 July 2025
summary: "Jupyterhub service description"
sidebar: merlin7_sidebar
permalink: /merlin7/jupyterhub.html
---
Jupyterhub provides [Jupyter notebooks](https://jupyter.org/) that are launched on
the Merlin cluster nodes and can be accessed through a web portal.
## Accessing Jupyterhub and launching a session
The service is available inside the PSI network (or through a VPN connection) at
**<https://merlin7-jupyter01.psi.ch:8000/hub/>**
1. **Login**: You will be presented with a **Login** web page for
authenticating with your PSI account.
1. **Spawn job**: The **Spawner Options** page allows you to
specify the properties (Slurm partition, running time, ...) of
the batch job that will run your Jupyter notebook. Once
you click on the `Spawn` button, your job will be sent to the
Slurm batch system. If the cluster is not currently overloaded
and the resources you requested are available, your job will
usually start within 30 seconds.
### Recommended partitions
On the `merlin7` cluster, using the `interactive` partition generally
guarantees fast access to resources. Keep in mind that this partition
has a time limit of 12 hours.
## Requesting additional resources
The **Spawner Options** page covers the most common options. These are used to
create a submission script for the Jupyterhub job and submit it to the Slurm
queue. Additional customization can be applied through the *'Optional user
defined line to be added to the batch launcher script'* option. This line is
added to the submission script after the other `#SBATCH` lines. Parameters can
be passed to Slurm by starting the line with `#SBATCH`, as in [Running Slurm
Scripts](/merlin7/running-jobs.html). Some ideas:
**Request additional memory**
```bash
#SBATCH --mem=100G
```
**Request multiple GPUs** (gpu partition only)
```bash
#SBATCH --gpus=2
```
**Log additional information**
```bash
hostname; date; echo $USER
```
Output is found in `~/jupyterhub_batchspawner_<jobid>.log`.
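For example (a sketch, not an official procedure), the job id of a running session can be looked up with `squeue` and the log followed live:
```bash
squeue -u $USER        # note the job id of the running Jupyter session
tail -f ~/jupyterhub_batchspawner_<jobid>.log   # replace <jobid> with that id
```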
## Contact
In case of problems or requests, please either submit a **[PSI Service
Now](https://psi.service-now.com/psisp)** incident containing *"Merlin
Jupyterhub"* as part of the subject, or contact us by mail through
<merlin-admins@lists.psi.ch>.

View File

@@ -33,12 +33,12 @@ Basic contact information is also displayed on every shell login to the system u
## Get updated through the Merlin User list!
-Is strictly recommended that users subscribe to the Merlin Users mailing list: **<merlin7-users@lists.psi.ch>**
+It is strongly recommended that users subscribe to the Merlin Users mailing list: **<merlin-users@lists.psi.ch>**
This mailing list is the official channel used by Merlin administrators to inform users about downtimes,
interventions or problems. Users can subscribe in two ways:
-* *(Preferred way)* Self-registration through **[Sympa](https://psilists.ethz.ch/sympa/info/merlin7-users)**
+* *(Preferred way)* Self-registration through **[Sympa](https://psilists.ethz.ch/sympa/info/merlin-users)**
* If you need to subscribe many people (e.g. your whole group), send a request to the admin list **<merlin-admins@lists.psi.ch>**
providing a list of email addresses.