first stab at mkdocs migration

refactor CSCS and Meg content

add merlin6 quick start

update merlin6 nomachine docs

give the userdoc its own color scheme

we use the Materials default one

refactored slurm general docs merlin6

add merlin6 JB docs

add software support m6 docs

add all files to nav

vibed changes #1

add missing pages

further vibing #2

vibe #3

further fixes
This commit is contained in:
2025-11-26 17:28:07 +01:00
parent 149de6fb18
commit bde174b726
313 changed files with 2608 additions and 11593 deletions

View File

@@ -0,0 +1,57 @@
# PSI HPC@CSCS
PSI has a long-standing collaboration with CSCS for offering high-end
HPC resources to PSI projects. PSI co-invested in CSCS' initial
Cray XT3 supercomputer *Horizon* in 2005, and we continue to procure a share on the
CSCS flagship systems.
The share is intended for projects that by their nature cannot profit
from applying for the regular [CSCS user lab allocation
schemes](https://www.cscs.ch/user-lab/allocation-schemes).
We can also help PSI groups to procure additional resources under the PSI
conditions - please contact us in such a case.
## Yearly survey for requesting a project on the PSI share
At the end of each year we run a survey and notify all subscribed
users of the dedicated **PSI HPC@CSCS mailing list** (see below) and the
Merlin cluster lists, asking them to enter their resource requests for the
following year. Projects receive resources in the form of allocations over the four quarters of the
following year.
The project requests are reviewed and may be adapted to fit into the
available capacity.
The survey is done through ServiceNow; please navigate to
[Home > Service Catalog > Research Computing > Apply for computing resources at CSCS](https://psi.service-now.com/psisp?id=psi_new_sc_cat_item&sys_id=8d14bd1e4f9c7b407f7660fe0310c7e9)
and submit the form.
Applications will be reviewed and the final resource allocations, in case of
oversubscription, will be arbitrated by a panel within CSD.
### Instructions for filling out the 2026 survey
* We have a budget of 100 kCHF for 2026, which translates to 435'000 multicore node hours or 35'600 node hours on the GPU Grace Hopper nodes.
* Multicore projects: the minimum allocation is 10'000 node hours; an average project allocation amounts to 30'000 node hours.
* GPU projects: the minimum allocation is 800 node hours; an average project allocation is 2000 node hours.
* You need to specify the total resource request for your project in node hours, and how you would like to split the resources over the 4 quarters. For the allocations per quarter, please enter the numbers in percent (e.g. 25%, 25%, 25%, 25%). If you indicate nothing, 25% per quarter will be assumed.
* We currently have a total of 65 TB of storage for all projects. Additional storage
can be obtained, but large storage assignments are not in scope for these projects.
## CSCS Systems reference information
For 2025 we can offer access to [CSCS Alps](https://www.cscs.ch/computers/alps) Eiger (CPU multicore) and Daint (GPU) systems.
* [CSCS User Portal](https://user.cscs.ch/)
* Documentation
* [CSCS Eiger CPU multicore cluster](https://docs.cscs.ch/clusters/eiger/)
* [CSCS Daint GPU cluster](https://docs.cscs.ch/clusters/daint/)
## Contact information
* PSI Contacts:
* Mailing list contact: <psi-hpc-at-cscs-admin@lists.psi.ch>
* Marc Caubet Serrabou <marc.caubet@psi.ch>
* Derek Feichtinger <derek.feichtinger@psi.ch>
* Mailing list for receiving user notifications and survey information: psi-hpc-at-cscs@lists.psi.ch [(subscribe)](https://psilists.ethz.ch/sympa/subscribe/psi-hpc-at-cscs)

View File

@@ -0,0 +1,41 @@
# Transferring Data
This document shows how to transfer data between PSI and CSCS by using a Linux workstation.
## Preparing SSH configuration
If the directory **`.ssh`** does not exist in your home directory, create it with **`0700`** permissions:
```bash
mkdir ~/.ssh
chmod 0700 ~/.ssh
```
Then create the file **`.ssh/config`** if it does not already exist, or add the following lines
to the existing file, replacing **`$cscs_accountname`** with your CSCS `username`:
```bash
Host daint.cscs.ch
Compression yes
ProxyJump ela.cscs.ch
Host *.cscs.ch
User $cscs_accountname
```
### Advanced SSH configuration
Many other SSH settings are available for advanced configurations.
Users who already have an existing configuration may need to adapt it accordingly.
## Transferring files
Once the above configuration is in place, you can `rsync` between Merlin and CSCS in either direction:
```bash
# CSCS -> PSI
rsync -azv daint.cscs.ch:<source_path> <destination_path>
# PSI -> CSCS
rsync -azv <source_path> daint.cscs.ch:<destination_path>
```
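To verify that the SSH configuration works as expected, a quick connection test can be run first (a minimal sketch, assuming the configuration above and an active CSCS account):
```bash
# Should print the remote hostname after jumping through ela.cscs.ch
ssh daint.cscs.ch hostname
```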

View File

@@ -0,0 +1,46 @@
---
title: Introduction
#tags:
#keywords:
last_updated: 28 June 2019
#summary: "GPU Merlin 6 cluster overview"
sidebar: merlin6_sidebar
permalink: /gmerlin6/cluster-introduction.html
---
## About Merlin6 GPU cluster
### Introduction
Merlin6 is the official PSI Local HPC cluster for development and
mission-critical applications. It was built in 2019 and replaces
the Merlin5 cluster.
Merlin6 is designed to be extensible, so it is technically possible to add
more compute nodes and cluster storage without a significant increase in
manpower and operational costs.
Merlin6 is mostly based on **CPU** resources, but also contains a small amount
of **GPU**-based resources which are mostly used by the BIO experiments.
### Slurm 'gmerlin6'
The **GPU nodes** have a dedicated **Slurm** cluster, called **`gmerlin6`**.
This cluster has access to the same shared storage resources (`/data/user`, `/data/project`, `/shared-scratch`, `/afs`, `/psi/home`)
which are present in the other Merlin Slurm clusters (`merlin5`,`merlin6`). The Slurm `gmerlin6` cluster is maintained
independently to ease access for the users and keep independent user accounting.
## Merlin6 Architecture
### Merlin6 Cluster Architecture Diagram
The following image shows the Merlin6 cluster architecture diagram:
![Merlin6 Architecture Diagram](../images/merlinschema3.png)
### Merlin5 + Merlin6 Slurm Cluster Architecture Design
The following image shows the Slurm architecture design for the Merlin5 & Merlin6 clusters:
![Merlin6 Slurm Architecture Design](../images/merlin-slurm-architecture.png)

View File

@@ -0,0 +1,151 @@
---
title: Hardware And Software Description
#tags:
#keywords:
last_updated: 19 April 2021
#summary: ""
sidebar: merlin6_sidebar
permalink: /gmerlin6/hardware-and-software.html
---
## Hardware
### GPU Computing Nodes
The Merlin6 GPU cluster was initially built from recycled workstations from different groups in the BIO division.
Since then it has gradually been extended with new nodes funded by sporadic investments from the same division; a large central investment was never possible.
As a result, the Merlin6 GPU computing cluster is a non-homogeneous system consisting of a wide variety of hardware types and components.
In 2018, BIO decided to open the cluster to Merlin users and make it widely accessible to PSI scientists.
The below table summarizes the hardware setup for the Merlin6 GPU computing nodes:
<table>
<thead>
<tr>
<th scope='colgroup' style="vertical-align:middle;text-align:center;" colspan="9">Merlin6 GPU Computing Nodes</th>
</tr>
<tr>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Node</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Processor</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Sockets</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Cores</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Threads</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Scratch</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Memory</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">GPUs</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">GPU Model</th>
</tr>
</thead>
<tbody>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-001</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/content/www/us/en/ark/products/82930/intel-core-i7-5960x-processor-extreme-edition-20m-cache-up-to-3-50-ghz.html">Intel Core i7-5960X</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">16</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1.8TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">128GB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">GTX1080</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-00[2-5]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/content/www/us/en/ark/products/92984/intel-xeon-processor-e5-2640-v4-25m-cache-2-40-ghz.html">Intel Xeon E5-2640</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">20</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1.8TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">128GB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">4</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">GTX1080</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-006</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/content/www/us/en/ark/products/92984/intel-xeon-processor-e5-2640-v4-25m-cache-2-40-ghz.html">Intel Xeon E5-2640</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">20</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">800GB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">128GB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">4</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">GTX1080Ti</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-00[7-9]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/content/www/us/en/ark/products/92984/intel-xeon-processor-e5-2640-v4-25m-cache-2-40-ghz.html">Intel Xeon E5-2640</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">20</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">3.5TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">128GB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">4</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">GTX1080Ti</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-01[0-3]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/content/www/us/en/ark/products/197098/intel-xeon-silver-4210r-processor-13-75m-cache-2-40-ghz.html">Intel Xeon Silver 4210R</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">20</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1.7TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">128GB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">4</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">RTX2080Ti</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-014</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://www.intel.com/content/www/us/en/products/sku/199343/intel-xeon-gold-6240r-processor-35-75m-cache-2-40-ghz/specifications.html?wapkw=Intel(R)%20Xeon(R)%20Gold%206240R%20CP">Intel Xeon Gold 6240R</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">48</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2.9TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">384GB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">8</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">RTX2080Ti</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-015</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://www.intel.com/content/www/us/en/products/sku/215279/intel-xeon-gold-5318s-processor-36m-cache-2-10-ghz/specifications.html">Intel(R) Xeon Gold 5318S</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">48</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2.9TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">384GB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">8</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">RTX A5000</td>
</tr>
</tbody>
</table>
### Login Nodes
The login nodes are part of the **[Merlin6](../merlin6/cluster-introduction.md)** HPC cluster,
and are used to compile and to submit jobs to the different ***Merlin Slurm clusters*** (`merlin5`,`merlin6`,`gmerlin6`,etc.).
Please refer to the **[Merlin6 Hardware Documentation](../merlin6/hardware-and-software-description.md)** for further information.
### Storage
The storage is part of the **[Merlin6](../merlin6/cluster-introduction.md)** HPC cluster,
and is mounted in all the ***Slurm clusters*** (`merlin5`,`merlin6`,`gmerlin6`,etc.).
Please refer to the **[Merlin6 Hardware Documentation](../merlin6/hardware-and-software-description.md)** for further information.
### Network
The Merlin6 cluster connectivity is based on the [Infiniband FDR and EDR](https://en.wikipedia.org/wiki/InfiniBand) technologies.
This allows fast access with very low latencies to the data as well as running extremely efficient MPI-based jobs.
The network speed (56Gbps for **FDR**, 100Gbps for **EDR**) of a machine can be checked by running the following command on the node:
```bash
ibstat | grep Rate
```
## Software
On the Merlin6 GPU computing nodes, we try to keep the software stack coherent with the main [Merlin6](../merlin6/index.md) cluster.
Hence, the Merlin6 GPU nodes run:
* [**RedHat Enterprise Linux 7**](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/7.9_release_notes/index)
* [**Slurm**](https://slurm.schedmd.com/), we usually try to keep it up to date with the most recent versions.
* [**GPFS v5**](https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/ibmspectrumscale502_welcome.html)
* [**MLNX_OFED LTS v.5.2-2.2.0.0 or newer**](https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed) for all **ConnectX-4** or superior cards.

View File

@@ -0,0 +1,268 @@
---
title: Slurm cluster 'gmerlin6'
#tags:
keywords: configuration, partitions, node definition, gmerlin6
last_updated: 29 January 2021
summary: "This document describes a summary of the Slurm 'gmerlin6' configuration."
sidebar: merlin6_sidebar
permalink: /gmerlin6/slurm-configuration.html
---
This documentation shows basic Slurm configuration and options needed to run jobs in the GPU cluster.
## Merlin6 GPU nodes definition
The table below shows a summary of the hardware setup for the different GPU nodes
| Nodes | Def.#CPUs | Max.#CPUs | #Threads | Def.Mem/CPU | Max.Mem/CPU | Max.Mem/Node | Max.Swap | GPU Type | Def.#GPUs | Max.#GPUs |
|:------------------:| ---------:| :--------:| :------: | :----------:| :----------:| :-----------:| :-------:| :--------: | :-------: | :-------: |
| merlin-g-[001] | 1 core | 8 cores | 1 | 5120 | 102400 | 102400 | 10000 | **geforce_gtx_1080** | 1 | 2 |
| merlin-g-[002-005] | 1 core | 20 cores | 1 | 5120 | 102400 | 102400 | 10000 | **geforce_gtx_1080** | 1 | 4 |
| merlin-g-[006-009] | 1 core | 20 cores | 1 | 5120 | 102400 | 102400 | 10000 | **geforce_gtx_1080_ti** | 1 | 4 |
| merlin-g-[010-013] | 1 core | 20 cores | 1 | 5120 | 102400 | 102400 | 10000 | **geforce_rtx_2080_ti** | 1 | 4 |
| merlin-g-014 | 1 core | 48 cores | 1 | 5120 | 360448 | 360448 | 10000 | **geforce_rtx_2080_ti** | 1 | 8 |
| merlin-g-015 | 1 core | 48 cores | 1 | 5120 | 360448 | 360448 | 10000 | **A5000** | 1 | 8 |
| merlin-g-100 | 1 core | 128 cores | 2 | 3900 | 998400 | 998400 | 10000 | **A100** | 1 | 8 |
!!! tip
Always check `/etc/slurm/gres.conf` and `/etc/slurm/slurm.conf` for changes in the GPU type and details of the hardware.
## Running jobs in the 'gmerlin6' cluster
In this chapter we will cover basic settings that users need to specify in order to run jobs in the GPU cluster.
### Merlin6 GPU cluster
To run jobs in the **`gmerlin6`** cluster users **must** specify the cluster name in Slurm:
```bash
#SBATCH --cluster=gmerlin6
```
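The same `--clusters` option also works with informational commands, which is a quick way to check the GPU cluster from a login node (a short sketch, assuming you already have access to the login nodes):
```bash
sinfo --clusters=gmerlin6             # list partitions and node states of the GPU cluster
squeue --clusters=gmerlin6 -u $USER   # show your pending and running jobs in 'gmerlin6'
```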
### Merlin6 GPU partitions
Users might need to specify the Slurm partition. If no partition is specified, it will default to **`gpu`**:
```bash
#SBATCH --partition=<partition_name> # Possible <partition_name> values: gpu, gpu-short, gwendolen
```
The table below shows all partitions available to users:
| GPU Partition | Default Time | Max Time | PriorityJobFactor\* | PriorityTier\*\* |
|:---------------------: | :----------: | :--------: | :-----------------: | :--------------: |
| `gpu` | 1 day | 1 week | 1 | 1 |
| `gpu-short` | 2 hours | 2 hours | 1000 | 500 |
| `gwendolen` | 30 minutes | 2 hours | 1000 | 1000 |
| `gwendolen-long`\*\*\* | 30 minutes | 8 hours | 1 | 1 |
\*The **PriorityJobFactor** value will be added to the job priority (*PARTITION* column in `sprio -l`). In other words, jobs sent to higher priority
partitions will usually run first (however, other factors such as **job age** or, mainly, **fair share** might affect that decision). For the GPU
partitions, Slurm will also attempt first to allocate jobs on partitions with higher priority over partitions with lower priority.
\*\*Jobs submitted to a partition with a higher **PriorityTier** value will be dispatched before pending jobs in partitions with lower *PriorityTier* values
and, if possible, they will preempt running jobs from partitions with lower *PriorityTier* values.
\*\*\***gwendolen-long** is a special partition which is enabled during non-working hours only. As of _Nov 2023_, the current policy is to disable this partition from Mon to Fri, from 1am to 5pm. However, jobs can be submitted anytime; they can only be scheduled outside this time range.
### Merlin6 GPU Accounts
Users need to ensure that the public **`merlin`** account is specified. Not specifying any account option will default to this account.
This is mostly relevant for users with multiple Slurm accounts, who may specify a different account by mistake.
```bash
#SBATCH --account=merlin # Possible values: merlin, gwendolen
```
Not all accounts can be used on all partitions. This is summarized in the table below:
| Slurm Account | Slurm Partitions |
|:-------------------: | :------------------: |
| **`merlin`** | **`gpu`**,`gpu-short` |
| `gwendolen` | `gwendolen`,`gwendolen-long` |
By default, all users belong to the `merlin` Slurm account, and jobs are submitted to the `gpu` partition when no partition is defined.
Users only need to specify the `gwendolen` account when using the `gwendolen` or `gwendolen-long` partitions; otherwise, specifying an account is not needed (it will always default to `merlin`).
#### The 'gwendolen' account
For running jobs in the **`gwendolen`/`gwendolen-long`** partitions, users must specify the **`gwendolen`** account.
The `merlin` account is not allowed to use the Gwendolen partitions.
Gwendolen is restricted to a set of users belonging to the **`unx-gwendolen`** Unix group. If you belong to a project allowed to use **Gwendolen**, or you are a user who would like to have access to it, please request access to the **`unx-gwendolen`** Unix group through [PSI Service Now](https://psi.service-now.com/): the request will be redirected to the person responsible for the project (Andreas Adelmann).
### Slurm GPU specific options
Some options are available when using GPUs. These are detailed here.
#### Number of GPUs and type
When using the GPU cluster, users **must** specify the number of GPUs they need to use:
```bash
#SBATCH --gpus=[<type>:]<number>
```
The GPU type is optional: if left empty, Slurm will try to allocate any type of GPU.
The different `[<type>:]` values and the `<number>` of GPUs depend on the node,
as detailed in the table below.
| Nodes | GPU Type | #GPUs |
|:---------------------: | :-----------------------: | :---: |
| **merlin-g-[001]** | **`geforce_gtx_1080`** | 2 |
| **merlin-g-[002-005]** | **`geforce_gtx_1080`** | 4 |
| **merlin-g-[006-009]** | **`geforce_gtx_1080_ti`** | 4 |
| **merlin-g-[010-013]** | **`geforce_rtx_2080_ti`** | 4 |
| **merlin-g-014** | **`geforce_rtx_2080_ti`** | 8 |
| **merlin-g-015** | **`A5000`** | 8 |
| **merlin-g-100** | **`A100`** | 8 |
#### Constraint / Features
Instead of specifying the GPU **type**, users sometimes need to **select GPUs by the amount of memory available on the GPU** card itself.
This is defined in Slurm through **Features**, tags which describe the GPU memory of the different GPU cards.
Users can select the required GPU memory size with the `--constraint` option. In that case, notice that *in many cases
there is no need to specify `[<type>:]`* in the `--gpus` option.
```bash
#SBATCH --constraint=<Feature> # Possible values: gpumem_8gb, gpumem_11gb, gpumem_24gb, gpumem_40gb
```
The table below shows the available **Features** and which GPU card models and GPU nodes they belong to:
<table>
<thead>
<tr>
<th scope='colgroup' style="vertical-align:middle;text-align:center;" colspan="3">Merlin6 GPU Computing Nodes</th>
</tr>
<tr>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Nodes</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">GPU Type</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Feature</th>
</tr>
</thead>
<tbody>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-[001-005]</b></td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1">`geforce_gtx_1080`</td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1"><b>`gpumem_8gb`</b></td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-[006-009]</b></td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1">`geforce_gtx_1080_ti`</td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="2"><b>`gpumem_11gb`</b></td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-[010-014]</b></td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1">`geforce_rtx_2080_ti`</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-015</b></td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1">`A5000`</td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1"><b>`gpumem_24gb`</b></td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-100</b></td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1">`A100`</td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1"><b>`gpumem_40gb`</b></td>
</tr>
</tbody>
</table>
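For instance, a job that needs two GPUs with at least 11GB of memory, regardless of the exact card model, could combine the options as in this short sketch (the resource numbers are only an example):
```bash
#SBATCH --cluster=gmerlin6
#SBATCH --gpus=2                  # no [<type>:] prefix needed here
#SBATCH --constraint=gpumem_11gb  # matches the GTX 1080 Ti and RTX 2080 Ti nodes
```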
#### Other GPU options
Alternative Slurm options for GPU-based jobs are available. Please refer to the **man** pages
of each Slurm command for further information (`man salloc`, `man sbatch`, `man srun`).
Below are listed the most common settings:
```bash
#SBATCH --hint=[no]multithread
#SBATCH --ntasks=<ntasks>
#SBATCH --ntasks-per-gpu=<ntasks>
#SBATCH --mem-per-gpu=<size[units]>
#SBATCH --cpus-per-gpu=<ncpus>
#SBATCH --gpus-per-node=[<type>:]<number>
#SBATCH --gpus-per-socket=[<type>:]<number>
#SBATCH --gpus-per-task=[<type>:]<number>
#SBATCH --gpu-bind=[verbose,]<type>
```
Please notice that if `[<type>:]` is defined in one option, all other GPU options must use it too.
#### Dealing with Hyper-Threading
The **`gmerlin6`** cluster contains the partitions `gwendolen` and `gwendolen-long`, which have a node with Hyper-Threading enabled.
In that case, one should always specify whether to use Hyper-Threading or not. If not defined, Slurm will
generally use it (exceptions apply). For this machine, generally HT is recommended.
```bash
#SBATCH --hint=multithread # Use extra threads with in-core multi-threading.
#SBATCH --hint=nomultithread # Don't use extra threads with in-core multi-threading.
```
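As a quick reference, the options described above can be combined into a single batch script. The following is a minimal sketch; the job name, resource numbers and the `my_gpu_app` command are placeholders and should be adapted to your workload:
```bash
#!/bin/bash
#SBATCH --job-name=gpu-test            # placeholder job name
#SBATCH --cluster=gmerlin6             # GPU Slurm cluster
#SBATCH --partition=gpu                # public GPU partition
#SBATCH --account=merlin               # default public account
#SBATCH --gpus=geforce_rtx_2080_ti:2   # two GPUs of a specific type
#SBATCH --cpus-per-gpu=4               # CPU cores per GPU
#SBATCH --mem-per-gpu=20G              # memory per GPU
#SBATCH --time=08:00:00                # well within the 1 week limit of 'gpu'

# Load your environment here if needed (placeholder)

srun my_gpu_app                        # placeholder for the actual program
```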
## User and job limits
The GPU cluster enforces some basic user and job limits to prevent a single user from abusing the resources and to ensure fair usage of the cluster.
The limits are described below.
### Per job limits
These limits apply to a single job. In other words, there is a maximum amount of resources a single job can use.
Limits are defined using QoS, and this is usually set at the partition level. Limits are described in the table below with the format: `SlurmQoS(limits)`
(possible `SlurmQoS` values can be listed with the command `sacctmgr show qos`):
| Partition | Slurm Account | Mon-Sun 0h-24h |
|:------------------:| :------------: | :------------------------------------------: |
| **gpu** | **`merlin`** | gpu_week(gres/gpu=8) |
| **gpu-short** | **`merlin`** | gpu_week(gres/gpu=8) |
| **gwendolen** | `gwendolen` | No limits |
| **gwendolen-long** | `gwendolen` | No limits, active from 9pm to 5:30am |
* With the limits in the public `gpu` and `gpu-short` partitions, a single job using the `merlin` account
(default account) can not use more than 40 CPUs, 8 GPUs or 200GB of memory.
Any job exceeding such limits will stay in the queue with the message **`QOSMax[Cpu|GRES|Mem]PerJob`**.
Since there is no other QoS that temporarily overrides the job limits during the week (as happens, for
instance, in the CPU **daily** partition), the job needs to be cancelled and the requested resources
must be adapted to the above resource limits.
* The **gwendolen** and **gwendolen-long** partitions are two special partitions for an **[NVIDIA DGX A100](https://www.nvidia.com/en-us/data-center/dgx-a100/)** machine.
Only users belonging to the **`unx-gwendolen`** Unix group can run in these partitions. No limits are applied (machine resources can be completely used).
* The **`gwendolen-long`** partition is available 24h. However,
* from 5:30am to 9pm the partition is `down` (jobs can be submitted, but can not run until the partition is set to `active`).
* from 9pm to 5:30am jobs are allowed to run (partition is set to `active`).
### Per user limits for GPU partitions
These limits apply exclusively to users. In other words, there is a maximum of resources a single user can use.
Limits are defined using QoS, and this is usually set at the partition level. Limits are described in the table below with the format: `SlurmQoS(limits)`
(possible `SlurmQoS` values can be listed with the command `sacctmgr show qos`):
| Partition | Slurm Account | Mon-Sun 0h-24h |
|:------------------:| :----------------: | :---------------------------------------------: |
| **gpu** | **`merlin`** | gpu_week(gres/gpu=16) |
| **gpu-short** | **`merlin`** | gpu_week(gres/gpu=16) |
| **gwendolen** | `gwendolen` | No limits |
| **gwendolen-long** | `gwendolen` | No limits, active from 9pm to 5:30am |
* With the limits in the public `gpu` and `gpu-short` partitions, a single user can not use more than 80 CPUs, 16 GPUs or 400GB of memory.
Jobs sent by any user already exceeding such limits will stay in the queue with the message **`QOSMax[Cpu|GRES|Mem]PerUser`**.
In that case, the jobs will wait in the queue until some of the running resources are freed.
* Notice that the user limits are wider than the job limits. In this way, a user can run up to two 8-GPU jobs, up to four 4-GPU jobs, etc.
Please try to avoid occupying all GPUs of the same type for several hours or multiple days, as this would block other users needing the same
type of GPU.
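To see the exact limits enforced by the QoS mentioned in the tables above, the QoS can be queried directly. A small sketch (the chosen format fields are an assumption and may vary between Slurm versions):
```bash
# Show per-job (MaxTRES) and per-user (MaxTRESPU) limits of the 'gpu_week' QoS
sacctmgr show qos gpu_week format=name,maxtres,maxtrespu
```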
## Advanced Slurm configuration
Clusters at PSI use the [Slurm Workload Manager](http://slurm.schedmd.com/) as the batch system technology for managing and scheduling jobs.
Slurm has been installed in a **multi-clustered** configuration, allowing the integration of multiple clusters in the same batch system.
To understand the Slurm configuration of the cluster, it may sometimes be useful to check the following files:
* ``/etc/slurm/slurm.conf`` - can be found in the login nodes and computing nodes.
* ``/etc/slurm/gres.conf`` - can be found in the GPU nodes, is also propagated to login nodes and computing nodes for user read access.
* ``/etc/slurm/cgroup.conf`` - can be found in the computing nodes, is also propagated to login nodes for user read access.
The configuration files found on the login nodes correspond exclusively to the **merlin6** cluster.
Configuration files for the old **merlin5** cluster or for the **gmerlin6** cluster must be checked directly on one of the **merlin5** or **gmerlin6** computing nodes (for example, by logging in to one of the nodes while a job or an active allocation is running).
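For example, the `gmerlin6` configuration can be read directly on a GPU node through a short interactive job (a sketch; the requested resources are arbitrary and only serve to land on a GPU node):
```bash
srun --clusters=gmerlin6 --partition=gpu-short --account=merlin --gpus=1 \
    cat /etc/slurm/slurm.conf
```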

*(Binary image files added in this commit, including `docs/images/WIP/WIP1.jpeg`, `docs/images/WIP/WIP1.webp`, `docs/images/favicon.ico`, `docs/images/front_page.png`, `docs/images/hpce_logo.png`, `docs/images/psi-logo.png`, `docs/images/slurm/scom.gif`, `docs/images/slurm/sview.png`, and further image files whose paths are not shown; previews omitted.)*

26
docs/index.md Normal file
View File

@@ -0,0 +1,26 @@
---
hide:
- navigation
- toc
---
# HPCE User Documentation
![The HPCE clusters](images/front_page.png){ width="500" }
/// caption
The magical trio 🪄
///
The [HPCE
group](https://www.psi.ch/en/awi/high-performance-computing-and-emerging-technologies-group)
is part of the [PSI Center for Scientific Computing, Theory and
Data](https://www.psi.ch/en/csd) at [Paul Scherrer
Institute](https://www.psi.ch). It provides a range of HPC services for PSI
scientists, such as the Merlin series of HPC clusters, and also engages in
research on the technologies used on these systems, such as data analysis
and machine learning.
## Quick Links
- user support
- news

51
docs/meg/contact.md Normal file
View File

@@ -0,0 +1,51 @@
# Support
Support can be requested through:
* [PSI Service Now](https://psi.service-now.com/psisp)
* E-Mail: <meg-admins@lists.psi.ch>
Basic contact information is also displayed on every shell login to the system
using the *Message of the Day* mechanism.
## PSI Service Now
**[PSI Service Now](https://psi.service-now.com/psisp)**: is the official PSI tool for opening incident requests. However, contact via email (see below) is preferred.
* PSI HelpDesk will redirect the incident to the corresponding department, or
* you can always assign it directly by checking the box `I know which service
is affected` and providing the service name `Local HPC Resources (e.g. MEG)
[CF]` (just type in `Local` and you should get the valid completions).
## Contact Meg Administrators
**E-Mail <meg-admins@lists.psi.ch>** or **<merlin-admins@lists.psi.ch>**
* This is the preferred way to contact the MEG administrators.
Do not hesitate to contact us with questions or problems.
---
## Get updated through the Merlin User list
It is strongly recommended that users subscribe to the Merlin Users mailing list:
**<merlin-users@lists.psi.ch>**
This mailing list is the official channel used by Merlin administrators to
inform users about downtimes, interventions or problems. Users can be
subscribed in two ways:
* *(Preferred way)* Self-registration through **[Sympa](https://psilists.ethz.ch/sympa/info/merlin-users)**
* If you need to subscribe many people (e.g. your whole group), send a request to the admin list **<merlin-admins@lists.psi.ch>**
providing a list of email addresses.
---
## The MEG Cluster Team
The PSI Merlin and MEG clusters are managed by the **[High Performance
Computing and Emerging Technologies
Group](https://www.psi.ch/de/lsm/hpce-group)**, which is part of the [Science
IT Infrastructure and Services department (AWI)](https://www.psi.ch/en/awi) in
PSI's [Center for Scientific Computing, Theory and Data
(CSD)](https://www.psi.ch/en/csd).

13
docs/meg/index.md Normal file
View File

@@ -0,0 +1,13 @@
# The MEG local HPC cluster
> The MEG II collaboration includes almost 70 physicists from research
> institutions from five countries. Researchers and technicians from PSI have
> played a leading role, particularly with providing the high-quality beam,
> technical support in the detector integration, and in the design, construction,
> and operation of the detector readout electronics.
>
> —— [Source](https://www.psi.ch/en/cnm/news/in-search-of-new-physics-new-result-from-the-meg-ii-collaboration)
The MEG data analysis cluster is tightly coupled to Merlin and
dedicated to the analysis of data from the MEG experiment. It is operated for the
Muon Physics group.

View File

@@ -0,0 +1,200 @@
# Meg to Merlin7 Migration Guide
Welcome to the official documentation for migrating experiment data from **MEG** to **Merlin7**. Please follow the instructions carefully to ensure a smooth and secure transition.
---
## Directory Structure Changes
### Meg vs Merlin6 vs Merlin7
| Cluster | Home Directory | User Data Directory | Experiment data | Additional notes |
| ------- | :----------------- | :------------------ | --------------------- | ---------------- |
| merlin6 | /psi/home/`$USER` | /data/user/`$USER` | /data/experiments/meg | Symlink /meg |
| meg | /meg/home/`$USER` | N/A | /meg | |
| merlin7 | /data/user/`$USER` | /data/user/`$USER` | /data/project/meg | |
* The **Merlin6 home and user data directories have been merged** into the single new home directory `/data/user/$USER` on Merlin7.
* The same applies to the home directory on the Meg cluster, which also has to be merged into `/data/user/$USER` on Merlin7.
* Users are responsible for moving the data.
* The **experiment directory has been integrated into `/data/project/meg`**.
### Recommended Cleanup Actions
* Remove unused files and datasets.
* Archive large, inactive data sets.
### Mandatory Actions
* Stop activity on Meg and Merlin6 when performing the last rsync.
## Migration Instructions
### Preparation
The `experiment_migration.setup` migration script must be executed from **any MeG node** using the account that will perform the migration.
#### When using the local `root` account
* The script **must be executed after every reboot** of the destination nodes.
* **Reason:** On Merlin7, the home directory for the `root` user resides on ephemeral storage (no physical disk).
After a reboot, this directory is cleaned, so **SSH keys need to be redeployed** before running the migration again.
#### When using a PSI Active Directory (AD) account
* Applicable accounts include, for example:
* `gac-meg2_data`
* `gac-meg2`
* The script only needs to be executed **once**, provided that:
* The home directory for the AD account is located on a shared storage area.
* This shared storage is accessible from the node executing the transfer.
* **Reason:** On Merlin7, these accounts have their home directories on persistent shared storage, so the SSH keys remain available across reboots.
To run it:
```bash
experiment_migration.setup
```
This script will:
* Check that you have an account on Merlin7.
* Configure and check that your environment is ready for transferring files via Slurm job.
If there are issues, the script will:
* Print clear diagnostic output
* Give you some hints to resolve the issue
If you are stuck, email: [merlin-admins@lists.psi.ch](mailto:merlin-admins@lists.psi.ch)/[meg-admins@lists.psi.ch](mailto:meg-admins@lists.psi.ch)
### Migration Procedure
1. **Run an initial sync**, ideally within a `tmux` session (see the `tmux` sketch below)
* This copies the bulk of the data from MeG to Merlin7.
* **IMPORTANT: Do not modify the destination directories**
* Please, before starting the transfer ensure that:
* The source and destination directories are correct.
* The destination directories exist.
2. **Run additional syncs if needed**
* Subsequent syncs can be executed to transfer changes.
* Ensure that **only one sync for the same directory runs at a time**.
* Multiple syncs are often required since the first one may take several hours or even days.
3. Schedule a date for the final migration:
* Any activity must be stopped on the source directory.
* In the same way, no activity must be done on the destination until the migration is complete.
4. **Perform a final sync with the `-E` option** (if it applies)
* Use `-E` **only if you need to delete files on the destination that were removed from the source.**
* This ensures the destination becomes an exact mirror of the source.
* **Never use `-E` after the destination has gone into production**, as it will delete new data created there.
5. Disable access on the source folder.
6. Enable access on the destination folder.
* At this point, **no new syncs have to be performed.**
!!! note "Important"
The `-E` option is destructive; handle with care.
Always verify that the destination is ready before triggering the final sync.
For optimal performance, use up to 12 threads with the -t option.
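Because the initial sync may run for hours or days, running it inside `tmux` keeps it alive if your SSH session drops. A minimal sketch (the session name is arbitrary):
```bash
tmux new -s meg-migration                 # start a named tmux session
experiment_migration.bash -p "online"     # run the sync inside the session
# Detach with Ctrl-b d; reattach later with:
tmux attach -t meg-migration
```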
#### Running The Migration Script
The migration script is installed on the `meg-s-001` server at:
`/usr/local/bin/experiment_migration.bash`
This script is primarily a **wrapper** around `fpsync`, providing additional logic for synchronizing MeG experiment data.
```bash
[root@meg-s-001 ~]# experiment_migration.bash --help
Usage: /usr/local/bin/experiment_migration.bash [options] -p <project_name>
Options:
-t | --threads N Number of parallel threads (default: 10). Recommended 12 as max.
-b | --experiment-src-basedir DIR Experiment base directory (default: /meg)
-S | --space-source SPACE Source project space name (default: data1)
-B | --experiment-dst-basedir DIR Experiment base directory (default: /data/project/meg)
-D | --space-destination SPACE Destination project space name (default: data1)
-p | --project-name PRJ_NAME Mantadory field. MeG project name. Examples:
- 'online'
- 'offline'
- 'shared'
-F | --force-destination-mkdir Create the destination parent directory (default: false)
Example: mkdir -p $(dirname /data/project/meg/data1/PROJECT_NAME)
Result: mkdir -p /data/project/meg/data1
-s | --split N Number of files per split (default: 20000)
-f | --filesize SIZE File size threshold (default: 100G)
-r | --runid ID Reuse an existing runid session
-l | --list-runids List available runid sessions and exit
-x | --delete-runid Delete runid. Requires: -r | --runid ID
-E | --rsync-delete-option [WARNING] Use this to delete files in the destination
which are not present in the source any more.
[WARNING] USE THIS OPTION CAREFULLY!
Typically used in last rsync to have an exact
mirror of the source directory.
[WARNING] Some files in destination might be deleted!
Use 'man fpsync' for more information.
-h | --help Show this help message
-v | --verbose Run fpsync with -v option
```
!!! tip
Defaults can be updated if necessary.
#### Migration examples
##### Example: Migrating the Entire `online` Directory
The following example demonstrates how to migrate the **entire `online`** directory.
!!! tip
You may also choose to migrate only specific subdirectories if needed.
However, migrating full directories is generally **simpler** and **less
error-prone** compared to handling multiple subdirectory migrations.
```bash
[root@meg-s-001 ~]# experiment_migration.bash -S data1 -D data1 -p "online"
🔄 Transferring project:
From: /meg/data1/online
To: login001.merlin7.psi.ch:/data/project/meg/data1/online
Threads: 10 | Split: 20000 files | Max size: 100G
RunID:
Please confirm to start (y/N):
❌ Transfer cancelled by user.
```
##### Example: Migrating a Specific Subdirectory
The following example demonstrates how to migrate **only a subdirectory**. In this case, we use the option `-F` to create the parent directory in the destination, ensuring that it exists before transferring:
⚠️ **Important:**
* When migrating a subdirectory, **do not** run concurrent migrations on its parent directories.
* For example, avoid running migrations with `-p "shared"` while simultaneously migrating `-p "shared/subprojects"`.
```bash
[root@meg-s-001 ~]# experiment_migration.bash -p "shared/subprojects/meg1" -F
🔄 Transferring project:
From: /meg/data1/shared/subprojects/meg1
To: login002.merlin7.psi.ch:/data/project/meg/data1/shared/subprojects/meg1
Threads: 10 | Split: 20000 files | Max size: 100G
RunID:
Please confirm to start (y/N): N
❌ Transfer cancelled by user.
```
This command initiates the migration of the directory, creating the destination parent directory (`-F` option):
* Creates the destination directory as follows:
```bash
ssh login002.merlin7.psi.ch mkdir -p /data/project/meg/data1/shared/subprojects
```
* Runs `fpsync` with 10 threads, splitting the transfer into parts of at most 20000 files or 100G each:
* Source: `/meg/data1/shared/subprojects/meg1`
* Destination: `login002.merlin7.psi.ch:/data/project/meg/data1/shared/subprojects/meg1`
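After the final sync, a rough consistency check can be done by comparing file counts on the source and the destination (a sketch run from `meg-s-001`; it does not replace proper checksum verification):
```bash
# Count files on the source and on the destination of the example above
find /meg/data1/shared/subprojects/meg1 -type f | wc -l
ssh login002.merlin7.psi.ch \
    "find /data/project/meg/data1/shared/subprojects/meg1 -type f | wc -l"
```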

View File

@@ -0,0 +1,44 @@
---
title: Cluster 'merlin5'
#tags:
#keywords:
last_updated: 07 April 2021
#summary: "Merlin 5 cluster overview"
sidebar: merlin6_sidebar
permalink: /merlin5/cluster-introduction.html
---
## Slurm 'merlin5' cluster
**Merlin5** was the old official PSI Local HPC cluster for development and
mission-critical applications, built in 2016-2017. It was an
extension of the Merlin4 cluster and built from existing hardware due
to a lack of central investment in Local HPC resources. **Merlin5** was
then replaced by the **[Merlin6](../merlin6/cluster-introduction.md)** cluster in 2019,
with an important central investment of ~1.5M CHF. **Merlin5** was mostly
based on CPU resources, but also contained a small amount of GPU-based
resources which were mostly used by the BIO experiments.
**Merlin5** has been kept as a **Local HPC [Slurm](https://slurm.schedmd.com/overview.html) cluster**,
called **`merlin5`**. In that way, the old CPU computing nodes are still available as extra computation resources,
and as an extension of the official production **`merlin6`** [Slurm](https://slurm.schedmd.com/overview.html) cluster.
The old Merlin5 _**login nodes**_, _**GPU nodes**_ and _**storage**_ were fully migrated to the **[Merlin6](../merlin6/index.md)**
cluster, which became the **main Local HPC cluster**. Hence, **[Merlin6](../merlin6/index.md)**
contains the storage which is mounted on the different Merlin HPC [Slurm](https://slurm.schedmd.com/overview.html) clusters (`merlin5`, `merlin6`, `gmerlin6`).
### Submitting jobs to 'merlin5'
Jobs must be submitted to the **`merlin5`** Slurm cluster from the **Merlin6** login nodes, using
the option `--clusters=merlin5` with any of the Slurm commands (`sbatch`, `salloc`, `srun`, etc.).
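For example (the job script name is a placeholder):
```bash
sbatch --clusters=merlin5 my_job.sh    # submit a script to the 'merlin5' cluster
squeue --clusters=merlin5 -u $USER     # check your jobs in the 'merlin5' queue
```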
## The Merlin Architecture
### Multi Non-Federated Cluster Architecture Design: The Merlin cluster
The following image shows the Slurm architecture design for Merlin cluster.
It contains a multi non-federated cluster setup, with a central Slurm database
and multiple independent clusters (`merlin5`, `merlin6`, `gmerlin6`):
![Merlin6 Slurm Architecture Design](../images/merlin-slurm-architecture.png)

View File

@@ -0,0 +1,97 @@
---
title: Hardware And Software Description
#tags:
#keywords:
last_updated: 09 April 2021
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin5/hardware-and-software.html
---
## Hardware
### Computing Nodes
Merlin5 is built from recycled nodes, and hardware will be decommissioned as soon as it fails (due to expired warranty and the age of the cluster).
* Merlin5 is based on the [**HPE c7000 Enclosure**](https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=c04128339) solution, with 16 x [**HPE ProLiant BL460c Gen8**](https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=c04123239) nodes per chassis.
* Connectivity is based on Infiniband **ConnectX-3 QDR-40Gbps**
* 16 internal ports for intra chassis communication
* 2 connected external ports for inter chassis communication and storage access.
The below table summarizes the hardware setup for the Merlin5 computing nodes:
<table>
<thead>
<tr>
<th scope='colgroup' style="vertical-align:middle;text-align:center;" colspan="8">Merlin5 CPU Computing Nodes</th>
</tr>
<tr>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Chassis</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Node</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Processor</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Sockets</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Cores</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Threads</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Scratch</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Memory</th>
</tr>
</thead>
<tbody>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="2"><b>#0</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-c-[18-30]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="2"><a href="https://ark.intel.com/content/www/us/en/ark/products/64595/intel-xeon-processor-e5-2670-20m-cache-2-60-ghz-8-00-gt-s-intel-qpi.html">Intel Xeon E5-2670</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="2">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="2">16</td>
<td style="vertical-align:middle;text-align:center;" rowspan="2">1</td>
<td style="vertical-align:middle;text-align:center;" rowspan="2">50GB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">64GB</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td rowspan="1"><b>merlin-c-[31,32]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>128GB</b></td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="2"><b>#1</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-c-[33-45]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="2"><a href="https://ark.intel.com/content/www/us/en/ark/products/64595/intel-xeon-processor-e5-2670-20m-cache-2-60-ghz-8-00-gt-s-intel-qpi.html">Intel Xeon E5-2670</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="2">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="2">16</td>
<td style="vertical-align:middle;text-align:center;" rowspan="2">1</td>
<td style="vertical-align:middle;text-align:center;" rowspan="2">50GB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">64GB</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td rowspan="1"><b>merlin-c-[46,47]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>128GB</b></td>
</tr>
</tbody>
</table>
### Login Nodes
The login nodes are part of the **[Merlin6](../merlin6/index.md)** HPC cluster,
and are used to compile and to submit jobs to the different ***Merlin Slurm clusters*** (`merlin5`,`merlin6`,`gmerlin6`,etc.).
Please refer to the **[Merlin6 Hardware Documentation](../merlin6/hardware-and-software-description.md)** for further information.
### Storage
The storage is part of the **[Merlin6](../merlin6/index.md)** HPC cluster,
and is mounted in all the ***Slurm clusters*** (`merlin5`,`merlin6`,`gmerlin6`,etc.).
Please refer to the **[Merlin6 Hardware Documentation](../merlin6/hardware-and-software-description.md)** for further information.
### Network
Merlin5 cluster connectivity is based on the [Infiniband QDR](https://en.wikipedia.org/wiki/InfiniBand) technology.
This allows fast access with very low latencies to the data as well as running extremely efficient MPI-based jobs.
However, this is an old generation of Infiniband which requires older drivers, and the software can not take advantage of the latest features.
## Software
In Merlin5, we try to keep the software stack coherent with the main [Merlin6](../merlin6/index.md) cluster.
Hence, Merlin5 runs:
* [**RedHat Enterprise Linux 7**](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/7.9_release_notes/index)
* [**Slurm**](https://slurm.schedmd.com/), we usually try to keep it up to date with the most recent versions.
* [**GPFS v5**](https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/ibmspectrumscale502_welcome.html)
* [**MLNX_OFED LTS v.4.9-2.2.4.0**](https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed), which is an old version, but required because **ConnectX-3** support has been dropped on newer OFED versions.

View File

@@ -0,0 +1,142 @@
---
title: Slurm Configuration
#tags:
keywords: configuration, partitions, node definition
last_updated: 20 May 2021
summary: "This document describes a summary of the Merlin5 Slurm configuration."
sidebar: merlin6_sidebar
permalink: /merlin5/slurm-configuration.html
---
This documentation shows basic Slurm configuration and options needed to run jobs in the Merlin5 cluster.
The Merlin5 cluster is an old cluster with old hardware, maintained on a best-effort basis to increase the CPU capacity of the Merlin cluster.
## Merlin5 CPU nodes definition
The following table shows the default and maximum resources that can be used per node:
| Nodes | Def.#CPUs | Max.#CPUs | #Threads | Max.Mem/Node | Max.Swap |
|:----------------:| ---------:| :--------:| :------: | :----------: | :-------:|
| merlin-c-[18-30] | 1 core | 16 cores | 1 | 60000 | 10000 |
| merlin-c-[31-32] | 1 core | 16 cores | 1 | 124000 | 10000 |
| merlin-c-[33-45] | 1 core | 16 cores | 1 | 60000 | 10000 |
| merlin-c-[46-47] | 1 core | 16 cores | 1 | 124000 | 10000 |
There is one *main difference between the Merlin5 and Merlin6 clusters*: Merlin5 keeps an old configuration which does not
treat memory as a *consumable resource*. Hence, users can *oversubscribe* memory. This may trigger side effects, but
this legacy configuration has been kept so that old jobs keep running in the same way they did a few years ago.
If you know that this might be a problem for you, please always use Merlin6 instead.
## Running jobs in the 'merlin5' cluster
In this chapter we will cover basic settings that users need to specify in order to run jobs in the Merlin5 CPU cluster.
### Merlin5 CPU cluster
To run jobs in the **`merlin5`** cluster users **must** specify the cluster name in Slurm:
```bash
#SBATCH --cluster=merlin5
```
### Merlin5 CPU partitions
Users might need to specify the Slurm partition. If no partition is specified, it will default to **`merlin`**:
```bash
#SBATCH --partition=<partition_name>  # Possible <partition_name> values: merlin, merlin-long
```
The table below summarizes all partitions available to users:
| CPU Partition | Default Time | Max Time | Max Nodes | PriorityJobFactor\* | PriorityTier\*\* |
|:-----------------: | :----------: | :------: | :-------: | :-----------------: | :--------------: |
| **<u>merlin</u>** | 5 days | 1 week | All nodes | 500 | 1 |
| **merlin-long** | 5 days | 21 days | 4 | 1 | 1 |
**\***The **PriorityJobFactor** value is added to the job priority (*PARTITION* column in `sprio -l`). In other words, jobs sent to higher priority
partitions will usually run first (however, other factors such as **job age** or, mainly, **fair share** may affect that decision). For the GPU
partitions, Slurm will also attempt to allocate jobs on partitions with higher priority before partitions with lower priority.
**\*\***Jobs submitted to a partition with a higher **PriorityTier** value will be dispatched before pending jobs in partition with lower *PriorityTier* value
and, if possible, they will preempt running jobs from partitions with lower *PriorityTier* values.
The **`merlin-long`** partition **is limited to 4 nodes**, as it might contain jobs running for up to 21 days.
### Merlin5 CPU Accounts
Users should ensure that the public **`merlin`** account is specified. If no account option is given, jobs default to this account.
Setting it explicitly is mostly relevant for users with multiple Slurm accounts, who might otherwise submit with a different account by mistake.
```bash
#SBATCH --account=merlin # Possible values: merlin
```
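Putting the settings above together, a minimal `sbatch` header for a Merlin5 job could look as follows (a sketch only; the job name, walltime and resources are placeholders to adapt to your needs):
```bash
#!/bin/bash
#SBATCH --cluster=merlin5       # mandatory: submit to the Merlin5 cluster
#SBATCH --partition=merlin      # or 'merlin-long' for jobs running up to 21 days
#SBATCH --account=merlin        # public Merlin5 account
#SBATCH --job-name=test_job     # placeholder job name
#SBATCH --time=0-12:00:00       # requested walltime (12 hours)

srun hostname
```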
### Slurm CPU specific options
Several Slurm options are relevant for CPU-based jobs. Please refer to the **man** pages
of each Slurm command (`man salloc`, `man sbatch`, `man srun`) for further information.
The most common settings are listed below:
```bash
#SBATCH --ntasks=<ntasks>
#SBATCH --ntasks-per-core=<ntasks>
#SBATCH --ntasks-per-socket=<ntasks>
#SBATCH --ntasks-per-node=<ntasks>
#SBATCH --mem=<size[units]>
#SBATCH --mem-per-cpu=<size[units]>
#SBATCH --cpus-per-task=<ncpus>
#SBATCH --cpu-bind=[{quiet,verbose},]<type> # only for 'srun' command
```
Notice that **Merlin5** has no hyper-threading available (while **Merlin6** does).
Hence, in **Merlin5** there is no need to specify the hyper-threading related `--hint` options.
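As an illustration of these options, a hedged sketch of an MPI job spanning two full Merlin5 nodes (16 physical cores each, no hyper-threading) might look like the following; `my_mpi_app` and the module names are placeholders for your own application and toolchain:
```bash
#!/bin/bash
#SBATCH --cluster=merlin5
#SBATCH --partition=merlin
#SBATCH --ntasks=32             # total number of MPI ranks
#SBATCH --ntasks-per-node=16    # fill the 16 physical cores of each node
#SBATCH --time=1-00:00:00

module load gcc openmpi         # assumption: an MPI stack provided via PModules
srun ./my_mpi_app
```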
## User and job limits
In the CPU cluster we apply some limits to jobs and users. The idea behind this is to ensure fair usage of the resources and to
avoid a single user or job monopolizing them. However, applying limits can affect the overall usage efficiency of the cluster (for example,
when user limits are applied one may see pending jobs from a single user while many nodes sit idle due to low overall activity).
At the same time, limits can also improve the efficiency of the cluster (for example, without any job size limit, a job requesting all
resources of the batch system would drain the entire cluster just to fit that job, which is undesirable).
Hence, limits need to be chosen wisely to ensure fair usage of the resources and to optimize the overall efficiency
of the cluster, while still allowing jobs of different natures and sizes (from **single core** to **parallel jobs** of various sizes) to run.
In the **`merlin5`** cluster, as not many users run on it, these limits are wider than the ones set in the **`merlin6`** and **`gmerlin6`** clusters.
### Per job limits
These are limits which apply to a single job, i.e. the maximum amount of resources a single job can use. These limits are described in the table below,
with the format `SlurmQoS(limits)` (the `SlurmQoS` values can be listed with the `sacctmgr show qos` command):
| Partition | Mon-Sun 0h-24h | Other limits |
|:---------------: | :--------------: | :----------: |
| **merlin** | merlin5(cpu=384) | None |
| **merlin-long** | merlin5(cpu=384) | Max. 4 nodes |
By default, the QoS limits a job to at most 384 cores (max CPUs per job).
For **`merlin-long`** this is further restricted: there is an extra limit of 4 dedicated nodes for this partition. This limit is defined
at the partition level and overrides the QoS limit whenever it is more restrictive.
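The QoS limits mentioned above can be inspected directly with `sacctmgr`; a possible invocation, assuming the QoS is named `merlin5` as in the table (field names may differ slightly between Slurm versions):
```bash
# Show per-job (MaxTRES) and per-user (MaxTRESPU) limits of the merlin5 QoS
sacctmgr show qos name=merlin5 format=Name,MaxTRES,MaxTRESPU,MaxWall
```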
### Per user limits for CPU partitions
No per-user limits are applied via QoS. For the **`merlin`** partition, a single user could fill the whole batch system with jobs (however, the restriction on the job size, as explained above, still holds). For the **`merlin-long`** partition, the 4-node limitation still applies.
## Advanced Slurm configuration
Clusters at PSI use the [Slurm Workload Manager](http://slurm.schedmd.com/) as the batch system technology for managing and scheduling jobs.
Slurm has been installed in a **multi-clustered** configuration, allowing to integrate multiple clusters in the same batch system.
To understand the Slurm configuration of the cluster, it may be useful to check the following files:
* ``/etc/slurm/slurm.conf`` - can be found in the login nodes and computing nodes.
* ``/etc/slurm/gres.conf`` - can be found in the GPU nodes, and is also propagated to login nodes and computing nodes for user read access.
* ``/etc/slurm/cgroup.conf`` - can be found in the computing nodes, is also propagated to login nodes for user read access.
The configuration files found on the login nodes correspond exclusively to the **merlin6** cluster.
Configuration files for the old **merlin5** cluster or for the **gmerlin6** cluster must be checked directly on a **merlin5** or **gmerlin6** computing node (for example, by logging in to one of the nodes while a job or an active allocation is running).
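Alternatively, much of this information can be queried through Slurm itself without reading the files on a node; a sketch using `scontrol`, where `-M` selects the cluster:
```bash
# Dump the live Slurm configuration of the merlin5 cluster
scontrol -M merlin5 show config | less

# Show partition and node definitions of the gmerlin6 cluster
scontrol -M gmerlin6 show partition
scontrol -M gmerlin6 show node | less
```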

View File

@@ -0,0 +1,61 @@
---
title: Downtimes
#tags:
#keywords:
last_updated: 28 June 2019
#summary: "Merlin 6 cluster overview"
sidebar: merlin6_sidebar
permalink: /merlin6/downtimes.html
---
On the first Monday of each month the Merlin6 cluster might be subject to interruptions due to maintenance.
Users will be informed at least one week in advance when a downtime is scheduled for the next month.
Downtimes are announced through the <merlin-users@lists.psi.ch> mailing list. In addition, a detailed description
of the next scheduled interventions is available in [Next Scheduled Downtimes](#next-scheduled-downtimes).
---
## Scheduled Downtime Draining Policy
Scheduled downtimes, mostly those affecting the storage and Slurm configurations, may require draining the nodes.
When this is required, users will be informed accordingly. Two different types of draining are possible:
* **soft drain**: new jobs may be queued on the partition, but queued jobs may not be allocated nodes and run from the partition.
Jobs already running on the partition continue to run. This will be the **default** drain method.
* **hard drain**: no new jobs may be queued on the partition (job submission requests will be denied with an error message),
but jobs already queued on the partition may be allocated to nodes and run.
Unless explicitly specified, the default draining policy for each partition will be the following:
* The **daily** and **general** partitions will be soft drained 12h before the downtime.
* The **hourly** partition will be soft drained 1 hour before the downtime.
* The **gpu** and **gpu-short** partitions will be soft drained 1 hour before the downtime.
Finally, **remaining running jobs will be killed** by default when the downtime starts. In some rare cases jobs will
just be *paused* and *resumed* when the downtime finishes.
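Whether a partition has already been drained before a downtime can be checked with `sinfo`; a minimal sketch (output columns depend on the chosen format string):
```bash
# The AVAIL column shows the partition state (up, drain, down, inact)
sinfo --clusters=merlin6 --format="%P %a %l %D %T"
```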
### Draining Policy Summary
The following table contains a summary of the draining policies during a Scheduled Downtime:
| **Partition** | **Drain Policy** | **Default Drain Type** | **Default Job Policy** |
|:---------------:| -----------------:| ----------------------:| --------------------------------:|
| **general** | 12h before the SD | soft drain | Kill running jobs when SD starts |
| **daily** | 12h before the SD | soft drain | Kill running jobs when SD starts |
| **hourly** | 1h before the SD | soft drain | Kill running jobs when SD starts |
| **gpu** | 1h before the SD | soft drain | Kill running jobs when SD starts |
| **gpu-short** | 1h before the SD | soft drain | Kill running jobs when SD starts |
| **gfa-asa** | 1h before the SD | soft drain | Kill running jobs when SD starts |
---
## Next Scheduled Downtimes
The table below shows a description for the next Scheduled Downtime:
| From | To | Service | Description |
| ---------------- | ---------------- |:------------:|:----------------------------------------------------------------------- |
| 05.09.2020 8am | 05.09.2020 6pm | <pending> | <pending> |
* **Note**: An e-mail will be sent when the services are fully available.

View File

@@ -0,0 +1,38 @@
---
title: Past Downtimes
#tags:
#keywords:
last_updated: 03 September 2019
#summary: "Merlin 6 cluster overview"
sidebar: merlin6_sidebar
permalink: /merlin6/past-downtimes.html
---
## Past Downtimes: Log Changes
### 2020
| From | To | Service | Clusters | Description | Exceptions |
| ---------------- | ---------------- |:------------:|:---------------:|:--------------------------------------------------------------|:-------------------------------------------:|
| 03.08.2020 8am | 03.08.2020 6pm | Archive | merlin6 | Replace old merlin-export-01 for merlin-export-02 | |
| 03.08.2020 8am | 03.08.2020 6pm | RemoteAccess | merlin6 | ra-merlin-0[1,2] Remount merlin-export-02 | |
| 06.07.2020 | 06.07.2020 | All services | merlin5,merlin6 | GPFS v5.0.4-4,OFED v5.0,YFS v0.195,RHEL7.7,Slurm v19.05.7,f/w | |
| 04.05.2020 | 04.05.2020 | Login nodes | merlin6 | Outage. YFS (AFS) update v0.194 and reboot | |
| 04.05.2020 | 04.05.2020 | CN | merlin5 | Outage. O.S. update, OFED drivers update, YFS (AFS) update. | |
| 03.02.2020 9am | 03.02.2020 10am | Slurm | merlin5,merlin6 | Upgrading config [HPCLOCAL-321](https://jira.psi.ch/browse/HPCLOCAL-321) | |
| 10.01.2020 9am | 10.01.2020 6pm | All Services | merlin5,merlin6 | Slurm v18->v19, IB Connected Mode, other. [HPCLOCAL-300](https://jira.psi.ch/browse/HPCLOCAL-300) | |
## Older downtimes
| From | To | Service | Clusters | Description | Exceptions |
| ---------------- | ---------------- |:------------:|:---------------:|:--------------------------------------------------------------|:-------------------------------------------:|
| 02.09.2019 | 02.09.2019 | GPFS | merlin5,merlin6 | v5.0.2-3 -> v5.0.3-2 | |
| 02.09.2019 | 02.09.2019 | O.S. | merlin5 | RHEL7.4 (rhel-7.4) -> RHEL7.6 (prod-00048) | merlin-g-40, still running RHEL7.4\* |
| 02.09.2019 | 02.09.2019 | O.S. | merlin6 | RHEL7.6 (prod-00030) -> RHEL7.6 (prod-00048) | |
| 02.09.2019 | 02.09.2019 | Infiniband | merlin5 | OFED v4.4 -> v4.6 | merlin-g-40, still running OFED v4.4\* |
| 02.09.2019 | 02.09.2019 | Infiniband | merlin6 | OFED v4.5 -> v4.6 | |
| 02.09.2019 | 02.09.2019 | PModules | merlin5,merlin6 | PModules v1.0.0rc4 -> v1.0.0rc5 | |
| 02.09.2019 | 02.09.2019 | AFS(YFS) | merlin5 | OpenAFS v1.6.22.2-236 -> YFS v188 | merlin-g-40, still running OpenAFS\* |
| 02.09.2019 | 02.09.2019 | AFS(YFS) | merlin6 | YFS v186 -> YFS v188 | |
| 02.09.2019 | 02.09.2019 | O.S. | merlin5 | RHEL7.4 -> RHEL7.6 (prod-00048) | |
| 02.09.2019 | 02.09.2019 | Slurm | merlin5,merlin6 | Slurm v18.08.6 -> v18.08.8 | |

View File

@@ -0,0 +1,49 @@
---
title: Contact
#tags:
keywords: contact, support, snow, service now, mailing list, mailing, email, mail, merlin-admins@lists.psi.ch, merlin-users@lists.psi.ch, merlin users
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/contact.html
---
## Support
Basic contact information is also shown in the *Message of the Day* when logging in to the Merlin login nodes.
Support can be requested through:
* [PSI Service Now](https://psi.service-now.com/psisp)
* E-Mail: <merlin-admins@lists.psi.ch>
### PSI Service Now
**[PSI Service Now](https://psi.service-now.com/psisp)** is the official tool for opening incident requests.
* PSI HelpDesk will redirect the incident to the corresponding department, or
* you can always assign it directly by checking the box `I know which service is affected` and providing the service name `Local HPC Resources (e.g. Merlin) [CF]` (just type in `Local` and you should get the valid completions).
### Contact Merlin6 Administrators
**E-Mail <merlin-admins@lists.psi.ch>**
* This is the official way to contact Merlin6 Administrators for discussions which do not fit well into the incident category.
Do not hesitate to contact us for such cases.
---
## Get updated through the Merlin User list!
It is strongly recommended that users subscribe to the Merlin users mailing list: **<merlin-users@lists.psi.ch>**
This mailing list is the official channel used by Merlin6 administrators to inform users about downtimes,
interventions or problems. Users can be subscribed in two ways:
* *(Preferred way)* Self-registration through **[Sympa](https://psilists.ethz.ch/sympa/info/merlin-users)**
* If you need to subscribe many people (e.g. your whole group), send a request to the admin list **<merlin-admins@lists.psi.ch>**
providing a list of email addresses.
---
## The Merlin Cluster Team
The PSI Merlin clusters are managed by the **[High Performance Computing and Emerging technologies Group](https://www.psi.ch/de/lsm/hpce-group)**, which
is part of the [Science IT Infrastructure and Services department (AWI)](https://www.psi.ch/en/awi) in PSI's [Center for Scientific Computing, Theory and Data (SCD)](https://www.psi.ch/en/csd).

View File

@@ -0,0 +1,52 @@
---
title: FAQ
#tags:
keywords: faq, frequently asked questions, support
last_updated: 27 October 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/faq.html
---
{%include toc.html %}
## How do I register for Merlin?
See [Requesting Merlin Access](../quick-start-guide/requesting-accounts.md).
## How do I get information about downtimes and updates?
See [Get updated through the Merlin User list!](contact.md#get-updated-through-the-merlin-user-list)
## How can I request access to a Merlin project directory?
Merlin projects are placed in the `/data/project` directory. Access to each project is controlled by Unix group membership.
If you require access to an existing project, please request group membership as described in [Requesting Unix Group Membership](../quick-start-guide/requesting-projects.md#requesting-unix-group-membership).
Your project leader or project colleagues will know what Unix group you should belong to. Otherwise, you can check what Unix group is allowed to access that project directory (simply run `ls -ltrhd` for the project directory).
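For example, to see which Unix group protects a project directory and who is already a member of it (the project and group names below are placeholders):
```bash
# Show the owning Unix group of a (hypothetical) project directory
ls -ld /data/project/general/myproject

# List the current members of that Unix group
getent group unx-myproject
```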
## Can I install software myself?
Most software can be installed in user directories without any special permissions. We recommend using `/data/user/$USER/bin` for software since home directories are fairly small. For software that will be used by multiple groups/users you can also [request the admins](contact.md) install it as a [module](../how-to-use-merlin/using-modules.md).
How to install depends a bit on the software itself. There are three common installation procedures:
1. *binary distributions*. These are easy; just put them in a directory (eg `/data/user/$USER/bin`) and add that to your PATH.
2. *source compilation* using make/cmake/autoconf/etc. Usually the build scripts accept a `--prefix=/data/user/$USER` option specifying where to install. They then place files under `<prefix>/bin`, `<prefix>/lib`, etc. The exact syntax should be documented in the installation instructions (see the sketch after this list).
3. *conda environment*. This is now becoming standard for python-based software, including lots of the AI tools. First follow the [initial setup instructions](../software-support/python.md#anaconda) to configure conda to use /data/user instead of your home directory. Then you can create environments like:
```
module load anaconda/2019.07
# if they provide environment.yml
conda env create -f environment.yml
# or to create manually
conda create --name myenv python==3.9 ...
conda activate myenv
```
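For the *source compilation* case (item 2 above), a minimal sketch of a prefix-based installation could look as follows; the package name and version are placeholders:
```bash
# Hypothetical autotools-based package
tar xf mypackage-1.0.tar.gz
cd mypackage-1.0
./configure --prefix=/data/user/$USER
make -j 4
make install

# Make the installed binaries visible (add this line to ~/.bashrc to persist it)
export PATH=/data/user/$USER/bin:$PATH
```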
## Something doesn't work
Check the list of [known problems](known-problems.md) to see if a solution is known.
If not, please [contact the admins](contact.md).

View File

@@ -0,0 +1,180 @@
---
title: Known Problems
#tags:
keywords: "known problems, troubleshooting, illegal instructions, paraview, ansys, shell, opengl, mesa, vglrun, module: command not found, error"
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/known-problems.html
---
## Common errors
### Illegal instruction error
It may happen that code compiled on one machine fails to run on another, raising an error like **"Illegal instruction"**.
This is usually because the software was compiled with an instruction set newer than the one available on the node where the software runs,
and it mostly depends on the processor generation.
For example, `merlin-l-001` and `merlin-l-002` contain a newer generation of processors than the old GPU nodes or the Merlin5 cluster.
Hence, unless the software is compiled to stay compatible with the instruction set of the older processors, it will not run on the old nodes.
Sometimes this is properly set by default at compilation time, but sometimes it is not.
For GCC, please refer to [GCC x86 Options](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html) for compiling options. In case of doubts, contact us.
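As a hedged illustration, compiling for a conservative baseline architecture usually avoids the problem (at the cost of some performance), whereas `-march=native` on a new login node may produce binaries that older nodes cannot execute:
```bash
# Portable across all Merlin nodes (baseline x86-64 instruction set)
gcc -O2 -march=x86-64 -o myprog myprog.c

# Optimized for the node where you compile; may fail with 'Illegal instruction'
# when executed on older processors
gcc -O2 -march=native -o myprog myprog.c
```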
## Slurm
### sbatch using one core despite setting -c/--cpus-per-task
From **Slurm v22.05.6**, the behavior of `srun` has changed. Merlin has been updated to this version since *Tuesday 13.12.2022*.
`srun` will no longer read in `SLURM_CPUS_PER_TASK`, which is typically set when defining `-c/--cpus-per-task` in the `sbatch` command.
This means you now have to explicitly specify `-c/--cpus-per-task` on your `srun` calls as well, or set the new `SRUN_CPUS_PER_TASK` environment variable to accomplish the same thing.
Therefore, unless this is explicitly specified, `srun` will use only one core per task (resulting in 2 CPUs per task when multithreading is enabled).
An example of setting up `srun` with `-c/--cpus-per-task`:
```bash
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# cat mysbatch_method1
#!/bin/bash
#SBATCH -n 1
#SBATCH --cpus-per-task=8
echo 'From Slurm v22.05.8 srun does not inherit $SLURM_CPUS_PER_TASK'
srun python -c "import os; print(os.sched_getaffinity(0))"
echo 'One has to implicitly specify $SLURM_CPUS_PER_TASK'
echo 'In this example, by setting -c/--cpus-per-task in srun'
srun --cpus-per-task=$SLURM_CPUS_PER_TASK python -c "import os; print(os.sched_getaffinity(0))"
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# sbatch mysbatch_method1
Submitted batch job 8000813
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# cat slurm-8000813.out
From Slurm v22.05.8 srun does not inherit $SLURM_CPUS_PER_TASK
{1, 45}
One has to implicitly specify $SLURM_CPUS_PER_TASK
In this example, by setting -c/--cpus-per-task in srun
{1, 2, 3, 4, 45, 46, 47, 48}
```
An example to accomplish the same thing with the `SRUN_CPUS_PER_TASK` environment variable:
```bash
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# cat mysbatch_method2
#!/bin/bash
#SBATCH -n 1
#SBATCH --cpus-per-task=8
echo 'From Slurm v22.05.8 srun does not inherit $SLURM_CPUS_PER_TASK'
srun python -c "import os; print(os.sched_getaffinity(0))"
echo 'One has to implicitly specify $SLURM_CPUS_PER_TASK'
echo 'In this example, by setting an environment variable SRUN_CPUS_PER_TASK'
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK
srun python -c "import os; print(os.sched_getaffinity(0))"
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# sbatch mysbatch_method2
Submitted batch job 8000815
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# cat slurm-8000815.out
From Slurm v22.05.8 srun does not inherit $SLURM_CPUS_PER_TASK
{1, 45}
One has to implicitly specify $SLURM_CPUS_PER_TASK
In this example, by setting an environment variable SRUN_CPUS_PER_TASK
{1, 2, 3, 4, 45, 46, 47, 48}
```
## General topics
### Default SHELL
In general, **`/bin/bash` is the recommended default SHELL** when working on Merlin.
Some users might notice that BASH is not the default SHELL when logging in to Merlin systems, or they might need to run a different SHELL.
This is probably because, when the PSI account was requested, no SHELL was specified or a different one was explicitly requested by the requestor.
Users can check which default SHELL is set for their PSI account with the following command:
```bash
getent passwd $USER | awk -F: '{print $NF}'
```
If the SHELL does not correspond to the one you need, you should request a central change of it.
This is because Merlin accounts are central PSI accounts; hence, **the change must be requested via [PSI Service Now](contact.md#psi-service-now)**.
Alternatively, if you work on other PSI Linux systems but need a different SHELL type on Merlin only, a temporary change can be performed during login startup.
You can update one of the following files:
* `~/.login`
* `~/.profile`
* Any `rc` or `profile` file in your home directory (e.g. `.cshrc`, `.bashrc`, `.bash_profile`, etc.)
with the following lines:
```bash
# Replace MY_SHELL with the bash type you need
MY_SHELL=/bin/bash
exec $MY_SHELL -l
```
Notice that available *shells* can be found in the following file:
```bash
cat /etc/shells
```
### 3D acceleration: OpenGL vs Mesa
Some applications can run with OpenGL support. This is only possible when the node contains a GPU card.
In general, X11 with the Mesa driver is the recommended method, as it works in all cases (no GPU needed). For example, for ParaView:
```bash
module load paraview
paraview-mesa paraview # 'paraview --mesa' for old releases
```
However, if one needs to run with OpenGL support, this is still possible by running `vglrun`. For example, to run ParaView:
```bash
module load paraview
vglrun paraview
```
Officially, the supported method for running `vglrun` is by using the [NoMachine remote desktop](../how-to-use-merlin/nomachine.md).
Running `vglrun` is also possible over SSH with X11 forwarding. However, this is very slow and is only recommended when running
in Slurm (from [NoMachine](../how-to-use-merlin/nomachine.md)). Please avoid running `vglrun` over SSH from a desktop or laptop.
## Software
### ANSYS
Sometimes, running ANSYS/Fluent requires X11 support. For that, one should run fluent as follows.
```bash
module load ANSYS
fluent -driver x11
```
### Paraview
For running Paraview, one can run it with Mesa support or OpenGL support. Please refer to [OpenGL vs Mesa](#3d-acceleration-opengl-vs-mesa) for
further information about how to run it.
### Module command not found
In some circumstances the module command may not be initialized properly. For instance, you may see the following error upon logon:
```
bash: module: command not found
```
The most common cause for this is a custom `.bashrc` file which fails to source the global `/etc/bashrc` responsible for setting up PModules in some OS versions. To fix this, add the following to `$HOME/.bashrc`:
```bash
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi
```
It can also be fixed temporarily in an existing terminal by running `. /etc/bashrc` manually.

View File

@@ -0,0 +1,139 @@
---
title: Migration From Merlin5
#tags:
keywords: merlin5, merlin6, migration, rsync, archive, archiving, lts, long-term storage
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/migrating.html
---
## Directories
### Merlin5 vs Merlin6
| Cluster | Home Directory | User Home Directory | Group Home Directory |
| ------- |:-------------------- |:-------------------- |:---------------------------------------- |
| merlin5 | /gpfs/home/_$username_ | /gpfs/data/_$username_ | /gpfs/group/_$laboratory_ |
| merlin6 | /psi/home/_$username_ | /data/user/_$username_ | /data/project/_\[general\|bio\]_/_$projectname_ |
### Quota limits in Merlin6
| Directory | Quota_Type [Soft:Hard] (Block) | Quota_Type [Soft:Hard] (Files) | Quota Change Policy: Block | Quota Change Policy: Files |
| ---------------------------------- | ------------------------------ | ------------------------------ |:--------------------------------------------- |:--------------------------------------------- |
| /psi/home/$username | USR [10GB:11GB] | *Undef* | Up to x2 when strictly justified. | N/A |
| /data/user/$username               | USR [1TB:1.074TB]              | USR [1M:1.1M]                  | Immutable. Need a project.                     | Changeable when justified.                     |
| /data/project/bio/$projectname | GRP+Fileset [1TB:1.074TB] | GRP+Fileset [1M:1.1M] | Changeable according to project requirements. | Changeable according to project requirements. |
| /data/project/general/$projectname | GRP+Fileset [1TB:1.074TB] | GRP+Fileset [1M:1.1M] | Changeable according to project requirements. | Changeable according to project requirements. |
where:
* **Block** is capacity size in GB and TB
* **Files** is number of files + directories in Millions (M)
* **Quota types** are the following:
* **USR**: Quota is setup individually per user name
* **GRP**: Quota is setup individually per Unix Group name
* **Fileset**: Quota is setup per project root directory.
* User data directory ``/data/user`` has a strict user block quota limit policy. If more disk space is required, 'project' must be created.
* Soft quotas can be exceeded for short periods of time. Hard quotas cannot be exceeded.
### Project directory
#### Why is 'project' needed?
Merlin6 introduces the concept of a *project* directory. These are the recommended location for all scientific data.
* `/data/user` is not suitable for sharing data between users
* The Merlin5 *group* directories were a similar concept, but the association with a single organizational group made
interdepartmental sharing difficult. Projects can be shared by any PSI user.
* Projects are shared by multiple users (at a minimum they should be shared with the supervisor/PI). This decreases
the chance of data being orphaned by personnel changes.
* Shared projects are preferable to individual data for transparency and accountability in event of future questions
regarding the data.
* One project member is designated as responsible. Responsibility can be transferred if needed.
#### Requesting a *project*
Refer to [Requesting a project](../quick-start-guide/requesting-projects.md)
---
## Migration Schedule
### Phase 1 [June]: Pre-migration
* Users keep working on Merlin5
* Merlin5 production directories: ``'/gpfs/home/'``, ``'/gpfs/data'``, ``'/gpfs/group'``
* Users may raise any problems (quota limits, inaccessible files, etc.) to merlin-admins@lists.psi.ch
* Users can start migrating data (see [Migration steps](#migration-steps))
* Users should copy their data from Merlin5 ``/gpfs/data`` to Merlin6 ``/data/user``
* Users should copy their home from Merlin5 ``/gpfs/home`` to Merlin6 ``/psi/home``
* Users should inform when migration is done, and which directories were migrated. Deletion for such directories can be requested by admins.
### Phase 2 [July-October]: Migration to Merlin6
* Merlin6 becomes official cluster, and directories are switched to the new structure:
* Merlin6 production directories: ``'/psi/home/'``, ``'/data/user'``, ``'/data/project'``
* Merlin5 directories available in RW in login nodes: ``'/gpfs/home/'``, ``'/gpfs/data'``, ``'/gpfs/group'``
* In Merlin5 computing nodes, Merlin5 directories are mounted in RW: ``'/gpfs/home/'``, ``'/gpfs/data'``, ``'/gpfs/group'``
* In Merlin5 computing nodes, Merlin6 directories are mounted in RW: ``'/psi/home/'``, ``'/data/user'``, ``'/data/project'``
* Users must migrate their data (see [Migration steps](#migration-steps))
* ALL data must be migrated
* Job submissions by default to Merlin6. Submission to Merlin5 computing nodes possible.
* Users should inform when migration is done, and which directories were migrated. Deletion for such directories can be requested by admins.
### Phase 3 [November]: Merlin5 Decommission
* Old Merlin5 storage unmounted.
* Migrated directories reported by users will be deleted.
* Remaining Merlin5 data will be archived.
---
## Migration steps
### Cleanup / Archive files
* Users must cleanup and/or archive files, according to the quota limits for the target storage.
* If extra space is needed, we advise users to request a [project](../quick-start-guide/requesting-projects.md)
* If you need a larger quota with respect to the maximum allowed number of files, you can request an increase of your user quota.
#### File list
### Step 1: Migrating
First migration:
```bash
rsync -avAHXS <source_merlin5> <destination_merlin6>
rsync -avAHXS /gpfs/data/$username/* /data/user/$username
```
This can take several hours or days:
* You can try to parallelize multiple `rsync` commands over sub-directories to increase the transfer rate (see the sketch below).
* Please do not parallelize too many directories at once; as a rule of thumb, no more than 10 together.
* Other users may be doing the same, and too many parallel transfers could cause storage and interactive (UI) performance problems in the Merlin5 cluster.
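A minimal sketch of such a parallelized copy, assuming at most four concurrent `rsync` processes over the first-level sub-directories (plain files at the top level still need a separate `rsync`):
```bash
cd /gpfs/data/$USER
for dir in */; do
    rsync -avAHXS "$dir" "/data/user/$USER/$dir" &
    # keep at most 4 rsync processes running at the same time
    while [ "$(jobs -rp | wc -l)" -ge 4 ]; do
        sleep 10
    done
done
wait    # wait for the remaining transfers to finish
```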
### Step 2: Mirroring
Once the first migration is done, a second ``rsync`` should be run, this time with the ``--delete`` option. With this option ``rsync``
will delete from the destination all files that were removed from the source, and will also propagate
new files from the source to the destination.
```bash
rsync -avAHXS --delete <source_merlin5> <destination_merlin6>
rsync -avAHXS --delete /gpfs/data/$username/* /data/user/$username
```
### Step 3: Removing / Archiving old data
#### Removing migrated data
Once you ensure that everything is migrated to the new storage, data is ready to be deleted from the old storage.
Users must report when migration is finished and report which directories are affected and ready to be removed.
Merlin administrators will remove the directories, always asking for a last confirmation.
#### Archiving data
Once all migrated data has been removed from the old storage, missing data will be archived.

View File

@@ -0,0 +1,48 @@
---
title: Troubleshooting
#tags:
keywords: troubleshooting, problems, faq, known problems
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/troubleshooting.html
---
For troubleshooting, please contact us through the official channels. See [Contact](contact.md)
for more information.
## Known Problems
Before contacting us for support, please check the **[Merlin6 Support: Known Problems](known-problems.md)** page to see if there is an existing
workaround for your specific problem.
## Troubleshooting Slurm Jobs
If you want to report a problem or ask for help with running jobs, please **always provide**
the following information:
1. Your batch script or, alternatively, the path to your batch script.
2. **Always** add the following commands to your batch script:
```bash
echo "User information:"; who am i
echo "Running hostname:"; hostname
echo "Current location:"; pwd
echo "User environment:"; env
echo "List of PModules:"; module list
```
3. Whenever possible, provide the Slurm JobID.
Providing this information is **extremely important** to ease debugging; a description of the issue
or the error message alone is insufficient in most cases.
## Troubleshooting SSH
Use the `ssh` command with the `-vvv` option and copy and paste (no screenshots, please)
the output into your request in Service Now. Example:
```bash
ssh -Y -vvv $username@merlin-l-01.psi.ch
```

View File

@@ -0,0 +1,27 @@
---
title: Introduction
#tags:
#keywords:
last_updated: 28 June 2019
#summary: "Merlin 6 cluster overview"
sidebar: merlin6_sidebar
permalink: /merlin6/cluster-introduction.html
---
## Slurm clusters
* The new Slurm CPU cluster is called [**`merlin6`**](cluster-introduction.md).
* The new Slurm GPU cluster is called [**`gmerlin6`**](../gmerlin6/cluster-introduction.md)
* The old Slurm *merlin* cluster is still active and best-effort support is provided.
The cluster was renamed to [**merlin5**](../merlin5/cluster-introduction.md).
From July 2019, **`merlin6`** is the **default Slurm cluster**: any job submitted from the login nodes is sent to that cluster unless a different cluster is explicitly specified.
* Users can keep submitting to the old *`merlin5`* computing nodes by using the option ``--cluster=merlin5``.
* Users submitting to the **`gmerlin6`** GPU cluster need to specify the option ``--cluster=gmerlin6``.
### Slurm 'merlin6'
**CPU nodes** are configured in a **Slurm** cluster, called **`merlin6`**, and
this is the _**default Slurm cluster**_. Hence, by default, if no Slurm cluster is
specified (with the `--cluster` option), this will be the cluster to which the jobs
will be sent.

View File

@@ -0,0 +1,171 @@
---
title: Hardware And Software Description
#tags:
#keywords:
last_updated: 13 June 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/hardware-and-software.html
---
## Hardware
### Computing Nodes
The new Merlin6 cluster contains a solution based on **four** [**HPE Apollo k6000 Chassis**](https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=a00016641enw)
* *Three* of them contain 24 x [**HP Apollo XL230K Gen10**](https://h20195.www2.hpe.com/v2/GetDocument.aspx?docname=a00016634enw) blades.
* A *fourth* chassis was purchased in 2021 with [**HP Apollo XL230K Gen10**](https://h20195.www2.hpe.com/v2/GetDocument.aspx?docname=a00016634enw) blades dedicated to a few experiments. Its blades have slightly different components depending on specific project requirements.
The connectivity for the Merlin6 cluster is based on **ConnectX-5 EDR-100Gbps**, and each chassis contains:
* 1 x [HPE Apollo InfiniBand EDR 36-port Unmanaged Switch](https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=a00016643enw)
* 24 internal EDR-100Gbps ports (1 port per blade for internal low latency connectivity)
* 12 external EDR-100Gbps ports (for external low latency connectivity)
<table>
<thead>
<tr>
<th scope='colgroup' style="vertical-align:middle;text-align:center;" colspan="8">Merlin6 CPU Computing Nodes</th>
</tr>
<tr>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Chassis</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Node</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Processor</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Sockets</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Cores</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Threads</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Scratch</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Memory</th>
</tr>
</thead>
<tbody>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>#0</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-c-0[01-24]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/content/www/us/en/ark/products/120491/intel-xeon-gold-6152-processor-30-25m-cache-2-10-ghz.html">Intel Xeon Gold 6152</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">44</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1.2TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">384GB</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>#1</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-c-1[01-24]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/content/www/us/en/ark/products/120491/intel-xeon-gold-6152-processor-30-25m-cache-2-10-ghz.html">Intel Xeon Gold 6152</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">44</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1.2TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">384GB</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>#2</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-c-2[01-24]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/content/www/us/en/ark/products/120491/intel-xeon-gold-6152-processor-30-25m-cache-2-10-ghz.html">Intel Xeon Gold 6152</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">44</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1.2TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">384GB</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="3"><b>#3</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-c-3[01-12]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="3"><a href="https://ark.intel.com/content/www/us/en/ark/products/199343/intel-xeon-gold-6240r-processor-35-75m-cache-2-40-ghz.html">Intel Xeon Gold 6240R</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="3">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="3">48</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="3">1.2TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="2">768GB</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td rowspan="1"><b>merlin-c-3[03-18]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td rowspan="1"><b>merlin-c-3[19-24]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">384GB</td>
</tr>
</tbody>
</table>
Each blade contains an NVMe disk, where up to 300GB are dedicated to the O.S. and ~1.2TB are reserved for local `/scratch`.
### Login Nodes
*One old login node* (``merlin-l-01.psi.ch``) is inherited from the previous Merlin5 cluster. Its main use is for running some BIO services (`cryosparc`) and for submitting jobs.
*Two new login nodes* (``merlin-l-001.psi.ch``,``merlin-l-002.psi.ch``) with a configuration similar to the Merlin6 computing nodes are available to users. Their main use
is compiling software and submitting jobs.
The connectivity is based on **ConnectX-5 EDR-100Gbps** for the new login nodes, and on **ConnectIB FDR-56Gbps** for the old one.
<table>
<thead>
<tr>
      <th scope='colgroup' style="vertical-align:middle;text-align:center;" colspan="8">Merlin6 Login Nodes</th>
</tr>
<tr>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Hardware</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Node</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Processor</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Sockets</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Cores</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Threads</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Scratch</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Memory</th>
</tr>
</thead>
<tbody>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>Old</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-l-01</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/products/91768/Intel-Xeon-Processor-E5-2697A-v4-40M-Cache-2-60-GHz-">Intel Xeon E5-2697AV4</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">16</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">100GB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">512GB</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>New</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-l-00[1,2]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/content/www/us/en/ark/products/120491/intel-xeon-gold-6152-processor-30-25m-cache-2-10-ghz.html">Intel Xeon Gold 6152</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">44</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1.8TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">384GB</td>
</tr>
</tbody>
</table>
### Storage
The storage is based on the [Lenovo Distributed Storage Solution for IBM Spectrum Scale](https://lenovopress.com/lp0626-lenovo-distributed-storage-solution-for-ibm-spectrum-scale-x3650-m5).
* 2 x **Lenovo DSS G240** systems, each composed of 2 **ThinkSystem SR650** IO nodes mounting 4 x **Lenovo Storage D3284 High Density Expansion** enclosures.
* Each IO node has a connectivity of 400Gbps (4 x EDR 100Gbps ports, 2 of them **ConnectX-5** and 2 of them **ConnectX-4**).
The storage solution is connected to the HPC clusters through 2 x **Mellanox SB7800 InfiniBand 1U Switches** for high availability and load balancing.
### Network
Merlin6 cluster connectivity is based on the [**Infiniband**](https://en.wikipedia.org/wiki/InfiniBand) technology. This allows fast access with very low latencies to the data as well as running
extremely efficient MPI-based jobs:
* Connectivity amongst computing nodes in different chassis ensures up to 1200Gbps of aggregated bandwidth.
* Intra-chassis connectivity (communication amongst computing nodes in the same chassis) ensures up to 2400Gbps of aggregated bandwidth.
* Communication to the storage ensures up to 800Gbps of aggregated bandwidth.
Merlin6 cluster currently contains 5 Infiniband Managed switches and 3 Infiniband Unmanaged switches (one per HP Apollo chassis):
* 1 x **MSX6710** (FDR) for connecting old GPU nodes, old login nodes and MeG cluster to the Merlin6 cluster (and storage). No High Availability mode possible.
* 2 x **MSB7800** (EDR) for connecting Login Nodes, Storage and other nodes in High Availability mode.
* 3 x **HP EDR Unmanaged** switches, each one embedded to each HP Apollo k6000 chassis solution.
* 2 x **MSB7700** (EDR) are the top switches, interconnecting the Apollo unmanaged switches and the managed switches (MSX6710, MSB7800).
## Software
In Merlin6, we try to keep the software stack up to date in order to benefit from the latest features and improvements. Hence, **Merlin6** runs:
* [**RedHat Enterprise Linux 7**](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/7.9_release_notes/index)
* [**Slurm**](https://slurm.schedmd.com/), we usually try to keep it up to date with the most recent versions.
* [**GPFS v5**](https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/ibmspectrumscale502_welcome.html)
* [**MLNX_OFED LTS v.5.2-2.2.0.0 or newer**](https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed) for all **ConnectX-5** or superior cards.
* [MLNX_OFED LTS v.4.9-2.2.4.0](https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed) is installed for remaining **ConnectX-3** and **ConnectIB** cards.

View File

@@ -0,0 +1,372 @@
# Archive & PSI Data Catalog
## PSI Data Catalog as a PSI Central Service
PSI provides access to the ***Data Catalog*** for **long-term data storage and retrieval**. Data is
stored on the ***PetaByte Archive*** at the **Swiss National Supercomputing Centre (CSCS)**.
The Data Catalog and Archive is suitable for:
* Raw data generated by PSI instruments
* Derived data produced by processing some inputs
* Data required to reproduce PSI research and publications
The Data Catalog is part of PSI's effort to conform to the FAIR principles for data management.
In accordance with this policy, ***data will be publicly released under CC-BY-SA 4.0 after an
embargo period expires.***
The Merlin cluster is connected to the Data Catalog. Hence, users can archive data stored in the
Merlin storage under the ``/data`` directories (currently ``/data/user`` and ``/data/project``).
Archiving from other directories is also possible; however, the process is much slower, as the data
cannot be directly retrieved by the central PSI archive servers (**central mode**) and needs to
be copied to them first (**decentral mode**).
Archiving can be done from any node accessible by the users (usually from the login nodes).
!!! tip
Archiving can be done in two different ways:
**'Central mode':** Possible for the user and project data directories, this is the
fastest way as it does not require a remote copy (data is directly retrieved by the central AIT servers from Merlin
through 'merlin-archive.psi.ch').
**'Decentral mode':** Possible for any directory, this is the slowest way of archiving, as it requires
copying ('rsync') the data from Merlin to the central AIT servers.
## Procedure
### Overview
Below are the main steps for using the Data Catalog.
* Ingest the dataset into the Data Catalog. This makes the data known to the Data Catalog system at PSI:
* Prepare a metadata file describing the dataset
* Run **`datasetIngestor`** script
* If necessary, the script will copy the data to the PSI archive servers
* Usually this is necessary when archiving from directories other than **`/data/user`** or
**`/data/project`**. It would also be necessary when the Merlin export server (**`merlin-archive.psi.ch`**)
is down for any reason.
* Archive the dataset:
* Visit [https://discovery.psi.ch](https://discovery.psi.ch)
* Click **`Archive`** for the dataset
* The system will now copy the data to the PetaByte Archive at CSCS
* Retrieve data from the catalog:
* Find the dataset on [https://discovery.psi.ch](https://discovery.psi.ch) and click **`Retrieve`**
* Wait for the data to be copied to the PSI retrieval system
* Run **`datasetRetriever`** script
Since large data sets may take a lot of time to transfer, some steps are
designed to happen in the background. The discovery website can be used to
track the progress of each step.
### Account Registration
Two types of account permit access to the Data Catalog. If your data was
collected at a ***beamline***, you may have been assigned a **`p-group`**
(e.g. `p12345`) for the experiment. Other users are assigned **`a-group`**
(e.g. `a-12345`).
Groups are usually assigned to a PI, and then individual user accounts are added to the group. This must be
requested by the user through PSI Service Now. For existing **a-groups** and **p-groups**, you can follow the standard
central procedures. Alternatively, if you do not know how to do that, follow the Merlin6
**[Requesting extra Unix groups](../quick-start-guide/requesting-accounts.md)** procedure, or open
a **[PSI Service Now](https://psi.service-now.com/psisp)** ticket.
### Documentation
Accessing the Data Catalog is done through the [SciCat software](https://melanie.gitpages.psi.ch/SciCatPages/).
Documentation is here: [ingestManual](https://scicatproject.github.io/documentation/Ingestor/ingestManual.html).
#### Loading datacatalog tools
The latest datacatalog software is maintained in the PSI module system. To access it from the Merlin systems, run the following command:
```bash
module load datacatalog
```
It can be done from any host in the Merlin cluster accessible by users. Usually, login nodes will be the nodes used for archiving.
### Finding your token
As of 2022-04-14 a secure token is required to interact with the data catalog. This is a long random string that replaces the previous user/password authentication (allowing access for non-PSI use cases). **This string should be treated like a password and not shared.**
1. Go to discovery.psi.ch
1. Click 'Sign in' in the top right corner. Click 'Login with PSI account' and log in on the PSI login page.
1. You should be redirected to your user settings and see a 'User Information' section. If not, click on your username in the top right and choose 'Settings' from the menu.
1. Look for the field 'Catamel Token'. This should be a 64-character string. Click the icon to copy the token.
![SciCat website](../../images/scicat_token.png)
You will need to save this token for later steps. To avoid including it in all the commands, I suggest saving it to an environment variable (Linux):
```bash
SCICAT_TOKEN=RqYMZcqpqMJqluplbNYXLeSyJISLXfnkwlfBKuvTSdnlpKkU
```
(Hint: prefix this line with a space to avoid saving the token to your bash history.)
Tokens expire after 2 weeks and will need to be fetched from the website again.
### Ingestion
The first step to ingesting your data into the catalog is to prepare a file describing what data you have. This is called
**`metadata.json`**, and can be created with a text editor (e.g. *`vim`*). It can in principle be saved anywhere,
but keeping it with your archived data is recommended. For more information about the format, see the 'Bio metadata'
section below. An example follows:
```json
{
"principalInvestigator": "albrecht.gessler@psi.ch",
"creationLocation": "/PSI/EMF/JEOL2200FS",
"dataFormat": "TIFF+LZW Image Stack",
"sourceFolder": "/gpfs/group/LBR/pXXX/myimages",
"owner": "Wilhelm Tell",
"ownerEmail": "wilhelm.tell@psi.ch",
"type": "raw",
"description": "EM micrographs of amygdalin",
"ownerGroup": "a-12345",
"scientificMetadata": {
"description": "EM micrographs of amygdalin",
"sample": {
"name": "Amygdalin beta-glucosidase 1",
"uniprot": "P29259",
"species": "Apple"
},
"dataCollection": {
"date": "2018-08-01"
},
"microscopeParameters": {
"pixel size": {
"v": 0.885,
"u": "A"
},
"voltage": {
"v": 200,
"u": "kV"
},
"dosePerFrame": {
"v": 1.277,
"u": "e/A2"
}
}
}
}
```
It is recommended to use the [ScicatEditor](https://bliven_s.gitpages.psi.ch/SciCatEditor/) for creating metadata files. This is a browser-based tool specifically for ingesting PSI data. Using the tool avoids syntax errors and provides templates for common data sets and options. The finished JSON file can then be downloaded to merlin or copied into a text editor.
Another option is to use the SciCat graphical interface from NoMachine, which provides a point-and-click way of selecting data to archive. This is particularly useful for data associated with a DUO experiment and p-group. Type `SciCat` to get started after loading the `datacatalog` module. The GUI also replaces the command-line ingestion described below.
The following steps can be run from wherever you saved your `metadata.json`. First, perform a "dry-run" which will check the metadata for errors:
```bash
datasetIngestor --token $SCICAT_TOKEN metadata.json
```
It will ask for your PSI credentials and then print some info about the data to be ingested. If there are no errors, proceed to the real ingestion:
```bash
datasetIngestor --token $SCICAT_TOKEN --ingest --autoarchive metadata.json
```
You will be asked whether you want to copy the data to the central system:
* If you are on the Merlin cluster and you are archiving data from `/data/user` or `/data/project`, answer 'no' since the data catalog can
directly read the data.
* If you are archiving from a directory other than `/data/user` and `/data/project`, or you are on a desktop computer, answer 'yes'. Copying large datasets
to the PSI archive system may take quite a while (minutes to hours).
If there are no errors, your data has been accepted into the data catalog! From now on, no changes should be made to the ingested data.
This is important, since the next step is for the system to copy all the data to the CSCS Petabyte archive. Writing to tape is slow, so
this process may take several days, and it will fail if any modifications are detected.
If using the `--autoarchive` option as suggested above, your dataset should now be in the queue. Check the data catalog:
[https://discovery.psi.ch](https://discovery.psi.ch). Your job should have status 'WorkInProgress'. You will receive an email when the ingestion
is complete.
If you didn't use `--autoarchive`, you need to manually move the dataset into the archive queue. From **discovery.psi.ch**, navigate to the 'Archive'
tab. You should see the newly ingested dataset. Check the dataset and click **`Archive`**. You should see the status change from **`datasetCreated`** to
**`scheduleArchiveJob`**. This indicates that the data is in the process of being transferred to CSCS.
After a few days the dataset's status will change to **`datasetOnArchive`**, indicating the data is stored. At this point it is safe to delete the original data.
#### Useful commands
Running the datasetIngestor in dry mode (**without** `--ingest`) finds most errors. However, it is sometimes convenient to find potential errors
yourself with simple unix commands.
Find problematic filenames
```bash
find . -iregex '.*/[^/]*[^a-zA-Z0-9_ ./-][^/]*'
```
Find broken links
```bash
find -L . -type l
```
Find outside links
```bash
find . -type l -exec bash -c 'realpath --relative-base "`pwd`" "$0" 2>/dev/null |egrep "^[./]" |sed "s|^|$0 ->|" ' '{}' ';'
```
Delete certain files (use with caution)
```bash
# Empty directories
find . -type d -empty -delete
# Backup files
find . -name '*~' -delete
find . -name '*#autosave#' -delete
```
#### Troubleshooting & Known Bugs
* The following message can be safely ignored:
```bash
key_cert_check_authority: invalid certificate
Certificate invalid: name is not a listed principal
```
It indicates that no kerberos token was provided for authentication. You can avoid the warning by first running kinit (PSI linux systems).
* For decentral ingestion cases, the copy step is indicated by a message `Running [/usr/bin/rsync -e ssh -avxz ...`. It is expected that this
step will take a long time and may appear to have hung. You can check which files have been successfully transferred using rsync:
```bash
rsync --list-only user_n@pb-archive.psi.ch:archive/UID/PATH/
```
where UID is the dataset ID (12345678-1234-1234-1234-123456789012) and PATH is the absolute path to your data. Note that rsync creates directories first and that the transfer order is not alphabetical in some cases, but it should be possible to see whether any data has transferred.
* There is currently a limit on the number of files per dataset (technically, the limit comes from the total length of all file paths). It is recommended to break up datasets into 300'000 files or fewer (a quick way to count the files is sketched at the end of this list).
* If it is not possible or desirable to split the data between multiple datasets, an alternative workaround is to package the files into a tarball. For data which is already compressed, omit the `-z` option (i.e. create an uncompressed tarball, as shown below) for a considerable speedup:
```bash
tar -cf [output].tar [srcdir]
```
Uncompressed data can be compressed on the cluster using the following command:
```bash
sbatch /data/software/Slurm/Utilities/Parallel_TarGz.batch -s [srcdir] -t [output].tar -n
```
Run `/data/software/Slurm/Utilities/Parallel_TarGz.batch -h` for more details and options.
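As a quick sanity check before ingesting, you can count the files and measure the total size of a directory with standard tools (a minimal sketch; the path is a placeholder, point it at your own dataset):
```bash
# Number of files in the dataset (keep it below ~300'000)
find /data/user/$USER/mydataset -type f | wc -l
# Total size of the dataset
du -sh /data/user/$USER/mydataset
```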
#### Sample ingestion output (datasetIngestor 1.1.11)
```text
/data/project/bio/myproject/archive $ datasetIngestor -copy -autoarchive -allowexistingsource -ingest metadata.json
2019/11/06 11:04:43 Latest version: 1.1.11
2019/11/06 11:04:43 Your version of this program is up-to-date
2019/11/06 11:04:43 You are about to add a dataset to the === production === data catalog environment...
2019/11/06 11:04:43 Your username:
user_n
2019/11/06 11:04:48 Your password:
2019/11/06 11:04:52 User authenticated: XXX
2019/11/06 11:04:52 User is member in following a or p groups: XXX
2019/11/06 11:04:52 OwnerGroup information a-XXX verified successfully.
2019/11/06 11:04:52 contactEmail field added: XXX
2019/11/06 11:04:52 Scanning files in dataset /data/project/bio/myproject/archive
2019/11/06 11:04:52 No explicit filelistingPath defined - full folder /data/project/bio/myproject/archive is used.
2019/11/06 11:04:52 Source Folder: /data/project/bio/myproject/archive at /data/project/bio/myproject/archive
2019/11/06 11:04:57 The dataset contains 100000 files with a total size of 50000000000 bytes.
2019/11/06 11:04:57 creationTime field added: 2019-07-29 18:47:08 +0200 CEST
2019/11/06 11:04:57 endTime field added: 2019-11-06 10:52:17.256033 +0100 CET
2019/11/06 11:04:57 license field added: CC BY-SA 4.0
2019/11/06 11:04:57 isPublished field added: false
2019/11/06 11:04:57 classification field added: IN=medium,AV=low,CO=low
2019/11/06 11:04:57 Updated metadata object:
{
"accessGroups": [
"XXX"
],
"classification": "IN=medium,AV=low,CO=low",
"contactEmail": "XXX",
"creationLocation": "XXX",
"creationTime": "2019-07-29T18:47:08+02:00",
"dataFormat": "XXX",
"description": "XXX",
"endTime": "2019-11-06T10:52:17.256033+01:00",
"isPublished": false,
"license": "CC BY-SA 4.0",
"owner": "XXX",
"ownerEmail": "XXX",
"ownerGroup": "a-XXX",
"principalInvestigator": "XXX",
"scientificMetadata": {
...
},
"sourceFolder": "/data/project/bio/myproject/archive",
"type": "raw"
}
2019/11/06 11:04:57 Running [/usr/bin/ssh -l user_n pb-archive.psi.ch test -d /data/project/bio/myproject/archive].
key_cert_check_authority: invalid certificate
Certificate invalid: name is not a listed principal
user_n@pb-archive.psi.ch's password:
2019/11/06 11:05:04 The source folder /data/project/bio/myproject/archive is not centrally available (decentral use case).
The data must first be copied to a rsync cache server.
2019/11/06 11:05:04 Do you want to continue (Y/n)?
Y
2019/11/06 11:05:09 Created dataset with id 12.345.67890/12345678-1234-1234-1234-123456789012
2019/11/06 11:05:09 The dataset contains 108057 files.
2019/11/06 11:05:10 Created file block 0 from file 0 to 1000 with total size of 413229990 bytes
2019/11/06 11:05:10 Created file block 1 from file 1000 to 2000 with total size of 416024000 bytes
2019/11/06 11:05:10 Created file block 2 from file 2000 to 3000 with total size of 416024000 bytes
2019/11/06 11:05:10 Created file block 3 from file 3000 to 4000 with total size of 416024000 bytes
...
2019/11/06 11:05:26 Created file block 105 from file 105000 to 106000 with total size of 416024000 bytes
2019/11/06 11:05:27 Created file block 106 from file 106000 to 107000 with total size of 416024000 bytes
2019/11/06 11:05:27 Created file block 107 from file 107000 to 108000 with total size of 850195143 bytes
2019/11/06 11:05:27 Created file block 108 from file 108000 to 108057 with total size of 151904903 bytes
2019/11/06 11:05:27 short dataset id: 0a9fe316-c9e7-4cc5-8856-e1346dd31e31
2019/11/06 11:05:27 Running [/usr/bin/rsync -e ssh -avxz /data/project/bio/myproject/archive/ user_n@pb-archive.psi.ch:archive
/0a9fe316-c9e7-4cc5-8856-e1346dd31e31/data/project/bio/myproject/archive].
key_cert_check_authority: invalid certificate
Certificate invalid: name is not a listed principal
user_n@pb-archive.psi.ch's password:
Permission denied, please try again.
user_n@pb-archive.psi.ch's password:
/usr/libexec/test_acl.sh: line 30: /tmp/tmpacl.txt: Permission denied
/usr/libexec/test_acl.sh: line 30: /tmp/tmpacl.txt: Permission denied
/usr/libexec/test_acl.sh: line 30: /tmp/tmpacl.txt: Permission denied
/usr/libexec/test_acl.sh: line 30: /tmp/tmpacl.txt: Permission denied
/usr/libexec/test_acl.sh: line 30: /tmp/tmpacl.txt: Permission denied
...
2019/11/06 12:05:08 Successfully updated {"pid":"12.345.67890/12345678-1234-1234-1234-123456789012",...}
2019/11/06 12:05:08 Submitting Archive Job for the ingested datasets.
2019/11/06 12:05:08 Job response Status: okay
2019/11/06 12:05:08 A confirmation email will be sent to XXX
12.345.67890/12345678-1234-1234-1234-123456789012
```
### Publishing
After datasets are ingested they can be assigned a public DOI. This can be included in publications and will make the datasets available on <http://doi.psi.ch>.
For instructions on this, please read the ['Publish' section in the ingest manual](https://scicatproject.github.io/documentation/Ingestor/ingestManual.html#sec-8).
### Retrieving data
Retrieving data from the archive is also initiated through the Data Catalog. Please read the ['Retrieve' section in the ingest manual](https://scicatproject.github.io/documentation/Ingestor/ingestManual.html#sec-6).
## Further Information
* [PSI Data Catalog](https://discovery.psi.ch)
* [Full Documentation](https://scicatproject.github.io/documentation/Ingestor/ingestManual.html)
* [Published Datasets (doi.psi.ch)](https://doi.psi.ch)
* Data Catalog [PSI page](https://www.psi.ch/photon-science-data-services/data-catalog-and-archive)
* Data catalog [SciCat Software](https://scicatproject.github.io/)
* [FAIR](https://www.nature.com/articles/sdata201618) definition and [SNF Research Policy](http://www.snf.ch/en/theSNSF/research-policies/open_research_data/Pages/default.aspx#FAIR%20Data%20Principles%20for%20Research%20Data%20Management)
* [Petabyte Archive at CSCS](https://www.cscs.ch/fileadmin/user_upload/contents_publications/annual_reports/AR2017_Online.pdf)

View File

@@ -0,0 +1,42 @@
# Connecting from a Linux Client
## SSH without X11 Forwarding
This is the standard method. Official X11 support is provided through [NoMachine](nomachine.md).
For normal SSH sessions, use your SSH client as follows:
```bash
ssh $username@merlin-l-01.psi.ch
ssh $username@merlin-l-001.psi.ch
ssh $username@merlin-l-002.psi.ch
```
## SSH with X11 Forwarding
Official X11 Forwarding support is through NoMachine. Please refer to
[{Job Submission -> Interactive Jobs}](../slurm-general-docs/interactive-jobs.md#requirements) and
[{Accessing Merlin -> NoMachine}](nomachine.md) for more details. However,
we provide a small recipe for enabling X11 Forwarding in Linux below.
* To enable X11 forwarding for all SSH connections from your client (so that `-X` is implied), add the following
to the start of `~/.ssh/config` (a host-scoped variant is sketched at the end of this section):
```bash
ForwardAgent yes
ForwardX11 yes
ForwardX11Trusted yes
```
* Alternatively, you can add the option `-X` (or `-Y` for trusted X11 forwarding) to the `ssh` command. For example:
```bash
ssh -X $username@merlin-l-01.psi.ch
ssh -X $username@merlin-l-001.psi.ch
ssh -X $username@merlin-l-002.psi.ch
```
* To test that X11 forwarding works, just run `xclock`. An X11-based clock should
pop up in your client session:
```bash
xclock
```
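As a minimal sketch, the same options can also be scoped to the Merlin login nodes only (hostnames as used above), so that agent and X11 forwarding are not enabled for every host you connect to:
```bash
# ~/.ssh/config -- applies only to the Merlin login nodes
Host merlin-l-*
    ForwardAgent yes
    ForwardX11 yes
    ForwardX11Trusted yes
```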

View File

@@ -0,0 +1,60 @@
---
title: Connecting from a MacOS Client
#tags:
keywords: MacOS, mac os, mac, connecting, client, configuration, SSH, X11
last_updated: 07 September 2022
summary: "This document describes a recommended setup for a MacOS client."
sidebar: merlin6_sidebar
permalink: /merlin6/connect-from-macos.html
---
## SSH without X11 Forwarding
This is the standard method. Official X11 support is provided through [NoMachine](nomachine.md).
For normal SSH sessions, use your SSH client as follows:
```bash
ssh $username@merlin-l-01.psi.ch
ssh $username@merlin-l-001.psi.ch
ssh $username@merlin-l-002.psi.ch
```
## SSH with X11 Forwarding
### Requirements
For running SSH with X11 Forwarding on MacOS, one needs to have an X server running on MacOS.
The standard X server for MacOS is **[XQuartz](https://www.xquartz.org/)**. Please ensure
it is running before starting an SSH connection with X11 forwarding.
### SSH with X11 Forwarding in MacOS
Official X11 support is through NoMachine. Please refer to
[{Job Submission -> Interactive Jobs}](../slurm-general-docs/interactive-jobs.md#requirements) and
[{Accessing Merlin -> NoMachine}](nomachine.md) for more details. However,
we provide a small recipe for enabling X11 Forwarding in MacOS below.
* Ensure that **[XQuartz](https://www.xquartz.org/)** is installed and running on your MacOS system.
* To enable X11 forwarding for all SSH connections from your client (so that ``-X`` is implied), add the following
to the start of ``~/.ssh/config``:
```bash
ForwardAgent yes
ForwardX11 yes
ForwardX11Trusted yes
```
* Alternatively, you can add the option ``-X`` (or ``-Y`` for trusted X11 forwarding) to the ``ssh`` command. For example:
```bash
ssh -X $username@merlin-l-01.psi.ch
ssh -X $username@merlin-l-001.psi.ch
ssh -X $username@merlin-l-002.psi.ch
```
* To test that X11 forwarding works, just run ``xclock``. An X11-based clock should
pop up in your client session.
```bash
xclock
```

View File

@@ -0,0 +1,47 @@
---
title: Connecting from a Windows Client
keywords: microsoft, windows, putty, xming, connecting, client, configuration, SSH, X11
last_updated: 07 September 2022
summary: "This document describes a recommended setup for a Windows client."
sidebar: merlin6_sidebar
permalink: /merlin6/connect-from-windows.html
---
## SSH with PuTTY without X11 Forwarding
PuTTY is one of the most common tools for SSH.
Check if the following software packages are installed on the Windows workstation by
inspecting the *Start* menu (hint: use the *Search* box to save time):
* PuTTY (should be already installed)
* *[Optional]* Xming (needed for [SSH with X11 Forwarding](#ssh-with-putty-with-x11-forwarding))
If they are missing, you can install them using the Software Kiosk icon on the Desktop.
1. Start PuTTY
2. *[Optional]* Enable ``xterm`` to have similar mouse behaviour as in Linux:
![Enable 'xterm'](../../images/PuTTY/Putty_Mouse_XTerm.png)
3. Create a session to a Merlin login node and click *Open*:
![Create Merlin Session](../../images/PuTTY/Putty_Session.png)
## SSH with PuTTY with X11 Forwarding
Official X11 Forwarding support is through NoMachine. Please refer to
[{Job Submission -> Interactive Jobs}](../slurm-general-docs/interactive-jobs.md#requirements) and
[{Accessing Merlin -> NoMachine}](nomachine.md) for more details. However,
we provide a small recipe for enabling X11 Forwarding in Windows below.
Check if **Xming** is installed on the Windows workstation by inspecting the
*Start* menu (hint: use the *Search* box to save time). If missing, you can install it by
using the Software Kiosk icon (should be located on the Desktop).
1. Ensure that an X server (**Xming**) is running; otherwise, start it.
2. Enable X11 Forwarding in your SSH client. For example, in PuTTY:
![Enable X11 Forwarding in Putty](../../images/PuTTY/Putty_X11_Forwarding.png)

View File

@@ -0,0 +1,192 @@
---
title: Kerberos and AFS authentication
#tags:
keywords: kerberos, AFS, kinit, klist, keytab, tickets, connecting, client, configuration, slurm
last_updated: 07 September 2022
summary: "This document describes how to use Kerberos."
sidebar: merlin6_sidebar
permalink: /merlin6/kerberos.html
---
Projects and users have their own areas in the central PSI AFS service. In order
to access these areas, valid Kerberos and AFS tickets must be granted.
These tickets are automatically granted when accessing through SSH with
username and password. Alternatively, one can get a granting ticket with the `kinit` (Kerberos)
and `aklog` (AFS ticket, which needs to be run after `kinit`) commands.
Due to PSI security policies, the maximum lifetime of a ticket is 7 days, and the default
lifetime is 10 hours. This means that one needs to regularly renew the existing granting tickets
(with the `krenew` command), and their validity cannot be extended beyond 7 days. At that point,
one needs to obtain new granting tickets.
## Obtaining granting tickets with username and password
As already described above, the most common use case is to obtain Kerberos and AFS granting tickets
by entering username and password:
* When logging in to Merlin through SSH with username + password authentication,
tickets for Kerberos and AFS will be obtained automatically.
* When logging in to Merlin through NoMachine, no Kerberos or AFS tickets are granted. Therefore, users need to
run `kinit` (to obtain a granting Kerberos ticket) followed by `aklog` (to obtain a granting AFS ticket).
See further details below.
To manually obtain granting tickets, one has to:
1. To obtain a granting Kerberos ticket, one needs to run `kinit $USER` and enter the PSI password.
```bash
kinit $USER@D.PSI.CH
```
2. To obtain a granting ticket for AFS, one needs to run `aklog`. No password is necessary, but a valid
Kerberos ticket is mandatory.
```bash
aklog
```
3. To list the status of your granted tickets, users can use the `klist` command.
```bash
klist
```
4. To extend the validity of existing granting tickets, users can use the `krenew` command.
```bash
krenew
```
* Keep in mind that the maximum lifetime for granting tickets is 7 days, therefore `krenew` cannot be used beyond that limit;
after that, `kinit` must be used instead.
## Obtaining granting tickets with keytab
Sometimes, obtaining granting tickets by using password authentication is not possible. An example is user Slurm jobs
requiring access to private areas in AFS. For such cases, it is possible to generate a **keytab** file.
Be aware that the **keytab** file must be **private**, **fully protected** by correct permissions and not shared with any
other users.
### Creating a keytab file
For generating a **keytab**, one has to:
1. Load a newer Kerberos ( `krb5/1.20` or higher) from Pmodules:
```bash
module load krb5/1.20
```
2. Create a private directory for storing the Kerberos **keytab** file
```bash
mkdir -p ~/.k5
```
3. Run the `ktutil` utility which comes with the loaded `krb5` Pmodule:
```bash
ktutil
```
4. In the `ktutil` console, one has to generate a **keytab** file as follows:
```bash
# Replace $USER by your username
add_entry -password -k 0 -f -p $USER
wkt /psi/home/$USER/.k5/krb5.keytab
exit
```
Notice that you will need to add your password once. This step is required for generating the **keytab** file.
5. Once back to the main shell, one has to ensure that the file contains the proper permissions:
```bash
chmod 0600 ~/.k5/krb5.keytab
```
### Obtaining tickets by using keytab files
Once the keytab is created, one can obtain Kerberos tickets without being prompted for a password, as follows:
```bash
kinit -kt ~/.k5/krb5.keytab $USER
aklog
```
## Slurm jobs accessing AFS
Some jobs may require access to private areas in AFS. For that, a valid [**keytab**](#creating-a-keytab-file) file is required.
Then, from inside the batch script one can obtain granting tickets for Kerberos and AFS, which can be used for accessing AFS private areas.
The steps should be the following:
* Set up `KRB5CCNAME`, which specifies the location of the Kerberos5 credential (ticket) cache. In general it should point to a shared area
(`$HOME/.k5` is a good location), and it is strongly recommended to generate an independent Kerberos5 credential cache (that is, a new credential cache per Slurm job):
```bash
export KRB5CCNAME="$(mktemp "$HOME/.k5/krb5cc_XXXXXX")"
```
* To obtain a Kerberos5 granting ticket, run `kinit` by using your keytab:
```bash
kinit -kt "$HOME/.k5/krb5.keytab" $USER@D.PSI.CH
```
* To obtain a granting AFS ticket, run `aklog`:
```bash
aklog
```
* At the end of the job, you can destroy the existing Kerberos tickets.
```bash
kdestroy
```
### Slurm batch script example: obtaining KRB+AFS granting tickets
#### Example 1: Independent credential cache per Slurm job
This is the **recommended** way. At the end of the job, it is strongly recommended to destroy the existing Kerberos tickets.
```bash
#!/bin/bash
#SBATCH --partition=hourly # Specify 'general' or 'daily' or 'hourly'
#SBATCH --time=01:00:00 # Strictly recommended when using 'general' partition.
#SBATCH --output=run.out # Generate custom output file
#SBATCH --error=run.err # Generate custom error file
#SBATCH --nodes=1 # Number of nodes to allocate
#SBATCH --ntasks=1 # Number of tasks to run
#SBATCH --cpus-per-task=1
#SBATCH --constraint=xeon-gold-6152
#SBATCH --hint=nomultithread
#SBATCH --job-name=krb5
export KRB5CCNAME="$(mktemp "$HOME/.k5/krb5cc_XXXXXX")"
kinit -kt "$HOME/.k5/krb5.keytab" $USER@D.PSI.CH
aklog
klist
echo "Here should go my batch script code."
# Destroy Kerberos tickets created for this job only
kdestroy
klist
```
#### Example 2: Shared credential cache
Some users may need or prefer to run with a shared cache file. For doing that, one needs to
set up `KRB5CCNAME` in the **login node** session, before submitting the job.
```bash
export KRB5CCNAME="$(mktemp "$HOME/.k5/krb5cc_XXXXXX")"
```
Then, you can run one or multiple job scripts (or a parallel job with `srun`). `KRB5CCNAME` will be propagated to the
job script or to the parallel job, therefore a single credential cache will be shared amongst the different Slurm runs.
```bash
#!/bin/bash
#SBATCH --partition=hourly # Specify 'general' or 'daily' or 'hourly'
#SBATCH --time=01:00:00 # Strictly recommended when using 'general' partition.
#SBATCH --output=run.out # Generate custom output file
#SBATCH --error=run.err # Generate custom error file
#SBATCH --nodes=1 # Number of nodes to allocate
#SBATCH --ntasks=1 # Number of tasks to run
#SBATCH --cpus-per-task=1
#SBATCH --constraint=xeon-gold-6152
#SBATCH --hint=nomultithread
#SBATCH --job-name=krb5
# KRB5CCNAME is inherited from the login node session
kinit -kt "$HOME/.k5/krb5.keytab" $USER@D.PSI.CH
aklog
klist
echo "Here should go my batch script code."
echo "No need to run 'kdestroy', as it may have to survive for running other jobs"
```

View File

@@ -0,0 +1,107 @@
# Remote Desktop Access
Users can log in to Merlin through a Linux Remote Desktop Session. NoMachine
is a desktop virtualization tool, similar to VNC, Remote Desktop, etc.
It uses the NX protocol to enable a graphical login to remote servers.
## Installation
NoMachine is available for PSI Windows computers in the Software Kiosk under the
name **NX Client**. Please use the latest version (at least 6.0). For MacOS and
Linux, the NoMachine client can be downloaded from <https://www.nomachine.com/>.
## Accessing Merlin6 NoMachine from PSI
The Merlin6 NoMachine service is hosted in the following machine:
* **`merlin-nx.psi.ch`**
This is the **front-end** (that is, *the entry point*) to the NoMachine **back-end nodes**,
which run the NoMachine desktop service. The **back-end nodes** are the following:
* `merlin-l-001.psi.ch`
* `merlin-l-002.psi.ch`
Any access to the login node desktops must be done through **`merlin-nx.psi.ch`**
(or from **`rem-acc.psi.ch -> merlin-nx.psi.ch`** when connecting from outside PSI).
The **front-end** service running on **`merlin-nx.psi.ch`** load-balances the sessions
and logs you in to one of the available nodes in the **back-end**.
**Only one session per back-end node** is possible.
Below are explained all the steps necessary for configuring access to the
NoMachine service running on a login node.
### Creating a Merlin6 NoMachine connection
#### Adding a new connection to the front-end
Click the **Add** button to create a new connection to the **`merlin-nx.psi.ch` front-end**, and fill in
the following fields:
* **Name**: Specify a custom name for the connection. Examples: `merlin-nx`, `merlin-nx.psi.ch`, `Merlin Desktop`
* **Host**: Specify the hostname of the **front-end** service: **`merlin-nx.psi.ch`**
* **Protocol**: specify the protocol that will be used for the connection. *Recommended* protocol: **`NX`**
* **Port**: Specify the listening port of the **front-end**. It must be **`4000`**.
![Create New NoMachine Connection](../../images/nomachine/screen_nx_connect.png)
#### Configuring NoMachine Authentication Method
Depending on the client version, it may ask for different authentication options.
If required, choose your authentication method and **Continue** (**Password** or *Kerberos* are the recommended ones).
You will be asked for your credentials (username / password). **Do not add `PSICH\`** as a prefix for the username.
### Opening NoMachine desktop sessions
By default, when connecting to the **`merlin-nx.psi.ch` front-end** it will automatically open a new
session if none exists.
If there are existing sessions, instead of opening a new desktop session, users can reconnect to an
existing one by clicking the corresponding icon (see image below).
![Open an existing Session](../../images/nomachine/screen_nx_existingsession.png)
Users can also create a second desktop session by selecting the **`New Desktop`** button (*red* rectangle in the
image below). This will create a second session on the second login node, as long as that node is up and running.
![Open a New Desktop](../../images/nomachine/screen_nx_newsession.png)
### NoMachine LightDM Session Example
An example of a NoMachine session, which is based on the [LightDM](https://github.com/canonical/lightdm)
display manager:
![NoMachine Session: LightDM Desktop](../../images/nomachine/screen_nx11.png)
## Accessing Merlin6 NoMachine from outside PSI
### No VPN access
Access to the Merlin6 NoMachine service is possible without VPN through **'rem-acc.psi.ch'**.
Please follow the steps described in [PSI Remote Interactive Access](https://www.psi.ch/en/photon-science-data-services/remote-interactive-access) for
remote access to the Merlin6 NoMachine service. Once logged in to **'rem-acc.psi.ch'**, you must then log in to the
**`merlin-nx.psi.ch` front-end**.
### VPN access
Remote access is also possible through VPN; in this case you **must not use 'rem-acc.psi.ch'**, and you have to connect directly
to the Merlin6 NoMachine **`merlin-nx.psi.ch` front-end**, as if you were inside PSI. VPN access should be requested
from the IT department by opening a PSI Service Now ticket:
[VPN Access (PSI employees)](https://psi.service-now.com/psisp?id=psi_new_sc_cat_item&sys_id=beccc01b6f44a200d02a82eeae3ee440).
## Advanced Display Settings
**NoMachine Display Settings** can be accessed and changed either when creating a new session or by clicking the very top right corner of a running session.
### Prevent Rescaling
These settings prevent "blurriness" at the cost of some performance, so you may want to choose depending on your performance needs.
* Display > Resize remote display (forces 1:1 pixel sizes)
* Display > Change settings > Quality: Choose Medium-Best Quality
* Display > Change settings > Modify advanced settings
* Check: Disable network-adaptive display quality (disables lossy compression)
* Check: Disable client side image post-processing

View File

@@ -0,0 +1,159 @@
---
title: Configuring SSH Keys in Merlin
#tags:
keywords: linux, connecting, client, configuration, SSH, Keys, SSH-Keys, RSA, authorization, authentication
last_updated: 15 Jul 2020
summary: "This document describes how to deploy SSH Keys in Merlin."
sidebar: merlin6_sidebar
permalink: /merlin6/ssh-keys.html
---
Merlin users sometimes need to access the different Merlin services without being constantly prompted for a password.
One can achieve that with Kerberos authentication; however, in some cases software requires the setup of SSH keys.
One example is ANSYS Fluent: when used interactively, the GUI communicates with the different nodes
through the SSH protocol, and the use of SSH keys is enforced.
## Setting up SSH Keys on Merlin
For security reasons, users **must always protect SSH Keys with a passphrase**.
Users can check whether an SSH key already exists. Keys are placed in the **`~/.ssh/`** directory. `RSA` encryption
is usually the default, and the corresponding files are **`id_rsa`** (private key) and **`id_rsa.pub`** (public key).
```bash
ls ~/.ssh/id*
```
For creating **SSH RSA Keys**, one should:
1. Run `ssh-keygen` (a minimal command sketch follows this list); a passphrase will be requested twice. You **must remember** this passphrase for the future.
* For security reasons, ***always try to protect it with a passphrase***. The only exception is when running ANSYS software, which in general should use an empty passphrase to simplify running the software in Slurm.
* This will generate a private key **id_rsa** and a public key **id_rsa.pub** in your **~/.ssh** directory.
2. Add your public key to the **`authorized_keys`** file, and ensure proper permissions for that file, as follows:
```bash
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
```
3. Configure the SSH client in order to force the usage of the **psi.ch** domain for trusting keys:
```bash
echo "CanonicalizeHostname yes" >> ~/.ssh/config
```
4. Configure further SSH options as follows:
```bash
echo "AddKeysToAgent yes" >> ~/.ssh/config
echo "ForwardAgent yes" >> ~/.ssh/config
```
Other options may be added.
5. Check that your SSH config file contains at least the lines mentioned in steps 3 and 4:
```bash
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# cat ~/.ssh/config
CanonicalizeHostname yes
AddKeysToAgent yes
ForwardAgent yes
```
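For step 1, this is a minimal sketch of the key generation command (the RSA key type and 4096-bit size shown here are assumptions; accept the default file location and choose a passphrase when prompted):
```bash
ssh-keygen -t rsa -b 4096
```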
## Using the SSH Keys
### Using Authentication Agent in SSH session
By default, when accessing the login node via SSH (with `ForwardAgent=yes`), your
SSH keys will automatically be added to the authentication agent. Hence, no action should be needed by the user. One can configure
`ForwardAgent=yes` as follows:
* **(Recommended)** In your local Linux (workstation, laptop or desktop) add the following line in the
`$HOME/.ssh/config` (or alternatively in `/etc/ssh/ssh_config`) file:
```
ForwardAgent yes
```
* Alternatively, you can add the option `-o ForwardAgent=yes` to each SSH command. For example:
```bash
ssh -XY -o ForwardAgent=yes merlin-l-001.psi.ch
```
If `ForwardAgent` is not enabled as shown above, one needs to run the authentication agent and then add your key
to the **ssh-agent**. This must be done once per SSH session, as follows:
* Run `eval $(ssh-agent -s)` to run the **ssh-agent** in that SSH session
* Check whether the authentication agent has your key already added:
```bash
ssh-add -l | grep "/psi/home/$(whoami)/.ssh"
```
* If no key is returned in the previous step, you have to add the private key identity to the authentication agent.
You will be prompted for the **passphrase** of your key; this can be done by running:
```bash
ssh-add
```
### Using Authentication Agent in NoMachine Session
By default, when using a NoMachine session, the `ssh-agent` should be started automatically. Hence, there is no need to
start the agent or forward it.
However, for NoMachine one always needs to add the private key identity to the authentication agent. This can be done as follows:
1. Check whether the authentication agent has already the key added:
```bash
ssh-add -l | grep "/psi/home/$(whoami)/.ssh"
```
2. If no key is returned in the previous step, you have to add the private key identity to the authentication agent.
You will be prompted for the **passphrase** of your key; this can be done by running:
```bash
ssh-add
```
You just need to run it once per NoMachine session, and it will apply to all terminal windows within that NoMachine session.
## Troubleshooting
### Errors when running 'ssh-add'
If the error `Could not open a connection to your authentication agent.` appears when running `ssh-add`, it means
that the authentication agent is not running. Please follow the previous procedures for starting it.
### Add/Update SSH RSA Key password
If an existing SSH key does not have a passphrase, or you want to update an existing passphrase with a new one, you can do it as follows:
```bash
ssh-keygen -p -f ~/.ssh/id_rsa
```
### SSH Keys deployed but not working
Please ensure proper permissions on the files involved, and check for typos in their names:
```bash
chmod u+rwx,go-rwx,g+s ~/.ssh
chmod u+rw-x,go-rwx ~/.ssh/authorized_keys
chmod u+rw-x,go-rwx ~/.ssh/id_rsa
chmod u+rw-x,go+r-wx ~/.ssh/id_rsa.pub
```
### Testing SSH Keys
Once the SSH key is created, you can test that it is valid as follows:
1. Create a **new** SSH session in one of the login nodes:
```bash
ssh merlin-l-001
```
2. In the login node session, destroy any existing Kerberos ticket or active SSH Key:
```bash
kdestroy
ssh-add -D
```
3. Add the new private key identity to the authentication agent. You will be requested by the passphrase.
```bash
ssh-add
```
4. Check that your key is known to the SSH agent:
```bash
ssh-add -l
```
5. SSH to the second login node. No password should be requested:
```bash
ssh -vvv merlin-l-002
```
If the last step succeeds, it means that your SSH key is properly set up.

View File

@@ -0,0 +1,195 @@
# Merlin6 Storage
## Introduction
This document describes the different directories of the Merlin6 cluster.
### User and project data
* ***Users are responsible for backing up their own data***. It is recommended to back up the data on independent third-party systems (e.g. LTS, Archive, AFS, SwitchDrive, Windows Shares, etc.).
* **`/psi/home`**, as it contains only a small amount of data, is the only directory for which we can provide daily snapshots for one week. These can be found in the directory **`/psi/home/.snapshot/`**
* ***When a user leaves PSI, the user or their supervisor/team is responsible for backing up and moving the data out of the cluster***: every few months, the storage space of former users without an existing, valid PSI account will be recycled.
!!! warning
    When a user leaves PSI and their account has been removed, their storage space in Merlin may be recycled.
    Hence, **when a user leaves PSI**, they, their supervisor or their team **must ensure that the data is backed up to an external storage system**.
### Checking user quota
For each directory, we provide a way for checking quotas (when required). However, a single command ``merlin_quotas``
is provided. This is useful to show with a single command all quotas for your filesystems (including AFS, which is not mentioned here).
To check your quotas, please run:
```bash
merlin_quotas
```
## Merlin6 directories
Merlin6 offers the following directory classes for users:
* ``/psi/home/<username>``: Private user **home** directory
* ``/data/user/<username>``: Private user **data** directory
* ``/data/project/general/<projectname>``: Shared **Project** directory
* For BIO experiments, a dedicated ``/data/project/bio/$projectname`` exists.
* ``/scratch``: Local *scratch* disk (only visible by the node running a job).
* ``/shared-scratch``: Shared *scratch* disk (visible from all nodes).
* ``/export``: Export directory for data transfer, visible from `ra-merlin-01.psi.ch`, `ra-merlin-02.psi.ch` and Merlin login nodes.
* Refer to **[Transferring Data](../how-to-use-merlin/transfer-data.md)** for more information about the export area and data transfer service.
!!! tip
    In GPFS there is a concept called **GraceTime**. Filesystems have a block
    (amount of data) and a file (number of files) quota. These quotas have soft
    and hard limits. Once the soft limit is reached, users can keep writing up to
    their hard limit quota during the **grace period**. Once the **GraceTime** or the hard
    limit is reached, users will be unable to write and will need to remove data
    below the soft limit (or ask for a quota increase where possible, see the
    table below).
Properties of the directory classes:
| Directory | Block Quota [Soft:Hard] | File Quota [Soft:Hard] | GraceTime | Quota Change Policy: Block | Quota Change Policy: Files | Backup | Backup Policy |
| ---------------------------------- | ----------------------- | ----------------------- | :-------: | :--------------------------------- |:-------------------------------- | ------ | :----------------------------- |
| /psi/home/$username | USR [10GB:11GB] | *Undef* | N/A | Up to x2 when strongly justified. | N/A | yes | Daily snapshots for 1 week |
| /data/user/$username | USR [1TB:1.074TB] | USR [1M:1.1M] | 7d | Immutable. Need a project. | Changeable when justified. | no | Users responsible for backup |
| /data/project/bio/$projectname | GRP [1TB:1.074TB] | GRP [1M:1.1M] | 7d | Subject to project requirements. | Subject to project requirements. | no | Project responsible for backup |
| /data/project/general/$projectname | GRP [1TB:1.074TB] | GRP [1M:1.1M] | 7d | Subject to project requirements. | Subject to project requirements. | no | Project responsible for backup |
| /scratch | *Undef* | *Undef* | N/A | N/A | N/A | no | N/A |
| /shared-scratch | USR [512GB:2TB] | USR [2M:2.5M] | 7d | Up to x2 when strongly justified. | Changeable when justified. | no | N/A |
| /export | USR [10MB:20TB] | USR [512K:5M] | 10d | Soft can be temporary increased. | Changeable when justified. | no | N/A |
!!! warning
The use of **scratch** and **export** areas as an extension of the quota
_is forbidden_. **scratch** and **export** areas _must not contain_ final
data.
**_Auto cleanup policies_** in the **scratch** and **export** areas are applied.
### User home directory
This is the default directory users will land in when logging in to any Merlin6 machine.
It is intended for your scripts, documents, software development, and other files which
you want to have backed up. Do not use it for large data or I/O-hungry HPC tasks.
This directory is mounted in the login and computing nodes under the path:
```bash
/psi/home/$username
```
Home directories are part of the PSI NFS Central Home storage provided by AIT and
are managed by the Merlin6 administrators.
Users can check their quota by running the following command:
```bash
quota -s
```
#### Home directory policy
* Read **[Important: Code of Conduct](../quick-start-guide/code-of-conduct.md)** for more information about Merlin6 policies.
* It is **forbidden** to use the home directories for I/O intensive tasks
* Use `/scratch`, `/shared-scratch`, `/data/user` or `/data/project` for this purpose.
* Users can retrieve lost data from up to 1 week ago thanks to the automatic **daily snapshots**, which are kept for 1 week.
Snapshots can be accessed at this path:
```bash
/psi/home/.snapshot/$username
```
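A minimal sketch of recovering a lost file from the snapshots (the layout below the snapshot path, as well as the snapshot and file names, are assumptions; list the directory first to see how it is organized):
```bash
# Inspect the available snapshots for your home directory
ls /psi/home/.snapshot/$USER
# Copy a lost file back into your home (hypothetical names)
cp /psi/home/.snapshot/$USER/<snapshot>/notes.txt ~/notes.txt
```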
### User data directory
The user data directory is intended for *fast IO access* and keeping large amounts of private data.
This directory is mounted in the login and computing nodes under the directory
```bash
/data/user/$username
```
Users can check their quota by running the following command:
```bash
mmlsquota -u <username> --block-size auto merlin-user
```
#### User data directory policy
* Read **[Important: Code of Conduct](../quick-start-guide/code-of-conduct.md)** for more information about Merlin6 policies.
* It is **forbidden** to use the data directories as a ``scratch`` area during a job's runtime.
* Use ``/scratch`` or ``/shared-scratch`` for this purpose.
* No backup policy is applied for user data directories: users are responsible for backing up their data.
### Project data directory
This storage is intended for *fast IO access* and for keeping large amounts of a project's data, where the data can also be
shared by all members of the project (the project's corresponding unix group). We recommend keeping most data in
project-related storage spaces, since it allows users to coordinate. Also, project spaces have more flexible policies
regarding extending the available storage space.
Experiments can request a project space as described in **[[Accessing Merlin -> Requesting a Project]](../quick-start-guide/requesting-projects.md)**
Once created, the project data directory will be mounted in the login and computing nodes under the directory:
```bash
/data/project/general/$projectname
```
Project quotas are defined on a per *group* basis. Users can check the project quota by running the following command:
```bash
mmlsquota -j $projectname --block-size auto -C merlin.psi.ch merlin-proj
```
#### Project Directory policy
* Read **[Important: Code of Conduct](../quick-start-guide/code-of-conduct.md)** for more information about Merlin6 policies.
* It is **forbidden** to use the data directories as a `scratch` area during a job's runtime, i.e. for high throughput I/O for a job's temporary files. Please use `/scratch` or `/shared-scratch` for this purpose.
* No backups: users are responsible for managing the backups of their data directories.
### Scratch directories
There are two different types of scratch storage: **local** (`/scratch`) and **shared** (`/shared-scratch`).
**local** scratch should be used for all jobs that do not require the scratch files to be accessible from multiple nodes, which is trivially
true for all jobs running on a single node.
**shared** scratch is intended for files that need to be accessible by multiple nodes, e.g. by a MPI-job where tasks are spread out over the cluster
and all tasks need to do I/O on the same temporary files.
**local** scratch in Merlin6 computing nodes provides a huge number of IOPS thanks to the NVMe technology. **Shared** scratch is implemented using a distributed parallel filesystem (GPFS) resulting in a higher latency, since it involves remote storage resources and more complex I/O coordination.
`/shared-scratch` is only mounted in the *Merlin6* computing nodes (i.e. not on the login nodes), and its current size is 50TB. This can be increased in the future.
The properties of the available scratch storage spaces are given in the following table
| Cluster | Service | Scratch | Scratch Mountpoint | Shared Scratch | Shared Scratch Mountpoint | Comments |
| ------- | -------------- | ------------ | ------------------ | -------------- | ------------------------- | ------------------------------------ |
| merlin5 | computing node | 50GB / SAS | `/scratch` | `N/A` | `N/A` | `merlin-c-[01-64]` |
| merlin6 | login node | 100GB / SAS | `/scratch` | 50TB / GPFS | `/shared-scratch` | `merlin-l-0[1,2]` |
| merlin6 | computing node | 1.3TB / NVMe | `/scratch` | 50TB / GPFS | `/shared-scratch` | `merlin-c-[001-024,101-124,201-224]` |
| merlin6 | login node | 2.0TB / NVMe | `/scratch` | 50TB / GPFS | `/shared-scratch` | `merlin-l-00[1,2]` |
#### Scratch directories policy
* Read **[Important: Code of Conduct](../quick-start-guide/code-of-conduct.md)** for more information about Merlin6 policies.
* By default, *always* use **local** first and only use **shared** if your specific use case requires it.
* Temporary files *must be deleted at the end of the job by the user* (a minimal example is sketched after this list).
* Remaining files will be deleted by the system if detected.
* Files not accessed within 28 days will be automatically cleaned up by the system.
* If for some reason the scratch areas get full, admins have the right to clean up the oldest data.
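A minimal sketch of this pattern in a batch job (partition, time limit and paths are assumptions; adapt them to your own job):
```bash
#!/bin/bash
#SBATCH --partition=hourly
#SBATCH --time=01:00:00

# Create a per-job directory on the node-local scratch disk
SCRATCHDIR="/scratch/$USER/$SLURM_JOB_ID"
mkdir -p "$SCRATCHDIR"

# ... run your application here, writing temporary files to $SCRATCHDIR ...

# Clean up at the end of the job, as required by the policy above
rm -rf "$SCRATCHDIR"
```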
### Export directory
The export directory is exclusively intended for transferring data from outside PSI to Merlin and vice versa. It is a temporary directory with an auto-cleanup policy.
Please read **[Transferring Data](../how-to-use-merlin/transfer-data.md)** for more information about it.
#### Export directory policy
* Temporary files *must be deleted at the end of the job by the user*.
* Remaining files will be deleted by the system if detected.
* Files not accessed within 28 days will be automatically cleaned up by the system.
* If for some reason the export area gets full, admins have the right to clean up the oldest data.

View File

@@ -0,0 +1,165 @@
---
title: Transferring Data
#tags:
keywords: transferring data, data transfer, rsync, winscp, copy data, copying, sftp, import, export, hopx, vpn
last_updated: 24 August 2023
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/transfer-data.html
---
## Overview
Most methods allow data to be either transmitted or received, so it may make sense to
initiate the transfer from either Merlin or the other system, depending on the network
visibility.
- Merlin login nodes are visible from the PSI network, so direct data transfer
(rsync/WinSCP) is generally preferable. This can be initiated from either endpoint.
- Merlin login nodes can access the internet using a limited set of protocols
- SSH-based protocols using port 22 (rsync-over-ssh, sftp, WinSCP, etc)
- HTTP-based protocols using ports 80 or 443 (https, WebDav, etc)
- Protocols using other ports require admin configuration and may only work with
specific hosts (ftp, rsync daemons, etc)
- Systems on the internet can access the [PSI Data Transfer](https://www.psi.ch/en/photon-science-data-services/data-transfer) service
`datatransfer.psi.ch`, using ssh-based protocols and [Globus](https://www.globus.org/)
## Direct transfer via Merlin6 login nodes
The following methods transfer data directly via the [login
nodes](../quick-start-guide/accessing-interactive-nodes.md). They are suitable
for use from within the PSI network.
### Rsync
Rsync is the preferred method to transfer data from Linux/MacOS. It allows
transfers to be easily resumed if they get interrupted. The general syntax is:
```
rsync -avAHXS <src> <dst>
```
For example, to transfer files from your local computer to a merlin project
directory:
```
rsync -avAHXS ~/localdata user@merlin-l-01.psi.ch:/data/project/general/myproject/
```
You can resume interrupted transfers by simply rerunning the command. Previously
transferred files will be skipped.
### WinSCP
The WinSCP tool can be used for remote file transfer on Windows. It is available
from the Software Kiosk on PSI machines. Add `merlin-l-01.psi.ch` as a host and
connect with your PSI credentials. You can then drag-and-drop files between your
local computer and merlin.
### SWITCHfilesender
**[SWITCHfilesender](https://filesender.switch.ch/filesender2/?s=upload)** is an installation of the FileSender project (filesender.org) which is a web based application that allows authenticated users to securely and easily send arbitrarily large files to other users.
Authentication of users is provided through SimpleSAMLphp, supporting SAML2, LDAP and RADIUS and more. Users without an account can be sent an upload voucher by an authenticated user. FileSender is developed to the requirements of the higher education and research community.
The purpose of the software is to send a large file to someone, have that file available for download for a certain number of downloads and/or a certain amount of time, and after that automatically delete the file. The software is not intended as a permanent file publishing platform.
**[SWITCHfilesender](https://filesender.switch.ch/filesender2/?s=upload)** is fully integrated with PSI, therefore, PSI employees can log in by using their PSI account (through Authentication and Authorization Infrastructure / AAI, by selecting PSI as the institution to be used for log in).
## PSI Data Transfer
From August 2024, Merlin is connected to the **[PSI Data Transfer](https://www.psi.ch/en/photon-science-data-services/data-transfer)** service,
`datatransfer.psi.ch`. This is a central service managed by the **[Linux team](https://linux.psi.ch/index.html)**. However, any problems or questions related to it can be directly
[reported](../99-support/contact.md) to the Merlin administrators, who will forward the request if necessary.
The PSI Data Transfer servers supports the following protocols:
* Data Transfer - SSH (scp / rsync)
* Data Transfer - Globus
Notice that `datatransfer.psi.ch` does not allow SSH login, only `rsync`, `scp` and [Globus](https://www.globus.org/) access is allowed.
The following filesystems are mounted:
* `/merlin/export` which points to the `/export` directory in Merlin.
* `/merlin/data/experiment/mu3e` which points to the `/data/experiment/mu3e` directories in Merlin.
* Mu3e sub-directories are mounted in RW (read-write), except for `data` (read-only mounted)
* `/merlin/data/project/general` which points to the `/data/project/general` directories in Merlin.
* Owners of Merlin projects should request explicit access to it.
* Currently, only `CSCS` is available for transferring files between PizDaint/Alps and Merlin
* `/merlin/data/project/bio` which points to the `/data/project/bio` directories in Merlin.
* `/merlin/data/user` which points to the `/data/user` directories in Merlin.
Access to the PSI Data Transfer uses ***Multi factor authentication*** (MFA).
Therefore, having the Microsoft Authenticator App is required as explained [here](https://www.psi.ch/en/computing/change-to-mfa).
!!! tip "Official Documentation"
Please follow the [Official PSI Data Transfer](https://www.psi.ch/en/photon-science-data-services/data-transfer) documentation for further instructions.
### Directories
#### /merlin/data/user
User data directories are mounted in RW.
!!! warning "Secure Permissions"
    Please **ensure properly secured permissions** on your `/data/user` directory. By default, when the directory is created, the system applies the most restrictive permissions. However, this does not prevent users from changing permissions if they wish. In that case, users become responsible for those changes.
#### /merlin/export
Transferring large amounts of data from outside PSI to Merlin is always possible through `/export`.
!!! tip "Export Directory Access"
    The `/export` directory can be used by any Merlin user. This is configured in Read/Write mode. If you need access, please contact the Merlin administrators.
!!! warning "Export Usage Policy"
    The use of **export** as an extension of the quota *is forbidden*.
    Auto cleanup policies in the **export** area apply to files older than 28 days.
##### Exporting data from Merlin
For exporting data from Merlin to outside PSI by using `/export`, one has to:
* From a Merlin login node, copy your data from any directory (e.g. `/data/project`, `/data/user`, `/scratch`) to
`/export`. Make sure to secure your directories and files with proper permissions.
* Once the data is copied, use **`datatransfer.psi.ch`** to copy it from `/merlin/export` to a system outside PSI, as sketched below.
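A minimal sketch of these two steps with `rsync` (the per-user subdirectory under `/export`, the dataset name and the remote destination path are assumptions; the exact invocation from the remote side may differ, see the official PSI Data Transfer documentation):
```bash
# Step 1: on a Merlin login node, stage the data in /export
mkdir -p /export/$USER
rsync -avAHXS /data/user/$USER/results /export/$USER/

# Step 2: on the system outside PSI, pull the staged data
# through datatransfer.psi.ch (MFA will be requested)
rsync -av $USER@datatransfer.psi.ch:/merlin/export/$USER/results /path/on/remote/system/
```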
##### Importing data to Merlin
For importing data from outside PSI to Merlin by using `/export`, one has to:
* From **`datatransfer.psi.ch`**, copy the data from outside PSI to `/merlin/export`.
Make sure to secure your directories and files with proper permissions.
* Once the data is copied, copy it from `/export` to any directory (e.g. `/data/project`, `/data/user`, `/scratch`) from a Merlin login node.
#### Request access to your project directory
Optionally, instead of using `/export`, Merlin project owners can request Read/Write or Read/Only access to their project directory.
!!! tip "Project Access"
    Merlin projects can request direct access. This can be configured in Read/Write or Read/Only modes. If your project needs access, please contact the Merlin administrators.
## Connecting to Merlin6 from outside PSI
Merlin6 is fully accessible from within the PSI network. To connect from outside you can use:
- [VPN](https://www.psi.ch/en/computing/vpn) ([alternate instructions](https://intranet.psi.ch/BIO/ComputingVPN))
- [SSH hopx](https://www.psi.ch/en/computing/ssh-hop)
* Please avoid transferring large amounts of data through **hopx**
- [No Machine](nomachine.md)
* Remote Interactive Access through [**'rem-acc.psi.ch'**](https://www.psi.ch/en/photon-science-data-services/remote-interactive-access)
* Please avoid transferring large amounts of data through **NoMachine**
## Connecting from Merlin6 to outside file shares
### `merlin_rmount` command
Merlin provides a command for mounting remote file systems, called `merlin_rmount`. This
provides a helpful wrapper over the Gnome storage utilities, and provides support for a wide range of remote file formats, including
- SMB/CIFS (Windows shared folders)
- WebDav
- AFP
- FTP, SFTP
- [others](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/using_the_desktop_environment_in_rhel_8/managing-storage-volumes-in-gnome_using-the-desktop-environment-in-rhel-8#gvfs-back-ends_managing-storage-volumes-in-gnome)
[More instruction on using `merlin_rmount`](../software-support/merlin-rmount.md)
