initial formatting changes complete

This commit is contained in:
2026-01-06 16:40:15 +01:00
parent f58c1f57b8
commit 7db5d0fd05
81 changed files with 805 additions and 1112 deletions

View File

@@ -1,17 +1,9 @@
---
title: Downtimes
#tags:
#keywords:
last_updated: 28 June 2019
#summary: "Merlin 6 cluster overview"
sidebar: merlin6_sidebar
permalink: /merlin6/downtimes.html
---
# Downtimes
On the first Monday of each month the Merlin6 cluster might be subject to interruption due to maintenance.
On the first Monday of each month the Merlin6 cluster might be subject to interruption due to maintenance.
Users will be informed with at least one week in advance when a downtime is scheduled for the next month.
Downtimes will be informed to users through the <merlin-users@lists.psi.ch> mail list. Also, a detailed description
Downtimes will be informed to users through the <merlin-users@lists.psi.ch> mail list. Also, a detailed description
for the nexts scheduled interventions will be available in [Next Scheduled Downtimes](#next-scheduled-downtimes)).
---
@@ -21,12 +13,14 @@ for the nexts scheduled interventions will be available in [Next Scheduled Downt
Scheduled downtimes mostly affecting the storage and Slurm configurantions may require draining the nodes.
When this is required, users will be informed accordingly. Two different types of draining are possible:
* **soft drain**: new jobs may be queued on the partition, but queued jobs may not be allocated nodes and run from the partition.
Jobs already running on the partition continue to run. This will be the **default** drain method.
* **hard drain**: no new jobs may be queued on the partition (job submission requests will be denied with an error message),
but jobs already queued on the partition may be allocated to nodes and run.
* **soft drain**: new jobs may be queued on the partition, but queued jobs may not be allocated nodes and run from the partition.
Unless explicitly specified, the default draining policy for each partition will be the following:
Jobs already running on the partition continue to run. This will be the **default** drain method.
* **hard drain**: no new jobs may be queued on the partition (job submission requests will be denied with an error message),
but jobs already queued on the partition may be allocated to nodes and run.
Unless explicitly specified, the default draining policy for each partition will be the following:
* The **daily** and **general** partitions will be soft drained 12h before the downtime.
* The **hourly** partition will be soft drained 1 hour before the downtime.

View File

@@ -1,12 +1,4 @@
---
title: Past Downtimes
#tags:
#keywords:
last_updated: 03 September 2019
#summary: "Merlin 6 cluster overview"
sidebar: merlin6_sidebar
permalink: /merlin6/past-downtimes.html
---
# Past Downtimes
## Past Downtimes: Log Changes

View File

@@ -1,12 +1,4 @@
---
title: Contact
#tags:
keywords: contact, support, snow, service now, mailing list, mailing, email, mail, merlin-admins@lists.psi.ch, merlin-users@lists.psi.ch, merlin users
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/contact.html
---
# Contact
## Support

View File

@@ -1,14 +1,4 @@
---
title: FAQ
#tags:
keywords: faq, frequently asked questions, support
last_updated: 27 October 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/faq.html
---
{%include toc.html %}
# FAQ
## How do I register for Merlin?
@@ -35,7 +25,7 @@ How to install depends a bit on the software itself. There are three common inst
2. *source compilation* using make/cmake/autoconfig/etc. Usually the compilation scripts accept a `--prefix=/data/user/$USER` directory for where to install it. Then they place files under `<prefix>/bin`, `<prefix>/lib`, etc. The exact syntax should be documented in the installation instructions.
3. *conda environment*. This is now becoming standard for python-based software, including lots of the AI tools. First follow the [initial setup instructions](../software-support/python.md#anaconda) to configure conda to use /data/user instead of your home directory. Then you can create environments like:
```
```bash
module load anaconda/2019.07
# if they provide environment.yml
conda env create -f environment.yml

View File

@@ -1,12 +1,4 @@
---
title: Known Problems
#tags:
keywords: "known problems, troubleshooting, illegal instructions, paraview, ansys, shell, opengl, mesa, vglrun, module: command not found, error"
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/known-problems.html
---
# Known Problems
## Common errors
@@ -17,8 +9,8 @@ This is usually because the software was compiled with a set of instructions new
and it mostly depends on the processor generation.
In example, `merlin-l-001` and `merlin-l-002` contain a newer generation of processors than the old GPUs nodes, or than the Merlin5 cluster.
Hence, unless one compiles the software with compatibility with set of instructions from older processors, it will not run on old nodes.
Sometimes, this is properly set by default at the compilation time, but sometimes is not.
Hence, unless one compiles the software with compatibility with set of instructions from older processors, it will not run on old nodes.
Sometimes, this is properly set by default at the compilation time, but sometimes is not.
For GCC, please refer to [GCC x86 Options](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html) for compiling options. In case of doubts, contact us.
@@ -49,7 +41,7 @@ srun --cpus-per-task=$SLURM_CPUS_PER_TASK python -c "import os; print(os.sched_g
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# sbatch mysbatch_method1
Submitted batch job 8000813
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# cat slurm-8000813.out
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# cat slurm-8000813.out
From Slurm v22.05.8 srun does not inherit $SLURM_CPUS_PER_TASK
{1, 45}
One has to implicitly specify $SLURM_CPUS_PER_TASK
@@ -72,11 +64,10 @@ echo 'In this example, by setting an environment variable SRUN_CPUS_PER_TASK'
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK
srun python -c "import os; print(os.sched_getaffinity(0))"
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# sbatch mysbatch_method2
Submitted batch job 8000815
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# cat slurm-8000815.out
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# cat slurm-8000815.out
From Slurm v22.05.8 srun does not inherit $SLURM_CPUS_PER_TASK
{1, 45}
One has to implicitly specify $SLURM_CPUS_PER_TASK
@@ -84,14 +75,13 @@ In this example, by setting an environment variable SRUN_CPUS_PER_TASK
{1, 2, 3, 4, 45, 46, 47, 48}
```
## General topics
### Default SHELL
In general, **`/bin/bash` is the recommended default user's SHELL** when working in Merlin.
Some users might notice that BASH is not the default SHELL when logging in to Merlin systems, or they might need to run a different SHELL.
Some users might notice that BASH is not the default SHELL when logging in to Merlin systems, or they might need to run a different SHELL.
This is probably because when the PSI account was requested, no SHELL description was specified or a different one was requested explicitly by the requestor.
Users can check which is the default SHELL specified in the PSI account with the following command:
@@ -99,10 +89,10 @@ Users can check which is the default SHELL specified in the PSI account with the
getent passwd $USER | awk -F: '{print $NF}'
```
If SHELL does not correspond to the one you need to use, you should request a central change for it.
If SHELL does not correspond to the one you need to use, you should request a central change for it.
This is because Merlin accounts are central PSI accounts. Hence, **change must be requested via [PSI Service Now](contact.md#psi-service-now)**.
Alternatively, if you work on other PSI Linux systems but for Merlin you need a different SHELL type, a temporary change can be performed during login startup.
Alternatively, if you work on other PSI Linux systems but for Merlin you need a different SHELL type, a temporary change can be performed during login startup.
You can update one of the following files:
* `~/.login`
* `~/.profile`
@@ -116,7 +106,7 @@ MY_SHELL=/bin/bash
exec $MY_SHELL -l
```
Notice that available *shells* can be found in the following file:
Notice that available *shells* can be found in the following file:
```bash
cat /etc/shells

View File

@@ -1,12 +1,4 @@
---
title: Migration From Merlin5
#tags:
keywords: merlin5, merlin6, migration, rsync, archive, archiving, lts, long-term storage
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/migrating.html
---
# Migration From Merlin5
## Directories
@@ -30,9 +22,10 @@ where:
* **Block** is capacity size in GB and TB
* **Files** is number of files + directories in Millions (M)
* **Quota types** are the following:
* **USR**: Quota is setup individually per user name
* **GRP**: Quota is setup individually per Unix Group name
* **Fileset**: Quota is setup per project root directory.
* **USR**: Quota is setup individually per user name
* **GRP**: Quota is setup individually per Unix Group name
* **Fileset**: Quota is setup per project root directory.
* User data directory ``/data/user`` has a strict user block quota limit policy. If more disk space is required, 'project' must be created.
* Soft quotas can be exceeded for short periods of time. Hard quotas cannot be exceeded.

View File

@@ -1,15 +1,7 @@
---
title: Troubleshooting
#tags:
keywords: troubleshooting, problems, faq, known problems
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/troubleshooting.html
---
# Troubleshooting
For troubleshooting, please contact us through the official channels. See [Contact](contact.md)
for more information.
For troubleshooting, please contact us through the official channels. See [Contact](contact.md)
for more information.
## Known Problems
@@ -30,7 +22,7 @@ the following information:
echo "Current location:"; pwd
echo "User environment:"; env
echo "List of PModules:"; module list
```
```
3. Whenever possible, provide the Slurm JobID.

View File

@@ -1,25 +1,19 @@
---
title: Hardware And Software Description
#tags:
#keywords:
last_updated: 13 June 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/hardware-and-software.html
---
# Hardware And Software Description
## Hardware
### Computing Nodes
The new Merlin6 cluster contains a solution based on **four** [**HPE Apollo k6000 Chassis**](https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=a00016641enw)
* *Three* of them contain 24 x [**HP Apollo XL230K Gen10**](https://h20195.www2.hpe.com/v2/GetDocument.aspx?docname=a00016634enw) blades.
* A *fourth* chassis was purchased on 2021 with [**HP Apollo XL230K Gen10**](https://h20195.www2.hpe.com/v2/GetDocument.aspx?docname=a00016634enw) blades dedicated to few experiments. Blades have slighly different components depending on specific project requirements.
The connectivity for the Merlin6 cluster is based on **ConnectX-5 EDR-100Gbps**, and each chassis contains:
* 1 x [HPE Apollo InfiniBand EDR 36-port Unmanaged Switch](https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=a00016643enw)
* 24 internal EDR-100Gbps ports (1 port per blade for internal low latency connectivity)
* 12 external EDR-100Gbps ports (for external for internal low latency connectivity)
* 24 internal EDR-100Gbps ports (1 port per blade for internal low latency connectivity)
* 12 external EDR-100Gbps ports (for external for internal low latency connectivity)
<table>
<thead>
@@ -142,6 +136,7 @@ The connectivity is based on **ConnectX-5 EDR-100Gbps** for the new login nodes,
### Storage
The storage node is based on the [Lenovo Distributed Storage Solution for IBM Spectrum Scale](https://lenovopress.com/lp0626-lenovo-distributed-storage-solution-for-ibm-spectrum-scale-x3650-m5).
* 2 x **Lenovo DSS G240** systems, each one composed by 2 IO Nodes **ThinkSystem SR650** mounting 4 x **Lenovo Storage D3284 High Density Expansion** enclosures.
* Each IO node has a connectivity of 400Gbps (4 x EDR 100Gbps ports, 2 of them are **ConnectX-5** and 2 are **ConnectX-4**).
@@ -151,11 +146,13 @@ The storage solution is connected to the HPC clusters through 2 x **Mellanox SB7
Merlin6 cluster connectivity is based on the [**Infiniband**](https://en.wikipedia.org/wiki/InfiniBand) technology. This allows fast access with very low latencies to the data as well as running
extremely efficient MPI-based jobs:
* Connectivity amongst different computing nodes on different chassis ensures up to 1200Gbps of aggregated bandwidth.
* Inter connectivity (communication amongst computing nodes in the same chassis) ensures up to 2400Gbps of aggregated bandwidth.
* Communication to the storage ensures up to 800Gbps of aggregated bandwidth.
Merlin6 cluster currently contains 5 Infiniband Managed switches and 3 Infiniband Unmanaged switches (one per HP Apollo chassis):
* 1 x **MSX6710** (FDR) for connecting old GPU nodes, old login nodes and MeG cluster to the Merlin6 cluster (and storage). No High Availability mode possible.
* 2 x **MSB7800** (EDR) for connecting Login Nodes, Storage and other nodes in High Availability mode.
* 3 x **HP EDR Unmanaged** switches, each one embedded to each HP Apollo k6000 chassis solution.
@@ -164,8 +161,9 @@ Merlin6 cluster currently contains 5 Infiniband Managed switches and 3 Infiniban
## Software
In Merlin6, we try to keep the latest software stack release to get the latest features and improvements. Due to this, **Merlin6** runs:
* [**RedHat Enterprise Linux 7**](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/7.9_release_notes/index)
* [**Slurm**](https://slurm.schedmd.com/), we usually try to keep it up to date with the most recent versions.
* [**GPFS v5**](https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/ibmspectrumscale502_welcome.html)
* [**MLNX_OFED LTS v.5.2-2.2.0.0 or newer**](https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed) for all **ConnectX-5** or superior cards.
* [MLNX_OFED LTS v.4.9-2.2.4.0](https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed) is installed for remaining **ConnectX-3** and **ConnectIB** cards.
* [**MLNX_OFED LTS v.5.2-2.2.0.0 or newer**](https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed) for all **ConnectX-5** or superior cards.
* [MLNX_OFED LTS v.4.9-2.2.4.0](https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed) is installed for remaining **ConnectX-3** and **ConnectIB** cards.

View File

@@ -41,20 +41,21 @@ Archiving can be done from any node accessible by the users (usually from the lo
Below are the main steps for using the Data Catalog.
* Ingest the dataset into the Data Catalog. This makes the data known to the Data Catalog system at PSI:
* Prepare a metadata file describing the dataset
* Run **`datasetIngestor`** script
* If necessary, the script will copy the data to the PSI archive servers
* Prepare a metadata file describing the dataset
* Run **`datasetIngestor`** script
* If necessary, the script will copy the data to the PSI archive servers
* Usually this is necessary when archiving from directories other than **`/data/user`** or
**`/data/project`**. It would be also necessary when the Merlin export server (**`merlin-archive.psi.ch`**)
is down for any reason.
* Archive the dataset:
* Visit [https://discovery.psi.ch](https://discovery.psi.ch)
* Click **`Archive`** for the dataset
* The system will now copy the data to the PetaByte Archive at CSCS
* Visit [<https://discovery.psi.ch](https://discovery.psi.ch>)
* Click **`Archive`** for the dataset
* The system will now copy the data to the PetaByte Archive at CSCS
* Retrieve data from the catalog:
* Find the dataset on [https://discovery.psi.ch](https://discovery.psi.ch) and click **`Retrieve`**
* Wait for the data to be copied to the PSI retrieval system
* Run **`datasetRetriever`** script
* Find the dataset on [<https://discovery.psi.ch](https://discovery.psi.ch>) and click **`Retrieve`**
* Wait for the data to be copied to the PSI retrieval system
* Run **`datasetRetriever`** script
Since large data sets may take a lot of time to transfer, some steps are
designed to happen in the background. The discovery website can be used to
@@ -246,7 +247,7 @@ step will take a long time and may appear to have hung. You can check what files
where UID is the dataset ID (12345678-1234-1234-1234-123456789012) and PATH is the absolute path to your data. Note that rsync creates directories first and that the transfer order is not alphabetical in some cases, but it should be possible to see whether any data has transferred.
* There is currently a limit on the number of files per dataset (technically, the limit is from the total length of all file paths). It is recommended to break up datasets into 300'000 files or less.
* If it is not possible or desirable to split data between multiple datasets, an alternate work-around is to package files into a tarball. For datasets which are already compressed, omit the -z option for a considerable speedup:
* If it is not possible or desirable to split data between multiple datasets, an alternate work-around is to package files into a tarball. For datasets which are already compressed, omit the -z option for a considerable speedup:
```bash
tar -f [output].tar [srcdir]
@@ -266,7 +267,6 @@ step will take a long time and may appear to have hung. You can check what files
/data/project/bio/myproject/archive $ datasetIngestor -copy -autoarchive -allowexistingsource -ingest metadata.json
2019/11/06 11:04:43 Latest version: 1.1.11
2019/11/06 11:04:43 Your version of this program is up-to-date
2019/11/06 11:04:43 You are about to add a dataset to the === production === data catalog environment...
2019/11/06 11:04:43 Your username:
@@ -316,7 +316,6 @@ user_n@pb-archive.psi.ch's password:
2019/11/06 11:05:04 The source folder /data/project/bio/myproject/archive is not centrally available (decentral use case).
The data must first be copied to a rsync cache server.
2019/11/06 11:05:04 Do you want to continue (Y/n)?
Y
2019/11/06 11:05:09 Created dataset with id 12.345.67890/12345678-1234-1234-1234-123456789012

View File

@@ -1,12 +1,4 @@
---
title: Connecting from a MacOS Client
#tags:
keywords: MacOS, mac os, mac, connecting, client, configuration, SSH, X11
last_updated: 07 September 2022
summary: "This document describes a recommended setup for a MacOS client."
sidebar: merlin6_sidebar
permalink: /merlin6/connect-from-macos.html
---
# Connecting from a MacOS Client
## SSH without X11 Forwarding

View File

@@ -1,18 +1,12 @@
---
title: Connecting from a Windows Client
keywords: microsoft, mocosoft, windows, putty, xming, connecting, client, configuration, SSH, X11
last_updated: 07 September 2022
summary: "This document describes a recommended setup for a Windows client."
sidebar: merlin6_sidebar
permalink: /merlin6/connect-from-windows.html
---
# Connecting from a Windows Client
## SSH with PuTTY without X11 Forwarding
PuTTY is one of the most common tools for SSH.
Check, if the following software packages are installed on the Windows workstation by
Check, if the following software packages are installed on the Windows workstation by
inspecting the *Start* menu (hint: use the *Search* box to save time):
* PuTTY (should be already installed)
* *[Optional]* Xming (needed for [SSH with X11 Forwarding](#ssh-with-putty-with-x11-forwarding))
@@ -28,7 +22,6 @@ If they are missing, you can install them using the Software Kiosk icon on the D
![Create Merlin Session](../../images/PuTTY/Putty_Session.png)
## SSH with PuTTY with X11 Forwarding
Official X11 Forwarding support is through NoMachine. Please follow the document

View File

@@ -1,26 +1,17 @@
---
title: Kerberos and AFS authentication
#tags:
keywords: kerberos, AFS, kinit, klist, keytab, tickets, connecting, client, configuration, slurm
last_updated: 07 September 2022
summary: "This document describes how to use Kerberos."
sidebar: merlin6_sidebar
permalink: /merlin6/kerberos.html
---
# Kerberos and AFS authentication
Projects and users have their own areas in the central PSI AFS service. In order
to access to these areas, valid Kerberos and AFS tickets must be granted.
to access to these areas, valid Kerberos and AFS tickets must be granted.
These tickets are automatically granted when accessing through SSH with
These tickets are automatically granted when accessing through SSH with
username and password. Alternatively, one can get a granting ticket with the `kinit` (Kerberos)
and `aklog` (AFS ticket, which needs to be run after `kinit`) commands.
Due to PSI security policies, the maximum lifetime of the ticket is 7 days, and the default
time is 10 hours. It means than one needs to constantly renew (`krenew` command) the existing
Due to PSI security policies, the maximum lifetime of the ticket is 7 days, and the default
time is 10 hours. It means than one needs to constantly renew (`krenew` command) the existing
granting tickets, and their validity can not be extended longer than 7 days. At this point,
one needs to obtain new granting tickets.
## Obtaining granting tickets with username and password
As already described above, the most common use case is to obtain Kerberos and AFS granting tickets
@@ -28,8 +19,9 @@ by introducing username and password:
* When login to Merlin through SSH protocol, if this is done with username + password authentication,
tickets for Kerberos and AFS will be automatically obtained.
* When login to Merlin through NoMachine, no Kerberos and AFS are granted. Therefore, users need to
run `kinit` (to obtain a granting Kerberos ticket) followed by `aklog` (to obtain a granting AFS ticket).
See further details below.
See further details below.
To manually obtain granting tickets, one has to:
1. To obtain a granting Kerberos ticket, one needs to run `kinit $USER` and enter the PSI password.
@@ -49,16 +41,16 @@ klist
```bash
krenew
```
* Keep in mind that the maximum lifetime for granting tickets is 7 days, therefore `krenew` can not be used beyond that limit,
* Keep in mind that the maximum lifetime for granting tickets is 7 days, therefore `krenew` can not be used beyond that limit,
and then `kinit` should be used instead.
## Obtanining granting tickets with keytab
Sometimes, obtaining granting tickets by using password authentication is not possible. An example are user Slurm jobs
requiring access to private areas in AFS. For that, there's the possibility to generate a **keytab** file.
Sometimes, obtaining granting tickets by using password authentication is not possible. An example are user Slurm jobs
requiring access to private areas in AFS. For that, there's the possibility to generate a **keytab** file.
Be aware that the **keytab** file must be **private**, **fully protected** by correct permissions and not shared with any
Be aware that the **keytab** file must be **private**, **fully protected** by correct permissions and not shared with any
other users.
### Creating a keytab file
@@ -70,6 +62,7 @@ For generating a **keytab**, one has to:
module load krb5/1.20
```
2. Create a private directory for storing the Kerberos **keytab** file
```bash
mkdir -p ~/.k5
```
@@ -78,6 +71,7 @@ mkdir -p ~/.k5
ktutil
```
4. In the `ktutil` console, one has to generate a **keytab** file as follows:
```bash
# Replace $USER by your username
add_entry -password -k 0 -f -p $USER
@@ -85,6 +79,7 @@ wkt /psi/home/$USER/.k5/krb5.keytab
exit
```
Notice that you will need to add your password once. This step is required for generating the **keytab** file.
5. Once back to the main shell, one has to ensure that the file contains the proper permissions:
```bash
chmod 0600 ~/.k5/krb5.keytab
@@ -112,14 +107,17 @@ The steps should be the following:
export KRB5CCNAME="$(mktemp "$HOME/.k5/krb5cc_XXXXXX")"
```
* To obtain a Kerberos5 granting ticket, run `kinit` by using your keytab:
```bash
kinit -kt "$HOME/.k5/krb5.keytab" $USER@D.PSI.CH
```
* To obtain a granting AFS ticket, run `aklog`:
```bash
aklog
```
* At the end of the job, you can remove destroy existing Kerberos tickets.
* At the end of the job, you can remove destroy existing Kerberos tickets.
```bash
kdestroy
```
@@ -137,7 +135,7 @@ This is the **recommended** way. At the end of the job, is strongly recommended
#SBATCH --output=run.out # Generate custom output file
#SBATCH --error=run.err # Generate custom error file
#SBATCH --nodes=1 # Uncomment and specify #nodes to use
#SBATCH --ntasks=1 # Uncomment and specify #nodes to use
#SBATCH --ntasks=1 # Uncomment and specify #nodes to use
#SBATCH --cpus-per-task=1
#SBATCH --constraint=xeon-gold-6152
#SBATCH --hint=nomultithread

View File

@@ -103,5 +103,5 @@ These settings prevent "bluriness" at the cost of some performance! (You might w
* Display > Resize remote display (forces 1:1 pixel sizes)
* Display > Change settings > Quality: Choose Medium-Best Quality
* Display > Change settings > Modify advanced settings
* Check: Disable network-adaptive display quality (diables lossy compression)
* Check: Disable client side image post-processing
* Check: Disable network-adaptive display quality (diables lossy compression)
* Check: Disable client side image post-processing

View File

@@ -1,13 +1,4 @@
---
title: Configuring SSH Keys in Merlin
#tags:
keywords: linux, connecting, client, configuration, SSH, Keys, SSH-Keys, RSA, authorization, authentication
last_updated: 15 Jul 2020
summary: "This document describes how to deploy SSH Keys in Merlin."
sidebar: merlin6_sidebar
permalink: /merlin6/ssh-keys.html
---
# Configuring SSH Keys in Merlin
Merlin users sometimes will need to access the different Merlin services without being constantly requested by a password.
One can achieve that with Kerberos authentication, however in some cases some software would require the setup of SSH Keys.
@@ -22,14 +13,15 @@ User can check whether a SSH key already exists. These would be placed in the **
is usually the default one, and files in there would be **`id_rsa`** (private key) and **`id_rsa.pub`** (public key).
```bash
ls ~/.ssh/id*
ls ~/.ssh/id*
```
For creating **SSH RSA Keys**, one should:
1. Run `ssh-keygen`, a password will be requested twice. You **must remember** this password for the future.
* Due to security reasons, ***always try protecting it with a password***. There is only one exception, when running ANSYS software, which in general should not use password to simplify the way of running the software in Slurm.
* This will generate a private key **id_rsa**, and a public key **id_rsa.pub** in your **~/.ssh** directory.
* Due to security reasons, ***always try protecting it with a password***. There is only one exception, when running ANSYS software, which in general should not use password to simplify the way of running the software in Slurm.
* This will generate a private key **id_rsa**, and a public key **id_rsa.pub** in your **~/.ssh** directory.
2. Add your public key to the **`authorized_keys`** file, and ensure proper permissions for that file, as follows:
```bash
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
@@ -57,16 +49,16 @@ For creating **SSH RSA Keys**, one should:
### Using Authentication Agent in SSH session
By default, when accessing the login node via SSH (with `ForwardAgent=yes`), it will automatically add your
By default, when accessing the login node via SSH (with `ForwardAgent=yes`), it will automatically add your
SSH Keys to the authentication agent. Hence, no actions should not be needed by the user. One can configure
`ForwardAgent=yes` as follows:
* **(Recommended)** In your local Linux (workstation, laptop or desktop) add the following line in the
`$HOME/.ssh/config` (or alternatively in `/etc/ssh/ssh_config`) file:
* **(Recommended)** In your local Linux (workstation, laptop or desktop) add the following line in the
`$HOME/.ssh/config` (or alternatively in `/etc/ssh/ssh_config`) file:
```
ForwardAgent yes
```
* Alternatively, on each SSH you can add the option `ForwardAgent=yes` in the SSH command. In example:
* Alternatively, on each SSH you can add the option `ForwardAgent=yes` in the SSH command. In example:
```bash
ssh -XY -o ForwardAgent=yes merlin-l-001.psi.ch
```
@@ -74,12 +66,12 @@ SSH Keys to the authentication agent. Hence, no actions should not be needed by
If `ForwardAgent` is not enabled as shown above, one needs to run the authentication agent and then add your key
to the **ssh-agent**. This must be done once per SSH session, as follows:
* Run `eval $(ssh-agent -s)` to run the **ssh-agent** in that SSH session
* Check whether the authentication agent has your key already added:
* Run `eval $(ssh-agent -s)` to run the **ssh-agent** in that SSH session
* Check whether the authentication agent has your key already added:
```bash
ssh-add -l | grep "/psi/home/$(whoami)/.ssh"
```
* If no key is returned in the previous step, you have to add the private key identity to the authentication agent.
* If no key is returned in the previous step, you have to add the private key identity to the authentication agent.
You will be requested for the **passphrase** of your key, and it can be done by running:
```bash
ssh-add
@@ -96,7 +88,7 @@ However, for NoMachine one always need to add the private key identity to the au
```bash
ssh-add -l | grep "/psi/home/$(whoami)/.ssh"
```
2. If no key is returned in the previous step, you have to add the private key identity to the authentication agent.
2. If no key is returned in the previous step, you have to add the private key identity to the authentication agent.
You will be requested for the **passphrase** of your key, and it can be done by running:
```bash
ssh-add

View File

@@ -7,7 +7,8 @@ This document describes the different directories of the Merlin6 cluster.
### User and project data
* ***Users are responsible for backing up their own data***. Is recommended to backup the data on third party independent systems (i.e. LTS, Archive, AFS, SwitchDrive, Windows Shares, etc.).
* **`/psi/home`**, as this contains a small amount of data, is the only directory where we can provide daily snapshots for one week. This can be found in the following directory **`/psi/home/.snapshot/`**
* **`/psi/home`**, as this contains a small amount of data, is the only directory where we can provide daily snapshots for one week. This can be found in the following directory **`/psi/home/.snapshot/`**
* ***When a user leaves PSI, she or her supervisor/team are responsible to backup and move the data out from the cluster***: every few months, the storage space will be recycled for those old users who do not have an existing and valid PSI account.
!!! warning
@@ -31,13 +32,15 @@ merlin_quotas
Merlin6 offers the following directory classes for users:
* ``/psi/home/<username>``: Private user **home** directory
* ``/data/user/<username>``: Private user **data** directory
* ``/data/project/general/<projectname>``: Shared **Project** directory
* For BIO experiments, a dedicated ``/data/project/bio/$projectname`` exists.
* For BIO experiments, a dedicated ``/data/project/bio/$projectname`` exists.
* ``/scratch``: Local *scratch* disk (only visible by the node running a job).
* ``/shared-scratch``: Shared *scratch* disk (visible from all nodes).
* ``/export``: Export directory for data transfer, visible from `ra-merlin-01.psi.ch`, `ra-merlin-02.psi.ch` and Merlin login nodes.
* Refer to **[Transferring Data](../how-to-use-merlin/transfer-data.md)** for more information about the export area and data transfer service.
* Refer to **[Transferring Data](../how-to-use-merlin/transfer-data.md)** for more information about the export area and data transfer service.
!!! tip
@@ -65,7 +68,7 @@ Properties of the directory classes:
The use of **scratch** and **export** areas as an extension of the quota
_is forbidden_. **scratch** and **export** areas _must not contain_ final
data.
data.
**_Auto cleanup policies_** in the **scratch** and **export** areas are applied.
@@ -94,7 +97,8 @@ quota -s
* Read **[Important: Code of Conduct](../quick-start-guide/code-of-conduct.md)** for more information about Merlin6 policies.
* Is **forbidden** to use the home directories for IO intensive tasks
* Use `/scratch`, `/shared-scratch`, `/data/user` or `/data/project` for this purpose.
* Use `/scratch`, `/shared-scratch`, `/data/user` or `/data/project` for this purpose.
* Users can retrieve up to 1 week of their lost data thanks to the automatic **daily snapshots for 1 week**.
Snapshots can be accessed at this path:
@@ -121,7 +125,8 @@ mmlsquota -u <username> --block-size auto merlin-user
* Read **[Important: Code of Conduct](../quick-start-guide/code-of-conduct.md)** for more information about Merlin6 policies.
* Is **forbidden** to use the data directories as ``scratch`` area during a job runtime.
* Use ``/scratch``, ``/shared-scratch`` for this purpose.
* Use ``/scratch``, ``/shared-scratch`` for this purpose.
* No backup policy is applied for user data directories: users are responsible for backing up their data.
### Project data directory
@@ -178,7 +183,7 @@ The properties of the available scratch storage spaces are given in the followin
* Read **[Important: Code of Conduct](../quick-start-guide/code-of-conduct.md)** for more information about Merlin6 policies.
* By default, *always* use **local** first and only use **shared** if your specific use case requires it.
* Temporary files *must be deleted at the end of the job by the user*.
* Remaining files will be deleted by the system if detected.
* Remaining files will be deleted by the system if detected.
* Files not accessed within 28 days will be automatically cleaned up by the system.
* If for some reason the scratch areas get full, admins have the rights to cleanup the oldest data.
@@ -190,6 +195,6 @@ Please read **[Transferring Data](../how-to-use-merlin/transfer-data.md)** for m
#### Export directory policy
* Temporary files *must be deleted at the end of the job by the user*.
* Remaining files will be deleted by the system if detected.
* Remaining files will be deleted by the system if detected.
* Files not accessed within 28 days will be automatically cleaned up by the system.
* If for some reason the export area gets full, admins have the rights to cleanup the oldest data

View File

@@ -1,12 +1,4 @@
---
title: Transferring Data
#tags:
keywords: transferring data, data transfer, rsync, winscp, copy data, copying, sftp, import, export, hopx, vpn
last_updated: 24 August 2023
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/transfer-data.html
---
# Transferring Data
## Overview
@@ -24,7 +16,6 @@ visibility.
- Systems on the internet can access the [PSI Data Transfer](https://www.psi.ch/en/photon-science-data-services/data-transfer) service
`datatransfer.psi.ch`, using ssh-based protocols and [Globus](https://www.globus.org/)
## Direct transfer via Merlin6 login nodes
The following methods transfer data directly via the [login
@@ -50,7 +41,6 @@ rsync -avAHXS ~/localdata user@merlin-l-01.psi.ch:/data/project/general/myprojec
You can resume interrupted transfers by simply rerunning the command. Previously
transferred files will be skipped.
### WinSCP
The WinSCP tool can be used for remote file transfer on Windows. It is available
@@ -64,7 +54,7 @@ local computer and merlin.
Authentication of users is provided through SimpleSAMLphp, supporting SAML2, LDAP and RADIUS and more. Users without an account can be sent an upload voucher by an authenticated user. FileSender is developed to the requirements of the higher education and research community.
The purpose of the software is to send a large file to someone, have that file available for download for a certain number of downloads and/or a certain amount of time, and after that automatically delete the file. The software is not intended as a permanent file publishing platform.
The purpose of the software is to send a large file to someone, have that file available for download for a certain number of downloads and/or a certain amount of time, and after that automatically delete the file. The software is not intended as a permanent file publishing platform.
**[SWITCHfilesender](https://filesender.switch.ch/filesender2/?s=upload)** is fully integrated with PSI, therefore, PSI employees can log in by using their PSI account (through Authentication and Authorization Infrastructure / AAI, by selecting PSI as the institution to be used for log in).
@@ -82,11 +72,13 @@ Notice that `datatransfer.psi.ch` does not allow SSH login, only `rsync`, `scp`
The following filesystems are mounted:
* `/merlin/export` which points to the `/export` directory in Merlin.
* `/merlin/data/experiment/mu3e` which points to the `/data/experiment/mu3e` directories in Merlin.
* Mu3e sub-directories are mounted in RW (read-write), except for `data` (read-only mounted)
* `/merlin/data/experiment/mu3e` which points to the `/data/experiment/mu3e` directories in Merlin.
* Mu3e sub-directories are mounted in RW (read-write), except for `data` (read-only mounted)
* `/merlin/data/project/general` which points to the `/data/project/general` directories in Merlin.
* Owners of Merlin projects should request explicit access to it.
* Currently, only `CSCS` is available for transferring files between PizDaint/Alps and Merlin
* Owners of Merlin projects should request explicit access to it.
* Currently, only `CSCS` is available for transferring files between PizDaint/Alps and Merlin
* `/merlin/data/project/bio` which points to the `/data/project/bio` directories in Merlin.
* `/merlin/data/user` which points to the `/data/user` directories in Merlin.
@@ -120,16 +112,17 @@ Transferring big amounts of data from outside PSI to Merlin is always possible t
##### Exporting data from Merlin
For exporting data from Merlin to outside PSI by using `/export`, one has to:
* From a Merlin login node, copy your data from any directory (i.e. `/data/project`, `/data/user`, `/scratch`) to
* From a Merlin login node, copy your data from any directory (i.e. `/data/project`, `/data/user`, `/scratch`) to
`/export`. Ensure to properly secure your directories and files with proper permissions.
* Once data is copied, from **`datatransfer.psi.ch`**, copy the data from `/merlin/export` to outside PSI
* Once data is copied, from **`datatransfer.psi.ch`**, copy the data from `/merlin/export` to outside PSI
##### Importing data to Merlin
For importing data from outside PSI to Merlin by using `/export`, one has to:
* From **`datatransfer.psi.ch`**, copy the data from outside PSI to `/merlin/export`.
* From **`datatransfer.psi.ch`**, copy the data from outside PSI to `/merlin/export`.
Ensure to properly secure your directories and files with proper permissions.
* Once data is copied, from a Merlin login node, copy your data from `/export` to any directory (i.e. `/data/project`, `/data/user`, `/scratch`).
* Once data is copied, from a Merlin login node, copy your data from `/export` to any directory (i.e. `/data/project`, `/data/user`, `/scratch`).
#### Request access to your project directory
@@ -144,10 +137,10 @@ Merlin6 is fully accessible from within the PSI network. To connect from outside
- [VPN](https://www.psi.ch/en/computing/vpn) ([alternate instructions](https://intranet.psi.ch/BIO/ComputingVPN))
- [SSH hopx](https://www.psi.ch/en/computing/ssh-hop)
* Please avoid transferring big amount data through **hopx**
* Please avoid transferring big amount data through **hopx**
- [No Machine](nomachine.md)
* Remote Interactive Access through [**'rem-acc.psi.ch'**](https://www.psi.ch/en/photon-science-data-services/remote-interactive-access)
* Please avoid transferring big amount of data through **NoMachine**
* Remote Interactive Access through [**'rem-acc.psi.ch'**](https://www.psi.ch/en/photon-science-data-services/remote-interactive-access)
* Please avoid transferring big amount of data through **NoMachine**
## Connecting from Merlin6 to outside file shares
@@ -161,5 +154,4 @@ provides a helpful wrapper over the Gnome storage utilities, and provides suppor
- FTP, SFTP
- [others](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/using_the_desktop_environment_in_rhel_8/managing-storage-volumes-in-gnome_using-the-desktop-environment-in-rhel-8#gvfs-back-ends_managing-storage-volumes-in-gnome)
[More instruction on using `merlin_rmount`](../software-support/merlin-rmount.md)

View File

@@ -1,17 +1,8 @@
---
#tags:
keywords: Pmodules, software, stable, unstable, deprecated, overlay, overlays, release stage, module, package, packages, library, libraries
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/using-modules.html
---
# Using PModules
## Environment Modules
On top of the operating system stack we provide different software using the PSI developed PModule system.
On top of the operating system stack we provide different software using the PSI developed PModule system.
PModules is the official supported way and each package is deployed by a specific expert. Usually, in PModules
software which is used by many people will be found.
@@ -79,8 +70,8 @@ module use overlay_merlin
```
Then, once `overlay_merlin` is invoked, it will disable central software installations with the same version (if exist), and will be replaced
by the local ones in Merlin. Releases from the central Pmodules repository which do not have a copy in the Merlin overlay will remain
visible. In example, for each ANSYS release, one can identify where it is installed by searching ANSYS in PModules with the `--verbose`
by the local ones in Merlin. Releases from the central Pmodules repository which do not have a copy in the Merlin overlay will remain
visible. In example, for each ANSYS release, one can identify where it is installed by searching ANSYS in PModules with the `--verbose`
option. This will show the location of the different ANSYS releases as follows:
* For ANSYS releases installed in the central repositories, the path starts with `/opt/psi`
* For ANSYS releases installed in the Merlin6 repository (and/or overwritting the central ones), the path starts with `/data/software/pmodules`

View File

@@ -4,16 +4,18 @@
* The new Slurm CPU cluster is called **`merlin6`**.
* The new Slurm GPU cluster is called [**`gmerlin6`**](../gmerlin6/cluster-introduction.md)
* The old Slurm *merlin* cluster is still active and best effort support is provided.
* The old Slurm *merlin* cluster is still active and best effort support is provided.
The cluster, was renamed as [**merlin5**](../merlin5/cluster-introduction.md).
From July 2019, **`merlin6`** becomes the **default Slurm cluster** and any job submitted from the login node will be submitted to that cluster if not .
From July 2019, **`merlin6`** becomes the **default Slurm cluster** and any job submitted from the login node will be submitted to that cluster if not.
* Users can keep submitting to the old *`merlin5`* computing nodes by using the option ``--cluster=merlin5``.
* Users submitting to the **`gmerlin6`** GPU cluster need to specify the option ``--cluster=gmerlin6``.
### Slurm 'merlin6'
**CPU nodes** are configured in a **Slurm** cluster, called **`merlin6`**, and
this is the _**default Slurm cluster**_. Hence, by default, if no Slurm cluster is
this is the ***default Slurm cluster***. Hence, by default, if no Slurm cluster is
specified (with the `--cluster` option), this will be the cluster to which the jobs
will be sent.

View File

@@ -7,10 +7,10 @@ applications (e.g. a text editor, but for more performant graphical access, refe
in the login nodes and X11 forwarding can be used for those users who have properly configured X11 support in their desktops, however:
* Merlin6 administrators **do not offer support** for user desktop configuration (Windows, MacOS, Linux).
* Hence, Merlin6 administrators **do not offer official support** for X11 client setup.
* Nevertheless, a generic guide for X11 client setup (*Linux*, *Windows* and *MacOS*) is provided below.
* Hence, Merlin6 administrators **do not offer official support** for X11 client setup.
* Nevertheless, a generic guide for X11 client setup (*Linux*, *Windows* and *MacOS*) is provided below.
* PSI desktop configuration issues must be addressed through **[PSI Service Now](https://psi.service-now.com/psisp)** as an *Incident Request*.
* Ticket will be redirected to the corresponding Desktop support group (Windows, Linux).
* Ticket will be redirected to the corresponding Desktop support group (Windows, Linux).
### Accessing from a Linux client

View File

@@ -10,25 +10,27 @@ The basic principle is courtesy and consideration for other users.
## Interactive nodes
* The interactive nodes (also known as login nodes) are for development and quick testing:
* It is **strictly forbidden to run production jobs** on the login nodes. All production jobs must
* It is **strictly forbidden to run production jobs** on the login nodes. All production jobs must
be submitted to the batch system.
* It is **forbidden to run long processes** occupying big parts of a login node's resources.
* According to the previous rules, **misbehaving running processes will have to be killed.**
* It is **forbidden to run long processes** occupying big parts of a login node's resources.
* According to the previous rules, **misbehaving running processes will have to be killed.**
in order to keep the system responsive for other users.
## Batch system
* Make sure that no broken or run-away processes are left when your job is done. Keep the process space clean on all nodes.
* During the runtime of a job, it is mandatory to use the ``/scratch`` and ``/shared-scratch`` partitions for temporary data:
* It is **forbidden** to use the ``/data/user``, ``/data/project`` or ``/psi/home/`` for that purpose.
* Always remove files you do not need any more (e.g. core dumps, temporary files) as early as possible. Keep the disk space clean on all nodes.
* Prefer ``/scratch`` over ``/shared-scratch`` and use the latter only when you require the temporary files to be visible from multiple nodes.
* It is **forbidden** to use the ``/data/user``, ``/data/project`` or ``/psi/home/`` for that purpose.
* Always remove files you do not need any more (e.g. core dumps, temporary files) as early as possible. Keep the disk space clean on all nodes.
* Prefer ``/scratch`` over ``/shared-scratch`` and use the latter only when you require the temporary files to be visible from multiple nodes.
* Read the description in **[Merlin6 directory structure](../how-to-use-merlin/storage.md#merlin6-directories)** for learning about the correct usage of each partition type.
## User and project data
* ***Users are responsible for backing up their own data***. Is recommended to backup the data on third party independent systems (i.e. LTS, Archive, AFS, SwitchDrive, Windows Shares, etc.).
* **`/psi/home`**, as this contains a small amount of data, is the only directory where we can provide daily snapshots for one week. This can be found in the following directory **`/psi/home/.snapshot/`**
* **`/psi/home`**, as this contains a small amount of data, is the only directory where we can provide daily snapshots for one week. This can be found in the following directory **`/psi/home/.snapshot/`**
* ***When a user leaves PSI, she or her supervisor/team are responsible to backup and move the data out from the cluster***: every few months, the storage space will be recycled for those old users who do not have an existing and valid PSI account.
!!! warning
@@ -41,8 +43,8 @@ The basic principle is courtesy and consideration for other users.
* The system administrator has the right to temporarily block the access to
Merlin6 for an account violating the Code of Conduct in order to maintain the
efficiency and stability of the system.
* Repetitive violations by the same user will be escalated to the user's supervisor.
* Repetitive violations by the same user will be escalated to the user's supervisor.
* The system administrator has the right to delete files in the **scratch** directories
* after a job, if the job failed to clean up its files.
* during the job in order to prevent a job from destabilizing a node or multiple nodes.
* after a job, if the job failed to clean up its files.
* during the job in order to prevent a job from destabilizing a node or multiple nodes.
* The system administrator has the right to kill any misbehaving running processes.

View File

@@ -9,8 +9,8 @@ At present, the **Merlin local HPC cluster** contains _two_ generations of it:
* the old **Merlin5** cluster (`merlin5` Slurm cluster), and
* the newest generation **Merlin6**, which is divided in two Slurm clusters:
* `merlin6` as the Slurm CPU cluster
* `gmerlin6` as the Slurm GPU cluster.
* `merlin6` as the Slurm CPU cluster
* `gmerlin6` as the Slurm GPU cluster.
Access to the different Slurm clusters is possible from the [**Merlin login nodes**](accessing-interactive-nodes.md),
which can be accessed through the [SSH protocol](accessing-interactive-nodes.md#ssh-access) or the [NoMachine (NX) service](../how-to-use-merlin/nomachine.md).
@@ -42,9 +42,9 @@ by the BIO Division and by Deep Leaning project.
These computational resources are split into **two** different **[Slurm](https://slurm.schedmd.com/overview.html)** clusters:
* The Merlin6 CPU nodes are in a dedicated **[Slurm](https://slurm.schedmd.com/overview.html)** cluster called [**`merlin6`**](../slurm-configuration.md).
* This is the **default Slurm cluster** configured in the login nodes: any job submitted without the option `--cluster` will be submited to this cluster.
* This is the **default Slurm cluster** configured in the login nodes: any job submitted without the option `--cluster` will be submited to this cluster.
* The Merlin6 GPU resources are in a dedicated **[Slurm](https://slurm.schedmd.com/overview.html)** cluster called [**`gmerlin6`**](../../gmerlin6/slurm-configuration.md).
* Users submitting to the **`gmerlin6`** GPU cluster need to specify the option ``--cluster=gmerlin6``.
* Users submitting to the **`gmerlin6`** GPU cluster need to specify the option ``--cluster=gmerlin6``.
### Merlin5

View File

@@ -43,9 +43,9 @@ The owner of the group is the person who will be allowed to modify the group.
```text
Dear HelpDesk
I would like to request a new unix group.
Unix Group Name: unx-xxxxx
Initial Group Members: xxxxx, yyyyy, zzzzz, ...
Group Owner: xxxxx
@@ -93,16 +93,16 @@ To request a project, please provide the following information in a **[PSI Servi
```text
Dear HelpDesk
I would like to request a new Merlin6 project.
Project Name: xxxxx
UnixGroup: xxxxx # Must be an existing Unix Group
The project responsible is the Owner of the Unix Group.
If you need a storage quota exceeding the defaults, please provide a description
and motivation for the higher storage needs:
Storage Quota: 1TB with a maximum of 1M Files
Reason: (None for default 1TB/1M)

View File

@@ -1,12 +1,4 @@
---
title: Slurm Configuration
#tags:
keywords: configuration, partitions, node definition
last_updated: 29 January 2021
summary: "This document describes a summary of the Merlin6 configuration."
sidebar: merlin6_sidebar
permalink: /merlin6/slurm-configuration.html
---
# Slurm Configuration
This documentation shows basic Slurm configuration and options needed to run jobs in the Merlin6 CPU cluster.
@@ -23,11 +15,12 @@ The following table show default and maximum resources that can be used per node
| merlin-c-[313-318] | 1 core | 44 cores | 1 | 748800 | 748800 | 10000 | N/A | N/A |
| merlin-c-[319-324] | 1 core | 44 cores | 2 | 748800 | 748800 | 10000 | N/A | N/A |
If nothing is specified, by default each core will use up to 8GB of memory. Memory can be increased with the `--mem=<mem_in_MB>` and
If nothing is specified, by default each core will use up to 8GB of memory. Memory can be increased with the `--mem=<mem_in_MB>` and
`--mem-per-cpu=<mem_in_MB>` options, and maximum memory allowed is `Max.Mem/Node`.
In **`merlin6`**, Memory is considered a Consumable Resource, as well as the CPU. Hence, both resources will account when submitting a job,
and by default resources can not be oversubscribed. This is a main difference with the old **`merlin5`** cluster, when only CPU were accounted,
and memory was by default oversubscribed.
!!! tip "Check Configuration"
@@ -66,12 +59,12 @@ The following *partitions* (also known as *queues*) are configured in Slurm:
| **asa-ansys** | 1 hour | 90 days | unlimited | 1000 | 4 | 15600 |
| **mu3e** | 1 day | 7 days | unlimited | 1000 | 4 | 3712 |
\*The **PriorityJobFactor** value will be added to the job priority (*PARTITION* column in `sprio -l` ). In other words, jobs sent to higher priority
The **PriorityJobFactor** value will be added to the job priority (**PARTITION** column in `sprio -l` ). In other words, jobs sent to higher priority
partitions will usually run first (however, other factors such like **job age** or mainly **fair share** might affect to that decision). For the GPU
partitions, Slurm will also attempt first to allocate jobs on partitions with higher priority over partitions with lesser priority.
**\*\***Jobs submitted to a partition with a higher **PriorityTier** value will be dispatched before pending jobs in partition with lower *PriorityTier* value
and, if possible, they will preempt running jobs from partitions with lower *PriorityTier* values.
Jobs submitted to a partition with a higher **PriorityTier** value will be dispatched before pending jobs in partition with lower *PriorityTier* value
and, if possible, they will preempt running jobs from partitions with lower **PriorityTier** values.
* The **`general`** partition is the **default**. It can not have more than 50 nodes running jobs.
* For **`daily`** this limitation is extended to 67 nodes.
@@ -79,11 +72,18 @@ and, if possible, they will preempt running jobs from partitions with lower *Pri
* **`asa-general`,`asa-daily`,`asa-ansys`,`asa-visas` and `mu3e`** are **private** partitions, belonging to different experiments owning the machines. **Access is restricted** in all cases. However, by agreement with the experiments, nodes are usually added to the **`hourly`** partition as extra resources for the public resources.
!!! tip "Partition Selection"
Jobs which would run for less than one day should be always sent to **daily**, while jobs that would run for less than one hour should be sent to **hourly**. This would ensure that you have highest priority over jobs sent to partitions with less priority, but also because **general** has limited the number of nodes that can be used for that. The idea behind that, is that the cluster can not be blocked by long jobs and we can always ensure resources for shorter jobs.
Jobs which would run for less than one day should be always sent to
**daily**, while jobs that would run for less than one hour should be sent
to **hourly**. This would ensure that you have highest priority over jobs
sent to partitions with less priority, but also because **general** has
limited the number of nodes that can be used for that. The idea behind
that, is that the cluster can not be blocked by long jobs and we can always
ensure resources for shorter jobs.
### Merlin5 CPU Accounts
Users need to ensure that the public **`merlin`** account is specified. No specifying account options would default to this account.
This is mostly needed by users which have multiple Slurm accounts, which may define by mistake a different account.
```bash
@@ -100,16 +100,14 @@ Not all the accounts can be used on all partitions. This is resumed in the table
#### Private accounts
* The *`gfa-asa`* and *`mu3e`* accounts are private accounts. These can be used for accessing dedicated
partitions with nodes owned by different groups.
* The *`gfa-asa`* and *`mu3e`* accounts are private accounts. These can be used for accessing dedicated partitions with nodes owned by different groups.
### Slurm CPU specific options
Some options are available when using CPUs. These are detailed here.
Alternative Slurm options for CPU based jobs are available. Please refer to the **man** pages
for each Slurm command for further information about it (`man salloc`, `man sbatch`, `man srun`).
Below are listed the most common settings:
Alternative Slurm options for CPU based jobs are available. Please refer to the
**man** pages for each Slurm command for further information about it (`man
salloc`, `man sbatch`, `man srun`). Below are listed the most common settings:
```bash
#SBATCH --hint=[no]multithread
@@ -125,8 +123,9 @@ Below are listed the most common settings:
#### Enabling/Disabling Hyper-Threading
The **`merlin6`** cluster contains nodes with Hyper-Threading enabled. One should always specify
whether to use Hyper-Threading or not. If not defined, Slurm will generally use it (exceptions apply).
The **`merlin6`** cluster contains nodes with Hyper-Threading enabled. One
should always specify whether to use Hyper-Threading or not. If not defined,
Slurm will generally use it (exceptions apply).
```bash
#SBATCH --hint=multithread # Use extra threads with in-core multi-threading.
@@ -138,7 +137,7 @@ whether to use Hyper-Threading or not. If not defined, Slurm will generally use
Slurm allows to define a set of features in the node definition. This can be used to filter and select nodes according to one or more
specific features. For the CPU nodes, we have the following features:
```
```text
NodeName=merlin-c-[001-024,101-124,201-224] Features=mem_384gb,xeon-gold-6152
NodeName=merlin-c-[301-312] Features=mem_768gb,xeon-gold-6240r
NodeName=merlin-c-[313-318] Features=mem_768gb,xeon-gold-6240r
@@ -149,26 +148,36 @@ Therefore, users running on `hourly` can select which node they want to use (fat
This is possible by using the option `--constraint=<feature_name>` in Slurm.
Examples:
1. Select nodes with 48 cores only (nodes with [2 x Xeon Gold 6240R](https://ark.intel.com/content/www/us/en/ark/products/199343/intel-xeon-gold-6240r-processor-35-75m-cache-2-40-ghz.html)):
```
sbatch --constraint=xeon-gold-6240r ...
```
2. Select nodes with 44 cores only (nodes with [2 x Xeon Gold 6152](https://ark.intel.com/content/www/us/en/ark/products/120491/intel-xeon-gold-6152-processor-30-25m-cache-2-10-ghz.html)):
```
sbatch --constraint=xeon-gold-6152 ...
```
3. Select fat memory nodes only:
```
sbatch --constraint=mem_768gb ...
```
4. Select regular memory nodes only:
```
sbatch --constraint=mem_384gb ...
```
5. Select fat memory nodes with 48 cores only:
```
sbatch --constraint=mem_768gb,xeon-gold-6240r ...
```
```bash
sbatch --constraint=xeon-gold-6240r ...
```
1. Select nodes with 44 cores only (nodes with [2 x Xeon Gold 6152](https://ark.intel.com/content/www/us/en/ark/products/120491/intel-xeon-gold-6152-processor-30-25m-cache-2-10-ghz.html)):
```bash
sbatch --constraint=xeon-gold-6152 ...
```
1. Select fat memory nodes only:
```bash
sbatch --constraint=mem_768gb ...
```
1. Select regular memory nodes only:
```bash
sbatch --constraint=mem_384gb ...
```
1. Select fat memory nodes with 48 cores only:
```bash
sbatch --constraint=mem_768gb,xeon-gold-6240r ...
```
Detailing exactly which type of nodes you want to use is important, therefore, for groups with private accounts (`mu3e`,`gfa-asa`) or for
public users running on the `hourly` partition, *constraining nodes by features is recommended*. This becomes even more important when
@@ -178,11 +187,11 @@ having heterogeneous clusters.
In this chapter we will cover basic settings that users need to specify in order to run jobs in the Merlin6 CPU cluster.
### User and job limits
### User and job limits
In the CPU cluster we provide some limits which basically apply to jobs and users. The idea behind this is to ensure a fair usage of the resources and to
In the CPU cluster we provide some limits which basically apply to jobs and users. The idea behind this is to ensure a fair usage of the resources and to
avoid overabuse of the resources from a single user or job. However, applying limits might affect the overall usage efficiency of the cluster (in example,
pending jobs from a single user while having many idle nodes due to low overall activity is something that can be seen when user limits are applied).
pending jobs from a single user while having many idle nodes due to low overall activity is something that can be seen when user limits are applied).
In the same way, these limits can be also used to improve the efficiency of the cluster (in example, without any job size limits, a job requesting all
resources from the batch system would drain the entire cluster for fitting the job, which is undesirable).
@@ -190,14 +199,24 @@ Hence, there is a need of setting up wise limits and to ensure that there is a f
of the cluster while allowing jobs of different nature and sizes (it is, **single core** based **vs parallel jobs** of different sizes) to run.
!!! warning "Resource Limits"
Wide limits are provided in the **daily** and **hourly** partitions, while for **general** those limits are more restrictive. However, we kindly ask users to inform the Merlin administrators when there are plans to send big jobs which would require a massive draining of nodes for allocating such jobs. This would apply to jobs requiring the **unlimited** QoS (see below "Per job limits").
Wide limits are provided in the **daily** and **hourly** partitions, while
for **general** those limits are more restrictive. However, we kindly ask
users to inform the Merlin administrators when there are plans to send big
jobs which would require a massive draining of nodes for allocating such
jobs. This would apply to jobs requiring the **unlimited** QoS (see below
"Per job limits").
!!! tip "Custom Requirements"
If you have different requirements, please let us know, we will try to accommodate or propose a solution for you.
If you have different requirements, please let us know, we will try to
accommodate or propose a solution for you.
#### Per job limits
These are limits which apply to a single job. In other words, there is a maximum of resources a single job can use. Limits are described in the table below with the format: `SlurmQoS(limits)` (possible `SlurmQoS` values can be listed with the command `sacctmgr show qos`). Some limits will vary depending on the day and time of the week.
These are limits which apply to a single job. In other words, there is a
maximum of resources a single job can use. Limits are described in the table
below with the format: `SlurmQoS(limits)` (possible `SlurmQoS` values can be
listed with the command `sacctmgr show qos`). Some limits will vary depending
on the day and time of the week.
| Partition | Mon-Fri 0h-18h | Sun-Thu 18h-0h | From Fri 18h to Mon 0h |
|:----------: | :------------------------------: | :------------------------------: | :------------------------------: |
@@ -205,18 +224,29 @@ These are limits which apply to a single job. In other words, there is a maximum
| **daily** | daytime(cpu=704,mem=2750G) | nighttime(cpu=1408,mem=5500G) | unlimited(cpu=2200,mem=8593.75G) |
| **hourly** | unlimited(cpu=2200,mem=8593.75G) | unlimited(cpu=2200,mem=8593.75G) | unlimited(cpu=2200,mem=8593.75G) |
By default, a job can not use more than 704 cores (max CPU per job). In the same way, memory is also proportionally limited. This is equivalent as
running a job using up to 8 nodes at once. This limit applies to the **general** partition (fixed limit) and to the **daily** partition (only during working hours).
Limits are softed for the **daily** partition during non working hours, and during the weekend limits are even wider.
By default, a job can not use more than 704 cores (max CPU per job). In the
same way, memory is also proportionally limited. This is equivalent as running
a job using up to 8 nodes at once. This limit applies to the **general**
partition (fixed limit) and to the **daily** partition (only during working
hours).
For the **hourly** partition, **despite running many parallel jobs is something not desirable** (for allocating such jobs it requires massive draining of nodes),
wider limits are provided. In order to avoid massive nodes drain in the cluster, for allocating huge jobs, setting per job limits is necessary. Hence, **unlimited** QoS
mostly refers to "per user" limits more than to "per job" limits (in other words, users can run any number of hourly jobs, but the job size for such jobs is limited
with wide values).
Limits are softed for the **daily** partition during non working hours, and
during the weekend limits are even wider. For the **hourly** partition,
**despite running many parallel jobs is something not desirable** (for
allocating such jobs it requires massive draining of nodes), wider limits are
provided. In order to avoid massive nodes drain in the cluster, for allocating
huge jobs, setting per job limits is necessary. Hence, **unlimited** QoS mostly
refers to "per user" limits more than to "per job" limits (in other words,
users can run any number of hourly jobs, but the job size for such jobs is
limited with wide values).
#### Per user limits for CPU partitions
These limits which apply exclusively to users. In other words, there is a maximum of resources a single user can use. Limits are described in the table below with the format: `SlurmQoS(limits)` (possible `SlurmQoS` values can be listed with the command `sacctmgr show qos`). Some limits will vary depending on the day and time of the week.
These limits which apply exclusively to users. In other words, there is a
maximum of resources a single user can use. Limits are described in the table
below with the format: `SlurmQoS(limits)` (possible `SlurmQoS` values can be
listed with the command `sacctmgr show qos`). Some limits will vary depending
on the day and time of the week.
| Partition | Mon-Fri 0h-18h | Sun-Thu 18h-0h | From Fri 18h to Mon 0h |
|:-----------:| :----------------------------: | :---------------------------: | :----------------------------: |
@@ -224,15 +254,22 @@ These limits which apply exclusively to users. In other words, there is a maximu
| **daily** | daytime(cpu=1408,mem=5500G) | nighttime(cpu=2112,mem=8250G) | unlimited(cpu=6336,mem=24750G) |
| **hourly** | unlimited(cpu=6336,mem=24750G) | unlimited(cpu=6336,mem=24750G)| unlimited(cpu=6336,mem=24750G) |
By default, users can not use more than 704 cores at the same time (max CPU per user). Memory is also proportionally limited in the same way. This is
equivalent to 8 exclusive nodes. This limit applies to the **general** partition (fixed limit) and to the **daily** partition (only during working hours).
For the **hourly** partition, there are no limits restriction and user limits are removed. Limits are softed for the **daily** partition during non
working hours, and during the weekend limits are removed.
By default, users can not use more than 704 cores at the same time (max CPU per
user). Memory is also proportionally limited in the same way. This is
equivalent to 8 exclusive nodes. This limit applies to the **general**
partition (fixed limit) and to the **daily** partition (only during working
hours).
For the **hourly** partition, there are no limits restriction and user limits
are removed. Limits are softed for the **daily** partition during non working
hours, and during the weekend limits are removed.
## Advanced Slurm configuration
Clusters at PSI use the [Slurm Workload Manager](http://slurm.schedmd.com/) as the batch system technology for managing and scheduling jobs.
Slurm has been installed in a **multi-clustered** configuration, allowing to integrate multiple clusters in the same batch system.
Clusters at PSI use the [Slurm Workload Manager](http://slurm.schedmd.com/) as
the batch system technology for managing and scheduling jobs. Slurm has been
installed in a **multi-clustered** configuration, allowing to integrate
multiple clusters in the same batch system.
For understanding the Slurm configuration setup in the cluster, sometimes may be useful to check the following files:
@@ -240,5 +277,10 @@ For understanding the Slurm configuration setup in the cluster, sometimes may be
* ``/etc/slurm/gres.conf`` - can be found in the GPU nodes, is also propgated to login nodes and computing nodes for user read access.
* ``/etc/slurm/cgroup.conf`` - can be found in the computing nodes, is also propagated to login nodes for user read access.
The previous configuration files which can be found in the login nodes, correspond exclusively to the **merlin6** cluster configuration files.
Configuration files for the old **merlin5** cluster or for the **gmerlin6** cluster must be checked directly on any of the **merlin5** or **gmerlin6** computing nodes (in example, by login in to one of the nodes while a job or an active allocation is running).
The previous configuration files which can be found in the login nodes,
correspond exclusively to the **merlin6** cluster configuration files.
Configuration files for the old **merlin5** cluster or for the **gmerlin6**
cluster must be checked directly on any of the **merlin5** or **gmerlin6**
computing nodes (in example, by login in to one of the nodes while a job or an
active allocation is running).

View File

@@ -57,7 +57,7 @@ a shell (`$SHELL`) at the end of the `salloc` command. In example:
```bash
# Typical 'salloc' call
# - Same as running:
# 'salloc --clusters=merlin6 -N 2 -n 2 srun -n1 -N1 --mem-per-cpu=0 --gres=gpu:0 --pty --preserve-env --mpi=none $SHELL'
# 'salloc --clusters=merlin6 -N 2 -n 2 srun -n1 -N1 --mem-per-cpu=0 --gres=gpu:0 --pty --preserve-env --mpi=none $SHELL'
salloc --clusters=merlin6 -N 2 -n 2
# Custom 'salloc' call
@@ -155,7 +155,7 @@ srun --clusters=merlin6 --x11 --pty bash
srun: job 135095591 queued and waiting for resources
srun: job 135095591 has been allocated resources
(base) [caubet_m@merlin-l-001 ~]$
(base) [caubet_m@merlin-l-001 ~]$
(base) [caubet_m@merlin-l-001 ~]$ srun --clusters=merlin6 --x11 --pty bash
srun: job 135095592 queued and waiting for resources
@@ -198,7 +198,7 @@ salloc --clusters=merlin6 --x11
salloc: Granted job allocation 135171355
salloc: Relinquishing job allocation 135171355
(base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 --x11
(base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 --x11
salloc: Pending job allocation 135171349
salloc: job 135171349 queued and waiting for resources
salloc: job 135171349 has been allocated resources

View File

@@ -166,20 +166,20 @@ sjstat
Scheduling pool data:
----------------------------------------------------------------------------------
Total Usable Free Node Time Other
Pool Memory Cpus Nodes Nodes Nodes Limit Limit traits
Total Usable Free Node Time Other
Pool Memory Cpus Nodes Nodes Nodes Limit Limit traits
----------------------------------------------------------------------------------
test 373502Mb 88 6 6 1 UNLIM 1-00:00:00
general* 373502Mb 88 66 66 8 50 7-00:00:00
daily 373502Mb 88 72 72 9 60 1-00:00:00
hourly 373502Mb 88 72 72 9 UNLIM 01:00:00
gpu 128000Mb 8 1 1 0 UNLIM 7-00:00:00
gpu 128000Mb 20 8 8 0 UNLIM 7-00:00:00
test 373502Mb 88 6 6 1 UNLIM 1-00:00:00
general* 373502Mb 88 66 66 8 50 7-00:00:00
daily 373502Mb 88 72 72 9 60 1-00:00:00
hourly 373502Mb 88 72 72 9 UNLIM 01:00:00
gpu 128000Mb 8 1 1 0 UNLIM 7-00:00:00
gpu 128000Mb 20 8 8 0 UNLIM 7-00:00:00
Running job data:
---------------------------------------------------------------------------------------------------
Time Time Time
JobID User Procs Pool Status Used Limit Started Master/Other
Time Time Time
JobID User Procs Pool Status Used Limit Started Master/Other
---------------------------------------------------------------------------------------------------
13433377 collu_g 1 gpu PD 0:00 24:00:00 N/A (Resources)
13433389 collu_g 20 gpu PD 0:00 24:00:00 N/A (Resources)
@@ -249,11 +249,10 @@ sview
!['sview' graphical user interface](../../images/slurm/sview.png)
## General Monitoring
The following pages contain basic monitoring for Slurm and computing nodes.
Currently, monitoring is based on Grafana + InfluxDB. In the future it will
The following pages contain basic monitoring for Slurm and computing nodes.
Currently, monitoring is based on Grafana + InfluxDB. In the future it will
be moved to a different service based on ElasticSearch + LogStash + Kibana.
In the meantime, the following monitoring pages are available in a best effort
@@ -262,17 +261,17 @@ support:
### Merlin6 Monitoring Pages
* Slurm monitoring:
* ***[Merlin6 Slurm Statistics - XDMOD](https://merlin-slurmmon01.psi.ch/)***
* [Merlin6 Slurm Live Status](https://hpc-monitor02.psi.ch/d/QNcbW1AZk/merlin6-slurm-live-status?orgId=1&refresh=10s)
* [Merlin6 Slurm Overview](https://hpc-monitor02.psi.ch/d/94UxWJ0Zz/merlin6-slurm-overview?orgId=1&refresh=10s)
* ***[Merlin6 Slurm Statistics - XDMOD](https://merlin-slurmmon01.psi.ch/)***
* [Merlin6 Slurm Live Status](https://hpc-monitor02.psi.ch/d/QNcbW1AZk/merlin6-slurm-live-status?orgId=1&refresh=10s)
* [Merlin6 Slurm Overview](https://hpc-monitor02.psi.ch/d/94UxWJ0Zz/merlin6-slurm-overview?orgId=1&refresh=10s)
* Nodes monitoring:
* [Merlin6 CPU Nodes Overview](https://hpc-monitor02.psi.ch/d/JmvLR8gZz/merlin6-computing-cpu-nodes?orgId=1&refresh=10s)
* [Merlin6 GPU Nodes Overview](https://hpc-monitor02.psi.ch/d/gOo1Z10Wk/merlin6-computing-gpu-nodes?orgId=1&refresh=10s)
* [Merlin6 CPU Nodes Overview](https://hpc-monitor02.psi.ch/d/JmvLR8gZz/merlin6-computing-cpu-nodes?orgId=1&refresh=10s)
* [Merlin6 GPU Nodes Overview](https://hpc-monitor02.psi.ch/d/gOo1Z10Wk/merlin6-computing-gpu-nodes?orgId=1&refresh=10s)
### Merlin5 Monitoring Pages
* Slurm monitoring:
* [Merlin5 Slurm Live Status](https://hpc-monitor02.psi.ch/d/o8msZJ0Zz/merlin5-slurm-live-status?orgId=1&refresh=10s)
* [Merlin5 Slurm Overview](https://hpc-monitor02.psi.ch/d/eWLEW1AWz/merlin5-slurm-overview?orgId=1&refresh=10s)
* [Merlin5 Slurm Live Status](https://hpc-monitor02.psi.ch/d/o8msZJ0Zz/merlin5-slurm-live-status?orgId=1&refresh=10s)
* [Merlin5 Slurm Overview](https://hpc-monitor02.psi.ch/d/eWLEW1AWz/merlin5-slurm-overview?orgId=1&refresh=10s)
* Nodes monitoring:
* [Merlin5 CPU Nodes Overview](https://hpc-monitor02.psi.ch/d/ejTyWJAWk/merlin5-computing-cpu-nodes?orgId=1&refresh=10s)
* [Merlin5 CPU Nodes Overview](https://hpc-monitor02.psi.ch/d/ejTyWJAWk/merlin5-computing-cpu-nodes?orgId=1&refresh=10s)

View File

@@ -5,19 +5,19 @@
Before starting using the cluster, please read the following rules:
1. To ease and improve *scheduling* and *backfilling*, always try to **estimate and** to **define a proper run time** of your jobs:
* Use `--time=<D-HH:MM:SS>` for that.
* For very long runs, please consider using ***[Job Arrays with Checkpointing](#array-jobs-running-very-long-tasks-with-checkpoint-files)***
* Use `--time=<D-HH:MM:SS>` for that.
* For very long runs, please consider using ***[Job Arrays with Checkpointing](#array-jobs-running-very-long-tasks-with-checkpoint-files)***
2. Try to optimize your jobs for running at most within **one day**. Please, consider the following:
* Some software can simply scale up by using more nodes while drastically reducing the run time.
* Some software allow to save a specific state, and a second job can start from that state: ***[Job Arrays with Checkpointing](#array-jobs-running-very-long-tasks-with-checkpoint-files)*** can help you with that.
* Jobs submitted to **`hourly`** get more priority than jobs submitted to **`daily`**: always use **`hourly`** for jobs shorter than 1 hour.
* Jobs submitted to **`daily`** get more priority than jobs submitted to **`general`**: always use **`daily`** for jobs shorter than 1 day.
* Some software can simply scale up by using more nodes while drastically reducing the run time.
* Some software allow to save a specific state, and a second job can start from that state: ***[Job Arrays with Checkpointing](#array-jobs-running-very-long-tasks-with-checkpoint-files)*** can help you with that.
* Jobs submitted to **`hourly`** get more priority than jobs submitted to **`daily`**: always use **`hourly`** for jobs shorter than 1 hour.
* Jobs submitted to **`daily`** get more priority than jobs submitted to **`general`**: always use **`daily`** for jobs shorter than 1 day.
3. Is **forbidden** to run **very short jobs** as they cause a lot of overhead but also can cause severe problems to the main scheduler.
* ***Question:*** Is my job a very short job? ***Answer:*** If it lasts in few seconds or very few minutes, yes.
* ***Question:*** How long should my job run? ***Answer:*** as the *Rule of Thumb*, from 5' would start being ok, from 15' would preferred.
* Use ***[Packed Jobs](#packed-jobs-running-a-large-number-of-short-tasks)*** for running a large number of short tasks.
* ***Question:*** Is my job a very short job? ***Answer:*** If it lasts in few seconds or very few minutes, yes.
* ***Question:*** How long should my job run? ***Answer:*** as the *Rule of Thumb*, from 5' would start being ok, from 15' would preferred.
* Use ***[Packed Jobs](#packed-jobs-running-a-large-number-of-short-tasks)*** for running a large number of short tasks.
4. Do not submit hundreds of similar jobs!
* Use ***[Array Jobs](#array-jobs-launching-a-large-number-of-related-jobs)*** for gathering jobs instead.
* Use ***[Array Jobs](#array-jobs-launching-a-large-number-of-related-jobs)*** for gathering jobs instead.
!!! tip
Having a good estimation of the *time* needed by your jobs, a proper way for
@@ -37,6 +37,7 @@ Before starting using the cluster, please read the following rules:
## Basic settings
For a complete list of options and parameters available is recommended to use the **man pages** (i.e. `man sbatch`, `man srun`, `man salloc`).
Please, notice that behaviour for some parameters might change depending on the command used when running jobs (in example, `--exclusive` behaviour in `sbatch` differs from `srun`).
In this chapter we show the basic parameters which are usually needed in the Merlin cluster.
@@ -115,20 +116,20 @@ The following template should be used by any user submitting jobs to the Merlin6
```bash
#!/bin/bash
#SBATCH --cluster=merlin6 # Cluster name
#SBATCH --cluster=merlin6 # Cluster name
#SBATCH --partition=general,daily,hourly # Specify one or multiple partitions
#SBATCH --time=<D-HH:MM:SS> # Strongly recommended
#SBATCH --output=<output_file> # Generate custom output file
#SBATCH --error=<error_file> # Generate custom error file
#SBATCH --hint=nomultithread # Mandatory for multithreaded jobs
##SBATCH --exclusive # Uncomment if you need exclusive node usage
##SBATCH --ntasks-per-core=1 # Only mandatory for multithreaded single tasks
## Advanced options example
##SBATCH --nodes=1 # Uncomment and specify #nodes to use
##SBATCH --ntasks=44 # Uncomment and specify #nodes to use
##SBATCH --ntasks-per-node=44 # Uncomment and specify #tasks per node
##SBATCH --cpus-per-task=44 # Uncomment and specify the number of cores per task
#SBATCH --time=<D-HH:MM:SS> # Strongly recommended
#SBATCH --output=<output_file> # Generate custom output file
#SBATCH --error=<error_file> # Generate custom error file
#SBATCH --hint=nomultithread # Mandatory for multithreaded jobs
##SBATCH --exclusive # Uncomment if you need exclusive node usage
##SBATCH --ntasks-per-core=1 # Only mandatory for multithreaded single tasks
## Advanced options example
##SBATCH --nodes=1 # Uncomment and specify #nodes to use
##SBATCH --ntasks=44 # Uncomment and specify #nodes to use
##SBATCH --ntasks-per-node=44 # Uncomment and specify #tasks per node
##SBATCH --cpus-per-task=44 # Uncomment and specify the number of cores per task
```
#### Multithreaded jobs template
@@ -241,7 +242,7 @@ strategy:
#SBATCH --time=7-00:00:00 # each job can run for 7 days
#SBATCH --cpus-per-task=1
#SBATCH --array=1-10%1 # Run a 10-job array, one job at a time.
if test -e checkpointfile; then
if test -e checkpointfile; then
# There is a checkpoint file;
myprogram --read-checkp checkpointfile
else

View File

@@ -9,9 +9,9 @@ information about options and examples.
Useful commands for the slurm:
```bash
sinfo # to see the name of nodes, their occupancy,
sinfo # to see the name of nodes, their occupancy,
# name of slurm partitions, limits (try out with "-l" option)
squeue # to see the currently running/waiting jobs in slurm
squeue # to see the currently running/waiting jobs in slurm
# (additional "-l" option may also be useful)
sbatch Script.sh # to submit a script (example below) to the slurm.
srun <command> # to submit a command to Slurm. Same options as in 'sbatch' can be used.
@@ -30,7 +30,7 @@ sacct # Show job accounting, useful for checking details of finished
```bash
sinfo -N -l # list nodes, state, resources (#CPUs, memory per node, ...), etc.
sshare -a # to list shares of associations to a cluster
sprio -l # to view the factors that comprise a job's scheduling priority
sprio -l # to view the factors that comprise a job's scheduling priority
# add '-u <username>' for filtering user
```

View File

@@ -251,7 +251,6 @@ The `%1` in the `#SBATCH --array=1-10%1` statement defines that only 1 subjob ca
this will result in subjob n+1 only being started when job n has finished. It will read the checkpoint file
if it is present.
### Packed jobs: running a large number of short tasks
Since the launching of a Slurm job incurs some overhead, you should not submit each short task as a separate

View File

@@ -54,7 +54,7 @@ export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]
SOLVER_FILE=/data/user/caubet_m/CFX5/mysolver.in
cfx5solve -batch -def "$JOURNAL_FILE"
cfx5solve -batch -def "$JOURNAL_FILE"
```
One can enable hypertheading by defining `--hint=multithread`,
@@ -99,23 +99,24 @@ if [ "$INTELMPI" == "yes" ]
then
export I_MPI_DEBUG=4
export I_MPI_PIN_CELL=core
# Simple example: cfx5solve -batch -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
# -part $SLURM_NTASKS \
# Simple example: cfx5solve -batch -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
# -part $SLURM_NTASKS \
# -start-method 'Intel MPI Distributed Parallel'
cfx5solve -batch -part-large -double -verbose -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
cfx5solve -batch -part-large -double -verbose -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
-part $SLURM_NTASKS -par-local -start-method 'Intel MPI Distributed Parallel'
else
# Simple example: cfx5solve -batch -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
# -part $SLURM_NTASKS \
# Simple example: cfx5solve -batch -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
# -part $SLURM_NTASKS \
# -start-method 'IBM MPI Distributed Parallel'
cfx5solve -batch -part-large -double -verbose -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
cfx5solve -batch -part-large -double -verbose -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
-part $SLURM_NTASKS -par-local -start-method 'IBM MPI Distributed Parallel'
fi
```
In the above example, one can increase the number of *nodes* and/or *ntasks* if needed and combine it
with `--exclusive` whenever needed. In general, **no hypertheading** is recommended for MPI based jobs.
Also, one can combine it with `--exclusive` when necessary. Finally, one can change the MPI technology in `-start-method`
(check CFX documentation for possible values).

View File

@@ -75,10 +75,10 @@ To setup HFSS RSM for using it with the Merlin cluster, it must be done from the
![RSM_Remote_Scheduler](../../images/ANSYS/HFSS/02_Select_Scheduler_RSM_Remote.png)
* **Select Scheduler**: `Remote RSM`.
* **Server**: Add a Merlin login node.
* **User name**: Add your Merlin username.
* **Password**: Add you Merlin username password.
* **Select Scheduler**: `Remote RSM`.
* **Server**: Add a Merlin login node.
* **User name**: Add your Merlin username.
* **Password**: Add you Merlin username password.
Once *refreshed*, the **Scheduler info** box must provide **Slurm**
information of the server (see above picture). If the box contains that
@@ -92,7 +92,7 @@ To setup HFSS RSM for using it with the Merlin cluster, it must be done from the
![Product_Path](../../images/ANSYS/HFSS/05_Submit_Job_Product_Path.png)
* In example, for **ANSYS/2022R1**, the location is `/data/software/pmodules/Tools/ANSYS/2021R1/v211/AnsysEM21.1/Linux64/ansysedt.exe`.
* In example, for **ANSYS/2022R1**, the location is `/data/software/pmodules/Tools/ANSYS/2021R1/v211/AnsysEM21.1/Linux64/ansysedt.exe`.
### HFSS Slurm (from login node only)
@@ -118,10 +118,10 @@ Desktop** to submit to Slurm. This can set as follows:
![RSM_Remote_Scheduler](../../images/ANSYS/HFSS/03_Select_Scheduler_Slurm.png)
* **Select Scheduler**: `Slurm`.
* **Server**: must point to `localhost`.
* **User name**: must be empty.
* **Password**: must be empty.
* **Select Scheduler**: `Slurm`.
* **Server**: must point to `localhost`.
* **User name**: must be empty.
* **Password**: must be empty.
The **Server, User name** and **Password** boxes can't be modified, but if
value do not match with the above settings, they should be changed by

View File

@@ -1,6 +1,4 @@
---
title: ANSYS - MAPDL
---
# ANSYS - MAPDL
# ANSYS - Mechanical APDL
@@ -143,12 +141,12 @@ then
# When using -mpi=intelmpi, KMP Affinity must be disabled
export KMP_AFFINITY=disabled
# INTELMPI is not aware about distribution of tasks.
# INTELMPI is not aware about distribution of tasks.
# - We need to define tasks distribution.
HOSTLIST=$(srun hostname | sort | uniq -c | awk '{print $2 ":" $1}' | tr '\n' ':' | sed 's/:$/\n/g')
mapdl -b -dis -mpi intelmpi -machines $HOSTLIST -np ${SLURM_NTASKS} -i "$SOLVER_FILE"
else
# IBMMPI (default) will be aware of the distribution of tasks.
# IBMMPI (default) will be aware of the distribution of tasks.
# - In principle, no need to force tasks distribution
mapdl -b -dis -mpi ibmmpi -np ${SLURM_NTASKS} -i "$SOLVER_FILE"
fi

View File

@@ -56,25 +56,25 @@ The different steps and settings required to make it work are that following:
2. Right-click the **HPC Resources** icon followed by **Add HPC Resource...**
![Adding a new HPC Resource](../../images/ANSYS/rsm-1-add_hpc_resource.png)
3. In the **HPC Resource** tab, fill up the corresponding fields as follows:
![HPC Resource](../../images/ANSYS/rsm-2-add_cluster.png)
* **"Name"**: Add here the preffered name for the cluster. In example: `Merlin6 cluster - merlin-l-001`
* **"HPC Type"**: Select `SLURM`
* **"Submit host"**: Add one of the login nodes. In example `merlin-l-001`.
* **"Slurm Job submission arguments (optional)"**: Add any required Slurm options for running your jobs.
![HPC Resource](../../images/ANSYS/rsm-2-add_cluster.png)
* **"Name"**: Add here the preffered name for the cluster. In example: `Merlin6 cluster - merlin-l-001`
* **"HPC Type"**: Select `SLURM`
* **"Submit host"**: Add one of the login nodes. In example `merlin-l-001`.
* **"Slurm Job submission arguments (optional)"**: Add any required Slurm options for running your jobs.
* In general, `--hint=nomultithread` should be at least present.
* Check **"Use SSH protocol for inter and intra-node communication (Linux only)"**
* Select **"Able to directly submit and monitor HPC jobs"**.
* **"Apply"** changes.
* Check **"Use SSH protocol for inter and intra-node communication (Linux only)"**
* Select **"Able to directly submit and monitor HPC jobs"**.
* **"Apply"** changes.
4. In the **"File Management"** tab, fill up the corresponding fields as follows:
![File Management](../../images/ANSYS/rsm-3-add_scratch_info.png)
* Select **"RSM internal file transfer mechanism"** and add **`/shared-scratch`** as the **"Staging directory path on Cluster"**
* Select **"Scratch directory local to the execution node(s)"** and add **`/scratch`** as the **HPC scratch directory**.
* **Never check** the option "Keep job files in the staging directory when job is complete" if the previous
![File Management](../../images/ANSYS/rsm-3-add_scratch_info.png)
* Select **"RSM internal file transfer mechanism"** and add **`/shared-scratch`** as the **"Staging directory path on Cluster"**
* Select **"Scratch directory local to the execution node(s)"** and add **`/scratch`** as the **HPC scratch directory**.
* **Never check** the option "Keep job files in the staging directory when job is complete" if the previous
option "Scratch directory local to the execution node(s)" was set.
* **"Apply"** changes.
* **"Apply"** changes.
5. In the **"Queues"** tab, use the left button to auto-discover partitions
![Queues](../../images/ANSYS/rsm-4-get_slurm_queues.png)
* If no authentication method was configured before, an authentication window will appear. Use your
* If no authentication method was configured before, an authentication window will appear. Use your
PSI account to authenticate. Notice that the **`PSICH\`** prefix **must not be added**.
![Authenticating](../../images/ANSYS/rsm-5-authenticating.png)
* From the partition list, select the ones you want to typically use.

View File

@@ -40,17 +40,17 @@ option. This will show the location of the different ANSYS releases as follows:
Module Rel.stage Group Dependencies/Modulefile
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ANSYS/2019R3 stable Tools dependencies:
ANSYS/2019R3 stable Tools dependencies:
modulefile: /data/software/pmodules/Tools/modulefiles/ANSYS/2019R3
ANSYS/2020R1 stable Tools dependencies:
ANSYS/2020R1 stable Tools dependencies:
modulefile: /opt/psi/Tools/modulefiles/ANSYS/2020R1
ANSYS/2020R1-1 stable Tools dependencies:
ANSYS/2020R1-1 stable Tools dependencies:
modulefile: /opt/psi/Tools/modulefiles/ANSYS/2020R1-1
ANSYS/2020R2 stable Tools dependencies:
ANSYS/2020R2 stable Tools dependencies:
modulefile: /data/software/pmodules/Tools/modulefiles/ANSYS/2020R2
ANSYS/2021R1 stable Tools dependencies:
ANSYS/2021R1 stable Tools dependencies:
modulefile: /data/software/pmodules/Tools/modulefiles/ANSYS/2021R1
ANSYS/2021R2 stable Tools dependencies:
ANSYS/2021R2 stable Tools dependencies:
modulefile: /data/software/pmodules/Tools/modulefiles/ANSYS/2021R2
```
@@ -62,6 +62,7 @@ option. This will show the location of the different ANSYS releases as follows:
### ANSYS RSM
**ANSYS Remote Solve Manager (RSM)** is used by ANSYS Workbench to submit computational jobs to HPC clusters directly from Workbench on your desktop.
Therefore, PSI workstations with direct access to Merlin can submit jobs by using RSM.
For further information, please visit the **[ANSYS RSM](ansys-rsm.md)** section.

View File

@@ -78,7 +78,7 @@ To submit an interactive job, consider the following requirements:
# Example 1: Define GTHTMP before the allocation
export GTHTMP=/scratch
salloc ...
# Example 2: Define GTHTMP after the allocation
salloc ...
export GTHTMP=/scratch
@@ -89,7 +89,7 @@ To submit an interactive job, consider the following requirements:
allocation! In example:
```bash
# Example 1:
# Example 1:
export GTHTMP=/scratch/$USER
salloc ...
mkdir -p $GTHTMP
@@ -125,7 +125,7 @@ the [General requirements](#general-requirements) section.
* Requesting a full node:
```bash
salloc --partition=hourly -N 1 -n 1 -c 88 --hint=multithread --x11 --exclusive --mem=0
salloc --partition=hourly -N 1 -n 1 -c 88 --hint=multithread --x11 --exclusive --mem=0
```
* Requesting 22 CPUs from a node, with default memory per CPU (4000MB/CPU):
@@ -177,16 +177,16 @@ requirements](#general-requirements) section, and:
#SBATCH --exclusive
#SBATCH --mem=0
#SBATCH --clusters=merlin6
INPUT_FILE='MY_INPUT.SIN'
mkdir -p /scratch/$USER/$SLURM_JOB_ID
export GTHTMP=/scratch/$USER/$SLURM_JOB_ID
/data/project/general/software/gothic/gothic8.3qa/bin/gothic_s.sh $INPUT_FILE -m -np $SLURM_CPUS_PER_TASK
gth_exit_code=$?
# Clean up data in /scratch
# Clean up data in /scratch
rm -rf /scratch/$USER/$SLURM_JOB_ID
# Return exit code from GOTHIC
@@ -205,16 +205,16 @@ requirements](#general-requirements) section, and:
#SBATCH --cpus-per-task=22
#SBATCH --hint=multithread
#SBATCH --clusters=merlin6
INPUT_FILE='MY_INPUT.SIN'
mkdir -p /scratch/$USER/$SLURM_JOB_ID
export GTHTMP=/scratch/$USER/$SLURM_JOB_ID
/data/project/general/software/gothic/gothic8.3qa/bin/gothic_s.sh $INPUT_FILE -m -np $SLURM_CPUS_PER_TASK
gth_exit_code=$?
# Clean up data in /scratch
# Clean up data in /scratch
rm -rf /scratch/$USER/$SLURM_JOB_ID
# Return exit code from GOTHIC