first stab at mkdocs migration

This commit is contained in:
2025-11-26 17:28:07 +01:00
parent 149de6fb18
commit 1d9c01572d
282 changed files with 200 additions and 8940 deletions

View File

@@ -0,0 +1,56 @@
---
title: Accessing Interactive Nodes
#tags:
keywords: How to, HowTo, access, accessing, nomachine, ssh
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/interactive.html
---
## SSH Access
For interactive command shell access, use an SSH client. We recommend activating SSH's X11 forwarding so that you can use graphical
applications (e.g. a text editor; for more performant graphical access, refer to the sections below). X applications are supported
on the login nodes, and X11 forwarding can be used by users who have properly configured X11 support on their desktops. However:
* Merlin6 administrators **do not offer support** for user desktop configuration (Windows, MacOS, Linux).
* Hence, Merlin6 administrators **do not offer official support** for X11 client setup.
* Nevertheless, a generic guide for X11 client setup (*Linux*, *Windows* and *MacOS*) is provided below.
* PSI desktop configuration issues must be addressed through **[PSI Service Now](https://psi.service-now.com/psisp)** as an *Incident Request*.
* The ticket will be redirected to the corresponding Desktop support group (Windows, Linux).
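For example, a basic SSH session to one of the login nodes listed in the hardware table below could look like this (replace `$USER` with your PSI username; `-Y` optionally enables trusted X11 forwarding):

```bash
# Plain interactive shell on a Merlin6 login node
ssh $USER@merlin-l-001.psi.ch

# The same, with trusted X11 forwarding for lightweight graphical applications
ssh -Y $USER@merlin-l-001.psi.ch
```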
### Accessing from a Linux client
Refer to [{How To Use Merlin -> Accessing from Linux Clients}](/merlin6/connect-from-linux.html) for **Linux** SSH client and X11 configuration.
### Accessing from a Windows client
Refer to [{How To Use Merlin -> Accessing from Windows Clients}](/merlin6/connect-from-windows.html) for **Windows** SSH client and X11 configuration.
### Accessing from a MacOS client
Refer to [{How To Use Merlin -> Accessing from MacOS Clients}](/merlin6/connect-from-macos.html) for **MacOS** SSH client and X11 configuration.
## NoMachine Remote Desktop Access
X applications are supported on the login nodes and can run efficiently through a **NoMachine** client. This is the officially supported way to run more demanding X applications on Merlin6.
* For PSI Windows workstations, this can be installed from the Software Kiosk as 'NX Client'. If you have difficulties installing, please request support through **[PSI Service Now](https://psi.service-now.com/psisp)** as an *Incident Request*.
* For other workstations, the client software can be downloaded from the [NoMachine website](https://www.nomachine.com/product&p=NoMachine%20Enterprise%20Client).
### Configuring NoMachine
Refer to [{How To Use Merlin -> Remote Desktop Access}](/merlin6/nomachine.html) for further instructions on how to configure the NoMachine client and how to access it from inside and outside PSI.
## Login nodes hardware description
The Merlin6 login nodes are the official machines for accessing the resources of Merlin6.
From these machines, users can submit jobs to the Slurm batch system as well as visualize or compile their software.
The Merlin6 login nodes are the following:
| Hostname            | SSH | NoMachine | #Cores | #Threads/Core | CPU                   | Memory | Scratch    | Scratch Mountpoint  |
| ------------------- | --- | --------- | ------ |:-------------:| :-------------------- | ------ | ---------- | :------------------ |
| merlin-l-001.psi.ch | yes | yes | 2 x 22 | 2 | Intel Xeon Gold 6152 | 384GB | 1.8TB NVMe | ``/scratch`` |
| merlin-l-002.psi.ch | yes | yes | 2 x 22 | 2 | Intel Xeon Gold 6142 | 384GB | 1.8TB NVMe | ``/scratch`` |
| merlin-l-01.psi.ch | yes | - | 2 x 16 | 2 | Intel Xeon E5-2697Av4 | 512GB | 100GB SAS | ``/scratch`` |

View File

@@ -0,0 +1,53 @@
---
title: Accessing Slurm Cluster
#tags:
keywords: slurm, batch system, merlin5, merlin6, gmerlin6, cpu, gpu
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/slurm-access.html
---
## The Merlin Slurm clusters
Merlin contains a multi-cluster setup, where multiple Slurm clusters coexist under the same umbrella.
It contains the following clusters:
* The **Merlin6 Slurm CPU cluster**, which is called [**`merlin6`**](/merlin6/slurm-access.html#merlin6-cpu-cluster-access).
* The **Merlin6 Slurm GPU cluster**, which is called [**`gmerlin6`**](/merlin6/slurm-access.html#merlin6-gpu-cluster-access).
* The *old Merlin5 Slurm CPU cluster*, which is called [**`merlin5`**](/merlin6/slurm-access.html#merlin5-cpu-cluster-access), still supported on a best-effort basis.
## Accessing the Slurm clusters
Any job submission must be performed from a **Merlin login node**. Please refer to the [**Accessing the Interactive Nodes documentation**](/merlin6/interactive.html)
for further information about how to access the cluster.
In addition, any job *must be submitted from a high performance storage area visible to both the login nodes and the computing nodes*. The possible storage areas are the following:
* `/data/user`
* `/data/project`
* `/shared-scratch`
Please avoid submitting jobs from `/psi/home` directories.
### Merlin6 CPU cluster access
The **Merlin6 CPU cluster** (**`merlin6`**) is the default cluster configured on the login nodes. Any job submission will use this cluster by default, unless
another of the existing clusters is specified with the `--cluster` option.
For further information about how to use this cluster, please visit: [**Merlin6 CPU Slurm Cluster documentation**](/merlin6/slurm-configuration.html).
### Merlin6 GPU cluster access
The **Merlin6 GPU cluster** (**`gmerlin6`**) is visible from the login nodes. However, to submit jobs to this cluster, one needs to specify the option `--cluster=gmerlin6` when submitting a job or allocation.
For further information about how to use this cluster, please visit: [**Merlin6 GPU Slurm Cluster documentation**](/gmerlin6/slurm-configuration.html).
### Merlin5 CPU cluster access
The **Merlin5 CPU cluster** (**`merlin5`**) is visible from the login nodes. However, to submit jobs
to this cluster, one needs to specify the option `--cluster=merlin5` when submitting a job or allocation.
Using this cluster is in general not recommended; however, it is still available for old users needing
extra computational resources or longer jobs. Keep in mind that this cluster is only supported on a
**best-effort basis**, and it contains very old hardware and configurations.
For further information about how to use this cluster, please visit the [**Merlin5 CPU Slurm Cluster documentation**](/merlin5/slurm-configuration.html).
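Putting it all together, cluster selection at submission time could look like this (a sketch; `job.sh` stands for your own batch script):

```bash
# Submit to the default merlin6 CPU cluster
sbatch job.sh

# Submit to the gmerlin6 GPU cluster
sbatch --cluster=gmerlin6 job.sh

# Submit to the old merlin5 CPU cluster (best-effort support)
sbatch --cluster=merlin5 job.sh
```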

View File

@@ -0,0 +1,52 @@
---
title: Code Of Conduct
#tags:
keywords: code of conduct, rules, principle, policy, policies, administrator, backup
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/code-of-conduct.html
---
## The Basic principle
The basic principle is courtesy and consideration for other users.
* Merlin6 is a system shared by many users, therefore you are kindly requested to apply common courtesy in using its resources. Please follow our guidelines which aim at providing and maintaining an efficient compute environment for all our users.
* Basic shell programming skills are an essential requirement in a Linux/UNIX HPC cluster environment; a proficiency in shell programming is greatly beneficial.
## Interactive nodes
* The interactive nodes (also known as login nodes) are for development and quick testing:
* It is **strictly forbidden to run production jobs** on the login nodes. All production jobs must be submitted to the batch system.
* It is **forbidden to run long processes** occupying large parts of a login node's resources.
* According to the previous rules, **misbehaving running processes will have to be killed**
in order to keep the system responsive for other users.
## Batch system
* Make sure that no broken or run-away processes are left when your job is done. Keep the process space clean on all nodes.
* During the runtime of a job, it is mandatory to use the ``/scratch`` and ``/shared-scratch`` partitions for temporary data (see the sketch after this list):
* It is **forbidden** to use ``/data/user``, ``/data/project`` or ``/psi/home`` for that purpose.
* Always remove files you do not need any more (e.g. core dumps, temporary files) as early as possible. Keep the disk space clean on all nodes.
* Prefer ``/scratch`` over ``/shared-scratch`` and use the latter only when you require the temporary files to be visible from multiple nodes.
* Read the description in **[Merlin6 directory structure](/merlin6/storage.html#merlin6-directories)** for learning about the correct usage of each partition type.
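As an illustration, a job could keep its temporary data in a private subdirectory of ``/scratch`` and clean it up before finishing (a minimal sketch; the directory layout is just an example):

```bash
# Create a per-job temporary directory on the local scratch partition
TMPDIR="/scratch/$USER/$SLURM_JOB_ID"
mkdir -p "$TMPDIR"

# ... run the job, writing temporary data to $TMPDIR ...

# Remove the temporary data before the job ends
rm -rf "$TMPDIR"
```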
## User and project data
* ***Users are responsible for backing up their own data***. It is recommended to back up the data on independent third-party systems (e.g. LTS, Archive, AFS, SwitchDrive, Windows Shares, etc.).
* **`/psi/home`**, as this contains only a small amount of data, is the only directory where we can provide daily snapshots for one week. These can be found in the directory **`/psi/home/.snapshot/`** (see the example below the warning).
* ***When a user leaves PSI, they or their supervisor/team are responsible for backing up the data and moving it out of the cluster***: every few months, the storage space of old users who no longer have an existing and valid PSI account will be recycled.
{{site.data.alerts.warning}}When a user leaves PSI and their account has been removed, their storage space in Merlin may be recycled.
Hence, <b>when a user leaves PSI</b>, they, their supervisor or their team <b>must ensure that the data is backed up to an external storage</b>.
{{site.data.alerts.end}}
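To browse the home-directory snapshots mentioned above:

```bash
# Daily snapshots of /psi/home, kept for one week
ls /psi/home/.snapshot/
```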
## System Administrator Rights
* The system administrator has the right to temporarily block the access to Merlin6 for an account violating the Code of Conduct in order to maintain the efficiency and stability of the system.
* Repetitive violations by the same user will be escalated to the user's supervisor.
* The system administrator has the right to delete files in the **scratch** directories
* after a job, if the job failed to clean up its files.
* during the job in order to prevent a job from destabilizing a node or multiple nodes.
* The system administrator has the right to kill any misbehaving running processes.

View File

@@ -0,0 +1,64 @@
---
title: Introduction
#tags:
keywords: introduction, home, welcome, architecture, design
last_updated: 07 September 2022
#summary: "Merlin 6 cluster overview"
sidebar: merlin6_sidebar
permalink: /merlin6/introduction.html
redirect_from:
- /merlin6
- /merlin6/index.html
---
## The Merlin local HPC cluster
Historically, the local HPC clusters at PSI were named **Merlin**. Over the years,
multiple generations of Merlin have been deployed.
At present, the **Merlin local HPC cluster** comprises _two_ generations:
* the old **Merlin5** cluster (`merlin5` Slurm cluster), and
* the newest generation **Merlin6**, which is divided into two Slurm clusters:
* `merlin6` as the Slurm CPU cluster
* `gmerlin6` as the Slurm GPU cluster.
Access to the different Slurm clusters is possible from the [**Merlin login nodes**](/merlin6/interactive.html),
which can be accessed through the [SSH protocol](/merlin6/interactive.html#ssh-access) or the [NoMachine (NX) service](/merlin6/nomachine.html).
The following image shows the Slurm architecture design for the Merlin5 & Merlin6 (CPU & GPU) clusters:
![Merlin6 Slurm Architecture Design]({{ "/images/merlin-slurm-architecture.png" }})
### Merlin6
Merlin6 is the official PSI local HPC cluster for development and
mission-critical applications. It was built in 2019 and replaces
the Merlin5 cluster.
Merlin6 is designed to be extensible, so it is technically possible to add
more compute nodes and cluster storage without significantly increasing
manpower and operations costs.
Merlin6 contains all the main services needed for running the cluster, including
**login nodes**, **storage**, **computing nodes** and other *subservices*,
connected to the central PSI IT infrastructure.
#### CPU and GPU Slurm clusters
The Merlin6 **computing nodes** are mostly based on **CPU** resources. However,
it also contains a small number of **GPU**-based resources, which are mostly used
by the BIO Division and by Deep Learning projects.
These computational resources are split into **two** different **[Slurm](https://slurm.schedmd.com/overview.html)** clusters:
* The Merlin6 CPU nodes are in a dedicated **[Slurm](https://slurm.schedmd.com/overview.html)** cluster called [**`merlin6`**](/merlin6/slurm-configuration.html).
* This is the **default Slurm cluster** configured in the login nodes: any job submitted without the option `--cluster` will be submitted to this cluster.
* The Merlin6 GPU resources are in a dedicated **[Slurm](https://slurm.schedmd.com/overview.html)** cluster called [**`gmerlin6`**](/gmerlin6/slurm-configuration.html).
* Users submitting to the **`gmerlin6`** GPU cluster need to specify the option ``--cluster=gmerlin6``.
### Merlin5
The old Merlin5 Slurm **CPU** cluster is still active and is maintained on a best-effort basis.
**Merlin5** only contains **computing node** resources, in a dedicated **[Slurm](https://slurm.schedmd.com/overview.html)** cluster.
* The Merlin5 CPU cluster is called [**merlin5**](/merlin5/slurm-configuration.html).

View File

@@ -0,0 +1,47 @@
---
title: Requesting Merlin Accounts
#tags:
keywords: registration, register, account, merlin5, merlin6, snow, service now
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/request-account.html
---
## Requesting Access to Merlin6
Access to Merlin6 is regulated by a PSI user's account being a member of the **`svc-cluster_merlin6`** group. Access to this group will also grant access to older generations of Merlin (`merlin5`).
Requesting **Merlin6** access *has to be done* with the corresponding **[Request Linux Group Membership](https://psi.service-now.com/psisp?id=psi_new_sc_cat_item&sys_id=84f2c0c81b04f110679febd9bb4bcbb1)** form, available in the [PSI Service Now Service Catalog](https://psi.service-now.com/psisp).
![Example: Requesting access to Merlin6]({{ "/images/Access/01-request-merlin6-membership.png" }})
Mandatory customizable fields are the following:
* **`Order Access for user`**, which defaults to the logged-in user. However, requesting access for another user is also possible.
* **`Request membership for group`**: for Merlin6, the group **`svc-cluster_merlin6`** must be selected.
* **`Justification`**: please add a short justification why access to Merlin6 is necessary.
Once submitted, the Merlin responsible will approve the request as soon as possible (within the next few hours on working days). Once the request is approved, *it may take up to 30 minutes to get the account fully configured*.
## Requesting Access to Merlin5
Access to Merlin5 is regulated by a PSI user's account being a member of the **`svc-cluster_merlin5`** group. Access to this group does not grant access to newer generations of Merlin (`merlin6`, `gmerlin6`, and future ones).
Requesting **Merlin5** access *has to be done* with the corresponding **[Request Linux Group Membership](https://psi.service-now.com/psisp?id=psi_new_sc_cat_item&sys_id=84f2c0c81b04f110679febd9bb4bcbb1)** form, available in the [PSI Service Now Service Catalog](https://psi.service-now.com/psisp).
![Example: Requesting access to Merlin5]({{ "/images/Access/01-request-merlin5-membership.png" }})
Mandatory customizable fields are the following:
* **`Order Access for user`**, which defaults to the logged-in user. However, requesting access for another user is also possible.
* **`Request membership for group`**: for Merlin5, the group **`svc-cluster_merlin5`** must be selected.
* **`Justification`**: please add a short justification why access to Merlin5 is necessary.
Once submitted, the Merlin responsible will approve the request as soon as possible (within the next few hours on working days). Once the request is approved, *it may take up to 30 minutes to get the account fully configured*.
## Further documentation
Further information is also available in the Linux Central Documentation:
* [Unix Group / Group Management for users](https://linux.psi.ch/documentation/services/user-guide/unix_groups.html)
* [Unix Group / Group Management for group managers](https://linux.psi.ch/documentation/services/admin-guide/unix_groups.html)
**Special thanks** to the **Linux Central Team** and **AIT** for making this possible.

View File

@@ -0,0 +1,123 @@
---
title: Requesting a Merlin Project
#tags:
keywords: merlin project, project, snow, service now
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/request-project.html
---
A project owns its own storage area in Merlin, which can be accessed by other group members.
Projects can receive a higher storage quota than user areas and should be the primary way of organizing bigger storage requirements
in a multi-user collaboration.
Access to a project's directories is governed by project members belonging to a common **Unix group**. You may use an existing
Unix group or you may have a new Unix group created especially for the project. The **project responsible** will be the owner of
the Unix group (*this is important*)!
This document explains how to request a new Unix group, how to request membership in existing groups, and the procedure for requesting a Merlin project.
## About Unix groups
Before requesting a Merlin project, it is important to have a Unix group that can be used to grant the different members
of the project access to it.
Unix groups in the PSI Active Directory (which is the PSI central database containing user and group information, and more) are defined by the `unx-` prefix, followed by a name.
In general, PSI employees working on Linux systems (including HPC clusters, like Merlin) can request a new Unix group, and can become responsible for managing it.
In addition, a list of administrators can be set. The administrators, together with the group manager, can approve or deny membership requests. Further information about this topic
is covered in the [Linux Documentation - Services Admin Guides: Unix Groups / Group Management](https://linux.psi.ch/documentation/services/admin-guide/unix_groups.html), managed by the Central Linux Team.
To grant access to specific Merlin project directories, some users may need to be added to specific **Unix groups**:
* Each Merlin project (i.e. `/data/project/{bio|general}/$projectname`) or experiment (i.e. `/data/experiment/$experimentname`) directory has access restricted by ownership and group membership (with a very few exceptions allowing public access).
* Users requiring access to a specific restricted project or experiment directory have to request membership for the corresponding Unix group owning the directory.
### Requesting a new Unix group
**If you need a new Unix group** to be created, you need to first get this group through a separate
**[PSI Service Now ticket](https://psi.service-now.com/psisp)**. **Please use the following template.**
You can also specify the login names of the initial group members and the **owner** of the group.
The owner of the group is the person who will be allowed to modify the group.
* Please open an *Incident Request* with subject:
```
Subject: Request for new unix group xxxx
```
* and base the text field of the request on this template
```
Dear HelpDesk
I would like to request a new unix group.
Unix Group Name: unx-xxxxx
Initial Group Members: xxxxx, yyyyy, zzzzz, ...
Group Owner: xxxxx
Group Administrators: aaaaa, bbbbb, ccccc, ....
Best regards,
```
### Requesting Unix group membership
Existing Merlin projects already have a Unix group assigned. To have access to a project, users must belong to the proper **Unix group** owning that project.
Supervisors should inform new users which extra groups are needed for their project(s). If this information is not known, one can check the permissions of the directory. For example:
```bash
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# ls -ltrhd /data/project/general/$projectname
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# ls -ltrhd /data/project/bio/$projectname
```
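The group shown in the output is the Unix group to request membership for. A hypothetical example of such output (all names invented for illustration):

```
drwxrws--- 7 projowner unx-myproject 4096 Aug 12 10:15 /data/project/general/myproject
```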
Requesting membership for a specific Unix group *has to be done* with the corresponding **[Request Linux Group Membership](https://psi.service-now.com/psisp?id=psi_new_sc_cat_item&sys_id=84f2c0c81b04f110679febd9bb4bcbb1)** form, available in the [PSI Service Now Service Catalog](https://psi.service-now.com/psisp).
![Example: Requesting Unix Group membership]({{ "/images/Access/01-request-unx-group-membership.png" }})
Once submitted, the person responsible for the Unix group has to approve the request.
**Important note**: Requesting access to specific Unix groups will require validation from the person responsible for each Unix group. If you ask for inclusion in many groups, it may take longer, since the fulfillment of the request will depend on more people.
Further information can be found in the [Linux Documentation - Services User guide: Unix Groups / Group Management](https://linux.psi.ch/documentation/services/user-guide/unix_groups.html)
### Managing Unix Groups
Other administration operations on Unix groups are mainly covered in the [Linux Documentation - Services Admin Guides: Unix Groups / Group Management](https://linux.psi.ch/documentation/services/admin-guide/unix_groups.html), managed by the Central Linux Team.
## Requesting a Merlin project
Once a Unix group is available, a Merlin project can be requested.
To request a project, please provide the following information in a **[PSI Service Now ticket](https://psi.service-now.com/psisp)**
* Please open an *Incident Request* with subject:
```
Subject: [Merlin6] Project Request for project name xxxxxx
```
* and base the text field of the request on this template
```
Dear HelpDesk
I would like to request a new Merlin6 project.
Project Name: xxxxx
UnixGroup: xxxxx # Must be an existing Unix Group
The project responsible is the Owner of the Unix Group.
If you need a storage quota exceeding the defaults, please provide a description
and motivation for the higher storage needs:
Storage Quota: 1TB with a maximum of 1M Files
Reason: (None for default 1TB/1M)
Best regards,
```
The **default storage quota** for a project is 1TB (with a maximal *Number of Files* of 1M). If you need a larger assignment, you
need to request this and provide a description of your storage needs.
## Further documentation
Further information is also available in the Linux Central Documentation:
* [Unix Group / Group Management for users](https://linux.psi.ch/documentation/services/user-guide/unix_groups.html)
* [Unix Group / Group Management for group managers](https://linux.psi.ch/documentation/services/admin-guide/unix_groups.html)
**Special thanks** to the **Linux Central Team** and **AIT** for making this possible.

View File

@@ -0,0 +1,379 @@
---
title: Archive & PSI Data Catalog
#tags:
keywords: linux, archive, data catalog, archiving, lts, tape, long term storage, ingestion, datacatalog
last_updated: 31 January 2020
summary: "This document describes how to use the PSI Data Catalog for archiving Merlin6 data."
sidebar: merlin6_sidebar
permalink: /merlin6/archive.html
---
## PSI Data Catalog as a PSI Central Service
PSI provides access to the ***Data Catalog*** for **long-term data storage and retrieval**. Data is
stored on the ***PetaByte Archive*** at the **Swiss National Supercomputing Centre (CSCS)**.
The Data Catalog and Archive is suitable for:
* Raw data generated by PSI instruments
* Derived data produced by processing some inputs
* Data required to reproduce PSI research and publications
The Data Catalog is part of PSI's effort to conform to the FAIR principles for data management.
In accordance with this policy, ***data will be publicly released under CC-BY-SA 4.0 after an
embargo period expires.***
The Merlin cluster is connected to the Data Catalog. Hence, users can archive data stored in the
Merlin storage under the ``/data`` directories (currently, ``/data/user`` and ``/data/project``).
Archiving from other directories is also possible; however, the process is much slower, as the data
cannot be directly retrieved by the PSI archive central servers (**central mode**) and needs to
be indirectly copied to them (**decentral mode**).
Archiving can be done from any node accessible by the users (usually from the login nodes).
{{site.data.alerts.tip}} Archiving can be done in two different ways:
<br>
<b>'Central mode':</b> Possible for the user and project data directories, this is the
fastest way, as it does not require a remote copy (data is directly retrieved by the central AIT servers from Merlin
through 'merlin-archive.psi.ch').
<br>
<br>
<b>'Decentral mode':</b> Possible for any directory, this is the slowest way of archiving, as it requires
copying ('rsync') the data from Merlin to the central AIT servers.
{{site.data.alerts.end}}
## Procedure
### Overview
Below are the main steps for using the Data Catalog.
* Ingest the dataset into the Data Catalog. This makes the data known to the Data Catalog system at PSI:
* Prepare a metadata file describing the dataset
* Run **``datasetIngestor``** script
* If necessary, the script will copy the data to the PSI archive servers
* Usually this is necessary when archiving from directories other than **``/data/user``** or
**``/data/project``**. It would be also necessary when the Merlin export server (**``merlin-archive.psi.ch``**)
is down for any reason.
* Archive the dataset:
* Visit [https://discovery.psi.ch](https://discovery.psi.ch)
* Click **``Archive``** for the dataset
* The system will now copy the data to the PetaByte Archive at CSCS
* Retrieve data from the catalog:
* Find the dataset on [https://discovery.psi.ch](https://discovery.psi.ch) and click **``Retrieve``**
* Wait for the data to be copied to the PSI retrieval system
* Run **``datasetRetriever``** script
Since large data sets may take a lot of time to transfer, some steps are designed to happen in the
background. The discovery website can be used to track the progress of each step.
### Account Registration
Two types of account permit access to the Data Catalog. If your data was collected at a ***beamline***, you may
have been assigned a **``p-group``** (e.g. ``p12345``) for the experiment. Other users are assigned **``a-group``**
(e.g. ``a-12345``).
Groups are usually assigned to a PI, and then individual user accounts are added to the group. This must be done
upon user request through PSI Service Now. For existing **a-groups** and **p-groups**, you can follow the standard
central procedures. Alternatively, if you do not know how to do that, follow the Merlin6
**[Requesting extra Unix groups](/merlin6/request-account.html#requesting-extra-unix-groups)** procedure, or open
a **[PSI Service Now](https://psi.service-now.com/psisp)** ticket.
### Documentation
Accessing the Data Catalog is done through the [SciCat software](https://melanie.gitpages.psi.ch/SciCatPages/).
Documentation is here: [ingestManual](https://scicatproject.github.io/documentation/Ingestor/ingestManual.html).
#### Loading datacatalog tools
The latest datacatalog software is maintained in the PSI module system. To access it from the Merlin systems, run the following command:
```bash
module load datacatalog
```
This can be done from any host in the Merlin cluster accessible to users; usually, the login nodes are used for archiving.
### Finding your token
As of 2022-04-14 a secure token is required to interact with the data catalog. This is a long random string that replaces the previous user/password authentication (allowing access for non-PSI use cases). **This string should be treated like a password and not shared.**
1. Go to discovery.psi.ch
1. Click 'Sign in' in the top right corner. Click 'Login with PSI account' and log in on the PSI login page.
1. You should be redirected to your user settings and see a 'User Information' section. If not, click on your username in the top right and choose 'Settings' from the menu.
1. Look for the field 'Catamel Token'. This should be a 64-character string. Click the icon to copy the token.
![SciCat website](/images/scicat_token.png)
You will need to save this token for later steps. To avoid including it in all the commands, we suggest saving it to an environment variable (Linux):
```
$ SCICAT_TOKEN=RqYMZcqpqMJqluplbNYXLeSyJISLXfnkwlfBKuvTSdnlpKkU
```
(Hint: prefix this line with a space to avoid saving the token to your bash history.)
Tokens expire after 2 weeks and will need to be fetched from the website again.
### Ingestion
The first step to ingesting your data into the catalog is to prepare a file describing what data you have. This is called
**``metadata.json``**, and can be created with a text editor (e.g. *``vim``*). It can in principle be saved anywhere,
but keeping it with your archived data is recommended. For more information about the format, see the 'Bio metadata'
section below. An example follows:
```json
{
"principalInvestigator": "albrecht.gessler@psi.ch",
"creationLocation": "/PSI/EMF/JEOL2200FS",
"dataFormat": "TIFF+LZW Image Stack",
"sourceFolder": "/gpfs/group/LBR/pXXX/myimages",
"owner": "Wilhelm Tell",
"ownerEmail": "wilhelm.tell@psi.ch",
"type": "raw",
"description": "EM micrographs of amygdalin",
"ownerGroup": "a-12345",
"scientificMetadata": {
"description": "EM micrographs of amygdalin",
"sample": {
"name": "Amygdalin beta-glucosidase 1",
"uniprot": "P29259",
"species": "Apple"
},
"dataCollection": {
"date": "2018-08-01"
},
"microscopeParameters": {
"pixel size": {
"v": 0.885,
"u": "A"
},
"voltage": {
"v": 200,
"u": "kV"
},
"dosePerFrame": {
"v": 1.277,
"u": "e/A2"
}
}
}
}
```
It is recommended to use the [ScicatEditor](https://bliven_s.gitpages.psi.ch/SciCatEditor/) for creating metadata files. This is a browser-based tool specifically for ingesting PSI data. Using the tool avoids syntax errors and provides templates for common data sets and options. The finished JSON file can then be downloaded to Merlin or copied into a text editor.
Another option is to use the SciCat graphical interface from NoMachine. This provides a graphical interface for selecting data to archive. This is particularly useful for data associated with a DUO experiment and p-group. Type `SciCat` to get started after loading the `datacatalog` module. The GUI also replaces the command-line ingestion described below.
The following steps can be run from wherever you saved your ``metadata.json``. First, perform a "dry-run" which will check the metadata for errors:
```bash
datasetIngestor --token $SCICAT_TOKEN metadata.json
```
It will ask for your PSI credentials and then print some info about the data to be ingested. If there are no errors, proceed to the real ingestion:
```bash
datasetIngestor --token $SCICAT_TOKEN --ingest --autoarchive metadata.json
```
You will be asked whether you want to copy the data to the central system:
* If you are on the Merlin cluster and you are archiving data from ``/data/user`` or ``/data/project``, answer 'no' since the data catalog can
directly read the data.
* If you are in a directory other than ``/data/user`` and ``/data/project``, or you are on a desktop computer, answer 'yes'. Copying large datasets
to the PSI archive system may take quite a while (minutes to hours).
If there are no errors, your data has been accepted into the data catalog! From now on, no changes should be made to the ingested data.
This is important, since the next step is for the system to copy all the data to the CSCS Petabyte archive. Writing to tape is slow, so
this process may take several days, and it will fail if any modifications are detected.
If using the ``--autoarchive`` option as suggested above, your dataset should now be in the queue. Check the data catalog:
[https://discovery.psi.ch](https://discovery.psi.ch). Your job should have status 'WorkInProgress'. You will receive an email when the ingestion
is complete.
If you didn't use ``--autoarchive``, you need to manually move the dataset into the archive queue. From **discovery.psi.ch**, navigate to the 'Archive'
tab. You should see the newly ingested dataset. Check the dataset and click **``Archive``**. You should see the status change from **``datasetCreated``** to
**``scheduleArchiveJob``**. This indicates that the data is in the process of being transferred to CSCS.
After a few days the dataset's status will change to **``datasetOnArchive``**, indicating the data is stored. At this point it is safe to delete the data.
#### Useful commands
Running the datasetIngestor in dry mode (**without** ``--ingest``) finds most errors. However, it is sometimes convenient to find potential errors
yourself with simple unix commands.
Find problematic filenames
```bash
find . -iregex '.*/[^/]*[^a-zA-Z0-9_ ./-][^/]*'
```
Find broken links
```bash
find -L . -type l
```
Find outside links
```bash
find . -type l -exec bash -c 'realpath --relative-base "`pwd`" "$0" 2>/dev/null |egrep "^[./]" |sed "s|^|$0 ->|" ' '{}' ';'
```
Delete certain files (use with caution)
```bash
# Empty directories
find . -type d -empty -delete
# Backup files
find . -name '*~' -delete
find . -name '*#autosave#' -delete
```
#### Troubleshooting & Known Bugs
* The following message can be safely ignored:
```bash
key_cert_check_authority: invalid certificate
Certificate invalid: name is not a listed principal
```
It indicates that no Kerberos token was provided for authentication. You can avoid the warning by first running `kinit` (on PSI Linux systems).
* For decentral ingestion cases, the copy step is indicated by a message ``Running [/usr/bin/rsync -e ssh -avxz ...``. It is expected that this
step will take a long time and may appear to have hung. You can check which files have been successfully transferred using rsync:
```bash
rsync --list-only user_n@pb-archive.psi.ch:archive/UID/PATH/
```
where UID is the dataset ID (12345678-1234-1234-1234-123456789012) and PATH is the absolute path to your data. Note that rsync creates directories first and that the transfer order is not alphabetical in some cases, but it should be possible to see whether any data has transferred.
* There is currently a limit on the number of files per dataset (technically, the limit is from the total length of all file paths). It is recommended to break up datasets into 300'000 files or less.
* If it is not possible or desirable to split data between multiple datasets, an alternate work-around is to package files into a tarball. For datasets which are already compressed, omit the -z option for a considerable speedup:
```
tar -cf [output].tar [srcdir]
```
Uncompressed data can be compressed on the cluster using the following command:
```
sbatch /data/software/Slurm/Utilities/Parallel_TarGz.batch -s [srcdir] -t [output].tar -n
```
Run ``/data/software/Slurm/Utilities/Parallel_TarGz.batch -h`` for more details and options.
#### Sample ingestion output (datasetIngestor 1.1.11)
<details>
<summary>[Show Example]: Sample ingestion output (datasetIngestor 1.1.11)</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
/data/project/bio/myproject/archive $ datasetIngestor -copy -autoarchive -allowexistingsource -ingest metadata.json
2019/11/06 11:04:43 Latest version: 1.1.11
2019/11/06 11:04:43 Your version of this program is up-to-date
2019/11/06 11:04:43 You are about to add a dataset to the === production === data catalog environment...
2019/11/06 11:04:43 Your username:
user_n
2019/11/06 11:04:48 Your password:
2019/11/06 11:04:52 User authenticated: XXX
2019/11/06 11:04:52 User is member in following a or p groups: XXX
2019/11/06 11:04:52 OwnerGroup information a-XXX verified successfully.
2019/11/06 11:04:52 contactEmail field added: XXX
2019/11/06 11:04:52 Scanning files in dataset /data/project/bio/myproject/archive
2019/11/06 11:04:52 No explicit filelistingPath defined - full folder /data/project/bio/myproject/archive is used.
2019/11/06 11:04:52 Source Folder: /data/project/bio/myproject/archive at /data/project/bio/myproject/archive
2019/11/06 11:04:57 The dataset contains 100000 files with a total size of 50000000000 bytes.
2019/11/06 11:04:57 creationTime field added: 2019-07-29 18:47:08 +0200 CEST
2019/11/06 11:04:57 endTime field added: 2019-11-06 10:52:17.256033 +0100 CET
2019/11/06 11:04:57 license field added: CC BY-SA 4.0
2019/11/06 11:04:57 isPublished field added: false
2019/11/06 11:04:57 classification field added: IN=medium,AV=low,CO=low
2019/11/06 11:04:57 Updated metadata object:
{
"accessGroups": [
"XXX"
],
"classification": "IN=medium,AV=low,CO=low",
"contactEmail": "XXX",
"creationLocation": "XXX",
"creationTime": "2019-07-29T18:47:08+02:00",
"dataFormat": "XXX",
"description": "XXX",
"endTime": "2019-11-06T10:52:17.256033+01:00",
"isPublished": false,
"license": "CC BY-SA 4.0",
"owner": "XXX",
"ownerEmail": "XXX",
"ownerGroup": "a-XXX",
"principalInvestigator": "XXX",
"scientificMetadata": {
...
},
"sourceFolder": "/data/project/bio/myproject/archive",
"type": "raw"
}
2019/11/06 11:04:57 Running [/usr/bin/ssh -l user_n pb-archive.psi.ch test -d /data/project/bio/myproject/archive].
key_cert_check_authority: invalid certificate
Certificate invalid: name is not a listed principal
user_n@pb-archive.psi.ch's password:
2019/11/06 11:05:04 The source folder /data/project/bio/myproject/archive is not centrally available (decentral use case).
The data must first be copied to a rsync cache server.
2019/11/06 11:05:04 Do you want to continue (Y/n)?
Y
2019/11/06 11:05:09 Created dataset with id 12.345.67890/12345678-1234-1234-1234-123456789012
2019/11/06 11:05:09 The dataset contains 108057 files.
2019/11/06 11:05:10 Created file block 0 from file 0 to 1000 with total size of 413229990 bytes
2019/11/06 11:05:10 Created file block 1 from file 1000 to 2000 with total size of 416024000 bytes
2019/11/06 11:05:10 Created file block 2 from file 2000 to 3000 with total size of 416024000 bytes
2019/11/06 11:05:10 Created file block 3 from file 3000 to 4000 with total size of 416024000 bytes
...
2019/11/06 11:05:26 Created file block 105 from file 105000 to 106000 with total size of 416024000 bytes
2019/11/06 11:05:27 Created file block 106 from file 106000 to 107000 with total size of 416024000 bytes
2019/11/06 11:05:27 Created file block 107 from file 107000 to 108000 with total size of 850195143 bytes
2019/11/06 11:05:27 Created file block 108 from file 108000 to 108057 with total size of 151904903 bytes
2019/11/06 11:05:27 short dataset id: 0a9fe316-c9e7-4cc5-8856-e1346dd31e31
2019/11/06 11:05:27 Running [/usr/bin/rsync -e ssh -avxz /data/project/bio/myproject/archive/ user_n@pb-archive.psi.ch:archive
/0a9fe316-c9e7-4cc5-8856-e1346dd31e31/data/project/bio/myproject/archive].
key_cert_check_authority: invalid certificate
Certificate invalid: name is not a listed principal
user_n@pb-archive.psi.ch's password:
Permission denied, please try again.
user_n@pb-archive.psi.ch's password:
/usr/libexec/test_acl.sh: line 30: /tmp/tmpacl.txt: Permission denied
/usr/libexec/test_acl.sh: line 30: /tmp/tmpacl.txt: Permission denied
/usr/libexec/test_acl.sh: line 30: /tmp/tmpacl.txt: Permission denied
/usr/libexec/test_acl.sh: line 30: /tmp/tmpacl.txt: Permission denied
/usr/libexec/test_acl.sh: line 30: /tmp/tmpacl.txt: Permission denied
...
2019/11/06 12:05:08 Successfully updated {"pid":"12.345.67890/12345678-1234-1234-1234-123456789012",...}
2019/11/06 12:05:08 Submitting Archive Job for the ingested datasets.
2019/11/06 12:05:08 Job response Status: okay
2019/11/06 12:05:08 A confirmation email will be sent to XXX
12.345.67890/12345678-1234-1234-1234-123456789012
</pre>
</details>
### Publishing
After datasets are ingested they can be assigned a public DOI. This can be included in publications and will make the datasets available on http://doi.psi.ch.
For instructions on this, please read the ['Publish' section in the ingest manual](https://scicatproject.github.io/documentation/Ingestor/ingestManual.html#sec-8).
### Retrieving data
Retrieving data from the archive is also initiated through the Data Catalog. Please read the ['Retrieve' section in the ingest manual](https://scicatproject.github.io/documentation/Ingestor/ingestManual.html#sec-6).
## Further Information
* [PSI Data Catalog](https://discovery.psi.ch)
* [Full Documentation](https://scicatproject.github.io/documentation/Ingestor/ingestManual.html)
* [Published Datasets (doi.psi.ch)](https://doi.psi.ch)
* Data Catalog [PSI page](https://www.psi.ch/photon-science-data-services/data-catalog-and-archive)
* Data catalog [SciCat Software](https://scicatproject.github.io/)
* [FAIR](https://www.nature.com/articles/sdata201618) definition and [SNF Research Policy](http://www.snf.ch/en/theSNSF/research-policies/open_research_data/Pages/default.aspx#FAIR%20Data%20Principles%20for%20Research%20Data%20Management)
* [Petabyte Archive at CSCS](https://www.cscs.ch/fileadmin/user_upload/contents_publications/annual_reports/AR2017_Online.pdf)

View File

@@ -0,0 +1,50 @@
---
title: Connecting from a Linux Client
#tags:
keywords: linux, connecting, client, configuration, SSH, X11
last_updated: 07 September 2022
summary: "This document describes a recommended setup for a Linux client."
sidebar: merlin6_sidebar
permalink: /merlin6/connect-from-linux.html
---
## SSH without X11 Forwarding
This is the standard method. Official X11 support is provided through [NoMachine](/merlin6/nomachine.html).
For normal SSH sessions, use your SSH client as follows:
```bash
ssh $username@merlin-l-01.psi.ch
ssh $username@merlin-l-001.psi.ch
ssh $username@merlin-l-002.psi.ch
```
## SSH with X11 Forwarding
Official X11 Forwarding support is through NoMachine. Please follow the document
[{Job Submission -> Interactive Jobs}](/merlin6/interactive-jobs.html#Requirements) and
[{Accessing Merlin -> NoMachine}](/merlin6/nomachine.html) for more details. However,
we provide a small recipe for enabling X11 Forwarding in Linux.
* For enabling client X11 forwarding, add the following to the start of ``~/.ssh/config``
to implicitly enable trusted X11 forwarding (as with ``-Y``) for all ssh connections:
```bash
ForwardAgent yes
ForwardX11 yes
ForwardX11Trusted yes
```
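To limit these settings to the Merlin login nodes rather than applying them to all hosts, they can be scoped to a ``Host`` block (a sketch; adjust the host pattern as needed):

```bash
Host merlin-l-*
    ForwardAgent yes
    ForwardX11 yes
    ForwardX11Trusted yes
```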
* Alternatively, you can add the option ``-X`` (or ``-Y`` for trusted forwarding) to the ``ssh`` command. For example:
```bash
ssh -X $username@merlin-l-01.psi.ch
ssh -X $username@merlin-l-001.psi.ch
ssh -X $username@merlin-l-002.psi.ch
```
* To test that X11 forwarding works, just run ``xclock``. An X11-based clock should
pop up in your client session:
```bash
xclock
```

View File

@@ -0,0 +1,60 @@
---
title: Connecting from a MacOS Client
#tags:
keywords: MacOS, mac os, mac, connecting, client, configuration, SSH, X11
last_updated: 07 September 2022
summary: "This document describes a recommended setup for a MacOS client."
sidebar: merlin6_sidebar
permalink: /merlin6/connect-from-macos.html
---
## SSH without X11 Forwarding
This is the standard method. Official X11 support is provided through [NoMachine](/merlin6/nomachine.html).
For normal SSH sessions, use your SSH client as follows:
```bash
ssh $username@merlin-l-01.psi.ch
ssh $username@merlin-l-001.psi.ch
ssh $username@merlin-l-002.psi.ch
```
## SSH with X11 Forwarding
### Requirements
For running SSH with X11 forwarding on MacOS, one needs to have an X server running on MacOS.
The official X server for MacOS is **[XQuartz](https://www.xquartz.org/)**. Please ensure
it is running before starting an SSH connection with X11 forwarding.
### SSH with X11 Forwarding in MacOS
Official X11 support is through NoMachine. Please follow the document
[{Job Submission -> Interactive Jobs}](/merlin6/interactive-jobs.html#Requirements) and
[{Accessing Merlin -> NoMachine}](/merlin6/nomachine.html) for more details. However,
we provide a small recipe for enabling X11 Forwarding in MacOS.
* Ensure that **[XQuartz](https://www.xquartz.org/)** is installed and running in your MacOS.
* For enabling client X11 forwarding, add the following to the start of ``~/.ssh/config``
to implicitly enable trusted X11 forwarding (as with ``-Y``) for all ssh connections:
```bash
ForwardAgent yes
ForwardX11 yes
ForwardX11Trusted yes
```
* Alternatively, you can add the option ``-X`` (or ``-Y`` for trusted forwarding) to the ``ssh`` command. For example:
```bash
ssh -X $username@merlin-l-01.psi.ch
ssh -X $username@merlin-l-001.psi.ch
ssh -X $username@merlin-l-002.psi.ch
```
* To test that X11 forwarding works, just run ``xclock``. An X11-based clock should
pop up in your client session.
```bash
xclock
```
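If ``xclock`` does not appear, a quick diagnostic is to check whether the forwarded display is set in the remote session:

```bash
echo $DISPLAY   # should print something like 'localhost:10.0' when forwarding is active
```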

View File

@@ -0,0 +1,47 @@
---
title: Connecting from a Windows Client
keywords: microsoft, mocosoft, windows, putty, xming, connecting, client, configuration, SSH, X11
last_updated: 07 September 2022
summary: "This document describes a recommended setup for a Windows client."
sidebar: merlin6_sidebar
permalink: /merlin6/connect-from-windows.html
---
## SSH with PuTTY without X11 Forwarding
PuTTY is one of the most common tools for SSH.
Check if the following software packages are installed on the Windows workstation by
inspecting the *Start* menu (hint: use the *Search* box to save time):
* PuTTY (should be already installed)
* *[Optional]* Xming (needed for [SSH with X11 Forwarding](/merlin6/connect-from-windows.html#ssh-with-x11-forwarding))
If they are missing, you can install them using the Software Kiosk icon on the Desktop.
1. Start PuTTY
2. *[Optional]* Enable ``xterm`` to have similar mouse behaviour as in Linux:
![Enable 'xterm']({{ "/images/PuTTY/Putty_Mouse_XTerm.png" }})
3. Create session to a Merlin login node and *Open*:
![Create Merlin Session]({{ "/images/PuTTY/Putty_Session.png" }})
## SSH with PuTTY with X11 Forwarding
Official X11 Forwarding support is through NoMachine. Please follow the document
[{Job Submission -> Interactive Jobs}](/merlin6/interactive-jobs.html#Requirements) and
[{Accessing Merlin -> NoMachine}](/merlin6/nomachine.html) for more details. However,
we provide a small recipe for enabling X11 Forwarding in Windows.
Check if **Xming** is installed on the Windows workstation by inspecting the
*Start* menu (hint: use the *Search* box to save time). If missing, you can install it by
using the Software Kiosk icon (should be located on the Desktop).
1. Ensure that an X server (**Xming**) is running. Otherwise, start it.
2. Enable X11 forwarding in your SSH client. For example, in PuTTY:
![Enable X11 Forwarding in Putty]({{ "/images/PuTTY/Putty_X11_Forwarding.png" }})

View File

@@ -0,0 +1,192 @@
---
title: Kerberos and AFS authentication
#tags:
keywords: kerberos, AFS, kinit, klist, keytab, tickets, connecting, client, configuration, slurm
last_updated: 07 September 2022
summary: "This document describes how to use Kerberos."
sidebar: merlin6_sidebar
permalink: /merlin6/kerberos.html
---
Projects and users have their own areas in the central PSI AFS service. In order
to access these areas, valid Kerberos and AFS tickets must be granted.
These tickets are automatically granted when accessing through SSH with
username and password. Alternatively, one can get a granting ticket with the `kinit` (Kerberos)
and `aklog` (AFS ticket, which needs to be run after `kinit`) commands.
Due to PSI security policies, the maximum lifetime of a ticket is 7 days, and the default
lifetime is 10 hours. This means that one needs to regularly renew (`krenew` command) the existing
granting tickets, and their validity cannot be extended beyond 7 days. At that point,
one needs to obtain new granting tickets.
## Obtaining granting tickets with username and password
As already described above, the most common use case is to obtain Kerberos and AFS granting tickets
by introducing username and password:
* When logging in to Merlin through SSH, if this is done with username + password authentication,
tickets for Kerberos and AFS will be automatically obtained.
* When logging in to Merlin through NoMachine, no Kerberos or AFS tickets are granted. Therefore, users need to
run `kinit` (to obtain a granting Kerberos ticket) followed by `aklog` (to obtain a granting AFS ticket).
See further details below.
To manually obtain granting tickets, one has to:
1. To obtain a granting Kerberos ticket, one needs to run `kinit $USER` and enter the PSI password.
```bash
kinit $USER@D.PSI.CH
```
2. To obtain a granting ticket for AFS, one needs to run `aklog`. No password is necessary, but a valid
Kerberos ticket is mandatory.
```bash
aklog
```
3. To list the status of your granted tickets, users can use the `klist` command.
```bash
klist
```
4. To extend the validity of existing granting tickets, users can use the `krenew` command.
```bash
krenew
```
* Keep in mind that the maximum lifetime for granting tickets is 7 days, therefore `krenew` cannot be used beyond that limit;
`kinit` should be used instead.
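For long interactive sessions, the renewal can also be kept running in the background (a sketch assuming the `krenew` implementation from the *kstart* package, which supports these flags):

```bash
# Renew the Kerberos ticket every 60 minutes (-K 60) in the background (-b),
# refreshing the AFS token after each renewal (-t), until the 7-day limit:
krenew -b -t -K 60
```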
## Obtaining granting tickets with keytab
Sometimes, obtaining granting tickets by using password authentication is not possible. An example is user Slurm jobs
requiring access to private areas in AFS. For such cases, there is the possibility to generate a **keytab** file.
Be aware that the **keytab** file must be **private**, **fully protected** by correct permissions and not shared with any
other users.
### Creating a keytab file
For generating a **keytab**, one has to:
1. Load a newer Kerberos (`krb5/1.20` or higher) from Pmodules:
```bash
module load krb5/1.20
```
2. Create a private directory for storing the Kerberos **keytab** file
```bash
mkdir -p ~/.k5
```
3. Run the `ktutil` utility which comes with the loaded `krb5` Pmodule:
```bash
ktutil
```
4. In the `ktutil` console, one has to generate a **keytab** file as follows:
```bash
# Replace $USER by your username
add_entry -password -k 0 -f -p $USER
wkt /psi/home/$USER/.k5/krb5.keytab
exit
```
Notice that you will need to enter your password once. This step is required for generating the **keytab** file.
5. Once back to the main shell, one has to ensure that the file contains the proper permissions:
```bash
chmod 0600 ~/.k5/krb5.keytab
```
### Obtaining tickets by using keytab files
Once the keytab is created, one can obtain kerberos tickets without being prompted for a password as follows:
```bash
kinit -kt ~/.k5/krb5.keytab $USER
aklog
```
## Slurm jobs accessing AFS
Some jobs may require access to private areas in AFS. For that, having a valid [**keytab**](/merlin6/kerberos.html#obtaining-granting-tickets-with-keytab) file is required.
Then, from inside the batch script one can obtain granting tickets for Kerberos and AFS, which can be used for accessing AFS private areas.
The steps should be the following:
* Set up `KRB5CCNAME`, which can be used to specify the location of the Kerberos5 credentials (ticket) cache. In general it should point to a shared area
(`$HOME/.k5` is a good location), and it is strongly recommended to generate an independent Kerberos5 credential cache (that is, creating a new credential cache per Slurm job):
```bash
export KRB5CCNAME="$(mktemp "$HOME/.k5/krb5cc_XXXXXX")"
```
* To obtain a Kerberos5 granting ticket, run `kinit` by using your keytab:
```bash
kinit -kt "$HOME/.k5/krb5.keytab" $USER@D.PSI.CH
```
* To obtain a granting AFS ticket, run `aklog`:
```bash
aklog
```
* At the end of the job, you can destroy the existing Kerberos tickets.
```bash
kdestroy
```
### Slurm batch script example: obtaining KRB+AFS granting tickets
#### Example 1: Independent credential cache per Slurm job
This is the **recommended** way. At the end of the job, it is strongly recommended to destroy the existing Kerberos tickets.
```bash
#!/bin/bash
#SBATCH --partition=hourly # Specify 'general' or 'daily' or 'hourly'
#SBATCH --time=01:00:00 # Strictly recommended when using 'general' partition.
#SBATCH --output=run.out # Generate custom output file
#SBATCH --error=run.err # Generate custom error file
#SBATCH --nodes=1               # Number of nodes to use
#SBATCH --ntasks=1              # Number of tasks to run
#SBATCH --cpus-per-task=1
#SBATCH --constraint=xeon-gold-6152
#SBATCH --hint=nomultithread
#SBATCH --job-name=krb5
export KRB5CCNAME="$(mktemp "$HOME/.k5/krb5cc_XXXXXX")"
kinit -kt "$HOME/.k5/krb5.keytab" $USER@D.PSI.CH
aklog
klist
echo "Here should go my batch script code."
# Destroy Kerberos tickets created for this job only
kdestroy
klist
```
#### Example 2: Shared credential cache
Some users may need/prefer to run with a shared cache file. For doing that, one needs to
setup `KRB5CCNAME` from the **login node** session, before submitting the job.
```bash
export KRB5CCNAME="$(mktemp "$HOME/.k5/krb5cc_XXXXXX")"
```
Then, you can run one or multiple jobs scripts (or parallel job with `srun`). `KRB5CCNAME` will be propagated to the
job script or to the parallel job, therefore a single credential cache will be shared amongst different Slurm runs.
```bash
#!/bin/bash
#SBATCH --partition=hourly # Specify 'general' or 'daily' or 'hourly'
#SBATCH --time=01:00:00 # Strictly recommended when using 'general' partition.
#SBATCH --output=run.out # Generate custom output file
#SBATCH --error=run.err # Generate custom error file
#SBATCH --nodes=1               # Number of nodes to use
#SBATCH --ntasks=1              # Number of tasks to run
#SBATCH --cpus-per-task=1
#SBATCH --constraint=xeon-gold-6152
#SBATCH --hint=nomultithread
#SBATCH --job-name=krb5
# KRB5CCNAME is inherited from the login node session
kinit -kt "$HOME/.k5/krb5.keytab" $USER@D.PSI.CH
aklog
klist
echo "Here should go my batch script code."
echo "No need to run 'kdestroy', as it may have to survive for running other jobs"
```

View File

@@ -0,0 +1,109 @@
---
title: Using merlin_rmount
#tags:
keywords: >-
transferring data, data transfer, rsync, dav, webdav, sftp, ftp, smb, cifs,
copy data, copying, mount, file, folder, sharing
last_updated: 24 August 2023
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/merlin-rmount.html
---
## Background
Merlin provides a command for mounting remote file systems, called `merlin_rmount`. This
provides a helpful wrapper over the Gnome storage utilities (GIO and GVFS), and supports a wide range of remote file protocols, including
- SMB/CIFS (Windows shared folders)
- WebDav
- AFP
- FTP, SFTP
- [complete list](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/using_the_desktop_environment_in_rhel_8/managing-storage-volumes-in-gnome_using-the-desktop-environment-in-rhel-8#gvfs-back-ends_managing-storage-volumes-in-gnome)
## Usage
### Start a session
First, start a new session. This will start a new bash shell in the current terminal where you can enter further commands.
```
$ merlin_rmount --init
[INFO] Starting new D-Bus RMOUNT session
(RMOUNT STARTED) [bliven_s@merlin-l-002 ~]$
```
Note that behind the scenes this creates a new D-Bus daemon. Running multiple daemons on the same login node leads to unpredictable results, so it is best not to initialize multiple sessions in parallel.
### Standard Endpoints
Standard endpoints can be mounted using
```
merlin_rmount --select-mount
```
Select the desired url using the arrow keys.
![merlin_rmount --select-mount](/images/rmount/select-mount.png)
From this list any of the standard supported endpoints can be mounted.
### Other endpoints
Other endpoints can be mounted using the `merlin_rmount --mount <endpoint>` command.
![merlin_rmount --mount](/images/rmount/mount.png)
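For example, mounting a (hypothetical) Windows share could look like this:

```
merlin_rmount --mount smb://fileserver.psi.ch/myshare
```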
### Accessing Files
After mounting a volume the script will print the mountpoint. It should be of the form
```
/run/user/$UID/gvfs/<endpoint>
```
where `$UID` gives your unix user id (a 5-digit number, also viewable with `id -u`) and
`<endpoint>` is some string generated from the mount options.
For convenience, it may be useful to add a symbolic link to this gvfs directory. For instance, this makes all mounted volumes accessible under ~/mnt/:
```
ln -s /run/user/$UID/gvfs ~/mnt
```
Files are accessible as long as the `merlin_rmount` shell remains open.
### Disconnecting
To disconnect, close the session with one of the following:
- The `exit` command
- CTRL-D
- Closing the terminal
Disconnecting will unmount all volumes.
## Alternatives
### Thunar
Users that prefer a GUI file browser may prefer the `thunar` command, which opens the Thunar file browser. This is also available in NoMachine sessions in the bottom bar (1). Thunar supports the same remote filesystems as `merlin_rmount`; just type the URL in the address bar (2).
![Mounting with thunar](/images/rmount/thunar_mount.png)
When using thunar within a NoMachine session, file transfers continue after closing NoMachine (as long as the NoMachine session stays active).
Files can also be accessed at the command line as needed (see 'Accessing Files' above).
## Resources
- [BIO docs](https://intranet.psi.ch/en/bio/webdav-data) on using these tools for
transferring EM data
- [Red Hat docs on GVFS](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/using_the_desktop_environment_in_rhel_8/managing-storage-volumes-in-gnome_using-the-desktop-environment-in-rhel-8)
- [gio reference](https://developer-old.gnome.org/gio/stable/gio.html)

View File

@@ -0,0 +1,122 @@
---
title: Remote Desktop Access
#tags:
keywords: NX, nomachine, remote desktop access, login node, merlin-l-001, merlin-l-002, merlin-nx-01, merlin-nx-02, merlin-nx, rem-acc, vpn
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/nomachine.html
---
Users can log in to Merlin through a Linux remote desktop session. NoMachine
is a desktop virtualization tool, similar to VNC, Remote Desktop, etc.
It uses the NX protocol to enable a graphical login to remote servers.
## Installation
NoMachine is available for PSI Windows computers in the Software Kiosk under the
name **NX Client**. Please use the latest version (at least 6.0). For MacOS and
Linux, the NoMachine client can be downloaded from https://www.nomachine.com/.
## Accessing Merlin6 NoMachine from PSI
The Merlin6 NoMachine service is hosted on the following machine:
* **`merlin-nx.psi.ch`**
This is the **front-end** (i.e. *the door*) to the NoMachine **back-end nodes**,
which run the NoMachine desktop service. The **back-end nodes** are the following:
* `merlin-l-001.psi.ch`
* `merlin-l-002.psi.ch`
Any access to the login node desktops must be done through **`merlin-nx.psi.ch`**
(or from **`rem-acc.psi.ch -> merlin-nx.psi.ch`** when connecting from outside PSI).
The **front-end** service running on **`merlin-nx.psi.ch`** will load balance the sessions
and login to any of the available nodes in the **back-end**.
**Only 1 session per back-end** is possible.
All the steps necessary for configuring access to the
NoMachine service running on a login node are explained below.
### Creating a Merlin6 NoMachine connection
#### Adding a new connection to the front-end
Click the **Add** button to create a new connection to the **`merlin-nx.psi.ch` front-end**, and fill in
the following fields:
* **Name**: Specify a custom name for the connection. Examples: `merlin-nx`, `merlin-nx.psi.ch`, `Merlin Desktop`
* **Host**: Specify the hostname of the **front-end** service: **`merlin-nx.psi.ch`**
* **Protocol**: specify the protocol that will be used for the connection. *Recommended* protocol: **`NX`**
* **Port**: Specify the listening port of the **front-end**. It must be **`4000`**.
![Create New NoMachine Connection]({{ "/images/NoMachine/screen_nx_connect.png" }})
#### Configuring NoMachine Authentication Method
Depending on the client version, different authentication options may be offered.
If required, choose your authentication method and press **Continue** (**Password** or *Kerberos* are the recommended ones).
You will then be asked for your credentials (username / password). **Do not add `PSICH\`** as a prefix for the username.
### Opening NoMachine desktop sessions
By default, when connecting to the **`merlin-nx.psi.ch` front-end** it will automatically open a new
session if none exists.
If there are existing sessions, instead of opening a new desktop session, users can reconnect to an
existing one by clicking the corresponding icon (see image below).
![Open an existing Session]({{ "/images/NoMachine/screen_nx_existingsession.png" }})
Users can also create a second desktop session by selecting the **`New Desktop`** button (*red* rectangle in the
image below). This will create a second session on the second login node, as long as this node is up and running.
![Open a New Desktop]({{ "/images/NoMachine/screen_nx_newsession.png" }})
### NoMachine LightDM Session Example
An example of a NoMachine session, based on the [LightDM](https://github.com/canonical/lightdm)
X Windows display manager:
![NoMachine Session: LightDM Desktop]({{ "/images/NoMachine/screen_nx11.png" }})
## Accessing Merlin6 NoMachine from outside PSI
### No VPN access
Access to the Merlin6 NoMachine service is possible without VPN through **'rem-acc.psi.ch'**.
Please follow the steps described in [PSI Remote Interactive Access](https://www.psi.ch/en/photon-science-data-services/remote-interactive-access) for
remote access to the Merlin6 NoMachine services. Once logged in to **'rem-acc.psi.ch'**, you must then log in to the **`merlin-nx.psi.ch` front-end**.
### VPN access
Remote access is also possible through VPN, however, you **must not use 'rem-acc.psi.ch'**, and you have to connect directly
to the Merlin6 NoMachine **`merlin-nx.psi.ch` front-end** as if you were inside PSI. For VPN access, you should request
it from the IT department by opening a PSI Service Now ticket:
[VPN Access (PSI employees)](https://psi.service-now.com/psisp?id=psi_new_sc_cat_item&sys_id=beccc01b6f44a200d02a82eeae3ee440).
## Advanced Display Settings
**Nomachine Display Settings** can be accessed and changed either when creating a new session or by clicking the very top right corner of a running session.
### Prevent Rescaling
These settings prevent "blurriness" at the cost of some performance! (Choose depending on your performance needs.)
* Display > Resize remote display (forces 1:1 pixel sizes)
* Display > Change settings > Quality: Choose Medium-Best Quality
* Display > Change settings > Modify advanced settings
* Check: Disable network-adaptive display quality (disables lossy compression)
* Check: Disable client side image post-processing

View File

@@ -0,0 +1,159 @@
---
title: Configuring SSH Keys in Merlin
#tags:
keywords: linux, connecting, client, configuration, SSH, Keys, SSH-Keys, RSA, authorization, authentication
last_updated: 15 Jul 2020
summary: "This document describes how to deploy SSH Keys in Merlin."
sidebar: merlin6_sidebar
permalink: /merlin6/ssh-keys.html
---
Merlin users sometimes need to access the different Merlin services without being constantly prompted for a password.
This can be achieved with Kerberos authentication, however in some cases some software requires the setup of SSH keys instead.
One example is ANSYS Fluent: when used interactively, the GUI communicates with the different nodes
through the SSH protocol, and the use of SSH keys is enforced.
## Setting up SSH Keys on Merlin
For security reasons, users **must always protect SSH Keys with a passphrase**.
Users can check whether an SSH key already exists. Keys are placed in the **~/.ssh/** directory; `RSA` encryption
is usually the default, and the corresponding files are **`id_rsa`** (private key) and **`id_rsa.pub`** (public key).
```bash
ls ~/.ssh/id*
```
For creating **SSH RSA Keys**, one should:
1. Run `ssh-keygen`; a passphrase will be requested twice. You **must remember** this passphrase for the future.
   * For security reasons, ***always protect the key with a passphrase***. The only exception is when running ANSYS software, which in general should use a passphrase-less key to simplify running the software in Slurm.
* This will generate a private key **id_rsa**, and a public key **id_rsa.pub** in your **~/.ssh** directory.
2. Add your public key to the **`authorized_keys`** file, and ensure proper permissions for that file, as follows:
```bash
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
```
3. Configure the SSH client in order to force the usage of the **psi.ch** domain for trusting keys:
```bash
echo "CanonicalizeHostname yes" >> ~/.ssh/config
```
4. Configure further SSH options as follows:
```bash
echo "AddKeysToAgent yes" >> ~/.ssh/config
echo "ForwardAgent yes" >> ~/.ssh/config
```
Other options may be added.
5. Check that your SSH config file contains at least the lines mentioned in steps 3 and 4:
```bash
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# cat ~/.ssh/config
CanonicalizeHostname yes
AddKeysToAgent yes
ForwardAgent yes
```
## Using the SSH Keys
### Using Authentication Agent in SSH session
By default, when accessing the login node via SSH (with `ForwardAgent=yes`), it will automatically add your
SSH keys to the authentication agent. Hence, no action should be needed by the user. One can configure
`ForwardAgent=yes` as follows:
* **(Recommended)** In your local Linux (workstation, laptop or desktop) add the following line in the
`$HOME/.ssh/config` (or alternatively in `/etc/ssh/ssh_config`) file:
```
ForwardAgent yes
```
* Alternatively, you can add the option `ForwardAgent=yes` to each SSH command. For example:
```bash
ssh -XY -o ForwardAgent=yes merlin-l-001.psi.ch
```
If `ForwardAgent` is not enabled as shown above, one needs to run the authentication agent and then add your key
to the **ssh-agent**. This must be done once per SSH session, as follows:
* Run `eval $(ssh-agent -s)` to run the **ssh-agent** in that SSH session
* Check whether the authentication agent has your key already added:
```bash
ssh-add -l | grep "/psi/home/$(whoami)/.ssh"
```
* If no key is returned in the previous step, you have to add the private key identity to the authentication agent.
You will be requested for the **passphrase** of your key, and it can be done by running:
```bash
ssh-add
```
### Using Authentication Agent in NoMachine Session
By default, when using a NoMachine session, the `ssh-agent` should be automatically started. Hence, there is no need
to start the agent or forward it.
However, in NoMachine one always needs to add the private key identity to the authentication agent. This can be done as follows:
1. Check whether the authentication agent has already the key added:
```bash
ssh-add -l | grep "/psi/home/$(whoami)/.ssh"
```
2. If no key is returned in the previous step, you have to add the private key identity to the authentication agent.
You will be requested for the **passphrase** of your key, and it can be done by running:
```bash
ssh-add
```
You just need to run it once per NoMachine session, and it will apply to all terminal windows within that NoMachine session.
## Troubleshooting
### Errors when running 'ssh-add'
If the error `Could not open a connection to your authentication agent.` appears when running `ssh-add`, it means
that the authentication agent is not running. Please follow the previous procedures for starting it.
### Add/Update SSH RSA Key password
If an existing SSH Key does not have password, or you want to update an existing password with a new one, you can do it as follows:
```bash
ssh-keygen -p -f ~/.ssh/id_rsa
```
### SSH Keys deployed but not working
Please ensure proper permissions on the involved files, and check for typos in the file names:
```bash
chmod u+rwx,go-rwx,g+s ~/.ssh
chmod u+rw-x,go-rwx ~/.ssh/authorized_keys
chmod u+rw-x,go-rwx ~/.ssh/id_rsa
chmod u+rw-x,go+r-wx ~/.ssh/id_rsa.pub
```
### Testing SSH Keys
Once SSH Key is created, for testing that the SSH Key is valid, one can do the following:
1. Create a **new** SSH session in one of the login nodes:
```bash
ssh merlin-l-001
```
2. In the login node session, destroy any existing Kerberos ticket or active SSH Key:
```bash
kdestroy
ssh-add -D
```
3. Add the new private key identity to the authentication agent. You will be requested by the passphrase.
```bash
ssh-add
```
4. Check that your key is active in the SSH agent:
```bash
ssh-add -l
```
5. SSH to the second login node. No password should be requested:
```bash
ssh -vvv merlin-l-002
```
If the last step succeeds, it means that your SSH key is properly set up.

View File

@@ -0,0 +1,197 @@
---
title: Merlin6 Storage
#tags:
keywords: storage, /data/user, /data/software, /data/project, /scratch, /shared-scratch, quota, export, user, project, scratch, data, shared-scratch, merlin_quotas
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
redirect_from: /merlin6/data-directories.html
permalink: /merlin6/storage.html
---
## Introduction
This document describes the different directories of the Merlin6 cluster.
### User and project data
* ***Users are responsible for backing up their own data***. It is recommended to back up the data on independent third-party systems (e.g. LTS, Archive, AFS, SwitchDrive, Windows Shares, etc.).
* **`/psi/home`**, as this contains a small amount of data, is the only directory where we can provide daily snapshots for one week. These can be found in the directory **`/psi/home/.snapshot/`**
* ***When users leave PSI, they or their supervisor/team are responsible for backing up and moving the data out of the cluster***: every few months, the storage space of former users without an existing and valid PSI account will be recycled.
{{site.data.alerts.warning}}When a user leaves PSI and their account has been removed, their storage space in Merlin may be recycled.
Hence, <b>when a user leaves PSI</b>, they, their supervisor or their team <b>must ensure that the data is backed up to an external storage</b>
{{site.data.alerts.end}}
### Checking user quota
For each directory, we provide a way of checking quotas (when required). In addition, the single command ``merlin_quotas``
shows all quotas for your filesystems at once (including AFS, which is not covered here).
To check your quotas, please run:
```bash
merlin_quotas
```
## Merlin6 directories
Merlin6 offers the following directory classes for users:
* ``/psi/home/<username>``: Private user **home** directory
* ``/data/user/<username>``: Private user **data** directory
* ``/data/project/general/<projectname>``: Shared **Project** directory
* For BIO experiments, a dedicated ``/data/project/bio/$projectname`` exists.
* ``/scratch``: Local *scratch* disk (only visible by the node running a job).
* ``/shared-scratch``: Shared *scratch* disk (visible from all nodes).
* ``/export``: Export directory for data transfer, visible from `ra-merlin-01.psi.ch`, `ra-merlin-02.psi.ch` and Merlin login nodes.
* Refer to **[Transferring Data](/merlin6/transfer-data.html)** for more information about the export area and data transfer service.
{{site.data.alerts.tip}}In GPFS there is a concept called <b>GraceTime</b>. Filesystems have a block (amount of data) and file (number of files) quota.
Each quota has a soft and a hard limit. Once the soft limit is reached, users can keep writing up to their hard limit during the <b>grace period</b>.
Once the <b>GraceTime</b> or the hard limit is reached, users will be unable to write and will need to remove data to get below the soft limit (or ask for a quota increase
where possible, see the table below).
{{site.data.alerts.end}}
Properties of the directory classes:
| Directory                          | Block Quota [Soft:Hard] | File Quota [Soft:Hard]  | GraceTime | Quota Change Policy: Block          | Quota Change Policy: Files        | Backup | Backup Policy                   |
| ---------------------------------- | ----------------------- | ----------------------- | :-------: | :--------------------------------- |:-------------------------------- | ------ | :----------------------------- |
| /psi/home/$username                | USR [10GB:11GB]         | *Undef*                 | N/A       | Up to x2 when strongly justified.   | N/A                               | yes    | Daily snapshots for 1 week     |
| /data/user/$username               | USR [1TB:1.074TB]       | USR [1M:1.1M]           | 7d        | Immutable. Need a project.          | Changeable when justified.        | no     | Users responsible for backup   |
| /data/project/bio/$projectname     | GRP [1TB:1.074TB]       | GRP [1M:1.1M]           | 7d        | Subject to project requirements.    | Subject to project requirements.  | no     | Project responsible for backup |
| /data/project/general/$projectname | GRP [1TB:1.074TB]       | GRP [1M:1.1M]           | 7d        | Subject to project requirements.    | Subject to project requirements.  | no     | Project responsible for backup |
| /scratch                           | *Undef*                 | *Undef*                 | N/A       | N/A                                 | N/A                               | no     | N/A                            |
| /shared-scratch                    | USR [512GB:2TB]         | USR [2M:2.5M]           | 7d        | Up to x2 when strongly justified.   | Changeable when justified.        | no     | N/A                            |
| /export                            | USR [10MB:20TB]         | USR [512K:5M]           | 10d       | Soft can be temporarily increased.  | Changeable when justified.        | no     | N/A                            |
{{site.data.alerts.warning}}The use of <b>scratch</b> and <b>export</b> areas as an extension of the quota <i>is forbidden</i>. <b>scratch</b> and <b>export</b> areas <i>must not contain</i> final data.
<br><b><i>Auto cleanup policies</i></b> in the <b>scratch</b> and <b>export</b> areas are applied.
{{site.data.alerts.end}}
### User home directory
This is the default directory users land in when logging in to any Merlin6 machine.
It is intended for your scripts, documents, software development, and other files which
you want to have backed up. Do not use it for data storage or I/O-hungry HPC tasks.
This directory is mounted in the login and computing nodes under the path:
```bash
/psi/home/$username
```
Home directories are part of the PSI NFS Central Home storage provided by AIT and
are managed by the Merlin6 administrators.
Users can check their quota by running the following command:
```bash
quota -s
```
#### Home directory policy
* Read the **Important: Code of Conduct** documentation for more information about Merlin6 policies.
* It is **forbidden** to use the home directories for I/O intensive tasks
* Use ``/scratch``, ``/shared-scratch``, ``/data/user`` or ``/data/project`` for this purpose.
* Users can retrieve up to 1 week of their lost data thanks to the automatic **daily snapshots for 1 week**.
Snapshots can be accessed at this path:
```bash
/psi/home/.snapshot/$username
```
### User data directory
The user data directory is intended for *fast IO access* and keeping large amounts of private data.
This directory is mounted in the login and computing nodes under the directory
```bash
/data/user/$username
```
Users can check their quota by running the following command:
```bash
mmlsquota -u <username> --block-size auto merlin-user
```
#### User data directory policy
* Read the **Important: Code of Conduct** documentation for more information about Merlin6 policies.
* It is **forbidden** to use the data directories as ``scratch`` area during a job runtime.
* Use ``/scratch``, ``/shared-scratch`` for this purpose.
* No backup policy is applied for user data directories: users are responsible for backing up their data.
### Project data directory
This storage is intended for *fast IO access* and keeping large amounts of a project's data, where the data also can be
shared by all members of the project (the project's corresponding unix group). We recommend keeping most data in
project-related storage spaces, since this allows users to coordinate. Also, project spaces have more flexible policies
regarding extending the available storage space.
Experiments can request a project space as described in **[{Accessing Merlin -> Requesting a Project}](/merlin6/request-project.html)**
Once created, the project data directory will be mounted in the login and computing nodes under the directory:
```bash
/data/project/general/$projectname
```
Project quotas are defined on a per *group* basis. Users can check the project quota by running the following command:
```bash
mmlsquota -j $projectname --block-size auto -C merlin.psi.ch merlin-proj
```
#### Project Directory policy
* Read the **Important: Code of Conduct** documentation for more information about Merlin6 policies.
* It is **forbidden** to use the data directories as ``scratch`` area during a job's runtime, i.e. for high throughput I/O for a job's temporary files. Please use ``/scratch``, ``/shared-scratch`` for this purpose.
* No backups: users are responsible for managing the backups of their data directories.
### Scratch directories
There are two different types of scratch storage: **local** (``/scratch``) and **shared** (``/shared-scratch``).
**local** scratch should be used for all jobs that do not require the scratch files to be accessible from multiple nodes, which is trivially
true for all jobs running on a single node.
**shared** scratch is intended for files that need to be accessible by multiple nodes, e.g. by a MPI-job where tasks are spread out over the cluster
and all tasks need to do I/O on the same temporary files.
**local** scratch in Merlin6 computing nodes provides a huge number of IOPS thanks to NVMe technology. **Shared** scratch is implemented using a distributed parallel filesystem (GPFS), resulting in higher latency, since it involves remote storage resources and more complex I/O coordination.
``/shared-scratch`` is only mounted in the *Merlin6* computing nodes (i.e. not on the login nodes), and its current size is 50TB. This can be increased in the future.
The properties of the available scratch storage spaces are given in the following table
| Cluster | Service | Scratch | Scratch Mountpoint | Shared Scratch | Shared Scratch Mountpoint | Comments |
| ------- | -------------- | ------------ | ------------------ | -------------- | ------------------------- | -------------------------------------- |
| merlin5 | computing node | 50GB / SAS | ``/scratch`` | ``N/A`` | ``N/A`` | ``merlin-c-[01-64]`` |
| merlin6 | login node | 100GB / SAS | ``/scratch`` | 50TB / GPFS | ``/shared-scratch`` | ``merlin-l-0[1,2]`` |
| merlin6 | computing node | 1.3TB / NVMe | ``/scratch`` | 50TB / GPFS | ``/shared-scratch`` | ``merlin-c-[001-024,101-124,201-224]`` |
| merlin6 | login node | 2.0TB / NVMe | ``/scratch`` | 50TB / GPFS | ``/shared-scratch`` | ``merlin-l-00[1,2]`` |
#### Scratch directories policy
* Read the **Important: Code of Conduct** documentation for more information about Merlin6 policies.
* By default, *always* use **local** scratch first and only use **shared** scratch if your specific use case requires it.
* Temporary files *must be deleted at the end of the job by the user* (see the sketch after this list).
* Remaining files will be deleted by the system if detected.
* Files not accessed within 28 days will be automatically cleaned up by the system.
* If for some reason the scratch areas get full, admins have the right to clean up the oldest data.
### Export directory
The export directory is exclusively intended for transferring data from outside PSI to Merlin and vice versa. It is a temporary directory with an auto-cleanup policy.
Please read **[Transferring Data](/merlin6/transfer-data.html)** for more information about it.
#### Export directory policy
* Temporary files *must be deleted at the end of the job by the user*.
* Remaining files will be deleted by the system if detected.
* Files not accessed within 28 days will be automatically cleaned up by the system.
* If for some reason the export area gets full, admins have the right to clean up the oldest data
---

View File

@@ -0,0 +1,170 @@
---
title: Transferring Data
#tags:
keywords: transferring data, data transfer, rsync, winscp, copy data, copying, sftp, import, export, hopx, vpn
last_updated: 24 August 2023
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/transfer-data.html
---
## Overview
Most methods allow data to be either transmitted or received, so it may make sense to
initiate the transfer from either Merlin or the other system, depending on the network
visibility.
- Merlin login nodes are visible from the PSI network, so direct data transfer
(rsync/WinSCP) is generally preferable. This can be initiated from either endpoint.
- Merlin login nodes can access the internet using a limited set of protocols
- SSH-based protocols using port 22 (rsync-over-ssh, sftp, WinSCP, etc)
- HTTP-based protocols using ports 80 or 443 (https, WebDav, etc)
- Protocols using other ports require admin configuration and may only work with
specific hosts (ftp, rsync daemons, etc)
- Systems on the internet can access the [PSI Data Transfer](https://www.psi.ch/en/photon-science-data-services/data-transfer) service
`datatransfer.psi.ch`, using ssh-based protocols and [Globus](https://www.globus.org/)
## Direct transfer via Merlin6 login nodes
The following methods transfer data directly via the [login
nodes](/merlin6/interactive.html#login-nodes-hardware-description). They are suitable
for use from within the PSI network.
### Rsync
Rsync is the preferred method to transfer data from Linux/MacOS. It allows
transfers to be easily resumed if they get interrupted. The general syntax is:
```
rsync -avAHXS <src> <dst>
```
For example, to transfer files from your local computer to a merlin project
directory:
```
rsync -avAHXS ~/localdata user@merlin-l-01.psi.ch:/data/project/general/myproject/
```
You can resume interrupted transfers by simply rerunning the command. Previously
transferred files will be skipped.
### WinSCP
The WinSCP tool can be used for remote file transfer on Windows. It is available
from the Software Kiosk on PSI machines. Add `merlin-l-01.psi.ch` as a host and
connect with your PSI credentials. You can then drag-and-drop files between your
local computer and merlin.
### SWITCHfilesender
**[SWITCHfilesender](https://filesender.switch.ch/filesender2/?s=upload)** is an installation of the FileSender project (filesender.org), a web-based application that allows authenticated users to securely and easily send arbitrarily large files to other users.
Authentication of users is provided through SimpleSAMLphp, supporting SAML2, LDAP, RADIUS and more. Users without an account can be sent an upload voucher by an authenticated user. FileSender is developed to the requirements of the higher education and research community.
The purpose of the software is to send a large file to someone, have that file available for download for a certain number of downloads and/or a certain amount of time, and after that automatically delete the file. The software is not intended as a permanent file publishing platform.
**[SWITCHfilesender](https://filesender.switch.ch/filesender2/?s=upload)** is fully integrated with PSI, therefore, PSI employees can log in by using their PSI account (through Authentication and Authorization Infrastructure / AAI, by selecting PSI as the institution to be used for log in).
## PSI Data Transfer
From August 2024, Merlin is connected to the **[PSI Data Transfer](https://www.psi.ch/en/photon-science-data-services/data-transfer)** service,
`datatransfer.psi.ch`. This is a central service managed by the **[Linux team](https://linux.psi.ch/index.html)**. However, any problems or questions related to it can be directly
[reported](/merlin6/contact.html) to the Merlin administrators, who will forward the request if necessary.
The PSI Data Transfer servers supports the following protocols:
* Data Transfer - SSH (scp / rsync)
* Data Transfer - Globus
Notice that `datatransfer.psi.ch` does not allow SSH login; only `rsync`, `scp` and [Globus](https://www.globus.org/) access are allowed.
The following filesystems are mounted:
* `/merlin/export` which points to the `/export` directory in Merlin.
* `/merlin/data/experiment/mu3e` which points to the `/data/experiment/mu3e` directories in Merlin.
* Mu3e sub-directories are mounted in RW (read-write), except for `data` (read-only mounted)
* `/merlin/data/project/general` which points to the `/data/project/general` directories in Merlin.
* Owners of Merlin projects should request explicit access to it.
* Currently, only `CSCS` is available for transferring files between PizDaint/Alps and Merlin
* `/merlin/data/project/bio` which points to the `/data/project/bio` directories in Merlin.
* `/merlin/data/user` which points to the `/data/user` directories in Merlin.
Access to the PSI Data Transfer uses ***Multi factor authentication*** (MFA).
Therefore, having the Microsoft Authenticator App is required as explained [here](https://www.psi.ch/en/computing/change-to-mfa).
{{site.data.alerts.tip}}Please follow the
<b><a href="https://www.psi.ch/en/photon-science-data-services/data-transfer">Official PSI Data Transfer</a></b> documentation for further instructions.
{{site.data.alerts.end}}
### Directories
#### /merlin/data/user
User data directories are mounted in RW.
{{site.data.alerts.warning}}Please <b>ensure properly secured permissions</b> on your '/data/user'
directory. By default, when the directory is created, the system applies the most restrictive
permissions. However, this does not prevent users from changing permissions if they wish. At that
point, users become responsible for those changes.
{{site.data.alerts.end}}
#### /merlin/export
Transferring large amounts of data from outside PSI to Merlin is always possible through `/export`.
{{site.data.alerts.tip}}<b>The '/export' directory can be used by any Merlin user.</b>
This is configured in Read/Write mode. If you need access, please contact the Merlin administrators.
{{site.data.alerts.end}}
{{site.data.alerts.warning}}The use of the <b>export</b> area as an extension of the quota <i>is forbidden</i>.
<br><b><i>Auto cleanup policies</i></b> in the <b>export</b> area apply for files older than 28 days.
{{site.data.alerts.end}}
##### Exporting data from Merlin
For exporting data from Merlin to outside PSI by using `/export`, one has to:
* From a Merlin login node, copy your data from any directory (e.g. `/data/project`, `/data/user`, `/scratch`) to
`/export`. Ensure that your directories and files are secured with proper permissions.
* Once data is copied, from **`datatransfer.psi.ch`**, copy the data from `/merlin/export` to outside PSI
##### Importing data to Merlin
For importing data from outside PSI to Merlin by using `/export`, one has to (see the sketch after these steps):
* From **`datatransfer.psi.ch`**, copy the data from outside PSI to `/merlin/export`.
Ensure that your directories and files are secured with proper permissions.
* Once data is copied, from a Merlin login node, copy your data from `/export` to any directory (e.g. `/data/project`, `/data/user`, `/scratch`).
#### Request access to your project directory
Optionally, instead of using `/export`, Merlin project owners can request Read/Write or Read/Only access to their project directory.
{{site.data.alerts.tip}}<b>Merlin projects can request direct access.</b>
This can be configured in Read/Write or Read/Only modes. If your project needs access, please contact the Merlin administrators.
{{site.data.alerts.end}}
## Connecting to Merlin6 from outside PSI
Merlin6 is fully accessible from within the PSI network. To connect from outside you can use:
- [VPN](https://www.psi.ch/en/computing/vpn) ([alternate instructions](https://intranet.psi.ch/BIO/ComputingVPN))
- [SSH hopx](https://www.psi.ch/en/computing/ssh-hop)
* Please avoid transferring large amounts of data through **hopx**
- [NoMachine](/merlin6/nomachine.html)
* Remote Interactive Access through [**'rem-acc.psi.ch'**](https://www.psi.ch/en/photon-science-data-services/remote-interactive-access)
* Please avoid transferring large amounts of data through **NoMachine**
## Connecting from Merlin6 to outside file shares
### `merlin_rmount` command
Merlin provides a command for mounting remote file systems, called `merlin_rmount`. It is a
convenient wrapper over the Gnome storage utilities and supports a wide range of remote protocols, including
- SMB/CIFS (Windows shared folders)
- WebDav
- AFP
- FTP, SFTP
- [others](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/using_the_desktop_environment_in_rhel_8/managing-storage-volumes-in-gnome_using-the-desktop-environment-in-rhel-8#gvfs-back-ends_managing-storage-volumes-in-gnome)
[More instructions on using `merlin_rmount`](/merlin6/merlin-rmount.html)

View File

@@ -0,0 +1,208 @@
---
title: Using PModules
#tags:
keywords: Pmodules, software, stable, unstable, deprecated, overlay, overlays, release stage, module, package, packages, library, libraries
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/using-modules.html
---
## Environment Modules
On top of the operating system stack we provide different software using the PSI-developed PModules system.
PModules is the officially supported way of providing software, and each package is deployed by a specific expert. PModules
typically contains software which is used by many people.
If you are missing a package/version, or a software package lacks a specific feature, contact us. We will study whether it is feasible to install it.
## Module release stages
Three different **release stages** are available in Pmodules, ensuring proper software life cycling. These are the following: **`unstable`**, **`stable`** and **`deprecated`**
### Unstable release stage
The **`unstable`** release stage contains *unstable* releases of software. Software compilations here are usually under development or are not fully production ready.
This release stage is **not directly visible** by the end users, and needs to be explicitly invoked as follows:
```bash
module use unstable
```
Once software is validated and considered production ready, this is moved to the `stable` release stage.
### Stable release stage
The **`stable`** release stage contains *stable* releases of software, which have been deeply tested and are fully supported.
This is the ***default*** release stage, and is visible by default. Whenever possible, users are strongly advised to use packages from this release stage.
### Deprecated release stage
The **`deprecated`** release stage contains *deprecated* releases of software. Software in this release stage is usually deprecated or discontinued by their developers.
Also, minor versions or redundant compilations are moved here as long as there is a valid copy in the *stable* repository.
This release stage is **not directly visible** by the users, and needs to be explicitly invoked as follows:
```bash
module use deprecated
```
However, software moved to this release stage can still be loaded directly without invoking the release stage first. This ensures proper life cycling of the software and makes it transparent for end users.
## Module overlays
Recent Pmodules releases contain a feature called **Pmodules overlays**. In Merlin, overlays are used to source software from a different location.
In that way, we can have custom private versions of software in the cluster installed on high performance storage accessed over a low latency network.
**Pmodules overlays** are still ***under development***, therefore be aware that *some features may not work, or may not work as expected*.
Pmodule overlays can be used from Pmodules `v1.1.5`. However, Merlin is running Pmodules `v1.0.0rc10` as the default version.
Therefore, one first needs to load a newer version of it: this is available in the repositories and can be loaded with the **`module load Pmodules/$version`** command.
Once running the proper Pmodules version, **overlays** are added (or invoked) with the **`module use $overlay_name`** command.
### overlay_merlin
Some Merlin software is already provided through **PModule overlays** and has been validated for using and running it in that way.
Therefore, Merlin contains an overlay called **`overlay_merlin`**. In this overlay, the software is installed in the Merlin high performance storage,
specifically in the ``/data/software/pmodules`` directory. In general, if another copy exists in the standard repository, we strongly recommend using
the replica in the `overlay_merlin` overlay instead, as it provides faster access and may also provide some customizations for the Merlin6 cluster.
For loading the `overlay_merlin`, please run:
```bash
module load Pmodules/1.1.6 # Or newer version
module use overlay_merlin
```
Once `overlay_merlin` is invoked, central software installations with the same version (if they exist) are hidden and replaced
by the local ones in Merlin. Releases from the central Pmodules repository which do not have a copy in the Merlin overlay remain
visible. For example, for each ANSYS release one can identify where it is installed by searching for ANSYS in PModules with the `--verbose`
option. This shows the location of the different ANSYS releases as follows:
* For ANSYS releases installed in the central repositories, the path starts with `/opt/psi`
* For ANSYS releases installed in the Merlin6 repository (and/or overriding the central ones), the path starts with `/data/software/pmodules`
```bash
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# module load Pmodules/1.1.6
module load: unstable module has been loaded -- Pmodules/1.1.6
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# module use overlay_merlin
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# module search ANSYS --verbose
Module Rel.stage Group Dependencies/Modulefile
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ANSYS/2019R3 stable Tools dependencies:
modulefile: /data/software/pmodules/Tools/modulefiles/ANSYS/2019R3
ANSYS/2020R1 stable Tools dependencies:
modulefile: /opt/psi/Tools/modulefiles/ANSYS/2020R1
ANSYS/2020R1-1 stable Tools dependencies:
modulefile: /opt/psi/Tools/modulefiles/ANSYS/2020R1-1
ANSYS/2020R2 stable Tools dependencies:
modulefile: /data/software/pmodules/Tools/modulefiles/ANSYS/2020R2
ANSYS/2021R1 stable Tools dependencies:
modulefile: /data/software/pmodules/Tools/modulefiles/ANSYS/2021R1
ANSYS/2021R2 stable Tools dependencies:
modulefile: /data/software/pmodules/Tools/modulefiles/ANSYS/2021R2
```
## PModules commands
Below is listed a summary of all available commands:
```bash
module use # show all available PModule Software Groups as well as Release Stages
module avail # to see the list of available software packages provided via pmodules
module use unstable # to get access to a set of packages not fully tested by the community
module load <package>/<version> # to load specific software package with a specific version
module search <string> # to search for a specific software package and its dependencies.
module list # to list which software is loaded in your environment
module purge # unload all loaded packages and cleanup the environment
```
### module use/unuse
Without any parameter, `use` **lists** all available PModule **Software Groups and Release Stages**.
```bash
module use
```
When followed by a parameter, `use`/`unuse` invokes/uninvokes a PModule **Software Group** or **Release Stage**.
```bash
module use EM # Invokes the 'EM' software group
module unuse EM # Uninvokes the 'EM' software group
module use unstable   # Invokes the 'unstable' release stage
module unuse unstable # Uninvokes the 'unstable' release stage
```
### module avail
This option **lists** all available PModule **Software Groups and their packages**.
Please run `module avail --help` for further listing options.
### module search
This is used to **search** for **software packages**. By default, if no **Release Stage** or **Software Group** is specified
in the options of the `module search` command, it will search within the already invoked *Software Groups* and *Release Stages*.
Direct package dependencies will also be shown.
```bash
(base) [caubet_m@merlin-l-001 caubet_m]$ module search openmpi/4.0.5_slurm
Module Release Group Requires
---------------------------------------------------------------------------
openmpi/4.0.5_slurm stable Compiler gcc/8.4.0
openmpi/4.0.5_slurm stable Compiler gcc/9.2.0
openmpi/4.0.5_slurm stable Compiler gcc/9.3.0
openmpi/4.0.5_slurm stable Compiler intel/20.4
(base) [caubet_m@merlin-l-001 caubet_m]$ module load intel/20.4 openmpi/4.0.5_slurm
```
Please run `module search --help` for further search options.
### module load/unload
This loads/unloads specific software packages. Packages might have direct dependencies that need to be loaded first. Other dependencies
will be automatically loaded.
In the example below, the ``openmpi/4.0.5_slurm`` package will be loaded, however ``gcc/9.3.0`` must be loaded as well since it is a strict dependency. Direct dependencies must be loaded in advance. Users can load multiple packages one by one or all at once; the latter can be useful, for instance, when loading a package together with its direct dependencies.
```bash
# Single line
module load gcc/9.3.0 openmpi/4.0.5_slurm
# Multiple line
module load gcc/9.3.0
module load openmpi/4.0.5_slurm
```
#### module purge
This command is an alternative to `module unload`, which can be used to unload **all** loaded module files.
```bash
module purge
```
## When to request for new PModules packages
### Missing software
If you do not find a specific software package and you know that other people are interested in it, it can be installed in PModules. Please contact us
and we will try to help with that. Deploying new software in PModules may take a few days.
Usually the installation of new software is possible as long as a few users will use it. If you are interested in maintaining this software,
please let us know.
### Missing version
If the existing PModules versions for a specific package do not fit your needs, it is possible to ask for a new version.
Usually the installation of newer versions will be supported, as long as a few users will use it. The installation of intermediate versions can
be supported if strictly justified.

View File

@@ -0,0 +1,235 @@
---
title: Running Interactive Jobs
#tags:
keywords: interactive, X11, X, srun, salloc, job, jobs, slurm, nomachine, nx
last_updated: 07 September 2022
summary: "This document describes how to run interactive jobs as well as X based software."
sidebar: merlin6_sidebar
permalink: /merlin6/interactive-jobs.html
---
## Running interactive jobs
There are two different ways of running interactive jobs in Slurm, using
the ``salloc`` and ``srun`` commands:
* **``salloc``**: to obtain a Slurm job allocation (a set of nodes), execute command(s), and then release the allocation when the command is finished.
* **``srun``**: is used for running parallel tasks.
### srun
``srun`` is used to run parallel jobs in the batch system. It can be used within a batch script
(run with ``sbatch``), or within a job allocation (obtained with ``salloc``).
It can also be used as a direct command (for example, from the login nodes).
When used inside a batch script or during a job allocation, ``srun`` is constrained to the
amount of resources allocated by the ``sbatch``/``salloc`` commands. In ``sbatch``, these
resources are usually defined inside the batch script with the format ``#SBATCH <option>=<value>``.
In other words, if you define 88 tasks (with 1 thread per core) and 2 nodes in your batch script
or allocation, ``srun`` is constrained to this amount of resources (you can use less, but never
exceed those limits).
When used from the login node, it usually runs a specific command or software in an
interactive way. ``srun`` is a blocking process (it blocks the bash prompt until the ``srun``
command finishes, unless you run it in the background with ``&``). This can be very useful for running
interactive software which pops up a window and then submits jobs or runs sub-tasks in the
background (for example, **Relion**, **cisTEM**, etc.)
Refer to ``man srun`` for exploring all possible options for that command.
<details>
<summary>[Show 'srun' example]: Running 'hostname' command on 3 nodes, using 2 cores (1 task/core) per node</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
(base) [caubet_m@merlin-l-001 ~]$ srun --clusters=merlin6 --ntasks=6 --ntasks-per-node=2 --nodes=3 hostname
srun: job 135088230 queued and waiting for resources
srun: job 135088230 has been allocated resources
merlin-c-102.psi.ch
merlin-c-102.psi.ch
merlin-c-101.psi.ch
merlin-c-101.psi.ch
merlin-c-103.psi.ch
merlin-c-103.psi.ch
</pre>
</details>
### salloc
**``salloc``** is used to obtain a Slurm job allocation (a set of nodes). Once the job is allocated,
users are able to execute interactive command(s). Once finished (``exit`` or ``Ctrl+D``),
the allocation is released. **``salloc``** is a blocking command, that is, the command blocks
until the requested resources are allocated.
When running **``salloc``**, once the resources are allocated, *by default* the user will get
a ***new shell on one of the allocated resources*** (if a user has requested several nodes, it will
prompt a new shell on the first allocated node). However, this behaviour can be changed by adding
a shell (`$SHELL`) at the end of the `salloc` command. For example:
```bash
# Typical 'salloc' call
# - Same as running:
# 'salloc --clusters=merlin6 -N 2 -n 2 srun -n1 -N1 --mem-per-cpu=0 --gres=gpu:0 --pty --preserve-env --mpi=none $SHELL'
salloc --clusters=merlin6 -N 2 -n 2
# Custom 'salloc' call
# - $SHELL will open a local shell on the login node from where ``salloc`` is running
salloc --clusters=merlin6 -N 2 -n 2 $SHELL
```
<details>
<summary>[Show 'salloc' example]: Allocating 2 cores (1 task/core) in 2 nodes (1 core/node) - <i>Default</i></summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
(base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 --ntasks=2 --nodes=2
salloc: Pending job allocation 135171306
salloc: job 135171306 queued and waiting for resources
salloc: job 135171306 has been allocated resources
salloc: Granted job allocation 135171306
(base) [caubet_m@merlin-c-213 ~]$ srun hostname
merlin-c-213.psi.ch
merlin-c-214.psi.ch
(base) [caubet_m@merlin-c-213 ~]$ exit
exit
salloc: Relinquishing job allocation 135171306
(base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 -N 2 -n 2 srun -n1 -N1 --mem-per-cpu=0 --gres=gpu:0 --pty --preserve-env --mpi=none $SHELL
salloc: Pending job allocation 135171342
salloc: job 135171342 queued and waiting for resources
salloc: job 135171342 has been allocated resources
salloc: Granted job allocation 135171342
(base) [caubet_m@merlin-c-021 ~]$ srun hostname
merlin-c-021.psi.ch
merlin-c-022.psi.ch
(base) [caubet_m@merlin-c-021 ~]$ exit
exit
salloc: Relinquishing job allocation 135171342
</pre>
</details>
<details>
<summary>[Show 'salloc' example]: Allocating 2 cores (1 task/core) in 2 nodes (1 core/node) - <i>$SHELL</i></summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
(base) [caubet_m@merlin-export-01 ~]$ salloc --clusters=merlin6 --ntasks=2 --nodes=2 $SHELL
salloc: Pending job allocation 135171308
salloc: job 135171308 queued and waiting for resources
salloc: job 135171308 has been allocated resources
salloc: Granted job allocation 135171308
(base) [caubet_m@merlin-export-01 ~]$ srun hostname
merlin-c-218.psi.ch
merlin-c-117.psi.ch
(base) [caubet_m@merlin-export-01 ~]$ exit
exit
salloc: Relinquishing job allocation 135171308
</pre>
</details>
## Running interactive jobs with X11 support
### Requirements
#### Graphical access
[NoMachine](/merlin6/nomachine.html) is the official supported service for graphical
access in the Merlin cluster. This service is running on the login nodes. Check the
document [{Accessing Merlin -> NoMachine}](/merlin6/nomachine.html) for details about
how to connect to the **NoMachine** service in the Merlin cluster.
For other non officially supported graphical access (X11 forwarding):
* For Linux clients, please follow [{How To Use Merlin -> Accessing from Linux Clients}](/merlin6/connect-from-linux.html)
* For Windows clients, please follow [{How To Use Merlin -> Accessing from Windows Clients}](/merlin6/connect-from-windows.html)
* For MacOS clients, please follow [{How To Use Merlin -> Accessing from MacOS Clients}](/merlin6/connect-from-macos.html)
### 'srun' with x11 support
The Merlin5 and Merlin6 clusters allow running X11-based (graphical) applications. For that, you need to
add the option ``--x11`` to the ``srun`` command. For example:
```bash
srun --clusters=merlin6 --x11 xclock
```
will pop up an X11-based clock.
In the same manner, you can create a bash shell with X11 support. To do that, you need
to add the option ``--pty`` to the ``srun --x11`` command. Once the resource is allocated,
you can interactively run X11 and non-X11 based commands from there.
```bash
srun --clusters=merlin6 --x11 --pty bash
```
<details>
<summary>[Show 'srun' with X11 support examples]</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
(base) [caubet_m@merlin-l-001 ~]$ srun --clusters=merlin6 --x11 xclock
srun: job 135095591 queued and waiting for resources
srun: job 135095591 has been allocated resources
(base) [caubet_m@merlin-l-001 ~]$
(base) [caubet_m@merlin-l-001 ~]$ srun --clusters=merlin6 --x11 --pty bash
srun: job 135095592 queued and waiting for resources
srun: job 135095592 has been allocated resources
(base) [caubet_m@merlin-c-205 ~]$ xclock
(base) [caubet_m@merlin-c-205 ~]$ echo "This was an example"
This was an example
(base) [caubet_m@merlin-c-205 ~]$ exit
exit
</pre>
</details>
### 'salloc' with x11 support
The **Merlin5** and **Merlin6** clusters allow running X11-based (graphical) applications. For that, you need to
add the option ``--x11`` to the ``salloc`` command. For example:
```bash
salloc --clusters=merlin6 --x11 xclock
```
will pop up an X11-based clock.
In the same manner, you can create a bash shell with X11 support. To do that,
just run ``salloc --clusters=merlin6 --x11``. Once the resource is allocated,
you can interactively run X11 and non-X11 based commands from there.
```bash
salloc --clusters=merlin6 --x11
```
<details>
<summary>[Show 'salloc' with X11 support examples]</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
(base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 --x11 xclock
salloc: Pending job allocation 135171355
salloc: job 135171355 queued and waiting for resources
salloc: job 135171355 has been allocated resources
salloc: Granted job allocation 135171355
salloc: Relinquishing job allocation 135171355
(base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 --x11
salloc: Pending job allocation 135171349
salloc: job 135171349 queued and waiting for resources
salloc: job 135171349 has been allocated resources
salloc: Granted job allocation 135171349
salloc: Waiting for resource configuration
salloc: Nodes merlin-c-117 are ready for job
(base) [caubet_m@merlin-c-117 ~]$ xclock
(base) [caubet_m@merlin-c-117 ~]$ echo "This was an example"
This was an example
(base) [caubet_m@merlin-c-117 ~]$ exit
exit
salloc: Relinquishing job allocation 135171349
</pre>
</details>

View File

@@ -0,0 +1,288 @@
---
title: Monitoring
#tags:
keywords: monitoring, jobs, slurm, job status, squeue, sinfo, sacct
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/monitoring.html
---
## Slurm Monitoring
### Job status
The status of submitted jobs can be checked with the ``squeue`` command:
```bash
squeue -u $username
```
Common statuses:
* **merlin-\***: Running on the specified host
* **(Priority)**: Waiting in the queue
* **(Resources)**: At the head of the queue, waiting for machines to become available
* **(AssocGrpCpuLimit), (AssocGrpNodeLimit)**: Job would exceed per-user limitations on
the number of simultaneous CPUs/Nodes. Use `scancel` to remove the job and
resubmit with fewer resources, or else wait for your other jobs to finish.
* **(PartitionNodeLimit)**: Exceeds all resources available on this partition.
Run `scancel` and resubmit to a different partition (`-p`) or with fewer
resources.
Check in the **man** pages (``man squeue``) for all possible options for this command.
<details>
<summary>[Show 'squeue' example]</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
[root@merlin-slurmctld01 ~]# squeue -u feichtinger
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
134332544 general spawner- feichtin R 5-06:47:45 1 merlin-c-204
134321376 general subm-tal feichtin R 5-22:27:59 1 merlin-c-204
</pre>
</details>
### Partition status
The status of the nodes and partitions (a.k.a. queues) can be seen with the ``sinfo`` command:
```bash
sinfo
```
Check in the **man** pages (``man sinfo``) for all possible options for this command.
<details>
<summary>[Show 'sinfo' example]</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
[root@merlin-l-001 ~]# sinfo -l
Thu Jan 23 16:34:49 2020
PARTITION AVAIL TIMELIMIT JOB_SIZE ROOT OVERSUBS GROUPS NODES STATE NODELIST
test up 1-00:00:00 1-infinite no NO all 3 mixed merlin-c-[024,223-224]
test up 1-00:00:00 1-infinite no NO all 2 allocated merlin-c-[123-124]
test up 1-00:00:00 1-infinite no NO all 1 idle merlin-c-023
general* up 7-00:00:00 1-50 no NO all 6 mixed merlin-c-[007,204,207-209,219]
general* up 7-00:00:00 1-50 no NO all 57 allocated merlin-c-[001-005,008-020,101-122,201-203,205-206,210-218,220-222]
general* up 7-00:00:00 1-50 no NO all 3 idle merlin-c-[006,021-022]
daily up 1-00:00:00 1-60 no NO all 9 mixed merlin-c-[007,024,204,207-209,219,223-224]
daily up 1-00:00:00 1-60 no NO all 59 allocated merlin-c-[001-005,008-020,101-124,201-203,205-206,210-218,220-222]
daily up 1-00:00:00 1-60 no NO all 4 idle merlin-c-[006,021-023]
hourly up 1:00:00 1-infinite no NO all 9 mixed merlin-c-[007,024,204,207-209,219,223-224]
hourly up 1:00:00 1-infinite no NO all 59 allocated merlin-c-[001-005,008-020,101-124,201-203,205-206,210-218,220-222]
hourly up 1:00:00 1-infinite no NO all 4 idle merlin-c-[006,021-023]
gpu up 7-00:00:00 1-infinite no NO all 1 mixed merlin-g-007
gpu up 7-00:00:00 1-infinite no NO all 8 allocated merlin-g-[001-006,008-009]
</pre>
</details>
### Slurm commander
The **[Slurm Commander (scom)](https://github.com/CLIP-HPC/SlurmCommander/)** is a simple but very useful open-source text-based user interface for
efficient interaction with Slurm. It is developed by the **CLoud Infrastructure Project (CLIP-HPC)** with external contributions. To use it, one can
simply run the following command:
```bash
scom # merlin6 cluster
SLURM_CLUSTERS=merlin5 scom # merlin5 cluster
SLURM_CLUSTERS=gmerlin6 scom # gmerlin6 cluster
scom -h # Help and extra options
scom -d 14 # Set Job History to 14 days (instead of default 7)
```
With this simple interface, users can interact with their jobs, as well as getting information about past and present jobs:
* Filtering jobs by substring is possible with the `/` key.
* Users can perform multiple actions on their jobs (such as cancelling, holding or requeueing a job), SSH to a node with an already running job,
or get extended details and statistics of the job itself.
Also, users can check the status of the cluster, to get statistics and node usage information as well as getting information about node properties.
The interface also provides a few job templates for different use cases (e.g. MPI, OpenMP, hybrid, single core). Users can modify these templates,
save them locally to the current directory, and submit the job to the cluster.
{{site.data.alerts.note}}Currently, <span style="color:darkblue;">scom</span> does not provide live updated information for the <span style="color:darkorange;">[Job History]</span> tab.
To update Job History information, users have to exit the application with the <span style="color:darkorange;">q</span> key. Other tabs will be updated every 5 seconds (default).
On the other hand, the <span style="color:darkorange;">[Job History]</span> tab contains information for the <b>merlin6</b> CPU cluster only. Future updates will provide information
for other clusters.
{{site.data.alerts.end}}
For further information about how to use **scom**, please refer to the **[Slurm Commander Project webpage](https://github.com/CLIP-HPC/SlurmCommander/)**
!['scom' text-based user interface]({{ "/images/Slurm/scom.gif" }})
### Job accounting
Users can check detailed information of jobs (pending, running, completed, failed, etc.) with the `sacct` command.
This command is very flexible and can provide a lot of information. For checking all the available options, please read `man sacct`.
Below, we summarize some examples that can be useful for the users:
```bash
# Today jobs, basic summary
sacct
# Today jobs, with details
sacct --long
# Jobs since January 1, 2021, 12pm, with details
sacct -S 2021-01-01T12:00:00 --long
# Specific job accounting
sacct --long -j $jobid
# Jobs custom details, without steps (-X)
sacct -X --format=User%20,JobID,Jobname,partition,state,time,submit,start,end,elapsed,AveRss,MaxRss,MaxRSSTask,MaxRSSNode%20,MaxVMSize,nnodes,ncpus,ntasks,reqcpus,totalcpu,reqmem,cluster,TimeLimit,TimeLimitRaw,cputime,nodelist%50,AllocTRES%80
# Jobs custom details, with steps
sacct --format=User%20,JobID,Jobname,partition,state,time,submit,start,end,elapsed,AveRss,MaxRss,MaxRSSTask,MaxRSSNode%20,MaxVMSize,nnodes,ncpus,ntasks,reqcpus,totalcpu,reqmem,cluster,TimeLimit,TimeLimitRaw,cputime,nodelist%50,AllocTRES%80
```
### Job efficiency
Users can check how efficient their jobs are. For that, the ``seff`` command is available.
```bash
seff $jobid
```
<details>
<summary>[Show 'seff' example]</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
[root@merlin-slurmctld01 ~]# seff 134333893
Job ID: 134333893
Cluster: merlin6
User/Group: albajacas_a/unx-sls
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 8
CPU Utilized: 00:26:15
CPU Efficiency: 49.47% of 00:53:04 core-walltime
Job Wall-clock time: 00:06:38
Memory Utilized: 60.73 MB
Memory Efficiency: 0.19% of 31.25 GB
</pre>
</details>
### List job attributes
The ``sjstat`` command is used to display statistics of jobs under control of Slurm. To use it:
```bash
sjstat
```
<details>
<summary>[Show 'sjstat' example]</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
[root@merlin-l-001 ~]# sjstat -v
Scheduling pool data:
----------------------------------------------------------------------------------
Total Usable Free Node Time Other
Pool Memory Cpus Nodes Nodes Nodes Limit Limit traits
----------------------------------------------------------------------------------
test 373502Mb 88 6 6 1 UNLIM 1-00:00:00
general* 373502Mb 88 66 66 8 50 7-00:00:00
daily 373502Mb 88 72 72 9 60 1-00:00:00
hourly 373502Mb 88 72 72 9 UNLIM 01:00:00
gpu 128000Mb 8 1 1 0 UNLIM 7-00:00:00
gpu 128000Mb 20 8 8 0 UNLIM 7-00:00:00
Running job data:
---------------------------------------------------------------------------------------------------
Time Time Time
JobID User Procs Pool Status Used Limit Started Master/Other
---------------------------------------------------------------------------------------------------
13433377 collu_g 1 gpu PD 0:00 24:00:00 N/A (Resources)
13433389 collu_g 20 gpu PD 0:00 24:00:00 N/A (Resources)
13433382 jaervine 4 gpu PD 0:00 24:00:00 N/A (Priority)
13433386 barret_d 20 gpu PD 0:00 24:00:00 N/A (Priority)
13433382 pamula_f 20 gpu PD 0:00 168:00:00 N/A (Priority)
13433387 pamula_f 4 gpu PD 0:00 24:00:00 N/A (Priority)
13433365 andreani 132 daily PD 0:00 24:00:00 N/A (Dependency)
13433388 marino_j 6 gpu R 1:43:12 168:00:00 01-23T14:54:57 merlin-g-007
13433377 choi_s 40 gpu R 2:09:55 48:00:00 01-23T14:28:14 merlin-g-006
13433373 qi_c 20 gpu R 7:00:04 24:00:00 01-23T09:38:05 merlin-g-004
13433390 jaervine 2 gpu R 5:18 24:00:00 01-23T16:32:51 merlin-g-007
13433390 jaervine 2 gpu R 15:18 24:00:00 01-23T16:22:51 merlin-g-007
13433375 bellotti 4 gpu R 7:35:44 9:00:00 01-23T09:02:25 merlin-g-001
13433358 bellotti 1 gpu R 1-05:52:19 144:00:00 01-22T10:45:50 merlin-g-007
13433377 lavriha_ 20 gpu R 5:13:24 24:00:00 01-23T11:24:45 merlin-g-008
13433370 lavriha_ 40 gpu R 22:43:09 24:00:00 01-22T17:55:00 merlin-g-003
13433373 qi_c 20 gpu R 15:03:15 24:00:00 01-23T01:34:54 merlin-g-002
13433371 qi_c 4 gpu R 22:14:14 168:00:00 01-22T18:23:55 merlin-g-001
13433254 feichtin 2 general R 5-07:26:11 156:00:00 01-18T09:11:58 merlin-c-204
13432137 feichtin 2 general R 5-23:06:25 160:00:00 01-17T17:31:44 merlin-c-204
13433389 albajaca 32 hourly R 41:19 1:00:00 01-23T15:56:50 merlin-c-219
13433387 riemann_ 2 general R 1:51:47 4:00:00 01-23T14:46:22 merlin-c-204
13433370 jimenez_ 2 general R 23:20:45 168:00:00 01-22T17:17:24 merlin-c-106
13433381 jimenez_ 2 general R 4:55:33 168:00:00 01-23T11:42:36 merlin-c-219
13433390 sayed_m 128 daily R 21:49 10:00:00 01-23T16:16:20 merlin-c-223
13433359 adelmann 2 general R 1-05:00:09 48:00:00 01-22T11:38:00 merlin-c-204
13433377 zimmerma 2 daily R 6:13:38 24:00:00 01-23T10:24:31 merlin-c-007
13433375 zohdirad 24 daily R 7:33:16 10:00:00 01-23T09:04:53 merlin-c-218
13433363 zimmerma 6 general R 1-02:54:20 47:50:00 01-22T13:43:49 merlin-c-106
13433376 zimmerma 6 general R 7:25:42 23:50:00 01-23T09:12:27 merlin-c-007
13433371 vazquez_ 16 daily R 21:46:31 23:59:00 01-22T18:51:38 merlin-c-106
13433382 vazquez_ 16 daily R 4:09:23 23:59:00 01-23T12:28:46 merlin-c-024
13433376 jiang_j1 440 daily R 7:11:14 10:00:00 01-23T09:26:55 merlin-c-123
13433376 jiang_j1 24 daily R 7:08:19 10:00:00 01-23T09:29:50 merlin-c-220
13433384 kranjcev 440 daily R 2:48:19 24:00:00 01-23T13:49:50 merlin-c-108
13433371 vazquez_ 16 general R 20:15:15 120:00:00 01-22T20:22:54 merlin-c-210
13433371 vazquez_ 16 general R 21:15:51 120:00:00 01-22T19:22:18 merlin-c-210
13433374 colonna_ 176 daily R 8:23:18 24:00:00 01-23T08:14:51 merlin-c-211
13433374 bures_l 88 daily R 10:45:06 24:00:00 01-23T05:53:03 merlin-c-001
13433375 derlet 88 daily R 7:32:05 24:00:00 01-23T09:06:04 merlin-c-107
13433373 derlet 88 daily R 17:21:57 24:00:00 01-22T23:16:12 merlin-c-002
13433373 derlet 88 daily R 18:13:05 24:00:00 01-22T22:25:04 merlin-c-112
13433365 andreani 264 daily R 4:10:08 24:00:00 01-23T12:28:01 merlin-c-003
13431187 mahrous_ 88 general R 6-15:59:16 168:00:00 01-17T00:38:53 merlin-c-111
13433387 kranjcev 2 general R 1:48:47 4:00:00 01-23T14:49:22 merlin-c-204
13433368 karalis_ 352 general R 1-00:05:22 96:00:00 01-22T16:32:47 merlin-c-013
13433367 karalis_ 352 general R 1-00:06:44 96:00:00 01-22T16:31:25 merlin-c-118
13433385 karalis_ 352 general R 1:37:24 96:00:00 01-23T15:00:45 merlin-c-213
13433374 sato 256 general R 14:55:55 24:00:00 01-23T01:42:14 merlin-c-204
13433374 sato 64 general R 10:43:35 24:00:00 01-23T05:54:34 merlin-c-106
67723568 sato 32 general R 10:40:07 24:00:00 01-23T05:58:02 merlin-c-007
13433265 khanppna 440 general R 3-18:20:58 168:00:00 01-19T22:17:11 merlin-c-008
13433375 khanppna 704 general R 7:31:24 24:00:00 01-23T09:06:45 merlin-c-101
13433371 khanppna 616 general R 21:40:33 24:00:00 01-22T18:57:36 merlin-c-208
</pre>
</details>
### Graphical user interface
When using **ssh** with X11 forwarding (``ssh -XY``), or when using NoMachine, users can use ``sview``.
**SView** is a graphical user interface to view and modify Slurm states. To run **sview**:
```bash
ssh -XY $username@merlin-l-001.psi.ch # Not necessary when using NoMachine
sview
```
!['sview' graphical user interface]({{ "/images/Slurm/sview.png" }})
## General Monitoring
The following pages contain basic monitoring for Slurm and the computing nodes.
Currently, monitoring is based on Grafana + InfluxDB. In the future it will
be moved to a different service based on ElasticSearch + LogStash + Kibana.
In the meantime, the following monitoring pages are available on a best-effort
basis:
### Merlin6 Monitoring Pages
* Slurm monitoring:
* ***[Merlin6 Slurm Statistics - XDMOD](https://merlin-slurmmon01.psi.ch/)***
* [Merlin6 Slurm Live Status](https://hpc-monitor02.psi.ch/d/QNcbW1AZk/merlin6-slurm-live-status?orgId=1&refresh=10s)
* [Merlin6 Slurm Overview](https://hpc-monitor02.psi.ch/d/94UxWJ0Zz/merlin6-slurm-overview?orgId=1&refresh=10s)
* Nodes monitoring:
* [Merlin6 CPU Nodes Overview](https://hpc-monitor02.psi.ch/d/JmvLR8gZz/merlin6-computing-cpu-nodes?orgId=1&refresh=10s)
* [Merlin6 GPU Nodes Overview](https://hpc-monitor02.psi.ch/d/gOo1Z10Wk/merlin6-computing-gpu-nodes?orgId=1&refresh=10s)
### Merlin5 Monitoring Pages
* Slurm monitoring:
* [Merlin5 Slurm Live Status](https://hpc-monitor02.psi.ch/d/o8msZJ0Zz/merlin5-slurm-live-status?orgId=1&refresh=10s)
* [Merlin5 Slurm Overview](https://hpc-monitor02.psi.ch/d/eWLEW1AWz/merlin5-slurm-overview?orgId=1&refresh=10s)
* Nodes monitoring:
* [Merlin5 CPU Nodes Overview](https://hpc-monitor02.psi.ch/d/ejTyWJAWk/merlin5-computing-cpu-nodes?orgId=1&refresh=10s)

View File

@@ -0,0 +1,284 @@
---
title: Running Slurm Scripts
#tags:
keywords: batch script, slurm, sbatch, srun, jobs, job, submit, submission, array jobs, array, squeue, sinfo, scancel, packed jobs, short jobs, very short jobs, multithread, rules, no-multithread, HT
last_updated: 07 September 2022
summary: "This document describes how to run batch scripts in Slurm."
sidebar: merlin6_sidebar
permalink: /merlin6/running-jobs.html
---
## The rules
Before starting to use the cluster, please read the following rules:
1. To ease and improve *scheduling* and *backfilling*, always try to **estimate** and **define a proper run time** for your jobs:
* Use ``--time=<D-HH:MM:SS>`` for that.
* For very long runs, please consider using ***[Job Arrays with Checkpointing](/merlin6/running-jobs.html#array-jobs-running-very-long-tasks-with-checkpoint-files)***
2. Try to optimize your jobs to run within at most **one day**. Please consider the following:
* Some software can simply scale up by using more nodes while drastically reducing the run time.
* Some software allows saving a specific state, from which a second job can continue: ***[Job Arrays with Checkpointing](/merlin6/running-jobs.html#array-jobs-running-very-long-tasks-with-checkpoint-files)*** can help you with that.
* Jobs submitted to **`hourly`** get more priority than jobs submitted to **`daily`**: always use **`hourly`** for jobs shorter than 1 hour.
* Jobs submitted to **`daily`** get more priority than jobs submitted to **`general`**: always use **`daily`** for jobs shorter than 1 day.
3. It is **forbidden** to run **very short jobs**: they cause a lot of overhead and can also cause severe problems for the main scheduler.
* ***Question:*** Is my job a very short job? ***Answer:*** If it lasts only a few seconds or a very few minutes, yes.
* ***Question:*** How long should my job run? ***Answer:*** As a rule of thumb, from 5 minutes on is acceptable; from 15 minutes on is preferred.
* Use ***[Packed Jobs](/merlin6/running-jobs.html#packed-jobs-running-a-large-number-of-short-tasks)*** for running a large number of short tasks.
4. Do not submit hundreds of similar jobs!
* Use ***[Array Jobs](/merlin6/running-jobs.html#array-jobs-launching-a-large-number-of-related-jobs)*** for gathering jobs instead.
{{site.data.alerts.tip}}Having a good estimation of the <i>time</i> needed by your jobs, a proper way of running them, and optimizing the jobs to <i>run within one day</i> will help to keep the system fairly and efficiently used.
{{site.data.alerts.end}}
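For illustration, a job known to finish well within an hour would apply rules 1 and 2 like this (the time value below is a placeholder, not a recommendation):
```bash
#SBATCH --partition=hourly    # shorter than 1 hour: use the highest-priority partition
#SBATCH --time=00:45:00       # realistic estimate of the run time
```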
## Basic commands for running batch scripts
* Use **``sbatch``** for submitting a batch script to Slurm.
* Use **``srun``** for running parallel tasks.
* Use **``squeue``** for checking jobs status.
* Use **``scancel``** for cancelling/deleting a job from the queue.
{{site.data.alerts.tip}}Use Linux <b>'man'</b> pages when needed (i.e. <span style="color:orange;">'man sbatch'</span>), mostly for checking the available options for the above commands.
{{site.data.alerts.end}}
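A typical submit/check/cancel cycle with these commands looks like this (`myscript.sh` and the job ID are placeholders):
```bash
sbatch myscript.sh    # prints: Submitted batch job 12345678
squeue -u $USER       # check the state of your jobs
scancel 12345678      # cancel the job if no longer needed
```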
## Basic settings
For a complete list of available options and parameters, it is recommended to use the **man pages** (i.e. ``man sbatch``, ``man srun``, ``man salloc``).
Please notice that the behaviour of some parameters might change depending on the command used for running jobs (for example, the ``--exclusive`` behaviour in ``sbatch`` differs from ``srun``).
In this chapter we show the basic parameters which are usually needed in the Merlin cluster.
### Common settings
The following settings are the minimum required for running a job on the Merlin CPU and GPU nodes. Please consider taking a look at the **man pages** (i.e. `man sbatch`, `man salloc`, `man srun`) for more information about all possible options. Also, do not hesitate to contact us with any questions.
* **Clusters:** for running jobs in the different Slurm clusters, users should add the following option:
```bash
#SBATCH --clusters=<cluster_name> # Possible values: merlin5, merlin6, gmerlin6
```
Refer to the documentation of each cluster ([**`merlin6`**](/merlin6/slurm-configuration.html), [**`gmerlin6`**](/gmerlin6/slurm-configuration.html), [**`merlin5`**](/merlin5/slurm-configuration.html)) for further information.
* **Partitions:** except when using the *default* partition for each cluster, one needs to specify the partition:
```bash
#SBATCH --partition=<partition_name> # Check each cluster documentation for possible values
```
Refer to the documentation of each cluster ([**`merlin6`**](/merlin6/slurm-configuration.html), [**`gmerlin6`**](/gmerlin6/slurm-configuration.html), [**`merlin5`**](/merlin5/slurm-configuration.html)) for further information.
* **[Optional] Disabling shared nodes**: by default, nodes are not exclusive. Hence, multiple users can run in the same node. One can request exclusive node usage with the following option:
```bash
#SBATCH --exclusive # Only if you want a dedicated node
```
* **Time**: it is important to define realistically how long a job should run. This will help Slurm with *scheduling* and *backfilling*, and will let Slurm manage job queues in a more efficient way. This value can never exceed the `MaxTime` of the affected partition.
```bash
#SBATCH --time=<D-HH:MM:SS> # Can not exceed the partition `MaxTime`
```
Refer to the documentation of each cluster ([**`merlin6`**](/merlin6/slurm-configuration.html), [**`gmerlin6`**](/gmerlin6/slurm-configuration.html), [**`merlin5`**](/merlin5/slurm-configuration.html)) for further information about partition `MaxTime` values.
* **Output and error files**: by default, Slurm will generate a standard output file (``slurm-%j.out``, where `%j` is the job ID) and a standard error file (``slurm-%j.err``) in the directory from where the job was submitted. Users can change the default names with the following options:
```bash
#SBATCH --output=<filename> # Can include path. Patterns accepted (i.e. %j)
#SBATCH --error=<filename> # Can include path. Patterns accepted (i.e. %j)
```
Use **man sbatch** (``man sbatch | grep -A36 '^filename pattern'``) for the full specification of **filename patterns**; a short example is shown after this list.
* **Enable/Disable Hyper-Threading**: whether a node has Hyper-Threading depends on the node configuration. By default, HT nodes have HT enabled, and one should specify the desired behaviour with one of the following options:
```bash
#SBATCH --hint=multithread # Use extra threads with in-core multi-threading.
#SBATCH --hint=nomultithread # Don't use extra threads with in-core multi-threading.
```
Refer to the documentation of each cluster ([**`merlin6`**](/merlin6/slurm-configuration.html), [**`gmerlin6`**](/gmerlin6/slurm-configuration.html), [**`merlin5`**](/merlin5/slurm-configuration.html)) for further information about node configuration and Hyper-Threading.
Consider that, depending on your job requirements, you might also need to set `--ntasks-per-core` or `--cpus-per-task` (among other options) in addition to `--hint`. Please contact us in case of doubts.
{{site.data.alerts.tip}} In general, for the cluster `merlin6`, <span style="color:orange;"><b>--hint=[no]multithread</b></span> is a recommended option. On the other hand, <span style="color:orange;"><b>--ntasks-per-core</b></span> is only needed when
one has to define how a task should be handled within a core, and it is generally not used on Hybrid MPI/OpenMP jobs where multiple cores are needed for single tasks.
{{site.data.alerts.end}}
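As an example of the **filename patterns** mentioned above, the following names the output files after the job name (`%x`) and job ID (`%j`); the `logs/` directory is a placeholder and must exist before submission, since Slurm will not create it:
```bash
#SBATCH --output=logs/%x-%j.out   # e.g. logs/myjob-12345678.out
#SBATCH --error=logs/%x-%j.err
```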
## Batch script templates
### CPU-based jobs templates
The following examples apply to the **Merlin6** cluster.
#### Non-multithreaded jobs template
The following template should be used by users submitting non-multithreaded jobs to the Merlin6 CPU nodes:
```bash
#!/bin/bash
#SBATCH --cluster=merlin6 # Cluster name
#SBATCH --partition=general,daily,hourly # Specify one or multiple partitions
#SBATCH --time=<D-HH:MM:SS> # Strongly recommended
#SBATCH --output=<output_file> # Generate custom output file
#SBATCH --error=<error_file> # Generate custom error file
#SBATCH --hint=nomultithread # Mandatory for non-multithreaded jobs
##SBATCH --exclusive # Uncomment if you need exclusive node usage
##SBATCH --ntasks-per-core=1 # Only mandatory for non-multithreaded single tasks
## Advanced options example
##SBATCH --nodes=1 # Uncomment and specify #nodes to use
##SBATCH --ntasks=44 # Uncomment and specify #tasks to use
##SBATCH --ntasks-per-node=44 # Uncomment and specify #tasks per node
##SBATCH --cpus-per-task=44 # Uncomment and specify the number of cores per task
```
#### Multithreaded jobs template
The following template should be used by users submitting multithreaded jobs to the Merlin6 CPU nodes:
```bash
#!/bin/bash
#SBATCH --cluster=merlin6 # Cluster name
#SBATCH --partition=general,daily,hourly # Specify one or multiple partitions
#SBATCH --time=<D-HH:MM:SS> # Strongly recommended
#SBATCH --output=<output_file> # Generate custom output file
#SBATCH --error=<error_file> # Generate custom error file
#SBATCH --hint=multithread # Mandatory for multithreaded jobs
##SBATCH --exclusive # Uncomment if you need exclusive node usage
##SBATCH --ntasks-per-core=2 # Only mandatory for multithreaded single tasks
## Advanced options example
##SBATCH --nodes=1 # Uncomment and specify #nodes to use
##SBATCH --ntasks=88 # Uncomment and specify #tasks to use
##SBATCH --ntasks-per-node=88 # Uncomment and specify #tasks per node
##SBATCH --cpus-per-task=88 # Uncomment and specify the number of cores per task
```
### GPU-based jobs templates
The following template should be used by any user submitting jobs to GPU nodes:
```bash
#!/bin/bash
#SBATCH --cluster=gmerlin6 # Cluster name
#SBATCH --partition=gpu,gpu-short # Specify one or multiple partitions, or
#SBATCH --partition=gwendolen,gwendolen-long # Only for Gwendolen users
#SBATCH --gpus="<type>:<num_gpus>" # <type> is optional, <num_gpus> is mandatory
#SBATCH --time=<D-HH:MM:SS> # Strongly recommended
#SBATCH --output=<output_file> # Generate custom output file
#SBATCH --error=<error_file> # Generate custom error file
##SBATCH --exclusive # Uncomment if you need exclusive node usage
## Advanced options example
##SBATCH --nodes=1 # Uncomment and specify number of nodes to use
##SBATCH --ntasks=1 # Uncomment and specify number of tasks to use
##SBATCH --cpus-per-gpu=5 # Uncomment and specify the number of CPUs per GPU
##SBATCH --mem-per-gpu=16000 # Uncomment and specify the memory (in MB) per GPU
##SBATCH --gpus-per-node=<type>:2 # Uncomment and specify the number of GPUs per node
##SBATCH --gpus-per-socket=<type>:2 # Uncomment and specify the number of GPUs per socket
##SBATCH --gpus-per-task=<type>:1 # Uncomment and specify the number of GPUs per task
```
## Advanced configurations
### Array Jobs: launching a large number of related jobs
If you need to run a large number of jobs based on the same executable with systematically varying inputs,
e.g. for a parameter sweep, you can do this most easily in form of a **simple array job**.
``` bash
#!/bin/bash
#SBATCH --job-name=test-array
#SBATCH --partition=daily
#SBATCH --ntasks=1
#SBATCH --time=08:00:00
#SBATCH --array=1-8
echo $(date) "I am job number ${SLURM_ARRAY_TASK_ID}"
srun myprogram config-file-${SLURM_ARRAY_TASK_ID}.dat
```
This will run 8 independent jobs, where each job can use the counter
variable `SLURM_ARRAY_TASK_ID` defined by Slurm inside of the job's
environment to feed the correct input arguments or configuration file
to the "myprogram" executable. Each job will receive the same set of
configurations (e.g. time limit of 8h in the example above).
The jobs are independent, but they will run in parallel (if the cluster resources allow for
it). The jobs will get JobIDs like {some-number}_0 to {some-number}_7, and they also will each
have their own output file.
**Note:**
* Do not use such jobs if you have very short tasks, since each array sub job will incur the full overhead for launching an independent Slurm job. For such cases you should use a **packed job** (see below).
* If you want to control how many of these jobs can run in parallel, you can use the `#SBATCH --array=1-100%5` syntax. The `%5` will define
that only 5 sub jobs may ever run in parallel.
You also can use an array job approach to run over all files in a directory, substituting the payload with
``` bash
FILES=(/path/to/data/*)
srun ./myprogram ${FILES[$SLURM_ARRAY_TASK_ID]}
```
Or for a trivial case you could supply the values for a parameter scan in form
of an argument list that gets fed to the program using the counter variable.
``` bash
ARGS=(0.05 0.25 0.5 1 2 5 100)
srun ./my_program.exe ${ARGS[$SLURM_ARRAY_TASK_ID]}
```
### Array jobs: running very long tasks with checkpoint files
If you need to run a job for much longer than the queues (partitions) permit, and
your executable is able to create checkpoint files, you can use this
strategy:
``` bash
#!/bin/bash
#SBATCH --job-name=test-checkpoint
#SBATCH --partition=general
#SBATCH --ntasks=1
#SBATCH --time=7-00:00:00 # each job can run for 7 days
#SBATCH --cpus-per-task=1
#SBATCH --array=1-10%1 # Run a 10-job array, one job at a time.
if test -e checkpointfile; then
# There is a checkpoint file;
myprogram --read-checkp checkpointfile
else
# There is no checkpoint file, start a new simulation.
myprogram
fi
```
The `%1` in the `#SBATCH --array=1-10%1` statement defines that only 1 subjob can ever run in parallel, so
this will result in subjob n+1 only being started when job n has finished. It will read the checkpoint file
if it is present.
### Packed jobs: running a large number of short tasks
Since the launching of a Slurm job incurs some overhead, you should not submit each short task as a separate
Slurm job. Use job packing, i.e. you run the short tasks within the loop of a single Slurm job.
You can launch the short tasks using `srun` with the `--exclusive` switch (not to be confused with the
switch of the same name used in the SBATCH commands). This switch will ensure that only a specified
number of tasks can run in parallel.
As an example, the following job submission script will ask Slurm for
44 cores (threads), then it will run the `myprog` program 1000 times with
arguments passed from 1 to 1000. With the `-N1 -n1 -c1 --exclusive`
option, it ensures that at any point in time only 44
instances are effectively running, each being allocated one CPU. You
can decide at this point to allocate several CPUs or tasks by adapting
the corresponding parameters.
``` bash
#! /bin/bash
#SBATCH --job-name=test-packed
#SBATCH --partition=general
#SBATCH --time=7-00:00:00
#SBATCH --ntasks=44              # defines the number of parallel tasks
for i in {1..1000}
do
srun -N1 -n1 -c1 --exclusive ./myprog $i &
done
wait
```
**Note:** The `&` at the end of the `srun` line is needed to not have the script waiting (blocking).
The `wait` command waits for all such background tasks to finish and returns the exit code.

View File

@@ -0,0 +1,63 @@
---
title: Slurm Basic Commands
#tags:
keywords: sinfo, squeue, sbatch, srun, salloc, scancel, sview, seff, sjstat, sacct, basic commands, slurm commands, cluster
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/slurm-basics.html
---
This document shows some basic commands for using Slurm. Advanced examples for some of them
are explained in other Merlin6 Slurm pages. You can always use ```man <command>``` for more
information about options and examples.
## Basic commands
Useful Slurm commands:
```bash
sinfo # to see the name of nodes, their occupancy,
# name of slurm partitions, limits (try out with "-l" option)
squeue # to see the currently running/waiting jobs in slurm
# (additional "-l" option may also be useful)
sbatch Script.sh # to submit a script (example below) to the slurm.
srun <command> # to submit a command to Slurm. Same options as in 'sbatch' can be used.
salloc # to allocate computing nodes. Use for interactive runs.
scancel job_id # to cancel slurm job, job id is the numeric id, seen by the squeue.
sview # X interface for managing jobs and track job run information.
seff # Calculates the efficiency of a job
sjstat # List attributes of jobs under the SLURM control
sacct # Show job accounting, useful for checking details of finished jobs.
```
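For example, a short interactive session with `salloc` could look like this (partition, task count and time are placeholder values):
```bash
salloc --partition=hourly --ntasks=4 --time=00:30:00  # request an interactive allocation
srun hostname                                         # runs on the allocated resources
exit                                                  # release the allocation
```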
---
## Advanced basic commands:
```bash
sinfo -N -l # list nodes, state, resources (#CPUs, memory per node, ...), etc.
sshare -a # to list shares of associations to a cluster
sprio -l # to view the factors that comprise a job's scheduling priority
# add '-u <username>' for filtering user
```
## Show information for specific cluster
By default, any of the above commands shows information for the local cluster, which is **merlin6**.
If you want to see the same information for **merlin5**, you have to add the parameter ``--clusters=merlin5``.
If you want to see both clusters at the same time, add the option ``--federation``.
Examples:
```bash
sinfo # 'sinfo' local cluster which is 'merlin6'
sinfo --clusters=merlin5 # 'sinfo' non-local cluster 'merlin5'
sinfo --federation # 'sinfo' all clusters which are 'merlin5' & 'merlin6'
squeue # 'squeue' local cluster which is 'merlin6'
squeue --clusters=merlin5 # 'squeue' non-local cluster 'merlin5'
squeue --federation # 'squeue' all clusters which are 'merlin5' & 'merlin6'
```
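The same option applies to job control commands; for instance:
```bash
scancel --clusters=merlin5 <jobid>  # cancel a job running on the 'merlin5' cluster
```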
---

View File

@@ -0,0 +1,354 @@
---
title: Slurm Examples
#tags:
keywords: slurm example, template, examples, templates, running jobs, sbatch, single core based jobs, HT, multithread, no-multithread, mpi, openmp, packed jobs, hands-on, array jobs, gpu
last_updated: 07 September 2022
summary: "This document shows different template examples for running jobs in the Merlin cluster."
sidebar: merlin6_sidebar
permalink: /merlin6/slurm-examples.html
---
## Single core based job examples
### Example 1: Hyperthreaded job
In this example we want to use hyperthreading (``--ntasks-per-core=2`` and ``--hint=multithread``). In our Merlin6 configuration,
the default memory per CPU (a CPU is equivalent to a core thread) is 4000MB, hence each task can use up to 8000MB (2 threads x 4000MB).
```bash
#!/bin/bash
#SBATCH --partition=hourly # Using 'hourly' will grant higher priority
#SBATCH --ntasks-per-core=2 # Request the max ntasks be invoked on each core
#SBATCH --hint=multithread # Use extra threads with in-core multi-threading
#SBATCH --time=00:30:00 # Define max time job will run
#SBATCH --output=myscript.out # Define your output file
#SBATCH --error=myscript.err # Define your error file
module purge
module load $MODULE_NAME # where $MODULE_NAME is a software in PModules
srun $MYEXEC # where $MYEXEC is a path to your binary file
```
### Example 2: Non-hyperthreaded job
In this example we do not want hyper-threading (``--ntasks-per-core=1`` and ``--hint=nomultithread``). In our Merlin6 configuration,
the default memory per CPU (a CPU is equivalent to a core thread) is 4000MB. If we do not specify anything else, our
single core task will use the default of 4000MB. However, one could double it with ``--mem-per-cpu=8000`` if more memory is required
(remember, the second thread will not be used, so we can safely assign the extra 4000MB to the unique active thread).
```bash
#!/bin/bash
#SBATCH --partition=hourly # Using 'hourly' will grant higher priority
#SBATCH --ntasks-per-core=1 # Request the max ntasks be invoked on each core
#SBATCH --hint=nomultithread # Don't use extra threads with in-core multi-threading
#SBATCH --time=00:30:00 # Define max time job will run
#SBATCH --output=myscript.out # Define your output file
#SBATCH --error=myscript.err # Define your error file
module purge
module load $MODULE_NAME # where $MODULE_NAME is a software in PModules
srun $MYEXEC # where $MYEXEC is a path to your binary file
```
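If your single-core task needs the memory of the unused second thread, as described above, a minimal variant of this template adds:
```bash
#SBATCH --mem-per-cpu=8000    # assign the idle thread's 4000MB to the active thread
```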
## Multi core based job examples
### Example 1: MPI with Hyper-Threading
In this example we run a job that will run 88 tasks. Merlin6 Apollo nodes have 44 cores, each with hyper-threading
enabled. This means that we can run 2 threads per core, 88 threads in total. To accomplish that, users should specify
``--ntasks-per-core=2`` and ``--hint=multithread``.
Use `--nodes=1` if you want to use a node exclusively (88 hyperthreaded tasks would fit in a Merlin6 node).
```bash
#!/bin/bash
#SBATCH --partition=hourly # Using 'hourly' will grant higher priority
#SBATCH --ntasks=88 # Job will run 88 tasks
#SBATCH --ntasks-per-core=2 # Request the max ntasks be invoked on each core
#SBATCH --hint=multithread # Use extra threads with in-core multi-threading
#SBATCH --time=00:30:00 # Define max time job will run
#SBATCH --output=myscript.out # Define your output file
#SBATCH --error=myscript.err # Define your error file
module purge
module load $MODULE_NAME # where $MODULE_NAME is a software in PModules
srun $MYEXEC # where $MYEXEC is a path to your binary file
```
### Example 2: MPI without Hyper-Threading
In this example, we want to run a job that will run 44 tasks, and due to performance reasons we want to disable hyper-threading.
Merlin6 Apollo nodes have 44 cores, each one with hyper-threading enabled. For ensuring that only 1 thread will be used per task,
users should specify ``--ntasks-per-core=1`` and ``--hint=nomultithread``. With this configuration, we tell Slurm to run only 1
task per core and no hyperthreading will be used. Hence, each task will be assigned to an independent core.
Use `--nodes=1` if you want to use a node exclusively (44 non-hyperthreaded tasks would fit in a Merlin6 node).
```bash
#!/bin/bash
#SBATCH --partition=hourly # Using 'hourly' will grant higher priority
#SBATCH --ntasks=44 # Job will run 44 tasks
#SBATCH --ntasks-per-core=1 # Request the max ntasks be invoked on each core
#SBATCH --hint=nomultithread # Don't use extra threads with in-core multi-threading
#SBATCH --time=00:30:00 # Define max time job will run
#SBATCH --output=myscript.out # Define your output file
#SBATCH --error=myscript.err # Define your error file
module purge
module load $MODULE_NAME # where $MODULE_NAME is a software in PModules
srun $MYEXEC # where $MYEXEC is a path to your binary file
```
### Example 3: Hyperthreaded Hybrid MPI/OpenMP job
In this example, we want to run a hybrid job using MPI and OpenMP with hyperthreading. In this job, we want to run 4 MPI
tasks using 8 CPUs per task. Each task in our example requires 128GB of memory, so we specify 16000MB per CPU
(8 x 16000MB = 128000MB). Notice that since hyperthreading is enabled, Slurm will use 4 cores per task (with hyperthreading,
2 threads -a.k.a. Slurm CPUs- fit into a core).
```bash
#!/bin/bash -l
#SBATCH --clusters=merlin6
#SBATCH --job-name=test
#SBATCH --ntasks=4
#SBATCH --ntasks-per-socket=1
#SBATCH --mem-per-cpu=16000
#SBATCH --cpus-per-task=8
#SBATCH --partition=hourly
#SBATCH --time=01:00:00
#SBATCH --output=srun_%j.out
#SBATCH --error=srun_%j.err
#SBATCH --hint=multithread
module purge
module load $MODULE_NAME # where $MODULE_NAME is a software in PModules
srun $MYEXEC # where $MYEXEC is a path to your binary file
```
{{site.data.alerts.tip}} Always consider that <b>'--mem-per-cpu' x '--cpus-per-task'</b> can <b>never</b> exceed the maximum amount of memory per node (352000MB).
{{site.data.alerts.end}}
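For the OpenMP side of such hybrid jobs, a common pattern (not shown in the template above) is to match the OpenMP thread count to the Slurm allocation before launching the binary:
```bash
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # one OpenMP thread per allocated CPU
srun $MYEXEC                                  # as above, $MYEXEC is your binary
```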
### Example 4: Non-hyperthreaded Hybrid MPI/OpenMP job
In this example, we want to run a hybrid job using MPI and OpenMP without hyperthreading. In this job, we want to run 4 MPI
tasks using 8 CPUs per task. Each task in our example requires 128GB of memory, so we specify 16000MB per CPU
(8 x 16000MB = 128000MB). Notice that since hyperthreading is disabled, Slurm will use 8 cores per task (disabling hyperthreading
forces the use of only 1 thread -a.k.a. 1 CPU- per core).
```bash
#!/bin/bash -l
#SBATCH --clusters=merlin6
#SBATCH --job-name=test
#SBATCH --ntasks=4
#SBATCH --ntasks-per-socket=1
#SBATCH --mem-per-cpu=16000
#SBATCH --cpus-per-task=8
#SBATCH --partition=hourly
#SBATCH --time=01:00:00
#SBATCH --output=srun_%j.out
#SBATCH --error=srun_%j.err
#SBATCH --hint=nomultithread
module purge
module load $MODULE_NAME # where $MODULE_NAME is a software in PModules
srun $MYEXEC # where $MYEXEC is a path to your binary file
```
{{site.data.alerts.tip}} Always consider that <b>'--mem-per-cpu' x '--cpus-per-task'</b> can <b>never</b> exceed the maximum amount of memory per node (352000MB).
{{site.data.alerts.end}}
## GPU examples
Using GPUs requires two major changes. First, the cluster needs to be specified
to `gmerlin6`. This should also be added to later commands pertaining to the
job, e.g. `scancel --cluster=gmerlin6 <jobid>`. Second, the number of GPUs
should be specified using `--gpus`, `--gpus-per-task`, or similar parameters.
Here's an example for a simple test job:
```bash
#!/bin/bash
#SBATCH --partition=gpu # Or 'gpu-short' for higher priority but 2-hour limit
#SBATCH --cluster=gmerlin6 # Required for GPU
#SBATCH --gpus=2 # Total number of GPUs
#SBATCH --cpus-per-gpu=5 # Request CPU resources
#SBATCH --time=1-00:00:00 # Define max time job will run
#SBATCH --output=myscript.out # Define your output file
#SBATCH --error=myscript.err # Define your error file
module purge
module load cuda # load any needed modules here
srun $MYEXEC # where $MYEXEC is a path to your binary file
```
Slurm will automatically set the GPU visibility (e.g. `$CUDA_VISIBLE_DEVICES`).
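A quick way to verify this from inside a job step is a sketch like:
```bash
srun bash -c 'echo "$(hostname): CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"'
```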
## Advanced examples
### Array Jobs: launching a large number of related jobs
If you need to run a large number of jobs based on the same executable with systematically varying inputs,
e.g. for a parameter sweep, you can do this most easily in form of a **simple array job**.
``` bash
#!/bin/bash
#SBATCH --job-name=test-array
#SBATCH --partition=daily
#SBATCH --ntasks=1
#SBATCH --time=08:00:00
#SBATCH --array=1-8
echo $(date) "I am job number ${SLURM_ARRAY_TASK_ID}"
srun $MYEXEC config-file-${SLURM_ARRAY_TASK_ID}.dat
```
This will run 8 independent jobs, where each job can use the counter
variable `SLURM_ARRAY_TASK_ID` defined by Slurm inside of the job's
environment to feed the correct input arguments or configuration file
to the "myprogram" executable. Each job will receive the same set of
configurations (e.g. time limit of 8h in the example above).
The jobs are independent, but they will run in parallel (if the cluster resources allow for
it). The jobs will get JobIDs like {some-number}_0 to {some-number}_7, and they also will each
have their own output file.
**Note:**
* Do not use such jobs if you have very short tasks, since each array sub job will incur the full overhead for launching an independent Slurm job. For such cases you should use a **packed job** (see below).
* If you want to control how many of these jobs can run in parallel, you can use the `#SBATCH --array=1-100%5` syntax. The `%5` will define
that only 5 sub jobs may ever run in parallel.
You also can use an array job approach to run over all files in a directory, substituting the payload with
``` bash
FILES=(/path/to/data/*)
srun $MYEXEC ${FILES[$SLURM_ARRAY_TASK_ID]}
```
Or for a trivial case you could supply the values for a parameter scan in form
of an argument list that gets fed to the program using the counter variable.
``` bash
ARGS=(0.05 0.25 0.5 1 2 5 100)
srun $MYEXEC ${ARGS[$SLURM_ARRAY_TASK_ID]}
```
### Array jobs: running very long tasks with checkpoint files
If you need to run a job for much longer than the queues (partitions) permit, and
your executable is able to create checkpoint files, you can use this
strategy:
``` bash
#!/bin/bash
#SBATCH --job-name=test-checkpoint
#SBATCH --partition=general
#SBATCH --ntasks=1
#SBATCH --time=7-00:00:00 # each job can run for 7 days
#SBATCH --cpus-per-task=1
#SBATCH --array=1-10%1 # Run a 10-job array, one job at a time.
if test -e checkpointfile; then
# There is a checkpoint file;
$MYEXEC --read-checkp checkpointfile
else
# There is no checkpoint file, start a new simulation.
$MYEXEC
fi
```
The `%1` in the `#SBATCH --array=1-10%1` statement defines that only 1 subjob can ever run in parallel, so
this will result in subjob n+1 only being started when job n has finished. It will read the checkpoint file
if it is present.
### Packed jobs: running a large number of short tasks
Since the launching of a Slurm job incurs some overhead, you should not submit each short task as a separate
Slurm job. Use job packing, i.e. you run the short tasks within the loop of a single Slurm job.
You can launch the short tasks using `srun` with the `--exclusive` switch (not to be confused with the
switch of the same name used in the SBATCH commands). This switch will ensure that only a specified
number of tasks can run in parallel.
As an example, the following job submission script will ask Slurm for
44 cores (threads), then it will run the `$MYEXEC` program 1000 times with
arguments passed from 1 to 1000. With the `-N1 -n1 -c1 --exclusive`
option, it ensures that at any point in time only 44
instances are effectively running, each being allocated one CPU. You
can decide at this point to allocate several CPUs or tasks by adapting
the corresponding parameters.
``` bash
#! /bin/bash
#SBATCH --job-name=test-packed
#SBATCH --partition=general
#SBATCH --time=7-00:00:00
#SBATCH --ntasks=44              # defines the number of parallel tasks
for i in {1..1000}
do
srun -N1 -n1 -c1 --exclusive $MYEXEC $i &
done
wait
```
**Note:** The `&` at the end of the `srun` line is needed to not have the script waiting (blocking).
The `wait` command waits for all such background tasks to finish and returns the exit code.
## Hands-On Example
Copy-paste the following example into a file called `myAdvancedTest.batch`:
```bash
#!/bin/bash
#SBATCH --partition=daily # name of slurm partition to submit
#SBATCH --time=2:00:00 # limit the execution of this job to 2 hours, see sinfo for the max. allowance
#SBATCH --nodes=2 # number of nodes
#SBATCH --ntasks=44 # number of tasks
#SBATCH --ntasks-per-core=1 # Request the max ntasks be invoked on each core
#SBATCH --hint=nomultithread # Don't use extra threads with in-core multi-threading
module load gcc/9.2.0 openmpi/3.1.5-1_merlin6
module list
echo "Example no-MPI:" ; hostname # will print one hostname per node
echo "Example MPI:" ; srun hostname # will print one hostname per ntask
```
In the above example, the options ``--nodes=2`` and ``--ntasks=44`` are specified. This means that up to 2 nodes are requested,
and 44 tasks are expected to run. Hence, 44 cores are needed for running that job. Slurm will try to allocate a maximum of
2 nodes, both together having at least 44 cores. Since our nodes have 44 cores each, if nodes are empty (no other users
have running jobs there), the job can land on a single node (it has enough cores to run 44 tasks).
If we want to ensure that the job uses at least two different nodes (i.e. for boosting CPU frequency, or because the job
requires more memory per core), we should specify further options.
A good example is ``--ntasks-per-node=22``. This will distribute the 44 tasks equally across the 2 nodes (22 tasks each).
```bash
#SBATCH --ntasks-per-node=22
```
A different example could be specifying how much memory per core is needed. For instance, ``--mem-per-cpu=32000`` will reserve
~32000MB per core. Since we have a maximum of 352000MB per Apollo node, Slurm will only be able to allocate 11 cores (32000MB x 11 cores = 352000MB) per node.
This means that 4 nodes will be needed (max. 11 tasks per node due to the memory definition, while 44 tasks need to run); in this case we need to change to ``--nodes=4``
(or remove ``--nodes``). Alternatively, we can decrease ``--mem-per-cpu`` to a value that allows more cores per node (i.e. with ``16000``,
22 cores fit per node, so 2 nodes are enough).
```bash
#SBATCH --mem-per-cpu=16000
```
Finally, in order to ensure node exclusivity, the option *--exclusive* can be used (see below). This will ensure that
the requested nodes are exclusive for the job (no other users' jobs will run on those nodes, and only completely
free nodes will be allocated).
```bash
#SBATCH --exclusive
```
This can be combined with the previous examples.
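As a sketch combining the options above (values taken from the previous examples), the header of such a batch script could look like this:
```bash
#!/bin/bash
#SBATCH --partition=daily        # submit to the 'daily' partition
#SBATCH --time=2:00:00           # realistic run time estimate
#SBATCH --nodes=2                # force the allocation of 2 nodes
#SBATCH --ntasks=44              # run 44 tasks in total
#SBATCH --ntasks-per-node=22     # distribute the tasks equally (22 per node)
#SBATCH --mem-per-cpu=16000      # 16000MB per core: 22 cores fit per node
#SBATCH --exclusive              # request the nodes exclusively
#SBATCH --hint=nomultithread     # don't use extra threads

module load gcc/9.2.0 openmpi/3.1.5-1_merlin6
srun hostname                    # one hostname per task
```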
More advanced configurations can be defined and combined with the previous examples. More information about advanced
options can be found in the following link: https://slurm.schedmd.com/sbatch.html (or run `man sbatch`).
If you have questions about how to properly execute your jobs, please contact us through merlin-admins@lists.psi.ch. Do not run
advanced configurations unless you are sure of what you are doing.

View File

@@ -0,0 +1,29 @@
---
title: Jupyter examples on merlin6
#tags:
#keywords:
last_updated: 1 October 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/jupyter-examples.html
---
These examples demonstrate the use of certain python libraries and modules in the merlin6 environment. They are provided to get you started fast. You can check out a repository of the examples from
<https://git.psi.ch/lsm-hpce/merlin6-jupyterhub-examples>
A number of standard data sets for the tutorials of the libraries are hosted on merlin6 centrally under `/data/project/general/public`, so you do not need to store them in your user space.
## Dask
[Dask](https://dask.org/) is a flexible library for parallel computing in Python. It provides the abstraction of a dask dataframe that can reside on multiple machines and can be manipulated by an API designed to be as close as possible to [pandas](https://pandas.pydata.org/). The example shows how to start up dask workers on merlin6 through slurm.
* [Link to example](https://git.psi.ch/lsm-hpce/merlin6-jupyterhub-examples/blob/master/dask-example.ipynb)
* The data sets for the [dask tutorial](https://github.com/dask/dask-tutorial) are hosted at `/data/project/general/public/dask-tutorial`.
## Plotly
[Plotly](https://plot.ly/python/getting-started/) is an interactive open source plotting library.
* [Link to example](https://git.psi.ch/lsm-hpce/merlin6-jupyterhub-examples/blob/master/plotly-example.ipynb)

View File

@@ -0,0 +1,32 @@
---
title: Jupyter Extensions
#tags:
#keywords:
last_updated: 30 September 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/jupyter-extensions.html
---
## Using nbextensions for adding features to your notebook
There exist a number of [contributed but unofficial
extensions](https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/index.html)
that add useful features to your notebooks.
From the classic Notebook UI you can access the available extensions in a separate tab as displayed in the screenshot below. You may have to unselect *disable configuration for nbextensions without explicit compatibility*. The extensions we tested still worked fine with jupyterhub version 1.0.0.
{% include image.html file="jupyter-nbextensions.png" caption="Launch Classic Notebook" max-width=586 %}
## Extensions for working with large notebooks
Especially the following extensions make working with larger notebooks easier:
* **Table of Contents**: Displays a TOC on the left and you can also configure it
to add and update a TOC at the head of the document.
* **Collapsible Headings**: allows you to fold all the cells below a heading.
It may also be interesting for you to explore the [Jupytext](jupytext.html) server extension.
## Variable Inspector
The `variable inspector` extension provides a constantly updated window in which you can see the value and type of your notebook's variables.

View File

@@ -0,0 +1,63 @@
---
title: Jupyterhub Troubleshooting
#tags:
#keywords:
last_updated: 18 February 2020
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/jupyterhub-trouble.html
---
In case of problems or requests, please either submit a **[PSI Service Now](https://psi.service-now.com/psisp)** incident containing *"Merlin Jupyterhub"* as part of the subject, or contact us by mail through <merlin-admins@lists.psi.ch>.
## General steps for troubleshooting
### Investigate the Slurm output file
Your jupyterhub session runs as a normal batch job on the cluster, and each launch will create a slurm output file in your HOME directory named like `jupyterhub_batchspawner_{$JOBID}.log`, where the $JOBID part is the slurm job ID of your job. After a failed launch, investigate the contents of that file. An error message will usually be found towards the end of the file, often including a python backtrace.
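For instance, from a Merlin terminal, a quick way to inspect the most recent launch log is a sketch like:
```bash
# show the tail of the newest jupyterhub launch log
tail -n 50 "$(ls -t ~/jupyterhub_batchspawner_*.log | head -n 1)"
```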
### Investigate python environment interferences
Jupyterhub just runs a jupyter notebook executable as your user inside the batch job. A frequent source of errors is a user's local python environment definitions getting mixed up with the environment that jupyter needs to launch. Typical causes are:
- setting PYTHONPATH inside of the ~/.bash_profile or any other startup script
- having installed packages to your local user area (e.g. using `pip install --user <some-package>`). Such installations will interfere with the environment offered by the `module` system on our cluster (based on anaconda). You can list such packages by executing
`pip list --user`. They are usually located in `~/.local/lib/pythonX.Y/...`.
You can investigate the launching of a notebook interactively, by logging in to Merlin6 and running a jupyter command in the correct environment.
```
module use unstable
module load anaconda/2019.07
conda activate jupyterhub-1.0.0_py36
jupyter --paths
```
## Known Problems and workarounds
### Spawner times out
If the cluster is very full, it may be difficult to launch a session. We always reserve some slots for interactive Jupyterhub use, but it may be that these slots have been taken or that the resources you requested are currently not available.
Inside of a Merlin6 terminal shell, you can run the standard commands like `sinfo` and `squeue` to get an overview of how full the cluster is.
### Your user environment is not among the kernels offered for choice
Refer to our documentation about [using your own custom made
environments with jupyterhub](/merlin6/jupyterhub.html).
### Cannot save notebook - _xsrf argument missing_
You cannot save your notebook anymore and you get this error:
**'_xsrf' argument missing from POST**
This issue occurs very rarely. The following workaround exists:
Go to the jupyterhub file browsing window and just open another
notebook using the same kernel in another browser window. The issue
should then go away. For more information refer to [this github
thread](https://github.com/nteract/hydrogen/issues/922#issuecomment-405456346)
<!-- ## Error HTTP 500 when starting the spawner -->
<!-- The spawner screen shows after launching an error message like the following: -->
<!-- `Internal server error (Spawner failed to start [status=..]. The logs may contain details)` -->

View File

@@ -0,0 +1,112 @@
---
title: Jupyterhub on Merlin
#tags:
#keywords:
last_updated: 31 July 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/jupyterhub.html
---
Jupyterhub provides [jupyter notebooks](https://jupyter.org/) that are launched on
cluster nodes of merlin and can be accessed through a web portal.
## Accessing Jupyterhub and launching a session
The service is available inside of PSI (or through a VPN connection) at
**<https://merlin-jupyter.psi.ch:8000>**
1. **Login**: You will be presented with a **Login** web page for
authenticating with your PSI account.
1. **Spawn job**: The **Spawner Options** page allows you to
specify the properties (Slurm partition, running time,...) of
the batch jobs that will be running your jupyter notebook. Once
you click on the `Spawn` button, your job will be sent to the
Slurm batch system. If the cluster is not currently overloaded
and the resources you requested are available, your job will
usually start within 30 seconds.
## Jupyter software environments - running different kernels
Your notebooks can run within different software environments which are offered by a number of available **Jupyter kernels**.
E.g. in this test installation we provide two environments targeted at data science:
* **tensorflow-1.13.1_py37**: contains Tensorflow, Keras, scikit-learn, Pandas, numpy, dask, and dependencies. Stable
* **talos_py36**: also contains the Talos package. This
environment is experimental and subject to updates and changes.
When you create a new notebook you will be asked to specify which kernel you want to use. It is also possible to switch the kernel of a running notebook, but you will lose the state of the current kernel, so you will have to recalculate the notebook cells with this new kernel.
These environments are also available for standard work in a shell session. You can activate an environment in a normal merlin terminal session by using the `module` command (see [using Pmodules](using-modules.html)) to load anaconda python, and from there using the `conda` command to switch to the desired environment:
```
module use unstable
module load anaconda/2019.07
conda activate tensorflow-1.13.1_py36
```
When the `anaconda` module has been loaded, you can list the available environments by executing
```
conda info -e
```
You can get more info on the use of the `conda` package management tool at its official [documentation site](https://conda.io/projects/conda/en/latest/commands.html).
## Using your own custom made environments with jupyterhub
Python environments can take up a lot of space due to the many dependencies that will be installed. You should always install your extra environments to the data area belonging to your account, e.g. `/data/user/${YOUR-USERNAME}/conda-envs`
In order for jupyterhub (and jupyter in general) to recognize the provided environment as a valid kernel, make sure that you include the `nb_conda_kernels` package in your environment. This package provides the necessary activation and the dependencies.
Example:
```
conda create -c conda-forge -p /data/user/${USER}/conda-envs/my-test-env python=3.7 nb_conda_kernels
```
After this, your new kernel will be visible as `my-test-env` inside of your jupyterhub session.
## Requesting additional resources
The **Spawner Options** page covers the most common options. These are used to
create a submission script for the jupyterhub job and submit it to the slurm
queue. Additional customization can be implemented using the *'Optional user
defined line to be added to the batch launcher script'* option. This line is
added to the submission script at the end of other `#SBATCH` lines. Parameters can
be passed to SLURM by starting the line with `#SBATCH`, like in [Running Slurm
Scripts](/merlin6/running-jobs.html). Some ideas:
**Request additional memory**
```
#SBATCH --mem=100G
```
**Request multiple GPUs** (gpu partition only)
```
#SBATCH --gpus=2
```
**Log additional information**
```
hostname; date; echo $USER
```
Output is found in `~/jupyterhub_batchspawner_<jobid>.log`.
## Contact
In case of problems or requests, please either submit a **[PSI Service
Now](https://psi.service-now.com/psisp)** incident containing *"Merlin
Jupyterhub"* as part of the subject, or contact us by mail through
<merlin-admins@lists.psi.ch>.

View File

@@ -0,0 +1,28 @@
---
title: Jupyterlab User interface
#tags:
#keywords:
last_updated: 31 July 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/jupyterlab.html
---
## Testing out Jupyterlab
**Jupyterlab** is a new interface to interact with your Jupyter notebooks. However, it is in very active development and undergoing constant changes. You can read about its features [on the official website](https://jupyterlab.readthedocs.io/en/stable/user/interface.html).
You can test it out on our server by using the following kind of URL, where *$YOUR-USER* must be replaced by your PSI username. You must already have an active session on the jupyterhub.
https://merlin-jupyter.psi.ch:8000/user/$YOUR-USER/lab
## Switching to the Classic Notebook user interface
You can switch to the classical notebook UI by using the **"Launch Classic Notebook"** command from the left sidebar of JupyterLab.
{% include image.html file="jupyter-launch-classic.png" caption="Launch Classic Notebook" max-width=501 %}
## Jupyterlab does not support the older nbextensions
These regrettably are not yet supported from within the JupyterLab UI,
but you can activate them through the Classic Notebook interface (see
above).

View File

@@ -0,0 +1,37 @@
---
title: Jupytext - efficient editing
#tags:
#keywords:
last_updated: 30 September 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/jupytext.html
---
[Jupytext](https://github.com/mwouts/jupytext) is a Jupyter server extension that allows creating a text file from a notebook that can be kept in sync with it, **with the aim of using more efficient editors or IDEs on it**. The file can be created in a number of formats, e.g. *markdown*, *.py (light script)*, and others. `Jupytext` will keep both the notebook and this **paired** file in sync: if you save the paired file, changes will be carried over into the notebook, and vice versa. This pairing will persist in new sessions of your notebook until you explicitly remove it again.
The paired file contains only the cell contents and not the output. Therefore it also is **much better suited for revision control**, since the differences between versions are limited to the cells and these file formats yield more meaningful text differences than the default notebook storage format.
## Creating a paired file in python format for efficient refactoring
From your notebook, go to the `file` menu and navigate to the `jupytext` submenu. Select the **light script** pairing option. This will create a `*.py` file version with the same basename as your notebook file.
{% include image.html file="jupytext_menu.png" caption="Jupytext menu" max-width=501 %}
You can edit that file separately in your favourite python editor. The markdown text parts will be conserved in the file in the form of python comments.
When you save the file and do a browser page reload of your jupyter notebook, you will see all the changes carried over into your jupyter notebook.
## Creating a paired file in markdown format for efficient text authoring
If you want to efficiently work on the descriptive text base of your notebook, just pair it using the `Pair notebook with Markdown` menu item and edit the generated `*.md` file with your favourite Markdown editor.
## Disable autosaving when working on the paired file
Your notebooks usually autosave every 2 minutes (default). Turn this feature off when working with the paired file. Otherwise Jupyter will continue to save the notebook state while you are editing the paired file, and the changes will be synced to the on-disk version of the paired file. You can disable the autosave by unchecking the `Autosave notebook` menu item in the Jupytext menu (see above image).
## Further information
Please refer to
* [the Jupytext FAQ](https://jupytext.readthedocs.io/en/latest/faq.html)
* [the Jupytext documentation](https://jupytext.readthedocs.io/en/latest/index.html)

View File

@@ -0,0 +1,144 @@
---
title: ANSYS / CFX
#tags:
keywords: software, ansys, cfx5, cfx, slurm, interactive, rsm, batch job
last_updated: 07 September 2022
summary: "This document describes how to run ANSYS/CFX in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/ansys-cfx.html
---
This document describes the different ways for running **ANSYS/CFX**
## ANSYS/CFX
It is always recommended to check which parameters are available in CFX and adapt the examples below according to your needs.
For that, run `cfx5solve -help` to get a list of options.
## Running CFX jobs
### PModules
Using the latest ANSYS software available in PModules is strongly recommended.
```bash
module use unstable
module load Pmodules/1.1.6
module use overlay_merlin
module load ANSYS/2022R1
```
### Interactive: RSM from remote PSI Workstations
It is possible to run CFX through RSM from a remote PSI (Linux or Windows) workstation with a local installation of ANSYS CFX and the RSM client.
For that, please refer to **[ANSYS RSM](/merlin6/ansys-rsm.html)** in the Merlin documentation for further information on how to set up an RSM client for submitting jobs to Merlin.
### Non-interactive: sbatch
Running jobs with `sbatch` is always the recommended method. This makes the use of the resources more efficient. Notice that for
running non-interactive CFX jobs one must specify the `-batch` option.
#### Serial example
This example shows a very basic serial job.
```bash
#!/bin/bash
#SBATCH --job-name=CFX # Job Name
#SBATCH --partition=hourly # Using 'hourly' will grant higher priority than 'daily' or 'general'
#SBATCH --time=0-01:00:00 # Time needed for running the job. Must match with 'partition' limits.
#SBATCH --cpus-per-task=1 # Double if hyperthreading enabled
#SBATCH --ntasks-per-core=1 # Double if hyperthreading enabled
#SBATCH --hint=nomultithread # Disable Hyperthreading
#SBATCH --error=slurm-%j.err # Define your error file
module use unstable
module load ANSYS/2020R1-1
# [Optional:BEGIN] Specify your license server if this is not 'lic-ansys.psi.ch'
LICENSE_SERVER=<your_license_server>
export ANSYSLMD_LICENSE_FILE=1055@$LICENSE_SERVER
export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]
SOLVER_FILE=/data/user/caubet_m/CFX5/mysolver.in
cfx5solve -batch -def "$SOLVER_FILE"
```
One can enable hyperthreading by defining `--hint=multithread`, `--cpus-per-task=2` and `--ntasks-per-core=2`.
However, this is in general not recommended, unless one can ensure it is beneficial.
#### MPI-based example
An example for running CFX using a Slurm batch script is the following:
```bash
#!/bin/bash
#SBATCH --job-name=CFX # Job Name
#SBATCH --partition=hourly # Using 'hourly' will grant higher priority than 'daily' or 'general'
#SBATCH --time=0-01:00:00 # Time needed for running the job. Must match with 'partition' limits.
#SBATCH --nodes=1 # Number of nodes
#SBATCH --ntasks=44 # Number of tasks
#SBATCH --cpus-per-task=1 # Double if hyperthreading enabled
#SBATCH --ntasks-per-core=1 # Double if hyperthreading enabled
#SBATCH --hint=nomultithread # Disable Hyperthreading
#SBATCH --error=slurm-%j.err # Define a file for standard error messages
##SBATCH --exclusive # Uncomment if you want exclusive usage of the nodes
module use unstable
module load ANSYS/2020R1-1
# [Optional:BEGIN] Specify your license server if this is not 'lic-ansys.psi.ch'
LICENSE_SERVER=<your_license_server>
export ANSYSLMD_LICENSE_FILE=1055@$LICENSE_SERVER
export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]
export HOSTLIST=$(scontrol show hostname | tr '\n' ',' | sed 's/,$//g')
JOURNAL_FILE=myjournal.in
# INTELMPI=no for IBM MPI
# INTELMPI=yes for INTEL MPI
INTELMPI=no
if [ "$INTELMPI" == "yes" ]
then
export I_MPI_DEBUG=4
export I_MPI_PIN_CELL=core
# Simple example: cfx5solve -batch -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
# -part $SLURM_NTASKS \
# -start-method 'Intel MPI Distributed Parallel'
cfx5solve -batch -part-large -double -verbose -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
-part $SLURM_NTASKS -par-local -start-method 'Intel MPI Distributed Parallel'
else
# Simple example: cfx5solve -batch -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
# -part $SLURM_NTASKS \
# -start-method 'IBM MPI Distributed Parallel'
cfx5solve -batch -part-large -double -verbose -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
-part $SLURM_NTASKS -par-local -start-method 'IBM MPI Distributed Parallel'
fi
```
In the above example, one can increase the number of *nodes* and/or *ntasks* if needed, and combine it
with `--exclusive` when necessary. In general, **no hyperthreading** is recommended for MPI-based jobs.
Finally, one can change the MPI technology in `-start-method`
(check the CFX documentation for possible values).
## CFX5 Launcher: CFX-Pre, CFD-Post, Solver Manager, TurboGrid
Some users might need to visualize or change some parameters when running calculations with the CFX Solver. For running
**TurboGrid**, **CFX-Pre**, **CFX-Solver Manager** or **CFD-Post**, one should use the **`cfx5` launcher** binary:
```bash
cfx5
```
![CFX5 Launcher Example]({{ "/images/ANSYS/cfx5launcher.png" }})
Then, from the launcher, one can open the proper application (e.g. **CFX-Solver Manager** for visualizing and modifying an
existing job run).
Running the CFX5 launcher requires proper SSH access with X11 forwarding (`ssh -XY`) or, *preferably*, **NoMachine**.
If **ssh** does not work for you, please use **NoMachine** instead (which is the supported and simpler X-based access).
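A minimal sketch of such a session (the hostname is one of the Merlin login nodes; the module versions follow the PModules example above):
```bash
# From your workstation: SSH with X11 forwarding enabled (or use a NoMachine terminal)
ssh -XY $USER@merlin-l-001.psi.ch

# On the login node: load ANSYS and start the launcher
module use unstable
module load Pmodules/1.1.6
module use overlay_merlin
module load ANSYS/2022R1
cfx5
```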

View File

@@ -0,0 +1,162 @@
---
title: ANSYS / Fluent
#tags:
keywords: software, ansys, fluent, slurm, interactive, rsm, batch job
last_updated: 07 September 2022
summary: "This document describes how to run ANSYS/Fluent in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/ansys-fluent.html
---
This document describes the different ways of running **ANSYS/Fluent**.
## ANSYS/Fluent
It is always recommended to check which parameters are available in Fluent and to adapt the example below to your needs.
To do so, run `fluent -help` to get a list of options. Note that when running Fluent one must specify one of the
following flags:
* **2d**: 2D solver, single precision.
* **3d**: 3D solver, single precision.
* **2ddp**: 2D solver, double precision.
* **3ddp**: 3D solver, double precision.
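For instance, a 3D double-precision batch run of a journal file would be launched as sketched below (the journal file name is illustrative):
```bash
fluent 3ddp -g -i myjournal.in
```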
## Running Fluent jobs
### PModules
Using the latest ANSYS software available in PModules is strongly recommended.
```bash
module use unstable
module load Pmodules/1.1.6
module use overlay_merlin
module load ANSYS/2022R1
```
### Interactive: RSM from remote PSI Workstations
It is possible to run Fluent through RSM from a remote PSI (Linux or Windows) workstation with a local installation of ANSYS Fluent and the RSM client.
Please refer to **[ANSYS RSM](/merlin6/ansys-rsm.html)** in the Merlin documentation for further information on how to set up an RSM client for submitting jobs to Merlin.
### Non-interactive: sbatch
Running jobs with `sbatch` is always the recommended method. This makes the use of the resources more efficient.
When running Fluent as a batch job, one needs to run it in non-graphical mode (`-g` option).
#### Serial example
This example shows a very basic serial job.
```bash
#!/bin/bash
#SBATCH --job-name=Fluent # Job Name
#SBATCH --partition=hourly # Using 'daily' will grant higher priority than 'general'
#SBATCH --time=0-01:00:00 # Time needed for running the job. Must match with 'partition' limits.
#SBATCH --cpus-per-task=1 # Double if hyperthreading enabled
#SBATCH --hint=nomultithread # Disable Hyperthreading
#SBATCH --error=slurm-%j.err # Define your error file
module use unstable
module load ANSYS/2020R1-1
# [Optional:BEGIN] Specify your license server if this is not 'lic-ansys.psi.ch'
LICENSE_SERVER=<your_license_server>
export ANSYSLMD_LICENSE_FILE=1055@$LICENSE_SERVER
export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]
JOURNAL_FILE=/data/user/caubet_m/Fluent/myjournal.in
fluent 3ddp -g -i ${JOURNAL_FILE}
```
One can enable hyperthreading by defining `--hint=multithread`, `--cpus-per-task=2` and `--ntasks-per-core=2`.
However, this is in general not recommended, unless one can ensure that it is beneficial.
#### MPI-based example
An example for running Fluent using a Slurm batch script is the following:
```bash
#!/bin/bash
#SBATCH --job-name=Fluent # Job Name
#SBATCH --partition=hourly # Using 'daily' will grant higher priority than 'general'
#SBATCH --time=0-01:00:00 # Time needed for running the job. Must match with 'partition' limits.
#SBATCH --nodes=1 # Number of nodes
#SBATCH --ntasks=44 # Number of tasks
#SBATCH --cpus-per-task=1 # Double if hyperthreading enabled
#SBATCH --ntasks-per-core=1 # Run one task per core
#SBATCH --hint=nomultithread # Disable Hyperthreading
#SBATCH --error=slurm-%j.err # Define a file for standard error messages
##SBATCH --exclusive # Uncomment if you want exclusive usage of the nodes
module use unstable
module load ANSYS/2020R1-1
# [Optional:BEGIN] Specify your license server if this is not 'lic-ansys.psi.ch'
LICENSE_SERVER=<your_license_server>
export ANSYSLMD_LICENSE_FILE=1055@$LICENSE_SERVER
export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]
JOURNAL_FILE=/data/user/caubet_m/Fluent/myjournal.in
fluent 3ddp -g -t ${SLURM_NTASKS} -i ${JOURNAL_FILE}
```
In the above example, one can increase the number of *nodes* and/or *ntasks* if needed. One can remove
`--nodes` to allow the job to spread over multiple nodes, but this may lead to communication overhead. In general, **no
hyperthreading** is recommended for MPI-based jobs. Also, one can combine it with `--exclusive` when necessary.
## Interactive: salloc
Running Fluent interactively is strongly discouraged; whenever possible, use `sbatch` instead.
However, sometimes interactive runs are needed. For jobs requiring only a few CPUs (for example, 2 CPUs) **and** running for a short period of time, one can use the login nodes.
Otherwise, one must use the Slurm batch system through allocations:
* For short jobs requiring more CPUs, one can use the shortest Merlin partition (`hourly`).
* For longer jobs, one can use the longer partitions; however, interactive access is not always possible (depending on the usage of the cluster).
Please refer to the documentation **[Running Interactive Jobs](/merlin6/interactive-jobs.html)** for further information about the different ways of running interactive
jobs in the Merlin6 cluster.
### Requirements
#### SSH Keys
Running Fluent interactively requires SSH keys, which are used for communication between the GUI and the different nodes. One must have
a **passphrase-protected** SSH key; to check whether you already have one, simply run **`ls $HOME/.ssh/`** and look for **`id_rsa`** files.
To deploy SSH keys for running Fluent interactively, follow this documentation: **[Configuring SSH Keys](/merlin6/ssh-keys.html)**.
#### List of hosts
For running Fluent on Slurm computing nodes, one needs the list of reserved nodes. Once you have the allocation, you can obtain
that list with the following command:
```bash
scontrol show hostname
```
This list must be included in the settings as the list of hosts on which to run Fluent. Alternatively, one can pass that list as a parameter (`-cnf` option) when running `fluent`,
as follows:
<details>
<summary>[Running Fluent with 'salloc' example]</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
(base) [caubet_m@merlin-l-001 caubet_m]$ salloc --nodes=2 --ntasks=88 --hint=nomultithread --time=0-01:00:00 --partition=test $SHELL
salloc: Pending job allocation 135030174
salloc: job 135030174 queued and waiting for resources
salloc: job 135030174 has been allocated resources
salloc: Granted job allocation 135030174
(base) [caubet_m@merlin-l-001 caubet_m]$ module use unstable
(base) [caubet_m@merlin-l-001 caubet_m]$ module load ANSYS/2020R1-1
module load: unstable module has been loaded -- ANSYS/2020R1-1
(base) [caubet_m@merlin-l-001 caubet_m]$ fluent 3ddp -t$SLURM_NPROCS -cnf=$(scontrol show hostname | tr '\n' ',')
(base) [caubet_m@merlin-l-001 caubet_m]$ exit
exit
salloc: Relinquishing job allocation 135030174
salloc: Job allocation 135030174 has been revoked.
</pre>
</details>
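Instead of typing the host list inline, one can first store it in a variable; a minimal sketch inside the allocation (the journal file name is illustrative):
```bash
# Build a comma-separated list of the allocated hosts
HOSTLIST=$(scontrol show hostname | tr '\n' ',' | sed 's/,$//')
fluent 3ddp -g -t ${SLURM_NTASKS} -cnf=${HOSTLIST} -i myjournal.in
```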

View File

@@ -0,0 +1,117 @@
---
title: ANSYS HFSS / ElectroMagnetics
#tags:
keywords: software, ansys, ansysEM, em, slurm, hfss, interactive, rsm, batch job
last_updated: 07 September 2022
summary: "This document describes how to run ANSYS HFSS (ElectroMagnetics) in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/ansys-hfss.html
---
This document describes the different ways of running **ANSYS HFSS (ElectroMagnetics)**.
## ANSYS HFSS (ElectroMagnetics)
This recipe shows how to run ANSYS HFSS (ElectroMagnetics) in Slurm.
Keep in mind that, in general, running ANSYS HFSS means running **ANSYS Electronics Desktop**.
## Running HFSS / Electromagnetics jobs
### PModules
It is necessary to run at least **ANSYS/2022R1**, which is available in PModules:
```bash
module use unstable
module load Pmodules/1.1.6
module use overlay_merlin
module load ANSYS/2022R1
```
## Remote job submission: HFSS RSM and SLURM
Running jobs through Remote RSM or Slurm is the recommended way for running ANSYS HFSS.
* **HFSS RSM** can be used from ANSYS HFSS installations running on Windows workstations at PSI (as long as they are in the internal PSI network).
* **Slurm** can be used when submitting directly from a Merlin login node (i.e. with the `sbatch` command, or interactively from **ANSYS Electronics Desktop**).
### HFSS RSM (from remote workstations)
Remote RSM is the way to run ANSYS HFSS when submitting from an ANSYS HFSS installation on a PSI Windows workstation.
An HFSS RSM service is running on each **Merlin login node**, and the listening port depends on the ANSYS EM version. The currently
supported ANSYS EM releases and their associated listening ports are the following:
<table>
<thead>
<tr>
<th scope='col' style="vertical-align:middle;text-align:center;">ANSYS version</th>
<th scope='col' style="vertical-align:middle;text-align:center;">Login nodes</th>
<th scope='col' style="vertical-align:middle;text-align:center;">Listening port</th>
</tr>
</thead>
<tbody>
<tr align="center">
<td>2022R1</td>
<td><font size="2" face="Courier New">merlin-l-001</font></td>
<td>32958</td>
</tr>
<tr align="center">
<td>2022R2</td>
<td><font size="2" face="Courier New">merlin-l-001</font></td>
<td>32959</td>
</tr>
<tr align="center">
<td>2023R2</td>
<td><font size="2" face="Courier New">merlin-l-001</font></td>
<td>32960</td>
</tr>
</tbody>
</table>
Notice that by default ANSYS EM listens on port **`32958`**; this is the default for **ANSYS/2022R1** only.
* Workstations connecting to the Merlin ANSYS EM service must ensure that **Electronics Desktop** connects to the proper port.
* Likewise, the ANSYS version on the workstation must be the same as the version running on Merlin.
Notice that _HFSS RSM is not the same RSM provided for other ANSYS products._ Therefore, the configuration is different from [ANSYS RSM](/merlin6/ansys-rsm.html).
To set up HFSS RSM for use with the Merlin cluster, proceed from the following **ANSYS Electronics Desktop** menu:
1. **[Tools]->[Job Management]->[Select Scheduler]**.
![Select_Scheduler]({{"/images/ANSYS/HFSS/01_Select_Scheduler_Menu.png"}})
2. In the new **[Select scheduler]** window, setup the following settings and **Refresh**:
![RSM_Remote_Scheduler]({{"/images/ANSYS/HFSS/02_Select_Scheduler_RSM_Remote.png"}})
* **Select Scheduler**: `Remote RSM`.
* **Server**: Add a Merlin login node.
* **User name**: Add your Merlin username.
* **Password**: Add your Merlin password.
Once *refreshed*, the **Scheduler info** box must provide **Slurm** information of the server (see above picture). If the box contains that information, then you can save changes (`OK` button).
3. **[Tools]->[Job Management]->[Submit Job...]**.
![Submit_Job]({{"/images/ANSYS/HFSS/04_Submit_Job_Menu.png"}})
4. In the new **[Submit Job]** window, you must specify the location of the **ANSYS Electronics Desktop** binary.
![Product_Path]({{"/images/ANSYS/HFSS/05_Submit_Job_Product_Path.png"}})
* For example, for **ANSYS/2021R1**, the location is `/data/software/pmodules/Tools/ANSYS/2021R1/v211/AnsysEM21.1/Linux64/ansysedt.exe`.
### HFSS Slurm (from login node only)
Running jobs through Slurm from **ANSYS Electronics Desktop** is the way to run ANSYS HFSS when submitting from an ANSYS HFSS installation on a Merlin login node. **ANSYS Electronics Desktop** usually needs to be run from the **[Merlin NoMachine](/merlin6/nomachine.html)** service, which currently runs on:
- `merlin-l-001.psi.ch`
- `merlin-l-002.psi.ch`
Since the Slurm client is present on the login node (where **ANSYS Electronics Desktop** is running), the application is able to detect Slurm and submit to it directly. Therefore, we only have to configure **ANSYS Electronics Desktop** to submit to Slurm. This can be set up as follows:
1. **[Tools]->[Job Management]->[Select Scheduler]**.
![Select_Scheduler]({{"/images/ANSYS/HFSS/01_Select_Scheduler_Menu.png"}})
2. In the new **[Select scheduler]** window, setup the following settings and **Refresh**:
![RSM_Remote_Scheduler]({{"/images/ANSYS/HFSS/03_Select_Scheduler_Slurm.png"}})
* **Select Scheduler**: `Slurm`.
* **Server**: must point to `localhost`.
* **User name**: must be empty.
* **Password**: must be empty.
The **Server**, **User name** and **Password** boxes can't be modified directly; if the values do not match the above settings, change them by temporarily selecting another scheduler which allows editing these boxes (i.e. **RSM Remote**).
Once *refreshed*, the **Scheduler info** box must provide **Slurm** information of the server (see above picture). If the box contains that information, then you can save changes (`OK` button).

View File

@@ -0,0 +1,162 @@
---
title: ANSYS / MAPDL
#tags:
keywords: software, ansys, mapdl, slurm, apdl, interactive, rsm, batch job
last_updated: 07 September 2022
summary: "This document describes how to run ANSYS/Mechanical APDL in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/ansys-mapdl.html
---
This document describes the different ways of running **ANSYS/Mechanical APDL**.
## ANSYS/Mechanical APDL
It is always recommended to check which parameters are available in Mechanical APDL and to adapt the examples below to your needs.
For that, please refer to the official Mechanical APDL documentation.
## Running Mechanical APDL jobs
### PModules
Using the latest ANSYS software available in PModules is strongly recommended.
```bash
module use unstable
module load Pmodules/1.1.6
module use overlay_merlin
module load ANSYS/2022R1
```
### Interactive: RSM from remote PSI Workstations
It is possible to run Mechanical APDL through RSM from a remote PSI (Linux or Windows) workstation with a local installation of ANSYS Mechanical and the RSM client.
Please refer to **[ANSYS RSM](/merlin6/ansys-rsm.html)** in the Merlin documentation for further information on how to set up an RSM client for submitting jobs to Merlin.
### Non-interactive: sbatch
Running jobs with `sbatch` is always the recommended method. This makes the use of the resources more efficient. Notice that for
running non-interactive Mechanical APDL jobs one must specify the `-b` option.
#### Serial example
This example shows a very basic serial job.
```bash
#!/bin/bash
#SBATCH --job-name=MAPDL # Job Name
#SBATCH --partition=hourly # Using 'daily' will grant higher priority than 'general'
#SBATCH --time=0-01:00:00 # Time needed for running the job. Must match with 'partition' limits.
#SBATCH --cpus-per-task=1 # Double if hyperthreading enabled
#SBATCH --ntasks-per-core=1 # Double if hyperthreading enabled
#SBATCH --hint=nomultithread # Disable Hyperthreading
#SBATCH --error=slurm-%j.err # Define your error file
module use unstable
module load ANSYS/2020R1-1
# [Optional:BEGIN] Specify your license server if this is not 'lic-ansys.psi.ch'
LICENSE_SERVER=<your_license_server>
export ANSYSLMD_LICENSE_FILE=1055@$LICENSE_SERVER
export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]
SOLVER_FILE=/data/user/caubet_m/MAPDL/mysolver.in
mapdl -b -i "$SOLVER_FILE"
```
One can enable hyperthreading by defining `--hint=multithread`, `--cpus-per-task=2` and `--ntasks-per-core=2`.
However, this is in general not recommended, unless one can ensure that it is beneficial.
#### SMP-based example
This example shows how to run Mechanical APDL in Shared-Memory Parallelism (SMP) mode. It is limited
to a single node, but can use many cores. In the example below, we use a full node with all of its cores
and the whole memory.
```bash
#!/bin/bash
#SBATCH --job-name=MAPDL # Job Name
#SBATCH --partition=hourly # Using 'daily' will grant higher priority than 'general'
#SBATCH --time=0-01:00:00 # Time needed for running the job. Must match with 'partition' limits.
#SBATCH --nodes=1 # Number of nodes
#SBATCH --ntasks=1 # Number of tasks
#SBATCH --cpus-per-task=44 # Double if hyperthreading enabled
#SBATCH --hint=nomultithread # Disable Hyperthreading
#SBATCH --error=slurm-%j.err # Define a file for standard error messages
#SBATCH --exclusive              # Exclusive node usage (recommended when using the whole memory)
module use unstable
module load ANSYS/2020R1-1
# [Optional:BEGIN] Specify your license server if this is not 'lic-ansys.psi.ch'
LICENSE_SERVER=<your_license_server>
export ANSYSLMD_LICENSE_FILE=1055@$LICENSE_SERVER
export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]
SOLVER_FILE=/data/user/caubet_m/MAPDL/mysolver.in
mapdl -b -np ${SLURM_CPUS_PER_TASK} -i "$SOLVER_FILE"
```
In the above example, one can reduce the number of **CPUs per task**. Here, `--exclusive`
is usually recommended if one needs to use the whole memory.
For **SMP** runs, one might try hyperthreading mode by doubling the proper settings
(`--cpus-per-task`); in some cases it might be beneficial.
Please notice that `--ntasks-per-core=1` is not defined here: this is because we want to run one
task on many cores! As an alternative, one can explore `--ntasks-per-socket` or `--ntasks-per-node`
for fine-grained configurations.
#### MPI-based example
This example enables Distributed ANSYS for running Mechanical APDL using a Slurm batch script.
```bash
#!/bin/bash
#SBATCH --job-name=MAPDL # Job Name
#SBATCH --partition=hourly # Using 'daily' will grant higher priority than 'general'
#SBATCH --time=0-01:00:00 # Time needed for running the job. Must match with 'partition' limits.
#SBATCH --nodes=1 # Number of nodes
#SBATCH --ntasks=44 # Number of tasks
#SBATCH --cpus-per-task=1 # Double if hyperthreading enabled
#SBATCH --ntasks-per-core=1 # Run one task per core
#SBATCH --hint=nomultithread # Disable Hyperthreading
#SBATCH --error=slurm-%j.err # Define a file for standard error messages
##SBATCH --exclusive # Uncomment if you want exclusive usage of the nodes
module use unstable
module load ANSYS/2020R1-1
# [Optional:BEGIN] Specify your license server if this is not 'lic-ansys.psi.ch'
LICENSE_SERVER=<your_license_server>
export ANSYSLMD_LICENSE_FILE=1055@$LICENSE_SERVER
export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]
SOLVER_FILE=input.dat
# INTELMPI=no for IBM MPI
# INTELMPI=yes for INTEL MPI
INTELMPI=no
if [ "$INTELMPI" == "yes" ]
then
# When using -mpi=intelmpi, KMP Affinity must be disabled
export KMP_AFFINITY=disabled
# INTELMPI is not aware about distribution of tasks.
# - We need to define tasks distribution.
HOSTLIST=$(srun hostname | sort | uniq -c | awk '{print $2 ":" $1}' | tr '\n' ':' | sed 's/:$/\n/g')
mapdl -b -dis -mpi intelmpi -machines $HOSTLIST -np ${SLURM_NTASKS} -i "$SOLVER_FILE"
else
# IBMMPI (default) will be aware of the distribution of tasks.
# - In principle, no need to force tasks distribution
mapdl -b -dis -mpi ibmmpi -np ${SLURM_NTASKS} -i "$SOLVER_FILE"
fi
```
In the above example, one can increase the number of *nodes* and/or *ntasks* if needed, and combine it
with `--exclusive` when necessary. In general, **no hyperthreading** is recommended for MPI-based jobs.

View File

@@ -0,0 +1,108 @@
---
title: ANSYS RSM (Remote Solve Manager)
#tags:
keywords: software, ansys, rsm, slurm, interactive, rsm, windows
last_updated: 07 September 2022
summary: "This document describes how to use the ANSYS Remote Resolve Manager service in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/ansys-rsm.html
---
## ANSYS Remote Solve Manager
**ANSYS Remote Solve Manager (RSM)** is used by ANSYS Workbench to submit computational jobs to HPC clusters directly from Workbench on your desktop.
Therefore, PSI workstations ***with direct access to Merlin*** can submit jobs by using RSM.
Users are responsible for requesting any necessary network access and for debugging connectivity problems with the cluster.
For example, if the workstation is behind a firewall, users need to request a **[firewall rule](https://psi.service-now.com/psisp/?id=psi_new_sc_category&sys_id=6f07ab1e4f3913007f7660fe0310c7ba)** to enable access to Merlin.
{{site.data.alerts.warning}} The Merlin6 administrators <b>are not responsible for connectivity problems</b> between users workstations and the Merlin6 cluster.
{{site.data.alerts.end}}
### The Merlin6 RSM service
An RSM service runs on each login node. This service listens on a specific port and processes any RSM request (for example, from users' ANSYS workstations).
The following login nodes are configured with such services:
* `merlin-l-01.psi.ch`
* `merlin-l-001.psi.ch`
* `merlin-l-002.psi.ch`
Each ANSYS release installed in `/data/software/pmodules/ANSYS` should have its own RSM service running (the listening port is the default one set by that ANSYS release). With the following command users can check which ANSYS releases have an RSM instance running:
```bash
systemctl | grep pli-ansys-rsm-v[0-9][0-9][0-9].service
```
<details>
<summary>[Example] Listing RSM service running on merlin-l-001.psi.ch</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# systemctl | grep pli-ansys-rsm-v[0-9][0-9][0-9].service
pli-ansys-rsm-v195.service loaded active exited PSI ANSYS RSM v195
pli-ansys-rsm-v202.service loaded active exited PSI ANSYS RSM v202
pli-ansys-rsm-v211.service loaded active exited PSI ANSYS RSM v211
pli-ansys-rsm-v212.service loaded active exited PSI ANSYS RSM v212
pli-ansys-rsm-v221.service loaded active exited PSI ANSYS RSM v221
</pre>
</details>
## Configuring RSM client on Windows workstations
Users can set up ANSYS RSM on their workstations to connect to the Merlin6 cluster.
The steps and settings required to make it work are the following:
1. Open the RSM Configuration service in Windows for the ANSYS release you want to configure.
2. Right-click the **HPC Resources** icon followed by **Add HPC Resource...**
![Adding a new HPC Resource]({{ "/images/ANSYS/rsm-1-add_hpc_resource.png" }})
3. In the **HPC Resource** tab, fill up the corresponding fields as follows:
![HPC Resource]({{"/images/ANSYS/rsm-2-add_cluster.png"}})
* **"Name"**: Add here the preffered name for the cluster. In example: `Merlin6 cluster - merlin-l-001`
* **"HPC Type"**: Select `SLURM`
* **"Submit host"**: Add one of the login nodes. In example `merlin-l-001`.
* **"Slurm Job submission arguments (optional)"**: Add any required Slurm options for running your jobs.
* In general, `--hint=nomultithread` should be at least present.
* Check **"Use SSH protocol for inter and intra-node communication (Linux only)"**
* Select **"Able to directly submit and monitor HPC jobs"**.
* **"Apply"** changes.
4. In the **"File Management"** tab, fill up the corresponding fields as follows:
![File Management]({{"/images/ANSYS/rsm-3-add_scratch_info.png"}})
* Select **"RSM internal file transfer mechanism"** and add **`/shared-scratch`** as the **"Staging directory path on Cluster"**
* Select **"Scratch directory local to the execution node(s)"** and add **`/scratch`** as the **HPC scratch directory**.
* **Never check** the option "Keep job files in the staging directory when job is complete" if the previous
option "Scratch directory local to the execution node(s)" was set.
* **"Apply"** changes.
5. In the **"Queues"** tab, use the left button to auto-discover partitions
![Queues]({{"/images/ANSYS/rsm-4-get_slurm_queues.png"}})
* If no authentication method was configured before, an authentication window will appear. Use your
PSI account to authenticate. Notice that the **`PSICH\`** prefix **must not be added**.
![Authenticating]({{"/images/ANSYS/rsm-5-authenticating.png"}})
* From the partition list, select the ones you want to typically use.
* In general, standard Merlin users must use **`hourly`**, **`daily`** and **`general`** only.
* Other partitions are reserved for allowed users only.
* **"Apply"** changes.
![Select partitions]({{"/images/ANSYS/rsm-6-selected-partitions.png"}})
6. *[Optional]* You can perform a test by submitting a test job on each partition by clicking on the **Submit** button
for each selected partition.
{{site.data.alerts.tip}}
Repeat the process for adding other login nodes if necessary. This gives users the alternative
of using another login node in case of maintenance windows.
{{site.data.alerts.end}}
## Using RSM in ANSYS
Using the RSM service in ANSYS is slightly different depending on the ANSYS software being used.
Please follow the official ANSYS documentation for details about how to use it for that specific software.
Alternatively, please refer to the examples shown in the following chapters (ANSYS-specific software).
### Using RSM in ANSYS Fluent
For further information on using RSM with Fluent, please visit the **[ANSYS Fluent](/merlin6/ansys-fluent.html)** section.
### Using RSM in ANSYS CFX
For further information on using RSM with CFX, please visit the **[ANSYS CFX](/merlin6/ansys-cfx.html)** section.
### Using RSM in ANSYS MAPDL
For further information on using RSM with MAPDL, please visit the **[ANSYS MAPDL](/merlin6/ansys-mapdl.html)** section.

View File

@@ -0,0 +1,89 @@
---
title: ANSYS
#tags:
keywords: software, ansys, slurm, interactive, rsm, pmodules, overlay, overlays
last_updated: 07 September 2022
summary: "This document describes how to load and use ANSYS in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/ansys.html
---
This document provides generic information on how to load and run ANSYS software in the Merlin cluster.
## ANSYS software in Pmodules
The ANSYS software can be loaded through **[PModules](/merlin6/using-modules.html)**.
The default ANSYS versions are loaded from the central PModules repository.
However, some known problems can pop up when using specific ANSYS packages in advanced mode.
Because of this, and also to improve the user's interactive experience, ANSYS has also been installed on the
Merlin high-performance storage and made available from PModules.
### Loading Merlin6 ANSYS
For loading the Merlin6 ANSYS software, one needs to run Pmodules v1.1.4 or newer, and then use a specific repository
(called **`overlay_merlin`**) which is ***only available from the Merlin cluster***:
```bash
module load Pmodules/1.1.6
module use overlay_merlin
```
Once `overlay_merlin` is invoked, central ANSYS installations with the same version are hidden and replaced
by the local ones in Merlin. Releases from the central PModules repository that do not have a local installation remain
visible. For each ANSYS release, one can identify where it is installed by searching for ANSYS in PModules with the `--verbose`
option. This shows the location of the different ANSYS releases as follows:
* For ANSYS releases installed in the central repositories, the path starts with `/opt/psi`
* For ANSYS releases installed in the Merlin6 repository (and/or overriding the central ones), the path starts with `/data/software/pmodules`
<details>
<summary>[Example] Loading ANSYS from the Merlin6 PModules repository</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# module load Pmodules/1.1.6
module load: unstable module has been loaded -- Pmodules/1.1.6
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# module use overlay_merlin
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# module search ANSYS --verbose
Module Rel.stage Group Dependencies/Modulefile
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
ANSYS/2019R3 stable Tools dependencies:
modulefile: /data/software/pmodules/Tools/modulefiles/ANSYS/2019R3
ANSYS/2020R1 stable Tools dependencies:
modulefile: /opt/psi/Tools/modulefiles/ANSYS/2020R1
ANSYS/2020R1-1 stable Tools dependencies:
modulefile: /opt/psi/Tools/modulefiles/ANSYS/2020R1-1
ANSYS/2020R2 stable Tools dependencies:
modulefile: /data/software/pmodules/Tools/modulefiles/ANSYS/2020R2
ANSYS/2021R1 stable Tools dependencies:
modulefile: /data/software/pmodules/Tools/modulefiles/ANSYS/2021R1
ANSYS/2021R2 stable Tools dependencies:
modulefile: /data/software/pmodules/Tools/modulefiles/ANSYS/2021R2
</pre>
</details>
{{site.data.alerts.tip}} Please <b>only use Merlin6 ANSYS installations from `overlay_merlin`</b> in the Merlin cluster.
{{site.data.alerts.end}}
## ANSYS Documentation by product
### ANSYS RSM
**ANSYS Remote Solve Manager (RSM)** is used by ANSYS Workbench to submit computational jobs to HPC clusters directly from Workbench on your desktop.
Therefore, PSI workstations with direct access to Merlin can submit jobs by using RSM.
For further information, please visit the **[ANSYS RSM](/merlin6/ansys-rsm.html)** section.
### ANSYS Fluent
For further information, please visit the **[ANSYS Fluent](/merlin6/ansys-fluent.html)** section.
### ANSYS CFX
For further information, please visit the **[ANSYS CFX](/merlin6/ansys-cfx.html)** section.
### ANSYS MAPDL
For further information, please visit the **[ANSYS MAPDL](/merlin6/ansys-mapdl.html)** section.

View File

@@ -0,0 +1,216 @@
---
title: GOTHIC
#tags:
keywords: software, gothic, slurm, interactive, batch job
last_updated: 07 September 2022
summary: "This document describes how to run Gothic in the Merlin cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/gothic.html
---
This document provides generic information on how to run Gothic in the
Merlin cluster.
## Gothic installation
Gothic is locally installed in Merlin in the following directory:
```bash
/data/project/general/software/gothic
```
Multiple versions are available. As of August 22, 2022, the latest
installed version is **Gothic 8.3 QA**.
Future releases will be placed in the PSI Modules system, so loading
Gothic through PModules will be possible at some point. In the
meantime, one has to use the existing installations present in
`/data/project/general/software/gothic`.
## Running Gothic
### General requirements
When running Gothic in interactive or batch mode, one has to consider
the following requirements:
* **Always use one node only**: Gothic runs as a single instance,
so it cannot run across multiple nodes. Adding the option `--nodes=1-1`
or `-N 1-1` is strongly recommended: this prevents Slurm from allocating
multiple nodes if the allocation definition is ambiguous.
* **Use one task only**: Gothic spawns one main process, which then
spawns multiple threads depending on the number of available cores.
Therefore, one has to specify 1 task (`--ntasks=1` or `-n 1`).
* **Use multiple CPUs**: since Gothic spawns multiple threads,
multiple CPUs can be used. Adding `--cpus-per-task=<num_cpus>`
or `-c <num_cpus>` is in general recommended.
Notice that `<num_cpus>` must never exceed the maximum number of CPUs
in a compute node (usually *88*).
* **Use multithread**: Gothic is OpenMP-based software; therefore,
running in hyper-threading mode is strongly recommended. Use the option
`--hint=multithread` to enforce hyper-threading.
* **[Optional]** *Memory setup*: The default memory per CPU (4000MB)
is usually enough for running Gothic. If you require more memory, you
can always set the `--mem=<mem_in_MB>` option. This is in general
*not necessary*.
### Interactive
**Running CPU-intensive interactive jobs on the login nodes is not
allowed**. Only applications capable of limiting their number of cores are
allowed to run for a longer time. Also, **running on the login nodes is not
efficient**, since resources are shared with other processes and users.
It is possible to submit interactive jobs to the cluster by allocating a
full compute node, or even a few cores only. This grants
dedicated CPUs and resources, and in general it does not affect other users.
For interactive jobs, it is strongly recommended to use the `hourly` partition,
which usually has good node availability.
For longer runs, one should use the `daily` (or `general`) partition.
However, getting interactive access to nodes on these partitions is
sometimes more difficult when the cluster is fairly full.
To submit an interactive job, consider the following requirements:
* **X11 forwarding must be enabled**: Gothic spawns an interactive
window which requires X11 forwarding when using it remotely, therefore
using the Slurm option `--x11` is necessary.
* **Ensure that the scratch area is accessible**: For running Gothic,
one has to define a scratch area with the `GTHTMP` environment variable.
There are two options:
1. **Use local scratch**: Each compute node has its own `/scratch` area.
This area is independent to any other node, therefore not visible by other nodes.
Using the top directory `/scratch` for interactive jobs is the simplest way,
and it can be defined before or after the allocation creation, as follows:
```bash
# Example 1: Define GTHTMP before the allocation
export GTHTMP=/scratch
salloc ...
# Example 2: Define GTHTMP after the allocation
salloc ...
export GTHTMP=/scratch
```
Notice that if you want to use a custom sub-directory (e.g.
`/scratch/$USER`), one has to create the sub-directory on every new
allocation! For example:
```bash
# Example 1:
export GTHTMP=/scratch/$USER
salloc ...
mkdir -p $GTHTMP
# Example 2:
salloc ...
export GTHTMP=/scratch/$USER
mkdir -p $GTHTMP
```
Creating sub-directories makes the process more complex, therefore
using just `/scratch` is simpler and recommended.
2. **Shared scratch**: Using shared scratch provides a
directory visible from all compute nodes and login nodes. Therefore,
one can use `/shared-scratch` to achieve the same as in **1.**, but
the sub-directory needs to be created just once.
Please consider that `/scratch` usually provides better performance and,
in addition, offloads the main storage. Therefore, using **local scratch**
is strongly recommended; use shared scratch only when strictly necessary.
* **Use the `hourly` partition**: Using the `hourly` partition is
recommended for running interactive jobs (latency is in general
lower). However, `daily` and `general` are also available if you expect
longer runs, but in these cases you should expect longer waiting times.
These requirements are in addition to the requirements previously described
in the [General requirements](/merlin6/gothic.html#general-requirements)
section.
#### Interactive allocations: examples
* Requesting a full node,
```bash
salloc --partition=hourly -N 1 -n 1 -c 88 --hint=multithread --x11 --exclusive --mem=0
```
* Requesting 22 CPUs from a node, with default memory per CPU (4000MB/CPU):
```bash
num_cpus=22
salloc --partition=hourly -N 1 -n 1 -c $num_cpus --hint=multithread --x11
```
### Batch job
The Slurm cluster is mainly used for non-interactive batch jobs: users
submit a job, which goes into a queue and waits until Slurm can assign
resources to it. In general, the longer the job, the longer the waiting time,
unless there are enough free resources to start running it immediately.
Running Gothic in a Slurm batch script is pretty simple. One has to mainly
consider the requirements described in the [General requirements](/merlin6/gothic.html#general-requirements)
section, and:
* **Use local scratch** for running batch jobs. In general, defining
`GTHTMP` in a batch script is simpler than on an allocation. If you plan
to run multiple jobs in the same node, you can even create a second sub-directory
level based on the Slurm Job ID:
```bash
mkdir -p /scratch/$USER/$SLURM_JOB_ID
export GTHTMP=/scratch/$USER/$SLURM_JOB_ID
... # Run Gothic here
rm -rf /scratch/$USER/$SLURM_JOB_ID
```
Temporary data generated by the job in `GTHTMP` must be removed at the end of
the job, as shown above.
#### Batch script: examples
* Requesting a full node:
```bash
#!/bin/bash -l
#SBATCH --job-name=Gothic
#SBATCH --time=3-00:00:00
#SBATCH --partition=general
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=88
#SBATCH --hint=multithread
#SBATCH --exclusive
#SBATCH --mem=0
#SBATCH --clusters=merlin6
INPUT_FILE='MY_INPUT.SIN'
mkdir -p /scratch/$USER/$SLURM_JOB_ID
export GTHTMP=/scratch/$USER/$SLURM_JOB_ID
/data/project/general/software/gothic/gothic8.3qa/bin/gothic_s.sh $INPUT_FILE -m -np $SLURM_CPUS_PER_TASK
gth_exit_code=$?
# Clean up data in /scratch
rm -rf /scratch/$USER/$SLURM_JOB_ID
# Return exit code from GOTHIC
exit $gth_exit_code
```
* Requesting 22 CPUs from a node, with default memory per CPU (4000MB/CPU):
```bash
#!/bin/bash -l
#SBATCH --job-name=Gothic
#SBATCH --time=3-00:00:00
#SBATCH --partition=general
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=22
#SBATCH --hint=multithread
#SBATCH --clusters=merlin6
INPUT_FILE='MY_INPUT.SIN'
mkdir -p /scratch/$USER/$SLURM_JOB_ID
export GTHTMP=/scratch/$USER/$SLURM_JOB_ID
/data/project/general/software/gothic/gothic8.3qa/bin/gothic_s.sh $INPUT_FILE -m -np $SLURM_CPUS_PER_TASK
gth_exit_code=$?
# Clean up data in /scratch
rm -rf /scratch/$USER/$SLURM_JOB_ID
# Return exit code from GOTHIC
exit $gth_exit_code
```

View File

@@ -0,0 +1,45 @@
---
title: Intel MPI Support
#tags:
last_updated: 13 March 2020
keywords: software, impi, slurm
summary: "This document describes how to use Intel MPI in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/impi.html
---
## Introduction
This document describes which Intel MPI versions in PModules are supported in the Merlin6 cluster.
### srun
We strongly recommend the use of **'srun'** over **'mpirun'** or **'mpiexec'**. **'srun'** properly
binds tasks to cores and needs less customization, while **'mpirun'** and '**mpiexec**' might need more advanced
configuration and should only be used by advanced users. Please ***always*** adapt your scripts to use **'srun'**
before opening a support ticket. Also, please contact us about any problem when using a module.
{{site.data.alerts.tip}} Always run Intel MPI with the <b>srun</b> command. The only exception is for advanced users, however <b>srun</b> is still recommended.
{{site.data.alerts.end}}
When running with **srun**, one should tell Intel MPI to use the PMI libraries provided by Slurm. For PMI-1:
```bash
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
srun ./app
```
Alternatively, one can use PMI-2, but then one needs to specify it as follows:
```bash
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi2.so
export I_MPI_PMI2=yes
srun ./app
```
For more information, please read [Slurm Intel MPI Guide](https://slurm.schedmd.com/mpi_guide.html#intel_mpi)
**Note**: PMI2 might not work properly in some Intel MPI versions. If so, you can either fall back
to PMI-1 or contact the Merlin administrators.
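Putting this together, a minimal batch script could look like the following sketch (the module name and task count are illustrative; check PModules for the available Intel MPI versions):
```bash
#!/bin/bash
#SBATCH --ntasks=44              # Number of MPI tasks (illustrative)
#SBATCH --hint=nomultithread     # Disable hyperthreading
#SBATCH --time=0-01:00:00
#SBATCH --partition=hourly

module load impi                 # Illustrative module name; load a supported Intel MPI version

# Tell Intel MPI to use the PMI-1 library provided by Slurm
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so

srun ./app
```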

View File

@@ -0,0 +1,140 @@
---
title: OpenMPI Support
#tags:
last_updated: 13 March 2020
keywords: software, openmpi, slurm
summary: "This document describes how to use OpenMPI in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/openmpi.html
---
## Introduction
This document describes which OpenMPI versions in PModules are supported in the Merlin6 cluster.
### srun
We strongly recommend the use of **'srun'** over **'mpirun'** or **'mpiexec'**. **'srun'** properly
binds tasks to cores and needs less customization, while **'mpirun'** and '**mpiexec**' might need more advanced
configuration and should only be used by advanced users. Please ***always*** adapt your scripts to use **'srun'**
before opening a support ticket. Also, please contact us about any problem when using a module.
Example:
```bash
srun ./app
```
{{site.data.alerts.tip}} Always run OpenMPI with the <b>srun</b> command. The only exception is for advanced users, however <b>srun</b> is still recommended.
{{site.data.alerts.end}}
### OpenMPI with UCX
**OpenMPI** supports **UCX** starting from version 3.0, but it is recommended to use version 4.0 or higher due to stability and performance improvements.
**UCX** should be used only by advanced users, as it requires running with **mpirun** (which needs advanced knowledge) and is the exception to running MPI
without **srun** (**UCX** is not integrated with **srun** at PSI).
For running UCX, one should:
* add the following options to **mpirun**:
```bash
-mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1
```
* or alternatively, add the following options **before mpirun**
```bash
export OMPI_MCA_pml="ucx"
export OMPI_MCA_btl="^vader,tcp,openib,uct"
export UCX_NET_DEVICES=mlx5_0:1
```
In addition, one can add the following options for debugging purposes (visit [UCX Logging](https://github.com/openucx/ucx/wiki/Logging) for possible `UCX_LOG_LEVEL` values):
```bash
-x UCX_LOG_LEVEL=<data|debug|warn|info|...> -x UCX_LOG_FILE=<filename>
```
This can also be added externally before the **mpirun** call (see the example below). Full examples:
* Within the **mpirun** command:
```bash
mpirun -np $SLURM_NTASKS -mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1 -x UCX_LOG_LEVEL=data -x UCX_LOG_FILE=UCX-$SLURM_JOB_ID.log ./app
```
* Outside the **mpirun** command:
```bash
export OMPI_MCA_pml="ucx"
export OMPI_MCA_btl="^vader,tcp,openib,uct"
export UCX_NET_DEVICES=mlx5_0:1
export UCX_LOG_LEVEL=data
export UCX_LOG_FILE=UCX-$SLURM_JOB_ID.log
mpirun -np $SLURM_NTASKS ./app
```
## Supported OpenMPI versions
For running OpenMPI properly in a Slurm batch system, ***OpenMPI and Slurm must be compiled accordingly***.
A large number of OpenMPI builds can be found in the central PModules repositories. However, only
some of them are suitable for running in a Slurm cluster: ***any OpenMPI version with the suffix `_slurm`
is suitable for running in the Merlin6 cluster***. OpenMPI builds with the suffix `_merlin6` can also be used, but these will be fully
replaced by the `_slurm` series in the future (which can be used on any Slurm cluster at PSI). Please ***avoid using any other OpenMPI releases***.
{{site.data.alerts.tip}} Suitable <b>OpenMPI</b> versions for running in the Merlin6 cluster:
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;-&nbsp;
<span class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false"><b>openmpi/&lt;version&gt;&#95;slurm</b>
</span>&nbsp;<b>[<u>Recommended</u>]</b>
</p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;-&nbsp;
<span class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">openmpi/&lt;version&gt;&#95;merlin6
</span>
</p>
{{site.data.alerts.end}}
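To check which suitable OpenMPI builds are currently available, one can search the module system; a sketch (the exact output will vary):
```bash
module search openmpi --verbose | grep -E '_slurm|_merlin6'
```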
#### 'unstable' repository
New OpenMPI versions that need to be tested will be compiled first in the **``unstable``** repository, and once validated will be moved to **``stable``**.
We cannot ensure that modules in that repository are production-ready, but you can use them *at your own risk*.
For using *unstable* modules, you might need to load the **``unstable``** PModules repository as follows:
```bash
module use unstable
```
#### 'stable' repository
Officially supported OpenMPI versions (https://www.open-mpi.org/) will be available in the **``stable``** repository (which is the *default* loaded repository).
For further information, please check [https://www.open-mpi.org/software/ompi/ -> Current & Still Supported](https://www.open-mpi.org/software/ompi/)
versions.
Usually, no more than 2 minor update releases will be present in the **``stable``** repository. Older minor update releases will be moved to **``deprecated``**
despite still being officially supported. This ensures that users compile new software with the latest stable versions, while the old versions remain available
for software that was compiled with them.
#### 'deprecated' repository
Old OpenMPI versions (it is, any official OpenMPI version which has been moved to **retired** or **ancient**) will be
moved to the ***'deprecated'*** PModules repository.
For further information, please check [https://www.open-mpi.org/software/ompi/ -> Older Versions](https://www.open-mpi.org/software/ompi/)
versions.
Also, as mentioned [before](/merlin6/openmpi.html#stable-repository), older officially supported OpenMPI releases (minor updates) will be moved to ``deprecated``.
For using *deprecated* modules, you might need to load the **``deprecated``** PModules repository as follows:
```bash
module use deprecated
```
However, this is usually not needed: when directly loading a specific version that is not found in
``stable``, PModules tries to search and fall back to other repositories (``deprecated`` or ``unstable``).
#### About missing versions
##### Missing OpenMPI versions
For legacy software, some users might require a different OpenMPI version. **We always encourage** users to try one of the existing stable versions
(*OpenMPI always with suffix ``_slurm`` or ``_merlin6``!*), as they contain the latest bug fixes and usually should work. In the worst case, you
can also try the ones in the deprecated repository (again, *OpenMPI always with suffix ``_slurm`` or ``_merlin6``!*). For very old software which
was based on OpenMPI v1, you can follow the guide [FAQ: Removed MPI constructs](https://www.open-mpi.org/faq/?category=mpi-removed), which provides
some easy steps for migrating from OpenMPI v1 to v2 or higher, and is also useful for finding out why your code does not compile properly.
If you are still facing problems after trying the mentioned versions and guide, please contact us. Also, please contact us if you require a newer
version with a different ``gcc`` or ``intel`` compiler (for example, Intel v19).

View File

@@ -0,0 +1,63 @@
---
title: Running Paraview
#tags:
last_updated: 03 December 2020
keywords: software, paraview, mesa, OpenGL, interactive
summary: "This document describes how to run ParaView in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/paraview.html
---
## Requirements
**[NoMachine](/merlin6/nomachine.html)** is the official **strongly recommended and supported** tool for running *ParaView*.
Consider that running over SSH (X11 forwarding needed) is very slow, and the configuration might not work since it also depends
on the client setup (Linux workstation/laptop, Windows with XMing, etc.). Hence, please **avoid running ParaView over SSH**.
The only exception is when running ParaView as a job from within a NoMachine session.
## ParaView
### PModules
Using the latest ParaView version available in PModules is strongly recommended. For example, to load **paraview**:
```bash
module use unstable
module load paraview/5.8.1
```
### Running ParaView
ParaView can be run with **VirtualGL** to take advantage of the GPU card present on each login node. Once the module is loaded, you can start **paraview** as follows:
```bash
vglrun paraview
```
Alternatively, one can run **paraview** with *mesa* support with the below command. This can be useful when running on CPU computing nodes (with `srun` / `salloc`)
which have no graphics card (and where `vglrun` is not possible):
```bash
paraview-mesa paraview
```
#### Running older versions of ParaView
Older versions of ParaView available in PModules (e.g. *paraview/5.0.1* and *paraview/5.4.1*) might require a different command
for running ParaView with **Mesa** support. The command is the following:
```bash
# Warning: only for Paraview 5.4.1 and older
paraview --mesa
```
#### Running ParaView interactively in the batch system
One can run ParaView interactively in the CPU cluster as follows:
```bash
# First, load module. In example: "module load paraview/5.8.1"
srun --pty --x11 --partition=general --ntasks=1 paraview-mesa paraview
```
One can change the partition, number of tasks or specify extra parameters to `srun` if needed.
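For example, a short interactive session on the `hourly` partition with a few dedicated CPUs could look as follows (the CPU count is illustrative):
```bash
srun --pty --x11 --partition=hourly --ntasks=1 --cpus-per-task=4 paraview-mesa paraview
```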

View File

@@ -0,0 +1,162 @@
---
title: Python
#tags:
last_updated: 28 September 2020
keywords: [python, anaconda, conda, jupyter, numpy]
summary: Running Python on Merlin
sidebar: merlin6_sidebar
permalink: /merlin6/python.html
---
PSI provides a variety of ways to execute Python code.
1. **Anaconda** - Custom environments for installation and development
2. **Jupyterhub** - Execute Jupyter notebooks on the cluster
3. **System Python** - Do not use! Only for OS applications.
## Anaconda
[Anaconda](https://www.anaconda.com/) ("conda" for short) is a package manager with
excellent python integration. Using it you can create isolated environments for each
of your python applications, containing exactly the dependencies needed for that app.
It is similar to the [virtualenv](http://virtualenv.readthedocs.org/) python package,
but can also manage non-python requirements.
### Loading conda
Conda is loaded from the module system:
```
module load anaconda
```
### Using pre-made environments
Loading the module provides the `conda` command, but does not otherwise change your
environment. First an environment needs to be activated. Available environments can
be seen with `conda info --envs` and include many specialized environments for
software installs. After activating you should see the environment name in your
prompt:
```
~ $ conda activate datascience_py37
(datascience_py37) ~ $
```
### CondaRC file
Creating a `~/.condarc` file is recommended if you want to create new environments on
Merlin. Environments can grow quite large, so you will need to change the
storage location from the default (your home directory) to a larger volume (usually
`/data/user/$USER`).
Save the following as `$HOME/.condarc`:
```
always_copy: true
envs_dirs:
- /data/user/$USER/conda/envs
pkgs_dirs:
- /data/user/$USER/conda/pkgs
- $ANACONDA_PREFIX/conda/pkgs
channels:
- conda-forge
- nodefaults
```
Run `conda info` to check that the variables are being set correctly.
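For example, to display only the relevant keys (a sketch; `conda config --show` accepts a list of configuration keys):
```
conda config --show envs_dirs pkgs_dirs
```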
### Creating environments
We will create an environment named `myenv1` which uses an older version of numpy, e.g. to test for backwards compatibility of our code (the `-q` and `--yes` switches just suppress prompting and disable the progress bar). The environment will be created in the default location as defined by the `.condarc` configuration file (see above).
```
~ $ conda create -q --yes -n 'myenv1' numpy=1.8 scipy ipython
Fetching package metadata: ...
Solving package specifications: .
Package plan for installation in environment /gpfs/home/feichtinger/conda-envs/myenv1:
The following NEW packages will be INSTALLED:
ipython: 2.3.0-py27_0
numpy: 1.8.2-py27_0
openssl: 1.0.1h-1
pip: 1.5.6-py27_0
python: 2.7.8-1
readline: 6.2-2
scipy: 0.14.0-np18py27_0
setuptools: 5.8-py27_0
sqlite: 3.8.4.1-0
system: 5.8-1
tk: 8.5.15-0
zlib: 1.2.7-0
To activate this environment, use:
$ source activate myenv1
To deactivate this environment, use:
$ source deactivate
```
The created environment contains **just the packages that are needed to satisfy the
requirements** and it is local to your installation. The python installation is even
independent of the central installation, i.e. your code will still work in such an
environment, even if you are offline or AFS is down. However, you need the central
installation if you want to use the `conda` command itself.
Packages for your new environment will be either copied from the central one into
your new environment, or if there are newer packages available from anaconda and you
did not specify exactly the version from our central installation, they may get
downloaded from the web. **This will require significant space in the `envs_dirs`
that you defined in `.condarc`.** If you create other environments on the same local
disk, they will share the packages using hard links.
We can switch to the newly created environment with the `conda activate` command.
```
$ conda activate myenv1
```
{% include callout.html type="info" content="Note that anaconda's activate/deactivate
scripts are compatible with the bash and zsh shells but not with [t]csh." %}
Let's test whether we indeed got the desired numpy version:
```
$ python -c 'import numpy as np; print np.version.version'
1.8.2
```
You can install additional packages into the active environment using the `conda
install` command.
```
$ conda install --yes -q bottle
Fetching package metadata: ...
Solving package specifications: .
Package plan for installation in environment /gpfs/home/feichtinger/conda-envs/myenv1:
The following NEW packages will be INSTALLED:
bottle: 0.12.5-py27_0
```
## Jupyterhub
Jupyterhub is a service for running code notebooks on the cluster, particularly in
python. It is a powerful tool for data analysis and prototyping. For more information
see the [Jupyterhub documentation]({{"jupyterhub.html"}}).
## Pythons to avoid
Avoid using the system python (`/usr/bin/python`). It is intended for OS software and
may not be up to date.
Also avoid the 'python' module (`module load python`). This is a minimal install of
python intended for embedding in other modules.

View File

@@ -0,0 +1,61 @@
---
title: Downtimes
#tags:
#keywords:
last_updated: 28 June 2019
#summary: "Merlin 6 cluster overview"
sidebar: merlin6_sidebar
permalink: /merlin6/downtimes.html
---
On the first Monday of each month the Merlin6 cluster might be subject to interruption due to maintenance.
Users will be informed at least one week in advance when a downtime is scheduled for the next month.
Downtimes will be announced through the <merlin-users@lists.psi.ch> mailing list. Also, a detailed description
of the next scheduled interventions will be available in [Next Scheduled Downtimes](/merlin6/downtimes.html#next-scheduled-downtimes).
---
## Scheduled Downtime Draining Policy
Scheduled downtimes, mostly those affecting the storage and Slurm configurations, may require draining the nodes.
When this is required, users will be informed accordingly. Two different types of draining are possible:
* **soft drain**: new jobs may be queued on the partition, but queued jobs may not be allocated nodes and run from the partition.
Jobs already running on the partition continue to run. This will be the **default** drain method.
* **hard drain**: no new jobs may be queued on the partition (job submission requests will be denied with an error message),
but jobs already queued on the partition may be allocated to nodes and run.
Unless explicitly specified, the default draining policy for each partition will be the following:
* The **daily** and **general** partitions will be soft drained 12h before the downtime.
* The **hourly** partition will be soft drained 1 hour before the downtime.
* The **gpu** and **gpu-short** partitions will be soft drained 1 hour before the downtime.
Finally, **remaining running jobs will be killed** by default when the downtime starts. In some rare cases jobs will
just be *paused* and *resumed* when the downtime finishes.
### Draining Policy Summary
The following table contains a summary of the draining policies during a Scheduled Downtime:
| **Partition** | **Drain Policy** | **Default Drain Type** | **Default Job Policy** |
|:---------------:| -----------------:| ----------------------:| --------------------------------:|
| **general** | 12h before the SD | soft drain | Kill running jobs when SD starts |
| **daily** | 12h before the SD | soft drain | Kill running jobs when SD starts |
| **hourly** | 1h before the SD | soft drain | Kill running jobs when SD starts |
| **gpu** | 1h before the SD | soft drain | Kill running jobs when SD starts |
| **gpu-short** | 1h before the SD | soft drain | Kill running jobs when SD starts |
| **gfa-asa** | 1h before the SD | soft drain | Kill running jobs when SD starts |
---
## Next Scheduled Downtimes
The table below shows a description for the next Scheduled Downtime:
| From | To | Service | Description |
| ---------------- | ---------------- |:------------:|:----------------------------------------------------------------------- |
| 05.09.2020 8am   | 05.09.2020 6pm   | *pending*    | *pending*                                                                |
* **Note**: An e-mail will be sent when the services are fully available.

View File

@@ -0,0 +1,38 @@
---
title: Past Downtimes
#tags:
#keywords:
last_updated: 03 September 2019
#summary: "Merlin 6 cluster overview"
sidebar: merlin6_sidebar
permalink: /merlin6/past-downtimes.html
---
## Past Downtimes: Log Changes
### 2020
| From | To | Service | Clusters | Description | Exceptions |
| ---------------- | ---------------- |:------------:|:---------------:|:--------------------------------------------------------------|:-------------------------------------------:|
| 03.08.2020 8am | 03.08.2020 6pm | Archive | merlin6 | Replace old merlin-export-01 for merlin-export-02 | |
| 03.08.2020 8am | 03.08.2020 6pm | RemoteAccess | merlin6 | ra-merlin-0[1,2] Remount merlin-export-02 | |
| 06.07.2020 | 06.07.2020 | All services | merlin5,merlin6 | GPFS v5.0.4-4,OFED v5.0,YFS v0.195,RHEL7.7,Slurm v19.05.7,f/w | |
| 04.05.2020 | 04.05.2020 | Login nodes | merlin6 | Outage. YFS (AFS) update v0.194 and reboot | |
| 04.05.2020 | 04.05.2020 | CN | merlin5 | Outage. O.S. update, OFED drivers update, YFS (AFS) update. | |
| 03.02.2020 9am | 03.02.2020 10am | Slurm | merlin5,merlin6 | Upgrading config [HPCLOCAL-321](https://jira.psi.ch/browse/HPCLOCAL-321) | |
| 10.01.2020 9am | 10.01.2020 6pm | All Services | merlin5,merlin6 | Slurm v18->v19, IB Connected Mode, other. [HPCLOCAL-300](https://jira.psi.ch/browse/HPCLOCAL-300) | |
## Older downtimes
| From | To | Service | Clusters | Description | Exceptions |
| ---------------- | ---------------- |:------------:|:---------------:|:--------------------------------------------------------------|:-------------------------------------------:|
| 02.09.2019 | 02.09.2019 | GPFS | merlin5,merlin6 | v5.0.2-3 -> v5.0.3-2 | |
| 02.09.2019 | 02.09.2019 | O.S. | merlin5 | RHEL7.4 (rhel-7.4) -> RHEL7.6 (prod-00048) | merlin-g-40, still running RHEL7.4\* |
| 02.09.2019 | 02.09.2019 | O.S. | merlin6 | RHEL7.6 (prod-00030) -> RHEL7.6 (prod-00048) | |
| 02.09.2019 | 02.09.2019 | Infiniband | merlin5 | OFED v4.4 -> v4.6 | merlin-g-40, still running OFED v4.4\* |
| 02.09.2019 | 02.09.2019 | Infiniband | merlin6 | OFED v4.5 -> v4.6 | |
| 02.09.2019 | 02.09.2019 | PModules | merlin5,merlin6 | PModules v1.0.0rc4 -> v1.0.0rc5 | |
| 02.09.2019 | 02.09.2019 | AFS(YFS) | merlin5 | OpenAFS v1.6.22.2-236 -> YFS v188 | merlin-g-40, still running OpenAFS\* |
| 02.09.2019 | 02.09.2019 | AFS(YFS) | merlin6 | YFS v186 -> YFS v188 | |
| 02.09.2019 | 02.09.2019 | O.S. | merlin5 | RHEL7.4 -> RHEL7.6 (prod-00048) | |
| 02.09.2019 | 02.09.2019 | Slurm | merlin5,merlin6 | Slurm v18.08.6 -> v18.08.8 | |

View File

@@ -0,0 +1,49 @@
---
title: Contact
#tags:
keywords: contact, support, snow, service now, mailing list, mailing, email, mail, merlin-admins@lists.psi.ch, merlin-users@lists.psi.ch, merlin users
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/contact.html
---
## Support
Basic contact information can also be found in the *Message of the Day* when logging in to the Merlin login nodes.
Support can be requested through:
* [PSI Service Now](https://psi.service-now.com/psisp)
* E-Mail: <merlin-admins@lists.psi.ch>
### PSI Service Now
**[PSI Service Now](https://psi.service-now.com/psisp)** is the official tool for opening incident requests.
* PSI HelpDesk will redirect the incident to the corresponding department, or
* you can always assign it directly by checking the box `I know which service is affected` and providing the service name `Local HPC Resources (e.g. Merlin) [CF]` (just type in `Local` and you should get the valid completions).
### Contact Merlin6 Administrators
**E-Mail <merlin-admins@lists.psi.ch>**
* This is the official way to contact Merlin6 Administrators for discussions which do not fit well into the incident category.
Do not hesitate to contact us for such cases.
---
## Get updated through the Merlin User list!
It is strongly recommended that users subscribe to the Merlin Users mailing list: **<merlin-users@lists.psi.ch>**
This mailing list is the official channel used by Merlin6 administrators to inform users about downtimes,
interventions or problems. Users can be subscribed in two ways:
* *(Preferred way)* Self-registration through **[Sympa](https://psilists.ethz.ch/sympa/info/merlin-users)**
* If you need to subscribe many people (e.g. your whole group), send a request to the admin list **<merlin-admins@lists.psi.ch>**
  providing a list of email addresses.
---
## The Merlin Cluster Team
The PSI Merlin clusters are managed by the **[High Performance Computing and Emerging technologies Group](https://www.psi.ch/de/lsm/hpce-group)**, which
is part of the [Science IT Infrastructure, and Services department (AWI)](https://www.psi.ch/en/awi) in PSI's [Center for Scientific Computing, Theory and Data (SCD)](https://www.psi.ch/en/csd).

View File

@@ -0,0 +1,52 @@
---
title: FAQ
#tags:
keywords: faq, frequently asked questions, support
last_updated: 27 October 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/faq.html
---
{%include toc.html %}
## How do I register for Merlin?
See [Requesting Merlin Access](/merlin6/request-account.html).
## How do I get information about downtimes and updates?
See [Get updated through the Merlin User list!](/merlin6/contact.html#get-updated-through-the-merlin-user-list)
## How can I request access to a Merlin project directory?
Merlin projects are placed in the `/data/project` directory. Access to each project is controlled by Unix group membership.
If you require access to an existing project, please request group membership as described in [Requesting Unix Group Membership](/merlin6/request-project.html#requesting-unix-group-membership).
Your project leader or project colleagues will know which Unix group you should belong to. Otherwise, you can check which Unix group is allowed to access the project directory (simply run `ls -ltrhd` on the project directory), as shown below.
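For example (the project path, owner and group name below are hypothetical), the group shown in the fourth column is the one to request membership of:
```
$ ls -ltrhd /data/project/general/myproject
drwxrws--- 15 projowner unx-myproject 4.0K Sep  7 14:00 /data/project/general/myproject
```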
## Can I install software myself?
Most software can be installed in user directories without any special permissions. We recommend using `/data/user/$USER/bin` for software since home directories are fairly small. For software that will be used by multiple groups/users you can also [request the admins](/merlin6/contact.html) install it as a [module](/merlin6/using-modules.html).
How to install depends a bit on the software itself. There are three common installation procedures:
1. *binary distributions*. These are easy; just put them in a directory (e.g. `/data/user/$USER/bin`) and add that to your PATH.
2. *source compilation* using make/cmake/autoconf/etc. Usually the compilation scripts accept a `--prefix=/data/user/$USER` option specifying where to install. They then place files under `<prefix>/bin`, `<prefix>/lib`, etc. The exact syntax should be documented in the installation instructions (see the sketch after this list).
3. *conda environment*. This is now becoming standard for python-based software, including lots of the AI tools. First follow the [initial setup instructions](/merlin6/python.html#anaconda) to configure conda to use /data/user instead of your home directory. Then you can create environments like:
```
module load anaconda/2019.07
# if they provide environment.yml
conda env create -f environment.yml
# or to create manually
conda create --name myenv python==3.9 ...
conda activate myenv
```
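As an illustration of the second procedure, a sketch of a typical prefix-based source install (`myapp` and its build system are hypothetical):
```bash
# unpack the (hypothetical) sources and enter the build directory
tar xzf myapp-1.0.tar.gz && cd myapp-1.0
# configure the install location to be your user data directory
./configure --prefix=/data/user/$USER
make -j 4 && make install
# make the installed binaries and libraries visible to your shell
export PATH=/data/user/$USER/bin:$PATH
export LD_LIBRARY_PATH=/data/user/$USER/lib:$LD_LIBRARY_PATH
```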
## Something doesn't work
Check the list of [known problems](/merlin6/known-problems.html) to see if a solution is known.
If not, please [contact the admins](/merlin6/contact.html).

View File

@@ -0,0 +1,180 @@
---
title: Known Problems
#tags:
keywords: "known problems, troubleshooting, illegal instructions, paraview, ansys, shell, opengl, mesa, vglrun, module: command not found, error"
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/known-problems.html
---
## Common errors
### Illegal instruction error
It may happen that code compiled on one machine will not execute on another, failing with an error like **"Illegal instruction"**.
This is usually because the software was compiled with a set of instructions newer than the ones available on the node where the software runs,
and it mostly depends on the processor generation.
For example, `merlin-l-001` and `merlin-l-002` contain a newer generation of processors than the old GPU nodes or the Merlin5 cluster.
Hence, unless the software is compiled to be compatible with the instruction set of the older processors, it will not run on the old nodes.
Sometimes this is properly handled by default at compilation time, but not always.
For GCC, please refer to [GCC x86 Options](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html) for the relevant compiler options. In case of doubt, contact us.
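For illustration, a minimal sketch of the trade-off (`myprog.c` is a placeholder; the flags are described in the GCC documentation linked above):
```bash
# conservative build: generic x86-64 baseline, runs on all node generations
gcc -O2 -march=x86-64 -mtune=generic -o myprog myprog.c

# aggressive build: optimized for the CPU of the build host only;
# may fail with "Illegal instruction" on older nodes
gcc -O2 -march=native -o myprog myprog.c
```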
## Slurm
### sbatch using one core despite setting -c/--cpus-per-task
From **Slurm v22.05.6**, the behavior of `srun` has changed. Merlin has been updated to this version since *Tuesday 13.12.2022*.
`srun` no longer reads in `SLURM_CPUS_PER_TASK`, which is typically set when defining `-c/--cpus-per-task` in the `sbatch` command.
This means you now have to explicitly specify `-c/--cpus-per-task` on your `srun` calls as well, or set the new `SRUN_CPUS_PER_TASK` environment variable to accomplish the same thing.
Unless explicitly specified, `srun` will use only one core per task (resulting in 2 CPUs per task when multithreading is enabled).
An example for setting up `srun` with `-c/--cpus-per-task`:
```bash
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# cat mysbatch_method1
#!/bin/bash
#SBATCH -n 1
#SBATCH --cpus-per-task=8
echo 'From Slurm v22.05.8 srun does not inherit $SLURM_CPUS_PER_TASK'
srun python -c "import os; print(os.sched_getaffinity(0))"
echo 'One has to implicitly specify $SLURM_CPUS_PER_TASK'
echo 'In this example, by setting -c/--cpus-per-task in srun'
srun --cpus-per-task=$SLURM_CPUS_PER_TASK python -c "import os; print(os.sched_getaffinity(0))"
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# sbatch mysbatch_method1
Submitted batch job 8000813
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# cat slurm-8000813.out
From Slurm v22.05.8 srun does not inherit $SLURM_CPUS_PER_TASK
{1, 45}
One has to implicitly specify $SLURM_CPUS_PER_TASK
In this example, by setting -c/--cpus-per-task in srun
{1, 2, 3, 4, 45, 46, 47, 48}
```
An example to accomplish the same thing with the `SRUN_CPUS_PER_TASK` environment variable:
```bash
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# cat mysbatch_method2
#!/bin/bash
#SBATCH -n 1
#SBATCH --cpus-per-task=8
echo 'From Slurm v22.05.8 srun does not inherit $SLURM_CPUS_PER_TASK'
srun python -c "import os; print(os.sched_getaffinity(0))"
echo 'One has to implicitly specify $SLURM_CPUS_PER_TASK'
echo 'In this example, by setting an environment variable SRUN_CPUS_PER_TASK'
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK
srun python -c "import os; print(os.sched_getaffinity(0))"
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# sbatch mysbatch_method2
Submitted batch job 8000815
(base)[caubet_m@merlin-l-001:/data/user/caubet_m]# cat slurm-8000815.out
From Slurm v22.05.8 srun does not inherit $SLURM_CPUS_PER_TASK
{1, 45}
One has to implicitly specify $SLURM_CPUS_PER_TASK
In this example, by setting an environment variable SRUN_CPUS_PER_TASK
{1, 2, 3, 4, 45, 46, 47, 48}
```
## General topics
### Default SHELL
In general, **`/bin/bash` is the recommended default SHELL** when working on Merlin.
Some users might notice that BASH is not their default SHELL when logging in to Merlin systems, or they might need to run a different SHELL.
This is probably because, when the PSI account was requested, no SHELL was specified or a different one was explicitly requested.
Users can check the default SHELL configured for their PSI account with the following command:
```bash
getent passwd $USER | awk -F: '{print $NF}'
```
If the SHELL does not correspond to the one you need, you should request a central change for it.
This is because Merlin accounts are central PSI accounts. Hence, **the change must be requested via [PSI Service Now](/merlin6/contact.html#psi-service-now)**.
Alternatively, if you work on other PSI Linux systems but need a different SHELL type for Merlin, a temporary change can be performed during login startup.
You can update one of the following files:
* `~/.login`
* `~/.profile`
* Any `rc` or `profile` file in your home directory (i.e. `.cshrc`, `.bashrc`, `.bash_profile`, etc.)
with the following lines:
```bash
# Replace MY_SHELL with the shell you need
MY_SHELL=/bin/bash
exec $MY_SHELL -l
```
Notice that available *shells* can be found in the following file:
```bash
cat /etc/shells
```
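To quickly check which shell you are currently running:
```bash
echo $0   # prints the current shell, e.g. 'bash' (or '-bash' for a login shell)
```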
### 3D acceleration: OpenGL vs Mesa
Some applications can run with OpenGL support. This is only possible when the node contains a GPU card.
In general, X11 with the Mesa driver is the recommended method, as it works in all cases (no GPU required). For example, for ParaView:
```bash
module load paraview
paraview-mesa paraview # 'paraview --mesa' for old releases
```
However, if one needs to run with OpenGL support, this is still possible by running `vglrun`. For example, to run ParaView:
```bash
module load paraview
vglrun paraview
```
Officially, the supported method for running `vglrun` is by using the [NoMachine remote desktop](/merlin6/nomachine.html).
Running `vglrun` is also possible using SSH with X11 forwarding. However, it is very slow and only recommended when running
in Slurm (from [NoMachine](/merlin6/nomachine.html)). Please avoid running `vglrun` over SSH from a desktop or laptop.
## Software
### ANSYS
Sometimes, running ANSYS/Fluent requires X11 support. For that, one should run fluent as follows.
```bash
module load ANSYS
fluent -driver x11
```
### Paraview
ParaView can be run with Mesa or OpenGL support. Please refer to [OpenGL vs Mesa](/merlin6/known-problems.html#3d-acceleration-opengl-vs-mesa) for
further information on how to run it.
### Module command not found
In some circumstances the module command may not be initialized properly. For instance, you may see the following error upon logon:
```
bash: module: command not found
```
The most common cause for this is a custom `.bashrc` file which fails to source the global `/etc/bashrc` responsible for setting up PModules in some OS versions. To fix this, add the following to `$HOME/.bashrc`:
```bash
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi
```
It can also be fixed temporarily in an existing terminal by running `. /etc/bashrc` manually.

View File

@@ -0,0 +1,139 @@
---
title: Migration From Merlin5
#tags:
keywords: merlin5, merlin6, migration, rsync, archive, archiving, lts, long-term storage
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/migrating.html
---
## Directories
### Merlin5 vs Merlin6
| Cluster | Home Directory | User Home Directory | Group Home Directory |
| ------- |:-------------------- |:-------------------- |:---------------------------------------- |
| merlin5 | /gpfs/home/_$username_ | /gpfs/data/_$username_ | /gpfs/group/_$laboratory_ |
| merlin6 | /psi/home/_$username_ | /data/user/_$username_ | /data/project/_\[general\|bio\]_/_$projectname_ |
### Quota limits in Merlin6
| Directory | Quota_Type [Soft:Hard] (Block) | Quota_Type [Soft:Hard] (Files) | Quota Change Policy: Block | Quota Change Policy: Files |
| ---------------------------------- | ------------------------------ | ------------------------------ |:--------------------------------------------- |:--------------------------------------------- |
| /psi/home/$username | USR [10GB:11GB] | *Undef* | Up to x2 when strictly justified. | N/A |
| /data/user/$username               | USR [1TB:1.074TB]              | USR [1M:1.1M]                  | Immutable. A project is needed.                | Changeable when justified.                     |
| /data/project/bio/$projectname | GRP+Fileset [1TB:1.074TB] | GRP+Fileset [1M:1.1M] | Changeable according to project requirements. | Changeable according to project requirements. |
| /data/project/general/$projectname | GRP+Fileset [1TB:1.074TB] | GRP+Fileset [1M:1.1M] | Changeable according to project requirements. | Changeable according to project requirements. |
where:
* **Block** is capacity size in GB and TB
* **Files** is number of files + directories in Millions (M)
* **Quota types** are the following:
* **USR**: Quota is setup individually per user name
* **GRP**: Quota is setup individually per Unix Group name
* **Fileset**: Quota is setup per project root directory.
* The user data directory ``/data/user`` has a strict user block quota limit policy. If more disk space is required, a *project* must be created.
* Soft quotas can be exceeded for short periods of time. Hard quotas cannot be exceeded.
### Project directory
#### Why is 'project' needed?
Merlin6 introduces the concept of a *project* directory. These are the recommended location for all scientific data.
* `/data/user` is not suitable for sharing data between users
* The Merlin5 *group* directories were a similar concept, but the association with a single organizational group made
interdepartmental sharing difficult. Projects can be shared by any PSI user.
* Projects are shared by multiple users (at a minimum they should be shared with the supervisor/PI). This decreases
the chance of data being orphaned by personnel changes.
* Shared projects are preferable to individual data for transparency and accountability in event of future questions
regarding the data.
* One project member is designated as responsible. Responsibility can be transferred if needed.
#### Requesting a *project*
Refer to [Requesting a project](/merlin6/request-project.html)
---
## Migration Schedule
### Phase 1 [June]: Pre-migration
* Users keep working on Merlin5
* Merlin5 production directories: ``'/gpfs/home/'``, ``'/gpfs/data'``, ``'/gpfs/group'``
* Users may raise any problems (quota limits, inaccessible files, etc.) to merlin-admins@lists.psi.ch
* Users can start migrating data (see [Migration steps](/merlin6/migrating.html#migration-steps))
* Users should copy their data from Merlin5 ``/gpfs/data`` to Merlin6 ``/data/user``
* Users should copy their home from Merlin5 ``/gpfs/home`` to Merlin6 ``/psi/home``
* Users should report when their migration is done and which directories were migrated, so that the admins can arrange deletion of those directories.
### Phase 2 [July-October]: Migration to Merlin6
* Merlin6 becomes official cluster, and directories are switched to the new structure:
* Merlin6 production directories: ``'/psi/home/'``, ``'/data/user'``, ``'/data/project'``
* Merlin5 directories available in RW in login nodes: ``'/gpfs/home/'``, ``'/gpfs/data'``, ``'/gpfs/group'``
* In Merlin5 computing nodes, Merlin5 directories are mounted in RW: ``'/gpfs/home/'``, ``'/gpfs/data'``, ``'/gpfs/group'``
* In Merlin5 computing nodes, Merlin6 directories are mounted in RW: ``'/psi/home/'``, ``'/data/user'``, ``'/data/project'``
* Users must migrate their data (see [Migration steps](/merlin6/migrating.html#migration-steps))
* ALL data must be migrated
* Job submissions by default to Merlin6. Submission to Merlin5 computing nodes possible.
* Users should report when their migration is done and which directories were migrated, so that the admins can arrange deletion of those directories.
### Phase 3 [November]: Merlin5 Decommission
* Old Merlin5 storage unmounted.
* Migrated directories reported by users will be deleted.
* Remaining Merlin5 data will be archived.
---
## Migration steps
### Cleanup / Archive files
* Users must clean up and/or archive files, according to the quota limits of the target storage.
* If extra space is needed, we advise users to request a [project](/merlin6/request-project.html)
* If you need a larger quota with respect to the maximum allowed number of files, you can request an increase of your user quota.
### Step 1: Migrating
First migration:
```bash
rsync -avAHXS <source_merlin5> <destination_merlin6>
rsync -avAHXS /gpfs/data/$username/* /data/user/$username
```
This can take several hours or days:
* You can try running multiple rsync commands in parallel on sub-directories to increase the transfer rate (see the sketch below).
* Please do not parallelize too many directories at once; as a rule of thumb, do not run more than 10 together.
* Other users may be doing the same, and too many concurrent transfers could cause storage performance problems on the Merlin5 cluster.
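A possible way to parallelize the transfer (a sketch; the sub-directory names are placeholders):
```bash
# transfer a few top-level sub-directories concurrently (keep it well below 10)
cd /gpfs/data/$username
for dir in subdir1 subdir2 subdir3; do
    rsync -avAHXS "$dir" /data/user/$username/ &
done
wait   # block until all background transfers have finished
```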
### Step 2: Mirroring
Once the first migration is done, a second ``rsync`` should be run, this time with the ``--delete`` option. With this option ``rsync``
will delete from the destination all files that were removed from the source, and will also propagate
new files from the source to the destination.
```bash
rsync -avAHXS --delete <source_merlin5> <destination_merlin6>
rsync -avAHXS --delete /gpfs/data/$username/* /data/user/$username
```
### Step 3: Removing / Archiving old data
#### Removing migrated data
Once you have verified that everything is migrated to the new storage, the data on the old storage is ready to be deleted.
Users must report when the migration is finished, stating which directories are affected and ready to be removed.
Merlin administrators will remove the directories, always asking for a final confirmation.
#### Archiving data
Once all migrated data has been removed from the old storage, the remaining (non-migrated) data will be archived.

View File

@@ -0,0 +1,48 @@
---
title: Troubleshooting
#tags:
keywords: troubleshooting, problems, faq, known problems
last_updated: 07 September 2022
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/troubleshooting.html
---
For troubleshooting, please contact us through the official channels. See [Contact](/merlin6/contact.html)
for more information.
## Known Problems
Before contacting us for support, please check the **[Merlin6 Support: Known Problems](/merlin6/known-problems.html)** page to see if there is an existing
workaround for your specific problem.
## Troubleshooting Slurm Jobs
If you want to report a problem or ask for help with running jobs, please **always provide**
the following information:
1. Provide your batch script or, alternatively, the path to your batch script.
2. **Always** add the following commands to your batch script:
```bash
echo "User information:"; who am i
echo "Running hostname:"; hostname
echo "Current location:"; pwd
echo "User environment:"; env
echo "List of PModules:"; module list
```
3. Whenever possible, provide the Slurm JobID.
Providing this information is **extremely important** in order to ease debugging; a description
of the issue or the error message alone is insufficient in most cases.
## Troubleshooting SSH
Use the ssh command with the `-vvv` option and copy and paste (no screenshots please)
the output into your request in Service Now. Example:
```bash
ssh -Y -vvv $username@merlin-l-01.psi.ch
```

View File

@@ -0,0 +1,27 @@
---
title: Introduction
#tags:
#keywords:
last_updated: 28 June 2019
#summary: "Merlin 6 cluster overview"
sidebar: merlin6_sidebar
permalink: /merlin6/cluster-introduction.html
---
## Slurm clusters
* The new Slurm CPU cluster is called [**`merlin6`**](/merlin6/cluster-introduction.html).
* The new Slurm GPU cluster is called [**`gmerlin6`**](/gmerlin6/cluster-introduction.html)
* The old Slurm *merlin* cluster is still active, and best-effort support is provided.
The cluster was renamed to [**merlin5**](/merlin5/cluster-introduction.html).
From July 2019, **`merlin6`** became the **default Slurm cluster**: any job submitted from the login nodes will be submitted to this cluster unless otherwise specified.
* Users can keep submitting to the old *`merlin5`* computing nodes by using the option ``--cluster=merlin5``.
* Users submitting to the **`gmerlin6`** GPU cluster need to specify the option ``--cluster=gmerlin6``.
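For example (a sketch; `myjob.sh` is a placeholder for your batch script):
```bash
sbatch myjob.sh                      # goes to merlin6, the default cluster
sbatch --cluster=merlin5 myjob.sh    # goes to the old merlin5 CPU nodes
sbatch --cluster=gmerlin6 myjob.sh   # goes to the gmerlin6 GPU cluster
```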
### Slurm 'merlin6'
**CPU nodes** are configured in a **Slurm** cluster, called **`merlin6`**, and
this is the _**default Slurm cluster**_. Hence, by default, if no Slurm cluster is
specified (with the `--cluster` option), this will be the cluster to which the jobs
will be sent.

View File

@@ -0,0 +1,171 @@
---
title: Hardware And Software Description
#tags:
#keywords:
last_updated: 13 June 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/hardware-and-software.html
---
## Hardware
### Computing Nodes
The new Merlin6 cluster contains a solution based on **four** [**HPE Apollo k6000 Chassis**](https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=a00016641enw)
* *Three* of them contain 24 x [**HP Apollo XL230K Gen10**](https://h20195.www2.hpe.com/v2/GetDocument.aspx?docname=a00016634enw) blades.
* A *fourth* chassis was purchased in 2021 with [**HP Apollo XL230K Gen10**](https://h20195.www2.hpe.com/v2/GetDocument.aspx?docname=a00016634enw) blades dedicated to a few experiments. These blades have slightly different components, depending on specific project requirements.
The connectivity for the Merlin6 cluster is based on **ConnectX-5 EDR-100Gbps**, and each chassis contains:
* 1 x [HPE Apollo InfiniBand EDR 36-port Unmanaged Switch](https://h20195.www2.hpe.com/v2/getdocument.aspx?docname=a00016643enw)
* 24 internal EDR-100Gbps ports (1 port per blade for internal low latency connectivity)
* 12 external EDR-100Gbps ports (for external low latency connectivity)
<table>
<thead>
<tr>
<th scope='colgroup' style="vertical-align:middle;text-align:center;" colspan="8">Merlin6 CPU Computing Nodes</th>
</tr>
<tr>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Chassis</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Node</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Processor</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Sockets</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Cores</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Threads</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Scratch</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Memory</th>
</tr>
</thead>
<tbody>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>#0</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-c-0[01-24]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/content/www/us/en/ark/products/120491/intel-xeon-gold-6152-processor-30-25m-cache-2-10-ghz.html">Intel Xeon Gold 6152</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">44</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1.2TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">384GB</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>#1</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-c-1[01-24]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/content/www/us/en/ark/products/120491/intel-xeon-gold-6152-processor-30-25m-cache-2-10-ghz.html">Intel Xeon Gold 6152</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">44</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1.2TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">384GB</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>#2</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-c-2[01-24]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/content/www/us/en/ark/products/120491/intel-xeon-gold-6152-processor-30-25m-cache-2-10-ghz.html">Intel Xeon Gold 6152</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">44</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1.2TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">384GB</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="3"><b>#3</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-c-3[01-12]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="3"><a href="https://ark.intel.com/content/www/us/en/ark/products/199343/intel-xeon-gold-6240r-processor-35-75m-cache-2-40-ghz.html">Intel Xeon Gold 6240R</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="3">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="3">48</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="3">1.2TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="2">768GB</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td rowspan="1"><b>merlin-c-3[03-18]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td rowspan="1"><b>merlin-c-3[19-24]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">384GB</td>
</tr>
</tbody>
</table>
Each blade contains an NVMe disk, where roughly 300GB are dedicated to the O.S. and ~1.2TB are reserved for the local `/scratch`.
### Login Nodes
*One old login node* (``merlin-l-01.psi.ch``) is inherited from the previous Merlin5 cluster. Its main use is running some BIO services (`cryosparc`) and submitting jobs.
*Two new login nodes* (``merlin-l-001.psi.ch``, ``merlin-l-002.psi.ch``) with a configuration similar to the Merlin6 computing nodes are available to users. Their main use
is compiling software and submitting jobs.
The connectivity is based on **ConnectX-5 EDR-100Gbps** for the new login nodes, and **ConnectIB FDR-56Gbps** for the old one.
<table>
<thead>
<tr>
<th scope='colgroup' style="vertical-align:middle;text-align:center;" colspan="8">Merlin6 Login Nodes</th>
</tr>
<tr>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Hardware</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Node</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Processor</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Sockets</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Cores</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Threads</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Scratch</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Memory</th>
</tr>
</thead>
<tbody>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>Old</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-l-01</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/products/91768/Intel-Xeon-Processor-E5-2697A-v4-40M-Cache-2-60-GHz-">Intel Xeon E5-2697AV4</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">16</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">100GB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">512GB</td>
</tr>
<tr style="vertical-align:middle;text-align:center;" ralign="center">
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>New</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-l-00[1,2]</b></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1"><a href="https://ark.intel.com/content/www/us/en/ark/products/120491/intel-xeon-gold-6152-processor-30-25m-cache-2-10-ghz.html">Intel Xeon Gold 6152</a></td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">44</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">2</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">1.8TB</td>
<td style="vertical-align:middle;text-align:center;" rowspan="1">384GB</td>
</tr>
</tbody>
</table>
### Storage
The storage is based on the [Lenovo Distributed Storage Solution for IBM Spectrum Scale](https://lenovopress.com/lp0626-lenovo-distributed-storage-solution-for-ibm-spectrum-scale-x3650-m5):
* 2 x **Lenovo DSS G240** systems, each composed of 2 **ThinkSystem SR650** I/O nodes mounting 4 x **Lenovo Storage D3284 High Density Expansion** enclosures.
* Each I/O node has a connectivity of 400Gbps (4 x EDR 100Gbps ports, 2 of them **ConnectX-5** and 2 of them **ConnectX-4**).
The storage solution is connected to the HPC clusters through 2 x **Mellanox SB7800 InfiniBand 1U Switches** for high availability and load balancing.
### Network
Merlin6 cluster connectivity is based on [**InfiniBand**](https://en.wikipedia.org/wiki/InfiniBand) technology. This allows fast access to the data with very low latencies, as well as running
extremely efficient MPI-based jobs:
* Connectivity amongst computing nodes on different chassis ensures up to 1200Gbps of aggregated bandwidth.
* Intra-chassis connectivity (communication amongst computing nodes in the same chassis) ensures up to 2400Gbps of aggregated bandwidth.
* Communication to the storage ensures up to 800Gbps of aggregated bandwidth.
The Merlin6 cluster currently contains 5 managed InfiniBand switches and 3 unmanaged InfiniBand switches (one per HP Apollo chassis):
* 1 x **MSX6710** (FDR) for connecting old GPU nodes, old login nodes and MeG cluster to the Merlin6 cluster (and storage). No High Availability mode possible.
* 2 x **MSB7800** (EDR) for connecting Login Nodes, Storage and other nodes in High Availability mode.
* 3 x **HP EDR Unmanaged** switches, each one embedded to each HP Apollo k6000 chassis solution.
* 2 x **MSB7700** (EDR) are the top switches, interconnecting the Apollo unmanaged switches and the managed switches (MSX6710, MSB7800).
## Software
In Merlin6, we try to keep the latest software stack release to get the latest features and improvements. Due to this, **Merlin6** runs:
* [**RedHat Enterprise Linux 7**](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/7.9_release_notes/index)
* [**Slurm**](https://slurm.schedmd.com/), we usually try to keep it up to date with the most recent versions.
* [**GPFS v5**](https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/ibmspectrumscale502_welcome.html)
* [**MLNX_OFED LTS v.5.2-2.2.0.0 or newer**](https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed) for all **ConnectX-5** or superior cards.
* [MLNX_OFED LTS v.4.9-2.2.4.0](https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed) is installed for remaining **ConnectX-3** and **ConnectIB** cards.

View File

@@ -0,0 +1,250 @@
---
title: Slurm Configuration
#tags:
keywords: configuration, partitions, node definition
last_updated: 29 January 2021
summary: "This document describes a summary of the Merlin6 configuration."
sidebar: merlin6_sidebar
permalink: /merlin6/slurm-configuration.html
---
This documentation shows basic Slurm configuration and options needed to run jobs in the Merlin6 CPU cluster.
## Merlin6 CPU nodes definition
The following table shows the default and maximum resources that can be used per node:
| Nodes | Def.#CPUs | Max.#CPUs | #Threads | Max.Mem/CPU | Max.Mem/Node | Max.Swap | Def.#GPUs | Max.#GPUs |
|:--------------------:| ---------:| :--------:| :------: | :----------:| :-----------:| :-------:| :-------: | :-------: |
| merlin-c-[001-024] | 1 core | 44 cores | 2 | 352000 | 352000 | 10000 | N/A | N/A |
| merlin-c-[101-124] | 1 core | 44 cores | 2 | 352000 | 352000 | 10000 | N/A | N/A |
| merlin-c-[201-224] | 1 core | 44 cores | 2 | 352000 | 352000 | 10000 | N/A | N/A |
| merlin-c-[301-312] | 1 core | 44 cores | 2 | 748800 | 748800 | 10000 | N/A | N/A |
| merlin-c-[313-318] | 1 core | 44 cores | 1 | 748800 | 748800 | 10000 | N/A | N/A |
| merlin-c-[319-324] | 1 core | 44 cores | 2 | 748800 | 748800 | 10000 | N/A | N/A |
If nothing is specified, by default each core will use up to 8GB of memory. Memory can be increased with the `--mem=<mem_in_MB>` and
`--mem-per-cpu=<mem_in_MB>` options; the maximum memory allowed is `Max.Mem/Node`.
In **`merlin6`**, memory is considered a Consumable Resource, as is the CPU. Hence, both resources are accounted for when submitting a job,
and by default resources can not be oversubscribed. This is a main difference with the old **`merlin5`** cluster, where only CPUs were accounted
and memory was oversubscribed by default.
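For example, a minimal sketch of a batch script requesting more than the default memory per CPU (`myapp` is a placeholder for your executable):
```bash
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=8000   # request 8000MB per CPU instead of the 4000MB default

srun myapp
```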
{{site.data.alerts.tip}}Always check <b>'/etc/slurm/slurm.conf'</b> for changes in the hardware.
{{site.data.alerts.end}}
### Merlin6 CPU cluster
To run jobs in the **`merlin6`** cluster users **can optionally** specify the cluster name in Slurm:
```bash
#SBATCH --cluster=merlin6
```
If no cluster name is specified, any job will by default be submitted to this cluster (as this is the main cluster).
Hence, specifying it is only necessary when dealing with multiple clusters, or when environment
variables that modify the cluster name have been defined.
### Merlin6 CPU partitions
Users might need to specify the Slurm partition. If no partition is specified, it will default to **`general`**:
```bash
#SBATCH --partition=<partition_name> # Possible <partition_name> values: general, daily, hourly
```
The following *partitions* (also known as *queues*) are configured in Slurm:
| CPU Partition | Default Time | Max Time | Max Nodes | PriorityJobFactor\* | PriorityTier\*\* | DefMemPerCPU |
|:-----------------: | :----------: | :------: | :-------: | :-----------------: | :--------------: |:------------:|
| **<u>general</u>** | 1 day | 1 week | 50 | 1 | 1 | 4000 |
| **daily** | 1 day | 1 day | 67 | 500 | 1 | 4000 |
| **hourly** | 1 hour | 1 hour | unlimited | 1000 | 1 | 4000 |
| **asa-general** | 1 hour | 2 weeks | unlimited | 1 | 2 | 3712 |
| **asa-daily** | 1 hour | 1 week | unlimited | 500 | 2 | 3712 |
| **asa-visas** | 1 hour | 90 days | unlimited | 1000 | 4 | 3712 |
| **asa-ansys** | 1 hour | 90 days | unlimited | 1000 | 4 | 15600 |
| **mu3e** | 1 day | 7 days | unlimited | 1000 | 4 | 3712 |
\*The **PriorityJobFactor** value is added to the job priority (*PARTITION* column in `sprio -l`). In other words, jobs sent to higher priority
partitions will usually run first (however, other factors such as **job age** or, mainly, **fair share** might affect that decision). For the GPU
partitions, Slurm will also attempt to allocate jobs on partitions with higher priority before partitions with lower priority.
**\*\***Jobs submitted to a partition with a higher **PriorityTier** value will be dispatched before pending jobs in partitions with lower *PriorityTier* values
and, if possible, they will preempt running jobs from partitions with lower *PriorityTier* values.
* The **`general`** partition is the **default**. It can not have more than 50 nodes running jobs.
* For **`daily`** this limitation is extended to 67 nodes.
* For **`hourly`** there are no limits.
* **`asa-general`**, **`asa-daily`**, **`asa-ansys`**, **`asa-visas`** and **`mu3e`** are **private** partitions, belonging to the different experiments owning the machines. **Access is restricted** in all cases. However, by agreement with the experiments, these nodes are usually added to the **`hourly`** partition as extra public resources.
{{site.data.alerts.tip}}Jobs which would run for less than one day should always be sent to <b>daily</b>, while jobs that would run for less
than one hour should be sent to <b>hourly</b>. This ensures that you have higher priority over jobs sent to partitions with lower priority,
and it also matters because <b>general</b> limits the number of nodes that can be used. The idea behind this is that the cluster can not
be blocked by long jobs and resources are always ensured for shorter jobs.
{{site.data.alerts.end}}
### Merlin6 CPU Accounts
Users need to ensure that the public **`merlin`** account is specified. If no account option is specified, it will default to this account.
This is mostly relevant for users with multiple Slurm accounts, who might define a different account by mistake.
```bash
#SBATCH --account=merlin  # Possible values: merlin, gfa-asa, mu3e
```
Not all accounts can be used on all partitions. This is summarized in the table below:
| Slurm Account | Slurm Partitions |
| :------------------: | :----------------------------------: |
| **<u>merlin</u>** | `hourly`,`daily`, `general` |
| **gfa-asa** | `asa-general`,`asa-daily`,`asa-visas`,`asa-ansys`,`hourly`,`daily`, `general` |
| **mu3e** | `mu3e` |
#### Private accounts
* The *`gfa-asa`* and *`mu3e`* accounts are private accounts. These can be used for accessing dedicated
partitions with nodes owned by different groups.
### Slurm CPU specific options
Some options are available when using CPUs. These are detailed here.
Alternative Slurm options for CPU based jobs are available. Please refer to the **man** pages
for each Slurm command for further information about it (`man salloc`, `man sbatch`, `man srun`).
Below are listed the most common settings:
```bash
#SBATCH --hint=[no]multithread
#SBATCH --ntasks=<ntasks>
#SBATCH --ntasks-per-core=<ntasks>
#SBATCH --ntasks-per-socket=<ntasks>
#SBATCH --ntasks-per-node=<ntasks>
#SBATCH --mem=<size[units]>
#SBATCH --mem-per-cpu=<size[units]>
#SBATCH --cpus-per-task=<ncpus>
#SBATCH --cpu-bind=[{quiet,verbose},]<type> # only for 'srun' command
```
#### Enabling/Disabling Hyper-Threading
The **`merlin6`** cluster contains nodes with Hyper-Threading enabled. One should always specify
whether to use Hyper-Threading or not. If not defined, Slurm will generally use it (exceptions apply).
```bash
#SBATCH --hint=multithread # Use extra threads with in-core multi-threading.
#SBATCH --hint=nomultithread # Don't use extra threads with in-core multi-threading.
```
#### Constraint / Features
Slurm allows defining a set of features in the node definition. These can be used to filter and select nodes according to one or more
specific features. For the CPU nodes, we have the following features:
```
NodeName=merlin-c-[001-024,101-124,201-224] Features=mem_384gb,xeon-gold-6152
NodeName=merlin-c-[301-312] Features=mem_768gb,xeon-gold-6240r
NodeName=merlin-c-[313-318] Features=mem_768gb,xeon-gold-6240r
NodeName=merlin-c-[319-324] Features=mem_384gb,xeon-gold-6240r
```
Therefore, users running on `hourly` can select which node type they want to use (fat memory vs. regular memory nodes, CPU type).
This is possible by using the option `--constraint=<feature_name>` in Slurm.
Examples:
1. Select nodes with 48 cores only (nodes with [2 x Xeon Gold 6240R](https://ark.intel.com/content/www/us/en/ark/products/199343/intel-xeon-gold-6240r-processor-35-75m-cache-2-40-ghz.html)):
```
sbatch --constraint=xeon-gold-6240r ...
```
2. Select nodes with 44 cores only (nodes with [2 x Xeon Gold 6152](https://ark.intel.com/content/www/us/en/ark/products/120491/intel-xeon-gold-6152-processor-30-25m-cache-2-10-ghz.html)):
```
sbatch --constraint=xeon-gold-6152 ...
```
3. Select fat memory nodes only:
```
sbatch --constraint=mem_768gb ...
```
4. Select regular memory nodes only:
```
sbatch --constraint=mem_384gb ...
```
5. Select fat memory nodes with 48 cores only:
```
sbatch --constraint=mem_768gb,xeon-gold-6240r ...
```
Detailing exactly which type of nodes you want to use is important. Therefore, for groups with private accounts (`mu3e`, `gfa-asa`) and for
public users running on the `hourly` partition, *constraining nodes by features is recommended*. This becomes even more important when
working with heterogeneous clusters.
## Running jobs in the 'merlin6' cluster
In this chapter we will cover basic settings that users need to specify in order to run jobs in the Merlin6 CPU cluster.
### User and job limits
In the CPU cluster we provide some limits which basically apply to jobs and users. The idea behind this is to ensure a fair usage of the resources and to
avoid overabuse of the resources from a single user or job. However, applying limits might affect the overall usage efficiency of the cluster (in example,
pending jobs from a single user while having many idle nodes due to low overall activity is something that can be seen when user limits are applied).
In the same way, these limits can be also used to improve the efficiency of the cluster (in example, without any job size limits, a job requesting all
resources from the batch system would drain the entire cluster for fitting the job, which is undesirable).
Hence, there is a need of setting up wise limits and to ensure that there is a fair usage of the resources, by trying to optimize the overall efficiency
of the cluster while allowing jobs of different nature and sizes (it is, **single core** based **vs parallel jobs** of different sizes) to run.
{{site.data.alerts.warning}}Wide limits are provided in the <b>daily</b> and <b>hourly</b> partitions, while for <b>general</b> the limits are
more restrictive.
<br>However, we kindly ask users to inform the Merlin administrators when they plan to submit big jobs which would require a
massive draining of nodes to be allocated. This applies to jobs requiring the <b>unlimited</b> QoS (see <i>"Per job limits"</i> below).
{{site.data.alerts.end}}
{{site.data.alerts.tip}}If you have different requirements, please let us know; we will try to accommodate or propose a solution for you.
{{site.data.alerts.end}}
#### Per job limits
These are limits which apply to a single job. In other words, there is a maximum amount of resources a single job can use. Limits are described in the table below in the format `SlurmQoS(limits)` (possible `SlurmQoS` values can be listed with the command `sacctmgr show qos`). Some limits vary depending on the day and time of the week.
| Partition | Mon-Fri 0h-18h | Sun-Thu 18h-0h | From Fri 18h to Mon 0h |
|:----------: | :------------------------------: | :------------------------------: | :------------------------------: |
| **general** | normal(cpu=704,mem=2750G) | normal(cpu=704,mem=2750G) | normal(cpu=704,mem=2750G) |
| **daily** | daytime(cpu=704,mem=2750G) | nighttime(cpu=1408,mem=5500G) | unlimited(cpu=2200,mem=8593.75G) |
| **hourly** | unlimited(cpu=2200,mem=8593.75G) | unlimited(cpu=2200,mem=8593.75G) | unlimited(cpu=2200,mem=8593.75G) |
By default, a job can not use more than 704 cores (max CPU per job). In the same way, memory is also proportionally limited. This is equivalent to
running a job using up to 8 nodes at once. This limit applies to the **general** partition (fixed limit) and to the **daily** partition (during working hours only).
Limits are relaxed for the **daily** partition during non-working hours, and during the weekend limits are even wider.
For the **hourly** partition, **despite the fact that running very large parallel jobs is undesirable** (allocating such jobs requires a massive draining of nodes),
wider limits are provided. In order to avoid massive node draining in the cluster when allocating huge jobs, setting per-job limits is necessary. Hence, the **unlimited** QoS
mostly refers to "per user" limits rather than to "per job" limits (in other words, users can run any number of hourly jobs, but the size of each such job is limited,
with wide values).
#### Per user limits for CPU partitions
These are limits which apply exclusively to users. In other words, there is a maximum amount of resources a single user can use. Limits are described in the table below in the format `SlurmQoS(limits)` (possible `SlurmQoS` values can be listed with the command `sacctmgr show qos`). Some limits vary depending on the day and time of the week.
| Partition | Mon-Fri 0h-18h | Sun-Thu 18h-0h | From Fri 18h to Mon 0h |
|:-----------:| :----------------------------: | :---------------------------: | :----------------------------: |
| **general** | normal(cpu=704,mem=2750G) | normal(cpu=704,mem=2750G) | normal(cpu=704,mem=2750G) |
| **daily** | daytime(cpu=1408,mem=5500G) | nighttime(cpu=2112,mem=8250G) | unlimited(cpu=6336,mem=24750G) |
| **hourly** | unlimited(cpu=6336,mem=24750G) | unlimited(cpu=6336,mem=24750G)| unlimited(cpu=6336,mem=24750G) |
By default, users can not use more than 704 cores at the same time (max CPU per user). Memory is also proportionally limited in the same way. This is
equivalent to 8 exclusive nodes. This limit applies to the **general** partition (fixed limit) and to the **daily** partition (during working hours only).
For the **hourly** partition, user limits are removed. Limits are relaxed for the **daily** partition during non-working
hours, and during the weekend they are removed.
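The QoS limits currently in effect can be inspected directly from the Slurm accounting database, for example (the exact field selection below is an assumption and may vary with the Slurm version):
```bash
# list each QoS with its per-job (MaxTRES) and per-user (MaxTRESPU) limits
sacctmgr show qos format=Name%12,MaxTRES%30,MaxTRESPU%30
```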
## Advanced Slurm configuration
Clusters at PSI use the [Slurm Workload Manager](http://slurm.schedmd.com/) as the batch system technology for managing and scheduling jobs.
Slurm has been installed in a **multi-cluster** configuration, allowing multiple clusters to be integrated in the same batch system.
To understand the Slurm configuration of the cluster, it may be useful to check the following files:
* ``/etc/slurm/slurm.conf`` - can be found on the login nodes and computing nodes.
* ``/etc/slurm/gres.conf`` - can be found on the GPU nodes; it is also propagated to login nodes and computing nodes for user read access.
* ``/etc/slurm/cgroup.conf`` - can be found on the computing nodes; it is also propagated to login nodes for user read access.
The configuration files found on the login nodes correspond exclusively to the **merlin6** cluster.
Configuration files for the old **merlin5** cluster or for the **gmerlin6** cluster must be checked directly on one of the **merlin5** or **gmerlin6** computing nodes (for example, by logging in to one of the nodes while a job or an active allocation is running).
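Alternatively, the running configuration can be queried without opening these files (a sketch):
```bash
scontrol show config | less          # dump the active configuration of the current cluster
scontrol -M merlin5 show partition   # list the partition definitions of another cluster
```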