2025-12-11 13:30:12 +01:00
parent 84f9846a0c
commit 01ac18b3f4
24 changed files with 179 additions and 190 deletions

View File

@@ -142,7 +142,7 @@ ibstat | grep Rate
## Software
In the Merlin6 GPU computing nodes, we try to keep software stack coherency with the main cluster [Merlin6](/merlin6/index.html).
In the Merlin6 GPU computing nodes, we try to keep software stack coherency with the main cluster [Merlin6](../merlin6/index.md).
Due to this, the Merlin6 GPU nodes run:
* [**RedHat Enterprise Linux 7**](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/7.9_release_notes/index)

View File

@@ -24,8 +24,8 @@ The table below shows a summary of the hardware setup for the different GPU node
| merlin-g-015 | 1 core | 48 cores | 1 | 5120 | 360448 | 360448 | 10000 | **A5000** | 1 | 8 |
| merlin-g-100 | 1 core | 128 cores | 2 | 3900 | 998400 | 998400 | 10000 | **A100** | 1 | 8 |
{{site.data.alerts.tip}}Always check <b>'/etc/slurm/gres.conf'</b> and <b>'/etc/slurm/slurm.conf'</b> for changes in the GPU type and details of the hardware.
{{site.data.alerts.end}}
!!! tip
Always check `/etc/slurm/gres.conf` and `/etc/slurm/slurm.conf` for changes in the GPU type and details of the hardware.
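For example, a minimal way to inspect the current GPU definitions (a sketch; node name taken from the table above):

```bash
# Show the GPU (GRES) definitions known to Slurm
grep -i 'gres' /etc/slurm/gres.conf /etc/slurm/slurm.conf

# Or query the scheduler directly for a specific node, e.g. merlin-g-100
scontrol show node merlin-g-100 | grep -i 'gres'
```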
## Running jobs in the 'gmerlin6' cluster

View File

@@ -23,7 +23,7 @@ resources which were mostly used by the BIO experiments.
called **`merlin5`**. In that way, the old CPU computing nodes are still available as extra computation resources,
and as an extension of the official production **`merlin6`** [Slurm](https://slurm.schedmd.com/overview.html) cluster.
The old Merlin5 _**login nodes**_, _**GPU nodes**_ and _**storage**_ were fully migrated to the **[Merlin6](/merlin6/index.html)**
The old Merlin5 _**login nodes**_, _**GPU nodes**_ and _**storage**_ were fully migrated to the **[Merlin6](../merlin6/index.md)**
cluster, which becomes the **main Local HPC Cluster**. Hence, **[Merlin6](/merlin6/index.html)**
contains the storage which is mounted on the different Merlin HPC [Slurm](https://slurm.schedmd.com/overview.html) Clusters (`merlin5`, `merlin6`, `gmerlin6`).

View File

@@ -70,15 +70,15 @@ The below table summarizes the hardware setup for the Merlin5 computing nodes:
### Login Nodes
The login nodes are part of the **[Merlin6](/merlin6/introduction.html)** HPC cluster,
The login nodes are part of the **[Merlin6](../merlin6/index.md)** HPC cluster,
and are used to compile and to submit jobs to the different ***Merlin Slurm clusters*** (`merlin5`,`merlin6`,`gmerlin6`,etc.).
Please refer to the **[Merlin6 Hardware Documentation](/merlin6/hardware-and-software.html)** for further information.
Please refer to the **[Merlin6 Hardware Documentation](../merlin6/hardware-and-software-description.md)** for further information.
### Storage
The storage is part of the **[Merlin6](/merlin6/introduction.html)** HPC cluster,
The storage is part of the **[Merlin6](../merlin6/index.md)** HPC cluster,
and is mounted in all the ***Slurm clusters*** (`merlin5`,`merlin6`,`gmerlin6`,etc.).
Please refer to the **[Merlin6 Hardware Documentation](/merlin6/hardware-and-software.html)** for further information.
Please refer to the **[Merlin6 Hardware Documentation](../merlin6/hardware-and-software-description.md)** for further information.
### Network
@@ -88,7 +88,7 @@ However, this is an old version of Infiniband which requires older drivers and s
## Software
In Merlin5, we try to keep software stack coherency with the main cluster [Merlin6](/merlin6/index.html).
In Merlin5, we try to keep software stack coherency with the main cluster [Merlin6](../merlin6/index.md).
Due to this, Merlin5 runs:
* [**RedHat Enterprise Linux 7**](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/7.9_release_notes/index)

View File

@@ -93,9 +93,8 @@ The following filesystems are mounted:
Access to the PSI Data Transfer uses ***Multi factor authentication*** (MFA).
Therefore, having the Microsoft Authenticator App is required as explained [here](https://www.psi.ch/en/computing/change-to-mfa).
{{site.data.alerts.tip}}Please follow the
<b><a href="https://www.psi.ch/en/photon-science-data-services/data-transfer">Official PSI Data Transfer</a></b> documentation for further instructions.
{{site.data.alerts.end}}
!!! tip "Official Documentation"
Please follow the [Official PSI Data Transfer](https://www.psi.ch/en/photon-science-data-services/data-transfer) documentation for further instructions.
### Directories
@@ -103,23 +102,20 @@ Therefore, having the Microsoft Authenticator App is required as explained [here
User data directories are mounted in RW.
{{site.data.alerts.warning}}Please, <b>ensure proper secured permissions</b> in your '/data/user'
directory. By default, when directory is created, the system applies the most restrictive
permissions. However, this does not prevent users for changing permissions if they wish. At this
point, users become responsible of those changes.
{{site.data.alerts.end}}
!!! warning "Secure Permissions"
Please **ensure properly secured permissions** in your `/data/user` directory. By default, when the directory is created, the system applies the most restrictive permissions. However, this does not prevent users from changing permissions if they wish. At that point, users become responsible for those changes.
#### /merlin/export
Transferring big amounts of data from outside PSI to Merlin is always possible through `/export`.
{{site.data.alerts.tip}}<b>The '/export' directory can be used by any Merlin user.</b>
This is configured in Read/Write mode. If you need access, please, contact the Merlin administrators.
{{site.data.alerts.end}}
!!! tip "Export Directory Access"
The `/export` directory can be used by any Merlin user. It is configured in Read/Write mode. If you need access, please contact the Merlin administrators.
{{site.data.alerts.warning}}The use <b>export</b> as an extension of the quota <i>is forbidden</i>.
<br><b><i>Auto cleanup policies</i></b> in the <b>export</b> area apply for files older than 28 days.
{{site.data.alerts.end}}
!!! warning "Export Usage Policy"
The use of the **export** area as an extension of the quota *is forbidden*.
Auto cleanup policies in the **export** area apply to files older than 28 days.
##### Exporting data from Merlin
@@ -139,9 +135,8 @@ Ensure to properly secure your directories and files with proper permissions.
Optionally, instead of using `/export`, Merlin project owners can request Read/Write or Read/Only access to their project directory.
{{site.data.alerts.tip}}<b>Merlin projects can request direct access.</b>
This can be configured in Read/Write or Read/Only modes. If your project needs access, please, contact the Merlin administrators.
{{site.data.alerts.end}}
!!! tip "Project Access"
Merlin projects can request direct access. This can be configured in Read/Write or Read/Only mode. If your project needs access, please contact the Merlin administrators.
## Connecting to Merlin6 from outside PSI

View File

@@ -14,15 +14,15 @@ in the login nodes and X11 forwarding can be used for those users who have prope
### Accessing from a Linux client
Refer to [{How To Use Merlin -> Accessing from Linux Clients}](/merlin6/connect-from-linux.html) for **Linux** SSH client and X11 configuration.
Refer to [{How To Use Merlin -> Accessing from Linux Clients}](../how-to-use-merlin/connect-from-linux.md) for **Linux** SSH client and X11 configuration.
### Accessing from a Windows client
Refer to [{How To Use Merlin -> Accessing from Windows Clients}](/merlin6/connect-from-windows.html) for **Windows** SSH client and X11 configuration.
Refer to [{How To Use Merlin -> Accessing from Windows Clients}](../how-to-use-merlin/connect-from-windows.md) for **Windows** SSH client and X11 configuration.
### Accessing from a MacOS client
Refer to [{How To Use Merlin -> Accessing from MacOS Clients}](/merlin6/connect-from-macos.html) for **MacOS** SSH client and X11 configuration.
Refer to [{How To Use Merlin -> Accessing from MacOS Clients}](../how-to-use-merlin/connect-from-macos.md) for **MacOS** SSH client and X11 configuration.
## NoMachine Remote Desktop Access
@@ -33,7 +33,7 @@ X applications are supported in the login nodes and can run efficiently through
### Configuring NoMachine
Refer to [{How To Use Merlin -> Remote Desktop Access}](/merlin6/nomachine.html) for further instructions of how to configure the NoMachine client and how to access it from PSI and from outside PSI.
Refer to [{How To Use Merlin -> Remote Desktop Access}](../how-to-use-merlin/nomachine.md) for further instructions on how to configure the NoMachine client and how to access it from PSI and from outside PSI.
## Login nodes hardware description

View File

@@ -23,7 +23,7 @@ The basic principle is courtesy and consideration for other users.
* It is **forbidden** to use the ``/data/user``, ``/data/project`` or ``/psi/home/`` for that purpose.
* Always remove files you do not need any more (e.g. core dumps, temporary files) as early as possible. Keep the disk space clean on all nodes.
* Prefer ``/scratch`` over ``/shared-scratch`` and use the latter only when you require the temporary files to be visible from multiple nodes.
* Read the description in **[Merlin6 directory structure](/merlin6/storage.html#merlin6-directories)** for learning about the correct usage of each partition type.
* Read the description in **[Merlin6 directory structure](../how-to-use-merlin/storage.md#merlin6-directories)** to learn about the correct usage of each partition type.
## User and project data

View File

@@ -30,8 +30,8 @@ In **`merlin6`**, Memory is considered a Consumable Resource, as well as the CPU
and by default resources cannot be oversubscribed. This is a main difference from the old **`merlin5`** cluster, where only CPUs were accounted for,
and memory was oversubscribed by default.
{{site.data.alerts.tip}}Always check <b>'/etc/slurm/slurm.conf'</b> for changes in the hardware.
{{site.data.alerts.end}}
!!! tip "Check Configuration"
Always check `/etc/slurm/slurm.conf` for changes in the hardware.
### Merlin6 CPU cluster
@@ -78,11 +78,8 @@ and, if possible, they will preempt running jobs from partitions with lower *Pri
* For **`hourly`** there are no limits.
* **`asa-general`,`asa-daily`,`asa-ansys`,`asa-visas` and `mu3e`** are **private** partitions, belonging to different experiments owning the machines. **Access is restricted** in all cases. However, by agreement with the experiments, nodes are usually added to the **`hourly`** partition as extra resources for the public resources.
{{site.data.alerts.tip}}Jobs which would run for less than one day should be always sent to <b>daily</b>, while jobs that would run for less
than one hour should be sent to <b>hourly</b>. This would ensure that you have highest priority over jobs sent to partitions with less priority,
but also because <b>general</b> has limited the number of nodes that can be used for that. The idea behind that, is that the cluster can not
be blocked by long jobs and we can always ensure resources for shorter jobs.
{{site.data.alerts.end}}
!!! tip "Partition Selection"
Jobs that run for less than one day should always be sent to **daily**, while jobs that run for less than one hour should be sent to **hourly**. This ensures that you get higher priority than jobs sent to lower-priority partitions, and it also matters because **general** limits the number of nodes that can be used. The idea is that the cluster cannot be blocked by long jobs, and resources are always available for shorter jobs.
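As a minimal sketch (module and binary are placeholders, as in the examples elsewhere in this documentation), a job expected to finish within a few hours could be routed to **daily** like this:

```bash
#!/bin/bash
#SBATCH --partition=daily    # job is expected to finish within one day
#SBATCH --time=06:00:00      # requested walltime, well below the daily limit

module load $MODULE_NAME     # where $MODULE_NAME is a software in PModules
srun $MYEXEC                 # where $MYEXEC is a path to your binary file
```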
### Merlin5 CPU Accounts
@@ -192,14 +189,11 @@ resources from the batch system would drain the entire cluster for fitting the j
Hence, there is a need to set up sensible limits and to ensure fair usage of the resources, by trying to optimize the overall efficiency
of the cluster while allowing jobs of different natures and sizes (that is, **single core** based **vs parallel jobs** of different sizes) to run.
{{site.data.alerts.warning}}Wide limits are provided in the <b>daily</b> and <b>hourly</b> partitions, while for <b>general</b> those limits are
more restrictive.
<br>However, we kindly ask users to inform the Merlin administrators when there are plans to send big jobs which would require a
massive draining of nodes for allocating such jobs. This would apply to jobs requiring the <b>unlimited</b> QoS (see below <i>"Per job limits"</i>)
{{site.data.alerts.end}}
!!! warning "Resource Limits"
Wide limits are provided in the **daily** and **hourly** partitions, while for **general** those limits are more restrictive. However, we kindly ask users to inform the Merlin administrators when they plan to submit big jobs that would require a massive draining of nodes to be allocated. This applies to jobs requiring the **unlimited** QoS (see *"Per job limits"* below).
{{site.data.alerts.tip}}If you have different requirements, please let us know, we will try to accomodate or propose a solution for you.
{{site.data.alerts.end}}
!!! tip "Custom Requirements"
If you have different requirements, please let us know, we will try to accommodate or propose a solution for you.
#### Per job limits

View File

@@ -117,8 +117,8 @@ module load $MODULE_NAME # where $MODULE_NAME is a software in PModules
srun $MYEXEC # where $MYEXEC is a path to your binary file
```
{{site.data.alerts.tip}} Also, always consider that **`'--mem-per-cpu' x '--cpus-per-task'`** can **never** exceed the maximum amount of memory per node (352000MB).
{{site.data.alerts.end}}
!!! tip "Memory Limit"
Also, always consider that `--mem-per-cpu` x `--cpus-per-task` can **never** exceed the maximum amount of memory per node (352000MB).
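For instance (hypothetical values), the product of the two options is what must stay below the per-node limit:

```bash
#SBATCH --cpus-per-task=8     # hypothetical: 8 threads per task
#SBATCH --mem-per-cpu=4000    # 8 x 4000MB = 32000MB, well below 352000MB
```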
### Example 4: Non-hyperthreaded Hybrid MPI/OpenMP job
@@ -146,8 +146,8 @@ module load $MODULE_NAME # where $MODULE_NAME is a software in PModules
srun $MYEXEC # where $MYEXEC is a path to your binary file
```
{{site.data.alerts.tip}} Also, always consider that **`'--mem-per-cpu' x '--cpus-per-task'`** can **never** exceed the maximum amount of memory per node (352000MB).
{{site.data.alerts.end}}
!!! tip "Memory Limit"
Also, always consider that `--mem-per-cpu` x `--cpus-per-task` can **never** exceed the maximum amount of memory per node (352000MB).
## GPU examples

View File

@@ -14,15 +14,15 @@ in the login nodes and X11 forwarding can be used for those users who have prope
### Accessing from a Linux client
Refer to [{How To Use Merlin -> Accessing from Linux Clients}](/merlin7/connect-from-linux.html) for **Linux** SSH client and X11 configuration.
Refer to [{How To Use Merlin -> Accessing from Linux Clients}](../02-How-To-Use-Merlin/connect-from-linux.md) for **Linux** SSH client and X11 configuration.
### Accessing from a Windows client
Refer to [{How To Use Merlin -> Accessing from Windows Clients}](/merlin7/connect-from-windows.html) for **Windows** SSH client and X11 configuration.
Refer to [{How To Use Merlin -> Accessing from Windows Clients}](../02-How-To-Use-Merlin/connect-from-windows.md) for **Windows** SSH client and X11 configuration.
### Accessing from a MacOS client
Refer to [{How To Use Merlin -> Accessing from MacOS Clients}](/merlin7/connect-from-macos.html) for **MacOS** SSH client and X11 configuration.
Refer to [{How To Use Merlin -> Accessing from MacOS Clients}](../02-How-To-Use-Merlin/connect-from-macos.md) for **MacOS** SSH client and X11 configuration.
## NoMachine Remote Desktop Access
@@ -32,7 +32,7 @@ X applications are supported in the login nodes and can run efficiently through
### Configuring NoMachine
Refer to [{How To Use Merlin -> Remote Desktop Access}](/merlin7/nomachine.html) for further instructions of how to configure the NoMachine client and how to access it from PSI and from outside PSI.
Refer to [{How To Use Merlin -> Remote Desktop Access}](../02-How-To-Use-Merlin/nomachine.md) for further instructions on how to configure the NoMachine client and how to access it from PSI and from outside PSI.
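For reference, a minimal sketch of an X11-forwarded session from a Linux or MacOS terminal (assuming `login001.merlin7.psi.ch` as the login node, as used elsewhere in this documentation):

```bash
# Enable X11 forwarding for this SSH session
ssh -X $USER@login001.merlin7.psi.ch

# Quick test once logged in: the clock should appear on your local display
xclock
```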
## Login nodes hardware description

View File

@@ -13,8 +13,8 @@ permalink: /merlin7/slurm-access.html
Merlin contains a multi-cluster setup, where multiple Slurm clusters coexist under the same umbrella.
It basically contains the following clusters:
* The **Merlin7 Slurm CPU cluster**, which is called [**`merlin7`**](/merlin7/slurm-access.html#merlin7-cpu-cluster-access).
* The **Merlin7 Slurm GPU cluster**, which is called [**`gmerlin7`**](/merlin7/slurm-access.html#merlin7-gpu-cluster-access).
* The **Merlin7 Slurm CPU cluster**, which is called [**`merlin7`**](#merlin7-cpu-cluster-access).
* The **Merlin7 Slurm GPU cluster**, which is called [**`gmerlin7`**](#merlin7-gpu-cluster-access).
## Accessing the Slurm clusters
@@ -31,10 +31,10 @@ In addition, any job *must be submitted from a high performance storage area vis
The **Merlin7 CPU cluster** (**`merlin7`**) is the default cluster configured in the login nodes. Any job submission will use by default this cluster, unless
the option `--cluster` is specified with another of the existing clusters.
For further information about how to use this cluster, please visit: [**Merlin7 CPU Slurm Cluster documentation**](/merlin7/slurm-configuration.html#cpu-cluster-merlin7).
For further information about how to use this cluster, please visit: [**Merlin7 CPU Slurm Cluster documentation**](../03-Slurm-General-Documentation/slurm-configuration.md#cpu-cluster-merlin7).
### Merlin7 GPU cluster access
The **Merlin7 GPU cluster** (**`gmerlin7`**) is visible from the login nodes. However, to submit jobs to this cluster, one needs to specify the option `--cluster=gmerlin7` when submitting a job or allocation.
For further information about how to use this cluster, please visit: [**Merlin7 GPU Slurm Cluster documentation**](/merlin7/slurm-configuration.html#gpu-cluster-gmerlin7).
For further information about how to use this cluster, please visit: [**Merlin7 GPU Slurm Cluster documentation**](../03-Slurm-General-Documentation/slurm-configuration.md#gpu-cluster-gmerlin7).
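A minimal sketch of both cases (`myjob.sh` is a placeholder submission script):

```bash
# Submit to the default CPU cluster (merlin7); no extra option is needed
sbatch myjob.sh

# Submit the same script to the GPU cluster instead
sbatch --cluster=gmerlin7 myjob.sh

# Interactive allocation on the GPU cluster
salloc --cluster=gmerlin7
```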

View File

@@ -30,16 +30,18 @@ The basic principle is courtesy and consideration for other users.
* It is **forbidden** to use the ``/data/user`` or ``/data/project`` for that purpose.
* Always remove files you do not need any more (e.g. core dumps, temporary files) as early as possible. Keep the disk space clean on all nodes.
* Prefer ``/scratch`` over ``/data/scratch/shared`` and _use the latter only when you require the temporary files to be visible from multiple nodes_.
* Read the description in **[Merlin7 directory structure](/merlin7/storage.html#merlin7-directories)** for learning about the correct usage of each partition type.
* Read the description in **[Merlin7 directory structure](../02-How-To-Use-Merlin/storage.md#merlin7-directories)** to learn about the correct usage of each partition type.
## User and project data
* ***Users are responsible for backing up their own data***. It is recommended to back up the data on third-party independent systems (e.g. LTS, Archive, AFS, SwitchDrive, Windows Shares, etc.).
* ***When a user leaves PSI, they or their supervisor/team are responsible for backing up and moving the data out of the cluster***: every few months, the storage space will be recycled for those old users who do not have an existing and valid PSI account.
{{site.data.alerts.warning}}When a user leaves PSI and his account has been removed, her storage space in Merlin may be recycled.
Hence, <b>when a user leaves PSI</b>, she, her supervisor or team <b>must ensure that the data is backed up to an external storage</b>
{{site.data.alerts.end}}
!!! warning
When a user leaves PSI and their account has been removed, their storage space
in Merlin may be recycled. Hence, **when a user leaves PSI**, they, their
supervisor, or their team **must ensure that the data is backed up to external
storage**!
## System Administrator Rights

View File

@@ -31,16 +31,15 @@ be indirectly copied to these (**decentral mode**).
Archiving can be done from any node accessible by the users (usually from the login nodes).
{{site.data.alerts.tip}} Archiving can be done in two different ways:
<br>
<b>'Central mode':</b> Possible for the user and project data directories, is the
fastest way as it does not require remote copy (data is directly retreived by central AIT servers from Merlin
through 'merlin-archive.psi.ch').
<br>
<br>
<b>'Decentral mode':</b> Possible for any directory, is the slowest way of archiving as it requires
to copy ('rsync') the data from Merlin to the central AIT servers.
{{site.data.alerts.end}}
!!! tip
Archiving can be done in two different ways:
* **Central mode**: Possible for the user and project data directories; this is
the fastest way, as it does not require a remote copy (data is retrieved
directly by the central AIT servers from Merlin through `merlin-archive.psi.ch`).
* **Decentral mode**: Possible for any directory; this is the slowest way of
archiving, as it requires copying (`rsync`) the data from Merlin to the
central AIT servers.
## Procedure
@@ -76,7 +75,7 @@ have been assigned a **``p-group``** (e.g. ``p12345``) for the experiment. Other
Groups are usually assigned to a PI, and then individual user accounts are added to the group. This must be done
under user request through PSI Service Now. For existing **a-groups** and **p-groups**, you can follow the standard
central procedures. Alternatively, if you do not know how to do that, follow the Merlin7
**[Requesting extra Unix groups](/merlin7/request-account.html#requesting-extra-unix-groups)** procedure, or open
**[Requesting extra Unix groups](../01-Quick-Start-Guide/requesting-accounts.md#requesting-extra-unix-groups)** procedure, or open
a **[PSI Service Now](https://psi.service-now.com/psisp)** ticket.
### Documentation

View File

@@ -26,7 +26,7 @@ If they are missing, you can install them using the Software Kiosk icon on the D
Official X11 Forwarding support is through NoMachine. Please follow the document
[{Job Submission -> Interactive Jobs}](../03-Slurm-General-Documentation/interactive-jobs.md#requirements) and
[{Accessing Merlin -> NoMachine}](/merlin7/nomachine.html) for more details. However,
[{Accessing Merlin -> NoMachine}](../02-How-To-Use-Merlin/nomachine.md) for more details. However,
we provide a small recipe for enabling X11 Forwarding in Windows.
Check, if the **Xming** is installed on the Windows workstation by inspecting the

View File

@@ -40,10 +40,10 @@ Path SpaceUsed SpaceQuota Space % FilesUsed FilesQuota Files %
└─ bio/hpce
```
{{site.data.alerts.tip}}You can change the width of the table by either passing
<code>--no-wrap</code> (to disable wrapping of the <i>Path</i>) or <code>--width N</code>
(to explicitly set some width by <code>N</code> characters).
{{site.data.alerts.end}}
!!! tip
You can change the width of the table by passing either `--no-wrap` (to
disable wrapping of the *Path* column) or `--width N` (to explicitly set the
width to `N` characters).
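For example (a sketch, assuming the `merlin_quotas` tool whose output is shown above; output elided):

```bash
# Disable wrapping of the Path column entirely
merlin_quotas --no-wrap

# Or force the table to a fixed width of 120 characters
merlin_quotas --width 120
```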
#### Example #2: Project view
@@ -86,9 +86,9 @@ Project ID Path Owner Group
600000013 /data/project/bio/steinmetz steinmetz unx-bio_steinmetz
```
{{site.data.alerts.tip}}As above you can change the table width by pass either
<code>--no-wrap</code> or <code>--width N</code>.
{{site.data.alerts.end}}
!!! tip
As above, you can change the table width by passing either `--no-wrap` or
`--width N`.
#### Example #3: Project config

View File

@@ -22,17 +22,15 @@ Key Features:
* **Broad Availability:** Commonly used software, such as OpenMPI, ANSYS, MATLAB, and others, is provided within PModules.
* **Custom Requests:** If a package, version, or feature is missing, users can contact the support team to explore feasibility for installation.
{{site.data.alerts.tip}}
For further information about **PModules** on Merlin7 please refer to the [PSI Modules](../05-Software-Support/pmodules.md) chapter.
{{site.data.alerts.end}}
!!! tip
For further information about **PModules** on Merlin7 please refer to the [PSI Modules](../05-Software-Support/pmodules.md) chapter.
### Spack Modules
Merlin7 also provides Spack modules, offering a modern and flexible package management system. Spack supports a wide variety of software packages and versions. For more information, refer to the **external [PSI Spack](https://gitea.psi.ch/HPCE/spack-psi) documentation**.
{{site.data.alerts.tip}}
For further information about **Spack** on Merlin7 please refer to the [Spack](../05-Software-Support/spack.md) chapter.
{{site.data.alerts.end}}
!!! tip
For further information about **Spack** on Merlin7 please refer to the [Spack](../05-Software-Support/spack.md) chapter.
### Cray Environment Modules
@@ -44,7 +42,6 @@ Recommendations:
* **Compiling Software:** Cray modules can be used when optimization for Cray hardware is essential.
* **General Use:** For most applications, prefer PModules, which ensure stability, backward compatibility, and long-term support.
{{site.data.alerts.tip}}
For further information about **CPE** on Merlin7 please refer to the [Cray Modules](../05-Software-Support/cray-module.env.md) chapter.
{{site.data.alerts.end}}
!!! tip
For further information about **CPE** on Merlin7 please refer to the [Cray Modules](../05-Software-Support/cray-module.env.md) chapter.
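As a brief sketch of typical PModules usage (module name and version are examples only; the commands follow the standard environment-modules interface):

```bash
# Discover which packages and versions are available
module avail

# Load an example package provided in PModules (adjust name/version as needed)
module load ANSYS/2025R2

# List what is currently loaded
module list
```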

View File

@@ -18,9 +18,11 @@ This document describes the different directories of the Merlin7 cluster.
* ***Users are responsible for backing up their own data***. It is recommended to back up the data on third-party independent systems (e.g. LTS, Archive, AFS, SwitchDrive, Windows Shares, etc.).
* ***When a user leaves PSI, they or their supervisor/team are responsible for backing up and moving the data out of the cluster***: every few months, the storage space will be recycled for those old users who do not have an existing and valid PSI account.
{{site.data.alerts.warning}}When a user leaves PSI and their account is removed, their storage space in Merlin may be recycled.
Hence, <b>when a user leaves PSI</b>, they, their supervisor or team <b>must ensure that the data is backed up to an external storage</b>
{{site.data.alerts.end}}
!!! warning
When a user leaves PSI and their account is removed, their storage space in
Merlin may be recycled. Hence, **when a user leaves PSI**, they, their
supervisor, or their team **must ensure that the data is backed up to external
storage**!
### How to check quotas
@@ -43,20 +45,22 @@ Path SpaceUsed SpaceQuota Space % FilesUsed FilesQuota Files %
└─ bio/hpce
```
{{site.data.alerts.note}}On first use you will see a message about some configuration being generated, this is expected. Don't be
surprised that it takes some time. After this using <code>merlin_quotas</code> should be faster.
{{site.data.alerts.end}}
!!! note
On first use you will see a message about some configuration being
generated; this is expected. Don't be surprised if it takes some time.
After this, using `merlin_quotas` should be faster.
The output shows the quotas set and how much of each quota you are using, for each filesystem that has quotas set. Notice that some users will have
one or more `/data/project/...` directories showing, depending on whether you are part of a specific PSI research group or project.
The general quota constraints for the different directories are shown in the [table below](#dir_classes). Further details on how to use `merlin_quotas`
can be found on the [Tools page](/merlin7/tools.html).
can be found on the [Tools page](merlin_tools.md).
{{site.data.alerts.tip}}If you're interesting, you can retrieve the Lustre-based quota information directly by calling
<code>lfs quota -h -p $(( 100000000 + $(id -u $USER) )) /data</code> directly. Using the <code>merlin_quotas</code> command is more
convenient and shows all your relevant filesystem quotas.
{{site.data.alerts.end}}
!!! tip
If you're interested, you can retrieve the Lustre-based quota information
directly by calling `lfs quota -h -p $(( 100000000 + $(id -u $USER) ))
/data`. Using the `merlin_quotas` command is more convenient and
shows all your relevant filesystem quotas.
## Merlin7 directories
@@ -70,11 +74,14 @@ Merlin7 offers the following directory classes for users:
* `/scratch`: Local *scratch* disk (only visible by the node running a job).
* `/data/scratch/shared`: Shared *scratch* disk (visible from all nodes).
{{site.data.alerts.tip}}In Lustre there is a concept called <b>grace time</b>. Filesystems have a block (amount of data) and inode (number of files) quota.
These quotas contain a soft and hard limits. Once the soft limit is reached, users can keep writing up to their hard limit quota during the <b>grace period</b>.
Once the <b>grace time</b> or hard limit are reached, users will be unable to write and will need remove data below the soft limit (or ask for a quota increase
when this is possible, see below table).
{{site.data.alerts.end}}
!!! tip
In Lustre there is a concept called **grace time**. Filesystems have a
block (amount of data) and an inode (number of files) quota. Each quota
has a soft and a hard limit. Once the soft limit is reached, users can
keep writing up to their hard limit during the **grace period**. Once the
**grace time** expires or the hard limit is reached, users will be unable to
write and will need to remove data below the soft limit (or ask for a quota
increase where this is possible, see the table below).
<a name="dir_classes"></a>Properties of the directory classes:
@@ -86,10 +93,11 @@ when this is possible, see below table).
| /data/scratch/shared | USR [512GB:2TB] | | 7d | Up to x2 when strongly justified. | Changeable when justified. | no |
| /scratch | *Undef* | *Undef* | N/A | N/A | N/A | no |
{{site.data.alerts.warning}}The use of <b>/scratch</b> and <b>/data/scratch/shared</b> areas as an extension of the quota <i>is forbidden</i>. The <b>/scratch</b> and
<b>/data/scratch/shared</b> areas <i>must not contain</i> final data. Keep in mind that <br><b><i>auto cleanup policies</i></b> in the <b>/scratch</b> and
<b>/data/scratch/shared</b> areas are applied.
{{site.data.alerts.end}}
!!! warning
The use of `/scratch` and `/data/scratch/shared` areas as an extension of
the quota *is forbidden*. The `/scratch` and `/data/scratch/shared` areas
***must not contain*** final data. Keep in mind that ***auto cleanup
policies*** in the `/scratch` and `/data/scratch/shared` areas are applied.
### User home directory
@@ -134,9 +142,10 @@ Project quotas are defined in a per Lustre project basis. Users can check the pr
lfs quota -h -p $projectid /data
```
{{site.data.alerts.warning}}Checking <b>quotas</b> for the Merlin projects is not yet possible.
In the future, a list of `projectid` will be provided, so users can check their quotas.
{{site.data.alerts.end}}
!!! warning
Checking **quotas** for the Merlin projects is not yet possible. In the
future, a list of `projectid` will be provided, so users can check their
quotas.
Directory policies:
@@ -178,7 +187,7 @@ and all tasks need to do I/O on the same temporary files.
Scratch directories policies:
* Read **[Important: Code of Conduct](/merlin7/code-of-conduct.html)** for more information about Merlin7 policies.
* Read **[Important: Code of Conduct](../01-Quick-Start-Guide/code-of-conduct.md)** for more information about Merlin7 policies.
* By default, *always* use **local** first and only use **shared** if your specific use case requires it.
* Temporary files *must be deleted at the end of the job by the user*.
* Remaining files will be deleted by the system if detected.

View File

@@ -55,12 +55,12 @@ rsync -avAHXS <src> <dst>
```bash
rsync -avAHXS ~/localdata $USER@login001.merlin7.psi.ch:/data/project/general/myproject/
```
{{site.data.alerts.tip}}
If a transfer is interrupted, just rerun the command: <code>rsync</code> will skip existing files.
{{site.data.alerts.end}}
{{site.data.alerts.warning}}
Rsync uses SSH (port 22). For large datasets, transfer speed might be limited.
{{site.data.alerts.end}}
!!! tip
If a transfer is interrupted, just rerun the command: `rsync` will skip existing files.
!!! warning
Rsync uses SSH (port 22). For large datasets, transfer speed might be limited.
### SCP
@@ -76,9 +76,8 @@ A `vsftpd` service is available on the login nodes, providing high-speed transfe
* **`service03.merlin7.psi.ch`**: Encrypted control channel only.
Use if your data can be transferred unencrypted. **Fastest** method.
{{site.data.alerts.tip}}
The <b>control channel</b> is always <b>encrypted</b>, therefore, authentication is encrypted and secured.
{{site.data.alerts.end}}
!!! tip
The **control channel** is always **encrypted**; therefore, authentication is encrypted and secure.
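As an illustrative sketch only (client choice and TLS settings are assumptions, not an official recipe), a transfer to the unencrypted-data endpoint could look like this with `lftp`:

```bash
# Explicit FTPS: TLS on the control channel (login, commands),
# plain data channel for maximum transfer speed
lftp -e 'set ftp:ssl-force true; set ftp:ssl-protect-data false' \
     -u $USER service03.merlin7.psi.ch
```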
## UI-based Clients for Data Transfer
### WinSCP (Windows)
@@ -125,9 +124,8 @@ The service is designed to **send large files for temporary availability**, not
3. File remains available until the specified **expiration date** is reached, or the **download limit** is reached.
4. The file is **automatically deleted** after expiration.
{{site.data.alerts.warning}}
SWITCHfilesender <b>is not</b> a long-term storage or archiving solution.
{{site.data.alerts.end}}
!!! warning
SWITCHfilesender **is not** a long-term storage or archiving solution.
## PSI Data Transfer
@@ -144,9 +142,10 @@ Notice that `datatransfer.psi.ch` does not allow SSH login, only `rsync`, `scp`
Access to the PSI Data Transfer uses ***Multi factor authentication*** (MFA).
Therefore, having the Microsoft Authenticator App is required as explained [here](https://www.psi.ch/en/computing/change-to-mfa).
{{site.data.alerts.tip}}Please follow the
<b><a href="https://www.psi.ch/en/photon-science-data-services/data-transfer">Official PSI Data Transfer</a></b> documentation for further instructions.
{{site.data.alerts.end}}
!!! tip
Please follow the [Official PSI Data
Transfer](https://www.psi.ch/en/photon-science-data-services/data-transfer)
documentation for further instructions.
## Connecting to Merlin7 from outside PSI

View File

@@ -7,9 +7,8 @@ This partition allows CPU oversubscription (up to four users may share the same
On the **`gmerlin7`** cluster, additional interactive partitions are available, but these are primarily intended for CPU-only workloads (such as compiling GPU-based software, or creating an allocation for submitting jobs to Grace-Hopper nodes).
{{site.data.alerts.warning}}
Because <b>GPU resources are scarce and expensive</b>, interactive allocations on GPU nodes that use GPUs should only be submitted when strictly necessary and well justified.
{{site.data.alerts.end}}
!!! warning
Because **GPU resources are scarce and expensive**, interactive allocations on GPU nodes that use GPUs should only be submitted when strictly necessary and well justified.
## Running interactive jobs

View File

@@ -86,9 +86,8 @@ adjusted.
For MPI-based jobs, where performance generally improves with single-threaded CPUs, this option is recommended.
In such cases, you should double the **`--mem-per-cpu`** value to account for the reduced number of threads.
{{site.data.alerts.tip}}
Always verify the Slurm <b>'/var/spool/slurmd/conf-cache/slurm.conf'</b> configuration file for potential changes.
{{site.data.alerts.end}}
!!! tip
Always verify the Slurm `/var/spool/slurmd/conf-cache/slurm.conf` configuration file for potential changes.
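For example (a minimal sketch), the node and partition definitions can be pulled out directly:

```bash
# Show node and partition definitions from the cached Slurm configuration
grep -E '^(NodeName|PartitionName)' /var/spool/slurmd/conf-cache/slurm.conf
```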
### User and job limits with QoS
@@ -132,11 +131,10 @@ Where:
* **`cpu_interactive` QoS:** Is restricted to one node and a few CPUs only, and is intended to be used when interactive
allocations are necessary (`salloc`, `srun`).
For additional details, refer to the [CPU partitions](slurm-configuration.md#CPU-partitions) section.
For additional details, refer to the [CPU partitions](#cpu-partitions) section.
{{site.data.alerts.tip}}
Always verify QoS definitions for potential changes using the <b>'sacctmgr show qos format="Name%22,MaxTRESPU%35,MaxTRES%35"'</b> command.
{{site.data.alerts.end}}
!!! tip
Always verify QoS definitions for potential changes using the `sacctmgr show qos format="Name%22,MaxTRESPU%35,MaxTRES%35"` command.
### CPU partitions
@@ -151,11 +149,10 @@ Key concepts:
partitions, where applicable.
* **`QoS`**: Specifies the quality of service associated with a partition. It is used to control and restrict resource availability
for specific partitions, ensuring that resource allocation aligns with intended usage policies. Detailed explanations of the various
QoS settings can be found in the [User and job limits with QoS](/merlin7/slurm-configuration.html#user-and-job-limits-with-qos) section.
QoS settings can be found in the [User and job limits with QoS](#user-and-job-limits-with-qos) section.
{{site.data.alerts.tip}}
Always verify partition configurations for potential changes using the <b>'scontrol show partition'</b> command.
{{site.data.alerts.end}}
!!! tip
Always verify partition configurations for potential changes using the `scontrol show partition` command.
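For instance (a sketch, using the partition names listed below):

```bash
# Inspect a single partition, e.g. the default 'general' partition
scontrol show partition general

# Or dump the configuration of all partitions at once
scontrol show partition
```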
#### CPU public partitions
@@ -169,11 +166,11 @@ Always verify partition configurations for potential changes using the <b>'scon
All Merlin users are part of the `merlin` account, which is used as the *default account* when submitting jobs.
Similarly, if no partition is specified, jobs are automatically submitted to the `general` partition by default.
{{site.data.alerts.tip}}
For jobs running less than one day, submit them to the <b>daily</b> partition.
For jobs running less than one hour, use the <b>hourly</b> partition.
These partitions provide higher priority and ensure quicker scheduling compared to <b>general</b>, which has limited node availability.
{{site.data.alerts.end}}
!!! tip
For jobs running less than one day, submit them to the **daily** partition.
For jobs running less than one hour, use the **hourly** partition. These
partitions provide higher priority and ensure quicker scheduling compared
to **general**, which has limited node availability.
The **`hourly`** partition may include private nodes as an additional buffer. However, the current Slurm partition configuration, governed
by **`PriorityTier`**, ensures that jobs submitted to private partitions are prioritized and processed first. As a result, access to the
@@ -188,10 +185,10 @@ before any jobs in other partitions.
* **Intended Use:** This partition is ideal for debugging, testing, compiling, short interactive runs, and other activities where
immediate access is important.
{{site.data.alerts.warning}}
Because of CPU sharing, the performance on the **'interactive'** partition may not be optimal for compute-intensive tasks.
For long-running or production workloads, use a dedicated batch partition instead.
{{site.data.alerts.end}}
!!! warning
Because of CPU sharing, the performance on the **interactive** partition
may not be optimal for compute-intensive tasks. For long-running or
production workloads, use a dedicated batch partition instead.
#### CPU private partitions
@@ -261,9 +258,8 @@ adjusted.
For MPI-based jobs, where performance generally improves with single-threaded CPUs, this option is recommended.
In such cases, you should double the **`--mem-per-cpu`** value to account for the reduced number of threads.
{{site.data.alerts.tip}}
Always verify the Slurm <b>'/var/spool/slurmd/conf-cache/slurm.conf'</b> configuration file for potential changes.
{{site.data.alerts.end}}
!!! tip
Always verify the Slurm `/var/spool/slurmd/conf-cache/slurm.conf` configuration file for potential changes.
### User and job limits with QoS
@@ -308,11 +304,10 @@ Where:
* **`gpu_a100_interactive` & `gpu_gh_interactive` QoS:** Guarantee interactive access to GPU nodes for software compilation and
small testing.
For additional details, refer to the [GPU partitions](slurm-configuration.md#GPU-partitions) section.
For additional details, refer to the [GPU partitions](#gpu-partitions) section.
{{site.data.alerts.tip}}
Always verify QoS definitions for potential changes using the <b>'sacctmgr show qos format="Name%22,MaxTRESPU%35,MaxTRES%35"'</b> command.
{{site.data.alerts.end}}
!!! tip
Always verify QoS definitions for potential changes using the `sacctmgr show qos format="Name%22,MaxTRESPU%35,MaxTRES%35"` command.
### GPU partitions
@@ -327,11 +322,10 @@ Key concepts:
partitions, where applicable.
* **`QoS`**: Specifies the quality of service associated with a partition. It is used to control and restrict resource availability
for specific partitions, ensuring that resource allocation aligns with intended usage policies. Detailed explanations of the various
QoS settings can be found in the [User and job limits with QoS](/merlin7/slurm-configuration.html#user-and-job-limits-with-qos) section.
QoS settings can be found in the [User and job limits with QoS](#user-and-job-limits-with-qos) section.
{{site.data.alerts.tip}}
Always verify partition configurations for potential changes using the <b>'scontrol show partition'</b> command.
{{site.data.alerts.end}}
!!! tip
Always verify partition configurations for potential changes using the `scontrol show partition` command.
#### A100-based partitions
@@ -345,11 +339,12 @@ Always verify partition configurations for potential changes using the <b>'scon
All Merlin users are part of the `merlin` account, which is used as the *default account* when submitting jobs.
Similarly, if no partition is specified, jobs are automatically submitted to the `general` partition by default.
{{site.data.alerts.tip}}
For jobs running less than one day, submit them to the <b>a100-daily</b> partition.
For jobs running less than one hour, use the <b>a100-hourly</b> partition.
These partitions provide higher priority and ensure quicker scheduling compared to <b>a100-general</b>, which has limited node availability.
{{site.data.alerts.end}}
!!! tip
For jobs running less than one day, submit them to the **a100-daily**
partition. For jobs running less than one hour, use the **a100-hourly**
partition. These partitions provide higher priority and ensure quicker
scheduling compared to **a100-general**, which has limited node
availability.
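A minimal sketch of a submission targeting the A100 nodes (GPU count, walltime and binary are placeholders):

```bash
#!/bin/bash
#SBATCH --cluster=gmerlin7        # GPU cluster
#SBATCH --partition=a100-daily    # job finishes within one day
#SBATCH --gpus=1                  # placeholder: number of A100 GPUs needed
#SBATCH --time=08:00:00           # placeholder walltime

srun $MYEXEC                      # where $MYEXEC is a path to your binary file
```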
#### GH-based partitions
@@ -363,8 +358,8 @@ These partitions provide higher priority and ensure quicker scheduling compared
All Merlin users are part of the `merlin` account, which is used as the *default account* when submitting jobs.
Similarly, if no partition is specified, jobs are automatically submitted to the `general` partition by default.
{{site.data.alerts.tip}}
For jobs running less than one day, submit them to the <b>gh-daily</b> partition.
For jobs running less than one hour, use the <b>gh-hourly</b> partition.
These partitions provide higher priority and ensure quicker scheduling compared to <b>gh-general</b>, which has limited node availability.
{{site.data.alerts.end}}
!!! tip
For jobs running less than one day, submit them to the **gh-daily**
partition. For jobs running less than one hour, use the **gh-hourly**
partition. These partitions provide higher priority and ensure quicker
scheduling compared to **gh-general**, which has limited node availability.

View File

@@ -42,7 +42,7 @@ queue. Additional customization can be implemented using the *'Optional user
defined line to be added to the batch launcher script'* option. This line is
added to the submission script at the end of other `#SBATCH` lines. Parameters can
be passed to SLURM by starting the line with `#SBATCH`, like in [Running Slurm
Scripts](/merlin7/running-jobs.html). Some ideas:
Scripts](../slurm-general-docs/running-jobs.md). Some ideas:
**Request additional memory**

View File

@@ -12,8 +12,8 @@ permalink: /merlin7/ansys-rsm.html
**ANSYS Remote Solve Manager (RSM)** is used by ANSYS Workbench to submit computational jobs to HPC clusters directly from Workbench on your desktop.
{{site.data.alerts.warning}} Merlin7 is running behind a firewall, however, there are firewall policies in place to access the Merlin7 ANSYS RSM service from the main PSI networks. If you can not connect to it, please contact us, and please provide the IP address for the corresponding workstation: we will check the PSI firewall rules in place and request for an update if necessary.
{{site.data.alerts.end}}
!!! warning
Merlin7 is running behind a firewall; however, there are firewall policies in place to access the Merlin7 ANSYS RSM service from the main PSI networks. If you cannot connect to it, please contact us and provide the IP address of the corresponding workstation: we will check the PSI firewall rules in place and request an update if necessary.
### The Merlin7 RSM service
@@ -62,9 +62,8 @@ PSI account to authenticate. Notice that the **`PSICH\`** prefix **must not be a
6. *[Optional]* You can perform a test by submitting a test job on each partition by clicking on the **Submit** button
for each selected partition.
{{site.data.alerts.tip}}
In the future, we might provide this service also from the login nodes for better transfer performance.
{{site.data.alerts.end}}
!!! tip
In the future, we might provide this service also from the login nodes for better transfer performance.
## Using RSM in ANSYS

View File

@@ -70,8 +70,8 @@ ANSYS/2025R2:
</details>
{{site.data.alerts.tip}}Please always run <b>ANSYS/2024R2 or superior</b>.
{{site.data.alerts.end}}
!!! tip
Please always run **ANSYS/2024R2 or newer**.
## ANSYS Documentation by product
@@ -84,12 +84,12 @@ For further information, please visit the **[ANSYS RSM](ansys-rsm.md)** section.
### ANSYS Fluent
ANSYS Fluent is not currently documented for Merlin7. Please refer to the [Merlin6 documentation](../merlin6/software-support/ansys-fluent.md) for information about ANSYS Fluent on Merlin6.
ANSYS Fluent is not currently documented for Merlin7. Please refer to the [Merlin6 documentation](../../merlin6/software-support/ansys-fluent.md) for information about ANSYS Fluent on Merlin6.
### ANSYS CFX
ANSYS CFX is not currently documented for Merlin7. Please refer to the [Merlin6 documentation](../merlin6/software-support/ansys-cfx.md) for information about ANSYS CFX on Merlin6.
ANSYS CFX is not currently documented for Merlin7. Please refer to the [Merlin6 documentation](../../merlin6/software-support/ansys-cfx.md) for information about ANSYS CFX on Merlin6.
### ANSYS MAPDL
ANSYS MAPDL is not currently documented for Merlin7. Please refer to the [Merlin6 documentation](../merlin6/software-support/ansys-mapdl.md) for information about ANSYS MAPDL on Merlin6.
ANSYS MAPDL is not currently documented for Merlin7. Please refer to the [Merlin6 documentation](../../merlin6/software-support/ansys-mapdl.md) for information about ANSYS MAPDL on Merlin6.

View File

@@ -52,9 +52,8 @@ Example Usage:
srun ./app
```
{{site.data.alerts.tip}}
Always run OpenMPI applications with <b>srun</b> for a seamless experience.
{{site.data.alerts.end}}
!!! tip
Always run OpenMPI applications with `srun` for a seamless experience.
### PMIx Support in Merlin7
@@ -69,12 +68,14 @@ MPI plugin types are...
cray_shasta
specific pmix plugin versions available: pmix_v5,pmix_v4,pmix_v3,pmix_v2
```
Important Notes:
* For OpenMPI, always use `pmix` by specifying the appropriate version (`pmix_$version`).
When loading an OpenMPI module (via [PModules](pmodules.md) or [Spack](spack.md)), the corresponding PMIx version will be automatically loaded.
* Users do not need to manually manage PMIx compatibility.
{{site.data.alerts.warning}}
PMI-2 is not supported in OpenMPI 5.0.0 or later releases.
Despite this, <b>pmi2</b> remains the default SLURM PMI type in Merlin7 as it is the officially supported type and maintains compatibility with other MPI implementations.
{{site.data.alerts.end}}
!!! warning
PMI-2 is not supported in OpenMPI 5.0.0 or later releases. Despite this,
**pmi2** remains the default SLURM PMI type in Merlin7 as it is the
officially supported type and maintains compatibility with other MPI
implementations.
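As a short sketch of the launch modes described above (`./app` is a placeholder binary):

```bash
# List the PMI plugin types supported by this SLURM installation
srun --mpi=list

# Launch an OpenMPI 5.x application with PMIx (pick one of the listed versions)
srun --mpi=pmix_v5 ./app

# Other MPI implementations can usually rely on the default pmi2 type
srun ./app
```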