Clean Merlin6 docs

2026-02-10 10:46:02 +01:00
parent 6812bb6bad
commit 17053a363f
14 changed files with 85 additions and 185 deletions


@@ -63,45 +63,6 @@ Check in the **man** pages (`man sinfo`) for all possible options for this comma
gpu up 7-00:00:00 1-infinite no NO all 8 allocated merlin-g-[001-006,008-009]
```
### Slurm commander
The **[Slurm Commander (scom)](https://github.com/CLIP-HPC/SlurmCommander/)** is a simple but very useful open-source, text-based user interface for
efficient interaction with Slurm. It is developed by the **CLoud Infrastructure Project (CLIP-HPC)** with external contributions. To use it,
simply run one of the following commands:
```bash
scom # merlin6 cluster
SLURM_CLUSTERS=merlin5 scom # merlin5 cluster
SLURM_CLUSTERS=gmerlin6 scom # gmerlin6 cluster
scom -h # Help and extra options
scom -d 14 # Set Job History to 14 days (instead of default 7)
```
With this simple interface, users can interact with their jobs and obtain information about past and present jobs:
* Jobs can be filtered by substring with the `/` key.
* Users can perform multiple actions on their jobs (such as cancelling,
holding, or requeueing a job), SSH to a node where one of their jobs is
already running, or get extended details and statistics for the job itself.
Users can also check the status of the cluster, including usage statistics, node usage information, and node properties.
The interface also provides a few job templates for different use cases (e.g. MPI, OpenMP, hybrid, single core). Users can modify these templates,
save them locally in the current directory, and submit the job to the cluster.
!!! note
    Currently, `scom` does not provide live updates for the <span
    style="color:darkorange;">[Job History]</span> tab; to refresh its
    information, users have to exit the application with the <span
    style="color:darkorange;">q</span> key. Other tabs are updated every 5
    seconds (default). Note also that the <span
    style="color:darkorange;">[Job History]</span> tab contains information
    for the **merlin6** CPU cluster only; future updates will provide
    information for other clusters.
For further information about how to use **scom**, please refer to the **[Slurm Commander Project webpage](https://github.com/CLIP-HPC/SlurmCommander/)**.
!['scom' text-based user interface](../../images/slurm/scom.gif)
### Job accounting
Users can check detailed information about jobs (pending, running, completed, failed, etc.) with the `sacct` command.
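A few illustrative `sacct` queries are shown below; the job ID, date, and cluster name are placeholders, and the `--format` fields are just a reasonable selection:
```bash
sacct -j 12345678 --format=JobID,JobName,Partition,State,Elapsed,MaxRSS  # detailed view of a single job
sacct -u $USER -S 2024-01-01           # all of your jobs started since a given date
sacct --clusters=gmerlin6 -j 12345678  # query the accounting database of another cluster
```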
@@ -267,11 +228,3 @@ support:
* Nodes monitoring:
  * [Merlin6 CPU Nodes Overview](https://hpc-monitor02.psi.ch/d/JmvLR8gZz/merlin6-computing-cpu-nodes?orgId=1&refresh=10s)
  * [Merlin6 GPU Nodes Overview](https://hpc-monitor02.psi.ch/d/gOo1Z10Wk/merlin6-computing-gpu-nodes?orgId=1&refresh=10s)
### Merlin5 Monitoring Pages
* Slurm monitoring:
  * [Merlin5 Slurm Live Status](https://hpc-monitor02.psi.ch/d/o8msZJ0Zz/merlin5-slurm-live-status?orgId=1&refresh=10s)
  * [Merlin5 Slurm Overview](https://hpc-monitor02.psi.ch/d/eWLEW1AWz/merlin5-slurm-overview?orgId=1&refresh=10s)
* Nodes monitoring:
  * [Merlin5 CPU Nodes Overview](https://hpc-monitor02.psi.ch/d/ejTyWJAWk/merlin5-computing-cpu-nodes?orgId=1&refresh=10s)


@@ -49,10 +49,10 @@ The following settings are the minimum required for running a job in the Merlin
* **Clusters:** For running jobs in the different Slurm clusters, users should add the following option:
```bash
#SBATCH --clusters=<cluster_name> # Possible values: merlin5, merlin6, gmerlin6
#SBATCH --clusters=<cluster_name> # Possible values: merlin6, gmerlin6
```
Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md), [**`gmerlin6`**](../../gmerlin6/slurm-configuration.md), [**`merlin5`**](../../merlin5/slurm-configuration.md)) for further information.
Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md), [**`gmerlin6`**](../../gmerlin6/slurm-configuration.md)) for further information.
* **Partitions:** except when using the *default* partition for each cluster, one needs to specify the partition:
@@ -60,7 +60,7 @@ The following settings are the minimum required for running a job in the Merlin
#SBATCH --partition=<partition_name> # Check each cluster documentation for possible values
```
Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md), [**`gmerlin6`**](../../gmerlin6/slurm-configuration.md), [**`merlin5`**](../../merlin5/slurm-configuration.md)) for further information.
Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md), [**`gmerlin6`**](../../gmerlin6/slurm-configuration.md)) for further information.
* **[Optional] Disabling shared nodes**: by default, nodes are not exclusive, so multiple users can run jobs on the same node. One can request exclusive node usage with the following option:
@@ -74,7 +74,7 @@ The following settings are the minimum required for running a job in the Merlin
#SBATCH --time=<D-HH:MM:SS> # Cannot exceed the partition `MaxTime`
```
Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md), [**`gmerlin6`**](../../gmerlin6/slurm-configuration.md), [**`merlin5`**](../../merlin5/slurm-configuration.md)) for further information about partition `MaxTime` values.
Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md), [**`gmerlin6`**](../../gmerlin6/slurm-configuration.md)) for further information about partition `MaxTime` values.
* **Output and error files**: by default, Slurm writes standard output to `slurm-%j.out` and standard error to `slurm-%j.err` (where `%j` is the job ID) in the directory from which the job was submitted. Users can change the default names with the following options:
@@ -92,7 +92,7 @@ The following settings are the minimum required for running a job in the Merlin
#SBATCH --hint=nomultithread # Don't use extra threads with in-core multi-threading.
```
Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md), [**`gmerlin6`**](../../gmerlin6/slurm-configuration.md), [**`merlin5`**](../../merlin5/slurm-configuration.md)) for further information about node configuration and Hyper-Threading.
Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md), [**`gmerlin6`**](../../gmerlin6/slurm-configuration.md)) for further information about node configuration and Hyper-Threading.
Consider that, depending on your job requirements, you might also need to set `--ntasks-per-core` or `--cpus-per-task` (or other options) in addition to `--hint`. Please contact us in case of doubt.
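Putting the options above together, a minimal job script for the **merlin6** CPU cluster could look like the following sketch (the partition name and time limit are illustrative; check the cluster documentation for valid values):
```bash
#!/bin/bash
#SBATCH --clusters=merlin6      # Target Slurm cluster
#SBATCH --partition=general     # Illustrative partition name; see the cluster documentation
#SBATCH --time=0-01:00:00       # Must not exceed the partition MaxTime
#SBATCH --output=myjob-%j.out   # Custom standard output file ('%j' expands to the job ID)
#SBATCH --error=myjob-%j.err    # Custom standard error file
#SBATCH --hint=nomultithread    # Don't use extra threads with in-core multi-threading

srun hostname                   # Replace with the actual workload
```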
!!! tip


@@ -33,23 +33,3 @@ sshare -a # to list shares of associations to a cluster
sprio -l # to view the factors that comprise a job's scheduling priority
# add '-u <username>' for filtering user
```
## Show information for a specific cluster
By default, each of the above commands shows information about the local cluster, which is **merlin6**.
To see the same information for **merlin5**, add the parameter `--clusters=merlin5`.
To see both clusters at the same time, add the option `--federation`.
Examples:
```bash
sinfo # 'sinfo' local cluster which is 'merlin6'
sinfo --clusters=merlin5 # 'sinfo' non-local cluster 'merlin5'
sinfo --federation # 'sinfo' all clusters which are 'merlin5' & 'merlin6'
squeue # 'squeue' local cluster which is 'merlin6'
squeue --clusters=merlin5 # 'squeue' non-local cluster 'merlin5'
squeue --federation # 'squeue' all clusters which are 'merlin5' & 'merlin6'
```
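As a shorthand (standard Slurm behaviour rather than anything Merlin-specific), `--clusters` also accepts a comma-separated list of cluster names, or the special value `all`:
```bash
squeue --clusters=merlin5,merlin6 # 'squeue' on an explicit list of clusters
sinfo --clusters=all              # 'sinfo' on all clusters known to this Slurm setup
```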
---