Clean Merlin6 docs
All checks were successful
Build and deploy documentation / build-and-deploy-docs (push) Successful in 24s
@@ -63,45 +63,6 @@ Check in the **man** pages (`man sinfo`) for all possible options for this comma
 gpu up 7-00:00:00 1-infinite no NO all 8 allocated merlin-g-[001-006,008-009]
 ```
 
-### Slurm commander
-
-The **[Slurm Commander (scom)](https://github.com/CLIP-HPC/SlurmCommander/)** is a simple but very useful open source text-based user interface for
-simple and efficient interaction with Slurm. It is developed by the **CLoud Infrastructure Project (CLIP-HPC)** and external contributions. To use it, one can
-simply run the following command:
-
-```bash
-scom # merlin6 cluster
-SLURM_CLUSTERS=merlin5 scom # merlin5 cluster
-SLURM_CLUSTERS=gmerlin6 scom # gmerlin6 cluster
-scom -h # Help and extra options
-scom -d 14 # Set Job History to 14 days (instead of default 7)
-```
-
-With this simple interface, users can interact with their jobs, as well as getting information about past and present jobs:
-
-* Filtering jobs by substring is possible with the `/` key.
-* Users can perform multiple actions on their jobs (such like cancelling,
-holding or requeing a job), SSH to a node with an already running job,
-or getting extended details and statistics of the job itself.
-
-Also, users can check the status of the cluster, to get statistics and node usage information as well as getting information about node properties.
-
-The interface also provides a few job templates for different use cases (i.e. MPI, OpenMP, Hybrid, single core). Users can modify these templates,
-save it locally to the current directory, and submit the job to the cluster.
-
-!!! note
-Currently, `scom` does not provide live updated information for the <span
-style="color:darkorange;">[Job History]</span> tab. To update Job History
-information, users have to exit the application with the <span
-style="color:darkorange;">q</span> key. Other tabs will be updated every 5
-seconds (default). On the other hand, the <span style="color:darkorange;">[Job
-History]</span> tab contains only information for the **merlin6** CPU cluster
-only. Future updates will provide information for other clusters.
-
-For further information about how to use **scom**, please refer to the **[Slurm Commander Project webpage](https://github.com/CLIP-HPC/SlurmCommander/)**
-
-
-
 ### Job accounting
 
 Users can check detailed information of jobs (pending, running, completed, failed, etc.) with the `sacct` command.
@@ -267,11 +228,3 @@ support:
 * Nodes monitoring:
 * [Merlin6 CPU Nodes Overview](https://hpc-monitor02.psi.ch/d/JmvLR8gZz/merlin6-computing-cpu-nodes?orgId=1&refresh=10s)
 * [Merlin6 GPU Nodes Overview](https://hpc-monitor02.psi.ch/d/gOo1Z10Wk/merlin6-computing-gpu-nodes?orgId=1&refresh=10s)
-
-### Merlin5 Monitoring Pages
-
-* Slurm monitoring:
-* [Merlin5 Slurm Live Status](https://hpc-monitor02.psi.ch/d/o8msZJ0Zz/merlin5-slurm-live-status?orgId=1&refresh=10s)
-* [Merlin5 Slurm Overview](https://hpc-monitor02.psi.ch/d/eWLEW1AWz/merlin5-slurm-overview?orgId=1&refresh=10s)
-* Nodes monitoring:
-* [Merlin5 CPU Nodes Overview](https://hpc-monitor02.psi.ch/d/ejTyWJAWk/merlin5-computing-cpu-nodes?orgId=1&refresh=10s)
@@ -49,10 +49,10 @@ The following settings are the minimum required for running a job in the Merlin
 * **Clusters:** For running jobs in the different Slurm clusters, users should to add the following option:
 
 ```bash
-#SBATCH --clusters=<cluster_name> # Possible values: merlin5, merlin6, gmerlin6
+#SBATCH --clusters=<cluster_name> # Possible values: merlin6, gmerlin6
 ```
 
-Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md),[**`gmerlin6`**](../../gmerlin6/slurm-configuration.md),[**`merlin5`**](../../merlin5/slurm-configuration.md) for further information.
+Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md),[**`gmerlin6`**](../../gmerlin6/slurm-configuration.md) for further information.
 
 * **Partitions:** except when using the *default* partition for each cluster, one needs to specify the partition:
 
@@ -60,7 +60,7 @@ The following settings are the minimum required for running a job in the Merlin
 #SBATCH --partition=<partition_name> # Check each cluster documentation for possible values
 ```
 
-Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md),[**`gmerlin6`**](../../gmerlin6/slurm-configuration.md),[**`merlin5`**](../../merlin5/slurm-configuration.md) for further information.
+Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md),[**`gmerlin6`**](../../gmerlin6/slurm-configuration.md) for further information.
 
 * **[Optional] Disabling shared nodes**: by default, nodes are not exclusive. Hence, multiple users can run in the same node. One can request exclusive node usage with the following option:
 
@@ -74,7 +74,7 @@ The following settings are the minimum required for running a job in the Merlin
 #SBATCH --time=<D-HH:MM:SS> # Can not exceed the partition `MaxTime`
 ```
 
-Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md),[**`gmerlin6`**](../../gmerlin6/slurm-configuration.md),[**`merlin5`**](../../merlin5/slurm-configuration.md) for further information about partition `MaxTime` values.
+Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md),[**`gmerlin6`**](../../gmerlin6/slurm-configuration.md) for further information about partition `MaxTime` values.
 
 * **Output and error files**: by default, Slurm script will generate standard output (`slurm-%j.out`, where `%j` is the job_id) and error (`slurm-%j.err`, where `%j` is the job_id) files in the directory from where the job was submitted. Users can change default name with the following options:
 
@@ -92,7 +92,7 @@ The following settings are the minimum required for running a job in the Merlin
 #SBATCH --hint=nomultithread # Don't use extra threads with in-core multi-threading.
 ```
 
-Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md),[**`gmerlin6`**](../../gmerlin6/slurm-configuration.md),[**`merlin5`**](../../merlin5/slurm-configuration.md) for further information about node configuration and Hyper-Threading.
+Refer to the documentation of each cluster ([**`merlin6`**](../slurm-configuration.md),[**`gmerlin6`**](../../gmerlin6/slurm-configuration.md) for further information about node configuration and Hyper-Threading.
 Consider that, sometimes, depending on your job requirements, you might need also to setup how many `--ntasks-per-core` or `--cpus-per-task` (even other options) in addition to the `--hint` command. Please, contact us in case of doubts.
 
 !!! tip
@@ -33,23 +33,3 @@ sshare -a # to list shares of associations to a cluster
 sprio -l # to view the factors that comprise a job's scheduling priority
 # add '-u <username>' for filtering user
 ```
-
-## Show information for specific cluster
-
-By default, any of the above commands shows information of the local cluster which is **merlin6**.
-
-If you want to see the same information for **merlin5** you have to add the parameter ``--clusters=merlin5``.
-If you want to see both clusters at the same time, add the option ``--federation``.
-
-Examples:
-
-```bash
-sinfo # 'sinfo' local cluster which is 'merlin6'
-sinfo --clusters=merlin5 # 'sinfo' non-local cluster 'merlin5'
-sinfo --federation # 'sinfo' all clusters which are 'merlin5' & 'merlin6'
-squeue # 'squeue' local cluster which is 'merlin6'
-squeue --clusters=merlin5 # 'squeue' non-local cluster 'merlin5'
-squeue --federation # 'squeue' all clusters which are 'merlin5' & 'merlin6'
-```
-
----