ADD: interactive-jobs.md
@@ -48,6 +48,8 @@ entries:
   folderitems:
     - title: Merlin7 Infrastructure
       url: /merlin7/slurm-configuration.html
+    - title: Running Slurm Interactive Jobs
+      url: /merlin7/interactive-jobs.html
     - title: Slurm Batch Script Examples
       url: /merlin7/slurm-examples.html
     - title: Software Support
pages/merlin7/03-Slurm-General-Documentation/interactive-jobs.md (new file, 202 lines)
@@ -0,0 +1,202 @@
---
title: Running Interactive Jobs
#tags:
keywords: interactive, X11, X, srun, salloc, job, jobs, slurm, nomachine, nx
last_updated: 07 August 2024
summary: "This document describes how to run interactive jobs as well as X based software."
sidebar: merlin7_sidebar
permalink: /merlin7/interactive-jobs.html
---

## Running interactive jobs

There are two different ways to run interactive jobs in Slurm, using the ``salloc`` and ``srun`` commands:

* **``salloc``**: obtains a Slurm job allocation (a set of nodes), executes command(s), and then releases the allocation once the command finishes.
* **``srun``**: is used for running parallel tasks.

### srun

``srun`` is used to run parallel jobs in the batch system. It can be used within a batch script
(submitted with ``sbatch``), within a job allocation (obtained with ``salloc``), or directly as a
command (for example, from the login nodes).

When used inside a batch script or a job allocation, ``srun`` is constrained to the amount of
resources allocated by the ``sbatch``/``salloc`` command. In ``sbatch``, these resources are
usually defined inside the batch script with directives of the form ``#SBATCH --<option>=<value>``.
In other words, if your batch script or allocation defines 88 tasks (with 1 thread per core) and
2 nodes, ``srun`` is constrained to that amount of resources (you can use less, but never exceed
those limits).

When used from a login node, ``srun`` usually runs a specific command or piece of software
interactively. ``srun`` is a blocking process (it blocks the bash prompt until the command
finishes, unless you run it in the background with ``&``). This can be very useful for running
interactive software which pops up a window and then submits jobs or runs sub-tasks in the
background (for example, **Relion**, **cisTEM**, etc.).

Refer to ``man srun`` to explore all possible options for this command.

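The background-launch pattern mentioned above can be sketched as follows. This is an illustrative
sketch, not taken from this page: ``sleep`` stands in for a long-running command, and on Merlin you
would use an ``srun --clusters=merlin7 <command> &`` invocation instead.

```bash
# Sketch of launching a long-running command in the background so the
# shell prompt stays usable. 'sleep 1' is a stand-in for an srun command.
sleep 1 &            # start the command in the background; the prompt is not blocked
JOB_PID=$!           # remember the PID of the background process
echo "shell stays usable while PID ${JOB_PID} runs"
wait "${JOB_PID}"    # block only when you finally need the command to have finished
echo "background command finished"
```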
<details>
<summary>[Show 'srun' example]: Running 'hostname' command on 3 nodes, using 2 cores (1 task/core) per node</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
caubet_m@login001:~> srun --clusters=merlin7 --ntasks=6 --ntasks-per-node=2 --nodes=3 hostname
cn001.merlin7.psi.ch
cn001.merlin7.psi.ch
cn002.merlin7.psi.ch
cn002.merlin7.psi.ch
cn003.merlin7.psi.ch
cn003.merlin7.psi.ch
</pre>
</details>

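The same resource limits apply when ``srun`` is used inside a batch script. A minimal sketch of
such a script, assuming illustrative values (the node and task counts are examples, not taken from
this page):

```bash
#!/bin/bash
#SBATCH --clusters=merlin7   # options use the '#SBATCH --<option>=<value>' format
#SBATCH --nodes=2            # allocate 2 nodes
#SBATCH --ntasks=4           # 4 tasks in total (example value)

# srun inside the script is constrained to the resources requested above:
srun hostname                # runs on all 4 tasks across the 2 nodes
srun --ntasks=2 hostname     # using fewer tasks is allowed; requesting more would fail
```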
### salloc

**``salloc``** is used to obtain a Slurm job allocation (a set of nodes). Once the job is
allocated, users can execute interactive command(s). Once finished (``exit`` or ``Ctrl+D``),
the allocation is released. **``salloc``** is a blocking command, that is, the command blocks
until the requested resources are allocated.

When running **``salloc``**, once the resources are allocated, *by default* the user gets a
***new shell on one of the allocated resources*** (if a user has requested several nodes, the
new shell is opened on the first allocated node). However, this behaviour can be changed by
adding a shell (`$SHELL`) at the end of the `salloc` command. For example:

```bash
# Typical 'salloc' call
salloc --clusters=merlin7 -N 2 -n 2

# Custom 'salloc' call:
# $SHELL opens a local shell on the login node from which 'salloc' is run
salloc --clusters=merlin7 -N 2 -n 2 $SHELL
```

<details>
<summary>[Show 'salloc' example]: Allocating 2 cores (1 task/core) in 2 nodes (1 core/node) - <i>Default</i></summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
caubet_m@login001:~> salloc --clusters=merlin7 -N 2 -n 2
salloc: Granted job allocation 161
salloc: Nodes cn[001-002] are ready for job

caubet_m@login001:~> srun hostname
cn002.merlin7.psi.ch
cn001.merlin7.psi.ch

caubet_m@login001:~> exit
exit
salloc: Relinquishing job allocation 161
</pre>
</details>

<details>
<summary>[Show 'salloc' example]: Allocating 2 cores (1 task/core) in 2 nodes (1 core/node) - <i>$SHELL</i></summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
caubet_m@login001:~> salloc --clusters=merlin7 --ntasks=2 --nodes=2 $SHELL
salloc: Granted job allocation 165
salloc: Nodes cn[001-002] are ready for job
caubet_m@login001:~> srun hostname
cn001.merlin7.psi.ch
cn002.merlin7.psi.ch
caubet_m@login001:~> exit
exit
salloc: Relinquishing job allocation 165
</pre>
</details>

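Inside a shell obtained through ``salloc``, standard Slurm environment variables describe the
allocation, so you can quickly double-check what was actually granted. A small sketch (outside
an allocation these variables are simply unset and the fallbacks are printed):

```bash
# Print the allocation details exported by Slurm; outside a job these
# variables are unset, so a 'not set' fallback is shown instead.
echo "Job ID:  ${SLURM_JOB_ID:-not set}"
echo "Nodes:   ${SLURM_JOB_NODELIST:-not set}"
echo "Tasks:   ${SLURM_NTASKS:-not set}"
```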
## Running interactive jobs with X11 support

### Requirements

#### Graphical access

[NoMachine](/merlin7/nomachine.html) is the officially supported service for graphical access
to the Merlin cluster. This service runs on the login nodes. Check the document
[{Accessing Merlin -> NoMachine}](/merlin7/nomachine.html) for details about how to connect to
the **NoMachine** service in the Merlin cluster.

For other, not officially supported graphical access (X11 forwarding):

* For Linux clients, please follow [{How To Use Merlin -> Accessing from Linux Clients}](/merlin7/connect-from-linux.html)
* For Windows clients, please follow [{How To Use Merlin -> Accessing from Windows Clients}](/merlin7/connect-from-windows.html)
* For MacOS clients, please follow [{How To Use Merlin -> Accessing from MacOS Clients}](/merlin7/connect-from-macos.html)

### 'srun' with X11 support

The Merlin6 and Merlin7 clusters allow running window-based applications. For that, you need to
add the option ``--x11`` to the ``srun`` command. For example:

```bash
srun --clusters=merlin7 --x11 sview
```

This will pop up an X11-based Slurm view of the cluster.

In the same manner, you can create a bash shell with X11 support. To do that, add the option
``--pty`` to the ``srun --x11`` command. Once the resource is allocated, you can interactively
run X11 and non-X11 based commands from there.

```bash
srun --clusters=merlin7 --x11 --pty bash
```

<details>
<summary>[Show 'srun' with X11 support examples]</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
caubet_m@login001:~> srun --clusters=merlin7 --x11 sview

caubet_m@login001:~>

caubet_m@login001:~> srun --clusters=merlin7 --x11 --pty bash

caubet_m@cn003:~> sview

caubet_m@cn003:~> echo "This was an example"
This was an example

caubet_m@cn003:~> exit
exit
</pre>
</details>

### 'salloc' with X11 support

The **Merlin6** and **Merlin7** clusters allow running window-based applications. For that, you
need to add the option ``--x11`` to the ``salloc`` command. For example:

```bash
salloc --clusters=merlin7 --x11 sview
```

This will pop up an X11-based Slurm view of the cluster.

In the same manner, you can create a bash shell with X11 support. To do that, just run
``salloc --clusters=merlin7 --x11``. Once the resource is allocated, you can interactively run
X11 and non-X11 based commands from there.

```bash
salloc --clusters=merlin7 --x11
```

<details>
<summary>[Show 'salloc' with X11 support examples]</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
caubet_m@login001:~> salloc --clusters=merlin7 --x11 sview
salloc: Granted job allocation 174
salloc: Nodes cn001 are ready for job
salloc: Relinquishing job allocation 174

caubet_m@login001:~> salloc --clusters=merlin7 --x11
salloc: Granted job allocation 175
salloc: Nodes cn001 are ready for job
caubet_m@cn001:~>

caubet_m@cn001:~> sview

caubet_m@cn001:~> echo "This was an example"
This was an example

caubet_m@cn001:~> exit
exit
salloc: Relinquishing job allocation 175
</pre>
</details>