Added interactive-jobs.md and linux/macos/windows client recipes

2019-10-23 12:08:18 +02:00
parent 126d6a79b6
commit 3b8e2fc9d1
17 changed files with 408 additions and 14 deletions

@@ -0,0 +1,199 @@
---
title: Interactive Jobs
#tags:
keywords: interactive, X11, X, srun
last_updated: 22 October 2019
summary: "This document describes how to run interactive jobs as well as X based software."
sidebar: merlin6_sidebar
permalink: /merlin6/interactive-jobs.html
---
## Running interactive jobs
There are two ways to run interactive jobs in Slurm: with the ``srun`` command or with
the ``salloc`` command.
### srun
``srun`` is used to run parallel jobs in the batch system. It can be used within a batch script
(which can be run with ``sbatch``), or within a job allocation (which can be obtained with ``salloc``).
It can also be used as a direct command (for example, from the login nodes).

When used inside a batch script or during a job allocation, ``srun`` is restricted to the
amount of resources allocated by the ``sbatch``/``salloc`` commands. In ``sbatch``, these
resources are usually defined inside the batch script with the format ``#SBATCH <option>=<value>``.
In other words, if your batch script or allocation defines 88 tasks (with 1 thread per core)
and 2 nodes, ``srun`` is restricted to that amount of resources (you can use less, but never
exceed those limits).

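
As a minimal sketch of these limits, the snippet below writes a small batch script that requests
the 88 tasks on 2 nodes mentioned above. The file name ``srun_limits.sh`` and the use of
``--hint=nomultithread`` to express 1 thread per core are assumptions made for illustration,
not a recommended Merlin configuration.

```bash
# Write an example batch script (the file name is hypothetical).
cat > srun_limits.sh <<'EOF'
#!/bin/bash
#SBATCH --clusters=merlin6
#SBATCH --nodes=2
#SBATCH --ntasks=88
#SBATCH --hint=nomultithread

# Uses the whole allocation: 88 tasks spread over the 2 nodes.
srun hostname

# Uses only a part of the allocation: 4 of the 88 tasks.
srun --ntasks=4 hostname
EOF
```

The script would then be submitted with ``sbatch srun_limits.sh``. An ``srun`` line asking for
more than 88 tasks inside this script would exceed the allocation and is expected to fail.
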
When used from the login node, ``srun`` is usually used to run a specific command or software
interactively. ``srun`` is a blocking process (it blocks the bash prompt until the ``srun``
command finishes, unless you run it in the background with ``&``). This can be very useful for
running interactive software which opens a window and then submits jobs or runs sub-tasks in the
background (for example, **Relion**, **cisTEM**, etc.)

Refer to ``man srun`` for all available options of that command.
<details>
<summary>[Show 'srun' example]: Running 'hostname' command on 3 nodes, using 2 cores (1 task/core) per node</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
(base) [caubet_m@merlin-l-001 ~]$ srun --clusters=merlin6 --ntasks=6 --ntasks-per-node=2 --nodes=3 hostname
srun: job 135088230 queued and waiting for resources
srun: job 135088230 has been allocated resources
merlin-c-102.psi.ch
merlin-c-102.psi.ch
merlin-c-101.psi.ch
merlin-c-101.psi.ch
merlin-c-103.psi.ch
merlin-c-103.psi.ch
</pre>
</details>
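
The blocking behaviour described above can be avoided with ``&``. The following is only a sketch:
the helper name ``start_in_background`` and the variable ``bg_pid`` are hypothetical, and the
helper works for any command, not only ``srun``.

```bash
# Hypothetical helper: start a command in the background, send its output to
# a log file, and remember its PID so that the caller can wait for it later.
start_in_background() {
    local logfile="$1"
    shift
    # '&' returns the prompt immediately instead of blocking it.
    "$@" > "$logfile" 2>&1 &
    bg_pid=$!   # PID of the background command (global on purpose)
}
```

On a login node this could be used as ``start_in_background srun.log srun --clusters=merlin6 --ntasks=1 hostname``.
The prompt stays usable in the meantime, and ``wait "$bg_pid"`` blocks until the command finishes
and returns its exit status.
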
### salloc
``salloc`` is used to obtain a Slurm job allocation (a set of nodes). Once the job is allocated,
users are able to execute interactive command(s). Once finished (``exit`` or ``Ctrl+D``),
the allocation is released.

Please note that ``salloc`` is by default a blocking process: it blocks until the resources are
allocated, and then the user gets the bash prompt back. That prompt belongs to a new bash shell,
in which the user can run any ``srun`` commands, which will run tasks on the allocated resources.
Once finished, exiting that bash shell releases the allocation.
<details>
<summary>[Show 'salloc' example]: Allocating 2 cores (1 task/core) in 2 nodes (1 core/node)</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
(base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 --ntasks=2 --nodes=2
salloc: Pending job allocation 135087844
salloc: job 135087844 queued and waiting for resources
salloc: job 135087844 has been allocated resources
salloc: Granted job allocation 135087844
(base) [caubet_m@merlin-l-001 ~]$ srun hostname
merlin-c-120.psi.ch
merlin-c-119.psi.ch
(base) [caubet_m@merlin-c-119 ~]$ exit
logout
Connection to merlin-c-119 closed.
(base) [caubet_m@merlin-l-001 ~]$ exit
exit
salloc: Relinquishing job allocation 135087844
</pre>
</details>
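
Because the prompt looks the same before and after ``salloc``, it can be handy to check whether
the current shell belongs to an allocation. A minimal sketch, relying on the ``SLURM_JOB_ID``
environment variable that ``salloc`` sets in the shell it spawns (the function name
``in_allocation`` is hypothetical):

```bash
# Hypothetical helper: report whether this shell runs inside a Slurm allocation.
in_allocation() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        echo "inside job allocation ${SLURM_JOB_ID}"
    else
        # Without an allocation, srun has to request new resources by itself.
        echo "no job allocation in this shell" >&2
        return 1
    fi
}
```

Calling ``in_allocation`` before ``srun`` shows whether the tasks will run on the already
allocated resources or trigger a new resource request.
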
## Running interactive jobs with X11 support
### Requirements
#### Graphical access
[NoMachine](/merlin6/nomachine.html) is the officially supported service for graphical
access in the Merlin cluster. This service is running on the login nodes. Check the
document [Accessing Merlin -> NoMachine](/merlin6/nomachine.html) for details about
how to connect to the **NoMachine** service in the Merlin cluster.

For other graphical access methods that are not officially supported (X11 forwarding):
* For Linux clients, please follow [{Accessing Merlin -> Accessing from Linux Clients}](/merlin6/connect-from-linux.html)
* For Windows clients, please follow [{Accessing Merlin -> Accessing from Windows Clients}](/merlin6/connect-from-windows.html)
* For MacOS clients, please follow [{Accessing Merlin -> Accessing from MacOS Clients}](/merlin6/connect-from-macos.html)
#### Enable SSH Keys authentication
To run ``srun`` with **X11** support (``srun --x11``), you need to set up RSA keys properly.
1. Generate the RSA keys as follows:
```bash
ssh-keygen -t rsa
```
You will be asked for an *optional* passphrase. Setting one provides more security (anyone who steals your private key will also
need to know the passphrase), but you will have to type it every time you use the RSA keys. Whether to set a passphrase is up
to the user.
2. Add the public key to the ``~/.ssh/authorized_keys`` file:
```bash
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
```
3. Ensure that ``~/.ssh/authorized_keys`` has proper permissions:
```bash
chmod 600 ~/.ssh/authorized_keys
```
<details>
<summary>[Show 'ssh-keygen' example]: Generate RSA keys with default key filenames</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
(base) [caubet_m@merlin-l-001 .ssh]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/psi/home/caubet_m/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /psi/home/caubet_m/.ssh/id_rsa.
Your public key has been saved in /psi/home/caubet_m/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:AMvGhBWxXs1MXHvwTpvXCOpjUZgy30E+5V38bcj4k2I caubet_m@merlin-l-001.psi.ch
The key's randomart image is:
+---[RSA 2048]----+
| o*o ...o . ...|
| .+ + =. O o .o|
| * o * + Xo..+|
| o . . + B.*oo+|
| . S + =.oo.|
| . .E.+ |
| +. . . |
| . . |
| |
+----[SHA256]-----+
(base) [caubet_m@merlin-l-001 .ssh]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
(base) [caubet_m@merlin-l-001 .ssh]$ chmod 600 ~/.ssh/authorized_keys
</pre>
</details>
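
Running the ``cat ... >>`` step more than once appends the same key again each time. The sketch
below combines steps 2 and 3 in a way that can be repeated safely. The function name
``authorize_key`` is hypothetical, and the approach is just one possible way to do it.

```bash
# Hypothetical helper: append a public key to an authorized_keys file only if
# that exact line is not already present, then restrict the file permissions.
authorize_key() {
    local pubkey_file="$1" auth_file="$2"
    if [ ! -r "$pubkey_file" ]; then
        echo "cannot read public key: $pubkey_file" >&2
        return 1
    fi
    touch "$auth_file"
    # -q: quiet, -x: match whole lines, -F: fixed strings, -f: patterns from file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 600 "$auth_file"
}
```

With the default key filenames, this would be called as
``authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys``. Note that OpenSSH may also reject
key authentication when ``~/.ssh`` or the home directory is writable by other users.
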
### 'srun' with X11 support
Once the RSA keys are set up, you can run any window-based application. To do so, add
the ``--x11`` option to the ``srun`` command. For example:
```bash
srun --x11 xclock
```
will pop up an X11-based clock.
In the same manner, you can create a bash shell with X11 support. To do so, add
the ``--pty`` option to the ``srun --x11`` command. Once the resources are allocated, you can
interactively run X11 and non-X11 based commands from that shell.
```bash
srun --x11 --pty bash
```
<details>
<summary>[Show 'srun' with X11 support examples]</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
(base) [caubet_m@merlin-l-001 ~]$ srun --x11 xclock
srun: job 135095591 queued and waiting for resources
srun: job 135095591 has been allocated resources
(base) [caubet_m@merlin-l-001 ~]$
(base) [caubet_m@merlin-l-001 ~]$ srun --x11 --pty bash
srun: job 135095592 queued and waiting for resources
srun: job 135095592 has been allocated resources
(base) [caubet_m@merlin-c-205 ~]$ xclock
(base) [caubet_m@merlin-c-205 ~]$ echo "This was an example"
This was an example
(base) [caubet_m@merlin-c-205 ~]$ exit
exit
</pre>
</details>
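
X11 applications need a working display in the session from which ``srun --x11`` is started
(NoMachine or SSH X11 forwarding, see the requirements above). A small sketch that checks this
beforehand by looking at the ``DISPLAY`` variable (the function name ``x11_ready`` is hypothetical):

```bash
# Hypothetical helper: succeed only if the current session has an X11 display.
x11_ready() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set: connect with NoMachine or X11 forwarding first" >&2
        return 1
    fi
    echo "DISPLAY=${DISPLAY}"
}
```

It can be chained with the commands above, for example ``x11_ready && srun --x11 xclock``.
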