title: Running Interactive Jobs
keywords: interactive, X11, X, srun
last_updated: 23 January 2020
summary: This document describes how to run interactive jobs as well as X based software.
sidebar: merlin6_sidebar
permalink: /merlin6/interactive-jobs.html
Running interactive jobs
There are two different ways of running interactive jobs in Slurm, using the `salloc` and `srun` commands:
- `salloc`: obtains a Slurm job allocation (a set of nodes), executes command(s), and then releases the allocation when the command is finished.
- `srun`: runs parallel tasks.
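A minimal sketch of both approaches from a login node (the resource counts are arbitrary examples; both commands are covered in detail below):

# Run a single command through the batch system and return when it finishes
srun --clusters=merlin6 --ntasks=1 hostname

# Allocate 2 cores on 2 nodes for interactive work; release them with 'exit' or Ctrl+D
salloc --clusters=merlin6 --nodes=2 --ntasks=2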
srun
`srun` is used to run parallel jobs in the batch system. It can be used within a batch script
(which can be run with `sbatch`), or within a job allocation (which can be run with `salloc`).
It can also be used as a direct command (for example, from the login nodes).
When used inside a batch script or during a job allocation, `srun` is constrained to the
amount of resources allocated by the `sbatch`/`salloc` commands. In `sbatch`, these resources
are usually defined inside the batch script with the format `#SBATCH <option>=<value>`.
In other words, if you define 88 tasks (and 1 thread per core) and 2 nodes in your batch script
or allocation, `srun` is constrained to that amount of resources (you can use less, but never
exceed those limits).
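For illustration, a minimal batch script sketch (the program name is a hypothetical placeholder): the `srun` call inside it may use up to, but never more than, the resources requested in the `#SBATCH` lines.

#!/bin/bash
#SBATCH --clusters=merlin6   # submit to the merlin6 cluster
#SBATCH --nodes=2            # 2 nodes
#SBATCH --ntasks=88          # 88 tasks in total (1 thread/core)

# 'srun' inherits the allocation: it can launch at most 88 tasks spread over the 2 nodes
# './my_program' is a hypothetical placeholder for your application
srun ./my_program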
When used from the login node, `srun` typically runs a specific command or software
interactively. `srun` is a blocking process (it will block the bash prompt until the `srun`
command finishes, unless you run it in the background with `&`). This can be very useful for
interactive software which pops up a window and then submits jobs or runs sub-tasks in the
background (for example, Relion, cisTEM, etc.).
Refer to `man srun` to explore all possible options for that command.
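Since `srun` blocks the prompt, a long-running command can be pushed to the background with `&` (a minimal sketch; `./my_tool` is a hypothetical placeholder):

# Launch the command in the background so the prompt stays usable
# './my_tool' is a hypothetical placeholder for your application
srun --clusters=merlin6 --ntasks=1 ./my_tool &

# ... do other work, then block until the background job finishes
wait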
Example: running the 'hostname' command on 3 nodes, using 2 cores (1 task/core) per node:
(base) [caubet_m@merlin-l-001 ~]$ srun --clusters=merlin6 --ntasks=6 --ntasks-per-node=2 --nodes=3 hostname
srun: job 135088230 queued and waiting for resources
srun: job 135088230 has been allocated resources
merlin-c-102.psi.ch
merlin-c-102.psi.ch
merlin-c-101.psi.ch
merlin-c-101.psi.ch
merlin-c-103.psi.ch
merlin-c-103.psi.ch
salloc
`salloc` is used to obtain a Slurm job allocation (a set of nodes). Once the job is allocated,
users are able to execute interactive command(s). Once finished (`exit` or `Ctrl+D`),
the allocation is released. `salloc` is a blocking command, that is, the command will block
until the requested resources are allocated.
When running `salloc`, once the resources are allocated, by default the user will get
a new shell on one of the allocated resources (if a user has requested several nodes, it will
prompt a new shell on the first allocated node). However, this behaviour can be changed by adding
a shell (`$SHELL`) at the end of the `salloc` command. For example:
# Typical 'salloc' call
# - Same as running:
# 'salloc --clusters=merlin6 -N 2 -n 2 srun -n1 -N1 --mem-per-cpu=0 --gres=gpu:0 --pty --preserve-env --mpi=none $SHELL'
salloc --clusters=merlin6 -N 2 -n 2
# Custom 'salloc' call
# - $SHELL will open a local shell on the login node from where 'salloc' is running
salloc --clusters=merlin6 -N 2 -n 2 $SHELL
Example: allocating 2 cores (1 task/core) on 2 nodes (1 core/node) - default behaviour:
(base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 --ntasks=2 --nodes=2
salloc: Pending job allocation 135171306
salloc: job 135171306 queued and waiting for resources
salloc: job 135171306 has been allocated resources
salloc: Granted job allocation 135171306
(base) [caubet_m@merlin-c-213 ~]$ srun hostname
merlin-c-213.psi.ch
merlin-c-214.psi.ch
(base) [caubet_m@merlin-c-213 ~]$ exit
exit
salloc: Relinquishing job allocation 135171306
(base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 -N 2 -n 2 srun -n1 -N1 --mem-per-cpu=0 --gres=gpu:0 --pty --preserve-env --mpi=none $SHELL
salloc: Pending job allocation 135171342
salloc: job 135171342 queued and waiting for resources
salloc: job 135171342 has been allocated resources
salloc: Granted job allocation 135171342
(base) [caubet_m@merlin-c-021 ~]$ srun hostname
merlin-c-021.psi.ch
merlin-c-022.psi.ch
(base) [caubet_m@merlin-c-021 ~]$ exit
exit
salloc: Relinquishing job allocation 135171342
Example: allocating 2 cores (1 task/core) on 2 nodes (1 core/node) - with $SHELL:
(base) [caubet_m@merlin-export-01 ~]$ salloc --clusters=merlin6 --ntasks=2 --nodes=2 $SHELL
salloc: Pending job allocation 135171308
salloc: job 135171308 queued and waiting for resources
salloc: job 135171308 has been allocated resources
salloc: Granted job allocation 135171308
(base) [caubet_m@merlin-export-01 ~]$ srun hostname
merlin-c-218.psi.ch
merlin-c-117.psi.ch
(base) [caubet_m@merlin-export-01 ~]$ exit
exit
salloc: Relinquishing job allocation 135171308
Running interactive jobs with X11 support
Requirements
Graphical access
NoMachine is the official supported service for graphical access in the Merlin cluster. This service is running on the login nodes. Check the document {Accessing Merlin -> NoMachine} for details about how to connect to the NoMachine service in the Merlin cluster.
For other graphical access methods that are not officially supported (X11 forwarding):
- For Linux clients, please follow {How To Use Merlin -> Accessing from Linux Clients}
- For Windows clients, please follow {How To Use Merlin -> Accessing from Windows Clients}
- For MacOS clients, please follow {How To Use Merlin -> Accessing from MacOS Clients}
'srun' with X11 support
The Merlin5 and Merlin6 clusters allow running window-based applications. For that, you need to
add the option `--x11` to the `srun` command. For example:
srun --clusters=merlin6 --x11 xclock
will pop up an X11-based clock.
In the same manner, you can create a bash shell with X11 support. To do that, you need
to add the option `--pty` to the `srun --x11` command. Once the resource is allocated,
you can interactively run X11 and non-X11 based commands from there.
srun --clusters=merlin6 --x11 --pty bash
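If the interactive session needs more than a single core, the usual resource options can be combined with `--x11` (a sketch; the task count and time limit are arbitrary examples):

# Interactive X11-capable shell with 4 tasks and a 2 hour time limit
srun --clusters=merlin6 --ntasks=4 --time=02:00:00 --x11 --pty bash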
'srun' with X11 support examples:
(base) [caubet_m@merlin-l-001 ~]$ srun --clusters=merlin6 --x11 xclock
srun: job 135095591 queued and waiting for resources
srun: job 135095591 has been allocated resources
(base) [caubet_m@merlin-l-001 ~]$
(base) [caubet_m@merlin-l-001 ~]$ srun --clusters=merlin6 --x11 --pty bash
srun: job 135095592 queued and waiting for resources
srun: job 135095592 has been allocated resources
(base) [caubet_m@merlin-c-205 ~]$ xclock
(base) [caubet_m@merlin-c-205 ~]$ echo "This was an example"
This was an example
(base) [caubet_m@merlin-c-205 ~]$ exit
exit
'salloc' with X11 support
The Merlin5 and Merlin6 clusters allow running window-based applications. For that, you need to
add the option `--x11` to the `salloc` command. For example:
salloc --clusters=merlin6 --x11 xclock
will pop up an X11-based clock.
In the same manner, you can create a bash shell with X11 support. To do that, just run
`salloc --clusters=merlin6 --x11`. Once the resource is allocated, you can interactively
run X11 and non-X11 based commands from there.
salloc --clusters=merlin6 --x11
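As with `srun`, the usual resource options can be added to an X11-capable allocation (a sketch; the values are arbitrary examples):

# X11-capable allocation with 2 tasks on a single node
salloc --clusters=merlin6 --nodes=1 --ntasks=2 --x11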
'salloc' with X11 support examples:
(base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 --x11 xclock
salloc: Pending job allocation 135171355
salloc: job 135171355 queued and waiting for resources
salloc: job 135171355 has been allocated resources
salloc: Granted job allocation 135171355
salloc: Relinquishing job allocation 135171355
(base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 --x11
salloc: Pending job allocation 135171349
salloc: job 135171349 queued and waiting for resources
salloc: job 135171349 has been allocated resources
salloc: Granted job allocation 135171349
salloc: Waiting for resource configuration
salloc: Nodes merlin-c-117 are ready for job
(base) [caubet_m@merlin-c-117 ~]$ xclock
(base) [caubet_m@merlin-c-117 ~]$ echo "This was an example"
This was an example
(base) [caubet_m@merlin-c-117 ~]$ exit
exit
salloc: Relinquishing job allocation 135171349