From 520d86119189cfa23fce80b5792c403cd0085819 Mon Sep 17 00:00:00 2001
From: caubet_m
Date: Tue, 5 Nov 2019 16:25:03 +0100
Subject: [PATCH] Updated interactive jobs

---
 .../03 Job Submission/interactive-jobs.md | 140 +++++++++++++++---
 1 file changed, 116 insertions(+), 24 deletions(-)

diff --git a/pages/merlin6/03 Job Submission/interactive-jobs.md b/pages/merlin6/03 Job Submission/interactive-jobs.md
index 743941e..60eb1e9 100644
--- a/pages/merlin6/03 Job Submission/interactive-jobs.md
+++ b/pages/merlin6/03 Job Submission/interactive-jobs.md
@@ -51,35 +51,79 @@ merlin-c-103.psi.ch
 
 ### salloc
 
-``salloc`` is used to obtain a Slurm job allocation (a set of nodes). Once job is allocated,
+**``salloc``** is used to obtain a Slurm job allocation (a set of nodes). Once the job is allocated,
 users are able to execute interactive command(s). Once finished (``exit`` or ``Ctrl+D``),
-the allocation is released.
+the allocation is released. **``salloc``** is a blocking command; that is, it blocks
+until the requested resources are allocated.
 
-Please, not that ``salloc`` is by default a blocking process, and once the resources get
-allocated user gets the bash prompt back. Once prompt is back, a new bash shell is created,
-and user can run any ``srun`` commands, which will run tasks on the allocated resources. Once
-finished, exiting the bash shell will release the allocation.
+When running **``salloc``**, once the resources are allocated, *by default* the user gets
+a ***new shell on one of the allocated nodes*** (if a user has requested several nodes, the
+shell is opened on the first allocated node). This happens because, by default, ``salloc`` runs
+``srun -n1 -N1 --mem-per-cpu=0 --gres=gpu:0 --pty --preserve-env --mpi=none $SHELL``
+(users do not need to specify any ``srun`` command). However, this behaviour can be changed
+by appending a different command to the **``salloc``** call, which replaces the default
+``srun`` command. For example:
+
+```bash
+# Typical 'salloc' call
+# - Same as running:
+#   'salloc --clusters=merlin6 -N 2 -n 2 srun -n1 -N1 --mem-per-cpu=0 --gres=gpu:0 --pty --preserve-env --mpi=none $SHELL'
+salloc --clusters=merlin6 -N 2 -n 2
+
+# Custom 'salloc' call
+# - $SHELL will open a shell locally on the machine running 'salloc' (login node shell)
+salloc --clusters=merlin6 -N 2 -n 2 $SHELL
+```
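The default-versus-custom behaviour above can be sketched with a small hypothetical helper. Note that `salloc_command` is not a Slurm tool; it only prints the command that ``salloc`` would execute once the allocation is granted, per the description above: with no extra arguments, the default ``srun`` + ``$SHELL`` command; otherwise, whatever command the user appended.

```shell
# Hypothetical sketch (not a Slurm tool): print the command that salloc
# executes once the allocation is granted, as described above.
salloc_command() {
    if [ "$#" -eq 0 ]; then
        # No trailing arguments: salloc runs the default srun + $SHELL command
        echo "srun -n1 -N1 --mem-per-cpu=0 --gres=gpu:0 --pty --preserve-env --mpi=none ${SHELL:-/bin/bash}"
    else
        # Trailing arguments replace the default command entirely
        echo "$@"
    fi
}

salloc_command              # default: srun ... $SHELL (shell on an allocated node)
salloc_command "$SHELL"     # override: shell opened locally on the login node
```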
-[Show 'salloc' example]: Allocating 2 cores (1 task/core) in 2 nodes (1 core/node)
+[Show 'salloc' example]: Allocating 2 cores (1 task/core) in 2 nodes (1 core/node) - Default
 (base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 --ntasks=2 --nodes=2
-salloc: Pending job allocation 135087844
-salloc: job 135087844 queued and waiting for resources
-salloc: job 135087844 has been allocated resources
-salloc: Granted job allocation 135087844
+salloc: Pending job allocation 135171306
+salloc: job 135171306 queued and waiting for resources
+salloc: job 135171306 has been allocated resources
+salloc: Granted job allocation 135171306
 
-(base) [caubet_m@merlin-l-001 ~]$ srun hostname
-merlin-c-120.psi.ch
-merlin-c-119.psi.ch
+(base) [caubet_m@merlin-c-213 ~]$ srun hostname
+merlin-c-213.psi.ch
+merlin-c-214.psi.ch
 
-(base) [caubet_m@merlin-c-119 ~]$ exit
-logout
-Connection to merlin-c-119 closed.
-
-(base) [caubet_m@merlin-l-001 ~]$ exit
+(base) [caubet_m@merlin-c-213 ~]$ exit
 exit
-salloc: Relinquishing job allocation 135087844
+salloc: Relinquishing job allocation 135171306
+
+(base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 -N 2 -n 2 srun -n1 -N1 --mem-per-cpu=0 --gres=gpu:0 --pty --preserve-env --mpi=none $SHELL
+salloc: Pending job allocation 135171342
+salloc: job 135171342 queued and waiting for resources
+salloc: job 135171342 has been allocated resources
+salloc: Granted job allocation 135171342
+
+(base) [caubet_m@merlin-c-021 ~]$ srun hostname
+merlin-c-021.psi.ch
+merlin-c-022.psi.ch
+
+(base) [caubet_m@merlin-c-021 ~]$ exit
+exit
+salloc: Relinquishing job allocation 135171342
+
+
+[Show 'salloc' example]: Allocating 2 cores (1 task/core) in 2 nodes (1 core/node) - $SHELL
+
+(base) [caubet_m@merlin-export-01 ~]$ salloc --clusters=merlin6 --ntasks=2 --nodes=2 $SHELL
+salloc: Pending job allocation 135171308
+salloc: job 135171308 queued and waiting for resources
+salloc: job 135171308 has been allocated resources
+salloc: Granted job allocation 135171308
+
+(base) [caubet_m@merlin-export-01 ~]$ srun hostname
+merlin-c-218.psi.ch
+merlin-c-117.psi.ch
+
+(base) [caubet_m@merlin-export-01 ~]$ exit
+exit
+salloc: Relinquishing job allocation 135171308
 
@@ -162,7 +206,7 @@
 Once RSA keys are setup, you can run any windows based application. For that, you need to
 add the option ``--x11`` to the ``srun`` command. In example:
 
 ```bash
-srun --x11 xclock
+srun --clusters=merlin6 --x11 xclock
 ```
 
 will popup a X11 based clock.
@@ -172,19 +216,19 @@
 to add the option ``--pty`` to the ``srun --x11`` command. Once resource is allocated, from
 there you can interactively run X11 and non-X11 based commands.
 
 ```bash
-srun --x11 --pty bash
+srun --clusters=merlin6 --x11 --pty bash
 ```
 [Show 'srun' with X11 support examples]
 
-(base) [caubet_m@merlin-l-001 ~]$ srun --x11 xclock
+(base) [caubet_m@merlin-l-001 ~]$ srun --clusters=merlin6 --x11 xclock
 srun: job 135095591 queued and waiting for resources
 srun: job 135095591 has been allocated resources
 
 (base) [caubet_m@merlin-l-001 ~]$ 
 
-(base) [caubet_m@merlin-l-001 ~]$ srun --x11 --pty bash
+(base) [caubet_m@merlin-l-001 ~]$ srun --clusters=merlin6 --x11 --pty bash
 srun: job 135095592 queued and waiting for resources
 srun: job 135095592 has been allocated resources
 
@@ -197,3 +241,51 @@ This was an example
 exit
 
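Graphical applications started from the allocated shell rely on the `DISPLAY` environment variable, which is set when X11 forwarding is active; if it is unset, X11 clients such as `xclock` cannot connect to a display and will fail. A minimal sketch of such a check (the `x11_available` helper is illustrative, not part of Slurm):

```shell
# Minimal sketch: before launching a graphical application inside the
# allocated shell, verify that X11 forwarding is active. X11 clients
# read the DISPLAY environment variable.
x11_available() {
    [ -n "$DISPLAY" ]
}

if x11_available; then
    echo "X11 forwarding active (DISPLAY=$DISPLAY)"
else
    echo "No DISPLAY set: X11 applications will fail to start"
fi
```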
+
+### 'salloc' with x11 support
+
+Once RSA keys are set up, you can run any X11-based (graphical) application. For that, you need to
+add the option ``--x11`` to the ``salloc`` command. For example:
+
+```bash
+salloc --clusters=merlin6 --x11 xclock
+```
+
+will pop up an X11-based clock.
+
+In the same manner, you can create a bash shell with X11 support by simply running
+``salloc --clusters=merlin6 --x11``. Once the resource is allocated, you can
+interactively run X11 and non-X11 based commands from there.
+
+```bash
+salloc --clusters=merlin6 --x11
+```
+
+[Show 'salloc' with X11 support examples]
+
+(base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 --x11 xclock
+salloc: Pending job allocation 135171355
+salloc: job 135171355 queued and waiting for resources
+salloc: job 135171355 has been allocated resources
+salloc: Granted job allocation 135171355
+salloc: Relinquishing job allocation 135171355
+
+(base) [caubet_m@merlin-l-001 ~]$ salloc --clusters=merlin6 --x11
+salloc: Pending job allocation 135171349
+salloc: job 135171349 queued and waiting for resources
+salloc: job 135171349 has been allocated resources
+salloc: Granted job allocation 135171349
+salloc: Waiting for resource configuration
+salloc: Nodes merlin-c-117 are ready for job
+
+(base) [caubet_m@merlin-c-117 ~]$ xclock
+
+(base) [caubet_m@merlin-c-117 ~]$ echo "This was an example"
+This was an example
+
+(base) [caubet_m@merlin-c-117 ~]$ exit
+exit
+salloc: Relinquishing job allocation 135171349
+
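As the transcripts above show, the prompt's hostname changes (e.g. from `merlin-l-001` to `merlin-c-117`) when the shell runs inside the allocation. A more robust check than reading the prompt is to inspect Slurm's job environment variables, since Slurm exports `SLURM_JOB_ID` into job environments. A minimal sketch (`in_slurm_allocation` is an illustrative helper, not a Slurm command):

```shell
# Minimal sketch: distinguish a shell spawned inside a Slurm allocation
# (e.g. by 'salloc' or 'srun --pty bash') from a plain login-node shell.
# Slurm exports SLURM_JOB_ID into the environment of job steps.
in_slurm_allocation() {
    [ -n "$SLURM_JOB_ID" ]
}

if in_slurm_allocation; then
    echo "Inside Slurm job $SLURM_JOB_ID"
else
    echo "Not inside a Slurm allocation (login node shell)"
fi
```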