Added interactive-jobs.md and linux/macos/windows client recipes
This commit is contained in:
199
pages/merlin6/03 Job Submission/running-jobs.md
Normal file
@@ -0,0 +1,199 @@
---
title: Running Jobs
#tags:
#keywords:
last_updated: 18 June 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/running-jobs.html
---

## Commands for running jobs

* ``sbatch``: to submit a batch script to Slurm (a basic usage sketch follows this list). Use ``squeue`` to check the job status and ``scancel`` to delete a job from the queue.
* ``srun``: to run parallel jobs in the batch system.
* ``salloc``: to obtain a Slurm job allocation (a set of nodes), execute command(s), and then release the allocation when the command is finished.
  This is equivalent to an interactive run.
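
A minimal sketch of this workflow, assuming a batch script named ``myjob.sh`` (hypothetical name):

```bash
# Submit the batch script; sbatch prints the job ID
sbatch myjob.sh

# Check the status of your pending and running jobs
squeue -u $USER

# Remove a job from the queue by its job ID (e.g. 123456)
scancel 123456
```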

## Running on Merlin5

The **Merlin5** cluster will remain available at least until the 1st of November 2019. In the meantime, users can keep submitting jobs to the old cluster,
but they will need to specify a couple of extra options in their scripts.

```bash
#SBATCH --clusters=merlin5
```

Adding ``--clusters=merlin5`` sends the jobs to the old Merlin5 computing nodes. In addition, ``--partition=<merlin|gpu>`` can be specified in
order to use the old Merlin5 partitions.
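
To check jobs that were sent to the old cluster, the same option can be passed to ``squeue``, for instance:

```bash
# List your jobs on the Merlin5 cluster
squeue --clusters=merlin5 -u $USER
```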

## Running on Merlin6

In order to run on the **Merlin6** cluster, users have to add the following option:

```bash
#SBATCH --clusters=merlin6
```

Adding ``--clusters=merlin6`` sends the jobs to the Merlin6 computing nodes.
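
The option can also be given on the command line instead of inside the script. A small sketch, again assuming a batch script named ``myjob.sh`` (hypothetical name):

```bash
# Submit to the Merlin6 cluster without editing the script
sbatch --clusters=merlin6 myjob.sh

# List your jobs on the Merlin6 cluster
squeue --clusters=merlin6 -u $USER
```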

## Shared nodes and exclusivity

The **Merlin6** cluster has been designed to allow running MPI/OpenMP processes as well as single-core jobs. To allow
this co-existence, nodes are configured in shared mode by default. This means that multiple jobs from multiple users may land on the same node. This
behaviour can be changed by users who require exclusive usage of nodes.

By default, Slurm will try to allocate jobs on nodes that are already occupied by processes not requiring exclusive usage of a node. In this way,
mixed nodes are filled up first, ensuring that fully free nodes remain available for MPI/OpenMP jobs.

Exclusive usage of a node can be requested by specifying the ``--exclusive`` option as follows:

```bash
#SBATCH --exclusive
```

## Output and Errors

By default, Slurm will generate the standard output and standard error files in the directory from which
you submit the batch script:

* standard output will be written into a file ``slurm-$SLURM_JOB_ID.out``.
* standard error will be written into a file ``slurm-$SLURM_JOB_ID.err``.

If you want to change the default names, this can be done with the ``--output`` and ``--error`` options. For example:

```bash
#SBATCH --output=logs/myJob.%N.%j.out   # Generate an output file per hostname and jobid
#SBATCH --error=logs/myJob.%N.%j.err    # Generate an error file per hostname and jobid
```

Use **man sbatch** (``man sbatch | grep -A36 '^filename pattern'``) to get the full specification of the **filename patterns**.
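
Note that Slurm will not create missing directories for the output files, so a relative path such as ``logs/`` in the example above should exist before the job starts. A minimal sketch (``myjob.sh`` is a hypothetical script name):

```bash
# Create the log directory (if it does not exist yet) before submitting
mkdir -p logs
sbatch myjob.sh
```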

## Partitions

Merlin6 contains the following partitions for general purpose:

* For the CPU these are ``general``, ``daily`` and ``hourly``.
* For the GPU this is ``gpu``.

If no partition is defined, ``general`` will be the default. The partition can be defined with the ``--partition`` option as follows:

```bash
#SBATCH --partition=<partition_name>  # Partition to use. 'general' is the default.
```
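
The available partitions, their nodes and their time limits can be inspected with ``sinfo``, for example:

```bash
# Show partitions and their limits on the Merlin6 cluster
sinfo --clusters=merlin6
```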

Please check the section [Slurm Configuration#Merlin6 Slurm Partitions] for more information about the Merlin6 partition setup.

## CPU-based Jobs Settings

CPU-based jobs are available for all PSI users. Users must belong to the ``merlin6`` Slurm ``Account`` in order to be able
to run on the CPU-based nodes. All users registered in Merlin6 are automatically included in this ``Account``.

### Slurm CPU Recommended Settings

There are some settings that are not mandatory, but are recommended or useful to specify. These are the following:

* ``--time``: mostly used when you need to specify longer runs in the ``general`` partition, but also useful for specifying
  shorter times. This may affect scheduling priorities (a concrete example follows the code block below).

```bash
#SBATCH --time=<D-HH:MM:SS>  # Time the job needs to run
```
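
As a concrete illustration of the ``D-HH:MM:SS`` format, a job that needs at most 12 hours could request:

```bash
#SBATCH --time=0-12:00:00  # 12 hours
```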

### Slurm CPU Template

The following template should be used by any user submitting jobs to the CPU nodes:

```bash
#!/bin/sh
#SBATCH --partition=<general|daily|hourly>  # Specify 'general' or 'daily' or 'hourly'
#SBATCH --time=<D-HH:MM:SS>                 # Strongly recommended when using the 'general' partition
#SBATCH --output=<output_file>              # Generate custom output file
#SBATCH --error=<error_file>                # Generate custom error file
#SBATCH --ntasks-per-core=1                 # Recommended: one thread per core
##SBATCH --exclusive                        # Uncomment if you need exclusive node usage

## Advanced options example
##SBATCH --nodes=1                          # Uncomment and specify the number of nodes to use
##SBATCH --ntasks=44                        # Uncomment and specify the number of tasks to use
##SBATCH --ntasks-per-node=44               # Uncomment and specify the number of tasks per node
##SBATCH --ntasks-per-core=2                # Uncomment and specify the number of tasks per core (i.e. threads)
##SBATCH --cpus-per-task=44                 # Uncomment and specify the number of cores per task
```

* Users needing hyper-threading can specify ``--ntasks-per-core=2`` instead. This is not recommended for generic usage.
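
As an illustration, a minimal sketch of a filled-in CPU job script (partition, time, file names and the command are example values, not site recommendations):

```bash
#!/bin/sh
#SBATCH --partition=hourly              # Example: short job in the 'hourly' partition
#SBATCH --time=0-00:30:00               # Example: 30 minutes
#SBATCH --output=logs/example.%j.out    # Example output file ('logs/' must exist)
#SBATCH --error=logs/example.%j.err     # Example error file
#SBATCH --ntasks=4                      # Example: 4 tasks
#SBATCH --ntasks-per-core=1             # One thread per core

# Run the tasks through Slurm; 'hostname' is just a placeholder command
srun hostname
```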

## GPU-based Jobs Settings

GPU-based jobs are restricted to BIO users; however, access for other PSI users can be requested on demand. Users must belong to
the ``merlin6-gpu`` Slurm ``Account`` in order to be able to run on the GPU-based nodes. BIO users belonging to any BIO group
are automatically registered in the ``merlin6-gpu`` account. Other users should request access from the Merlin6 administrators.

### Slurm GPU Mandatory Settings

The following options are mandatory settings that **must be included** in your batch scripts:

```bash
#SBATCH --gres=gpu  # Always set at least this option when using GPUs
```

### Slurm GPU Recommended Settings

GPUs are also a shared resource. Hence, multiple users can run jobs on a single node, but each user process must use only
one GPU. Users can define which GPU resources they need with the ``--gres`` option.
Valid ``gres`` specifications have the form ``gpu[[:type]:count]``, where ``type=GTX1080|GTX1080Ti`` and ``count=<number of gpus to use>``.
For example:

```bash
#SBATCH --gres=gpu:GTX1080:8  # Use 8 x GTX1080 GPUs
```
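
The ``type`` part is optional. Following the form above, a request for two GPUs of any available type could look like this:

```bash
#SBATCH --gres=gpu:2  # Use 2 GPUs of any type
```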

### Slurm GPU Template

The following template should be used by any user submitting jobs to the GPU nodes:

```bash
#!/bin/sh
#SBATCH --partition=gpu_<general|daily|hourly>  # Specify 'general' or 'daily' or 'hourly'
#SBATCH --time=<D-HH:MM:SS>                     # Strongly recommended when using the 'general' partition
#SBATCH --output=<output_file>                  # Generate custom output file
#SBATCH --error=<error_file>                    # Generate custom error file
#SBATCH --gres="gpu:<type>:<number_gpus>"       # You should specify at least 'gpu'
#SBATCH --ntasks-per-core=1                     # GPU nodes have hyper-threading disabled
##SBATCH --exclusive                            # Uncomment if you need exclusive node usage

## Advanced options example
##SBATCH --nodes=1                              # Uncomment and specify the number of nodes to use
##SBATCH --ntasks=44                            # Uncomment and specify the number of tasks to use
##SBATCH --ntasks-per-node=44                   # Uncomment and specify the number of tasks per node
##SBATCH --cpus-per-task=44                     # Uncomment and specify the number of cores per task
```

## Job status

The status of submitted jobs can be checked with the `squeue` command:

```
~ $ squeue -u bliven_s
     JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
 134507729       gpu test_scr bliven_s PD       0:00      3 (AssocGrpNodeLimit)
 134507768   general test_scr bliven_s PD       0:00     19 (AssocGrpCpuLimit)
 134507729       gpu test_scr bliven_s PD       0:00      3 (Resources)
 134506301       gpu test_scr bliven_s PD       0:00      1 (Priority)
 134506288       gpu test_scr bliven_s  R       9:16      1 merlin-g-008
```

Common Statuses:

- *merlin-\** Running on the specified host
- *(Priority)* Waiting in the queue
- *(Resources)* At the head of the queue, waiting for machines to become available
- *(AssocGrpCpuLimit), (AssocGrpNodeLimit)* Job would exceed per-user limitations on
  the number of simultaneous CPUs/Nodes. Use `scancel` to remove the job and
  resubmit with fewer resources (see the example below), or else wait for your other jobs to finish.
- *(PartitionNodeLimit)* Exceeds all resources available on this partition.
  Run `scancel` and resubmit to a different partition (`-p`) or with fewer
  resources.
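
For instance, the pending job shown above that exceeds the node limit could be removed before resubmitting with fewer resources:

```bash
# Cancel the pending job by its job ID (taken from the squeue output above)
scancel 134507729
```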