From 9417abdd847da7f0482b9810c21388d931db1a32 Mon Sep 17 00:00:00 2001
From: caubet_m
Date: Wed, 19 Jun 2019 16:46:34 +0200
Subject: [PATCH] Finished basic commands

---
 .../merlin6-slurm/slurm-basic-commands.md | 136 ++----------------
 1 file changed, 9 insertions(+), 127 deletions(-)

diff --git a/pages/merlin6/merlin6-slurm/slurm-basic-commands.md b/pages/merlin6/merlin6-slurm/slurm-basic-commands.md
index 5ed04b0..5a13927 100644
--- a/pages/merlin6/merlin6-slurm/slurm-basic-commands.md
+++ b/pages/merlin6/merlin6-slurm/slurm-basic-commands.md
@@ -2,12 +2,16 @@ title: Slurm Basic Commands
 #tags:
 #keywords:
-last_updated: 13 June 2019
+last_updated: 19 June 2019
 #summary: ""
 sidebar: merlin6_sidebar
 permalink: /merlin6/slurm-basics.html
 ---

+This document shows some basic commands for using Slurm. Advanced examples for some of these
+commands are explained in other Merlin6 Slurm pages. You can always check the ``man`` pages for
+more information about options and examples.
+
 ## Basic commands

 Useful commands for the slurm:

@@ -16,12 +20,14 @@ Useful commands for the slurm:
 sinfo                # to see the name of nodes, their occupancy, name of slurm partitions, limits (try out with "-l" option)
 squeue               # to see the currently running/waiting jobs in slurm (additional "-l" option may also be useful)
 sbatch Script.sh     # to submit a script (example below) to the slurm.
-srun command         # to submit a command to Slurm. Same options as in 'sbatch' can be used.
+srun                 # to submit a command to Slurm. Same options as in 'sbatch' can be used.
 salloc               # to allocate computing nodes. Useful for running interactive jobs (ANSYS, Python Notebooks, etc.).
 scancel job_id       # to cancel a slurm job; the job id is the numeric id seen in squeue
 ```

-Other advanced basic commands:
+---
+
+## Advanced basic commands

 ```bash
 sinfo -N -l          # list nodes, state, resources (number of CPUs, memory per node, etc.), and other information
@@ -30,127 +36,3 @@ sprio -l             # to view the factors that comprise a job's scheduling priority
 ```

 ---
-
-## Slurm CPU Template (Mandatory Settings)
-
-The following Slurm template shows mandatory settings that must be included in your batch scripts:
-
-```bash
-#!/bin/sh
-#SBATCH --partition=    # name of slurm partition to submit. 'general' is the default.
-#SBATCH --constraint=   # For CPU, always set it to 'mc'. For GPU, set it to 'gpu' (only 'merlin6-gpu' accounts can submit there)
-```
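Purely as an illustrative sketch (an editor's addition, not part of the patch itself): a minimal batch script built around the two mandatory settings above. It assumes the 'general' partition and the CPU constraint 'mc' described in this template; the job name, the time limit and the trivial ``hostname`` payload are placeholders.

```bash
#!/bin/bash
#SBATCH --partition=general   # mandatory: slurm partition ('general' is the default)
#SBATCH --constraint=mc       # mandatory: 'mc' selects CPU nodes ('gpu' is only for 'merlin6-gpu' accounts)
#SBATCH --job-name=minimal    # placeholder job name
#SBATCH --time=01:00:00       # placeholder time limit of 1 hour

hostname                      # trivial payload: print the name of the allocated node
```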
-
----
-
-## Basic slurm example
-
-You can copy-paste the following example into a file called ``mySlurm.batch``.
-Some basic parameters are explained in the example.
-Please notice that ``#`` is an enabled option while ``##`` is a commented-out option (no effect).
-
-```bash
-#!/bin/sh
-#SBATCH --partition=daily        # name of slurm partition to submit. Can be 'general' (default if not specified), 'daily', 'hourly'.
-#SBATCH --job-name="mySlurmTest" # name of the job. Useful when submitting different types of jobs for filtering (i.e. 'squeue' command)
-#SBATCH --time=0-12:00:00        # time limit. Here it is shortened to 12 hours (default and max for 'daily' is 1 day).
-#SBATCH --exclude=merlin-c-001   # exclude nodes you do not want the job to be submitted to
-#SBATCH --nodes=10               # number of nodes you want to allocate for the job
-#SBATCH --ntasks=440             # number of tasks to run
-##SBATCH --exclusive             # enable if you need exclusive usage of a node. If this option is not specified, nodes are shared by default.
-##SBATCH --ntasks-per-node=32    # number of tasks per node. Each Apollo node has 44 cores; using fewer in exclusive mode may help to turbo boost the CPU frequency. If this option is enabled, set up --ntasks and --nodes accordingly.
-
-module load gcc/8.3.0 openmpi/3.1.3
-
-echo "Example no-MPI:" ; hostname         # will print one hostname per node
-echo "Example MPI:" ; mpirun hostname     # will print one hostname per ntask
-```
-
-### Submitting a job
-
-Submit the job to slurm and check its status:
-
-```bash
-sbatch mySlurm.batch   # submit this job to slurm
-squeue                 # check its status
-```
-
----
-
-## Advanced slurm test script
-
-Copy-paste the following example into a file called ``myAdvancedTest.batch``:
-
-```bash
-#!/bin/bash
-#SBATCH --partition=merlin   # name of slurm partition to submit
-#SBATCH --time=2:00:00       # limit the execution of this job to 2 hours, see sinfo for the max. allowance
-#SBATCH --nodes=2            # number of nodes
-#SBATCH --ntasks=24          # number of tasks
-
-module load gcc/8.3.0 openmpi/3.1.3
-module list
-
-echo "Example no-MPI:" ; hostname         # will print one hostname per node
-echo "Example MPI:" ; mpirun hostname     # will print one hostname per ntask
-```
-
-In the above example, the options ``--nodes=2`` and ``--ntasks=24`` are specified. This means that up to 2 nodes are requested,
-and the job is expected to run 24 tasks. Hence, 24 cores are needed for running that job. Slurm will try to allocate a maximum of 2 nodes,
-both together having at least 24 cores. Since our nodes have 44 cores each, if nodes are empty (no other users
-have running jobs there), the job will land on a single node (it has enough cores to run 24 tasks).
-
-If we want to ensure that the job uses at least two different nodes (i.e. for boosting CPU frequency, or because the job requires
-more memory per core), you should specify other options.
-
-A good example is ``--ntasks-per-node=12``. This will equally distribute the 24 tasks across the 2 nodes (12 per node).
-
-```bash
-#SBATCH --ntasks-per-node=12
-```
-
-A different example is specifying how much memory per core is needed. For instance, ``--mem-per-cpu=32000`` will reserve
-~32000MB per core. Since we have a maximum of 352000MB per node, Slurm will only be able to allocate 11 cores (32000MB x 11 cores = 352000MB).
-This means that 3 nodes will be needed (we cannot run 12 tasks per node, only 11), or we need to decrease ``--mem-per-cpu`` to a lower value
-which allows the use of at least 12 cores per node (i.e. ``28000``).
-
-```bash
-#SBATCH --mem-per-cpu=28000
-```
-
-Finally, in order to ensure exclusive use of the nodes, the option *--exclusive* can be used (see below). This will ensure that
-the requested nodes are exclusive to the job (no other users' jobs will run on these nodes, and only completely
-free nodes will be allocated).
-
-```bash
-#SBATCH --exclusive
-```
-
-This can be combined with the previous examples.
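Purely as an illustrative sketch (an editor's addition, not part of the patch itself): one possible way of combining the options discussed above in a single batch script. The partition, node and task counts, memory value and modules are taken from the examples on this page and are placeholders rather than recommendations.

```bash
#!/bin/bash
#SBATCH --partition=daily      # partition used in the basic example above
#SBATCH --time=0-12:00:00      # placeholder time limit
#SBATCH --nodes=2              # spread the job over two nodes
#SBATCH --ntasks=24            # total number of tasks
#SBATCH --ntasks-per-node=12   # force 12 tasks on each of the 2 nodes
#SBATCH --mem-per-cpu=28000    # ~28000MB per core, so 12 cores still fit into the 352000MB of a node
#SBATCH --exclusive            # request completely free nodes, not shared with other jobs

module load gcc/8.3.0 openmpi/3.1.3

mpirun hostname                # prints one hostname per task
```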
-
-More advanced configurations can be defined and combined with the previous examples. More information about advanced
-options can be found in the following link: https://slurm.schedmd.com/sbatch.html (or run 'man sbatch').
-
-If you have questions about how to properly execute your jobs, please contact us through merlin-admins@lists.psi.ch. Do not run
-advanced configurations unless you are sure of what you are doing.
-
----
-
-## Environment Modules
-
-On top of the operating system stack we provide different software using the PSI-developed
-pmodule system. Useful commands:
-
-```bash
-module avail                                       # to see the list of available software provided via pmodules
-module load gnuplot/5.2.0                          # to load a specific version of the gnuplot package
-module search hdf                                  # try it out to see which versions of the hdf5 package are provided and with which dependencies
-module load gcc/6.2.0 openmpi/1.10.2 hdf5/1.8.17   # load a specific version of hdf5, compiled with specific versions of gcc and openmpi
-module use unstable                                # to get access to packages not yet considered fully stable by the module provider (may be a very fresh version, or not yet tested by the community)
-module list                                        # to see which software is loaded in your environment
-```
-
-### Requests for New Software
-
-If you miss some package/version, contact us
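Purely as an illustrative sketch (an editor's addition, not part of the patch itself): how the module commands above might be combined with the batch script examples from earlier in this page. The partition, the task count and the ``hostname`` payload are placeholders; the module versions are the ones used in the examples above.

```bash
#!/bin/bash
#SBATCH --partition=general   # placeholder partition
#SBATCH --ntasks=4            # placeholder number of tasks

module load gcc/8.3.0 openmpi/3.1.3   # compiler and MPI modules used in the examples above
module list                           # record the loaded environment in the job output

mpirun hostname                       # prints one hostname per task
```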