Updates in Slurm

2019-07-01 18:04:15 +02:00
parent 864ef84a0f
commit 5c2ea17076
4 changed files with 23 additions and 88 deletions
--- a/pages/merlin6/accessing-merlin6/accessing-slurm.md
+++ b/pages/merlin6/accessing-merlin6/accessing-slurm.md
@@ -19,22 +19,7 @@ Slurm has been installed in a **multi-clustered** configuration, allowing to int
   * **merlin5** will exist as long as hardware incidents are soft and easy to repair/fix (i.e. hard disk replacement)
   * **merlin6** is the default cluster when submitting jobs.

-This document is mostly focused on the **merlin6** cluster. Details for **merlin5** are not shown here, and only basic access and recent
-changes will be explained (**[Official Merlin5 User Guide](https://intranet.psi.ch/PSI_HPC/Merlin5)** is still valid).
-
-### Merlin6 Slurm Configuration Details
-
-For understanding the Slurm configuration setup in the cluster, sometimes can be useful to check the following files:
-
-* ``/etc/slurm/slurm.conf`` - can be found in the login nodes and computing nodes.
-* ``/etc/slurm/cgroup.conf`` - can be found in the computing nodes, is also propagated to login nodes for user read access.
-* ``/etc/slurm/gres.conf`` - can be found in the GPU nodes, is also propgated to login nodes and computing nodes for user read access.
-
-The previous configuration files can be found in the *login nodes* correspond exclusively to the **merlin6** cluster configuration files. These
-configuration files are also present in the **merlin6** *computing nodes*.
-
-Slurm configuration files for the old **merlin5** cluster have to be directly checked on any of the **merlin5** *computing nodes*: those files *do
-not* exist in the **merlin6** *login nodes*.
+Please refer to the **Merlin6 Slurm** section for more details about configuration and job submission.

 ### Merlin5 Access

@@ -49,50 +34,8 @@ srun --clusters=merlin5 --partition=merlin hostname
 sbatch --clusters=merlin5 --partition=merlin myScript.batch
 ```

---
+### Merlin6 Access

-## Using Slurm 'merlin6' cluster
-
-Basic usage for the **merlin6** cluster will be detailed here. For advanced usage, please use the following document [LINK TO SLURM ADVANCED CONFIG]()
-
-### Merlin6 Node definition
-
-The following table show default and maximum resources that can be used per node:
-
-| Nodes                              | Def.#CPUs | Max.#CPUs | Def.Mem/CPU | Max.Mem/CPU | Max.Mem/Node | Max.Swap | Def.#GPUs | Max.#GPUs |
-|:---------------------------------- | ---------:| ---------:| -----------:| -----------:| ------------:| --------:| --------- | --------- |
-| merlin-c-[001-022,101-122,201-222] | 1 core    | 44 cores  | 8000        | 352000      | 352000       | 10000    | N/A       | N/A       |
-| merlin-g-[001]                     | 1 core    | 8 cores   | 8000        | 102498      | 102498       | 10000    | 1         | 2         |
-| merlin-g-[002-009]                 | 1 core    | 10 cores  | 8000        | 102498      | 102498       | 10000    | 1         | 4         |
-
-If nothing is specified, by default each core will use up to 8GB of memory. More memory per core can be specified with the ``--mem=<memory>`` option,
-and maximum memory allowed is ``Max.Mem/Node``.
-
-In *Merlin6*, memory is considered a Consumable Resource, as well as the CPU.
-
-### Merlin6 Slurm partitions
-
-Partition can be specified when submitting a job with the ``--partition=<partitionname>`` option.
-The following *partitions* (also known as *queues*) are configured in Slurm:
-
-| Partition   | Default Partition | Default Time | Max Time | Max Nodes | Priority |
-|:----------- | ----------------- | ------------ | -------- | --------- | -------- |
-| **general** | true              | 1 day        | 1 week   | 50        | low      |
-| **daily**   | false             | 1 day        | 1 day    | 60        | medium   |
-| **hourly**  | false             | 1 hour       | 1 hour   | unlimited | highest  |
-
-General is the *default*, so when nothing is specified job will be by default assigned to that partition. General can not have more than 50 nodes
-running jobs. For **daily** this limitation is extended to 60 nodes while for **hourly** there are no limits. Shorter jobs have more priority than
-longer jobs, hence in general terms would be scheduled before (however, other factors such like user fair share value can affect to this decision).
-
-### Merlin6 User limits
-
-By default, users can not use more than 528 cores at the same time (Max CPU per user). This limit applies for the **general**  and **daily** partitions. For the **hourly** partition, there is no restriction.
-These limits are softed for the **daily** partition during non working hours and during the weekend as follows:
-
-| Partition   | Mon-Fri 08h-18h | Sun-Thu 18h-0h | From Fri 18h to Sun 8h  | From Sun 8h to Mon 18h |
-|:----------- | --------------- | -------------- | ----------------------- | ---------------------- |
-| **general** | 528             | 528            | 528                     | 528                    |
-| **daily**   | 528             | 792            | Unlimited               | 792                    |
-| **hourly**  | Unlimited       | Unlimited      | Unlimited               | Unlimited              |
+By default, any job submitted without specifying ``--clusters=`` uses the local cluster, so nothing extra needs to be specified. In any case,
+you can optionally add ``--clusters=merlin6`` to force submission to the Merlin6 cluster.