Expanded PModules docs

This commit is contained in:
2021-05-21 18:39:38 +02:00
parent fcfdbf1344
commit 0fd1653938
11 changed files with 219 additions and 69 deletions

@ -50,15 +50,15 @@ The table below summarizes all possible partitions available to users:
| GPU Partition | Default Time | Max Time | PriorityJobFactor\* | PriorityTier\*\* |
|:-----------------: | :----------: | :------: | :-----------------: | :--------------: |
| `gpu` | 1 day | 1 week | 1 | 1 |
| `gpu-short` | 2 hours | 2 hours | 1000 | 500 |
| `gwendolen` | 1 hour | 12 hours | 1000 | 1000 |
\*The **PriorityJobFactor** value is added to the job priority (*PARTITION* column in `sprio -l`). In other words, jobs sent to higher priority
partitions will usually run first (however, other factors such as **job age** or, mainly, **fair share** may affect that decision). For the GPU
partitions, Slurm will also attempt to allocate jobs on higher priority partitions before partitions with lower priority.
\*\*Jobs submitted to a partition with a higher **PriorityTier** value will be dispatched before pending jobs in partitions with a lower *PriorityTier* value
and, if possible, will preempt running jobs from partitions with lower *PriorityTier* values.
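As a sketch of how partition priority plays out in practice, a short test job could be submitted to the high-priority `gpu-short` partition and its priority factors then inspected with `sprio` (the job name and payload below are hypothetical):

```bash
#!/bin/bash
#SBATCH --job-name=prio-test    # hypothetical job name
#SBATCH --partition=gpu-short   # high PriorityJobFactor / PriorityTier partition
#SBATCH --time=00:30:00         # must fit within the 2 hour limit of gpu-short
#SBATCH --gpus=1

# After submission, the per-factor priority breakdown (including the
# PARTITION column mentioned above) can be checked with:
#   sprio -l -j <jobid>
srun hostname
```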
### Merlin6 GPU Accounts
@ -71,11 +71,11 @@ This is mostly needed by users who have multiple Slurm accounts, which may def
```
Not all accounts can be used on all partitions. This is summarized in the table below:
| Slurm Account | Slurm Partitions |
|:-------------------: | :------------------: |
| **`merlin`** | `gpu`, `gpu-short` |
| `gwendolen_public` | `gwendolen` |
| `gwendolen` | `gwendolen` |
By default, all users belong to the `merlin` and `gwendolen_public` Slurm accounts. `gwendolen` is a restricted account.
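For example, to run on the `gwendolen` partition, one of the `gwendolen` accounts has to be selected explicitly (a minimal sketch; the payload is hypothetical):

```bash
#!/bin/bash
#SBATCH --account=gwendolen_public   # account available to all users on gwendolen
#SBATCH --partition=gwendolen
#SBATCH --gpus=1
#SBATCH --time=00:10:00
srun hostname
```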
@ -103,14 +103,61 @@ The GPU type is optional: if left empty, it will try allocating any type of GPU.
The different `[<type>:]` values and the `<number>` of GPUs depend on the node.
This is detailed in the table below.
| Nodes | GPU Type | #GPUs |
|:---------------------: | :-----------------------: | :---: |
| **merlin-g-[001]** | **`geforce_gtx_1080`** | 2 |
| **merlin-g-[002-005]** | **`geforce_gtx_1080`** | 4 |
| **merlin-g-[006-009]** | **`geforce_gtx_1080_ti`** | 4 |
| **merlin-g-[010-013]** | **`geforce_rtx_2080_ti`** | 4 |
| **merlin-g-014** | **`geforce_rtx_2080_ti`** | 8 |
| **merlin-g-100** | **`A100`** | 8 |
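Based on the table above, a job that needs two `geforce_rtx_2080_ti` cards could request them as follows (a sketch; any of the merlin-g-[010-014] nodes could satisfy it):

```bash
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gpus=geforce_rtx_2080_ti:2   # [<type>:]<number>; omit the type to accept any GPU
#SBATCH --time=01:00:00
srun hostname
```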
#### Constraint / Features
Instead of specifying the GPU **type**, users sometimes need to **select a GPU by the amount of memory available on the GPU** card itself.
This is defined in Slurm with **Features**: a tag that encodes the GPU memory size of the different GPU card models.
Users can select the required GPU memory size with the `--constraint` option. In that case, note that *in many cases
there is no need to specify `[<type>:]`* in the `--gpus` option.
```bash
#SBATCH --constraint=<Feature> # Possible values: gpumem_8gb, gpumem_11gb, gpumem_40gb
```
The table below shows the available **Features** and which GPU card models and GPU nodes they belong to:
<table>
<thead>
<tr>
      <th scope='colgroup' style="vertical-align:middle;text-align:center;" colspan="3">Merlin6 GPU Computing Nodes</th>
</tr>
<tr>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Nodes</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">GPU Type</th>
<th scope='col' style="vertical-align:middle;text-align:center;" colspan="1">Feature</th>
</tr>
</thead>
<tbody>
    <tr style="vertical-align:middle;text-align:center;">
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-[001-005]</b></td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1">`geforce_gtx_1080`</td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1"><b>`gpumem_8gb`</b></td>
</tr>
    <tr style="vertical-align:middle;text-align:center;">
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-[006-009]</b></td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1">`geforce_gtx_1080_ti`</td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="2"><b>`gpumem_11gb`</b></td>
</tr>
    <tr style="vertical-align:middle;text-align:center;">
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-[010-014]</b></td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1">`geforce_rtx_2080_ti`</td>
</tr>
    <tr style="vertical-align:middle;text-align:center;">
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1"><b>merlin-g-100</b></td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1">`A100`</td>
<td markdown="span" style="vertical-align:middle;text-align:center;" rowspan="1"><b>`gpumem_40gb`</b></td>
</tr>
</tbody>
</table>
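For instance, to get a card with 11 GB of GPU memory regardless of its exact model (`geforce_gtx_1080_ti` or `geforce_rtx_2080_ti`), the feature can be used instead of the type (a minimal sketch; the payload is hypothetical):

```bash
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gpus=1                   # no [<type>:] prefix needed here
#SBATCH --constraint=gpumem_11gb   # matches merlin-g-[006-014]
#SBATCH --time=01:00:00
srun hostname
```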
#### Other GPU options
@ -120,14 +167,14 @@ Below are listed the most common settings:
```bash
#SBATCH --hint=[no]multithread
#SBATCH --ntasks=<ntasks>
#SBATCH --ntasks-per-gpu=<ntasks>
#SBATCH --mem-per-gpu=<size[units]>
#SBATCH --cpus-per-gpu=<ncpus>
#SBATCH --gpus-per-node=[<type>:]<number>
#SBATCH --gpus-per-socket=[<type>:]<number>
#SBATCH --gpus-per-task=[<type>:]<number>
#SBATCH --gpu-bind=[verbose,]<type>
```
Please note that if `[<type>:]` is specified in one of these options, all other GPU options must specify it as well.
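Putting this together, a multi-task job using a consistent `[<type>:]` across its GPU options might look like the following sketch (the resource numbers and application name are illustrative only):

```bash
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --ntasks=4
#SBATCH --gpus-per-task=geforce_gtx_1080:1   # type given here...
#SBATCH --gpus-per-node=geforce_gtx_1080:4   # ...so it must be given here too
#SBATCH --cpus-per-gpu=4
#SBATCH --mem-per-gpu=8000M
#SBATCH --gpu-bind=verbose,single:1          # bind each task to its own GPU, verbosely
srun my_gpu_app                              # hypothetical application
```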