first stab at mkdocs migration
refactor CSCS and Meg content; add merlin6 quick start; update merlin6 NoMachine docs; give the userdoc its own color scheme (we use the Material default one); refactor general Slurm docs for merlin6; add merlin6 JB docs; add m6 software support docs; add all files to nav; vibed changes #1; add missing pages; further vibing #2; vibe #3; further fixes
docs/merlin6/software-support/gothic.md (new file, 222 lines added)
@@ -0,0 +1,222 @@
# GOTHIC

## Installation

Gothic is locally installed in Merlin in the following directory:

```bash
/data/project/general/software/gothic
```

Multiple versions are available. As of August 22, 2022, the latest installed
version is **Gothic 8.3 QA**.

Future releases will be added to the PSI Modules system, so loading Gothic
through PModules will be possible at some point. In the meantime, one has to
use the existing installations present in
`/data/project/general/software/gothic`.

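Until a PModules package is available, one simple way to use the local
installation is to check which versions are installed and prepend the
corresponding `bin` directory to `PATH`. This is only a sketch; the directory
layout of newer versions may differ, so verify the path first.

```bash
# List the Gothic versions currently installed (layout may change over time)
ls /data/project/general/software/gothic

# Example for Gothic 8.3 QA: make its launcher scripts available in the shell
export PATH=/data/project/general/software/gothic/gothic8.3qa/bin:$PATH
```
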
## Running Gothic

### General requirements

When running Gothic in interactive or batch mode, one has to consider
the following requirements:

* **Always use one node only**: Gothic runs as a single instance and
  therefore cannot run on multiple nodes. Adding the option `--nodes=1-1` or
  `-N 1-1` is strongly recommended: this prevents Slurm from allocating
  multiple nodes if the allocation definition is ambiguous.
* **Use one task only**: Gothic spawns one main process, which then
  spawns multiple threads depending on the number of available cores.
  Therefore, one has to specify 1 task (`--ntasks=1` or `-n 1`).
* **Use multiple CPUs**: since Gothic spawns multiple threads,
  multiple CPUs can be used. Adding `--cpus-per-task=<num_cpus>` or `-c
  <num_cpus>` is in general recommended. Notice that `<num_cpus>` must never
  exceed the maximum number of CPUs in a compute node (usually *88*).
* **Use multithreading**: Gothic is an OpenMP-based software; therefore,
  running in hyper-threading mode is strongly recommended. Use the option
  `--hint=multithread` to enforce hyper-threading.
* **[Optional]** *Memory setup*: The default memory per CPU (4000MB)
  is usually enough for running Gothic. If you require more memory, you can
  always set the `--mem=<mem_in_MB>` option. This is in general *not
  necessary*.

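As a quick reference, the sketch below combines these options in a single
allocation request; the CPU count and partition are placeholders, and complete
interactive and batch examples follow in the next sections.

```bash
# Minimal sketch of the general requirements combined:
# one node, one task, several CPUs, hyper-threading enabled
salloc --nodes=1-1 --ntasks=1 --cpus-per-task=44 --hint=multithread --partition=hourly
```
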
### Interactive

**Running CPU-intensive interactive jobs on the login nodes is not allowed**.
Only applications capable of limiting the number of cores they use are allowed
to run there for a longer time. Also, **running on the login nodes is not
efficient**, since resources are shared with other processes and users.

It is possible to submit interactive jobs to the cluster by allocating a full
compute node, or even by allocating a few cores only. This grants dedicated
CPUs and resources and in general does not affect other users.

For interactive jobs, it is strongly recommended to use the `hourly`
partition, which usually has a good availability of nodes.

For longer runs, one should use the `daily` (or `general`) partition. However,
getting interactive access to nodes in these partitions is sometimes more
difficult when the cluster is heavily used.

To submit an interactive job, consider the following requirements:

* **X11 forwarding must be enabled**: Gothic spawns an interactive
  window which requires X11 forwarding when used remotely; therefore,
  the Slurm option `--x11` is necessary.
* **Ensure that the scratch area is accessible**: For running Gothic,
  one has to define a scratch area with the `GTHTMP` environment variable.
  There are two options:

    1. **Use local scratch**: Each compute node has its own `/scratch` area.
       This area is independent of any other node and therefore not visible
       by other nodes. Using the top directory `/scratch` for interactive
       jobs is the simplest way, and it can be defined before or after the
       allocation is created, as follows:

       ```bash
       # Example 1: Define GTHTMP before the allocation
       export GTHTMP=/scratch
       salloc ...

       # Example 2: Define GTHTMP after the allocation
       salloc ...
       export GTHTMP=/scratch
       ```

       Notice that if you want to use a custom sub-directory (e.g.
       `/scratch/$USER`), one has to create the sub-directory on every new
       allocation. For example:

       ```bash
       # Example 1:
       export GTHTMP=/scratch/$USER
       salloc ...
       mkdir -p $GTHTMP

       # Example 2:
       salloc ...
       export GTHTMP=/scratch/$USER
       mkdir -p $GTHTMP
       ```

       Creating sub-directories makes the process more complex, therefore
       using just `/scratch` is simpler and recommended.
    2. **Shared scratch**: Using the shared scratch provides a directory
       visible from all compute nodes and login nodes. Therefore, one can use
       `/shared-scratch` to achieve the same as in **1.**, but the
       sub-directory needs to be created only once (see the sketch after
       this list).

    Please consider that `/scratch` usually provides better performance and,
    in addition, offloads the main storage. Therefore, using **local
    scratch** is strongly recommended. Use the shared scratch only when
    strictly necessary.

* **Use the `hourly` partition**: Using the `hourly` partition is
  recommended for running interactive jobs (latency is in general lower).
  However, `daily` and `general` are also available if you expect longer
  runs, but in these cases you should expect longer waiting times.

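For completeness, here is a minimal sketch of the shared scratch variant
described in option **2.** above; the `$USER` sub-directory is just an example
name:

```bash
# The shared scratch is visible from all login and compute nodes,
# so the sub-directory only needs to be created once
mkdir -p /shared-scratch/$USER
export GTHTMP=/shared-scratch/$USER
```
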
These requirements are in addition to the requirements previously described in
the [General requirements](#general-requirements) section.

#### Interactive allocations: examples

* Requesting a full node:

  ```bash
  salloc --partition=hourly -N 1 -n 1 -c 88 --hint=multithread --x11 --exclusive --mem=0
  ```

* Requesting 22 CPUs from a node, with default memory per CPU (4000MB/CPU):

  ```bash
  num_cpus=22
  salloc --partition=hourly -N 1 -n 1 -c $num_cpus --hint=multithread --x11
  ```

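Once the allocation is granted, Gothic can be started from within it. The
lines below are only a sketch: the input file name is a placeholder, and the
launcher path corresponds to the Gothic 8.3 QA installation used in the batch
examples below.

```bash
# Inside the allocation: define the scratch area and launch Gothic
# (depending on the setup, the command may need to be prefixed with srun)
export GTHTMP=/scratch
/data/project/general/software/gothic/gothic8.3qa/bin/gothic_s.sh MY_INPUT.SIN -m -np $SLURM_CPUS_PER_TASK
```
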
### Batch job

The Slurm cluster is mainly used for non-interactive batch jobs: users submit a
job, which goes into a queue and waits until Slurm can assign resources to it.
In general, the longer the job, the longer the waiting time, unless there are
enough free resources to start running it immediately.

Running Gothic in a Slurm batch script is fairly simple. One mainly has to
consider the requirements described in the [General
requirements](#general-requirements) section, and:

* **Use local scratch** for running batch jobs. In general, defining
  `GTHTMP` in a batch script is simpler than in an allocation. If you plan to
  run multiple jobs on the same node, you can even create a second
  sub-directory level based on the Slurm Job ID:

  ```bash
  mkdir -p /scratch/$USER/$SLURM_JOB_ID
  export GTHTMP=/scratch/$USER/$SLURM_JOB_ID
  ... # Run Gothic here
  rm -rf /scratch/$USER/$SLURM_JOB_ID
  ```

Temporary data generated by the job in `GTHTMP` must be removed at the end of
the job, as shown above.

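As an optional refinement (a sketch, not part of the examples below), a bash
`trap` can take care of the cleanup so that the temporary directory is removed
even when Gothic exits with an error:

```bash
# Remove the scratch sub-directory whenever the script exits,
# regardless of the Gothic exit code
export GTHTMP=/scratch/$USER/$SLURM_JOB_ID
mkdir -p "$GTHTMP"
trap 'rm -rf "$GTHTMP"' EXIT
```
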
#### Batch script: examples

* Requesting a full node:

  ```bash
  #!/bin/bash -l
  #SBATCH --job-name=Gothic
  #SBATCH --time=3-00:00:00
  #SBATCH --partition=general
  #SBATCH --nodes=1
  #SBATCH --ntasks=1
  #SBATCH --cpus-per-task=88
  #SBATCH --hint=multithread
  #SBATCH --exclusive
  #SBATCH --mem=0
  #SBATCH --clusters=merlin6

  INPUT_FILE='MY_INPUT.SIN'

  mkdir -p /scratch/$USER/$SLURM_JOB_ID
  export GTHTMP=/scratch/$USER/$SLURM_JOB_ID

  # Run Gothic and capture its exit code
  /data/project/general/software/gothic/gothic8.3qa/bin/gothic_s.sh $INPUT_FILE -m -np $SLURM_CPUS_PER_TASK
  gth_exit_code=$?

  # Clean up data in /scratch
  rm -rf /scratch/$USER/$SLURM_JOB_ID

  # Return exit code from GOTHIC
  exit $gth_exit_code
  ```

* Requesting 22 CPUs from a node, with default memory per CPU (4000MB/CPU):

  ```bash
  #!/bin/bash -l
  #SBATCH --job-name=Gothic
  #SBATCH --time=3-00:00:00
  #SBATCH --partition=general
  #SBATCH --nodes=1
  #SBATCH --ntasks=1
  #SBATCH --cpus-per-task=22
  #SBATCH --hint=multithread
  #SBATCH --clusters=merlin6

  INPUT_FILE='MY_INPUT.SIN'

  mkdir -p /scratch/$USER/$SLURM_JOB_ID
  export GTHTMP=/scratch/$USER/$SLURM_JOB_ID

  # Run Gothic and capture its exit code
  /data/project/general/software/gothic/gothic8.3qa/bin/gothic_s.sh $INPUT_FILE -m -np $SLURM_CPUS_PER_TASK
  gth_exit_code=$?

  # Clean up data in /scratch
  rm -rf /scratch/$USER/$SLURM_JOB_ID

  # Return exit code from GOTHIC
  exit $gth_exit_code
  ```

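Assuming one of the scripts above is saved as, for example, `gothic.batch`
(the file name is arbitrary), it can be submitted and monitored as follows:

```bash
# Submit the batch script (the --clusters directive is already set inside it)
sbatch gothic.batch

# Check the state of your jobs on the merlin6 cluster
squeue --clusters=merlin6 -u $USER
```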