# GOTHIC

## Installation

Gothic is locally installed on Merlin in the following directory:

```bash
/data/project/general/software/gothic
```

Multiple versions are available. As of August 22, 2022, the latest installed
version is **Gothic 8.3 QA**.

Future releases will be added to the PSI Modules system, so at some point it
will be possible to load Gothic through PModules. In the meantime, one has to
use the existing installations present in
`/data/project/general/software/gothic`.

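To see which versions are currently installed, one can simply list the
installation directory (a minimal check; the set of available versions will
change over time). The `gothic8.3qa` sub-directory shown here is the one used
in the examples below:

```bash
# List the installed Gothic versions
ls /data/project/general/software/gothic

# The Gothic 8.3 QA wrapper scripts (e.g. gothic_s.sh) live under:
ls /data/project/general/software/gothic/gothic8.3qa/bin
```
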
## Running Gothic

### General requirements

When running Gothic in interactive or batch mode, one has to consider the
following requirements:

* **Always use one node only**: Gothic runs as a single instance and
  therefore cannot run across multiple nodes. Adding the option `--nodes=1-1`
  (or `-N 1-1`) is strongly recommended: it prevents Slurm from allocating
  multiple nodes if the allocation definition is ambiguous.
* **Use one task only**: Gothic spawns one main process, which then spawns
  multiple threads depending on the number of available cores. Therefore, one
  has to specify a single task (`--ntasks=1` or `-n 1`).
* **Use multiple CPUs**: since Gothic spawns multiple threads, multiple CPUs
  can be used. Adding `--cpus-per-task=<num_cpus>` (or `-c <num_cpus>`) is in
  general recommended. Notice that `<num_cpus>` must never exceed the maximum
  number of CPUs in a compute node (usually *88*); see the example after this
  list for a way to check this.
* **Use multithreading**: Gothic is an OpenMP based software, therefore
  running in hyper-threading mode is strongly recommended. Use the option
  `--hint=multithread` to enforce hyper-threading.
* **[Optional]** *Memory setup*: The default memory per CPU (4000MB) is
  usually enough for running Gothic. If you require more memory, you can
  always set the `--mem=<mem_in_MB>` option. This is in general *not
  necessary*.

### Interactive

**Running CPU-intensive interactive jobs on the login nodes is not allowed.**
Only applications capable of limiting the number of cores they use may run
there for longer times. Also, **running on the login nodes is not efficient**,
since resources are shared with other processes and users.

It is possible to submit interactive jobs to the cluster by allocating a full
compute node, or even just a few cores. This grants dedicated CPUs and
resources and, in general, does not affect other users.

For interactive jobs, it is strongly recommended to use the `hourly` partition,
which usually has good node availability.

For longer runs, one should use the `daily` (or `general`) partition. However,
getting interactive access to nodes on these partitions is sometimes more
difficult when the cluster is fairly full.

To submit an interactive job, consider the following requirements:

* **X11 forwarding must be enabled**: Gothic spawns an interactive
  window which requires X11 forwarding when used remotely; therefore,
  the Slurm option `--x11` is necessary.
* **Ensure that the scratch area is accessible**: For running Gothic,
  one has to define a scratch area with the `GTHTMP` environment variable.
  There are two options:
  1. **Use local scratch**: Each compute node has its own `/scratch` area.
     This area is independent of any other node, and therefore not visible
     from other nodes. Using the top directory `/scratch` for interactive
     jobs is the simplest way, and it can be defined before or after the
     allocation is created, as follows:

     ```bash
     # Example 1: Define GTHTMP before the allocation
     export GTHTMP=/scratch
     salloc ...

     # Example 2: Define GTHTMP after the allocation
     salloc ...
     export GTHTMP=/scratch
     ```

     Notice that if you want to use a custom sub-directory (i.e.
     `/scratch/$USER`), you have to create that sub-directory on every new
     allocation! For example:

     ```bash
     # Example 1:
     export GTHTMP=/scratch/$USER
     salloc ...
     mkdir -p $GTHTMP

     # Example 2:
     salloc ...
     export GTHTMP=/scratch/$USER
     mkdir -p $GTHTMP
     ```

     Creating sub-directories makes the process more complex, therefore using
     just `/scratch` is simpler and recommended.
  2. **Shared scratch**: The shared scratch area provides a directory visible
     from all compute nodes and login nodes. Therefore, one can use
     `/shared-scratch` to achieve the same as in **1.**, but the sub-directory
     only needs to be created once, as shown in the sketch below.

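     For example (a minimal sketch, assuming a personal sub-directory named
     after the user; the directory name is only an illustration):

     ```bash
     # Create the sub-directory once; it is visible from all nodes
     mkdir -p /shared-scratch/$USER
     export GTHTMP=/shared-scratch/$USER
     salloc ...
     ```
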
  Please consider that `/scratch` usually provides better performance and, in
  addition, offloads the main storage. Therefore, using **local scratch** is
  strongly recommended. Use the shared scratch only when strictly necessary.

* **Use the `hourly` partition**: Using the `hourly` partition is
  recommended for running interactive jobs (latency is in general lower).
  However, `daily` and `general` are also available if you expect longer
  runs, but in these cases you should expect longer waiting times.

These requirements are in addition to the requirements previously described in
the [General requirements](#general-requirements) section.

#### Interactive allocations: examples

* Requesting a full node:

  ```bash
  salloc --partition=hourly -N 1 -n 1 -c 88 --hint=multithread --x11 --exclusive --mem=0
  ```

* Requesting 22 CPUs from a node, with default memory per CPU (4000MB/CPU):

  ```bash
  num_cpus=22
  salloc --partition=hourly -N 1 -n 1 -c $num_cpus --hint=multithread --x11
  ```

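Once the allocation is granted and `GTHTMP` is defined as described above,
Gothic can be started with the same wrapper script used in the batch examples
below. This is only a sketch: it assumes the **Gothic 8.3 QA** installation and
an input file called `MY_INPUT.SIN` (both are examples), and depending on the
site's Slurm configuration you may need to launch the command through `srun`
instead of running it directly in the allocation shell:

```bash
# Run Gothic interactively, using all CPUs granted to the allocation
/data/project/general/software/gothic/gothic8.3qa/bin/gothic_s.sh MY_INPUT.SIN -m -np $SLURM_CPUS_PER_TASK
```
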
### Batch job

The Slurm cluster is mainly used by non-interactive batch jobs: users submit a
job, which goes into a queue and waits until Slurm can assign resources to it.
In general, the longer the job, the longer the waiting time, unless there are
enough free resources to start running it immediately.

Running Gothic in a Slurm batch script is fairly simple. One mainly has to
consider the requirements described in the [General
requirements](#general-requirements) section, and:

* **Use local scratch** for running batch jobs. In general, defining
  `GTHTMP` in a batch script is simpler than in an interactive allocation.
  If you plan to run multiple jobs on the same node, you can even create a
  second sub-directory level based on the Slurm Job ID:

  ```bash
  mkdir -p /scratch/$USER/$SLURM_JOB_ID
  export GTHTMP=/scratch/$USER/$SLURM_JOB_ID
  ... # Run Gothic here
  rm -rf /scratch/$USER/$SLURM_JOB_ID
  ```

Temporary data generated by the job in `GTHTMP` must be removed at the end of
the job, as shown above.

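Since that final cleanup line is skipped if the job aborts earlier, a common
bash pattern is to register the cleanup with `trap` right after creating the
directory. This is a generic shell sketch, not something required by Gothic
itself, and it does not cover jobs killed with SIGKILL:

```bash
# Create the scratch directory and remove it automatically when the script exits
export GTHTMP=/scratch/$USER/$SLURM_JOB_ID
mkdir -p "$GTHTMP"
trap 'rm -rf "$GTHTMP"' EXIT
```
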
#### Batch script: examples

* Requesting a full node:

  ```bash
  #!/bin/bash -l
  #SBATCH --job-name=Gothic
  #SBATCH --time=3-00:00:00
  #SBATCH --partition=general
  #SBATCH --nodes=1
  #SBATCH --ntasks=1
  #SBATCH --cpus-per-task=88
  #SBATCH --hint=multithread
  #SBATCH --exclusive
  #SBATCH --mem=0
  #SBATCH --clusters=merlin6

  INPUT_FILE='MY_INPUT.SIN'

  mkdir -p /scratch/$USER/$SLURM_JOB_ID
  export GTHTMP=/scratch/$USER/$SLURM_JOB_ID

  /data/project/general/software/gothic/gothic8.3qa/bin/gothic_s.sh $INPUT_FILE -m -np $SLURM_CPUS_PER_TASK
  gth_exit_code=$?

  # Clean up data in /scratch
  rm -rf /scratch/$USER/$SLURM_JOB_ID

  # Return exit code from GOTHIC
  exit $gth_exit_code
  ```

* Requesting 22 CPUs from a node, with default memory per CPU (4000MB/CPU):

  ```bash
  #!/bin/bash -l
  #SBATCH --job-name=Gothic
  #SBATCH --time=3-00:00:00
  #SBATCH --partition=general
  #SBATCH --nodes=1
  #SBATCH --ntasks=1
  #SBATCH --cpus-per-task=22
  #SBATCH --hint=multithread
  #SBATCH --clusters=merlin6

  INPUT_FILE='MY_INPUT.SIN'

  mkdir -p /scratch/$USER/$SLURM_JOB_ID
  export GTHTMP=/scratch/$USER/$SLURM_JOB_ID

  /data/project/general/software/gothic/gothic8.3qa/bin/gothic_s.sh $INPUT_FILE -m -np $SLURM_CPUS_PER_TASK
  gth_exit_code=$?

  # Clean up data in /scratch
  rm -rf /scratch/$USER/$SLURM_JOB_ID

  # Return exit code from GOTHIC
  exit $gth_exit_code
  ```