Doc changes

pages/merlin6/05-Software-Support/ansys-cfx.md (new file)

---
title: ANSYS / CFX
#tags:
last_updated: 30 June 2020
keywords: software, ansys, cfx5, cfx, slurm
summary: "This document describes how to run ANSYS/CFX in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/ansys-cfx.html
---

This document describes the different ways of running **ANSYS/CFX**.

## ANSYS/CFX

It is always recommended to check which parameters are available in CFX and to adapt the examples below to your needs.
To get a list of options, run `cfx5solve -help`.

## Running CFX jobs

### PModules

Using the latest ANSYS software available in PModules, **ANSYS/2020R1-1**, is strongly recommended.

```bash
module use unstable
module load ANSYS/2020R1-1
```

### Non-interactive: sbatch

Running jobs with `sbatch` is always the recommended method, as it makes more efficient use of the resources. Notice that for
running non-interactive CFX jobs one must specify the `-batch` option.

#### Serial example

This example shows a very basic serial job.

```bash
#!/bin/bash
#SBATCH --job-name=CFX           # Job name
#SBATCH --partition=hourly       # Using 'daily' will grant higher priority than 'general'
#SBATCH --time=0-01:00:00        # Time needed for running the job. Must match the 'partition' limits.
#SBATCH --cpus-per-task=1        # Double if hyperthreading enabled
#SBATCH --ntasks-per-core=1      # Double if hyperthreading enabled
#SBATCH --hint=nomultithread     # Disable hyperthreading
#SBATCH --error=slurm-%j.err     # Define your error file

module use unstable
module load ANSYS/2020R1-1

# [Optional:BEGIN] Specify your license server if this is not 'lic-ansys.psi.ch'
LICENSE_SERVER=<your_license_server>
export ANSYSLMD_LICENSE_FILE=1055@$LICENSE_SERVER
export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]

SOLVER_FILE=/data/user/caubet_m/CFX5/mysolver.in
cfx5solve -batch -def "$SOLVER_FILE"
```

One can enable hyperthreading by defining `--hint=multithread`, `--cpus-per-task=2` and `--ntasks-per-core=2`.
However, this is in general not recommended unless one can ensure it is beneficial.
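
As a sketch, the corresponding hyperthreaded Slurm settings for the serial header above would be:

```bash
#SBATCH --cpus-per-task=2        # Doubled: two hardware threads per task
#SBATCH --ntasks-per-core=2      # Doubled: two tasks per physical core
#SBATCH --hint=multithread       # Enable hyperthreading
```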

#### MPI-based example

An example for running CFX using a Slurm batch script is the following:

```bash
#!/bin/bash
#SBATCH --job-name=CFX           # Job name
#SBATCH --partition=hourly       # Using 'daily' will grant higher priority than 'general'
#SBATCH --time=0-01:00:00        # Time needed for running the job. Must match the 'partition' limits.
#SBATCH --nodes=1                # Number of nodes
#SBATCH --ntasks=44              # Number of tasks
#SBATCH --cpus-per-task=1        # Double if hyperthreading enabled
#SBATCH --ntasks-per-core=1      # Double if hyperthreading enabled
#SBATCH --hint=nomultithread     # Disable hyperthreading
#SBATCH --error=slurm-%j.err     # Define a file for standard error messages
##SBATCH --exclusive             # Uncomment if you want exclusive usage of the nodes

module use unstable
module load ANSYS/2020R1-1

# [Optional:BEGIN] Specify your license server if this is not 'lic-ansys.psi.ch'
LICENSE_SERVER=<your_license_server>
export ANSYSLMD_LICENSE_FILE=1055@$LICENSE_SERVER
export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]

# Build a comma-separated list of the allocated hosts
export HOSTLIST=$(scontrol show hostname | tr '\n' ',' | sed 's/,$//g')

JOURNAL_FILE=myjournal.in

# INTELMPI=no  for IBM MPI
# INTELMPI=yes for Intel MPI
INTELMPI=no

if [ "$INTELMPI" == "yes" ]
then
    export I_MPI_DEBUG=4
    export I_MPI_PIN_CELL=core

    # Simple example: cfx5solve -batch -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
    #                           -part $SLURM_NTASKS \
    #                           -start-method 'Intel MPI Distributed Parallel'
    cfx5solve -batch -part-large -double -verbose -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
              -part $SLURM_NTASKS -par-local -start-method 'Intel MPI Distributed Parallel'
else
    # Simple example: cfx5solve -batch -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
    #                           -part $SLURM_NTASKS \
    #                           -start-method 'IBM MPI Distributed Parallel'
    cfx5solve -batch -part-large -double -verbose -def "$JOURNAL_FILE" -par-dist "$HOSTLIST" \
              -part $SLURM_NTASKS -par-local -start-method 'IBM MPI Distributed Parallel'
fi
```

In the above example, one can increase the number of *nodes* and/or *ntasks* if needed, and combine this
with `--exclusive` when necessary. In general, **no hyperthreading** is recommended for MPI-based jobs.
Finally, one can change the MPI implementation in `-start-method`
(check the CFX documentation for possible values).
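
For instance, a two-node variant of the header above might look as follows (a sketch, assuming 44 cores per node as in the example):

```bash
#SBATCH --nodes=2                # Two full nodes
#SBATCH --ntasks=88              # 44 tasks per node
#SBATCH --exclusive              # Exclusive usage of the nodes
```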

## CFX5 Launcher: CFD-Pre/Post, Solve Manager, TurboGrid

Some users might need to visualize or change some parameters when running calculations with the CFX Solver. For running
**TurboGrid**, **CFX-Pre**, **CFX-Solver Manager** or **CFD-Post**, one should run the **`cfx5` launcher** binary:

```bash
cfx5
```



Then, from the launcher, one can open the proper application (e.g. **CFX-Solver Manager** for visualizing and modifying an
existing job run).

Running the CFX5 launcher requires proper SSH access with X11 forwarding (`ssh -XY`) or, *preferably*, **NoMachine**.
If **ssh** does not work for you, please use **NoMachine** instead (which is the supported X-based access method, and simpler).

pages/merlin6/05-Software-Support/ansys-fluent.md (new file)

---
title: ANSYS / Fluent
#tags:
last_updated: 30 June 2020
keywords: software, ansys, fluent, slurm
summary: "This document describes how to run ANSYS/Fluent in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/ansys-fluent.html
---

This document describes the different ways of running **ANSYS/Fluent**.

## ANSYS/Fluent

It is always recommended to check which parameters are available in Fluent and to adapt the example below to your needs.
To get a list of options, run `fluent -help`. In addition, when running Fluent one must specify one of the
following flags:
* **2d**: This is a 2D solver with single precision.
* **3d**: This is a 3D solver with single precision.
* **2ddp**: This is a 2D solver with double precision.
* **3ddp**: This is a 3D solver with double precision.

## Running Fluent jobs

### PModules

Using the latest ANSYS software available in PModules, **ANSYS/2020R1-1**, is strongly recommended.

```bash
module use unstable
module load ANSYS/2020R1-1
```

### Non-interactive: sbatch

Running jobs with `sbatch` is always the recommended method, as it makes more efficient use of the resources.
For running Fluent as a batch job, one needs to run it in non-graphical mode (`-g` option).

#### Serial example

This example shows a very basic serial job.

```bash
#!/bin/bash
#SBATCH --job-name=Fluent        # Job name
#SBATCH --partition=hourly       # Using 'daily' will grant higher priority than 'general'
#SBATCH --time=0-01:00:00        # Time needed for running the job. Must match the 'partition' limits.
#SBATCH --cpus-per-task=1        # Double if hyperthreading enabled
#SBATCH --hint=nomultithread     # Disable hyperthreading
#SBATCH --error=slurm-%j.err     # Define your error file

module use unstable
module load ANSYS/2020R1-1

# [Optional:BEGIN] Specify your license server if this is not 'lic-ansys.psi.ch'
LICENSE_SERVER=<your_license_server>
export ANSYSLMD_LICENSE_FILE=1055@$LICENSE_SERVER
export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]

JOURNAL_FILE=/data/user/caubet_m/Fluent/myjournal.in
fluent 3ddp -g -i ${JOURNAL_FILE}
```

One can enable hyperthreading by defining `--hint=multithread`, `--cpus-per-task=2` and `--ntasks-per-core=2`.
However, this is in general not recommended unless one can ensure it is beneficial.

#### MPI-based example

An example for running Fluent using a Slurm batch script is the following:

```bash
#!/bin/bash
#SBATCH --job-name=Fluent        # Job name
#SBATCH --partition=hourly       # Using 'daily' will grant higher priority than 'general'
#SBATCH --time=0-01:00:00        # Time needed for running the job. Must match the 'partition' limits.
#SBATCH --nodes=1                # Number of nodes
#SBATCH --ntasks=44              # Number of tasks
#SBATCH --cpus-per-task=1        # Double if hyperthreading enabled
#SBATCH --ntasks-per-core=1      # Run one task per core
#SBATCH --hint=nomultithread     # Disable hyperthreading
#SBATCH --error=slurm-%j.err     # Define a file for standard error messages
##SBATCH --exclusive             # Uncomment if you want exclusive usage of the nodes

module use unstable
module load ANSYS/2020R1-1

# [Optional:BEGIN] Specify your license server if this is not 'lic-ansys.psi.ch'
LICENSE_SERVER=<your_license_server>
export ANSYSLMD_LICENSE_FILE=1055@$LICENSE_SERVER
export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]

JOURNAL_FILE=/data/user/caubet_m/Fluent/myjournal.in
fluent 3ddp -g -t ${SLURM_NTASKS} -i ${JOURNAL_FILE}
```

In the above example, one can increase the number of *nodes* and/or *ntasks* if needed. Removing
`--nodes` allows the tasks to spread over multiple nodes, but this may lead to communication overhead. In general, **no
hyperthreading** is recommended for MPI-based jobs. Also, one can combine it with `--exclusive` when necessary.

## Interactive: salloc

Running Fluent interactively is strongly discouraged; whenever possible, one should use `sbatch`.
However, sometimes interactive runs are needed. For jobs requiring only a few CPUs (for example, 2 CPUs) **and** running for a short period of time, one can use the login nodes.
Otherwise, one must use the Slurm batch system using allocations:
* For short jobs requiring more CPUs, one can use the shortest Merlin partition (`hourly`).
* For longer jobs, one can use longer partitions; however, interactive access is not always possible (depending on the usage of the cluster).

Please refer to the documentation **[Running Interactive Jobs](/merlin6/interactive-jobs.html)** for further information about the different ways of running interactive
jobs in the Merlin6 cluster.

### Requirements

#### SSH Keys

Running Fluent interactively requires the use of SSH keys, which are the means of communication between the GUI and the different nodes. For this, one must have
a **passphrase protected** SSH key. To check whether SSH keys already exist, simply run **`ls $HOME/.ssh/`** and look for **`id_rsa`** files. For
deploying SSH keys for running Fluent interactively, one should follow this documentation: **[Configuring SSH Keys](/merlin6/ssh-keys.html)**
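
If no key pair exists yet, a new passphrase-protected one can be generated as follows (a sketch; make sure to enter a non-empty passphrase when prompted):

```bash
ssh-keygen -t rsa
```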

#### List of hosts

For running Fluent on the Slurm computing nodes, one needs the list of the reserved nodes. Once the allocation has been granted, one can get that list by running
the following command:

```bash
scontrol show hostname
```

This list must be included in the settings as the list of hosts on which to run Fluent. Alternatively, one can pass that list as a parameter (`-cnf` option) when running `fluent`,
as follows:

<details>
<summary>[Running Fluent with 'salloc' example]</summary>
<pre class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">
(base) [caubet_m@merlin-l-001 caubet_m]$ salloc --nodes=2 --ntasks=88 --hint=nomultithread --time=0-01:00:00 --partition=test $SHELL
salloc: Pending job allocation 135030174
salloc: job 135030174 queued and waiting for resources
salloc: job 135030174 has been allocated resources
salloc: Granted job allocation 135030174

(base) [caubet_m@merlin-l-001 caubet_m]$ module use unstable
(base) [caubet_m@merlin-l-001 caubet_m]$ module load ANSYS/2020R1-1
module load: unstable module has been loaded -- ANSYS/2020R1-1

(base) [caubet_m@merlin-l-001 caubet_m]$ fluent 3ddp -t$SLURM_NPROCS -cnf=$(scontrol show hostname | tr '\n' ',')

(base) [caubet_m@merlin-l-001 caubet_m]$ exit
exit
salloc: Relinquishing job allocation 135030174
salloc: Job allocation 135030174 has been revoked.
</pre>
</details>

pages/merlin6/05-Software-Support/ansys-mapdl.md (new file)

---
title: ANSYS / MAPDL
#tags:
last_updated: 30 June 2020
keywords: software, ansys, mapdl, slurm, apdl
summary: "This document describes how to run ANSYS/Mechanical APDL in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/ansys-mapdl.html
---

This document describes the different ways of running **ANSYS/Mechanical APDL**.

## ANSYS/Mechanical APDL

It is always recommended to check which parameters are available in Mechanical APDL and to adapt the examples below to your needs.
For that, please refer to the official Mechanical APDL documentation.

## Running Mechanical APDL jobs

### PModules

Using the latest ANSYS software available in PModules, **ANSYS/2020R1-1**, is strongly recommended.

```bash
module use unstable
module load ANSYS/2020R1-1
```

### Non-interactive: sbatch

Running jobs with `sbatch` is always the recommended method, as it makes more efficient use of the resources. Notice that for
running non-interactive Mechanical APDL jobs one must specify the `-b` option.

#### Serial example

This example shows a very basic serial job.

```bash
#!/bin/bash
#SBATCH --job-name=MAPDL         # Job name
#SBATCH --partition=hourly       # Using 'daily' will grant higher priority than 'general'
#SBATCH --time=0-01:00:00        # Time needed for running the job. Must match the 'partition' limits.
#SBATCH --cpus-per-task=1        # Double if hyperthreading enabled
#SBATCH --ntasks-per-core=1      # Double if hyperthreading enabled
#SBATCH --hint=nomultithread     # Disable hyperthreading
#SBATCH --error=slurm-%j.err     # Define your error file

module use unstable
module load ANSYS/2020R1-1

# [Optional:BEGIN] Specify your license server if this is not 'lic-ansys.psi.ch'
LICENSE_SERVER=<your_license_server>
export ANSYSLMD_LICENSE_FILE=1055@$LICENSE_SERVER
export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]

SOLVER_FILE=/data/user/caubet_m/MAPDL/mysolver.in
mapdl -b -i "$SOLVER_FILE"
```

One can enable hyperthreading by defining `--hint=multithread`, `--cpus-per-task=2` and `--ntasks-per-core=2`.
However, this is in general not recommended unless one can ensure it is beneficial.

#### SMP-based example

This example shows how to run Mechanical APDL in Shared-Memory Parallelism (SMP) mode. This limits the run
to one single node, but uses many of its cores. In the example below, we use a full node with all of its cores
and the whole memory.

```bash
#!/bin/bash
#SBATCH --job-name=MAPDL         # Job name
#SBATCH --partition=hourly       # Using 'daily' will grant higher priority than 'general'
#SBATCH --time=0-01:00:00        # Time needed for running the job. Must match the 'partition' limits.
#SBATCH --nodes=1                # Number of nodes
#SBATCH --ntasks=1               # Number of tasks
#SBATCH --cpus-per-task=44       # Double if hyperthreading enabled
#SBATCH --hint=nomultithread     # Disable hyperthreading
#SBATCH --error=slurm-%j.err     # Define a file for standard error messages
#SBATCH --exclusive              # Exclusive usage of the node (recommended when using the whole memory)

module use unstable
module load ANSYS/2020R1-1

# [Optional:BEGIN] Specify your license server if this is not 'lic-ansys.psi.ch'
LICENSE_SERVER=<your_license_server>
export ANSYSLMD_LICENSE_FILE=1055@$LICENSE_SERVER
export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]

SOLVER_FILE=/data/user/caubet_m/MAPDL/mysolver.in
mapdl -b -np ${SLURM_CPUS_PER_TASK} -i "$SOLVER_FILE"
```

In the above example, one can reduce the number of **cpus per task**. Here, `--exclusive` is usually
recommended if one needs to use the whole memory.

For **SMP** runs, one might try hyperthreading mode by doubling the proper settings
(`--cpus-per-task`); in some cases this might be beneficial.
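
As a sketch, the hyperthreaded variant of the SMP header above would double `--cpus-per-task` and switch the hint (values assume the same 44-core node as in the example):

```bash
#SBATCH --cpus-per-task=88       # 44 physical cores, doubled for hyperthreading
#SBATCH --hint=multithread       # Enable hyperthreading
```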

Please notice that `--ntasks-per-core=1` is not defined here; this is because we want to run one
task on many cores! As an alternative, one can explore `--ntasks-per-socket` or `--ntasks-per-node`
for fine-grained configurations.

#### MPI-based example

This example enables Distributed ANSYS for running Mechanical APDL using a Slurm batch script.

```bash
#!/bin/bash
#SBATCH --job-name=MAPDL         # Job name
#SBATCH --partition=hourly       # Using 'daily' will grant higher priority than 'general'
#SBATCH --time=0-01:00:00        # Time needed for running the job. Must match the 'partition' limits.
#SBATCH --nodes=1                # Number of nodes
#SBATCH --ntasks=44              # Number of tasks
#SBATCH --cpus-per-task=1        # Double if hyperthreading enabled
#SBATCH --ntasks-per-core=1      # Run one task per core
#SBATCH --hint=nomultithread     # Disable hyperthreading
#SBATCH --error=slurm-%j.err     # Define a file for standard error messages
##SBATCH --exclusive             # Uncomment if you want exclusive usage of the nodes

module use unstable
module load ANSYS/2020R1-1

# [Optional:BEGIN] Specify your license server if this is not 'lic-ansys.psi.ch'
LICENSE_SERVER=<your_license_server>
export ANSYSLMD_LICENSE_FILE=1055@$LICENSE_SERVER
export ANSYSLI_SERVERS=2325@$LICENSE_SERVER
# [Optional:END]

SOLVER_FILE=input.dat

# INTELMPI=no  for IBM MPI
# INTELMPI=yes for Intel MPI
INTELMPI=no

if [ "$INTELMPI" == "yes" ]
then
    # When using -mpi=intelmpi, KMP affinity must be disabled
    export KMP_AFFINITY=disabled

    # Intel MPI is not aware of the task distribution,
    # so we need to define it explicitly ('host:ntasks' pairs).
    HOSTLIST=$(srun hostname | sort | uniq -c | awk '{print $2 ":" $1}' | tr '\n' ':' | sed 's/:$/\n/g')
    mapdl -b -dis -mpi intelmpi -machines $HOSTLIST -np ${SLURM_NTASKS} -i "$SOLVER_FILE"
else
    # IBM MPI (the default) is aware of the task distribution,
    # so in principle there is no need to force it.
    mapdl -b -dis -mpi ibmmpi -np ${SLURM_NTASKS} -i "$SOLVER_FILE"
fi
```

In the above example, one can increase the number of *nodes* and/or *ntasks* if needed, and combine this
with `--exclusive` when necessary. In general, **no hyperthreading** is recommended for MPI-based jobs.

pages/merlin6/05-Software-Support/impi.md (new file)

---
title: Intel MPI Support
#tags:
last_updated: 13 March 2020
keywords: software, impi, slurm
summary: "This document describes how to use Intel MPI in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/impi.html
---

## Introduction

This document describes which Intel MPI versions in PModules are supported in the Merlin6 cluster.

### srun

We strongly recommend the use of **'srun'** over **'mpirun'** or **'mpiexec'**. Using **'srun'** properly
binds tasks to cores and requires less customization, while **'mpirun'** and **'mpiexec'** might need more advanced
configuration and should only be used by advanced users. Please ***always*** adapt your scripts to use **'srun'**
before opening a support ticket. Also, please contact us about any problem you encounter when using a module.

{{site.data.alerts.tip}} Always run Intel MPI with the <b>srun</b> command. The only exception is for advanced users, however <b>srun</b> is still recommended.
{{site.data.alerts.end}}

When running with **srun**, one should tell Intel MPI to use the PMI libraries provided by Slurm. For PMI-1:

```bash
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so

srun ./app
```

Alternatively, one can use PMI-2, but then one needs to specify it as follows:

```bash
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi2.so
export I_MPI_PMI2=yes

srun ./app
```

For more information, please read the [Slurm Intel MPI Guide](https://slurm.schedmd.com/mpi_guide.html#intel_mpi).

**Note**: Please note that PMI-2 might not work properly with some Intel MPI versions. If so, you can either fall back
to PMI-1 or contact the Merlin administrators.
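
Putting it together, a minimal `sbatch` script for an Intel MPI application might look as follows (a sketch; the module to load, the resource values and the application `./app` are placeholders):

```bash
#!/bin/bash
#SBATCH --ntasks=44              # Number of MPI tasks
#SBATCH --partition=hourly
#SBATCH --time=0-01:00:00

# Load the module providing your Intel MPI based application here (placeholder)

# Tell Intel MPI to use the PMI-1 library provided by Slurm
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so

srun ./app
```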

pages/merlin6/05-Software-Support/openmpi.md (new file)

---
title: OpenMPI Support
#tags:
last_updated: 13 March 2020
keywords: software, openmpi, slurm
summary: "This document describes how to use OpenMPI in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/openmpi.html
---

## Introduction

This document describes which OpenMPI versions in PModules are supported in the Merlin6 cluster.

### srun

We strongly recommend the use of **'srun'** over **'mpirun'** or **'mpiexec'**. Using **'srun'** properly
binds tasks to cores and requires less customization, while **'mpirun'** and **'mpiexec'** might need more advanced
configuration and should only be used by advanced users. Please ***always*** adapt your scripts to use **'srun'**
before opening a support ticket. Also, please contact us about any problem you encounter when using a module.

Example:

```bash
srun ./app
```

{{site.data.alerts.tip}} Always run OpenMPI with the <b>srun</b> command. The only exception is for advanced users, however <b>srun</b> is still recommended.
{{site.data.alerts.end}}

### OpenMPI with UCX

**OpenMPI** supports **UCX** starting from version 3.0, but it is recommended to use version 4.0 or higher due to stability and performance improvements.
**UCX** should only be used by advanced users, as it requires running with **mpirun** (which needs advanced knowledge) and is an exception to running MPI
without **srun** (**UCX** is not integrated with **srun** at PSI).

For running UCX, one should:

* add the following options to **mpirun**:
  ```bash
  -mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1
  ```
* or, alternatively, add the following options **before mpirun**:
  ```bash
  export OMPI_MCA_pml="ucx"
  export OMPI_MCA_btl="^vader,tcp,openib,uct"
  export UCX_NET_DEVICES=mlx5_0:1
  ```

In addition, one can add the following options for debugging purposes (visit [UCX Logging](https://github.com/openucx/ucx/wiki/Logging) for possible `UCX_LOG_LEVEL` values):

```bash
-x UCX_LOG_LEVEL=<data|debug|warn|info|...> -x UCX_LOG_FILE=<filename>
```

These can also be set externally before the **mpirun** call (see the example below). Full example:

* Within the **mpirun** command:
  ```bash
  mpirun -np $SLURM_NTASKS -mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1 -x UCX_LOG_LEVEL=data -x UCX_LOG_FILE=UCX-$SLURM_JOB_ID.log ./app
  ```
* Outside the **mpirun** command:
  ```bash
  export OMPI_MCA_pml="ucx"
  export OMPI_MCA_btl="^vader,tcp,openib,uct"
  export UCX_NET_DEVICES=mlx5_0:1
  export UCX_LOG_LEVEL=data
  export UCX_LOG_FILE=UCX-$SLURM_JOB_ID.log

  mpirun -np $SLURM_NTASKS ./app
  ```

## Supported OpenMPI versions

For running OpenMPI properly in a Slurm batch system, ***OpenMPI and Slurm must be compiled accordingly***.

A large number of OpenMPI compilations can be found in the PModules central repositories. However, only
some of them are suitable for running in a Slurm cluster: ***any OpenMPI version with the suffix `_slurm`
is suitable for running in the Merlin6 cluster***. OpenMPI versions with the suffix `_merlin6` can also be used, but these will be fully
replaced by the `_slurm` series in the future (so that they can be used on any Slurm cluster at PSI). Please ***avoid using any other OpenMPI releases***.

{{site.data.alerts.tip}} Suitable <b>OpenMPI</b> versions for running in the Merlin6 cluster:
<p> - <span class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false"><b>openmpi/<version>_slurm</b></span> <b>[<u>Recommended</u>]</b></p>
<p> - <span class="terminal code highlight js-syntax-highlight plaintext" lang="plaintext" markdown="false">openmpi/<version>_merlin6</span></p>
{{site.data.alerts.end}}
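
For example, to list the available OpenMPI modules and load a suitable one (the version below is a placeholder; pick a release carrying the `_slurm` suffix):

```bash
module avail openmpi
module load openmpi/4.0.5_slurm   # placeholder version; use any '_slurm'-suffixed release
```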

#### 'unstable' repository

New OpenMPI versions that need to be tested are compiled first in the **``unstable``** repository and, once validated, are moved to **``stable``**.
We cannot ensure that modules in that repository are production-ready, but you can use them *at your own risk*.

For using *unstable* modules, you might need to load the **``unstable``** PModules repository as follows:

```bash
module use unstable
```

#### 'stable' repository

Officially supported OpenMPI versions (https://www.open-mpi.org/) are available in the **``stable``** repository (which is the *default* loaded repository).
For further information, please check the [https://www.open-mpi.org/software/ompi/ -> Current & Still Supported](https://www.open-mpi.org/software/ompi/)
versions.

Usually, no more than 2 minor update releases will be present in the **``stable``** repository. Older minor update releases will be moved to **``deprecated``**,
even when they are still officially supported. This ensures that users compile new software with the latest stable versions, while the old versions remain
available for software which was compiled with them.

#### 'deprecated' repository

Old OpenMPI versions (that is, any official OpenMPI version which has been moved to **retired** or **ancient**) will be
moved to the ***'deprecated'*** PModules repository.
For further information, please check the [https://www.open-mpi.org/software/ompi/ -> Older Versions](https://www.open-mpi.org/software/ompi/)
versions.

Also, as mentioned [before](/merlin6/openmpi.html#stable-repository), older officially supported OpenMPI releases (minor updates) will be moved to ``deprecated``.

For using *deprecated* modules, you might need to load the **``deprecated``** PModules repository as follows:

```bash
module use deprecated
```

However, this is usually not needed: when directly loading a specific version that is not found in
``stable``, PModules will search for it and fall back to the other repositories (``deprecated`` or ``unstable``).

#### About missing versions

##### Missing OpenMPI versions

For legacy software, some users might require a different OpenMPI version. **We always encourage** users to try one of the existing stable versions
(*OpenMPI always with the suffix ``_slurm`` or ``_merlin6``!*), as they contain the latest bug fixes and usually should work. In the worst case, you
can also try the ones in the deprecated repository (again, *OpenMPI always with the suffix ``_slurm`` or ``_merlin6``!*). For very old software which
was based on OpenMPI v1, you can follow the guide [FAQ: Removed MPI constructs](https://www.open-mpi.org/faq/?category=mpi-removed), which provides
some easy steps for migrating from OpenMPI v1 to v2 or later, and is also useful for finding out why your code does not compile properly.

If, after trying the mentioned versions and guide, you are still facing problems, please contact us. Also, please contact us if you require a newer
version with a different ``gcc`` or ``intel`` compiler (for example, Intel v19).

pages/merlin6/05-Software-Support/paraview.md (new file)

---
title: Running Paraview
#tags:
last_updated: 03 December 2020
keywords: software, paraview, mesa, OpenGL
summary: "This document describes how to run ParaView in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/paraview.html
---

## Requirements

**[NoMachine](/merlin6/nomachine.html)** is the official, **strongly recommended and supported** tool for running *ParaView*.
Consider that running over SSH (X11 forwarding needed) is very slow, and the configuration might not work, as it also depends
on the client configuration (Linux workstation/laptop, Windows with XMing, etc.). Hence, please **avoid running ParaView over SSH**.
The only exception for running over SSH is when running it as a job from a NoMachine client.

## ParaView

### PModules

Using the latest ParaView version available in PModules is strongly recommended. For example, for loading **paraview**:

```bash
module use unstable
module load paraview/5.8.1
```

### Running ParaView

ParaView can be run with **VirtualGL** to take advantage of the GPU card located on each login node. For that, once the module is loaded, you can start **paraview** as follows:

```bash
vglrun paraview
```

Alternatively, one can run **paraview** with *Mesa* support using the command below. This can be useful when running on CPU computing nodes (with `srun` / `salloc`),
which have no graphics card (and where `vglrun` is not possible):

```bash
paraview-mesa paraview
```

#### Running older versions of ParaView

Older versions of ParaView available in PModules (i.e. *paraview/5.0.1* and *paraview/5.4.1*) might require a different command
for running ParaView with **Mesa** support. The command is the following:

```bash
# Warning: only for ParaView 5.4.1 and older
paraview --mesa
```

#### Running ParaView interactively in the batch system

One can run ParaView interactively in the CPU cluster as follows:

```bash
# First, load the module. For example: "module load paraview/5.8.1"
srun --pty --x11 --partition=general --ntasks=1 paraview-mesa paraview
```

One can change the partition, the number of tasks, or specify extra parameters to `srun` if needed.

pages/merlin6/05-Software-Support/python.md (new file)

---
title: Python
#tags:
last_updated: 28 September 2020
keywords: [python, anaconda, conda, jupyter, numpy]
summary: Running Python on Merlin
sidebar: merlin6_sidebar
permalink: /merlin6/python.html
---

PSI provides a variety of ways to execute Python code:

1. **psi-python modules** - Central installation with common packages pre-installed
2. **Anaconda** - Custom environments for installation and development
3. **Jupyterhub** - Execute Jupyter notebooks on the cluster
4. **System Python** - Do not use! Only for OS applications.

## `psi-python` modules

The easiest way to use Python is via the centrally maintained psi-python modules:

```
~ $ module avail psi-python

------------------------------------- Programming: ------------------------------

psi-python27/2.3.0    psi-python27/2.2.0    psi-python27/2.4.1
psi-python27/4.4.0    psi-python34/2.1.0    psi-python35/4.2.0
psi-python36/4.4.0

~ $ module load psi-python36/4.4.0
~ $ python --version
Python 3.6.1 :: Anaconda 4.4.0 (64-bit)
```

These include over 250 common packages from the
[Anaconda](https://docs.anaconda.com/anaconda/) software distribution, including
numpy, pandas, requests, flask, hdf5, and more.

{% include callout.html type="warning" content="
**Caution**{: .text-warning}
Do not use `module load python`. These modules are minimal installs intended as
dependencies for other modules that embed python.
"%}

## Anaconda

[Anaconda](https://www.anaconda.com/) ("conda" for short) is a package manager with
excellent Python integration. Using it you can create isolated environments for each
of your Python applications, containing exactly the dependencies needed for that app.
It is similar to the [virtualenv](http://virtualenv.readthedocs.org/) Python package,
but can also manage non-Python requirements.

### Loading conda

Conda is loaded from the module system:

```
module load anaconda
```

### Using pre-made environments

Loading the module provides the `conda` command, but does not otherwise change your
environment. First an environment needs to be activated. Available environments can
be seen with `conda info --envs` and include many specialized environments for
software installs. After activating you should see the environment name in your
prompt:

```
~ $ conda activate datascience_py37
(datascience_py37) ~ $
```

### CondaRC file

Creating a `~/.condarc` file is recommended if you want to create new environments on
Merlin. Environments can grow quite large, so you will need to change the storage
location from the default (your home directory) to a larger volume (usually
`/data/user/$USER`).

Save the following as `$HOME/.condarc` (update USERNAME and the module version as
necessary):

```
always_copy: true

envs_dirs:
  - /data/user/USERNAME/conda/envs

pkgs_dirs:
  - /data/user/USERNAME/conda/pkgs
  - /opt/psi/Programming/anaconda/2019.07/conda/pkgs

channels:
  - http://conda-pkg.intranet.psi.ch
  - conda-forge
  - defaults
```

Run `conda info` to check that the variables are being set correctly.

### Creating environments

We will create an environment named `myenv1` which uses an older version of numpy, e.g. to test for backwards compatibility of our code (the `-q` and `--yes` switches are just for not getting prompted and for disabling the progress bar). The environment will be created in the default location as defined by the `.condarc` configuration file (see above).

```
~ $ conda create -q --yes -n 'myenv1' numpy=1.8 scipy ipython

Fetching package metadata: ...
Solving package specifications: .
Package plan for installation in environment /gpfs/home/feichtinger/conda-envs/myenv1:

The following NEW packages will be INSTALLED:

    ipython:    2.3.0-py27_0
    numpy:      1.8.2-py27_0
    openssl:    1.0.1h-1
    pip:        1.5.6-py27_0
    python:     2.7.8-1
    readline:   6.2-2
    scipy:      0.14.0-np18py27_0
    setuptools: 5.8-py27_0
    sqlite:     3.8.4.1-0
    system:     5.8-1
    tk:         8.5.15-0
    zlib:       1.2.7-0

To activate this environment, use:
 $ source activate myenv1

To deactivate this environment, use:
 $ source deactivate
```

The created environment contains **just the packages that are needed to satisfy the
requirements**, and it is local to your installation. The Python installation is even
independent of the central installation, i.e. your code will still work in such an
environment even if you are offline or AFS is down. However, you need the central
installation if you want to use the `conda` command itself.

Packages for your new environment will either be copied from the central installation into
your new environment or, if there are newer packages available from Anaconda and you
did not specify exactly the version from our central installation, they may get
downloaded from the web. **This will require significant space in the `envs_dirs`
that you defined in `.condarc`.** If you create other environments on the same local
disk, they will share the packages using hard links.

We can switch to the newly created environment with the `conda activate` command.

```
$ conda activate myenv1
```

{% include callout.html type="info" content="Note that anaconda's activate/deactivate
scripts are compatible with the bash and zsh shells but not with [t]csh." %}

Let's test whether we indeed got the desired numpy version (note the Python 2 `print` syntax, matching the py27 environment created above):

```
$ python -c 'import numpy as np; print np.version.version'

1.8.2
```

You can install additional packages into the active environment using the `conda
install` command:

```
$ conda install --yes -q bottle

Fetching package metadata: ...
Solving package specifications: .
Package plan for installation in environment /gpfs/home/feichtinger/conda-envs/myenv1:

The following NEW packages will be INSTALLED:

    bottle: 0.12.5-py27_0
```

## Jupyterhub

Jupyterhub is a service for running code notebooks on the cluster, particularly in
Python. It is a powerful tool for data analysis and prototyping. For more information
see the [Jupyterhub documentation]({{"jupyterhub.html"}}).

## Pythons to avoid

Avoid using the system Python (`/usr/bin/python`). It is intended for OS software and
may not be up to date.

Also avoid the 'python' module (`module load python`). This is a minimal install of
Python intended for embedding in other modules.