diff --git a/pages/merlin6/05 Software Support/openmpi.md b/pages/merlin6/05 Software Support/openmpi.md
index 741df73..b3ad8b6 100644
--- a/pages/merlin6/05 Software Support/openmpi.md
+++ b/pages/merlin6/05 Software Support/openmpi.md
@@ -19,18 +19,39 @@ bind tasks in to cores and less customization is needed, while **'mpirun'** and
 configuration and should be only used by advanced users. Please, ***always*** adapt your scripts for using **'srun'** before opening a support ticket. Also, please contact us on any problem when using a module.
-{{site.data.alerts.tip}} Always run OpenMPI with the srun command.
+{{site.data.alerts.tip}} Always run OpenMPI with the srun command. The only exception is for advanced users.
 {{site.data.alerts.end}}
-### PModules
+### OpenMPI with UCX
+
+**OpenMPI** supports **UCX** starting from version 3.0, but it’s recommended to use version 4.0 or higher due to stability and performance improvements.
+**UCX** should be used only by advanced users, as it requires to run it with **mpirun** (needs advanced knowledge) and is an exception for running MPI
+without **srun** (**UCX** is not integrated at PSI within **srun**).
+
+For running UCX, one should add the following options to **mpirun**:
+
+```bash
+mpirun --np $SLURM_NTASKS -mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1 ./app
+```
+
+Alternatively, one can add the following options for debugging purposes (visit [UCX Logging](https://github.com/openucx/ucx/wiki/Logging) for possible `UCX_LOG_LEVEL` values):
+
+```bash
+-x UCX_LOG_LEVEL= -x UCX_LOG_FILE=
+```
+
+## Supported OpenMPI versions
+
+For running OpenMPI properly in a Slurm batch system, ***OpenMPI and Slurm must be compiled accordingly***. We can find a large number of compilations of OpenMPI modules in the PModules central repositories. However, only
-some of them are suitable for running in a Slurm cluster: ***any OpenMPI versions with suffixes ``_slurm`` or ``_merlin6``
-are suitable for running in the Merlin6 cluster***. Please, ***avoid using any other OpenMPI releases***.
+some of them are suitable for running in a Slurm cluster: ***any OpenMPI versions with suffixes `_slurm`
+are suitable for running in the Merlin6 cluster***. Also, OpenMPI with suffix `_merlin6` can be used, but these will be fully
+replaced by the `_slurm` series in the future (so it can be used on any Slurm cluster at PSI). Please, ***avoid using any other OpenMPI releases***.
 {{site.data.alerts.tip}} Suitable OpenMPI versions for running in the Merlin6 cluster:

-
- openmpi/<version>_slurm
+ openmpi/<version>_slurm  [Recommended]

            -