Changes in OpenMPI and added iMPI

This commit is contained in:
caubet_m 2020-05-22 17:59:46 +02:00
parent 05883f5735
commit 1e44aefd9d
2 changed files with 48 additions and 1 deletions

View File

@ -0,0 +1,41 @@
---
title: Intel MPI Support
#tags:
last_updated: 13 March 2020
keywords: software, impi, slurm
summary: "This document describes how to use Intel MPI in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/impi.html
---
## Introduction
This document describes which set of Intel MPI versions in PModules are supported in the Merlin6 cluster.
### srun
We strongly recommend the use of **'srun'** over **'mpirun'** or **'mpiexec'**. Using **'srun'** would properly
bind tasks in to cores and less customization is needed, while **'mpirun'** and '**mpiexec**' might need more advanced
configuration and should be only used by advanced users. Please, ***always*** adapt your scripts for using **'srun'**
before opening a support ticket. Also, please contact us on any problem when using a module.
{{site.data.alerts.tip}} Always run Intel MPI with the <b>srun</b> command. The only exception is for advanced users.
{{site.data.alerts.end}}
When running with **srun**, one should tell Intel MPI to use the PMI libraries provided by Slurm. For PMI-1:
```bash
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
```
Alternatively, one can use PMI-2, but then one needs to specify it as follows:
```bash
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi2.so
export I_MPI_PMI2=yes
```
For more information, please read [Slurm Intel MPI Guide](https://slurm.schedmd.com/mpi_guide.html#intel_mpi)
**Note**: Please note that PMI2 might not work properly in some Intel MPI versions. If so, you can either fallback
to PMI-1 or to contact the Merlin administrators.

View File

@ -31,7 +31,7 @@ without **srun** (**UCX** is not integrated at PSI within **srun**).
For running UCX, one should add the following options to **mpirun**:
```bash
mpirun --np $SLURM_NTASKS -mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1 ./app
-mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1
```
Alternatively, one can add the following options for debugging purposes (visit [UCX Logging](https://github.com/openucx/ucx/wiki/Logging) for possible `UCX_LOG_LEVEL` values):
@ -40,6 +40,12 @@ Alternatively, one can add the following options for debugging purposes (visit [
-x UCX_LOG_LEVEL=<data|debug|warn|info|...> -x UCX_LOG_FILE=<filename>
```
Full example:
```bash
mpirun -np $SLURM_NTASKS -mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1 -x UCX_LOG_LEVEL=data -x UCX_LOG_FILE=UCX-$SLURM_JOB_ID.log
```bash
## Supported OpenMPI versions
For running OpenMPI properly in a Slurm batch system, ***OpenMPI and Slurm must be compiled accordingly***.