Changes in OpenMPI and added Intel MPI (impi)

pages/merlin6/05 Software Support/impi.md (new file)

@@ -0,0 +1,41 @@
---
title: Intel MPI Support
#tags:
last_updated: 13 March 2020
keywords: software, impi, slurm
summary: "This document describes how to use Intel MPI in the Merlin6 cluster"
sidebar: merlin6_sidebar
permalink: /merlin6/impi.html
---
## Introduction

This document describes which Intel MPI versions in PModules are supported on the Merlin6 cluster.

### srun

We strongly recommend using **srun** over **mpirun** or **mpiexec**. **srun** properly binds tasks
to cores and needs little extra customization, while **mpirun** and **mpiexec** may require more advanced
configuration and should be used only by advanced users. Please ***always*** adapt your scripts to use **srun**
before opening a support ticket. Also, please contact us if you run into any problem when using a module.

{{site.data.alerts.tip}} Always run Intel MPI with the <b>srun</b> command. The only exception is for advanced users.
{{site.data.alerts.end}}

When running with **srun**, one should tell Intel MPI to use the PMI libraries provided by Slurm. For PMI-1:

```bash
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so
```

Alternatively, one can use PMI-2, but then one needs to specify it as follows:

```bash
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi2.so
export I_MPI_PMI2=yes
```
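Depending on how Slurm is configured at a site, the PMI-2 plugin may also need to be selected on the
**srun** side. A minimal sketch (`./myapp` is a placeholder binary; `--mpi` is the standard Slurm option
described in the guide linked below):

```bash
# Select Slurm's PMI-2 plugin explicitly for this job step;
# 'srun --mpi=list' shows the plugins supported by the installation.
srun --mpi=pmi2 ./myapp
```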
For more information, please read the [Slurm Intel MPI Guide](https://slurm.schedmd.com/mpi_guide.html#intel_mpi).

**Note**: PMI-2 might not work properly with some Intel MPI versions. If so, you can either fall back
to PMI-1 or contact the Merlin administrators.
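Putting the pieces together, a minimal batch script might look like the following sketch (the module
name and the application binary are placeholders, not taken from this page; check `module avail` for
the Intel MPI versions actually installed):

```bash
#!/bin/bash
#SBATCH --job-name=impi-test   # job name shown in the queue
#SBATCH --ntasks=8             # number of MPI ranks

# Load an Intel MPI module from PModules (hypothetical name/version)
module load impi

# Point Intel MPI at the PMI-1 library provided by Slurm
export I_MPI_PMI_LIBRARY=/usr/lib64/libpmi.so

# Launch the application with srun, as recommended above
srun ./myapp
```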
@@ -31,7 +31,7 @@ without **srun** (**UCX** is not integrated at PSI within **srun**).

For running UCX, one should add the following options to **mpirun**:

```bash
- mpirun --np $SLURM_NTASKS -mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1 ./app
+ -mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1
```

Alternatively, one can add the following options for debugging purposes (visit [UCX Logging](https://github.com/openucx/ucx/wiki/Logging) for possible `UCX_LOG_LEVEL` values):
@@ -40,6 +40,12 @@ Alternatively, one can add the following options for debugging purposes (visit [

```bash
-x UCX_LOG_LEVEL=<data|debug|warn|info|...> -x UCX_LOG_FILE=<filename>
```

Full example:

```bash
mpirun -np $SLURM_NTASKS -mca pml ucx --mca btl ^vader,tcp,openib,uct -x UCX_NET_DEVICES=mlx5_0:1 -x UCX_LOG_LEVEL=data -x UCX_LOG_FILE=UCX-$SLURM_JOB_ID.log ./app
```

## Supported OpenMPI versions

For running OpenMPI properly in a Slurm batch system, ***OpenMPI and Slurm must be compiled accordingly***.
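As a quick sanity check (a sketch, not part of the original page), `ompi_info` reports whether the
currently loaded OpenMPI build was compiled with Slurm support:

```bash
# A Slurm-aware OpenMPI lists 'slurm' components (e.g. MCA plm/ras);
# no output suggests the build was not compiled against Slurm.
ompi_info | grep -i slurm
```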