diff --git a/docs/merlin7/05-Software-Support/openmpi.md b/docs/merlin7/05-Software-Support/openmpi.md
index 5ef5cefd..28cbd3a0 100644
--- a/docs/merlin7/05-Software-Support/openmpi.md
+++ b/docs/merlin7/05-Software-Support/openmpi.md
@@ -31,7 +31,7 @@ Administrators recommend caution when relying on `unstable` versions for critica
 
 At present, we recommend using OpenMPI **5.0.8 or newer** on Merlin7, compiled with **libfabric/2.2.0-oss** or superior.
 
-### Using srun in Merlin7
+### Running OpenMPI on Merlin7
 
 In OpenMPI versions prior to 5.0.x, using `srun` for direct task launches was faster than `mpirun`.
 Although this is no longer the case, `srun` remains the recommended method due to its simplicity and ease of use.
@@ -40,19 +40,43 @@ Key benefits of `srun`:
 * Automatically handles task binding to cores.
 * In general, requires less configuration compared to `mpirun`.
 * Best suited for most users, while `mpirun` is recommended only for advanced MPI configurations.
+* However, `mpirun` has a much faster MPI initialization, which might be suitable for short runs.
 
-Guidelines:
-* Always adapt your scripts to use srun before seeking support.
-* For any module-related issues, please contact the Merlin7 administrators.
+For any module-related issues, please contact the Merlin7 administrators.
 
 Example Usage:
 ```bash
 srun ./app
+mpirun ./app
 ```
 
 !!! tip
     Always run OpenMPI applications with `srun` for a seamless experience.
 
+#### CXI vs LinkX provider
+
+Open MPI on Merlin7 is built with **libfabric** support. By default, libfabric uses the **CXI** (Cassini) provider
+on Slingshot-based networks, which is the standard behavior on Merlin7. However, the default CXI provider does not
+always deliver the best performance for applications with significant inter-node communication.
+
+To address this, a new provider plugin specifically designed for Open MPI, **LINKx**, has been built on Merlin7 using
+libfabric `-oss` releases. This provider is still considered **preview** and is therefore **not enabled by default**,
+but is **strongly recommended**.
+
+In addition, some environment variables must be configured explicitly, in particular `FI_LNX_PROV_LINKS`, whose value
+depends on the node type and on the resources assigned to the job. For this reason, if you want to use the LINKx
+provider, you should set:
+
+```bash
+# CPU nodes
+export FI_PROVIDER="lnx"
+export FI_LNX_PROV_LINKS="shm+cxi:cxi0"
+
+# GPU nodes
+export FI_PROVIDER="lnx"
+export FI_LNX_PROV_LINKS="shm+cxi:cxi0|shm+cxi:cxi1|shm+cxi:cxi2|shm+cxi:cxi3"
+```
+
 ### PMIx Support in Merlin7
 
 Merlin7's SLURM installation includes support for multiple PMI types, including pmix. To view the available options, use the following command:
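
Note on the LINKx settings added in this diff: because `FI_LNX_PROV_LINKS` differs between CPU and GPU nodes, a job script might pick the value automatically rather than hard-coding it. The following is only a minimal sketch under stated assumptions, not part of the Merlin7 documentation. The helper name `lnx_links_for` is hypothetical, and treating a non-zero `SLURM_GPUS_ON_NODE` as "this is a GPU node" is an assumption that may not cover every allocation. The two link strings are copied verbatim from the diff.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the Merlin7 docs): print the FI_LNX_PROV_LINKS
# value given in the diff for a node type, either "cpu" or "gpu".
lnx_links_for() {
    case "$1" in
        cpu) echo "shm+cxi:cxi0" ;;
        gpu) echo "shm+cxi:cxi0|shm+cxi:cxi1|shm+cxi:cxi2|shm+cxi:cxi3" ;;
        *)   echo "unknown node type: $1" >&2; return 1 ;;
    esac
}

# Assumption: Slurm sets SLURM_GPUS_ON_NODE to a positive number only when
# GPUs were allocated; an unset or non-numeric value is treated as a CPU node.
if [ "${SLURM_GPUS_ON_NODE:-0}" -gt 0 ] 2>/dev/null; then
    node_type="gpu"
else
    node_type="cpu"
fi

# Select the LINKx provider and the matching link list.
export FI_PROVIDER="lnx"
FI_LNX_PROV_LINKS="$(lnx_links_for "$node_type")"
export FI_LNX_PROV_LINKS

echo "node_type=$node_type FI_PROVIDER=$FI_PROVIDER FI_LNX_PROV_LINKS=$FI_LNX_PROV_LINKS"
```

Placed in a batch script before the `srun ./app` line, this would export the same variables the diff lists by hand. Since the diff states that the value also depends on the resources assigned to the job, the output should be checked against the actual allocation before relying on it.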