@@ -31,7 +31,7 @@ Administrators recommend caution when relying on `unstable` versions for critica
At present, we recommend using OpenMPI **5.0.8 or newer** on Merlin7, compiled with
**libfabric/2.2.0-oss** or later.

### Using srun in Merlin7
### Running OpenMPI on Merlin7

In OpenMPI versions prior to 5.0.x, using `srun` for direct task launches was faster than `mpirun`.
Although this is no longer the case, `srun` remains the recommended method due to its simplicity and ease of use.
@@ -40,19 +40,43 @@ Key benefits of `srun`:
* Automatically handles task binding to cores.
* In general, requires less configuration than `mpirun`.
* Best suited for most users, while `mpirun` is recommended only for advanced MPI configurations.
* However, `mpirun` initializes MPI much faster, which may make it preferable for short runs.

Guidelines:
* Always adapt your scripts to use `srun` before seeking support.
* For any module-related issues, please contact the Merlin7 administrators.
For any module-related issues, please contact the Merlin7 administrators.

Example Usage:
```bash
srun ./app
mpirun ./app
```

!!! tip
    Always run OpenMPI applications with `srun` for a seamless experience.

#### CXI vs LINKx provider

Open MPI on Merlin7 is built with **libfabric** support. By default, libfabric uses the **CXI** (Cassini) provider
on Slingshot-based networks, which is the standard behavior on Merlin7. However, the default CXI provider does not
always deliver the best performance for applications with significant inter-node communication.

To address this, a new provider plugin specifically designed for Open MPI, **LINKx**, has been built on Merlin7 using
libfabric `-oss` releases. This provider is still considered a **preview** and is therefore **not enabled by default**,
but its use is **strongly recommended**.

To use it, some environment variables must be configured explicitly, in particular `FI_LNX_PROV_LINKS`, whose value
depends on the node type and on the resources assigned to the job. For this reason, if you want to use the LINKx
provider, you should set:

```bash
# CPU nodes
export FI_PROVIDER="lnx"
export FI_LNX_PROV_LINKS="shm+cxi:cxi0"

# GPU nodes
export FI_PROVIDER="lnx"
export FI_LNX_PROV_LINKS="shm+cxi:cxi0|shm+cxi:cxi1|shm+cxi:cxi2|shm+cxi:cxi3"
```

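The CPU and GPU values above differ only in how many CXI interfaces they list. As a minimal sketch, assuming one `shm+cxi:cxiN` link per interface joined by `|` (as in the values above), a job script could build the string with a small helper. The function name `lnx_links` and the variable `NUM_CXI` are hypothetical and are not part of libfabric or of the Merlin7 setup.

```bash
# Hypothetical helper: print the FI_LNX_PROV_LINKS value for a node
# that exposes N CXI interfaces (cxi0 .. cxi<N-1>).
lnx_links() {
    n="$1"
    links=""
    i=0
    while [ "$i" -lt "$n" ]; do
        # Prefix "<previous links>|" only when links is already non-empty.
        links="${links:+$links|}shm+cxi:cxi$i"
        i=$((i + 1))
    done
    printf '%s\n' "$links"
}

export FI_PROVIDER="lnx"
# Assumption: NUM_CXI is set by the user (1 on CPU nodes, 4 on GPU nodes,
# matching the values listed above); it defaults to 1 when unset.
export FI_LNX_PROV_LINKS="$(lnx_links "${NUM_CXI:-1}")"
```

With `NUM_CXI=4` this yields the GPU-node value shown above. Whether a job should list every interface depends on the resources assigned to it, so treat the count as something to check per job.
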
### PMIx Support in Merlin7

Merlin7's SLURM installation includes support for multiple PMI types, including pmix. To view the available options, use the following command: