merge and move support pages
this are now under the /support path, meaning that this is unified for all clusters.
This commit is contained in:
@@ -1,41 +0,0 @@
|
||||
# Contact
|
||||
|
||||
## Support
|
||||
|
||||
Basic contact information can be also found when logging into the Merlin Login Nodes through the *Message of the Day*.
|
||||
|
||||
Support can be asked through:
|
||||
* [PSI Service Now](https://psi.service-now.com/psisp)
|
||||
* E-Mail: <merlin-admins@lists.psi.ch>
|
||||
|
||||
### PSI Service Now
|
||||
|
||||
**[PSI Service Now](https://psi.service-now.com/psisp)**: is the official tool for opening incident requests.
|
||||
* PSI HelpDesk will redirect the incident to the corresponding department, or
|
||||
* you can always assign it directly by checking the box `I know which service is affected` and providing the service name `Local HPC Resources (e.g. Merlin) [CF]` (just type in `Local` and you should get the valid completions).
|
||||
|
||||
### Contact Merlin6 Administrators
|
||||
|
||||
**E-Mail <merlin-admins@lists.psi.ch>**
|
||||
* This is the official way to contact Merlin6 Administrators for discussions which do not fit well into the incident category.
|
||||
Do not hesitate to contact us for such cases.
|
||||
|
||||
---
|
||||
|
||||
## Get updated through the Merlin User list!
|
||||
|
||||
Is strongly recommended that users subscribe to the Merlin Users mailing list: **<merlin-users@lists.psi.ch>**
|
||||
|
||||
This mailing list is the official channel used by Merlin6 administrators to inform users about downtimes,
|
||||
interventions or problems. Users can be subscribed in two ways:
|
||||
|
||||
* *(Preferred way)* Self-registration through **[Sympa](https://psilists.ethz.ch/sympa/info/merlin-users)**
|
||||
* If you need to subscribe many people (e.g. your whole group) by sending a request to the admin list **<merlin-admins@lists.psi.ch>**
|
||||
and providing a list of email addresses.
|
||||
|
||||
---
|
||||
|
||||
## The Merlin Cluster Team
|
||||
|
||||
The PSI Merlin clusters are managed by the **[High Performance Computing and Emerging technologies Group](https://www.psi.ch/de/lsm/hpce-group)**, which
|
||||
is part of the [Science IT Infrastructure, and Services department (AWI)](https://www.psi.ch/en/awi) in PSI's [Center for Scientific Computing, Theory and Data (SCD)](https://www.psi.ch/en/csd).
|
||||
@@ -1,42 +0,0 @@
|
||||
# FAQ
|
||||
|
||||
## How do I register for Merlin?
|
||||
|
||||
See [Requesting Merlin Access](../quick-start-guide/requesting-accounts.md).
|
||||
|
||||
## How do I get information about downtimes and updates?
|
||||
|
||||
See [Get updated through the Merlin User list!](contact.md#get-updated-through-the-merlin-user-list)
|
||||
|
||||
## How can I request access to a Merlin project directory?
|
||||
|
||||
Merlin projects are placed in the `/data/project` directory. Access to each project is controlled by Unix group membership.
|
||||
If you require access to an existing project, please request group membership as described in [Requesting Unix Group Membership](../quick-start-guide/requesting-projects.md#requesting-unix-group-membership).
|
||||
|
||||
Your project leader or project colleagues will know what Unix group you should belong to. Otherwise, you can check what Unix group is allowed to access that project directory (simply run `ls -ltrhd` for the project directory).
|
||||
|
||||
## Can I install software myself?
|
||||
|
||||
Most software can be installed in user directories without any special permissions. We recommend using `/data/user/$USER/bin` for software since home directories are fairly small. For software that will be used by multiple groups/users you can also [request the admins](contact.md) install it as a [module](../how-to-use-merlin/using-modules.md).
|
||||
|
||||
How to install depends a bit on the software itself. There are three common installation procedures:
|
||||
|
||||
1. *binary distributions*. These are easy; just put them in a directory (eg `/data/user/$USER/bin`) and add that to your PATH.
|
||||
2. *source compilation* using make/cmake/autoconfig/etc. Usually the compilation scripts accept a `--prefix=/data/user/$USER` directory for where to install it. Then they place files under `<prefix>/bin`, `<prefix>/lib`, etc. The exact syntax should be documented in the installation instructions.
|
||||
3. *conda environment*. This is now becoming standard for python-based software, including lots of the AI tools. First follow the [initial setup instructions](../software-support/python.md#anaconda) to configure conda to use /data/user instead of your home directory. Then you can create environments like:
|
||||
|
||||
```bash
|
||||
module load anaconda/2019.07
|
||||
# if they provide environment.yml
|
||||
conda env create -f environment.yml
|
||||
|
||||
# or to create manually
|
||||
conda create --name myenv python==3.9 ...
|
||||
|
||||
conda activate myenv
|
||||
```
|
||||
|
||||
## Something doesn't work
|
||||
|
||||
Check the list of [known problems](known-problems.md) to see if a solution is known.
|
||||
If not, please [contact the admins](contact.md).
|
||||
@@ -1,170 +0,0 @@
|
||||
# Known Problems
|
||||
|
||||
## Common errors
|
||||
|
||||
### Illegal instruction error
|
||||
|
||||
It may happened that your code, compiled on one machine will not be executed on another throwing exception like **"(Illegal instruction)"**.
|
||||
This is usually because the software was compiled with a set of instructions newer than the ones available in the node where the software runs,
|
||||
and it mostly depends on the processor generation.
|
||||
|
||||
In example, `merlin-l-001` and `merlin-l-002` contain a newer generation of processors than the old GPUs nodes, or than the Merlin5 cluster.
|
||||
Hence, unless one compiles the software with compatibility with set of instructions from older processors, it will not run on old nodes.
|
||||
Sometimes, this is properly set by default at the compilation time, but sometimes is not.
|
||||
|
||||
For GCC, please refer to [GCC x86 Options](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html) for compiling options. In case of doubts, contact us.
|
||||
|
||||
## Slurm
|
||||
|
||||
### sbatch using one core despite setting -c/--cpus-per-task
|
||||
|
||||
From **Slurm v22.05.6**, the behavior of `srun` has changed. Merlin has been updated to this version since *Tuesday 13.12.2022*.
|
||||
|
||||
`srun` will no longer read in `SLURM_CPUS_PER_TASK`, which is typically set when defining `-c/--cpus-per-task` in the `sbatch` command.
|
||||
This means you will implicitly have to specify `-c\--cpus-per-task` also on your `srun` calls, or set the new `SRUN_CPUS_PER_TASK` environment variable to accomplish the same thing.
|
||||
Therefore, unless this is implicitly specified, `srun` will use only one Core per task (resulting in 2 CPUs per task when multithreading is enabled)
|
||||
|
||||
An example for setting up `srun` with `-c\--cpus-per-task`:
|
||||
```bash
|
||||
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# cat mysbatch_method1
|
||||
#!/bin/bash
|
||||
#SBATCH -n 1
|
||||
#SBATCH --cpus-per-task=8
|
||||
|
||||
echo 'From Slurm v22.05.8 srun does not inherit $SLURM_CPUS_PER_TASK'
|
||||
srun python -c "import os; print(os.sched_getaffinity(0))"
|
||||
|
||||
echo 'One has to implicitly specify $SLURM_CPUS_PER_TASK'
|
||||
echo 'In this example, by setting -c/--cpus-per-task in srun'
|
||||
srun --cpus-per-task=$SLURM_CPUS_PER_TASK python -c "import os; print(os.sched_getaffinity(0))"
|
||||
|
||||
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# sbatch mysbatch_method1
|
||||
Submitted batch job 8000813
|
||||
|
||||
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# cat slurm-8000813.out
|
||||
From Slurm v22.05.8 srun does not inherit $SLURM_CPUS_PER_TASK
|
||||
{1, 45}
|
||||
One has to implicitly specify $SLURM_CPUS_PER_TASK
|
||||
In this example, by setting -c/--cpus-per-task in srun
|
||||
{1, 2, 3, 4, 45, 46, 47, 48}
|
||||
```
|
||||
|
||||
An example to accomplish the same thing with the `SRUN_CPUS_PER_TASK` environment variable:
|
||||
```bash
|
||||
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# cat mysbatch_method2
|
||||
#!/bin/bash
|
||||
#SBATCH -n 1
|
||||
#SBATCH --cpus-per-task=8
|
||||
|
||||
echo 'From Slurm v22.05.8 srun does not inherit $SLURM_CPUS_PER_TASK'
|
||||
srun python -c "import os; print(os.sched_getaffinity(0))"
|
||||
|
||||
echo 'One has to implicitly specify $SLURM_CPUS_PER_TASK'
|
||||
echo 'In this example, by setting an environment variable SRUN_CPUS_PER_TASK'
|
||||
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK
|
||||
srun python -c "import os; print(os.sched_getaffinity(0))"
|
||||
|
||||
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# sbatch mysbatch_method2
|
||||
Submitted batch job 8000815
|
||||
|
||||
(base) ❄ [caubet_m@merlin-l-001:/data/user/caubet_m]# cat slurm-8000815.out
|
||||
From Slurm v22.05.8 srun does not inherit $SLURM_CPUS_PER_TASK
|
||||
{1, 45}
|
||||
One has to implicitly specify $SLURM_CPUS_PER_TASK
|
||||
In this example, by setting an environment variable SRUN_CPUS_PER_TASK
|
||||
{1, 2, 3, 4, 45, 46, 47, 48}
|
||||
```
|
||||
|
||||
## General topics
|
||||
|
||||
### Default SHELL
|
||||
|
||||
In general, **`/bin/bash` is the recommended default user's SHELL** when working in Merlin.
|
||||
|
||||
Some users might notice that BASH is not the default SHELL when logging in to Merlin systems, or they might need to run a different SHELL.
|
||||
This is probably because when the PSI account was requested, no SHELL description was specified or a different one was requested explicitly by the requestor.
|
||||
Users can check which is the default SHELL specified in the PSI account with the following command:
|
||||
|
||||
```bash
|
||||
getent passwd $USER | awk -F: '{print $NF}'
|
||||
```
|
||||
|
||||
If SHELL does not correspond to the one you need to use, you should request a central change for it.
|
||||
This is because Merlin accounts are central PSI accounts. Hence, **change must be requested via [PSI Service Now](contact.md#psi-service-now)**.
|
||||
|
||||
Alternatively, if you work on other PSI Linux systems but for Merlin you need a different SHELL type, a temporary change can be performed during login startup.
|
||||
You can update one of the following files:
|
||||
* `~/.login`
|
||||
* `~/.profile`
|
||||
* Any `rc` or `profile` file in your home directory (i.e. `.cshrc`, `.bashrc`, `.bash_profile`, etc.)
|
||||
|
||||
with the following lines:
|
||||
|
||||
```bash
|
||||
# Replace MY_SHELL with the bash type you need
|
||||
MY_SHELL=/bin/bash
|
||||
exec $MY_SHELL -l
|
||||
```
|
||||
|
||||
Notice that available *shells* can be found in the following file:
|
||||
|
||||
```bash
|
||||
cat /etc/shells
|
||||
```
|
||||
|
||||
### 3D acceleration: OpenGL vs Mesa
|
||||
|
||||
Some applications can run with OpenGL support. This is only possible when the node contains a GPU card.
|
||||
|
||||
In general, X11 with Mesa Driver is the recommended method as it will work in all cases (no need of GPUs). In example, for ParaView:
|
||||
|
||||
```bash
|
||||
module load paraview
|
||||
paraview-mesa paraview # 'paraview --mesa' for old releases
|
||||
```
|
||||
|
||||
However, if one needs to run with OpenGL support, this is still possible by running `vglrun`. In example, for running Paraview:
|
||||
|
||||
```bash
|
||||
module load paraview
|
||||
vglrun paraview
|
||||
```
|
||||
|
||||
Officially, the supported method for running `vglrun` is by using the [NoMachine remote desktop](../how-to-use-merlin/nomachine.md).
|
||||
Running `vglrun` it's also possible using SSH with X11 Forwarding. However, it's very slow and it's only recommended when running
|
||||
in Slurm (from [NoMachine](../how-to-use-merlin/nomachine.md)). Please, avoid running `vglrun` over SSH from a desktop or laptop.
|
||||
|
||||
## Software
|
||||
|
||||
### ANSYS
|
||||
|
||||
Sometimes, running ANSYS/Fluent requires X11 support. For that, one should run fluent as follows.
|
||||
|
||||
```bash
|
||||
module load ANSYS
|
||||
fluent -driver x11
|
||||
```
|
||||
|
||||
### Paraview
|
||||
|
||||
For running Paraview, one can run it with Mesa support or OpenGL support. Please refer to [OpenGL vs Mesa](#3d-acceleration-opengl-vs-mesa) for
|
||||
further information about how to run it.
|
||||
|
||||
### Module command not found
|
||||
|
||||
In some circumstances the module command may not be initialized properly. For instance, you may see the following error upon logon:
|
||||
|
||||
```
|
||||
bash: module: command not found
|
||||
```
|
||||
|
||||
The most common cause for this is a custom `.bashrc` file which fails to source the global `/etc/bashrc` responsible for setting up PModules in some OS versions. To fix this, add the following to `$HOME/.bashrc`:
|
||||
|
||||
```bash
|
||||
if [ -f /etc/bashrc ]; then
|
||||
. /etc/bashrc
|
||||
fi
|
||||
```
|
||||
|
||||
It can also be fixed temporarily in an existing terminal by running `. /etc/bashrc` manually.
|
||||
|
||||
@@ -1,40 +0,0 @@
|
||||
# Troubleshooting
|
||||
|
||||
For troubleshooting, please contact us through the official channels. See [Contact](contact.md)
|
||||
for more information.
|
||||
|
||||
## Known Problems
|
||||
|
||||
Before contacting us for support, please check the **[Merlin6 Support: Known Problems](known-problems.md)** page to see if there is an existing
|
||||
workaround for your specific problem.
|
||||
|
||||
## Troubleshooting Slurm Jobs
|
||||
|
||||
If you want to report a problem or request for help when running jobs, please **always provide**
|
||||
the following information:
|
||||
|
||||
1. Provide your batch script or, alternatively, the path to your batch script.
|
||||
2. Add **always** the following commands to your batch script
|
||||
|
||||
```bash
|
||||
echo "User information:"; who am i
|
||||
echo "Running hostname:"; hostname
|
||||
echo "Current location:"; pwd
|
||||
echo "User environment:"; env
|
||||
echo "List of PModules:"; module list
|
||||
```
|
||||
|
||||
3. Whenever possible, provide the Slurm JobID.
|
||||
|
||||
Providing this information is **extremely important** in order to ease debugging, otherwise
|
||||
only with the description of the issue or just the error message is completely insufficient
|
||||
in most cases.
|
||||
|
||||
## Troubleshooting SSH
|
||||
|
||||
Use the ssh command with the "-vvv" option and copy and paste (no screenshots please)
|
||||
the output to your request in Service-Now. Example
|
||||
|
||||
```bash
|
||||
ssh -Y -vvv $username@merlin-l-01.psi.ch
|
||||
```
|
||||
Reference in New Issue
Block a user