Doc changes

2021-05-21 12:34:19 +02:00
parent 42d8f38934
commit fcfdbf1344
46 changed files with 447 additions and 528 deletions


@ -0,0 +1,49 @@
---
title: Contact
#tags:
#keywords:
last_updated: 28 June 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/contact.html
---
## Support
Basic contact information can also be found in the *Message of the Day* when logging in to the Merlin login nodes.
Support can be requested through:
* [PSI Service Now](https://psi.service-now.com/psisp)
* E-Mail: <merlin-admins@lists.psi.ch>
### PSI Service Now
**[PSI Service Now](https://psi.service-now.com/psisp)** is the official tool for opening incident requests.
* PSI HelpDesk will redirect the incident to the corresponding department, or
* you can always assign it directly by checking the box `I know which service is affected` and providing the service name `Local HPC Resources (e.g. Merlin) [CF]` (just type in `Local` and you should get the valid completions).
### Contact Merlin6 Administrators
**E-Mail <merlin-admins@lists.psi.ch>**
* This is the official way to contact Merlin6 Administrators for discussions which do not fit well into the incident category.
Do not hesitate to contact us for such cases.
---
## Get updated through the Merlin User list!
It is strongly recommended that users subscribe to the Merlin users mailing list: **<merlin-users@lists.psi.ch>**.
This mailing list is the official channel used by the Merlin6 administrators to inform users about downtimes,
interventions or problems. Users can subscribe in two ways:
* *(Preferred way)* Self-registration through **[Sympa](https://psilists.ethz.ch/sympa/info/merlin-users)**
* If you need to subscribe many people (e.g. your whole group), send a request to the admin list **<merlin-admins@lists.psi.ch>**
  providing the list of email addresses.
---
## The Merlin6 Team
Merlin6 is managed by the **[High Performance Computing and Emerging technologies Group](https://www.psi.ch/de/lsm/hpce-group)**, which
is part of **NES/[Laboratory for Scientific Computing and Modelling](https://www.psi.ch/de/lsm)**.


@ -0,0 +1,108 @@
---
title: Known Problems
#tags:
#keywords:
last_updated: 21 January 2021
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/known-problems.html
---
## Known Problems Summary
| Topic |
|:----------------------------------------------------------------------------------------- |
| [Default Shell](/merlin6/known-problems.html#default-shell)                                |
| [OpenGL vs Mesa](/merlin6/known-problems.html#opengl-vs-mesa)                              |
| [Paraview](/merlin6/known-problems.html#paraview)                                          |
| [ANSYS](/merlin6/known-problems.html#ansys)                                                |
| [Illegal instructions error](/merlin6/known-problems.html#illegal-instructions)            |
## Default SHELL
In general, **`/bin/bash` is the recommended default shell** when working on Merlin.
Some users might notice that Bash is not their default shell when logging in to the Merlin systems, or they might need to run a different shell.
This usually means that no shell was specified when the PSI account was requested, or that a different one was explicitly requested by the requestor.
Users can check the default shell configured for their PSI account with the following command:
```bash
getent passwd $USER | awk -F: '{print $NF}'
```
If this is not the shell you need, you should request a central change: Merlin accounts are central PSI accounts,
so **the change must be requested via [PSI Service Now](/merlin6/contact.html#psi-service-now)**.
Alternatively, if you work on other PSI Linux systems but need a different shell type on Merlin only, a temporary change can be performed during login.
You can update one of the following files:
* `~/.login`
* `~/.profile`
* Any `rc` or `profile` file in your home directory (i.e. `.cshrc`, `.bashrc`, `.bash_profile`, etc.)
with the following lines:
```bash
# Replace MY_SHELL with the shell you need
MY_SHELL=/bin/bash
exec $MY_SHELL -l
```
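Note that a bare `exec` in a file which the target shell also reads (e.g. `.bashrc`) can cause an infinite exec loop. A loop-safe sketch (names are illustrative; the `exec` line is commented out here for safety):

```shell
# Hedged sketch of a loop-safe shell switch for a profile file.
# MY_SHELL is the shell you want; /bin/bash is just an example.
MY_SHELL=/bin/bash
CURRENT=$(basename "${SHELL:-/bin/sh}")
if [ "$CURRENT" != "$(basename "$MY_SHELL")" ]; then
    # In your profile you would uncomment the next line:
    # exec "$MY_SHELL" -l
    echo "would switch to $MY_SHELL"
else
    # Already running the desired shell; do nothing to avoid an exec loop
    echo "already running $CURRENT"
fi
```

The comparison guards the `exec`, so sourcing the file from the target shell itself is harmless.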
Notice that available *shells* can be found in the following file:
```bash
cat /etc/shells
```
## OpenGL vs Mesa
Some applications can run with OpenGL support. This is only possible when the node contains a GPU card.
In general, X11 with the Mesa driver is the recommended method, as it works in all cases (no GPU needed). For example, for ParaView:
```bash
module load paraview
paraview-mesa paraview # 'paraview --mesa' for old releases
```
However, if one needs to run with OpenGL support, this is still possible via `vglrun`. Officially, the supported method is the
NoMachine remote desktop (SSH with X11 forwarding works but is slow, and requires a properly configured client on the user's desktop
or laptop, to which the Merlin admins have no access or rights). For example, to run Paraview:
```bash
module load paraview
vglrun paraview
```
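If you are unsure whether the node you are on actually has a GPU, a quick check is possible (an illustrative snippet; `nvidia-smi` is only available where the NVIDIA driver is installed):

```shell
# Print the GPU list when an NVIDIA driver is present; otherwise suggest Mesa
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L
else
    echo "no NVIDIA driver found: use the Mesa variant instead"
fi
```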
## ANSYS
Sometimes, running ANSYS/Fluent requires X11 support. In that case, run Fluent as follows:
```bash
module load ANSYS
fluent -driver x11
```
## Paraview
Paraview can be run with either Mesa or OpenGL support.
```bash
module load paraview
# Run with Mesa support (nodes without GPU)
paraview-mesa paraview # 'paraview --mesa' for old releases
# Run with OpenGL support (nodes with GPU)
vglrun paraview
```
## Illegal instructions
It may happen that code compiled on one machine fails to run on another, throwing an exception like **"Illegal instruction"**.
This usually means the software was compiled with a set of instructions newer than those available on the node where it runs,
which mostly depends on the processor generation.
For example, `merlin-l-001` and `merlin-l-002` contain a newer generation of processors than the old GPU nodes or the Merlin5 cluster.
Hence, unless the software is compiled to be compatible with the instruction set of the older processors, it will not run on the old nodes.
Sometimes this is set properly by default at compile time, but sometimes it is not.
For GCC, please refer to the [x86 Options documentation](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html) for the relevant compiler options. In case of doubt, contact us.


@ -0,0 +1,139 @@
---
title: Migration From Merlin5
#tags:
#keywords:
last_updated: 18 June 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/migrating.html
---
## Directories
### Merlin5 vs Merlin6
| Cluster | User Home Directory | User Data Directory | Group/Project Data Directory |
| ------- |:-------------------- |:-------------------- |:---------------------------------------- |
| merlin5 | /gpfs/home/_$username_ | /gpfs/data/_$username_ | /gpfs/group/_$laboratory_ |
| merlin6 | /psi/home/_$username_ | /data/user/_$username_ | /data/project/_\[general\|bio\]_/_$projectname_ |
### Quota limits in Merlin6
| Directory | Quota_Type [Soft:Hard] (Block) | Quota_Type [Soft:Hard] (Files) | Quota Change Policy: Block | Quota Change Policy: Files |
| ---------------------------------- | ------------------------------ | ------------------------------ |:--------------------------------------------- |:--------------------------------------------- |
| /psi/home/$username | USR [10GB:11GB] | *Undef* | Up to x2 when strictly justified. | N/A |
| /data/user/$username               | USR [1TB:1.074TB]              | USR [1M:1.1M]                  | Immutable. Need a project.                     | Changeable when justified.                     |
| /data/project/bio/$projectname | GRP+Fileset [1TB:1.074TB] | GRP+Fileset [1M:1.1M] | Changeable according to project requirements. | Changeable according to project requirements. |
| /data/project/general/$projectname | GRP+Fileset [1TB:1.074TB] | GRP+Fileset [1M:1.1M] | Changeable according to project requirements. | Changeable according to project requirements. |
where:
* **Block** is capacity size in GB and TB
* **Files** is number of files + directories in Millions (M)
* **Quota types** are the following:
* **USR**: Quota is setup individually per user name
* **GRP**: Quota is setup individually per Unix Group name
* **Fileset**: Quota is setup per project root directory.
* The user data directory ``/data/user`` has a strict user block quota limit policy. If more disk space is required, a 'project' must be created.
* Soft quotas can be exceeded for short periods of time. Hard quotas cannot be exceeded.
### Project directory
#### Why is 'project' needed?
Merlin6 introduces the concept of a *project* directory. This is the recommended location for all scientific data.
* `/data/user` is not suitable for sharing data between users
* The Merlin5 *group* directories were a similar concept, but the association with a single organizational group made
interdepartmental sharing difficult. Projects can be shared by any PSI user.
* Projects are shared by multiple users (at a minimum they should be shared with the supervisor/PI). This decreases
the chance of data being orphaned by personnel changes.
* Shared projects are preferable to individual data for transparency and accountability in event of future questions
regarding the data.
* One project member is designated as responsible. Responsibility can be transferred if needed.
#### Requesting a *project*
Refer to [Requesting a project](/merlin6/request-project.html)
---
## Migration Schedule
### Phase 1 [June]: Pre-migration
* Users keep working on Merlin5
* Merlin5 production directories: ``'/gpfs/home/'``, ``'/gpfs/data'``, ``'/gpfs/group'``
* Users may raise any problems (quota limits, inaccessible files, etc.) to merlin-admins@lists.psi.ch
* Users can start migrating data (see [Migration steps](/merlin6/migrating.html#migration-steps))
* Users should copy their data from Merlin5 ``/gpfs/data`` to Merlin6 ``/data/user``
* Users should copy their home from Merlin5 ``/gpfs/home`` to Merlin6 ``/psi/home``
* Users should inform when migration is done, and which directories were migrated. Deletion for such directories can be requested by admins.
### Phase 2 [July-October]: Migration to Merlin6
* Merlin6 becomes the official cluster, and directories are switched to the new structure:
* Merlin6 production directories: ``'/psi/home/'``, ``'/data/user'``, ``'/data/project'``
* Merlin5 directories available in RW in login nodes: ``'/gpfs/home/'``, ``'/gpfs/data'``, ``'/gpfs/group'``
* In Merlin5 computing nodes, Merlin5 directories are mounted in RW: ``'/gpfs/home/'``, ``'/gpfs/data'``, ``'/gpfs/group'``
* In Merlin5 computing nodes, Merlin6 directories are mounted in RW: ``'/psi/home/'``, ``'/data/user'``, ``'/data/project'``
* Users must migrate their data (see [Migration steps](/merlin6/migrating.html#migration-steps))
* ALL data must be migrated
* Job submissions go to Merlin6 by default. Submission to the Merlin5 computing nodes remains possible.
* Users should inform when migration is done, and which directories were migrated. Deletion for such directories can be requested by admins.
### Phase 3 [November]: Merlin5 Decommissioning
* Old Merlin5 storage unmounted.
* Migrated directories reported by users will be deleted.
* Remaining Merlin5 data will be archived.
---
## Migration steps
### Cleanup / Archive files
* Users must clean up and/or archive files, according to the quota limits of the target storage.
* If extra space is needed, we advise users to request a [project](/merlin6/request-project.html).
* If you need a larger quota for the maximum allowed number of files, you can request an increase of your user quota.
### Step 1: Migrating
First migration:
```bash
rsync -avAHXS <source_merlin5> <destination_merlin6>
rsync -avAHXS /gpfs/data/$username/* /data/user/$username
```
This can take several hours or days:
* You can parallelize multiple `rsync` commands over sub-directories to increase the transfer rate.
* Please do not run too many concurrent transfers; as a rule of thumb, no more than 10 at once.
* Other users may be doing the same, and excessive transfers can cause storage / UI performance problems on the Merlin5 cluster.
### Step 2: Mirroring
Once the first migration is done, a second ``rsync`` should be run, this time with ``--delete``. With this option, ``rsync``
deletes from the destination all files that were removed from the source, while still propagating
new files from the source to the destination.
```bash
rsync -avAHXS --delete <source_merlin5> <destination_merlin6>
rsync -avAHXS --delete /gpfs/data/$username/* /data/user/$username
```
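Since `--delete` removes files on the destination, it can be worth previewing its effect with a dry run (`-n`) first. A small self-contained sketch using temporary directories:

```shell
# Temporary stand-ins for the Merlin5 source and Merlin6 destination
SRC=$(mktemp -d); DST=$(mktemp -d)
echo new > "$SRC/new.txt"          # exists only in the source
echo stale > "$DST/stale.txt"      # removed from the source earlier
# -n (dry run): print what would be transferred/deleted, change nothing
rsync -an --delete --itemize-changes "$SRC/" "$DST/"
ls "$DST"                          # stale.txt is still there
```

Drop the `-n` only once the listed actions look correct.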
### Step 3: Removing / Archiving old data
#### Removing migrated data
Once you have ensured that everything is migrated to the new storage, the data is ready to be deleted from the old storage.
Users must report when the migration is finished, stating which directories are affected and ready to be removed.
The Merlin administrators will remove the directories, always asking for a final confirmation.
#### Archiving data
Once all migrated data has been removed from the old storage, the remaining data will be archived.


@ -0,0 +1,48 @@
---
title: Troubleshooting
#tags:
#keywords:
last_updated: 21 January 2021
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/troubleshooting.html
---
For troubleshooting, please contact us through the official channels. See [Contact](/merlin6/contact.html)
for more information.
## Known Problems
Before contacting us for support, please check the **[Merlin6 Support: Known Problems](/merlin6/known-problems.html)** page to see if there is an existing
workaround for your specific problem.
## Troubleshooting Slurm Jobs
If you want to report a problem or request help with running jobs, please **always provide**
the following information:
1. Provide your batch script or, alternatively, the path to your batch script.
2. **Always** add the following commands to your batch script:
```bash
echo "User information:"; who am i
echo "Running hostname:"; hostname
echo "Current location:"; pwd
echo "User environment:"; env
echo "List of PModules:"; module list
```
3. Whenever possible, provide the Slurm JobID.
Providing this information is **extremely important** to ease debugging; in most cases,
a description of the issue or the error message alone is insufficient.
## Troubleshooting SSH
Use the `ssh` command with the `-vvv` option and copy and paste (no screenshots, please)
the output into your request in Service Now. Example:
```bash
ssh -Y -vvv $username@merlin-l-01.psi.ch
```