# Merlin6 Storage

## Introduction

This document describes the different directories of the Merlin6 cluster.

### User and project data

* ***Users are responsible for backing up their own data***. It is recommended to back up the data on independent third-party systems (e.g. LTS, Archive, AFS, SwitchDrive, Windows shares).
* **`/psi/home`**, as it holds only a small amount of data, is the only directory for which we can provide daily snapshots, kept for one week. These can be found in the directory **`/psi/home/.snapshot/`**.
* ***When a user leaves PSI, they or their supervisor/team are responsible for backing up the data and moving it out of the cluster***: every few months, the storage space of former users who no longer have a valid PSI account is recycled.

!!! warning

    When a user leaves PSI and their account has been removed, their storage
    space in Merlin may be recycled. Hence, **when a user leaves PSI**, they,
    their supervisor, or their team **must ensure that the data is backed up
    to an external storage**.

### Checking user quota

For each directory we provide a way of checking quotas where required. In addition, the single command ``merlin_quotas`` shows the quotas of all your filesystems at once (including AFS, which is not covered here).

To check your quotas, please run:

```bash
merlin_quotas
```

## Merlin6 directories

Merlin6 offers the following directory classes for users:

* ``/psi/home/<username>``: Private user **home** directory
* ``/data/user/<username>``: Private user **data** directory
* ``/data/project/general/<projectname>``: Shared **Project** directory
    * For BIO experiments, a dedicated ``/data/project/bio/<projectname>`` directory exists.
* ``/scratch``: Local *scratch* disk (only visible to the node running a job).
* ``/shared-scratch``: Shared *scratch* disk (visible from all nodes).
* ``/export``: Export directory for data transfers, visible from `ra-merlin-01.psi.ch`, `ra-merlin-02.psi.ch` and the Merlin login nodes.
    * Refer to **[Transferring Data](../how-to-use-merlin/transfer-data.md)** for more information about the export area and the data transfer service.

!!! tip

    In GPFS there is a concept called **GraceTime**. Filesystems have a block
    quota (amount of data) and a files quota (number of files), each with a
    soft and a hard limit. Once the soft limit is reached, users can keep
    writing up to their hard limit during the **grace period**. Once the
    **GraceTime** expires or the hard limit is reached, users will be unable
    to write and will need to remove data until they are below the soft limit
    again (or ask for a quota increase where possible; see the table below).

Properties of the directory classes:

| Directory | Block Quota [Soft:Hard] | Files Quota [Soft:Hard] | GraceTime | Quota Change Policy: Block | Quota Change Policy: Files | Backup | Backup Policy |
| ---------------------------------- | ----------------------- | ----------------------- | :-------: | :--------------------------------- |:-------------------------------- | ------ | :----------------------------- |
| /psi/home/$username | USR [10GB:11GB] | *Undef* | N/A | Up to x2 when strongly justified. | N/A | yes | Daily snapshots for 1 week |
| /data/user/$username | USR [1TB:1.074TB] | USR [1M:1.1M] | 7d | Immutable. Request a project instead. | Changeable when justified. | no | Users responsible for backup |
| /data/project/bio/$projectname | GRP [1TB:1.074TB] | GRP [1M:1.1M] | 7d | Subject to project requirements. | Subject to project requirements. | no | Project responsible for backup |
| /data/project/general/$projectname | GRP [1TB:1.074TB] | GRP [1M:1.1M] | 7d | Subject to project requirements. | Subject to project requirements. | no | Project responsible for backup |
| /scratch | *Undef* | *Undef* | N/A | N/A | N/A | no | N/A |
| /shared-scratch | USR [512GB:2TB] | USR [2M:2.5M] | 7d | Up to x2 when strongly justified. | Changeable when justified. | no | N/A |
| /export | USR [10MB:20TB] | USR [512K:5M] | 10d | Soft limit can be temporarily increased. | Changeable when justified. | no | N/A |

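For example (using the same quota command as in the user data section below): once your usage of `/data/user` exceeds the 1TB soft limit, the grace column of the output switches from `none` to the remaining **GraceTime**, and writes are blocked once the 7 days expire or the 1.074TB hard limit is reached. A minimal sketch, assuming the default GPFS output layout:

```bash
# Sketch: inspect block and files usage, limits, and grace status.
# "none" in the grace column means you are below the soft limit; a
# remaining time (e.g. "6 days") means the grace period is running.
mmlsquota -u $USER --block-size auto merlin-user
```
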
!!! warning

    The use of the **scratch** and **export** areas as an extension of the
    quota _is forbidden_. The **scratch** and **export** areas _must not
    contain_ final data.

**_Auto cleanup policies_** are applied in the **scratch** and **export** areas.

### User home directory

This is the default directory users land in when logging in to any Merlin6 machine.
It is intended for your scripts, documents, software development, and other files
that you want to have backed up. Do not use it for data storage or I/O-hungry HPC tasks.

This directory is mounted on the login and computing nodes under the path:

```bash
/psi/home/$username
```

Home directories are part of the PSI NFS Central Home storage provided by AIT and
are managed by the Merlin6 administrators.

Users can check their quota by running the following command:

```bash
quota -s
```

#### Home directory policy

* Read **[Important: Code of Conduct](../quick-start-guide/code-of-conduct.md)** for more information about Merlin6 policies.
* It is **forbidden** to use the home directories for I/O-intensive tasks.
    * Use `/scratch`, `/shared-scratch`, `/data/user` or `/data/project` for this purpose.
* Users can retrieve up to one week of lost data thanks to the automatic **daily snapshots kept for 1 week**.
  Snapshots can be accessed at this path:

```bash
/psi/home/.snapshot/$username
```

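For example, to recover a file lost earlier in the week (a sketch: the dated snapshot name below is illustrative, and the exact layout under `.snapshot` may differ, so list the directory first):

```bash
# List the available snapshots, then copy the lost file back.
# The snapshot name "daily.2019-06-19" is illustrative only.
ls /psi/home/.snapshot/
cp /psi/home/.snapshot/daily.2019-06-19/$USER/myscript.sh ~/myscript.sh
```
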
### User data directory

The user data directory is intended for *fast I/O access* and for keeping large amounts of private data.
This directory is mounted on the login and computing nodes under the path:

```bash
/data/user/$username
```

Users can check their quota by running the following command:

```bash
mmlsquota -u <username> --block-size auto merlin-user
```

#### User data directory policy

* Read **[Important: Code of Conduct](../quick-start-guide/code-of-conduct.md)** for more information about Merlin6 policies.
* It is **forbidden** to use the data directories as a ``scratch`` area during a job's runtime.
    * Use ``/scratch`` or ``/shared-scratch`` for this purpose.
* No backup policy is applied for user data directories: users are responsible for backing up their data.

### Project data directory

This storage is intended for *fast I/O access* and for keeping large amounts of a project's data, which can also
be shared by all members of the project (the project's corresponding Unix group). We recommend keeping most data
in project-related storage spaces, since this allows users to coordinate. Project spaces also have more flexible
policies for extending the available storage space.

Experiments can request a project space as described in **[Accessing Merlin -> Requesting a Project](../quick-start-guide/requesting-projects.md)**.

Once created, the project data directory will be mounted on the login and computing nodes under the directory:

```bash
/data/project/general/$projectname
```

Project quotas are defined on a per *group* basis. Users can check the project quota by running the following command:

```bash
mmlsquota -j $projectname --block-size auto -C merlin.psi.ch merlin-proj
```

#### Project directory policy

* Read **[Important: Code of Conduct](../quick-start-guide/code-of-conduct.md)** for more information about Merlin6 policies.
* It is **forbidden** to use the data directories as a `scratch` area during a job's runtime, i.e. for high-throughput I/O on a job's temporary files. Please use `/scratch` or `/shared-scratch` for this purpose.
* No backups: users are responsible for managing the backups of their data directories.

### Scratch directories

There are two different types of scratch storage: **local** (`/scratch`) and **shared** (`/shared-scratch`).

**Local** scratch should be used for all jobs that do not require the scratch files to be accessible
from multiple nodes, which is trivially true for all jobs running on a single node.
**Shared** scratch is intended for files that need to be accessible by multiple nodes, e.g. by an MPI job
whose tasks are spread over the cluster and all need to do I/O on the same temporary files.

**Local** scratch on the Merlin6 computing nodes provides a huge number of IOPS thanks to NVMe technology. **Shared** scratch is implemented on a distributed parallel filesystem (GPFS), resulting in higher latency, since it involves remote storage resources and more complex I/O coordination.

`/shared-scratch` is only mounted on the *Merlin6* computing nodes (i.e. not on the login nodes), and its current size is 50TB. This can be increased in the future.

The properties of the available scratch storage spaces are given in the following table:

| Cluster | Service        | Scratch      | Scratch Mountpoint | Shared Scratch | Shared Scratch Mountpoint | Comments                             |
| ------- | -------------- | ------------ | ------------------ | -------------- | ------------------------- | ------------------------------------ |
| merlin5 | computing node | 50GB / SAS   | `/scratch`         | `N/A`          | `N/A`                     | `merlin-c-[01-64]`                   |
| merlin6 | login node     | 100GB / SAS  | `/scratch`         | 50TB / GPFS    | `/shared-scratch`         | `merlin-l-0[1,2]`                    |
| merlin6 | computing node | 1.3TB / NVMe | `/scratch`         | 50TB / GPFS    | `/shared-scratch`         | `merlin-c-[001-024,101-124,201-224]` |
| merlin6 | login node     | 2.0TB / NVMe | `/scratch`         | 50TB / GPFS    | `/shared-scratch`         | `merlin-l-00[1,2]`                   |

#### Scratch directories policy

* Read **[Important: Code of Conduct](../quick-start-guide/code-of-conduct.md)** for more information about Merlin6 policies.
* By default, *always* use **local** scratch first; only use **shared** scratch if your specific use case requires it.
* Temporary files *must be deleted at the end of the job by the user* (see the sketch after this list).
    * Remaining files will be deleted by the system if detected.
* Files not accessed within 28 days will be automatically cleaned up by the system.
* If for some reason the scratch areas fill up, admins have the right to clean up the oldest data.

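A minimal batch script sketch (assuming Slurm; the directory layout is illustrative) that gives each job its own local scratch directory and removes it even if the job fails:

```bash
#!/bin/bash
#SBATCH --job-name=scratch-demo

# Per-job scratch directory; removed automatically when the script
# exits, whether the job succeeds or fails.
SCRATCHDIR=/scratch/$USER/$SLURM_JOB_ID
mkdir -p "$SCRATCHDIR"
trap 'rm -rf "$SCRATCHDIR"' EXIT

cd "$SCRATCHDIR"
# ... run the computation here, keeping temporary files local ...
# Copy final results to /data/user or /data/project before the job ends:
# cp results.dat /data/user/$USER/
```
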
### Export directory

The export directory is exclusively intended for transferring data from outside PSI to Merlin and vice versa. It is a temporary directory with an auto-cleanup policy.
Please read **[Transferring Data](../how-to-use-merlin/transfer-data.md)** for more information about it.

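For example, a minimal transfer sketch (the per-user subdirectory and the pull command are illustrative assumptions; follow the Transferring Data page for the supported procedure):

```bash
# On a Merlin login node: stage results in the export area.
# The per-user subdirectory shown here is an assumption.
cp -r ~/results /export/$USER/

# From a machine outside PSI: pull the data via a data transfer node.
rsync -av ra-merlin-01.psi.ch:/export/$USER/results ./
```
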
#### Export directory policy

* Temporary files *must be deleted by the user once the transfer is completed*.
    * Remaining files will be deleted by the system if detected.
* Files not accessed within 28 days will be automatically cleaned up by the system.
* If for some reason the export area fills up, admins have the right to clean up the oldest data.