update merlin7 storage documentation
parent 0392a2b3e4
commit 74c6e6866c
@ -30,7 +30,7 @@ The basic principle is courtesy and consideration for other users.
|
||||
* It is **forbidden** to use the ``/data/user``, ``/data/project`` or ``/psi/home/`` for that purpose.
|
||||
* Always remove files you do not need any more (e.g. core dumps, temporary files) as early as possible. Keep the disk space clean on all nodes.
|
||||
* Prefer ``/scratch`` over ``/shared-scratch`` and use the latter only when you require the temporary files to be visible from multiple nodes.
|
||||
* Read the description in **[Merlin6 directory structure](### Merlin6 directory structure)** for learning about the correct usage of each partition type.
|
||||
* Read the description in **[Merlin6 directory structure](/merlin6/storage.html#merlin6-directories)** to learn about the correct usage of each partition type.
|
||||
|
||||
## User and project data
|
||||
|
||||
|
@ -30,7 +30,7 @@ The basic principle is courtesy and consideration for other users.
|
||||
* It is **forbidden** to use the ``/data/user`` or ``/data/project`` for that purpose.
|
||||
* Always remove files you do not need any more (e.g. core dumps, temporary files) as early as possible. Keep the disk space clean on all nodes.
|
||||
* Prefer ``/scratch`` over ``/data/scratch/shared`` and _use the latter only when you require the temporary files to be visible from multiple nodes_.
|
||||
* Read the description in **[Merlin7 directory structure](### Merlin7 directory structure)** for learning about the correct usage of each partition type.
|
||||
* Read the description in **[Merlin7 directory structure](/merlin7/storage.html#merlin7-directories)** to learn about the correct usage of each partition type.
|
||||
|
||||
## User and project data
|
||||
|
||||
|
@ -2,7 +2,7 @@
|
||||
title: Merlin7 Storage
|
||||
#tags:
|
||||
keywords: storage, /data/user, /data/software, /data/project, /scratch, /data/scratch/shared, quota, export, user, project, scratch, data, data/scratch/shared, merlin_quotas
|
||||
last_updated: 07 September 2022
|
||||
#last_updated: 07 September 2022
|
||||
#summary: ""
|
||||
sidebar: merlin7_sidebar
|
||||
redirect_from: /merlin7/data-directories.html
|
||||
@ -16,35 +16,52 @@ This document describes the different directories of the Merlin7 cluster.
|
||||
### Backup and data policies
|
||||
|
||||
* ***Users are responsible for backing up their own data***. It is recommended to back up the data on independent third-party systems (e.g. LTS, Archive, AFS, SwitchDrive, Windows Shares).
|
||||
* ***When a user leaves PSI, they or their supervisor/team are responsible for backing up the data and moving it off the cluster***: every few months, the storage space of former users who no longer have a valid PSI account will be recycled.
|
||||
* ***When a user leaves PSI, they or their supervisor/team are responsible for backing up the data and moving it off the cluster***: every few months, the storage space of former users who no longer have a valid PSI account will be recycled.
|
||||
|
||||
{{site.data.alerts.warning}}When a user leaves PSI and his account has been removed, her storage space in Merlin may be recycled.
|
||||
Hence, <b>when a user leaves PSI</b>, she, her supervisor or team <b>must ensure that the data is backed up to an external storage</b>
|
||||
{{site.data.alerts.warning}}When a user leaves PSI and their account is removed, their storage space in Merlin may be recycled.
|
||||
Hence, <b>when a user leaves PSI</b>, they, their supervisor or team <b>must ensure that the data is backed up to external storage</b>.
|
||||
{{site.data.alerts.end}}
|
||||
|
||||
### How to check quotas
|
||||
|
||||
Some of the Merlin7 directories have quotas applied. You can check them with the `merlin_quotas` command.
|
||||
This command is useful to show all quotas for the different user storage directories and partitions (including AFS). To check your quotas, please run:
|
||||
```bash
|
||||
merlin_quotas
|
||||
|
||||
```console
|
||||
$ merlin_quotas
|
||||
Path SpaceUsed SpaceQuota Space % FilesUsed FilesQuota Files %
|
||||
-------------- --------- ---------- ------- --------- ---------- -------
|
||||
/data/user 29.85G 1T 0% 366671 2097152 0%
|
||||
└─ <USERNAME>
|
||||
/afs/psi.ch 3.4G 9.5G 36% 0 0 0%
|
||||
└─ user/v/<USERNAME>
|
||||
/data/scratch 680.1M 512G 0% 366897 0 0%
|
||||
└─ shared
|
||||
/data/project 1.115T 10T 0% 50 2097152 0%
|
||||
└─ bio/shared
|
||||
```
|
||||
|
||||
{{site.data.alerts.warning}}Currently, <b>merlin_quotas</b> is not functional on the Merlin7 cluster.
|
||||
We will notify when this is ready to be used.
|
||||
The output shows the quotas set and how much of each quota you are using, for each filesystem that has one. Note that some users will see
|
||||
one or more `/data/project/...` directories listed, depending on whether they are part of a specific PSI research group or project.
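
As a rough illustration, the output can be filtered to show only the paths whose space usage is at or above a given percentage. This is a minimal sketch that assumes the column layout of the sample output above (path first, `Space %` in the fourth column); the helper name `quotas_above` is hypothetical and not part of Merlin7.

```bash
#!/bin/bash
# Hypothetical helper: read merlin_quotas-style output on stdin and print
# the paths whose "Space %" column is at or above the given threshold.
# Assumes data rows start with "/" and carry the percentage in column 4,
# as in the sample output shown in this document.
quotas_above() {
    local threshold="$1"
    awk -v t="$threshold" '
        $1 ~ /^\// {
            pct = $4
            sub(/%$/, "", pct)          # strip the trailing percent sign
            if (pct + 0 >= t) print $1, $4
        }'
}

# Intended usage on a Merlin7 login node (not executed here):
#   merlin_quotas | quotas_above 80
```

The tree lines (`└─ ...`) are skipped because they do not start with `/`.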
|
||||
|
||||
The general quota constraints for the different directories are shown in the table below.
|
||||
|
||||
{{site.data.alerts.note}}If you're interested, you can retrieve the Lustre-based quota information directly by calling
|
||||
<code>lfs quota -h -p $(( 100000000 + $(id -u $USER) )) /data</code>. Using the <code>merlin_quotas</code> command is more
|
||||
convenient and shows all your relevant filesystem quotas.
|
||||
{{site.data.alerts.end}}
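
The project ID passed to `lfs quota` in the note above is derived from the numeric user ID. The following is a minimal sketch of that arithmetic, assuming the offset of 100000000 shown in the note; the function names `home_project_id` and `show_home_quota` are hypothetical helpers, not commands provided by Merlin7.

```bash
#!/bin/bash
# Hypothetical helper: derive the Lustre project ID of a home directory
# from a numeric UID, using the offset shown in the note above.
home_project_id() {
    local uid="$1"
    echo $(( 100000000 + uid ))
}

# Hypothetical wrapper around the documented command. It only works on a
# node where the Lustre client tools and /data are available, so it is
# defined here but not called.
show_home_quota() {
    lfs quota -h -p "$(home_project_id "$(id -u "$USER")")" /data
}

# Example: a user with UID 12345 maps to project ID 100012345.
home_project_id 12345
```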
|
||||
|
||||
## Merlin7 directories
|
||||
|
||||
Merlin7 offers the following directory classes for users:
|
||||
|
||||
* ``/data/user/<username>``: Private user **home** directory
|
||||
* ``/data/project/general``: project directory for Merlin
|
||||
* ``/data/project/bio/$projectname`` project directory for BIO
|
||||
* ``/data/project/mu3e/$projectname`` project directory for Mu3e
|
||||
* ``/data/project/meg/$projectname`` project directory for MEG
|
||||
* ``/scratch``: Local *scratch* disk (only visible by the node running a job).
|
||||
* ``/data/scratch/shared``: Shared *scratch* disk (visible from all nodes).
|
||||
* `/data/user/<username>`: Private user **home** directory
|
||||
* `/data/project/general`: project directory for Merlin
|
||||
* `/data/project/bio/$projectname`: project directory for BIO
|
||||
* `/data/project/mu3e/$projectname`: project directory for Mu3e
|
||||
* `/data/project/meg/$projectname`: project directory for MEG
|
||||
* `/scratch`: Local *scratch* disk (only visible by the node running a job).
|
||||
* `/data/scratch/shared`: Shared *scratch* disk (visible from all nodes).
|
||||
|
||||
{{site.data.alerts.tip}}In Lustre there is a concept called <b>grace time</b>. Filesystems have a block (amount of data) and inode (number of files) quota.
|
||||
Each quota has a soft and a hard limit. Once the soft limit is reached, users can keep writing up to their hard limit during the <b>grace period</b>.
|
||||
@ -56,13 +73,15 @@ Properties of the directory classes:
|
||||
|
||||
| Directory | Block Quota [Soft:Hard] | Inode Quota [Soft:Hard] | GraceTime | Quota Change Policy: Block | Quota Change Policy: Inodes | Backup |
|
||||
| ---------------------------------- | ----------------------- | ----------------------- | :-------: | :--------------------------------- |:-------------------------------- | ------ |
|
||||
| /data/user/$username | PRJ [1TB:1.074TB] | PRJ [2M:2.1M] | 7d | Inmutable. Need a project. | Changeable when justified. | no |
|
||||
| /data/user/$username | PRJ [1TB:1.074TB] | PRJ [2M:2.1M] | 7d | Immutable. Need a project. | Changeable when justified. | no |
|
||||
| /data/project/bio/$projectname | PRJ [1TB:1.074TB] | PRJ [1M:1.1M] | 7d | Subject to project requirements. | Subject to project requirements. | no |
|
||||
| /data/project/general/$projectname | PRJ [1TB:1.074TB] | PRJ [1M:1.1M] | 7d | Subject to project requirements. | Subject to project requirements. | no |
|
||||
| /scratch | *Undef* | *Undef* | N/A | N/A | N/A | no |
|
||||
| /data/scratch/shared | USR [512GB:2TB] | | 7d | Up to x2 when strongly justified. | Changeable when justified. | no |
|
||||
| /scratch | *Undef* | *Undef* | N/A | N/A | N/A | no |
|
||||
|
||||
{{site.data.alerts.warning}}The use of <b>scratch</b> and <b>/data/scratch/shared</b> areas as an extension of the quota <i>is forbidden</i>. <b>scratch</b> and <b>/data/scratch/shared</b> areas <i>must not contain</i> final data. Keep in mind that <br><b><i>auto cleanup policies</i></b> in the <b>scratch</b> and <b>/data/scratch/shared</b> areas are applied.
|
||||
{{site.data.alerts.warning}}The use of <b>/scratch</b> and <b>/data/scratch/shared</b> areas as an extension of the quota <i>is forbidden</i>. The <b>/scratch</b> and
|
||||
<b>/data/scratch/shared</b> areas <i>must not contain</i> final data. Keep in mind that <br><b><i>auto cleanup policies</i></b> in the <b>/scratch</b> and
|
||||
<b>/data/scratch/shared</b> areas are applied.
|
||||
{{site.data.alerts.end}}
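
To illustrate how a `[Soft:Hard]` pair from the table behaves, the sketch below classifies a usage figure against the two limits. The function name `quota_state` and the example numbers (whole gigabytes, loosely modelled on the 1TB:1.074TB row) are assumptions for illustration only. The sketch also does not model the expiry of the grace time, after which writes above the soft limit are refused as well.

```bash
#!/bin/bash
# Hypothetical helper: classify usage against a soft and a hard limit.
# All three arguments are integers in the same unit (e.g. GB).
quota_state() {
    local used="$1" soft="$2" hard="$3"
    if (( used < soft )); then
        echo "ok"        # below the soft limit, no restrictions
    elif (( used < hard )); then
        echo "grace"     # soft limit reached, writes allowed during grace time
    else
        echo "blocked"   # hard limit reached, further writes are refused
    fi
}

# Illustrative values only.
quota_state 900 1000 1074    # prints "ok"
quota_state 1020 1000 1074   # prints "grace"
quota_state 1074 1000 1074   # prints "blocked"
```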
|
||||
|
||||
### User home directory
|
||||
@ -71,21 +90,19 @@ This is the default directory users will land when login in to any Merlin7 machi
|
||||
It is intended for your scripts, documents, software development and data. Do not use it for I/O-hungry tasks.
|
||||
|
||||
The home directories are mounted in the login and computing nodes under the directory
|
||||
|
||||
```bash
|
||||
/data/user/$username
|
||||
```
|
||||
|
||||
Directory policies:
|
||||
* Read **[Important: Code of Conduct](## Important: Code of Conduct)** for more information about Merlin7 policies.
|
||||
* It is **forbidden** to use the home directories for IO-intensive tasks
|
||||
* Always use the local ``/scratch`` disk of the compute nodes first.
|
||||
* Use ``/data/scratch/shared`` only when necessary, for example for jobs requiring a fast shared storage area.
|
||||
* No backup policy is applied for the user home directories: users are responsible for backing up their data.
|
||||
|
||||
Home directory quotas are defined on a per-Lustre-project basis. Users can check the project quota by running the following command:
|
||||
```bash
|
||||
lfs quota -h -p $(( 100000000 + $(id -u $USER) )) /data
|
||||
```
|
||||
* Read **[Important: Code of Conduct](/merlin7/code-of-conduct.html)** for more information about Merlin7 policies.
|
||||
* It is **forbidden** to use the home directories for IO-intensive tasks; use one of the **[scratch](/merlin7/storage.html#scratch-directories)** areas instead!
|
||||
* No backup policy is applied for the user home directories: **users are responsible for backing up their data**.
|
||||
|
||||
Home directory quotas are defined on a per-Lustre-project basis. The quota can be checked using the `merlin_quotas` command described
|
||||
[above](/merlin7/storage.html#how-to-check-quotas).
|
||||
|
||||
### Project data directory
|
||||
|
||||
@ -105,17 +122,20 @@ Once a Merlin project is created, the directory will be mounted in the login and
|
||||
```
|
||||
|
||||
Project quotas are defined on a per-Lustre-project basis. Users can check the project quota by running the following command:
|
||||
|
||||
```bash
|
||||
lfs quota -h -p $projectid /data
|
||||
```
|
||||
|
||||
{{site.data.alerts.warning}}Checking <b>quotas</b> for the Merlin projects is not yet possible.
|
||||
In the future, a list of `projectid` will be provided, so users can check their quotas.
|
||||
{{site.data.alerts.end}}
|
||||
|
||||
Directory policies:
|
||||
* Read **[Important: Code of Conduct](## Important: Code of Conduct)** for more information about Merlin7 policies.
|
||||
* It is **forbidden** to use the data directories as ``scratch`` area during a job's runtime, i.e. for high throughput I/O for a job's temporary files.
|
||||
* Please Use ``/scratch``, ``/data/scratch/shared`` for this purpose.
|
||||
|
||||
* Read **[Important: Code of Conduct](/merlin7/code-of-conduct.html)** for more information about Merlin7 policies.
|
||||
* It is **forbidden** to use the data directories as a scratch area during a job's runtime, i.e. for high-throughput I/O on a job's temporary files.
|
||||
* Please use `/scratch` or `/data/scratch/shared` for this purpose.
|
||||
* No backups: users are responsible for managing the backups of their data directories.
|
||||
|
||||
#### Dedicated project directories
|
||||
@ -133,23 +153,27 @@ They follow the same rules as the general projects, except that they have assign
|
||||
|
||||
### Scratch directories
|
||||
|
||||
There are two different types of scratch storage: **local** (``/scratch``) and **shared** (``/data/scratch/shared``).
|
||||
There are two different types of scratch storage: **local** (`/scratch`) and **shared** (`/data/scratch/shared`).
|
||||
|
||||
* **local** scratch should be used for all jobs that do not require the scratch files to be accessible from multiple nodes, which is trivially
|
||||
true for all jobs running on a single node. Mount path:
|
||||
|
||||
```bash
|
||||
/scratch
|
||||
```
|
||||
|
||||
* **shared** scratch is intended for files that need to be accessible by multiple nodes, e.g. by an MPI job where tasks are spread out over the cluster
|
||||
and all tasks need to do I/O on the same temporary files.
|
||||
|
||||
```bash
|
||||
/data/scratch/shared
|
||||
```
|
||||
|
||||
Scratch directory policies:
|
||||
* Read **[Important: Code of Conduct](## Important: Code of Conduct)** for more information about Merlin7 policies.
|
||||
|
||||
* Read **[Important: Code of Conduct](/merlin7/code-of-conduct.html)** for more information about Merlin7 policies.
|
||||
* By default, *always* use **local** first and only use **shared** if your specific use case requires it.
|
||||
* Temporary files *must be deleted at the end of the job by the user*.
|
||||
* Remaining files will be deleted by the system if detected.
|
||||
* Files not accessed within 28 days will be automatically cleaned up by the system.
|
||||
* If for some reason the scratch areas get full, admins have the rights to cleanup the oldest data.
|
||||
* Remaining files will be deleted by the system if detected.
|
||||
* Files not accessed within 28 days will be automatically cleaned up by the system.
|
||||
* If for some reason the scratch areas get full, admins have the right to clean up the oldest data.
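
The cleanup rules above can be followed from within a job script. The sketch below is one possible approach, not an official Merlin7 recipe: it creates a private directory under a scratch base, runs a command with that directory exported, and removes it afterwards. The names `SCRATCH_BASE`, `JOB_SCRATCH`, `make_job_dir`, `run_with_scratch` and `list_stale_files` are assumptions introduced for this example.

```bash
#!/bin/bash
# Base directory for temporary job data; defaults to the local scratch disk.
# SCRATCH_BASE is an assumption of this sketch, not a Merlin7 variable.
SCRATCH_BASE="${SCRATCH_BASE:-/scratch}"

# Create a unique private directory under the scratch base and print its path.
make_job_dir() {
    mktemp -d "${SCRATCH_BASE}/${USER:-job}.XXXXXX"
}

# Run a command with JOB_SCRATCH pointing at a fresh directory, then
# remove the directory regardless of the command's exit status.
run_with_scratch() {
    local job_dir
    job_dir="$(make_job_dir)" || return 1
    JOB_SCRATCH="$job_dir" "$@"
    local status=$?
    rm -rf "$job_dir"
    return "$status"
}

# List files under a directory that have not been accessed for more than
# 28 days, i.e. candidates for the automatic cleanup described above.
list_stale_files() {
    find "$1" -type f -atime +28
}

# Intended usage inside a job script (not executed here):
#   run_with_scratch ./my_simulation --tmpdir "$JOB_SCRATCH"
```

Setting `SCRATCH_BASE=/data/scratch/shared` would apply the same pattern to the shared scratch area when a job really needs files visible from multiple nodes.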
|
||||
|