---
title: Merlin6 Data Directories
#tags:
#keywords:
last_updated: 28 June 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/data-directories.html
---

## Merlin6 directory structure

Merlin6 offers the following directory classes for users:

* ``/psi/home/<username>``: Private user **home** directory
* ``/data/user/<username>``: Private user **data** directory
* ``/data/project/general/<projectname>``: Shared **project** directory
  * For BIO experiments, a dedicated ``/data/project/bio/<projectname>`` exists.
* ``/scratch``: Local *scratch* disk (only visible to the node running a job).
* ``/shared-scratch``: Shared *scratch* disk (visible from all nodes).

Properties of the directory classes:

| Directory                          | Block Quota [Soft:Hard] | File Quota [Soft:Hard] | Quota Change Policy: Block        | Quota Change Policy: Files       | Backup | Backup Policy                  |
| ---------------------------------- | ----------------------- | ---------------------- |:--------------------------------- |:-------------------------------- | ------ |:------------------------------ |
| /psi/home/$username                | USR [10GB:11GB]         | *Undef*                | Up to x2 when strictly justified. | N/A                              | yes    | Daily snapshots for 1 week     |
| /data/user/$username               | USR [1TB:1.074TB]       | USR [1M:1.1M]          | Immutable; a project is needed for more space. | Changeable when justified.       | no     | Users responsible for backup   |
| /data/project/bio/$projectname     | GRP [1TB:1.074TB]       | GRP [1M:1.1M]          | Subject to project requirements.  | Subject to project requirements. | no     | Project responsible for backup |
| /data/project/general/$projectname | GRP [1TB:1.074TB]       | GRP [1M:1.1M]          | Subject to project requirements.  | Subject to project requirements. | no     | Project responsible for backup |
| /scratch                           | *Undef*                 | *Undef*                | N/A                               | N/A                              | no     | N/A                            |
| /shared-scratch                    | *Undef*                 | *Undef*                | N/A                               | N/A                              | no     | N/A                            |

### User home directory

This is the default directory users land in when logging in to any Merlin6 machine.
It is intended for your scripts, documents, software development, and other files
that you want backed up. Do not use it for data or I/O-hungry HPC tasks.

This directory is mounted on the login and computing nodes under the path:

```bash
/psi/home/$username
```

Home directories are part of the PSI NFS Central Home storage provided by AIT and
are managed by the Merlin6 administrators.

Users can check their quota by running the following command:

```bash
quota -s
```

#### Home directory policy

* Read **[Important: Code of Conduct](#important-code-of-conduct)** for more information about Merlin6 policies.
* It is **forbidden** to use the home directories for I/O-intensive tasks.
  * Use ``/scratch``, ``/shared-scratch``, ``/data/user`` or ``/data/project`` for this purpose.
* Users can retrieve up to 1 week of lost data thanks to the automatic **daily snapshots for 1 week**.
  Snapshots can be accessed at this path:

```bash
/psi/home/.snapshot/$username
```

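A file deleted from the home directory can be recovered by copying it back out of a snapshot. The snapshot layout and names under ``.snapshot`` may differ from this sketch, so list the directory first; the snapshot name ``daily-20190627`` and the file ``myscript.sh`` are placeholders:

```bash
# List the snapshots available for your home directory
ls /psi/home/.snapshot/$USER

# Copy a lost file back into the live home directory
# (snapshot subdirectory and file name are placeholders)
cp /psi/home/.snapshot/$USER/daily-20190627/myscript.sh ~/
```
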
### User data directory

The user data directory is intended for *fast I/O access* and for keeping large amounts of private data.
This directory is mounted on the login and computing nodes under the path:

```bash
/data/user/$username
```

Users can check their quota by running the following command:

```bash
mmlsquota -u <username> --block-size auto merlin-user
```

#### User data directory policy

* Read **[Important: Code of Conduct](#important-code-of-conduct)** for more information about Merlin6 policies.
* It is **forbidden** to use the data directories as a ``scratch`` area during a job's runtime.
  * Use ``/scratch`` or ``/shared-scratch`` for this purpose.
* No backup policy is applied to user data directories: users are responsible for backing up their data, for instance as sketched below.

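One possible approach is to mirror important data to storage outside the cluster with ``rsync``. The destination host ``backuphost.example.com`` and its path are placeholders for whatever backup target your group actually uses:

```bash
# Mirror the user data directory to an external backup location
# (destination host and path are placeholders, not Merlin6 services)
rsync -avh /data/user/$USER/ backuphost.example.com:/backup/$USER/
```
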
### Project data directory

This storage is intended for *fast I/O access* and for keeping large amounts of a project's data, where the data can also be
shared by all members of the project (the project's corresponding Unix group). We recommend keeping most data in
project-related storage spaces, since this allows users to coordinate. Project spaces also have more flexible policies
for extending the available storage space.

You can request a project space by submitting an incident request via **[PSI Service Now](https://psi.service-now.com/psisp)** using the subject line

```
Subject: [Merlin6] Project Request for project name xxxxxx
```

Please state your preferred project name and list the accounts that should be part of the project. The project will receive a corresponding Unix group.

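Once the project has been created, you can check that your account is a member of the project's Unix group; the group name ``myproject`` below is a placeholder:

```bash
# List all Unix groups your account belongs to
id -Gn

# Show the members of a specific project group ("myproject" is a placeholder)
getent group myproject
```
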
The project data directory is mounted on the login and computing nodes under the path
(with ``general`` replaced by ``bio`` for BIO projects):

```bash
/data/project/general/$projectname
```

Project quotas are defined on a per *group* basis. Users can check the project quota by running the following command:

```bash
mmrepquota merlin-proj:$projectname
```

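For example, for a hypothetical project named ``myproject`` (substitute the real project name):

```bash
# Report block and file usage for the "myproject" fileset (placeholder name)
mmrepquota merlin-proj:myproject
```
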
#### Project Directory policy

* Read **[Important: Code of Conduct](#important-code-of-conduct)** for more information about Merlin6 policies.
* It is **forbidden** to use the data directories as a ``scratch`` area during a job's runtime, i.e. for high-throughput I/O on a job's temporary files. Please use ``/scratch`` or ``/shared-scratch`` for this purpose.
* No backups: projects are responsible for managing the backups of their data directories.

### Scratch directories

There are two different types of scratch storage: **local** (``/scratch``) and **shared** (``/shared-scratch``).

**Local** scratch should be used for all jobs that do not require the scratch files to be accessible from multiple nodes,
which is trivially the case for all jobs running on a single node.
**Shared** scratch is intended for files that need to be accessible by multiple nodes, e.g. by an MPI job whose tasks are spread out over the cluster
and all need to do I/O on the same temporary files.

**Local** scratch on the Merlin6 computing nodes provides a very high number of IOPS thanks to NVMe technology. **Shared** scratch is implemented on a distributed parallel filesystem (GPFS), which results in higher latency, since it involves remote storage resources and more complex I/O coordination.

``/shared-scratch`` is only mounted on the *Merlin6* computing nodes (i.e. not on the login nodes), and its current size is 50TB. This can be increased in the future.

The properties of the available scratch storage spaces are given in the following table:

| Cluster | Service        | Scratch      | Scratch Mountpoint | Shared Scratch | Shared Scratch Mountpoint | Comments                               |
| ------- | -------------- | ------------ | ------------------ | -------------- | ------------------------- | -------------------------------------- |
| merlin5 | computing node | 50GB / SAS   | ``/scratch``       | ``N/A``        | ``N/A``                   | ``merlin-c-[01-64]``                   |
| merlin6 | login node     | 100GB / SAS  | ``/scratch``       | ``N/A``        | ``N/A``                   | ``merlin-l-0[1,2]``                    |
| merlin6 | computing node | 1.3TB / NVMe | ``/scratch``       | 50TB / GPFS    | ``/shared-scratch``       | ``merlin-c-[001-022,101-122,201-222]`` |
| merlin6 | login node     | 2.0TB / NVMe | ``/scratch``       | ``N/A``        | ``N/A``                   | ``merlin-l-00[1,2]``                   |

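Since the scratch areas are shared with other jobs and users, it can be worth checking the available space before writing large temporary files:

```bash
# Check free space on the node-local scratch disk
df -h /scratch

# Check free space on the shared scratch filesystem (computing nodes only)
df -h /shared-scratch
```
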
#### Scratch directories policy

* Read **[Important: Code of Conduct](#important-code-of-conduct)** for more information about Merlin6 policies.
* Always use **local** scratch first; only use **shared** scratch if your specific use case requires it.
* Temporary files *must be deleted at the end of the job by the user*; see the sketch below.
  * Remaining files will be deleted by the system if detected.

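A minimal sketch of this cleanup pattern, assuming jobs are submitted through Slurm (the resource requests and the application name ``my_simulation`` are placeholders):

```bash
#!/bin/bash
#SBATCH --job-name=scratch-demo
#SBATCH --time=01:00:00

# Create a job-specific directory on the node-local scratch disk
SCRATCHDIR="/scratch/$USER/$SLURM_JOB_ID"
mkdir -p "$SCRATCHDIR"

# Remove the scratch directory when the job exits, even on failure
trap 'rm -rf "$SCRATCHDIR"' EXIT

# Run the application with its temporary files on local scratch
# ("my_simulation" is a placeholder for your actual program)
cd "$SCRATCHDIR"
my_simulation

# Copy results worth keeping back to permanent storage before cleanup
cp results.dat "/data/user/$USER/"
```
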
---