---
title: Merlin6 Data Directories
#tags:
#keywords:
last_updated: 18 June 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/data-directories.html
---

## Merlin6 directory structure

Merlin6 contains the following directories available to users:

* ``/psi/home/<username>``: private user **home** directory
* ``/data/user/<username>``: private user **data** directory
* ``/data/project/general/<projectname>``: shared **project** directory
  * For BIO experiments, a dedicated ``/data/project/bio/$projectname`` directory exists.
* ``/scratch``: local *scratch* disk
* ``/shared-scratch``: shared *scratch* disk

A summary of each directory is shown below:

| Directory                          | Block Quota [Soft:Hard] | File Quota [Soft:Hard] | Quota Change Policy: Block        | Quota Change Policy: Files       | Backup | Backup Policy                  |
| ---------------------------------- | ----------------------- | ---------------------- |:--------------------------------- |:-------------------------------- | ------ |:------------------------------ |
| /psi/home/$username                | USR [10GB:11GB]         | *Undef*                | Up to x2 when strictly justified. | N/A                              | yes    | Daily snapshots for 1 week     |
| /data/user/$username               | USR [1TB:1.074TB]       | USR [1M:1.1M]          | Immutable. Needs a project.       | Changeable when justified.       | no     | Users responsible for backup   |
| /data/project/bio/$projectname     | GRP [1TB:1.074TB]       | GRP [1M:1.1M]          | Subject to project requirements.  | Subject to project requirements. | no     | Project responsible for backup |
| /data/project/general/$projectname | GRP [1TB:1.074TB]       | GRP [1M:1.1M]          | Subject to project requirements.  | Subject to project requirements. | no     | Project responsible for backup |
| /scratch                           | *Undef*                 | *Undef*                | N/A                               | N/A                              | no     | N/A                            |
| /shared-scratch                    | *Undef*                 | *Undef*                | N/A                               | N/A                              | no     | N/A                            |

### User home directory

Home directories are part of the PSI NFS Central Home storage provided by AIT.
However, administration of the Merlin6 NFS homes is delegated to the Merlin6 administrators.

This is the default directory users land in when logging in to any Merlin6 machine.
This directory is mounted on the login and computing nodes under:

```bash
/psi/home/$username
```

Users can check their quota by running the following command:

```bash
quota -s
```

#### Home directory policy

* Read **[Important: Code of Conduct](## Important: Code of Conduct)** for more information about Merlin6 policies.
* It is **forbidden** to use the home directories for IO-intensive tasks.
  * Use ``/scratch``, ``/shared-scratch``, ``/data/user`` or ``/data/project`` for this purpose.
* Users can recover up to 1 week of lost data thanks to the automatic **daily snapshots kept for 1 week**.
  Snapshots are found in the following directory:

```bash
/psi/home/.snapshot/$username
```

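To restore a lost file, list the contents of the snapshot directory and copy the file back. A minimal sketch, assuming your username is available in ``$USER`` (``<snapshot-name>`` and ``my_lost_file`` are placeholders; check the actual contents with ``ls`` first):

```bash
# List the available snapshots of your home directory
ls /psi/home/.snapshot/$USER

# Copy a lost file back from one of the snapshots listed above
# (<snapshot-name> and my_lost_file are placeholders)
cp -p "/psi/home/.snapshot/$USER/<snapshot-name>/my_lost_file" "$HOME/my_lost_file"
```
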
### User data directory

User data directories are part of the Merlin6 storage cluster, whose technology is based on GPFS.

The user data directory is intended for *fast IO access* and for keeping large amounts of private data.
This directory is mounted on the login and computing nodes under:

```bash
/data/user/$username
```

Users can check their quota by running the following command:

```bash
mmlsquota -u <username> --block-size auto merlin-user
```

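For example, assuming the ``$USER`` environment variable holds your Merlin6 username, the following reports your usage against the soft and hard limits listed in the table above:

```bash
# Block and file usage of the current user on the merlin-user fileset
mmlsquota -u $USER --block-size auto merlin-user
```
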
#### User Directory policy

* Read **[Important: Code of Conduct](## Important: Code of Conduct)** for more information about Merlin6 policies.
* It is **forbidden** to use the data directories as a ``scratch`` area during job runtime.
  * Use ``/scratch`` or ``/shared-scratch`` for this purpose.
* No backup policy is applied to user data directories: users are responsible for backing up their data.

### Project data directory

Project data directories are part of the Merlin6 storage cluster, whose technology is based on GPFS.

This storage is intended for *fast IO access* and for keeping large amounts of private data, but also for sharing data amongst
the different users of a project.
Creating a project is the way users can expand their storage space, and it optimizes the overall usage of the storage
(for instance, by avoiding data being duplicated across different users).

Using a project is **highly** recommended when multiple persons involved in the same project manage similar or common data.
Quotas are defined on a *group* and *fileset* basis: a Unix group must exist for a specific project, or must be created for
any new project. Contact the Merlin6 administrators for more information about this.

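To see which Unix groups your account already belongs to (and hence which project directories you can access), standard tools are sufficient; a minimal sketch, where ``<projectname>`` is a placeholder:

```bash
# List all Unix groups of the current user;
# each project directory is owned by one such group
id -Gn

# Show the owning group and permissions of an existing project directory
ls -ld /data/project/general/<projectname>
```
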
The project data directory is mounted on the login and computing nodes under the directory:

```bash
/data/project/general/$projectname
```

(For BIO projects, the directory is ``/data/project/bio/$projectname`` instead.)

Users can check the project quota by running the following command:

```bash
mmrepquota merlin-proj:$projectname
```

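For example, for a hypothetical project named ``myproject``:

```bash
# Report block and file usage of the myproject fileset
mmrepquota merlin-proj:myproject
```
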
#### Project Directory policy

* Read **[Important: Code of Conduct](## Important: Code of Conduct)** for more information about Merlin6 policies.
* It is **forbidden** to use the data directories as a ``scratch`` area during job runtime.
  * Use ``/scratch`` or ``/shared-scratch`` for this purpose.
* No backups: users are responsible for managing the backups of their data directories.

### Scratch directories

There are two different types of scratch disk: **local** (``/scratch``) and **shared** (``/shared-scratch``).
Specific details of each type are described below.

Usually, **shared** scratch is used by jobs running on multiple nodes which need access to a common shared space
for creating temporary files, while **local** scratch should be used by jobs that only need node-local space for temporary files.

**Local** scratch on the Merlin6 computing nodes provides a huge number of IOPS thanks to NVMe technology,
while **shared** scratch, despite also being very fast, is an external GPFS storage with higher latency.

``/shared-scratch`` is only mounted on the *Merlin6* computing nodes, and its current size is 50TB. It can be increased in the future whenever necessary.

A summary of the scratch directories is shown below:

| Cluster | Service        | Scratch      | Scratch Mountpoint | Shared Scratch | Shared Scratch Mountpoint | Comments                               |
| ------- | -------------- | ------------ | ------------------ | -------------- | ------------------------- | -------------------------------------- |
| merlin5 | computing node | 50GB / SAS   | ``/scratch``       | ``N/A``        | ``N/A``                   | ``merlin-c-[01-64]``                   |
| merlin6 | login node     | 100GB / SAS  | ``/scratch``       | ``N/A``        | ``N/A``                   | ``merlin-l-0[1,2]``                    |
| merlin6 | computing node | 1.3TB / NVMe | ``/scratch``       | 50TB / GPFS    | ``/shared-scratch``       | ``merlin-c-[001-022,101-122,201-222]`` |
| merlin6 | login node     | 2.0TB / NVMe | ``/scratch``       | ``N/A``        | ``N/A``                   | ``merlin-l-00[1,2]``                   |

#### Scratch directories policy

* Read **[Important: Code of Conduct](## Important: Code of Conduct)** for more information about Merlin6 policies.
* By default, *always* use **local** scratch first, and only use **shared** scratch if your specific use case needs a shared scratch area.
* Temporary files *must be deleted at the end of the job by the user*, as sketched in the example below.
  * Remaining files will be deleted by the system if detected.

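A minimal sketch of a job script that follows this policy, creating its own temporary directory on the local scratch disk and removing it when the job ends (``my_program`` and its options are placeholders for your actual workload):

```bash
#!/bin/bash
# Create a private temporary directory on the node-local scratch disk
TMPDIR=$(mktemp -d "/scratch/${USER}_XXXXXX")

# Remove the directory when the script exits, even on failure or interruption
trap 'rm -rf "$TMPDIR"' EXIT

# Run the actual workload, writing temporary files to $TMPDIR only
my_program --tmpdir "$TMPDIR" --output "/data/user/$USER/results"
```
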
---