---
title: Merlin6 Data Directories
last_updated: 18 June 2019
sidebar: merlin6_sidebar
permalink: /merlin6/data-directories.html
---
## Merlin6 directory structure
Merlin6 contains the following directories available to users:
`/psi/home/<username>`
: Private user home directory

`/data/user/<username>`
: Private user data directory

`/data/project/general/<projectname>`
: Shared project directory
  - For BIO experiments, a dedicated `/data/project/bio/<projectname>` directory exists.

`/scratch`
: Local scratch disk

`/shared-scratch`
: Shared scratch disk
A summary of the quotas and policies for each directory:
| Directory | Block Quota [Soft:Hard] | File Quota [Soft:Hard] | Quota Change Policy: Block | Quota Change Policy: Files | Backup | Backup Policy |
|---|---|---|---|---|---|---|
| /psi/home/$username | USR [10GB:11GB] | Undef | Up to x2 when strictly justified. | N/A | yes | Daily snapshots for 1 week |
| /data/user/$username | USR [1TB:1.074TB] | USR [1M:1.1M] | Immutable: a project is required for more space. | Changeable when justified. | no | Users responsible for backup |
| /data/project/bio/$projectname | GRP [1TB:1.074TB] | GRP [1M:1.1M] | Subject to project requirements. | Subject to project requirements. | no | Project responsible for backup |
| /data/project/general/$projectname | GRP [1TB:1.074TB] | GRP [1M:1.1M] | Subject to project requirements. | Subject to project requirements. | no | Project responsible for backup |
| /scratch | Undef | Undef | N/A | N/A | no | N/A |
| /shared-scratch | Undef | Undef | N/A | N/A | no | N/A |
## User home directory
Home directories are part of the PSI NFS Central Home storage provided by AIT. However, administration of the Merlin6 NFS homes is delegated to the Merlin6 administrators.
This is the default directory users land in when logging in to any Merlin6 machine. It is mounted on the login and computing nodes under the directory:

`/psi/home/$username`
Users can check their quota by running the following command:
`quota -s`
### Home directory policy
- Read **Important: Code of Conduct** for more information about Merlin6 policies.
- It is forbidden to use the home directories for IO-intensive tasks.
  - Use `/scratch`, `/shared-scratch`, `/data/user` or `/data/project` for this purpose.
- Users can recover lost data from the automatic daily snapshots, which are kept for 1 week. A recovery sketch is shown after this list. Snapshots are found in the following directory:

  `/psi/home/.snapshot/$username`
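As an illustration, lost data can be restored by copying it back out of a snapshot. A minimal sketch, assuming the snapshots are exposed as dated subdirectories; the subdirectory name `daily.2019-06-17` and the file `notes.txt` are hypothetical examples:

```bash
# List the available snapshots of your home directory
ls /psi/home/.snapshot/$USER

# Copy a lost file back from a snapshot into your home directory.
# The dated subdirectory and the file name are hypothetical;
# use the names listed by the previous command.
cp /psi/home/.snapshot/$USER/daily.2019-06-17/notes.txt ~/notes.txt
```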
## User data directory
User data directories are part of the Merlin6 storage cluster, whose technology is based on GPFS.

The user data directory is intended for fast IO access and for keeping large amounts of private data. It is mounted on the login and computing nodes under the directory:

`/data/user/$username`
Users can check their quota by running the following command:
`mmlsquota -u <username> --block-size auto merlin-user`
### User data directory policy
- Read **Important: Code of Conduct** for more information about Merlin6 policies.
- It is forbidden to use the data directories as a `scratch` area during a job's runtime.
  - Use `/scratch` or `/shared-scratch` for this purpose.
- No backup policy is applied to user data directories: users are responsible for backing up their data. A backup sketch is shown after this list.
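For instance, important results can be mirrored to external storage with standard tools such as `rsync`. A minimal sketch, where the destination host, path and the `results/` subdirectory are hypothetical placeholders:

```bash
# Mirror the results/ subdirectory of the user data directory to
# external storage over SSH. The destination host and path are
# hypothetical placeholders; replace them with storage you control.
rsync -avh --progress \
    /data/user/$USER/results/ \
    backuphost.example.com:/backups/$USER/results/
```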
## Project data directory
Project data directories are part of the Merlin6 storage cluster, whose technology is based on GPFS.

This storage is intended for fast IO access and for keeping large amounts of private data, but also for sharing data amongst the different users of a project. Creating a project is the way users can expand their storage space, and it optimizes the use of the storage (for instance, by avoiding data duplicated across different users).

Using a project is highly recommended when multiple persons are involved in the same project and manage similar or common data. Quotas are defined on a group and fileset basis: a Unix group must exist for a specific project, or must be created for any new project. Contact the Merlin6 administrators for more information about this.

The project data directory is mounted on the login and computing nodes under the directory:

`/data/project/$projectname`
Users can check the project quota by running the following command:
`mmrepquota merlin-proj:$projectname`
### Project directory policy
- Read **Important: Code of Conduct** for more information about Merlin6 policies.
- It is forbidden to use the data directories as a `scratch` area during a job's runtime.
  - Use `/scratch` or `/shared-scratch` for this purpose.
- No backups: users are responsible for managing the backups of their data directories.
## Scratch directories
There are two different types of scratch disk: local (`/scratch`) and shared (`/shared-scratch`).
Specific details of each type are described below.

Usually, the shared scratch is used by jobs running on multiple nodes that need access to a common shared space for creating temporary files, while the local scratch should be used by jobs needing node-local space for temporary files.

The local scratch in the Merlin6 computing nodes provides a huge number of IOPS thanks to its NVMe technology, while the shared scratch, despite also being very fast, is an external GPFS storage with higher latency.
`/shared-scratch` is only mounted on the Merlin6 computing nodes, and its current size is 50TB. It can be increased in the future whenever necessary.
A summary of the scratch directories:
| Cluster | Service | Scratch | Scratch Mountpoint | Shared Scratch | Shared Scratch Mountpoint | Comments |
|---|---|---|---|---|---|---|
| merlin5 | computing node | 50GB / SAS | /scratch | N/A | N/A | merlin-c-[01-64] |
| merlin6 | login node | 100GB / SAS | /scratch | N/A | N/A | merlin-l-0[1,2] |
| merlin6 | computing node | 1.3TB / NVMe | /scratch | 50TB / GPFS | /shared-scratch | merlin-c-[001-022,101-122,201-222] |
| merlin6 | login node | 2.0TB / NVMe | /scratch | N/A | N/A | merlin-l-00[1,2] |
### Scratch directories policy
- Read **Important: Code of Conduct** for more information about Merlin6 policies.
- By default, always use the local scratch first; only use the shared scratch if your specific use case needs a shared scratch area.
- Temporary files must be deleted at the end of the job by the user (see the sketch after this list).
  - Remaining files will be deleted by the system if detected.
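To illustrate this policy, the sketch below shows a Slurm batch script that creates its temporary files on the local scratch and removes them when the job ends. It is an illustrative example, not an official template: the program `my_simulation` is a hypothetical placeholder, and a multi-node job would use `/shared-scratch` instead of `/scratch`.

```bash
#!/bin/bash
#SBATCH --job-name=scratch-demo
#SBATCH --time=01:00:00

# Private temporary directory on the node-local scratch;
# $SLURM_JOB_ID keeps concurrent jobs from colliding
TMPDIR="/scratch/$USER/$SLURM_JOB_ID"
mkdir -p "$TMPDIR"

# Remove the temporary files when the job ends, even on failure,
# as the scratch directories policy requires
trap 'rm -rf "$TMPDIR"' EXIT

# 'my_simulation' is a hypothetical program that writes its
# temporary data under the given directory
my_simulation --tmpdir "$TMPDIR"
```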