Migration from merlin5 changes

Spencer Bliven
2019-06-17 15:13:27 +02:00
parent 3654238eb0
commit e8a60c28ff

@@ -14,10 +14,10 @@ permalink: /merlin6/migrating.html
| Cluster | Home Directory | User Home Directory | Group Home Directory |
| ------- |:-------------------- |:-------------------- |:---------------------------------------- |
-| merlin5 | /gpfs/home/$username | /gpfs/data/$username | /gpfs/group/$laboratory |
-| merlin6 | /psi/home/$username | /data/user/$username | /data/project/[general|bio]/$projectname |
+| merlin5 | /gpfs/home/_$username_ | /gpfs/data/_$username_ | /gpfs/group/_$laboratory_ |
+| merlin6 | /psi/home/_$username_ | /data/user/_$username_ | /data/project/_\[general\|bio\]_/_$projectname_ |
-### USR/GRP quota limits in Merlin6
+### User/Group quota limits in Merlin6
| Directory | Quota_Type [Soft:Hard] (Block) | Quota_Type [Soft:Hard] (Files) | Quota Change Policy: Block | Quota Change Policy: Files |
| ---------------------------------- | ------------------------------ | ------------------------------ |:--------------------------------------------- |:--------------------------------------------- |
@@ -29,25 +29,23 @@ permalink: /merlin6/migrating.html
where:
* **Block** is capacity size in GB and TB
* **Files** is number of files + directories in Millions (M)
-* User data directorry ``/data/user`` has a strict user block quota limit policy. If more disk space is required, 'project' must be created.
+* User data directory ``/data/user`` has a strict user block quota limit policy. If more disk space is required, 'project' must be created.
* Soft quotas can be exceeded for short periods of time. Hard quotas cannot be exceeded.
### Project directory
-#### Why 'project' would be needed?
+#### Why is 'project' needed?
-In Merlin5 the concept *project* did not exist. A similar concept (*group*) was existing and was mostly focused for BIO experiments.
+Merlin6 introduces the concept of a *project* directory. These are the recommended location for all scientific data.
-Quite often different users are working in *a similar* / *the same* project. Data was shared in different ways,
-such like by allowing other users to access private data, or by having duplicates on each user directory needing access to that data.
-This makes the storage usage unefficient and insecure.
-Also, there is another problem related to that: when a user leaves, we have plenty of data which needs to be kept and nobody becomes
-responsible for that. In addition, after several months user is unregistered from PSI and we end up with orphaned data which needs to
-be kept, but we sometimes loose track of the user.
-With that, we want to restrict the usage of individual data and bet for project (shared) data. There will be one main responsible for
-this project, but if for some reason this person leaves, responsible can be somebody else (successor if exists, supervisor, or in the
-worst case, the admin).
+- `/data/user` is not suitable for sharing data between users
+- The Merlin5 *group* directories were a similar concept, but the association with a single organizational group made
+interdepartmental sharing difficult. Projects can be shared by any PSI user.
+- Projects are shared by multiple users (at a minimum they should be shared with the supervisor/PI). This decreases
+the chance of data being orphaned by personnel changes.
+- Shared projects are preferable to individual data for transparency and accountability in event of future questions
+regarding the data.
+- One project member is designated as responsible. Responsibility can be transferred if needed.
#### Requesting a *project*