Doc changes
pages/merlin6/99-support/migration-from-merlin5.md
@@ -0,0 +1,139 @@
---
title: Migration From Merlin5
#tags:
#keywords:
last_updated: 18 June 2019
#summary: ""
sidebar: merlin6_sidebar
permalink: /merlin6/migrating.html
---

## Directories

### Merlin5 vs Merlin6

| Cluster | Home Directory | User Data Directory | Group / Project Directory |
| ------- |:-------------------- |:-------------------- |:---------------------------------------- |
| merlin5 | /gpfs/home/_$username_ | /gpfs/data/_$username_ | /gpfs/group/_$laboratory_ |
| merlin6 | /psi/home/_$username_ | /data/user/_$username_ | /data/project/_\[general\|bio\]_/_$projectname_ |

### Quota limits in Merlin6

| Directory | Quota type [Soft:Hard] (Block) | Quota type [Soft:Hard] (Files) | Quota Change Policy: Block | Quota Change Policy: Files |
| ---------------------------------- | ------------------------------ | ------------------------------ |:--------------------------------------------- |:--------------------------------------------- |
| /psi/home/$username | USR [10GB:11GB] | *Undef* | Up to x2 when strictly justified. | N/A |
| /data/user/$username | USR [1TB:1.074TB] | USR [1M:1.1M] | Immutable. A project is needed for more space. | Changeable when justified. |
| /data/project/bio/$projectname | GRP+Fileset [1TB:1.074TB] | GRP+Fileset [1M:1.1M] | Changeable according to project requirements. | Changeable according to project requirements. |
| /data/project/general/$projectname | GRP+Fileset [1TB:1.074TB] | GRP+Fileset [1M:1.1M] | Changeable according to project requirements. | Changeable according to project requirements. |

where:

* **Block** is the capacity in GB and TB.
* **Files** is the number of files + directories in millions (M).
* **Quota types** are the following:
  * **USR**: Quota is set up individually per user name.
  * **GRP**: Quota is set up individually per Unix group name.
  * **Fileset**: Quota is set up per project root directory.
* The user data directory ``/data/user`` has a strict user block quota limit policy. If more disk space is required, a 'project' must be created.
* Soft quotas can be exceeded for short periods of time. Hard quotas cannot be exceeded.

### Project directory

#### Why is 'project' needed?

Merlin6 introduces the concept of a *project* directory. Project directories are the recommended location for all scientific data.

* `/data/user` is not suitable for sharing data between users.
* The Merlin5 *group* directories were a similar concept, but the association with a single organizational group made interdepartmental sharing difficult. Projects can be shared by any PSI user.
* Projects are shared by multiple users (at a minimum they should be shared with the supervisor/PI). This decreases the chance of data being orphaned by personnel changes.
* Shared projects are preferable to individual data for transparency and accountability in the event of future questions regarding the data.
* One project member is designated as responsible. Responsibility can be transferred if needed.

#### Requesting a *project*

Refer to [Requesting a project](/merlin6/request-project.html)

---

## Migration Schedule

### Phase 1 [June]: Pre-migration

* Users keep working on Merlin5.
  * Merlin5 production directories: ``/gpfs/home``, ``/gpfs/data``, ``/gpfs/group``
* Users may report any problems (quota limits, inaccessible files, etc.) to merlin-admins@lists.psi.ch
* Users can start migrating data (see [Migration steps](/merlin6/migrating.html#migration-steps)):
  * Users should copy their data from Merlin5 ``/gpfs/data`` to Merlin6 ``/data/user``
  * Users should copy their home from Merlin5 ``/gpfs/home`` to Merlin6 ``/psi/home``
* Users should inform the administrators when migration is done and which directories were migrated. Deletion of those directories can then be requested from the admins.

### Phase 2 [July-October]: Migration to Merlin6

* Merlin6 becomes the official cluster, and directories are switched to the new structure:
  * Merlin6 production directories: ``/psi/home``, ``/data/user``, ``/data/project``
  * Merlin5 directories remain available read-write (RW) on the login nodes: ``/gpfs/home``, ``/gpfs/data``, ``/gpfs/group``
  * On Merlin5 computing nodes, Merlin5 directories are mounted RW: ``/gpfs/home``, ``/gpfs/data``, ``/gpfs/group``
  * On Merlin5 computing nodes, Merlin6 directories are mounted RW: ``/psi/home``, ``/data/user``, ``/data/project``
* Users must migrate their data (see [Migration steps](/merlin6/migrating.html#migration-steps)).
  * ALL data must be migrated.
* Jobs are submitted to Merlin6 by default. Submission to Merlin5 computing nodes remains possible.
* Users should inform the administrators when migration is done and which directories were migrated. Deletion of those directories can then be requested from the admins.

### Phase 3 [November]: Merlin5 Decommission

* Old Merlin5 storage is unmounted.
* Migrated directories reported by users will be deleted.
* Remaining Merlin5 data will be archived.

---

## Migration steps

### Cleanup / Archive files

* Users must clean up and/or archive files according to the quota limits of the target storage.
* If extra space is needed, we advise users to request a [project](/merlin6/request-project.html).
* If you need a higher limit on the number of files, you can request an increase of your user quota.

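Before cleaning up, it can help to see how far a source directory is from the target limits. The following is a minimal sketch, not a Merlin tool: `check_quota_fit` is a hypothetical helper, the defaults approximate the ``/data/user`` soft limits from the quota table (about 1TB and 1M files + directories), and the exact units used by the quota system are an assumption.

```bash
#!/usr/bin/env bash
# Sketch only: check_quota_fit is a hypothetical helper, not a Merlin command.
# Usage: check_quota_fit <directory> [max_kib] [max_entries]

check_quota_fit() {
    local dir="$1"
    local max_kib="${2:-1000000000}"    # ~1TB in KiB (assumed unit)
    local max_entries="${3:-1000000}"   # 1M files + directories
    local used_kib entries

    # Disk usage in KiB and number of files + directories (the top directory included)
    used_kib=$(du -sk "$dir" | awk '{print $1}')
    entries=$(find "$dir" | wc -l | tr -d ' ')

    echo "blocks:  ${used_kib} KiB used, limit ${max_kib} KiB"
    echo "entries: ${entries} used, limit ${max_entries}"

    if [ "$used_kib" -le "$max_kib" ] && [ "$entries" -le "$max_entries" ]; then
        echo "OK: fits within the limits"
    else
        echo "TOO LARGE: clean up, archive, or request a project"
        return 1
    fi
}

# Example (placeholder path): check_quota_fit "/gpfs/data/$USER"
```

The function only reads the directory, so it is safe to run repeatedly while cleaning up. On very large trees `du` and `find` can themselves take a long time.
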
#### File list

### Step 1: Migrating

First migration:

```bash
# Generic form: rsync -avAHXS <source_merlin5> <destination_merlin6>
# Trailing slash on the source copies its contents, hidden files included ($username = your user name).
rsync -avAHXS /gpfs/data/$username/ /data/user/$username/
```

This can take several hours or days:

* You can run multiple ``rsync`` commands in parallel on sub-directories to increase the transfer rate.
* Please do not transfer too many directories concurrently: no more than 10 at a time.
* Other users may be doing the same, and this could cause storage / UI performance problems in the Merlin5 cluster.

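One possible way to parallelize per sub-directory is sketched below. It is an illustration under assumptions, not a prescribed procedure: `parallel_rsync` and the `RSYNC_FLAGS` override are hypothetical names, the default of 4 concurrent transfers is arbitrary, and the helper caps the job count at 10 to follow the advice above.

```bash
#!/usr/bin/env bash
# Sketch only: parallel_rsync is a hypothetical helper, not a Merlin command.
# Usage: parallel_rsync <source_dir> <destination_dir> [jobs]

parallel_rsync() {
    local src="${1%/}"
    local dst="${2%/}"
    local jobs="${3:-4}"
    # Flags from this guide (without -v) by default; RSYNC_FLAGS is an assumed override.
    local flags="${RSYNC_FLAGS:--aAHXS}"

    # Never run more than 10 transfers at once
    if [ "$jobs" -gt 10 ]; then jobs=10; fi
    mkdir -p "$dst"

    # One rsync per immediate sub-directory, at most $jobs running concurrently.
    # $flags is left unquoted on purpose so that several flags split into words.
    find "$src" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -r -P "$jobs" -I{} rsync $flags {} "$dst/"

    # Final pass for entries sitting directly in $src; sub-directories are excluded.
    rsync $flags --exclude='*/' "$src/" "$dst/"
}

# Example (placeholder paths): parallel_rsync "/gpfs/data/$USER" "/data/user/$USER" 4
```

Splitting by top-level sub-directory only helps when the data is spread over several of them. A single huge sub-directory still ends up in one `rsync` process.
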
### Step 2: Mirroring

Once the first migration is done, a second ``rsync`` should be run with the ``--delete`` option. With this option, ``rsync`` deletes from the destination all files that were removed from the source, and also propagates new files from the source to the destination.

```bash
# Generic form: rsync -avAHXS --delete <source_merlin5> <destination_merlin6>
# Use the source directory with a trailing slash (not /*), so top-level removals are mirrored too.
rsync -avAHXS --delete /gpfs/data/$username/ /data/user/$username/
```

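Because ``--delete`` removes files from the destination, it may be worth previewing the run first. The sketch below uses the standard `rsync` options `-n` (`--dry-run`) and `-i` (`--itemize-changes`); `preview_mirror` and the `RSYNC_FLAGS` override are hypothetical names introduced for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: preview_mirror is a hypothetical helper, not a Merlin command.
# Usage: preview_mirror <source_dir> <destination_dir>

preview_mirror() {
    local src="${1%/}"
    local dst="${2%/}"
    # Flags from this guide (without -v) by default; RSYNC_FLAGS is an assumed override.
    local flags="${RSYNC_FLAGS:--aAHXS}"

    # -n changes nothing on disk; -i prints one line per entry that would change.
    # Entries that --delete would remove are listed with a "*deleting" prefix.
    # $flags is left unquoted on purpose so that several flags split into words.
    rsync $flags -n -i --delete "$src/" "$dst/"
}

# Example (placeholder paths): preview_mirror "/gpfs/data/$USER" "/data/user/$USER"
```

If the listing shows unexpected deletions, check that source and destination are not swapped before running the real mirroring command.
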
### Step 3: Removing / Archiving old data

#### Removing migrated data

Once you have verified that everything is migrated to the new storage, the data is ready to be deleted from the old storage. Users must report when the migration is finished and which directories are affected and ready to be removed.

Merlin administrators will remove the directories, always asking for a last confirmation.

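One way to check that a directory is fully migrated before reporting it is a checksum-based dry run. This is a sketch under assumptions rather than an official verification procedure: `verify_migration` and `RSYNC_FLAGS` are hypothetical names, and `--checksum` reads every file on both sides, so it can be slow on large trees.

```bash
#!/usr/bin/env bash
# Sketch only: verify_migration is a hypothetical helper, not a Merlin command.
# Usage: verify_migration <source_dir> <destination_dir>
# Returns 0 if in sync, 1 if differences remain, 2 if rsync itself failed.

verify_migration() {
    local src="${1%/}"
    local dst="${2%/}"
    # Flags from this guide (without -v) by default; RSYNC_FLAGS is an assumed override.
    local flags="${RSYNC_FLAGS:--aAHXS}"
    local pending

    # Dry run with --checksum: lists every entry that still differs, copies nothing.
    # $flags is left unquoted on purpose so that several flags split into words.
    pending=$(rsync $flags -n -i --checksum --delete "$src/" "$dst/") || return 2

    if [ -z "$pending" ]; then
        echo "In sync: $src -> $dst"
    else
        echo "Differences remain:"
        echo "$pending"
        return 1
    fi
}

# Example (placeholder paths): verify_migration "/gpfs/data/$USER" "/data/user/$USER"
```

An empty listing means `rsync` found nothing left to copy, update, or delete for the given flags.
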
#### Archiving data

Once all migrated data has been removed from the old storage, the remaining (non-migrated) data will be archived.