---
#tags:
keywords: meg, merlin6, merlin7, migration, fpsync, rsync
#summary: ""
sidebar: meg_sidebar
last_updated: 28 May 2025
permalink: /meg/migrating.html
---

# Meg to Merlin7 Migration Guide

Welcome to the official documentation for migrating experiment data from **MEG** to **Merlin7**. Please follow the instructions carefully to ensure a smooth and secure transition.

---

## Directory Structure Changes

### Meg vs Merlin6 vs Merlin7

| Cluster | Home Directory     | User Data Directory | Experiment data       | Additional notes |
| ------- | :----------------- | :------------------ | --------------------- | ---------------- |
| merlin6 | /psi/home/`$USER`  | /data/user/`$USER`  | /data/experiments/meg | Symlink /meg     |
| meg     | /meg/home/`$USER`  | N/A                 | /meg                  |                  |
| merlin7 | /data/user/`$USER` | /data/user/`$USER`  | /data/project/meg     |                  |

* The **Merlin6 home and user data directories have been merged** into the single new home directory `/data/user/$USER` on Merlin7.
* The same applies to the home directory on the meg cluster, which has to be merged into `/data/user/$USER` on Merlin7.
* Users are responsible for moving the data.
* The **experiment directory has been integrated into `/data/project/meg`**.

### Recommended Cleanup Actions

* Remove unused files and datasets.
* Archive large, inactive data sets.

### Mandatory Actions

* Stop all activity on Meg and Merlin6 when performing the last rsync.

## Migration Instructions

### Preparation

The `experiment_migration.setup` script must be executed from **any MeG node** using the account that will perform the migration.

#### When using the local `root` account

- The script **must be executed after every reboot** of the destination nodes.
- **Reason:** On Merlin7, the home directory for the `root` user resides on ephemeral storage (no physical disk). After a reboot, this directory is wiped, so **SSH keys need to be redeployed** before running the migration again.
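After a reboot of the destination nodes, a quick way to tell whether the keys are still in place is to attempt a non-interactive login. The following is only a sketch: it assumes that `login001.merlin7.psi.ch` (the destination host shown in the transfer output further down) is the node to test, and the `ssh_keys_ok` helper and `DEST_HOST` variable are hypothetical names introduced for illustration.

```bash
#!/usr/bin/env bash
# Assumption: DEST_HOST is the Merlin7 node used as the transfer destination.
DEST_HOST="${DEST_HOST:-login001.merlin7.psi.ch}"

# Returns 0 if a non-interactive (key-based) SSH login to the given host works.
# BatchMode=yes makes ssh fail instead of prompting for a password.
ssh_keys_ok() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true 2>/dev/null
}

if ssh_keys_ok "$DEST_HOST"; then
    echo "Key-based login to $DEST_HOST works; no redeployment needed."
else
    echo "Key-based login to $DEST_HOST failed; rerunning the setup script."
    experiment_migration.setup
fi
```

If the check fails for a reason other than missing keys (for example, the node is still booting), the diagnostic output of `experiment_migration.setup` should point to the actual problem.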
#### When using a PSI Active Directory (AD) account

- Applicable accounts include, for example:
  - `gac-meg2_data`
  - `gac-meg2`
- The script only needs to be executed **once**, provided that:
  - The home directory for the AD account is located on a shared storage area.
  - This shared storage is accessible from the node executing the transfer.
- **Reason:** On Merlin7, these accounts have their home directories on persistent shared storage, so the SSH keys remain available across reboots.

To run it:

```bash
experiment_migration.setup
```

This script will:

* Check that you have an account on Merlin7.
* Configure your environment and check that it is ready for transferring files via a Slurm job.

If there are issues, the script will:

* Print clear diagnostic output.
* Give you some hints to resolve the issue.

If you are stuck, email [merlin-admins@lists.psi.ch](mailto:merlin-admins@lists.psi.ch) or [meg-admins@lists.psi.ch](mailto:meg-admins@lists.psi.ch).

### Migration Procedure

1. **Run an initial sync**, ideally within a `tmux` session
   * This copies the bulk of the data from MeG to Merlin7.
   * **IMPORTANT: Do not modify the destination directories.**
   * Before starting the transfer, ensure that:
     * The source and destination directories are correct.
     * The destination directories exist.
2. **Run additional syncs if needed**
   * Subsequent syncs can be executed to transfer changes.
   * Ensure that **only one sync for the same directory runs at a time**.
   * Multiple syncs are often required, since the first one may take several hours or even days.
3. Schedule a date for the final migration:
   * All activity on the source directory must be stopped.
   * Likewise, no activity is allowed on the destination until the migration is complete.
4. **Perform a final sync with the `-E` option** (if it applies)
   * Use `-E` **only if you need to delete files on the destination that were removed from the source.**
   * This ensures the destination becomes an exact mirror of the source.
   * **Never use `-E` after the destination has gone into production**, as it will delete new data created there.
5. Disable access on the source folder.
6. Enable access on the destination folder.
   * At this point, **no further syncs should be performed.**

> ⚠️ **Important Notes**
>
> * The `-E` option is destructive; handle with care.
> * Always verify that the destination is ready before triggering the final sync.
> * For optimal performance, use up to 12 threads with the `-t` option.

#### Running The Migration Script

The migration script is installed on the `meg-s-001` server at: `/usr/local/bin/experiment_migration.bash`

This script is primarily a **wrapper** around `fpsync`, providing additional logic for synchronizing MeG experiment data.

```bash
[root@meg-s-001 ~]# experiment_migration.bash --help
Usage: /usr/local/bin/experiment_migration.bash [options] -p

Options:
  -t | --threads N                      Number of parallel threads (default: 10). Recommended 12 as max.
  -b | --experiment-src-basedir DIR     Experiment base directory (default: /meg)
  -S | --space-source SPACE             Source project space name (default: data1)
  -B | --experiment-dst-basedir DIR     Experiment base directory (default: /data/project/meg)
  -D | --space-destination SPACE        Destination project space name (default: data1)
  -p | --project-name PRJ_NAME          Mantadory field. MeG project name. Examples:
                                        - 'online'
                                        - 'offline'
                                        - 'shared'
  -F | --force-destination-mkdir        Create the destination parent directory (default: false)
                                        Example: mkdir -p $(dirname /data/project/meg/data1/PROJECT_NAME)
                                        Result:  mkdir -p /data/project/meg/data1
  -s | --split N                        Number of files per split (default: 20000)
  -f | --filesize SIZE                  File size threshold (default: 100G)
  -r | --runid ID                       Reuse an existing runid session
  -l | --list-runids                    List available runid sessions and exit
  -x | --delete-runid                   Delete runid. Requires: -r | --runid ID
  -E | --rsync-delete-option            [WARNING] Use this to delete files in the destination which are not present in the source any more.
                                        [WARNING] USE THIS OPTION CAREFULLY! Typically used in last rsync to have an exact mirror of the source directory.
                                        [WARNING] Some files in destination might be deleted! Use 'man fpsync' for more information.
  -h | --help                           Show this help message
  -v | --verbose                        Run fpsync with -v option
```

> Defaults can be updated if necessary.

#### Migration examples

##### Example: Migrating the Entire `online` Directory

The following example demonstrates how to migrate the **entire `online`** directory.

{{site.data.alerts.tip}}
You may also choose to migrate only specific subdirectories if needed. However, migrating full directories is generally simpler and less error-prone than handling multiple subdirectory migrations.
{{site.data.alerts.end}}

```bash
[root@meg-s-001 ~]# experiment_migration.bash -S data1 -D data1 -p "online"
🔄 Transferring project:
  From: /meg/data1/online
  To: login001.merlin7.psi.ch:/data/project/meg/data1/online
  Threads: 10 | Split: 20000 files | Max size: 100G
  RunID:

Please confirm to start (y/N):
❌ Transfer cancelled by user.
```

##### Example: Migrating a Specific Subdirectory

The following example demonstrates how to migrate **only a subdirectory**. In this case, the `-F` option is used to create the parent directory on the destination, so that it exists before the transfer starts.

⚠️ **Important:**

- When migrating a subdirectory, **do not** run concurrent migrations on its parent directories.
- For example, avoid running migrations with `-p "shared"` while simultaneously migrating `-p "shared/subprojects"`.

```bash
[root@meg-s-001 ~]# experiment_migration.bash -p "shared/subprojects/meg1" -F
🔄 Transferring project:
  From: /meg/data1/shared/subprojects/meg1
  To: login002.merlin7.psi.ch:/data/project/meg/data1/shared/subprojects/meg1
  Threads: 10 | Split: 20000 files | Max size: 100G
  RunID:

Please confirm to start (y/N): N
❌ Transfer cancelled by user.
```

This command migrates the directory after first creating the destination parent directory (`-F` option). It:

* Creates the destination parent directory as follows:

  ```bash
  ssh login002.merlin7.psi.ch mkdir -p /data/project/meg/data1/shared/subprojects
  ```

* Runs `fpsync` with 10 threads, splitting the transfer into N parts of at most 20000 files or 100G each:
  * Source: `/meg/data1/shared/subprojects/meg1`
  * Destination: `login002.merlin7.psi.ch:/data/project/meg/data1/shared/subprojects/meg1`
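
##### Example: Sketch of the Full Sync Sequence

The sequence described in the Migration Procedure (initial sync, repeated syncs, final sync with `-E`) can be sketched as a small dry-run helper. It uses only options listed in the help output (`-p`, `-t`, `-E`); the `sync_cmd` function, the `PROJECT` and `THREADS` variables, and the phase names are hypothetical and exist only for illustration. The helper prints the commands instead of running them, since each real run asks for confirmation and is best started by hand inside `tmux`.

```bash
#!/usr/bin/env bash
# Dry-run sketch: prints the sync command for each phase, does not execute it.

PROJECT="${PROJECT:-online}"   # hypothetical default; any value accepted by -p
THREADS="${THREADS:-12}"       # the guide recommends 12 threads as a maximum

# sync_cmd PHASE: print the command line for the given phase.
#   initial | resync : plain sync, deletes nothing on the destination
#   final            : adds -E so the destination mirrors the source (destructive)
sync_cmd() {
    local phase="$1"
    local cmd="experiment_migration.bash -p \"$PROJECT\" -t $THREADS"
    case "$phase" in
        initial|resync) echo "$cmd" ;;
        final)          echo "$cmd -E" ;;
        *)              echo "unknown phase: $phase" >&2; return 1 ;;
    esac
}

for phase in initial resync final; do
    printf '%-8s %s\n' "$phase:" "$(sync_cmd "$phase")"
done
```

Keeping `-E` confined to the `final` phase mirrors the rule above: it is only appropriate for the last sync, before the destination goes into production.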