
keywords: merlin6, merlin7, migration, fpsync, rsync
sidebar: merlin7_sidebar
last_updated: 27 May 2025
permalink: /merlin7/migrating.html

Merlin6 to Merlin7 Migration Guide

Welcome to the official documentation for migrating your data from Merlin6 to Merlin7. Please follow the instructions carefully to ensure a smooth and secure transition.

📅 Migration Schedule

🧍 Phase 1: Users without Projects — Deadline: July 1

If you do not belong to any Merlin project, you must complete your migration before July 1. This includes:

  • Users not in any group project (/data/project/general)
  • Users not in BIO, MEG, Mu3e
  • Users not part of PSI-owned private Merlin nodes (ASA, MEG, Mu3e)

Users are responsible for initiating and completing the migration process. Contact the Merlin support team at merlin-admins@lists.psi.ch if you need help.

⚠️ In this phase, it's important that you don't belong to any project. Once the migration is finished, access to Merlin6 will no longer be possible.

👥 Phase 2: Project Members and Owners — Start Before August 1

For users in active projects:

  • Project owners and members will be contacted by the Merlin admins.
  • Migration will be scheduled individually per project.
  • Expect contact before August 1.

⚠️ In this phase, group owners and members will also be asked to migrate.


🗂️ Directory Structure Changes

Merlin6 vs Merlin7

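| Cluster | Home Directory   | User Data Directory | Projects       | Experiments       |
|---------|------------------|---------------------|----------------|-------------------|
| merlin6 | /psi/home/$USER  | /data/user/$USER    | /data/project/ | /data/experiments |
| merlin7 | /data/user/$USER | /data/user/$USER    | /data/project/ | /data/project/    |
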
  • The home directory and user data directory have been merged into /data/user/$USER.

  • The experiments directory has been integrated into /data/project/:

    • /data/project/general contains general Merlin7 projects.
    • Other subdirectories are used for large-scale projects such as CLS division, Mu3e, and MEG.
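
To make the new layout concrete, the following sketch maps a Merlin6 path to the place where its copy is expected on Merlin7, based on the ~/merlin6home and ~/merlin6data directories described in this guide. The function name merlin7_path is hypothetical and not part of the migration tools.

```shell
# Sketch only: map a Merlin6 path to where its copy is expected on Merlin7.
# merlin7_path is a hypothetical helper, not part of the migration scripts.
# Assumes the Merlin7 home is /data/user/<user>, as stated in the table above.
merlin7_path() {
    src=$1
    user=${2:-$USER}
    case "$src" in
        /psi/home/"$user")
            printf '%s\n' "/data/user/$user/merlin6home" ;;
        /psi/home/"$user"/*)
            printf '%s\n' "/data/user/$user/merlin6home/${src#"/psi/home/$user/"}" ;;
        /data/user/"$user")
            printf '%s\n' "/data/user/$user/merlin6data" ;;
        /data/user/"$user"/*)
            printf '%s\n' "/data/user/$user/merlin6data/${src#"/data/user/$user/"}" ;;
        *)
            echo "no known mapping for $src" >&2
            return 1 ;;
    esac
}
```

For example, `merlin7_path /psi/home/$USER/.bashrc` would print the location of the copied file under merlin6home. Project and experiment paths are deliberately not handled, since their migration is scheduled separately.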

Step-by-Step Migration Instructions

📋 Prerequisites and Preparation

Before starting the migration, make sure you:

  • Are registered on Merlin7.

  • Have cleaned up your data to reduce migration time and space usage.

  • Have a total usage on Merlin6 that is well below the 1TB quota. Remember:

    • Merlin7 also has a 1TB quota, and you might already have data there.
    • If your usage exceeds this during the transfer, the process might fail.
  • Remove unused files and datasets.

  • Archive large, inactive data sets.

  • Delete or clean up unused conda or virtualenv Python environments:

    • These are often large and may not work as-is on Merlin7.

    • You can export your conda environments with:

      conda env export -n myenv > $HOME/myenv.yml
      
    • Then recreate them later on Merlin7.

🧹 You can always remove more old data after migration — it will be copied into ~/merlin6data and ~/merlin6home on Merlin7.
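
As a rough way to check the quota point above before starting, the sketch below adds the size of the data to be copied to the size of the data already on the target and compares the sum with the quota. Both helper names are hypothetical, and treating 1TB as 10^12 bytes is an assumption; the migration scripts perform their own quota check.

```shell
# Sketch only: estimate whether a copy will fit under a quota.
# The 1 TB limit comes from this guide; treating it as 10^12 bytes
# (976562500 KiB) is an assumption. Both helpers are hypothetical.
QUOTA_KIB=${QUOTA_KIB:-976562500}

# Disk usage of a directory in KiB (du -k is POSIX).
usage_kib() {
    du -sk "$1" 2>/dev/null | awk '{print $1}'
}

# fits_quota <incoming_kib> <existing_kib>
# Succeeds if the combined size stays below the quota.
fits_quota() {
    [ $(( $1 + $2 )) -lt "$QUOTA_KIB" ]
}
```

A possible use is to run `usage_kib /data/user/$USER` on each cluster, then pass the two numbers to `fits_quota`. Note that du can take a long time on large directory trees.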


⚙️ Step 1: Run merlin7_migration.setup

Log into Merlin7 and run:

merlin7_migration.setup

This script will:

  • Check that you have an account on Merlin7.

  • Configure and check that your environment is ready for transferring files via Slurm jobs.

  • Create two directories:

    • ~/merlin6data → copy of your old /data/user
    • ~/merlin6home → copy of your old home

⚠️ Important: If ~/merlin6home or ~/merlin6data already exist, the script will exit. Please remove them or contact support.
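
Since the setup script exits when either target directory already exists, a quick pre-check can save a failed run. This is a minimal sketch; check_migration_dirs is a hypothetical helper and does not reproduce the script's own checks.

```shell
# Sketch only: report leftover target directories before running
# merlin7_migration.setup. check_migration_dirs is a hypothetical helper.
check_migration_dirs() {
    base=${1:-$HOME}
    status=0
    for name in merlin6data merlin6home; do
        if [ -e "$base/$name" ]; then
            echo "exists: $base/$name"
            status=1
        fi
    done
    return "$status"
}
```

Running `check_migration_dirs` on Merlin7 prints nothing and returns success when neither directory is present.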

If there are issues, the script will:

  • Print clear diagnostic output
  • Give you some hints to resolve the issue

If you are stuck, email: merlin-admins@lists.psi.ch


📦 Step 2: Run merlin7_migration.start

After setup completes, start the migration by running:

merlin7_migration.start

This script will:

  • Check the status of your quota on Merlin6.

  • Submit Slurm batch jobs to the xfer partition

  • Queue two jobs:

    • migrate_merlin6data.batch (data dir)
    • migrate_merlin6home.batch (home dir)
      • This job will only start if migrate_merlin6data.batch has successfully finished.
  • Automatically track the job IDs

  • Print log file locations for the different jobs

If something goes wrong:

  • Check the job logs to find the cause of the failure and fix it.
  • See ⚠️ Common rsync/fpsync Migration Issues for solutions to the most common problems.
  • If migrate_merlin6data.batch fails, migrate_merlin6home.batch will be cancelled.
  • Once the issues are fixed, restart the migration with merlin7_migration.start
    • The migration will then continue.
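
When a job fails, a first pass over the .err log can point to the relevant troubleshooting section. The sketch below searches for a few standard error strings produced by the operating system and rsync; whether the migration logs contain exactly these strings is an assumption, and scan_xfer_log is a hypothetical helper.

```shell
# Sketch only: classify common failure messages in a transfer log.
# The patterns are standard OS/rsync error texts; their exact wording
# in the migration logs is an assumption.
scan_xfer_log() {
    log=$1
    found=0
    if grep -q 'Permission denied' "$log"; then
        echo "permission problem: see 'File Permission Denied'"
        found=1
    fi
    if grep -q 'Disk quota exceeded' "$log"; then
        echo "quota problem: see 'Exceeded Disk Quota'"
        found=1
    fi
    if grep -q 'skipping non-regular file' "$log"; then
        echo "special files present: see 'Special Files'"
        found=1
    fi
    [ "$found" -eq 1 ] || echo "no known error pattern found"
}
```

Pass it one of the log paths printed by merlin7_migration.start, for example a file under /data/software/xfer_logs/$USER/. If no pattern matches, read the log in full.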

⚠️ Once both transfers succeed, your access to Merlin6 will be revoked. Do not attempt to reconnect to Merlin6 after this.


📊 Step 3: Monitor Transfer Jobs

To monitor your transfer jobs, run:

squeue -M merlin6 -u $USER -p xfer

Check the output to ensure your jobs are:

  • Running (R), completing (CG), or completed (no longer listed in the queue)
  • Not failed (F), timed out (TO), or stuck

You can also check logs (as printed by the script) to verify job completion.

When /data/user/$USER and /psi/home/$USER on Merlin6 are no longer accessible, migration is complete.
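
To read the state codes more easily, the squeue output can be piped through a small filter. This is a sketch that assumes the default squeue column layout shown in this guide (job ID in column 1, name in column 3, state in column 5); summarize_xfer_jobs is a hypothetical helper.

```shell
# Sketch only: annotate squeue output with plain-language job states.
# Reads the default squeue format on stdin; summarize_xfer_jobs is hypothetical.
summarize_xfer_jobs() {
    awk '
        # Skip the cluster banner and the column header.
        $1 == "CLUSTER:" || $1 == "JOBID" { next }
        NF >= 5 {
            state = $5
            if      (state == "PD") desc = "pending"
            else if (state == "R")  desc = "running"
            else if (state == "CG") desc = "completing"
            else if (state == "CD") desc = "completed"
            else if (state == "F")  desc = "FAILED - check the job logs"
            else if (state == "TO") desc = "TIMED OUT - check the job logs"
            else                    desc = "state " state
            print $1, $3, desc
        }
    '
}
```

Usage: `squeue -M merlin6 -u $USER -p xfer | summarize_xfer_jobs`.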


💡 Examples

Set Up the Migration

merlin7_migration.setup

Expected output:

✅ login002.merlin7.psi.ch                                                   
✅ `$USER` is a member of svc-cluster_merlin7
✅ Skipping key generation                                                   
✅ SSH key already added to agent.                                           
✅ SSH ID successfully copied to login00[1|2].merlin7.psi.ch.
✅ Test successful.
✅ /data/software/xfer_logs/caubet_m created.
✅ ~/merlin6data directory created.
✅ ~/merlin6home directory created.

Start the Migration

merlin7_migration.start

Expected output:

(base)[caubet_m@merlin-l-001:/data/software/admin/scripts/merlin-user-tools/alps(master)]# ./merlin7_migration.start
✅ Quota check passed.
Used: 512 GB, 234001 files

###################################################
Submitting transfer jobs to Slurm

   Job logs can be found here:
➡️  Directory '/data/user/caubet_m' does NOT have 000 permissions. Transfer pending, continuing...
✅ Submitted DATA_MIGRATION job: 24688554. Sleeping 3 seconds...
   - /data/user transfer logs:
     - /data/software/xfer_logs/caubet_m/data-24688554.out
     - /data/software/xfer_logs/caubet_m/data-24688554.err
➡️  Directory '/psi/home/caubet_m' does NOT have 000 permissions. Transfer pending, continuing...
✅ Submitted HOME_MIGRATION job with dependency on 24688554: 24688555. Sleeping 3 seconds...
   - /psi/home transfer logs:
     - /data/software/xfer_logs/caubet_m/home-24688555.out
     - /data/software/xfer_logs/caubet_m/home-24688555.err

✅ You can start manually a monitoring window with:
   tmux new-session -d -s "xfersession" "watch 'squeue -M merlin6 -u caubet_m -p xfer'"
   tmux attach -t "xfersession"

✅ FINISHED - PLEASE CHECK JOB TRANSFER PROGRESS

Monitor Progress

squeue -M merlin6 -u $USER -p xfer

Output:

$ squeue -M merlin6 -u $USER -p xfer
CLUSTER: merlin6
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
          24688581      xfer HOME_MIG caubet_m PD       0:00      1 (Dependency)
          24688580      xfer DATA_MIG caubet_m  R       0:22      1 merlin-c-017

⚠️ Common rsync/fpsync Migration Issues

File Permission Denied

  • Cause: Files or directories are not readable by the user running the transfer.

  • Solution: Fix source-side permissions:

    chmod -R u+rX /path/to/file_or_dir
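
To find the affected entries before running chmod, a search along these lines may help. It is a sketch that only inspects the owner permission bits, so ACLs and files owned by other users are not covered; list_unreadable is a hypothetical helper.

```shell
# Sketch only: list entries whose owner permission bits would block a copy.
# Directories need read+execute (octal 500), other files need read (octal 400).
# list_unreadable is a hypothetical helper.
list_unreadable() {
    find "$1" \( \( -type d ! -perm -500 \) -o \( ! -type d ! -perm -400 \) \) -print
}
```

For example, `list_unreadable /data/user/$USER` on Merlin6 prints one path per line; an empty result means no entry was flagged.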
    

Ownership Mismatches

  • Cause: Source files are owned by another user (e.g. root or a collaborator).

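  • Solution:

    • Change ownership before migration. Changing the owner of files that belong to someone else normally requires administrator privileges, so contact the Merlin admins if the command below fails:

      chown -R $USER /path/to/file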
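
Files owned by someone else can be located beforehand with find. This is a minimal sketch; list_foreign_owned is a hypothetical helper, and the owner defaults to the user running it.

```shell
# Sketch only: list entries under a directory that are not owned by the
# given user (default: the current user). list_foreign_owned is hypothetical.
list_foreign_owned() {
    owner=${2:-$(id -un)}
    find "$1" ! -user "$owner" -print
}
```

For example, `list_foreign_owned /data/user/$USER` prints nothing when every entry belongs to you.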
      

Special Files (e.g. device files, sockets)

  • Cause: rsync tries to copy UNIX sockets, device files, or FIFOs.
  • Effect: Errors or incomplete copies.
  • Solution: Delete such files before migrating so that they are not transferred.
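
Sockets, FIFOs, and device files can be listed with find before deciding what to delete. This is a sketch; list_special_files is a hypothetical helper.

```shell
# Sketch only: list sockets (s), FIFOs (p), and block/character devices (b, c).
# list_special_files is a hypothetical helper.
list_special_files() {
    find "$1" \( -type s -o -type p -o -type b -o -type c \) -print
}
```

Review the output of `list_special_files /data/user/$USER` before removing anything, since a listed socket may belong to a program that is still running.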

Exceeded Disk Quota

  • Cause: Combined size of existing + incoming data exceeds 1TB quota on Merlin7.
  • Effect: Transfer stops abruptly.
  • Solution: Clean up or archive non-essential data before migration.

Very Small Files or Large Trees → Many Small rsync Calls

  • Cause: Directory with thousands/millions of small files.

  • Effect: Transfer is slow or hits process limits.

  • Solution: Consider archiving to .tar.gz before transferring:

    tar -czf myenv.tar.gz myenv/
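
To decide which directories are worth archiving, it can help to rank the subdirectories by the number of files they contain. The sketch below assumes a find that supports -mindepth and -maxdepth (GNU find does); both function names are hypothetical.

```shell
# Sketch only: find the subdirectories that hold the most files.
# Both helpers are hypothetical; -mindepth/-maxdepth assume GNU find.

# Number of regular files below a directory.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# "<count> <path>" for each direct subdirectory, largest count first.
rank_subdirs_by_files() {
    find "$1" -mindepth 1 -maxdepth 1 -type d | while IFS= read -r sub; do
        printf '%s %s\n' "$(count_files "$sub")" "$sub"
    done | sort -rn
}
```

For example, `rank_subdirs_by_files /data/user/$USER | head` shows the ten subdirectories with the most files, which are the likeliest candidates for a tar archive.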
    

📬 Need Help?

If something doesn't work:

  • Re-run the scripts and check the logs carefully.
  • Use less, cat, or tail -f to view your job logs.
  • Contact the Merlin support team: 📧 merlin-admins@lists.psi.ch

We are here to help you migrate safely and efficiently.