first stab at mkdocs migration
refactor CSCS and Meg content
add merlin6 quick start
update merlin6 nomachine docs
give the userdoc its own color scheme
we use the Materials default one
refactored slurm general docs merlin6
add merlin6 JB docs
add software support m6 docs
add all files to nav
vibed changes #1
add missing pages
further vibing #2
vibe #3
further fixes
51
docs/meg/contact.md
Normal file
@@ -0,0 +1,51 @@
|
||||
# Support
|
||||
|
||||
Support can be requested through:
|
||||
|
||||
* [PSI Service Now](https://psi.service-now.com/psisp)
|
||||
* E-Mail: <meg-admins@lists.psi.ch>
|
||||
|
||||
Basic contact information is also displayed on every shell login to the system
|
||||
using the *Message of the Day* mechanism.
|
||||
|
||||
## PSI Service Now
|
||||
|
||||
**[PSI Service Now](https://psi.service-now.com/psisp)** is the official PSI tool for opening incident requests. However, contact via email (see below) is preferred.
|
||||
|
||||
* PSI HelpDesk will redirect the incident to the corresponding department, or
|
||||
* you can always assign it directly by checking the box `I know which service
|
||||
is affected` and providing the service name `Local HPC Resources (e.g. MEG)
|
||||
[CF]` (just type in `Local` and you should get the valid completions).
|
||||
|
||||
## Contact MEG Administrators
|
||||
|
||||
**E-Mail <meg-admins@lists.psi.ch>** or **<merlin-admins@lists.psi.ch>**
|
||||
|
||||
* This is the preferred way to contact MEG Administrators.
|
||||
Do not hesitate to contact us.
|
||||
|
||||
---
|
||||
|
||||
## Get updated through the Merlin User list
|
||||
|
||||
It is strongly recommended that users subscribe to the Merlin Users mailing list:
|
||||
**<merlin-users@lists.psi.ch>**
|
||||
|
||||
This mailing list is the official channel used by Merlin administrators to
|
||||
inform users about downtimes, interventions or problems. Users can be
|
||||
subscribed in two ways:
|
||||
|
||||
* *(Preferred way)* Self-registration through **[Sympa](https://psilists.ethz.ch/sympa/info/merlin-users)**
|
||||
* If you need to subscribe many people (e.g. your whole group), send a request to the admin list **<merlin-admins@lists.psi.ch>** and provide a list of email addresses.
|
||||
|
||||
---
|
||||
|
||||
## The MEG Cluster Team
|
||||
|
||||
The PSI Merlin and MEG clusters are managed by the **[High Performance
|
||||
Computing and Emerging technologies
|
||||
Group](https://www.psi.ch/de/lsm/hpce-group)**, which is part of the [Science
|
||||
IT Infrastructure and Services department (AWI)](https://www.psi.ch/en/awi) in
|
||||
PSI's [Center for Scientific Computing, Theory and Data
|
||||
(SCD)](https://www.psi.ch/en/csd).
|
||||
13
docs/meg/index.md
Normal file
@@ -0,0 +1,13 @@
|
||||
# The MEG local HPC cluster
|
||||
|
||||
> The MEG II collaboration includes almost 70 physicists from research
|
||||
> institutions from five countries. Researchers and technicians from PSI have
|
||||
> played a leading role, particularly with providing the high-quality beam,
|
||||
> technical support in the detector integration, and in the design, construction,
|
||||
> and operation of the detector readout electronics.
|
||||
>
|
||||
> —— [Source](https://www.psi.ch/en/cnm/news/in-search-of-new-physics-new-result-from-the-meg-ii-collaboration)
|
||||
|
||||
The MEG data analysis cluster is a cluster tightly coupled to Merlin and
|
||||
dedicated to the analysis of data from the MEG experiment. It is operated for the
Muon Physics group.
|
||||
200
docs/meg/migration-to-merlin7.md
Normal file
@@ -0,0 +1,200 @@
|
||||
# MEG to Merlin7 Migration Guide
|
||||
|
||||
Welcome to the official documentation for migrating experiment data from **MEG** to **Merlin7**. Please follow the instructions carefully to ensure a smooth and secure transition.
|
||||
|
||||
---
|
||||
|
||||
## Directory Structure Changes
|
||||
|
||||
### MEG vs Merlin6 vs Merlin7
|
||||
|
||||
| Cluster | Home Directory | User Data Directory | Experiment data | Additional notes |
|
||||
| ------- | :----------------- | :------------------ | --------------------- | ---------------- |
|
||||
| merlin6 | /psi/home/`$USER` | /data/user/`$USER` | /data/experiments/meg | Symlink /meg |
|
||||
| meg | /meg/home/`$USER` | N/A | /meg | |
|
||||
| merlin7 | /data/user/`$USER` | /data/user/`$USER` | /data/project/meg | |
|
||||
|
||||
* The **Merlin6 home and user data directories have been merged** into the single new home directory `/data/user/$USER` on Merlin7.
|
||||
* The same applies to the home directory on the MEG cluster, which must also be merged into `/data/user/$USER` on Merlin7.
|
||||
* Users are responsible for moving the data.
|
||||
* The **experiment directory has been integrated into `/data/project/meg`**.
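
As a rough illustration of the home-directory merge described above, the following sketch prints (without running) an `rsync` command that would copy a MEG home directory into a subfolder of the new Merlin7 home. The helper name `build_home_sync_cmd`, the `meg-home` subfolder, and the choice of `login001.merlin7.psi.ch` as transfer host are assumptions made for illustration; adapt them to your own setup.

```bash
# Hypothetical helper: prints an rsync command for review; it transfers nothing.
# Assumptions: the old home is /meg/home/<user>, the new home is /data/user/<user>,
# and the data lands in a 'meg-home' subfolder to avoid overwriting existing files.
build_home_sync_cmd() {
  local user="$1"
  local host="${2:-login001.merlin7.psi.ch}"
  # -a: archive mode, -H: preserve hard links, -v: verbose,
  # --dry-run: only report what would be copied
  printf 'rsync -aHv --dry-run /meg/home/%s/ %s:/data/user/%s/meg-home/\n' \
    "$user" "$host" "$user"
}
```

Running `build_home_sync_cmd "$USER"` on a MEG node prints the command. Once the reported changes look right, the same command without `--dry-run` performs the copy.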
|
||||
|
||||
### Recommended Cleanup Actions
|
||||
|
||||
* Remove unused files and datasets.
|
||||
* Archive large, inactive data sets.
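
To find candidates for cleanup or archiving, a `find` query along the following lines may help. This is only a sketch: the function name `list_stale_files` and the default thresholds (365 days, 1G) are arbitrary assumptions, not site policy.

```bash
# Hypothetical helper: list regular files that were not modified for more than
# <days> days and are larger than <min_size> (find(1) size syntax, e.g. 1G, 100M).
list_stale_files() {
  local dir="$1"
  local days="${2:-365}"
  local min_size="${3:-1G}"
  find "$dir" -type f -mtime +"$days" -size +"$min_size" -print
}
```

For example, `list_stale_files /meg/home/$USER 180 500M` lists files larger than 500M that have not changed in the last 180 days. Review the list before deleting or archiving anything.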
|
||||
|
||||
### Mandatory Actions
|
||||
|
||||
* Stop all activity on MEG and Merlin6 before performing the final rsync.
|
||||
|
||||
## Migration Instructions
|
||||
|
||||
### Preparation
|
||||
|
||||
The `experiment_migration.setup` script must be executed from **any MEG node** using the account that will perform the migration.
|
||||
|
||||
#### When using the local `root` account
|
||||
|
||||
* The script **must be executed after every reboot** of the destination nodes.
|
||||
* **Reason:** On Merlin7, the home directory for the `root` user resides on ephemeral storage (no physical disk).
|
||||
After a reboot, this directory is cleaned, so **SSH keys need to be redeployed** before running the migration again.
|
||||
|
||||
#### When using a PSI Active Directory (AD) account
|
||||
|
||||
* Applicable accounts include, for example:
    * `gac-meg2_data`
    * `gac-meg2`
* The script only needs to be executed **once**, provided that:
    * The home directory for the AD account is located on a shared storage area.
    * This shared storage is accessible from the node executing the transfer.
* **Reason:** On Merlin7, these accounts have their home directories on persistent shared storage, so the SSH keys remain available across reboots.
|
||||
|
||||
To run it:
|
||||
|
||||
```bash
|
||||
experiment_migration.setup
|
||||
```
|
||||
|
||||
This script will:
|
||||
|
||||
* Check that you have an account on Merlin7.
|
||||
* Configure and check that your environment is ready for transferring files via a Slurm job.
|
||||
|
||||
If there are issues, the script will:
|
||||
|
||||
* Print clear diagnostic output
|
||||
* Give you some hints to resolve the issue
|
||||
|
||||
If you are stuck, email: [merlin-admins@lists.psi.ch](mailto:merlin-admins@lists.psi.ch)/[meg-admins@lists.psi.ch](mailto:meg-admins@lists.psi.ch)
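
Since the setup depends on SSH keys being deployed, a quick manual check of key-based login can help narrow down problems before asking for help. The sketch below is not part of `experiment_migration.setup` and does not claim to reproduce what that script does internally; the function name `check_ssh_access` and the `SSH_CMD` override are assumptions for illustration.

```bash
# Hypothetical manual check, independent of experiment_migration.setup.
# SSH_CMD can be overridden (e.g. for testing); it defaults to the real ssh client.
check_ssh_access() {
  local host="${1:-login001.merlin7.psi.ch}"
  # BatchMode=yes disables password prompts, so this succeeds only with working keys.
  if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=10 "$host" true; then
    echo "OK: key-based SSH to $host works"
  else
    echo "FAIL: cannot log in to $host without a password" >&2
    return 1
  fi
}
```

If `check_ssh_access` fails when using the local `root` account after a reboot, rerunning the setup script is the first thing to try.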
|
||||
|
||||
### Migration Procedure
|
||||
|
||||
1. **Run an initial sync**, ideally within a `tmux` session
    * This copies the bulk of the data from MEG to Merlin7.
    * **IMPORTANT: Do not modify the destination directories.**
    * Before starting the transfer, ensure that:
        * The source and destination directories are correct.
        * The destination directories exist.
2. **Run additional syncs if needed**
    * Subsequent syncs can be executed to transfer changes.
    * Ensure that **only one sync for the same directory runs at a time**.
    * Multiple syncs are often required since the first one may take several hours or even days.
3. Schedule a date for the final migration:
    * All activity on the source directory must be stopped.
    * Likewise, the destination must not be used until the migration is complete.
4. **Perform a final sync with the `-E` option** (if it applies)
    * Use `-E` **only if you need to delete files on the destination that were removed from the source.**
    * This ensures the destination becomes an exact mirror of the source.
    * **Never use `-E` after the destination has gone into production**, as it will delete new data created there.
5. Disable access on the source folder.
6. Enable access on the destination folder.
    * At this point, **no further syncs should be performed.**
|
||||
|
||||
!!! note "Important"

    The `-E` option is destructive; handle with care.
    Always verify that the destination is ready before triggering the final sync.
    For optimal performance, use up to 12 threads with the `-t` option.
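
The phases of the procedure above differ only in whether `-E` is passed. The following sketch makes that explicit by printing the command for each phase. The helper `migration_cmd` and its phase names are hypothetical; only options documented in the script's `--help` output (`-t`, `-p`, `-E`) are used.

```bash
# Hypothetical helper: prints the command line for a given migration phase.
# Phases: initial | resync | final. Threads default to 12 (the recommended maximum).
migration_cmd() {
  local phase="$1" project="$2" threads="${3:-12}"
  case "$phase" in
    initial|resync)
      echo "experiment_migration.bash -t $threads -p $project" ;;
    final)
      # -E deletes destination files that no longer exist on the source.
      echo "experiment_migration.bash -t $threads -E -p $project" ;;
    *)
      echo "unknown phase: $phase" >&2
      return 1 ;;
  esac
}
```

A typical sequence would be to open a session with `tmux new -s meg-migration` (the session name is arbitrary), run the command printed by `migration_cmd initial online`, repeat it as needed, and use `migration_cmd final online` only on the agreed cut-over date.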
|
||||
|
||||
#### Running The Migration Script
|
||||
|
||||
The migration script is installed on the `meg-s-001` server at:
|
||||
`/usr/local/bin/experiment_migration.bash`
|
||||
|
||||
This script is primarily a **wrapper** around `fpsync`, providing additional logic for synchronizing MEG experiment data.
|
||||
|
||||
```bash
|
||||
[root@meg-s-001 ~]# experiment_migration.bash --help
|
||||
Usage: /usr/local/bin/experiment_migration.bash [options] -p <project_name>
|
||||
|
||||
Options:
|
||||
-t | --threads N Number of parallel threads (default: 10). Recommended 12 as max.
|
||||
-b | --experiment-src-basedir DIR Experiment base directory (default: /meg)
|
||||
-S | --space-source SPACE Source project space name (default: data1)
|
||||
-B | --experiment-dst-basedir DIR Experiment base directory (default: /data/project/meg)
|
||||
-D | --space-destination SPACE Destination project space name (default: data1)
|
||||
-p | --project-name PRJ_NAME Mantadory field. MeG project name. Examples:
|
||||
- 'online'
|
||||
- 'offline'
|
||||
- 'shared'
|
||||
-F | --force-destination-mkdir Create the destination parent directory (default: false)
|
||||
Example: mkdir -p $(dirname /data/project/meg/data1/PROJECT_NAME)
|
||||
Result: mkdir -p /data/project/meg/data1
|
||||
-s | --split N Number of files per split (default: 20000)
|
||||
-f | --filesize SIZE File size threshold (default: 100G)
|
||||
-r | --runid ID Reuse an existing runid session
|
||||
-l | --list-runids List available runid sessions and exit
|
||||
-x | --delete-runid Delete runid. Requires: -r | --runid ID
|
||||
-E | --rsync-delete-option [WARNING] Use this to delete files in the destination
|
||||
which are not present in the source any more.
|
||||
[WARNING] USE THIS OPTION CAREFULLY!
|
||||
Typically used in last rsync to have an exact
|
||||
mirror of the source directory.
|
||||
[WARNING] Some files in destination might be deleted!
|
||||
Use 'man fpsync' for more information.
|
||||
|
||||
-h | --help Show this help message
|
||||
-v | --verbose Run fpsync with -v option
|
||||
```
|
||||
|
||||
!!! tip

    Defaults can be updated if necessary.
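
The runid options in the help output (`-l`, `-r`, `-x`) can be combined roughly as sketched below. The wrapper `migrate` and its `DRY_RUN` switch are hypothetical and only print commands by default. Whether `-x` also needs `-p` is not stated in the help text, so the sketch passes only what the help lists as required.

```bash
# Hypothetical wrapper: prints commands unless DRY_RUN=0 is set.
migrate() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "experiment_migration.bash $*"
  else
    experiment_migration.bash "$@"
  fi
}

list_runids()  { migrate -l; }                 # list available runid sessions
resume_run()   { migrate -r "$1" -p "$2"; }    # reuse an existing runid for a project
delete_runid() { migrate -x -r "$1"; }         # -x requires -r
```

Here `RUNID` in a call such as `resume_run RUNID online` is a placeholder for an identifier shown by `list_runids`.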
|
||||
|
||||
#### Migration examples
|
||||
|
||||
##### Example: Migrating the Entire `online` Directory
|
||||
|
||||
The following example demonstrates how to migrate the **entire `online`** directory.
|
||||
|
||||
!!! tip

    You may also choose to migrate only specific subdirectories if needed.
    However, migrating full directories is generally **simpler** and **less
    error-prone** than handling multiple subdirectory migrations.
|
||||
|
||||
```bash
|
||||
[root@meg-s-001 ~]# experiment_migration.bash -S data1 -D data1 -p "online"
|
||||
🔄 Transferring project:
|
||||
From: /meg/data1/online
|
||||
To: login001.merlin7.psi.ch:/data/project/meg/data1/online
|
||||
Threads: 10 | Split: 20000 files | Max size: 100G
|
||||
RunID:
|
||||
|
||||
Please confirm to start (y/N):
|
||||
❌ Transfer cancelled by user.
|
||||
```
|
||||
|
||||
##### Example: Migrating a Specific Subdirectory
|
||||
|
||||
The following example demonstrates how to migrate **only a subdirectory**. Here, the `-F` option creates the parent directory on the destination so that it exists before the transfer starts:
|
||||
|
||||
⚠️ **Important:**
|
||||
|
||||
* When migrating a subdirectory, **do not** run concurrent migrations on its parent directories.
|
||||
* For example, avoid running migrations with `-p "shared"` while simultaneously migrating `-p "shared/subprojects"`.
|
||||
|
||||
```bash
|
||||
[root@meg-s-001 ~]# experiment_migration.bash -p "shared/subprojects/meg1" -F
|
||||
🔄 Transferring project:
|
||||
From: /meg/data1/shared/subprojects/meg1
|
||||
To: login002.merlin7.psi.ch:/data/project/meg/data1/shared/subprojects/meg1
|
||||
Threads: 10 | Split: 20000 files | Max size: 100G
|
||||
RunID:
|
||||
|
||||
Please confirm to start (y/N): N
|
||||
❌ Transfer cancelled by user.
|
||||
```
|
||||
|
||||
This command initiates the migration of the directory, first creating the destination parent directory (`-F` option):
|
||||
|
||||
* Creates the destination directory as follows:
|
||||
|
||||
```bash
|
||||
ssh login002.merlin7.psi.ch mkdir -p /data/project/meg/data1/shared/subprojects
|
||||
```
|
||||
|
||||
* Runs `fpsync` with 10 threads, splitting the data into N parts of at most 20000 files or 100G each:
|
||||
* Source: `/meg/data1/shared/subprojects/meg1`
|
||||
* Destination: `login002.merlin7.psi.ch:/data/project/meg/data1/shared/subprojects/meg1`
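
One possible way to enforce the rule that a directory and its parents are never migrated concurrently is to wrap each run in a lock keyed on the top-level project name. The sketch below uses `flock` from util-linux. The function names and the lock file location under `/tmp` are assumptions, and the lock only protects against concurrent runs started on the same node.

```bash
# Hypothetical locking wrapper; the lock file location is an assumption.
lock_path_for() {
  local project="$1"
  # Keep only the top-level component, so "shared" and
  # "shared/subprojects/meg1" map to the same lock file.
  printf '/tmp/meg-migration-%s.lock\n' "${project%%/*}"
}

run_locked() {
  local project="$1"; shift
  # flock -n exits non-zero immediately if another process holds the lock.
  flock -n "$(lock_path_for "$project")" "$@"
}
```

For example, `run_locked shared/subprojects/meg1 experiment_migration.bash -p shared/subprojects/meg1 -F` refuses to start while a run wrapped with `run_locked shared ...` is still active.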
|
||||