Transfer Data page rework

This commit is contained in:
2025-08-14 10:59:49 +02:00
parent 0f8c3bb7fe
commit cb5e3b24c1


@@ -10,66 +10,119 @@ permalink: /merlin7/transfer-data.html
## Overview ## Overview
Most methods allow data to be either transmitted or received, so it may make sense to Most data transfer methods support both sending and receiving, so you may initiate the transfer from either **Merlin** or the other system — depending on **network visibility**.
initiate the transfer from either merlin or the other system, depending on the network - **From PSI Network to Merlin:** Merlin login nodes are visible from the PSI network, so direct transfers using `rsync` or **FTP** are generally preferable. Transfers **from Merlin7 to PSI may require special firewall rules**.
visibility. - **From Merlin to the Internet:** Merlin login nodes can access the internet with a **limited set of protocols**:
- HTTP-based protocols on ports `80` or `443` (e.g., HTTPS, WebDAV).
- Other protocols (e.g., SSH, FTP, rsync daemon mode) require admin configuration, may only work with specific hosts, and might need new firewall rules.
- **From the Internet to PSI:** Systems outside PSI can access the [PSI Data Transfer Service](https://www.psi.ch/en/photon-science-data-services/data-transfer) at `datatransfer.psi.ch` using SSH-based protocols or [Globus](https://www.globus.org/).
- Merlin login nodes are visible from the PSI network, so direct data transfer > SSH-based protocols using port `22` **to most PSI servers** are generally **not permitted**.
(rsync/WinSCP/sftp) is generally preferable. > * However, **transfers from any PSI host to Merlin7 using port 22 are allowed**.
- Protocols from Merlin7 to PSI may require special firewall rules. >
- Merlin login nodes can access the internet using a limited set of protocols: > Port `21` is also available for FTP transfers from PSI to Merlin7.
- HTTP-based protocols using ports 80 or 443 (https, WebDav, etc)
- Protocols using other ports require admin configuration and may only work with
specific hosts, and may require new firewall rules (ssh, ftp, rsync daemons, etc).
- Systems on the internet can access the [PSI Data Transfer](https://www.psi.ch/en/photon-science-data-services/data-transfer) service
`datatransfer.psi.ch`, using ssh-based protocols and [Globus](https://www.globus.org/)
SSH-based protocols using port 22 (rsync-over-ssh, sftp, WinSCP, etc.) to most PSI servers are, in general, not permitted ### Choosing the best transfer method
## Direct transfer via Merlin7 login nodes | **Scenario** | **Recommended Method** | **Reason** |
| ------------------------------------------------- | --------------------------------------------------------------------------------------------- | -------------------------------------------------------- |
| Small dataset, Linux/macOS | `rsync` | Resume support, skips existing files, works over SSH |
| Quick one-time small transfer | `scp` | Simple syntax, no need to install extra tools |
| Large dataset, high speed needed (not sensitive) | FTP via `login002` | Fastest transfer speed (unencrypted data channel) |
| Large dataset, high speed needed (sensitive data) | FTP via `login001` | Encrypted control & data channels for security |
| Windows interactive GUI transfer | WinSCP | User-friendly interface, supports drag-and-drop |
| Cross-platform interactive GUI transfer | FileZilla | Works on Linux/macOS/Windows, supports both SSH and FTP |
| From the internet to PSI | [PSI Data Transfer Service](https://www.psi.ch/en/photon-science-data-services/data-transfer) | Supports SSH-based protocols and Globus |
| Need for sharing large files                      | [SWITCHfilesender](https://filesender.switch.ch/filesender2/?s=upload)                        | Supports sharing large files with an expiration date     |
| PSI -> Merlin7 over FTP (port 21) | Any FTP-based client | Port 21 allowed from PSI to Merlin7 |
| PSI -> Merlin7 over SSH (port 22) | Any SSH-based method | Port 22 allowed from PSI to Merlin7 |
The following methods transfer data directly via the [login The next chapters contain detailed information about the different transfer methods available on Merlin7.
nodes](/merlin7/interactive.html#login-nodes-hardware-description). They are suitable
for use from within the PSI network.
### Rsync ## Direct Transfer via Merlin7 Login Nodes
Rsync is the preferred method to transfer data from Linux/MacOS. It allows The following methods transfer data directly via the [login nodes](/merlin7/interactive.html#login-nodes-hardware-description). They are suitable for use from **within the PSI network**.
transfers to be easily resumed if they get interrupted. The general syntax is:
``` ### Rsync (Recommended for Linux/macOS)
Rsync is the **preferred** method for small datasets from Linux/macOS systems. It supports **resuming interrupted transfers** and **skips already transferred files**. Syntax:
```bash
rsync -avAHXS <src> <dst> rsync -avAHXS <src> <dst>
``` ```
For example, to transfer files from your local computer to a merlin project **An example** for transferring local files to a Merlin project directory:
directory:
``` ```bash
rsync -avAHXS ~/localdata $USER@login001.merlin7.psi.ch:/data/project/general/myproject/ rsync -avAHXS ~/localdata $USER@login001.merlin7.psi.ch:/data/project/general/myproject/
``` ```
{{site.data.alerts.tip}}
If a transfer is interrupted, just rerun the command: <code>rsync</code> will skip existing files.
{{site.data.alerts.end}}
{{site.data.alerts.warning}}
Rsync uses SSH (port 22). For large datasets, transfer speed might be limited.
{{site.data.alerts.end}}
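The "just rerun the command" advice can be automated with a small retry wrapper. This is a minimal sketch and not part of the Merlin tooling: the function name `retry` and the variable `RETRY_DELAY` are hypothetical, and the host and path in the usage comment are simply the ones from the example above.

```bash
#!/usr/bin/env bash

# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it exits successfully, at most MAX_ATTEMPTS times.
# RETRY_DELAY (seconds between attempts, default 10) is a hypothetical knob.
retry() {
  local max_attempts=$1
  shift
  local attempt=1
  until "$@"; do
    if (( attempt >= max_attempts )); then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    attempt=$(( attempt + 1 ))
    sleep "${RETRY_DELAY:-10}"
  done
}

# Usage (commented out: needs network access and PSI credentials):
# retry 5 rsync -avAHXS ~/localdata "$USER@login001.merlin7.psi.ch:/data/project/general/myproject/"
```

Because `rsync` skips files that have already arrived, each new attempt only transfers what is still missing.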
You can resume interrupted transfers by simply rerunning the command. Previously ### SCP
transferred files will be skipped.
SCP works similarly to `rsync` but **does not support resuming** interrupted transfers. It may be used for quick one-off transfers. Example:
```bash
scp ~/localfile.txt $USER@login001.merlin7.psi.ch:/data/project/general/myproject/
```
### WinSCP ### Secure FTP
A `vsftpd` service is available on the login nodes, providing high-speed transfers. Choose the server based on your **speed vs. encryption** needs:
* **`login001.merlin7.psi.ch`:** Encrypted control & data channels.
**Use if your data is sensitive**. **Slower**, but secure.
* **`login002.merlin7.psi.ch`:** Encrypted control channel only.
Use if your data can be transferred unencrypted. **Fastest** method.
The WinSCP tool can be used for remote file transfer on Windows. It is available {{site.data.alerts.tip}}
from the Software Kiosk on PSI machines. Add `login001.merlin7.psi.ch` or `login002.merlin7.psi.ch` The <b>control channel</b> is always <b>encrypted</b>, so authentication is always protected.
as a host and connect with your PSI credentials. You can then drag-and-drop files between your {{site.data.alerts.end}}
local computer and merlin.
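A command-line upload over FTP might look like the sketch below. Several things here are assumptions rather than documented Merlin behaviour: that `curl` is installed on the client, that the `vsftpd` service accepts explicit FTP over TLS (which `curl --ssl-reqd` requests), and the helper name `pick_ftp_host`, which is hypothetical and only encodes the speed vs. encryption choice described in this section.

```bash
#!/usr/bin/env bash

# pick_ftp_host MODE
# Prints the login node matching the requested trade-off:
#   sensitive -> login001 (encrypted control and data channels)
#   fast      -> login002 (encrypted control channel only)
pick_ftp_host() {
  case "$1" in
    sensitive) echo "login001.merlin7.psi.ch" ;;
    fast)      echo "login002.merlin7.psi.ch" ;;
    *)
      echo "usage: pick_ftp_host sensitive|fast" >&2
      return 2
      ;;
  esac
}

# Example upload (commented out: needs network access and PSI credentials).
# The double slash after the host makes the path absolute in a curl FTP URL;
# curl prompts for the password when only a user name is given.
# curl --ssl-reqd --user "$USER" -T ~/localfile.txt \
#   "ftp://$(pick_ftp_host sensitive)//data/project/general/myproject/"
```

If the server rejects the TLS options used here, check the client settings with the Merlin administrators rather than disabling encryption of the control channel.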
### SWITCHfilesender ## UI-based Clients for Data Transfer
### WinSCP (Windows)
**[SWITCHfilesender](https://filesender.switch.ch/filesender2/?s=upload)** is an installation of the FileSender project (filesender.org) which is a web based application that allows authenticated users to securely and easily send arbitrarily large files to other users. Available in the **Software Kiosk** on PSI Windows machines.
* Connect to `login001.merlin7.psi.ch` or `login002.merlin7.psi.ch` using your PSI credentials.
* Drag and drop files between your PC and Merlin.
Authentication of users is provided through SimpleSAMLphp, supporting SAML2, LDAP and RADIUS and more. Users without an account can be sent an upload voucher by an authenticated user. FileSender is developed to the requirements of the higher education and research community. **Supported protocols:** SSH (port 22), FTP (port 21)
The purpose of the software is to send a large file to someone, have that file available for download for a certain number of downloads and/or a certain amount of time, and after that automatically delete the file. The software is not intended as a permanent file publishing platform. ### FileZilla (Linux/MacOS/Windows)
**[SWITCHfilesender](https://filesender.switch.ch/filesender2/?s=upload)** is fully integrated with PSI, therefore, PSI employees can log in by using their PSI account (through Authentication and Authorization Infrastructure / AAI, by selecting PSI as the institution to be used for log in). Download from [FileZilla Project](https://filezilla-project.org/), or install from your Linux software repositories if available.
* Connect to `login001.merlin7.psi.ch` or `login002.merlin7.psi.ch` using your PSI credentials.
* Supports drag-and-drop file transfers.
**Supported protocols:** SSH (port 22), FTP (port 21)
## Sharing Files with SWITCHfilesender
**[SWITCHfilesender](https://filesender.switch.ch/filesender2/?s=upload)** is a Swiss-hosted installation of the [FileSender](https://filesender.org/) project — a web-based application that allows authenticated users to securely and easily send **arbitrarily large files** to other users. Features:
- **Secure large file transfers:** Send files that exceed normal email attachment limits.
- **Time-limited availability:** Files are automatically deleted after the chosen expiration date or number of downloads.
- **Voucher system:** Authenticated users can send upload vouchers to external recipients without an account.
- **Designed for research & education:** Developed to meet the needs of universities and research institutions.
About the authentication:
- It uses **SimpleSAMLphp**, supporting multiple authentication mechanisms: SAML2, LDAP, RADIUS and more.
- It's fully integrated with PSI's **Authentication and Authorization Infrastructure (AAI)**.
- PSI employees can log in using their PSI account:
1. Open [SWITCHfilesender](https://filesender.switch.ch/filesender2/?s=upload).
2. Select **PSI** as the institution.
3. Authenticate with your PSI credentials.
The service is designed to **send large files for temporary availability**, not as a permanent publishing platform. Typical use case:
1. Upload a file.
2. Share the download link with a recipient.
3. The file remains available until the **expiration date** or the **download limit** is reached.
4. The file is **automatically deleted** after expiration.
{{site.data.alerts.warning}}
SWITCHfilesender <b>is not</b> a long-term storage or archiving solution.
{{site.data.alerts.end}}
{% comment %}
## PSI Data Transfer ## PSI Data Transfer
From August 2024, Merlin is connected to the **[PSI Data Transfer](https://www.psi.ch/en/photon-science-data-services/data-transfer)** service, From August 2024, Merlin is connected to the **[PSI Data Transfer](https://www.psi.ch/en/photon-science-data-services/data-transfer)** service,
@@ -82,16 +135,6 @@ The PSI Data Transfer servers supports the following protocols:
Note that `datatransfer.psi.ch` does not allow SSH login; only `rsync`, `scp` and [Globus](https://www.globus.org/) access is allowed. Note that `datatransfer.psi.ch` does not allow SSH login; only `rsync`, `scp` and [Globus](https://www.globus.org/) access is allowed.
The following filesystems are mounted:
* `/merlin/export` which points to the `/export` directory in Merlin.
* `/merlin/data/experiment/mu3e` which points to the `/data/experiment/mu3e` directories in Merlin.
* Mu3e sub-directories are mounted in RW (read-write), except for `data` (read-only mounted)
* `/merlin/data/project/general` which points to the `/data/project/general` directories in Merlin.
* Owners of Merlin projects should request explicit access to it.
* Currently, only `CSCS` is available for transferring files between PizDaint/Alps and Merlin
* `/merlin/data/project/bio` which points to the `/data/project/bio` directories in Merlin.
* `/merlin/data/user` which points to the `/data/user` directories in Merlin.
Access to the PSI Data Transfer uses ***Multi factor authentication*** (MFA). Access to the PSI Data Transfer uses ***Multi factor authentication*** (MFA).
Therefore, having the Microsoft Authenticator App is required as explained [here](https://www.psi.ch/en/computing/change-to-mfa). Therefore, having the Microsoft Authenticator App is required as explained [here](https://www.psi.ch/en/computing/change-to-mfa).
@@ -99,63 +142,17 @@ Therefore, having the Microsoft Authenticator App is required as explained [here
<b><a href="https://www.psi.ch/en/photon-science-data-services/data-transfer">Official PSI Data Transfer</a></b> documentation for further instructions. <b><a href="https://www.psi.ch/en/photon-science-data-services/data-transfer">Official PSI Data Transfer</a></b> documentation for further instructions.
{{site.data.alerts.end}} {{site.data.alerts.end}}
### Directories
#### /merlin/data/user
User data directories are mounted in RW.
{{site.data.alerts.warning}}Please <b>ensure that permissions are properly secured</b> in your '/data/user'
directory. By default, when the directory is created, the system applies the most restrictive
permissions. However, this does not prevent users from changing permissions if they wish. At that
point, users become responsible for those changes.
{{site.data.alerts.end}}
#### /merlin/export
Transferring big amounts of data from outside PSI to Merlin is always possible through `/export`.
{{site.data.alerts.tip}}<b>The '/export' directory can be used by any Merlin user.</b>
This is configured in Read/Write mode. If you need access, please, contact the Merlin administrators.
{{site.data.alerts.end}}
{{site.data.alerts.warning}}The use of <b>export</b> as an extension of the quota <i>is forbidden</i>.
<br><b><i>Auto cleanup policies</i></b> in the <b>export</b> area apply for files older than 28 days.
{{site.data.alerts.end}}
##### Exporting data from Merlin
For exporting data from Merlin to outside PSI by using `/export`, one has to:
* From a Merlin login node, copy your data from any directory (i.e. `/data/project`, `/data/user`, `/scratch`) to
`/export`. Ensure to properly secure your directories and files with proper permissions.
* Once data is copied, from **`datatransfer.psi.ch`**, copy the data from `/merlin/export` to outside PSI
##### Importing data to Merlin
For importing data from outside PSI to Merlin by using `/export`, one has to:
* From **`datatransfer.psi.ch`**, copy the data from outside PSI to `/merlin/export`.
Ensure to properly secure your directories and files with proper permissions.
* Once data is copied, from a Merlin login node, copy your data from `/export` to any directory (i.e. `/data/project`, `/data/user`, `/scratch`).
#### Request access to your project directory
Optionally, instead of using `/export`, Merlin project owners can request Read/Write or Read/Only access to their project directory.
{{site.data.alerts.tip}}<b>Merlin projects can request direct access.</b>
This can be configured in Read/Write or Read/Only modes. If your project needs access, please, contact the Merlin administrators.
{{site.data.alerts.end}}
## Connecting to Merlin7 from outside PSI ## Connecting to Merlin7 from outside PSI
Merlin7 is fully accessible from within the PSI network. To connect from outside you can use: Merlin7 is fully accessible from within the PSI network. To connect from outside you can use:
- [VPN](https://www.psi.ch/en/computing/vpn) ([alternate instructions](https://intranet.psi.ch/BIO/ComputingVPN)) - [VPN](https://www.psi.ch/en/computing/vpn) ([alternate instructions](https://intranet.psi.ch/BIO/ComputingVPN))
- [SSH hop](https://www.psi.ch/en/computing/ssh-hop) - [SSH hop](https://www.psi.ch/en/computing/ssh-hop)
* Please avoid transferring large amounts of data through **hop** * Please avoid transferring large amounts of data through **hop**
- [No Machine](nomachine.md) - [No Machine](nomachine.md)
* Remote Interactive Access through [**'rem-acc.psi.ch'**](https://www.psi.ch/en/photon-science-data-services/remote-interactive-access) * Remote Interactive Access through [**'nx.psi.ch'**](https://www.psi.ch/en/photon-science-data-services/remote-interactive-access)
* Please avoid transferring large amounts of data through **NoMachine** * Please avoid transferring large amounts of data through **NoMachine**
{% comment %}
## Connecting from Merlin7 to outside file shares ## Connecting from Merlin7 to outside file shares
### `merlin_rmount` command ### `merlin_rmount` command
@@ -171,3 +168,4 @@ provides a helpful wrapper over the Gnome storage utilities, and provides suppor
[More instructions on using `merlin_rmount`](/merlin7/merlin-rmount.html) [More instructions on using `merlin_rmount`](/merlin7/merlin-rmount.html)
{% endcomment %} {% endcomment %}