---
title: Transferring Data
#tags:
keywords: transferring data, data transfer, rsync, winscp, copy data, copying, sftp, import, export, hop, vpn
last_updated: 24 August 2023
#summary: ""
sidebar: merlin7_sidebar
permalink: /merlin7/transfer-data.html
---

## Overview

Most data transfer methods support both sending and receiving, so you can initiate the transfer from either **Merlin** or the other system, depending on **network visibility**.

- **From PSI Network to Merlin:** Merlin login nodes are visible from the PSI network, so direct transfers using `rsync` or **FTP** are generally preferable. Transfers **from Merlin7 to PSI may require special firewall rules**.
- **From Merlin to the Internet:** Merlin login nodes can access the internet with a **limited set of protocols**:
    - HTTP-based protocols on ports `80` or `443` (e.g., HTTPS, WebDAV).
    - Other protocols (e.g., SSH, FTP, rsync daemon mode) require admin configuration, may only work with specific hosts, and might need new firewall rules.
- **From the Internet to PSI:** Systems outside PSI can access the [PSI Data Transfer Service](https://www.psi.ch/en/photon-science-data-services/data-transfer) at `datatransfer.psi.ch` using SSH-based protocols or [Globus](https://www.globus.org/).

> SSH-based protocols using port `22` **to most PSI servers** are generally **not permitted**.
> * However, **transfers from any PSI host to Merlin7 using port 22 are allowed**.
>
> Port `21` is also available for FTP transfers from PSI to Merlin7.

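Because the right method depends on what your machine can actually reach, it can help to probe the relevant ports before starting a transfer. The following is a minimal sketch, not an official PSI tool: `port_open` and `check_ports` are hypothetical helper names, and the sketch assumes `bash` (for its `/dev/tcp` redirection) and the coreutils `timeout` command are available.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; they are not part of Merlin7.

# Succeeds if a TCP connection to host:port can be opened within 3 seconds.
port_open() {
    local host=$1 port=$2
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Reports the status of each listed port on a host.
#   usage: check_ports <host> <port>...
check_ports() {
    local host=$1 port
    shift
    for port in "$@"; do
        if port_open "$host" "$port"; then
            echo "${host}:${port} reachable"
        else
            echo "${host}:${port} blocked or unreachable"
        fi
    done
}
```

For example, `check_ports login001.merlin7.psi.ch 22` tests SSH access to a login node, and `check_ports service03.merlin7.psi.ch 21` tests FTP access. An open port only shows that the firewall lets the connection through; it does not confirm that your account is authorized.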
### Choosing the best transfer method

| **Scenario**                                      | **Recommended Method**                                                                        | **Reason**                                                                          |
| ------------------------------------------------- | --------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------- |
| Small dataset, Linux/macOS                        | `rsync`                                                                                       | Resume support, skips existing files, works over SSH                                |
| Quick one-time small transfer                     | `scp`                                                                                         | Simple syntax, no need to install extra tools                                       |
| Large dataset, high speed needed (not sensitive)  | FTP via `service03.merlin7.psi.ch`                                                            | Fastest transfer speed (unencrypted data channel)                                   |
| Large dataset, high speed needed (sensitive data) | FTP via `ftp-encrypted.merlin7.psi.ch`                                                        | Encrypted control & data channels for security, but slower than `service03`         |
| Windows interactive GUI transfer                  | WinSCP                                                                                        | User-friendly interface, available in PSI Software Kiosk, supports drag-and-drop    |
| Cross-platform interactive GUI transfer           | FileZilla                                                                                     | User-friendly interface, works on Linux/macOS/Windows, supports drag-and-drop       |
| From the internet to PSI                          | [PSI Data Transfer Service](https://www.psi.ch/en/photon-science-data-services/data-transfer) | Supports SSH-based protocols and Globus                                             |
| Need for sharing large files                      | [SWITCHfilesender](https://filesender.switch.ch/filesender2/?s=upload)                        | Supports sharing large files with an expiration date                                |
| PSI -> Merlin7 over FTP                           | Any FTP-based client                                                                          | Port 21 allowed from PSI to Merlin7                                                 |
| PSI -> Merlin7 over SSH                           | Any SSH-based method                                                                          | Port 22 allowed from PSI to Merlin7                                                 |

The following sections describe the transfer methods available on Merlin7 in detail.

## Direct Transfer via Merlin7 Login Nodes

The following methods transfer data directly via the [login nodes](../01-Quick-Start-Guide/accessing-interactive-nodes.md#login-nodes-hardware-description). They are suitable for use from **within the PSI network**.

### Rsync (Recommended for Linux/macOS)

Rsync is the **preferred** method for small datasets from Linux/macOS systems. It supports **resuming interrupted transfers** and **skips already transferred files**. Syntax:

```bash
# -a archive mode, -v verbose, -A ACLs, -H hard links, -X extended attributes, -S sparse files
rsync -avAHXS <src> <dst>
```

**An example** for transferring local files to a Merlin project directory:

```bash
rsync -avAHXS ~/localdata $USER@login001.merlin7.psi.ch:/data/project/general/myproject/
```

!!! tip
    If a transfer is interrupted, just rerun the command: `rsync` will skip existing files.

!!! warning
    Rsync uses SSH (port 22). For large datasets, transfer speed might be limited.

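Rerunning an interrupted transfer by hand can be automated with a small wrapper. The sketch below is illustrative only: `retry` is a hypothetical helper, and the retry count and pause are arbitrary values rather than Merlin7 recommendations. The rsync options `--dry-run` (show what would be copied without transferring anything) and `--partial` (keep partially transferred files so a rerun can continue them) are standard rsync flags.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command up to <attempts> times,
# sleeping <pause> seconds between failed attempts.
#   usage: retry <attempts> <pause> <command> [args...]
retry() {
    local attempts=$1 pause=$2 i
    shift 2
    for ((i = 1; i <= attempts; i++)); do
        "$@" && return 0
        echo "attempt ${i}/${attempts} failed" >&2
        (( i < attempts )) && sleep "$pause"
    done
    return 1
}

# Typical use (commented out so that sourcing this file transfers nothing):
#
#   # 1. Preview the transfer without copying any data.
#   rsync -avAHXS --dry-run ~/localdata "$USER@login001.merlin7.psi.ch:/data/project/general/myproject/"
#
#   # 2. Run the transfer, retrying up to 5 times with a 30 s pause.
#   retry 5 30 rsync -avAHXS --partial ~/localdata "$USER@login001.merlin7.psi.ch:/data/project/general/myproject/"
```

Unless SSH keys or a similar non-interactive login are set up, each retry may prompt for authentication again, so the loop is mainly useful for unattended transfers.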
### SCP

SCP works similarly to `rsync` but **does not support resuming** interrupted transfers. It may be used for quick one-off transfers. Example:

```bash
scp ~/localfile.txt $USER@login001.merlin7.psi.ch:/data/project/general/myproject/
```

### Secure FTP

A `vsftpd` service is available on the login nodes, providing high-speed transfers. Choose the server based on your **speed vs. encryption** needs:

* **`ftp-encrypted.merlin7.psi.ch`:** Encrypted control & data channels.
  **Use if your data is sensitive**. **Slower**, but secure.
* **`service03.merlin7.psi.ch`:** Encrypted control channel only.
  Use if your data can be transferred unencrypted. **Fastest** method.

!!! tip
    The **control channel** is always **encrypted**, so authentication is always protected.

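As an illustration of the two endpoints, the sketch below prints (but does not run) a `curl` upload command for each mode. It rests on assumptions that are not confirmed by this page: that the servers accept explicit FTP over TLS on port 21 and that `curl` is an acceptable client. `ftp_upload_cmd` is a hypothetical helper name. The curl options themselves are standard: `--ssl-reqd` requires TLS for the connection, `--ftp-ssl-control` requires TLS for the login only and sends the data in clear, and `-T` uploads a file.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a curl upload command for the chosen endpoint.
#   usage: ftp_upload_cmd encrypted|fast <local-file> <remote-dir>
ftp_upload_cmd() {
    local mode=$1 file=$2 remote_dir=$3 host tls_flag
    case $mode in
        encrypted)  # TLS on both control and data channels
            host=ftp-encrypted.merlin7.psi.ch
            tls_flag=--ssl-reqd ;;
        fast)       # TLS on the control channel only, data sent in clear
            host=service03.merlin7.psi.ch
            tls_flag=--ftp-ssl-control ;;
        *)
            echo "usage: ftp_upload_cmd encrypted|fast <local-file> <remote-dir>" >&2
            return 2 ;;
    esac
    # An absolute remote directory yields "//" after the host, which curl
    # interprets as an absolute path on the server. The trailing slash makes
    # curl keep the local file name. curl prompts for the password.
    printf 'curl %s --user "$USER" -T %q ftp://%s/%s/\n' \
        "$tls_flag" "$file" "$host" "${remote_dir%/}"
}
```

Calling `ftp_upload_cmd encrypted results.tar /data/project/general/myproject` prints a command that you can review and then paste into your shell. The project path is the same example directory used in the rsync section.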
## UI-based Clients for Data Transfer

### WinSCP (Windows)

Available in the **Software Kiosk** on PSI Windows machines.

* Using your PSI credentials, connect as follows:
    * when using port 22, connect to `login001.merlin7.psi.ch` or `login002.merlin7.psi.ch`.
    * when using port 21 (FTP), connect to:
        * `ftp-encrypted.merlin7.psi.ch`: **Fast** transfer rates. **Both** control and data **channels encrypted**.
        * `service03.merlin7.psi.ch`: **Fastest** transfer rates, but **data channel not encrypted**.
* Drag and drop files between your PC and Merlin.

### FileZilla (Linux/macOS/Windows)

Download from [FileZilla Project](https://filezilla-project.org/), or install from your Linux software repositories if available.

* Using your PSI credentials, connect as follows:
    * when using port 22, connect to `login001.merlin7.psi.ch` or `login002.merlin7.psi.ch`.
    * when using port 21 (FTP), connect to:
        * `ftp-encrypted.merlin7.psi.ch`: **Fast** transfer rates. **Both** control and data **channels encrypted**.
        * `service03.merlin7.psi.ch`: **Fastest** transfer rates, but **data channel not encrypted**.
* Supports drag-and-drop file transfers.

## Sharing Files with SWITCHfilesender

**[SWITCHfilesender](https://filesender.switch.ch/filesender2/?s=upload)** is a Swiss-hosted installation of the [FileSender](https://filesender.org/) project, a web-based application that allows authenticated users to securely and easily send **arbitrarily large files** to other users. Features:

- **Secure large file transfers:** Send files that exceed normal email attachment limits.
- **Time-limited availability:** Files are automatically deleted after the chosen expiration date or number of downloads.
- **Voucher system:** Authenticated users can send upload vouchers to external recipients without an account.
- **Designed for research & education:** Developed to meet the needs of universities and research institutions.

Authentication:

- It uses **SimpleSAMLphp**, which supports multiple authentication mechanisms: SAML2, LDAP, RADIUS and more.
- It is fully integrated with PSI's **Authentication and Authorization Infrastructure (AAI)**.
- PSI employees can log in using their PSI account:
    1. Open [SWITCHfilesender](https://filesender.switch.ch/filesender2/?s=upload).
    2. Select **PSI** as the institution.
    3. Authenticate with your PSI credentials.

The service is designed to **send large files for temporary availability**, not as a permanent publishing platform. Typical use case:

1. Upload a file.
2. Share the download link with a recipient.
3. The file remains available until the specified **expiration date** or the **download limit** is reached.
4. The file is then **automatically deleted**.

!!! warning
    SWITCHfilesender **is not** a long-term storage or archiving solution.

## PSI Data Transfer

Since August 2024, Merlin has been connected to the **[PSI Data Transfer](https://www.psi.ch/en/photon-science-data-services/data-transfer)** service,
`datatransfer.psi.ch`. This is a central service managed by the **[Linux team](https://linux.psi.ch/index.html)**. However, any problems or questions related to it can be
[reported](../99-support/contact.md) directly to the Merlin administrators, who will forward the request if necessary.

The PSI Data Transfer servers support the following protocols:

* Data Transfer - SSH (scp / rsync)
* Data Transfer - Globus

Note that `datatransfer.psi.ch` does not allow SSH login; only `rsync`, `scp` and [Globus](https://www.globus.org/) access is allowed.

Access to the PSI Data Transfer service uses ***multi-factor authentication*** (MFA).
The Microsoft Authenticator app is therefore required, as explained [here](https://www.psi.ch/en/computing/change-to-mfa).

!!! tip
    Please follow the [Official PSI Data
    Transfer](https://www.psi.ch/en/photon-science-data-services/data-transfer)
    documentation for further instructions.

## Connecting to Merlin7 from outside PSI

Merlin7 is fully accessible from within the PSI network. To connect from outside, you can use:

- [VPN](https://www.psi.ch/en/computing/vpn) ([alternate instructions](https://intranet.psi.ch/BIO/ComputingVPN))
- [SSH hop](https://www.psi.ch/en/computing/ssh-hop)
    * Please avoid transferring large amounts of data through **hop**
- [NoMachine](nomachine.md)
    * Remote Interactive Access through [**'nx.psi.ch'**](https://www.psi.ch/en/photon-science-data-services/remote-interactive-access)
    * Please avoid transferring large amounts of data through **NoMachine**

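For occasional interactive access or very small transfers through the SSH hop gateway, an OpenSSH `ProxyJump` entry avoids typing the jump host each time. The sketch below only prints a configuration stanza. The gateway name `hop.psi.ch` is an assumption that is not stated on this page, so verify it against the SSH hop documentation linked above. `merlin7_ssh_config` and the `merlin7` host alias are hypothetical names.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print an ~/.ssh/config stanza that reaches a
# Merlin7 login node through the SSH hop gateway.
#   usage: merlin7_ssh_config <psi-username> [hop-host]
merlin7_ssh_config() {
    local user=$1
    local hop=${2:-hop.psi.ch}   # ASSUMPTION: default gateway host name
    cat <<EOF
Host merlin7
    HostName login001.merlin7.psi.ch
    User ${user}
    ProxyJump ${user}@${hop}
EOF
}
```

After reviewing the output of `merlin7_ssh_config <psi-username>` and appending it to `~/.ssh/config`, `ssh merlin7` connects through the gateway. As noted above, large transfers should not go through hop; use the PSI Data Transfer service for those.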
{% comment %}
## Connecting from Merlin7 to outside file shares

### `merlin_rmount` command

Merlin provides a command for mounting remote file systems, called `merlin_rmount`. This
provides a helpful wrapper over the GNOME storage utilities, and supports a wide range of remote file systems, including

- SMB/CIFS (Windows shared folders)
- WebDAV
- AFP
- FTP, SFTP
- [others](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/using_the_desktop_environment_in_rhel_8/managing-storage-volumes-in-gnome_using-the-desktop-environment-in-rhel-8#gvfs-back-ends_managing-storage-volumes-in-gnome)

[More instructions on using `merlin_rmount`](merlin-rmount.md)
{% endcomment %}