Compare commits
11 Commits
update_doc
...
main
| SHA1 |
|---|
| 1f40eb1334 |
| f57138a7b0 |
| 00aca09f94 |
| f8222e9607 |
| cc0423ce99 |
| a62b767a3a |
| e79ccd1b54 |
| 779a652f77 |
| a4bf00ece6 |
| 23d6f2d689 |
| 96c71dacf9 |
@@ -1,7 +1,7 @@
|
||||
---
|
||||
title: 2026/01 upgrade
|
||||
title: January 2026 upgrade
|
||||
---
|
||||
# 2026/01 upgrade
|
||||
# January 2026 upgrade
|
||||
|
||||
From the 5th of January to the 15th of January there will be a major upgrade of the data catalog infrastructure (SciCat - [https://discovery.psi.ch](https://discovery.psi.ch) and [https://dacat.psi.ch](https://dacat.psi.ch))
|
||||
|
||||
@@ -9,13 +9,13 @@ During this period, data archiving and retrieval will not be possible.
|
||||
|
||||
## Required changes
|
||||
|
||||
After the upgrade, few changes will be required:
|
||||
After the upgrade, a few changes will be required:
|
||||
|
||||
### CLI changes
|
||||
|
||||
If you are using pmodules, no changes are required, otherwise:
|
||||
|
||||
1. please download the latest CLI version (>=v3.0.0) by following the [download instructions](https://github.com/paulscherrerinstitute/scicat-cli?tab=readme-ov-file#manual-deployment-and-upgrade).
|
||||
1. Please download the latest released CLI version (>=v3.0.0) by following the [download instructions](https://github.com/paulscherrerinstitute/scicat-cli?tab=readme-ov-file#manual-deployment-and-upgrade).
|
||||
2. Modify your ingestion scripts, following the [v3 instructions](https://github.com/paulscherrerinstitute/scicat-cli?tab=readme-ov-file#v3-changes);
|
||||
3. Or download the [backwards compatible scripts](https://github.com/paulscherrerinstitute/scicat-cli?tab=readme-ov-file#backwards-compatibility-with-v2) (Linux only). Please note that these scripts will be discontinued later (~ Q2 2026).
|
||||
|
||||
@@ -24,7 +24,13 @@ If you are using pmodules, no changes are required, otherwise:
|
||||
Some APIs have been updated. This affects you only if you interact directly with the SciCat APIs.
|
||||
|
||||
1. You can already find and compare the new API specifications from our [QA environment](https://dacat-qa.psi.ch/explorer).
|
||||
2. Authorization update: API authorization now requires passing your token in the Authorization header, prefixed with Bearer, i.e. Bearer <MY_TOKEN>.
|
||||
2. Authorization update: API authorization now requires passing your token in the **Authorization header**, prefixed with Bearer, i.e. `Bearer <MY_TOKEN>`.
|
||||
3. Login endpoint: API login is now available at `/auth/login` rather than `users/login`.
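
As an illustration of the authorization change, the sketch below builds the `Authorization` header and attaches it to a `curl` request. It is a minimal sketch, not an official client: `auth_header` and `scicat_get` are hypothetical helper names, the `SCICAT_BASE_URL` variable is an assumption, and the QA base URL is used only as a default.

```sh
#!/bin/sh
# Build the header expected after the upgrade: "Authorization: Bearer <MY_TOKEN>".
auth_header() {
    printf 'Authorization: Bearer %s' "$1"
}

# Hypothetical helper: GET a path on the API with the token attached.
# SCICAT_BASE_URL and the path you pass in are assumptions; check the
# API explorer of your environment for the real routes.
scicat_get() {
    token=$1
    path=$2
    curl --silent --fail \
        --header "$(auth_header "$token")" \
        "${SCICAT_BASE_URL:-https://dacat-qa.psi.ch/api/v3}${path}"
}
```

A call would then look like `scicat_get "$MY_TOKEN" /some/path`, where the path is whatever endpoint the API specification lists.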
|
||||
|
||||
### UI changes
|
||||
|
||||
1. The sections menu is now on the left side of the screen
|
||||
2. The `publish` workflow has changed: all fields in the form are now mandatory, and publishing is now a three-step process (`save`, `publish`, `register`).
|
||||
|
||||
### Receiving update notifications
|
||||
|
||||
|
||||
Binary file not shown.
|
Before Width: | Height: | Size: 31 KiB After Width: | Height: | Size: 215 KiB |
@@ -1,7 +1,7 @@
|
||||
---
|
||||
title: Home
|
||||
---
|
||||
# :warning: Warning: Planned upgrade 2026/01 :warning:
|
||||
# :warning: Planned January 2026 upgrade :warning:
|
||||
|
||||
Please note that there will be a major upgrade that requires client changes. For details, please refer to the [upgrade page](./202601Upgrade.md).
|
||||
|
||||
|
||||
@@ -42,13 +42,6 @@ main steps in the lifecycle of the data management:
|
||||
- Publishing of datasets
|
||||
- Retention of datasets
|
||||
|
||||
Note: as of today (June 2021) the services can be only be used from
|
||||
within the PSI intranet with the exception of the published data,
|
||||
which is by definition publicly available. Although the service itself
|
||||
can be used from any operating system, the command line and
|
||||
GUI tools currently offered are available only for Linux and Windows
|
||||
platforms.
|
||||
|
||||
## The Concept of Datasets
|
||||
|
||||
For the following it is useful to have a better understanding of the
|
||||
@@ -127,6 +120,25 @@ first. Installation is described in the appendix Installation of Tools
|
||||
|
||||
## Ingest
|
||||
|
||||
### Important Update since January 2025
|
||||
|
||||
The SciCat stack has gone through a major upgrade, thus the command
|
||||
line syntax has changed.
|
||||
|
||||
The separate executables (like datasetIngestor, datasetRetriever...)
|
||||
were combined into one scicat-cli executable, with each executable's
|
||||
features available as commands given as the first parameter to this executable.
|
||||
|
||||
These commands bear the same names as the former executables.
|
||||
The general syntax change is that if you called
|
||||
./[COMMAND] [flags] before, now it's ./scicat-cli [COMMAND] [flags].
|
||||
|
||||
Furthermore, the use of single hyphen, multi-letter flags is now discontinued,
|
||||
as it went against general convention. So, in practical terms, -[long_flag_name]
|
||||
and --[long_flag_name] were both accepted, but now only the latter is accepted.
|
||||
|
||||
There are backward compatible scripts in the [github repo](https://github.com/paulscherrerinstitute/scicat-cli?tab=readme-ov-file#backwards-compatibility-with-v2).
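
To make the syntax change concrete, the sketch below rewrites an old-style command line into the new form: it prefixes `scicat-cli` and turns single-hyphen, multi-letter flags into double-hyphen flags. It is only an illustration of the rule described above, not a replacement for the backward compatible scripts; `to_v3` is a hypothetical helper name, and an argument such as a negative number (`-10`) would be rewritten incorrectly.

```sh
#!/bin/sh
# Print the v3 form of an old-style call, e.g.
#   datasetIngestor -ingest metadata.json
# becomes
#   scicat-cli datasetIngestor --ingest metadata.json
to_v3() {
    out="scicat-cli $1"          # the former executable becomes the command
    shift
    for arg in "$@"; do
        case $arg in
            --*)  out="$out $arg" ;;    # already a double-hyphen flag
            -??*) out="$out -$arg" ;;   # single hyphen, multi-letter: add a hyphen
            *)    out="$out $arg" ;;    # positional argument or single-letter flag
        esac
    done
    printf '%s\n' "$out"
}
```

For example, `to_v3 datasetRetriever -retrieve -chksum /tmp/out` prints the command in the new syntax without running it.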
|
||||
|
||||
### Important Update since April 14th 2022
|
||||
|
||||
For all commandline tools, like the datasetIngestor, datasetRetriever
|
||||
@@ -237,7 +249,7 @@ real life example from Bio department:
|
||||
|
||||
For manual creation of this file there are various helper tools
|
||||
available. One option is to use the ScicatEditor
|
||||
<https://bliven_s.gitpages.psi.ch/SciCatEditor/> for creating these
|
||||
<https://www.scicatproject.org/SciCatEditor/> for creating these
|
||||
metadata files. This is a browser-based tool specifically for
|
||||
ingesting PSI data. Using the tool avoids syntax errors and provides
|
||||
templates for common data sets and options. The finished JSON file can
|
||||
@@ -257,7 +269,7 @@ Linux type notation is used. For the changes which apply to Windows
|
||||
see the separate section below)
|
||||
|
||||
```sh
|
||||
datasetIngestor metadata.json
|
||||
scicat-cli datasetIngestor metadata.json
|
||||
```
|
||||
|
||||
It will ask for your PSI credentials and then print some info
|
||||
@@ -268,7 +280,7 @@ already provided in the metadata.json file. If there are no errors,
|
||||
proceed to the real ingestion:
|
||||
|
||||
```sh
|
||||
datasetIngestor --ingest metadata.json
|
||||
scicat-cli datasetIngestor --ingest metadata.json
|
||||
```
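
The two steps (dry run first, real ingestion second) can be chained in a small wrapper. This is a sketch under one assumption that the text above does not confirm: that the dry run exits with a non-zero status when it finds errors. `ingest_checked` and the `SCICAT_CLI` variable are hypothetical names introduced for this example.

```sh
#!/bin/sh
# Name of the executable; override it if scicat-cli is not on your PATH.
SCICAT_CLI=${SCICAT_CLI:-scicat-cli}

# Run the dry run and only start the real ingestion if it succeeded.
# Assumption: the dry run returns a non-zero exit status on errors.
ingest_checked() {
    "$SCICAT_CLI" datasetIngestor "$1" || return 1
    "$SCICAT_CLI" datasetIngestor --ingest "$1"
}
```

Calling `ingest_checked metadata.json` then behaves like typing the two commands one after the other.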
|
||||
|
||||
For particularly important datasets, you may also want to use the
|
||||
@@ -286,7 +298,7 @@ workstations/PCs are likely to fall in this category.
|
||||
There are more options for this command, just type
|
||||
|
||||
```sh
|
||||
datasetIngestor
|
||||
scicat-cli datasetIngestor
|
||||
```
|
||||
|
||||
to see a list of available options. In particular you can define
|
||||
@@ -303,7 +315,7 @@ For Windows you need execute the corresponding commands inside a
|
||||
powershell and use the binary files ending in .exe, e.g.
|
||||
|
||||
```sh
|
||||
datasetIngestor.exe -token SCICAT-TOKEN -user username:password -copy metadata.json
|
||||
scicat-cli.exe datasetIngestor --token SCICAT-TOKEN --user username:password --copy metadata.json
|
||||
```
|
||||
|
||||
For Windows systems you can only use personal accounts and the data is
|
||||
@@ -358,7 +370,7 @@ Triggering the copy to tape can be done in 3 ways. Either you do it
|
||||
automatically as part of the ingestion
|
||||
|
||||
```sh
|
||||
datasetIngestor --ingest --autoarchive metadata.json
|
||||
scicat-cli datasetIngestor --ingest --autoarchive metadata.json
|
||||
```
|
||||
|
||||
In this case directly after ingestion a job is created to copy the
|
||||
@@ -379,31 +391,14 @@ data is stored.
|
||||
A third option is to use a command line version datasetArchiver.
|
||||
|
||||
```console
|
||||
datasetArchiver [options] (ownerGroup | space separated list of datasetIds)
|
||||
scicat-cli datasetArchiver [options] (ownerGroup | space separated list of datasetIds)
|
||||
```
|
||||
|
||||
You must choose either an ownerGroup, in which case all archivable datasets
|
||||
of this ownerGroup not yet archived will be archived.
|
||||
Or you choose a (list of) datasetIds, in which case all archivable datasets
|
||||
of this list not yet archived will be archived.
|
||||
|
||||
List of options:
|
||||
|
||||
-devenv
|
||||
Use development environment instead or production
|
||||
-localenv
|
||||
Use local environment (local) instead or production
|
||||
-noninteractive
|
||||
Defines if no questions will be asked, just do it - make sure you know what you are doing
|
||||
-tapecopies int
|
||||
Number of tapecopies to be used for archiving (default 1)
|
||||
-testenv
|
||||
Use test environment (qa) instead or production
|
||||
-token string
|
||||
Defines optional API token instead of username:password
|
||||
-user string
|
||||
Defines optional username and password
|
||||
```
|
||||
|
||||
## Retrieve
|
||||
|
||||
Here we describe the retrieval via the command line tools. A retrieve
|
||||
@@ -429,39 +424,22 @@ minutes (e.g. for 1GB) up to days (e.g for 100TB)
|
||||
For the second step you can use the **datasetRetriever** command, which
|
||||
uses the rsync protocol to copy the data to your destination.
|
||||
|
||||
```console
|
||||
Tool to retrieve datasets from the intermediate cache server of the tape archive
|
||||
to the destination path on your local system.
|
||||
Run script with 1 argument:
|
||||
|
||||
datasetRetriever [options] local-destination-path
|
||||
```console
|
||||
scicat-cli datasetRetriever [options] local-destination-path
|
||||
```
|
||||
|
||||
Per default all available datasets on the retrieve server will be fetched.
|
||||
Use option -dataset or -ownerGroup to restrict the datasets which should be fetched.
|
||||
|
||||
-chksum
|
||||
Switch on optional chksum verification step (default no checksum tests)
|
||||
-dataset string
|
||||
Defines single dataset to retrieve (default all available datasets)
|
||||
-devenv
|
||||
Use development environment (default is to use production system)
|
||||
-ownergroup string
|
||||
Defines to fetch only datasets of the specified ownerGroup (default is to fetch all available datasets)
|
||||
-retrieve
|
||||
Defines if this command is meant to actually copy data to the local system (default nothing is done)
|
||||
-testenv
|
||||
Use test environment (qa) (default is to use production system)
|
||||
-token string
|
||||
Defines optional API token instead of username:password
|
||||
-user string
|
||||
Defines optional username and password (default is to prompt for username and password)
|
||||
```
|
||||
Use option --dataset or --ownergroup to restrict the datasets which should be fetched.
|
||||
|
||||
For the program to check which data is available on the cache server
|
||||
and if the catalog knows about these datasets, you can use:
|
||||
|
||||
```console
|
||||
datasetRetriever my-local-destination-folder
|
||||
scicat-cli datasetRetriever my-local-destination-folder
|
||||
|
||||
======Checking for available datasets on archive cache server ebarema4in.psi.ch:
|
||||
|
||||
@@ -477,7 +455,7 @@ If you want you can skip the previous step and
|
||||
directly trigger the file copy by adding the --retrieve flag:
|
||||
|
||||
```sh
|
||||
datasetRetriever -retrieve <local destinationFolder>
|
||||
scicat-cli datasetRetriever --retrieve <local destinationFolder>
|
||||
```
|
||||
|
||||
This will copy the files into the destinationFolder using the original
|
||||
@@ -489,19 +467,19 @@ Optionally you can also verify the consistency of the copied data by
|
||||
using the `--chksum` flag
|
||||
|
||||
```sh
|
||||
datasetRetriever -retrieve -chksum <local destinationFolder>
|
||||
scicat-cli datasetRetriever --retrieve --chksum <local destinationFolder>
|
||||
```
|
||||
|
||||
If you just want to retrieve a single dataset do the following:
|
||||
|
||||
```sh
|
||||
datasetRetriever -retrieve -dataset <datasetId> <local destinationFolder>
|
||||
scicat-cli datasetRetriever --retrieve --dataset <datasetId> <local destinationFolder>
|
||||
```
|
||||
|
||||
If you want to retrieve all datasets of a given **ownerGroup** do the following:
|
||||
|
||||
```sh
|
||||
datasetRetriever -retrieve -ownergroup <group> <local destinationFolder>
|
||||
scicat-cli datasetRetriever --retrieve --ownergroup <group> <local destinationFolder>
|
||||
```
|
||||
|
||||
#### Expert commands
|
||||
@@ -559,7 +537,7 @@ easiest to get such an API token is to sign it at
|
||||
button. This will bring you to the user settings page, from where you
|
||||
can copy the token with a click on the corresponding copy button.
|
||||
|
||||
### General considerations
|
||||
<!-- ### General considerations
|
||||
|
||||
`SciCat` is a GUI based tool designed to make initial
|
||||
ingests easy. It is especially useful, to ingest data, which can not
|
||||
@@ -591,7 +569,7 @@ On the SLS beamline consoles the software is also pre-installed in the
|
||||
/work/sls/bin folder, which is part of the standard PATH variable.
|
||||
|
||||
If you are not working on the Ra cluster you can download the
|
||||
software on Linux:
|
||||
software on Linux, Windows or Mac.
|
||||
|
||||
```sh
|
||||
/usr/bin/curl -O https://gitlab.psi.ch/scicat/tools/raw/master/linux/SciCat;chmod +x ./SciCat
|
||||
@@ -642,7 +620,7 @@ the desired datasets and clicking on "Save."
|
||||
### Settings
|
||||
|
||||
Additional settings, such as the default value for certain fields can be modified in settings panel (button
|
||||
on the lower left corner).
|
||||
on the lower left corner). -->
|
||||
|
||||
## Publish
|
||||
|
||||
@@ -668,15 +646,15 @@ top bar) and pick the "Publish" action.
|
||||
|
||||

|
||||
|
||||
This opens a form
|
||||
with prefilled information derived from the connected proposal
|
||||
data. This data can then be edited by the user and finally saved.
|
||||
This opens a form. The image below shows the mandatory fields, all of which must be filled in.
|
||||
|
||||

|
||||
|
||||
This defines the data as to be published and makes it known to the
|
||||
data catalog, but the corresponding DOI is not yet made globally
|
||||
available. For this last step to happen, someone with access to this
|
||||
Clicking "Save and Continue" and later "Publish" makes the
|
||||
data publicly available: it defines the data as to be
|
||||
published and makes it known to the data catalog, but
|
||||
the corresponding DOI is not yet made globally available.
|
||||
For this last step to happen, someone with access to this
|
||||
newly generated published data definition (e.g. the person defining
|
||||
the published data or e.g. the PI) has to hit the "register"
|
||||
button. This will trigger the global publication of the DOI. The links
|
||||
@@ -687,12 +665,6 @@ reolver.
|
||||
All published data definitions are then openly available via the so
|
||||
called "Landing Pages", which are hosted on <https://doi.psi.ch> .
|
||||
|
||||
The file data itself data becomes available via the normal data export
|
||||
System of the Ra cluster, which requires however a PSI account. If you
|
||||
want to make the file data anonymously available you need to send a
|
||||
corresponding request to <scicat-help@lists.psi.ch> for now. This process is
|
||||
planned to be automated in future.
|
||||
|
||||
For now all publication are triggered by a scientist explicitly,
|
||||
whenever necessary. In future in addition an automated publication
|
||||
after the embargo period (default 3 years after data taking) will be
|
||||
@@ -822,42 +794,22 @@ module load datacatalog
|
||||
|
||||
If you do not have access to PSI modules (for instance, when archiving
|
||||
from Ubuntu systems), then you can install the datacatalog software
|
||||
yourself. These tools require 64-bit linux.
|
||||
yourself. Linux, Mac and Windows versions are available.
|
||||
|
||||
I suggest storing the SciCat scripts in ~/bin so that they can be
|
||||
easily accessed.
|
||||
|
||||
```sh
|
||||
mkdir -p ~/bin
|
||||
cd ~/bin
|
||||
/usr/bin/curl -O https://gitlab.psi.ch/scicat/tools/raw/master/linux/datasetIngestor
|
||||
chmod +x ./datasetIngestor
|
||||
/usr/bin/curl -O https://gitlab.psi.ch/scicat/tools/raw/master/linux/datasetRetriever
|
||||
chmod +x ./datasetRetriever
|
||||
/usr/bin/curl -O https://gitlab.psi.ch/scicat/tools/raw/master/linux/SciCat
|
||||
chmod +x ./SciCat
|
||||
```
|
||||
To download and install the binaries, please follow these steps:
|
||||
|
||||
When the scripts are updated you will be prompted to re-run some of
|
||||
the above commands to get the latest version.
|
||||
1. Go to the [GitHub releases page](https://github.com/paulscherrerinstitute/scicat-cli/releases)
|
||||
|
||||
You can call the ingestion scripts using the full path
|
||||
(~/bin/datasetIngestor) or else add ~/bin to your unix PATH. To do so,
|
||||
add the following line to your ~/.bashrc file:
|
||||
2. Choose the release of interest (the latest release is recommended)
|
||||
|
||||
```sh
|
||||
export PATH="$HOME/bin:$PATH"
|
||||
```
|
||||
3. Download the file from the Assets of the chosen release, making sure to select the one compatible with your OS
|
||||
|
||||
#### Installation on Windows Systems
|
||||
4. Decompress the asset
|
||||
|
||||
On Windows the executables can be downloaded from the following URL,
|
||||
just enter the address in abrowser and download the file
|
||||
|
||||
```sh
|
||||
https://gitlab.psi.ch/scicat/tools/-/blob/master/windows/datasetIngestor.exe
|
||||
https://gitlab.psi.ch/scicat/tools/-/blob/master/windows/SciCatGUI_Win10.zip
|
||||
```
|
||||
5. Open the folder and run the required application (grant execute permissions if required)
|
||||
|
||||
#### Online work stations in beamline hutches
|
||||
|
||||
@@ -921,64 +873,60 @@ administrative metadata, which have to be provided (status June
|
||||
2021). All fields marked "m" are mandatory, the rest is optional. Some
|
||||
fields are filled automatically if possible, see comments. For the
|
||||
most recent status see this URL
|
||||
<https://scicatproject.github.io/api-documentation/> and follow the link
|
||||
<https://dacat.psi.ch/explorer/> and follow the link
|
||||
called "Model" for the respective datamodel (e.g. Dataset), visible
|
||||
e.g. inside the GET API call section. Or see the model definitions as
|
||||
defined in the SciCat backend, see the json files in
|
||||
<https://github.com/SciCatProject/catamel/tree/develop/common/models>
|
||||
e.g. inside the GET API call section.
|
||||
|
||||
All "Date" fields must follow the date/time format defined in RFC
|
||||
3339, section 5.6, see <https://www.ietf.org/rfc/rfc3339.txt>
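
For scripts that generate metadata files, a timestamp in this format can be produced with the standard `date` utility. The sketch below emits the current time in UTC with a trailing `Z`, which is one valid RFC 3339 form; `rfc3339_now` is a hypothetical helper name.

```sh
#!/bin/sh
# Print the current time as an RFC 3339 timestamp in UTC,
# in the form YYYY-MM-DDTHH:MM:SSZ.
rfc3339_now() {
    date -u '+%Y-%m-%dT%H:%M:%SZ'
}
```

The output can be placed directly into a "Date" field such as `creationTime`.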
|
||||
|
||||
#### Metadata field definitions for datasets of type "base"
|
||||
|
||||
| field | type | must | comment |
|
||||
|------------------|---------------|------|------------------------------------------------------|
|
||||
| pid | string | m | filled by API automatically, do *not* provide this |
|
||||
| owner | string | m | filled by datasetIngestor if missing |
|
||||
| ownerEmail | string | | filled by datasetIngestor if missing |
|
||||
| orcidOfOwner | string | | |
|
||||
| contactEmail | string | m | filled by datasetIngestor if missing |
|
||||
| datasetName | string | | set to "tail" of sourceFolder path if missing |
|
||||
| sourceFolder | string | m | |
|
||||
| size | number | | autofilled when OrigDataBlock created |
|
||||
| packedSize | number | | autofilled when DataBlock created |
|
||||
| creationTime | date | m | filled by API if missing |
|
||||
| type | string | m | (raw, derived...) |
|
||||
| validationStatus | string | | |
|
||||
| keywords | Array[string] | | |
|
||||
| description | string | | |
|
||||
| classification | string | | filled by API or datasetIngestor if missing |
|
||||
| license | string | | filled by datasetIngestor if missing (CC By-SA 4.0) |
|
||||
| version | string | | autofilled by API |
|
||||
| doi | string | | filled as part of publication workflow |
|
||||
| isPublished | boolean | | filled by datasetIngestor if missing (false) |
|
||||
| ownerGroup | string | m | must be filled explicitly |
|
||||
| accessGroups | Array[string] | | filled by datasetIngestor to beamline specific group |
|
||||
| | | | derived from creationLocation |
|
||||
| | | | e.g. /PSI/SLS/TOMCAT -> accessGroups=["slstomcat"] |
|
||||
| field | type | required | comment |
|
||||
|------------------|---------------|-----------|------------------------------------------------------|
|
||||
| pid | string | | filled by API automatically, do *not* provide this |
|
||||
| owner | string | | filled by datasetIngestor if missing |
|
||||
| ownerEmail | string | | filled by datasetIngestor if missing |
|
||||
| orcidOfOwner | string | | |
|
||||
| contactEmail | string | | filled by datasetIngestor if missing |
|
||||
| datasetName | string | | set to "tail" of sourceFolder path if missing |
|
||||
| sourceFolder | string | x | |
|
||||
| size | number | | autofilled when OrigDataBlock created |
|
||||
| packedSize | number | | autofilled when DataBlock created |
|
||||
| creationTime | date | | filled by API if missing |
|
||||
| type | string | x | (raw, derived...) |
|
||||
| validationStatus | string | | |
|
||||
| keywords | Array[string] | | |
|
||||
| description | string | | |
|
||||
| classification | string | | filled by API or datasetIngestor if missing |
|
||||
| license | string | | filled by datasetIngestor if missing (CC By-SA 4.0) |
|
||||
| version | string | | autofilled by API |
|
||||
| doi | string | | filled as part of publication workflow |
|
||||
| isPublished | boolean | | filled by datasetIngestor if missing (false) |
|
||||
| ownerGroup | string | x | must be filled explicitly |
|
||||
| accessGroups | Array[string] | | filled by datasetIngestor to beamline specific group |
|
||||
|
||||
#### Additional fields for type="raw"
|
||||
|
||||
| field | type | must | comment |
|
||||
|-----------------------|--------|------|------------------------------------------------------------|
|
||||
| principalInvestigator | string | m | filled in datasetIngestor if missing (proposal must exist) |
|
||||
| endTime | date | | filled from datasetIngetor if missing |
|
||||
| creationLocation | string | m | see known Instrument list below |
|
||||
| dataFormat | string | | |
|
||||
| scientificMetadata | object | | |
|
||||
| proposalId | string | | filled by API automatically if missing |
|
||||
| field | type | required | comment |
|
||||
|-----------------------|--------|-----------|------------------------------------------------------------|
|
||||
| principalInvestigator | string | | filled in datasetIngestor if missing (proposal must exist) |
|
||||
| endTime | date | | filled from datasetIngestor if missing |
|
||||
| creationLocation | string | x | see known Instrument list below |
|
||||
| dataFormat | string | | |
|
||||
| scientificMetadata | object | | |
|
||||
| proposalId | string | | filled by API automatically if missing |
|
||||
|
||||
#### Additional fields for type="derived"
|
||||
|
||||
| field | type | must | comment |
|
||||
|--------------------|---------------|------|---------|
|
||||
| investigator | string | m | |
|
||||
| inputDatasets | Array[string] | m | |
|
||||
| usedSoftware | string | m | |
|
||||
| jobParameters | object | | |
|
||||
| jobLogData | string | | |
|
||||
| scientificMetadata | object | | |
|
||||
| field | type | required | comment |
|
||||
|--------------------|---------------|-----------|---------|
|
||||
| investigator | string | x | |
|
||||
| inputDatasets | Array[string] | x | |
|
||||
| usedSoftware | string | x | |
|
||||
| jobParameters | object | | |
|
||||
| jobLogData | string | | |
|
||||
| scientificMetadata | object | | |
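
Before running the ingestor, it can help to check that a metadata file mentions the required fields from the tables above. The sketch below is a deliberately naive text search, not a JSON validator: it only looks for the quoted key followed by a colon, so a key nested inside `scientificMetadata` would also count. `missing_fields` is a hypothetical helper name.

```sh
#!/bin/sh
# Print every field name from the argument list that does not
# appear as a key ("name":) in the given metadata file.
missing_fields() {
    file=$1
    shift
    for field in "$@"; do
        grep -q "\"$field\"[[:space:]]*:" "$file" || printf '%s\n' "$field"
    done
}
```

For a raw dataset, `missing_fields metadata.json sourceFolder type ownerGroup creationLocation` prints nothing if all four keys are present.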
|
||||
|
||||
### About Scientific Values and Units
|
||||
|
||||
@@ -1249,7 +1197,7 @@ chosen for the same quantity:
|
||||
and the folders will be scanned for files
|
||||
|
||||
```sh
|
||||
datasetIngestor metadata.json [filelisting.txt | 'folderlisting.txt']
|
||||
scicat-cli datasetIngestor metadata.json [filelisting.txt | 'folderlisting.txt']
|
||||
```
|
||||
|
||||
You will be prompted for your username and password.
|
||||
@@ -1259,7 +1207,7 @@ chosen for the same quantity:
|
||||
catalog
|
||||
|
||||
```sh
|
||||
datasetIngestor --ingest metadata.json [filelisting.txt | 'folderlisting.txt']
|
||||
scicat-cli datasetIngestor --ingest metadata.json [filelisting.txt | 'folderlisting.txt']
|
||||
```
|
||||
|
||||
When the job is finished all needed metadata will be ingested into the
|
||||
@@ -1299,31 +1247,11 @@ chosen for the same quantity:
|
||||
Then you run the datasetIngestor program usually under a beamline
|
||||
specific account. To run fully automatically, all potential
|
||||
questions asked interactively by the program must be pre-answered
|
||||
through a set of command line options:
|
||||
through a set of command line options. The command below shows all
|
||||
available options:
|
||||
|
||||
```console
|
||||
datasetIngestor [options] metadata-file [filelisting-file|'folderlisting.txt']
|
||||
|
||||
-allowexistingsource
|
||||
Defines if existing sourceFolders can be reused
|
||||
-autoarchive
|
||||
Option to create archive job automatically after ingestion
|
||||
-copy
|
||||
Defines if files should be copied from your local system to a central server before ingest.
|
||||
-devenv
|
||||
Use development environment instead of production environment (developers only)
|
||||
-ingest
|
||||
Defines if this command is meant to actually ingest data
|
||||
-linkfiles string
|
||||
Define what to do with symbolic links: (keep|delete|keepInternalOnly) (default "keepInternalOnly")
|
||||
-noninteractive
|
||||
If set no questions will be asked and the default settings for all undefined flags will be assumed
|
||||
-tapecopies int
|
||||
Number of tapecopies to be used for archiving (default 1)
|
||||
-testenv
|
||||
Use test environment (qa) instead of production environment
|
||||
-user string
|
||||
Defines optional username:password string
|
||||
scicat-cli datasetIngestor [options] metadata-file [filelisting-file|'folderlisting.txt']
|
||||
```
|
||||
|
||||
- here is a typical example using the MX beamline at SLS as an example
|
||||
@@ -1331,11 +1259,11 @@ chosen for the same quantity:
|
||||
metadata.json
|
||||
|
||||
```sh
|
||||
datasetIngestor -ingest \
|
||||
-linkfiles keepInternalOnly \
|
||||
-allowexistingsource \
|
||||
-user slsmx:XXXXXXXX \
|
||||
-noninteractive \
|
||||
scicat-cli datasetIngestor --ingest \
|
||||
--linkfiles keepInternalOnly \
|
||||
--allowexistingsource \
|
||||
--user slsmx:XXXXXXXX \
|
||||
--noninteractive \
|
||||
metadata.json
|
||||
```
|
||||
|
||||
@@ -1376,7 +1304,7 @@ Otherwise just follow the description in the section "Manual ingest
|
||||
using datasetIngestor program" and use the option -copy, e.g.
|
||||
|
||||
```sh
|
||||
datasetIngestor -autoarchive -copy -ingest metadata.json
|
||||
scicat-cli datasetIngestor --autoarchive --copy --ingest metadata.json
|
||||
```
|
||||
|
||||
This command will copy the data to a central rsync server, from where
|
||||
@@ -1504,13 +1432,10 @@ following curl command:
|
||||
|
||||
```sh
|
||||
# for "functional" accounts
|
||||
curl -X POST --header 'Content-Type: application/json' -d '{"username":"YOUR-LOGIN","password":"YOUR-PASSWORD"}' 'https://dacat-qa.psi.ch/api/v3/Users/login'
|
||||
|
||||
# for normal user accounts
|
||||
curl -X POST --header 'Content-Type: application/json' -d '{"username":"YOUR-LOGIN","password":"YOUR-PASSWORD"}' 'https://dacat-qa.psi.ch/auth/msad'
|
||||
curl -X POST --header 'Content-Type: application/json' -d '{"username":"YOUR-LOGIN","password":"YOUR-PASSWORD"}' 'https://dacat-qa.psi.ch/api/v3/auth/login'
|
||||
|
||||
# reply if successful:
|
||||
{"id":"NQhe3...","ttl":1209600,"created":"2019-01-22T07:03:21.422Z","userId":"5a745bde4d12b30008020843"}
|
||||
{"access_token":"NQhe3...", "id":"NQhe3...","created":"2019-01-22T07:03:21.422Z","userId":"5a745bde4d12b30008020843","expires_in":604800, "ttl":604800,...}
|
||||
```
|
||||
|
||||
The "id" field contains the access token, which you copy into the corresponding field at the top of the explorer page.
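
In a script, the token can also be pulled out of the login reply on the command line. The sketch below assumes the reply is a single line of JSON containing an `"access_token":"..."` pair, as in the example reply above; `extract_token` is a hypothetical helper name, and a real JSON parser is more robust than this `sed` expression.

```sh
#!/bin/sh
# Read a login reply on stdin and print the value of "access_token".
# Assumes the key and value are written without spaces around the colon.
extract_token() {
    sed -n 's/.*"access_token":"\([^"]*\)".*/\1/p'
}
```

The login `curl` command can be piped into it, e.g. `curl ... | extract_token`, and the result stored in a shell variable.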
|
||||
@@ -1563,7 +1488,7 @@ use the command datasetGetProposal, which returns the proposal
|
||||
information for a given ownerGroup
|
||||
|
||||
```sh
|
||||
/usr/bin/curl -O https://gitlab.psi.ch/scicat/tools/raw/master/linux/datasetGetProposal;chmod +x ./datasetGetProposal
|
||||
scicat-cli datasetGetProposal
|
||||
```
|
||||
|
||||
### Link to Group specific descriptions
|
||||
@@ -1597,6 +1522,8 @@ inside the digital user office DUO
|
||||
| Sis-Hrpes | /PSI/SLS/SIS-HRPES | slssis-hrpes |
|
||||
| Super-XAS | /PSI/SLS/SUPER-XAS | slssuper-xas |
|
||||
| Tomcat | /PSI/SLS/TOMCAT | slstomcat |
|
||||
| S-Tomcat | /PSI/SLS/S-TOMCAT | slstomcat |
|
||||
| I-Tomcat | /PSI/SLS/I-TOMCAT | slstomcat |
|
||||
| VUV | /PSI/SLS/VUV | slsvuv |
|
||||
| XIL-II | /PSI/SLS/XIL-II | slsxil-ii |
|
||||
| Xtreme | /PSI/SLS/XTREME | slsxtreme |
|
||||
@@ -1650,49 +1577,3 @@ The connected email distribution lists are {ingestAccount}@psi.ch
|
||||
| FLAME | /PSI/SMUS/FLAME | smusflame |
|
||||
|
||||
The connected email distribution lists are {ingestAccount}@psi.ch
|
||||
|
||||
## Update History of Ingest Manual
|
||||
|
||||
| Date | Updates |
|
||||
|--------------------|----------------------------------------------------------------------------|
|
||||
| 10. September 2018 | Initial Release |
|
||||
| 6. October 2018 | Added warning section to not modify data after ingest |
|
||||
| 10. October 2018 | ownerGroup field must be defined explicitly |
|
||||
| 28. October 2018 | Added section on datasetRetriever tool |
|
||||
| 20. November 2018 | Remove ssh key handling description (use Kerberos) |
|
||||
| 3. December 2018 | Restructure archive stepp, add autoarchive flag |
|
||||
| 17. January 2019 | Update on automatically filled values, more options for datasetIngestor |
|
||||
| 22. January 2019 | Added description for API access for script developers, 2 new commands |
|
||||
| | datasetArchiver and datasetGetProposal |
|
||||
| 22. February 2019 | Added known beamlines(instruments (creationLocation) value list |
|
||||
| 24. February 2019 | datasetIngestor use cases for automated ingests using beamline accounts |
|
||||
| 23. April 2019 | Added AFS infos and available central storage, need for Kerberos tickets |
|
||||
| 23. April 2019 | Availability of commands on RA cluster via pmodules |
|
||||
| 3. May 2019 | Added size limitation infos |
|
||||
| 9. May 2019 | Added hints for accessGroups definition for derived data |
|
||||
| | Added infos about email notifications |
|
||||
| 10. May 2019 | Added ownerGroup filtered retrieve option, decentral case auto detect |
|
||||
| 7. Juni 2019 | Feedback from Manuel added |
|
||||
| 21. Oct 2019 | New version of CLI tools to deal with edge cases (blanks in sourcefolder |
|
||||
| | dangling links, ingest for other person, need for kerberos ticket as user) |
|
||||
| 14. November 2019 | Restructuring of manual,New CLI tools, auto kinit login |
|
||||
| | Progress indicators, chksum test updated |
|
||||
| 20. Januar 2020 | Auto fill principalInvestigator if missing |
|
||||
| 3. March 2020 | Added Jupyter notebook analysis section |
|
||||
| 5. March 2020 | Add hint for datasets not to be published |
|
||||
| 19. March 2020 | Added hint that analysis Jupyter tool is in pilot phase only |
|
||||
| 19. March 2020 | Added recommendation concerning unit handling for physical quantities |
|
||||
| 9. July 2020 | Added GUI tool SciCatArchiver (developer: Klaus Wakonig) |
|
||||
| 11. July 2020 | Installation of SciCatArchiver on non-Ra system |
|
||||
| 14. July 2020 | Added publication workflow and recommended file structure chapter |
|
||||
| 16. July 2020 | Updated SciCat GUI deployment information |
|
||||
| 31. July 2020 | New deploy location, + policy parameters, new recommended file structure |
|
||||
| 27. August 2020 | Added Windows Support information |
|
||||
| 10. Sept 2020 | Corrected example JSON syntax in one location |
|
||||
| 23. November 2020 | Corrected instructions for using the SciCat GUI on Windows 10 |
|
||||
| 19. February 2020 | Added info about proposalId link |
|
||||
| 24. Juni 2021 | Major restructuring of full document for easier readability |
|
||||
| 9. Dec 2021 | Corrected spelling of value/units convention |
|
||||
| 23. April 2022 | Added hint to use -token option for CLI and SciCat GUI as normal user |
|
||||
| 2. Dec 2022 | Extended ingest use cases description of needed parameters Win+Linux |
|
||||
| 21. Dec 2023 | Inlcude redundancy risks and costs and file names limitations |
|
||||
|
||||
@@ -14,6 +14,9 @@ markdown_extensions:
|
||||
- toc:
|
||||
permalink: true
|
||||
- pymdownx.superfences
|
||||
- pymdownx.emoji:
|
||||
emoji_index: !!python/name:material.extensions.emoji.twemoji
|
||||
emoji_generator: !!python/name:material.extensions.emoji.to_svg
|
||||
|
||||
# Configuration
|
||||
theme:
|
||||
|
||||