
Unarchived Data Fix

SwissFEL scan descriptions are JSON files containing references to absolute paths:

$ cat /sf/instrument/data/p12345/raw/run0123-ImportantData/meta/scan.json 
{
    "scan_files": [
        [
            "/sf/instrument/data/p12345/raw/run0123-ImportantData/data/acq0001.PVDATA.h5",
            "/sf/instrument/data/p12345/raw/run0123-ImportantData/data/acq0001.BSDATA.h5",
            "/sf/instrument/data/p12345/raw/run0123-ImportantData/data/acq0001.CAMERAS.h5"
        ],
...

Unarchived data arrives in /sf/instrument/data/p12345/work/retrieve/, and the paths in the JSON files need to be updated accordingly:

$ cat /sf/instrument/data/p12345/work/retrieve/sf/instrument/data/p12345/raw/run0123-ImportantData/meta/scan_mod.json 
{
    "scan_files": [
        [
            "/sf/instrument/data/p12345/work/retrieve/sf/instrument/data/p12345/raw/run0123-ImportantData/data/acq0001.PVDATA.h5",
            "/sf/instrument/data/p12345/work/retrieve/sf/instrument/data/p12345/raw/run0123-ImportantData/data/acq0001.BSDATA.h5",
            "/sf/instrument/data/p12345/work/retrieve/sf/instrument/data/p12345/raw/run0123-ImportantData/data/acq0001.CAMERAS.h5"
        ],
...
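
The rewrite shown above amounts to prepending the retrieve directory to every absolute path in the scan description. The following is a minimal sketch of that idea, not the script's actual implementation: the function name prefix_scan_files is hypothetical, and it assumes that only the "scan_files" entry holds paths (the rest of the file is elided above and may contain others).

```python
import json


def prefix_scan_files(scan, retrieve_dir):
    """Return a copy of scan with retrieve_dir prepended to each scan_files path.

    Hypothetical helper: assumes paths live only under the "scan_files" key.
    """
    retrieve_dir = retrieve_dir.rstrip("/")

    def fix(item):
        # scan_files is a nested list, so recurse into sub-lists
        if isinstance(item, list):
            return [fix(i) for i in item]
        # prefix absolute paths, but leave already-prefixed ones alone
        if (
            isinstance(item, str)
            and item.startswith("/")
            and not item.startswith(retrieve_dir + "/")
        ):
            return retrieve_dir + item
        return item

    fixed = dict(scan)
    fixed["scan_files"] = fix(scan.get("scan_files", []))
    return fixed


if __name__ == "__main__":
    example = {
        "scan_files": [
            ["/sf/instrument/data/p12345/raw/run0123-ImportantData/data/acq0001.PVDATA.h5"]
        ]
    }
    fixed = prefix_scan_files(example, "/sf/instrument/data/p12345/work/retrieve/")
    print(json.dumps(fixed, indent=4))
```

Skipping paths that already start with the retrieve directory makes the sketch safe to run twice on the same data.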

unarchived_data_fix automates this update:

usage: unarchived_data_fix.py [-h] [--no-dryrun] [--inplace] [--overwrite]
                              {alvra,bernina,cristallina,diavolezza,maloja,furka}
                              pgroup

positional arguments:
  {alvra,bernina,cristallina,diavolezza,maloja,furka}
                        Which instrument has the data been measured at?
  pgroup                Which pgroup is the data in?

optional arguments:
  -h, --help            show this help message and exit
  --no-dryrun           Disable dry run. If dryrun is enabled (default) no
                        files are written.
  --inplace             Update scan.json in place. If inplace is disabled
                        (default) a new file scan_mod.json is created.
  --overwrite           Overwrite existing files. If overwrite is disabled
                        (default) existing files will be skipped.

usage examples:
  Dry run (nothing is changed or overwritten)
    unarchived_data_fix.py alvra p12345

  Create new files called scan_mod.json
    unarchived_data_fix.py alvra p12345 --no-dryrun

  Overwrite scan.json
    unarchived_data_fix.py alvra p12345 --no-dryrun --overwrite --inplace
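
The way --inplace and --overwrite interact can be read off the help text and the examples above. The sketch below is one interpretation, not the script's code: the function choose_target is hypothetical, and it assumes that an in-place update counts as overwriting an existing file, which is why the last example passes both flags.

```python
from pathlib import Path


def choose_target(scan_json, inplace=False, overwrite=False):
    """Return the file that would be written, or None if it would be skipped.

    Hypothetical helper mirroring the documented defaults:
    write scan_mod.json next to scan.json and skip existing files.
    """
    scan_json = Path(scan_json)
    # --inplace targets scan.json itself, otherwise a sibling scan_mod.json
    target = scan_json if inplace else scan_json.with_name("scan_mod.json")
    # without --overwrite, existing files are left untouched
    if target.exists() and not overwrite:
        return None
    return target
```

In a dry run (the default), a tool following this logic would only report the chosen target; the actual write would happen only with --no-dryrun.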