Automatic Processing tool aka Apocalypse (apo)
Apocalypse (apo) is a library that provides SwissFEL users with a platform to effortlessly run (and re-run) their scripts on the SwissFEL infrastructure -- the compute cluster "Eris" (login node: sf-eris) with sf-daq output files.
More information about Eris
Apo is a rewrite/expansion of the Automatic Processing tool (ap), giving users flexibility in how to handle the raw data and separating the handling of processed data into another step.
Getting started
Before you start
Make sure that your desired script runs properly on Eris:
ssh sf-eris
salloc
ssh sf-cn-#
<pth_script> --input <filename>.h5
- sf-cn-# corresponds to the node allocated for interactive use, displayed after salloc
- <pth_script> is the top-level script that should run exactly as stated. That means any additional actions (e.g. setting the environment, adding pmodules, additional parameter handling) should be included in it. See examples
- <filename>.h5 is the path to the file that you would like to process, in /sf/<endstation>/data/<pgroup>/raw/
Once your tests are successful, remember to cancel your allocation with:
scancel <JID>
<JID> is Slurm's job ID, displayed when allocating resources; you can also check it with squeue -u $USER
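The "top-level script" contract above can be sketched as a minimal Python script (a hypothetical illustration; only the --input argument comes from the invocation shown above, everything else is a made-up placeholder):

```python
#!/usr/bin/env python
# Hypothetical minimal top-level script: anything the job needs
# (environment setup, pmodules, extra parameter handling) should
# live in this one file or its wrapper.
import argparse
import os
import sys


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Process one sf-daq output file.")
    parser.add_argument("--input", required=True, help="path to the raw .h5 file")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    if not os.path.isfile(args.input):
        sys.exit("input file not found: %s" % args.input)
    # ... actual processing of args.input would go here ...
    return args.input


# example invocation with an illustrative (made-up) filename:
args = parse_args(["--input", "acq0001.CAMERAS.h5"])
```

Testing this locally first, then on an interactive node as shown above, avoids debugging inside submitted jobs.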
Setup
Run apo from Eris:
ssh sf-eris
If /sf/daq/bin is already on your PATH, there is no need to do anything; otherwise:
export PATH=$PATH:/sf/daq/bin
Running
apocalypse -s <pth_for_script> -e <endstation>
where:
- <pth_for_script> is the path to your script (can be relative)
- <endstation> is the endstation that the corresponding data will be filtered for
There are additional parameters that can be used; explore them with
apocalypse --help
Or re-run a previous apo run with:
apocalypse_re_run -p <pgroup> -r <run>
where:
- <pgroup> is the pgroup, given as a string in the format pXXXXX
- <run> is the run number, given as a string or a list of strings, e.g. "*024*,*025*"

There are additional parameters that can be used; explore them with
apocalypse_re_run --help
Re-run will just emit a "file written" message, similar to when the original run was taken; a separately running apo instance, with its parameters set independently, is needed to trigger the script execution. Re-run does not set any parameters for apo, as it only sends the messages.
Examples
Simple python script
A simple script that writes metadata can be found in simple1
This example just takes one camera file and saves an output file with projections of the channels that have more than one dimension, along with an apo meta file. A simple txt meta file is written as meta; it will not be parsed later on by apo and will be passed as-is with the success message. Since there is no filter for bs files, the best way to run this script is:
apocalypse -s ./examples/simple1/run_simple1.sh -e endstation -writer-type imagebuffer
to make sure that the jobs are submitted only for cam files
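The "projection" step described above can be sketched roughly as follows (a hypothetical simplification, not the actual example code; in the real script the channels would come from the camera .h5 file):

```python
import numpy as np


def project(data):
    """Sum a multi-dimensional channel down to one 1D projection per
    axis; channels with a single dimension are skipped. This sketches
    the 'projection' step described in the text."""
    if data.ndim <= 1:
        return None
    return [data.sum(axis=tuple(ax for ax in range(data.ndim) if ax != keep))
            for keep in range(data.ndim)]


# illustrative 3x4 "image" channel standing in for real camera data
image = np.arange(12).reshape(3, 4)
row_proj, col_proj = project(image)
```

The real example additionally writes the projections and the apo meta file to disk.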
Simple python script with bash wrapper
A simple script that writes metadata and handles an additional parameter can be found in simple2
This example takes one camera file and a defined background file, and saves an output file with projections of the channels that have more than one dimension, along with an apo meta file. In this example the meta file is written as .json and will be parsed into a dictionary before the data is passed with the success message. Since there is no filter for bs files, the best way to run this script is:
apocalypse -s ./examples/simple2/run_simple2.sh -e endstation -writer-type imagebuffer
to make sure that the jobs are submitted only for cam files. It is convenient to make the background file a symlink and change it when needed.
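The .json meta file from this example might be written and read back along these lines (a sketch: the keys, the acquisition number "0005", and the temporary directory are all made up; only the apo_acq<acq>.json naming convention comes from this document):

```python
import json
import os
import tempfile

# Hypothetical metadata content; the keys are illustrative,
# not prescribed by apo.
meta = {"background_file": "background.h5", "n_pulses": 100}

# apo only picks up meta files named apo_acq<acq>.(json|yaml|txt);
# a tempfile directory stands in for the real .../res/.../meta dir.
meta_dir = tempfile.mkdtemp()
path = os.path.join(meta_dir, "apo_acq0005.json")
with open(path, "w") as fh:
    json.dump(meta, fh)

# apo-side view: the .json content is parsed back into a plain dict
# before being attached to the success message
with open(path) as fh:
    parsed = json.load(fh)
```

This round-trip is why .json (unlike the plain .txt of simple1) arrives as a dictionary in the success message.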
General remarks
- Make sure to write enough metadata in your files to make your work reproducible; apo does not handle that for you
- It is recommended to mirror the raw data structure in res; only metadata files in /sf/<endstation>/data/<pgroup>/res/processed/<run>/meta named apo_acq<acq>(.json, .yaml, .txt) will be processed and sent with the success message
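Building that res-side meta location could look like this (a sketch; the path pattern and file naming come from the remark above, while the endstation, pgroup, run, and acquisition values are made-up placeholders):

```python
from pathlib import Path


def meta_path(endstation, pgroup, run, acq, ext="json"):
    """Mirror the raw layout on the res side and return the location
    where apo picks up the apo_acq<acq> metadata file."""
    return (Path("/sf") / endstation / "data" / pgroup / "res"
            / "processed" / run / "meta" / f"apo_acq{acq}.{ext}")


# illustrative values only:
p = meta_path("alvra", "p12345", "run0042", "0001")
```

Keeping this layout means a processed run can always be traced back to its raw counterpart by path alone.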
buffer - file type correspondence:
- data3buffer = BSDATA.h5
- imagebuffer = CAMERAS.h5
- detector_buffer = JF.h5
- apocalypse = all files when "re-run"