diff --git a/README.md b/README.md
index 6801e4b..10d7a49 100644
--- a/README.md
+++ b/README.md
@@ -1,31 +1,25 @@
-# ap
+# Automatic Processing tool (ap)
-Automatic Processing tool
-
-Runs on files produced by [sf-daq](https://github.com/paulscherrerinstitute/sf_daq_broker)
+The Automatic Processing (ap) tool, designed to streamline data processing and logbook management during beamtime, is a vital component for SwissFEL experiments. This tool seamlessly integrates with [sf-daq](https://github.com/paulscherrerinstitute/sf_daq_broker) output files, automating tasks like indexing via crystfel and logbook population in Google Spreadsheets.
# Table of Contents
-* [Installation](#installation)
* [Usage](#usage)
* [Before beamtime](#usage1)
* [During beamtime](#usage2)
- * [start/stop](#usage2_start)
- * [changes in configuration files](#usage2_config)
- * [data re-processing](#usage2_reprocess)
- * [pausing indexing](#usage2_pause)
* [After beamtime](#usage3)
* [Configuration files](#config)
* [Google Authentication](#google-api)
+* [Installation](#installation-from-source)
## Description
-Automatic Processing tool checks for the new files/runs produced by sf-daq and runs automatically workload (currently - indexing (by crystfel)) and fills logbook (google spreadsheet) with information with some daq parameters from the sf-daq and processing.
-## Installation
+The Automatic Processing tool continuously monitors and processes files generated by sf-daq. It automatically executes tasks like indexing (utilizing [crystfel](https://www.desy.de/~twhite/crystfel/)) and populates a designated Google Spreadsheet with relevant experiment parameters. This simplifies data processing and documentation, enhancing the efficiency of beamtime operations.
-### Pre-installed software (recommended)
-Automatic Processing tool is installed in **/sf/jungfrau/applications/ap** place and it's recommended to use if from that place (all examples below will be using that tool place).
+## Pre-Installed Package (PSI)
-Installed conda environment can be activated with
+The Automatic Processing tool is installed in **/sf/jungfrau/applications/ap**, and it's recommended to utilize it from this directory (all examples provided below will use this software path).
+
+To activate the corresponding conda environment, use the following commands:
```
$ source /sf/jungfrau/applications/miniconda3/etc/profile.d/conda.sh
$ conda activate ap
@@ -35,12 +29,14 @@ $ conda activate ap
### Before beamtime
- * run **prepare.sh** script, which will make directory **ap_config** in res/ space of corresponding pgroup and populate it with [configuration files](#config):
+ Before the beamtime starts, follow these steps:
+
+ 1. Run the **prepare.sh** script from the res/ directory of the corresponding pgroup; it creates an **ap_config** directory there and populates it with [configuration files](#config):
+ ```bash
+ cd p12345/res
+ /sf/jungfrau/applications/ap/scripts/prepare.sh
```
- $ cd p12345/res
- $ /sf/jungfrau/applications/ap/scripts/prepare.sh
- ```
- * make corresponding changes in the configuration files (see section [Configuration files](#config)):
+ 2. Modify the configuration files as needed (see section [Configuration files](#config)):
* BEAM_ENERGY.txt
@@ -50,170 +46,190 @@ $ conda activate ap
* run_index.sh
- * create file .geom (DETECTOR_NAME is variable defined by you in env_setup.sh file) with the crystfel geometry file for corresponding detector (example : JF17T16V01.geom file for CrystallinaMX instrument)
+ 3. Create a .geom file with the crystfel geometry for the detector (DETECTOR_NAME is the variable you define in the env_setup.sh file; example: JF17T16V01.geom for the CrystallinaMX instrument).
- * put in ap/CELL directory cell files of the protein which will be exposed during beamtime (format of the files should be readable by crystfel). Name of the cell files needs to be .cell.
- ```
+ 4. Place cell files for the proteins to be exposed during the beamtime in the ap_config/CELL directory (the file format must be readable by crystfel, and the file names must end in .cell):
+ ```bash
$ ls res/ap_config/CELL
lyso.cell hewl.cell
```
- **HINT** - in case there are several space group at which protein can be indexed, it's possible to run automatically indexing in the *alternative* space group. To do this - provide an alternative space group settings in the file .cell_alternative. Example:
+ **HINT**: If a protein can be indexed in several space groups, indexing can automatically be run in an *alternative* space group as well: provide the alternative space group settings in a .cell_alternative file. Runs with such a protein are then indexed twice, and both results are filled into the logbook.
+
+
+ 5. Create an empty Google Spreadsheet and the corresponding [credentials files](#google-api). It's important to have one file named credentials.json plus a few more (3 is enough) named credentials-1.json, credentials-2.json, ... It's recommended to generate new credentials files for each beamtime, to avoid exposing experiment information.
```
- > $ ls res/ap_config/CELL
- lyso.cell chim.cell chim.cell_alternative
+ ls res/ap_config/credentials*json
+ credentials.json credentials-1.json credentials-2.json credentials-3.json
```
- runs with the =lyso will be indexed using lyso.cell file, while for the =chim - indexing will be done twice, using chim.cell and chim.cell_alternative files (and results of both indexing will be filled in logbook)
+ 6. Grant write access to the spreadsheet to the service accounts. To find the e-mail addresses of the service accounts:
+ ```
+ grep client_email credentials*json
+ ```
- * create (an empty) google spreadsheet
+ 7. Edit the env_setup.sh file and set the LOGBOOK variable to the URL of the Google Spreadsheet (https://...).
- * create (several distinct) credentials files (see section [google authentication](#google-api) how to create service accounts and keys if not done before) and store them with the names in the config directory (it's important to have file with name credentials.json and have few(3 is enough) with names credentials-1.json, credentials-2.json...):
- ```
- $ ls res/ap_config/credentials*json
- credentials.json credentials-1.json credentials-2.json credentials-3.json
- ```
- ***RECOMMENDATION*** - use/generate new credentials files for each beamtime to not expose experiment information
-
- * give write access to the google spreadsheet to the service-accounts (recommended) or give full editor access to all who know url of the logbook(quicker, but not recommended action). To find e-mails of the service accounts:
- ```
- $ grep client_email credentials*json
- ```
-
- * edit env_setup.sh file to fill URL_TO_GOOGLE_SPREADSHEET(https://...) to the LOGBOOK variable
-
- * setup/prepare spreadsheet for automatic filling:
- ```
- $ . ./env_setup.sh
- $ python /sf/jungfrau/applications/ap/ap/update-spreadsheet.py --setup --url ${LOGBOOK}
- ```
+ 8. Set up the spreadsheet for automatic filling:
+ ```
+ . ./env_setup.sh
+ python /sf/jungfrau/applications/ap/ap/update-spreadsheet.py --setup --url ${LOGBOOK}
+ ```
-
### During Beamtime
-#### start/stop automatic processing tool:
-* to function properly, instruction for sf-daq to produce detector files must include
- * adc_to_energy : True
- * save_dap_results : True
- * crystfel_lists_laser : True
-
- Optional, to make files smaller:
- * compression: True
- * factor: Value (0.25 to round to 250eV or photon beam energy to make output in photon counts)
-
- Important:
- * geometry: False (that's the usual choice, module-to-module adjustment is made then with crystfel geometry file. Choice of the value should be aligned with the geometry file used)
-
-* login to swissfel online computing infrastructure with your personal PSI account:
-```
-$ ssh psi_account@sf-l-001
-```
-* go to the directory with configuration files (prepared in the [Before Beamtime](#usage1) step):
-```
-$ cd /sf/alvra/data/p12345/res/ap_config
-```
-* start automatic processing tool execution:
-```
-$ /sf/jungfrau/applications/ap/scripts/ap.sh
-```
-***HINT*** - best is to start this process in screen or tmux session, to be able to re-connect to this session remotely
+ 1. Check that the detector parameters given to sf-daq contain:
+    * adc_to_energy : True
+    * save_dap_results : True
+    * crystfel_lists_laser : True
+    * geometry : False (the usual choice; module-to-module adjustment is then made with the crystfel geometry file, so this value must be consistent with the geometry file used)
-* stop automatic processing tool:
- * if running from your account : Ctrl-C in corresponding session
- * if running by other account - put file STOP inside configuration directory
+    Optional (to reduce file sizes):
+    * compression: True
+    * factor: 0.25 (to round to 250 eV), or the photon beam energy (to make the output in photon counts)
+
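+ For illustration, the relevant fragment of the detector settings sent to sf-daq might look like this (only the keys listed above come from this document; their exact placement and nesting inside the sf-daq request is an assumption):
+ ```
+ "adc_to_energy": true,
+ "save_dap_results": true,
+ "crystfel_lists_laser": true,
+ "geometry": false,
+ "compression": true,
+ "factor": 0.25
+ ```
+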
+ 2. Log in to the SwissFEL online computing infrastructure with your personal PSI account and go to the configuration directory:
+ ```bash
+ ssh psi_account@sf-l-001
+ cd /sf/alvra/data/p12345/res/ap_config
+ ```
+
+ 3. Start the automatic processing tool execution:
+ ```bash
+ /sf/jungfrau/applications/ap/scripts/ap.sh
```
- $ touch /sf/alvra/data/p12345/res/ap_config/STOP
- ```
- (if such file is present inside directory - new automatic processing tool will not start, so remove file before re-starting the tool)
+ **HINT**: It's recommended to start this process in a screen or tmux session, so you can reconnect to it remotely.
+
+ 4. Stop the automatic processing tool:
+    * if running from your account: press Ctrl-C in the corresponding session
+    * if running under another account, create a STOP file in the configuration directory:
+    ```
+    touch /sf/alvra/data/p12345/res/ap_config/STOP
+    ```
+    (while this file is present, a new instance of the tool will not start, so remove it before re-starting)
+
+ ### Possible actions during beamtime
-#### changes in configuration files
-can be done at any time and new processing jobs will take new values
+ #### changes in configuration files
-#### re-processing of already processed runs
-in case of need to re-run indexing (new config parameters, new geometry file etc) - first make sure that previous indexing jobs for these runs are finished (check CURRENT_JOBS.txt file in config directory or run "squeue"). If they are finished - remove corresponding to the runs (please note that run number is **unique_acquisition_run_number**, not scan number) files from output directory. Example:
-```
-scan number 206 (raw/run0206*/ directory with data) needs to be re-indexed. Scan contains 24 steps.
-corresponding **unique_acquisition_run_number** are 4048-4071
-$ grep unique_acquisition_run_number raw/run0206*/meta/acq*.json
+ Changes to configuration files can be made at any time, and new processing jobs will automatically consider these updated values.
-or look at logbook, **unique_acquisition_run_number** is the first column of spreadsheet
-check that there are no jobs with such numbers/name running, looking at CURRENT_JOBS.txt file or *squeue*
+ #### re-processing of already processed runs
-remove res/ap_config/output/run*4048-4071*.index* files to re-run indexing for that scan
-```
-#### pausing indexing
-in case of unknown processing parameters (detector distance, geometry file(beam center), not yet known cell file...), it's possible to pause (not start indexing jobs) putting semaphore file NO_INDEXING in config directory
-```
-$ touch res/ap_config/NO_INDEXING
-```
-once this file is removed - all not indexed runs will be processed by the tool
+ If re-running indexing becomes necessary due to new configuration parameters or updated files:
+
+ 1. Ensure previous indexing jobs for these runs are finished (check the CURRENT_JOBS.txt file in the config directory or use the "squeue" command).
+ 2. Identify the **unique_acquisition_run_number**s associated with the scan (note that the run number is the **unique_acquisition_run_number**, not the scan number).
+ 3. Remove the corresponding files for these runs from the output/ directory to initiate re-indexing.
+
+ Example:
+ ```
+ # Scan number 206 (raw/run0206*/ directory with data) needs to be re-indexed.
+ # The scan contains 24 steps; the corresponding unique_acquisition_run_number
+ # values are 4048-4071:
+ grep unique_acquisition_run_number raw/run0206*/meta/acq*.json
+
+ # Alternatively, look at the logbook: unique_acquisition_run_number is the
+ # first column of the spreadsheet.
+ # Check that no jobs with these numbers/names are running, looking at the
+ # CURRENT_JOBS.txt file or squeue.
+
+ # Remove res/ap_config/output/run*4048-4071*.index* files to re-run indexing
+ # for that scan.
+ ```
+
+ #### pausing indexing
+
+ To pause indexing (e.g. when the detector distance, the geometry file/beam center, or the cell file is not yet known):
+
+ * Create a semaphore file named NO_INDEXING in the config directory.
+ ```bash
+ touch res/ap_config/NO_INDEXING
+ ```
+ Once this file is removed, all runs that were not yet indexed will be processed by the tool.
+
### After Beamtime
+
+ Upon completing the beamtime activities, follow these steps:
-* stop automatic processing executable (Ctrl-c ap.py process) once all runs are processed for this beamtime (no active jobs and filling of the logbook is finished)
+ 1. Stop the Automatic Processing:
+    * Terminate the automatic processing executable with Ctrl-C, or by creating a STOP file in the running directory, once all runs of this beamtime are processed (no active jobs and the logbook filling is finished).
-* remove credentials*json files and revoke api-keys in [Google Developer Console](https://console.developers.google.com/) (->"Service Accounts", for each account -> click on "Actions: ..." and choose "Manage Keys", then remove key)
+ 2. Remove Credentials Files and Revoke API Keys:
+    * Remove all credentials*json files.
+    * Revoke the API keys associated with the Automatic Processing in the [Google Developer Console](https://console.developers.google.com/): go to "Service Accounts", and for each account click "Actions: ..." -> "Manage Keys", then remove the key.
+    * Revoke the service accounts' write access to the Google Spreadsheet.
-* revoke write access to to google spreadsheet to the service-accounts used by ap
## Configuration files
### BEAM_ENERGY.txt
-This file should contain a beam energy values (in eV). There must be one line with the default value and it's possible to define beam energy values, different from defaults for specific runs(scans).
+This file should contain beam energy values in electronvolts (eV). There must be one line with the default value; beam energy values different from the default can be defined for specific runs (scans).
Example:
-```
-$ cat BEAM_ENERGY.txt
+```
DEFAULT 11330.0
run9876 11001.2
run9870 12015.1
```
-(for the runs 9876 and 9870 - 11001.2 and 12015.1 photon beam energy will be used, while for any other - 11330.0)
+For runs 9876 and 9870, photon beam energies of 11001.2 and 12015.1 eV will be used, respectively. For any other run, 11330.0 eV will be applied as the default value.
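The per-run lookup with fallback to DEFAULT can be sketched with a small shell helper (illustrative only; this is not how the ap tool itself reads the file):

```shell
# Create a sample BEAM_ENERGY.txt (same format as above).
cat > BEAM_ENERGY.txt <<'EOF'
DEFAULT 11330.0
run9876 11001.2
run9870 12015.1
EOF

# Print the beam energy for a run, falling back to the DEFAULT line.
get_energy() {
  awk -v r="$1" '$1 == r { print $2; found = 1 }
                 $1 == "DEFAULT" { d = $2 }
                 END { if (!found) print d }' BEAM_ENERGY.txt
}

get_energy run9876   # 11001.2
get_energy run0001   # 11330.0 (falls back to DEFAULT)
```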
### DETECTOR_DISTANCE.txt
-This file should contain a detector distance (from sample to detector) in meter. Format is similar to BEAM_ENERGY.txt file, so for example:
+This file should contain the detector distance (from sample to detector) in meters. The format is similar to the BEAM_ENERGY.txt file.
+
+Example:
```
-$ cat DETECTOR_DISTANCE.txt
DEFAULT 0.09369
run9988 0.09212
run9977 0.09413
```
-(for runs 9988 and 9977 - 9.212cm and 9.413cm will be used as detector distance, for all other runs a default value of 9.369cm will be used)
+For runs 9988 and 9977, detector distances of 9.212 cm and 9.413 cm will be used, respectively. For all other runs, the default value of 9.369 cm will be applied.
### env_setup.sh
-During preparation [step](#usage1) this file should be filled (manually) with the proper values for the beamline name(alvra or bernina or ..), pgroup name (p12345), DETECTOR_NAME (JF17T16V01) used in experiment, THRESHOLD_INDEXING (can be changed, adapted, in run_index.sh file, see latter) and LOGBOOK (url to google spreadsheet which will be used for automatic filling)
+This file should be manually filled during the [preparation step](#usage1) with proper values for:
+
+ * Beamline name (e.g., alvra or bernina)
+ * Pgroup name (e.g., p12345)
+ * DETECTOR_NAME (e.g., JF17T16V01) used in the experiment
+ * THRESHOLD_INDEXING (can be changed/adapted in the run_index.sh file; see below)
+ * LOGBOOK (URL to the Google Spreadsheet used for automatic filling)
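+
+ A hypothetical env_setup.sh could therefore look like this (the variable names BEAMLINE and PGROUP and all values are illustrative assumptions; DETECTOR_NAME, THRESHOLD_INDEXING and LOGBOOK are the names used in this document):
+ ```bash
+ BEAMLINE=alvra
+ PGROUP=p12345
+ DETECTOR_NAME=JF17T16V01
+ THRESHOLD_INDEXING=20
+ LOGBOOK=https://docs.google.com/spreadsheets/d/...
+ ```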
### run_index.sh
-this file contains indexing parameters used by crystfel.
+This file contains indexing parameters utilized by crystfel.
-**HINT** - in case several proteins are used during expertiment, it's possible to define different indexing parameters for each of them: in case run_index..sh file is present - indexing parameters from that file will be used to process protein sample, if not present(default) - run_index.sh parameters are used
+**HINT**: If several proteins are used during the experiment, different indexing parameters can be defined for each of them: if a run_index..sh file is present for a protein, its parameters are used to process that protein's samples; otherwise the default run_index.sh parameters are used.
## Google Authentication
- ap can fill automatically google spreadsheet with different information. This is done using google-api and one need to have api-keys created and allowed for the corresponding spreadsheet (logbook). To create keys, few steps needs to be done first:
-- [enable API access for a project](https://docs.gspread.org/en/v5.10.0/oauth2.html#enable-api-access-for-a-project)
-- [create (*hint* - do several for same project) service accounts](https://docs.gspread.org/en/v5.10.0/oauth2.html#for-bots-using-service-account) (steps 1-4)
+ To enable automatic filling of the Google Spreadsheet, follow these steps:
-## Installation from source
-Automatic Processing tool can also be installed from scratch (put this to the place which is accessible from online computing nodes and accesible by other people, who will run ap tool):
-```
-$ git clone https://gitlab.psi.ch/sf-daq/ap.git # or via ssh with
- # git clone git@gitlab.psi.ch:sf-daq/ap.git
-```
-In case new conda environment is needed, please install following packages in that environment:
-```
-gspread numpy matplotlib
+ * [enable API access for a project](https://docs.gspread.org/en/v5.10.0/oauth2.html#enable-api-access-for-a-project)
+ * [create service accounts](https://docs.gspread.org/en/v5.10.0/oauth2.html#for-bots-using-service-account) (steps 1-4)
+
+## Installation from source
+
+The Automatic Processing tool can be installed from scratch. Place it in a location accessible from online computing nodes and reachable by individuals intending to use the ap tool.
+
+Steps for Installation:
+
+1. Clone the repository using HTTPS or SSH:
+```bash
+git clone https://gitlab.psi.ch/sf-daq/ap.git
+# or via ssh with
+git clone git@gitlab.psi.ch:sf-daq/ap.git
```
-In case of such installation from source, change correspondingly lines(at the end of the file) in [env_setup.sh](#config_env_setup) file
+2. If a new conda environment is needed, install the following packages in that environment:
+ * gspread
+ * numpy
+ * matplotlib
+
+When installing from source, remember to change the corresponding lines at the end of the [env_setup.sh](#env_setupsh) file.
## Roadmap
- For all SFX experiments at SwissFEL (Alvra, Bernina(SwissMX and pure Bernina one) and Cristallina) this service was used in years 2018-2023 and were running by authors of the code, which helped to make a fast changes and integration with other components as well as successful tuning this product to users needs. In 2013 successful steps were made to split tool to config and executable parts and beamtimes in June at Cristallina were running with config part fully under control of beamline people, in July - executable part was tested to be running under control of beamtime people of Alvra. That opens a possibility to start a migration of this service to tool.
+From 2018 to 2023, the Automatic Processing service was a vital component of Serial Femtosecond Crystallography (SFX) experiments conducted across various SwissFEL beamlines, including Alvra, Bernina (and SwissMX), and Cristallina. During this period, the authors of the code actively managed the tool, facilitating rapid changes and seamless integration with other experiment components. This collaborative effort ensured the continuous refinement and adaptation of the tool to meet the specific needs of users.
- Till now Automatic Processing were used for SFX experiments only, since they were a more demanding for this tool. But enhancement of tool to other types of experiments at SwissFEL is certainly possible.
+Significant strides were made in 2023 to split the tool into configuration and executable components. Notably, during a beamtime in June at Cristallina, the configuration part was entirely managed by beamline personnel. Subsequently, in July, successful tests were conducted at Alvra, where the executable part was operated by the beamtime personnel. These achievements open the possibility of migrating this service from an author-operated service to a self-service tool.
+
+Initially designed for SFX experiments due to their demanding nature, the Automatic Processing tool has demonstrated adaptability and potential for expansion to accommodate other experiment types at SwissFEL. The tool's versatility and robustness lay the groundwork for its potential application in diverse experimental setups beyond SFX.
## Authors and reference
Automatic Processing tool was made in 2018 by Karol Nass and Dmitry Ozerov.