next version of dap documentation

2023-11-22 16:13:22 +01:00
parent 1cfa2f5549
commit 42c6d8f157
1 changed files with 126 additions and 104 deletions
--- a/README.md
+++ b/README.md
@@ -1,41 +1,46 @@
-# dap
+# dap (Detector Analysis Pipeline)
-Detector Analysis Pipeline
+Runs on the detector data stream provided by [sf-daq](https://github.com/paulscherrerinstitute/sf_daq_broker)
 Runs on detector data stream produced by sf-daq
-## Installation
+# Installation
-At PSI-Swissfel there is already pre-installed package/conda environment to use with:
+
-```
+## Pre-Installed Package (PSI)
 At PSI, a pre-installed conda environment is available:
 ```bash
 source /sf/jungfrau/applications/miniconda3/etc/profile.d/conda.sh
 conda activate dap
 ```
-### To install from source
+## Installing from Source
-Create conda environment
+
-```
+Create and activate a conda environment:
 ```bash
 conda create -n test-dap cython numpy pyzmq jungfrau_utils
 conda activate test-dap
 ```
-Clone code of the dap and install peakfinder8_extension in the conda environment (in the same session where conda environment was made)
+
-```
+Clone and install dap
 ```bash
 git clone https://gitlab.psi.ch/sf-daq/dap.git
 cd dap
 make install
 ``` 
-## Architecture
+# Architecture
-Design of dap is made with the idea to scale horisontally and be able to process different algorithms(very different in computing complexity) running on data from detectors of different sizes (from 1 module to 32 modules detector). Independent **worker**s consumes  zeromq stream of detector data from sf-daq, run desired/selected algorithms on the frame it got from the stream and sends results : 
+The dap architecture is designed for horizontal scalability, processing various algorithms on detector data of different sizes. Each independent worker consumes a ZeroMQ stream from sf-daq, applies selected algorithms on received frames, and sends results:
-  * (frame with metadata information) to visualisation by [streamvis](https://github.com/paulscherrerinstitute/streamvis) 
+- Metadata-enriched frames to [streamvis](https://github.com/paulscherrerinstitute/streamvis).
-  * (only metadata information) to **accumulator**. 
+- Metadata-only results to the accumulator for storage.
  The purpose of the **accumulator** is to save the results of dap processing. Since dap currently runs inside a network which does not allow sending results as BS source(s), results are saved in the dap buffer and written permanently to data space upon user request to [sf-daq](https://github.com/paulscherrerinstitute/sf_daq_broker) ("save_dap_results: True" in the detector section of the request to sf-daq).
-### Worker
+## Worker
- Each [worker](https://gitlab.psi.ch/sf-daq/dap/-/blob/main/dap/worker.py) run completely independent from another worker, and do frames processing, getting them from sf-daq by zeromq stream. Detector data is sent as a raw (ADC) values, so before applying any algorithm, worker do a conversion to energy, using [jungfrau_utils](https://github.com/paulscherrerinstitute/jungfrau_utils) package. Input parameters for the worker:
+Each worker runs independently and processes frames received via ZeroMQ. Before applying algorithms, it converts raw (ADC) detector values to energy using [jungfrau_utils](https://github.com/paulscherrerinstitute/jungfrau_utils). 
- ```
+Input parameters:
- $ python dap/worker.py --help
+```bash
 python dap/worker.py --help
 usage: worker.py [-h] [--backend BACKEND] [--accumulator ACCUMULATOR]
                 [--accumulator_port ACCUMULATOR_PORT]
                 [--visualisation VISUALISATION]
@@ -58,12 +63,17 @@ options:
                        json file with peakfinder parameters
  --skip_frames_rate SKIP_FRAMES_RATE
                        send to streamvis each of skip_frames_rate frames
 ```
 The number of workers needed depends strongly on the detector size and the chosen algorithm. In the simplest case (a 1-module detector and an algorithm that thresholds the pixel energy values), 1-2 workers are enough, while a larger detector (8-16 or 32 modules) with a heavy algorithm like peakfinder8 needs more than a hundred workers. It is better to start each worker pinned to a particular processor core, since workers are CPU-bound. Because the workers are completely independent of each other, they can run on different nodes, distributing the load and increasing the number of workers.
 ### Accumulator
 The purpose of the accumulator is to collect the results of the processing done by the workers. Currently dap runs in a network which does not allow sending these results as BS source(s), so as a temporary workaround a sub-sample of the dap output (frame identification and intensity in the selected ROIs) is saved to the dap-buffer. Accumulator input parameters:
 ```
 The number of required workers varies based on detector size and algorithm complexity. 
 Workers can be pinned to specific processor cores and distributed across multiple nodes.
 ## Accumulator
 The accumulator collects the results produced by the workers. Because of network constraints, the results are temporarily saved to the dap-buffer and written to permanent storage upon a user request made to sf-daq.
 Input parameters:
 ```bash
 python dap/accumulator.py --help
 usage: accumulator.py [-h] [--accumulator ACCUMULATOR]
                      [--accumulator_port ACCUMULATOR_PORT]
@@ -76,117 +86,128 @@ options:
                        accumulator port
 ```
-### Implemented algorithms
+# Implemented algorithms
-There are several algorithms implemented in dap, following the request/need of the experiments. 
+ 
-
+   * **peakfinder Algorithm** 
   * peakfinder 
-     algorithm, based on peakfinder8 from cheetah package. Identifies peaks as connected pixels with the intensity above background, where background is radial averaged, determined iteratively excluding signal pixels.
+     This algorithm is based on peakfinder8 from the [cheetah package](https://www.desy.de/~barty/cheetah/Cheetah/Welcome.html). It identifies peaks as connected pixels exhibiting intensity above the background. The background is determined iteratively by radial averaging, excluding signal pixels.
-     Input parameters to algorithm:
+     Input parameters:
-       * `'do_peakfinder_analysis': 1/0` - to perform or not peakfinder8 algorithm
+       * `'do_peakfinder_analysis': 1/0` - Specifies whether to execute the peakfinder8 algorithm.
-       * `'beam_center_x/beam_center_y': float/float` (beam center in the detector coordinates)
+       * `'beam_center_x/beam_center_y': float/float` - Represents the beam center coordinates in the detector space.
-       * `'hitfinder_min_snr': float` - signal-to-noise value to discriminate background and signal
+       * `'hitfinder_min_snr': float` - Signal-to-noise value used to differentiate between background and signal.
-       * `hitfinder_min_pix_count': float` - minimum number of pixels to form a peak
+       * `'hitfinder_min_pix_count': float` - Sets the minimum pixel count required to constitute a peak.
-       * `hitfinder_adc_thresh: float` - exclude pixels below the threshold from peak formation/determination
+       * `'hitfinder_adc_thresh': float` - Excludes pixels below this threshold from peak determination.
-       * `'npeaks_threshold_hit': float` - threshold on number of found peaks to mark frame as *hit* or not
+       * `'npeaks_threshold_hit': float` - Threshold on the number of discovered peaks to categorize a frame as a hit or not.
-     Output of algorithm:
+     Algorithm Output:
-       * `'number_of_spots': int` - number of found peaks
+       * `'number_of_spots': int` - Indicates the count of identified peaks.
-       * `'spot_x/spot_y/spot_intensity': 3*list[float]` - coordinates and intensity of the found peaks in the frame
+       * `'spot_x/spot_y/spot_intensity': 3*list[float]` - Provides coordinates and intensity of the identified peaks within the frame.
-       * `'is_hit_frame': True/False` - mark frame as hit or not, in case number of found peaks are above defined threshold
+       * `'is_hit_frame': True/False` - Marks whether a frame qualifies as a hit based on the number of identified peaks exceeding the defined threshold.
-   * radial profile integration
+   * **Radial Profile Integration**
-      Input parameters to algorithm: 
+      This algorithm integrates pixel intensities radially based on defined parameters.
-       * `'do_radial_integration': 1/0` - to perform or not radial integration on dap
+
-       * `'beam_center_x/beam_center_y': float/float` - beam center in the detector coordinates
+      Input parameters: 
-       * `'apply_threshold': 1/0` - apply or not threshold to the pixel intensities before doing radial integaration
+       * `'do_radial_integration': 1/0` - Indicates whether radial integration should occur within dap.
-       * `'threshold_min/threshold_max': float/float` - in case of applying threshold - value of the threshold (threshold_max is applied only if it's value larger than threshold_min)
+       * `'beam_center_x/beam_center_y': float/float` - Specifies the beam center coordinates in the detector space.
-       * `'radial_integration_silent_min/radial_integration_silent_max': float/float` - if both values are present - normalize radial integrated profile to the region between that values). Needed to be able to combine frames, to exclude different beam intensity in each of it.
+       * `'apply_threshold': 1/0` - Determines whether to apply a threshold to pixel intensities before radial integration.
       * `'radial_integration_silent_min/radial_integration_silent_max': float/float` - If both values are present, normalizes the radial integrated profile within this specified range. This is crucial for frame combination to eliminate variations in beam intensity across frames.
     Output of algorithm:  
-       * `'radint_I': list[float]` - (in pixels coordinates) of radial integrated profile
+       * `'radint_I': list[float]` - Represents the radial integrated profile in pixel coordinates.
-       * `'radint_q' : list[float]` - (intensity of normalised intensity) if the profile
+       * `'radint_q' : [float, float]` - Represents the minimum and maximum x-coordinate values considered during integration in pixel coordinates.
-   * threshold on pixel intensity
+   * **Thresholding Pixel Intensity**
-     ignore measured pixel intensity if that intensity is above or below threshold values
+      This function disregards measured pixel intensity falling above or below specified threshold values.
-     Input parameters to algorithm:
+     Algorithm Input Parameters:
-       * `'apply_threshold': 1/0` - apply or not threshold to pixel intensity
+       * `'apply_threshold': 1/0` - Enables or disables the application of threshold to pixel intensity.
-       * `'threshold_min/threshold_max': float/float` - in case of applying threshold - value of the threshold (threshold_max is applied only if it's value larger than threshold_min)
+       * `'threshold_min/threshold_max': float/float` - Specifies threshold values. If applied, `threshold_max` is enforced only when its value surpasses `threshold_min`.
-       * `'threshold_value': 0/NaN` - subsitute pixel intensity by 0 or NaN value, if pixels intensity is outside of the defined values
+       * `'threshold_value': 0/NaN` - Replaces pixel intensity with either 0 or NaN if the pixel intensity falls outside the defined threshold values.
-   * ROI processing
+   * **Region of Interest (ROI) Processing**
-     it's possible to define multiple ROI's on the detector and dap will output results for each of that ROI. Algorithm to threshold pixel intensity will be applied before roi processing, if requested.
+     dap allows the definition of multiple ROIs on the detector, and it generates output results for each defined ROI. Prior to ROI processing, the algorithm to threshold pixel intensity can be applied if requested.
     Input parameters:
-       * `'roi_x1/roi_x2/roi_y1/roi_y2': 4*list[float]` - coordinates of ROI's
+       * `'roi_x1/roi_x2/roi_y1/roi_y2': 4*list[float]` - Specifies the coordinates of the ROIs.
-     Output of algorithm:
+     Algorithm Output:
-       * `'roi_intensities': list[float]` - summ of the intensities of pixels in roi
+       * `'roi_intensities': list[float]` - Sum of pixel intensities within the ROI.
-       * `'roi_intensities_normalised': list[float]` - intensity inside the defined ROI normalised to the ROI size or number of active pixels inside the ROI 
+       * `'roi_intensities_normalised': list[float]` - Intensity within the defined ROI normalized to the ROI size (the count of active pixels within the ROI).
-       * `'roi_intensities_x': list(float, float)` - x1/x2 (left/right) x-coordinates of the roi
+       * `'roi_intensities_x': list(float, float)` - x1/x2 (left/right) x-coordinates of the ROI.
-       * `'roi_intensities_proj_x': list(value)` - projection on the x-coordinate of pixel intensities (sum) 
+       * `'roi_intensities_proj_x': list(value)` - Projection onto the x-coordinate of pixel intensities (sum). 
-   * High intensity frame determination (was used for SPI experiments)
+   * **Detecting Frames with High Intensity in Specific Regions**
-     Determine and mark if frame contains signal in a certain regions. It's a simple algorithm which uses "ROI processing" results and compares normalised intensities in first two ROI's to the threshold values. In case any of the threshold is exceeded - frame marked as hit.
+     This algorithm identifies frames containing signals within defined regions. It leverages "ROI processing" outcomes by comparing normalized intensities in the first two ROIs against predetermined thresholds. If any threshold is exceeded, the frame is labeled as a *hit*.
     Input parameters:
-       * `'do_spi_analysis': 1/0` - run determination algorithm
+       * `'do_spi_analysis': 1/0` - Initiates the determination algorithm.
-       * `'roi_x1/roi_x2/roi_y1/roi_y2': 4*list[float]` - coordinates of (at least) two ROI's
+       * `'roi_x1/roi_x2/roi_y1/roi_y2': 4*list[float]` - Coordinates of (at least) two ROIs.
-       * `'spi_limit': list(float, float)` - threshold values for first to ROI's
+       * `'spi_limit': list(float, float)` - Threshold values for first two ROIs.
-     Output of algorithm:
+     Algorithm Output:
-       * `'is_hit_frame': True/False` - mark frame as a hit, in case intensity in at least one ROI is above the threshold
+       * `'is_hit_frame': True/False` - Marks frame as a *hit* if intensity in at least one ROI surpasses the threshold.
-       * `'number_of_spots': int` - 0: if intenisty in both ROI's are below corresponding threshold; 25 - ROI1 is energetic, but not ROI2; 50 - ROI2 is energetic, but not ROI1; 75 - intensities in both ROI's are above the threshold  
+       * `'number_of_spots': int` - Indicates:
         *  0: intensity in both ROIs is below the respective thresholds
         * 25: intensity in ROI1 exceeds its threshold, intensity in ROI2 does not
         * 50: intensity in ROI2 exceeds its threshold, intensity in ROI1 does not
         * 75: intensities in both ROIs exceed the thresholds  
-   * Frame aggregation
+   * **Frame Aggregation**
-
+  
-      For small intensity signal, which is hard to see on single frame, it's possible to aggregate frames on dap level and send resulting (aggregated) frame to visualisation. Note, however, that such aggregation is happening on every worker independently, so number of frames to aggregate should be reasonable compared to number of running workers of dap for that detector. Aggregation does not influence any other algorithms - they are running on real frames. Algorithm to make a threshold can be applied before frames aggregation.
+      For faint signals that are hard to see in a single frame, dap can aggregate frames and send the resulting aggregated frame to visualization. Note that this aggregation happens independently in each worker, so the number of frames to aggregate should be reasonable compared to the number of dap workers running for that detector. Aggregation does not affect the other algorithms, which always operate on individual frames. The threshold algorithm can be applied before frames are aggregated.
      Input parameters:
-        * `'apply_aggregation' : 1/0` - apply or not aggregation to frames before sending them to visualisation  
+        * `'apply_aggregation' : 1/0` - Enables or disables frame aggregation before transmission to visualization.
-        * `'aggregation_max': int` -number of frames to aggregate before sending them to visualisation. Note that this value is per worker
+        * `'aggregation_max': int` - Specifies the number of frames to aggregate before transmitting to visualization. This value applies per worker.
-      Output of algorithm:
+      Algorithm Output:
-        * `'aggregated_images': int` - number of aggregated images  
+        * `'aggregated_images': int` - Indicates the count of aggregated images. 
-   * Frame labeling
+   * **Frame Tagging**
-      In case if the event propagation is implemented for the detector (so some event information is known from the detector header), frame can be marked accordingly on dap level to enable frame sorting on visualisation step.
+      When event propagation is integrated into the detector, dap allows frames to be tagged accordingly, facilitating their categorization during visualization.
-
+      
-      Currently implemented markers:
+      Presently supported markers:
-        * `'laser_on': True/False` - frame is marked as laser activated if darkshot event code is False and laser event code is True; in all other cases frame marked as not laser activated
+        * `'laser_on': True/False` - Marks frames as "laser activated" when the darkshot event code is False and the laser event code is True. Otherwise, frames are labeled as "not laser activated".
-   * Saturated pixels
+   * **Detection of Saturated Pixels**
-      Analysis on number and position of saturated pixels is done for each frame received by the dap. 
+      For every frame received by dap, an analysis is performed to ascertain the quantity and positions of saturated pixels.
-      Output of algorithm:
+      Algorithm Output:
-        * `'saturated_pixels': int` - number of saturated pixels in the frame
+        * `'saturated_pixels': int` - Number of saturated pixels within the frame.
-        * `'saturated_pixels_x/saturated_pixels_y': list[float]/list[float]` - coordinates of saturated pixels  
+        * `'saturated_pixels_x/saturated_pixels_y': list[float]/list[float]` - Coordinates of the saturated pixels.  
-   * Parameters propagated from dap input file to visualisation
+   * **Transmitted Parameters from dap Input to Visualization**
-      Some of the input parameters to dap are not used by the dap processing, but propagated to visualisation and used there to display certain characteristics of data
+      Certain input parameters in dap remain unused during the dap processing phase. However, these parameters are transmitted to the visualization component, where they serve to depict specific data characteristics:
-         * 'disabled_modules': list() (list of the number of disabled modules to visualise them)
+       * `'disabled_modules': list[int]` - Enumerates the disabled module numbers for visualization.
-         * 'detector_distance': value (distance from sample to detector in meters)
+       * `'detector_distance': float`  - Distance between sample and detector in meters.
-         * 'beam_energy': value (photon beam energy in eV)
+       * `'beam_energy': float` -  Photon beam energy in eV.
-   * Apply additonal mask
+   * **Implement Additional Masking**
-       For sensitive algorithms (like peakfinder) it may be important to remove some of the detector parts (defective pixels, edge pixels of the modules.. ). Currently definition of that detector parts are made manually and hardcoded in the worker.py code. Activation of that removal is done with `'apply_additional_mask': 0/1` input flag                
+       Sensitive algorithms, such as the peakfinder, often necessitate the exclusion of specific detector regions (like defective or module edge pixels). Presently, defining these detector segments requires manual hardcoding within the worker.py code.
       Use the `'apply_additional_mask': 0/1` input flag to enable this functionality.
-### Input parameters (file)
+   * **Filter based on pulse picker information**
       If event propagation is available for the detector and the pulse picker information is correctly configured for propagation, frames can be filtered on pulse picker information by using the
       `'select_only_ppicker_events': 0/1` input flag.
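
As a rough illustration of how the thresholding, ROI processing and high-intensity frame determination described in the list above fit together, here is a minimal pure-Python sketch. The function names (`apply_threshold`, `roi_intensity`, `spi_flag`), the half-open ROI bounds and the list-of-rows frame layout are assumptions made for this example; it is not the code used in worker.py.

```python
import math


def apply_threshold(frame, threshold_min, threshold_max, threshold_value=math.nan):
    """Replace pixels outside the threshold range by threshold_value (0 or NaN).

    threshold_max is applied only if it is larger than threshold_min.
    """
    use_max = threshold_max > threshold_min
    result = []
    for row in frame:
        new_row = []
        for value in row:
            too_low = value < threshold_min
            too_high = use_max and value > threshold_max
            new_row.append(threshold_value if (too_low or too_high) else value)
        result.append(new_row)
    return result


def roi_intensity(frame, x1, x2, y1, y2):
    """Return (sum, sum normalised to the number of active pixels) for one ROI.

    Assumption: the ROI covers x1 <= x < x2 and y1 <= y < y2, and NaN pixels
    are treated as inactive.
    """
    pixels = [v for row in frame[y1:y2] for v in row[x1:x2] if not math.isnan(v)]
    total = sum(pixels)
    normalised = total / len(pixels) if pixels else 0.0
    return total, normalised


def spi_flag(roi_normalised, spi_limit):
    """Compare the first two normalised ROI intensities to their thresholds.

    Returns (is_hit_frame, number_of_spots) with number_of_spots in
    {0, 25, 50, 75}, following the encoding listed above.
    """
    above_1 = roi_normalised[0] > spi_limit[0]
    above_2 = roi_normalised[1] > spi_limit[1]
    number_of_spots = 25 * above_1 + 50 * above_2
    return (above_1 or above_2), number_of_spots


if __name__ == "__main__":
    frame = [[1.0, 5.0, 100.0],
             [2.0, 6.0, 7.0]]
    thresholded = apply_threshold(frame, threshold_min=2.0, threshold_max=50.0)
    rois = [(0, 2, 0, 2), (2, 3, 0, 2)]  # (x1, x2, y1, y2), hypothetical values
    normalised = [roi_intensity(thresholded, *roi)[1] for roi in rois]
    print(spi_flag(normalised, spi_limit=[4.0, 10.0]))
```

Keeping the three steps as separate functions mirrors the order given in the text: the threshold is applied first, the ROI sums are computed on the thresholded frame, and the hit flag is derived only from the ROI results.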
- All input parameters to algorithms described in the previous section, are specified in the json file which provided as an input to the worker.py application (--peakfinder_parameters option). Each of the worker constantly monitor update to that json file and in case of the change - re-loads that file to apply new set of parameters to the data stream.
+## Input parameters (File)
- Example of the json file:
+Algorithm input parameters are specified in a JSON file passed to worker.py (`--peakfinder_parameters`). Each worker constantly monitors this file and reloads it when it changes, so that the new parameters are applied to the data stream.
- ```
+
 Example JSON:
 ```json
 {
    "beam_center_x": 1119.0,
    "beam_center_y": 1068.0,
@@ -208,6 +229,7 @@ There are several algorithms implemented in dap, following the request/need of t
    "do_radial_integration": 0,
    "do_spi_analysis": 0,
    "threshold_value": "NaN",
    "select_only_ppicker_events": 0,
    "disabled_modules": [],
    "roi_x1": [],
    "roi_y1": [],
@@ -215,8 +237,8 @@ There are several algorithms implemented in dap, following the request/need of t
    "roi_y2": []
 }
 ```
 This is an example of a real dap input parameter file for the 4M detector at Alvra. The peakfinder is selected to process frames; no aggregation, thresholding, module disabling or radial integration is requested.
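
The reload-on-change behaviour of the parameter file could look roughly like the following sketch. It assumes that a change is detected through the file modification time; the class name `ParameterFile` and its `refresh` method are hypothetical, and the actual worker may detect updates differently.

```python
import json
import os


class ParameterFile:
    """Keeps the most recently loaded parameters of a JSON file (hypothetical helper)."""

    def __init__(self, path):
        self.path = path
        self.mtime_ns = None  # modification time of the last successful load
        self.params = {}

    def refresh(self):
        """Reload the file if its modification time changed.

        Returns True when new parameters were loaded. If the file is
        missing or only partially written, the previous parameters are
        kept and the load is retried on the next call.
        """
        try:
            mtime_ns = os.stat(self.path).st_mtime_ns
            if mtime_ns == self.mtime_ns:
                return False
            with open(self.path) as f:
                new_params = json.load(f)
        except (OSError, ValueError):
            return False
        self.params = new_params
        self.mtime_ns = mtime_ns
        return True
```

A worker loop would call `refresh()` once per frame (or every few frames) and read the current values from `params`, for example `params.get("apply_threshold", 0)`.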
-## Acknowledgment
+# Acknowledgment
 Special thanks to Valerio Mariani for providing the cython implementation of peakfinder8.
 Initially, dap was made for SFX analysis, to run a peak finder on a stream of data. peakfinder8 from cheetah was used for this purpose; many thanks to Valerio Mariani, who provided the cython implementation of peakfinder8.