
# Vespa data processing pipeline

Author: G. Assmann (2023)

## Vespa data processing (VDP)

This pipeline was created to run spot finding, indexing and integration of VESPA h5 files with CrystFEL (https://www.desy.de/~twhite/crystfel) in an automated manner. All files whose names include "vdp" belong to this pipeline.
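For orientation, here is a minimal sketch of what one such automated indexing step could look like when driven from Python. The paths, geometry file, peak finder and indexing method are illustrative assumptions, not the pipeline's actual defaults.

```python
import subprocess
from pathlib import Path

def index_run(run_dir: Path, geom: Path, out_stream: Path, n_threads: int = 16) -> None:
    """List a run's .h5 files and run spot finding/indexing/integration with indexamajig."""
    file_list = out_stream.with_suffix(".lst")
    file_list.write_text("\n".join(str(p) for p in sorted(run_dir.glob("*.h5"))) + "\n")
    subprocess.run(
        [
            "indexamajig",
            "-i", str(file_list),      # list of VESPA h5 files for this run
            "-o", str(out_stream),     # resulting CrystFEL stream file
            "-g", str(geom),           # detector geometry file
            "-j", str(n_threads),      # number of parallel worker processes
            "--peaks=peakfinder8",     # spot finding (assumed choice)
            "--indexing=xgandalf",     # indexing method (assumed choice)
        ],
        check=True,
    )

# Example call with placeholder paths:
# index_run(Path("/data/vespa/run0001"), Path("vespa.geom"), Path("run0001.stream"))
```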

An additional merging automation (VDM) is also included; files whose names include "vdm" (VESPA data merging) belong to this pipeline. It uses the same conda environment as VDP. The user chooses the run numbers to be merged; these should obviously come from the same protein, space group, etc. So far there is no check that the chosen run numbers actually belong to the same protein, so be careful! The user can also specify which timepoints should be merged; there are two possibilities (see the sketch after this list):

- 1 timepoint: every stream file in the corresponding folders is merged together, resulting in one merging folder and one HKL file.
- X timepoints: only the stream files of each specific timepoint are merged together, resulting in X merging folders and X HKL files in total. Each merging folder contains exactly one HKL file.
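A minimal sketch of this merging logic, assuming a naming scheme in which a stream file's name contains its timepoint; the folder layout, file patterns and partialator options are assumptions, not VDM's actual implementation.

```python
import subprocess
from pathlib import Path

def merge_by_timepoint(run_dirs: list[Path], timepoints: list[str],
                       out_root: Path, pointgroup: str) -> None:
    """Create one merging folder and one HKL file per requested timepoint."""
    for tp in timepoints:
        merge_dir = out_root / f"merge_{tp}"
        merge_dir.mkdir(parents=True, exist_ok=True)
        # With a single timepoint, all stream files of the chosen runs are merged;
        # with several timepoints, only the streams matching this timepoint are.
        pattern = "*.stream" if len(timepoints) == 1 else f"*{tp}*.stream"
        streams = [s for run in run_dirs for s in sorted(run.glob(pattern))]
        combined = merge_dir / f"{tp}.stream"
        combined.write_text("".join(s.read_text() for s in streams))
        # Merge into one HKL file per folder, e.g. with CrystFEL's partialator.
        subprocess.run(
            [
                "partialator",
                "-i", str(combined),
                "-o", str(merge_dir / f"{tp}.hkl"),
                "-y", pointgroup,    # point group symmetry used for merging
                "--model=unity",
            ],
            check=True,
        )
```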

## Installation

```bash
source /opt/gfa/python 3.10
conda create --name vdp python=3.10
conda activate vdp

pip install git+https://github.com/pgasparo/crystfelparser
pip install scipy matplotlib pandas h5py scikit-learn bitshuffle joblib seaborn ipython
pip install stomp.py loguru pyepics pyzmq numpy
pip install sseclient requests pathlib
```

## Execution

Have a look at section 2 in https://docs.google.com/document/d/11cP3m3qTd52bGmubK3JrWsMM_9QDvuo46fF4VUwqGZY/edit?usp=sharing