# jfjoch_process

`jfjoch_process` is the **offline** crystallographic data-analysis tool of Jungfraujoch. It takes an existing HDF5 dataset, runs the full analysis pipeline (spot finding, indexing, geometry refinement, Bragg integration and, optionally, scaling and merging) and writes the results to a `_process.h5` file, plus reflection files (`.mtz`/`.cif`/`.hkl`) when merging is requested. It runs the *same* analysis code as the online and interactive tools, just driven from the command line over a file rather than a live detector stream.

> **Note.** `jfjoch_process` is under very active development. This page describes the tool and
> its options at a high level; the authoritative, always-current list of options is the program's
> own usage message: run `jfjoch_process` with no arguments.

## Where it fits among the three analysis tools

| Tool | Mode | Driven by | Output |
| --- | --- | --- | --- |
| [`jfjoch_broker`](JFJOCH_BROKER.md) | Online, real-time streaming analysis on FPGA + GPU | HTTP/REST + ZeroMQ | Live results and statistics, images streamed to [`jfjoch_writer`](JFJOCH_WRITER.md) |
| [`jfjoch_viewer`](JFJOCH_VIEWER.md) | Interactive, on-screen exploration | Qt desktop application | Displayed on screen (results not saved to disk) |
| **`jfjoch_process`** | **Offline batch processing of a stored dataset** | **Command-line interface** | **`_process.h5`, and `.mtz`/`.cif`/`.hkl` when merging** |

Use `jfjoch_process` to re-analyse data after acquisition, to experiment with processing parameters, or to produce merged intensities for downstream structure solution.

## Hardware

As with the rest of Jungfraujoch, **serious performance requires an NVIDIA GPU**. The CUDA build provides the GPU fast-feedback indexer (`ffbidx`) and the GPU FFT indexer (`fft`); without CUDA only the CPU `fftw` indexer is available. Spot finding, integration and scaling run on the CPU and scale with the thread count (`-N`).
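
The availability rule in the Hardware section can be written down as a small decision helper. The following is a minimal Python sketch, not part of Jungfraujoch: the function names `available_indexers` and `pick_indexer` are hypothetical, and the sketch assumes that a CUDA build still offers the CPU `fftw` indexer and that `ffbidx` is the preferred choice whenever a unit cell is known.

```python
def available_indexers(cuda_build: bool) -> list:
    """Indexer names usable in a given build (hypothetical helper).

    Assumption: the CPU `fftw` indexer is present in every build,
    while `ffbidx` and `fft` exist only in a CUDA build.
    """
    cpu_indexers = ["fftw"]
    gpu_indexers = ["ffbidx", "fft"]
    return gpu_indexers + cpu_indexers if cuda_build else cpu_indexers


def pick_indexer(cuda_build: bool, known_cell: bool) -> str:
    """Choose a value for `-X` (illustrative policy, not the tool's own logic)."""
    if cuda_build and known_cell:
        return "ffbidx"  # GPU indexer that needs a known unit cell
    if cuda_build:
        return "fft"     # GPU FFT indexer
    return "fftw"        # CPU fallback when CUDA is unavailable


if __name__ == "__main__":
    for cuda in (True, False):
        for cell in (True, False):
            print(f"cuda={cuda} known_cell={cell} -> {pick_indexer(cuda, cell)}")
```

The returned name is meant to be passed to `-X`; the policy is only a starting point and may not be the best choice for every dataset.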
## Input and output

**Input** is a single Jungfraujoch HDF5 master file (NXmx-based). If the dataset already contains stored spot lists, two-pass rotation indexing can reuse them instead of re-running spot finding on the first pass.

**Output** (controlled by `-o, --output-prefix`, default `output`):

- `_process.h5`: NXmx-compliant HDF5 with derived metadata (spots, indexing, integration, azimuthal integration, per-image statistics). See [HDF5 / NeXus data format](HDF5.md) for the layout.
- When merging (`-M`, or whenever a `--reference-mtz` is supplied), the merged reflections are written as `.mtz` (default), or `.cif` / `.hkl` depending on `--scaling-output`. No-reference scaling additionally emits per-iteration `_iterN_scale.dat`.

Merged statistics (⟨I/σ⟩, CC1/2, completeness, …), the error model and timing are printed to the console.

## Re-scaling and re-merging (`jfjoch_scale`)

The companion tool `jfjoch_scale` re-scales and merges the *already-integrated* reflections stored in one or more `_process.h5` files, without re-running spot finding or integration. Use it to re-merge quickly with a different space group, partiality model, resolution limit or reference MTZ, or to combine several processed runs into one set of merged intensities.

## Quick start

### Rotation data

Two-pass rotation indexing, rotation partiality, scale and merge in space group 96:

```
jfjoch_process rotation_master.h5 \
    -o lyso_rot -N 16 \
    -R -S 96 \
    -M -P rot
```

`-R` runs the two-pass rotation indexer (index the sweep once, then process every frame against that lattice); `-P rot` selects the rotation partiality model; `-M` scales and merges. For strong rotation data the de-novo FFT indexer often indexes more frames: add `-X fft` (and drop `-C` to let it find the cell from scratch).
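
When the rotation run is launched from a script, the same invocation can be assembled as an argument list. This is a minimal Python sketch under stated assumptions: `rotation_command` is a hypothetical helper, it uses only the flags shown in the rotation example (`-o`, `-N`, `-R`, `-S`, `-M`, `-P`, `-X`), and it only builds and prints the command rather than running it.

```python
import shlex


def rotation_command(master, prefix, threads, space_group, indexer=None):
    """Build the argv for a two-pass rotation run (hypothetical helper)."""
    argv = [
        "jfjoch_process", master,
        "-o", prefix,
        "-N", str(threads),
        "-R",                      # two-pass rotation indexing
        "-S", str(space_group),    # space group number
        "-M",                      # scale and merge
        "-P", "rot",               # rotation partiality model
    ]
    if indexer is not None:
        argv += ["-X", indexer]    # e.g. "fft" for de-novo indexing
    return argv


argv = rotation_command("rotation_master.h5", "lyso_rot", 16, 96)
print(shlex.join(argv))
```

To execute the run, the list can be handed to `subprocess.run(argv, check=True)`; passing a list rather than a single string avoids shell quoting problems with file names.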
### Still / serial data

Known-cell indexing of independent stills with the GPU fast-feedback indexer, then merge against a reference structure:

```
jfjoch_process serial_master.h5 \
    -o lyso_serial -N 16 \
    -X ffbidx -C 79,79,38,90,90,90 -S 96 \
    --spot-sigma 4 \
    -M -z reference.mtz -r pixelrefine \
    --scaling-high-resolution 1.8
```

`ffbidx` requires a known cell (`-C`) and is the indexer of choice for sparse serial stills. `-r pixelrefine` selects the experimental reference-driven still integrator (needs `--reference-mtz`). For weak serial data, tightening spot finding from the default sigma level of 3.0 to `--spot-sigma 4` typically raises the indexing rate.

## Command-line options

General:

| Option | Description |
| --- | --- |
| `-o, --output-prefix ` | Output file prefix (default: `output`) |
| `-N, --threads ` | Number of worker threads (default: 1) |
| `-s, --start-image ` | First image to process (default: 0) |
| `-e, --end-image ` | Last image to process (default: all) |
| `-t, --stride ` | Process every *n*-th image (default: 1) |
| `-v, --verbose` | Verbose output |

Spot finding:

| Option | Description |
| --- | --- |
| `--spot-sigma ` | Noise sigma level for spot finding (default: 3.0) |
| `--spot-threshold ` | Photon-count threshold for spot finding (default: 10) |
| `--spot-high-resolution ` | High-resolution limit for spot finding, Å (default: 1.5) |
| `--max-spots ` | Maximum spot count (default: 250) |

Indexing:

| Option | Description |
| --- | --- |
| `-X, --indexing-algorithm ` | `FFBIDX` \| `FFT` \| `FFTW` \| `Auto` \| `None` |
| `-C, --unit-cell ` | Reference unit cell `"a,b,c,alpha,beta,gamma"` (required by `ffbidx`) |
| `-S, --space-group ` | Space group number (used for indexing and scaling) |
| `-r, --refine ` | Geometry refinement: `none` \| `orientation` \| `beam_and_lattice` (default) \| `pixelrefine` |
| `-R, --two-pass-rotation[=num]` | Two-pass offline rotation indexing (optional image count, default 30) |
| `--single-pass-rotation[=num]` | Online-like single-pass rotation indexing (optional min angular range, deg) |
| `--redo-rotation-spots` | Redo spot finding for the two-pass rotation first pass |
| `--force-rotation-lattice ` | Force rotation lattice (9 floats, Å), skipping the first pass |

Indexer choice in brief: `ffbidx` (GPU) refines toward a **known cell** and is best for sparse serial stills; `fft` (GPU) / `fftw` (CPU) index **de novo** and suit strong rotation data. See the [CPU/GPU data-analysis reference](CPU_DATA_ANALYSIS.md) for the algorithms.

Scaling and merging:

| Option | Description |
| --- | --- |
| `-M, --scale-merge` | Scale and merge |
| `-P, --partiality ` | Partiality model: `fixed` (default) \| `rot` \| `unity` |
| `-A, --anomalous` | Anomalous mode (keep Friedel pairs separate) |
| `-B, --refine-bfactor` | Refine a per-image B-factor |
| `-w, --wedge[=num]` | Refine the per-image rotation wedge (optional starting value) |
| `--scaling-high-resolution ` | High-resolution limit for scaling, Å (default: no limit) |
| `--min-partiality ` | Minimum partiality to accept a reflection (default: 0.02) |
| `--reject-outliers ` | Per-observation outlier rejection, N σ from the per-reflection median (default: off) |
| `--reject-delta-cchalf ` | Drop images with ΔCC1/2 below mean − N·stddev (default: off) |
| `--min-image-cc ` | Per-image CC limit, percent (default: no limit) |
| `--scaling-iterations ` | Scaling iterations with no reference data (default: 3) |
| `--scaling-output ` | Reflection output format: `mtz` (default) \| `cif` \| `txt` |
| `-z, --reference-mtz ` | Reference MTZ (enables reference-driven scaling) |

Pixel refinement (experimental; select with `-r pixelrefine`, requires `--reference-mtz`):

| Option | Description |
| --- | --- |
| `--bandwidth ` | Relative X-ray bandwidth FWHM (e.g. `0.01` for a 1% DMM); default from file or 0 (monochromatic) |
| `--integration-radius ` | Signal-box radius `r1`, or `r1,r2,r3` (px). One value ⇒ `r2=r1+2`, `r3=r1+4` |
| `--profile-multiplier ` | Scale the measured tangential profile width (default: 6; XDS-style generous aperture) |
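
The expansion rule for `--integration-radius` (one value gives `r2=r1+2`, `r3=r1+4`) can be illustrated with a short Python sketch. It only mirrors the rule as documented in the table; `integration_radii` is a hypothetical name, and how the tool itself parses or validates the value (for example, whether two values are accepted) is an assumption here.

```python
def integration_radii(value: str):
    """Expand an `--integration-radius` string into (r1, r2, r3) in pixels.

    Hypothetical helper following the documented rule:
    a single value r1 implies r2 = r1 + 2 and r3 = r1 + 4.
    """
    parts = [float(p) for p in value.split(",")]
    if len(parts) == 1:
        r1 = parts[0]
        return (r1, r1 + 2, r1 + 4)
    if len(parts) == 3:
        return (parts[0], parts[1], parts[2])
    # Assumption: any other count is treated as an error in this sketch.
    raise ValueError("expected 'r1' or 'r1,r2,r3', got: " + value)


if __name__ == "__main__":
    print(integration_radii("3"))
    print(integration_radii("3,6,9"))
```

Giving all three radii explicitly overrides the `+2` / `+4` spacing, which is the only case where the outer radii differ from the defaults derived from `r1`.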