Jungfraujoch/docs/JFJOCH_PROCESS.md
leonarski_f c69b5297d5 docs: add jfjoch_process page, refresh viewer/tools docs, unify CLI naming
- Add docs/JFJOCH_PROCESS.md describing the offline analysis tool, its
  options, output files, and the broker/viewer/process distinction; mention
  jfjoch_scale for re-scaling/merging.
- Rewrite docs/JFJOCH_VIEWER.md for consistency: functionality, HTTP env
  vars (JUNGFRAUJOCH_HTTP_HOST/PORT), command line, and the real D-Bus API.
- Refresh docs/TOOLS.md to the current set of tools; add both pages to index.rst.
- jfjoch_process: fix stale self-name (jfjoch_analysis -> jfjoch_process) in
  usage/license/logger.
- jfjoch_scale: unify --scaling-high-resolution with jfjoch_process (drop -D
  short flag, make it long-only) and remove dead p/q/i short options.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 07:38:36 +02:00

jfjoch_process

jfjoch_process is the offline crystallographic data-analysis tool of Jungfraujoch. It takes an existing HDF5 dataset, runs the full analysis pipeline — spot finding, indexing, geometry refinement, Bragg integration and (optionally) scaling and merging — and writes the results to a _process.h5 file, plus reflection files (.mtz/.cif/.hkl) when merging is requested.

It runs the same analysis code as the online and interactive tools, just driven from the command line over a file rather than a live detector stream.

Note. jfjoch_process is under very active development. This page describes the tool and its options at a high level; the authoritative, always-current list of options is the program's own usage message — run jfjoch_process with no arguments.

Where it fits among the three analysis tools

| Tool | Mode | Driven by | Output |
| --- | --- | --- | --- |
| jfjoch_broker | Online, real-time streaming analysis on FPGA + GPU | HTTP/REST + ZeroMQ | Live results and statistics, images streamed to jfjoch_writer |
| jfjoch_viewer | Interactive, on-screen exploration | Qt desktop application | Displayed on screen (results not saved to disk) |
| jfjoch_process | Offline batch processing of a stored dataset | Command-line interface | _process.h5, and .mtz/.cif/.hkl when merging |

Use jfjoch_process to re-analyse data after acquisition, to experiment with processing parameters, or to produce merged intensities for downstream structure solution.

Hardware

As with the rest of Jungfraujoch, serious performance requires an NVIDIA GPU. The CUDA build provides the GPU fast-feedback indexer (ffbidx) and the GPU FFT indexer (fft); without CUDA only the CPU fftw indexer is available. Spot finding, integration and scaling run on the CPU and scale with the thread count (-N).

Input and output

Input is a single Jungfraujoch HDF5 master file (NXmx-based). If the dataset already contains stored spot lists, two-pass rotation indexing can reuse them instead of re-running spot finding on the first pass.

Output (controlled by -o, --output-prefix, default output):

  • <prefix>_process.h5 — NXmx-compliant HDF5 with derived metadata (spots, indexing, integration, azimuthal integration, per-image statistics). See HDF5 / NeXus data format for the layout.
  • When merging (-M, or whenever a --reference-mtz is supplied), the merged reflections are written as <prefix>.mtz (default), or <prefix>.cif / <prefix>.hkl depending on --scaling-output. No-reference scaling additionally emits per-iteration <prefix>_iterN_scale.dat.

Merged statistics (⟨I/σ⟩, CC1/2, completeness, …), the error model and timing are printed to the console.
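After a run it can be handy to see which of the files listed above were actually written. The helper below is only a sketch: list_outputs is a made-up name, and the suffixes are the ones named on this page. A "missing" reflection file is expected whenever merging was not requested or a different --scaling-output format was chosen.

```shell
# List which of the documented output files exist for a given prefix.
# Suffixes are taken from this page; which reflection file appears
# depends on -M and --scaling-output.
list_outputs() {
    prefix=$1
    for suffix in _process.h5 .mtz .cif .hkl; do
        if [ -f "${prefix}${suffix}" ]; then
            echo "found:   ${prefix}${suffix}"
        else
            echo "missing: ${prefix}${suffix}"
        fi
    done
}

# Example: check the files of a run started with -o lyso_rot
list_outputs lyso_rot
```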

Re-scaling and re-merging (jfjoch_scale)

The companion tool jfjoch_scale re-scales and merges the already-integrated reflections stored in one or more _process.h5 files, without re-running spot finding or integration. Use it to re-merge quickly with a different space group, partiality model, resolution limit or reference MTZ, or to combine several processed runs into one set of merged intensities.

Quick start

Rotation data

Two-pass rotation indexing, rotation partiality, scale and merge in space group 96:

jfjoch_process rotation_master.h5 \
    -o lyso_rot -N 16 \
    -R -S 96 \
    -M -P rot

-R runs the two-pass rotation indexer (index the sweep once, then process every frame against that lattice); -P rot selects the rotation partiality model; -M scales and merges. For strong rotation data the de-novo FFT indexer often indexes more frames: add -X fft and leave out -C (as in the command above) so that it finds the cell from scratch.
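As a sketch, the FFT variant of the rotation command above could look as follows. The run wrapper and the lyso_rot_fft prefix are illustrative choices, not part of jfjoch_process; the wrapper only prints the command unless RUN=1 is set, so the line can be inspected before anything is started.

```shell
# Dry-run wrapper (hypothetical helper): prints the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Same rotation job as above, but with the de-novo FFT indexer and no -C.
# The output prefix lyso_rot_fft is an arbitrary choice for this example.
run jfjoch_process rotation_master.h5 \
    -o lyso_rot_fft -N 16 \
    -R -X fft -S 96 \
    -M -P rot
```

Using a separate output prefix keeps the results of both indexers side by side for comparison.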

Still / serial data

Known-cell indexing of independent stills with the GPU fast-feedback indexer, then merge against a reference structure:

jfjoch_process serial_master.h5 \
    -o lyso_serial -N 16 \
    -X ffbidx -C 79,79,38,90,90,90 -S 96 \
    --spot-sigma 4 \
    -M -z reference.mtz -r pixelrefine \
    --scaling-high-resolution 1.8

ffbidx requires a known cell (-C) and is the indexer of choice for sparse serial stills. -r pixelrefine selects the experimental reference-driven still integrator (it needs --reference-mtz, given here in its short form -z). For weak serial data, raising --spot-sigma from its default of 3.0 to 4 tightens spot finding and typically raises the indexing rate substantially.

Command-line options

General:

| Option | Description |
| --- | --- |
| -o, --output-prefix <txt> | Output file prefix (default: output) |
| -N, --threads <num> | Number of worker threads (default: 1) |
| -s, --start-image <num> | First image to process (default: 0) |
| -e, --end-image <num> | Last image to process (default: all) |
| -t, --stride <num> | Process every n-th image (default: 1) |
| -v, --verbose | Verbose output |
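To make the image-selection options concrete, the sketch below prints the image numbers that a given -s / -e / -t combination would cover. It assumes that the end image is inclusive and that the stride is counted from the start image; neither detail is stated above, so check the usage message if it matters. selected_images is a made-up helper name.

```shell
# Print the image numbers selected by a start (-s), end (-e) and stride (-t).
# Assumption: the end image is inclusive and the stride counts from the start.
selected_images() {
    i=$1; end=$2; stride=$3
    [ "$stride" -ge 1 ] || return 1   # guard against an endless loop
    while [ "$i" -le "$end" ]; do
        echo "$i"
        i=$((i + stride))
    done
}

# Images covered by: -s 100 -e 140 -t 10
selected_images 100 140 10
```

Under these assumptions the call prints 100, 110, 120, 130 and 140, one per line.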

Spot finding:

| Option | Description |
| --- | --- |
| --spot-sigma <num> | Noise sigma level for spot finding (default: 3.0) |
| --spot-threshold <num> | Photon-count threshold for spot finding (default: 10) |
| --spot-high-resolution <num> | High-resolution limit for spot finding, Å (default: 1.5) |
| --max-spots <num> | Maximum spot count (default: 250) |
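One way to experiment with spot-finding parameters is a small sweep that writes each setting to its own output prefix. The sketch below reuses the file name, cell and space group of the serial example above; the run wrapper, the function name and the sigma_<value> prefixes are invented for illustration. Commands are only printed unless RUN=1 is set.

```shell
# Dry-run wrapper (hypothetical helper): prints the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Process the same stills once per --spot-sigma value given as an argument,
# writing each run to its own prefix (sigma_3, sigma_4, ...).
sweep_spot_sigma() {
    for sigma in "$@"; do
        run jfjoch_process serial_master.h5 \
            -o "sigma_${sigma}" -N 16 \
            -X ffbidx -C 79,79,38,90,90,90 -S 96 \
            --spot-sigma "$sigma"
    done
}

sweep_spot_sigma 3 4 5
```

Each run then leaves its own sigma_<value>_process.h5, so the settings can be compared afterwards.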

Indexing:

| Option | Description |
| --- | --- |
| -X, --indexing-algorithm <txt> | FFBIDX, FFT, FFTW, Auto or None |
| -C, --unit-cell <cell> | Reference unit cell "a,b,c,alpha,beta,gamma" (required by ffbidx) |
| -S, --space-group <num> | Space group number (used for indexing and scaling) |
| -r, --refine <txt> | Geometry refinement: none, orientation, beam_and_lattice (default) or pixelrefine |
| -R, --two-pass-rotation[=num] | Two-pass offline rotation indexing (optional image count, default 30) |
| --single-pass-rotation[=num] | Online-like single-pass rotation indexing (optional min angular range, deg) |
| --redo-rotation-spots | Redo spot finding for the two-pass rotation first pass |
| --force-rotation-lattice <vec> | Force rotation lattice (9 floats, Å), skipping the first pass |

Indexer choice in brief: ffbidx (GPU) refines toward a known cell and is best for sparse serial stills; fft (GPU) / fftw (CPU) index de novo and suit strong rotation data. See the CPU/GPU data-analysis reference for the algorithms.
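The rule of thumb above can be written down as a small decision helper. This is only a restatement of the guidance on this page, not logic taken from jfjoch_process itself (which also offers an Auto setting); choose_indexer and its yes/no arguments are hypothetical.

```shell
# Suggest a value for -X from three answers.
#   $1: CUDA build available? (yes/no)
#   $2: unit cell known?      (yes/no)
#   $3: data type             (stills/rotation)
choose_indexer() {
    cuda=$1; known_cell=$2; data=$3
    if [ "$cuda" != yes ]; then
        echo fftw      # the only indexer available without CUDA
    elif [ "$data" = stills ] && [ "$known_cell" = yes ]; then
        echo ffbidx    # needs -C; suited to sparse serial stills
    else
        echo fft       # de-novo GPU indexer
    fi
}

choose_indexer yes yes stills
choose_indexer no yes rotation
```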

Scaling and merging:

| Option | Description |
| --- | --- |
| -M, --scale-merge | Scale and merge |
| -P, --partiality <txt> | Partiality model: fixed (default), rot or unity |
| -A, --anomalous | Anomalous mode (keep Friedel pairs separate) |
| -B, --refine-bfactor | Refine a per-image B-factor |
| -w, --wedge[=num] | Refine the per-image rotation wedge (optional starting value) |
| --scaling-high-resolution <num> | High-resolution limit for scaling, Å (default: no limit) |
| --min-partiality <num> | Minimum partiality to accept a reflection (default: 0.02) |
| --reject-outliers <num> | Per-observation outlier rejection, N σ from the per-reflection median (default: off) |
| --reject-delta-cchalf <num> | Drop images with ΔCC1/2 below mean − N·stddev (default: off) |
| --min-image-cc <num> | Per-image CC limit, percent (default: no limit) |
| --scaling-iterations <num> | Scaling iterations with no reference data (default: 3) |
| --scaling-output <txt> | Reflection output format: mtz (default), cif or txt |
| -z, --reference-mtz <file> | Reference MTZ (enables reference-driven scaling) |
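As a worked example of the --reject-delta-cchalf criterion, reading the cutoff as the mean minus N standard deviations, the sketch below takes one ΔCC1/2 value per image and prints the images that fall below it. The numbers are made up, the helper name is hypothetical, and the use of the population standard deviation and 0-based image numbers are assumptions; the tool's own calculation may differ in these details.

```shell
# Print the 0-based indices of images whose delta-CC1/2 lies below
# mean - N * stddev. Input: one value per line on stdin; $1 is N.
# Assumption: population standard deviation (divide by the image count).
images_below_cutoff() {
    awk -v n="$1" '
        { v[NR] = $1; sum += $1 }
        END {
            if (NR == 0) exit
            mean = sum / NR
            for (i = 1; i <= NR; i++) ss += (v[i] - mean) ^ 2
            cutoff = mean - n * sqrt(ss / NR)
            for (i = 1; i <= NR; i++) if (v[i] < cutoff) print i - 1
        }'
}

# Five images with made-up values: mean 0, stddev 2, cutoff -3 for N = 1.5
printf '%s\n' 1 1 1 1 -4 | images_below_cutoff 1.5
```

With these inputs only the last image (index 4) lies below the cutoff of -3.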

Pixel refinement (experimental; select with -r pixelrefine, requires --reference-mtz):

| Option | Description |
| --- | --- |
| --bandwidth <num> | Relative X-ray bandwidth FWHM (e.g. 0.01 for a 1% DMM); default from file or 0 (monochromatic) |
| --integration-radius <r> | Signal-box radius r1, or r1,r2,r3 (px). One value ⇒ r2=r1+2, r3=r1+4 |
| --profile-multiplier <num> | Scale the measured tangential profile width (default: 6; XDS-style generous aperture) |
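The single-value shorthand for --integration-radius amounts to simple arithmetic, sketched below. expand_radii is a made-up helper, and because it relies on shell integer arithmetic it only handles whole-pixel radii.

```shell
# Expand an --integration-radius argument into the three radii r1,r2,r3.
# Integer radii only (shell arithmetic); a full triple is passed through.
expand_radii() {
    case $1 in
        *,*,*) echo "$1" ;;                              # already r1,r2,r3
        *)     echo "$1,$(( $1 + 2 )),$(( $1 + 4 ))" ;;  # r2 = r1+2, r3 = r1+4
    esac
}

expand_radii 3        # prints 3,5,7
expand_radii 3,6,9    # prints 3,6,9
```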