Jungfraujoch/docs/JFJOCH_PROCESS.md
leonarski_fandClaude Fable 5 347228d008
jfjoch_process: azimuthal-integration CLI + default 0.01 1/A q-spacing
Add -q/--azim-q-spacing, --azim-min-q, --azim-max-q, --azim-phi-bins (mirroring
jfjoch_azint) so offline processing can set the radial binning, applied before
the azint mapping is built. Set the AzimuthalIntegrationSettings default spacing
to 0.01 1/A (was 0.05): the coarse default barely resolved the narrow ice rings,
diluting the ice-ring score. Finer binning sharpens it a lot with no effect on
processing - EP_cs_01-17 ice score 4.6->7.3 (max 11->23), clean cytC stays ~1.0,
and space group / cell / ISa / completeness are unchanged (cytC, InsI3, MyoB,
pding4_001 verified full-image). Documented in JFJOCH_PROCESS.md.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 17:24:05 +02:00


jfjoch_process

jfjoch_process is the offline crystallographic data-analysis tool of Jungfraujoch. It takes an existing HDF5 dataset, runs the full analysis pipeline — spot finding, indexing, geometry refinement, Bragg integration and (optionally) scaling and merging — and writes the results to a _process.h5 file, plus reflection files (.mtz/.cif/.hkl) when merging is requested.

It runs the same analysis code as the online and interactive tools, just driven from the command line over a file rather than a live detector stream.

Note. jfjoch_process is under very active development. This page describes the tool and its options at a high level; the authoritative, always-current list of options is the program's own usage message — run jfjoch_process with no arguments.

Where it fits among the three analysis tools

| Tool | Mode | Driven by | Output |
|---|---|---|---|
| jfjoch_broker | Online, real-time streaming analysis on FPGA + GPU | HTTP/REST + ZeroMQ | Live results and statistics, images streamed to jfjoch_writer |
| jfjoch_viewer | Interactive, on-screen exploration | Qt desktop application | Displayed on screen (results not saved to disk) |
| jfjoch_process | Offline batch processing of a stored dataset | Command-line interface | _process.h5, and .mtz/.cif/.hkl when merging |

Use jfjoch_process to re-analyse data after acquisition, to experiment with processing parameters, or to produce merged intensities for downstream structure solution.

Hardware

As with the rest of Jungfraujoch, serious performance requires an NVIDIA GPU. The CUDA build provides the GPU fast-feedback indexer (ffbidx) and the GPU FFT indexer (fft); without CUDA only the CPU fftw indexer is available. Spot finding, integration and scaling run on the CPU and scale with the thread count (-N).

Input and output

Input is a single Jungfraujoch HDF5 master file (NXmx-based). If the dataset already contains stored spot lists, two-pass rotation indexing can reuse them instead of re-running spot finding on the first pass.

Output (controlled by -o, --output-prefix, default output):

  • <prefix>_process.h5 — NXmx-compliant HDF5 with derived metadata (spots, indexing, integration, azimuthal integration, per-image statistics). See HDF5 / NeXus data format for the layout.
  • When merging (-M, or whenever a --reference-mtz is supplied), the merged reflections are written as <prefix>.cif (mmCIF — the default), or <prefix>.mtz / <prefix>.hkl depending on --scaling-output. Both the mmCIF and the MTZ carry the refined unit cell (from rotation indexing) and the space group determined from systematic absences (constrained to the indexed lattice symmetry). No-reference scaling additionally emits per-iteration <prefix>_iterN_scale.dat.

Merged statistics (⟨I/σ⟩, CC1/2, completeness, …), the error model and timing are printed to the console.

Re-scaling and re-merging (jfjoch_scale)

The companion tool jfjoch_scale re-scales and merges the already-integrated reflections stored in one or more _process.h5 files, without re-running spot finding or integration. Use it to re-merge quickly with a different space group, partiality model, resolution limit or reference MTZ, or to combine several processed runs into one set of merged intensities.

Quick start

Rotation data

Index, integrate, scale and merge a rotation sweep, fully de novo:

jfjoch_process rotation_master.h5 \
    -o lyso_rot -N 32 \
    -M --scaling-high-resolution 1.4

Because the dataset carries a rotation goniometer axis, it is processed as rotation data by default: two-pass rotation indexing (index the sweep once, then process every frame against that lattice) with the rot3d partiality model (rotation partials combined into 3D fulls). -M scales and merges; the unit cell is taken from the rotation indexer and the space group is determined from systematic absences, and both are written into the merged .cif.

Run fully de novo (no -C/-S) for the best result — supplying a cell or space group up front tends to degrade low-symmetry cases. --scaling-high-resolution (set it to your expected resolution) sharpens both the space-group search and the error model. To tune the first pass use --two-pass-rotation=100 (or -R100 — the first-pass image count); to force the sweep to be treated as independent stills use --process-as-stills.

Still / serial data

A dataset with no goniometer axis (e.g. a serial grid scan) is processed as independent stills automatically; no flag is needed. The example below uses known-cell indexing with the GPU fast-feedback indexer, then merges against a reference MTZ:

jfjoch_process serial_master.h5 \
    -o lyso_serial -N 32 \
    -X ffbidx -C 79,79,38,90,90,90 -S 96 \
    --spot-sigma 4 \
    -M -z reference.mtz \
    --scaling-high-resolution 1.8

ffbidx requires a known cell (-C) and is the indexer of choice for sparse serial stills. For weak serial data, tightening spot finding with --spot-sigma 4 typically raises the indexing rate substantially. If a dataset does carry a goniometer axis but you want per-frame stills processing anyway, add --process-as-stills.
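
When experimenting with spot-finding strictness, it can help to generate one command line per setting and compare the indexing rates afterwards. The following is a minimal sketch that only builds the command strings and executes nothing. It uses the options shown in the example above; the function name `spot_sigma_scan` and the `scan_sigma<value>` output-prefix scheme are hypothetical choices for illustration.

```python
import shlex


def spot_sigma_scan(master_file, cell, space_group, sigmas, threads=32):
    """Build one jfjoch_process command line per --spot-sigma value.

    Nothing is executed; the caller decides how to run the commands.
    """
    commands = []
    for sigma in sigmas:
        argv = [
            "jfjoch_process", master_file,
            "-o", f"scan_sigma{sigma:g}",  # hypothetical prefix naming scheme
            "-N", str(threads),
            "-X", "ffbidx",                # ffbidx needs a known cell (-C)
            "-C", cell,
            "-S", str(space_group),
            "--spot-sigma", f"{sigma:g}",
        ]
        commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    for line in spot_sigma_scan("serial_master.h5", "79,79,38,90,90,90", 96, [3, 4, 5]):
        print(line)
```

Each run writes to its own prefix, so the resulting _process.h5 files can be compared side by side.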

Command-line options

General:

| Option | Description |
|---|---|
| -o, --output-prefix <txt> | Output file prefix (default: output) |
| -N, --threads <num> | Number of worker threads (default: 1) |
| -s, --start-image <num> | First image to process (default: 0) |
| -e, --end-image <num> | Last image to process (default: all) |
| -t, --stride <num> | Process every n-th image (default: 1) |
| -v, --verbose | Verbose output |
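
As a rough illustration of how -s, -e and -t combine, the sketch below lists the image indices that would be selected. It assumes zero-based indices and an inclusive end image, which is how the option descriptions read but is not confirmed here; the function `selected_images` is hypothetical and not part of Jungfraujoch.

```python
def selected_images(n_images, start=0, end=None, stride=1):
    """Return the image indices selected by start/end/stride.

    Assumptions (not confirmed by the tool): indices are zero-based and
    the end image is included in the selection.
    """
    if stride < 1:
        raise ValueError("stride must be at least 1")
    last = n_images - 1 if end is None else min(end, n_images - 1)
    return list(range(start, last + 1, stride))


if __name__ == "__main__":
    # Every third image of a 10-image dataset
    print(selected_images(10, stride=3))
    # Images 2..7, every second one
    print(selected_images(10, start=2, end=7, stride=2))
```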

Spot finding:

| Option | Description |
|---|---|
| --spot-sigma <num> | Noise sigma level for spot finding (default: 3.0) |
| --spot-threshold <num> | Photon-count threshold for spot finding (default: 10) |
| --spot-high-resolution <num> | High-resolution limit for spot finding, Å (default: 1.5) |
| --max-spots <num> | Maximum spot count (default: 250) |

Azimuthal integration (the radial profile behind the per-image ice-ring score):

| Option | Description |
|---|---|
| -q, --azim-q-spacing <num> | Q bin spacing, 1/Å (default: 0.01; finer spacing resolves the narrow ice rings) |
| --azim-min-q <num> | Minimum Q, 1/Å |
| --azim-max-q <num> | Maximum Q, 1/Å |
| --azim-phi-bins <num> | Number of azimuthal (phi) bins (default: 1) |
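
To see why the bin spacing matters for ice rings, the sketch below maps the first three hexagonal-ice rings onto radial bins for a coarse (0.05 1/Å) and a fine (0.01 1/Å) spacing. It assumes the convention q = 2π/d, a minimum Q of 0 and simple floor binning; these are illustrative assumptions, not a description of the Jungfraujoch implementation, and the d-spacings are approximate literature values.

```python
import math

# Approximate d-spacings (Å) of the first three hexagonal-ice rings
ICE_RING_D = (3.897, 3.669, 3.441)


def q_from_d(d_angstrom):
    """Convert a d-spacing to q, assuming q = 2*pi/d (1/Å)."""
    return 2.0 * math.pi / d_angstrom


def q_bin(q, spacing, min_q=0.0):
    """Index of the radial bin containing q (assumed floor binning)."""
    return math.floor((q - min_q) / spacing)


def ice_ring_bins(spacing, min_q=0.0):
    """Bin index of each ice ring for a given q spacing."""
    return [q_bin(q_from_d(d), spacing, min_q) for d in ICE_RING_D]


if __name__ == "__main__":
    for spacing in (0.05, 0.01):
        print(f"spacing {spacing} 1/A -> ice-ring bins {ice_ring_bins(spacing)}")
```

Under these assumptions neighbouring rings sit only about two bins apart at 0.05 1/Å, so each ring shares its bin with a lot of background, whereas at 0.01 1/Å they are roughly ten bins apart.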

Indexing:

A dataset with a rotation goniometer axis is processed as rotation data (two-pass rotation indexing) by default; a dataset without one is processed as independent stills. --process-as-stills overrides the former; the -R / --single-pass-rotation / --force-rotation-lattice flags request rotation explicitly and pick the pass or lattice.

| Option | Description |
|---|---|
| --process-as-stills | Treat a rotation (goniometer) dataset as independent stills instead of rotation |
| -X, --indexing-algorithm <txt> | FFBIDX, FFT, FFTW, Auto or None |
| -C, --unit-cell <cell> | Reference unit cell "a,b,c,alpha,beta,gamma" (required by ffbidx) |
| -S, --space-group <num> | Space group number (used for indexing and scaling) |
| -r, --refine <txt> | Geometry refinement: none, orientation or beam_and_lattice (default) |
| -R, --two-pass-rotation[=num] | Two-pass offline rotation indexing (default for goniometer data; optional first-pass image count, default 100) |
| --single-pass-rotation[=num] | Online-like single-pass rotation indexing (optional min angular range, deg) |
| --redo-rotation-spots | Redo spot finding for the two-pass rotation first pass |
| --force-rotation-lattice <vec> | Force rotation lattice (9 floats, Å), skipping the first pass |

Indexer choice in brief: ffbidx (GPU) refines toward a known cell and is best for sparse serial stills; fft (GPU) / fftw (CPU) index de novo and suit strong rotation data. See the CPU/GPU data-analysis reference for the algorithms.

Scaling and merging:

| Option | Description |
|---|---|
| -M, --scale-merge | Scale and merge |
| -P, --partiality <txt> | Partiality model: fixed, rot, rot3d or unity (default: rot3d for rotation data, fixed for stills). rot3d is rot plus a 3D combination of the per-frame partials into fulls |
| -A, --anomalous | Anomalous mode (keep Friedel pairs separate) |
| -B, --refine-bfactor | Refine a per-image B-factor |
| -w, --wedge[=num] | Refine the per-image rotation wedge (optional starting value) |
| --scaling-high-resolution <num> | High-resolution limit for scaling, Å (default: no limit) |
| --min-partiality <num> | Minimum partiality to accept a reflection (default: 0.02) |
| --reject-outliers <num> | Per-observation outlier rejection, N σ from the per-reflection median (default: 6 for rot3d, off otherwise) |
| --reject-delta-cchalf <num> | Drop images with ΔCC1/2 below mean − N·stddev (default: off) |
| --min-image-cc <num> | Per-image CC limit, percent (default: no limit) |
| --scaling-iterations <num> | Scaling iterations with no reference data (default: 3) |
| --scaling-output <txt> | Reflection output format: cif (mmCIF, default), mtz or txt |
| -z, --reference-mtz <file> | Reference MTZ (enables reference-driven scaling) |
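
The --reject-delta-cchalf criterion can be pictured with a small sketch. It reads the threshold as mean minus N standard deviations of the per-image ΔCC1/2 values and uses the population standard deviation; both are assumptions made for illustration, and the function `images_to_drop` is hypothetical rather than the code used by jfjoch_process.

```python
from statistics import mean, pstdev


def images_to_drop(delta_cc_half, n_sigma):
    """Indices of images whose delta-CC1/2 lies below mean - N * stddev.

    Assumption: population standard deviation over all images.
    """
    mu = mean(delta_cc_half)
    sd = pstdev(delta_cc_half)
    threshold = mu - n_sigma * sd
    return [i for i, value in enumerate(delta_cc_half) if value < threshold]


if __name__ == "__main__":
    # Made-up per-image values; the last image stands out as an outlier
    values = [0.0, 0.1, -0.1, 0.05, -0.05, -2.0]
    print(images_to_drop(values, n_sigma=2.0))
```

A larger N makes the cut more permissive: with the made-up values above, N = 2 flags only the last image and N = 3 flags none.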

Integration:

| Option | Description |
|---|---|
| --integrator <txt> | Spot integrator: gaussian (profile-fit, default), empirical or boxsum (classical fallback) |
| --integration-radius <r> | Signal-box radius r1, or r1,r2,r3 (px). A single value implies r2 = r1 + 2 and r3 = r1 + 4 |
| --bandwidth <num> | Relative X-ray bandwidth FWHM (e.g. 0.01 for a 1% DMM); default from file or 0 (monochromatic) |
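
The --integration-radius shorthand can be made concrete with a short sketch of how a single value expands into three radii, following the rule stated in the table (r2 = r1 + 2, r3 = r1 + 4). The parser below is a hypothetical illustration, not the argument handling used by jfjoch_process.

```python
def parse_integration_radius(text):
    """Expand an --integration-radius argument into (r1, r2, r3) in pixels.

    One value  -> r2 = r1 + 2, r3 = r1 + 4 (rule from the option table).
    Three values are returned unchanged.
    """
    values = [float(v) for v in text.split(",")]
    if len(values) == 1:
        r1 = values[0]
        return (r1, r1 + 2.0, r1 + 4.0)
    if len(values) == 3:
        return (values[0], values[1], values[2])
    raise ValueError("expected one value (r1) or three values (r1,r2,r3)")


if __name__ == "__main__":
    print(parse_integration_radius("3"))      # single value, r2 and r3 derived
    print(parse_integration_radius("3,6,9"))  # explicit radii
```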