jfjoch_process
jfjoch_process is the offline crystallographic data-analysis tool of Jungfraujoch.
It takes an existing HDF5 dataset, runs the full analysis pipeline — spot finding, indexing,
geometry refinement, Bragg integration and (optionally) scaling and merging — and writes the
results to a _process.h5 file, plus reflection files (.mtz/.cif/.hkl) when merging is
requested.
It runs the same analysis code as the online and interactive tools, just driven from the command line over a file rather than a live detector stream.
Note.
jfjoch_process is under very active development. This page describes the tool and its options at a high level; the authoritative, always-current list of options is the program's own usage message: run jfjoch_process with no arguments.
Where it fits among the three analysis tools
| Tool | Mode | Driven by | Output |
|---|---|---|---|
| `jfjoch_broker` | Online, real-time streaming analysis on FPGA + GPU | HTTP/REST + ZeroMQ | Live results and statistics, images streamed to `jfjoch_writer` |
| `jfjoch_viewer` | Interactive, on-screen exploration | Qt desktop application | Displayed on screen (results not saved to disk) |
| `jfjoch_process` | Offline batch processing of a stored dataset | Command-line interface | `_process.h5`, and `.mtz`/`.cif`/`.hkl` when merging |
Use jfjoch_process to re-analyse data after acquisition, to experiment with processing
parameters, or to produce merged intensities for downstream structure solution.
Hardware
As with the rest of Jungfraujoch, serious performance requires an NVIDIA GPU. The CUDA build
provides the GPU fast-feedback indexer (ffbidx) and the GPU FFT indexer (fft); without CUDA
only the CPU fftw indexer is available. Spot finding, integration and scaling run on the CPU and
scale with the thread count (-N).
Input and output
Input is a single Jungfraujoch HDF5 master file (NXmx-based). If the dataset already contains stored spot lists, two-pass rotation indexing can reuse them instead of re-running spot finding on the first pass.
Output (controlled by -o, --output-prefix, default output):
- `<prefix>_process.h5`: NXmx-compliant HDF5 with derived metadata (spots, indexing, integration, azimuthal integration, per-image statistics). See HDF5 / NeXus data format for the layout.
- When merging (`-M`, or whenever a `--reference-mtz` is supplied), the merged reflections are written as `<prefix>.cif` (mmCIF, the default), or `<prefix>.mtz` / `<prefix>.hkl` depending on `--scaling-output`. Both the mmCIF and the MTZ carry the refined unit cell (from rotation indexing) and the space group determined from systematic absences (constrained to the indexed lattice symmetry). No-reference scaling additionally emits per-iteration `<prefix>_iterN_scale.dat`.
Merged statistics (⟨I/σ⟩, CC1/2, completeness, …), the error model and timing are printed to the console.
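The naming rules above can be sketched as a small helper. This is only an illustration, not part of jfjoch_process: the function name `expected_outputs` is hypothetical, and the mapping of the `txt` output format to the `.hkl` extension is an assumption inferred from the text, not confirmed by it. The per-iteration `_scale.dat` files are left out because their numbering is not specified here.

```python
def expected_outputs(prefix="output", merge=False, scaling_output="cif"):
    """List the main files a run is expected to write for a given prefix.

    Hypothetical helper for illustration only.
    """
    # ASSUMPTION: the "txt" format is the one written as <prefix>.hkl.
    extensions = {"cif": ".cif", "mtz": ".mtz", "txt": ".hkl"}
    files = [f"{prefix}_process.h5"]  # always written
    if merge:
        # Merged reflections are written only when merging is requested.
        files.append(prefix + extensions[scaling_output])
    return files


print(expected_outputs("lyso_rot", merge=True))
print(expected_outputs("lyso_rot", merge=True, scaling_output="mtz"))
```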
Re-scaling and re-merging (jfjoch_scale)
The companion tool jfjoch_scale re-scales and merges the already-integrated reflections stored
in one or more _process.h5 files, without re-running spot finding or integration. Use it to
re-merge quickly with a different space group, partiality model, resolution limit or reference MTZ,
or to combine several processed runs into one set of merged intensities.
Quick start
Rotation data
Index, integrate, scale and merge a rotation sweep, fully de novo:
jfjoch_process rotation_master.h5 \
-o lyso_rot -N 32 \
-M --scaling-high-resolution 1.4
Because the dataset carries a rotation goniometer axis, it is processed as rotation data by
default: two-pass rotation indexing (index the sweep once, then process every frame against that
lattice) with the rot3d partiality model (rotation partials combined into 3D fulls). -M
scales and merges; the unit cell is taken from the rotation indexer and the space group is
determined from systematic absences, and both are written into the merged .cif.
Run fully de novo (no -C/-S) for the best result — supplying a cell or space group up front
tends to degrade low-symmetry cases. --scaling-high-resolution (set it to your expected
resolution) sharpens both the space-group search and the error model. To tune the first pass use
--two-pass-rotation=100 (or -R100 — the first-pass image count); to force the sweep to be
treated as independent stills use --process-as-stills.
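For scripted runs, the rotation command above can be assembled as an argument list. The following is a minimal sketch that uses only flags described on this page; the helper name `rotation_command` and its parameters are hypothetical, and nothing is executed, the command line is only printed.

```python
import shlex


def rotation_command(master, prefix, threads, high_res, first_pass_images=None):
    """Build the argv for a de novo rotation run (hypothetical helper)."""
    cmd = [
        "jfjoch_process", master,
        "-o", prefix,
        "-N", str(threads),
        "-M",  # scale and merge
        "--scaling-high-resolution", str(high_res),
    ]
    if first_pass_images is not None:
        # Optional first-pass image count for two-pass rotation indexing.
        cmd.append(f"--two-pass-rotation={first_pass_images}")
    return cmd


cmd = rotation_command("rotation_master.h5", "lyso_rot", 32, 1.4,
                       first_pass_images=100)
print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd, check=True) to execute
```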
Still / serial data
A dataset with no goniometer axis (e.g. a serial grid scan) is processed as independent stills automatically — no flag needed. Known-cell indexing with the GPU fast-feedback indexer, then merge against a reference structure:
jfjoch_process serial_master.h5 \
-o lyso_serial -N 32 \
-X ffbidx -C 79,79,38,90,90,90 -S 96 \
--spot-sigma 4 \
-M -z reference.mtz \
--scaling-high-resolution 1.8
ffbidx requires a known cell (-C) and is the indexer of choice for sparse serial stills. For
weak serial data, tightening spot finding with --spot-sigma 4 typically raises the indexing rate
substantially. If a dataset does carry a goniometer axis but you want per-frame stills processing
anyway, add --process-as-stills.
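Because the effect of `--spot-sigma` on the indexing rate depends on the data, it can be worth trying a few values. The sketch below generates one command per sigma value, each with its own output prefix, using only the flags from the example above. The helper name `stills_command` and the prefix naming scheme are hypothetical; the commands are printed, not run.

```python
import shlex


def stills_command(master, prefix, spot_sigma, cell, space_group, reference,
                   threads=32, high_res=1.8):
    """Build the argv for a known-cell serial run (hypothetical helper)."""
    return [
        "jfjoch_process", master,
        "-o", prefix,
        "-N", str(threads),
        "-X", "ffbidx", "-C", cell, "-S", str(space_group),
        "--spot-sigma", str(spot_sigma),
        "-M", "-z", reference,
        "--scaling-high-resolution", str(high_res),
    ]


# One run per sigma value; separate prefixes keep the outputs apart.
commands = [
    stills_command("serial_master.h5", f"lyso_serial_sigma{sigma}", sigma,
                   "79,79,38,90,90,90", 96, "reference.mtz")
    for sigma in (3, 4, 5)
]
for c in commands:
    print(shlex.join(c))
```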
Command-line options
General:
| Option | Description |
|---|---|
| `-o, --output-prefix <txt>` | Output file prefix (default: output) |
| `-N, --threads <num>` | Number of worker threads (default: 1) |
| `-s, --start-image <num>` | First image to process (default: 0) |
| `-e, --end-image <num>` | Last image to process (default: all) |
| `-t, --stride <num>` | Process every n-th image (default: 1) |
| `-v, --verbose` | Verbose output |
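How `-s`, `-e` and `-t` combine can be pictured with a short sketch. It rests on assumptions that this page does not state: image numbers are taken as zero-based, the end image as inclusive, and the stride as counted from the start image. The function name `selected_images` is hypothetical; check the program's usage message for the actual behaviour.

```python
def selected_images(n_images, start=0, end=None, stride=1):
    """Return the image indices a run would visit (illustrative only).

    ASSUMPTIONS: zero-based indices, inclusive end image,
    stride counted from the start image.
    """
    last = n_images - 1 if end is None else min(end, n_images - 1)
    return list(range(start, last + 1, stride))


print(selected_images(10))                           # defaults: every image
print(selected_images(10, start=2, end=7, stride=2))
```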
Spot finding:
| Option | Description |
|---|---|
| `--spot-sigma <num>` | Noise sigma level for spot finding (default: 3.0) |
| `--spot-threshold <num>` | Photon-count threshold for spot finding (default: 10) |
| `--spot-high-resolution <num>` | High-resolution limit for spot finding, Å (default: 1.5) |
| `--max-spots <num>` | Maximum spot count (default: 250) |
Indexing:
A dataset with a rotation goniometer axis is processed as rotation data (two-pass rotation
indexing) by default; a dataset without one is processed as independent stills. --process-as-stills
overrides the former; the -R / --single-pass-rotation / --force-rotation-lattice flags request
rotation explicitly and pick the pass or lattice.
| Option | Description |
|---|---|
| `--process-as-stills` | Treat a rotation (goniometer) dataset as independent stills instead of rotation |
| `-X, --indexing-algorithm <txt>` | FFBIDX \| FFT \| FFTW \| Auto \| None |
| `-C, --unit-cell <cell>` | Reference unit cell "a,b,c,alpha,beta,gamma" (required by ffbidx) |
| `-S, --space-group <num>` | Space group number (used for indexing and scaling) |
| `-r, --refine <txt>` | Geometry refinement: none \| orientation \| beam_and_lattice (default) |
| `-R, --two-pass-rotation[=num]` | Two-pass offline rotation indexing (default for goniometer data; optional first-pass image count, default 100) |
| `--single-pass-rotation[=num]` | Online-like single-pass rotation indexing (optional min angular range, deg) |
| `--redo-rotation-spots` | Redo spot finding for the two-pass rotation first pass |
| `--force-rotation-lattice <vec>` | Force rotation lattice (9 floats, Å), skipping the first pass |
Indexer choice in brief: ffbidx (GPU) refines toward a known cell and is best for sparse
serial stills; fft (GPU) / fftw (CPU) index de novo and suit strong rotation data. See the
CPU/GPU data-analysis reference for the algorithms.
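The mode selection and the ffbidx requirement described in this section can be summarised in a few lines. This is a sketch of the documented rules only, not the program's actual logic; the function names `processing_mode` and `check_indexer` are hypothetical.

```python
def processing_mode(has_goniometer_axis, process_as_stills=False):
    """Rotation is the default only for datasets with a goniometer axis."""
    if has_goniometer_axis and not process_as_stills:
        return "rotation"
    return "stills"


def check_indexer(algorithm, unit_cell=None):
    """Raise if ffbidx is requested without a reference unit cell (-C)."""
    if algorithm.lower() == "ffbidx" and unit_cell is None:
        raise ValueError("ffbidx requires a known unit cell (-C)")
    return algorithm


print(processing_mode(True))                          # goniometer sweep
print(processing_mode(True, process_as_stills=True))  # forced stills
print(check_indexer("ffbidx", "79,79,38,90,90,90"))
```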
Scaling and merging:
| Option | Description |
|---|---|
| `-M, --scale-merge` | Scale and merge |
| `-P, --partiality <txt>` | Partiality model: fixed \| rot \| rot3d \| unity (default: rot3d for rotation data, fixed for stills). rot3d = rot + 3D combine of the per-frame partials into fulls |
| `-A, --anomalous` | Anomalous mode (keep Friedel pairs separate) |
| `-B, --refine-bfactor` | Refine a per-image B-factor |
| `-w, --wedge[=num]` | Refine the per-image rotation wedge (optional starting value) |
| `--scaling-high-resolution <num>` | High-resolution limit for scaling, Å (default: no limit) |
| `--min-partiality <num>` | Minimum partiality to accept a reflection (default: 0.02) |
| `--reject-outliers <num>` | Per-observation outlier rejection, N σ from the per-reflection median (default: 6 for rot3d, off otherwise) |
| `--reject-delta-cchalf <num>` | Drop images with ΔCC1/2 below mean − N·stddev (default: off) |
| `--min-image-cc <num>` | Per-image CC limit, percent (default: no limit) |
| `--scaling-iterations <num>` | Scaling iterations with no reference data (default: 3) |
| `--scaling-output <txt>` | Reflection output format: cif (mmCIF, default) \| mtz \| txt |
| `-z, --reference-mtz <file>` | Reference MTZ (enables reference-driven scaling) |
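Two of the defaults in this table depend on the data type and on the chosen partiality model. A minimal sketch of that dependency, as described in the table, follows; the function name `scaling_defaults` is hypothetical, and `None` is used here to stand for "off".

```python
def scaling_defaults(is_rotation, partiality=None):
    """Return (partiality model, outlier-rejection sigma or None)."""
    # Default model: rot3d for rotation data, fixed for stills.
    model = partiality or ("rot3d" if is_rotation else "fixed")
    # Outlier rejection defaults to 6 sigma for rot3d and is off otherwise.
    reject_sigma = 6.0 if model == "rot3d" else None
    return model, reject_sigma


print(scaling_defaults(is_rotation=True))
print(scaling_defaults(is_rotation=False))
print(scaling_defaults(is_rotation=True, partiality="rot"))
```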
Integration:
| Option | Description |
|---|---|
| `--integrator <txt>` | Spot integrator: gaussian (profile-fit, default) \| empirical \| boxsum (classical fallback) |
| `--integration-radius <r>` | Signal-box radius r1, or r1,r2,r3 (px). One value ⇒ r2=r1+2, r3=r1+4 |
| `--bandwidth <num>` | Relative X-ray bandwidth FWHM (e.g. 0.01 for a 1% DMM); default from file or 0 (monochromatic) |
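The `--integration-radius` shorthand can be illustrated with a small parser. It follows the rule stated in the table (one value gives r2 = r1 + 2 and r3 = r1 + 4); the function name `parse_integration_radius` is hypothetical and the program's own parsing may differ in detail, for example in how it treats whitespace or invalid input.

```python
def parse_integration_radius(text):
    """Expand 'r1' or 'r1,r2,r3' into a (r1, r2, r3) tuple in pixels."""
    parts = [float(p) for p in text.split(",")]
    if len(parts) == 1:
        r1 = parts[0]
        return (r1, r1 + 2, r1 + 4)  # documented single-value shorthand
    if len(parts) == 3:
        return tuple(parts)
    raise ValueError("expected one value (r1) or three values (r1,r2,r3)")


print(parse_integration_radius("3"))
print(parse_integration_radius("3,6,9"))
```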