leonarski_f c69b5297d5 docs: add jfjoch_process page, refresh viewer/tools docs, unify CLI naming
- Add docs/JFJOCH_PROCESS.md describing the offline analysis tool, its
  options, output files, and the broker/viewer/process distinction; mention
  jfjoch_scale for re-scaling/merging.
- Rewrite docs/JFJOCH_VIEWER.md for consistency: functionality, HTTP env
  vars (JUNGFRAUJOCH_HTTP_HOST/PORT), command line, and the real D-Bus API.
- Refresh docs/TOOLS.md to the current set of tools; add both pages to index.rst.
- jfjoch_process: fix stale self-name (jfjoch_analysis -> jfjoch_process) in
  usage/license/logger.
- jfjoch_scale: unify --scaling-high-resolution with jfjoch_process (drop -D
  short flag, make it long-only) and remove dead p/q/i short options.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 07:38:36 +02:00

# jfjoch_process
`jfjoch_process` is the **offline** crystallographic data-analysis tool of Jungfraujoch.
It takes an existing HDF5 dataset, runs the full analysis pipeline — spot finding, indexing,
geometry refinement, Bragg integration and (optionally) scaling and merging — and writes the
results to a `_process.h5` file, plus reflection files (`.mtz`/`.cif`/`.hkl`) when merging is
requested.
It runs the *same* analysis code as the online and interactive tools, but is driven from the
command line and reads a stored file instead of a live detector stream.
> **Note.** `jfjoch_process` is under very active development. This page describes the tool and
> its options at a high level; the authoritative, always-current list of options is the program's
> own usage message — run `jfjoch_process` with no arguments.
## Where it fits among the three analysis tools
| Tool | Mode | Driven by | Output |
| --- | --- | --- | --- |
| [`jfjoch_broker`](JFJOCH_BROKER.md) | Online, real-time streaming analysis on FPGA + GPU | HTTP/REST + ZeroMQ | Live results and statistics, images streamed to [`jfjoch_writer`](JFJOCH_WRITER.md) |
| [`jfjoch_viewer`](JFJOCH_VIEWER.md) | Interactive, on-screen exploration | Qt desktop application | Displayed on screen (results not saved to disk) |
| **`jfjoch_process`** | **Offline batch processing of a stored dataset** | **Command-line interface** | **`_process.h5`, and `.mtz`/`.cif`/`.hkl` when merging** |
Use `jfjoch_process` to re-analyse data after acquisition, to experiment with processing
parameters, or to produce merged intensities for downstream structure solution.
## Hardware
As with the rest of Jungfraujoch, **serious performance requires an NVIDIA GPU**. The CUDA build
provides the GPU fast-feedback indexer (`ffbidx`) and the GPU FFT indexer (`fft`); without CUDA
only the CPU `fftw` indexer is available. Spot finding, integration and scaling run on the CPU and
scale with the thread count (`-N`).
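
Since the CPU stages scale with `-N`, one reasonable starting point is to match the thread count to the available cores. The following is a minimal sketch, assuming a Linux-like host that provides `nproc` or `getconf`; the input name `data_master.h5` and the prefix `data` are placeholders, and the command is only printed, not executed.

```shell
# Pick a thread count for -N from the number of online CPU cores.
# Falls back to 1 (the documented default) if the count cannot be determined.
threads=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Placeholder input file and prefix; printed rather than executed.
echo "jfjoch_process data_master.h5 -o data -N ${threads}"
```

On a shared machine it may be preferable to pass a smaller value than the full core count.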
## Input and output
**Input** is a single Jungfraujoch HDF5 master file (NXmx-based). If the dataset already contains
stored spot lists, two-pass rotation indexing can reuse them instead of re-running spot finding on
the first pass.
**Output** (controlled by `-o, --output-prefix`, default `output`):
- `<prefix>_process.h5` — NXmx-compliant HDF5 with derived metadata (spots, indexing,
integration, azimuthal integration, per-image statistics). See
[HDF5 / NeXus data format](HDF5.md) for the layout.
- When merging (`-M`, or whenever a `--reference-mtz` is supplied), the merged reflections are
written as `<prefix>.mtz` (default), or `<prefix>.cif` / `<prefix>.hkl` depending on
`--scaling-output`. No-reference scaling additionally emits per-iteration `<prefix>_iterN_scale.dat`.
Merged statistics (⟨I/σ⟩, CC1/2, completeness, …), the error model and timing are printed to the
console.
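
The naming scheme above can be sketched as a small helper that lists the files to expect for a given prefix. This is an illustration only: `expected_outputs` is a hypothetical function, and the mapping of the `--scaling-output` value `txt` to the `.hkl` extension is an assumption inferred from the option table, not confirmed behaviour. The per-iteration `_scale.dat` files are left out.

```shell
# Print the files a run is expected to produce for a given prefix.
# Assumption: --scaling-output values mtz/cif/txt map to .mtz/.cif/.hkl.
expected_outputs() {
    prefix=$1
    format=${2:-}          # empty means no merging was requested
    echo "${prefix}_process.h5"
    case "$format" in
        "")  ;;
        mtz) echo "${prefix}.mtz" ;;
        cif) echo "${prefix}.cif" ;;
        txt) echo "${prefix}.hkl" ;;
        *)   echo "unknown format: $format" >&2; return 1 ;;
    esac
}

expected_outputs lyso_rot mtz
```

A wrapper script could use such a list to check that a run finished before starting downstream steps.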
## Re-scaling and re-merging (`jfjoch_scale`)
The companion tool `jfjoch_scale` re-scales and merges the *already-integrated* reflections stored
in one or more `_process.h5` files, without re-running spot finding or integration. Use it to
re-merge quickly with a different space group, partiality model, resolution limit or reference MTZ,
or to combine several processed runs into one set of merged intensities.
## Quick start
### Rotation data
Two-pass rotation indexing, rotation partiality, scale and merge in space group 96:
```
jfjoch_process rotation_master.h5 \
-o lyso_rot -N 16 \
-R -S 96 \
-M -P rot
```
`-R` runs the two-pass rotation indexer (index the sweep once, then process every frame against
that lattice); `-P rot` selects the rotation partiality model; `-M` scales and merges. For strong
rotation data the de-novo FFT indexer often indexes more frames: add `-X fft` and omit `-C` so
that it finds the cell from scratch.
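
As a sketch of that FFT variant, the rotation command can be written with `-X fft` added and no `-C`. The prefix `lyso_rot_fft` is a placeholder chosen here so that the earlier output is not overwritten. The command is printed first and only executed when the tool is installed and the input file exists.

```shell
# Rotation quick start with the de-novo FFT indexer added (-X fft).
# No -C is given, so the cell is not constrained to a known value.
set -- rotation_master.h5 -o lyso_rot_fft -N 16 -R -S 96 -X fft -M -P rot

# Show the command; run it only when the tool is installed and the input exists.
echo "jfjoch_process $*"
if command -v jfjoch_process >/dev/null 2>&1 && [ -f "$1" ]; then
    jfjoch_process "$@"
fi
```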
### Still / serial data
Known-cell indexing of independent stills with the GPU fast-feedback indexer, then merge against a
reference structure:
```
jfjoch_process serial_master.h5 \
-o lyso_serial -N 16 \
-X ffbidx -C 79,79,38,90,90,90 -S 96 \
--spot-sigma 4 \
-M -z reference.mtz -r pixelrefine \
--scaling-high-resolution 1.8
```
`ffbidx` requires a known cell (`-C`) and is the indexer of choice for sparse serial stills.
`-r pixelrefine` selects the experimental reference-driven still integrator (needs
`--reference-mtz`). For weak serial data, tightening spot finding from the default
`--spot-sigma 3.0` to `--spot-sigma 4` typically raises the indexing rate substantially.
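
To judge the effect of `--spot-sigma` on a particular dataset, a few values can be tried side by side. The sketch below is a dry run: `sigma_scan` is a hypothetical helper that prints one command per value, each with its own output prefix, and the values 3, 4 and 5 are illustrative rather than recommended.

```shell
# Print one jfjoch_process command per --spot-sigma value (dry run).
# Each run gets its own prefix so the results can be compared afterwards.
sigma_scan() {
    for sigma in 3 4 5; do
        echo "jfjoch_process serial_master.h5 -o lyso_serial_s${sigma} -N 16" \
             "-X ffbidx -C 79,79,38,90,90,90 -S 96 --spot-sigma ${sigma}"
    done
}

sigma_scan
```

Piping the output to `sh` (for example `sigma_scan | sh`) would execute the commands one after another.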
## Command-line options
General:
| Option | Description |
| --- | --- |
| `-o, --output-prefix <txt>` | Output file prefix (default: `output`) |
| `-N, --threads <num>` | Number of worker threads (default: 1) |
| `-s, --start-image <num>` | First image to process (default: 0) |
| `-e, --end-image <num>` | Last image to process (default: all) |
| `-t, --stride <num>` | Process every *n*-th image (default: 1) |
| `-v, --verbose` | Verbose output |
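
The image-range options combine naturally for a quick preview of a large dataset. The sketch below assumes that `-e` is inclusive and that image indices start at zero, which matches the defaults listed above but is not stated explicitly; `selected_images`, the input name and the prefix `preview` are placeholders.

```shell
# Number of images selected by -s START -e END -t STRIDE.
# Assumption: END is inclusive and indices are zero-based.
selected_images() {
    echo $(( ($2 - $1) / $3 + 1 ))
}

start=0 end=999 stride=10
echo "jfjoch_process data_master.h5 -o preview -N 8 -s $start -e $end -t $stride"
echo "images processed: $(selected_images "$start" "$end" "$stride")"
```

Under these assumptions the example selects images 0, 10, 20, ... 990, that is 100 images.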
Spot finding:
| Option | Description |
| --- | --- |
| `--spot-sigma <num>` | Noise sigma level for spot finding (default: 3.0) |
| `--spot-threshold <num>` | Photon-count threshold for spot finding (default: 10) |
| `--spot-high-resolution <num>` | High-resolution limit for spot finding, Å (default: 1.5) |
| `--max-spots <num>` | Maximum spot count (default: 250) |
Indexing:
| Option | Description |
| --- | --- |
| `-X, --indexing-algorithm <txt>` | `FFBIDX` \| `FFT` \| `FFTW` \| `Auto` \| `None` |
| `-C, --unit-cell <cell>` | Reference unit cell `"a,b,c,alpha,beta,gamma"` (required by `ffbidx`) |
| `-S, --space-group <num>` | Space group number (used for indexing and scaling) |
| `-r, --refine <txt>` | Geometry refinement: `none` \| `orientation` \| `beam_and_lattice` (default) \| `pixelrefine` |
| `-R, --two-pass-rotation[=num]` | Two-pass offline rotation indexing (optional image count, default 30) |
| `--single-pass-rotation[=num]` | Online-like single-pass rotation indexing (optional min angular range, deg) |
| `--redo-rotation-spots` | Redo spot finding for the two-pass rotation first pass |
| `--force-rotation-lattice <vec>` | Force rotation lattice (9 floats, Å), skipping the first pass |
Indexer choice in brief: `ffbidx` (GPU) refines toward a **known cell** and is best for sparse
serial stills; `fft` (GPU) / `fftw` (CPU) index **de novo** and suit strong rotation data. See the
[CPU/GPU data-analysis reference](CPU_DATA_ANALYSIS.md) for the algorithms.
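
The rule of thumb above can be written down as a small selection helper. This is a sketch, not the tool's own logic (the `Auto` value of `-X` exists for that): `choose_indexer` is hypothetical, and the presence of `nvidia-smi` is used only as a rough proxy for a CUDA-capable host.

```shell
# Suggest a value for -X from the host (GPU or not) and the data type.
# Assumption: `nvidia-smi` on PATH is taken as a hint that a CUDA build can run.
choose_indexer() {
    has_gpu=$1   # "yes" or "no"
    kind=$2      # "stills" or "rotation"
    if [ "$has_gpu" != yes ]; then
        echo fftw                 # only CPU indexer available without CUDA
    elif [ "$kind" = stills ]; then
        echo ffbidx               # needs a known cell via -C
    else
        echo fft                  # de-novo indexing on the GPU
    fi
}

if command -v nvidia-smi >/dev/null 2>&1; then gpu=yes; else gpu=no; fi
echo "-X $(choose_indexer "$gpu" stills)"
```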
Scaling and merging:
| Option | Description |
| --- | --- |
| `-M, --scale-merge` | Scale and merge |
| `-P, --partiality <txt>` | Partiality model: `fixed` (default) \| `rot` \| `unity` |
| `-A, --anomalous` | Anomalous mode (keep Friedel pairs separate) |
| `-B, --refine-bfactor` | Refine a per-image B-factor |
| `-w, --wedge[=num]` | Refine the per-image rotation wedge (optional starting value) |
| `--scaling-high-resolution <num>` | High-resolution limit for scaling, Å (default: no limit) |
| `--min-partiality <num>` | Minimum partiality to accept a reflection (default: 0.02) |
| `--reject-outliers <num>` | Per-observation outlier rejection, N σ from the per-reflection median (default: off) |
| `--reject-delta-cchalf <num>` | Drop images with ΔCC1/2 below mean − N·stddev (default: off) |
| `--min-image-cc <num>` | Per-image CC limit, percent (default: no limit) |
| `--scaling-iterations <num>` | Scaling iterations with no reference data (default: 3) |
| `--scaling-output <txt>` | Reflection output format: `mtz` (default) \| `cif` \| `txt` |
| `-z, --reference-mtz <file>` | Reference MTZ (enables reference-driven scaling) |
Pixel refinement (experimental; select with `-r pixelrefine`, requires `--reference-mtz`):
| Option | Description |
| --- | --- |
| `--bandwidth <num>` | Relative X-ray bandwidth FWHM (e.g. `0.01` for a 1% DMM); default from file or 0 (monochromatic) |
| `--integration-radius <r>` | Signal-box radius `r1`, or `r1,r2,r3` (px). One value ⇒ `r2=r1+2`, `r3=r1+4` |
| `--profile-multiplier <num>` | Scale the measured tangential profile width (default: 6; XDS-style generous aperture) |
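
The shorthand for `--integration-radius` can be illustrated with a little arithmetic. The helper `expand_radius` below is hypothetical and handles integer radii only; it simply applies the rule from the table (one value gives `r2=r1+2`, `r3=r1+4`) and passes a full triple through unchanged.

```shell
# Expand --integration-radius into r1,r2,r3 following the documented rule:
# a single value r1 implies r2 = r1 + 2 and r3 = r1 + 4.
expand_radius() {
    case "$1" in
        *,*,*) echo "$1" ;;                              # already r1,r2,r3
        *)     echo "$1,$(( $1 + 2 )),$(( $1 + 4 ))" ;;  # integer r1 only
    esac
}

expand_radius 3        # prints 3,5,7
expand_radius 3,6,9    # prints 3,6,9
```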