# CPU-side crystallographic data analysis (Jungfraujoch)

This document describes the crystallographic algorithms implemented in Jungfraujoch for **CPU**- and **GPU**-side real-time and near-real-time data analysis.

**Scope.** The pipeline covered here comprises:

1. geometry mapping and corrections,
2. azimuthal integration (powder/radial profiles),
3. Bragg spot finding (strong pixels → connected components → spot descriptors),
4. indexing (still and rotation modes),
5. Bravais lattice / centering inference,
6. geometry and lattice refinement,
7. reflection prediction (still and rotation),
8. 2D summation integration,
9. scaling and merging,
10. auxiliary statistics (Wilson plot, ⟨I/σ(I)⟩, French–Wilson).

## References

The methods are inspired by solutions implemented in:

- W. Kabsch, “XDS”, *Acta Cryst.* **D66** (2010), 125–132, and related XDS papers (rotation geometry, partiality, scaling concepts).
- W. Kabsch, “Integration, scaling, space-group assignment and post-refinement”, *Acta Cryst.* **D66** (2010), 133–144 (mosaicity/partiality likelihood treatment; notation such as ζ and rotation factors).
- T. A. White et al., CrystFEL method papers (spot finding, three-ring integration, serial/still diffraction processing concepts).
- J. Kieffer & J. P. Wright, “PyFAI: a Python library for high performance azimuthal integration on GPU”, *Powder Diffraction* **28** (2013), S339–S350 (detector geometry definition, azimuthal integration).
- H. Powell, “The Rossmann Fourier autoindexing algorithm in MOSFLM”, *Acta Cryst.* **D55** (1999), 1690–1695 (FFT indexing).

(This list is not exhaustive.)

## 1. Geometry, reciprocal-space mapping, and basic quantities

### 1.1 Coordinate conventions

For a pixel coordinate $(x,y)$ (in pixels), Jungfraujoch converts to a laboratory direction vector via:

1. shift by direct-beam position $(x_\mathrm{beam}, y_\mathrm{beam})$,
2. scale by pixel size $p$ (mm),
3. set detector distance $D$ (mm),
4. apply detector orientation rotation $R_\mathrm{det}$ (PyFAI-like parameterization).

The unnormalized detector coordinate (mm) is:
$
\mathbf{r}_\mathrm{det}(x,y) =
\begin{pmatrix}
(x-x_\mathrm{beam})p\\
(y-y_\mathrm{beam})p\\
D
\end{pmatrix}.
$

The lab-frame vector is:
$
\mathbf{r}_\mathrm{lab} = R_\mathrm{det}\,\mathbf{r}_\mathrm{det}.
$

Let the incident wavevector magnitude be $k = 1/\lambda$ in Å$^{-1}$, and define:
$
\mathbf{S}_0 = (0,0,k).
$

The **reciprocal-space scattering vector** associated with pixel $(x,y)$ is:
$
\mathbf{s}(x,y) = k\,\frac{\mathbf{r}_\mathrm{lab}}{\lVert \mathbf{r}_\mathrm{lab}\rVert} - \mathbf{S}_0.
$

This $\mathbf{s}$ is the fundamental quantity used for spot finding (resolution filters), indexing, and refinement.

### 1.2 Two-theta, azimuth, resolution and $q$

The scattering angle $2\theta$ is computed from $\mathbf{r}_\mathrm{lab} = (x_\mathrm{lab}, y_\mathrm{lab}, z_\mathrm{lab})$ via:
$
2\theta = \arctan\!\left(\frac{\sqrt{x_\mathrm{lab}^2 + y_\mathrm{lab}^2}}{z_\mathrm{lab}}\right).
$

Resolution (Å) at a pixel follows from Bragg's law, with $\theta$ equal to half the scattering angle:
$
d = \frac{\lambda}{2\sin\theta}.
$
Equivalently, $d = 1/\lVert\mathbf{s}\rVert$, because $\lVert\mathbf{s}\rVert = 2k\sin\theta$.

The magnitude $q = 2\pi/d$ is used for radial binning and ice-ring handling.

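The mapping of §1.1 and §1.2 can be sketched in a few lines of Python. This is a minimal illustration and not the Jungfraujoch implementation: the names `pixel_to_recip` and `resolution` are hypothetical, and $R_\mathrm{det}$ is assumed to be passed as a plain 3×3 nested list (identity when omitted).

```python
import math

def pixel_to_recip(x, y, beam_x, beam_y, pixel_mm, dist_mm, wavelength_A, r_det=None):
    """Return (s, two_theta): scattering vector in 1/Å and scattering angle in rad."""
    # Unnormalized detector coordinate in mm
    r = [(x - beam_x) * pixel_mm, (y - beam_y) * pixel_mm, dist_mm]
    if r_det is not None:
        # Apply the detector rotation (assumed 3x3 nested list)
        r = [sum(r_det[i][j] * r[j] for j in range(3)) for i in range(3)]
    norm = math.sqrt(sum(c * c for c in r))
    k = 1.0 / wavelength_A
    s = (k * r[0] / norm, k * r[1] / norm, k * r[2] / norm - k)
    # atan2 agrees with the arctan formula for z_lab > 0 and stays defined otherwise
    two_theta = math.atan2(math.hypot(r[0], r[1]), r[2])
    return s, two_theta

def resolution(s):
    """d = 1/|s| in Å; infinite at the direct beam, where s = 0."""
    norm = math.sqrt(sum(c * c for c in s))
    return math.inf if norm == 0.0 else 1.0 / norm
```

For a pixel 1000 pixels from the beam centre with $p = 0.075$ mm, $D = 75$ mm and $\lambda = 1$ Å, this gives $2\theta = 45^\circ$ and $d = \lambda/(2\sin 22.5^\circ) \approx 1.31$ Å.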
### 1.3 Distance from the Ewald sphere

For a reciprocal lattice point $\mathbf{p}$ (Å$^{-1}$), define:
$
\Delta_\mathrm{Ewald}(\mathbf{p}) = \lVert \mathbf{p} + \mathbf{S}_0\rVert - k.
$
Jungfraujoch uses $|\Delta_\mathrm{Ewald}|$ as an operational proxy for excitation error. This appears in:
- still prediction (accept if $|\Delta_\mathrm{Ewald}|\le \Delta_\mathrm{cut}$),
- profile radius estimation (see §11.1),
- still partiality option in scaling/merging (§10.2).

---

## 2. Azimuthal integration (radial profiles)

Azimuthal integration produces a 1D radial profile $I(q)$ or $I(d)$ by histogramming pixels into radial bins. Pixels are **not split** across bins; each pixel contributes wholly to a single bin.

### 2.1 Histogram estimator

Let bin index $b(x,y)\in\{0,\dots,B-1\}$ be precomputed from $q(x,y)$ (or equivalently from $d(x,y)$). For each bin $b$:

- accumulate corrected intensity:
$
S_b = \sum_{(x,y):\,b(x,y)=b} I(x,y)\,C(x,y),
$
- and count:
$
N_b = \#\{(x,y):\,b(x,y)=b \text{ and pixel is valid}\}.
$

A simple mean profile is then $\bar{I}_b = S_b / N_b$ (when $N_b>0$). Invalid pixels (masked, saturated, detector error codes) are excluded.

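The estimator above can be sketched as follows. This is an illustrative version only; the function name `radial_profile` and the tuple layout of the input are assumptions made for the example, and empty bins are reported as `None`.

```python
def radial_profile(pixels, n_bins):
    """Mean corrected intensity per bin.

    pixels: iterable of (bin_index, intensity, correction, valid) tuples.
    Each pixel contributes wholly to one bin (no pixel splitting).
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for b, intensity, correction, valid in pixels:
        if not valid or not (0 <= b < n_bins):
            continue                      # masked, saturated or out-of-range pixel
        sums[b] += intensity * correction
        counts[b] += 1
    # Bins without valid pixels have no defined mean
    return [s / n if n > 0 else None for s, n in zip(sums, counts)]
```

Because the bin index is precomputed per pixel, the per-image work reduces to one pass over the pixels with two accumulations each.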
### 2.2 Corrections applied

Two standard corrections are available:

**(i) Solid angle / geometric correction.** A commonly used approximation for flat detectors gives a $\cos^3(2\theta)$ factor:
$
C_\Omega(2\theta) = \cos^3(2\theta).
$

**(ii) Polarization correction.** With polarization coefficient $P$ (beamline dependent) and azimuth $\phi$:
$
C_\mathrm{pol}(2\theta,\phi) =
\frac{1}{2}\left(1+\cos^2(2\theta) - P\cos(2\phi)\left(1-\cos^2(2\theta)\right)\right),
$
applied as a divisor to intensities (i.e. scale by $1/C_\mathrm{pol}$) when enabled.

### 2.3 Background estimate for profiles

A background estimate is derived from the integrated profile using the azimuthal integration settings (details depend on the configured estimator). This background is used for monitoring and diagnostics; it is **not** the same as local Bragg-spot background used in summation integration (§9).

---

## 3. Spot finding (strong pixels → Bragg spots)

Spot finding is a two-stage process:

1. **Strong-pixel selection** using intensity and/or local signal-to-noise criteria.
2. **Connected-component labeling (CCL)** to group strong pixels into candidate spots, followed by spot-level filtering and feature extraction.

### 3.1 Strong-pixel detection by local statistics

For each pixel $i$ with value $v_i$, consider a square window (nominally $31\times 31$ pixels) around it. Let the window contain $n$ valid pixels (excluding masked/bad/saturated), and define:
$
\Sigma = \sum v,\qquad \Sigma_2 = \sum v^2.
$

To avoid biasing the local statistics by the test pixel itself, Jungfraujoch evaluates the pixel against the window with the pixel removed:
$
\Sigma' = \Sigma - v_i,\quad \Sigma_2' = \Sigma_2 - v_i^2,\quad n' = n-1.
$

A variance-like quantity, equal to $n'^2$ times the local variance, is formed:
$
V = n'\Sigma_2' - (\Sigma')^2,
$
and the deviation-from-mean quantity:
$
\Delta = v_i n' - \Sigma'.
$

A pixel is considered strong if:
- it is above a photon/count threshold, and
- $\Delta>0$, and
- the squared deviation exceeds a scaled variance:
$
\Delta^2 > V\cdot T^2,
$
where $T$ is the configured signal-to-noise threshold.

This is equivalent to a local z-score criterion, $(v_i-\mu')/\sigma' > T$ with $\mu'$ and $\sigma'$ the mean and standard deviation of the window without the test pixel, but implemented in integer arithmetic to be robust and fast.

Special cases:
- saturated pixels can be forced to “strong” (useful for detecting overloaded Bragg spots),
- invalid pixels are never strong.

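The test above needs only the two window sums and the valid-pixel count. A minimal sketch, assuming integer pixel values and a hypothetical function name `is_strong` (saturated and invalid pixels are assumed to be handled before this call):

```python
def is_strong(v, window_sum, window_sum_sq, n_valid, snr_threshold, count_threshold):
    """Local z-score test without division or square root.

    window_sum and window_sum_sq include the test pixel v itself.
    """
    n1 = n_valid - 1                        # n'
    if n1 < 1:
        return False                        # no neighbours to compare against
    s1 = window_sum - v                     # Sigma'
    s2 = window_sum_sq - v * v              # Sigma_2'
    var_scaled = n1 * s2 - s1 * s1          # V = n'^2 * variance
    delta = v * n1 - s1                     # Delta = n' * (v - mean)
    if v <= count_threshold or delta <= 0:
        return False
    return delta * delta > var_scaled * snr_threshold * snr_threshold
```

With eight background pixels alternating between 1 and 2 counts, a central pixel of 20 counts passes at $T = 3$ ($\Delta^2 = 21904$ against $V T^2 = 144$), while a central pixel of 2 counts does not ($\Delta^2 = 16$).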
### 3.2 Resolution and ice-ring handling

Spot finding can be restricted to a resolution range $[d_\mathrm{high}, d_\mathrm{low}]$ by masking pixels outside the range. Optionally, pixels in identified ice-ring regions can be tagged so that subsequent indexing/refinement may include or exclude them (see §4 and §7).

A further optional safeguard removes isolated high-resolution “spur” spots by detecting large gaps in $1/d$ (or $q$) space and discarding spots beyond the gap. This is intended for macromolecular diffraction where edge-of-detector backgrounds can be extremely low.

### 3.3 Connected-component labeling (CCL)

Strong pixels are grouped into connected components (adjacent strong pixels) using a CCL algorithm. Each component yields a candidate spot with:

- centroid $(x,y)$ (often intensity-weighted),
- pixel count (spot size),
- integrated spot intensity proxy (sum of pixel values),
- resolution $d$ at the centroid (or mean over pixels),
- and quality flags (e.g. ice-ring classification).

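As an illustration of the grouping step, the sketch below labels components with a breadth-first flood fill and computes an intensity-weighted centroid. It is not the algorithm used in Jungfraujoch: the 8-connectivity, the sparse set/dict input and the name `label_spots` are all assumptions made for the example.

```python
from collections import deque

def label_spots(strong, image):
    """Group strong pixels into spots.

    strong: set of (x, y) strong-pixel coordinates.
    image:  dict mapping (x, y) to the pixel value.
    Returns a list of dicts with centroid, size and summed intensity.
    """
    seen = set()
    spots = []
    for start in sorted(strong):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        pixels = []
        while queue:
            x, y = queue.popleft()
            pixels.append((x, y))
            for dx in (-1, 0, 1):           # 8-connected neighbourhood (assumption)
                for dy in (-1, 0, 1):
                    nb = (x + dx, y + dy)
                    if nb in strong and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        total = sum(image[p] for p in pixels)
        if total > 0:                       # intensity-weighted centroid
            cx = sum(p[0] * image[p] for p in pixels) / total
            cy = sum(p[1] * image[p] for p in pixels) / total
        else:                               # fall back to the geometric centre
            cx = sum(p[0] for p in pixels) / len(pixels)
            cy = sum(p[1] for p in pixels) / len(pixels)
        spots.append({"x": cx, "y": cy, "size": len(pixels), "intensity": total})
    return spots
```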
Spot-level filters include minimum/maximum pixel count and resolution limits.

---

## 4. Indexing overview

Indexing maps observed reciprocal-space vectors $\mathbf{s}_i$ to a lattice such that:
$
\mathbf{s}_i \approx h_i\mathbf{a}^* + k_i\mathbf{b}^* + l_i\mathbf{c}^*,
$
with integer $(h_i,k_i,l_i)$.

Jungfraujoch supports two complementary indexing strategies:

1. **FFT-based indexing** (Rossmann-type): does not require an a priori unit cell; suitable for unknown samples.
2. **Fast-feedback indexing** (TORO-like): requires an approximate unit cell; optimized for speed and feedback.

Both feed into a common robust refinement/selection stage which maximizes the number of inliers under an indexing tolerance.

### 4.1 Indexed-spot decision (inlier test)

Given a trial lattice with direct basis vectors $\mathbf{a},\mathbf{b},\mathbf{c}$ (used here as reciprocal-space dot-test vectors), fractional indices are estimated by:
$
h_f = \mathbf{s}\cdot\mathbf{a},\quad
k_f = \mathbf{s}\cdot\mathbf{b},\quad
l_f = \mathbf{s}\cdot\mathbf{c}.
$
Let $(h,k,l)=(\mathrm{round}(h_f),\mathrm{round}(k_f),\mathrm{round}(l_f))$ and define the fractional residual:
$
\delta^2 = (h_f-h)^2 + (k_f-k)^2 + (l_f-l)^2.
$
A spot is indexed if $\delta^2 \le \tau^2$, where $\tau$ is the configured tolerance.

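The inlier test can be written directly from these definitions. A minimal sketch with a hypothetical function name `index_spot`; vectors are plain 3-tuples, with $\mathbf{s}$ in Å$^{-1}$ and the direct basis in Å:

```python
def index_spot(s, a, b, c, tol):
    """Return ((h, k, l), indexed) for one observed reciprocal vector s."""
    # Fractional indices from dot products with the direct basis vectors
    frac = [sum(si * vi for si, vi in zip(s, v)) for v in (a, b, c)]
    hkl = tuple(round(f) for f in frac)
    delta2 = sum((f - h) ** 2 for f, h in zip(frac, hkl))
    return hkl, delta2 <= tol * tol

# Example: 10 Å cubic cell; the first spot lies close to (1, -2, 3),
# the second falls halfway between lattice points and is rejected.
cell = ((10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0))
print(index_spot((0.1002, -0.2, 0.301), *cell, tol=0.05))
print(index_spot((0.15, 0.0, 0.0), *cell, tol=0.05)[1])
```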
For indexed spots, the reciprocal lattice point $\mathbf{p} = h\mathbf{a}^*+k\mathbf{b}^*+l\mathbf{c}^*$ is used to compute $\Delta_\mathrm{Ewald}(\mathbf{p})$ (stored as a diagnostic and later used in profile-radius estimation).

---

## 5. FFT indexing (unknown unit cell)

FFT indexing follows a classical approach: detect dominant periodicities by projecting reciprocal-space points onto many directions and Fourier transforming the resulting 1D histograms.

### 5.1 Directional projections and histograms

Choose a set of unit vectors $\{\mathbf{u}_d\}$ on a half-sphere (a near-uniform distribution generated via a golden-angle construction). For each direction $d$, form a histogram in the scalar projection:
$
t_{id} = \left|\mathbf{u}_d\cdot \mathbf{s}_i\right|.
$

Bin width is chosen approximately as:
$
\Delta t \approx \frac{1}{2 L_\mathrm{max}},
$
where $L_\mathrm{max}$ is the maximum expected real-space unit-cell edge (Å). The histogram extent is tied to the maximum $q$ used (set by a high-resolution cutoff for indexing).

### 5.2 FFT peak picking and candidate vectors

For each direction, the FFT magnitude spectrum is computed; peaks correspond to periodicities along $\mathbf{u}_d$. Each direction yields a candidate real-space length $L$ with maximum spectral magnitude (subject to $L\ge L_\mathrm{min}$).

Candidate vectors are $\mathbf{v}_d = L_d\,\mathbf{u}_d$.

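The per-direction search of §5.1 and §5.2 can be illustrated as below. This is a sketch under stated assumptions rather than the production code: a naive $O(n^2)$ discrete Fourier transform stands in for the FFT, spots are assigned to the nearest histogram bin, and the names `half_sphere_directions` and `dominant_length` are hypothetical. With $n$ bins of width $\Delta t$, frequency index $m$ corresponds to a real-space length $L = m/(n\,\Delta t)$.

```python
import cmath
import math

def half_sphere_directions(n):
    """Near-uniform unit vectors on the z > 0 half-sphere (golden-angle spiral)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    dirs = []
    for i in range(n):
        z = (i + 0.5) / n                   # uniform in z gives uniform area density
        r = math.sqrt(1.0 - z * z)
        phi = i * golden_angle
        dirs.append((r * math.cos(phi), r * math.sin(phi), z))
    return dirs

def dominant_length(spots, u, l_min, l_max, n_bins):
    """Strongest real-space period (Å) along unit vector u, via a naive DFT."""
    dt = 1.0 / (2.0 * l_max)                # bin width in 1/Å
    hist = [0.0] * n_bins
    for s in spots:
        t = abs(s[0] * u[0] + s[1] * u[1] + s[2] * u[2])
        b = int(t / dt + 0.5)               # nearest bin
        if b < n_bins:
            hist[b] += 1.0
    best_len, best_mag = None, 0.0
    for m in range(1, n_bins // 2):         # skip the DC term
        length = m / (n_bins * dt)          # frequency index -> length in Å
        if length < l_min:
            continue
        coeff = sum(h * cmath.exp(-2j * math.pi * m * b / n_bins)
                    for b, h in enumerate(hist))
        if abs(coeff) > best_mag:
            best_len, best_mag = length, abs(coeff)
    return best_len, best_mag
```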
A collinearity filter removes nearly parallel vectors (e.g. within 5°) and attempts to resolve harmonic ambiguity: shorter “fundamental” vectors may be preferred over longer harmonics if their peak magnitude is sufficiently strong relative to the dominant peak.

### 5.3 Lattice reduction and cell candidates

Triples of candidate vectors are combined to form candidate bases $(\mathbf{A},\mathbf{B},\mathbf{C})$. A simple reduction is applied:
$
\mathbf{B} \leftarrow \mathbf{B} - \mathrm{round}\!\left(\frac{\mathbf{B}\cdot\mathbf{A}}{\mathbf{A}\cdot\mathbf{A}}\right)\mathbf{A},
$
$
\mathbf{C} \leftarrow \mathbf{C} - \mathrm{round}\!\left(\frac{\mathbf{C}\cdot\mathbf{A}}{\mathbf{A}\cdot\mathbf{A}}\right)\mathbf{A}
- \mathrm{round}\!\left(\frac{\mathbf{C}\cdot\mathbf{B}}{\mathbf{B}\cdot\mathbf{B}}\right)\mathbf{B}.
$

Candidates are filtered by allowed length and angle ranges.

### 5.4 Robust refinement and best-cell selection

Candidate bases are refined against observed spots using an iterative inlier-focused least-squares procedure (trimmed/contracting threshold). The output cell is chosen to:
1. maximize the number of indexed spots under the tolerance $\tau$, and
2. break ties by a refined score (smaller residual threshold/score is preferred).

An optional reference unit cell (if supplied) restricts acceptance to cells within a relative distance tolerance in edge lengths (permutation-invariant).

---

## 6. Bravais lattice / centering inference (“lattice search”)

If the space group is supplied by the user, its lattice constraints are assumed for refinement and subsequent processing.

If not, Jungfraujoch attempts to infer the most plausible Bravais lattice type from the metric tensor after Niggli reduction:

1. **Niggli reduction** is performed to obtain a reduced cell in $G^6$ representation (Gruber vector).
2. The reduced cell is compared against a list of Niggli classes corresponding to Bravais lattices and centerings.
3. The highest-symmetry class that matches within tolerances is selected (relative metric tolerance and angular tolerance).

The output includes:
- a conventional cell,
- crystal system (triclinic, monoclinic, …),
- centering symbol $P, A, B, C, I, F, R$.

This stage provides centering information used for systematic absences in prediction (§8.4) and for reporting.

**Note.** In ambiguous or special cases, forcing space group to $P1$ (no symmetry assumptions) is recommended.

---

## 7. Geometry and lattice refinement

Refinement adjusts experimental geometry and crystal parameters to minimize discrepancies between observed spot reciprocal vectors and those predicted by a lattice model with integer indices.

### 7.1 Parameterization

The refinement jointly optimizes, depending on mode and constraints:

- beam center $(x_\mathrm{beam}, y_\mathrm{beam})$,
- detector distance $D$,
- detector tilt angles (two-angle model; third rotation often held at 0),
- rotation axis direction (for rotation datasets),
- crystal orientation (a global rotation),
- unit-cell parameters, with constraints determined by the inferred crystal system.

For higher symmetries, constraints are enforced, e.g.
- cubic: $a=b=c,\ \alpha=\beta=\gamma=90^\circ$,
- tetragonal: $a=b,\ \alpha=\beta=\gamma=90^\circ$,
- hexagonal: $a=b,\ \alpha=\beta=90^\circ,\ \gamma=120^\circ$,
- monoclinic (unique axis $b$): $\alpha=\gamma=90^\circ$, $\beta$ refined.

### 7.2 Residuals and objective

For each indexed spot assigned integer $(h,k,l)$, compute:

- observed reciprocal vector $\mathbf{s}_\mathrm{obs}$ from its detector position and current geometry,
- predicted reciprocal vector $\mathbf{s}_\mathrm{pred}(h,k,l;\ \text{lattice params})$.

Residual is:
$
\mathbf{r} = \mathbf{s}_\mathrm{obs} - \mathbf{s}_\mathrm{pred}.
$

A non-linear least squares solver minimizes $\sum \|\mathbf{r}\|^2$ over all selected inlier spots.

### 7.3 Rotation datasets: bringing observations to a common reference frame

For oscillation/rotation data, each image corresponds to a rotation angle $\phi$ about an axis $\mathbf{m}_2$. Observed reciprocal vectors are rotated “back to start” so that all images are refined in a single reference crystal frame:
$
\mathbf{s}_\mathrm{obs,ref} = R(\phi)\,\mathbf{s}_\mathrm{obs},
$
with $R(\phi)$ constructed from the axis-angle representation of the goniometer model.

### 7.4 Multi-stage tightening of inlier tolerance

Refinement is performed in stages with decreasing acceptance tolerance for including reflections (e.g. from coarse to fine), which stabilizes convergence when starting from imperfect indexing and approximate geometry.

---

## 8. Reflection prediction

Jungfraujoch predicts reflection positions for integration by enumerating Miller indices within a resolution cutoff and accepting those that satisfy a diffraction condition model.

### 8.1 Enumerating reciprocal lattice points

For a maximum resolution $d_\mathrm{min}$, accept $(h,k,l)$ such that:
$
\lVert \mathbf{p}(h,k,l)\rVert^2 = \lVert h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*\rVert^2 \le \left(\frac{1}{d_\mathrm{min}}\right)^2.
$

### 8.2 Still prediction (excitation-error cutoff)

For still images, the diffracting condition is approximated by an excitation-error cutoff:
$
\left|\Delta_\mathrm{Ewald}(\mathbf{p})\right| \le \Delta_\mathrm{cut}.
$
Accepted reflections are projected to the detector by intersecting the diffracted direction $\mathbf{S}=\mathbf{S}_0+\mathbf{p}$ with the detector plane, using the current geometry.

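Enumeration, cutoff and projection for stills can be sketched together. The code below is illustrative only: it loops over a fixed index box `h_max` instead of deriving index limits from the cell, it assumes an untilted detector ($R_\mathrm{det}$ = identity), and the names `ewald_distance`, `predict_still` and `project_to_detector` are hypothetical.

```python
import math

def ewald_distance(p, k):
    """|p + S0| - k with S0 = (0, 0, k), in 1/Å."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + (p[2] + k) ** 2) - k

def predict_still(astar, bstar, cstar, wavelength_A, d_min, delta_cut, h_max):
    """Return [(hkl, p)] for lattice points within delta_cut of the Ewald sphere."""
    k = 1.0 / wavelength_A
    rng = range(-h_max, h_max + 1)
    hits = []
    for h in rng:
        for kk in rng:
            for l in rng:
                if (h, kk, l) == (0, 0, 0):
                    continue
                p = tuple(h * astar[i] + kk * bstar[i] + l * cstar[i] for i in range(3))
                if sum(c * c for c in p) > (1.0 / d_min) ** 2:
                    continue                # beyond the resolution cutoff
                if abs(ewald_distance(p, k)) <= delta_cut:
                    hits.append(((h, kk, l), p))
    return hits

def project_to_detector(p, wavelength_A, dist_mm, pixel_mm, beam_x, beam_y):
    """Pixel where S = S0 + p meets an untilted detector plane z = D, or None."""
    k = 1.0 / wavelength_A
    sx, sy, sz = p[0], p[1], p[2] + k
    if sz <= 0.0:
        return None                         # diffracted beam does not reach the detector
    return (beam_x + dist_mm * sx / sz / pixel_mm,
            beam_y + dist_mm * sy / sz / pixel_mm)
```

For a 10 Å cubic cell aligned with the lab axes and $\lambda = 1$ Å, the reflection $(6, 0, -2)$ lies exactly on the Ewald sphere, since $\lVert(0.6, 0, -0.2) + (0, 0, 1)\rVert = 1$, and is therefore accepted for any positive $\Delta_\mathrm{cut}$.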
### 8.3 Rotation prediction (Laue equation + partiality model)

For rotation/oscillation datasets, Jungfraujoch solves for rotation angles $\phi$ where the rotated reciprocal lattice point satisfies the Ewald-sphere condition. In an XDS-like notation, define:

- rotation axis unit vector $\mathbf{m}_2$,
- $\mathbf{S}_0$ incident vector,
- $\mathbf{S}(\phi)=\mathbf{S}_0+\mathbf{p}(\phi)$.

A key quantity is:
$
\zeta = \left|\mathbf{m}_2\cdot \mathbf{e}_1\right|,\quad
\mathbf{e}_1 = \frac{\mathbf{S}\times \mathbf{S}_0}{\lVert \mathbf{S}\times \mathbf{S}_0\rVert},
$
which also appears in XDS as the Lorentz component linked to the rotation axis.

A Gaussian mosaicity model yields a partiality fraction over an oscillation width $\Delta\phi$. In the expression below, $\phi$ denotes the angular offset of the reflection's diffracting position from the centre of the oscillation range (written $\Delta\phi_{ij}$ in §10.2):
$
P(\phi;\sigma_M,\zeta,\Delta\phi) = \frac{1}{2}\left[
\mathrm{erf}\!\left(\frac{\phi+\Delta\phi/2}{\sqrt{2}\,\sigma_M/\zeta}\right)
-
\mathrm{erf}\!\left(\frac{\phi-\Delta\phi/2}{\sqrt{2}\,\sigma_M/\zeta}\right)
\right],
$
with mosaicity $\sigma_M$ in radians.

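The partiality expression maps directly onto `math.erf`. A minimal sketch with a hypothetical function name `rotation_partiality`; all angles are in radians and `dphi` is the offset of the diffracting position from the centre of the oscillation range. A reflection centred on the image with mosaicity much narrower than the oscillation width is fully recorded (value close to 1), and one sitting exactly on the image edge is split in half (value close to 0.5).

```python
import math

def rotation_partiality(dphi, sigma_m, zeta, osc_width):
    """Fraction of a reflection recorded within one oscillation range."""
    if sigma_m <= 0.0 or zeta <= 0.0:
        raise ValueError("sigma_m and zeta must be positive")
    width = math.sqrt(2.0) * sigma_m / zeta     # effective Gaussian width
    upper = math.erf((dphi + osc_width / 2.0) / width)
    lower = math.erf((dphi - osc_width / 2.0) / width)
    return 0.5 * (upper - lower)
```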
Reflections are predicted if they meet minimum $\zeta$ and mosaicity-window criteria, and their predicted detector coordinates fall on the active detector area.

### 8.4 Systematic absences (centering)

Systematic absences are applied at least at the centering level (prior to full space-group symmetry). Depending on the centering symbol:

- $I$: absent if $h+k+l$ odd,
- $A$: absent if $k+l$ odd,
- $B$: absent if $h+l$ odd,
- $C$: absent if $h+k$ odd,
- $F$: absent if any of $h+k, h+l, k+l$ is odd,
- $R$: absent if $(-h+k+l)\bmod 3 \ne 0$,
- $P$: no centering absences.

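These rules translate into a small lookup. The sketch below uses a hypothetical function name `is_absent` and follows the list above literally, including the $R$ condition as written there (obverse setting on hexagonal axes); Python's `%` returns a non-negative remainder, so negative indices need no special handling.

```python
def is_absent(h, k, l, centering):
    """True if (h, k, l) is systematically absent for the given centering symbol."""
    if centering == "P":
        return False
    if centering == "I":
        return (h + k + l) % 2 != 0
    if centering == "A":
        return (k + l) % 2 != 0
    if centering == "B":
        return (h + l) % 2 != 0
    if centering == "C":
        return (h + k) % 2 != 0
    if centering == "F":
        # Equivalent to requiring h, k, l all even or all odd
        return (h + k) % 2 != 0 or (h + l) % 2 != 0 or (k + l) % 2 != 0
    if centering == "R":
        return (-h + k + l) % 3 != 0
    raise ValueError(f"unknown centering symbol: {centering!r}")
```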
---

## 9. 2D summation integration (three-ring method)

Jungfraujoch integrates predicted reflections by **summation** (no profile fitting), using a CrystFEL-inspired “three-circle / three-ring” method in the detector plane.

### 9.1 Regions of interest

For each predicted reflection at $(x_p,y_p)$, define three radii:

- $r_1$: inner signal radius,
- $r_2$: inner background radius,
- $r_3$: outer background radius.

Pixels are classified by their squared distance $r^2=(x-x_p)^2+(y-y_p)^2$:

- **signal region:** $r^2 < r_1^2$,
- **background annulus:** $r_2^2 \le r^2 < r_3^2$.

Invalid pixels (masked/bad/saturated) are excluded from both sums.

### 9.2 Background subtraction and intensity estimate

Let:
- $S = \sum I(x,y)$ over signal pixels,
- $n_S$ = number of valid signal pixels,
- $B = \sum I(x,y)$ over background pixels,
- $n_B$ = number of valid background pixels.

Background per pixel:
$
\hat{b} = \frac{B}{n_B},
$
integrated intensity:
$
\hat{I} = S - n_S \hat{b}.
$

A reflection is accepted as “observed” only if all signal pixels were valid and $n_B$ exceeds a minimum (to avoid unstable background estimates).

### 9.3 Uncertainty model

A Poisson-like estimator is used for the raw summed counts:
$
\sigma(\hat{I}) \approx \sqrt{S},
$
with a minimum $\sigma\ge 1$ to avoid singular weights. (This is a pragmatic online estimate; more elaborate models may be applied downstream.)

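Sections 9.1 to 9.3 combine into one loop over the pixels around a prediction. The sketch below is illustrative: the image is assumed to be a list of rows with `None` marking invalid pixels, and the name `integrate_spot` and the default `min_bkg_pixels` value are hypothetical.

```python
import math

def integrate_spot(image, xp, yp, r1, r2, r3, min_bkg_pixels=10):
    """Return (intensity, sigma) or None. image[y][x] is a count, None if invalid."""
    height, width = len(image), len(image[0])
    s_sum = b_sum = 0.0
    n_s = n_b = 0
    for y in range(max(0, math.floor(yp - r3)), min(height - 1, math.ceil(yp + r3)) + 1):
        for x in range(max(0, math.floor(xp - r3)), min(width - 1, math.ceil(xp + r3)) + 1):
            d2 = (x - xp) ** 2 + (y - yp) ** 2
            value = image[y][x]
            if d2 < r1 * r1:
                if value is None:
                    return None             # every signal pixel must be valid
                s_sum += value
                n_s += 1
            elif r2 * r2 <= d2 < r3 * r3 and value is not None:
                b_sum += value
                n_b += 1
    if n_s == 0 or n_b < min_bkg_pixels:
        return None                         # background estimate would be unstable
    background = b_sum / n_b
    intensity = s_sum - n_s * background
    sigma = max(math.sqrt(max(s_sum, 0.0)), 1.0)   # Poisson-like, floored at 1
    return intensity, sigma
```

On a flat background of 2 counts with a single 102-count pixel at the prediction and $r_1 = 2$, nine pixels fall in the signal disc, so $S = 118$, $\hat{I} = 118 - 9\cdot 2 = 100$ and $\sigma = \sqrt{118}$.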
### 9.4 Lorentz–polarization factor handling

For integrated reflections, polarization correction can be applied as a multiplicative correction to the reflection scale via the geometry-based polarization term (§2.2). A Lorentz-like factor is carried as `rlp` in predictions, and used during scaling/merging (§10).

---

## 10. Scaling and merging

After per-image integration, Jungfraujoch scales observations and merges them into unique reflections. The design is intentionally compatible with XDS/XSCALE concepts, while supporting both still and rotation partiality models.

### 10.1 Observation model

For an observation $j$ of a unique reflection $h$ on image (or image group) $i$, the predicted measured intensity is modeled as:
$
I_{ij} \approx G_i \, L_{ij}\, P_{ij}\, I_h,
$
where:

- $G_i$ is the image scale factor,
- $L_{ij}$ is a Lorentz-like / geometry factor (stored as `rlp` or derived),
- $P_{ij}$ is a partiality term (model-dependent),
- $I_h$ is the merged (true) intensity parameter for that unique reflection.

A least-squares objective is minimized, with $I_{ij}^{\mathrm{pred}} = G_i L_{ij} P_{ij} I_h$:
$
\sum_{ij} \left(\frac{I_{ij}^{\mathrm{pred}} - I_{ij}^{\mathrm{obs}}}{\sigma_{ij}}\right)^2
$
with regularization on $G_i$ and optional smoothness constraints (particularly meaningful for rotation series).

### 10.2 Partiality models available

Jungfraujoch supports several partiality choices:

1. **Rotation partiality** (XDS-like; see §8.3):
$
P_{ij} = \frac{1}{2}\left[
\mathrm{erf}\!\left(\frac{\Delta\phi_{ij}+\Delta\phi/2}{\sqrt{2}\,\sigma_{M,i}/\zeta_{ij}}\right)
-
\mathrm{erf}\!\left(\frac{\Delta\phi_{ij}-\Delta\phi/2}{\sqrt{2}\,\sigma_{M,i}/\zeta_{ij}}\right)
\right].
$
Mosaicity $\sigma_{M,i}$ can be refined per image group with bounds.

2. **Still partiality** (excitation-error proxy):
$
P_{ij} = \exp\!\left(-\frac{\Delta_\mathrm{Ewald}^2}{R_i^2}\right),
$
where $R_i^2$ is a refined width parameter (bounded).

3. **Unity**: $P_{ij}=1$.

4. **Fixed**: use the per-reflection partiality carried from prediction.

Reflections below a minimum partiality can be rejected from merging to avoid unstable corrections.

### 10.3 Regularization and smoothness

To stabilize scale determination, a weak prior $G_i\approx 1$ is used. For rotation datasets, optional smoothness encourages slowly varying scales and mosaicity:
$
\log G_{i-1} - 2\log G_i + \log G_{i+1} \approx 0,
$
(and similarly for mosaicity), reflecting the expectation of gradual changes during a rotation scan.

### 10.4 Merging estimator

After refinement, corrected observations are formed:
$
I^{\mathrm{corr}}_{ij} = \frac{I^{\mathrm{obs}}_{ij}}{G_i L_{ij} P_{ij}},\qquad
\sigma^{\mathrm{corr}}_{ij} = \frac{\sigma^{\mathrm{obs}}_{ij}}{G_i L_{ij} P_{ij}}.
$

Unique intensities are merged by inverse-variance weighted mean:
$
I_h = \frac{\sum_j w_j I^{\mathrm{corr}}_{ij}}{\sum_j w_j},\qquad
w_j = \frac{1}{(\sigma^{\mathrm{corr}}_{ij})^2}.
$

An internal-consistency term can inflate uncertainties when multiple observations are present, in the spirit of XSCALE.

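The merging step for one unique reflection can be sketched as follows. The function name `merge_observations` and the tuple layout are hypothetical. The returned uncertainty $1/\sqrt{\sum_j w_j}$ is the textbook standard error of an inverse-variance weighted mean and is an assumption here; the internal-consistency inflation mentioned above is not modelled.

```python
import math

def merge_observations(observations, min_partiality=0.0):
    """Merge observations of one unique reflection.

    observations: list of (i_obs, sigma_obs, g, lorentz, partiality).
    Returns (i_merged, sigma_merged), or None if nothing survives the cut.
    """
    num = den = 0.0
    for i_obs, sigma_obs, g, lorentz, partiality in observations:
        if partiality < min_partiality or partiality <= 0.0:
            continue                        # unstable correction, skip
        scale = g * lorentz * partiality
        i_corr = i_obs / scale
        sigma_corr = sigma_obs / scale
        weight = 1.0 / (sigma_corr * sigma_corr)
        num += weight * i_corr
        den += weight
    if den == 0.0:
        return None
    return num / den, math.sqrt(1.0 / den)
```

Two observations of 100 ± 10 at scale 1 and 50 ± 10 at scale 0.5 both correct to an intensity of 100, with uncertainties 10 and 20, so the merged value is 100 and the less precise observation carries a quarter of the weight.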
### 10.5 Merging statistics

Per-shell and overall merging statistics are computed on corrected intensities, including:
- number of observations,
- number of unique reflections,
- mean $I/\sigma(I)$,
- an R$_\mathrm{meas}$-like quantity derived from within-HKL deviations (shell-binned).

Completeness requires enumeration of possible reflections given a unit cell and symmetry; where this is not fully available, completeness may be reported as 0 or omitted.

---

## 11. Mosaicity and “profile radius” monitoring

### 11.1 Profile radius (still excitation error width)

A simple scalar “profile radius” is estimated from indexed spots using the distribution of $\Delta_\mathrm{Ewald}$. Two estimators are available:

- root mean square (the standard deviation about zero):
$
R \approx \sqrt{\frac{1}{N}\sum_i \Delta_{\mathrm{Ewald},i}^2},
$
- robust MAD-based alternative (median absolute deviation), scaled by 1.4826 so that it matches the standard deviation for Gaussian-distributed errors.

Operationally, predictions for still data may use a cutoff proportional to this width (e.g. $\Delta_\mathrm{cut}\approx 2R$).

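Both estimators fit in a few lines. In this sketch the function names are hypothetical, and the MAD is taken about the median of the sample, which is an assumption since the text does not state the reference point.

```python
import math
import statistics

def profile_radius_rms(deltas):
    """Root mean square of the Ewald distances (1/Å)."""
    return math.sqrt(sum(d * d for d in deltas) / len(deltas))

def profile_radius_mad(deltas):
    """1.4826 * median absolute deviation about the median (1/Å)."""
    med = statistics.median(deltas)
    return 1.4826 * statistics.median([abs(d - med) for d in deltas])
```

The MAD variant is the safer choice when a few misindexed spots are present: a single gross outlier dominates the RMS but barely moves the median-based estimate.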
### 11.2 Mosaicity from rotation data (maximum likelihood)

For rotation data, Jungfraujoch can estimate mosaicity by maximizing a likelihood based on the XDS reflection fraction $R(\tau;\sigma_M/\zeta)$ as described by Kabsch (2010). In brief:

- compute angular deviations $\tau$ from predicted Bragg positions,
- compute $\zeta$ for each reflection,
- maximize $\sum \log R(\tau)$ over $\sigma_M$.

This yields a physically meaningful mosaicity estimate tied to the rotation partiality model.

---

## 12. Wilson statistics and French–Wilson treatment

### 12.1 Per-shell ⟨I/σ(I)⟩

For monitoring integration quality, Jungfraujoch reports mean $\langle I/\sigma(I)\rangle$ in a fixed number of resolution shells. Shelling is performed in $1/d^2$ space (typical of crystallographic practice).

### 12.2 Wilson plot (B-factor proxy)

A Wilson-type analysis is computed by binning intensities by resolution and fitting:
$
\langle I\rangle \propto \exp\!\left(-\frac{B}{2}\frac{1}{d^2}\right),
$
i.e.
$
\log \langle I\rangle = \mathrm{const} - \frac{B}{2}\left(\frac{1}{d^2}\right).
$
A linear regression of $\log\langle I\rangle$ vs $1/d^2$ provides an estimate of $B$, subject to basic quality checks (e.g. $R^2$ threshold).

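The regression can be sketched with ordinary least squares. The function name `wilson_fit` and the input layout are assumptions; shells with non-positive mean intensity are assumed to have been removed beforehand, since their logarithm is undefined.

```python
import math

def wilson_fit(shells):
    """Fit log<I> = const - (B/2) / d^2.

    shells: list of (d_A, mean_intensity) with positive mean intensities.
    Returns (B, r_squared).
    """
    xs = [1.0 / (d * d) for d, _ in shells]
    ys = [math.log(i) for _, i in shells]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx                       # slope equals -B/2
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0.0 else 1.0
    return -2.0 * slope, r_squared
```

The factor of $-2$ follows from the slope being $-B/2$ in the linearized form above.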
### 12.3 French–Wilson (posterior expectation of I and |F|)

To mitigate negative intensities and obtain physically meaningful amplitudes, Jungfraujoch implements a French–Wilson style Bayesian treatment using per-shell mean intensity as a prior scale.

For each merged observation $I_\mathrm{obs}$ with uncertainty $\sigma$, the posterior over true intensity $I\ge 0$ is:
$
p(I\mid I_\mathrm{obs}) \propto p(I)\,\exp\!\left(-\frac{(I_\mathrm{obs}-I)^2}{2\sigma^2}\right),
$
with priors differing between acentric and centric cases (standard Wilson distributions).

Numerical quadrature over a scaled intensity variable is used to compute posterior moments:
- $\langle I\rangle$,
- $\langle |F|\rangle = \langle \sqrt{I}\rangle$,
and an amplitude uncertainty estimate via:
$
\sigma_F \approx \sqrt{\langle I\rangle - \langle |F|\rangle^2}.
$

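For the acentric case the Wilson prior is $p(I) = \Sigma^{-1}\exp(-I/\Sigma)$, with $\Sigma$ the per-shell mean intensity, and the posterior moments can be approximated by simple quadrature. The sketch below is an illustration under assumptions: it uses a midpoint rule on the unscaled intensity axis rather than the scaled variable mentioned above, it covers only the acentric prior, and the name `french_wilson_acentric` is hypothetical. For a strong reflection ($I_\mathrm{obs} \gg \sigma$) the posterior mean stays close to $I_\mathrm{obs}$, while a negative $I_\mathrm{obs}$ is pulled to a small positive value.

```python
import math

def french_wilson_acentric(i_obs, sigma, shell_mean_i, n_steps=4000):
    """Posterior <I>, <|F|> and sigma_F for an acentric reflection."""
    # The posterior is a Gaussian of width sigma truncated at I = 0 and centred
    # at or below i_obs, so mass beyond this upper limit is negligible.
    upper = max(i_obs, 0.0) + 10.0 * sigma
    step = upper / n_steps
    grid = [(n + 0.5) * step for n in range(n_steps)]
    # Log of prior times likelihood; normalization constants cancel in the ratios
    log_w = [-i / shell_mean_i - (i_obs - i) ** 2 / (2.0 * sigma ** 2) for i in grid]
    shift = max(log_w)                      # avoids underflow for very negative i_obs
    weights = [math.exp(lw - shift) for lw in log_w]
    total = sum(weights)
    mean_i = sum(w * i for w, i in zip(weights, grid)) / total
    mean_f = sum(w * math.sqrt(i) for w, i in zip(weights, grid)) / total
    sigma_f = math.sqrt(max(mean_i - mean_f ** 2, 0.0))
    return mean_i, mean_f, sigma_f
```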
---

## 13. Practical notes and limitations

- **No profile fitting** is currently performed for Bragg integration; all integration is summation-based (§9). This is appropriate for fast feedback and many serial/streaming use cases, but differs from full profile fitting workflows.
- **Space-group symmetry** beyond centering absences is not necessarily enforced during prediction/integration unless the space group is supplied and used downstream.
- **Resolution masking and ice rings** are controllable; including ice-ring spots in indexing can improve robustness for some samples but may bias refinement in others.
- **Rotation vs still modes** differ substantially in prediction and scaling because partiality is angle-driven in rotation data and excitation-error-driven in still data.

---