Files
Jungfraujoch/image_analysis/pixel_refinement/METHODS.md
T
leonarski_f 75e401f0e5
Build Packages / Unit tests (push) Successful in 1h31m59s
Build Packages / build:rpm (rocky8_nocuda) (push) Successful in 8m43s
Build Packages / build:rpm (rocky9_nocuda) (push) Successful in 10m5s
Build Packages / build:rpm (ubuntu2204_nocuda) (push) Successful in 9m27s
Build Packages / build:rpm (ubuntu2404_nocuda) (push) Successful in 8m56s
Build Packages / build:rpm (rocky8_sls9) (push) Successful in 9m24s
Build Packages / build:rpm (rocky9_sls9) (push) Successful in 10m27s
Build Packages / build:rpm (rocky8) (push) Successful in 9m20s
Build Packages / build:rpm (rocky9) (push) Successful in 10m50s
Build Packages / build:rpm (ubuntu2204) (push) Successful in 9m54s
Build Packages / build:rpm (ubuntu2404) (push) Successful in 8m38s
Build Packages / DIALS test (push) Successful in 12m13s
Build Packages / XDS test (durin plugin) (push) Successful in 7m8s
Build Packages / XDS test (JFJoch plugin) (push) Successful in 7m8s
Build Packages / XDS test (neggia plugin) (push) Successful in 7m50s
Build Packages / Generate python client (push) Successful in 16s
Build Packages / Build documentation (push) Successful in 50s
Build Packages / Create release (push) Skipped
v1.0.0-rc.153 (#63)
This is an UNSTABLE release. It includes many experimental features, as well as many AI generated fixes. We recommend using rc.152 for production use.

* jfjoch_broker: Add EXPERIMENTAL pixelrefine mode for image processing
* jfjoch_broker: Allow to load user mask from 8-bit and 16-bit TIFF files
* jfjoch_broker: Add ROI calculation in non-FPGA workflow
* jfjoch_broker: Fixes to TCP image pusher
* jfjoch_broker: Remove NUMA bindings
* jfjoch_broker: Improvements to indexing
* jfjoch_broker: For PSI EIGER, trimming energies are taken from the detector configuration (now compulsory) instead of hardcoded values
* jfjoch_writer: Save ROI definitions and the per-pixel ROI bitmap in the master file; azimuthal ROIs support phi (angular) sectors
* jfjoch_viewer: Major redesign with dockable panels and saved layouts, plus on-canvas creation/move/resize of box, circle and azimuthal ROIs
* jfjoch_viewer: Run jfjoch_process reprocessing jobs from inside the GUI and overlay per-run results

Reviewed-on: #63
2026-06-23 20:29:49 +02:00

9.9 KiB
Raw Blame History

PixelRefine — methods

PixelRefine is the still-image integrator. It integrates the Bragg reflections of one image by profile fitting against a reference intensity set I^\mathrm{ref} (e.g. F_calc from a deposited model, or the current merged estimate in an EM-style outer loop) and returns already-scaled intensities. It is an intensity-wise operation: the detector geometry (orientation, cell, beam, distance) is taken as fixed — it was refined upstream by XtalOptimizer (IndexAndRefine::RefineGeometryIfNeeded) — and PixelRefine only measures the spot shape and fits the per-image scale.

The objective is the factored per-reflection likelihood of FACTORED_MODEL.md, Terms 1 and 2. This note records the equations and the reasons behind each design choice.

Throughout, a reflection's shoebox is a small box of raw detector pixels I_p with a local flat background B; the area-normalised tangential profile at pixel p is P_p, and v_p is the variance used to weight pixel p.


0. The forward model

The recorded amplitude of a still reflection is the profile-fit amplitude


J \;=\; \frac{\sum_p P_p\,(I_p - B)/v_p}{\sum_p P_p^{2}/v_p},
\qquad
\operatorname{var}(J) = \frac{1}{\sum_p P_p^{2}/v_p}.

The full (rotation-equivalent) intensity is recovered by dividing out the factors a still does not record,


I \;=\; \frac{J}{p\,B_\mathrm{DW}\,\mathrm{pol}},\qquad
p = \exp\!\left(-\frac{\epsilon_r^{2}}{R_{0,\mathrm{eff}}^{2}}\right),\quad
B_\mathrm{DW}=\exp\!\left(-\frac{B_\mathrm{fac}}{4 d^{2}}\right),

where the partiality p is the fraction of the mosaic block crossing the Ewald sphere, \epsilon_r is the radial excitation error, R_0 the radial (rocking) width, and \mathrm{pol} the polarisation correction. The tangential profile is a separable, area-normalised Gaussian of width R_1:


P_p = \frac{1}{\pi R_1^{2}}\exp\!\left(-\frac{\epsilon_{t,p}^{2}}{R_1^{2}}\right).

A finite X-ray bandwidth thickens the Ewald shell radially, adding a fixed, resolution-dependent term to the radial width, $R_{0,\mathrm{eff}}^2 = R_0^2 + (b\lambda)^2/(2d^4)$ (b = relative bandwidth; the pink-beam / DMM signature). b=0 is a monochromatic no-op.


1. De-biased variance (the load-bearing fix)

Symptom. Mean intensities went negative in the high-resolution shells (\langle I/\sigma\rangle down to -12), and the per-image scale G collapsed to 0 on most images, dropping ~80 % of observations.

Cause. The extraction weighted each pixel by its observed count, v_p = I_p. A down-fluctuated background pixel (I_p < B) then gets a small v_p, hence a large w_p=1/v_p, and its contribution P_p(I_p-B)/v_p < 0 is large in magnitude. Summed over the many empty shoebox pixels this drags J below zero — the inverse-observed-count (Poisson-on-data) bias, worst where the true signal is weakest (high resolution).

Fix. For background-limited reflections the correct variance is the local background, constant over the shoebox, v_p = \max(B,1), giving the unbiased uniform-variance estimator J = \sum_p P_p (I_p-B)/\sum_p P_p^2. This turned \langle I/\sigma\rangle positive at all resolutions and stopped the scale collapse.


2. Prediction band and multiplicity

A reflection is given a shoebox only when it lies within a radial band of the Ewald sphere, \bigl|\,|S_{hkl}| - 1/\lambda\,\bigr| \le \delta. For randomly oriented stills the number of images on which a given hkl qualifies is \propto \delta. The original \delta = 5\times10^{-4}\,\text{Å}^{-1} was 46× tighter than a box integrator, giving 4× fewer observations per reflection. Widening to \delta = 2\times10^{-3}\,\text{Å}^{-1} (ewald_dist_cutoff) restores the multiplicity; the partiality p downweights the slightly-off-Ewald tails it admits. (Widening is only safe with the de-biased variance of §1 and the factored objective of §§34 — with a per-pixel geometry fit it diverged.)


3. Term 2 — measured per-resolution profile width R_1

R_1 is measured, not fitted. Fitting R_1 inside a per-image least squares is degenerate with the scale G (a narrower profile and a larger scale trade off), and that degeneracy slides the per-image scale and wrecks the merge. But a second moment is normalised by the total intensity, so it carries shape information decoupled from scale:


R_1^2 = 2\,\langle \epsilon_t^2\rangle,
\qquad
\langle \epsilon_t^2\rangle = \frac{\sum_p (I_p-B)\,\epsilon_{t,p}^2}{\sum_p (I_p-B)} .

We bin the strong spots (\mathrm{signif}\ge 5) by resolution (1/d^2, 6 bins) and take the median \langle\epsilon_t^2\rangle per bin, so each reflection integrates with the R_1 of its resolution shell (low-res spots are tight; high-res anisotropic streaks are wider). Weak spots fall back to the global R_1. Measuring the width rather than fitting it is what makes profile-width refinement stable — and it is a selling point: the mask adapts to the data per shell.


4. Term 1 — the intensity / scaling residual

The per-pixel least squares is replaced by one residual per reflection: the profile-fit amplitude J (using the Term-2 R_1) should equal the scaled reference,


r_h = \frac{J_h - G\,B_\mathrm{DW}\,p_h\,\mathrm{pol}_h\,I^\mathrm{ref}_h}{\sigma_{J,h}},
\qquad
L = \sum_h r_h^2 + \text{(scale prior)} .

Only the per-image scale G and DebyeWaller B are optimised; geometry and R are fixed. Three consequences:

  • Integration and scaling become one objective. J is the integrated intensity and the residual is the scaling residual.
  • The empty-pixel problem disappears by construction. Empty pixels enter only through J (with ~zero profile weight); they make no residual of their own and cannot dominate.
  • Fisher weighting puts the reference at maximum leverage. \sigma_J uses the model-expected variance v_p = B + \max(J,0)\,P_p (background plus expected signal from I^\mathrm{ref}), not the observed counts — so a strong expected reflection observed absent is penalised, and a noise spike with low I^\mathrm{ref} gets no weight.

The scale is regularised towards 1 with a data-scaled weight $w_G=\sqrt{N_\mathrm{refl}/ \sigma_G}$ (mirrors ScaleOnTheFly) so weakly-measured images cannot drift and scramble the merge.

Geometry is not refined here. PixelRefine's earlier per-image geometry refinement (regularised orientation LSQ, signal-weighting, a global orientation/cell sweep) was removed: on true stills the predictions are already good (radial centroid error ≈ 0, tangential ≈ a 0.4 px sampling floor that is not a recoverable misprediction), and per-image geometry refinement only overfit the sparse signal. Geometry is the job of XtalOptimizer.


5. Background estimator — mean, not median (both integrators)

Symptom. Pushed past the true resolution limit, the no-signal shells reported \langle I/\sigma\rangle\approx4\text{}6 at \mathrm{CC}_{1/2}\approx0 — confident "data" where there is none. Present in both PixelRefine and the classical BraggIntegrate2D.

Cause. Both used the median of the background ring. For a right-skewed (Poisson) background \operatorname{median}(B) < \mathbb{E}[B], so subtraction under-subtracts by a tiny but coherent per-pixel offset that grows over an n_\mathrm{pix} peak and a multiplicity-m merge into a fake \langle I/\sigma\rangle\propto\sqrt{m}, worst where the real signal is weakest.

Fix. Use the mean of the ring (spot cores and saturation sentinels already excluded). \langle I/\sigma\rangle then collapses to ~0 wherever \mathrm{CC}\approx0 and the honest resolution limit becomes visible. This was the single largest contributor to untrustworthy σ — a one-line change in each integrator.


6. Error model (global a,b; XDS form)

Counting statistics under-estimate the variance of strong reflections, which carry systematic errors proportional to I, not \sqrt{I}. The standard correction inflates the variance with a global two-parameter model, applied at the merge level so both integrators benefit:


\sigma'^{\,2} = a\,\sigma^{2} + (b\,\langle I\rangle)^{2},
\qquad
\mathrm{ISa} = \frac{1}{b} = \lim_{I\to\infty}\frac{I}{\sigma'} .

The I^2 term uses the reflection mean \langle I\rangle (not the per-observation I_i, which would bias the merged mean and collapse CC); a,b are fit from the spread of symmetry equivalents with a relative (1/\mathrm{dev}^4) weight so the strong bins (which fix b) do not swamp the weak bins (which fix a). jfjoch_process prints the model and ISa.


Results (lysozyme rotation crystal fixed_master.h5, treated as stills, 1.7 Å)

Configuration N_obs \langle I/\sigma\rangle CC$_{1/2}$ CC$_\mathrm{ref}$
Baseline per-pixel loss 799 k 7.2 erratic, →0 at 1.7 Å erratic
Factored Terms 1+2 (this model) 1.22 M 10.7 8492 % flat 7792 % flat

The factored objective turns the erratic, high-res-collapsing per-pixel result into flat ~90 % CC${1/2}$/CC$\mathrm{ref}$ to 1.7 Å — matching the proper rotation integration path from the stills path. (See FINDINGS-2026-06.md.)


Default recipe

§§14 are PixelRefine-specific; §§56 act at the integration/merge level and apply to the classical route too.

Field / behaviour Default Section
fit/extraction variance local background B 1
ewald_dist_cutoff 2\times10^{-3}\,\text{Å}^{-1} 2
tangential width R_1 measured per resolution shell 3
objective per-reflection intensity residual, Fisher-weighted 4
refined parameters per-image G (and B); geometry fixed 4
scale_reg_sigma 2.0 4
local background estimator mean of the ring 5
merge error model global a,b (ISa printed) 6

ewald_dist_cutoff (multiplicity vs. cost) and bandwidth (Si vs. DMM) are the two knobs worth setting per dataset.