mx/Jungfraujoch

leonarski_f 100fe7b7e7

PixelRefine: make factored Terms 1+2 the model, remove old wiring

PixelRefine is now an intensity-only operation: geometry is fixed (refined
upstream by XtalOptimizer) and the only objective is the factored per-reflection
likelihood (FACTORED_MODEL.md Terms 1+2) - measured per-resolution profile width
R1 plus one Fisher-weighted intensity/scaling residual per reflection, fitting
the per-image scale G and B. Validated on crystal 2 (fixed_master.h5 as stills,
1.7 A): CC1/2 84-92%, CCref 77-92%, flat - reproduces the env-flag prototype and
matches the rotation path from the stills path.

Removed:
- the per-pixel ShoeboxResidual loss and PixelResidual cost functor;
- all in-PixelRefine geometry refinement (orientation/cell/beam/distance/R),
  the regularised-orientation LSQ, signal-weighting, and the global sweep;
- Term 3 (per-spot recentring) - a confirmed no-op on both crystals;
- the diagnostic scaffolding (covariance, centroid, adaptive_R1) and the
  PR_* env knobs + stderr dumps in IndexAndRefine;
- the PredictImage/ChiSquaredImage renderers and the entire viewer
  PixelRefine window/table/params + worker bindings + shoebox overlay.

The sweep box-integrator background median became mean (consistency) by virtue
of removing the sweep. METHODS.md rewritten for the current model; findings
recorded in FINDINGS-2026-06.md. Net -2200 lines.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-13 22:02:18 +02:00

PixelRefine — methods

PixelRefine is the still-image integrator. It integrates the Bragg reflections of one image by profile fitting against a reference intensity set $I^\mathrm{ref}$ (e.g. F_calc from a deposited model, or the current merged estimate in an EM-style outer loop) and returns already-scaled intensities. It is an intensity-only operation: the geometry (crystal orientation, unit cell, beam, detector distance) is taken as fixed, having been refined upstream by XtalOptimizer (IndexAndRefine::RefineGeometryIfNeeded), and PixelRefine only measures the spot shape and fits the per-image scale.

The objective is the factored per-reflection likelihood of FACTORED_MODEL.md, Terms 1 and 2. This note records the equations and the reasons behind each design choice.

Throughout, a reflection's shoebox is a small box of raw detector pixels I_p with a local flat background B; the area-normalised tangential profile at pixel p is P_p, and v_p is the variance used to weight pixel p.

0. The forward model

The recorded amplitude of a still reflection is the profile-fit amplitude


$$
J \;=\; \frac{\sum_p P_p\,(I_p - B)/v_p}{\sum_p P_p^{2}/v_p},
\qquad
\operatorname{var}(J) = \frac{1}{\sum_p P_p^{2}/v_p}.
$$
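
The estimator above can be sketched in a few lines of Python. This is an illustrative toy, not the project's C++ implementation; the function name `profile_fit` and the five-pixel shoebox are assumptions made up for the example.

```python
def profile_fit(pixels, profile, background, variances):
    """Return (J, var_J) for one shoebox.

    pixels     : raw counts I_p
    profile    : area-normalised profile values P_p
    background : local flat background B
    variances  : per-pixel variances v_p used as weights
    """
    num = sum(P * (I - background) / v
              for I, P, v in zip(pixels, profile, variances))
    den = sum(P * P / v for P, v in zip(profile, variances))
    return num / den, 1.0 / den


# Hypothetical noise-free shoebox: I_p = B + J_true * P_p.
profile = [0.1, 0.2, 0.4, 0.2, 0.1]
B = 3.0
J_true = 50.0
pixels = [B + J_true * P for P in profile]
variances = [B] * len(pixels)  # uniform variance, as for a background-limited spot

J, var_J = profile_fit(pixels, profile, B, variances)
```

On noise-free data the fit returns the input amplitude, and with a uniform variance the result reduces to $\operatorname{var}(J) = B/\sum_p P_p^2$.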

The full (rotation-equivalent) intensity is recovered by dividing out the factors a still does not record,


$$
I \;=\; \frac{J}{p\,B_\mathrm{DW}\,\mathrm{pol}},\qquad
p = \exp\!\left(-\frac{\epsilon_r^{2}}{R_{0,\mathrm{eff}}^{2}}\right),\quad
B_\mathrm{DW}=\exp\!\left(-\frac{B_\mathrm{fac}}{4 d^{2}}\right),
$$

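where the partiality $p$ (a per-reflection scalar, not the pixel index used as a subscript elsewhere) is the fraction of the mosaic block crossing the Ewald sphere, $\epsilon_r$ is the radial excitation error, $R_{0,\mathrm{eff}}$ the effective radial (rocking) width, equal to $R_0$ for a monochromatic beam, and $\mathrm{pol}$ the polarisation correction. The tangential profile is a separable, area-normalised Gaussian of width $R_1$: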


$$
P_p = \frac{1}{\pi R_1^{2}}\exp\!\left(-\frac{\epsilon_{t,p}^{2}}{R_1^{2}}\right).
$$

A finite X-ray bandwidth thickens the Ewald shell radially, adding a fixed, resolution-dependent term to the radial width, $R_{0,\mathrm{eff}}^2 = R_0^2 + (b\lambda)^2/(2d^4)$ (b = relative bandwidth; the pink-beam / DMM signature). b=0 is a monochromatic no-op.
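
As a minimal sketch of the correction chain in this section, the functions below evaluate $R_{0,\mathrm{eff}}^2$, the partiality, the Debye–Waller factor and the full intensity exactly as written in the formulas. All function and argument names are hypothetical and chosen for the example; they are not taken from the code base.

```python
import math


def effective_radial_width_sq(R0, bandwidth, wavelength, d):
    """R_0,eff^2 = R_0^2 + (b * lambda)^2 / (2 d^4)."""
    return R0 ** 2 + (bandwidth * wavelength) ** 2 / (2.0 * d ** 4)


def partiality(eps_r, R0_eff_sq):
    """p = exp(-eps_r^2 / R_0,eff^2)."""
    return math.exp(-eps_r ** 2 / R0_eff_sq)


def debye_waller(B_fac, d):
    """B_DW = exp(-B_fac / (4 d^2))."""
    return math.exp(-B_fac / (4.0 * d ** 2))


def full_intensity(J, eps_r, R0, d, B_fac, pol, bandwidth=0.0, wavelength=1.0):
    """Divide the still amplitude J by the factors a still does not record."""
    p = partiality(eps_r, effective_radial_width_sq(R0, bandwidth, wavelength, d))
    return J / (p * debye_waller(B_fac, d) * pol)
```

With `bandwidth=0.0` the effective width reduces to $R_0$, which is the monochromatic no-op described above; a non-zero bandwidth raises the partiality of an off-sphere reflection, most strongly at small $d$.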

1. De-biased variance (the load-bearing fix)

Symptom. Mean intensities went negative in the high-resolution shells (\langle I/\sigma\rangle down to -12), and the per-image scale G collapsed to 0 on most images, dropping ~80 % of observations.

Cause. The extraction weighted each pixel by its observed count, v_p = I_p. A down-fluctuated background pixel (I_p < B) then gets a small v_p, hence a large w_p=1/v_p, and its contribution P_p(I_p-B)/v_p < 0 is large in magnitude. Summed over the many empty shoebox pixels this drags J below zero — the inverse-observed-count (Poisson-on-data) bias, worst where the true signal is weakest (high resolution).

Fix. For background-limited reflections the correct variance is the local background, constant over the shoebox, v_p = \max(B,1), giving the unbiased uniform-variance estimator J = \sum_p P_p (I_p-B)/\sum_p P_p^2. This turned \langle I/\sigma\rangle positive at all resolutions and stopped the scale collapse.
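
The bias can be reproduced with a deterministic toy shoebox that contains no signal at all. The sketch below is illustrative only: the flat eight-pixel profile and the alternating counts are invented for the example, chosen so that the pixel mean equals the background exactly.

```python
def profile_fit_amplitude(pixels, profile, background, variances):
    """J = sum(P (I - B) / v) / sum(P^2 / v)."""
    num = sum(P * (I - background) / v
              for I, P, v in zip(pixels, profile, variances))
    den = sum(P * P / v for P, v in zip(profile, variances))
    return num / den


B = 4.0
pixels = [2.0, 6.0] * 4   # empty shoebox: counts scatter symmetrically about B
profile = [1.0 / 8] * 8   # flat toy profile, sums to 1

# Old weighting: variance taken from the observed count of each pixel.
J_observed = profile_fit_amplitude(pixels, profile, B, pixels)

# De-biased weighting: constant local background, v_p = max(B, 1).
J_debiased = profile_fit_amplitude(pixels, profile, B, [max(B, 1.0)] * 8)
```

Here `J_observed` comes out negative although the fluctuations are symmetric, because the low pixels carry three times the weight of the high ones, while `J_debiased` is zero.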

2. Prediction band and multiplicity

A reflection is given a shoebox only when it lies within a radial band of the Ewald sphere, \bigl|\,|S_{hkl}| - 1/\lambda\,\bigr| \le \delta. For randomly oriented stills the number of images on which a given hkl qualifies is \propto \delta. The original \delta = 5\times10^{-4}\,\text{Å}^{-1} was 4–6× tighter than a box integrator, giving 4× fewer observations per reflection. Widening to \delta = 2\times10^{-3}\,\text{Å}^{-1} (ewald_dist_cutoff) restores the multiplicity; the partiality p downweights the slightly-off-Ewald tails it admits. (Widening is only safe with the de-biased variance of §1 and the factored objective of §§3–4 — with a per-pixel geometry fit it diverged.)
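
A minimal sketch of the band test, assuming $S_{hkl}$ is available as a 3-vector in Å$^{-1}$. The function name `in_prediction_band` is hypothetical; only `ewald_dist_cutoff` is a name from the text, and it corresponds to the `delta` argument here.

```python
import math


def in_prediction_band(S, wavelength, delta):
    """True if | |S| - 1/lambda | <= delta (all in reciprocal angstrom)."""
    norm = math.sqrt(sum(c * c for c in S))
    return abs(norm - 1.0 / wavelength) <= delta


# A reflection sitting 1e-3 A^-1 off the Ewald sphere at lambda = 1 A.
S = (0.0, 0.0, 1.001)
accepted_narrow = in_prediction_band(S, 1.0, 5e-4)  # original cutoff
accepted_wide = in_prediction_band(S, 1.0, 2e-3)    # widened cutoff

# Multiplicity scales with delta for random orientations.
expected_gain = 2e-3 / 5e-4
```

The reflection is rejected by the original cutoff and accepted by the widened one, and the ratio of the two cutoffs gives the expected fourfold gain in multiplicity.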

3. Term 2 — measured per-resolution profile width `R_1`

R_1 is measured, not fitted. Fitting R_1 inside a per-image least squares is degenerate with the scale G (a narrower profile and a larger scale trade off), and that degeneracy slides the per-image scale and wrecks the merge. But a second moment is normalised by the total intensity, so it carries shape information decoupled from scale:


$$
R_1^2 = 2\,\langle \epsilon_t^2\rangle,
\qquad
\langle \epsilon_t^2\rangle = \frac{\sum_p (I_p-B)\,\epsilon_{t,p}^2}{\sum_p (I_p-B)} .
$$

We bin the strong spots (\mathrm{signif}\ge 5) by resolution (1/d^2, 6 bins) and take the median \langle\epsilon_t^2\rangle per bin, so each reflection integrates with the R_1 of its resolution shell (low-res spots are tight; high-res anisotropic streaks are wider). Weak spots fall back to the global R_1. Measuring the width rather than fitting it is what makes profile-width refinement stable — and it is a selling point: the mask adapts to the data per shell.
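
The measurement can be sketched as follows. This is an illustration under stated assumptions: $\epsilon_t$ is treated as a single tangential coordinate, the bins are equal-width in $1/d^2$ between the extremes of the strong spots, and the names `second_moment`, `r1_per_shell` and the dictionary keys are invented for the example.

```python
import math
from statistics import median


def second_moment(pixels, eps_t, background):
    """<eps_t^2> weighted by background-subtracted counts."""
    net = [I - background for I in pixels]
    return sum(n * e * e for n, e in zip(net, eps_t)) / sum(net)


def r1_per_shell(spots, n_bins=6, min_signif=5.0):
    """Median-based R_1 per resolution bin; None for bins without strong spots.

    spots: dicts with keys 'inv_d2', 'signif', 'moment' (hypothetical layout).
    Assumes at least one spot passes the significance cut.
    """
    strong = [s for s in spots if s["signif"] >= min_signif]
    lo = min(s["inv_d2"] for s in strong)
    hi = max(s["inv_d2"] for s in strong)
    width = (hi - lo) / n_bins or 1.0  # guard against a zero-width range
    bins = [[] for _ in range(n_bins)]
    for s in strong:
        k = min(int((s["inv_d2"] - lo) / width), n_bins - 1)
        bins[k].append(s["moment"])
    return [math.sqrt(2.0 * median(b)) if b else None for b in bins]


# Toy spot: Gaussian exp(-eps^2 / R^2) with R = 2 on a fine 1-D grid.
eps = [i * 0.1 for i in range(-100, 101)]
B = 5.0
weak = [B + 10.0 * math.exp(-e * e / 4.0) for e in eps]
strong = [B + 1000.0 * math.exp(-e * e / 4.0) for e in eps]
r1_weak = math.sqrt(2.0 * second_moment(weak, eps, B))
r1_strong = math.sqrt(2.0 * second_moment(strong, eps, B))
```

Both toy spots return the same width although their amplitudes differ by a factor of 100, which is the decoupling from the scale that the text relies on.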

4. Term 1 — the intensity / scaling residual

The per-pixel least squares is replaced by one residual per reflection: the profile-fit amplitude J (using the Term-2 R_1) should equal the scaled reference,


$$
r_h = \frac{J_h - G\,B_\mathrm{DW}\,p_h\,\mathrm{pol}_h\,I^\mathrm{ref}_h}{\sigma_{J,h}},
\qquad
L = \sum_h r_h^2 + \text{(scale prior)} .
$$

Only the per-image scale $G$ and the Debye–Waller factor $B_\mathrm{fac}$ (not the shoebox background $B$) are optimised; geometry and $R$ are fixed. Three consequences:

- Integration and scaling become one objective. $J$ is the integrated intensity and the residual is the scaling residual.
- The empty-pixel problem disappears by construction. Empty pixels enter only through $J$ (with ~zero profile weight); they make no residual of their own and cannot dominate.
- Fisher weighting puts the reference at maximum leverage. $\sigma_J$ uses the model-expected variance $v_p = B + \max(J,0)\,P_p$ (background plus expected signal from $I^\mathrm{ref}$), not the observed counts, so a strong expected reflection observed absent is penalised, and a noise spike with low $I^\mathrm{ref}$ gets no weight.

The scale is regularised towards 1 with a data-scaled weight $w_G=\sqrt{N_\mathrm{refl}/ \sigma_G}$ (mirrors ScaleOnTheFly) so weakly-measured images cannot drift and scramble the merge.
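
A sketch of the objective for one image, under two assumptions that the text does not spell out: the $J$ inside the Fisher variance is read as the model-predicted amplitude, and the scale prior is taken to be the quadratic penalty $(w_G\,(G-1))^2$. The function names and the dictionary layout are hypothetical.

```python
import math


def fisher_sigma(profile, background, J_model):
    """sigma_J from the model-expected variance v_p = B + max(J, 0) * P_p.

    Assumes background > 0 so that every v_p is positive.
    """
    den = sum(P * P / (background + max(J_model, 0.0) * P) for P in profile)
    return math.sqrt(1.0 / den)


def image_loss(G, reflections, w_G):
    """Sum of squared per-reflection residuals plus an assumed scale prior.

    reflections: dicts with keys 'J', 'sigma_J', 'dw', 'p', 'pol', 'I_ref'.
    """
    total = 0.0
    for r in reflections:
        model = G * r["dw"] * r["p"] * r["pol"] * r["I_ref"]
        total += ((r["J"] - model) / r["sigma_J"]) ** 2
    return total + (w_G * (G - 1.0)) ** 2  # assumed form of the prior


# Toy image whose amplitudes match the reference exactly at G = 1.
reflections = [
    {"J": 40.0, "sigma_J": 2.0, "dw": 0.8, "p": 0.5, "pol": 1.0, "I_ref": 100.0},
    {"J": 9.0, "sigma_J": 1.5, "dw": 0.9, "p": 0.2, "pol": 1.0, "I_ref": 50.0},
]
loss_at_one = image_loss(1.0, reflections, w_G=1.0)
loss_off = image_loss(1.1, reflections, w_G=1.0)
```

In a real fit the Debye–Waller factor would be recomputed from $B_\mathrm{fac}$ at each iteration; it is stored per reflection here only to keep the example short.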

Geometry is not refined here. PixelRefine's earlier per-image geometry refinement (regularised orientation LSQ, signal-weighting, a global orientation/cell sweep) was removed: on true stills the predictions are already good (radial centroid error ≈ 0, tangential ≈ a 0.4 px sampling floor that is not a recoverable misprediction), and per-image geometry refinement only overfit the sparse signal. Geometry is the job of XtalOptimizer.

5. Background estimator — mean, not median (both integrators)

Symptom. Pushed past the true resolution limit, the no-signal shells reported \langle I/\sigma\rangle\approx4\text{–}6 at \mathrm{CC}_{1/2}\approx0 — confident "data" where there is none. Present in both PixelRefine and the classical BraggIntegrate2D.

Cause. Both used the median of the background ring. For a right-skewed (Poisson) background \operatorname{median}(B) < \mathbb{E}[B], so subtraction under-subtracts by a tiny but coherent per-pixel offset that grows over an n_\mathrm{pix} peak and a multiplicity-m merge into a fake \langle I/\sigma\rangle\propto\sqrt{m}, worst where the real signal is weakest.

Fix. Use the mean of the ring (spot cores and saturation sentinels already excluded). \langle I/\sigma\rangle then collapses to ~0 wherever \mathrm{CC}\approx0 and the honest resolution limit becomes visible. This was the single largest contributor to untrustworthy σ — a one-line change in each integrator.
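
The size of the offset can be illustrated from the Poisson distribution itself, without simulation. The sketch below computes the exact median of a Poisson background and the counts left under a peak when the median is subtracted instead of the mean; the function names and the example values (2.5 counts per pixel, 20 pixels) are assumptions made for illustration.

```python
import math


def poisson_median(mean):
    """Smallest integer k whose cumulative Poisson probability reaches 0.5."""
    k = 0
    pmf = math.exp(-mean)
    cdf = pmf
    while cdf < 0.5:
        k += 1
        pmf *= mean / k
        cdf += pmf
    return k


def undersubtracted_counts(mean, n_pix):
    """Counts left in an n_pix peak when the median replaces the mean."""
    return (mean - poisson_median(mean)) * n_pix


median_low = poisson_median(0.5)
median_mid = poisson_median(2.5)
excess = undersubtracted_counts(2.5, 20)
```

The median falls below the mean in both examples, and the leftover has the same sign in every observation of every reflection, which is why it survives merging rather than averaging out.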

6. Error model (global `a,b`; XDS form)

Counting statistics under-estimate the variance of strong reflections, which carry systematic errors proportional to I, not \sqrt{I}. The standard correction inflates the variance with a global two-parameter model, applied at the merge level so both integrators benefit:


$$
\sigma'^{\,2} = a\,\sigma^{2} + (b\,\langle I\rangle)^{2},
\qquad
\mathrm{ISa} = \frac{1}{b} = \lim_{I\to\infty}\frac{I}{\sigma'} .
$$

The I^2 term uses the reflection mean \langle I\rangle (not the per-observation I_i, which would bias the merged mean and collapse CC); a,b are fit from the spread of symmetry equivalents with a relative (1/\mathrm{dev}^4) weight so the strong bins (which fix b) do not swamp the weak bins (which fix a). jfjoch_process prints the model and ISa.
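
Applying the model is a one-liner; the sketch below also checks the asymptotic limit numerically. It covers only the application of given $a,b$, not the fit from symmetry equivalents, and the function name `inflate_sigma` and the example values are hypothetical.

```python
import math


def inflate_sigma(sigma, mean_I, a, b):
    """sigma' = sqrt(a * sigma^2 + (b * <I>)^2), using the reflection mean <I>."""
    return math.sqrt(a * sigma ** 2 + (b * mean_I) ** 2)


# Very strong reflection with pure counting statistics, sigma = sqrt(I).
I = 1.0e8
sigma = math.sqrt(I)
b = 0.05
i_over_sigma = I / inflate_sigma(sigma, I, a=1.0, b=b)
isa = 1.0 / b
```

For this reflection counting statistics alone would give $I/\sigma = 10^4$, while the inflated value sits just below $\mathrm{ISa} = 20$, the ceiling set by $b$.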

Results (lysozyme rotation crystal `fixed_master.h5`, treated as stills, 1.7 Å)

| Configuration | $N_\mathrm{obs}$ | $\langle I/\sigma\rangle$ | CC$_{1/2}$ | CC$_\mathrm{ref}$ |
|---|---|---|---|---|
| Baseline per-pixel loss | 799 k | 7.2 | erratic, →0 at 1.7 Å | erratic |
| Factored Terms 1+2 (this model) | 1.22 M | 10.7 | 84–92 % flat | 77–92 % flat |

The factored objective turns the erratic per-pixel result, which collapses at high resolution, into flat ~90 % CC$_{1/2}$/CC$_\mathrm{ref}$ out to 1.7 Å, so the stills path matches the proper rotation integration path. (See FINDINGS-2026-06.md.)

Default recipe

§§1–4 are PixelRefine-specific; §§5–6 act at the integration/merge level and apply to the classical route too.

| Field / behaviour | Default | Section |
|---|---|---|
| fit/extraction variance | local background $B$ | 1 |
| `ewald_dist_cutoff` | $2\times10^{-3}\,\text{Å}^{-1}$ | 2 |
| tangential width $R_1$ | measured per resolution shell | 3 |
| objective | per-reflection intensity residual, Fisher-weighted | 4 |
| refined parameters | per-image $G$ (and $B$); geometry fixed | 4 |
| `scale_reg_sigma` | 2.0 | 4 |
| local background estimator | mean of the ring | 5 |
| merge error model | global $a,b$ (ISa printed) | 6 |

ewald_dist_cutoff (multiplicity vs. cost) and bandwidth (Si vs. DMM) are the two knobs worth setting per dataset.
