Jungfraujoch/image_analysis/pixel_refinement/FACTORED_MODEL.md
leonarski_f 100fe7b7e7
PixelRefine: make factored Terms 1+2 the model, remove old wiring
PixelRefine is now an intensity-only operation: geometry is fixed (refined
upstream by XtalOptimizer) and the only objective is the factored per-reflection
likelihood (FACTORED_MODEL.md Terms 1+2) - measured per-resolution profile width
R1 plus one Fisher-weighted intensity/scaling residual per reflection, fitting
the per-image scale G and B. Validated on crystal 2 (fixed_master.h5 as stills,
1.7 A): CC1/2 84-92%, CCref 77-92%, flat - reproduces the env-flag prototype and
matches the rotation path from the stills path.

Removed:
- the per-pixel ShoeboxResidual loss and PixelResidual cost functor;
- all in-PixelRefine geometry refinement (orientation/cell/beam/distance/R),
  the regularised-orientation LSQ, signal-weighting, and the global sweep;
- Term 3 (per-spot recentring) - a confirmed no-op on both crystals;
- the diagnostic scaffolding (covariance, centroid, adaptive_R1) and the
  PR_* env knobs + stderr dumps in IndexAndRefine;
- the PredictImage/ChiSquaredImage renderers and the entire viewer
  PixelRefine window/table/params + worker bindings + shoebox overlay.

The sweep box-integrator background median became mean (consistency) by virtue
of removing the sweep. METHODS.md rewritten for the current model; findings
recorded in FINDINGS-2026-06.md. Net -2200 lines.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 22:02:18 +02:00


A factored likelihood for joint integration + scaling + geometry

Status: Terms 1+2 are implemented and ship as the PixelRefine default (see PixelRefine.cpp, METHODS.md, FINDINGS-2026-06.md); Term 3 (geometry) and the priors/NN extensions of §4 remain future work.

Goal: replace the per-pixel least-squares of PixelRefine with a per-reflection likelihood that fuses profile-fit integration, scaling against the reference, and geometry refinement into one differentiable objective. This is the foundation for priors (Bayesian) and learned components (NN), and it dissolves the empty-pixel and parameter-degeneracy problems by construction rather than by patching.

0. Notation

Per image, parameters θ: scale G, Debye-Waller B, orientation + cell (geometry), profile width R1 (tangential, possibly a 2×2 tensor), partiality width R0 (radial/mosaicity; R0_eff² = R0² + R_bw², R_bw² = (bλ)²/2d⁴ known from bandwidth). Per reflection h: reference intensity I_ref (the hypothesis), resolution d, predicted centre c_pred, partiality p = exp(−ε_r²/R0_eff²), polarisation pol, B_term = exp(B/4d²), shoebox pixels {I_p} with mean local background Bg, and the area-normalised tangential profile template P_p = P_tang(ε_t,p; R1).
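The per-reflection factors defined above can be sketched in a few lines. This is a minimal illustration, not the PixelRefine code: the function names are hypothetical, `R_bw² = (bλ)²/2d⁴` is read as `(bλ)² / (2·d⁴)`, and the sign of `B_term` is taken exactly as written in the notation.

```python
import math

def bandwidth_width_sq(bandwidth, wavelength, d):
    """R_bw^2 = (b*lambda)^2 / (2 d^4); the grouping (2 d^4) is an assumed reading."""
    return (bandwidth * wavelength) ** 2 / (2.0 * d ** 4)

def partiality(eps_r, r0, bandwidth, wavelength, d):
    """p = exp(-eps_r^2 / R0_eff^2) with R0_eff^2 = R0^2 + R_bw^2."""
    r0_eff_sq = r0 ** 2 + bandwidth_width_sq(bandwidth, wavelength, d)
    return math.exp(-eps_r ** 2 / r0_eff_sq)

def b_term(b_factor, d):
    """B_term = exp(B / 4d^2), sign convention as written in the notation."""
    return math.exp(b_factor / (4.0 * d ** 2))
```

A reflection sitting exactly on the Ewald sphere (`eps_r = 0`) gets `p = 1`, and a non-zero bandwidth raises the partiality of an off-sphere reflection because it widens `R0_eff`.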

1. The factorisation principle

A reflection's shoebox carries three (to first order) orthogonal pieces of information — the 0th, 1st and 2nd moments of its intensity distribution:

| moment | statistic | constrains |
|---|---|---|
| 0th (total) | profile-fit amplitude J | scale chain G, B (and p) |
| 1st (position) | centroid c_obs | geometry (orientation; radial→distance/cell) |
| 2nd (shape) | second moment M₂ | profile width R1 (and anisotropy) |

The current per-pixel residual mixes all three into one objective over shared pixels. That is what couples the parameters (measured G↔R0 ≈ 0.46, G↔R1 ≈ +0.51) and lets the many empty pixels dominate. Forming a residual for each moment against its own model instead gives a block-diagonal Jacobian: the couplings vanish because each statistic carries one parameter block's information.

2. The three residual terms

2.1 Intensity / scaling residual (one scalar per reflection)

Optimal (Diamond) profile-fit amplitude and its model:

J       = Σ_p w_p P_p (I_p − Bg) / Σ_p w_p P_p²          w_p = 1/v_p
J_model = G · B_term · p · pol · I_ref
r¹_h    = (J − J_model) / σ_J

J is approximately invariant to R1 (a well-sampled spot integrates to the same total whatever width is assumed), so R1 drops out of this residual. Empty pixels generate no residual of their own; they enter only through J with ~zero profile weight, so the empty-pixel problem is gone by construction. This residual is also the scaling residual: integration and scaling are now one objective.
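The intensity residual can be sketched as follows. This is a minimal illustration under stated assumptions, not the shipped implementation: the function names are hypothetical, the shoebox is passed as flat Python lists, and the per-pixel variance is the model-expected one, `Bg + J_model · P_p`.

```python
import math

def profile_fit_amplitude(pixels, profile, bg, variance):
    """Weighted profile-fit amplitude J and its sigma, with w_p = 1/v_p."""
    num = sum(P * (I - bg) / v for I, P, v in zip(pixels, profile, variance))
    den = sum(P * P / v for P, v in zip(profile, variance))
    return num / den, math.sqrt(1.0 / den)

def intensity_residual(pixels, profile, bg, G, B_term, p, pol, I_ref):
    """r1 = (J - J_model) / sigma_J for one reflection."""
    J_model = G * B_term * p * pol * I_ref
    # Expected variance: background plus expected signal (assumes bg > 0).
    variance = [bg + J_model * P for P in profile]
    J, sigma_J = profile_fit_amplitude(pixels, profile, bg, variance)
    return (J - J_model) / sigma_J
```

Pixels where the template `P_p` is zero add nothing to either sum, so padding the shoebox with empty pixels leaves `J` and `σ_J` unchanged.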

2.2 Shape residual (constrains R1; decoupled from scale)

M₂_obs   = Σ_p (I_p − Bg) ε_t,p² / Σ_p (I_p − Bg)      (intensity-weighted variance, Å⁻²)
M₂_model = R1² / 2                                      (variance of exp(−ε_t²/R1²))
r²_h     = (M₂_obs − M₂_model) / σ_M2

A moment is normalised by the total, so it is scale-invariant and ∂r²/∂G = 0. The G↔R1 degeneracy disappears. Anisotropic extension: use the 2×2 moment tensor Σ(I−Bg)(ε_t⊗ε_t)/Σ(I−Bg) vs diag(R1a²/2, R1b²/2) → elliptical R1 (the DMM streak). Weak spots have huge σ_M2 and contribute ~nothing, so R1 is set by strong spots automatically (and may be made R1(d) per resolution).
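A one-dimensional sketch of the shape residual, assuming a scalar tangential offset `ε_t` per pixel. The function names are hypothetical, and `σ_M2` is evaluated from the model value `M₂_model` (one possible reading of `σ_M2 ≈ M₂ · √(2/N_eff)`); `n_eff` is passed in rather than computed here.

```python
import math

def second_moment(pixels, eps_t, bg):
    """Intensity-weighted second moment of the tangential offsets."""
    signal = [I - bg for I in pixels]
    total = sum(signal)
    return sum(s * e * e for s, e in zip(signal, eps_t)) / total

def shape_residual(pixels, eps_t, bg, R1, n_eff):
    """r2 = (M2_obs - M2_model) / sigma_M2, with M2_model = R1^2 / 2."""
    m2_obs = second_moment(pixels, eps_t, bg)
    m2_model = R1 ** 2 / 2.0
    # Assumption: sigma_M2 uses the model moment, in line with expected-variance weighting.
    sigma_m2 = m2_model * math.sqrt(2.0 / n_eff)
    return (m2_obs - m2_model) / sigma_m2
```

Because the moment divides by the summed signal, multiplying every background-subtracted pixel by a constant leaves `second_moment` unchanged, which is the scale invariance the text relies on.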

2.3 Position residual (constrains geometry; decoupled from scale and shape)

c_obs = Σ_p (I_p − Bg)(x_p, y_p) / Σ_p (I_p − Bg)
r³_h  = (c_obs − c_pred(geometry)) / σ_c              (2-vector; split radial / tangential)

Centroid is scale- and width-invariant → ∂r³/∂G = ∂r³/∂R1 ≈ 0. The radial component constrains distance/cell, the tangential constrains orientation — exactly the split the diagnostic measured (radial≈0 = no distance error; tangential∝radius = orientation).
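The centroid and its radial/tangential split can be sketched like this. It is an illustration only: the function names are hypothetical, and the radial direction is assumed to be the unit vector from the beam centre on the detector to the predicted spot position, which the text does not spell out.

```python
import math

def centroid(pixels, coords, bg):
    """Background-subtracted intensity centroid (x, y) of a shoebox."""
    signal = [I - bg for I in pixels]
    total = sum(signal)
    cx = sum(s * x for s, (x, _) in zip(signal, coords)) / total
    cy = sum(s * y for s, (_, y) in zip(signal, coords)) / total
    return cx, cy

def position_residual(c_obs, c_pred, beam_centre, sigma_c):
    """Return (radial, tangential) components of (c_obs - c_pred) / sigma_c."""
    dx, dy = c_obs[0] - c_pred[0], c_obs[1] - c_pred[1]
    rx, ry = c_pred[0] - beam_centre[0], c_pred[1] - beam_centre[1]
    norm = math.hypot(rx, ry)
    ux, uy = rx / norm, ry / norm          # radial unit vector (assumed definition)
    radial = dx * ux + dy * uy
    tangential = -dx * uy + dy * ux        # component along the perpendicular
    return radial / sigma_c, tangential / sigma_c
```

A pure radial offset then shows up only in the first component and a pure tangential offset only in the second, which is what lets the two be attributed to distance/cell and orientation respectively.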

3. Fisher / expected-variance weighting (makes it a likelihood)

Every σ uses the model-expected variance, never observed counts — this is what makes strong expected reflections carry the information and makes the model "feel pain when something that should be there is not":

v_p   = Bg + J_model · P_p           (background + expected signal from I_ref, not I_obs)
σ_J²  = 1 / Σ_p (P_p² / v_p)
σ_M2  ≈ M₂ · √(2 / N_eff),   σ_c ≈ R1 / √(N_eff),   N_eff = (Σ(I−Bg))² / Σ v_p

Fisher information about G from term 1 is ∝ (B_term·p·pol·I_ref)² / σ_J² — driven by I_ref, so a noise spike (high counts, low I_ref) gets no weight while a strong expected reflection observed absent (J≈0, large residual, moderate σ_J) gets a large penalty. The reference enters at maximum leverage: it sets both the target and the weight.
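The weighting argument can be checked numerically with a small sketch. The function names and the five-pixel profile used in the usage note are illustrative assumptions; the formulas are the ones listed above.

```python
import math

def sigma_J(profile, bg, J_model):
    """sigma_J from the model-expected variance v_p = Bg + J_model * P_p."""
    return math.sqrt(1.0 / sum(P * P / (bg + J_model * P) for P in profile))

def fisher_info_G(profile, bg, G, B_term, p, pol, I_ref):
    """Information about G from term 1: (B_term * p * pol * I_ref)^2 / sigma_J^2."""
    chain = B_term * p * pol * I_ref
    return chain ** 2 / sigma_J(profile, bg, G * chain) ** 2

def n_eff(pixels, profile, bg, J_model):
    """N_eff = (sum(I - Bg))^2 / sum(v_p)."""
    signal_sum = sum(I - bg for I in pixels)
    variance_sum = sum(bg + J_model * P for P in profile)
    return signal_sum ** 2 / variance_sum

def absent_reflection_residual(profile, bg, J_model):
    """Residual r1 when a reflection expected at J_model is observed with J = 0."""
    return (0.0 - J_model) / sigma_J(profile, bg, J_model)
```

With a profile such as `[0.05, 0.2, 0.5, 0.2, 0.05]` and `bg = 10`, a reflection with `I_ref = 1000` carries several orders of magnitude more information about `G` than one with `I_ref = 1`, and a strong expected reflection that is observed absent produces a residual of many σ.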

4. Joint objective and priors

L(θ) = Σ_h [ (r¹_h)² + (r²_h)² + |r³_h|² ]  +  priors

No free λ if the σ's are correct — the relative weighting is the Fisher information. Priors are the Bayesian hooks and the principled degeneracy breaks:

  • R0 (partiality/mosaicity) is GLOBAL + prior. R0 multiplies J_model (p), so it is still degenerate with the per-image G within term 1 — the one degeneracy the factorisation does not remove. Resolve it physically, not with a directional G prior (which would bias every output intensity): R0 ~ N(mosaicity, σ), R_bw fixed from the known bandwidth, and R0 fit globally (one per crystal, from many reflections' partiality distribution) so per-image G can't trade against it.
  • orientation ~ N(spot-centroid, σ); G ~ N(1, σ_G) or tied to the beam monitor; distance ~ N(nominal, σ_L) (loose, since serial/jet alignment is poorly constrained).
  • Optional Bayesian intensities: treat I_true as a parameter with the reference as its prior → posterior over intensities, not point estimates.
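In a least-squares setting, a Gaussian prior is commonly written as one extra residual, `(value − mean) / σ`, so the objective above stays a plain sum of squares. A minimal sketch, with hypothetical function names and a tuple layout `(r1, r2, (r3_radial, r3_tangential))` chosen only for illustration:

```python
def gaussian_prior_residual(value, mean, sigma):
    """Residual whose square is the negative log of N(mean, sigma), up to constants."""
    return (value - mean) / sigma

def joint_objective(reflection_residuals, prior_residuals):
    """L = sum_h [ r1^2 + r2^2 + |r3|^2 ] + sum of squared prior residuals."""
    total = 0.0
    for r1, r2, r3 in reflection_residuals:
        total += r1 * r1 + r2 * r2 + sum(c * c for c in r3)
    total += sum(r * r for r in prior_residuals)
    return total
```

For example, `G ~ N(1, σ_G)` becomes `gaussian_prior_residual(G, 1.0, sigma_G)`, and a global `R0 ~ N(mosaicity, σ)` contributes a single residual shared by all images.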

5. Why the degeneracies vanish (Jacobian structure)

Jᵀ W J is approximately block-diagonal in (G,B,p | R1 | geometry):

∂r¹/∂{G,B,p} ≠ 0 ;  ∂r¹/∂R1 ≈ 0 ;  ∂r¹/∂geom ≈ 0
∂r²/∂R1     ≠ 0 ;  ∂r²/∂G  = 0 ;  ∂r²/∂geom ≈ 0
∂r³/∂geom   ≠ 0 ;  ∂r³/∂G  = 0 ;  ∂r³/∂R1  ≈ 0

So G↔R1 (+0.51) and all the cross-couplings drop to ~0 by construction. Only G↔R0 survives (R0 is a scale-multiplier, not a shape), handled by the global+physical prior of §4. The degeneracies we measured were artifacts of projecting all information onto a single per-pixel residual.
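The scale invariance of the centroid and of the second moment can be verified by finite differences on a synthetic shoebox. This sketch makes several simplifying assumptions: a noise-free 2D Gaussian spot in pixel units, a flat known background, and the plain background-subtracted sum standing in for the 0th moment instead of the profile-fit amplitude J. It checks only the `∂/∂G` column of the structure above.

```python
import math

BG = 10.0  # flat background, illustrative value

def make_shoebox(G, R1=2.0, centre=(0.3, -0.2), half=6):
    """Noise-free spot: Bg + G * 1000 * exp(-r^2 / R1^2) on a (2*half+1)^2 grid."""
    coords = [(x, y) for y in range(-half, half + 1) for x in range(-half, half + 1)]
    pixels = [
        BG + G * 1000.0 * math.exp(-((x - centre[0]) ** 2 + (y - centre[1]) ** 2) / R1 ** 2)
        for x, y in coords
    ]
    return coords, pixels

def moments(G):
    """Return (total, cx, cy, m2) of the background-subtracted shoebox."""
    coords, pixels = make_shoebox(G)
    signal = [I - BG for I in pixels]
    total = sum(signal)
    cx = sum(s * x for s, (x, _) in zip(signal, coords)) / total
    cy = sum(s * y for s, (_, y) in zip(signal, coords)) / total
    m2 = sum(s * ((x - cx) ** 2 + (y - cy) ** 2) for s, (x, y) in zip(signal, coords)) / total
    return total, cx, cy, m2

def d_dG(index, G=1.0, h=1e-3):
    """Central finite difference of one statistic with respect to the scale G."""
    return (moments(G + h)[index] - moments(G - h)[index]) / (2.0 * h)
```

Here `d_dG(0)` is large (the total scales with G), while `d_dG(1)`, `d_dG(2)` and `d_dG(3)` vanish to rounding error, so the scale parameter is visible only to the 0th-moment block.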

6. Implementation notes

  • Per reflection: 1 (intensity) + 1 (shape) + 2 (position) residuals = 4, vs ~49 per-pixel residuals → cheaper, and Ceres autodiffs the moment formulas through the pixels.
  • The per-pixel forward model still defines P_tang, p, etc.; the loss moves to the moments.
  • Geometry (term 3) can run as the global sweep we have (it already maximises a position/CC objective); terms 1–2 are the per-image photometry. Or solve all three jointly per image with the global R0/mosaicity shared across images (two-level fit).
  • Drop-in path: keep the current extraction, add the three residuals as a new objective behind a flag, compare against the per-pixel loss on both test crystals.

7. Why this serves the goal

It is one differentiable likelihood, factored along the physics, that (a) maximises use of the reference (target + Fisher weight), (b) is the substrate for priors / posteriors over intensities (Bayesian), and (c) lets any term — profile P, partiality p, corrections — be replaced by a learned function trained through the same likelihood. That is the qualitative move XDS-style empirical profile fitting cannot make.