Jungfraujoch/image_analysis/pixel_refinement/FACTORED_MODEL.md
leonarski_f and Claude Opus 4.8 100fe7b7e7
PixelRefine: make factored Terms 1+2 the model, remove old wiring
PixelRefine is now an intensity-only operation: geometry is fixed (refined
upstream by XtalOptimizer) and the only objective is the factored per-reflection
likelihood (FACTORED_MODEL.md Terms 1+2) - measured per-resolution profile width
R1 plus one Fisher-weighted intensity/scaling residual per reflection, fitting
the per-image scale G and B. Validated on crystal 2 (fixed_master.h5 as stills,
1.7 A): CC1/2 84-92%, CCref 77-92%, flat - reproduces the env-flag prototype and
matches the rotation path from the stills path.

Removed:
- the per-pixel ShoeboxResidual loss and PixelResidual cost functor;
- all in-PixelRefine geometry refinement (orientation/cell/beam/distance/R),
  the regularised-orientation LSQ, signal-weighting, and the global sweep;
- Term 3 (per-spot recentring) - a confirmed no-op on both crystals;
- the diagnostic scaffolding (covariance, centroid, adaptive_R1) and the
  PR_* env knobs + stderr dumps in IndexAndRefine;
- the PredictImage/ChiSquaredImage renderers and the entire viewer
  PixelRefine window/table/params + worker bindings + shoebox overlay.

The sweep box-integrator background median became mean (consistency) by virtue
of removing the sweep. METHODS.md rewritten for the current model; findings
recorded in FINDINGS-2026-06.md. Net -2200 lines.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 22:02:18 +02:00

# A factored likelihood for joint integration + scaling + geometry
**Status: Terms 1+2 implemented and shipping as the PixelRefine default** (see
`PixelRefine.cpp`, `METHODS.md`, `FINDINGS-2026-06.md`); Term 3 (geometry) and the
priors/NN extensions of §4 remain future work. Goal: replace the per-pixel least-squares
of PixelRefine with a per-*reflection* likelihood that fuses profile-fit integration,
scaling against the reference, and geometry refinement into one differentiable
objective — the foundation for priors (Bayesian) and learned components (NN), and the
thing that dissolves the empty-pixel and parameter-degeneracy problems by construction
rather than by patching.
## 0. Notation
Per image, parameters `θ`: scale `G`, Debye-Waller `B`, orientation + cell (geometry),
profile width `R1` (tangential, possibly a 2×2 tensor), partiality width `R0`
(radial/mosaicity; `R0_eff² = R0² + R_bw²`, `R_bw² = (bλ)²/2d⁴` *known* from bandwidth).
Per reflection `h`: reference intensity `I_ref` (the hypothesis), resolution `d`,
predicted centre `c_pred`, partiality `p = exp(−ε_r²/R0_eff²)`, polarisation `pol`,
`B_term = exp(B/4d²)`, shoebox pixels `{I_p}` with mean local background `Bg`, and the
area-normalised tangential profile template `P_p = P_tang(ε_t,p; R1)`.
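
As a minimal sketch of how the per-reflection quantities above combine, the following Python helpers evaluate `R0_eff²`, `p`, `B_term` and the product `G · B_term · p · pol · I_ref` (the expected amplitude used in §2.1). The function and argument names are illustrative and are not taken from `PixelRefine.cpp`. The expressions `(bλ)²/2d⁴` and `B/4d²` are read here as `(bλ)²/(2d⁴)` and `B/(4d²)`, which is an assumption about operator precedence; the sign of `B` follows the text as written.

```python
import math

def r0_eff_sq(r0, b, wavelength, d):
    """R0_eff^2 = R0^2 + R_bw^2, with R_bw^2 = (b*lambda)^2 / (2 d^4) (assumed precedence)."""
    r_bw_sq = (b * wavelength) ** 2 / (2.0 * d ** 4)
    return r0 ** 2 + r_bw_sq

def partiality(eps_r, r0_eff2):
    """p = exp(-eps_r^2 / R0_eff^2); eps_r is the radial excitation error."""
    return math.exp(-eps_r ** 2 / r0_eff2)

def b_term(B, d):
    """B_term = exp(B / (4 d^2)), sign convention as written in the notation section."""
    return math.exp(B / (4.0 * d ** 2))

def expected_amplitude(G, B, pol, i_ref, d, eps_r, r0, b, wavelength):
    """G * B_term * p * pol * I_ref for one reflection."""
    p = partiality(eps_r, r0_eff_sq(r0, b, wavelength, d))
    return G * b_term(B, d) * p * pol * i_ref
```

With `B = 0`, `pol = 1` and a reflection exactly on the Ewald sphere (`eps_r = 0`), the result reduces to `G · I_ref`, which is a quick sanity check for any port of these formulas.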
## 1. The factorisation principle
A reflection's shoebox carries three (to first order) **orthogonal** pieces of
information — the 0th, 1st and 2nd moments of its intensity distribution:
| moment | statistic | constrains |
|---|---|---|
| 0th — total | profile-fit amplitude `J` | scale chain `G, B` (and `p`) |
| 1st — position | centroid `c_obs` | geometry (orientation; radial→distance/cell) |
| 2nd — shape | second moment `M₂` | profile width `R1` (and anisotropy) |
The current per-pixel residual mixes all three into one objective over shared pixels.
*That* is what couples the parameters (measured G↔R0 ≈ 0.46, G↔R1 ≈ +0.51) and lets the
many empty pixels dominate. Taking the residual of each **moment** against its model
instead gives an approximately block-diagonal Jacobian: the couplings vanish because each
statistic carries one parameter block's information.
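
The separation of the three moments can be illustrated on a synthetic one-dimensional spot. The sketch below is a toy, not the PixelRefine extraction: the grid, the Gaussian shape `exp(−(ε−c)²/w²)` and all names are assumptions made for illustration, and the second moment is taken about the observed centroid. Changing the total moves only the 0th moment, shifting the spot moves only the 1st, and widening it moves only the 2nd.

```python
import math

def spot(grid, total, centre, width):
    """Background-free 1-D spot exp(-((e-c)/w)^2), normalised so its sum is ~total."""
    step = grid[1] - grid[0]
    norm = step / (width * math.sqrt(math.pi))
    return [total * norm * math.exp(-((e - centre) / width) ** 2) for e in grid]

def moments(grid, signal):
    """Return (0th moment, centroid, variance about the centroid)."""
    m0 = sum(signal)
    m1 = sum(s * e for s, e in zip(signal, grid)) / m0
    m2 = sum(s * (e - m1) ** 2 for s, e in zip(signal, grid)) / m0
    return m0, m1, m2

# Illustrative sampling grid, wide enough that the tails are negligible.
grid = [-10.0 + 0.2 * k for k in range(101)]

base = moments(grid, spot(grid, 1000.0, 0.0, 1.0))
scaled = moments(grid, spot(grid, 2000.0, 0.0, 1.0))   # scale change only
shifted = moments(grid, spot(grid, 1000.0, 0.4, 1.0))  # position change only
wider = moments(grid, spot(grid, 1000.0, 0.0, 2.0))    # width change only
```

For this shape the variance equals `width² / 2`, matching the `R1² / 2` model of §2.2.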
## 2. The three residual terms
### 2.1 Intensity / scaling residual (one scalar per reflection)
Optimal (Diamond) profile-fit amplitude and its model:
```
J       = Σ_p w_p P_p (I_p − Bg) / Σ_p w_p P_p²,    w_p = 1/v_p
J_model = G · B_term · p · pol · I_ref
r¹_h    = (J − J_model) / σ_J
```
`J` is ~invariant to `R1` (a well-sampled spot integrates to the same total whatever
width is assumed) → **R1 drops out of this residual**. Empty pixels contribute no residual
of their own; they enter only through `J` with ~zero profile weight → **the empty-pixel
problem is gone by construction.** This residual *is* the scaling residual: integration
and scaling are now one objective.
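
A minimal Python sketch of this residual, assuming the shoebox is given as flat lists of counts `I_p` and profile values `P_p`, and using the model-expected variance `v_p = Bg + J_model · P_p` of §3. The function name and argument layout are hypothetical; `Bg` must be positive for the weights to be finite.

```python
import math

def intensity_residual(pixels, profile, bg, j_model):
    """Profile-fit amplitude J, its sigma, and r1 = (J - J_model) / sigma_J.

    pixels  : observed counts I_p
    profile : area-normalised profile values P_p
    bg      : mean local background per pixel (Bg > 0)
    j_model : expected amplitude G * B_term * p * pol * I_ref
    """
    num = 0.0
    den = 0.0
    for i_p, p_p in zip(pixels, profile):
        v_p = bg + j_model * p_p        # expected variance, not observed counts
        w_p = 1.0 / v_p
        num += w_p * p_p * (i_p - bg)
        den += w_p * p_p * p_p
    amplitude = num / den
    sigma_j = math.sqrt(1.0 / den)
    return amplitude, sigma_j, (amplitude - j_model) / sigma_j
```

Pixels with `P_p = 0` add nothing to either sum, so padding the shoebox with empty (even noisy) pixels leaves `J`, `σ_J` and the residual unchanged.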
### 2.2 Shape residual (constrains R1; decoupled from scale)
```
M₂_obs   = Σ_p (I_p − Bg) ε_t,p² / Σ_p (I_p − Bg)    (intensity-weighted variance, Å⁻²)
M₂_model = R1² / 2                                    (variance of exp(−ε_t²/R1²))
r²_h     = (M₂_obs − M₂_model) / σ_M2
```
A moment is normalised by the total → **scale-invariant → `∂r²/∂G = 0`**. The G↔R1
degeneracy disappears. Anisotropic extension: use the 2×2 moment tensor
`Σ(I−Bg)(ε_t⊗ε_t)/Σ(I−Bg)` vs `diag(R1a²/2, R1b²/2)` → elliptical R1 (the DMM streak).
Weak spots have huge `σ_M2` → contribute ~nothing → R1 is set by strong spots
automatically (and may be made `R1(d)` per resolution).
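
The isotropic shape residual can be sketched as follows. This is an illustration under stated assumptions: `eps_t` holds the tangential offset of each pixel from the predicted centre, `expected_var` holds the per-pixel expected variances `v_p` of §3, `σ_M2` is evaluated with the model value of `M₂`, and the background-subtracted total is assumed positive. All names are hypothetical.

```python
import math

def shape_residual(pixels, eps_t, bg, r1, expected_var):
    """Return (r2, M2_obs, sigma_M2) for one reflection.

    M2_obs   = sum((I - Bg) * eps_t^2) / sum(I - Bg)
    M2_model = R1^2 / 2
    sigma_M2 = M2_model * sqrt(2 / N_eff),  N_eff = (sum(I - Bg))^2 / sum(v_p)
    """
    signal = [i - bg for i in pixels]
    total = sum(signal)                 # assumed > 0 (spot actually present)
    m2_obs = sum(s * e * e for s, e in zip(signal, eps_t)) / total
    m2_model = r1 * r1 / 2.0
    n_eff = total * total / sum(expected_var)
    sigma_m2 = m2_model * math.sqrt(2.0 / n_eff)
    return (m2_obs - m2_model) / sigma_m2, m2_obs, sigma_m2
```

On a noise-free spot the residual is zero at the true width and positive when `R1` is assumed too small. A weaker spot over the same background yields a larger `σ_M2` and therefore less pull on `R1`.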
### 2.3 Position residual (constrains geometry; decoupled from scale and shape)
```
c_obs = Σ_p (I_p − Bg)(x_p, y_p) / Σ_p (I_p − Bg)
r³_h  = (c_obs − c_pred(geometry)) / σ_c    (2-vector; split radial / tangential)
```
Centroid is scale- and width-invariant → `∂r³/∂G = ∂r³/∂R1 ≈ 0`. The **radial** component
constrains distance/cell, the **tangential** constrains orientation — exactly the split
the diagnostic measured (radial≈0 = no distance error; tangential∝radius = orientation).
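
A sketch of the centroid residual and its radial/tangential split, in detector pixel coordinates. It assumes that "radial" means the direction from the beam centre to the predicted spot position; that definition, the `beam_centre` argument and the function name are assumptions for illustration, not taken from the implementation.

```python
import math

def position_residual(pixels, coords, bg, c_pred, beam_centre, sigma_c):
    """Return (radial, tangential) components of (c_obs - c_pred) / sigma_c.

    pixels      : observed counts I_p
    coords      : (x_p, y_p) for each pixel
    c_pred      : predicted spot centre (x, y)
    beam_centre : beam position on the detector (x, y), assumed known
    """
    signal = [i - bg for i in pixels]
    total = sum(signal)                 # assumed > 0
    cx = sum(s * x for s, (x, _) in zip(signal, coords)) / total
    cy = sum(s * y for s, (_, y) in zip(signal, coords)) / total
    dx, dy = cx - c_pred[0], cy - c_pred[1]
    # Unit vector pointing from the beam centre to the predicted position.
    rx, ry = c_pred[0] - beam_centre[0], c_pred[1] - beam_centre[1]
    norm = math.hypot(rx, ry)
    ux, uy = rx / norm, ry / norm
    radial = dx * ux + dy * uy          # projection on the radial direction
    tangential = -dx * uy + dy * ux     # projection on the perpendicular
    return radial / sigma_c, tangential / sigma_c
```

A spot displaced outward along the line from the beam centre gives a purely radial residual; the same displacement seen from a beam centre directly below the spot is purely tangential.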
## 3. Fisher / expected-variance weighting (makes it a likelihood)
Every `σ` uses the **model-expected** variance, never observed counts — this is what
makes strong *expected* reflections carry the information and makes the model "feel pain
when something that should be there is not":
```
v_p = Bg + J_model · P_p (background + expected signal from I_ref, not I_obs)
σ_J² = 1 / Σ_p (P_p² / v_p)
σ_M2 ≈ M₂ · √(2 / N_eff), σ_c ≈ R1 / √(N_eff), N_eff = (Σ(I−Bg))² / Σ v_p
```
Fisher information about `G` from term 1 is `∝ (B_term·p·pol·I_ref)² / σ_J²` — driven by
`I_ref`, so a noise spike (high counts, low `I_ref`) gets *no* weight while a strong
expected reflection observed absent (`J≈0`, large residual, moderate `σ_J`) gets a large
penalty. The reference enters at maximum leverage: it sets both the target and the weight.
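
The weighting argument can be checked numerically with a short sketch. The three-pixel profile, background and intensities in the usage note are made-up illustrative numbers, and the helper names are hypothetical. The code uses `∂J_model/∂G = B_term·p·pol·I_ref = J_model / G`.

```python
import math

def sigma_j_sq(profile, bg, j_model):
    """sigma_J^2 = 1 / sum(P_p^2 / v_p) with v_p = Bg + J_model * P_p."""
    return 1.0 / sum(p * p / (bg + j_model * p) for p in profile)

def fisher_info_G(profile, bg, G, j_model):
    """Fisher information about G from term 1: (dJ_model/dG)^2 / sigma_J^2."""
    dj_dg = j_model / G                 # = B_term * p * pol * I_ref
    return dj_dg ** 2 / sigma_j_sq(profile, bg, j_model)

def absent_residual(profile, bg, j_model):
    """r1 when every pixel equals the background, i.e. J = 0."""
    return (0.0 - j_model) / math.sqrt(sigma_j_sq(profile, bg, j_model))

profile = [0.1, 0.8, 0.1]
strong = fisher_info_G(profile, 10.0, 1.0, 1000.0)  # strong expected reflection
weak = fisher_info_G(profile, 10.0, 1.0, 1.0)       # weak expected reflection
```

In this toy case the strong expected reflection carries several orders of magnitude more information about `G` than the weak one, and `absent_residual(profile, 10.0, 1000.0)` is a large negative residual while `absent_residual(profile, 10.0, 1.0)` stays well below one sigma.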
## 4. Joint objective and priors
```
L(θ) = Σ_h [ (r¹_h)² + (r²_h)² + |r³_h|² ] + priors
```
No free λ if the σ's are correct — the relative weighting *is* the Fisher information.
Priors are the Bayesian hooks and the principled degeneracy breaks:
- **R0 (partiality/mosaicity) is GLOBAL + prior.** R0 multiplies `J_model` (`p`), so it is
still degenerate with the per-image `G` *within term 1* — the one degeneracy the
factorisation does **not** remove. Resolve it physically, not with a directional G prior
(which would bias every output intensity): `R0 ~ N(mosaicity, σ)`, `R_bw` fixed from the
known bandwidth, and `R0` fit **globally** (one per crystal, from many reflections'
partiality distribution) so per-image G can't trade against it.
- orientation `~ N(spot-centroid, σ)`; `G ~ N(1, σ_G)` or tied to the beam monitor;
distance `~ N(nominal, σ_L)` (loose, since serial/jet alignment is poorly constrained).
- Optional Bayesian intensities: treat `I_true` as a parameter with the reference as its
prior → posterior over intensities, not point estimates.
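
Because `L(θ)` is a sum of squared, σ-normalised residuals, a Gaussian prior `x ~ N(μ, σ)` enters as one more residual `(x − μ)/σ`. A minimal sketch, with hypothetical parameter names and a plain dictionary standing in for whatever parameter container the solver uses:

```python
def prior_residual(value, mean, sigma):
    """Residual contributed by a Gaussian prior N(mean, sigma) on one parameter."""
    return (value - mean) / sigma

def objective(data_residuals, params, priors):
    """Sum of squared data residuals plus squared prior residuals.

    data_residuals : flat list of r1, r2 and r3 components over all reflections
    params         : {name: current value}
    priors         : {name: (mean, sigma)} for the parameters that have a prior
    """
    total = sum(r * r for r in data_residuals)
    for name, (mean, sigma) in priors.items():
        total += prior_residual(params[name], mean, sigma) ** 2
    return total
```

For example, `objective([1.0, 2.0], {"G": 1.2}, {"G": (1.0, 0.1)})` adds a prior penalty of `(0.2 / 0.1)²` to the data term `1² + 2²`.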
## 5. Why the degeneracies vanish (Jacobian structure)
`Jᵀ W J` is approximately block-diagonal in `(G,B,p | R1 | geometry)`:
```
∂r¹/∂{G,B,p} ≠ 0 ; ∂r¹/∂R1 ≈ 0 ; ∂r¹/∂geom ≈ 0
∂r²/∂R1 ≠ 0 ; ∂r²/∂G = 0 ; ∂r²/∂geom ≈ 0
∂r³/∂geom ≠ 0 ; ∂r³/∂G = 0 ; ∂r³/∂R1 ≈ 0
```
So G↔R1 (+0.51) and all the cross-couplings drop to ~0 by construction. Only **G↔R0**
survives (R0 is a scale-multiplier, not a shape), handled by the global+physical prior of
§4. The degeneracies we measured were artifacts of projecting all information onto a
single per-pixel residual.
## 6. Implementation notes
- Per reflection: 1 (intensity) + 1 (shape) + 2 (position) residuals = 4, vs ~49 per-pixel
residuals → **cheaper**, and Ceres autodiffs the moment formulas through the pixels.
- The per-pixel forward model still *defines* `P_tang`, `p`, etc.; the **loss** moves to
the moments.
- Geometry (term 3) can run as the global sweep we have (it already maximises a
position/CC objective); terms 1–2 are the per-image photometry. Or solve all three
jointly per image with the global R0/mosaicity shared across images (two-level fit).
- Drop-in path: keep the current extraction, add the three residuals as a new objective
behind a flag, compare against the per-pixel loss on both test crystals.
## 7. Why this serves the goal
It is one differentiable likelihood, factored along the physics, that (a) maximises use of
the reference (target + Fisher weight), (b) is the substrate for priors / posteriors over
intensities (Bayesian), and (c) lets any term — profile `P`, partiality `p`, corrections —
be replaced by a learned function trained through the same likelihood. That is the
qualitative move XDS-style empirical profile fitting cannot make.