Reimplement BraggIntegrate2D (box sum) and ProfileIntegrate2D (Kabsch profile fit) under one roof as a base + CPU + GPU engine, mirroring the AzIntEngine / ROIIntegration pattern.

Reads the preprocessed int32 ImagePreprocessorBuffer (masked=INT32_MIN, saturated=INT32_MAX), the same buffer AzIntEngineGPU/ROIIntegrationGPU consume.

The CUDA engine runs one block per reflection with shared-memory reductions across six kernels (reset, mask, box-sum, profile learning, profile build, Kabsch fit); the resolution shell is computed inline. The learning/fit hot path is single precision (FP64 is throttled on consumer GPUs; reproduces the double CPU path to ~1e-4). Collapsing the per-frame CUDA API calls into one reset kernel keeps launch-latency overhead low.

Standalone for now: NOT wired into IndexAndRefine. See BRAGG_INTEGRATION_ENGINE.md for the design and the binding steps.

BraggIntegrationEngineGPUTest checks GPU == CPU across all three modes (box/gaussian/empirical) within numeric tolerance, plus a [bragg_bench] perf sweep.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
106 lines
5.3 KiB
C++
// SPDX-FileCopyrightText: 2026 Filip Leonarski, Paul Scherrer Institute <filip.leonarski@psi.ch>
// SPDX-License-Identifier: GPL-3.0-only

#pragma once

// =============================================================================
// BraggIntegrationEngine - box-sum + profile-fitting 2D integrator, GPU-ready
// =============================================================================
//
// A reimplementation of BraggIntegrate2D (box sum) and ProfileIntegrate2D (Kabsch profile
// fit) under one roof, following the AzIntEngine / ROIIntegration pattern: a base class that
// extracts the fixed per-experiment configuration, a plain-C++ CPU engine (the fallback and the
// numeric oracle), and a CUDA engine (BraggIntegrationEngineGPU) that reaches the same result up
// to floating-point precision.
//
// Unlike BraggIntegrate2D/ProfileIntegrate2D, which read the raw CompressedImage per pixel type
// and reject the special/saturation +/-1 band, this engine reads the already-preprocessed int32
// image held in an ImagePreprocessorBuffer (the same buffer AzIntEngineGPU/ROIIntegrationGPU
// consume): masked/bad pixels are INT32_MIN and saturated pixels INT32_MAX, so bad-pixel identity
// is owned by the preprocessor and a pixel is valid iff v != INT32_MIN && v != INT32_MAX.
//
// The integrator is selected by BraggIntegrationSettings::Integrator:
//   BoxSum           -> BraggIntegrate2D equivalent (rough disk sum minus ring-mean background)
//   ProfileGaussian  -> per-reflection measured-width Gaussian profile fit (the default)
//   ProfileEmpirical -> per-shell learned empirical profile fit
// The box sum is also the seed pass (Pass A) of the two profile modes, so it always runs.
//
// This class is intentionally standalone: it is NOT yet wired into IndexAndRefine. It takes a
// preprocessed image + the predicted reflections and returns the same vector<Reflection> shape
// (I, sigma, bkg, partiality, ...) that the downstream scaling/merge consumes unchanged.
// =============================================================================

#include <cmath>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

#include "../../common/BraggIntegrationSettings.h"
#include "../../common/DiffractionExperiment.h"
#include "../../common/DiffractionGeometry.h"
#include "../../common/Reflection.h"
#include "../image_preprocessing/ImagePreprocessorBuffer.h"

namespace bragg_engine {
    // Shared with both engines so the CPU and GPU paths stay numerically aligned.
    constexpr int N_SHELL = 6;                  // resolution shells for per-shell profile learning
    constexpr double STRONG_I_OVER_SIGMA = 5.0; // strong-spot threshold that seeds the profile
    constexpr int MIN_STRONG_PER_SHELL = 30;    // below this a shell falls back to the global profile
    constexpr double C_CAPTURE = 2.5;           // weak-spot radial capture term (monochromatic only)
} // namespace bragg_engine

// One reflection's extracted intensity, produced by the derived engine and turned into a
// Reflection by Finalize() (which owns the polarization correction and scale bookkeeping).
struct BraggFitResult {
    float I = 0.0f;
    float sigma = NAN;
    float bkg = 0.0f;
    float observed_x = 0.0f; // intensity-weighted centroid (BoxSum mode only)
    float observed_y = 0.0f;
    bool ok = false;
    bool has_observed = false;
};

class BraggIntegrationEngine {
protected:
    // --- fixed configuration extracted from the experiment (see ProfileIntegrate2D) ---
    IntegratorMode mode;
    bool empirical; // ProfileEmpirical (vs ProfileGaussian)

    size_t xpixel, ypixel, npixel;

    float r1_sq;
    float r2, r2_sq;
    float r3, r3_sq;
    float min_sigma_ratio;
    int R, G, GG; // profile-grid half-size, edge (2R+1) and area (G*G)

    bool broadband;      // a set bandwidth (stills) vs monochromatic (rotation)
    double bw_sigma;     // bandwidth sigma [dimensionless, * Rpx -> px]
    bool apply_bkg_clip; // stills-only high-outlier background sigma-clip
    bool use_ellipse;    // radially elongate the per-reflection Gaussian

    double c_radial; // radial variance coefficient of tan^2(2theta): parallax + capture
    double F_px;     // detector distance expressed in pixels
    float beam_x, beam_y;

    DiffractionGeometry geom; // kept for the per-reflection polarization correction
    std::optional<float> polarization;

    // Assemble output reflections from the per-reflection fit results (polarization + scale corr).
    std::vector<Reflection> Finalize(const std::vector<Reflection> &predicted, size_t npredicted,
                                     const std::vector<BraggFitResult> &results,
                                     int64_t image_number) const;

public:
    explicit BraggIntegrationEngine(const DiffractionExperiment &experiment);
    virtual ~BraggIntegrationEngine() = default;

    // predicted[0..npredicted) are the reflections to extract; image is the preprocessed int32
    // frame (image.size() == npixel). Returns only the observed reflections.
    virtual std::vector<Reflection> Run(const ImagePreprocessorBuffer &image,
                                        const std::vector<Reflection> &predicted, size_t npredicted,
                                        int64_t image_number) = 0;
};
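
The header comment describes the BoxSum mode as a rough disk sum minus a ring-mean background, with a pixel counted as valid iff it is neither INT32_MIN nor INT32_MAX. A minimal sketch of that idea is given below. It is not the engine's actual code: `BoxSumSketch`, `IsValidPixel` and `BoxSumAt` are hypothetical names, the image is a plain `std::vector<int32_t>` rather than an ImagePreprocessorBuffer, and it assumes that `r1_sq` bounds the signal disk while `r2_sq` and `r3_sq` bound the background ring, which the header does not state explicitly.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch (not the header's BraggFitResult).
struct BoxSumSketch {
    float I = 0.0f;   // background-subtracted disk sum
    float bkg = 0.0f; // mean background per pixel, taken from the ring
    bool ok = false;  // false if the disk or the ring had no valid pixel
};

// Validity rule quoted in the header comment: masked pixels are INT32_MIN,
// saturated pixels are INT32_MAX, everything else is usable.
inline bool IsValidPixel(int32_t v) {
    return v != INT32_MIN && v != INT32_MAX;
}

// Assumption: squared distance < r1_sq is the signal disk, and
// r2_sq <= squared distance < r3_sq is the background ring.
inline BoxSumSketch BoxSumAt(const std::vector<int32_t> &image, size_t xpixel, size_t ypixel,
                             float cx, float cy, float r1_sq, float r2_sq, float r3_sq) {
    BoxSumSketch out;
    const long r3 = static_cast<long>(std::ceil(std::sqrt(r3_sq)));
    const long x0 = std::lround(cx);
    const long y0 = std::lround(cy);

    double sig_sum = 0.0, bkg_sum = 0.0;
    size_t n_sig = 0, n_bkg = 0;

    for (long y = y0 - r3; y <= y0 + r3; ++y) {
        if (y < 0 || y >= static_cast<long>(ypixel)) continue;
        for (long x = x0 - r3; x <= x0 + r3; ++x) {
            if (x < 0 || x >= static_cast<long>(xpixel)) continue;
            const int32_t v = image[static_cast<size_t>(y) * xpixel + static_cast<size_t>(x)];
            if (!IsValidPixel(v)) continue; // skip masked and saturated pixels
            const float dx = static_cast<float>(x) - cx;
            const float dy = static_cast<float>(y) - cy;
            const float d2 = dx * dx + dy * dy;
            if (d2 < r1_sq) {
                sig_sum += v;
                ++n_sig;
            } else if (d2 >= r2_sq && d2 < r3_sq) {
                bkg_sum += v;
                ++n_bkg;
            }
        }
    }

    if (n_sig == 0 || n_bkg == 0) return out;
    out.bkg = static_cast<float>(bkg_sum / static_cast<double>(n_bkg));
    out.I = static_cast<float>(sig_sum - static_cast<double>(n_sig) * out.bkg);
    out.ok = true;
    return out;
}
```

On a flat frame of value 10 with a single peak pixel of 110 at the reflection centre, this sketch reports a background of 10 and an intensity of 100, and the answer does not change if a pixel in the disk or ring is flagged INT32_MAX or INT32_MIN. The real engine additionally produces sigma and the intensity-weighted centroid, which are left out here.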