929e6e1fa0
Build Packages / build:rpm (rocky8_nocuda) (push) Successful in 26m41s
Build Packages / build:rpm (rocky9_nocuda) (push) Successful in 29m5s
Build Packages / build:rpm (ubuntu2204_nocuda) (push) Successful in 25m41s
Build Packages / build:rpm (ubuntu2404_nocuda) (push) Successful in 24m59s
Build Packages / build:rpm (rocky8_sls9) (push) Successful in 29m21s
Build Packages / build:rpm (rocky9_sls9) (push) Successful in 30m14s
Build Packages / build:rpm (rocky8) (push) Successful in 24m32s
Build Packages / build:rpm (rocky9) (push) Successful in 28m25s
Build Packages / build:rpm (ubuntu2204) (push) Successful in 25m5s
Build Packages / build:rpm (ubuntu2404) (push) Successful in 23m57s
Build Packages / DIALS test (push) Successful in 33m44s
Build Packages / XDS test (durin plugin) (push) Successful in 20m34s
Build Packages / XDS test (JFJoch plugin) (push) Successful in 16m31s
Build Packages / XDS test (neggia plugin) (push) Successful in 15m11s
Build Packages / Generate python client (push) Successful in 26s
Build Packages / Build documentation (push) Successful in 1m24s
Build Packages / Create release (push) Skipped
Build Packages / Unit tests (push) Successful in 41m41s
The per-direction peak search took argmax|spec|. The projected-spot histogram has a broad low-frequency ENVELOPE (spots cluster near the origin) whose magnitude can exceed the true lattice peaks, so on weak / pink-beam frames every direction reported a short envelope vector (~10 A) and the real 38-79 A axes never surfaced -> 0 candidate cells -> 0% indexing. (Diagnosed on the lyso jet: the FFT returned only 10-15 A vectors, the true axes entirely absent.) Subtract a running-mean background of half-width ~15 A and pick the peak by its PROMINENCE (mag - background) instead. The smooth envelope cancels to ~0 while sharp lattice peaks - fundamentals and harmonics alike - keep their height, so the real axes win. The prominence is also reported as the magnitude, so FilterFFTResults ranks directions by real-peak strength rather than envelope. Ported identically to CPU (prefix-sum window) and GPU (sliding-window in the kernel). Validation (lyso, de-novo): jet FFT 0% -> 20.5% (CPU and GPU identical; vs FFBIDX 27%); crystal 2 95.3% -> 95.5% (no regression, CC1/2 95.8 / CCref 92.7 unchanged). The ~15 A window is the validated optimum (wider over-smooths, narrower under-removes the envelope). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
131 lines
4.7 KiB
C++
131 lines
4.7 KiB
C++
// SPDX-FileCopyrightText: 2025 Filip Leonarski, Paul Scherrer Institute <filip.leonarski@psi.ch>
|
|
// SPDX-License-Identifier: GPL-3.0-only
|
|
|
|
#include "FFTIndexerCPU.h"
|
|
|
|
#include <cmath>
|
|
#include <algorithm>
|
|
#include <stdexcept>
|
|
#include <cassert>
|
|
#include <mutex>
|
|
#include <vector>
|
|
|
|
static std::mutex fftw_plan_mutex;
|
|
|
|
static inline double dot_abs(const Coord& a, const Coord& b) {
|
|
return std::fabs(a.x * b.x + a.y * b.y + a.z * b.z);
|
|
}
|
|
|
|
FFTIndexerCPU::FFTIndexerCPU(const IndexingSettings& settings)
|
|
: FFTIndexer(settings) {
|
|
|
|
// Allocate host buffers
|
|
h_input_fft.resize(input_size, 0.0);
|
|
h_output_fft.resize(output_size);
|
|
|
|
// Validate allocations vs. expected sizes
|
|
if (h_input_fft.size() != input_size)
|
|
throw std::runtime_error("FFTWIndexer: input buffer size mismatch");
|
|
if (h_output_fft.size() != output_size)
|
|
throw std::runtime_error("FFTWIndexer: output buffer size mismatch");
|
|
|
|
const int H = static_cast<int>(histogram_size);
|
|
const int out_len = (H / 2) + 1;
|
|
|
|
int n[1] = { H };
|
|
|
|
{
|
|
std::unique_lock ul(fftw_plan_mutex);
|
|
plan = fftwf_plan_many_dft_r2c(
|
|
1, n, nDirections,
|
|
h_input_fft.data(), nullptr, 1, H,
|
|
reinterpret_cast<fftwf_complex*>(h_output_fft.data()), nullptr, 1, out_len,
|
|
FFTW_ESTIMATE);
|
|
}
|
|
|
|
if (!plan)
|
|
throw std::runtime_error("fftw_plan_many_dft_r2c failed");
|
|
}
|
|
|
|
|
|
FFTIndexerCPU::~FFTIndexerCPU() {
|
|
std::unique_lock ul(fftw_plan_mutex);
|
|
fftwf_destroy_plan(plan);
|
|
}
|
|
|
|
void FFTIndexerCPU::ExecuteFFT(const std::vector<Coord> &coord, size_t nspots) {
|
|
// Build histograms: one per direction
|
|
const int H = static_cast<int>(histogram_size);
|
|
const int D = nDirections;
|
|
const int out_len = (H / 2) + 1;
|
|
|
|
std::fill(h_input_fft.begin(), h_input_fft.end(), 0.0);
|
|
|
|
for (int d = 0; d < D; ++d) {
|
|
float* hist = h_input_fft.data() + static_cast<size_t>(d) * H;
|
|
|
|
for (size_t i = 0; i < nspots; i++) {
|
|
const auto& r = coord[i];
|
|
double dot = dot_abs(direction_vectors[d], r);
|
|
long long bin = static_cast<long long>(dot / histogram_spacing);
|
|
if (bin >= 0 && bin < H) {
|
|
hist[bin] += 1.0;
|
|
}
|
|
}
|
|
}
|
|
|
|
// Plan and execute batched R2C FFT with FFTW
|
|
fftwf_execute(plan);
|
|
|
|
// Post-process: pick the peak past min_length_A by PROMINENCE above a local background.
|
|
// The projected histogram has a broad low-frequency ENVELOPE (spots cluster near the
|
|
// origin) whose magnitude can exceed the true lattice peaks; a plain argmax|spec| then
|
|
// returns a short envelope vector (~10A) on weak/pink-beam frames and the real axes are
|
|
// lost. Subtracting a running-mean background of half-width BG_HALF bins removes that
|
|
// smooth envelope (it cancels to ~0) while sharp lattice peaks - fundamentals AND
|
|
// harmonics - keep their height. The prominence is also reported as the magnitude so
|
|
// FilterFFTResults ranks directions by real-peak strength, not by envelope.
|
|
const double len_coeff = 2.0 * static_cast<double>(max_length_A) / static_cast<double>(H);
|
|
// Background half-window ~15 A (in length, so it is independent of the histogram sizing);
|
|
// wide enough to span the envelope yet narrow enough not to smooth real peaks. Validated
|
|
// optimum on the lyso jet (de-novo indexing 0% -> 24%, vs FFBIDX 27%); crystal 2 unchanged.
|
|
constexpr double BG_HALF_WIDTH_A = 15.0;
|
|
const int bg_half = std::max(1, static_cast<int>(std::lround(BG_HALF_WIDTH_A / len_coeff)));
|
|
|
|
std::vector<double> mag(out_len), pref(out_len + 1);
|
|
|
|
for (int d = 0; d < D; ++d) {
|
|
const auto* spec = h_output_fft.data() + static_cast<size_t>(d) * out_len;
|
|
|
|
pref[0] = 0.0;
|
|
for (int j = 0; j < out_len; ++j) {
|
|
mag[j] = std::hypot(spec[j][0], spec[j][1]);
|
|
pref[j + 1] = pref[j] + mag[j];
|
|
}
|
|
|
|
double best_prom = 0.0;
|
|
double best_len = -1.0;
|
|
|
|
for (int j = 0; j < out_len; ++j) {
|
|
double len = len_coeff * static_cast<double>(j);
|
|
if (len <= static_cast<double>(min_length_A)) continue;
|
|
|
|
const int lo = std::max(0, j - bg_half);
|
|
const int hi = std::min(out_len, j + bg_half + 1);
|
|
const double background = (pref[hi] - pref[lo]) / static_cast<double>(hi - lo);
|
|
const double prominence = mag[j] - background;
|
|
|
|
if (prominence > best_prom) {
|
|
best_prom = prominence;
|
|
best_len = len;
|
|
}
|
|
}
|
|
|
|
result_fft[d] = FFTResult{
|
|
.magnitude = static_cast<float>(best_prom),
|
|
.direction = d,
|
|
.length = static_cast<float>(best_len)
|
|
};
|
|
}
|
|
}
|