cc3eb8352c
This is an UNSTABLE release. It contains significant modifications to data processing; if you run into trouble, go back to 1.0.0-rc.144.

* jfjoch_broker: Improve azimuthal integration (add <I^2> calculation)
* jfjoch_broker: Fixes around indexing, aiming to handle multi-lattice crystals (work in progress, not fully integrated)
* jfjoch_writer: Save mean(I), stddev(I), and count(I) for each azimuthal bin

Reviewed-on: #58
30 lines
1.1 KiB
C++
// SPDX-FileCopyrightText: 2025 Filip Leonarski, Paul Scherrer Institute <filip.leonarski@psi.ch>
// SPDX-License-Identifier: GPL-3.0-only

#pragma once

#include <cstdint>
#include <memory>
#include <vector>

#include "SpotFindingSettings.h"
#include "ImageSpotFinder.h"
#include "../indexing/CUDAMemHelpers.h"

// CUDA-based implementation of the ImageSpotFinder interface.
class ImageSpotFinderGPU : public ImageSpotFinder {
    std::shared_ptr<CudaStream> stream;

    // Device-side output buffers
    CudaDevicePtr<uint32_t> gpu_out_0;
    CudaDevicePtr<uint32_t> gpu_out_1;
    // Output buffer registered with CUDA (per the type name)
    CudaRegisteredVector<uint32_t> output_buffer_reg;

    const int numberOfCudaThreads = 128; // #threads per block that should work well for Nvidia L4
    const int numberOfWaves = 32;        // #waves that should work well for Nvidia L4
    const int windowSizeLimit = 32;      // limit on the window size (2nby+1, 2nbx+1) to prevent shared memory problems
public:
    ImageSpotFinderGPU(int32_t width, int32_t height, std::shared_ptr<CudaStream> stream);
    ~ImageSpotFinderGPU() override = default;

    std::vector<DiffractionSpot> Run(const ImagePreprocessorBuffer &image,
                                     const SpotFindingSettings &settings,
                                     const std::vector<bool> &res_mask) override;
};
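The comment on `windowSizeLimit` says the window size is `(2nby+1, 2nbx+1)`. The sketch below shows one way such a limit could be checked before launching a kernel. The helper names `WindowExtent` and `WindowFits` are hypothetical and do not appear in the header, and treating the limit as inclusive is an assumption; the actual check in the implementation file may differ.

```cpp
// Hypothetical illustration only: these helpers are NOT part of ImageSpotFinderGPU.
constexpr int kWindowSizeLimit = 32; // mirrors windowSizeLimit from the header

// Full window extent along one axis for a half-width nb, i.e. 2*nb + 1.
constexpr int WindowExtent(int nb) {
    return 2 * nb + 1;
}

// Returns true if both window extents stay within the limit.
// Assumption: the limit is inclusive (extent <= limit).
constexpr bool WindowFits(int nbx, int nby, int limit = kWindowSizeLimit) {
    return nbx >= 0 && nby >= 0
        && WindowExtent(nbx) <= limit
        && WindowExtent(nby) <= limit;
}
```

Under these assumptions, a half-width of 15 gives a 31-pixel window and fits, while 16 gives 33 and does not.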