aare/include
Khalil Daniel Ferjaoui a43814801a Add CUDA cluster finder kernel with host launcher
Implements a GPU version of the sequential ClusterFinder for
single-frame cluster reconstruction.

Kernel (ClusterFinderCUDA.cuh):
- Shared memory tiling with generalized halo loading for arbitrary
  cluster sizes (3x3, 5x5, ...).
- Zero-initialization of shared memory to handle image boundary
  and partial edge-block cases.
- Pedestal subtraction during shared memory loading.
- Compile-time cluster geometry, enabling full loop unrolling
  of the stencil reduction.
- Atomic global counter for lock-free cluster output across blocks.
- RAII host wrapper: `ClusterFinderCUDA` struct.
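
The kernel features listed above can be illustrated with a minimal sketch. This is not the code in ClusterFinderCUDA.cuh. The names `ClusterSketch`, `find_clusters_sketch`, `run_cluster_sketch`, and `TILE`, the single-threshold local-maximum test, and the row-major `float` frame layout are all assumptions made for illustration. Zero-filling of out-of-image cells is folded into the load loop here instead of being a separate initialization pass, the host side uses raw `cudaMalloc`/`cudaFree` where the commit describes an RAII wrapper, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <vector>

// Hypothetical output record; the real cluster type may differ.
template <int CX, int CY>
struct ClusterSketch {
    int x, y;             // centre pixel in image coordinates
    float data[CX * CY];  // pedestal-subtracted window, row-major
};

constexpr int TILE = 16;  // assumed block edge length in threads

template <int CX, int CY>
__global__ void find_clusters_sketch(const float* frame, const float* pedestal,
                                     int rows, int cols, float threshold,
                                     ClusterSketch<CX, CY>* out,
                                     unsigned int* count, unsigned int capacity) {
    constexpr int HX = CX / 2, HY = CY / 2;  // halo width on each side
    constexpr int SW = TILE + 2 * HX, SH = TILE + 2 * HY;
    __shared__ float tile[SH][SW];

    const int x0 = blockIdx.x * TILE - HX;  // image coordinates of tile[0][0]
    const int y0 = blockIdx.y * TILE - HY;
    const int tid = threadIdx.y * TILE + threadIdx.x;

    // Strided cooperative load: covers interior and halo for any CX, CY.
    for (int i = tid; i < SW * SH; i += TILE * TILE) {
        const int sx = i % SW, sy = i / SW;
        const int gx = x0 + sx, gy = y0 + sy;
        const bool inside = gx >= 0 && gx < cols && gy >= 0 && gy < rows;
        // Cells outside the image are zero; others are pedestal-subtracted.
        tile[sy][sx] = inside ? frame[gy * cols + gx] - pedestal[gy * cols + gx]
                              : 0.0f;
    }
    __syncthreads();

    const int gx = blockIdx.x * TILE + threadIdx.x;
    const int gy = blockIdx.y * TILE + threadIdx.y;
    if (gx >= cols || gy >= rows) return;  // partial edge block

    const int cx = threadIdx.x + HX, cy = threadIdx.y + HY;
    const float centre = tile[cy][cx];
    bool is_max = centre > threshold;  // simplified acceptance criterion
#pragma unroll
    for (int dy = -HY; dy <= HY; ++dy) {
#pragma unroll
        for (int dx = -HX; dx <= HX; ++dx) {
            is_max = is_max && tile[cy + dy][cx + dx] <= centre;
        }
    }
    if (!is_max) return;

    // Reserve an output slot without locks; drop the cluster if the buffer is full.
    const unsigned int slot = atomicAdd(count, 1u);
    if (slot >= capacity) return;
    ClusterSketch<CX, CY> c;
    c.x = gx;
    c.y = gy;
#pragma unroll
    for (int dy = 0; dy < CY; ++dy) {
#pragma unroll
        for (int dx = 0; dx < CX; ++dx) {
            c.data[dy * CX + dx] = tile[threadIdx.y + dy][threadIdx.x + dx];
        }
    }
    out[slot] = c;
}

// Hypothetical host launcher for a single frame.
template <int CX, int CY>
std::vector<ClusterSketch<CX, CY>> run_cluster_sketch(
    const std::vector<float>& frame, const std::vector<float>& pedestal,
    int rows, int cols, float threshold, unsigned int capacity) {
    using C = ClusterSketch<CX, CY>;
    const size_t bytes = frame.size() * sizeof(float);
    float *d_frame = nullptr, *d_ped = nullptr;
    C* d_out = nullptr;
    unsigned int* d_count = nullptr;
    cudaMalloc(&d_frame, bytes);
    cudaMalloc(&d_ped, bytes);
    cudaMalloc(&d_out, capacity * sizeof(C));
    cudaMalloc(&d_count, sizeof(unsigned int));
    cudaMemcpy(d_frame, frame.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_ped, pedestal.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemset(d_count, 0, sizeof(unsigned int));

    const dim3 block(TILE, TILE);
    const dim3 grid((cols + TILE - 1) / TILE, (rows + TILE - 1) / TILE);
    find_clusters_sketch<CX, CY><<<grid, block>>>(d_frame, d_ped, rows, cols,
                                                  threshold, d_out, d_count, capacity);

    unsigned int n = 0;
    cudaMemcpy(&n, d_count, sizeof(n), cudaMemcpyDeviceToHost);
    n = std::min(n, capacity);  // the counter may overshoot the buffer size
    std::vector<C> result(n);
    cudaMemcpy(result.data(), d_out, n * sizeof(C), cudaMemcpyDeviceToHost);
    cudaFree(d_frame); cudaFree(d_ped); cudaFree(d_out); cudaFree(d_count);
    return result;
}
```

Because `CX` and `CY` are template parameters, the halo widths and both window loops have compile-time bounds, which is what allows the stencil reduction to be fully unrolled. Clusters are written in whatever order `atomicAdd` hands out slots, so the output order is not deterministic, and with the `<=` comparison two neighbouring pixels of equal value would both be reported.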
2026-04-08 16:20:43 +02:00