mirror of
https://github.com/slsdetectorgroup/aare.git
synced 2026-06-05 16:18:41 +02:00
88e0e8d678
- Use per-stream pinned host staging buffers for truly async CUDA transfers. - Avoid reserving full device capacity per result frame. - Reduce kernel work by delaying cluster payload construction. - Use squared comparisons and removing per-pixel sqrtf() ops.