aare/python
kferjaoui 88e0e8d678
Build on RHEL8 / build (push) Successful in 2m51s
Build on RHEL9 / build (push) Successful in 3m15s
Run tests using data on local RHEL8 / build (push) Successful in 3m47s
Optimize CUDA cluster finder transfers and kernel hot path
- Use per-stream pinned host staging buffers for truly async CUDA transfers.
- Avoid reserving full device capacity per result frame.
- Reduce kernel work by delaying cluster payload construction.
- Use squared comparisons and remove per-pixel sqrtf() ops.
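
The last item above can be illustrated with a minimal host-side sketch. The commit does not show the kernel code, so everything here is an assumption: the function names are hypothetical, and the comparison is assumed to be a per-pixel test of a pedestal-subtracted value against n_sigma times the noise standard deviation. The point is only that squaring both sides gives the same decision without a square root.

```cpp
#include <cmath>

// Hypothetical names; not taken from the aare sources.

// Reference form: one square root per pixel to turn variance into sigma.
inline bool above_threshold_sqrt(float value, float variance, float n_sigma) {
    return value > n_sigma * std::sqrt(variance);
}

// Squared form: same decision with no square root.
// Assumes n_sigma >= 0 and variance >= 0, so the threshold is non-negative
// and a non-positive value can never exceed it; the sign check keeps
// negative values from passing once they are squared.
inline bool above_threshold_squared(float value, float variance, float n_sigma) {
    return value > 0.0f && value * value > n_sigma * n_sigma * variance;
}
```

The two forms can disagree for values within floating-point rounding of the threshold, so a change like this is best validated against reference data rather than assumed bit-identical.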
2026-04-30 18:23:31 +02:00