Jungfraujoch/common/CUDAWrapper.cpp at 02fa15c2b95cc94745b20394efebcb8477654b29

leonarski_f 02fa15c2b9 jfjoch_process: spread per-image GPU work across all visible GPUs

The offline worker threads built MXAnalysisWithoutFPGA without selecting a CUDA
device, so all per-image preprocessing/spot-finding/azimuthal integration ran on
GPU 0 (only the indexer pool was distributed). Add pin_gpu() to CUDAWrapper - a
process-wide round-robin counter (counter++ % get_gpu_count(), no thread id, no-op
without a GPU, honours CUDA_VISIBLE_DEVICES) - and call it once per worker before
building the analysis resources so their CUDA streams/engines land on distinct
devices.
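
The round-robin selection described above can be sketched as follows. This is a minimal illustration, not the code from the commit: the use of std::atomic and the names next_gpu, fake_gpu_count and current_device are assumptions. get_gpu_count() and set_gpu() are replaced by hypothetical stand-ins so the sketch compiles without CUDA.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-ins for the real CUDAWrapper functions, so that the
// sketch builds without CUDA. In a CUDA build these would query and select
// actual devices (limited to those exposed by CUDA_VISIBLE_DEVICES).
int32_t fake_gpu_count = 4;                // assumed number of visible GPUs
thread_local int32_t current_device = -1;  // device chosen for this thread

int32_t get_gpu_count() { return fake_gpu_count; }
void set_gpu(int32_t dev_id) { current_device = dev_id; }

// Process-wide counter shared by all worker threads (assumed to be atomic
// here so that concurrent callers each obtain a distinct slot).
std::atomic<uint32_t> next_gpu{0};

// Assign the calling thread to the next GPU in round-robin order.
// No thread id is involved: the n-th caller gets device n % gpu_count.
void pin_gpu() {
    const int32_t gpu_count = get_gpu_count();
    if (gpu_count <= 0)
        return;  // no-op without a GPU
    const uint32_t slot = next_gpu.fetch_add(1, std::memory_order_relaxed);
    set_gpu(static_cast<int32_t>(slot % static_cast<uint32_t>(gpu_count)));
}
```

In this sketch each worker thread would call pin_gpu() once, before creating any streams or other per-device resources, so that consecutive workers land on consecutive devices and wrap around after the last one.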

Also add NUMA_GPU_REVIEW.md: a working note mapping ImageBuffer/NUMAHWPolicy/GPU
dispatch with goals and a staged plan (multi-broker GPU isolation via
CUDA_VISIBLE_DEVICES, dropping libnuma, reassessing NUMA pinning for the FPGA path).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-17 15:29:52 +02:00

21 lines

347 B

C++

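// SPDX-FileCopyrightText: 2024 Filip Leonarski, Paul Scherrer Institute <filip.leonarski@psi.ch>
// SPDX-License-Identifier: GPL-3.0-only

// Fallback stubs, compiled only when JFJOCH_USE_CUDA is not defined.
#ifndef JFJOCH_USE_CUDA
#include "CUDAWrapper.h"

// Without CUDA no devices are reported.
int32_t get_gpu_count() {
    return 0;
}

// There is no device to select, so this is a no-op.
void set_gpu(int32_t /*dev_id*/) {}

// Round-robin pinning is a no-op without a GPU.
void pin_gpu() {}

// No device means no NUMA node; -1 is returned instead of a node index.
int get_gpu_numa_node(int /*dev_id*/) {
    return -1;
}
#endif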