d156dfa2e8d3f501ffccfa10874fff7b1bdc05c6
The broker logs for the dropped runs show the connection torn down ~2s into a collection (not 3s), via "TCP send failed -> Removed dead connection -> Accepted (new socket)". That is too early for the SendAll send deadline: the real gate was the fixed 2-second enqueue deadline in the zerocopy SendImage path.

At the start of a large dataset the writer briefly stalls draining the socket while it creates the master file and writes the large START metadata + calibration frames to GPFS; the per-connection queue fills, and after 2s SendImage marked the connection broken. The writer then reconnected outside the active session, so the rest of the run was dropped and the half-written file was finalized at the next START.

Replace the fixed 2s enqueue deadline with the same peer-liveness condition used on the send path: keep applying backpressure while the writer proves it is alive (BUSY heartbeats / ACKs refresh last_peer_activity_ns from a thread independent of the stalled write path), and only declare it dead after the liveness window of complete silence. A transient startup stall is now ridden out instead of dropping the run.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
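The liveness-gated enqueue described in the commit message can be illustrated with a minimal sketch. This is not the Jungfraujoch implementation: `Connection`, `EnqueueResult`, `PeerSilentTooLong`, and `TryEnqueue` are hypothetical names, the queue holds plain integers instead of image frames, and the liveness window is a caller-supplied assumption. Only `last_peer_activity_ns` is taken from the message above. The caller passes the current time explicitly so the decision logic stays deterministic.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical outcome of one enqueue attempt.
enum class EnqueueResult { Enqueued, RetryLater, PeerDead };

// Hypothetical per-connection state; ints stand in for queued image frames.
struct Connection {
    std::deque<int> queue;
    std::size_t capacity = 4;
    // Refreshed by a heartbeat/ACK reader thread, independent of the write path.
    std::atomic<std::int64_t> last_peer_activity_ns{0};
};

// True only once the peer has been completely silent for longer than the window.
inline bool PeerSilentTooLong(std::int64_t now_ns,
                              std::int64_t last_activity_ns,
                              std::int64_t window_ns) {
    return now_ns - last_activity_ns > window_ns;
}

// One enqueue attempt. A full queue alone never breaks the connection:
// the caller keeps retrying (backpressure) while the peer shows signs of life.
inline EnqueueResult TryEnqueue(Connection &conn, int frame,
                                std::int64_t now_ns, std::int64_t window_ns) {
    if (conn.queue.size() < conn.capacity) {
        conn.queue.push_back(frame);
        return EnqueueResult::Enqueued;
    }
    if (PeerSilentTooLong(now_ns, conn.last_peer_activity_ns.load(), window_ns))
        return EnqueueResult::PeerDead;
    return EnqueueResult::RetryLater;
}
```

In a real sender the `RetryLater` branch would presumably wait on a condition variable or sleep briefly before the next attempt. The point of the sketch is that time spent waiting on a full queue no longer decides the outcome; only silence from the peer does.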
Jungfraujoch
Application to receive data from the PSI JUNGFRAU and EIGER detectors.
All documentation now lives in the docs/ subdirectory; the current version is hosted on the Jungfraujoch Read the Docs page.
Languages
C++: 70.9%
HTML: 10.1%
C: 8.2%
TypeScript: 5.2%
Tcl: 3%
Other: 2.4%