Data streams

The Jungfraujoch process (jfjoch_broker) provides three output streams, each of which can be enabled and operated independently:

  • Image - all the images including metadata (ZeroMQ PUSH socket or custom TCP/IP socket)
  • Preview - images with metadata at a reduced frame rate (PUB socket)
  • Metadata - only metadata for all the images, bundled into packages (PUB socket)

Image stream

Images (with metadata) are serialized as CBOR image messages. The stream also includes a CBOR start message, calibration messages, and an end message with run metadata.

If file_prefix is not provided for a data collection, images are not sent to the image stream (or to its HDF5/CBOR replacements).

Splitting image stream

The image stream can be split across multiple sockets to increase performance. In this case, images are distributed according to the number of the file to which each image belongs. All sockets forward start and end messages. Only the first socket forwards calibration messages, and it is the one marked to write the master file.

ZeroMQ image stream

The ZeroMQ image stream uses PUSH ZeroMQ socket(s). Never connect multiple receivers to one PUSH socket: ZeroMQ distributes messages to the connected receivers in round-robin fashion, so the start and end messages would each reach only one receiver. Use the Jungfraujoch multiple-socket feature instead. For the ZeroMQ image stream, each writer connects to a different port.

The behavior is as follows:

  • Start message is sent with timeout of 1s per socket. If within the time the message cannot be put in the outgoing queue or there is no connected puller, an exception is thrown — data collection is stopped with an error due to absence of a writer.
  • Calibration message is sent to the first socket only, with timeout of 1s.
  • Images are sent via a per-socket writer thread. If a send times out, the pusher switches to non-blocking mode for the remainder of the collection (images may be dropped).
  • End message is sent with timeout of 1s per socket. No exception is thrown on timeout, but a transmission error is recorded.

The format is generally interchangeable with DECTRIS Stream2 format.

ZeroMQ configuration

ZeroMQ image stream is configured in the broker JSON configuration file under the zeromq_settings section:

{
  "image_socket": ["tcp://192.168.0.1:9000", "tcp://192.168.0.1:9001"],
  "send_watermark": 100,
  "send_buffer_size": 67108864,
  "writer_notification_socket": "tcp://192.168.0.1:*"
}
  • image_socket: one or more PUSH socket addresses. Multiple entries split the image stream across sockets. Addresses follow ZeroMQ conventions (tcp://, ipc://). 0.0.0.0 binds on all network interfaces.
  • send_watermark (optional): ZeroMQ send high-water mark (number of outstanding messages per socket).
  • send_buffer_size (optional): OS-level send buffer size for the ZeroMQ socket.
  • writer_notification_socket (optional): see Writer notification socket below.

TCP/IP image stream

The TCP/IP image stream uses TCP/IP socket(s) and sends each message as a fixed binary frame header followed by payload bytes. This format was introduced to Jungfraujoch as an alternative to the ZeroMQ image stream. It allows two-way communication between the data collection and the writer, and is therefore more robust than ZeroMQ.

For TCP/IP image stream, Jungfraujoch listens on a single TCP port and all writers connect to it. Connections are persistent — writers connect once and stay connected across multiple data collections. Jungfraujoch sends periodic KEEPALIVE frames when no data collection is active to detect dead connections; writers are expected to respond with a KEEPALIVE pong.

Using * as port number (e.g. tcp://127.0.0.1:*) is supported — the OS assigns a free port and the actual bound address can be queried via GetAddress().
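As a rough illustration of what the * port placeholder means at the OS level, the Python sketch below binds a listener to port 0 and reads back the port that was actually assigned. The helper name bind_listener is hypothetical and is not part of Jungfraujoch; it only mirrors the idea behind querying the bound address, and handles IPv4 addresses only.

```python
import socket


def bind_listener(addr: str):
    """Bind a listening TCP socket for an address such as tcp://127.0.0.1:*.

    Returns the socket and the address it is actually bound to.
    """
    prefix = "tcp://"
    if not addr.startswith(prefix):
        raise ValueError("expected tcp://<IP>:<port>")
    host, _, port_text = addr[len(prefix):].rpartition(":")
    # Port 0 asks the OS to pick any free port; this is what "*" stands for here.
    port = 0 if port_text == "*" else int(port_text)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, port))
    sock.listen()
    bound_host, bound_port = sock.getsockname()
    return sock, f"tcp://{bound_host}:{bound_port}"
```

A writer would then be given the returned address (with the concrete port number) rather than the configured placeholder.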

Payloads for START, DATA, CALIBRATION and END frames are CBOR messages, equivalent in content to the ZeroMQ image stream messages.
ACK, CANCEL, and KEEPALIVE are control frames (no CBOR payload).

The data collection lifecycle on each connection follows: START → CALIBRATION (socket 0 only) → DATA (repeated) → END

If a START ACK fails on any connection, Jungfraujoch sends CANCEL to all already-started connections and rolls back.

For each frame:

  1. Read one TcpFrameHeader (fixed size, 64-byte aligned).
  2. Validate magic (0x4A464A54 / "JFJT") and version (2).
  3. Read payload_size bytes (if non-zero).
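The three steps above could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the Jungfraujoch implementation: it assumes the header occupies exactly 64 bytes on the wire, is little-endian, and starts with magic, version, type, image_number and payload_size in the order given in the TcpFrameHeader table in this document. The names read_exact and read_frame are hypothetical.

```python
import io
import struct

MAGIC = 0x4A464A54   # "JFJT"
VERSION = 2
HEADER_SIZE = 64     # assumption: the header is exactly 64 bytes on the wire
# Assumption: little-endian layout; only the leading fields are decoded here
# (magic, version, type, image_number, payload_size).
PREFIX = struct.Struct("<IHHQQ")


def read_exact(stream, n):
    """Read exactly n bytes from a file-like object or raise EOFError."""
    buf = bytearray()
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream closed in the middle of a frame")
        buf += chunk
    return bytes(buf)


def read_frame(stream):
    """Read one frame; return (frame_type, image_number, payload)."""
    header = read_exact(stream, HEADER_SIZE)                    # step 1
    magic, version, frame_type, image_number, payload_size = PREFIX.unpack_from(header)
    if magic != MAGIC:                                          # step 2
        raise ValueError(f"bad magic 0x{magic:08X}")
    if version != VERSION:
        raise ValueError(f"unsupported protocol version {version}")
    payload = read_exact(stream, payload_size) if payload_size else b""  # step 3
    return frame_type, image_number, payload
```

For a connected socket, sock.makefile("rb") provides a compatible stream object.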

When image stream is split into multiple connections:

  • START and END are sent on all connections,
  • CALIBRATION is sent only on connection 0,
  • DATA frames are distributed by file grouping: connection index = (image_number / images_per_file) % num_connections.
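The file-grouping rule for DATA frames can be written as a one-line function. This is only an illustration of the formula above; the function name connection_index is hypothetical, and the division is taken to be integer division.

```python
def connection_index(image_number: int, images_per_file: int, num_connections: int) -> int:
    """Return the connection that receives a given image in split-stream mode."""
    if images_per_file <= 0 or num_connections <= 0:
        raise ValueError("images_per_file and num_connections must be positive")
    # All images of one file go to the same connection; files rotate over connections.
    return (image_number // images_per_file) % num_connections
```

For example, with 100 images per file and 2 connections, images 0 to 99 go to connection 0, images 100 to 199 to connection 1, and images 200 to 299 to connection 0 again.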

TCP/IP configuration

TCP/IP image stream is configured in the broker JSON configuration file under the tcp_settings section:

{
  "addr": "tcp://192.168.0.1:9100",
  "nwriters": 2,
  "send_buffer_size": 67108864
}
  • addr: listen address in tcp://<IP>:<port> format. 0.0.0.0 binds on all interfaces. * as port selects a random free port.
  • nwriters (optional): maximum number of simultaneous writer connections accepted.
  • send_buffer_size (optional): OS-level SO_SNDBUF size for accepted connections.

ACK handling

ACK handling is mandatory for correct operation:

  • START must be acknowledged (ACK with ack_for=START) on each connection within 5 seconds, otherwise collection start fails and a rollback is triggered.
  • END must be acknowledged (ack_for=END) on each connection within 10 seconds for successful completion.
  • CANCEL should be acknowledged during rollback paths (500ms timeout).
  • DATA should be acknowledged for every frame. A DATA ACK with FATAL flag set reports a downstream error (e.g. disk full) which is propagated to jfjoch_broker via Finalize(). A failed DATA ACK does not break the TCP connection on its own — data continues to flow.
  • CALIBRATION is not acknowledged at this time.
  • KEEPALIVE frames are not acknowledged via ACK; the writer responds with a KEEPALIVE pong frame instead.
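From the writer's point of view, the rules above reduce to a small lookup of which frame to send back for each received frame. The sketch below is illustrative only: the numeric frame type values are taken from the frame types table in this document, while the names FrameType, ACK_TIMEOUT_S and expected_reply are hypothetical.

```python
from enum import IntEnum


class FrameType(IntEnum):
    START = 1
    DATA = 2
    CALIBRATION = 3
    END = 4
    ACK = 5
    CANCEL = 6
    KEEPALIVE = 7


# Time (in seconds) within which Jungfraujoch expects the ACK, per the list above.
ACK_TIMEOUT_S = {FrameType.START: 5.0, FrameType.END: 10.0, FrameType.CANCEL: 0.5}


def expected_reply(frame_type):
    """Return the frame type a writer sends back, or None if no reply is expected."""
    frame_type = FrameType(frame_type)
    if frame_type in (FrameType.START, FrameType.DATA, FrameType.END, FrameType.CANCEL):
        return FrameType.ACK
    if frame_type == FrameType.KEEPALIVE:
        return FrameType.KEEPALIVE  # pong, not an ACK
    return None  # CALIBRATION is not acknowledged; an ACK needs no reply
```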

Keepalive

When no data collection is active, Jungfraujoch sends KEEPALIVE frames approximately every 5 seconds on each persistent connection. Writers should respond with a KEEPALIVE frame (pong). OS-level TCP keepalive is also enabled (TCP_KEEPIDLE=30s, TCP_KEEPINTVL=10s, TCP_KEEPCNT=3) as a secondary safety net. Dead connections are automatically removed from the pool.

Zero-copy transmission

On Linux, large payload transmission (DATA and CALIBRATION frames) can use kernel TCP zero-copy (SO_ZEROCOPY/MSG_ZEROCOPY) when available. If the kernel does not support it or the socket option fails, transmission transparently falls back to normal send() behavior. Zero-copy completion notifications are processed by a dedicated per-connection thread.

Frame types

| Value | Name        | Purpose                          |
|-------|-------------|----------------------------------|
| 1     | START       | Start-of-run metadata            |
| 2     | DATA        | One image payload                |
| 3     | CALIBRATION | Calibration payload              |
| 4     | END         | End-of-run metadata              |
| 5     | ACK         | Acknowledgement / error reporting |
| 6     | CANCEL      | Cancel run initialization/stream |
| 7     | KEEPALIVE   | Connection liveness probe/pong   |

TCP frame header (TcpFrameHeader)

| Field                | Type        | Description                                   |
|----------------------|-------------|-----------------------------------------------|
| magic                | uint32_t    | Protocol magic (0x4A464A54, "JFJT")           |
| version              | uint16_t    | Protocol version (2)                          |
| type                 | uint16_t    | Frame type (see table above)                  |
| image_number         | uint64_t    | Image index for DATA frames                   |
| payload_size         | uint64_t    | Number of payload bytes after header          |
| socket_number        | uint32_t    | Connection index in split-stream mode         |
| flags                | uint32_t    | ACK flags (OK, FATAL, HAS_ERROR_TEXT)         |
| run_number           | uint64_t    | Run identifier                                |
| ack_processed_images | uint32_t    | In ACK: number of images processed by receiver |
| ack_code             | uint16_t    | In ACK: error/status code                     |
| ack_for              | uint16_t    | In ACK: frame type being acknowledged         |
| reserved             | uint64_t[2] | Reserved, set to 0                            |

The header is 64-byte aligned (alignas(64)).
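The field sizes in the table add up to exactly 64 bytes (4+2+2+8+8+4+4+8+4+2+2+16), which the Python sketch below uses to pack an ACK header. It rests on assumptions that this document does not state: fields appear in memory in the table order, without padding, in little-endian byte order. The function name pack_ack and its defaults are hypothetical, and whether image_number is meaningful in an ACK is not specified here, so it defaults to 0.

```python
import struct

MAGIC = 0x4A464A54   # "JFJT"
VERSION = 2
FRAME_ACK = 5
FLAG_OK = 1 << 0

# Assumed layout: little-endian, table order, no padding.
# magic, version, type, image_number, payload_size, socket_number, flags,
# run_number, ack_processed_images, ack_code, ack_for, reserved[2]
HEADER = struct.Struct("<IHHQQIIQIHH2Q")


def pack_ack(ack_for, run_number, socket_number, processed_images=0,
             flags=FLAG_OK, ack_code=0, payload_size=0, image_number=0):
    """Build the 64-byte header of an ACK frame."""
    return HEADER.pack(MAGIC, VERSION, FRAME_ACK, image_number, payload_size,
                       socket_number, flags, run_number, processed_images,
                       ack_code, ack_for, 0, 0)
```

For instance, pack_ack(ack_for=1, run_number=135, socket_number=0) would produce the header of a positive ACK for a START frame under these assumptions.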

ACK semantics

  • ACK frames use ack_for to indicate which frame type is acknowledged.
  • flags:
    • OK (bit 0): operation accepted/successful,
    • FATAL (bit 1): receiver reports unrecoverable error (primarily for DATA),
    • HAS_ERROR_TEXT (bit 2): ACK payload contains UTF-8 error text.
  • ack_code can be used to categorize errors:
| Code | Name              | Meaning                 |
|------|-------------------|-------------------------|
| 0    | None              | No error                |
| 1    | StartFailed       | START processing failed |
| 2    | DataWriteFailed   | Image write failed      |
| 3    | EndFailed         | END processing failed   |
| 4    | DiskQuotaExceeded | Disk quota exceeded     |
| 5    | NoSpaceLeft       | No space left on device |
| 6    | PermissionDenied  | Permission denied       |
| 7    | IoError           | General I/O error       |
| 8    | ProtocolError     | Protocol-level error    |
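The flag bits and error codes above can be decoded with a few lines of Python. This is a sketch for illustration; the names ACK_CODE_NAMES and describe_ack are hypothetical, and the bit positions and code values are the ones listed in this section.

```python
FLAG_OK = 1 << 0
FLAG_FATAL = 1 << 1
FLAG_HAS_ERROR_TEXT = 1 << 2

ACK_CODE_NAMES = {
    0: "None",
    1: "StartFailed",
    2: "DataWriteFailed",
    3: "EndFailed",
    4: "DiskQuotaExceeded",
    5: "NoSpaceLeft",
    6: "PermissionDenied",
    7: "IoError",
    8: "ProtocolError",
}


def describe_ack(flags, ack_code, payload=b""):
    """Turn the ACK flags, code and optional payload into a readable summary."""
    has_text = bool(flags & FLAG_HAS_ERROR_TEXT)
    return {
        "ok": bool(flags & FLAG_OK),
        "fatal": bool(flags & FLAG_FATAL),
        "code": ACK_CODE_NAMES.get(ack_code, f"Unknown({ack_code})"),
        # The payload is interpreted as UTF-8 text only when the flag says so.
        "error_text": payload.decode("utf-8", errors="replace") if has_text else "",
    }
```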

Image stream replacement

The image stream can be replaced with the direct HDF5 writer or CBOR dump image pushers, or it can be disabled by selecting the "None" image pusher for all measurements.

Writer notification socket

The writer notification socket is used only with ZeroMQ image stream. Since ZeroMQ is asynchronous, jfjoch_broker does not know whether messages were properly handled downstream (e.g. written to disk). The writer notification socket allows downstream code to report back.

For TCP/IP image stream, this mechanism is not needed — ACK frames provide synchronous feedback for each control and data frame.

To use the writer notification socket, first enable it in the broker JSON configuration file with the writer_notification_socket entry:

{
  "writer_notification_socket":"tcp://192.168.0.1:*"
}

This entry creates a PULL socket on the 192.168.0.1 network interface, listening on a single random TCP port. When data processing starts, the image stream sends the CBOR start message, which includes the writer_notification_zmq_addr field that downstream code must use. Because the start message must reference the address of the jfjoch_broker host, the notification socket should always listen on a specific network interface and should not be configured with the placeholder address 0.0.0.0. It is, however, fine to use the placeholder :* for the network port, as it is replaced with the port chosen by ZeroMQ.

For every image stream socket, downstream code must send the following message to the PULL socket:

{
  "run_number":135,
  "run_name": "lysozyme_1",
  "socket_number": 1,
  "processed_images":250,
  "ok": true
}

Here run_number, run_name and socket_number must match the information in the start message. ok is a boolean indicating whether the writing process succeeded. processed_images is the number of images that were written/processed; it is used to track how many images were dropped by the non-blocking ZeroMQ send. If writing failed, an error message can be included:

{
  "run_number":135,
  "run_name": "lysozyme_1",
  "socket_number": 1,
  "processed_images": 0,
  "ok": false,
  "error": "Permission error"
}

This way errors from the downstream code are propagated to jfjoch_broker.

If writer notification socket is configured, but downstream code doesn't send proper notification, jfjoch_broker will time out after 60 seconds producing an error message.
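A downstream writer could assemble the notification message as sketched below. The field names follow the examples above; the function name build_writer_notification is hypothetical, and encoding the message as UTF-8 JSON text is an assumption based on the JSON examples shown. Sending the bytes to the address from writer_notification_zmq_addr would be done with a ZeroMQ PUSH socket, which is omitted here to keep the sketch dependency-free.

```python
import json


def build_writer_notification(run_number, run_name, socket_number,
                              processed_images, ok, error=None):
    """Build the notification body for one image stream socket."""
    message = {
        "run_number": run_number,        # must match the start message
        "run_name": run_name,            # must match the start message
        "socket_number": socket_number,  # must match the start message
        "processed_images": processed_images,
        "ok": bool(ok),
    }
    if not ok and error:
        message["error"] = error         # optional, only meaningful on failure
    return json.dumps(message).encode("utf-8")
```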

Preview stream

Jungfraujoch can also send images (with metadata) at a reduced frame rate for preview purposes. Images are serialized as CBOR image messages. The stream will also include CBOR start message and end message with run metadata. Only start and image messages are sent.

The preview stream uses a PUB socket with the conflate option: ZeroMQ keeps only the last message, so a receiver that cannot keep up always gets the most recently generated message (no backlog). For this reason, it is recommended to set the same option on the receiver side.

Given the properties of PUB sockets, multiple viewers can connect to a single socket, and all viewers should receive all the images sent.

Metadata stream

Jungfraujoch can also send pure metadata for archiving purposes. Metadata are serialized as CBOR metadata messages. These are very similar to image messages, but exclude the actual image array and spot positions. Because metadata are relatively small, Jungfraujoch bundles the metadata of many images into one message to avoid a large number of messages. Neither the order of images within a bundle nor the size of the bundle is guaranteed. The stream also includes a CBOR start message and an end message with run metadata.

The metadata stream uses a PUB socket with a watermark, so ZeroMQ queues some messages. Multiple receivers can be connected.