Data streams
The Jungfraujoch process (jfjoch_broker) provides three output streams.
All three can be enabled and operated independently.
These are:
- Image - all the images including metadata (ZeroMQ PUSH socket or custom TCP/IP socket)
- Preview - images with metadata at a reduced frame rate (PUB socket)
- Metadata - only metadata for all the images, bundled into packages (PUB socket)
Image stream
Images (with metadata) are serialized as CBOR image messages. The stream also includes a CBOR start message, calibration messages, and an end message with run metadata.
If file_prefix is not provided for a data collection, images are not sent to the image stream (or its HDF5/CBOR replacements).
Splitting image stream
The image stream can be split across multiple sockets to increase performance. In this case, images are distributed according to the number of the file to which each image belongs. All sockets forward the start and end messages. Only the first socket forwards calibration messages and is marked to write the master file.
ZeroMQ image stream
This uses ZeroMQ PUSH socket(s). Never connect multiple receivers to a single ZeroMQ PUSH socket: ZeroMQ distributes messages to the receivers round-robin, so the start and end messages would each reach only one receiver. Use the Jungfraujoch multiple-socket feature instead. For the ZeroMQ image stream, each writer connects to a different port.
Behavior is as follows:
- The start message is sent with a timeout of 5 s. If the message cannot be placed in the outgoing queue within that time, or no puller is connected, an exception is thrown and data collection stops with an error due to the absence of a writer.
- Images are sent in a non-blocking way and without a timeout.
- The end message is sent with a timeout of 5 s. No error is reported.
The format is generally interchangeable with DECTRIS Stream2 format.
TCP/IP image stream
This uses TCP/IP socket(s) with a fixed binary frame header followed by payload bytes. The format was introduced in Jungfraujoch as an alternative to the ZeroMQ image stream. It allows two-way communication between the data collection and the writer, and is therefore more robust than ZeroMQ.
For TCP/IP image stream, all writers connect to a single TCP port.
Payloads for START, DATA, CALIBRATION and END frames are CBOR messages, equivalent in content to the ZeroMQ image stream messages.
ACK and CANCEL are control frames.
Each data collection is treated as a separate TCP streaming session: writers establish fresh connections for that collection (one connection per configured socket), then perform START → DATA/CALIBRATION → END (or CANCEL on startup rollback).
For each frame:
- Read one TcpFrameHeader (fixed size).
- Validate magic and version.
- Read payload_size bytes (if non-zero).
When image stream is split into multiple sockets:
- START and END are sent on all sockets,
- CALIBRATION is sent only on socket 0,
- DATA frames are distributed by file grouping (images_per_file).
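To make the file-grouping rule concrete, here is a minimal sketch of how a DATA frame could be mapped to a socket. The text above only states that frames are distributed by file grouping. The round-robin assignment of file numbers to sockets, as well as the function and parameter names, are assumptions made for illustration and may differ from what jfjoch_broker actually does.

```python
def socket_for_image(image_number: int, images_per_file: int, n_sockets: int) -> int:
    """Return the socket index expected to carry a given image.

    ASSUMPTION: consecutive files are assigned to sockets round-robin.
    The document only says that DATA frames follow the file grouping.
    """
    if images_per_file <= 0 or n_sockets <= 0:
        raise ValueError("images_per_file and n_sockets must be positive")
    # All images of one file share a file number, so they share a socket.
    file_number = image_number // images_per_file
    return file_number % n_sockets


if __name__ == "__main__":
    for image_number in (0, 99, 100, 250, 399, 400):
        socket_index = socket_for_image(image_number, images_per_file=100, n_sockets=4)
        print(f"image {image_number} -> socket {socket_index}")
```

Whatever the exact rule, the property that matters to a writer is that every image of a given file arrives on the same connection, so each writer can own its files without coordination.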
ACK handling is mandatory for correct operation:
- START must be acknowledged (ACK with ack_for=START) on each socket, otherwise collection start fails.
- END must be acknowledged (ack_for=END) on each socket for successful completion.
- CANCEL is acknowledged during rollback paths.
- DATA must be acknowledged for every frame, and the ACK should be used to report fatal downstream errors immediately.
- CALIBRATION is not acknowledged at this moment.
On Linux, large payload transmission can use kernel TCP zero-copy (SO_ZEROCOPY/MSG_ZEROCOPY) when enabled; when unavailable, transfer falls back to normal send() behavior.
Frame types
| Value | Name | Purpose |
|---|---|---|
| 1 | START | Start-of-run metadata |
| 2 | DATA | One image payload |
| 3 | CALIBRATION | Calibration payload |
| 4 | END | End-of-run metadata |
| 5 | ACK | Acknowledgement / error reporting |
| 6 | CANCEL | Cancel run initialization/stream |
TCP frame header (TcpFrameHeader)
| Field | Type | Description |
|---|---|---|
| magic | uint32_t | Protocol magic (0x4A464A54, "JFJT") |
| version | uint16_t | Protocol version (2) |
| type | uint16_t | Frame type (START/DATA/CALIBRATION/END/ACK/CANCEL) |
| image_number | uint64_t | Image index for DATA frames |
| payload_size | uint64_t | Number of payload bytes after header |
| socket_number | uint32_t | Socket index in split-stream mode |
| flags | uint32_t | ACK flags (OK, FATAL, HAS_ERROR_TEXT) |
| run_number | uint64_t | Run identifier |
| ack_processed_images | uint32_t | In ACK: number of images processed by receiver |
| ack_code | uint16_t | In ACK: error/status code |
| ack_for | uint16_t | In ACK: frame type being acknowledged |
| reserved | uint64_t[2] | Reserved, set to 0 |
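The per-frame steps described earlier (read the header, validate magic and version, read the payload) can be sketched in Python as follows. This is an illustration rather than the reference implementation. It assumes the header is transmitted little-endian, without padding, and in the field order of the table, which gives 64 bytes. The byte order and packing are not stated in this document and should be checked against the Jungfraujoch sources. The helper names read_exact and read_frame are hypothetical.

```python
import io
import struct
from collections import namedtuple

# ASSUMPTION: little-endian, unpadded, fields in table order (64 bytes total).
HEADER = struct.Struct("<IHHQQIIQIHH2Q")
MAGIC = 0x4A464A54  # protocol magic from the table
VERSION = 2         # protocol version from the table

TcpFrameHeader = namedtuple(
    "TcpFrameHeader",
    "magic version type image_number payload_size socket_number flags "
    "run_number ack_processed_images ack_code ack_for reserved0 reserved1",
)


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes; a stream may return fewer bytes per call."""
    buf = bytearray()
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream closed in the middle of a frame")
        buf += chunk
    return bytes(buf)


def read_frame(stream):
    """Read one frame and return (header, payload_bytes)."""
    header = TcpFrameHeader._make(HEADER.unpack(read_exact(stream, HEADER.size)))
    if header.magic != MAGIC:
        raise ValueError(f"bad magic 0x{header.magic:08X}")
    if header.version != VERSION:
        raise ValueError(f"unsupported protocol version {header.version}")
    payload = read_exact(stream, header.payload_size) if header.payload_size else b""
    return header, payload


if __name__ == "__main__":
    # Build a fake DATA frame (type 2) and parse it back.
    payload = b"\xa0"  # an empty CBOR map, standing in for a real image message
    raw = HEADER.pack(MAGIC, VERSION, 2, 7, len(payload), 0, 0, 135, 0, 0, 0, 0, 0) + payload
    header, body = read_frame(io.BytesIO(raw))
    print(header.type, header.image_number, header.run_number, body)
```

With a real connection, the same functions can be given a buffered file object obtained from `sock.makefile("rb")`. The payload of START, DATA, CALIBRATION and END frames would then be passed to a CBOR decoder.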
ACK semantics
- ACK frames use ack_for to indicate which frame type is acknowledged.
- flags:
  - OK: operation accepted/successful,
  - FATAL: receiver reports unrecoverable error (primarily for DATA),
  - HAS_ERROR_TEXT: ACK payload contains UTF-8 error text.
- ack_code can be used to categorize errors (for example I/O, no space left, permission denied, protocol error).
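As an illustration of these semantics, the sketch below assembles an ACK frame on the writer side. Several details are assumptions and are marked as such in the code. The numeric bit values of the OK, FATAL and HAS_ERROR_TEXT flags are placeholders, because this document does not list them. The little-endian, unpadded header layout is assumed as well, and the ack_code value used in the example is arbitrary. The real constants have to be taken from the Jungfraujoch sources.

```python
import struct

# ASSUMPTION: little-endian, unpadded header in table order (64 bytes).
HEADER = struct.Struct("<IHHQQIIQIHH2Q")
MAGIC, VERSION = 0x4A464A54, 2
FRAME_DATA, FRAME_ACK = 2, 5  # values from the frame types table

# PLACEHOLDER bit values: the document names the flags but not their bits.
FLAG_OK = 0x1
FLAG_FATAL = 0x2
FLAG_HAS_ERROR_TEXT = 0x4


def build_ack(ack_for, run_number, socket_number, processed_images,
              error_text=None, fatal=False, ack_code=0) -> bytes:
    """Return header + payload bytes for one ACK frame."""
    payload = error_text.encode("utf-8") if error_text else b""
    if not payload and not fatal:
        flags = FLAG_OK
    else:
        flags = FLAG_FATAL if fatal else 0
        if payload:
            flags |= FLAG_HAS_ERROR_TEXT  # payload carries UTF-8 error text
    header = HEADER.pack(
        MAGIC, VERSION, FRAME_ACK,
        0,                 # image_number (left at 0 in this sketch)
        len(payload),      # payload_size
        socket_number,
        flags,
        run_number,
        processed_images,  # ack_processed_images
        ack_code,
        ack_for,           # frame type being acknowledged
        0, 0,              # reserved, set to 0
    )
    return header + payload


if __name__ == "__main__":
    ok = build_ack(FRAME_DATA, run_number=135, socket_number=1, processed_images=42)
    failed = build_ack(FRAME_DATA, run_number=135, socket_number=1, processed_images=42,
                       error_text="No space left", fatal=True, ack_code=1)
    print(len(ok), len(failed))
```

Reporting a fatal error in the ACK of the DATA frame that triggered it lets the broker stop the collection immediately instead of discovering the failure at END.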
Image stream replacement
The image stream can be replaced with a direct HDF5 writer or a CBOR dump image pusher. It can also be disabled for all measurements by selecting the "None" image pusher.
Writer notification socket
Normally ZeroMQ is asynchronous. When jfjoch_broker sends messages via the ZeroMQ image stream, it does not know
whether they were properly handled downstream, e.g., written to disk. For this reason, a writer notification socket is provided.
It allows downstream processing code to notify jfjoch_broker that all images were handled properly.
To use the writer notification socket, first enable it in the broker's JSON configuration file with the writer_notification_socket entry:
{
"writer_notification_socket":"tcp://192.168.0.1:*"
}
This entry creates a PULL socket on the 192.168.0.1 network interface, listening on a single, randomly chosen TCP port. When data collection starts, the
image stream sends the CBOR start message. This message includes writer_notification_zmq_addr,
the address that downstream code must use. Because the start message must reference the address of the jfjoch_broker host, the notification
socket should always listen on a specific network interface and should not be configured with the placeholder address 0.0.0.0. It is, however, fine
to use the placeholder :* for the network port, as it is replaced with the port chosen by ZeroMQ.
For every image stream, downstream code must send the following message to the PULL socket:
{
"run_number":135,
"run_name": "lysozyme_1",
"socket_number": 1,
"processed_images":250,
"ok": true
}
Here run_number, run_name and socket_number must match the information from the start message.
processed_images is the number of images that were written/processed; it is used to track how many images were skipped by the non-blocking ZeroMQ procedures.
ok is a boolean confirming whether the writing process succeeded.
If it did not, an error message can be included:
{
"run_number":135,
"run_name": "lysozyme_1",
"socket_number": 1,
"processed_images": 0,
"ok": false,
"error": "Permission error"
}
This way errors from the downstream code are propagated to jfjoch_broker.
If the writer notification socket is configured but downstream code does not send a proper notification, jfjoch_broker times out after 60 seconds and produces an error message.
Preview stream
Jungfraujoch can also send images (with metadata) at a reduced frame rate for preview purposes. Images are serialized as CBOR image messages. The stream will also include CBOR start message and end message with run metadata. Only start and image messages are sent.
This uses a PUB socket with the conflate option, i.e., ZeroMQ keeps only the last message. If a receiver cannot keep up with the messages, it always receives the most recently generated message (no backlog). For this reason, it is also recommended to set the same option on the receiver side.
Given the properties of PUB sockets, multiple viewers can be connected to a single socket; all viewers should receive all the images sent.
Metadata stream
Jungfraujoch can also send pure metadata for the purpose of archiving such information. Metadata are serialized as CBOR metadata messages. These are very similar to image messages, but exclude the actual image array and spot positions. As metadata are relatively small, Jungfraujoch bundles the metadata of many images into one message to avoid a large number of messages. The order of images within a bundle, as well as the size of the bundle, is not guaranteed. The stream will also include a CBOR start message and an end message with run metadata.
This uses a PUB socket with a watermark, so ZeroMQ queues some messages. Multiple receivers can be connected.
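Because neither the order within a bundle nor the bundle size is guaranteed, an archiving receiver will probably want to re-sort entries by image index after decoding. The sketch below shows one way to do that. It assumes each bundle has already been decoded from CBOR into a list of dicts and that each entry carries its image index under a key called "image_number". Both the bundle structure and the key name are hypothetical and need to be adapted to the actual CBOR metadata message.

```python
def merge_bundles(bundles, key="image_number"):
    """Merge metadata bundles into one list ordered by image index.

    ASSUMPTION: every bundle is a list of dicts, and `key` names the field
    holding the image index (a hypothetical name, not taken from the spec).
    """
    by_image = {}
    for bundle in bundles:
        for entry in bundle:
            # Keep the last entry seen for an index; no order is assumed.
            by_image[entry[key]] = entry
    return [by_image[index] for index in sorted(by_image)]


if __name__ == "__main__":
    bundles = [
        [{"image_number": 2, "spots": 14}, {"image_number": 0, "spots": 3}],
        [{"image_number": 1, "spots": 8}],
    ]
    for entry in merge_bundles(bundles):
        print(entry)
```

Keying by image index rather than relying on arrival order also makes it easy to detect gaps, for example by comparing the collected indices with the image count reported in the end message.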