
FPGA Smart Network Interface Card

Hardware

The only FPGA card currently supported is the Xilinx Alveo U55C.

See the AMD/Xilinx webpage for the card user guide (UG1469). According to the user guide:

Alveo data center accelerator cards are designed to be installed into a data center server, where controlled air flow provides direct cooling.

The card has to be placed in a PCI Express (PCIe) Gen4 x8 slot, though mechanically the slot has to accommodate an x16 card. No additional power cable is needed, as the card draws less than the 75 W available from the PCIe edge connector. The current power estimate is about 30 W when idle and 45 W in operation. The card has built-in protection that cuts power to the card if the HBM temperature rises above 120°C.

The card is equipped with two network connectors:

| Port | Location | Network type | Transceiver type | LED |
|---|---|---|---|---|
| QSFP0 | Upper port on the PCIe bracket | 100G | QSFP28 | green = OK |
| QSFP1 | Lower port on the PCIe bracket | 4x10G | QSFP+ | green = all OK, yellow = at least one OK |

See the host library documentation for network details.

Content of directories

CPU Part:

  • pcie_driver Linux kernel driver for PCIe version of the FPGA board - see instructions
  • host_library Library that should be used to access the driver + some simple diagnostic tools - see workflow documentation

FPGA part:

  • scripts Scripts for FPGA synthesis
  • xdc Constraints for FPGA
  • hdl FPGA design parts developed in Verilog
  • hls FPGA design parts developed in C++ with high-level synthesis

Dependencies:

  • include External (Xilinx) headers for high-level synthesis code

Building firmware

The Xilinx Vivado version has to match exactly the version given in [the system requirements](../README.md). Firmware build targets are generated only when vivado and vitis_hls are detected in the path.

Xilinx Vivado

The following procedures require the AMD (Xilinx) Vivado and Vitis HLS toolsets, version 2022.1, to be installed on the machine. Because of the Tcl scripts used to generate the board designs, the Vivado version has to match the one given above exactly; in particular, newer versions of Vivado will not work.

In addition to the Intellectual Property (IP) cores included in Vivado, three additional licenses are necessary:

  • A no-cost license for the UltraScale+ 100G core, which has to be requested from the AMD/Xilinx website.
  • 10G/25G Subsystem for Ultrascale+

PSI received a no-cost license from the Xilinx University Program for the two latter cores. Therefore, bitstreams generated by the PSI continuous integration pipeline may only be used for non-commercial purposes.

HLS compilation

Make HLS routines:

mkdir build
cd build
cmake ..
make hls

Synthesis

Create PCIe bitstream with the following command:

mkdir build
cd build
cmake ..
make action_pcie

When Vivado is not present

During CMake execution, the executables vivado and vitis_hls must be present in the path. If they are not, the build targets will not be generated, and an error message similar to the following will appear:

$ make action_pcie
make: *** No rule to make target 'action_pcie'.  Stop.
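
A quick check before running CMake can save a confusing build failure. The sketch below only tests whether the two executables named above can be found on the search path; the helper name `missing_tools` is hypothetical and not part of the repository.

```python
import shutil

# Tool names taken from the text above; CMake looks for both of them.
REQUIRED_TOOLS = ("vivado", "vitis_hls")


def missing_tools(tools=REQUIRED_TOOLS, path=None):
    """Return the tools that cannot be found on the search path."""
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found in PATH:", ", ".join(missing))
        print("CMake will not generate FPGA build targets such as action_pcie.")
    else:
        print("vivado and vitis_hls found in PATH.")
```

This does not verify the tool version, which still has to match the required one.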

Gitlab CI

If GitLab CI is properly set up, firmware is built automatically for every commit that modifies FPGA source files. The built firmware should be downloaded as MCS files.

Flashing of the card

After successfully building the bitstream, it is possible to load it onto the Alveo card.

New Alveo card

For the first operation of the card, the bitstream has to be uploaded via a MicroUSB cable from another system running Vivado. It is not possible to perform this operation with Vivado running on the machine that hosts the card.

Alveo card with Jungfraujoch firmware

The first step is to find the PCIe device ID reported by the operating system, for example with the following command:

lspci |grep Xilinx
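
When several cards are installed, it can help to extract the device IDs programmatically. The sketch below assumes that the device ID in question is the slot address printed at the start of each lspci line; the sample output is illustrative only, and the description text on a real system may differ.

```python
import re

# Matches the slot field at the start of an lspci line, e.g. "3b:00.0" or,
# when lspci prints PCI domains, "0000:3b:00.0".
SLOT_RE = re.compile(
    r"^((?:[0-9a-fA-F]{4}:)?[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])\s"
)


def xilinx_slots(lspci_output):
    """Return the slot address of every lspci line that mentions Xilinx."""
    slots = []
    for line in lspci_output.splitlines():
        if "Xilinx" not in line:
            continue
        match = SLOT_RE.match(line)
        if match:
            slots.append(match.group(1))
    return slots


# Made-up sample; not captured from a real machine.
SAMPLE = """\
00:1f.4 SMBus: Intel Corporation Device 1234
3b:00.0 Processing accelerators: Xilinx Corporation Device 0000
af:00.0 Processing accelerators: Xilinx Corporation Device 0000
"""

print(xilinx_slots(SAMPLE))
```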

To flash the card, use the xbflash2 utility from AMD/Xilinx:

xbflash2 program --spi -i <.mcs file path> -d <device id>

It is necessary to confirm the operation by pressing Y key. It is safe to run multiple flashing processes in parallel for different cards, for example in separate screen sessions.
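
For a server with several cards, the command line above can be generated once per device. The following sketch only builds and prints the commands, using exactly the flags shown above; the file name and device IDs are hypothetical placeholders.

```python
import shlex


def xbflash2_command(mcs_path, device_id):
    """Build the argument list for the xbflash2 invocation shown above."""
    return ["xbflash2", "program", "--spi", "-i", mcs_path, "-d", device_id]


# Hypothetical file name and device IDs, for illustration only.
MCS_FILE = "jfjoch_example.mcs"
DEVICES = ["3b:00.0", "af:00.0"]

for device in DEVICES:
    # Print rather than execute: xbflash2 asks for confirmation, so each
    # command is meant to be run by hand, e.g. one per screen session.
    print(shlex.join(xbflash2_command(MCS_FILE, device)))
```

Printing instead of executing keeps the interactive confirmation step in the operator's hands.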

xbflash2 is available as an RPM for RHEL7 and RHEL8 from the Alveo product page.

For RHEL9 it needs to be built from source, see the Xilinx/XRT GitHub repository.

After flashing the card

Irrespective of the method used to upload the firmware, a cold reboot of the server is necessary (with a short power interruption to PCIe devices). Currently, this is best done by powering the server off and on again. A more efficient procedure is yet to be tested.

Reset card

To reset the card, a standard warm reboot is sufficient.

Loading a new image from the flash requires a cold reboot, in which power to the card is cut for a short time.

Hardware verification

To test that the FPGA board is working properly without access to a JUNGFRAU detector, you can use the jfjoch_action_test tool.

Card release number

To ensure compatibility of the card with the driver and the user application, each design is marked with a release number. This number is incremented after each change to the functionality of the card or to the interface used to communicate with the host. Changes within the design that are invisible to the host (e.g. the size of FIFOs) do not require a change of the release number.

To check release number, look for constant RELEASE_NUMBER in include/jfjoch_fpga.h header file. For FPGA design, release number is also included in the generated bitstream name.

If the release numbers of the card and the kernel driver do not match, the driver will not create the character device and will return an error (check dmesg).

It is recommended to use the same git commit hash for building the design and the user application, though this is not strictly necessary as long as care is taken that both have the same release number.
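
The release check described above amounts to an equality comparison. The sketch below extracts RELEASE_NUMBER from header text and compares it with a value reported by the card. The `#define` syntax is an assumption, since the actual declaration in include/jfjoch_fpga.h may look different, and the header content and card value used here are made up.

```python
import re

# Assumed header syntax: "#define RELEASE_NUMBER <integer>"; the real header
# may declare the constant differently.
RELEASE_RE = re.compile(
    r"^\s*#\s*define\s+RELEASE_NUMBER\s+(0[xX][0-9a-fA-F]+|\d+)", re.MULTILINE
)


def release_number_from_header(header_text):
    """Extract RELEASE_NUMBER from header text, or return None if absent."""
    match = RELEASE_RE.search(header_text)
    if match is None:
        return None
    text = match.group(1)
    return int(text, 16) if text.lower().startswith("0x") else int(text, 10)


def is_compatible(host_release, card_release):
    """Host software and card match only if the release numbers are equal."""
    return host_release is not None and host_release == card_release


# Hypothetical header content and card values, for illustration only.
HEADER = "#define RELEASE_NUMBER 0x0042\n"
host_release = release_number_from_header(HEADER)
print(is_compatible(host_release, 0x0042))
print(is_compatible(host_release, 0x0043))
```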

FPGA reference

Frame generator

The Jungfraujoch card is equipped with a frame generator, which makes it possible to simulate a JUNGFRAU detector without access to such a system. It is placed in parallel to the Ethernet MAC, i.e. before the network stack and before any processing on the card. In the future, it will be possible to redirect the simulated stream through the 100G TX network link. The frame generator is written in HLS and controlled via AXI-Lite.

Register map

FPGA setup can be done via 32-bit registers:

| Address | Bits | Meaning | Mode | Notes |
|---|---|---|---|---|
| 0x000000 - 0x00FFFF | | Reserved (if a MicroBlaze is used in the future, this range has to be reserved for its internal memory) | | |
| 0x010000 | 32 | Action Control Register | | |
| | | Bit 0 - Action start | R/W | |
| | | Bit 1 - Action idle | R | |
| | | Bit 2 - Action cancel | R/W | cleared on reset or action start |
| | | Bit 3 - Clear network counters | R/W | cleared on reset or action start |
| | | Bit 4 - Host writer idle | R | cleared on reset |
| | | Bit 7 - Design number | R | 0 = PCIe #0, 1 = PCIe #1 |
| | | Bit 16 - AXI Mailbox interrupt 0 | R | |
| | | Bit 17 - AXI Mailbox interrupt 1 | R | |
| | | Bits 24-27 - Various errors in host memory writer | R | cleared on reset or action start |
| 0x010004 | 32 | Reserved | - | |
| 0x010008 | 32 | Reserved | - | |
| 0x01000C | 32 | Action GIT SHA1 | R | |
| 0x010010 | 32 | Action Type | R | |
| 0x010014 | 32 | Action Release Level | R | |
| 0x010020 | 32 | Max. number of supported detector modules | R | constant |
| 0x010024 | 32 | Reserved | R | constant |
| 0x010028 | 64 | Pipeline stalls before writing to host memory | R | reset on action start |
| 0x010030 | 64 | Pipeline stalls before accessing HBM | R | reset on action start |
| 0x010038 | 32 | FIFO status (see action_config.v for details) | R | |
| 0x01003C | 32 | Size of single HBM channel in bytes (default value for the particular card) | R/W | should not be altered for standard operation |
| 0x010040 | 64 | Packets processed by the action | R | cleared on reset or action start |
| 0x010048 | 64 | Valid Ethernet packets | R | cleared on reset |
| 0x010050 | 64 | Valid ICMP packets | R | cleared on reset |
| 0x010058 | 64 | Valid UDP packets | R | cleared on reset |
| 0x010060 | 64 | Valid detector packets processed by the card | R | cleared on reset |
| 0x010068 | 64 | Packets flagged as errors by CMAC | R | cleared on reset |
| 0x010100 | 32 | Spot finder photon count threshold | R/W | |
| 0x010104 | 32 | Spot finder signal-to-noise ratio threshold (single-precision float) | R/W | |
| 0x010200 | 64 | MAC address of FPGA card | R/W | network byte order |
| 0x010208 | 32 | IPv4 address of FPGA card | R/W | network byte order |
| 0x01020C | 32 | Number of detector modules (value minus one: 0 => 1 module, 1 => 2 modules, etc.) | R/W | |
| 0x010210 | 32 | Data collection mode | R/W | |
| | | Bit 0 - Conversion to photons | | |
| | | Bits 16-31 - Data collection ID (carried with completions) | | |
| 0x010214 | 32 | Photon energy in keV (single-precision float) | R/W | |
| 0x010218 | 32 | Number of frames expected in the data collection (defines termination condition) | R/W | |
| 0x01021C | 32 | Number of storage cells | R/W | |
| 0x010220 | 32 | Summation on card (value minus one: 0 => summation of 1, 1 => summation of 2, etc.) | R/W | |
| 0x020000 - 0x02FFFF | | CMAC 100G | | See Xilinx PG203 for register map |
| 0x030000 - 0x03FFFF | | AXI Mailbox for Work Request / Work Completion | | See Xilinx PG114 for register map |
| 0x040000 - 0x04FFFF | | QuadSPI flash | | See Xilinx PG153 for register map |
| 0x050000 - 0x05FFFF | | Interrupt controller | | See Xilinx PG099 for register map |
| 0x060000 - 0x06FFFF | | Load calibration (HLS) | | |
| 0x070000 - 0x07FFFF | | AXI Firewall | | See Xilinx PG293 for register map |
| 0x080000 - 0x08FFFF | | Frame generator (HLS) | | |
| 0x090000 - 0x09FFFF | | PCIe DMA control | | See Xilinx PG195 for register map |
| 0x0C0000 - 0x0FFFFF | | Xilinx Card Management Solution Subsystem | | See Xilinx PG348 for register map |
| 0x100000 - 0x10FFFF | | MAC 10G | | See Xilinx PG210 for register map |
| 0x110000 - 0x11FFFF | | MAC 10G | | See Xilinx PG210 for register map |
| 0x120000 - 0x12FFFF | | MAC 10G | | See Xilinx PG210 for register map |
| 0x130000 - 0x13FFFF | | MAC 10G | | See Xilinx PG210 for register map |
| 0x200000 - 0x27FFFF | 64 | Address table: decodes handles used by load_calibration and host_writer to DMA addresses | | |
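
To make the bit layout above concrete, the following sketch decodes the Action Control Register (0x010000) and encodes a few of the configuration values. It works on plain integers only; how the registers are actually read and written (driver, host library) is outside its scope, and all function names are hypothetical.

```python
import struct


def decode_action_control(value):
    """Split the Action Control Register (0x010000) into its listed fields."""
    return {
        "start": bool(value & (1 << 0)),
        "idle": bool(value & (1 << 1)),
        "cancel": bool(value & (1 << 2)),
        "clear_network_counters": bool(value & (1 << 3)),
        "host_writer_idle": bool(value & (1 << 4)),
        "design_number": (value >> 7) & 0x1,
        "mailbox_interrupt_0": bool(value & (1 << 16)),
        "mailbox_interrupt_1": bool(value & (1 << 17)),
        "host_writer_errors": (value >> 24) & 0xF,
    }


def encode_data_collection_mode(conversion_to_photons, data_collection_id):
    """Register 0x010210: bit 0 = conversion, bits 16-31 = data collection ID."""
    if not 0 <= data_collection_id <= 0xFFFF:
        raise ValueError("data collection ID must fit in 16 bits")
    return (data_collection_id << 16) | int(bool(conversion_to_photons))


def float_to_register(value):
    """Raw 32-bit pattern of a single-precision float (e.g. photon energy)."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def minus_one(count):
    """Encoding of module count and summation: register value = count - 1."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return count - 1


status = decode_action_control(0x00000012)  # bits 1 and 4 set
print(status["idle"], status["host_writer_idle"], status["start"])
print(hex(encode_data_collection_mode(True, 0x1234)))
print(hex(float_to_register(12.4)))
print(minus_one(4))
```

The MAC and IPv4 registers are left out on purpose, because their exact byte layout is not specified beyond "network byte order".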

AXI Mailbox

The AXI mailbox is used to send work requests from the host to the action and to receive work completions. Messages are exchanged through the AXI Mailbox IP from Xilinx (see Xilinx PG114).

A work request has the following structure:

| Bit start | Bit end | Meaning |
|---|---|---|
| 0 | 15 | Work request ID (handle) |

A work completion has the following structure:

| Bit start | Bit end | Meaning |
|---|---|---|
| 0 | 15 | Work request ID (handle). Special values: 65534 - start of data collection, 65535 - end of data collection |
| 16 | 31 | Data collection ID |
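
A work completion can be unpacked with two masks. The sketch below assumes that the handle occupies the lower 16 bits and the data collection ID the upper 16 bits (bits 16-31), consistent with the Data collection mode register; the function name and the returned labels are hypothetical.

```python
# Special handle values listed in the work completion structure.
START_OF_DATA_COLLECTION = 65534
END_OF_DATA_COLLECTION = 65535


def decode_work_completion(word):
    """Return (kind, handle, data_collection_id) for a 32-bit completion word."""
    handle = word & 0xFFFF
    data_collection_id = (word >> 16) & 0xFFFF
    if handle == START_OF_DATA_COLLECTION:
        kind = "start"
    elif handle == END_OF_DATA_COLLECTION:
        kind = "end"
    else:
        kind = "work_request"
    return kind, handle, data_collection_id


# Made-up completion words for data collection ID 7.
for word in (0x0007FFFE, 0x00070003, 0x0007FFFF):
    print(decode_work_completion(word))
```

Because the two special values are reserved, at most 65534 distinct handles (0 to 65533) are available for ordinary work requests.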