# Implementing CERN lpGBT link arrays in Intel Arria 10 FPGA

Ernő Dávid Wigner Research Centre for Physics, HUN-REN

This work was supported by OTKA K-135515 and NKFI 2021-4.1.2-NEMZ\_KI-2022-00009 grants

30 May 2024 GPU Day 2024, Budapest, Hungary



# lpGBT Links for New Detector Front-end Systems in ALICE Common Readout Unit (CRU)

- During the LS2 upgrade a brand new DAQ and trigger system was developed for ALICE for Run 3 and Run 4
- The upgraded sub-detectors are now connected to the DAQ and Trigger systems with rad-hard GBT links through the CRU
- This enables the delivery of timing & control with deterministic latency and data taking through a single fiber connection
- The GBTx ASIC is no longer available and the new lpGBT supersedes it for new developments or system additions



- During the coming LS3 upgrades, the new FE systems (e.g. ITS3, FoCal)
  will (have to) use lpGBT links to connect to the CRUs
- The lpGBT links have to be integrated into the existing CRU FW while keeping compatibility with the existing O2, TRIGGER, and DCS systems

#### Main features of the present GBT links

- 4.8 Gb/s downlink
- 4.8 Gb/s uplink
- Front-end components:
  - GBTx ASIC
  - external slow-control (I2C, SPI, etc.) controller ASIC (SCA)
  - Versatile Link (VL) optical components

#### Main features of the new lpGBT with VTRX+

- 2.56 Gb/s downlink
- 5.12/10.24 Gb/s uplink
- Front-end components:
  - lpGBT ASIC
  - internal slow-control controllers (I2C, SPI, GPIO, ADC, etc.) (and optional external SCA)
  - VL+ optical components




# **Project Goal**

The goal of this *lpGBT* in *CRU* project is a *feasibility study* with the following consecutive steps:

1. Implementing the *lpGBT-FPGA IP* developed by the CERN EP-ESE team in the CRU Arria 10 GX FPGA in a single-link standalone design and testing the functionality at the lowest level (links only, with pattern generators / checkers)
2. Integrating the *lpGBT-FPGA IP* in the common CRU FW by removing the GBT interfaces and
   - adding 1 lpGBT interface with full clock & trigger integration
   - adding 12 lpGBT interfaces (2 TRX banks) with full clock & trigger integration
   - optional: adding 24 lpGBT interfaces with full clock, trigger, and data integration





The aim is not a production firmware, but to study and demonstrate that replacing the GBT links with lpGBT links in the CRU is possible while keeping compatibility with the connected systems (O2, TTS, DCS).



# ALICE Common Read-out Unit (CRU)



The Common Read-out Units (CRU) are PCIe add-on cards installed in the First Level Processor (FLP) nodes of the ALICE DAQ system. Main tasks of the CRU:

- Deliver the trigger, timing and read-out control information to the Front-End Electronics
- Deliver detector data to the O2 (FLP Servers) with and/or without processing in the CRU FPGA
- Transport detector control information between the DCS and the FEE
- Take part in the Busy / Drop / Throttle mechanism of the detector read-out



## CRU Firmware – Main Data Paths





# **GBT / lpGBT Differences**

#### **DOWNLINK**

Header, internal and external control channels, data channel, forward error correction

#### Differences (lpGBT vs GBT):

- 64-bit @ 40 MHz vs 120-bit @ 40 MHz
- 32-bit payload vs 80-bit payload
- 2.56 Gb/s vs 4.8 Gb/s
- TX parallel clock 320 MHz vs 240 MHz
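The downlink numbers above are tied together by simple arithmetic: the frame width multiplied by the 40 MHz frame rate gives the serial line rate, and the parallel clock divided by 40 MHz gives the number of parallel-clock ticks per frame. The following Python sketch only re-derives the values listed on this slide; the function and dictionary names are illustrative and do not come from the CRU firmware.

```python
# Nominal frame rate: one frame per 40 MHz LHC clock period.
FRAME_RATE_MHZ = 40


def line_rate_mbps(frame_bits: int) -> int:
    """Serial line rate in Mb/s for a frame sent once per 40 MHz period."""
    return frame_bits * FRAME_RATE_MHZ


def ticks_per_frame(parallel_clock_mhz: int) -> int:
    """Number of parallel-clock ticks that fit in one 40 MHz period."""
    return parallel_clock_mhz // FRAME_RATE_MHZ


# Values copied from the downlink comparison above (names are hypothetical).
DOWNLINKS = {
    "lpGBT": {"frame_bits": 64, "payload_bits": 32, "tx_clock_mhz": 320},
    "GBT": {"frame_bits": 120, "payload_bits": 80, "tx_clock_mhz": 240},
}

for name, link in DOWNLINKS.items():
    rate = line_rate_mbps(link["frame_bits"])
    ticks = ticks_per_frame(link["tx_clock_mhz"])
    print(f"{name}: {rate} Mb/s line rate, {ticks} TX clock ticks per frame")
```

The tick counts (8 for lpGBT, 6 for GBT) are the number of possible phase positions that the alignment logic described later has to resolve.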





#### **UPLINK**

Header, internal and external control channels, data channel, optional forward error correction

#### Differences (lpGBT vs GBT):

- 128/256-bit @ 40 MHz vs 120-bit @ 40 MHz
- 96/112/192/224-bit payload vs 80/112-bit payload
- 5.12 Gb/s or 10.24 Gb/s vs 4.8 Gb/s
- RX parallel clock 320 MHz vs 240 MHz
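To compare the uplink options, it can help to look at the user payload bandwidth and the fraction of each frame that carries payload. The sketch below assumes the usual pairing of the listed payload sizes with the lpGBT FEC5 and FEC12 modes (112 and 96 bits in the 128-bit frame, 224 and 192 bits in the 256-bit frame); that pairing is an assumption not spelled out on the slide, and all names are illustrative.

```python
# One uplink frame is transmitted per 40 MHz clock period.
FRAME_RATE_MHZ = 40

# (line rate, FEC mode) -> (frame bits, payload bits).
# The FEC5/FEC12 labels are an assumed mapping of the payload sizes above.
UPLINK_MODES = {
    ("5.12 Gb/s", "FEC5"): (128, 112),
    ("5.12 Gb/s", "FEC12"): (128, 96),
    ("10.24 Gb/s", "FEC5"): (256, 224),
    ("10.24 Gb/s", "FEC12"): (256, 192),
}


def payload_rate_mbps(payload_bits: int) -> int:
    """User payload bandwidth in Mb/s."""
    return payload_bits * FRAME_RATE_MHZ


def payload_fraction(frame_bits: int, payload_bits: int) -> float:
    """Share of the frame that carries payload (the rest is header, control, FEC)."""
    return payload_bits / frame_bits


for (rate, fec), (frame_bits, payload_bits) in UPLINK_MODES.items():
    print(
        f"{rate} {fec}: {payload_rate_mbps(payload_bits)} Mb/s payload, "
        f"{payload_fraction(frame_bits, payload_bits):.1%} of the frame"
    )
```

Under this mapping the stronger FEC12 protection costs one eighth of the frame compared with FEC5 at either line rate.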





**lpGBT** downlink

(CRU -> FEE)

# CRU – Clock and Trigger (GBT version)



#### **GBT CRU**



# solution (concept)

We have to go down to the common 40 MHz clock domain and align data at both sides to the common 40 MHz clock!



# Aligning the 40 MHz Clock to Data Valid

- The PON module recovers the 240 MHz reference clock and a data valid bit with deterministic constant delay to the 40 MHz LHC clock rising edge.
- The IOPLL recovers the 40 MHz clock by dividing the 240 MHz clock by six. At the output of the IOPLL the two clocks can have any of six different phase relations, chosen at random at each lock.
- If the 40 MHz rising edge is not aligned with the data valid bit then the Control FSM resets the IOPLL.
- Within a reasonable time (practically within a few tries) the data valid bit and the 40 MHz clock will be aligned.
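The reset-until-aligned procedure above can be illustrated with a small behavioural model. This is a sketch in Python, not the Control FSM itself: it assumes that each IOPLL reset lands on one of the six phases with equal probability, and the function name and `max_tries` parameter are hypothetical.

```python
import random

PHASES = 6  # 240 MHz / 40 MHz: possible phase relations after each lock


def align_divided_clock(rng: random.Random, max_tries: int = 1000) -> int:
    """Model the FSM: reset the IOPLL until the 40 MHz edge matches data valid.

    Returns the number of lock attempts that were needed.
    """
    for attempt in range(1, max_tries + 1):
        # Assumption: the phase after every (re)lock is uniformly random.
        phase = rng.randrange(PHASES)
        if phase == 0:  # phase 0 stands for "aligned with data valid"
            return attempt
    raise RuntimeError("clock did not align within max_tries")


rng = random.Random(2024)
tries = [align_divided_clock(rng) for _ in range(10_000)]
print(f"mean attempts: {sum(tries) / len(tries):.2f}, worst case: {max(tries)}")
```

Under the uniform-phase assumption the number of attempts follows a geometric distribution with success probability 1/6, so six attempts are expected on average.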







# Aligning lpGBT Data Write Enable to the 40 MHz Clock

- The lpGBT data write enable bit is generated in the 320 MHz clock domain with a 40 MHz rate.
- We sample this bit with our 40 MHz clock which is in sync with the 320 MHz clock.
- If it is not aligned with the 40 MHz clock then the Control FSM shifts the write enable bit (bit slip).
- In a maximum of eight steps the write enable bit will be aligned with the 40 MHz clock.
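The bit-slip search above is deterministic, unlike the IOPLL reset loop: the write enable pulse occupies one of eight 320 MHz slots per 40 MHz period and each slip moves it by one slot. The Python sketch below is a minimal model of that search; the slot numbering, the slip direction, and the function name are assumptions made for illustration.

```python
SLOTS = 8  # 320 MHz / 40 MHz: possible positions of the write enable pulse


def align_write_enable(initial_slot: int) -> int:
    """Return the number of bit slips needed to move the pulse to slot 0.

    Slot 0 stands for "aligned with the 40 MHz clock"; each slip is assumed
    to move the pulse by exactly one 320 MHz tick in a fixed direction.
    """
    slot = initial_slot % SLOTS
    slips = 0
    while slot != 0:
        slot = (slot + 1) % SLOTS
        slips += 1
    return slips


for start in range(SLOTS):
    print(f"start slot {start}: {align_write_enable(start)} slips")
```

In this model at most seven slips are ever needed, which is consistent with alignment being reached within the eight steps quoted above.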







# Clock and Trigger Integration / Jitter Cleaning



# Test Setup (simplified)





# First Test Results (Downlink Clock and Trigger Delay Stability I.)

## LHC clock & trigger delivery measurements: Delay and stability from LTU to FEE (VLDB+)

- IN1 (yellow): DATA VALID (1:6 tick): PON link output (CRU CLK-OUT SMA conn.)
- IN2 (green): VLDB+ 40 MHz: lpGBT link output (VLDB+ E0CLK\_P SMA conn.)
- TRIGGER: external, LTU 40 MHz output (SCOPEA conn.)





Visualizing the 40 MHz LHC clock, DATA VALID, and lpGBT TX WRITE\_ENABLE flags via the clock chain (1 or 12 links implemented) with infinite persistence for a few hours while challenging the stability with **optical link disconnections**, **power cycling**, and **forced PLL initializations**:

- DATA\_VALID (1 link, 12 links) → stable with the same delay (no glitches or phase jumps observed over 10<sup>12</sup> continuous clock cycles)
- TX Write\_Enable (1 link, 12 links) → stable with the same delay (no glitches or phase jumps observed over 10<sup>12</sup> continuous clock cycles)
- VLDB+ ECLK0 (1 link, 12 links, internal feeding) → stable with the same delay (no glitches or phase jumps observed over 10<sup>12</sup> continuous clock cycles)



# **Uplink Testing**

| Link<br>ID | GBT Mode<br>Tx/Rx | Loopback | GBT MUX | Datapath<br>mode | Datapath<br>status | RX freq<br>(MHz) | TX freq<br>(MHz) | Status | Optical<br>power (uW) | System<br>ID | FEE<br>ID |
|------------|-------------------|----------|---------|------------------|--------------------|------------------|------------------|--------|-----------------------|--------------|-----------|
| 0          | GBT/GBT           | None     | TTC:CTP | Streaming        | Enabled            | 259.55           | 320.63           | DOWN   | 0.0                   | 0x3          | 0x0       |
| 1          | GBT/GBT           | None     | TTC:CTP | Streaming        | Enabled            | 320.44           | 320.63           | DOWN   | 0.0                   | 0x3          | 0x0       |
| 2          | GBT/GBT           | None     | TTC:CTP | Streaming        | Enabled            | 321.17           | 320.63           | DOWN   | 0.0                   | 0x3          | 0x0       |
| 3          | GBT/GBT           | None     | TTC:CTP | Streaming        | Enabled            | 320.02           | 320.63           | DOWN   | 0.0                   | 0x3          | 0x0       |
| 4          | GBT/GBT           | None     | TTC:CTP | Streaming        | Enabled            | 317.92           | 320.63           | DOWN   | 0.0                   | 0x3          | 0x0       |
| 5          | GBT/GBT           | None     | TTC:CTP | Streaming        | Enabled            | 313.63           | 320.63           | DOWN   | 0.0                   | 0x3          | 0x0       |
| 6          | GBT/GBT           | None     | TTC:CTP | Streaming        | Enabled            | 320.63           | 320.63           | UP     | 900.0                 | 0x0          | 0x0       |
| 7          | GBT/GBT           | None     | TTC:CTP | Streaming        | Enabled            | 319.00           | 320.03           | DOWN   | 0.0                   | 0x0          | 0x0       |
| 8          | GBT/GBT           | None     | TTC:CTP | Streaming        | Enabled            | 298.17           | 320.63           | DOWN   | 0.0                   | 0x0          | 0x0       |
| 9          | GBT/GBT           | None     | TTC:CTP | Streaming        | Enabled            | 254.86           | 320.63           | DOWN   | 0.0                   | 0x0          | 0x0       |
| 10         | GBT/GBT           | None     | TTC:CTP | Streaming        | Enabled            | 257.52           | 320.63           | DOWN   | 0.0                   | 0x0          | 0x0       |
| 11         | GBT/GBT           | None     | TTC:CTP | Streaming        | Enabled            | 249.85           | 320.63           | DOWN   | 0.0                   | 0x0          | 0x0       |

### **Test results:**

- The x12 link lpGBT module is recognized by the O2 software (link status, TX/RX frequency, etc.)
- The lpGBT ASIC "Loopback Downlink Group Data Source" mode was used with built-in checkers in the lpGBT module
- The links were tested one-by-one with a single VLDB+ for 1 hour / link (10<sup>14</sup> bits) → no errors
- Integration with the Datapath wrapper is ongoing
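An error-free test can be turned into an upper limit on the bit error rate (BER). The sketch below uses the standard zero-error estimate, BER < -ln(1 - CL) / N for N tested bits at confidence level CL, and takes the 10<sup>14</sup> bits per link quoted above at face value. The function name and the 95% confidence level are choices made for this illustration, not part of the reported measurement.

```python
import math


def ber_upper_limit(bits_tested: float, confidence: float = 0.95) -> float:
    """Upper limit on the BER when zero errors were seen in bits_tested bits."""
    if bits_tested <= 0:
        raise ValueError("bits_tested must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return -math.log(1.0 - confidence) / bits_tested


# Bits per link as quoted in the test results above.
limit = ber_upper_limit(1e14)
print(f"BER upper limit at 95% confidence: {limit:.2e}")
```

With zero errors the limit scales inversely with the number of bits tested, so a ten times longer run tightens it by a factor of ten.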



# **Summary:**

- x12 link lpGBT module (with 10.24 Gbps / FEC12 uplink mode) implemented inside the CRU-FW
- No visible phase jumps in the VLDB+ 40 MHz clock (10<sup>12</sup> clock cycles)
- Stable data loopback through the lpGBT ASIC ("Loopback Downlink Group Data Source" mode) (10<sup>14</sup> bits per link)

# **Further work:**

- Implement detector specific user logic for the new ALICE FoCal detector
- Characterization of the clock and trigger delivery in more detail (jitter, skew between links, etc.)
- Implement up to x24 lpGBT links



# **Thank You for Your Attention!**



E. David