





# FutureDAQ for CBM: On-line Event Selection

- About FAIR
- About CBM
- About FutureDAQ

# FAIR Facility for Antiproton and Ion Research





#### **Primary Beams**

- 10<sup>12</sup>/s; 1.5-2 GeV/u; <sup>238</sup>U<sup>28+</sup>
- Factor 100-1000 over present intensity
- 2(4)x10<sup>13</sup>/s 30 GeV protons
- 10<sup>10</sup>/s <sup>238</sup>U<sup>92+</sup> up to 35 GeV/u
- up to 90 GeV protons

#### **Secondary Beams**

- Broad range of radioactive beams up to 1.5 - 2 GeV/u; up to factor 10 000 in intensity over present
- Antiprotons 0 - 30 GeV

#### **Storage and Cooler Rings**

- Radioactive beams
- e-A (or Antiproton-A) collider
- 10<sup>11</sup> stored and cooled 0.8 - 14.5 GeV antiprotons
- Polarized antiprotons(?)

#### **Key Technical Features**

- Cooled beams
- Rapidly cycling superconducting magnets



# **FAIR** in 2014







## Scientific programs





- Nuclear Structure Physics and Nuclear Astrophysics with RIBs
- Hadron Physics with Anti-Proton Beams
- Physics of Nuclear Matter with Relativistic Nuclear Collisions: **Compressed Baryonic Matter**
- Plasma Physics with highly Bunched Beams
- Atomic Physics and Applied Science with highly charged ions and low energy Anti-Protons
- Accelerator Physics



# **Compressed Baryonic Matter**



## U+U 23 AGeV





## CBM physics topics and observables





1. In-medium modifications of hadrons (p-A, A-A)
   - onset of chiral symmetry restoration at high $\rho_B$
   - measure: $\rho$, $\omega$, $\phi$ $\rightarrow$ e<sup>+</sup>e<sup>-</sup> (μ<sup>+</sup>μ<sup>-</sup>), J/Ψ, open charm (D<sup>0</sup>, D<sup>±</sup>)
2. Indications for deconfinement at high $\rho_B$ (A-A heavy)
   - anomalous charmonium suppression? Measure: D excitation function, J/Ψ → e<sup>+</sup>e<sup>-</sup> (μ<sup>+</sup>μ<sup>-</sup>)
   - softening of EOS. Measure: flow excitation function
3. Strangeness in matter
4. Critical endpoint of deconfinement phase transition
   - event-by-event fluctuations. Measure: π, K

displaced vertices

e<sup>+</sup>e<sup>-</sup> pair, high p<sub>t</sub> low cross section

#### **Problems for LVL1 trigger:**

- High data rates
- Short latency (µs)
- Complex (displaced vertices)
- Most of the data needed



## The data acquisition





#### New paradigm: switch full data stream into event selector farms

1. A conventional LVL1 trigger would imply full displaced vertex reconstruction within a fixed latency.
2. Strongly varying, complex event filter decisions are needed on almost the full event data.

**➡ No common trigger! Self-triggered channels with time stamps! Event filters!**

- 10 MHz interaction rate expected
- 1 ns time stamps (in all data channels, ~10 ps jitter) required
- 1 TByte/s primary data rate (Panda < 100 GByte/s) expected
- 1 GByte/s maximum archive rate (Panda < 100 MByte/s) required
- Event definition (time correlation: multiplicity over time histograms) required
- Event filter to 20 kHz (1 GByte/s archive with compression) required
- On-line track & (displaced) vertex reconstruction required
- Data flow driven, no problem with latency expected
- Less complex communication, but high data rate to sort
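The rates quoted above fix a rough per-event data budget. The following Python sketch only rearranges the slide's numbers; the variable and function names are illustrative, and decimal units (1 TByte = 10<sup>12</sup> bytes) are an assumption.

```python
# Figures quoted on the slide (decimal units assumed)
interaction_rate_hz = 10e6   # 10 MHz interaction rate
primary_rate_bytes = 1e12    # 1 TByte/s primary data rate
archive_rate_bytes = 1e9     # 1 GByte/s maximum archive rate
filtered_rate_hz = 20e3      # 20 kHz event rate after the event filter


def daq_budget():
    """Derive per-event sizes and reduction factors from the quoted rates."""
    return {
        "raw_bytes_per_event": primary_rate_bytes / interaction_rate_hz,
        "archived_bytes_per_event": archive_rate_bytes / filtered_rate_hz,
        "event_rate_reduction": interaction_rate_hz / filtered_rate_hz,
        "data_rate_reduction": primary_rate_bytes / archive_rate_bytes,
    }


budget = daq_budget()
for name, value in budget.items():
    print(f"{name}: {value:,.0f}")
```

Under these assumptions each interaction carries about 100 kByte of raw data, and the event filter has to reduce the event rate by a factor of 500 and the data rate by a factor of 1000, which leaves about 50 kByte per archived event.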



## **FutureDAQ**







FP6: 6th Framework Program on research, technological development and demonstration

I3HP: Integrated Infrastructure Initiative in Hadron Physics

JRA: Joint Research Activity

#### Participants from

- GSI (Spokesperson: Walter F.J. Müller)
- Kirchhoff Institute for Physics, Univ. Heidelberg
- University of Mannheim
- Technical University Munich
- University of Silesia, Katowice
- Krakow University
- Warsaw University
- Giessen University
- RMKI Budapest
- INFN Torino



# Studying AA-collisions from 1 - 45 AGeV





**CBM Detector: 8 - 45 AGeV** 



## The CBM detectors





#### At 10<sup>7</sup> interactions per second!

- Radiation hard Silicon (pixel/strip) tracker in a magnetic dipole field
- Electron detectors: RICH & TRD & ECAL, pion suppression up to 10<sup>5</sup>
- Hadron identification: RICH, RPC
- Measurement of photons, π<sup>0</sup>, η and muons: electromagnetic calorimeter (ECAL)

Central Au+Au collision at 25 AGeV:

**Multiplicities:** 160 p, 400 π<sup>-</sup>, 400 π<sup>+</sup>, 44 K<sup>+</sup>, 13 K<sup>-</sup>, 800 γ

1817 total at 10 MHz
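The multiplicities quoted for a central Au+Au collision can be checked against the stated total and turned into a particle rate. This is a minimal sketch: the dictionary keys are illustrative labels, the second kaon entry is taken here to be K<sup>-</sup>, and applying the central-collision multiplicity to every interaction gives an upper estimate rather than a measured rate.

```python
# Particles per central Au+Au collision at 25 AGeV, as listed on the slide
multiplicities = {
    "p": 160,
    "pi-": 400,
    "pi+": 400,
    "K+": 44,
    "K-": 13,      # second kaon entry, assumed here to be K-
    "gamma": 800,
}
interaction_rate_hz = 10_000_000  # 10^7 interactions per second

total_per_event = sum(multiplicities.values())
charged_per_event = total_per_event - multiplicities["gamma"]

# Upper estimate: treats every interaction as a central collision
particles_per_second = total_per_event * interaction_rate_hz

print(f"total per event:      {total_per_event}")
print(f"charged per event:    {charged_per_event}")
print(f"particles per second: {particles_per_second:.3e}")
```

The individual entries add up to the quoted 1817 particles per event, which corresponds to roughly 1.8 × 10<sup>10</sup> particles per second at the full interaction rate.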







## **DAQ** hierarchy







# TNet: Clock/time distribution





#### Definition of 'Time distribution':

- TNet generates GHz time clock with ~10 ps jitter
- provides global state transitions with clock cycle precise latency
- Hierarchical splitting into 1000 CNet channels

#### Consequences for serial FEE links and CNet switches:

- bit clock cycle precise transmission of time messages
- low jitter clock recovery required
- FEE link and CNet will likely use custom SERDES (e.g. OASE)



# **CNet: Data concentrator**





- Collects hits from readout boards (FEE) into active buffers (data dispatchers), 1 GByte/s
- Capture hit clusters, communicate geographically neighboring channels
- Distribute time stamps and clock (from TNet) to FEE
- Low latency bi-directional optical links
- Eventually communicate detector control & status messages



## **BNet: Building network**





Has to sort parallel data into sequential event data.

Two mechanisms, both with traffic shaping:

- switch by time intervals
  - event definition is done behind BNet in PNet compute resources
  - all raw data goes through BNet
- switch by event intervals
  - event definition done in BNet by multiplicity histogramming
  - Suppression of incoherent background and peripheral events
  - potentially significant reduction of BNet traffic
  - Some bandwidth required for histogramming
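Event definition by multiplicity histogramming can be illustrated with a short Python sketch: hit time stamps are binned, and bins whose hit count reaches a threshold are taken as event candidates. The function name, the 10 ns bin width, and the threshold of 5 hits are assumptions chosen for illustration, not parameters from the CBM design.

```python
from collections import Counter


def find_event_candidates(hit_times_ns, bin_width_ns=10, threshold=5):
    """Histogram hit time stamps and return (start_time_ns, hit_count)
    for each run of adjacent bins whose count reaches the threshold."""
    histogram = Counter(int(t // bin_width_ns) for t in hit_times_ns)
    candidates = []
    previous_bin = None
    for b in sorted(histogram):
        if histogram[b] < threshold:
            continue  # incoherent background: too few hits in this bin
        if previous_bin is not None and b == previous_bin + 1:
            # Adjacent bin above threshold: extend the current candidate
            start, count = candidates[-1]
            candidates[-1] = (start, count + histogram[b])
        else:
            candidates.append((b * bin_width_ns, histogram[b]))
        previous_bin = b
    return candidates


# Two bursts of 8 hits each plus isolated background hits
hits = list(range(100, 108)) + list(range(250, 258)) + [13, 57, 181, 333, 402]
events = find_event_candidates(hits)
print(events)
```

In this toy input only the two bursts pass the threshold, while the isolated hits are suppressed, which is the kind of traffic reduction the second mechanism aims for.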

Functionality of *data dispatcher* and *event dispatcher* implemented on one active buffer board using bi-directional links.

Simulations with mesh-like topology





## BNet: Factorization of 1000x1000 switch

$n = 4 \cdot 16 \times 16$





- 1024 double-nodes need 32 switches of size 32x32 with 496 bi-directional connections
- Data transfer in time slices (determined by epochs or event streams)
- Bandwidth reserved for histogramming and scheduling (traffic shaping)
- 1 GByte/s point to point
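The quoted connection count can be reproduced with a few lines of Python. This sketch assumes that every pair of switches is joined by exactly one bi-directional link (a full mesh); that reading matches the number 496 but is an interpretation, and the function name is illustrative.

```python
def full_mesh_links(n_switches):
    """Number of bi-directional links when every pair of switches is connected once."""
    return n_switches * (n_switches - 1) // 2


n_switches = 32
end_nodes = 1024

links = full_mesh_links(n_switches)
nodes_per_switch = end_nodes // n_switches

print(f"{n_switches} switches -> {links} bi-directional links")
print(f"{nodes_per_switch} double-nodes attached to each switch")
```

With 32 switches this gives 32 · 31 / 2 = 496 links and 32 double-nodes per switch; a four-switch mesh would need only 6 links under the same assumption.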





# Simulation of BNet with SystemC





#### Modules:

- event generator
- data dispatcher (sender)
- histogram collector
- tag generator
- BNet controller (schedule)
- event dispatcher (receiver)
- transmitter (data rate, latency)
- switches (buffer capacity, max. number of queued packets, 4K)

Running with 10 switches and 100 end nodes.

Simulation takes 1.5 × 10<sup>5</sup> times longer than simulated time.

Various statistics (traffic, network load, etc.)



# BNet: SystemC simulations 100x100







## PNet: Structure of a sub-farm





- A sub-farm is a collection of compute resources connected with a PNet
- Compute resources are
  - programmable logic (FPGA)
  - processors
- A likely choice for the processors is high performance SoC components
  - CPUs, MEM, high speed interconnect on one chip
  - optimized for low W/GFlop and high packing density
  - see QCDOC, Blue Gene, STI Cell, ...
- PNet uses 'built-in' serial links connected through switches
- PCIe-AS is a candidate for a commonly used serial interconnect
- A plausible scenario for the low level compute farm
  - O(100) sub-farms with O(100) compute resources each
  - one sub-farm on O(10) boards in one crate
- Consequences
  - only chip-2-chip and board-2-board links in PNet
  - thus only short distance (< 1 m) communication



## PNet: First & second level computing





- Event selection level 1 (FPGA): 1%
- Event selection level 2 (CPU): 10%



64-128 sub-farms, each with 32 FPGAs and 32 CPUs

1 GByte/s
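The sub-farm counts above translate into the following totals. This is a minimal sketch, assuming the 10 MHz interaction rate quoted earlier is spread evenly over the sub-farms (an assumption, not stated on the slide); all names are illustrative.

```python
interaction_rate_hz = 10_000_000  # 10 MHz interaction rate
fpgas_per_subfarm = 32
cpus_per_subfarm = 32


def farm_size(n_subfarms):
    """Total compute resources and per-sub-farm input rate for a given farm size."""
    return {
        "fpgas": n_subfarms * fpgas_per_subfarm,
        "cpus": n_subfarms * cpus_per_subfarm,
        # Assumes interactions are distributed evenly over all sub-farms
        "events_per_subfarm_hz": interaction_rate_hz / n_subfarms,
    }


small_farm = farm_size(64)
large_farm = farm_size(128)
for label, farm in (("64 sub-farms", small_farm), ("128 sub-farms", large_farm)):
    print(label, farm)
```

Under these assumptions the farm holds between 2048 and 4096 FPGAs and the same number of CPUs, and each sub-farm sees roughly 80 to 160 kHz of interactions at its input.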



# Summary





#### Five different networks with very different characteristics

- CNet (custom)
  - Capture hit clusters, communicate geographically neighboring channels
  - Distribute time stamps and clock (from TNet) to FEE
  - Low latency bi-directional optical links (OASE)
  - Eventually communicate detector control & status messages
  - connects custom components (FEE ASICs, FPGAs)
- TNet (custom)
  - generates GHz time clock with ~10 ps jitter
  - provides global state transitions with clock cycle precise latency
- BNet (standard technology, e.g. Ethernet or InfiniBand)
  - switch by time intervals: event definition is done behind BNet in PNet compute resources
  - switch by event intervals: event definition done in BNet by multiplicity histogramming
- PNet (custom)
  - short distance, most efficient of already 'built-in' links (e.g. PCIe-AS)
  - connects standardized components (FPGA, SoCs)
- HNet
  - general purpose, to archive





- Working groups established
- Simulation frameworks set up
  - Detectors
  - Algorithms
  - Data flows
- Small scale demonstrator hierarchy chain in two years