InfiniBand: Mellanox Hardware & Firmware


August 19, 2015


January 30, 2025

Nvidia InfiniBand Networking Solutions


Switch   Config.   Ports   Speed
SB7800   fixed     36      EDR
QM87xx   fixed     40      HDR
CS8500   modular   800+    HDR
QM97xx   fixed     64      NDR

Switches come in two configurations…

  • fixed …number of ports
  • modular …gradually expandable with port modules

Switches come in two flavors…

  • managed
    • …MLNX-OS features unlocked
    • …access over SSH, SNMP, HTTPS
    • …enables monitoring and configuration
  • unmanaged
    • …in-band management is possible
    • …status via chassis LEDs

Get information from unmanaged switches with

# requires MST service
>>> ./ -d lid-647
Quantum Mellanox Technologies
part number        | MQM8790-HS2F
serial number      | MT2202X19243
product name       | Jaguar Unmng IB 200
revision           | AK
ports              | 80
PSID               | MT_0000000063
GUID               | 0x1070fd030003af98
firmware version   | 27.2008.3328
uptime (d-h:m:s)   | 26d-20:16:01
PSU0 status        | OK
     P/N           | MTEF-PSF-AC-C
     S/N           | MT2202X18887
     DC power      | OK
     fan status    | OK
     power (W)     | 165
PSU1 status        | OK
     P/N           | MTEF-PSF-AC-C
     S/N           | MT2202X18881
     DC power      | OK
     fan status    | OK
     power (W)     | 148
temperature (C)    | 63
max temp (C)       | 63
fan status         | OK
fan#1 (rpm)        | 5959
fan#2 (rpm)        | 5251
fan#3 (rpm)        | 6013
fan#4 (rpm)        | 5251
fan#5 (rpm)        | 5906
fan#6 (rpm)        | 5293
fan#7 (rpm)        | 6125
fan#8 (rpm)        | 5293
fan#9 (rpm)        | 5959
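
The report above is a plain `key | value` listing, so single fields are easy to pull out for monitoring. A minimal sketch, assuming the output format shown above; the function name `switch_field` and the 70 C threshold are hypothetical and not vendor values.

```shell
# Print the value of one field from a "key | value" report read on stdin.
# Only the first matching line is used (PSU sub-fields repeat per PSU).
switch_field() {
    awk -F'|' -v key="$1" '
        {
            k = $1; v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim the value
            if (k == key) { print v; exit }
        }'
}

# Sample lines copied from the report above
report='part number        | MQM8790-HS2F
firmware version   | 27.2008.3328
temperature (C)    | 63
fan status         | OK'

temp=$(printf '%s\n' "$report" | switch_field 'temperature (C)')
limit=70   # hypothetical alert threshold
if [ "$temp" -ge "$limit" ]; then
    echo "WARNING: switch temperature ${temp}C"
else
    echo "temperature ${temp}C is below ${limit}C"
fi
```

On a real system the report would be piped into `switch_field` directly instead of using the `report` variable.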

Ethernet Gateway

Skyway InfiniBand to Ethernet gateway…

  • MLNX-GW (gateway operating system) appliance
  • 16 ports (8x InfiniBand EDR/HDR + 8x Ethernet 100/200 Gb/s)
  • Max. bandwidth 1.6 Tb/s
  • High availability & load balancing

…achieved by leveraging Ethernet LAG (Link Aggregation). LACP (Link Aggregation Control Protocol) is used to establish the LAG and to verify connectivity…


Cable part numbers…

Cable           Speed   Type   Split   Length (m)
MC2207130       FDR     DAC    no      0.5, 1, 1.5, 2
MC220731V       FDR     AOC    no      3, 5, 10, 15, 20, 25, 30, 40, 50, 75, 100
MCP1600-E       EDR     DAC    no      0.5, 1, 1.5, 2, 2.5, 3, 4, 5
MFA1A00-E       EDR     AOC    no      3, 5, 10, 15, 20, 30, 50, 100
MCP1650-H       HDR     DAC    no      0.5, 1, 1.5, 2
MCP7H50-H       HDR     DAC    yes     1, 1.5, 2
MCA1J00-H       HDR     ACC    no      3, 4
MCA7J50-H       HDR     ACC    yes     3, 4
MFS1S00-HxxxE   HDR     AOC    no      3, 5, 10, 15, 20, 30, 50, 100, 130, 150
MFS1S50-HxxxE   HDR     AOC    yes     3, 5, 10, 15, 20, 30

LinkX product family for Mellanox cables and transceivers

  • DAC, (passive) direct attach copper
    • low price
    • up to 2 meters (at HDR)
    • simple copper wires
    • no electronics
    • consume (almost) zero power
    • lowest latency
  • ACC, active copper cables (aka active DAC)
    • consumes 4 to 5 Watts
    • include signal-boosting integrated circuits (ICs)
    • extend the reach up to 4 meters (at 200G HDR)
  • AOC, active optical cables
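
The reach limits in the list above translate into a simple rule of thumb for HDR links. A minimal sketch, assuming only the limits stated above (DAC up to 2 m, ACC up to 4 m, AOC beyond); the function name `hdr_cable_type` is hypothetical, and price or power budget may justify a different choice.

```shell
# Suggest a LinkX cable type for an HDR link of the given length in meters.
# Limits are taken from the list above: DAC up to 2 m, ACC up to 4 m.
hdr_cable_type() {
    awk -v len="$1" 'BEGIN {
        len = len + 0                      # force numeric comparison
        if (len <= 0)      print "invalid"
        else if (len <= 2) print "DAC"
        else if (len <= 4) print "ACC"
        else               print "AOC"
    }'
}

for len in 1.5 3 10; do
    echo "${len} m -> $(hdr_cable_type "$len")"
done
```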

DAC-in-a-Rack …connects servers and storage to top-of-rack (TOR) switches

(passive/active) splitter cables

  • typically used to connect HDR100 HCAs to an HDR TOR switch
  • enable a 40-port HDR switch to support 80 ports of 100G HDR100
  • DAC …1:2 splitter copper breakout cable… (QSFP56 to 2xQSFP56)
  • AOC …1:2 splitter optical breakout cable… (QSFP56 to 2xQSFP56)


MFT (Mellanox firmware tools)…

Installation …MLNX_OFED includes the required packages…

dnf install -y mft kmod-kernel-mft-mlnx usbutils

…packages include an init-script…

systemctl start mst.service


…devices can be accessed by their PCI ID

# ...find PCI ID using lspci
>>> lspci -d 15b3:
21:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]

# ...query the firmware on a device using the PCI ID
>>> mstflint -d 21:00.0 query
Image type:            FS4
FW Version:            20.32.1010
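
On hosts with several adapters the two steps above can be combined into a loop. A minimal sketch, assuming `lspci` and `mstflint` are installed; the function names `pci_addresses` and `query_all_firmware` are hypothetical helpers.

```shell
# Print the PCI address (first column) of each `lspci` output line on stdin.
pci_addresses() {
    awk 'NF { print $1 }'
}

# Query the firmware of every Mellanox device (PCI vendor ID 15b3).
# Requires lspci and mstflint; call it on a host that has such devices.
query_all_firmware() {
    for addr in $(lspci -d 15b3: | pci_addresses); do
        echo "== $addr"
        mstflint -d "$addr" query | grep -e 'FW Version' -e 'PSID'
    done
}

# Parsing demo with the lspci line shown above
printf '%s\n' '21:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]' \
    | pci_addresses   # prints 21:00.0
```

`query_all_firmware` is only defined here, not called, so the snippet is safe to source on a machine without adapters.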

…when the IB driver is loaded, access a device by its device name…

# ...find the device name
>>> ibv_devinfo | grep hca_id
hca_id: mlx5_0

# ...query the firmware on a device using the device name
>>> mstflint -d mlx5_0 query

PSID (Parameter-Set IDentification) of the channel adapter…

>>> mlxfwmanager --query | grep PSID
  PSID:             SM_2121000001000
  • …PSID is used to download the correct firmware for a device
  • …PSIDs of Mellanox cards start with MT_; SM_ or AS_ indicate vendor re-labeled cards
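
The prefix rule above is easy to check in a script before downloading firmware. A minimal sketch; the function name `psid_origin` and its output labels are hypothetical.

```shell
# Classify a PSID by its prefix, following the rule stated above.
psid_origin() {
    case "$1" in
        MT_*)      echo "mellanox" ;;
        SM_*|AS_*) echo "relabeled" ;;
        *)         echo "unknown" ;;
    esac
}

psid_origin MT_0000000222      # prints "mellanox"
psid_origin SM_2121000001000   # prints "relabeled"
```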


Reboot for configuration changes to take effect

Change device configurations without reburning the firmware…

# ...only a single device is present...
mlxconfig query | grep LINK
         PHY_COUNT_LINK_UP_DELAY             DELAY_NONE(0)   
         LINK_TYPE_P1                        IB(1)           
         KEEP_ETH_LINK_UP_P1                 True(1)         
         KEEP_IB_LINK_UP_P1                  False(0)        
         KEEP_LINK_UP_ON_BOOT_P1             True(1)         
         KEEP_LINK_UP_ON_STANDBY_P1          False(0)        
         AUTO_POWER_SAVE_LINK_DOWN_P1        False(0)        
         UNKNOWN_UPLINK_MAC_FLOOD_P1         False(0)
# ...set configuration
mlxconfig -d $device set KEEP_IB_LINK_UP_P1=0 KEEP_LINK_UP_ON_BOOT_P1=1

Reset the device configuration to default…

mlxconfig -d $device reset
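
Since a changed setting only takes effect after a reboot, it can be worth checking the current value before calling `set`. A minimal sketch, assuming the `mlxconfig query` output format shown above; the function name `mlxconfig_value` is hypothetical, and the sample text stands in for real query output.

```shell
# Print the current value of one parameter from `mlxconfig query` output
# read on stdin, e.g. "False(0)".
mlxconfig_value() {
    awk -v key="$1" '$1 == key { print $2; exit }'
}

# Sample lines copied from the query output above
sample='         LINK_TYPE_P1                        IB(1)
         KEEP_IB_LINK_UP_P1                  False(0)
         KEEP_LINK_UP_ON_BOOT_P1             True(1)'

current=$(printf '%s\n' "$sample" | mlxconfig_value KEEP_IB_LINK_UP_P1)
case "$current" in
    *"(0)") echo "KEEP_IB_LINK_UP_P1 already 0, nothing to set" ;;
    *)      echo "would run: mlxconfig -d \$device set KEEP_IB_LINK_UP_P1=0" ;;
esac
```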

Upgrade Firmware

mlxfwmanager …update HCA firmware [1]

# get the current firmware version
mlxfwmanager --query | grep FW

# update the firmware
>>> mlxfwmanager --online -u
  PSID:             MT_0000000222
  Versions:         Current        Available     
     FW             20.32.1010     20.35.1012    
     PXE            3.6.0502       3.6.0804      
     UEFI           14.25.0017     14.28.0015
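
The Versions table above can be parsed to decide whether an update is pending at all. A minimal sketch, assuming the table layout shown above; the function names `version_lt` and `fw_versions` are hypothetical, and the sample text stands in for real `mlxfwmanager` output.

```shell
# Succeed (exit 0) if dotted version $1 is older than $2.
version_lt() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            if (x[i] + 0 < y[i] + 0) exit 0
            if (x[i] + 0 > y[i] + 0) exit 1
        }
        exit 1   # versions are equal
    }'
}

# Print "current available" for the FW row of the Versions table on stdin.
fw_versions() {
    awk '$1 == "FW" { print $2, $3; exit }'
}

sample='  Versions:         Current        Available
     FW             20.32.1010     20.35.1012
     PXE            3.6.0502       3.6.0804'

versions=$(printf '%s\n' "$sample" | fw_versions)
cur=${versions% *}      # text before the space
avail=${versions#* }    # text after the space
if version_lt "$cur" "$avail"; then
    echo "update available: $cur -> $avail"
else
    echo "firmware is current: $cur"
fi
```

The same `version_lt` check also flags a downgrade, where the image version is older than the version on flash.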

Decouple from the login terminal (useful for automation over SSH)

log_file=/var/log/mlxfwmanager-$(date +%Y%m%d).log
# append stdout and stderr to the log file, then detach from the shell
nohup mlxfwmanager --no-progress --online --update --yes \
    >> "$log_file" 2>&1 & disown

Downgrade Firmware

Identify the device and download the correct firmware [2] archive:

# identify the device 
>>> mlxfwmanager --query | grep -e Part -e PSID
  Part Number:      MCX653105A-ECA_Ax
  PSID:             MT_0000000222
# download the firmware
>>> wget
# extract the archive
>>> unzip fw-ConnectX6-rel-20_41_1000-MCX653105A-ECA_Ax-UEFI-14.34.12-FlexBoot-3.7.400.bin.zi

Identify the device path…

  • …using mst …make sure to select the correct device if there are multiple
  • …format of device name is /dev/mst/mt<dev_id>_pci{_cr0|conf0}
>>> mst start
# find the device name under the column MST
>>> mst status -v
MST modules:
    MST PCI module is not loaded
    MST PCI configuration module loaded 
PCI devices:
DEVICE_TYPE             MST                           PCI       RDMA            NET                                     NUMA  
ConnectX6(rev:0)        /dev/mst/mt4123_pciconf0      81:00.0   mlx5_0          net-ib0                                 1     
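
With multiple adapters, picking the MST path by PCI address avoids burning the wrong device. A minimal sketch, assuming the column layout of `mst status -v` shown above; the function name `mst_device` is hypothetical, and the sample text stands in for real output.

```shell
# Print the MST device path(s) from `mst status -v` output on stdin.
# An optional argument restricts the match to one PCI address.
mst_device() {
    awk -v pci="${1:-}" \
        '$2 ~ /^\/dev\/mst\// && (pci == "" || $3 == pci) { print $2 }'
}

sample='DEVICE_TYPE             MST                           PCI       RDMA            NET                                     NUMA
ConnectX6(rev:0)        /dev/mst/mt4123_pciconf0      81:00.0   mlx5_0          net-ib0                                 1'

printf '%s\n' "$sample" | mst_device 81:00.0   # prints /dev/mst/mt4123_pciconf0
```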

Burn the firmware image:

>>> flint -d /dev/mst/mt4123_pciconf0 -i fw-ConnectX6-rel-20_41_1000-MCX653105A-ECA_Ax-UEFI-14.34.12-FlexBoot-3.7.400.bin burn             

    Current FW version on flash:  20.43.1014
    New FW version:               20.41.1000
    Note: The new FW version is older than the current FW version on flash.

 Do you want to continue ? (y/n) [n] : y

Writing Boot image component -   OK
Restoring signature                     - OK
-I- To load new FW run mlxfwreset or reboot machine.


…work against the cables connected to the devices on the machine…

  • mst cable add…discover the cables that are connected to the local devices
  • mlxcables…access the cables…
    • …get cable IDs…
    • …upgrade firmware on the cables
>>> mlxcables -q
Cable name    : mt4123_pciconf0_cable_0
Identifier      : QSFP28 (11h)
Technology      : Copper cable unequalized (a0h)
Compliance      : 50GBASE-CR, ... HDR,EDR,FDR,QDR,DDR,SDR
Vendor          : Mellanox        
Serial number   : MT2214VS04725   
Part number     : MCP7H50-H01AR30 
Length [m]      : 1 m
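
The part number reported by `mlxcables` can be matched against the cable table earlier in these notes. A minimal sketch covering only the part-number prefixes listed in that table; the function names `cable_part_number` and `cable_info` are hypothetical.

```shell
# Print the part number from `mlxcables -q` output read on stdin.
cable_part_number() {
    awk -F':' '/^Part number/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Map a cable part number to "speed type split", using the part-number
# prefixes from the cable table in these notes.
cable_info() {
    case "$1" in
        MC2207130*) echo "FDR DAC no" ;;
        MC220731V*) echo "FDR AOC no" ;;
        MCP1600-E*) echo "EDR DAC no" ;;
        MFA1A00-E*) echo "EDR AOC no" ;;
        MCP1650-H*) echo "HDR DAC no" ;;
        MCP7H50-H*) echo "HDR DAC yes" ;;
        MCA1J00-H*) echo "HDR ACC no" ;;
        MCA7J50-H*) echo "HDR ACC yes" ;;
        MFS1S00-H*) echo "HDR AOC no" ;;
        MFS1S50-H*) echo "HDR AOC yes" ;;
        *)          echo "unknown" ;;
    esac
}

# Line copied from the mlxcables output above
pn=$(printf '%s\n' 'Part number     : MCP7H50-H01AR30 ' | cable_part_number)
echo "$pn: $(cable_info "$pn")"   # prints "MCP7H50-H01AR30: HDR DAC yes"
```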


  1. Firmware Downloads, Nvidia

  2. Firmware Downloads, Nvidia