Agilio SmartNIC Basic Firmware User Guide¶
Release Notes¶
The latest version of the Basic NIC Firmware is 2.1.16, released 2018/11/04.
Note
For optimal performance with the Basic NIC Firmware, ensure that a kernel newer than 4.15 (or the RHEL equivalent) or the out-of-tree kmods driver is used. See Appendix B: Installing the Out-of-Tree NFP Driver.
Note
Rapidly loading and unloading the kernel driver module can result in the card becoming unresponsive. Such a situation requires a host reboot to resolve.
Note
In this firmware release, tunnel offloading capabilities are disabled on VF ports.
Release History¶
2.1.16¶
- New SRIOV capable firmware file
- Support for 48 single-queue VFs on physical port 0
- Tunnel inner header RSS support for VXLAN and Geneve
- Improved performance
2.0.7¶
- TCP (Large) Segment Offload (TSO / LSO)
- TCP and UDP Receive Side Scaling (RSS) with CRC32 algorithm
- Receive and Transmit Checksum Offload
- Jumbo Frame Support up to 9216B Frames
The Agilio SmartNIC Architecture¶
The Agilio CX SmartNICs are based on the NFP-4000 and are available in low profile PCIe and OCM v2 NIC form factors suitable for use in COTS servers. The NFP-4000 is a 60-core processor with eight cooperatively multithreaded threads per core. The flow processing cores have an instruction set that is optimized for networking. This ensures an unrivalled level of flexibility within the data plane while maintaining performance. Offload of the OVS datapath can also be enabled without a server reboot.
Further extensions such as BPF offload, SR-IOV or custom offloads can be added without any hardware modifications or even server reboot. These extensions are not covered by this guide, which deals with the basic and OVS-TC offload firmware only.
The basic firmware offers a wide variety of features including RSS (Receive Side Scaling), Checksum Offload (IPv4/IPv6, TCP, UDP, Tx/Rx), LSO (Large Segmentation Offload), IEEE 802.3ad, Link flow control, 802.1AX Link Aggregation, etc. For more details regarding currently supported features refer to the section Basic Firmware Features.
Hardware Installation¶
This user guide focuses on x86 deployments of Agilio hardware. As detailed in Validating the Driver, Netronome’s Agilio SmartNIC firmware is now upstreamed with certain kernel versions of Ubuntu and RHEL/CentOS. Whilst out-of-tree driver source files are available and build/installation instructions are included in Appendix A: Netronome Repositories, it is highly recommended where possible to make use of the upstreamed drivers. Wherever applicable, separate instructions for RHEL/CentOS and Ubuntu are provided.
Identification¶
In a running system the assembly ID and serial number of a PCI device may be determined using the ethtool debug interface. This requires knowledge of the physical function network device identifier, or <netdev>, assigned to the SmartNIC under consideration. Consult the section SmartNIC netdev interfaces for methods on determining this identifier. The interface name <netdev> can otherwise be identified using the ip link command. The following shell snippet illustrates this method for some particular netdev whose name is cast as the argument $1:
#!/bin/bash
DEVICE=$1
ethtool -W ${DEVICE} 0
DEBUG=$(ethtool -w ${DEVICE} data /dev/stdout | strings)
SERIAL=$(echo "${DEBUG}" | grep "^SN:")
ASSY=$(echo ${SERIAL} | grep -oE AMDA[0-9]{4})
echo ${SERIAL}
echo Assembly: ${ASSY}
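Assuming the snippet above is saved as, say, nfp-serial.sh (the file name is arbitrary), it can be invoked with the netdev name as its only argument:
# chmod +x nfp-serial.sh
# ./nfp-serial.sh enp4s0np0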
Note
The strings command is commonly provided by the binutils package. This can be installed via yum install binutils or apt-get install binutils, depending on your distribution.
Physical installation¶
Physically install the SmartNIC in the host server and ensure proper cooling e.g. airflow over card. Ensure the PCI slot is at least Gen3 x8 (can be placed in Gen3 x16 slot). Once installed, power up the server and open a terminal. Further details and support about the hardware installation process can be reviewed in the Hardware User Manual available from Netronome’s support site.
Validation¶
Use the following command to validate that the SmartNIC is correctly detected by the host server and to identify its PCI address. 19ee is the Netronome-specific PCI vendor identifier:
# lspci -Dnnd 19ee:4000; lspci -Dnnd 19ee:6000
0000:02:00.0 Ethernet controller [0200]: Netronome Systems, Inc. Device [19ee:4000]
Note
The lspci command is commonly provided by the pciutils package. This can be installed via yum install pciutils or apt-get install pciutils, depending on your distribution.
Validating the Driver¶
The Netronome SmartNIC physical function driver with support for OVS-TC offload is included in Linux 4.13 and later kernels. The list of minimum required operating system distributions and their respective kernels which include the nfp driver are as follows:
Operating System | Kernel package version
---|---
RHEL/CentOS 7.4+ | default
Ubuntu 16.04.4 LTS | default
In order to upgrade Ubuntu 16.04.0 - 16.04.3 to a supported version, the following commands must be run:
# apt-get update
# apt-get upgrade
# apt-get dist-upgrade
Confirm Upstreamed NFP Driver¶
To confirm that your current Operating System contains the upstreamed nfp module:
# modinfo nfp | head -3
filename:
/lib/modules/<kernel package version>/kernel/drivers/net/ethernet/netronome/nfp/nfp.ko.xz
description: The Netronome Flow Processor (NFP) driver.
license: GPL
Note
If the module is not found in your current kernel, refer to Appendix B: Installing the Out-of-Tree NFP Driver for instructions on installing the out-of-tree NFP driver, or simply upgrade your distribution and kernel version to include the upstreamed drivers.
Confirm that the NFP Driver is Loaded¶
Use lsmod to list the loaded driver modules and use grep to match the expression for the NFP drivers:
# lsmod | grep nfp
nfp 161364 0
If the NFP driver is not loaded, run the following command to manually load the module:
# modprobe nfp
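To have the nfp module loaded automatically at boot on systemd-based distributions, an entry can be added under /etc/modules-load.d (a minimal sketch; the file name is arbitrary):
# echo nfp > /etc/modules-load.d/nfp.conf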
SmartNIC netdev interfaces¶
The agilio-naming-policy package ensures consistent naming of Netronome SmartNIC network interfaces. Please note that this package is optional and not required if your distribution has a sufficiently new systemd installation.
Please refer to Appendix A: Netronome Repositories on how to configure the Netronome repository applicable to your distribution. When the repository has been successfully enabled install the naming package using the commands below.
Ubuntu:
# apt-get install agilio-naming-policy
CentOS/RHEL:
# yum install agilio-naming-policy
At nfp driver initialization new netdev interfaces will be created:
# ip link
4: enp6s0np0s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:15:4d:13:01:db brd ff:ff:ff:ff:ff:ff
5: enp6s0np0s1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:15:4d:13:01:dd brd ff:ff:ff:ff:ff:ff
6: enp6s0np0s2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:15:4d:13:01:de brd ff:ff:ff:ff:ff:ff
7: enp6s0np0s3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:15:4d:13:01:df brd ff:ff:ff:ff:ff:ff
8: enp6s0np1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 00:15:4d:13:01:dc brd ff:ff:ff:ff:ff:ff
Note
Netdev naming may vary depending on your Linux distribution and configuration, e.g. enpAsXnpYsZ, pXpY.
To confirm the names of the interfaces, view the contents of /sys/bus/pci/devices/<pci addr>/net, using the PCI address obtained in Hardware Installation, e.g.
#!/bin/bash
PCIA=$(lspci -d 19ee:4000 | awk '{print $1}' | xargs -Iz echo 0000:z)
echo $PCIA | tr ' ' '\n' | xargs -Iz echo "ls /sys/bus/pci/devices/z/net" | bash
The output of such a script would be similar to:
enp6s0np0s0 enp6s0np0s1 enp6s0np0s2 enp6s0np0s3 enp6s0np1
In the worst case scenario, netdev types can also be discovered by reading the kernel logs.
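For example, messages logged by the nfp driver at probe time can be filtered from the kernel ring buffer (exact output will vary by card, driver and firmware version):
# dmesg | grep -i nfp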
Support for biosdevname¶
Netronome NICs support biosdevname netdev naming with recent versions of the utility, circa December 2018, e.g. RHEL 8.0 onwards. There are some notable points to be aware of:
- Whenever an unsupported netdev is considered for naming, the biosdevname naming will be skipped and the next inline naming scheme will take preference, e.g. the systemd naming policies.
- Netdevs in breakout mode are not supported for naming.
- VF netdevs will still be subject to biosdevname naming irrespective of the breakout mode of other netdevs.
- When using an older version of the biosdevname utility, users will observe inconsistent naming of netdevs on multiport NICs, i.e. one netdev may be named according to the biosdevname scheme and another according to systemd schemes.
To disable biosdevname, users can add biosdevname=0 to the kernel command line.
Refer to the online biosdevname documentation for more details about the naming policy convention that will be applied.
Validating the Firmware¶
Netronome SmartNICs are fully programmable devices and thus depend on the driver to load firmware onto the device at runtime. It is important to note that the functionality of the SmartNIC significantly depends on the firmware loaded. The firmware files should be present in the following directory (contents may vary depending on the installed firmware):
# ls -ogR --time-style="+" /lib/firmware/netronome/
/lib/firmware/netronome/:
total 8
drwxr-xr-x. 2 4096 flower
drwxr-xr-x. 2 4096 nic
lrwxrwxrwx 1 31 nic_AMDA0081-0001_1x40.nffw -> nic/nic_AMDA0081-0001_1x40.nffw
lrwxrwxrwx 1 31 nic_AMDA0081-0001_4x10.nffw -> nic/nic_AMDA0081-0001_4x10.nffw
lrwxrwxrwx 1 31 nic_AMDA0096-0001_2x10.nffw -> nic/nic_AMDA0096-0001_2x10.nffw
lrwxrwxrwx 1 31 nic_AMDA0097-0001_2x40.nffw -> nic/nic_AMDA0097-0001_2x40.nffw
lrwxrwxrwx 1 36 nic_AMDA0097-0001_4x10_1x40.nffw -> nic/nic_AMDA0097-0001_4x10_1x40.nffw
lrwxrwxrwx 1 31 nic_AMDA0097-0001_8x10.nffw -> nic/nic_AMDA0097-0001_8x10.nffw
lrwxrwxrwx 1 36 nic_AMDA0099-0001_1x10_1x25.nffw -> nic/nic_AMDA0099-0001_1x10_1x25.nffw
lrwxrwxrwx 1 31 nic_AMDA0099-0001_2x10.nffw -> nic/nic_AMDA0099-0001_2x10.nffw
lrwxrwxrwx 1 31 nic_AMDA0099-0001_2x25.nffw -> nic/nic_AMDA0099-0001_2x25.nffw
lrwxrwxrwx 1 34 pci-0000:04:00.0.nffw -> flower/nic_AMDA0097-0001_2x40.nffw
lrwxrwxrwx 1 34 pci-0000:06:00.0.nffw -> flower/nic_AMDA0096-0001_2x10.nffw
/lib/firmware/netronome/flower:
total 11692
lrwxrwxrwx. 1 17 nic_AMDA0081-0001_1x40.nffw -> nic_AMDA0097.nffw
lrwxrwxrwx. 1 17 nic_AMDA0081-0001_4x10.nffw -> nic_AMDA0097.nffw
lrwxrwxrwx. 1 17 nic_AMDA0096-0001_2x10.nffw -> nic_AMDA0096.nffw
-rw-r--r--. 1 3987240 nic_AMDA0096.nffw
lrwxrwxrwx. 1 17 nic_AMDA0097-0001_2x40.nffw -> nic_AMDA0097.nffw
lrwxrwxrwx. 1 17 nic_AMDA0097-0001_4x10_1x40.nffw -> nic_AMDA0097.nffw
lrwxrwxrwx. 1 17 nic_AMDA0097-0001_8x10.nffw -> nic_AMDA0097.nffw
-rw-r--r--. 1 3988184 nic_AMDA0097.nffw
lrwxrwxrwx. 1 17 nic_AMDA0099-0001_2x10.nffw -> nic_AMDA0099.nffw
lrwxrwxrwx. 1 17 nic_AMDA0099-0001_2x25.nffw -> nic_AMDA0099.nffw
-rw-r--r--. 1 3990552 nic_AMDA0099.nffw
/lib/firmware/netronome/nic:
total 12220
-rw-r--r--. 1 1380496 nic_AMDA0081-0001_1x40.nffw
-rw-r--r--. 1 1389760 nic_AMDA0081-0001_4x10.nffw
-rw-r--r--. 1 1385608 nic_AMDA0096-0001_2x10.nffw
-rw-r--r--. 1 1385664 nic_AMDA0097-0001_2x40.nffw
-rw-r--r--. 1 1391944 nic_AMDA0097-0001_4x10_1x40.nffw
-rw-r--r--. 1 1397880 nic_AMDA0097-0001_8x10.nffw
-rw-r--r--. 1 1386616 nic_AMDA0099-0001_1x10_1x25.nffw
-rw-r--r--. 1 1385608 nic_AMDA0099-0001_2x10.nffw
-rw-r--r--. 1 1386368 nic_AMDA0099-0001_2x25.nffw
The NFP driver will search for firmware in /lib/firmware/netronome. Firmware is searched for in the following order, and the first firmware to be successfully found and loaded is used by the driver:
1. serial-<SERIAL>.nffw
2. pci-<PCI_ADDRESS>.nffw
3. nic-<ASSEMBLY-TYPE>_<BREAKOUTxMODE>.nffw
This search is logged by the kernel when the driver is loaded. For example:
# dmesg | grep -A 4 nfp.*firmware
[ 3.260788] nfp 0000:04:00.0: nfp: Looking for firmware file in order of priority:
[ 3.260810] nfp 0000:04:00.0: nfp: netronome/serial-00-15-4d-13-51-0c-10-ff.nffw: not found
[ 3.260820] nfp 0000:04:00.0: nfp: netronome/pci-0000:04:00.0.nffw: not found
[ 3.262138] nfp 0000:04:00.0: nfp: netronome/nic_AMDA0097-0001_2x40.nffw: found, loading...
The version of the loaded firmware for a particular <netdev> interface, as found in SmartNIC netdev interfaces (for example enp4s0), or an interface’s port <netdev port> (e.g. enp4s0np0), can be displayed with the ethtool command:
# ethtool -i <netdev/netdev port>
driver: nfp
version: 3.10.0-862.el7.x86_64 SMP mod_u
firmware-version: 0.0.3.5 0.22 nic-2.0.4 nic
expansion-rom-version:
bus-info: 0000:04:00.0
Firmware versions are displayed in order: NFD version, NSP version, APP FW version, driver APP. The specific output above shows that basic NIC firmware is running on the card, as indicated by “nic” in the firmware-version field.
Upgrading the firmware¶
The preferred method of upgrading Agilio firmware is via the Netronome repositories; however, if this is not possible, the corresponding installation packages can be obtained from Netronome Support (https://help.netronome.com).
Upgrading firmware via the Netronome repository¶
Please refer to Appendix A: Netronome Repositories on how to configure the Netronome repository applicable to your distribution. When the repository has been successfully added, install the agilio-nic-firmware package using the commands below.
Ubuntu:
# apt-get install agilio-nic-firmware
# rmmod nfp; modprobe nfp
# update-initramfs -u
CentOS/RHEL:
# yum install agilio-nic-firmware
# rmmod nfp; modprobe nfp
# dracut -f
Upgrading firmware from package installations¶
The latest firmware can be obtained at the downloads area of the Netronome Support site (https://help.netronome.com).
Install the packages provided by Netronome Support using the commands below.
Ubuntu:
# dpkg -i agilio-nic-firmware-*.deb
# rmmod nfp; modprobe nfp
# update-initramfs -u
CentOS/RHEL:
# yum install -y agilio-nic-firmware-*.rpm
# rmmod nfp; modprobe nfp
# dracut -f
Using the Linux Driver¶
Configuring Interface Media Mode¶
The following sections detail the configuration of the SmartNIC netdev interfaces.
Note
For older kernels that do not support the configuration methods outlined below, please refer to Appendix C: Working with Board Support Package on how to make use of the BSP toolset to configure interfaces.
Configuring interface link-speed¶
The following steps explain how to change between 10G mode and 25G mode on Agilio CX 2x25GbE cards. Port speeds must be changed in order: p0 must be set to 10G before p1 may be set to 10G.
Down the respective interface(s):
# ip link set dev enp4s0np0 down
Set interface link-speed to 10G:
# ethtool -s enp4s0np0 speed 10000
Set interface link-speed to 25G:
# ethtool -s enp4s0np0 speed 25000
Reload driver for changes to take effect:
# rmmod nfp; modprobe nfp
Note
The settings above only apply to Agilio CX 25G SmartNICs. With older drivers/firmware, a system reboot may be required for changes to take effect.
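Putting the steps together, a sketch for switching both ports of a 2x25GbE card to 10G mode, assuming the netdevs are named enp4s0np0 and enp4s0np1 (note p0 is set before p1, per the ordering requirement above):
# ip link set dev enp4s0np0 down
# ip link set dev enp4s0np1 down
# ethtool -s enp4s0np0 speed 10000
# ethtool -s enp4s0np1 speed 10000
# rmmod nfp; modprobe nfp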
Configuring interface Maximum Transmission Unit (MTU)¶
The MTU of interfaces can temporarily be set using the iproute2 or ifconfig tools. Note that this change will not persist. Setting this via Network Manager, or another appropriate OS configuration tool, is recommended.
Set interface MTU to 9000 bytes:
# ip link set dev <netdev port> mtu 9000
It is the responsibility of the user or the orchestration layer to set appropriate MTU values when handling jumbo frames or utilizing tunnels. For example, if packets sent from a VM are to be encapsulated on the card and egress a physical port, then the MTU of the VF should be set to lower than that of the physical port to account for the extra bytes added by the additional header.
If a setup is expected to see fallback traffic between the SmartNIC and the kernel then the user should also ensure that the PF MTU is appropriately set to avoid unexpected drops on this path.
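As an illustration, VXLAN encapsulation adds roughly 50 bytes of outer headers; a sketch that keeps a hypothetical VF netdev (the VF interface name will vary by system) 50 bytes below the physical port MTU:
# ip link set dev <netdev port> mtu 9000
# ip link set dev <vf netdev> mtu 8950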
Configuring FEC modes¶
Agilio CX 2x25GbE SmartNICs support FEC mode configuration, e.g. Auto, Firecode BaseR, Reed Solomon and Off modes. Each physical port’s FEC mode can be set independently via the ethtool command. To view the currently supported FEC modes of the interface use the following:
# ethtool <netdev>
Settings for <netdev>:
Supported ports: [ FIBRE ]
Supported link modes: Not reported
Supported pause frame use: No
Supports auto-negotiation: No
Supported FEC modes: None BaseR RS
Advertised link modes: Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: BaseR RS
Speed: 25000Mb/s
Duplex: Full
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Link detected: yes
One can see above which FEC modes are supported for this interface. Note that the Agilio CX 2x25GbE SmartNIC used for the example above only supports Firecode BaseR FEC mode on ports that are forced to 10G speed.
Note
Ethtool FEC support is only available in kernel 4.14 and newer, or RHEL/CentOS 7.5 and equivalent distributions. The Netronome upstream kernel driver provides ethtool FEC support from kernel 4.15. Furthermore, the SmartNIC NVRAM version must be at least 020025.020025.02006e to support ethtool FEC get/set operations.
To determine the current SmartNIC NVRAM version, examine the system log:
# dmesg | grep 'nfp.*BSP'
[2387.682046] nfp 0000:82:00.0: BSP: 020025.020025.020072
This example lists a version of 020025.020025.020072, which is sufficient to support ethtool FEC mode configuration. To update your SmartNIC NVRAM flash, refer to Appendix E: Updating NFP Flash or contact Netronome support.
If the SmartNIC NVRAM or the kernel does not support ethtool modification of FEC modes, no supported FEC modes will be listed in the ethtool output for the port. This could be because of an outdated kernel version or an unsupported distribution (e.g. Ubuntu 16.04 irrespective of the kernel version):
# ethtool enp130s0np0
Settings for enp130s0np0:
...
Supported FEC modes: None
To show the currently active FEC mode for either the <netdev> or its physical port(s) <netdev port>:
# ethtool --show-fec <netdev>/<netdev port>
FEC parameters for <netdev>:
Configured FEC encodings: Auto Off BaseR RS
Active FEC encoding: Auto
To force the FEC mode for a particular port, autonegotiation must be disabled with the following:
# ip link set enp130s0np0 down
# ethtool -s enp130s0np0 autoneg off
# ip link set enp130s0np0 up
Note
In order to change the autonegotiation configuration the port must be down.
Note
Changing the autonegotiation configuration will not affect the SmartNIC port speed. Please see Configuring interface link-speed to adjust this setting.
To modify the FEC mode to Firecode BaseR:
# ethtool --set-fec <netdev port> encoding baser
Verify the newly selected mode:
# ethtool --show-fec enp130s0np0
FEC parameters for enp130s0np0:
Configured FEC encodings: Auto Off BaseR RS
Active FEC encoding: BaseR
To modify the FEC mode to Reed Solomon:
# ethtool --set-fec enp130s0np0 encoding rs
Verify the newly selected mode:
# ethtool --show-fec enp130s0np0
FEC parameters for enp130s0np0:
Configured FEC encodings: Auto Off BaseR RS
Active FEC encoding: RS
To turn FEC off:
# ethtool --set-fec enp130s0np0 encoding off
Verify the newly selected mode:
# ethtool --show-fec enp130s0np0
FEC parameters for enp130s0np0:
Configured FEC encodings: Auto Off BaseR RS
Active FEC encoding: Off
Revert back to the default Auto setting:
# ethtool --set-fec enp130s0np0 encoding auto
Finally verify the setting again:
# ethtool --show-fec enp130s0np0
FEC parameters for enp130s0np0:
Configured FEC encodings: Auto Off BaseR RS
Active FEC encoding: Auto
FEC and auto-negotiation settings are persisted on the SmartNIC across reboots.
Note
In this context setting the interface mode to auto specifies that the encoding scheme should be automatically determined if possible. It does not enable auto-negotiation of link speed between 10Gbps and 25Gbps.
Setting Interface Breakout Mode¶
The following commands only work on kernel versions 4.13 and later. If your kernel is older than 4.13 or you do not have devlink support enabled refer to the following section on configuring interfaces: Configure Media Settings.
Note
Breakout mode settings are only applicable to Agilio CX 40GbE and CX 2x40GbE SmartNICs.
Determine the card’s PCI address:
# lspci -Dkd 19ee:4000
0000:04:00.0 Ethernet controller: Netronome Systems, Inc. Device 4000
Subsystem: Netronome Systems, Inc. Device 4001
Kernel driver in use: nfp
Kernel modules: nfp
List the devices:
# devlink dev show
pci/0000:04:00.0
Split the first physical 40G port from 1x40G to 4x10G ports:
# devlink port split pci/0000:04:00.0/0 count 4
Split the second physical 40G port from 1x40G to 4x10G ports:
# devlink port split pci/0000:04:00.0/4 count 4
If the SmartNIC’s port is already configured in breakout mode (it has already been split) then devlink will respond with an argument error. Whenever changes to the port configuration are made, the original netdev(s) associated with the port will be removed from the system:
# dmesg | tail
[ 5696.432306] nfp 0000:04:00.0: nfp: Port #0 config changed, unregistering. Driver reload required before port will be operational again.
[ 6270.553902] nfp 0000:04:00.0: nfp: Port #4 config changed, unregistering. Driver reload required before port will be operational again.
The driver needs to be reloaded for the changes to take effect. Older driver/SmartNIC NVRAM versions may require a system reboot for changes to take effect. The driver communicates events related to port split/unsplit in the system logs. The driver may be reloaded with the following command:
# rmmod nfp; modprobe nfp
After reloading the driver, the netdevs associated with the split ports will be available for use:
# ip link show
...
68: enp4s0np0s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
69: enp4s0np0s1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
70: enp4s0np0s2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
71: enp4s0np0s3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
72: enp4s0np1s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
73: enp4s0np1s1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
74: enp4s0np1s2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
75: enp4s0np1s3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
Note
There is an ordering constraint to splitting and unsplitting the ports on Agilio CX 2x40GbE SmartNICs. The first physical 40G port cannot be split without the second physical port also being split, hence 1x40G + 4x10G is always invalid even if it’s only intended to be a transitional mode. The driver will reject such configurations.
Breakout mode persists on the SmartNIC across reboots. To revert back to the original 2x40G ports use the unsplit subcommand.
Unsplit Port 1:
# devlink port unsplit pci/0000:04:00.0/4
Unsplit Port 0:
# devlink port unsplit pci/0000:04:00.0/0
The NFP drivers will again have to be reloaded (rmmod nfp then modprobe nfp) for unsplit changes in the port configuration to take effect.
Confirming Connectivity¶
Allocating IP Addresses¶
Under RHEL/CentOS 7.5, the network configuration is managed by default using NetworkManager. The default configuration for unset interfaces is auto, which implies that an auto-configuration client is running on them. This means that any manual configuration made using ifconfig or iproute2 will be periodically erased.
Consult the NetworkManager documentation for detailed instructions. For example, if a connection is named ens1np0 (which corresponds to the physical port representor ens1np0 of the SmartNIC), the following commands will set the IPv4 address statically, set it to autostart on boot, and up the interface:
# nmcli c m <netdev port> ipv4.method manual
# nmcli c m <netdev port> ipv4.addresses 10.0.0.2/24
# nmcli c m <netdev port> connection.autoconnect yes
# nmcli c u <netdev port>
Alternatively, if the interface is not under the control of the distribution’s network management subsystem, iproute2 can be used to configure the port:
# assign IP address to interface
# ip address add 10.0.0.2/24 dev <netdev port>
# ip link set <netdev port> up
Pinging interfaces¶
After you have successfully assigned IP addresses to the NFP interfaces perform a standard ping test to confirm connectivity:
# ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=3 ttl=64 time=0.067 ms
64 bytes from 10.0.0.2: icmp_seq=4 ttl=64 time=0.062 ms
Basic Performance Test¶
iPerf is a basic traffic generator and network performance measuring tool that can be used to quickly determine the throughput achievable by a device.
Set IRQ affinity¶
Balance interrupts across available cores located on the NUMA node of the SmartNIC. A script to perform this action is available for download at https://raw.githubusercontent.com/Netronome/nfp-drv-kmods/master/tools/set_irq_affinity.sh
The source code of this script is also included at Appendix G: set_irq_affinity.sh Source
Example output:
# /nfp-drv-kmods/tools/set_irq_affinity.sh <netdev>
Device 0000:02:00.0 is on node 0 with cpus 0 1 2 3 4 5 6 7 8 9 20 21 22 23 24 25 26 27 28 29
IRQ 181 to CPU 0 (irq: 00,00000001 xps: 03,00030003)
IRQ 182 to CPU 1 (irq: 00,00000002 xps: 00,00000000)
IRQ 183 to CPU 2 (irq: 00,00000004 xps: 0c,000c000c)
IRQ 184 to CPU 3 (irq: 00,00000008 xps: 00,00000000)
IRQ 185 to CPU 4 (irq: 00,00000010 xps: 30,00300030)
IRQ 186 to CPU 5 (irq: 00,00000020 xps: 00,00000000)
IRQ 187 to CPU 6 (irq: 00,00000040 xps: c0,00c000c0)
IRQ 188 to CPU 7 (irq: 00,00000080 xps: 00,00000000)
Run iPerf Test¶
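Server¶
Start an iPerf server on the machine acting as the remote end of the test (a minimal invocation; iPerf listens on TCP port 5001 by default):
# iperf -s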
Client¶
Allocate an IP address in the same range as used by the server, then execute the following on the client to connect to the server and start running the test:
# iperf -c 10.0.0.1 -P 4
Example output of 1x40G link:
# iperf -c 10.0.0.1 -P 4
------------------------------------------------------------
Client connecting to 10.0.0.1, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 5] local 10.0.0.2 port 56938 connected with 10.0.0.1 port 5001
[ 3] local 10.0.0.2 port 56932 connected with 10.0.0.1 port 5001
[ 4] local 10.0.0.2 port 56934 connected with 10.0.0.1 port 5001
[ 6] local 10.0.0.2 port 56936 connected with 10.0.0.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 6] 0.0-10.0 sec 11.9 GBytes 10.3 Gbits/sec
[ 3] 0.0-10.0 sec 9.85 GBytes 8.46 Gbits/sec
[ 4] 0.0-10.0 sec 11.9 GBytes 10.2 Gbits/sec
[ 5] 0.0-10.0 sec 10.2 GBytes 8.75 Gbits/sec
[SUM] 0.0-10.0 sec 43.8 GBytes 37.7 Gbits/sec
Using iPerf3¶
iPerf3 can also be used to measure performance, however multiple instances have to be chained to properly create multiple threads:
On the server:
# iperf3 -s -p 5001 & iperf3 -s -p 5002 & iperf3 -s -p 5003 & iperf3 -s -p 5004 &
On the client:
# iperf3 -c 102.0.0.6 -i 30 -p 5001 & iperf3 -c 102.0.0.6 -i 30 -p 5002 & iperf3 -c 102.0.0.6 -i 30 -p 5003 & iperf3 -c 102.0.0.6 -i 30 -p 5004 &
Example output:
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-10.04 sec 0.00 Bytes 0.00 bits/sec sender
[ 5] 0.00-10.04 sec 9.39 GBytes 8.03 Gbits/sec receiver
[ 5] 10.00-10.04 sec 33.1 MBytes 7.77 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-10.04 sec 0.00 Bytes 0.00 bits/sec sender
[ 5] 0.00-10.04 sec 9.86 GBytes 8.44 Gbits/sec receiver
[ 5] 10.00-10.04 sec 53.6 MBytes 11.8 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-10.04 sec 0.00 Bytes 0.00 bits/sec sender
[ 5] 0.00-10.04 sec 11.9 GBytes 10.2 Gbits/sec receiver
[ 5] 10.00-10.04 sec 42.1 MBytes 9.43 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-10.04 sec 0.00 Bytes 0.00 bits/sec sender
[ 5] 0.00-10.04 sec 10.2 GBytes 8.70 Gbits/sec receiver
Total: 37.7 Gbits/sec
95.49% of 40GbE link
Basic Firmware Features¶
In this section ethtool will be used to view and configure SmartNIC interface parameters.
Setting Interface Settings¶
Unless otherwise stated, changing the interface settings detailed below will not require reloading of the NFP drivers for changes to take effect, unlike the interface breakouts described in Configuring Interface Media Mode.
Multiple Queues¶
The Physical Functions on a SmartNIC support multiple transmit and receive queues.
View current settings¶
The -l flag can be used to view the current queue/channel configuration, e.g.:
# ethtool -l ens1np0
Channel parameters for ens1np0:
Pre-set maximums:
RX: 20
TX: 20
Other: 2
Combined: 20
Current hardware settings:
RX: 0
TX: 12
Other: 2
Combined: 8
Configure Queues¶
The -L flag can be used to change the interface queue/channel configuration. The following parameters can be configured:
- rx: receive ring interrupts
- tx: transmit ring interrupts
- combined: interrupts that service both RX and TX rings
Note
RXR-only and TXR-only interrupts are not allowed.
In practice, use this formula to calculate parameters for the ethtool command: combined = min(RXR, TXR); rx = RXR - combined; tx = TXR - combined
To configure 8 combined interrupt servicing:
# ethtool -L <intf> rx 0 tx 0 combined 8
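As a worked example of the formula above, to service 16 RX rings and 8 TX rings (RXR=16, TXR=8): combined = min(16, 8) = 8, rx = 16 - 8 = 8 and tx = 8 - 8 = 0, giving:
# ethtool -L <intf> rx 8 tx 0 combined 8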
Receive side scaling (RSS)¶
RSS is a technology that focuses on effectively distributing received traffic to the spectrum of RX queues available on a given network interface based on a hash function.
View current hash parameters¶
The -n flag can be used to view the current RSS configuration; for example, by default:
# ethtool -n <netdev> rx-flow-hash tcp4
TCP over IPV4 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]
# ethtool -n <netdev> rx-flow-hash udp4
UDP over IPV4 flows use these fields for computing Hash flow key:
IP SA
IP DA
Set hash parameters¶
The -N flag can be used to change the interface RSS configuration, e.g.:
# ethtool -N <netdev> rx-flow-hash tcp4 sdfn
# ethtool -N <netdev> rx-flow-hash udp4 sdfn
The ethtool man pages can be consulted for full details of what RSS flags may be set.
Configuring the key¶
The -x flag can be used to view the current interface key configuration, and the -X flag to set it, for example:
# ethtool -x <intf>
# ethtool -X <intf> <hkey>
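The same flags also cover the RSS indirection table; for example, a sketch that spreads receive traffic evenly over the first 8 RX rings (assuming the driver exposes a configurable indirection table):
# ethtool -X <intf> equal 8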
View Interface Parameters¶
The -k flag can be used to view current interface configurations, for example using an Agilio CX 1x40GbE NIC which has an interface id enp4s0np0:
# ethtool -k <netdev>
Features for enp4s0np0:
rx-checksumming: off [fixed]
tx-checksumming: off
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: off
tx-scatter-gather: off [fixed]
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
tx-tcp-segmentation: off [fixed]
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp6-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off [fixed]
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: off [requested on]
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
busy-poll: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-sctp-segmentation: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: on
rx-udp_tunnel-port-offload: off [fixed]
Receive Checksumming (rx-checksumming)¶
When enabled, checksum calculation and error checking comparison for received packets is offloaded to the NFP SmartNIC’s flow processor rather than the host CPU.
To enable rx-checksumming:
# ethtool -K <netdev> rx on
To disable rx-checksumming:
# ethtool -K <netdev> rx off
Transmit Checksumming (tx-checksumming)¶
When enabled, checksum calculation for outgoing packets is offloaded to the NFP SmartNIC’s flow processor rather than the host’s CPU.
To enable tx-checksumming:
# ethtool -K <netdev> tx on
To disable tx-checksumming:
# ethtool -K <netdev> tx off
Scatter and Gather (scatter-gather)¶
When enabled, the NFP will use scatter and gather I/O, also known as Vectored I/O, which allows a single procedure call to sequentially read data from multiple buffers and write it to a single data stream. Only changes to the scatter-gather interface settings (from on to off or off to on) will produce a terminal output as shown below:
To enable scatter-gather:
# ethtool -K <netdev> sg on
Actual changes:
scatter-gather: on
tx-scatter-gather: on
generic-segmentation-offload: on
To disable scatter-gather:
# ethtool -K <netdev> sg off
Actual changes:
scatter-gather: off
tx-scatter-gather: off
generic-segmentation-offload: off
TCP Segmentation Offload (TSO)¶
When enabled, this parameter causes all functions related to the segmentation of TCP packets at egress to be offloaded to the NFP.
To enable tcp-segmentation-offload:
# ethtool -K <netdev> tso on
To disable tcp-segmentation-offload:
# ethtool -K <netdev> tso off
Generic Segmentation Offload (GSO)¶
This parameter offloads segmentation for transport layer protocol data units other than segments and datagrams for TCP/UDP respectively to the NFP. GSO operates at packet egress.
To enable generic-segmentation-offload:
# ethtool -K <netdev> gso on
To disable generic-segmentation-offload:
# ethtool -K <netdev> gso off
Generic Receive Offload (GRO)¶
This parameter enables software implementation of Large Receive Offload (LRO), which aggregates multiple packets at ingress into a large buffer before they are passed higher up the networking stack.
To enable generic-receive-offload:
# ethtool -K <netdev> gro on
To disable generic-receive-offload:
# ethtool -K <netdev> gro off
Note
Note that scripts that use ethtool -i <interface> to get bus-info will not work on representors, as this information is not populated for representor devices.
Installing, Configuring and Using DPDK¶
Enabling IOMMU¶
In order to use the NFP device with DPDK applications, a userspace I/O driver module such as vfio-pci or igb_uio has to be loaded.
Firstly, the machine has to have IOMMU enabled. The following link: http://dpdk-guide.gitlab.io/dpdk-guide/setup/binding.html contains some generic information about binding devices including the possibility of using UIO instead of VFIO, and also mentions the VFIO no-IOMMU mode.
Although DPDK focuses on avoiding interrupts, there is an option of a NAPI-like approach using RX interrupts. This is supported by the NFP PMD, and with VFIO it is possible to have an RX interrupt per queue (with UIO, just one interrupt per device). Because of this, VFIO is the preferred option.
Edit grub configuration file¶
This change is required for working with VFIO. However, when using kernels 4.5+, it is possible to work with VFIO in no-IOMMU mode. If your system comes with a kernel newer than 4.5, you can work with VFIO in no-IOMMU mode if desired by enabling it:
# echo 1 > /sys/module/vfio/parameters/enable_unsafe_noiommu_mode
For kernels older than 4.5, working with VFIO requires IOMMU to be enabled in the kernel at boot time. Add the following kernel parameters to /etc/default/grub to enable IOMMU:
GRUB_CMDLINE_LINUX="intel_iommu=on iommu=pt intremap=on"
It is worth noting that iommu=pt is not required for DPDK if VFIO is used, but it does avoid a performance impact in host drivers, such as the NFP netdev driver, when intel_iommu=on is enabled.
Implement changes¶
Apply the kernel parameter changes and reboot.
Ubuntu:
# update-grub2
# reboot
CentOS/RHEL:
# grub2-mkconfig -o /boot/grub2/grub.cfg
# reboot
DPDK sources with PF PMD support¶
PF PMD multiport support¶
The PMD can work with up to 8 ports on the same PF device. The number of available ports is firmware and hardware dependent, and the driver looks for a firmware symbol during initialization to know how many can be used.
DPDK apps work with ports, and a port is usually a PF or a VF PCI device. However, with the NFP PF multiport there is just one PF PCI device. Supporting this particular configuration requires the PMD to create ports in a special way, although once they are created, DPDK apps should be able to use them as normal PCI ports.
NFP ports belonging to the same PF can be seen inside PMD initialization with a suffix added to the PCI ID: wwww:xx:yy.z_port_n. For example, a PF with PCI ID 0000:03:00.0 and four ports is seen by the PMD code as:
0000:03:00.0_port_0
0000:03:00.0_port_1
0000:03:00.0_port_2
0000:03:00.0_port_3
Note
There are some limitations with multiport support: RX interrupts and device hot-plugging are not supported.
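As an illustrative sketch, once DPDK has been built (see Installing DPDK below) and the PF bound to a userspace driver (see Binding DPDK PF driver), the multiport PF can be exercised with testpmd by whitelisting the PF PCI device; the core list and port mask shown here are examples:
# cd $DPDK_DIR/$DPDK_TARGET/app
# ./testpmd -l 0-3 -w 0000:03:00.0 -- -i --portmask=0xf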
Installing DPDK¶
Physical Function PMD support has been upstreamed into DPDK 17.11. If an earlier version of DPDK is required, please refer to Appendix D: Obtaining DPDK-ns.
Install prerequisites:
# apt-get -y install gcc libnuma-dev make
Obtain DPDK sources:
# cd /usr/src/
# wget http://fast.dpdk.org/rel/dpdk-17.11.tar.xz
# tar xf dpdk-17.11.tar.xz
# export DPDK_DIR=/usr/src/dpdk-17.11
# cd $DPDK_DIR
Configure and install DPDK:
# export DPDK_TARGET=x86_64-native-linuxapp-gcc
# export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
# make install T=$DPDK_TARGET DESTDIR=install
Binding DPDK PF driver¶
Note
This section details the binding of DPDK-enabled drivers to the Physical Functions.
Attaching vfio-pci driver¶
Load vfio-pci driver module:
# modprobe vfio-pci
Unbind current drivers:
# PCIA=0000:$(lspci -d 19ee:4000 | awk '{print $1}')
# echo $PCIA > /sys/bus/pci/devices/$PCIA/driver/unbind
Bind vfio-pci driver:
# echo 19ee 4000 > /sys/bus/pci/drivers/vfio-pci/new_id
Attaching igb-uio driver¶
Load igb-uio driver module:
# modprobe uio
# DRKO=$(find $DPDK_DIR -iname 'igb_uio.ko' | head -1 )
# insmod $DRKO
Unbind current drivers:
# PCIA=0000:$(lspci -d 19ee:4000 | awk '{print $1}')
# echo $PCIA > /sys/bus/pci/devices/$PCIA/driver/unbind
Bind igb_uio driver:
# echo 19ee 4000 > /sys/bus/pci/drivers/igb_uio/new_id
Confirm attached driver¶
Confirm that the driver has been attached:
# lspci -kd 19ee:
01:00.0 Ethernet controller: Netronome Systems, Inc. Device 4000
Subsystem: Netronome Systems, Inc. Device 4001
Kernel driver in use: nfp
Kernel modules: nfp
01:08.0 Ethernet controller: Netronome Systems, Inc. Device 6003
Subsystem: Netronome Systems, Inc. Device 4001
Kernel driver in use: igb_uio
Kernel modules: nfp
Unbind driver¶
Determine card address:
# PCIA=$(lspci -d 19ee: | awk '{print $1}')
Unbind vfio-pci driver:
# echo 0000:$PCIA > /sys/bus/pci/drivers/vfio-pci/unbind
Unbind igb_uio driver:
# echo 0000:$PCIA > /sys/bus/pci/drivers/igb_uio/unbind
Using DPDK PF driver¶
Create default symlink¶
Note
This workaround applies to DPDK versions < 18.05.
In order to use the PF in DPDK applications, a symlink named nic_dpdk_default.nffw pointing to the applicable firmware needs to be created, e.g.
Navigate to firmware directory:
# cd /lib/firmware/netronome
For Agilio 2x40G:
# cp -s nic_AMDA0097-0001_2x40.nffw nic_dpdk_default.nffw
For Agilio 2x25G:
# cp -s nic_AMDA0099-0001_2x25.nffw nic_dpdk_default.nffw
For Agilio 2x40G w/ first port in breakout mode:
# cp -s nic_AMDA0097-0001_4x10_1x40.nffw nic_dpdk_default.nffw
The following table can be used to map product names to their codes:
SmartNIC | Code
---|---
Agilio CX 2x10G | AMDA0096
Agilio CX 2x25G | AMDA0099
Agilio CX 1x40G | AMDA0081
Agilio CX 2x40G | AMDA0097
Using SR-IOV¶
SR-IOV is a PCI feature that allows virtual functions (VFs) to be created from a physical function (PF). The VFs thus share the resources of a PF, while VFs remain isolated from each other. The isolated VFs are typically assigned to virtual machines (VMs) on the host. In this way, the VFs allow the VMs to directly access the PCI device, thereby bypassing the host kernel.
Installing the SR-IOV capable firmware¶
Before installing the SR-IOV capable firmware, ensure that SR-IOV is enabled in the BIOS of the host machine. If SR-IOV is disabled or unsupported by the motherboard/chipset being used, the kernel message log will contain a PCI SR-IOV:-12 error when trying to create a VF at a later stage. This can be queried using the dmesg tool.
The firmware currently running on the SmartNIC can be determined by the ethtool command. As an example, Ubuntu 18.04 LTS contains the following upstreamed firmware:
# ethtool -i enp2s0np0 | head -3
driver: nfp
version: 4.15.0-20-generic SMP mod_unloa
firmware-version: 0.0.3.5 0.22 nic-2.0.4 nic
From the above output, the upstreamed firmware is nic-2.0.4. The prefix nic indicates that the firmware implements the basic NIC functionality, and the suffix 2.0.4 indicates the firmware version.
Firmware sriov-2.1.x or greater provides SR-IOV capability. There are two methods by which the firmware can be obtained: either from the linux-firmware package or from the support site.
The linux-firmware package¶
The SR-IOV capable firmware has been upstreamed into the linux-firmware package. For rpm packages, this is available from linux-firmware 20181008-88 onwards. As of Ubuntu 18.10, the linux-firmware Debian package does not yet contain SR-IOV capable firmware.
Ensure that the latest linux-firmware package is installed.
For RHEL / Fedora / CentOS:
# yum update linux-firmware
The linux-firmware package will store the Netronome firmware files in the /lib/firmware/netronome directory. This directory contains symbolic links which point to the actual firmware files. The actual firmware files are located in subdirectories, with each subdirectory related to a different SmartNIC functionality. Consider the following tree structure:
# tree /lib/firmware/netronome
/lib/firmware/netronome/
├── flower
│ ├── nic_AMDA0081-0001_1x40.nffw -> nic_AMDA0081.nffw
│ ├── nic_AMDA0081-0001_4x10.nffw -> nic_AMDA0081.nffw
│ ├── ...
├── nic
│ ├── nic_AMDA0058-0011_2x40.nffw
│ ├── nic_AMDA0058-0012_2x40.nffw
│ ├── ...
├── nic-sriov
│ ├── nic_AMDA0058-0011_2x40.nffw
│ ├── nic_AMDA0058-0012_2x40.nffw
│ ├── ...
├── nic_AMDA0058-0011_2x40.nffw -> nic/nic_AMDA0058-0011_2x40.nffw
├── nic_AMDA0058-0012_2x40.nffw -> nic/nic_AMDA0058-0012_2x40.nffw
├── ...
As can be seen from the tree structure, three functionalities (flower, nic, nic-sriov) are supplied by the linux-firmware package. If nic-sriov is missing, follow The support site method. Point the symbolic links to the specific application required, in this case nic-sriov:
# ln -sf /lib/firmware/netronome/nic-sriov/* /lib/firmware/netronome/
The support site¶
The SR-IOV capable firmware can be obtained from the Netronome support site. Upon downloading the packaged firmware, install the firmware files.
For Debian / Ubuntu:
# dpkg -i agilio-sriov-firmware-2.1.x.deb
For RHEL / Fedora / CentOS:
# yum -y install agilio-sriov-firmware-2.1.x.rpm
The /lib/firmware/netronome directory contains symbolic links which point to the actual firmware files. When installing the above firmware package, the symbolic links are automatically updated to point to the new SR-IOV capable firmware files. This can be confirmed with:
# ls -og --time-style="+" /lib/firmware/netronome
...
lrwxrwxrwx 1 64 nic_AMDA0058-0011_2x40.nffw -> /opt/netronome/agilio-sriov-firmware/nic_AMDA0058-0011_2x40.nffw
...
Load firmware to SmartNIC¶
Remove and reload the driver. The driver will subsequently install the new firmware to the SmartNIC:
# modprobe -r nfp
# modprobe nfp
The ethtool command can be used to verify that the correct firmware has been loaded onto the SmartNIC:
# ethtool -i enp2s0np0 | head -3
driver: nfp
version: 4.15.0-20-generic SMP mod_unloa
firmware-version: 0.0.3.5 0.22 sriov-2.1.14 nic
Notice that the firmware has successfully changed from nic-2.0.4 to sriov-2.1.14.
.
Note
Because the /lib/firmware/netronome directory is managed by the linux-firmware package, an update to this package will cause the symbolic links to point back to the nic firmware files. If a system reboot or a driver reload occurs after the links were changed, the incorrect firmware will be loaded to the SmartNIC. In this event, repeat the Installing the SR-IOV capable firmware procedure to restore the desired functionality. A workaround is possible, but involves additional configuration of the initramfs file system. Customers interested in this workaround can Contact Us for more information.
Configuring SR-IOV¶
At this stage, there are still zero VFs, and only one PF (assuming only one Netronome SmartNIC is installed):
# lspci -kd 19ee:
02:00.0 Ethernet controller: Netronome Systems, Inc. Device 4000
Subsystem: Netronome Systems, Inc. Device 4001
Kernel driver in use: nfp
Kernel modules: nfp
The number of supported VFs on a netdev is exposed by sriov_totalvfs in sysfs. For example, if enp2s0np0 is the interface associated with the SmartNIC’s PF, the following command will return the total supported number of VFs:
# cat /sys/class/net/enp2s0np0/device/sriov_totalvfs
56
VFs can be allocated to a network interface by writing an integer to the sysfs file. For example, to allocate two VFs to enp2s0np0:
:
# echo 2 > /sys/class/net/enp2s0np0/device/sriov_numvfs
The new VFs, together with the PF, can be observed with the lspci command:
# lspci -kd 19ee:
02:00.0 Ethernet controller: Netronome Systems, Inc. Device 4000
Subsystem: Netronome Systems, Inc. Device 4001
Kernel driver in use: nfp
Kernel modules: nfp
02:08.0 Ethernet controller: Netronome Systems, Inc. Device 6003
Subsystem: Netronome Systems, Inc. Device 4001
Kernel driver in use: nfp_netvf
Kernel modules: nfp
02:08.1 Ethernet controller: Netronome Systems, Inc. Device 6003
Subsystem: Netronome Systems, Inc. Device 4001
Kernel driver in use: nfp_netvf
Kernel modules: nfp
In this example, the PF is located at PCI address 02:00.0. The two VFs are located at 02:08.0 and 02:08.1. Notice that the VFs are identified by Device 6003, and that they use the nfp_netvf kernel driver. For RHEL 7.x systems however, the VFs will use the nfp driver.
Note
If the SmartNIC has more than one physical port (phyport), the VFs will appear to be connected to all the phyports (as reported by the ip link command). This happens due to the PF being shared among all VFs. In reality, the VFs are only connected to phyport 0.
SR-IOV VFs cannot be reallocated dynamically. In order to change the number of allocated VFs, existing functions must first be deallocated by writing a 0 to the sysfs file. Otherwise, the system will return a device or resource busy error:
# echo 0 > /sys/class/net/enp2s0np0/device/sriov_numvfs
Note
Ensure any VMs are shut down and applications that may be using the VFs are stopped before deallocation.
In order to persist the VFs on the system, it is suggested that the system networking scripts be updated to manage them. The following snippet illustrates how to do this with NetworkManager for the PF enp2s0np0:
cat >/etc/NetworkManager/dispatcher.d/99-create-vfs << EOF
#!/bin/sh
# This is a NetworkManager script to persist the maximum number of VFs on a netdev
[ "enp2s0np0" == "\$1" -a "up" == "\$2" ] && \
cat /sys/class/net/enp2s0np0/device/sriov_totalvfs > /sys/class/net/enp2s0np0/device/sriov_numvfs
exit
EOF
chmod 755 /etc/NetworkManager/dispatcher.d/99-create-vfs
In Ubuntu systems, networkd-dispatcher can be used in place of NetworkManager, using a similar approach to setting up the PF:
#!/bin/sh
cat > /usr/lib/networkd-dispatcher/routable.d/50-ifup-noaddr << 'EOF'
#!/bin/sh
ip link set mtu 9216 dev enp2s0np0
ip link set up dev enp2s0np0
cat /sys/class/net/enp2s0np0/device/sriov_totalvfs > /sys/class/net/enp2s0np0/device/sriov_numvfs
EOF
chmod u+x /usr/lib/networkd-dispatcher/routable.d/50-ifup-noaddr
To enable PCI passthrough, edit the kernel command line at /etc/default/grub. Add the parameters intel_iommu=on iommu=pt to the existing command line:
GRUB_CMDLINE_LINUX_DEFAULT="console=tty1 console=ttyS0,115200 intel_iommu=on iommu=pt"
Then:
# update-grub
Ensure that the /boot/grub/grub.cfg file is updated with the aforementioned parameters:
# reboot
After reboot, confirm that the kernel has been started with the parameters:
# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.15.0-20-generic root=UUID=179b45a3-def2-48b0-8f2f-7a5b6b3f913b ro console=tty1 console=ttyS0,115200 intel_iommu=on iommu=pt
Using virtio-forwarder¶
virtio-forwarder is a userspace networking application that forwards bi-directional traffic between SR-IOV VFs and virtio networking devices in QEMU virtual machines. virtio-forwarder implements a virtio backend driver using the DPDK’s vhost-user library and services designated VFs by means of the DPDK poll mode driver (PMD) mechanism.
The steps shown here closely correlate with the comprehensive virtio-forwarder docs. Ensure that the Requirements are met and that the setup of Using SR-IOV has been completed.
Installing virtio-forwarder¶
For Debian / Ubuntu:
# add-apt-repository ppa:netronome/virtio-forwarder
# apt-get update
# apt-get install virtio-forwarder
For RHEL / Fedora / CentOS:
# yum install yum-plugin-copr
# yum copr enable netronome/virtio-forwarder
# yum install virtio-forwarder
virtio-forwarder makes use of the DPDK library, therefore DPDK has to be installed. Carry out the instructions of Installing DPDK.
Configuring hugepages¶
For Ubuntu, modify libvirt’s apparmor permissions to allow read/write access to the hugepages directory and library files for QEMU. Add the following lines to the end of /etc/apparmor.d/abstractions/libvirt-qemu:
/tmp/virtio-forwarder/** rwmix,
# for latest QEMU
/usr/lib/x86_64-linux-gnu/qemu/* rmix,
# for access to hugepages
owner "/dev/hugepages/libvirt/qemu/**" rw,
owner "/dev/hugepages-1G/libvirt/qemu/**" rw,
Also edit the existing line, such that:
/tmp/{,**} r,
Restart the apparmor service:
# systemctl restart apparmor.service
For virtio-forwarder, 2M hugepages are required whereas QEMU/KVM performs better with 1G hugepages. It is recommended that at least 1375 pages of 2M be reserved for virtio-forwarder. The hugepages can be configured during boot time, for which the following should be added to the Linux kernel command line parameters:
hugepagesz=2M hugepages=1375 default_hugepagesz=1G hugepagesz=1G hugepages=8
Alternatively, hugepages can be configured manually after each boot. Reserve at least 1375 * 2M for virtio-forwarder:
# echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
Reserve 8G for application hugepages (modify this as needed):
# echo 8 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
Since non-fragmented memory is required for hugepages, it is recommended that hugepages be configured during boot time.
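The active reservations can be verified at any time from /proc/meminfo:
# grep -i huge /proc/meminfo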
hugetlbfs needs to be mounted on the file system to allow applications to create and allocate handles to the mapped memory. The following lines mount the two types of hugepages on /dev/hugepages (2M) and /dev/hugepages-1G (1G):
# grep hugetlbfs /proc/mounts | grep -q "pagesize=2M" || \
( mkdir -p /dev/hugepages && mount nodev -t hugetlbfs -o rw,pagesize=2M /dev/hugepages/ )
# grep hugetlbfs /proc/mounts | grep -q "pagesize=1G" || \
( mkdir -p /dev/hugepages-1G && mount nodev -t hugetlbfs -o rw,pagesize=1G /dev/hugepages-1G/ )
Finally, libvirt requires a special directory inside the hugepages mounts with the correct permissions in order to create the necessary per-VM handles:
# mkdir /dev/hugepages-1G/libvirt
# mkdir /dev/hugepages/libvirt
# chown [libvirt-]qemu:kvm -R /dev/hugepages-1G/libvirt
# chown [libvirt-]qemu:kvm -R /dev/hugepages/libvirt
Note
Substitute /dev/hugepages[-1G] with your actual hugepage mount directory. A 2M hugepage mount location is created by default by some distributions.
Restart the libvirt daemon:
# systemctl restart libvirtd
To check that hugepages are correctly reserved for each page size, the hugeadm utility can be used:
# hugeadm --pool-list
Size Minimum Current Maximum Default
2097152 2048 2048 2048 *
1073741824 8 8 8
Binding to vfio-pci¶
Since the VFs need to communicate directly with virtio-forwarder, a pass-through style driver such as vfio-pci is required. The vfio-pci module is the preferred driver compared to uio_pci_generic and igb_uio, of which the former lacks SR-IOV compatibility whereas the latter is considered outdated.
First, unbind the VF PCI devices from their current drivers:
# lspci -Dd 19ee:6003 | awk '{print $1}' | xargs -I{} echo \
"echo {} > /sys/bus/pci/devices/{}/driver/unbind;" | bash
The VFs, which now have their drivers unbound, can be observed with the lspci command:
# lspci -kd 19ee:
02:00.0 Ethernet controller: Netronome Systems, Inc. Device 4000
Subsystem: Netronome Systems, Inc. Device 4001
Kernel driver in use: nfp
Kernel modules: nfp
02:08.0 Ethernet controller: Netronome Systems, Inc. Device 6003
Subsystem: Netronome Systems, Inc. Device 4001
Kernel modules: nfp
02:08.1 Ethernet controller: Netronome Systems, Inc. Device 6003
Subsystem: Netronome Systems, Inc. Device 4001
Kernel modules: nfp
Notice that the Kernel driver in use attribute was removed. To bind the vfio-pci driver to the VFs, first load the vfio-pci driver into the Linux kernel:
# modprobe vfio-pci
Then bind the driver to the VFs:
# echo 19ee 6003 > /sys/bus/pci/drivers/vfio-pci/new_id
The VFs are now bound to the vfio-pci driver:
# lspci -kd 19ee:
02:00.0 Ethernet controller: Netronome Systems, Inc. Device 4000
Subsystem: Netronome Systems, Inc. Device 4001
Kernel driver in use: nfp
Kernel modules: nfp
02:08.0 Ethernet controller: Netronome Systems, Inc. Device 6003
Subsystem: Netronome Systems, Inc. Device 4001
Kernel driver in use: vfio-pci
Kernel modules: nfp
02:08.1 Ethernet controller: Netronome Systems, Inc. Device 6003
Subsystem: Netronome Systems, Inc. Device 4001
Kernel driver in use: vfio-pci
Kernel modules: nfp
Launching virtio-forwarder¶
In this guide, the use case will be virtio-forwarder acting as a server. This means virtio-forwarder will create and host the sockets to which VMs can connect at a later stage. To configure virtio-forwarder as the server, edit /etc/default/virtioforwarder so that VIRTIOFWD_VHOST_CLIENT is assigned a blank value:
# Non-blank enables vhostuser client mode (default: server mode)
VIRTIOFWD_VHOST_CLIENT=
The virtio-forwarder service can be configured to start during boot time:
# systemctl enable virtio-forwarder
To manually start the service after installation, run:
# systemctl start virtio-forwarder
Adding VF ports to virtio-forwarder¶
Modify socket permissions:
# chown -R libvirt-qemu:kvm /tmp/virtio-forwarder/
Dynamically map the PCI address of each VF to virtio-forwarder as follows:
# /usr/lib/virtio-forwarder/virtioforwarder_port_control.py add \
--virtio-id 1 --pci-addr 02:08.0
status: OK
# /usr/lib/virtio-forwarder/virtioforwarder_port_control.py add \
--virtio-id 2 --pci-addr 02:08.1
status: OK
The virtio-id parameter is compulsory and denotes the id of the relay through which traffic is routed. A relay can accept only a single PCI device and a single VM.
The VF ports added to virtio-forwarder can be confirmed with:
# /usr/lib/virtio-forwarder/virtioforwarder_stats.py \
--include-inactive | grep DPDK_ADDED
relay_1.vf_to_vm.internal_state=DPDK_ADDED
relay_2.vf_to_vm.internal_state=DPDK_ADDED
The VF ports can be removed in a similar fashion:
# /usr/lib/virtio-forwarder/virtioforwarder_port_control.py remove \
--virtio-id 1 --pci-addr 02:08.0
status: OK
# /usr/lib/virtio-forwarder/virtioforwarder_port_control.py remove \
--virtio-id 2 --pci-addr 02:08.1
status: OK
It is useful to watch the virtio-forwarder journal while adding or removing ports:
# journalctl -fu virtio-forwarder
The VF entries can also be modified statically within the /etc/default/virtioforwarder file. Consult the virtio-forwarder docs for more information.
Modify guest VM XML files¶
The snippets in this section should be inserted in each VM’s XML file.
The following snippet configures the connection between the VM and the virtio-forwarder service. Note that virtio-forwarder1.sock refers to virtio-id 1 and relay_1. The MAC address should be assigned the value of the specific VF to be paired with the VM. If left unassigned, libvirt will assign a random MAC address which will cause the VM’s traffic to be rejected by the SmartNIC. The PCI address is internal to the VM and can be chosen arbitrarily, but should be unique within the VM itself.
<devices>
  <interface type='vhostuser'>
    <mac address='1e:a3:32:f8:3e:83'/>
    <source type='unix' path='/tmp/virtio-forwarder/virtio-forwarder1.sock' mode='client'/>
    <model type='virtio'/>
    <alias name='net1'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
  </interface>
</devices>
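The snippet can be added to an existing domain definition with virsh, and checked afterwards; for example, assuming a VM named vm1 (a hypothetical name):
# virsh edit vm1
# virsh dumpxml vm1 | grep -A2 vhostuser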
The VM also has to be configured to make use of the 1G hugepages that were reserved for this purpose:
<memoryBacking>
  <hugepages>
    <page size='1048576' unit='KiB' nodeset='0'/>
  </hugepages>
</memoryBacking>
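For reference, the 1G hugepages themselves are typically reserved on the host at boot time, since 1G pages usually cannot be allocated reliably after boot. A minimal sketch for a GRUB-based host, assuming 8 pages (an arbitrary count) are wanted: add default_hugepagesz=1G hugepagesz=1G hugepages=8 to GRUB_CMDLINE_LINUX in /etc/default/grub, then regenerate the GRUB configuration and reboot:
# update-grub
# reboot
On RHEL/CentOS, use grub2-mkconfig -o /boot/grub2/grub.cfg instead of update-grub.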
Allocate CPUs and memory to the VM. It is especially important to specify memAccess='shared', as this allows the host and guest VM to share RAM. This is required by virtio-forwarder to write the packets to the VM.
<cpu mode='custom' match='exact'>
  <model fallback='allow'>SandyBridge</model>
  <feature policy='require' name='ssse3'/>
  <numa>
    <cell id='0' cpus='0-1' memory='3670016' unit='KiB' memAccess='shared'/>
  </numa>
</cpu>
The VMs can now be booted. Observing the host's CPU usage (e.g. with htop) will show that some of the cores are utilized to the maximum; this is expected, as virtio-forwarder uses a polling mechanism. The number of cores dedicated to virtio-forwarder defaults to 2 and can be adjusted in /etc/default/virtioforwarder by modifying the VIRTIOFWD_CPU_MASK value.
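For example, assuming the value accepts a comma-separated CPU list (consult the virtio-forwarder docs for the exact format), four cores could be dedicated as follows:
# sed -i 's/^VIRTIOFWD_CPU_MASK=.*/VIRTIOFWD_CPU_MASK=1,2,3,4/' /etc/default/virtioforwarder
# systemctl restart virtio-forwarder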
Appendix A: Netronome Repositories¶
All the software mentioned in this document can be obtained via the official Netronome repositories. Please find instructions below on how to enable access to the aforementioned repositories from your respective Linux distributions.
Importing GPG-key¶
Download and Import GPG-key to your local machine:
For RHEL/Centos 7.5, download the public key:
# wget https://rpm.netronome.com/gpg/NetronomePublic.key
Import the public key:
# rpm --import NetronomePublic.key
For Ubuntu 18.04 LTS, download the public key:
# wget https://deb.netronome.com/gpg/NetronomePublic.key
Import the public key:
# apt-key add NetronomePublic.key
Configuring repositories¶
For RHEL/Centos 7.5, add Netronome’s repository:
# cat << 'EOF' > /etc/yum.repos.d/netronome.repo
[netronome]
name=netronome
baseurl=https://rpm.netronome.com/repos/centos/
gpgcheck=0
enabled=1
EOF
# yum makecache
For Ubuntu 18.04 LTS, add Netronome’s repository:
# mkdir -p /etc/apt/sources.list.d/
# echo "deb https://deb.netronome.com/apt stable main" > \
/etc/apt/sources.list.d/netronome.list
Update repository lists:
# apt-get update
Appendix B: Installing the Out-of-Tree NFP Driver¶
The nfp driver can be installed via the Netronome repository or built from source depending on your requirement.
Install Driver via Netronome Repository¶
Please refer to Appendix A: Netronome Repositories on how to configure the Netronome repository applicable to your distribution. When the repository has been successfully added, install the nfp-driver package using the commands below.
RHEL 7.5¶
First install the required dependencies for Red Hat. DKMS is required to install the out-of-tree drivers:
# yum install -y kernel-devel-$(uname -r) elfutils-libelf-devel gcc
# wget http://fr2.rpmfind.net/linux/fedora/linux/updates/28/Everything/\
x86_64/Packages/d/dkms-2.6.1-1.fc28.noarch.rpm
# rpm -ivh dkms-2.6.1-1.fc28.noarch.rpm
Then install the NFP driver from the netronome repository added previously in Configuring repositories:
# yum list available | grep nfp-driver
agilio-nfp-driver-dkms.noarch 2017.12.18.2245.77334f7-1.el7 netronome
# yum install -y agilio-nfp-driver-dkms --nogpgcheck
Ubuntu 18.04 LTS:
# apt-cache search nfp-driver
agilio-nfp-driver-dkms - agilio-nfp-driver driver in DKMS format.
# apt-get install agilio-nfp-driver-dkms
Kernel Changes¶
Take note that installing the DKMS driver will only install it for the currently running kernel. When you upgrade the installed kernel, it may not automatically update the nfp module to use the version in the DKMS package. In kernel versions older than v4.16 the MODULE_VERSION parameter of the in-tree module was not set, which causes DKMS to pick the module with the highest srcversion hash (https://github.com/dell/dkms/issues/14). This is worked around by the package install step adding a --force to the DKMS install, but this will not trigger on a kernel upgrade. To work around this issue, boot into the new kernel and then re-install the agilio-nfp-driver-dkms package.
This should not be a problem when upgrading from kernels v4.16 and newer, as MODULE_VERSION has been added since this revision and the DKMS version check should work properly. It is not possible to determine which nfp.ko file was loaded by relying only on information provided by the kernel. However, it is possible to confirm that the binary signature of the file on disk and the module loaded in memory are the same.
To confirm that the module in memory is the same as the file on disk, compare the srcversion tag. The in-memory module's tag can be read from /sys/module/nfp/srcversion. The default on-disk version can be queried with modinfo.
In-memory module:
# cat /sys/module/nfp/srcversion
On-disk module:
# modinfo nfp | grep "^srcversion:"
If these tags are in sync, the filename of the module provided by a modinfo query will identify the origin of the module:
# modinfo nfp | grep "^filename:"
If these tags are not in sync, there are likely conflicting copies of the module on the system: the initramfs may be out of sync or the module dependencies may be inconsistent.
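The comparison can be scripted; a minimal sketch, assuming the nfp module is currently loaded:
#!/bin/bash
# Compare the srcversion of the loaded nfp module with the on-disk default.
LOADED=$(cat /sys/module/nfp/srcversion)
ONDISK=$(modinfo nfp | awk '/^srcversion:/ {print $2}')
if [ "${LOADED}" = "${ONDISK}" ]; then
    echo "srcversion tags in sync: ${LOADED}"
    modinfo nfp | grep "^filename:"
else
    echo "srcversion MISMATCH: loaded=${LOADED} on-disk=${ONDISK}"
fi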
The in-tree kernel module is usually located at the following path (please note, this module may be compressed with a .xz extension):
/lib/modules/$(uname -r)/kernel/drivers/net/ethernet/netronome/nfp/nfp.ko
The DKMS module is usually located at the following path:
/lib/modules/$(uname -r)/updates/dkms/nfp.ko
To ensure that the out-of-tree driver is correctly loaded instead of the in-tree module, the following commands can be run:
# mkdir -p /etc/depmod.d
# echo "override nfp * extra" > /etc/depmod.d/netronome.conf
# depmod -a
# modprobe -r nfp; modprobe nfp
# update-initramfs -u
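To verify that the override took effect, confirm that modinfo now resolves nfp to the DKMS copy under the updates directory shown above:
# modinfo nfp | grep "^filename:"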
Building from Source¶
Driver sources for Netronome Flow Processor devices, including the NFP-4000 and NFP-6000 models, can be found at: https://github.com/Netronome/nfp-drv-kmods
RHEL/CentOS 7.5:
# yum install -y kernel-devel-$(uname -r) gcc git
Ubuntu 18.04:
# apt-get update
# apt-get install -y linux-headers-$(uname -r) build-essential libelf-dev
Clone, Build and Install¶
Finally, to clone, build and install the driver:
# git clone https://github.com/Netronome/nfp-drv-kmods.git
# cd nfp-drv-kmods
# make
# make install
# depmod -a
Appendix C: Working with Board Support Package¶
The NFP BSP provides infrastructure software and a development environment for managing NFP based platforms.
Install Software from Netronome Repository¶
Please refer to Appendix A: Netronome Repositories on how to configure the Netronome repository applicable to your distribution. When the repository has been successfully added, install the BSP package using the commands below.
RHEL/CentOS 7.5:
# yum list available | grep nfp-bsp
nfp-bsp-6000-b0.x86_64 2017.12.05.1404-1 netronome
# yum install nfp-bsp-6000-b0 --nogpgcheck
# reboot
Ubuntu 18.04 LTS:
# apt-cache search nfp-bsp
nfp-bsp-6000-b0 - Netronome NFP BSP
# apt-get install nfp-bsp-6000-b0
Install Software From deb/rpm Package¶
Obtain Software¶
The latest BSP packages can be obtained at the downloads area of the Netronome Support site (https://help.netronome.com).
Install the prerequisite dependencies¶
RHEL/Centos 7.5 Dependencies¶
No dependency installation required
NFP BSP Package¶
Install the NFP BSP package provided by Netronome Support.
RHEL/CentOS 7.5:
# yum install -y nfp-bsp-6000-*.rpm --nogpgcheck
Ubuntu 18.04 LTS:
# dpkg -i nfp-bsp-6000-*.deb
Using BSP tools¶
Enable CPP access¶
The NFP has an internal Command Push/Pull (CPP) bus that allows debug access to the SmartNIC internals. CPP access allows user space tools raw access to chip internals and is required to enable the use of most BSP tools. Only the out-of-tree (oot) driver allows CPP access.
Follow the steps from Install Driver via Netronome Repository to install the oot nfp driver. After the nfp module has been built, load the driver with CPP access:
# depmod -a
# rmmod nfp
# modprobe nfp nfp_dev_cpp=1 nfp_pf_netdev=0
To persist this option across reboots, a number of mechanisms are available; the distribution-specific documentation will detail that process more thoroughly. Care must be taken that the settings are also applied to any initramfs images generated.
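As one example of such a mechanism, the module options can be placed in a modprobe configuration file (the file name here is illustrative):
# echo "options nfp nfp_dev_cpp=1 nfp_pf_netdev=0" > /etc/modprobe.d/nfp-cpp.conf
Then regenerate the initramfs, e.g. with update-initramfs -u on Ubuntu or dracut -f on RHEL/CentOS.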
Configure Media Settings¶
As an alternative to the process described in Configuring Interface Media Mode, the BSP tools can be used to configure the port speed of the SmartNIC using the following commands. Note that a reboot is still required for changes to take effect.
Agilio CX 2x25GbE - AMDA0099¶
To set the port speed of the CX 2x25GbE, the following commands can be used.
Set port 0 and port 1 to 10G mode:
# nfp-media phy1=10G phy0=10G
Set port 1 to 25G mode:
# nfp-media phy1=25G+
To change the FEC settings of the 2x25GbE, the following command can be used:
# nfp-media --set-aneg=phy0=[S|A|I|C|F] --set-fec=phy0=[A|F|R|N]
Where the parameters for each argument are:
--set-aneg=:
- S (search): Search through supported modes until link is found. Only one side should be doing this. It may result in a mode that can have physical layer errors depending on SFP type and what the other end wants. Long DAC cables with no FEC WILL have physical layer errors.
- A (auto): Automatically choose mode based on speed and SFP type.
- C (consortium): Consortium 25G auto-negotiation with link training.
- I (IEEE): IEEE 10G or 25G auto-negotiation with link training.
- F (forced): Mode is forced with no auto-negotiation or link training.
--set-fec=:
- A (auto): Automatically choose FEC based on speed and SFP type.
- F (Firecode): BASE-R Firecode FEC, compatible with 10G.
- R (Reed-Solomon): Reed-Solomon FEC, new for 25G.
- N (none): No FEC is used.
Agilio CX 1x40GbE - AMDA0081¶
Set port 0 to 40G mode:
# nfp-media phy0=40G
Set port 0 to 4x10G fanout mode:
# nfp-media phy0=4x10G
Agilio CX 2x40GbE - AMDA0097¶
Set port 0 and port 1 to 40G mode:
# nfp-media phy0=40G phy1=40G
Set port 0 to 4x10G fanout mode:
# nfp-media phy0=4x10G
For a mixed configuration, the highest port must be in 40G mode, e.g.:
# nfp-media phy0=4x10G phy1=40G
Appendix D: Obtaining DPDK-ns¶
Netronome specific DPDK sources can be acquired from the Official Netronome Support site (https://help.netronome.com). If you do not have an account already, you can request access by sending an email to help@netronome.com.
Download the dpdk-ns sources or deb/rpm package from the Netronome-Support site and perform the following steps to build or install DPDK.
Build DPDK-ns from sources¶
To build DPDK-ns from source, assuming the tarball has been downloaded to the /root directory:
# cd /root
# tar zxvf dpdk-ns.tar.gz
# cd dpdk-ns
# export RTE_SDK=/root/dpdk-ns
# export RTE_TARGET=x86_64-native-linuxapp-gcc
# make T=$RTE_TARGET install
Install DPDK-ns from packages¶
Ubuntu:
# apt-get install -y netronome-dpdk*.deb
CentOS/RHEL:
# yum install -y netronome-dpdk*.rpm
Appendix E: Updating NFP Flash¶
The NVRAM flash software on the SmartNIC can be updated in one of two ways: either via ethtool or via the BSP userspace tools. In both cases, the BSP package needs to be installed to gain access to the intended flash image. After the flash has been updated, the system needs to be rebooted for the update to take effect.
Note
The ethtool interface is only available for hosts running kernel 4.16 or higher when using the in-tree driver. Please use the out of tree driver to enable ethtool flashing on older kernels.
Note
Updating the flash via ethtool is only supported if the existing flash version is greater than 0028.0028.007c. The installed NVRAM flash version can be checked with the command dmesg | grep BSP. Cards running older versions of the NVRAM flash must be updated using the method in Update via BSP Userspace Tools.
Refer to Appendix C: Working with Board Support Package to acquire the BSP tool package.
Update via Ethtool¶
To update the flash using ethtool, the flash image files in the Netronome directory on the system must first be copied to /lib/firmware so that ethtool has access to them:
# cp /opt/netronome/flash/flash-nic.bin /lib/firmware
# cp /opt/netronome/flash/flash-one.bin /lib/firmware
Thereafter, ethtool can be used to reflash the software loaded onto the SmartNIC devices identified by either their PF <netdev> or their physical ports <netdev port>:
# ethtool -f <netdev/netdev port> flash-nic.bin
# ethtool -f <netdev/netdev port> flash-one.bin
Update via BSP Userspace Tools¶
Obtain Out of Tree NFP Driver¶
To update the flash using the BSP userspace tools, use the following steps. Refer to Appendix B: Installing the Out-of-Tree NFP Driver for instructions on installing the out-of-tree NFP driver and loading it with CPP access.
Flash the Card¶
The following commands may be executed for each card installed in the system using the PCIe ID of the particular card. First reload the NFP drivers with CPP access enabled:
# rmmod nfp
# modprobe nfp nfp_pf_netdev=0 nfp_dev_cpp=1
Then use the included Netronome flashing tools to reflash the card:
# /opt/netronome/bin/nfp-flash --preserve-media-overrides \
-w /opt/netronome/flash/flash-nic.bin -Z <PCI ID, e.g. 04:00.0>
# /opt/netronome/bin/nfp-flash -w /opt/netronome/flash/flash-one.bin \
-Z <PCI ID, e.g. 04:00.0>
# reboot
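After the reboot, the installed NVRAM flash version can be confirmed from the kernel log:
# dmesg | grep BSP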
Appendix F: Upgrading the Kernel¶
RHEL 7.5¶
It is recommended to use only kernel packages released by Red Hat and installable as part of the distribution installation and upgrade procedure.
CentOS 7.5¶
The CentOS package installer yum will manage an update to the supported kernel version. The command yum install kernel-<version> updates the kernel for CentOS. First search for available kernel packages, then install the desired release:
# yum list --showduplicates kernel
kernel.x86_64    3.10.0-862.el7        base
kernel.x86_64    3.10.0-862.2.3.el7    updates
kernel.x86_64    3.10.0-862.3.2.el7    updates
# yum install kernel-3.10.0-862.el7
Ubuntu 18.04 LTS¶
If desired, alternative kernels may be installed. For example, at the time of writing, v4.18 is the newest stable kernel.
Acquire packages¶
To download the kernels from Ubuntu's mainline kernel PPA:
# BASE=http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.18/
# wget \
$BASE/linux-headers-4.18.0-041800_4.18.0-041800.201808122131_all.deb \
$BASE/linux-headers-4.18.0-041800-generic_4.18.0-041800.201808122131_amd64.deb \
$BASE/linux-image-unsigned-4.18.0-041800-generic_4.18.0-041800.201808122131_amd64.deb \
$BASE/linux-modules-4.18.0-041800-generic_4.18.0-041800.201808122131_amd64.deb
Install packages¶
To install the packages:
# dpkg -i \
linux-headers-4.18.0-041800_4.18.0-041800.201808122131_all.deb \
linux-headers-4.18.0-041800-generic_4.18.0-041800.201808122131_amd64.deb \
linux-image-unsigned-4.18.0-041800-generic_4.18.0-041800.201808122131_amd64.deb \
linux-modules-4.18.0-041800-generic_4.18.0-041800.201808122131_amd64.deb
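After rebooting into the new kernel, the running version (expected to be 4.18.0-041800-generic for the packages above) can be confirmed with:
# uname -r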
Appendix G: set_irq_affinity.sh Source¶
#!/bin/bash -e
# Copyright (C) 2018 Netronome Systems, Inc.
#
# This software is dual licensed under the GNU General License Version 2,
# June 1991 as shown in the file COPYING in the top-level directory of this
# source tree or the BSD 2-Clause License provided below. You have the
# option to license this software under the complete terms of either license.
#
# The BSD 2-Clause License:
#
# Redistribution and use in source and binary forms, with or
# without modification, are permitted provided that the following
# conditions are met:
#
# 1. Redistributions of source code must retain the above
# copyright notice, this list of conditions and the following
# disclaimer.
#
# 2. Redistributions in binary form must reproduce the above
# copyright notice, this list of conditions and the following
# disclaimer in the documentation and/or other materials
# provided with the distribution.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
# BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
# ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
usage() {
echo "Usage: $0 NETDEV"
echo " Optional env vars: IRQ_NAME_FMT"
exit 1
}
[ $# -ne 1 ] && usage
[ "a$IRQ_NAME_FMT" == a ] && IRQ_NAME_FMT=$1-rxtx
DEV=$1
if ! [ -e /sys/bus/pci/devices/$DEV ]; then
DEV=$(ethtool -i $1 | grep bus | awk '{print $2}')
N_TX=$(ls /sys/class/net/$1/queues/ | grep tx | wc -l)
N_CPUS=$(ls /sys/bus/cpu/devices/ | wc -l)
fi
[ "a$DEV" == a ] && usage
NODE=$(cat /sys/bus/pci/devices/$DEV/numa_node)
CPUL=$(cat /sys/bus/node/devices/node${NODE}/cpulist | tr ',' ' ')
N_NODES=$(ls /sys/bus/node/devices/ | wc -l)
for c in $CPUL; do
# Convert "n-m" into "n n+1 n+2 ... m"
[[ "$c" =~ '-' ]] && c=$(seq $(echo $c | tr '-' ' '))
CPUS=(${CPUS[@]} $c)
done
echo Device $DEV is on node $NODE with cpus ${CPUS[@]}
IRQBAL=$(ps aux | grep irqbalance | wc -l)
[ $IRQBAL -ne 1 ] && echo Killing irqbalance && killall irqbalance
IRQS=$(ls /sys/bus/pci/devices/$DEV/msi_irqs/)
IRQS=($IRQS)
node_mask=$((~(~0 << N_NODES)))
node_shf=$((N_NODES - 1))
cpu_shf=$((N_TX << node_shf))
p_mask=0
id=0
for i in $(seq 0 $((${#IRQS[@]} - 1)))
do
! [ -e /proc/irq/${IRQS[i]} ] && continue
name=$(basename /proc/irq/${IRQS[i]}/$IRQ_NAME_FMT*)
ls /proc/irq/${IRQS[i]}/$IRQ_NAME_FMT* >>/dev/null 2>/dev/null || continue
cpu=${CPUS[id % ${#CPUS[@]}]}
m=0
m_mask=node_mask
if [ $N_TX -gt $((id + ${#CPUS[@]})) ]; then
# Only take one CPU if there will be more rings on this CPU
m_mask=1
fi
# Calc the masks we should cover
for j in `seq 0 $cpu_shf $((N_CPUS - 1))`; do
m=$((m << cpu_shf | (m_mask << ((cpu >> node_shf) << node_shf))))
m=$((m & ~p_mask))
done
xps_mask=$(printf "%x" $((m % (1 << N_CPUS))))
# Insert comma between low and hi 32 bits, if xps_mask is long enough
xps_mask=`echo $xps_mask | sed 's/\(.\)\(.\{8\}$\)/\1,\2/'`
p_mask=$((p_mask | m))
echo $cpu > /proc/irq/${IRQS[i]}/smp_affinity_list
irq_state="irq: $(cat /proc/irq/${IRQS[i]}/smp_affinity)"
xps_state='xps: ---'
xps_file=/sys/class/net/$1/queues/tx-$id/xps_cpus
if [ -e $xps_file ]; then
echo $xps_mask > $xps_file
xps_state="xps: $(cat $xps_file)"
fi
echo -e "IRQ ${IRQS[i]} to CPU $cpu ($irq_state $xps_state)"
((++id))
done
Contact Us¶
Netronome Systems, Inc.
2903 Bunker Hill Lane, Suite 150
Santa Clara, CA 95054
Tel: 408.496.0022 | Fax: 408.586.0002
https://www.netronome.com | help@netronome.com