Troubleshooting Guide
This guide contains information on how to troubleshoot Aeron DPDK installations.
Aeron DPDK provides additional executables to help with troubleshooting:
- dpdk_arp_tool: for debugging ARP issues.
- dpdk_ping_raw: for testing connectivity between machines.

    # ping host: send ping and print stats
    $ sudo ./dpdk_ping_raw -- -s -l <ping_addr>:<ping_port> -r <pong_addr>:<pong_port> -m <pong_mac_addr>

    # pong host: respond to ping requests
    $ sudo ./dpdk_ping_raw -- -l <pong_addr>:<pong_port> -r <ping_addr>:<ping_port> -m <ping_mac_addr>
Both require root privileges to run.
Run either tool with
Observability
Aeron DPDK provides a set of DPDK counters that expose the runtime state of the system. These include Aeron-specific counters as well as environment-specific DPDK counters.
Here is an example AeronStat output from one of our AWS tests:
43: 453,803,278 - DPDK poll count SENDER
44: 436,106,067 - DPDK poll count RECEIVER
45: 0 - DPDK - local: 10.0.10.191/6:ac:db:fb:56:2d rx: 1/4096 tx: 2/1024 65536
46: 9,081,956 - DPDK TX packets
47: 3,286,117,876 - DPDK TX bytes
48: 9,072,742 - DPDK RX packets
49: 3,285,730,128 - DPDK RX bytes
50: 0 - DPDK TX no buffers
51: 0 - DPDK RX no buffers
52: 0 - DPDK TX EAGAIN
53: 0 - DPDK TX ERROR
54: 0 - DPDK RX ERROR
55: 0 - DPDK RX H/W missed packets
56: 6 - DPDK RX Sender DISCARD
57: 0 - DPDK RX Sender Queue Drop
58: 1 - DPDK ARP Misses
59: 0 - DPDK Checksum Failures
60: 0 - DPDK Fragmented Packets
61: 4,145 - DPDK RX Mempool Available
62: 2,013 - DPDK TX Sender Mempool Available
63: 2,001 - DPDK TX Receiver Mempool Available
64: 9,072,751 - DPDK rx_good_packets
65: 9,081,965 - DPDK tx_good_packets
66: 3,285,733,386 - DPDK rx_good_bytes
67: 3,286,121,134 - DPDK tx_good_bytes
68: 0 - DPDK rx_missed_errors
69: 0 - DPDK rx_errors
70: 0 - DPDK tx_errors
71: 0 - DPDK rx_mbuf_allocation_errors
72: 9,072,752 - DPDK rx_q0_packets
73: 3,285,733,748 - DPDK rx_q0_bytes
74: 0 - DPDK rx_q0_errors
75: 9,076,425 - DPDK tx_q0_packets
76: 3,285,689,378 - DPDK tx_q0_bytes
77: 5,541 - DPDK tx_q1_packets
78: 432,118 - DPDK tx_q1_bytes
79: 0 - DPDK wd_expired
80: 1 - DPDK dev_start
81: 0 - DPDK dev_stop
82: 0 - DPDK tx_drops
83: 0 - DPDK bw_in_allowance_exceeded
84: 0 - DPDK bw_out_allowance_exceeded
85: 0 - DPDK pps_allowance_exceeded
86: 0 - DPDK conntrack_allowance_exceeded
87: 0 - DPDK linklocal_allowance_exceeded
88: 2,462,629 - DPDK conntrack_allowance_available
89: 0 - DPDK ena_srd_mode
90: 0 - DPDK ena_srd_tx_pkts
91: 0 - DPDK ena_srd_eligible_tx_pkts
92: 0 - DPDK ena_srd_rx_pkts
93: 0 - DPDK ena_srd_resource_utilization
94: 9,072,749 - DPDK rx_q0_cnt
95: 3,285,732,662 - DPDK rx_q0_bytes
96: 0 - DPDK rx_q0_refill_partial
97: 0 - DPDK rx_q0_l3_csum_bad
98: 0 - DPDK rx_q0_l4_csum_bad
99: 9,072,746 - DPDK rx_q0_l4_csum_good
100: 0 - DPDK rx_q0_mbuf_alloc_fail
101: 0 - DPDK rx_q0_bad_desc_num
102: 0 - DPDK rx_q0_bad_req_id
103: 0 - DPDK rx_q0_bad_desc
104: 0 - DPDK rx_q0_unknown_error
105: 9,076,423 - DPDK tx_q0_cnt
106: 5,541 - DPDK tx_q1_cnt
107: 3,285,688,654 - DPDK tx_q0_bytes
108: 432,118 - DPDK tx_q1_bytes
109: 0 - DPDK tx_q0_prepare_ctx_err
110: 0 - DPDK tx_q1_prepare_ctx_err
111: 9,076,423 - DPDK tx_q0_tx_poll
112: 5,541 - DPDK tx_q1_tx_poll
113: 9,076,423 - DPDK tx_q0_doorbells
114: 5,541 - DPDK tx_q1_doorbells
115: 0 - DPDK tx_q0_bad_req_id
116: 0 - DPDK tx_q1_bad_req_id
117: 942 - DPDK tx_q0_available_desc
118: 986 - DPDK tx_q1_available_desc
119: 0 - DPDK tx_q0_missed_tx
120: 0 - DPDK tx_q1_missed_tx
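A listing of this size is easier to review with a small filter. The following is a minimal sketch, assuming the counters have been saved as text in the `id: value - label` layout shown above. The function names and the keyword list are illustrative assumptions, not part of Aeron or DPDK, and a non-zero match is only a hint about where to look, not proof of a fault.

```python
import re

# Matches lines such as "46: 9,081,956 - DPDK TX packets".
LINE_RE = re.compile(r"^\s*(\d+):\s+([\d,]+)\s+-\s+(.*\S)\s*$")

# Substrings that suggest an error or drop counter.
# Assumption: picked from the sample labels above, not an official list.
SUSPECT_KEYWORDS = ("error", "no buffers", "missed", "drop", "discard",
                    "fail", "exceeded", "eagain")


def parse_counters(text):
    """Return a list of (counter_id, value, label) tuples."""
    counters = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counter_id, value, label = match.groups()
            counters.append((int(counter_id), int(value.replace(",", "")), label))
    return counters


def suspicious(counters):
    """Keep counters with a non-zero value whose label contains a suspect keyword."""
    return [c for c in counters
            if c[1] > 0 and any(k in c[2].lower() for k in SUSPECT_KEYWORDS)]


# A few lines copied from the sample output above.
SAMPLE = """\
46: 9,081,956 - DPDK TX packets
50: 0 - DPDK TX no buffers
53: 0 - DPDK TX ERROR
56: 6 - DPDK RX Sender DISCARD
58: 1 - DPDK ARP Misses
"""

if __name__ == "__main__":
    for counter_id, value, label in suspicious(parse_counters(SAMPLE)):
        print(f"{counter_id}: {value:,} - {label}")
```

Counters are kept as a list keyed by ID rather than a dictionary keyed by label, because some labels (for example `DPDK rx_q0_bytes`) appear more than once in the output.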
Common issues
Invalid gateway
If AERON_DPDK_GATEWAY_IPV4_ADDRESS is set incorrectly, no Aeron DPDK traffic will reach its destination.
The symptom is a continuously increasing value of the DPDK ARP Misses counter. When the gateway is configured
correctly, this counter should remain at 1.
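The check described above can be automated by comparing two counter snapshots taken some time apart. This is a rough sketch, assuming each snapshot is AeronStat-style text containing a `DPDK ARP Misses` line. The helper names are hypothetical, and the snapshot values in the usage part are made up for illustration.

```python
import re

# Finds the value on a line such as "58: 1 - DPDK ARP Misses".
ARP_MISSES_RE = re.compile(
    r"^\s*\d+:\s+([\d,]+)\s+-\s+DPDK ARP Misses\s*$", re.MULTILINE
)


def arp_misses(snapshot):
    """Return the ARP Misses value from a snapshot, or None if it is absent."""
    match = ARP_MISSES_RE.search(snapshot)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def arp_misses_growing(earlier, later):
    """Return True if the counter increased between the two snapshots."""
    before, after = arp_misses(earlier), arp_misses(later)
    if before is None or after is None:
        raise ValueError("DPDK ARP Misses counter not found in snapshot")
    return after > before


if __name__ == "__main__":
    # Illustrative values only, not taken from a real run.
    first = "58: 1 - DPDK ARP Misses\n"
    second = "58: 1,204 - DPDK ARP Misses\n"
    if arp_misses_growing(first, second):
        print("ARP misses are increasing: check AERON_DPDK_GATEWAY_IPV4_ADDRESS")
```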
On AWS, both the gateway (AERON_DPDK_GATEWAY_IPV4_ADDRESS) and the local IP address (AERON_DPDK_LOCAL_IPV4_ADDRESS)
should be omitted entirely, as Aeron can resolve both automatically.
Connectivity issues
Apart from an invalid gateway (see Invalid gateway), connectivity issues can have other causes:
- Network configuration: routing, firewall etc.

  Verify that machines can see each other and that UDP traffic is allowed between the boxes on the DPDK interfaces.
  Two possible approaches to take:

  - Use dpdk_ping_raw to send DPDK traffic between the machines.
  - Unbind DPDK interface and use any tool (e.g. iperf) that can send UDP traffic.

    This requires reversing the steps outlined in the Device and Driver Configuration section:

    a. Find the interface

        $ ./dpdk-devbind.py --status-dev net

        Network devices using DPDK-compatible driver
        ============================================
        0000:48:00.0 '82598EB 10-Gigabit AF Dual Port Network Connection 10f1' drv=vfio-pci unused=ixgbe

       In this example the only DPDK interface was 0000:48:00.0.

    b. Unbind the interface found in step a.

        $ sudo ./dpdk-devbind.py -u 0000:48:00.0
        $ ./dpdk-devbind.py --status-dev net

        Network devices using kernel driver
        ===================================
        0000:01:00.0 'MT27800 Family [ConnectX-5] 1017' if=enp1s0f0 drv=mlx5_core unused=vfio-pci *Active*
        0000:01:00.1 'MT27800 Family [ConnectX-5] 1017' if=enp1s0f1 drv=mlx5_core unused=vfio-pci
        0000:43:00.0 'I211 Gigabit Network Connection 1539' if=enp67s0 drv=igb unused=vfio-pci *Active*
        0000:44:00.0 'Wi-Fi 6 AX200 2723' if=wlo2 drv=iwlwifi unused=vfio-pci
        0000:48:00.0 '82598EB 10-Gigabit AF Dual Port Network Connection 10f1' if=enp72s0f0 drv=ixgbe unused=vfio-pci
        0000:48:00.1 '82598EB 10-Gigabit AF Dual Port Network Connection 10f1' if=enp72s0f1 drv=ixgbe unused=vfio-pci

       Now 0000:48:00.0 is visible to Linux kernel again under enp72s0f0 name.

    c. Enable the link from step b.

        $ sudo ip link set enp72s0f0 up

    d. Verify connectivity using standard Linux tools.

    After the test follow the Device and Driver Configuration section to undo the changes.

- DPDK-specific issues

  If the network test was successful then the issue might be DPDK-specific in which case DPDK counters (see Observability) might provide additional context.
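For the kernel-level network check described above (after the interface has been unbound from DPDK), any UDP tool will do. Where iperf is not available, a single-datagram round trip can be sketched in Python with the standard socket module. The function names, payload, and timeout below are illustrative assumptions. In real use the echo side would run on one machine and the probe on the other; the demo at the bottom uses loopback only so that it can be run on a single host.

```python
import socket
import threading


def serve_one_echo(sock):
    """Receive a single datagram on a bound UDP socket and send it back."""
    data, peer = sock.recvfrom(2048)
    sock.sendto(data, peer)


def udp_probe(host, port, payload=b"ping", timeout=2.0):
    """Send one datagram and return True if the same bytes come back in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(payload, (host, port))
            reply, _ = sock.recvfrom(2048)
        except OSError:
            # Covers timeouts as well as unreachable-network errors.
            return False
        return reply == payload


if __name__ == "__main__":
    # Loopback demo; on real hosts, bind the echo side to the address of the
    # interface under test and probe it from the remote machine.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        port = server.getsockname()[1]
        worker = threading.Thread(target=serve_one_echo, args=(server,))
        worker.start()
        print("UDP round trip ok:", udp_probe("127.0.0.1", port))
        worker.join()
```

A failed probe at this level points to routing or firewall configuration rather than to DPDK itself.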
Packet capture
Packet capture must be enabled at the source level and requires specialized tools to run. See the DPDK packet capture libraries and tools guide for more information.
It is not possible to capture packets using aeronmd_dpdk. However, both dpdk_arp_tool and
dpdk_ping_raw are compiled with packet capture support.