linux/tools
Jason Xing 45e359be1c net: xsk: introduce XDP_MAX_TX_SKB_BUDGET setsockopt
This patch adds a setsockopt option that lets applications adjust the
maximum number of descs handled in one send syscall. It mitigates the
situation where the default value (32) is too small and causes the send
syscall to be triggered more frequently.

Given the diversity and complexity of applications, no single value is
ideal for all cases, so keep 32 as the default value, as before.

The patch does the following things:
- Add XDP_MAX_TX_SKB_BUDGET socket option.
- Set max_tx_budget to 32 by default in the initialization phase as a
  per-socket granular control.
- Set the range of max_tx_budget as [32, xs->tx->nentries].
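
A minimal user-space sketch of how an application might use the new
option, assuming an already created and bound AF_XDP socket with a TX
ring. The helper names and XSK_DEFAULT_TX_BUDGET are hypothetical, and
the fallback values for SOL_XDP (283) and XDP_MAX_TX_SKB_BUDGET (9) are
assumptions used only when the headers do not provide them; prefer the
definitions from <linux/if_xdp.h>. The range check mirrors the
[32, xs->tx->nentries] range listed above.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>

/* Use the uapi header when it is available on the build host. */
#if defined(__has_include)
#  if __has_include(<linux/if_xdp.h>)
#    include <linux/if_xdp.h>
#  endif
#endif

#ifndef SOL_XDP
#define SOL_XDP 283                 /* assumption: fallback for old libc headers */
#endif
#ifndef XDP_MAX_TX_SKB_BUDGET
#define XDP_MAX_TX_SKB_BUDGET 9     /* assumption: fallback for old uapi headers */
#endif

#define XSK_DEFAULT_TX_BUDGET 32u   /* hypothetical name for the default of 32 */

/* Mirror the range stated in the commit: [32, number of TX ring entries]. */
static bool xsk_tx_budget_in_range(unsigned int budget, unsigned int tx_nentries)
{
	return budget >= XSK_DEFAULT_TX_BUDGET && budget <= tx_nentries;
}

/*
 * Validate the budget locally, then pass it to the kernel.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int xsk_set_tx_budget(int xsk_fd, unsigned int budget,
			     unsigned int tx_nentries)
{
	if (!xsk_tx_budget_in_range(budget, tx_nentries)) {
		errno = EINVAL;
		return -1;
	}
	return setsockopt(xsk_fd, SOL_XDP, XDP_MAX_TX_SKB_BUDGET,
			  &budget, sizeof(budget));
}
```

For example, xsk_set_tx_budget(fd, 128, 2048) would request a budget of
128 on a socket whose TX ring has 2048 entries.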

The idea comes from real production workloads. We use a user-level
stack with xsk support to accelerate packet transmission and minimize
syscalls. When packets are aggregated, it's not hard to hit the upper
bound (namely, 32). Whenever the user-space stack gets -EAGAIN back
from sendto(), it loops and retries until all the expected descs from
the tx ring are sent out to the driver. Raising the
XDP_MAX_TX_SKB_BUDGET value results in fewer sendto() calls and higher
throughput/PPS.
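
The effect on the retry loop can be sketched with a simplified model,
assuming each sendto() hands at most `budget` descs to the driver and
returns -EAGAIN while descs remain. It ignores the sndbuf limit and
driver backpressure, and the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Estimated cost of draining a batch of pending TX descs. */
struct xsk_drain_estimate {
	size_t calls;   /* sendto() calls needed in total */
	size_t eagain;  /* calls among them expected to return -EAGAIN */
};

/*
 * Simplified model: every call consumes up to `budget` descs, and every
 * call except the last one leaves descs behind and reports -EAGAIN.
 */
static struct xsk_drain_estimate xsk_estimate_drain(size_t pending,
						    size_t budget)
{
	struct xsk_drain_estimate est = { 0, 0 };

	if (pending == 0 || budget == 0)
		return est;

	est.calls = (pending + budget - 1) / budget;  /* ceil(pending / budget) */
	est.eagain = est.calls - 1;
	return est;
}
```

Under this model, 128 pending descs take 4 calls (3 of them -EAGAIN)
with a budget of 32, and a single call with a budget of 128.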

Here is what I did in production, along with some numbers:
For one application I looked at recently, I suggested 128 as
max_tx_budget because, with the default configuration unchanged, I saw
two limitations: 1) XDP_MAX_TX_SKB_BUDGET, 2) the socket sndbuf, which
is 212992 as decided by net.core.wmem_default. As to
XDP_MAX_TX_SKB_BUDGET, I counted how many descs are transmitted to the
driver per sendto() call, based on patch [1], and then calculated the
probability of hitting the upper bound. Finally I chose 128 as a
suitable value because 1) it covers most of the cases, 2) a higher
number would not bring evident further improvement. After tweaking the
parameters, a stable improvement of around 4% in both PPS and
throughput, along with lower resource consumption, was observed with
strace -c -p xxx:
1) %time was decreased by 7.8%
2) error counter was decreased from 18367 to 572
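
For scale, the drop in the strace error counter can be expressed as a
relative reduction. This is only arithmetic on the two numbers reported
above, and the helper name is hypothetical.

```c
/* Relative reduction from `before` to `after`, in percent. */
static double percent_reduction(double before, double after)
{
	if (before <= 0.0)
		return 0.0;  /* avoid dividing by zero on an empty baseline */
	return (before - after) / before * 100.0;
}
```

percent_reduction(18367, 572) comes to roughly 97%, that is, about 97%
fewer failed syscalls over the sampled period.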

[1]: https://lore.kernel.org/all/20250619093641.70700-1-kerneljasonxing@gmail.com/

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://patch.msgid.link/20250704160138.48677-1-kerneljasonxing@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-07-10 14:48:29 +02:00
accounting
arch LoongArch: Replace __ASSEMBLY__ with __ASSEMBLER__ in headers 2025-06-26 20:07:10 +08:00
bootconfig tools/bootconfig: specify LDFLAGS as an argument to CC 2025-05-16 11:22:54 +09:00
bpf tools/resolve_btfids: Fix build when cross compiling kernel with clang. 2025-06-10 09:09:27 -07:00
build tools build: Don't show libbfd build status as it is opt-in 2025-04-10 10:45:30 -03:00
certs
cgroup
counter
crypto
debugging
firewire
firmware
gpio
hv tools: hv: Enable debug logs for hv_kvp_daemon 2025-05-23 16:30:55 +00:00
iio iio: normalize array sentinel style 2025-04-22 19:10:04 +01:00
include net: xsk: introduce XDP_MAX_TX_SKB_BUDGET setsockopt 2025-07-10 14:48:29 +02:00
kvm/kvm_stat
laptop
leds
lib libbpf: Fix possible use-after-free for externs 2025-06-25 12:28:58 -07:00
memory-model tools/memory-model/Documentation: Fix SRCU section in explanation.txt 2025-04-23 12:17:04 -07:00
mm
net tools: ynl: fix mixing ops and notifications on one socket 2025-06-19 08:37:39 -07:00
objtool Rust changes for v6.16 2025-06-04 21:18:37 -07:00
pcmcia
perf perf bench futex: Fix prctl include in musl libc 2025-06-17 18:29:42 -03:00
power cpupower: split unitdir from libdir in Makefile 2025-06-09 10:17:46 -06:00
rcu
sched_ext sched_ext: change the variable name for slice refill event 2025-04-18 17:25:39 -10:00
scripts tools headers: Update the syscall table with the kernel sources 2025-04-10 09:28:24 -07:00
sound
spi
testing vsock/test: fix test for null ptr deref when transport changes 2025-07-09 19:33:07 -07:00
thermal
time
tracing tracing tools updates for v6.16: 2025-05-29 20:59:52 -07:00
usb
verification tracing tooling updates for 6.15: 2025-03-27 17:03:01 -07:00
virtio
wmi
workqueue
writeback
Makefile tools/Makefile: Add ynl target 2025-04-28 17:18:48 -07:00