linux/net/core
Larysa Zaremba 8821e85775 xdp: produce a warning when calculated tailroom is negative
Many ethernet drivers report xdp Rx queue frag size as being the same as
DMA write size. However, the only user of this field, namely
bpf_xdp_frags_increase_tail(), clearly expects a truesize.

Such difference leads to unspecific memory corruption issues under certain
circumstances, e.g. in ixgbevf maximum DMA write size is 3 KB, so when
running xskxceiver's XDP_ADJUST_TAIL_GROW_MULTI_BUFF, 6K packet fully uses
all DMA-writable space in 2 buffers. This would be fine, if only
rxq->frag_size was properly set to 4K, but value of 3K results in a
negative tailroom, because there is a non-zero page offset.

We are supposed to return -EINVAL and be done with it in such case, but due
to tailroom being stored as an unsigned int, it is reported to be somewhere
near UINT_MAX, resulting in a tail being grown, even if the requested
offset is too much (it is around 2K in the abovementioned test). This later
leads to all kinds of unspecific calltraces.

[ 7340.337579] xskxceiver[1440]: segfault at 1da718 ip 00007f4161aeac9d sp 00007f41615a6a00 error 6
[ 7340.338040] xskxceiver[1441]: segfault at 7f410000000b ip 00000000004042b5 sp 00007f415bffecf0 error 4
[ 7340.338179]  in libc.so.6[61c9d,7f4161aaf000+160000]
[ 7340.339230]  in xskxceiver[42b5,400000+69000]
[ 7340.340300]  likely on CPU 6 (core 0, socket 6)
[ 7340.340302] Code: ff ff 01 e9 f4 fe ff ff 0f 1f 44 00 00 4c 39 f0 74 73 31 c0 ba 01 00 00 00 f0 0f b1 17 0f 85 ba 00 00 00 49 8b 87 88 00 00 00 <4c> 89 70 08 eb cc 0f 1f 44 00 00 48 8d bd f0 fe ff ff 89 85 ec fe
[ 7340.340888]  likely on CPU 3 (core 0, socket 3)
[ 7340.345088] Code: 00 00 00 ba 00 00 00 00 be 00 00 00 00 89 c7 e8 31 ca ff ff 89 45 ec 8b 45 ec 85 c0 78 07 b8 00 00 00 00 eb 46 e8 0b c8 ff ff <8b> 00 83 f8 69 74 24 e8 ff c7 ff ff 8b 00 83 f8 0b 74 18 e8 f3 c7
[ 7340.404334] Oops: general protection fault, probably for non-canonical address 0x6d255010bdffc: 0000 [#1] SMP NOPTI
[ 7340.405972] CPU: 7 UID: 0 PID: 1439 Comm: xskxceiver Not tainted 6.19.0-rc1+ #21 PREEMPT(lazy)
[ 7340.408006] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-5.fc42 04/01/2014
[ 7340.409716] RIP: 0010:lookup_swap_cgroup_id+0x44/0x80
[ 7340.410455] Code: 83 f8 1c 73 39 48 ba ff ff ff ff ff ff ff 03 48 8b 04 c5 20 55 fa bd 48 21 d1 48 89 ca 83 e1 01 48 d1 ea c1 e1 04 48 8d 04 90 <8b> 00 48 83 c4 10 d3 e8 c3 cc cc cc cc 31 c0 e9 98 b7 dd 00 48 89
[ 7340.412787] RSP: 0018:ffffcc5c04f7f6d0 EFLAGS: 00010202
[ 7340.413494] RAX: 0006d255010bdffc RBX: ffff891f477895a8 RCX: 0000000000000010
[ 7340.414431] RDX: 0001c17e3fffffff RSI: 00fa070000000000 RDI: 000382fc7fffffff
[ 7340.415354] RBP: 00fa070000000000 R08: ffffcc5c04f7f8f8 R09: ffffcc5c04f7f7d0
[ 7340.416283] R10: ffff891f4c1a7000 R11: ffffcc5c04f7f9c8 R12: ffffcc5c04f7f7d0
[ 7340.417218] R13: 03ffffffffffffff R14: 00fa06fffffffe00 R15: ffff891f47789500
[ 7340.418229] FS:  0000000000000000(0000) GS:ffff891ffdfaa000(0000) knlGS:0000000000000000
[ 7340.419489] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7340.420286] CR2: 00007f415bfffd58 CR3: 0000000103f03002 CR4: 0000000000772ef0
[ 7340.421237] PKRU: 55555554
[ 7340.421623] Call Trace:
[ 7340.421987]  <TASK>
[ 7340.422309]  ? softleaf_from_pte+0x77/0xa0
[ 7340.422855]  swap_pte_batch+0xa7/0x290
[ 7340.423363]  zap_nonpresent_ptes.constprop.0.isra.0+0xd1/0x270
[ 7340.424102]  zap_pte_range+0x281/0x580
[ 7340.424607]  zap_pmd_range.isra.0+0xc9/0x240
[ 7340.425177]  unmap_page_range+0x24d/0x420
[ 7340.425714]  unmap_vmas+0xa1/0x180
[ 7340.426185]  exit_mmap+0xe1/0x3b0
[ 7340.426644]  __mmput+0x41/0x150
[ 7340.427098]  exit_mm+0xb1/0x110
[ 7340.427539]  do_exit+0x1b2/0x460
[ 7340.427992]  do_group_exit+0x2d/0xc0
[ 7340.428477]  get_signal+0x79d/0x7e0
[ 7340.428957]  arch_do_signal_or_restart+0x34/0x100
[ 7340.429571]  exit_to_user_mode_loop+0x8e/0x4c0
[ 7340.430159]  do_syscall_64+0x188/0x6b0
[ 7340.430672]  ? __do_sys_clone3+0xd9/0x120
[ 7340.431212]  ? switch_fpu_return+0x4e/0xd0
[ 7340.431761]  ? arch_exit_to_user_mode_prepare.isra.0+0xa1/0xc0
[ 7340.432498]  ? do_syscall_64+0xbb/0x6b0
[ 7340.433015]  ? __handle_mm_fault+0x445/0x690
[ 7340.433582]  ? count_memcg_events+0xd6/0x210
[ 7340.434151]  ? handle_mm_fault+0x212/0x340
[ 7340.434697]  ? do_user_addr_fault+0x2b4/0x7b0
[ 7340.435271]  ? clear_bhb_loop+0x30/0x80
[ 7340.435788]  ? clear_bhb_loop+0x30/0x80
[ 7340.436299]  ? clear_bhb_loop+0x30/0x80
[ 7340.436812]  ? clear_bhb_loop+0x30/0x80
[ 7340.437323]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 7340.437973] RIP: 0033:0x7f4161b14169
[ 7340.438468] Code: Unable to access opcode bytes at 0x7f4161b1413f.
[ 7340.439242] RSP: 002b:00007ffc6ebfa770 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[ 7340.440173] RAX: fffffffffffffe00 RBX: 00000000000005a1 RCX: 00007f4161b14169
[ 7340.441061] RDX: 00000000000005a1 RSI: 0000000000000109 RDI: 00007f415bfff990
[ 7340.441943] RBP: 00007ffc6ebfa7a0 R08: 0000000000000000 R09: 00000000ffffffff
[ 7340.442824] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 7340.443707] R13: 0000000000000000 R14: 00007f415bfff990 R15: 00007f415bfff6c0
[ 7340.444586]  </TASK>
[ 7340.444922] Modules linked in: rfkill intel_rapl_msr intel_rapl_common intel_uncore_frequency_common skx_edac_common nfit libnvdimm kvm_intel vfat fat kvm snd_pcm irqbypass rapl iTCO_wdt snd_timer intel_pmc_bxt iTCO_vendor_support snd ixgbevf virtio_net soundcore i2c_i801 pcspkr libeth_xdp net_failover i2c_smbus lpc_ich failover libeth virtio_balloon joydev 9p fuse loop zram lz4hc_compress lz4_compress 9pnet_virtio 9pnet netfs ghash_clmulni_intel serio_raw qemu_fw_cfg
[ 7340.449650] ---[ end trace 0000000000000000 ]---

The issue can be fixed in all in-tree drivers, but we cannot just trust OOT
drivers to not do this. Therefore, make tailroom a signed int and produce a
warning when it is negative to prevent such mistakes in the future.

Fixes: bf25146a55 ("bpf: add frags support to the bpf_xdp_adjust_tail() API")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://patch.msgid.link/20260305111253.2317394-10-larysa.zaremba@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05 08:02:05 -08:00
..
bpf_sk_storage.c Convert 'alloc_flex' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
datagram.c net: datagram: introduce datagram_poll_queue for custom receive queues 2025-10-23 15:46:04 +02:00
dev.c net: Provide a PREEMPT_RT specific check for netdev_queue::_xmit_lock 2026-03-05 12:14:21 +01:00
dev.h net: add queue config validation callback 2026-01-23 11:49:02 -08:00
dev_addr_lists.c net: s/dev_pre_changeaddr_notify/netif_pre_changeaddr_notify/ 2025-07-18 17:27:47 -07:00
dev_addr_lists_test.c net: dev_addr_lists: move locking out of init/exit in kunit 2024-04-15 10:26:35 +01:00
dev_api.c net: define an enum for the napi threaded state 2025-07-24 18:34:55 -07:00
dev_ioctl.c net: remove legacy way to get/set HW timestamp config 2026-01-20 18:21:27 -08:00
devmem.c net: devmem: use READ_ONCE/WRITE_ONCE on binding->dev 2026-03-04 17:59:27 -08:00
devmem.h net: inline net_is_devmem_iov() 2026-01-25 13:18:53 -08:00
drop_monitor.c Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
dst.c treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
dst_cache.c net: dst: annotate data-races around dst->obsolete 2025-07-02 14:32:29 -07:00
failover.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
fib_notifier.c net: do not acquire rtnl in fib_seq_sum() 2024-10-11 15:35:05 -07:00
fib_rules.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-04-17 12:26:50 -07:00
filter.c xdp: produce a warning when calculated tailroom is negative 2026-03-05 08:02:05 -08:00
flow_dissector.c net: remove '__' from __skb_flow_get_ports() 2025-02-24 14:27:53 -08:00
flow_offload.c Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
gen_estimator.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
gen_stats.c
gro.c net/ipv6: Drop HBH for BIG TCP on RX side 2026-02-06 20:50:12 -08:00
gro_cells.c treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
gso.c net: introduce struct net_hotdata 2024-03-07 21:12:41 -08:00
hotdata.c net: add net.core.qdisc_max_burst 2026-01-13 10:12:11 +01:00
hwbm.c
ieee8021q_helpers.c net: ieee8021q: fix insufficient table-size assertion 2025-07-01 12:55:49 +02:00
link_watch.c linkwatch: use __dev_put() in callers to prevent UAF 2026-02-02 16:59:18 -08:00
lock_debug.c netdev: fix the locking for netdev notifications 2025-04-17 18:55:14 -07:00
lwt_bpf.c ipv4: Convert ->flowi4_tos to dscp_t. 2025-08-26 17:34:31 -07:00
lwtunnel.c inet: Remove rtnl_is_held arg of lwtunnel_valid_encap_type(_attr)?(). 2025-05-20 19:18:24 -07:00
Makefile net: get rid of net/core/request_sock.c 2026-02-05 09:23:05 -08:00
mp_dmabuf_devmem.h memory-provider: dmabuf devmem memory provider 2024-09-11 20:44:31 -07:00
neighbour.c treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
net-procfs.c net: add proper RCU protection to /proc/net/ptype 2026-02-03 19:20:30 -08:00
net-sysfs.c net: Keep ignoring isolated cpuset change 2026-02-03 15:23:33 +01:00
net-sysfs.h net: remove RTNL use for /proc/sys/net/core/rps_default_mask 2025-07-07 18:42:12 -07:00
net-traces.c move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
net_namespace.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
net_test.c pfcp: always set pfcp metadata 2024-04-01 10:49:28 +01:00
netclassid_cgroup.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
netdev-genl-gen.c Revert "Merge branch 'netkit-support-for-io_uring-zero-copy-and-af_xdp'" 2026-01-20 18:06:01 -08:00
netdev-genl-gen.h Revert "Merge branch 'netkit-support-for-io_uring-zero-copy-and-af_xdp'" 2026-01-20 18:06:01 -08:00
netdev-genl.c Revert "Merge branch 'netkit-support-for-io_uring-zero-copy-and-af_xdp'" 2026-01-20 18:06:01 -08:00
netdev_config.c net: add queue config validation callback 2026-01-23 11:49:02 -08:00
netdev_queues.c Revert "Merge branch 'netkit-support-for-io_uring-zero-copy-and-af_xdp'" 2026-01-20 18:06:01 -08:00
netdev_rx_queue.c net: add queue config validation callback 2026-01-23 11:49:02 -08:00
netevent.c
netmem_priv.h netmem: replace __netmem_clear_lsb() with netmem_to_nmdesc() 2025-10-14 13:37:26 +02:00
netpoll.c net: Provide a PREEMPT_RT specific check for netdev_queue::_xmit_lock 2026-03-05 12:14:21 +01:00
netprio_cgroup.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
of_net.c net: Explicitly include correct DT includes 2023-07-27 20:33:16 -07:00
page_pool.c net: page_pool: sanitise allocation order 2025-12-02 11:08:39 -08:00
page_pool_priv.h net: page_pool: don't try to stash the napi id 2025-01-27 14:37:41 -08:00
page_pool_user.c net: use napi_id_valid helper 2025-02-17 16:43:04 -08:00
pktgen.c kernel.h: drop hex.h and update all hex.h users 2026-01-20 19:44:19 -08:00
ptp_classifier.c
rtnetlink.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
scm.c treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
secure_seq.c tcp: secure_seq: add back ports to TS offset 2026-03-04 17:44:35 -08:00
selftests.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
skb_fault_injection.c net: Implement fault injection forcing skb reallocation 2024-11-12 12:05:33 +01:00
skbuff.c net: Drop the lock in skb_may_tx_timestamp() 2026-02-24 11:27:29 +01:00
skmsg.c net: annotate data-races around sk->sk_{data_ready,write_space} 2026-02-26 19:23:03 -08:00
sock.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
sock_destructor.h
sock_diag.c treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
sock_map.c treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
sock_reuseport.c treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
stream.c net: stream: add description for sk_stream_write_space() 2025-07-18 16:57:21 -07:00
sysctl_net_core.c net: include <linux/hex.h> from sysctl_net_core.c 2026-01-26 16:35:53 -08:00
timestamping.c net: Add the possibility to support a selected hwtstamp in netdevice 2024-12-16 12:51:40 +00:00
tso.c move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
utils.c kernel.h: drop hex.h and update all hex.h users 2026-01-20 19:44:19 -08:00
xdp.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00