linux/include
Koichiro Den 7f083faf59 net: sched: avoid qdisc_reset_all_tx_gt() vs dequeue race for lockless qdiscs
When shrinking the number of real tx queues,
netif_set_real_num_tx_queues() calls qdisc_reset_all_tx_gt() to flush
qdiscs for queues which will no longer be used.

qdisc_reset_all_tx_gt() currently serializes qdisc_reset() with
qdisc_lock(). However, for lockless qdiscs, the dequeue path is
serialized by qdisc_run_begin/end() using qdisc->seqlock instead, so
qdisc_reset() can run concurrently with __qdisc_run() and free skbs
while they are still being dequeued, leading to UAF.

This can easily be reproduced on e.g. virtio-net by imposing heavy
traffic while frequently changing the number of queue pairs:

  iperf3 -ub0 -c $peer -t 0 &
  while :; do
    ethtool -L eth0 combined 1
    ethtool -L eth0 combined 2
  done

With KASAN enabled, this leads to reports like:

  BUG: KASAN: slab-use-after-free in __qdisc_run+0x133f/0x1760
  ...
  Call Trace:
   <TASK>
   ...
   __qdisc_run+0x133f/0x1760
   __dev_queue_xmit+0x248f/0x3550
   ip_finish_output2+0xa42/0x2110
   ip_output+0x1a7/0x410
   ip_send_skb+0x2e6/0x480
   udp_send_skb+0xb0a/0x1590
   udp_sendmsg+0x13c9/0x1fc0
   ...
   </TASK>

  Allocated by task 1270 on cpu 5 at 44.558414s:
   ...
   alloc_skb_with_frags+0x84/0x7c0
   sock_alloc_send_pskb+0x69a/0x830
   __ip_append_data+0x1b86/0x48c0
   ip_make_skb+0x1e8/0x2b0
   udp_sendmsg+0x13a6/0x1fc0
   ...

  Freed by task 1306 on cpu 3 at 44.558445s:
   ...
   kmem_cache_free+0x117/0x5e0
   pfifo_fast_reset+0x14d/0x580
   qdisc_reset+0x9e/0x5f0
   netif_set_real_num_tx_queues+0x303/0x840
   virtnet_set_channels+0x1bf/0x260 [virtio_net]
   ethnl_set_channels+0x684/0xae0
   ethnl_default_set_doit+0x31a/0x890
   ...

Serialize qdisc_reset_all_tx_gt() against the lockless dequeue path by
taking qdisc->seqlock for TCQ_F_NOLOCK qdiscs, matching the
serialization model already used by dev_reset_queue().

Additionally clear QDISC_STATE_NON_EMPTY after reset so the qdisc state
reflects an empty queue, avoiding needless re-scheduling.

Fixes: 6b3ba9146f ("net: sched: allow qdiscs to handle locking")
Signed-off-by: Koichiro Den <den@valinux.co.jp>
Link: https://patch.msgid.link/20260228145307.3955532-1-den@valinux.co.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-04 17:43:45 -08:00
..
acpi mailbox: platform and core updates 2026-02-14 11:13:32 -08:00
asm-generic hyperv-next for v7.0 2026-02-20 08:48:31 -08:00
clocksource
crypto Networking changes for 7.0 2026-02-11 19:31:52 -08:00
cxl
drm drm/pagemap: pass pagemap_addr by reference 2026-02-17 19:39:44 -05:00
dt-bindings phy-for-7.0 2026-02-17 11:40:04 -08:00
hyperv hyperv-next for v7.0 2026-02-20 08:48:31 -08:00
keys keys/trusted_keys: establish PKWM as a trusted source 2026-01-30 09:27:26 +05:30
kunit treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
kvm KVM: arm64: Use standard seq_file iterator for vgic-debug debugfs 2026-02-02 10:59:25 +00:00
linux indirect_call_wrapper: do not reevaluate function pointer 2026-03-03 12:41:29 +01:00
math-emu
media [GIT PULL for v7.0] media updates 2026-02-11 12:20:25 -08:00
memory
misc
net net: sched: avoid qdisc_reset_all_tx_gt() vs dequeue race for lockless qdiscs 2026-03-04 17:43:45 -08:00
pcmcia
ras
rdma RDMA v7.0 merge window 2026-02-12 17:05:20 -08:00
rv rv: Fix multiple definition of __pcpu_unique_da_mon_this 2026-02-20 13:12:00 +01:00
scsi SCSI misc on 20260212 2026-02-12 15:43:02 -08:00
soc
sound ASoC: Updates for v7.0 2026-02-09 17:39:11 +01:00
target
trace vfs-7.0-rc1.misc.2 2026-02-16 13:00:36 -08:00
uapi drm next fixes for 7.0-rc1 2026-02-20 15:36:38 -08:00
ufs scsi: ufs: host: mediatek: Require CONFIG_PM 2026-02-03 22:28:44 -05:00
vdso
video
xen Partial revert "x86/xen: fix balloon target initialization for PVH dom0" 2026-02-02 07:31:22 +01:00
Kbuild