linux/net
YiFei Zhu 1a86a1f7d8 net: Fix rcu_tasks stall in threaded busypoll
I was debugging a NIC driver when I noticed that when I enable
threaded busypoll, bpftrace hangs when starting up. dmesg showed:

  rcu_tasks_wait_gp: rcu_tasks grace period number 85 (since boot) is 10658 jiffies old.
  rcu_tasks_wait_gp: rcu_tasks grace period number 85 (since boot) is 40793 jiffies old.
  rcu_tasks_wait_gp: rcu_tasks grace period number 85 (since boot) is 131273 jiffies old.
  rcu_tasks_wait_gp: rcu_tasks grace period number 85 (since boot) is 402058 jiffies old.
  INFO: rcu_tasks detected stalls on tasks:
  00000000769f52cd: .N nvcsw: 2/2 holdout: 1 idle_cpu: -1/64
  task:napi/eth2-8265  state:R  running task     stack:0     pid:48300 tgid:48300 ppid:2      task_flags:0x208040 flags:0x00004000
  Call Trace:
   <TASK>
   ? napi_threaded_poll_loop+0x27c/0x2c0
   ? __pfx_napi_threaded_poll+0x10/0x10
   ? napi_threaded_poll+0x26/0x80
   ? kthread+0xfa/0x240
   ? __pfx_kthread+0x10/0x10
   ? ret_from_fork+0x31/0x50
   ? __pfx_kthread+0x10/0x10
   ? ret_from_fork_asm+0x1a/0x30
   </TASK>

The cause is that in threaded busypoll, the main loop is in
napi_threaded_poll rather than napi_threaded_poll_loop, where the
latter rarely iterates more than once within its loop. For
rcu_softirq_qs_periodic inside napi_threaded_poll_loop to report its
qs state, the last_qs must be 100ms behind, and this can't happen
because napi_threaded_poll_loop rarely iterates in threaded busypoll,
and each time napi_threaded_poll_loop is called last_qs is reset to
latest jiffies.

This patch changes so that in threaded busypoll, last_qs is saved
in the outer napi_threaded_poll, and whether busy_poll_last_qs
is NULL indicates whether napi_threaded_poll_loop is called for
busypoll. This way last_qs would not reset to latest jiffies on
each invocation of napi_threaded_poll_loop.

Fixes: c18d4b190a ("net: Extend NAPI threaded polling to allow kthread based busy polling")
Cc: stable@vger.kernel.org
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Link: https://patch.msgid.link/20260227221937.1060857-1-zhuyifei@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-03 13:44:28 +01:00
..
6lowpan net: replace ND_PRINTK with dynamic debug 2025-07-10 15:27:32 -07:00
9p Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
802 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
8021q Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
appletalk Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
atm atm: lec: fix null-ptr-deref in lec_arp_clear_vccs 2026-02-28 09:33:26 -08:00
ax25 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
batman-adv Here is a batman-adv bugfix: 2026-02-26 19:15:09 -08:00
bluetooth Including fixes from IPsec, Bluetooth and netfilter 2026-02-26 08:00:13 -08:00
bpf Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
bridge bridge: Check relevant per-VLAN options in VLAN range grouping 2026-02-26 19:24:29 -08:00
caif Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
can Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
ceph Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
core net: Fix rcu_tasks stall in threaded busypoll 2026-03-03 13:44:28 +01:00
dcb Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
devlink Convert 'alloc_flex' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
dns_resolver net/dns_resolver: use credential guards in dns_query() 2025-11-04 12:36:51 +01:00
dsa Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
ethernet net: optimize eth_type_trans() vs CONFIG_STACKPROTECTOR_STRONG=y 2025-11-24 19:27:31 -08:00
ethtool Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
handshake treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
hsr Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
ieee802154 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
ife
ipv4 tcp: give up on stronger sk_rcvbuf checks (for now) 2026-02-28 07:55:39 -08:00
ipv6 inet: annotate data-races around isk->inet_num 2026-02-27 17:16:59 -08:00
iucv Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
kcm kcm: fix zero-frag skb in frag_list on partial sendmsg error 2026-02-23 17:26:55 -08:00
key Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
l2tp Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
l3mdev net: fib_rules: Fix iif / oif matching on L3 master device 2025-04-15 17:54:56 -07:00
lapb treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
llc treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
mac80211 Including fixes from IPsec, Bluetooth and netfilter 2026-02-26 08:00:13 -08:00
mac802154 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
mctp Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
mpls Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
mptcp Including fixes from IPsec, Bluetooth and netfilter 2026-02-26 08:00:13 -08:00
ncsi Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
netfilter Including fixes from IPsec, Bluetooth and netfilter 2026-02-26 08:00:13 -08:00
netlabel Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
netlink Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
netrom Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
nfc Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
nsh
openvswitch Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
packet Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
phonet treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
psample treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
psp Including fixes from IPsec, Bluetooth and netfilter 2026-02-26 08:00:13 -08:00
qrtr Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
rds net/rds: Fix circular locking dependency in rds_tcp_tune 2026-03-03 12:57:06 +01:00
rfkill Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
rose Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
rxrpc Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
sched net/sched: Only allow act_ct to bind to clsact/ingress qdiscs and shared blocks 2026-02-27 19:06:21 -08:00
sctp Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
shaper Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
smc Including fixes from IPsec, Bluetooth and netfilter 2026-02-26 08:00:13 -08:00
strparser Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-11-13 12:35:38 -08:00
sunrpc Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
switchdev treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
tipc Including fixes from IPsec, Bluetooth and netfilter 2026-02-26 08:00:13 -08:00
tls Including fixes from IPsec, Bluetooth and netfilter 2026-02-26 08:00:13 -08:00
unix net: annotate data-races around sk->sk_{data_ready,write_space} 2026-02-26 19:23:03 -08:00
vmw_vsock Including fixes from IPsec, Bluetooth and netfilter 2026-02-26 08:00:13 -08:00
wireless Including fixes from IPsec, Bluetooth and netfilter 2026-02-26 08:00:13 -08:00
x25 treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
xdp xsk: Fix zero-copy AF_XDP fragment drop 2026-02-28 08:55:11 -08:00
xfrm Including fixes from IPsec, Bluetooth and netfilter 2026-02-26 08:00:13 -08:00
compat.c socket: Unify getsockname and getpeername implementation 2025-11-26 13:45:23 -07:00
devres.c
Kconfig net: Kconfig: discourage drop_monitor enablement 2025-10-17 16:29:26 -07:00
Kconfig.debug
Makefile psp: base PSP device support 2025-09-18 12:32:06 +02:00
socket.c net: Drop the lock in skb_may_tx_timestamp() 2026-02-24 11:27:29 +01:00
sysctl_net.c