linux/net/ipv4
Jakub Kicinski 026dfef287 tcp: give up on stronger sk_rcvbuf checks (for now)
We hit another corner case which leads to TcpExtTCPRcvQDrop

Connections which send RPCs in the 20-80kB range over loopback
experience spurious drops. The exact conditions for most of
the drops I investigated are that:
 - socket exchanged >1MB of data so it's not completely fresh
 - rcvbuf is around 128kB (default, hasn't grown)
 - there is ~60kB of data in rcvq
 - skb > 64kB arrives

The sum of skb->len (!) of both of the skbs (the one already
in rcvq and the arriving one) is larger than rwnd.
My suspicion is that this happens because __tcp_select_window()
rounds the rwnd up to (1 << wscale) if less than half of
the rwnd has been consumed.
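
The rounding suspected above can be illustrated with a small standalone
sketch. This is not the kernel's __tcp_select_window(); the helper names
round_up_to_wscale() and rounding_slack() are hypothetical, and the code
only shows how rounding a window up to a multiple of (1 << wscale) can
make the advertised value larger than the unrounded one.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: round win up to the next multiple of (1 << wscale).
 * Illustration only, not the kernel's implementation.
 */
uint32_t round_up_to_wscale(uint32_t win, unsigned int wscale)
{
	uint32_t unit;

	assert(wscale <= 14);	/* RFC 7323 caps the shift count at 14 */
	unit = (uint32_t)1 << wscale;
	return (win + unit - 1) & ~(unit - 1);
}

/*
 * Hypothetical helper: number of extra bytes the rounding adds on top of
 * the unrounded window. Always less than (1 << wscale).
 */
uint32_t rounding_slack(uint32_t win, unsigned int wscale)
{
	return round_up_to_wscale(win, wscale) - win;
}
```

For instance, with an assumed wscale of 7, round_up_to_wscale(60001, 7)
yields 60032, so the rounded value exceeds the unrounded one by 31 bytes;
in general the difference is at most (1 << wscale) - 1 bytes.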

Eric suggests that given the number of Fixes we already have
pointing to 1d2fbaad7c it's probably time to give up on it,
until a bigger revamp of rmem management.

Also while we could risk tweaking the rwnd math, there are other
drops on workloads I investigated, after the commit in question,
not explained by this phenomenon.

Suggested-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/20260225122355.585fd57b@kernel.org
Fixes: 1d2fbaad7c ("tcp: stronger sk_rcvbuf checks")
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260227003359.2391017-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-28 07:55:39 -08:00
..
netfilter ipv4: use dst4_mtu() instead of dst_mtu() 2026-02-02 17:49:29 -08:00
af_inet.c Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
ah4.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
arp.c kernel.h: drop hex.h and update all hex.h users 2026-01-20 19:44:19 -08:00
bpf_tcp_ca.c tcp: Pass flags to __tcp_send_ack 2025-03-17 13:56:38 +00:00
cipso_ipv4.c Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses 2026-02-22 08:26:33 -08:00
datagram.c net: Convert proto callbacks from sockaddr to sockaddr_unsized 2025-11-04 19:10:33 -08:00
devinet.c Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses 2026-02-22 08:26:33 -08:00
esp4.c tcp: Don't pass hashinfo to socket lookup helpers. 2025-08-25 17:53:35 -07:00
esp4_offload.c xfrm: Fix inner mode lookup in tunnel mode GSO segmentation 2025-12-04 09:54:53 +01:00
fib_frontend.c ipv4: Convert ->flowi4_tos to dscp_t. 2025-08-26 17:34:31 -07:00
fib_lookup.h ipv4: fib: Annotate access to struct fib_alias.fa_state. 2026-01-28 19:33:07 -08:00
fib_notifier.c net: do not acquire rtnl in fib_seq_sum() 2024-10-11 15:35:05 -07:00
fib_rules.c ipv4: Convert ->flowi4_tos to dscp_t. 2025-08-26 17:34:31 -07:00
fib_semantics.c Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
fib_trie.c ipv4: fib: Annotate access to struct fib_alias.fa_state. 2026-01-28 19:33:07 -08:00
fou_bpf.c ip_tunnel: convert __be16 tunnel flags to bitmaps 2024-04-01 10:49:28 +01:00
fou_core.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
fou_nl.c fou: Don't allow 0 for FOU_ATTR_IPPROTO. 2026-01-17 16:00:24 -08:00
fou_nl.h tools: ynl-gen: add regeneration comment 2025-11-25 19:20:42 -08:00
gre_demux.c net: ip_gre: Fix spelling mistake "demultiplexor" -> "demultiplexer" 2025-04-24 18:20:40 -07:00
gre_offload.c net: gro: rename skb_gro_header_hard() 2024-03-05 13:30:11 +01:00
icmp.c ipv4: icmp: icmpv4_xrlim_allow() optimization if net.ipv4.icmp_ratelimit is zero 2026-02-18 16:46:36 -08:00
igmp.c treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
igmp_internal.h netlink: support dumping IPv4 multicast addresses 2025-02-11 11:26:53 +01:00
inet_connection_sock.c tcp: move __reqsk_free() out of line 2026-02-05 09:23:06 -08:00
inet_diag.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
inet_fragment.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
inet_hashtables.c inet: annotate data-races around isk->inet_num 2026-02-27 17:16:59 -08:00
inet_timewait_sock.c inet: Avoid ehash lookup race in inet_twsk_hashdance_schedule() 2025-10-17 16:08:43 -07:00
inetpeer.c inetpeer: use EXPORT_IPV6_MOD[_GPL]() 2025-02-14 13:09:39 -08:00
ip_forward.c
ip_fragment.c inet: frags: flush pending skbs in fqdir_pre_exit() 2025-12-10 01:15:27 -08:00
ip_gre.c ipv4: ip_gre: make ipgre_header() robust 2026-01-10 12:06:22 -08:00
ip_input.c net: ipv4: Remove extern udp_v4_early_demux()/tcp_v4_early_demux() in .c files 2025-10-29 17:05:30 -07:00
ip_options.c net: Switch to skb_dstref_steal/skb_dstref_restore for ip_route_input callers 2025-08-19 17:54:35 -07:00
ip_output.c ipv4: use dst4_mtu() instead of dst_mtu() 2026-02-02 17:49:29 -08:00
ip_sockglue.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
ip_tunnel.c ipv4: ip_tunnel: spread netdev_lockdep_set_classes() 2026-01-08 18:02:35 -08:00
ip_tunnel_core.c tunnels: reset the GSO metadata before reusing the skb 2025-09-09 13:03:33 +02:00
ip_vti.c ipv4: adopt dst_dev, skb_dst_dev and skb_dst_dev_net[_rcu] 2025-07-02 14:32:30 -07:00
ipcomp.c xfrm: delete x->tunnel as we delete x 2025-07-08 13:28:27 +02:00
ipconfig.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
ipip.c netfilter: flowtable: Add IPIP rx sw acceleration 2025-11-28 00:00:38 +00:00
ipmr.c ipv4: use dst4_mtu() instead of dst_mtu() 2026-02-02 17:49:29 -08:00
ipmr_base.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
Kconfig tcp: Convert tcp-md5 to use MD5 library instead of crypto_ahash 2025-10-17 17:14:54 -07:00
Makefile tcp: move tcp_rate_check_app_limited() to tcp.c 2026-01-22 18:28:48 -08:00
metrics.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
netfilter.c ipv4: Convert ->flowi4_tos to dscp_t. 2025-08-26 17:34:31 -07:00
netlink.c
nexthop.c Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses 2026-02-22 08:26:33 -08:00
ping.c ping: annotate data-races in ping_lookup() 2026-02-17 17:11:08 -08:00
proc.c ipv4: snmp: do not use SNMP_MIB_SENTINEL anymore 2025-09-08 18:06:20 -07:00
protocol.c
raw.c ipv4/inet_sock.h: Avoid thousands of -Wflex-array-member-not-at-end warnings 2026-01-06 17:02:52 -08:00
raw_diag.c inet_diag: change inet_diag_bc_sk() first argument 2025-08-29 19:29:24 -07:00
route.c Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
syncookies.c tcp: fix potential race in tcp_v6_syn_recv_sock() 2026-02-19 14:02:19 -08:00
sysctl_net_ipv4.c tcp: accecn: enable AccECN 2026-02-03 15:13:25 +01:00
tcp.c net: annotate data-races around sk->sk_{data_ready,write_space} 2026-02-26 19:23:03 -08:00
tcp_ao.c treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
tcp_bbr.c tcp: Add new args for cong_control in tcp_congestion_ops 2024-05-02 16:26:56 -07:00
tcp_bic.c
tcp_bpf.c net: annotate data-races around sk->sk_{data_ready,write_space} 2026-02-26 19:23:03 -08:00
tcp_cdg.c treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
tcp_cong.c tcp: ECT_1_NEGOTIATION and NEEDS_ACCECN identifiers 2026-02-03 15:13:24 +01:00
tcp_cubic.c tcp_cubic: fix incorrect HyStart round start detection 2025-01-20 12:26:41 +00:00
tcp_dctcp.c tcp: helpers for ECN mode handling 2025-03-17 13:54:11 +00:00
tcp_dctcp.h tcp: Pass flags to __tcp_send_ack 2025-03-17 13:56:38 +00:00
tcp_diag.c inet: annotate data-races around isk->inet_num 2026-02-27 17:16:59 -08:00
tcp_fastopen.c Including fixes from IPsec, Bluetooth and netfilter 2026-02-26 08:00:13 -08:00
tcp_highspeed.c
tcp_htcp.c tcp: Use clamp() in htcp_alpha_update() 2024-08-06 12:16:25 -07:00
tcp_hybla.c
tcp_illinois.c
tcp_input.c tcp: give up on stronger sk_rcvbuf checks (for now) 2026-02-28 07:55:39 -08:00
tcp_ipv4.c Including fixes from IPsec, Bluetooth and netfilter 2026-02-26 08:00:13 -08:00
tcp_lp.c net: tcp_lp: fix kernel-doc warnings and update outdated reference links 2025-10-28 17:52:44 -07:00
tcp_metrics.c treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
tcp_minisocks.c net: annotate data-races around sk->sk_{data_ready,write_space} 2026-02-26 19:23:03 -08:00
tcp_nv.c
tcp_offload.c gro: flushing when CWR is set negatively affects AccECN 2026-02-03 15:13:24 +01:00
tcp_output.c tcp: move tcp_rbtree_insert() to tcp_output.c 2026-02-04 20:36:50 -08:00
tcp_plb.c
tcp_recovery.c tcp: move tcp_rack_advance() to tcp_input.c 2026-01-28 19:31:51 -08:00
tcp_scalable.c
tcp_sigpool.c compiler-context-analysis: Change __cond_acquires to take return value 2026-01-05 16:43:29 +01:00
tcp_timer.c tcp: accecn: unset ECT if receive or send ACE=0 in AccECN negotiaion 2026-02-03 15:13:24 +01:00
tcp_ulp.c
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c
tunnel4.c net: fill in MODULE_DESCRIPTION()s for ipv4 modules 2024-02-09 14:12:02 -08:00
udp.c udp: Unhash auto-bound connected sk from 4-tuple hash table when disconnected. 2026-02-28 07:46:24 -08:00
udp_bpf.c net: annotate data-races around sk->sk_{data_ready,write_space} 2026-02-26 19:23:03 -08:00
udp_diag.c inet_diag: change inet_diag_bc_sk() first argument 2025-08-29 19:29:24 -07:00
udp_impl.h udp: move udp_memory_allocated into net_aligned_data 2025-07-02 14:22:02 -07:00
udp_offload.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2026-01-29 17:28:54 -08:00
udp_tunnel_core.c net: Convert proto_ops connect() callbacks to use sockaddr_unsized 2025-11-04 19:10:32 -08:00
udp_tunnel_nic.c Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses 2026-02-22 08:26:33 -08:00
udp_tunnel_stub.c
udplite.c udplite: Fix null-ptr-deref in __udp_enqueue_schedule_skb(). 2026-02-20 16:14:10 -08:00
xfrm4_input.c xfrm: Set transport header to fix UDP GRO handling 2025-07-02 09:19:56 +02:00
xfrm4_output.c ipv4: adopt dst_dev, skb_dst_dev and skb_dst_dev_net[_rcu] 2025-07-02 14:32:30 -07:00
xfrm4_policy.c ipv4: Convert ->flowi4_tos to dscp_t. 2025-08-26 17:34:31 -07:00
xfrm4_protocol.c ipv4: Convert ip_route_input_noref() to dscp_t. 2024-10-03 16:21:21 -07:00
xfrm4_state.c
xfrm4_tunnel.c net: fill in MODULE_DESCRIPTION()s for ipv4 modules 2024-02-09 14:12:02 -08:00