Linux kernel source tree
Find a file
Philo Lu 78c91ae2c6 ipv4/udp: Add 4-tuple hash for connected socket
Currently, the udp_table has two hash table, the port hash and portaddr
hash. Usually for UDP servers, all sockets have the same local port and
addr, so they are all on the same hash slot within a reuseport group.

In some applications, UDP servers use connect() to manage clients. In
particular, when firstly receiving from an unseen 4 tuple, a new socket
is created and connect()ed to the remote addr:port, and then the fd is
used exclusively by the client.

Once there are connected sks in a reuseport group, udp has to score all
sks in the same hash2 slot to find the best match. This could be
inefficient with a large number of connections, resulting in high
softirq overhead.

To solve the problem, this patch implement 4-tuple hash for connected
udp sockets. During connect(), hash4 slot is updated, as well as a
corresponding counter, hash4_cnt, in hslot2. In __udp4_lib_lookup(),
hslot4 will be searched firstly if the counter is non-zero. Otherwise,
hslot2 is used like before. Note that only connected sockets enter this
hash4 path, while un-connected ones are not affected.

hlist_nulls is used for hash4, because we probably move to another hslot
wrongly when lookup with concurrent rehash. Then we check nulls at the
list end to see if we should restart lookup. Because udp does not use
SLAB_TYPESAFE_BY_RCU, we don't need to touch sk_refcnt when lookup.

Stress test results (with 1 cpu fully used) are shown below, in pps:
(1) _un-connected_ socket as server
    [a] w/o hash4: 1,825176
    [b] w/  hash4: 1,831750 (+0.36%)

(2) 500 _connected_ sockets as server
    [c] w/o hash4:   290860 (only 16% of [a])
    [d] w/  hash4: 1,889658 (+3.1% compared with [b])

With hash4, compute_score is skipped when lookup, so [d] is slightly
better than [b].

Co-developed-by: Cambda Zhu <cambda@linux.alibaba.com>
Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com>
Co-developed-by: Fred Chen <fred.cc@alibaba-inc.com>
Signed-off-by: Fred Chen <fred.cc@alibaba-inc.com>
Co-developed-by: Yubing Qiu <yubing.qiuyubing@alibaba-inc.com>
Signed-off-by: Yubing Qiu <yubing.qiuyubing@alibaba-inc.com>
Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-18 11:56:21 +00:00
arch Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-11-14 11:29:15 -08:00
block block-6.12-20241101 2024-11-01 13:41:55 -10:00
certs sign-file,extract-cert: use pkcs11 provider for OPENSSL MAJOR >= 3 2024-09-20 19:52:48 +03:00
crypto This push fixes the following issues: 2024-10-16 08:42:54 -07:00
Documentation dt-bindings: net: dsa: microchip,ksz: Drop undocumented "id" 2024-11-15 14:29:28 -08:00
drivers virtio_net: xdp_features add NETDEV_XDP_ACT_XSK_ZEROCOPY 2024-11-15 18:46:56 -08:00
fs Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-11-14 11:29:15 -08:00
include ipv4/udp: Add 4-tuple hash for connected socket 2024-11-18 11:56:21 +00:00
init cfi: fix conditions for HAVE_CFI_ICALL_NORMALIZE_INTEGERS 2024-10-13 22:23:13 +02:00
io_uring io_uring/rw: fix missing NOWAIT check for O_DIRECT start write 2024-10-31 08:21:02 -06:00
ipc struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
kernel Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-11-14 11:29:15 -08:00
lib Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-11-14 11:29:15 -08:00
LICENSES LICENSES: add 0BSD license text 2024-09-01 20:43:24 -07:00
mm Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-11-14 11:29:15 -08:00
net ipv4/udp: Add 4-tuple hash for connected socket 2024-11-18 11:56:21 +00:00
rust Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-10-21 09:14:18 +02:00
samples Including fixes from bluetooth. 2024-11-14 10:05:33 -08:00
scripts Kbuild fixes for v6.12 (2nd) 2024-11-03 08:29:02 -10:00
security integrity-v6.12 2024-11-12 13:06:31 -08:00
sound ASoC: Fixes for v6.12 2024-11-08 09:25:33 +01:00
tools selftests: net: fdb_notify: Add a test for FDB notifications 2024-11-15 16:39:19 -08:00
usr initramfs: shorten cmd_initfs in usr/Makefile 2024-07-16 01:07:52 +09:00
virt ARM64: 2024-10-21 11:22:04 -07:00
.clang-format clang-format: Update with v6.11-rc1's for_each macro list 2024-08-02 13:20:31 +02:00
.cocciconfig
.editorconfig .editorconfig: remove trim_trailing_whitespace option 2024-06-13 16:47:52 +02:00
.get_maintainer.ignore Add Jeff Kirsher to .get_maintainer.ignore 2024-03-08 11:36:54 +00:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore Kbuild updates for v6.12 2024-09-24 13:02:06 -07:00
.mailmap mailmap: add entry for Thorsten Blum 2024-11-07 14:14:59 -08:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Remove self from DSA entry 2024-11-03 12:52:38 -08:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-11-14 11:29:15 -08:00
Makefile Linux 6.12-rc7 2024-11-10 14:19:35 -08:00
README README: Fix spelling 2024-03-18 03:36:32 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.