linux/include/uapi
Daniel Borkmann e1f95b1992 geneve: Allow users to specify source port range
Recently, in the case of Cilium, we ran into users on Azure who need to use
tunneling for east/west traffic because they would hit IPAM API limits for
Kubernetes Pods if they had gone with publicly routable IPs for Pods. In case
of tunneling, Cilium supports the option of vxlan or geneve. In order to
spread flows among remote CPUs via RSS, both derive a source port hash via
udp_flow_src_port(), which takes the inner packet's skb->hash into account.
For clusters with many nodes, this can then hit a new limitation [0]: Today,
the Azure networking stack supports 1M total flows (500k inbound and 500k
outbound) for a VM. [...] Once this limit is hit, other connections are
dropped. [...] Each flow is distinguished by a 5-tuple (protocol, local IP
address, remote IP address, local port, and remote port) information. [...]

For vxlan and geneve, this can create a massive amount of UDP flows which
then run into the limits if stale flows are not evicted fast enough. One
option to mitigate this for vxlan is to narrow the source port range via
IFLA_VXLAN_PORT_RANGE while still being able to benefit from RSS. However,
geneve currently does not have this option and it spreads traffic across
the full source port range of [1, USHRT_MAX]. To overcome this limitation
also for geneve, add an equivalent IFLA_GENEVE_PORT_RANGE setting for users.
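
To illustrate why a narrower range bounds the number of distinct UDP flows
while still spreading them, here is a minimal user-space sketch of scaling a
32-bit flow hash into a port range. The helper name flow_src_port() and the
half-open [min, max) convention are assumptions made for this example. It is
loosely modeled on the idea behind udp_flow_src_port() and is not a copy of
the kernel code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): map a 32-bit flow hash to a
 * source port in the half-open range [min, max).
 */
static uint16_t flow_src_port(uint32_t hash, uint16_t min, uint16_t max)
{
	uint32_t span;

	assert(min < max);
	span = (uint32_t)(max - min);

	/* Mix the low 16 bits into the high 16 bits, since the
	 * multiply-and-shift below is dominated by the upper bits. */
	hash ^= hash << 16;

	/* Scale the hash from [0, 2^32) down to [0, span), then offset. */
	return (uint16_t)((((uint64_t)hash * span) >> 32) + min);
}
```

With a range of 1000 ports, for example, a given pair of tunnel endpoints can
produce at most 1000 distinct 5-tuples in one direction, whatever the number
of inner flows, while different inner flows still tend to land on different
ports.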

Note that struct geneve_config before/after still remains at 2 cachelines
on x86-64. The low/high members of struct ifla_geneve_port_range (which is
uapi exposed) are of type __be16. While they would be perfectly fine to be
of __u16 type, the consensus was that it would be good to be consistent
with the existing struct ifla_vxlan_port_range from a uapi consumer PoV.
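
Because the low/high members are __be16, a uapi consumer has to convert host
byte order values before handing them to the kernel. The following sketch
shows that conversion on a local mirror of the two-member layout described
above. The struct port_range_be and make_port_range() names are hypothetical
and used only for this example, and the low <= high check is an assumption of
the sketch, not something the commit message states.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/*
 * Local stand-in for a low/high port range whose members are kept in
 * network byte order (the role __be16 plays in the uapi struct).
 */
struct port_range_be {
	uint16_t low;	/* network byte order */
	uint16_t high;	/* network byte order */
};

/* Build the range from host byte order port numbers. */
static struct port_range_be make_port_range(uint16_t low, uint16_t high)
{
	struct port_range_be r;

	/* Assumption of this sketch: reject an inverted range early. */
	assert(low <= high);

	r.low = htons(low);
	r.high = htons(high);
	return r;
}
```

A caller would then place the resulting 4-byte structure into the payload of
the netlink attribute when creating the device.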

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://learn.microsoft.com/en-us/azure/virtual-network/virtual-machine-network-throughput [0]
Link: https://patch.msgid.link/20250226182030.89440-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-27 16:54:54 -08:00
..
asm-generic 2025-01-23 13:36:06 -08:00
drm drm/amdgpu: add a BO metadata flag to disable write compression for Vulkan 2025-02-03 12:11:36 -05:00
linux geneve: Allow users to specify source port range 2025-02-27 16:54:54 -08:00
misc Revert "misc: fastrpc: Restrict untrusted app to attach to privileged PD" 2024-08-15 16:59:14 +02:00
mtd ubi: Expose interface for detailed erase counters 2025-01-18 15:32:32 +01:00
rdma RDMA/nldev: Add IB device and net device rename events 2024-11-04 06:57:21 -05:00
regulator
scsi scsi: mpi3mr: Add ioctl support for HDB 2024-06-26 23:30:09 -04:00
sound ASoC: Updates for v6.14 2025-01-20 16:15:07 +01:00
video
xen xen/privcmd: Add new syscall to get gsi from dev 2024-09-25 09:54:55 +02:00
Kbuild