Linux kernel source tree
Find a file
Koen De Schepper 8f9516daed sched: Add enqueue/dequeue of dualpi2 qdisc
DualPI2 provides L4S-type low latency & loss to traffic that uses a
scalable congestion controller (e.g. TCP-Prague, DCTCP) without
degrading the performance of 'classic' traffic (e.g. Reno,
Cubic etc.). It is to be the reference implementation of IETF RFC9332
DualQ Coupled AQM (https://datatracker.ietf.org/doc/html/rfc9332).

Note that creating two independent queues cannot meet the goal of
DualPI2 mentioned in RFC9332: "...to preserve fairness between
ECN-capable and non-ECN-capable traffic." Further, it could even
lead to starvation of Classic traffic, which is also inconsistent
with the requirements in RFC9332: "...although priority MUST be
bounded in order not to starve Classic traffic." DualPI2 is
designed to maintain approximate per-flow fairness on L-queue and
C-queue by forming a single qdisc using the coupling factor and
scheduler between two queues.

The qdisc provides two queues called low latency and classic. It
classifies packets based on the ECN field in the IP headers. By
default it directs non-ECN and ECT(0) into the classic queue and
ECT(1) and CE into the low latency queue, as per the IETF spec.

Each queue runs its own AQM:
* The classic AQM is called PI2, which is similar to the PIE AQM but
  more responsive and simpler. Classic traffic requires a decent
  target queue (default 15ms for Internet deployment) to fully
  utilize the link and to avoid high drop rates.
* The low latency AQM is, by default, a very shallow ECN marking
  threshold (1ms) similar to that used for DCTCP.

The DualQ isolates the low queuing delay of the Low Latency queue
from the larger delay of the 'Classic' queue. However, from a
bandwidth perspective, flows in either queue will share out the link
capacity as if there was just a single queue. This bandwidth pooling
effect is achieved by coupling together the drop and ECN-marking
probabilities of the two AQMs.

The PI2 AQM has two main parameters in addition to its target delay.
The integral gain factor alpha is used to slowly correct any persistent
standing queue error from the target delay, while the proportional gain
factor beta is used to quickly compensate for queue changes (growth or
shrinkage). Either alpha and beta are given as a parameter, or they can
be calculated by tc from alternative typical and maximum RTT parameters.

Internally, the output of a linear Proportional Integral (PI)
controller is used for both queues. This output is squared to
calculate the drop or ECN-marking probability of the classic queue.
This counterbalances the square-root rate equation of Reno/Cubic,
which is the trick that balances flow rates across the queues. For
the ECN-marking probability of the low latency queue, the output of
the base AQM is multiplied by a coupling factor. This determines the
balance between the flow rates in each queue. The default setting
makes the flow rates roughly equal, which should be generally
applicable.

If DUALPI2 AQM has detected overload (due to excessive non-responsive
traffic in either queue), it will switch to signaling congestion
solely using drop, irrespective of the ECN field. Alternatively, it
can be configured to limit the drop probability and let the queue
grow and eventually overflow (like tail-drop).

GSO splitting in DUALPI2 is configurable from userspace while the
default behavior is to split gso. When running DUALPI2 at unshaped
10gigE with 4 download streams test, splitting gso apart results in
halving the latency with no loss in throughput:

Summary of tcp_4down run 'no_split_gso':
                         avg         median      # data pts
 Ping (ms) ICMP   :       0.53      0.30 ms         350
 TCP download avg :    2326.86       N/A Mbits/s    350
 TCP download sum :    9307.42       N/A Mbits/s    350
 TCP download::1  :    2672.99   2568.73 Mbits/s    350
 TCP download::2  :    2586.96   2570.51 Mbits/s    350
 TCP download::3  :    1786.26   1798.82 Mbits/s    350
 TCP download::4  :    2261.21   2309.49 Mbits/s    350

Summart of tcp_4down run 'split_gso':
                         avg          median      # data pts
 Ping (ms) ICMP   :       0.22      0.23 ms         350
 TCP download avg :    2335.02       N/A Mbits/s    350
 TCP download sum :    9340.09       N/A Mbits/s    350
 TCP download::1  :    2335.30   2334.22 Mbits/s    350
 TCP download::2  :    2334.72   2334.20 Mbits/s    350
 TCP download::3  :    2335.28   2334.58 Mbits/s    350
 TCP download::4  :    2334.79   2334.39 Mbits/s    350

A similar result is observed when running DUALPI2 at unshaped 1gigE
with 1 download stream test:

Summary of tcp_1down run 'no_split_gso':
                         avg         median      # data pts
 Ping (ms) ICMP :         1.13      1.25 ms         350
 TCP download   :       941.41    941.46 Mbits/s    350

Summart of tcp_1down run 'split_gso':
                         avg         median      # data pts
 Ping (ms) ICMP :         0.51      0.55 ms         350
 TCP download   :       941.41    941.45 Mbits/s    350

Additional details can be found in the draft:
  https://datatracker.ietf.org/doc/html/rfc9332

Signed-off-by: Koen De Schepper <koen.de_schepper@nokia-bell-labs.com>
Co-developed-by: Olga Albisser <olga@albisser.org>
Signed-off-by: Olga Albisser <olga@albisser.org>
Co-developed-by: Olivier Tilmans <olivier.tilmans@nokia.com>
Signed-off-by: Olivier Tilmans <olivier.tilmans@nokia.com>
Co-developed-by: Henrik Steen <henrist@henrist.net>
Signed-off-by: Henrik Steen <henrist@henrist.net>
Co-developed-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Signed-off-by: Bob Briscoe <research@bobbriscoe.net>
Signed-off-by: Ilpo Järvinen <ij@kernel.org>
Acked-by: Dave Taht <dave.taht@gmail.com>
Link: https://patch.msgid.link/20250722095915.24485-4-chia-yu.chang@nokia-bell-labs.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-23 17:52:07 -07:00
arch ibmveth: Add multi buffers rx replenishment hcall support 2025-07-22 15:08:23 +02:00
block block-6.16-20250626 2025-06-27 09:02:33 -07:00
certs sign-file,extract-cert: use pkcs11 provider for OPENSSL MAJOR >= 3 2024-09-20 19:52:48 +03:00
crypto crypto: wp512 - Use API partial block handling 2025-06-23 16:56:56 +08:00
Documentation devlink: Fix excessive stack usage in rate TC bandwidth parsing 2025-07-23 17:07:35 -07:00
drivers net: ti: icssg-prueth: access ->pp through netmem_desc instead of page 2025-07-23 17:46:55 -07:00
fs net: s/dev_get_flags/netif_get_flags/ 2025-07-18 17:27:47 -07:00
include sched: Add enqueue/dequeue of dualpi2 qdisc 2025-07-23 17:52:07 -07:00
init futex: Temporary disable FUTEX_PRIVATE_HASH 2025-07-01 15:02:05 +02:00
io_uring io_uring-6.16-20250710 2025-07-11 10:29:30 -07:00
ipc - The 3 patch series "hung_task: extend blocking task stacktrace dump to 2025-05-31 19:12:53 -07:00
kernel Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-07-17 11:00:33 -07:00
lib Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-07-17 11:00:33 -07:00
LICENSES LICENSES: add CC0-1.0 license text 2025-05-21 14:54:17 +02:00
mm Revert "sched/numa: add statistics of numa balance task" 2025-07-09 21:07:56 -07:00
net sched: Add enqueue/dequeue of dualpi2 qdisc 2025-07-23 17:52:07 -07:00
rust Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-07-17 11:00:33 -07:00
samples samples/damon: fix damon sample mtier for start failure 2025-07-09 21:07:55 -07:00
scripts Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-07-17 11:00:33 -07:00
security selinux: change security_compute_sid to return the ssid or tsid on match 2025-06-19 16:13:16 -04:00
sound ALSA: hda/realtek: Enable headset Mic on Positivo K116J 2025-07-07 14:09:09 +02:00
tools netdevsim: add fw_update_flash_chunk_time_ms debugfs knobs 2025-07-23 17:08:56 -07:00
usr usr/include: openrisc: don't HDRTEST bpf_perf_event.h 2025-05-12 15:03:17 +09:00
virt KVM: Allow CPU to reschedule while setting per-page memory attributes 2025-06-24 12:20:17 -07:00
.clang-format Linux 6.15-rc5 2025-05-06 16:39:25 +10:00
.clippy.toml rust: clean Rust 1.88.0's warning about clippy::disallowed_macros configuration 2025-05-07 00:11:47 +02:00
.cocciconfig
.editorconfig .editorconfig: remove trim_trailing_whitespace option 2024-06-13 16:47:52 +02:00
.get_maintainer.ignore MAINTAINERS: Retire Ralf Baechle 2024-11-12 15:48:59 +01:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore .gitignore: ignore Python compiled bytecode 2025-04-24 10:12:46 -06:00
.mailmap MAINTAINERS: Update Kirill Shutemov's email address for TDX 2025-07-11 07:25:51 -07:00
.pylintrc docs: add a .pylintrc file with sys path for docs scripts 2025-04-09 12:10:33 -06:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS CREDITS: Add entry for Shannon Nelson 2025-06-21 07:34:28 -07:00
Kbuild drm: ensure drm headers are self-contained and pass kernel-doc 2025-02-12 10:44:43 +02:00
Kconfig io_uring: Rename KConfig to Kconfig 2025-02-19 14:53:27 -07:00
MAINTAINERS Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-07-17 11:00:33 -07:00
Makefile Linux 6.16-rc6 2025-07-13 14:25:58 -07:00
README README: Fix spelling 2024-03-18 03:36:32 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.