Linux kernel source tree
Find a file
zhongjinji 5e1953dc71 mm/oom_kill: the OOM reaper traverses the VMA maple tree in reverse order
Although the oom_reaper is delayed and it gives the oom victim chance to
clean up its address space this might take a while especially for
processes with a large address space footprint.  In those cases oom_reaper
might start racing with the dying task and compete for shared resources -
e.g.  page table lock contention has been observed.

Reduce those races by reaping the oom victim from the other end of the
address space.

It is also a significant improvement for process_mrelease().  When a
process is killed, process_mrelease is used to reap the killed process and
often runs concurrently with the dying task.  The test data shows that
after applying the patch, lock contention is greatly reduced during the
procedure of reaping the killed process.

The test is conducted on arm64.  The following basic perf numbers show
that applying this patch significantly reduces pte spin lock contention.

Without the patch:
|--99.57%-- oom_reaper
|    |--73.58%-- unmap_page_range
|    |    |--8.67%-- [hit in function]
|    |    |--41.59%-- __pte_offset_map_lock
|    |    |--29.47%-- folio_remove_rmap_ptes
|    |    |--16.11%-- tlb_flush_mmu
|    |--19.94%-- tlb_finish_mmu
|    |--3.21%-- folio_remove_rmap_ptes

With the patch:
|--99.53%-- oom_reaper
|    |--55.77%-- unmap_page_range
|    |    |--20.49%-- [hit in function]
|    |    |--58.30%-- folio_remove_rmap_ptes
|    |    |--11.48%-- tlb_flush_mmu
|    |    |--3.33%-- folio_mark_accessed
|    |--32.21%-- tlb_finish_mmu
|    |--6.93%-- folio_remove_rmap_ptes
|    |--0.69%-- __pte_offset_map_lock

Detailed breakdowns for both scenarios are provided below.  The cumulative
time for oom_reaper plus exit_mmap(victim) in both cases is also
summarized, making the performance improvements clear.

+----------------------------------------------------------------+
| Category                      | Applying patch | Without patch |
+-------------------------------+----------------+---------------+
| Total running time            |    132.6       |    167.1      |
|   (exit_mmap + reaper work)   |  72.4 + 60.2   |  90.7 + 76.4  |
+-------------------------------+----------------+---------------+
| Time waiting for pte spinlock |     1.0        |    33.1       |
|   (exit_mmap + reaper work)   |   0.4 + 0.6    |  10.0 + 23.1  |
+-------------------------------+----------------+---------------+
| folio_remove_rmap_ptes time   |    42.0        |    41.3       |
|   (exit_mmap + reaper work)   |  18.4 + 23.6   |  22.4 + 18.9  |
+----------------------------------------------------------------+

From this report, we can see that:

1. The reduction in total time comes mainly from the decrease in time
   spent on pte spinlock and other locks.

2. oom_reaper performs more work in some areas, but at the same time,
   exit_mmap also handles certain tasks more efficiently, such as
   folio_remove_rmap_ptes.

Here is a more detailed perf report. [1]

Link: https://lkml.kernel.org/r/20250915162946.5515-3-zhongjinji@honor.com
Link: https://lore.kernel.org/all/20250915162619.5133-1-zhongjinji@honor.com/ [1]
Signed-off-by: zhongjinji <zhongjinji@honor.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-21 14:22:35 -07:00
arch ptdesc: remove ptdesc_to_virt() 2025-09-21 14:22:27 -07:00
block block: use largest_zero_folio in __blkdev_issue_zero_pages() 2025-09-13 16:54:54 -07:00
certs sign-file,extract-cert: use pkcs11 provider for OPENSSL MAJOR >= 3 2024-09-20 19:52:48 +03:00
crypto crypto: remove nth_page() usage within SG entry 2025-09-21 14:22:09 -07:00
Documentation docs/mm: add document for swap table 2025-09-21 14:22:22 -07:00
drivers mm: make folio page count functions return unsigned 2025-09-21 14:22:31 -07:00
fs mm/hwpoison: decouple hwpoison_filter from mm/memory-failure.c 2025-09-21 14:22:21 -07:00
include mm/oom_kill: thaw the entire OOM victim process 2025-09-21 14:22:35 -07:00
init init/main.c: fix boot time tracing crash 2025-09-03 17:10:36 -07:00
io_uring io_uring/zcrx: remove nth_page() usage within folio 2025-09-21 14:22:05 -07:00
ipc vfs-6.17-rc1.mmap_prepare 2025-07-28 13:43:25 -07:00
kernel mm/oom_kill: thaw the entire OOM victim process 2025-09-21 14:22:35 -07:00
lib alloc_tag: prevent enabling memory profiling if it was shut down 2025-09-21 14:22:30 -07:00
LICENSES LICENSES: Replace the obsolete address of the FSF in the GFDL-1.2 2025-07-24 11:15:39 +02:00
mm mm/oom_kill: the OOM reaper traverses the VMA maple tree in reverse order 2025-09-21 14:22:35 -07:00
net net: ipv4: fix regression in local-broadcast routes 2025-08-28 10:52:30 +02:00
rust rust: maple_tree: add MapleTreeAlloc 2025-09-21 14:22:19 -07:00
samples samples/cgroup: rm unused MEMCG_EVENTS macro 2025-09-21 14:22:26 -07:00
scripts scripts/decode_stacktrace.sh: code: preserve alignment 2025-09-21 14:22:28 -07:00
security + Features 2025-08-04 08:17:28 -07:00
sound ALSA: hda/hdmi: Add pin fix for another HP EliteDesk 800 G4 model 2025-09-01 13:51:57 +02:00
tools selftests/mm: protection_keys: fix dead code 2025-09-21 14:22:34 -07:00
usr usr/include: openrisc: don't HDRTEST bpf_perf_event.h 2025-05-12 15:03:17 +09:00
virt Merge tag 'kvm-x86-no_assignment-6.17' of https://github.com/kvm-x86/linux into HEAD 2025-07-29 08:36:42 -04:00
.clang-format Linux 6.15-rc5 2025-05-06 16:39:25 +10:00
.clippy.toml rust: clean Rust 1.88.0's warning about clippy::disallowed_macros configuration 2025-05-07 00:11:47 +02:00
.cocciconfig
.editorconfig .editorconfig: remove trim_trailing_whitespace option 2024-06-13 16:47:52 +02:00
.get_maintainer.ignore MAINTAINERS: Retire Ralf Baechle 2024-11-12 15:48:59 +01:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore gitignore: allow .pylintrc to be tracked 2025-07-02 17:10:04 -06:00
.mailmap .mailmap: add entry for Easwar Hariharan 2025-08-19 16:35:55 -07:00
.pylintrc docs: add a .pylintrc file with sys path for docs scripts 2025-04-09 12:10:33 -06:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: retire Boris from TLS maintainers 2025-08-26 17:36:01 -07:00
Kbuild drm: ensure drm headers are self-contained and pass kernel-doc 2025-02-12 10:44:43 +02:00
Kconfig io_uring: Rename KConfig to Kconfig 2025-02-19 14:53:27 -07:00
MAINTAINERS mm, swap: use the swap table for the swap cache and switch API 2025-09-21 14:22:24 -07:00
Makefile Linux 6.17-rc4 2025-08-31 15:33:07 -07:00
README README: Fix spelling 2024-03-18 03:36:32 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.