Linux kernel source tree
Find a file
Waiman Long ee57ab5a32 locking/lockdep: Disable KASAN instrumentation of lockdep.c
Both KASAN and LOCKDEP are commonly enabled in building a debug kernel.
Each of them can significantly slow down the speed of a debug kernel.
Enabling KASAN instrumentation of the LOCKDEP code will further slow
things down.

Since LOCKDEP is a high overhead debugging tool, it will never get
enabled in a production kernel. The LOCKDEP code is also pretty mature
and is unlikely to get major changes. There is also a possibility of
recursion similar to KCSAN.

To evaluate the performance impact of disabling KASAN instrumentation
of lockdep.c, the time to do a parallel build of the Linux defconfig
kernel was used as the benchmark. Two x86-64 systems (Skylake & Zen 2)
and an arm64 system were used as test beds. Two sets of non-RT and RT
kernels with similar configurations except mainly CONFIG_PREEMPT_RT
were used for evaluation.

For the Skylake system:

  Kernel			Run time	    Sys time
  ------			--------	    --------
  Non-debug kernel (baseline)	0m47.642s	      4m19.811s

  [CONFIG_KASAN_INLINE=y]
  Debug kernel			2m11.108s (x2.8)     38m20.467s (x8.9)
  Debug kernel (patched)	1m49.602s (x2.3)     31m28.501s (x7.3)
  Debug kernel
  (patched + mitigations=off)	1m30.988s (x1.9)     26m41.993s (x6.2)

  RT kernel (baseline)		0m54.871s	      7m15.340s

  [CONFIG_KASAN_INLINE=n]
  RT debug kernel		6m07.151s (x6.7)    135m47.428s (x18.7)
  RT debug kernel (patched)	3m42.434s (x4.1)     74m51.636s (x10.3)
  RT debug kernel
  (patched + mitigations=off)	2m40.383s (x2.9)     57m54.369s (x8.0)

  [CONFIG_KASAN_INLINE=y]
  RT debug kernel		3m22.155s (x3.7)     77m53.018s (x10.7)
  RT debug kernel (patched)	2m36.700s (x2.9)     54m31.195s (x7.5)
  RT debug kernel
  (patched + mitigations=off)	2m06.110s (x2.3)     45m49.493s (x6.3)

For the Zen 2 system:

  Kernel			Run time	    Sys time
  ------			--------	    --------
  Non-debug kernel (baseline)	1m42.806s	     39m48.714s

  [CONFIG_KASAN_INLINE=y]
  Debug kernel			4m04.524s (x2.4)    125m35.904s (x3.2)
  Debug kernel (patched)	3m56.241s (x2.3)    127m22.378s (x3.2)
  Debug kernel
  (patched + mitigations=off)	2m38.157s (x1.5)     92m35.680s (x2.3)

  RT kernel (baseline)		 1m51.500s	     14m56.322s

  [CONFIG_KASAN_INLINE=n]
  RT debug kernel		16m04.962s (x8.7)   244m36.463s (x16.4)
  RT debug kernel (patched)	 9m09.073s (x4.9)   129m28.439s (x8.7)
  RT debug kernel
  (patched + mitigations=off)	 3m31.662s (x1.9)    51m01.391s (x3.4)

For the arm64 system:

  Kernel			Run time	    Sys time
  ------			--------	    --------
  Non-debug kernel (baseline)	1m56.844s	      8m47.150s
  Debug kernel			3m54.774s (x2.0)     92m30.098s (x10.5)
  Debug kernel (patched)	3m32.429s (x1.8)     77m40.779s (x8.8)

  RT kernel (baseline)		 4m01.641s	     18m16.777s

  [CONFIG_KASAN_INLINE=n]
  RT debug kernel		19m32.977s (x4.9)   304m23.965s (x16.7)
  RT debug kernel (patched)	16m28.354s (x4.1)   234m18.149s (x12.8)

Turning the mitigations off doesn't seems to have any noticeable impact
on the performance of the arm64 system. So the mitigation=off entries
aren't included.

For the x86 CPUs, CPU mitigations has a much bigger
impact on performance, especially the RT debug kernel with
CONFIG_KASAN_INLINE=n. The SRSO mitigation in Zen 2 has an especially
big impact on the debug kernel. It is also the majority of the slowdown
with mitigations on. It is because the patched RET instruction slows
down function returns. A lot of helper functions that are normally
compiled out or inlined may become real function calls in the debug
kernel.

With !CONFIG_KASAN_INLINE, the KASAN instrumentation inserts a
lot of __asan_loadX*() and __kasan_check_read() function calls to memory
access portion of the code. The lockdep's __lock_acquire() function,
for instance, has 66 __asan_loadX*() and 6 __kasan_check_read() calls
added with KASAN instrumentation. Of course, the actual numbers may vary
depending on the compiler used and the exact version of the lockdep code.

With the Skylake test system, the parallel kernel build times reduction
of the RT debug kernel with this patch are:

 CONFIG_KASAN_INLINE=n: -37%
 CONFIG_KASAN_INLINE=y: -22%

The time reduction is less with CONFIG_KASAN_INLINE=y, but it is still
significant.

Setting CONFIG_KASAN_INLINE=y can result in a significant performance
improvement. The major drawback is a significant increase in the size
of kernel text. In the case of vmlinux, its text size increases from
45997948 to 67606807. That is a 47% size increase (about 21 Mbytes). The
size increase of other kernel modules should be similar.

With the newly added rtmutex and lockdep lock events, the relevant
event counts for the test runs with the Skylake system were:

  Event type		Debug kernel	RT debug kernel
  ----------		------------	---------------
  lockdep_acquire	1,968,663,277	5,425,313,953
  rtlock_slowlock	     -		  401,701,156
  rtmutex_slowlock	     -		      139,672

The __lock_acquire() calls in the RT debug kernel are x2.8 times of the
non-RT debug kernel with the same workload. Since the __lock_acquire()
function is a big hitter in term of performance slowdown, this makes
the RT debug kernel much slower than the non-RT one. The average lock
nesting depth is likely to be higher in the RT debug kernel too leading
to longer execution time in the __lock_acquire() function.

As the small advantage of enabling KASAN instrumentation to catch
potential memory access error in the lockdep debugging tool is probably
not worth the drawback of further slowing down a debug kernel, disable
KASAN instrumentation in the lockdep code to allow the debug kernels
to regain some performance back, especially for the RT debug kernels.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250307232717.1759087-6-boqun.feng@gmail.com
2025-03-08 00:55:03 +01:00
arch Merge branch 'locking/urgent' into locking/core, to pick up locking fixes 2025-03-08 00:54:06 +01:00
block block-6.14-20250306 2025-03-07 11:12:33 -10:00
certs sign-file,extract-cert: use pkcs11 provider for OPENSSL MAJOR >= 3 2024-09-20 19:52:48 +03:00
crypto treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
Documentation execve fix for v6.14-rc6 2025-03-07 11:49:33 -10:00
drivers ACPI fix for 6.14-rc6 2025-03-07 12:17:42 -10:00
fs execve fix for v6.14-rc6 2025-03-07 11:49:33 -10:00
include ACPI fix for 6.14-rc6 2025-03-07 12:17:42 -10:00
init Kbuild updates for v6.14 2025-01-31 12:07:07 -08:00
io_uring io_uring-6.14-20250306 2025-03-07 11:09:33 -10:00
ipc treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
kernel locking/lockdep: Disable KASAN instrumentation of lockdep.c 2025-03-08 00:55:03 +01:00
lib Fix an rcuref_put() slowpath race. 2025-02-28 16:07:18 -08:00
LICENSES LICENSES: add 0BSD license text 2024-09-01 20:43:24 -07:00
mm Merge branch 'locking/urgent' into locking/core, to pick up locking fixes 2025-03-08 00:54:06 +01:00
net net: ipv6: fix missing dst ref drop in ila lwtunnel 2025-03-06 11:08:45 +01:00
rust rust: lockdep: Remove support for dynamically allocated LockClassKeys 2025-03-08 00:52:00 +01:00
samples Driver core api addition for 6.14-rc3 2025-02-16 12:54:42 -08:00
scripts modpost: Fix a few typos in a comment 2025-02-16 03:10:58 +09:00
security Landlock fix for v6.14-rc5 2025-02-26 11:55:44 -08:00
sound ALSA: hda/realtek: Add support for ASUS Zenbook UM3406KA Laptops using CS35L41 HDA 2025-03-05 19:52:56 +01:00
tools Fix an objtool false positive, and objtool related 2025-02-28 16:45:36 -08:00
usr kbuild: Drop support for include/asm-<arch> in headers_check.pl 2024-12-21 11:43:17 +09:00
virt KVM: remove kvm_arch_post_init_vm 2025-02-04 11:27:45 -05:00
.clang-format clang-format: Update with v6.11-rc1's for_each macro list 2024-08-02 13:20:31 +02:00
.clippy.toml rust: give Clippy the minimum supported Rust version 2025-01-10 00:17:25 +01:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.editorconfig .editorconfig: remove trim_trailing_whitespace option 2024-06-13 16:47:52 +02:00
.get_maintainer.ignore MAINTAINERS: Retire Ralf Baechle 2024-11-12 15:48:59 +01:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore rust: use host dylib naming convention to support macOS 2025-01-10 01:01:24 +01:00
.mailmap We have been notified of a TLS regression that will be addressed 2025-03-06 09:34:54 -10:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Pavel to kernel.org address 2025-02-07 09:12:33 -08:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS MAINTAINERS: update email address in cifs and ksmbd entry 2025-03-02 22:44:18 -06:00
Makefile Linux 6.14-rc5 2025-03-02 11:48:20 -08:00
README README: Fix spelling 2024-03-18 03:36:32 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.