linux/kernel/locking
Maxim Levitsky c5b6ababd2 locking/mutex: implement mutex_trylock_nested
Although several lockdep-related checks are skipped by the trylock*
versions of the locking primitives (for example mutex_trylock), every
successful acquisition still places a held_lock onto the lockdep stack:
__lock_acquire() is called regardless of whether the trylock* or the
regular locking API was used.

This means that if the caller successfully acquires more than
MAX_LOCK_DEPTH locks of the same class, even when using mutex_trylock,
lockdep will still complain that the maximum depth of the held lock stack
has been reached and disable itself.

For example, the following error currently occurs in the ARM version
of KVM once the code tries to lock all vCPUs of a VM configured with more
than MAX_LOCK_DEPTH vCPUs. This can easily happen on modern systems, where
having more than 48 CPUs is common, as is running VMs whose vCPU counts
approach that number:

[  328.171264] BUG: MAX_LOCK_DEPTH too low!
[  328.175227] turning off the locking correctness validator.
[  328.180726] Please attach the output of /proc/lock_stat to the bug report
[  328.187531] depth: 48  max: 48!
[  328.190678] 48 locks held by qemu-kvm/11664:
[  328.194957]  #0: ffff800086de5ba0 (&kvm->lock){+.+.}-{3:3}, at: kvm_ioctl_create_device+0x174/0x5b0
[  328.204048]  #1: ffff0800e78800b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
[  328.212521]  #2: ffff07ffeee51e98 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
[  328.220991]  #3: ffff0800dc7d80b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
[  328.229463]  #4: ffff07ffe0c980b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
[  328.237934]  #5: ffff0800a3883c78 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
[  328.246405]  #6: ffff07fffbe480b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0

Luckily, in all instances that require locking all vCPUs, 'kvm->lock'
is taken first, which makes it possible to use a little-known lockdep
feature called 'nest_lock' to avoid this warning and the subsequent
lockdep self-disablement.

When a 'nest_lock' is provided to lockdep's lock_acquire(), lockdep
detects that the top of the held lock stack already contains a lock of the
same class and increments its reference counter instead of pushing a new
held_lock item onto that stack.

See __lock_acquire for more information.
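
The effect described above can be sketched with a small userspace toy
model. This is not the kernel code: every toy_* name, the struct layout,
and the exact point where the depth limit is checked are assumptions made
for illustration. Only the general idea (push a held_lock per acquisition,
or bump a reference counter when a nest lock is given and the top of the
stack has the same class) is taken from the text.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_LOCK_DEPTH 48   /* mirrors the "max: 48" in the log */

/* Hypothetical, heavily simplified stand-in for a held_lock entry. */
struct toy_held_lock {
	int class_id;            /* lock class, e.g. all vcpu->mutex share one */
	unsigned int references; /* 0 means a single, non-folded acquisition */
};

struct toy_task {
	struct toy_held_lock held[TOY_MAX_LOCK_DEPTH];
	int depth;
	bool validator_on;
};

/*
 * Toy acquire: returns false once the validator has turned itself off.
 * With a nest lock and a same-class entry on top of the stack, the
 * acquisition is folded into that entry instead of being pushed.
 */
static bool toy_lock_acquire(struct toy_task *t, int class_id,
			     const void *nest_lock)
{
	if (!t->validator_on)
		return false;

	if (nest_lock && t->depth > 0) {
		struct toy_held_lock *top = &t->held[t->depth - 1];

		if (top->class_id == class_id) {
			if (!top->references)
				top->references = 1; /* count the first holder */
			top->references++;
			return true;
		}
	}

	t->held[t->depth].class_id = class_id;
	t->held[t->depth].references = 0;
	t->depth++;

	/* Assumed: the toy gives up as soon as the stack is full. */
	if (t->depth >= TOY_MAX_LOCK_DEPTH) {
		t->validator_on = false;
		return false;
	}
	return true;
}

/* Take one outer lock (class 0), then nr_vcpus locks of class 1. */
static bool toy_lock_all_vcpus(struct toy_task *t, int nr_vcpus,
			       bool use_nest_lock)
{
	static const int outer_lock; /* stands in for kvm->lock */

	t->depth = 0;
	t->validator_on = true;
	toy_lock_acquire(t, 0, NULL);

	for (int i = 0; i < nr_vcpus; i++)
		if (!toy_lock_acquire(t, 1, use_nest_lock ? &outer_lock : NULL))
			return false;
	return true;
}
```

In this toy, locking 64 vCPUs without a nest lock fills the stack and
turns the validator off, while the same loop with a nest lock leaves only
two entries on the stack, the second one carrying a reference count.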

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Message-ID: <20250512180407.659015-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-05-27 12:16:40 -04:00
..
irqflag-debug.c lockdep: Noinstr annotate warn_bogus_irq_restore() 2021-02-10 14:44:39 +01:00
lock_events.c locking/debug: Fix debugfs API return value checks to use IS_ERR() 2023-10-03 10:11:25 +02:00
lock_events.h locking/qspinlock: Always evaluate lockevent* non-event parameter once 2024-03-21 20:45:17 +01:00
lock_events_list.h bpf_res_spin_lock 2025-03-30 13:06:27 -07:00
lockdep.c locking/lockdep: Decrease nr_unused_locks if lock unused in zap_class() 2025-03-27 08:23:17 +01:00
lockdep_internals.h lockdep: Document MAX_LOCKDEP_CHAIN_HLOCKS calculation 2024-12-15 11:49:35 -08:00
lockdep_proc.c locking/lockdep: Simplify character output in seq_line() 2024-08-06 10:46:43 -07:00
lockdep_states.h locking/lockdep: Rework FS_RECLAIM annotation 2017-08-10 12:29:03 +02:00
locktorture.c rqspinlock: Add locktorture support 2025-03-19 08:03:05 -07:00
Makefile locking/lockdep: Disable KASAN instrumentation of lockdep.c 2025-03-08 00:55:03 +01:00
mcs_spinlock.h locking: Allow obtaining result of arch_mcs_spin_lock_contended 2025-03-19 08:03:04 -07:00
mutex-debug.c locking/mutex: Introduce devm_mutex_init() 2024-04-11 17:34:41 +01:00
mutex.c locking/mutex: implement mutex_trylock_nested 2025-05-27 12:16:40 -04:00
mutex.h locking/mutex: Expose __mutex_owner() 2024-10-14 12:52:41 +02:00
osq_lock.c locking/osq_lock: Use atomic_try_cmpxchg_release() in osq_unlock() 2024-10-25 10:01:50 +02:00
percpu-rwsem.c percpu: use TYPEOF_UNQUAL() in variable declarations 2025-03-16 22:05:53 -07:00
qrwlock.c locking: Add __lockfunc to slow path functions 2022-08-19 19:47:51 +02:00
qspinlock.c locking: Move common qspinlock helpers to a private header 2025-03-19 08:02:29 -07:00
qspinlock.h locking: Move common qspinlock helpers to a private header 2025-03-19 08:02:29 -07:00
qspinlock_paravirt.h locking/pvqspinlock: Convert fields of 'enum vcpu_state' to uppercase 2024-10-17 21:21:16 -07:00
qspinlock_stat.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157 2019-05-30 11:26:37 -07:00
rtmutex.c locking/lock_events: Add locking events for rtmutex slow paths 2025-03-08 00:55:03 +01:00
rtmutex_api.c locking/rtmutex: Make sure we wake anything on the wake_q when we release the lock->wait_lock 2024-12-17 17:47:24 +01:00
rtmutex_common.h locking/rtmutex: Use the 'struct' keyword in kernel-doc comment 2025-03-08 00:52:01 +01:00
rwbase_rt.c locking/mutex: Remove wakeups from under mutex::wait_lock 2024-10-14 12:52:40 +02:00
rwsem.c locking/mutex: Remove wakeups from under mutex::wait_lock 2024-10-14 12:52:40 +02:00
semaphore.c locking/semaphore: Use wake_q to wake up processes outside lock critical section 2025-03-08 00:52:01 +01:00
spinlock.c locking/spinlocks: Make __raw_* lock ops static 2024-10-07 09:28:35 +02:00
spinlock_debug.c sched.h: move pid helpers to pid.h 2023-12-20 19:26:31 -05:00
spinlock_rt.c Scheduler changes for v6.13: 2024-11-19 14:16:06 -08:00
test-ww_mutex.c locking/ww_mutex/test: Use swap() macro 2024-12-15 11:49:35 -08:00
ww_mutex.h locking/mutex: Make mutex::wait_lock irq safe 2024-10-14 12:52:40 +02:00
ww_rt_mutex.c locking/rtmutex: Avoid unconditional slowpath for DEBUG_RT_MUTEXES 2023-09-20 09:31:11 +02:00