linux/kernel
Zqiang 5f02151c41 sched_ext: Fix unsafe locking in the scx_dump_state()
For built with CONFIG_PREEMPT_RT=y kernels, the dump_lock will be converted
sleepable spinlock and not disable-irq, so the following scenarios occur:

inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
irq_work/0/27 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&rq->__lock){?...}-{2:2}, at: raw_spin_rq_lock_nested+0x2b/0x40
{IN-HARDIRQ-W} state was registered at:
   lock_acquire+0x1e1/0x510
   _raw_spin_lock_nested+0x42/0x80
   raw_spin_rq_lock_nested+0x2b/0x40
   sched_tick+0xae/0x7b0
   update_process_times+0x14c/0x1b0
   tick_periodic+0x62/0x1f0
   tick_handle_periodic+0x48/0xf0
   timer_interrupt+0x55/0x80
   __handle_irq_event_percpu+0x20a/0x5c0
   handle_irq_event_percpu+0x18/0xc0
   handle_irq_event+0xb5/0x150
   handle_level_irq+0x220/0x460
   __common_interrupt+0xa2/0x1e0
   common_interrupt+0xb0/0xd0
   asm_common_interrupt+0x2b/0x40
   _raw_spin_unlock_irqrestore+0x45/0x80
   __setup_irq+0xc34/0x1a30
   request_threaded_irq+0x214/0x2f0
   hpet_time_init+0x3e/0x60
   x86_late_time_init+0x5b/0xb0
   start_kernel+0x308/0x410
   x86_64_start_reservations+0x1c/0x30
   x86_64_start_kernel+0x96/0xa0
   common_startup_64+0x13e/0x148

 other info that might help us debug this:
 Possible unsafe locking scenario:

        CPU0
        ----
   lock(&rq->__lock);
   <Interrupt>
     lock(&rq->__lock);

  *** DEADLOCK ***

 stack backtrace:
 CPU: 0 UID: 0 PID: 27 Comm: irq_work/0
 Call Trace:
  <TASK>
  dump_stack_lvl+0x8c/0xd0
  dump_stack+0x14/0x20
  print_usage_bug+0x42e/0x690
  mark_lock.part.44+0x867/0xa70
  ? __pfx_mark_lock.part.44+0x10/0x10
  ? string_nocheck+0x19c/0x310
  ? number+0x739/0x9f0
  ? __pfx_string_nocheck+0x10/0x10
  ? __pfx_check_pointer+0x10/0x10
  ? kvm_sched_clock_read+0x15/0x30
  ? sched_clock_noinstr+0xd/0x20
  ? local_clock_noinstr+0x1c/0xe0
  __lock_acquire+0xc4b/0x62b0
  ? __pfx_format_decode+0x10/0x10
  ? __pfx_string+0x10/0x10
  ? __pfx___lock_acquire+0x10/0x10
  ? __pfx_vsnprintf+0x10/0x10
  lock_acquire+0x1e1/0x510
  ? raw_spin_rq_lock_nested+0x2b/0x40
  ? __pfx_lock_acquire+0x10/0x10
  ? dump_line+0x12e/0x270
  ? raw_spin_rq_lock_nested+0x20/0x40
  _raw_spin_lock_nested+0x42/0x80
  ? raw_spin_rq_lock_nested+0x2b/0x40
  raw_spin_rq_lock_nested+0x2b/0x40
  scx_dump_state+0x3b3/0x1270
  ? finish_task_switch+0x27e/0x840
  scx_ops_error_irq_workfn+0x67/0x80
  irq_work_single+0x113/0x260
  irq_work_run_list.part.3+0x44/0x70
  run_irq_workd+0x6b/0x90
  ? __pfx_run_irq_workd+0x10/0x10
  smpboot_thread_fn+0x529/0x870
  ? __pfx_smpboot_thread_fn+0x10/0x10
  kthread+0x305/0x3f0
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x40/0x70
  ? __pfx_kthread+0x10/0x10
  ret_from_fork_asm+0x1a/0x30
  </TASK>

This commit therefore use rq_lock_irqsave/irqrestore() to replace
rq_lock/unlock() in the scx_dump_state().

Fixes: 07814a9439 ("sched_ext: Print debug dump after an error exit")
Signed-off-by: Zqiang <qiang.zhang@linux.dev>
Signed-off-by: Tejun Heo <tj@kernel.org>
2025-11-12 06:28:32 -10:00
..
bpf bpf: Avoid RCU context warning when unpinning htab with internal structs 2025-10-10 10:10:08 -07:00
cgroup RCU pull request for v6.18 2025-10-04 11:28:45 -07:00
configs kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI 2025-09-24 14:29:14 -07:00
debug kdb: remove redundant check for scancode 0xe0 2025-09-20 21:19:09 +01:00
dma dma-mapping updates for Linux 6.18: 2025-10-03 17:41:12 -07:00
entry hyperv-next for v6.18 2025-10-07 08:40:15 -07:00
events Summary of significant series in this pull request: 2025-10-02 18:18:33 -07:00
futex A set of updates for futexes and related selftests: 2025-09-30 16:07:10 -07:00
gcov kbuild: require gcc-8 and binutils-2.30 2025-04-30 21:53:35 +02:00
irq A set of updates for the interrupt core subsystem: 2025-09-30 15:55:25 -07:00
kcsan Kernel Concurrency Sanitizer (KCSAN) updates for v6.18 2025-10-02 08:31:44 -07:00
livepatch sched,livepatch: Untangle cond_resched() and live-patching 2025-05-14 13:16:24 +02:00
locking locking/local_lock: Introduce local_lock_is_locked(). 2025-09-29 09:42:35 +02:00
module kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI 2025-09-24 14:29:14 -07:00
power Merge branches 'pm-core', 'pm-runtime' and 'pm-sleep' 2025-09-29 12:54:01 +02:00
printk printk changes for 6.18 2025-10-04 11:13:11 -07:00
rcu hyperv-next for v6.18 2025-10-07 08:40:15 -07:00
sched sched_ext: Fix unsafe locking in the scx_dump_state() 2025-11-12 06:28:32 -10:00
time Networking changes for 6.18. 2025-10-02 15:17:01 -07:00
trace tracing fixes for v6.18: 2025-10-11 16:06:04 -07:00
unwind unwind: Finish up unwind when a task exits 2025-07-31 10:20:11 -04:00
.gitignore kheaders: rebuild kheaders_data.tar.xz when a file is modified within a minute 2025-06-24 20:30:37 +09:00
acct.c kernel/acct.c: saner struct file treatment 2025-09-27 20:13:56 -04:00
async.c
audit.c audit: fix skb leak when audit rate limit is exceeded 2025-09-10 19:55:00 -04:00
audit.h audit: create audit_stamp structure 2025-08-30 10:15:28 -04:00
audit_fsnotify.c VFS/audit: introduce kern_path_parent() for audit 2025-09-23 12:37:35 +02:00
audit_tree.c mount-related stuff for this cycle 2025-10-03 10:19:44 -07:00
audit_watch.c VFS/audit: introduce kern_path_parent() for audit 2025-09-23 12:37:35 +02:00
auditfilter.c audit/stable-6.18 PR 20250926 2025-09-30 08:22:16 -07:00
auditsc.c audit: add record for multiple object contexts 2025-08-30 10:15:30 -04:00
backtracetest.c
bounds.c
capability.c capability: Remove unused has_capability 2025-03-07 22:03:09 -06:00
cfi.c cfi: Move BPF CFI types and helpers to generic code 2025-07-31 18:23:53 -07:00
compat.c
configs.c
context_tracking.c context_tracking: Make RCU watch ct_kernel_exit_state() warning 2025-03-04 18:44:29 -08:00
cpu.c cpu: Remove obsolete comment from takedown_cpu() 2025-08-06 22:48:12 +02:00
cpu_pm.c
crash_core.c crash: add KUnit tests for crash_exclude_mem_range 2025-09-13 17:32:55 -07:00
crash_core_test.c crash: add KUnit tests for crash_exclude_mem_range 2025-09-13 17:32:55 -07:00
crash_dump_dm_crypt.c crash_dump: retrieve dm crypt keys in kdump kernel 2025-05-21 10:48:21 -07:00
crash_reserve.c kdump: implement reserve_crashkernel_cma 2025-07-19 19:08:23 -07:00
cred.c copy_process: pass clone_flags as u64 across calltree 2025-09-01 15:31:34 +02:00
delayacct.c delayacct: remove redundant code and adjust indentation 2025-05-27 19:40:33 -07:00
dma.c
elfcorehdr.c
exec_domain.c
exit.c task_stack.h: clean-up stack_not_used() implementation 2025-09-21 14:22:00 -07:00
exit.h
extable.c
fail_function.c
fork.c Patch series in this pull request: 2025-10-02 18:44:54 -07:00
freezer.c mm/oom_kill: thaw the entire OOM victim process 2025-09-21 14:22:35 -07:00
gen_kheaders.sh kheaders: make it possible to override TAR 2025-08-06 10:23:36 +09:00
groups.c
hung_task.c hung_task: dump blocker task if it is not hung 2025-09-13 17:32:43 -07:00
iomem.c mm/memremap: Pass down MEMREMAP_* flags to arch_memremap_wb() 2025-02-21 15:05:38 +01:00
irq_work.c
jump_label.c jump_label: Use RCU in all users of __module_text_address(). 2025-03-10 11:54:46 +01:00
kallsyms.c bpf: Clean up individual BTF_ID code 2025-07-16 18:34:42 -07:00
kallsyms_internal.h
kallsyms_selftest.c kallsyms: use kmalloc_array() instead of kmalloc() 2025-09-28 11:36:14 -07:00
kallsyms_selftest.h
kcmp.c kcmp: improve performance adding an unlikely hint to task comparisons 2025-02-21 10:25:33 +01:00
Kconfig.freezer
Kconfig.hz kernel: Fix "select" wording on HZ_250 description 2025-02-21 09:20:30 +01:00
Kconfig.kexec crash: add KUnit tests for crash_exclude_mem_range 2025-09-13 17:32:55 -07:00
Kconfig.locks
Kconfig.preempt softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT 2025-09-17 16:25:41 +02:00
kcov.c kcov: use write memory barrier after memcpy() in kcov_move_area() 2025-09-13 17:32:44 -07:00
kexec.c kexec: enable CMA based contiguous allocation 2025-08-02 12:01:38 -07:00
kexec_core.c kexec_core: remove redundant 0 value initialization 2025-09-13 17:32:49 -07:00
kexec_elf.c kexec: initialize ELF lowest address to ULONG_MAX 2025-03-16 22:30:47 -07:00
kexec_file.c x86/kexec: carry forward the boot DTB on kexec 2025-09-13 17:32:43 -07:00
kexec_handover.c kho: add support for preserving vmalloc allocations 2025-10-07 13:48:55 -07:00
kexec_internal.h kexec: enable CMA based contiguous allocation 2025-08-02 12:01:38 -07:00
kheaders.c
kprobes.c kprobes: Add missing kerneldoc for __get_insn_slot 2025-07-15 18:45:34 +09:00
kstack_erase.c stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth 2025-07-21 21:40:39 -07:00
ksyms_common.c
ksysfs.c
kthread.c ipvs: Fix estimator kthreads preferred affinity 2025-08-13 08:34:33 +02:00
latencytop.c
Makefile Patch series in this pull request: 2025-10-02 18:44:54 -07:00
module_signature.c
notifier.c
nscommon.c ns: drop assert 2025-09-25 09:23:54 +02:00
nsproxy.c namespace-6.18-rc1 2025-09-29 11:20:29 -07:00
nstree.c ns: move ns type into struct ns_common 2025-09-25 09:23:54 +02:00
padata.c padata: WQ_PERCPU added to alloc_workqueue users 2025-09-13 12:11:06 +08:00
panic.c panic: remove CONFIG_PANIC_ON_OOPS_VALUE 2025-09-28 11:36:13 -07:00
params.c params: Replace deprecated strcpy() with strscpy() and memcpy() 2025-08-16 21:47:25 +02:00
pid.c namespace-6.18-rc1 2025-09-29 11:20:29 -07:00
pid_namespace.c namespace-6.18-rc1 2025-09-29 11:20:29 -07:00
pid_sysctl.h
profile.c
ptrace.c ptrace: introduce PTRACE_SET_SYSCALL_INFO request 2025-05-11 17:48:15 -07:00
range.c
reboot.c - The 7 patch series "powerpc/crash: use generic crashkernel 2025-04-01 10:06:52 -07:00
regset.c
relay.c relayfs: support a counter tracking if data is too big to write 2025-07-09 22:57:52 -07:00
resource.c resource: improve child resource handling in release_mem_region_adjustable() 2025-09-21 14:22:34 -07:00
resource_kunit.c
rseq.c rseq: Protect event mask against membarrier IPI 2025-09-13 19:51:59 +02:00
scftorture.c
scs.c
seccomp.c Performance events updates for v6.18: 2025-09-30 11:11:21 -07:00
signal.c signal: Fix memory leak for PIDFD_SELF* sentinels 2025-08-19 13:51:28 +02:00
smp.c smp: Fix up and expand the smp_call_function_many() kerneldoc 2025-09-18 22:21:28 +02:00
smpboot.c sched/smp: Use the SMP version of idle_thread_set_boot_cpu() 2025-06-13 08:47:20 +02:00
smpboot.h
softirq.c softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT 2025-09-17 16:25:41 +02:00
stacktrace.c
static_call.c
static_call_inline.c Modules changes for 6.15-rc1 2025-03-30 15:44:36 -07:00
stop_machine.c sched/core: Fix migrate_swap() vs. hotplug 2025-07-01 15:02:03 +02:00
sys.c Patch series in this pull request: 2025-10-02 18:44:54 -07:00
sys_ni.c uprobes/x86: Add uprobe syscall to speed up uprobe 2025-08-21 20:09:20 +02:00
sysctl-test.c sysctl: move u8 register test to lib/test_sysctl.c 2025-04-14 14:13:41 +02:00
sysctl.c sysctl: rename kern_table -> sysctl_subsys_table 2025-07-23 11:56:02 +02:00
task_work.c
taskstats.c
torture.c torture: Delay CPU-hotplug operations until boot completes 2025-08-14 15:26:30 -07:00
tracepoint.c tracepoint: Print the function symbol when tracepoint_debug is set 2025-03-21 15:30:10 -04:00
tsacct.c pid: change bacct_add_tsk() to use task_ppid_nr_ns() 2025-08-19 13:38:20 +02:00
ucount.c ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below() 2025-08-02 12:01:38 -07:00
uid16.c
uid16.h
umh.c
up.c
user-return-notifier.c
user.c ns: move ns type into struct ns_common 2025-09-25 09:23:54 +02:00
user_namespace.c ns: move ns type into struct ns_common 2025-09-25 09:23:54 +02:00
utsname.c namespace-6.18-rc1 2025-09-29 11:20:29 -07:00
utsname_sysctl.c
vhost_task.c vhost: Take a reference on the task in struct vhost_task. 2025-09-21 17:44:20 -04:00
vmcore_info.c crash: export PAGE_UNACCEPTED_MAPCOUNT_VALUE to vmcoreinfo 2025-05-11 17:54:04 -07:00
watch_queue.c vfs-6.15-rc1.pipe 2025-03-24 09:52:37 -07:00
watchdog.c watchdog: skip checks when panic is in progress 2025-09-13 17:32:53 -07:00
watchdog_buddy.c watchdog: fix opencoded cpumask_next_wrap() in watchdog_next_cpu() 2025-07-31 11:28:03 -04:00
watchdog_perf.c watchdog: skip checks when panic is in progress 2025-09-13 17:32:53 -07:00
workqueue.c workqueue: WQ_PERCPU added to alloc_workqueue users 2025-09-16 10:33:53 -10:00
workqueue_internal.h