workqueue: add CONFIG_BOOTPARAM_WQ_STALL_PANIC option

Add a kernel config option to set the default value of
workqueue.panic_on_stall, similar to CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC,
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC and CONFIG_BOOTPARAM_HUNG_TASK_PANIC.

This allows the number of workqueue stalls that triggers a kernel panic
to be set at build time, which is useful for high-availability systems
that need consistent panic-on-stall behavior, that is, servers that
already run with CONFIG_BOOTPARAM_*_PANIC=y.

The default remains 0 (disabled). Setting it to 1 will panic on the
first stall, and higher values will panic after that many stall
warnings. The value can still be overridden at boot via the
workqueue.panic_on_stall kernel parameter, or at runtime via sysfs.
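
The threshold semantics described above can be sketched as a small
user-space model. This is only an illustration, not the kernel
implementation: struct stall_counter, stall_counter_report() and
warnings_until_panic() are hypothetical names invented for this example,
and the model simply returns true where the kernel would panic.

```c
#include <stdbool.h>

/* Hypothetical user-space model of the threshold; not kernel code. */
struct stall_counter {
	unsigned int threshold;	/* models workqueue.panic_on_stall */
	unsigned int stalls;	/* stall warnings counted so far */
};

/* Record one stall warning; return true when the model would panic. */
bool stall_counter_report(struct stall_counter *sc)
{
	if (sc->threshold == 0)
		return false;	/* 0 disables panic-on-stall */

	sc->stalls++;
	return sc->stalls >= sc->threshold;
}

/*
 * Return how many stall warnings it takes to reach a panic, or 0 if no
 * panic happens within max_warnings warnings.
 */
unsigned int warnings_until_panic(unsigned int threshold,
				  unsigned int max_warnings)
{
	struct stall_counter sc = { .threshold = threshold, .stalls = 0 };
	unsigned int i;

	for (i = 1; i <= max_warnings; i++) {
		if (stall_counter_report(&sc))
			return i;
	}
	return 0;
}
```

In this model a threshold of 0 never panics, a threshold of 1 panics on
the first warning, and a threshold of n panics on the nth warning.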

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Breno Leitao 2026-02-03 09:01:17 -08:00 committed by Tejun Heo
parent 51cd2d2dec
commit 32d572e390
3 changed files with 26 additions and 2 deletions


@@ -8336,7 +8336,8 @@ Kernel parameters
 CONFIG_WQ_WATCHDOG. It sets the number times of the
 stall to trigger panic.
-The default is 0, which disables the panic on stall.
+The default is set by CONFIG_BOOTPARAM_WQ_STALL_PANIC,
+which is 0 (disabled) if not configured.
 workqueue.cpu_intensive_thresh_us=
 Per-cpu work items which run for longer than this


@@ -7568,7 +7568,7 @@ static struct timer_list wq_watchdog_timer;
 static unsigned long wq_watchdog_touched = INITIAL_JIFFIES;
 static DEFINE_PER_CPU(unsigned long, wq_watchdog_touched_cpu) = INITIAL_JIFFIES;
-static unsigned int wq_panic_on_stall;
+static unsigned int wq_panic_on_stall = CONFIG_BOOTPARAM_WQ_STALL_PANIC;
 module_param_named(panic_on_stall, wq_panic_on_stall, uint, 0644);
 /*


@@ -1297,6 +1297,29 @@ config WQ_WATCHDOG
 state. This can be configured through kernel parameter
 "workqueue.watchdog_thresh" and its sysfs counterpart.
+config BOOTPARAM_WQ_STALL_PANIC
+int "Panic on Nth workqueue stall"
+default 0
+range 0 100
+depends on WQ_WATCHDOG
+help
+Set the number of workqueue stalls to trigger a kernel panic.
+A workqueue stall occurs when a worker pool doesn't make forward
+progress on a pending work item for over 30 seconds (configurable
+using the workqueue.watchdog_thresh parameter).
+If n = 0, the kernel will not panic on stall. If n > 0, the kernel
+will panic after n stall warnings.
+The panic can be used in combination with panic_timeout,
+to cause the system to reboot automatically after a
+stall has been detected. This feature is useful for
+high-availability systems that have uptime guarantees and
+where a stall must be resolved ASAP.
+This setting can be overridden at runtime via the
+workqueue.panic_on_stall kernel parameter.
 config WQ_CPU_INTENSIVE_REPORT
 bool "Report per-cpu work items which hog CPU for too long"
 depends on DEBUG_KERNEL