linux

mirror of https://github.com/torvalds/linux.git synced 2026-03-14 05:36:15 +01:00

Author	SHA1	Message	Date
Linus Torvalds	e7c375b181	vfs-6.18-rc7.fixes -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaRtBJwAKCRCRxhvAZXjc ou5CAQCJb5y2ULKklblICU1wR7Nr15WvTW7VVOcv44RJ22S3NgEAy4DLDBFBw8zC 8e7Hp8gxbjsq8ZJmU088aobFcqbZOwk= =TAnu -----END PGP SIGNATURE----- Merge tag 'vfs-6.18-rc7.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix unitialized variable in statmount_string() - Fix hostfs mounting when passing host root during boot - Fix dynamic lookup to fail on cell lookup failure - Fix missing file type when reading bfs inodes from disk - Enforce checking of sb_min_blocksize() calls and update all callers accordingly - Restore write access before closing files opened by open_exec() in binfmt_misc - Always freeze efivarfs during suspend/hibernate cycles - Fix statmount()'s and listmount()'s grab_requested_mnt_ns() helper to actually allow mount namespace file descriptor in addition to mount namespace ids - Fix tmpfs remount when noswap is specified - Switch Landlock to iput_not_last() to remove false-positives from might_sleep() annotations in iput() - Remove dead node_to_mnt_ns() code - Ensure that per-queue kobjects are successfully created * tag 'vfs-6.18-rc7.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: landlock: fix splats from iput() after it started calling might_sleep() fs: add iput_not_last() shmem: fix tmpfs reconfiguration (remount) when noswap is set fs/namespace: correctly handle errors returned by grab_requested_mnt_ns power: always freeze efivarfs binfmt_misc: restore write access before closing files opened by open_exec() block: add __must_check attribute to sb_min_blocksize() virtio-fs: fix incorrect check for fsvq->kobj xfs: check the return value of sb_min_blocksize() in xfs_fs_fill_super isofs: check the return value of sb_min_blocksize() in isofs_fill_super exfat: check return value of sb_min_blocksize in exfat_read_boot_sector vfat: fix missing sb_min_blocksize() return value checks mnt: Remove dead code which might prevent from building bfs: Reconstruct file type when loading from disk afs: Fix dynamic lookup to fail on cell lookup failure hostfs: Fix only passing host root in boot stage with new mount fs: Fix uninitialized 'offp' in statmount_string()	2025-11-17 09:11:27 -08:00
Linus Torvalds	418592a040	sched_ext: Fixes for v6.18-rc6 Five fixes addressing PREEMPT_RT compatibility and locking issues. Three commits fix potential deadlocks and sleeps in atomic contexts on RT kernels by converting locks to raw spinlocks and ensuring IRQ work runs in hard-irq context. The remaining two fix unsafe locking in the debug dump path and a variable dereference typo. -----BEGIN PGP SIGNATURE----- iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCaRs/0w4cdGpAa2VybmVs Lm9yZwAKCRCxYfJx3gVYGRjVAQClymu/aOROE97FWD+JEsuNYIse5qkBEiAfJtWR D9pz7QD+IGyvvF51zVS1tM8eBVoO0AX2Xc6vY/rfY9p9RUfRKwo= =OwLv -----END PGP SIGNATURE----- Merge tag 'sched_ext-for-6.18-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: "Five fixes addressing PREEMPT_RT compatibility and locking issues. Three commits fix potential deadlocks and sleeps in atomic contexts on RT kernels by converting locks to raw spinlocks and ensuring IRQ work runs in hard-irq context. The remaining two fix unsafe locking in the debug dump path and a variable dereference typo" * tag 'sched_ext-for-6.18-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Use IRQ_WORK_INIT_HARD() to initialize rq->scx.kick_cpus_irq_work sched_ext: Fix possible deadlock in the deferred_irq_workfn() sched/ext: convert scx_tasks_lock to raw spinlock sched_ext: Fix unsafe locking in the scx_dump_state() sched_ext: Fix use of uninitialized variable in scx_bpf_cpuperf_set()	2025-11-17 09:01:22 -08:00
Zqiang	36c6f3c03d	sched_ext: Use IRQ_WORK_INIT_HARD() to initialize rq->scx.kick_cpus_irq_work For PREEMPT_RT kernels, the kick_cpus_irq_workfn() be invoked in the per-cpu irq_work/* task context and there is no rcu-read critical section to protect. this commit therefore use IRQ_WORK_INIT_HARD() to initialize the per-cpu rq->scx.kick_cpus_irq_work in the init_sched_ext_class(). Signed-off-by: Zqiang <qiang.zhang@linux.dev> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-11-17 05:07:22 -10:00
Linus Torvalds	7ba45f1504	7 hotfixes. 5 are cc:stable, 4 are against mm/. All are singletons - please see the respective changelogs for details. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaRoauQAKCRDdBJ7gKXxA jtNFAQDEMH0+zRGz/Larkf9cgmdKcDgij1DP2gP/3i8PWAoaGQD8C9evZxu1h9wC rFbaSkPDeSdDafo3RZfpo1gqE0LdEA4= =oew8 -----END PGP SIGNATURE----- Merge tag 'mm-hotfixes-stable-2025-11-16-10-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "7 hotfixes. 5 are cc:stable, 4 are against mm/ All are singletons - please see the respective changelogs for details" * tag 'mm-hotfixes-stable-2025-11-16-10-40' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm, swap: fix potential UAF issue for VMA readahead selftests/user_events: fix type cast for write_index packed member in perf_test lib/test_kho: check if KHO is enabled mm/huge_memory: fix folio split check for anon folios in swapcache MAINTAINERS: update David Hildenbrand's email address crash: fix crashkernel resource shrink mm: fix MAX_FOLIO_ORDER on powerpc configs with hugetlb	2025-11-16 13:31:14 -08:00
Sourabh Jain	00fbff75c5	crash: fix crashkernel resource shrink When crashkernel is configured with a high reservation, shrinking its value below the low crashkernel reservation causes two issues: 1. Invalid crashkernel resource objects 2. Kernel crash if crashkernel shrinking is done twice For example, with crashkernel=200M,high, the kernel reserves 200MB of high memory and some default low memory (say 256MB). The reservation appears as: cat /proc/iomem \| grep -i crash af000000-beffffff : Crash kernel 433000000-43f7fffff : Crash kernel If crashkernel is then shrunk to 50MB (echo 52428800 > /sys/kernel/kexec_crash_size), /proc/iomem still shows 256MB reserved: af000000-beffffff : Crash kernel Instead, it should show 50MB: af000000-b21fffff : Crash kernel Further shrinking crashkernel to 40MB causes a kernel crash with the following trace (x86): BUG: kernel NULL pointer dereference, address: 0000000000000038 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI <snip...> Call Trace: <TASK> ? __die_body.cold+0x19/0x27 ? page_fault_oops+0x15a/0x2f0 ? search_module_extables+0x19/0x60 ? search_bpf_extables+0x5f/0x80 ? exc_page_fault+0x7e/0x180 ? asm_exc_page_fault+0x26/0x30 ? __release_resource+0xd/0xb0 release_resource+0x26/0x40 __crash_shrink_memory+0xe5/0x110 crash_shrink_memory+0x12a/0x190 kexec_crash_size_store+0x41/0x80 kernfs_fop_write_iter+0x141/0x1f0 vfs_write+0x294/0x460 ksys_write+0x6d/0xf0 <snip...> This happens because __crash_shrink_memory()/kernel/crash_core.c incorrectly updates the crashk_res resource object even when crashk_low_res should be updated. Fix this by ensuring the correct crashkernel resource object is updated when shrinking crashkernel memory. Link: https://lkml.kernel.org/r/20251101193741.289252-1-sourabhjain@linux.ibm.com Fixes: `16c6006af4` ("kexec: enable kexec_crash_size to support two crash kernel regions") Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-11-15 10:52:01 -08:00
Linus Torvalds	bb1a6ddcfa	Fix a memory leak in the posix timer creation logic. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmkYQ4URHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1gwPhAAuwspuw4ODgk3qpEcYzK8AypsARxD5dC9 Dzm+kcWLOk2hDGCQIRnxm7HVE1/wogprmtr8OUorqvOzKXAxl+4Wi+Pwof3i7TRY 8qKAJBcapFOx5wi6h6CTpfHkWRPtfV4F4FG/bJuqHUAOKWjYdTZt+Vlod4jrDteM ySYxmIdBMiAzRGp3VJwjoSIA0FxssqyCGgwwBBp8a9sncdvGHvLKKM3Vdr18LgOH Tdtq32n2x09FgyFsffkc7gGTvmMhP4Ln5vQBltuMNXWcQoOv85tTCCLXzHyfQvYk OGhyq6OALJIX9xqqQoxZBAf3gil6F7fLrUZrgIgeZw1FWeMqtqvdUk22H9o5Vm7u FKbX4imDApDRlFhFAz0SSJWHGrggJ9cKR5BON6Rk0BpF+6koGeJlhjs7rCvHX33X 7qdiRLsi9Fu79lo+nEJC+EbvghhpNeQG38uqty+3VU1c/LDM5Hv50EJz5G7AgL1t akKvSEIRixLjT5FtPp9jdecjh17F//Nm0M/zG44cxof46jF8cGzVyPdkFrBMY2mG FEJwAsFqiiFNfrC27gC08KFowuKJbiwEfOZYFCj3BtBZGgPKv/uh5zO+4ZtNRdLW Fy9ykQ7hL0JH2qekK1mdbhenjbrxORqSBiPztv7ziMkkd4pRs2cmVrz+8gnNoOmi 7in63SMSjWw= =fTUR -----END PGP SIGNATURE----- Merge tag 'timers-urgent-2025-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Ingo Molnar: "Fix a memory leak in the posix timer creation logic" * tag 'timers-urgent-2025-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: posix-timers: Plug potential memory leak in do_timer_create()	2025-11-15 08:51:43 -08:00
Linus Torvalds	cbba5d1b53	bpf-fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmkXpZUACgkQ6rmadz2v bTrGCw//UCx+KBXbzvv7m0A1QGOUL3oHL/Qd+OJA3RW3B+saVbYYzn9jjl0SRgFP X0q/DwbDOjFtOSORV9oFgJkrucn7+BM/yxPaC4sE1SQZJAjDFA/CSaF0r8duuGsM Mvat9TTiwwetOMAkNB9WZ1e6AKGovBLguLFGAWZc6vLeQZopcER5+pFwS44a9RrK dq0Th8O/oY3VmUDgSKJ2KyY51KxpJU7k2ipifiIbu1M1MWZ7s2vERkMEkzJ/lB8/ nldMsTZUdknGFzVH/W6Rc9ScFYlH+h/x1gkOHwTibMsqDBm92mWVo6O7hvuUbsEO NlPDgMtkhBp7PDSx9SA0UBcriMs1M6ovNBOpj/cI4AL1k8WNubf/FHZtrBwoy8C9 3HaM+8lkA2uiHVPUvT5dImzWqshweN0GXoXAoa9xPSQPchJ38UdzCHqYRAg/kWFZ 5jUK2j4e5+yyII44pD7Xti0PrfoP81giliqmTbGFV8+Y89dQnk+WK12vnbv34ER7 unLwId8HLtq0ZN7FVG4F6s/4qNdEMKqXbAkve0WWFXn4vKZMCju4ol6NYVGisRAg zcn7Yk+weSuY3UOzC+/4SxhfTEAD0Kg6fUoG/1JdflgNsm8XhLBja0DZaAlIVO0p xz5UaljwcNvjAKGGMYbCGrf3XN2tOmGpVyJkMj17Vcq88y3bJBU= =JJui -----END PGP SIGNATURE----- Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: - Fix interaction between livepatch and BPF fexit programs (Song Liu) With Steven and Masami acks. - Fix stack ORC unwind from BPF kprobe_multi (Jiri Olsa) With Steven and Masami acks. - Fix out of bounds access in widen_imprecise_scalars() in the verifier (Eduard Zingerman) - Fix conflicts between MPTCP and BPF sockmap (Jiayuan Chen) - Fix net_sched storage collision with BPF data_meta/data_end (Eric Dumazet) - Add _impl suffix to BPF kfuncs with implicit args to avoid breaking them in bpf-next when KF_IMPLICIT_ARGS is added (Mykyta Yatsenko) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Test widen_imprecise_scalars() with different stack depth bpf: account for current allocated stack depth in widen_imprecise_scalars() bpf: Add bpf_prog_run_data_pointers() selftests/bpf: Add mptcp test with sockmap mptcp: Fix proto fallback detection with BPF mptcp: Disallow MPTCP subflows from sockmap selftests/bpf: Add stacktrace ips test for raw_tp selftests/bpf: Add stacktrace ips test for kprobe_multi/kretprobe_multi x86/fgraph,bpf: Fix stack ORC unwind from kprobe_multi return probe Revert "perf/x86: Always store regs->ip in perf_callchain_kernel()" bpf: add _impl suffix for bpf_stream_vprintk() kfunc bpf:add _impl suffix for bpf_task_work_schedule* kfuncs selftests/bpf: Add tests for livepatch + bpf trampoline ftrace: bpf: Fix IPMODIFY + DIRECT in modify_ftrace_direct() ftrace: Fix BPF fexit with livepatch	2025-11-14 15:39:39 -08:00
Eduard Zingerman	b0c8e6d3d8	bpf: account for current allocated stack depth in widen_imprecise_scalars() The usage pattern for widen_imprecise_scalars() looks as follows: prev_st = find_prev_entry(env, ...); queued_st = push_stack(...); widen_imprecise_scalars(env, prev_st, queued_st); Where prev_st is an ancestor of the queued_st in the explored states tree. This ancestor is not guaranteed to have same allocated stack depth as queued_st. E.g. in the following case: def main(): for i in 1..2: foo(i) // same callsite, differnt param def foo(i): if i == 1: use 128 bytes of stack iterator based loop Here, for a second 'foo' call prev_st->allocated_stack is 128, while queued_st->allocated_stack is much smaller. widen_imprecise_scalars() needs to take this into account and avoid accessing bpf_verifier_state->frame[*]->stack out of bounds. Fixes: `2793a8b015` ("bpf: exact states comparison for iterator convergence checks") Reported-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251114025730.772723-1-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-11-14 09:26:05 -08:00
Eslam Khafagy	e0fd4d42e2	posix-timers: Plug potential memory leak in do_timer_create() When posix timer creation is set to allocate a given timer ID and the access to the user space value faults, the function terminates without freeing the already allocated posix timer structure. Move the allocation after the user space access to cure that. [ tglx: Massaged change log ] Fixes: `ec2d0c0462` ("posix-timers: Provide a mechanism to allocate a given timer ID") Reported-by: syzbot+9c47ad18f978d4394986@syzkaller.appspotmail.com Suggested-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Eslam Khafagy <eslam.medhat1993@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://patch.msgid.link/20251114122739.994326-1-eslam.medhat1993@gmail.com Closes: https://lore.kernel.org/all/69155df4.a70a0220.3124cb.0017.GAE@google.com/T/	2025-11-14 16:58:31 +01:00
Linus Torvalds	aecba2e013	Power management fixes for 6.18-rc6 - Fix issues related to using inadequate data types and incorrect use of atomic variables in the compressed hibernation images handling code that were introduced during the 6.9 development cycle (Mario Limonciello) - Move a X86_FEATURE_IDA check from turbo_is_disabled() to the places where a new value for MSR_IA32_PERF_CTL is computed in intel_pstate to address a regression preventing users from enabling turbo frequencies post-boot (Srinivas Pandruvada) -----BEGIN PGP SIGNATURE----- iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmkWOnoSHHJqd0Byand5 c29ja2kubmV0AAoJEO5fvZ0v1OO14gkH/08TYjtNEvC6acD2r5gJTpacdnExWopL NFmRyhuZM21Ja2gd1q0xtPvcTdAz3rkhB4vqn9KQ0oLkfXj08/+zpRyOP3PzVVSp bvE/Am28s/VChjDg/MFcP7o/fLSNoL73wK6er+i721KIV1uscK4FydkPNs6gpBHw 03FkUJX8jRjil0Cp6km2O0Zo5SEgm/U6wDjR5Azpdru8VKbI1RaxCMsR0/HnlA9Y pUAph9NX1UBBjdMFFdn8++Vna8XJX4qe9CiYT7KwGbGx5jUpVBaT9d/hPm0O/mJt VvNe3Dl5soFM/3yibsvV4sTcZHNPTsIKjuKIqwL4F0TCGug9kxwjFBk= =r3Fq -----END PGP SIGNATURE----- Merge tag 'pm-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix issues related to the handling of compressed hibernation images and a recent intel_pstate driver regression: - Fix issues related to using inadequate data types and incorrect use of atomic variables in the compressed hibernation images handling code that were introduced during the 6.9 development cycle (Mario Limonciello) - Move a X86_FEATURE_IDA check from turbo_is_disabled() to the places where a new value for MSR_IA32_PERF_CTL is computed in intel_pstate to address a regression preventing users from enabling turbo frequencies post-boot (Srinivas Pandruvada)" * tag 'pm-6.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: intel_pstate: Check IDA only before MSR_IA32_PERF_CTL writes PM: hibernate: Fix style issues in save_compressed_image() PM: hibernate: Use atomic64_t for compressed_size variable PM: hibernate: Emit an error when image writing fails	2025-11-13 16:31:07 -08:00
Rafael J. Wysocki	161284b26f	Merge branch 'pm-sleep' Merge fixes for issues related to the handling of compressed hibernation images that were introduced during the 6.9 development cycle. * pm-sleep: PM: hibernate: Fix style issues in save_compressed_image() PM: hibernate: Use atomic64_t for compressed_size variable PM: hibernate: Emit an error when image writing fails	2025-11-13 21:05:46 +01:00
Zqiang	a257e97421	sched_ext: Fix possible deadlock in the deferred_irq_workfn() For PREEMPT_RT=y kernels, the deferred_irq_workfn() is executed in the per-cpu irq_work/* task context and not disable-irq, if the rq returned by container_of() is current CPU's rq, the following scenarios may occur: lock(&rq->__lock); <Interrupt> lock(&rq->__lock); This commit use IRQ_WORK_INIT_HARD() to replace init_irq_work() to initialize rq->scx.deferred_irq_work, make the deferred_irq_workfn() is always invoked in hard-irq context. Signed-off-by: Zqiang <qiang.zhang@linux.dev> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-11-13 08:29:28 -10:00
Emil Tsalapatis	c87488a123	sched/ext: convert scx_tasks_lock to raw spinlock Update scx_task_locks so that it's safe to lock/unlock in a non-sleepable context in PREEMPT_RT kernels. scx_task_locks is (non-raw) spinlock used to protect the list of tasks under SCX. This list is updated during from finish_task_switch(), which cannot sleep. Regular spinlocks can be locked in such a context in non-RT kernels, but are sleepable under when CONFIG_PREEMPT_RT=y. Convert scx_task_locks into a raw spinlock, which is not sleepable even on RT kernels. Sample backtrace: <TASK> dump_stack_lvl+0x83/0xa0 __might_resched+0x14a/0x200 rt_spin_lock+0x61/0x1c0 ? sched_ext_dead+0x2d/0xf0 ? lock_release+0xc6/0x280 sched_ext_dead+0x2d/0xf0 ? srso_alias_return_thunk+0x5/0xfbef5 finish_task_switch.isra.0+0x254/0x360 __schedule+0x584/0x11d0 ? srso_alias_return_thunk+0x5/0xfbef5 ? srso_alias_return_thunk+0x5/0xfbef5 ? tick_nohz_idle_exit+0x7e/0x120 schedule_idle+0x23/0x40 cpu_startup_entry+0x29/0x30 start_secondary+0xf8/0x100 common_startup_64+0x13e/0x148 </TASK> Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-11-12 08:42:02 -10:00
Zqiang	5f02151c41	sched_ext: Fix unsafe locking in the scx_dump_state() For built with CONFIG_PREEMPT_RT=y kernels, the dump_lock will be converted sleepable spinlock and not disable-irq, so the following scenarios occur: inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. irq_work/0/27 [HC0[0]:SC0[0]:HE1:SE1] takes: (&rq->__lock){?...}-{2:2}, at: raw_spin_rq_lock_nested+0x2b/0x40 {IN-HARDIRQ-W} state was registered at: lock_acquire+0x1e1/0x510 _raw_spin_lock_nested+0x42/0x80 raw_spin_rq_lock_nested+0x2b/0x40 sched_tick+0xae/0x7b0 update_process_times+0x14c/0x1b0 tick_periodic+0x62/0x1f0 tick_handle_periodic+0x48/0xf0 timer_interrupt+0x55/0x80 __handle_irq_event_percpu+0x20a/0x5c0 handle_irq_event_percpu+0x18/0xc0 handle_irq_event+0xb5/0x150 handle_level_irq+0x220/0x460 __common_interrupt+0xa2/0x1e0 common_interrupt+0xb0/0xd0 asm_common_interrupt+0x2b/0x40 _raw_spin_unlock_irqrestore+0x45/0x80 __setup_irq+0xc34/0x1a30 request_threaded_irq+0x214/0x2f0 hpet_time_init+0x3e/0x60 x86_late_time_init+0x5b/0xb0 start_kernel+0x308/0x410 x86_64_start_reservations+0x1c/0x30 x86_64_start_kernel+0x96/0xa0 common_startup_64+0x13e/0x148 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&rq->__lock); <Interrupt> lock(&rq->__lock); * DEADLOCK * stack backtrace: CPU: 0 UID: 0 PID: 27 Comm: irq_work/0 Call Trace: <TASK> dump_stack_lvl+0x8c/0xd0 dump_stack+0x14/0x20 print_usage_bug+0x42e/0x690 mark_lock.part.44+0x867/0xa70 ? __pfx_mark_lock.part.44+0x10/0x10 ? string_nocheck+0x19c/0x310 ? number+0x739/0x9f0 ? __pfx_string_nocheck+0x10/0x10 ? __pfx_check_pointer+0x10/0x10 ? kvm_sched_clock_read+0x15/0x30 ? sched_clock_noinstr+0xd/0x20 ? local_clock_noinstr+0x1c/0xe0 __lock_acquire+0xc4b/0x62b0 ? __pfx_format_decode+0x10/0x10 ? __pfx_string+0x10/0x10 ? __pfx___lock_acquire+0x10/0x10 ? __pfx_vsnprintf+0x10/0x10 lock_acquire+0x1e1/0x510 ? raw_spin_rq_lock_nested+0x2b/0x40 ? __pfx_lock_acquire+0x10/0x10 ? dump_line+0x12e/0x270 ? raw_spin_rq_lock_nested+0x20/0x40 _raw_spin_lock_nested+0x42/0x80 ? raw_spin_rq_lock_nested+0x2b/0x40 raw_spin_rq_lock_nested+0x2b/0x40 scx_dump_state+0x3b3/0x1270 ? finish_task_switch+0x27e/0x840 scx_ops_error_irq_workfn+0x67/0x80 irq_work_single+0x113/0x260 irq_work_run_list.part.3+0x44/0x70 run_irq_workd+0x6b/0x90 ? __pfx_run_irq_workd+0x10/0x10 smpboot_thread_fn+0x529/0x870 ? __pfx_smpboot_thread_fn+0x10/0x10 kthread+0x305/0x3f0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x40/0x70 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> This commit therefore use rq_lock_irqsave/irqrestore() to replace rq_lock/unlock() in the scx_dump_state(). Fixes: `07814a9439` ("sched_ext: Print debug dump after an error exit") Signed-off-by: Zqiang <qiang.zhang@linux.dev> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-11-12 06:28:32 -10:00
Christian Brauner	a3f8f86627	power: always freeze efivarfs The efivarfs filesystems must always be frozen and thawed to resync variable state. Make it so. Link: https://patch.msgid.link/20251105-vorbild-zutreffen-fe00d1dd98db@brauner Signed-off-by: Christian Brauner <brauner@kernel.org>	2025-11-12 10:12:39 +01:00
Pratyush Yadav	b05addf6f0	kho: warn and exit when unpreserved page wasn't preserved Calling __kho_unpreserve() on a pair of (pfn, end_pfn) that wasn't preserved is a bug. Currently, if that is done, the physxa or bits can be NULL. This results in a soft lockup since a NULL physxa or bits results in redoing the loop without ever making any progress. Return when physxa or bits are not found, but WARN first to loudly indicate invalid behaviour. Link: https://lkml.kernel.org/r/20251103180235.71409-3-pratyush@kernel.org Fixes: `fc33e4b44b` ("kexec: enable KHO support for memory preservation") Signed-off-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Baoquan He <bhe@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-11-09 21:19:47 -08:00
Pratyush Yadav	7ecd2e439d	kho: fix unpreservation of higher-order vmalloc preservations kho_vmalloc_unpreserve_chunk() calls __kho_unpreserve() with end_pfn as pfn + 1. This happens to work for 0-order pages, but leaks higher order pages. For example, say order 2 pages back the allocation. During preservation, they get preserved in the order 2 bitmaps, but kho_vmalloc_unpreserve_chunk() would try to unpreserve them from the order 0 bitmaps, which should not have these bits set anyway, leaving the order 2 bitmaps untouched. This results in the pages being carried over to the next kernel. Nothing will free those pages in the next boot, leaking them. Fix this by taking the order into account when calculating the end PFN for __kho_unpreserve(). Link: https://lkml.kernel.org/r/20251103180235.71409-2-pratyush@kernel.org Fixes: `a667300bd5` ("kho: add support for preserving vmalloc allocations") Signed-off-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Baoquan He <bhe@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-11-09 21:19:47 -08:00
Pratyush Yadav	0b07092d09	kho: fix out-of-bounds access of vmalloc chunk The list of pages in a vmalloc chunk is NULL-terminated. So when looping through the pages in a vmalloc chunk, both kho_restore_vmalloc() and kho_vmalloc_unpreserve_chunk() rightly make sure to stop when encountering a NULL page. But when the chunk is full, the loops do not stop and go past the bounds of chunk->phys, resulting in out-of-bounds memory access, and possibly the restoration or unpreservation of an invalid page. Fix this by making sure the processing of chunk stops at the end of the array. Link: https://lkml.kernel.org/r/20251103110159.8399-1-pratyush@kernel.org Fixes: `a667300bd5` ("kho: add support for preserving vmalloc allocations") Signed-off-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Baoquan He <bhe@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-11-09 21:19:47 -08:00
Peter Oberparleiter	ec4d11fc4b	gcov: add support for GCC 15 Using gcov on kernels compiled with GCC 15 results in truncated 16-byte long .gcda files with no usable data. To fix this, update GCOV_COUNTERS to match the value defined by GCC 15. Tested with GCC 14.3.0 and GCC 15.2.0. Link: https://lkml.kernel.org/r/20251028115125.1319410-1-oberpar@linux.ibm.com Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Reported-by: Matthieu Baerts <matttbe@kernel.org> Closes: https://github.com/linux-test-project/lcov/issues/445 Tested-by: Matthieu Baerts <matttbe@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-11-09 21:19:44 -08:00
Pasha Tatashin	fa759cd75b	kho: allocate metadata directly from the buddy allocator KHO allocates metadata for its preserved memory map using the slab allocator via kzalloc(). This metadata is temporary and is used by the next kernel during early boot to find preserved memory. A problem arises when KFENCE is enabled. kzalloc() calls can be randomly intercepted by kfence_alloc(), which services the allocation from a dedicated KFENCE memory pool. This pool is allocated early in boot via memblock. When booting via KHO, the memblock allocator is restricted to a "scratch area", forcing the KFENCE pool to be allocated within it. This creates a conflict, as the scratch area is expected to be ephemeral and overwriteable by a subsequent kexec. If KHO metadata is placed in this KFENCE pool, it leads to memory corruption when the next kernel is loaded. To fix this, modify KHO to allocate its metadata directly from the buddy allocator instead of slab. Link: https://lkml.kernel.org/r/20251021000852.2924827-4-pasha.tatashin@soleen.com Fixes: `fc33e4b44b` ("kexec: enable KHO support for memory preservation") Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: David Matlack <dmatlack@google.com> Cc: Alexander Graf <graf@amazon.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Samiullah Khawaja <skhawaja@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-11-09 21:19:42 -08:00
Pasha Tatashin	a2fff99f92	kho: increase metadata bitmap size to PAGE_SIZE KHO memory preservation metadata is preserved in 512 byte chunks which requires their allocation from slab allocator. Slabs are not safe to be used with KHO because of kfence, and because partial slabs may lead leaks to the next kernel. Change the size to be PAGE_SIZE. The kfence specifically may cause memory corruption, where it randomly provides slab objects that can be within the scratch area. The reason for that is that kfence allocates its objects prior to KHO scratch is marked as CMA region. While this change could potentially increase metadata overhead on systems with sparsely preserved memory, this is being mitigated by ongoing work to reduce sparseness during preservation via 1G guest pages. Furthermore, this change aligns with future work on a stateless KHO, which will also use page-sized bitmaps for its radix tree metadata. Link: https://lkml.kernel.org/r/20251021000852.2924827-3-pasha.tatashin@soleen.com Fixes: `fc33e4b44b` ("kexec: enable KHO support for memory preservation") Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Samiullah Khawaja <skhawaja@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-11-09 21:19:41 -08:00
Pasha Tatashin	e38f65d317	kho: warn and fail on metadata or preserved memory in scratch area Patch series "KHO: kfence + KHO memory corruption fix", v3. This series fixes a memory corruption bug in KHO that occurs when KFENCE is enabled. The root cause is that KHO metadata, allocated via kzalloc(), can be randomly serviced by kfence_alloc(). When a kernel boots via KHO, the early memblock allocator is restricted to a "scratch area". This forces the KFENCE pool to be allocated within this scratch area, creating a conflict. If KHO metadata is subsequently placed in this pool, it gets corrupted during the next kexec operation. Google is using KHO and have had obscure crashes due to this memory corruption, with stacks all over the place. I would prefer this fix to be properly backported to stable so we can also automatically consume it once we switch to the upstream KHO. Patch 1/3 introduces a debug-only feature (CONFIG_KEXEC_HANDOVER_DEBUG) that adds checks to detect and fail any operation that attempts to place KHO metadata or preserved memory within the scratch area. This serves as a validation and diagnostic tool to confirm the problem without affecting production builds. Patch 2/3 Increases bitmap to PAGE_SIZE, so buddy allocator can be used. Patch 3/3 Provides the fix by modifying KHO to allocate its metadata directly from the buddy allocator instead of slab. This bypasses the KFENCE interception entirely. This patch (of 3): It is invalid for KHO metadata or preserved memory regions to be located within the KHO scratch area, as this area is overwritten when the next kernel is loaded, and used early in boot by the next kernel. This can lead to memory corruption. Add checks to kho_preserve_* and KHO's internal metadata allocators (xa_load_or_alloc, new_chunk) to verify that the physical address of the memory does not overlap with any defined scratch region. If an overlap is detected, the operation will fail and a WARN_ON is triggered. To avoid performance overhead in production kernels, these checks are enabled only when CONFIG_KEXEC_HANDOVER_DEBUG is selected. [rppt@kernel.org: fix KEXEC_HANDOVER_DEBUG Kconfig dependency] Link: https://lkml.kernel.org/r/aQHUyyFtiNZhx8jo@kernel.org [pasha.tatashin@soleen.com: build fix] Link: https://lkml.kernel.org/r/CA+CK2bBnorfsTymKtv4rKvqGBHs=y=MjEMMRg_tE-RME6n-zUw@mail.gmail.com Link: https://lkml.kernel.org/r/20251021000852.2924827-1-pasha.tatashin@soleen.com Link: https://lkml.kernel.org/r/20251021000852.2924827-2-pasha.tatashin@soleen.com Fixes: `fc33e4b44b` ("kexec: enable KHO support for memory preservation") Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Mike Rapoport <rppt@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Samiullah Khawaja <skhawaja@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-11-09 21:19:41 -08:00
Linus Torvalds	b5c0946029	Fix a group-throttling bug in the fair scheduler. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmkPQRgRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1jtjQ//UM9YfdauTXacEE2dPG4b0QwxHPgMsFmS GMT7c6H7ApCQoiBnGwv3K0d0bu4FZwnWBMulUv/yhe49vHXdIPdZAlMkJrM3IA7X 80IVCfLHtRkbMaAUIDc3MGwkzneCUIuzKPPH1iXqn9/R0zBB8S7qxt4XHyBHzlX7 uCNbFkUFhrh5s3sWond2ogAlCvGiZ5Qo7/oTfNrpOmYGvXfNIh4T1zDOWpPrsKLX Md6rucBs0bcV1vlyKwNrobqOuaS0mdSxjt+SKDuI1CdCj6mNbYvjLPinnAi2n3zZ CLaI8+rBe3JpLOH+kXuOf+CUatXDBjF4GO1k6XXwvcsK4ARqmKcbk9Xs3i/Tn/bm Ls7IdLkCrekXaGU2MlYLVg0twe+O5oUgwBpa4Ap/IObbI+fIKP/Pj2blZlpT1RlY J455LrldsMUy1NWaqVd13gCGOPGzR6SrD+ruOJS4BAK3JFyRw8rdDW4zJ7qEiCRj yejZfiFcCAoD7cqFJoCon6rt+WC3T5I1/Sc40JCmfKH/GhLzTAGt/8cRaic4ntRX Yv8T/lJVgjhGqfDWmcRYZVF/SiyZq8IP+wqrpr3ETAfRhqB5ZhQHPUGZPPDDeeqH QHEYSrWdbmxKqYpBK3nwgwAIz9dSJcdQUqWLvP85rjRTeLr08/reXha4P80RJ3wz XmLzR7KQ9mI= =zLw8 -----END PGP SIGNATURE----- Merge tag 'sched-urgent-2025-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Ingo Molnar: "Fix a group-throttling bug in the fair scheduler" * tag 'sched-urgent-2025-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Prevent cfs_rq from being unthrottled with zero runtime_remaining	2025-11-08 08:59:05 -08:00
Linus Torvalds	133262cae9	Fix a system hang caused by cpu-clock events. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmkPQBYRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1j3tQ//f63vT4+Wn87Ukm+5sOdNcaCaYuYdW7Pa lMEegZgPxmtOsrN79qEyeaZzA5v3KH/ijE1ENKb4GvDdUBKFrxiVZ/LxxMs0+4PC O40pHPFKBBT7Aolu+TSp8LJgo0hvFKMpBFV5nLPsf/iwr4otgI7UFaMfWtdZQ4J7 e1iPuEt2hVrlcbgDn0HdT6YQfrWJNHyWLu2a16TMsklryuJRoA3lJJusDMVPry1o REacolLCH4c+zvlLcCGx33LQl9k560RZqQVnZwkSRlvv2pkf7pa8XG8f2nNAsxFf DbXRLNfXrVMOciLkgDUeJ6Vb9feMNDF0+pNYWOX4hZ6iOX4bmg5CEScmDwTq6xpU XMefVvZMhyTKxUHynGzvVgTyTgjNbawvLqLwXjNYcSrl9+WSOnnBgLn/YgrCF8lY W/wh2jAGzmKk3wm9r65pyjLW+GJwVT8zEKD2J6UzpRd52ITbHMuZ+StRdPSOEb5+ 1fgD5FTXerYYV7FwC7SgcWpJ+BtBptfngi2PzPVCQ+VexvGJPqdCFQ0EcoXPpznc XsjmuLgef2mRJlNjKSVwLFjXOwfUe4Dsj54cAurUEh0xDINwQelS1NRykmXQfgod i6a/nCCmSdVDBydMnM7AhH/hqRUz9CErNmIPUJS5VLjyKlW12qm0KsD9a0K0oei8 67GKLQhj/FI= =SulU -----END PGP SIGNATURE----- Merge tag 'perf-urgent-2025-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event fix from Ingo Molnar: "Fix a system hang caused by cpu-clock events deadlock" * tag 'perf-urgent-2025-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix system hang caused by cpu-clock usage	2025-11-08 08:54:13 -08:00
Linus Torvalds	e6f55fe790	Fix (well, cut in half) a futex performance regression on PowerPC. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmkPP2YRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1hTlg//WBWA4717W46TetZJAYauc1rDeunhwoXQ nVQntZ4/jWrDroWhOC5iElaQGQlO974pEqt+z99IbnIFJ273vwgf03u6txcEOydu 2zmL067IC1TPNyrsnZ+yimeMEkpFKXYg00RhhSLr9HK6YPEbLh/SI8nYqSUvLcaJ eY60Ck8xgGf6oc4atiEPyUz6oX3EcEMVYUMXopsv8KksP6MUzbB6bbRMWklQn03s qFoDfrOF0aBlsMbjS5GsqeTMUTlYs/Py1L2IWMoKbSKhhVsT8eOazJZvcAKPCyhK OWrTZpW0vFcxS0dmLArGOejv8SyZsqQ35yOs5GfTa1qYOiQJ/trfi660tkglvddm DPXC2hPbR3XqLFmTKu1u2oJ46rfHYITWPilE3PfbaKOCVW1QEC2Q8p+cGPswvlFi CH6lU2UiiyiRIkvqGevEjAd9oVGyCQg01Pxce7WDRYmLYoFf3V3xrVwvC3IqoUM8 kLfUAxAGZN0CQFAwaIwi1hWSmCmEaHjIETZ321OsJYGSs/VbkM9TgXTs3J2fkz4i nY4FLCoxAwTvOrk9aW8fBKdGs51pUL/zWA3dGcrz3pJunYWnEiZ2Cvg+fEn6n46m fYT6KjOhqbap6V5Y7W7ny5qKiD3x7rEGX++nWHvL02TsPCe4fC/paob5RL/AmqNA D4S+6mpLwlQ= =ehm1 -----END PGP SIGNATURE----- Merge tag 'locking-urgent-2025-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Ingo Molnar: "Fix (well, cut in half) a futex performance regression on PowerPC" * tag 'locking-urgent-2025-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Optimize per-cpu reference counting	2025-11-08 08:51:22 -08:00
Linus Torvalds	5b95a50001	Fixes for tracing: - Check for reader catching up in ring_buffer_map_get_reader() If the reader catches up to the writer in the memory mapped ring buffer then calling rb_get_reader_page() will return NULL as there's no pages left. But this isn't checked for before calling rb_get_reader_page() and the return of NULL causes a warning. If it is detected that the reader caught up to the writer, then simply exit the routine. - Fix memory leak in histogram create_field_var() The couple of the error paths in create_field_var() did not properly clean up what was allocated. Make sure everything is freed properly on error. - Fix help message of tools latency_collector The help message incorrectly stated that "-t" was the same as "--threads" whereas "--threads" is actually represented by "-e". -----BEGIN PGP SIGNATURE----- iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaQ3wOxQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qrYvAP9zLYz/pCTTCY64/Yx2gMimFt7g9XhO b5xL+mZWoiYJigD+Ma7IpRC1QVyAk5YgxkWJqpEyHrxE84fBIBevoTRBTQE= =+x8m -----END PGP SIGNATURE----- Merge tag 'trace-v6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Check for reader catching up in ring_buffer_map_get_reader() If the reader catches up to the writer in the memory mapped ring buffer then calling rb_get_reader_page() will return NULL as there's no pages left. But this isn't checked for before calling rb_get_reader_page() and the return of NULL causes a warning. If it is detected that the reader caught up to the writer, then simply exit the routine - Fix memory leak in histogram create_field_var() The couple of the error paths in create_field_var() did not properly clean up what was allocated. Make sure everything is freed properly on error - Fix help message of tools latency_collector The help message incorrectly stated that "-t" was the same as "--threads" whereas "--threads" is actually represented by "-e" * tag 'trace-v6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/tools: Fix incorrcet short option in usage text for --threads tracing: Fix memory leaks in create_field_var() ring-buffer: Do not warn in ring_buffer_map_get_reader() when reader catches up	2025-11-07 08:07:11 -08:00
Mario Limonciello (AMD)	0b6c10cb84	PM: hibernate: Fix style issues in save_compressed_image() Address two issues indicated by checkpatch: - Trailing statements should be on next line. - Prefer 'unsigned int' to bare use of 'unsigned'. Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> [ rjw: Changelog edits ] Link: https://patch.msgid.link/20251106045158.3198061-4-superm1@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2025-11-07 16:53:57 +01:00
Mario Limonciello (AMD)	66ededc694	PM: hibernate: Use atomic64_t for compressed_size variable `compressed_size` can overflow, showing nonsensical values. Change from `atomic_t` to `atomic64_t` to prevent overflow. Fixes: `a06c6f5d3c` ("PM: hibernate: Move to crypto APIs for LZO compression") Reported-by: Askar Safin <safinaskar@gmail.com> Closes: https://lore.kernel.org/linux-pm/20251105180506.137448-1-safinaskar@gmail.com/ Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Tested-by: Askar Safin <safinaskar@gmail.com> Cc: 6.9+ <stable@vger.kernel.org> # 6.9+ Link: https://patch.msgid.link/20251106045158.3198061-3-superm1@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2025-11-07 16:53:56 +01:00
Mario Limonciello (AMD)	62b9ca1706	PM: hibernate: Emit an error when image writing fails If image writing fails, a return code is passed up to the caller, but none of the callers log anything to the log and so the only record of it is the return code that userspace gets. Adjust the logging so that the image size and speed of writing is only emitted on success and if there is an error, it's saved to the logs. Fixes: `a06c6f5d3c` ("PM: hibernate: Move to crypto APIs for LZO compression") Reported-by: Askar Safin <safinaskar@gmail.com> Closes: https://lore.kernel.org/linux-pm/20251105180506.137448-1-safinaskar@gmail.com/ Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Tested-by: Askar Safin <safinaskar@gmail.com> Cc: 6.9+ <stable@vger.kernel.org> # 6.9+ [ rjw: Added missing braces after "else", changelog edits ] Link: https://patch.msgid.link/20251106045158.3198061-2-superm1@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2025-11-07 16:53:56 +01:00
Zilin Guan	80f0d631dc	tracing: Fix memory leaks in create_field_var() The function create_field_var() allocates memory for 'val' through create_hist_field() inside parse_atom(), and for 'var' through create_var(), which in turn allocates var->type and var->var.name internally. Simply calling kfree() to release these structures will result in memory leaks. Use destroy_hist_field() to properly free 'val', and explicitly release the memory of var->type and var->var.name before freeing 'var' itself. Link: https://patch.msgid.link/20251106120132.3639920-1-zilin@seu.edu.cn Fixes: `02205a6752` ("tracing: Add support for 'field variables'") Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-11-06 19:51:33 -05:00
Steven Rostedt	aa997d2d2a	ring-buffer: Do not warn in ring_buffer_map_get_reader() when reader catches up The function ring_buffer_map_get_reader() is a bit more strict than the other get reader functions, and except for certain situations the rb_get_reader_page() should not return NULL. If it does, it triggers a warning. This warning was triggering but after looking at why, it was because another acceptable situation was happening and it wasn't checked for. If the reader catches up to the writer and there's still data to be read on the reader page, then the rb_get_reader_page() will return NULL as there's no new page to get. In this situation, the reader page should not be updated and no warning should trigger. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Vincent Donnefort <vdonnefort@google.com> Reported-by: syzbot+92a3745cea5ec6360309@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/690babec.050a0220.baf87.0064.GAE@google.com/ Link: https://lore.kernel.org/20251016132848.1b11bb37@gandalf.local.home Fixes: `117c39200d` ("ring-buffer: Introducing ring-buffer mapping functions") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-11-06 19:38:54 -05:00
Masami Hiramatsu (Google)	c91afa7610	tracing: tprobe-events: Fix to put tracepoint_user when disable the tprobe __unregister_trace_fprobe() checks tf->tuser to put it when removing tprobe. However, disable_trace_fprobe() does not use it and only calls unregister_fprobe(). Thus it forgets to disable tracepoint_user. If the trace_fprobe has tuser, put it for unregistering the tracepoint callbacks when disabling tprobe correctly. Link: https://lore.kernel.org/all/176244794466.155515.3971904050506100243.stgit@devnote2/ Fixes: `2867495dea` ("tracing: tprobe-events: Register tracepoint when enable tprobe event") Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Tested-by: Beau Belgrave <beaub@linux.microsoft.com> Reviewed-by: Beau Belgrave <beaub@linux.microsoft.com>	2025-11-07 07:36:20 +09:00
Masami Hiramatsu (Google)	10d9dda426	tracing: tprobe-events: Fix to register tracepoint correctly Since __tracepoint_user_init() calls tracepoint_user_register() without initializing tuser->tpoint with given tracpoint, it does not register tracepoint stub function as callback correctly, and tprobe does not work. Initializing tuser->tpoint correctly before tracepoint_user_register() so that it sets up tracepoint callback. I confirmed below example works fine again. echo "t sched_switch preempt prev_pid=prev->pid next_pid=next->pid" > /sys/kernel/tracing/dynamic_events echo 1 > /sys/kernel/tracing/events/tracepoints/sched_switch/enable cat /sys/kernel/tracing/trace_pipe Link: https://lore.kernel.org/all/176244793514.155515.6466348656998627773.stgit@devnote2/ Fixes: `2867495dea` ("tracing: tprobe-events: Register tracepoint when enable tprobe event") Reported-by: Beau Belgrave <beaub@linux.microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Tested-by: Beau Belgrave <beaub@linux.microsoft.com> Reviewed-by: Beau Belgrave <beaub@linux.microsoft.com>	2025-11-07 07:32:55 +09:00
Peter Zijlstra	4cb5ac2626	futex: Optimize per-cpu reference counting Shrikanth noted that the per-cpu reference counter was still some 10% slower than the old immutable option (which removes the reference counting entirely). Further optimize the per-cpu reference counter by: - switching from RCU to preempt; - using __this_cpu_() since we now have preempt disabled; - switching from smp_load_acquire() to READ_ONCE(). This is all safe because disabling preemption inhibits the RCU grace period exactly like rcu_read_lock(). Having preemption disabled allows using __this_cpu_() provided the only access to the variable is in task context -- which is the case here. Furthermore, since we know changing fph->state to FR_ATOMIC demands a full RCU grace period we can rely on the implied smp_mb() from that to replace the acquire barrier(). This is very similar to the percpu_down_read_internal() fast-path. The reason this is significant for PowerPC is that it uses the generic this_cpu_() implementation which relies on local_irq_disable() (the x86 implementation relies on it being a single memop instruction to be IRQ-safe). Switching to preempt_disable() and __this_cpu() avoids this IRQ state swizzling. Also, PowerPC needs LWSYNC for the ACQUIRE barrier, not having to use explicit barriers safes a bunch. Combined this reduces the performance gap by half, down to some 5%. Fixes: `760e6f7bef` ("futex: Remove support for IMMUTABLE") Reported-by: Shrikanth Hegde <sshegde@linux.ibm.com> Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20251106092929.GR4067720@noisy.programming.kicks-ass.net	2025-11-06 12:30:54 +01:00
Aaron Lu	956dfda6a7	sched/fair: Prevent cfs_rq from being unthrottled with zero runtime_remaining When a cfs_rq is to be throttled, its limbo list should be empty and that's why there is a warn in tg_throttle_down() for non empty cfs_rq->throttled_limbo_list. When running a test with the following hierarchy: root / \ A* ... / \| \ ... B / \ C* where both A and C have quota settings, that warn on non empty limbo list is triggered for a cfs_rq of C, let's call it cfs_rq_c(and ignore the cpu part of the cfs_rq for the sake of simpler representation). Debug showed it happened like this: Task group C is created and quota is set, so in tg_set_cfs_bandwidth(), cfs_rq_c is initialized with runtime_enabled set, runtime_remaining equals to 0 and unthrottled. Before any tasks are enqueued to cfs_rq_c, multiple throttled tasks can migrate to cfs_rq_c (e.g., due to task group changes). When enqueue_task_fair(cfs_rq_c, throttled_task) is called and cfs_rq_c is in a throttled hierarchy (e.g., A is throttled), these throttled tasks are directly placed into cfs_rq_c's limbo list by enqueue_throttled_task(). Later, when A is unthrottled, tg_unthrottle_up(cfs_rq_c) enqueues these tasks. The first enqueue triggers check_enqueue_throttle(), and with zero runtime_remaining, cfs_rq_c can be throttled in throttle_cfs_rq() if it can't get more runtime and enters tg_throttle_down(), where the warning is hit due to remaining tasks in the limbo list. I think it's a chaos to trigger throttle on unthrottle path, the status of a being unthrottled cfs_rq can be in a mixed state in the end, so fix this by granting 1ns to cfs_rq in tg_set_cfs_bandwidth(). This ensures cfs_rq_c has a positive runtime_remaining when initialized as unthrottled and cannot enter tg_unthrottle_up() with zero runtime_remaining. Also, update outdated comments in tg_throttle_down() since unthrottle_cfs_rq() is no longer called with zero runtime_remaining. While at it, remove a redundant assignment to se in tg_throttle_down(). Fixes: `e1fad12dcb` ("sched/fair: Switch to task based throttle model") Reviewed-By: Benjamin Segall <bsegall@google.com> Suggested-by: Benjamin Segall <bsegall@google.com> Signed-off-by: Aaron Lu <ziqianlu@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com> Tested-by: Hao Jia <jiahao1@lixiang.com> Link: https://patch.msgid.link/20251030032755.560-1-ziqianlu@bytedance.com	2025-11-06 12:30:52 +01:00
Mykyta Yatsenko	137cc92ffe	bpf: add _impl suffix for bpf_stream_vprintk() kfunc Rename bpf_stream_vprintk() to bpf_stream_vprintk_impl(). This makes bpf_stream_vprintk() follow the already established "_impl" suffix-based naming convention for kfuncs with the bpf_prog_aux argument provided by the verifier implicitly. This convention will be taken advantage of with the upcoming KF_IMPLICIT_ARGS feature to preserve backwards compatibility to BPF programs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Link: https://lore.kernel.org/r/20251104-implv2-v3-2-4772b9ae0e06@meta.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-11-04 17:50:25 -08:00
Mykyta Yatsenko	ea0714d61d	bpf:add _impl suffix for bpf_task_work_schedule* kfuncs Rename: bpf_task_work_schedule_resume()->bpf_task_work_schedule_resume_impl() bpf_task_work_schedule_signal()->bpf_task_work_schedule_signal_impl() This aligns task work scheduling kfuncs with the established naming scheme for kfuncs with the bpf_prog_aux argument provided by the verifier implicitly. This convention will be taken advantage of with the upcoming KF_IMPLICIT_ARGS feature to preserve backwards compatibility to BPF programs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Link: https://lore.kernel.org/r/20251104-implv2-v3-1-4772b9ae0e06@meta.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-11-04 17:50:25 -08:00
Song Liu	3e9a18e1c3	ftrace: bpf: Fix IPMODIFY + DIRECT in modify_ftrace_direct() ftrace_hash_ipmodify_enable() checks IPMODIFY and DIRECT ftrace_ops on the same kernel function. When needed, ftrace_hash_ipmodify_enable() calls ops->ops_func() to prepare the direct ftrace (BPF trampoline) to share the same function as the IPMODIFY ftrace (livepatch). ftrace_hash_ipmodify_enable() is called in register_ftrace_direct() path, but not called in modify_ftrace_direct() path. As a result, the following operations will break livepatch: 1. Load livepatch to a kernel function; 2. Attach fentry program to the kernel function; 3. Attach fexit program to the kernel function. After 3, the kernel function being used will not be the livepatched version, but the original version. Fix this by adding __ftrace_hash_update_ipmodify() to __modify_ftrace_direct() and adjust some logic around the call. Signed-off-by: Song Liu <song@kernel.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251027175023.1521602-3-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-11-03 17:22:06 -08:00
Song Liu	56b3c85e15	ftrace: Fix BPF fexit with livepatch When livepatch is attached to the same function as bpf trampoline with a fexit program, bpf trampoline code calls register_ftrace_direct() twice. The first time will fail with -EAGAIN, and the second time it will succeed. This requires register_ftrace_direct() to unregister the address on the first attempt. Otherwise, the bpf trampoline cannot attach. Here is an easy way to reproduce this issue: insmod samples/livepatch/livepatch-sample.ko bpftrace -e 'fexit:cmdline_proc_show {}' ERROR: Unable to attach probe: fexit:vmlinux:cmdline_proc_show... Fix this by cleaning up the hash when register_ftrace_function_nolock hits errors. Also, move the code that resets ops->func and ops->trampoline to the error path of register_ftrace_direct(); and add a helper function reset_direct() in register_ftrace_direct() and unregister_ftrace_direct(). Fixes: `d05cb47066` ("ftrace: Fix modification of direct_function hash while in use") Cc: stable@vger.kernel.org # v6.6+ Reported-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com> Closes: https://lore.kernel.org/live-patching/c5058315a39d4615b333e485893345be@crowdstrike.com/ Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-and-tested-by: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com> Signed-off-by: Song Liu <song@kernel.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251027175023.1521602-2-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2025-11-03 17:22:06 -08:00
Dapeng Mi	eb3182ef04	perf/core: Fix system hang caused by cpu-clock usage cpu-clock usage by the async-profiler tool can trigger a system hang, which got bisected back to the following commit by Octavia Togami: `18dbcbfabf` ("perf: Fix the POLL_HUP delivery breakage") causes this issue The root cause of the hang is that cpu-clock is a special type of SW event which relies on hrtimers. The __perf_event_overflow() callback is invoked from the hrtimer handler for cpu-clock events, and __perf_event_overflow() tries to call cpu_clock_event_stop() to stop the event, which calls htimer_cancel() to cancel the hrtimer. But that's a recursion into the hrtimer code from a hrtimer handler, which (unsurprisingly) deadlocks. To fix this bug, use hrtimer_try_to_cancel() instead, and set the PERF_HES_STOPPED flag, which causes perf_swevent_hrtimer() to stop the event once it sees the PERF_HES_STOPPED flag. [ mingo: Fixed the comments and improved the changelog. ] Closes: https://lore.kernel.org/all/CAHPNGSQpXEopYreir+uDDEbtXTBvBvi8c6fYXJvceqtgTPao3Q@mail.gmail.com/ Fixes: `18dbcbfabf` ("perf: Fix the POLL_HUP delivery breakage") Reported-by: Octavia Togami <octavia.togami@gmail.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Octavia Togami <octavia.togami@gmail.com> Cc: stable@vger.kernel.org Link: https://github.com/lucko/spark/issues/530 Link: https://patch.msgid.link/20251015051828.12809-1-dapeng1.mi@linux.intel.com	2025-11-03 11:04:19 +01:00
Linus Torvalds	ba36dd5ee6	bpf-fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmkFVBIACgkQ6rmadz2v bTrWVg//chctHGZZcP2ZbLxDDwLOwfjjsUY2COaD9P3ZN8/vWX6GEbvElLulkLgD Hwv3pe2C6NzHN9QH37M+WtJmLE1vI5aRuMXzpBKhOtOFAE5BfHzXeON0M4pswZd1 jh7f4w7mBdW3MMoR2Dg/l+lbGxDKFfb9jfD1blm+uOuBodHdbIpa66Mscakannrx tNWoauPDcu7fu7b+KCItnICC+VewaoDmhr20Q8X/kwvqbNPZ98D/tzUw7YlngO1d p+K/oKVAfXbWbW79agNoqD+zVDKAos7dQgqCDY/cuZhJNzt4xBZfTkM62SXdHU7g aCXHg+qxoWMrYTWGGueAhwf4gB3YIe0atKxP9w5gbjtxbWa5Y6oTyIpgGKvO5SMj 7qsmg/m338kS4aKQVjr9D042W+qqxRjrn2eF/x4Sth1GXMJd1ny14NpoNGEk/xsU TZfBdFgNOYUa1jeK3N3oEDdlxx8ITA9gsNPzSy9O8Ke6WRHp5u9Ob/7UIJsiVYWw 6SPdIhagv719m93GvAC4Xe3BrRi1dmf5UX39oOqpnGKkg4lT/xNu4aYP89UbFVGW XgTPX+Cm7kRKb32Fv9GiLC0sTQEWVAiB0jVTGB9E8v15P7ybJ/9IrcRNcwcrKGNS ny+cn1SR+CmX6c8TdliSzLdtgGuPk3QrXkwWs4442IphtbPnhE4= =t7MS -----END PGP SIGNATURE----- Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: - Mark migrate_disable/enable() as always_inline to avoid issues with partial inlining (Yonghong Song) - Fix powerpc stack register definition in libbpf bpf_tracing.h (Andrii Nakryiko) - Reject negative head_room in __bpf_skb_change_head (Daniel Borkmann) - Conditionally include dynptr copy kfuncs (Malin Jonsson) - Sync pending IRQ work before freeing BPF ring buffer (Noorain Eqbal) - Do not audit capability check in x86 do_jit() (Ondrej Mosnacek) - Fix arm64 JIT of BPF_ST insn when it writes into arena memory (Puranjay Mohan) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf/arm64: Fix BPF_ST into arena memory bpf: Make migrate_disable always inline to avoid partial inlining bpf: Reject negative head_room in __bpf_skb_change_head bpf: Conditionally include dynptr copy kfuncs libbpf: Fix powerpc's stack register definition in bpf_tracing.h bpf: Do not audit capability check in do_jit() bpf: Sync pending IRQ work before freeing ring buffer	2025-10-31 18:22:26 -07:00
Linus Torvalds	a5dbbb39e1	Power management fixes for 6.18-rc4 - Add an exit latency check to the menu cpuidle governor in the case when it considers using a real idle state instead of a polling one to address a performance regression (Rafael Wysocki) - Revert an attempted cleanup of a system suspend code path that introduced a regression elsewhere (Samuel Wu) - Allow pm_restrict_gfp_mask() to be called multiple times in a row and adjust pm_restore_gfp_mask() accordingly to avoid having to play nasty games with these calls during hibernation (Rafael Wysocki) -----BEGIN PGP SIGNATURE----- iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmkDvsQSHHJqd0Byand5 c29ja2kubmV0AAoJEO5fvZ0v1OO1ICQH/1haZ2p7muRXiyP/rmyt41wQm/VZPMFV JBguNH0nd9r9szVntE0ic+lfv/nT5q0C26/tDHCnDPrtqT4aEj8uQgecfb71r1Sn 4cp4Y3BDp/9v6K2AAdo/FBYBMG63qlKlMaSXG2hewH3MreaP1V86AmELhGz6jvSV hFzJCi/bPoS7ot2mmKCE3MSGU9XEI1Hce4YAGfI3j/6RK9UD921g9gZWuEgqsQbB QHM3Wqp348sPW0JgDNYtFv6X6N3+JmSyO0oYLSSTbYKVNVkd7o/3zxEx5pnnLDjD l7y9pUbJ155fBkCbLIrU0NUwCo4PZfMs6KS3L/0Cu3Dp4JPa1F/BsN4= =p+g/ -----END PGP SIGNATURE----- Merge tag 'pm-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix three regressions, two recent ones and one introduced during the 6.17 development cycle: - Add an exit latency check to the menu cpuidle governor in the case when it considers using a real idle state instead of a polling one to address a performance regression (Rafael Wysocki) - Revert an attempted cleanup of a system suspend code path that introduced a regression elsewhere (Samuel Wu) - Allow pm_restrict_gfp_mask() to be called multiple times in a row and adjust pm_restore_gfp_mask() accordingly to avoid having to play nasty games with these calls during hibernation (Rafael Wysocki)" * tag 'pm-6.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: sleep: Allow pm_restrict_gfp_mask() stacking cpuidle: governors: menu: Select polling state in some more cases Revert "PM: sleep: Make pm_wakeup_clear() call more clear"	2025-10-30 19:02:16 -07:00
Rafael J. Wysocki	590c5cd106	Merge branches 'pm-cpuidle' and 'pm-sleep' Merge a cpuidle fix and two fixes related to system sleep for 6.18-rc4: - Add an exit latency check to the menu cpuidle governor in the case when it considers using a real idle state instead of a polling one to address a performance regression (Rafael Wysocki) - Revert an attempted cleanup of a system suspend code path that introduced a regression elsewhere (Samuel Wu) - Allow pm_restrict_gfp_mask() to be called multiple times in a row and adjust pm_restore_gfp_mask() accordingly to avoid having to play nasty games with these calls during hibernation (Rafael Wysocki) * pm-cpuidle: cpuidle: governors: menu: Select polling state in some more cases * pm-sleep: PM: sleep: Allow pm_restrict_gfp_mask() stacking Revert "PM: sleep: Make pm_wakeup_clear() call more clear"	2025-10-30 20:25:18 +01:00
Rafael J. Wysocki	35e4a69b20	PM: sleep: Allow pm_restrict_gfp_mask() stacking Allow pm_restrict_gfp_mask() to be called many times in a row to avoid issues with calling dpm_suspend_start() when the GFP mask has been already restricted. Only the first invocation of pm_restrict_gfp_mask() will actually restrict the GFP mask and the subsequent calls will warn if there is a mismatch between the expected allowed GFP mask and the actual one. Moreover, if pm_restrict_gfp_mask() is called many times in a row, pm_restore_gfp_mask() needs to be called matching number of times in a row to actually restore the GFP mask. Calling it when the GFP mask has not been restricted will cause it to warn. This is necessary for the GFP mask restriction starting in hibernation_snapshot() to continue throughout the entire hibernation flow until it completes or it is aborted (either by a wakeup event or by an error). Fixes: `449c9c0253` ("PM: hibernate: Restrict GFP mask in hibernation_snapshot()") Fixes: `469d80a371` ("PM: hibernate: Fix hybrid-sleep") Reported-by: Askar Safin <safinaskar@gmail.com> Closes: https://lore.kernel.org/linux-pm/20251025050812.421905-1-safinaskar@gmail.com/ Link: https://lore.kernel.org/linux-pm/20251028111730.2261404-1-safinaskar@gmail.com/ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Tested-by: Mario Limonciello (AMD) <superm1@kernel.org> Cc: 6.16+ <stable@vger.kernel.org> # 6.16+ Link: https://patch.msgid.link/5935682.DvuYhMxLoT@rafael.j.wysocki	2025-10-29 18:55:32 +01:00
Andrea Righi	f4fa7c25f6	sched_ext: Fix use of uninitialized variable in scx_bpf_cpuperf_set() scx_bpf_cpuperf_set() has a typo where it dereferences the local variable @sch, instead of the global @scx_root pointer. Fix by dereferencing the correct variable. Fixes: `956f2b11a8` ("sched_ext: Drop kf_cpu_valid()") Signed-off-by: Andrea Righi <arighi@nvidia.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2025-10-29 05:14:39 -10:00
Linus Torvalds	fd57572253	sched_ext: Fixes for v6.18-rc3 - Fix scx_kick_pseqs corruption when multiple schedulers are loaded concurrently - Allocate scx_kick_cpus_pnt_seqs lazily using kvzalloc() to handle systems with large CPU counts - Defer queue_balance_callback() until after ops.dispatch to fix callback ordering issues - Sync error_irq_work before freeing scx_sched to prevent use-after-free - Mark scx_bpf_dsq_move_set_[slice\|vtime]() with KF_RCU for proper RCU protection - Fix flag check for deferred callbacks -----BEGIN PGP SIGNATURE----- iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCaP+iWg4cdGpAa2VybmVs Lm9yZwAKCRCxYfJx3gVYGZ+MAQCVyGxKbEvCRqPwEkwxRdTTBBkHlxEzgeFAK5GN UrQ6mwEAq7cdmdjpPZ22iHEeyRfr2EZZww4oAlX9JpU0Pipj4QM= =MbR0 -----END PGP SIGNATURE----- Merge tag 'sched_ext-for-6.18-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - Fix scx_kick_pseqs corruption when multiple schedulers are loaded concurrently - Allocate scx_kick_cpus_pnt_seqs lazily using kvzalloc() to handle systems with large CPU counts - Defer queue_balance_callback() until after ops.dispatch to fix callback ordering issues - Sync error_irq_work before freeing scx_sched to prevent use-after-free - Mark scx_bpf_dsq_move_set_[slice\|vtime]() with KF_RCU for proper RCU protection - Fix flag check for deferred callbacks * tag 'sched_ext-for-6.18-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: fix flag check for deferred callbacks sched_ext: Fix scx_kick_pseqs corruption on concurrent scheduler loads sched_ext: Allocate scx_kick_cpus_pnt_seqs lazily using kvzalloc() sched_ext: defer queue_balance_callback() until after ops.dispatch sched_ext: Sync error_irq_work before freeing scx_sched sched_ext: Mark scx_bpf_dsq_move_set_[slice\|vtime]() with KF_RCU	2025-10-27 10:52:18 -07:00
Linus Torvalds	5fee0dafba	- Restore the original buslock locking in a couple of places in the irq core subsystem after a rework -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmj+EwoACgkQEsHwGGHe VUqm6xAAuPDn4E0wuxgD5l6gXYDWXx7xoHEDT0KuL2J9OsfbWoHl8OwObBRmD7ls au/SuuJUSs3NEntQwLfTklyi7UignkTzcyOYLqb2fMYPFLk+nRXWSjvxsQMQV/u3 wwSXyK1YaZ4qaEKqIAPm5Uvs4E1DQFu6zzBdjVTKB+w1n0Lh9P4xBdDaHgwc/dV/ 8jKt39JsInLzCy+8aDLeabeU5X5qDscnbpJ3LEHf/6scMBCAvQbnfeICvDijzLgf FF4qw+O7qGzFQTKRB2B4pymoFhKGOnGR4jtygejjm3wDO/k2QKS3OwoJo8mzIM3S p/HimQ7Uy0KEU11Vo37ANdE8XErkeoj7meoBNGFiU4KZzRU99CnRz0EDap9RUvlx clat0CC/3NSGau2hcbYDrTSsjkoWVbEtQJ2XbvHavnE0MscHUMIf1vIQjWzvVG06 0u5R1OPD+0czeCIXKZQVDGyRcRmmAF1+na3AuBUDq1h0i+KT4V/Y1vX64IFkDdd3 NaMk6GVmQu3bDpJ4LBpdhVl7cGV50kAbGl77VHST4pERvWQ1EWwwutDp0CK96zo0 WQnQfjF4/5Ja9l5nCLK7kffQtjdFg/jY/wyixwASEWJDM81T+fZSf32VGkP6Wf0N tQYfjKOEj1l/ilRRarSxW8opazhZuN7t7k5e8IxPYP9LmbgAbnQ= =oIwu -----END PGP SIGNATURE----- Merge tag 'irq_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Restore the original buslock locking in a couple of places in the irq core subsystem after a rework * tag 'irq_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/manage: Add buslock back in to enable_irq() genirq/manage: Add buslock back in to __disable_irq_nosync() genirq/chip: Add buslock back in to irq_set_handler()	2025-10-26 09:54:36 -07:00
Linus Torvalds	1bc9743b64	- Make sure a CFS runqueue on a throttled hierarchy has its PELT clock throttled otherwise task movement and manipulation would lead to dangling cfs_rq references and an eventual crash -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmj+Cf0ACgkQEsHwGGHe VUqkvA/8D1ItoOslMeTpD6YtcaNN9oxzQ7Zow1QaWaPqirUsc+2l/zZ/3R5s0Zlt 9n0mUNdZ6EC03ZGPwYCNVLk2PvTywmMdwXOypya303PXLez2bPigekJIyXJeW5FV YuJWTJBQWtZwiFf2ekP1OmHRceOA4KuBIwmWvfW4YwdXlUGfDLn+X6a4z8GsH/z+ ss8iUTfbEraBoFFaF16xq1zxrvRDw5vZpX2HkcHADiTVdkHcuXrf+33AeW/URWKz FrwimiW+HJdue9trFNwLKUggHCPDoUpHLPA/kmWFiGCZWRXBPpmZ56NGRgfoadGa 4/Hb9ASMjMFl8Y9gnkOqLyomhQ8vJ8LkNqDChiJ5AiQQFYRekrPuZw+zuCENtzVZ miAmp/kXCGSCWTMNZKlztxJGhmn/yiH+sVegmyHyDqGfqnuEBF3sebkf/DDkDAvu 88SG1YB8OlgmDIxShhfHQqw1nZa7BshLkViak6110n4fP6fbZrbY0MwBLHX2VVpQ jJeFuvQ2pZuEl1LKVDsy+ROIShkQITZ8IOeabnm6vAeHEpjomDvmlZOmc5f9NfHV wH6SmrHzSaEam70EJflzoglujYy+JMtVIUd7QC/jYXtPOYj1fcHPgwqlnv25uW9e 4IrwjFNwc2u0MAemKcqRO4DUEwAczD0y+dL/6eVKK8niVmat4f8= =8MbE -----END PGP SIGNATURE----- Merge tag 'sched_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Borislav Petkov: - Make sure a CFS runqueue on a throttled hierarchy has its PELT clock throttled otherwise task movement and manipulation would lead to dangling cfs_rq references and an eventual crash * tag 'sched_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Start a cfs_rq on throttled hierarchy with PELT clock throttled	2025-10-26 09:42:19 -07:00
Linus Torvalds	7ea5092f52	- Do not create more than eight (max supported) AUX clocks sysfs hierarchies -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmj+BzAACgkQEsHwGGHe VUrGkg//aG7SmeAZiKL1SCyEZgNitZRzuxD7lHS+gky7qsjZu8ieGwV4WvcPoG2B wQq7dfH7ZiafiOV/IjGaRwWDbg90934QBZNW9hswMyNsg5ci8vhDCRu6TSWhZN5O HSezJNElq9t3QK1+40qY8m5Zw7DlnJJOQhOyDlkc/tdzZjJs+KSDctEytQ1RjKn6 kR1x7GE6UUJpdKP0/fvlvJVALSrR7hyzqmv+G9HBLhuz8E/lj6IUhRpeouQ3u4tn 7TNYHi5/HEIUE/T87YEsIIYeAVe6hQ33M73rJ6UMYDhuwfLmePVYtm5gsPt8D6Zi 9z73CTZKhry1fpMR0X8pow+zHsyMyYtErX93mFmmvCiPtC8FlvUSLWTVx0n8itBQ jyGZOVPAJWiZ1FCuSaZeaBB3s4/AGDFAOOZIS+l0oRzZHE4xU23y1cgyNbf1pp6H 3i2UR0UZc8D5YVaT6LnmhVDosjZrV7V+GPCqcLNxb8QCqr+dkHjb8fv/G2b0RB34 YUg798hulYOYhU+mKr/qOJs9SPztx8VgmirURU/4wU8aM7vpPPdn5IsgeohzBmSA 2LLE3M2KFCtDem5O5I2HUrBsJBr85om+XMAP2O7ctoWzl3GC4j0Fqr4dK2M0UUfD sTMamGyci7QfiUvzfe/Qh+T1dp6b/4kmU/1rsoyA1fn8V2q+khA= =PFVB -----END PGP SIGNATURE----- Merge tag 'timers_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Borislav Petkov: - Do not create more than eight (max supported) AUX clocks sysfs hierarchies * tag 'timers_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timekeeping: Fix aux clocks sysfs initialization loop bound	2025-10-26 09:40:16 -07:00
Andy Shevchenko	53abe3e1c1	sched: Remove never used code in mm_cid_get() Clang is not happy with set but unused variable (this is visible with `make W=1` build: kernel/sched/sched.h:3744:18: error: variable 'cpumask' set but not used [-Werror,-Wunused-but-set-variable] It seems like the variable was never used along with the assignment that does not have side effects as far as I can see. Remove those altogether. Fixes: `223baf9d17` ("sched: Fix performance regression introduced by mm_cid") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2025-10-24 16:55:46 -07:00

1 2 3 4 5 ...

49613 commits