linux

mirror of https://github.com/torvalds/linux.git synced 2026-03-08 03:44:45 +01:00

Author	SHA1	Message	Date
Shengming Hu	cafe4074a7	watchdog/softlockup: fix sample ring index wrap in need_counting_irqs() cpustat_tail indexes cpustat_util[], which is a NUM_SAMPLE_PERIODS-sized ring buffer. need_counting_irqs() currently wraps the index using NUM_HARDIRQ_REPORT, which only happens to match NUM_SAMPLE_PERIODS. Use NUM_SAMPLE_PERIODS for the wrap to keep the ring math correct even if the NUM_HARDIRQ_REPORT or NUM_SAMPLE_PERIODS changes. Link: https://lkml.kernel.org/r/tencent_7068189CB6D6689EB353F3D17BF5A5311A07@qq.com Fixes: `e9a9292e23` ("watchdog/softlockup: Report the most frequent interrupts") Signed-off-by: Shengming Hu <hu.shengming@zte.com.cn> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Zhang Run <zhang.run@zte.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-08 00:13:34 -08:00
Alan Maguire	9dc052234d	kcsan, compiler_types: avoid duplicate type issues in BPF Type Format Enabling KCSAN is causing a large number of duplicate types in BTF for core kernel structs like task_struct [1]. This is due to the definition in include/linux/compiler_types.h `#ifdef __SANITIZE_THREAD__ ... `#define __data_racy volatile .. `#else ... `#define __data_racy ... `#endif Because some objects in the kernel are compiled without KCSAN flags (KCSAN_SANITIZE) we sometimes get the empty __data_racy annotation for objects; as a result we get multiple conflicting representations of the associated structs in DWARF, and these lead to multiple instances of core kernel types in BTF since they cannot be deduplicated due to the additional modifier in some instances. Moving the __data_racy definition under CONFIG_KCSAN avoids this problem, since the volatile modifier will be present for both KCSAN and KCSAN_SANITIZE objects in a CONFIG_KCSAN=y kernel. Link: https://lkml.kernel.org/r/20260116091730.324322-1-alan.maguire@oracle.com Fixes: `31f605a308` ("kcsan, compiler_types: Introduce __data_racy type qualifier") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Reported-by: Nilay Shroff <nilay@linux.ibm.com> Tested-by: Nilay Shroff <nilay@linux.ibm.com> Suggested-by: Marco Elver <elver@google.com> Reviewed-by: Marco Elver <elver@google.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Bart van Assche <bvanassche@acm.org> Cc: Daniel Borkman <daniel@iogearbox.net> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Hao Luo <haoluo@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Jason A. Donenfeld <jason@zx2c4.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Kees Cook <kees@kernel.org> Cc: KP Singh <kpsingh@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Naman Jain <namjain@linux.microsoft.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: "Paul E . McKenney" <paulmck@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislav Fomichev <sdf@fomichev.me> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-08 00:13:34 -08:00
Tycho Andersen (AMD)	0758293d5d	kho: fix doc for kho_restore_pages() This function returns NULL if kho_restore_page() returns NULL, which happens in a couple of corner cases. It never returns an error code. Link: https://lkml.kernel.org/r/20260123190506.1058669-1-tycho@kernel.org Signed-off-by: Tycho Andersen (AMD) <tycho@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-08 00:13:34 -08:00
Pasha Tatashin	f653ff7af9	tests/liveupdate: add in-kernel liveupdate test Introduce an in-kernel test module to validate the core logic of the Live Update Orchestrator's File-Lifecycle-Bound feature. This provides a low-level, controlled environment to test FLB registration and callback invocation without requiring userspace interaction or actual kexec reboots. The test is enabled by the CONFIG_LIVEUPDATE_TEST Kconfig option. Link: https://lkml.kernel.org/r/20251218155752.3045808-6-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Alexander Graf <graf@amazon.com> Cc: David Gow <davidgow@google.com> Cc: David Matlack <dmatlack@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <kees@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Cc: Tamir Duberstein <tamird@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-08 00:13:33 -08:00
Pasha Tatashin	cab056f2aa	liveupdate: luo_flb: introduce File-Lifecycle-Bound global state Introduce a mechanism for managing global kernel state whose lifecycle is tied to the preservation of one or more files. This is necessary for subsystems where multiple preserved file descriptors depend on a single, shared underlying resource. An example is HugeTLB, where multiple file descriptors such as memfd and guest_memfd may rely on the state of a single HugeTLB subsystem. Preserving this state for each individual file would be redundant and incorrect. The state should be preserved only once when the first file is preserved, and restored/finished only once the last file is handled. This patch introduces File-Lifecycle-Bound (FLB) objects to solve this problem. An FLB is a global, reference-counted object with a defined set of operations: - A file handler (struct liveupdate_file_handler) declares a dependency on one or more FLBs via a new registration function, liveupdate_register_flb(). - When the first file depending on an FLB is preserved, the FLB's .preserve() callback is invoked to save the shared global state. The reference count is then incremented for each subsequent file. - Conversely, when the last file is unpreserved (before reboot) or finished (after reboot), the FLB's .unpreserve() or .finish() callback is invoked to clean up the global resource. The implementation includes: - A new set of ABI definitions (luo_flb_ser, luo_flb_head_ser) and a corresponding FDT node (luo-flb) to serialize the state of all active FLBs and pass them via Kexec Handover. - Core logic in luo_flb.c to manage FLB registration, reference counting, and the invocation of lifecycle callbacks. - An API (liveupdate_flb_get/_incoming/_outgoing) for other kernel subsystems to safely access the live object managed by an FLB, both before and after the live update. This framework provides the necessary infrastructure for more complex subsystems like IOMMU, VFIO, and KVM to integrate with the Live Update Orchestrator. Link: https://lkml.kernel.org/r/20251218155752.3045808-5-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Alexander Graf <graf@amazon.com> Cc: David Gow <davidgow@google.com> Cc: David Matlack <dmatlack@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <kees@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Cc: Tamir Duberstein <tamird@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-08 00:13:33 -08:00
Pasha Tatashin	6845645eef	liveupdate: luo_file: Use private list Switch LUO to use the private list iterators. Link: https://lkml.kernel.org/r/20251218155752.3045808-4-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Alexander Graf <graf@amazon.com> Cc: David Gow <davidgow@google.com> Cc: David Matlack <dmatlack@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <kees@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Cc: Tamir Duberstein <tamird@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-08 00:13:33 -08:00
Pasha Tatashin	66bd8501ce	list: add kunit test for private list primitives Add a KUnit test suite for the new private list primitives. The test defines a struct with a __private list_head and exercises every macro defined in <linux/list_private.h>. This ensures that the macros correctly handle the ACCESS_PRIVATE() abstraction and compile without warnings when acting on private members, verifying that qualifiers are stripped and offsets are calculated correctly. Link: https://lkml.kernel.org/r/20251218155752.3045808-3-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: David Gow <davidgow@google.com> Cc: Alexander Graf <graf@amazon.com> Cc: David Matlack <dmatlack@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <kees@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Cc: Tamir Duberstein <tamird@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-08 00:13:32 -08:00
Pasha Tatashin	989b3c5af6	list: add primitives for private list manipulations Patch series "list private v2 & luo flb", v9. This series introduces two connected infrastructure improvements: a new API for handling private linked lists, and the "File-Lifecycle-Bound" (FLB) mechanism for the Live Update Orchestrator. 1. Private List Primitives (patches 1-3) Recently, Linux introduced the ability to mark structure members as __private and access them via ACCESS_PRIVATE(). This enforces better encapsulation by ensuring internal details are only accessible by the owning subsystem. However, struct list_head is frequently used as an internal linkage mechanism within these private sections. The standard macros in <linux/list.h> do not support ACCESS_PRIVATE() natively. Consequently, subsystems using private lists are forced to implement ad-hoc workarounds or local iterator macros. This series adds <linux/list_private.h>, providing a set of primitives identical to those in <linux/list.h> but designed for private list heads. It also includes a KUnit test suite to verify that the macros correctly handle pointer offsets and qualifiers. 2. This series adds FLB (patches 4-5) support to Live Update that also internally uses private lists. FLB allows global kernel state (such as IOMMU domains or HugeTLB state) to be preserved once, shared across multiple file descriptors, and restored when needed. This is necessary for subsystems where multiple preserved file descriptors depend on a single, shared underlying resource. Preserving this state for each individual file would be redundant and incorrect. FLB uses reference counting tied to the lifecycle of preserved files. The state is preserved when the first file depending on it is preserved, and restored or cleaned up only when the last file is handled. This patch (of 5): Linux recently added an ability to add private members to structs (i.e. __private) and access them via ACCESS_PRIVATE(). This ensures that those members are only accessible by the subsystem which owns the struct type, and not to the object owner. However, struct list_head often needs to be placed into the private section to be manipulated privately by the subsystem. Add macros to support private list manipulations in <linux/list_private.h>. [akpm@linux-foundation.org: fix kerneldoc] Link: https://lkml.kernel.org/r/20251218155752.3045808-1-pasha.tatashin@soleen.com Link: https://lkml.kernel.org/r/20251218155752.3045808-2-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Alexander Graf <graf@amazon.com> Cc: David Gow <davidgow@google.com> Cc: David Matlack <dmatlack@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <kees@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Cc: Tamir Duberstein <tamird@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-08 00:13:32 -08:00
Arnd Bergmann	90079798f1	delayacct: fix uapi timespec64 definition The custom definition of 'struct timespec64' is incompatible with both the kernel's internal definition and the glibc type, at least on big-endian targets that have the tv_nsec field in a different place, and the definition clashes with any userspace that also defines a timespec64 structure. Running the header check with -Wpadding enabled produces this output that warns about the incorrect padding: usr/include/linux/taskstats.h:25:1: error: padding struct size to alignment boundary with 4 bytes [-Werror=padded] Remove the hack and instead use the regular __kernel_timespec type that is meant to be used in uapi definitions. Link: https://lkml.kernel.org/r/20260202095906.1344100-1-arnd@kernel.org Fixes: 29b63f6eff0e ("delayacct: add timestamp of delay max") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Fan Yu <fan.yu9@zte.com.cn> Cc: Jonathan Corbet <corbet@lwn.net> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Jiang Kun <jiang.kun2@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-08 00:13:32 -08:00
Pnina Feder	2e171ab29f	panic: add panic_force_cpu= parameter to redirect panic to a specific CPU Some platforms require panic handling to execute on a specific CPU for crash dump to work reliably. This can be due to firmware limitations, interrupt routing constraints, or platform-specific requirements where only a single CPU is able to safely enter the crash kernel. Add the panic_force_cpu= kernel command-line parameter to redirect panic execution to a designated CPU. When the parameter is provided, the CPU that initially triggers panic forwards the panic context to the target CPU via IPI, which then proceeds with the normal panic and kexec flow. The IPI delivery is implemented as a weak function (panic_smp_redirect_cpu) so architectures with NMI support can override it for more reliable delivery. If the specified CPU is invalid, offline, or a panic is already in progress on another CPU, the redirection is skipped and panic continues on the current CPU. [pnina.feder@mobileye.com: fix unused variable warning] Link: https://lkml.kernel.org/r/20260126122618.2967950-1-pnina.feder@mobileye.com Link: https://lkml.kernel.org/r/20260122102457.1154599-1-pnina.feder@mobileye.com Signed-off-by: Pnina Feder <pnina.feder@mobileye.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Baoquan He <bhe@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-03 08:21:26 -08:00
Oleg Nesterov	f3951e93d4	netclassid: use thread_group_leader(p) in update_classid_task() Cleanup and preparation to simplify planned future changes. Link: https://lkml.kernel.org/r/aXY_4NSP094-Cf-2@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Christan König <christian.koenig@amd.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-03 08:21:26 -08:00
Oleg Nesterov	6fd390e2bc	RDMA/umem: don't abuse current->group_leader Cleanup and preparation to simplify the next changes. Use current->tgid instead of current->group_leader->pid. Link: https://lkml.kernel.org/r/aXY_2JIhCeGAYC0r@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Leon Romanovsky <leon@kernel.org> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Christan König <christian.koenig@amd.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-03 08:21:25 -08:00
Oleg Nesterov	05f8f36d0b	drm/pan*: don't abuse current->group_leader Cleanup and preparation to simplify the next changes. Use current->tgid instead of current->group_leader->pid. Link: https://lkml.kernel.org/r/aXY_0MrQBZWKbbmA@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Steven Price <steven.price@arm.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Christan König <christian.koenig@amd.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-03 08:21:25 -08:00
Oleg Nesterov	a87da7a9fa	drm/amd: kill the outdated "Only the pthreads threading model is supported" checks Nowadays task->group_leader->mm != task->mm is only possible if a) task is not a group leader and b) task->group_leader->mm == NULL because task->group_leader has already exited using sys_exit(). I don't think that drm/amd tries to detect/nack this case. Link: https://lkml.kernel.org/r/aXY_yLVHd63UlWtm@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Christan König <christian.koenig@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-03 08:21:25 -08:00
Oleg Nesterov	7d08e0916a	drm/amdgpu: don't abuse current->group_leader Cleanup and preparation to simplify the next changes. - Use current->tgid instead of current->group_leader->pid - Use get_task_pid(current, PIDTYPE_TGID) instead of get_task_pid(current->group_leader, PIDTYPE_PID) Link: https://lkml.kernel.org/r/aXY_wKewzV5lCa5I@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Christan König <christian.koenig@amd.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-03 08:21:25 -08:00
Oleg Nesterov	a170919d1b	android/binder: use same_thread_group(proc->tsk, current) in binder_mmap() With or without this change the checked condition can be falsely true if proc->tsk execs, but this is fine: binder_alloc_mmap_handler() checks vma->vm_mm == alloc->mm. Link: https://lkml.kernel.org/r/aXY_uPYyUg4rwNOg@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Christan König <christian.koenig@amd.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-03 08:21:25 -08:00
Oleg Nesterov	33caa19f4b	android/binder: don't abuse current->group_leader Patch series "don't abuse task_struct.group_leader", v2. This series removes the usage of ->group_leader when it is "obviously unnecessary". I am going to move ->group_leader from task_struct to signal_struct or at least add the new task_group_leader() helper. So I will send more tree-wide changes on top of this series. This patch (of 7): Cleanup and preparation to simplify the next changes. - Use current->tgid instead of current->group_leader->pid - Use the value returned by get_task_struct() to initialize proc->tsk Link: https://lkml.kernel.org/r/aXY_h8i78n6yD9JY@redhat.com Link: https://lkml.kernel.org/r/aXY_ryGDwdygl1Tv@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Christan König <christian.koenig@amd.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-03 08:21:25 -08:00
Evangelos Petrongonas	427b2535f5	kho: skip memoryless NUMA nodes when reserving scratch areas kho_reserve_scratch() iterates over all online NUMA nodes to allocate per-node scratch memory. On systems with memoryless NUMA nodes (nodes that have CPUs but no memory), memblock_alloc_range_nid() fails because there is no memory available on that node. This causes KHO initialization to fail and kho_enable to be set to false. Some ARM64 systems have NUMA topologies where certain nodes contain only CPUs without any associated memory. These configurations are valid and should not prevent KHO from functioning. Fix this by only counting nodes that have memory (N_MEMORY state) and skip memoryless nodes in the per-node scratch allocation loop. Link: https://lkml.kernel.org/r/20260120175913.34368-1-epetron@amazon.de Fixes: `3dc92c3114` ("kexec: add Kexec HandOver (KHO) generation helpers"). Signed-off-by: Evangelos Petrongonas <epetron@amazon.de> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Alexander Graf <graf@amazon.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:08 -08:00
Vasily Gorbik	96a54b8ffc	crash_dump: fix dm_crypt keys locking and ref leak crash_load_dm_crypt_keys() reads dm-crypt volume keys from the user keyring. It uses user_key_payload_locked() without holding key->sem, which makes lockdep complain when kexec_file_load() assembles the crash image: ============================= WARNING: suspicious RCU usage ----------------------------- ./include/keys/user-type.h:53 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 no locks held by kexec/4875. stack backtrace: Call Trace: <TASK> dump_stack_lvl+0x5d/0x80 lockdep_rcu_suspicious.cold+0x4e/0x96 crash_load_dm_crypt_keys+0x314/0x390 bzImage64_load+0x116/0x9a0 ? __lock_acquire+0x464/0x1ba0 __do_sys_kexec_file_load+0x26a/0x4f0 do_syscall_64+0xbd/0x430 entry_SYSCALL_64_after_hwframe+0x77/0x7f In addition, the key returned by request_key() is never key_put()'d, leaking a key reference on each load attempt. Take key->sem while copying the payload and drop the key reference afterwards. Link: https://lkml.kernel.org/r/patch.git-2d4d76083a5c.your-ad-here.call-01769426386-ext-2560@work.hours Fixes: `479e58549b` ("crash_dump: store dm crypt keys in kdump reserved memory") Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Cc: Baoquan He <bhe@redhat.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Dave Young <dyoung@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:08 -08:00
Mike Rapoport (Microsoft)	b50634c5e8	kho: cleanup error handling in kho_populate() * use dedicated labels for error handling instead of checking if a pointer is not null to decide if it should be unmapped * drop assignment of values to err that are only used to print a numeric error code, there are pr_warn()s for each failure already so printing a numeric error code in the next line does not add anything useful Link: https://lkml.kernel.org/r/20260122121757.575987-1-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Mike Rapoport <rppt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:08 -08:00
Ondrej Mosnacek	0895a000e4	ucount: check for CAP_SYS_RESOURCE using ns_capable_noaudit() The user.* sysctls implement the ctl_table_root::permissions hook and they override the file access mode based on the CAP_SYS_RESOURCE capability (at most rwx if capable, at most r-- if not). The capability is being checked unconditionally, so if an LSM denies the capability, an audit record may be logged even when access is in fact granted. Given the logic in the set_permissions() function in kernel/ucount.c and the unfortunate way the permission checking is implemented, it doesn't seem viable to avoid false positive denials by deferring the capability check. Thus, do the same as in net_ctl_permissions() (net/sysctl_net.c) - switch from ns_capable() to ns_capable_noaudit(), so that the check never logs an audit record. Link: https://lkml.kernel.org/r/20260122140745.239428-1-omosnace@redhat.com Fixes: `dbec28460a` ("userns: Add per user namespace sysctls.") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Reviewed-by: Paul Moore <paul@paul-moore.com> Acked-by: Serge Hallyn <serge@hallyn.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Alexey Gladkov <legion@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:08 -08:00
Ondrej Mosnacek	8924336531	ipc: don't audit capability check in ipc_permissions() The IPC sysctls implement the ctl_table_root::permissions hook and they override the file access mode based on the CAP_CHECKPOINT_RESTORE capability, which is being checked regardless of whether any access is actually denied or not, so if an LSM denies the capability, an audit record may be logged even when access is in fact granted. It wouldn't be viable to restructure the sysctl permission logic to only check the capability when the access would be actually denied if it's not granted. Thus, do the same as in net_ctl_permissions() (net/sysctl_net.c) - switch from ns_capable() to ns_capable_noaudit(), so that the check never emits an audit record. Link: https://lkml.kernel.org/r/20260122141303.241133-1-omosnace@redhat.com Fixes: `0889f44e28` ("ipc: Check permissions for checkpoint_restart sysctls at open time") Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Alexey Gladkov <legion@kernel.org> Acked-by: Serge Hallyn <serge@hallyn.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Paul Moore <paul@paul-moore.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:07 -08:00
Li Chen	480e1d5c64	kexec: derive purgatory entry from symbol kexec_load_purgatory() derives image->start by locating e_entry inside an SHF_EXECINSTR section. If the purgatory object contains multiple executable sections with overlapping sh_addr, the entrypoint check can match more than once and trigger a WARN. Derive the entry section from the purgatory_start symbol when present and compute image->start from its final placement. Keep the existing e_entry fallback for purgatories that do not expose the symbol. WARNING: kernel/kexec_file.c:1009 at kexec_load_purgatory+0x395/0x3c0, CPU#10: kexec/1784 Call Trace: <TASK> bzImage64_load+0x133/0xa00 __do_sys_kexec_file_load+0x2b3/0x5c0 do_syscall_64+0x81/0x610 entry_SYSCALL_64_after_hwframe+0x76/0x7e [me@linux.beauty: move helper to avoid forward declaration, per Baoquan] Link: https://lkml.kernel.org/r/20260128043511.316860-1-me@linux.beauty Link: https://lkml.kernel.org/r/20260120124005.148381-1-me@linux.beauty Fixes: `8652d44f46` ("kexec: support purgatories with .text.hot sections") Signed-off-by: Li Chen <me@linux.beauty> Acked-by: Baoquan He <bhe@redhat.com> Cc: Alexander Graf <graf@amazon.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Li Chen <me@linux.beauty> Cc: Philipp Rudo <prudo@redhat.com> Cc: Ricardo Ribalda Delgado <ribalda@chromium.org> Cc: Ross Zwisler <zwisler@google.com> Cc: Sourabh Jain <sourabhjain@linux.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:07 -08:00
Heming Zhao	5138c936c2	ocfs2: fix reflink preserve cleanup issue commit `c06c303832` ("ocfs2: fix xattr array entry __counted_by error") doesn't handle all cases and the cleanup job for preserved xattr entries still has bug: - the 'last' pointer should be shifted by one unit after cleanup an array entry. - current code logic doesn't cleanup the first entry when xh_count is 1. Note, commit `c06c303832` is also a bug fix for `0fe9b66c65`. Link: https://lkml.kernel.org/r/20251210015725.8409-2-heming.zhao@suse.com Fixes: `0fe9b66c65` ("ocfs2: Add preserve to reflink.") Signed-off-by: Heming Zhao <heming.zhao@suse.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:07 -08:00
Haoxiang Li	666183dcdd	rapidio: replace rio_free_net() with kfree() in rio_scan_alloc_net() When idtab allocation fails, net is not registered with rio_add_net() yet, so kfree(net) is sufficient to release the memory. Set mport->net to NULL to avoid dangling pointer. Link: https://lkml.kernel.org/r/20260121013508.195836-1-lihaoxiang@isrc.iscas.ac.cn Fixes: `e6b585ca6e` ("rapidio: move net allocation into core code") Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:07 -08:00
Wang Yaxin	503efe850c	delayacct: add timestamp of delay max Problem ======= Commit `658eb5ab91` ("delayacct: add delay max to record delay peak") introduced the delay max for getdelays, which records abnormal latency peaks and helps us understand the magnitude of such delays. However, the peak latency value alone is insufficient for effective root cause analysis. Without the precise timestamp of when the peak occurred, we still lack the critical context needed to correlate it with other system events. Solution ======== To address this, we need to additionally record a precise timestamp when the maximum latency occurs. By correlating this timestamp with system logs and monitoring metrics, we can identify processes with abnormal resource usage at the same moment, which can help us to pinpoint root causes. Use Case ======== bash-4.4# ./getdelays -d -t 227 print delayacct stats ON TGID 227 CPU count real total virtual total delay total delay average delay max delay min delay max timestamp 46 188000000 192348334 4098012 0.089ms 0.429260ms 0.051205ms 2026-01-15T15:06:58 IO count delay total delay average delay max delay min delay max timestamp 0 0 0.000ms 0.000000ms 0.000000ms N/A SWAP count delay total delay average delay max delay min delay max timestamp 0 0 0.000ms 0.000000ms 0.000000ms N/A RECLAIM count delay total delay average delay max delay min delay max timestamp 0 0 0.000ms 0.000000ms 0.000000ms N/A THRAS HING count delay total delay average delay max delay min delay max timestamp 0 0 0.000ms 0.000000ms 0.000000ms N/A COMPACT count delay total delay average delay max delay min delay max timestamp 0 0 0.000ms 0.000000ms 0.000000ms N/A WPCOPY count delay total delay average delay max delay min delay max timestamp 182 19413338 0.107ms 0.547353ms 0.022462ms 2026-01-15T15:05:24 IRQ count delay total delay average delay max delay min delay max timestamp 0 0 0.000ms 0.000000ms 0.000000ms N/A Link: https://lkml.kernel.org/r/20260119100241520gWubW8-5QfhSf9gjqcc_E@zte.com.cn Signed-off-by: Wang Yaxin <wang.yaxin@zte.com.cn> Cc: Fan Yu <fan.yu9@zte.com.cn> Cc: Jonathan Corbet <corbet@lwn.net> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:06 -08:00
Eric Dumazet	cc20650a09	scripts/bloat-o-meter: ignore __noinstr_text_start __noinstr_text_start is adding noise to the script, ignore it. For instance using __always_inline on __skb_incr_checksum_unnecessary and CC=clang build. Before this patch, __noinstr_text_start can show up and confuse us. $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new add/remove: 0/2 grow/shrink: 3/0 up/down: 212/-206 (6) Function old new delta tcp6_gro_complete 208 283 +75 tcp4_gro_complete 376 449 +73 __noinstr_text_start 3536 3600 +64 __pfx___skb_incr_checksum_unnecessary 32 - -32 __skb_incr_checksum_unnecessary 174 - -174 Total: Before=25509464, After=25509470, chg +0.00% After this patch we have a more precise result. $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new add/remove: 0/2 grow/shrink: 2/0 up/down: 148/-206 (-58) Function old new delta tcp6_gro_complete 208 283 +75 tcp4_gro_complete 376 449 +73 __pfx___skb_incr_checksum_unnecessary 32 - -32 __skb_incr_checksum_unnecessary 174 - -174 Total: Before=25505928, After=25505870, chg -0.00% Link: https://lkml.kernel.org/r/20260117083448.3877418-1-edumazet@google.com Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:06 -08:00
Yury Norov	bec261fec6	tracing: move tracing declarations from kernel.h to a dedicated header Tracing is a half of the kernel.h in terms of LOCs, although it's a self-consistent part. It is intended for quick debugging purposes and isn't used by the normal tracing utilities. Move it to a separate header. If someone needs to just throw a trace_printk() in their driver, they will not have to pull all the heavy tracing machinery. This is a pure move. Link: https://lkml.kernel.org/r/20260116042510.241009-7-ynorov@nvidia.com Signed-off-by: Yury Norov <ynorov@nvidia.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Cc: Aaron Tomlin <atomlin@atomlin.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:06 -08:00
Steven Rostedt	86e685ff36	tracing: remove size parameter in __trace_puts() The __trace_puts() function takes a string pointer and the size of the string itself. All users currently simply pass in the strlen() of the string it is also passing in. There's no reason to pass in the size. Instead have the __trace_puts() function do the strlen() within the function itself. This fixes a header recursion issue where using strlen() in the macro calling __trace_puts() requires adding #include <linux/string.h> in order to use strlen(). Removing the use of strlen() from the header fixes the recursion issue. Link: https://lore.kernel.org/all/aUN8Hm377C5A0ILX@yury/ Link: https://lkml.kernel.org/r/20260116042510.241009-6-ynorov@nvidia.com Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Yury Norov <ynorov@nvidia.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Cc: Aaron Tomlin <atomlin@atomlin.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:05 -08:00
Yury Norov	269586d689	kernel.h: include linux/instruction_pointer.h explicitly In preparation for decoupling linux/instruction_pointer.h and linux/kernel.h, include instruction_pointer.h explicitly where needed. Link: https://lkml.kernel.org/r/20260116042510.241009-5-ynorov@nvidia.com Signed-off-by: Yury Norov <ynorov@nvidia.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Cc: Aaron Tomlin <atomlin@atomlin.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:05 -08:00
Yury Norov	90ddd39b88	kernel.h: move VERIFY_OCTAL_PERMISSIONS() to sysfs.h The macro is related to sysfs, but is defined in kernel.h. Move it to the proper header, and unload the generic kernel.h. Now that the macro is removed from kernel.h, linux/moduleparam.h is decoupled, and kernel.h inclusion can be removed. Link: https://lkml.kernel.org/r/20260116042510.241009-4-ynorov@nvidia.com Signed-off-by: Yury Norov <ynorov@nvidia.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Cc: Aaron Tomlin <atomlin@atomlin.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:05 -08:00
Yury Norov	25b66674b1	moduleparam: include required headers explicitly The following patch drops moduleparam.h dependency on kernel.h. In preparation to it, list all the required headers explicitly. Link: https://lkml.kernel.org/r/20260116042510.241009-3-ynorov@nvidia.com Signed-off-by: Yury Norov <ynorov@nvidia.com> Suggested-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Cc: Aaron Tomlin <atomlin@atomlin.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:05 -08:00
Yury Norov	f2e0abdc88	kernel.h: drop STACK_MAGIC macro Patch series "Unload linux/kernel.h", v5. kernel.h hosts declarations that can be placed better. This series decouples kernel.h with some explicit and implicit dependencies; also, moves tracing functionality to a new independent header. This patch (of 6): The macro was introduced in 1994, v1.0.4, for stacks protection. Since that, people found better ways to protect stacks, and now the macro is only used by i915 selftests. Move it to a local header and drop from the kernel.h. Link: https://lkml.kernel.org/r/20260116042510.241009-1-ynorov@nvidia.com Link: https://lkml.kernel.org/r/20260116042510.241009-2-ynorov@nvidia.com Signed-off-by: Yury Norov <ynorov@nvidia.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Reviewed-by: Aaron Tomlin <atomlin@atomlin.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:04 -08:00
Nathan Chancellor	e8d899d301	compiler-clang.h: require LLVM 19.1.0 or higher for __typeof_unqual__ When building the kernel using a version of LLVM between llvmorg-19-init (the first commit of the LLVM 19 development cycle) and the change in LLVM that actually added __typeof_unqual__ for all C modes [1], which might happen during a bisect of LLVM, there is a build failure: In file included from arch/x86/kernel/asm-offsets.c:9: In file included from include/linux/crypto.h:15: In file included from include/linux/completion.h:12: In file included from include/linux/swait.h:7: In file included from include/linux/spinlock.h:56: In file included from include/linux/preempt.h:79: arch/x86/include/asm/preempt.h:61:2: error: call to undeclared function '__typeof_unqual__'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] 61 \| raw_cpu_and_4(__preempt_count, ~PREEMPT_NEED_RESCHED); \| ^ arch/x86/include/asm/percpu.h:478:36: note: expanded from macro 'raw_cpu_and_4' 478 \| #define raw_cpu_and_4(pcp, val) percpu_binary_op(4, , "and", (pcp), val) \| ^ arch/x86/include/asm/percpu.h:210:3: note: expanded from macro 'percpu_binary_op' 210 \| TYPEOF_UNQUAL(_var) pto_tmp__; \ \| ^ include/linux/compiler.h:248:29: note: expanded from macro 'TYPEOF_UNQUAL' 248 \| # define TYPEOF_UNQUAL(exp) __typeof_unqual__(exp) \| ^ The current logic of CC_HAS_TYPEOF_UNQUAL just checks for a major version of 19 but half of the 19 development cycle did not have support for __typeof_unqual__. Harden the logic of CC_HAS_TYPEOF_UNQUAL to avoid this error by only using __typeof_unqual__ with a released version of LLVM 19, which is greater than or equal to 19.1.0 with LLVM's versioning scheme that matches GCC's [2]. Link: `cc308f60d4` [1] Link: `4532617ae4` [2] Link: https://lkml.kernel.org/r/20260116-require-llvm-19-1-for-typeof_unqual-v1-1-3b9a4a4b212b@kernel.org Fixes: `ac053946f5` ("compiler.h: introduce TYPEOF_UNQUAL() macro") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Cc: Bill Wendling <morbo@google.com> Cc: Justin Stitt <justinstitt@google.com> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:04 -08:00
Pratyush Yadav	8f1081892d	kho: simplify page initialization in kho_restore_page() When restoring a page (from kho_restore_pages()) or folio (from kho_restore_folio()), KHO must initialize the struct page. The initialization differs slightly depending on if a folio is requested or a set of 0-order pages is requested. Conceptually, it is quite simple to understand. When restoring 0-order pages, each page gets a refcount of 1 and that's it. When restoring a folio, head page gets a refcount of 1 and tail pages get 0. kho_restore_page() tries to combine the two separate initialization flow into one piece of code. While it works fine, it is more complicated to read than it needs to be. Make the code simpler by splitting the two initalization paths into two separate functions. This improves readability by clearly showing how each type must be initialized. Link: https://lkml.kernel.org/r/20260116112217.915803-3-pratyush@kernel.org Signed-off-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Alexander Graf <graf@amazon.com> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:04 -08:00
Pratyush Yadav	840fe43d37	kho: use unsigned long for nr_pages Patch series "kho: clean up page initialization logic", v2. This series simplifies the page initialization logic in kho_restore_page(). It was originally only a single patch [0], but on Pasha's suggestion, I added another patch to use unsigned long for nr_pages. Technically speaking, the patches aren't related and can be applied independently, but bundling them together since patch 2 relies on 1 and it is easier to manage them this way. This patch (of 2): With 4k pages, a 32-bit nr_pages can span up to 16 TiB. While it is a lot, there exist systems with terabytes of RAM. gup is also moving to using long for nr_pages. Use unsigned long and make KHO future-proof. Link: https://lkml.kernel.org/r/20260116112217.915803-1-pratyush@kernel.org Link: https://lkml.kernel.org/r/20260116112217.915803-2-pratyush@kernel.org Signed-off-by: Pratyush Yadav <pratyush@kernel.org> Suggested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Alexander Graf <graf@amazon.com> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:04 -08:00
Joe Perches	931d5c36c7	checkpatch: add an invalid patch separator test Some versions of tools that apply patches incorrectly allow lines that start with 3 dashes and have additional content on the same line. Checkpatch will now emit an ERROR on these lines and optionally convert those lines from dashes to equals with --fix. Link: https://lkml.kernel.org/r/6ec1ed08328340db42655287afd5fa4067316b11.camel@perches.com Signed-off-by: Joe Perches <joe@perches.com> Suggested-by: Ian Rogers <irogers@google.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Dwaipayan Ray <dwaipayanray1@gmail.com> Cc: Kuan-Wei Chiu <visitorckw@gmail.com> Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com> Cc: Namhyung kim <namhyung@kernel.org> Cc: Stehen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:03 -08:00
Andrew Morton	2eec08ff09	Merge branch 'mm-hotfixes-stable' into mm-nonmm-stable to pick up changes required to merge "kho: use unsigned long for nr_pages".	2026-01-31 16:12:21 -08:00
Pratyush Yadav (Google)	6ca9de3600	kho: print which scratch buffer failed to be reserved When scratch area fails to reserve, KHO prints a message indicating that. But it doesn't say which scratch failed to allocate. This can be useful information for debugging. Even more so when the failure is hard to reproduce. Along with the current message, also print which exact scratch area failed to be reserved. Link: https://lkml.kernel.org/r/20260116165416.1262531-1-pratyush@kernel.org Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: David Matlack <dmatlack@google.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Samiullah Khawaja <skhawaja@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 19:07:15 -08:00
Randy Dunlap	08e8f1ef3d	kernel-chktaint: add reporting for tainted modules Check all loaded modules and report any that have their 'taint' flags set. The tainted module output format is: * <module_name> (<taint_flags>) Example output: Kernel is "tainted" for the following reasons: * externally-built ('out-of-tree') module was loaded (#12) * unsigned module was loaded (#13) Raw taint value as int/string: 12288/'G OE ' Tainted modules: * dump_test (OE) Link: https://lkml.kernel.org/r/20260115064756.531592-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Thorsten Leemhuis <linux@leemhuis.info> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 19:07:15 -08:00
Wangyang Guo	89802ca36c	lib/group_cpus: make group CPU cluster aware As CPU core counts increase, the number of NVMe IRQs may be smaller than the total number of CPUs. This forces multiple CPUs to share the same IRQ. If the IRQ affinity and the CPU's cluster do not align, a performance penalty can be observed on some platforms. This patch improves IRQ affinity by grouping CPUs by cluster within each NUMA domain, ensuring better locality between CPUs and their assigned NVMe IRQs. Details: Intel Xeon E platform packs 4 CPU cores as 1 module (cluster) and share the L2 cache. Let's say, if there are 40 CPUs in 1 NUMA domain and 11 IRQs to dispatch. The existing algorithm will map first 7 IRQs each with 4 CPUs and remained 4 IRQs each with 3 CPUs. The last 4 IRQs may have cross cluster issue. For example, the 9th IRQ which pinned to CPU32, then for CPU31, it will have cross L2 memory access. CPU \|28 29 30 31\|32 33 34 35\|36 ... -------- -------- -------- IRQ 8 9 10 If this patch applied, then first 2 IRQs each mapped with 2 CPUs and rest 9 IRQs each mapped with 4 CPUs, which avoids the cross cluster memory access. CPU \|00 01 02 03\|04 05 06 07\|08 09 10 11\| ... ----- ----- ----------- ----------- IRQ 1 2 3 4 As a result, 15%+ performance difference is observed in FIO libaio/randread/bs=8k. Changes since V1: - Add more performance details in commit messages. - Fix endless loop when topology_cluster_cpumask return invalid mask. History: v1: https://lore.kernel.org/all/20251024023038.872616-1-wangyang.guo@intel.com/ v1 [RESEND]: https://lore.kernel.org/all/20251111020608.1501543-1-wangyang.guo@intel.com/ Link: https://lkml.kernel.org/r/20260113022958.3379650-1-wangyang.guo@intel.com Signed-off-by: Wangyang Guo <wangyang.guo@intel.com> Reviewed-by: Tianyou Li <tianyou.li@intel.com> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Tested-by: Dan Liang <dan.liang@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@fb.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Ming Lei <ming.lei@redhat.com> Cc: Radu Rendec <rrendec@redhat.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 19:07:15 -08:00
Finn Thain	9a229ae249	atomic: add option for weaker alignment check Add a new Kconfig symbol to make CONFIG_DEBUG_ATOMIC more useful on those architectures which do not align dynamic allocations to 8-byte boundaries. Without this, CONFIG_DEBUG_ATOMIC produces excessive WARN splats. Link: https://lkml.kernel.org/r/6d25a12934fe9199332f4d65d17c17de450139a8.1768281748.git.fthain@linux-m68k.org Signed-off-by: Finn Thain <fthain@linux-m68k.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Daniel Borkman <daniel@iogearbox.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Gary Guo <gary@garyguo.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Hao Luo <haoluo@google.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonas Bonn <jonas@southpole.se> Cc: KP Singh <kpsingh@kernel.org> Cc: Marc Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rich Felker <dalias@libc.org> Cc: Sasha Levin (Microsoft) <sashal@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Stafford Horne <shorne@gmail.com> Cc: Stanislav Fomichev <sdf@fomichev.me> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yonghong.song@linux.dev> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 19:07:15 -08:00
Peter Zijlstra	80047d84ee	atomic: add alignment check to instrumented atomic operations Add a Kconfig option for debug builds which logs a warning when an instrumented atomic operation takes place that's misaligned. Some platforms don't trap for this. [fthain@linux-m68k.org: added __DISABLE_EXPORTS conditional and refactored as helper function] Link: https://lkml.kernel.org/r/51ebf844e006ca0de408f5d3a831e7b39d7fc31c.1768281748.git.fthain@linux-m68k.org Link: https://lore.kernel.org/lkml/20250901093600.GF4067720@noisy.programming.kicks-ass.net/ Link: https://lore.kernel.org/linux-next/df9fbd22-a648-ada4-fee0-68fe4325ff82@linux-m68k.org/ Signed-off-by: Finn Thain <fthain@linux-m68k.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Sasha Levin <sashal@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Daniel Borkman <daniel@iogearbox.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Gary Guo <gary@garyguo.net> Cc: Guo Ren <guoren@kernel.org> Cc: Hao Luo <haoluo@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonas Bonn <jonas@southpole.se> Cc: KP Singh <kpsingh@kernel.org> Cc: Marc Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Rich Felker <dalias@libc.org> Cc: Song Liu <song@kernel.org> Cc: Stafford Horne <shorne@gmail.com> Cc: Stanislav Fomichev <sdf@fomichev.me> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yonghong.song@linux.dev> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 19:07:14 -08:00
Finn Thain	e428b013d9	atomic: specify alignment for atomic_t and atomic64_t Some recent commits incorrectly assumed 4-byte alignment of locks. That assumption fails on Linux/m68k (and, interestingly, would have failed on Linux/cris also). The jump label implementation makes a similar alignment assumption. The expectation that atomic_t and atomic64_t variables will be naturally aligned seems reasonable, as indeed they are on 64-bit architectures. But atomic64_t isn't naturally aligned on csky, m68k, microblaze, nios2, openrisc and sh. Neither atomic_t nor atomic64_t are naturally aligned on m68k. This patch brings a little uniformity by specifying natural alignment for atomic types. One benefit is that atomic64_t variables do not get split across a page boundary. The cost is that some structs grow which leads to cache misses and wasted memory. See also, commit `bbf2a330d9` ("x86: atomic64: The atomic64_t data type should be 8 bytes aligned on 32-bit too"). Link: https://lkml.kernel.org/r/a76bc24a4e7c1d8112d7d5fa8d14e4b694a0e90c.1768281748.git.fthain@linux-m68k.org Link: https://lore.kernel.org/lkml/CAFr9PX=MYUDGJS2kAvPMkkfvH+0-SwQB_kxE4ea0J_wZ_pk=7w@mail.gmail.com Link: https://lore.kernel.org/lkml/CAMuHMdW7Ab13DdGs2acMQcix5ObJK0O2dG_Fxzr8_g58Rc1_0g@mail.gmail.com/ Signed-off-by: Finn Thain <fthain@linux-m68k.org> Acked-by: Guo Ren <guoren@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: Guo Ren <guoren@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Stafford Horne <shorne@gmail.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Daniel Borkman <daniel@iogearbox.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Gary Guo <gary@garyguo.net> Cc: Hao Luo <haoluo@google.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Marc Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin (Microsoft) <sashal@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@fomichev.me> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 19:07:14 -08:00
Finn Thain	3bb83c9109	bpf: explicitly align bpf_res_spin_lock Patch series "Align atomic storage", v7. This series adds the __aligned attribute to atomic_t and atomic64_t definitions in include/linux and include/asm-generic (respectively) to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh. This series also adds Kconfig options to enable a new run-time warning to help reveal misaligned atomic accesses on platforms which don't trap that. The performance impact is expected to vary across platforms and workloads. The measurements I made on m68k show that some workloads run faster and others slower. This patch (of 4): Align bpf_res_spin_lock to avoid a BUILD_BUG_ON() when the alignment changes, as it will do on m68k when, in a subsequent patch, the minimum alignment of the atomic_t member of struct rqspinlock gets increased from 2 to 4. Drop the BUILD_BUG_ON() as it becomes redundant. Link: https://lkml.kernel.org/r/cover.1768281748.git.fthain@linux-m68k.org Link: https://lkml.kernel.org/r/8a83876b07d1feacc024521e44059ae89abbb1ea.1768281748.git.fthain@linux-m68k.org Signed-off-by: Finn Thain <fthain@linux-m68k.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Daniel Borkman <daniel@iogearbox.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Gary Guo <gary@garyguo.net> Cc: Guo Ren <guoren@kernel.org> Cc: Hao Luo <haoluo@google.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonas Bonn <jonas@southpole.se> Cc: KP Singh <kpsingh@kernel.org> Cc: Marc Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rich Felker <dalias@libc.org> Cc: Sasha Levin (Microsoft) <sashal@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Stafford Horne <shorne@gmail.com> Cc: Stanislav Fomichev <sdf@fomichev.me> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yonghong.song@linux.dev> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 19:07:14 -08:00
Sun Jian	499f86de4f	init/main: read bootconfig header with get_unaligned_le32() get_boot_config_from_initrd() scans up to 3 bytes before initrd_end to handle GRUB 4-byte alignment. As a result, the bootconfig header immediately preceding the magic may be unaligned. Read the size and checksum fields with get_unaligned_le32() instead of casting to u32 * and using le32_to_cpu(), avoiding potential unaligned access and silencing sparse "cast to restricted __le32" warnings. Sparse warnings (gcc + C=1): init/main.c:292:16: warning: cast to restricted __le32 init/main.c:293:16: warning: cast to restricted __le32 No functional change intended. Link: https://lkml.kernel.org/r/20260113101532.1630770-1-sun.jian.kdev@gmail.com Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 19:07:14 -08:00
Lillian Berry	a906f3ae44	init/main.c: check if rdinit was explicitly set before printing warning The rdinit parameter is set by default, and attempted during boot even if not specified in the command line. Only print the warning about rdinit being inaccessible if the rdinit value was found in command line; it's just noise otherwise. [akpm@linux-foundation.org: move ramdisk_execute_command_set into __initdata] Link: https://lkml.kernel.org/r/20260111125635.53682-1-lillian@star-ark.net Signed-off-by: Lillian Berry <lillian@star-ark.net> Cc: Ahmad Fatoum <a.fatoum@pengutronix.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Douglas Anderson <dianders@chromium.org> Cc: Francesco Valla <francesco@valla.it> Cc: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Huan Yang <link@vivo.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: "Mike Rapoport (Microsoft)" <rppt@kernel.org> Cc: Sascha Hauer <kernel@pengutronix.de> Cc: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 19:07:14 -08:00
Maciej W. Rozycki	4cc67b0484	linux/log2.h: reduce instruction count for is_power_of_2() Follow an observation that (n ^ (n - 1)) will only ever retain the most significant bit set in the word operated on if that is the only bit set in the first place, and use it to determine whether a number is a whole power of 2, avoiding the need for an explicit check for nonzero. This reduces the sequence produced to 3 instructions only across Alpha, MIPS, and RISC-V targets, down from 4, 5, and 4 respectively, removing a branch in the two latter cases. And it's 5 instructions on POWER and x86-64 vs 8 and 9 respectively. There are no branches now emitted here for targets that have a suitable conditional set operation, although an inline expansion will often end with one, depending on what code a call to this function is used in. Credit goes to GCC authors for coming up with this optimisation used as the fallback for (__builtin_popcountl(n) == 1), equivalent to this code, for targets where the hardware population count operation is considered expensive. Link: https://lkml.kernel.org/r/alpine.DEB.2.21.2601111836250.30566@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Cc: Jens Axboe <axboe@kernel.dk> Cc: John Garry <john.g.garry@oracle.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Su Hui <suhui@nfschina.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 19:07:14 -08:00
Mathieu Desnoyers	5e65b5ca7d	tsacct: skip all kernel threads This patch is a preparation step for HPCC, for the OOM killer improvements. I suspect that this patch is useful on its own, because it really makes no sense to sum up accounting statistics of use_mm within kernel threads which are only temporarily using those mm. When we hit acct_account_cputime within a irq handler over a kthread that happens to use a userspace mm, we end up summing up the mm's RSS into the tsk acct_rss_mem1, which eventually decays. I don't see a good rationale behind tracking the mm's rss in that way when a kthread use a userspace mm temporarily through use_mm. It causes issues with init_mm and efi_mm which only partially initialize their mm_struct when introducing the new hierarchical percpu counters to replace RSS counters, which requires a pointer dereference when reading the approximate counter sum. The current percpu counters simply load a zeroed atomic counter, which happen to work. Skip all kernel threads in acct_account_cputime(), not just those that happen to have a NULL mm. This is a preparation step before introducing the hierarchical percpu counters. Link: https://lkml.kernel.org/r/20251224173810.648699-2-mathieu.desnoyers@efficios.com Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Brown <broonie@kernel.org> Cc: Aboorva Devarajan <aboorvad@linux.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Christan König <christian.koenig@amd.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "Liam R . Howlett" <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Martin Liu <liumartin@google.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: SeongJae Park <sj@kernel.org> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sweet Tea Dorminy <sweettea-kernel@dorminy.me> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 19:07:13 -08:00
Luck, Tony	e8eef69a99	once: don't use a work queue to reset sleepable static key Pointless overhead to use a work queue to reset the static key for a DO_ONCE_SLEEPABLE() invocation. Note that the previous code path included a BUG_ON() if the static key was already disabled. Dropped that as part of this change because: 1) Use of BUG_ON() is highly discouraged. 2) There is a WARN_ON() in the static_branch_disable() code path that would provide adequate breadcrumbs to debug any issue. Link: https://lkml.kernel.org/r/aWU4tfTju1l3oZCu@agluck-desk3 Signed-off-by: Tony Luck <tony.luck@intel.com> Reported-by: Reinette Chatre <reinette.chatre@intel.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-26 19:07:13 -08:00

1 2 3 4 5 ...

1413685 commits