-----BEGIN PGP SIGNATURE-----
iJEEABYKADkWIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCaav3pBsUgAAAAAAEAA5t
YW51MiwyLjUrMS4xMSwyLDIACgkQgFxhu0/YY75xWwD+NO/7WX01zcYSFMHTjHRx
okbOkBwFzcZK+p/L4iTtVv0BAIYPUpa+RBLR2RtYN7mQEw8KO5yVgiLP2nlQYwIf
wZcH
=DrUe
-----END PGP SIGNATURE-----
Merge tag 'for-linus-7.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- a cleanup of arch/x86/kernel/head_64.S removing the pre-built page
tables for Xen guests
- a small comment update
- another cleanup for Xen PVH guests mode
- fix an issue with Xen PV-devices backed by driver domains
* tag 'for-linus-7.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/xenbus: better handle backend crash
xenbus: add xenbus_device parameter to xenbus_read_driver_state()
x86/PVH: Use boot params to pass RSDP address in start_info page
x86/xen: update outdated comment
xen/acpi-processor: fix _CST detection using undersized evaluation buffer
x86/xen: Build identity mapping page tables dynamically for XENPV
After commit e6e094e053 ("x86/acpi, x86/boot: Take RSDP address from
boot params if available"), the RSDP address can be passed in boot
params. Therefore, store the RSDP address in start_info page into boot
params in the PVH entry instead of registering a different callback.
This removes an absolute reference during the PVH entry and is more
standardized.
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <76675c4d49d3a8f72252076812ef8f22276230c2.1772282441.git.houwenlong.hwl@antgroup.com>
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmmLFI0ACgkQEsHwGGHe
VUpuwQ/9Gm7faW+oz0yNAm0q/GQw/hVPAPQgAN5UmpbsaW0CYI6EzB/b8ZWO4HXy
zbsKntVoEEV54KBQ9a4gouWzP6id9//NAzElY/ZBo6brNWSRXbN/7M28RczArIzE
Nm5MrHTbEU552HRQ29x5wEl3CzIANzWuij/Xt+lNBt5kXNHgRcOZ35icw692d1va
+7eNgtP/AOGjFpBX/OuBK2HvTxA02tpZPRLSgursWq8rNFIpN5quOmCbNWPJKH4I
+5tDXv3yfMiJ4BPCMLoQWYm6w1avwJI72aCM0V066Q+ZGi5XHfcg5GeOq3NGdE5c
/D4eoNIRqLrxC9V93d8CJV4BnmPuYsTra9B8HsWp50ip2jTVwborAoONkxLBat0r
hoFZ+3IrXU6QEEkVFDeJWv5gb7FK/krJV5y04tQV/byD03QP5HEJeVcewsBUluN3
OdTbin/GQbJKc+kwYlV9iBHeY93zmlHiHNWLfTfkU2Urp6oWQpYk4srhI12CdTFy
upQJ8o6atnONqHWiyIipCNxHHmAikaOiRUpXAgnqTTk8jZZOMs8sU/X1/85fQfg2
cM72/Qb6iGGdyB33AkmmG6aIvmsmoLhMnz2HV7S8E+L8A7EO3/nBkuLpLqH/KNwi
gNo323xqaTfFm9fi13Up2/epPApIicxEjZWD0eTbNA31ft7sMzM=
=oa27
-----END PGP SIGNATURE-----
Merge tag 'x86_cleanups_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Borislav Petkov:
- The usual set of cleanups and simplifications all over the tree
* tag 'x86_cleanups_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/segment: Use MOVL when reading segment registers
selftests/x86: Clean up sysret_rip coding style
x86/mm: Hide mm_free_global_asid() definition under CONFIG_BROADCAST_TLB_FLUSH
x86/crash: Use set_memory_p() instead of __set_memory_prot()
x86/CPU/AMD: Simplify the spectral chicken fix
x86/platform/olpc: Replace strcpy() with strscpy() in xo15_sci_add()
x86/split_lock: Remove dead string when split_lock_detect=fatal
- x86/acpi: Add acpi=spcr to use SPCR-provided default console
(Shenghao Yang)
- x86/acpi/boot: Correct the acpi_is_processor_usable() check again
(Yazen Ghannam)
- Refresh the x86 memory map (e820 table) handling code, and
make the printouts a bit more informative. (Ingo Molnar)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmmJk1wRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1gcvA/9E00IuhZS6RxIPhqPFVhUvYlOEMnBBfnb
Sv0de0etGb2Q5GRL5j+XqMbTE2mns1we8Jh7dyY9N4K1eRlhXJZQWEFjqsFHX5/A
VxSc3/QrfzzpAxYDAg0hOSjMeJv3W4pdLjfwDpNLa/EZ+KA5tVcwu6ufd9ZFXO3a
DWlk0H3JbRuiObFlKqSpccaf9/taRz9LZ9DNqTT0K9zDN4yukWKYXxjhwxASLZ2C
i5cAzgJqvxiK4xtx5WbCs5b0swLzDi/4V7L1D9ojhckoCkrlLUjo7eAvSiS2kYik
NcA7LHDehMmP6mKOQUq9YJhR/P9FfmeItd0VgdAmVTuMdsl0LQDCl5v8XkWoBbXG
yYabeJl3eBrd5OVqsMK/vrnRkxMuBnTHdysyUpJrtCXHrMbQnpxerMIgqwhDH5rx
Z3zN4MRDf2buDJZDQT53ItKjaCGuUAEu869vo12eWhbA+OeL83G6+cs8kPUXquIq
p3NfPTLWj3U7qcwN35x8WTT1PWBrOtiUdjm27lLFUFsrY4xqXrgcHoXkoAp3tqQX
eVoxjROVGsrmFdzVr6c9D/1ZyVDwsetKQdixHrLWiAlm/S0lwDYee180uB+ocYLH
ML4QD7rvb6npAHcxUtvftpDEBO7TRnxZWlSgS70ZOYUnnVxLBhFrhygLMJ1giEew
aAodneB2Os8=
=brCM
-----END PGP SIGNATURE-----
Merge tag 'x86-boot-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/boot updates from Ingo Molnar:
- x86/acpi: Add acpi=spcr to use SPCR-provided default console
(Shenghao Yang)
- x86/acpi/boot: Correct the acpi_is_processor_usable() check again
(Yazen Ghannam)
- Refresh the x86 memory map (e820 table) handling code, and make the
printouts a bit more informative (Ingo Molnar)
* tag 'x86-boot-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
x86/acpi: Add acpi=spcr to use SPCR-provided default console
x86/boot/e820: Use <linux/sizes.h> symbols for literals
x86/boot/e820: Make sure e820_search_gap() finds all gaps
x86/boot/e820: Simplify the e820__range_remove() API
x86/boot/e820: Remove e820__range_remove()'s unused return parameter
x86/boot/e820: Simplify append_e820_table() and remove restriction on single-entry tables
x86/boot/e820: Standardize __init/__initdata tag placement
x86/boot/e820: Simplify & clarify __e820__range_add() a bit
x86/boot/e820: Rename gap_start/gap_size to max_gap_start/max_gap_start in e820_search_gap() et al
x86/boot/e820: Change e820_search_gap() to search for the highest-address PCI gap
x86/boot/e820: Clean up e820__setup_pci_gap()/e820_search_gap() a bit
x86/boot/e820: Change struct e820_table::nr_entries type from __u32 to u32
x86/boot/e820: Standardize e820 table index variable types under 'u32'
x86/boot/e820: Standardize e820 table index variable names under 'idx'
x86/boot/e820: Remove unnecessary header inclusions
x86/boot/e820: Clean up __refdata use a bit
x86/boot/e820: Clean up __e820__range_add() a bit
x86/boot/e820: Improve e820_print_type() messages
x86/boot/e820: Clean up confusing and self-contradictory verbiage around E820 related resource allocations
x86/boot/e820: Remove pointless early_panic() indirection
...
The PVH entry is available for 32-bit KVM guests, and 32-bit KVM guests
do not depend on CONFIG_X86_PAE. However, mk_early_pgtbl_32() builds
different pagetables depending on whether CONFIG_X86_PAE is set.
Therefore, enabling PAE mode for 32-bit KVM guests without
CONFIG_X86_PAE being set would result in a boot failure during CR3
loading.
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <d09ce9a134eb9cbc16928a5b316969f8ba606b81.1768017442.git.houwenlong.hwl@antgroup.com>
Right now e820__range_remove() has two parameters to control the
E820 type of the range removed:
extern void e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type);
Since E820 types start at 1, zero has a natural meaning of 'no type.
Consolidate the (old_type,check_type) parameters into a single (filter_type)
parameter:
extern void e820__range_remove(u64 start, u64 size, enum e820_type filter_type);
Note that both e820__mapped_raw_any() and e820__mapped_any()
already have such semantics for their 'type' parameter, although
it's currently not used with '0' by in-kernel code.
Also, the __e820__mapped_all() internal helper already has such
semantics implemented as well, and the e820__get_entry_type() API
uses the '0' type to such effect.
This simplifies not just e820__range_remove(), and synchronizes its
use of type filters with other E820 API functions, but simplifies
usage sites as well, such as parse_memmap_one(), beyond the reduction
of the number of parameters:
- else if (from)
- e820__range_remove(start_at, mem_size, from, 1);
else
- e820__range_remove(start_at, mem_size, 0, 0);
+ e820__range_remove(start_at, mem_size, from);
The generated code gets smaller as well:
add/remove: 0/0 grow/shrink: 0/5 up/down: 0/-66 (-66)
Function old new delta
parse_memopt 112 107 -5
efi_init 1048 1039 -9
setup_arch 2719 2709 -10
e820__range_remove 283 273 -10
parse_memmap_opt 559 527 -32
Total: Before=22,675,600, After=22,675,534, chg -0.00%
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: H . Peter Anvin <hpa@zytor.com>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://patch.msgid.link/20250515120549.2820541-28-mingo@kernel.org
emulator in favor of int3_emulate_jcc() which is written in C
- Replace KVM fastops with C-based stubs which avoids problems with the
fastop infra related to latter not adhering to the C ABI due to their
special calling convention and, more importantly, bypassing compiler
control-flow integrity checking because they're written in asm
- Remove wrongly used static branches and other ugliness accumulated
over time in hyperv's hypercall implementation with a proper static
function call to the correct hypervisor call variant
- Add some fixes and modifications to allow running FRED-enabled kernels
in KVM even on non-FRED hardware
- Add kCFI improvements like validating indirect calls and prepare for
enabling kCFI with GCC. Add cmdline params documentation and other
code cleanups
- Use the single-byte 0xd6 insn as the official #UD single-byte
undefined opcode instruction as agreed upon by both x86 vendors
- Other smaller cleanups and touchups all over the place
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmjqXxkACgkQEsHwGGHe
VUq9QBAAsjaay99a1+Dc53xyP1/HzCUFZDOzEYhj9zF85I8/xA9vTXZr7Qg2m6os
+4EEmnlwU43AR5KgwGJcuszLF9qSqTMz5qkAdFpvnoQ1Hbc8b49A+3yo9/hM7NA2
gPGH0gVZVBcffoETiQ8tJN6C9H6Ec0nTZwKTbasWwxz5oUAw+ppjP+aF4rFQ2/5w
b1ofrcga5yucjvSlXjBOEwHvd21l7O9iMre1oGEn6b0E2LU8ldToRkJkVZIhkWeL
2Iq3gYtVNN4Ao06WbV/EfXAqg5HWXjcm5bLcUXDtSF+Blae+gWoCjrT7XQdQGyEq
J12l4FbIZk5Ha8eWAC425ye9i3Wwo+oie3Cc4SVCMdv5A+AmOF0ijAlo1hcxq0rX
eGNWm8BKJOJ9zz1kxLISO7CfjULKgpsXLabF5a19uwoCsQgj5YrhlJezaIKHXbnK
OWwHWg9IuRkN2KLmJa7pXtHkuAHp4MtEV9TP9kU2WCvCInrNrzp3gYtds3pri82c
8ove+WA3yb/AQ6RCq5vAMLYXBxMRbN7FrmY5ZuwgWJTMi6cp1Sp02mhobwJOgNhO
H7nKWCZnQMyCLPzVeg97HTSgqSXw13dSrujWX9gWYVWBMfZO1B9HcUrhtiOhH7Q9
cvELkcqaxKrCKdRHLLYgHeMIQU2tdpsQ5TXHm7C7liEcZPZpk+g=
=3Otb
-----END PGP SIGNATURE-----
Merge tag 'x86_core_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more x86 updates from Borislav Petkov:
- Remove a bunch of asm implementing condition flags testing in KVM's
emulator in favor of int3_emulate_jcc() which is written in C
- Replace KVM fastops with C-based stubs which avoids problems with the
fastop infra related to latter not adhering to the C ABI due to their
special calling convention and, more importantly, bypassing compiler
control-flow integrity checking because they're written in asm
- Remove wrongly used static branches and other ugliness accumulated
over time in hyperv's hypercall implementation with a proper static
function call to the correct hypervisor call variant
- Add some fixes and modifications to allow running FRED-enabled
kernels in KVM even on non-FRED hardware
- Add kCFI improvements like validating indirect calls and prepare for
enabling kCFI with GCC. Add cmdline params documentation and other
code cleanups
- Use the single-byte 0xd6 insn as the official #UD single-byte
undefined opcode instruction as agreed upon by both x86 vendors
- Other smaller cleanups and touchups all over the place
* tag 'x86_core_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86,retpoline: Optimize patch_retpoline()
x86,ibt: Use UDB instead of 0xEA
x86/cfi: Remove __noinitretpoline and __noretpoline
x86/cfi: Add "debug" option to "cfi=" bootparam
x86/cfi: Standardize on common "CFI:" prefix for CFI reports
x86/cfi: Document the "cfi=" bootparam options
x86/traps: Clarify KCFI instruction layout
compiler_types.h: Move __nocfi out of compiler-specific header
objtool: Validate kCFI calls
x86/fred: KVM: VMX: Always use FRED for IRQs when CONFIG_X86_FRED=y
x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware
x86/fred: Install system vector handlers even if FRED isn't fully enabled
x86/hyperv: Use direct call to hypercall-page
x86/hyperv: Clean up hv_do_hypercall()
KVM: x86: Remove fastops
KVM: x86: Convert em_salc() to C
KVM: x86: Introduce EM_ASM_3WCL
KVM: x86: Introduce EM_ASM_1SRC2
KVM: x86: Introduce EM_ASM_2CL
KVM: x86: Introduce EM_ASM_2W
...
- The 3 patch series "mm, swap: improve cluster scan strategy" from
Kairui Song improves performance and reduces the failure rate of swap
cluster allocation.
- The 4 patch series "support large align and nid in Rust allocators"
from Vitaly Wool permits Rust allocators to set NUMA node and large
alignment when perforning slub and vmalloc reallocs.
- The 2 patch series "mm/damon/vaddr: support stat-purpose DAMOS" from
Yueyang Pan extend DAMOS_STAT's handling of the DAMON operations sets
for virtual address spaces for ops-level DAMOS filters.
- The 3 patch series "execute PROCMAP_QUERY ioctl under per-vma lock"
from Suren Baghdasaryan reduces mmap_lock contention during reads of
/proc/pid/maps.
- The 2 patch series "mm/mincore: minor clean up for swap cache
checking" from Kairui Song performs some cleanup in the swap code.
- The 11 patch series "mm: vm_normal_page*() improvements" from David
Hildenbrand provides code cleanup in the pagemap code.
- The 5 patch series "add persistent huge zero folio support" from
Pankaj Raghav provides a block layer speedup by optionalls making the
huge_zero_pagepersistent, instead of releasing it when its refcount
falls to zero.
- The 3 patch series "kho: fixes and cleanups" from Mike Rapoport adds a
few touchups to the recently added Kexec Handover feature.
- The 10 patch series "mm: make mm->flags a bitmap and 64-bit on all
arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap. To
end the constant struggle with space shortage on 32-bit conflicting with
64-bit's needs.
- The 2 patch series "mm/swapfile.c and swap.h cleanup" from Chris Li
cleans up some swap code.
- The 7 patch series "selftests/mm: Fix false positives and skip
unsupported tests" from Donet Tom fixes a few things in our selftests
code.
- The 7 patch series "prctl: extend PR_SET_THP_DISABLE to only provide
THPs when advised" from David Hildenbrand "allows individual processes
to opt-out of THP=always into THP=madvise, without affecting other
workloads on the system".
It's a long story - the [1/N] changelog spells out the considerations.
- The 11 patch series "Add and use memdesc_flags_t" from Matthew Wilcox
gets us started on the memdesc project. Please see
https://kernelnewbies.org/MatthewWilcox/Memdescs and
https://blogs.oracle.com/linux/post/introducing-memdesc.
- The 3 patch series "Tiny optimization for large read operations" from
Chi Zhiling improves the efficiency of the pagecache read path.
- The 5 patch series "Better split_huge_page_test result check" from Zi
Yan improves our folio splitting selftest code.
- The 2 patch series "test that rmap behaves as expected" from Wei Yang
adds some rmap selftests.
- The 3 patch series "remove write_cache_pages()" from Christoph Hellwig
removes that function and converts its two remaining callers.
- The 2 patch series "selftests/mm: uffd-stress fixes" from Dev Jain
fixes some UFFD selftests issues.
- The 3 patch series "introduce kernel file mapped folios" from Boris
Burkov introduces the concept of "kernel file pages". Using these
permits btrfs to account its metadata pages to the root cgroup, rather
than to the cgroups of random inappropriate tasks.
- The 2 patch series "mm/pageblock: improve readability of some
pageblock handling" from Wei Yang provides some readability improvements
to the page allocator code.
- The 11 patch series "mm/damon: support ARM32 with LPAE" from SeongJae
Park teaches DAMON to understand arm32 highmem.
- The 4 patch series "tools: testing: Use existing atomic.h for
vma/maple tests" from Brendan Jackman performs some code cleanups and
deduplication under tools/testing/.
- The 2 patch series "maple_tree: Fix testing for 32bit compiles" from
Liam Howlett fixes a couple of 32-bit issues in
tools/testing/radix-tree.c.
- The 2 patch series "kasan: unify kasan_enabled() and remove
arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN
arch-specific initialization code into a common arch-neutral
implementation.
- The 3 patch series "mm: remove zpool" from Johannes Weiner removes
zspool - an indirection layer which now only redirects to a single thing
(zsmalloc).
- The 2 patch series "mm: task_stack: Stack handling cleanups" from
Pasha Tatashin makes a couple of cleanups in the fork code.
- The 37 patch series "mm: remove nth_page()" from David Hildenbrand
makes rather a lot of adjustments at various nth_page() callsites,
eventually permitting the removal of that undesirable helper function.
- The 2 patch series "introduce kasan.write_only option in hw-tags" from
Yeoreum Yun creates a KASAN read-only mode for ARM, using that
architecture's memory tagging feature. It is felt that a read-only mode
KASAN is suitable for use in production systems rather than debug-only.
- The 3 patch series "mm: hugetlb: cleanup hugetlb folio allocation"
from Kefeng Wang does some tidying in the hugetlb folio allocation code.
- The 12 patch series "mm: establish const-correctness for pointer
parameters" from Max Kellermann makes quite a number of the MM API
functions more accurate about the constness of their arguments. This
was getting in the way of subsystems (in this case CEPH) when they
attempt to improving their own const/non-const accuracy.
- The 7 patch series "Cleanup free_pages() misuse" from Vishal Moola
fixes a number of code sites which were confused over when to use
free_pages() vs __free_pages().
- The 3 patch series "Add Rust abstraction for Maple Trees" from Alice
Ryhl makes the mapletree code accessible to Rust. Required by nouveau
and by its forthcoming successor: the new Rust Nova driver.
- The 2 patch series "selftests/mm: split_huge_page_test:
split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and
some cleanups to the thp selftesting code.
- The 14 patch series "mm, swap: introduce swap table as swap cache
(phase I)" from Chris Li and Kairui Song is the first step along the
path to implementing "swap tables" - a new approach to swap allocation
and state tracking which is expected to yield speed and space
improvements. This patchset itself yields a 5-20% performance benefit
in some situations.
- The 3 patch series "Some ptdesc cleanups" from Matthew Wilcox utilizes
the new memdesc layer to clean up the ptdesc code a little.
- The 3 patch series "Fix va_high_addr_switch.sh test failure" from
Chunyu Hu fixes some issues in our 5-level pagetable selftesting code.
- The 2 patch series "Minor fixes for memory allocation profiling" from
Suren Baghdasaryan addresses a couple of minor issues in relatively new
memory allocation profiling feature.
- The 3 patch series "Small cleanups" from Matthew Wilcox has a few
cleanups in preparation for more memdesc work.
- The 2 patch series "mm/damon: add addr_unit for DAMON_LRU_SORT and
DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in
furtherance of supporting arm highmem.
- The 2 patch series "selftests/mm: Add -Wunreachable-code and fix
warnings" from Muhammad Anjum adds that compiler check to selftests code
and fixes the fallout, by removing dead code.
- The 10 patch series "Improvements to Victim Process Thawing and OOM
Reaper Traversal Order" from zhongjinji makes a number of improvements
in the OOM killer: mainly thawing a more appropriate group of victim
threads so they can release resources.
- The 5 patch series "mm/damon: misc fixups and improvements for 6.18"
from SeongJae Park is a bunch of small and unrelated fixups for DAMON.
- The 7 patch series "mm/damon: define and use DAMON initialization
check function" from SeongJae Park implement reliability and
maintainability improvements to a recently-added bug fix.
- The 2 patch series "mm/damon/stat: expose auto-tuned intervals and
non-idle ages" from SeongJae Park provides additional transparency to
userspace clients of the DAMON_STAT information.
- The 2 patch series "Expand scope of khugepaged anonymous collapse"
from Dev Jain removes some constraints on khubepaged's collapsing of
anon VMAs. It also increases the success rate of MADV_COLLAPSE against
an anon vma.
- The 2 patch series "mm: do not assume file == vma->vm_file in
compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards
removal of file_operations.mmap(). This patchset concentrates upon
clearing up the treatment of stacked filesystems.
- The 6 patch series "mm: Improve mlock tracking for large folios" from
Kiryl Shutsemau provides some fixes and improvements to mlock's tracking
of large folios. /proc/meminfo's "Mlocked" field became more accurate.
- The 2 patch series "mm/ksm: Fix incorrect accounting of KSM counters
during fork" from Donet Tom fixes several user-visible KSM stats
inaccuracies across forks and adds selftest code to verify these
counters.
- The 2 patch series "mm_slot: fix the usage of mm_slot_entry" from Wei
Yang addresses some potential but presently benign issues in KSM's
mm_slot handling.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaN3cywAKCRDdBJ7gKXxA
jtaPAQDmIuIu7+XnVUK5V11hsQ/5QtsUeLHV3OsAn4yW5/3dEQD/UddRU08ePN+1
2VRB0EwkLAdfMWW7TfiNZ+yhuoiL/AA=
=4mhY
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "mm, swap: improve cluster scan strategy" from Kairui Song improves
performance and reduces the failure rate of swap cluster allocation
- "support large align and nid in Rust allocators" from Vitaly Wool
permits Rust allocators to set NUMA node and large alignment when
perforning slub and vmalloc reallocs
- "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend
DAMOS_STAT's handling of the DAMON operations sets for virtual
address spaces for ops-level DAMOS filters
- "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren
Baghdasaryan reduces mmap_lock contention during reads of
/proc/pid/maps
- "mm/mincore: minor clean up for swap cache checking" from Kairui Song
performs some cleanup in the swap code
- "mm: vm_normal_page*() improvements" from David Hildenbrand provides
code cleanup in the pagemap code
- "add persistent huge zero folio support" from Pankaj Raghav provides
a block layer speedup by optionalls making the
huge_zero_pagepersistent, instead of releasing it when its refcount
falls to zero
- "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to
the recently added Kexec Handover feature
- "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo
Stoakes turns mm_struct.flags into a bitmap. To end the constant
struggle with space shortage on 32-bit conflicting with 64-bit's
needs
- "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap
code
- "selftests/mm: Fix false positives and skip unsupported tests" from
Donet Tom fixes a few things in our selftests code
- "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised"
from David Hildenbrand "allows individual processes to opt-out of
THP=always into THP=madvise, without affecting other workloads on the
system".
It's a long story - the [1/N] changelog spells out the considerations
- "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on
the memdesc project. Please see
https://kernelnewbies.org/MatthewWilcox/Memdescs and
https://blogs.oracle.com/linux/post/introducing-memdesc
- "Tiny optimization for large read operations" from Chi Zhiling
improves the efficiency of the pagecache read path
- "Better split_huge_page_test result check" from Zi Yan improves our
folio splitting selftest code
- "test that rmap behaves as expected" from Wei Yang adds some rmap
selftests
- "remove write_cache_pages()" from Christoph Hellwig removes that
function and converts its two remaining callers
- "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD
selftests issues
- "introduce kernel file mapped folios" from Boris Burkov introduces
the concept of "kernel file pages". Using these permits btrfs to
account its metadata pages to the root cgroup, rather than to the
cgroups of random inappropriate tasks
- "mm/pageblock: improve readability of some pageblock handling" from
Wei Yang provides some readability improvements to the page allocator
code
- "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON
to understand arm32 highmem
- "tools: testing: Use existing atomic.h for vma/maple tests" from
Brendan Jackman performs some code cleanups and deduplication under
tools/testing/
- "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes
a couple of 32-bit issues in tools/testing/radix-tree.c
- "kasan: unify kasan_enabled() and remove arch-specific
implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific
initialization code into a common arch-neutral implementation
- "mm: remove zpool" from Johannes Weiner removes zspool - an
indirection layer which now only redirects to a single thing
(zsmalloc)
- "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a
couple of cleanups in the fork code
- "mm: remove nth_page()" from David Hildenbrand makes rather a lot of
adjustments at various nth_page() callsites, eventually permitting
the removal of that undesirable helper function
- "introduce kasan.write_only option in hw-tags" from Yeoreum Yun
creates a KASAN read-only mode for ARM, using that architecture's
memory tagging feature. It is felt that a read-only mode KASAN is
suitable for use in production systems rather than debug-only
- "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does
some tidying in the hugetlb folio allocation code
- "mm: establish const-correctness for pointer parameters" from Max
Kellermann makes quite a number of the MM API functions more accurate
about the constness of their arguments. This was getting in the way
of subsystems (in this case CEPH) when they attempt to improving
their own const/non-const accuracy
- "Cleanup free_pages() misuse" from Vishal Moola fixes a number of
code sites which were confused over when to use free_pages() vs
__free_pages()
- "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the
mapletree code accessible to Rust. Required by nouveau and by its
forthcoming successor: the new Rust Nova driver
- "selftests/mm: split_huge_page_test: split_pte_mapped_thp
improvements" from David Hildenbrand adds a fix and some cleanups to
the thp selftesting code
- "mm, swap: introduce swap table as swap cache (phase I)" from Chris
Li and Kairui Song is the first step along the path to implementing
"swap tables" - a new approach to swap allocation and state tracking
which is expected to yield speed and space improvements. This
patchset itself yields a 5-20% performance benefit in some situations
- "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc
layer to clean up the ptdesc code a little
- "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some
issues in our 5-level pagetable selftesting code
- "Minor fixes for memory allocation profiling" from Suren Baghdasaryan
addresses a couple of minor issues in relatively new memory
allocation profiling feature
- "Small cleanups" from Matthew Wilcox has a few cleanups in
preparation for more memdesc work
- "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from
Quanmin Yan makes some changes to DAMON in furtherance of supporting
arm highmem
- "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad
Anjum adds that compiler check to selftests code and fixes the
fallout, by removing dead code
- "Improvements to Victim Process Thawing and OOM Reaper Traversal
Order" from zhongjinji makes a number of improvements in the OOM
killer: mainly thawing a more appropriate group of victim threads so
they can release resources
- "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park
is a bunch of small and unrelated fixups for DAMON
- "mm/damon: define and use DAMON initialization check function" from
SeongJae Park implement reliability and maintainability improvements
to a recently-added bug fix
- "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from
SeongJae Park provides additional transparency to userspace clients
of the DAMON_STAT information
- "Expand scope of khugepaged anonymous collapse" from Dev Jain removes
some constraints on khubepaged's collapsing of anon VMAs. It also
increases the success rate of MADV_COLLAPSE against an anon vma
- "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()"
from Lorenzo Stoakes moves us further towards removal of
file_operations.mmap(). This patchset concentrates upon clearing up
the treatment of stacked filesystems
- "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau
provides some fixes and improvements to mlock's tracking of large
folios. /proc/meminfo's "Mlocked" field became more accurate
- "mm/ksm: Fix incorrect accounting of KSM counters during fork" from
Donet Tom fixes several user-visible KSM stats inaccuracies across
forks and adds selftest code to verify these counters
- "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses
some potential but presently benign issues in KSM's mm_slot handling
* tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits)
mm: swap: check for stable address space before operating on the VMA
mm: convert folio_page() back to a macro
mm/khugepaged: use start_addr/addr for improved readability
hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
alloc_tag: fix boot failure due to NULL pointer dereference
mm: silence data-race in update_hiwater_rss
mm/memory-failure: don't select MEMORY_ISOLATION
mm/khugepaged: remove definition of struct khugepaged_mm_slot
mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL
hugetlb: increase number of reserving hugepages via cmdline
selftests/mm: add fork inheritance test for ksm_merging_pages counter
mm/ksm: fix incorrect KSM counter handling in mm_struct during fork
drivers/base/node: fix double free in register_one_node()
mm: remove PMD alignment constraint in execmem_vmalloc()
mm/memory_hotplug: fix typo 'esecially' -> 'especially'
mm/rmap: improve mlock tracking for large folios
mm/filemap: map entire large folio faultaround
mm/fault: try to map the entire file folio in finish_fault()
mm/rmap: mlock large folios in try_to_unmap_one()
mm/rmap: fix a mlock race condition in folio_referenced_one()
...
free_pages() should be used when we only have a virtual address. We
should call __free_pages() directly on our page instead.
Link: https://lkml.kernel.org/r/20250903185921.1785167-4-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Justin Sanders <justin@coraid.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Move startup code out of the __head section, now that this no longer has
a special significance. Move everything into .text or .init.text as
appropriate, so that startup code is not kept around unnecessarily.
[ bp: Fold in hunk to fix 32-bit CPU hotplug:
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202509022207.56fd97f4-lkp@intel.com ]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250828102202.1849035-45-ardb+git@google.com
Validate that all indirect calls adhere to kCFI rules. Notably doing
nocfi indirect call to a cfi function is broken.
Apparently some Rust 'core' code violates this and explodes when ran
with FineIBT.
All the ANNOTATE_NOCFI_SYM sites are prime targets for attackers.
- runtime EFI is especially henous because it also needs to disable
IBT. Basically calling unknown code without CFI protection at
runtime is a massice security issue.
- Kexec image handover; if you can exploit this, you get to keep it :-)
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Acked-by: Sean Christopherson <seanjc@google.com>
Link: https://lkml.kernel.org/r/20250714103441.496787279@infradead.org
calls properly with the goal of implementing EFI variable store in the
SVSM - a component which is trusted by the guest, vs in the firmware, which
is not
- Allow the kernel to handle #VC exceptions from EFI runtime services
properly when running as a SNP guest
- Rework and cleanup the SNP guest request issue glue code a bit
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmiH13gACgkQEsHwGGHe
VUr88A//fIbR7eaz7QRiHq32S57NOpyOAciYrGsBrWSo1BLSFcrelYG4RTnzaKzR
ACVr2yALoeoZooH7gtPgjt7554xWJHA9DR9Ln2YPGd5a2Np8fknY0Uu1MGFVIorC
4z2u1EATlsB0I/nCh/LboryVxFN4C+qRKRk7iJ7wibdJ15zguc0T/P5lU8gY1eB8
0NZ2e0T8QnpjIc8cx/XSYXDXIwvOJ5rX36Xm5/g6A/vPubLy1UO0hkBDGfVh+2WG
dt8T+szidtqru8RQ522jW/3R/ct8iZa0U8Cp9QDdwwcQC3jBvo/xyIv5K4ueDEEI
J0KfcIKn5zbDeQbBHMw5a9XvPshwHKQIUjY83JfSsviZ1yVseQEQHeJOE6mDn2Mj
QeCWuqtwMaEoElhNX5xhe9p60KID8VoBJqB+bb1bgbN8sPeYoHc8f9p13XJaU1Mo
hV0dwlpFwCaxCZgWdtxDVji9mmvzaUT4O1QEO88AdfhDNMa+b/T5L0dJb1gnZaUY
rQ6ePImHh9nXRtJncfK3UsGmSE6HPc4O7dyV83IAcniGTgQycIYlOIUzkUbpF+wJ
advBv5Zhx2xCOBDI1ucpHNWXCCe99YVE5GeaLjq6DMLgD0HdGnXqrCw4kOluZDBQ
Xoy07x1XANQVLk0xQ5Bf1MsOiztbCZG9Rvb2dN9lCA06W5v+0MA=
=KRgO
-----END PGP SIGNATURE-----
Merge tag 'x86_sev_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SEV updates from Borislav Petkov:
- Map the SNP calling area pages too so that OVMF EFI fw can issue SVSM
calls properly with the goal of implementing EFI variable store in
the SVSM - a component which is trusted by the guest, vs in the
firmware, which is not
- Allow the kernel to handle #VC exceptions from EFI runtime services
properly when running as a SNP guest
- Rework and cleanup the SNP guest request issue glue code a bit
* tag 'x86_sev_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sev: Let sev_es_efi_map_ghcbs() map the CA pages too
x86/sev/vc: Fix EFI runtime instruction emulation
x86/sev: Drop unnecessary parameter in snp_issue_guest_request()
x86/sev: Document requirement for linear mapping of guest request buffers
x86/sev: Allocate request in TSC_INFO_REQ on stack
virt: sev-guest: Contain snp_guest_request_ioctl in sev-guest
There is inconvenient for maintainers and maintainership to have
some quirks under architectural code. Move it to the specific quirk
file like other 8250-compatible drivers do.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20250627182743.1273326-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
OVMF EFI firmware needs access to the CA page to do SVSM protocol calls. For
example, when the SVSM implements an EFI variable store, such calls will be
necessary.
So add that to sev_es_efi_map_ghcbs() and also rename the function to reflect
the additional job it is doing now.
[ bp: Massage. ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250626114014.373748-4-kraxel@redhat.com
ce4100_mem_serial_in() unnecessarily contains 4 nested 'if's. That makes
the code hard to follow. Invert the conditions and return early if the
particular conditions do not hold.
And use "<<=" for shifting the offset in both of the hooks.
Signed-off-by: "Jiri Slaby (SUSE)" <jirislaby@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/20250623101246.486866-2-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This x86_32 driver remain unnoticed, so after the commit below, the
compilation now fails with:
arch/x86/platform/ce4100/ce4100.c:107:16: error: incompatible function pointer types assigning to '...' from '...'
To fix the error, convert also ce4100 to the new
uart_port::serial_{in,out}() types.
Signed-off-by: "Jiri Slaby (SUSE)" <jirislaby@kernel.org>
Fixes: fc9ceb501e ("serial: 8250: sanitize uart_port::serial_{in,out}() types")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506190552.TqNasrC3-lkp@intel.com/
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/20250623101246.486866-1-jirislaby@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Prepare to resolve conflicts with an upstream series of fixes that conflict
with pending x86 changes:
6f5bf947ba Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Prepare to resolve conflicts with an upstream series of fixes that conflict
with pending x86 changes:
6f5bf947ba Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Multiple testers reported the following new warning:
WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:795
Which corresponds to:
if (IS_ENABLED(CONFIG_DEBUG_VM) && WARN_ON_ONCE(prev != &init_mm &&
!cpumask_test_cpu(cpu, mm_cpumask(next))))
cpumask_set_cpu(cpu, mm_cpumask(next));
So the problem is that unuse_temporary_mm() explicitly clears
that bit; and it has to, because otherwise the flush_tlb_mm_range() in
__text_poke() will try sending IPIs, which are not at all needed.
See also:
https://lore.kernel.org/all/20241113095550.GBZzR3pg-RhJKPDazS@fat_crate.local/
Notably, the whole {,un}use_temporary_mm() thing requires preemption to
be disabled across it with the express purpose of keeping all TLB
nonsense CPU local, such that invalidations can also stay local etc.
However, as a side-effect, we violate this above WARN(), which sorta
makes sense for the normal case, but very much doesn't make sense here.
Change unuse_temporary_mm() to mark the mm_struct such that a further
exception (beyond init_mm) can be grafted, to keep the warning for all
the other cases.
Reported-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Reported-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@surriel.com>
Link: https://lore.kernel.org/r/20250430081154.GH4439@noisy.programming.kicks-ass.net
Recently _pgd_alloc() was switched from using __get_free_pages() to
pagetable_alloc_noprof(), which might return a compound page in case
the allocation order is larger than 0.
On x86 this will be the case if CONFIG_MITIGATION_PAGE_TABLE_ISOLATION
is set, even if PTI has been disabled at runtime.
When running as a Xen PV guest (this will always disable PTI), using
a compound page for a PGD will result in VM_BUG_ON_PGFLAGS being
triggered when the Xen code tries to pin the PGD.
Fix the Xen issue together with the not needed 8k allocation for a
PGD with PTI disabled by replacing PGD_ALLOCATION_ORDER with an
inline helper returning the needed order for PGD allocations.
Fixes: a9b3c355c2 ("asm-generic: pgalloc: provide generic __pgd_{alloc,free}")
Reported-by: Petr Vaněk <arkamar@atlas.cz>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Petr Vaněk <arkamar@atlas.cz>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250422131717.25724-1-jgross%40suse.com
Minimum version of binutils required to compile the kernel is 2.25.
This version correctly handles the "rep" prefixes, so it is possible
to remove the semicolon, which was used to support ancient versions
of GNU as.
Due to the semicolon, the compiler considers "rep; insn" (or its
alternate "rep\n\tinsn" form) as two separate instructions. Removing
the semicolon makes asm length calculations more accurate, consequently
making scheduling and inlining decisions of the compiler more accurate.
Removing the semicolon also enables assembler checks involving "rep"
prefixes. Trying to assemble e.g. "rep addl %eax, %ebx" results in:
Error: invalid instruction `add' after `rep'
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Pavel Machek <pavel@kernel.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20250418071437.4144391-2-ubizjak@gmail.com
This should be considerably more robust. It's also necessary for optimized
for_each_possible_lazymm_cpu() on x86 -- without this patch, EFI calls in
lazy context would remove the lazy mm from mm_cpumask().
[ mingo: Merged it on top of x86/alternatives ]
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/r/20250402094540.3586683-7-mingo@kernel.org
The last use of iosf_mbi_unregister_pmic_bus_access_notifier() was
removed in 2017 by:
a5266db4d3 ("drm/i915: Acquire PUNIT->PMIC bus for intel_uncore_forcewake_reset()")
Remove it.
(Note that the '_unlocked' version is still used.)
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Link: https://lore.kernel.org/r/20241225175010.91783-1-linux@treblig.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmfepu4RHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1gDlxAAjHZwjixviNtICUdOmvx1NwG4MX2w/Wcy
zF1PJWWkcxD7xlp6JXUgQFwJ6BLBj2UulJ3+5yrhv8InzAacfCdSgocrzmriYB9g
Zmw72nFXvj1aR70uS2Fs838ngHJ/SUDpHTRCOddl5mUEA6GxGzT71Zz1bykBRado
w0y0Nv3U0kuROGk/iQa6O9N3bFubQfP7KlfWEpEklrgP8NwFyZYsTX/b33N529TL
8SDJn/dOpBl5ZsHOSxYlZYtT/bpD+plWxZ0VpA92Drw3pZDgAnoWPVd4ADCJw1XX
GFVwWlEO3kuKEQ4F2AFJeZDqK2aHgeevSnDX6PA5f5bDhmU8T1/2/dwASfbCAEEa
1ZwHmuqYtxay6iISliTmIaxnC6a++m81tHQJVpnNcp3Ngl3EJ/c1Rve7W5YOz0E9
RMPJU3mqb+i2C9IUgCiW7f75W4fIyisOmURGhOKTVH3J3VeX/nu5Ggm5Egwf3Wjj
0Q+JTDRDkUa6hmgeBFm2T1ugCfYr0nYxicoEvNFrvm6cR8SdaKwYJXZLF4ZW7Ekn
EFnk5mTtCPe1VxtpRAf9Dt4rIqbzLcXsGiJaqb5ZWOpVMqNGmAX4Kct3VOLTEFMr
CKbso3jMt2iTnL8Z4THVbJDk6k7rJZ11Bg7xYZVHkZwaFAOZ5AHmj673aCgdwl6J
LM0sho6i45g=
=eh6a
-----END PGP SIGNATURE-----
Merge tag 'x86-platform-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Ingo Molnar:
"Two small cleanups in the x86 platform support code"
* tag 'x86-platform-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/platform/olpc: Remove unused variable 'len' in olpc_dt_compatible_match()
x86/platform/olpc-xo1-sci: Don't include <linux/pm_wakeup.h> directly
The following build warning highlights some unused code:
arch/x86/platform/olpc/olpc_dt.c: In function ‘olpc_dt_compatible_match’:
arch/x86/platform/olpc/olpc_dt.c:222:12: warning: variable ‘len’ set but not used [-Wunused-but-set-variable]
The compiler is right, the local variable 'len' is set but never used,
so remove it.
Fixes: a7a9bacb9a ("x86/platform/olpc: Use a correct version when making up a battery node")
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20241025074203.1921344-1-zengheng4@huawei.com
The header clearly states that it does not want to be included directly,
only via <linux/(platform_)?device.h>. Which is already present, so
delete the superfluous include.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250210113453.51825-2-wsa+renesas@sang-engineering.com
The percpu section is currently linked at absolute address 0, because
older compilers hard-coded the stack protector canary value at a fixed
offset from the start of the GS segment. Now that the canary is a
normal percpu variable, the percpu section does not need to be linked
at a specific address.
x86-64 will now calculate the percpu offsets as the delta between the
initial percpu address and the dynamically allocated memory, like other
architectures. Note that GSBASE is limited to the canonical address
width (48 or 57 bits, sign-extended). As long as the kernel text,
modules, and the dynamically allocated percpu memory are all in the
negative address space, the delta will not overflow this limit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250123190747.745588-9-brgerst@gmail.com
Instead of having a private area for the stack canary, use
fixed_percpu_data for GSBASE like the native kernel.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250123190747.745588-5-brgerst@gmail.com
indivudual patches which are described in their changelogs.
- "Allocate and free frozen pages" from Matthew Wilcox reorganizes the
page allocator so we end up with the ability to allocate and free
zero-refcount pages. So that callers (ie, slab) can avoid a refcount
inc & dec.
- "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to use
large folios other than PMD-sized ones.
- "Fix mm/rodata_test" from Petr Tesarik performs some maintenance and
fixes for this small built-in kernel selftest.
- "mas_anode_descend() related cleanup" from Wei Yang tidies up part of
the mapletree code.
- "mm: fix format issues and param types" from Keren Sun implements a
few minor code cleanups.
- "simplify split calculation" from Wei Yang provides a few fixes and a
test for the mapletree code.
- "mm/vma: make more mmap logic userland testable" from Lorenzo Stoakes
continues the work of moving vma-related code into the (relatively) new
mm/vma.c.
- "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David
Hildenbrand cleans up and rationalizes handling of gfp flags in the page
allocator.
- "readahead: Reintroduce fix for improper RA window sizing" from Jan
Kara is a second attempt at fixing a readahead window sizing issue. It
should reduce the amount of unnecessary reading.
- "synchronously scan and reclaim empty user PTE pages" from Qi Zheng
addresses an issue where "huge" amounts of pte pagetables are
accumulated
(https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/).
Qi's series addresses this windup by synchronously freeing PTE memory
within the context of madvise(MADV_DONTNEED).
- "selftest/mm: Remove warnings found by adding compiler flags" from
Muhammad Usama Anjum fixes some build warnings in the selftests code
when optional compiler warnings are enabled.
- "mm: don't use __GFP_HARDWALL when migrating remote pages" from David
Hildenbrand tightens the allocator's observance of __GFP_HARDWALL.
- "pkeys kselftests improvements" from Kevin Brodsky implements various
fixes and cleanups in the MM selftests code, mainly pertaining to the
pkeys tests.
- "mm/damon: add sample modules" from SeongJae Park enhances DAMON to
estimate application working set size.
- "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn
provides some cleanups to memcg's hugetlb charging logic.
- "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song
removes the global swap cgroup lock. A speedup of 10% for a tmpfs-based
kernel build was demonstrated.
- "zram: split page type read/write handling" from Sergey Senozhatsky
has several fixes and cleaups for zram in the area of zram_write_page().
A watchdog softlockup warning was eliminated.
- "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin Brodsky
cleans up the pagetable destructor implementations. A rare
use-after-free race is fixed.
- "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes
simplifies and cleans up the debugging code in the VMA merging logic.
- "Account page tables at all levels" from Kevin Brodsky cleans up and
regularizes the pagetable ctor/dtor handling. This results in
improvements in accounting accuracy.
- "mm/damon: replace most damon_callback usages in sysfs with new core
functions" from SeongJae Park cleans up and generalizes DAMON's sysfs
file interface logic.
- "mm/damon: enable page level properties based monitoring" from
SeongJae Park increases the amount of information which is presented in
response to DAMOS actions.
- "mm/damon: remove DAMON debugfs interface" from SeongJae Park removes
DAMON's long-deprecated debugfs interfaces. Thus the migration to sysfs
is completed.
- "mm/hugetlb: Refactor hugetlb allocation resv accounting" from Peter
Xu cleans up and generalizes the hugetlb reservation accounting.
- "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino
removes a never-used feature of the alloc_pages_bulk() interface.
- "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park
extends DAMOS filters to support not only exclusion (rejecting), but
also inclusion (allowing) behavior.
- "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi
"introduces a new memory descriptor for zswap.zpool that currently
overlaps with struct page for now. This is part of the effort to reduce
the size of struct page and to enable dynamic allocation of memory
descriptors."
- "mm, swap: rework of swap allocator locks" from Kairui Song redoes and
simplifies the swap allocator locking. A speedup of 400% was
demonstrated for one workload. As was a 35% reduction for kernel build
time with swap-on-zram.
- "mm: update mips to use do_mmap(), make mmap_region() internal" from
Lorenzo Stoakes reworks MIPS's use of mmap_region() so that
mmap_region() can be made MM-internal.
- "mm/mglru: performance optimizations" from Yu Zhao fixes a few MGLRU
regressions and otherwise improves MGLRU performance.
- "Docs/mm/damon: add tuning guide and misc updates" from SeongJae Park
updates DAMON documentation.
- "Cleanup for memfd_create()" from Isaac Manjarres does that thing.
- "mm: hugetlb+THP folio and migration cleanups" from David Hildenbrand
provides various cleanups in the areas of hugetlb folios, THP folios and
migration.
- "Uncached buffered IO" from Jens Axboe implements the new
RWF_DONTCACHE flag which provides synchronous dropbehind for pagecache
reading and writing. To permite userspace to address issues with
massive buildup of useless pagecache when reading/writing fast devices.
- "selftests/mm: virtual_address_range: Reduce memory" from Thomas
Weißschuh fixes and optimizes some of the MM selftests.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZ5a+cwAKCRDdBJ7gKXxA
jtoyAP9R58oaOKPJuTizEKKXvh/RpMyD6sYcz/uPpnf+cKTZxQEAqfVznfWlw/Lz
uC3KRZYhmd5YrxU4o+qjbzp9XWX/xAE=
=Ib2s
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"The various patchsets are summarized below. Plus of course many
indivudual patches which are described in their changelogs.
- "Allocate and free frozen pages" from Matthew Wilcox reorganizes
the page allocator so we end up with the ability to allocate and
free zero-refcount pages. So that callers (ie, slab) can avoid a
refcount inc & dec
- "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to
use large folios other than PMD-sized ones
- "Fix mm/rodata_test" from Petr Tesarik performs some maintenance
and fixes for this small built-in kernel selftest
- "mas_anode_descend() related cleanup" from Wei Yang tidies up part
of the mapletree code
- "mm: fix format issues and param types" from Keren Sun implements a
few minor code cleanups
- "simplify split calculation" from Wei Yang provides a few fixes and
a test for the mapletree code
- "mm/vma: make more mmap logic userland testable" from Lorenzo
Stoakes continues the work of moving vma-related code into the
(relatively) new mm/vma.c
- "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David
Hildenbrand cleans up and rationalizes handling of gfp flags in the
page allocator
- "readahead: Reintroduce fix for improper RA window sizing" from Jan
Kara is a second attempt at fixing a readahead window sizing issue.
It should reduce the amount of unnecessary reading
- "synchronously scan and reclaim empty user PTE pages" from Qi Zheng
addresses an issue where "huge" amounts of pte pagetables are
accumulated:
https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/
Qi's series addresses this windup by synchronously freeing PTE
memory within the context of madvise(MADV_DONTNEED)
- "selftest/mm: Remove warnings found by adding compiler flags" from
Muhammad Usama Anjum fixes some build warnings in the selftests
code when optional compiler warnings are enabled
- "mm: don't use __GFP_HARDWALL when migrating remote pages" from
David Hildenbrand tightens the allocator's observance of
__GFP_HARDWALL
- "pkeys kselftests improvements" from Kevin Brodsky implements
various fixes and cleanups in the MM selftests code, mainly
pertaining to the pkeys tests
- "mm/damon: add sample modules" from SeongJae Park enhances DAMON to
estimate application working set size
- "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn
provides some cleanups to memcg's hugetlb charging logic
- "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song
removes the global swap cgroup lock. A speedup of 10% for a
tmpfs-based kernel build was demonstrated
- "zram: split page type read/write handling" from Sergey Senozhatsky
has several fixes and cleaups for zram in the area of
zram_write_page(). A watchdog softlockup warning was eliminated
- "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin
Brodsky cleans up the pagetable destructor implementations. A rare
use-after-free race is fixed
- "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes
simplifies and cleans up the debugging code in the VMA merging
logic
- "Account page tables at all levels" from Kevin Brodsky cleans up
and regularizes the pagetable ctor/dtor handling. This results in
improvements in accounting accuracy
- "mm/damon: replace most damon_callback usages in sysfs with new
core functions" from SeongJae Park cleans up and generalizes
DAMON's sysfs file interface logic
- "mm/damon: enable page level properties based monitoring" from
SeongJae Park increases the amount of information which is
presented in response to DAMOS actions
- "mm/damon: remove DAMON debugfs interface" from SeongJae Park
removes DAMON's long-deprecated debugfs interfaces. Thus the
migration to sysfs is completed
- "mm/hugetlb: Refactor hugetlb allocation resv accounting" from
Peter Xu cleans up and generalizes the hugetlb reservation
accounting
- "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino
removes a never-used feature of the alloc_pages_bulk() interface
- "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park
extends DAMOS filters to support not only exclusion (rejecting),
but also inclusion (allowing) behavior
- "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi
introduces a new memory descriptor for zswap.zpool that currently
overlaps with struct page for now. This is part of the effort to
reduce the size of struct page and to enable dynamic allocation of
memory descriptors
- "mm, swap: rework of swap allocator locks" from Kairui Song redoes
and simplifies the swap allocator locking. A speedup of 400% was
demonstrated for one workload. As was a 35% reduction for kernel
build time with swap-on-zram
- "mm: update mips to use do_mmap(), make mmap_region() internal"
from Lorenzo Stoakes reworks MIPS's use of mmap_region() so that
mmap_region() can be made MM-internal
- "mm/mglru: performance optimizations" from Yu Zhao fixes a few
MGLRU regressions and otherwise improves MGLRU performance
- "Docs/mm/damon: add tuning guide and misc updates" from SeongJae
Park updates DAMON documentation
- "Cleanup for memfd_create()" from Isaac Manjarres does that thing
- "mm: hugetlb+THP folio and migration cleanups" from David
Hildenbrand provides various cleanups in the areas of hugetlb
folios, THP folios and migration
- "Uncached buffered IO" from Jens Axboe implements the new
RWF_DONTCACHE flag which provides synchronous dropbehind for
pagecache reading and writing. To permite userspace to address
issues with massive buildup of useless pagecache when
reading/writing fast devices
- "selftests/mm: virtual_address_range: Reduce memory" from Thomas
Weißschuh fixes and optimizes some of the MM selftests"
* tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
mm/compaction: fix UBSAN shift-out-of-bounds warning
s390/mm: add missing ctor/dtor on page table upgrade
kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags()
tools: add VM_WARN_ON_VMG definition
mm/damon/core: use str_high_low() helper in damos_wmark_wait_us()
seqlock: add missing parameter documentation for raw_seqcount_try_begin()
mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh
mm/page_alloc: remove the incorrect and misleading comment
zram: remove zcomp_stream_put() from write_incompressible_page()
mm: separate move/undo parts from migrate_pages_batch()
mm/kfence: use str_write_read() helper in get_access_type()
selftests/mm/mkdirty: fix memory leak in test_uffdio_copy()
kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags()
selftests/mm: virtual_address_range: avoid reading from VM_IO mappings
selftests/mm: vm_util: split up /proc/self/smaps parsing
selftests/mm: virtual_address_range: unmap chunks after validation
selftests/mm: virtual_address_range: mmap() without PROT_WRITE
selftests/memfd/memfd_test: fix possible NULL pointer dereference
mm: add FGP_DONTCACHE folio creation flag
mm: call filemap_fdatawrite_range_kick() after IOCB_DONTCACHE issue
...
Before SLUB initialization, various subsystems used memblock_alloc to
allocate memory. In most cases, when memory allocation fails, an
immediate panic is required. To simplify this behavior and reduce
repetitive checks, introduce `memblock_alloc_or_panic`. This function
ensures that memory allocation failures result in a panic automatically,
improving code readability and consistency across subsystems that require
this behavior.
[guoweikang.kernel@gmail.com: arch/s390: save_area_alloc default failure behavior changed to panic]
Link: https://lkml.kernel.org/r/20250109033136.2845676-1-guoweikang.kernel@gmail.com
Link: https://lore.kernel.org/lkml/Z2fknmnNtiZbCc7x@kernel.org/
Link: https://lkml.kernel.org/r/20250102072528.650926-1-guoweikang.kernel@gmail.com
Signed-off-by: Guo Weikang <guoweikang.kernel@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> [s390]
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
- Increase the headroom in the EFI memory map allocation created by the
EFI stub. This is needed because event callbacks called during
ExitBootServices() may cause fragmentation, and reallocation is not
allowed after that.
- Drop obsolete UGA graphics code and switch to a more ergonomic API to
traverse handle buffers. Simplify some error paths using a __free()
helper while at it.
- Fix some W=1 warnings when CONFIG_EFI=n
- Rely on the dentry cache to keep track of the contents of the efivarfs
filesystem, rather than using a separate linked list.
- Improve and extend efivarfs test cases.
- Synchronize efivarfs with underlying variable store on resume from
hibernation - this is needed because the firmware itself or another OS
running on the same machine may have modified it.
- Fix x86 EFI stub build with GCC 15.
- Fix kexec/x86 false positive warning in EFI memory attributes table
sanity check.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCZ5IH+gAKCRAwbglWLn0t
XHyMAP9Mqn5dD4XT22gvTRUrJuVYFLBlN+9d8ysRMjRVCzGwCQEAvCUJMy5Kje0J
h9i2InWjjPOVATx5hTrEoIEl96BGOgk=
=3Hnk
-----END PGP SIGNATURE-----
Merge tag 'efi-next-for-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
- Increase the headroom in the EFI memory map allocation created by the
EFI stub. This is needed because event callbacks called during
ExitBootServices() may cause fragmentation, and reallocation is not
allowed after that.
- Drop obsolete UGA graphics code and switch to a more ergonomic API to
traverse handle buffers. Simplify some error paths using a __free()
helper while at it.
- Fix some W=1 warnings when CONFIG_EFI=n
- Rely on the dentry cache to keep track of the contents of the
efivarfs filesystem, rather than using a separate linked list.
- Improve and extend efivarfs test cases.
- Synchronize efivarfs with underlying variable store on resume from
hibernation - this is needed because the firmware itself or another
OS running on the same machine may have modified it.
- Fix x86 EFI stub build with GCC 15.
- Fix kexec/x86 false positive warning in EFI memory attributes table
sanity check.
* tag 'efi-next-for-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (23 commits)
x86/efi: skip memattr table on kexec boot
efivarfs: add variable resync after hibernation
efivarfs: abstract initial variable creation routine
efi: libstub: Use '-std=gnu11' to fix build with GCC 15
selftests/efivarfs: add concurrent update tests
selftests/efivarfs: fix tests for failed write removal
efivarfs: fix error on write to new variable leaving remnants
efivarfs: remove unused efivarfs_list
efivarfs: move variable lifetime management into the inodes
selftests/efivarfs: add check for disallowing file truncation
efivarfs: prevent setting of zero size on the inodes in the cache
efi: sysfb_efi: fix W=1 warnings when EFI is not set
efi/libstub: Use __free() helper for pool deallocations
efi/libstub: Use cleanup helpers for freeing copies of the memory map
efi/libstub: Simplify PCI I/O handle buffer traversal
efi/libstub: Refactor and clean up GOP resolution picker code
efi/libstub: Simplify GOP handling code
efi/libstub: Use C99-style for loop to traverse handle buffer
x86/efistub: Drop long obsolete UGA support
efivarfs: make variable_is_present use dcache lookup
...
efi_memattr_init() added a sanity check to avoid firmware caused corruption.
The check is based on efi memmap entry numbers, but kexec only takes the
runtime related memmap entries thus this caused many false warnings, see
below thread for details:
https://lore.kernel.org/all/20250108215957.3437660-2-usamaarif642@gmail.com/
Ard suggests to skip the efi memattr table in kexec, this makes sense because
those memattr fixups are not critical.
Fixes: 8fbe4c49c0 ("efi/memattr: Ignore table if the size is clearly bogus")
Cc: <stable@vger.kernel.org> # v6.13+
Reported-by: Breno Leitao <leitao@debian.org>
Reported-and-tested-by: Usama Arif <usamaarif642@gmail.com>
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Instead of marking individual interrupts as safe to be migrated in
arbitrary contexts, mark the interrupt chips, which require the interrupt
to be moved in actual interrupt context, with the new IRQCHIP_MOVE_DEFERRED
flag. This makes more sense because this is a per interrupt chip property
and not restricted to individual interrupts.
That flips the logic from the historical opt-out to a opt-in model. This is
simpler to handle for other architectures, which default to unrestricted
affinity setting. It also allows to cleanup the redundant core logic
significantly.
All interrupt chips, which belong to a top-level domain sitting directly on
top of the x86 vector domain are marked accordingly, unless the related
setup code marks the interrupts with IRQ_MOVE_PCNTXT, i.e. XEN.
No functional change intended.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/all/20241210103335.563277044@linutronix.de
UGA is the EFI graphical output protocol that preceded GOP, and has been
long obsolete. Drop support for it from the x86 implementation of the
EFI stub - other architectures never bothered to implement it (save for
ia64)
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
- Align handling of the compiled-in command line with the core kernel
- Measure the initrd into the TPM also when it was loaded via the EFI
file I/O protocols
- Clean up TPM event log handling
- Sanity check the EFI memory attributes table, and apply it after kexec
too
- Assorted other fixes
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCZzxU/QAKCRAwbglWLn0t
XIBoAQDAHoTX2/CxsCKHXaJE8C19kN451lgLRtZea5kFCVhq+QD/YDVxfif4OvUM
q2Wo5BwmqyUk56qHAW14DWZb2Pl1sgU=
=YFm/
-----END PGP SIGNATURE-----
Merge tag 'efi-next-for-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
"Just some cleanups and bug fixes this time around:
- Align handling of the compiled-in command line with the core kernel
- Measure the initrd into the TPM also when it was loaded via the EFI
file I/O protocols
- Clean up TPM event log handling
- Sanity check the EFI memory attributes table, and apply it after
kexec too
- Assorted other fixes"
* tag 'efi-next-for-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi: Fix memory leak in efivar_ssdt_load
efi/libstub: Take command line overrides into account for loaded files
efi/libstub: Fix command line fallback handling when loading files
efi/libstub: Parse builtin command line after bootloader provided one
x86/efi: Apply EFI Memory Attributes after kexec
x86/efi: Drop support for the EFI_PROPERTIES_TABLE
efi/memattr: Ignore table if the size is clearly bogus
efi/zboot: Fix outdated comment about using LoadImage/StartImage
efi/libstub: Free correct pointer on failure
libstub,tpm: do not ignore failure case when reading final event log
tpm: fix unsigned/signed mismatch errors related to __calc_tpm2_event_size
tpm: do not ignore memblock_reserve return value
tpm: fix signed/unsigned bug when checking event logs
efi/libstub: measure initrd to PCR9 independent of source
efi/libstub: remove unnecessary cmd_line_len from efi_convert_cmdline()
efi/libstub: fix efi_parse_options() ignoring the default command line
- x86/boot: Remove unused function atou() (Dr. David Alan Gilbert)
- x86/cpu: Use str_yes_no() helper in show_cpuinfo_misc() (Thorsten Blum)
- x86/platform: Switch back to struct platform_driver::remove() (Uwe Kleine-König)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmc7gbsRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1gk4g//QiV5NS8d5mc+6q/UbVahST5Mf3Yktg7/
LIgzJyqUMZctkbcASDbyTlFqZXGYfZ/ZY6VPEOuo0ANE57GfSbcuwe+gdvL9kXWQ
9IN5Tst4BJ4iMC+FXYA+VgRAg5Ufy4xJFpw1ObnNtAQ4f2emCTeCMPonCrN8x737
2dbHzgnJSBbDgG6lxcEJZ/PsoKoTLiq6uZZ98FRRrHCWmkPkNOn2IQx97ITdAKcj
LnQkTYnTRIzap/eLBrqgxhhtM2R+3DrDClar3WWRSCoFMYg6c8aptidDBQcJhUKZ
XIhb2enKBKFVtzahjPVOiqLE2i+JGGiPSyQnCpt7y2yHQUyDXkDWWMPdZKC/6WHf
+2sLgEdl+pz2o3mGDMCkkG58bDgEf0+5gGva+M5F6h7GslYM9Gp0doOYZDt6l+FH
3rrmXj2WKwI2kM0SWDwfgdLhhmPEsIm8DMOjOftEfR0pwqyUw5JmPfD1dnudrPX0
wfpRavlUAzzW0IbWaL+S2VhqQnLhLwJ/T/Gb6Gi6LOkTaRuD42hu0tkctlfJ/VTR
7Bk+mV2PHpBek3gWnKEUegYQ9LK9DGIucvYi693MWjXVFN6vSyNEsaefRHbT2rGi
QuLnO2Laecg6SGNBjLYJBWjv72H2CohDRuJdFgzzJAEzebLwdCJE79qjqWRhpKAx
sEzSy0+6waw=
=l+KH
-----END PGP SIGNATURE-----
Merge tag 'x86-cleanups-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
- x86/boot: Remove unused function atou() (Dr. David Alan Gilbert)
- x86/cpu: Use str_yes_no() helper in show_cpuinfo_misc() (Thorsten
Blum)
- x86/platform: Switch back to struct platform_driver::remove() (Uwe
Kleine-König)
* tag 'x86-cleanups-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Remove unused function atou()
x86/cpu: Use str_yes_no() helper in show_cpuinfo_misc()
x86/platform: Switch back to struct platform_driver::remove()
with the purpose of using such hints when making scheduling decisions
- Determine the boost enumerator for each AMD core based on its type: efficiency
or performance, in the cppc driver
- Add the type of a CPU to the topology CPU descriptor with the goal of
supporting and making decisions based on the type of the respective core
- Add a feature flag to denote AMD cores which have heterogeneous topology and
enable SD_ASYM_PACKING for those
- Check microcode revisions before disabling PCID on Intel
- Cleanups and fixlets
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmc7q0UACgkQEsHwGGHe
VUq27Q//TADIn/rZj95OuWLYFXduOpzdyfF6BAOabRjUpIWTGJ5YdKjj1TCA2wUE
6SiHZWQxQropB3NgeICcDT+3OGdGzE2qywzpXspUDsBPraWx+9CA56qREYafpRps
88ZQZJWHla2/0kHN5oM4fYe05mWMLAFgIhG4tPH/7sj54Zqar40nhVksz3WjKAid
yEfzbdVeRI5sNoujyHzGANXI0Fo98nAyi5Qj9kXL9W/UV1JmoQ78Rq7V9IIgOBsc
l6Gv/h0CNtH9voqfrfUb07VHk8ZqSJ37xUnrnKdidncWGCWEAoZRr7wU+I9CHKIs
tzdx+zq6JC3YN0IwsZCjk4me+BqVLJxW2oDgW7esPifye6ElyEo4T9UO9LEpE1qm
ReAByoIMdSXWwXuITwy4NxLPKPCpU7RyJCiqFzpJp0g4qUq2cmlyERDirf6eknXL
s+dmRaglEdcQT/EL+Y+vfFdQtLdwJmOu+nPPjjFxeRcIDB+u1sXJMEFbyvkLL6FE
HOdNxL+5n/3M8Lbh77KIS5uCcjXL2VCkZK2/hyoifUb+JZR/ENoqYjElkMXOplyV
KQIfcTzVCLRVvZApf/MMkTO86cpxMDs7YLYkgFxDsBjRdoq/Mzub8yzWn6kLZtmP
ANNH4uYVtjrHE1nxJSA0JgYQlJKYeNU5yhLiTLKhHL5BwDYfiz8=
=420r
-----END PGP SIGNATURE-----
Merge tag 'x86_cpu_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpuid updates from Borislav Petkov:
- Add a feature flag which denotes AMD CPUs supporting workload
classification with the purpose of using such hints when making
scheduling decisions
- Determine the boost enumerator for each AMD core based on its type:
efficiency or performance, in the cppc driver
- Add the type of a CPU to the topology CPU descriptor with the goal of
supporting and making decisions based on the type of the respective
core
- Add a feature flag to denote AMD cores which have heterogeneous
topology and enable SD_ASYM_PACKING for those
- Check microcode revisions before disabling PCID on Intel
- Cleanups and fixlets
* tag 'x86_cpu_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Remove redundant CONFIG_NUMA guard around numa_add_cpu()
x86/cpu: Fix FAM5_QUARK_X1000 to use X86_MATCH_VFM()
x86/cpu: Fix formatting of cpuid_bits[] in scattered.c
x86/cpufeatures: Add X86_FEATURE_AMD_WORKLOAD_CLASS feature bit
x86/amd: Use heterogeneous core topology for identifying boost numerator
x86/cpu: Add CPU type to struct cpuinfo_topology
x86/cpu: Enable SD_ASYM_PACKING for PKG domain on AMD
x86/cpufeatures: Add X86_FEATURE_AMD_HETEROGENEOUS_CORES
x86/cpufeatures: Rename X86_FEATURE_FAST_CPPC to have AMD prefix
x86/mm: Don't disable PCID when INVLPG has been fixed by microcode
Kexec bypasses EFI's switch to virtual mode. In exchange, it has its own
routine, kexec_enter_virtual_mode(), which replays the mappings made by
the original kernel. Unfortunately, that function fails to reinstate
EFI's memory attributes, which would've otherwise been set after
entering virtual mode. Remediate this by calling
efi_runtime_update_mappings() within kexec's routine.
Signed-off-by: Nicolas Saenz Julienne <nsaenz@amazon.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Drop support for the EFI_PROPERTIES_TABLE. It was a failed, short-lived
experiment that broke the boot both on Linux and Windows, and was
replaced by the EFI_MEMORY_ATTRIBUTES_TABLE shortly after.
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Nicolas Saenz Julienne <nsaenz@amazon.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
pcim_iomap_table() and pcim_request_regions() have been deprecated in
e354bb84a4 ("PCI: Deprecate pcim_iomap_table(), pcim_iomap_regions_request_all()") and
d140f80f60 ("PCI: Deprecate pcim_iomap_regions() in favor of pcim_iomap_region()"),
respectively.
Replace these functions with pcim_iomap_region().
Additionally, pass the actual driver name to pcim_iomap_region() instead of
the previous pci_name(), since the @name parameter should always reflect which
driver owns a region.
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20241111103602.16615-2-pstanner@redhat.com