mirror of
https://github.com/torvalds/linux.git
synced 2026-03-08 01:04:41 +01:00
5016 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
4ae12d8bd9 |
Second round of Kbuild fixes for 7.0
- Split out .modinfo section from ELF_DETAILS macro, as that macro may be used in other areas that expect to discard .modinfo, breaking certain image layouts - Adjust genksyms parser to handle optional attributes in certain declarations, necessary after commit |
||
|
|
70295a479d |
KVM: always define KVM_CAP_SYNC_MMU
KVM_CAP_SYNC_MMU is provided by KVM's MMU notifiers, which are now always available. Move the definition from individual architectures to common code. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
|
|
407fd8b8d8 |
KVM: remove CONFIG_KVM_GENERIC_MMU_NOTIFIER
All architectures now use MMU notifier for KVM page table management. Remove the Kconfig symbol and the code that is used when it is disabled. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> |
||
|
|
8678591b47
|
kbuild: Split .modinfo out from ELF_DETAILS
Commit |
||
|
|
189f164e57 |
Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses
Conversion performed via this Coccinelle script:
// SPDX-License-Identifier: GPL-2.0-only
// Options: --include-headers-for-types --all-includes --include-headers --keep-comments
virtual patch
@gfp depends on patch && !(file in "tools") && !(file in "samples")@
identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex,
kzalloc_obj,kzalloc_objs,kzalloc_flex,
kvmalloc_obj,kvmalloc_objs,kvmalloc_flex,
kvzalloc_obj,kvzalloc_objs,kvzalloc_flex};
@@
ALLOC(...
- , GFP_KERNEL
)
$ make coccicheck MODE=patch COCCI=gfp.cocci
Build and boot tested x86_64 with Fedora 42's GCC and Clang:
Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
32a92f8c89 |
Convert more 'alloc_obj' cases to default GFP_KERNEL arguments
This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
323bbfcf1e |
Convert 'alloc_flex' family to use the new default GFP_KERNEL argument
This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the *alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs*'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
bf4afc53b7 |
Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
69050f8d6d |
treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org> |
||
|
|
35b3515be0 |
bpf, riscv: add fsession support for trampolines
Implement BPF_TRACE_FSESSION support in the RISC-V trampoline JIT. The logic here is similar to what we did in x86_64. In order to simply the logic, we factor out the function invoke_bpf() for fentry and fexit. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Tested-by: Björn Töpel <bjorn@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20260208053311.698352-3-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
|
|
93fd420d71 |
bpf, riscv: introduce emit_store_stack_imm64() for trampoline
Introduce a helper to store 64-bit immediate on the trampoline stack with a help of a register. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Tested-by: Björn Töpel <bjorn@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20260208053311.698352-2-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
||
|
|
cb5573868e |
Loongarch:
- Add more CPUCFG mask bits. - Improve feature detection. - Add lazy load support for FPU and binary translation (LBT) register state. - Fix return value for memory reads from and writes to in-kernel devices. - Add support for detecting preemption from within a guest. - Add KVM steal time test case to tools/selftests. ARM: - Add support for FEAT_IDST, allowing ID registers that are not implemented to be reported as a normal trap rather than as an UNDEF exception. - Add sanitisation of the VTCR_EL2 register, fixing a number of UXN/PXN/XN bugs in the process. - Full handling of RESx bits, instead of only RES0, and resulting in SCTLR_EL2 being added to the list of sanitised registers. - More pKVM fixes for features that are not supposed to be exposed to guests. - Make sure that MTE being disabled on the pKVM host doesn't give it the ability to attack the hypervisor. - Allow pKVM's host stage-2 mappings to use the Force Write Back version of the memory attributes by using the "pass-through' encoding. - Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the guest. - Preliminary work for guest GICv5 support. - A bunch of debugfs fixes, removing pointless custom iterators stored in guest data structures. - A small set of FPSIMD cleanups. - Selftest fixes addressing the incorrect alignment of page allocation. - Other assorted low-impact fixes and spelling fixes. RISC-V: - Fixes for issues discoverd by KVM API fuzzing in kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(), and kvm_riscv_vcpu_aia_imsic_update() - Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM - Transparent huge page support for hypervisor page tables - Adjust the number of available guest irq files based on MMIO register sizes found in the device tree or the ACPI tables - Add RISC-V specific paging modes to KVM selftests - Detect paging mode at runtime for selftests s390: - Performance improvement for vSIE (aka nested virtualization) - Completely new memory management. s390 was a special snowflake that enlisted help from the architecture's page table management to build hypervisor page tables, in particular enabling sharing the last level of page tables. This however was a lot of code (~3K lines) in order to support KVM, and also blocked several features. The biggest advantages is that the page size of userspace is completely independent of the page size used by the guest: userspace can mix normal pages, THPs and hugetlbfs as it sees fit, and in fact transparent hugepages were not possible before. It's also now possible to have nested guests and guests with huge pages running on the same host. - Maintainership change for s390 vfio-pci - Small quality of life improvement for protected guests x86: - Add support for giving the guest full ownership of PMU hardware (contexted switched around the fastpath run loop) and allowing direct access to data MSRs and PMCs (restricted by the vPMU model). KVM still intercepts access to control registers, e.g. to enforce event filtering and to prevent the guest from profiling sensitive host state. This is more accurate, since it has no risk of contention and thus dropped events, and also has significantly less overhead. For more information, see the commit message for merge commit |
||
|
|
cee73b1e84 |
RISC-V updates for v7.0
- Add support for control flow integrity for userspace processes. This is based on the standard RISC-V ISA extensions Zicfiss and Zicfilp - Improve ptrace behavior regarding vector registers, and add some selftests - Optimize our strlen() assembly - Enable the ISO-8859-1 code page as built-in, similar to ARM64, for EFI volume mounting - Clean up some code slightly, including defining copy_user_page() as copy_page() rather than memcpy(), aligning us with other architectures; and using max3() to slightly simplify an expression in riscv_iommu_init_check() -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAmmOYpYACgkQx4+xDQu9 KkvzOQ/9Fq8ZxWgYofhTPtw9/vps3avheOHlEoRrBWYfn1VkTRPAcbUULL4PGXwg dnVFEl3AcrpOFikIthbukklLeLoOnUshZJBU25zY5h0My1jb63V1//gEwJR6I0dg +V+GJmfzc4+YVaHK6UFdn7j3GgKUbTC7xXRMuGEriAzKPnm3AXAjh94wMNx6depv Li3IXRoZT/HvqIAyfeAoM9STwOzJtE3Sc6fXABkzsIbNTjjdgIqoRSsQsKY10178 z6ox/sVStnLmVaMbOd/ZVN0J70JRDsvK0TC0/13K1ESUbnVia9a3bPIxLRmSapKC wXnwAuSeevtFshGGyd5LZO0QQGxzG1H63Gky2GRoh8bTQbd2tQcfQzANdnPkBAQS j2aOiSsiUQeNZqfZAfEBwRd27GXRYlKb/MpgCZKUH+ZO9VG6QaD3VGvg17/Caghy nVdbBQ81ZV9tkz9EMN0vt2VJHmEqARh88w619laHjg+ioPTG4/UIDPzskt1I+Fgm Y6NQLeFyfaO3RKKDYWGPcY7fmWQI9V8MECHOvyVI4xJcgqAbqnfsgytjuiFbrfRo fTvpuB7kvltBZ180QSB79xj0sWGFTWR02MeWy3uOaLZz2eIm2ZTZbMUSgNYR0ldG L3y7CEkTkoVF1ijYgAfuMgptk3Yf0dpa66D9HUo947wWkNrW5ds= =4fTk -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-7.0-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Paul Walmsley: - Add support for control flow integrity for userspace processes. This is based on the standard RISC-V ISA extensions Zicfiss and Zicfilp - Improve ptrace behavior regarding vector registers, and add some selftests - Optimize our strlen() assembly - Enable the ISO-8859-1 code page as built-in, similar to ARM64, for EFI volume mounting - Clean up some code slightly, including defining copy_user_page() as copy_page() rather than memcpy(), aligning us with other architectures; and using max3() to slightly simplify an expression in riscv_iommu_init_check() * tag 'riscv-for-linus-7.0-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits) riscv: lib: optimize strlen loop efficiency selftests: riscv: vstate_exec_nolibc: Use the regular prctl() function selftests: riscv: verify ptrace accepts valid vector csr values selftests: riscv: verify ptrace rejects invalid vector csr inputs selftests: riscv: verify syscalls discard vector context selftests: riscv: verify initial vector state with ptrace selftests: riscv: test ptrace vector interface riscv: ptrace: validate input vector csr registers riscv: csr: define vtype register elements riscv: vector: init vector context with proper vlenb riscv: ptrace: return ENODATA for inactive vector extension kselftest/riscv: add kselftest for user mode CFI riscv: add documentation for shadow stack riscv: add documentation for landing pad / indirect branch tracking riscv: create a Kconfig fragment for shadow stack and landing pad support arch/riscv: add dual vdso creation logic and select vdso based on hw arch/riscv: compile vdso with landing pad and shadow stack note riscv: enable kernel access to shadow stack memory via the FWFT SBI call riscv: add kernel command line option to opt out of user CFI riscv/hwprobe: add zicfilp / zicfiss enumeration in hwprobe ... |
||
|
|
4cff5c05e0 |
mm.git review status for linus..mm-stable
Everything:
Total patches: 325
Reviews/patch: 1.39
Reviewed rate: 72%
Excluding DAMON:
Total patches: 262
Reviews/patch: 1.63
Reviewed rate: 82%
Excluding DAMON and zram:
Total patches: 248
Reviews/patch: 1.72
Reviewed rate: 86%
- The 14 patch series "powerpc/64s: do not re-activate batched TLB
flush" from Alexander Gordeev makes arch_{enter|leave}_lazy_mmu_mode()
nest properly.
It adds a generic enter/leave layer and switches architectures to use
it. Various hacks were removed in the process.
- The 7 patch series "zram: introduce compressed data writeback" from
Richard Chang and Sergey Senozhatsky implements data compression for
zram writeback.
- The 8 patch series "mm: folio_zero_user: clear page ranges" from David
Hildenbrand adds clearing of contiguous page ranges for hugepages.
Large improvements during demand faulting are demonstrated.
- The 2 patch series "memcg cleanups" from Chen Ridong tideis up some
memcg code.
- The 12 patch series "mm/damon: introduce {,max_}nr_snapshots and
tracepoint for damos stats" from SeongJae Park improves DAMOS stat's
provided information, deterministic control, and readability.
- The 3 patch series "selftests/mm: hugetlb cgroup charging: robustness
fixes" from Li Wang fixes a few issues in the hugetlb cgroup charging
selftests.
- The 5 patch series "Fix va_high_addr_switch.sh test failure - again"
from Chunyu Hu addresses several issues in the va_high_addr_switch test.
- The 5 patch series "mm/damon/tests/core-kunit: extend existing test
scenarios" from Shu Anzai improves the KUnit test coverage for DAMON.
- The 2 patch series "mm/khugepaged: fix dirty page handling for
MADV_COLLAPSE" from Shivank Garg fixes a glitch in khugepaged which was
causing madvise(MADV_COLLAPSE) to transiently return -EAGAIN.
- The 29 patch series "arch, mm: consolidate hugetlb early reservation"
from Mike Rapoport reworks and consolidates a pile of straggly code
related to reservation of hugetlb memory from bootmem and creation of
CMA areas for hugetlb.
- The 9 patch series "mm: clean up anon_vma implementation" from Lorenzo
Stoakes cleans up the anon_vma implementation in various ways.
- The 3 patch series "tweaks for __alloc_pages_slowpath()" from
Vlastimil Babka does a little streamlining of the page allocator's
slowpath code.
- The 8 patch series "memcg: separate private and public ID namespaces"
from Shakeel Butt cleans up the memcg ID code and prevents the
internal-only private IDs from being exposed to userspace.
- The 6 patch series "mm: hugetlb: allocate frozen gigantic folio" from
Kefeng Wang cleans up the allocation of frozen folios and avoids some
atomic refcount operations.
- The 11 patch series "mm/damon: advance DAMOS-based LRU sorting" from
SeongJae Park improves DAMOS's movement of memory betewwn the active and
inactive LRUs and adds auto-tuning of the ratio-based quotas and of
monitoring intervals.
- The 18 patch series "Support page table check on PowerPC" from Andrew
Donnellan makes CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc.
- The 3 patch series "nodemask: align nodes_and{,not} with underlying
bitmap ops" from Yury Norov makes nodes_and() and nodes_andnot()
propagate the return values from the underlying bit operations, enabling
some cleanup in calling code.
- The 5 patch series "mm/damon: hide kdamond and kdamond_lock from API
callers" from SeongJae Park cleans up some DAMON internal interfaces.
- The 4 patch series "mm/khugepaged: cleanups and scan limit fix" from
Shivank Garg does some cleanup work in khupaged and fixes a scan limit
accounting issue.
- The 24 patch series "mm: balloon infrastructure cleanups" from David
Hildenbrand goes to town on the balloon infrastructure and its page
migration function. Mainly cleanups, also some locking simplification.
- The 2 patch series "mm/vmscan: add tracepoint and reason for
kswapd_failures reset" from Jiayuan Chen adds additional tracepoints to
the page reclaim code.
- The 3 patch series "Replace wq users and add WQ_PERCPU to
alloc_workqueue() users" from Marco Crivellari is part of Marco's
kernel-wide migration from the legacy workqueue APIs over to the
preferred unbound workqueues.
- The 9 patch series "Various mm kselftests improvements/fixes" from
Kevin Brodsky provides various unrelated improvements/fixes for the mm
kselftests.
- The 5 patch series "mm: accelerate gigantic folio allocation" from
Kefeng Wang greatly speeds up gigantic folio allocation, mainly by
avoiding unnecessary work in pfn_range_valid_contig().
- The 5 patch series "selftests/damon: improve leak detection and wss
estimation reliability" from SeongJae Park improves the reliability of
two of the DAMON selftests.
- The 8 patch series "mm/damon: cleanup kdamond, damon_call(), damos
filter and DAMON_MIN_REGION" from SeongJae Park does some cleanup work
in the core DAMON code.
- The 8 patch series "Docs/mm/damon: update intro, modules, maintainer
profile, and misc" from SeongJae Park performs maintenance work on the
DAMON documentation.
- The 10 patch series "mm: add and use vma_assert_stabilised() helper"
from Lorenzo Stoakes refactors and cleans up the core VMA code. The
main aim here is to be able to use the mmap write lock's lockdep state
to perform various assertions regarding the locking which the VMA code
requires.
- The 19 patch series "mm, swap: swap table phase II: unify swapin use"
from Kairui Song removes some old swap code (swap cache bypassing and
swap synchronization) which wasn't working very well. Various other
cleanups and simplifications were made. The end result is a 20% speedup
in one benchmark.
- The 8 patch series "enable PT_RECLAIM on more 64-bit architectures"
from Qi Zheng makes PT_RECLAIM available on 64-bit alpha, loongarch,
mips, parisc, um, Various cleanups were performed along the way.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaY1HfAAKCRDdBJ7gKXxA
jqhZAP9H8ZlKKqCEgnr6U5XXmJ63Ep2FDQpl8p35yr9yVuU9+gEAgfyWiJ43l1fP
rT0yjsUW3KQFBi/SEA3R6aYarmoIBgI=
=+HLt
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "powerpc/64s: do not re-activate batched TLB flush" makes
arch_{enter|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev)
It adds a generic enter/leave layer and switches architectures to use
it. Various hacks were removed in the process.
- "zram: introduce compressed data writeback" implements data
compression for zram writeback (Richard Chang and Sergey Senozhatsky)
- "mm: folio_zero_user: clear page ranges" adds clearing of contiguous
page ranges for hugepages. Large improvements during demand faulting
are demonstrated (David Hildenbrand)
- "memcg cleanups" tidies up some memcg code (Chen Ridong)
- "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos
stats" improves DAMOS stat's provided information, deterministic
control, and readability (SeongJae Park)
- "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few
issues in the hugetlb cgroup charging selftests (Li Wang)
- "Fix va_high_addr_switch.sh test failure - again" addresses several
issues in the va_high_addr_switch test (Chunyu Hu)
- "mm/damon/tests/core-kunit: extend existing test scenarios" improves
the KUnit test coverage for DAMON (Shu Anzai)
- "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a
glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to
transiently return -EAGAIN (Shivank Garg)
- "arch, mm: consolidate hugetlb early reservation" reworks and
consolidates a pile of straggly code related to reservation of
hugetlb memory from bootmem and creation of CMA areas for hugetlb
(Mike Rapoport)
- "mm: clean up anon_vma implementation" cleans up the anon_vma
implementation in various ways (Lorenzo Stoakes)
- "tweaks for __alloc_pages_slowpath()" does a little streamlining of
the page allocator's slowpath code (Vlastimil Babka)
- "memcg: separate private and public ID namespaces" cleans up the
memcg ID code and prevents the internal-only private IDs from being
exposed to userspace (Shakeel Butt)
- "mm: hugetlb: allocate frozen gigantic folio" cleans up the
allocation of frozen folios and avoids some atomic refcount
operations (Kefeng Wang)
- "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement
of memory betewwn the active and inactive LRUs and adds auto-tuning
of the ratio-based quotas and of monitoring intervals (SeongJae Park)
- "Support page table check on PowerPC" makes
CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan)
- "nodemask: align nodes_and{,not} with underlying bitmap ops" makes
nodes_and() and nodes_andnot() propagate the return values from the
underlying bit operations, enabling some cleanup in calling code
(Yury Norov)
- "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up
some DAMON internal interfaces (SeongJae Park)
- "mm/khugepaged: cleanups and scan limit fix" does some cleanup work
in khupaged and fixes a scan limit accounting issue (Shivank Garg)
- "mm: balloon infrastructure cleanups" goes to town on the balloon
infrastructure and its page migration function. Mainly cleanups, also
some locking simplification (David Hildenbrand)
- "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds
additional tracepoints to the page reclaim code (Jiayuan Chen)
- "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is
part of Marco's kernel-wide migration from the legacy workqueue APIs
over to the preferred unbound workqueues (Marco Crivellari)
- "Various mm kselftests improvements/fixes" provides various unrelated
improvements/fixes for the mm kselftests (Kevin Brodsky)
- "mm: accelerate gigantic folio allocation" greatly speeds up gigantic
folio allocation, mainly by avoiding unnecessary work in
pfn_range_valid_contig() (Kefeng Wang)
- "selftests/damon: improve leak detection and wss estimation
reliability" improves the reliability of two of the DAMON selftests
(SeongJae Park)
- "mm/damon: cleanup kdamond, damon_call(), damos filter and
DAMON_MIN_REGION" does some cleanup work in the core DAMON code
(SeongJae Park)
- "Docs/mm/damon: update intro, modules, maintainer profile, and misc"
performs maintenance work on the DAMON documentation (SeongJae Park)
- "mm: add and use vma_assert_stabilised() helper" refactors and cleans
up the core VMA code. The main aim here is to be able to use the mmap
write lock's lockdep state to perform various assertions regarding
the locking which the VMA code requires (Lorenzo Stoakes)
- "mm, swap: swap table phase II: unify swapin use" removes some old
swap code (swap cache bypassing and swap synchronization) which
wasn't working very well. Various other cleanups and simplifications
were made. The end result is a 20% speedup in one benchmark (Kairui
Song)
- "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM
available on 64-bit alpha, loongarch, mips, parisc, and um. Various
cleanups were performed along the way (Qi Zheng)
* tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits)
mm/memory: handle non-split locks correctly in zap_empty_pte_table()
mm: move pte table reclaim code to memory.c
mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE
mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config
um: mm: enable MMU_GATHER_RCU_TABLE_FREE
parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE
mips: mm: enable MMU_GATHER_RCU_TABLE_FREE
LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE
alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE
mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h
mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles
zsmalloc: make common caches global
mm: add SPDX id lines to some mm source files
mm/zswap: use %pe to print error pointers
mm/vmscan: use %pe to print error pointers
mm/readahead: fix typo in comment
mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file()
mm: refactor vma_map_pages to use vm_insert_pages
mm/damon: unify address range representation with damon_addr_range
mm/cma: replace snprintf with strscpy in cma_new_area
...
|
||
|
|
bf2c3138ae |
Merge tag 'kvm-x86-pmu-6.20' of https://github.com/kvm-x86/linux into HEAD
KVM mediated PMU support for 6.20 Add support for mediated PMUs, where KVM gives the guest full ownership of PMU hardware (contexted switched around the fastpath run loop) and allows direct access to data MSRs and PMCs (restricted by the vPMU model), but intercepts access to control registers, e.g. to enforce event filtering and to prevent the guest from profiling sensitive host state. To keep overall complexity reasonable, mediated PMU usage is all or nothing for a given instance of KVM (controlled via module param). The Mediated PMU is disabled default, partly to maintain backwards compatilibity for existing setup, partly because there are tradeoffs when running with a mediated PMU that may be non-starters for some use cases, e.g. the host loses the ability to profile guests with mediated PMUs, the fastpath run loop is also a blind spot, entry/exit transitions are more expensive, etc. Versus the emulated PMU, where KVM is "just another perf user", the mediated PMU delivers more accurate profiling and monitoring (no risk of contention and thus dropped events), with significantly less overhead (fewer exits and faster emulation/programming of event selectors) E.g. when running Specint-2017 on a single-socket Sapphire Rapids with 56 cores and no-SMT, and using perf from within the guest: Perf command: a. basic-sampling: perf record -F 1000 -e 6-instructions -a --overwrite b. multiplex-sampling: perf record -F 1000 -e 10-instructions -a --overwrite Guest performance overhead: --------------------------------------------------------------------------- | Test case | emulated vPMU | all passthrough | passthrough with | | | | | event filters | --------------------------------------------------------------------------- | basic-sampling | 33.62% | 4.24% | 6.21% | --------------------------------------------------------------------------- | multiplex-sampling | 79.32% | 7.34% | 10.45% | --------------------------------------------------------------------------- |
||
|
|
54f15ebfc6 |
KVM/riscv changes for 6.20
- Fixes for issues discoverd by KVM API fuzzing in kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(), and kvm_riscv_vcpu_aia_imsic_update() - Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM - Add riscv vm satp modes in KVM selftests - Transparent huge page support for G-stage - Adjust the number of available guest irq files based on MMIO register sizes in DeviceTree or ACPI -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmmF8FsACgkQrUjsVaLH LAfDYxAAh3jlLkHGlPiWtKcZ/cS+uvpA5hE52h+UmCUOU7mRuvnoA+zS3HcW8lQo qyZt/NNE4qZ7vNhcDp+BTPIGAv06lwCbsPaBkGMA94jrBHXko6GBb5qkiIqi+L0M nkUABfM5l3Rsleo8JJEGEn5Egr7waNQBr8TynF6yChAJlnbuEVskaxzwzl+s7COV wHrU4OfkXBDCLwyuP65oJbBpP+P2ylJV25gl6E0oGv2CIcMpgJIMibbTewqzVFuR Z79/GhRC64ds7+vlHhOuajehbMcBSAnkGZGC6IMOp63gyswtXZvXfI+x3uv+i1KS D5gdO7sT6WBl/Y8IDQTTv4Tuk5I9I6luClVzJtfxaIp9I5wNPx9FS4qKipUxbu+e EFWs/mC+6U7MRm49n8FwXfoDwiFYm2XA6VB2FZdAwePxJKsqON5UKI3TDNTxNuh7 rbUOFOUn3azyHgHD/WuVXRnFK4VUs0YVFgW/cx4hUWLafVkiWW/5ve5vsx1jmiBG EFN/db6unjUXa/ZIC3y/hJ1UhTBVdSKupbawWmksHav8ugE69o7GF8r5J7/RQtTj 6MHTNFwvatjaWVzCCjYQ+hV/qGD2SMB0D7rReV28D44KFQCrCgTmkpJoZKu+Uq2B sjI1XW8kH/n3OX/Sllj3ZO+VOfeXWlBC6yW5ARhnsEvoc4bHWpk= =PSMs -----END PGP SIGNATURE----- Merge tag 'kvm-riscv-6.20-1' of https://github.com/kvm-riscv/linux into HEAD KVM/riscv changes for 6.20 - Fixes for issues discoverd by KVM API fuzzing in kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(), and kvm_riscv_vcpu_aia_imsic_update() - Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM - Add riscv vm satp modes in KVM selftests - Transparent huge page support for G-stage - Adjust the number of available guest irq files based on MMIO register sizes in DeviceTree or ACPI |
||
|
|
6589b3d76d |
soc: devicetree updates for 7.0
There are a handful of new SoCs this time, all of these are
more or less related to chips in a wider family:
- SpacemiT Key Stone K3 is an 8-core risc-v chip, and the first
widely available RVA23 implementation. Note that this is
entirely unrelated with the similarly named Texas Instruments
K3 chip family that follwed the TI Keystone2 SoC.
- The Realtek Kent family of SoCs contains three chip models
rtd1501s, rtd1861b and rtd1920s, and is related to their earlier
Set-top-box and NAS products such as rtd1619, but is built
on newer Arm Cortex-A78 cores.
- The Qualcomm Milos family includes the Snapdragon 7s Gen 3
(SM7635) mobile phone SoC built around Armv9 Kryo cores of the Arm
Cortex-A720 generation. This one is used in the Fairphone Gen 6
- Qualcomm Kaanapali is a new SoC based around eight high
performance Oryon CPU cores
- NXP i.MX8QP and i.MX952 are both feature reduced versions of
chips we already support, i.e. the i.MX8QM and i.MX952, with
fewer CPU cores and I/O interfaces.
As part of a cleanup, a number of SoC specific devicetree files got
removed because they did not have a single board using the .dtsi files
and they were never compile tested as a result: Samsung s3c6400,
ST spear320s, ST stm32mp21xc/stm32mp23xc/stm32mp25xc, Renesas
r8a779m0/r8a779m2/r8a779m4/r8a779m6/r8a779m7/r8a779m8/r8a779mb/
r9a07g044c1/r9a07g044l1/r9a07g054l1/r9a09g047e37, and TI am3703/am3715.
All of these could be restored easily if a new board gets merged.
Broadcom/Cavium/Marvell ThunderX2 gets removed along with its only
machine, as all remaining users are assumed to be using ACPI
based firmware.
A relatively small number of 43 boards get added this time, and
almost all of them for arm64. Aside from the reference boards for
the newly added SoCs, this includes:
- Three server boards use 32-bit ASpeed BMCs
- One more reference board for 32-bit Microchip LAN9668
- 64-bit Arm single-board computers based on Amlogic s905y4,
CIX sky1, NXP ls1028a/imx8mn/imx8mp/imx91/imx93/imx95,
Qualcomm qcs6490/qrb2210 and Rockchip rk3568/rk3588s
- Carrier board for SOMs using Intel agilex5, Marvell Armada 7020,
NXP iMX8QP, Mediatek mt8370/mt8390 and rockchip rk3588
- Two mobile phones using Snapdragon 845
- A gaming device and a NAS box, both based on Rockchips rk356x
On top of the newly added boards and SoCs, there is a lot of
background activity going into cleanups, in particular towards
getting a warning-free dtc build, and the usual work on adding
support for more hardware on the previously added machines.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmmLSTIACgkQmmx57+YA
GNk6xw//bn239Nn6XUSrmm3b7SGDf+9AvdrukrUEOsIYBYUM7fkulVSINpVOSzZU
DAxLSCY1qfE9zP4x+hrYv922w9Rt19zPuEwFVCslbbTk9NN8IhmhIOs06o2jrvN3
HS/AcESV2SCUe0EVjDIdBgisKMGdbN2t8bdrFFOmqUkQ+7EJ2GvNL0MoaKrdF+Sr
ilt5Hhkl6ixbGDq2KEB2QQHQhYKa/5GdKS0CLTY4et/dZbjHVg9o6/sfgIhLINCz
wNb9CKnt1Gv5L3RWW2LxQrrNe5qhLmHq1vmPbxSJGrzqnOwY9Tcg4s1Io9EcDtyW
LZlq4PkLJV9oPVHgi0mygZ3ONVhWhCMVhTXg6Osi1aHJeEERuIaYMfeU7WD0jHv8
ZcGboxfyiQmphRJumL0C74uIuuXgdoKrv7gqQvo9dy+HRxdHW/7p8TQi9SSfh7kF
Iysc2ePMmqLd4WJCMxV+7FrT8oZxOL+/KfisCu6n/Qdv65kTWmBlLCK6XZrmWYyk
YKg48F8xpQaSmgevWePwhcH0a/TmgmoT+6xOfTuyo88k65FLXXmrFp14th2Kg5sI
60W9ur6ujPI3s19H9C3IQp7ub5Ermvj+g893zEB1e2CR9blfqRARV9zFSv4OMkq+
hQmqe5cU9/17k7wchFke4Y/FsS8W2oFFJ9o6czOTnh5NhlpVSJw=
=IK23
-----END PGP SIGNATURE-----
Merge tag 'soc-dt-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC devicetree updates from Arnd Bergmann:
"There are a handful of new SoCs this time, all of these are more or
less related to chips in a wider family:
- SpacemiT Key Stone K3 is an 8-core risc-v chip, and the first
widely available RVA23 implementation. Note that this is entirely
unrelated with the similarly named Texas Instruments K3 chip family
that follwed the TI Keystone2 SoC.
- The Realtek Kent family of SoCs contains three chip models
rtd1501s, rtd1861b and rtd1920s, and is related to their earlier
Set-top-box and NAS products such as rtd1619, but is built on newer
Arm Cortex-A78 cores.
- The Qualcomm Milos family includes the Snapdragon 7s Gen 3 (SM7635)
mobile phone SoC built around Armv9 Kryo cores of the Arm
Cortex-A720 generation. This one is used in the Fairphone Gen 6
- Qualcomm Kaanapali is a new SoC based around eight high performance
Oryon CPU cores
- NXP i.MX8QP and i.MX952 are both feature reduced versions of chips
we already support, i.e. the i.MX8QM and i.MX952, with fewer CPU
cores and I/O interfaces.
As part of a cleanup, a number of SoC specific devicetree files got
removed because they did not have a single board using the .dtsi files
and they were never compile tested as a result: Samsung s3c6400, ST
spear320s, ST stm32mp21xc/stm32mp23xc/stm32mp25xc, Renesas
r8a779m0/r8a779m2/r8a779m4/r8a779m6/r8a779m7/r8a779m8/r8a779mb/
r9a07g044c1/r9a07g044l1/r9a07g054l1/r9a09g047e37, and TI
am3703/am3715. All of these could be restored easily if a new board
gets merged.
Broadcom/Cavium/Marvell ThunderX2 gets removed along with its only
machine, as all remaining users are assumed to be using ACPI based
firmware.
A relatively small number of 43 boards get added this time, and almost
all of them for arm64. Aside from the reference boards for the newly
added SoCs, this includes:
- Three server boards use 32-bit ASpeed BMCs
- One more reference board for 32-bit Microchip LAN9668
- 64-bit Arm single-board computers based on Amlogic s905y4, CIX
sky1, NXP ls1028a/imx8mn/imx8mp/imx91/imx93/imx95, Qualcomm
qcs6490/qrb2210 and Rockchip rk3568/rk3588s
- Carrier board for SOMs using Intel agilex5, Marvell Armada 7020,
NXP iMX8QP, Mediatek mt8370/mt8390 and rockchip rk3588
- Two mobile phones using Snapdragon 845
- A gaming device and a NAS box, both based on Rockchips rk356x
On top of the newly added boards and SoCs, there is a lot of
background activity going into cleanups, in particular towards getting
a warning-free dtc build, and the usual work on adding support for
more hardware on the previously added machines"
* tag 'soc-dt-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (757 commits)
dt-bindings: intel: Add Agilex eMMC support
arm64: dts: socfpga: agilex: add emmc support
arm64: dts: intel: agilex5: Add simple-bus node on top of dma controller node
ARM: dts: socfpga: fix dtbs_check warning for fpga-region
ARM: dts: socfpga: add #address-cells and #size-cells for sram node
dt-bindings: altera: document syscon as fallback for sys-mgr
arm64: dts: altera: Use lowercase hex
dt-bindings: arm: altera: combine Intel's SoCFPGA into altera.yaml
arm64: dts: socfpga: agilex5: Add IOMMUS property for ethernet nodes
arm64: dts: socfpga: agilex5: add support for modular board
dt-bindings: intel: Add Agilex5 SoCFPGA modular board
arm64: dts: socfpga: agilex5: Add dma-coherent property
arm64: dts: realtek: Add Kent SoC and EVB device trees
dt-bindings: arm: realtek: Add Kent Soc family compatibles
ARM: dts: samsung: Drop s3c6400.dtsi
ARM: dts: nuvoton: Minor whitespace cleanup
MAINTAINERS: Add Falcon DB
arm64: dts: a7k: add COM Express boards
ARM: dts: microchip: Drop usb_a9g20-dab-mmx.dtsi
arm64: dts: rockchip: Fix rk3588 PCIe range mappings
...
|
||
|
|
f7fae9b4d3 |
soc: defconfig updates for 7.0
These are the usual updates, enabling mode newly merged device drivers for various Arm and RISC-V based platforms in the defconfig files. The Renesas and NXP defconfig files also get a refresh for modified Kconfig options. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmmLVkEACgkQmmx57+YA GNkSrxAAtP2FcZtRGrAXuUwmr9THJTJsn+F+oulSa5DOWSe9QSEPMXMbxX7k28w7 4b3RLBaKKHUlj/cvZrGg/nCo3xX4RtAszhs9FE4mBaQ2/ai3hdre7vnNyD7Vo2i6 OxlTqzO/G7X6IRhhVQXHu0++xUYs9DR632ovpptXtdHKRbTfvs7ox1H1xJyvJfVD fhVmcl0JllTlVIjRUatOFYSDazHHXSAo14B6oMwgNoh7kXDyk04j86h7ZWo+YfgA h0/Q9XVVqLbjF27vKzNePZwpbFyyqcqM0zsotliY5Cx+Sidig5P25P2oqEeOOL3V llHZQz+5qgYQrIkajfM/5dDpeD2YeZwja2NJu8d1UoParoFTB+zranFQdwKUmWTI sooe98fJKUwFYdzWl+MX+CE3pJ9cXuCycS7hfiaIZoJnY9ZLg+rCAH7CdD2sER7f S7WS2oh7A9VvXgeB08ayeTWFLoPfwpkLFJQVf7w31OgEhYTxGv4JxlvQgHbj8Dtb BbPB9lvtdFOnUwWik+E+O0rVylOBgHciTAKbNDRZvvIxWT2jMdqrHMmnHetdIi90 QSP4fkrcxX4p/XCXOd+yFVfj/shaW84wSsgARWca26xlX4SQELy1LHPPgsuzB/0R CWyvGODbE9na2j7Bia3Oz6+mt/p6cVRDbHcYMZLHqZRDMnNmTQ8= =tlG5 -----END PGP SIGNATURE----- Merge tag 'soc-defconfig-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull SoC defconfig updates from Arnd Bergmann: "These are the usual updates, enabling mode newly merged device drivers for various Arm and RISC-V based platforms in the defconfig files. The Renesas and NXP defconfig files also get a refresh for modified Kconfig options" * tag 'soc-defconfig-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: riscv: defconfig: spacemit: k3: enable clock support ARM: defconfig: turn off CONFIG_EXPERT ARM: defconfig: move entries arm64: defconfig: Enable configurations for Kontron SMARC-sAM67 ARM: imx_v4_v5_defconfig: update for v6.19-rc1 arm64: defconfig: Enable Apple Silicon drivers arm64: select APPLE_PMGR_PWRSTATE for ARCH_APPLE arm64: defconfig: Enable Mediatek HDMIv2 driver ARM: shmobile: defconfig: Refresh for v6.19-rc1 arm64: defconfig: Enable PCIe for the Renesas RZ/G3S SoC arm64: defconfig: Enable RZ/G3E USB3 PHY driver arm64: defconfig: Enable EC drivers for Qualcomm-based laptops arm64: defconfig: Enable options for Qualcomm Milos SoC ARM: imx_v6_v7_defconfig: enable EPD regulator needed for Kobo Clara 2e ARM: imx_v6_v7_defconfig: Configure CONFIG_SND_SOC_FSL_ASOC_CARD as module ARM: multi_v7_defconfig: enable DA9052 and MC13XXX arm64: defconfig: enable clocks, interconnect and pinctrl for Qualcomm Kaanapali arm64: defconfig: Drop duplicate CONFIG_OMAP_USB2 entry arm64: defconfig: Enable missing AMD/Xilinx drivers |
||
|
|
57cb845067 |
- A nice cleanup to the paravirt code containing a unification of the paravirt
clock interface, taming the include hell by splitting the pv_ops structure and removing of a bunch of obsolete code. Work by Juergen Gross. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmmLKHAACgkQEsHwGGHe VUrURg//Ucf+3EAIkLCmFkH0WwYmQl2JjRYww8bPAw3iJMIVxy4dMnaBbsUiAtUp kYza+pgEtvyAwwd8RIEs85c9VhZn0DKoaWV8goBH3zFH6YvIRiLwb0w2QvjkF+70 FNU+4zlvt/I3FD+tWNElAgVtkFL3Gmzm44qyLLsPtlYaJ71xFl2XB7V+TlqXMHzE m8BMenP9/CrbTlBBdNJGzAkAbWi1uAP+IydvuFNolH/F2lqVM2z5Ta3gUWWCIk/q jWrPLDZCHr2WlBZNUGamKVVH9NEh+7YNwBAGUrSNYGZFoaFjqeX6lN3djzS+wXIj 0nDoW35jN0QNKz239MdXZDf1mfpb6ZQd/iOhFjo4dAvbm+J8WPAMr98ac8wR3Dyb 2LF/BxkoKWRabxQApXSCrLPXEuqT6Qc1+lDA0bNHg51zBoqP5vRNVZRwArnzGB+O LxDKx+o4VYOf+UCaB6oQHjylbSgFvIedZ9p822hBe3QG9act8indRE8LWip7Utld peoJGgvlQ0xtClh6FjVHpvmVfAvk7Zki5ywj2GwmB/TZ0yywuGStAjE3UqY168/M gb7MSajh+HHZNj1/2+b/se4CUYlAgIPDQ+SwHJPm5TqyopvnOVi/2XWmjbx8I5jT jS0nxaxD+SbESSZ6IMAsppnAAxAYbvRHGIS+6mtNCXVkaV1pMbA= =AeFt -----END PGP SIGNATURE----- Merge tag 'x86_paravirt_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirt updates from Borislav Petkov: - A nice cleanup to the paravirt code containing a unification of the paravirt clock interface, taming the include hell by splitting the pv_ops structure and removing of a bunch of obsolete code (Juergen Gross) * tag 'x86_paravirt_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/paravirt: Use XOR r32,r32 to clear register in pv_vcpu_is_preempted() x86/paravirt: Remove trailing semicolons from alternative asm templates x86/pvlocks: Move paravirt spinlock functions into own header x86/paravirt: Specify pv_ops array in paravirt macros x86/paravirt: Allow pv-calls outside paravirt.h objtool: Allow multiple pv_ops arrays x86/xen: Drop xen_mmu_ops x86/xen: Drop xen_cpu_ops x86/xen: Drop xen_irq_ops x86/paravirt: Move pv_native_*() prototypes to paravirt.c x86/paravirt: Introduce new paravirt-base.h header x86/paravirt: Move paravirt_sched_clock() related code into tsc.c x86/paravirt: Use common code for paravirt_steal_clock() riscv/paravirt: Use common code for paravirt_steal_clock() loongarch/paravirt: Use common code for paravirt_steal_clock() arm64/paravirt: Use common code for paravirt_steal_clock() arm/paravirt: Use common code for paravirt_steal_clock() sched: Move clock related paravirt code to kernel/sched paravirt: Remove asm/paravirt_api_clock.h x86/paravirt: Move thunk macros to paravirt_types.h ... |
||
|
|
4d84667627 |
Performance events changes for v7.0:
x86 PMU driver updates:
- Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs.
Compared to previous iterations of the Intel PMU code, there's
been a lot of changes, which center around three main areas:
- Introduce the OFF-MODULE RESPONSE (OMR) facility to
replace the Off-Core Response (OCR) facility
- New PEBS data source encoding layout
- Support the new "RDPMC user disable" feature
(Dapeng Mi)
- Likewise, a large series adds uncore PMU support for
Intel Diamond Rapids (DMR) CPUs, which center around these
four main areas:
- DMR may have two Integrated I/O and Memory Hub (IMH) dies,
separate from the compute tile (CBB) dies. Each CBB and
each IMH die has its own discovery domain.
- Unlike prior CPUs that retrieve the global discovery table
portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON
discovery and MSR for CBB PMON discovery.
- DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA,
UBR, PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6.
- IIO free-running counters in DMR are MMIO-based, unlike SPR.
(Zide Chen)
- Also add support for Add missing PMON units for Intel Panther Lake,
and support Nova Lake (NVL), which largely maps to Panther Lake.
(Zide Chen)
- KVM integration: Add support for mediated vPMUs (by Kan Liang
and Sean Christopherson, with fixes and cleanups by Peter Zijlstra,
Sandipan Das and Mingwei Zhang)
- Add Intel cstate driver to support for Wildcat Lake (WCL)
CPUs, which are a low-power variant of Panther Lake.
(Zide Chen)
- Add core, cstate and MSR PMU support for the Airmont NP Intel CPU
(aka MaxLinear Lightning Mountain), which maps to the existing
Airmont code. (Martin Schiller)
Performance enhancements:
- core: Speed up kexec shutdown by avoiding unnecessary
cross CPU calls. (Jan H. Schönherr)
- core: Fix slow perf_event_task_exit() with LBR callstacks
(Namhyung Kim)
User-space stack unwinding support:
- Various cleanups and refactorings in preparation to generalize
the unwinding code for other architectures. (Jens Remus)
Uprobes updates:
- Transition from kmap_atomic to kmap_local_page (Keke Ming)
- Fix incorrect lockdep condition in filter_chain() (Breno Leitao)
- Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov)
Misc fixes and cleanups:
- s390: Remove kvm_types.h from Kbuild (Randy Dunlap)
- x86/intel/uncore: Convert comma to semicolon (Chen Ni)
- x86/uncore: Clean up const mismatch (Greg Kroah-Hartman)
- x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmmJhTURHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1i/qw/9F/sjMqbxH8d3kPXB8wk2eUuSkynQ2aNw
Zec9qtfCC5N1U9b2D7ywGJRscTmWYnX/3BKTyzFuyA6SDz6buAgDDIGPlHi+9Fww
+RUUS3lQ7N3pVWZ4Ifu3kbh3Vz4lkQuOXhfcjiyIMS6QIxfrcSLFoKHK+2V6PeU+
x0k+THHz/Ymg+DIpqSjqil1yrKaUmU9xRrbnyy6zJB1duREQrkYBhIWL1+bcd7SA
89RVAGXQ+sWzVQMPaKrMkZj6GavOCB7zseigiiwjBRLznukS2OulDDe8zR6pCJZp
wbdc7TR/nCm+QtNfkHlOmTQvsPAXiXNyXe5Vi8aFjGc0uMGhHaeiL9ah/bwsKA5m
Bm5Y7oVSmBlCJbcr/CTrGYkb+WwvLIgPCwVkn4FYPlsWv+U92qTOx9q7qKmIDFaj
1oUXCwoHbYrYnoZqZqPp2h689m0Lh/lsGhy0QRt8aGnKu0SDjMaqRGbYWH0UI0kA
aDnZstVHG76RnVi0143q8HcvvjNZb82NL0cS749tY/YcwH4kUEGj2XuSK2Ar887T
H0oDJHXijMlGXWqO5bK3WMoCQajR7nRyqBo7/rKYj20OjXmwoXS3vel77/s8WFo2
fUUC469MacDzyxdBNutnkJvvcvsUio3r4MWFsEEWQk2nUE58PtN8YM8j/FdNYpql
zAZ4Jx/A0RM=
=4SRB
-----END PGP SIGNATURE-----
Merge tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull performance event updates from Ingo Molnar:
"x86 PMU driver updates:
- Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs
(Dapeng Mi)
Compared to previous iterations of the Intel PMU code, there's been
a lot of changes, which center around three main areas:
- Introduce the OFF-MODULE RESPONSE (OMR) facility to replace the
Off-Core Response (OCR) facility
- New PEBS data source encoding layout
- Support the new "RDPMC user disable" feature
- Likewise, a large series adds uncore PMU support for Intel Diamond
Rapids (DMR) CPUs (Zide Chen)
This centers around these four main areas:
- DMR may have two Integrated I/O and Memory Hub (IMH) dies,
separate from the compute tile (CBB) dies. Each CBB and each IMH
die has its own discovery domain.
- Unlike prior CPUs that retrieve the global discovery table
portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON
discovery and MSR for CBB PMON discovery.
- DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA, UBR,
PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6.
- IIO free-running counters in DMR are MMIO-based, unlike SPR.
- Also add support for Add missing PMON units for Intel Panther Lake,
and support Nova Lake (NVL), which largely maps to Panther Lake.
(Zide Chen)
- KVM integration: Add support for mediated vPMUs (by Kan Liang and
Sean Christopherson, with fixes and cleanups by Peter Zijlstra,
Sandipan Das and Mingwei Zhang)
- Add Intel cstate driver to support for Wildcat Lake (WCL) CPUs,
which are a low-power variant of Panther Lake (Zide Chen)
- Add core, cstate and MSR PMU support for the Airmont NP Intel CPU
(aka MaxLinear Lightning Mountain), which maps to the existing
Airmont code (Martin Schiller)
Performance enhancements:
- Speed up kexec shutdown by avoiding unnecessary cross CPU calls
(Jan H. Schönherr)
- Fix slow perf_event_task_exit() with LBR callstacks (Namhyung Kim)
User-space stack unwinding support:
- Various cleanups and refactorings in preparation to generalize the
unwinding code for other architectures (Jens Remus)
Uprobes updates:
- Transition from kmap_atomic to kmap_local_page (Keke Ming)
- Fix incorrect lockdep condition in filter_chain() (Breno Leitao)
- Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov)
Misc fixes and cleanups:
- s390: Remove kvm_types.h from Kbuild (Randy Dunlap)
- x86/intel/uncore: Convert comma to semicolon (Chen Ni)
- x86/uncore: Clean up const mismatch (Greg Kroah-Hartman)
- x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi)"
* tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
s390: remove kvm_types.h from Kbuild
uprobes: Fix incorrect lockdep condition in filter_chain()
x86/ibs: Fix typo in dc_l2tlb_miss comment
x86/uprobes: Fix XOL allocation failure for 32-bit tasks
perf/x86/intel/uncore: Convert comma to semicolon
perf/x86/intel: Add support for rdpmc user disable feature
perf/x86: Use macros to replace magic numbers in attr_rdpmc
perf/x86/intel: Add core PMU support for Novalake
perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL
perf/x86/intel: Add core PMU support for DMR
perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL
perf/core: Fix slow perf_event_task_exit() with LBR callstacks
perf/core: Speed up kexec shutdown by avoiding unnecessary cross CPU calls
uprobes: use kmap_local_page() for temporary page mappings
arm/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
mips/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
arm64/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
riscv/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
perf/x86/intel/uncore: Add Nova Lake support
...
|
||
|
|
13d83ea9d8 |
Crypto library updates for 7.0
- Add support for verifying ML-DSA signatures.
ML-DSA (Module-Lattice-Based Digital Signature Algorithm) is a
recently-standardized post-quantum (quantum-resistant) signature
algorithm. It was known as Dilithium pre-standardization.
The first use case in the kernel will be module signing. But there
are also other users of RSA and ECDSA signatures in the kernel that
might want to upgrade to ML-DSA eventually.
- Improve the AES library:
- Make the AES key expansion and single block encryption and
decryption functions use the architecture-optimized AES code.
Enable these optimizations by default.
- Support preparing an AES key for encryption-only, using about
half as much memory as a bidirectional key.
- Replace the existing two generic implementations of AES with a
single one.
- Simplify how Adiantum message hashing is implemented. Remove the
"nhpoly1305" crypto_shash in favor of direct lib/crypto/ support for
NH hashing, and enable optimizations by default.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaYlV8xQcZWJpZ2dlcnNA
a2VybmVsLm9yZwAKCRDzXCl4vpKOK1ffAQCbM+cnqF4ThspBCgLZGSScx02KsA4U
dQblKoOFyIEbnwEA1ElJNhNQs2m7AT+R0hOh6yI+5+ttUfqLMT9tuNs2mwM=
=iZ06
-----END PGP SIGNATURE-----
Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux
Pull crypto library updates from Eric Biggers:
- Add support for verifying ML-DSA signatures.
ML-DSA (Module-Lattice-Based Digital Signature Algorithm) is a
recently-standardized post-quantum (quantum-resistant) signature
algorithm. It was known as Dilithium pre-standardization.
The first use case in the kernel will be module signing. But there
are also other users of RSA and ECDSA signatures in the kernel that
might want to upgrade to ML-DSA eventually.
- Improve the AES library:
- Make the AES key expansion and single block encryption and
decryption functions use the architecture-optimized AES code.
Enable these optimizations by default.
- Support preparing an AES key for encryption-only, using about
half as much memory as a bidirectional key.
- Replace the existing two generic implementations of AES with a
single one.
- Simplify how Adiantum message hashing is implemented. Remove the
"nhpoly1305" crypto_shash in favor of direct lib/crypto/ support for
NH hashing, and enable optimizations by default.
* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (53 commits)
lib/crypto: mldsa: Clarify the documentation for mldsa_verify() slightly
lib/crypto: aes: Drop 'volatile' from aes_sbox and aes_inv_sbox
lib/crypto: aes: Remove old AES en/decryption functions
lib/crypto: aesgcm: Use new AES library API
lib/crypto: aescfb: Use new AES library API
crypto: omap - Use new AES library API
crypto: inside-secure - Use new AES library API
crypto: drbg - Use new AES library API
crypto: crypto4xx - Use new AES library API
crypto: chelsio - Use new AES library API
crypto: ccp - Use new AES library API
crypto: x86/aes-gcm - Use new AES library API
crypto: arm64/ghash - Use new AES library API
crypto: arm/ghash - Use new AES library API
staging: rtl8723bs: core: Use new AES library API
net: phy: mscc: macsec: Use new AES library API
chelsio: Use new AES library API
Bluetooth: SMP: Use new AES library API
crypto: x86/aes - Remove the superseded AES-NI crypto_cipher
lib/crypto: x86/aes: Add AES-NI optimization
...
|
||
|
|
0c61526621 |
EFI updates for v7.0
- Quirk the broken EFI framebuffer geometry on the Valve Steam Deck - Capture the EDID information of the primary display also on non-x86 EFI systems when booting via the EFI stub. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCaYoTAgAKCRAwbglWLn0t XFe8AQDJe2GSNfzWgqoTqgT6tcH7lFG2SjdpIb+jHSmvgHckbAD/cUaY8YnhdYkm nz6URLJN/2NHuaDq1mUL8CwJwIot4wk= =41tn -----END PGP SIGNATURE----- Merge tag 'efi-next-for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - Quirk the broken EFI framebuffer geometry on the Valve Steam Deck - Capture the EDID information of the primary display also on non-x86 EFI systems when booting via the EFI stub. * tag 'efi-next-for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: Support EDID information sysfb: Move edid_info into sysfb_primary_display sysfb: Pass sysfb_primary_display to devices sysfb: Replace screen_info with sysfb_primary_display sysfb: Add struct sysfb_display_info efi: sysfb_efi: Reduce number of references to global screen_info efi: earlycon: Reduce number of references to global screen_info efi: sysfb_efi: Fix efidrmfb and simpledrmfb on Valve Steam Deck efi: sysfb_efi: Convert swap width and height quirk to a callback efi: sysfb_efi: Fix lfb_linelength calculation when applying quirks efi: sysfb_efi: Replace open coded swap with the macro |
||
|
|
18be4ca5cb |
riscv: lib: optimize strlen loop efficiency
Optimize the generic strlen implementation by using a pre-decrement
pointer. This reduces the loop body from 4 instructions to 3 and
eliminates the unconditional jump ('j').
Old loop (4 instructions, 2 branches):
1: lbu t0, 0(t1); beqz t0, 2f; addi t1, t1, 1; j 1b
New loop (3 instructions, 1 branch):
1: addi t1, t1, 1; lbu t0, 0(t1); bnez t0, 1b
This change improves execution efficiency and reduces branch pressure
for systems without the Zbb extension.
Signed-off-by: Feng Jiang <jiangfeng@kylinos.cn>
Link: https://patch.msgid.link/20251218032614.57356-1-jiangfeng@kylinos.cn
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|
||
|
|
f4be988f5b |
riscv: ptrace: validate input vector csr registers
Add strict validation for vector csr registers when setting them via ptrace: - reject attempts to set reserved bits or invalid field combinations - enforce strict VL checks against calculated VLMAX values Vector specs 0.7.1 and 1.0 allow normal applications to set candidate VL values and read back the hardware-adjusted results, see section 6 for details. Disallow such flexibility in vector ptrace operations and strictly enforce valid VL input. The traced process may not update its saved vector context if no vector instructions execute between breakpoints. So the purpose of the strict ptrace approach is to make sure that debuggers maintain an accurate view of the tracee's vector context across multiple halt/resume debug cycles. Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com> Reviewed-by: Andy Chiu <andybnac@gmail.com> Tested-by: Andy Chiu <andybnac@gmail.com> Link: https://patch.msgid.link/20251214163537.1054292-5-geomatsi@gmail.com Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
fd515e037e |
riscv: csr: define vtype register elements
Define masks and shifts for vtype CSR according to the vector specs: - v0.7.1 used in early T-Head cores, known as xtheadvector in the kernel - v1.0 Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com> Reviewed-by: Andy Chiu <andybnac@gmail.com> Tested-by: Andy Chiu <andybnac@gmail.com> Link: https://patch.msgid.link/20251214163537.1054292-4-geomatsi@gmail.com Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
ef3ff40346 |
riscv: vector: init vector context with proper vlenb
The vstate in thread_struct is zeroed when the vector context is initialized. That includes read-only register vlenb, which holds the vector register length in bytes. Zeroed state persists until mstatus.VS becomes 'dirty' and a context switch saves the actual hardware values. This can expose the zero vlenb value to the user-space in early debug scenarios, e.g. when ptrace attaches to a traced process early, before any vector instruction except the first one was executed. Fix this by specifying proper vlenb on vector context init. Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com> Reviewed-by: Andy Chiu <andybnac@gmail.com> Tested-by: Andy Chiu <andybnac@gmail.com> Link: https://patch.msgid.link/20251214163537.1054292-3-geomatsi@gmail.com Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
376e2f8cca |
irqchip/riscv-imsic: Adjust the number of available guest irq files
Currently, KVM assumes the minimum of implemented HGEIE bits and "BIT(gc->guest_index_bits) - 1" as the number of guest files available across all CPUs. This will not work when CPUs have different number of guest files because KVM may incorrectly allocate a guest file on a CPU with fewer guest files. To address above, during initialization, calculate the number of available guest interrupt files according to MMIO resources and constrain the number of guest interrupt files that can be allocated by KVM. Signed-off-by: Xu Lu <luxu.kernel@bytedance.com> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Reviewed-by: Anup Patel <anup@brainfault.org> Acked-by: Thomas Gleixner <tglx@kernel.org> Link: https://lore.kernel.org/r/20260104133457.57742-1-luxu.kernel@bytedance.com Signed-off-by: Anup Patel <anup@brainfault.org> |
||
|
|
ed7ae7a34b |
RISC-V: KVM: Transparent huge page support
Use block mapping if backed by a THP, as implemented in architectures like ARM and x86_64. Signed-off-by: Jessica Liu <liu.xuemei1@zte.com.cn> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20251127165137780QbUOVPKPAfWSGAFl5qtRy@zte.com.cn Signed-off-by: Anup Patel <anup@brainfault.org> |
||
|
|
655d330c05 |
RISC-V: KVM: Allow Zalasr extensions for Guest/VM
Extend the KVM ISA extension ONE_REG interface to allow KVM user space to detect and enable Zalasr extensions for Guest/VM. Signed-off-by: Xu Lu <luxu.kernel@bytedance.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20251020042457.30915-5-luxu.kernel@bytedance.com Signed-off-by: Anup Patel <anup@brainfault.org> |
||
|
|
f326e846ff |
riscv: KVM: allow Zilsd and Zclsd extensions for Guest/VM
Extend the KVM ISA extension ONE_REG interface to allow KVM user space to detect and enable Zilsd and Zclsd extensions for Guest/VM. Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20250826162939.1494021-5-pincheng.plct@isrc.iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org> |
||
|
|
003b9dae53 |
RISC-V: KVM: Skip IMSIC update if vCPU IMSIC state is not initialized
kvm_riscv_vcpu_aia_imsic_update() assumes that the vCPU IMSIC state has
already been initialized and unconditionally accesses imsic->vsfile_lock.
However, in fuzzed ioctl sequences, the AIA device may be initialized at
the VM level while the per-vCPU IMSIC state is still NULL.
This leads to invalid access when entering the vCPU run loop before
IMSIC initialization has completed.
The crash manifests as:
Unable to handle kernel paging request at virtual address
dfffffff00000006
...
kvm_riscv_vcpu_aia_imsic_update arch/riscv/kvm/aia_imsic.c:801
kvm_riscv_vcpu_aia_update arch/riscv/kvm/aia_device.c:493
kvm_arch_vcpu_ioctl_run arch/riscv/kvm/vcpu.c:927
...
Add a guard to skip the IMSIC update path when imsic_state is NULL. This
allows the vCPU run loop to continue safely.
This issue was discovered during fuzzing of RISC-V KVM code.
Fixes:
|
||
|
|
aeb1d17d1a |
RISC-V: KVM: Fix null pointer dereference in kvm_riscv_aia_imsic_rw_attr()
Add a null pointer check for imsic_state before dereferencing it in
kvm_riscv_aia_imsic_rw_attr(). While the function checks that the
vcpu exists, it doesn't verify that the vcpu's imsic_state has been
initialized, leading to a null pointer dereference when accessed.
The crash manifests as:
Unable to handle kernel paging request at virtual address
dfffffff00000006
...
kvm_riscv_aia_imsic_rw_attr+0x2d8/0x854 arch/riscv/kvm/aia_imsic.c:958
aia_set_attr+0x2ee/0x1726 arch/riscv/kvm/aia_device.c:354
kvm_device_ioctl_attr virt/kvm/kvm_main.c:4744 [inline]
kvm_device_ioctl+0x296/0x374 virt/kvm/kvm_main.c:4761
vfs_ioctl fs/ioctl.c:51 [inline]
...
The fix adds a check to return -ENODEV if imsic_state is NULL and moves
isel assignment after imsic_state NULL check.
Fixes:
|
||
|
|
11366ead4f |
RISC-V: KVM: Fix null pointer dereference in kvm_riscv_aia_imsic_has_attr()
Add a null pointer check for imsic_state before dereferencing it in
kvm_riscv_aia_imsic_has_attr(). While the function checks that the
vcpu exists, it doesn't verify that the vcpu's imsic_state has been
initialized, leading to a null pointer dereference when accessed.
This issue was discovered during fuzzing of RISC-V KVM code. The
crash occurs when userspace calls KVM_HAS_DEVICE_ATTR ioctl on an
AIA IMSIC device before the IMSIC state has been fully initialized
for a vcpu.
The crash manifests as:
Unable to handle kernel paging request at virtual address
dfffffff00000001
...
epc : kvm_riscv_aia_imsic_has_attr+0x464/0x50e
arch/riscv/kvm/aia_imsic.c:998
...
kvm_riscv_aia_imsic_has_attr+0x464/0x50e arch/riscv/kvm/aia_imsic.c:998
aia_has_attr+0x128/0x2bc arch/riscv/kvm/aia_device.c:471
kvm_device_ioctl_attr virt/kvm/kvm_main.c:4722 [inline]
kvm_device_ioctl+0x296/0x374 virt/kvm/kvm_main.c:4739
...
The fix adds a check to return -ENODEV if imsic_state is NULL, which
is consistent with other error handling in the function and prevents
the null pointer dereference.
Fixes:
|
||
|
|
1c0180cd42 |
RISC-V: KVM: Remove unnecessary 'ret' assignment
If execution reaches "ret = 0" assignment in
kvm_riscv_vcpu_pmu_event_info() then it means
kvm_vcpu_write_guest() returned 0 hence ret is
already zero and does not need to be assigned 0.
Fixes:
|
||
|
|
8cdb04bd06 |
riscv: ptrace: return ENODATA for inactive vector extension
Currently, ptrace returns EINVAL when the vector extension is supported
but not yet activated for the traced process. This error code is not
always appropriate since the ptrace arguments may be valid.
Debug tools like gdbserver expect ENODATA when the requested register
set is not active, e.g. see [1]. This expectation seems to be more
appropriate, so modify the vector ptrace implementation to return:
- EINVAL when V extension is not supported
- ENODATA when V extension is supported but not active
[1]
|
||
|
|
22c1e263af |
riscv: create a Kconfig fragment for shadow stack and landing pad support
This patch creates a Kconfig fragment for shadow stack support and landing pad instruction support. Shadow stack support and landing pad instruction support can be enabled by selecting 'CONFIG_RISCV_USER_CFI'. Selecting 'CONFIG_RISCV_USER_CFI' wires up the path to enumerate CPU support. If support exists, the kernel will support CPU-assisted user mode CFI. If CONFIG_RISCV_USER_CFI is selected, select 'ARCH_USES_HIGH_VMA_FLAGS', 'ARCH_HAS_USER_SHADOW_STACK' and 'DYNAMIC_SIGFRAME' for riscv. Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-25-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description, Kconfig text; added CONFIG_MMU exclusion] Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
ccad8c1336 |
arch/riscv: add dual vdso creation logic and select vdso based on hw
Shadow stack instructions are taken from the Zimop ISA extension, which is mandated on RVA23. Any userspace with shadow stack instructions in it will fault on hardware that doesn't have support for Zimop. Thus, a shadow stack-enabled userspace can't be run on hardware that doesn't support Zimop. It's not known how Linux userspace providers will respond to this kind of binary fragmentation. In order to keep kernel portable across different hardware, 'arch/riscv/kernel/vdso_cfi' is created which has Makefile logic to compile 'arch/riscv/kernel/vdso' sources with CFI flags, and 'arch/riscv/kernel/vdso.c' is modified to select the appropriate vdso depending on whether the underlying CPU implements the Zimop extension. Since the offset of vdso symbols will change due to having two different vdso binaries, there is added logic to include a new generated vdso offset header and dynamically select the offset (like for rt_sigreturn). Signed-off-by: Deepak Gupta <debug@rivosinc.com> Acked-by: Charles Mirabile <cmirabil@redhat.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-24-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
37f57bd3fa |
arch/riscv: compile vdso with landing pad and shadow stack note
User mode tasks compiled with Zicfilp may call indirectly into the vdso (like hwprobe indirect calls). Add support for compiling landing pads into the vdso. Landing pad instructions in the vdso will be no-ops for tasks which have not enabled landing pads. Furthermore, add support for the C sources of the vdso to be compiled with shadow stack and landing pads enabled as well. Landing pad and shadow stack instructions are emitted only when the VDSO_CFI cflags option is defined during compile. Signed-off-by: Jim Shu <jim.shu@sifive.com> Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-23-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description, issues reported by checkpatch] Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
41213bf2ae |
riscv: enable kernel access to shadow stack memory via the FWFT SBI call
The kernel has to perform shadow stack operations on the user shadow stack. During signal delivery and sigreturn, the shadow stack token must be created and validated respectively. Thus shadow stack access for the kernel must be enabled. In the future, when kernel shadow stacks are enabled, they must be enabled as early as possible for better coverage and to prevent any imbalance between the regular stack and the shadow stack. After 'relocate_enable_mmu' has completed, this is the earliest that it can be enabled. Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-22-b55691eacf4f@rivosinc.com [pjw@kernel.org: updated to apply; cleaned up commit message] Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
c9b859c4d8 |
riscv: add kernel command line option to opt out of user CFI
Add a kernel command line option to disable part or all of user CFI. User backward CFI and forward CFI can be controlled independently. The kernel command line parameter "riscv_nousercfi" can take the following values: - "all" : Disable forward and backward cfi both - "bcfi" : Disable backward cfi - "fcfi" : Disable forward cfi Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-21-b55691eacf4f@rivosinc.com [pjw@kernel.org: fixed warnings from checkpatch; cleaned up patch description, doc, printk text] Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
30c3099036 |
riscv/hwprobe: add zicfilp / zicfiss enumeration in hwprobe
Add enumeration of the zicfilp and zicfiss extensions in the hwprobe syscall. Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-20-b55691eacf4f@rivosinc.com [pjw@kernel.org: updated to apply; extend into RISCV_HWPROBE_KEY_IMA_EXT_1; clean patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
462a94fb8a |
riscv: hwprobe: add support for RISCV_HWPROBE_KEY_IMA_EXT_1
We've run out of bits to describe RISC-V ISA extensions in our initial hwprobe key, RISCV_HWPROBE_KEY_IMA_EXT_0. So, let's add RISCV_HWPROBE_KEY_IMA_EXT_1, along with the framework to set the appropriate hwprobe tuple, and add testing for it. Based on a suggestion from Andrew Jones <andrew.jones@oss.qualcomm.com>, also fix the documentation for RISCV_HWPROBE_KEY_IMA_EXT_0. Reviewed-by: Andrew Jones <andrew.jones@oss.qualcomm.com> Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
2af7c9cf02 |
riscv/ptrace: expose riscv CFI status and state via ptrace and in core files
Expose a new register type NT_RISCV_USER_CFI for risc-v CFI status and state. Intentionally, both landing pad and shadow stack status and state are rolled into the CFI state. Creating two different NT_RISCV_USER_XXX would not be useful and would waste a note type. Enabling, disabling and locking the CFI feature is not allowed via ptrace set interface. However, setting 'elp' state or setting shadow stack pointer are allowed via the ptrace set interface. It is expected that 'gdb' might need to fixup 'elp' state or 'shadow stack' pointer. Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-19-b55691eacf4f@rivosinc.com [pjw@kernel.org: updated to apply; cleaned patch description and comments; addressed checkpatch issues] Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
9d0e75e25e |
riscv/kernel: update __show_regs() to print shadow stack register
Update __show_regs() to print the captured shadow stack pointer. On tasks where shadow stack is disabled, simply print 0. Signed-off-by: Deepak Gupta <debug@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-18-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
66c9c713de |
riscv/signal: save and restore the shadow stack on a signal
Save the shadow stack pointer in the sigcontext structure when delivering a signal. Restore the shadow stack pointer from sigcontext on sigreturn. As part of the save operation, the kernel uses the 'ssamoswap' instruction to save a snapshot of the current shadow stack on the shadow stack itself (this can be called a "save token"). During restore on sigreturn, the kernel retrieves the save token from the top of the shadow stack and validates it. This ensures that user mode can't arbitrarily pivot to any shadow stack address without having a token and thus provides a strong security assurance during the window between signal delivery and sigreturn. Use an ABI-compatible way of saving/restoring the shadow stack pointer into the signal stack. This follows the vector extension, where extra registers are placed in a form of extension header + extension body in the stack. The extension header indicates the size of the extra architectural states plus the size of header itself, and a magic identifier for the extension. Then, the extension body contains the new architectural states in the form defined by uapi. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-17-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned patch description, code comments; resolved checkpatch warning] Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
9d42fc28fc |
riscv/traps: Introduce software check exception and uprobe handling
The Zicfiss and Zicfilp extensions introduce a new exception, the 'software check exception', in the privileged ISA, with cause code = 18. This patch implements support for software check exceptions. Additionally, the patch implements a CFI violation handler which checks the code in the xtval register. If xtval=2, the software check exception happened because of an indirect branch that didn't land on a 4 byte aligned PC or on a 'lpad' instruction, or the label value embedded in 'lpad' didn't match the label value set in the x7 register. If xtval=3, the software check exception happened due to a mismatch between the link register (x1 or x5) and the top of shadow stack (on execution of `sspopchk`). In case of a CFI violation, SIGSEGV is raised with code=SEGV_CPERR. SEGV_CPERR was introduced by the x86 shadow stack patches. To keep uprobes working, handle the uprobe event first before reporting the CFI violation in the software check exception handler. This is because, when the landing pad is activated, if the uprobe point is set at the lpad instruction at the beginning of a function, the system triggers a software check exception instead of an ebreak exception due to the exception priority. This would prevent uprobe from working. Reviewed-by: Zong Li <zong.li@sifive.com> Co-developed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-15-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up the patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
8a9e22d2ca |
riscv: Implement indirect branch tracking prctls
This patch adds a RISC-V implementation of the following prctls: PR_SET_INDIR_BR_LP_STATUS, PR_GET_INDIR_BR_LP_STATUS and PR_LOCK_INDIR_BR_LP_STATUS. Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-14-b55691eacf4f@rivosinc.com [pjw@kernel.org: clean up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
61a0200211 |
riscv: Implement arch-agnostic shadow stack prctls
Implement an architecture-agnostic prctl() interface for setting and getting shadow stack status. The prctls implemented are PR_GET_SHADOW_STACK_STATUS, PR_SET_SHADOW_STACK_STATUS and PR_LOCK_SHADOW_STACK_STATUS. As part of PR_SET_SHADOW_STACK_STATUS/PR_GET_SHADOW_STACK_STATUS, only PR_SHADOW_STACK_ENABLE is implemented because RISCV allows each mode to write to their own shadow stack using 'sspush' or 'ssamoswap'. PR_LOCK_SHADOW_STACK_STATUS locks the current shadow stack enablement configuration. Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-12-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
fd44a4a855 |
riscv/shstk: If needed allocate a new shadow stack on clone
Userspace specifies CLONE_VM to share address space and spawn new thread. 'clone' allows userspace to specify a new stack for a new thread. However there is no way to specify a new shadow stack base address without changing the API. This patch allocates a new shadow stack whenever CLONE_VM is given. In case of CLONE_VFORK, the parent is suspended until the child finishes; thus the child can use the parent's shadow stack. In case of !CLONE_VM, COW kicks in because entire address space is copied from parent to child. 'clone3' is extensible and can provide mechanisms for specifying the shadow stack as an input parameter. This is not settled yet and is being extensively discussed on the mailing list. Once that's settled, this code should be adapted. Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-11-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org> |
||
|
|
c70772afd5 |
riscv/mm: Implement map_shadow_stack() syscall
As discussed extensively in the changelog for the addition of this
syscall on x86 ("x86/shstk: Introduce map_shadow_stack syscall") the
existing mmap() and madvise() syscalls do not map entirely well onto the
security requirements for shadow stack memory since they lead to windows
where memory is allocated but not yet protected or stacks which are not
properly and safely initialised. Instead a new syscall map_shadow_stack()
has been defined which allocates and initialises a shadow stack page.
This patch implements this syscall for riscv. riscv doesn't require
tokens to be setup by kernel because user mode can do that by
itself. However to provide compatibility and portability with other
architectues, user mode can specify token set flag.
Signed-off-by: Deepak Gupta <debug@rivosinc.com>
Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-10-b55691eacf4f@rivosinc.com
Link: https://lore.kernel.org/linux-riscv/aXfRPJvoSsOW8AwM@debug.ba.rivosinc.com/
[pjw@kernel.org: added allocate_shadow_stack() fix per Deepak; fixed bug found by sparse]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
|