linux

mirror of https://github.com/torvalds/linux.git synced 2026-03-08 00:44:31 +01:00

Author	SHA1	Message	Date
Linus Torvalds	c7decec2f2	perf tools changes for v7.0: - Introduce 'perf sched stats' tool with record/report/diff workflows using schedstat counters. - Add a faster libdw based addr2line implementation and allow selecting it or its alternatives via 'perf config addr2line.style='. - Data-type profiling fixes and improvements including the ability to select fields using 'perf report''s -F/-fields, e.g.: 'perf report --fields overhead,type' - Add 'perf test' regression tests for Data-type profiling with C and Rust workloads. - Fix srcline printing with inlines in callchains, make sure this has coverage in 'perf test'. - Fix printing of leaf IP in LBR callchains. - Fix display of metrics without sufficient permission in 'perf stat'. - Print all machines in 'perf kvm report -vvv', not just the host. - Switch from SHA-1 to BLAKE2s for build ID generation, remove SHA-1 code. - Fix 'perf report's histogram entry collapsing with '-F' option. - Use system's cacheline size instead of a hardcoded value in 'perf report'. - Allow filtering conversion by time range in 'perf data'. - Cover conversion to CTF using 'perf data' in 'perf test'. - Address newer glibc const-correctness (-Werror=discarded-qualifiers) issues. - Fixes and improvements for ARM's CoreSight support, simplify ARM SPE event config in 'perf mem', update docs for 'perf c2c' including the ARM events it can be used with. - Build support for generating metrics from arch specific python script, add extra AMD, Intel, ARM64 metrics using it. - Add AMD Zen 6 events and metrics. - Add JSON file with OpenHW Risc-V CVA6 hardware counters. - Add 'perf kvm' stats live testing. - Add more 'perf stat' tests to 'perf test'. - Fix segfault in `perf lock contention -b/--use-bpf` - Fix various 'perf test' cases for s390. - Build system cleanups, bump minimum shellcheck version to 0.7.2 - Support building the capstone based annotation routines as a plugin. - Allow passing extra Clang flags via EXTRA_BPF_FLAGS. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCaZn25QAKCRCyPKLppCJ+ J9MbAQCTKChBwDsqVh5iPr0UAc+mez9LOPJFa580SYH9nmd1jgD+Oqip7xCf514G ZtYPNh+Mz0se0Mcb++syLUEjxvbrQQY= =y2VY -----END PGP SIGNATURE----- Merge tag 'perf-tools-for-v7.0-1-2026-02-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Arnaldo Carvalho de Melo: - Introduce 'perf sched stats' tool with record/report/diff workflows using schedstat counters - Add a faster libdw based addr2line implementation and allow selecting it or its alternatives via 'perf config addr2line.style=' - Data-type profiling fixes and improvements including the ability to select fields using 'perf report''s -F/-fields, e.g.: 'perf report --fields overhead,type' - Add 'perf test' regression tests for Data-type profiling with C and Rust workloads - Fix srcline printing with inlines in callchains, make sure this has coverage in 'perf test' - Fix printing of leaf IP in LBR callchains - Fix display of metrics without sufficient permission in 'perf stat' - Print all machines in 'perf kvm report -vvv', not just the host - Switch from SHA-1 to BLAKE2s for build ID generation, remove SHA-1 code - Fix 'perf report's histogram entry collapsing with '-F' option - Use system's cacheline size instead of a hardcoded value in 'perf report' - Allow filtering conversion by time range in 'perf data' - Cover conversion to CTF using 'perf data' in 'perf test' - Address newer glibc const-correctness (-Werror=discarded-qualifiers) issues - Fixes and improvements for ARM's CoreSight support, simplify ARM SPE event config in 'perf mem', update docs for 'perf c2c' including the ARM events it can be used with - Build support for generating metrics from arch specific python script, add extra AMD, Intel, ARM64 metrics using it - Add AMD Zen 6 events and metrics - Add JSON file with OpenHW Risc-V CVA6 hardware counters - Add 'perf kvm' stats live testing - Add more 'perf stat' tests to 'perf test' - Fix segfault in `perf lock contention -b/--use-bpf` - Fix various 'perf test' cases for s390 - Build system cleanups, bump minimum shellcheck version to 0.7.2 - Support building the capstone based annotation routines as a plugin - Allow passing extra Clang flags via EXTRA_BPF_FLAGS * tag 'perf-tools-for-v7.0-1-2026-02-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (255 commits) perf test script: Add python script testing support perf test script: Add perl script testing support perf script: Allow the generated script to be a path perf test: perf data --to-ctf testing perf test: Test pipe mode with data conversion --to-json perf json: Pipe mode --to-ctf support perf json: Pipe mode --to-json support perf check: Add libbabeltrace to the listed features perf build: Allow passing extra Clang flags via EXTRA_BPF_FLAGS perf test data_type_profiling.sh: Skip just the Rust tests if code_with_type workload is missing tools build: Fix feature test for rust compiler perf libunwind: Fix calls to thread__e_machine() perf stat: Add no-affinity flag perf evlist: Reduce affinity use and move into iterator, fix no affinity perf evlist: Missing TPEBS close in evlist__close() perf evlist: Special map propagation for tool events that read on 1 CPU perf stat-shadow: In prepare_metric fix guard on reading NULL perf_stat_evsel Revert "perf tool_pmu: More accurately set the cpus for tool events" tools build: Emit dependencies file for test-rust.bin tools build: Make test-rust.bin be removed by the 'clean' target ...	2026-02-21 10:51:08 -08:00
Linus Torvalds	cb5573868e	Loongarch: - Add more CPUCFG mask bits. - Improve feature detection. - Add lazy load support for FPU and binary translation (LBT) register state. - Fix return value for memory reads from and writes to in-kernel devices. - Add support for detecting preemption from within a guest. - Add KVM steal time test case to tools/selftests. ARM: - Add support for FEAT_IDST, allowing ID registers that are not implemented to be reported as a normal trap rather than as an UNDEF exception. - Add sanitisation of the VTCR_EL2 register, fixing a number of UXN/PXN/XN bugs in the process. - Full handling of RESx bits, instead of only RES0, and resulting in SCTLR_EL2 being added to the list of sanitised registers. - More pKVM fixes for features that are not supposed to be exposed to guests. - Make sure that MTE being disabled on the pKVM host doesn't give it the ability to attack the hypervisor. - Allow pKVM's host stage-2 mappings to use the Force Write Back version of the memory attributes by using the "pass-through' encoding. - Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the guest. - Preliminary work for guest GICv5 support. - A bunch of debugfs fixes, removing pointless custom iterators stored in guest data structures. - A small set of FPSIMD cleanups. - Selftest fixes addressing the incorrect alignment of page allocation. - Other assorted low-impact fixes and spelling fixes. RISC-V: - Fixes for issues discoverd by KVM API fuzzing in kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(), and kvm_riscv_vcpu_aia_imsic_update() - Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM - Transparent huge page support for hypervisor page tables - Adjust the number of available guest irq files based on MMIO register sizes found in the device tree or the ACPI tables - Add RISC-V specific paging modes to KVM selftests - Detect paging mode at runtime for selftests s390: - Performance improvement for vSIE (aka nested virtualization) - Completely new memory management. s390 was a special snowflake that enlisted help from the architecture's page table management to build hypervisor page tables, in particular enabling sharing the last level of page tables. This however was a lot of code (~3K lines) in order to support KVM, and also blocked several features. The biggest advantages is that the page size of userspace is completely independent of the page size used by the guest: userspace can mix normal pages, THPs and hugetlbfs as it sees fit, and in fact transparent hugepages were not possible before. It's also now possible to have nested guests and guests with huge pages running on the same host. - Maintainership change for s390 vfio-pci - Small quality of life improvement for protected guests x86: - Add support for giving the guest full ownership of PMU hardware (contexted switched around the fastpath run loop) and allowing direct access to data MSRs and PMCs (restricted by the vPMU model). KVM still intercepts access to control registers, e.g. to enforce event filtering and to prevent the guest from profiling sensitive host state. This is more accurate, since it has no risk of contention and thus dropped events, and also has significantly less overhead. For more information, see the commit message for merge commit `bf2c3138ae` ("Merge tag 'kvm-x86-pmu-6.20' of https://github.com/kvm-x86/linux into HEAD"). - Disallow changing the virtual CPU model if L2 is active, for all the same reasons KVM disallows change the model after the first KVM_RUN. - Fix a bug where KVM would incorrectly reject host accesses to PV MSRs when running with KVM_CAP_ENFORCE_PV_FEATURE_CPUID enabled, even if those were advertised as supported to userspace, - Fix a bug with protected guest state (SEV-ES/SNP and TDX) VMs, where KVM would attempt to read CR3 configuring an async #PF entry. - Fail the build if EXPORT_SYMBOL_GPL or EXPORT_SYMBOL is used in KVM (for x86 only) to enforce usage of EXPORT_SYMBOL_FOR_KVM_INTERNAL. Only a few exports that are intended for external usage, and those are allowed explicitly. - When checking nested events after a vCPU is unblocked, ignore -EBUSY instead of WARNing. Userspace can sometimes put the vCPU into what should be an impossible state, and spurious exit to userspace on -EBUSY does not really do anything to solve the issue. - Also throw in the towel and drop the WARN on INIT/SIPI being blocked when vCPU is in Wait-For-SIPI, which also resulted in playing whack-a-mole with syzkaller stuffing architecturally impossible states into KVM. - Add support for new Intel instructions that don't require anything beyond enumerating feature flags to userspace. - Grab SRCU when reading PDPTRs in KVM_GET_SREGS2. - Add WARNs to guard against modifying KVM's CPU caps outside of the intended setup flow, as nested VMX in particular is sensitive to unexpected changes in KVM's golden configuration. - Add a quirk to allow userspace to opt-in to actually suppress EOI broadcasts when the suppression feature is enabled by the guest (currently limited to split IRQCHIP, i.e. userspace I/O APIC). Sadly, simply fixing KVM to honor Suppress EOI Broadcasts isn't an option as some userspaces have come to rely on KVM's buggy behavior (KVM advertises Supress EOI Broadcast irrespective of whether or not userspace I/O APIC supports Directed EOIs). - Clean up KVM's handling of marking mapped vCPU pages dirty. - Drop a pile of ancient sanity checks hidden behind in KVM's unused ASSERT() macro, most of which could be trivially triggered by the guest and/or user, and all of which were useless. - Fold "struct dest_map" into its sole user, "struct rtc_status", to make it more obvious what the weird parameter is used for, and to allow fropping these RTC shenanigans if CONFIG_KVM_IOAPIC=n. - Bury all of ioapic.h, i8254.h and related ioctls (including KVM_CREATE_IRQCHIP) behind CONFIG_KVM_IOAPIC=y. - Add a regression test for recent APICv update fixes. - Handle "hardware APIC ISR", a.k.a. SVI, updates in kvm_apic_update_apicv() to consolidate the updates, and to co-locate SVI updates with the updates for KVM's own cache of ISR information. - Drop a dead function declaration. - Minor cleanups. x86 (Intel): - Rework KVM's handling of VMCS updates while L2 is active to temporarily switch to vmcs01 instead of deferring the update until the next nested VM-Exit. The deferred updates approach directly contributed to several bugs, was proving to be a maintenance burden due to the difficulty in auditing the correctness of deferred updates, and was polluting "struct nested_vmx" with a growing pile of booleans. - Fix an SGX bug where KVM would incorrectly try to handle EPCM page faults, and instead always reflect them into the guest. Since KVM doesn't shadow EPCM entries, EPCM violations cannot be due to KVM interference and can't be resolved by KVM. - Fix a bug where KVM would register its posted interrupt wakeup handler even if loading kvm-intel.ko ultimately failed. - Disallow access to vmcb12 fields that aren't fully supported, mostly to avoid weirdness and complexity for FRED and other features, where KVM wants enable VMCS shadowing for fields that conditionally exist. - Print out the "bad" offsets and values if kvm-intel.ko refuses to load (or refuses to online a CPU) due to a VMCS config mismatch. x86 (AMD): - Drop a user-triggerable WARN on nested_svm_load_cr3() failure. - Add support for virtualizing ERAPS. Note, correct virtualization of ERAPS relies on an upcoming, publicly announced change in the APM to reduce the set of conditions where hardware (i.e. KVM) must flush the RAP. - Ignore nSVM intercepts for instructions that are not supported according to L1's virtual CPU model. - Add support for expedited writes to the fast MMIO bus, a la VMX's fastpath for EPT Misconfig. - Don't set GIF when clearing EFER.SVME, as GIF exists independently of SVM, and allow userspace to restore nested state with GIF=0. - Treat exit_code as an unsigned 64-bit value through all of KVM. - Add support for fetching SNP certificates from userspace. - Fix a bug where KVM would use vmcb02 instead of vmcb01 when emulating VMLOAD or VMSAVE on behalf of L2. - Misc fixes and cleanups. x86 selftests: - Add a regression test for TPR<=>CR8 synchronization and IRQ masking. - Overhaul selftest's MMU infrastructure to genericize stage-2 MMU support, and extend x86's infrastructure to support EPT and NPT (for L2 guests). - Extend several nested VMX tests to also cover nested SVM. - Add a selftest for nested VMLOAD/VMSAVE. - Rework the nested dirty log test, originally added as a regression test for PML where KVM logged L2 GPAs instead of L1 GPAs, to improve test coverage and to hopefully make the test easier to understand and maintain. guest_memfd: - Remove kvm_gmem_populate()'s preparation tracking and half-baked hugepage handling. SEV/SNP was the only user of the tracking and it can do it via the RMP. - Retroactively document and enforce (for SNP) that KVM_SEV_SNP_LAUNCH_UPDATE and KVM_TDX_INIT_MEM_REGION require the source page to be 4KiB aligned, to avoid non-trivial complexity for something that no known VMM seems to be doing and to avoid an API special case for in-place conversion, which simply can't support unaligned sources. - When populating guest_memfd memory, GUP the source page in common code and pass the refcounted page to the vendor callback, instead of letting vendor code do the heavy lifting. Doing so avoids a looming deadlock bug with in-place due an AB-BA conflict betwee mmap_lock and guest_memfd's filemap invalidate lock. Generic: - Fix a bug where KVM would ignore the vCPU's selected address space when creating a vCPU-specific mapping of guest memory. Actually this bug could not be hit even on x86, the only architecture with multiple address spaces, but it's a bug nevertheless. -----BEGIN PGP SIGNATURE----- iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmmNqwwUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPaZAf/cJx5B67lnST272esz0j29MIuT/Ti jnf6PI9b7XubKYOtNvlu5ZW4Jsa5dqRG0qeO/JmcXDlwBf5/UkWOyvqIXyiuTl0l KcSUlKPtTgKZSoZpJpTppuuDE8FSYqEdcCmjNvoYzcJoPjmaeJbK6aqO0AkBbb6e L5InrLV7nV9iua6rFvA0s/G8/Eq2DG8M9hTRHe6NcI/z4hvslOudvpUXtC8Jygoo cV8vFavUwc+atrmvhAOLvSitnrjfNa4zcG6XMOlwXPfIdvi3zqTlQTgUpwGKiAGQ RIDUVZ/9bcWgJqbPRsdEWwaYRkNQWc5nmrAHRpEEaYV/NeBBNf4v6qfKSw== =SkJ1 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "Loongarch: - Add more CPUCFG mask bits - Improve feature detection - Add lazy load support for FPU and binary translation (LBT) register state - Fix return value for memory reads from and writes to in-kernel devices - Add support for detecting preemption from within a guest - Add KVM steal time test case to tools/selftests ARM: - Add support for FEAT_IDST, allowing ID registers that are not implemented to be reported as a normal trap rather than as an UNDEF exception - Add sanitisation of the VTCR_EL2 register, fixing a number of UXN/PXN/XN bugs in the process - Full handling of RESx bits, instead of only RES0, and resulting in SCTLR_EL2 being added to the list of sanitised registers - More pKVM fixes for features that are not supposed to be exposed to guests - Make sure that MTE being disabled on the pKVM host doesn't give it the ability to attack the hypervisor - Allow pKVM's host stage-2 mappings to use the Force Write Back version of the memory attributes by using the "pass-through' encoding - Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the guest - Preliminary work for guest GICv5 support - A bunch of debugfs fixes, removing pointless custom iterators stored in guest data structures - A small set of FPSIMD cleanups - Selftest fixes addressing the incorrect alignment of page allocation - Other assorted low-impact fixes and spelling fixes RISC-V: - Fixes for issues discoverd by KVM API fuzzing in kvm_riscv_aia_imsic_has_attr(), kvm_riscv_aia_imsic_rw_attr(), and kvm_riscv_vcpu_aia_imsic_update() - Allow Zalasr, Zilsd and Zclsd extensions for Guest/VM - Transparent huge page support for hypervisor page tables - Adjust the number of available guest irq files based on MMIO register sizes found in the device tree or the ACPI tables - Add RISC-V specific paging modes to KVM selftests - Detect paging mode at runtime for selftests s390: - Performance improvement for vSIE (aka nested virtualization) - Completely new memory management. s390 was a special snowflake that enlisted help from the architecture's page table management to build hypervisor page tables, in particular enabling sharing the last level of page tables. This however was a lot of code (~3K lines) in order to support KVM, and also blocked several features. The biggest advantages is that the page size of userspace is completely independent of the page size used by the guest: userspace can mix normal pages, THPs and hugetlbfs as it sees fit, and in fact transparent hugepages were not possible before. It's also now possible to have nested guests and guests with huge pages running on the same host - Maintainership change for s390 vfio-pci - Small quality of life improvement for protected guests x86: - Add support for giving the guest full ownership of PMU hardware (contexted switched around the fastpath run loop) and allowing direct access to data MSRs and PMCs (restricted by the vPMU model). KVM still intercepts access to control registers, e.g. to enforce event filtering and to prevent the guest from profiling sensitive host state. This is more accurate, since it has no risk of contention and thus dropped events, and also has significantly less overhead. For more information, see the commit message for merge commit `bf2c3138ae` ("Merge tag 'kvm-x86-pmu-6.20' ...") - Disallow changing the virtual CPU model if L2 is active, for all the same reasons KVM disallows change the model after the first KVM_RUN - Fix a bug where KVM would incorrectly reject host accesses to PV MSRs when running with KVM_CAP_ENFORCE_PV_FEATURE_CPUID enabled, even if those were advertised as supported to userspace, - Fix a bug with protected guest state (SEV-ES/SNP and TDX) VMs, where KVM would attempt to read CR3 configuring an async #PF entry - Fail the build if EXPORT_SYMBOL_GPL or EXPORT_SYMBOL is used in KVM (for x86 only) to enforce usage of EXPORT_SYMBOL_FOR_KVM_INTERNAL. Only a few exports that are intended for external usage, and those are allowed explicitly - When checking nested events after a vCPU is unblocked, ignore -EBUSY instead of WARNing. Userspace can sometimes put the vCPU into what should be an impossible state, and spurious exit to userspace on -EBUSY does not really do anything to solve the issue - Also throw in the towel and drop the WARN on INIT/SIPI being blocked when vCPU is in Wait-For-SIPI, which also resulted in playing whack-a-mole with syzkaller stuffing architecturally impossible states into KVM - Add support for new Intel instructions that don't require anything beyond enumerating feature flags to userspace - Grab SRCU when reading PDPTRs in KVM_GET_SREGS2 - Add WARNs to guard against modifying KVM's CPU caps outside of the intended setup flow, as nested VMX in particular is sensitive to unexpected changes in KVM's golden configuration - Add a quirk to allow userspace to opt-in to actually suppress EOI broadcasts when the suppression feature is enabled by the guest (currently limited to split IRQCHIP, i.e. userspace I/O APIC). Sadly, simply fixing KVM to honor Suppress EOI Broadcasts isn't an option as some userspaces have come to rely on KVM's buggy behavior (KVM advertises Supress EOI Broadcast irrespective of whether or not userspace I/O APIC supports Directed EOIs) - Clean up KVM's handling of marking mapped vCPU pages dirty - Drop a pile of ancient sanity checks hidden behind in KVM's unused ASSERT() macro, most of which could be trivially triggered by the guest and/or user, and all of which were useless - Fold "struct dest_map" into its sole user, "struct rtc_status", to make it more obvious what the weird parameter is used for, and to allow fropping these RTC shenanigans if CONFIG_KVM_IOAPIC=n - Bury all of ioapic.h, i8254.h and related ioctls (including KVM_CREATE_IRQCHIP) behind CONFIG_KVM_IOAPIC=y - Add a regression test for recent APICv update fixes - Handle "hardware APIC ISR", a.k.a. SVI, updates in kvm_apic_update_apicv() to consolidate the updates, and to co-locate SVI updates with the updates for KVM's own cache of ISR information - Drop a dead function declaration - Minor cleanups x86 (Intel): - Rework KVM's handling of VMCS updates while L2 is active to temporarily switch to vmcs01 instead of deferring the update until the next nested VM-Exit. The deferred updates approach directly contributed to several bugs, was proving to be a maintenance burden due to the difficulty in auditing the correctness of deferred updates, and was polluting "struct nested_vmx" with a growing pile of booleans - Fix an SGX bug where KVM would incorrectly try to handle EPCM page faults, and instead always reflect them into the guest. Since KVM doesn't shadow EPCM entries, EPCM violations cannot be due to KVM interference and can't be resolved by KVM - Fix a bug where KVM would register its posted interrupt wakeup handler even if loading kvm-intel.ko ultimately failed - Disallow access to vmcb12 fields that aren't fully supported, mostly to avoid weirdness and complexity for FRED and other features, where KVM wants enable VMCS shadowing for fields that conditionally exist - Print out the "bad" offsets and values if kvm-intel.ko refuses to load (or refuses to online a CPU) due to a VMCS config mismatch x86 (AMD): - Drop a user-triggerable WARN on nested_svm_load_cr3() failure - Add support for virtualizing ERAPS. Note, correct virtualization of ERAPS relies on an upcoming, publicly announced change in the APM to reduce the set of conditions where hardware (i.e. KVM) must flush the RAP - Ignore nSVM intercepts for instructions that are not supported according to L1's virtual CPU model - Add support for expedited writes to the fast MMIO bus, a la VMX's fastpath for EPT Misconfig - Don't set GIF when clearing EFER.SVME, as GIF exists independently of SVM, and allow userspace to restore nested state with GIF=0 - Treat exit_code as an unsigned 64-bit value through all of KVM - Add support for fetching SNP certificates from userspace - Fix a bug where KVM would use vmcb02 instead of vmcb01 when emulating VMLOAD or VMSAVE on behalf of L2 - Misc fixes and cleanups x86 selftests: - Add a regression test for TPR<=>CR8 synchronization and IRQ masking - Overhaul selftest's MMU infrastructure to genericize stage-2 MMU support, and extend x86's infrastructure to support EPT and NPT (for L2 guests) - Extend several nested VMX tests to also cover nested SVM - Add a selftest for nested VMLOAD/VMSAVE - Rework the nested dirty log test, originally added as a regression test for PML where KVM logged L2 GPAs instead of L1 GPAs, to improve test coverage and to hopefully make the test easier to understand and maintain guest_memfd: - Remove kvm_gmem_populate()'s preparation tracking and half-baked hugepage handling. SEV/SNP was the only user of the tracking and it can do it via the RMP - Retroactively document and enforce (for SNP) that KVM_SEV_SNP_LAUNCH_UPDATE and KVM_TDX_INIT_MEM_REGION require the source page to be 4KiB aligned, to avoid non-trivial complexity for something that no known VMM seems to be doing and to avoid an API special case for in-place conversion, which simply can't support unaligned sources - When populating guest_memfd memory, GUP the source page in common code and pass the refcounted page to the vendor callback, instead of letting vendor code do the heavy lifting. Doing so avoids a looming deadlock bug with in-place due an AB-BA conflict betwee mmap_lock and guest_memfd's filemap invalidate lock Generic: - Fix a bug where KVM would ignore the vCPU's selected address space when creating a vCPU-specific mapping of guest memory. Actually this bug could not be hit even on x86, the only architecture with multiple address spaces, but it's a bug nevertheless" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (267 commits) KVM: s390: Increase permitted SE header size to 1 MiB MAINTAINERS: Replace backup for s390 vfio-pci KVM: s390: vsie: Fix race in acquire_gmap_shadow() KVM: s390: vsie: Fix race in walk_guest_tables() KVM: s390: Use guest address to mark guest page dirty irqchip/riscv-imsic: Adjust the number of available guest irq files RISC-V: KVM: Transparent huge page support RISC-V: KVM: selftests: Add Zalasr extensions to get-reg-list test RISC-V: KVM: Allow Zalasr extensions for Guest/VM KVM: riscv: selftests: Add riscv vm satp modes KVM: riscv: selftests: add Zilsd and Zclsd extension to get-reg-list test riscv: KVM: allow Zilsd and Zclsd extensions for Guest/VM RISC-V: KVM: Skip IMSIC update if vCPU IMSIC state is not initialized RISC-V: KVM: Fix null pointer dereference in kvm_riscv_aia_imsic_rw_attr() RISC-V: KVM: Fix null pointer dereference in kvm_riscv_aia_imsic_has_attr() RISC-V: KVM: Remove unnecessary 'ret' assignment KVM: s390: Add explicit padding to struct kvm_s390_keyop KVM: LoongArch: selftests: Add steal time test case LoongArch: KVM: Add paravirt vcpu_is_preempted() support in guest side LoongArch: KVM: Add paravirt preempt feature in hypervisor side ...	2026-02-13 11:31:15 -08:00
Ian Rogers	dbf0108347	perf test script: Add python script testing support Basic coverage of python script support from `perf script`. Committer testing: $ perf test 'perf script python' 107: perf script python tests : Ok $ perf test -vv 'perf script python' 107: perf script python tests: --- start --- test child forked, pid 595537 Testing event: sched:sched_switch perf script python test [Skipped: failed to record sched:sched_switch] Testing event: task-clock Generating python script... generated Python script: /tmp/__perf_test_script.J4rWj.py Executing python script... perf script python test [Success: task-clock triggered param_dict] ---- end(0) ---- 107: perf script python tests : Ok $ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Yujie Liu <yujie.liu@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-12 17:45:23 -03:00
Ian Rogers	2273697781	perf test script: Add perl script testing support Basic coverage of perl script support from `perf script`. This is disabled by default and so the test will most normally skip. Committer testing: $ perf test 'perf script perl' 106: perf script perl tests : Skip $ perf test -vv 'perf script perl' 106: perf script perl tests: --- start --- test child forked, pid 578323 perf script perl test [Skipped: no libperl support] ---- end(-2) ---- 106: perf script perl tests : Skip $ perf check feature libperl libperl: [ OFF ] # HAVE_LIBPERL_SUPPORT ( tip: Deprecated, use LIBPERL=1 and install perl-ExtUtils-Embed/libperl-dev to build with it ) $ Install perl-ExtUtils-Embed, build with LIBPERL=1, rebuild: $ perf check feature libperl libperl: [ on ] # HAVE_LIBPERL_SUPPORT $ perf test 'perf script perl' 106: perf script perl tests : Ok $ perf test -vv 'perf script perl' 106: perf script perl tests: --- start --- test child forked, pid 588206 Testing event: sched:sched_switch perf script perl test [Skipped: failed to record sched:sched_switch] Testing event: task-clock Generating perl script... generated Perl script: /tmp/__perf_test_script.RpMn5.pl Executing perl script... perf script perl test [Success: task-clock triggered $VAR1] ---- end(0) ---- 106: perf script perl tests : Ok $ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Yujie Liu <yujie.liu@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-12 17:45:23 -03:00
Ian Rogers	22ca2f7f32	perf script: Allow the generated script to be a path Allow the script generated by "perf script -g <language>" to be a file path and the language determined by the file extension. This is useful in testing so that the generated script file can be written to a test directory. Committer testing: $ perf record ls a.a ls: cannot access 'a.a': No such file or directory [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ] $ perf script -g python generated Python script: perf-script.py $ perf script -g myscript.py generated Python script: myscript.py $ diff -u perf-script.py myscript.py $ tail myscript.py def trace_unhandled(event_name, context, event_fields_dict, perf_sample_dict): print(get_dict_as_string(event_fields_dict)) print('Sample: {'+get_dict_as_string(perf_sample_dict['sample'], ', ')+'}') def print_header(event_name, cpu, secs, nsecs, pid, comm): print("%-20s %5u %05u.%09u %8u %-20s " % \ (event_name, cpu, secs, nsecs, pid, comm), end="") def get_dict_as_string(a_dict, delimiter=' '): return delimiter.join(['%s=%s'%(k,str(v))for k,v in sorted(a_dict.items())]) $ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Yujie Liu <yujie.liu@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-12 17:45:22 -03:00
Ian Rogers	9083ce531a	perf test: perf data --to-ctf testing If babeltrace is detected check that --to-ctf functions with a data file and in pipe mode. Committer testing: $ perf test 'perf data convert --to-ctf' 124: 'perf data convert --to-ctf' command test : Ok $ perf test -vv 'perf data convert --to-ctf' 124: 'perf data convert --to-ctf' command test: --- start --- test child forked, pid 556008 libbabeltrace: [ on ] # HAVE_LIBBABELTRACE_SUPPORT Testing Perf Data Conversion Command to CTF (File input) [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.021 MB /tmp/__perf_test.perf.data.9TxzZ (115 samples) ] [ perf data convert: Converted '/tmp/__perf_test.perf.data.9TxzZ' into CTF data '/tmp/__perf_test.ctf.f5EkS' ] [ perf data convert: Converted and wrote 0.012 MB (115 samples) ] Perf Data Converter Command to CTF (File input) [SUCCESS] Testing Perf Data Conversion Command to CTF (Pipe mode) [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.047 MB - ] Failed to setup all events. [ perf data convert: Converted '/tmp/__perf_test.perf.data.9TxzZ' into CTF data '/tmp/__perf_test.ctf.f5EkS' ] [ perf data convert: Converted and wrote 0.000 MB (0 samples) ] Perf Data Converter Command to CTF (Pipe mode) [SUCCESS] Unexpected signal in main ---- end(0) ---- 124: 'perf data convert --to-ctf' command test : Ok $ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Derek Foreman <derek.foreman@collabora.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-12 17:45:22 -03:00
Ian Rogers	fc4577b52a	perf test: Test pipe mode with data conversion --to-json Add pipe mode test for json data conversion. Tidy up exit and cleanup code. Committer testing: $ perf test 'perf data convert --to-json' 124: 'perf data convert --to-json' command test : Ok $ perf test -vv 'perf data convert --to-json' 124: 'perf data convert --to-json' command test: --- start --- test child forked, pid 548738 Testing Perf Data Conversion Command to JSON [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.020 MB /tmp/__perf_test.perf.data.krxvl (104 samples) ] [ perf data convert: Converted '/tmp/__perf_test.perf.data.krxvl' into JSON data '/tmp/__perf_test.output.json.0z60p' ] [ perf data convert: Converted and wrote 0.075 MB (104 samples) ] Perf Data Converter Command to JSON [SUCCESS] Validating Perf Data Converted JSON file The file contains valid JSON format [SUCCESS] Testing Perf Data Conversion Command to JSON (Pipe mode) [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.046 MB - ] [ perf data convert: Converted '-' into JSON data '/tmp/__perf_test.output.json.0z60p' ] [ perf data convert: Converted and wrote 0.081 MB (110 samples) ] Perf Data Converter Command to JSON (Pipe mode) [SUCCESS] Validating Perf Data Converted JSON file The file contains valid JSON format [SUCCESS] ---- end(0) ---- 124: 'perf data convert --to-json' command test : Ok $ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Derek Foreman <derek.foreman@collabora.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-12 17:45:22 -03:00
Ian Rogers	5b92fc082c	perf json: Pipe mode --to-ctf support In pipe mode the environment may not be fully initialized so be robust to fields being NULL. Add default handling of attr events, use the feature events to populate the ctf writer environment. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Derek Foreman <derek.foreman@collabora.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-12 17:45:22 -03:00
Ian Rogers	6db2f7c67b	perf json: Pipe mode --to-json support In pipe mode the environment may not be fully initialized so be robust to fields being NULL. Add default handling of feature and attr events. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Derek Foreman <derek.foreman@collabora.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-12 17:45:22 -03:00
Ian Rogers	8772598b78	perf check: Add libbabeltrace to the listed features This enables scripts to more easily determine if `perf data --to-ctf` is supported. Committer testing: $ perf check feature libbabeltrace libbabeltrace: [ on ] # HAVE_LIBBABELTRACE_SUPPORT $ perf check feature -q libbabeltrace && echo have libbabeltrace support have libbabeltrace support $ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Derek Foreman <derek.foreman@collabora.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-12 17:45:22 -03:00
hupu	9eb1760f84	perf build: Allow passing extra Clang flags via EXTRA_BPF_FLAGS Add support for EXTRA_BPF_FLAGS in the eBPF skeleton build, allowing users to pass additional clang options such as --sysroot or custom include paths when cross-compiling perf. This is primarily intended for cross-build scenarios where the default host include paths do not match the target kernel version. Example usage: make perf ARCH="arm64" EXTRA_BPF_FLAGS="--sysroot=..." Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: hupu <hupu.gm@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bill Wendling <morbo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-12 17:45:22 -03:00
Arnaldo Carvalho de Melo	adc1284bae	perf test data_type_profiling.sh: Skip just the Rust tests if code_with_type workload is missing Namhyung suggested skipping only the rust tests when the code_with_type 'perf test' workload is not built into perf, do it so that we can continue to test the C based workloads: With rust: root@number:/# perf test -vv "data type" 83: perf data type profiling tests: --- start --- test child forked, pid 2645245 Basic Rust perf annotate test Basic annotate test [Success] Pipe Rust perf annotate test Pipe annotate test [Success] Basic C perf annotate test Basic annotate test [Success] Pipe C perf annotate test Pipe annotate test [Success] ---- end(0) ---- 83: perf data type profiling tests : Ok root@number:/# Without: root@number:/# perf test "data type" 83: perf data type profiling tests : Ok root@number:/# perf test -vv "data type" 83: perf data type profiling tests: --- start --- test child forked, pid 2634759 Basic Rust perf annotate test Skip: code_with_type workload not built in 'perf test' Pipe Rust perf annotate test Skip: code_with_type workload not built in 'perf test' Basic C perf annotate test Basic annotate test [Success] Pipe C perf annotate test Pipe annotate test [Success] ---- end(0) ---- 83: perf data type profiling tests : Ok root@number:/# Suggested-by: Namhyung Kim <namhyung@kernel.org> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-12 17:45:22 -03:00
Ian Rogers	1a6c45969a	perf libunwind: Fix calls to thread__e_machine() Add the missing 'e_flags' option to fix the build. Fixes: `4e66527f88` ("perf thread: Add optional e_flags output argument to thread__e_machine") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-12 17:45:22 -03:00
Linus Torvalds	41f1a08645	Kbuild/Kconfig updates for 7.0 Kbuild changes ============== * Drop '_probe' pattern from modpost section check allowlist, which hid legitimate warnings (Johan Hovold) Disable -Wtype-limits altogether, instead of enabling at W=2 (Vincent Mailhol) * Improve UAPI testing to skip testing headers that require a libc when CONFIG_CC_CAN_LINK is not set, opening up testing of headers with no libc dependencies to more environments (Thomas Weißschuh) * Update gendwarfksyms documentation with required dependencies (Jihan LIN) * Reject invalid LLVM= values to avoid unintentionally falling back to system toolchain (Thomas Weißschuh) * Add a script to help run the kernel build process in a container for consistent environments and testing (Guillaume Tucker) * Simplify kallsyms by getting rid of the relative base (Ard Biesheuvel) * Performance and usability improvements to scripts/make_fit.py (Simon Glass) * Minor various clean ups and fixes Kconfig changes =============== * Move XPM icons to individual files, clearing up GTK deprecation warnings (Rostislav Krasny) * Support depends on FOO if BAR as syntactic sugar for depends on FOO \|\| !BAR' (Nicolas Pitre, Graham Roff) * Refactor merge_config.sh to use awk over shell/sed/grep, dramatically speeding up processing large number of config fragments (Anders Roxell, Mikko Rapeli) -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQR74yXHMTGczQHYypIdayaRccAalgUCaYpgQwAKCRAdayaRccAa liOGAQCqMI42YMLqljFcPu3B/3f43xhDBCXAhquPBIMhbgt+aAEAmmo3uMLHKSRV XZDKkq13HMMV3Zlmrn5Xk/tzk+hkwwk= =WYl4 -----END PGP SIGNATURE----- Merge tag 'kbuild-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux Pull Kbuild/Kconfig updates from Nathan Chancellor: "Kbuild: - Drop '_probe' pattern from modpost section check allowlist, which hid legitimate warnings (Johan Hovold) - Disable -Wtype-limits altogether, instead of enabling at W=2 (Vincent Mailhol) - Improve UAPI testing to skip testing headers that require a libc when CONFIG_CC_CAN_LINK is not set, opening up testing of headers with no libc dependencies to more environments (Thomas Weißschuh) - Update gendwarfksyms documentation with required dependencies (Jihan LIN) - Reject invalid LLVM= values to avoid unintentionally falling back to system toolchain (Thomas Weißschuh) - Add a script to help run the kernel build process in a container for consistent environments and testing (Guillaume Tucker) - Simplify kallsyms by getting rid of the relative base (Ard Biesheuvel) - Performance and usability improvements to scripts/make_fit.py (Simon Glass) - Minor various clean ups and fixes Kconfig: - Move XPM icons to individual files, clearing up GTK deprecation warnings (Rostislav Krasny) - Support depends on FOO if BAR as syntactic sugar for depends on FOO \|\| !BAR (Nicolas Pitre, Graham Roff) - Refactor merge_config.sh to use awk over shell/sed/grep, dramatically speeding up processing large number of config fragments (Anders Roxell, Mikko Rapeli)" tag 'kbuild-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: (39 commits) kbuild: remove dependency of run-command on config scripts/make_fit: Compress dtbs in parallel scripts/make_fit: Support a few more parallel compressors kbuild: Support a FIT_EXTRA_ARGS environment variable scripts/make_fit: Move dtb processing into a function scripts/make_fit: Support an initial ramdisk scripts/make_fit: Speed up operation rust: kconfig: Don't require RUST_IS_AVAILABLE for rustc-option MAINTAINERS: Add scripts/install.sh into Kbuild entry modpost: Amend ppc64 save/restfpr symnames for -Os build MIPS: tools: relocs: Ship a definition of R_MIPS_PC32 streamline_config.pl: remove superfluous exclamation mark kbuild: dummy-tools: Add python3 scripts: kconfig: merge_config.sh: warn on duplicate input files scripts: kconfig: merge_config.sh: use awk in checks too scripts: kconfig: merge_config.sh: refactor from shell/sed/grep to awk kallsyms: Get rid of kallsyms relative base mips: Add support for PC32 relocations in vmlinux Documentation: dev-tools: add container.rst page scripts: add tool to run containerized builds ...	2026-02-11 13:40:35 -08:00
Paolo Bonzini	bf2c3138ae	Merge tag 'kvm-x86-pmu-6.20' of https://github.com/kvm-x86/linux into HEAD KVM mediated PMU support for 6.20 Add support for mediated PMUs, where KVM gives the guest full ownership of PMU hardware (contexted switched around the fastpath run loop) and allows direct access to data MSRs and PMCs (restricted by the vPMU model), but intercepts access to control registers, e.g. to enforce event filtering and to prevent the guest from profiling sensitive host state. To keep overall complexity reasonable, mediated PMU usage is all or nothing for a given instance of KVM (controlled via module param). The Mediated PMU is disabled default, partly to maintain backwards compatilibity for existing setup, partly because there are tradeoffs when running with a mediated PMU that may be non-starters for some use cases, e.g. the host loses the ability to profile guests with mediated PMUs, the fastpath run loop is also a blind spot, entry/exit transitions are more expensive, etc. Versus the emulated PMU, where KVM is "just another perf user", the mediated PMU delivers more accurate profiling and monitoring (no risk of contention and thus dropped events), with significantly less overhead (fewer exits and faster emulation/programming of event selectors) E.g. when running Specint-2017 on a single-socket Sapphire Rapids with 56 cores and no-SMT, and using perf from within the guest: Perf command: a. basic-sampling: perf record -F 1000 -e 6-instructions -a --overwrite b. multiplex-sampling: perf record -F 1000 -e 10-instructions -a --overwrite Guest performance overhead: --------------------------------------------------------------------------- \| Test case \| emulated vPMU \| all passthrough \| passthrough with \| \| \| \| \| event filters \| --------------------------------------------------------------------------- \| basic-sampling \| 33.62% \| 4.24% \| 6.21% \| --------------------------------------------------------------------------- \| multiplex-sampling \| 79.32% \| 7.34% \| 10.45% \| ---------------------------------------------------------------------------	2026-02-11 12:45:40 -05:00
Linus Torvalds	4d84667627	Performance events changes for v7.0: x86 PMU driver updates: - Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs. Compared to previous iterations of the Intel PMU code, there's been a lot of changes, which center around three main areas: - Introduce the OFF-MODULE RESPONSE (OMR) facility to replace the Off-Core Response (OCR) facility - New PEBS data source encoding layout - Support the new "RDPMC user disable" feature (Dapeng Mi) - Likewise, a large series adds uncore PMU support for Intel Diamond Rapids (DMR) CPUs, which center around these four main areas: - DMR may have two Integrated I/O and Memory Hub (IMH) dies, separate from the compute tile (CBB) dies. Each CBB and each IMH die has its own discovery domain. - Unlike prior CPUs that retrieve the global discovery table portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON discovery and MSR for CBB PMON discovery. - DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA, UBR, PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6. - IIO free-running counters in DMR are MMIO-based, unlike SPR. (Zide Chen) - Also add support for Add missing PMON units for Intel Panther Lake, and support Nova Lake (NVL), which largely maps to Panther Lake. (Zide Chen) - KVM integration: Add support for mediated vPMUs (by Kan Liang and Sean Christopherson, with fixes and cleanups by Peter Zijlstra, Sandipan Das and Mingwei Zhang) - Add Intel cstate driver to support for Wildcat Lake (WCL) CPUs, which are a low-power variant of Panther Lake. (Zide Chen) - Add core, cstate and MSR PMU support for the Airmont NP Intel CPU (aka MaxLinear Lightning Mountain), which maps to the existing Airmont code. (Martin Schiller) Performance enhancements: - core: Speed up kexec shutdown by avoiding unnecessary cross CPU calls. (Jan H. Schönherr) - core: Fix slow perf_event_task_exit() with LBR callstacks (Namhyung Kim) User-space stack unwinding support: - Various cleanups and refactorings in preparation to generalize the unwinding code for other architectures. (Jens Remus) Uprobes updates: - Transition from kmap_atomic to kmap_local_page (Keke Ming) - Fix incorrect lockdep condition in filter_chain() (Breno Leitao) - Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov) Misc fixes and cleanups: - s390: Remove kvm_types.h from Kbuild (Randy Dunlap) - x86/intel/uncore: Convert comma to semicolon (Chen Ni) - x86/uncore: Clean up const mismatch (Greg Kroah-Hartman) - x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi) Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmmJhTURHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1i/qw/9F/sjMqbxH8d3kPXB8wk2eUuSkynQ2aNw Zec9qtfCC5N1U9b2D7ywGJRscTmWYnX/3BKTyzFuyA6SDz6buAgDDIGPlHi+9Fww +RUUS3lQ7N3pVWZ4Ifu3kbh3Vz4lkQuOXhfcjiyIMS6QIxfrcSLFoKHK+2V6PeU+ x0k+THHz/Ymg+DIpqSjqil1yrKaUmU9xRrbnyy6zJB1duREQrkYBhIWL1+bcd7SA 89RVAGXQ+sWzVQMPaKrMkZj6GavOCB7zseigiiwjBRLznukS2OulDDe8zR6pCJZp wbdc7TR/nCm+QtNfkHlOmTQvsPAXiXNyXe5Vi8aFjGc0uMGhHaeiL9ah/bwsKA5m Bm5Y7oVSmBlCJbcr/CTrGYkb+WwvLIgPCwVkn4FYPlsWv+U92qTOx9q7qKmIDFaj 1oUXCwoHbYrYnoZqZqPp2h689m0Lh/lsGhy0QRt8aGnKu0SDjMaqRGbYWH0UI0kA aDnZstVHG76RnVi0143q8HcvvjNZb82NL0cS749tY/YcwH4kUEGj2XuSK2Ar887T H0oDJHXijMlGXWqO5bK3WMoCQajR7nRyqBo7/rKYj20OjXmwoXS3vel77/s8WFo2 fUUC469MacDzyxdBNutnkJvvcvsUio3r4MWFsEEWQk2nUE58PtN8YM8j/FdNYpql zAZ4Jx/A0RM= =4SRB -----END PGP SIGNATURE----- Merge tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance event updates from Ingo Molnar: "x86 PMU driver updates: - Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs (Dapeng Mi) Compared to previous iterations of the Intel PMU code, there's been a lot of changes, which center around three main areas: - Introduce the OFF-MODULE RESPONSE (OMR) facility to replace the Off-Core Response (OCR) facility - New PEBS data source encoding layout - Support the new "RDPMC user disable" feature - Likewise, a large series adds uncore PMU support for Intel Diamond Rapids (DMR) CPUs (Zide Chen) This centers around these four main areas: - DMR may have two Integrated I/O and Memory Hub (IMH) dies, separate from the compute tile (CBB) dies. Each CBB and each IMH die has its own discovery domain. - Unlike prior CPUs that retrieve the global discovery table portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON discovery and MSR for CBB PMON discovery. - DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA, UBR, PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6. - IIO free-running counters in DMR are MMIO-based, unlike SPR. - Also add support for Add missing PMON units for Intel Panther Lake, and support Nova Lake (NVL), which largely maps to Panther Lake. (Zide Chen) - KVM integration: Add support for mediated vPMUs (by Kan Liang and Sean Christopherson, with fixes and cleanups by Peter Zijlstra, Sandipan Das and Mingwei Zhang) - Add Intel cstate driver to support for Wildcat Lake (WCL) CPUs, which are a low-power variant of Panther Lake (Zide Chen) - Add core, cstate and MSR PMU support for the Airmont NP Intel CPU (aka MaxLinear Lightning Mountain), which maps to the existing Airmont code (Martin Schiller) Performance enhancements: - Speed up kexec shutdown by avoiding unnecessary cross CPU calls (Jan H. Schönherr) - Fix slow perf_event_task_exit() with LBR callstacks (Namhyung Kim) User-space stack unwinding support: - Various cleanups and refactorings in preparation to generalize the unwinding code for other architectures (Jens Remus) Uprobes updates: - Transition from kmap_atomic to kmap_local_page (Keke Ming) - Fix incorrect lockdep condition in filter_chain() (Breno Leitao) - Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov) Misc fixes and cleanups: - s390: Remove kvm_types.h from Kbuild (Randy Dunlap) - x86/intel/uncore: Convert comma to semicolon (Chen Ni) - x86/uncore: Clean up const mismatch (Greg Kroah-Hartman) - x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi)" * tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits) s390: remove kvm_types.h from Kbuild uprobes: Fix incorrect lockdep condition in filter_chain() x86/ibs: Fix typo in dc_l2tlb_miss comment x86/uprobes: Fix XOL allocation failure for 32-bit tasks perf/x86/intel/uncore: Convert comma to semicolon perf/x86/intel: Add support for rdpmc user disable feature perf/x86: Use macros to replace magic numbers in attr_rdpmc perf/x86/intel: Add core PMU support for Novalake perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL perf/x86/intel: Add core PMU support for DMR perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL perf/core: Fix slow perf_event_task_exit() with LBR callstacks perf/core: Speed up kexec shutdown by avoiding unnecessary cross CPU calls uprobes: use kmap_local_page() for temporary page mappings arm/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() mips/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() arm64/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() riscv/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() perf/x86/intel/uncore: Add Nova Lake support ...	2026-02-10 12:00:46 -08:00
Ian Rogers	5d1ab659fb	perf stat: Add no-affinity flag Add flag that disables affinity behavior. Using sched_setaffinity() to place a perf thread on a CPU can avoid certain interprocessor interrupts but may introduce a delay due to the scheduling, particularly on loaded machines. Add a command line option to disable the behavior. This behavior is less present in other tools like `perf record`, as it uses a ring buffer and doesn't make repeated system calls. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-10 09:35:28 -03:00
Ian Rogers	d484361550	perf evlist: Reduce affinity use and move into iterator, fix no affinity The evlist__for_each_cpu iterator will call sched_setaffitinity when moving between CPUs to avoid IPIs. If only 1 IPI is saved then this may be unprofitable as the delay to get scheduled may be considerable. This may be particularly true if reading an event group in `perf stat` in interval mode. Move the affinity handling completely into the iterator so that a single evlist__use_affinity can determine whether CPU affinities will be used. For `perf record` the change is minimal as the dummy event and the real event will always make the use of affinities the thing to do. In `perf stat`, tool events are ignored and affinities only used if >1 event on the same CPU occur. Determining if affinities are useful is done by evlist__use_affinity which tests per-event whether the event's PMU benefits from affinity use - it is assumed only perf event using PMUs do. Fix a bug where when there are no affinities that the CPU map iterator may reference a CPU not present in the initial evsel. Fix by making the iterator and non-iterator code common. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-10 09:34:44 -03:00
Ian Rogers	47172912c9	perf evlist: Missing TPEBS close in evlist__close() The libperf evsel close won't close TPEBS events properly. Add a test to do this. The libperf close routine is used in evlist__close() for affinity reasons. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-10 09:34:11 -03:00
Ian Rogers	ff8548172f	perf evlist: Special map propagation for tool events that read on 1 CPU Tool events like duration_time don't need a perf_cpu_map that contains all online CPUs. Having such a perf_cpu_map causes overheads when iterating between events for CPU affinity. During parsing mark events that just read on a single CPU map index as such, then during map propagation set up the evsel's CPUs and thereby the evlists accordingly. The setting cannot be done early in parsing as user CPUs are only fully known when evlist__create_maps is called. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-10 09:33:28 -03:00
Ian Rogers	63b320aaac	perf stat-shadow: In prepare_metric fix guard on reading NULL perf_stat_evsel The aggr value is setup to always be non-null creating a redundant guard for reading from it. Switch to using the perf_stat_evsel (ps) and narrow the scope of aggr so that it is known valid when used. Fixes: `3d65f6445f` ("perf stat-shadow: Read tool events directly") Reported-by: Andres Freund <andres@anarazel.de> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-10 09:33:13 -03:00
Ian Rogers	bc105a8918	Revert "perf tool_pmu: More accurately set the cpus for tool events" This reverts commit `d8d8a0b360`. The setting of a user CPU map can cause an empty intersection when combined with CPU 0 and the event removed. This later triggers a segv in the stat-shadow logic. Let's put back a full online CPU map for now by reverting this patch. Closes: https://lore.kernel.org/linux-perf-users/cgja46br2smmznxs7kbeabs6zgv3b4olfqgh2fdp5mxk2yom4v@w6jjgov6hdi6/ Reported-by: Andres Freund <andres@anarazel.de> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-10 09:32:34 -03:00
Thomas Richter	3d012b8614	perf test: Fix test case perftool-testsuite_report for s390 Test case perftool-testsuite_report fails on s390 for some time now. Root cause is a time out which is too tight for large s390 machines. The time out value addr2line_timeout_ms is per default set to 1 second. This is the maximum time the function read_addr2line_record() waits for a reply from the forked off tool addr2line, which is started as a child in interactive mode. It reads stdin (an address in hexadecimal) and replies on stdout with function name, file name and line number. This might take more than one second. However one second is not always enough and the reply from addr2line tool is not received. Function read_addr2line_record() fails and emits a warning, which is not expected by the test case. It fails. Output before: # perf test -F 133 -- [ PASS ] -- perf_report :: setup :: prepare the perf.data file ================== [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.087 MB \ /tmp/perftool-testsuite_report.FHz/perf_report/perf.data.1 \ (207 samples) ] ================== -- [ PASS ] -- perf_report :: setup :: prepare the perf.data.1 file ## [ PASS ] ## perf_report :: setup SUMMARY -- [ SKIP ] -- perf_report :: test_basic :: help message :: testcase skipped Line did not match any pattern: "cmd__addr2line /usr/lib/debug/lib/modules/ 6.19.0-20260205.rc8.git366.9845cf73f7db.300.fc43.s390x+next/ vmlinux: could not read first record" Line did not match any pattern: "cmd__addr2line /usr/lib/debug/lib/modules/ 6.19.0-20260205.rc8.git366.9845cf73f7db.300.fc43.s390x+next/ vmlinux: could not read first record" -- [ FAIL ] -- perf_report :: test_basic :: basic execution (output regexp parsing) .... 133: perftool-testsuite_report : FAILED! Output after: # ./perf test -F 133 -- [ PASS ] -- perf_report :: setup :: prepare the perf.data file ================== [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.087 MB \ /tmp/perftool-testsuite_report.Mlp/perf_report/perf.data.1 (188 samples) ] ================== -- [ PASS ] -- perf_report :: setup :: prepare the perf.data.1 file ## [ PASS ] ## perf_report :: setup SUMMARY -- [ SKIP ] -- perf_report :: test_basic :: help message :: testcase skipped -- [ PASS ] -- perf_report :: test_basic :: basic execution -- [ PASS ] -- perf_report :: test_basic :: number of samples -- [ PASS ] -- perf_report :: test_basic :: header -- [ PASS ] -- perf_report :: test_basic :: header timestamp -- [ PASS ] -- perf_report :: test_basic :: show CPU utilization -- [ PASS ] -- perf_report :: test_basic :: pid -- [ PASS ] -- perf_report :: test_basic :: non-existing symbol -- [ PASS ] -- perf_report :: test_basic :: symbol filter -- [ PASS ] -- perf_report :: test_basic :: latency header -- [ PASS ] -- perf_report :: test_basic :: default report for latency profile -- [ PASS ] -- perf_report :: test_basic :: latency report for latency profile -- [ PASS ] -- perf_report :: test_basic :: parallelism histogram ## [ PASS ] ## perf_report :: test_basic SUMMARY 133: perftool-testsuite_report : Ok # Fixes: `257046a367` ("perf srcline: Fallback between addr2line implementations") Reviewed-by: Jan Polensky <japo@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: linux-s390@vger.kernel.org Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-09 15:51:35 -03:00
Arnaldo Carvalho de Melo	2a400eeba4	perf test code_with_type.sh: Skip test if rust wasn't available at build time $ perf test 'perf data type profiling tests' 83: perf data type profiling tests : Skip $ perf test -vv 'perf data type profiling tests' 83: perf data type profiling tests: --- start --- test child forked, pid 977213 Skip: code_with_type workload not built in 'perf test' ---- end(-2) ---- 83: perf data type profiling tests : Skip $ Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-09 15:47:03 -03:00
Dmitrii Dolgov	1d3ffe6233	perf tests workload: Formatting for code_with_type.rs One part of the rust code for code_with_type workload wasn't properly formatted. Pass it through rustfmt to fix that. Closes: https://lore.kernel.org/oe-kbuild-all/202602091357.oyRv6hgQ-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-09 15:36:00 -03:00
Paolo Bonzini	5490063269	KVM/arm64 updates for 7.0 - Add support for FEAT_IDST, allowing ID registers that are not implemented to be reported as a normal trap rather than as an UNDEF exception. - Add sanitisation of the VTCR_EL2 register, fixing a number of UXN/PXN/XN bugs in the process. - Full handling of RESx bits, instead of only RES0, and resulting in SCTLR_EL2 being added to the list of sanitised registers. - More pKVM fixes for features that are not supposed to be exposed to guests. - Make sure that MTE being disabled on the pKVM host doesn't give it the ability to attack the hypervisor. - Allow pKVM's host stage-2 mappings to use the Force Write Back version of the memory attributes by using the "pass-through' encoding. - Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the guest. - Preliminary work for guest GICv5 support. - A bunch of debugfs fixes, removing pointless custom iterators stored in guest data structures. - A small set of FPSIMD cleanups. - Selftest fixes addressing the incorrect alignment of page allocation. - Other assorted low-impact fixes and spelling fixes. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmmGBEkACgkQI9DQutE9 ekPxQQ//VOzle+RVmgzVSJzpNcoW576QGI7+pZLEMIywXTx6rH+uz2FCaZvvgV7M LrJ+1Qps9ea5Yti9OplNJmQwy1yAHIurZnpnAoMR+EJ5PUeq8p1EAypySpHtmT/d KngZsbCvSMydNdfJFwGaz3NFSYj05FlTmWNN+Ndq0JFqyMJQMgY2qKDVmg3pWKcv TLKTNRo9fJFUVhhBIyIoMl2hE36M6Ac3Qd4dUb5J+Fn834QDXgOzVzUjBtkmbSHD kJ4gbSs2Ic6QsYWtt70RlyRdreBYegA4C3z1cZV6DDQYxp5Jz2oqXYYC31Ro520A swuI5y9HMct4mOxqPUqf1lhbvsmkjuZ5Iog6P7W+mOtYHXZIzY8F61sv9YAis9/5 XNOHkg9Cn/n8C2RRQ8vnq0FEI1g7se1UGbe/1NkD4xeR/bzhE/AZSoOrRE7G/XJx qbF9FkPzd4OXYB2Pdm37G1BWsfN4M1bY1rOmmCyMKym793+b/jM7xdoZY1QfbabP uKiavuK8RYgqxrEilhP0asvafKjpZaJbn2R3jwHZgQDWe7WH5FhXwX2UcUpQsTan XZd+/cWaYXjLsKJbiAzy3UArgnzSrHPSpwIOkYq8Lf8EvPgS2g3LLJYbw250Cf1G 74stwoK4PgZ3e6k0nkMk43x1swKb13Gp0vCZjVdnIec9EQgOHfI= =X8iC -----END PGP SIGNATURE----- Merge tag 'kvmarm-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 7.0 - Add support for FEAT_IDST, allowing ID registers that are not implemented to be reported as a normal trap rather than as an UNDEF exception. - Add sanitisation of the VTCR_EL2 register, fixing a number of UXN/PXN/XN bugs in the process. - Full handling of RESx bits, instead of only RES0, and resulting in SCTLR_EL2 being added to the list of sanitised registers. - More pKVM fixes for features that are not supposed to be exposed to guests. - Make sure that MTE being disabled on the pKVM host doesn't give it the ability to attack the hypervisor. - Allow pKVM's host stage-2 mappings to use the Force Write Back version of the memory attributes by using the "pass-through' encoding. - Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the guest. - Preliminary work for guest GICv5 support. - A bunch of debugfs fixes, removing pointless custom iterators stored in guest data structures. - A small set of FPSIMD cleanups. - Selftest fixes addressing the incorrect alignment of page allocation. - Other assorted low-impact fixes and spelling fixes.	2026-02-09 18:18:19 +01:00
Arnaldo Carvalho de Melo	3f5dfa472e	tools build: Fix rust feature detection Features in FEATURE_TESTS_BASIC will be set as being available if test-all.c builds, so since the rust test isn't included in test-all.c, we can't have 'rust' in there, remove it from FEATURE_TESTS_BASIC and use feature-check so that it tries to build test-rust.bin, doing the actual feature detection. On a system lacking a rust compiler: Makefile.config:1158: Rust is not found. Test workloads with rust are disabled. Auto-detecting system features: ... libdw: [ on ] ... glibc: [ on ] ... libelf: [ on ] ... libnuma: [ on ] ... numa_num_possible_cpus: [ on ] ... libpython: [ on ] ... libcapstone: [ on ] ... llvm-perf: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... bpf: [ on ] ... libaio: [ on ] ... libzstd: [ on ] ... libopenssl: [ on ] ... rust: [ OFF ] $ cat /tmp/build/perf-tools-next/feature/test-rust.make.output /bin/sh: line 1: rustc: command not found $ file /tmp/build/perf-tools-next/feature/test-rust.bin /tmp/build/perf-tools-next/feature/test-rust.bin: cannot open `/tmp/build/perf-tools-next/feature/test-rust.bin' (No such file or directory) $ $ perf -vv \| grep RUST rust: [ OFF ] # HAVE_RUST_SUPPORT $ And after installing it: ... rust: [ on ] $ cat /tmp/build/perf-tools-next/feature/test-rust.make.output $ file /tmp/build/perf-tools-next/feature/test-rust.bin /tmp/build/perf-tools-next/feature/test-rust.bin: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=9c416edf673ee3705b97bae893a99a6fcf1ee258, for GNU/Linux 3.2.0, with debug_info, not stripped $ $ perf -vv \| grep RUST rust: [ on ] # HAVE_RUST_SUPPORT $ Fixes: `6a32fa5ccd` ("tools build: Add a feature test for rust compiler") Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-09 11:10:56 -03:00
Dmitrii Dolgov	335047109d	perf tests: Test annotate with data type profiling and C Exercise the annotate command with data type profiling feature with C. For that extend the existing data type profiling shell test to profile the datasym workload, then annotate the result expecting to see some data structures from the C code. Committer testing: root@number:~# perf test 'perf data type profiling tests' 83: perf data type profiling tests : Ok root@number:~# perf test -vv 'perf data type profiling tests' 83: perf data type profiling tests: --- start --- test child forked, pid 125028 Basic Rust perf annotate test Basic annotate test [Success] Pipe Rust perf annotate test Pipe annotate test [Success] Basic C perf annotate test Basic annotate test [Success] Pipe C perf annotate test Pipe annotate test [Success] ---- end(0) ---- 83: perf data type profiling tests : Ok root@number:~# Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-08 19:16:28 -03:00
Dmitrii Dolgov	f60a5c2296	perf tests: Test annotate with data type profiling and rust Exercise the annotate command with data type profiling feature on the rust runtime. For that add a new shell test, which will profile the code_with_type workload, then annotate the result expecting to see some data structures from the rust code. Committer testing: root@number:~# perf test 'perf data type profiling tests' 83: perf data type profiling tests : Ok root@number:~# perf test -v 'perf data type profiling tests' 83: perf data type profiling tests : Ok root@number:~# perf test -vv 'perf data type profiling tests' 83: perf data type profiling tests: --- start --- test child forked, pid 111044 Basic perf annotate test Basic annotate test [Success] Pipe perf annotate test Pipe annotate test [Success] ---- end(0) ---- 83: perf data type profiling tests : Ok root@number:~# Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-08 19:16:28 -03:00
Dmitrii Dolgov	2e05bb52a1	perf test workload: Add code_with_type test workload The purpose of the workload is to gather samples of rust runtime. To achieve that it has a dummy rust library linked with it. Per recommendations for such scenarios [1], the rust library is statically linked. An example: $ perf record perf test -w code_with_type [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.160 MB perf.data (4074 samples) ] $ perf report --stdio --dso perf -s srcfile,srcline 45.16% ub_checks.rs ub_checks.rs:72 6.72% code_with_type.rs code_with_type.rs:15 6.64% range.rs range.rs:767 4.26% code_with_type.rs code_with_type.rs:21 4.23% range.rs range.rs:0 3.99% code_with_type.rs code_with_type.rs:16 [...] [1]: https://doc.rust-lang.org/reference/linkage.html#mixed-rust-and-foreign-codebases Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-08 19:16:24 -03:00
Dmitrii Dolgov	6a32fa5ccd	tools build: Add a feature test for rust compiler Add a feature test to identify if the rust compiler is available, so that perf could build rust based worloads based on that. Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-08 11:30:45 -03:00
Ian Rogers	ff9aeb6bd1	perf test parse-metric: Ensure aggregate counts appear to have run Commit `bb5a920b90` ("perf stat: Ensure metrics are displayed even with failed events") with failed events") made it so that counters which weren't enabled in the kernel were handled as NaN in metrics. This caused the "Parse and process metrics" test to start failing as it wasn't putting a non-zero value in these variables. Add arbitrary values of 1 to fix the test. Fixes: `bb5a920b90` ("perf stat: Ensure metrics are displayed even with failed events") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-08 11:30:01 -03:00
Ian Rogers	c60ee958d6	perf test record.sh: Fix shellcheck warning Add quotes to avoid the following warning: ``` In tests/shell/record.sh line 264: [ $(uname -m) = "s390x" ] && { ^---------^ SC2046 (warning): Quote this to prevent word splitting. For more information: https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt... ``` Fixes: `c73a56ed3c` ("perf test: Fix test case Leader sampling on s390") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-08 11:29:58 -03:00
Ian Rogers	36a1b0061a	perf build: Reduce pmu-events related copying and mkdirs When building to an output directory the previous code would remove files and then copy the source files over. Each source file copy would have a rule to make its directory. All JSON for every architecture was considered a source file. This led to unnecessary copying as a file would be deleted and then the same file copied again, unnecessary directory making, and copying of files not used in the build. A side-effect would be a lot of build messages. This change makes it so that all computed output files are created and then compared to all files in the OUTPUT directory. By filtering out the files that would be copied, unnecessary files can be determined and then deleted - note, this is a phony target which would remake the pmu-events.c if always depended upon, and so the dependency is conditional on there being files to remove. This has some overhead as the $(OUTPUT)/pmu-events is "find" over rather than just "rm -fr", but the savings from unnecessary copying, etc. should make up for this new make overhead. The copy target just does copying but has a dependency on the directory it needs being built, avoiding repetitive mkdirs. The source files for copying only consider the JEVENTS_ARCH unless the JEVENTS_ARCH is all. The metric JSON is only generated if appropriate, rather than always being generated and jevents.py deciding whether or not to use the files. The mypy and pylint targets are fixed as variable names had changed but the rules not updated. The line count of a build with "make -C tools/perf O=/tmp/perf clean all" prior to this change was 2181 lines, after this change it is 1596 lines. This is a reduction of 585 lines or about 27%. The generated pmu-events.c for JEVENTS_ARCH "x86" and "all" were validated as being identical after this change. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 19:11:36 -03:00
Tycho Andersen (AMD)	4479884d1f	perf lock contention: fix segfault in `lock contention -b/--use-bpf` When run on a kernel without BTF info, perf crashes: libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled? libbpf: failed to find valid kernel BTF Program received signal SIGSEGV, Segmentation fault. 0x00005555556915b7 in btf.type_cnt () (gdb) bt #0 0x00005555556915b7 in btf.type_cnt () #1 0x0000555555691fbc in btf_find_by_name_kind () #2 0x00005555556920d0 in btf.find_by_name_kind () #3 0x00005555558a1b7c in init_numa_data (con=0x7fffffffd0a0) at util/bpf_lock_contention.c:125 #4 0x00005555558a264b in lock_contention_prepare (con=0x7fffffffd0a0) at util/bpf_lock_contention.c:313 #5 0x0000555555620702 in __cmd_contention (argc=0, argv=0x7fffffffea10) at builtin-lock.c:2084 #6 0x0000555555622c8d in cmd_lock (argc=0, argv=0x7fffffffea10) at builtin-lock.c:2755 #7 0x0000555555651451 in run_builtin (p=0x555556104f00 <commands+576>, argc=3, argv=0x7fffffffea10) at perf.c:349 #8 0x00005555556516ed in handle_internal_command (argc=3, argv=0x7fffffffea10) at perf.c:401 #9 0x000055555565184e in run_argv (argcp=0x7fffffffe7fc, argv=0x7fffffffe7f0) at perf.c:445 #10 0x0000555555651b9f in main (argc=3, argv=0x7fffffffea10) at perf.c:553 Check if btf loading failed, and don't do anything with it in init_numa_data(). This leads to the following error message, instead of just a crash: libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled? libbpf: failed to find valid kernel BTF libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled? libbpf: failed to find valid kernel BTF libbpf: Error loading vmlinux BTF: -ESRCH libbpf: failed to load BPF skeleton 'lock_contention_bpf': -ESRCH Failed to load lock-contention BPF skeleton lock contention BPF setup failed Signed-off-by: Tycho Andersen (AMD) <tycho@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 18:57:26 -03:00
Ricky Ringler	920c5570a6	perf sort: Replace static cacheline size with sysconf cacheline size Testing: - Built perf - Executed perf mem record and report Committer notes: This addresses a TODO and improves the situation where record and report/c2c are performed on the same machine or in machines with the same cacheline size, but the proper way is to store the cacheline size in the perf.data header at 'record' time and then use it at post processing time. Signed-off-by: Ricky Ringler <ricky.ringler@proton.me> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20260129004223.26799-1-ricky.ringler@proton.me Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 18:51:15 -03:00
Thomas Richter	c73a56ed3c	perf test: Fix test case Leader sampling on s390 The subtest 'Leader sampling' some time fails on s390. - for z/VM guest: Disable the test for z/VM guest. There is no CPU Measurement facility to run the test successfully. - for LPAR: Use correct event names. A detailed analysis follows here: Now to the debugging and investigation: 1. With command perf record -e '{cycles,cycles}:S' -- .... the first cycles event starts sampling. On s390 this sets up sampling with a frequency of 4000 Hz. This translates to hardware sample rate of 1377000 instructions per micro-second to meet a frequency of 4000 HZ. 2. With first event cycles now sampling into a hardware buffer, an interrupt is triggered each time a sampling buffer gets full. The interrupt handler is then invoked and debug output shows the processing of samples. The size of one hardware sample is 32 bytes. With an interrupt triggered when the hardware buffer page of 4KB gets full, the interrupt handler processes 128 samples. (This is taken from s390 specific fast debug data gathering) 2025-11-07 14:35:51.977248 000003ffe013cbfa \ perf_event_count_update event->count 0x0 count 0x1502e8 2025-11-07 14:35:51.977248 000003ffe013cbfa \ perf_event_count_update event->count 0x1502e8 count 0x1502e8 2025-11-07 14:35:51.977248 000003ffe013cbfa \ perf_event_count_update event->count 0x2a05d0 count 0x1502e8 2025-11-07 14:35:51.977252 000003ffe013cbfa \ perf_event_count_update event->count 0x3f08b8 count 0x1502e8 2025-11-07 14:35:51.977252 000003ffe013cbfa \ perf_event_count_update event->count 0x540ba0 count 0x1502e8 2025-11-07 14:35:51.977253 000003ffe013cbfa \ perf_event_count_update event->count 0x690e88 count 0x1502e8 2025-11-07 14:35:51.977254 000003ffe013cbfa \ perf_event_count_update event->count 0x7e1170 count 0x1502e8 2025-11-07 14:35:51.977254 000003ffe013cbfa \ perf_event_count_update event->count 0x931458 count 0x1502e8 2025-11-07 14:35:51.977254 000003ffe013cbfa \ perf_event_count_update event->count 0xa81740 count 0x1502e8 3. The value is constantly increasing by the number of instructions executed to generate a sample entry. This is the first line of the pairs of lines. count 0x1502e8 --> 1377000 # perf script \| grep 1377000 \| wc -l 214 # perf script \| wc -l 428 # That is 428 lines in total, and half of the lines contain value 1377000. 4. The second event cycles is opened against the counting PMU, which is an independent PMU and is not interrupt driven. Once enabled it runs in the background and keeps running, incrementing silently about 400+ counters. The counter values are read via assembly instructions. This second counter PMU's read call back function is called when the interrupt handler of the sampling facility processes each sample. The function call sequence is: perf_event_overflow() +--> __perf_event_overflow() +--> __perf_event_output() +--> perf_output_sample() +--> perf_output_read() +--> perf_output_read_group() for_each_sibling_event(sub, leader) { values[n++] = perf_event_count(sub, self); printk("%s sub %p values %#lx\n", __func__, sub, values[n-1]); } The last function perf_event_count() is invoked on the second event cylces on the counting PMU. An added printk statement shows the following lines in the dmesg output: # dmesg\|grep perf_output_read_group \|head -10 [ 332.368620] perf_output_read_group sub 00000000d80b7c1f values 0x3a80917 (1) [ 332.368624] perf_output_read_group sub 00000000d80b7c1f values 0x3a86c7f (2) [ 332.368627] perf_output_read_group sub 00000000d80b7c1f values 0x3a89c15 (3) [ 332.368629] perf_output_read_group sub 00000000d80b7c1f values 0x3a8c895 (4) [ 332.368631] perf_output_read_group sub 00000000d80b7c1f values 0x3a8f569 (5) [ 332.368633] perf_output_read_group sub 00000000d80b7c1f values 0x3a9204b [ 332.368635] perf_output_read_group sub 00000000d80b7c1f values 0x3a94790 [ 332.368637] perf_output_read_group sub 00000000d80b7c1f values 0x3a9704b [ 332.368638] perf_output_read_group sub 00000000d80b7c1f values 0x3a99888 # This correlates with the output of # perf report -D \| grep 'id 00000000000000'\|head -10 ..... id 0000000000000006, value 00000000001502e8, lost 0 ..... id 000000000000000e, value 0000000003a80917, lost 0 --> line (1) above ..... id 0000000000000006, value 00000000002a05d0, lost 0 ..... id 000000000000000e, value 0000000003a86c7f, lost 0 --> line (2) above ..... id 0000000000000006, value 00000000003f08b8, lost 0 ..... id 000000000000000e, value 0000000003a89c15, lost 0 --> line (3) above ..... id 0000000000000006, value 0000000000540ba0, lost 0 ..... id 000000000000000e, value 0000000003a8c895, lost 0 --> line (4) above ..... id 0000000000000006, value 0000000000690e88, lost 0 ..... id 000000000000000e, value 0000000003a8f569, lost 0 --> line (5) above Summary: - Above command starts the CPU sampling facility, with runs interrupt driven when a 4KB page is full. An interrupt processes the 128 samples and calls eventually perf_output_read_group() for each sample to save it in the event's ring buffer. - At that time the CPU counting facility is invoked to read the value of the event cycles. This value is saved as the second value in the sample_read structure. - The first and odd lines in the perf script output displays the period value between 2 samples being created by hardware. It is the number of instructions executes before the hardware writes a sample. - The second and even lines in the perf script output displays the number of CPU cycles needed to process each sample and save it in the event's ring buffer. These 2 different values can never be identical on s390. Since event leader sampling is not possible on s390 the perf tool will return EOPNOTSUPP soon. Perpare the test case for that. Suggested-by: James Clark <james.clark@linaro.org> Reviewed-by: Jan Polensky <japo@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Tested-by: Jan Polensky <japo@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 18:40:24 -03:00
Namhyung Kim	64ea7a4620	perf annotate: Fix register usage in data type profiling On data type profiling, it tried to match register name with a partial string. For example, it allowed to match with "%rbp)" or "%rdi,8)". But with recent change in the area, it doesn't match anymore and break the data type profiling. Let's pass the correct register name by removing the unwanted part. Add arch__dwarf_regnum() to handle it in a single place. Closes: 7d3n23li6drroxrdlpxn7ixehdeszkjdftah3zyngjl2qs22ef@yelcjv53v42o Reported-by: Dmitry Dolgov <9erthalion6@gmail.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zecheng Li <zli94@ncsu.edu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 18:18:52 -03:00
Chun-Tse Shao	bb5a920b90	perf stat: Ensure metrics are displayed even with failed events Currently, `perf stat` skips or hides metrics when the underlying hardware events cannot be counted (e.g., due to insufficient permissions or unsupported events). In `--metric-only` mode, this often results in missing columns or blank spaces, making the output difficult to parse. Modify the logic to ensure metrics are consistently displayed by propagating NAN (Not a Number) through the expression evaluator. Specifically: 1. Update `prepare_metric()` in stat-shadow.c to treat uncounted events (where `run == 0`) as NAN. This leverages the existing math in expr.y to propagate NAN through metric expressions. 2. Remove the early return in the display logic's `printout()` function that was previously skipping metrics in `--metric-only` mode for failed events. l 3. Simplify `perf_stat__skip_metric_event()` to no longer depend on event runtime. Tested: 1. `perf all metrics test` did not crash while paranoid is 2. 2. Multiple combinations with `CPUs_utilized` while paranoid is 2. $ ./perf stat -M CPUs_utilized -a -- sleep 1 Performance counter stats for 'system wide': <not supported> msec cpu-clock:u # nan CPUs CPUs_utilized 1,006,356,120 duration_time 1.004375550 seconds time elapsed $ ./perf stat -M CPUs_utilized -a -j -- sleep 1 {"counter-value" : "<not supported>", "unit" : "msec", "event" : "cpu-clock:u", "event-runtime" : 0, "pcnt-running" : 100.00, "metric-value" : "nan", "metric-unit" : "CPUs CPUs_utilized"} {"counter-value" : "1006642462.000000", "unit" : "", "event" : "duration_time", "event-runtime" : 1, "pcnt-running" : 100.00} $ ./perf stat -M CPUs_utilized -a --metric-only -- sleep 1 Performance counter stats for 'system wide': CPUs CPUs_utilized nan 1.004424652 seconds time elapsed $ ./perf stat -M CPUs_utilized -a --metric-only -j -- sleep 1 {"CPUs CPUs_utilized" : "none"} Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Chun-Tse Shao <ctshao@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 18:18:32 -03:00
Ian Rogers	446c595dc0	perf test addr2line_inlines: Ensure inline information shows on LBR leaves Expand the addr2line inline function testing to also run for an LBR callchain, skipping if LBR support isn't present. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 18:11:31 -03:00
Ian Rogers	04f81f45b4	perf callchain lbr: Make the leaf IP that of the sample The current IP of a leaf function when reported from a perf record with "--call-graph lbr" is the "to" field of the LBR branch stack record. The sample for the event being recorded may be further into the function and there may be inlining information associated with it. Rather than use the branch stack "to" field in this case switch to the callchain appending the sample->ip and thereby allowing the inline information to show. Before this change: ``` $ perf record --call-graph lbr perf test -w inlineloop ... $ perf script --fields +srcline ... perf-inlineloop 467586 4649.344493: 950905 cpu_core/cycles/P: 55dfda2829c0 parent+0x0 (perf) inlineloop.c:31 55dfda282a96 inlineloop+0x86 (perf) inlineloop.c:47 55dfda236420 run_workload+0x59 (perf) builtin-test.c:715 55dfda236b03 cmd_test+0x413 (perf) builtin-test.c:825 ... ``` After this change: ``` $ perf record --call-graph lbr perf test -w inlineloop ... $ perf script --fields +srcline ... perf-inlineloop 529703 11878.680815: 950905 cpu_core/cycles/P: 555ce86be9e6 leaf+0x26 inlineloop.c:20 (inlined) 555ce86be9e6 middle+0x26 inlineloop.c:27 (inlined) 555ce86be9e6 parent+0x26 (perf) inlineloop.c:32 555ce86bea96 inlineloop+0x86 (perf) inlineloop.c:47 555ce8672420 run_workload+0x59 (perf) builtin-test.c:715 555ce8672b03 cmd_test+0x413 (perf) builtin-test.c:825 ... ``` Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 18:10:44 -03:00
Leo Yan	a724a8fce5	perf kvm stat: Fix build error Since commit `ceea279f93` ("perf kvm stat: Remove use of the arch directory"), a native build on Arm64 machine reports: util/kvm-stat-arch/kvm-stat-x86.c:7:10: fatal error: asm/svm.h: No such file or directory 7 \| #include <asm/svm.h> \| ^~~~~~~~~~~ compilation terminated. The build fails to find x86's asm headers when building for Arm64. Fix this by including asm headers with relative path instead. Fixes: `ceea279f93` ("perf kvm stat: Remove use of the arch directory") Signed-off-by: Leo Yan <leo.yan@arm.com> Link: https://lore.kernel.org/r/20260206-perf_fix_kvm_stat_error-v1-1-ad40115876be@arm.com Cc: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 18:07:05 -03:00
Dapeng Mi	e5e66adfe4	perf regs: Remove __weak attributive arch_sdt_arg_parse_op() function In line with the previous patch, the __weak arch_sdt_arg_parse_op() function is removed. Architectural-specific implementations in the arch/ directory are now converted into sub-functions within the util/perf-regs-arch/ directory. The perf_sdt_arg_parse_op() function will call these sub-functions based on the EM_HOST. This change enables cross-architecture calls to arch_sdt_arg_parse_op(). No functional changes are intended. Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Guo Ren <guoren@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Xudong Hao <xudong.hao@intel.com> Cc: Zide Chen <zide.chen@intel.com> [ Fixed up somme fuzz with powerpc and x86 Build files wrt removing perf_regs.o ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 12:16:12 -03:00
Dapeng Mi	16dccbb842	perf regs: Remove __weak attributive arch__xxx_reg_mask() functions Currently, some architecture-specific perf-regs functions, such as arch__intr_reg_mask() and arch__user_reg_mask(), are defined with the __weak attribute. This approach ensures that only functions matching the architecture of the build/run host are compiled and executed, reducing build time and binary size. However, this __weak attribute restricts these functions to be called only on the same architecture, preventing cross-architecture functionality. For example, a perf.data file captured on x86 cannot be parsed on an ARM platform. To address this limitation, this patch removes the __weak attribute from these perf-regs functions. The architecture-specific code is moved from the arch/ directory to the util/perf-regs-arch/ directory. The appropriate architectural functions are then called based on the EM_HOST. No functional changes are intended. Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Guo Ren <guoren@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Xudong Hao <xudong.hao@intel.com> Cc: Zide Chen <zide.chen@intel.com> [ Fixed up somme fuzz with s390 and riscv Build files wrt removing perf_regs.o ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 12:16:09 -03:00
Dapeng Mi	e716e69cf6	perf arch: Update arch headers to use relative UAPI paths The architectural specific headers perf_regs.h currently rely on the host architecture's 'asm/perf_regs.h'. This can lead to compilation inconsistencies or failures when including and building perf for a target architecture that differs from the host's architecture. Explicitly point to the UAPI headers within the tools source tree using relative paths. This ensures that perf is always built against the intended architecture. No functional changes are intended. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Guo Ren <guoren@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Xudong Hao <xudong.hao@intel.com> Cc: Zide Chen <zide.chen@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 12:15:48 -03:00
Dapeng Mi	c2e28ae294	perf regs: Fix abort for "-I" or "--user-regs" options Fix an issue where the `perf` tool aborts unexpectedly when running the following command: ``` perf record -e cycles -I -- true Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -I, --intr-regs[=<any register>] sample selected machine registers on interrupt, use '-I?' to list register names ``` The usage of the `-I` or `--user-regs` options without specifying any registers should default to sampling all general-purpose registers. However, this currently causes an abnormal termination. The issue was introduced by commit `3d06db9bad` ("perf regs: Refactor use of arch__sample_reg_masks() to perf_reg_name()"). This patch resolves the problem, ensuring that the `-I` or `--user-regs` options work as intended without causing an abort. Fixes: `3d06db9bad` ("perf regs: Refactor use of arch__sample_reg_masks() to perf_reg_name()") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Guo Ren <guoren@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-csky@vger.kernel.org Cc: linux-riscv@lists.infradead.org Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Xudong Hao <xudong.hao@intel.com> Cc: Zide Chen <zide.chen@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 12:15:22 -03:00
Ian Rogers	cee275edcd	perf metricgroup: Don't early exit if no CPUID table exists The failure to find a table of metrics with a CPUID shouldn't early exit as the metric code will now also consider the default table. When searching for a metric or metric group, pmu_metrics_table__for_each_metric() considers all tables and so the caller doesn't need to switch the table to do this. Fixes: `c7adeb0974` ("perf jevents: Add set of common metrics based on default ones") Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Leo Yan <leo.yan@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 11:58:51 -03:00
Ian Rogers	f637bb2eed	perf tests: build-test coverage for NO_JEVENTS=1 Leo reported 'perf stat' being broken and this highlighted that the 'make NO_JEVENTS=1' variant is missing from 'make -C tools/perf build-test', add it. Closes: https://lore.kernel.org/linux-perf-users/20260205175250.GC3529712@e132581.arm.com/ Reported-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 11:56:54 -03:00
Ian Rogers	1d9622c3c1	perf tests: Additional 'perf stat' tests Recently 'perf stat' regressed in per CPU mode [1]. Let's expand test coverage to catch the same breakage again as well as to test the repeat, pid, detailed and no aggregation options. [1] https://lore.kernel.org/linux-perf-users/cgja46br2smmznxs7kbeabs6zgv3b4olfqgh2fdp5mxk2yom4v@w6jjgov6hdi6/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 11:55:21 -03:00
Leo Yan	a108a6a4b9	perf record: Make logs more readable for event open failures Since commit `ee27476fa3` ("perf record: Skip don't fail for events that don't open"), if a user does not have permission to access a PMU event, perf reports: perf record -e cs_etm// -C 3 -- ls Error: Failure to open event 'cs_etm//u' on PMU 'cs_etm' which will be removed. No fallback found for 'cs_etm//u' for error 13 Error: Failure to open event 'dummy:u' on PMU 'software' which will be removed. No fallback found for 'dummy:u' for error 13 Error: Failure to open any events for recording. The log is not very helpful, as no clear indication of what "error 13" means or how to address the issue. This commit restores evsel__open_strerror() to generate a readable error message and print it out: perf record -e cs_etm// -C 3 -- ls Error: Failure to open event 'cs_etm//' on PMU 'cs_etm' which will be removed. Access to performance monitoring and observability operations is limited. Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open access to performance monitoring and observability operations for processes without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability. More information can be found at 'Perf events and tool security' document: https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html perf_event_paranoid setting is 1: -1: Allow use of (almost) all events by all users Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK >= 0: Disallow raw and ftrace function tracepoint access >= 1: Disallow CPU event access >= 2: Disallow kernel profiling To make the adjusted perf_event_paranoid setting permanent preserve it in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>) Error: Failure to open event 'dummy:u' on PMU 'software' which will be removed. Access to performance monitoring and observability operations is limited. Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open access to performance monitoring and observability operations for processes without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability. More information can be found at 'Perf events and tool security' document: https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html perf_event_paranoid setting is 1: -1: Allow use of (almost) all events by all users Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK >= 0: Disallow raw and ftrace function tracepoint access >= 1: Disallow CPU event access >= 2: Disallow kernel profiling To make the adjusted perf_event_paranoid setting permanent preserve it in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>) Error: Failure to open any events for recording. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2026-02-06 11:53:10 -03:00

1 2 3 4 5 ...

18420 commits