linux/include
Ankit Agrawal 2ec4196718 mm: handle poisoning of pfn without struct pages
Poison (or ECC) errors can be very common on a large cluster.  The kernel
MM currently does not handle ECC errors / poison on a memory region that
is not backed by struct pages.  If a memory region is mapped using
remap_pfn_range(), for example, but not added to the kernel, MM will not
have struct pages associated with it.  Add a new mechanism to handle
memory failure on such memory.

Make kernel MM expose a function to allow modules managing the device
memory to register the device memory SPA and the address space associated
with it.  MM maintains this information as an interval tree.  On poison,
MM can search for the range that the poisoned PFN belongs to and use the
address_space to determine the mapping VMA.
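
The registration and lookup described above can be sketched in user space
as follows.  This is only an illustration under stated assumptions: the
names pfn_range, register_range() and find_range() are hypothetical and
are not the functions added by this patch, the void pointer stands in for
the address_space, and a small array with a linear scan replaces the
interval tree to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; these names are not the kernel's API. */
struct pfn_range {
	uint64_t start_pfn;	/* first PFN of the registered device memory */
	uint64_t last_pfn;	/* last PFN of the range, inclusive */
	void *mapping;		/* stands in for the address_space */
};

#define MAX_RANGES 16
static struct pfn_range ranges[MAX_RANGES];
static size_t nr_ranges;

/*
 * Record a range.  Returns 0 on success, -1 if the range is empty, wraps
 * around, overlaps an existing range, or the table is full.
 */
int register_range(uint64_t start_pfn, uint64_t nr_pages, void *mapping)
{
	uint64_t last_pfn = start_pfn + nr_pages - 1;
	size_t i;

	assert(mapping != NULL);	/* a range without a mapping is a caller bug */

	if (nr_pages == 0 || last_pfn < start_pfn || nr_ranges == MAX_RANGES)
		return -1;

	for (i = 0; i < nr_ranges; i++)
		if (start_pfn <= ranges[i].last_pfn &&
		    ranges[i].start_pfn <= last_pfn)
			return -1;

	ranges[nr_ranges].start_pfn = start_pfn;
	ranges[nr_ranges].last_pfn = last_pfn;
	ranges[nr_ranges].mapping = mapping;
	nr_ranges++;
	return 0;
}

/* Find the registered range containing pfn, or NULL if there is none. */
const struct pfn_range *find_range(uint64_t pfn)
{
	size_t i;

	for (i = 0; i < nr_ranges; i++)
		if (pfn >= ranges[i].start_pfn && pfn <= ranges[i].last_pfn)
			return &ranges[i];
	return NULL;
}
```

A poison handler in this model would call find_range() with the poisoned
PFN and, if a range is returned, use its mapping field to locate the
processes that map it.  An interval tree gives the same answer with
logarithmic rather than linear lookup cost.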

In this implementation, kernel MM uses the following sequence, which is
largely similar to the memory_failure() handler for struct page backed
memory:

1. memory_failure() is triggered on reception of a poison error.  An
   absence of struct page is detected and consequently
   memory_failure_pfn() is executed.

2. memory_failure_pfn() collects the processes mapped to the PFN.

3. memory_failure_pfn() sends SIGBUS to all the processes mapping the
   faulty PFN using kill_procs().
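
The three steps above can be modelled in user space as follows.  This is
a sketch under stated assumptions, not the kernel code: task_model,
collect_tasks(), signal_tasks() and handle_poison() are hypothetical
names, each task is assumed to map a single PFN range, and sending SIGBUS
is modelled by setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a process that maps one range of PFNs. */
struct task_model {
	int pid;
	uint64_t start_pfn;	/* first mapped PFN */
	uint64_t last_pfn;	/* last mapped PFN, inclusive */
	bool got_sigbus;	/* set when the model "sends" SIGBUS */
};

#define MAX_TASKS 8

/* Step 2: collect the tasks whose mapping covers pfn. */
size_t collect_tasks(struct task_model *tasks, size_t nr_tasks, uint64_t pfn,
		     struct task_model **victims)
{
	size_t i, n = 0;

	for (i = 0; i < nr_tasks; i++)
		if (pfn >= tasks[i].start_pfn && pfn <= tasks[i].last_pfn)
			victims[n++] = &tasks[i];
	return n;
}

/* Step 3: signal every collected task.  Mappings are left in place. */
void signal_tasks(struct task_model **victims, size_t nr_victims)
{
	size_t i;

	for (i = 0; i < nr_victims; i++)
		victims[i]->got_sigbus = true;
}

/*
 * Step 1: dispatch on whether the PFN has a struct page.  Returns the
 * number of tasks signalled, or -1 when a struct page exists and the
 * regular path (not modelled here) would handle the error.
 */
int handle_poison(bool has_struct_page, struct task_model *tasks,
		  size_t nr_tasks, uint64_t pfn)
{
	struct task_model *victims[MAX_TASKS];
	size_t n;

	assert(nr_tasks <= MAX_TASKS);

	if (has_struct_page)
		return -1;

	n = collect_tasks(tasks, nr_tasks, pfn, victims);
	signal_tasks(victims, n);
	return (int)n;
}
```

In this model, a poisoned PFN that two of three tasks map results in
exactly those two tasks being flagged, while the third is untouched.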

Note that there is one primary difference from the handling of poison on
struct pages: the faulty PFN is not unmapped.  This is done to handle the
huge PFNMAP support added recently [1] that enables VM_PFNMAP vmas to map
at PMD or PUD level.  Poison on a PFN mapped in such a way would require
breaking the PMD/PUD mapping into PTEs that would get mirrored into the
stage-2 (S2) page tables.  This can greatly increase the cost of table
walks and have a major performance impact.
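
To give a sense of scale for such a split, the number of PTEs needed to
replace one block mapping can be computed as below.  The shift values are
an assumption (4 KiB base pages with 2 MiB PMD and 1 GiB PUD blocks) and
differ on other configurations; ptes_per_block() is a hypothetical helper
written for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry: 4 KiB pages, 2 MiB PMD blocks, 1 GiB PUD blocks. */
#define PAGE_SHIFT_4K	12
#define PMD_SHIFT_4K	21
#define PUD_SHIFT_4K	30

/* Number of page-sized entries needed to cover one block mapping. */
uint64_t ptes_per_block(unsigned int block_shift, unsigned int page_shift)
{
	assert(block_shift >= page_shift && block_shift - page_shift < 64);
	return UINT64_C(1) << (block_shift - page_shift);
}
```

Under these assumptions, ptes_per_block(PMD_SHIFT_4K, PAGE_SHIFT_4K) is
512 and ptes_per_block(PUD_SHIFT_4K, PAGE_SHIFT_4K) is 262144, so one
poisoned page would turn a single block entry into that many page-level
entries.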

Link: https://lore.kernel.org/all/20240826204353.2228736-1-peterx@redhat.com/ [1]
Link: https://lkml.kernel.org/r/20251102184434.2406-3-ankita@nvidia.com
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Cc: Aniket Agashe <aniketa@nvidia.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Kirti Wankhede <kwankhede@nvidia.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matthew R. Ochs <mochs@nvidia.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Neo Jia <cjia@nvidia.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuai Xue <xueshuai@linux.alibaba.com>
Cc: Smita Koralahalli Channabasappa <smita.koralahallichannabasappa@amd.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tarun Gupta <targupta@nvidia.com>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Cc: Vikram Sethi <vsethi@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zhi Wang <zhiw@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-16 17:28:29 -08:00
..
acpi More power management updates for 6.18-rc1 2025-10-07 09:39:51 -07:00
asm-generic mm: actually mark kernel page table pages 2025-11-16 17:28:17 -08:00
clocksource clocksource/drivers/arm_arch_timer_mmio: Switch over to standalone driver 2025-09-23 12:31:50 +02:00
crypto This update includes the following changes: 2025-10-04 14:59:29 -07:00
cxl
drm kbuild: Let kernel-doc.py use PYTHON3 override 2025-11-08 19:42:22 -07:00
dt-bindings There's a bunch of patches here across drivers/clk/ to migrate drivers to use 2025-10-07 09:28:37 -07:00
hyperv hyperv: Remove the spurious null directive line 2025-10-02 21:21:24 +00:00
keys KEYS: trusted_tpm1: Move private functionality out of public header 2025-09-27 21:05:06 +03:00
kunit linux_kselftest-kunit-6.18-rc1 2025-10-01 19:15:11 -07:00
kvm KVM: arm64: Kill leftovers of ad-hoc timer userspace access 2025-10-13 14:42:41 +01:00
linux mm: handle poisoning of pfn without struct pages 2025-11-16 17:28:29 -08:00
math-emu
media
memory
misc
net memcg: net: track network throttling due to memcg memory pressure 2025-11-16 17:28:06 -08:00
pcmcia
ras mm: handle poisoning of pfn without struct pages 2025-11-16 17:28:29 -08:00
rdma
rv kernel-6.18-rc1.clone3 2025-09-29 10:36:50 -07:00
scsi scsi: core: Fix the unit attention counter implementation 2025-10-21 21:09:36 -04:00
soc There's a bunch of patches here across drivers/clk/ to migrate drivers to use 2025-10-07 09:28:37 -07:00
sound ASoC: tas2781: Support more newly-released amplifiers tas58xx in the driver 2025-10-13 11:08:09 +01:00
target
trace trace: tcp: add three metrics to trace_tcp_rcvbuf_grow() 2025-10-29 17:30:18 -07:00
uapi drm fixes for 6.18-rc5 2025-11-07 14:51:11 -08:00
ufs scsi: ufs: core: Add a quirk to suppress link_startup_again 2025-10-29 23:20:19 -04:00
vdso Updates for the VDSO subsystem: 2025-09-30 16:58:21 -07:00
video
xen
Kbuild