linux/mm
Gavin Guo be6e843fc5 mm/huge_memory: fix dereferencing invalid pmd migration entry
When migrating a THP, concurrent access to the PMD migration entry during
a deferred split scan can lead to an invalid address access, as
illustrated below.  To prevent this invalid access, it is necessary to
check the PMD migration entry and return early.  In this context, there is
no need to use pmd_to_swp_entry and pfn_swap_entry_to_page to verify the
equality of the target folio.  Since the PMD migration entry is locked, it
cannot serve as the target.
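
The early-return logic described above can be modelled in user space. The sketch below is not the kernel code: `struct toy_pmd`, `enum pmd_kind` and `toy_should_split()` are hypothetical names invented for illustration. It assumes that a split requested without a target folio still proceeds on a migration entry. It only shows the order of the checks: a migration entry is rejected before the mapped folio is ever looked up, so the invalid dereference cannot happen.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel types; every name here is hypothetical. */
struct folio { int id; };

enum pmd_kind { PMD_NONE, PMD_TRANS_HUGE, PMD_MIGRATION };

struct toy_pmd {
    enum pmd_kind kind;
    struct folio *mapped;   /* only meaningful when kind == PMD_TRANS_HUGE */
};

/*
 * Returns true if the split would proceed, false if the function
 * would return early without touching the PMD.
 */
static bool toy_should_split(const struct toy_pmd *pmd,
                             const struct folio *target)
{
    bool is_migration = (pmd->kind == PMD_MIGRATION);

    /* Nothing to split unless the PMD is huge or a migration entry. */
    if (pmd->kind != PMD_TRANS_HUGE && !is_migration)
        return false;

    /*
     * With a target folio, check for a migration entry first.
     * The || short-circuits, so pmd->mapped is never read for a
     * migration entry, where it holds no valid folio.
     */
    if (target && (is_migration || pmd->mapped != target))
        return false;

    return true;
}
```

In this model a caller that passes a target folio gets an early return for any migration entry, and for a huge PMD that maps a different folio.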

Mailing list discussion and explanation from Hugh Dickins: "An anon_vma
lookup points to a location which may contain the folio of interest, but
might instead contain another folio: and weeding out those other folios is
precisely what the "folio != pmd_folio(*pmd)" check (and the "risk of
replacing the wrong folio" comment a few lines above it) is for."

BUG: unable to handle page fault for address: ffffea60001db008
CPU: 0 UID: 0 PID: 2199114 Comm: tee Not tainted 6.14.0+ #4 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:split_huge_pmd_locked+0x3b5/0x2b60
Call Trace:
<TASK>
try_to_migrate_one+0x28c/0x3730
rmap_walk_anon+0x4f6/0x770
unmap_folio+0x196/0x1f0
split_huge_page_to_list_to_order+0x9f6/0x1560
deferred_split_scan+0xac5/0x12a0
shrinker_debugfs_scan_write+0x376/0x470
full_proxy_write+0x15c/0x220
vfs_write+0x2fc/0xcb0
ksys_write+0x146/0x250
do_syscall_64+0x6a/0x120
entry_SYSCALL_64_after_hwframe+0x76/0x7e

The bug was found by syzkaller on an internal kernel, then confirmed on
upstream.

Link: https://lkml.kernel.org/r/20250421113536.3682201-1-gavinguo@igalia.com
Link: https://lore.kernel.org/all/20250414072737.1698513-1-gavinguo@igalia.com/
Link: https://lore.kernel.org/all/20250418085802.2973519-1-gavinguo@igalia.com/
Fixes: 84c3fc4e9c ("mm: thp: check pmd migration entry in common path")
Signed-off-by: Gavin Guo <gavinguo@igalia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Cc: Florent Revest <revest@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07 23:39:38 -07:00
..
damon mm/damon/core: simplify control flow in damon_register_ops() 2025-04-01 15:17:10 -07:00
kasan hardening fixes for v6.15-rc3 2025-04-18 13:20:20 -07:00
kfence kfence: skip __GFP_THISNODE allocations on NUMA systems 2025-02-01 03:53:26 -08:00
kmsan dma: kmsan: export kmsan_handle_dma() for modules 2025-03-05 21:36:14 -08:00
backing-dev.c treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
balloon_compaction.c balloon_compaction: update the NR_BALLOON_PAGES state 2025-03-21 22:03:13 -07:00
bootmem_info.c mm/sparse: allow for alternate vmemmap section init at boot 2025-03-16 22:06:27 -07:00
cma.c mm/cma: report base address of single range correctly 2025-04-11 17:32:39 -07:00
cma.h mm/cma: using per-CMA locks to improve concurrent allocation performance 2025-03-21 22:03:10 -07:00
cma_debug.c mm, cma: support multiple contiguous ranges, if requested 2025-03-16 22:06:25 -07:00
cma_sysfs.c mm/cma: export total and free number of pages for CMA areas 2025-03-16 22:06:24 -07:00
compaction.c mm/compaction: fix bug in hugetlb handling pathway 2025-04-11 17:32:36 -07:00
debug.c mm/debug: add line breaks 2025-03-17 22:07:05 -07:00
debug_page_alloc.c mm: page_alloc: consolidate free page accounting 2024-04-25 20:56:04 -07:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: Use pxdp_get() for accessing page table entries 2024-09-17 01:07:01 -07:00
dmapool.c
dmapool_test.c mm/dmapool: add MODULE_DESCRIPTION() 2024-07-03 19:29:58 -07:00
early_ioremap.c mm/early_ioremap: add null pointer checks to prevent NULL-pointer dereference 2025-01-13 22:40:59 -08:00
execmem.c execmem: add API for temporal remapping as RW and restoring ROX afterwards 2025-02-03 11:46:02 +01:00
fadvise.c fdget(), trivial conversions 2024-11-03 01:28:06 -05:00
fail_page_alloc.c fault-inject: improve build for CONFIG_FAULT_INJECTION=n 2024-09-01 20:43:33 -07:00
failslab.c fault-inject: improve build for CONFIG_FAULT_INJECTION=n 2024-09-01 20:43:33 -07:00
filemap.c mm: fix filemap_get_folios_contig returning batches of identical folios 2025-04-11 17:32:40 -07:00
folio-compat.c mm: Remove grab_cache_page_write_begin() 2025-03-04 17:02:25 +00:00
gup.c mm/gup: fix wrongly calculated returned value in fault_in_safe_writeable() 2025-04-17 20:10:07 -07:00
gup_test.c
gup_test.h
highmem.c mm/highmem: make nr_free_highpages() return "unsigned long" 2024-07-03 19:30:06 -07:00
hmm.c mm: allow compound zone device pages 2025-03-17 22:06:39 -07:00
huge_memory.c mm/huge_memory: fix dereferencing invalid pmd migration entry 2025-05-07 23:39:38 -07:00
hugetlb.c mm/hugetlb: add a line break at the end of the format string 2025-04-11 17:32:40 -07:00
hugetlb_cgroup.c page_counter: track failcnt only for legacy cgroups 2025-03-17 00:05:35 -07:00
hugetlb_cma.c mm/hugetlb: move hugetlb CMA code in to its own file 2025-03-16 22:06:31 -07:00
hugetlb_cma.h mm/hugetlb: move hugetlb CMA code in to its own file 2025-03-16 22:06:31 -07:00
hugetlb_vmemmap.c mm, hugetlb: increment the number of pages to be reset on HVO 2025-04-17 20:10:08 -07:00
hugetlb_vmemmap.h mm/hugetlb: do pre-HVO for bootmem allocated pages 2025-03-16 22:06:29 -07:00
hwpoison-inject.c mm/hwpoison: add MODULE_DESCRIPTION() 2024-07-03 19:29:58 -07:00
init-mm.c mm: replace vm_lock and detached flag with a reference count 2025-03-16 22:06:20 -07:00
internal.h mm/page_alloc: fix deadlock on cpu_hotplug_lock in __accept_page() 2025-04-17 20:10:05 -07:00
interval_tree.c
io-mapping.c
ioremap.c mm/ioremap: pass pgprot_t to ioremap_prot() instead of unsigned long 2025-03-16 22:06:23 -07:00
Kconfig Disable SLUB_TINY for build testing 2025-04-06 10:00:04 -07:00
Kconfig.debug mm: rename GENERIC_PTDUMP and PTDUMP_CORE 2025-03-17 00:05:32 -07:00
khugepaged.c mm: convert folio_likely_mapped_shared() to folio_maybe_mapped_shared() 2025-03-17 22:06:46 -07:00
kmemleak.c mm: kmemleak: add support for dumping physical and __percpu object info 2025-03-16 22:06:08 -07:00
ksm.c mm/ksm: handle device-exclusive entries correctly in write_protect_page() 2025-03-16 22:05:58 -07:00
list_lru.c mm/list_lru: make the case where mlru is NULL as unlikely 2025-03-17 00:05:32 -07:00
maccess.c kasan: migrate copy_user_test to kunit 2024-11-11 00:26:44 -08:00
madvise.c mm/madvise: remove len parameter of madvise_do_behavior() 2025-03-17 22:07:04 -07:00
Makefile mm: rename GENERIC_PTDUMP and PTDUMP_CORE 2025-03-17 00:05:32 -07:00
mapping_dirty_helpers.c
memblock.c mm/memblock: repeat setting reserved region nid if array is doubled 2025-04-07 09:28:01 +03:00
memcontrol-v1.c mm: memcontrol: fix swap counter leak from offline cgroup 2025-04-17 20:10:06 -07:00
memcontrol-v1.h memcg: move do_memsw_account() to CONFIG_MEMCG_V1 2025-03-21 22:03:11 -07:00
memcontrol.c locking/local_lock, mm: replace localtry_ helpers with local_trylock_t type 2025-04-11 17:32:35 -07:00
memfd.c mm/memfd: fix spelling and grammatical issues 2025-03-16 22:06:04 -07:00
memory-failure.c mm: memory-failure: enhance comments for return value of memory_failure() 2025-03-17 22:07:05 -07:00
memory-tiers.c memory tiers: use default_dram_perf_ref_source in log message 2024-09-26 14:01:44 -07:00
memory.c mm/memory: move sanity checks in do_wp_page() after mapcount vs. refcount stabilization 2025-04-17 20:10:08 -07:00
memory_hotplug.c mm/memory_hotplug: fix call folio_test_large with tail page in do_migrate_range 2025-04-01 15:17:12 -07:00
mempolicy.c - The 6 patch series "Enable strict percpu address space checks" from 2025-04-01 09:29:18 -07:00
mempool.c mm: fix xyz_noprof functions calling profiled functions 2024-06-05 19:19:26 -07:00
memremap.c device/dax: properly refcount device dax pages when mapping 2025-03-17 22:06:41 -07:00
memtest.c memtest: use {READ,WRITE}_ONCE in memory scanning 2024-03-13 12:12:21 -07:00
migrate.c mm/migrate: fix sleep in atomic for large folios and buffer heads 2025-04-22 18:16:08 +02:00
migrate_device.c - The 6 patch series "Enable strict percpu address space checks" from 2025-04-01 09:29:18 -07:00
mincore.c mm/mincore: improve performance by adding an unlikely hint 2025-03-16 22:06:32 -07:00
mlock.c mm: allow compound zone device pages 2025-03-17 22:06:39 -07:00
mm_init.c mm/page_alloc: fix deadlock on cpu_hotplug_lock in __accept_page() 2025-04-17 20:10:05 -07:00
mm_slot.h
mmap.c - The 6 patch series "Enable strict percpu address space checks" from 2025-04-01 09:29:18 -07:00
mmap_lock.c mm: mmap_lock: optimize mmap_lock tracepoints 2025-01-13 22:40:34 -08:00
mmu_gather.c mm/mmu_gather: update comment on RCU freeing 2025-03-16 22:06:12 -07:00
mmu_notifier.c mm: move internal core VMA manipulation functions to own file 2024-09-01 20:25:54 -07:00
mmzone.c mm: improve code consistency with zonelist_* helper functions 2024-09-01 20:25:55 -07:00
mprotect.c mm: convert folio_likely_mapped_shared() to folio_maybe_mapped_shared() 2025-03-17 22:06:46 -07:00
mremap.c mm/mremap: do not set vrm->vma NULL immediately prior to checking it 2025-04-01 15:17:09 -07:00
mseal.c mseal: remove can_do_mseal() 2025-01-13 22:40:51 -08:00
msync.c
nommu.c - The 6 patch series "Enable strict percpu address space checks" from 2025-04-01 09:29:18 -07:00
numa.c mm/memblock: add memblock_alloc_or_panic interface 2025-01-25 20:22:38 -08:00
numa_emulation.c mm/fake-numa: allow later numa node hotplug 2025-01-25 20:22:29 -08:00
numa_memblks.c mm/fake-numa: allow later numa node hotplug 2025-01-25 20:22:29 -08:00
oom_kill.c mm/oom_kill: fix trivial typo in comment 2025-03-16 22:05:55 -07:00
page-writeback.c treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
page_alloc.c mm: vmscan: restore high-cpu watermark safety in kswapd 2025-04-17 20:10:09 -07:00
page_counter.c page_counter: track failcnt only for legacy cgroups 2025-03-17 00:05:35 -07:00
page_ext.c mm: page_ext: add an iteration API for page extensions 2025-03-17 22:06:57 -07:00
page_frag_cache.c mm/page_alloc: export free_frozen_pages() instead of free_unref_page() 2025-01-13 22:40:31 -08:00
page_idle.c mm/page_idle: handle device-exclusive entries correctly in page_idle_clear_pte_refs_one() 2025-03-16 22:05:59 -07:00
page_io.c page_io: zswap: do not crash the kernel on decompression failure 2025-03-17 22:06:50 -07:00
page_isolation.c mm: page_isolation: avoid calling folio_hstate() without hugetlb_lock 2025-04-01 15:14:43 -07:00
page_owner.c - The 6 patch series "Enable strict percpu address space checks" from 2025-04-01 09:29:18 -07:00
page_poison.c mm/page_poison: replace kmap_atomic() with kmap_local_page() 2023-12-10 16:51:50 -08:00
page_reporting.c mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
page_reporting.h
page_table_check.c mm: page_table_check: use new iteration API 2025-03-17 22:06:57 -07:00
page_vma_mapped.c mm: make page_mapped_in_vma() hugetlb walk aware 2025-03-16 22:06:42 -07:00
pagewalk.c mm: pagewalk: add the ability to install PTEs 2024-11-11 00:26:44 -08:00
percpu-internal.h mm: remove CONFIG_MEMCG_KMEM 2024-07-10 12:14:54 -07:00
percpu-km.c
percpu-stats.c
percpu-vm.c percpu: clean up all mappings when pcpu_map_pages() fails 2024-04-25 20:55:49 -07:00
percpu.c - The 6 patch series "Enable strict percpu address space checks" from 2025-04-01 09:29:18 -07:00
pgalloc-track.h
pgtable-generic.c mm: add RCU annotation to pte_offset_map(_lock) 2024-12-18 19:04:43 -08:00
process_vm_access.c mm: refactor mm_access() to not return NULL 2024-11-05 16:56:23 -08:00
pt_reclaim.c mm: pgtable: reclaim empty PTE page in madvise(MADV_DONTNEED) 2025-01-13 22:40:48 -08:00
ptdump.c mm: ptdump: add check_wx_pages debugfs attribute 2024-02-22 10:24:47 -08:00
readahead.c Revert "fanotify: disable readahead if we have pre-content watches" 2025-03-13 16:31:12 +01:00
rmap.c mm: stop maintaining the per-page mapcount of large folios (CONFIG_NO_PAGE_MAPCOUNT) 2025-03-17 22:06:48 -07:00
rodata_test.c mm/rodata_test: verify test data is unchanged, rather than non-zero 2025-01-13 22:40:38 -08:00
secretmem.c add a string-to-qstr constructor 2025-01-27 19:25:45 -05:00
shmem.c - The 6 patch series "Enable strict percpu address space checks" from 2025-04-01 09:29:18 -07:00
shmem_quota.c shmem_quota: build the object file conditionally to the config option 2024-09-01 20:25:45 -07:00
show_mem.c meminfo: add a per node counter for balloon drivers 2025-03-21 22:03:13 -07:00
shrinker.c mm: shrinker: avoid memleak in alloc_shrinker_info 2024-10-31 20:27:04 -07:00
shrinker_debug.c mm/shrinker: fix name consistency issue in shrinker_debugfs_rename() 2025-03-17 00:05:40 -07:00
shuffle.c
shuffle.h mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER 2024-01-08 15:27:15 -08:00
slab.h Merge branch 'slab/for-6.15/kfree_rcu_tiny' into slab/for-next 2025-03-20 10:33:38 +01:00
slab_common.c A treewide hrtimer timer cleanup 2025-03-25 10:54:15 -07:00
slub.c mm, slab: clean up slab->obj_exts always 2025-04-24 19:19:40 +02:00
sparse-vmemmap.c mm/hugetlb: do pre-HVO for bootmem allocated pages 2025-03-16 22:06:29 -07:00
sparse.c drivers/base/memory: improve add_boot_memory_block() 2025-03-17 22:07:01 -07:00
swap.c - The 6 patch series "Enable strict percpu address space checks" from 2025-04-01 09:29:18 -07:00
swap.h - The 6 patch series "Enable strict percpu address space checks" from 2025-04-01 09:29:18 -07:00
swap_cgroup.c mm: swap_cgroup: remove double initialization of locals 2025-03-17 22:06:58 -07:00
swap_state.c mm, swap: simplify folio swap allocation 2025-03-16 22:06:44 -07:00
swapfile.c mm, swap: simplify folio swap allocation 2025-03-16 22:06:44 -07:00
truncate.c mm/truncate: use folio_split() in truncate operation 2025-03-17 22:07:00 -07:00
usercopy.c mm: security: Check early if HARDENED_USERCOPY is enabled 2025-02-28 11:51:31 -08:00
userfaultfd.c mm/vma: add give_up_on_oom option on modify/merge, use in uffd release 2025-04-11 17:32:37 -07:00
util.c Summary 2025-03-26 21:02:05 -07:00
vma.c mm/vma: add give_up_on_oom option on modify/merge, use in uffd release 2025-04-11 17:32:37 -07:00
vma.h mm/vma: add give_up_on_oom option on modify/merge, use in uffd release 2025-04-11 17:32:37 -07:00
vma_internal.h mm/vma: move brk() internals to mm/vma.c 2025-01-13 22:40:42 -08:00
vmalloc.c mm/vmalloc: refactor __vmalloc_node_range_noprof() 2025-03-17 22:06:58 -07:00
vmpressure.c
vmscan.c mm: vmscan: fix kswapd exit condition in defrag_mode 2025-04-17 20:10:09 -07:00
vmstat.c - The 6 patch series "Enable strict percpu address space checks" from 2025-04-01 09:29:18 -07:00
workingset.c mm/mglru: rework workingset protection 2025-01-25 20:22:39 -08:00
zpdesc.h mm/zsmalloc: introduce __zpdesc_clear/set_zsmalloc() 2025-01-25 20:22:35 -08:00
zpool.c mm: zpool: remove zpool_malloc_support_movable() 2025-03-17 00:05:41 -07:00
zsmalloc.c mm: zpool: remove zpool_malloc_support_movable() 2025-03-17 00:05:41 -07:00
zswap.c mm: zswap: fix crypto_free_acomp() deadlock in zswap_cpu_comp_dead() 2025-04-01 15:14:43 -07:00