linux/include/trace/events
Lorenzo Stoakes 5dba5cc2e0 mm: introduce VM_MAYBE_GUARD and make visible in /proc/$pid/smaps
Patch series "introduce VM_MAYBE_GUARD and make it sticky", v4.

Currently, guard regions are not visible to users except through
/proc/$pid/pagemap, with no explicit visibility at the VMA level.

This makes the feature less useful, as it isn't entirely apparent which
VMAs may have these entries present, especially when performing actions
which walk through memory regions such as those performed by CRIU.

This series addresses this issue by introducing the VM_MAYBE_GUARD flag
which fulfils this role, updating the smaps logic to display an entry for
these.

The semantics of this flag are that a guard region MAY be present if set
(we cannot be sure, as we can't efficiently track whether an
MADV_GUARD_REMOVE finally removes all the guard regions in a VMA) - but if
not set the VMA definitely does NOT have any guard regions present.

It's problematic to establish this flag without further action, because
that means that VMAs with guard regions in them become non-mergeable with
adjacent VMAs for no especially good reason.

To work around this, this series also introduces the concept of 'sticky'
VMA flags - that is flags which:

a. If set in one VMA and not in another, still permit those VMAs to be
   merged (if otherwise compatible).

b. When they are merged, the resultant VMA must have the flag set.

The VMA logic is updated to propagate these flags correctly.
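
The two sticky-flag rules above can be illustrated with a minimal userspace
sketch. It is not the kernel's merge code: STICKY_MASK_SKETCH,
flags_mergeable() and merged_flags() are hypothetical names chosen for
illustration, and only the 0x800 value of VM_MAYBE_GUARD comes from this
patch.

```c
#include <stdbool.h>

typedef unsigned long vm_flags_t;

#define VM_READ            0x00000001UL
#define VM_WRITE           0x00000002UL
#define VM_MAYBE_GUARD     0x00000800UL   /* value stated in this patch */

/* Hypothetical mask of sticky flags, for illustration only. */
#define STICKY_MASK_SKETCH VM_MAYBE_GUARD

/* Rule (a): a difference in sticky bits alone must not block a merge. */
static bool flags_mergeable(vm_flags_t a, vm_flags_t b)
{
	return ((a ^ b) & ~STICKY_MASK_SKETCH) == 0;
}

/* Rule (b): the merged VMA carries a sticky bit if either input had it. */
static vm_flags_t merged_flags(vm_flags_t a, vm_flags_t b)
{
	return a | (b & STICKY_MASK_SKETCH);
}
```

In this sketch all non-sticky bits are compared exactly; the real VMA
compatibility check may ignore further flags that are not described here.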

Additionally, VM_MAYBE_GUARD being an explicit VMA flag allows us to solve
an issue with file-backed guard regions - previously these established an
anon_vma object for file-backed mappings solely to have vma_needs_copy()
correctly propagate guard region mappings to child processes.

We introduce a new flag alias VM_COPY_ON_FORK (which currently only
specifies VM_MAYBE_GUARD) and update vma_needs_copy() to check explicitly
for this flag and to copy page tables if it is present, which resolves
this issue.
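
As a rough illustration of the check described above, the following
simplified sketch models the decision only. needs_copy_sketch() and its
has_anon_vma parameter are hypothetical; the real vma_needs_copy() operates
on VMA structures and considers conditions not covered in this message.

```c
#include <stdbool.h>

typedef unsigned long vm_flags_t;

#define VM_MAYBE_GUARD  0x00000800UL     /* value stated in this patch */
#define VM_COPY_ON_FORK VM_MAYBE_GUARD   /* alias, as described above */

/*
 * Simplified model: copy page tables on fork if any copy-on-fork flag is
 * set, otherwise fall back to the anon_vma test. Other conditions checked
 * by the real function are deliberately left out of this sketch.
 */
static bool needs_copy_sketch(vm_flags_t flags, bool has_anon_vma)
{
	if (flags & VM_COPY_ON_FORK)
		return true;

	return has_anon_vma;
}
```

With an explicit flag test like this, a file-backed mapping containing guard
regions no longer needs an anon_vma just to have its page tables copied.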

Additionally, we add the ability for allow-listed VMA flags to be
atomically writable with only mmap/VMA read locks held.

The only flag allowed so far is VM_MAYBE_GUARD, and we take care to ensure
that setting it in this way cannot cause any races.

This allows us to maintain guard region installation as a read-locked
operation and not endure the overhead of obtaining a write lock here.
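
The idea of an allow-list for atomic flag updates can be sketched in
userspace with C11 atomics. This is an analogy under stated assumptions, not
the kernel mechanism: ATOMIC_SET_ALLOWED_SKETCH and set_flag_atomic() are
hypothetical names, and the kernel uses its own primitives rather than
<stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define VM_MAYBE_GUARD 0x00000800UL   /* value stated in this patch */

/* Hypothetical allow-list: flags that may be set without a write lock. */
#define ATOMIC_SET_ALLOWED_SKETCH VM_MAYBE_GUARD

/*
 * Atomically OR a flag into *flags, but only if every requested bit is
 * allow-listed. Returns false and leaves *flags untouched otherwise.
 */
static bool set_flag_atomic(_Atomic unsigned long *flags, unsigned long flag)
{
	if (flag & ~ATOMIC_SET_ALLOWED_SKETCH)
		return false;

	atomic_fetch_or(flags, flag);
	return true;
}
```

Because the update is a single atomic OR, concurrent callers that hold only
a read lock cannot lose each other's bits, which is the property the
allow-list relies on in this sketch.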

Finally we introduce extensive VMA userland tests to assert that the
sticky VMA logic behaves correctly as well as guard region self tests to
assert that smaps visibility is correctly implemented.


This patch (of 9):

Currently, if a user needs to determine if guard regions are present in a
range, they have to scan all VMAs (or have knowledge of which ones might
have guard regions).

Since commit 8e2f2aeb8b ("fs/proc/task_mmu: add guard region bit to
pagemap") and the related commit a516403787 ("fs/proc: extend the
PAGEMAP_SCAN ioctl to report guard regions"), users can use either
/proc/$pid/pagemap or the PAGEMAP_SCAN functionality to perform this
operation at a virtual address level.

This is not ideal, and it gives no visibility at a /proc/$pid/smaps level
that guard regions exist in ranges.

This patch remedies the situation by establishing a new VMA flag,
VM_MAYBE_GUARD, to indicate that a VMA may contain guard regions (it is
uncertain because we cannot reasonably determine whether a
MADV_GUARD_REMOVE call has removed all of the guard regions in a VMA, and
additionally VMAs may change across merge/split).

We utilise 0x800 for this flag, which makes it available to 32-bit
architectures as well. This bit was previously used by VM_DENYWRITE, which
was removed in commit 8d0920bde5 ("mm: remove VM_DENYWRITE") and hasn't
been reused since.

We also update the smaps logic and documentation to identify these VMAs.
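
From userspace, such a VMA could be spotted by looking for the flag's
mnemonic on the "VmFlags:" line of an smaps entry. The helper below is a
minimal sketch: vmflags_line_has() is a hypothetical name, and the mnemonic
is passed in by the caller because this message does not state which
two-letter code is used.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `line` is a "VmFlags:" line from /proc/$pid/smaps and
 * contains `mnemonic` as a whole whitespace-separated token.
 */
static bool vmflags_line_has(const char *line, const char *mnemonic)
{
	static const char prefix[] = "VmFlags:";
	size_t len = strlen(mnemonic);
	const char *p;

	if (len == 0 || strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;

	p = line + sizeof(prefix) - 1;
	while (*p) {
		size_t tok;

		p += strspn(p, " \t\n");      /* skip separators */
		tok = strcspn(p, " \t\n");    /* length of next token */
		if (tok == len && strncmp(p, mnemonic, len) == 0)
			return true;
		p += tok;
	}
	return false;
}
```

A caller would read /proc/$pid/smaps line by line and pass each line to the
helper, for example vmflags_line_has(line, "gu"), where "gu" is an assumed
mnemonic that should be checked against the updated documentation.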

Another major use of this functionality is that we can use it to identify
that we ought to copy page tables on fork.

We do not actually implement usage of this flag in mm/madvise.c yet as we
need to allow some VMA flags to be applied atomically under mmap/VMA read
lock in order to avoid the need to acquire a write lock for this purpose.

Link: https://lkml.kernel.org/r/cover.1763460113.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/cf8ef821eba29b6c5b5e138fffe95d6dcabdedb9.1763460113.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20 13:43:58 -08:00
9p.h 9p: prevent read overrun in protocol dump tracepoint 2023-12-05 21:18:44 +09:00
afs.h afs: Add support for RENAME_NOREPLACE and RENAME_EXCHANGE 2025-09-25 09:19:07 +02:00
alarmtimer.h alarmtimer: Hide alarmtimer_suspend event when RTC_CLASS is not configured 2025-07-21 16:40:56 -04:00
amdxdna.h accel/amdxdna: Add command execution 2024-11-22 11:43:27 -07:00
asoc.h ALSA: trace: use snd_pcm_direction_name() 2024-08-01 12:50:03 +02:00
avc.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
bcache.h
block.h block: fix blk_zone_append_update_request_bio() kernel-doc 2025-07-16 10:02:18 -06:00
bpf_test_run.h bpf: add bpf_modify_return_test_tp() kfunc triggering tracepoint 2024-03-28 18:31:40 -07:00
bridge.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
btrfs.h Summary of significant series in this pull request: 2025-07-31 14:57:54 -07:00
cachefiles.h cachefiles: Add auxiliary data trace 2024-12-20 22:34:05 +01:00
capability.h security: add trace event for cap_capable 2024-12-04 20:59:21 -06:00
cgroup.h cgroup: remove per-cpu per-subsystem locks 2025-06-17 10:01:18 -10:00
clk.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
cma.h mm/cma: add 'available count' and 'total count' to trace_cma_alloc_start 2025-09-13 16:55:15 -07:00
compaction.h mm: compaction: update the cc->nr_migratepages when allocating or freeing the freepages 2024-02-22 10:24:50 -08:00
context_tracking.h
cpuhp.h
csd.h smp: Change function signatures to use call_single_data_t 2023-09-13 14:59:24 +02:00
damon.h mm/damon: add trace event for effective size quota 2025-07-13 16:38:33 -07:00
devfreq.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
devlink.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
dlm.h dlm: remove lkb from callback tracepoints 2024-04-01 13:31:12 -05:00
dma.h dma-mapping: fix direction in dma_alloc direction traces 2025-10-03 08:45:09 +02:00
dma_fence.h dma-fence: Add safe access helpers and document the rules 2025-06-13 08:26:49 +01:00
erofs.h erofs: remove unused trace event erofs_destroy_inode 2025-06-18 13:41:16 +08:00
error_report.h
exceptions.h x86/tracing, x86/mm: Move page fault tracepoints to generic 2025-05-16 10:13:59 +02:00
ext4.h Major ext4 changes for 6.17: 2025-07-31 10:02:44 -07:00
f2fs.h f2fs: remove wbc->for_reclaim handling 2025-05-08 15:22:45 +00:00
fib.h ipv4: Convert ->flowi4_tos to dscp_t. 2025-08-26 17:34:31 -07:00
fib6.h tracing: ipv6: Add flow label to fib6_table_lookup tracepoint 2024-12-19 16:02:22 +01:00
filelock.h vfs-6.18-rc1.inode 2025-09-29 09:42:30 -07:00
filemap.h filemap: add trace events for get_pages, map_pages, and fault 2024-09-01 20:26:10 -07:00
firewire.h firewire: core: rename cause flag of tracepoints event 2024-09-12 22:30:38 +09:00
firewire_ohci.h firewire: ohci: add tracepoints event for data of Self-ID DMA 2024-07-04 09:07:14 +09:00
fs_dax.h mm: update core kernel code to use vm_flags_t consistently 2025-07-09 22:42:13 -07:00
fscache.h cachefiles: fix slab-use-after-free in fscache_withdraw_volume() 2024-07-03 10:36:14 +02:00
fsi.h fsi: core: Add trace events for scan and unregister 2023-08-09 15:43:28 +09:30
fsi_master_aspeed.h
fsi_master_ast_cf.h
fsi_master_gpio.h
fsi_master_i2cr.h fsi: Add IBM I2C Responder virtual FSI master 2023-08-11 13:32:14 +09:30
gpio.h
gpu_mem.h
habanalabs.h accel/habanalabs: fix typo in trace output (cms -> cmd) 2025-09-25 09:09:28 +03:00
handshake.h net/handshake: Trace events for TLS Alert helpers 2023-07-28 14:07:59 -07:00
host1x.h
huge_memory.h mm: drop all references of writable and SCAN_PAGE_RO 2025-09-21 14:22:40 -07:00
hugetlbfs.h hugetlb: fix NULL pointer dereference in trace_hugetlbfs_alloc_inode 2025-01-12 19:03:36 -08:00
hw_pressure.h sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure() 2024-04-24 12:08:01 +02:00
hwmon.h hwmon: Introduce 64-bit energy attribute support 2025-09-07 16:33:48 -07:00
i2c.h
i2c_slave.h
ib_mad.h IB/mad: Don't call to function that might sleep while in atomic context 2022-11-10 10:57:15 +02:00
ib_umad.h
icmp.h net/ipv4: add tracepoint for icmp_send 2024-05-08 10:39:26 +01:00
initcall.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
intel-sst.h
intel_ifs.h trace: platform/x86/intel/ifs: Add SBAF trace support 2024-08-12 16:36:11 +02:00
intel_ish.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
io_uring.h io_uring/trace: support completion tracing of mixed 32b CQEs 2025-08-24 11:41:13 -06:00
iocost.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
iommu.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
ipi.h tracing: arm: arm64: Hide trace events ipi_raise, ipi_entry and ipi_exit 2025-07-23 14:58:55 -04:00
irq.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
irq_matrix.h genirq/matrix: Remove unused irq_matrix_alloc_reserved tracepoint 2025-06-02 13:12:26 -04:00
iscsi.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
jbd2.h jbd2: remove journal_clean_one_cp_list() 2023-07-10 23:09:21 -04:00
kmem.h kmem/tracing: add kmem name to kmem_cache_alloc tracepoint 2025-09-13 16:55:18 -07:00
ksm.h mm/ksm: add tracepoint for ksm advisor 2023-12-29 11:58:27 -08:00
kvm.h LoongArch: KVM: Move kvm_iocsr tracepoint out of generic code 2025-09-23 23:37:26 +08:00
kyber.h kyber: Replace strlcpy with strscpy 2023-07-17 08:18:17 -06:00
libata.h
lock.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
maple_tree.h Maple Tree: add new data structure 2022-09-26 19:46:13 -07:00
mce.h x86/MCE/AMD: Add support for new MCA_SYND{1,2} registers 2024-10-31 10:36:07 +01:00
mctp.h
mdio.h trace: events: cleanup deprecated strncpy uses 2024-04-05 22:10:25 -07:00
memcg.h memcg: add flush tracepoint 2024-11-11 00:26:46 -08:00
migrate.h mm/migrate: add MR_DAMON to migrate_reason 2024-07-03 19:30:12 -07:00
mlxsw.h
mmap.h mm: remove unused mmap tracepoints 2025-07-09 22:41:55 -07:00
mmap_lock.h mm: mmap_lock: optimize mmap_lock tracepoints 2025-01-13 22:40:34 -08:00
mmc.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
mmflags.h mm: introduce VM_MAYBE_GUARD and make visible in /proc/$pid/smaps 2025-11-20 13:43:58 -08:00
module.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
mptcp.h mptcp: sched: check both directions for backup 2024-07-30 10:27:29 +02:00
napi.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
nbd.h nbd: Use NULL to represent a pointer 2024-05-14 07:22:35 -06:00
neigh.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
net.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
net_probe_common.h trace: adjust TP_STORE_ADDR_PORTS_SKB() parameters 2024-04-03 19:26:14 -07:00
netfs.h netfs: Fix race between cache write completion and ALL_QUEUED being set 2025-07-14 11:05:02 +02:00
netlink.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
nilfs2.h nilfs2: use __field_struct() for a bitwise field 2024-05-11 15:51:43 -07:00
nmi.h
notifier.h notifiers: add tracepoints to the notifiers infrastructure 2023-04-08 13:45:38 -07:00
objagg.h
oom.h mm: improve code consistency with zonelist_* helper functions 2024-09-01 20:25:55 -07:00
osnoise.h trace/osnoise: Add trace events for samples 2025-02-26 19:44:30 -05:00
page_isolation.h
page_pool.h page_pool: devmem support 2024-09-11 20:44:31 -07:00
page_ref.h mm: introduce memdesc_flags_t 2025-09-13 16:55:07 -07:00
pagemap.h
percpu.h
power.h PM: tracing: Hide power_domain_target event under ARCH_OMAP2PLUS 2025-07-21 16:40:57 -04:00
power_cpu_migrate.h
preemptirq.h tracing: Remove definition of trace_*_rcuidle() 2024-10-08 21:17:39 -04:00
printk.h
pwc.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
pwm.h pwm: Add tracing for waveform callbacks 2024-09-28 15:13:56 +02:00
qdisc.h tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset() 2024-06-27 11:06:30 +02:00
qla.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
qrtr.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
rcu.h RCU pull request for v6.15 2025-03-24 19:41:37 -07:00
rdma_core.h
readahead.h readahead: add trace points 2025-09-21 14:22:28 -07:00
regulator.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
rpcgss.h sunrpc: implement rfc2203 rpcsec_gss seqnum cache 2025-05-19 10:14:29 -04:00
rpcrdma.h svcrdma: Handle device removal outside of the CM event handler 2024-09-20 19:31:03 -04:00
rpm.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
rseq.h tracing/rseq: Add mm_cid field to rseq_update 2022-12-27 12:52:15 +01:00
rtc.h
rust_sample.h rust: samples: add tracepoint to Rust sample 2024-11-04 16:21:44 -05:00
rwmmio.h asm-generic/io: Add _RET_IP_ to MMIO trace for more accurate debug info 2022-11-21 22:02:10 +01:00
rxrpc.h rxrpc: Fix notification vs call-release vs recvmsg 2025-07-17 07:50:48 -07:00
sched.h tracing changes for 6.17 2025-08-01 10:29:36 -07:00
sched_ext.h sched_ext: Add trace point to track sched_ext core events 2025-03-04 08:06:17 -10:00
scmi.h include: trace: Add tracepoint support for inflight xfer count 2025-07-03 16:18:09 +01:00
scsi.h scsi: trace: Show rtn in string for scsi_dispatch_cmd_error() 2025-06-09 21:59:07 -04:00
sctp.h
signal.h
siox.h
skb.h net: add rx_sk to trace_kfree_skb 2024-06-19 12:44:22 +01:00
smbus.h
sock.h net: Retire DCCP socket. 2025-04-11 18:58:10 -07:00
sof.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
sof_intel.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
spi.h spi: Fix spelling typos and acronyms capitalization 2023-07-11 14:14:32 +01:00
spmi.h
sunrpc.h sunrpc: remove SVC_SYSERR 2025-07-14 12:46:48 -04:00
sunvnet.h
swiotlb.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
syscalls.h tracing: Declare system call tracepoints with TRACE_EVENT_SYSCALL 2024-10-09 17:05:54 -04:00
target.h scsi: usb: Rename the RESERVE and RELEASE constants 2025-02-12 22:20:55 -05:00
task.h copy_process: pass clone_flags as u64 across calltree 2025-09-01 15:31:34 +02:00
tcp.h trace: tcp: add three metrics to trace_tcp_rcvbuf_grow() 2025-10-29 17:30:18 -07:00
tegra_apb_dma.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
thp.h powerpc/thp: tracing: Hide hugepage events under CONFIG_PPC_BOOK3S_64 2025-07-25 08:58:07 -04:00
timer.h tracing/timers: Rename the hrtimer_init event to hrtimer_setup 2025-04-05 10:30:17 +02:00
timer_migration.h timers/migration: Rename childmask by groupmask to make naming more obvious 2024-07-22 18:03:34 +02:00
timestamp.h fs: tracepoints around multigrain timestamp events 2024-10-10 10:20:52 +02:00
tlb.h
tsm_mr.h tsm-mr: Add TVM Measurement Register support 2025-05-08 19:17:33 -07:00
udp.h trace: adjust TP_STORE_ADDR_PORTS_SKB() parameters 2024-04-03 19:26:14 -07:00
v4l2.h
vb2.h
vmalloc.h mm: vmalloc: add free_vmap_area_noflush trace event 2022-11-08 17:37:17 -08:00
vmscan.h vmscan: add a vmscan event for reclaim_pages 2024-11-06 20:11:13 -08:00
vsock_virtio_transport_common.h vsock/virtio: MSG_ZEROCOPY flag support 2023-09-21 12:34:00 +02:00
watchdog.h watchdog: Add tracing events for the most usual watchdog events 2022-10-12 09:47:02 +02:00
wbt.h blk-wbt: Replace strlcpy with strscpy 2023-07-17 08:18:17 -06:00
workqueue.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
writeback.h writeback: Add tracepoint to track pending inode switches 2025-09-19 13:11:06 +02:00
xdp.h xdp: tracing: Hide some xdp events under CONFIG_BPF_SYSCALL 2025-06-12 19:36:53 -07:00
xen.h x86/xen: move paravirt lazy code 2023-09-19 07:04:49 +02:00