fs/proc/task_mmu: remove per-page mapcount dependency for smaps/smaps_rollup (CONFIG_NO_PAGE_MAPCOUNT)

Let's implement an alternative when per-page mapcounts in large folios are
no longer maintained -- soon with CONFIG_NO_PAGE_MAPCOUNT.

When computing the output for smaps / smaps_rollup, in particular when
calculating the USS (Unique Set Size) and the PSS (Proportional Set Size),
we still rely on per-page mapcounts.

To determine private vs. shared, we'll use folio_maybe_mapped_shared(),
similar to how we handle PM_MMAP_EXCLUSIVE.  Similarly, we might now
under-estimate the USS and count pages towards "shared" that are actually
"private" ("exclusively mapped").
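
To illustrate the USS under-estimation, here is a minimal userspace sketch.
It is not kernel code: struct sketch_folio, SKETCH_PAGE_SIZE and both helper
functions are hypothetical names made up for this example, and the
maybe_mapped_shared flag merely stands in for the folio-wide answer that
folio_maybe_mapped_shared() would give.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB base pages */

/* Hypothetical model of a large folio, for illustration only. */
struct sketch_folio {
	size_t nr_pages;
	const int *page_mapcounts;	/* per-page counts (precise mode only) */
	bool maybe_mapped_shared;	/* folio-wide "maybe shared" hint */
};

/* Precise mode: a page is private when it is mapped fewer than two times. */
static unsigned long private_bytes_precise(const struct sketch_folio *f)
{
	unsigned long bytes = 0;
	size_t i;

	assert(f->nr_pages > 0);
	for (i = 0; i < f->nr_pages; i++)
		if (f->page_mapcounts[i] < 2)
			bytes += SKETCH_PAGE_SIZE;
	return bytes;
}

/* Folio-level mode: one answer for all pages of the folio. */
static unsigned long private_bytes_folio_level(const struct sketch_folio *f)
{
	assert(f->nr_pages > 0);
	return f->maybe_mapped_shared ? 0 : f->nr_pages * SKETCH_PAGE_SIZE;
}
```

For a 4-page folio where a second process maps only the first page
(per-page mapcounts 2, 1, 1, 1 and maybe_mapped_shared set), the precise
helper reports 12288 private bytes while the folio-level helper reports 0.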

When calculating the PSS, we'll now also use the average per-page mapcount
for large folios: this can result in both an over-estimation and an
under-estimation of the PSS.  The difference is not expected to matter
much in practice, but we'll have to learn as we go.
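
A small userspace sketch of why the error can go in both directions.  The
helpers below are hypothetical and simplified: they assume 4 KiB pages, skip
the fixed-point PSS_SHIFT scaling, and approximate the average as the total
number of mappings divided by the number of pages, rounded to the nearest
integer and never below 1.  The real folio_average_page_mapcount() may
differ in detail.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB base pages */

/* Simplified average: total mappings / nr pages, rounded, at least 1. */
static int sketch_average_mapcount(const int *mapcounts, size_t nr)
{
	unsigned long total = 0, avg;
	size_t i;

	assert(nr > 0);
	for (i = 0; i < nr; i++)
		total += (unsigned long)mapcounts[i];
	avg = (total + nr / 2) / nr;
	return avg > 0 ? (int)avg : 1;
}

/* Precise PSS: every page is divided by its own mapcount. */
static unsigned long sketch_pss_precise(const int *mapcounts, size_t nr)
{
	unsigned long pss = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		unsigned long share = SKETCH_PAGE_SIZE;

		if (mapcounts[i] >= 2)
			share /= (unsigned long)mapcounts[i];
		pss += share;
	}
	return pss;
}

/* Approximate PSS: every page is divided by the folio-wide average. */
static unsigned long sketch_pss_average(const int *mapcounts, size_t nr)
{
	int avg = sketch_average_mapcount(mapcounts, nr);
	unsigned long share = SKETCH_PAGE_SIZE;

	if (avg >= 2)
		share /= (unsigned long)avg;
	return share * nr;
}
```

With per-page mapcounts 2, 1, 1, 1 the precise result is 14336 bytes but the
average-based result is 16384 bytes (over-estimation).  With 2, 2, 2, 1 the
precise result is 10240 bytes but the average-based result is 8192 bytes
(under-estimation).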

We can now provide folio_precise_page_mapcount() only with
CONFIG_PAGE_MAPCOUNT, and remove one of the last users of per-page
mapcounts when CONFIG_NO_PAGE_MAPCOUNT is enabled.

Document the new behavior.

Link: https://lkml.kernel.org/r/20250303163014.1128035-20-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
David Hildenbrand 2025-03-03 17:30:12 +01:00 committed by Andrew Morton
parent 7a34ae1449
commit 6dd55dd1c5
3 changed files with 42 additions and 5 deletions

@@ -502,9 +502,25 @@ process, its PSS will be 1500. "Pss_Dirty" is the portion of PSS which
 consists of dirty pages. ("Pss_Clean" is not included, but it can be
 calculated by subtracting "Pss_Dirty" from "Pss".)
-Note that even a page which is part of a MAP_SHARED mapping, but has only
-a single pte mapped, i.e. is currently used by only one process, is accounted
-as private and not as shared.
+Traditionally, a page is accounted as "private" if it is mapped exactly once,
+and a page is accounted as "shared" when mapped multiple times, even when
+mapped in the same process multiple times. Note that this accounting is
+independent of MAP_SHARED.
+In some kernel configurations, the semantics of pages part of a larger
+allocation (e.g., THP) can differ: a page is accounted as "private" if all
+pages part of the corresponding large allocation are *certainly* mapped in the
+same process, even if the page is mapped multiple times in that process. A
+page is accounted as "shared" if any page of the larger allocation
+is *maybe* mapped in a different process. In some cases, a large allocation
+might be treated as "maybe mapped by multiple processes" even though this
+is no longer the case.
+Some kernel configurations do not track the precise number of times a page part
+of a larger allocation is mapped. In this case, when calculating the PSS, the
+average number of mappings per page in this larger allocation might be used
+as an approximation for the number of mappings of a page. The PSS calculation
+will be imprecise in this case.
 "Referenced" indicates the amount of memory currently marked as referenced or
 accessed.

@@ -157,6 +157,7 @@ unsigned name_to_int(const struct qstr *qstr);
 /* Worst case buffer size needed for holding an integer. */
 #define PROC_NUMBUF 13
+#ifdef CONFIG_PAGE_MAPCOUNT
 /**
  * folio_precise_page_mapcount() - Number of mappings of this folio page.
  * @folio: The folio.
@@ -187,6 +188,13 @@ static inline int folio_precise_page_mapcount(struct folio *folio,
 	return mapcount;
 }
+#else /* !CONFIG_PAGE_MAPCOUNT */
+static inline int folio_precise_page_mapcount(struct folio *folio,
+		struct page *page)
+{
+	BUILD_BUG();
+}
+#endif /* CONFIG_PAGE_MAPCOUNT */
 /**
  * folio_average_page_mapcount() - Average number of mappings per page in this

@@ -707,6 +707,8 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
 	struct folio *folio = page_folio(page);
 	int i, nr = compound ? compound_nr(page) : 1;
 	unsigned long size = nr * PAGE_SIZE;
+	bool exclusive;
+	int mapcount;
 	/*
 	 * First accumulate quantities that depend only on |size| and the type
@@ -747,18 +749,29 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
 				      dirty, locked, present);
 		return;
 	}
+	if (IS_ENABLED(CONFIG_NO_PAGE_MAPCOUNT)) {
+		mapcount = folio_average_page_mapcount(folio);
+		exclusive = !folio_maybe_mapped_shared(folio);
+	}
 	/*
 	 * We obtain a snapshot of the mapcount. Without holding the folio lock
 	 * this snapshot can be slightly wrong as we cannot always read the
 	 * mapcount atomically.
 	 */
 	for (i = 0; i < nr; i++, page++) {
-		int mapcount = folio_precise_page_mapcount(folio, page);
 		unsigned long pss = PAGE_SIZE << PSS_SHIFT;
+		if (IS_ENABLED(CONFIG_PAGE_MAPCOUNT)) {
+			mapcount = folio_precise_page_mapcount(folio, page);
+			exclusive = mapcount < 2;
+		}
 		if (mapcount >= 2)
 			pss /= mapcount;
 		smaps_page_accumulate(mss, folio, PAGE_SIZE, pss,
-				dirty, locked, mapcount < 2);
+				dirty, locked, exclusive);
 	}
 }