linux

mirror of https://github.com/torvalds/linux.git synced 2026-03-08 01:24:47 +01:00

Author	SHA1	Message	Date
Linus Torvalds	0e335a7745	vfs-7.0-rc2.fixes Please consider pulling these changes from the signed vfs-7.0-rc2.fixes tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaZ7xWAAKCRCRxhvAZXjc onpeAP4qOrTURIAX9M/NGCHywvjI91ZJt20J6vm0X6KbVV/ebQD/eoJ21xzPhG9M gN7oRcZ9SW3e/AdtdnlqB0PEP+cyGwM= =9Ji+ -----END PGP SIGNATURE----- Merge tag 'vfs-7.0-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix an uninitialized variable in file_getattr(). The flags_valid field wasn't initialized before calling vfs_fileattr_get(), triggering KMSAN uninit-value reports in fuse - Fix writeback wakeup and logging timeouts when DETECT_HUNG_TASK is not enabled. sysctl_hung_task_timeout_secs is 0 in that case causing spurious "waiting for writeback completion for more than 1 seconds" warnings - Fix a null-ptr-deref in do_statmount() when the mount is internal - Add missing kernel-doc description for the @private parameter in iomap_readahead() - Fix mount namespace creation to hold namespace_sem across the mount copy in create_new_namespace(). The previous drop-and-reacquire pattern was fragile and failed to clean up mount propagation links if the real rootfs was a shared or dependent mount - Fix /proc mount iteration where m->index wasn't updated when m->show() overflows, causing a restart to repeatedly show the same mount entry in a rapidly expanding mount table - Return EFSCORRUPTED instead of ENOSPC in minix_new_inode() when the inode number is out of range - Fix unshare(2) when CLONE_NEWNS is set and current->fs isn't shared. copy_mnt_ns() received the live fs_struct so if a subsequent namespace creation failed the rollback would leave pwd and root pointing to detached mounts. Always allocate a new fs_struct when CLONE_NEWNS is requested - fserror bug fixes: - Remove the unused fsnotify_sb_error() helper now that all callers have been converted to fserror_report_metadata - Fix a lockdep splat in fserror_report() where igrab() takes inode::i_lock which can be held in IRQ context. Replace igrab() with a direct i_count bump since filesystems should not report inodes that are about to be freed or not yet exposed - Handle error pointer in procfs for try_lookup_noperm() - Fix an integer overflow in ep_loop_check_proc() where recursive calls returning INT_MAX would overflow when +1 is added, breaking the recursion depth check - Fix a misleading break in pidfs * tag 'vfs-7.0-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: pidfs: avoid misleading break eventpoll: Fix integer overflow in ep_loop_check_proc() proc: Fix pointer error dereference fserror: fix lockdep complaint when igrabbing inode fsnotify: drop unused helper unshare: fix unshare_fs() handling minix: Correct errno in minix_new_inode namespace: fix proc mount iteration mount: hold namespace_sem across copy in create_new_namespace() iomap: Describe @private in iomap_readahead() statmount: Fix the null-ptr-deref in do_statmount() writeback: Fix wakeup and logging timeouts for !DETECT_HUNG_TASK fs: init flags_valid before calling vfs_fileattr_get	2026-02-25 10:34:23 -08:00
Wilfred Mallawa	650b774cf9	xfs: add static size checks for ioctl UABI The ioctl structures in libxfs/xfs_fs.h are missing static size checks. It is useful to have static size checks for these structures as adding new fields to them could cause issues (e.g. extra padding that may be inserted by the compiler). So add these checks to xfs/xfs_ondisk.h. Due to different padding/alignment requirements across different architectures, to avoid build failures, some structures are ommited from the size checks. For example, structures with "compat_" definitions in xfs/xfs_ioctl32.h are ommited. Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:50 +01:00
Wilfred Mallawa	e97cbf863d	xfs: remove duplicate static size checks In libxfs/xfs_ondisk.h, remove some duplicate entries of XFS_CHECK_STRUCT_SIZE(). Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:49 +01:00
Nirjhar Roy (IBM)	9a654a8fa3	xfs: Add comments for usages of some macros. Add comments explaining when to use XFS_IS_CORRUPT() and ASSERT() Suggested-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:49 +01:00
Nirjhar Roy (IBM)	c2368fc89a	xfs: Update lazy counters in xfs_growfs_rt_bmblock() Update lazy counters in xfs_growfs_rt_bmblock() similar to the way it is done xfs_growfs_data_private(). This is because the lazy counters are not always updated and synching the counters will avoid inconsistencies between frexents and rtextents(total realtime extent count). This will be more useful once realtime shrink is implemented as this will prevent some transient state to occur where frexents might be greater than total rtextents. Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:49 +01:00
Nirjhar Roy (IBM)	ac1d977096	xfs: Add a comment in xfs_log_sb() Add a comment explaining why the sb_frextents are updated outside the if (xfs_has_lazycount(mp) check even though it is a lazycounter. RT groups are supported only in v5 filesystems which always have lazycounter enabled - so putting it inside the if(xfs_has_lazycount(mp) check is redundant. Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:49 +01:00
Nirjhar Roy (IBM)	8baa9bccc0	xfs: Fix xfs_last_rt_bmblock() Bug description: If the size of the last rtgroup i.e, the rtg passed to xfs_last_rt_bmblock() is such that the last rtextent falls in 0th word offset of a bmblock of the bitmap file tracking this (last) rtgroup, then in that case xfs_last_rt_bmblock() incorrectly returns the next bmblock number instead of the current/last used bmblock number. When xfs_last_rt_bmblock() incorrectly returns the next bmblock, the loop to grow/modify the bmblocks in xfs_growfs_rtg() doesn't execute and xfs_growfs basically does a nop in certain cases. xfs_growfs will do a nop when the new size of the fs will have the same number of rtgroups i.e, we are only growing the last rtgroup. Reproduce: $ mkfs.xfs -m metadir=0 -r rtdev=/dev/loop1 /dev/loop0 \ -r size=32769b -f $ mount -o rtdev=/dev/loop1 /dev/loop0 /mnt/scratch $ xfs_growfs -R $(( 32769 + 1 )) /mnt/scratch $ xfs_info /mnt/scratch \| grep rtextents $ # We can see that rtextents hasn't changed Fix: Fix this by returning the current/last used bmblock when the last rtgroup size is not a multiple xfs_rtbitmap_rtx_per_rbmblock() and the next bmblock when the rtgroup size is a multiple of xfs_rtbitmap_rtx_per_rbmblock() i.e, the existing blocks are completely used up. Also, I have renamed xfs_last_rt_bmblock() to xfs_last_rt_bmblock_to_extend() to signify that this function returns the bmblock number to extend and NOT always the last used bmblock number. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:49 +01:00
Darrick J. Wong	115ea07b94	xfs: don't report half-built inodes to fserror Sam Sun apparently found a syzbot way to fuzz a filesystem such that xfs_iget_cache_miss would free the inode before the fserror code could catch up. Frustratingly he doesn't use the syzbot dashboard so there's no C reproducer and not even a full error report, so I'm guessing that: Inodes that are being constructed or torn down inside XFS are not visible to the VFS. They should never be reported to fserror. Also, any inode that has been freshly allocated in _cache_miss should be marked INEW immediately because, well, it's an incompletely constructed inode that isn't yet visible to the VFS. Reported-by: Sam Sun <samsun1006219@gmail.com> Fixes: `5eb4cb18e4` ("xfs: convey metadata health events to the health monitor") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:49 +01:00
Darrick J. Wong	75690e5fdd	xfs: don't report metadata inodes to fserror Internal metadata inodes are not exposed to userspace programs, so it makes no sense to pass them to the fserror functions (aka fsnotify). Instead, report metadata file problems as general filesystem corruption. Fixes: `5eb4cb18e4` ("xfs: convey metadata health events to the health monitor") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:49 +01:00
Darrick J. Wong	94014a23e9	xfs: fix potential pointer access race in xfs_healthmon_get Pankaj Raghav asks about this code in xfs_healthmon_get: hm = mp->m_healthmon; if (hm && !refcount_inc_not_zero(&hm->ref)) hm = NULL; rcu_read_unlock(); return hm; (slightly edited to compress a mailing list thread) "Nit: Should we do a READ_ONCE(mp->m_healthmon) here to avoid any compiler tricks that can result in an undefined behaviour? I am not sure if I am being paranoid here. "So this is my understanding: RCU guarantees that we get a valid object (actual data of m_healthmon) but does not guarantee the compiler will not reread the pointer between checking if hm is !NULL and accessing the pointer as we are doing it lockless. "So just a barrier() call in rcu_read_lock is enough to make sure this doesn't happen and probably adding a READ_ONCE() is not needed?" After some initial confusion I concluded that he's correct. The compiler could very well eliminate the hm variable in favor of walking the pointers twice, turning the code into: if (mp->m_healthmon && !refcount_inc_not_zero(&mp->m_healthmon->ref)) If this happens, then xfs_healthmon_detach can sneak in between the two sides of the && expression and set mp->m_healthmon to NULL, and thereby cause a null pointer dereference crash. Fix this by using the rcu pointer assignment and dereference functions, which ensure that the proper reordering barriers are in place. Practically speaking, gcc seems to allocate an actual variable for hm and only reads mp->m_healthmon once (as intended), but we ought to be more explicit about requiring this. Reported-by: Pankaj Raghav <pankaj.raghav@linux.dev> Fixes: `a48373e7d3` ("xfs: start creating infrastructure for health monitoring") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Pankaj Raghav <p.raghav@samsung.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:49 +01:00
Darrick J. Wong	eb8550fb75	xfs: fix xfs_group release bug in xfs_dax_notify_dev_failure Chris Mason reports that his AI tools noticed that we were using xfs_perag_put and xfs_group_put to release the group reference returned by xfs_group_next_range. However, the iterator function returns an object with an active refcount, which means that we must use the correct function to release the active refcount, which is _rele. Cc: <stable@vger.kernel.org> # v6.0 Fixes: `6f643c57d5` ("xfs: implement ->notify_failure() for XFS") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:49 +01:00
Darrick J. Wong	161456987a	xfs: fix xfs_group release bug in xfs_verify_report_losses Chris Mason reports that his AI tools noticed that we were using xfs_perag_put and xfs_group_put to release the group reference returned by xfs_group_next_range. However, the iterator function returns an object with an active refcount, which means that we must use the correct function to release the active refcount, which is _rele. Fixes: `b8accfd65d` ("xfs: add media verification ioctl") Reported-by: Chris Mason <clm@meta.com> Link: https://lore.kernel.org/linux-xfs/20260206030527.2506821-1-clm@meta.com/ Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:48 +01:00
Darrick J. Wong	e764dd439d	xfs: fix copy-paste error in previous fix Chris Mason noticed that there is a copy-paste error in a recent change to xrep_dir_teardown that nulls out pointers after freeing the resources. Fixes: `ba408d299a` ("xfs: only call xf{array,blob}_destroy if we have a valid pointer") Link: https://lore.kernel.org/linux-xfs/20260205194211.2307232-1-clm@meta.com/ Reported-by: Chris Mason <clm@meta.com> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:48 +01:00
Ethan Tidmore	cddfa648f1	xfs: Fix error pointer dereference The function try_lookup_noperm() can return an error pointer and is not checked for one. Add checks for error pointer in xrep_adoption_check_dcache() and xrep_adoption_zap_dcache(). Detected by Smatch: fs/xfs/scrub/orphanage.c:449 xrep_adoption_check_dcache() error: 'd_child' dereferencing possible ERR_PTR() fs/xfs/scrub/orphanage.c:485 xrep_adoption_zap_dcache() error: 'd_child' dereferencing possible ERR_PTR() Fixes: `73597e3e42` ("xfs: ensure dentry consistency when the orphanage adopts a file") Cc: stable@vger.kernel.org # v6.16 Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:48 +01:00
Christoph Hellwig	47553dd60b	xfs: remove metafile inodes from the active inode stat The active inode (or active vnode until recently) stat can get much larger than expected on file systems with a lot of metafile inodes like zoned file systems on SMR hard disks with 10.000s of rtg rmap inodes. Remove all metafile inodes from the active counter to make it more useful to track actual workloads and add a separate counter for active metafile inodes. This fixes xfs/177 on SMR hard drives. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:48 +01:00
Christoph Hellwig	03a6d6c4c8	xfs: cleanup inode counter stats Most of them are unused, so mark them as such. Give the remaining ones names that match their use instead of the historic IRIX ones based on vnodes. Note that the names are purely internal to the XFS code, the user interface is based on section names and arrays of counters. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:48 +01:00
Wilfred Mallawa	fd81d3fd01	xfs: fix code alignment issues in xfs_ondisk.c Fixup some code alignment issues in xfs_ondisk.c Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:48 +01:00
Nirjhar Roy (IBM)	4ad85e633b	xfs: Replace &rtg->rtg_group with rtg_group() Use the already existing rtg_group() wrapper instead of directly accessing the struct xfs_group member in struct xfs_rtgroup. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> [cem: Conflict resolution against `06873dbd94`] Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:48 +01:00
Nirjhar Roy (IBM)	a49b7ff63f	xfs: Refactoring the nagcount and delta calculation Introduce xfs_growfs_compute_delta() to calculate the nagcount and delta blocks and refactor the code from xfs_growfs_data_private(). No functional changes. Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:48 +01:00
Nirjhar Roy (IBM)	18c16f602a	xfs: Replace ASSERT with XFS_IS_CORRUPT in xfs_rtcopy_summary() Replace ASSERT(sum > 0) with an XFS_IS_CORRUPT() and place it just after the call to xfs_rtget_summary() so that we don't end up using an illegal value of sum. Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-02-25 13:58:48 +01:00
Gao Xiang	4a2d046e4b	erofs: fix interlaced plain identification for encoded extents Only plain data whose start position and on-disk physical length are both aligned to the block size should be classified as interlaced plain extents. Otherwise, it must be treated as shifted plain extents. This issue was found by syzbot using a crafted compressed image containing plain extents with unaligned physical lengths, which can cause OOB read in z_erofs_transform_plain(). Reported-and-tested-by: syzbot+d988dc155e740d76a331@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/699d5714.050a0220.cdd3c.03e7.GAE@google.com Fixes: `1d191b4ca5` ("erofs: implement encoded extent metadata") Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2026-02-25 17:40:58 +08:00
Phillip Lougher	fdb24a820a	Squashfs: check metadata block offset is within range Syzkaller reports a "general protection fault in squashfs_copy_data" This is ultimately caused by a corrupted index look-up table, which produces a negative metadata block offset. This is subsequently passed to squashfs_copy_data (via squashfs_read_metadata) where the negative offset causes an out of bounds access. The fix is to check that the offset is within range in squashfs_read_metadata. This will trap this and other cases. Link: https://lkml.kernel.org/r/20260217050955.138351-1-phillip@squashfs.org.uk Fixes: `f400e12656` ("Squashfs: cache operations") Reported-by: syzbot+a9747fe1c35a5b115d3f@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/699234e2.a70a0220.2c38d7.00e2.GAE@google.com/ Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-24 11:13:27 -08:00
Jeff Layton	364410170a	nfsd: report the requested maximum number of threads instead of number running The current netlink and /proc interfaces deviate from their traditional values when dynamic threading is enabled, and there is currently no way to know what the current setting is. This patch brings the reporting back in line with traditional behavior. Make these interfaces report the requested maximum number of threads instead of the number currently running. Also, update documentation and comments to reflect that this value represents a maximum and not the number currently running. Fixes: `d8316b837c` ("nfsd: add controls to set the minimum number of threads per pool") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2026-02-24 10:27:51 -05:00
Christian Brauner	4a1ddb0f1c	pidfs: avoid misleading break The break would only break out of the scoped_guard() loop, not the switch statement. It still works correct as is ofc but let's avoid the confusion. Reported-by: David Lechner <dlechner@baylibre.com> Link:: https://lore.kernel.org/cd2153f1-098b-463c-bbc1-5c6ca9ef1f12@baylibre.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2026-02-24 12:09:00 +01:00
Ferry Meng	bf4fde7db4	erofs: remove more unnecessary #ifdefs Many #ifdefs can be replaced with IS_ENABLED() to improve code readability. No functional changes. Signed-off-by: Ferry Meng <mengferry@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2026-02-24 18:36:52 +08:00
Jann Horn	fdcfce9307	eventpoll: Fix integer overflow in ep_loop_check_proc() If a recursive call to ep_loop_check_proc() hits the `result = INT_MAX`, an integer overflow will occur in the calling ep_loop_check_proc() at `result = max(result, ep_loop_check_proc(ep_tovisit, depth + 1) + 1)`, breaking the recursion depth check. Fix it by using a different placeholder value that can't lead to an overflow. Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: `f2e467a482` ("eventpoll: Fix semi-unbounded recursion") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn <jannh@google.com> Link: https://patch.msgid.link/20260223-epoll-int-overflow-v1-1-452f35132224@google.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2026-02-24 10:21:30 +01:00
Mathieu Desnoyers	3b68df9781	rseq: slice ext: Ensure rseq feature size differs from original rseq size Before rseq became extensible, its original size was 32 bytes even though the active rseq area was only 20 bytes. This had the following impact in terms of userspace ecosystem evolution: * The GNU libc between 2.35 and 2.39 expose a __rseq_size symbol set to 32, even though the size of the active rseq area is really 20. * The GNU libc 2.40 changes this __rseq_size to 20, thus making it express the active rseq area. * Starting from glibc 2.41, __rseq_size corresponds to the AT_RSEQ_FEATURE_SIZE from getauxval(3). This means that users of __rseq_size can always expect it to correspond to the active rseq area, except for the value 32, for which the active rseq area is 20 bytes. Exposing a 32 bytes feature size would make life needlessly painful for userspace. Therefore, add a reserved field at the end of the rseq area to bump the feature size to 33 bytes. This reserved field is expected to be replaced with whatever field will come next, expecting that this field will be larger than 1 byte. The effect of this change is to increase the size from 32 to 64 bytes before we actually have fields using that memory. Clarify the allocation size and alignment requirements in the struct rseq uapi comment. Change the value returned by getauxval(AT_RSEQ_ALIGN) to return the value of the active rseq area size rounded up to next power of 2, which guarantees that the rseq structure will always be aligned on the nearest power of two large enough to contain it, even as it grows. Change the alignment check in the rseq registration accordingly. This will minimize the amount of ABI corner-cases we need to document and require userspace to play games with. The rule stays simple when __rseq_size != 32: #define rseq_field_available(field) (__rseq_size >= offsetofend(struct rseq_abi, field)) Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260220200642.1317826-3-mathieu.desnoyers@efficios.com	2026-02-23 11:19:19 +01:00
Hongbo Li	03c0d030f5	erofs: allow sharing page cache with the same aops only Inode with identical data but different @aops cannot be mixed because the page cache is managed by different subsystems (e.g., @aops for compressed on-disk inodes cannot handle plain on-disk inodes). In this patch, we never allow inodes to share the page cache among plain, compressed, and fileio cases. When a shared inode is created, we initialize @aops that is the same as the initial real inode, and subsequent inodes cannot share the page cache if the inferred @aops differ from the corresponding shared inode. This is reasonable as a first step because, in typical use cases, if an inode is compressible, it will fall into compressed inodes across different filesystem images unless users use plain filesystems. However, in that cases, users will use plain filesystems all the time. Fixes: `5ef3208e3b` ("erofs: introduce the page cache share feature") Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>	2026-02-23 18:04:18 +08:00
Nicholas Carlini	6b4f875aac	ksmbd: fix signededness bug in smb_direct_prepare_negotiation() smb_direct_prepare_negotiation() casts an unsigned __u32 value from sp->max_recv_size and req->preferred_send_size to a signed int before computing min_t(int, ...). A maliciously provided preferred_send_size of 0x80000000 will return as smaller than max_recv_size, and then be used to set the maximum allowed alowed receive size for the next message. By sending a second message with a large value (>1420 bytes) the attacker can then achieve a heap buffer overflow. This fix replaces min_t(int, ...) with min_t(u32) Fixes: `0626e6641f` ("cifsd: add server handler for central processing and tranport layers") Signed-off-by: Nicholas Carlini <nicholas@carlini.com> Reviewed-by: Stefan Metzmacher <metze@samba.org> Acked-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2026-02-22 21:27:33 -06:00
Eric Biggers	c5794709bc	ksmbd: Compare MACs in constant time To prevent timing attacks, MAC comparisons need to be constant-time. Replace the memcmp() with the correct function, crypto_memneq(). Fixes: `e2f34481b2` ("cifsd: add server-side procedures for SMB3") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2026-02-22 21:27:28 -06:00
Henrique Carvalho	663c28469d	smb: client: fix cifs_pick_channel when channels are equally loaded cifs_pick_channel uses (start % chan_count) when channels are equally loaded, but that can return a channel that failed the eligibility checks. Drop the fallback and return the scan-selected channel instead. If none is eligible, keep the existing behavior of using the primary channel. Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Acked-by: Meetakshi Setiya <msetiya@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>	2026-02-22 16:52:50 -06:00
Linus Torvalds	fbf3380361	fsverity fixes for v7.0-rc1 - Fix a build error on parisc - Remove the non-large-folio-aware function fsverity_verify_page() -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaZtg+xQcZWJpZ2dlcnNA a2VybmVsLm9yZwAKCRDzXCl4vpKOKwQ+AQCiXEYAibl3vHRgQo7qEPCC5or4FtkF HZ0ERRArhsU17AD/TKHE/AJkyFrwK4rGTb6I9Wi1OXnpG7jihZlYjj03Ag4= =CUql -----END PGP SIGNATURE----- Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux Pull fsverity fixes from Eric Biggers: - Fix a build error on parisc - Remove the non-large-folio-aware function fsverity_verify_page() * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux: fsverity: fix build error by adding fsverity_readahead() stub fsverity: remove fsverity_verify_page() f2fs: make f2fs_verify_cluster() partially large-folio-aware f2fs: remove unnecessary ClearPageUptodate in f2fs_verify_cluster()	2026-02-22 13:12:04 -08:00
Kees Cook	189f164e57	Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses Conversion performed via this Coccinelle script: // SPDX-License-Identifier: GPL-2.0-only // Options: --include-headers-for-types --all-includes --include-headers --keep-comments virtual patch @gfp depends on patch && !(file in "tools") && !(file in "samples")@ identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex, kzalloc_obj,kzalloc_objs,kzalloc_flex, kvmalloc_obj,kvmalloc_objs,kvmalloc_flex, kvzalloc_obj,kvzalloc_objs,kvzalloc_flex}; @@ ALLOC(... - , GFP_KERNEL ) $ make coccicheck MODE=patch COCCI=gfp.cocci Build and boot tested x86_64 with Fedora 42's GCC and Clang: Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-22 08:26:33 -08:00
Linus Torvalds	32a92f8c89	Convert more 'alloc_obj' cases to default GFP_KERNEL arguments This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 20:03:00 -08:00
Linus Torvalds	323bbfcf1e	Convert 'alloc_flex' family to use the new default GFP_KERNEL argument This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Linus Torvalds	bf4afc53b7	Convert 'alloc_obj' family to use the new default GFP_KERNEL argument This was done entirely with mindless brute force, using git grep -l '\<k[vmz]alloc_objs(., GFP_KERNEL)' \| xargs sed -i 's/$alloc_objs(.*$, GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Linus Torvalds	8934827db5	kmalloc_obj treewide refactoring for v7.0-rc1 -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaZl14wAKCRA2KwveOeQk uz8aAQCBFLYlij3Y3ivVADkBxuVF3xECaznFya41ENYsBwlHdwEArXqMyNrw+DiG TvWCK/tiddNmGIRpI2sxBFzyRpsHfAY= =rVD3 -----END PGP SIGNATURE----- Merge tag 'kmalloc_obj-treewide-v7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kmalloc_obj conversion from Kees Cook: "This does the tree-wide conversion to kmalloc_obj() and friends using coccinelle, with a subsequent small manual cleanup of whitespace alignment that coccinelle does not handle. This uncovered a clang bug in __builtin_counted_by_ref(), so the conversion is preceded by disabling that for current versions of clang. The imminent clang 22.1 release has the fix. I've done allmodconfig build tests for x86_64, arm64, i386, and arm. I did defconfig builds for alpha, m68k, mips, parisc, powerpc, riscv, s390, sparc, sh, arc, csky, xtensa, hexagon, and openrisc" * tag 'kmalloc_obj-treewide-v7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: kmalloc_obj: Clean up after treewide replacements treewide: Replace kmalloc with kmalloc_obj for non-scalar types compiler_types: Disable __builtin_counted_by_ref for Clang	2026-02-21 11:02:58 -08:00
Linus Torvalds	f9d66e64a2	io_uring-20260221 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmmZszYQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpvW9D/9qI0JrQ7zw9/zjFVOqFHVbwJUErTwUiSE/ bKpbDOUcm1utiTIw8nWuhhmXbPq8MT+3Uyyk24UFpcIM+QZrqUHRRPYcqHwNy0mw 3QXZut2QkwHKO/tWoKXIguCVQKtPteBjfH+La1IRZPLKK+MozQEloGMulbZwIP0j ynEX7tA8VYwuGtgmzBiTt7Ba2W5EIJ0SSfmdY9XajKDHrZCksx5iHnC/uJg0qEc0 QQr4YZisyz4VxG4MxsJvpv6/wiic07MRSYa5cJXntknmPGc9zj6wg79WWB/lkbuA w95+X+EorHyCvKk2Ny9uwl1/UOp9VQbKQnVIZ58LBC4ILPj6Qk7a7WyxTbfl358d QNhrcspgcLwakQkVkhR1Z4SaCqMdc6i1hXx5m2nhXFevKipeyg11uoM75bZEzWsN 63+6+S1upwSqFTk2nto8gKg+7aMGOvjT/Ae0kNvUD/WPXSkC66gSzADZdv2C4i4Y QdFs8UN8jKRXd5Rx7pthAM0E//XFHko/tGJnHUIzQ3Nk7xeXp4BeXOdp/m/FvpDe BakTvLHvvk5s1X9Bkv5hSXua6ZBFlgcsqn5VdVplQOdWPKwU0Mjg28z6AAzdZxLF mCcgDqDqls19E9HI7nFJOUJtBQjoCl4sVEuPvs06agY04sJTXWkYph5pwhkNJOKD NwumZJpb2A== =Pkge -----END PGP SIGNATURE----- Merge tag 'io_uring-20260221' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring fixes from Jens Axboe: - A fix for a missing URING_CMD128 opcode check, fixing an issue with the SQE mixed mode support introduced in 6.19. Merged late due to having multiple dependencies - Add sqe->cmd size checking for big SQEs, similar to what we have for normal sized SQEs - Fix a race condition in zcrx, that leads to a double free * tag 'io_uring-20260221' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: io_uring: Add size check for sqe->cmd io_uring: add IORING_OP_URING_CMD128 to opcode checks io_uring/zcrx: fix user_ref race between scrub and refill paths	2026-02-21 10:05:49 -08:00
Linus Torvalds	8eb604d4ee	two small server fixes -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmmYg8MACgkQiiy9cAdy T1HLbAwAjr1Eg4hDbI1GBLxFkB5pTxYmnHtj8eFJLNXpmpfe4/4f29e7S7hs3ChX UIIbFAFKPWAC7dFwAQ8rnYw4e3HPv+eFZxDzfi9Org8TxYGL9kew8JnqZtvCYtQd bHFndrZW3C8x2kj08O6uA0WWIhpCffZ4avig4Azkw5a6ukLzEpsLcODyJb/9Dx4Y EbHNjYZX3MMbbtNh47Bo2JVNjk5Nyxzf1kSdA46eHa5aQoy8DbMuJqFj0b9oUxqh E6QBPeZ0aZuSQZQUh3LtHnDS/aqFAn4zchtwALhHephU3SQZ1pLR8Uz+IwFI70MK bOCWI//LCGvdcXqJ6rspLplte9O6SoFNyqo+PuH1lnWCknnFNVKJWlOiu9LZnNe0 G99VONqwdKlXs5MDVp2LVpW5VbxJEw5TsrWGQJL6FGMGWiSDPt4rpcA/ugokKhVr aeqlV3dWXBJcKIE9G+8XHNn5lwsgfI4bfs8phdrKxEVSshEZbu/B2hg6X+f2guWJ poF2ZJ4g =UcGf -----END PGP SIGNATURE----- Merge tag 'v7.0-rc-part2-ksmbd-server-fixes' of git://git.samba.org/ksmbd Pull smb server fixes from Steve French: "Two small fixes: - fix potential deadlock - minor cleanup" * tag 'v7.0-rc-part2-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: call ksmbd_vfs_kern_path_end_removing() on some error paths smb: server: Remove duplicate include of misc.h	2026-02-21 09:11:32 -08:00
Kees Cook	69050f8d6d	treewide: Replace kmalloc with kmalloc_obj for non-scalar types This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void ", instead returning "TYPE ". Signed-off-by: Kees Cook <kees@kernel.org>	2026-02-21 01:02:28 -08:00
Linus Torvalds	b3f1da2a4d	for-7.0-rc1-tag -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmmYpGMACgkQxWXV+ddt WDvKNQ//cy7gb2nItc9hbYXUM/88Ks3Usu94u/4gREL9I97u2vHer6sT8cRfWA2k OSOZF2L7yPWIcxcj39YC66ubs7uebSwt/bZAL1TKAyA7wUvR/kdhD7DUTWX4ySf3 2+1BANv1Bng8C7vGnWDhYPHcb1u8LvKxKcn+9h8SzBGpW5dyx3k4xUrneaMYq+jf D9sPjkkM6fxsKn+S3OJP/zFUIQ2DQiv7nF+Jv4Ke2h9c9nCVfn8fRK0AuTlYXFY/ mWkKWo1ATGVd0fBg/otRp/ZlZczoKs3/1YBUMYTxZZngyweIms4Q6I4/GIGHO+RD QFFoIQ7OQd0aqBGhuKTDlYMlc6OS2jwoTgVYr6vSIxSRUsCK/grHdPL+s+9dLc3h p7+/URH9Gpfad46wFypb5w7zmmc8jCRkR1Ff+jf6Pi8GgffqocCro3C3HlGRKwcf CAj6gI3ypNPNFfYidcKbS+ehXhjmMVb9xhNa8YwCC1CdgM54ZMmEs/ksAN+uBc/u EfcAbB3T15LQgzUJs2WKvCI3E/0XUYEi54ng8UwCJ6P01p3egfvQo8t6jZal9Vx8 ba/LUG50W1xRRjxgG1AU5s42GmGkO8WNyIixmLlT+Pwog0I2auPVDQBudbXZK4ps +FOtNnN9hYLmuZyRSTT03MHHf0Rqtckdjvq3413KMFILVh+ZM+Q= =CQsu -----END PGP SIGNATURE----- Merge tag 'for-7.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - multiple error handling fixes of unexpected conditions - reset block group size class once it becomes empty so that its class can be changed - error message level adjustments - fixes of returned error values - use correct block reserve for delayed refs * tag 'for-7.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix invalid leaf access in btrfs_quota_enable() if ref key not found btrfs: fix lost error return in btrfs_find_orphan_roots() btrfs: fix lost return value on error in finish_verity() btrfs: change unaligned root messages to error level in btrfs_validate_super() btrfs: use the correct type to initialize block reserve for delayed refs btrfs: do not ASSERT() when the fs flips RO inside btrfs_repair_io_failure() btrfs: reset block group size class when it becomes empty btrfs: replace BUG() with error handling in __btrfs_balance() btrfs: handle unexpected exact match in btrfs_set_inode_index_count()	2026-02-20 14:57:09 -08:00
Linus Torvalds	233a0c0f44	eCryptfs fixes for 7.0-rc1 The set of eCryptfs patches for the 7.0-rc1 merge window consists of cleanups that are not intended to have any functional changes: - Comment typo fixes - Removal of an unused function declaration - Use strscpy() instead of the deprecated strcpy() - Use string copying helpers instead of memcpy() and manually terminating strings The patches have all spent time in linux-next and they do not regress the tests in the ecryptfs-utils tree. Signed-off-by: Tyler Hicks <code@tyhicks.com> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEKvuQkp28KvPJn/fVfL7QslYSS/0FAmmWWPwRHGNvZGVAdHlo aWNrcy5jb20ACgkQfL7QslYSS/3zQhAAhyHq55LJVxfSOACyabXxxQM4cpmN3vlu fgIsfly+mBYPKQg0/vfTAfC8ln5YhjFMcubVbohXcj6FmHGulqoB8A5w74wvksDN 6kzFIUEg/k10AKijt36gb6ef6BjN00j+LeeSUWZ9uMtJt3SR7UuIKpbxB2sRJSHv Q7DguW6Wr4APDyZxmmyPPjM4CFHjTO0a5ePSRkqtLV0Qebtnk1q/v9Ddidmz1N5t dWNwcn8mXYSzjorFhzhFy0wVML2nStF1DgEy+7FQVzmpHxPfGN/F07TGLj7rW+hr BI9GQpvLOJ/EJ8ssBFqe4Ns44upo/WLnBo4PoxD+iuvPBTJLeP/b7wnHduu6ndjy 4iZqStOhYNMWolQlnx/PmUAzBYn6K2hKOR/egUcHazWnpq5/FnXMKu6bWz4RE+fS n/Pi3XwIwos7zqToWnU7alcCKDwhzamRRicSYWX3ezzj3TbacFJTLv0KO+7W5HpE wqnLrwqSpC2tWl/4Qrm70ZekVK8ZMGW0+HwabXupWcAUZN6OmXy/B4sNkRCsZiuA TGeaiOMlH7AbF7QKvCI3YKRuPxKutgbzyqFfzSRYH97wPmf+1H0DRXfTVqMs1Wa8 97KBzOxKpiJdJkvMC82xKa+1usc8JrzeGcErhleSQTgzIijQvoh08wF+isunJROx aWOTbdbl508= =Ak1f -----END PGP SIGNATURE----- Merge tag 'ecryptfs-7.0-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs Pull ecryptfs updates from Tyler Hicks: "This consists of some really minor typo fixes that fell through the cracks and some more recent code cleanups: - Comment typo fixes - Removal of an unused function declaration - Use strscpy() instead of the deprecated strcpy() - Use string copying helpers instead of memcpy() and manually terminating strings" * tag 'ecryptfs-7.0-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs: ecryptfs: Replace memcpy + NUL termination in ecryptfs_copy_filename ecryptfs: Drop redundant NUL terminations after calling ecryptfs_to_hex ecryptfs: Replace memcpy + NUL termination in ecryptfs_new_file_context ecryptfs: Replace strcpy with strscpy in ecryptfs_validate_options ecryptfs: Replace strcpy with strscpy in ecryptfs_cipher_code_to_string ecryptfs: Replace strcpy with strscpy in ecryptfs_set_default_crypt_stat_vals ecryptfs: simplify list initialization in ecryptfs_parse_packet_set() ecryptfs: Remove unused declartion ecryptfs_fill_zeros() ecryptfs: Fix packet format comment in parse_tag_67_packet() ecryptfs: comment typo fix ecryptfs: keystore: Fix typo 'the the' in comment	2026-02-20 14:46:31 -08:00
Ethan Tidmore	f6a495484a	proc: Fix pointer error dereference The function try_lookup_noperm() can return an error pointer. Add check for error pointer. Detected by Smatch: fs/proc/base.c:2148 proc_fill_cache() error: 'child' dereferencing possible ERR_PTR() Fixes: `1df98b8bbc` ("proc_fill_cache(): clean up, get rid of pointless find_inode_number() use") Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com> Link: https://patch.msgid.link/20260219221001.1117135-1-ethantidmore06@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2026-02-20 11:21:16 +01:00
Govindarajulu Varadarajan	ea129e55c9	io_uring: Add size check for sqe->cmd For SQE128, sqe->cmd provides 80 bytes for uring_cmd. Add macro to check if size of user struct does not exceed 80 bytes at compile time. User doesn't have to track this manually during development. Replace io_uring_sqe_cmd() inline func with macro and add io_uring_sqe128_cmd() which checks struct size for 16 bytes cmd and 80 bytes cmd respectively. Signed-off-by: Govindarajulu Varadarajan <govind.varadar@gmail.com> Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-02-19 07:26:26 -07:00
Darrick J. Wong	294f54f849	fserror: fix lockdep complaint when igrabbing inode Christoph Hellwig reported a lockdep splat in generic/108: ================================ WARNING: inconsistent lock state 6.19.0+ #4827 Tainted: G N -------------------------------- inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. swapper/1/0 [HC1[1]:SC0[0]:HE0:SE1] takes: ffff88811ed1b140 (&sb->s_type->i_lock_key#33){?.+.}-{3:3}, at: igrab+0x1a/0xb0 {HARDIRQ-ON-W} state was registered at: lock_acquire+0xca/0x2c0 _raw_spin_lock+0x2e/0x40 unlock_new_inode+0x2c/0xc0 xfs_iget+0xcf4/0x1080 xfs_trans_metafile_iget+0x3d/0x100 xfs_metafile_iget+0x2b/0x50 xfs_mount_setup_metadir+0x20/0x60 xfs_mountfs+0x457/0xa60 xfs_fs_fill_super+0x6b3/0xa90 get_tree_bdev_flags+0x13c/0x1e0 vfs_get_tree+0x27/0xe0 vfs_cmd_create+0x54/0xe0 __do_sys_fsconfig+0x309/0x620 do_syscall_64+0x8b/0xf80 entry_SYSCALL_64_after_hwframe+0x76/0x7e irq event stamp: 139080 hardirqs last enabled at (139079): [<ffffffff813a923c>] do_idle+0x1ec/0x270 hardirqs last disabled at (139080): [<ffffffff828a8d09>] common_interrupt+0x19/0xe0 softirqs last enabled at (139032): [<ffffffff8134a853>] __irq_exit_rcu+0xc3/0x120 softirqs last disabled at (139025): [<ffffffff8134a853>] __irq_exit_rcu+0xc3/0x120 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&sb->s_type->i_lock_key#33); <Interrupt> lock(&sb->s_type->i_lock_key#33); * DEADLOCK * 1 lock held by swapper/1/0: #0: ffff8881052c81a0 (&vblk->vqs[i].lock){-.-.}-{3:3}, at: virtblk_done+0x4b/0x110 stack backtrace: CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Tainted: G N 6.19.0+ #4827 PREEMPT(full) Tainted: [N]=TEST Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014 Call Trace: <IRQ> dump_stack_lvl+0x5b/0x80 print_usage_bug.part.0+0x22c/0x2c0 mark_lock+0xa6f/0xe90 __lock_acquire+0x10b6/0x25e0 lock_acquire+0xca/0x2c0 _raw_spin_lock+0x2e/0x40 igrab+0x1a/0xb0 fserror_report+0x135/0x260 iomap_finish_ioend_buffered+0x170/0x210 clone_endio+0x8f/0x1c0 blk_update_request+0x1e4/0x4d0 blk_mq_end_request+0x1b/0x100 virtblk_done+0x6f/0x110 vring_interrupt+0x59/0x80 __handle_irq_event_percpu+0x8a/0x2e0 handle_irq_event+0x33/0x70 handle_edge_irq+0xdd/0x1e0 __common_interrupt+0x6f/0x180 common_interrupt+0xb7/0xe0 </IRQ> It looks like the concern here is that inode::i_lock is sometimes taken in IRQ context, and sometimes it is held when going to IRQ context, though it's a little difficult to tell since I think this is a kernel from after the actual 6.19 release but before 7.0-rc1. Either way, we don't need to take i_lock, because filesystems should not report files to fserror if they're about to be freed or have not yet been exposed to other threads, because the resulting fsnotify report will be meaningless. Therefore, bump inode::i_count directly and clarify the preconditions on the inode being passed in. Link: https://lore.kernel.org/linux-fsdevel/aY7BndIgQg3ci_6s@infradead.org/ Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Link: https://patch.msgid.link/177148129564.716249.3069780698231701540.stgit@frogsfrogsfrogs Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2026-02-19 09:12:08 +01:00
Linus Torvalds	eeccf287a2	mm.git review status for linus..mm-stable Total patches: 36 Reviews/patch: 1.77 Reviewed rate: 83% - The 2 patch series "mm/vmscan: fix demotion targets checks in reclaim/demotion" from Bing Jiao fixes a couple of issues in the demotion code - pages were failed demotion and were finding themselves demoted into disallowed nodes. - The 11 patch series "Remove XA_ZERO from error recovery of dup_mmap()" from Liam Howlett fixes a rare mapledtree race and performs a number of cleanups. - The 13 patch series "mm: add bitmap VMA flag helpers and convert all mmap_prepare to use them" from Lorenzo Stoakes implements a lot of cleanups following on from the conversion of the VMA flags into a bitmap. - The 5 patch series "support batch checking of references and unmapping for large folios" from Baolin Wang implements batching to greatly improve the performance of reclaiming clean file-backed large folios. - The 3 patch series "selftests/mm: add memory failure selftests" from Miaohe Lin does as claimed. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaZaIEQAKCRDdBJ7gKXxA jj73AQCQDwLoipDiQRGyjB5BDYydymWuDoiB1tlDPHfYAP3b/QD/UQtVlOEXqwM3 naOKs3NQ1pwnfhDaQMirGw2eAnJ1SQY= =6Iif -----END PGP SIGNATURE----- Merge tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: - "mm/vmscan: fix demotion targets checks in reclaim/demotion" fixes a couple of issues in the demotion code - pages were failed demotion and were finding themselves demoted into disallowed nodes (Bing Jiao) - "Remove XA_ZERO from error recovery of dup_mmap()" fixes a rare mapledtree race and performs a number of cleanups (Liam Howlett) - "mm: add bitmap VMA flag helpers and convert all mmap_prepare to use them" implements a lot of cleanups following on from the conversion of the VMA flags into a bitmap (Lorenzo Stoakes) - "support batch checking of references and unmapping for large folios" implements batching to greatly improve the performance of reclaiming clean file-backed large folios (Baolin Wang) - "selftests/mm: add memory failure selftests" does as claimed (Miaohe Lin) * tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (36 commits) mm/page_alloc: clear page->private in free_pages_prepare() selftests/mm: add memory failure dirty pagecache test selftests/mm: add memory failure clean pagecache test selftests/mm: add memory failure anonymous page test mm: rmap: support batched unmapping for file large folios arm64: mm: implement the architecture-specific clear_flush_young_ptes() arm64: mm: support batch clearing of the young flag for large folios arm64: mm: factor out the address and ptep alignment into a new helper mm: rmap: support batched checks of the references for large folios tools/testing/vma: add VMA userland tests for VMA flag functions tools/testing/vma: separate out vma_internal.h into logical headers tools/testing/vma: separate VMA userland tests into separate files mm: make vm_area_desc utilise vma_flags_t only mm: update all remaining mmap_prepare users to use vma_flags_t mm: update shmem_[kernel]_file_*() functions to use vma_flags_t mm: update secretmem to use VMA flags on mmap_prepare mm: update hugetlbfs to use VMA flags on mmap_prepare mm: add basic VMA flag operation helper functions tools: bitmap: add missing bitmap_[subset(), andnot()] mm: add mk_vma_flags() bitmap flag macro helper ...	2026-02-18 20:50:32 -08:00
Linus Torvalds	23b0f90ba8	Summary * Removed macros from proc handler converters Replace the proc converter macros with "regular" functions. Though it is more verbose than the macro version, it helps when debugging and better aligns with coding-style.rst. * General cleanup Remove superfluous ctl_table forward declarations. Const qualify the memory_allocation_profiling_sysctl and loadpin_sysctl_table arrays. Add missing kernel doc to proc_dointvec_conv. * Testing This series was run through sysctl selftests/kunit test suite in x86_64. And went into linux-next after rc4, giving it a good 3 weeks of testing -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmmUabYACgkQupfNUreW QU8y2Qv/d2y35uQPRDh0HKWKWXJy41C2RJzd/rFCWJPCwo150whTSHIHkWYnu76g 10QblBXQmXi9TVqFnJ7Il7PWgqkMPjzA13tfT9eXNWU8j2OB/mcVKNl9X4wm/jWi QxtGmBsIQ/nxb2pUzMCykzgfc5mLi2NQ8qhZ5bOnq7UW3zdYmzEqx+tRdvIacyIk adComi5v8xUDqyEbVFaBovuX2WHQkPyBMnD64nwWG93JpNG/+9PxGzv/DNUXY11Y epVOfSoKdJbSLjYoHEPEhT0aHjSydq3QHru7uF6wzKOFTfHej/XkXXbUnFXPO2Pn c5J0u/HziYG5eN2QTqGfrhECZYuCFPemtUozltbcgGebkl1wKH+k9K5vsCaz/mhk ihUC3mui++W/n9B9HJRYh1XeEpk6C1pWERCOx27XFZ25fSek2YO6ZWkT0q+gceC0 t4+eIFSGJ3OzheJgHNK9XhTMWiQPmHyA6brXYGx4WeRvJFLpVddPF7k3Z89zIAu/ Fut7FGTH =0Z+I -----END PGP SIGNATURE----- Merge tag 'sysctl-7.00-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl updates from Joel Granados: - Remove macros from proc handler converters Replace the proc converter macros with "regular" functions. Though it is more verbose than the macro version, it helps when debugging and better aligns with coding-style.rst. - General cleanup Remove superfluous ctl_table forward declarations. Const qualify the memory_allocation_profiling_sysctl and loadpin_sysctl_table arrays. Add missing kernel doc to proc_dointvec_conv. - Testing This series was run through sysctl selftests/kunit test suite in x86_64. And went into linux-next after rc4, giving it a good 3 weeks of testing * tag 'sysctl-7.00-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: sysctl: replace SYSCTL_INT_CONV_CUSTOM macro with functions sysctl: Replace unidirectional INT converter macros with functions sysctl: Add kernel doc to proc_douintvec_conv sysctl: Replace UINT converter macros with functions sysctl: Add CONFIG_PROC_SYSCTL guards for converter macros sysctl: clarify proc_douintvec_minmax doc sysctl: Return -ENOSYS from proc_douintvec_conv when CONFIG_PROC_SYSCTL=n sysctl: Remove unused ctl_table forward declarations loadpin: Implement custom proc_handler for enforce alloc_tag: move memory_allocation_profiling_sysctls into .rodata sysctl: Add missing kernel-doc for proc_dointvec_conv	2026-02-18 10:45:36 -08:00
Filipe Manana	ecb7c2484c	btrfs: fix invalid leaf access in btrfs_quota_enable() if ref key not found If btrfs_search_slot_for_read() returns 1, it means we did not find any key greater than or equals to the key we asked for, meaning we have reached the end of the tree and therefore the path is not valid. If this happens we need to break out of the loop and stop, instead of continuing and accessing an invalid path. Fixes: `5223cc60b4` ("btrfs: drop the path before adding qgroup items when enabling qgroups") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-18 15:25:54 +01:00
Filipe Manana	7b54e08f2e	btrfs: fix lost error return in btrfs_find_orphan_roots() If the call to btrfs_get_fs_root() returns an error different from -ENOENT we break out of the loop and then return 0, losing the error. Fix this by returning the error instead of breaking from the loop. Reported-by: Chris Mason <clm@meta.com> Link: https://lore.kernel.org/linux-btrfs/20260208185321.1128472-1-clm@meta.com/ Fixes: `8670a25ecb` ("btrfs: use single return variable in btrfs_find_orphan_roots()") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-18 15:25:54 +01:00
Filipe Manana	29e525665a	btrfs: fix lost return value on error in finish_verity() If btrfs_update_inode() or del_orphan() fail, we jump to the 'end_trans' label and then return 0 instead of the error returned by one of those calls. Fix this and return the error. Fixes: `61fb7f04ee` ("btrfs: remove out label in finish_verity()") Reported-by: Chris Mason <clm@meta.com> Link: https://lore.kernel.org/linux-btrfs/20260208161129.3888234-1-clm@meta.com/ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-18 15:25:54 +01:00

1 2 3 4 5 ...

104278 commits