526 commits

Author SHA1 Message Date
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.
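
The substitution can be tried on a single line in isolation. The
following is a minimal userspace sketch, assuming a POSIX system that
provides <regex.h>. The helper name drop_gfp_kernel() and the sample
lines are illustrative assumptions and are not part of the commit. It
compiles the same basic regular expression that the sed command uses
and rewrites one line of text:

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the kernel tree): apply the same
 * basic regular expression as the sed command to one line of text.
 * Returns 1 if the line was rewritten, 0 if it was copied unchanged,
 * and -1 on error (bad pattern or output buffer too small).
 */
static int drop_gfp_kernel(const char *line, char *out, size_t outsz)
{
	/* Same BRE as in the sed script; group 1 is the part that is kept. */
	static const char pattern[] = "\\(alloc_objs*(.*\\), GFP_KERNEL)";
	regex_t re;
	regmatch_t m[2];
	int n;

	assert(line && out && outsz > 0);

	if (regcomp(&re, pattern, 0) != 0)	/* flags 0 selects BRE syntax */
		return -1;

	if (regexec(&re, line, 2, m, 0) != 0) {
		/* No match: copy the line through untouched. */
		regfree(&re);
		n = snprintf(out, outsz, "%s", line);
		return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
	}
	regfree(&re);

	/* Text before the match + group 1 + ")" + text after the match. */
	n = snprintf(out, outsz, "%.*s%.*s)%s",
		     (int)m[0].rm_so, line,
		     (int)(m[1].rm_eo - m[1].rm_so), line + m[1].rm_so,
		     line + m[0].rm_eo);
	return (n < 0 || (size_t)n >= outsz) ? -1 : 1;
}
```

Given the line "p = kzalloc_obj(*p, GFP_KERNEL);" the helper produces
"p = kzalloc_obj(*p);", while a line that passes GFP_ATOMIC is copied
through unchanged. Like the sed command, it only sees one line at a
time, so calls spread over multiple lines are not matched.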

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)
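
The two argument forms (a type name or *VAR) and the typed result can
be illustrated outside the kernel. The sketch below is a userspace
analogy only: demo_alloc_obj() and demo_alloc_objs() are hypothetical
macros built on calloc() and the GCC/Clang __typeof__ extension. They
are not the kernel's kmalloc_obj() implementation, which this log does
not show.

```c
#include <assert.h>
#include <stdlib.h>

struct point {
	int x;
	int y;
};

/*
 * Hypothetical userspace stand-ins, NOT the kernel macros. The argument
 * may be a type name or an expression such as *var; __typeof__ accepts
 * both, and neither sizeof nor __typeof__ evaluates its operand here.
 */
#define demo_alloc_obj(TYPE) \
	((__typeof__(TYPE) *)calloc(1, sizeof(TYPE)))

#define demo_alloc_objs(TYPE, COUNT) \
	((__typeof__(TYPE) *)calloc((COUNT), sizeof(TYPE)))

/* Allocate count zeroed points and fill in the last one. */
static struct point *make_points(size_t count)
{
	struct point *pts;

	assert(count > 0);
	pts = demo_alloc_objs(*pts, count);	/* result is struct point * */
	if (!pts)
		return NULL;
	pts[count - 1].x = 3;
	pts[count - 1].y = 4;
	return pts;
}
```

Because the macros yield a pointer to the named type rather than
"void *", assigning the result to a pointer of an unrelated struct
type typically draws a compiler diagnostic instead of converting
silently.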

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Shardul Bankar
ebebb04bae hfsplus: avoid double unload_nls() on mount failure
The recent commit "hfsplus: ensure sb->s_fs_info is always cleaned up"
[1] introduced a custom ->kill_sb() handler (hfsplus_kill_super) that
cleans up the s_fs_info structure (including the NLS table) on
superblock destruction.

However, the error handling path in hfsplus_fill_super() still calls
unload_nls() before returning an error. Since the VFS layer calls
->kill_sb() when fill_super fails, this results in unload_nls() being
called twice for the same sbi->nls pointer: once in hfsplus_fill_super()
and again in hfsplus_kill_super() (via delayed_free).

Remove the explicit unload_nls() call from the error path in
hfsplus_fill_super() to rely solely on the cleanup in ->kill_sb().
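
The ownership rule behind this fix can be modeled in a few lines of
userspace C. This is a sketch under stated assumptions: every demo_*
name is hypothetical and only mirrors the roles described above
(a fill step that can fail and a kill step that always runs), and
free() stands in for the real NLS unload.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the NLS table and the per-sb info. */
struct demo_nls {
	int unused;
};

struct demo_sb_info {
	struct demo_nls *nls;
};

static int demo_unload_calls;	/* counts releases for the demonstration */

static void demo_unload_nls(struct demo_nls *nls)
{
	if (nls) {
		demo_unload_calls++;
		free(nls);
	}
}

/* Fails on purpose and leaves sbi->nls for the single cleanup path. */
static int demo_fill_super(struct demo_sb_info *sbi)
{
	assert(sbi);
	sbi->nls = calloc(1, sizeof(*sbi->nls));
	if (!sbi->nls)
		return -ENOMEM;
	/* A later setup step fails; no demo_unload_nls() call here. */
	return -EINVAL;
}

/* The only place that releases sbi->nls in this model. */
static void demo_kill_super(struct demo_sb_info *sbi)
{
	assert(sbi);
	demo_unload_nls(sbi->nls);
	sbi->nls = NULL;
}
```

In this model a failed demo_fill_super() followed by
demo_kill_super() releases the table exactly once. Adding a release
to the error path as well would free the same pointer twice, which is
the situation the patch removes.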

[1] https://lore.kernel.org/r/20251201222843.82310-3-mehdi.benhadjkhelifa@gmail.com/

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20260203043806.GF3183987@ZenIV/
Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
Link: https://lore.kernel.org/r/20260204170440.1337261-1-shardul.b@mpiricsoftware.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2026-02-06 15:20:00 -08:00
Viacheslav Dubeyko
14b428cfba hfsplus: fix warning issue in inode.c
This patch fixes the sparse warning issue in inode.c
by adding static to hfsplus_symlink_inode_operations
and hfsplus_special_inode_operations declarations.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601291957.bunRsD8R-lkp@intel.com/
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20260129195442.594884-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2026-01-29 11:59:01 -08:00
Viacheslav Dubeyko
aef5078471 hfsplus: fix generic/062 xfstests failure
The xfstests' test-case generic/062 fails to execute
correctly:

FSTYP -- hfsplus
PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.15.0-rc4+ #8 SMP PREEMPT_DYNAMIC Thu May 1 16:43:22 PDT 2025
MKFS_OPTIONS -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/062 - output mismatch (see xfstests-dev/results//generic/062.out.bad)

The generic/062 test tries to set and get xattrs for various types
of objects (regular file, folder, block device, character
device, pipe, etc.) with the goal of checking that xattr operations
work correctly for all possible types of file system objects.
However, the current HFS+ implementation does not support
xattr operations for block device, character device, and
pipe objects. Also, its set of operations for symlinks is
not completely correct.

This patch properly declares the xattr operations in
hfsplus_special_inode_operations and hfsplus_symlink_inode_operations.
Also, it slightly corrects the logic of the hfsplus_listxattr()
method.

sudo ./check generic/062
FSTYP         -- hfsplus
PLATFORM      -- Linux/x86_64 hfsplus-testing-0001 6.19.0-rc1+ #59 SMP PREEMPT_DYNAMIC Mon Jan 19 16:26:21 PST 2026
MKFS_OPTIONS  -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/062 20s ...  20s
Ran: generic/062
Passed all 1 tests

[1] https://github.com/hfs-linux-kernel/hfs-linux-kernel/issues/93

Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20260120041937.3450928-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2026-01-28 14:53:15 -08:00
Viacheslav Dubeyko
b18c5b84fa hfsplus: fix generic/037 xfstests failure
The xfstests' test-case generic/037 fails to execute
correctly:

FSTYP -- hfsplus
PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.15.0-rc4+ #8 SMP PREEMPT_DYNAMIC Thu May 1 16:43:22 PDT 2025
MKFS_OPTIONS -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/037 - output mismatch (see xfstests-dev/results//generic/037.out.bad)

The goal of generic/037 test-case is to "verify that replacing
a xattr's value is an atomic operation". The test "consists of
removing the old value and then inserting the new value in a btree.
This made readers (getxattr and listxattrs) not getting neither
the old nor the new value during a short time window".

HFS+ fails this test because the __hfsplus_setxattr() method [1]
implements the xattr replace operation in a non-atomic
way [2]:

	if (hfsplus_attr_exists(inode, name)) {
		if (flags & XATTR_CREATE) {
			pr_err("xattr exists yet\n");
			err = -EOPNOTSUPP;
			goto end_setxattr;
		}
		err = hfsplus_delete_attr(inode, name);
		if (err)
			goto end_setxattr;
		err = hfsplus_create_attr(inode, name, value, size);
		if (err)
			goto end_setxattr;
	}

The main issue with this logic is that it implements the delete and
create of an xattr as independent atomic operations, but the replace
operation as a whole is not atomic. This patch implements
a new hfsplus_replace_attr() method that makes the xattr replace
operation atomic. Also, it reworks hfsplus_create_attr() and
hfsplus_delete_attr() with the goal of reusing the common logic
in the hfsplus_replace_attr() method.
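
The window that a reader can fall into is easy to show with a toy
model. The sketch below is not the HFS+ code: it keeps one attribute
in memory, all demo_* names are hypothetical, and a callback stands
in for a concurrent reader that may run between two separately
atomic steps.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical single-attribute store; not the HFS+ Attributes File. */
struct demo_attr {
	int present;
	char value[32];
};

/* Invoked at every point where a concurrent reader could run. */
typedef void (*demo_reader_fn)(const struct demo_attr *attr, void *ctx);

static void demo_set(struct demo_attr *attr, const char *value)
{
	snprintf(attr->value, sizeof(attr->value), "%s", value);
	attr->present = 1;
}

/* Old scheme: two separately atomic steps with a window in between. */
static void demo_delete_then_create(struct demo_attr *attr, const char *value,
				    demo_reader_fn reader, void *ctx)
{
	attr->present = 0;		/* step 1: delete */
	reader(attr, ctx);		/* a reader may run here */
	demo_set(attr, value);		/* step 2: create */
	reader(attr, ctx);
}

/* New scheme: one step, so a reader sees the old or the new value. */
static void demo_replace(struct demo_attr *attr, const char *value,
			 demo_reader_fn reader, void *ctx)
{
	demo_set(attr, value);
	reader(attr, ctx);
}

/* Example reader: counts how often the attribute was missing. */
static void demo_count_missing(const struct demo_attr *attr, void *ctx)
{
	int *missing = ctx;

	assert(attr && missing);
	if (!attr->present)
		(*missing)++;
}
```

With demo_count_missing() as the reader, the delete-then-create path
reports the attribute as missing once, while the single-step replace
never does. That missing state is what generic/037 detects.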

sudo ./check generic/037
FSTYP         -- hfsplus
PLATFORM      -- Linux/x86_64 hfsplus-testing-0001 6.19.0-rc1+ #47 SMP PREEMPT_DYNAMIC Thu Jan  8 15:37:20 PST 2026
MKFS_OPTIONS  -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/037 37s ...  37s
Ran: generic/037
Passed all 1 tests

[1] https://elixir.bootlin.com/linux/v6.19-rc4/source/fs/hfsplus/xattr.c#L261
[2] https://elixir.bootlin.com/linux/v6.19-rc4/source/fs/hfsplus/xattr.c#L338

Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20260109234213.2805400-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2026-01-19 19:46:21 -08:00
Tetsuo Handa
ed8889ca21 hfsplus: pretend special inodes as regular files
Since commit af153bb63a ("vfs: catch invalid modes in may_open()")
requires any inode be one of S_IFDIR/S_IFLNK/S_IFREG/S_IFCHR/S_IFBLK/
S_IFIFO/S_IFSOCK type, use S_IFREG for special inodes.

Reported-by: syzbot <syzbot+895c23f6917da440ed0d@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Link: https://lore.kernel.org/r/d0a07b1b-8b73-4002-8e29-e2bd56871262@I-love.SAKURA.ne.jp
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2026-01-06 12:41:50 -08:00
Shardul Bankar
d8a73cc46c hfsplus: return error when node already exists in hfs_bnode_create
When hfs_bnode_create() finds that a node is already hashed (which should
not happen in normal operation), it currently returns the existing node
without incrementing its reference count. This causes a reference count
inconsistency that leads to a kernel panic when the node is later freed
in hfs_bnode_put():

    kernel BUG at fs/hfsplus/bnode.c:676!
    BUG_ON(!atomic_read(&node->refcnt))

This scenario can occur when hfs_bmap_alloc() attempts to allocate a node
that is already in use (e.g., when node 0's bitmap bit is incorrectly
unset), or due to filesystem corruption.

Returning an existing node from a create path is not normal operation.

Fix this by returning ERR_PTR(-EEXIST) instead of the node when it's
already hashed. This properly signals the error condition to callers,
which already check for IS_ERR() return values.
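
For readers unfamiliar with the error-pointer idiom, here is a
userspace re-creation of it together with a create path that refuses
to hand out an already-hashed node. All demo_* names, the fixed-size
table, and the 4095 limit written as DEMO_MAX_ERRNO are assumptions
made for this sketch; it is not the hfs_bnode_create() code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace model: small negative errno values encoded as pointers. */
#define DEMO_MAX_ERRNO 4095

static inline void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_node {
	unsigned int id;
	int refcnt;
};

#define DEMO_SLOTS 8
static struct demo_node *demo_hash[DEMO_SLOTS];	/* toy table indexed by id */

/* Create path: an already-hashed node is an error, never a result. */
static struct demo_node *demo_node_create(unsigned int id)
{
	struct demo_node *node;

	assert(id < DEMO_SLOTS);
	if (demo_hash[id])
		return demo_err_ptr(-EEXIST);

	node = calloc(1, sizeof(*node));
	if (!node)
		return demo_err_ptr(-ENOMEM);
	node->id = id;
	node->refcnt = 1;	/* the caller owns this reference */
	demo_hash[id] = node;
	return node;
}
```

A second demo_node_create() call for the same id yields a pointer for
which demo_is_err() is true and demo_ptr_err() is -EEXIST. The first
node's reference count stays at 1, so no caller ends up holding a
reference that was never taken.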

Reported-by: syzbot+1c8ff72d0cd8a50dfeaa@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=1c8ff72d0cd8a50dfeaa
Link: https://lore.kernel.org/all/784415834694f39902088fa8946850fc1779a318.camel@ibm.com/
Fixes: 634725a929 ("[PATCH] hfs: cleanup HFS+ prints")
Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Link: https://lore.kernel.org/r/20251229204938.1907089-1-shardul.b@mpiricsoftware.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2026-01-06 12:40:49 -08:00
Viacheslav Dubeyko
413466f3f0 hfsplus: fix generic/020 xfstests failure
The xfstests' test-case generic/020 fails to execute
correctly:

FSTYP -- hfsplus
PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.15.0-rc4+ #8 SMP PREEMPT_DYNAMIC Thu May 1 16:43:22 PDT 2025
MKFS_OPTIONS -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/020 _check_generic_filesystem: filesystem on /dev/loop50 is inconsistent
(see xfstests-dev/results//generic/020.full for details)

    *** add lots of attributes
    *** check
        *** MAX_ATTRS attribute(s)
        +/mnt/test/attribute_12286: Numerical result out of range
        *** -1 attribute(s)
        *** remove lots of attributes
        ...
        (Run 'diff -u /xfstests-dev/tests/generic/020.out /xfstests-dev/results//generic/020.out.bad' to see the entire diff)

The generic/020 test creates more than 100 xattrs and gives them
names of the form user.attribute_<number> (for example,
user.attribute_101). As the next step, listxattr() is called with
the goal of checking that the xattrs were created correctly.
However, there was an issue in the hfsplus_listxattr() logic.
This method re-uses the fd.key->attr.key_name.unicode and strbuf
buffers in the loop without re-initialization. As a result, part
of the previous name could remain in the buffers. For example,
user.attribute_101 could be processed before user.attribute_54.
The issue resulted in the name user.attribute_541 being formed
instead of user.attribute_54. This patch adds initialization of
the fd.key->attr.key_name.unicode and strbuf buffers before
calling the hfs_brec_goto() method, which prepares the next name
in the buffer.
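
The stale-buffer effect can be reproduced with plain byte buffers.
The sketch below is a userspace illustration, not the
hfsplus_listxattr() code: demo_load_name() is a hypothetical helper
that copies a length-counted name without a terminator into a reused
buffer, optionally clearing the buffer first.

```c
#include <assert.h>
#include <string.h>

#define DEMO_NAME_MAX 64

/*
 * Hypothetical stand-in for the per-record step of a listing loop.
 * The source name is length-counted and carries no terminator, so
 * whatever the buffer held beyond len bytes stays in place unless
 * the buffer is cleared first.
 */
static void demo_load_name(char *buf, const char *name, size_t len,
			   int clear_first)
{
	assert(buf && name && len < DEMO_NAME_MAX);
	if (clear_first)
		memset(buf, 0, DEMO_NAME_MAX);	/* the re-initialization step */
	memcpy(buf, name, len);			/* copies len bytes, no '\0' */
}
```

Loading the 18-byte name user.attribute_101 and then the 17-byte
name user.attribute_54 into a zero-initialized buffer of
DEMO_NAME_MAX bytes with clear_first set to 0 leaves
user.attribute_541 in the buffer. With clear_first set to 1 the
second load yields user.attribute_54.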

The HFS+ logic supports only inline xattrs. Such extended attributes
can store values no bigger than 3802 bytes [1]. This limitation
requires a correction of the generic/020 logic. With that in place,
generic/020 can be executed without any issue:

sudo ./check generic/020
FSTYP         -- hfsplus
PLATFORM      -- Linux/x86_64 hfsplus-testing-0001 6.19.0-rc1+ #44 SMP PREEMPT_DYNAMIC Mon Dec 22 15:39:00 PST 2025
MKFS_OPTIONS  -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/020 31s ...  38s
Ran: generic/020
Passed all 1 tests

[1] https://elixir.bootlin.com/linux/v6.19-rc2/source/include/linux/hfs_common.h#L626

Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20251224002810.1137139-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2026-01-06 12:38:10 -08:00
Viacheslav Dubeyko
9a8c4ad447 hfsplus: fix volume corruption issue for generic/498
The xfstests' test-case generic/498 leaves the HFS+ volume
in a corrupted state:

sudo ./check generic/498
FSTYP -- hfsplus
PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.18.0-rc1+ #18 SMP PREEMPT_DYNAMIC Thu Dec 4 12:24:45 PST 2025
MKFS_OPTIONS -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/498 _check_generic_filesystem: filesystem on /dev/loop51 is inconsistent
(see XFSTESTS-2/xfstests-dev/results//generic/498.full for details)

Ran: generic/498
Failures: generic/498
Failed 1 of 1 tests

sudo fsck.hfsplus -d /dev/loop51
** /dev/loop51
Using cacheBlockSize=32K cacheTotalBlock=1024 cacheSize=32768K.
Executing fsck_hfs (version 540.1-Linux).
** Checking non-journaled HFS Plus Volume.
The volume name is untitled
** Checking extents overflow file.
** Checking catalog file.
Invalid leaf record count
(It should be 16 instead of 2)
** Checking multi-linked files.
CheckHardLinks: found 1 pre-Leopard file inodes.
** Checking catalog hierarchy.
** Checking extended attributes file.
** Checking volume bitmap.
** Checking volume information.
Verify Status: VIStat = 0x0000, ABTStat = 0x0000 EBTStat = 0x0000
CBTStat = 0x8000 CatStat = 0x00000000
** Repairing volume.
** Rechecking volume.
** Checking non-journaled HFS Plus Volume.
The volume name is untitled
** Checking extents overflow file.
** Checking catalog file.
** Checking multi-linked files.
CheckHardLinks: found 1 pre-Leopard file inodes.
** Checking catalog hierarchy.
** Checking extended attributes file.
** Checking volume bitmap.
** Checking volume information.
** The volume untitled was repaired successfully.

The generic/498 test executes the following steps in its final phase:

mkdir $SCRATCH_MNT/A
mkdir $SCRATCH_MNT/B
mkdir $SCRATCH_MNT/A/C
touch $SCRATCH_MNT/B/foo
$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/B/foo

ln $SCRATCH_MNT/B/foo $SCRATCH_MNT/A/C/foo
$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/A

"Simulate a power failure and mount the filesystem
to check that what we explicitly fsync'ed exists."

_flakey_drop_and_remount

The FSCK tool complains about "Invalid leaf record count".
The HFS+ b-tree header contains a leaf_count field that is
updated by hfs_brec_insert() and hfs_brec_remove(). The
hfs_brec_insert() method is involved in the hard link creation
process. The modified in-core leaf_count field is stored into
the HFS+ b-tree header by the hfs_btree_write() method.
Unfortunately, hfs_btree_write() is not called by
hfsplus_cat_write_inode(), so hfsplus_file_fsync() stores
a not fully consistent state of the Catalog File's b-tree.

This patch adds a call to the hfs_btree_write() method in
hfsplus_cat_write_inode() with the goal of storing
a consistent state of the Catalog File's b-tree.
As a result, the FSCK tool no longer complains.
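
The mismatch that FSCK reports can be modeled as two counters that
reach the disk at different times. The sketch below is a simplified
userspace model under stated assumptions: the demo_* structures and
functions are hypothetical, and they only mimic the split between
in-core state and what has been written out.

```c
#include <assert.h>

/* Hypothetical model: what has reached the disk so far. */
struct demo_disk {
	unsigned int leaf_records;	/* records stored in leaf nodes */
	unsigned int header_leaf_count;	/* count recorded in the header */
};

/* Hypothetical model: in-core b-tree state plus its backing disk. */
struct demo_btree {
	unsigned int leaf_records;
	unsigned int leaf_count;
	struct demo_disk disk;
};

/* Inserting a record updates the in-core state only. */
static void demo_insert(struct demo_btree *tree)
{
	tree->leaf_records++;
	tree->leaf_count++;
}

/* Counterpart of writing the header: persist the in-core count. */
static void demo_header_write(struct demo_btree *tree)
{
	tree->disk.header_leaf_count = tree->leaf_count;
}

/* Flush the records; write the header too only if asked. */
static void demo_fsync(struct demo_btree *tree, int write_header)
{
	assert(tree);
	tree->disk.leaf_records = tree->leaf_records;
	if (write_header)
		demo_header_write(tree);
}

/* What a checker sees after a crash: only the on-disk state. */
static int demo_fsck_ok(const struct demo_disk *disk)
{
	return disk->header_leaf_count == disk->leaf_records;
}
```

After two inserts, demo_fsync() without the header write leaves the
on-disk header count at 0 while two records are stored, so
demo_fsck_ok() fails. With the header write included, both values
agree.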

sudo ./check generic/498
FSTYP         -- hfsplus
PLATFORM      -- Linux/x86_64 hfsplus-testing-0001 6.18.0-rc1+ #22 SMP PREEMPT_DYNAMIC Sat Dec  6 17:01:31 PST 2025
MKFS_OPTIONS  -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/498 33s ...  31s
Ran: generic/498
Passed all 1 tests

Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20251207035821.3863657-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-12-15 15:19:55 -08:00
Viacheslav Dubeyko
bea4429eb3 hfsplus: fix volume corruption issue for generic/480
The xfstests' test-case generic/480 leaves the HFS+ volume
in a corrupted state:

sudo ./check generic/480
FSTYP -- hfsplus
PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.17.0-rc1+ #4 SMP PREEMPT_DYNAMIC Wed Oct 1 15:02:44 PDT 2025
MKFS_OPTIONS -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/480 _check_generic_filesystem: filesystem on /dev/loop51 is inconsistent
(see XFSTESTS-2/xfstests-dev/results//generic/480.full for details)

Ran: generic/480
Failures: generic/480
Failed 1 of 1 tests

sudo fsck.hfsplus -d /dev/loop51
** /dev/loop51
Using cacheBlockSize=32K cacheTotalBlock=1024 cacheSize=32768K.
Executing fsck_hfs (version 540.1-Linux).
** Checking non-journaled HFS Plus Volume.
The volume name is untitled
** Checking extents overflow file.
** Checking catalog file.
** Checking multi-linked files.
CheckHardLinks: found 1 pre-Leopard file inodes.
Incorrect number of file hard links
** Checking catalog hierarchy.
** Checking extended attributes file.
** Checking volume bitmap.
** Checking volume information.
invalid VHB nextCatalogID
Volume header needs minor repair
(2, 0)
Verify Status: VIStat = 0x8000, ABTStat = 0x0000 EBTStat = 0x0000
CBTStat = 0x0000 CatStat = 0x00000002
** Repairing volume.
Incorrect flags for file hard link (id = 19)
(It should be 0x22 instead of 0x2)
Incorrect flags for file inode (id = 18)
(It should be 0x22 instead of 0x2)
first link ID=0 is < 16 for fileinode=18
Error getting first link ID for inode = 18 (result=2)
Invalid first link in hard link chain (id = 18)
(It should be 19 instead of 0)
Indirect node 18 needs link count adjustment
(It should be 1 instead of 2)
** Rechecking volume.
** Checking non-journaled HFS Plus Volume.
The volume name is untitled
** Checking extents overflow file.
** Checking catalog file.
** Checking multi-linked files.
** Checking catalog hierarchy.
** Checking extended attributes file.
** Checking volume bitmap.
** Checking volume information.
** The volume untitled was repaired successfully.

The generic/480 test executes the following steps in its final phase:

"Now remove one of the links of our file and create
a new file with the same name and in the same
parent directory, and finally fsync this new file."

unlink $SCRATCH_MNT/testdir/bar
touch $SCRATCH_MNT/testdir/bar
$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/testdir/bar

"Simulate a power failure and mount the filesystem
to check that replaying the fsync log/journal
succeeds, that is the mount operation does not fail."

_flakey_drop_and_remount

The key issue in the HFS+ logic is that the hfsplus_link(),
hfsplus_unlink(), hfsplus_rmdir(), hfsplus_symlink(),
and hfsplus_mknod() methods don't call
hfsplus_cat_write_inode() for modified inode objects.
As a result, even though hfsplus_file_fsync() tries to
flush the dirty Catalog File, not all modified inodes
save their new state into the Catalog File's records,
because hfsplus_cat_write_inode() is not called for them.
Finally, the simulated power failure results in an inconsistent
state of the Catalog File, and the FSCK tool reports
volume corruption.

This patch adds a call to the hfsplus_cat_write_inode()
method for modified inodes in the hfsplus_link(),
hfsplus_unlink(), hfsplus_rmdir(), hfsplus_symlink(),
and hfsplus_mknod() methods. Also, it adds debug output
in several methods.

sudo ./check generic/480
FSTYP         -- hfsplus
PLATFORM      -- Linux/x86_64 hfsplus-testing-0001 6.18.0-rc1+ #18 SMP PREEMPT_DYNAMIC Thu Dec  4 12:24:45 PST 2025
MKFS_OPTIONS  -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/480 16s ...  16s
Ran: generic/480
Passed all 1 tests

Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20251205000054.3670326-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-12-15 15:19:19 -08:00
Mehdi Ben Hadj Khelifa
126fb0ce99 hfsplus: ensure sb->s_fs_info is always cleaned up
When hfsplus was converted to the new mount api a bug was introduced by
changing the allocation pattern of sb->s_fs_info. If setup_bdev_super()
fails after a new superblock has been allocated by sget_fc(), but before
hfsplus_fill_super() takes ownership of the filesystem-specific s_fs_info
data it was leaked.

Fix this by freeing sb->s_fs_info in hfsplus_kill_super().

Cc: stable@vger.kernel.org
Fixes: 432f7c78cb ("hfsplus: convert hfsplus to use the new mount api")
Reported-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Tested-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Mehdi Ben Hadj Khelifa <mehdi.benhadjkhelifa@gmail.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Link: https://lore.kernel.org/r/20251201222843.82310-3-mehdi.benhadjkhelifa@gmail.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-12-15 15:18:26 -08:00
Linus Torvalds
ca010e2ef6 hfs/hfsplus updates for v6.19
- hfs/hfsplus: move on-disk layout declarations into hfs_common.h
 - hfsplus: fix volume corruption issue for generic/101
 - hfsplus: introduce KUnit tests for HFS+ string operations
 - hfs: introduce KUnit tests for HFS string operations
 - hfsplus: fix volume corruption issue for generic/073
 - hfsplus: Verify inode mode when loading from disk
 - hfsplus: fix volume corruption issue for generic/070
 - hfs/hfsplus: prevent getting negative values of offset/length
 - hfsplus: fix missing hfs_bnode_get() in __hfs_bnode_create
 - hfs: fix potential use after free in hfs_correct_next_unused_CNID()

Merge tag 'hfs-v6.19-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs

Pull hfs/hfsplus updates from Viacheslav Dubeyko:
 "Several fixes for syzbot reported issues, HFS/HFS+ fixes of xfstests
  failures, Kunit-based unit-tests introduction, and code cleanup:

   - Dan Carpenter fixed a potential use-after-free issue in the
     hfs_correct_next_unused_CNID() method. Tetsuo Handa made a nice
     fix for a syzbot-reported issue related to incorrect
     inode->i_mode management when the volume has been corrupted
     somehow. Yang Chenzhi made a really good fix for a potential
     race condition in the __hfs_bnode_create() method for the HFS+
     file system.

   - Several fixes to xfstests failures. In particular, the
     generic/070, generic/073, and generic/101 test-cases now finish
     successfully for the HFS+ file system.

   - HFS and HFS+ drivers share multiple structures of on-disk layout
     declarations. Some structures are used without any change. However,
     we had two independent declarations of the same structures in HFS
     and HFS+ drivers.

     The on-disk layout declarations have been moved into
     include/linux/hfs_common.h with the goal of eliminating the
     duplicated declarations and keeping the HFS/HFS+ on-disk layout
     declarations in one place.

     Also, this patch prepares the basis for creating a hfslib that can
     aggregate common functionality without the need to duplicate the
     same code in the HFS and HFS+ drivers.

   - HFS/HFS+ really need unit-tests because of multiple xfstests
     failures. The first two patches introduce KUnit-based unit-tests
     for string operations in the HFS/HFS+ file system drivers"

* tag 'hfs-v6.19-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs:
  hfs/hfsplus: move on-disk layout declarations into hfs_common.h
  hfsplus: fix volume corruption issue for generic/101
  hfsplus: introduce KUnit tests for HFS+ string operations
  hfs: introduce KUnit tests for HFS string operations
  hfsplus: fix volume corruption issue for generic/073
  hfsplus: Verify inode mode when loading from disk
  hfsplus: fix volume corruption issue for generic/070
  hfs/hfsplus: prevent getting negative values of offset/length
  hfsplus: fix missing hfs_bnode_get() in __hfs_bnode_create
  hfs: fix potential use after free in hfs_correct_next_unused_CNID()
2025-12-03 20:08:32 -08:00
Linus Torvalds
afdf0fb340 vfs-6.19-rc1.fs_header
Merge tag 'vfs-6.19-rc1.fs_header' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull fs header updates from Christian Brauner:
 "This contains initial work to start splitting up fs.h.

  Begin the long-overdue work of splitting up the monolithic fs.h
  header. The header has grown to over 3000 lines and includes types and
  functions for many different subsystems, making it difficult to
  navigate and causing excessive compilation dependencies.

  This series introduces new focused headers for superblock-related
  code:

   - Rename fs_types.h to fs_dirent.h to better reflect its actual
     content (directory entry types)

   - Add fs/super_types.h containing superblock type definitions

   - Add fs/super.h containing superblock function declarations

  This is the first step in a longer effort to modularize the VFS
  headers.

  Cleanups:

   - Inode Field Layout Optimization (Mateusz Guzik)

     Move inode fields used during fast path lookup closer together to
     improve cache locality during path resolution.

   - current_umask() Optimization (Mateusz Guzik)

     Inline current_umask() and move it to fs_struct.h. This improves
     performance by avoiding function call overhead for this
     frequently-used function, and places it in a more appropriate
     header since it operates on fs_struct"

* tag 'vfs-6.19-rc1.fs_header' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: move inode fields used during fast path lookup closer together
  fs: inline current_umask() and move it to fs_struct.h
  fs: add fs/super.h header
  fs: add fs/super_types.h header
  fs: rename fs_types.h to fs_dirent.h
2025-12-01 14:18:01 -08:00
Viacheslav Dubeyko
ec95cd103c hfs/hfsplus: move on-disk layout declarations into hfs_common.h
Currently, HFS declares its on-disk layout metadata structures
in fs/hfs/hfs.h and HFS+ declares them in fs/hfsplus/hfsplus_raw.h.
However, the HFS and HFS+ on-disk layouts are similar and
overlap in their declarations. As a result, fs/hfs/hfs.h and
fs/hfsplus/hfsplus_raw.h contain multiple duplicated declarations.
Moreover, both the HFS and HFS+ drivers implement very similar
functionality in multiple places.

This patch moves the on-disk layout declarations from
fs/hfs/hfs.h and fs/hfsplus/hfsplus_raw.h into
include/linux/hfs_common.h with the goal of eliminating
the duplicated declarations. Also, this patch prepares
the basis for creating a hfslib that can aggregate common
functionality without the need to duplicate the same code
in the HFS and HFS+ drivers.

Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-11-25 15:16:03 -08:00
Viacheslav Dubeyko
3f04ee216b hfsplus: fix volume corruption issue for generic/101
The xfstests' test-case generic/101 leaves the HFS+ volume
in a corrupted state:

sudo ./check generic/101
FSTYP -- hfsplus
PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.17.0-rc1+ #4 SMP PREEMPT_DYNAMIC Wed Oct 1 15:02:44 PDT 2025
MKFS_OPTIONS -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/101 _check_generic_filesystem: filesystem on /dev/loop51 is inconsistent
(see XFSTESTS-2/xfstests-dev/results//generic/101.full for details)

Ran: generic/101
Failures: generic/101
Failed 1 of 1 tests

sudo fsck.hfsplus -d /dev/loop51
** /dev/loop51
Using cacheBlockSize=32K cacheTotalBlock=1024 cacheSize=32768K.
Executing fsck_hfs (version 540.1-Linux).
** Checking non-journaled HFS Plus Volume.
The volume name is untitled
** Checking extents overflow file.
** Checking catalog file.
** Checking multi-linked files.
** Checking catalog hierarchy.
** Checking extended attributes file.
** Checking volume bitmap.
** Checking volume information.
Invalid volume free block count
(It should be 2614350 instead of 2614382)
Verify Status: VIStat = 0x8000, ABTStat = 0x0000 EBTStat = 0x0000
CBTStat = 0x0000 CatStat = 0x00000000
** Repairing volume.
** Rechecking volume.
** Checking non-journaled HFS Plus Volume.
The volume name is untitled
** Checking extents overflow file.
** Checking catalog file.
** Checking multi-linked files.
** Checking catalog hierarchy.
** Checking extended attributes file.
** Checking volume bitmap.
** Checking volume information.
** The volume untitled was repaired successfully.

The test is described as follows: "Test that if we truncate a file
to a smaller size, then truncate it to its original size or
a larger size, then fsyncing it and a power failure happens,
the file will have the range [first_truncate_size, last_size[ with
all bytes having a value of 0x00 if we read it the next time
the filesystem is mounted.".

HFS+ keeps the volume's free block count in the superblock.
However, hfsplus_file_fsync() doesn't store the superblock's
content. As a result, the superblock contains an incorrect
free block count if a power failure happens.

This patch adds saving of the superblock's content
during the hfsplus_file_fsync() call.

sudo ./check generic/101
FSTYP         -- hfsplus
PLATFORM      -- Linux/x86_64 hfsplus-testing-0001 6.18.0-rc3+ #96 SMP PREEMPT_DYNAMIC Wed Nov 19 12:47:37 PST 2025
MKFS_OPTIONS  -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/101 32s ...  30s
Ran: generic/101
Passed all 1 tests

sudo fsck.hfsplus -d /dev/loop51
** /dev/loop51
	Using cacheBlockSize=32K cacheTotalBlock=1024 cacheSize=32768K.
   Executing fsck_hfs (version 540.1-Linux).
** Checking non-journaled HFS Plus Volume.
   The volume name is untitled
** Checking extents overflow file.
** Checking catalog file.
** Checking multi-linked files.
** Checking catalog hierarchy.
** Checking extended attributes file.
** Checking volume bitmap.
** Checking volume information.
** The volume untitled appears to be OK.

Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20251119223219.1824434-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-11-25 11:35:13 -08:00
Viacheslav Dubeyko
6f84ceb985 hfsplus: introduce KUnit tests for HFS+ string operations
This patch implements a KUnit-based set of
unit tests for HFS+ string operations. It checks
the functionality of hfsplus_strcasecmp(), hfsplus_strcmp(),
hfsplus_uni2asc(), hfsplus_asc2uni(), hfsplus_hash_dentry(),
and hfsplus_compare_dentry().

./tools/testing/kunit/kunit.py run --kunitconfig ./fs/hfsplus/.kunitconfig
[14:38:05] Configuring KUnit Kernel ...
[14:38:05] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=22
[14:38:09] Starting KUnit Kernel (1/1)...
[14:38:09] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[14:38:09] ============== hfsplus_unicode (27 subtests) ===============
[14:38:09] [PASSED] hfsplus_strcasecmp_test
[14:38:09] [PASSED] hfsplus_strcmp_test
[14:38:09] [PASSED] hfsplus_unicode_edge_cases_test
[14:38:09] [PASSED] hfsplus_unicode_boundary_test
[14:38:09] [PASSED] hfsplus_uni2asc_basic_test
[14:38:09] [PASSED] hfsplus_uni2asc_special_chars_test
[14:38:09] [PASSED] hfsplus_uni2asc_buffer_test
[14:38:09] [PASSED] hfsplus_uni2asc_corrupted_test
[14:38:09] [PASSED] hfsplus_uni2asc_edge_cases_test
[14:38:09] [PASSED] hfsplus_asc2uni_basic_test
[14:38:09] [PASSED] hfsplus_asc2uni_special_chars_test
[14:38:09] [PASSED] hfsplus_asc2uni_buffer_limits_test
[14:38:09] [PASSED] hfsplus_asc2uni_edge_cases_test
[14:38:09] [PASSED] hfsplus_asc2uni_decompose_test
[14:38:09] [PASSED] hfsplus_hash_dentry_basic_test
[14:38:09] [PASSED] hfsplus_hash_dentry_casefold_test
[14:38:09] [PASSED] hfsplus_hash_dentry_special_chars_test
[14:38:09] [PASSED] hfsplus_hash_dentry_decompose_test
[14:38:09] [PASSED] hfsplus_hash_dentry_consistency_test
[14:38:09] [PASSED] hfsplus_hash_dentry_edge_cases_test
[14:38:09] [PASSED] hfsplus_compare_dentry_basic_test
[14:38:09] [PASSED] hfsplus_compare_dentry_casefold_test
[14:38:09] [PASSED] hfsplus_compare_dentry_special_chars_test
[14:38:09] [PASSED] hfsplus_compare_dentry_length_test
[14:38:09] [PASSED] hfsplus_compare_dentry_decompose_test
[14:38:09] [PASSED] hfsplus_compare_dentry_edge_cases_test
[14:38:09] [PASSED] hfsplus_compare_dentry_combined_flags_test
[14:38:09] ================= [PASSED] hfsplus_unicode =================
[14:38:09] ============================================================
[14:38:09] Testing complete. Ran 27 tests: passed: 27
[14:38:09] Elapsed time: 3.875s total, 0.001s configuring, 3.707s building, 0.115s running

v2
Rework memory management model.

Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-11-24 16:12:51 -08:00
Viacheslav Dubeyko
24e17a29cf hfsplus: fix volume corruption issue for generic/073
The xfstests test case generic/073 leaves the HFS+ volume
in a corrupted state:

sudo ./check generic/073
FSTYP -- hfsplus
PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.17.0-rc1+ #4 SMP PREEMPT_DYNAMIC Wed Oct 1 15:02:44 PDT 2025
MKFS_OPTIONS -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/073 _check_generic_filesystem: filesystem on /dev/loop51 is inconsistent
(see XFSTESTS-2/xfstests-dev/results//generic/073.full for details)

Ran: generic/073
Failures: generic/073
Failed 1 of 1 tests

sudo fsck.hfsplus -d /dev/loop51
** /dev/loop51
Using cacheBlockSize=32K cacheTotalBlock=1024 cacheSize=32768K.
Executing fsck_hfs (version 540.1-Linux).
** Checking non-journaled HFS Plus Volume.
The volume name is untitled
** Checking extents overflow file.
** Checking catalog file.
** Checking multi-linked files.
** Checking catalog hierarchy.
Invalid directory item count
(It should be 1 instead of 0)
** Checking extended attributes file.
** Checking volume bitmap.
** Checking volume information.
Verify Status: VIStat = 0x0000, ABTStat = 0x0000 EBTStat = 0x0000
CBTStat = 0x0000 CatStat = 0x00004000
** Repairing volume.
** Rechecking volume.
** Checking non-journaled HFS Plus Volume.
The volume name is untitled
** Checking extents overflow file.
** Checking catalog file.
** Checking multi-linked files.
** Checking catalog hierarchy.
** Checking extended attributes file.
** Checking volume bitmap.
** Checking volume information.
** The volume untitled was repaired successfully.

The test performs these steps in its final phase:

mv $SCRATCH_MNT/testdir_1/bar $SCRATCH_MNT/testdir_2/bar
$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/testdir_1
$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo

So, we move file bar from testdir_1 into testdir_2. It means that the HFS+
logic decrements the number of entries in testdir_1 and increments the number
of entries in testdir_2. Finally, we do fsync only for testdir_1 and foo but
not for testdir_2. This is why fsck.hfsplus detects the volume corruption
afterwards.

This patch fixes the issue by adding
hfsplus_cat_write_inode() calls for old_dir and new_dir in
hfsplus_rename() after hfsplus_rename_cat() completes
successfully. hfsplus_rename_cat() modifies the in-core
inode objects for old_dir and new_dir, but it doesn't save these
modifications in the Catalog File's entries. It was expected that
hfsplus_write_inode() would save these modifications afterwards.
However, because generic/073 does fsync only for testdir_1 and foo,
the testdir_2 modification hasn't been saved into the Catalog File's
entry, and the Catalog File was flushed without this modification.
This was detected by fsck.hfsplus. Now, hfsplus_rename() stores all
modified entries in the Catalog File, and the correct state of the
Catalog File will be flushed during the hfsplus_file_fsync() call.
Finally, it makes fsck.hfsplus happy.
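
The ordering problem described above can be modelled in user space.
The sketch below is not the kernel code: struct demo_dir and the
demo_*() functions are hypothetical names, and the "Catalog File
record" is just a second counter. It only illustrates why writing
both directories inside rename keeps the record of the directory
that is never fsync'ed consistent.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a directory has an in-core entry count and a
 * copy of that count stored in its Catalog File record. */
struct demo_dir {
	unsigned int incore_count;	/* updated immediately by rename */
	unsigned int catalog_count;	/* what fsck would read later */
};

/* Stand-in for writing one directory's record into the Catalog File. */
void demo_write_dir(struct demo_dir *dir)
{
	dir->catalog_count = dir->incore_count;
}

/*
 * Move one entry from old_dir to new_dir.
 * write_both == false models the old behaviour (in-core counts only),
 * write_both == true models the fix (both records are written).
 */
void demo_rename(struct demo_dir *old_dir, struct demo_dir *new_dir,
		 bool write_both)
{
	assert(old_dir->incore_count > 0);
	old_dir->incore_count--;
	new_dir->incore_count++;
	if (write_both) {
		demo_write_dir(old_dir);
		demo_write_dir(new_dir);
	}
}

/* fsck-style check: the record must match the real entry count. */
bool demo_dir_consistent(const struct demo_dir *dir)
{
	return dir->catalog_count == dir->incore_count;
}
```

In this model, calling demo_rename() with write_both == false and
then demo_write_dir() only on the source directory (the fsync of
testdir_1) leaves the destination record at 0 while it holds 1 entry,
which corresponds to the "It should be 1 instead of 0" report.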

sudo ./check generic/073
FSTYP         -- hfsplus
PLATFORM      -- Linux/x86_64 hfsplus-testing-0001 6.18.0-rc3+ #93 SMP PREEMPT_DYNAMIC Wed Nov 12 14:37:49 PST 2025
MKFS_OPTIONS  -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/073 32s ...  32s
Ran: generic/073
Passed all 1 tests

Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20251112232522.814038-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-11-18 16:02:35 -08:00
Tetsuo Handa
005d4b0d33 hfsplus: Verify inode mode when loading from disk
syzbot is reporting that S_IFMT bits of inode->i_mode can become bogus when
the S_IFMT bits of the 16bits "mode" field loaded from disk are corrupted.

According to [1], the permissions field was treated as reserved in Mac OS
8 and 9. According to [2], the reserved field was explicitly initialized
with 0, and that field must remain 0 as long as reserved. Therefore, when
the "mode" field is not 0 (i.e. no longer reserved), the file must be
S_IFDIR if dir == 1, and the file must be one of S_IFREG/S_IFLNK/S_IFCHR/
S_IFBLK/S_IFIFO/S_IFSOCK if dir == 0.
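
The rule stated above can be written down as a small predicate. This
is only a sketch of the rule as worded in this message, not the code
added by the patch: demo_mode_is_valid() is a hypothetical name, and
the DEMO_S_IF* constants are local copies of the usual Linux octal
file-type values so that the example does not depend on
<sys/stat.h> feature macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* File-type bits with their usual Linux values (octal). */
#define DEMO_S_IFMT	0170000
#define DEMO_S_IFSOCK	0140000
#define DEMO_S_IFLNK	0120000
#define DEMO_S_IFREG	0100000
#define DEMO_S_IFBLK	0060000
#define DEMO_S_IFDIR	0040000
#define DEMO_S_IFCHR	0020000
#define DEMO_S_IFIFO	0010000

/*
 * mode: 16-bit "mode" field as loaded from disk (CPU byte order).
 * dir:  true if the catalog record describes a folder.
 */
bool demo_mode_is_valid(uint16_t mode, bool dir)
{
	if (mode == 0)		/* field is still reserved */
		return true;

	switch (mode & DEMO_S_IFMT) {
	case DEMO_S_IFDIR:
		return dir;
	case DEMO_S_IFREG:
	case DEMO_S_IFLNK:
	case DEMO_S_IFCHR:
	case DEMO_S_IFBLK:
	case DEMO_S_IFIFO:
	case DEMO_S_IFSOCK:
		return !dir;
	default:		/* corrupted or missing S_IFMT bits */
		return false;
	}
}
```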

Reported-by: syzbot <syzbot+895c23f6917da440ed0d@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d
Link: https://developer.apple.com/library/archive/technotes/tn/tn1150.html#HFSPlusPermissions [1]
Link: https://developer.apple.com/library/archive/technotes/tn/tn1150.html#ReservedAndPadFields [2]
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Link: https://lore.kernel.org/r/04ded9f9-73fb-496c-bfa5-89c4f5d1d7bb@I-love.SAKURA.ne.jp
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-11-18 16:01:05 -08:00
Viacheslav Dubeyko
ed490f36f4 hfsplus: fix volume corruption issue for generic/070
The xfstests test case generic/070 leaves the HFS+ volume
in a corrupted state:

sudo ./check generic/070
FSTYP -- hfsplus
PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.17.0-rc1+ #4 SMP PREEMPT_DYNAMIC Wed Oct 1 15:02:44 PDT 2025
MKFS_OPTIONS -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/070 _check_generic_filesystem: filesystem on /dev/loop50 is inconsistent
(see xfstests-dev/results//generic/070.full for details)

Ran: generic/070
Failures: generic/070
Failed 1 of 1 tests

sudo fsck.hfsplus -d /dev/loop50
** /dev/loop50
Using cacheBlockSize=32K cacheTotalBlock=1024 cacheSize=32768K.
Executing fsck_hfs (version 540.1-Linux).
** Checking non-journaled HFS Plus Volume.
The volume name is test
** Checking extents overflow file.
Unused node is not erased (node = 1)
** Checking catalog file.
** Checking multi-linked files.
** Checking catalog hierarchy.
** Checking extended attributes file.
** Checking volume bitmap.
** Checking volume information.
Verify Status: VIStat = 0x0000, ABTStat = 0x0000 EBTStat = 0x0004
CBTStat = 0x0000 CatStat = 0x00000000
** Repairing volume.
** Rechecking volume.
** Checking non-journaled HFS Plus Volume.
The volume name is test
** Checking extents overflow file.
** Checking catalog file.
** Checking multi-linked files.
** Checking catalog hierarchy.
** Checking extended attributes file.
** Checking volume bitmap.
** Checking volume information.
** The volume test was repaired successfully.

It is possible to see that fsck.hfsplus detected an unused
node that was not erased in the extents overflow file.
The HFS+ logic has a special method that defines whether the node
should be erased:

bool hfs_bnode_need_zeroout(struct hfs_btree *tree)
{
	struct super_block *sb = tree->inode->i_sb;
	struct hfsplus_sb_info *sbi = HFSPLUS_SB(sb);
	const u32 volume_attr = be32_to_cpu(sbi->s_vhdr->attributes);

	return tree->cnid == HFSPLUS_CAT_CNID &&
		volume_attr & HFSPLUS_VOL_UNUSED_NODE_FIX;
}

However, it is possible to see that this method works
only for the catalog file. But debugging of the issue
has shown that the HFSPLUS_VOL_UNUSED_NODE_FIX attribute has been
requested for the extents overflow file too:

catalog file
kernel: hfsplus: node 4, num_recs 0, flags 0x10
kernel: hfsplus: tree->cnid 4, volume_attr 0x80000800

extents overflow file
kernel: hfsplus: node 1, num_recs 0, flags 0x10
kernel: hfsplus: tree->cnid 3, volume_attr 0x80000800

This patch modifies hfs_bnode_need_zeroout() to check
only volume_attr and not the b-tree ID, because node zeroing
can be requested for all HFS+ b-tree types.
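
The difference between the two rules can be shown with a user-space
sketch. The demo_* names are hypothetical and the constants are
assumptions taken from the debug output above: cnid 4 is the catalog
file, cnid 3 is the extents overflow file, and the unused-node-fix
attribute is assumed to be bit 31, which is set in
volume_attr 0x80000800.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, consistent with the debug output in this message. */
#define DEMO_EXT_CNID			3u
#define DEMO_CAT_CNID			4u
#define DEMO_VOL_UNUSED_NODE_FIX	(UINT32_C(1) << 31)

/* Old rule: only catalog file nodes could ever be zeroed. */
bool demo_need_zeroout_old(uint32_t cnid, uint32_t volume_attr)
{
	return cnid == DEMO_CAT_CNID &&
	       (volume_attr & DEMO_VOL_UNUSED_NODE_FIX) != 0;
}

/* New rule: the volume attribute alone decides, for every b-tree. */
bool demo_need_zeroout_new(uint32_t cnid, uint32_t volume_attr)
{
	(void)cnid;	/* the b-tree ID no longer matters */
	return (volume_attr & DEMO_VOL_UNUSED_NODE_FIX) != 0;
}
```

With volume_attr 0x80000800, the old rule returns false for cnid 3
and the freed node stays dirty, while the new rule returns true for
both trees.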

sudo ./check generic/070
FSTYP         -- hfsplus
PLATFORM      -- Linux/x86_64 hfsplus-testing-0001 6.18.0-rc3+ #79 SMP PREEMPT_DYNAMIC Fri Oct 31 16:07:42 PDT 2025
MKFS_OPTIONS  -- /dev/loop51
MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch

generic/070 33s ...  34s
Ran: generic/070
Passed all 1 tests

Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20251101001229.247432-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-11-13 15:04:52 -08:00
Viacheslav Dubeyko
00c14a09a7 hfs/hfsplus: prevent getting negative values of offset/length
syzbot reported a KASAN out-of-bounds issue in
hfs_bnode_move():

[   45.588165][ T9821] hfs: dst 14, src 65536, len -65536
[   45.588895][ T9821] ==================================================================
[   45.590114][ T9821] BUG: KASAN: out-of-bounds in hfs_bnode_move+0xfd/0x140
[   45.591127][ T9821] Read of size 18446744073709486080 at addr ffff888035935400 by task repro/9821
[   45.592207][ T9821]
[   45.592420][ T9821] CPU: 0 UID: 0 PID: 9821 Comm: repro Not tainted 6.16.0-rc7-dirty #42 PREEMPT(full)
[   45.592428][ T9821] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   45.592431][ T9821] Call Trace:
[   45.592434][ T9821]  <TASK>
[   45.592437][ T9821]  dump_stack_lvl+0x1c1/0x2a0
[   45.592446][ T9821]  ? __virt_addr_valid+0x1c8/0x5c0
[   45.592454][ T9821]  ? __pfx_dump_stack_lvl+0x10/0x10
[   45.592461][ T9821]  ? rcu_is_watching+0x15/0xb0
[   45.592469][ T9821]  ? lock_release+0x4b/0x3e0
[   45.592476][ T9821]  ? __virt_addr_valid+0x1c8/0x5c0
[   45.592483][ T9821]  ? __virt_addr_valid+0x4a5/0x5c0
[   45.592491][ T9821]  print_report+0x17e/0x7c0
[   45.592497][ T9821]  ? __virt_addr_valid+0x1c8/0x5c0
[   45.592504][ T9821]  ? __virt_addr_valid+0x4a5/0x5c0
[   45.592511][ T9821]  ? __phys_addr+0xd3/0x180
[   45.592519][ T9821]  ? hfs_bnode_move+0xfd/0x140
[   45.592526][ T9821]  kasan_report+0x147/0x180
[   45.592531][ T9821]  ? _printk+0xcf/0x120
[   45.592537][ T9821]  ? hfs_bnode_move+0xfd/0x140
[   45.592544][ T9821]  ? hfs_bnode_move+0xfd/0x140
[   45.592552][ T9821]  kasan_check_range+0x2b0/0x2c0
[   45.592557][ T9821]  ? hfs_bnode_move+0xfd/0x140
[   45.592565][ T9821]  __asan_memmove+0x29/0x70
[   45.592572][ T9821]  hfs_bnode_move+0xfd/0x140
[   45.592580][ T9821]  hfs_brec_remove+0x473/0x560
[   45.592589][ T9821]  hfs_cat_move+0x6fb/0x960
[   45.592598][ T9821]  ? __pfx_hfs_cat_move+0x10/0x10
[   45.592607][ T9821]  ? seqcount_lockdep_reader_access+0x122/0x1c0
[   45.592614][ T9821]  ? lockdep_hardirqs_on+0x9c/0x150
[   45.592631][ T9821]  ? __lock_acquire+0xaec/0xd80
[   45.592641][ T9821]  hfs_rename+0x1dc/0x2d0
[   45.592649][ T9821]  ? __pfx_hfs_rename+0x10/0x10
[   45.592657][ T9821]  vfs_rename+0xac6/0xed0
[   45.592664][ T9821]  ? __pfx_vfs_rename+0x10/0x10
[   45.592670][ T9821]  ? d_alloc+0x144/0x190
[   45.592677][ T9821]  ? bpf_lsm_path_rename+0x9/0x20
[   45.592683][ T9821]  ? security_path_rename+0x17d/0x490
[   45.592691][ T9821]  do_renameat2+0x890/0xc50
[   45.592699][ T9821]  ? __pfx_do_renameat2+0x10/0x10
[   45.592707][ T9821]  ? getname_flags+0x1e5/0x540
[   45.592714][ T9821]  __x64_sys_rename+0x82/0x90
[   45.592720][ T9821]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   45.592725][ T9821]  do_syscall_64+0xf3/0x3a0
[   45.592741][ T9821]  ? exc_page_fault+0x9f/0xf0
[   45.592748][ T9821]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   45.592754][ T9821] RIP: 0033:0x7f7f73fe3fc9
[   45.592760][ T9821] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 48
[   45.592765][ T9821] RSP: 002b:00007ffc7e116cf8 EFLAGS: 00000283 ORIG_RAX: 0000000000000052
[   45.592772][ T9821] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7f73fe3fc9
[   45.592776][ T9821] RDX: 0000200000000871 RSI: 0000200000000780 RDI: 00002000000003c0
[   45.592781][ T9821] RBP: 00007ffc7e116d00 R08: 0000000000000000 R09: 00007ffc7e116d30
[   45.592784][ T9821] R10: fffffffffffffff0 R11: 0000000000000283 R12: 00005557e81f8250
[   45.592788][ T9821] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   45.592795][ T9821]  </TASK>
[   45.592797][ T9821]
[   45.619721][ T9821] The buggy address belongs to the physical page:
[   45.620300][ T9821] page: refcount:1 mapcount:1 mapping:0000000000000000 index:0x559a88174 pfn:0x35935
[   45.621150][ T9821] memcg:ffff88810a1d5b00
[   45.621531][ T9821] anon flags: 0xfff60000020838(uptodate|dirty|lru|owner_2|swapbacked|node=0|zone=1|lastcpupid=0x7ff)
[   45.622496][ T9821] raw: 00fff60000020838 ffffea0000d64d88 ffff888021753e10 ffff888029da0771
[   45.623260][ T9821] raw: 0000000559a88174 0000000000000000 0000000100000000 ffff88810a1d5b00
[   45.624030][ T9821] page dumped because: kasan: bad access detected
[   45.624602][ T9821] page_owner tracks the page as allocated
[   45.625115][ T9821] page last allocated via order 0, migratetype Movable, gfp_mask 0x140dca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO0
[   45.626685][ T9821]  post_alloc_hook+0x240/0x2a0
[   45.627127][ T9821]  get_page_from_freelist+0x2101/0x21e0
[   45.627628][ T9821]  __alloc_frozen_pages_noprof+0x274/0x380
[   45.628154][ T9821]  alloc_pages_mpol+0x241/0x4b0
[   45.628593][ T9821]  vma_alloc_folio_noprof+0xe4/0x210
[   45.629066][ T9821]  folio_prealloc+0x30/0x180
[   45.629487][ T9821]  __handle_mm_fault+0x34bd/0x5640
[   45.629957][ T9821]  handle_mm_fault+0x40e/0x8e0
[   45.630392][ T9821]  do_user_addr_fault+0xa81/0x1390
[   45.630862][ T9821]  exc_page_fault+0x76/0xf0
[   45.631273][ T9821]  asm_exc_page_fault+0x26/0x30
[   45.631712][ T9821] page last free pid 5269 tgid 5269 stack trace:
[   45.632281][ T9821]  free_unref_folios+0xc73/0x14c0
[   45.632740][ T9821]  folios_put_refs+0x55b/0x640
[   45.633177][ T9821]  free_pages_and_swap_cache+0x26d/0x510
[   45.633685][ T9821]  tlb_flush_mmu+0x3a0/0x680
[   45.634105][ T9821]  tlb_finish_mmu+0xd4/0x200
[   45.634525][ T9821]  exit_mmap+0x44c/0xb70
[   45.634914][ T9821]  __mmput+0x118/0x420
[   45.635286][ T9821]  exit_mm+0x1da/0x2c0
[   45.635659][ T9821]  do_exit+0x652/0x2330
[   45.636039][ T9821]  do_group_exit+0x21c/0x2d0
[   45.636457][ T9821]  __x64_sys_exit_group+0x3f/0x40
[   45.636915][ T9821]  x64_sys_call+0x21ba/0x21c0
[   45.637342][ T9821]  do_syscall_64+0xf3/0x3a0
[   45.637756][ T9821]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   45.638290][ T9821] page has been migrated, last migrate reason: numa_misplaced
[   45.638956][ T9821]
[   45.639173][ T9821] Memory state around the buggy address:
[   45.639677][ T9821]  ffff888035935300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   45.640397][ T9821]  ffff888035935380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   45.641117][ T9821] >ffff888035935400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   45.641837][ T9821]                    ^
[   45.642207][ T9821]  ffff888035935480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   45.642929][ T9821]  ffff888035935500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   45.643650][ T9821] ==================================================================

Commit [1] fixes the issue when an offset inside the b-tree node
or the length of the request is bigger than the b-tree node. However,
that fix does not handle negative values
of the offset or length. Moreover, negative values of
the offset or length don't make sense for b-tree
operations, because we could try to access a memory address
before the beginning of the memory page's address range.
Also, using negative values makes the logic very complicated
and unpredictable, and we could access the wrong item(s)
in the b-tree node.

This patch changes the b-tree interface by converting
the signed integer arguments for offset and length to the u32 type.
The goal of this conversion is to prevent negative values from being
used unintentionally or by mistake in b-tree operations.
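
The size in the KASAN report follows directly from the "len -65536"
line printed before it. The sketch below reproduces that arithmetic
in user space; the demo_* names are hypothetical, and
demo_range_ok() is only an illustration of a bounds check on
unsigned arguments, not the check used in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * What a 64-bit memmove() receives when it is handed a negative int
 * length: the value is sign-extended and reinterpreted as unsigned.
 */
uint64_t demo_len_as_size(int len)
{
	return (uint64_t)(int64_t)len;
}

/*
 * Illustrative bounds check on unsigned arguments. A value that was
 * negative as an int becomes a very large u32 and is rejected here
 * instead of slipping past a signed comparison.
 */
bool demo_range_ok(uint32_t off, uint32_t len, uint32_t node_size)
{
	return off <= node_size && len <= node_size - off;
}
```

demo_len_as_size(-65536) evaluates to 18446744073709486080, which is
the "Read of size" value in the report above.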

[1] 'commit a431930c9b ("hfs: fix slab-out-of-bounds in hfs_bnode_read()")'

Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20251002200020.2578311-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-11-13 15:02:52 -08:00
Yang Chenzhi
152af11428 hfsplus: fix missing hfs_bnode_get() in __hfs_bnode_create
When sync() and link() are called concurrently, both threads may
enter hfs_bnode_find() without finding the node in the hash table
and proceed to create it.

Thread A:
  hfsplus_write_inode()
    -> hfsplus_write_system_inode()
      -> hfs_btree_write()
        -> hfs_bnode_find(tree, 0)
          -> __hfs_bnode_create(tree, 0)

Thread B:
  hfsplus_create_cat()
    -> hfs_brec_insert()
      -> hfs_bnode_split()
        -> hfs_bmap_alloc()
          -> hfs_bnode_find(tree, 0)
            -> __hfs_bnode_create(tree, 0)

In this case, thread A creates the bnode, sets refcnt=1, and hashes it.
Thread B also tries to create the same bnode, notices it has already
been inserted, drops its own instance, and uses the hashed one without
getting the node.

```

	node2 = hfs_bnode_findhash(tree, cnid);
	if (!node2) {                                 <- Thread A
		hash = hfs_bnode_hash(cnid);
		node->next_hash = tree->node_hash[hash];
		tree->node_hash[hash] = node;
		tree->node_hash_cnt++;
	} else {                                      <- Thread B
		spin_unlock(&tree->hash_lock);
		kfree(node);
		wait_event(node2->lock_wq,
			!test_bit(HFS_BNODE_NEW, &node2->flags));
		return node2;
	}
```

However, hfs_bnode_find() requires each call to take a reference.
Here both threads end up setting refcnt=1. When they later put the node,
this triggers:

BUG_ON(!atomic_read(&node->refcnt))

In this scenario, Thread B in fact finds the node in the hash table
rather than creating a new one, and thus must take a reference.

Fix this by calling hfs_bnode_get() when reusing a bnode newly created by
another thread to ensure the refcount is updated correctly.

A similar bug was fixed in HFS long ago in commit
a9dc087fd3 ("fix missing hfs_bnode_get() in __hfs_bnode_create")
but the same issue remained in HFS+ until now.
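
The refcount imbalance can be reproduced with a single-threaded
model in which the two "threads" simply call the create function one
after the other. Everything here is hypothetical (demo_* names, a
one-slot hash table, caller-owned node storage instead of
kmalloc/kfree); it only shows why the second caller has to take its
own reference.

```c
#include <assert.h>
#include <stddef.h>

struct demo_bnode {
	int refcnt;
};

/* One-slot stand-in for the b-tree node hash table. */
struct demo_tree {
	struct demo_bnode *hashed;
};

/*
 * Model of the create path. If no node is hashed yet, "fresh" is
 * inserted with refcnt = 1. Otherwise the hashed node is reused;
 * take_ref_on_reuse == 0 models the bug, 1 models the fix.
 */
struct demo_bnode *demo_bnode_create(struct demo_tree *tree,
				     struct demo_bnode *fresh,
				     int take_ref_on_reuse)
{
	assert(fresh != NULL);
	if (!tree->hashed) {
		fresh->refcnt = 1;
		tree->hashed = fresh;
		return fresh;
	}
	if (take_ref_on_reuse)
		tree->hashed->refcnt++;
	return tree->hashed;
}

/* Returns 0 on success, -1 where the kernel would hit the BUG_ON(). */
int demo_bnode_put(struct demo_bnode *node)
{
	if (node->refcnt == 0)
		return -1;
	node->refcnt--;
	return 0;
}
```

Without the extra reference, two creates followed by two puts make
the second put fail in this model; with it, both puts succeed.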

Reported-by: syzbot+005d2a9ecd9fbf525f6a@syzkaller.appspotmail.com
Signed-off-by: Yang Chenzhi <yang.chenzhi@vivo.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Link: https://lore.kernel.org/r/20250829093912.611853-1-yang.chenzhi@vivo.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-11-13 14:59:46 -08:00
Mateusz Guzik
5b8ed52866 fs: inline current_umask() and move it to fs_struct.h
There is no good reason to have this as a func call, other than avoiding
the churn of adding fs_struct.h as needed.

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://patch.msgid.link/20251104170448.630414-1-mjguzik@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-05 22:51:23 +01:00
Mateusz Guzik
b4dbfd8653 Coccinelle-based conversion to use ->i_state accessors
All places were patched by coccinelle with the default expecting that
->i_lock is held, afterwards entries got fixed up by hand to use
unlocked variants as needed.

The script:
@@
expression inode, flags;
@@

- inode->i_state & flags
+ inode_state_read(inode) & flags

@@
expression inode, flags;
@@

- inode->i_state &= ~flags
+ inode_state_clear(inode, flags)

@@
expression inode, flag1, flag2;
@@

- inode->i_state &= ~flag1 & ~flag2
+ inode_state_clear(inode, flag1 | flag2)

@@
expression inode, flags;
@@

- inode->i_state |= flags
+ inode_state_set(inode, flags)

@@
expression inode, flags;
@@

- inode->i_state = flags
+ inode_state_assign(inode, flags)

@@
expression inode, flags;
@@

- flags = inode->i_state
+ flags = inode_state_read(inode)

@@
expression inode, flags;
@@

- READ_ONCE(inode->i_state) & flags
+ inode_state_read(inode) & flags
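
The rewrite rules above map open-coded bit operations onto accessors
of roughly the following shape. This is a user-space model only: the
demo_* names and struct demo_inode are hypothetical, and the locking
expectations of the real helpers (->i_lock held by default, unlocked
variants where needed) are not modelled at all.

```c
#include <assert.h>

/* Minimal stand-in for the one field the rules touch. */
struct demo_inode {
	unsigned int i_state;
};

/* Replaces: inode->i_state & flags, flags = inode->i_state */
unsigned int demo_inode_state_read(const struct demo_inode *inode)
{
	return inode->i_state;
}

/* Replaces: inode->i_state |= flags */
void demo_inode_state_set(struct demo_inode *inode, unsigned int flags)
{
	inode->i_state |= flags;
}

/* Replaces: inode->i_state &= ~flags */
void demo_inode_state_clear(struct demo_inode *inode, unsigned int flags)
{
	inode->i_state &= ~flags;
}

/* Replaces: inode->i_state = flags */
void demo_inode_state_assign(struct demo_inode *inode, unsigned int flags)
{
	inode->i_state = flags;
}
```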

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-10-20 20:22:26 +02:00
Viacheslav Dubeyko
f32a26fab3 hfs/hfsplus: rework debug output subsystem
Currently, HFS/HFS+ has a very obsolete and inconvenient
debug output subsystem. Also, the code is duplicated
in the HFS and HFS+ drivers. This patch introduces
linux/hfs_common.h for gathering common declarations,
inline functions, and common short methods. Currently,
this file contains only the hfs_dbg() function, which
employs pr_debug() to print debug-level
messages conditionally.

So, now, it is possible to enable the debug output
by means of:

echo 'file extent.c +p' > /proc/dynamic_debug/control
echo 'func hfsplus_evict_inode +p' > /proc/dynamic_debug/control

And debug output looks like this:

hfs: pid 5831:fs/hfs/catalog.c:228 hfs_cat_delete(): delete_cat: 00,48
hfs: pid 5831:fs/hfs/extent.c:484 hfs_file_truncate(): truncate: 48, 409600 -> 0
hfs: pid 5831:fs/hfs/extent.c:212 hfs_dump_extent():
hfs: pid 5831:fs/hfs/extent.c:214 hfs_dump_extent():  78:4
hfs: pid 5831:fs/hfs/extent.c:214 hfs_dump_extent():  0:0
hfs: pid 5831:fs/hfs/extent.c:214 hfs_dump_extent():  0:0

v4
Debug messages have been reworked and information about
new HFS/HFS+ shared declarations file has been added
to MAINTAINERS file.

v5
Yangtao Li suggested to clean up debug output and
fix several typos.

Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
cc: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-09-24 16:30:34 -07:00
Viacheslav Dubeyko
42520df65b hfsplus: fix slab-out-of-bounds read in hfsplus_strcasecmp()
The hfsplus_strcasecmp() logic can trigger the issue:

[  117.317703][ T9855] ==================================================================
[  117.318353][ T9855] BUG: KASAN: slab-out-of-bounds in hfsplus_strcasecmp+0x1bc/0x490
[  117.318991][ T9855] Read of size 2 at addr ffff88802160f40c by task repro/9855
[  117.319577][ T9855]
[  117.319773][ T9855] CPU: 0 UID: 0 PID: 9855 Comm: repro Not tainted 6.17.0-rc6 #33 PREEMPT(full)
[  117.319780][ T9855] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  117.319783][ T9855] Call Trace:
[  117.319785][ T9855]  <TASK>
[  117.319788][ T9855]  dump_stack_lvl+0x1c1/0x2a0
[  117.319795][ T9855]  ? __virt_addr_valid+0x1c8/0x5c0
[  117.319803][ T9855]  ? __pfx_dump_stack_lvl+0x10/0x10
[  117.319808][ T9855]  ? rcu_is_watching+0x15/0xb0
[  117.319816][ T9855]  ? lock_release+0x4b/0x3e0
[  117.319821][ T9855]  ? __kasan_check_byte+0x12/0x40
[  117.319828][ T9855]  ? __virt_addr_valid+0x1c8/0x5c0
[  117.319835][ T9855]  ? __virt_addr_valid+0x4a5/0x5c0
[  117.319842][ T9855]  print_report+0x17e/0x7e0
[  117.319848][ T9855]  ? __virt_addr_valid+0x1c8/0x5c0
[  117.319855][ T9855]  ? __virt_addr_valid+0x4a5/0x5c0
[  117.319862][ T9855]  ? __phys_addr+0xd3/0x180
[  117.319869][ T9855]  ? hfsplus_strcasecmp+0x1bc/0x490
[  117.319876][ T9855]  kasan_report+0x147/0x180
[  117.319882][ T9855]  ? hfsplus_strcasecmp+0x1bc/0x490
[  117.319891][ T9855]  hfsplus_strcasecmp+0x1bc/0x490
[  117.319900][ T9855]  ? __pfx_hfsplus_cat_case_cmp_key+0x10/0x10
[  117.319906][ T9855]  hfs_find_rec_by_key+0xa9/0x1e0
[  117.319913][ T9855]  __hfsplus_brec_find+0x18e/0x470
[  117.319920][ T9855]  ? __pfx_hfsplus_bnode_find+0x10/0x10
[  117.319926][ T9855]  ? __pfx_hfs_find_rec_by_key+0x10/0x10
[  117.319933][ T9855]  ? __pfx___hfsplus_brec_find+0x10/0x10
[  117.319942][ T9855]  hfsplus_brec_find+0x28f/0x510
[  117.319949][ T9855]  ? __pfx_hfs_find_rec_by_key+0x10/0x10
[  117.319956][ T9855]  ? __pfx_hfsplus_brec_find+0x10/0x10
[  117.319963][ T9855]  ? __kmalloc_noprof+0x2a9/0x510
[  117.319969][ T9855]  ? hfsplus_find_init+0x8c/0x1d0
[  117.319976][ T9855]  hfsplus_brec_read+0x2b/0x120
[  117.319983][ T9855]  hfsplus_lookup+0x2aa/0x890
[  117.319990][ T9855]  ? __pfx_hfsplus_lookup+0x10/0x10
[  117.320003][ T9855]  ? d_alloc_parallel+0x2f0/0x15e0
[  117.320008][ T9855]  ? __lock_acquire+0xaec/0xd80
[  117.320013][ T9855]  ? __pfx_d_alloc_parallel+0x10/0x10
[  117.320019][ T9855]  ? __raw_spin_lock_init+0x45/0x100
[  117.320026][ T9855]  ? __init_waitqueue_head+0xa9/0x150
[  117.320034][ T9855]  __lookup_slow+0x297/0x3d0
[  117.320039][ T9855]  ? __pfx___lookup_slow+0x10/0x10
[  117.320045][ T9855]  ? down_read+0x1ad/0x2e0
[  117.320055][ T9855]  lookup_slow+0x53/0x70
[  117.320065][ T9855]  walk_component+0x2f0/0x430
[  117.320073][ T9855]  path_lookupat+0x169/0x440
[  117.320081][ T9855]  filename_lookup+0x212/0x590
[  117.320089][ T9855]  ? __pfx_filename_lookup+0x10/0x10
[  117.320098][ T9855]  ? strncpy_from_user+0x150/0x290
[  117.320105][ T9855]  ? getname_flags+0x1e5/0x540
[  117.320112][ T9855]  user_path_at+0x3a/0x60
[  117.320117][ T9855]  __x64_sys_umount+0xee/0x160
[  117.320123][ T9855]  ? __pfx___x64_sys_umount+0x10/0x10
[  117.320129][ T9855]  ? do_syscall_64+0xb7/0x3a0
[  117.320135][ T9855]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  117.320141][ T9855]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  117.320145][ T9855]  do_syscall_64+0xf3/0x3a0
[  117.320150][ T9855]  ? exc_page_fault+0x9f/0xf0
[  117.320154][ T9855]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  117.320158][ T9855] RIP: 0033:0x7f7dd7908b07
[  117.320163][ T9855] Code: 23 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 08
[  117.320167][ T9855] RSP: 002b:00007ffd5ebd9698 EFLAGS: 00000202 ORIG_RAX: 00000000000000a6
[  117.320172][ T9855] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7dd7908b07
[  117.320176][ T9855] RDX: 0000000000000009 RSI: 0000000000000009 RDI: 00007ffd5ebd9740
[  117.320179][ T9855] RBP: 00007ffd5ebda780 R08: 0000000000000005 R09: 00007ffd5ebd9530
[  117.320181][ T9855] R10: 00007f7dd799bfc0 R11: 0000000000000202 R12: 000055e2008b32d0
[  117.320184][ T9855] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  117.320189][ T9855]  </TASK>
[  117.320190][ T9855]
[  117.351311][ T9855] Allocated by task 9855:
[  117.351683][ T9855]  kasan_save_track+0x3e/0x80
[  117.352093][ T9855]  __kasan_kmalloc+0x8d/0xa0
[  117.352490][ T9855]  __kmalloc_noprof+0x288/0x510
[  117.352914][ T9855]  hfsplus_find_init+0x8c/0x1d0
[  117.353342][ T9855]  hfsplus_lookup+0x19c/0x890
[  117.353747][ T9855]  __lookup_slow+0x297/0x3d0
[  117.354148][ T9855]  lookup_slow+0x53/0x70
[  117.354514][ T9855]  walk_component+0x2f0/0x430
[  117.354921][ T9855]  path_lookupat+0x169/0x440
[  117.355325][ T9855]  filename_lookup+0x212/0x590
[  117.355740][ T9855]  user_path_at+0x3a/0x60
[  117.356115][ T9855]  __x64_sys_umount+0xee/0x160
[  117.356529][ T9855]  do_syscall_64+0xf3/0x3a0
[  117.356920][ T9855]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  117.357429][ T9855]
[  117.357636][ T9855] The buggy address belongs to the object at ffff88802160f000
[  117.357636][ T9855]  which belongs to the cache kmalloc-2k of size 2048
[  117.358827][ T9855] The buggy address is located 0 bytes to the right of
[  117.358827][ T9855]  allocated 1036-byte region [ffff88802160f000, ffff88802160f40c)
[  117.360061][ T9855]
[  117.360266][ T9855] The buggy address belongs to the physical page:
[  117.360813][ T9855] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x21608
[  117.361562][ T9855] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[  117.362285][ T9855] flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff)
[  117.362929][ T9855] page_type: f5(slab)
[  117.363282][ T9855] raw: 00fff00000000040 ffff88801a842f00 ffffea0000932000 dead000000000002
[  117.364015][ T9855] raw: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000
[  117.364750][ T9855] head: 00fff00000000040 ffff88801a842f00 ffffea0000932000 dead000000000002
[  117.365491][ T9855] head: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000
[  117.366232][ T9855] head: 00fff00000000003 ffffea0000858201 00000000ffffffff 00000000ffffffff
[  117.366968][ T9855] head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008
[  117.367711][ T9855] page dumped because: kasan: bad access detected
[  117.368259][ T9855] page_owner tracks the page as allocated
[  117.368745][ T9855] page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN1
[  117.370541][ T9855]  post_alloc_hook+0x240/0x2a0
[  117.370954][ T9855]  get_page_from_freelist+0x2101/0x21e0
[  117.371435][ T9855]  __alloc_frozen_pages_noprof+0x274/0x380
[  117.371935][ T9855]  alloc_pages_mpol+0x241/0x4b0
[  117.372360][ T9855]  allocate_slab+0x8d/0x380
[  117.372752][ T9855]  ___slab_alloc+0xbe3/0x1400
[  117.373159][ T9855]  __kmalloc_cache_noprof+0x296/0x3d0
[  117.373621][ T9855]  nexthop_net_init+0x75/0x100
[  117.374038][ T9855]  ops_init+0x35c/0x5c0
[  117.374400][ T9855]  setup_net+0x10c/0x320
[  117.374768][ T9855]  copy_net_ns+0x31b/0x4d0
[  117.375156][ T9855]  create_new_namespaces+0x3f3/0x720
[  117.375613][ T9855]  unshare_nsproxy_namespaces+0x11c/0x170
[  117.376094][ T9855]  ksys_unshare+0x4ca/0x8d0
[  117.376477][ T9855]  __x64_sys_unshare+0x38/0x50
[  117.376879][ T9855]  do_syscall_64+0xf3/0x3a0
[  117.377265][ T9855] page last free pid 9110 tgid 9110 stack trace:
[  117.377795][ T9855]  __free_frozen_pages+0xbeb/0xd50
[  117.378229][ T9855]  __put_partials+0x152/0x1a0
[  117.378625][ T9855]  put_cpu_partial+0x17c/0x250
[  117.379026][ T9855]  __slab_free+0x2d4/0x3c0
[  117.379404][ T9855]  qlist_free_all+0x97/0x140
[  117.379790][ T9855]  kasan_quarantine_reduce+0x148/0x160
[  117.380250][ T9855]  __kasan_slab_alloc+0x22/0x80
[  117.380662][ T9855]  __kmalloc_noprof+0x232/0x510
[  117.381074][ T9855]  tomoyo_supervisor+0xc0a/0x1360
[  117.381498][ T9855]  tomoyo_env_perm+0x149/0x1e0
[  117.381903][ T9855]  tomoyo_find_next_domain+0x15ad/0x1b90
[  117.382378][ T9855]  tomoyo_bprm_check_security+0x11c/0x180
[  117.382859][ T9855]  security_bprm_check+0x89/0x280
[  117.383289][ T9855]  bprm_execve+0x8f1/0x14a0
[  117.383673][ T9855]  do_execveat_common+0x528/0x6b0
[  117.384103][ T9855]  __x64_sys_execve+0x94/0xb0
[  117.384500][ T9855]
[  117.384706][ T9855] Memory state around the buggy address:
[  117.385179][ T9855]  ffff88802160f300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  117.385854][ T9855]  ffff88802160f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  117.386534][ T9855] >ffff88802160f400: 00 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  117.387204][ T9855]                       ^
[  117.387566][ T9855]  ffff88802160f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  117.388243][ T9855]  ffff88802160f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  117.388918][ T9855] ==================================================================

The issue occurs when the length field of struct hfsplus_unistr
is bigger than HFSPLUS_MAX_STRLEN. The patch checks the length
of each string being compared and, if a length is bigger than
HFSPLUS_MAX_STRLEN, corrects it to this value.

v2
The string length correction has been added for hfsplus_strcmp().
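
The length correction described above can be illustrated with a minimal userspace sketch. The struct, the MAX_STRLEN constant, and the function names are simplified, hypothetical stand-ins rather than the kernel's definitions, and the sketch ignores that the on-disk length is big-endian:

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_STRLEN 255 /* assumed stand-in for HFSPLUS_MAX_STRLEN */

/* Hypothetical, simplified stand-in for struct hfsplus_unistr. */
struct unistr {
	uint16_t length;              /* length read from disk, untrusted */
	uint16_t unicode[MAX_STRLEN]; /* fixed-size character buffer */
};

/* Correct an untrusted length to the capacity of the buffer. */
static size_t clamp_len(uint16_t len)
{
	return len > MAX_STRLEN ? MAX_STRLEN : len;
}

/* Compare unit by unit without ever reading past either buffer. */
static int unistr_cmp(const struct unistr *s1, const struct unistr *s2)
{
	size_t len1 = clamp_len(s1->length);
	size_t len2 = clamp_len(s2->length);
	size_t n = len1 < len2 ? len1 : len2;

	for (size_t i = 0; i < n; i++) {
		if (s1->unicode[i] != s2->unicode[i])
			return s1->unicode[i] < s2->unicode[i] ? -1 : 1;
	}
	/* All shared units are equal: the shorter string sorts first. */
	return len1 < len2 ? -1 : len1 > len2 ? 1 : 0;
}
```

With a corrupted length of 1000, the loop above still inspects at most 255 units of each buffer.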

Reported-by: Jiaming Zhang <r772577952@gmail.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
cc: syzkaller@googlegroups.com
Link: https://lore.kernel.org/r/20250919191243.1370388-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-09-22 15:11:33 -07:00
Kang Chen
bea3e1d446 hfsplus: fix slab-out-of-bounds read in hfsplus_uni2asc()
BUG: KASAN: slab-out-of-bounds in hfsplus_uni2asc+0xa71/0xb90 fs/hfsplus/unicode.c:186
Read of size 2 at addr ffff8880289ef218 by task syz.6.248/14290

CPU: 0 UID: 0 PID: 14290 Comm: syz.6.248 Not tainted 6.16.4 #1 PREEMPT(full)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1b0 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:378 [inline]
 print_report+0xca/0x5f0 mm/kasan/report.c:482
 kasan_report+0xca/0x100 mm/kasan/report.c:595
 hfsplus_uni2asc+0xa71/0xb90 fs/hfsplus/unicode.c:186
 hfsplus_listxattr+0x5b6/0xbd0 fs/hfsplus/xattr.c:738
 vfs_listxattr+0xbe/0x140 fs/xattr.c:493
 listxattr+0xee/0x190 fs/xattr.c:924
 filename_listxattr fs/xattr.c:958 [inline]
 path_listxattrat+0x143/0x360 fs/xattr.c:988
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcb/0x4c0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fe0e9fae16d
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fe0eae67f98 EFLAGS: 00000246 ORIG_RAX: 00000000000000c3
RAX: ffffffffffffffda RBX: 00007fe0ea205fa0 RCX: 00007fe0e9fae16d
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000200000000000
RBP: 00007fe0ea0480f0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fe0ea206038 R14: 00007fe0ea205fa0 R15: 00007fe0eae48000
 </TASK>

Allocated by task 14290:
 kasan_save_stack+0x24/0x50 mm/kasan/common.c:47
 kasan_save_track+0x14/0x30 mm/kasan/common.c:68
 poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
 __kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:394
 kasan_kmalloc include/linux/kasan.h:260 [inline]
 __do_kmalloc_node mm/slub.c:4333 [inline]
 __kmalloc_noprof+0x219/0x540 mm/slub.c:4345
 kmalloc_noprof include/linux/slab.h:909 [inline]
 hfsplus_find_init+0x95/0x1f0 fs/hfsplus/bfind.c:21
 hfsplus_listxattr+0x331/0xbd0 fs/hfsplus/xattr.c:697
 vfs_listxattr+0xbe/0x140 fs/xattr.c:493
 listxattr+0xee/0x190 fs/xattr.c:924
 filename_listxattr fs/xattr.c:958 [inline]
 path_listxattrat+0x143/0x360 fs/xattr.c:988
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcb/0x4c0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

When hfsplus_uni2asc() is called from hfsplus_listxattr(),
it is actually passed a struct hfsplus_attr_unistr*.
The size of that structure differs from the size of struct hfsplus_unistr,
so the previous fix (94458781ae) is insufficient:
the pointer into the unicode buffer can still go beyond the allocated memory.

This patch introduces two wrapper functions, hfsplus_uni2asc_xattr_str() and
hfsplus_uni2asc_str(), to process the two kinds of unicode buffer,
struct hfsplus_attr_unistr* and struct hfsplus_unistr* respectively.
When the ustrlen value is bigger than the allocated memory size,
it is limited to a safe size.
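
The wrapper arrangement can be sketched in userspace roughly as follows. This is not the kernel code: the struct layouts, the two capacity constants, the function names, and the trivial ASCII conversion are all assumptions made for illustration. The point is only that each wrapper passes the capacity of its own buffer type to a shared core routine:

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_STRLEN      255 /* assumed capacity of the catalog-name buffer */
#define ATTR_MAX_STRLEN 127 /* assumed capacity of the xattr-name buffer */

/* Hypothetical stand-ins for the two on-disk string types. */
struct unistr      { uint16_t length; uint16_t unicode[MAX_STRLEN]; };
struct attr_unistr { uint16_t length; uint16_t unicode[ATTR_MAX_STRLEN]; };

/* Core routine: the caller states how many units the buffer really holds. */
static size_t uni2asc(const uint16_t *units, size_t ustrlen, size_t max_len,
		      char *out, size_t out_size)
{
	size_t n = 0;

	if (ustrlen > max_len)
		ustrlen = max_len; /* limit an untrusted length to a safe size */
	while (n < ustrlen && n + 1 < out_size) {
		out[n] = units[n] < 0x80 ? (char)units[n] : '?';
		n++;
	}
	if (out_size)
		out[n] = '\0';
	return n;
}

/* One wrapper per buffer type, each supplying the matching capacity. */
static size_t uni2asc_str(const struct unistr *s, char *out, size_t out_size)
{
	return uni2asc(s->unicode, s->length, MAX_STRLEN, out, out_size);
}

static size_t uni2asc_xattr_str(const struct attr_unistr *s, char *out,
				size_t out_size)
{
	return uni2asc(s->unicode, s->length, ATTR_MAX_STRLEN, out, out_size);
}
```

Passing the capacity explicitly means a single shared limit can no longer be applied to the smaller structure by mistake.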

Fixes: 94458781ae ("hfsplus: fix slab-out-of-bounds read in hfsplus_uni2asc()")
Signed-off-by: Kang Chen <k.chen@smail.nju.edu.cn>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Link: https://lore.kernel.org/r/20250909031316.1647094-1-k.chen@smail.nju.edu.cn
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-09-09 11:44:38 -07:00
Viacheslav Dubeyko
9b3d15a758 hfsplus: fix KMSAN uninit-value issue in hfsplus_delete_cat()
syzbot reported an issue in hfsplus_delete_cat():

[   70.682285][ T9333] =====================================================
[   70.682943][ T9333] BUG: KMSAN: uninit-value in hfsplus_subfolders_dec+0x1d7/0x220
[   70.683640][ T9333]  hfsplus_subfolders_dec+0x1d7/0x220
[   70.684141][ T9333]  hfsplus_delete_cat+0x105d/0x12b0
[   70.684621][ T9333]  hfsplus_rmdir+0x13d/0x310
[   70.685048][ T9333]  vfs_rmdir+0x5ba/0x810
[   70.685447][ T9333]  do_rmdir+0x964/0xea0
[   70.685833][ T9333]  __x64_sys_rmdir+0x71/0xb0
[   70.686260][ T9333]  x64_sys_call+0xcd8/0x3cf0
[   70.686695][ T9333]  do_syscall_64+0xd9/0x1d0
[   70.687119][ T9333]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   70.687646][ T9333]
[   70.687856][ T9333] Uninit was stored to memory at:
[   70.688311][ T9333]  hfsplus_subfolders_inc+0x1c2/0x1d0
[   70.688779][ T9333]  hfsplus_create_cat+0x148e/0x1800
[   70.689231][ T9333]  hfsplus_mknod+0x27f/0x600
[   70.689730][ T9333]  hfsplus_mkdir+0x5a/0x70
[   70.690146][ T9333]  vfs_mkdir+0x483/0x7a0
[   70.690545][ T9333]  do_mkdirat+0x3f2/0xd30
[   70.690944][ T9333]  __x64_sys_mkdir+0x9a/0xf0
[   70.691380][ T9333]  x64_sys_call+0x2f89/0x3cf0
[   70.691816][ T9333]  do_syscall_64+0xd9/0x1d0
[   70.692229][ T9333]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   70.692773][ T9333]
[   70.692990][ T9333] Uninit was stored to memory at:
[   70.693469][ T9333]  hfsplus_subfolders_inc+0x1c2/0x1d0
[   70.693960][ T9333]  hfsplus_create_cat+0x148e/0x1800
[   70.694438][ T9333]  hfsplus_fill_super+0x21c1/0x2700
[   70.694911][ T9333]  mount_bdev+0x37b/0x530
[   70.695320][ T9333]  hfsplus_mount+0x4d/0x60
[   70.695729][ T9333]  legacy_get_tree+0x113/0x2c0
[   70.696167][ T9333]  vfs_get_tree+0xb3/0x5c0
[   70.696588][ T9333]  do_new_mount+0x73e/0x1630
[   70.697013][ T9333]  path_mount+0x6e3/0x1eb0
[   70.697425][ T9333]  __se_sys_mount+0x733/0x830
[   70.697857][ T9333]  __x64_sys_mount+0xe4/0x150
[   70.698269][ T9333]  x64_sys_call+0x2691/0x3cf0
[   70.698704][ T9333]  do_syscall_64+0xd9/0x1d0
[   70.699117][ T9333]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   70.699730][ T9333]
[   70.699946][ T9333] Uninit was created at:
[   70.700378][ T9333]  __alloc_pages_noprof+0x714/0xe60
[   70.700843][ T9333]  alloc_pages_mpol_noprof+0x2a2/0x9b0
[   70.701331][ T9333]  alloc_pages_noprof+0xf8/0x1f0
[   70.701774][ T9333]  allocate_slab+0x30e/0x1390
[   70.702194][ T9333]  ___slab_alloc+0x1049/0x33a0
[   70.702635][ T9333]  kmem_cache_alloc_lru_noprof+0x5ce/0xb20
[   70.703153][ T9333]  hfsplus_alloc_inode+0x5a/0xd0
[   70.703598][ T9333]  alloc_inode+0x82/0x490
[   70.703984][ T9333]  iget_locked+0x22e/0x1320
[   70.704428][ T9333]  hfsplus_iget+0x5c/0xba0
[   70.704827][ T9333]  hfsplus_btree_open+0x135/0x1dd0
[   70.705291][ T9333]  hfsplus_fill_super+0x1132/0x2700
[   70.705776][ T9333]  mount_bdev+0x37b/0x530
[   70.706171][ T9333]  hfsplus_mount+0x4d/0x60
[   70.706579][ T9333]  legacy_get_tree+0x113/0x2c0
[   70.707019][ T9333]  vfs_get_tree+0xb3/0x5c0
[   70.707444][ T9333]  do_new_mount+0x73e/0x1630
[   70.707865][ T9333]  path_mount+0x6e3/0x1eb0
[   70.708270][ T9333]  __se_sys_mount+0x733/0x830
[   70.708711][ T9333]  __x64_sys_mount+0xe4/0x150
[   70.709158][ T9333]  x64_sys_call+0x2691/0x3cf0
[   70.709630][ T9333]  do_syscall_64+0xd9/0x1d0
[   70.710053][ T9333]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   70.710611][ T9333]
[   70.710842][ T9333] CPU: 3 UID: 0 PID: 9333 Comm: repro Not tainted 6.12.0-rc6-dirty #17
[   70.711568][ T9333] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   70.712490][ T9333] =====================================================
[   70.713085][ T9333] Disabling lock debugging due to kernel taint
[   70.713618][ T9333] Kernel panic - not syncing: kmsan.panic set ...
[   70.714159][ T9333] CPU: 3 UID: 0 PID: 9333 Comm: repro Tainted: G    B              6.12.0-rc6-dirty #17
[   70.715007][ T9333] Tainted: [B]=BAD_PAGE
[   70.715365][ T9333] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   70.716311][ T9333] Call Trace:
[   70.716621][ T9333]  <TASK>
[   70.716899][ T9333]  dump_stack_lvl+0x1fd/0x2b0
[   70.717350][ T9333]  dump_stack+0x1e/0x30
[   70.717743][ T9333]  panic+0x502/0xca0
[   70.718116][ T9333]  ? kmsan_get_metadata+0x13e/0x1c0
[   70.718611][ T9333]  kmsan_report+0x296/0x2a0
[   70.719038][ T9333]  ? __msan_metadata_ptr_for_load_4+0x24/0x40
[   70.719859][ T9333]  ? __msan_warning+0x96/0x120
[   70.720345][ T9333]  ? hfsplus_subfolders_dec+0x1d7/0x220
[   70.720881][ T9333]  ? hfsplus_delete_cat+0x105d/0x12b0
[   70.721412][ T9333]  ? hfsplus_rmdir+0x13d/0x310
[   70.721880][ T9333]  ? vfs_rmdir+0x5ba/0x810
[   70.722458][ T9333]  ? do_rmdir+0x964/0xea0
[   70.722883][ T9333]  ? __x64_sys_rmdir+0x71/0xb0
[   70.723397][ T9333]  ? x64_sys_call+0xcd8/0x3cf0
[   70.723915][ T9333]  ? do_syscall_64+0xd9/0x1d0
[   70.724454][ T9333]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   70.725110][ T9333]  ? vprintk_emit+0xd1f/0xe60
[   70.725616][ T9333]  ? vprintk_default+0x3f/0x50
[   70.726175][ T9333]  ? vprintk+0xce/0xd0
[   70.726628][ T9333]  ? _printk+0x17e/0x1b0
[   70.727129][ T9333]  ? __msan_metadata_ptr_for_load_4+0x24/0x40
[   70.727739][ T9333]  ? kmsan_get_metadata+0x13e/0x1c0
[   70.728324][ T9333]  __msan_warning+0x96/0x120
[   70.728854][ T9333]  hfsplus_subfolders_dec+0x1d7/0x220
[   70.729479][ T9333]  hfsplus_delete_cat+0x105d/0x12b0
[   70.729984][ T9333]  ? kmsan_get_shadow_origin_ptr+0x4a/0xb0
[   70.730646][ T9333]  ? __msan_metadata_ptr_for_load_4+0x24/0x40
[   70.731296][ T9333]  ? kmsan_get_metadata+0x13e/0x1c0
[   70.731863][ T9333]  hfsplus_rmdir+0x13d/0x310
[   70.732390][ T9333]  ? __pfx_hfsplus_rmdir+0x10/0x10
[   70.732919][ T9333]  vfs_rmdir+0x5ba/0x810
[   70.733416][ T9333]  ? kmsan_get_shadow_origin_ptr+0x4a/0xb0
[   70.734044][ T9333]  do_rmdir+0x964/0xea0
[   70.734537][ T9333]  __x64_sys_rmdir+0x71/0xb0
[   70.735032][ T9333]  x64_sys_call+0xcd8/0x3cf0
[   70.735579][ T9333]  do_syscall_64+0xd9/0x1d0
[   70.736092][ T9333]  ? irqentry_exit+0x16/0x60
[   70.736637][ T9333]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   70.737269][ T9333] RIP: 0033:0x7fa9424eafc9
[   70.737775][ T9333] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 48
[   70.739844][ T9333] RSP: 002b:00007fff099cd8d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000054
[   70.740760][ T9333] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa9424eafc9
[   70.741642][ T9333] RDX: 006c6f72746e6f63 RSI: 000000000000000a RDI: 0000000020000100
[   70.742543][ T9333] RBP: 00007fff099cd8e0 R08: 00007fff099cd910 R09: 00007fff099cd910
[   70.743376][ T9333] R10: 0000000000000000 R11: 0000000000000202 R12: 0000565430642260
[   70.744247][ T9333] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   70.745082][ T9333]  </TASK>

The root cause of the issue is that struct hfsplus_inode_info
is not properly initialized for the root folder.
For the root folder, hfsplus_fill_super() calls
hfsplus_iget(), which only partially initializes
struct hfsplus_inode_info and leaves the subfolders field
uninitialized.

This patch implements complete initialization of
struct hfsplus_inode_info in the hfsplus_iget() logic
to prevent similar issues for the root folder.
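
As a rough userspace illustration of why complete initialization matters, the sketch below fills an object with stale bytes, as a recycled slab object might contain, and then sets every field explicitly. The struct is a hypothetical subset of fields, not the real struct hfsplus_inode_info, and both function names are invented for the example:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of per-inode fields, for illustration only. */
struct inode_info {
	uint32_t subfolders;
	uint32_t extent_state;
	uint64_t phys_size;
};

/* Simulate an object handed out by an allocator that does not zero memory. */
static void simulate_recycled(struct inode_info *hip)
{
	memset(hip, 0xAA, sizeof(*hip));
}

/* Set every field explicitly so that no stale content survives. */
static void inode_info_init(struct inode_info *hip)
{
	hip->subfolders = 0;
	hip->extent_state = 0;
	hip->phys_size = 0;
}
```

If inode_info_init() skipped subfolders, a later increment or decrement would operate on whatever bytes the previous owner left behind, which is the pattern KMSAN flags.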

Reported-by: syzbot <syzbot+fdedff847a0e5e84c39f@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=fdedff847a0e5e84c39f
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20250825225103.326401-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-08-31 18:14:32 -07:00
Viacheslav Dubeyko
4840ceadef hfsplus: fix KMSAN uninit-value issue in __hfsplus_ext_cache_extent()
syzbot reported an issue in __hfsplus_ext_cache_extent():

[   70.194323][ T9350] BUG: KMSAN: uninit-value in __hfsplus_ext_cache_extent+0x7d0/0x990
[   70.195022][ T9350]  __hfsplus_ext_cache_extent+0x7d0/0x990
[   70.195530][ T9350]  hfsplus_file_extend+0x74f/0x1cf0
[   70.195998][ T9350]  hfsplus_get_block+0xe16/0x17b0
[   70.196458][ T9350]  __block_write_begin_int+0x962/0x2ce0
[   70.196959][ T9350]  cont_write_begin+0x1000/0x1950
[   70.197416][ T9350]  hfsplus_write_begin+0x85/0x130
[   70.197873][ T9350]  generic_perform_write+0x3e8/0x1060
[   70.198374][ T9350]  __generic_file_write_iter+0x215/0x460
[   70.198892][ T9350]  generic_file_write_iter+0x109/0x5e0
[   70.199393][ T9350]  vfs_write+0xb0f/0x14e0
[   70.199771][ T9350]  ksys_write+0x23e/0x490
[   70.200149][ T9350]  __x64_sys_write+0x97/0xf0
[   70.200570][ T9350]  x64_sys_call+0x3015/0x3cf0
[   70.201065][ T9350]  do_syscall_64+0xd9/0x1d0
[   70.201506][ T9350]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   70.202054][ T9350]
[   70.202279][ T9350] Uninit was created at:
[   70.202693][ T9350]  __kmalloc_noprof+0x621/0xf80
[   70.203149][ T9350]  hfsplus_find_init+0x8d/0x1d0
[   70.203602][ T9350]  hfsplus_file_extend+0x6ca/0x1cf0
[   70.204087][ T9350]  hfsplus_get_block+0xe16/0x17b0
[   70.204561][ T9350]  __block_write_begin_int+0x962/0x2ce0
[   70.205074][ T9350]  cont_write_begin+0x1000/0x1950
[   70.205547][ T9350]  hfsplus_write_begin+0x85/0x130
[   70.206017][ T9350]  generic_perform_write+0x3e8/0x1060
[   70.206519][ T9350]  __generic_file_write_iter+0x215/0x460
[   70.207042][ T9350]  generic_file_write_iter+0x109/0x5e0
[   70.207552][ T9350]  vfs_write+0xb0f/0x14e0
[   70.207961][ T9350]  ksys_write+0x23e/0x490
[   70.208375][ T9350]  __x64_sys_write+0x97/0xf0
[   70.208810][ T9350]  x64_sys_call+0x3015/0x3cf0
[   70.209255][ T9350]  do_syscall_64+0xd9/0x1d0
[   70.209680][ T9350]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   70.210230][ T9350]
[   70.210454][ T9350] CPU: 2 UID: 0 PID: 9350 Comm: repro Not tainted 6.12.0-rc5 #5
[   70.211174][ T9350] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   70.212115][ T9350] =====================================================
[   70.212734][ T9350] Disabling lock debugging due to kernel taint
[   70.213284][ T9350] Kernel panic - not syncing: kmsan.panic set ...
[   70.213858][ T9350] CPU: 2 UID: 0 PID: 9350 Comm: repro Tainted: G    B              6.12.0-rc5 #5
[   70.214679][ T9350] Tainted: [B]=BAD_PAGE
[   70.215057][ T9350] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   70.215999][ T9350] Call Trace:
[   70.216309][ T9350]  <TASK>
[   70.216585][ T9350]  dump_stack_lvl+0x1fd/0x2b0
[   70.217025][ T9350]  dump_stack+0x1e/0x30
[   70.217421][ T9350]  panic+0x502/0xca0
[   70.217803][ T9350]  ? kmsan_get_metadata+0x13e/0x1c0
[   70.218294][ T9350]  kmsan_report+0x296/0x2a0
[   70.220179][ T9350]  ? kmsan_get_metadata+0x13e/0x1c0
[   70.221254][ T9350]  ? __msan_warning+0x96/0x120
[   70.222066][ T9350]  ? __hfsplus_ext_cache_extent+0x7d0/0x990
[   70.223023][ T9350]  ? hfsplus_file_extend+0x74f/0x1cf0
[   70.224120][ T9350]  ? hfsplus_get_block+0xe16/0x17b0
[   70.224946][ T9350]  ? __block_write_begin_int+0x962/0x2ce0
[   70.225756][ T9350]  ? cont_write_begin+0x1000/0x1950
[   70.226337][ T9350]  ? hfsplus_write_begin+0x85/0x130
[   70.226852][ T9350]  ? generic_perform_write+0x3e8/0x1060
[   70.227405][ T9350]  ? __generic_file_write_iter+0x215/0x460
[   70.227979][ T9350]  ? generic_file_write_iter+0x109/0x5e0
[   70.228540][ T9350]  ? vfs_write+0xb0f/0x14e0
[   70.228997][ T9350]  ? ksys_write+0x23e/0x490
[   70.229458][ T9350]  ? __x64_sys_write+0x97/0xf0
[   70.229939][ T9350]  ? x64_sys_call+0x3015/0x3cf0
[   70.230432][ T9350]  ? do_syscall_64+0xd9/0x1d0
[   70.230941][ T9350]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   70.231926][ T9350]  ? kmsan_get_metadata+0x13e/0x1c0
[   70.232738][ T9350]  ? kmsan_internal_set_shadow_origin+0x77/0x110
[   70.233711][ T9350]  ? kmsan_get_metadata+0x13e/0x1c0
[   70.234516][ T9350]  ? kmsan_get_shadow_origin_ptr+0x4a/0xb0
[   70.235398][ T9350]  ? __msan_metadata_ptr_for_load_4+0x24/0x40
[   70.236323][ T9350]  ? hfsplus_brec_find+0x218/0x9f0
[   70.237090][ T9350]  ? __pfx_hfs_find_rec_by_key+0x10/0x10
[   70.237938][ T9350]  ? __msan_instrument_asm_store+0xbf/0xf0
[   70.238827][ T9350]  ? __msan_metadata_ptr_for_store_4+0x27/0x40
[   70.239772][ T9350]  ? __hfsplus_ext_write_extent+0x536/0x620
[   70.240666][ T9350]  ? kmsan_get_metadata+0x13e/0x1c0
[   70.241175][ T9350]  __msan_warning+0x96/0x120
[   70.241645][ T9350]  __hfsplus_ext_cache_extent+0x7d0/0x990
[   70.242223][ T9350]  hfsplus_file_extend+0x74f/0x1cf0
[   70.242748][ T9350]  hfsplus_get_block+0xe16/0x17b0
[   70.243255][ T9350]  ? kmsan_internal_set_shadow_origin+0x77/0x110
[   70.243878][ T9350]  ? kmsan_get_metadata+0x13e/0x1c0
[   70.244400][ T9350]  ? kmsan_get_shadow_origin_ptr+0x4a/0xb0
[   70.244967][ T9350]  __block_write_begin_int+0x962/0x2ce0
[   70.245531][ T9350]  ? __pfx_hfsplus_get_block+0x10/0x10
[   70.246079][ T9350]  cont_write_begin+0x1000/0x1950
[   70.246598][ T9350]  hfsplus_write_begin+0x85/0x130
[   70.247105][ T9350]  ? __pfx_hfsplus_get_block+0x10/0x10
[   70.247650][ T9350]  ? __pfx_hfsplus_write_begin+0x10/0x10
[   70.248211][ T9350]  generic_perform_write+0x3e8/0x1060
[   70.248752][ T9350]  __generic_file_write_iter+0x215/0x460
[   70.249314][ T9350]  generic_file_write_iter+0x109/0x5e0
[   70.249856][ T9350]  ? kmsan_internal_set_shadow_origin+0x77/0x110
[   70.250487][ T9350]  vfs_write+0xb0f/0x14e0
[   70.250930][ T9350]  ? __pfx_generic_file_write_iter+0x10/0x10
[   70.251530][ T9350]  ksys_write+0x23e/0x490
[   70.251974][ T9350]  __x64_sys_write+0x97/0xf0
[   70.252450][ T9350]  x64_sys_call+0x3015/0x3cf0
[   70.252924][ T9350]  do_syscall_64+0xd9/0x1d0
[   70.253384][ T9350]  ? irqentry_exit+0x16/0x60
[   70.253844][ T9350]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   70.254430][ T9350] RIP: 0033:0x7f7a92adffc9
[   70.254873][ T9350] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 48
[   70.256674][ T9350] RSP: 002b:00007fff0bca3188 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[   70.257485][ T9350] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7a92adffc9
[   70.258246][ T9350] RDX: 000000000208e24b RSI: 0000000020000100 RDI: 0000000000000004
[   70.258998][ T9350] RBP: 00007fff0bca31a0 R08: 00007fff0bca31a0 R09: 00007fff0bca31a0
[   70.259769][ T9350] R10: 0000000000000000 R11: 0000000000000202 R12: 000055e0d75f8250
[   70.260520][ T9350] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   70.261286][ T9350]  </TASK>
[   70.262026][ T9350] Kernel Offset: disabled

(gdb) l *__hfsplus_ext_cache_extent+0x7d0
0xffffffff8318aef0 is in __hfsplus_ext_cache_extent (fs/hfsplus/extents.c:168).
163		fd->key->ext.cnid = 0;
164		res = hfs_brec_find(fd, hfs_find_rec_by_key);
165		if (res && res != -ENOENT)
166			return res;
167		if (fd->key->ext.cnid != fd->search_key->ext.cnid ||
168		    fd->key->ext.fork_type != fd->search_key->ext.fork_type)
169			return -ENOENT;
170		if (fd->entrylength != sizeof(hfsplus_extent_rec))
171			return -EIO;
172		hfs_bnode_read(fd->bnode, extent, fd->entryoffset,

__hfsplus_ext_cache_extent() calls __hfsplus_ext_read_extent():

res = __hfsplus_ext_read_extent(fd, hip->cached_extents, inode->i_ino,
				block, HFSPLUS_IS_RSRC(inode) ?
					HFSPLUS_TYPE_RSRC :
					HFSPLUS_TYPE_DATA);

If inode->i_ino is zero or any other unavailable CNID,
hfs_brec_find() cannot find the record in the tree. Even so,
fd->key is then compared with fd->search_key. However, hfsplus_find_init()
uses kmalloc() to allocate fd->key and fd->search_key:

int hfs_find_init(struct hfs_btree *tree, struct hfs_find_data *fd)
{
<skipped>
        ptr = kmalloc(tree->max_key_len * 2 + 4, GFP_KERNEL);
        if (!ptr)
                return -ENOMEM;
        fd->search_key = ptr;
        fd->key = ptr + tree->max_key_len + 2;
<skipped>
}

As a result, fd->key remains uninitialized if hfs_brec_find()
finds nothing.

This patch replaces kmalloc() with kzalloc() in hfs_find_init()
and initializes fd->record, fd->keyoffset, fd->keylength,
fd->entryoffset, and fd->entrylength for the case when hfs_brec_find()
finds nothing in the b-tree node.
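
The effect of the change can be sketched in userspace, with calloc() standing in for kzalloc(). The struct is a simplified, hypothetical version of struct hfs_find_data, the helper names are invented, and the -1 "nothing found" reset values are an assumption made for this example rather than something the commit message states:

```c
#include <errno.h>
#include <stdlib.h>

/* Simplified, hypothetical version of the search context. */
struct find_data {
	unsigned char *search_key;
	unsigned char *key;
	int record;
	int keyoffset, keylength;
	int entryoffset, entrylength;
};

/* Put the result fields into a defined "nothing found" state (assumed -1). */
static void find_reset(struct find_data *fd)
{
	fd->record = -1;
	fd->keyoffset = -1;
	fd->keylength = -1;
	fd->entryoffset = -1;
	fd->entrylength = -1;
}

static int find_init(struct find_data *fd, size_t max_key_len)
{
	/* calloc() zeroes both key buffers, as kzalloc() would. */
	unsigned char *ptr = calloc(1, max_key_len * 2 + 4);

	if (!ptr)
		return -ENOMEM;
	fd->search_key = ptr;
	fd->key = ptr + max_key_len + 2;
	find_reset(fd);
	return 0;
}

static void find_exit(struct find_data *fd)
{
	free(fd->search_key); /* one allocation backs both keys */
	fd->search_key = NULL;
	fd->key = NULL;
}
```

With zeroed buffers, a comparison of fd->key against fd->search_key after a failed lookup reads defined bytes instead of leftover heap content.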

Reported-by: syzbot <syzbot+55ad87f38795d6787521@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=55ad87f38795d6787521
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20250818225232.126402-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-08-31 18:11:43 -07:00
Yang Chenzhi
738d5a5186 hfs: validate record offset in hfsplus_bmap_alloc
hfsplus_bmap_alloc() can trigger a crash if a
record offset or length is larger than node_size.

[   15.264282] BUG: KASAN: slab-out-of-bounds in hfsplus_bmap_alloc+0x887/0x8b0
[   15.265192] Read of size 8 at addr ffff8881085ca188 by task test/183
[   15.265949]
[   15.266163] CPU: 0 UID: 0 PID: 183 Comm: test Not tainted 6.17.0-rc2-gc17b750b3ad9 #14 PREEMPT(voluntary)
[   15.266165] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   15.266167] Call Trace:
[   15.266168]  <TASK>
[   15.266169]  dump_stack_lvl+0x53/0x70
[   15.266173]  print_report+0xd0/0x660
[   15.266181]  kasan_report+0xce/0x100
[   15.266185]  hfsplus_bmap_alloc+0x887/0x8b0
[   15.266208]  hfs_btree_inc_height.isra.0+0xd5/0x7c0
[   15.266217]  hfsplus_brec_insert+0x870/0xb00
[   15.266222]  __hfsplus_ext_write_extent+0x428/0x570
[   15.266225]  __hfsplus_ext_cache_extent+0x5e/0x910
[   15.266227]  hfsplus_ext_read_extent+0x1b2/0x200
[   15.266233]  hfsplus_file_extend+0x5a7/0x1000
[   15.266237]  hfsplus_get_block+0x12b/0x8c0
[   15.266238]  __block_write_begin_int+0x36b/0x12c0
[   15.266251]  block_write_begin+0x77/0x110
[   15.266252]  cont_write_begin+0x428/0x720
[   15.266259]  hfsplus_write_begin+0x51/0x100
[   15.266262]  cont_write_begin+0x272/0x720
[   15.266270]  hfsplus_write_begin+0x51/0x100
[   15.266274]  generic_perform_write+0x321/0x750
[   15.266285]  generic_file_write_iter+0xc3/0x310
[   15.266289]  __kernel_write_iter+0x2fd/0x800
[   15.266296]  dump_user_range+0x2ea/0x910
[   15.266301]  elf_core_dump+0x2a94/0x2ed0
[   15.266320]  vfs_coredump+0x1d85/0x45e0
[   15.266349]  get_signal+0x12e3/0x1990
[   15.266357]  arch_do_signal_or_restart+0x89/0x580
[   15.266362]  irqentry_exit_to_user_mode+0xab/0x110
[   15.266364]  asm_exc_page_fault+0x26/0x30
[   15.266366] RIP: 0033:0x41bd35
[   15.266367] Code: bc d1 f3 0f 7f 27 f3 0f 7f 6f 10 f3 0f 7f 77 20 f3 0f 7f 7f 30 49 83 c0 0f 49 29 d0 48 8d 7c 17 31 e9 9f 0b 00 00 66 0f ef c0 <f3> 0f 6f 0e f3 0f 6f 56 10 66 0f 74 c1 66 0f d7 d0 49 83 f8f
[   15.266369] RSP: 002b:00007ffc9e62d078 EFLAGS: 00010283
[   15.266371] RAX: 00007ffc9e62d100 RBX: 0000000000000000 RCX: 0000000000000000
[   15.266372] RDX: 00000000000000e0 RSI: 0000000000000000 RDI: 00007ffc9e62d100
[   15.266373] RBP: 0000400000000040 R08: 00000000000000e0 R09: 0000000000000000
[   15.266374] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[   15.266375] R13: 0000000000000000 R14: 0000000000000000 R15: 0000400000000000
[   15.266376]  </TASK>

When hfsplus_bmap_alloc() is called to allocate a free node, it
first retrieves the bitmap from the header node and map nodes, using node->page
together with the offset and length from hfs_brec_lenoff():

```
len = hfs_brec_lenoff(node, 2, &off16);
off = off16;

off += node->page_offset;
pagep = node->page + (off >> PAGE_SHIFT);
data = kmap_local_page(*pagep);
```

However, if the retrieved offset or length is invalid (i.e. exceeds
node_size), the code may end up accessing pages outside the allocated
range for this node.

This patch adds proper validation of both offset and length before use,
preventing out-of-bounds page access. It also moves is_bnode_offset_valid()
and check_and_correct_requested_length() to hfsplus_fs.h, as other
functions may need them.
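
The kind of validation described above can be sketched as two small helpers. The names offset_valid() and correct_length() are hypothetical, and the bodies are an assumption about what such checks look like, not a copy of the kernel's is_bnode_offset_valid() and check_and_correct_requested_length():

```c
#include <stdbool.h>
#include <stdint.h>

/* An offset is usable only if it points inside the node. */
static bool offset_valid(uint32_t node_size, uint32_t off)
{
	return off < node_size;
}

/* Shrink a requested length so that off + len never exceeds node_size. */
static uint32_t correct_length(uint32_t node_size, uint32_t off, uint32_t len)
{
	if (!offset_valid(node_size, off))
		return 0;
	if (len > node_size - off)
		return node_size - off;
	return len;
}
```

For a 4096-byte node, an offset of 4000 with a requested length of 200 is cut down to 96 bytes, and an offset of 5000 is rejected outright.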

Reported-by: syzbot+356aed408415a56543cd@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67bcb4a6.050a0220.bbfd1.008f.GAE@google.com/
Signed-off-by: Yang Chenzhi <yang.chenzhi@vivo.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Link: https://lore.kernel.org/r/20250818141734.8559-2-yang.chenzhi@vivo.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-08-31 18:09:44 -07:00
Yangtao Li
9282bc905f hfsplus: return EIO when type of hidden directory mismatch in hfsplus_fill_super()
If the Catalog File contains a corrupted record in which the
hidden directory has the wrong type, treat it as an I/O error (EIO)
instead of an invalid argument (EINVAL).

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Link: https://lore.kernel.org/r/20250805165905.3390154-1-frank.li@vivo.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-08-31 18:07:59 -07:00
Linus Torvalds
cb6bbff7e6 hfs/hfsplus updates for v6.17
 - hfs: fix general protection fault in hfs_find_init()
 - hfs: fix slab-out-of-bounds in hfs_bnode_read()
 - hfsplus: fix slab-out-of-bounds in hfsplus_bnode_read()
 - hfsplus: fix slab-out-of-bounds read in hfsplus_uni2asc()
 - hfsplus: don't use BUG_ON() in hfsplus_create_attributes_file()
 - hfsplus: don't set REQ_SYNC for hfsplus_submit_bio()
 - hfsplus: remove mutex_lock check in hfsplus_free_extents
 - hfs: make splice write available again
 - hfsplus: make splice write available again
 - hfs: fix not erasing deleted b-tree node issue
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQT4wVoLCG92poNnMFAhI4xTh21NnQUCaIQQ0wAKCRAhI4xTh21N
 nW3yAQDMhJcNyjP1j2dhNRq8l2PO6jDJqLhxAYGKwWMwv1GTvQD5AaOUSeMQbmcs
 hNkMtjzb7OlfBLUthvrWlaCfLKWCmAk=
 =dI94
 -----END PGP SIGNATURE-----

Merge tag 'hfs-v6.17-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs

Pull hfs/hfsplus updates from Viacheslav Dubeyko:
 "Johannes Thumshirn has made a nice cleanup in hfsplus_submit_bio().

  Tetsuo Handa has fixed the syzbot reported issue in
  hfsplus_create_attributes_file() for the case of corruption of the
  Attributes File's metadata.

  Yangtao Li has fixed the syzbot reported issue by removing the
  unnecessary WARN_ON() in hfsplus_free_extents().

  Other fixes:

   - restore generic/001 successful execution by erasing deleted b-tree
     nodes

   - eliminate slab-out-of-bounds issue in hfs_bnode_read() and
     hfsplus_bnode_read() by checking correctness of offset and length
     when accessing b-tree node contents

   - eliminate slab-out-of-bounds read in hfsplus_uni2asc() if the
     b-tree node record has corrupted length of a name that could be
     bigger than HFSPLUS_MAX_STRLEN

   - eliminate general protection fault in hfs_find_init() for the case
     of initial b-tree object creation"

* tag 'hfs-v6.17-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs:
  hfs: fix general protection fault in hfs_find_init()
  hfs: fix slab-out-of-bounds in hfs_bnode_read()
  hfsplus: fix slab-out-of-bounds in hfsplus_bnode_read()
  hfsplus: fix slab-out-of-bounds read in hfsplus_uni2asc()
  hfsplus: don't use BUG_ON() in hfsplus_create_attributes_file()
  hfsplus: don't set REQ_SYNC for hfsplus_submit_bio()
  hfsplus: remove mutex_lock check in hfsplus_free_extents
  hfs: make splice write available again
  hfsplus: make splice write available again
  hfs: fix not erasing deleted b-tree node issue
2025-07-28 16:17:44 -07:00
Linus Torvalds
57fcb7d930 vfs-6.17-rc1.fileattr
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaINCpgAKCRCRxhvAZXjc
 oqfFAQDcy3rROUF3W34KcSi7rDmaKVSX53d1tUoqH+1zDRpSlwEAriKDNC1ybudp
 YAnxVzkRHjHs1296WIuwKq5lfhJ60Q4=
 =geAl
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.17-rc1.fileattr' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull fileattr updates from Christian Brauner:
 "This introduces the new file_getattr() and file_setattr() system calls
  after lengthy discussions.

  Both system calls serve as successors and extensible companions to
  the FS_IOC_FSGETXATTR and FS_IOC_FSSETXATTR ioctls, which have
  started to show their age in addition to being named in a way that
  makes it easy to conflate them with extended attribute related
  operations.

  These syscalls allow userspace to set filesystem inode attributes on
  special files. One usage example is XFS project quotas.

  XFS has project quotas which can be attached to a directory. All new
  inodes in these directories inherit the project ID set on the parent
  directory.

  The project is created from userspace by opening and calling
  FS_IOC_FSSETXATTR on each inode. This is not possible for special
  files such as FIFO, SOCK, BLK etc. Therefore, some inodes are left
  with an empty project ID. Those inodes then are not shown in the quota
  accounting but still exist in the directory. This is not critical, but
  when special files are created in a directory with an already
  existing project quota, these new inodes inherit extended
  attributes. This creates a mix of special files with and without
  attributes. Moreover, special files with attributes have no way
  to clear or change them. This, in turn,
  prevents userspace from re-creating the quota project on these existing
  files.

  In addition, these new system calls allow the implementation of
  additional attributes that we couldn't or didn't want to fit into the
  legacy ioctls anymore"

* tag 'vfs-6.17-rc1.fileattr' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: tighten a sanity check in file_attr_to_fileattr()
  tree-wide: s/struct fileattr/struct file_kattr/g
  fs: introduce file_getattr and file_setattr syscalls
  fs: prepare for extending file_get/setattr()
  fs: make vfs_fileattr_[get|set] return -EOPNOTSUPP
  selinux: implement inode_file_[g|s]etattr hooks
  lsm: introduce new hooks for setting/getting inode fsxattr
  fs: split fileattr related helpers into separate file
2025-07-28 15:24:14 -07:00
Linus Torvalds
7031769e10 vfs-6.17-rc1.mmap_prepare
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaINCgQAKCRCRxhvAZXjc
 os+nAP9LFHUwWO6EBzHJJGEVjJvvzsbzqeYrRFamYiMc5ulPJwD+KW4RIgJa/MWO
 pcYE40CacaekD8rFWwYUyszpgmv6ewc=
 =wCwp
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.17-rc1.mmap_prepare' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull mmap_prepare updates from Christian Brauner:
 "Last cycle we introduce f_op->mmap_prepare() in c84bf6dd2b ("mm:
  introduce new .mmap_prepare() file callback").

  This is preferred to the existing f_op->mmap() hook as it does not
  require a VMA to be established yet, thus allowing the mmap logic to
  invoke this hook far, far earlier, prior to inserting a VMA into the
  virtual address space, or performing any other heavy-handed operations.

  This allows for much simpler unwinding on error, and for there to be a
  single attempt at merging a VMA rather than having to possibly
  reattempt a merge based on potentially altered VMA state.

  Far more importantly, it prevents inappropriate manipulation of
  incompletely initialised VMA state, which is something that has been
  the cause of bugs and complexity in the past.

  The intent is to gradually deprecate f_op->mmap, and in that vein this
  series converts the majority of file systems to using f_op->mmap_prepare.

  Prerequisite steps are taken - firstly ensuring all checks for mmap
  capabilities use the file_has_valid_mmap_hooks() helper rather than
  directly checking for f_op->mmap (which is now not a valid check) and
  secondly updating daxdev_mapping_supported() to not require a VMA
  parameter to allow ext4 and xfs to be converted.

  Commit bb666b7c27 ("mm: add mmap_prepare() compatibility layer for
  nested file systems") handles the nasty edge case of nested file
  systems like overlayfs by introducing a compatibility shim that allows
  f_op->mmap_prepare() to be invoked from an f_op->mmap() callback.

  This allows for nested filesystems to continue to function correctly
  with all file systems regardless of which callback is used. Once we
  finally convert all file systems, this shim can be removed.

  As a result, ecryptfs, fuse, and overlayfs remain unaltered so they
  can nest all other file systems.

  We additionally do not update resctrl, as this requires an update to
  remap_pfn_range() (or an alternative to it) which we defer to a later
  series. Equally, we do not update cramfs, which needs a mixed mapping
  insertion with the same issue, nor do we update procfs, hugetlbfs,
  sysfs or kernfs, all of which require VMAs for internal state and hooks.
  We shall return to all of these later"

* tag 'vfs-6.17-rc1.mmap_prepare' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  doc: update porting, vfs documentation to describe mmap_prepare()
  fs: replace mmap hook with .mmap_prepare for simple mappings
  fs: convert most other generic_file_*mmap() users to .mmap_prepare()
  fs: convert simple use of generic_file_*_mmap() to .mmap_prepare()
  mm/filemap: introduce generic_file_*_mmap_prepare() helpers
  fs/xfs: transition from deprecated .mmap hook to .mmap_prepare
  fs/ext4: transition from deprecated .mmap hook to .mmap_prepare
  fs/dax: make it possible to check dev dax support without a VMA
  fs: consistently use can_mmap_file() helper
  mm/nommu: use file_has_valid_mmap_hooks() helper
  mm: rename call_mmap/mmap_prepare to vfs_mmap/mmap_prepare
2025-07-28 13:43:25 -07:00
Linus Torvalds
7879d7aff0 vfs-6.17-rc1.misc
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaIM/KwAKCRCRxhvAZXjc
 opT+AP407JwhRSBjUEmHg5JzUyDoivkOySdnthunRjaBKD8rlgEApM6SOIZYucU7
 cPC3ZY6ORFM6Mwaw+iDW9lasM5ucHQ8=
 =CHha
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.17-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc VFS updates from Christian Brauner:
 "This contains the usual selections of misc updates for this cycle.

  Features:

   - Add ext4 IOCB_DONTCACHE support

     This refactors the address_space_operations write_begin() and
     write_end() callbacks to take const struct kiocb * as their first
     argument, allowing IOCB flags such as IOCB_DONTCACHE to propagate
     to the filesystem's buffered I/O path.

     Ext4 is updated to implement handling of the IOCB_DONTCACHE flag
     and advertises support via the FOP_DONTCACHE file operation flag.

     Additionally, the i915 driver's shmem write paths are updated to
     bypass the legacy write_begin/write_end interface in favor of
     directly calling write_iter() with a constructed synchronous kiocb.
     Another i915 change replaces a manual write loop with
     kernel_write() during GEM shmem object creation.

  Cleanups:

   - don't duplicate vfs_open() in kernel_file_open()

   - proc_fd_getattr(): don't bother with S_ISDIR() check

   - fs/ecryptfs: replace snprintf with sysfs_emit in show function

   - vfs: Remove unnecessary list_for_each_entry_safe() from
     evict_inodes()

   - filelock: add new locks_wake_up_waiter() helper

   - fs: Remove three arguments from block_write_end()

   - VFS: change old_dir and new_dir in struct renamedata to dentrys

   - netfs: Remove unused declaration netfs_queue_write_request()

  Fixes:

   - eventpoll: Fix semi-unbounded recursion

   - eventpoll: fix sphinx documentation build warning

   - fs/read_write: Fix spelling typo

   - fs: annotate data race between poll_schedule_timeout() and
     pollwake()

   - fs/pipe: set FMODE_NOWAIT in create_pipe_files()

   - docs/vfs: update references to i_mutex to i_rwsem

   - fs/buffer: remove comment about hard sectorsize

   - fs/buffer: remove the min and max limit checks in __getblk_slow()

   - fs/libfs: don't assume blocksize <= PAGE_SIZE in
     generic_check_addressable

   - fs_context: fix parameter name in infofc() macro

   - fs: Prevent file descriptor table allocations exceeding INT_MAX"

* tag 'vfs-6.17-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (24 commits)
  netfs: Remove unused declaration netfs_queue_write_request()
  eventpoll: fix sphinx documentation build warning
  ext4: support uncached buffered I/O
  mm/pagemap: add write_begin_get_folio() helper function
  fs: change write_begin/write_end interface to take struct kiocb *
  drm/i915: Refactor shmem_pwrite() to use kiocb and write_iter
  drm/i915: Use kernel_write() in shmem object create
  eventpoll: Fix semi-unbounded recursion
  vfs: Remove unnecessary list_for_each_entry_safe() from evict_inodes()
  fs/libfs: don't assume blocksize <= PAGE_SIZE in generic_check_addressable
  fs/buffer: remove the min and max limit checks in __getblk_slow()
  fs: Prevent file descriptor table allocations exceeding INT_MAX
  fs: Remove three arguments from block_write_end()
  fs/ecryptfs: replace snprintf with sysfs_emit in show function
  fs: annotate suspected data race between poll_schedule_timeout() and pollwake()
  docs/vfs: update references to i_mutex to i_rwsem
  fs/buffer: remove comment about hard sectorsize
  fs_context: fix parameter name in infofc() macro
  VFS: change old_dir and new_dir in struct renamedata to dentrys
  proc_fd_getattr(): don't bother with S_ISDIR() check
  ...
2025-07-28 11:22:56 -07:00
Viacheslav Dubeyko
c80aa2aaaa hfsplus: fix slab-out-of-bounds in hfsplus_bnode_read()
The hfsplus_bnode_read() method can trigger the issue:

[  174.852007][ T9784] ==================================================================
[  174.852709][ T9784] BUG: KASAN: slab-out-of-bounds in hfsplus_bnode_read+0x2f4/0x360
[  174.853412][ T9784] Read of size 8 at addr ffff88810b5fc6c0 by task repro/9784
[  174.854059][ T9784]
[  174.854272][ T9784] CPU: 1 UID: 0 PID: 9784 Comm: repro Not tainted 6.16.0-rc3 #7 PREEMPT(full)
[  174.854281][ T9784] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  174.854286][ T9784] Call Trace:
[  174.854289][ T9784]  <TASK>
[  174.854292][ T9784]  dump_stack_lvl+0x10e/0x1f0
[  174.854305][ T9784]  print_report+0xd0/0x660
[  174.854315][ T9784]  ? __virt_addr_valid+0x81/0x610
[  174.854323][ T9784]  ? __phys_addr+0xe8/0x180
[  174.854330][ T9784]  ? hfsplus_bnode_read+0x2f4/0x360
[  174.854337][ T9784]  kasan_report+0xc6/0x100
[  174.854346][ T9784]  ? hfsplus_bnode_read+0x2f4/0x360
[  174.854354][ T9784]  hfsplus_bnode_read+0x2f4/0x360
[  174.854362][ T9784]  hfsplus_bnode_dump+0x2ec/0x380
[  174.854370][ T9784]  ? __pfx_hfsplus_bnode_dump+0x10/0x10
[  174.854377][ T9784]  ? hfsplus_bnode_write_u16+0x83/0xb0
[  174.854385][ T9784]  ? srcu_gp_start+0xd0/0x310
[  174.854393][ T9784]  ? __mark_inode_dirty+0x29e/0xe40
[  174.854402][ T9784]  hfsplus_brec_remove+0x3d2/0x4e0
[  174.854411][ T9784]  __hfsplus_delete_attr+0x290/0x3a0
[  174.854419][ T9784]  ? __pfx_hfs_find_1st_rec_by_cnid+0x10/0x10
[  174.854427][ T9784]  ? __pfx___hfsplus_delete_attr+0x10/0x10
[  174.854436][ T9784]  ? __asan_memset+0x23/0x50
[  174.854450][ T9784]  hfsplus_delete_all_attrs+0x262/0x320
[  174.854459][ T9784]  ? __pfx_hfsplus_delete_all_attrs+0x10/0x10
[  174.854469][ T9784]  ? rcu_is_watching+0x12/0xc0
[  174.854476][ T9784]  ? __mark_inode_dirty+0x29e/0xe40
[  174.854483][ T9784]  hfsplus_delete_cat+0x845/0xde0
[  174.854493][ T9784]  ? __pfx_hfsplus_delete_cat+0x10/0x10
[  174.854507][ T9784]  hfsplus_unlink+0x1ca/0x7c0
[  174.854516][ T9784]  ? __pfx_hfsplus_unlink+0x10/0x10
[  174.854525][ T9784]  ? down_write+0x148/0x200
[  174.854532][ T9784]  ? __pfx_down_write+0x10/0x10
[  174.854540][ T9784]  vfs_unlink+0x2fe/0x9b0
[  174.854549][ T9784]  do_unlinkat+0x490/0x670
[  174.854557][ T9784]  ? __pfx_do_unlinkat+0x10/0x10
[  174.854565][ T9784]  ? __might_fault+0xbc/0x130
[  174.854576][ T9784]  ? getname_flags.part.0+0x1c5/0x550
[  174.854584][ T9784]  __x64_sys_unlink+0xc5/0x110
[  174.854592][ T9784]  do_syscall_64+0xc9/0x480
[  174.854600][ T9784]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  174.854608][ T9784] RIP: 0033:0x7f6fdf4c3167
[  174.854614][ T9784] Code: f0 ff ff 73 01 c3 48 8b 0d 26 0d 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 08
[  174.854622][ T9784] RSP: 002b:00007ffcb948bca8 EFLAGS: 00000206 ORIG_RAX: 0000000000000057
[  174.854630][ T9784] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f6fdf4c3167
[  174.854636][ T9784] RDX: 00007ffcb948bcc0 RSI: 00007ffcb948bcc0 RDI: 00007ffcb948bd50
[  174.854641][ T9784] RBP: 00007ffcb948cd90 R08: 0000000000000001 R09: 00007ffcb948bb40
[  174.854645][ T9784] R10: 00007f6fdf564fc0 R11: 0000000000000206 R12: 0000561e1bc9c2d0
[  174.854650][ T9784] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  174.854658][ T9784]  </TASK>
[  174.854661][ T9784]
[  174.879281][ T9784] Allocated by task 9784:
[  174.879664][ T9784]  kasan_save_stack+0x20/0x40
[  174.880082][ T9784]  kasan_save_track+0x14/0x30
[  174.880500][ T9784]  __kasan_kmalloc+0xaa/0xb0
[  174.880908][ T9784]  __kmalloc_noprof+0x205/0x550
[  174.881337][ T9784]  __hfs_bnode_create+0x107/0x890
[  174.881779][ T9784]  hfsplus_bnode_find+0x2d0/0xd10
[  174.882222][ T9784]  hfsplus_brec_find+0x2b0/0x520
[  174.882659][ T9784]  hfsplus_delete_all_attrs+0x23b/0x320
[  174.883144][ T9784]  hfsplus_delete_cat+0x845/0xde0
[  174.883595][ T9784]  hfsplus_rmdir+0x106/0x1b0
[  174.884004][ T9784]  vfs_rmdir+0x206/0x690
[  174.884379][ T9784]  do_rmdir+0x2b7/0x390
[  174.884751][ T9784]  __x64_sys_rmdir+0xc5/0x110
[  174.885167][ T9784]  do_syscall_64+0xc9/0x480
[  174.885568][ T9784]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  174.886083][ T9784]
[  174.886293][ T9784] The buggy address belongs to the object at ffff88810b5fc600
[  174.886293][ T9784]  which belongs to the cache kmalloc-192 of size 192
[  174.887507][ T9784] The buggy address is located 40 bytes to the right of
[  174.887507][ T9784]  allocated 152-byte region [ffff88810b5fc600, ffff88810b5fc698)
[  174.888766][ T9784]
[  174.888976][ T9784] The buggy address belongs to the physical page:
[  174.889533][ T9784] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10b5fc
[  174.890295][ T9784] flags: 0x57ff00000000000(node=1|zone=2|lastcpupid=0x7ff)
[  174.890927][ T9784] page_type: f5(slab)
[  174.891284][ T9784] raw: 057ff00000000000 ffff88801b4423c0 ffffea000426dc80 dead000000000002
[  174.892032][ T9784] raw: 0000000000000000 0000000080100010 00000000f5000000 0000000000000000
[  174.892774][ T9784] page dumped because: kasan: bad access detected
[  174.893327][ T9784] page_owner tracks the page as allocated
[  174.893825][ T9784] page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52c00(GFP_NOIO|__GFP_NOWARN|__GFP_NO1
[  174.895373][ T9784]  post_alloc_hook+0x1c0/0x230
[  174.895801][ T9784]  get_page_from_freelist+0xdeb/0x3b30
[  174.896284][ T9784]  __alloc_frozen_pages_noprof+0x25c/0x2460
[  174.896810][ T9784]  alloc_pages_mpol+0x1fb/0x550
[  174.897242][ T9784]  new_slab+0x23b/0x340
[  174.897614][ T9784]  ___slab_alloc+0xd81/0x1960
[  174.898028][ T9784]  __slab_alloc.isra.0+0x56/0xb0
[  174.898468][ T9784]  __kmalloc_noprof+0x2b0/0x550
[  174.898896][ T9784]  usb_alloc_urb+0x73/0xa0
[  174.899289][ T9784]  usb_control_msg+0x1cb/0x4a0
[  174.899718][ T9784]  usb_get_string+0xab/0x1a0
[  174.900133][ T9784]  usb_string_sub+0x107/0x3c0
[  174.900549][ T9784]  usb_string+0x307/0x670
[  174.900933][ T9784]  usb_cache_string+0x80/0x150
[  174.901355][ T9784]  usb_new_device+0x1d0/0x19d0
[  174.901786][ T9784]  register_root_hub+0x299/0x730
[  174.902231][ T9784] page last free pid 10 tgid 10 stack trace:
[  174.902757][ T9784]  __free_frozen_pages+0x80c/0x1250
[  174.903217][ T9784]  vfree.part.0+0x12b/0xab0
[  174.903645][ T9784]  delayed_vfree_work+0x93/0xd0
[  174.904073][ T9784]  process_one_work+0x9b5/0x1b80
[  174.904519][ T9784]  worker_thread+0x630/0xe60
[  174.904927][ T9784]  kthread+0x3a8/0x770
[  174.905291][ T9784]  ret_from_fork+0x517/0x6e0
[  174.905709][ T9784]  ret_from_fork_asm+0x1a/0x30
[  174.906128][ T9784]
[  174.906338][ T9784] Memory state around the buggy address:
[  174.906828][ T9784]  ffff88810b5fc580: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[  174.907528][ T9784]  ffff88810b5fc600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  174.908222][ T9784] >ffff88810b5fc680: 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc
[  174.908917][ T9784]                                            ^
[  174.909481][ T9784]  ffff88810b5fc700: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  174.910432][ T9784]  ffff88810b5fc780: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[  174.911401][ T9784] ==================================================================

The reason for the issue is that the code doesn't check the correctness
of the requested offset and length. As a result, an incorrect offset
and/or length could result in an access outside the allocated
memory.

This patch introduces the is_bnode_offset_valid() method, which checks
the requested offset value. It also introduces the
check_and_correct_requested_length() method, which checks and, if
necessary, corrects the requested length. These methods
are used in hfsplus_bnode_read(), hfsplus_bnode_write(),
hfsplus_bnode_clear(), hfsplus_bnode_copy(), and hfsplus_bnode_move()
to prevent accesses outside the allocated memory
and the resulting crash.
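
The offset check and length correction described above can be sketched
in userspace C. This is only an illustration under stated assumptions:
struct sketch_bnode and the sketch_* helpers are hypothetical stand-ins,
not the kernel's struct hfs_bnode or the functions added by the patch,
and a node is modelled as one flat buffer of node_size bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a b-tree node: a flat buffer of node_size bytes. */
struct sketch_bnode {
	size_t node_size;
	const unsigned char *data;
};

/* An offset is only usable if it points inside the node. */
static bool sketch_offset_valid(const struct sketch_bnode *node, size_t off)
{
	return off < node->node_size;
}

/* Returns 0 for a bad offset, otherwise len trimmed to what fits in the node. */
static size_t sketch_clamp_len(const struct sketch_bnode *node,
			       size_t off, size_t len)
{
	if (!sketch_offset_valid(node, off))
		return 0;
	if (len > node->node_size - off)
		return node->node_size - off;
	return len;
}

/* Copies at most the validated number of bytes and reports how many. */
static size_t sketch_bnode_read(const struct sketch_bnode *node, void *buf,
				size_t off, size_t len)
{
	len = sketch_clamp_len(node, off, len);
	if (len)
		memcpy(buf, node->data + off, len);
	return len;
}
```

Comparing len against node_size - off (rather than computing off + len)
is a choice made for this sketch so that the check itself cannot
overflow.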

Reported-by: Kun Hu <huk23@m.fudan.edu.cn>
Reported-by: Jiaji Qin <jjtan24@m.fudan.edu.cn>
Reported-by: Shuoran Bai <baishuoran@hrbeu.edu.cn>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Link: https://lore.kernel.org/r/20250703214804.244077-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-07-25 15:37:12 -07:00
Viacheslav Dubeyko
94458781ae hfsplus: fix slab-out-of-bounds read in hfsplus_uni2asc()
The hfsplus_readdir() method can crash when calling
hfsplus_uni2asc():

[  667.121659][ T9805] ==================================================================
[  667.122651][ T9805] BUG: KASAN: slab-out-of-bounds in hfsplus_uni2asc+0x902/0xa10
[  667.123627][ T9805] Read of size 2 at addr ffff88802592f40c by task repro/9805
[  667.124578][ T9805]
[  667.124876][ T9805] CPU: 3 UID: 0 PID: 9805 Comm: repro Not tainted 6.16.0-rc3 #1 PREEMPT(full)
[  667.124886][ T9805] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  667.124890][ T9805] Call Trace:
[  667.124893][ T9805]  <TASK>
[  667.124896][ T9805]  dump_stack_lvl+0x10e/0x1f0
[  667.124911][ T9805]  print_report+0xd0/0x660
[  667.124920][ T9805]  ? __virt_addr_valid+0x81/0x610
[  667.124928][ T9805]  ? __phys_addr+0xe8/0x180
[  667.124934][ T9805]  ? hfsplus_uni2asc+0x902/0xa10
[  667.124942][ T9805]  kasan_report+0xc6/0x100
[  667.124950][ T9805]  ? hfsplus_uni2asc+0x902/0xa10
[  667.124959][ T9805]  hfsplus_uni2asc+0x902/0xa10
[  667.124966][ T9805]  ? hfsplus_bnode_read+0x14b/0x360
[  667.124974][ T9805]  hfsplus_readdir+0x845/0xfc0
[  667.124984][ T9805]  ? __pfx_hfsplus_readdir+0x10/0x10
[  667.124994][ T9805]  ? stack_trace_save+0x8e/0xc0
[  667.125008][ T9805]  ? iterate_dir+0x18b/0xb20
[  667.125015][ T9805]  ? trace_lock_acquire+0x85/0xd0
[  667.125022][ T9805]  ? lock_acquire+0x30/0x80
[  667.125029][ T9805]  ? iterate_dir+0x18b/0xb20
[  667.125037][ T9805]  ? down_read_killable+0x1ed/0x4c0
[  667.125044][ T9805]  ? putname+0x154/0x1a0
[  667.125051][ T9805]  ? __pfx_down_read_killable+0x10/0x10
[  667.125058][ T9805]  ? apparmor_file_permission+0x239/0x3e0
[  667.125069][ T9805]  iterate_dir+0x296/0xb20
[  667.125076][ T9805]  __x64_sys_getdents64+0x13c/0x2c0
[  667.125084][ T9805]  ? __pfx___x64_sys_getdents64+0x10/0x10
[  667.125091][ T9805]  ? __x64_sys_openat+0x141/0x200
[  667.125126][ T9805]  ? __pfx_filldir64+0x10/0x10
[  667.125134][ T9805]  ? do_user_addr_fault+0x7fe/0x12f0
[  667.125143][ T9805]  do_syscall_64+0xc9/0x480
[  667.125151][ T9805]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  667.125158][ T9805] RIP: 0033:0x7fa8753b2fc9
[  667.125164][ T9805] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 48
[  667.125172][ T9805] RSP: 002b:00007ffe96f8e0f8 EFLAGS: 00000217 ORIG_RAX: 00000000000000d9
[  667.125181][ T9805] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa8753b2fc9
[  667.125185][ T9805] RDX: 0000000000000400 RSI: 00002000000063c0 RDI: 0000000000000004
[  667.125190][ T9805] RBP: 00007ffe96f8e110 R08: 00007ffe96f8e110 R09: 00007ffe96f8e110
[  667.125195][ T9805] R10: 0000000000000000 R11: 0000000000000217 R12: 0000556b1e3b4260
[  667.125199][ T9805] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  667.125207][ T9805]  </TASK>
[  667.125210][ T9805]
[  667.145632][ T9805] Allocated by task 9805:
[  667.145991][ T9805]  kasan_save_stack+0x20/0x40
[  667.146352][ T9805]  kasan_save_track+0x14/0x30
[  667.146717][ T9805]  __kasan_kmalloc+0xaa/0xb0
[  667.147065][ T9805]  __kmalloc_noprof+0x205/0x550
[  667.147448][ T9805]  hfsplus_find_init+0x95/0x1f0
[  667.147813][ T9805]  hfsplus_readdir+0x220/0xfc0
[  667.148174][ T9805]  iterate_dir+0x296/0xb20
[  667.148549][ T9805]  __x64_sys_getdents64+0x13c/0x2c0
[  667.148937][ T9805]  do_syscall_64+0xc9/0x480
[  667.149291][ T9805]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  667.149809][ T9805]
[  667.150030][ T9805] The buggy address belongs to the object at ffff88802592f000
[  667.150030][ T9805]  which belongs to the cache kmalloc-2k of size 2048
[  667.151282][ T9805] The buggy address is located 0 bytes to the right of
[  667.151282][ T9805]  allocated 1036-byte region [ffff88802592f000, ffff88802592f40c)
[  667.152580][ T9805]
[  667.152798][ T9805] The buggy address belongs to the physical page:
[  667.153373][ T9805] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x25928
[  667.154157][ T9805] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[  667.154916][ T9805] anon flags: 0xfff00000000040(head|node=0|zone=1|lastcpupid=0x7ff)
[  667.155631][ T9805] page_type: f5(slab)
[  667.155997][ T9805] raw: 00fff00000000040 ffff88801b442f00 0000000000000000 dead000000000001
[  667.156770][ T9805] raw: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000
[  667.157536][ T9805] head: 00fff00000000040 ffff88801b442f00 0000000000000000 dead000000000001
[  667.158317][ T9805] head: 0000000000000000 0000000080080008 00000000f5000000 0000000000000000
[  667.159088][ T9805] head: 00fff00000000003 ffffea0000964a01 00000000ffffffff 00000000ffffffff
[  667.159865][ T9805] head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008
[  667.160643][ T9805] page dumped because: kasan: bad access detected
[  667.161216][ T9805] page_owner tracks the page as allocated
[  667.161732][ T9805] page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN9
[  667.163566][ T9805]  post_alloc_hook+0x1c0/0x230
[  667.164003][ T9805]  get_page_from_freelist+0xdeb/0x3b30
[  667.164503][ T9805]  __alloc_frozen_pages_noprof+0x25c/0x2460
[  667.165040][ T9805]  alloc_pages_mpol+0x1fb/0x550
[  667.165489][ T9805]  new_slab+0x23b/0x340
[  667.165872][ T9805]  ___slab_alloc+0xd81/0x1960
[  667.166313][ T9805]  __slab_alloc.isra.0+0x56/0xb0
[  667.166767][ T9805]  __kmalloc_cache_noprof+0x255/0x3e0
[  667.167255][ T9805]  psi_cgroup_alloc+0x52/0x2d0
[  667.167693][ T9805]  cgroup_mkdir+0x694/0x1210
[  667.168118][ T9805]  kernfs_iop_mkdir+0x111/0x190
[  667.168568][ T9805]  vfs_mkdir+0x59b/0x8d0
[  667.168956][ T9805]  do_mkdirat+0x2ed/0x3d0
[  667.169353][ T9805]  __x64_sys_mkdir+0xef/0x140
[  667.169784][ T9805]  do_syscall_64+0xc9/0x480
[  667.170195][ T9805]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  667.170730][ T9805] page last free pid 1257 tgid 1257 stack trace:
[  667.171304][ T9805]  __free_frozen_pages+0x80c/0x1250
[  667.171770][ T9805]  vfree.part.0+0x12b/0xab0
[  667.172182][ T9805]  delayed_vfree_work+0x93/0xd0
[  667.172612][ T9805]  process_one_work+0x9b5/0x1b80
[  667.173067][ T9805]  worker_thread+0x630/0xe60
[  667.173486][ T9805]  kthread+0x3a8/0x770
[  667.173857][ T9805]  ret_from_fork+0x517/0x6e0
[  667.174278][ T9805]  ret_from_fork_asm+0x1a/0x30
[  667.174703][ T9805]
[  667.174917][ T9805] Memory state around the buggy address:
[  667.175411][ T9805]  ffff88802592f300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  667.176114][ T9805]  ffff88802592f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  667.176830][ T9805] >ffff88802592f400: 00 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  667.177547][ T9805]                       ^
[  667.177933][ T9805]  ffff88802592f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  667.178640][ T9805]  ffff88802592f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  667.179350][ T9805] ==================================================================

The hfsplus_uni2asc() method operates on struct hfsplus_unistr:

struct hfsplus_unistr {
	__be16 length;
	hfsplus_unichr unicode[HFSPLUS_MAX_STRLEN];
} __packed;

where HFSPLUS_MAX_STRLEN is 255 (the number of hfsplus_unichr
elements). The issue happens if the length field of the structure
instance has a value bigger than 255 (for example, 65283). In that
case, the pointer into the unicode buffer goes beyond the allocated
memory.

The patch fixes the issue by checking the length value of the
hfsplus_unistr instance and using 255 if the length
value is bigger than HFSPLUS_MAX_STRLEN. A potential cause of such a
situation is corruption of a Catalog File b-tree node.
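
As a rough userspace illustration of the clamp described above (not the
kernel code itself): struct sketch_unistr, SKETCH_MAX_STRLEN and
sketch_unistr_len() are hypothetical names, and the big-endian length
field is modelled as two raw bytes so that the sketch needs no
byte-order helpers.

```c
#include <stdint.h>

#define SKETCH_MAX_STRLEN 255	/* mirrors the 255 limit quoted above */

/* Hypothetical userspace model of the on-disk string layout. */
struct sketch_unistr {
	uint8_t length_be[2];			/* big-endian length field */
	uint16_t unicode[SKETCH_MAX_STRLEN];	/* character storage */
};

/* Decodes the length and never returns more than the array can hold. */
static uint16_t sketch_unistr_len(const struct sketch_unistr *s)
{
	uint16_t len = (uint16_t)((s->length_be[0] << 8) | s->length_be[1]);

	if (len > SKETCH_MAX_STRLEN)
		len = SKETCH_MAX_STRLEN;	/* corrupted field: fall back to the maximum */
	return len;
}
```

With the length bytes 0xFF 0x03 (the 65283 example from the text), the
helper returns 255, so a loop over unicode[] stays inside the array.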

Reported-by: Wenzhi Wang <wenzhi.wang@uwaterloo.ca>
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
cc: Yangtao Li <frank.li@vivo.com>
cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Yangtao Li <frank.li@vivo.com>
Link: https://lore.kernel.org/r/20250710230830.110500-1-slava@dubeyko.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-07-25 15:27:21 -07:00
Tetsuo Handa
c7c6363ca1 hfsplus: don't use BUG_ON() in hfsplus_create_attributes_file()
When the volume header contains erroneous values that do not reflect
the actual state of the filesystem, hfsplus_fill_super() assumes that
the attributes file is not yet created, which later results in hitting
BUG_ON() when hfsplus_create_attributes_file() is called. Replace this
BUG_ON() with an -EIO error and a message suggesting that fsck be run.
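
The general pattern (report inconsistent on-disk state as an error
instead of halting) might look like the userspace sketch below. The
function name, its parameter and the message text are all hypothetical
and only illustrate the idea; they are not taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical model: the caller believed no attributes file existed,
 * but one is found. Instead of asserting (the BUG_ON() analogue), log
 * a hint and hand an I/O error back to the caller.
 */
static int sketch_create_attributes_file(bool attr_file_already_present)
{
	if (attr_file_already_present) {
		fprintf(stderr,
			"inconsistent attributes file state, consider running fsck\n");
		return -EIO;
	}
	/* ... the real creation work would go here ... */
	return 0;
}
```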

Reported-by: syzbot <syzbot+1107451c16b9eb9d29e6@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=1107451c16b9eb9d29e6
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Link: https://lore.kernel.org/r/7b587d24-c8a1-4413-9b9a-00a33fbd849f@I-love.SAKURA.ne.jp
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-07-25 15:22:00 -07:00
Johannes Thumshirn
4c6a567cb8 hfsplus: don't set REQ_SYNC for hfsplus_submit_bio()
hfsplus_submit_bio() called by hfsplus_sync_fs() uses bdev_rw_virt() which
in turn uses submit_bio_wait() to submit the BIO.

But submit_bio_wait() already sets the REQ_SYNC flag on the BIO so there
is no need for setting the flag in hfsplus_sync_fs() when calling
hfsplus_submit_bio().

Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Link: https://lore.kernel.org/r/20250710063553.4805-1-johannes.thumshirn@wdc.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-07-25 15:20:06 -07:00
Taotao Chen
e9d8e2bf23
fs: change write_begin/write_end interface to take struct kiocb *
Change the address_space_operations callbacks write_begin() and
write_end() to take struct kiocb * as the first argument instead of
struct file *.

Update all affected function prototypes, implementations, call sites,
and related documentation across VFS, filesystems, and block layer.

Part of a series refactoring address_space_operations write_begin and
write_end callbacks to use struct kiocb for passing write context and
flags.

Signed-off-by: Taotao Chen <chentaotao@didiglobal.com>
Link: https://lore.kernel.org/20250716093559.217344-4-chentaotao@didiglobal.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-16 14:48:18 +02:00
Yangtao Li
fcb96956c9 hfsplus: remove mutex_lock check in hfsplus_free_extents
Syzbot reported an issue in hfsplus filesystem:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 4400 at fs/hfsplus/extents.c:346
	hfsplus_free_extents+0x700/0xad0
Call Trace:
<TASK>
hfsplus_file_truncate+0x768/0xbb0 fs/hfsplus/extents.c:606
hfsplus_write_begin+0xc2/0xd0 fs/hfsplus/inode.c:56
cont_expand_zero fs/buffer.c:2383 [inline]
cont_write_begin+0x2cf/0x860 fs/buffer.c:2446
hfsplus_write_begin+0x86/0xd0 fs/hfsplus/inode.c:52
generic_cont_expand_simple+0x151/0x250 fs/buffer.c:2347
hfsplus_setattr+0x168/0x280 fs/hfsplus/inode.c:263
notify_change+0xe38/0x10f0 fs/attr.c:420
do_truncate+0x1fb/0x2e0 fs/open.c:65
do_sys_ftruncate+0x2eb/0x380 fs/open.c:193
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

To avoid deadlock, commit 31651c6071 ("hfsplus: avoid deadlock
on file truncation") unlocks the extents tree before calling
hfsplus_free_extents(), and adds a check of whether the extents tree is
locked in hfsplus_free_extents().

However, when operations such as hfsplus_file_release,
hfsplus_setattr, hfsplus_unlink, and hfsplus_get_block are executed
concurrently on different files, it is very likely to trigger the
WARN_ON, which will lead syzbot and xfstests to consider it as an
abnormality.

The comment above this warning also describes one of the situations
that easily triggers it and causes xfstests and syzbot to report errors.

[task A]			[task B]
->hfsplus_file_release
  ->hfsplus_file_truncate
    ->hfs_find_init
      ->mutex_lock
    ->mutex_unlock
				->hfsplus_write_begin
				  ->hfsplus_get_block
				    ->hfsplus_file_extend
				      ->hfsplus_ext_read_extent
				        ->hfs_find_init
					  ->mutex_lock
    ->hfsplus_free_extents
      WARN_ON(mutex_is_locked) !!!

Several threads could try to lock the shared extents tree,
and the warning can be triggered in one thread when another thread
has locked the tree. This is the wrong behavior of the code and
we need to remove the warning.
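
The interleaving in the diagram can be replayed deterministically with a
toy lock model. This is a single-threaded userspace sketch, assuming a
hypothetical struct sketch_mutex that merely records an owner id; it is
not the kernel mutex API. It shows why an "is it locked?" test says
nothing about who holds the lock.

```c
#include <stdbool.h>

/* Hypothetical toy lock: owner == 0 means unlocked, otherwise a task id. */
struct sketch_mutex {
	int owner;
};

static bool sketch_is_locked(const struct sketch_mutex *m)
{
	return m->owner != 0;
}

static bool sketch_held_by(const struct sketch_mutex *m, int task)
{
	return m->owner == task;
}

/*
 * Replays the steps from the diagram in order. Returns true if a check of
 * the form WARN_ON(is_locked) would fire in task A even though task A does
 * not hold the lock.
 */
static bool sketch_replay_interleaving(void)
{
	struct sketch_mutex tree = { 0 };
	const int task_a = 1, task_b = 2;

	tree.owner = task_a;	/* A: hfs_find_init -> mutex_lock */
	tree.owner = 0;		/* A: mutex_unlock before freeing extents */
	tree.owner = task_b;	/* B: hfs_find_init -> mutex_lock */

	/* A: reaches the free-extents step and only asks "is it locked?" */
	return sketch_is_locked(&tree) && !sketch_held_by(&tree, task_a);
}
```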

Fixes: 31651c6071 ("hfsplus: avoid deadlock on file truncation")
Reported-by: syzbot+8c0bc9f818702ff75b76@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/00000000000057fa4605ef101c4c@google.com/
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Link: https://lore.kernel.org/r/20250529061807.2213498-1-frank.li@vivo.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-07-06 17:54:34 -07:00
Yangtao Li
2eafb669da hfsplus: make splice write available again
Since 5.10, splice() and sendfile() have returned EINVAL. This was
caused by commit 36e2c7421f ("fs: don't allow splice read/write
without explicit ops").

This patch initializes the splice_write field in file_operations, like
most file systems do, to restore the functionality.

Fixes: 36e2c7421f ("fs: don't allow splice read/write without explicit ops")
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Link: https://lore.kernel.org/r/20250529140033.2296791-1-frank.li@vivo.com
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-07-06 17:53:37 -07:00
Christian Brauner
ca115d7e75
tree-wide: s/struct fileattr/struct file_kattr/g
Now that we expose struct file_attr as our uapi struct, rename the
internal struct to struct file_kattr to clearly communicate that it is a
kernel-internal struct. This is similar to struct mount_{k}attr and
others.

Link: https://lore.kernel.org/20250703-restlaufzeit-baurecht-9ed44552b481@brauner
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-04 16:14:39 +02:00
Lorenzo Stoakes
951ea2f484
fs: convert simple use of generic_file_*_mmap() to .mmap_prepare()
Since commit c84bf6dd2b ("mm: introduce new .mmap_prepare() file
callback"), the f_op->mmap() hook has been deprecated in favour of
f_op->mmap_prepare().

We have provided generic .mmap_prepare() equivalents, so update all file
systems that specify these directly in their file_operations structures.

This updates 9p, adfs, affs, bfs, fat, hfs, hfsplus, hostfs, hpfs, jffs2,
jfs, minix, omfs, ramfs and ufs file systems directly.

It updates generic_ro_fops which impacts qnx4, cramfs, befs, squashfs,
freevxfs, qnx6, efs, romfs, erofs and isofs file systems.

There are remaining file systems which use generic hooks in a less direct
way which we address in a subsequent commit.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Link: https://lore.kernel.org/c7dc90e44a9e75e750939ea369290d6e441a18e6.1750099179.git.lorenzo.stoakes@oracle.com
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-06-17 13:47:45 +02:00
Al Viro
05fb0e6664 new helper: set_default_d_op()
... to be used instead of manually assigning to ->s_d_op.
All in-tree filesystems converted (and the field itself is renamed,
so any out-of-tree ones in need of conversion will be caught
by the compiler).

Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-10 22:21:16 -04:00
Christoph Hellwig
15c9d5f623 hfsplus: use bdev_rw_virt in hfsplus_submit_bio
Replace the code building a bio from a kernel direct map address and
submitting it synchronously with the bdev_rw_virt helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20250507120451.4000627-20-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-07 07:31:08 -06:00
Vasiliy Kovalev
bb5e07cb92
hfs/hfsplus: fix slab-out-of-bounds in hfs_bnode_read_key
Syzbot reported an issue in hfs subsystem:

BUG: KASAN: slab-out-of-bounds in memcpy_from_page include/linux/highmem.h:423 [inline]
BUG: KASAN: slab-out-of-bounds in hfs_bnode_read fs/hfs/bnode.c:35 [inline]
BUG: KASAN: slab-out-of-bounds in hfs_bnode_read_key+0x314/0x450 fs/hfs/bnode.c:70
Write of size 94 at addr ffff8880123cd100 by task syz-executor237/5102

Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:377 [inline]
 print_report+0x169/0x550 mm/kasan/report.c:488
 kasan_report+0x143/0x180 mm/kasan/report.c:601
 kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
 __asan_memcpy+0x40/0x70 mm/kasan/shadow.c:106
 memcpy_from_page include/linux/highmem.h:423 [inline]
 hfs_bnode_read fs/hfs/bnode.c:35 [inline]
 hfs_bnode_read_key+0x314/0x450 fs/hfs/bnode.c:70
 hfs_brec_insert+0x7f3/0xbd0 fs/hfs/brec.c:159
 hfs_cat_create+0x41d/0xa50 fs/hfs/catalog.c:118
 hfs_mkdir+0x6c/0xe0 fs/hfs/dir.c:232
 vfs_mkdir+0x2f9/0x4f0 fs/namei.c:4257
 do_mkdirat+0x264/0x3a0 fs/namei.c:4280
 __do_sys_mkdir fs/namei.c:4300 [inline]
 __se_sys_mkdir fs/namei.c:4298 [inline]
 __x64_sys_mkdir+0x6c/0x80 fs/namei.c:4298
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fbdd6057a99

Add a check for key length in hfs_bnode_read_key to prevent
out-of-bounds memory access. If the key length is invalid, the
key buffer is cleared, improving stability and reliability.
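
The check described above can be sketched in userspace C as follows. This
is only an illustration of the idea, not the kernel patch: the names
sketch_read_key and SKETCH_MAX_KEY_LEN, the flat byte-array node, and the
one-byte length prefix are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical size of the caller's key buffer (not the kernel's value). */
#define SKETCH_MAX_KEY_LEN 38

/*
 * Copy a length-prefixed key that starts at node[off] into key.
 * The byte at node[off] holds the payload length, so the full key
 * occupies that many bytes plus one. If the length does not fit the
 * key buffer or runs past the end of the node, the key buffer is
 * zeroed and -1 is returned instead of copying.
 */
int sketch_read_key(const uint8_t *node, size_t node_len, size_t off,
                    uint8_t *key)
{
    size_t key_len;

    assert(node != NULL && key != NULL);

    if (off >= node_len) {
        memset(key, 0, SKETCH_MAX_KEY_LEN);
        return -1;
    }

    key_len = (size_t)node[off] + 1; /* length byte plus payload */

    if (key_len > SKETCH_MAX_KEY_LEN || key_len > node_len - off) {
        /* Invalid length: hand back a cleared key, never stale data. */
        memset(key, 0, SKETCH_MAX_KEY_LEN);
        return -1;
    }

    memcpy(key, node + off, key_len);
    return 0;
}
```

Clearing the buffer on failure means a caller that ignores the return
value sees an all-zero key rather than whatever the buffer held before.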

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot+5f3a973ed3dfb85a6683@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=5f3a973ed3dfb85a6683
Cc: stable@vger.kernel.org
Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org>
Link: https://lore.kernel.org/20241019191303.24048-1-kovalev@altlinux.org
Reviewed-by: Cengiz Can <cengiz.can@canonical.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-04-07 19:21:51 +02:00
NeilBrown
88d5baf690 Change inode_operations.mkdir to return struct dentry *
Some filesystems, such as NFS, cifs, ceph, and fuse, do not have
complete control of sequencing on the actual filesystem (e.g.  on a
different server) and may find that the inode created for a mkdir
request already exists in the icache and dcache by the time the mkdir
request returns.  For example, if the filesystem is mounted twice the
directory could be visible on the other mount before it is on the
original mount, and a pair of name_to_handle_at(), open_by_handle_at()
calls could instantiate the directory inode with an IS_ROOT() dentry
before the first mkdir returns.

This means that the dentry passed to ->mkdir() may not be the one that
is associated with the inode after the ->mkdir() completes.  Some
callers need to interact with the inode after the ->mkdir completes and
they currently need to perform a lookup in the (rare) case that the
dentry is no longer hashed.

This lookup-after-mkdir requires that the directory remains locked to
avoid races.  Planned future patches to lock the dentry rather than the
directory will mean that this lookup cannot be performed atomically with
the mkdir.

To remove this barrier, this patch changes ->mkdir to return the
resulting dentry if it is different from the one passed in.
Possible returns are:
  NULL - the directory was created and no other dentry was used
  ERR_PTR() - an error occurred
  non-NULL - this other dentry was spliced in

This patch only changes file-systems to return "ERR_PTR(err)" instead of
"err" or equivalent transformations.  Subsequent patches will make
further changes to some file-systems to return a correct dentry.
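
The three-way return convention above can be modelled in userspace C
roughly as follows. This is a sketch under stated assumptions: the
sketch_* helpers only imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR()
encoding, and struct sketch_dentry, sketch_do_mkdir and sketch_resolve
are hypothetical names invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed ceiling for errno values encoded into a pointer. */
#define SKETCH_MAX_ERRNO 4095

struct sketch_dentry {
    const char *name;
};

/* Encode a negative errno as a pointer; 0 becomes NULL on common ABIs. */
static inline void *sketch_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True if the pointer value lies in the top SKETCH_MAX_ERRNO addresses. */
static inline int sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static inline long sketch_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Old-style body: returns 0 or a negative errno. */
static int sketch_do_mkdir(struct sketch_dentry *d)
{
    return (d->name && d->name[0]) ? 0 : -EINVAL;
}

/* New-style wrapper: the mechanical "return ERR_PTR(err)" conversion. */
struct sketch_dentry *sketch_mkdir(struct sketch_dentry *d)
{
    return sketch_err_ptr(sketch_do_mkdir(d));
}

/*
 * Caller side: choose the dentry to keep using.
 *   error    -> *err is set and NULL is returned
 *   NULL     -> the dentry that was passed in is still the right one
 *   non-NULL -> another dentry was spliced in; use that one
 */
struct sketch_dentry *sketch_resolve(struct sketch_dentry *passed,
                                     struct sketch_dentry *ret, long *err)
{
    assert(err != NULL);
    *err = 0;
    if (sketch_is_err(ret)) {
        *err = sketch_ptr_err(ret);
        return NULL;
    }
    return ret ? ret : passed;
}
```

Because an error value of 0 encodes to NULL, the mechanical conversion
keeps the success path unchanged for filesystems that never splice in a
different dentry.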

Not all filesystems reliably result in a positive hashed dentry:

- NFS, cifs, hostfs will sometimes need to perform a lookup of
  the name to get inode information.  Races could result in this
  returning something different. Note that this lookup is
  non-atomic which is what we are trying to avoid.  Placing the
  lookup in filesystem code means it only happens when the filesystem
  has no other option.
- kernfs and tracefs leave the dentry negative and the ->revalidate
  operation ensures that lookup will be called to correctly populate
  the dentry.  This could be fixed but I don't think it is important
  to any of the users of vfs_mkdir() which look at the dentry.

The recommendation to use
    d_drop();d_splice_alias()
is ugly but fits with current practice.  A planned future patch will
change this.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: NeilBrown <neilb@suse.de>
Link: https://lore.kernel.org/r/20250227013949.536172-2-neilb@suse.de
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-27 20:00:17 +01:00
Linus Torvalds
70e7730c2a vfs-6.13.misc
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZzcToAAKCRCRxhvAZXjc
 osL9AP948FFumJRC28gDJ4xp+X4eohNOfkgoEG8FTbF2zU6ulwD+O0pr26FqpFli
 pqlG+38UdATImpfqqWjPbb72sBYcfQg=
 =wLUh
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.13.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "Features:

   - Fixup and improve NLM and kNFSD file lock callbacks

     Last year both GFS2 and OCFS2 had some work done to make their
     locking more robust when exported over NFS. Unfortunately, part of
     that work caused both NLM (for NFS v3 exports) and kNFSD (for
     NFSv4.1+ exports) to no longer send lock notifications to clients

     This in itself is not a huge problem because most NFS clients will
     still poll the server in order to acquire a conflicted lock

     It's important for NLM and kNFSD that they do not block their
     kernel threads inside filesystem's file_lock implementations
     because that can produce deadlocks. We used to make sure of this by
     only trusting that posix_lock_file() can correctly handle blocking
     lock calls asynchronously, so the lock managers would only set up
     their file_lock requests for async callbacks if the filesystem did
     not define its own lock() file operation

     However, when GFS2 and OCFS2 grew the capability to correctly
     handle blocking lock requests asynchronously, they started
     signalling this behavior with EXPORT_OP_ASYNC_LOCK, and the check
     for also trusting posix_lock_file() was inadvertently dropped, so
     now most filesystems no longer produce lock notifications when
     exported over NFS

     Fix this by using an fop_flag which greatly simplifies the problem
     and grooms the way for future uses by both filesystems and lock
     managers alike

   - Add a sysctl to delete the dentry when a file is removed instead of
     making it a negative dentry

     Commit 681ce86235 ("vfs: Delete the associated dentry when
     deleting a file") introduced an unconditional deletion of the
     associated dentry when a file is removed. However, this led to
     performance regressions in specific benchmarks, such as
     filebench.sum_operations/s, prompting a revert in commit
     4a4be1ad3a ("Revert "vfs: Delete the associated dentry when
     deleting a file""). This reintroduces the concept conditionally
     through a sysctl

   - Expand the statmount() system call:

       * Report the filesystem subtype in a new fs_subtype field to
         e.g., report fuse filesystem subtypes

       * Report the superblock source in a new sb_source field

       * Add a new way to return filesystem specific mount options in an
         option array that returns filesystem specific mount options
         separated by zero bytes and unescaped. This allows callers to
         retrieve filesystem specific mount options and immediately pass
         them to e.g., fsconfig() without having to unescape or split
         them

       * Report security (LSM) specific mount options in a separate
         security option array. We don't lump them together with
         filesystem specific mount options as security mount options are
         generic and most users aren't interested in them

         The format is the same as for the filesystem specific mount
         option array

   - Support relative paths in fsconfig()'s FSCONFIG_SET_STRING command

   - Optimize acl_permission_check() to avoid costly {g,u}id ownership
     checks if possible

   - Use smp_mb__after_spinlock() to avoid full smp_mb() in evict()

   - Add synchronous wakeup support for ep_poll_callback.

     Currently, epoll only uses wake_up() to wake up a task. But sometimes
     there are epoll users which want to use the synchronous wakeup flag
     to give a hint to the scheduler, e.g., the Android binder driver.
     So add a wake_up_sync() define, and use wake_up_sync() when sync is
     true in ep_poll_callback()

  Fixes:

   - Fix kernel documentation for inode_insert5() and iget5_locked()

   - Annotate racy epoll check on file->f_ep

   - Make F_DUPFD_QUERY associative

   - Avoid filename buffer overrun in initramfs

   - Don't let statmount() return empty strings

   - Add a cond_resched() to dump_user_range() to avoid hogging the CPU

   - Don't query the device logical blocksize multiple times for hfsplus

   - Make filemap_read() check that the offset is positive or zero

  Cleanups:

   - Various typo fixes

   - Cleanup wbc_attach_fdatawrite_inode()

   - Add __releases annotation to wbc_attach_and_unlock_inode()

   - Add hugetlbfs tracepoints

   - Fix various vfs kernel doc parameters

   - Remove obsolete TODO comment from io_cancel()

   - Convert wbc_account_cgroup_owner() to take a folio

   - Fix comments for BANDWIDTH_INTERVAL and wb_domain_writeout_add()

   - Reorder struct posix_acl to save 8 bytes

   - Annotate struct posix_acl with __counted_by()

   - Replace one-element array with flexible array member in freevxfs

   - Use idiomatic atomic64_inc_return() in alloc_mnt_ns()"

* tag 'vfs-6.13.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (35 commits)
  statmount: retrieve security mount options
  vfs: make evict() use smp_mb__after_spinlock instead of smp_mb
  statmount: add flag to retrieve unescaped options
  fs: add the ability for statmount() to report the sb_source
  writeback: wbc_attach_fdatawrite_inode out of line
  writeback: add a __releases annoation to wbc_attach_and_unlock_inode
  fs: add the ability for statmount() to report the fs_subtype
  fs: don't let statmount return empty strings
  fs:aio: Remove TODO comment suggesting hash or array usage in io_cancel()
  hfsplus: don't query the device logical block size multiple times
  freevxfs: Replace one-element array with flexible array member
  fs: optimize acl_permission_check()
  initramfs: avoid filename buffer overrun
  fs/writeback: convert wbc_account_cgroup_owner to take a folio
  acl: Annotate struct posix_acl with __counted_by()
  acl: Realign struct posix_acl to save 8 bytes
  epoll: Add synchronous wakeup support for ep_poll_callback
  coredump: add cond_resched() to dump_user_range
  mm/page-writeback.c: Fix comment of wb_domain_writeout_add()
  mm/page-writeback.c: Update comment for BANDWIDTH_INTERVAL
  ...
2024-11-18 09:35:30 -08:00
Thadeu Lima de Souza Cascardo
1c82587cb5 hfsplus: don't query the device logical block size multiple times
Device block sizes may change. One such case is a loop device whose block
size is changed with the LOOP_SET_BLOCK_SIZE ioctl.

While this may cause other issues like IO being rejected, in the case of
hfsplus, it will allocate a block by using that size and potentially write
out-of-bounds when hfsplus_read_wrapper calls hfsplus_submit_bio and the
latter function reads a different io_size.

Using a new min_io_size initially set to sb_min_blocksize works for the
purposes of the original fix, since it will be set to the max between
HFSPLUS_SECTOR_SIZE and the first seen logical block size. We still use the
max between HFSPLUS_SECTOR_SIZE and min_io_size in case the latter is not
initialized.
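
The "query once, reuse the cached value" idea can be sketched in userspace
C as below. This is a simplified model, not the hfsplus code: struct
sketch_bdev, struct sketch_sb_info and the sketch_* functions are
hypothetical, and SKETCH_SECTOR_SIZE stands in for HFSPLUS_SECTOR_SIZE
with an assumed value of 512.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for HFSPLUS_SECTOR_SIZE; 512 is an assumption of this sketch. */
#define SKETCH_SECTOR_SIZE 512u

/* Hypothetical device whose logical block size can change at runtime. */
struct sketch_bdev {
    unsigned int logical_block_size;
};

/* Hypothetical per-superblock info caching the first size that was seen. */
struct sketch_sb_info {
    unsigned int min_io_size; /* 0 means "not initialized yet" */
};

static unsigned int sketch_max(unsigned int a, unsigned int b)
{
    return a > b ? a : b;
}

/* Mount time: read the device block size exactly once and cache it. */
void sketch_mount(struct sketch_sb_info *sbi, const struct sketch_bdev *bdev)
{
    assert(sbi != NULL && bdev != NULL);
    sbi->min_io_size = sketch_max(SKETCH_SECTOR_SIZE,
                                  bdev->logical_block_size);
}

/*
 * I/O time: use only the cached value and never re-read the device, so
 * the buffer allocation and the I/O submission agree on one size. The
 * max() also covers the case where min_io_size was never initialized.
 */
unsigned int sketch_io_size(const struct sketch_sb_info *sbi)
{
    assert(sbi != NULL);
    return sketch_max(SKETCH_SECTOR_SIZE, sbi->min_io_size);
}
```

In this model, a later change to the device's block size has no effect on
sketch_io_size(), which is the property the fix relies on.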

Tested by mounting an hfsplus filesystem with loop block sizes 512, 1024
and 4096.

The produced KASAN report before the fix looks like this:

[  419.944641] ==================================================================
[  419.945655] BUG: KASAN: slab-use-after-free in hfsplus_read_wrapper+0x659/0xa0a
[  419.946703] Read of size 2 at addr ffff88800721fc00 by task repro/10678
[  419.947612]
[  419.947846] CPU: 0 UID: 0 PID: 10678 Comm: repro Not tainted 6.12.0-rc5-00008-gdf56e0f2f3ca #84
[  419.949007] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[  419.950035] Call Trace:
[  419.950384]  <TASK>
[  419.950676]  dump_stack_lvl+0x57/0x78
[  419.951212]  ? hfsplus_read_wrapper+0x659/0xa0a
[  419.951830]  print_report+0x14c/0x49e
[  419.952361]  ? __virt_addr_valid+0x267/0x278
[  419.952979]  ? kmem_cache_debug_flags+0xc/0x1d
[  419.953561]  ? hfsplus_read_wrapper+0x659/0xa0a
[  419.954231]  kasan_report+0x89/0xb0
[  419.954748]  ? hfsplus_read_wrapper+0x659/0xa0a
[  419.955367]  hfsplus_read_wrapper+0x659/0xa0a
[  419.955948]  ? __pfx_hfsplus_read_wrapper+0x10/0x10
[  419.956618]  ? do_raw_spin_unlock+0x59/0x1a9
[  419.957214]  ? _raw_spin_unlock+0x1a/0x2e
[  419.957772]  hfsplus_fill_super+0x348/0x1590
[  419.958355]  ? hlock_class+0x4c/0x109
[  419.958867]  ? __pfx_hfsplus_fill_super+0x10/0x10
[  419.959499]  ? __pfx_string+0x10/0x10
[  419.960006]  ? lock_acquire+0x3e2/0x454
[  419.960532]  ? bdev_name.constprop.0+0xce/0x243
[  419.961129]  ? __pfx_bdev_name.constprop.0+0x10/0x10
[  419.961799]  ? pointer+0x3f0/0x62f
[  419.962277]  ? __pfx_pointer+0x10/0x10
[  419.962761]  ? vsnprintf+0x6c4/0xfba
[  419.963178]  ? __pfx_vsnprintf+0x10/0x10
[  419.963621]  ? setup_bdev_super+0x376/0x3b3
[  419.964029]  ? snprintf+0x9d/0xd2
[  419.964344]  ? __pfx_snprintf+0x10/0x10
[  419.964675]  ? lock_acquired+0x45c/0x5e9
[  419.965016]  ? set_blocksize+0x139/0x1c1
[  419.965381]  ? sb_set_blocksize+0x6d/0xae
[  419.965742]  ? __pfx_hfsplus_fill_super+0x10/0x10
[  419.966179]  mount_bdev+0x12f/0x1bf
[  419.966512]  ? __pfx_mount_bdev+0x10/0x10
[  419.966886]  ? vfs_parse_fs_string+0xce/0x111
[  419.967293]  ? __pfx_vfs_parse_fs_string+0x10/0x10
[  419.967702]  ? __pfx_hfsplus_mount+0x10/0x10
[  419.968073]  legacy_get_tree+0x104/0x178
[  419.968414]  vfs_get_tree+0x86/0x296
[  419.968751]  path_mount+0xba3/0xd0b
[  419.969157]  ? __pfx_path_mount+0x10/0x10
[  419.969594]  ? kmem_cache_free+0x1e2/0x260
[  419.970311]  do_mount+0x99/0xe0
[  419.970630]  ? __pfx_do_mount+0x10/0x10
[  419.971008]  __do_sys_mount+0x199/0x1c9
[  419.971397]  do_syscall_64+0xd0/0x135
[  419.971761]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  419.972233] RIP: 0033:0x7c3cb812972e
[  419.972564] Code: 48 8b 0d f5 46 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c2 46 0d 00 f7 d8 64 89 01 48
[  419.974371] RSP: 002b:00007ffe30632548 EFLAGS: 00000286 ORIG_RAX: 00000000000000a5
[  419.975048] RAX: ffffffffffffffda RBX: 00007ffe306328d8 RCX: 00007c3cb812972e
[  419.975701] RDX: 0000000020000000 RSI: 0000000020000c80 RDI: 00007ffe306325d0
[  419.976363] RBP: 00007ffe30632720 R08: 00007ffe30632610 R09: 0000000000000000
[  419.977034] R10: 0000000000200008 R11: 0000000000000286 R12: 0000000000000000
[  419.977713] R13: 00007ffe306328e8 R14: 00005a0eb298bc68 R15: 00007c3cb8356000
[  419.978375]  </TASK>
[  419.978589]

Fixes: 6596528e39 ("hfsplus: ensure bio requests are not smaller than the hardware sectors")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Link: https://lore.kernel.org/r/20241107114109.839253-1-cascardo@igalia.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-12 14:36:45 +01:00