linux

mirror of https://github.com/torvalds/linux.git synced 2026-03-08 03:24:45 +01:00

Author	SHA1	Message	Date
Linus Torvalds	c44db6c820	Merge tag 'for-7.0-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "One-liner or short fixes for minor/moderate problems reported recently: - fixes or level adjustments of error messages - fix leaked transaction handles after aborted transactions, when using the remap tree feature - fix a few leaked chunk maps after errors - fix leaked page array in io_uring encoded read if an error occurs and the 'finished' is not called - fix double release of reserved extents when doing a range COW - don't commit super block when the filesystem is in shutdown state - fix squota accounting condition when checking members vs parent usage - other error handling fixes" * tag 'for-7.0-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: check block group lookup in remove_range_from_remap_tree() btrfs: fix transaction handle leaks in btrfs_last_identity_remap_gone() btrfs: fix chunk map leak in btrfs_map_block() after btrfs_translate_remap() btrfs: fix chunk map leak in btrfs_map_block() after btrfs_chunk_map_num_copies() btrfs: fix compat mask in error messages in btrfs_check_features() btrfs: print correct subvol num if active swapfile prevents deletion btrfs: fix warning in scrub_verify_one_metadata() btrfs: fix objectid value in error message in check_extent_data_ref() btrfs: fix incorrect key offset in error message in check_dev_extent_item() btrfs: fix error message order of parameters in btrfs_delete_delayed_dir_index() btrfs: don't commit the super block when unmounting a shutdown filesystem btrfs: free pages on error in btrfs_uring_read_extent() btrfs: fix referenced/exclusive check in squota_check_parent_usage() btrfs: remove pointless WARN_ON() in cache_save_setup() btrfs: convert log messages to error level in btrfs_replay_log() btrfs: remove btrfs_handle_fs_error() after failure to recover log trees btrfs: remove redundant warning message in btrfs_check_uuid_tree() btrfs: change warning messages to error level in open_ctree() btrfs: fix a double release on reserved extents in cow_one_range() btrfs: handle discard errors in in btrfs_finish_extent_commit()	2026-03-03 09:08:00 -08:00
Mark Harmstone	f8db8009ea	btrfs: check block group lookup in remove_range_from_remap_tree() Add a check in remove_range_from_remap_tree() after we call btrfs_lookup_block_group(), to check if it is NULL. This shouldn't happen, but if it does we at least get an error rather than a segfault. Reported-by: Chris Mason <clm@fb.com> Link: https://lore.kernel.org/linux-btrfs/20260125125129.2245240-1-clm@meta.com/ Fixes: `979e1dc3d6` ("btrfs: handle deletions from remapped block group") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:29 +01:00
Mark Harmstone	7885ca40c3	btrfs: fix transaction handle leaks in btrfs_last_identity_remap_gone() btrfs_abort_transaction(), unlike btrfs_commit_transaction(), doesn't also free the transaction handle. Fix the instances in btrfs_last_identity_remap_gone() where we're also leaking the transaction on abort. Reported-by: Chris Mason <clm@fb.com> Link: https://lore.kernel.org/linux-btrfs/20260125125129.2245240-1-clm@meta.com/ Fixes: `979e1dc3d6` ("btrfs: handle deletions from remapped block group") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:29 +01:00
Mark Harmstone	54b9395b18	btrfs: fix chunk map leak in btrfs_map_block() after btrfs_translate_remap() If the call to btrfs_translate_remap() in btrfs_map_block() returns an error code, we were leaking the chunk map. Fix it by jumping to out rather than returning directly. Reported-by: Chris Mason <clm@fb.com> Link: https://lore.kernel.org/linux-btrfs/20260125125830.2352988-1-clm@meta.com/ Fixes: `18ba649928` ("btrfs: redirect I/O for remapped block groups") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:29 +01:00
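The leak pattern fixed in 54b9395b18 (and in f15fb3d415) can be sketched in userspace C. This is only an illustration of "jump to out rather than return directly": `struct chunk_map`, `lookup_map()`, `put_map()` and `translate()` are hypothetical stand-ins, not the btrfs functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted chunk map. */
struct chunk_map {
	int refs;
};

static struct chunk_map *lookup_map(struct chunk_map *map)
{
	map->refs++;		/* the lookup takes a reference */
	return map;
}

static void put_map(struct chunk_map *map)
{
	map->refs--;		/* drop the reference taken by the lookup */
}

/* Hypothetical translation step that fails on request. */
static int translate(int fail)
{
	return fail ? -EIO : 0;
}

/* Buggy shape: the early return skips put_map(). */
static int map_block_leaky(struct chunk_map *shared, int fail)
{
	struct chunk_map *map = lookup_map(shared);
	int ret = translate(fail);

	if (ret < 0)
		return ret;	/* the reference is leaked here */

	put_map(map);
	return 0;
}

/* Fixed shape: every exit after the lookup goes through 'out'. */
static int map_block_fixed(struct chunk_map *shared, int fail)
{
	struct chunk_map *map = lookup_map(shared);
	int ret = translate(fail);

	if (ret < 0)
		goto out;

	/* ... the real function would build the I/O mapping here ... */
out:
	put_map(map);
	return ret;
}
```

After a failing call, the leaky variant leaves `refs` at 1 while the fixed variant returns it to 0, which is the difference the commit describes.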
Mark Harmstone	f15fb3d415	btrfs: fix chunk map leak in btrfs_map_block() after btrfs_chunk_map_num_copies() Fix a chunk map leak in btrfs_map_block(): if we return early with -EINVAL, we're not freeing the chunk map that we've just looked up. Fixes: `0ae653fbec` ("btrfs: reduce chunk_map lookups in btrfs_map_block()") CC: stable@vger.kernel.org # 6.12+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:29 +01:00
Mark Harmstone	587bb33b10	btrfs: fix compat mask in error messages in btrfs_check_features() Commit `d7f67ac9a9` ("btrfs: relax block-group-tree feature dependency checks") introduced a regression when it comes to handling unsupported incompat or compat_ro flags. Beforehand we only printed the flags that we didn't recognize, afterwards we printed them all, which is less useful. Fix the error handling so it behaves like it used to. Fixes: `d7f67ac9a9` ("btrfs: relax block-group-tree feature dependency checks") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:28 +01:00
Mark Harmstone	1c7e9111f4	btrfs: print correct subvol num if active swapfile prevents deletion Fix the error message in btrfs_delete_subvolume() if we can't delete a subvolume because it has an active swapfile: we were printing the number of the parent rather than the target. Fixes: `60021bd754` ("btrfs: prevent subvol with swapfile from being deleted") Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:28 +01:00
Mark Harmstone	44e2fda664	btrfs: fix warning in scrub_verify_one_metadata() Commit `b471965fdb` ("btrfs: fix replace/scrub failure with metadata_uuid") fixed the comparison in scrub_verify_one_metadata() to use metadata_uuid rather than fsid, but left the warning as it was. Fix it so it matches what we're doing. Fixes: `b471965fdb` ("btrfs: fix replace/scrub failure with metadata_uuid") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:28 +01:00
Mark Harmstone	a101727805	btrfs: fix objectid value in error message in check_extent_data_ref() Fix a copy-paste error in check_extent_data_ref(): we're printing root as in the message above, we should be printing objectid. Fixes: `f333a3c7e8` ("btrfs: tree-checker: validate dref root and objectid") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:28 +01:00
Mark Harmstone	511dc8912a	btrfs: fix incorrect key offset in error message in check_dev_extent_item() Fix the error message in check_dev_extent_item(), when an overlapping stripe is encountered. For dev extents, objectid is the disk number and offset the physical address, so prev_key->objectid should actually be prev_key->offset. (I can't take any credit for this one - this was discovered by Chris and his friend Claude.) Reported-by: Chris Mason <clm@fb.com> Fixes: `008e2512dc` ("btrfs: tree-checker: add dev extent item checks") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:28 +01:00
Mark Harmstone	3cf0f35779	btrfs: fix error message order of parameters in btrfs_delete_delayed_dir_index() Fix the error message in btrfs_delete_delayed_dir_index() if __btrfs_add_delayed_item() fails: the message says root, inode, index, error, but we're actually passing index, root, inode, error. Fixes: `adc1ef55dc` ("btrfs: add details to error messages at btrfs_delete_delayed_dir_index()") Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:27 +01:00
Miquel Sabaté Solà	a752653312	btrfs: don't commit the super block when unmounting a shutdown filesystem When unmounting a filesystem we will try, among many other things, to commit the super block. On a filesystem that was shut down, though, this will always fail with -EROFS as writes are forbidden in this context, and an error will be reported. Don't commit the super block in this situation, which should be fine as the filesystem is frozen before shutdown and should therefore be in a consistent state. Signed-off-by: Miquel Sabaté Solà <mssola@mssola.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:27 +01:00
Miquel Sabaté Solà	3f501412f2	btrfs: free pages on error in btrfs_uring_read_extent() In this function the 'pages' object is never freed in the hopes that it is picked up by btrfs_uring_read_finished() whenever that executes in the future. But that's just the happy path. Along the way previous allocations might have gone wrong, or we might not get -EIOCBQUEUED from btrfs_encoded_read_regular_fill_pages(). In all these cases, we go to a cleanup section that frees all memory allocated by this function without assuming any deferred execution, and this also needs to happen for the 'pages' allocation. Fixes: `34310c442e` ("btrfs: add io_uring command for encoded reads (ENCODED_READ ioctl)") Signed-off-by: Miquel Sabaté Solà <mssola@mssola.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:27 +01:00
Boris Burkov	2ab2244642	btrfs: fix referenced/exclusive check in squota_check_parent_usage() We compared rfer_cmpr against excl_cmpr_sum instead of rfer_cmpr_sum which is confusing. I expect that rfer_cmpr == excl_cmpr in squota, but it is much better to be consistent in case of any surprises or bugs. Reported-by: Chris Mason <clm@meta.com> Link: https://lore.kernel.org/linux-btrfs/cover.1764796022.git.boris@bur.io/T/#mccb231643ffd290b44a010d4419474d280be5537 Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:27 +01:00
Filipe Manana	8ac7fad32b	btrfs: remove pointless WARN_ON() in cache_save_setup() This WARN_ON(ret) is never executed since the previous if statement makes us jump into the 'out_put' label when ret is not zero. The existing transaction abort inside the if statement also gives us a stack trace, so we don't need to move the WARN_ON(ret) into the if statement either. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:27 +01:00
Filipe Manana	912db40655	btrfs: convert log messages to error level in btrfs_replay_log() We are logging messages as warnings but they should really have an error level instead, as if the respective conditions are met the mount will fail. So convert them to error level and also log the error code returned by read_tree_block(). Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:27 +01:00
Filipe Manana	4db8d56c6f	btrfs: remove btrfs_handle_fs_error() after failure to recover log trees There is no need to call btrfs_handle_fs_error() (which we are trying to deprecate) if we fail to recover log trees: 1) Such a failure results in failing the mount immediately; 2) If the recovery started a transaction before failing, it has already aborted the transaction down in the call chain. So remove the btrfs_handle_fs_error() call, replace it with an error message and assert that the FS is in error state (so that no partial updates are committed due to a transaction that was not aborted). Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:26 +01:00
Filipe Manana	64def7d7d6	btrfs: remove redundant warning message in btrfs_check_uuid_tree() If we fail to start the UUID rescan kthread, btrfs_check_uuid_tree() logs an error message and returns the error to the single caller, open_ctree(). This however is redundant since the caller already logs an error message, which is also more informative since it logs the error code. So remove the warning message from btrfs_check_uuid_tree() as it doesn't add any value. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:26 +01:00
Filipe Manana	0649355303	btrfs: change warning messages to error level in open_ctree() Failure to read the fs root results in a mount error, but we log a warning message. Same goes for checking the UUID tree, an error results in a mount failure but we log a warning message. Change the level of the logged messages from warning to error. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:26 +01:00
Qu Wenruo	a4fe134fc1	btrfs: fix a double release on reserved extents in cow_one_range() [BUG] Commit `c28214bde6` ("btrfs: refactor the main loop of cow_file_range()") refactored the handling of COWing one range. However it changed the error handling of the reserved extent. The old cleanup looks like this: out_drop_extent_cache: btrfs_drop_extent_map_range(inode, start, start + cur_alloc_size - 1, false); out_reserve: btrfs_dec_block_group_reservations(fs_info, ins.objectid); btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, true); [...] clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW | EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV; page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK; /* * For the range (2). If we reserved an extent for our delalloc range * (or a subrange) and failed to create the respective ordered extent, * then it means that when we reserved the extent we decremented the * extent's size from the data space_info's bytes_may_use counter and * incremented the space_info's bytes_reserved counter by the same * amount. We must make sure extent_clear_unlock_delalloc() does not try * to decrement again the data space_info's bytes_may_use counter, * therefore we do not pass it the flag EXTENT_CLEAR_DATA_RESV. */ if (cur_alloc_size) { extent_clear_unlock_delalloc(inode, start, start + cur_alloc_size - 1, locked_folio, &cached, clear_bits, page_ops); btrfs_qgroup_free_data(inode, NULL, start, cur_alloc_size, NULL); } This only passes EXTENT_CLEAR_META_RESV, as the reserved extent is properly handled by btrfs_free_reserved_extent(). However the new cleanup is: extent_clear_unlock_delalloc(inode, file_offset, cur_end, locked_folio, cached, EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW | EXTENT_DEFRAG | EXTENT_DO_ACCOUNTING, PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK); btrfs_qgroup_free_data(inode, NULL, file_offset, cur_len, NULL); btrfs_dec_block_group_reservations(fs_info, ins->objectid); btrfs_free_reserved_extent(fs_info, ins->objectid, ins->offset, true); The flag EXTENT_DO_ACCOUNTING implies both EXTENT_CLEAR_META_RESV and EXTENT_CLEAR_DATA_RESV, which will release the bytes_may_use, which later btrfs_free_reserved_extent() will do again, causing an incorrect double release (which may underflow bytes_may_use). [FIX] Use EXTENT_CLEAR_META_RESV to replace EXTENT_DO_ACCOUNTING, and add back the comments on why we only use EXTENT_CLEAR_META_RESV. Fixes: `c28214bde6` ("btrfs: refactor the main loop of cow_file_range()") Reported-by: Chris Mason <clm@meta.com> Link: https://lore.kernel.org/linux-btrfs/20260208184920.1102719-1-clm@meta.com/ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:26 +01:00
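The accounting problem in a4fe134fc1 can be modelled with two counters. The sketch below is a heavily simplified userspace model that follows the code comment quoted in the commit (reserving an extent moves bytes from `bytes_may_use` to `bytes_reserved`); the struct, the flag values and all function names are assumptions made for illustration, not the btrfs definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values; only the "implies both" relation is from the commit. */
#define CLEAR_META_RESV	(1u << 0)
#define CLEAR_DATA_RESV	(1u << 1)
#define DO_ACCOUNTING	(CLEAR_META_RESV | CLEAR_DATA_RESV)

struct space_info {
	uint64_t bytes_may_use;
	uint64_t bytes_reserved;
};

/* Reserving an extent moves 'len' from bytes_may_use to bytes_reserved. */
static void reserve_extent(struct space_info *si, uint64_t len)
{
	si->bytes_may_use -= len;
	si->bytes_reserved += len;
}

/* Models the data-reservation side of clearing the delalloc range. */
static void clear_delalloc(struct space_info *si, uint64_t len,
			   unsigned int flags)
{
	if (flags & CLEAR_DATA_RESV)
		si->bytes_may_use -= len;
}

static void free_reserved_extent(struct space_info *si, uint64_t len)
{
	si->bytes_reserved -= len;
}

/* Error path after the extent was reserved but the ordered extent failed. */
static struct space_info failed_cow(uint64_t len, unsigned int flags)
{
	struct space_info si = { .bytes_may_use = len, .bytes_reserved = 0 };

	reserve_extent(&si, len);
	/* creating the ordered extent fails at this point */
	clear_delalloc(&si, len, flags);
	free_reserved_extent(&si, len);
	return si;
}
```

With `CLEAR_META_RESV` both counters end at zero. With `DO_ACCOUNTING` the model subtracts `len` from `bytes_may_use` a second time, and the unsigned counter wraps around, which is the underflow the commit warns about.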
Jingkai Tan	2970525f78	btrfs: handle discard errors in in btrfs_finish_extent_commit() Coverity (ID: 1226842) reported that the return value of btrfs_discard_extent() is assigned to ret but is immediately overwritten by unpin_extent_range() without being checked. Use the same error handling that is done later in the same function. Signed-off-by: Jingkai Tan <contact@jingk.ai> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-26 15:03:26 +01:00
Kees Cook	189f164e57	Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses Conversion performed via this Coccinelle script: // SPDX-License-Identifier: GPL-2.0-only // Options: --include-headers-for-types --all-includes --include-headers --keep-comments virtual patch @gfp depends on patch && !(file in "tools") && !(file in "samples")@ identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex, kzalloc_obj,kzalloc_objs,kzalloc_flex, kvmalloc_obj,kvmalloc_objs,kvmalloc_flex, kvzalloc_obj,kvzalloc_objs,kvzalloc_flex}; @@ ALLOC(... - , GFP_KERNEL ) $ make coccicheck MODE=patch COCCI=gfp.cocci Build and boot tested x86_64 with Fedora 42's GCC and Clang: Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-22 08:26:33 -08:00
Linus Torvalds	32a92f8c89	Convert more 'alloc_obj' cases to default GFP_KERNEL arguments This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that for a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 20:03:00 -08:00
Linus Torvalds	323bbfcf1e	Convert 'alloc_flex' family to use the new default GFP_KERNEL argument This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Linus Torvalds	bf4afc53b7	Convert 'alloc_obj' family to use the new default GFP_KERNEL argument This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Kees Cook	69050f8d6d	treewide: Replace kmalloc with kmalloc_obj for non-scalar types This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>	2026-02-21 01:02:28 -08:00
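The three replacement shapes listed in 69050f8d6d can be mimicked in userspace to show why a typed return value is useful. The macros below are hypothetical analogues built on `malloc()`/`calloc()`; the `xalloc_*` names, the type-only first argument and the missing overflow check in the flex variant are simplifications for illustration, not the kernel's `kmalloc_obj()` family.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace analogues; these names are not kernel APIs. */
#define xalloc_obj(TYPE)		((TYPE *)malloc(sizeof(TYPE)))
/* calloc() zeroes the memory and checks COUNT * size for overflow. */
#define xzalloc_objs(TYPE, COUNT)	((TYPE *)calloc((COUNT), sizeof(TYPE)))
/* Header plus COUNT trailing elements; no overflow check in this sketch. */
#define xalloc_flex(TYPE, FAM, COUNT)					\
	((TYPE *)malloc(offsetof(TYPE, FAM) +				\
			(COUNT) * sizeof(((TYPE *)0)->FAM[0])))

struct item {
	int id;
};

struct packet {
	size_t len;
	unsigned char data[];	/* flexible array member */
};

/* Allocates a packet with room for 'len' payload bytes, or returns NULL. */
static struct packet *packet_new(size_t len)
{
	struct packet *p = xalloc_flex(struct packet, data, len);

	if (p)
		p->len = len;
	return p;
}
```

Because each macro yields a `TYPE *` rather than `void *`, assigning the result to a pointer of a different struct type draws a compiler diagnostic instead of compiling silently.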
Linus Torvalds	b3f1da2a4d	Merge tag 'for-7.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - multiple error handling fixes of unexpected conditions - reset block group size class once it becomes empty so that its class can be changed - error message level adjustments - fixes of returned error values - use correct block reserve for delayed refs * tag 'for-7.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix invalid leaf access in btrfs_quota_enable() if ref key not found btrfs: fix lost error return in btrfs_find_orphan_roots() btrfs: fix lost return value on error in finish_verity() btrfs: change unaligned root messages to error level in btrfs_validate_super() btrfs: use the correct type to initialize block reserve for delayed refs btrfs: do not ASSERT() when the fs flips RO inside btrfs_repair_io_failure() btrfs: reset block group size class when it becomes empty btrfs: replace BUG() with error handling in __btrfs_balance() btrfs: handle unexpected exact match in btrfs_set_inode_index_count()	2026-02-20 14:57:09 -08:00
Filipe Manana	ecb7c2484c	btrfs: fix invalid leaf access in btrfs_quota_enable() if ref key not found If btrfs_search_slot_for_read() returns 1, it means we did not find any key greater than or equal to the key we asked for, meaning we have reached the end of the tree and therefore the path is not valid. If this happens we need to break out of the loop and stop, instead of continuing and accessing an invalid path. Fixes: `5223cc60b4` ("btrfs: drop the path before adding qgroup items when enabling qgroups") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-18 15:25:54 +01:00
Filipe Manana	7b54e08f2e	btrfs: fix lost error return in btrfs_find_orphan_roots() If the call to btrfs_get_fs_root() returns an error different from -ENOENT we break out of the loop and then return 0, losing the error. Fix this by returning the error instead of breaking from the loop. Reported-by: Chris Mason <clm@meta.com> Link: https://lore.kernel.org/linux-btrfs/20260208185321.1128472-1-clm@meta.com/ Fixes: `8670a25ecb` ("btrfs: use single return variable in btrfs_find_orphan_roots()") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-18 15:25:54 +01:00
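The lost error return described in 7b54e08f2e (and the similar one in finish_verity()) comes down to breaking out of a loop and then returning a constant. A minimal sketch, assuming a hypothetical array of per-root lookup results in place of the real btrfs_get_fs_root() calls:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Buggy shape: an unexpected error stops the loop but is never reported. */
static int scan_roots_lossy(const int *lookup_results, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int err = lookup_results[i];

		if (err < 0 && err != -ENOENT)
			break;		/* error noticed here ... */
	}
	return 0;			/* ... and dropped here */
}

/* Fixed shape: return the error instead of breaking out of the loop. */
static int scan_roots_fixed(const int *lookup_results, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int err = lookup_results[i];

		if (err < 0 && err != -ENOENT)
			return err;
	}
	return 0;
}
```

In both variants -ENOENT is tolerated and the scan continues; only the fixed variant propagates any other error to the caller.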
Filipe Manana	29e525665a	btrfs: fix lost return value on error in finish_verity() If btrfs_update_inode() or del_orphan() fail, we jump to the 'end_trans' label and then return 0 instead of the error returned by one of those calls. Fix this and return the error. Fixes: `61fb7f04ee` ("btrfs: remove out label in finish_verity()") Reported-by: Chris Mason <clm@meta.com> Link: https://lore.kernel.org/linux-btrfs/20260208161129.3888234-1-clm@meta.com/ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-18 15:25:54 +01:00
Filipe Manana	f46a283bbc	btrfs: change unaligned root messages to error level in btrfs_validate_super() If the root nodes for the chunk root, tree root or log root are not sector size aligned, we are logging a warning message but these are in fact errors that make the super block validation fail. So change the level of the messages from warning to error. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-18 15:25:53 +01:00
Filipe Manana	2155d0c0a7	btrfs: use the correct type to initialize block reserve for delayed refs When initializing the delayed refs block reserve for a transaction handle we are passing a type of BTRFS_BLOCK_RSV_DELOPS, which is meant for delayed items and not for delayed refs. The correct type for delayed refs is BTRFS_BLOCK_RSV_DELREFS. On release of any excess space reserved in a local delayed refs reserve, we also should transfer that excess space to the global block reserve (if it's full, we return to the space info for general availability). By initializing a transaction's local delayed refs block reserve with a type of BTRFS_BLOCK_RSV_DELOPS, we were also causing any excess space released from the delayed block reserve (fs_info->delayed_block_rsv, used for delayed inodes and items) to be transferred to the global block reserve instead of the global delayed refs block reserve. This was an unintentional change in commit `28270e25c6` ("btrfs: always reserve space for delayed refs when starting transaction"), but it's not particularly serious as things tend to cancel out each other most of the time and it's relatively rare to be anywhere near exhaustion of the global reserve. Fix this by initializing a transaction's local delayed refs reserve with a type of BTRFS_BLOCK_RSV_DELREFS and making btrfs_block_rsv_release() attempt to transfer unused space from such a reserve into the global block reserve, just as we did before that commit for when the block reserve is a delayed refs rsv. Reported-by: Alex Lyakas <alex.lyakas@zadara.com> Link: https://lore.kernel.org/linux-btrfs/CAOcd+r0FHG5LWzTSu=LknwSoqxfw+C00gFAW7fuX71+Z5AfEew@mail.gmail.com/ Fixes: `28270e25c6` ("btrfs: always reserve space for delayed refs when starting transaction") Reviewed-by: Alex Lyakas <alex.lyakas@zadara.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-18 15:25:53 +01:00
Qu Wenruo	8ceaad6cd6	btrfs: do not ASSERT() when the fs flips RO inside btrfs_repair_io_failure() [BUG] There is a bug report that when btrfs hits ENOSPC error in a critical path, btrfs flips RO (this part is expected, although the ENOSPC bug still needs to be addressed). The problem is after the RO flip, if there is a read repair pending, we can hit the ASSERT() inside btrfs_repair_io_failure() like the following: BTRFS info (device vdc): relocating block group 30408704 flags metadata|raid1 ------------[ cut here ]------------ BTRFS: Transaction aborted (error -28) WARNING: fs/btrfs/extent-tree.c:3235 at __btrfs_free_extent.isra.0+0x453/0xfd0, CPU#1: btrfs/383844 Modules linked in: kvm_intel kvm irqbypass [...] ---[ end trace 0000000000000000 ]--- BTRFS info (device vdc state EA): 2 enospc errors during balance BTRFS info (device vdc state EA): balance: ended with status: -30 BTRFS error (device vdc state EA): parent transid verify failed on logical 30556160 mirror 2 wanted 8 found 6 BTRFS error (device vdc state EA): bdev /dev/nvme0n1 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0 [...] assertion failed: !(fs_info->sb->s_flags & SB_RDONLY) :: 0, in fs/btrfs/bio.c:938 ------------[ cut here ]------------ assertion failed: !(fs_info->sb->s_flags & SB_RDONLY) :: 0, in fs/btrfs/bio.c:938 kernel BUG at fs/btrfs/bio.c:938! Oops: invalid opcode: 0000 [#1] SMP NOPTI CPU: 0 UID: 0 PID: 868 Comm: kworker/u8:13 Tainted: G W N 6.19.0-rc6+ #4788 PREEMPT(full) Tainted: [W]=WARN, [N]=TEST Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014 Workqueue: btrfs-endio simple_end_io_work RIP: 0010:btrfs_repair_io_failure.cold+0xb2/0x120 RSP: 0000:ffffc90001d2bcf0 EFLAGS: 00010246 RAX: 0000000000000051 RBX: 0000000000001000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff8305cf42 RDI: 00000000ffffffff RBP: 0000000000000002 R08: 00000000fffeffff R09: ffffffff837fa988 R10: ffffffff8327a9e0 R11: 6f69747265737361 R12: ffff88813018d310 R13: ffff888168b8a000 R14: ffffc90001d2bd90 R15: ffff88810a169000 FS: 0000000000000000(0000) GS:ffff8885e752c000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 ------------[ cut here ]------------ [CAUSE] The cause of -ENOSPC error during the test case btrfs/124 is still unknown, although it's known that we still have cases where metadata can be over-committed but can not be fulfilled correctly, thus if we hit such ENOSPC error inside a critical path, we have no choice but to abort the current transaction. This will mark the fs read-only. The problem is inside the btrfs_repair_io_failure() path, where we require the fs not to be mounted read-only. This is normally fine, but if we are doing a read-repair while the fs flips RO due to a critical error, we can enter btrfs_repair_io_failure() with super block set to read-only, thus triggering the above crash. [FIX] Just replace the ASSERT() with a proper return if the fs is already read-only. Reported-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/linux-btrfs/20260126045555.GB31641@lst.de/ Tested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-18 15:25:53 +01:00
Jiasheng Jiang	5870ec7c8f	btrfs: reset block group size class when it becomes empty Block group size classes are managed inconsistently. Currently, btrfs_use_block_group_size_class() sets a block group's size class to specialize it for a specific allocation size. However, this size class remains "stale" even if the block group becomes completely empty (both used and reserved bytes reach zero). This happens in two scenarios: 1. When space reservations are freed (e.g., due to errors or transaction aborts) via btrfs_free_reserved_bytes(). 2. When the last extent in a block group is freed via btrfs_update_block_group(). While size classes are advisory, a stale size class can cause find_free_extent to unnecessarily skip candidate block groups during initial search loops. This undermines the purpose of size classes to reduce fragmentation by keeping block groups restricted to a specific size class when they could be reused for any size. Fix this by resetting the size class to BTRFS_BG_SZ_NONE whenever a block group's used and reserved counts both reach zero. This ensures that empty block groups are fully available for any allocation size in the next cycle. Fixes: `52bb7a2166` ("btrfs: introduce size class to block group allocator") Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-18 15:25:53 +01:00
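The reset rule from the commit above, sketched as a userspace model. All names here are hypothetical (the kernel's types and helpers differ); the sketch only illustrates the stated rule that the size class returns to "none" once both the used and reserved byte counts reach zero, on either of the two release paths.

```c
#include <stdint.h>

/* Hypothetical size classes; MODEL_SZ_NONE plays the role of BTRFS_BG_SZ_NONE. */
enum model_size_class {
    MODEL_SZ_NONE = 0,
    MODEL_SZ_SMALL,
    MODEL_SZ_MEDIUM,
    MODEL_SZ_LARGE
};

/* Hypothetical, minimal model of a block group's accounting. */
struct model_block_group {
    uint64_t used;
    uint64_t reserved;
    enum model_size_class size_class;
};

/* Drop the advisory size class once nothing is used or reserved. */
void model_reset_size_class_if_empty(struct model_block_group *bg)
{
    if (bg->used == 0 && bg->reserved == 0)
        bg->size_class = MODEL_SZ_NONE;
}

/* Scenario 1 in the commit message: a reservation is released. */
void model_free_reserved(struct model_block_group *bg, uint64_t bytes)
{
    bg->reserved -= bytes;
    model_reset_size_class_if_empty(bg);
}

/* Scenario 2 in the commit message: an extent is freed. */
void model_free_extent(struct model_block_group *bg, uint64_t bytes)
{
    bg->used -= bytes;
    model_reset_size_class_if_empty(bg);
}
```

Releasing only the reservation leaves the class in place while bytes are still in use; the class is cleared only when the last used byte is freed as well.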
Adarsh Das	be6324a809	btrfs: replace BUG() with error handling in __btrfs_balance() We search with offset (u64)-1 which should never match exactly. Previously this was handled with BUG(). Now log an error and return -EUCLEAN. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Adarsh Das <adarshdas950@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-18 15:25:53 +01:00
Adarsh Das	1c88823a19	btrfs: handle unexpected exact match in btrfs_set_inode_index_count() We search with offset (u64)-1 which should never match exactly. Previously the code silently returned success without setting the index count. Now log an error and return -EUCLEAN instead. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Adarsh Das <adarshdas950@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-18 15:25:53 +01:00
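Both commits above deal with the same idiom: searching for the maximum possible offset in order to land just past the last real item, and treating an exact match as corruption. A minimal sketch follows, assuming a sorted array in place of a b-tree. The search helper, its return convention (0 for an exact match, 1 otherwise), and the -ENOENT case for an empty set are assumptions of this model, not the btrfs API.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* Linux errno value; fallback for other platforms */
#endif

/*
 * Hypothetical search over a sorted array of offsets.
 * Returns 0 on an exact match and 1 otherwise; *slot receives the index of
 * the first element >= key, which equals n when every element is smaller.
 */
int model_search(const uint64_t *offsets, size_t n, uint64_t key, size_t *slot)
{
    size_t i = 0;

    while (i < n && offsets[i] < key)
        i++;
    *slot = i;
    return (i < n && offsets[i] == key) ? 0 : 1;
}

/* Find the largest stored offset by searching for the maximum u64 value. */
int model_last_offset(const uint64_t *offsets, size_t n, uint64_t *last)
{
    size_t slot;
    int ret = model_search(offsets, n, UINT64_MAX, &slot);

    if (ret == 0)
        return -EUCLEAN; /* an item at (u64)-1 should never exist */
    if (slot == 0)
        return -ENOENT;  /* nothing stored at all (assumption of this model) */

    *last = offsets[slot - 1];
    return 0;
}
```

Returning an error for the "impossible" exact match lets the caller fail the operation cleanly instead of crashing or silently continuing with unset state.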
Linus Torvalds	997f9640c9	fsverity updates for 7.0 fsverity cleanups, speedup, and memory usage optimization from Christoph Hellwig: - Move some logic into common code - Fix btrfs to reject truncates of fsverity files - Improve the readahead implementation - Store each inode's fsverity_info in a hash table instead of using a pointer in the filesystem-specific part of the inode. This optimizes for memory usage in the usual case where most files don't have fsverity enabled. - Look up the fsverity_info fewer times during verification, to amortize the hash table overhead -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaY0nZhQcZWJpZ2dlcnNA a2VybmVsLm9yZwAKCRDzXCl4vpKOK/AVAP9wSLEYsG3dqnNIHjIvLeK+9NC3Ni4d m+fvT1JfuideOwEA9r2EfztusLU5iyqWJlHyxekibXItUDgYGltaYb7eXAU= =a+To -----END PGP SIGNATURE----- Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux Pull fsverity updates from Eric Biggers: "fsverity cleanups, speedup, and memory usage optimization from Christoph Hellwig: - Move some logic into common code - Fix btrfs to reject truncates of fsverity files - Improve the readahead implementation - Store each inode's fsverity_info in a hash table instead of using a pointer in the filesystem-specific part of the inode. This optimizes for memory usage in the usual case where most files don't have fsverity enabled. 
- Look up the fsverity_info fewer times during verification, to amortize the hash table overhead" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux: fsverity: remove inode from fsverity_verification_ctx fsverity: use a hashtable to find the fsverity_info btrfs: consolidate fsverity_info lookup f2fs: consolidate fsverity_info lookup ext4: consolidate fsverity_info lookup fs: consolidate fsverity_info lookup in buffer.c fsverity: push out fsverity_info lookup fsverity: deconstify the inode pointer in struct fsverity_info fsverity: kick off hash readahead at data I/O submission time ext4: move ->read_folio and ->readahead to readpage.c readahead: push invalidate_lock out of page_cache_ra_unbounded fsverity: don't issue readahead for non-ENOENT errors from __filemap_get_folio fsverity: start consolidating pagecache code fsverity: pass struct file to ->write_merkle_tree_block f2fs: don't build the fsverity work handler for !CONFIG_FS_VERITY ext4: don't build the fsverity work handler for !CONFIG_FS_VERITY fs,fsverity: clear out fsverity_info from common code fs,fsverity: reject size changes on fsverity files in setattr_prepare	2026-02-12 10:41:34 -08:00
Linus Torvalds	41f1a08645	Kbuild/Kconfig updates for 7.0 Kbuild changes ============== * Drop '_probe' pattern from modpost section check allowlist, which hid legitimate warnings (Johan Hovold) * Disable -Wtype-limits altogether, instead of enabling at W=2 (Vincent Mailhol) * Improve UAPI testing to skip testing headers that require a libc when CONFIG_CC_CAN_LINK is not set, opening up testing of headers with no libc dependencies to more environments (Thomas Weißschuh) * Update gendwarfksyms documentation with required dependencies (Jihan LIN) * Reject invalid LLVM= values to avoid unintentionally falling back to system toolchain (Thomas Weißschuh) * Add a script to help run the kernel build process in a container for consistent environments and testing (Guillaume Tucker) * Simplify kallsyms by getting rid of the relative base (Ard Biesheuvel) * Performance and usability improvements to scripts/make_fit.py (Simon Glass) * Minor various clean ups and fixes Kconfig changes =============== * Move XPM icons to individual files, clearing up GTK deprecation warnings (Rostislav Krasny) * Support depends on FOO if BAR as syntactic sugar for depends on FOO || !BAR (Nicolas Pitre, Graham Roff) * Refactor merge_config.sh to use awk over shell/sed/grep, dramatically speeding up processing large number of config fragments (Anders Roxell, Mikko Rapeli) -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQR74yXHMTGczQHYypIdayaRccAalgUCaYpgQwAKCRAdayaRccAa liOGAQCqMI42YMLqljFcPu3B/3f43xhDBCXAhquPBIMhbgt+aAEAmmo3uMLHKSRV XZDKkq13HMMV3Zlmrn5Xk/tzk+hkwwk= =WYl4 -----END PGP SIGNATURE----- Merge tag 'kbuild-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux Pull Kbuild/Kconfig updates from Nathan Chancellor: "Kbuild: - Drop '_probe' pattern from modpost section check allowlist, which hid legitimate warnings (Johan Hovold) - Disable -Wtype-limits altogether, instead of enabling at W=2 (Vincent Mailhol) - Improve UAPI testing to skip testing headers that require a libc when 
CONFIG_CC_CAN_LINK is not set, opening up testing of headers with no libc dependencies to more environments (Thomas Weißschuh) - Update gendwarfksyms documentation with required dependencies (Jihan LIN) - Reject invalid LLVM= values to avoid unintentionally falling back to system toolchain (Thomas Weißschuh) - Add a script to help run the kernel build process in a container for consistent environments and testing (Guillaume Tucker) - Simplify kallsyms by getting rid of the relative base (Ard Biesheuvel) - Performance and usability improvements to scripts/make_fit.py (Simon Glass) - Minor various clean ups and fixes Kconfig: - Move XPM icons to individual files, clearing up GTK deprecation warnings (Rostislav Krasny) - Support depends on FOO if BAR as syntactic sugar for depends on FOO || !BAR (Nicolas Pitre, Graham Roff) - Refactor merge_config.sh to use awk over shell/sed/grep, dramatically speeding up processing large number of config fragments (Anders Roxell, Mikko Rapeli)" * tag 'kbuild-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: (39 commits) kbuild: remove dependency of run-command on config scripts/make_fit: Compress dtbs in parallel scripts/make_fit: Support a few more parallel compressors kbuild: Support a FIT_EXTRA_ARGS environment variable scripts/make_fit: Move dtb processing into a function scripts/make_fit: Support an initial ramdisk scripts/make_fit: Speed up operation rust: kconfig: Don't require RUST_IS_AVAILABLE for rustc-option MAINTAINERS: Add scripts/install.sh into Kbuild entry modpost: Amend ppc64 save/restfpr symnames for -Os build MIPS: tools: relocs: Ship a definition of R_MIPS_PC32 streamline_config.pl: remove superfluous exclamation mark kbuild: dummy-tools: Add python3 scripts: kconfig: merge_config.sh: warn on duplicate input files scripts: kconfig: merge_config.sh: use awk in checks too scripts: kconfig: merge_config.sh: refactor from shell/sed/grep to awk kallsyms: Get rid of kallsyms relative base mips: Add 
support for PC32 relocations in vmlinux Documentation: dev-tools: add container.rst page scripts: add tool to run containerized builds ...	2026-02-11 13:40:35 -08:00
Linus Torvalds	8912c2fd58	for-6.20-tag -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmmDT6sACgkQxWXV+ddt WDteIBAAnQBKtHZOrefnA/SjbT4N+IV20x8sxVc3XI2MXw6RpjEN6k+0oGMLdvMy 5NBryJ43q5CwCV6iNkWQE4mT86gcPa6Bqv1nFOC5Q2BDkvbVBpOfOq7kC2+fQ7ay HF2Mr0PUHc0Y0MhkRSljO+T2QD4tDpWaxbEeVY+TxiAsepD1paK4fHV6Lwu2sk25 17RJQvm/2XRY32g9Sa6NZIc7mGuyIasMCBcTpDKDJW10hP61NNtK4wHgPLtMRtzx qzCAPSMS6QkeJZHcDa/Atg+iqpR5U8pdKAUSYJii3Kgcmjr5n1U1ZTp5WRLlXSS2 tHiR62a983ya022wKR1ApsdjN7ncE8iIeT/GrezZVcPtm9jTxaSzgd7dDNfSmr29 my4crJWvlEuD9Qt+/oz//eLAjkgEe2Q5RtaAworCAG00MzaGOEwNiXXP7DDMQApI VTxx9dvY0s/W3UF/IuJWTTN9q95KjvlmZ9ELAPxwwtyq+sAD41CvlYhJqCaLLec5 6xMotP5cy3Ur+yp+J7RCDprQ7x6YcU98PYIXQxf1/77f3Lz/7QA2TWafPzJ5V2Bk UtprVCrlqwCmSFrSISN6HzNf0UYY/ZI36WRoUj/ZJkGNfkQwvs9aBjb+lVYRb8T8 OcMlJrJvoUwIY//ef5K97ma8HOecodxszdEIafOgmnJtE9H3foI= =Ie8n -----END PGP SIGNATURE----- Merge tag 'for-6.20-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "User visible changes, feature updates: - when using block size > page size, enable direct IO - fallback to buffered IO if the data profile has duplication, workaround to avoid checksum mismatches on block group profiles with redundancy, real direct IO is possible on single or RAID0 - redo export of zoned statistics, moved from sysfs to /proc/pid/mountstats due to size limitations of the former Experimental features: - remove offload checksum tunable, intended to find best way to do it but since we've switched to offload to thread for everything we don't need it anymore - initial support for remap-tree feature, a translation layer of logical block addresses that allow changes without moving/rewriting blocks to do eg. 
relocation, or other changes that require COW Notable fixes: - automatic removal of accidentally leftover chunks when free-space-tree is enabled since mkfs.btrfs v6.16.1 - zoned mode: - do not try to append to conventional zones when RAID is mixing zoned and conventional drives - fixup write pointers when mixing zoned and conventional on DUP/RAID* profiles - when using squota, relax deletion rules for qgroups with 0 members to allow easier recovery from accounting bugs, also add more checks to detect bad accounting - fix periodic reclaim scanning, properly check boundary conditions not to trigger it unexpectedly or miss the time to run it - trim: - continue after first error - change reporting to the first detected error - add more cancellation points - reduce contention of big device lock that can block other operations when there's lots of trimmed space - when chunk allocation is forced (needs experimental build) fix transaction abort when unexpected space layout is detected Core: - switch to crypto library API for checksumming, removed module dependencies, pointer indirections, etc. 
- error handling improvements - adjust how and where transaction commit or abort are done and are maybe not necessary - minor compression optimization to skip single block ranges - improve how compression folios are handled - new and updated selftests - cleanups, refactoring: - auto-freeing and other automatic variable cleanup conversion - structure size optimizations - condition annotations" * tag 'for-6.20-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (137 commits) btrfs: get rid of compressed_bio::compressed_folios[] btrfs: get rid of compressed_folios[] usage for encoded writes btrfs: get rid of compressed_folios[] usage for compressed read btrfs: remove the old btrfs_compress_folios() infrastructure btrfs: switch to btrfs_compress_bio() interface for compressed writes btrfs: introduce btrfs_compress_bio() helper btrfs: zlib: introduce zlib_compress_bio() helper btrfs: zstd: introduce zstd_compress_bio() helper btrfs: lzo: introduce lzo_compress_bio() helper btrfs: zoned: factor out the zone loading part into a testable function btrfs: add cleanup function for btrfs_free_chunk_map btrfs: tests: add cleanup functions for test specific functions btrfs: raid56: fix memory leak of btrfs_raid_bio::stripe_uptodate_bitmap btrfs: tests: add unit tests for pending extent walking functions btrfs: fix EEXIST abort due to non-consecutive gaps in chunk allocation btrfs: fix transaction commit blocking during trim of unallocated space btrfs: handle user interrupt properly in btrfs_trim_fs() btrfs: preserve first error in btrfs_trim_fs() btrfs: continue trimming remaining devices on failure btrfs: do not BUG_ON() in btrfs_remove_block_group() ...	2026-02-09 15:45:21 -08:00
Linus Torvalds	9e355113f0	vfs-7.0-rc1.misc Please consider pulling these changes from the signed vfs-7.0-rc1.misc tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49QAKCRCRxhvAZXjc ojrZAQD1VJzY46r5FnAVf4jlEHyjIbDnZCP/n+c4x6XnqpU6EQEAgB0yAtAGP6+u SBuytElqHoTT5VtmEXTAabCNQ9Ks8wo= =JwZz -----END PGP SIGNATURE----- Merge tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "This contains a mix of VFS cleanups, performance improvements, API fixes, documentation, and a deprecation notice. Scalability and performance: - Rework pid allocation to only take pidmap_lock once instead of twice during alloc_pid(), improving thread creation/teardown throughput by 10-16% depending on false-sharing luck. Pad the namespace refcount to reduce false-sharing - Track file lock presence via a flag in ->i_opflags instead of reading ->i_flctx, avoiding false-sharing with ->i_readcount on open/close hot paths. 
Measured 4-16% improvement on 24-core open-in-a-loop benchmarks - Use a consume fence in locks_inode_context() to match the store-release/load-consume idiom, eliminating a hardware fence on some architectures - Annotate cdev_lock with __cacheline_aligned_in_smp to prevent false-sharing - Remove a redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu() that never fires since the caller already verifies it, eliminating a 100% mispredicted branch - Fix a 100% mispredicted likely() in devcgroup_inode_permission() that became wrong after a prior code reorder Bug fixes and correctness: - Make insert_inode_locked() wait for inode destruction instead of skipping, fixing a corner case where two matching inodes could exist in the hash - Move f_mode initialization before file_ref_init() in alloc_file() to respect the SLAB_TYPESAFE_BY_RCU ordering contract - Add a WARN_ON_ONCE guard in try_to_free_buffers() for folios with no buffers attached, preventing a null pointer dereference when AS_RELEASE_ALWAYS is set but no release_folio op exists - Fix select restart_block to store end_time as timespec64, avoiding truncation of tv_sec on 32-bit architectures - Make dump_inode() use get_kernel_nofault() to safely access inode and superblock fields, matching the dump_mapping() pattern API modernization: - Make posix_acl_to_xattr() allocate the buffer internally since every single caller was doing it anyway. 
Reduces boilerplate and unnecessary error checking across ~15 filesystems - Replace deprecated simple_strtoul() with kstrtoul() for the ihash_entries, dhash_entries, mhash_entries, and mphash_entries boot parameters, adding proper error handling - Convert chardev code to use guard(mutex) and __free(kfree) cleanup patterns - Replace min_t() with min() or umin() in VFS code to avoid silently truncating unsigned long to unsigned int - Gate LOOKUP_RCU assertions behind CONFIG_DEBUG_VFS since callers already check the flag Deprecation: - Begin deprecating legacy BSD process accounting (acct(2)). The interface has numerous footguns and better alternatives exist (eBPF) Documentation: - Fix and complete kernel-doc for struct export_operations, removing duplicated documentation between ReST and source - Fix kernel-doc warnings for __start_dirop() and ilookup5_nowait() Testing: - Add a kunit test for initramfs cpio handling of entries with filesize > PATH_MAX Misc: - Add missing <linux/init_task.h> include in fs_struct.c" * tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits) posix_acl: make posix_acl_to_xattr() alloc the buffer fs: make insert_inode_locked() wait for inode destruction initramfs_test: kunit test for cpio.filesize > PATH_MAX fs: improve dump_inode() to safely access inode fields fs: add <linux/init_task.h> for 'init_fs' docs: exportfs: Use source code struct documentation fs: move initializing f_mode before file_ref_init() exportfs: Complete kernel-doc for struct export_operations exportfs: Mark struct export_operations functions at kernel-doc exportfs: Fix kernel-doc output for get_name() acct(2): begin the deprecation of legacy BSD process accounting device_cgroup: remove branch hint after code refactor VFS: fix __start_dirop() kernel-doc warnings fs: Describe @isnew parameter in ilookup5_nowait() fs/namei: Remove redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu fs: only assert on LOOKUP_RCU when built 
with CONFIG_DEBUG_VFS select: store end_time as timespec64 in restart block chardev: Switch to guard(mutex) and __free(kfree) namespace: Replace simple_strtoul with kstrtoul to parse boot params dcache: Replace simple_strtoul with kstrtoul in set_dhash_entries ...	2026-02-09 15:13:05 -08:00
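The pull message above mentions replacing min_t() with min() or umin() to avoid silently truncating unsigned long to unsigned int. The hazard can be reproduced in userspace with fixed-width types. The macro below is a simplified stand-in written for this sketch, not the kernel's min_t(), which carries additional type checking; the function names are likewise hypothetical.

```c
#include <stdint.h>

/* Simplified casting minimum: both operands are cast to `type` first. */
#define MODEL_MIN_T(type, a, b) \
    ((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Casting a 64-bit length down to 32 bits discards the upper half. */
uint64_t model_truncating_min(uint64_t len, uint32_t limit)
{
    return MODEL_MIN_T(uint32_t, len, limit);
}

/* Comparing in the wider unsigned type preserves the full value of len. */
uint64_t model_widening_min(uint64_t len, uint32_t limit)
{
    return len < (uint64_t)limit ? len : (uint64_t)limit;
}
```

With len = 0x100000001 and limit = 4096, the truncating version sees len as 1 and returns 1, while the widening version returns the correct 4096.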
Linus Torvalds	6124fa45e2	vfs-7.0-rc1.btrfs Please consider pulling these changes from the signed vfs-7.0-rc1.btrfs tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49gAKCRCRxhvAZXjc oogiAP0bJ72jxff4CcV1VDltO/mDT2XcCBRz3hYSZdC12Q+AYAD/XlozEUrUgbgg V2pWb1Xo+NrbNyhtNQ+2btHFmkzJ1gY= =7Xrx -----END PGP SIGNATURE----- Merge tag 'vfs-7.0-rc1.btrfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs updates for btrfs from Christian Brauner: "This contains some changes for btrfs that are taken to the vfs tree to stop duplicating VFS code for subvolume/snapshot dentry Btrfs has carried private copies of the VFS may_delete() and may_create() functions in fs/btrfs/ioctl.c for permission checks during subvolume creation and snapshot destruction. These copies have drifted out of sync with the VFS originals — btrfs_may_delete() is missing the uid/gid validity check and btrfs_may_create() is missing the audit_inode_child() call. Export the VFS functions as may_{create,delete}_dentry() and switch btrfs to use them, removing ~70 lines of duplicated code" * tag 'vfs-7.0-rc1.btrfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: btrfs: use may_create_dentry() in btrfs_mksubvol() btrfs: use may_delete_dentry() in btrfs_ioctl_snap_destroy() fs: export may_create() as may_create_dentry() fs: export may_delete() as may_delete_dentry()	2026-02-09 13:05:35 -08:00
Linus Torvalds	aa2a0fcd4c	vfs-7.0-rc1.leases Please consider pulling these changes from the signed vfs-7.0-rc1.leases tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49gAKCRCRxhvAZXjc olR/AP40iNOTRn7LosXbRWqGGZqzy9v64QYoLzk3QdsWuGmbRAD/egNQzof8mkAf IscefWTOjY7xyDzmEBEBnfHftgMiEwM= =zre0 -----END PGP SIGNATURE----- Merge tag 'vfs-7.0-rc1.leases' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs lease updates from Christian Brauner: "This contains updates for lease support to require filesystems to explicitly opt-in to lease support Currently kernel_setlease() falls through to generic_setlease() when a filesystem does not define ->setlease(), silently granting lease support to every filesystem regardless of whether it is prepared for it. This is a poor default: most filesystems never intended to support leases, and the silent fallthrough makes it impossible to distinguish "supports leases" from "never thought about it". This inverts the default. It adds explicit .setlease = generic_setlease; assignments to every in-tree filesystem that should retain lease support, then changes kernel_setlease() to return -EINVAL when ->setlease is NULL. 
With the new default in place, simple_nosetlease() is redundant and is removed along with all references to it" * tag 'vfs-7.0-rc1.leases' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits) fuse: add setlease file operation fs: remove simple_nosetlease() filelock: default to returning -EINVAL when ->setlease operation is NULL xfs: add setlease file operation ufs: add setlease file operation udf: add setlease file operation tmpfs: add setlease file operation squashfs: add setlease file operation overlayfs: add setlease file operation orangefs: add setlease file operation ocfs2: add setlease file operation ntfs3: add setlease file operation nilfs2: add setlease file operation jfs: add setlease file operation jffs2: add setlease file operation gfs2: add a setlease file operation fat: add setlease file operation f2fs: add setlease file operation exfat: add setlease file operation ext4: add setlease file operation ...	2026-02-09 11:59:07 -08:00
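The inverted default described in the lease pull message can be sketched as a dispatch through an optional function pointer. Everything below is a hypothetical userspace model: the struct layouts and function names only mirror the shape of the description (a NULL ->setlease now yields -EINVAL instead of falling through to a generic handler).

```c
#include <errno.h>
#include <stddef.h>

struct model_file;

/* Hypothetical operation table with an optional setlease hook. */
struct model_file_operations {
    int (*setlease)(struct model_file *file, int arg);
};

/* Hypothetical file: its operations plus the lease state being modelled. */
struct model_file {
    const struct model_file_operations *f_op;
    int lease;
};

/* Generic handler that a filesystem must now opt in to explicitly. */
int model_generic_setlease(struct model_file *file, int arg)
{
    file->lease = arg;
    return 0;
}

/* New default: no hook means no lease support. */
int model_kernel_setlease(struct model_file *file, int arg)
{
    if (!file->f_op->setlease)
        return -EINVAL;
    return file->f_op->setlease(file, arg);
}
```

A filesystem that wants leases sets the hook to the generic handler in its operation table; one that leaves it NULL rejects lease requests, which makes the absence of support an explicit, observable state.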
Linus Torvalds	74554251df	vfs-7.0-rc1.nonblocking_timestamps Please consider pulling these changes from the signed vfs-7.0-rc1.nonblocking_timestamps tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49gAKCRCRxhvAZXjc oqNMAQCjHw9iwYDu63n96QAipWopJb8onqc0rTEvi0OOl1zDNwEAufN3EqTzV3uQ JbNgSwBWD/+ICd2aUOuAX0GgU6teyAQ= =lJlI -----END PGP SIGNATURE----- Merge tag 'vfs-7.0-rc1.nonblocking_timestamps' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs timestamp updates from Christian Brauner: "This contains the changes to support non-blocking timestamp updates. Since commit `66fa3cedf1` ("fs: Add async write file modification handling") file_update_time_flags() unconditionally returns -EAGAIN when any timestamp needs updating and IOCB_NOWAIT is set. This makes non-blocking direct writes impossible on file systems with granular enough timestamps, which in practice means all of them. This reworks the timestamp update path to propagate IOCB_NOWAIT through ->update_time so that file systems which can update timestamps without blocking are no longer penalized. With that groundwork in place, the core change passes IOCB_NOWAIT into ->update_time and returns -EAGAIN only when the file system indicates it would block. 
XFS implements non-blocking timestamp updates by using the new ->sync_lazytime and open-coding generic_update_time without the S_NOWAIT check, since the lazytime path through the generic helpers can never block in XFS" * tag 'vfs-7.0-rc1.nonblocking_timestamps' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: xfs: enable non-blocking timestamp updates xfs: implement ->sync_lazytime fs: refactor file_update_time_flags fs: add support for non-blocking timestamp updates fs: add a ->sync_lazytime method fs: factor out a sync_lazytime helper fs: refactor ->update_time handling fat: cleanup the flags for fat_truncate_time nfs: split nfs_update_timestamps fs: allow error returns from generic_update_time fs: remove inode_update_time	2026-02-09 11:25:01 -08:00
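The behavioural change in the timestamp pull message can be sketched by contrasting two policies for a no-wait write. This is a hypothetical model: the flag, the inode struct, and the `update_may_block` field are inventions of this sketch that stand in for "the file system indicates it would block".

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a no-wait request flag such as IOCB_NOWAIT. */
#define MODEL_NOWAIT 0x1

/* Hypothetical inode: one timestamp and whether updating it can block. */
struct model_inode {
    bool update_may_block;
    long mtime;
};

/* Old policy as described: any pending update fails a no-wait caller. */
int model_update_time_old(struct model_inode *inode, long now, int flags)
{
    if (inode->mtime == now)
        return 0; /* nothing to update */
    if (flags & MODEL_NOWAIT)
        return -EAGAIN;
    inode->mtime = now;
    return 0;
}

/* New policy as described: fail only if this file system would block. */
int model_update_time_new(struct model_inode *inode, long now, int flags)
{
    if (inode->mtime == now)
        return 0; /* nothing to update */
    if ((flags & MODEL_NOWAIT) && inode->update_may_block)
        return -EAGAIN;
    inode->mtime = now;
    return 0;
}
```

Under the old policy a no-wait write that needs a timestamp update always bounces back with -EAGAIN; under the new one it succeeds whenever the file system can apply the update without blocking.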
Christoph Hellwig	f77f281b61	fsverity: use a hashtable to find the fsverity_info Use the kernel's resizable hash table (rhashtable) to find the fsverity_info. This way file systems that want to support fsverity don't have to bloat every inode in the system with an extra pointer. The trade-off is that looking up the fsverity_info is a bit more expensive now, but the main operations are still dominated by I/O and hashing overhead. The rhashtable implementation requires no external synchronization, and the _fast versions of the APIs provide the RCU critical sections required by the implementation. Because struct fsverity_info is only removed on inode eviction and does not contain a reference count, there is no need for an extended critical section to grab a reference or validate the object state. The file open path uses rhashtable_lookup_get_insert_fast, which can either find an existing object for the hash key or insert a new one in a single atomic operation, so that concurrent opens never instantiate duplicate fsverity_info structures. FS_IOC_ENABLE_VERITY must already be synchronized by a combination of i_rwsem and file system flags and uses rhashtable_lookup_insert_fast, which errors out on an existing object for the hash key as an additional safety check. Because insertion into the hash table now happens before S_VERITY is set, fsverity just becomes a barrier and a flag check and doesn't have to look up the fsverity_info at all, so there is only a single lookup per ->read_folio or ->readahead invocation. For btrfs there is an additional one for each bio completion, while for ext4 and f2fs the fsverity_info is stored in the per-I/O context and reused for the completion workqueue. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. 
Wong" <djwong@kernel.org> Link: https://lore.kernel.org/r/20260202060754.270269-12-hch@lst.de [EB: folded in fix for missing fsverity_free_info()] Signed-off-by: Eric Biggers <ebiggers@kernel.org>	2026-02-04 11:31:54 -08:00
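The find-or-insert behaviour that the commit above relies on can be illustrated with a small single-threaded chained hash table keyed by an inode pointer. This is a model only: it has none of rhashtable's resizing, RCU protection, or atomicity, and the names, the bucket count, and the return convention (NULL after a fresh insert, the existing entry otherwise) are choices made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_BUCKETS 64

/* Hypothetical per-inode info object, chained within a bucket. */
struct model_info {
    const void *inode;       /* hash key */
    struct model_info *next; /* bucket chain */
};

struct model_table {
    struct model_info *buckets[MODEL_BUCKETS];
};

/* Toy hash: drop low pointer bits, then reduce to a bucket index. */
static size_t model_hash(const void *key)
{
    return (size_t)(((uintptr_t)key >> 4) % MODEL_BUCKETS);
}

/* Plain lookup: returns the entry for this inode, or NULL if absent. */
struct model_info *model_lookup(struct model_table *t, const void *inode)
{
    struct model_info *cur = t->buckets[model_hash(inode)];

    while (cur && cur->inode != inode)
        cur = cur->next;
    return cur;
}

/*
 * Find-or-insert: returns the existing entry for new_info->inode if there
 * is one (new_info is then left unlinked), or inserts new_info and returns
 * NULL. A second opener therefore reuses the first opener's object.
 */
struct model_info *model_lookup_get_insert(struct model_table *t,
                                           struct model_info *new_info)
{
    struct model_info *old = model_lookup(t, new_info->inode);
    size_t b;

    if (old)
        return old;

    b = model_hash(new_info->inode);
    new_info->next = t->buckets[b];
    t->buckets[b] = new_info;
    return NULL;
}
```

In the real code the lookup and the insert happen as one atomic operation, which is what prevents two concurrent opens from both inserting; the model only shows the caller-visible contract.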
Christoph Hellwig	b0160e4501	btrfs: consolidate fsverity_info lookup Look up the fsverity_info once in btrfs_do_readpage, and then use it for all operations performed there, and do the same in end_folio_read for all folios processed there. The latter is also changed to derive the inode from the btrfs_bio - while bbio->inode is optional, it is always set for buffered reads. This amortizes the lookup better once it becomes less efficient. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: David Sterba <dsterba@suse.com> Link: https://lore.kernel.org/r/20260202060754.270269-11-hch@lst.de Signed-off-by: Eric Biggers <ebiggers@kernel.org>	2026-02-04 11:31:54 -08:00
Linus Torvalds	de0674d9bc	for-6.19-rc8-tag -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmmCG6sACgkQxWXV+ddt WDuZfQ/8C2GRu5amfd4kd7blAcgRcsVFO0br8gCZtjQ7dNGieK+1HIg/nJZsxSp+ h/gbW+QV0Sz3qz8Qqa5zxI7vApCnOC6DNstbNv6U/b2NL42vWaKujKjNHo+UaxMX 7nmsFgBjzUf3CbKhSgCWVQdcxFhOd+t6Od9DhQrmmabM2v3uWBFvWYEB0GqlUs/g G+VXGMk/Q55Fr29CRristPs/Xbc8Yw3nkeDpHBqvpy7H3dJlMY6qN5lbOzlH99BB Bnx1DC4plOcIb6yerWAYsV+GWEpZUq+sUwnwLAEsN39J+R6JtRhn/HPXu+ElFijV 7daM8WwzBcopDVKhtjkiywpAIKSCPgU7er06gRFgPegNWB5g+KIyxRZU1Yn9dXcP xZR3meexnM0MV8YXqJhIS27TX24Lq13IaWfEKc/VoSwxodhhH22w5M2W2nipRhsK 28GKZL2JJc8JEwO++cU/NgXuCzYRfR4WzUrpri6gVp3h5UJh3mLIkBQG9r0ORbKu qbO/IXyp7GMGaH9RjecmcGEtW7LCA75E4rsrJWpCNdtBug+gygk82surTmnRKXhc PWc+QRzz8aOmdJTu5uLQvfNNi/so/TQJ+hq+yU0NMPAEjYa1UvKGV5A+Gu5z8iDY jhLfo7EV+x5XB74PaBulQtpUV6i51E41ggusLYhBVc06KEE3YtQ= =WiGo -----END PGP SIGNATURE----- Merge tag 'for-6.19-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "A regression fix for a memory leak when raid56 is used" * tag 'for-6.19-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: raid56: fix memory leak of btrfs_raid_bio::stripe_uptodate_bitmap	2026-02-03 10:19:58 -08:00
Qu Wenruo	161ab30da6	btrfs: get rid of compressed_bio::compressed_folios[] Now there is no one utilizing that member, we can safely remove it along with compressed_bio::nr_folios member. The size is reduced from 352 to 336 bytes on x86_64. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-03 07:59:07 +01:00
Qu Wenruo	e1bc83f8b1	btrfs: get rid of compressed_folios[] usage for encoded writes Currently only encoded writes utilized btrfs_submit_compressed_write(), which utilized compressed_bio::compressed_folios[] array. Change the only call site to call the new helper, btrfs_alloc_compressed_write(), to allocate a compressed bio, then queue needed folios into that bio, and finally call btrfs_submit_compressed_write() to submit the compressed bio. This change has one hidden benefit, previously we used btrfs_alloc_folio_array() for the folios of btrfs_submit_compressed_write(), which doesn't utilize the compression page pool for bs == ps cases. Now we call btrfs_alloc_compr_folio() which will benefit from the page pool. The other obvious benefit is that we no longer need to allocate an array to hold all those folios, thus one less error path. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-03 07:59:07 +01:00
Qu Wenruo	dafcfa1c8e	btrfs: get rid of compressed_folios[] usage for compressed read Currently btrfs_submit_compressed_read() still uses compressed_bio::compressed_folios[] array. Change it to allocate each folio and queue them into the compressed bio so that we do not need to allocate that array. Considering how small each compressed read bio is (less than 128KiB), we do not benefit that much from btrfs_alloc_folio_array() anyway, while we may benefit more from btrfs_alloc_compr_folio() by using the global folio pool. So changing from btrfs_alloc_folio_array() to btrfs_alloc_compr_folio() in a loop should still be fine. This removes one error path, and paves the way to completely remove compressed_folios[] array. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-03 07:59:07 +01:00
Qu Wenruo	26902be0cd	btrfs: remove the old btrfs_compress_folios() infrastructure Since it's been replaced by btrfs_compress_bio(), remove all involved functions. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-03 07:59:07 +01:00


15053 commits