Commit graph

101854 commits

Author SHA1 Message Date
NeilBrown
cf296b294c
VFS: introduce end_creating_keep()
Occasionally the caller of end_creating() wants to keep using the dentry.
Rather then requiring them to dget() the dentry (when not an error)
before calling end_creating(), provide end_creating_keep() which does
this.

cachefiles and overlayfs make use of this.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/20251113002050.676694-16-neilb@ownmail.net
Tested-by: syzbot@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:58 +01:00
NeilBrown
fe497f0759
VFS: change vfs_mkdir() to unlock on failure.
vfs_mkdir() already drops the reference to the dentry on failure but it
leaves the parent locked.
This complicates end_creating() which needs to unlock the parent even
though the dentry is no longer available.

If we change vfs_mkdir() to unlock on failure as well as releasing the
dentry, we can remove the "parent" arg from end_creating() and simplify
the rules for calling it.

Note that cachefiles_get_directory() can choose to substitute an error
instead of actually calling vfs_mkdir(), for fault injection.  In that
case it needs to call end_creating(), just as vfs_mkdir() now does on
error.

ovl_create_real() will now unlock on error.  So the conditional
end_creating() after the call is removed, and end_creating() is called
internally on error.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Tested-by: syzbot@syzkaller.appspotmail.com
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/20251113002050.676694-15-neilb@ownmail.net
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:58 +01:00
NeilBrown
f046fbb4d8
ecryptfs: use new start_creating/start_removing APIs
This requires the addition of start_creating_dentry() which is given the
dentry which has already been found, and asks for it to be locked and
its parent validated.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/20251113002050.676694-14-neilb@ownmail.net
Tested-by: syzbot@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:58 +01:00
NeilBrown
833d2b3a07
Add start_renaming_two_dentries()
A few callers want to lock for a rename and already have both dentries.
Also debugfs does want to perform a lookup but doesn't want permission
checking, so start_renaming_dentry() cannot be used.

This patch introduces start_renaming_two_dentries() which is given both
dentries.  debugfs performs one lookup itself.  As it will only continue
with a negative dentry and as those cannot be renamed or unlinked, it is
safe to do the lookup before getting the rename locks.

overlayfs uses start_renaming_two_dentries() in three places and  selinux
uses it twice in sel_make_policy_nodes().

In sel_make_policy_nodes() we now lock for rename twice instead of just
once so the combined operation is no longer atomic w.r.t the parent
directory locks.  As selinux_state.policy_mutex is held across the whole
operation this does not open up any interesting races.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/20251113002050.676694-13-neilb@ownmail.net
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:58 +01:00
NeilBrown
ac50950ca1
VFS/ovl/smb: introduce start_renaming_dentry()
Several callers perform a rename on a dentry they already have, and only
require lookup for the target name.  This includes smb/server and a few
different places in overlayfs.

start_renaming_dentry() performs the required lookup and takes the
required lock using lock_rename_child()

It is used in three places in overlayfs and in ksmbd_vfs_rename().

In the ksmbd case, the parent of the source is not important - the
source must be renamed from wherever it is.  So start_renaming_dentry()
allows rd->old_parent to be NULL and only checks it if it is non-NULL.
On success rd->old_parent will be the parent of old_dentry with an extra
reference taken.  Other start_renaming function also now take the extra
reference and end_renaming() now drops this reference as well.

ovl_lookup_temp(), ovl_parent_lock(), and ovl_parent_unlock() are
all removed as they are no longer needed.

OVL_TEMPNAME_SIZE and ovl_tempname() are now declared in overlayfs.h so
that ovl_check_rename_whiteout() can access them.

ovl_copy_up_workdir() now always cleans up on error.

Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/20251113002050.676694-12-neilb@ownmail.net
Tested-by: syzbot@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:57 +01:00
NeilBrown
5c87527299
VFS/nfsd/ovl: introduce start_renaming() and end_renaming()
start_renaming() combines name lookup and locking to prepare for rename.
It is used when two names need to be looked up as in nfsd and overlayfs -
cases where one or both dentries are already available will be handled
separately.

__start_renaming() avoids the inode_permission check and hash
calculation and is suitable after filename_parentat() in do_renameat2().
It subsumes quite a bit of code from that function.

start_renaming() does calculate the hash and check X permission and is
suitable elsewhere:
- nfsd_rename()
- ovl_rename()

In ovl, ovl_do_rename_rd() is factored out of ovl_do_rename(), which
itself will be gone by the end of the series.

Acked-by: Chuck Lever <chuck.lever@oracle.com> (for nfsd parts)
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: NeilBrown <neil@brown.name>

--
Changes since v3:
 - added missig dput() in ovl_rename when "whiteout" is not-NULL.

Changes since v2:
 - in __start_renaming() some label have been renamed, and err
   is always set before a "goto out_foo" rather than passing the
   error in a dentry*.
 - ovl_do_rename() changed to call the new ovl_do_rename_rd() rather
   than keeping duplicate code
 - code around ovl_cleanup() call in ovl_rename() restructured.

Link: https://patch.msgid.link/20251113002050.676694-11-neilb@ownmail.net
Tested-by: syzbot@syzkaller.appspotmail.com
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:57 +01:00
NeilBrown
ff7c4ea11a
VFS: add start_creating_killable() and start_removing_killable()
These are similar to start_creating() and start_removing(), but allow a
fatal signal to abort waiting for the lock.

They are used in btrfs for subvol creation and removal.

btrfs_may_create() no longer needs IS_DEADDIR() and
start_creating_killable() includes that check.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/20251113002050.676694-10-neilb@ownmail.net
Tested-by: syzbot@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:57 +01:00
NeilBrown
7bb1eb45e4
VFS: introduce start_removing_dentry()
start_removing_dentry() is similar to start_removing() but instead of
providing a name for lookup, the target dentry is given.

start_removing_dentry() checks that the dentry is still hashed and in
the parent, and if so it locks and increases the refcount so that
end_removing() can be used to finish the operation.

This is used in cachefiles, overlayfs, smb/server, and apparmor.

There will be other users including ecryptfs.

As start_removing_dentry() takes an extra reference to the dentry (to be
put by end_removing()), there is no need to explicitly take an extra
reference to stop d_delete() from using dentry_unlink_inode() to negate
the dentry - as in cachefiles_delete_object(), and ksmbd_vfs_unlink().

cachefiles_bury_object() now gets an extra ref to the victim, which is
drops.  As it includes the needed end_removing() calls, the caller
doesn't need them.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/20251113002050.676694-9-neilb@ownmail.net
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:57 +01:00
NeilBrown
1ead2213dd
smb/server: use end_removing_noperm for for target of smb2_create_link()
Sometimes smb2_create_link() needs to remove the target before creating
the link.
It uses ksmbd_vfs_kern_locked(), and is the only user of that interface.

To match the new naming, that function is changed to
ksmbd_vfs_kern_start_removing(), and related functions or flags are also
renamed.

The lock actually happens in ksmbd_vfs_path_lookup() and that is changed
to use start_removing_noperm() - permission to perform lookup in the
parent was already checked in vfs_path_parent_lookup().

Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/20251113002050.676694-8-neilb@ownmail.net
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:56 +01:00
NeilBrown
c9ba789dad
VFS: introduce start_creating_noperm() and start_removing_noperm()
xfs, fuse, ipc/mqueue need variants of start_creating or start_removing
which do not check permissions.
This patch adds _noperm versions of these functions.

Note that do_mq_open() was only calling mntget() so it could call
path_put() - it didn't really need an extra reference on the mnt.
Now it doesn't call mntget() and uses end_creating() which does
the dput() half of path_put().

Also mq_unlink() previously passed
   d_inode(dentry->d_parent)
as the dir inode to vfs_unlink().  This is after locking
   d_inode(mnt->mnt_root)
These two inodes are the same, but normally calls use the textual
parent.
So I've changes the vfs_unlink() call to be given d_inode(mnt->mnt_root).

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>

--
changes since v2:
 - dir arg passed to vfs_unlink() in mq_unlink() changed to match
   the dir passed to lookup_noperm()
 - restore assignment to path->mnt even though the mntget() is removed.

Link: https://patch.msgid.link/20251113002050.676694-7-neilb@ownmail.net
Tested-by: syzbot@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:56 +01:00
NeilBrown
bd6ede8a06
VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()
start_removing() is similar to start_creating() but will only return a
positive dentry with the expectation that it will be removed.  This is
used by nfsd, cachefiles, and overlayfs.  They are changed to also use
end_removing() to terminate the action begun by start_removing().  This
is a simple alias for end_dirop().

Apart from changes to the error paths, as we no longer need to unlock on
a lookup error, an effect on callers is that they don't need to test if
the found dentry is positive or negative - they can be sure it is
positive.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/20251113002050.676694-6-neilb@ownmail.net
Tested-by: syzbot@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:56 +01:00
NeilBrown
7ab96df840
VFS/nfsd/cachefiles/ovl: add start_creating() and end_creating()
start_creating() is similar to simple_start_creating() but is not so
simple.
It takes a qstr for the name, includes permission checking, and does NOT
report an error if the name already exists, returning a positive dentry
instead.

This is currently used by nfsd, cachefiles, and overlayfs.

end_creating() is called after the dentry has been used.
end_creating() drops the reference to the dentry as it is generally no
longer needed.  This is exactly the first section of end_creating_path()
so that function is changed to call the new end_creating()

These calls help encapsulate locking rules so that directory locking can
be changed.

Occasionally this change means that the parent lock is held for a
shorter period of time, for example in cachefiles_commit_tmpfile().
As this function now unlocks after an unlink and before the following
lookup, it is possible that the lookup could again find a positive
dentry, so a while loop is introduced there.

In overlayfs the ovl_lookup_temp() function has ovl_tempname()
split out to be used in ovl_start_creating_temp().  The other use
of ovl_lookup_temp() is preparing for a rename.  When rename handling
is updated, ovl_lookup_temp() will be removed.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/20251113002050.676694-5-neilb@ownmail.net
Tested-by: syzbot@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:56 +01:00
NeilBrown
3661a78874
VFS: tidy up do_unlinkat()
The simplification of locking in the previous patch opens up some room
for tidying up do_unlinkat()

- change all "exit" labels to describe what will happen at the label.
- always goto an exit label on an error - unwrap the "if (!IS_ERR())" branch.
- Move the "slashes" handing inline, but mark it as unlikely()
- simplify use of the "inode" variable - we no longer need to test for NULL.

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/20251113002050.676694-4-neilb@ownmail.net
Tested-by: syzbot@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:56 +01:00
NeilBrown
4037d966f0
VFS: introduce start_dirop() and end_dirop()
The fact that directory operations (create,remove,rename) are protected
by a lock on the parent is known widely throughout the kernel.
In order to change this - to instead lock the target dentry  - it is
best to centralise this knowledge so it can be changed in one place.

This patch introduces start_dirop() which is local to VFS code.
It performs the required locking for create and remove.  Rename
will be handled separately.

Various functions with names like start_creating() or start_removing_path(),
some of which already exist, will export this functionality beyond the VFS.

end_dirop() is the partner of start_dirop().  It drops the lock and
releases the reference on the dentry.
It *is* exported so that various end_creating etc functions can be inline.

As vfs_mkdir() drops the dentry on error we cannot use end_dirop() as
that won't unlock when the dentry IS_ERR().  For now we need an explicit
unlock when dentry IS_ERR().  I hope to change vfs_mkdir() to unlock
when it drops a dentry so that explicit unlock can go away.

end_dirop() can always be called on the result of start_dirop(), but not
after vfs_mkdir().  After a vfs_mkdir() we still may need the explicit
unlock as seen in end_creating_path().

As well as adding start_dirop() and end_dirop()
this patch uses them in:
 - simple_start_creating (which requires sharing lookup_noperm_common()
        with libfs.c)
 - start_removing_path / start_removing_user_path_at
 - filename_create / end_creating_path()
 - do_rmdir(), do_unlinkat()

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/20251113002050.676694-3-neilb@ownmail.net
Tested-by: syzbot@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:56 +01:00
NeilBrown
8b45b9a882
debugfs: rename end_creating() to debugfs_end_creating()
By not using the generic end_creating() name here we are free to use it
more globally for a more generic function.
This should have been done when start_creating() was renamed.

For consistency, also rename failed_creating().

Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/20251113002050.676694-2-neilb@ownmail.net
Tested-by: syzbot@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-14 13:15:55 +01:00
Linus Torvalds
0739473694 - Avoid -Wflex-array-member-not-at-end warnings
- Replace simple_strtoul with kstrtoint
 
 - Fix error code for new_inode() failure
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRnH8MwLyZDhyYfesYTAyx9YGnhbQUCaOlnMBQcbXBhdG9ja2FA
 cmVkaGF0LmNvbQAKCRATAyx9YGnhbWiLAQDOJdFzQxSdrX7KPmi+yMBqnrtL7TYD
 cazx/3zORdP2kAEA16PUvT7uEdrblf3ZmOfHGe6RDhCzff56ebRc2Y1HqA4=
 =pwvh
 -----END PGP SIGNATURE-----

Merge tag 'for-6.18/hpfs-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull hpfs updates from Mikulas Patocka:

 - Avoid -Wflex-array-member-not-at-end warnings

 - Replace simple_strtoul with kstrtoint

 - Fix error code for new_inode() failure

* tag 'for-6.18/hpfs-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  fs/hpfs: Fix error code for new_inode() failure in mkdir/create/mknod/symlink
  hpfs: Replace simple_strtoul with kstrtoint in hpfs_parse_param
  fs: hpfs: Avoid multiple -Wflex-array-member-not-at-end warnings
2025-10-10 14:06:02 -07:00
Linus Torvalds
8bd9238e51 Some messenger improvements from Eric and Max, a patch to address the
issue (also affected userspace) of incorrect permissions being granted
 to users who have access to multiple different CephFS instances within
 the same cluster from Kotresh and a bunch of assorted CephFS fixes from
 Slava.
 -----BEGIN PGP SIGNATURE-----
 
 iQFFBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmjpShsTHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzizoaB/C7qTw5Olh8NDpX8I+ljEO50XNduurf
 fp2eNn0ub5brhcvh8iACSPKE2oer/bDvv3b2SN9310GmBX3f7H2Ht5TeH2tGBRN0
 clg+C2DmY/2watHovo+ua7YAd+HiPH2XMbpeU38Pu1nEdmiU6cQ0YaOn8n2p+c1E
 bID0dMHWb4HTmFRURqWqKPDkM1fLHRxIVgyOMaov5vs0T7XdglwPja3S2W6epvqF
 hKSMSvO/j9qYlOsBM6G6IuHDMJomzBqOQKqsQqC4XZN6uXeaKPTLYRnzxKfJUEWj
 P5JTaum7NGGtfIs0L9wr6zpou/GY2zTFiyXhLZsLJMn894bBO5nArg==
 =wGIB
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-6.18-rc1' of https://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:

 - some messenger improvements (Eric and Max)

 - address an issue (also affected userspace) of incorrect permissions
   being granted to users who have access to multiple different CephFS
   instances within the same cluster (Kotresh)

 - a bunch of assorted CephFS fixes (Slava)

* tag 'ceph-for-6.18-rc1' of https://github.com/ceph/ceph-client:
  ceph: add bug tracking system info to MAINTAINERS
  ceph: fix multifs mds auth caps issue
  ceph: cleanup in ceph_alloc_readdir_reply_buffer()
  ceph: fix potential NULL dereference issue in ceph_fill_trace()
  libceph: add empty check to ceph_con_get_out_msg()
  libceph: pass the message pointer instead of loading con->out_msg
  libceph: make ceph_con_get_out_msg() return the message pointer
  ceph: fix potential race condition on operations with CEPH_I_ODIRECT flag
  ceph: refactor wake_up_bit() pattern of calling
  ceph: fix potential race condition in ceph_ioctl_lazyio()
  ceph: fix overflowed constant issue in ceph_do_objects_copy()
  ceph: fix wrong sizeof argument issue in register_session()
  ceph: add checking of wait_for_completion_killable() return value
  ceph: make ceph_start_io_*() killable
  libceph: Use HMAC-SHA256 library instead of crypto_shash
2025-10-10 11:30:19 -07:00
Linus Torvalds
91b436fc92 20 smb client fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmjpMA0ACgkQiiy9cAdy
 T1Emcgv/WWk7+nRyHQwnv4fLkIMXt+OwJUhVrHvxoMUjbi4L2635wyOWQJ2pYAdt
 oy3J6ozk/TkvSI7KzUGAo+5vVXjAkX3fKWWA8R2jrVcmSOgXGjVSm4NSQ2bO8B35
 7OXyW8ke1V5swzRGUhmn0Ppv6As8Hb77puZD7NzcYLLKhRTeuL9DuPyy2kNYYnIN
 WL6Cs3wApRAwNw7d6bI/0IQZPG2z0F0ShnLEytlgCxrifpMLSwj+8rfu8oy1Ks0S
 9NzeZtku9RIct+zkgDUS0Msg6+w6greynaLy2UzN6Ki4pgOp48sbPZ3TRVT0iNPC
 amtgqazY3+1Pbs6cdGcSv2VeJCJ9RvF9dzqmQFEIsPysOhVCjDvyQ3oQru2a7lsQ
 ImPqwmMGSW7t1jtBJsKDxVKcANgDM0w6GdPGaBPlN3Zpw+jVO1UgA+7fWKTW+30L
 JiKjBQYYXcAieFJsHUkcpWDTi6asr0Xcmx4Sy2nnJ9msDmdsevlLQzY7hBSFySjn
 /tvSQiQf
 =/qIG
 -----END PGP SIGNATURE-----

Merge tag 'v6.18-rc-part2-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull more smb client updates from Steve French:

 - fix i_size in fallocate

 - two truncate fixes

 - utime fix

 - minor cleanups

 - SMB1 fixes

 - improve error check in read

 - improve perf of copy file_range (copy_chunk)

* tag 'v6.18-rc-part2-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal version number
  cifs: Add comments for DeletePending assignments in open functions
  cifs: Add fallback code path for cifs_mkdir_setinfo()
  cifs: Allow fallback code in smb_set_file_info() also for directories
  cifs: Query EA $LXMOD in cifs_query_path_info() for WSL reparse points
  smb: client: remove cfids_invalidation_worker
  smb: client: remove redudant assignment in cifs_strict_fsync()
  smb: client: fix race with fallocate(2) and AIO+DIO
  smb: client: fix missing timestamp updates after utime(2)
  smb: client: fix missing timestamp updates after ftruncate(2)
  smb: client: fix missing timestamp updates with O_TRUNC
  cifs: Fix copy_to_iter return value check
  smb: client: batch SRV_COPYCHUNK entries to cut round trips
  smb: client: Omit an if branch in smb2_find_smb_tcon()
  smb: client: Return directly after a failed genlmsg_new() in cifs_swn_send_register_message()
  smb: client: Use common code in cifs_do_create()
  smb: client: Improve unlocking of a mutex in cifs_get_swn_reg()
  smb: client: Return a status code only as a constant in cifs_spnego_key_instantiate()
  smb: client: Use common code in cifs_lookup()
  smb: client: Reduce the scopes for a few variables in two functions
2025-10-10 11:23:57 -07:00
Linus Torvalds
1b1391b9c4 block-6.18-20251009
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmjoDmUQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgptyjD/94YYv1sabG9M6UHq7j9lOAgruqXaaEMOw+
 Blnm4ejuLNcM8FMCBcuvbhp3ktzT7v1/bWal7FLnmujuKBfhAe+t2AVcHFWUQie2
 CIfMjc3p77U/bwL5wt0O5WFqu1UPDVe+qzrppqRYduTxvPKk9Fi6mqpYCXKlYN7K
 FhINsytoZp/CvTdf5EDSsPv2r4W85OhrPeq0VjYufFBD1wxXD94ii8WAvyfsl20s
 0gIfdlfa2vaNVwH1kdCd+IeATrSBpyCZKGEVTzcHYoo/1MgfNFigrJ8GUA5c+DLM
 fmNE+E+wFtobq5WBmbrtmAxtBnzzV49HS1OT1amUktuq87ryiY5Svn6vFAqEJQl6
 2HLE9nNN2PBdPMAmQ57u1bvp/3nGD0mk/hC1666MTDxHpxg5c6cugCSlJGVG+uC9
 ShLgi8bWV6RXelso0qMaSmNNCA8dskxJg/YDJ06AViTSuW8Y1+adoXddCjE7jne9
 3lci/r2WiuwqTJuub9D7LUtC7VhbCY19VVkgDE64VB2+CjR8B9AlLVG3sGl1HDOY
 EFAddJ3lAEOz5F1H2AzcOBPqqeBfuipr6lEpdb9+6hNu5wRILAHtme8W76c4PtuF
 PRk/3JYcHE77DZlFeE+iN8n0y1tNdWR/6QzWIOsGcNlUyeGGV/zvgGOodtFRpHt2
 t7Eue56EFw==
 =/1jf
 -----END PGP SIGNATURE-----

Merge tag 'block-6.18-20251009' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull block fixes from Jens Axboe:

 - Don't include __GFP_NOWARN for loop worker allocation, as it already
   uses GFP_NOWAIT which has __GFP_NOWARN set already

 - Small series cleaning up the recent bio_iov_iter_get_pages() changes

 - loop fix for leaking the backing reference file, if validation fails

 - Update of a comment pertaining to disk/partition stat locking

* tag 'block-6.18-20251009' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  loop: remove redundant __GFP_NOWARN flag
  block: move bio_iov_iter_get_bdev_pages to block/fops.c
  iomap: open code bio_iov_iter_get_bdev_pages
  block: rename bio_iov_iter_get_pages_aligned to bio_iov_iter_get_pages
  block: remove bio_iov_iter_get_pages
  block: Update a comment of disk statistics
  loop: fix backing file reference leak on validation error
2025-10-10 10:37:13 -07:00
Steve French
b30c32c784 cifs: update internal version number
to 2.57

Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-10 11:10:01 -05:00
Pali Rohár
fa9fe87150 cifs: Add comments for DeletePending assignments in open functions
On more places is set DeletePending member to 0. Add comments why is 0 the
correct value. Paths in DELETE_PENDING state cannot be opened by new calls.
So if the newly issued open for that path succeed then it means that the
path cannot be in DELETE_PENDING state.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-09 23:01:24 -05:00
Pali Rohár
92210ccd87 cifs: Add fallback code path for cifs_mkdir_setinfo()
Use SMBSetInformation() as a fallback function (when CIFSSMBSetPathInfo()
fails) which can set attribudes on the directory, including changing
read-only attribute.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-09 23:01:24 -05:00
Pali Rohár
88cae132dc cifs: Allow fallback code in smb_set_file_info() also for directories
On NT systems, it is possible to do SMB open call also for directories.
Open argument CREATE_NOT_DIR disallows opening directories. So in fallback
code path in smb_set_file_info() remove CREATE_NOT_DIR restriction to allow
it also for directories.

Similar fallback is implemented also in CIFSSMBSetPathInfoFB() function and
this function already allows to call operation for directories.

Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-09 23:01:24 -05:00
Pali Rohár
057ac50638 cifs: Query EA $LXMOD in cifs_query_path_info() for WSL reparse points
EA $LXMOD is required for WSL non-symlink reparse points.

Fixes: ef86ab131d ("cifs: Fix querying of WSL CHR and BLK reparse points over SMB1")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-09 23:01:24 -05:00
Linus Torvalds
80b7065ec1 Bunch of unrelated fixes
- polling fix for trans fd that ought to have been fixed otherwise back
 in March, but apparently came back somewhere else...
 - USB transport buffer overflow fix
 - Some dentry lifetime rework to handle metadata update for currently
 opened files in uncached mode, or inode type change in cached mode
 - a double-put on invalid flush found by syzbot
 - and finally /sys/fs/9p/caches not advancing buffer and overwriting
 itself for large contents
 
 Thanks to everyone involved!
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE/IPbcYBuWt0zoYhOq06b7GqY5nAFAmjns5cACgkQq06b7GqY
 5nDGLg/+Ltsiwj51CDwyJxFPX4ki1jM3j2iA8BNpdjjzCev6Jp1vuWBPiV+sTtgM
 zKmIw9joyvzaIXKzA3h7tMe7h7y8H4p7KZyD3YSDSNW5XIb5UYXeQwYVw39OxENk
 nVQkQ41H+bPepmz3R0t/F4la9myVLBtfdcPWJN+JfkOLgdeemCGf+TShJ6F+9Qaq
 PTd3UKfdv8JRlKQmeLH7pPlBoWYRoC2Tq3pZbyC3D1A50bWkTfxusW0k3MUzRyHO
 t/jNbhHt0ai+densQ6O5M+ALMI7KIIzR+obVaBHfzSUcwFeGhmW2x84Mo0qX6YMF
 0B/fY3mCXi8QO6my1hfwmbRtY5mEV967wM/uF7E/SuvyNVizOMBZn8FgWfVPVszr
 ix1iNBgQJdoOEMbHvDfCAtGNorFrMiybyLvoabL7YCwFo68LdYqj2br3/feajoYM
 5g46nUcidH4fVBLtiBGE8w0ko8aa3zLB4iu/fdATuEZ1r+gPt40SWo1mKahoCgY3
 BOtYxEZTnd4l4GPJDQIaYgVckgSiex0xcE9pRJ5LtBwdyw9g4jwfb39WHx4MT2sE
 u5xWDbadU/ZM6ppLJ/iBkotvU9D7ohKTviN1IfCwjqL7IMBNz5c+nBwLQZWK4YTR
 2ySKpv1VOMROglt1KvnHfdpmdJ+CjAJLLNE+XSz9AHXinK5v8aM=
 =bUN1
 -----END PGP SIGNATURE-----

Merge tag '9p-for-6.18-rc1' of https://github.com/martinetd/linux

Pull 9p updates from Dominique Martinet:
 "A bunch of unrelated fixes:

   - polling fix for trans fd that ought to have been fixed otherwise
     back in March, but apparently came back somewhere else...

   - USB transport buffer overflow fix

   - Some dentry lifetime rework to handle metadata update for currently
     opened files in uncached mode, or inode type change in cached mode

   - a double-put on invalid flush found by syzbot

   - and finally /sys/fs/9p/caches not advancing buffer and overwriting
     itself for large contents

  Thanks to everyone involved!"

* tag '9p-for-6.18-rc1' of https://github.com/martinetd/linux:
  9p: sysfs_init: don't hardcode error to ENOMEM
  9p: fix /sys/fs/9p/caches overwriting itself
  9p: clean up comment typos
  9p/trans_fd: p9_fd_request: kick rx thread if EPOLLIN
  net/9p: fix double req put in p9_fd_cancelled
  net/9p: Fix buffer overflow in USB transport layer
  fs/9p: Add p9_debug(VFS) in d_revalidate
  fs/9p: Invalidate dentry if inode type change detected in cached mode
  fs/9p: Refresh metadata in d_revalidate for uncached mode too
2025-10-09 11:56:59 -07:00
Enzo Matsumiya
7ae6152b78 smb: client: remove cfids_invalidation_worker
We can do the same cleanup on laundromat.

On invalidate_all_cached_dirs(), run laundromat worker with 0 timeout
and flush it for immediate + sync cleanup.

Signed-off-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-09 11:18:09 -05:00
Paulo Alcantara
be3898a395 smb: client: remove redudant assignment in cifs_strict_fsync()
Remove redudant assignment of @rc as it will be overwritten by the
following cifs_file_flush() call.

Reported-by: Steve French <stfrench@microsoft.com>
Addresses-Coverity: 1665925
Fixes: 210627b0aca9 ("smb: client: fix missing timestamp updates with O_TRUNC")
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-09 11:16:25 -05:00
Paulo Alcantara
dba9f997c9 smb: client: fix race with fallocate(2) and AIO+DIO
AIO+DIO may extend the file size, hence we need to make sure ->i_size
is stable across the entire fallocate(2) operation, otherwise it would
become a truncate and then inode size reduced back down when it
finishes.

Fix this by calling netfs_wait_for_outstanding_io() right after
acquiring ->i_rwsem exclusively in cifs_fallocate() and then guarantee
a stable ->i_size across fallocate(2).

Also call netfs_wait_for_outstanding_io() after truncating pagecache
to avoid any potential races with writeback.

Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Fixes: 210627b0aca9 ("smb: client: fix missing timestamp updates with O_TRUNC")
Cc: Frank Sorenson <sorenson@redhat.com>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-09 10:42:14 -05:00
Paulo Alcantara
b95cd1bdf5 smb: client: fix missing timestamp updates after utime(2)
Don't reuse open handle when changing timestamps to prevent the server
from disabling automatic timestamp updates as per MS-FSA 2.1.4.17.

---8<---
import os
import time

filename = '/mnt/foo'

def print_stat(prefix):
    st = os.stat(filename)
    print(prefix, ': ', time.ctime(st.st_atime), time.ctime(st.st_ctime))

fd = os.open(filename, os.O_CREAT|os.O_TRUNC|os.O_WRONLY, 0o644)
print_stat('old')
os.utime(fd, None)
time.sleep(2)
os.write(fd, b'foo')
os.close(fd)
time.sleep(2)
print_stat('new')
---8<---

Before patch:

$ mount.cifs //srv/share /mnt -o ...
$ python3 run.py
old :  Fri Oct  3 14:01:21 2025 Fri Oct  3 14:01:21 2025
new :  Fri Oct  3 14:01:21 2025 Fri Oct  3 14:01:21 2025

After patch:

$ mount.cifs //srv/share /mnt -o ...
$ python3 run.py
old :  Fri Oct  3 17:03:34 2025 Fri Oct  3 17:03:34 2025
new :  Fri Oct  3 17:03:36 2025 Fri Oct  3 17:03:36 2025

Fixes: b6f2a0f89d ("cifs: for compound requests, use open handle if possible")
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Cc: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-09 10:42:14 -05:00
Paulo Alcantara
57ce9f7793 smb: client: fix missing timestamp updates after ftruncate(2)
Mask off ATTR_MTIME|ATTR_CTIME bits on ATTR_SIZE (e.g. ftruncate(2))
to prevent the client from sending set info calls and then disabling
automatic timestamp updates on server side as per MS-FSA 2.1.4.17.

---8<---
import os
import time

filename = '/mnt/foo'

def print_stat(prefix):
    st = os.stat(filename)
    print(prefix, ': ', time.ctime(st.st_atime), time.ctime(st.st_ctime))

fd = os.open(filename, os.O_CREAT|os.O_TRUNC|os.O_WRONLY, 0o644)
print_stat('old')
os.ftruncate(fd, 10)
time.sleep(2)
os.write(fd, b'foo')
os.close(fd)
time.sleep(2)
print_stat('new')
---8<---

Before patch:

$ mount.cifs //srv/share /mnt -o ...
$ python3 run.py
old :  Fri Oct  3 13:47:03 2025 Fri Oct  3 13:47:03 2025
new :  Fri Oct  3 13:47:00 2025 Fri Oct  3 13:47:03 2025

After patch:

$ mount.cifs //srv/share /mnt -o ...
$ python3 run.py
old :  Fri Oct  3 13:48:39 2025 Fri Oct  3 13:48:39 2025
new :  Fri Oct  3 13:48:41 2025 Fri Oct  3 13:48:41 2025

Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Cc: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-09 10:42:14 -05:00
Paulo Alcantara
110fee6b9b smb: client: fix missing timestamp updates with O_TRUNC
Don't call ->set_file_info() on open handle to prevent the server from
stopping [cm]time updates automatically as per MS-FSA 2.1.4.17.

Fix this by checking for ATTR_OPEN bit earlier in cifs_setattr() to
prevent ->set_file_info() from being called when opening a file with
O_TRUNC.  Do the truncation in ->open() instead.

This also saves two roundtrips when opening a file with O_TRUNC and
there are currently no open handles to be reused.

Before patch:

$ mount.cifs //srv/share /mnt -o ...
$ cd /mnt
$ exec 3>foo; stat -c 'old: %z %y' foo; sleep 2; echo test >&3; exec 3>&-; sleep 2; stat -c 'new: %z %y' foo
old: 2025-10-03 13:26:23.151030500 -0300 2025-10-03 13:26:23.151030500 -0300
new: 2025-10-03 13:26:23.151030500 -0300 2025-10-03 13:26:23.151030500 -0300

After patch:

$ mount.cifs //srv/share /mnt -o ...
$ cd /mnt
$ exec 3>foo; stat -c 'old: %z %y' foo; sleep 2; echo test >&3; exec 3>&-; sleep 2; stat -c 'new: %z %y' foo
$ exec 3>foo; stat -c 'old: %z %y' foo; sleep 2; echo test >&3; exec 3>&-; sleep 2; stat -c 'new: %z %y' foo
old: 2025-10-03 13:28:13.911933800 -0300 2025-10-03 13:28:13.911933800 -0300
new: 2025-10-03 13:28:26.647492700 -0300 2025-10-03 13:28:26.647492700 -0300

Reported-by: Frank Sorenson <sorenson@redhat.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-09 10:42:14 -05:00
Fushuai Wang
0cc380d0e1 cifs: Fix copy_to_iter return value check
The return value of copy_to_iter() function will never be negative,
it is the number of bytes copied, or zero if nothing was copied.
Update the check to treat 0 as an error, and return -1 in that case.

Fixes: d08089f649 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Acked-by: Tom Talpey <tom@talpey.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Fushuai Wang <wangfushuai@baidu.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-09 10:42:14 -05:00
Henrique Carvalho
68d2e2ca1c smb: client: batch SRV_COPYCHUNK entries to cut round trips
smb2_copychunk_range() used to send a single SRV_COPYCHUNK per
SRV_COPYCHUNK_COPY IOCTL.

Implement variable Chunks[] array in struct copychunk_ioctl and fill it
with struct copychunk (MS-SMB2 2.2.31.1.1), bounded by server-advertised
limits.

This reduces the number of IOCTL requests for large copies.

While we are at it, rename a couple variables to follow the terminology
used in the specification.

Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-09 10:42:14 -05:00
Markus Elfring
1643cd51ba smb: client: Omit an if branch in smb2_find_smb_tcon()
Statements from an if branch and the end of this function implementation
were equivalent.
Thus delete duplicate source code.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-09 10:40:43 -05:00
Kotresh HR
22c73d52a6 ceph: fix multifs mds auth caps issue
The mds auth caps check should also validate the
fsname along with the associated caps. Not doing
so would result in applying the mds auth caps of
one fs on to the other fs in a multifs ceph cluster.
The bug causes multiple issues w.r.t user
authentication, following is one such example.

Steps to Reproduce (on vstart cluster):
1. Create two file systems in a cluster, say 'fsname1' and 'fsname2'
2. Authorize read only permission to the user 'client.usr' on fs 'fsname1'
    $ceph fs authorize fsname1 client.usr / r
3. Authorize read and write permission to the same user 'client.usr' on fs 'fsname2'
    $ceph fs authorize fsname2 client.usr / rw
4. Update the keyring
    $ceph auth get client.usr >> ./keyring

With above permssions for the user 'client.usr', following is the
expectation.
  a. The 'client.usr' should be able to only read the contents
     and not allowed to create or delete files on file system 'fsname1'.
  b. The 'client.usr' should be able to read/write on file system 'fsname2'.

But, with this bug, the 'client.usr' is allowed to read/write on file
system 'fsname1'. See below.

5. Mount the file system 'fsname1' with the user 'client.usr'
     $sudo bin/mount.ceph usr@.fsname1=/ /kmnt_fsname1_usr/
6. Try creating a file on file system 'fsname1' with user 'client.usr'. This
   should fail but passes with this bug.
     $touch /kmnt_fsname1_usr/file1
7. Mount the file system 'fsname1' with the user 'client.admin' and create a
   file.
     $sudo bin/mount.ceph admin@.fsname1=/ /kmnt_fsname1_admin
     $echo "data" > /kmnt_fsname1_admin/admin_file1
8. Try removing an existing file on file system 'fsname1' with the user
   'client.usr'. This shoudn't succeed but succeeds with the bug.
     $rm -f /kmnt_fsname1_usr/admin_file1

For more information, please take a look at the corresponding mds/fuse patch
and tests added by looking into the tracker mentioned below.

v2: Fix a possible null dereference in doutc
v3: Don't store fsname from mdsmap, validate against
    ceph_mount_options's fsname and use it
v4: Code refactor, better warning message and
    fix possible compiler warning

[ Slava.Dubeyko: "fsname check failed" -> "fsname mismatch" ]

Link: https://tracker.ceph.com/issues/72167
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2025-10-08 23:30:47 +02:00
Viacheslav Dubeyko
c66120c842 ceph: cleanup in ceph_alloc_readdir_reply_buffer()
The Coverity Scan service has reported potential issue
in ceph_alloc_readdir_reply_buffer() [1]. If order could
be negative one, then it expects the issue in the logic:

num_entries = (PAGE_SIZE << order) / size;

Technically speaking, this logic [2] should prevent from
making the order variable negative:

if (!rinfo->dir_entries)
    return -ENOMEM;

However, the allocation logic requires some cleanup.
This patch makes sure that calculated bytes count
will never exceed ULONG_MAX before get_order()
calculation. And it adds the checking of order
variable on negative value to guarantee that second
half of the function's code will never operate by
negative value of order variable even if something
will be wrong or to be changed in the first half of
the function's logic.

v2
Alex Markuze suggested to add unlikely() macro
for introduced condition checks.

[1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1198252
[2] https://elixir.bootlin.com/linux/v6.17-rc3/source/fs/ceph/mds_client.c#L2553

Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2025-10-08 23:30:47 +02:00
Viacheslav Dubeyko
98a2850de4 ceph: fix potential NULL dereference issue in ceph_fill_trace()
The Coverity Scan service has detected a potential dereference of
an explicit NULL value in ceph_fill_trace() [1].

The variable in is declared in the beggining of
ceph_fill_trace() [2]:

struct inode *in = NULL;

However, the initialization of the variable is happening under
condition [3]:

if (rinfo->head->is_target) {
    <skipped>
    in = req->r_target_inode;
    <skipped>
}

Potentially, if rinfo->head->is_target == FALSE, then
in variable continues to be NULL and later the dereference of
NULL value could happen in ceph_fill_trace() logic [4,5]:

else if ((req->r_op == CEPH_MDS_OP_LOOKUPSNAP ||
            req->r_op == CEPH_MDS_OP_MKSNAP) &&
            test_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags) &&
             !test_bit(CEPH_MDS_R_ABORTED, &req->r_req_flags)) {
<skipped>
     ihold(in);
     err = splice_dentry(&req->r_dentry, in);
     if (err < 0)
         goto done;
}

This patch adds the checking of in variable for NULL value
and it returns -EINVAL error code if it has NULL value.

v2
Alex Markuze suggested to add unlikely macro
in the checking condition.

[1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1141197
[2] https://elixir.bootlin.com/linux/v6.17-rc3/source/fs/ceph/inode.c#L1522
[3] https://elixir.bootlin.com/linux/v6.17-rc3/source/fs/ceph/inode.c#L1629
[4] https://elixir.bootlin.com/linux/v6.17-rc3/source/fs/ceph/inode.c#L1745
[5] https://elixir.bootlin.com/linux/v6.17-rc3/source/fs/ceph/inode.c#L1777

Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2025-10-08 23:30:47 +02:00
Viacheslav Dubeyko
fbeafe782b ceph: fix potential race condition on operations with CEPH_I_ODIRECT flag
The Coverity Scan service has detected potential
race conditions in ceph_block_o_direct(), ceph_start_io_read(),
ceph_block_buffered(), and ceph_start_io_direct() [1 - 4].

The CID 1590942, 1590665, 1589664, 1590377 contain explanation:
"The value of the shared data will be determined by
the interleaving of thread execution. Thread shared data is accessed
without holding an appropriate lock, possibly causing
a race condition (CWE-366)".

This patch reworks the pattern of accessing/modification of
CEPH_I_ODIRECT flag by means of adding smp_mb__before_atomic()
before reading the status of CEPH_I_ODIRECT flag and
smp_mb__after_atomic() after clearing set/clear this flag.
Also, it was reworked the pattern of using of ci->i_ceph_lock
in ceph_block_o_direct(), ceph_start_io_read(),
ceph_block_buffered(), and ceph_start_io_direct() methods.

[1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1590942
[2] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1590665
[3] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1589664
[4] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1590377

Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2025-10-08 23:30:46 +02:00
Viacheslav Dubeyko
53db6f25ee ceph: refactor wake_up_bit() pattern of calling
The wake_up_bit() is called in ceph_async_unlink_cb(),
wake_async_create_waiters(), and ceph_finish_async_create().
It makes sense to switch on clear_bit() function, because
it makes the code much cleaner and easier to understand.
More important rework is the adding of smp_mb__after_atomic()
memory barrier after the bit modification and before
wake_up_bit() call. It can prevent potential race condition
of accessing the modified bit in other threads. Luckily,
clear_and_wake_up_bit() already implements the required
functionality pattern:

static inline void clear_and_wake_up_bit(int bit, unsigned long *word)
{
	clear_bit_unlock(bit, word);
	/* See wake_up_bit() for which memory barrier you need to use. */
	smp_mb__after_atomic();
	wake_up_bit(word, bit);
}

Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2025-10-08 23:30:46 +02:00
Viacheslav Dubeyko
5824ccba9a ceph: fix potential race condition in ceph_ioctl_lazyio()
The Coverity Scan service has detected potential
race condition in ceph_ioctl_lazyio() [1].

The CID 1591046 contains explanation: "Check of thread-shared
field evades lock acquisition (LOCK_EVASION). Thread1 sets
fmode to a new value. Now the two threads have an inconsistent
view of fmode and updates to fields correlated with fmode
may be lost. The data guarded by this critical section may
be read while in an inconsistent state or modified by multiple
racing threads. In ceph_ioctl_lazyio: Checking the value of
a thread-shared field outside of a locked region to determine
if a locked operation involving that thread shared field
has completed. (CWE-543)".

The patch places fi->fmode field access under ci->i_ceph_lock
protection. Also, it introduces the is_file_already_lazy
variable that is set under the lock and it is checked later
out of scope of critical section.

[1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1591046

Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2025-10-08 23:30:46 +02:00
Viacheslav Dubeyko
5b2d1377d6 ceph: fix overflowed constant issue in ceph_do_objects_copy()
The Coverity Scan service has detected overflowed constant
issue in ceph_do_objects_copy() [1]. The CID 1624308
defect contains explanation: "The overflowed value due to
arithmetic on constants is too small or unexpectedly
negative, causing incorrect computations. Expression bytes,
which is equal to -95, where ret is known to be equal to -95,
underflows the type that receives it, an unsigned integer
64 bits wide. In ceph_do_objects_copy: Integer overflow occurs
in arithmetic on constant operands (CWE-190)".

The patch changes the type of bytes variable from size_t
to ssize_t with the goal of to be capable to receive
negative values.

[1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1624308

Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2025-10-08 23:30:46 +02:00
Viacheslav Dubeyko
1ed4471a4e ceph: fix wrong sizeof argument issue in register_session()
The Coverity Scan service has detected the wrong sizeof
argument in register_session() [1]. The CID 1598909 defect
contains explanation: "The wrong sizeof value is used in
an expression or as argument to a function. The result is
an incorrect value that may cause unexpected program behaviors.
In register_session: The sizeof operator is invoked on
the wrong argument (CWE-569)".

The patch introduces a ptr_size variable that is initialized
by sizeof(struct ceph_mds_session *). And this variable is used
instead of sizeof(void *) in the code.

[1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1598909

Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2025-10-08 23:30:46 +02:00
Viacheslav Dubeyko
b7ed1e29cf ceph: add checking of wait_for_completion_killable() return value
The Coverity Scan service has detected the calling of
wait_for_completion_killable() without checking the return
value in ceph_lock_wait_for_completion() [1]. The CID 1636232
defect contains explanation: "If the function returns an error
value, the error value may be mistaken for a normal value.
In ceph_lock_wait_for_completion(): Value returned from
a function is not checked for errors before being used. (CWE-252)".

The patch adds the checking of wait_for_completion_killable()
return value and return the error code from
ceph_lock_wait_for_completion().

[1] https://scan5.scan.coverity.com/#/project-view/64304/10063?selectedIssue=1636232

Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2025-10-08 23:30:46 +02:00
Max Kellermann
fa07303946 ceph: make ceph_start_io_*() killable
This allows killing processes that wait for a lock when one process is
stuck waiting for the Ceph server.  This is similar to the NFS commit
38a125b315 ("fs/nfs/io: make nfs_start_io_*() killable").

[ idryomov: drop comment on include, formatting ]

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2025-10-08 23:30:46 +02:00
Markus Elfring
4468490251 smb: client: Return directly after a failed genlmsg_new() in cifs_swn_send_register_message()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* Return directly after a call of the function “genlmsg_new” failed
  at the beginning.

* Delete the label “fail” which became unnecessary
  with this refactoring.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-07 14:28:16 -05:00
Markus Elfring
ce47f74985 smb: client: Use common code in cifs_do_create()
Use a label once more so that a bit of common code can be better reused
at the end of this function implementation.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Enzo Matsumiya <ematsumiya@suse.de>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-07 13:31:27 -05:00
Markus Elfring
e2080b70c5 smb: client: Improve unlocking of a mutex in cifs_get_swn_reg()
Use two additional labels so that another bit of common code can be better
reused at the end of this function implementation.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-07 11:12:23 -05:00
Markus Elfring
0a98b40b8f smb: client: Return a status code only as a constant in cifs_spnego_key_instantiate()
* Return a status code without storing it in an intermediate variable.

* Delete the local variable “ret” and the label “error”
  which became unnecessary with this refactoring.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-07 11:12:19 -05:00
Christoph Hellwig
cb6d51a411 iomap: open code bio_iov_iter_get_bdev_pages
Prepare for passing different alignments, and to retired
bio_iov_iter_get_bdev_pages as a global helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-10-07 08:05:44 -06:00
Markus Elfring
61da08ecb5 smb: client: Use common code in cifs_lookup()
Use three additional labels so that another bit of common code can be
better reused at the end of this function implementation.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-06 16:43:44 -05:00