linux/include
Christian Brauner 620c266f39 fhandle: relax open_by_handle_at() permission checks
A current limitation of open_by_handle_at() is that it's currently not possible
to use it from within containers at all because we require CAP_DAC_READ_SEARCH
in the initial namespace. That's unfortunate because there are scenarios where
using open_by_handle_at() from within containers.

Two examples:

(1) cgroupfs allows to encode cgroups to file handles and reopen them with
    open_by_handle_at().
(2) Fanotify allows placing filesystem watches they currently aren't usable in
    containers because the returned file handles cannot be used.

Here's a proposal for relaxing the permission check for open_by_handle_at().

(1) Opening file handles when the caller has privileges over the filesystem
    (1.1) The caller has an unobstructed view of the filesystem.
    (1.2) The caller has permissions to follow a path to the file handle.

This doesn't address the problem of opening a file handle when only a portion
of a filesystem is exposed as is common in containers by e.g., bind-mounting a
subtree. The proposal to solve this use-case is:

(2) Opening file handles when the caller has privileges over a subtree
    (2.1) The caller is able to reach the file from the provided mount fd.
    (2.2) The caller has permissions to construct an unobstructed path to the
          file handle.
    (2.3) The caller has permissions to follow a path to the file handle.

The relaxed permission checks are currently restricted to directory file
handles which are what both cgroupfs and fanotify need. Handling disconnected
non-directory file handles would lead to a potentially non-deterministic api.
If a disconnected non-directory file handle is provided we may fail to decode
a valid path that we could use for permission checking. That in itself isn't a
problem as we would just return EACCES in that case. However, confusion may
arise if a non-disconnected dentry ends up in the cache later and those opening
the file handle would suddenly succeed.

* It's potentially possible to use timing information (side-channel) to infer
  whether a given inode exists. I don't think that's particularly
  problematic. Thanks to Jann for bringing this to my attention.

* An unrelated note (IOW, these are thoughts that apply to
  open_by_handle_at() generically and are unrelated to the changes here):
  Jann pointed out that we should verify whether deleted files could
  potentially be reopened through open_by_handle_at(). I don't think that's
  possible though.

  Another potential thing to check is whether open_by_handle_at() could be
  abused to open internal stuff like memfds or gpu stuff. I don't think so
  but I haven't had the time to completely verify this.

This dates back to discussions Amir and I had quite some time ago and thanks to
him for providing a lot of details around the export code and related patches!

Link: https://lore.kernel.org/r/20240524-vfs-open_by_handle_at-v1-1-3d4b7d22736b@kernel.org
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-05-28 15:57:23 +02:00
..
acpi The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
asm-generic asm-generic cleanups for 6.10 2024-05-20 15:18:34 -07:00
clocksource
crypto This push fixes a bug in the new ecc P521 code as well as a buggy 2024-05-20 08:47:54 -07:00
drm drm fixes for 6.10-rc1 2024-05-24 17:28:02 -07:00
dt-bindings - Core Frameworks 2024-05-22 10:49:54 -07:00
keys Hi, 2024-05-13 10:40:15 -07:00
kunit kunit: Print last test location on fault 2024-05-06 14:22:02 -06:00
kvm Merge branch kvm-arm64/misc-6.10 into kvmarm-master/next 2024-05-08 16:41:50 +01:00
linux fhandle: relax open_by_handle_at() permission checks 2024-05-28 15:57:23 +02:00
math-emu
media media: cec.h: Fix kerneldoc 2024-05-04 10:19:59 +02:00
memory
misc
net more s390 updates for 6.10 merge window 2024-05-21 12:09:36 -07:00
pcmcia
ras tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
rdma The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
rv
scsi SCSI misc on 20240514 2024-05-14 18:25:53 -07:00
soc I'm actually surprised this time. There aren't any new Qualcomm SoC clk 2024-05-18 12:48:37 -07:00
sound ASoC: Fixes for v6.10 2024-05-23 13:29:27 +02:00
target
trace block-6.10-20240523 2024-05-23 13:44:47 -07:00
uapi drm fixes for 6.10-rc1 2024-05-24 17:28:02 -07:00
ufs
vdso
video
xen