Commit graph

597 commits

Author SHA1 Message Date
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Konstantin Andreev
33d589ed60 smack: /smack/doi: accept previously used values
Writing to /smack/doi a value that has ever been
written there in the past disables networking for
non-ambient labels.
E.g.

    # cat /smack/doi
    3
    # netlabelctl -p cipso list
    Configured CIPSO mappings (1)
     DOI value : 3
       mapping type : PASS_THROUGH
    # netlabelctl -p map list
    Configured NetLabel domain mappings (3)
     domain: "_" (IPv4)
       protocol: UNLABELED
     domain: DEFAULT (IPv4)
       protocol: CIPSO, DOI = 3
     domain: DEFAULT (IPv6)
       protocol: UNLABELED

    # cat /smack/ambient
    _
    # cat /proc/$$/attr/smack/current
    _
    # ping -c1 10.1.95.12
    64 bytes from 10.1.95.12: icmp_seq=1 ttl=64 time=0.964 ms
    # echo foo >/proc/$$/attr/smack/current
    # ping -c1 10.1.95.12
    64 bytes from 10.1.95.12: icmp_seq=1 ttl=64 time=0.956 ms
    unknown option 86

    # echo 4 >/smack/doi
    # echo 3 >/smack/doi
!>  [  214.050395] smk_cipso_doi:691 cipso add rc = -17
    # echo 3 >/smack/doi
!>  [  249.402261] smk_cipso_doi:678 remove rc = -2
!>  [  249.402261] smk_cipso_doi:691 cipso add rc = -17

    # ping -c1 10.1.95.12
!!> ping: 10.1.95.12: Address family for hostname not supported

    # echo _ >/proc/$$/attr/smack/current
    # ping -c1 10.1.95.12
    64 bytes from 10.1.95.12: icmp_seq=1 ttl=64 time=0.617 ms

This happens because Smack keeps decommissioned DOIs,
fails to re-add them, and consequently refuses to add
the “default” domain map:

    # netlabelctl -p cipso list
    Configured CIPSO mappings (2)
     DOI value : 3
       mapping type : PASS_THROUGH
     DOI value : 4
       mapping type : PASS_THROUGH
    # netlabelctl -p map list
    Configured NetLabel domain mappings (2)
     domain: "_" (IPv4)
       protocol: UNLABELED
!>  (no ipv4 map for default domain here)
     domain: DEFAULT (IPv6)
       protocol: UNLABELED

Fix by clearing decommissioned DOI definitions and
serializing concurrent DOI updates with a new lock.

Also:
- allow /smack/doi to live unconfigured, since
  adding a map (netlbl_cfg_cipsov4_map_add) may fail.
  CIPSO_V4_DOI_UNKNOWN(0) indicates the unconfigured DOI
- add new DOI before removing the old default map,
  so the old map remains if the add fails

(2008-02-04, Casey Schaufler)
Fixes: e114e47377 ("Smack: Simplified Mandatory Access Control Kernel")

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-12-30 12:17:15 -08:00
Konstantin Andreev
19c013e155 smack: /smack/doi must be > 0
/smack/doi allows writing and keeping negative doi values.
Correct values are 0 < doi <= (max 32-bit positive integer)

(2008-02-04, Casey Schaufler)
Fixes: e114e47377 ("Smack: Simplified Mandatory Access Control Kernel")

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-12-30 12:17:15 -08:00
Taimoor Zaeem
e877cbb453 security: smack: fix indentation in smack_access.c
Replace spaces in code indent with tab character.

Signed-off-by: Taimoor Zaeem <taimoorzaeem@gmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-12-30 12:17:15 -08:00
Linus Torvalds
7cd122b552 Some filesystems use a kinda-sorta controlled dentry refcount leak to pin
dentries of created objects in dcache (and undo it when removing those).
 Reference is grabbed and not released, but it's not actually _stored_
 anywhere.  That works, but it's hard to follow and verify; among other
 things, we have no way to tell _which_ of the increments is intended
 to be an unpaired one.  Worse, on removal we need to decide whether
 the reference had already been dropped, which can be non-trivial if
 that removal is on umount and we need to figure out if this dentry is
 pinned due to e.g. unlink() not done.  Usually that is handled by using
 kill_litter_super() as ->kill_sb(), but there are open-coded special
 cases of the same (consider e.g. /proc/self).
 
 Things get simpler if we introduce a new dentry flag (DCACHE_PERSISTENT)
 marking those "leaked" dentries.  Having it set claims responsibility
 for +1 in refcount.
 
 The end result this series is aiming for:
 
 * get these unbalanced dget() and dput() replaced with new primitives that
   would, in addition to adjusting refcount, set and clear persistency flag.
 * instead of having kill_litter_super() mess with removing the remaining
   "leaked" references (e.g. for all tmpfs files that hadn't been removed
   prior to umount), have the regular shrink_dcache_for_umount() strip
   DCACHE_PERSISTENT of all dentries, dropping the corresponding
   reference if it had been set.  After that kill_litter_super() becomes
   an equivalent of kill_anon_super().
 
 Doing that in a single step is not feasible - it would affect too many places
 in too many filesystems.  It has to be split into a series.
 
 This work has really started early in 2024; quite a few preliminary pieces
 have already gone into mainline.  This chunk is finally getting to the
 meat of that stuff - infrastructure and most of the conversions to it.
 
 Some pieces are still sitting in the local branches, but the bulk of
 that stuff is here.
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaTEq1wAKCRBZ7Krx/gZQ
 643uAQC1rRslhw5l7OjxEpIYbGG4M+QaadN4Nf5Sr2SuTRaPJQD/W4oj/u4C2eCw
 Dd3q071tqyvm/PXNgN2EEnIaxlFUlwc=
 =rKq+
 -----END PGP SIGNATURE-----

Merge tag 'pull-persistency' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull persistent dentry infrastructure and conversion from Al Viro:
 "Some filesystems use a kinda-sorta controlled dentry refcount leak to
  pin dentries of created objects in dcache (and undo it when removing
  those). A reference is grabbed and not released, but it's not actually
  _stored_ anywhere.

  That works, but it's hard to follow and verify; among other things, we
  have no way to tell _which_ of the increments is intended to be an
  unpaired one. Worse, on removal we need to decide whether the
  reference had already been dropped, which can be non-trivial if that
  removal is on umount and we need to figure out if this dentry is
  pinned due to e.g. unlink() not done. Usually that is handled by using
  kill_litter_super() as ->kill_sb(), but there are open-coded special
  cases of the same (consider e.g. /proc/self).

  Things get simpler if we introduce a new dentry flag
  (DCACHE_PERSISTENT) marking those "leaked" dentries. Having it set
  claims responsibility for +1 in refcount.

  The end result this series is aiming for:

   - get these unbalanced dget() and dput() replaced with new primitives
     that would, in addition to adjusting refcount, set and clear
     persistency flag.

   - instead of having kill_litter_super() mess with removing the
     remaining "leaked" references (e.g. for all tmpfs files that hadn't
     been removed prior to umount), have the regular
     shrink_dcache_for_umount() strip DCACHE_PERSISTENT of all dentries,
     dropping the corresponding reference if it had been set. After that
     kill_litter_super() becomes an equivalent of kill_anon_super().

  Doing that in a single step is not feasible - it would affect too many
  places in too many filesystems. It has to be split into a series.

  This work has really started early in 2024; quite a few preliminary
  pieces have already gone into mainline. This chunk is finally getting
  to the meat of that stuff - infrastructure and most of the conversions
  to it.

  Some pieces are still sitting in the local branches, but the bulk of
  that stuff is here"

* tag 'pull-persistency' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits)
  d_make_discardable(): warn if given a non-persistent dentry
  kill securityfs_recursive_remove()
  convert securityfs
  get rid of kill_litter_super()
  convert rust_binderfs
  convert nfsctl
  convert rpc_pipefs
  convert hypfs
  hypfs: swich hypfs_create_u64() to returning int
  hypfs: switch hypfs_create_str() to returning int
  hypfs: don't pin dentries twice
  convert gadgetfs
  gadgetfs: switch to simple_remove_by_name()
  convert functionfs
  functionfs: switch to simple_remove_by_name()
  functionfs: fix the open/removal races
  functionfs: need to cancel ->reset_work in ->kill_sb()
  functionfs: don't bother with ffs->ref in ffs_data_{opened,closed}()
  functionfs: don't abuse ffs_data_closed() on fs shutdown
  convert selinuxfs
  ...
2025-12-05 14:36:21 -08:00
Linus Torvalds
204a920f28 Patches for 6.19
-----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmkp4M4XHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBFc+BAAq564RJ19Nj7WxV3juDGYLyWR
 b2P2w4euHo1PvWng058G3WnCzzS14xj6xShLJL711jbuue40lngBqllP+djHIdGM
 /Mq4I7BnVSJ4z2svvovebiQLoaCiSHN4Uo/b8cdYRz6scueyOQdJUXZvyLwVb4fY
 CxzfXbcXzsS8MEZikSqyVrjdEZ8yLKzL+uJEqm9FU/suaTUuMNeXudZaXHv5sa8H
 wxd5iFQhNe2NYCN0EV8Pz57q36ewvzC6UhZLIM+NlIXorfBFWEq54pYpa9OOnNNW
 4S+e4KWcNqcWtjxJbzh80Fyui3xv2/b1S73/0Akcf+yqp1mZjPLKZtmsFn83F9b3
 SxXGqHNUIZ79U6f656nDYSYYuXzAKkqzmGC94aj7AjABBXNRejPASXqsLLAtm+KY
 qwqcM2W6WwG1ILEixj5UpfC7qpsBVPYGf5cOi9iPlZD2+/l7OgPOTxNDAe7M5m+E
 M30eJCqObdl1Uw7QlHvMG5VQT9XkwJFL93OEOADqvE1GqTphIwMb4vJmGuYe/Sgg
 xqEWMLA6hKBP/B8eQvA7d02WLvzvegwB7C5RTWv1M+QGyzlNxTwqOv+vecm//IVS
 A3LzfxjadSYyRm9nNbZrYWqdVpvYhVcxHrLwzJ4jBsPjBSZM9QlbJHaCjs/T9Sg5
 pK0p5GsXIr3akV2BIvE=
 =n/1b
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-6.19' of https://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:

 - fix several cases where labels were treated inconsistently when
   imported from user space

 - clean up the assignment of extended attributes

 - documentation improvements

* tag 'Smack-for-6.19' of https://github.com/cschaufler/smack-next:
  Smack: function parameter 'gfp' not described
  smack: fix kernel-doc warnings for smk_import_valid_label()
  smack: fix bug: setting task label silently ignores input garbage
  smack: fix bug: unprivileged task can create labels
  smack: fix bug: invalid label of unix socket file
  smack: always "instantiate" inode in smack_inode_init_security()
  smack: deduplicate xattr setting in smack_inode_init_security()
  smack: fix bug: SMACK64TRANSMUTE set on non-directory
  smack: deduplicate "does access rule request transmutation"
2025-12-03 10:58:59 -08:00
Al Viro
bdd80b5c1b convert smackfs
Entirely static tree populated by simple_fill_super().  Can use
kill_anon_super() as-is.

Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-11-16 01:35:02 -05:00
Casey Schaufler
29c701f90b Smack: function parameter 'gfp' not described
Add a descrition of the gfp parameter to smk_import_allocated_label().

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202511061746.dPegBnNf-lkp@intel.com/
2025-11-11 12:00:18 -08:00
Paul Moore
06643d5584 smack: move initcalls to the LSM framework
As the LSM framework only supports one LSM initcall callback for each
initcall type, the init_smk_fs() and smack_nf_ip_init() functions were
wrapped with a new function, smack_initcall() that is registered with
the LSM framework.

Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: John Johansen <john.johhansen@canonical.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-10-22 19:24:25 -04:00
Paul Moore
9f9dc69e06 lsm: replace the name field with a pointer to the lsm_id struct
Reduce the duplication between the lsm_id struct and the DEFINE_LSM()
definition by linking the lsm_id struct directly into the individual
LSM's DEFINE_LSM() instance.

Linking the lsm_id into the LSM definition also allows us to simplify
the security_add_hooks() function by removing the code which populates
the lsm_idlist[] array and moving it into the normal LSM startup code
where the LSM list is parsed and the individual LSMs are enabled,
making for a cleaner implementation with less overhead at boot.

Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-10-22 19:24:18 -04:00
Linus Torvalds
33fc69a05c Simplifying ->d_name audits, easy part.
Turn dentry->d_name into an anon union of const struct qsrt (d_name
 itself) and a writable alias (__d_name).  With constification of some
 struct qstr * arguments of functions that get &dentry->d_name passed
 to them, that ends up with all modifications provably done only in
 fs/dcache.c (and a fairly small part of it).
 
 Any new places doing modifications will be easy to find - grep for
 __d_name will suffice.
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaNh6XAAKCRBZ7Krx/gZQ
 6wIFAP9nJ9RIsTq2eiqb3YUTQsaFZNu7aqFWiHCFPeHVLzylPwEAgeoGrGdL8zNO
 JqAuPPbQxN6Q6n79qAI/vfFvYQCsAQ0=
 =88fF
 -----END PGP SIGNATURE-----

Merge tag 'pull-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull d_name audit update from Al Viro:
 "Simplifying ->d_name audits, easy part.

  Turn dentry->d_name into an anon union of const struct qsrt (d_name
  itself) and a writable alias (__d_name).

  With constification of some struct qstr * arguments of functions that
  get &dentry->d_name passed to them, that ends up with all
  modifications provably done only in fs/dcache.c (and a fairly small
  part of it).

  Any new places doing modifications will be easy to find - grep for
  __d_name will suffice"

* tag 'pull-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  make it easier to catch those who try to modify ->d_name
  generic_ci_validate_strict_name(): constify name argument
  afs_dir_search: constify qstr argument
  afs_edit_dir_{add,remove}(): constify qstr argument
  exfat_find(): constify qstr argument
  security_dentry_init_security(): constify qstr argument
2025-10-03 11:14:02 -07:00
Al Viro
f9fadf23c7 security_dentry_init_security(): constify qstr argument
Nothing outside of fs/dcache.c has any business modifying
dentry names; passing &dentry->d_name as an argument should
have that argument declared as a const pointer.

Acked-by: Casey Schaufler <casey@schaufler-ca.com> # smack part
Acked-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-09-15 21:08:33 -04:00
Casey Schaufler
0ffbc876d0 audit: add record for multiple object contexts
Create a new audit record AUDIT_MAC_OBJ_CONTEXTS.
An example of the MAC_OBJ_CONTEXTS record is:

    type=MAC_OBJ_CONTEXTS
      msg=audit(1601152467.009:1050):
      obj_selinux=unconfined_u:object_r:user_home_t:s0

When an audit event includes a AUDIT_MAC_OBJ_CONTEXTS record
the "obj=" field in other records in the event will be "obj=?".
An AUDIT_MAC_OBJ_CONTEXTS record is supplied when the system has
multiple security modules that may make access decisions based
on an object security context.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subj tweak, audit example readability indents]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-08-30 10:15:30 -04:00
Casey Schaufler
eb59d494ee audit: add record for multiple task security contexts
Replace the single skb pointer in an audit_buffer with a list of
skb pointers. Add the audit_stamp information to the audit_buffer as
there's no guarantee that there will be an audit_context containing
the stamp associated with the event. At audit_log_end() time create
auxiliary records as have been added to the list. Functions are
created to manage the skb list in the audit_buffer.

Create a new audit record AUDIT_MAC_TASK_CONTEXTS.
An example of the MAC_TASK_CONTEXTS record is:

    type=MAC_TASK_CONTEXTS
      msg=audit(1600880931.832:113)
      subj_apparmor=unconfined
      subj_smack=_

When an audit event includes a AUDIT_MAC_TASK_CONTEXTS record the
"subj=" field in other records in the event will be "subj=?".
An AUDIT_MAC_TASK_CONTEXTS record is supplied when the system has
multiple security modules that may make access decisions based on a
subject security context.

Refactor audit_log_task_context(), creating a new audit_log_subj_ctx().
This is used in netlabel auditing to provide multiple subject security
contexts as necessary.

Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subj tweak, audit example readability indents]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-08-30 10:15:30 -04:00
Konstantin Andreev
6ddd169d02 smack: fix kernel-doc warnings for smk_import_valid_label()
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506251712.x5SJiNlh-lkp@intel.com/
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-30 13:02:49 -07:00
Konstantin Andreev
674e2b2479 smack: fix bug: setting task label silently ignores input garbage
This command:
    # echo foo/bar >/proc/$$/attr/smack/current

gives the task a label 'foo' w/o indication
that label does not match input.
Setting the label with lsm_set_self_attr() syscall
behaves identically.

This occures because:

1) smk_parse_smack() is used to convert input to a label
2) smk_parse_smack() takes only that part from the
   beginning of the input that looks like a label.
3) `/' is prohibited in labels, so only "foo" is taken.

(2) is by design, because smk_parse_smack() is used
for parsing strings which are more than just a label.

Silent failure is not a good thing, and there are two
indicators that this was not done intentionally:

    (size >= SMK_LONGLABEL) ~> invalid

clause at the beginning of the do_setattr() and the
"Returns the length of the smack label" claim
in the do_setattr() description.

So I fixed this by adding one tiny check:
the taken label length == input length.

Since input length is now strictly controlled,
I changed the two ways of setting label

   smack_setselfattr(): lsm_set_self_attr() syscall
   smack_setprocattr(): > /proc/.../current

to accommodate the divergence in
what they understand by "input length":

  smack_setselfattr counts mandatory \0 into input length,
  smack_setprocattr does not.

  smack_setprocattr allows various trailers after label

Related changes:

* fixed description for smk_parse_smack

* allow unprivileged tasks validate label syntax.

* extract smk_parse_label_len() from smk_parse_smack()
  so parsing may be done w/o string allocation.

* extract smk_import_valid_label() from smk_import_entry()
  to avoid repeated parsing.

* smk_parse_smack(): scan null-terminated strings
  for no more than SMK_LONGLABEL(256) characters

* smack_setselfattr(): require struct lsm_ctx . flags == 0
  to reserve them for future.

Fixes: e114e47377 ("Smack: Simplified Mandatory Access Control Kernel")
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-24 16:30:24 -07:00
Konstantin Andreev
c147e13ea7 smack: fix bug: unprivileged task can create labels
If an unprivileged task is allowed to relabel itself
(/smack/relabel-self is not empty),
it can freely create new labels by writing their
names into own /proc/PID/attr/smack/current

This occurs because do_setattr() imports
the provided label in advance,
before checking "relabel-self" list.

This change ensures that the "relabel-self" list
is checked before importing the label.

Fixes: 38416e5393 ("Smack: limited capability for changing process label")
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-24 16:30:24 -07:00
Konstantin Andreev
78fc6a94be smack: fix bug: invalid label of unix socket file
According to [1], the label of a UNIX domain socket (UDS)
file (i.e., the filesystem object representing the socket)
is not supposed to participate in Smack security.

To achieve this, [1] labels UDS files with "*"
in smack_d_instantiate().

Before [2], smack_d_instantiate() was responsible
for initializing Smack security for all inodes,
except ones under /proc

[2] imposed the sole responsibility for initializing
inode security for newly created filesystem objects
on smack_inode_init_security().

However, smack_inode_init_security() lacks some logic
present in smack_d_instantiate().
In particular, it does not label UDS files with "*".

This patch adds the missing labeling of UDS files
with "*" to smack_inode_init_security().

Labeling UDS files with "*" in smack_d_instantiate()
still works for stale UDS files that already exist on
disk. Stale UDS files are useless, but I keep labeling
them for consistency and maybe to make easier for user
to delete them.

Compared to [1], this version introduces the following
improvements:

  * UDS file label is held inside inode only
    and not saved to xattrs.

  * relabeling UDS files (setxattr, removexattr, etc.)
    is blocked.

[1] 2010-11-24 Casey Schaufler
commit b4e0d5f079 ("Smack: UDS revision")

[2] 2023-11-16 roberto.sassu
Fixes: e63d86b8b7 ("smack: Initialize the in-memory inode in smack_inode_init_security()")
Link: https://lore.kernel.org/linux-security-module/20231116090125.187209-5-roberto.sassu@huaweicloud.com/

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-22 08:51:32 -07:00
Konstantin Andreev
69204f6cdb smack: always "instantiate" inode in smack_inode_init_security()
If memory allocation for the SMACK64TRANSMUTE
xattr value fails in smack_inode_init_security(),
the SMK_INODE_INSTANT flag is not set in
(struct inode_smack *issp)->smk_flags,
leaving the inode as not "instantiated".

It does not matter if fs frees the inode
after failed smack_inode_init_security() call,
but there is no guarantee for this.

To be safe, mark the inode as "instantiated",
even if allocation of xattr values fails.

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-22 08:51:32 -07:00
Konstantin Andreev
8e5d9f916a smack: deduplicate xattr setting in smack_inode_init_security()
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-22 08:51:32 -07:00
Konstantin Andreev
195da3ff24 smack: fix bug: SMACK64TRANSMUTE set on non-directory
When a new file system object is created
and the conditions for label transmutation are met,
the SMACK64TRANSMUTE extended attribute is set
on the object regardless of its type:
file, pipe, socket, symlink, or directory.

However,
SMACK64TRANSMUTE may only be set on directories.

This bug is a combined effect of the commits [1] and [2]
which both transfer functionality
from smack_d_instantiate() to smack_inode_init_security(),
but only in part.

Commit [1] set blank  SMACK64TRANSMUTE on improper object types.
Commit [2] set "TRUE" SMACK64TRANSMUTE on improper object types.

[1] 2023-06-10,
Fixes: baed456a6a ("smack: Set the SMACK64TRANSMUTE xattr in smack_inode_init_security()")
Link: https://lore.kernel.org/linux-security-module/20230610075738.3273764-3-roberto.sassu@huaweicloud.com/

[2] 2023-11-16,
Fixes: e63d86b8b7 ("smack: Initialize the in-memory inode in smack_inode_init_security()")
Link: https://lore.kernel.org/linux-security-module/20231116090125.187209-5-roberto.sassu@huaweicloud.com/

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-22 08:51:32 -07:00
Konstantin Andreev
635a01da83 smack: deduplicate "does access rule request transmutation"
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-06-22 08:51:32 -07:00
Linus Torvalds
1b98f357da Networking changes for 6.16.
Core
 ----
 
  - Implement the Device Memory TCP transmit path, allowing zero-copy
    data transmission on top of TCP from e.g. GPU memory to the wire.
 
  - Move all the IPv6 routing tables management outside the RTNL scope,
    under its own lock and RCU. The route control path is now 3x times
    faster.
 
  - Convert queue related netlink ops to instance lock, reducing
    again the scope of the RTNL lock. This improves the control plane
    scalability.
 
  - Refactor the software crc32c implementation, removing unneeded
    abstraction layers and improving significantly the related
    micro-benchmarks.
 
  - Optimize the GRO engine for UDP-tunneled traffic, for a 10%
    performance improvement in related stream tests.
 
  - Cover more per-CPU storage with local nested BH locking; this is a
    prep work to remove the current per-CPU lock in local_bh_disable()
    on PREMPT_RT.
 
  - Introduce and use nlmsg_payload helper, combining buffer bounds
    verification with accessing payload carried by netlink messages.
 
 Netfilter
 ---------
 
  - Rewrite the procfs conntrack table implementation, improving
    considerably the dump performance. A lot of user-space tools
    still use this interface.
 
  - Implement support for wildcard netdevice in netdev basechain
    and flowtables.
 
  - Integrate conntrack information into nft trace infrastructure.
 
  - Export set count and backend name to userspace, for better
    introspection.
 
 BPF
 ---
 
  - BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops
    programs and can be controlled in similar way to traditional qdiscs
    using the "tc qdisc" command.
 
  - Refactor the UDP socket iterator, addressing long standing issues
    WRT duplicate hits or missed sockets.
 
 Protocols
 ---------
 
  - Improve TCP receive buffer auto-tuning and increase the default
    upper bound for the receive buffer; overall this improves the single
    flow maximum thoughput on 200Gbs link by over 60%.
 
  - Add AFS GSSAPI security class to AF_RXRPC; it provides transport
    security for connections to the AFS fileserver and VL server.
 
  - Improve TCP multipath routing, so that the sources address always
    matches the nexthop device.
 
  - Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS,
    and thus preventing DoS caused by passing around problematic FDs.
 
  - Retire DCCP socket. DCCP only receives updates for bugs, and major
    distros disable it by default. Its removal allows for better
    organisation of TCP fields to reduce the number of cache lines hit
    in the fast path.
 
  - Extend TCP drop-reason support to cover PAWS checks.
 
 Driver API
 ----------
 
  - Reorganize PTP ioctl flag support to require an explicit opt-in for
    the drivers, avoiding the problem of drivers not rejecting new
    unsupported flags.
 
  - Converted several device drivers to timestamping APIs.
 
  - Introduce per-PHY ethtool dump helpers, improving the support for
    dump operations targeting PHYs.
 
 Tests and tooling
 -----------------
 
  - Add support for classic netlink in user space C codegen, so that
    ynl-c can now read, create and modify links, routes addresses and
    qdisc layer configuration.
 
  - Add ynl sub-types for binary attributes, allowing ynl-c to output
    known struct instead of raw binary data, clarifying the classic
    netlink output.
 
  - Extend MPTCP selftests to improve the code-coverage.
 
  - Add tests for XDP tail adjustment in AF_XDP.
 
 New hardware / drivers
 ----------------------
 
  - OpenVPN virtual driver: offload OpenVPN data channels processing
    to the kernel-space, increasing the data transfer throughput WRT
    the user-space implementation.
 
  - Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC.
 
  - Broadcom asp-v3.0 ethernet driver.
 
  - AMD Renoir ethernet device.
 
  - ReakTek MT9888 2.5G ethernet PHY driver.
 
  - Aeonsemi 10G C45 PHYs driver.
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - nVidia/Mellanox (mlx5):
      - refactor the stearing table handling to reduce significantly
        the amount of memory used
      - add support for complex matches in H/W flow steering
      - improve flow streeing error handling
      - convert to netdev instance locking
    - Intel (100G, ice, igb, ixgbe, idpf):
      - ice: add switchdev support for LLDP traffic over VF
      - ixgbe: add firmware manipulation and regions devlink support
      - igb: introduce support for frame transmission premption
      - igb: adds persistent NAPI configuration
      - idpf: introduce RDMA support
      - idpf: add initial PTP support
    - Meta (fbnic):
      - extend hardware stats coverage
      - add devlink dev flash support
    - Broadcom (bnxt):
      - add support for RX-side device memory TCP
    - Wangxun (txgbe):
      - implement support for udp tunnel offload
      - complete PTP and SRIOV support for AML 25G/10G devices
 
  - Ethernet NICs embedded and virtual:
    - Google (gve):
      - add device memory TCP TX support
    - Amazon (ena):
      - support persistent per-NAPI config
    - Airoha:
      - add H/W support for L2 traffic offload
      - add per flow stats for flow offloading
    - RealTek (rtl8211): add support for WoL magic packet
    - Synopsys (stmmac):
      - dwmac-socfpga 1000BaseX support
      - add Loongson-2K3000 support
      - introduce support for hardware-accelerated VLAN stripping
    - Broadcom (bcmgenet):
      - expose more H/W stats
    - Freescale (enetc, dpaa2-eth):
      - enetc: add MAC filter, VLAN filter RSS and loopback support
      - dpaa2-eth: convert to H/W timestamping APIs
    - vxlan: convert FDB table to rhashtable, for better scalabilty
    - veth: apply qdisc backpressure on full ring to reduce TX drops
 
  - Ethernet switches:
    - Microchip (kzZ88x3): add ETS scheduler support
 
  - Ethernet PHYs:
    - RealTek (rtl8211):
      - add support for WoL magic packet
      - add support for PHY LEDs
 
  - CAN:
    - Adds RZ/G3E CANFD support to the rcar_canfd driver.
    - Preparatory work for CAN-XL support.
    - Add self-tests framework with support for CAN physical interfaces.
 
  - WiFi:
    - mac80211:
      - scan improvements with multi-link operation (MLO)
    - Qualcomm (ath12k):
      - enable AHB support for IPQ5332
      - add monitor interface support to QCN9274
      - add multi-link operation support to WCN7850
      - add 802.11d scan offload support to WCN7850
      - monitor mode for WCN7850, better 6 GHz regulatory
    - Qualcomm (ath11k):
      - restore hibernation support
    - MediaTek (mt76):
      - WiFi-7 improvements
      - implement support for mt7990
    - Intel (iwlwifi):
      - enhanced multi-link single-radio (EMLSR) support on 5 GHz links
      - rework device configuration
    - RealTek (rtw88):
      - improve throughput for RTL8814AU
    - RealTek (rtw89):
      - add multi-link operation support
      - STA/P2P concurrency improvements
      - support different SAR configs by antenna
 
  - Bluetooth:
    - introduce HCI Driver protocol
    - btintel_pcie: do not generate coredump for diagnostic events
    - btusb: add HCI Drv commands for configuring altsetting
    - btusb: add RTL8851BE device 0x0bda:0xb850
    - btusb: add new VID/PID 13d3/3584 for MT7922
    - btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925
    - btnxpuart: implement host-wakeup feature
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmg3D64SHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkcIsQAK2eEc+BxQer975wzvtMg6gF9eoex4a+
 rZ7jxfDzDtNvTauoQsrpehDZp0FnySaVGCU36lHGB2OvDnhCpPc5hXzKDWQpOuqQ
 SHrGG3/6FTbdTG/HfHUcbNyrUzIf53SADSObiQ3qg4gyEQ3sCpcOKtVtMcU8rvsY
 /HqMnsJWFaROUMjMtCcnUSgjmeY9kBvha3sTXUqgeRugEOCvZD7z4rpqFIcQqHw7
 e2Fi8dwIXEYNxqPp6MRq2qdyUTewCRruE8ZIMAFuhtfYeMElUZMPlqlMENX3AzTQ
 cr0EgwcFOUxRA7oZRxhoBNBsVXavtSpQr4ZDoWplxP4aQ37n5tc1E9Q72axpB/Og
 FbJRl6GvWYnCd8071BczgmfHlKaTAigPvt2Z4r6JjM5I/Bij/IZ3k+On1OTuOAj/
 EqfFkdZ0a5cfKrwUMP+oSGtSAywkMVUtnIKJlZeRbjSj2432sCfe2jVAlS8ELM43
 3LUgXYrAKtA87g171LlsRu5EEpI5QmqPb+i5LpPlEXe2TJEgPisyfecJ3NafF/2+
 j575lm+TFNm9NTNhGGjDPEvw0djI5wSGGMe9J4gC74eWi6s5t6C4cuUf84TKWdwR
 x+9H0IB7rfFncAwXHJuUUtzd+fPHaYzs5dDGbSgMQOXr1cr1wlubCK8mQ1r/Wt/a
 3GjFIOQKW2Q5
 =t/Tz
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "Core:

   - Implement the Device Memory TCP transmit path, allowing zero-copy
     data transmission on top of TCP from e.g. GPU memory to the wire.

   - Move all the IPv6 routing tables management outside the RTNL scope,
     under its own lock and RCU. The route control path is now 3x times
     faster.

   - Convert queue related netlink ops to instance lock, reducing again
     the scope of the RTNL lock. This improves the control plane
     scalability.

   - Refactor the software crc32c implementation, removing unneeded
     abstraction layers and improving significantly the related
     micro-benchmarks.

   - Optimize the GRO engine for UDP-tunneled traffic, for a 10%
     performance improvement in related stream tests.

   - Cover more per-CPU storage with local nested BH locking; this is a
     prep work to remove the current per-CPU lock in local_bh_disable()
     on PREMPT_RT.

   - Introduce and use nlmsg_payload helper, combining buffer bounds
     verification with accessing payload carried by netlink messages.

  Netfilter:

   - Rewrite the procfs conntrack table implementation, improving
     considerably the dump performance. A lot of user-space tools still
     use this interface.

   - Implement support for wildcard netdevice in netdev basechain and
     flowtables.

   - Integrate conntrack information into nft trace infrastructure.

   - Export set count and backend name to userspace, for better
     introspection.

  BPF:

   - BPF qdisc support: BPF-qdisc can be implemented with BPF struct_ops
     programs and can be controlled in similar way to traditional qdiscs
     using the "tc qdisc" command.

   - Refactor the UDP socket iterator, addressing long standing issues
     WRT duplicate hits or missed sockets.

  Protocols:

   - Improve TCP receive buffer auto-tuning and increase the default
     upper bound for the receive buffer; overall this improves the
     single flow maximum thoughput on 200Gbs link by over 60%.

   - Add AFS GSSAPI security class to AF_RXRPC; it provides transport
     security for connections to the AFS fileserver and VL server.

   - Improve TCP multipath routing, so that the sources address always
     matches the nexthop device.

   - Introduce SO_PASSRIGHTS for AF_UNIX, to allow disabling SCM_RIGHTS,
     and thus preventing DoS caused by passing around problematic FDs.

   - Retire DCCP socket. DCCP only receives updates for bugs, and major
     distros disable it by default. Its removal allows for better
     organisation of TCP fields to reduce the number of cache lines hit
     in the fast path.

   - Extend TCP drop-reason support to cover PAWS checks.

  Driver API:

   - Reorganize PTP ioctl flag support to require an explicit opt-in for
     the drivers, avoiding the problem of drivers not rejecting new
     unsupported flags.

   - Converted several device drivers to timestamping APIs.

   - Introduce per-PHY ethtool dump helpers, improving the support for
     dump operations targeting PHYs.

  Tests and tooling:

   - Add support for classic netlink in user space C codegen, so that
     ynl-c can now read, create and modify links, routes addresses and
     qdisc layer configuration.

   - Add ynl sub-types for binary attributes, allowing ynl-c to output
     known struct instead of raw binary data, clarifying the classic
     netlink output.

   - Extend MPTCP selftests to improve the code-coverage.

   - Add tests for XDP tail adjustment in AF_XDP.

  New hardware / drivers:

   - OpenVPN virtual driver: offload OpenVPN data channels processing to
     the kernel-space, increasing the data transfer throughput WRT the
     user-space implementation.

   - Renesas glue driver for the gigabit ethernet RZ/V2H(P) SoC.

   - Broadcom asp-v3.0 ethernet driver.

   - AMD Renoir ethernet device.

   - ReakTek MT9888 2.5G ethernet PHY driver.

   - Aeonsemi 10G C45 PHYs driver.

  Drivers:

   - Ethernet high-speed NICs:
       - nVidia/Mellanox (mlx5):
           - refactor the steering table handling to significantly
             reduce the amount of memory used
           - add support for complex matches in H/W flow steering
           - improve flow streeing error handling
           - convert to netdev instance locking
       - Intel (100G, ice, igb, ixgbe, idpf):
           - ice: add switchdev support for LLDP traffic over VF
           - ixgbe: add firmware manipulation and regions devlink support
           - igb: introduce support for frame transmission premption
           - igb: adds persistent NAPI configuration
           - idpf: introduce RDMA support
           - idpf: add initial PTP support
       - Meta (fbnic):
           - extend hardware stats coverage
           - add devlink dev flash support
       - Broadcom (bnxt):
           - add support for RX-side device memory TCP
       - Wangxun (txgbe):
           - implement support for udp tunnel offload
           - complete PTP and SRIOV support for AML 25G/10G devices

   - Ethernet NICs embedded and virtual:
       - Google (gve):
           - add device memory TCP TX support
       - Amazon (ena):
           - support persistent per-NAPI config
       - Airoha:
           - add H/W support for L2 traffic offload
           - add per flow stats for flow offloading
       - RealTek (rtl8211): add support for WoL magic packet
       - Synopsys (stmmac):
           - dwmac-socfpga 1000BaseX support
           - add Loongson-2K3000 support
           - introduce support for hardware-accelerated VLAN stripping
       - Broadcom (bcmgenet):
           - expose more H/W stats
       - Freescale (enetc, dpaa2-eth):
           - enetc: add MAC filter, VLAN filter RSS and loopback support
           - dpaa2-eth: convert to H/W timestamping APIs
       - vxlan: convert FDB table to rhashtable, for better scalabilty
       - veth: apply qdisc backpressure on full ring to reduce TX drops

   - Ethernet switches:
       - Microchip (kzZ88x3): add ETS scheduler support

   - Ethernet PHYs:
       - RealTek (rtl8211):
           - add support for WoL magic packet
           - add support for PHY LEDs

   - CAN:
       - Adds RZ/G3E CANFD support to the rcar_canfd driver.
       - Preparatory work for CAN-XL support.
       - Add self-tests framework with support for CAN physical interfaces.

   - WiFi:
       - mac80211:
           - scan improvements with multi-link operation (MLO)
       - Qualcomm (ath12k):
           - enable AHB support for IPQ5332
           - add monitor interface support to QCN9274
           - add multi-link operation support to WCN7850
           - add 802.11d scan offload support to WCN7850
           - monitor mode for WCN7850, better 6 GHz regulatory
       - Qualcomm (ath11k):
           - restore hibernation support
       - MediaTek (mt76):
           - WiFi-7 improvements
           - implement support for mt7990
       - Intel (iwlwifi):
           - enhanced multi-link single-radio (EMLSR) support on 5 GHz links
           - rework device configuration
       - RealTek (rtw88):
           - improve throughput for RTL8814AU
       - RealTek (rtw89):
           - add multi-link operation support
           - STA/P2P concurrency improvements
           - support different SAR configs by antenna

   - Bluetooth:
       - introduce HCI Driver protocol
       - btintel_pcie: do not generate coredump for diagnostic events
       - btusb: add HCI Drv commands for configuring altsetting
       - btusb: add RTL8851BE device 0x0bda:0xb850
       - btusb: add new VID/PID 13d3/3584 for MT7922
       - btusb: add new VID/PID 13d3/3630 and 13d3/3613 for MT7925
       - btnxpuart: implement host-wakeup feature"

* tag 'net-next-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1611 commits)
  selftests/bpf: Fix bpf selftest build warning
  selftests: netfilter: Fix skip of wildcard interface test
  net: phy: mscc: Stop clearing the the UDPv4 checksum for L2 frames
  net: openvswitch: Fix the dead loop of MPLS parse
  calipso: Don't call calipso functions for AF_INET sk.
  selftests/tc-testing: Add a test for HFSC eltree double add with reentrant enqueue behaviour on netem
  net_sched: hfsc: Address reentrant enqueue adding class to eltree twice
  octeontx2-pf: QOS: Refactor TC_HTB_LEAF_DEL_LAST callback
  octeontx2-pf: QOS: Perform cache sync on send queue teardown
  net: mana: Add support for Multi Vports on Bare metal
  net: devmem: ncdevmem: remove unused variable
  net: devmem: ksft: upgrade rx test to send 1K data
  net: devmem: ksft: add 5 tuple FS support
  net: devmem: ksft: add exit_wait to make rx test pass
  net: devmem: ksft: add ipv4 support
  net: devmem: preserve sockc_err
  page_pool: fix ugly page_pool formatting
  net: devmem: move list_add to net_devmem_bind_dmabuf.
  selftests: netfilter: nft_queue.sh: include file transfer duration in log message
  net: phy: mscc: Fix memory leak when using one step timestamping
  ...
2025-05-28 15:24:36 -07:00
Randy Dunlap
4b59f4fd0a security/smack/smackfs: small kernel-doc fixes
Add function short descriptions to the kernel-doc where missing.
Correct a verb and add ending periods to sentences.

smackfs.c:1080: warning: missing initial short description on line:
 * smk_net4addr_insert
smackfs.c:1343: warning: missing initial short description on line:
 * smk_net6addr_insert

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: linux-security-module@vger.kernel.org
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
2025-05-19 16:28:32 -07:00
Kuniyuki Iwashima
2a63dd0edf net: Retire DCCP socket.
DCCP was orphaned in 2021 by commit 054c4610bd ("MAINTAINERS: dccp:
move Gerrit Renker to CREDITS"), which noted that the last maintainer
had been inactive for five years.

In recent years, it has become a playground for syzbot, and most changes
to DCCP have been odd bug fixes triggered by syzbot.  Apart from that,
the only changes have been driven by treewide or networking API updates
or adjustments related to TCP.

Thus, in 2023, we announced we would remove DCCP in 2025 via commit
b144fcaf46 ("dccp: Print deprecation notice.").

Since then, only one individual has contacted the netdev mailing list. [0]

There is ongoing research for Multipath DCCP.  The repository is hosted
on GitHub [1], and development is not taking place through the upstream
community.  While the repository is published under the GPLv2 license,
the scheduling part remains proprietary, with a LICENSE file [2] stating:

  "This is not Open Source software."

The researcher mentioned a plan to address the licensing issue, upstream
the patches, and step up as a maintainer, but there has been no further
communication since then.

Maintaining DCCP for a decade without any real users has become a burden.

Therefore, it's time to remove it.

Removing DCCP will also provide significant benefits to TCP.  It allows
us to freely reorganize the layout of struct inet_connection_sock, which
is currently shared with DCCP, and optimize it to reduce the number of
cachelines accessed in the TCP fast path.

Note that we keep DCCP netfilter modules as requested.  [3]

Link: https://lore.kernel.org/netdev/20230710182253.81446-1-kuniyu@amazon.com/T/#u #[0]
Link: https://github.com/telekom/mp-dccp #[1]
Link: https://github.com/telekom/mp-dccp/blob/mpdccp_v03_k5.10/net/dccp/non_gpl_scheduler/LICENSE #[2]
Link: https://lore.kernel.org/netdev/Z_VQ0KlCRkqYWXa-@calendula/ #[3]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Paul Moore <paul@paul-moore.com> (LSM and SELinux)
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Link: https://patch.msgid.link/20250410023921.11307-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 18:58:10 -07:00
Konstantin Andreev
a158a937d8 smack: recognize ipv4 CIPSO w/o categories
If SMACK label has CIPSO representation w/o categories, e.g.:

| # cat /smack/cipso2
| foo  10
| @ 250/2
| ...

then SMACK does not recognize such CIPSO in input ipv4 packets
and substitues '*' label instead. Audit records may look like

| lsm=SMACK fn=smack_socket_sock_rcv_skb action=denied
|   subject="*" object="_" requested=w pid=0 comm="swapper/1" ...

This happens in two steps:

1) security/smack/smackfs.c`smk_set_cipso
   does not clear NETLBL_SECATTR_MLS_CAT
   from (struct smack_known *)skp->smk_netlabel.flags
   on assigning CIPSO w/o categories:

| rcu_assign_pointer(skp->smk_netlabel.attr.mls.cat, ncats.attr.mls.cat);
| skp->smk_netlabel.attr.mls.lvl = ncats.attr.mls.lvl;

2) security/smack/smack_lsm.c`smack_from_secattr
   can not match skp->smk_netlabel with input packet's
   struct netlbl_lsm_secattr *sap
   because sap->flags have not NETLBL_SECATTR_MLS_CAT (what is correct)
   but skp->smk_netlabel.flags have (what is incorrect):

| if ((sap->flags & NETLBL_SECATTR_MLS_CAT) == 0) {
| 	if ((skp->smk_netlabel.flags &
| 		 NETLBL_SECATTR_MLS_CAT) == 0)
| 		found = 1;
| 	break;
| }

This commit sets/clears NETLBL_SECATTR_MLS_CAT in
skp->smk_netlabel.flags according to the presense of CIPSO categories.
The update of smk_netlabel is not atomic, so input packets processing
still may be incorrect during short time while update proceeds.

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-02-16 14:17:55 -08:00
Konstantin Andreev
c7fb50cecf smack: Revert "smackfs: Added check catlen"
This reverts commit ccfd889acb

The indicated commit
* does not describe the problem that change tries to solve
* has programming issues
* introduces a bug: forever clears NETLBL_SECATTR_MLS_CAT
         in (struct smack_known *)skp->smk_netlabel.flags

Reverting the commit to reapproach original problem

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-02-16 10:30:23 -08:00
Konstantin Andreev
bf9f14c91a smack: remove /smack/logging if audit is not configured
If CONFIG_AUDIT is not set then
SMACK does not generate audit messages,
however, keeps audit control file, /smack/logging,
while there is no entity to control.
This change removes audit control file /smack/logging
when audit is not configured in the kernel

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-02-13 18:33:55 -08:00
Konstantin Andreev
6cce0cc386 smack: ipv4/ipv6: tcp/dccp/sctp: fix incorrect child socket label
Since inception [1], SMACK initializes ipv* child socket security
for connection-oriented communications (tcp/sctp/dccp)
during accept() syscall, in the security_sock_graft() hook:

| void smack_sock_graft(struct sock *sk, ...)
| {
|     // only ipv4 and ipv6 are eligible here
|     // ...
|     ssp = sk->sk_security; // socket security
|     ssp->smk_in = skp;     // process label: smk_of_current()
|     ssp->smk_out = skp;    // process label: smk_of_current()
| }

This approach is incorrect for two reasons:

A) initialization occurs too late for child socket security:

   The child socket is created by the kernel once the handshake
   completes (e.g., for tcp: after receiving ack for syn+ack).

   Data can legitimately start arriving to the child socket
   immediately, long before the application calls accept()
   on the socket.

   Those data are (currently — were) processed by SMACK using
   incorrect child socket security attributes.

B) Incoming connection requests are handled using the listening
   socket's security, hence, the child socket must inherit the
   listening socket's security attributes.

   smack_sock_graft() initilizes the child socket's security with
   a process label, as is done for a new socket()

   But ... the process label is not necessarily the same as the
   listening socket label. A privileged application may legitimately
   set other in/out labels for a listening socket.

   When this happens, SMACK processes incoming packets using
   incorrect socket security attributes.

In [2] Michael Lontke noticed (A) and fixed it in [3] by adding
socket initialization into security_sk_clone_security() hook like

| void smack_sk_clone_security(struct sock *oldsk, struct sock *newsk)
| {
|    *(struct socket_smack *)newsk->sk_security =
|    *(struct socket_smack *)oldsk->sk_security;
| }

This initializes the child socket security with the parent (listening)
socket security at the appropriate time.

I was forced to revisit this old story because

smack_sock_graft() was left in place by [3] and continues overwriting
the child socket's labels with the process label,
and there might be a reason for this, so I undertook a study.

If the process label differs from the listening socket's labels,
the following occurs for ipv4:

assigning the smk_out is not accompanied by netlbl_sock_setattr,
so the outgoing packet's cipso label does not change.

So, the only effect of this assignment for interhost communications
is a divergence between the program-visible “out” socket label and
the cipso network label. For intrahost communications this label,
however, becomes visible via secmark netfilter marking, and is
checked for access rights by the client, receiving side.

Assigning the smk_in affects both interhost and intrahost
communications: the server begins to check access rights against
an wrong label.

Access check against wrong label (smk_in or smk_out),
unsurprisingly fails, breaking the connection.

The above affects protocols that calls security_sock_graft()
during accept(), namely: {tcp,dccp,sctp}/{ipv4,ipv6}
One extra security_sock_graft() caller, crypto/af_alg.c`af_alg_accept
is not affected, because smack_sock_graft() does nothing for PF_ALG.

To reproduce, assign non-default in/out labels to a listening socket,
setup rules between these labels and client label, attempt to connect
and send some data.

Ipv6 specific: ipv6 packets do not convey SMACK labels. To reproduce
the issue in interhost communications set opposite labels in
/smack/ipv6host on both hosts.
Ipv6 intrahost communications do not require tricking, because SMACK
labels are conveyed via secmark netfilter marking.

So, currently smack_sock_graft() is not useful, but harmful,
therefore, I have removed it.

This fixes the issue for {tcp,dccp}/{ipv4,ipv6},
but not sctp/{ipv4,ipv6}.

Although this change is necessary for sctp+smack to function
correctly, it is not sufficient because:
sctp/ipv4 does not call security_sk_clone() and
sctp/ipv6 ignores SMACK completely.

These are separate issues, belong to other subsystem,
and should be addressed separately.

[1] 2008-02-04,
Fixes: e114e47377 ("Smack: Simplified Mandatory Access Control Kernel")

[2] Michael Lontke, 2022-08-31, SMACK LSM checks wrong object label
                                during ingress network traffic
Link: https://lore.kernel.org/linux-security-module/6324997ce4fc092c5020a4add075257f9c5f6442.camel@elektrobit.com/

[3] 2022-08-31, michael.lontke,
    commit 4ca165fc6c ("SMACK: Add sk_clone_security LSM hook")

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-02-13 13:25:14 -08:00
Konstantin Andreev
bfcf4004bc smack: dont compile ipv6 code unless ipv6 is configured
I want to be sure that ipv6-specific code
is not compiled in kernel binaries
if ipv6 is not configured.

[1] was getting rid of "unused variable" warning, but,
with that, it also mandated compilation of a handful ipv6-
specific functions in ipv4-only kernel configurations:

smk_ipv6_localhost, smack_ipv6host_label, smk_ipv6_check.

Their compiled bodies are likely to be removed by compiler
from the resulting binary, but, to be on the safe side,
I remove them from the compiler view.

[1]
Fixes: 00720f0e7f ("smack: avoid unused 'sip' variable warning")

Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-02-12 13:05:15 -08:00
Casey Schaufler
2aad5cd1db Smack: fix typos and spelling errors
Fix typos and spelling errors in security/smack module comments that
were identified using the codespell tool.
No functional changes - documentation only.

Signed-off-by: Tanya Agarwal <tanyaagarwal25699@gmail.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2025-02-11 14:34:02 -08:00
Linus Torvalds
f96a974170 lsm/stable-6.14 PR 20250121
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmeQFBoUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPvcA//XCdwMz0bGtWKv58nuyP8vkQx08n6
 //olz/O8te3uWK5O3kRiarzFLwH8qsHQ6A7GYalwwix34hatR4ndJE0Y/guVRWa1
 +aBmJxJ7Jm/q3fvpAEfqiSgreuE6kBoztlDOWEq+hUQGu4qfnQGm2EnvbvfFrAmN
 VheOfIQSU2KCL/Scc3FGnF6uru4WrqN0JJ9RbvrEpfdQgmcyTGLnQsZLljutWSIq
 kDWkteIr7cj3O9J45zpxZsTftvYSgVn/y1iKeXbHI4DBA1eheK12vsHB9AADKI1J
 GwHxOrnLpZtv+ICUKqcfFTmWTl+NmfJJurAT5KXKdBjL3xM5MoJlBvK1A5qE9CMo
 LaHVG/TZR2MmBaoM3EN+gvWhDgWlvT02Q/0cYaafTlVLMez3HtfctxN6OnCvTXTB
 Y8dqYClhhlBm/mHQwYfMoeKw4MftUpzEqBd1Nj7Qe8dbP0f/62Ca3K2B3D6Rf8QV
 pj3ryMlSWYV9mdTerruLNQexTGoN7l66jPwzdWpTbFeL3WmNtfCako8OZGbXgPIu
 Iahm3P+jnSVx8ZQro2c9zwdKXI5xiI335pCBbDZ8aX+JAsfj0OofHsFx5Q5diber
 M7tAEhxDqRisbpz7Ei+/LOAEGg2Z619XKg8ks4z6Y4P5PF7zEgeWTkZJk2iLbxXe
 6LLOjmF7LLw+G4M=
 =fgyr
 -----END PGP SIGNATURE-----

Merge tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm

Pull lsm updates from Paul Moore:

 - Improved handling of LSM "secctx" strings through lsm_context struct

   The LSM secctx string interface is from an older time when only one
   LSM was supported, migrate over to the lsm_context struct to better
   support the different LSMs we now have and make it easier to support
   new LSMs in the future.

   These changes explain the Rust, VFS, and networking changes in the
   diffstat.

 - Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are
   enabled

   Small tweak to be a bit smarter about when we build the LSM's common
   audit helpers.

 - Check for absurdly large policies from userspace in SafeSetID

   SafeSetID policies rules are fairly small, basically just "UID:UID",
   it easy to impose a limit of KMALLOC_MAX_SIZE on policy writes which
   helps quiet a number of syzbot related issues. While work is being
   done to address the syzbot issues through other mechanisms, this is a
   trivial and relatively safe fix that we can do now.

 - Various minor improvements and cleanups

   A collection of improvements to the kernel selftests, constification
   of some function parameters, removing redundant assignments, and
   local variable renames to improve readability.

* tag 'lsm-pr-20250121' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  lockdown: initialize local array before use to quiet static analysis
  safesetid: check size of policy writes
  net: corrections for security_secid_to_secctx returns
  lsm: rename variable to avoid shadowing
  lsm: constify function parameters
  security: remove redundant assignment to return variable
  lsm: Only build lsm_audit.c if CONFIG_SECURITY and CONFIG_AUDIT are set
  selftests: refactor the lsm `flags_overset_lsm_set_self_attr` test
  binder: initialize lsm_context structure
  rust: replace lsm context+len with lsm_context
  lsm: secctx provider check on release
  lsm: lsm_context in security_dentry_init_security
  lsm: use lsm_context in security_inode_getsecctx
  lsm: replace context+len with lsm_context
  lsm: ensure the correct LSM context releaser
2025-01-21 20:03:04 -08:00
Konstantin Andreev
6f71ad02aa smack: deduplicate access to string conversion
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2024-12-06 13:21:12 -08:00
Casey Schaufler
76ecf306ae lsm: use lsm_context in security_inode_getsecctx
Change the security_inode_getsecctx() interface to fill a lsm_context
structure instead of data and length pointers.  This provides
the information about which LSM created the context so that
security_release_secctx() can use the correct hook.

Cc: linux-nfs@vger.kernel.org
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-04 14:58:09 -05:00
Casey Schaufler
2d470c7781 lsm: replace context+len with lsm_context
Replace the (secctx,seclen) pointer pair with a single
lsm_context pointer to allow return of the LSM identifier
along with the context and context length. This allows
security_release_secctx() to know how to release the
context. Callers have been modified to use or save the
returned data from the new structure.

security_secid_to_secctx() and security_lsmproc_to_secctx()
will now return the length value on success instead of 0.

Cc: netdev@vger.kernel.org
Cc: audit@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org
Cc: Todd Kjos <tkjos@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject tweak, kdoc fix, signedness fix from Dan Carpenter]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-04 14:42:31 -05:00
Casey Schaufler
8afd8c8faa lsm: remove lsm_prop scaffolding
Remove the scaffold member from the lsm_prop. Remove the
remaining places it is being set.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subj line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-10-11 14:34:16 -04:00
Casey Schaufler
05a344e54d netlabel,smack: use lsm_prop for audit data
Replace the secid in the netlbl_audit structure with an lsm_prop.
Remove scaffolding that was required when the value was a secid.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: fix the subject line]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-10-11 14:34:16 -04:00
Casey Schaufler
b0654ca429 lsm: create new security_cred_getlsmprop LSM hook
Create a new LSM hook security_cred_getlsmprop() which, like
security_cred_getsecid(), fetches LSM specific attributes from the
cred structure.  The associated data elements in the audit sub-system
are changed from a secid to a lsm_prop to accommodate multiple possible
LSM audit users.

Cc: linux-integrity@vger.kernel.org
Cc: audit@vger.kernel.org
Cc: selinux@vger.kernel.org
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subj line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-10-11 14:34:15 -04:00
Casey Schaufler
07f9d2c113 lsm: use lsm_prop in security_inode_getsecid
Change the security_inode_getsecid() interface to fill in a
lsm_prop structure instead of a u32 secid. This allows for its
callers to gather data from all registered LSMs. Data is provided
for IMA and audit. Change the name to security_inode_getlsmprop().

Cc: linux-integrity@vger.kernel.org
Cc: selinux@vger.kernel.org
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subj line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-10-11 14:34:14 -04:00
Casey Schaufler
37f670aacd lsm: use lsm_prop in security_current_getsecid
Change the security_current_getsecid_subj() and
security_task_getsecid_obj() interfaces to fill in a lsm_prop structure
instead of a u32 secid.  Audit interfaces will need to collect all
possible security data for possible reporting.

Cc: linux-integrity@vger.kernel.org
Cc: audit@vger.kernel.org
Cc: selinux@vger.kernel.org
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-10-11 14:34:14 -04:00
Casey Schaufler
f4602f163c lsm: use lsm_prop in security_ipc_getsecid
There may be more than one LSM that provides IPC data for auditing.
Change security_ipc_getsecid() to fill in a lsm_prop structure instead
of the u32 secid.  Change the name to security_ipc_getlsmprop() to
reflect the change.

Cc: audit@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Cc: selinux@vger.kernel.org
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-10-11 14:34:13 -04:00
Casey Schaufler
6f2f724f0e lsm: add lsmprop_to_secctx hook
Add a new hook security_lsmprop_to_secctx() and its LSM specific
implementations. The LSM specific code will use the lsm_prop element
allocated for that module. This allows for the possibility that more
than one module may be called upon to translate a secid to a string,
as can occur in the audit code.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-10-11 14:34:12 -04:00
Casey Schaufler
870b7fdc66 lsm: use lsm_prop in security_audit_rule_match
Change the secid parameter of security_audit_rule_match
to a lsm_prop structure pointer. Pass the entry from the
lsm_prop structure for the approprite slot to the LSM hook.

Change the users of security_audit_rule_match to use the
lsm_prop instead of a u32. The scaffolding function lsmprop_init()
fills the structure with the value of the old secid, ensuring that
it is available to the appropriate module hook. The sources of
the secid, security_task_getsecid() and security_inode_getsecid(),
will be converted to use the lsm_prop structure later in the series.
At that point the use of lsmprop_init() is dropped.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
[PM: subject line tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-10-11 14:34:12 -04:00
Linus Torvalds
5c36498d06 lsm/stable-6.12 PR 20240923
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmbxyVEUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXObWA//dDTn1UMEE2zBE5oF46Yw6FDIipEc
 TL7ulL6fXHKZnAGOfkNREKydkLddZVH+mG7AyJQL6A/06s3/tl3J6i8yLdYZ67iD
 6khZzXvwTA41oLKNB/gVCF3xUUIcifnEqoCIRA9AFg7ck+W/gjtXbHD1xaWYjpqX
 rAorbAu3YA1Rv+sOe2NWZ0EDUPUzfzBPJEZT27TSwCVoWED9r9BxMvQgdhijf0XZ
 a0T8wk1RfAvP4+Cf8XPLUkrgu/x9OauLAdx/a48TeODxQ6KjcFUTUtujRsBduzq/
 cnJEeXAJwD7YqbuoNmidwTul/RGZS3nsWhEr2i8JBVdWYSDACpahO1Ls3WtJuQt3
 oCEQGwrXyPlL4LlcSmRjxL+PLc+MIihjWetIOqgujxKQe82rG+fJlu42zBxbmqnI
 BglJ3Ps+kcHPdUh216NAiKwJXw00IsUsldCZpAe+ck7Tz3H1OhMtjKNa0H7nqYtn
 dMV3ieIKj+PVLJTjSeoLSQ3lxx8JFdH7owV7zO++NLsX05dQx8LTUeqSzL6skUk2
 ocn0ekBmH4GRSph2nUBsr5W575Zx2VKdGS8nS9d/TxXOzuwflOZpX81kAzwCX+Ru
 VN9wVlM8qgFwoeK8SlaOD94Jsy7nAeaBu0/H3fYdB5TX1MnNTIOqTtZgxpotr2Gw
 Z295YFAklGMv7zo=
 =KDfa
 -----END PGP SIGNATURE-----

Merge tag 'lsm-pr-20240923' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm

Pull LSM fixes from Paul Moore:

 - Add a missing security_mmap_file() check to the remap_file_pages()
   syscall

 - Properly reference the SELinux and Smack LSM blobs in the
   security_watch_key() LSM hook

 - Fix a random IPE selftest crash caused by a missing list terminator
   in the test

* tag 'lsm-pr-20240923' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
  ipe: Add missing terminator to list of unit tests
  selinux,smack: properly reference the LSM blob in security_watch_key()
  mm: call the security_mmap_file() LSM hook in remap_file_pages()
2024-09-24 10:18:15 -07:00
Paul Moore
8a23c9e1ba selinux,smack: properly reference the LSM blob in security_watch_key()
Unfortunately when we migrated the lifecycle management of the key LSM
blob to the LSM framework we forgot to convert the security_watch_key()
callbacks for SELinux and Smack.  This patch corrects this by making use
of the selinux_key() and smack_key() helper functions respectively.

This patch also removes some input checking in the Smack callback as it
is no longer needed.

Fixes: 5f8d28f6d7 ("lsm: infrastructure management of the key security blob")
Reported-by: syzbot+044fdf24e96093584232@syzkaller.appspotmail.com
Tested-by: syzbot+044fdf24e96093584232@syzkaller.appspotmail.com
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-09-19 16:37:01 -04:00
Linus Torvalds
509d2cd12a Smack changes for v6.12
- rcu pointer assignment in smk_set_cipso
 	- indentation in smack_ip_output
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAmbk0uIXHGNhc2V5QHNj
 aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBG60g//akcCh6Pu93lwer7vtbWRe2dL
 88cziDLadaJbSlGl5G8GBUS6J0j5ZGRMrT2G6Ch7dnZyNUr3irwULKL3E5xAz62U
 UIzp5EqOWX3fChO5H/uKysjm0JtUsTaJALsiZzqacYuC9PHFho8+PBAAB4ALHVoZ
 0ar4IW5T7DfJk1rKVfy1z+ClPLQbQVHloS/0gkstHkBTlB3WaHOCvgmv03JxD1Nj
 +TxLBuin1DYvm0eJtYkIvP8ODv6FRDGY+6ziwHBSCT3dLpOQDgXozGzO48xa6iQa
 l8UMq9HbzYAd7/bbHiySS99mZ38Yq9E4Tg1SXD9WlgmtBhP6KkYYSzFp8o2lee8x
 NkR8lOE7Xp8b4lqVqk6HAwpQOfUErXc6BnZ7CHTX5cWmsFI4X9kJQCxJ0CgfnG+W
 LMByLa3x6V3BnFDG4nP6na6jKBd5iQiika0+VermDrIG/BZ4O4dJMMhwYW+AeA0w
 45zKMucUckUA+AOMcbqFCh5EL8C0zWM2PQY8OlfW5oRIzZSeOiyo5/ourtE6ccx0
 TZKrqowhuzMqXlVW6891+0EphMIzn3au1MRoPw2t3Xr/BodtgMV9MB6VBfpJ/nax
 rxBq2YsH9S18Wb+wo7HRekDc2AIbFwG5VvqY8dV4LdFnUbJ+5tq+q+6Yqn648m9K
 Nkx7UVSXwOImUt0O4KI=
 =V/iJ
 -----END PGP SIGNATURE-----

Merge tag 'Smack-for-6.12' of https://github.com/cschaufler/smack-next

Pull smack updates from Casey Schaufler:
 "Two patches: one is a simple indentation correction, the other
  corrects a potentially rcu unsafe pointer assignment"

* tag 'Smack-for-6.12' of https://github.com/cschaufler/smack-next:
  smackfs: Use rcu_assign_pointer() to ensure safe assignment in smk_set_cipso
  security: smack: Fix indentation in smack_netfilter.c
2024-09-19 13:09:19 +02:00
Linus Torvalds
a430d95c5e lsm/stable-6.12 PR 20240911
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmbiGGAUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPU8BAA1+A15pmS34I9pq7c8TmRz3rNEs/a
 zrW1aWJ0X/+axNS7sW3Pwtt1EKuaOhskKU8gNSieRhljC8rgXIVjZzLw6Atgcr5k
 upulGbU9TXyVisYN+PWv9/84ito6/nYsKb7Mg3nUVsdodtIFVnsk1fxYLPHQEBig
 Pl3i26U3VqH93Kz0W5vs/QR2uduPB8ZyscdTgcbrY9Vv1Y7IDZ2g9QsJVKLvbQKL
 qcPK1JkHa+sBPJxDqS9A40zgbLbdPQgWQzsXX3dz822w1Ga7FIHSqxMBA6HwHZ+L
 kV4P58wVfavhwt/cQSKMWI/yiGPMMd0B6yD+m8ojOvGfOfRCWxGMmEMqHNuZ3m7k
 Bfll5ZgZTY8phUUhiNf3nxO3F3MM/5bHdhPOj3RReqbAbS6uWr4/fThPDYY/zIo6
 NCY3HGxx3Ae64uQ01gC2p/czC50jDsMwlbXiZbrgdBhjBm/CVk5ozb80mLVcGrLB
 +6XMzzSbC8IaNAH2fDmUJ2ABdwyNPgsSOTGZVzIanpxu1SU2/yk3SMxkp8fv5s36
 wLeODUVcLgsjVV538Mkm6PGTE4TlXaH9yi6apMyJAGp0vPYx5c3Xxk2y5A5cur5p
 hcrbDiX2QgeqFbwsz36incmPmbef2NU2c8feR8XLtPJuwNIeRcMSje0pnkaFlRmb
 TAUJ1sDQAzZ8Fy0=
 =HIAO
 -----END PGP SIGNATURE-----

Merge tag 'lsm-pr-20240911' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm

Pull lsm updates from Paul Moore:

 - Move the LSM framework to static calls

   This transitions the vast majority of the LSM callbacks into static
   calls. Those callbacks which haven't been converted were left as-is
   due to the general ugliness of the changes required to support the
   static call conversion; we can revisit those callbacks at a future
   date.

 - Add the Integrity Policy Enforcement (IPE) LSM

   This adds a new LSM, Integrity Policy Enforcement (IPE). There is
   plenty of documentation about IPE in this patches, so I'll refrain
   from going into too much detail here, but the basic motivation behind
   IPE is to provide a mechanism such that administrators can restrict
   execution to only those binaries which come from integrity protected
   storage, e.g. a dm-verity protected filesystem. You will notice that
   IPE requires additional LSM hooks in the initramfs, dm-verity, and
   fs-verity code, with the associated patches carrying ACK/review tags
   from the associated maintainers. We couldn't find an obvious
   maintainer for the initramfs code, but the IPE patchset has been
   widely posted over several years.

   Both Deven Bowers and Fan Wu have contributed to IPE's development
   over the past several years, with Fan Wu agreeing to serve as the IPE
   maintainer moving forward. Once IPE is accepted into your tree, I'll
   start working with Fan to ensure he has the necessary accounts, keys,
   etc. so that he can start submitting IPE pull requests to you
   directly during the next merge window.

 - Move the lifecycle management of the LSM blobs to the LSM framework

   Management of the LSM blobs (the LSM state buffers attached to
   various kernel structs, typically via a void pointer named "security"
   or similar) has been mixed, some blobs were allocated/managed by
   individual LSMs, others were managed by the LSM framework itself.

   Starting with this pull we move management of all the LSM blobs,
   minus the XFRM blob, into the framework itself, improving consistency
   across LSMs, and reducing the amount of duplicated code across LSMs.
   Due to some additional work required to migrate the XFRM blob, it has
   been left as a todo item for a later date; from a practical
   standpoint this omission should have little impact as only SELinux
   provides a XFRM LSM implementation.

 - Fix problems with the LSM's handling of F_SETOWN

   The LSM hook for the fcntl(F_SETOWN) operation had a couple of
   problems: it was racy with itself, and it was disconnected from the
   associated DAC related logic in such a way that the LSM state could
   be updated in cases where the DAC state would not. We fix both of
   these problems by moving the security_file_set_fowner() hook into the
   same section of code where the DAC attributes are updated. Not only
   does this resolve the DAC/LSM synchronization issue, but as that code
   block is protected by a lock, it also resolve the race condition.

 - Fix potential problems with the security_inode_free() LSM hook

   Due to use of RCU to protect inodes and the placement of the LSM hook
   associated with freeing the inode, there is a bit of a challenge when
   it comes to managing any LSM state associated with an inode. The VFS
   folks are not open to relocating the LSM hook so we have to get
   creative when it comes to releasing an inode's LSM state.
   Traditionally we have used a single LSM callback within the hook that
   is triggered when the inode is "marked for death", but not actually
   released due to RCU.

   Unfortunately, this causes problems for LSMs which want to take an
   action when the inode's associated LSM state is actually released; so
   we add an additional LSM callback, inode_free_security_rcu(), that is
   called when the inode's LSM state is released in the RCU free
   callback.

 - Refactor two LSM hooks to better fit the LSM return value patterns

   The vast majority of the LSM hooks follow the "return 0 on success,
   negative values on failure" pattern, however, there are a small
   handful that have unique return value behaviors which has caused
   confusion in the past and makes it difficult for the BPF verifier to
   properly vet BPF LSM programs. This includes patches to
   convert two of these"special" LSM hooks to the common 0/-ERRNO pattern.

 - Various cleanups and improvements

   A handful of patches to remove redundant code, better leverage the
   IS_ERR_OR_NULL() helper, add missing "static" markings, and do some
   minor style fixups.

* tag 'lsm-pr-20240911' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (40 commits)
  security: Update file_set_fowner documentation
  fs: Fix file_set_fowner LSM hook inconsistencies
  lsm: Use IS_ERR_OR_NULL() helper function
  lsm: remove LSM_COUNT and LSM_CONFIG_COUNT
  ipe: Remove duplicated include in ipe.c
  lsm: replace indirect LSM hook calls with static calls
  lsm: count the LSMs enabled at compile time
  kernel: Add helper macros for loop unrolling
  init/main.c: Initialize early LSMs after arch code, static keys and calls.
  MAINTAINERS: add IPE entry with Fan Wu as maintainer
  documentation: add IPE documentation
  ipe: kunit test for parser
  scripts: add boot policy generation program
  ipe: enable support for fs-verity as a trust provider
  fsverity: expose verified fsverity built-in signatures to LSMs
  lsm: add security_inode_setintegrity() hook
  ipe: add support for dm-verity as a trust provider
  dm-verity: expose root hash digest and signature data to LSMs
  block,lsm: add LSM blob and new LSM hooks for block devices
  ipe: add permissive toggle
  ...
2024-09-16 18:19:47 +02:00
Linus Torvalds
3352633ce6 vfs-6.12.file
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZuQEwAAKCRCRxhvAZXjc
 osS0AQCgIpvey9oW5DMyMw6Bv0hFMRv95gbNQZfHy09iK+NMNAD9GALhb/4cMIVB
 7YrZGXEz454lpgcs8AnrOVjVNfctOQg=
 =e9s9
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.12.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs file updates from Christian Brauner:
 "This is the work to cleanup and shrink struct file significantly.

  Right now, (focusing on x86) struct file is 232 bytes. After this
  series struct file will be 184 bytes aka 3 cacheline and a spare 8
  bytes for future extensions at the end of the struct.

  With struct file being as ubiquitous as it is this should make a
  difference for file heavy workloads and allow further optimizations in
  the future.

   - struct fown_struct was embedded into struct file letting it take up
     32 bytes in total when really it shouldn't even be embedded in
     struct file in the first place. Instead, actual users of struct
     fown_struct now allocate the struct on demand. This frees up 24
     bytes.

   - Move struct file_ra_state into the union containg the cleanup hooks
     and move f_iocb_flags out of the union. This closes a 4 byte hole
     we created earlier and brings struct file to 192 bytes. Which means
     struct file is 3 cachelines and we managed to shrink it by 40
     bytes.

   - Reorder struct file so that nothing crosses a cacheline.

     I suspect that in the future we will end up reordering some members
     to mitigate false sharing issues or just because someone does
     actually provide really good perf data.

   - Shrinking struct file to 192 bytes is only part of the work.

     Files use a slab that is SLAB_TYPESAFE_BY_RCU and when a kmem cache
     is created with SLAB_TYPESAFE_BY_RCU the free pointer must be
     located outside of the object because the cache doesn't know what
     part of the memory can safely be overwritten as it may be needed to
     prevent object recycling.

     That has the consequence that SLAB_TYPESAFE_BY_RCU may end up
     adding a new cacheline.

     So this also contains work to add a new kmem_cache_create_rcu()
     function that allows the caller to specify an offset where the
     freelist pointer is supposed to be placed. Thus avoiding the
     implicit addition of a fourth cacheline.

   - And finally this removes the f_version member in struct file.

     The f_version member isn't particularly well-defined. It is mainly
     used as a cookie to detect concurrent seeks when iterating
     directories. But it is also abused by some subsystems for
     completely unrelated things.

     It is mostly a directory and filesystem specific thing that doesn't
     really need to live in struct file and with its wonky semantics it
     really lacks a specific function.

     For pipes, f_version is (ab)used to defer poll notifications until
     a write has happened. And struct pipe_inode_info is used by
     multiple struct files in their ->private_data so there's no chance
     of pushing that down into file->private_data without introducing
     another pointer indirection.

     But pipes don't rely on f_pos_lock so this adds a union into struct
     file encompassing f_pos_lock and a pipe specific f_pipe member that
     pipes can use. This union of course can be extended to other file
     types and is similar to what we do in struct inode already"

* tag 'vfs-6.12.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (26 commits)
  fs: remove f_version
  pipe: use f_pipe
  fs: add f_pipe
  ubifs: store cookie in private data
  ufs: store cookie in private data
  udf: store cookie in private data
  proc: store cookie in private data
  ocfs2: store cookie in private data
  input: remove f_version abuse
  ext4: store cookie in private data
  ext2: store cookie in private data
  affs: store cookie in private data
  fs: add generic_llseek_cookie()
  fs: use must_set_pos()
  fs: add must_set_pos()
  fs: add vfs_setpos_cookie()
  s390: remove unused f_version
  ceph: remove unused f_version
  adi: remove unused f_version
  mm: Removed @freeptr_offset to prevent doc warning
  ...
2024-09-16 09:14:02 +02:00
Jiawei Ye
2749749afa smackfs: Use rcu_assign_pointer() to ensure safe assignment in smk_set_cipso
In the `smk_set_cipso` function, the `skp->smk_netlabel.attr.mls.cat`
field is directly assigned to a new value without using the appropriate
RCU pointer assignment functions. According to RCU usage rules, this is
illegal and can lead to unpredictable behavior, including data
inconsistencies and impossible-to-diagnose memory corruption issues.

This possible bug was identified using a static analysis tool developed
by myself, specifically designed to detect RCU-related issues.

To address this, the assignment is now done using rcu_assign_pointer(),
which ensures that the pointer assignment is done safely, with the
necessary memory barriers and synchronization. This change prevents
potential RCU dereference issues by ensuring that the `cat` field is
safely updated while still adhering to RCU's requirements.

Fixes: 0817534ff9 ("smackfs: Fix use-after-free in netlbl_catmap_walk()")
Signed-off-by: Jiawei Ye <jiawei.ye@foxmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
2024-09-03 08:37:17 -07:00