Commit graph

49487 commits

Author SHA1 Message Date
Petr Mladek
475bb520c3 Merge branch 'rework/atomic-flush-hardlockup' into for-linus 2025-12-01 14:15:26 +01:00
Petr Mladek
3869e431b5 Merge branch 'for-6.19-vsprintf-timespec64' into for-linus 2025-12-01 14:14:34 +01:00
Andy Shevchenko
ace3852170 tracing: Switch to use %ptSp
Use %ptSp instead of open coded variants to print content of
struct timespec64 in human readable format.

Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20251113150217.3030010-22-andriy.shevchenko@linux.intel.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-11-19 12:30:11 +01:00
Petr Mladek
394aa576c0 printk_ringbuffer: Create a helper function to decide whether more space is needed
The decision whether some more space is needed is tricky in the printk
ring buffer code:

  1. The given lpos values might overflow. A subtraction must be used
     instead of a simple "lower than" check.

  2. Another CPU might reuse the space in the mean time. It can be
     detected when the subtraction is bigger than DATA_SIZE(data_ring).

  3. There is exactly enough space when the result of the subtraction
     is zero. But more space is needed when the result is exactly
     DATA_SIZE(data_ring).

Add a helper function to make sure that the check is done correctly
in all situations. Also it helps to make the code consistent and
better documented.

Suggested-by: John Ogness <john.ogness@linutronix.de>
Link: https://lore.kernel.org/r/87tsz7iea2.fsf@jogness.linutronix.de
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Link: https://patch.msgid.link/20251107194720.1231457-3-pmladek@suse.com
[pmladek@suse.com: Updated wording as suggested by John]
Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-11-10 13:09:43 +01:00
Petr Mladek
cc3bad11de printk_ringbuffer: Fix check of valid data size when blk_lpos overflows
The commit 67e1b0052f ("printk_ringbuffer: don't needlessly wrap
data blocks around") allows to use the last 4 bytes of the ring buffer.

But the check for the @data_size was not properly updated in get_data().
It fails when "blk_lpos->next" overflows to "0". In this case:

  + is_blk_wrapped(data_ring, blk_lpos->begin, blk_lpos->next)
    returns "false" because it checks "blk_lpos->next - 1".

  + "blk_lpos->begin < blk_lpos->next" fails because "blk_lpos->next"
    is already 0.

  + is_blk_wrapped(data_ring, blk_lpos->begin + DATA_SIZE(data_ring),
    blk_lpos->next) returns "false" because "begin_lpos" is from
    the next wrap but "next_lpos - 1" is from the previous one.

As a result, get_data() triggers the WARN_ON_ONCE() for "Illegal
block description", for example:

[  216.317316][ T7652] loop0: detected capacity change from 0 to 16
** 1 printk messages dropped **
[  216.327750][ T7652] ------------[ cut here ]------------
[  216.327789][ T7652] WARNING: kernel/printk/printk_ringbuffer.c:1278 at get_data+0x48a/0x840, CPU#1: syz.0.585/7652
[  216.327848][ T7652] Modules linked in:
[  216.327907][ T7652] CPU: 1 UID: 0 PID: 7652 Comm: syz.0.585 Not tainted syzkaller #0 PREEMPT(full)
[  216.327933][ T7652] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
[  216.327953][ T7652] RIP: 0010:get_data+0x48a/0x840
[  216.327986][ T7652] Code: 83 c4 f8 48 b8 00 00 00 00 00 fc ff df 41 0f b6 04 07 84 c0 0f 85 ee 01 00 00 44 89 65 00 49 83 c5 08 eb 13 e8 a7 19 1f 00 90 <0f> 0b 90 eb 05 e8 9c 19 1f 00 45 31 ed 4c 89 e8 48 83 c4 28 5b 41
[  216.328007][ T7652] RSP: 0018:ffffc900035170e0 EFLAGS: 00010293
[  216.328029][ T7652] RAX: ffffffff81a1eee9 RBX: 00003fffffffffff RCX: ffff888033255b80
[  216.328048][ T7652] RDX: 0000000000000000 RSI: 00003fffffffffff RDI: 0000000000000000
[  216.328063][ T7652] RBP: 0000000000000012 R08: 0000000000000e55 R09: 000000325e213cc7
[  216.328079][ T7652] R10: 000000325e213cc7 R11: 00001de4c2000037 R12: 0000000000000012
[  216.328095][ T7652] R13: 0000000000000000 R14: ffffc90003517228 R15: 1ffffffff1bca646
[  216.328111][ T7652] FS:  00007f44eb8da6c0(0000) GS:ffff888125fda000(0000) knlGS:0000000000000000
[  216.328131][ T7652] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  216.328147][ T7652] CR2: 00007f44ea9722e0 CR3: 0000000066344000 CR4: 00000000003526f0
[  216.328168][ T7652] Call Trace:
[  216.328178][ T7652]  <TASK>
[  216.328199][ T7652]  _prb_read_valid+0x672/0xa90
[  216.328328][ T7652]  ? desc_read+0x1b8/0x3f0
[  216.328381][ T7652]  ? __pfx__prb_read_valid+0x10/0x10
[  216.328422][ T7652]  ? panic_on_this_cpu+0x32/0x40
[  216.328450][ T7652]  prb_read_valid+0x3c/0x60
[  216.328482][ T7652]  printk_get_next_message+0x15c/0x7b0
[  216.328526][ T7652]  ? __pfx_printk_get_next_message+0x10/0x10
[  216.328561][ T7652]  ? __lock_acquire+0xab9/0xd20
[  216.328595][ T7652]  ? console_flush_all+0x131/0xb10
[  216.328621][ T7652]  ? console_flush_all+0x478/0xb10
[  216.328648][ T7652]  console_flush_all+0x4cc/0xb10
[  216.328673][ T7652]  ? console_flush_all+0x131/0xb10
[  216.328704][ T7652]  ? __pfx_console_flush_all+0x10/0x10
[  216.328748][ T7652]  ? is_printk_cpu_sync_owner+0x32/0x40
[  216.328781][ T7652]  console_unlock+0xbb/0x190
[  216.328815][ T7652]  ? __pfx___down_trylock_console_sem+0x10/0x10
[  216.328853][ T7652]  ? __pfx_console_unlock+0x10/0x10
[  216.328899][ T7652]  vprintk_emit+0x4c5/0x590
[  216.328935][ T7652]  ? __pfx_vprintk_emit+0x10/0x10
[  216.328993][ T7652]  _printk+0xcf/0x120
[  216.329028][ T7652]  ? __pfx__printk+0x10/0x10
[  216.329051][ T7652]  ? kernfs_get+0x5a/0x90
[  216.329090][ T7652]  _erofs_printk+0x349/0x410
[  216.329130][ T7652]  ? __pfx__erofs_printk+0x10/0x10
[  216.329161][ T7652]  ? __raw_spin_lock_init+0x45/0x100
[  216.329186][ T7652]  ? __init_swait_queue_head+0xa9/0x150
[  216.329231][ T7652]  erofs_fc_fill_super+0x1591/0x1b20
[  216.329285][ T7652]  ? __pfx_erofs_fc_fill_super+0x10/0x10
[  216.329324][ T7652]  ? sb_set_blocksize+0x104/0x180
[  216.329356][ T7652]  ? setup_bdev_super+0x4c1/0x5b0
[  216.329385][ T7652]  get_tree_bdev_flags+0x40e/0x4d0
[  216.329410][ T7652]  ? __pfx_erofs_fc_fill_super+0x10/0x10
[  216.329444][ T7652]  ? __pfx_get_tree_bdev_flags+0x10/0x10
[  216.329483][ T7652]  vfs_get_tree+0x92/0x2b0
[  216.329512][ T7652]  do_new_mount+0x302/0xa10
[  216.329537][ T7652]  ? apparmor_capable+0x137/0x1b0
[  216.329576][ T7652]  ? __pfx_do_new_mount+0x10/0x10
[  216.329605][ T7652]  ? ns_capable+0x8a/0xf0
[  216.329637][ T7652]  ? kmem_cache_free+0x19b/0x690
[  216.329682][ T7652]  __se_sys_mount+0x313/0x410
[  216.329717][ T7652]  ? __pfx___se_sys_mount+0x10/0x10
[  216.329836][ T7652]  ? do_syscall_64+0xbe/0xfa0
[  216.329869][ T7652]  ? __x64_sys_mount+0x20/0xc0
[  216.329901][ T7652]  do_syscall_64+0xfa/0xfa0
[  216.329932][ T7652]  ? lockdep_hardirqs_on+0x9c/0x150
[  216.329964][ T7652]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  216.329988][ T7652]  ? clear_bhb_loop+0x60/0xb0
[  216.330017][ T7652]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  216.330040][ T7652] RIP: 0033:0x7f44ea99076a
[  216.330080][ T7652] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb a6 e8 de 1a 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
[  216.330100][ T7652] RSP: 002b:00007f44eb8d9e68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[  216.330128][ T7652] RAX: ffffffffffffffda RBX: 00007f44eb8d9ef0 RCX: 00007f44ea99076a
[  216.330146][ T7652] RDX: 0000200000000180 RSI: 00002000000001c0 RDI: 00007f44eb8d9eb0
[  216.330164][ T7652] RBP: 0000200000000180 R08: 00007f44eb8d9ef0 R09: 0000000000000000
[  216.330181][ T7652] R10: 0000000000000000 R11: 0000000000000246 R12: 00002000000001c0
[  216.330196][ T7652] R13: 00007f44eb8d9eb0 R14: 00000000000001a1 R15: 0000200000000080
[  216.330233][ T7652]  </TASK>

Solve the problem by moving and fixing the sanity check. The problematic
if-else-if-else code will just distinguish three basic scenarios:
"regular" vs. "wrapped" vs. "too many times wrapped" block.

The new sanity check is more precise. A valid "data_size" must be
lower than half of the data buffer size. Also it must not be zero at
this stage. It allows to catch problematic "data_size" even for wrapped
blocks.

Closes: https://lore.kernel.org/all/69096836.a70a0220.88fb8.0006.GAE@google.com/
Closes: https://lore.kernel.org/all/69078fb6.050a0220.29fc44.0029.GAE@google.com/
Fixes: 67e1b0052f ("printk_ringbuffer: don't needlessly wrap data blocks around")
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Tested-by: John Ogness <john.ogness@linutronix.de>
Link: https://patch.msgid.link/20251107194720.1231457-2-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-11-10 12:58:28 +01:00
Petr Mladek
d5d399efff printk/nbcon: Release nbcon consoles ownership in atomic flush after each emitted record
printk() tries to flush messages with NBCON_PRIO_EMERGENCY on
nbcon consoles immediately. It might take seconds to flush all
pending lines on slow serial consoles. Note that there might be
hundreds of messages, for example:

[    3.771531][    T1] pci 0000:3e:08.1: [8086:324
** replaying previous printk message **
[    3.771531][    T1] pci 0000:3e:08.1: [8086:3246] type 00 class 0x088000 PCIe Root Complex Integrated Endpoint
[ ... more than 2000 lines, about 200kB messages ... ]
[    3.837752][    T1] pci 0000:20:01.0: Adding to iommu group 18
[    3.837851][    T
** replaying previous printk message **
[    3.837851][    T1] pci 0000:20:03.0: Adding to iommu group 19
[    3.837946][    T1] pci 0000:20:05.0: Adding to iommu group 20
[ ... more than 500 messages for iommu groups 21-590 ...]
[    3.912932][    T1] pci 0000:f6:00.1: Adding to iommu group 591
[    3.913070][    T1] pci 0000:f6:00.2: Adding to iommu group 592
[    3.913243][    T1] DMAR: Intel(R) Virtualization Technology for Directed I/O
[    3.913245][    T1] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[    3.913245][    T1] software IO TLB: mapped [mem 0x000000004f000000-0x0000000053000000] (64MB)
[    3.913324][    T1] RAPL PMU: API unit is 2^-32 Joules, 3 fixed counters, 655360 ms ovfl timer
[    3.913325][    T1] RAPL PMU: hw unit of domain package 2^-14 Joules
[    3.913326][    T1] RAPL PMU: hw unit of domain dram 2^-14 Joules
[    3.913327][    T1] RAPL PMU: hw unit of domain psys 2^-0 Joules
[    3.933486][    T1] ------------[ cut here ]------------
[    3.933488][    T1] WARNING: CPU: 2 PID: 1 at arch/x86/events/intel/uncore.c:1156 uncore_pci_pmu_register+0x15e/0x180
[    3.930291][    C0] watchdog: Watchdog detected hard LOCKUP on cpu 0
[    3.930291][    C0] Kernel panic - not syncing: Hard LOCKUP
[...]
[    3.930291][    C0] CPU: 0 UID: 0 PID: 18 Comm: pr/ttyS0 Not tainted...
[...]
[    3.930291][    C0] RIP: 0010:nbcon_reacquire_nobuf+0x11/0x50
[    3.930291][    C0] Call Trace:
[...]
[    3.930291][    C0]  <TASK>
[    3.930291][    C0]  serial8250_console_write+0x16d/0x5c0
[    3.930291][    C0]  nbcon_emit_next_record+0x22c/0x250
[    3.930291][    C0]  nbcon_emit_one+0x93/0xe0
[    3.930291][    C0]  nbcon_kthread_func+0x13c/0x1c0

The are visible two takeovers of the console ownership:

  - The 1st one is triggered by the "WARNING: CPU: 2 PID: 1 at
    arch/x86/..." line printed with NBCON_PRIO_EMERGENCY.

  - The 2nd one is triggered by the "Kernel panic - not syncing:
    Hard LOCKUP" line printed with NBCON_PRIO_PANIC.

There are more than 2500 lines, at about 240kB, emitted between
the takeover and the 1st "WARNING" line in the emergency context.
This amount of pending messages had to be flushed by
nbcon_atomic_flush_pending() when WARN() printed its first line.

The atomic flush was holding the nbcon console context for too long so
that it triggered hard lockup on the CPU running the printk kthread
"pr/ttyS0". The kthread needed to reacquire the console ownership
for restoring the original serial port state in serial8250_console_write().

Prevent the hardlockup by releasing the nbcon console ownership after
each emitted record.

Note that __nbcon_atomic_flush_pending_con() used to hold the console
ownership all the time because it blocked the printk kthread. Otherwise
the kthread tried to flush the messages in parallel which caused repeated
takeovers and more replayed messages.

It is not longer a problem because the repeated takeovers are blocked
by the counter of emergency contexts, see nbcon_cpu_emergency_cnt.

Link: https://lore.kernel.org/all/aNQO-zl3k1l4ENfy@pathway.suse.cz
Reviewed-by: Andrew Murray <amurray@thegoodpenguin.co.uk>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Link: https://patch.msgid.link/20250926124912.243464-4-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-10-30 12:11:33 +01:00
Petr Mladek
4c3ba0d592 printk/nbcon/panic: Allow printk kthread to sleep when the system is in panic
The printk kthread might be running when there is a panic in progress.
But it is not able to acquire the console ownership any longer.

Prevent the desperate attempts to acquire the ownership and allow sleeping
in panic. It would make it behave the same as when there is any CPU
in an emergency context.

Reviewed-by: Andrew Murray <amurray@thegoodpenguin.co.uk>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Link: https://patch.msgid.link/20250926124912.243464-3-pmladek@suse.com
[pmladek@suse.com: Rebased on top of 6.18-rc1 (panic_in_progress() moved to panic.c)]
Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-10-30 12:11:33 +01:00
Petr Mladek
c41c0ebfa1 printk/nbcon: Block printk kthreads when any CPU is in an emergency context
In emergency contexts, printk() tries to flush messages directly even
on nbcon consoles. And it is allowed to takeover the console ownership
and interrupt the printk kthread in the middle of a message.

Only one takeover and one repeated message should be enough in most
situations. The first emergency message flushes the backlog and printk
kthreads get to sleep. Next emergency messages are flushed directly
and printk() does not wake up the kthreads.

However, the one takeover is not guaranteed. Any printk() in normal
context on another CPU could wake up the kthreads. Or a new emergency
message might be added before the kthreads get to sleep. Note that
the interrupted .write_thread() callbacks usually have to call
nbcon_reacquire_nobuf() and restore the original device setting
before checking for pending messages.

The risk of the repeated takeovers will be even bigger because
__nbcon_atomic_flush_pending_con is going to release the console
ownership after each emitted record. It will be needed to prevent
hardlockup reports on other CPUs which are busy waiting for
the context ownership, for example, by nbcon_reacquire_nobuf() or
__uart_port_nbcon_acquire().

The repeated takeovers break the output, for example:

    [ 5042.650211][ T2220] Call Trace:
    [ 5042.6511
    ** replaying previous printk message **
    [ 5042.651192][ T2220]  <TASK>
    [ 5042.652160][ T2220]  kunit_run_
    ** replaying previous printk message **
    [ 5042.652160][ T2220]  kunit_run_tests+0x72/0x90
    [ 5042.653340][ T22
    ** replaying previous printk message **
    [ 5042.653340][ T2220]  ? srso_alias_return_thunk+0x5/0xfbef5
    [ 5042.654628][ T2220]  ? stack_trace_save+0x4d/0x70
    [ 5042.6553
    ** replaying previous printk message **
    [ 5042.655394][ T2220]  ? srso_alias_return_thunk+0x5/0xfbef5
    [ 5042.656713][ T2220]  ? save_trace+0x5b/0x180

A more robust solution is to block the printk kthread entirely whenever
*any* CPU enters an emergency context. This ensures that critical messages
can be flushed without contention from the normal, non-atomic printing
path.

Link: https://lore.kernel.org/all/aNQO-zl3k1l4ENfy@pathway.suse.cz
Reviewed-by: Andrew Murray <amurray@thegoodpenguin.co.uk>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Link: https://patch.msgid.link/20250926124912.243464-2-pmladek@suse.com
[pmladek@suse.com: Added changes proposed by John Ogness]
Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-10-30 12:10:16 +01:00
Daniil Tatianin
67e1b0052f printk_ringbuffer: don't needlessly wrap data blocks around
Previously, data blocks that perfectly fit the data ring buffer would
get wrapped around to the beginning for no reason since the calculated
offset of the next data block would belong to the next wrap. Since this
offset is not actually part of the data block, but rather the offset of
where the next data block is going to start, there is no reason to
include it when deciding whether the current block fits the buffer.

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Tested-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Tested-by: John Ogness <john.ogness@linutronix.de>
Link: https://patch.msgid.link/20250905144152.9137-2-d-tatianin@yandex-team.ru
[pmladek@suse.com: Updated indentation.]
Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-10-22 14:03:36 +02:00
Linus Torvalds
48e3694ae7 printk changes for 6.18
-----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmjeOX8bFIAAAAAABAAO
 bWFudTIsMi41KzEuMTEsMiwyAAoJEFKgDEdIgJTyM7MP/0mD0oJa/8DRiZ0r1qLM
 xt/mCZIXhbqfJdCIaLEpfL35qPI59PWvrECGxwzL7CXOHlORxoCW98LoUDkigRKW
 e93mv8xSBaac11QrMSy32cEMSpzXmcNjz5u/02AUveXpxERr82nLgGwKY8eF0eAy
 7wF+N4bGHPShEy2J7kVFVrgZompFQLDgPB3/GycSdKsZd7ss9XiKa/EWvPJbEGOD
 CNC0MlAd5QDxVXxhioC9g9tZaUENZrqxys1vfROFx8RlaobRa2Vt/TEJQ5WCzmKo
 JO8JdojD03h0U2DdFE/lgTjf/lz8hTiaRaTifNnhUofqvzFKupzcv9L2xq/zmpfC
 4dMW6YOObD/FVeNVsgVC16/1WDKl+XfyogAmtTBjSNoPd6WoSczZsfubxGc30lW4
 AqJ2OpPa+CXq2elL+Qb0dLVekMhNytjfYdyFK/FP0DXrkoV4FMHdfNLW58TR7Pcq
 iN1MNEkEs4dWSbn2xJzKlFQyVq2/t6A6l5N/kodZ1xLWiwFYzeA3FTOVzrOlXnde
 IAYh3iH2WoTiHjZB/oomhnXKCMeTIvMYsGQQl4y+XqRHIwpEjSjcm8/M3hvlMyJE
 m0D6cpW8IRLoZ0raxqJPVpXR3WZsLJm2InGMrfY/tPWTU7L5uxmF51GPXYKdUmpZ
 NUgzh2cOwSWWU4UHt/AiD9cd
 =uFn0
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Add KUnit test for the printk ring buffer

 - Fix the check of the maximal record size which is allowed to be
   stored into the printk ring buffer. It prevents corruptions of the
   ring buffer.

   Note that printk() is on the safe side. The messages are limited by
   1kB buffer and are always small enough for the minimal log buffer
   size 4kB, see CONFIG_LOG_BUF_SHIFT definition.

* tag 'printk-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: ringbuffer: Fix data block max size check
  printk: kunit: support offstack cpumask
  printk: kunit: Fix __counted_by() in struct prbtest_rbdata
  printk: ringbuffer: Explain why the KUnit test ignores failed writes
  printk: ringbuffer: Add KUnit test
2025-10-04 11:13:11 -07:00
Linus Torvalds
5b7ce93854 kgdb patches for 6.18
A collection of small cleanups this cycle. Thorsten Blum has replaced a
 number strcpy() calls with safer alternatives (fixing a pointer
 aliasing bug in the process). Colin Ian King has simplified things by
 removing some unreachable code.
 
 Signed-off-by: Daniel Thompson (RISCstar) <danielt@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEELzVBU1D3lWq6cKzwfOMlXTn3iKEFAmjfvoQACgkQfOMlXTn3
 iKGqiA/8DKfwbWqnLuFld5U3lK5AeYSWTnbFbQbk8kEUzl/nAVuSGBGMKSnLHKaF
 S3DVWkQA1syhhoidm/XCco2cjkvDoOPQNzQwD35nxOX1QuoxOxbx2qRT4FRJE9CC
 B5WUonEo++z+tmwqVWh9mMXTrYJIMQItg7igMM9wuKWB9Gv08LAfAFU2g4p7O9hf
 A22mSV3EobcxJWELXZhuPyBoWwmC6W6QoxdJ2mTra2/TaKV7DL4PY8o74vRxGW9+
 RKd+fyxpDSvStbYfXIeRZYBjrvqfZ8UHfPV7gCIfppbsrOSrl2HwNlmnOwyNMlgo
 iy2kNOVoXbdRPZrgisLdvX3Q7/BBMRtX2+wpGqaEMRCHMbIcd8RA/uW6v56XGmhh
 zg0GBK+OPIudT4XsB+mwtk1QNbnpECsS1JrI5H88zYnvTsyD0XJfT0ER95uTa30c
 qvz3FvWlxWGpHrtJP17+N/t/lZDWPRXuQZeFOH4oPpsN5dbxBLmXbylWnOmuv86e
 ilICYU3czpflm2moQmKGg63SRRCtti+1Cs78VJnXu1eNfgJn0ko54HtQi6x2nK76
 z0ugll7hi8WKB43sh+OIl0aF5E6Jn9VHpWXgXwVoyTToNrf/cDhHjDsplypaNFDi
 ewXvQR3bP26g7J6jCcmmm7EhJoZOqHelkKxUemyMBzs0seEbPqg=
 =6bO0
 -----END PGP SIGNATURE-----

Merge tag 'kgdb-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux

Pull kgdb updates from Daniel Thompson:
 "A collection of small cleanups this cycle.

  Thorsten Blum has replaced a number strcpy() calls with safer
  alternatives (fixing a pointer aliasing bug in the process).

  Colin Ian King has simplified things by removing some unreachable
  code"

* tag 'kgdb-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
  kdb: remove redundant check for scancode 0xe0
  kdb: Replace deprecated strcpy() with helper function in kdb_defcmd()
  kdb: Replace deprecated strcpy() with memcpy() in parse_grep()
  kdb: Replace deprecated strcpy() with memmove() in vkdb_printf()
  kdb: Replace deprecated strcpy() with memcpy() in kdb_strdup()
  kernel: debug: gdbstub: Replace deprecated strcpy() with strscpy()
2025-10-04 09:59:05 -07:00
Linus Torvalds
cbf33b8e0b bpf-fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmjfBtMACgkQ6rmadz2v
 bToTuA/+JizieMIW6eqzCDrrhj45DmuKu6gUMrkmM0xte3QCoLly5FDGJ9HMmrqf
 L5xfz/TAExg/Nh9a0zpCGE7/fMr9P10Z5vVHMe/I06rcW+/fDrAERXDurlfpfliv
 /7KqF/lq1D6kp7bcBzWX4dzkT66Nh/Ax6ZJKIXF/K3iwFXBb1A/95Wwgqxuc3xNM
 n0kthEhnX6x5cojy6fla2zYNoebNLVxPvafTcAtnE0QM8vxdOl0Q7KWrdQpVE5V/
 z6WF2M822eTQeUIy6k/ZRckFlmEwa++I+w5EnygBFP8INUHARdYNBQpQVFkg31jz
 GYGlbTs7jaDbMNWkEhKyDbuhL5Xq9/5FGOaLI9CvDtpPjbfYTZMGtIn9L902XtqE
 VEfBcTA/g/qd8Urx5622sfPwau64LrSAKBRE/4CJ7/dsnPx4BdKavsA5DMVuyRIm
 Au6tOxlRFhPWhultnmmbFBjwn33xTleYNUF9wpWQnijNq+Ph29gKZk8Ac5eKKS6A
 dTSCxv7LGafezN/evr779vXohZyF9vWfqHdITVzTo40tsSbi5v4uyfJ2qzoaS7Gf
 UQ4F2nkjqddhG/O1W2TY8aFzmWXP2q92V6yqkcq5zjA3D1Ue6JOxRzG/3kNzgsLX
 Gf1DmPosUyeLNpwdDYRD83Dk7tJ7rAI+x3i6Iwh8qivIhC4CzrU=
 =Kzf9
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

 - Fix selftests/bpf (typo, conflicts) and unbreak BPF CI (Jiri Olsa)

 - Remove linux/unaligned.h dependency for libbpf_sha256 (Andrii
   Nakryiko) and add a test (Eric Biggers)

 - Reject negative offsets for ALU operations in the verifier (Yazhou
   Tang) and add a test (Eduard Zingerman)

 - Skip scalar adjustment for BPF_NEG operation if destination register
   is a pointer (Brahmajit Das) and add a test (KaFai Wan)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  libbpf: Fix missing #pragma in libbpf_utils.c
  selftests/bpf: Add tests for rejection of ALU ops with negative offsets
  selftests/bpf: Add test for libbpf_sha256()
  bpf: Reject negative offsets for ALU ops
  libbpf: remove linux/unaligned.h dependency for libbpf_sha256()
  libbpf: move libbpf_sha256() implementation into libbpf_utils.c
  libbpf: move libbpf_errstr() into libbpf_utils.c
  libbpf: remove unused libbpf_strerror_r and STRERR_BUFSIZE
  libbpf: make libbpf_errno.c into more generic libbpf_utils.c
  selftests/bpf: Add test for BPF_NEG alu on CONST_PTR_TO_MAP
  bpf: Skip scalar adjustment for BPF_NEG if dst is a pointer
  selftests/bpf: Fix realloc size in bpf_get_addrs
  selftests/bpf: Fix typo in subtest_basic_usdt after merge conflict
  selftests/bpf: Fix open-coded gettid syscall in uprobe syscall tests
2025-10-03 19:38:19 -07:00
Linus Torvalds
a498d59c46 dma-mapping updates for Linux 6.18:
- refactoring of DMA mapping API to physical addresses as the primary
 interface instead of page+offset parameters; this gets much closer to
 Matthew Wilcox's long term wish for struct-pageless IO to cacheable DRAM and is
 supporting memdesc project which seeks to substantially transform how
 struct page works; an advantage of this approach is the possibility of
 introducing DMA_ATTR_MMIO, which covers existing 'dma_map_resource' flow
 in the common paths, what in turn lets to use recently introduced
 dma_iova_link() API to map PCI P2P MMIO without creating struct page;
 developped by Leon Romanovsky and Jason Gunthorpe
 - minor clean-up by Petr Tesarik and Qianfeng Rong
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSrngzkoBtlA8uaaJ+Jp1EFxbsSRAUCaNugqAAKCRCJp1EFxbsS
 RBvDAQCEd4P6pz6ROQHf5BfiF5J1db2H6bWsFLjajx3KfNWf8gD+P0eQ0hTzLrcd
 zuSKZTivviOiyjXlt/9GOaXXPnmTwA0=
 =b0nZ
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-6.18-2025-09-30' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux

Pull dma-mapping updates from Marek Szyprowski:

 - Refactoring of DMA mapping API to physical addresses as the primary
   interface instead of page+offset parameters

   This gets much closer to Matthew Wilcox's long term wish for
   struct-pageless IO to cacheable DRAM and is supporting memdesc
   project which seeks to substantially transform how struct page works.

   An advantage of this approach is the possibility of introducing
   DMA_ATTR_MMIO, which covers existing 'dma_map_resource' flow in the
   common paths, what in turn lets to use recently introduced
   dma_iova_link() API to map PCI P2P MMIO without creating struct page

   Developped by Leon Romanovsky and Jason Gunthorpe

 - Minor clean-up by Petr Tesarik and Qianfeng Rong

* tag 'dma-mapping-6.18-2025-09-30' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
  kmsan: fix missed kmsan_handle_dma() signature conversion
  mm/hmm: properly take MMIO path
  mm/hmm: migrate to physical address-based DMA mapping API
  dma-mapping: export new dma_*map_phys() interface
  xen: swiotlb: Open code map_resource callback
  dma-mapping: implement DMA_ATTR_MMIO for dma_(un)map_page_attrs()
  kmsan: convert kmsan_handle_dma to use physical addresses
  dma-mapping: convert dma_direct_*map_page to be phys_addr_t based
  iommu/dma: implement DMA_ATTR_MMIO for iommu_dma_(un)map_phys()
  iommu/dma: rename iommu_dma_*map_page to iommu_dma_*map_phys
  dma-mapping: rename trace_dma_*map_page to trace_dma_*map_phys
  dma-debug: refactor to use physical addresses for page mapping
  iommu/dma: implement DMA_ATTR_MMIO for dma_iova_link().
  dma-mapping: introduce new DMA attribute to indicate MMIO memory
  swiotlb: Remove redundant __GFP_NOWARN
  dma-direct: clean up the logic in __dma_direct_alloc_pages()
2025-10-03 17:41:12 -07:00
Linus Torvalds
50647a1176 file->f_path constification
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaN3daAAKCRBZ7Krx/gZQ
 6zNWAP9kD6rOJRNqDgea4pibDPa47Tps/WM5tsDv3dsLliY29gEA6sveOWZ3guAj
 4oY3ts/NtHLWXvhI7Vd/1mr2aTKEZQk=
 =YNK+
 -----END PGP SIGNATURE-----

Merge tag 'pull-f_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull file->f_path constification from Al Viro:
 "Only one thing was modifying ->f_path of an opened file - acct(2).

  Massaging that away and constifying a bunch of struct path * arguments
  in functions that might be given &file->f_path ends up with the
  situation where we can turn ->f_path into an anon union of const
  struct path f_path and struct path __f_path, the latter modified only
  in a few places in fs/{file_table,open,namei}.c, all for struct file
  instances that are yet to be opened"

* tag 'pull-f_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (23 commits)
  Have cc(1) catch attempts to modify ->f_path
  kernel/acct.c: saner struct file treatment
  configfs:get_target() - release path as soon as we grab configfs_item reference
  apparmor/af_unix: constify struct path * arguments
  ovl_is_real_file: constify realpath argument
  ovl_sync_file(): constify path argument
  ovl_lower_dir(): constify path argument
  ovl_get_verity_digest(): constify path argument
  ovl_validate_verity(): constify {meta,data}path arguments
  ovl_ensure_verity_loaded(): constify datapath argument
  ksmbd_vfs_set_init_posix_acl(): constify path argument
  ksmbd_vfs_inherit_posix_acl(): constify path argument
  ksmbd_vfs_kern_path_unlock(): constify path argument
  ksmbd_vfs_path_lookup_locked(): root_share_path can be const struct path *
  check_export(): constify path argument
  export_operations->open(): constify path argument
  rqst_exp_get_by_name(): constify path argument
  nfs: constify path argument of __vfs_getattr()
  bpf...d_path(): constify path argument
  done_path_create(): constify path argument
  ...
2025-10-03 16:32:36 -07:00
Linus Torvalds
51e9889ab1 vfs_parse_fs_string() stuff
change on vfs_parse_fs_string() calling conventions - get rid of
 the length argument (almost all callers pass strlen() of the
 string argument there), add vfs_parse_fs_qstr() for the cases
 that do want separate length
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaNhllQAKCRBZ7Krx/gZQ
 6wyiAP9TmyFLBWKC/sDNtRAiGRybEtcwvVZj1agpx0kZIWshUwD7BVg4AfDs+vN3
 RoYYg9DR1SP5kZF7h2Ve1T39XDq6ZQU=
 =YZFG
 -----END PGP SIGNATURE-----

Merge tag 'pull-fs_context' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull fs_context updates from Al Viro:
 "Change vfs_parse_fs_string() calling conventions

  Get rid of the length argument (almost all callers pass strlen() of
  the string argument there), add vfs_parse_fs_qstr() for the cases that
  do want separate length"

* tag 'pull-fs_context' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  do_nfs4_mount(): switch to vfs_parse_fs_string()
  change the calling conventions for vfs_parse_fs_string()
2025-10-03 10:51:44 -07:00
Linus Torvalds
e64aeecbbb mount-related stuff for this cycle
* saner handling of guards in fs/namespace.c, getting
 rid of needlessly strong locking in some of the users.
 
 	* lock_mount() calling conventions change - have it set
 the environment for attaching to given location, storing the
 results in caller-supplied object, without altering the passed
 struct path.  Make unlock_mount() called as __cleanup for those
 objects.  It's not exactly guard(), but similar to it.
 
 	* MNT_WRITE_HOLD done right - mnt_hold_writers() does *not*
 mess with ->mnt_flags anymore, so insertion of a new mount into
 ->s_mounts of underlying superblock does not, in itself, expose
 ->mnt_flags of that mount to concurrent modifications.
 
 	* getting rid of pathological cases when umount() spends
 quadratic time removing the victims from propagation graph -
 part of that had been dealt with last cycle, this should finish
 it.
 
 	* a bunch of stuff constified.
 
 	* assorted cleanups.
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCaNhzLAAKCRBZ7Krx/gZQ
 63/IAP4yxJ6e3Pt66Uw0MeuSNmeLsQwb7mYo72lsYHpxjYANZAEAspMaLDU9NHxM
 Dy6WDVoJnf7+aDlD6E443YMfPX8XRQM=
 =5T+t
 -----END PGP SIGNATURE-----

Merge tag 'pull-mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs mount updates from Al Viro:
 "Several piles this cycle, this mount-related one being the largest and
  trickiest:

   - saner handling of guards in fs/namespace.c, getting rid of
     needlessly strong locking in some of the users

   - lock_mount() calling conventions change - have it set the
     environment for attaching to given location, storing the results in
     caller-supplied object, without altering the passed struct path.

     Make unlock_mount() called as __cleanup for those objects. It's not
     exactly guard(), but similar to it

   - MNT_WRITE_HOLD done right.

     mnt_hold_writers() does *not* mess with ->mnt_flags anymore, so
     insertion of a new mount into ->s_mounts of underlying superblock
     does not, in itself, expose ->mnt_flags of that mount to concurrent
     modifications

   - getting rid of pathological cases when umount() spends quadratic
     time removing the victims from propagation graph - part of that had
     been dealt with last cycle, this should finish it

   - a bunch of stuff constified

   - assorted cleanups

* tag 'pull-mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits)
  constify {__,}mnt_is_readonly()
  WRITE_HOLD machinery: no need for to bump mount_lock seqcount
  struct mount: relocate MNT_WRITE_HOLD bit
  preparations to taking MNT_WRITE_HOLD out of ->mnt_flags
  setup_mnt(): primitive for connecting a mount to filesystem
  simplify the callers of mnt_unhold_writers()
  copy_mnt_ns(): use guards
  copy_mnt_ns(): use the regular mechanism for freeing empty mnt_ns on failure
  open_detached_copy(): separate creation of namespace into helper
  open_detached_copy(): don't bother with mount_lock_hash()
  path_has_submounts(): use guard(mount_locked_reader)
  fs/namespace.c: sanitize descriptions for {__,}lookup_mnt()
  ecryptfs: get rid of pointless mount references in ecryptfs dentries
  umount_tree(): take all victims out of propagation graph at once
  do_mount(): use __free(path_put)
  do_move_mount_old(): use __free(path_put)
  constify can_move_mount_beneath() arguments
  path_umount(): constify struct path argument
  may_copy_tree(), __do_loopback(): constify struct path argument
  path_mount(): constify struct path argument
  ...
2025-10-03 10:19:44 -07:00
Linus Torvalds
e406d57be7 Patch series in this pull request:
- The 3 patch series "ida: Remove the ida_simple_xxx() API" from
   Christophe Jaillet completes the removal of this legacy IDR API.
 
 - The 9 patch series "panic: introduce panic status function family"
   from Jinchao Wang provides a number of cleanups to the panic code and
   its various helpers, which were rather ad-hoc and scattered all over the
   place.
 
 - The 5 patch series "tools/delaytop: implement real-time keyboard
   interaction support" from Fan Yu adds a few nice user-facing usability
   changes to the delaytop monitoring tool.
 
 - The 3 patch series "efi: Fix EFI boot with kexec handover (KHO)" from
   Evangelos Petrongonas fixes a panic which was happening with the
   combination of EFI and KHO.
 
 - The 2 patch series "Squashfs: performance improvement and a sanity
   check" from Phillip Lougher teaches squashfs's lseek() about
   SEEK_DATA/SEEK_HOLE.  A mere 150x speedup was measured for a well-chosen
   microbenchmark.
 
 - Plus another 50-odd singleton patches all over the place.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaN78zwAKCRDdBJ7gKXxA
 jhLeAQCddTv0XtSUTrvBvmrJVUBrQQeJc+LtNopMIjfAF/WAWAEAogSVKxg+HHEB
 GaVixx4zDriNzEqrqiCx9rm4l+YooQA=
 =XRe0
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2025-10-02-15-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - "ida: Remove the ida_simple_xxx() API" from Christophe Jaillet
   completes the removal of this legacy IDR API

 - "panic: introduce panic status function family" from Jinchao Wang
   provides a number of cleanups to the panic code and its various
   helpers, which were rather ad-hoc and scattered all over the place

 - "tools/delaytop: implement real-time keyboard interaction support"
   from Fan Yu adds a few nice user-facing usability changes to the
   delaytop monitoring tool

 - "efi: Fix EFI boot with kexec handover (KHO)" from Evangelos
   Petrongonas fixes a panic which was happening with the combination of
   EFI and KHO

 - "Squashfs: performance improvement and a sanity check" from Phillip
   Lougher teaches squashfs's lseek() about SEEK_DATA/SEEK_HOLE. A mere
   150x speedup was measured for a well-chosen microbenchmark

 - plus another 50-odd singleton patches all over the place

* tag 'mm-nonmm-stable-2025-10-02-15-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (75 commits)
  Squashfs: reject negative file sizes in squashfs_read_inode()
  kallsyms: use kmalloc_array() instead of kmalloc()
  MAINTAINERS: update Sibi Sankar's email address
  Squashfs: add SEEK_DATA/SEEK_HOLE support
  Squashfs: add additional inode sanity checking
  lib/genalloc: fix device leak in of_gen_pool_get()
  panic: remove CONFIG_PANIC_ON_OOPS_VALUE
  ocfs2: fix double free in user_cluster_connect()
  checkpatch: suppress strscpy warnings for userspace tools
  cramfs: fix incorrect physical page address calculation
  kernel: prevent prctl(PR_SET_PDEATHSIG) from racing with parent process exit
  Squashfs: fix uninit-value in squashfs_get_parent
  kho: only fill kimage if KHO is finalized
  ocfs2: avoid extra calls to strlen() after ocfs2_sprintf_system_inode_name()
  kernel/sys.c: fix the racy usage of task_lock(tsk->group_leader) in sys_prlimit64() paths
  sched/task.h: fix the wrong comment on task_lock() nesting with tasklist_lock
  coccinelle: platform_no_drv_owner: handle also built-in drivers
  coccinelle: of_table: handle SPI device ID tables
  lib/decompress: use designated initializers for struct compress_format
  efi: support booting with kexec handover (KHO)
  ...
2025-10-02 18:44:54 -07:00
Linus Torvalds
8804d970fa Summary of significant series in this pull request:
- The 3 patch series "mm, swap: improve cluster scan strategy" from
   Kairui Song improves performance and reduces the failure rate of swap
   cluster allocation.
 
 - The 4 patch series "support large align and nid in Rust allocators"
   from Vitaly Wool permits Rust allocators to set NUMA node and large
   alignment when perforning slub and vmalloc reallocs.
 
 - The 2 patch series "mm/damon/vaddr: support stat-purpose DAMOS" from
   Yueyang Pan extend DAMOS_STAT's handling of the DAMON operations sets
   for virtual address spaces for ops-level DAMOS filters.
 
 - The 3 patch series "execute PROCMAP_QUERY ioctl under per-vma lock"
   from Suren Baghdasaryan reduces mmap_lock contention during reads of
   /proc/pid/maps.
 
 - The 2 patch series "mm/mincore: minor clean up for swap cache
   checking" from Kairui Song performs some cleanup in the swap code.
 
 - The 11 patch series "mm: vm_normal_page*() improvements" from David
   Hildenbrand provides code cleanup in the pagemap code.
 
 - The 5 patch series "add persistent huge zero folio support" from
   Pankaj Raghav provides a block layer speedup by optionalls making the
   huge_zero_pagepersistent, instead of releasing it when its refcount
   falls to zero.
 
 - The 3 patch series "kho: fixes and cleanups" from Mike Rapoport adds a
   few touchups to the recently added Kexec Handover feature.
 
 - The 10 patch series "mm: make mm->flags a bitmap and 64-bit on all
   arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap.  To
   end the constant struggle with space shortage on 32-bit conflicting with
   64-bit's needs.
 
 - The 2 patch series "mm/swapfile.c and swap.h cleanup" from Chris Li
   cleans up some swap code.
 
 - The 7 patch series "selftests/mm: Fix false positives and skip
   unsupported tests" from Donet Tom fixes a few things in our selftests
   code.
 
 - The 7 patch series "prctl: extend PR_SET_THP_DISABLE to only provide
   THPs when advised" from David Hildenbrand "allows individual processes
   to opt-out of THP=always into THP=madvise, without affecting other
   workloads on the system".
 
   It's a long story - the [1/N] changelog spells out the considerations.
 
 - The 11 patch series "Add and use memdesc_flags_t" from Matthew Wilcox
   gets us started on the memdesc project.  Please see
   https://kernelnewbies.org/MatthewWilcox/Memdescs and
   https://blogs.oracle.com/linux/post/introducing-memdesc.
 
 - The 3 patch series "Tiny optimization for large read operations" from
   Chi Zhiling improves the efficiency of the pagecache read path.
 
 - The 5 patch series "Better split_huge_page_test result check" from Zi
   Yan improves our folio splitting selftest code.
 
 - The 2 patch series "test that rmap behaves as expected" from Wei Yang
   adds some rmap selftests.
 
 - The 3 patch series "remove write_cache_pages()" from Christoph Hellwig
   removes that function and converts its two remaining callers.
 
 - The 2 patch series "selftests/mm: uffd-stress fixes" from Dev Jain
   fixes some UFFD selftests issues.
 
 - The 3 patch series "introduce kernel file mapped folios" from Boris
   Burkov introduces the concept of "kernel file pages".  Using these
   permits btrfs to account its metadata pages to the root cgroup, rather
   than to the cgroups of random inappropriate tasks.
 
 - The 2 patch series "mm/pageblock: improve readability of some
   pageblock handling" from Wei Yang provides some readability improvements
   to the page allocator code.
 
 - The 11 patch series "mm/damon: support ARM32 with LPAE" from SeongJae
   Park teaches DAMON to understand arm32 highmem.
 
 - The 4 patch series "tools: testing: Use existing atomic.h for
   vma/maple tests" from Brendan Jackman performs some code cleanups and
   deduplication under tools/testing/.
 
 - The 2 patch series "maple_tree: Fix testing for 32bit compiles" from
   Liam Howlett fixes a couple of 32-bit issues in
   tools/testing/radix-tree.c.
 
 - The 2 patch series "kasan: unify kasan_enabled() and remove
   arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN
   arch-specific initialization code into a common arch-neutral
   implementation.
 
 - The 3 patch series "mm: remove zpool" from Johannes Weiner removes
   zspool - an indirection layer which now only redirects to a single thing
   (zsmalloc).
 
 - The 2 patch series "mm: task_stack: Stack handling cleanups" from
   Pasha Tatashin makes a couple of cleanups in the fork code.
 
 - The 37 patch series "mm: remove nth_page()" from David Hildenbrand
   makes rather a lot of adjustments at various nth_page() callsites,
   eventually permitting the removal of that undesirable helper function.
 
 - The 2 patch series "introduce kasan.write_only option in hw-tags" from
   Yeoreum Yun creates a KASAN read-only mode for ARM, using that
   architecture's memory tagging feature.  It is felt that a read-only mode
   KASAN is suitable for use in production systems rather than debug-only.
 
 - The 3 patch series "mm: hugetlb: cleanup hugetlb folio allocation"
   from Kefeng Wang does some tidying in the hugetlb folio allocation code.
 
 - The 12 patch series "mm: establish const-correctness for pointer
   parameters" from Max Kellermann makes quite a number of the MM API
   functions more accurate about the constness of their arguments.  This
   was getting in the way of subsystems (in this case CEPH) when they
   attempt to improving their own const/non-const accuracy.
 
 - The 7 patch series "Cleanup free_pages() misuse" from Vishal Moola
   fixes a number of code sites which were confused over when to use
   free_pages() vs __free_pages().
 
 - The 3 patch series "Add Rust abstraction for Maple Trees" from Alice
   Ryhl makes the mapletree code accessible to Rust.  Required by nouveau
   and by its forthcoming successor: the new Rust Nova driver.
 
 - The 2 patch series "selftests/mm: split_huge_page_test:
   split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and
   some cleanups to the thp selftesting code.
 
 - The 14 patch series "mm, swap: introduce swap table as swap cache
   (phase I)" from Chris Li and Kairui Song is the first step along the
   path to implementing "swap tables" - a new approach to swap allocation
   and state tracking which is expected to yield speed and space
   improvements.  This patchset itself yields a 5-20% performance benefit
   in some situations.
 
 - The 3 patch series "Some ptdesc cleanups" from Matthew Wilcox utilizes
   the new memdesc layer to clean up the ptdesc code a little.
 
 - The 3 patch series "Fix va_high_addr_switch.sh test failure" from
   Chunyu Hu fixes some issues in our 5-level pagetable selftesting code.
 
 - The 2 patch series "Minor fixes for memory allocation profiling" from
   Suren Baghdasaryan addresses a couple of minor issues in relatively new
   memory allocation profiling feature.
 
 - The 3 patch series "Small cleanups" from Matthew Wilcox has a few
   cleanups in preparation for more memdesc work.
 
 - The 2 patch series "mm/damon: add addr_unit for DAMON_LRU_SORT and
   DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in
   furtherance of supporting arm highmem.
 
 - The 2 patch series "selftests/mm: Add -Wunreachable-code and fix
   warnings" from Muhammad Anjum adds that compiler check to selftests code
   and fixes the fallout, by removing dead code.
 
 - The 10 patch series "Improvements to Victim Process Thawing and OOM
   Reaper Traversal Order" from zhongjinji makes a number of improvements
   in the OOM killer: mainly thawing a more appropriate group of victim
   threads so they can release resources.
 
 - The 5 patch series "mm/damon: misc fixups and improvements for 6.18"
   from SeongJae Park is a bunch of small and unrelated fixups for DAMON.
 
 - The 7 patch series "mm/damon: define and use DAMON initialization
   check function" from SeongJae Park implement reliability and
   maintainability improvements to a recently-added bug fix.
 
 - The 2 patch series "mm/damon/stat: expose auto-tuned intervals and
   non-idle ages" from SeongJae Park provides additional transparency to
   userspace clients of the DAMON_STAT information.
 
 - The 2 patch series "Expand scope of khugepaged anonymous collapse"
   from Dev Jain removes some constraints on khubepaged's collapsing of
   anon VMAs.  It also increases the success rate of MADV_COLLAPSE against
   an anon vma.
 
 - The 2 patch series "mm: do not assume file == vma->vm_file in
   compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards
   removal of file_operations.mmap().  This patchset concentrates upon
   clearing up the treatment of stacked filesystems.
 
 - The 6 patch series "mm: Improve mlock tracking for large folios" from
   Kiryl Shutsemau provides some fixes and improvements to mlock's tracking
   of large folios.  /proc/meminfo's "Mlocked" field became more accurate.
 
 - The 2 patch series "mm/ksm: Fix incorrect accounting of KSM counters
   during fork" from Donet Tom fixes several user-visible KSM stats
   inaccuracies across forks and adds selftest code to verify these
   counters.
 
 - The 2 patch series "mm_slot: fix the usage of mm_slot_entry" from Wei
   Yang addresses some potential but presently benign issues in KSM's
   mm_slot handling.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCaN3cywAKCRDdBJ7gKXxA
 jtaPAQDmIuIu7+XnVUK5V11hsQ/5QtsUeLHV3OsAn4yW5/3dEQD/UddRU08ePN+1
 2VRB0EwkLAdfMWW7TfiNZ+yhuoiL/AA=
 =4mhY
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - "mm, swap: improve cluster scan strategy" from Kairui Song improves
   performance and reduces the failure rate of swap cluster allocation

 - "support large align and nid in Rust allocators" from Vitaly Wool
   permits Rust allocators to set NUMA node and large alignment when
   perforning slub and vmalloc reallocs

 - "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend
   DAMOS_STAT's handling of the DAMON operations sets for virtual
   address spaces for ops-level DAMOS filters

 - "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren
   Baghdasaryan reduces mmap_lock contention during reads of
   /proc/pid/maps

 - "mm/mincore: minor clean up for swap cache checking" from Kairui Song
   performs some cleanup in the swap code

 - "mm: vm_normal_page*() improvements" from David Hildenbrand provides
   code cleanup in the pagemap code

 - "add persistent huge zero folio support" from Pankaj Raghav provides
   a block layer speedup by optionalls making the
   huge_zero_pagepersistent, instead of releasing it when its refcount
   falls to zero

 - "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to
   the recently added Kexec Handover feature

 - "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo
   Stoakes turns mm_struct.flags into a bitmap. To end the constant
   struggle with space shortage on 32-bit conflicting with 64-bit's
   needs

 - "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap
   code

 - "selftests/mm: Fix false positives and skip unsupported tests" from
   Donet Tom fixes a few things in our selftests code

 - "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised"
   from David Hildenbrand "allows individual processes to opt-out of
   THP=always into THP=madvise, without affecting other workloads on the
   system".

   It's a long story - the [1/N] changelog spells out the considerations

 - "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on
   the memdesc project. Please see

      https://kernelnewbies.org/MatthewWilcox/Memdescs and
      https://blogs.oracle.com/linux/post/introducing-memdesc

 - "Tiny optimization for large read operations" from Chi Zhiling
   improves the efficiency of the pagecache read path

 - "Better split_huge_page_test result check" from Zi Yan improves our
   folio splitting selftest code

 - "test that rmap behaves as expected" from Wei Yang adds some rmap
   selftests

 - "remove write_cache_pages()" from Christoph Hellwig removes that
   function and converts its two remaining callers

 - "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD
   selftests issues

 - "introduce kernel file mapped folios" from Boris Burkov introduces
   the concept of "kernel file pages". Using these permits btrfs to
   account its metadata pages to the root cgroup, rather than to the
   cgroups of random inappropriate tasks

 - "mm/pageblock: improve readability of some pageblock handling" from
   Wei Yang provides some readability improvements to the page allocator
   code

 - "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON
   to understand arm32 highmem

 - "tools: testing: Use existing atomic.h for vma/maple tests" from
   Brendan Jackman performs some code cleanups and deduplication under
   tools/testing/

 - "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes
   a couple of 32-bit issues in tools/testing/radix-tree.c

 - "kasan: unify kasan_enabled() and remove arch-specific
   implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific
   initialization code into a common arch-neutral implementation

 - "mm: remove zpool" from Johannes Weiner removes zspool - an
   indirection layer which now only redirects to a single thing
   (zsmalloc)

 - "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a
   couple of cleanups in the fork code

 - "mm: remove nth_page()" from David Hildenbrand makes rather a lot of
   adjustments at various nth_page() callsites, eventually permitting
   the removal of that undesirable helper function

 - "introduce kasan.write_only option in hw-tags" from Yeoreum Yun
   creates a KASAN read-only mode for ARM, using that architecture's
   memory tagging feature. It is felt that a read-only mode KASAN is
   suitable for use in production systems rather than debug-only

 - "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does
   some tidying in the hugetlb folio allocation code

 - "mm: establish const-correctness for pointer parameters" from Max
   Kellermann makes quite a number of the MM API functions more accurate
   about the constness of their arguments. This was getting in the way
   of subsystems (in this case CEPH) when they attempt to improving
   their own const/non-const accuracy

 - "Cleanup free_pages() misuse" from Vishal Moola fixes a number of
   code sites which were confused over when to use free_pages() vs
   __free_pages()

 - "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the
   mapletree code accessible to Rust. Required by nouveau and by its
   forthcoming successor: the new Rust Nova driver

 - "selftests/mm: split_huge_page_test: split_pte_mapped_thp
   improvements" from David Hildenbrand adds a fix and some cleanups to
   the thp selftesting code

 - "mm, swap: introduce swap table as swap cache (phase I)" from Chris
   Li and Kairui Song is the first step along the path to implementing
   "swap tables" - a new approach to swap allocation and state tracking
   which is expected to yield speed and space improvements. This
   patchset itself yields a 5-20% performance benefit in some situations

 - "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc
   layer to clean up the ptdesc code a little

 - "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some
   issues in our 5-level pagetable selftesting code

 - "Minor fixes for memory allocation profiling" from Suren Baghdasaryan
   addresses a couple of minor issues in relatively new memory
   allocation profiling feature

 - "Small cleanups" from Matthew Wilcox has a few cleanups in
   preparation for more memdesc work

 - "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from
   Quanmin Yan makes some changes to DAMON in furtherance of supporting
   arm highmem

 - "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad
   Anjum adds that compiler check to selftests code and fixes the
   fallout, by removing dead code

 - "Improvements to Victim Process Thawing and OOM Reaper Traversal
   Order" from zhongjinji makes a number of improvements in the OOM
   killer: mainly thawing a more appropriate group of victim threads so
   they can release resources

 - "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park
   is a bunch of small and unrelated fixups for DAMON

 - "mm/damon: define and use DAMON initialization check function" from
   SeongJae Park implement reliability and maintainability improvements
   to a recently-added bug fix

 - "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from
   SeongJae Park provides additional transparency to userspace clients
   of the DAMON_STAT information

 - "Expand scope of khugepaged anonymous collapse" from Dev Jain removes
   some constraints on khubepaged's collapsing of anon VMAs. It also
   increases the success rate of MADV_COLLAPSE against an anon vma

 - "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()"
   from Lorenzo Stoakes moves us further towards removal of
   file_operations.mmap(). This patchset concentrates upon clearing up
   the treatment of stacked filesystems

 - "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau
   provides some fixes and improvements to mlock's tracking of large
   folios. /proc/meminfo's "Mlocked" field became more accurate

 - "mm/ksm: Fix incorrect accounting of KSM counters during fork" from
   Donet Tom fixes several user-visible KSM stats inaccuracies across
   forks and adds selftest code to verify these counters

 - "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses
   some potential but presently benign issues in KSM's mm_slot handling

* tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits)
  mm: swap: check for stable address space before operating on the VMA
  mm: convert folio_page() back to a macro
  mm/khugepaged: use start_addr/addr for improved readability
  hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list
  alloc_tag: fix boot failure due to NULL pointer dereference
  mm: silence data-race in update_hiwater_rss
  mm/memory-failure: don't select MEMORY_ISOLATION
  mm/khugepaged: remove definition of struct khugepaged_mm_slot
  mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL
  hugetlb: increase number of reserving hugepages via cmdline
  selftests/mm: add fork inheritance test for ksm_merging_pages counter
  mm/ksm: fix incorrect KSM counter handling in mm_struct during fork
  drivers/base/node: fix double free in register_one_node()
  mm: remove PMD alignment constraint in execmem_vmalloc()
  mm/memory_hotplug: fix typo 'esecially' -> 'especially'
  mm/rmap: improve mlock tracking for large folios
  mm/filemap: map entire large folio faultaround
  mm/fault: try to map the entire file folio in finish_fault()
  mm/rmap: mlock large folios in try_to_unmap_one()
  mm/rmap: fix a mlock race condition in folio_referenced_one()
  ...
2025-10-02 18:18:33 -07:00
Linus Torvalds
24d9e8b3c9 slab updates for 6.18
-----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCAA5FiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmja74IbFIAAAAAABAAO
 bWFudTIsMi41KzEuMTEsMiwyAAoJELvgsHXSRYiacR4H/04aBsr7LZnTJVeZLQwK
 HKoOwXBqiQyqPdjKXGKnp7Mh9gRp2W3V11VsYTuDJNUS+Vz5YXW0z8cRnUfZ3SYs
 l+GZC3vZeAy2EVJE1U6Mb673hU8vziI80IO2q/tGzaj9a+wC3L0lemc+YFQTwG+u
 pMtt8zU2vHRjgkx8TNNqJBBOLLDV+RzIl8pqXVnh4eju6x6ZdreGnjXaePYMdjG0
 fXLf9XwIeWREqbfeOCEOB50Ts71kkdiOeskwnJyfCTDT8WTu3zC/dICqfh66e3Gg
 8hQKvMsuKpm/FwbtgdB0WvaDjENH6PmY+ubLYVxwvNpcsTSqfe0IYGm+HpUP+TPf
 m+Y=
 =w+JL
 -----END PGP SIGNATURE-----

Merge tag 'slab-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab updates from Vlastimil Babka:

 - A new layer for caching objects for allocation and free via percpu
   arrays called sheaves.

   The aim is to combine the good parts of SLAB (lower-overhead and
   simpler percpu caching, compared to SLUB) without the past issues
   with arrays for freeing remote NUMA node objects and their flushing.

   It also allows more efficient kfree_rcu(), and cheaper object
   preallocations for cases where the exact number of objects is
   unknown, but an upper bound is.

   Currently VMAs and maple nodes are using this new caching, with a
   plan to enable it for all caches and remove the complex SLUB fastpath
   based on cpu (partial) slabs and this_cpu_cmpxchg_double().
   (Vlastimil Babka, with Liam Howlett and Pedro Falcato for the maple
   tree changes)

 - Re-entrant kmalloc_nolock(), which allows opportunistic allocations
   from NMI and tracing/kprobe contexts.

   Building on prior page allocator and memcg changes, it will result in
   removing BPF-specific caches on top of slab (Alexei Starovoitov)

 - Various fixes and cleanups. (Kuan-Wei Chiu, Matthew Wilcox, Suren
   Baghdasaryan, Ye Liu)

* tag 'slab-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (40 commits)
  slab: Introduce kmalloc_nolock() and kfree_nolock().
  slab: Reuse first bit for OBJEXTS_ALLOC_FAIL
  slab: Make slub local_(try)lock more precise for LOCKDEP
  mm: Introduce alloc_frozen_pages_nolock()
  mm: Allow GFP_ACCOUNT to be used in alloc_pages_nolock().
  locking/local_lock: Introduce local_lock_is_locked().
  maple_tree: Convert forking to use the sheaf interface
  maple_tree: Add single node allocation support to maple state
  maple_tree: Prefilled sheaf conversion and testing
  tools/testing: Add support for prefilled slab sheafs
  maple_tree: Replace mt_free_one() with kfree()
  maple_tree: Use kfree_rcu in ma_free_rcu
  testing/radix-tree/maple: Hack around kfree_rcu not existing
  tools/testing: include maple-shim.c in maple.c
  maple_tree: use percpu sheaves for maple_node_cache
  mm, vma: use percpu sheaves for vm_area_struct cache
  tools/testing: Add support for changes to slab for sheaves
  slab: allow NUMA restricted allocations to use percpu sheaves
  tools/testing/vma: Implement vm_refcnt reset
  slab: skip percpu sheaves for remote object freeing
  ...
2025-10-02 15:58:05 -07:00
Linus Torvalds
07fdad3a93 Networking changes for 6.18.
Core & protocols
 ----------------
 
  - Improve drop account scalability on NUMA hosts for RAW and UDP sockets
    and the backlog, almost doubling the Pps capacity under DoS.
 
  - Optimize the UDP RX performance under stress, reducing contention,
    revisiting the binary layout of the involved data structs and
    implementing NUMA-aware locking. This improves UDP RX performance by
    an additional 50%, even more under extreme conditions.
 
  - Add support for PSP encryption of TCP connections; this mechanism has
    some similarities with IPsec and TLS, but offers superior HW offloads
    capabilities.
 
  - Ongoing work to support Accurate ECN for TCP. AccECN allows more than
    one congestion notification signal per RTT and is a building block for
    Low Latency, Low Loss, and Scalable Throughput (L4S).
 
  - Reorganize the TCP socket binary layout for data locality, reducing
    the number of touched cachelines in the fastpath.
 
  - Refactor skb deferral free to better scale on large multi-NUMA hosts,
    this improves TCP and UDP RX performances significantly on such HW.
 
  - Increase the default socket memory buffer limits from 256K to 4M to
    better fit modern link speeds.
 
  - Improve handling of setups with a large number of nexthop, making dump
    operating scaling linearly and avoiding unneeded synchronize_rcu() on
    delete.
 
  - Improve bridge handling of VLAN FDB, storing a single entry per bridge
    instead of one entry per port; this makes the dump order of magnitude
    faster on large switches.
 
  - Restore IP ID correctly for encapsulated packets at GSO segmentation
    time, allowing GRO to merge packets in more scenarios.
 
  - Improve netfilter matching performance on large sets.
 
  - Improve MPTCP receive path performance by leveraging recently
    introduced core infrastructure (skb deferral free) and adopting recent
    TCP autotuning changes.
 
  - Allow bridges to redirect to a backup port when the bridge port is
    administratively down.
 
  - Introduce MPTCP 'laminar' endpoint that con be used only once per
    connection and simplify common MPTCP setups.
 
  - Add RCU safety to dst->dev, closing a lot of possible races.
 
  - A significant crypto library API for SCTP, MPTCP and IPv6 SR, reducing
    code duplication.
 
  - Supports pulling data from an skb frag into the linear area of an XDP
    buffer.
 
 Things we sprinkled into general kernel code
 --------------------------------------------
 
  - Generate netlink documentation from YAML using an integrated
    YAML parser.
 
 Driver API
 ----------
 
  - Support using IPv6 Flow Label in Rx hash computation and RSS queue
    selection.
 
  - Introduce API for fetching the DMA device for a given queue, allowing
    TCP zerocopy RX on more H/W setups.
 
  - Make XDP helpers compatible with unreadable memory, allowing more
    easily building DevMem-enabled drivers with a unified XDP/skbs
    datapath.
 
  - Add a new dedicated ethtool callback enabling drivers to provide the
    number of RX rings directly, improving efficiency and clarity in RX
    ring queries and RSS configuration.
 
  - Introduce a burst period for the health reporter, allowing better
    handling of multiple errors due to the same root cause.
 
  - Support for DPLL phase offset exponential moving average, controlling
    the average smoothing factor.
 
 Device drivers
 --------------
 
  - Add a new Huawei driver for 3rd gen NIC (hinic3).
 
  - Add a new SpacemiT driver for K1 ethernet MAC.
 
  - Add a generic abstraction for shared memory communication devices
    (dibps)
 
  - Ethernet high-speed NICs:
    - nVidia/Mellanox:
      - Use multiple per-queue doorbell, to avoid MMIO contention issues
      - support adjacent functions, allowing them to delegate their
        SR-IOV VFs to sibling PFs
      - support RSS for IPSec offload
      - support exposing raw cycle counters in PTP and mlx5
      - support for disabling host PFs.
    - Intel (100G, ice, idpf):
      - ice: support for SRIOV VFs over an Active-Active link aggregate
      - ice: support for firmware logging via debugfs
      - ice: support for Earliest TxTime First (ETF) hardware offload
      - idpf: support basic XDP functionalities and XSk
    - Broadcom (bnxt):
      - support Hyper-V VF ID
      - dynamic SRIOV resource allocations for RoCE
    - Meta (fbnic):
      - support queue API, zero-copy Rx and Tx
      - support basic XDP functionalities
      - devlink health support for FW crashes and OTP mem corruptions
      - expand hardware stats coverage to FEC, PHY, and Pause
    - Wangxun:
      - support ethtool coalesce options
      - support for multiple RSS contexts
 
  - Ethernet virtual:
    - Macsec:
      - replace custom netlink attribute checks with policy-level checks
    - Bonding:
      - support aggregator selection based on port priority
    - Microsoft vNIC:
      - use page pool fragments for RX buffers instead of full pages to
        improve memory efficiency
 
  - Ethernet NICs consumer, and embedded:
    - Qualcomm: support Ethernet function for IPQ9574 SoC
    - Airoha: implement wlan offloading via NPU
    - Freescale
      - enetc: add NETC timer PTP driver and add PTP support
      - fec: enable the Jumbo frame support for i.MX8QM
    - Renesas (R-Car S4): support HW offloading for layer 2 switching
      - support for RZ/{T2H, N2H} SoCs
    - Cadence (macb): support TAPRIO traffic scheduling
    - TI:
      - support for Gigabit ICSS ethernet SoC (icssm-prueth)
    - Synopsys (stmmac): a lot of cleanups
 
  - Ethernet PHYs:
    - Support 10g-qxgmi phy-mode for AQR412C, Felix DSA and Lynx PCS
      driver
    - Support bcm63268 GPHY power control
    - Support for Micrel lan8842 PHY and PTP
    - Support for Aquantia AQR412 and AQR115
 
  - CAN:
    - a large CAN-XL preparation work
    - reorganize raw_sock and uniqframe struct to minimize memory usage
    - rcar_canfd: update the CAN-FD handling
 
  - WiFi:
    - extended Neighbor Awareness Networking (NAN) support
    - S1G channel representation cleanup
    - improve S1G support
 
  - WiFi drivers:
    - Intel (iwlwifi):
      - major refactor and cleanup
    - Broadcom (brcm80211):
      - support for AP isolation
    - RealTek (rtw88/89) rtw88/89:
      - preparation work for RTL8922DE support
    - MediaTek (mt76):
      - HW restart improvements
      - MLO support
    - Qualcomm/Atheros (ath10k_
      - GTK rekey fixes
 
  - Bluetooth drivers:
    - btusb: support for several new IDs for MT7925
    - btintel: support for BlazarIW core
    - btintel_pcie: support for _suspend() / _resume()
    - btintel_pcie: support for Scorpious, Panther Lake-H484 IDs
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmjdBdsSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkWnAP/2Jz0ExnYMDxLxkkKqMte+3JNeF/KNjB
 zVIslYN/5ekkd1TMXyDLFlWHqUOUMSF9+cK5ZRMHG6/lzOueSzFuuuDFsWs6Kf2f
 q7Q9MzlXhR9++YCsdES1uS5x3PPjMInzo2ZivMFa6fUDLLFSzeAOKL9k+RS0EggU
 VlXv2Wkm73R0O6KAssgDsHke9cnNz+F0DzhQ0S3qkyZF9tS5NrDeUl7fZ47XZgwb
 ZuXdEzKmTTepo2XvZGxJoF2D7nekWFlHhLjEPpDJtET19nwhukCry41/FplJwlKR
 dWsYkqLkrOEQKFQ+q++/5c3BgFOG+IrQLbP5bGQF1tZRMl4Dqp9PDxK5fKJfccbS
 0VY3Y2qWOSYejT2Ji2kEHR5E5rPyFm3Y5A4Rnpz0yEHj14vL2v4zf7CZRkCyyDfD
 doIZXFGkM0+N7QeXLEN833cje9zjaXuP9GAE+2bb+wAWAZAqof4KX8JgHh+y5Rwm
 pvUtvFxmEtntlMwNBap8aT3FquGtfZncU8pzAE5kvWjuMvyF0NVWiE5rB2kSQM/X
 NLmUdvDyjwwJqthXAJG0fl+a0mNJ/kOAqSOKJDJrfKDGWa+ovwY0iY06SpK0wIbO
 Wz7tpMk5MSlYXW8xWMlbyhvvU6T9xuoQ2KV4QTdMxc6Ir3sNX6YkQr+gjQjxB0gx
 ST2QF6lZeWFh
 =w2Kz
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "Core & protocols:

   - Improve drop account scalability on NUMA hosts for RAW and UDP
     sockets and the backlog, almost doubling the Pps capacity under DoS

   - Optimize the UDP RX performance under stress, reducing contention,
     revisiting the binary layout of the involved data structs and
     implementing NUMA-aware locking. This improves UDP RX performance
     by an additional 50%, even more under extreme conditions

   - Add support for PSP encryption of TCP connections; this mechanism
     has some similarities with IPsec and TLS, but offers superior HW
     offloads capabilities

   - Ongoing work to support Accurate ECN for TCP. AccECN allows more
     than one congestion notification signal per RTT and is a building
     block for Low Latency, Low Loss, and Scalable Throughput (L4S)

   - Reorganize the TCP socket binary layout for data locality, reducing
     the number of touched cachelines in the fastpath

   - Refactor skb deferral free to better scale on large multi-NUMA
     hosts, this improves TCP and UDP RX performances significantly on
     such HW

   - Increase the default socket memory buffer limits from 256K to 4M to
     better fit modern link speeds

   - Improve handling of setups with a large number of nexthop, making
     dump operating scaling linearly and avoiding unneeded
     synchronize_rcu() on delete

   - Improve bridge handling of VLAN FDB, storing a single entry per
     bridge instead of one entry per port; this makes the dump order of
     magnitude faster on large switches

   - Restore IP ID correctly for encapsulated packets at GSO
     segmentation time, allowing GRO to merge packets in more scenarios

   - Improve netfilter matching performance on large sets

   - Improve MPTCP receive path performance by leveraging recently
     introduced core infrastructure (skb deferral free) and adopting
     recent TCP autotuning changes

   - Allow bridges to redirect to a backup port when the bridge port is
     administratively down

   - Introduce MPTCP 'laminar' endpoint that con be used only once per
     connection and simplify common MPTCP setups

   - Add RCU safety to dst->dev, closing a lot of possible races

   - A significant crypto library API for SCTP, MPTCP and IPv6 SR,
     reducing code duplication

   - Supports pulling data from an skb frag into the linear area of an
     XDP buffer

  Things we sprinkled into general kernel code:

   - Generate netlink documentation from YAML using an integrated YAML
     parser

  Driver API:

   - Support using IPv6 Flow Label in Rx hash computation and RSS queue
     selection

   - Introduce API for fetching the DMA device for a given queue,
     allowing TCP zerocopy RX on more H/W setups

   - Make XDP helpers compatible with unreadable memory, allowing more
     easily building DevMem-enabled drivers with a unified XDP/skbs
     datapath

   - Add a new dedicated ethtool callback enabling drivers to provide
     the number of RX rings directly, improving efficiency and clarity
     in RX ring queries and RSS configuration

   - Introduce a burst period for the health reporter, allowing better
     handling of multiple errors due to the same root cause

   - Support for DPLL phase offset exponential moving average,
     controlling the average smoothing factor

  Device drivers:

   - Add a new Huawei driver for 3rd gen NIC (hinic3)

   - Add a new SpacemiT driver for K1 ethernet MAC

   - Add a generic abstraction for shared memory communication
     devices (dibps)

   - Ethernet high-speed NICs:
      - nVidia/Mellanox:
         - Use multiple per-queue doorbell, to avoid MMIO contention
           issues
         - support adjacent functions, allowing them to delegate their
           SR-IOV VFs to sibling PFs
         - support RSS for IPSec offload
         - support exposing raw cycle counters in PTP and mlx5
         - support for disabling host PFs.
      - Intel (100G, ice, idpf):
         - ice: support for SRIOV VFs over an Active-Active link
           aggregate
         - ice: support for firmware logging via debugfs
         - ice: support for Earliest TxTime First (ETF) hardware offload
         - idpf: support basic XDP functionalities and XSk
      - Broadcom (bnxt):
         - support Hyper-V VF ID
         - dynamic SRIOV resource allocations for RoCE
      - Meta (fbnic):
         - support queue API, zero-copy Rx and Tx
         - support basic XDP functionalities
         - devlink health support for FW crashes and OTP mem corruptions
         - expand hardware stats coverage to FEC, PHY, and Pause
      - Wangxun:
         - support ethtool coalesce options
         - support for multiple RSS contexts

   - Ethernet virtual:
      - Macsec:
         - replace custom netlink attribute checks with policy-level
           checks
      - Bonding:
         - support aggregator selection based on port priority
      - Microsoft vNIC:
         - use page pool fragments for RX buffers instead of full pages
           to improve memory efficiency

   - Ethernet NICs consumer, and embedded:
      - Qualcomm: support Ethernet function for IPQ9574 SoC
      - Airoha: implement wlan offloading via NPU
      - Freescale
         - enetc: add NETC timer PTP driver and add PTP support
         - fec: enable the Jumbo frame support for i.MX8QM
      - Renesas (R-Car S4):
         - support HW offloading for layer 2 switching
         - support for RZ/{T2H, N2H} SoCs
      - Cadence (macb): support TAPRIO traffic scheduling
      - TI:
         - support for Gigabit ICSS ethernet SoC (icssm-prueth)
      - Synopsys (stmmac): a lot of cleanups

   - Ethernet PHYs:
      - Support 10g-qxgmi phy-mode for AQR412C, Felix DSA and Lynx PCS
        driver
      - Support bcm63268 GPHY power control
      - Support for Micrel lan8842 PHY and PTP
      - Support for Aquantia AQR412 and AQR115

   - CAN:
      - a large CAN-XL preparation work
      - reorganize raw_sock and uniqframe struct to minimize memory
        usage
      - rcar_canfd: update the CAN-FD handling

   - WiFi:
      - extended Neighbor Awareness Networking (NAN) support
      - S1G channel representation cleanup
      - improve S1G support

   - WiFi drivers:
      - Intel (iwlwifi):
         - major refactor and cleanup
      - Broadcom (brcm80211):
         - support for AP isolation
      - RealTek (rtw88/89) rtw88/89:
         - preparation work for RTL8922DE support
      - MediaTek (mt76):
         - HW restart improvements
         - MLO support
      - Qualcomm/Atheros (ath10k):
         - GTK rekey fixes

   - Bluetooth drivers:
      - btusb: support for several new IDs for MT7925
      - btintel: support for BlazarIW core
      - btintel_pcie: support for _suspend() / _resume()
      - btintel_pcie: support for Scorpious, Panther Lake-H484 IDs"

* tag 'net-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1536 commits)
  net: stmmac: Add support for Allwinner A523 GMAC200
  dt-bindings: net: sun8i-emac: Add A523 GMAC200 compatible
  Revert "Documentation: net: add flow control guide and document ethtool API"
  octeontx2-pf: fix bitmap leak
  octeontx2-vf: fix bitmap leak
  net/mlx5e: Use extack in set rxfh callback
  net/mlx5e: Introduce mlx5e_rss_params for RSS configuration
  net/mlx5e: Introduce mlx5e_rss_init_params
  net/mlx5e: Remove unused mdev param from RSS indir init
  net/mlx5: Improve QoS error messages with actual depth values
  net/mlx5e: Prevent entering switchdev mode with inconsistent netns
  net/mlx5: HWS, Generalize complex matchers
  net/mlx5: Improve write-combining test reliability for ARM64 Grace CPUs
  selftests/net: add tcp_port_share to .gitignore
  Revert "net/mlx5e: Update and set Xon/Xoff upon MTU set"
  net: add NUMA awareness to skb_attempt_defer_free()
  net: use llist for sd->defer_list
  net: make softnet_data.defer_count an atomic
  selftests: drv-net: psp: add tests for destroying devices
  selftests: drv-net: psp: add test for auto-adjusting TCP MSS
  ...
2025-10-02 15:17:01 -07:00
Linus Torvalds
d7a018eb76 Kernel Concurrency Sanitizer (KCSAN) updates for v6.18
- Replace deprecated strcpy() with strscpy()
 
 This change has had 6 weeks of linux-next exposure.
 -----BEGIN PGP SIGNATURE-----
 
 iIcEABYKAC8WIQR7t4b/75lzOR3l5rcxsLN3bbyLnwUCaNpmMxEcZWx2ZXJAZ29v
 Z2xlLmNvbQAKCRAxsLN3bbyLnxmiAP0cwJKy4B307v7TJUCrvXVqXJtyuGmdBdKv
 j2MOuvt4ygD9HIVd/pFeFkjH0MVNuSbYNl1ICb/UK+vx/5MA/doXPAM=
 =jiYN
 -----END PGP SIGNATURE-----

Merge tag 'kcsan-20250929-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/melver/linux

Pull Kernel Concurrency Sanitizer (KCSAN) update from Marco Elver:

 - Replace deprecated strcpy() with strscpy()

* tag 'kcsan-20250929-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/melver/linux:
  kcsan: test: Replace deprecated strcpy() with strscpy()
2025-10-02 08:31:44 -07:00
Petr Mladek
7a75a5da79 Merge branch 'rework/ringbuffer-kunit-test' into for-linus 2025-10-02 10:33:08 +02:00
Linus Torvalds
30bbcb4470 linux_kselftest-kunit-6.18-rc1
- A seven patch series adds a new parameterized test features
   KUnit parameterized tests currently support two primary methods for
   getting parameters:
     1.  Defining custom logic within a generate_params() function.
     2.  Using the KUNIT_ARRAY_PARAM() and KUNIT_ARRAY_PARAM_DESC()
         macros with a pre-defined static array and passing
         the created *_gen_params() to KUNIT_CASE_PARAM().
 
     These methods present limitations when dealing with dynamically
     generated parameter arrays, or in scenarios where populating parameters
     sequentially via generate_params() is inefficient or overly complex.
 
     These limitations are fixed with a parameterized test method.
 
 - Fixes issues in kunit build artifacts cleanup,
 - Fixes parsing skipped test problem in kselftest framework,
 - Enables PCI on UML without triggering WARN()
 - a few other fixes and adds support for new configs such as MIPS
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmjbB0QACgkQCwJExA0N
 QxxpUQ/8CslEjTv2+/LA122eGDtg8np61W1MTNEQslYyiobhZ9CXeFw4yJlTTb0w
 hShFK1pGAWgkVCVKIJOaeItY0hF3BIxyVtlQxDUKvukpMnLvZRI0KhG7p8aYnCq+
 jfQRy4gqwjaHyLQekQ1v6vRdHdTfh5mB6OUOCYa7QPQuKhOkBOaq2ZJ+9eFnkPXl
 KUGTcC3pL0jQQmaSgo7zFUbdGSq0JZkNbpMj0fAYB0zs+MSpfkAnQYEw3qaxU1/Z
 lu+CQdom+77Kt0r7UyEb6xUBeRveqYAWv6oFKq3CBzhlEGCkqVv6zgNg48qMC0QO
 cVW0E61u8bVAalhb7hOT3QsEp1k01Lr2N9/VBLP4LRb2HMlD2qlXDo6ftBjNgx/y
 m5Kukh1wQoOTaTJekdDMlaRXwApdn3MegheXE83n4dr44C5oyhlR8fFk/0Y6gSEU
 RKUg8ZwUNUhCdw4VgBn8zoSkK18D74feRoustmuMS00I/8to3iGvQ7kn3nz2fPMb
 AaZ6a2K95gk9TrD4UZdHF1aPHDudn3e8g6LdSOV/NoRNI9H+fXfhvxcLwW8WCqOw
 u5qfTxNUk4mPQrCs9hcDnWm9Ppuu5YeY989rKkA7j4z71dhyMOPynpOM+vXRijB8
 9zRcvyls87tgDAquIZvTp6Q/oEmi60OF40DRC9MjgOw0Iw7feIc=
 =VfHu
 -----END PGP SIGNATURE-----

Merge tag 'linux_kselftest-kunit-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kunit updates from Shuah Khan:

 - New parameterized test features

   KUnit parameterized tests supported two primary methods for getting
   parameters:

    - Defining custom logic within a generate_params() function.

    - Using the KUNIT_ARRAY_PARAM() and KUNIT_ARRAY_PARAM_DESC() macros
      with a pre-defined static array and passing the created
      *_gen_params() to KUNIT_CASE_PARAM().

   These methods present limitations when dealing with dynamically
   generated parameter arrays, or in scenarios where populating
   parameters sequentially via generate_params() is inefficient or
   overly complex.

   These limitations are fixed with a parameterized test method

 - Fix issues in kunit build artifacts cleanup

 - Fix parsing skipped test problem in kselftest framework

 - Enable PCI on UML without triggering WARN()

 - a few other fixes and adds support for new configs such as MIPS

* tag 'linux_kselftest-kunit-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: Extend kconfig help text for KUNIT_UML_PCI
  rust: kunit: allow `cfg` on `test`s
  kunit: qemu_configs: Add MIPS configurations
  kunit: Enable PCI on UML without triggering WARN()
  Documentation: kunit: Document new parameterized test features
  kunit: Add example parameterized test with direct dynamic parameter array setup
  kunit: Add example parameterized test with shared resource management using the Resource API
  kunit: Enable direct registration of parameter arrays to a KUnit test
  kunit: Pass parameterized test context to generate_params()
  kunit: Introduce param_init/exit for parameterized test context management
  kunit: Add parent kunit for parameterized test context
  kunit: tool: Accept --raw_output=full as an alias of 'all'
  kunit: tool: Parse skipped tests from kselftest.h
  kunit: Always descend into kunit directory during build
2025-10-01 19:15:11 -07:00
Linus Torvalds
991053178e Power management updates for 6.18-rc1
- Rearrange variable declarations involving __free() in the cpufreq
    core and intel_pstate driver to follow common coding style (Rafael
    Wysocki)
 
  - Fix object lifecycle issue in update_qos_request(), rearrange
    freq QoS updates using __free(), and adjust frequency percentage
    computations in the intel_pstate driver (Rafael Wysocki)
 
  - Update intel_pstate to allow it to enable HWP without EPP if the
    new DEC (Dynamic Efficiency Control) HW feature is enabled (Rafael
    Wysocki)
 
  - Use on_each_cpu_mask() in drv_write() in the ACPI cpufreq driver
    to simplify the code (Rafael Wysocki)
 
  - Use likely() optimization in intel_pstate_sample() (Yaxiong Tian)
 
  - Remove dead EPB-related code from intel_pstate (Srinivas Pandruvada)
 
  - Use scope-based cleanup for cpufreq policy references in multiple
    cpufreq drivers (Zihuan Zhang)
 
  - Avoid calling get_governor() for the first policy in the cpufreq core
    to simplify the initial policy path (Zihuan Zhang)
 
  - Clean up the cpufreq core in multiple places (Zihuan Zhang)
 
  - Use int type to store negative error codes in the cpufreq core and
    update the speedstep-lib to use int for error codes (Qianfeng Rong)
 
  - Update the efficient idle check for Intel extended Families in the
    ondemand cpufreq governor (Sohil Mehta)
 
  - Replace sscanf() with kstrtouint() in the conservative cpufreq
    governor (Kaushlendra Kumar)
 
  - Rename CpumaskVar::as[_mut]_ref to from_raw[_mut] in the cpumask
    Rust code and mark CpumaskVar as transparent (Alice Ryhl, Baptiste
    Lepers)
 
  - Update ARef and AlwaysRefCounted imports from sync::aref in the OPP
    Rust code (Shankari Anand)
 
  - Add support for AN7583 SoC to the airoha cpufreq driver (Christian
    Marangi)
 
  - Enable cpufreq for ipq5424 in the qcom-nvmem cpufreq driver (Md Sadre
    Alam)
 
  - Add support for MT8196 to the mediatek-hw cpufreq driver, refactor
    that driver and add mediatek,mt8196-cpufreq-hw DT binding (Nicolas
    Frattaroli)
 
  - Avoid redundant conditions in the mediatek cpufreq driver (Liao
    Yuanhong)
 
  - Add support for AM62D2 to the ti cpufreq driver and blocklist
    ti,am62d2 SoC in dt-platdev (Paresh Bhagat)
 
  - Support more speed grades on AM62Px SoC in the ti cpufreq driver,
    allow all silicon revisions to support OPPs in it, and fix supported
    hardware for 1GHz OPP (Judith Mendez)
 
  - Add QCS615 compatible to DT bindings for cpufreq-qcom-hw (Taniya Das)
 
  - Minor assorted updates of the scmi, longhaul, CPPC, and armada-37xx
    cpufreq drivers (Akhilesh Patil, BowenYu, Dennis Beier, and Florian
    Fainelli)
 
  - Remove outdated cpufreq-dt.txt (Frank Li)
 
  - Fix python gnuplot package names in the amd_pstate_tracer utility
    (Kuan-Wei Chiu)
 
  - Saravana Kannan will maintain the virtual-cpufreq driver (Saravana
    Kannan)
 
  - Prevent CPU capacity updates after registering a perf domain from
    failing on a first CPU that is not present (Christian Loehle)
 
  - Add support for the cases in which frequency alone is not sufficient
    to uniquely identify an OPP (Krishna Chaitanya Chundru)
 
  - Use to_result() for OPP error handling in Rust (Onur Özkan)
 
  - Add support for LPDDR5 on Rockhip RK3588 SoC to rockchip-dfi devfreq
    driver (Nicolas Frattaroli)
 
  - Fix an issue where DDR cycle counts on RK3588/RK3528 with LPDDR4(X)
    are reported as half by adding a cycle multiplier to the DFI driver
    in rockchip-dfi devfreq-event driver (Nicolas Frattaroli)
 
  - Fix missing error pointer dereference check of regulator instance in
    the mtk-cci devfreq driver probe and remove a redundant condition from
    an if () statement in that driver (Dan Carpenter, Liao Yuanhong)
 
  - Fail cpuidle device registration if there is one already to avoid
    sysfs-related issues (Rafael Wysocki)
 
  - Use sysfs_emit()/sysfs_emit_at() instead of sprintf()/scnprintf() in
    cpuidle (Vivek Yadav)
 
  - Fix device and OF node leaks at probe in the qcom-spm cpuidle driver
    and drop unnecessary initialisations from it (Johan Hovold)
 
  - Remove unnecessary address-of operators from the intel_idle cpuidle
    driver (Kaushlendra Kumar)
 
  - Rearrange main loop in menu_select() to make the code in that funtion
    easier to follow (Rafael Wysocki)
 
  - Convert values in microseconds to ktime using us_to_ktime() where
    applicable in the intel_idle power capping driver (Xichao Zhao)
 
  - Annotate loops walking device links in the power management core
    code as _srcu and add macros for walking device links to reduce the
    likelihood of coding mistakes related to them (Rafael Wysocki)
 
  - Document time units for *_time functions in the runtime PM API (Brian
    Norris)
 
  - Clear power.must_resume in noirq suspend error path to avoid resuming
    a dependant device under a suspended parent or supplier (Rafael
    Wysocki)
 
  - Fix GFP mask handling during hybrid suspend and make the amdgpu
    driver handle hybrid suspend correctly (Mario Limonciello, Rafael
    Wysocki)
 
  - Fix GFP mask handling after aborted hibernation in platform mode and
    combine exit paths in power_down() to avoid code duplication (Rafael
    Wysocki)
 
  - Use vmalloc_array() and vcalloc() in the hibernation core to avoid
    open-coded size computations (Qianfeng Rong)
 
  - Fix typo in hibernation core code comment (Li Jun)
 
  - Call pm_wakeup_clear() in the same place where other functions that do
    bookkeeping prior to suspend_prepare() are called (Samuel Wu)
 
  - Fix and clean up the x86_energy_perf_policy utility and update its
    documentation (Len Brown, Kaushlendra Kumar)
 
  - Fix incorrect sorting of PMT telemetry in turbostat (Kaushlendra
    Kumar)
 
  - Fix incorrect size in cpuidle_state_disable() and the error return
    value of cpupower_write_sysfs() in cpupower (Kaushlendra Kumar)
 -----BEGIN PGP SIGNATURE-----
 
 iQFGBAABCAAwFiEEcM8Aw/RY0dgsiRUR7l+9nS/U47UFAmjafbMSHHJqd0Byand5
 c29ja2kubmV0AAoJEO5fvZ0v1OO174EH/jAm4GBn1sbjMt0CybSHTTP9iryTiN6m
 cXML5OpMoLbTnngfXjbEe9t52Fc0YV4awCG/S8Ufbut6ubWOEaVzInlw3zQAeE7c
 V91ioxKrodrykpBxn5UFyCxpT2NZWteWl5rOEPeN7j+hqS4I4GTO0HsSo+E+1Y9F
 DKELrbkLsn7rHy+ZvrOhcvq1IZE8gvINuji0QEf1cZz1VrgrLbQHUFqySpCUJw3F
 /MfnA3l0kA2TXQ+UpDWJw8l1weZpXpOPJiyhYQKKeYHVA4osBwiFA/9pq+8Xb/AJ
 GORHIl9y3x+JDXkEYyqmLn0k4FbHCX+4p5tpV9YDHebtP8dTLJBDF5Q=
 =d4BP
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "The majority of these are cpufreq changes, which has been a recurring
  pattern for a few recent cycles.

  Those changes include new hardware support (AN7583 SoC support in the
  airoha cpufreq driver, ipq5424 support in the qcom-nvmem cpufreq
  driver, MT8196 support in the mediatek cpufreq driver, AM62D2 support
  in the ti cpufreq driver), DT bindings and Rust code updates, cleanups
  of the core and governors, and multiple driver fixes and cleanups.

  Beyond that, there are hibernation fixes (some remaining 6.16 cycle
  fallout and an issue related to hybrid suspend in the amdgpu driver),
  cleanups of the PM core code, runtime PM documentation update, cpuidle
  and power capping cleanups, and tooling updates.

  Specifics:

   - Rearrange variable declarations involving __free() in the cpufreq
     core and intel_pstate driver to follow common coding style (Rafael
     Wysocki)

   - Fix object lifecycle issue in update_qos_request(), rearrange freq
     QoS updates using __free(), and adjust frequency percentage
     computations in the intel_pstate driver (Rafael Wysocki)

   - Update intel_pstate to allow it to enable HWP without EPP if the
     new DEC (Dynamic Efficiency Control) HW feature is enabled (Rafael
     Wysocki)

   - Use on_each_cpu_mask() in drv_write() in the ACPI cpufreq driver to
     simplify the code (Rafael Wysocki)

   - Use likely() optimization in intel_pstate_sample() (Yaxiong Tian)

   - Remove dead EPB-related code from intel_pstate (Srinivas
     Pandruvada)

   - Use scope-based cleanup for cpufreq policy references in multiple
     cpufreq drivers (Zihuan Zhang)

   - Avoid calling get_governor() for the first policy in the cpufreq
     core to simplify the initial policy path (Zihuan Zhang)

   - Clean up the cpufreq core in multiple places (Zihuan Zhang)

   - Use int type to store negative error codes in the cpufreq core and
     update the speedstep-lib to use int for error codes (Qianfeng Rong)

   - Update the efficient idle check for Intel extended Families in the
     ondemand cpufreq governor (Sohil Mehta)

   - Replace sscanf() with kstrtouint() in the conservative cpufreq
     governor (Kaushlendra Kumar)

   - Rename CpumaskVar::as[_mut]_ref to from_raw[_mut] in the cpumask
     Rust code and mark CpumaskVar as transparent (Alice Ryhl, Baptiste
     Lepers)

   - Update ARef and AlwaysRefCounted imports from sync::aref in the OPP
     Rust code (Shankari Anand)

   - Add support for AN7583 SoC to the airoha cpufreq driver (Christian
     Marangi)

   - Enable cpufreq for ipq5424 in the qcom-nvmem cpufreq driver (Md
     Sadre Alam)

   - Add support for MT8196 to the mediatek-hw cpufreq driver, refactor
     that driver and add mediatek,mt8196-cpufreq-hw DT binding (Nicolas
     Frattaroli)

   - Avoid redundant conditions in the mediatek cpufreq driver (Liao
     Yuanhong)

   - Add support for AM62D2 to the ti cpufreq driver and blocklist
     ti,am62d2 SoC in dt-platdev (Paresh Bhagat)

   - Support more speed grades on AM62Px SoC in the ti cpufreq driver,
     allow all silicon revisions to support OPPs in it, and fix
     supported hardware for 1GHz OPP (Judith Mendez)

   - Add QCS615 compatible to DT bindings for cpufreq-qcom-hw (Taniya
     Das)

   - Minor assorted updates of the scmi, longhaul, CPPC, and armada-37xx
     cpufreq drivers (Akhilesh Patil, BowenYu, Dennis Beier, and Florian
     Fainelli)

   - Remove outdated cpufreq-dt.txt (Frank Li)

   - Fix python gnuplot package names in the amd_pstate_tracer utility
     (Kuan-Wei Chiu)

   - Saravana Kannan will maintain the virtual-cpufreq driver (Saravana
     Kannan)

   - Prevent CPU capacity updates after registering a perf domain from
     failing on a first CPU that is not present (Christian Loehle)

   - Add support for the cases in which frequency alone is not
     sufficient to uniquely identify an OPP (Krishna Chaitanya Chundru)

   - Use to_result() for OPP error handling in Rust (Onur Özkan)

   - Add support for LPDDR5 on Rockhip RK3588 SoC to rockchip-dfi
     devfreq driver (Nicolas Frattaroli)

   - Fix an issue where DDR cycle counts on RK3588/RK3528 with LPDDR4(X)
     are reported as half by adding a cycle multiplier to the DFI driver
     in rockchip-dfi devfreq-event driver (Nicolas Frattaroli)

   - Fix missing error pointer dereference check of regulator instance
     in the mtk-cci devfreq driver probe and remove a redundant
     condition from an if () statement in that driver (Dan Carpenter,
     Liao Yuanhong)

   - Fail cpuidle device registration if there is one already to avoid
     sysfs-related issues (Rafael Wysocki)

   - Use sysfs_emit()/sysfs_emit_at() instead of sprintf()/scnprintf()
     in cpuidle (Vivek Yadav)

   - Fix device and OF node leaks at probe in the qcom-spm cpuidle
     driver and drop unnecessary initialisations from it (Johan Hovold)

   - Remove unnecessary address-of operators from the intel_idle cpuidle
     driver (Kaushlendra Kumar)

   - Rearrange main loop in menu_select() to make the code in that
     funtion easier to follow (Rafael Wysocki)

   - Convert values in microseconds to ktime using us_to_ktime() where
     applicable in the intel_idle power capping driver (Xichao Zhao)

   - Annotate loops walking device links in the power management core
     code as _srcu and add macros for walking device links to reduce the
     likelihood of coding mistakes related to them (Rafael Wysocki)

   - Document time units for *_time functions in the runtime PM API
     (Brian Norris)

   - Clear power.must_resume in noirq suspend error path to avoid
     resuming a dependant device under a suspended parent or supplier
     (Rafael Wysocki)

   - Fix GFP mask handling during hybrid suspend and make the amdgpu
     driver handle hybrid suspend correctly (Mario Limonciello, Rafael
     Wysocki)

   - Fix GFP mask handling after aborted hibernation in platform mode
     and combine exit paths in power_down() to avoid code duplication
     (Rafael Wysocki)

   - Use vmalloc_array() and vcalloc() in the hibernation core to avoid
     open-coded size computations (Qianfeng Rong)

   - Fix typo in hibernation core code comment (Li Jun)

   - Call pm_wakeup_clear() in the same place where other functions that
     do bookkeeping prior to suspend_prepare() are called (Samuel Wu)

   - Fix and clean up the x86_energy_perf_policy utility and update its
     documentation (Len Brown, Kaushlendra Kumar)

   - Fix incorrect sorting of PMT telemetry in turbostat (Kaushlendra
     Kumar)

   - Fix incorrect size in cpuidle_state_disable() and the error return
     value of cpupower_write_sysfs() in cpupower (Kaushlendra Kumar)"

* tag 'pm-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (86 commits)
  PM: hibernate: Combine return paths in power_down()
  PM: hibernate: Restrict GFP mask in power_down()
  PM: hibernate: Fix pm_hibernation_mode_is_suspend() build breakage
  PM: runtime: Documentation: ABI: Document time units for *_time
  tools/power x86_energy_perf_policy.8: Emphasize preference for SW interfaces
  tools/power x86_energy_perf_policy: Add make snapshot target
  tools/power x86_energy_perf_policy: Prefer driver HWP limits
  tools/power x86_energy_perf_policy: EPB access is only via sysfs
  tools/power x86_energy_perf_policy: Prepare for MSR/sysfs refactoring
  tools/power x86_energy_perf_policy: Enhance HWP enable
  tools/power x86_energy_perf_policy: Enhance HWP enabled check
  tools/power x86_energy_perf_policy: Fix incorrect fopen mode usage
  tools/power turbostat: Fix incorrect sorting of PMT telemetry
  drm/amd: Fix hybrid sleep
  PM: hibernate: Add pm_hibernation_mode_is_suspend()
  PM: hibernate: Fix hybrid-sleep
  tools/cpupower: Fix incorrect size in cpuidle_state_disable()
  tools/power/x86/amd_pstate_tracer: Fix python gnuplot package names
  cpufreq: Replace pointer subtraction with iteration macro
  cpuidle: Fail cpuidle device registration if there is one already
  ...
2025-10-01 16:08:37 -07:00
Yazhou Tang
55c0ced59f bpf: Reject negative offsets for ALU ops
When verifying BPF programs, the check_alu_op() function validates
instructions with ALU operations. The 'offset' field in these
instructions is a signed 16-bit integer.

The existing check 'insn->off > 1' was intended to ensure the offset is
either 0, or 1 for BPF_MOD/BPF_DIV. However, because 'insn->off' is
signed, this check incorrectly accepts all negative values (e.g., -1).

This commit tightens the validation by changing the condition to
'(insn->off != 0 && insn->off != 1)'. This ensures that any value
other than the explicitly permitted 0 and 1 is rejected, hardening the
verifier against malformed BPF programs.

Co-developed-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Signed-off-by: Shenghao Yuan <shenghaoyuan0928@163.com>
Co-developed-by: Tianci Cao <ziye@zju.edu.cn>
Signed-off-by: Tianci Cao <ziye@zju.edu.cn>
Signed-off-by: Yazhou Tang <tangyazhou518@outlook.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Fixes: ec0e2da95f ("bpf: Support new signed div/mod instructions.")
Link: https://lore.kernel.org/r/tencent_70D024BAE70A0A309A4781694C7B764B0608@qq.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-10-01 15:43:13 -07:00
Brahmajit Das
34904582b5 bpf: Skip scalar adjustment for BPF_NEG if dst is a pointer
In check_alu_op(), the verifier currently calls check_reg_arg() and
adjust_scalar_min_max_vals() unconditionally for BPF_NEG operations.
However, if the destination register holds a pointer, these scalar
adjustments are unnecessary and potentially incorrect.

This patch adds a check to skip the adjustment logic when the destination
register contains a pointer.

Reported-by: syzbot+d36d5ae81e1b0a53ef58@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d36d5ae81e1b0a53ef58
Fixes: aced132599 ("bpf: Add range tracking for BPF_NEG")
Suggested-by: KaFai Wan <kafai.wan@linux.dev>
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Brahmajit Das <listout@listout.xyz>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20251001191739.2323644-2-listout@listout.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-10-01 13:53:20 -07:00
Linus Torvalds
eb3289fc47 Driver core changes for 6.18-rc1
- Auxiliary:
    - Drop call to dev_pm_domain_detach() in auxiliary_bus_probe()
    - Optimize logic of auxiliary_match_id()
 
 - Rust:
   - Auxiliary:
     - Use primitive C types from prelude
 
   - DebugFs:
     - Add debugfs support for simple read/write files and custom callbacks
       through a File-type-based and directory-scope-based API
     - Sample driver code for the File-type-based API
     - Sample module code for the directory-scope-based API
 
   - I/O:
     - Add io::poll module and implement Rust specific read_poll_timeout()
       helper
 
   - IRQ:
     - Implement support for threaded and non-threaded device IRQs based on
       (&Device<Bound>, IRQ number) tuples (IrqRequest)
     - Provide &Device<Bound> cookie in IRQ handlers
 
   - PCI:
     - Support IRQ requests from IRQ vectors for a specific pci::Device<Bound>
     - Implement accessors for subsystem IDs, revision, devid and resource start
     - Provide dedicated pci::Vendor and pci::Class types for vendor and class
       ID numbers
     - Implement Display to print actual vendor and class names; Debug to print
       the raw ID numbers
     - Add pci::DeviceId::from_class_and_vendor() helper
     - Use primitive C types from prelude
     - Various minor inline and (safety) comment improvements
 
   - Platform:
     - Support IRQ requests from IRQ vectors for a specific
       platform::Device<Bound>
 
   - Nova:
     - Use pci::DeviceId::from_class_and_vendor() to avoid probing
       non-display/compute PCI functions
 
   - Misc:
     - Add helper for cpu_relax()
     - Update ARef import from sync::aref
 
 - sysfs:
   - Remove bin_attrs_new field from struct attribute_group
   - Remove read_new() and write_new() from struct bin_attribute
 
 - Misc:
   - Document potential race condition in get_dev_from_fwnode()
   - Constify node_group argument in software node registration functions
   - Fix order of kernel-doc parameters in various functions
   - Set power.no_pm flag for faux devices
   - Set power.no_callbacks flag along with the power.no_pm flag
   - Constify the pmu_bus bus type
   - Minor spelling fixes
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQS2q/xV6QjXAdC7k+1FlHeO1qrKLgUCaNmQGwAKCRBFlHeO1qrK
 LmPzAP9msIvK8eFT4CEDK4buX1gd+VBOdy8mAjAeJ2F80FIo8wEAtOdddNaaqWVF
 m4ac2/a2bSRKMGPX+wIM7d2HGyC7sgY=
 =XbU+
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core

Pull driver core updates from Danilo Krummrich:
 "Auxiliary:
   - Drop call to dev_pm_domain_detach() in auxiliary_bus_probe()
   - Optimize logic of auxiliary_match_id()

  Rust:
   - Auxiliary:
      - Use primitive C types from prelude

   - DebugFs:
      - Add debugfs support for simple read/write files and custom
        callbacks through a File-type-based and directory-scope-based
        API
      - Sample driver code for the File-type-based API
      - Sample module code for the directory-scope-based API

   - I/O:
      - Add io::poll module and implement Rust specific
        read_poll_timeout() helper

   - IRQ:
      - Implement support for threaded and non-threaded device IRQs
        based on (&Device<Bound>, IRQ number) tuples (IrqRequest)
      - Provide &Device<Bound> cookie in IRQ handlers

   - PCI:
      - Support IRQ requests from IRQ vectors for a specific
        pci::Device<Bound>
      - Implement accessors for subsystem IDs, revision, devid and
        resource start
      - Provide dedicated pci::Vendor and pci::Class types for vendor
        and class ID numbers
      - Implement Display to print actual vendor and class names; Debug
        to print the raw ID numbers
      - Add pci::DeviceId::from_class_and_vendor() helper
      - Use primitive C types from prelude
      - Various minor inline and (safety) comment improvements

   - Platform:
      - Support IRQ requests from IRQ vectors for a specific
        platform::Device<Bound>

   - Nova:
      - Use pci::DeviceId::from_class_and_vendor() to avoid probing
        non-display/compute PCI functions

   - Misc:
      - Add helper for cpu_relax()
      - Update ARef import from sync::aref

  sysfs:
   - Remove bin_attrs_new field from struct attribute_group
   - Remove read_new() and write_new() from struct bin_attribute

  Misc:
   - Document potential race condition in get_dev_from_fwnode()
   - Constify node_group argument in software node registration
     functions
   - Fix order of kernel-doc parameters in various functions
   - Set power.no_pm flag for faux devices
   - Set power.no_callbacks flag along with the power.no_pm flag
   - Constify the pmu_bus bus type
   - Minor spelling fixes"

* tag 'driver-core-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: (43 commits)
  rust: pci: display symbolic PCI vendor names
  rust: pci: display symbolic PCI class names
  rust: pci: fix incorrect platform reference in PCI driver probe doc comment
  rust: pci: fix incorrect platform reference in PCI driver unbind doc comment
  perf: make pmu_bus const
  samples: rust: Add scoped debugfs sample driver
  rust: debugfs: Add support for scoped directories
  samples: rust: Add debugfs sample driver
  rust: debugfs: Add support for callback-based files
  rust: debugfs: Add support for writable files
  rust: debugfs: Add support for read-only files
  rust: debugfs: Add initial support for directories
  driver core: auxiliary bus: Optimize logic of auxiliary_match_id()
  driver core: auxiliary bus: Drop dev_pm_domain_detach() call
  driver core: Fix order of the kernel-doc parameters
  driver core: get_dev_from_fwnode(): document potential race
  drivers: base: fix "publically"->"publicly"
  driver core/PM: Set power.no_callbacks along with power.no_pm
  driver core: faux: Set power.no_pm for faux devices
  rust: pci: inline several tiny functions
  ...
2025-10-01 08:39:23 -07:00
Linus Torvalds
ae28ed4578 bpf-next-6.18
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmjZH40ACgkQ6rmadz2v
 bTrG7w//X/5CyDoKIYJCqynYRdMtfqYuCe8Jhud4p5++iBVqkDyS6Y8EFLqZVyg/
 UHTqaSE4Nz8/pma0WSjhUYn6Chs1AeH+Rw/g109SovE/YGkek2KNwY3o2hDrtPMX
 +oD0my8qF2HLKgEyteXXyZ5Ju+AaF92JFiGko4/wNTX8O99F9nyz2pTkrctS9Vl9
 VwuTxrEXpmhqrhP3WCxkfNfcbs9HP+AALpgOXZKdMI6T4KI0N1gnJ0ZWJbiXZ8oT
 tug0MTPkNRidYMl0wHY2LZ6ZG8Q3a7Sgc+M0xFzaHGvGlJbBg1HjsDMtT6j34CrG
 TIVJ/O8F6EJzAnQ5Hio0FJk8IIgMRgvng5Kd5GXidU+mE6zokTyHIHOXitYkBQNH
 Hk+lGA7+E2cYqUqKvB5PFoyo+jlucuIH7YwrQlyGfqz+98n65xCgZKcmdVXr0hdB
 9v3WmwJFtVIoPErUvBC3KRANQYhFk4eVk1eiGV/20+eIVyUuNbX6wqSWSA9uEXLy
 n5fm/vlk4RjZmrPZHxcJ0dsl9LTF1VvQQHkgoC1Sz/Cc+jA6k4I+ECVHAqEbk36p
 1TUF52yPOD2ViaJKkj+962JaaaXlUn6+Dq7f1GMP6VuyHjz4gsI3mOo4XarqNdWd
 c7TnYmlGO/cGwqd4DdbmWiF1DDsrBcBzdbC8+FgffxQHLPXGzUg=
 =LeQi
 -----END PGP SIGNATURE-----

Merge tag 'bpf-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Pull bpf updates from Alexei Starovoitov:

 - Support pulling non-linear xdp data with bpf_xdp_pull_data() kfunc
   (Amery Hung)

   Applied as a stable branch in bpf-next and net-next trees.

 - Support reading skb metadata via bpf_dynptr (Jakub Sitnicki)

   Also a stable branch in bpf-next and net-next trees.

 - Enforce expected_attach_type for tailcall compatibility (Daniel
   Borkmann)

 - Replace path-sensitive with path-insensitive live stack analysis in
   the verifier (Eduard Zingerman)

   This is a significant change in the verification logic. More details,
   motivation, long term plans are in the cover letter/merge commit.

 - Support signed BPF programs (KP Singh)

   This is another major feature that took years to materialize.

   Algorithm details are in the cover letter/marge commit

 - Add support for may_goto instruction to s390 JIT (Ilya Leoshkevich)

 - Add support for may_goto instruction to arm64 JIT (Puranjay Mohan)

 - Fix USDT SIB argument handling in libbpf (Jiawei Zhao)

 - Allow uprobe-bpf program to change context registers (Jiri Olsa)

 - Support signed loads from BPF arena (Kumar Kartikeya Dwivedi and
   Puranjay Mohan)

 - Allow access to union arguments in tracing programs (Leon Hwang)

 - Optimize rcu_read_lock() + migrate_disable() combination where it's
   used in BPF subsystem (Menglong Dong)

 - Introduce bpf_task_work_schedule*() kfuncs to schedule deferred
   execution of BPF callback in the context of a specific task using the
   kernel’s task_work infrastructure (Mykyta Yatsenko)

 - Enforce RCU protection for KF_RCU_PROTECTED kfuncs (Kumar Kartikeya
   Dwivedi)

 - Add stress test for rqspinlock in NMI (Kumar Kartikeya Dwivedi)

 - Improve the precision of tnum multiplier verifier operation
   (Nandakumar Edamana)

 - Use tnums to improve is_branch_taken() logic (Paul Chaignon)

 - Add support for atomic operations in arena in riscv JIT (Pu Lehui)

 - Report arena faults to BPF error stream (Puranjay Mohan)

 - Search for tracefs at /sys/kernel/tracing first in bpftool (Quentin
   Monnet)

 - Add bpf_strcasecmp() kfunc (Rong Tao)

 - Support lookup_and_delete_elem command in BPF_MAP_STACK_TRACE (Tao
   Chen)

* tag 'bpf-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (197 commits)
  libbpf: Replace AF_ALG with open coded SHA-256
  selftests/bpf: Add stress test for rqspinlock in NMI
  selftests/bpf: Add test case for different expected_attach_type
  bpf: Enforce expected_attach_type for tailcall compatibility
  bpftool: Remove duplicate string.h header
  bpf: Remove duplicate crypto/sha2.h header
  libbpf: Fix error when st-prefix_ops and ops from differ btf
  selftests/bpf: Test changing packet data from kfunc
  selftests/bpf: Add stacktrace map lookup_and_delete_elem test case
  selftests/bpf: Refactor stacktrace_map case with skeleton
  bpf: Add lookup_and_delete_elem for BPF_MAP_STACK_TRACE
  selftests/bpf: Fix flaky bpf_cookie selftest
  selftests/bpf: Test changing packet data from global functions with a kfunc
  bpf: Emit struct bpf_xdp_sock type in vmlinux BTF
  selftests/bpf: Task_work selftest cleanup fixes
  MAINTAINERS: Delete inactive maintainers from AF_XDP
  bpf: Mark kfuncs as __noclone
  selftests/bpf: Add kprobe multi write ctx attach test
  selftests/bpf: Add kprobe write ctx attach test
  selftests/bpf: Add uprobe context ip register change test
  ...
2025-09-30 17:58:11 -07:00
Linus Torvalds
4b81e2eb9e Updates for the VDSO subsystem:
- Further consolidation of the VDSO infrastructure and the common data
     store.
 
   - Simplification of the related Kconfig logic
 
   - Improve the VDSO selftest suite
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmjaUJgTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoT5VD/9l+TP42vJzwygPArefVM9QijAc4k+g
 iptaz5M+Lhk0BlwMUu502Tp1zcwGq5529LlE2P72DuYoZRtZKSFCGES2mye0JiAs
 A5Vk8d1zicg3IIGjYgw39shA9ghzuzV3gZdgHVqQ6024uSeKxUY3SqF156GGfPFG
 azXWDV7i4RGQvKLm2ye8m7JNYCE3P9f8Wh+GvPaq7qBlW1MsMCWC4FP5qu4iStf3
 mol9uA/EM2zSDwkpIpq6P5L2TpIH1dGDNGxVf/HG3J16UhNXzzXv/rxuKHKPNGWm
 MQZ8WdPjF+uwdUsNXaImC6aCE0hoQ1Ga7GE2YA3pMQYfySeiJ/fA5I1KfhlUCiHD
 oUgnb6PQfZjaBDrIODKcrFHAYc5pTOyJXnJ6q1Lc1yrOfoVcIpMaRXHO/phCeuwI
 61gfb9ggbFEH/7cL26Z9XGSqh4BDCKVP5Z+y98OnNEzV74ScpX13/szB3VgdN6LW
 tmTOeMPYCo4FQoPw8OV1xkeYOKD2JyqH+garnxDZNFjaVaeUkzlAMKz+1JUjxRZj
 UpUhZ8y5KFVjgKkqZHxhucr9cnwV/xd4n+IzZeG4l6XZp4VyirG+5QpAikGYe+Jq
 WFU4Ao1xvevyizrJEhoGJKznwtrNrsy6AQ+wLQxYiRR4aTnng8n9T/ry2DVvhJf8
 diYWxuRTR7KMxg==
 =i3UB
 -----END PGP SIGNATURE-----

Merge tag 'timers-vdso-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull VDSO updates from Thomas Gleixner:

 - Further consolidation of the VDSO infrastructure and the common data
   store

 - Simplification of the related Kconfig logic

 - Improve the VDSO selftest suite

* tag 'timers-vdso-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftests: vDSO: Drop vdso_test_clock_getres
  selftests: vDSO: vdso_test_abi: Add tests for clock_gettime64()
  selftests: vDSO: vdso_test_abi: Test CPUTIME clocks
  selftests: vDSO: vdso_test_abi: Use explicit indices for name array
  selftests: vDSO: vdso_test_abi: Drop clock availability tests
  selftests: vDSO: vdso_test_abi: Use ksft_finished()
  selftests: vDSO: vdso_test_abi: Correctly skip whole test with missing vDSO
  selftests: vDSO: Fix -Wunitialized in powerpc VDSO_CALL() wrapper
  vdso: Add struct __kernel_old_timeval forward declaration to gettime.h
  vdso: Gate VDSO_GETRANDOM behind HAVE_GENERIC_VDSO
  vdso: Drop Kconfig GENERIC_VDSO_TIME_NS
  vdso: Drop Kconfig GENERIC_VDSO_DATA_STORE
  vdso: Drop kconfig GENERIC_COMPAT_VDSO
  vdso: Drop kconfig GENERIC_VDSO_32
  riscv: vdso: Untangle Kconfig logic
  time: Build generic update_vsyscall() only with generic time vDSO
  vdso/gettimeofday: Remove !CONFIG_TIME_NS stubs
  vdso: Move ENABLE_COMPAT_VDSO from core to arm64
  ARM: VDSO: Remove cntvct_ok global variable
  vdso/datastore: Gate time data behind CONFIG_GENERIC_GETTIMEOFDAY
2025-09-30 16:58:21 -07:00
Linus Torvalds
70de5572a8 Updates for the clocksource/clockevents drivers subsystem:
- Further preparations for modular clocksource/event drivers
 
   - The usual device tree updates to support new chip variants and the
     related changes to thise drivers
 
   - Avoid a 64-bit division in the TEGRA186 driver, which caused a build
     fail on 32-bit machines.
 
   - Small fixes, improvements and cleanups all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmjaTgwTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoRwiD/9MdweTwv+Ami1i/4DAQzEcIuXqOnmO
 6yDnRXuqHguiuAkmCCwaW0IoqWZbIDtVwkMHv8CxDQoHm03wFg9QbMLN3YkR3D7P
 GWyWGKDMT5/ATQxvx/nZhFjpxnjRI7nFyQuooLIFgh8d9GvxBGFZOrdndudTW+dR
 FdLiv0+kyG9XDRijepUBXV9TSZNxZoF6IrlEbfSE0DFdmVAQhagN47QVhRk2cNux
 LhAp5wFJ/hzBRUIoQyPDAJFPpGrVC/Goz+ulgxmy4NRZ/TVoiAcdQrDTp1o8AOiV
 VxpNQ/pbPVSeij6zhZF1loHxDPDJGUlg9enx6NJDRk3CBNeu2Z617cBfuoCsEb45
 EmvyXLESQ8JzjIpGNzAgAuPa92OI0AJG7K4wg4mWzawUAD+Pvm3Xs+9rRareJ5t+
 OngGPHNhxMyRTO0z6BbFDZub3jv5PWEs/MB1a7O6XNVXokf2Wl4WgigWyKbEn+TU
 rjM34OgwQvcFVSE8nnwNEyzedpqJKAbvRG3WbHkucgkXEDtaLHZ9EWYIekcBYH39
 W4nXFWfQb8E29YmZifaq2KeaQI+Nb2utEyTMTmbZchVlZW7LBRxsfrhoyeNwYqc5
 byGvI4ZbvdNzDKq9JMD2H8P4//EneMrF/r13SoJTz9lGddjoobseKfTJArDrScXR
 fFCOVxwsJKRGNQ==
 =LZUc
 -----END PGP SIGNATURE-----

Merge tag 'timers-clocksource-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull clocksource updates from Thomas Gleixner:

 - Further preparations for modular clocksource/event drivers

 - The usual device tree updates to support new chip variants and the
   related changes to thise drivers

 - Avoid a 64-bit division in the TEGRA186 driver, which caused a build
   fail on 32-bit machines.

 - Small fixes, improvements and cleanups all over the place

* tag 'timers-clocksource-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
  dt-bindings: timer: exynos4210-mct: Add compatible for ARTPEC-9 SoC
  clocksource/drivers/sh_cmt: Split start/stop of clock source and events
  clocksource/drivers/clps711x: Fix resource leaks in error paths
  clocksource/drivers/arm_global_timer: Add auto-detection for initial prescaler values
  clocksource/drivers/ingenic-sysost: Convert from round_rate() to determine_rate()
  clocksource/drivers/timer-tegra186: Don't print superfluous errors
  clocksource/drivers/timer-rtl-otto: Simplify documentation
  clocksource/drivers/timer-rtl-otto: Do not interfere with interrupts
  clocksource/drivers/timer-rtl-otto: Drop set_counter function
  clocksource/drivers/timer-rtl-otto: Work around dying timers
  clocksource/drivers/timer-ti-dm : Capture functionality for OMAP DM timer
  clocksource/drivers/arm_arch_timer_mmio: Add MMIO clocksource
  clocksource/drivers/arm_arch_timer_mmio: Switch over to standalone driver
  clocksource/drivers/arm_arch_timer: Add standalone MMIO driver
  ACPI: GTDT: Generate platform devices for MMIO timers
  clocksource/drivers/nxp-pit: Add NXP Automotive s32g2 / s32g3 support
  dt: bindings: fsl,vf610-pit: Add compatible for s32g2 and s32g3
  clocksource/drivers/vf-pit: Rename the VF PIT to NXP PIT
  clocksource/drivers/vf-pit: Unify the function name for irq ack
  clocksource/drivers/vf-pit: Consolidate calls to pit_*_disable/enable
  ...
2025-09-30 16:53:59 -07:00
Linus Torvalds
c5448d46b3 Updates for the time(rs) core subsystem:
- Address the inconsistent shutdown sequence of per CPU clockevents on
     CPU hotplug, which onoly removed it from the core but failed to invoke
     the actual device driver shutdown callback. This keeps the timer
     active, which prevents power savings and causes pointless noise in
     virtualization.
 
   - Encapsulate the open coded access to the hrtimer clock base, which is a
     private implementation detail, so that the implementation can be
     changed without breaking a lot of usage sites.
 
   - Enhance the debug output of the clocksource watchdog to provide better
     information for analysis.
 
   - The usual set of cleanups and enhancements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmjaT+ETHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoT8HD/47VRqHbFKazGcwZ4+OLKfFTq0PKzIU
 kIhimNMSKVqiPlmWhud/4idGX5JFqMP3dS9ZaDlp7xCtu1ScCqzH72kbCrab0l4g
 wjRgneGWsHhv06PY50Ty9FusFSetkjA8XSYavFV9HZNgCRIvho958akdckpazMcF
 i4V5WTIVkwhGynhcGwqBXcmHASUa7x6gEjZYBbX6wspPl2Wk5z+vn/zAVIHAozwS
 9GPw8HlUeKqM72U12mWGkt4KLjy2gzoupvx9vD4giXRvqkt09rfHtRqvEdkew+/G
 ZANhmTJqfA1AiihYP30fHgY5lQbNQG+9O10UKhlUyrlBKPKHKu6dIuJQRSBy0j59
 Bqef4UubPBlMP4xRgTbt11SFrjuqDo68bTIDGmQNQvb1BXSXhGA08PDPbIKkFdJh
 8cXwUHzjLkDk0tL6kVXRWlsZcCmjz87TVS7PufGE3EIZkh0J2Ob+g7nFS3PsORKW
 65ROSRoYmh4CRh/l9CTNbPWFvz3jahIEnxlrSc70o0A4DUeeWTwT/M+MoNOtSJuV
 D4qSD35JsFaEbKFfSJwF/0CybG3Nx3UPI8nvEEYQtS4D+BCJHy7INg1P/NtWw+vb
 W7puAFd+WbjSN14uQfYkYWC53vQ22isW0nU8W6G5NbMRD5K4WIld4UhDksJIGCLe
 2LZ0VHxv4gaXkg==
 =A9Tu
 -----END PGP SIGNATURE-----

Merge tag 'timers-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer core updates from Thomas Gleixner:

 - Address the inconsistent shutdown sequence of per CPU clockevents on
   CPU hotplug, which only removed it from the core but failed to invoke
   the actual device driver shutdown callback. This kept the timer
   active, which prevented power savings and caused pointless noise in
   virtualization.

 - Encapsulate the open coded access to the hrtimer clock base, which is
   a private implementation detail, so that the implementation can be
   changed without breaking a lot of usage sites.

 - Enhance the debug output of the clocksource watchdog to provide
   better information for analysis.

 - The usual set of cleanups and enhancements all over the place

* tag 'timers-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time: Fix spelling mistakes in comments
  clocksource: Print durations for sync check unconditionally
  LoongArch: Remove clockevents shutdown call on offlining
  tick: Do not set device to detached state in tick_shutdown()
  hrtimer: Reorder branches in hrtimer_clockid_to_base()
  hrtimer: Remove hrtimer_clock_base:: Get_time
  hrtimer: Use hrtimer_cb_get_time() helper
  media: pwm-ir-tx: Avoid direct access to hrtimer clockbase
  ALSA: hrtimer: Avoid direct access to hrtimer clockbase
  lib: test_objpool: Avoid direct access to hrtimer clockbase
  sched/core: Avoid direct access to hrtimer clockbase
  timers/itimer: Avoid direct access to hrtimer clockbase
  posix-timers: Avoid direct access to hrtimer clockbase
  jiffies: Remove obsolete SHIFTED_HZ comment
2025-09-30 16:09:27 -07:00
Linus Torvalds
c574fb2ed7 A set of updates for futexes and related selftests:
- Plug the ptrace_may_access() race against a concurrent exec() which
     allows to pass the check before the target's process transition in
     exec() by taking a read lock on signal->ext_update_lock.
 
   - A large set of cleanups and enhancement to the futex selftests. The
     bulk of changes is the conversion to the kselftest harness.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmjaS8QTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYof44EAC7amBzdJlaluVjLTpCV07G8K4bDkzz
 iN5vOm+vrxJlg7kq9L4CffuGKS8yFj7sgNkSexOv/+JUFp/E9WpV3CT+kqJNm71x
 bDPDAFR3reZQ7u27dzpciOTgtCEnq+hqeAovx2c1v2GuESvnTfqWTFSQBLfzqGW7
 leCiaR6VJ87+RB10ZrlxJDdrFnVbKBa48k1dZjy97ivkw+SUS5xSEadSBG2vw7Rd
 C+SQDa8Hhh7xAMJvySWkES6rfzZvC/ubkpq5DmjRZvHu/wS4wDNaKj06wEtbLxEA
 5eaemJVBeEbjC6GNBgba6cUmtrT45PQG7wNFoPmgJsO0nlnfwfmm713CAa9vK+dT
 MTUnGWKSpf0Vj1yHkzLeHer226jqfOe1lA4anRzD6FoZy2XaJiNCymQysfIrIWTG
 BCcA8za3W4PTiUHxDe5JGsbA4f57xDV+bL0SmjaXv5wSo/FX21iJBisBI2FFl531
 9XlgU3ssvUUfQfH6bPIdgcZrWD2Uc9W5aj2dZv+zN6LT8EhvLaQwcM2bxWbuJfWu
 wmU07LMZS9zVrC8B6T3KvrYwnIuETWRZiPhenDy+Shtn8UM1N/MmlNNmAEvdOYue
 V3o1BTm9wDWfSMJAq7q7erjl15QOgUyOpte1gTuzDbqtcWj132q31HDegtxlGIyu
 NLOx8otYdLEOqQ==
 =vUpN
 -----END PGP SIGNATURE-----

Merge tag 'locking-futex-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull futex updates from Thomas Gleixner:
 "A set of updates for futexes and related selftests:

   - Plug the ptrace_may_access() race against a concurrent exec() which
     allows to pass the check before the target's process transition in
     exec() by taking a read lock on signal->ext_update_lock.

   - A large set of cleanups and enhancement to the futex selftests. The
     bulk of changes is the conversion to the kselftest harness"

* tag 'locking-futex-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  selftest/futex: Fix spelling mistake "boundarie" -> "boundary"
  selftests/futex: Remove logging.h file
  selftests/futex: Drop logging.h include from futex_numa
  selftests/futex: Refactor futex_numa_mpol with kselftest_harness.h
  selftests/futex: Refactor futex_priv_hash with kselftest_harness.h
  selftests/futex: Refactor futex_waitv with kselftest_harness.h
  selftests/futex: Refactor futex_requeue with kselftest_harness.h
  selftests/futex: Refactor futex_wait with kselftest_harness.h
  selftests/futex: Refactor futex_wait_private_mapped_file with kselftest_harness.h
  selftests/futex: Refactor futex_wait_unitialized_heap with kselftest_harness.h
  selftests/futex: Refactor futex_wait_wouldblock with kselftest_harness.h
  selftests/futex: Refactor futex_wait_timeout with kselftest_harness.h
  selftests/futex: Refactor futex_requeue_pi_signal_restart with kselftest_harness.h
  selftests/futex: Refactor futex_requeue_pi_mismatched_ops with kselftest_harness.h
  selftests/futex: Refactor futex_requeue_pi with kselftest_harness.h
  selftests: kselftest: Create ksft_print_dbg_msg()
  futex: Don't leak robust_list pointer on exec race
  selftest/futex: Compile also with libnuma < 2.0.16
  selftest/futex: Reintroduce "Memory out of range" numa_mpol's subtest
  selftest/futex: Make the error check more precise for futex_numa_mpol
  ...
2025-09-30 16:07:10 -07:00
Linus Torvalds
d8de3685f1 An update of the stale smp_call_function_many() documentation to bring it
back in sync with the actual implementation.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmjaTGETHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoSdrD/967aAnjFKB1l/HZh1PPvRWuejrOrlX
 ZaOdwXvnHtYpzTGWjS8KxAxoIZhh6YjEFenknBPZnKTYM4g+3Umw5NobKxrfOg7b
 BHpz90M+Rd1IpUXHwnRcf+j8lWCw/ABq5hSOD+HRxHNratcevwuhShwJJpjmfxL4
 xw1wyryfRniKv7r1X8caguhXa6OgLV0zKEKs6uzPyxtQPgg4vNIaYyF2b/vJ5Mcv
 XGL7QotBEHaZagL6Wfrc1Z5H7h2mukPVtXyOFzVOHFMHPsM5WLr9UVn9EHaVE/aI
 ENYRE7kh+3MVJBJyVUOy11KCoNRuPoqVFEWDrQnam5uOzG/PVOMDMshWwvWM6ez+
 g/ea9T1CAzEJK4B/ALJTWmiNDO3i+vM8oBJyE2TNNvw7j3U/M2ThYifiTSsosBcG
 2u07Z6MZJ28fZmDXZTYs8D8xBCPPiIQQJs6KgApcP4JPTTIebWnGFiX5whPHNBv9
 HldWt+p8F86kDgpyneA12eTeK+KLoGdI6IK5BX2TdpdAuSCH6mkQ7vkv15mynrpH
 uI1Oe3T53yXdJwNlGaPuAgar8pGb6YWj7au5+aJCOUCRBuAsyhJ82eXqZI2gFQjF
 M0+WpjbrvG4Gv/5xXlqKzxUV0GblXkvqVooDMMCufDgkOd2xxQ/UodevrDgg8DYV
 YGgBdunjKbz7Ig==
 =8bI8
 -----END PGP SIGNATURE-----

Merge tag 'smp-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull smp doc fixlet from Thomas Gleixner:
 "An update of the stale smp_call_function_many() documentation to bring
  it back in sync with the actual implementation"

* tag 'smp-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  smp: Fix up and expand the smp_call_function_many() kerneldoc
2025-09-30 16:04:52 -07:00
Linus Torvalds
3b2074c77d A set of updates for the interrupt core subsystem:
- Introduce irq_chip_[startup|shutdown]_parent() to prepare for
     addressing a few short comings in the PCI/MSI interrupt subsystem.
 
     It allows to utilize the interrupt chip startup/shutdown callbacks for
     initializing the interrupt chip hierarchy properly on certain RISCV
     implementations and provides a mechanism to reduce the overhead of
     masking and unmasking PCI/MSI interrupts during operation when the
     underlying MSI provider can mask the interrupt.
 
     The actual usage comes with the interrupt driver pull request.
 
   - Add generic error handling for devm_request_*_irq()
 
     This allows to remove the zoo of random error printk's all over the
     usage sites.
 
   - Add a mechanism to warn about long-running interrupt handlers
 
     Long running interrupt handlers can introduce latencies and tracking
     them down is a tedious task. The tracking has to be enabled with a
     threshold on the kernel command line and utilizes a static branch to
     remove the overhead when disabled.
 
   - Update and extend the selftests which validate the CPU hotplug
     interrupt migration logic
 
   - Allow dropping the per CPU softirq lock on PREEMPT_RT kernels, which
     causes contention and latencies all over the place.
 
     The serialization requirements have been pushed down into the actual
     affected usage sites already.
 
   - The usual small cleanups and improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmjaRdUTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoZTHD/0anRJCLFFQ8Xtuuu96MDfcvi8m6uhB
 E8gF6lXrljRgEzt33hnqSB4d41PYaOTje4h/Z0nCV2xva8cg6cErzjKlbRU/SMmj
 yEBV5XMHP8Nxog+A5q5OT3lXQ6pT5ZxDZ7l0wven1xX6LbyF63MQFdzFBiLrXnb1
 mOZM7F89FQKhutEMDqVwVLXLmJasNtZK0kBVrdjA/vDBlAk0bYCaPX4BxW1PQIw4
 56KkG1llK221ruT2YbMSv/oFmYulDYbnnM3Aw6stUsR6SPMkQX8Yke2wLQSq+GVV
 RI0f4GTa6XnlIiUivQ77HnV0m0g0Ybkj1PoKeGJIjWDFJ5hqkA0dRQMBRCna68wK
 c3IEWhn2zPl3YxHYMogLwlE4nJAVFUjeEtGPb/Z/mmHaBdd/Y7iqVjTgHJzDvi5Q
 SwgISZEB87vZINpraTuEFNimeIwjCVdBj5ImBHT4T/Av3DFzuvqvxCSP0AlVRdhs
 OdwF5/6jFdPP787+FtytEl/kFYLzf/7R9zVHLcpiCD3eYCU4pHU/ULjUIGxFkSrS
 ltezvfZkGnbqiKTR33nOxWBnM2Fr63FMv0bKMJ97x49TMN6Tw4L8mra/uxuMQSIf
 yRk3jRgUcx0xuigCXcb62dip+2WnsGjnaDj30yOs/d1Ab2eALrJCo0e1tgdK2JDu
 VgCX3xCbvj/NzA==
 =GZty
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq core updates from Thomas Gleixner:
 "A set of updates for the interrupt core subsystem:

   - Introduce irq_chip_[startup|shutdown]_parent() to prepare for
     addressing a few short comings in the PCI/MSI interrupt subsystem.

     It allows to utilize the interrupt chip startup/shutdown callbacks
     for initializing the interrupt chip hierarchy properly on certain
     RISCV implementations and provides a mechanism to reduce the
     overhead of masking and unmasking PCI/MSI interrupts during
     operation when the underlying MSI provider can mask the interrupt.

     The actual usage comes with the interrupt driver pull request.

   - Add generic error handling for devm_request_*_irq()

     This allows to remove the zoo of random error printk's all over the
     usage sites.

   - Add a mechanism to warn about long-running interrupt handlers

     Long running interrupt handlers can introduce latencies and
     tracking them down is a tedious task. The tracking has to be
     enabled with a threshold on the kernel command line and utilizes a
     static branch to remove the overhead when disabled.

   - Update and extend the selftests which validate the CPU hotplug
     interrupt migration logic

   - Allow dropping the per CPU softirq lock on PREEMPT_RT kernels,
     which causes contention and latencies all over the place.

     The serialization requirements have been pushed down into the
     actual affected usage sites already.

   - The usual small cleanups and improvements"

* tag 'irq-core-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT
  softirq: Provide a handshake for canceling tasklets via polling
  genirq/test: Ensure CPU 1 is online for hotplug test
  genirq/test: Drop CONFIG_GENERIC_IRQ_MIGRATION assumptions
  genirq/test: Depend on SPARSE_IRQ
  genirq/test: Fail early if interrupt request fails
  genirq/test: Factor out fake-virq setup
  genirq/test: Select IRQ_DOMAIN
  genirq/test: Fix depth tests on architectures with NOREQUEST by default.
  genirq: Add support for warning on long-running interrupt handlers
  genirq/devres: Add error handling in devm_request_*_irq()
  genirq: Add irq_chip_(startup/shutdown)_parent()
  genirq: Remove GENERIC_IRQ_LEGACY
2025-09-30 15:55:25 -07:00
Linus Torvalds
1d17e808cf Two fixes for RSEQ:
1) Protect the event mask modification against the membarrier() IPI as
      otherwise the RmW operation is unprotected and events might be lost.
 
   2) Fix the weak symbol reference in rseq selftests
 
      The current weak RSEQ symbols definitions which were added to allow
      static linkage are not working correctly as the effectively re-define
      the glibc symbols leading to multiple versions of the symbols when
      compiled with -fno-common. Mark them as 'extern' to convert them from
      weak symbol definitions to weak symbol references. That works with
      static and dynamic linkage independent of -fcommon and -fno-common.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmjaQc4THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoTj1D/0Un97b1GJRNUjmvrhSE8uFy5RxgYdX
 R2dJkfuQSF0fx4MzcrUg9b2KeklwDkYtq0YkmbStQeihp4jYpgODNFAdFdB4Cfhv
 e1JfgQ6iTS9zWq625ImygHNZv5NOiDBxf9jtsRTUA3lv804taoS/jHost4Qi9HwO
 jO9YoBrSmxZIZA1cPO9mA/AEpAdhKFxpwZK6yNNA4lH83gVhhSXKyEOFJQ0nWlZJ
 pycEmwM+v9iW67QZEbWkEBPukflnMXYtcUDLmawYDsMZ2gsB4yPYvZwHNTeiB7gu
 n0dfZH4jfw4DkO7MuU1CV1xlXhTO4sJQjmRsJuu2ypnZFVPh6iR+J2DxnZ5PcgvR
 LifuEa/+mN/LM5/gjohDJwny10EXysrEDJ/vZUR4BBTdTsWK6FwLdSnTuP8WF3wo
 LyY+PfeUmosHYOAa6Q8pLJUJzgXHtzlJLjVhYgb61dQjlgFuRB/0ksA8VoVeVhSz
 6A5v/rcVoZIhRj/fxzsJOXtfP24ghIXtRFyfeDeNsWWlqEuL/X28xYf45YBjws7F
 zApbCqeg+JYwJRaWmCDxjmmLsyMvpH3yujvlPZxuit6TX2PfxYG5CNpIxt192wZ2
 fFk8qcsgJ5x/9ah5mpio7wHDESiDkdaPanYA+ercdXogbzI/8Y7fUua8RgJKm4Mj
 3eo3tY3e8q9A3A==
 =VV+N
 -----END PGP SIGNATURE-----

Merge tag 'core-rseq-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull rseq updates from Thomas Gleixner:
 "Two fixes for RSEQ:

   - Protect the event mask modification against the membarrier() IPI as
     otherwise the RmW operation is unprotected and events might be lost

   - Fix the weak symbol reference in rseq selftests

     The current weak RSEQ symbols definitions which were added to allow
     static linkage are not working correctly as they effectively
     re-define the glibc symbols leading to multiple versions of the
     symbols when compiled with -fno-common.

     Mark them as 'extern' to convert them from weak symbol definitions
     to weak symbol references. That works with static and dynamic
     linkage independent of -fcommon and -fno-common"

* tag 'core-rseq-2025-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rseq/selftests: Use weak symbol reference, not definition, to link with glibc
  rseq: Protect event mask against membarrier IPI
2025-09-30 15:06:33 -07:00
Linus Torvalds
e4dcbdff11 Performance events updates for v6.18:
Core perf code updates:
 
  - Convert mmap() related reference counts to refcount_t. This
    is in reaction to the recently fixed refcount bugs, which
    could have been detected earlier and could have mitigated
    the bug somewhat. (Thomas Gleixner, Peter Zijlstra)
 
  - Clean up and simplify the callchain code, in preparation
    for sframes. (Steven Rostedt, Josh Poimboeuf)
 
 Uprobes updates:
 
  - Add support to optimize usdt probes on x86-64, which
    gives a substantial speedup. (Jiri Olsa)
 
  - Cleanups and fixes on x86 (Peter Zijlstra)
 
 PMU driver updates:
 
  - Various optimizations and fixes to the Intel PMU driver
    (Dapeng Mi)
 
 Misc cleanups and fixes:
 
  - Remove redundant __GFP_NOWARN (Qianfeng Rong)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmjWpGIRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iHvxAAvO8qWbbhUdF3EZaFU0Wx6oh5KBhImU49
 VZ107xe9llA0Szy3hIl1YpdOQA2NAHtma6We/ebonrPVTTkcSCGq8absc+GahA3I
 CHIomx2hjD0OQ01aHvTqgHJUdFUQQ0yzE3+FY6Tsn05JsNZvDmqpAMIoMQT0LuuG
 7VvVRLBuDXtuMtNmGaGCvfDGKTZkGGxD6iZS1iWHuixvVAz4IECK0vYqSyh31UGA
 w9Jwa0thwjKm2EZTmcSKaHSM2zw3N8QXJ3SNPPThuMrtO6QDz2+3Da9kO+vhGcRP
 Jls9KnWC2wxNxqIs3dr80Mzn4qMplc67Ekx2tUqX4tYEGGtJQxW6tm3JOKKIgFMI
 g/KF9/WJPXp0rVI9mtoQkgndzyswR/ZJBAwfEQu+nAqlp3gmmQR9+MeYPCyNnyhB
 2g22PTMbXkihJmRPAVeH+WhwFy1YY3nsRhh61ha3/N0ULXTHUh0E+hWwUVMifYSV
 SwXqQx4srlo6RJJNTji1d6R3muNjXCQNEsJ0lCOX6ajVoxWZsPH2x7/W1A8LKmY+
 FLYQUi6X9ogQbOO3WxCjUhzp5nMTNA2vvo87MUzDlZOCLPqYZmqcjntHuXwdjPyO
 lPcfTzc2nK1Ud26bG3+p2Bk3fjqkX9XcTMFniOvjKfffEfwpAq4xRPBQ3uRlzn0V
 pf9067JYF+c=
 =sVXH
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull performance events updates from Ingo Molnar:
 "Core perf code updates:

   - Convert mmap() related reference counts to refcount_t. This is in
     reaction to the recently fixed refcount bugs, which could have been
     detected earlier and could have mitigated the bug somewhat (Thomas
     Gleixner, Peter Zijlstra)

   - Clean up and simplify the callchain code, in preparation for
     sframes (Steven Rostedt, Josh Poimboeuf)

  Uprobes updates:

   - Add support to optimize usdt probes on x86-64, which gives a
     substantial speedup (Jiri Olsa)

   - Cleanups and fixes on x86 (Peter Zijlstra)

  PMU driver updates:

   - Various optimizations and fixes to the Intel PMU driver (Dapeng Mi)

  Misc cleanups and fixes:

   - Remove redundant __GFP_NOWARN (Qianfeng Rong)"

* tag 'perf-core-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
  selftests/bpf: Fix uprobe_sigill test for uprobe syscall error value
  uprobes/x86: Return error from uprobe syscall when not called from trampoline
  perf: Skip user unwind if the task is a kernel thread
  perf: Simplify get_perf_callchain() user logic
  perf: Use current->flags & PF_KTHREAD|PF_USER_WORKER instead of current->mm == NULL
  perf: Have get_perf_callchain() return NULL if crosstask and user are set
  perf: Remove get_perf_callchain() init_nr argument
  perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap()
  perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK
  perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48)
  perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag
  perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error
  perf/x86/intel: Use early_initcall() to hook bts_init()
  uprobes: Remove redundant __GFP_NOWARN
  selftests/seccomp: validate uprobe syscall passes through seccomp
  seccomp: passthrough uprobe systemcall without filtering
  selftests/bpf: Fix uprobe syscall shadow stack test
  selftests/bpf: Change test_uretprobe_regs_change for uprobe and uretprobe
  selftests/bpf: Add uprobe_regs_equal test
  selftests/bpf: Add optimized usdt variant for basic usdt test
  ...
2025-09-30 11:11:21 -07:00
Linus Torvalds
6c7340a7a8 Scheduler updates for v6.18:
Core scheduler changes:
 
  - Make migrate_{en,dis}able() inline, to improve performance
    (Menglong Dong)
 
  - Move STDL_INIT() functions out-of-line (Peter Zijlstra)
 
  - Unify the SCHED_{SMT,CLUSTER,MC} Kconfig (Peter Zijlstra)
 
 Fair scheduling:
 
  - Defer throttling when tasks exit to user-space, to reduce the
    chance & impact of throttle-preemption with held locks and
    other resources. (Aaron Lu, Valentin Schneider)
 
  - Get rid of sched_domains_curr_level hack for tl->cpumask(),
    as the warning was getting triggered on certain topologies.
    (Peter Zijlstra)
 
 Misc cleanups & fixes:
 
  - Header cleanups (Menglong Dong)
 
  - Fix race in push_dl_task() (Harshit Agarwal)
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmjWn0IRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1i9ng/+KJYEsNilPA6nGzZLurU3HXm6S5NrNBPc
 xJU43eo3QdFUymKqYaCoXCwaMah3m8fFW+DC8ZoEvWC5wHnlIw6+u/ZEtsWQ4R8N
 8+sjQddQeb9Gez+nkC7pXX8h1nqlhHyGWU88+9hOyEEMk21KgH7tJ5gOuGQdPhiA
 v24D8dkS1sT6CAeqZAycYIkos+kfNNV+wAEQCXAxfvSSslpzJVZcj9kdQInxF4O4
 9+WrvFZFVYJWdJgmgfbpBMw7N8a4Gpv+Lr6U0ZVXHNeLjNdn60I+5Me+9BW9GRcO
 pPL50RfGgGgVIqyEHvBhhz8p0tlKUJo9NkRJ9hmNB5SKT3P442SU789ppTYgI1il
 5P4g3TiuomqPVuo99/mYHYOpT6NkYC1fkwMP9Hfk4NRUX0EA6lKXelVnMXaC3Z7R
 U+0xVYtK4BBLRFBo4HIEM1HChSIcOnutuufFX4lPPt+ewnAmJwSt619w+lBF0v+n
 cpiLIlQ74LrYlrULw3bznnwzh5dV8XHm4CT3XAShm6qB97/SaLr0rI5W7njdSvte
 ym9z4yFWidawowvLWjFX1CykSnX/T5MV+uzuIH64MEIA39B4VKJWCAC3l3AMJn/5
 Ckhf93B5uG0q+wUtE66x/JrN7wtbz9PMpwh01fXNWs/eWc1VKkVrvq+PmHODngmz
 drSUoI1P570=
 =8ZTZ
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:
 "Core scheduler changes:

   - Make migrate_{en,dis}able() inline, to improve performance
     (Menglong Dong)

   - Move STDL_INIT() functions out-of-line (Peter Zijlstra)

   - Unify the SCHED_{SMT,CLUSTER,MC} Kconfig (Peter Zijlstra)

  Fair scheduling:

   - Defer throttling to when tasks exit to user-space, to reduce the
     chance & impact of throttle-preemption with held locks and other
     resources (Aaron Lu, Valentin Schneider)

   - Get rid of sched_domains_curr_level hack for tl->cpumask(), as the
     warning was getting triggered on certain topologies (Peter
     Zijlstra)

  Misc cleanups & fixes:

   - Header cleanups (Menglong Dong)

   - Fix race in push_dl_task() (Harshit Agarwal)"

* tag 'sched-core-2025-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix some typos in include/linux/preempt.h
  sched: Make migrate_{en,dis}able() inline
  rcu: Replace preempt.h with sched.h in include/linux/rcupdate.h
  arch: Add the macro COMPILE_OFFSETS to all the asm-offsets.c
  sched/fair: Do not balance task to a throttled cfs_rq
  sched/fair: Do not special case tasks in throttled hierarchy
  sched/fair: update_cfs_group() for throttled cfs_rqs
  sched/fair: Propagate load for throttled cfs_rq
  sched/fair: Get rid of throttled_lb_pair()
  sched/fair: Task based throttle time accounting
  sched/fair: Switch to task based throttle model
  sched/fair: Implement throttle task work and related helpers
  sched/fair: Add related data structure for task based throttle
  sched: Unify the SCHED_{SMT,CLUSTER,MC} Kconfig
  sched: Move STDL_INIT() functions out-of-line
  sched/fair: Get rid of sched_domains_curr_level hack for tl->cpumask()
  sched/deadline: Fix race in push_dl_task()
2025-09-30 10:35:11 -07:00
Linus Torvalds
755fa5b4fb cgroup: Changes for v6.18
- Extensive cpuset code cleanup and refactoring work with no functional
   changes: CPU mask computation logic refactoring, introducing new helpers,
   removing redundant code paths, and improving error handling for better
   maintainability.
 
 - A few bug fixes to cpuset including fixes for partition creation failures
   when isolcpus is in use, missing error returns, and null pointer access
   prevention in free_tmpmasks().
 
 - Core cgroup changes include replacing the global percpu_rwsem with
   per-threadgroup rwsem when writing to cgroup.procs for better scalability,
   workqueue conversions to use WQ_PERCPU and system_percpu_wq to prepare for
   workqueue default switching from percpu to unbound, and removal of unused
   code including the post_attach callback.
 
 - New cgroup.stat.local time accounting feature that tracks frozen time
   duration.
 
 - Misc changes including selftests updates (new freezer time tests and
   backward compatibility fixes), documentation sync, string function safety
   improvements, and 64-bit division fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCaNb1Sg4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGfLMAPwKwkvUg9DPJEuECRfM9woOOHyIWLp1DwUhpg1v
 Zq0lkAEAmo/+IkJXGZ7TGF+wzSj7GFIugrILu3upzLCHzgYoDgs=
 =39KF
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:

 - Extensive cpuset code cleanup and refactoring work with no functional
   changes: CPU mask computation logic refactoring, introducing new
   helpers, removing redundant code paths, and improving error handling
   for better maintainability.

 - A few bug fixes to cpuset including fixes for partition creation
   failures when isolcpus is in use, missing error returns, and null
   pointer access prevention in free_tmpmasks().

 - Core cgroup changes include replacing the global percpu_rwsem with
   per-threadgroup rwsem when writing to cgroup.procs for better
   scalability, workqueue conversions to use WQ_PERCPU and
   system_percpu_wq to prepare for workqueue default switching from
   percpu to unbound, and removal of unused code including the
   post_attach callback.

 - New cgroup.stat.local time accounting feature that tracks frozen time
   duration.

 - Misc changes including selftests updates (new freezer time tests and
   backward compatibility fixes), documentation sync, string function
   safety improvements, and 64-bit division fixes.

* tag 'cgroup-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (39 commits)
  cpuset: remove is_prs_invalid helper
  cpuset: remove impossible warning in update_parent_effective_cpumask
  cpuset: remove redundant special case for null input in node mask update
  cpuset: fix missing error return in update_cpumask
  cpuset: Use new excpus for nocpu error check when enabling root partition
  cpuset: fix failure to enable isolated partition when containing isolcpus
  Documentation: cgroup-v2: Sync manual toctree
  cpuset: use partition_cpus_change for setting exclusive cpus
  cpuset: use parse_cpulist for setting cpus.exclusive
  cpuset: introduce partition_cpus_change
  cpuset: refactor cpus_allowed_validate_change
  cpuset: refactor out validate_partition
  cpuset: introduce cpus_excl_conflict and mems_excl_conflict helpers
  cpuset: refactor CPU mask buffer parsing logic
  cpuset: Refactor exclusive CPU mask computation logic
  cpuset: change return type of is_partition_[in]valid to bool
  cpuset: remove unused assignment to trialcs->partition_root_state
  cpuset: move the root cpuset write check earlier
  cgroup/cpuset: Remove redundant rcu_read_lock/unlock() in spin_lock
  cgroup: Remove redundant rcu_read_lock/unlock() in spin_lock
  ...
2025-09-30 09:55:41 -07:00
Linus Torvalds
77fc3f6696 workqueue: Changes for v6.18
- WQ_PERCPU was added to remaining alloc_workqueue() users and system_wq
   usage was replaced with system_percpu_wq and system_unbound_wq with
   system_dfl_wq. These are equivalent conversions with no functional changes,
   preparing for switching default to unbound workqueues from percpu.
 
 - A handshake mechanism was added for canceling BH workers to avoid live
   lock scenarios under PREEMPT_RT.
 
 - Unnecessary rcu_read_lock/unlock() calls were dropped in
   wq_watchdog_timer_fn() and workqueue_congested().
 
 - Documentation was fixed to resolve texinfodocs warnings.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCaNbvXg4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGTr3AP9lucQOr6wy2+gZOhUd2ItBbQMkKb4Vg7akkqkL
 3cKyMQD/WF/oQdszpUA/Y8coWa9++T8q9Ua56crDVsTJ29mHbgQ=
 =mrZy
 -----END PGP SIGNATURE-----

Merge tag 'wq-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue updates from Tejun Heo:

 - WQ_PERCPU was added to remaining alloc_workqueue() users and
   system_wq usage was replaced with system_percpu_wq and
   system_unbound_wq with system_dfl_wq.

   These are equivalent conversions with no functional changes,
   preparing for switching default to unbound workqueues from percpu.

 - A handshake mechanism was added for canceling BH workers to avoid
   live lock scenarios under PREEMPT_RT.

 - Unnecessary rcu_read_lock/unlock() calls were dropped in
   wq_watchdog_timer_fn() and workqueue_congested().

 - Documentation was fixed to resolve texinfodocs warnings.

* tag 'wq-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix texinfodocs warning for WQ_* flags reference
  workqueue: WQ_PERCPU added to alloc_workqueue users
  workqueue: replace use of system_wq with system_percpu_wq
  workqueue: replace use of system_unbound_wq with system_dfl_wq
  workqueue: Provide a handshake for canceling BH workers
  workqueue: Remove rcu_read_lock/unlock() in wq_watchdog_timer_fn()
  workqueue: Remove redundant rcu_read_lock/unlock() in workqueue_congested()
2025-09-30 09:31:09 -07:00
Linus Torvalds
a23cd25bae sched_ext: Changes for v6.18
- Code organization cleanup. Separate internal types and accessors to
   ext_internal.h to reduce the size of ext.c and improve maintainability.
 
 - Prepare for cgroup sub-scheduler support by adding @sch parameter to
   various functions and helpers, reorganizing scheduler instance handling,
   and dropping obsolete helpers like scx_kf_exit() and kf_cpu_valid().
 
 - Add new scx_bpf_cpu_curr() and scx_bpf_locked_rq() BPF helpers to provide
   safer access patterns with proper RCU protection. scx_bpf_cpu_rq() is
   deprecated with warnings due to potential race conditions.
 
 - Improve debugging with migration-disabled counter in error state dumps,
   SCX_EFLAG_INITIALIZED flag, bitfields for warning flags, and other
   enhancements to help diagnose issues.
 
 - Use cgroup_lock/unlock() for cgroup synchronization instead of
   scx_cgroup_rwsem based synchronization. This is simpler and allows
   enable/disable paths to synchronize against cgroup changes independent of
   the CPU controller.
 
 - rhashtable_lookup() replacement to avoid redundant RCU locking was
   reverted due to RCU usage warnings. Will be redone once rhashtable is
   updated to use rcu_dereference_all().
 
 - Other misc updates and fixes including bypass handling improvements,
   scx_task_iter_relock() improvements, tools/sched_ext updates, and
   compatibility helpers.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCaNbpHQ4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGTzWAP9cOG1O2DE0nCJJlDfGoJd1suz9quK+OrVdXhO+
 6nnXoAEAonDP0jXtwoW9GVQFyj8WZ0mfV4QJk0OHM1CoOV5XkQY=
 =Ev85
 -----END PGP SIGNATURE-----

Merge tag 'sched_ext-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext updates from Tejun Heo:

 - Code organization cleanup. Separate internal types and accessors to
   ext_internal.h to reduce the size of ext.c and improve
   maintainability.

 - Prepare for cgroup sub-scheduler support by adding @sch parameter to
   various functions and helpers, reorganizing scheduler instance
   handling, and dropping obsolete helpers like scx_kf_exit() and
   kf_cpu_valid().

 - Add new scx_bpf_cpu_curr() and scx_bpf_locked_rq() BPF helpers to
   provide safer access patterns with proper RCU protection.
   scx_bpf_cpu_rq() is deprecated with warnings due to potential race
   conditions.

 - Improve debugging with migration-disabled counter in error state
   dumps, SCX_EFLAG_INITIALIZED flag, bitfields for warning flags, and
   other enhancements to help diagnose issues.

 - Use cgroup_lock/unlock() for cgroup synchronization instead of
   scx_cgroup_rwsem based synchronization. This is simpler and allows
   enable/disable paths to synchronize against cgroup changes
   independent of the CPU controller.

 - rhashtable_lookup() replacement to avoid redundant RCU locking was
   reverted due to RCU usage warnings. Will be redone once rhashtable is
   updated to use rcu_dereference_all().

 - Other misc updates and fixes including bypass handling improvements,
   scx_task_iter_relock() improvements, tools/sched_ext updates, and
   compatibility helpers.

* tag 'sched_ext-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (28 commits)
  Revert "sched_ext: Use rhashtable_lookup() instead of rhashtable_lookup_fast()"
  sched_ext: Misc updates around scx_sched instance pointer
  sched_ext: Drop scx_kf_exit() and scx_kf_error()
  sched_ext: Add the @sch parameter to scx_dsq_insert_preamble/commit()
  sched_ext: Drop kf_cpu_valid()
  sched_ext: Add the @sch parameter to ext_idle helpers
  sched_ext: Add the @sch parameter to __bstr_format()
  sched_ext: Separate out scx_kick_cpu() and add @sch to it
  tools/sched_ext: scx_qmap: Make debug output quieter by default
  sched_ext: Make qmap dump operation non-destructive
  sched_ext: Add SCX_EFLAG_INITIALIZED to indicate successful ops.init()
  sched_ext: Use bitfields for boolean warning flags
  sched_ext: Fix stray scx_root usage in task_can_run_on_remote_rq()
  sched_ext: Improve SCX_KF_DISPATCH comment
  sched_ext: Use rhashtable_lookup() instead of rhashtable_lookup_fast()
  sched_ext: Verify RCU protection in scx_bpf_cpu_curr()
  sched_ext: Add migration-disabled counter to error state dump
  sched_ext: Fix NULL dereference in scx_bpf_cpu_rq() warning
  tools/sched_ext: Add compat helper for scx_bpf_cpu_curr()
  sched_ext: deprecation warn for scx_bpf_cpu_rq()
  ...
2025-09-30 09:05:07 -07:00
Linus Torvalds
56a0810d8c audit/stable-6.18 PR 20250926
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmjWq5oUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPTjRAAwYapnw+ZGdFtTGIDT63HtlKjCGHg
 DRR8J1RYWhxQL78dInjl7hlGPd4ULdpdF6zsh27X/8OsdFotw4NhDyPJwS1qWZv9
 uBJMy/s1Qi1V/rrtDygLGgkQ9ICfl/hgVh3L+g9AXU8H9IapMULp33z+2ueFU4rA
 PXgXppgNQTOhIQml0tagY7iPlLaaI1uPv/Dbvt792CSrKZReC+uiDSQKD6SUy5oJ
 NBRs0emdCqbllo8Eo7wTGdfzUttsPWYHe7X9BGCMK2bHp0BpMnFBDtuipUAgjNE8
 O16EkAtBMpEBW9VEFvDYW1jMFO7ccD8b09CbqPLdE7E0GeigTiODg+FdncKEpZn0
 Dl4xPbIoPBHVrDHKFK3HcuEdUs0FZH3NpTLFRg0/nWbg3CfSOFq1ZKhSbwLTZ48V
 2Iq22G0hIIl3yTEePSoR8xCSQkWf6hA1SVvzBqw5Xn1tnkdIUuM+KzeZUPKxCOiH
 r5b3ufrN5YMAcmc59q393sNuSMd7s97fohhK8/HouB93EcVNM2UjLEKVJnhMhYRE
 N21O17jwQG9F+OYTnmtMzuUF6yxwSAmkzQOg6F+lalJ8MECnNrZOEeyuA3d5ISi5
 4ZrXHWw90qaDy9lCV1o0UwWt9na+WxeMCJNpI07h5V1k3x7BULiI6WeP7J1qnY9r
 YlLv/6Hgx29dtqE=
 =iQal
 -----END PGP SIGNATURE-----

Merge tag 'audit-pr-20250926' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit

Pull audit updates from Paul Moore:

 - Proper audit support for multiple LSMs

   As the audit subsystem predated the work to enable multiple LSMs,
   some additional work was needed to support logging the different LSM
   labels for the subjects/tasks and objects on the system. Casey's
   patches add new auxillary records for subjects and objects that
   convey the additional labels.

 - Ensure fanotify audit events are always generated

   Generally speaking security relevant subsystems always generate audit
   events, unless explicitly ignored. However, up to this point fanotify
   events had been ignored by default, but starting with this pull
   request fanotify follows convention and generates audit events by
   default.

 - Replace an instance of strcpy() with strscpy()

 - Minor indentation, style, and comment fixes

* tag 'audit-pr-20250926' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: fix skb leak when audit rate limit is exceeded
  audit: init ab->skb_list earlier in audit_buffer_alloc()
  audit: add record for multiple object contexts
  audit: add record for multiple task security contexts
  lsm: security_lsmblob_to_secctx module selection
  audit: create audit_stamp structure
  audit: add a missing tab
  audit: record fanotify event regardless of presence of rules
  audit: fix typo in auditfilter.c comment
  audit: Replace deprecated strcpy() with strscpy()
  audit: fix indentation in audit_log_exit()
2025-09-30 08:22:16 -07:00
Linus Torvalds
417552999d powerpc updates for 6.18
- powerpc support for BPF arena and arena atomics
  - Patches to switch to msi parent domain (per-device MSI domains)
  - Add a lock contention tracepoint in the queued spinlock slowpath
  - Fixes for underflow in pseries/powernv msi and pci paths
  - Switch from legacy-of-mm-gpiochip dependency to platform driver
  - Fixes for handling TLB misses
  - Introduce support for powerpc papr-hvpipe
  - Add vpa-dtl PMU driver for pseries platform
  - Misc fixes and cleanups
 
 Thanks to: Aboorva Devarajan, Aditya Bodkhe, Andrew Donnellan, Athira Rajeev, Cédric Le Goater, Christophe Leroy,
 Erhard Furtner, Gautam Menghani, Geert Uytterhoeven, Haren Myneni, Hari Bathini, Joe Lawrence, Kajol Jain, Kienan Stewart,
 Linus Walleij, Mahesh Salgaonkar, Nam Cao, Nicolas Schier, Nysal Jan K.A., Ritesh Harjani (IBM), Ruben Wauters,
 Saket Kumar Bhaskar, Shashank MS, Shrikanth Hegde, Tejas Manhas, Thomas Gleixner, Thomas Huth, Thorsten Blum,
 Tyrel Datwyler, Venkat Rao Bagalkote,
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEqX2DNAOgU8sBX3pRpnEsdPSHZJQFAmjatlYACgkQpnEsdPSH
 ZJRQRQ//fCFV+ZMAgg54N/PMaobYSQdjvAGLGj+ynTfLas0qkhV62x3X1R85pLbW
 cqiDTcBeVT9nW7+2w+eIcS4kWku2/oorlPnsDtKTadLa78Sk9r12kzsPn+vOobwA
 /dril1kRHNqdpqUG8P751EUIFWULsKFOwNMdp24q8inpr2YOy3csVbAxUqVkSVbk
 8aafr5/hl5uvj38KdE4ditLlphNenA+kB1Tf67/R4T7MjHTqZtLPd+crqrLtEQlk
 eYvcJ1c5t/+ltgiSnuR5Go48CKDaw4YKCtqnaBiOBxMad2xnzqRo+NWmcxH1HmwI
 MZsSyHeri/sF7zBRCpytqGlci/wFQwiNeI3bpE31lV0wwu8Bmoypommxly/fPUSU
 23Qi7UuTmOXu537aFcMAfqmA05JlJh79JCEuWVN5LUgQUIalATeiyoKNPAuYYw2V
 FNYyURBBe1FNVhGyJdTg+766npUX8eqOWMImcyuAzdAV4MuBPzUbhVKJazHax+4r
 BzqDD5DeApl53+wUzLjWnN4s1rKxOuRUdWgOsrzD3DQZJkIP3p8WDZkgQY4YiI3P
 EWbHEx7q1XSiMJD4lhyAYHY01yc1bwYrrZ0/NCpl4G1trbdLb0sZmGEH9ykJlzSC
 cVV/kvH9tyy95I7HbDKc4lK1c0dC5kwFmcArTaHnDaRpRvc+eBQ=
 =U2s/
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Madhavan Srinivasan:

 - powerpc support for BPF arena and arena atomics

 - Patches to switch to msi parent domain (per-device MSI domains)

 - Add a lock contention tracepoint in the queued spinlock slowpath

 - Fixes for underflow in pseries/powernv msi and pci paths

 - Switch from legacy-of-mm-gpiochip dependency to platform driver

 - Fixes for handling TLB misses

 - Introduce support for powerpc papr-hvpipe

 - Add vpa-dtl PMU driver for pseries platform

 - Misc fixes and cleanups

Thanks to Aboorva Devarajan, Aditya Bodkhe, Andrew Donnellan, Athira
Rajeev, Cédric Le Goater, Christophe Leroy, Erhard Furtner, Gautam
Menghani, Geert Uytterhoeven, Haren Myneni, Hari Bathini, Joe Lawrence,
Kajol Jain, Kienan Stewart, Linus Walleij, Mahesh Salgaonkar, Nam Cao,
Nicolas Schier, Nysal Jan K.A., Ritesh Harjani (IBM), Ruben Wauters,
Saket Kumar Bhaskar, Shashank MS, Shrikanth Hegde, Tejas Manhas, Thomas
Gleixner, Thomas Huth, Thorsten Blum, Tyrel Datwyler, and Venkat Rao
Bagalkote.

* tag 'powerpc-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (49 commits)
  powerpc/pseries: Define __u{8,32} types in papr_hvpipe_hdr struct
  genirq/msi: Remove msi_post_free()
  powerpc/perf/vpa-dtl: Add documentation for VPA dispatch trace log PMU
  powerpc/perf/vpa-dtl: Handle the writing of perf record when aux wake up is needed
  powerpc/perf/vpa-dtl: Add support to capture DTL data in aux buffer
  powerpc/perf/vpa-dtl: Add support to setup and free aux buffer for capturing DTL data
  docs: ABI: sysfs-bus-event_source-devices-vpa-dtl: Document sysfs event format entries for vpa_dtl pmu
  powerpc/vpa_dtl: Add interface to expose vpa dtl counters via perf
  powerpc/time: Expose boot_tb via accessor
  powerpc/32: Remove PAGE_KERNEL_TEXT to fix startup failure
  powerpc/fprobe: fix updated fprobe for function-graph tracer
  powerpc/ftrace: support CONFIG_FUNCTION_GRAPH_RETVAL
  powerpc64/modules: replace stub allocation sentinel with an explicit counter
  powerpc64/modules: correctly iterate over stubs in setup_ftrace_ool_stubs
  powerpc/ftrace: ensure ftrace record ops are always set for NOPs
  powerpc/603: Really copy kernel PGD entries into all PGDIRs
  powerpc/8xx: Remove left-over instruction and comments in DataStoreTLBMiss handler
  powerpc/pseries: HVPIPE changes to support migration
  powerpc/pseries: Enable hvpipe with ibm,set-system-parameter RTAS
  powerpc/pseries: Enable HVPIPE event message interrupt
  ...
2025-09-29 19:28:50 -07:00
Linus Torvalds
feafee2845 arm64 updates for 6.18
Confidential computing:
  - Add support for accepting secrets from firmware (e.g. ACPI CCEL)
    and mapping them with appropriate attributes.
 
 CPU features:
  - Advertise atomic floating-point instructions to userspace.
 
  - Extend Spectre workarounds to cover additional Arm CPU variants.
 
  - Extend list of CPUs that support break-before-make level 2 and
    guarantee not to generate TLB conflict aborts for changes of mapping
    granularity (BBML2_NOABORT).
 
  - Add GCS support to our uprobes implementation.
 
 Documentation:
  - Remove bogus SME documentation concerning register state when
    entering/exiting streaming mode.
 
 Entry code:
  - Switch over to the generic IRQ entry code (GENERIC_IRQ_ENTRY).
 
  - Micro-optimise syscall entry path with a compiler branch hint.
 
 Memory management:
  - Enable huge mappings in vmalloc space even when kernel page-table
    dumping is enabled.
 
  - Tidy up the types used in our early MMU setup code.
 
  - Rework rodata= for closer parity with the behaviour on x86.
 
  - For CPUs implementing BBML2_NOABORT, utilise block mappings in the
    linear map even when rodata= applies to virtual aliases.
 
  - Don't re-allocate the virtual region between '_text' and '_stext',
    as doing so confused tools parsing /proc/vmcore.
 
 Miscellaneous:
  - Clean-up Kconfig menuconfig text for architecture features.
 
  - Avoid redundant bitmap_empty() during determination of supported
    SME vector lengths.
 
  - Re-enable warnings when building the 32-bit vDSO object.
 
  - Avoid breaking our eggs at the wrong end.
 
 Perf and PMUs:
  - Support for v3 of the Hisilicon L3C PMU.
 
  - Support for Hisilicon's MN and NoC PMUs.
 
  - Support for Fujitsu's Uncore PMU.
 
  - Support for SPE's extended event filtering feature.
 
  - Preparatory work to enable data source filtering in SPE.
 
  - Support for multiple lanes in the DWC PCIe PMU.
 
  - Support for i.MX94 in the IMX DDR PMU driver.
 
  - MAINTAINERS update (Thank you, Yicong).
 
  - Minor driver fixes (PERF_IDX2OFF() overflow, CMN register offsets).
 
 Selftests:
  - Add basic LSFE check to the existing hwcaps test.
 
  - Support nolibc in GCS tests.
 
  - Extend SVE ptrace test to pass unsupported regsets and invalid vector
    lengths.
 
  - Minor cleanups (typos, cosmetic changes).
 
 System registers:
  - Fix ID_PFR1_EL1 definition.
 
  - Fix incorrect signedness of some fields in ID_AA64MMFR4_EL1.
 
  - Sync TCR_EL1 definition with the latest Arm ARM (L.b).
 
  - Be stricter about the input fed into our AWK sysreg generator script.
 
  - Typo fixes and removal of redundant definitions.
 
 ACPI, EFI and PSCI:
  - Decouple Arm's "Software Delegated Exception Interface" (SDEI)
    support from the ACPI GHES code so that it can be used by platforms
    booted with device-tree.
 
  - Remove unnecessary per-CPU tracking of the FPSIMD state across EFI
    runtime calls.
 
  - Fix a node refcount imbalance in the PSCI device-tree code.
 
 CPU Features:
  - Ensure register sanitisation is applied to fields in ID_AA64MMFR4.
 
  - Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM guests
    can reliably query the underlying CPU types from the VMM.
 
  - Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes
    to our context-switching, signal handling and ptrace code.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmjWlMIQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNIVYB/sHaFYmToEaK4ldMfTn4ZQ8yr9ZuTMLr/ji
 zOm7wULP08wKIcp6t+1N6e7A8Wb3YSErxTywc4b9MqctIel1WvBMewjJd38xb2qO
 hFCuRuWemfyt6Rw/EGMCP54yueLzKbnN4q+/Aks8jedWtS+AXY6McGCJOzMjuECL
 zKb68Hqqb09YQ2b2BPSAiU9g42rotiIYppLaLbEdxUnw/eqfEN3wSrl92/LuBlqH
 r2wZDnJwA5q8Iy9HWlnhg4Jax0Cs86jSJPIQLB6v3WWhc3HR49pm5cqYzYkUht9L
 nA6eaWJiBl/+k3S+ftPKNzU8n04r15lqyNZE2FYfRk+0BPSacJOf
 =kO15
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "There's good stuff across the board, including some nice mm
  improvements for CPUs with the 'noabort' BBML2 feature and a clever
  patch to allow ptdump to play nicely with block mappings in the
  vmalloc area.

  Confidential computing:

   - Add support for accepting secrets from firmware (e.g. ACPI CCEL)
     and mapping them with appropriate attributes.

  CPU features:

   - Advertise atomic floating-point instructions to userspace

   - Extend Spectre workarounds to cover additional Arm CPU variants

   - Extend list of CPUs that support break-before-make level 2 and
     guarantee not to generate TLB conflict aborts for changes of
     mapping granularity (BBML2_NOABORT)

   - Add GCS support to our uprobes implementation.

  Documentation:

   - Remove bogus SME documentation concerning register state when
     entering/exiting streaming mode.

  Entry code:

   - Switch over to the generic IRQ entry code (GENERIC_IRQ_ENTRY)

   - Micro-optimise syscall entry path with a compiler branch hint.

  Memory management:

   - Enable huge mappings in vmalloc space even when kernel page-table
     dumping is enabled

   - Tidy up the types used in our early MMU setup code

   - Rework rodata= for closer parity with the behaviour on x86

   - For CPUs implementing BBML2_NOABORT, utilise block mappings in the
     linear map even when rodata= applies to virtual aliases

   - Don't re-allocate the virtual region between '_text' and '_stext',
     as doing so confused tools parsing /proc/vmcore.

  Miscellaneous:

   - Clean-up Kconfig menuconfig text for architecture features

   - Avoid redundant bitmap_empty() during determination of supported
     SME vector lengths

   - Re-enable warnings when building the 32-bit vDSO object

   - Avoid breaking our eggs at the wrong end.

  Perf and PMUs:

   - Support for v3 of the Hisilicon L3C PMU

   - Support for Hisilicon's MN and NoC PMUs

   - Support for Fujitsu's Uncore PMU

   - Support for SPE's extended event filtering feature

   - Preparatory work to enable data source filtering in SPE

   - Support for multiple lanes in the DWC PCIe PMU

   - Support for i.MX94 in the IMX DDR PMU driver

   - MAINTAINERS update (Thank you, Yicong)

   - Minor driver fixes (PERF_IDX2OFF() overflow, CMN register offsets).

  Selftests:

   - Add basic LSFE check to the existing hwcaps test

   - Support nolibc in GCS tests

   - Extend SVE ptrace test to pass unsupported regsets and invalid
     vector lengths

   - Minor cleanups (typos, cosmetic changes).

  System registers:

   - Fix ID_PFR1_EL1 definition

   - Fix incorrect signedness of some fields in ID_AA64MMFR4_EL1

   - Sync TCR_EL1 definition with the latest Arm ARM (L.b)

   - Be stricter about the input fed into our AWK sysreg generator
     script

   - Typo fixes and removal of redundant definitions.

  ACPI, EFI and PSCI:

   - Decouple Arm's "Software Delegated Exception Interface" (SDEI)
     support from the ACPI GHES code so that it can be used by platforms
     booted with device-tree

   - Remove unnecessary per-CPU tracking of the FPSIMD state across EFI
     runtime calls

   - Fix a node refcount imbalance in the PSCI device-tree code.

  CPU Features:

   - Ensure register sanitisation is applied to fields in ID_AA64MMFR4

   - Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM
     guests can reliably query the underlying CPU types from the VMM

   - Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes
     to our context-switching, signal handling and ptrace code"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (93 commits)
  arm64: cpufeature: Remove duplicate asm/mmu.h header
  arm64: Kconfig: Make CPU_BIG_ENDIAN depend on BROKEN
  perf/dwc_pcie: Fix use of uninitialized variable
  arm/syscalls: mark syscall invocation as likely in invoke_syscall
  Documentation: hisi-pmu: Add introduction to HiSilicon V3 PMU
  Documentation: hisi-pmu: Fix of minor format error
  drivers/perf: hisi: Add support for L3C PMU v3
  drivers/perf: hisi: Refactor the event configuration of L3C PMU
  drivers/perf: hisi: Extend the field of tt_core
  drivers/perf: hisi: Extract the event filter check of L3C PMU
  drivers/perf: hisi: Simplify the probe process of each L3C PMU version
  drivers/perf: hisi: Export hisi_uncore_pmu_isr()
  drivers/perf: hisi: Relax the event ID check in the framework
  perf: Fujitsu: Add the Uncore PMU driver
  arm64: map [_text, _stext) virtual address range non-executable+read-only
  arm64/sysreg: Update TCR_EL1 register
  arm64: Enable vmalloc-huge with ptdump
  arm64: cpufeature: add Neoverse-V3AE to BBML2 allow list
  arm64: errata: Apply workarounds for Neoverse-V3AE
  arm64: cputype: Add Neoverse-V3AE definitions
  ...
2025-09-29 18:48:39 -07:00
Linus Torvalds
a5ba183bde hardening updates for v6.18-rc1
- Clean up usage of TRAILING_OVERLAP() (Gustavo A. R. Silva)
 
 - lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
   (Junjie Cao)
 
 - Add str_assert_deassert() helper (Lad Prabhakar)
 
 - gcc-plugins: Remove TODO_verify_il for GCC >= 16
 
 - kconfig: Fix BrokenPipeError warnings in selftests
 
 - kconfig: Add transitional symbol attribute for migration support
 
 - kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaNraNQAKCRA2KwveOeQk
 u/DkAPwKPP5BSmVR2wkdpQaXIr3PGA+cbBYp34DMJNujZ9piIwD/WZ+HfGTLoERy
 +2Q6HLj9hUdd+Rx3IZ8/w1QmnhUIUAU=
 =AwV9
 -----END PGP SIGNATURE-----

Merge tag 'hardening-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull hardening updates from Kees Cook:
 "One notable addition is the creation of the 'transitional' keyword for
  kconfig so CONFIG renaming can go more smoothly.

  This has been a long-standing deficiency, and with the renaming of
  CONFIG_CFI_CLANG to CONFIG_CFI (since GCC will soon have KCFI
  support), this came up again.

  The breadth of the diffstat is mainly this renaming.

   - Clean up usage of TRAILING_OVERLAP() (Gustavo A. R. Silva)

   - lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
     (Junjie Cao)

   - Add str_assert_deassert() helper (Lad Prabhakar)

   - gcc-plugins: Remove TODO_verify_il for GCC >= 16

   - kconfig: Fix BrokenPipeError warnings in selftests

   - kconfig: Add transitional symbol attribute for migration support

   - kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI"

* tag 'hardening-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  lib/string_choices: Add str_assert_deassert() helper
  kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI
  kconfig: Add transitional symbol attribute for migration support
  kconfig: Fix BrokenPipeError warnings in selftests
  gcc-plugins: Remove TODO_verify_il for GCC >= 16
  stddef: Introduce __TRAILING_OVERLAP()
  stddef: Remove token-pasting in TRAILING_OVERLAP()
  lkdtm: fortify: Fix potential NULL dereference on kmalloc failure
2025-09-29 17:48:27 -07:00
Linus Torvalds
a240a79d43 seccomp updates for v6.18-rc1
- Fix race with WAIT_KILLABLE_RECV (Johannes Nixdorf)
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaNrY3wAKCRA2KwveOeQk
 u5vmAP9pH7LAt7zTWgleIxWPYuUvELS+zB9oK9EukyTZNbAVoAD8Do5ZpFP51Llk
 a6mxvmi808aYytROWM5c8O4tk5CFDQQ=
 =Vkfy
 -----END PGP SIGNATURE-----

Merge tag 'seccomp-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull seccomp update from Kees Cook:

 - Fix race with WAIT_KILLABLE_RECV (Johannes Nixdorf)

* tag 'seccomp-v6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  selftests/seccomp: Add a test for the WAIT_KILLABLE_RECV fast reply race
  seccomp: Fix a race with WAIT_KILLABLE_RECV if the tracer replies too fast
2025-09-29 17:44:09 -07:00
Linus Torvalds
449c2b302c vfs-6.18-rc1.async
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaNZQkAAKCRCRxhvAZXjc
 oijOAQCw+gE9GkvYJ7TOCiqVFj8RZZ1tb7NiCdBRCa4wXM7KMwD9Gwim//G8Hjyy
 dNPZejphGdxv99Skwp/0nfyDGL6x7w4=
 =zIJf
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.18-rc1.async' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs async directory updates from Christian Brauner:
 "This contains further preparatory changes for the asynchronous directory
  locking scheme:

   - Add lookup_one_positive_killable() which allows overlayfs to
     perform lookup that won't block on a fatal signal

   - Unify the mount idmap handling in struct renamedata as a rename can
     only happen within a single mount

   - Introduce kern_path_parent() for audit which sets the path to the
     parent and returns a dentry for the target without holding any
     locks on return

   - Rename kern_path_locked() as it is only used to prepare for the
     removal of an object from the filesystem:

	kern_path_locked()    => start_removing_path()
	kern_path_create()    => start_creating_path()
	user_path_create()    => start_creating_user_path()
	user_path_locked_at() => start_removing_user_path_at()
	done_path_create()    => end_creating_path()
	NA                    => end_removing_path()"

* tag 'vfs-6.18-rc1.async' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  debugfs: rename start_creating() to debugfs_start_creating()
  VFS: rename kern_path_locked() and related functions.
  VFS/audit: introduce kern_path_parent() for audit
  VFS: unify old_mnt_idmap and new_mnt_idmap in renamedata
  VFS: discard err2 in filename_create()
  VFS/ovl: add lookup_one_positive_killable()
2025-09-29 11:55:15 -07:00
Linus Torvalds
18b19abc37 namespace-6.18-rc1
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaNZQgQAKCRCRxhvAZXjc
 oiFXAQCpbLvkWbld9wLgxUBhq+q+kw5NvGxzpvqIhXwJB9F9YAEA44/Wevln4xGx
 +kRUbP+xlRQqenIYs2dLzVHzAwAdfQ4=
 =EO4Y
 -----END PGP SIGNATURE-----

Merge tag 'namespace-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull namespace updates from Christian Brauner:
 "This contains a larger set of changes around the generic namespace
  infrastructure of the kernel.

  Each specific namespace type (net, cgroup, mnt, ...) embedds a struct
  ns_common which carries the reference count of the namespace and so
  on.

  We open-coded and cargo-culted so many quirks for each namespace type
  that it just wasn't scalable anymore. So given there's a bunch of new
  changes coming in that area I've started cleaning all of this up.

  The core change is to make it possible to correctly initialize every
  namespace uniformly and derive the correct initialization settings
  from the type of the namespace such as namespace operations, namespace
  type and so on. This leaves the new ns_common_init() function with a
  single parameter which is the specific namespace type which derives
  the correct parameters statically. This also means the compiler will
  yell as soon as someone does something remotely fishy.

  The ns_common_init() addition also allows us to remove ns_alloc_inum()
  and drops any special-casing of the initial network namespace in the
  network namespace initialization code that Linus complained about.

  Another part is reworking the reference counting. The reference
  counting was open-coded and copy-pasted for each namespace type even
  though they all followed the same rules. This also removes all open
  accesses to the reference count and makes it private and only uses a
  very small set of dedicated helpers to manipulate them just like we do
  for e.g., files.

  In addition this generalizes the mount namespace iteration
  infrastructure introduced a few cycles ago. As reminder, the vfs makes
  it possible to iterate sequentially and bidirectionally through all
  mount namespaces on the system or all mount namespaces that the caller
  holds privilege over. This allow userspace to iterate over all mounts
  in all mount namespaces using the listmount() and statmount() system
  call.

  Each mount namespace has a unique identifier for the lifetime of the
  systems that is exposed to userspace. The network namespace also has a
  unique identifier working exactly the same way. This extends the
  concept to all other namespace types.

  The new nstree type makes it possible to lookup namespaces purely by
  their identifier and to walk the namespace list sequentially and
  bidirectionally for all namespace types, allowing userspace to iterate
  through all namespaces. Looking up namespaces in the namespace tree
  works completely locklessly.

  This also means we can move the mount namespace onto the generic
  infrastructure and remove a bunch of code and members from struct
  mnt_namespace itself.

  There's a bunch of stuff coming on top of this in the future but for
  now this uses the generic namespace tree to extend a concept
  introduced first for pidfs a few cycles ago. For a while now we have
  supported pidfs file handles for pidfds. This has proven to be very
  useful.

  This extends the concept to cover namespaces as well. It is possible
  to encode and decode namespace file handles using the common
  name_to_handle_at() and open_by_handle_at() apis.

  As with pidfs file handles, namespace file handles are exhaustive,
  meaning it is not required to actually hold a reference to nsfs in
  able to decode aka open_by_handle_at() a namespace file handle.
  Instead the FD_NSFS_ROOT constant can be passed which will let the
  kernel grab a reference to the root of nsfs internally and thus decode
  the file handle.

  Namespaces file descriptors can already be derived from pidfds which
  means they aren't subject to overmount protection bugs. IOW, it's
  irrelevant if the caller would not have access to an appropriate
  /proc/<pid>/ns/ directory as they could always just derive the
  namespace based on a pidfd already.

  It has the same advantage as pidfds. It's possible to reliably and for
  the lifetime of the system refer to a namespace without pinning any
  resources and to compare them trivially.

  Permission checking is kept simple. If the caller is located in the
  namespace the file handle refers to they are able to open it otherwise
  they must hold privilege over the owning namespace of the relevant
  namespace.

  The namespace file handle layout is exposed as uapi and has a stable
  and extensible format. For now it simply contains the namespace
  identifier, the namespace type, and the inode number. The stable
  format means that userspace may construct its own namespace file
  handles without going through name_to_handle_at() as they are already
  allowed for pidfs and cgroup file handles"

* tag 'namespace-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (65 commits)
  ns: drop assert
  ns: move ns type into struct ns_common
  nstree: make struct ns_tree private
  ns: add ns_debug()
  ns: simplify ns_common_init() further
  cgroup: add missing ns_common include
  ns: use inode initializer for initial namespaces
  selftests/namespaces: verify initial namespace inode numbers
  ns: rename to __ns_ref
  nsfs: port to ns_ref_*() helpers
  net: port to ns_ref_*() helpers
  uts: port to ns_ref_*() helpers
  ipv4: use check_net()
  net: use check_net()
  net-sysfs: use check_net()
  user: port to ns_ref_*() helpers
  time: port to ns_ref_*() helpers
  pid: port to ns_ref_*() helpers
  ipc: port to ns_ref_*() helpers
  cgroup: port to ns_ref_*() helpers
  ...
2025-09-29 11:20:29 -07:00
Linus Torvalds
722df25ddf kernel-6.18-rc1.clone3
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaNZgMQAKCRCRxhvAZXjc
 ornXAP954dZjz+OJw6lJLCf0j9TXJOczGHvK3oW5ZD9KnqtTdwEA7p1A6WMOKJyl
 8VtTgCS0yNt8QlznUnsSDfVm0jXVGAY=
 =tUXG
 -----END PGP SIGNATURE-----

Merge tag 'kernel-6.18-rc1.clone3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull copy_process updates from Christian Brauner:
 "This contains the changes to enable support for clone3() on nios2
  which apparently is still a thing.

  The more exciting part of this is that it cleans up the inconsistency
  in how the 64-bit flag argument is passed from copy_process() into the
  various other copy_*() helpers"

[ Fixed up rv ltl_monitor 32-bit support as per Sasha Levin in the merge ]

* tag 'kernel-6.18-rc1.clone3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  nios2: implement architecture-specific portion of sys_clone3
  arch: copy_thread: pass clone_flags as u64
  copy_process: pass clone_flags as u64 across calltree
  copy_sighand: Handle architectures where sizeof(unsigned long) < sizeof(u64)
2025-09-29 10:36:50 -07:00
Linus Torvalds
e571372101 vfs-6.18-rc1.pidfs
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaNZQVAAKCRCRxhvAZXjc
 oqs7AP9zh3RdSi/M4LooS+eqKmrQensr2UzUmvXkjCPpld7rVgD/TmSYnRCuSFFE
 B72t7OrjDRf3S7uvV0CXEwx7s+B2Uwk=
 =xQC4
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.18-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull pidfs updates from Christian Brauner:
 "This just contains a few changes to pid_nr_ns() to make it more robust
  and cleans up or improves a few users that ab- or misuse it currently"

* tag 'vfs-6.18-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  pid: change task_state() to use task_ppid_nr_ns()
  pid: change bacct_add_tsk() to use task_ppid_nr_ns()
  pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers
  pid: Add a judgment for ns null in pid_nr_ns
2025-09-29 10:02:35 -07:00
Linus Torvalds
b7ce6fa90f vfs-6.18-rc1.misc
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaNZQMQAKCRCRxhvAZXjc
 omNLAQCgrwzd9sa1JTlixweu3OAxQlSEbLuMpEv7Ztm+B7Wz0AD9HtwPC44Kev03
 GbMcB2DCFLC4evqYECj6IG7NBmoKsAs=
 =1ICf
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.18-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "This contains the usual selections of misc updates for this cycle.

  Features:

   - Add "initramfs_options" parameter to set initramfs mount options.
     This allows to add specific mount options to the rootfs to e.g.,
     limit the memory size

   - Add RWF_NOSIGNAL flag for pwritev2()

     Add RWF_NOSIGNAL flag for pwritev2. This flag prevents the SIGPIPE
     signal from being raised when writing on disconnected pipes or
     sockets. The flag is handled directly by the pipe filesystem and
     converted to the existing MSG_NOSIGNAL flag for sockets

   - Allow to pass pid namespace as procfs mount option

     Ever since the introduction of pid namespaces, procfs has had very
     implicit behaviour surrounding them (the pidns used by a procfs
     mount is auto-selected based on the mounting process's active
     pidns, and the pidns itself is basically hidden once the mount has
     been constructed)

     This implicit behaviour has historically meant that userspace was
     required to do some special dances in order to configure the pidns
     of a procfs mount as desired. Examples include:

     * In order to bypass the mnt_too_revealing() check, Kubernetes
       creates a procfs mount from an empty pidns so that user
       namespaced containers can be nested (without this, the nested
       containers would fail to mount procfs)

       But this requires forking off a helper process because you cannot
       just one-shot this using mount(2)

     * Container runtimes in general need to fork into a container
       before configuring its mounts, which can lead to security issues
       in the case of shared-pidns containers (a privileged process in
       the pidns can interact with your container runtime process)

       While SUID_DUMP_DISABLE and user namespaces make this less of an
       issue, the strict need for this due to a minor uAPI wart is kind
       of unfortunate

       Things would be much easier if there was a way for userspace to
       just specify the pidns they want. So this pull request contains
       changes to implement a new "pidns" argument which can be set
       using fsconfig(2):

           fsconfig(procfd, FSCONFIG_SET_FD, "pidns", NULL, nsfd);
           fsconfig(procfd, FSCONFIG_SET_STRING, "pidns", "/proc/self/ns/pid", 0);

       or classic mount(2) / mount(8):

           // mount -t proc -o pidns=/proc/self/ns/pid proc /tmp/proc
           mount("proc", "/tmp/proc", "proc", MS_..., "pidns=/proc/self/ns/pid");

  Cleanups:

   - Remove the last references to EXPORT_OP_ASYNC_LOCK

   - Make file_remove_privs_flags() static

   - Remove redundant __GFP_NOWARN when GFP_NOWAIT is used

   - Use try_cmpxchg() in start_dir_add()

   - Use try_cmpxchg() in sb_init_done_wq()

   - Replace offsetof() with struct_size() in ioctl_file_dedupe_range()

   - Remove vfs_ioctl() export

   - Replace rwlock() with spinlock in epoll code as rwlock causes
     priority inversion on preempt rt kernels

   - Make ns_entries in fs/proc/namespaces const

   - Use a switch() statement() in init_special_inode() just like we do
     in may_open()

   - Use struct_size() in dir_add() in the initramfs code

   - Use str_plural() in rd_load_image()

   - Replace strcpy() with strscpy() in find_link()

   - Rename generic_delete_inode() to inode_just_drop() and
     generic_drop_inode() to inode_generic_drop()

   - Remove unused arguments from fcntl_{g,s}et_rw_hint()

  Fixes:

   - Document @name parameter for name_contains_dotdot() helper

   - Fix spelling mistake

   - Always return zero from replace_fd() instead of the file descriptor
     number

   - Limit the size for copy_file_range() in compat mode to prevent a
     signed overflow

   - Fix debugfs mount options not being applied

   - Verify the inode mode when loading it from disk in minixfs

   - Verify the inode mode when loading it from disk in cramfs

   - Don't trigger automounts with RESOLVE_NO_XDEV

     If openat2() was called with RESOLVE_NO_XDEV it didn't traverse
     through automounts, but could still trigger them

   - Add FL_RECLAIM flag to show_fl_flags() macro so it appears in
     tracepoints

   - Fix unused variable warning in rd_load_image() on s390

   - Make INITRAMFS_PRESERVE_MTIME depend on BLK_DEV_INITRD

   - Use ns_capable_noaudit() when determining net sysctl permissions

   - Don't call path_put() under namespace semaphore in listmount() and
     statmount()"

* tag 'vfs-6.18-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (38 commits)
  fcntl: trim arguments
  listmount: don't call path_put() under namespace semaphore
  statmount: don't call path_put() under namespace semaphore
  pid: use ns_capable_noaudit() when determining net sysctl permissions
  fs: rename generic_delete_inode() and generic_drop_inode()
  init: INITRAMFS_PRESERVE_MTIME should depend on BLK_DEV_INITRD
  initramfs: Replace strcpy() with strscpy() in find_link()
  initrd: Use str_plural() in rd_load_image()
  initramfs: Use struct_size() helper to improve dir_add()
  initrd: Fix unused variable warning in rd_load_image() on s390
  fs: use the switch statement in init_special_inode()
  fs/proc/namespaces: make ns_entries const
  filelock: add FL_RECLAIM to show_fl_flags() macro
  eventpoll: Replace rwlock with spinlock
  selftests/proc: add tests for new pidns APIs
  procfs: add "pidns" mount option
  pidns: move is-ancestor logic to helper
  openat2: don't trigger automounts with RESOLVE_NO_XDEV
  namei: move cross-device check to __traverse_mounts
  namei: remove LOOKUP_NO_XDEV check from handle_mounts
  ...
2025-09-29 09:03:07 -07:00