linux/kernel
Linus Torvalds 21fbefc588 tracing updates for v6.18:
- Use READ_ONCE() and WRITE_ONCE() instead of RCU for syscall tracepoints
 
   Individual system call trace events are pseudo events attached to the
   raw_syscall trace events that just trace the entry and exit of all system
   calls. When any of these individual system call trace events get enabled,
   an element in an array indexed by the system call number is assigned to
   the trace file that defines how to trace it. When the trace event
   triggers, it reads this array and if the array has an element, it uses that
   trace file to know what to write it (the trace file defines the output
   format of the corresponding system call).
 
   The issue is that it uses rcu_dereference_ptr() and marks the elements of
   the array as using RCU. This is incorrect. There is no RCU synchronization
   here. The event file that is pointed to has a completely different way to
   make sure its freed properly. The reading of the array during the system
   call trace event is only to know if there is a value or not. If not, it
   does nothing (it means this system call isn't being traced). If it does,
   it uses the information to store the system call data.
 
   The RCU usage here can simply be replaced by READ_ONCE() and WRITE_ONCE()
   macros.
 
 - Have the system call trace events use "0x" for hex values
 
   Some system call trace events display hex values but do not have "0x" in
   front of it. Seeing "count: 44" can be assumed that it is 44 decimal when
   in actuality it is 44 hex (68 decimal). Display "0x44" instead.
 
 - Use vmalloc_array() in tracing_map_sort_entries()
 
   The function tracing_map_sort_entries() used array_size() and vmalloc()
   when it could have simply used vmalloc_array().
 
 - Use for_each_online_cpu() in trace_osnoise.c()
 
   Instead of open coding for_each_cpu(cpu, cpu_online_mask), use
   for_each_online_cpu().
 
 - Move the buffer field in struct trace_seq to the end
 
   The buffer field in struct trace_seq is architecture dependent in size,
   and caused padding for the fields after it. By moving the buffer to the
   end of the structure, it compacts the trace_seq structure better.
 
 - Remove redundant zeroing of cmdline_idx field in saved_cmdlines_buffer()
 
   The structure that contains cmdline_idx is zeroed by memset(), no need to
   explicitly zero any of its fields after that.
 
 - Use  system_percpu_wq instead of system_wq in user_event_mm_remove()
 
   As system_wq is being deprecated, use the new wq.
 
 - Add cond_resched() is ftrace_module_enable()
 
   Some modules have a lot of functions (thousands of them), and the enabling
   of those functions can take some time. On non preemtable kernels, it was
   triggering a watchdog timeout. Add a cond_resched() to prevent that.
 
 - Add a BUILD_BUG_ON() to make sure PID_MAX_DEFAULT is always a power of 2
 
   There's code that depends on PID_MAX_DEFAULT being a power of 2 or it will
   break. If in the future that changes, make sure the build fails to ensure
   that the code is fixed that depends on this.
 
 - Grab mutex_lock() before ever exiting s_start()
 
   The s_start() function is a seq_file start routine. As s_stop() is always
   called even if s_start() fails, and s_stop() expects the event_mutex to be
   held as it will always release it. That mutex must always be taken in
   s_start() even if that function fails.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCaN/4phQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qpToAP4sENpZkZHFOl2PuikmAgjhB6PqiUtL
 LNuMDx45WygcLwD6Awy8DUlEBN6RzTPA761MSjs0+NMg16QLrhPLxWqFEgw=
 =nxDn
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:

 - Use READ_ONCE() and WRITE_ONCE() instead of RCU for syscall
   tracepoints

   Individual system call trace events are pseudo events attached to the
   raw_syscall trace events that just trace the entry and exit of all
   system calls. When any of these individual system call trace events
   get enabled, an element in an array indexed by the system call number
   is assigned to the trace file that defines how to trace it. When the
   trace event triggers, it reads this array and if the array has an
   element, it uses that trace file to know what to write it (the trace
   file defines the output format of the corresponding system call).

   The issue is that it uses rcu_dereference_ptr() and marks the
   elements of the array as using RCU. This is incorrect. There is no
   RCU synchronization here. The event file that is pointed to has a
   completely different way to make sure its freed properly. The reading
   of the array during the system call trace event is only to know if
   there is a value or not. If not, it does nothing (it means this
   system call isn't being traced). If it does, it uses the information
   to store the system call data.

   The RCU usage here can simply be replaced by READ_ONCE() and
   WRITE_ONCE() macros.

 - Have the system call trace events use "0x" for hex values

   Some system call trace events display hex values but do not have "0x"
   in front of it. Seeing "count: 44" can be assumed that it is 44
   decimal when in actuality it is 44 hex (68 decimal). Display "0x44"
   instead.

 - Use vmalloc_array() in tracing_map_sort_entries()

   The function tracing_map_sort_entries() used array_size() and
   vmalloc() when it could have simply used vmalloc_array().

 - Use for_each_online_cpu() in trace_osnoise.c()

   Instead of open coding for_each_cpu(cpu, cpu_online_mask), use
   for_each_online_cpu().

 - Move the buffer field in struct trace_seq to the end

   The buffer field in struct trace_seq is architecture dependent in
   size, and caused padding for the fields after it. By moving the
   buffer to the end of the structure, it compacts the trace_seq
   structure better.

 - Remove redundant zeroing of cmdline_idx field in
   saved_cmdlines_buffer()

   The structure that contains cmdline_idx is zeroed by memset(), no
   need to explicitly zero any of its fields after that.

 - Use system_percpu_wq instead of system_wq in user_event_mm_remove()

   As system_wq is being deprecated, use the new wq.

 - Add cond_resched() is ftrace_module_enable()

   Some modules have a lot of functions (thousands of them), and the
   enabling of those functions can take some time. On non preemtable
   kernels, it was triggering a watchdog timeout. Add a cond_resched()
   to prevent that.

 - Add a BUILD_BUG_ON() to make sure PID_MAX_DEFAULT is always a power
   of 2

   There's code that depends on PID_MAX_DEFAULT being a power of 2 or it
   will break. If in the future that changes, make sure the build fails
   to ensure that the code is fixed that depends on this.

 - Grab mutex_lock() before ever exiting s_start()

   The s_start() function is a seq_file start routine. As s_stop() is
   always called even if s_start() fails, and s_stop() expects the
   event_mutex to be held as it will always release it. That mutex must
   always be taken in s_start() even if that function fails.

* tag 'trace-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Fix lock imbalance in s_start() memory allocation failure path
  tracing: Ensure optimized hashing works
  ftrace: Fix softlockup in ftrace_module_enable
  tracing: replace use of system_wq with system_percpu_wq
  tracing: Remove redundant 0 value initialization
  tracing: Move buffer in trace_seq to end of struct
  tracing/osnoise: Use for_each_online_cpu() instead of for_each_cpu()
  tracing: Use vmalloc_array() to improve code
  tracing: Have syscall trace events show "0x" for values greater than 10
  tracing: Replace syscall RCU pointer assignment with READ/WRITE_ONCE()
2025-10-05 09:43:36 -07:00
..
bpf bpf-fixes 2025-10-03 19:38:19 -07:00
cgroup RCU pull request for v6.18 2025-10-04 11:28:45 -07:00
configs kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI 2025-09-24 14:29:14 -07:00
debug kdb: remove redundant check for scancode 0xe0 2025-09-20 21:19:09 +01:00
dma dma-mapping updates for Linux 6.18: 2025-10-03 17:41:12 -07:00
entry entry: Add arch_irqentry_exit_need_resched() for arm64 2025-09-11 15:55:34 +01:00
events Summary of significant series in this pull request: 2025-10-02 18:18:33 -07:00
futex A set of updates for futexes and related selftests: 2025-09-30 16:07:10 -07:00
gcov kbuild: require gcc-8 and binutils-2.30 2025-04-30 21:53:35 +02:00
irq A set of updates for the interrupt core subsystem: 2025-09-30 15:55:25 -07:00
kcsan Kernel Concurrency Sanitizer (KCSAN) updates for v6.18 2025-10-02 08:31:44 -07:00
livepatch sched,livepatch: Untangle cond_resched() and live-patching 2025-05-14 13:16:24 +02:00
locking locking/local_lock: Introduce local_lock_is_locked(). 2025-09-29 09:42:35 +02:00
module kcfi: Rename CONFIG_CFI_CLANG to CONFIG_CFI 2025-09-24 14:29:14 -07:00
power Merge branches 'pm-core', 'pm-runtime' and 'pm-sleep' 2025-09-29 12:54:01 +02:00
printk printk changes for 6.18 2025-10-04 11:13:11 -07:00
rcu RCU pull request for v6.18 2025-10-04 11:28:45 -07:00
sched Summary of significant series in this pull request: 2025-10-02 18:18:33 -07:00
time Networking changes for 6.18. 2025-10-02 15:17:01 -07:00
trace tracing updates for v6.18: 2025-10-05 09:43:36 -07:00
unwind unwind: Finish up unwind when a task exits 2025-07-31 10:20:11 -04:00
.gitignore kheaders: rebuild kheaders_data.tar.xz when a file is modified within a minute 2025-06-24 20:30:37 +09:00
acct.c kernel/acct.c: saner struct file treatment 2025-09-27 20:13:56 -04:00
async.c
audit.c audit: fix skb leak when audit rate limit is exceeded 2025-09-10 19:55:00 -04:00
audit.h audit: create audit_stamp structure 2025-08-30 10:15:28 -04:00
audit_fsnotify.c VFS/audit: introduce kern_path_parent() for audit 2025-09-23 12:37:35 +02:00
audit_tree.c mount-related stuff for this cycle 2025-10-03 10:19:44 -07:00
audit_watch.c VFS/audit: introduce kern_path_parent() for audit 2025-09-23 12:37:35 +02:00
auditfilter.c audit/stable-6.18 PR 20250926 2025-09-30 08:22:16 -07:00
auditsc.c audit: add record for multiple object contexts 2025-08-30 10:15:30 -04:00
backtracetest.c
bounds.c
capability.c capability: Remove unused has_capability 2025-03-07 22:03:09 -06:00
cfi.c cfi: Move BPF CFI types and helpers to generic code 2025-07-31 18:23:53 -07:00
compat.c
configs.c
context_tracking.c context_tracking: Make RCU watch ct_kernel_exit_state() warning 2025-03-04 18:44:29 -08:00
cpu.c cpu: Remove obsolete comment from takedown_cpu() 2025-08-06 22:48:12 +02:00
cpu_pm.c
crash_core.c crash: add KUnit tests for crash_exclude_mem_range 2025-09-13 17:32:55 -07:00
crash_core_test.c crash: add KUnit tests for crash_exclude_mem_range 2025-09-13 17:32:55 -07:00
crash_dump_dm_crypt.c crash_dump: retrieve dm crypt keys in kdump kernel 2025-05-21 10:48:21 -07:00
crash_reserve.c kdump: implement reserve_crashkernel_cma 2025-07-19 19:08:23 -07:00
cred.c copy_process: pass clone_flags as u64 across calltree 2025-09-01 15:31:34 +02:00
delayacct.c delayacct: remove redundant code and adjust indentation 2025-05-27 19:40:33 -07:00
dma.c
elfcorehdr.c
exec_domain.c
exit.c task_stack.h: clean-up stack_not_used() implementation 2025-09-21 14:22:00 -07:00
exit.h
extable.c
fail_function.c
fork.c Patch series in this pull request: 2025-10-02 18:44:54 -07:00
freezer.c mm/oom_kill: thaw the entire OOM victim process 2025-09-21 14:22:35 -07:00
gen_kheaders.sh kheaders: make it possible to override TAR 2025-08-06 10:23:36 +09:00
groups.c
hung_task.c hung_task: dump blocker task if it is not hung 2025-09-13 17:32:43 -07:00
iomem.c mm/memremap: Pass down MEMREMAP_* flags to arch_memremap_wb() 2025-02-21 15:05:38 +01:00
irq_work.c
jump_label.c jump_label: Use RCU in all users of __module_text_address(). 2025-03-10 11:54:46 +01:00
kallsyms.c bpf: Clean up individual BTF_ID code 2025-07-16 18:34:42 -07:00
kallsyms_internal.h
kallsyms_selftest.c kallsyms: use kmalloc_array() instead of kmalloc() 2025-09-28 11:36:14 -07:00
kallsyms_selftest.h
kcmp.c kcmp: improve performance adding an unlikely hint to task comparisons 2025-02-21 10:25:33 +01:00
Kconfig.freezer
Kconfig.hz kernel: Fix "select" wording on HZ_250 description 2025-02-21 09:20:30 +01:00
Kconfig.kexec crash: add KUnit tests for crash_exclude_mem_range 2025-09-13 17:32:55 -07:00
Kconfig.locks
Kconfig.preempt softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT 2025-09-17 16:25:41 +02:00
kcov.c kcov: use write memory barrier after memcpy() in kcov_move_area() 2025-09-13 17:32:44 -07:00
kexec.c kexec: enable CMA based contiguous allocation 2025-08-02 12:01:38 -07:00
kexec_core.c kexec_core: remove redundant 0 value initialization 2025-09-13 17:32:49 -07:00
kexec_elf.c kexec: initialize ELF lowest address to ULONG_MAX 2025-03-16 22:30:47 -07:00
kexec_file.c x86/kexec: carry forward the boot DTB on kexec 2025-09-13 17:32:43 -07:00
kexec_handover.c Patch series in this pull request: 2025-10-02 18:44:54 -07:00
kexec_internal.h kexec: enable CMA based contiguous allocation 2025-08-02 12:01:38 -07:00
kheaders.c
kprobes.c kprobes: Add missing kerneldoc for __get_insn_slot 2025-07-15 18:45:34 +09:00
kstack_erase.c stackleak: Rename stackleak_track_stack to __sanitizer_cov_stack_depth 2025-07-21 21:40:39 -07:00
ksyms_common.c
ksysfs.c
kthread.c ipvs: Fix estimator kthreads preferred affinity 2025-08-13 08:34:33 +02:00
latencytop.c
Makefile Patch series in this pull request: 2025-10-02 18:44:54 -07:00
module_signature.c
notifier.c
nscommon.c ns: drop assert 2025-09-25 09:23:54 +02:00
nsproxy.c namespace-6.18-rc1 2025-09-29 11:20:29 -07:00
nstree.c ns: move ns type into struct ns_common 2025-09-25 09:23:54 +02:00
padata.c padata: WQ_PERCPU added to alloc_workqueue users 2025-09-13 12:11:06 +08:00
panic.c panic: remove CONFIG_PANIC_ON_OOPS_VALUE 2025-09-28 11:36:13 -07:00
params.c params: Replace deprecated strcpy() with strscpy() and memcpy() 2025-08-16 21:47:25 +02:00
pid.c namespace-6.18-rc1 2025-09-29 11:20:29 -07:00
pid_namespace.c namespace-6.18-rc1 2025-09-29 11:20:29 -07:00
pid_sysctl.h
profile.c
ptrace.c ptrace: introduce PTRACE_SET_SYSCALL_INFO request 2025-05-11 17:48:15 -07:00
range.c
reboot.c - The 7 patch series "powerpc/crash: use generic crashkernel 2025-04-01 10:06:52 -07:00
regset.c
relay.c relayfs: support a counter tracking if data is too big to write 2025-07-09 22:57:52 -07:00
resource.c resource: improve child resource handling in release_mem_region_adjustable() 2025-09-21 14:22:34 -07:00
resource_kunit.c
rseq.c rseq: Protect event mask against membarrier IPI 2025-09-13 19:51:59 +02:00
scftorture.c
scs.c
seccomp.c Performance events updates for v6.18: 2025-09-30 11:11:21 -07:00
signal.c signal: Fix memory leak for PIDFD_SELF* sentinels 2025-08-19 13:51:28 +02:00
smp.c smp: Fix up and expand the smp_call_function_many() kerneldoc 2025-09-18 22:21:28 +02:00
smpboot.c sched/smp: Use the SMP version of idle_thread_set_boot_cpu() 2025-06-13 08:47:20 +02:00
smpboot.h
softirq.c softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT 2025-09-17 16:25:41 +02:00
stacktrace.c
static_call.c
static_call_inline.c Modules changes for 6.15-rc1 2025-03-30 15:44:36 -07:00
stop_machine.c sched/core: Fix migrate_swap() vs. hotplug 2025-07-01 15:02:03 +02:00
sys.c Patch series in this pull request: 2025-10-02 18:44:54 -07:00
sys_ni.c uprobes/x86: Add uprobe syscall to speed up uprobe 2025-08-21 20:09:20 +02:00
sysctl-test.c sysctl: move u8 register test to lib/test_sysctl.c 2025-04-14 14:13:41 +02:00
sysctl.c sysctl: rename kern_table -> sysctl_subsys_table 2025-07-23 11:56:02 +02:00
task_work.c
taskstats.c
torture.c torture: Delay CPU-hotplug operations until boot completes 2025-08-14 15:26:30 -07:00
tracepoint.c tracepoint: Print the function symbol when tracepoint_debug is set 2025-03-21 15:30:10 -04:00
tsacct.c pid: change bacct_add_tsk() to use task_ppid_nr_ns() 2025-08-19 13:38:20 +02:00
ucount.c ucount: use atomic_long_try_cmpxchg() in atomic_long_inc_below() 2025-08-02 12:01:38 -07:00
uid16.c
uid16.h
umh.c
up.c
user-return-notifier.c
user.c ns: move ns type into struct ns_common 2025-09-25 09:23:54 +02:00
user_namespace.c ns: move ns type into struct ns_common 2025-09-25 09:23:54 +02:00
utsname.c namespace-6.18-rc1 2025-09-29 11:20:29 -07:00
utsname_sysctl.c
vhost_task.c vhost: Take a reference on the task in struct vhost_task. 2025-09-21 17:44:20 -04:00
vmcore_info.c crash: export PAGE_UNACCEPTED_MAPCOUNT_VALUE to vmcoreinfo 2025-05-11 17:54:04 -07:00
watch_queue.c vfs-6.15-rc1.pipe 2025-03-24 09:52:37 -07:00
watchdog.c watchdog: skip checks when panic is in progress 2025-09-13 17:32:53 -07:00
watchdog_buddy.c watchdog: fix opencoded cpumask_next_wrap() in watchdog_next_cpu() 2025-07-31 11:28:03 -04:00
watchdog_perf.c watchdog: skip checks when panic is in progress 2025-09-13 17:32:53 -07:00
workqueue.c workqueue: WQ_PERCPU added to alloc_workqueue users 2025-09-16 10:33:53 -10:00
workqueue_internal.h