linux/kernel/sched
Thomas Gleixner 1e83ccd592 sched/mmcid: Don't assume CID is CPU owned on mode switch
Shinichiro reported a KASAN UAF, which is actually an out of bounds access
in the MMCID management code.

   CPU0						CPU1
   						T1 runs in userspace
   T0: fork(T4) -> Switch to per CPU CID mode
         fixup() set MM_CID_TRANSIT on T1/CPU1
   T4 exit()
   T3 exit()
   T2 exit()
						T1 exit() switch to per task mode
						 ---> Out of bounds access.

As T1 has not scheduled after T0 set the TRANSIT bit, it exits with the
TRANSIT bit set. sched_mm_cid_remove_user() clears the TRANSIT bit in
the task and drops the CID, but it does not touch the per CPU storage.
That's functionally correct because a CID is only owned by the CPU when
the ONCPU bit is set, which is mutually exclusive with the TRANSIT flag.

Now sched_mm_cid_exit() assumes that the CID is CPU owned because the
prior mode was per CPU. It invokes mm_drop_cid_on_cpu() which clears the
not set ONCPU bit and then invokes clear_bit() with an insanely large
bit number because TRANSIT is set (bit 29).

Prevent that by actually validating that the CID is CPU owned in
mm_drop_cid_on_cpu().

Fixes: 007d84287c ("sched/mmcid: Drop per CPU CID immediately when switching to per task mode")
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Tested-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Cc: stable@vger.kernel.org
Closes: https://lore.kernel.org/aYsZrixn9b6s_2zL@shinmob
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-11 12:59:56 -08:00
..
autogroup.c cgroup: Rename cgroup lifecycle hooks to cgroup_task_*() 2025-11-03 11:46:18 -10:00
autogroup.h sched: Clean up and standardize #if/#else/#endif markers in sched/autogroup.[ch] 2025-06-13 08:47:14 +02:00
build_policy.c sched_ext: Move internal type and accessor definitions to ext_internal.h 2025-09-03 11:33:28 -10:00
build_utility.c sched/smp: Make SMP unconditional 2025-06-13 08:47:18 +02:00
clock.c sched/clock: Avoid false sharing for sched_clock_irqtime 2026-02-03 12:04:19 +01:00
completion.c sched: Make clangd usable 2025-06-11 11:20:53 +02:00
core.c sched/mmcid: Don't assume CID is CPU owned on mode switch 2026-02-11 12:59:56 -08:00
core_sched.c sched: Make clangd usable 2025-06-11 11:20:53 +02:00
cpuacct.c sched: Make clangd usable 2025-06-11 11:20:53 +02:00
cpudeadline.c sched/deadline: only set free_cpus for online runqueues 2025-10-16 11:13:49 +02:00
cpudeadline.h sched/deadline: only set free_cpus for online runqueues 2025-10-16 11:13:49 +02:00
cpufreq.c sched: Make clangd usable 2025-06-11 11:20:53 +02:00
cpufreq_schedutil.c sched/cpufreq: Use %pe format for PTR_ERR() printing 2026-02-03 12:04:19 +01:00
cpupri.c sched: Make clangd usable 2025-06-11 11:20:53 +02:00
cpupri.h sched/smp: Make SMP unconditional 2025-06-13 08:47:18 +02:00
cputime.c - A nice cleanup to the paravirt code containing a unification of the paravirt 2026-02-10 19:01:45 -08:00
deadline.c sched/debug: Fix dl_server (re)start conditions 2026-02-03 12:04:18 +01:00
debug.c sched/debug: Fix dl_server (re)start conditions 2026-02-03 12:04:18 +01:00
ext.c Scheduler changes for v7.0: 2026-02-10 12:50:10 -08:00
ext.h sched_ext: Use cgroup_lock/unlock() to synchronize against cgroup operations 2025-09-03 11:36:07 -10:00
ext_idle.c sched_ext: Wrap kfunc args in struct to prepare for aux__prog 2025-10-13 08:49:29 -10:00
ext_idle.h sched_ext: Always use SMP versions in kernel/sched/ext_idle.h 2025-06-13 14:47:59 -10:00
ext_internal.h sched_ext: Implement load balancer for bypass mode 2025-11-12 06:43:44 -10:00
fair.c Scheduler changes for v7.0: 2026-02-10 12:50:10 -08:00
features.h sched/fair: Disable scheduler feature NEXT_BUDDY 2026-01-23 11:53:19 +01:00
idle.c sched_ext: Add a DL server for sched_ext tasks 2026-02-03 12:04:17 +01:00
isolation.c kthread: Honour kthreads preferred affinity after cpuset changes 2026-02-03 15:23:35 +01:00
loadavg.c Merge branch 'tip/sched/urgent' 2025-07-14 17:16:28 +02:00
Makefile sched: Enable context analysis for core.c and fair.c 2026-01-05 16:43:36 +01:00
membarrier.c rseq: Simplify the event notification 2025-11-04 08:30:09 +01:00
pelt.c treewide: Update email address 2026-01-11 06:09:11 -10:00
pelt.h sched/fair: Switch to task based throttle model 2025-09-03 10:03:14 +02:00
psi.c sched/psi: Fix psi_seq initialization 2025-08-04 10:51:22 -07:00
rq-offsets.c sched: Make migrate_{en,dis}able() inline 2025-09-25 09:57:16 +02:00
rt.c sched/rt: Skip currently executing CPU in rto_next_cpu() 2026-02-03 12:04:19 +01:00
sched-pelt.h sched: Make clangd usable 2025-06-11 11:20:53 +02:00
sched.h sched/mmcid: Don't assume CID is CPU owned on mode switch 2026-02-11 12:59:56 -08:00
smp.h sched: Make clangd usable 2025-06-11 11:20:53 +02:00
stats.c sched/smp: Use the SMP version of schedstats 2025-06-13 08:47:21 +02:00
stats.h sched/core: Fix psi_dequeue() for Proxy Execution 2025-12-06 10:13:16 +01:00
stop_task.c sched/core: Rework sched_class::wakeup_preempt() and rq_modified_*() 2025-12-17 10:53:25 +01:00
swait.c sched: Make clangd usable 2025-06-11 11:20:53 +02:00
syscalls.c sched: Deadline has dynamic priority 2026-01-15 21:57:53 +01:00
topology.c sched_ext: Add a DL server for sched_ext tasks 2026-02-03 12:04:17 +01:00
wait.c ARM: 2025-07-30 17:14:01 -07:00
wait_bit.c sched: Make clangd usable 2025-06-11 11:20:53 +02:00