linux/include
Thomas Gleixner 76031d9536 clocksource: Make negative motion detection more robust
Guenter reported boot stalls on an emulated ARM 32-bit platform, which has a
24-bit wide clocksource.

It turns out that the calculated maximal idle time, which limits idle
sleeps to prevent clocksource wrap arounds, is close to the point where the
negative motion detection triggers.

  max_idle_ns:                    597268854 ns
  negative motion tripping point: 671088640 ns
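
The two figures above can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not kernel code: the helper names are hypothetical, and the 80 ns cycle time (a 12.5 MHz counter) is an assumption inferred from the quoted tripping point, which equals half of the 24-bit range (2^23 cycles) times 80 ns.

```c
#include <stdint.h>

/*
 * Hypothetical helper: nanoseconds covered by half of an N-bit counter
 * range, i.e. the point where the top bit of the masked delta gets set.
 * ns_per_cycle is an assumed fixed cycle time, not a kernel parameter.
 */
static uint64_t half_range_ns(unsigned int bits, uint64_t ns_per_cycle)
{
	return (UINT64_C(1) << (bits - 1)) * ns_per_cycle;
}

/*
 * Hypothetical helper: slack between the maximum idle sleep and the
 * tripping point. A wakeup delayed by more than this trips the check.
 */
static uint64_t wakeup_slack_ns(uint64_t tripping_ns, uint64_t max_idle_ns)
{
	return tripping_ns - max_idle_ns;
}
```

Under these assumptions, half_range_ns(24, 80) reproduces the 671088640 ns tripping point, and wakeup_slack_ns(671088640, 597268854) gives 73819786 ns, so a maximal idle sleep that wakes up roughly 74 ms late is already enough to trigger the detection.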

If the idle wakeup is delayed beyond that point, the clocksource
advances far enough to trigger the negative motion detection. This
prevents the clock from advancing and in the worst case the system stalls
completely if the consecutive sleeps based on the stale clock are
delayed as well.

Cure this by calculating a more robust cut-off value for negative motion,
which covers 87.5% of the actual clocksource counter width. Compare the
delta against this value to catch negative motion. This is specifically for
clock sources with a small counter width as their wrap around time is close
to the half counter width. For clock sources with wide counters this is not
a problem because the maximum idle time is far from the half counter width
due to the math overflow protection constraints.

For the case at hand this results in a tripping point of 1174405120ns.
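
The effect of the wider cut-off can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the code from this commit: the function names are hypothetical, the cut-off is computed as exactly 7/8 of the counter range, and the 80 ns cycle time is inferred from the numbers in this message.

```c
#include <stdint.h>

/* Hypothetical helper: 87.5% (7/8) of an N-bit counter range, in cycles. */
static uint64_t cutoff_cycles(unsigned int bits)
{
	return ((UINT64_C(1) << bits) / 8) * 7;
}

/*
 * Hypothetical model of the check: compute the masked delta between two
 * counter reads and treat anything above max_delta as negative motion,
 * reporting it as zero elapsed cycles.
 */
static uint64_t checked_delta(uint64_t now, uint64_t last,
			      uint64_t mask, uint64_t max_delta)
{
	uint64_t delta = (now - last) & mask;

	return delta > max_delta ? 0 : delta;
}
```

With a 24-bit mask (0xFFFFFF), cutoff_cycles(24) is 14680064 cycles, which at the assumed 80 ns per cycle is the 1174405120 ns quoted above. A wakeup after 700 ms (8750000 cycles) is dropped when max_delta is the half range of 8388608 cycles, but is accepted with the 87.5% cut-off. A read that really went backwards, for example now = 0 and last = 100, still yields a masked delta near the top of the range and is rejected.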

Note that this cannot prevent issues when the delay exceeds the 87.5%
margin, but that is no different from the previous unchecked version which
allowed arbitrary time jumps.

Systems with small counter width are prone to invalid results, but this
problem is unlikely to be seen on real hardware. If such a system
completely stalls for more than half a second, then there are other more
urgent problems than the counter wrapping around.

Fixes: c163e40af9 ("timekeeping: Always check for negative motion")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/all/8734j5ul4x.ffs@tglx
Closes: https://lore.kernel.org/all/387b120b-d68a-45e8-b6ab-768cd95d11c2@roeck-us.net
2024-12-05 16:03:24 +01:00
acpi common: switch back from remove_new() to remove() callback 2024-11-25 17:31:39 -08:00
asm-generic - Fix a case where posix timers with a thread-group-wide target would miss 2024-12-01 12:41:21 -08:00
clocksource
crypto This update includes the following changes: 2024-11-19 10:28:41 -08:00
cxl
drm drm for 6.13-rc1 2024-11-21 14:56:17 -08:00
dt-bindings Char/Misc/IIO/Whatever driver subsystem updates for 6.13-rc1 2024-11-29 11:58:27 -08:00
keys
kunit The core framework gained a clk provider helper, a clk consumer helper, and 2024-11-22 17:02:25 -08:00
kvm KVM: arm64: vgic: Kill VGIC_MAX_PRIVATE definition 2024-11-20 17:21:08 -08:00
linux clocksource: Make negative motion detection more robust 2024-12-05 16:03:24 +01:00
math-emu
media media: replace obsolete hans.verkuil@cisco.com alias 2024-11-08 13:38:09 +01:00
memory
misc
net Kbuild updates for v6.13 2024-11-30 13:41:50 -08:00
pcmcia
ras
rdma RDMA/core: Move ib_uverbs_file struct to uverbs_types.h 2024-11-04 06:57:21 -05:00
rv
scsi Random number generator updates for Linux 6.13-rc1. 2024-11-19 10:43:44 -08:00
soc The core framework gained a clk provider helper, a clk consumer helper, and 2024-11-22 17:02:25 -08:00
sound ALSA: hda/tas2781: Add speaker id check for ASUS projects 2024-11-26 08:54:08 +01:00
target
trace NFS client updates for Linux 6.13 2024-11-30 10:17:53 -08:00
uapi io_uring-6.13-20242901 2024-11-30 15:43:02 -08:00
ufs scsi: ufs: ufs-mediatek: Configure individual LU queue flags 2024-11-06 20:42:17 -05:00
vdso vdso: Rename struct arch_vdso_data to arch_vdso_time_data 2024-11-02 12:37:36 +01:00
video - Improved handling of LCD power states and interactions with the fbdev subsystem. 2024-11-22 16:29:57 -08:00
xen