linux/arch
David Laight 630f96a687 lib: mul_u64_u64_div_u64(): optimise multiply on 32bit x86
gcc generates horrid code for both ((u64)u32_a * u32_b) and (u64_a +
u32_b).  As well as the extra instructions it can generate a lot of spills
to stack (including spills of constant zeros and even multiplies by
constant zero).

mul_u32_u32() already exists to optimise the multiply.  Add a similar
add_u64_u32() for the addition.  Disable both for clang - it generates
better code without them.

Move the 64x64 => 128 multiply into a static inline helper function for
code clarity.  No need for the a/b_hi/lo variables, the implicit casts on
the function calls do the work for us.  Should have minimal effect on the
generated code.

Use mul_u32_u32() and add_u64_u32() in the 64x64 => 128 multiply in
mul_u64_add_u64_div_u64().
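
As a rough illustration of the technique described above (not the code from
this patch), a 64x64 => 128 multiply can be assembled from four 32x32 => 64
partial products. The definitions of mul_u32_u32() and add_u64_u32() below
are portable stand-ins written for this sketch, not the x86 versions the
commit refers to, and the name mul_64x64_128_sketch() is hypothetical.

```c
#include <stdint.h>

typedef uint32_t u32;
typedef uint64_t u64;

/* Portable stand-in (assumption): 32x32 => 64 multiply. */
static inline u64 mul_u32_u32(u32 a, u32 b)
{
	return (u64)a * b;
}

/* Portable stand-in (assumption): add a 32-bit value to a 64-bit value. */
static inline u64 add_u64_u32(u64 a, u32 b)
{
	return a + b;
}

/*
 * Hypothetical helper: returns the high 64 bits of a * b and stores the
 * low 64 bits in *p_lo. Passing a u64 where a u32 parameter is expected
 * implicitly truncates to the low 32 bits, so no separate a_lo/b_lo
 * temporaries are needed.
 */
static inline u64 mul_64x64_128_sketch(u64 a, u64 b, u64 *p_lo)
{
	u64 x, y, z, hi;

	x = mul_u32_u32(a, b);            /* a_lo * b_lo */
	y = mul_u32_u32(a, b >> 32);      /* a_lo * b_hi */
	y = add_u64_u32(y, x >> 32);      /* fits: at most 2^64 - 2^32 */
	z = mul_u32_u32(a >> 32, b);      /* a_hi * b_lo */
	y = add_u64_u32(y, z);            /* adds only the low 32 bits of z */

	hi = mul_u32_u32(a >> 32, b >> 32);   /* a_hi * b_hi */
	hi = add_u64_u32(hi, z >> 32);
	hi = add_u64_u32(hi, y >> 32);

	*p_lo = (y << 32) + (u32)x;
	return hi;
}
```

None of the intermediate additions can overflow 64 bits, which is why the
sketch needs no explicit carry handling.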

Link: https://lkml.kernel.org/r/20251105201035.64043-8-david.laight.linux@gmail.com
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li RongQing <lirongqing@baidu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20 14:03:42 -08:00
..
alpha assorted dead code removal around asm/pgtable.h 2025-10-03 11:37:50 -07:00
arc Ext4 bug fixes for 6.18-rc2, including 2025-10-15 07:51:57 -07:00
arm hung_task: panic when there are more than N hung tasks at the same time 2025-11-12 10:00:14 -08:00
arm64 bpf-fixes 2025-10-31 18:22:26 -07:00
csky csky: abiv2: adapt to new folio flags field 2025-10-21 15:46:18 -07:00
hexagon Remove long-stale ext3 defconfig option 2025-10-15 07:57:28 -07:00
loongarch rust: kbuild: support -Cjump-tables=n for Rust 1.93.0 2025-11-04 19:11:39 +01:00
m68k Ext4 bug fixes for 6.18-rc2, including 2025-10-15 07:51:57 -07:00
microblaze Ext4 bug fixes for 6.18-rc2, including 2025-10-15 07:51:57 -07:00
mips pci-v6.18-fixes-3 2025-10-24 16:43:08 -07:00
nios2 Summary of significant series in this pull request: 2025-10-02 18:18:33 -07:00
openrisc Ext4 bug fixes for 6.18-rc2, including 2025-10-15 07:51:57 -07:00
parisc parisc: Avoid crash due to unaligned access in unwinder 2025-11-04 12:21:59 +01:00
powerpc crash: let architecture decide crash memory export to iomem_resource 2025-11-12 10:00:15 -08:00
riscv riscv: KGDB: Replace deprecated strcpy in kgdb_arch_handle_qxfer_pkt 2025-10-27 23:30:01 -06:00
s390 s390 fixes for 6.18-rc4 2025-10-31 12:50:35 -07:00
sh Remove long-stale ext3 defconfig option 2025-10-15 07:57:28 -07:00
sparc Remove long-stale ext3 defconfig option 2025-10-15 07:57:28 -07:00
um updates for UML, notably 2025-10-06 12:10:55 -07:00
x86 lib: mul_u64_u64_div_u64(): optimise multiply on 32bit x86 2025-11-20 14:03:42 -08:00
xtensa Ext4 bug fixes for 6.18-rc2, including 2025-10-15 07:51:57 -07:00
.gitignore
Kconfig treewide: drop outdated compiler version remarks in Kconfig help texts 2025-11-12 10:00:14 -08:00