Commit graph

321 commits

Author SHA1 Message Date
Eric Biggers
20d6f07004 lib/crypto: tests: Add a .kunitconfig file
Add a .kunitconfig file to the lib/crypto/ directory so that the crypto
library tests can be run more easily using kunit.py.  Example with UML:

    tools/testing/kunit/kunit.py run --kunitconfig=lib/crypto

Example with QEMU:

    tools/testing/kunit/kunit.py run --kunitconfig=lib/crypto --arch=arm64 --make_options LLVM=1
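
For reference, a .kunitconfig is just a list of config symbols.  A
minimal sketch of what such a file can contain (only
CRYPTO_LIB_CURVE25519_KUNIT_TEST is named elsewhere in this log; the
other option names here are assumed):

    CONFIG_KUNIT=y
    CONFIG_CRYPTO_LIB_CURVE25519=y
    CONFIG_CRYPTO_LIB_CURVE25519_KUNIT_TEST=y
    CONFIG_CRYPTO_LIB_SHA256=y
    CONFIG_CRYPTO_LIB_SHA256_KUNIT_TEST=y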

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260301040140.490310-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-03-02 15:35:05 -08:00
Eric Biggers
4478e8eeb8 lib/crypto: tests: Depend on library options rather than selecting them
The convention for KUnit tests is to have the test kconfig options
visible only when the code they depend on is already enabled.  This way
only the tests that are relevant to the particular kernel build can be
enabled, either manually or via KUNIT_ALL_TESTS.

Update lib/crypto/tests/Kconfig to follow that convention, i.e. depend
on the corresponding library options rather than selecting them.  This
fixes an issue where enabling KUNIT_ALL_TESTS enabled non-test code.

This does mean that it becomes a bit more difficult to enable *all* the
crypto library tests (which is what I do as a maintainer of the code),
since doing so will now require enabling other options that select the
libraries.  Regardless, we should follow the standard KUnit convention.
I'll also add a .kunitconfig file that does enable all these options.

Note: currently most of the crypto library options are selected by
visible options in crypto/Kconfig, which can be used to enable them
without too much trouble.  If in the future we end up with more cases
like CRYPTO_LIB_CURVE25519 which is selected only by WIREGUARD (thus
making CRYPTO_LIB_CURVE25519_KUNIT_TEST effectively depend on WIREGUARD
after this commit), we could consider adding a new kconfig option that
enables all the library code specifically for testing.
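
For illustration, the standard KUnit pattern that this commit moves to
looks roughly like this (a sketch, not the literal Kconfig hunk):

    config CRYPTO_LIB_CURVE25519_KUNIT_TEST
            tristate "KUnit tests for Curve25519" if !KUNIT_ALL_TESTS
            depends on KUNIT
            depends on CRYPTO_LIB_CURVE25519  # previously: select CRYPTO_LIB_CURVE25519
            default KUNIT_ALL_TESTS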

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Closes: https://lore.kernel.org/r/CAMuHMdULzMdxuTVfg8_4jdgzbzjfx-PHkcgbGSthcUx_sHRNMg@mail.gmail.com
Fixes: 4dcf6cadda ("lib/crypto: tests: Add KUnit tests for SHA-224 and SHA-256")
Fixes: 571eaeddb6 ("lib/crypto: tests: Add KUnit tests for SHA-384 and SHA-512")
Fixes: 6dd4d9f791 ("lib/crypto: tests: Add KUnit tests for Poly1305")
Fixes: 66b1306079 ("lib/crypto: tests: Add KUnit tests for SHA-1 and HMAC-SHA1")
Fixes: d6b6aac0cd ("lib/crypto: tests: Add KUnit tests for MD5 and HMAC-MD5")
Fixes: afc4e4a5f1 ("lib/crypto: tests: Migrate Curve25519 self-test to KUnit")
Fixes: 6401fd334d ("lib/crypto: tests: Add KUnit tests for BLAKE2b")
Fixes: 15c64c47e4 ("lib/crypto: tests: Add SHA3 kunit tests")
Fixes: b3aed551b3 ("lib/crypto: tests: Add KUnit tests for POLYVAL")
Fixes: ed894faccb ("lib/crypto: tests: Add KUnit tests for ML-DSA verification")
Fixes: 7246fe6cd6 ("lib/crypto: tests: Add KUnit tests for NH")
Cc: stable@vger.kernel.org
Reviewed-by: David Gow <david@davidgow.net>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260226191749.39397-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-02-28 19:17:30 -08:00
Linus Torvalds
75e1f66a9e Crypto library fix for v7.0-rc1
Fix a big endian specific issue in the PPC64-optimized AES code.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaZtiphQcZWJpZ2dlcnNA
 a2VybmVsLm9yZwAKCRDzXCl4vpKOK/z2AQD1j15Ao3iDW3yBSyTS+tFJaRUVDtjg
 bNostoNJAjMM9wD/X0oAnt95WTNkdHexs+2aMzQ4ULKFRcwQrUTarKx7IgY=
 =Dmfx
 -----END PGP SIGNATURE-----

Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull crypto library fix from Eric Biggers:
 "Fix a big endian specific issue in the PPC64-optimized AES code"

* tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crypto: powerpc/aes: Fix rndkey_from_vsx() on big endian CPUs
2026-02-22 13:09:33 -08:00
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.
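
In C terms, the scripted change is simply (struct name hypothetical):

    struct foo *p;

    p = kmalloc_objs(struct foo, count, GFP_KERNEL);  /* before */
    p = kmalloc_objs(struct foo, count);              /* after  */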

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Eric Biggers
beeebffc80 lib/crypto: powerpc/aes: Fix rndkey_from_vsx() on big endian CPUs
I finally got a big endian PPC64 kernel to boot in QEMU.  The PPC64 VSX
optimized AES library code does work in that case, with the exception of
rndkey_from_vsx() which doesn't take into account that the order in
which the VSX code stores the round key words depends on the endianness.
So fix rndkey_from_vsx() to do the right thing on big endian CPUs.

Fixes: 7cf2082e74 ("lib/crypto: powerpc/aes: Migrate POWER8 optimized code into library")
Link: https://lore.kernel.org/r/20260216022104.332991-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-02-18 13:38:14 -08:00
Linus Torvalds
37a93dd5c4 Networking changes for 7.0
Core & protocols
 ----------------
 
  - A significant effort all around the stack to guide the compiler to
    make the right choice when inlining code, to avoid unneeded calls for
    small helper and stack canary overhead in the fast-path. This
    generates better and faster code with very small or no text size
    increases, as in many cases the call generated more code than the
    actual inlined helper.
 
  - Extend the AccECN implementation so that it is now functionally
    complete, and allow user space to enable it on a per-network-namespace
    basis.
 
  - Add support for memory providers with large (above 4K) rx buffers.
    Paired with hw-gro, larger rx buffer sizes reduce the number of
    buffers traversing the stack, decreasing single-stream CPU usage by
    up to ~30%.
 
  - Do not add HBH header to Big TCP GSO packets. This simplifies the RX
    path, the TX path and the NIC drivers, and is possible because
    user-space taps can now correctly interpret such packets without the
    HBH hint.
 
  - Allow IPv6 routes to be configured with a gateway address that is
    resolved out of a different interface than the one specified, aligning
    IPv6 to IPv4 behavior.
 
  - Multi-queue aware sch_cake. This makes it possible to scale the rate
    shaper of sch_cake across multiple CPUs, while still enforcing a
    single global rate on the interface.
 
  - Add support for the nbcon (new buffer console) infrastructure to
    netconsole, enabling lock-free, priority-based console operations that
    are safer in crash scenarios.
 
  - Improve the TCP IPv6 output path to cache the flow information,
    saving CPU cycles and reducing cache line misses and stack use.
 
  - Improve netfilter packet tracker to resolve clashes for most protocols,
    avoiding unneeded drops on rare occasions.
 
  - Add IP6IP6 tunneling acceleration to the flowtable infrastructure.
 
  - Reduce tcp socket size by one cache line.
 
  - Notify neighbour changes atomically, avoiding inconsistencies between
    the notification sequence and the actual sequence of states.
 
  - Add vsock namespace support, allowing complete isolation of vsocks
    across different network namespaces.
 
  - Improve generic xsk performance with cache-alignment-oriented
    optimizations.
 
  - Support netconsole automatic target recovery, allowing netconsole
    to reestablish targets when the underlying low-level interface comes
    back online.
 
 Driver API
 ----------
 
  - Support for switching the working mode (automatic vs manual) of a DPLL
    device via netlink.
 
  - Introduce PHY ports representation to expose multiple front-facing
    media ports over a single MAC.
 
  - Introduce "rx-polarity" and "tx-polarity" device tree properties, to
    generalize polarity inversion requirements for differential signaling.
 
  - Add helper to create, prepare and enable managed clocks.
 
 Device drivers
 --------------
 
  - Add Huawei hinic3 PF ethernet driver.
 
  - Add DWMAC glue driver for Motorcomm YT6801 PCIe ethernet controller.
 
  - Add ethernet driver for MaxLinear MxL862xx switches.
 
  - Remove parallel-port Ethernet driver.
 
  - Convert existing driver timestamp configuration reporting to
    hwtstamp_get and remove legacy ioctl().
 
  - Convert existing drivers to .get_rx_ring_count(), simplifying the RX
    ring count retrieval. Also remove the legacy fallback path.
 
  - Ethernet high-speed NICs:
    - Broadcom (bnxt, bng):
      - bnxt: add FW interface update to support FEC stats histogram and
        NVRAM defragmentation
      - bng: add TSO and H/W GRO support
    - nVidia/Mellanox (mlx5):
      - improve latency of channel restart operations, reducing the used
        H/W resources
      - add TSO support for UDP over GRE over VLAN
      - add flow counters support for hardware steering (HWS) rules
      - use a static memory area to store headers for H/W GRO, leading to
        12% RX tput improvement
    - Intel (100G, ice, idpf):
      - ice: reorganizes layout of Tx and Rx rings for cacheline
        locality and utilizes __cacheline_group* macros on the new layouts
      - ice: introduces Synchronous Ethernet (SyncE) support
    - Meta (fbnic):
      - adds debugfs for firmware mailbox and tx/rx rings vectors
 
  - Ethernet virtual:
    - geneve: introduce GRO/GSO support for double UDP encapsulation
 
  - Ethernet NICs, consumer and embedded:
    - Synopsys (stmmac):
      - some code refactoring and cleanups
    - RealTek (r8169):
      - add support for RTL8127ATF (10G Fiber SFP)
      - add dash and LTR support
    - Airoha:
      - AN8811HB 2.5 Gbps phy support
    - Freescale (fec):
      - add XDP zero-copy support
    - Thunderbolt:
      - add get link setting support to allow bonding
    - Renesas:
      - add support for RZ/G3L GBETH SoC
 
  - Ethernet switches:
    - Maxlinear:
      - support R(G)MII slow rate configuration
      - add support for Intel GSW150
    - Motorcomm (yt921x):
      - add DCB/QoS support
    - TI:
      - icssm-prueth: support bridging (STP/RSTP) via the switchdev
        framework
 
  - Ethernet PHYs:
    - Realtek:
      - enable SGMII and 2500Base-X in-band auto-negotiation
      - simplify and reunify C22/C45 drivers
    - Micrel: convert bindings to DT schema
 
  - CAN:
    - move skb headroom content into skb extensions, making CAN metadata
      access more robust
 
  - CAN drivers:
    - rcar_canfd:
      - add support for FD-only mode
      - add support for the RZ/T2H SoC
    - sja1000: cleanup the CAN state handling
 
  - WiFi:
    - implement EPPKE/802.1X over auth frames support
    - split up drop reasons better, removing generic RX_DROP
    - additional FTM capabilities: 6 GHz support, supported number of
      spatial streams and supported number of LTF repetitions
    - better mac80211 iterators to enumerate resources
    - initial UHR (Wi-Fi 8) support for cfg80211/mac80211
 
  - WiFi drivers:
    - Qualcomm/Atheros:
      - ath11k: support for Channel Frequency Response measurement
      - ath12k: a significant driver refactor to support
        multi-wiphy devices and pave the way for future device support
        in the same driver (rather than splitting to ath13k)
      - ath12k: support for the QCC2072 chipset
    - Intel:
      - iwlwifi: partial Neighbor Awareness Networking (NAN) support
      - iwlwifi: initial support for U-NII-9 and IEEE 802.11bn
    - RealTek (rtw89):
      - preparations for RTL8922DE support
 
  - Bluetooth:
    - implement setsockopt(BT_PHY) to set the connection packet type/PHY
    - set link_policy on incoming ACL connections
 
  - Bluetooth drivers:
    - btusb: add support for MediaTek7920, Realtek RTL8761BU and 8851BE
    - btqca: add WCN6855 firmware priority selection feature
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmmMum4SHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkDnMP/3bpHAGj+gylTid3Xsj0TjJ8AkPsQs+W
 uvSMiCB1TvGCTD9kK36Vr+qPoIgJY10UxYMMjt5Gs0A9TvGDDfYnUOVoUIkfkWCH
 grqSdp6dVkyaJVfyLEcuOVQQG2HwEnhC4c3ZOhOxaKNAnsLCP142lYsMR9ktGRuA
 4vDGtz1+y7t8qBk/lyfXDM71KRrtq0HWJZIhmhz8QXTBsgPDfSejbTPNxXQOJoeO
 sKeArsHr/Cmvf89ZtLZ63vbfr4BKDm4PeXqPYR3PrQs2Yu6I1EK4lehygTY2yE2O
 I3MEPlvpa/tiVLxqXNNwEFbYIkMPY6FXS9x05hTxNZM65A6aB3vvdkqPVnVmAlXE
 f+4PYg9paI13lbzZOeQbGfZ5HgPpzQvnginaaX6s9Fp12K3Ll1FkwWdUznFWhzVn
 5LSrGyecR00CdKJByTIw9JGg/1ptz5a57pa8OQmcKRx3WhQ1XeV5TIJQF4QcPgHw
 ApyjmeGDTQMQMzha1fsaVr+i6BK2zgZvKK9uGDTX90xn2JUw/M75tyOlsTtGlnuM
 sZgj0KVGQlG2wLwBB/+D4S9Oi9YlPG00rkCs0E4jk5C/G4NBmMgpEPQg6azkb57h
 Uiy0paohxfwcZ3qbGA9In091ClGqIwOiCBaq+uXRq1ro88Neo6PWkjz5ItNrsD8t
 Ttgd5AVAQyPT
 =O31Y
 -----END PGP SIGNATURE-----

Merge tag 'net-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "Core & protocols:

   - A significant effort all around the stack to guide the compiler to
     make the right choice when inlining code, to avoid unneeded calls
     for small helper and stack canary overhead in the fast-path.

     This generates better and faster code with very small or no text
     size increases, as in many cases the call generated more code than
     the actual inlined helper.

   - Extend the AccECN implementation so that it is now functionally
     complete, and allow user space to enable it on a per-network-namespace
     basis.

   - Add support for memory providers with large (above 4K) rx buffers.
     Paired with hw-gro, larger rx buffer sizes reduce the number of
     buffers traversing the stack, decreasing single-stream CPU usage
     by up to ~30%.

   - Do not add HBH header to Big TCP GSO packets. This simplifies the
     RX path, the TX path and the NIC drivers, and is possible because
     user-space taps can now correctly interpret such packets without
     the HBH hint.

   - Allow IPv6 routes to be configured with a gateway address that is
     resolved out of a different interface than the one specified,
     aligning IPv6 to IPv4 behavior.

   - Multi-queue aware sch_cake. This makes it possible to scale the
     rate shaper of sch_cake across multiple CPUs, while still enforcing
     a single global rate on the interface.

   - Add support for the nbcon (new buffer console) infrastructure to
     netconsole, enabling lock-free, priority-based console operations
     that are safer in crash scenarios.

   - Improve the TCP IPv6 output path to cache the flow information,
     saving CPU cycles and reducing cache line misses and stack use.

   - Improve netfilter packet tracker to resolve clashes for most
     protocols, avoiding unneeded drops on rare occasions.

   - Add IP6IP6 tunneling acceleration to the flowtable infrastructure.

   - Reduce tcp socket size by one cache line.

   - Notify neighbour changes atomically, avoiding inconsistencies
     between the notification sequence and the actual sequence of states.

   - Add vsock namespace support, allowing complete isolation of vsocks
     across different network namespaces.

   - Improve generic xsk performance with cache-alignment-oriented
     optimizations.

   - Support netconsole automatic target recovery, allowing netconsole
     to reestablish targets when the underlying low-level interface
     comes back online.

  Driver API:

   - Support for switching the working mode (automatic vs manual) of a
     DPLL device via netlink.

   - Introduce PHY ports representation to expose multiple front-facing
     media ports over a single MAC.

   - Introduce "rx-polarity" and "tx-polarity" device tree properties,
     to generalize polarity inversion requirements for differential
     signaling.

   - Add helper to create, prepare and enable managed clocks.

  Device drivers:

   - Add Huawei hinic3 PF ethernet driver.

   - Add DWMAC glue driver for Motorcomm YT6801 PCIe ethernet
     controller.

   - Add ethernet driver for MaxLinear MxL862xx switches.

   - Remove parallel-port Ethernet driver.

   - Convert existing driver timestamp configuration reporting to
     hwtstamp_get and remove legacy ioctl().

   - Convert existing drivers to .get_rx_ring_count(), simplifying the RX
     ring count retrieval. Also remove the legacy fallback path.

   - Ethernet high-speed NICs:
      - Broadcom (bnxt, bng):
         - bnxt: add FW interface update to support FEC stats histogram
           and NVRAM defragmentation
         - bng: add TSO and H/W GRO support
      - nVidia/Mellanox (mlx5):
         - improve latency of channel restart operations, reducing the
           used H/W resources
         - add TSO support for UDP over GRE over VLAN
         - add flow counters support for hardware steering (HWS) rules
         - use a static memory area to store headers for H/W GRO,
           leading to 12% RX tput improvement
      - Intel (100G, ice, idpf):
         - ice: reorganizes layout of Tx and Rx rings for cacheline
           locality and utilizes __cacheline_group* macros on the new
           layouts
         - ice: introduces Synchronous Ethernet (SyncE) support
      - Meta (fbnic):
         - adds debugfs for firmware mailbox and tx/rx rings vectors

   - Ethernet virtual:
      - geneve: introduce GRO/GSO support for double UDP encapsulation

   - Ethernet NICs, consumer and embedded:
      - Synopsys (stmmac):
         - some code refactoring and cleanups
      - RealTek (r8169):
         - add support for RTL8127ATF (10G Fiber SFP)
         - add dash and LTR support
      - Airoha:
         - AN8811HB 2.5 Gbps phy support
      - Freescale (fec):
         - add XDP zero-copy support
      - Thunderbolt:
         - add get link setting support to allow bonding
      - Renesas:
         - add support for RZ/G3L GBETH SoC

   - Ethernet switches:
      - Maxlinear:
         - support R(G)MII slow rate configuration
         - add support for Intel GSW150
      - Motorcomm (yt921x):
         - add DCB/QoS support
      - TI:
         - icssm-prueth: support bridging (STP/RSTP) via the switchdev
           framework

   - Ethernet PHYs:
      - Realtek:
         - enable SGMII and 2500Base-X in-band auto-negotiation
         - simplify and reunify C22/C45 drivers
      - Micrel: convert bindings to DT schema

   - CAN:
      - move skb headroom content into skb extensions, making CAN
        metadata access more robust

   - CAN drivers:
      - rcar_canfd:
         - add support for FD-only mode
         - add support for the RZ/T2H SoC
      - sja1000: cleanup the CAN state handling

   - WiFi:
      - implement EPPKE/802.1X over auth frames support
      - split up drop reasons better, removing generic RX_DROP
      - additional FTM capabilities: 6 GHz support, supported number of
        spatial streams and supported number of LTF repetitions
      - better mac80211 iterators to enumerate resources
      - initial UHR (Wi-Fi 8) support for cfg80211/mac80211

   - WiFi drivers:
      - Qualcomm/Atheros:
         - ath11k: support for Channel Frequency Response measurement
         - ath12k: a significant driver refactor to support multi-wiphy
           devices and pave the way for future device support in the
           same driver (rather than splitting to ath13k)
         - ath12k: support for the QCC2072 chipset
      - Intel:
         - iwlwifi: partial Neighbor Awareness Networking (NAN) support
         - iwlwifi: initial support for U-NII-9 and IEEE 802.11bn
      - RealTek (rtw89):
         - preparations for RTL8922DE support

   - Bluetooth:
      - implement setsockopt(BT_PHY) to set the connection packet type/PHY
      - set link_policy on incoming ACL connections

   - Bluetooth drivers:
      - btusb: add support for MediaTek7920, Realtek RTL8761BU and 8851BE
      - btqca: add WCN6855 firmware priority selection feature"

* tag 'net-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1254 commits)
  bnge/bng_re: Add a new HSI
  net: macb: Fix tx/rx malfunction after phy link down and up
  af_unix: Fix memleak of newsk in unix_stream_connect().
  net: ti: icssg-prueth: Add optional dependency on HSR
  net: dsa: add basic initial driver for MxL862xx switches
  net: mdio: add unlocked mdiodev C45 bus accessors
  net: dsa: add tag format for MxL862xx switches
  dt-bindings: net: dsa: add MaxLinear MxL862xx
  selftests: drivers: net: hw: Modify toeplitz.c to poll for packets
  octeontx2-pf: Unregister devlink on probe failure
  net: renesas: rswitch: fix forwarding offload statemachine
  ionic: Rate limit unknown xcvr type messages
  tcp: inet6_csk_xmit() optimization
  tcp: populate inet->cork.fl.u.ip6 in tcp_v6_syn_recv_sock()
  tcp: populate inet->cork.fl.u.ip6 in tcp_v6_connect()
  ipv6: inet6_csk_xmit() and inet6_csk_update_pmtu() use inet->cork.fl.u.ip6
  ipv6: use inet->cork.fl.u.ip6 and np->final in ip6_datagram_dst_update()
  ipv6: use np->final in inet6_sk_rebuild_header()
  ipv6: add daddr/final storage in struct ipv6_pinfo
  net: stmmac: qcom-ethqos: fix qcom_ethqos_serdes_powerup()
  ...
2026-02-11 19:31:52 -08:00
Eric Biggers
ffd42b6d04 lib/crypto: mldsa: Clarify the documentation for mldsa_verify() slightly
mldsa_verify() implements ML-DSA.Verify with ctx='', so document this
more explicitly.  Remove the one-liner comment above mldsa_verify()
which was somewhat misleading.

Reviewed-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20260202221552.174341-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-02-03 19:28:51 -08:00
Eric Biggers
9ddfabcc1e lib/crypto: sha1: Remove low-level functions from API
Now that there are no users of the low-level SHA-1 interface, remove it.

Specifically:

- Remove SHA1_DIGEST_WORDS (no longer used)
- Remove sha1_init_raw() (no longer used)
- Rename sha1_transform() to sha1_block_generic() and make it static
- Move SHA1_WORKSPACE_WORDS into lib/crypto/sha1.c

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Link: https://patch.msgid.link/20260123051656.396371-3-ebiggers@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-27 15:47:41 -08:00
Eric Biggers
fbfeca7404 lib/crypto: aes: Drop 'volatile' from aes_sbox and aes_inv_sbox
The volatile keyword is no longer necessary or useful on aes_sbox and
aes_inv_sbox, since the table prefetching is now done using a helper
function that casts to volatile itself and also includes an optimization
barrier.  Since it prevents some compiler optimizations, remove it.
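
For context, a helper of the kind described above looks roughly like
this (a sketch; the function name and details are assumptions):

    static inline void aes_prefetch_table(const void *table, size_t len)
    {
            size_t i;

            /* Touch each cache line of the table through a volatile
             * pointer so the compiler cannot elide the loads. */
            for (i = 0; i < len; i += L1_CACHE_BYTES)
                    (void)*(volatile const u8 *)(table + i);
            barrier();  /* optimization barrier */
    }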

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-36-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15 14:09:09 -08:00
Eric Biggers
953f2db0bf lib/crypto: aes: Remove old AES en/decryption functions
Now that all callers of the aes_encrypt() and aes_decrypt() type-generic
macros are using the new types, remove the old functions.

Then, replace the macro with direct calls to the new functions, dropping
the "_new" suffix from them.

This completes the change in the type of the key struct that is passed
to aes_encrypt() and aes_decrypt().

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-35-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15 14:09:09 -08:00
Eric Biggers
bc79efa08c lib/crypto: aesgcm: Use new AES library API
Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note that in addition to the change in the key preparation function and
the key struct type itself, the change in the type of the key struct
results in aes_encrypt() (which is temporarily a type-generic macro)
calling the new encryption function rather than the old one.
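
In terms of the API, the conversion looks roughly like this (a sketch
based on the function and struct names above; exact prototypes may
differ):

    struct aes_enckey key;
    int err;

    err = aes_prepareenckey(&key, raw_key, raw_key_len);
    if (err)
            return err;
    aes_encrypt(&key, dst, src);  /* one 16-byte block */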

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-34-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15 14:09:09 -08:00
Eric Biggers
7fcb22dc77 lib/crypto: aescfb: Use new AES library API
Switch from the old AES library functions (which use struct
crypto_aes_ctx) to the new ones (which use struct aes_enckey).  This
eliminates the unnecessary computation and caching of the decryption
round keys.  The new AES en/decryption functions are also much faster
and use AES instructions when supported by the CPU.

Note that in addition to the change in the key preparation function and
the key struct type itself, the change in the type of the key struct
results in aes_encrypt() (which is temporarily a type-generic macro)
calling the new encryption function rather than the old one.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-33-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15 14:09:08 -08:00
Eric Biggers
24eb22d816 lib/crypto: x86/aes: Add AES-NI optimization
Optimize the AES library with x86 AES-NI instructions.

The relevant existing assembly functions, aesni_set_key(), aesni_enc(),
and aesni_dec(), are a bit difficult to extract into the library:

- They're coupled to the code for the AES modes.
- They operate on struct crypto_aes_ctx.  The AES library now uses
  different structs.
- They assume the key is 16-byte aligned.  The AES library only
  *prefers* 16-byte alignment; it doesn't require it.

Moreover, they're not all that great in the first place:

- They use unrolled loops, which isn't a great choice on x86.
- They use the 'aeskeygenassist' instruction, which is unnecessary, is
  slow on Intel CPUs, and forces the loop to be unrolled.
- They have special code for AES-192 key expansion, despite that being
  kind of useless.  AES-128 and AES-256 are the ones used in practice.

These are small functions anyway.

Therefore, I opted to just write replacements of these functions for the
library.  They address all the above issues.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-18-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15 14:09:07 -08:00
Eric Biggers
293c7cd5c6 lib/crypto: sparc/aes: Migrate optimized code into library
Move the SPARC64 AES assembly code into lib/crypto/, wire the key
expansion and single-block en/decryption functions up to the AES library
API, and remove the "aes-sparc64" crypto_cipher algorithm.

The result is that both the AES library and crypto_cipher APIs use the
SPARC64 AES opcodes, whereas previously only crypto_cipher did (and it
wasn't enabled by default, which this commit fixes as well).

Note that some of the functions in the SPARC64 AES assembly code are
still used by the AES mode implementations in
arch/sparc/crypto/aes_glue.c.  For now, just export these functions.
These exports will go away once the AES mode implementations are
migrated to the library as well.  (Trying to split up the assembly file
seemed like much more trouble than it would be worth.)

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-17-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15 14:09:07 -08:00
Eric Biggers
0cab15611e lib/crypto: s390/aes: Migrate optimized code into library
Implement aes_preparekey_arch(), aes_encrypt_arch(), and
aes_decrypt_arch() using the CPACF AES instructions.

Then, remove the superseded "aes-s390" crypto_cipher.

The result is that both the AES library and crypto_cipher APIs use the
CPACF AES instructions, whereas previously only crypto_cipher did (and
it wasn't enabled by default, which this commit fixes as well).

Note that this preserves the optimization where the AES key is stored in
raw form rather than expanded form.  CPACF just takes the raw key.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20260112192035.10427-16-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15 14:08:55 -08:00
Eric Biggers
a4e573db06 lib/crypto: riscv/aes: Migrate optimized code into library
Move the aes_encrypt_zvkned() and aes_decrypt_zvkned() assembly
functions into lib/crypto/, wire them up to the AES library API, and
remove the "aes-riscv64-zvkned" crypto_cipher algorithm.

To make this possible, change the prototypes of these functions to
take (rndkeys, key_len) instead of a pointer to crypto_aes_ctx, and
change the RISC-V AES-XTS code to implement tweak encryption using the
AES library instead of directly calling aes_encrypt_zvkned().

The result is that both the AES library and crypto_cipher APIs use
RISC-V's AES instructions, whereas previously only crypto_cipher did
(and it wasn't enabled by default, which this commit fixes as well).

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-15-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:39:58 -08:00
Eric Biggers
7cf2082e74 lib/crypto: powerpc/aes: Migrate POWER8 optimized code into library
Move the POWER8 AES assembly code into lib/crypto/, wire the key
expansion and single-block en/decryption functions up to the AES library
API, and remove the superseded "p8_aes" crypto_cipher algorithm.

The result is that both the AES library and crypto_cipher APIs are now
optimized for POWER8, whereas previously only crypto_cipher was (and
optimizations weren't enabled by default, which this commit fixes too).

Note that many of the functions in the POWER8 assembly code are still
used by the AES mode implementations in arch/powerpc/crypto/.  For now,
just export these functions.  These exports will go away once the AES
modes are migrated to the library as well.  (Trying to split up the
assembly file seemed like much more trouble than it would be worth.)

Another challenge with this code is that the POWER8 assembly code uses a
custom format for the expanded AES key.  Since that code is imported
from OpenSSL and is also targeted to POWER8 (rather than POWER9 which
has better data movement and byteswap instructions), that is not easily
changed.  For now I've just kept the custom format.  To maintain full
correctness, this requires executing some slow fallback code in the case
where the usability of VSX changes between key expansion and use.  This
should be tolerable, as this case shouldn't happen in practice.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:39:58 -08:00
Eric Biggers
0892c91b81 lib/crypto: powerpc/aes: Migrate SPE optimized code into library
Move the PowerPC SPE AES assembly code into lib/crypto/, wire the key
expansion and single-block en/decryption functions up to the AES library
API, and remove the superseded "aes-ppc-spe" crypto_cipher algorithm.

The result is that both the AES library and crypto_cipher APIs are now
optimized with SPE, whereas previously only crypto_cipher was (and
optimizations weren't enabled by default, which this commit fixes too).

Note that many of the functions in the PowerPC SPE assembly code are
still used by the AES mode implementations in arch/powerpc/crypto/.  For
now, just export these functions.  These exports will go away once the
AES modes are migrated to the library as well.  (Trying to split up the
assembly files seemed like much more trouble than it would be worth.)

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:39:58 -08:00
Eric Biggers
2b1ef7aeeb lib/crypto: arm64/aes: Migrate optimized code into library
Move the ARM64 optimized AES key expansion and single-block AES
en/decryption code into lib/crypto/, wire it up to the AES library API,
and remove the superseded crypto_cipher algorithms.

The result is that both the AES library and crypto_cipher APIs are now
optimized for ARM64, whereas previously only crypto_cipher was (and the
optimizations weren't enabled by default, which this fixes as well).

Note: to see the diff from arch/arm64/crypto/aes-ce-glue.c to
lib/crypto/arm64/aes.h, view this commit with 'git show -M10'.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:39:58 -08:00
Eric Biggers
fa2297750c lib/crypto: arm/aes: Migrate optimized code into library
Move the ARM optimized single-block AES en/decryption code into
lib/crypto/, wire it up to the AES library API, and remove the
superseded "aes-arm" crypto_cipher algorithm.

The result is that both the AES library and crypto_cipher APIs are now
optimized for ARM, whereas previously only crypto_cipher was (and the
optimizations weren't enabled by default, which this fixes as well).

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:39:58 -08:00
Eric Biggers
a22fd0e3c4 lib/crypto: aes: Introduce improved AES library
The kernel's AES library currently has the following issues:

- It doesn't take advantage of the architecture-optimized AES code,
  including the implementations using AES instructions.

- It's much slower than even the other software AES implementations: 2-4
  times slower than "aes-generic", "aes-arm", and "aes-arm64".

- It requires that both the encryption and decryption round keys be
  computed and cached.  This is wasteful for users that need only the
  forward (encryption) direction of the cipher: the key struct is 484
  bytes when only 244 are actually needed.  This missed optimization is
  very common, as many AES modes (e.g. GCM, CFB, CTR, CMAC, and even the
  tweak key in XTS) use the cipher only in the forward (encryption)
  direction even when doing decryption.

- It doesn't provide the flexibility to customize the prepared key
  format.  The API is defined to do key expansion, and several callers
  in drivers/crypto/ use it specifically to expand the key.  This is an
  issue when integrating the existing powerpc, s390, and sparc code,
  which is necessary to provide full parity with the traditional API.

To resolve these issues, I'm proposing the following changes:

1. New structs 'aes_key' and 'aes_enckey' are introduced, with
   corresponding functions aes_preparekey() and aes_prepareenckey().

   Generally these structs will include the encryption+decryption round
   keys and the encryption round keys, respectively.  However, the exact
   format will be under control of the architecture-specific AES code.

   (The verb "prepare" is chosen over "expand" since key expansion isn't
   necessarily done.  It's also consistent with hmac*_preparekey().)

2. aes_encrypt() and aes_decrypt() will be changed to operate on the new
   structs instead of struct crypto_aes_ctx.

3. aes_encrypt() and aes_decrypt() will use architecture-optimized code
   when available, or else fall back to a new generic AES implementation
   that unifies the existing two fragmented generic AES implementations.

   The new generic AES implementation uses tables for both SubBytes and
   MixColumns, making it almost as fast as "aes-generic".  However,
   instead of aes-generic's huge 8192-byte tables per direction, it uses
   only 1024 bytes for encryption and 1280 bytes for decryption (similar
   to "aes-arm").  The cost is just some extra rotations.

   The new generic AES implementation also includes table prefetching,
   making it have some "constant-time hardening".  That's an improvement
   from aes-generic which has no constant-time hardening.

   It does slightly regress in constant-time hardening vs. the old
   lib/crypto/aes.c which had smaller tables, and from aes-fixed-time
   which disabled IRQs on top of that.  But I think this is tolerable.
   The real solutions for constant-time AES are AES instructions or
   bit-slicing.  The table-based code remains a best-effort fallback for
   the increasingly-rare case where a real solution is unavailable.

4. crypto_aes_ctx and aes_expandkey() will remain for now, but only for
   callers that are using them specifically for the AES key expansion
   (as opposed to en/decrypting data with the AES library).

This commit begins the migration process by introducing the new structs
and functions, backed by the new generic AES implementation.

To allow callers to be incrementally converted, aes_encrypt() and
aes_decrypt() are temporarily changed into macros that use a _Generic
expression to call either the old functions (which take crypto_aes_ctx)
or the new functions (which take the new types).  Once all callers have
been updated, these macros will go away, the old functions will be
removed, and the "_new" suffix will be dropped from the new functions.
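
One way such a temporary dispatch macro can be written (illustrative
only, not the literal patch):

    #define aes_encrypt(key, dst, src)                                \
            _Generic((key),                                           \
                     const struct aes_enckey *: aes_encrypt_new,      \
                     struct aes_enckey *: aes_encrypt_new,            \
                     default: aes_encrypt)((key), (dst), (src))

(The name aes_encrypt inside the _Generic body is not followed by a
parenthesis, so it refers to the old function rather than re-expanding
the macro.)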

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:39:58 -08:00
Eric Biggers
959a634ebc lib/crypto: mldsa: Add FIPS cryptographic algorithm self-test
Since ML-DSA is FIPS-approved, add the boot-time self-test which is
apparently required.

Just add a test vector manually for now, borrowed from
lib/crypto/tests/mldsa-testvecs.h (where in turn it's borrowed from
leancrypto).  The SHA-* FIPS test vectors are generated by
scripts/crypto/gen-fips-testvecs.py instead, but the common Python
libraries don't support ML-DSA yet.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20260107044215.109930-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:50 -08:00
Eric Biggers
0d92c55532 lib/crypto: nh: Restore dependency of arch code on !KMSAN
Since the architecture-specific implementations of NH initialize memory
in assembly code, they aren't compatible with KMSAN as-is.

Fixes: 382de740759a ("lib/crypto: nh: Add NH library")
Link: https://lore.kernel.org/r/20260105053652.1708299-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:50 -08:00
Rusydi H. Makarim
c8bf0b969d lib/crypto: md5: Use rol32() instead of open-coding it
For the bitwise left rotation in MD5STEP, use rol32() from
<linux/bitops.h> instead of open-coding it.
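
In essence (variable names illustrative):

    w = (w << s) | (w >> (32 - s));  /* before: open-coded rotation */
    w = rol32(w, s);                 /* after: <linux/bitops.h> helper */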

Signed-off-by: Rusydi H. Makarim <rusydi.makarim@kriptograf.id>
Link: https://lore.kernel.org/r/20251214-rol32_in_md5-v1-1-20f5f11a92b2@kriptograf.id
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:50 -08:00
Eric Biggers
a229d83235 lib/crypto: x86/nh: Migrate optimized code into library
Migrate the x86_64 implementations of NH into lib/crypto/.  This makes
the nh() function optimized on x86_64 kernels.

Note: this temporarily makes the adiantum template not utilize the
x86_64 optimized NH code.  This is resolved in a later commit that
converts the adiantum template to use nh() instead of "nhpoly1305".

Link: https://lore.kernel.org/r/20251211011846.8179-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:50 -08:00
Eric Biggers
b4a8528d17 lib/crypto: arm64/nh: Migrate optimized code into library
Migrate the arm64 NEON implementation of NH into lib/crypto/.  This
makes the nh() function optimized on arm64 kernels.

Note: this temporarily makes the adiantum template not utilize the arm64
optimized NH code.  This is resolved in a later commit that converts the
adiantum template to use nh() instead of "nhpoly1305".

Link: https://lore.kernel.org/r/20251211011846.8179-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:50 -08:00
Eric Biggers
29e39a11f5 lib/crypto: arm/nh: Migrate optimized code into library
Migrate the arm32 NEON implementation of NH into lib/crypto/.  This
makes the nh() function optimized on arm32 kernels.

Note: this temporarily makes the adiantum template not utilize the arm32
optimized NH code.  This is resolved in a later commit that converts the
adiantum template to use nh() instead of "nhpoly1305".

Link: https://lore.kernel.org/r/20251211011846.8179-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:49 -08:00
Eric Biggers
7246fe6cd6 lib/crypto: tests: Add KUnit tests for NH
Add some simple KUnit tests for the nh() function.

These replace the test coverage which will be lost by removing the
nhpoly1305 crypto_shash.

Note that the NH code also continues to be tested indirectly as well,
via the tests for the "adiantum(xchacha12,aes)" crypto_skcipher.

Link: https://lore.kernel.org/r/20251211011846.8179-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:49 -08:00
Eric Biggers
14e15c71d7 lib/crypto: nh: Add NH library
Add support for the NH "almost-universal hash function" to lib/crypto/,
specifically the variant of NH used in Adiantum.

This will replace the need for the "nhpoly1305" crypto_shash algorithm.
All the implementations of "nhpoly1305" use architecture-optimized code
only for the NH stage; they just use the generic C Poly1305 code for the
Poly1305 stage.  We can achieve the same result in a simpler way using
an (architecture-optimized) nh() function combined with code in
crypto/adiantum.c that passes the results to the Poly1305 library.

This commit begins this cleanup by adding the nh() function.  The code
is derived from crypto/nhpoly1305.c and include/crypto/nhpoly1305.h.
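
The interface is a single function of roughly this shape (signature
assumed from the nhpoly1305 code it is derived from):

    void nh(const u32 *key, const u8 *message, size_t message_len,
            __le64 hash[NH_NUM_PASSES]);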

Link: https://lore.kernel.org/r/20251211011846.8179-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:49 -08:00
Eric Biggers
ed894faccb lib/crypto: tests: Add KUnit tests for ML-DSA verification
Add a KUnit test suite for ML-DSA verification, including the following
for each ML-DSA parameter set (ML-DSA-44, ML-DSA-65, and ML-DSA-87):

- Positive test (valid signature), using vector imported from leancrypto
- Various negative tests:
    - Wrong length for signature, message, or public key
    - Out-of-range coefficients in z vector
    - Invalid encoded hint vector
    - Any bit flipped in signature, message, or public key
- Unit test for the internal function use_hint()
- A benchmark

ML-DSA inputs and outputs are very large.  To keep the size of the tests
down, use just one valid test vector per parameter set, and generate the
negative tests at runtime by mutating the valid test vector.

I also considered importing the test vectors from Wycheproof.  I've
tested that mldsa_verify() indeed passes all of Wycheproof's ML-DSA test
vectors that use an empty context string.  However, importing these
permanently would add over 6 MB of source.  That's too much to be a
reasonable addition to the Linux kernel tree for one algorithm.  It also
wouldn't actually provide much better test coverage than this commit.
Another potential issue is that Wycheproof uses the Apache license.

Similarly, this also differs from the earlier proposal to import a long
list of test vectors from leancrypto.  I retained only one valid
signature for each algorithm, and I also added (runtime-generated)
negative tests which were missing.  I think this is a better tradeoff.

Reviewed-by: David Howells <dhowells@redhat.com>
Tested-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20251214181712.29132-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:49 -08:00
Eric Biggers
64edccea59 lib/crypto: Add ML-DSA verification support
Add support for verifying ML-DSA signatures.

ML-DSA (Module-Lattice-Based Digital Signature Algorithm) is specified
in FIPS 204 and is the standard version of Dilithium.  Unlike RSA and
elliptic-curve cryptography, ML-DSA is believed to be secure even
against adversaries in possession of a large-scale quantum computer.

Compared to the earlier patch
(https://lore.kernel.org/r/20251117145606.2155773-3-dhowells@redhat.com/)
that was based on "leancrypto", this implementation:

  - Is about 700 lines of source code instead of 4800.

  - Generates about 4 KB of object code instead of 28 KB.

  - Uses 9-13 KB of memory to verify a signature instead of 31-84 KB.

  - Is at least about the same speed, with a microbenchmark showing 3-5%
    improvements on one x86_64 CPU and -1% to 1% changes on another.
    When memory is a bottleneck, it's likely much faster.

  - Correctly implements the RejNTTPoly step of the algorithm.

The API just consists of a single function mldsa_verify(), supporting
pure ML-DSA with any standard parameter set (ML-DSA-44, ML-DSA-65, or
ML-DSA-87) as selected by an enum.  That's all that's actually needed.
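
A sketch of the shape of that API (the enum and parameter names here
are assumed, not quoted from the patch):

    enum mldsa_param_set { MLDSA_44, MLDSA_65, MLDSA_87 };

    int mldsa_verify(enum mldsa_param_set param_set,
                     const u8 *sig, size_t sig_len,
                     const u8 *msg, size_t msg_len,
                     const u8 *pub_key, size_t pub_key_len);

It returns 0 if the signature is valid, or a negative error code if the
signature is invalid or an error (such as ENOMEM, see below) occurs.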

The following four potential features are unneeded and aren't included.
However, any that ever become needed could fairly easily be added later,
as they only affect how the message representative mu is calculated:

  - Nonempty context strings
  - Incremental message hashing
  - HashML-DSA
  - External mu

Signing support would, of course, be a larger and more complex addition.
However, the kernel doesn't, and shouldn't, need ML-DSA signing support.

Note that mldsa_verify() allocates memory, so it can sleep and can fail
with ENOMEM.  Unfortunately we don't have much choice about that, since
ML-DSA needs a lot of memory.  At least callers have to check for errors
anyway, since the signature could be invalid.

Note that verification doesn't require constant-time code, and in fact
some steps are inherently variable-time.  I've used constant-time
patterns in some places anyway, but technically they're not needed.

Reviewed-by: David Howells <dhowells@redhat.com>
Tested-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20251214181712.29132-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:49 -08:00
Eric Biggers
74d74bb78a lib/crypto: aes: Fix missing MMU protection for AES S-box
__cacheline_aligned puts the data in the ".data..cacheline_aligned"
section, which isn't marked read-only i.e. it doesn't receive MMU
protection.  Replace it with ____cacheline_aligned which does the right
thing and just aligns the data while keeping it in ".rodata".
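
The change in essence (declaration sketched, initializer elided):

    /* before: lands in the writable .data..cacheline_aligned section */
    static const u8 aes_sbox[256] __cacheline_aligned = { /* ... */ };

    /* after: stays in .rodata, with the same alignment */
    static const u8 aes_sbox[256] ____cacheline_aligned = { /* ... */ };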

Fixes: b5e0b032b6 ("crypto: aes - add generic time invariant AES cipher")
Cc: stable@vger.kernel.org
Reported-by: Qingfang Deng <dqfext@gmail.com>
Closes: https://lore.kernel.org/r/20260105074712.498-1-dqfext@gmail.com/
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260107052023.174620-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-08 11:14:59 -08:00
Thomas Weißschuh
fcff71fd88 lib/crypto: tests: polyval_kunit: Increase iterations for preparekey in IRQs
On my development machine the generic, memcpy()-only implementation of
polyval_preparekey() is so fast that the IRQ workers never actually
fire, and the test fails.

Increase the iterations to make the test more robust.
The test will run for a maximum of one second in any case.

[EB: This failure was already fixed by commit c31f4aa8fe ("kunit:
Enforce task execution in {soft,hard}irq contexts").  I'm still applying
this patch too, since the iteration count in this test made its running
time much shorter than the other similar ones.]

Fixes: b3aed551b3 ("lib/crypto: tests: Add KUnit tests for POLYVAL")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Link: https://lore.kernel.org/r/20260102-kunit-polyval-fix-v1-1-5313b5a65f35@linutronix.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-08 11:14:59 -08:00
Charles Mirabile
5a0b188250 lib/crypto: riscv: Add poly1305-core.S to .gitignore
poly1305-core.S is an auto-generated file, so it should be ignored.
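
That is, a one-line addition (the path is assumed from the subject
line):

    # in lib/crypto/riscv/.gitignore
    poly1305-core.S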

Fixes: bef9c75598 ("lib/crypto: riscv/poly1305: Import OpenSSL/CRYPTOGAMS implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Charles Mirabile <cmirabil@redhat.com>
Link: https://lore.kernel.org/r/20251212184717.133701-1-cmirabil@redhat.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-14 10:18:22 -08:00
Eric Biggers
68b233b1d5 lib/crypto: blake2s: Replace manual unrolling with unrolled_full
As we're doing in the BLAKE2b code, use unrolled_full to make the
compiler handle the loop unrolling.  This simplifies the code slightly.
The generated object code is nearly the same with both gcc and clang.
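
The pattern, roughly (the round helper shown is hypothetical):

    #include <linux/unroll.h>

    unrolled_full
    for (i = 0; i < 10; i++)
            blake2s_round(state, block, i);  /* one of the 10 rounds */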

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251205051155.25274-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-09 15:10:21 -08:00
Eric Biggers
2e8f7b170a lib/crypto: blake2b: Roll up BLAKE2b round loop on 32-bit
BLAKE2b has a state of 16 64-bit words.  Add the message data in and
there are 32 64-bit words.  With the current code where all the rounds
are unrolled to enable constant-folding of the blake2b_sigma values,
this results in a very large code size on 32-bit kernels, including a
recurring issue where gcc uses a large amount of stack.

There's just not much benefit to this unrolling when the code is already
so large.  Let's roll up the rounds when !CONFIG_64BIT.

To avoid having to duplicate the code, just write the code once using a
loop, and conditionally use 'unrolled_full' from <linux/unroll.h>.

Then, fold the now-unneeded ROUND() macro into the loop.  Finally, also
remove the now-unneeded override of the stack frame size warning.

Code size improvements for blake2b_compress_generic():

                  Size before (bytes)    Size after (bytes)
                  -------------------    ------------------
    i386, gcc           27584                 3632
    i386, clang         18208                 3248
    arm32, gcc          19912                 2860
    arm32, clang        21336                 3344

Running the BLAKE2b benchmark on a !CONFIG_64BIT kernel on an x86_64
processor shows a 16384B throughput change of 351 MB/s => 340 MB/s (gcc)
or 442 MB/s => 375 MB/s (clang), so clearly not much of a slowdown
either.  Moreover, that microbenchmark effectively disregards cache
usage, which is important in practice and is far better with the
smaller code.

Note: If we rolled up the loop on x86_64 too, the change would be
7024 bytes => 1584 bytes and 1960 MB/s => 1396 MB/s (gcc), or
6848 bytes => 1696 bytes and 1920 MB/s => 1263 MB/s (clang).
Maybe still worth it, though not quite as clearly beneficial.

Fixes: 91d689337f ("crypto: blake2b - add blake2b generic implementation")
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251205050330.89704-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-09 15:10:21 -08:00
Eric Biggers
1cd5bb6e9e lib/crypto: riscv: Depend on RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
Replace the RISCV_ISA_V dependency of the RISC-V crypto code with
RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS, which implies RISCV_ISA_V as
well as vector unaligned accesses being efficient.

This is necessary because this code assumes that vector unaligned
accesses are supported and are efficient.  (It does so to avoid having
to use lots of extra vsetvli instructions to switch the element width
back and forth between 8 and either 32 or 64.)

This was omitted from the code originally just because the RISC-V kernel
support for detecting this feature didn't exist yet.  Support has now
been added, but it's fragmented into per-CPU runtime detection, a
command-line parameter, and a kconfig option.  The kconfig option is the
only reasonable way to do it, though, so let's just rely on that.
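
In Kconfig terms the change is of this simple form (the option shown is
illustrative):

    config CRYPTO_CHACHA_RISCV64
            depends on RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS  # was: RISCV_ISA_V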

Fixes: eb24af5d7a ("crypto: riscv - add vector crypto accelerated AES-{ECB,CBC,CTR,XTS}")
Fixes: bb54668837 ("crypto: riscv - add vector crypto accelerated ChaCha20")
Fixes: 600a3853df ("crypto: riscv - add vector crypto accelerated GHASH")
Fixes: 8c8e40470f ("crypto: riscv - add vector crypto accelerated SHA-{256,224}")
Fixes: b3415925a0 ("crypto: riscv - add vector crypto accelerated SHA-{512,384}")
Fixes: 563a5255af ("crypto: riscv - add vector crypto accelerated SM3")
Fixes: b8d06352bb ("crypto: riscv - add vector crypto accelerated SM4")
Cc: stable@vger.kernel.org
Reported-by: Vivian Wang <wangruikang@iscas.ac.cn>
Closes: https://lore.kernel.org/r/b3cfcdac-0337-4db0-a611-258f2868855f@iscas.ac.cn/
Reviewed-by: Jerry Shih <jerry.shih@sifive.com>
Link: https://lore.kernel.org/r/20251206213750.81474-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-09 15:10:21 -08:00
Vivian Wang
43169328c7 lib/crypto: riscv/chacha: Avoid s0/fp register
In chacha_zvkb, avoid using the s0 register, which is the frame pointer,
by reallocating KEY0 to t5. This makes stack traces available if e.g. a
crash happens in chacha_zvkb.

No frame pointer maintenance is otherwise required since this is a leaf
function.
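
In the assembly source this amounts to a register alias change, roughly:

    #define KEY0    t5  /* was s0, which doubles as the frame pointer */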

Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn>
Fixes: bb54668837 ("crypto: riscv - add vector crypto accelerated ChaCha20")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20251202-riscv-chacha_zvkb-fp-v2-1-7bd00098c9dc@iscas.ac.cn
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-09 15:10:20 -08:00
Linus Torvalds
a619fe35ab This update includes the following changes:
API:
 
 - Rewrite memcpy_sglist from scratch.
 - Add on-stack AEAD request allocation.
 - Fix partial block processing in ahash.
 
 Algorithms:
 
 - Remove ansi_cprng.
 - Remove tcrypt tests for poly1305.
 - Fix EINPROGRESS processing in authenc.
 - Fix double-free in zstd.
 
 Drivers:
 
 - Use drbg ctr helper when reseeding xilinx-trng.
 - Add support for PCI device 0x115A to ccp.
 - Add support of paes in caam.
 - Add support for aes-xts in dthev2.
 
 Others:
 
 - Use likely in rhashtable lookup.
 - Fix lockdep false-positive in padata by removing a helper.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmktaHwACgkQxycdCkmx
 i6duthAAl4ZjsuSgt0P9ZPJXWgSH+QbNT/6fL1QzLEuzLVGn8Mt99LTQpaYU8HRh
 fced8+R7UpqA/FgZTYbRKopZJVJJqhmTf2zqjbe47CroRm2Wf5UO+6ZXBsiqbMwa
 6fNLilhcrq5G3DrIHepCpIQ7NM2+ucTMnPRIWP3cvzLwX0JzPtYIpYUSiVPAtkjh
 9g24oPz6LR/xZfyk+wPbHOSYeqz4sSXnGJkL+Vn33AtU5KJZLum9zMP4Lleim7HP
 XaNnUL/S/PYCspycrvfrnq6+YMLPw2USguttuZe0Dg0qhq/jPMyzdEkTAjcTD5LG
 NZavVUbQsf6BW+YjXgaE/ybcSs6WR3ySs8aza1Ev8QqsmpbJj9xdpF9fn4RsffGR
 mbhc5plJCKWzfiaparea8yY9n5vHwbOK4zoyF9P6kI5ykkoA+GmwRwTW73M9KCfa
 i1R6g97O+t4Yaq9JI9GG7dkm9bxJpY+XaKouW7rqv/MX0iND1ExDYaqdcA+Xa61c
 TNfdlVcGyX7Dolm2xnpvRv8EqF9NzeK4Vw1QslrdCijXfe7eJymabNKhLBlV4li0
 tVfmh4vyQFgruyiR7r7AkXIKzsLZbji030UoOsQqiMW7ualBUQ0dCDbBa8J6kUcX
 /vjbSmxV3LKgVgYvUBRRGIi9CJbKfs29RkS6RFtdqcq/YT4KsJU=
 =DHes
 -----END PGP SIGNATURE-----

Merge tag 'v6.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Rewrite memcpy_sglist from scratch
   - Add on-stack AEAD request allocation
   - Fix partial block processing in ahash

  Algorithms:
   - Remove ansi_cprng
   - Remove tcrypt tests for poly1305
   - Fix EINPROGRESS processing in authenc
   - Fix double-free in zstd

  Drivers:
   - Use drbg ctr helper when reseeding xilinx-trng
   - Add support for PCI device 0x115A to ccp
   - Add support of paes in caam
   - Add support for aes-xts in dthev2

  Others:
   - Use likely in rhashtable lookup
   - Fix lockdep false-positive in padata by removing a helper"

* tag 'v6.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (71 commits)
  crypto: zstd - fix double-free in per-CPU stream cleanup
  crypto: ahash - Zero positive err value in ahash_update_finish
  crypto: ahash - Fix crypto_ahash_import with partial block data
  crypto: lib/mpi - use min() instead of min_t()
  crypto: ccp - use min() instead of min_t()
  hwrng: core - use min3() instead of nested min_t()
  crypto: aesni - ctr_crypt() use min() instead of min_t()
  crypto: drbg - Delete unused ctx from struct sdesc
  crypto: testmgr - Add missing DES weak and semi-weak key tests
  Revert "crypto: scatterwalk - Move skcipher walk and use it for memcpy_sglist"
  crypto: scatterwalk - Fix memcpy_sglist() to always succeed
  crypto: iaa - Request to add Kanchana P Sridhar to Maintainers.
  crypto: tcrypt - Remove unused poly1305 support
  crypto: ansi_cprng - Remove unused ansi_cprng algorithm
  crypto: asymmetric_keys - fix uninitialized pointers with free attribute
  KEYS: Avoid -Wflex-array-member-not-at-end warning
  crypto: ccree - Correctly handle return of sg_nents_for_len
  crypto: starfive - Correctly handle return of sg_nents_for_len
  crypto: iaa - Fix incorrect return value in save_iaa_wq()
  crypto: zstd - Remove unnecessary size_t cast
  ...
2025-12-03 11:28:38 -08:00
Linus Torvalds
f617d24606 arm64 FPSIMD buffer on-stack for 6.19
In v6.8, the size of task_struct on arm64 increased by 528 bytes due
 to the new 'kernel_fpsimd_state' field. This field was added to allow
 kernel-mode FPSIMD code to be preempted.
 
 Unfortunately, 528 bytes is kind of a lot for task_struct. This
 regression in the task_struct size was noticed and reported.
 
 Recover that space by allocating this state on the stack at the
 beginning of each kernel-mode FPSIMD section.
 
 To make it easier for all the users of kernel-mode FPSIMD to do that
 correctly, introduce and use a 'scoped_ksimd' abstraction.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaSuxbxQcZWJpZ2dlcnNA
 a2VybmVsLm9yZwAKCRDzXCl4vpKOKysnAQCbN4Jed8IqwGUEkkjZrnMeN0pEO4RI
 lAhb2Obj3n/grQEAiPBmqWVjXaIPO4lSgLQxY6XoVLr+utMod4TMTYHfnAY=
 =0zQQ
 -----END PGP SIGNATURE-----

Merge tag 'fpsimd-on-stack-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull arm64 FPSIMD on-stack buffer updates from Eric Biggers:
 "This is a core arm64 change. However, I was asked to take this because
  most uses of kernel-mode FPSIMD are in crypto or CRC code.

  In v6.8, the size of task_struct on arm64 increased by 528 bytes due
  to the new 'kernel_fpsimd_state' field. This field was added to allow
  kernel-mode FPSIMD code to be preempted.

  Unfortunately, 528 bytes is kind of a lot for task_struct. This
  regression in the task_struct size was noticed and reported.

   Recover that space by allocating this state on the stack at the
   beginning of each kernel-mode FPSIMD section.

  To make it easier for all the users of kernel-mode FPSIMD to do that
  correctly, introduce and use a 'scoped_ksimd' abstraction"
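
A minimal sketch of the intended usage pattern (an assumed shape based on
the description above; see the tree for the exact macro definition and
the real call sites):

    /* The guard opens a kernel-mode FPSIMD section whose register save
     * buffer lives in this stack frame rather than in task_struct, and
     * the section is closed automatically when the scope ends.
     * do_neon_transform() is a hypothetical helper standing in for real
     * FPSIMD-using code. */
    static void crypto_op(u8 *dst, const u8 *src, size_t len)
    {
    	scoped_ksimd() {
    		do_neon_transform(dst, src, len);
    	}
    }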

* tag 'fpsimd-on-stack-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (23 commits)
  lib/crypto: arm64: Move remaining algorithms to scoped ksimd API
  lib/crypto: arm/blake2b: Move to scoped ksimd API
  arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack
  arm64/fpu: Enforce task-context only for generic kernel mode FPU
  net/mlx5: Switch to more abstract scoped ksimd guard API on arm64
  arm64/xorblocks: Switch to 'ksimd' scoped guard API
  crypto/arm64: sm4 - Switch to 'ksimd' scoped guard API
  crypto/arm64: sm3 - Switch to 'ksimd' scoped guard API
  crypto/arm64: sha3 - Switch to 'ksimd' scoped guard API
  crypto/arm64: polyval - Switch to 'ksimd' scoped guard API
  crypto/arm64: nhpoly1305 - Switch to 'ksimd' scoped guard API
  crypto/arm64: aes-gcm - Switch to 'ksimd' scoped guard API
  crypto/arm64: aes-blk - Switch to 'ksimd' scoped guard API
  crypto/arm64: aes-ccm - Switch to 'ksimd' scoped guard API
  raid6: Move to more abstract 'ksimd' guard API
  crypto: aegis128-neon - Move to more abstract 'ksimd' guard API
  crypto/arm64: sm4-ce-gcm - Avoid pointless yield of the NEON unit
  crypto/arm64: sm4-ce-ccm - Avoid pointless yield of the NEON unit
  crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit
  lib/crc: Switch ARM and arm64 to 'ksimd' scoped guard API
  ...
2025-12-02 18:53:50 -08:00
Linus Torvalds
906003e151 'at_least' array sizes for 6.19
C supports lower bounds on the sizes of array parameters, using the
 static keyword as follows: 'void f(int a[static 32]);'. This allows
 the compiler to warn about a too-small array being passed.
 
 As discussed, this reuse of the 'static' keyword, while standard, is a
 bit obscure. Therefore, add an alias 'at_least' to compiler_types.h.
 
 Then, add this 'at_least' annotation to the array parameters of
 various crypto library functions.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaSuslxQcZWJpZ2dlcnNA
 a2VybmVsLm9yZwAKCRDzXCl4vpKOK/dXAP9WyuTsEYOOSwSaDI5sKzdNXT3GZNeO
 jGhx9qPN3KIq8QD/RElN9oF7iU9wsKvU6kKpnqGcajGTzkW/GOAA20BFcAM=
 =qFEB
 -----END PGP SIGNATURE-----

Merge tag 'libcrypto-at-least-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull 'at_least' array size update from Eric Biggers:
 "C supports lower bounds on the sizes of array parameters, using the
  static keyword as follows: 'void f(int a[static 32]);'. This allows
  the compiler to warn about a too-small array being passed.

  As discussed, this reuse of the 'static' keyword, while standard, is a
  bit obscure. Therefore, add an alias 'at_least' to compiler_types.h.

  Then, add this 'at_least' annotation to the array parameters of
  various crypto library functions"
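
A standalone sketch of the mechanism (the in-tree 'at_least' definition
may carry extra decoration beyond this):

    /* 'static' inside an array parameter's bound is standard C: it
     * promises the caller passes at least that many elements, letting
     * gcc and clang warn when a smaller array is supplied. 'at_least'
     * is simply a more readable spelling of that keyword. */
    #define at_least static

    /* The callee promises to read at least 32 elements. */
    void f(const int a[at_least 32]);

    void caller(void)
    {
    	int small[16];
    	f(small);	/* compilers can warn: 16 < 32 elements */
    }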

* tag 'libcrypto-at-least-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crypto: sha2: Add at_least decoration to fixed-size array params
  lib/crypto: sha1: Add at_least decoration to fixed-size array params
  lib/crypto: poly1305: Add at_least decoration to fixed-size array params
  lib/crypto: md5: Add at_least decoration to fixed-size array params
  lib/crypto: curve25519: Add at_least decoration to fixed-size array params
  lib/crypto: chacha: Add at_least decoration to fixed-size array params
  lib/crypto: chacha20poly1305: Statically check fixed array lengths
  compiler_types: introduce at_least parameter decoration pseudo keyword
  wifi: iwlwifi: trans: rename at_least variable to min_mode
2025-12-02 18:26:54 -08:00
Linus Torvalds
db425f7a0b Crypto library tests for 6.19
- Add KUnit test suites for SHA-3, BLAKE2b, and POLYVAL. These are the
   algorithms that have new crypto library interfaces this cycle.
 
 - Remove the crypto_shash POLYVAL tests. They're no longer needed
   because POLYVAL support was removed from crypto_shash. Better
   POLYVAL test coverage is now provided via the KUnit test suite.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaSur3BQcZWJpZ2dlcnNA
 a2VybmVsLm9yZwAKCRDzXCl4vpKOKxlzAP0cf+vR8aA+nS1LaAC8WTt6skTtwbVF
 J/LPvOzHCLlj2AEA15CIQvBYkAAVCF2JPZiPb43lMMso7PkrjqIszQSunAc=
 =B84P
 -----END PGP SIGNATURE-----

Merge tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull crypto library test updates from Eric Biggers:

 - Add KUnit test suites for SHA-3, BLAKE2b, and POLYVAL. These are the
   algorithms that have new crypto library interfaces this cycle.

 - Remove the crypto_shash POLYVAL tests. They're no longer needed
   because POLYVAL support was removed from crypto_shash. Better POLYVAL
   test coverage is now provided via the KUnit test suite.
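
For context, the general shape of a KUnit suite like these (a generic
skeleton, not the actual lib/crypto test code):

    #include <kunit/test.h>

    static void digest_matches_known_answer(struct kunit *test)
    {
    	/* A real suite feeds test vectors to the library API and
    	 * compares the output against known digests. */
    	KUNIT_EXPECT_EQ(test, 1 + 1, 2);	/* placeholder assertion */
    }

    static struct kunit_case example_test_cases[] = {
    	KUNIT_CASE(digest_matches_known_answer),
    	{}
    };

    static struct kunit_suite example_test_suite = {
    	.name = "example",
    	.test_cases = example_test_cases,
    };
    kunit_test_suite(example_test_suite);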

* tag 'libcrypto-tests-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  crypto: testmgr - Remove polyval tests
  lib/crypto: tests: Add KUnit tests for POLYVAL
  lib/crypto: tests: Add additional SHAKE tests
  lib/crypto: tests: Add SHA3 kunit tests
  lib/crypto: tests: Add KUnit tests for BLAKE2b
2025-12-02 18:20:06 -08:00
Linus Torvalds
5abe8d8efc Crypto library updates for 6.19
This is the main crypto library pull request for 6.19. It includes:
 
 - Add SHA-3 support to lib/crypto/, including support for both the
   hash functions and the extendable-output functions. Reimplement the
   existing SHA-3 crypto_shash support on top of the library.
 
   This is motivated mainly by the upcoming support for the ML-DSA
   signature algorithm, which needs the SHAKE128 and SHAKE256
   functions. But even on its own it's a useful cleanup.
 
   This also fixes the longstanding issue where the
   architecture-optimized SHA-3 code was disabled by default.
 
 - Add BLAKE2b support to lib/crypto/, and reimplement the existing
   BLAKE2b crypto_shash support on top of the library.
 
   This is motivated mainly by btrfs, which supports BLAKE2b checksums.
   With this change, all btrfs checksum algorithms now have library
   APIs. btrfs is planned to start just using the library directly.
 
   This refactor also improves consistency between the BLAKE2b code and
   BLAKE2s code. And as usual, it also fixes the issue where the
   architecture-optimized BLAKE2b code was disabled by default.
 
 - Add POLYVAL support to lib/crypto/, replacing the existing POLYVAL
   support in crypto_shash. Reimplement HCTR2 on top of the library.
 
   This simplifies the code and improves HCTR2 performance. As usual,
   it also makes the architecture-optimized code be enabled by default.
   The generic implementation of POLYVAL is greatly improved as well.
 
 - Clean up the BLAKE2s code.
 
 - Add FIPS self-tests for SHA-1, SHA-2, and SHA-3.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCaSurXBQcZWJpZ2dlcnNA
 a2VybmVsLm9yZwAKCRDzXCl4vpKOK4RhAQC/ZWiFD8MAnyBSVEeo5G5wr835bGl0
 FbXLdIT8orj/mAD+JgO9LaE15THL+Cy7K9VO4XN6HjbKd/7FdshYlrqrag8=
 =qitH
 -----END PGP SIGNATURE-----

Merge tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull crypto library updates from Eric Biggers:
 "This is the main crypto library pull request for 6.19. It includes:

   - Add SHA-3 support to lib/crypto/, including support for both the
     hash functions and the extendable-output functions. Reimplement the
     existing SHA-3 crypto_shash support on top of the library.

     This is motivated mainly by the upcoming support for the ML-DSA
     signature algorithm, which needs the SHAKE128 and SHAKE256
     functions. But even on its own it's a useful cleanup.

     This also fixes the longstanding issue where the
     architecture-optimized SHA-3 code was disabled by default.

   - Add BLAKE2b support to lib/crypto/, and reimplement the existing
     BLAKE2b crypto_shash support on top of the library.

     This is motivated mainly by btrfs, which supports BLAKE2b
     checksums. With this change, all btrfs checksum algorithms now have
      library APIs. btrfs is planned to switch to using the library
      directly.

     This refactor also improves consistency between the BLAKE2b code
     and BLAKE2s code. And as usual, it also fixes the issue where the
     architecture-optimized BLAKE2b code was disabled by default.

   - Add POLYVAL support to lib/crypto/, replacing the existing POLYVAL
     support in crypto_shash. Reimplement HCTR2 on top of the library.

     This simplifies the code and improves HCTR2 performance. As usual,
     it also makes the architecture-optimized code be enabled by
     default. The generic implementation of POLYVAL is greatly improved
     as well.

   - Clean up the BLAKE2s code

   - Add FIPS self-tests for SHA-1, SHA-2, and SHA-3"
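
By analogy with the existing one-shot hash helpers in lib/crypto, callers
get a library interface of roughly the following shape. The SHA-3
function name and signature below are assumptions modeled on sha256();
consult the headers for the actual API:

    /* Assumed one-shot digest call, mirroring the sha256() convention;
     * illustrative only. */
    u8 digest[32];	/* SHA3-256 digest size */

    sha3_256(data, data_len, digest);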

* tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (37 commits)
  fscrypt: Drop obsolete recommendation to enable optimized POLYVAL
  crypto: polyval - Remove the polyval crypto_shash
  crypto: hctr2 - Convert to use POLYVAL library
  lib/crypto: x86/polyval: Migrate optimized code into library
  lib/crypto: arm64/polyval: Migrate optimized code into library
  lib/crypto: polyval: Add POLYVAL library
  crypto: polyval - Rename conflicting functions
  lib/crypto: x86/blake2s: Use vpternlogd for 3-input XORs
  lib/crypto: x86/blake2s: Avoid writing back unchanged 'f' value
  lib/crypto: x86/blake2s: Improve readability
  lib/crypto: x86/blake2s: Use local labels for data
  lib/crypto: x86/blake2s: Drop check for nblocks == 0
  lib/crypto: x86/blake2s: Fix 32-bit arg treated as 64-bit
  lib/crypto: arm, arm64: Drop filenames from file comments
  lib/crypto: arm/blake2s: Fix some comments
  crypto: s390/sha3 - Remove superseded SHA-3 code
  crypto: sha3 - Reimplement using library API
  crypto: jitterentropy - Use default sha3 implementation
  lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions
  lib/crypto: sha3: Support arch overrides of one-shot digest functions
  ...
2025-12-02 18:01:03 -08:00
David Laight
80b61046b6 crypto: lib/mpi - use min() instead of min_t()
min_t(unsigned int, a, b) casts both arguments to 'unsigned int',
truncating an 'unsigned long' argument. Use min(a, b) instead: the
comparison then happens at the promoted common type ('unsigned long'
here), so significant bits cannot be discarded.

In this case the 'unsigned long' value is small enough that the result
is correct either way.

Detected by an extra check added to min_t().
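
A standalone illustration of the pitfall (simplified macro definitions,
not the kernel's; assumes a 64-bit 'long'):

    #include <stdio.h>

    #define min_t(type, a, b) ((type)(a) < (type)(b) ? (type)(a) : (type)(b))
    #define min(a, b)         ((a) < (b) ? (a) : (b))

    int main(void)
    {
    	unsigned long len = 0x100000004UL;	/* does not fit in 32 bits */
    	unsigned int cap = 16;

    	/* The cast truncates len to 4 before comparing: wrong result. */
    	printf("%u\n", min_t(unsigned int, len, cap));	/* prints 4 */

    	/* Comparison happens at 'unsigned long' after promotion. */
    	printf("%lu\n", min(len, cap));	/* prints 16 */
    	return 0;
    }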

Signed-off-by: David Laight <david.laight.linux@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-11-24 17:44:14 +08:00
Jason A. Donenfeld
ac653d57ad lib/crypto: chacha20poly1305: Statically check fixed array lengths
Several parameters of the chacha20poly1305 functions require arrays of
an exact length. Use the new at_least keyword to instruct gcc and
clang to statically check that the caller is passing an object of at
least that length.

Here it is in action, with this faulty patch to wireguard's cookie.h:

     struct cookie_checker {
     	u8 secret[NOISE_HASH_LEN];
    -	u8 cookie_encryption_key[NOISE_SYMMETRIC_KEY_LEN];
    +	u8 cookie_encryption_key[NOISE_SYMMETRIC_KEY_LEN - 1];
     	u8 message_mac1_key[NOISE_SYMMETRIC_KEY_LEN];

If I try compiling this code, I get this helpful warning:

  CC      drivers/net/wireguard/cookie.o
drivers/net/wireguard/cookie.c: In function ‘wg_cookie_message_create’:
drivers/net/wireguard/cookie.c:193:9: warning: ‘xchacha20poly1305_encrypt’ reading 32 bytes from a region of size 31 [-Wstringop-overread]
  193 |         xchacha20poly1305_encrypt(dst->encrypted_cookie, cookie, COOKIE_LEN,
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  194 |                                   macs->mac1, COOKIE_LEN, dst->nonce,
      |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  195 |                                   checker->cookie_encryption_key);
      |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/wireguard/cookie.c:193:9: note: referencing argument 7 of type ‘const u8 *’ {aka ‘const unsigned char *’}
In file included from drivers/net/wireguard/messages.h:10,
                 from drivers/net/wireguard/cookie.h:9,
                 from drivers/net/wireguard/cookie.c:6:
include/crypto/chacha20poly1305.h:28:6: note: in a call to function ‘xchacha20poly1305_encrypt’
   28 | void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len,

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20251123054819.2371989-4-Jason@zx2c4.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-23 12:19:21 -08:00
Eric Biggers
141fbbecec lib/crypto: tests: Fix KMSAN warning in test_sha256_finup_2x()
Fully initialize *ctx, including the buf field which sha256_init()
doesn't initialize, to avoid a KMSAN warning when comparing *ctx to
orig_ctx.  This KMSAN warning slipped in while KMSAN was not working
reliably due to a stackdepot bug, which has now been fixed.
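
The shape of the fix (a sketch using names from the message above; the
actual test code differs in detail):

    /* Zero the whole context up front so the later byte-wise comparison
     * against orig_ctx reads only initialized memory; sha256_init()
     * alone leaves the buf field uninitialized. */
    struct sha256_ctx ctx;

    memset(&ctx, 0, sizeof(ctx));
    sha256_init(&ctx);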

Fixes: 6733968be7 ("lib/crypto: tests: Add tests and benchmark for sha256_finup_2x()")
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251121033431.34406-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-21 10:22:24 -08:00
Ard Biesheuvel
8dcac98a47 lib/crypto: arm64: Move remaining algorithms to scoped ksimd API
Move the arm64 implementations of SHA-3 and POLYVAL to the newly
introduced scoped ksimd API, which replaces kernel_neon_begin() and
kernel_neon_end(). On arm64, this is needed because the latter API
will change in an incompatible manner.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12 10:14:11 -08:00
Ard Biesheuvel
c0d597e016 lib/crypto: arm/blake2b: Move to scoped ksimd API
Even though ARM's versions of kernel_neon_begin()/_end() are not being
changed, update the newly migrated ARM blake2b code to the scoped ksimd
API so that all ARM and arm64 code in lib/crypto remains consistent in
this respect.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12 09:57:52 -08:00
Eric Biggers
065f040010 Introduce scoped ksimd API for ARM and arm64
Introduce a more strict replacement API for
 kernel_neon_begin()/kernel_neon_end() on both ARM and arm64, and replace
 occurrences of the latter pair appearing in lib/crypto
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQQm/3uucuRGn1Dmh0wbglWLn0tXAUCaRRKpgAKCRAwbglWLn0t
 XAKXAQD/L/XErOIGgSnvJnxG9sF+V2S+id1u9aoEJApbqMvW/gD9Fnvjqa7mRM7f
 jSZeDCMB++24SS2zL0/BFiRMmEl5/gc=
 =0IKE
 -----END PGP SIGNATURE-----

Merge tag 'scoped-ksimd-for-arm-arm64' into libcrypto-fpsimd-on-stack

Pull scoped ksimd API for ARM and arm64 from Ard Biesheuvel:

  "Introduce a more strict replacement API for
   kernel_neon_begin()/kernel_neon_end() on both ARM and arm64, and
   replace occurrences of the latter pair appearing in lib/crypto"

Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-12 09:55:55 -08:00