Commit graph

110 commits

Author SHA1 Message Date
Eric Biggers
24eb22d816 lib/crypto: x86/aes: Add AES-NI optimization
Optimize the AES library with x86 AES-NI instructions.

The relevant existing assembly functions, aesni_set_key(), aesni_enc(),
and aesni_dec(), are a bit difficult to extract into the library:

- They're coupled to the code for the AES modes.
- They operate on struct crypto_aes_ctx.  The AES library now uses
  different structs.
- They assume the key is 16-byte aligned.  The AES library only
  *prefers* 16-byte alignment; it doesn't require it.

Moreover, they're not all that great in the first place:

- They use unrolled loops, which aren't a great choice on x86.
- They use the 'aeskeygenassist' instruction, which is unnecessary, is
  slow on Intel CPUs, and forces the loop to be unrolled.
- They have special code for AES-192 key expansion, despite that being
  kind of useless.  AES-128 and AES-256 are the ones used in practice.

These are small functions anyway.

Therefore, I opted to just write library replacements for these
functions.  They address all the above issues.
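
For illustration, here is a minimal sketch (using intrinsics for
clarity; the actual kernel code is assembly, and the function name is
made up) of how SubWord() can be computed with aesenclast instead of
the slow aeskeygenassist:

    #include <immintrin.h>  /* compile with -maes */

    static __m128i subword_bcast(__m128i v)
    {
        /*
         * Broadcast the high dword into all four columns.  With all
         * columns equal, every row of the AES state is constant, so
         * ShiftRows is a no-op, and AESENCLAST with a zero round key
         * reduces to SubBytes, i.e. SubWord() of the broadcast word.
         */
        v = _mm_shuffle_epi32(v, 0xff);
        return _mm_aesenclast_si128(v, _mm_setzero_si128());
    }

RotWord() is then just a byte shuffle and the Rcon addition is an XOR,
so the whole key schedule can be computed in a compact loop.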

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-18-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15 14:09:07 -08:00
Eric Biggers
293c7cd5c6 lib/crypto: sparc/aes: Migrate optimized code into library
Move the SPARC64 AES assembly code into lib/crypto/, wire the key
expansion and single-block en/decryption functions up to the AES library
API, and remove the "aes-sparc64" crypto_cipher algorithm.

The result is that both the AES library and crypto_cipher APIs use the
SPARC64 AES opcodes, whereas previously only crypto_cipher did (and it
wasn't enabled by default, which this commit fixes as well).

Note that some of the functions in the SPARC64 AES assembly code are
still used by the AES mode implementations in
arch/sparc/crypto/aes_glue.c.  For now, just export these functions.
These exports will go away once the AES mode implementations are
migrated to the library as well.  (Trying to split up the assembly file
seemed like much more trouble than it would be worth.)

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-17-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15 14:09:07 -08:00
Eric Biggers
0cab15611e lib/crypto: s390/aes: Migrate optimized code into library
Implement aes_preparekey_arch(), aes_encrypt_arch(), and
aes_decrypt_arch() using the CPACF AES instructions.

Then, remove the superseded "aes-s390" crypto_cipher.

The result is that both the AES library and crypto_cipher APIs use the
CPACF AES instructions, whereas previously only crypto_cipher did (and
it wasn't enabled by default, which this commit fixes as well).

Note that this preserves the optimization where the AES key is stored in
raw form rather than expanded form.  CPACF just takes the raw key.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Link: https://lore.kernel.org/r/20260112192035.10427-16-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-15 14:08:55 -08:00
Eric Biggers
a4e573db06 lib/crypto: riscv/aes: Migrate optimized code into library
Move the aes_encrypt_zvkned() and aes_decrypt_zvkned() assembly
functions into lib/crypto/, wire them up to the AES library API, and
remove the "aes-riscv64-zvkned" crypto_cipher algorithm.

To make this possible, change the prototypes of these functions to
take (rndkeys, key_len) instead of a pointer to crypto_aes_ctx, and
change the RISC-V AES-XTS code to implement tweak encryption using the
AES library instead of directly calling aes_encrypt_zvkned().

The result is that both the AES library and crypto_cipher APIs use
RISC-V's AES instructions, whereas previously only crypto_cipher did
(and it wasn't enabled by default, which this commit fixes as well).

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-15-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:39:58 -08:00
Eric Biggers
7cf2082e74 lib/crypto: powerpc/aes: Migrate POWER8 optimized code into library
Move the POWER8 AES assembly code into lib/crypto/, wire the key
expansion and single-block en/decryption functions up to the AES library
API, and remove the superseded "p8_aes" crypto_cipher algorithm.

The result is that both the AES library and crypto_cipher APIs are now
optimized for POWER8, whereas previously only crypto_cipher was (and
optimizations weren't enabled by default, which this commit fixes too).

Note that many of the functions in the POWER8 assembly code are still
used by the AES mode implementations in arch/powerpc/crypto/.  For now,
just export these functions.  These exports will go away once the AES
modes are migrated to the library as well.  (Trying to split up the
assembly file seemed like much more trouble than it would be worth.)

Another challenge with this code is that the POWER8 assembly code uses a
custom format for the expanded AES key.  Since that code is imported
from OpenSSL and is also targeted to POWER8 (rather than POWER9 which
has better data movement and byteswap instructions), that is not easily
changed.  For now I've just kept the custom format.  To maintain full
correctness, this requires executing some slow fallback code in the case
where the usability of VSX changes between key expansion and use.  This
should be tolerable, as this case shouldn't happen in practice.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:39:58 -08:00
Eric Biggers
0892c91b81 lib/crypto: powerpc/aes: Migrate SPE optimized code into library
Move the PowerPC SPE AES assembly code into lib/crypto/, wire the key
expansion and single-block en/decryption functions up to the AES library
API, and remove the superseded "aes-ppc-spe" crypto_cipher algorithm.

The result is that both the AES library and crypto_cipher APIs are now
optimized with SPE, whereas previously only crypto_cipher was (and
optimizations weren't enabled by default, which this commit fixes too).

Note that many of the functions in the PowerPC SPE assembly code are
still used by the AES mode implementations in arch/powerpc/crypto/.  For
now, just export these functions.  These exports will go away once the
AES modes are migrated to the library as well.  (Trying to split up the
assembly files seemed like much more trouble than it would be worth.)

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:39:58 -08:00
Eric Biggers
2b1ef7aeeb lib/crypto: arm64/aes: Migrate optimized code into library
Move the ARM64 optimized AES key expansion and single-block AES
en/decryption code into lib/crypto/, wire it up to the AES library API,
and remove the superseded crypto_cipher algorithms.

The result is that both the AES library and crypto_cipher APIs are now
optimized for ARM64, whereas previously only crypto_cipher was (and the
optimizations weren't enabled by default, which this fixes as well).

Note: to see the diff from arch/arm64/crypto/aes-ce-glue.c to
lib/crypto/arm64/aes.h, view this commit with 'git show -M10'.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:39:58 -08:00
Eric Biggers
fa2297750c lib/crypto: arm/aes: Migrate optimized code into library
Move the ARM optimized single-block AES en/decryption code into
lib/crypto/, wire it up to the AES library API, and remove the
superseded "aes-arm" crypto_cipher algorithm.

The result is that both the AES library and crypto_cipher APIs are now
optimized for ARM, whereas previously only crypto_cipher was (and the
optimizations weren't enabled by default, which this fixes as well).

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:39:58 -08:00
Eric Biggers
a22fd0e3c4 lib/crypto: aes: Introduce improved AES library
The kernel's AES library currently has the following issues:

- It doesn't take advantage of the architecture-optimized AES code,
  including the implementations using AES instructions.

- It's much slower than even the other software AES implementations: 2-4
  times slower than "aes-generic", "aes-arm", and "aes-arm64".

- It requires that both the encryption and decryption round keys be
  computed and cached.  This is wasteful for users that need only the
  forward (encryption) direction of the cipher: the key struct is 484
  bytes when only 244 are actually needed.  This missed optimization is
  very common, as many AES modes (e.g. GCM, CFB, CTR, CMAC, and even the
  tweak key in XTS) use the cipher only in the forward (encryption)
  direction even when doing decryption.

- It doesn't provide the flexibility to customize the prepared key
  format.  The API is defined to do key expansion, and several callers
  in drivers/crypto/ use it specifically to expand the key.  This is an
  issue when integrating the existing powerpc, s390, and sparc code,
  which is necessary to provide full parity with the traditional API.

To resolve these issues, I'm proposing the following changes:

1. New structs 'aes_key' and 'aes_enckey' are introduced, with
   corresponding functions aes_preparekey() and aes_prepareenckey().

   Generally these structs will include the encryption+decryption round
   keys and the encryption round keys, respectively.  However, the exact
   format will be under control of the architecture-specific AES code.

   (The verb "prepare" is chosen over "expand" since key expansion isn't
   necessarily done.  It's also consistent with hmac*_preparekey().)

2. aes_encrypt() and aes_decrypt() will be changed to operate on the new
   structs instead of struct crypto_aes_ctx.

3. aes_encrypt() and aes_decrypt() will use architecture-optimized code
   when available, or else fall back to a new generic AES implementation
   that unifies the existing two fragmented generic AES implementations.

   The new generic AES implementation uses tables for both SubBytes and
   MixColumns, making it almost as fast as "aes-generic".  However,
   instead of aes-generic's huge 8192-byte tables per direction, it uses
   only 1024 bytes for encryption and 1280 bytes for decryption (similar
   to "aes-arm").  The cost is just some extra rotations.

   The new generic AES implementation also includes table prefetching,
   giving it some "constant-time hardening".  That's an improvement over
   aes-generic, which has no constant-time hardening.

   It does slightly regress in constant-time hardening vs. the old
   lib/crypto/aes.c, which had smaller tables, and vs. aes-fixed-time,
   which disabled IRQs on top of that.  But I think this is tolerable.
   The real solutions for constant-time AES are AES instructions or
   bit-slicing.  The table-based code remains a best-effort fallback for
   the increasingly-rare case where a real solution is unavailable.

4. crypto_aes_ctx and aes_expandkey() will remain for now, but only for
   callers that are using them specifically for the AES key expansion
   (as opposed to en/decrypting data with the AES library).
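
For illustration, the rotation trick from item 3 looks roughly like
this (a sketch with made-up names, not the actual patch code; rol32()
is the kernel helper from <linux/bitops.h>):

    /* enc_table[x] holds the MixColumns column for Sbox[x]: 1024 bytes. */
    static const u32 enc_table[256];

    static u32 enc_column(u32 s0, u32 s1, u32 s2, u32 s3)
    {
        /*
         * The other three byte positions reuse the same table via
         * rotations, instead of needing three more 1024-byte tables.
         */
        return enc_table[s0 & 0xff] ^
               rol32(enc_table[(s1 >> 8) & 0xff], 8) ^
               rol32(enc_table[(s2 >> 16) & 0xff], 16) ^
               rol32(enc_table[s3 >> 24], 24);
    }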

This commit begins the migration process by introducing the new structs
and functions, backed by the new generic AES implementation.

To allow callers to be incrementally converted, aes_encrypt() and
aes_decrypt() are temporarily changed into macros that use a _Generic
expression to call either the old functions (which take crypto_aes_ctx)
or the new functions (which take the new types).  Once all callers have
been updated, these macros will go away, the old functions will be
removed, and the "_new" suffix will be dropped from the new functions.
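
A hedged sketch of what this transitional arrangement might look like
(the old-function names and the exact parameter lists here are
assumptions, not quotes from the patch):

    /* Transitional dispatch macro: */
    #define aes_encrypt(ctx, dst, src)                              \
        _Generic((ctx),                                             \
                 const struct crypto_aes_ctx *: aes_encrypt_old,    \
                 struct crypto_aes_ctx *: aes_encrypt_old,          \
                 default: aes_encrypt_new)((ctx), (dst), (src))

    /* A caller already converted to the new types: */
    struct aes_enckey key;
    int err;

    err = aes_prepareenckey(&key, raw_key, raw_key_len);
    if (err)
        return err;
    aes_encrypt(&key, dst, src);    /* one 16-byte block */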

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260112192035.10427-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:39:58 -08:00
Eric Biggers
0d92c55532 lib/crypto: nh: Restore dependency of arch code on !KMSAN
Since the architecture-specific implementations of NH initialize memory
in assembly code, they aren't compatible with KMSAN as-is.

Fixes: 382de740759a ("lib/crypto: nh: Add NH library")
Link: https://lore.kernel.org/r/20260105053652.1708299-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:50 -08:00
Eric Biggers
a229d83235 lib/crypto: x86/nh: Migrate optimized code into library
Migrate the x86_64 implementations of NH into lib/crypto/, so that the
nh() function is optimized on x86_64 kernels.

Note: this temporarily makes the adiantum template not utilize the
x86_64 optimized NH code.  This is resolved in a later commit that
converts the adiantum template to use nh() instead of "nhpoly1305".

Link: https://lore.kernel.org/r/20251211011846.8179-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:50 -08:00
Eric Biggers
b4a8528d17 lib/crypto: arm64/nh: Migrate optimized code into library
Migrate the arm64 NEON implementation of NH into lib/crypto/, so that
the nh() function is optimized on arm64 kernels.

Note: this temporarily makes the adiantum template not utilize the arm64
optimized NH code.  This is resolved in a later commit that converts the
adiantum template to use nh() instead of "nhpoly1305".

Link: https://lore.kernel.org/r/20251211011846.8179-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:50 -08:00
Eric Biggers
29e39a11f5 lib/crypto: arm/nh: Migrate optimized code into library
Migrate the arm32 NEON implementation of NH into lib/crypto/, so that
the nh() function is optimized on arm32 kernels.

Note: this temporarily makes the adiantum template not utilize the arm32
optimized NH code.  This is resolved in a later commit that converts the
adiantum template to use nh() instead of "nhpoly1305".

Link: https://lore.kernel.org/r/20251211011846.8179-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:49 -08:00
Eric Biggers
14e15c71d7 lib/crypto: nh: Add NH library
Add support for the NH "almost-universal hash function" to lib/crypto/,
specifically the variant of NH used in Adiantum.

This will replace the need for the "nhpoly1305" crypto_shash algorithm.
All the implementations of "nhpoly1305" use architecture-optimized code
only for the NH stage; they just use the generic C Poly1305 code for the
Poly1305 stage.  We can achieve the same result in a simpler way using
an (architecture-optimized) nh() function combined with code in
crypto/adiantum.c that passes the results to the Poly1305 library.

This commit begins this cleanup by adding the nh() function.  The code
is derived from crypto/nhpoly1305.c and include/crypto/nhpoly1305.h.
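
For reference, the core of NH is just 32-bit additions and
32x32->64-bit multiplications.  A simplified single-pass sketch (the
Adiantum variant does four such passes over stride-2 word pairs;
get_unaligned_le32() is from <linux/unaligned.h>, and the function name
is made up):

    static u64 nh_pass(const u32 *key, const u8 *msg, size_t len)
    {
        u64 sum = 0;

        /*
         * Add message and key words mod 2^32, multiply each pair to
         * 64 bits, and accumulate mod 2^64.
         */
        for (; len >= 8; len -= 8, msg += 8, key += 2)
            sum += (u64)(get_unaligned_le32(msg) + key[0]) *
                   (get_unaligned_le32(msg + 4) + key[1]);
        return sum;
    }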

Link: https://lore.kernel.org/r/20251211011846.8179-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:49 -08:00
Eric Biggers
64edccea59 lib/crypto: Add ML-DSA verification support
Add support for verifying ML-DSA signatures.

ML-DSA (Module-Lattice-Based Digital Signature Algorithm) is specified
in FIPS 204 and is the standard version of Dilithium.  Unlike RSA and
elliptic-curve cryptography, ML-DSA is believed to be secure even
against adversaries in possession of a large-scale quantum computer.

Compared to the earlier patch
(https://lore.kernel.org/r/20251117145606.2155773-3-dhowells@redhat.com/)
that was based on "leancrypto", this implementation:

  - Is about 700 lines of source code instead of 4800.

  - Generates about 4 KB of object code instead of 28 KB.

  - Uses 9-13 KB of memory to verify a signature instead of 31-84 KB.

  - Is at least about the same speed, with a microbenchmark showing 3-5%
    improvements on one x86_64 CPU and -1% to 1% changes on another.
    When memory is a bottleneck, it's likely much faster.

  - Correctly implements the RejNTTPoly step of the algorithm.

The API just consists of a single function mldsa_verify(), supporting
pure ML-DSA with any standard parameter set (ML-DSA-44, ML-DSA-65, or
ML-DSA-87) as selected by an enum.  That's all that's actually needed.

The following four potential features are unneeded and aren't included.
However, any that ever become needed could fairly easily be added later,
as they only affect how the message representative mu is calculated:

  - Nonempty context strings
  - Incremental message hashing
  - HashML-DSA
  - External mu

Signing support would, of course, be a larger and more complex addition.
However, the kernel doesn't, and shouldn't, need ML-DSA signing support.

Note that mldsa_verify() allocates memory, so it can sleep and can fail
with ENOMEM.  Unfortunately we don't have much choice about that, since
ML-DSA needs a lot of memory.  At least callers have to check for errors
anyway, since the signature could be invalid.
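
A hedged usage sketch (the function name and parameter-set enum come
from the text above, but the exact argument list and error codes are
assumptions):

    err = mldsa_verify(MLDSA_65, pub_key, sig, sig_len, msg, msg_len);
    if (err) {
        /*
         * Either an environmental failure such as -ENOMEM, or the
         * signature is simply invalid; callers must handle errors in
         * any case.
         */
        return err;
    }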

Note that verification doesn't require constant-time code, and in fact
some steps are inherently variable-time.  I've used constant-time
patterns in some places anyway, but technically they're not needed.

Reviewed-by: David Howells <dhowells@redhat.com>
Tested-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20251214181712.29132-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-01-12 11:07:49 -08:00
Eric Biggers
1cd5bb6e9e lib/crypto: riscv: Depend on RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
Replace the RISCV_ISA_V dependency of the RISC-V crypto code with
RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS, which implies RISCV_ISA_V as
well as vector unaligned accesses being efficient.

This is necessary because this code assumes that vector unaligned
accesses are supported and are efficient.  (It does so to avoid having
to use lots of extra vsetvli instructions to switch the element width
back and forth between 8 and either 32 or 64.)

This was omitted from the code originally just because the RISC-V kernel
support for detecting this feature didn't exist yet.  Support has now
been added, but it's fragmented into per-CPU runtime detection, a
command-line parameter, and a kconfig option.  The kconfig option is the
only reasonable way to do it, though, so let's just rely on that.

Fixes: eb24af5d7a ("crypto: riscv - add vector crypto accelerated AES-{ECB,CBC,CTR,XTS}")
Fixes: bb54668837 ("crypto: riscv - add vector crypto accelerated ChaCha20")
Fixes: 600a3853df ("crypto: riscv - add vector crypto accelerated GHASH")
Fixes: 8c8e40470f ("crypto: riscv - add vector crypto accelerated SHA-{256,224}")
Fixes: b3415925a0 ("crypto: riscv - add vector crypto accelerated SHA-{512,384}")
Fixes: 563a5255af ("crypto: riscv - add vector crypto accelerated SM3")
Fixes: b8d06352bb ("crypto: riscv - add vector crypto accelerated SM4")
Cc: stable@vger.kernel.org
Reported-by: Vivian Wang <wangruikang@iscas.ac.cn>
Closes: https://lore.kernel.org/r/b3cfcdac-0337-4db0-a611-258f2868855f@iscas.ac.cn/
Reviewed-by: Jerry Shih <jerry.shih@sifive.com>
Link: https://lore.kernel.org/r/20251206213750.81474-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-09 15:10:21 -08:00
Linus Torvalds
5abe8d8efc Crypto library updates for 6.19
This is the main crypto library pull request for 6.19. It includes:
 
 - Add SHA-3 support to lib/crypto/, including support for both the
   hash functions and the extendable-output functions. Reimplement the
   existing SHA-3 crypto_shash support on top of the library.
 
   This is motivated mainly by the upcoming support for the ML-DSA
   signature algorithm, which needs the SHAKE128 and SHAKE256
   functions. But even on its own it's a useful cleanup.
 
   This also fixes the longstanding issue where the
   architecture-optimized SHA-3 code was disabled by default.
 
 - Add BLAKE2b support to lib/crypto/, and reimplement the existing
   BLAKE2b crypto_shash support on top of the library.
 
   This is motivated mainly by btrfs, which supports BLAKE2b checksums.
   With this change, all btrfs checksum algorithms now have library
   APIs. btrfs is planned to start just using the library directly.
 
   This refactor also improves consistency between the BLAKE2b code and
   BLAKE2s code. And as usual, it also fixes the issue where the
   architecture-optimized BLAKE2b code was disabled by default.
 
 - Add POLYVAL support to lib/crypto/, replacing the existing POLYVAL
   support in crypto_shash. Reimplement HCTR2 on top of the library.
 
   This simplifies the code and improves HCTR2 performance. As usual,
   it also makes the architecture-optimized code be enabled by default.
   The generic implementation of POLYVAL is greatly improved as well.
 
 - Clean up the BLAKE2s code.
 
 - Add FIPS self-tests for SHA-1, SHA-2, and SHA-3.

Merge tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull crypto library updates from Eric Biggers:
 "This is the main crypto library pull request for 6.19. It includes:

   - Add SHA-3 support to lib/crypto/, including support for both the
     hash functions and the extendable-output functions. Reimplement the
     existing SHA-3 crypto_shash support on top of the library.

     This is motivated mainly by the upcoming support for the ML-DSA
     signature algorithm, which needs the SHAKE128 and SHAKE256
     functions. But even on its own it's a useful cleanup.

     This also fixes the longstanding issue where the
     architecture-optimized SHA-3 code was disabled by default.

   - Add BLAKE2b support to lib/crypto/, and reimplement the existing
     BLAKE2b crypto_shash support on top of the library.

     This is motivated mainly by btrfs, which supports BLAKE2b
     checksums. With this change, all btrfs checksum algorithms now have
     library APIs. btrfs is planned to start just using the library
     directly.

     This refactor also improves consistency between the BLAKE2b code
     and BLAKE2s code. And as usual, it also fixes the issue where the
     architecture-optimized BLAKE2b code was disabled by default.

   - Add POLYVAL support to lib/crypto/, replacing the existing POLYVAL
     support in crypto_shash. Reimplement HCTR2 on top of the library.

     This simplifies the code and improves HCTR2 performance. As usual,
     it also makes the architecture-optimized code be enabled by
     default. The generic implementation of POLYVAL is greatly improved
     as well.

   - Clean up the BLAKE2s code

   - Add FIPS self-tests for SHA-1, SHA-2, and SHA-3"

* tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (37 commits)
  fscrypt: Drop obsolete recommendation to enable optimized POLYVAL
  crypto: polyval - Remove the polyval crypto_shash
  crypto: hctr2 - Convert to use POLYVAL library
  lib/crypto: x86/polyval: Migrate optimized code into library
  lib/crypto: arm64/polyval: Migrate optimized code into library
  lib/crypto: polyval: Add POLYVAL library
  crypto: polyval - Rename conflicting functions
  lib/crypto: x86/blake2s: Use vpternlogd for 3-input XORs
  lib/crypto: x86/blake2s: Avoid writing back unchanged 'f' value
  lib/crypto: x86/blake2s: Improve readability
  lib/crypto: x86/blake2s: Use local labels for data
  lib/crypto: x86/blake2s: Drop check for nblocks == 0
  lib/crypto: x86/blake2s: Fix 32-bit arg treated as 64-bit
  lib/crypto: arm, arm64: Drop filenames from file comments
  lib/crypto: arm/blake2s: Fix some comments
  crypto: s390/sha3 - Remove superseded SHA-3 code
  crypto: sha3 - Reimplement using library API
  crypto: jitterentropy - Use default sha3 implementation
  lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions
  lib/crypto: sha3: Support arch overrides of one-shot digest functions
  ...
2025-12-02 18:01:03 -08:00
Eric Biggers
4d8da35579 lib/crypto: x86/polyval: Migrate optimized code into library
Migrate the x86_64 implementation of POLYVAL into lib/crypto/, wiring it
up to the POLYVAL library interface.  This makes the POLYVAL library be
properly optimized on x86_64.

This drops the x86_64 optimizations of polyval in the crypto_shash API.
That's fine, since polyval will be removed from crypto_shash entirely,
as it is unneeded there.  But even if it comes back, the crypto_shash
API could just be implemented on top of the library API, as usual.

Adjust the names and prototypes of the assembly functions to align more
closely with the rest of the library code.

Also replace a movaps instruction with movups to remove the assumption
that the key struct is 16-byte aligned.  Users can still align the key
if they want (and at least in this case, movups is just as fast as
movaps), but it's inconvenient to require it.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Eric Biggers
37919e239e lib/crypto: arm64/polyval: Migrate optimized code into library
Migrate the arm64 implementation of POLYVAL into lib/crypto/, wiring it
up to the POLYVAL library interface.  This makes the POLYVAL library be
properly optimized on arm64.

This drops the arm64 optimizations of polyval in the crypto_shash API.
That's fine, since polyval will be removed from crypto_shash entirely,
as it is unneeded there.  But even if it comes back, the crypto_shash
API could just be implemented on top of the library API, as usual.

Adjust the names and prototypes of the assembly functions to align more
closely with the rest of the library code.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Eric Biggers
3d176751e5 lib/crypto: polyval: Add POLYVAL library
Add support for POLYVAL to lib/crypto/.

This will replace the polyval crypto_shash algorithm and its use in the
hctr2 template, simplifying the code and reducing overhead.

Specifically, this commit introduces the POLYVAL library API and a
generic implementation of it.  Later commits will migrate the existing
architecture-optimized implementations of POLYVAL into lib/crypto/ and
add a KUnit test suite.

I've also rewritten the generic implementation completely, using a more
modern approach instead of the traditional table-based approach.  It's
now constant-time, requires no precomputation or dynamic memory
allocations, decreases the per-key memory usage from 4096 bytes to 16
bytes, and is faster than the old polyval-generic even on bulk data
reusing the same key (at least on x86_64, where I measured a 15%
speedup).
We should do this for GHASH too, but for now just do it for POLYVAL.
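
For reference, the structure being computed is the one from RFC 8452
(sketched below with made-up names; gf128_mul_polyval() stands in for
the new constant-time field multiplication, and crypto_xor() is the
kernel helper):

    /*
     * S_0 = 0;  S_i = (S_{i-1} XOR X_i) * H * x^-128
     * in GF(2^128) mod x^128 + x^127 + x^126 + x^121 + 1
     */
    for (i = 0; i < nblocks; i++) {
        crypto_xor(acc, msg + 16 * i, 16);  /* acc ^= X_i */
        gf128_mul_polyval(acc, key_h);      /* acc = acc * H * x^-128 */
    }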

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251109234726.638437-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-11 11:03:38 -08:00
Eric Biggers
04171105d3 lib/crypto: s390/sha3: Add optimized Keccak functions
Implement sha3_absorb_blocks() and sha3_keccakf() using the hardware-
accelerated SHA-3 support in Message-Security-Assist Extension 6.

This accelerates the SHA3-224, SHA3-256, SHA3-384, SHA3-512, and
SHAKE256 library functions.

Note that arch/s390/crypto/ already has SHA-3 code that uses this
extension, but it is exposed only via crypto_shash.  This commit brings
the same acceleration to the SHA-3 library.  The arch/s390/crypto/
version will become redundant and be removed in later changes.

Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
Eric Biggers
1e29a75057 lib/crypto: arm64/sha3: Migrate optimized code into library
Instead of exposing the arm64-optimized SHA-3 code via arm64-specific
crypto_shash algorithms, just implement the sha3_absorb_blocks() and
sha3_keccakf() library functions.  This is much simpler, it makes the
SHA-3 library functions arm64-optimized, and it fixes the longstanding
issue where the arm64-optimized SHA-3 code was disabled by default.
SHA-3 remains available through crypto_shash, but individual
architectures no longer need to handle it.

Note: to see the diff from arch/arm64/crypto/sha3-ce-glue.c to
lib/crypto/arm64/sha3.h, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:35 -08:00
David Howells
0593447248 lib/crypto: sha3: Add SHA-3 support
Add SHA-3 support to lib/crypto/.  All six algorithms in the SHA-3
family are supported: four digests (SHA3-224, SHA3-256, SHA3-384, and
SHA3-512) and two extendable-output functions (SHAKE128 and SHAKE256).

The SHAKE algorithms will be required for ML-DSA.

[EB: simplified the API to use fewer types and functions, fixed bug that
     sometimes caused incorrect SHAKE output, cleaned up the
     documentation, dropped an ad-hoc test that was inconsistent with
     the rest of lib/crypto/, and many other cleanups]

Signed-off-by: David Howells <dhowells@redhat.com>
Co-developed-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251026055032.1413733-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-05 20:02:32 -08:00
Eric Biggers
44e8241c51 lib/crypto: arm/curve25519: Disable on CPU_BIG_ENDIAN
On big endian arm kernels, the arm optimized Curve25519 code produces
incorrect outputs and fails the Curve25519 test.  This has been true
ever since this code was added.

It seems that hardly anyone (or even no one?) actually uses big endian
arm kernels.  But as long as they're ostensibly supported, we should
disable this code on them so that it's not accidentally used.

Note: for future-proofing, use !CPU_BIG_ENDIAN instead of
CPU_LITTLE_ENDIAN.  Both of these are arch-specific options that could
get removed in the future if big endian support gets dropped.

Fixes: d8f1308a02 ("crypto: arm/curve25519 - wire up NEON implementation")
Cc: stable@vger.kernel.org
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251104054906.716914-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-11-04 09:36:22 -08:00
Eric Biggers
ba6617bd47 lib/crypto: arm/blake2b: Migrate optimized code into library
Migrate the arm-optimized BLAKE2b code from arch/arm/crypto/ to
lib/crypto/arm/.  This makes the BLAKE2b library able to use it, and it
also simplifies the code because it's easier to integrate with the
library than crypto_shash.

This temporarily makes the arm-optimized BLAKE2b code unavailable via
crypto_shash.  A later commit reimplements the blake2b-* crypto_shash
algorithms on top of the BLAKE2b library API, making it available again.

Note that as per the lib/crypto/ convention, the optimized code is now
enabled by default.  So, this also fixes the longstanding issue where
the optimized BLAKE2b code was not enabled by default.

To see the diff from arch/arm/crypto/blake2b-neon-glue.c to
lib/crypto/arm/blake2b.h, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
23a16c9533 lib/crypto: blake2b: Add BLAKE2b library functions
Add a library API for BLAKE2b, closely modeled after the BLAKE2s API.

This will allow in-kernel users such as btrfs to use BLAKE2b without
going through the generic crypto layer.  In addition, as usual the
BLAKE2b crypto_shash algorithms will be reimplemented on top of this.

Note: to create lib/crypto/blake2b.c I made a copy of
lib/crypto/blake2s.c and made the updates from BLAKE2s => BLAKE2b.  This
way, the BLAKE2s and BLAKE2b code is kept consistent.  Therefore, it
borrows the SPDX-License-Identifier and Copyright from
lib/crypto/blake2s.c rather than crypto/blake2b_generic.c.

The library API uses 'struct blake2b_ctx', consistent with other
lib/crypto/ APIs.  The existing 'struct blake2b_state' will be removed
once the blake2b crypto_shash algorithms are updated to stop using it.
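
Since the API mirrors BLAKE2s, usage presumably looks like this (a
hedged sketch; apart from 'struct blake2b_ctx', the names are
assumptions modeled on the BLAKE2s API):

    struct blake2b_ctx ctx;
    u8 hash[BLAKE2B_HASH_SIZE];

    blake2b_init(&ctx, sizeof(hash));   /* unkeyed, full-length digest */
    blake2b_update(&ctx, data, data_len);
    blake2b_final(&ctx, hash);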

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251018043106.375964-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-29 22:04:24 -07:00
Eric Biggers
1af424b154 lib/crypto: poly1305: Restore dependency of arch code on !KMSAN
Restore the dependency of the architecture-optimized Poly1305 code on
!KMSAN.  It was dropped by commit b646b782e5 ("lib/crypto: poly1305:
Consolidate into single module").

Unlike the other hash algorithms in lib/crypto/ (e.g., SHA-512), the way
the architecture-optimized Poly1305 code is integrated results in
assembly code initializing memory, for several different architectures.
Thus, it generates false positive KMSAN warnings.  These could be
suppressed with kmsan_unpoison_memory(), but it would be needed in quite
a few places.  For now let's just restore the dependency on !KMSAN.

Note: this should have been caught by running poly1305_kunit with
CONFIG_KMSAN=y, which I did.  However, due to an unrelated KMSAN bug
(https://lore.kernel.org/r/20251022030213.GA35717@sol/), KMSAN currently
isn't working reliably.  Thus, the warning wasn't noticed until later.

Fixes: b646b782e5 ("lib/crypto: poly1305: Consolidate into single module")
Reported-by: syzbot+01fcd39a0d90cdb0e3df@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/68f6a48f.050a0220.91a22.0452.GAE@google.com/
Reported-by: Pei Xiao <xiaopei01@kylinos.cn>
Closes: https://lore.kernel.org/r/751b3d80293a6f599bb07770afcef24f623c7da0.1761026343.git.xiaopei01@kylinos.cn/
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251022033405.64761-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-10-22 10:52:10 -07:00
Eric Biggers
68546e5632 lib/crypto: curve25519: Consolidate into single module
Reorganize the Curve25519 library code:

- Build a single libcurve25519 module, instead of up to three modules:
  libcurve25519, libcurve25519-generic, and an arch-specific module.

- Move the arch-specific Curve25519 code from arch/$(SRCARCH)/crypto/ to
  lib/crypto/$(SRCARCH)/.  Centralize the build rules into
  lib/crypto/Makefile and lib/crypto/Kconfig.

- Include the arch-specific code directly in lib/crypto/curve25519.c via
  a header, rather than using a separate .c file.

- Eliminate the entanglement with CRYPTO.  CRYPTO_LIB_CURVE25519 no
  longer selects CRYPTO, and the arch-specific Curve25519 code no longer
  depends on CRYPTO.

This brings Curve25519 in line with the latest conventions for
lib/crypto/, used by other algorithms.  The exception is that I kept the
generic code in separate translation units for now.  (Some of the
function names collide between the x86 and generic Curve25519 code.  And
the Curve25519 functions are very long anyway, so inlining doesn't
matter as much for Curve25519 as it does for some other algorithms.)

Link: https://lore.kernel.org/r/20250906213523.84915-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-09-06 16:32:43 -07:00
Eric Biggers
39ee3970f2 lib/crypto: blake2s: Consolidate into single C translation unit
As was done with the other algorithms, reorganize the BLAKE2s code so
that the generic implementation and the arch-specific "glue" code is
consolidated into a single translation unit, so that the compiler will
inline the functions and automatically decide whether to include the
generic code in the resulting binary or not.

Similarly, also consolidate the build rules into
lib/crypto/{Makefile,Kconfig}.  This removes the last uses of
lib/crypto/{arm,x86}/{Makefile,Kconfig}, so remove those too.

Don't keep the !KMSAN dependency.  It was needed only for other
algorithms such as ChaCha that initialize memory from assembly code.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-08-29 09:50:19 -07:00
Eric Biggers
13cecc526d lib/crypto: chacha: Consolidate into single module
Consolidate the ChaCha code into a single module (excluding
chacha-block-generic.c which remains always built-in for random.c),
similar to various other algorithms:

- Each arch now provides a header file lib/crypto/$(SRCARCH)/chacha.h,
  replacing lib/crypto/$(SRCARCH)/chacha*.c.  The header defines
  chacha_crypt_arch() and hchacha_block_arch().  It is included by
  lib/crypto/chacha.c, and thus the code gets built into the single
  libchacha module, with improved inlining in some cases.

- Whether arch-optimized ChaCha is buildable is now controlled centrally
  by lib/crypto/Kconfig instead of by lib/crypto/$(SRCARCH)/Kconfig.
  The conditions for enabling it remain the same as before, and it
  remains enabled by default.

- Any additional arch-specific translation units for the optimized
  ChaCha code, such as assembly files, are now compiled by
  lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.

This removes the last use for the Makefile and Kconfig files in the
arm64, mips, powerpc, riscv, and s390 subdirectories of lib/crypto/.  So
also remove those files and the references to them.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250827151131.27733-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-08-29 09:50:19 -07:00
Zhihang Shao
bef9c75598 lib/crypto: riscv/poly1305: Import OpenSSL/CRYPTOGAMS implementation
This is a straight import of the OpenSSL/CRYPTOGAMS Poly1305
implementation for riscv authored by Andy Polyakov.  The file
'poly1305-riscv.pl' is taken straight from
https://github.com/dot-asm/cryptogams commit
5e3fba73576244708a752fa61a8e93e587f271bb.  This patch was tested on a
SpacemiT X60, showing a 2-2.5x improvement over the generic
implementation.

Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Signed-off-by: Zhihang Shao <zhihang.shao.iscas@gmail.com>
[EB: ported to lib/crypto/riscv/]
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250829152513.92459-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-08-29 09:49:18 -07:00
Eric Biggers
b646b782e5 lib/crypto: poly1305: Consolidate into single module
Consolidate the Poly1305 code into a single module, similar to various
other algorithms (SHA-1, SHA-256, SHA-512, etc.):

- Each arch now provides a header file lib/crypto/$(SRCARCH)/poly1305.h,
  replacing lib/crypto/$(SRCARCH)/poly1305*.c.  The header defines
  poly1305_block_init(), poly1305_blocks(), poly1305_emit(), and
  optionally poly1305_mod_init_arch().  It is included by
  lib/crypto/poly1305.c, and thus the code gets built into the single
  libpoly1305 module, with improved inlining in some cases.

- Whether arch-optimized Poly1305 is buildable is now controlled
  centrally by lib/crypto/Kconfig instead of by
  lib/crypto/$(SRCARCH)/Kconfig.  The conditions for enabling it remain
  the same as before, and it remains enabled by default.  (The PPC64 one
  remains unconditionally disabled due to 'depends on BROKEN'.)

- Any additional arch-specific translation units for the optimized
  Poly1305 code, such as assembly files, are now compiled by
  lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.

A special consideration is needed because the Adiantum code uses the
poly1305_core_*() functions directly.  For now, just carry forward that
approach.  This means retaining the CRYPTO_LIB_POLY1305_GENERIC kconfig
symbol, and keeping the poly1305_core_*() functions in separate
translation units.  So it's not quite as streamlined as what I've done
with the other hash functions, but we still get a single libpoly1305
module.

Note: to see the diff from the arm, arm64, and x86 .c files to the new
.h files, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250829152513.92459-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-08-29 09:49:18 -07:00
Eric Biggers
a1848f6e38 lib/crypto: sparc/md5: Migrate optimized code into library
Instead of exposing the sparc-optimized MD5 code via sparc-specific
crypto_shash algorithms, just implement the md5_blocks() library
function.  This is much simpler, it makes the MD5 library functions
sparc-optimized, and it fixes the longstanding issue where the
sparc-optimized MD5 code was disabled by default.  MD5 remains
available through crypto_shash, but individual architectures no longer
need to handle it.

Note: to see the diff from arch/sparc/crypto/md5_glue.c to
lib/crypto/sparc/md5.h, view this commit with 'git show -M10'.

Link: https://lore.kernel.org/r/20250805222855.10362-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-08-26 12:52:28 -04:00
Eric Biggers
09371e1349 lib/crypto: powerpc/md5: Migrate optimized code into library
Instead of exposing the powerpc-optimized MD5 code via powerpc-specific
crypto_shash algorithms, just implement the md5_blocks() library
function.  This is much simpler, it makes the MD5 library functions
powerpc-optimized, and it fixes the longstanding issue where the
powerpc-optimized MD5 code was disabled by default.  MD5 remains
available through crypto_shash, but individual architectures no longer
need to handle it.

Link: https://lore.kernel.org/r/20250805222855.10362-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-08-26 12:52:28 -04:00
Eric Biggers
c9e5ac0ab9 lib/crypto: mips/md5: Migrate optimized code into library
Instead of exposing the mips-optimized MD5 code via mips-specific
crypto_shash algorithms, just implement the md5_blocks() library
function.  This is much simpler, it makes the MD5 library functions
mips-optimized, and it fixes the longstanding issue where the
mips-optimized MD5 code was disabled by default.  MD5 remains
available through crypto_shash, but individual architectures no longer
need to handle it.

Note: to see the diff from arch/mips/cavium-octeon/crypto/octeon-md5.c
to lib/crypto/mips/md5.h, view this commit with 'git show -M10'.

Link: https://lore.kernel.org/r/20250805222855.10362-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-08-26 12:52:28 -04:00
Eric Biggers
e164461349 lib/crypto: md5: Add MD5 and HMAC-MD5 library functions
Add library functions for MD5, including HMAC support.  The MD5
implementation is derived from crypto/md5.c.  This closely mirrors the
corresponding SHA-1 and SHA-2 changes.

Like SHA-1 and SHA-2, support for architecture-optimized MD5
implementations is included.  I originally proposed dropping those, but
unfortunately there is an AF_ALG user of the PowerPC MD5 code
(https://lore.kernel.org/r/c4191597-341d-4fd7-bc3d-13daf7666c41@csgroup.eu/),
and dropping that code would be viewed as a performance regression.  We
don't add new software algorithm implementations purely for AF_ALG, as
escalating to kernel mode merely to do calculations that could be done
in userspace is inefficient and is completely the wrong design.  But
since this one already existed, it gets grandfathered in for now.  An
objection was also raised to dropping the SPARC64 MD5 code because it
utilizes the CPU's direct support for MD5, although it remains unclear
whether anyone is using it.  Regardless, we'll keep these around for
now.

Note that while MD5 is a legacy algorithm that is vulnerable to
practical collision attacks, it still has various in-kernel users that
implement legacy protocols.  Switching to a simple library API, which is
the way the code should have been organized originally, will greatly
simplify their code.  For example:

    MD5:
        drivers/md/dm-crypt.c (for lmk IV generation)
        fs/nfsd/nfs4recover.c
        fs/ecryptfs/
        fs/smb/client/
        net/{ipv4,ipv6}/ (for TCP-MD5 signatures)

    HMAC-MD5:
        fs/smb/client/
        fs/smb/server/

(Also net/sctp/ if it continues using HMAC-MD5 for cookie generation.
However, that use case has the flexibility to upgrade to a more modern
algorithm, which I'll be proposing instead.)

As usual, the "md5" and "hmac(md5)" crypto_shash algorithms will also be
reimplemented on top of these library functions.  For "hmac(md5)" this
will provide a faster, more streamlined implementation.

Link: https://lore.kernel.org/r/20250805222855.10362-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-08-26 12:52:27 -04:00
Eric Biggers
d73915fdc0 lib/crypto: sha: Update Kconfig help for SHA1 and SHA256
Update the help text for CRYPTO_LIB_SHA1 and CRYPTO_LIB_SHA256 to
reflect the addition of HMAC support, and to be consistent with
CRYPTO_LIB_SHA512.

Link: https://lore.kernel.org/r/20250731224218.137947-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-08-14 18:00:47 -07:00
Eric Biggers
4dcf6cadda lib/crypto: tests: Add KUnit tests for SHA-224 and SHA-256
Add KUnit test suites for the SHA-224 and SHA-256 library functions,
including the corresponding HMAC support.  The core test logic is in the
previously-added hash-test-template.h.  This commit just adds the actual
KUnit suites, and it adds the generated test vectors to the tree so that
gen-hash-testvecs.py won't have to be run at build time.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250709200112.258500-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:29:36 -07:00
Eric Biggers
f3d6cb3dc0 lib/crypto: x86/sha1: Migrate optimized code into library
Instead of exposing the x86-optimized SHA-1 code via x86-specific
crypto_shash algorithms, just implement the sha1_blocks() library
function.  This is much simpler, it makes the SHA-1 library functions
x86-optimized, and it fixes the longstanding issue where the
x86-optimized SHA-1 code was disabled by default.  SHA-1 remains
available through crypto_shash, but individual architectures no longer
need to handle it.

To match sha1_blocks(), change the type of the nblocks parameter of the
assembly functions from int to size_t.  The assembly functions actually
already treated it as size_t.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-14-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:28:35 -07:00
Eric Biggers
c751059985 lib/crypto: sparc/sha1: Migrate optimized code into library
Instead of exposing the sparc-optimized SHA-1 code via sparc-specific
crypto_shash algorithms, just implement the sha1_blocks() library
function.  This is much simpler, it makes the SHA-1 library functions
sparc-optimized, and it fixes the longstanding issue where the
sparc-optimized SHA-1 code was disabled by default.  SHA-1 remains
available through crypto_shash, but individual architectures no longer
need to handle it.

Note: to see the diff from arch/sparc/crypto/sha1_glue.c to
lib/crypto/sparc/sha1.h, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:11:49 -07:00
Eric Biggers
377982d561 lib/crypto: s390/sha1: Migrate optimized code into library
Instead of exposing the s390-optimized SHA-1 code via s390-specific
crypto_shash algorithms, just implement the sha1_blocks() library
function.  This is much simpler, it makes the SHA-1 library functions
s390-optimized, and it fixes the longstanding issue where the
s390-optimized SHA-1 code was disabled by default.  SHA-1 remains
available through crypto_shash, but individual architectures no longer
need to handle it.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-12-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:11:49 -07:00
Eric Biggers
6b9ae8cfaa lib/crypto: powerpc/sha1: Migrate optimized code into library
Instead of exposing the powerpc-optimized SHA-1 code via
powerpc-specific crypto_shash algorithms, just implement the
sha1_blocks() library function.  This is much simpler, it makes the
SHA-1 library functions powerpc-optimized, and it fixes the
longstanding issue where the powerpc-optimized SHA-1 code was disabled
by default.  SHA-1 remains available through crypto_shash, but
individual architectures no longer need to handle it.

Note: to see the diff from arch/powerpc/crypto/sha1-spe-glue.c to
lib/crypto/powerpc/sha1.h, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-11-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:11:49 -07:00
Eric Biggers
b6ac1dac2f lib/crypto: mips/sha1: Migrate optimized code into library
Instead of exposing the mips-optimized SHA-1 code via mips-specific
crypto_shash algorithms, just implement the sha1_blocks() library
function.  This is much simpler, it makes the SHA-1 library functions
mips-optimized, and it fixes the longstanding issue where the
mips-optimized SHA-1 code was disabled by default.  SHA-1 remains
available through crypto_shash, but individual architectures no longer
need to handle it.

Note: to see the diff from arch/mips/cavium-octeon/crypto/octeon-sha1.c
to lib/crypto/mips/sha1.h, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:11:49 -07:00
Eric Biggers
00d549bb89 lib/crypto: arm64/sha1: Migrate optimized code into library
Instead of exposing the arm64-optimized SHA-1 code via arm64-specific
crypto_shash algorithms, just implement the sha1_blocks() library
function.  This is much simpler, it makes the SHA-1 library functions
arm64-optimized, and it fixes the longstanding issue where the
arm64-optimized SHA-1 code was disabled by default.  SHA-1 remains
available through crypto_shash, but individual architectures no longer
need to handle it.

Remove support for SHA-1 finalization from assembly code, since the
library does not yet support architecture-specific overrides of the
finalization.  (Support for that has been omitted for now, for
simplicity and because it usually isn't performance-critical.)

To match sha1_blocks(), change the type of the nblocks parameter and the
return value of __sha1_ce_transform() from int to size_t.  Update the
assembly code accordingly.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:11:48 -07:00
Eric Biggers
70cb6ca58f lib/crypto: arm/sha1: Migrate optimized code into library
Instead of exposing the arm-optimized SHA-1 code via arm-specific
crypto_shash algorithms, just implement the sha1_blocks() library
function.  This is much simpler, it makes the SHA-1 library functions
arm-optimized, and it fixes the longstanding issue where the
arm-optimized SHA-1 code was disabled by default.  SHA-1 remains
available through crypto_shash, but individual architectures no longer
need to handle it.

To match sha1_blocks(), change the type of the nblocks parameter of the
assembly functions from int to size_t.  The assembly functions actually
already treated it as size_t.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 11:11:29 -07:00
Eric Biggers
90860aef63 lib/crypto: sha1: Add SHA-1 library functions
Add a library interface for SHA-1, following the SHA-2 one.  As was the
case with SHA-2, this will be useful for various in-kernel users.  The
crypto_shash interface will be reimplemented on top of it as well.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250712232329.818226-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-14 08:58:53 -07:00
Eric Biggers
aacb37f597 lib/crypto: hash_info: Move hash_info.c into lib/crypto/
crypto/hash_info.c just contains a couple of arrays that map HASH_ALGO_*
algorithm IDs to properties of those algorithms.  It is compiled only
when CRYPTO_HASH_INFO=y, but currently CRYPTO_HASH_INFO depends on
CRYPTO.  Since this can be useful without the old-school crypto API,
move it into lib/crypto/ so that it no longer depends on CRYPTO.
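
For reference, those arrays look like this (abridged from
crypto/hash_info.c):

    const char *const hash_algo_name[HASH_ALGO__LAST] = {
        [HASH_ALGO_MD5]    = "md5",
        [HASH_ALGO_SHA256] = "sha256",
        /* ... */
    };

    const int hash_digest_size[HASH_ALGO__LAST] = {
        [HASH_ALGO_MD5]    = MD5_DIGEST_SIZE,
        [HASH_ALGO_SHA256] = SHA256_DIGEST_SIZE,
        /* ... */
    };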

This eliminates the need for FS_VERITY to select CRYPTO after it's been
converted to use lib/crypto/.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630172224.46909-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-08 12:03:44 -07:00
Eric Biggers
e96cb9507f lib/crypto: sha256: Consolidate into single module
Consolidate the CPU-based SHA-256 code into a single module, following
what I did with SHA-512:

- Each arch now provides a header file lib/crypto/$(SRCARCH)/sha256.h,
  replacing lib/crypto/$(SRCARCH)/sha256.c.  The header defines
  sha256_blocks() and optionally sha256_mod_init_arch().  It is included
  by lib/crypto/sha256.c, and thus the code gets built into the single
  libsha256 module, with proper inlining and dead code elimination (see
  the sketch after this list).

- sha256_blocks_generic() is moved from lib/crypto/sha256-generic.c into
  lib/crypto/sha256.c.  It's now a static function marked with
  __maybe_unused, so the compiler automatically eliminates it in any
  cases where it's not used.

- Whether arch-optimized SHA-256 is buildable is now controlled
  centrally by lib/crypto/Kconfig instead of by
  lib/crypto/$(SRCARCH)/Kconfig.  The conditions for enabling it remain
  the same as before, and it remains enabled by default.

- Any additional arch-specific translation units for the optimized
  SHA-256 code (such as assembly files) are now compiled by
  lib/crypto/Makefile instead of lib/crypto/$(SRCARCH)/Makefile.
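
The header-based integration from the first item looks roughly like
this (a hedged sketch; the prototypes and include mechanics are
simplified here, not quoted from the patch):

    /* lib/crypto/sha256.c */

    static void __maybe_unused
    sha256_blocks_generic(u32 state[8], const u8 *data, size_t nblocks)
    {
        /* ...moved here from sha256-generic.c... */
    }

    #ifdef CONFIG_CRYPTO_LIB_SHA256_ARCH
    /* One such header per arch; it defines sha256_blocks(). */
    #include "x86/sha256.h"
    #else
    #define sha256_blocks sha256_blocks_generic
    #endif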

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-13-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-04 10:23:11 -07:00
Eric Biggers
9f97707bdb lib/crypto: sha256: Remove sha256_blocks_simd()
Instead of having both sha256_blocks_arch() and sha256_blocks_simd(),
instead have just sha256_blocks_arch() which uses the most efficient
implementation that is available in the calling context.

This is simpler, as it reduces the API surface.  It's also safer, since
sha256_blocks_arch() just works in all contexts, including contexts
where the FPU/SIMD/vector registers cannot be used.  This doesn't mean
that SHA-256 computations *should* be done in such contexts, but rather
we should just do the right thing instead of corrupting a random task's
registers.  Eliminating this footgun and simplifying the code is well
worth the very small performance cost of doing the check.
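
On x86, for example, the dispatch amounts to something like this (a
hedged sketch; the static-key and assembly-function names are
illustrative):

    void sha256_blocks_arch(u32 state[8], const u8 *data, size_t nblocks)
    {
        if (static_branch_likely(&have_sha_ni) && irq_fpu_usable()) {
            kernel_fpu_begin();
            sha256_ni_transform(state, data, nblocks);
            kernel_fpu_end();
        } else {
            /*
             * FPU not usable in this context (or no SHA-NI): fall
             * back to the generic blocks function instead of
             * corrupting another task's registers.
             */
            sha256_blocks_generic(state, data, nblocks);
        }
    }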

Note: in the case of arm and arm64, what used to be sha256_blocks_arch()
is renamed back to its original name of sha256_block_data_order().
sha256_blocks_arch() is now used for the higher-level dispatch function.
This renaming also required an update to lib/crypto/arm64/sha512.h,
since sha2-armv8.pl is shared by both SHA-256 and SHA-512.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160645.3198-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-07-04 10:18:53 -07:00
Eric Biggers
74750aa78d lib/crypto: x86: Move arch/x86/lib/crypto/ into lib/crypto/
Move the contents of arch/x86/lib/crypto/ into lib/crypto/x86/.

The new code organization makes a lot more sense for how this code
actually works and is developed.  In particular, it makes it possible to
build each algorithm as a single module, with better inlining and dead
code elimination.  For a more detailed explanation, see the patchset
which did this for the CRC library code:
https://lore.kernel.org/r/20250607200454.73587-1-ebiggers@kernel.org/.
Also see the patchset which did this for SHA-512:
https://lore.kernel.org/linux-crypto/20250616014019.415791-1-ebiggers@kernel.org/

This is just a preparatory commit, which does the move to get the files
into their new location but keeps them building the same way as before.
Later commits will make the actual improvements to the way the
arch-optimized code is integrated for each algorithm.

Add a gitignore entry for the removed directory arch/x86/lib/crypto/ so
that people don't accidentally commit leftover generated files.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20250619191908.134235-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30 09:26:20 -07:00