Commit graph

2169 commits

Author SHA1 Message Date
Kees Cook
189f164e57 Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses
Conversion performed via this Coccinelle script:

  // SPDX-License-Identifier: GPL-2.0-only
  // Options: --include-headers-for-types --all-includes --include-headers --keep-comments
  virtual patch

  @gfp depends on patch && !(file in "tools") && !(file in "samples")@
  identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex,
 		    kzalloc_obj,kzalloc_objs,kzalloc_flex,
		    kvmalloc_obj,kvmalloc_objs,kvmalloc_flex,
		    kvzalloc_obj,kvzalloc_objs,kvzalloc_flex};
  @@

  	ALLOC(...
  -		, GFP_KERNEL
  	)

  $ make coccicheck MODE=patch COCCI=gfp.cocci

Build and boot tested x86_64 with Fedora 42's GCC and Clang:

Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01

Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-22 08:26:33 -08:00
Linus Torvalds
32a92f8c89 Convert more 'alloc_obj' cases to default GFP_KERNEL arguments
This converts some of the visually simpler cases that have been split
over multiple lines.  I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.

Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script.  I probably had made it a bit _too_ trivial.

So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.

The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 20:03:00 -08:00
Linus Torvalds
323bbfcf1e Convert 'alloc_flex' family to use the new default GFP_KERNEL argument
This is the exact same thing as the 'alloc_obj()' version, only much
smaller because there are a lot fewer users of the *alloc_flex()
interface.

As with alloc_obj() version, this was done entirely with mindless brute
force, using the same script, except using 'flex' in the pattern rather
than 'objs*'.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Linus Torvalds
bf4afc53b7 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 17:09:51 -08:00
Kees Cook
69050f8d6d treewide: Replace kmalloc with kmalloc_obj for non-scalar types
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-21 01:02:28 -08:00
Linus Torvalds
14c357c4ad - Remove two drivers for obsolete hardware: i82443bxgx_edac and r82600_edac
- Add support for Intel Amston Lake and Panther Lake-H SoCs to igen6_edac
 
 - The usual amount of fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmmJ2DQACgkQEsHwGGHe
 VUo8ow/9HFeAYSqEqYxBzv46qumdCE/Ru6ijz0v6wFW0ayKUVnwJeAfCDOSAnsO3
 G7CTHt9a6kD1aZG8YbZXmjc+dghffqzUeMt7pOoIB0RliY+LUDC6FVGOt8U1ulBB
 ymp4iPuBWNapTs083sVLJsXtphCt2U48PE1Hz63c9QfM1LHyOfb6edCfdGemaP4B
 gV1GalYuUlDm2Bmw5PA35Kc2MXff7FA7FF6WvSIV1ljVYGpFKHL9VFRLESWSEYi6
 27A27l4VSqZfQXPRJkH1naIKkflsralrgvZ7cSCsb4Lq8V4oGzatfrnTK/0J5A9v
 PEZk2HO7ZUaYStmzcTfUt8mSjlAN34Q77BQ9Z6796ju2PhJlN5TQJxLz+lvzbiec
 kWKtded961qrLB/mXDFfiyLMyhqdPR13ZK3vb0xJcDvq8LQZ0tVXsbmH6UclNwOs
 Nk+Hb8F7PghU/nDgJkYlL0mjCyCucl+4UpS/ObCXhAVUFDHik4SRZ8TPw1mz5ywt
 HHidlCDbKN42zji7p8ISpC+OTJWW7IQX24V+QarJ/Qi519K/o71XeBwnhXNSjuku
 sRB1tSWg9ZEXec2BPnqEvW76ODRkblQxO6/KdYW2D8X9aV5XDQdq/RX4yGbsV+7s
 faALpB26wCVJ7oPoYPl/AX3UKBvaGz0IRR/zPoI0LkXqjnRrgUk=
 =s1Be
 -----END PGP SIGNATURE-----

Merge tag 'edac_updates_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:

 - Remove two drivers for obsolete hardware: i82443bxgx_edac and
   r82600_edac

 - Add support for Intel Amston Lake and Panther Lake-H SoCs to
   igen6_edac

 - The usual amount of fixes and cleanups

* tag 'edac_updates_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/r82600: Remove this obsolete driver
  EDAC/i82443bxgx: Remove driver that has been marked broken since 2007
  EDAC/amd64: Avoid a -Wformat-security warning
  RAS/AMD/ATL: Remove an unneeded semicolon
  EDAC/igen6: Add more Intel Panther Lake-H SoCs support
  EDAC/igen6: Make masks of {MCHBAR, TOM, TOUUD, ECC_ERROR_LOG} configurable
  EDAC/igen6: Add two Intel Amston Lake SoCs support
  EDAC/i5400: Fix snprintf() limit calculation in calculate_dimm_size()
  EDAC/i5000: Fix snprintf() size calculation in calculate_dimm_size()
2026-02-10 18:14:36 -08:00
Borislav Petkov (AMD)
5b115dccdc Merge remote-tracking branches 'ras/edac-drivers', 'ras/edac-misc' and 'ras/edac-amd-atl' into edac-updates
* ras/edac-drivers:
  EDAC/r82600: Remove this obsolete driver
  EDAC/i82443bxgx: Remove driver that has been marked broken since 2007
  EDAC/igen6: Add more Intel Panther Lake-H SoCs support
  EDAC/igen6: Make masks of {MCHBAR, TOM, TOUUD, ECC_ERROR_LOG} configurable
  EDAC/igen6: Add two Intel Amston Lake SoCs support
  EDAC/i5400: Fix snprintf() limit calculation in calculate_dimm_size()
  EDAC/i5000: Fix snprintf() size calculation in calculate_dimm_size()

* ras/edac-misc:
  EDAC/amd64: Avoid a -Wformat-security warning

* ras/edac-amd-atl:
  RAS/AMD/ATL: Remove an unneeded semicolon

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2026-02-09 13:49:36 +01:00
Sebastian Andrzej Siewior
5c858d6c66 EDAC/altera: Remove IRQF_ONESHOT
Passing IRQF_ONESHOT ensures that the interrupt source is masked until
the secondary (threaded) handler is done. If only a primary handler is
used then the flag makes no sense because the interrupt can not fire
(again) while its handler is running.

The flag also prevents force-threading of the primary handler and the
irq-core will warn about this.

Remove IRQF_ONESHOT from irqflags.

Fixes: a29d64a45e ("EDAC, altera: Add IRQ Flags to disable IRQ while handling")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260128095540.863589-11-bigeasy@linutronix.de
2026-02-01 17:37:15 +01:00
Ethan Nelson-Moore
1795dc528c EDAC/r82600: Remove this obsolete driver
This driver supports the Radisys 82600 embedded chipset, which was used with
Pentium III-era CPUs. It is highly unlikely that it is still used.  Besides
its own documentation, the only information I was able to find about the
R82600, after looking through many pages of Google results, was that it was
used in a Nokia 2G GSM base station.

The original author said:

"This was after Bluesmoke (watch was out-of-tree), and was pay of the first
set of in tree edac drivers (a fresh design as far as I remember).

This particular driver did apparently get used by Akamai quite heavily and
widely  but all ancient history now. The Radisys r82600 edac driver 
r82600_edac.c was closely related hardware from the same era, and can probably
go too (although being embedded hardware, it's possible its production
lifespan was considerably longer).

Tim."

Mail is

  https://lore.kernel.org/r/01BCCA37-F6A2-458B-BFE7-99C7724CDDEA@buttersideup.com

but lkml drops html mail so quoting the relevant info here too.

  [ bp: Massage and extend commit message. ]

Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260130052633.13119-1-enelsonmoore@gmail.com
2026-01-30 13:41:32 +01:00
Ethan Nelson-Moore
83b314e9c8 EDAC/i82443bxgx: Remove driver that has been marked broken since 2007
The history of this driver is pretty amusing. It was marked broken in
2007 in

  28f96eeafc ("drivers/edac-new-i82443bxgz-mc-driver: mark as broken").

It was then fixed in 2008 in

  53a2fe5804 ("edac: make i82443bxgx_edac coexist with intel_agp"),

but the dependency on BROKEN was never removed. Given that this was never
fixed in the last ~18 years, it is obvious there is no demand for this driver.
Remove it and move the former maintainer to the CREDITS file.

Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260129082937.48740-1-enelsonmoore@gmail.com
2026-01-29 18:07:13 +01:00
Haoxiang Li
0ff7c44106 EDAC/x38: Fix a resource leak in x38_probe1()
If edac_mc_alloc() fails, also unmap the window.

  [ bp: Use separate labels, turning it into the classic unwind pattern. ]

Fixes: df8bc08c19 ("edac x38: new MC driver module")
Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251223124350.1496325-1-lihaoxiang@isrc.iscas.ac.cn
2026-01-04 08:35:39 +01:00
Haoxiang Li
d42d5715dc EDAC/i3200: Fix a resource leak in i3200_probe1()
If edac_mc_alloc() fails, also unmap the window.

  [ bp: Use separate labels, turning it into the classic unwind pattern. ]

Fixes: dd8ef1db87 ("edac: i3200 memory controller driver")
Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251223123202.1492038-1-lihaoxiang@isrc.iscas.ac.cn
2026-01-04 08:31:44 +01:00
Arnd Bergmann
c816ba1dcd EDAC/amd64: Avoid a -Wformat-security warning
Using a variable as a format string causes a (default-disabled) warning:

  drivers/edac/amd64_edac.c: In function 'per_family_init':
  drivers/edac/amd64_edac.c:3914:17: error: format not a string literal and no format arguments [-Werror=format-security]
   3914 |                 scnprintf(pvt->ctl_name, sizeof(pvt->ctl_name), tmp_name);
        |                 ^~~~~~~~~

The code here is safe, but in order to enable the warning by default in the
future, change this instance to pass the name indirectly.

Fixes: e9abd990ae ("EDAC/amd64: Generate ctl_name string at runtime")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Avadhut Naik <avadhut.naik@amd.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: https://patch.msgid.link/20251204100231.1034557-1-arnd@kernel.org
2025-12-30 17:27:11 +01:00
Lili Li
4c36e61069 EDAC/igen6: Add more Intel Panther Lake-H SoCs support
Add more Intel Panther Lake-H SoC compute die IDs for EDAC support.

Signed-off-by: Lili Li <lili.li@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://patch.msgid.link/20251124131537.3633983-1-qiuxu.zhuo@intel.com
2025-12-19 13:39:14 -08:00
Qiuxu Zhuo
4b720906ef EDAC/igen6: Make masks of {MCHBAR, TOM, TOUUD, ECC_ERROR_LOG} configurable
The masks used to retrieve base addresses from {MCHBAR, TOM, TOUUD,
ECC_ERROR_LOG} registers can be CPU model-specific. Currently,
igen6_edac hard-codes these masks with the most significant bit at
38, while some CPUs have extended the most significant bit to bit 41
or bit 45.

Systems with more than 512GB (2^39) memory need this extension to get
correct masks. But all CPUs currently supported by igen6_edac support
max memory less than 512GB (e.g., max memory size for Raptor Lake
systems is 192GB, for Alder Lake systems is 128GB, ...), which means
the previous hard-coded most significant bit 38 still works properly.
So backporting this patch to stable kernels is not necessary.

To make these masks reflect the CPUs' real support and easily support
future Intel client CPUs supported by igen6_edac that have more than
512GB memory, add four new fields to structure res_config to make these
masks CPU model-specific and configure them properly.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Jianfeng Gao <jianfeng.gao@intel.com>
Link: https://patch.msgid.link/20251219042956.3232568-3-qiuxu.zhuo@intel.com
2025-12-19 13:36:07 -08:00
Qiuxu Zhuo
41ca2155d6 EDAC/igen6: Add two Intel Amston Lake SoCs support
Intel Amston Lake SoCs with IBECC (In-Band ECC) capability share the same
IBECC registers as Alder Lake-N SoCs. Add two new compute die IDs for
Amston Lake SoC products to enable EDAC support.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Jianfeng Gao <jianfeng.gao@intel.com>
Link: https://patch.msgid.link/20251124065457.3630949-2-qiuxu.zhuo@intel.com
2025-12-17 10:04:01 -08:00
Dan Carpenter
72f1268361 EDAC/i5400: Fix snprintf() limit calculation in calculate_dimm_size()
The snprintf() can't really overflow because we're writing a max of 42
bytes to a PAGE_SIZE buffer.  But my static checker complains because
the limit calculation doesn't take the first 11 space characters that
we wrote into the buffer into consideration.  Fix this for the sake of
correctness even though it doesn't affect runtime.

Also delete an earlier "space -= n;" which was not used.

Fixes: 68d086f89b ("i5400_edac: improve debug messages to better represent the filled memory")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://patch.msgid.link/ccd06b91748e7ed8e33eeb2ff1e7b98700879304.1765290801.git.dan.carpenter@linaro.org
2025-12-17 10:00:20 -08:00
Dan Carpenter
7b5c7e83ac EDAC/i5000: Fix snprintf() size calculation in calculate_dimm_size()
The snprintf() can't really overflow because we're writing a max of 42
bytes to a PAGE_SIZE buffer.  But the limit calculation doesn't take
the first 11 bytes that we wrote into consideration so the limit is
not correct.  Just fix it for correctness even though it doesn't
affect runtime.

Fixes: 64e1fdaf55 ("i5000_edac: Fix the logic that retrieves memory information")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://patch.msgid.link/07cd652c51e77aad5a8350e1a7cd9407e5bbe373.1765290801.git.dan.carpenter@linaro.org
2025-12-17 09:59:17 -08:00
Linus Torvalds
f468cf53c5 bitmap updates for v6.19
- Runtime field_{get,prep}() (Geert);
  - Rust ID pool updates (Alice);
  - min_t() simplification (David);
  - __sw_hweightN kernel-doc fixes (Andy);
  - cpumask.h headers cleanup (Andy).
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmkxsKkACgkQsUSA/Tof
 vshxxgv+Ly1WkW65Sr3KmzY0lCFBg+oH+1uc9Y6avc3gciY1nEwHEP0mqjOVuGRd
 HRkxhBKQlZe+GEp09IeCzONhhcAe9VnftD4isIrLlqjlcavs9gWaQRU38lCvfj79
 HPVOOe3zy1TlBFqLfcc+cZWDBG9BMGCZycI1+dZMYzGZ3SUwpdGjNIfFNOC0x0Jg
 7u+nVqduzH155kBSaPUH2FhhC9SjmgW429EBpksKs0POcOiijdLesezksDP+5bfr
 9YyAuP1MZ+bWpMS5S0h/Mw9M/X9eB0ZhY0ahkHV8XFhv/8Wo/gYO98yBb5v8bxa9
 9F3D8FFMfYDmMzmFXlUVH7mNbe3fAtbQq/XQKzjGbe2jZM+3A3YNfCXpBASLsZLt
 p3G31cZRRtuDz4hlEiJeQuF0VB3sN7ycfT53dLIyjl9IMLBk4ArhXSPasN7wHa3Y
 VO5UYCQAOBAu9Kou+ThHDPJz0aBI9GtfwvqJTzgvXa0elZ+Iid6DfeqOSzmHyUOd
 A0qHDI/O
 =EM4O
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-for-6.19' of github.com:/norov/linux

Pull bitmap updates from Yury Norov:

 - Runtime field_{get,prep}() (Geert)

 - Rust ID pool updates (Alice)

 - min_t() simplification (David)

 - __sw_hweightN kernel-doc fixes (Andy)

 - cpumask.h headers cleanup (Andy)

* tag 'bitmap-for-6.19' of github.com:/norov/linux: (32 commits)
  rust_binder: use bitmap for allocation of handles
  rust: id_pool: do not immediately acquire new ids
  rust: id_pool: do not supply starting capacity
  rust: id_pool: rename IdPool::new() to with_capacity()
  rust: bitmap: add BitmapVec::new_inline()
  rust: bitmap: add MAX_LEN and MAX_INLINE_LEN constants
  cpumask: Don't use "proxy" headers
  soc: renesas: Use bitfield helpers
  clk: renesas: Use bitfield helpers
  ALSA: usb-audio: Convert to common field_{get,prep}() helpers
  soc: renesas: rz-sysc: Convert to common field_get() helper
  pinctrl: ma35: Convert to common field_{get,prep}() helpers
  iio: mlx90614: Convert to common field_{get,prep}() helpers
  iio: dac: Convert to common field_prep() helper
  gpio: aspeed: Convert to common field_{get,prep}() helpers
  EDAC/ie31200: Convert to common field_get() helper
  crypto: qat - convert to common field_get() helper
  clk: at91: Convert to common field_{get,prep}() helpers
  bitfield: Add non-constant field_{prep,get}() helpers
  bitfield: Add less-checking __FIELD_{GET,PREP}()
  ...
2025-12-06 09:01:27 -08:00
Linus Torvalds
49219bba01 - imh_edac: Add a new EDAC driver for Intel Diamond Rapids and
future incarnations of this memory controllers architecture
 
 - amd64_edac: Remove the legacy csrow sysfs interface which has been
   deprecated and unused (we assume) for at least a decade
 
 - Add the capability to fallback to BIOS-provided address translation
   functionality (ACPI PRM) which can be used on systems unsupported by
   the current AMD address translation library
 
 - The usual fixes, fixlets, cleanups and improvements all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmktdyMACgkQEsHwGGHe
 VUpXTxAAhdQxn1v1tYKya6YHxBS3T3Y3+4fec+LeKgoY1YnoFHMse3TAU+G67opR
 1xnEKHKrkX4v1FAwe7eD2G6qyz2ytqcApv4XGxmQ1WgldFWuPl/lI3ngPNMCHMog
 dqeQFRQ7MXsk0no0cjMA6NjafFpYOGGGhIzdU3wvgZawH4hG9wHLS6Urvn2SfWj6
 Pf/449qS7XoPU5G22qWPqqixRHpc9BPkJfKMIYeaWbxldePlwbh9cOMLqwsZo1QV
 v5cv/3CAIVFzRvNVIx05kDhRrwqTjIZL+u9IYHg2g9DA45GQuktYQwd1KksbVpUn
 CijhpKMoSnQHN+ZLW84XzvEH2rvroSTZl28d5suY1GHXG3ePc9HpmTVbVElFXWKZ
 dq0X2RIbMEbSxneePFHJ4ESUfNN2HbPSfh/sXN4epxcMQI0VWVhXYs5+Ek4UV1+E
 hvhCS/kuAypODzEi0cULoMcXdyKr2V1zpaAHNlZshepp/kUzY46b3cBhxKiL3Fsd
 x+IhZgow9a+iMJfMpCJhMABKEkoZRgS3gs5nWMJ6t0EvulvknG+aovGB/Q0VaIIa
 H69Fn+R2ewnEuZf1JGZDMit1y+wjGgeamk+uWTym+tCyNH1eHaSq48POribajcYF
 UtcobK4kG7hPodsbwwD4MhqtSLhuyIcXTHbI3x4+r+LLAgdAPKM=
 =NidS
 -----END PGP SIGNATURE-----

Merge tag 'edac_updates_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:

 - imh_edac: Add a new EDAC driver for Intel Diamond Rapids and future
   incarnations of this memory controllers architecture

 - amd64_edac: Remove the legacy csrow sysfs interface which has been
   deprecated and unused (we assume) for at least a decade

 - Add the capability to fallback to BIOS-provided address translation
   functionality (ACPI PRM) which can be used on systems unsupported by
   the current AMD address translation library

 - The usual fixes, fixlets, cleanups and improvements all over the
   place

* tag 'edac_updates_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  RAS/AMD/ATL: Replace bitwise_xor_bits() with hweight16()
  EDAC/igen6: Fix error handling in igen6_edac driver
  EDAC/imh: Setup 'imh_test' debugfs testing node
  EDAC/{skx_comm,imh}: Detect 2-level memory configuration
  EDAC/skx_common: Extend the maximum number of DRAM chip row bits
  EDAC/{skx_common,imh}: Add EDAC driver for Intel Diamond Rapids servers
  EDAC/skx_common: Prepare for skx_set_hi_lo()
  EDAC/skx_common: Prepare for skx_get_edac_list()
  EDAC/{skx_common,skx,i10nm}: Make skx_register_mci() independent of pci_dev
  EDAC/ghes: Replace deprecated strcpy() in ghes_edac_report_mem_error()
  EDAC/ie31200: Fix error handling in ie31200_register_mci
  RAS/CEC: Replace use of system_wq with system_percpu_wq
  EDAC: Remove the legacy EDAC sysfs interface
  EDAC/amd64: Remove NUM_CONTROLLERS macro
  EDAC/amd64: Generate ctl_name string at runtime
  RAS/AMD/ATL: Require PRM support for future systems
  ACPI: PRM: Add acpi_prm_handler_available()
  RAS/AMD/ATL: Return error codes from helper functions
2025-12-02 10:45:50 -08:00
Geert Uytterhoeven
331a1457d8 EDAC/ie31200: Convert to common field_get() helper
Drop the driver-specific field_get() macro, in favor of the globally
available variant from <linux/bitfield.h>.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
2025-11-24 14:15:47 -05:00
Geert Uytterhoeven
d51b09a0fe EDAC/ie31200: #undef field_get() before local definition
Prepare for the advent of a globally available common field_get() macro
by undefining the symbol before defining a local variant.  This prevents
redefinition warnings from the C preprocessor when introducing the common
macro later.

Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
2025-11-24 14:15:46 -05:00
Ma Ke
ef1b6d9049 EDAC/igen6: Fix error handling in igen6_edac driver
The igen6_edac driver calls device_initialize() for all memory
controllers in igen6_register_mci(), but misses corresponding
put_device() calls in error paths and during normal shutdown in
igen6_unregister_mcis().

Adding the missing put_device() calls improves code readability and
ensures proper reference counting for the device structure.

Found by code review.

Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://patch.msgid.link/20251105090244.23327-1-make24@iscas.ac.cn
2025-11-21 10:20:51 -08:00
Qiuxu Zhuo
5f40ea7f41 EDAC/imh: Setup 'imh_test' debugfs testing node
Setup the following debugfs testing node to enable fake memory error
address decoding tests for the imh_edac driver.

  /sys/kernel/debug/edac/imh_test/addr

Tested-by: Yi Lai <yi1.lai@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://patch.msgid.link/20251119134132.2389472-8-qiuxu.zhuo@intel.com
2025-11-21 10:20:51 -08:00
Qiuxu Zhuo
f619613f30 EDAC/{skx_comm,imh}: Detect 2-level memory configuration
Detect 2-level memory configurations and notify the 'skx_common' library
to enable ADXL 2-level memory error decoding.

Tested-by: Yi Lai <yi1.lai@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://patch.msgid.link/20251119134132.2389472-7-qiuxu.zhuo@intel.com
2025-11-21 10:20:51 -08:00
Qiuxu Zhuo
39abdcbdad EDAC/skx_common: Extend the maximum number of DRAM chip row bits
The allowed maximum number of row bits for DRAM chips in the Diamond
Rapids server processor is up to 19. Extend the current maximum row
bits from 18 to 19.

Tested-by: Yi Lai <yi1.lai@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://patch.msgid.link/20251119134132.2389472-6-qiuxu.zhuo@intel.com
2025-11-21 10:20:51 -08:00
Qiuxu Zhuo
9fc67b1170 EDAC/{skx_common,imh}: Add EDAC driver for Intel Diamond Rapids servers
Intel Diamond Rapids CPUs include Integrated Memory and I/O Hubs (IMH).
The memory controllers within the IMHs provide memory stacks to the
processor. Create a new driver for this IMH-based memory controllers
rather than applying additional patches to the existing i10nm_edac.c
for the following reasons:

1) The memory controllers are not presented as PCI devices; instead,
   the detection and all their registers have been transitioned to
   MMIO-based memory spaces.

2) Validation processes are costly. Modifications to i10nm_edac would
   require extensive validation checks against multiple platforms,
   including Ice Lake, Sapphire Rapids, Emerald Rapids, Granite Rapids,
   Sierra Forest, and Grand Ridge.

3) Future Intel CPUs will likely only need patches on top of this new
   EDAC driver. Validation can be limited to Diamond Rapids servers
   and future Intel CPU generations.

[Tony: Fix kerneldoc for struct local_reg]
[randconfig: Added dependencies on NFIT and DMI]

Tested-by: Yi Lai <yi1.lai@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://patch.msgid.link/20251119134132.2389472-5-qiuxu.zhuo@intel.com
2025-11-21 10:19:43 -08:00
Qiuxu Zhuo
d4839582bc EDAC/skx_common: Prepare for skx_set_hi_lo()
The upcoming imh_edac driver for Intel Diamond Rapids servers cannot
use skx_get_hi_lo() in skx_common to retrieve the TOHM (Top of High
Memory) and TOLM (Top of Low Memory) parameters. Instead, it obtains
these parameters within its own EDAC driver. To accommodate this,
prepare skx_set_hi_lo() to allow the driver to notify skx_common of
these parameters.

Tested-by: Yi Lai <yi1.lai@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://patch.msgid.link/20251119134132.2389472-4-qiuxu.zhuo@intel.com
2025-11-19 12:11:40 -08:00
Qiuxu Zhuo
9529e69773 EDAC/skx_common: Prepare for skx_get_edac_list()
The Intel EDAC library 'skx_common' maintains the Intel server EDAC device
list for {skx, i10nm}_edac drivers, which use skx_get_all_bus_mappings()
to build and retrieve the EDAC device list.

However, the upcoming Intel EDAC driver, imh_edac, for Diamond Rapids
servers is designed for memory controllers that are MMIO-based devices
rather than PCI devices. Consequently, it can't use
skx_get_all_bus_mappings() due to the absence of a PCI bus. To accommodate
this, prepare skx_get_edac_list() to enable the upcoming imh_edac driver
to obtain the EDAC device list from the skx_common library and build the
EDAC device list independently.

Tested-by: Yi Lai <yi1.lai@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://patch.msgid.link/20251119134132.2389472-3-qiuxu.zhuo@intel.com
2025-11-19 12:11:40 -08:00
Qiuxu Zhuo
b3d70059cb EDAC/{skx_common,skx,i10nm}: Make skx_register_mci() independent of pci_dev
Memory controllers in the new Intel server CPUs, such as Diamond Rapids,
are presented as MMIO-based devices rather than PCI devices.
Modify skx_register_mci() to be independent of 'pci_dev' and use a generic
'dev' of 'struct device' to prepare for support of such MMIO-based memory
controllers.

Tested-by: Yi Lai <yi1.lai@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://patch.msgid.link/20251119134132.2389472-2-qiuxu.zhuo@intel.com
2025-11-19 12:11:40 -08:00
Thorsten Blum
cdf5ecc3f6 EDAC/ghes: Replace deprecated strcpy() in ghes_edac_report_mem_error()
strcpy() has been deprecated¹ because it performs no bounds checking on the
destination buffer, which can lead to buffer overflows. Use the safer
strscpy() instead.

¹ https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://patch.msgid.link/20251118135621.101148-2-thorsten.blum@linux.dev
2025-11-18 16:50:32 +01:00
Niravkumar L Rabara
281326be67 EDAC/altera: Use INTTEST register for Ethernet and USB SBE injection
The current single-bit error injection mechanism flips bits directly in ECC RAM
by performing write and read operations. When the ECC RAM is actively used by
the Ethernet or USB controller, this approach sometimes trigger a false
double-bit error.

Switch both Ethernet and USB EDAC devices to use the INTTEST register
(altr_edac_a10_device_inject_fops) for single-bit error injection, similar to
the existing double-bit error injection method.

Fixes: 064acbd4f4 ("EDAC, altera: Add Stratix10 peripheral support")
Signed-off-by: Niravkumar L Rabara <niravkumarlaxmidas.rabara@altera.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251111081333.1279635-1-niravkumarlaxmidas.rabara@altera.com
2025-11-11 14:59:04 +01:00
Niravkumar L Rabara
fd3ecda38f EDAC/altera: Handle OCRAM ECC enable after warm reset
The OCRAM ECC is always enabled either by the BootROM or by the Secure Device
Manager (SDM) during a power-on reset on SoCFPGA.

However, during a warm reset, the OCRAM content is retained to preserve data,
while the control and status registers are reset to their default values. As
a result, ECC must be explicitly re-enabled after a warm reset.

Fixes: 17e47dc6db ("EDAC/altera: Add Stratix10 OCRAM ECC support")
Signed-off-by: Niravkumar L Rabara <niravkumarlaxmidas.rabara@altera.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251111080801.1279401-1-niravkumarlaxmidas.rabara@altera.com
2025-11-11 13:48:13 +01:00
Ma Ke
f18e71cd6c EDAC/ie31200: Fix error handling in ie31200_register_mci
ie31200_register_mci() calls device_initialize() for priv->dev
unconditionally. However, in the error path, put_device() is not
called, leading to an imbalance. Similarly, in the unload path,
put_device() is missing.

Although edac_mc_free() eventually frees the memory, it does not
release the device initialized by device_initialize(). For code
readability and proper pairing of device_initialize()/put_device(),
add put_device() calls in both error and unload paths.

Found by code review.

Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://patch.msgid.link/20251106084735.35017-1-make24@iscas.ac.cn
2025-11-10 17:06:10 -08:00
Shubhrajyoti Datta
2cf95b9baa EDAC/versalnet: Handle split messages for non-standard errors
The current code assumes that only DDR errors have split messages.  Ensure
proper logging of non-standard event errors that may be split across multiple
messages too.

  [ bp: Massage, move comment too, fix it up. ]

Fixes: d5fe2fec6c ("EDAC: Add a driver for the AMD Versal NET DDR controller")
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20251023113108.3467132-1-shubhrajyoti.datta@amd.com
2025-11-07 20:15:14 +01:00
Avadhut Naik
8616025ae6 EDAC: Remove the legacy EDAC sysfs interface
Commit

  1997471069 ("edac: add a new per-dimm API and make the old per-virtual-rank API obsolete")

introduced a new per-DIMM sysfs interface for EDAC making the old
per-virtual-rank sysfs interface obsolete.

Since this new sysfs interface was introduced more than a decade ago, remove
the obsolete legacy interface.

Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20251106015727.1987246-1-avadhut.naik@amd.com
2025-11-06 13:21:29 +01:00
Avadhut Naik
6a85796915 EDAC/amd64: Remove NUM_CONTROLLERS macro
Currently, the NUM_CONTROLLERS macro is used to limit the amount of memory
controllers (UMCs) available per node. The number of UMCs available per node,
however, is already cached by the max_mcs variable of struct amd64_pvt.

Allocate the relevant data structures dynamically using the variable instead
of static allocation through the macro.

The max_mcs variable is used for legacy systems too. These systems have a max
of 2 controllers. Since the default value of max_mcs, set in per_family_init(),
is 2, these legacy systems are also covered.

Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20251106015727.1987246-1-avadhut.naik@amd.com
2025-11-06 12:51:33 +01:00
Avadhut Naik
e9abd990ae EDAC/amd64: Generate ctl_name string at runtime
Currently, the ctl_name string is statically assigned based on the family and
model of the SOC when the amd64_edac module is loaded.

The same, however, is not exactly needed as the string can be generated and
assigned at runtime through scnprintf().

Remove all static assignments and generate the string at runtime. Also,
cleanup the switch cases which became defunct and consolidate identical cases.

Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20251106015727.1987246-1-avadhut.naik@amd.com
2025-11-06 12:35:59 +01:00
Dan Carpenter
79c0a2b7ab EDAC/versalnet: Fix off by one in handle_error()
The priv->mci[] array has NUM_CONTROLLERS so this > comparison needs to be >=
to prevent an out of bounds access.

Fixes: d5fe2fec6c ("EDAC: Add a driver for the AMD Versal NET DDR controller")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
2025-10-13 17:14:47 +02:00
Linus Torvalds
03f76ddff5 - Add support for new AMD family 0x1a models to amd64_edac
- Add an EDAC driver for the AMD VersalNET memory controller which
   reports hw errors from different IP blocks in the fabric using an
   IPC-type transport
 
 - Drop the silly static number of memory controllers in the Intel EDAC
   drivers (skx, i10nm) in favor of a flexible array so that former
   doesn't need to be increased with every new generation which adds more
   memory controllers; along with a proper refactoring
 
 - Add support for two Alder Lake-S SOCs to ie31200_edac
 
 - Add an EDAC driver for ADM Cortex A72 cores, and specifically for
   reporting L1 and L2 cache errors
 
 - Last but not least, the usual fixes, cleanups and improvements all
   over the subsystem
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmjWYXEACgkQEsHwGGHe
 VUqN0g/+KaDOP5caif7/5IJ2fL+9Qv3VvbxucVMS4UgMYBY21V4msfPkuCg8iGes
 zEpFUuEFc2NE6XV9i4JNgYNAR+uffOY4rZb67VSr2rQSVeRvBFHb9aMsXBYssV/r
 XtCfTdJL/bJ7SLk10aWvBM4quLF9BchdoPctNMt5PuN3dtb1dVFi1TkylXKaocRX
 sfu/hOQ0FUbOlYnGTpW+t4TufNcWzC8q9hL4mrbSVHS3XTKk/zQ9PJ8I8f44XqYo
 Bn1JXfAErkgo9rqlmjxU90Lg2G+EV+qwDWs61Ox8q3lzbC+9FOd4WIbD3c9TiTT/
 Io6tx8PvgFUz43lD+XGoCfd87ZI9CbGoVAEEiFWr+HaqL/XVF5NS5GiBNTyxGGaP
 nDzxm1OYQbDEnBfmaWZCMbbd5yCOZ1EZHTgp4VxqJfooU1Ucbct4oPDnERMTNDlv
 UUGDh19BAXwcZ9xpy36AIprppZKOBu0WPjXee9sby5cF+KB57Tbrzd+nm/uZRhHj
 bTkQTfCcs+EPAksG0snGufy4BlfS6UGqx4HkSZ3ITVJQX4x27razsxTDbKDk50jq
 S1cyZlZ5n+mpR0MtC/zNDMB6cxutgAKoqwssVBUiEh9bCaA/tOPqJmoD9Lx2ESDt
 0/QcF1ilBRctunguhDbY8EZKye9gM4WWHW5kxE29PtAzepSdTDo=
 =4oCR
 -----END PGP SIGNATURE-----

Merge tag 'edac_updates_for_v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:

 - Add support for new AMD family 0x1a models to amd64_edac

 - Add an EDAC driver for the AMD VersalNET memory controller which
   reports hw errors from different IP blocks in the fabric using an
   IPC-type transport

 - Drop the silly static number of memory controllers in the Intel EDAC
   drivers (skx, i10nm) in favor of a flexible array so that former
   doesn't need to be increased with every new generation which adds
   more memory controllers; along with a proper refactoring

 - Add support for two Alder Lake-S SOCs to ie31200_edac

 - Add an EDAC driver for ADM Cortex A72 cores, and specifically for
   reporting L1 and L2 cache errors

 - Last but not least, the usual fixes, cleanups and improvements all
   over the subsystem

* tag 'edac_updates_for_v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: (23 commits)
  EDAC/versalnet: Return the correct error in mc_probe()
  EDAC/mc_sysfs: Increase legacy channel support to 16
  EDAC/amd64: Add support for AMD family 1Ah-based newer models
  EDAC: Add a driver for the AMD Versal NET DDR controller
  dt-bindings: memory-controllers: Add support for Versal NET EDAC
  RAS: Export log_non_standard_event() to drivers
  cdx: Export Symbols for MCDI RPC and Initialization
  cdx: Split mcdi.h and reorganize headers
  EDAC/skx_common: Use topology_physical_package_id() instead of open coding
  EDAC: Fix wrong executable file modes for C source files
  EDAC/altera: Use dev_fwnode()
  EDAC/skx_common: Remove unused *NUM*_IMC macros
  EDAC/i10nm: Reallocate skx_dev list if preconfigured cnt != runtime cnt
  EDAC/skx_common: Remove redundant upper bound check for res->imc
  EDAC/skx_common: Make skx_dev->imc[] a flexible array
  EDAC/skx_common: Swap memory controller index mapping
  EDAC/skx_common: Move mc_mapping to be a field inside struct skx_imc
  EDAC/{skx_common,skx}: Use configuration data, not global macros
  EDAC/i10nm: Skip DIMM enumeration on a disabled memory controller
  EDAC/ie31200: Add two more Intel Alder Lake-S SoCs for EDAC support
  ...
2025-09-30 11:41:03 -07:00
Borislav Petkov (AMD)
69ed025aeb Merge branches 'edac-drivers' and 'edac-misc' into edac-updates
* edac-drivers:
  EDAC/versalnet: Return the correct error in mc_probe()
  EDAC/mc_sysfs: Increase legacy channel support to 16
  EDAC/amd64: Add support for AMD family 1Ah-based newer models
  EDAC: Add a driver for the AMD Versal NET DDR controller
  dt-bindings: memory-controllers: Add support for Versal NET EDAC
  RAS: Export log_non_standard_event() to drivers
  cdx: Export Symbols for MCDI RPC and Initialization
  cdx: Split mcdi.h and reorganize headers
  EDAC/skx_common: Use topology_physical_package_id() instead of open coding
  EDAC/altera: Use dev_fwnode()
  EDAC/skx_common: Remove unused *NUM*_IMC macros
  EDAC/i10nm: Reallocate skx_dev list if preconfigured cnt != runtime cnt
  EDAC/skx_common: Remove redundant upper bound check for res->imc
  EDAC/skx_common: Make skx_dev->imc[] a flexible array
  EDAC/skx_common: Swap memory controller index mapping
  EDAC/skx_common: Move mc_mapping to be a field inside struct skx_imc
  EDAC/{skx_common,skx}: Use configuration data, not global macros
  EDAC/i10nm: Skip DIMM enumeration on a disabled memory controller
  EDAC/ie31200: Add two more Intel Alder Lake-S SoCs for EDAC support
  dt-bindings: arm: cpus: Add edac-enabled property
  EDAC: Add EDAC driver for ARM Cortex A72 cores

* edac-misc:
  EDAC: Fix wrong executable file modes for C source files
  MAINTAINERS: EDAC: Drop inactive reviewers

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2025-09-26 11:44:35 +02:00
Dan Carpenter
c2fcb2e79d EDAC/versalnet: Return the correct error in mc_probe()
Return -ENOMEM if memory allocation in mc_probe() fails.

  [ bp: Massage commit message. ]

Fixes: d5fe2fec6c ("EDAC: Add a driver for the AMD Versal NET DDR controller")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2025-09-18 14:29:17 +02:00
Avadhut Naik
6e1c2c6c2c EDAC/mc_sysfs: Increase legacy channel support to 16
Newer AMD systems can support up to 16 channels per EDAC "mc" device.
These are detected by the EDAC module running on the device, and the
current EDAC interface is appropriately enumerated.

The legacy EDAC sysfs interface however, provides device attributes for
channels 0 through 11 only. Consequently, the last four channels, 12
through 15, will not be enumerated and will not be visible through the
legacy sysfs interface.

Add additional device attributes to ensure that all 16 channels, if
present, are enumerated by and visible through the legacy EDAC sysfs
interface.

Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250916203242.1281036-1-avadhut.naik@amd.com
2025-09-17 11:59:29 +02:00
Avadhut Naik
6fffa38c4c EDAC/amd64: Add support for AMD family 1Ah-based newer models
Add support for family 1Ah-based models 50h-57h, 90h-9Fh, A0h-AFh, and
C0h-C7h.

Also, raise the maximum memory controllers number as those machines
support that many.

Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250916203242.1281036-1-avadhut.naik@amd.com
2025-09-17 11:53:54 +02:00
Shubhrajyoti Datta
d5fe2fec6c EDAC: Add a driver for the AMD Versal NET DDR controller
Add a driver for the AMD Versal NET DDR memory controller which supports
single bit error correction, double bit error detection and other system
errors from various IP subsystems (e.g., RPU, NOCs, HNICX, PL).

The driver listens for notifications from the NMC (Network management
controller) using RPMsg (Remote Processor Messaging).

The channel used for communicating to RPMsg is named "error_edac".  Upon
receipt of a notification, the driver sends a RAS event trace.

  [ bp:
    - Fixup title
    - Rewrite commit message
    - Fixup Kconfig text
    - Zap unused defines and align them
    - Simplify rpmsg_cb() considerably
    - Drop silly double-brackets in conditionals
    - Use proper void * type in mcdi_request()
    - Do not clear chinfo in rpmsg_probe() unnecessarily
    - Fix indentation
    - Do a proper err unwind path in init_versalnet()
    - Redo the error unwind path in mc_probe() properly
    - Fix the ordering in mc_remove()
    ]

Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250908115649.22903-1-shubhrajyoti.datta@amd.com
Link: https://lore.kernel.org/r/20250703173105.GLaGa-WQCESDNsqygm@fat_crate.local
2025-09-15 16:22:27 +02:00
Qiuxu Zhuo
2292c8061c EDAC/skx_common: Use topology_physical_package_id() instead of open coding
Use topology_physical_package_id() to get the CPU package ID instead of
open coding.

Suggested-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20250903030648.3285935-1-qiuxu.zhuo@intel.com
2025-09-03 07:48:01 -07:00
Kuan-Wei Chiu
71965cae7d EDAC: Fix wrong executable file modes for C source files
Three EDAC source files were mistakenly marked as executable when adding the
EDAC scrub controls.

These are plain C source files and should not carry the executable bit.
Correcting their modes follows the principle of least privilege and avoids
unnecessary execute permissions in the repository.

  [ bp: Massage commit message. ]

Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250828191954.903125-1-visitorckw@gmail.com
2025-08-30 17:23:06 +02:00
Salah Triki
ff2a66d21f EDAC/altera: Delete an inappropriate dma_free_coherent() call
dma_free_coherent() must only be called if the corresponding
dma_alloc_coherent() call has succeeded. Calling it when the allocation fails
leads to undefined behavior.

Delete the wrong call.

  [ bp: Massage commit message. ]

Fixes: 71bcada88b ("edac: altera: Add Altera SDRAM EDAC support")
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/aIrfzzqh4IzYtDVC@pc
2025-08-25 13:56:16 +02:00
Jiri Slaby (SUSE)
776cc2ec15 EDAC/altera: Use dev_fwnode()
irq_domain_create_simple() takes fwnode as the first argument. It can be
extracted from the struct device using dev_fwnode() helper instead of using
of_node with of_fwnode_handle().

So use the dev_fwnode() helper.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Link: https://lore.kernel.org/20250723062631.1830757-1-jirislaby@kernel.org
2025-08-20 18:14:33 +02:00
Qiuxu Zhuo
a95dcf3d67 EDAC/skx_common: Remove unused *NUM*_IMC macros
There are no references to the *NUM*_IMC macros, so remove them.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20250731145534.2759334-8-qiuxu.zhuo@intel.com
2025-08-19 16:25:22 -07:00