linux

mirror of https://github.com/torvalds/linux.git synced 2026-03-07 23:04:33 +01:00

Author	SHA1	Message	Date
Jisheng Zhang	0100e495cd	arm64: make runtime const not usable by modules Similar as commit `284922f4c5` ("x86: uaccess: don't use runtime-const rewriting in modules") does, make arm64's runtime const not usable by modules too, to "make sure this doesn't get forgotten the next time somebody wants to do runtime constant optimizations". The reason is well explained in the above commit: "The runtime-const infrastructure was never designed to handle the modular case, because the constant fixup is only done at boot time for core kernel code." Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>	2026-03-04 16:13:58 +00:00
Linus Torvalds	4053c47680	[GIT PULL for v7.0-rc3] media fixes -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmmn9/8ACgkQCF8+vY7k 4RVyYw//SYCNeobk5OBImtOr8qq9VVs6H0Bxkofx+THkGEGrWhqvSz/DhZjdjJps nT5UFFKs7v2yM+HqcH8A1CqIz+Njd71a1dwIsos49T9RKJ7cSJ5eS5v1F2Ea38VF BchkLkcwM/GwX14ZpEigHOKMYJ5HJWlX5DFr/EC1msMxg+kmF3PJjB+jCrb5oU0w 4xbaE39fUFIxeEtXVVKPQ7kqJCXJwjUWdM+vfcTwxtUJUgFnpLjF6CzRTqwiDfrn 1vNhxVYzWq49dgvpBUVzw6K5E3gAtqmAJUVFNn76sNgq6z/LW1KpGbgtvRq6vOws 9gymtdu+S+gVm7/wCA3Qj6QmtmO4me5hEk0SwBjRpU8Osu/M4XDxvADdfJG7KAJB ae6jyyJTRgqUdtqJK5isv1emg+1+yt5Onm10c1uyVJa1qLiL5NGJwK2hg2GjCUFk 05SvNUToztNUdPcHzFKS25ZE+xfhjTeH7+ZHTc0sGD4EvtRQHPvp59g6EqLCKHcH 6F4+sCILO425sxrSSca3+KcseSvBdjqV1xJuaNwmyTN9g6uOjkwLGIiIX5GIOG5X FZ7I/xD+zLjeVLl5jrJSj3II+fddar2lOKnN41B6wjBs72tQIH2yvI3J73sViZJ6 /L3DzbfCOtkZJbPlm6Dzf3H+27HkZMIiihUs00bVqWZwlVGZdyw= =EOM4 -----END PGP SIGNATURE----- Merge tag 'media/v7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fix from Mauro Carvalho Chehab: "Fix for MPEG-TS decoder in dvb-net" * tag 'media/v7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: dvb-net: fix OOB access in ULE extension header tables	2026-03-04 08:12:06 -08:00
Shyam Prasad N	340cea84f6	cifs: open files should not hold ref on superblock Today whenever we deal with a file, in addition to holding a reference on the dentry, we also get a reference on the superblock. This happens in two cases: 1. when a new cinode is allocated 2. when an oplock break is being processed The reasoning for holding the superblock ref was to make sure that when umount happens, if there are users of inodes and dentries, it does not try to clean them up and wait for the last ref to superblock to be dropped by last of such users. But the side effect of doing that is that umount silently drops a ref on the superblock and we could have deferred closes and lease breaks still holding these refs. Ideally, we should ensure that all of these users of inodes and dentries are cleaned up at the time of umount, which is what this code is doing. This code change allows these code paths to use a ref on the dentry (and hence the inode). That way, umount is ensured to clean up SMB client resources when it's the last ref on the superblock (For ex: when same objects are shared). The code change also moves the call to close all the files in deferred close list to the umount code path. It also waits for oplock_break workers to be flushed before calling kill_anon_super (which eventually frees up those objects). Fixes: `24261fc23d` ("cifs: delay super block destruction until all cifsFileInfo objects are gone") Fixes: `705c79101c` ("smb: client: fix use-after-free in cifs_oplock_break") Cc: <stable@vger.kernel.org> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2026-03-04 10:11:39 -06:00
Linus Torvalds	40d3f62247	Pin control fixes for the v7.0 series: - Rename and fix up the Intel Equilibrium immutable interrupt chip. - Handle the Qualcomm QCS615 dual edge GPIO IRQ by adding the right flag. - Fix a memory leak in the widely used pinconf_generic_parse_dt_config() and a more local leak in aml_dt_node_to_map_pinmux(). - Fix double put in the Cirrus cs42l43_pin_probe(). - Staticize amdisp_pinctrl_ops, Qualcomm SDM660 groups and functions. - Unexport CIX sky1_pinctrl_pm_ops. - Fix configuration of deferred pin in the Rockchip driver. - Implement .get_direction() in the Sunxi driver quelching a dmesg warning message. - Fix a readout of the last bank of registers in the Cypress CY8C95x0 driver. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmmn5pUACgkQQRCzN7AZ XXMUBQ/8C7CXyAlP/RFrJasvdMNSLxOHFoutBXQWM+bvbULK80EbbIpbguwdQUm/ PkuozjJynt76zc4BgymqQDoCGQXIubVvhgYZMpDlJ9zcaqYeW8AwY4SlpYSpDqRw jpZwhpDxRGK8BelmN6mjMYrW1HL4hc70fyy6aIT/XS5wNCx/NSI2EJ/8PhWga7tS husAaXLk4i6dG4xFC4TUs5BepJAsTYnHc/L2XHY6d+OXxpvBcley964n5X2KmvG7 IrDrx2+9AcJGk84pZFgd1yODDTZ4yL2fJbxbwT7Qy4ZEVDKd/HviKGnA3Z8mdE5/ +ZvzPdOir0MKnKt6lEAVhGwduN1KsZei6lIZIMBMByYtpU+dAnm89mMD3LR2aRYH WmdUm4ml7fN6ghvqfZcjYcj7hlMWKc91LPiMMEWDCac8Gn1hDneN0z/VuIOovWM2 JyTtdaCIV9XdCDM0AbVmlUsvuCBF+T3jiXQE3k2TgUWLFFxes7DaKMbDvRoR+JsA 35VTvJzjpexZ1l1eGhmwsCOcnuxoH/FJ9gwJmXEy+nSB8oY2x4b8LGR+RgSI+R9W OJ/D08Z01IxXS69Acj6dl0N2LDjt5YoWVavqhtAMbRDkmU/W8DnTJxzHjNBaVrAE lfefWDwiMxOBCzMvJTQFm+NFe22b/uk19mrDqs+ibqJnIVBXdrE= =5Sap -----END PGP SIGNATURE----- Merge tag 'pinctrl-v7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "All of these are driver fixes except a memory leak in the pinconf_generic_parse_dt_config() helper which is the most important fix. - Rename and fix up the Intel Equilibrium immutable interrupt chip - Handle the Qualcomm QCS615 dual edge GPIO IRQ by adding the right flag - Fix a memory leak in the widely used pinconf_generic_parse_dt_config() and a more local leak in aml_dt_node_to_map_pinmux() - Fix double put in the Cirrus cs42l43_pin_probe() - Staticize amdisp_pinctrl_ops, Qualcomm SDM660 groups and functions - Unexport CIX sky1_pinctrl_pm_ops - Fix configuration of deferred pin in the Rockchip driver - Implement .get_direction() in the Sunxi driver squelching a dmesg warning message - Fix a readout of the last bank of registers in the Cypress CY8C95x0 driver" * tag 'pinctrl-v7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: cy8c95x0: Don't miss reading the last bank registers pinctrl: sunxi: Implement gpiochip::get_direction() pinctrl: rockchip: Fix configuring a deferred pin pinctrl: cirrus: cs42l43: Fix double-put in cs42l43_pin_probe() pinctrl: meson: amlogic-a4: Fix device node reference leak in aml_dt_node_to_map_pinmux() pinctrl: qcom: sdm660-lpass-lpi: Make groups and functions variables static pinctrl: cix: sky1: Unexport sky1_pinctrl_pm_ops pinctrl: amdisp: Make amdisp_pinctrl_ops variable static pinctrl: pinconf-generic: Fix memory leak in pinconf_generic_parse_dt_config() pinctrl: qcom: qcs615: Add missing dual edge GPIO IRQ errata flag pinctrl: equilibrium: fix warning trace on load pinctrl: equilibrium: rename irq_chip function callbacks	2026-03-04 08:03:43 -08:00
Jens Axboe	d90c470b0e	nvme fixes for Linux 7.0 - Improve quirk visibility and configurability (Maurizio) - Fix runtime user modification to queue setup (Keith) - Fix multipath leak on try_module_get failure (Keith) - Ignore ambiguous spec definitions for better atomics support (John) - Fix admin queue leak on controller reset (Ming) - Fix large allocation in persistent reservation read keys (Sungwoo Kim) - Fix fcloop callback handling (Justin) - Securely free DHCHAP secrets (Daniel) - Various cleanups and typo fixes (John, Wilfred) -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE3Fbyvv+648XNRdHTPe3zGtjzRgkFAmmoSbMACgkQPe3zGtjz RgkpuQ/9EfCp24xowwKEXycX7pquojwjEAh1n5WsUyBDXQls/7Dq3w0EXtkc8fA8 SUcDpTj7ABiF/faschCoFO47R5/0TPtNMCleWFSdW0OG6B7IYaUt9Cj86JK1dzme Zn7luH47Pesmd+H184IOIfDhsiVs5Z3YCISlT1aa1EFg+3/neDqGGpT4+ySOjSZe 9j8ASUTOqfuBZ2Xc8RNvumABBEkEkUd4xwYTLRi+o/PR9econGrpiEqDyUBAf8dr VrZoL0aoQoUEaU08tJOci4GH3Spp4RXlpQo92RBE4yDTxWozRRBWwoCycmPKHQ5b +5nC77t1p2OyzgP0xPngQZVMi7A+QTFZf4shq0Xho5kifjB8ZTqVSJJSGK7RlwE4 GmXgHfMs8Gvn3aew8BcpXilhe4InXfY1LqYmTvJxo9VLK/u7apo94vrJICewHh2z lsiWTOHe9xSm8wR20fcxp3D3kXpQ5sMcMoco96dVFetw1WNE30qDy+xtpOvPwdL5 9mloguR7Pmsu+gVim2VaqSA8HsPIYEbXymLMVzTeVbtPALzrKsGLLW8k/DYFhSTm +Ow4KeItyL5hgDU2jenjS3xwshKqKTeJDueue4WBFxgqdbH9hwiJ6aVWS2eoJxev RAZXSGTmxEo8X5nDsNz048iT96lFpM7ERViHOWnrptLcFX4yFNM= =fMd5 -----END PGP SIGNATURE----- Merge tag 'nvme-7.0-2026-03-04' of git://git.infradead.org/nvme into block-7.0 Pull NVMe fixes from Keith: "- Improve quirk visibility and configurability (Maurizio) - Fix runtime user modification to queue setup (Keith) - Fix multipath leak on try_module_get failure (Keith) - Ignore ambiguous spec definitions for better atomics support (John) - Fix admin queue leak on controller reset (Ming) - Fix large allocation in persistent reservation read keys (Sungwoo Kim) - Fix fcloop callback handling (Justin) - Securely free DHCHAP secrets (Daniel) - Various cleanups and typo fixes (John, Wilfred)" * tag 'nvme-7.0-2026-03-04' of git://git.infradead.org/nvme: nvme: fix memory allocation in nvme_pr_read_keys() nvme-multipath: fix leak on try_module_get failure nvmet-fcloop: Check remoteport port_state before calling done callback nvme-pci: do not try to add queue maps at runtime nvme-pci: cap queue creation to used queues nvme-pci: ensure we're polling a polled queue nvme: fix memory leak in quirks_param_set() nvme: correct comment about nvme_ns_remove() nvme: stop setting namespace gendisk device driver data nvme: add support for dynamic quirk configuration via module parameter nvme: fix admin queue leak on controller reset nvme-fabrics: use kfree_sensitive() for DHCHAP secrets nvme: stop using AWUPF nvme: expose active quirks in sysfs nvme/host: fixup some typos	2026-03-04 08:15:17 -07:00
Sungwoo Kim	c332015376	nvme: fix memory allocation in nvme_pr_read_keys() nvme_pr_read_keys() takes num_keys from userspace and uses it to calculate the allocation size for rse via struct_size(). The upper limit is PR_KEYS_MAX (64K). A malicious or buggy userspace can pass a large num_keys value that results in a 4MB allocation attempt at most, causing a warning in the page allocator when the order exceeds MAX_PAGE_ORDER. To fix this, use kvzalloc() instead of kzalloc(). This bug has the same reasoning and fix with the patch below: https://lore.kernel.org/linux-block/20251212013510.3576091-1-kartikey406@gmail.com/ Warning log: WARNING: mm/page_alloc.c:5216 at __alloc_frozen_pages_noprof+0x5aa/0x2300 mm/page_alloc.c:5216, CPU#1: syz-executor117/272 Modules linked in: CPU: 1 UID: 0 PID: 272 Comm: syz-executor117 Not tainted 6.19.0 #1 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 RIP: 0010:__alloc_frozen_pages_noprof+0x5aa/0x2300 mm/page_alloc.c:5216 Code: ff 83 bd a8 fe ff ff 0a 0f 86 69 fb ff ff 0f b6 1d f9 f9 c4 04 80 fb 01 0f 87 3b 76 30 ff 83 e3 01 75 09 c6 05 e4 f9 c4 04 01 <0f> 0b 48 c7 85 70 fe ff ff 00 00 00 00 e9 8f fd ff ff 31 c0 e9 0d RSP: 0018:ffffc90000fcf450 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffff920001f9ea0 RDX: 0000000000000000 RSI: 000000000000000b RDI: 0000000000040dc0 RBP: ffffc90000fcf648 R08: ffff88800b6c3380 R09: 0000000000000001 R10: ffffc90000fcf840 R11: ffff88807ffad280 R12: 0000000000000000 R13: 0000000000040dc0 R14: 0000000000000001 R15: ffffc90000fcf620 FS: 0000555565db33c0(0000) GS:ffff8880be26c000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000002000000c CR3: 0000000003b72000 CR4: 00000000000006f0 Call Trace: <TASK> alloc_pages_mpol+0x236/0x4d0 mm/mempolicy.c:2486 alloc_frozen_pages_noprof+0x149/0x180 mm/mempolicy.c:2557 ___kmalloc_large_node+0x10c/0x140 mm/slub.c:5598 __kmalloc_large_node_noprof+0x25/0xc0 mm/slub.c:5629 __do_kmalloc_node mm/slub.c:5645 [inline] __kmalloc_noprof+0x483/0x6f0 mm/slub.c:5669 kmalloc_noprof include/linux/slab.h:961 [inline] kzalloc_noprof include/linux/slab.h:1094 [inline] nvme_pr_read_keys+0x8f/0x4c0 drivers/nvme/host/pr.c:245 blkdev_pr_read_keys block/ioctl.c:456 [inline] blkdev_common_ioctl+0x1b71/0x29b0 block/ioctl.c:730 blkdev_ioctl+0x299/0x700 block/ioctl.c:786 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl fs/ioctl.c:583 [inline] __x64_sys_ioctl+0x1bf/0x220 fs/ioctl.c:583 x64_sys_call+0x1280/0x21b0 mnt/fuzznvme_1/fuzznvme/linux-build/v6.19/./arch/x86/include/generated/asm/syscalls_64.h:17 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x71/0x330 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fb893d3108d Code: 28 c3 e8 46 1e 00 00 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffff61f2f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffff61f3138 RCX: 00007fb893d3108d RDX: 0000000020000040 RSI: 00000000c01070ce RDI: 0000000000000003 RBP: 0000000000000001 R08: 0000000000000000 R09: 00007ffff61f3138 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 R13: 00007ffff61f3128 R14: 00007fb893dae530 R15: 0000000000000001 </TASK> Fixes: `5fd96a4e15` (nvme: Add pr_ops read_keys support) Acked-by: Chao Shi <cshi008@fiu.edu> Acked-by: Weidong Zhu <weizhu@fiu.edu> Acked-by: Dave Tian <daveti@purdue.edu> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Sungwoo Kim <iam@sung-woo.kim> Signed-off-by: Keith Busch <kbusch@kernel.org>	2026-03-04 06:53:41 -08:00
Catalin Marinas	c25c4aa3f7	arm64: mm: Add PTE_DIRTY back to PAGE_KERNEL* to fix kexec/hibernation Commit `143937ca51` ("arm64, mm: avoid always making PTE dirty in pte_mkwrite()") changed pte_mkwrite_novma() to only clear PTE_RDONLY when PTE_DIRTY is set. This was to allow writable-clean PTEs for swap pages that haven't actually been written. However, this broke kexec and hibernation for some platforms. Both go through trans_pgd_create_copy() -> _copy_pte(), which calls pte_mkwrite_novma() to make the temporary linear-map copy fully writable. With the updated pte_mkwrite_novma(), read-only kernel pages (without PTE_DIRTY) remain read-only in the temporary mapping. While such behaviour is fine for user pages where hardware DBM or trapping will make them writeable, subsequent in-kernel writes by the kexec relocation code will fault. Add PTE_DIRTY back to all _PAGE_KERNEL* protection definitions. This was the case prior to 5.4, commit `aa57157be6` ("arm64: Ensure VM_WRITE\|VM_SHARED ptes are clean by default"). With the kernel linear-map PTEs always having PTE_DIRTY set, pte_mkwrite_novma() correctly clears PTE_RDONLY. Fixes: `143937ca51` ("arm64, mm: avoid always making PTE dirty in pte_mkwrite()") Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: stable@vger.kernel.org Reported-by: Jianpeng Chang <jianpeng.chang.cn@windriver.com> Link: https://lore.kernel.org/r/20251204062722.3367201-1-jianpeng.chang.cn@windriver.com Cc: Will Deacon <will@kernel.org> Cc: Huang, Ying <ying.huang@linux.alibaba.com> Cc: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com> Signed-off-by: Will Deacon <will@kernel.org>	2026-03-04 14:33:08 +00:00
Juergen Gross	e2dcf90655	xen/xenbus: better handle backend crash When the backend domain crashes, coordinated device cleanup is not possible (as it involves waiting for the backend state change). In that case, toolstack forcefully removes frontend xenstore entries. xenbus_dev_changed() handles this case, and triggers device cleanup. It's possible that toolstack manages to connect new device in that place, before xenbus_dev_changed() notices the old one is missing. If that happens, new one won't be probed and will forever remain in XenbusStateInitialising. Fix this by checking the frontend's state in Xenstore. In case it has been reset to XenbusStateInitialising by Xen tools, consider this being the result of an unplug+plug operation. It's important that cleanup on such unplug doesn't modify Xenstore entries (especially the "state" key) as it belong to the new device to be probed - changing it would derail establishing connection to the new backend (most likely, closing the device before it was even connected). Handle this case by setting new xenbus_device->vanished flag to true, and check it before changing state entry. And even if xenbus_dev_changed() correctly detects the device was forcefully removed, the cleanup handling is still racy. Since this whole handling doesn't happened in a single Xenstore transaction, it's possible that toolstack might put a new device there already. Avoid re-creating the state key (which in the case of loosing the race would actually close newly attached device). The problem does not apply to frontend domain crash, as this case involves coordinated cleanup. Problem originally reported at https://lore.kernel.org/xen-devel/aOZvivyZ9YhVWDLN@mail-itl/T/#t, including reproduction steps. Based-on-patch-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Juergen Gross <jgross@suse.com> Message-ID: <20260218095205.453657-3-jgross@suse.com>	2026-03-04 15:31:40 +01:00
Juergen Gross	82169dace4	xenbus: add xenbus_device parameter to xenbus_read_driver_state() In order to prepare checking the xenbus device status in xenbus_read_driver_state(), add the pointer to struct xenbus_device as a parameter. Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: "Martin K. Petersen" <martin.petersen@oracle.com> # SCSI Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # drivers/pci/xen-pcifront.c Signed-off-by: Juergen Gross <jgross@suse.com> Message-ID: <20260218095205.453657-2-jgross@suse.com>	2026-03-04 15:31:40 +01:00
Catalin Marinas	212dd84776	arm64: Silence sparse warnings caused by the type casting in (cmp)xchg The arm64 xchg/cmpxchg() wrappers cast the arguments to (unsigned long) prior to invoking the static inline functions implementing the operation. Some restrictive type annotations (e.g. __bitwise) lead to sparse warnings like below: sparse warnings: (new ones prefixed by >>) fs/crypto/bio.c:67:17: sparse: sparse: cast from restricted blk_status_t >> fs/crypto/bio.c:67:17: sparse: sparse: cast to restricted blk_status_t Force the casting in the arm64 xchg/cmpxchg() wrappers to silence sparse. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202602230947.uNRsPyBn-lkp@intel.com/ Link: https://lore.kernel.org/r/202602230947.uNRsPyBn-lkp@intel.com/ Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Will Deacon <will@kernel.org>	2026-03-04 14:31:27 +00:00
Yang Xiuwei	8da8df4312	block: use __bio_add_page in bio_copy_kern Since the bio is allocated with the exact number of pages needed via blk_rq_map_bio_alloc(), and the loop iterates exactly that many times, bio_add_page() cannot fail due to insufficient space. Switch to __bio_add_page() and remove the dead error handling code. Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Yang Xiuwei <yangxiuwei@kylinos.cn> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-03-04 06:59:26 -07:00
Varun Gupta	0cfe9c4838	drm/xe: Fix memory leak in xe_vm_madvise_ioctl When check_bo_args_are_sane() validation fails, jump to the new free_vmas cleanup label to properly free the allocated resources. This ensures proper cleanup in this error path. Fixes: `293032eec4` ("drm/xe/bo: Update atomic_access attribute on madvise") Cc: stable@vger.kernel.org # v6.18+ Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Varun Gupta <varun.gupta@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260223175145.1532801-1-varun.gupta@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> (cherry picked from commit 29bd06faf727a4b76663e4be0f7d770e2d2a7965) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-03-04 08:54:19 -05:00
Shuicheng Lin	3091723785	drm/xe/reg_sr: Fix leak on xa_store failure Free the newly allocated entry when xa_store() fails to avoid a memory leak on the error path. v2: use goto fail_free. (Bala) Fixes: `e5283bd4df` ("drm/xe/reg_sr: Remove register pool") Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260204172810.1486719-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 6bc6fec71ac45f52db609af4e62bdb96b9f5fadb) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-03-04 08:54:19 -05:00
Matt Roper	89865e6dc8	drm/xe/xe2_hpg: Correct implementation of Wa_16025250150 Wa_16025250150 asks us to set five register fields of the register to 0x1 each. However we were just OR'ing this into the existing register value (which has a default of 0x4 for each nibble-sized field) resulting in final field values of 0x5 instead of the desired 0x1. Correct the RTP programming (use FIELD_SET instead of SET) to ensure each field is assigned to exactly the value we want. Cc: Aradhya Bhatia <aradhya.bhatia@intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: stable@vger.kernel.org # v6.16+ Fixes: `7654d51f1f` ("drm/xe/xe2hpg: Add Wa_16025250150") Reviewed-by: Ngai-Mint Kwan <ngai-mint.kwan@linux.intel.com> Link: https://patch.msgid.link/20260227164341.3600098-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit d139209ef88e48af1f6731cd45440421c757b6b5) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-03-04 08:54:18 -05:00
Zhanjun Dong	b3368ecca9	drm/xe/gsc: Fix GSC proxy cleanup on early initialization failure xe_gsc_proxy_remove undoes what is done in both xe_gsc_proxy_init and xe_gsc_proxy_start; however, if we fail between those 2 calls, it is possible that the HW forcewake access hasn't been initialized yet and so we hit errors when the cleanup code tries to write GSC register. To avoid that, split the cleanup in 2 functions so that the HW cleanup is only called if the HW setup was completed successfully. Since the HW cleanup (interrupt disabling) is now removed from xe_gsc_proxy_remove, the cleanup on error paths in xe_gsc_proxy_start must be updated to disable interrupts before returning. Fixes: `ff6cd29b69` ("drm/xe: Cleanup unwind of gt initialization") Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patch.msgid.link/20260220225308.101469-1-zhanjun.dong@intel.com (cherry picked from commit 2b37c401b265c07b46408b5cb36a4b757c9b5060) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-03-04 08:54:18 -05:00
Thomas Hellström	a99d34e5ec	Revert "drm/pagemap: Disable device-to-device migration" With commit a69d1ab971a6 ("mm: Fix a hmm_range_fault() livelock / starvation problem") device-to-device migration is not functional again and the disabling can be reverted. Add the above commit as a Fixes: tag in order for the revert to not take place unless that commit is present. This reverts commit `10dd1eaa80`. Cc: Matthew Brost <matthew.brost@intel.com> Fixes: `b570f37a2c` ("mm: Fix a hmm_range_fault() livelock / starvation problem") Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260211104159.114947-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 1a3c0049b3f56278c9caf2784c53f6ab435fd12c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Rodrigo updated Fixes tag]	2026-03-04 08:53:37 -05:00
Darrick J. Wong	d320f160aa	iomap: reject delalloc mappings during writeback Filesystems should never provide a delayed allocation mapping to writeback; they're supposed to allocate the space before replying. This can lead to weird IO errors and crashes in the block layer if the filesystem is being malicious, or if it hadn't set iomap->dev because it's a delalloc mapping. Fix this by failing writeback on delalloc mappings. Currently no filesystems actually misbehave in this manner, but we ought to be stricter about things like that. Cc: stable@vger.kernel.org # v5.5 Fixes: `598ecfbaa7` ("iomap: lift the xfs writeback code to iomap") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Link: https://patch.msgid.link/20260302173002.GL13829@frogsfrogsfrogs Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>	2026-03-04 14:31:56 +01:00
Pavel Begunkov	531bb98a03	io_uring/zcrx: use READ_ONCE with user shared RQEs Refill queue entries are shared with the user space, use READ_ONCE when reading them. Fixes: `34a3e60821` ("io_uring/zcrx: implement zerocopy receive pp memory provider"); Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-03-04 06:30:39 -07:00
Jouni Högander	a99cac460d	drm/i915/psr: Fix for Panel Replay X granularity DPCD register handling DP specification is saying value 0xff 0xff in PANEL REPLAY SELECTIVE UPDATE X GRANULARITY CAPABILITY registers (0xb2 and 0xb3) means full-line granularity. Take this into account when handling Panel Replay X granularity informed by the panel. Fixes: `1cc8546474` ("drm/i915/psr: Use SU granularity information available in intel_connector") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7284 Tested-by: Mark Pearson <mpearson-lenovo@squebb.ca> Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Link: https://patch.msgid.link/20260225074221.1744330-2-jouni.hogander@intel.com (cherry picked from commit f5c8f824a495e849492f09a43bd965a8f4d86cb2) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2026-03-04 15:26:08 +02:00
Jouni Högander	ace7dcc818	drm/dp: Add definition for Panel Replay full-line granularity DP specification is saying value 0xff 0xff in PANEL REPLAY SELECTIVE UPDATE X GRANULARITY CAPABILITY registers (0xb2 and 0xb3) means full-line granularity. Add definition for this. Cc: dri-devel@lists.freedesktop.org Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patch.msgid.link/20260225074221.1744330-1-jouni.hogander@intel.com (cherry picked from commit b93311673263bb98a200ab1cb6304f969bdada5c) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2026-03-04 15:26:08 +02:00
Christian Brauner	d3ccc4d86d	Merge patch "iomap: don't mark folio uptodate if read IO has bytes pending" Joanne Koong <joannelkoong@gmail.com> says: This is a fix for this scenario: ->read_folio() gets called on a folio size that is 16k while the file is 4k: a) ifs->read_bytes_pending gets initialized to 16k b) ->read_folio_range() is called for the 4k read c) the 4k read succeeds, ifs->read_bytes_pending is now 12k and the 0 to 4k range is marked uptodate d) the post-eof blocks are zeroed and marked uptodate in the call to iomap_set_range_uptodate() e) iomap_set_range_uptodate() sees all the ranges are marked uptodate and it marks the folio uptodate f) iomap_read_end() gets called to subtract the 12k from ifs->read_bytes_pending. it too sees all the ranges are marked uptodate and marks the folio uptodate using XOR g) the XOR call clears the uptodate flag on the folio The same situation can occur if the last range read for the folio is done as an inline read and all the previous ranges have already completed by the time the inline read completes. For more context, the full discussion can be found in [1]. There was a discussion about alternative approaches in that thread, but they had more complications. There is another discussion in v1 [2] about consolidating the read paths. Until that is resolved, this patch fixes the issue. [1] https://lore.kernel.org/linux-fsdevel/CAJnrk1Z9za5w4FoJqTGx50zR2haHHaoot1KJViQyEHJQq4=34w@mail.gmail.com/#t [2] https://lore.kernel.org/linux-fsdevel/20260219003911.344478-1-joannelkoong@gmail.com/T/#u * patches from https://patch.msgid.link/20260303233420.874231-1-joannelkoong@gmail.com: iomap: don't mark folio uptodate if read IO has bytes pending Link: https://patch.msgid.link/20260303233420.874231-1-joannelkoong@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2026-03-04 14:19:05 +01:00
Joanne Koong	debc1a492b	iomap: don't mark folio uptodate if read IO has bytes pending If a folio has ifs metadata attached to it and the folio is partially read in through an async IO helper with the rest of it then being read in through post-EOF zeroing or as inline data, and the helper successfully finishes the read first, then post-EOF zeroing / reading inline will mark the folio as uptodate in iomap_set_range_uptodate(). This is a problem because when the read completion path later calls iomap_read_end(), it will call folio_end_read(), which sets the uptodate bit using XOR semantics. Calling folio_end_read() on a folio that was already marked uptodate clears the uptodate bit. Fix this by not marking the folio as uptodate if the read IO has bytes pending. The folio uptodate state will be set in the read completion path through iomap_end_read() -> folio_end_read(). Reported-by: Wei Gao <wegao@suse.com> Suggested-by: Sasha Levin <sashal@kernel.org> Tested-by: Wei Gao <wegao@suse.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Cc: stable@vger.kernel.org # v6.19 Link: https://lore.kernel.org/linux-fsdevel/aYbmy8JdgXwsGaPP@autotest-wegao.qe.prg2.suse.org/ Fixes: `b2f35ac414` ("iomap: add caller-provided callbacks for read and readahead") Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Link: https://patch.msgid.link/20260303233420.874231-2-joannelkoong@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2026-03-04 14:18:54 +01:00
Gerd Rausch	6932256d3a	time/jiffies: Fix sysctl file error on configurations where USER_HZ < HZ Commit `2dc164a48e` ("sysctl: Create converter functions with two new macros") incorrectly returns error to user space when jiffies sysctl converter is used. The old overflow check got replaced with an unconditional one: + if (USER_HZ < HZ) + return -EINVAL; which will always be true on configurations with "USER_HZ < HZ". Remove the check; it is no longer needed as clock_t_to_jiffies() returns ULONG_MAX for the overflow case and proc_int_u2k_conv_uop() checks for "> INT_MAX" after conversion Fixes: `2dc164a48e` ("sysctl: Create converter functions with two new macros") Reported-by: Colm Harrington <colm.harrington@oracle.com> Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>	2026-03-04 13:48:31 +01:00
Zhang Heng	325291b20f	ASoC: amd: yc: Add DMI quirk for ASUS EXPERTBOOK PM1503CDA Add a DMI quirk for the ASUS EXPERTBOOK PM1503CDA fixing the issue where the internal microphone was not detected. Link: https://bugzilla.kernel.org/show_bug.cgi?id=221070 Cc: stable@vger.kernel.org Signed-off-by: Zhang Heng <zhangheng@kylinos.cn> Link: https://patch.msgid.link/20260304063255.139331-1-zhangheng@kylinos.cn Signed-off-by: Mark Brown <broonie@kernel.org>	2026-03-04 11:40:17 +00:00
Biju Das	fbb143e4a6	ASoC: dt-bindings: renesas,rz-ssi: Document RZ/G3L SoC Document RZ/G3L SSIF-2 bindings. The RZ/G3L SSIF-2 IP is identical to one found on the RZ/G2L SoC. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://patch.msgid.link/20260304072000.6787-1-biju.das.jz@bp.renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>	2026-03-04 11:40:16 +00:00
Maarten Lankhorst	a58d487fb1	drm/ttm/tests: Fix build failure on PREEMPT_RT Fix a compile error in the kunit tests when CONFIG_PREEMPT_RT is enabled, and the normal mutex is converted into a rtmutex. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202602261547.3bM6yVAS-lkp@intel.com/ Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Link: https://patch.msgid.link/20260304085616.1216961-1-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>	2026-03-04 11:31:54 +01:00
Shawn Lin	0fb59eaca1	pmdomain: rockchip: Fix PD_VCODEC for RK3588 >From the RK3588 TRM Table 7-1 RK3588 Voltage Domain and Power Domain Summary, PD_RKVDEC0/1 and PD_VENC0/1 rely on VD_VCODEC which require extra voltages to be applied, otherwise it breaks RK3588-evb1-v10 board after vdec support landed[1]. The panic looks like below: rockchip-pm-domain fd8d8000.power-management:power-controller: failed to set domain 'rkvdec0' on, val=0 rockchip-pm-domain fd8d8000.power-management:power-controller: failed to set domain 'rkvdec1' on, val=0 ... Hardware name: Rockchip RK3588S EVB1 V10 Board (DT) Workqueue: pm genpd_power_off_work_fn Call trace: show_stack+0x18/0x24 (C) dump_stack_lvl+0x40/0x84 dump_stack+0x18/0x24 vpanic+0x1ec/0x4fc vpanic+0x0/0x4fc check_panic_on_warn+0x0/0x94 arm64_serror_panic+0x6c/0x78 do_serror+0xc4/0xcc el1h_64_error_handler+0x3c/0x5c el1h_64_error+0x6c/0x70 regmap_mmio_read32le+0x18/0x24 (P) regmap_bus_reg_read+0xfc/0x130 regmap_read+0x188/0x1ac regmap_read+0x54/0x78 rockchip_pd_power+0xcc/0x5f0 rockchip_pd_power_off+0x1c/0x4c genpd_power_off+0x84/0x120 genpd_power_off+0x1b4/0x260 genpd_power_off_work_fn+0x38/0x58 process_scheduled_works+0x194/0x2c4 worker_thread+0x2ac/0x3d8 kthread+0x104/0x124 ret_from_fork+0x10/0x20 SMP: stopping secondary CPUs Kernel Offset: disabled CPU features: 0x3000000,000e0005,40230521,0400720b Memory Limit: none ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]--- Chaoyi pointed out the PD_VCODEC is the parent of PD_RKVDEC0/1 and PD_VENC0/1, so checking the PD_VCODEC is enough. [1] https://lore.kernel.org/linux-rockchip/20251020212009.8852-2-detlev.casanova@collabora.com/ Fixes: `db6df2e3fc` ("pmdomain: rockchip: add regulator support") Cc: stable@vger.kernel.org Suggested-by: Chaoyi Chen <chaoyi.chen@rock-chips.com> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Reviewed-by: Chaoyi Chen <chaoyi.chen@rock-chips.com> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>	2026-03-04 11:22:36 +01:00
Harry Yoo	6432f15c81	mm/slab: change stride type from unsigned short to unsigned int Commit `7a8e71bc61` ("mm/slab: use stride to access slabobj_ext") defined the type of slab->stride as unsigned short, because the author initially planned to store stride within the lower 16 bits of the page_type field, but later stored it in unused bits in the counters field instead. However, the idea of having only 2-byte stride turned out to be a serious mistake. On systems with 64k pages, order-1 pages are 128k, which is larger than USHRT_MAX. It triggers a debug warning because s->size is 128k while stride, truncated to 2 bytes, becomes zero: ------------[ cut here ]------------ Warning! stride (0) != s->size (131072) WARNING: mm/slub.c:2231 at alloc_slab_obj_exts_early.constprop.0+0x524/0x534, CPU#6: systemd-sysctl/307 Modules linked in: CPU: 6 UID: 0 PID: 307 Comm: systemd-sysctl Not tainted 7.0.0-rc1+ #6 PREEMPTLAZY Hardware name: IBM,9009-22A POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW950.E0 (VL950_179) hv:phyp pSeries NIP: c0000000008a9ac0 LR: c0000000008a9abc CTR: 0000000000000000 REGS: c0000000141f7390 TRAP: 0700 Not tainted (7.0.0-rc1+) MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 28004400 XER: 00000005 CFAR: c000000000279318 IRQMASK: 0 GPR00: c0000000008a9abc c0000000141f7630 c00000000252a300 c00000001427b200 GPR04: 0000000000000004 0000000000000000 c000000000278fd0 0000000000000000 GPR08: fffffffffffe0000 0000000000000000 0000000000000000 0000000022004400 GPR12: c000000000f644b0 c000000017ff8f00 0000000000000000 0000000000000000 GPR16: 0000000000000000 c0000000141f7aa0 0000000000000000 c0000000141f7a88 GPR20: 0000000000000000 0000000000400cc0 ffffffffffffffff c00000001427b180 GPR24: 0000000000000004 00000000000c0cc0 c000000004e89a20 c00000005de90011 GPR28: 0000000000010010 c00000005df00000 c000000006017f80 c00c000000177a00 NIP [c0000000008a9ac0] alloc_slab_obj_exts_early.constprop.0+0x524/0x534 LR [c0000000008a9abc] alloc_slab_obj_exts_early.constprop.0+0x520/0x534 Call Trace: [c0000000141f7630] [c0000000008a9abc] alloc_slab_obj_exts_early.constprop.0+0x520/0x534 (unreliable) [c0000000141f76c0] [c0000000008aafbc] allocate_slab+0x154/0x94c [c0000000141f7760] [c0000000008b41c0] refill_objects+0x124/0x16c [c0000000141f77c0] [c0000000008b4be0] __pcs_replace_empty_main+0x2b0/0x444 [c0000000141f7810] [c0000000008b9600] __kvmalloc_node_noprof+0x840/0x914 [c0000000141f7900] [c000000000a3dd40] seq_read_iter+0x60c/0xb00 [c0000000141f7a10] [c000000000b36b24] proc_reg_read_iter+0x154/0x1fc [c0000000141f7a50] [c0000000009cee7c] vfs_read+0x39c/0x4e4 [c0000000141f7b30] [c0000000009d0214] ksys_read+0x9c/0x180 [c0000000141f7b90] [c00000000003a8d0] system_call_exception+0x1e0/0x4b0 [c0000000141f7e50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec This leads to slab_obj_ext() returning the first slabobj_ext or all objects and confuses the reference counting of object cgroups [1] and memory (un)charging for memory cgroups [2]. Fortunately, the counters field has 32 unused bits instead of 16 on 64-bit CPUs, which is wide enough to hold any value of s->size. Change the type to unsigned int. Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Closes: https://lore.kernel.org/lkml/ca241daa-e7e7-4604-a48d-de91ec9184a5@linux.ibm.com [1] Closes: https://lore.kernel.org/all/ddff7c7d-c0c3-4780-808f-9a83268bbf0c@linux.ibm.com [2] Fixes: `7a8e71bc61` ("mm/slab: use stride to access slabobj_ext") Signed-off-by: Harry Yoo <harry.yoo@oracle.com> Link: https://patch.msgid.link/20260303135722.2680521-1-harry.yoo@oracle.com Reviewed-by: Hao Li <hao.li@linux.dev> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>	2026-03-04 11:05:57 +01:00
Vlastimil Babka (SUSE)	fb1091febd	mm/slab: allow sheaf refill if blocking is not allowed Ming Lei reported [1] a regression in the ublk null target benchmark due to sheaves. The profile shows that the alloc_from_pcs() fastpath fails and allocations fall back to ___slab_alloc(). It also shows the allocations happen through mempool_alloc(). The strategy of mempool_alloc() is to call the underlying allocator (here slab) without __GFP_DIRECT_RECLAIM first. This does not play well with __pcs_replace_empty_main() checking for gfpflags_allow_blocking() to decide if it should refill an empty sheaf or fallback to the slowpath, so we end up falling back. We could change the mempool strategy but there might be other paths doing the same ting. So instead allow sheaf refill when blocking is not allowed, changing the condition to gfpflags_allow_spinning(). The original condition was unnecessarily restrictive. Note this doesn't fully resolve the regression [1] as another component of that are memoryless nodes, which is to be addressed separately. Reported-by: Ming Lei <ming.lei@redhat.com> Fixes: `e47c897a29` ("slab: add sheaves to most caches") Link: https://lore.kernel.org/all/aZ0SbIqaIkwoW2mB@fedora/ [1] Link: https://patch.msgid.link/20260302095536.34062-2-vbabka@kernel.org Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>	2026-03-04 11:03:54 +01:00
Niklas Cassel	aac9b27f7c	ata: libata: cancel pending work after clearing deferred_qc Syzbot reported a WARN_ON() in ata_scsi_deferred_qc_work(), caused by ap->ops->qc_defer() returning non-zero before issuing the deferred qc. ata_scsi_schedule_deferred_qc() is called during each command completion. This function will check if there is a deferred QC, and if ap->ops->qc_defer() returns zero, meaning that it is possible to queue the deferred qc at this time (without being deferred), then it will queue the work which will issue the deferred qc. Once the work get to run, which can potentially be a very long time after the work was scheduled, there is a WARN_ON() if ap->ops->qc_defer() returns non-zero. While we hold the ap->lock both when assigning and clearing deferred_qc, and the work itself holds the ap->lock, the code currently does not cancel the work after clearing the deferred qc. This means that the following scenario can happen: 1) One or several NCQ commands are queued. 2) A non-NCQ command is queued, gets stored in ap->deferred_qc. 3) Last NCQ command gets completed, work is queued to issue the deferred qc. 4) Timeout or error happens, ap->deferred_qc is cleared. The queued work is currently NOT canceled. 5) Port is reset. 6) One or several NCQ commands are queued. 7) A non-NCQ command is queued, gets stored in ap->deferred_qc. 8) Work is finally run. Yet at this time, there is still NCQ commands in flight. The work in 8) really belongs to the non-NCQ command in 2), not to the non-NCQ command in 7). The reason why the work is executed when it is not supposed to, is because it was never canceled when ap->deferred_qc was cleared in 4). Thus, ensure that we always cancel the work after clearing ap->deferred_qc. Another potential fix would have been to let ata_scsi_deferred_qc_work() do nothing if ap->ops->qc_defer() returns non-zero. However, canceling the work when clearing ap->deferred_qc seems slightly more logical, as we hold the ap->lock when clearing ap->deferred_qc, so we know that the work cannot be holding the lock. (The function could be waiting for the lock, but that is okay since it will do nothing if ap->deferred_qc is not set.) Reported-by: syzbot+bcaf842a1e8ead8dfb89@syzkaller.appspotmail.com Fixes: `0ea84089db` ("ata: libata-scsi: avoid Non-NCQ command starvation") Fixes: `eddb98ad93` ("ata: libata-eh: correctly handle deferred qc timeouts") Reviewed-by: Igor Pylypiv <ipylypiv@google.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org>	2026-03-04 10:42:12 +01:00
Yujie Liu	61ded1083b	drm/sched: Fix kernel-doc warning for drm_sched_job_done() There is a kernel-doc warning for the scheduler: Warning: drivers/gpu/drm/scheduler/sched_main.c:367 function parameter 'result' not described in 'drm_sched_job_done' Fix the warning by describing the undocumented error code. Fixes: `539f9ee4b5` ("drm/scheduler: properly forward fence errors") Signed-off-by: Yujie Liu <yujie.liu@intel.com> [phasta: Flesh out commit message] Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patch.msgid.link/20260227082452.1802922-1-yujie.liu@intel.com	2026-03-04 10:29:27 +01:00
Maximilian Pezzullo	b3b1d3ae1d	ata: libata-core: Disable LPM on ST1000DM010-2EP102 According to a user report, the ST1000DM010-2EP102 has problems with LPM, causing random system freezes. The drive belongs to the same BarraCuda family as the ST2000DM008-2FR102 which has the same issue. Cc: stable@vger.kernel.org Fixes: `7627a0edef` ("ata: ahci: Drop low power policy board type") Reported-by: Filippo Baiamonte <filippo.ba03@bugzilla.kernel.org> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221163 Signed-off-by: Maximilian Pezzullo <maximilianpezzullo@gmail.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org>	2026-03-04 09:16:56 +01:00
Naveen Anandhan	fbdfa8da05	selftests: tc-testing: fix list_categories() crash on list type list_categories() builds a set directly from the 'category' field of each test case. Since 'category' is a list, set(map(...)) attempts to insert lists into a set, which raises: TypeError: unhashable type: 'list' Flatten category lists and collect unique category names using set.update() instead. Signed-off-by: Naveen Anandhan <mr.navi8680@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2026-03-04 05:42:57 +00:00
Qing Wang	e39bb9e02b	tracing: Fix WARN_ON in tracing_buffers_mmap_close When a process forks, the child process copies the parent's VMAs but the user_mapped reference count is not incremented. As a result, when both the parent and child processes exit, tracing_buffers_mmap_close() is called twice. On the second call, user_mapped is already 0, causing the function to return -ENODEV and triggering a WARN_ON. Normally, this isn't an issue as the memory is mapped with VM_DONTCOPY set. But this is only a hint, and the application can call madvise(MADVISE_DOFORK) which resets the VM_DONTCOPY flag. When the application does that, it can trigger this issue on fork. Fix it by incrementing the user_mapped reference count without re-mapping the pages in the VMA's open callback. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Vincent Donnefort <vdonnefort@google.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Link: https://patch.msgid.link/20260227025842.1085206-1-wangqing7171@gmail.com Fixes: `cf9f0f7c4c` ("tracing: Allow user-space mapping of the ring-buffer") Reported-by: syzbot+3b5dd2030fe08afdf65d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=3b5dd2030fe08afdf65d Tested-by: syzbot+3b5dd2030fe08afdf65d@syzkaller.appspotmail.com Signed-off-by: Qing Wang <wangqing7171@gmail.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2026-03-03 22:25:32 -05:00
Masami Hiramatsu (Google)	a5dd6f5866	tracing: Disable preemption in the tracepoint callbacks handling filtered pids Filtering PIDs for events triggered the following during selftests: [37] event tracing - restricts events based on pid notrace filtering [ 155.874095] [ 155.874869] ============================= [ 155.876037] WARNING: suspicious RCU usage [ 155.877287] 7.0.0-rc1-00004-g8cd473a19bc7 #7 Not tainted [ 155.879263] ----------------------------- [ 155.882839] kernel/trace/trace_events.c:1057 suspicious rcu_dereference_check() usage! [ 155.889281] [ 155.889281] other info that might help us debug this: [ 155.889281] [ 155.894519] [ 155.894519] rcu_scheduler_active = 2, debug_locks = 1 [ 155.898068] no locks held by ftracetest/4364. [ 155.900524] [ 155.900524] stack backtrace: [ 155.902645] CPU: 1 UID: 0 PID: 4364 Comm: ftracetest Not tainted 7.0.0-rc1-00004-g8cd473a19bc7 #7 PREEMPT(lazy) [ 155.902648] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-debian-1.17.0-1 04/01/2014 [ 155.902651] Call Trace: [ 155.902655] <TASK> [ 155.902659] dump_stack_lvl+0x67/0x90 [ 155.902665] lockdep_rcu_suspicious+0x154/0x1a0 [ 155.902672] event_filter_pid_sched_process_fork+0x9a/0xd0 [ 155.902678] kernel_clone+0x367/0x3a0 [ 155.902689] __x64_sys_clone+0x116/0x140 [ 155.902696] do_syscall_64+0x158/0x460 [ 155.902700] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 155.902702] ? trace_irq_disable+0x1d/0xc0 [ 155.902709] entry_SYSCALL_64_after_hwframe+0x77/0x7f [ 155.902711] RIP: 0033:0x4697c3 [ 155.902716] Code: 1f 84 00 00 00 00 00 64 48 8b 04 25 10 00 00 00 45 31 c0 31 d2 31 f6 bf 11 00 20 01 4c 8d 90 d0 02 00 00 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 89 c2 85 c0 75 2c 64 48 8b 04 25 10 00 00 [ 155.902718] RSP: 002b:00007ffc41150428 EFLAGS: 00000246 ORIG_RAX: 0000000000000038 [ 155.902721] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004697c3 [ 155.902722] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011 [ 155.902724] RBP: 0000000000000000 R08: 0000000000000000 R09: 000000003fccf990 [ 155.902725] R10: 000000003fccd690 R11: 0000000000000246 R12: 0000000000000001 [ 155.902726] R13: 000000003fce8103 R14: 0000000000000001 R15: 0000000000000000 [ 155.902733] </TASK> [ 155.902747] The tracepoint callbacks recently were changed to allow preemption. The event PID filtering callbacks that were attached to the fork and exit tracepoints expected preemption disabled in order to access the RCU protected PID lists. Add a guard(preempt)() to protect the references to the PID list. Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20260303215738.6ab275af@fedora Fixes: `a46023d561` ("tracing: Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast") Link: https://patch.msgid.link/20260303131706.96057f61a48a34c43ce1e396@kernel.org Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2026-03-03 22:25:32 -05:00
Steven Rostedt	cc337974cd	ftrace: Disable preemption in the tracepoint callbacks handling filtered pids When function trace PID filtering is enabled, the function tracer will attach a callback to the fork tracepoint as well as the exit tracepoint that will add the forked child PID to the PID filtering list as well as remove the PID that is exiting. Commit `a46023d561` ("tracing: Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast") removed the disabling of preemption when calling tracepoint callbacks. The callbacks used for the PID filtering accounting depended on preemption being disabled, and now the trigger a "suspicious RCU usage" warning message. Make them explicitly disable preemption. Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20260302213546.156e3e4f@gandalf.local.home Fixes: `a46023d561` ("tracing: Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>	2026-03-03 22:25:31 -05:00
Huiwen He	0a663b764d	tracing: Fix syscall events activation by ensuring refcount hits zero When multiple syscall events are specified in the kernel command line (e.g., trace_event=syscalls:sys_enter_openat,syscalls:sys_enter_close), they are often not captured after boot, even though they appear enabled in the tracing/set_event file. The issue stems from how syscall events are initialized. Syscall tracepoints require the global reference count (sys_tracepoint_refcount) to transition from 0 to 1 to trigger the registration of the syscall work (TIF_SYSCALL_TRACEPOINT) for tasks, including the init process (pid 1). The current implementation of early_enable_events() with disable_first=true used an interleaved sequence of "Disable A -> Enable A -> Disable B -> Enable B". If multiple syscalls are enabled, the refcount never drops to zero, preventing the 0->1 transition that triggers actual registration. Fix this by splitting early_enable_events() into two distinct phases: 1. Disable all events specified in the buffer. 2. Enable all events specified in the buffer. This ensures the refcount hits zero before re-enabling, allowing syscall events to be properly activated during early boot. The code is also refactored to use a helper function to avoid logic duplication between the disable and enable phases. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20260224023544.1250787-1-hehuiwen@kylinos.cn Fixes: `ce1039bd3a` ("tracing: Fix enabling of syscall events on the command line") Signed-off-by: Huiwen He <hehuiwen@kylinos.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2026-03-03 22:15:02 -05:00
Shengming Hu	b96d0c59cd	fgraph: Fix thresh_return nosleeptime double-adjust trace_graph_thresh_return() called handle_nosleeptime() and then delegated to trace_graph_return(), which calls handle_nosleeptime() again. When sleep-time accounting is disabled this double-adjusts calltime and can produce bogus durations (including underflow). Fix this by computing rettime once, applying handle_nosleeptime() only once, using the adjusted calltime for threshold comparison, and writing the return event directly via __trace_graph_return() when the threshold is met. Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260221113314048jE4VRwIyZEALiYByGK0My@zte.com.cn Fixes: `3c9880f3ab` ("ftrace: Use a running sleeptime instead of saving on shadow stack") Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Shengming Hu <hu.shengming@zte.com.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2026-03-03 22:11:20 -05:00
Shengming Hu	6ca8379b5d	fgraph: Fix thresh_return clear per-task notrace When tracing_thresh is enabled, function graph tracing uses trace_graph_thresh_return() as the return handler. Unlike trace_graph_return(), it did not clear the per-task TRACE_GRAPH_NOTRACE flag set by the entry handler for set_graph_notrace addresses. This could leave the task permanently in "notrace" state and effectively disable function graph tracing for that task. Mirror trace_graph_return()'s per-task notrace handling by clearing TRACE_GRAPH_NOTRACE and returning early when set. Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260221113007819YgrZsMGABff4Rc-O_fZxL@zte.com.cn Fixes: `b84214890a` ("function_graph: Move graph notrace bit to shadow stack global var") Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Shengming Hu <hu.shengming@zte.com.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>	2026-03-03 22:10:37 -05:00
Eric Biggers	26bc83b88b	smb: client: Compare MACs in constant time To prevent timing attacks, MAC comparisons need to be constant-time. Replace the memcmp() with the correct function, crypto_memneq(). Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2026-03-03 20:56:36 -06:00
J. Neuschäfer	9e7dc228bb	io_uring/mock: Fix typo in help text Fix the spelling of "subsystem". Signed-off-by: J. Neuschäfer <j.ne@posteo.net> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-03-03 19:42:53 -07:00
Eric Biggers	46d0d6f50d	net/tcp-md5: Fix MAC comparison to be constant-time To prevent timing attacks, MACs need to be compared in constant time. Use the appropriate helper function for this. Fixes: `cfb6eeb4c8` ("[TCP]: MD5 Signature Option (RFC2385) support.") Fixes: `658ddaaf66` ("tcp: md5: RST: getting md5 key from listener") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org> Link: https://patch.msgid.link/20260302203409.13388-1-ebiggers@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 18:39:43 -08:00
Stephen Hemminger	7f5d8e63f3	MAINTAINERS: update the skge/sky2 maintainers Mark the skge and sky2 drivers as orphan. I no longer have any Marvell/SysKonnect boards to test with and mail to Mirko Lindner bounced because Marvell sold off that divsion. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Link: https://patch.msgid.link/20260302195120.187183-1-stephen@networkplumber.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 18:39:27 -08:00
Raju Rangoju	e2f27363aa	amd-xgbe: fix sleep while atomic on suspend/resume The xgbe_powerdown() and xgbe_powerup() functions use spinlocks (spin_lock_irqsave) while calling functions that may sleep: - napi_disable() can sleep waiting for NAPI polling to complete - flush_workqueue() can sleep waiting for pending work items This causes a "BUG: scheduling while atomic" error during suspend/resume cycles on systems using the AMD XGBE Ethernet controller. The spinlock protection in these functions is unnecessary as these functions are called from suspend/resume paths which are already serialized by the PM core Fix this by removing the spinlock. Since only code that takes this lock is xgbe_powerdown() and xgbe_powerup(), remove it completely. Fixes: `c5aa9e3b81` ("amd-xgbe: Initial AMD 10GbE platform driver") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Link: https://patch.msgid.link/20260302042124.1386445-1-Raju.Rangoju@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:24:38 -08:00
Breno Leitao	5af6e8b549	netconsole: fix sysdata_release_enabled_show checking wrong flag sysdata_release_enabled_show() checks SYSDATA_TASKNAME instead of SYSDATA_RELEASE, causing the configfs release_enabled attribute to reflect the taskname feature state rather than the release feature state. This is a copy-paste error from the adjacent sysdata_taskname_enabled_show() function. The corresponding _store function already uses the correct SYSDATA_RELEASE flag. Fixes: `343f902270` ("netconsole: implement configfs for release_enabled") Signed-off-by: Breno Leitao <leitao@debian.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260302-sysdata_release_fix-v1-1-e5090f677c7c@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:23:50 -08:00
Yung Chih Su	4ee7fa6cf7	net: ipv4: fix ARM64 alignment fault in multipath hash seed `struct sysctl_fib_multipath_hash_seed` contains two u32 fields (user_seed and mp_seed), making it an 8-byte structure with a 4-byte alignment requirement. In `fib_multipath_hash_from_keys()`, the code evaluates the entire struct atomically via `READ_ONCE()`: mp_seed = READ_ONCE(net->ipv4.sysctl_fib_multipath_hash_seed).mp_seed; While this silently works on GCC by falling back to unaligned regular loads which the ARM64 kernel tolerates, it causes a fatal kernel panic when compiled with Clang and LTO enabled. Commit `e35123d83e` ("arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y") strengthens `READ_ONCE()` to use Load-Acquire instructions (`ldar` / `ldapr`) to prevent compiler reordering bugs under Clang LTO. Since the macro evaluates the full 8-byte struct, Clang emits a 64-bit `ldar` instruction. ARM64 architecture strictly requires `ldar` to be naturally aligned, thus executing it on a 4-byte aligned address triggers a strict Alignment Fault (FSC = 0x21). Fix the read side by moving the `READ_ONCE()` directly to the `u32` member, which emits a safe 32-bit `ldar Wn`. Furthermore, Eric Dumazet pointed out that `WRITE_ONCE()` on the entire struct in `proc_fib_multipath_hash_set_seed()` is also flawed. Analysis shows that Clang splits this 8-byte write into two separate 32-bit `str` instructions. While this avoids an alignment fault, it destroys atomicity and exposes a tear-write vulnerability. Fix this by explicitly splitting the write into two 32-bit `WRITE_ONCE()` operations. Finally, add the missing `READ_ONCE()` when reading `user_seed` in `proc_fib_multipath_hash_seed()` to ensure proper pairing and concurrency safety. Fixes: `4ee2a8cace` ("net: ipv4: Add a sysctl to set multipath hash seed") Signed-off-by: Yung Chih Su <yuuchihsu@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260302060247.7066-1-yuuchihsu@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:20:37 -08:00
Eric Biggers	67edfec516	net/tcp-ao: Fix MAC comparison to be constant-time To prevent timing attacks, MACs need to be compared in constant time. Use the appropriate helper function for this. Fixes: `0a3a809089` ("net/tcp: Verify inbound TCP-AO signed segments") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://patch.msgid.link/20260302203600.13561-1-ebiggers@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:16:54 -08:00
Jakub Kicinski	2ffb4f5c2c	ipv6: fix NULL pointer deref in ip6_rt_get_dev_rcu() l3mdev_master_dev_rcu() can return NULL when the slave device is being un-slaved from a VRF. All other callers deal with this, but we lost the fallback to loopback in ip6_rt_pcpu_alloc() -> ip6_rt_get_dev_rcu() with commit `4832c30d54` ("net: ipv6: put host and anycast routes on device with address"). KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f] RIP: 0010:ip6_rt_pcpu_alloc (net/ipv6/route.c:1418) Call Trace: ip6_pol_route (net/ipv6/route.c:2318) fib6_rule_lookup (net/ipv6/fib6_rules.c:115) ip6_route_output_flags (net/ipv6/route.c:2607) vrf_process_v6_outbound (drivers/net/vrf.c:437) I was tempted to rework the un-slaving code to clear the flag first and insert synchronize_rcu() before we remove the upper. But looks like the explicit fallback to loopback_dev is an established pattern. And I guess avoiding the synchronize_rcu() is nice, too. Fixes: `4832c30d54` ("net: ipv6: put host and anycast routes on device with address") Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20260301194548.927324-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-03-03 17:14:48 -08:00
ZhangGuoDong	8098179dc9	smb/client: remove unused SMB311_posix_query_info() It is currently unused, as now we are doing compounding instead (see smb2_query_path_info()). Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>	2026-03-03 18:03:56 -06:00
ZhangGuoDong	9621b996e4	smb/client: fix buffer size for smb311_posix_qinfo in SMB311_posix_query_info() SMB311_posix_query_info() is currently unused, but it may still be used in some stable versions, so these changes are submitted as a separate patch. Use `sizeof(struct smb311_posix_qinfo)` instead of sizeof its pointer, so the allocated buffer matches the actual struct size. Fixes: `b1bc1874b8` ("smb311: Add support for SMB311 query info (non-compounded)") Reported-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>	2026-03-03 18:03:56 -06:00

... 2 3 4 5 6 ...

1427843 commits