linux

mirror of https://github.com/torvalds/linux.git synced 2026-03-08 04:24:31 +01:00

Author	SHA1	Message	Date
Linus Torvalds	0c00ed308d	for-7.0/block-20260206 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmmGLwcQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpv+TD/48S2HTnMhmW6AtFYWErQ+sEKXpHrxbYe7S +qR8/g/T+QSfhfqPwZEuagndFKtIP3LJfaXGSP1Lk1RfP9NLQy91v33Ibe4DjHkp etWSfnMHA9MUAoWKmg8EvncB2G+ZQFiYCpjazj5tKHD9S2+psGMuL8kq6qzMJE83 uhpb8WutUl4aSIXbMSfyGlwBhI1MjjRbbWlIBmg4yC8BWt1sH8Qn2L2GNVylEIcX U8At3KLgPGn0axSg4yGMAwTqtGhL/jwdDyeczbmRlXuAr4iVL9UX/yADCYkazt6U ttQ2/H+cxCwfES84COx9EteAatlbZxo6wjGvZ3xOMiMJVTjYe1x6Gkcckq+LrZX6 tjofi2KK78qkrMXk1mZMkZjpyUWgRtCswhDllbQyqFs0SwzQtno2//Rk8HU9dhbt pkpryDbGFki9X3upcNyEYp5TYflpW6YhAzShYgmE6KXim2fV8SeFLviy0erKOAl+ fwjTE6KQ5QoQv0s3WxkWa4lREm34O6IHrCUmbiPm5CruJnQDhqAN2QZIDgYC4WAf 0gu9cR/O4Vxu7TQXrumPs5q+gCyDU0u0B8C3mG2s+rIo+PI5cVZKs2OIZ8HiPo0F x73kR/pX3DMe35ZQkQX22ymMuowV+aQouDLY9DTwakP5acdcg7h7GZKABk6VLB06 gUIsnxURiQ== =jNzW -----END PGP SIGNATURE----- Merge tag 'for-7.0/block-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - Support for batch request processing for ublk, improving the efficiency of the kernel/ublk server communication. This can yield nice 7-12% performance improvements - Support for integrity data for ublk - Various other ublk improvements and additions, including a ton of selftests additions and updated - Move the handling of blk-crypto software fallback from below the block layer to above it. This reduces the complexity of dealing with bio splitting - Series fixing a number of potential deadlocks in blk-mq related to the queue usage counter and writeback throttling and rq-qos debugfs handling - Add an async_depth queue attribute, to resolve a performance regression that's been around for a qhilw related to the scheduler depth handling - Only use task_work for IOPOLL completions on NVMe, if it is necessary to do so. An earlier fix for an issue resulted in all these completions being punted to task_work, to guarantee that completions were only run for a given io_uring ring when it was local to that ring. With the new changes, we can detect if it's necessary to use task_work or not, and avoid it if possible. - rnbd fixes: - Fix refcount underflow in device unmap path - Handle PREFLUSH and NOUNMAP flags properly in protocol - Fix server-side bi_size for special IOs - Zero response buffer before use - Fix trace format for flags - Add .release to rnbd_dev_ktype - MD pull requests via Yu Kuai - Fix raid5_run() to return error when log_init() fails - Fix IO hang with degraded array with llbitmap - Fix percpu_ref not resurrected on suspend timeout in llbitmap - Fix GPF in write_page caused by resize race - Fix NULL pointer dereference in process_metadata_update - Fix hang when stopping arrays with metadata through dm-raid - Fix any_working flag handling in raid10_sync_request - Refactor sync/recovery code path, improve error handling for badblocks, and remove unused recovery_disabled field - Consolidate mddev boolean fields into mddev_flags - Use mempool to allocate stripe_request_ctx and make sure max_sectors is not less than io_opt in raid5 - Fix return value of mddev_trylock - Fix memory leak in raid1_run() - Add Li Nan as mdraid reviewer - Move phys_vec definitions to the kernel types, mostly in preparation for some VFIO and RDMA changes - Improve the speed for secure erase for some devices - Various little rust updates - Various other minor fixes, improvements, and cleanups * tag 'for-7.0/block-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits) blk-mq: ABI/sysfs-block: fix docs build warnings selftests: ublk: organize test directories by test ID block: decouple secure erase size limit from discard size limit block: remove redundant kill_bdev() call in set_blocksize() blk-mq: add documentation for new queue attribute async_dpeth block, bfq: convert to use request_queue->async_depth mq-deadline: covert to use request_queue->async_depth kyber: covert to use request_queue->async_depth blk-mq: add a new queue sysfs attribute async_depth blk-mq: factor out a helper blk_mq_limit_depth() blk-mq-sched: unify elevators checking for async requests block: convert nr_requests to unsigned int block: don't use strcpy to copy blockdev name blk-mq-debugfs: warn about possible deadlock blk-mq-debugfs: add missing debugfs_mutex in blk_mq_debugfs_register_hctxs() blk-mq-debugfs: remove blk_mq_debugfs_unregister_rqos() blk-mq-debugfs: make blk_mq_debugfs_register_rqos() static blk-rq-qos: fix possible debugfs_mutex deadlock blk-mq-debugfs: factor out a helper to register debugfs for all rq_qos blk-wbt: fix possible deadlock to nest pcpu_alloc_mutex under q_usage_counter ...	2026-02-09 17:57:21 -08:00
Linus Torvalds	591beb0e3a	io_uring-bpf-restrictions.4-20260206 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmmGJ1kQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpky8EAChIL3uJ5Vmv+oQTxT4EVb1wpc8U/XzXWU5 Q5F9IpZZCGO7+i015Y7iTTqDRixjblRaWpWzZZP8vflWDUS8LESNZLQdcoEnxaiv P367KNPUGwxejcKsu8PvZvfnX6JWSQoNstcDmrwkCF0ND2UUfvvMZyn3uKhkbBRY h5Ehcqkvqc1OJDAWC7+yPzYAmB01uRPQ6sc9/GeujznHPlfbvie4u6gBvvfXeirT 592zbVftINMrm6Twd6zl4n+HNAn+CUoyVMppeeddv5IcyFPm9uz/dLOZBXTz6552 jFYNmB0U4g+SxGXMyqp37YISTALnuY+57y5eXmEAtgkEeE3HrF+F/ZdxQHwXSpo3 T2Lb9IOqFyHtSvq678HZ37JB6aIYbBE/mZdNf8FFFpnPJGb5Ey7d50qPp/ywVq0H p9CahbpkzGUBMsZ+koew0YHiFdWV9tww+/Bnk5dTtn2197uyaHsLdmbf4C36GWke Bk5cwNgU+3DMFAfTiL9m+AIXYsJkBayRJn+hViTrF5AL7gcGiBryGF43FOSKoYuq f0mniDnGSwvn86VZPuZQ6wBRHZPEMR3OlaUXn6XrUU6cYyvMg0pBZV+QHF7zlsSP 2sdfUbPL5TxexF3G8dsxlDIypz9Z6TCoUCfU0WiiUETnCrVNkXfIY846A+w08p0b ejBjzrwRtQ== =CqJq -----END PGP SIGNATURE----- Merge tag 'io_uring-bpf-restrictions.4-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring bpf filters from Jens Axboe: "This adds support for both cBPF filters for io_uring, as well as task inherited restrictions and filters. seccomp and io_uring don't play along nicely, as most of the interesting data to filter on resides somewhat out-of-band, in the submission queue ring. As a result, things like containers and systemd that apply seccomp filters, can't filter io_uring operations. That leaves them with just one choice if filtering is critical - filter the actual io_uring_setup(2) system call to simply disallow io_uring. That's rather unfortunate, and has limited us because of it. io_uring already has some filtering support. It requires the ring to be setup in a disabled state, and then a filter set can be applied. This filter set is completely bi-modal - an opcode is either enabled or it's not. Once a filter set is registered, the ring can be enabled. This is very restrictive, and it's not useful at all to systemd or containers which really want both broader and more specific control. This first adds support for cBPF filters for opcodes, which enables tighter control over what exactly a specific opcode may do. As examples, specific support is added for IORING_OP_OPENAT/OPENAT2, allowing filtering on resolve flags. And another example is added for IORING_OP_SOCKET, allowing filtering on domain/type/protocol. These are both common use cases. cBPF was chosen rather than eBPF, because the latter is often restricted in containers as well. These filters are run post the init phase of the request, which allows filters to even dip into data that is being passed in struct in user memory, as the init side of requests make that data stable by bringing it into the kernel. This allows filtering without needing to copy this data twice, or have filters etc know about the exact layout of the user data. The filters get the already copied and sanitized data passed. On top of that support is added for per-task filters, meaning that any ring created with a task that has a per-task filter will get those filters applied when it's created. These filters are inherited across fork as well. Once a filter has been registered, any further added filters may only further restrict what operations are permitted. Filters cannot change the return value of an operation, they can only permit or deny it based on the contents" * tag 'io_uring-bpf-restrictions.4-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: io_uring: allow registration of per-task restrictions io_uring: add task fork hook io_uring/bpf_filter: add ref counts to struct io_bpf_filter io_uring/bpf_filter: cache lookup table in ctx->bpf_filters io_uring/bpf_filter: allow filtering on contents of struct open_how io_uring/net: allow filtering on IORING_OP_SOCKET data io_uring: add support for BPF filtering for opcode restrictions	2026-02-09 17:31:17 -08:00
Linus Torvalds	f5d4feed17	for-7.0/io_uring-20260206 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmmGJxsQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpk6+EACamMdw6WU4VVNjUtjT93FuXxor4ioyhowJ myRtKG3ZvYrE63Z8F1dCQE28RXi9n6MhGxabCq8WZVGkhTv27DuaBkDjU4T8oCnP EYhs5a3sdRXfKuIlqVbxuiFdmiPHEP0vh3/MviKx9Ju3/Po3OEWKBalNMevfGkS4 bRNp9IQkAYNSRhGma2ni9Rnc5welWmhpsxUKFdGtPRX53ZlYegiZxKlfKMB4/SQ+ 7XAWKhy9dOGVo4DpLof7mCX6hMeX+FoNkJzF6cTMO/IF//lCLjI9BN4SMiI6mmEN RY6PLJiFraoQx8wdr3J1LtBCNXzzj6cPk6PNHKtsodoafe2oYFNLNgfAa9pHDzfM 12kvy58au0cQG6TnS2eNlqM2GN116mJi+k00E+UW4iaXXtpqcdcBrLlS+Q5hJ78C 9MBLQofv7D06C6kbpxV2pVS1u4oxefjl19wWLqLKx/VytCHrsaTm50n1r0k7YLCc plvPkQRQobqpp2GtcaXcfmsi1Vfu4jzMBAN+rTN4/te0kudNqL9+hPvrejIMEURc 2AcktMAHC8wjpr93dFASXiWh/fdyhV4e2a/D/ML4PXxhnCfnGx5s5Tp/pGjePHEU dLZm9vadmr/Yrdgycf9gQ8mz9IxI9FNJCKbI7lf7+/KJXe7DwngOa6VHNblWBRHv YoX6bG1yQQ== =Q248 -----END PGP SIGNATURE----- Merge tag 'for-7.0/io_uring-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring updates from Jens Axboe: - Clean up the IORING_SETUP_R_DISABLED and submitter task checking, mostly just in preparation for relaxing the locking for SINGLE_ISSUER in the future. - Improve IOPOLL by using a doubly linked list to manage completions. Previously it was singly listed, which meant that to complete request N in the chain 0..N-1 had to have completed first. With a doubly linked list we can complete whatever request completes in that order, rather than need to wait for a consecutive range to be available. This reduces latencies. - Improve the restriction setup and checking. Mostly in preparation for adding further features on top of that. Coming in a separate pull request. - Split out task_work and wait handling into separate files. These are mostly nicely abstracted already, but still remained in the io_uring.c file which is on the larger side. - Use GFP_KERNEL_ACCOUNT in a few more spots, where appropriate. - Ensure even the idle io-wq worker exits if a task no longer has any rings open. - Add support for a non-circular submission queue. By default, the SQ ring keeps moving around, even if only a few entries are used for each submission. This can be wasteful in terms of cachelines. If IORING_SETUP_SQ_REWIND is set for the ring when created, each submission will start at offset 0 instead of where we last left off doing submissions. - Various little cleanups * tag 'for-7.0/io_uring-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (30 commits) io_uring/kbuf: fix memory leak if io_buffer_add_list fails io_uring: Add SPDX id lines to remaining source files io_uring: allow io-wq workers to exit when unused io_uring/io-wq: add exit-on-idle state io_uring/net: don't continue send bundle if poll was required for retry io_uring/rsrc: use GFP_KERNEL_ACCOUNT consistently io_uring/futex: use GFP_KERNEL_ACCOUNT for futex data allocation io_uring/io-wq: handle !sysctl_hung_task_timeout_secs io_uring: fix bad indentation for setup flags if statement io_uring/rsrc: take unsigned index in io_rsrc_node_lookup() io_uring: introduce non-circular SQ io_uring: split out CQ waiting code into wait.c io_uring: split out task work code into tw.c io_uring/io-wq: don't trigger hung task for syzbot craziness io_uring: add IO_URING_EXIT_WAIT_MAX definition io_uring/sync: validate passed in offset io_uring/eventfd: remove unused ctx->evfd_last_cq_tail member io_uring/timeout: annotate data race in io_flush_timeouts() io_uring/uring_cmd: explicitly disallow cancelations for IOPOLL io_uring: fix IOPOLL with passthrough I/O ...	2026-02-09 17:22:00 -08:00
Linus Torvalds	d10a88ce16	nilfs2 updates for v7.0 - nilfs2: fix missing struct keywords in nilfs2_api.h kernel-doc - nilfs2: convert nilfs_super_block to kernel-doc - nilfs2: Fix potential block overflow that cause system hang -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQT4wVoLCG92poNnMFAhI4xTh21NnQUCaYZw2gAKCRAhI4xTh21N nZuFAQD8vJk3OJKbxxroCyTGxDT3C/WL7PMC9Z1QNQ/yF0zBYwD+JSPN/dfQY6uF LracWCoFQcdaWRRWn1SA2k7L1OchFgQ= =WXsp -----END PGP SIGNATURE----- Merge tag 'nilfs2-v7.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/nilfs2 Pull nilfs2 updates from Viacheslav Dubeyko: - Fix potential block overflow that cause system hang When executing the FITRIM command, an underflow can occur in the calculation of nblocks. This ultimately leads to the block layer function __blkdev_issue_discard() taking an excessively long time to process the bio chain, and the ns_segctor_sem lock remains held for a long period. This prevents other tasks from acquiring the ns_segctor_sem lock, resulting in a hang reported by syzbot (Edward Adam Davis) - Fix missing struct keywords in nilfs2_api.h kernel-doc (Ryusuke Konishi) - Convert nilfs_super_block to kernel-doc Eliminate 40+ kernel-doc warnings in nilfs2_ondisk.h by converting all of the struct member comments to kernel-doc comments (Randy Dunlap) * tag 'nilfs2-v7.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/nilfs2: nilfs2: fix missing struct keywords in nilfs2_api.h kernel-doc nilfs2: convert nilfs_super_block to kernel-doc nilfs2: Fix potential block overflow that cause system hang	2026-02-09 15:55:41 -08:00
Linus Torvalds	8912c2fd58	for-6.20-tag -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmmDT6sACgkQxWXV+ddt WDteIBAAnQBKtHZOrefnA/SjbT4N+IV20x8sxVc3XI2MXw6RpjEN6k+0oGMLdvMy 5NBryJ43q5CwCV6iNkWQE4mT86gcPa6Bqv1nFOC5Q2BDkvbVBpOfOq7kC2+fQ7ay HF2Mr0PUHc0Y0MhkRSljO+T2QD4tDpWaxbEeVY+TxiAsepD1paK4fHV6Lwu2sk25 17RJQvm/2XRY32g9Sa6NZIc7mGuyIasMCBcTpDKDJW10hP61NNtK4wHgPLtMRtzx qzCAPSMS6QkeJZHcDa/Atg+iqpR5U8pdKAUSYJii3Kgcmjr5n1U1ZTp5WRLlXSS2 tHiR62a983ya022wKR1ApsdjN7ncE8iIeT/GrezZVcPtm9jTxaSzgd7dDNfSmr29 my4crJWvlEuD9Qt+/oz//eLAjkgEe2Q5RtaAworCAG00MzaGOEwNiXXP7DDMQApI VTxx9dvY0s/W3UF/IuJWTTN9q95KjvlmZ9ELAPxwwtyq+sAD41CvlYhJqCaLLec5 6xMotP5cy3Ur+yp+J7RCDprQ7x6YcU98PYIXQxf1/77f3Lz/7QA2TWafPzJ5V2Bk UtprVCrlqwCmSFrSISN6HzNf0UYY/ZI36WRoUj/ZJkGNfkQwvs9aBjb+lVYRb8T8 OcMlJrJvoUwIY//ef5K97ma8HOecodxszdEIafOgmnJtE9H3foI= =Ie8n -----END PGP SIGNATURE----- Merge tag 'for-6.20-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "User visible changes, feature updates: - when using block size > page size, enable direct IO - fallback to buffered IO if the data profile has duplication, workaround to avoid checksum mismatches on block group profiles with redundancy, real direct IO is possible on single or RAID0 - redo export of zoned statistics, moved from sysfs to /proc/pid/mountstats due to size limitations of the former Experimental features: - remove offload checksum tunable, intended to find best way to do it but since we've switched to offload to thread for everything we don't need it anymore - initial support for remap-tree feature, a translation layer of logical block addresses that allow changes without moving/rewriting blocks to do eg. relocation, or other changes that require COW Notable fixes: - automatic removal of accidentally leftover chunks when free-space-tree is enabled since mkfs.btrfs v6.16.1 - zoned mode: - do not try to append to conventional zones when RAID is mixing zoned and conventional drives - fixup write pointers when mixing zoned and conventional on DUP/RAID* profiles - when using squota, relax deletion rules for qgroups with 0 members to allow easier recovery from accounting bugs, also add more checks to detect bad accounting - fix periodic reclaim scanning, properly check boundary conditions not to trigger it unexpectedly or miss the time to run it - trim: - continue after first error - change reporting to the first detected error - add more cancellation points - reduce contention of big device lock that can block other operations when there's lots of trimmed space - when chunk allocation is forced (needs experimental build) fix transaction abort when unexpected space layout is detected Core: - switch to crypto library API for checksumming, removed module dependencies, pointer indirections, etc. - error handling improvements - adjust how and where transaction commit or abort are done and are maybe not necessary - minor compression optimization to skip single block ranges - improve how compression folios are handled - new and updated selftests - cleanups, refactoring: - auto-freeing and other automatic variable cleanup conversion - structure size optimizations - condition annotations" * tag 'for-6.20-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (137 commits) btrfs: get rid of compressed_bio::compressed_folios[] btrfs: get rid of compressed_folios[] usage for encoded writes btrfs: get rid of compressed_folios[] usage for compressed read btrfs: remove the old btrfs_compress_folios() infrastructure btrfs: switch to btrfs_compress_bio() interface for compressed writes btrfs: introduce btrfs_compress_bio() helper btrfs: zlib: introduce zlib_compress_bio() helper btrfs: zstd: introduce zstd_compress_bio() helper btrfs: lzo: introduce lzo_compress_bio() helper btrfs: zoned: factor out the zone loading part into a testable function btrfs: add cleanup function for btrfs_free_chunk_map btrfs: tests: add cleanup functions for test specific functions btrfs: raid56: fix memory leak of btrfs_raid_bio::stripe_uptodate_bitmap btrfs: tests: add unit tests for pending extent walking functions btrfs: fix EEXIST abort due to non-consecutive gaps in chunk allocation btrfs: fix transaction commit blocking during trim of unallocated space btrfs: handle user interrupt properly in btrfs_trim_fs() btrfs: preserve first error in btrfs_trim_fs() btrfs: continue trimming remaining devices on failure btrfs: do not BUG_ON() in btrfs_remove_block_group() ...	2026-02-09 15:45:21 -08:00
Linus Torvalds	157d3d6efd	vfs-7.0-rc1.namespace Please consider pulling these changes from the signed vfs-7.0-rc1.namespace tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49gAKCRCRxhvAZXjc ovzgAP9BpqMQhMy2VCurru8/T5VAd6eJdgXzEfXqMksL5BNm8gEAsLx666KJNKgm Sh/yVA2KBjf51gvcLZ4gHOISaMU8bAI= =RGLf -----END PGP SIGNATURE----- Merge tag 'vfs-7.0-rc1.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount updates from Christian Brauner: - statmount: accept fd as a parameter Extend struct mnt_id_req with a file descriptor field and a new STATMOUNT_BY_FD flag. When set, statmount() returns mount information for the mount the fd resides on — including detached mounts (unmounted via umount2(MNT_DETACH)). For detached mounts the STATMOUNT_MNT_POINT and STATMOUNT_MNT_NS_ID mask bits are cleared since neither is meaningful. The capability check is skipped for STATMOUNT_BY_FD since holding an fd already implies prior access to the mount and equivalent information is available through fstatfs() and /proc/pid/mountinfo without privilege. Includes comprehensive selftests covering both attached and detached mount cases. - fs: Remove internal old mount API code (1 patch) Now that every in-tree filesystem has been converted to the new mount API, remove all the legacy shim code in fs_context.c that handled unconverted filesystems. This deletes ~280 lines including legacy_init_fs_context(), the legacy_fs_context struct, and associated wrappers. The mount(2) syscall path for userspace remains untouched. Documentation references to the legacy callbacks are cleaned up. - mount: add OPEN_TREE_NAMESPACE to open_tree() Container runtimes currently use CLONE_NEWNS to copy the caller's entire mount namespace — only to then pivot_root() and recursively unmount everything they just copied. With large mount tables and thousands of parallel container launches this creates significant contention on the namespace semaphore. OPEN_TREE_NAMESPACE copies only the specified mount tree (like OPEN_TREE_CLONE) but returns a mount namespace fd instead of a detached mount fd. The new namespace contains the copied tree mounted on top of a clone of the real rootfs. This functions as a combined unshare(CLONE_NEWNS) + pivot_root() in a single syscall. Works with user namespaces: an unshare(CLONE_NEWUSER) followed by OPEN_TREE_NAMESPACE creates a mount namespace owned by the new user namespace. Mount namespace file mounts are excluded from the copy to prevent cycles. Includes ~1000 lines of selftests" * tag 'vfs-7.0-rc1.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: selftests/open_tree: add OPEN_TREE_NAMESPACE tests mount: add OPEN_TREE_NAMESPACE fs: Remove internal old mount API code selftests: statmount: tests for STATMOUNT_BY_FD statmount: accept fd as a parameter statmount: permission check should return EPERM	2026-02-09 14:43:47 -08:00
Linus Torvalds	c84bb79f70	vfs-7.0-rc1.nullfs Please consider pulling these changes from the signed vfs-7.0-rc1.nullfs tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49gAKCRCRxhvAZXjc olG7AQD9TywOR0HC9PMT8jrhC1TKODnZ4H1aLNlYVltzfJ09xwEAwFSGO4rQmGAF aZdD0RQw4bkf7IC1PIZHEGUqmVXJCQ8= =NvyI -----END PGP SIGNATURE----- Merge tag 'vfs-7.0-rc1.nullfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs nullfs update from Christian Brauner: "Add a completely catatonic minimal pseudo filesystem called "nullfs" and make pivot_root() work in the initramfs. Currently pivot_root() does not work on the real rootfs because it cannot be unmounted. Userspace has to recursively delete initramfs contents manually before continuing boot, using the fragile switch_root sequence (overmount + chroot). Add nullfs, a minimal immutable filesystem that serves as the true root of the mount hierarchy. The mutable rootfs (tmpfs/ramfs) is mounted on top of it. This allows userspace to simply: chdir(new_root); pivot_root(".", "."); umount2(".", MNT_DETACH); without the traditional switch_root workarounds. systemd already handles this correctly. It tries pivot_root() first and falls back to MS_MOVE only when that fails. This also means rootfs mounts in unprivileged namespaces no longer need MNT_LOCKED, since the immutable nullfs guarantees nothing can be revealed by unmounting the covering mount. nullfs is a single-instance filesystem (get_tree_single()) marked SB_NOUSER \| SB_I_NOEXEC \| SB_I_NODEV with an immutable empty root directory. This means sooner or later it can be used to overmount other directories to hide their contents without any additional protection needed. We enable it unconditionally. If we see any real regression we'll hide it behind a boot option. nullfs has extensions beyond this in the future. It will serve as a concept to support the creation of completely empty mount namespaces - which is work coming up in the next cycle" * tag 'vfs-7.0-rc1.nullfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: use nullfs unconditionally as the real rootfs docs: mention nullfs fs: add immutable rootfs fs: add init_pivot_root() fs: ensure that internal tmpfs mount gets mount id zero	2026-02-09 13:41:34 -08:00
Linus Torvalds	dd466ea002	vfs-7.0-rc1.fserror Please consider pulling these changes from the signed vfs-7.0-rc1.fserror tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49gAKCRCRxhvAZXjc orUJAP9taSsjaB9zD9gU/rs8RfaPjhDXbVuPkBiDFARvGPSegwD/ZxTygHYsYarv 7JtAuKI/njOcfhl+fvHSHT1BgcO+nQ8= =nUTi -----END PGP SIGNATURE----- Merge tag 'vfs-7.0-rc1.fserror' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs error reporting updates from Christian Brauner: "This contains the changes to support generic I/O error reporting. Filesystems currently have no standard mechanism for reporting metadata corruption and file I/O errors to userspace via fsnotify. Each filesystem (xfs, ext4, erofs, f2fs, etc.) privately defines EFSCORRUPTED, and error reporting to fanotify is inconsistent or absent entirely. This introduces a generic fserror infrastructure built around struct super_block that gives filesystems a standard way to queue metadata and file I/O error reports for delivery to fsnotify. Errors are queued via mempools and queue_work to avoid holding filesystem locks in the notification path; unmount waits for pending events to drain. A new super_operations::report_error callback lets filesystem drivers respond to file I/O errors themselves (to be used by an upcoming XFS self-healing patchset). On the uapi side, EFSCORRUPTED and EUCLEAN are promoted from private per-filesystem definitions to canonical errno.h values across all architectures" * tag 'vfs-7.0-rc1.fserror' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: ext4: convert to new fserror helpers xfs: translate fsdax media errors into file "data lost" errors when convenient xfs: report fs metadata errors via fsnotify iomap: report file I/O errors to the VFS fs: report filesystem and file I/O errors to fsnotify uapi: promote EFSCORRUPTED and EUCLEAN to errno.h	2026-02-09 12:21:37 -08:00
Linus Torvalds	996812c453	vfs-7.0-rc1.initrd Please consider pulling these changes from the signed vfs-7.0-rc1.initrd tag. Thanks! Christian -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCaYX49gAKCRCRxhvAZXjc ordBAQD4d6Y5Zvr852s9deMTDv+ng3bFk1YHqybGe3wEuATstwD/QcvAoNFW9Nn5 n7/268Nk6jTEygT7Fm3tn42SnwOxhgU= =T203 -----END PGP SIGNATURE----- Merge tag 'vfs-7.0-rc1.initrd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs initrd removal from Christian Brauner: "Remove the deprecated linuxrc-based initrd code path and related dead code. The linuxrc initrd path was deprecated in 2020 and this series completes its removal. If we see real-life regressions we'll revert. The core change removes handle_initrd() and init_linuxrc() — the entire flow that ran /linuxrc from an initrd, pivoted roots, and handed off to the real root filesystem. With that gone, initrd_load() becomes void (no longer short-circuits prepare_namespace()), rd_load_image() is simplified to always load /initrd.image instead of taking a path, and rd_load_disk() is deleted. The /proc/sys/kernel/real-root-dev sysctl and its backing variable are removed since they only existed for linuxrc to communicate the real root device back to the kernel. The no-op load_ramdisk= and prompt_ramdisk= parameters are dropped, and noinitrd and ramdisk_start= gain deprecation warnings. Initramfs is entirely unaffected. The non-linuxrc initrd path (root=/dev/ram0) is preserved but now carries a deprecation warning targeting January 2027 removal" * tag 'vfs-7.0-rc1.initrd' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: init: remove /proc/sys/kernel/real-root-dev initrd: remove deprecated code path (linuxrc) init: remove deprecated "load_ramdisk" and "prompt_ramdisk" command line parameters	2026-02-09 11:03:25 -08:00
Paolo Bonzini	4215ee0d7b	KVM SVM changes for 6.20 - Drop a user-triggerable WARN on nested_svm_load_cr3() failure. - Add support for virtualizing ERAPS. Note, correct virtualization of ERAPS relies on an upcoming, publicly announced change in the APM to reduce the set of conditions where hardware (i.e. KVM) must flush the RAP. - Ignore nSVM intercepts for instructions that are not supported according to L1's virtual CPU model. - Add support for expedited writes to the fast MMIO bus, a la VMX's fastpath for EPT Misconfig. - Don't set GIF when clearing EFER.SVME, as GIF exists independently of SVM, and allow userspace to restore nested state with GIF=0. - Treat exit_code as an unsigned 64-bit value through all of KVM. - Add support for fetching SNP certificates from userspace. - Fix a bug where KVM would use vmcb02 instead of vmcb01 when emulating VMLOAD or VMSAVE on behalf of L2. - Misc fixes and cleanups. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmmGsbEACgkQOlYIJqCj N/18Iw//U9ZiNSW8k9CGRnXN/hmc8h21cNlTdGliqY3lkf0y7feCb1sEdkCFv6/U KXlOhGUD8PiVlcJWm3ZWWMq/bJ5Ahcvyvre8RelRMQ5SRw07IojYSI1IkNHpSUBX brEd8DBG24oaw2El+rkl6mN9fneNUAq4pZtU9QDA/ehKDxpdsym2OAUStAVjXy0R YtIhsz0k1qX+EN/UIrvBTS6bCG3Ihd6btHgCehqGAOnY2rk5gNR0zChdKV3mdk2t hsbpKp8rtZppZ9Ltru/ly4TYzaKT/dl9gWt7h1y78fN7XD5orenAe8MOkav3WoPI zdDkDMzvwjv0p+bGPJKszxJrb4SBagtadvFMmKR+WZ0aYhysdAhxlpt64krqFrSV wjfNfPQ1Z2qHb9PV4TfuBr4g+OyYZfnBcEvyJswrVHOBTfCoMn4hx4tF0bbSZdLd nmOVqcXiPPpnOza2EXtYc97PSiHwl/CVlhXguYRPg/FQFnJKHHYoL9aRH4YpyZiK o/7Bsqe20ouuMoRdVIt+zp8FvhOsuiHV122e6d55+bvNhUGBC4sXNDEKQlmQps4K yvBUIGWLSx3Por/Iey7Rp+7hCXACf9KXaD1ogG2ZxL7xDE0smj9Jzu2NIzFJWUQ6 uubKwsZBJJDhYAZuDLUFmzoGydntb/Wi/FxetPp7Fzi7D4dnSUI= =RH/c -----END PGP SIGNATURE----- Merge tag 'kvm-x86-svm-6.20' of https://github.com/kvm-x86/linux into HEAD KVM SVM changes for 6.20 - Drop a user-triggerable WARN on nested_svm_load_cr3() failure. - Add support for virtualizing ERAPS. Note, correct virtualization of ERAPS relies on an upcoming, publicly announced change in the APM to reduce the set of conditions where hardware (i.e. KVM) must flush the RAP. - Ignore nSVM intercepts for instructions that are not supported according to L1's virtual CPU model. - Add support for expedited writes to the fast MMIO bus, a la VMX's fastpath for EPT Misconfig. - Don't set GIF when clearing EFER.SVME, as GIF exists independently of SVM, and allow userspace to restore nested state with GIF=0. - Treat exit_code as an unsigned 64-bit value through all of KVM. - Add support for fetching SNP certificates from userspace. - Fix a bug where KVM would use vmcb02 instead of vmcb01 when emulating VMLOAD or VMSAVE on behalf of L2. - Misc fixes and cleanups.	2026-02-09 18:51:37 +01:00
Arnd Bergmann	ebcff9daca	vduse: avoid adding implicit padding The vduse_iova_range_v2 and vduse_iotlb_entry_v2 structures are both defined in a way that adds implicit padding and is incompatible between i386 and x86_64 userspace because of the different structure alignment requirements. Building the header with -Wpadded shows these new warnings: vduse.h:305:1: error: padding struct size to alignment boundary with 4 bytes [-Werror=padded] vduse.h:374:1: error: padding struct size to alignment boundary with 4 bytes [-Werror=padded] Change the amount of padding in these two structures to align them to 64 bit words and avoid those problems. Since the v1 vduse_iotlb_entry already has an inconsistent size, do not attempt to reuse the structure but rather list the members indiviudally, with a fixed amount of padding. Fixes: `079212f687` ("vduse: add vq group asid support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20260202224835.559538-1-arnd@kernel.org>	2026-02-09 12:21:32 -05:00
Paolo Bonzini	5490063269	KVM/arm64 updates for 7.0 - Add support for FEAT_IDST, allowing ID registers that are not implemented to be reported as a normal trap rather than as an UNDEF exception. - Add sanitisation of the VTCR_EL2 register, fixing a number of UXN/PXN/XN bugs in the process. - Full handling of RESx bits, instead of only RES0, and resulting in SCTLR_EL2 being added to the list of sanitised registers. - More pKVM fixes for features that are not supposed to be exposed to guests. - Make sure that MTE being disabled on the pKVM host doesn't give it the ability to attack the hypervisor. - Allow pKVM's host stage-2 mappings to use the Force Write Back version of the memory attributes by using the "pass-through' encoding. - Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the guest. - Preliminary work for guest GICv5 support. - A bunch of debugfs fixes, removing pointless custom iterators stored in guest data structures. - A small set of FPSIMD cleanups. - Selftest fixes addressing the incorrect alignment of page allocation. - Other assorted low-impact fixes and spelling fixes. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmmGBEkACgkQI9DQutE9 ekPxQQ//VOzle+RVmgzVSJzpNcoW576QGI7+pZLEMIywXTx6rH+uz2FCaZvvgV7M LrJ+1Qps9ea5Yti9OplNJmQwy1yAHIurZnpnAoMR+EJ5PUeq8p1EAypySpHtmT/d KngZsbCvSMydNdfJFwGaz3NFSYj05FlTmWNN+Ndq0JFqyMJQMgY2qKDVmg3pWKcv TLKTNRo9fJFUVhhBIyIoMl2hE36M6Ac3Qd4dUb5J+Fn834QDXgOzVzUjBtkmbSHD kJ4gbSs2Ic6QsYWtt70RlyRdreBYegA4C3z1cZV6DDQYxp5Jz2oqXYYC31Ro520A swuI5y9HMct4mOxqPUqf1lhbvsmkjuZ5Iog6P7W+mOtYHXZIzY8F61sv9YAis9/5 XNOHkg9Cn/n8C2RRQ8vnq0FEI1g7se1UGbe/1NkD4xeR/bzhE/AZSoOrRE7G/XJx qbF9FkPzd4OXYB2Pdm37G1BWsfN4M1bY1rOmmCyMKym793+b/jM7xdoZY1QfbabP uKiavuK8RYgqxrEilhP0asvafKjpZaJbn2R3jwHZgQDWe7WH5FhXwX2UcUpQsTan XZd+/cWaYXjLsKJbiAzy3UArgnzSrHPSpwIOkYq8Lf8EvPgS2g3LLJYbw250Cf1G 74stwoK4PgZ3e6k0nkMk43x1swKb13Gp0vCZjVdnIec9EQgOHfI= =X8iC -----END PGP SIGNATURE----- Merge tag 'kvmarm-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 7.0 - Add support for FEAT_IDST, allowing ID registers that are not implemented to be reported as a normal trap rather than as an UNDEF exception. - Add sanitisation of the VTCR_EL2 register, fixing a number of UXN/PXN/XN bugs in the process. - Full handling of RESx bits, instead of only RES0, and resulting in SCTLR_EL2 being added to the list of sanitised registers. - More pKVM fixes for features that are not supposed to be exposed to guests. - Make sure that MTE being disabled on the pKVM host doesn't give it the ability to attack the hypervisor. - Allow pKVM's host stage-2 mappings to use the Force Write Back version of the memory attributes by using the "pass-through' encoding. - Fix trapping of ICC_DIR_EL1 on GICv5 hosts emulating GICv3 for the guest. - Preliminary work for guest GICv5 support. - A bunch of debugfs fixes, removing pointless custom iterators stored in guest data structures. - A small set of FPSIMD cleanups. - Selftest fixes addressing the incorrect alignment of page allocation. - Other assorted low-impact fixes and spelling fixes.	2026-02-09 18:18:19 +01:00
Yishai Hadas	0ac6f4056c	RDMA/uverbs: Add DMABUF object type and operations Expose DMABUF functionality to userspace through the uverbs interface, enabling InfiniBand/RDMA devices to export PCI based memory regions (e.g. device memory) as DMABUF file descriptors. This allows zero-copy sharing of RDMA memory with other subsystems that support the dma-buf framework. A new UVERBS_OBJECT_DMABUF object type and allocation method were introduced. During allocation, uverbs invokes the driver to supply the rdma_user_mmap_entry associated with the given page offset (pgoff). Based on the returned rdma_user_mmap_entry, uverbs requests the driver to provide the corresponding physical-memory details as well as the driver’s PCI provider information. Using this information, dma_buf_export() is called; if it succeeds, uobj->object is set to the underlying file pointer returned by the dma-buf framework. The file descriptor number follows the standard uverbs allocation flow, but the file pointer comes from the dma-buf subsystem, including its own fops and private data. When an mmap entry is removed, uverbs iterates over its associated DMABUFs, marks them as revoked, and calls dma_buf_move_notify() so that their importers are notified. The same procedure applies during the disassociate flow; final cleanup occurs when the application closes the file. Signed-off-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Edward Srouji <edwards@nvidia.com> Link: https://patch.msgid.link/20260201-dmabuf-export-v3-2-da238b614fe3@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2026-02-08 23:50:41 -05:00
Arnd Bergmann	90079798f1	delayacct: fix uapi timespec64 definition The custom definition of 'struct timespec64' is incompatible with both the kernel's internal definition and the glibc type, at least on big-endian targets that have the tv_nsec field in a different place, and the definition clashes with any userspace that also defines a timespec64 structure. Running the header check with -Wpadding enabled produces this output that warns about the incorrect padding: usr/include/linux/taskstats.h:25:1: error: padding struct size to alignment boundary with 4 bytes [-Werror=padded] Remove the hack and instead use the regular __kernel_timespec type that is meant to be used in uapi definitions. Link: https://lkml.kernel.org/r/20260202095906.1344100-1-arnd@kernel.org Fixes: 29b63f6eff0e ("delayacct: add timestamp of delay max") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Fan Yu <fan.yu9@zte.com.cn> Cc: Jonathan Corbet <corbet@lwn.net> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Jiang Kun <jiang.kun2@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-02-08 00:13:32 -08:00
Matthieu Baerts (NGI0)	136f1e168f	mptcp: fix kdoc warnings The following warnings were visible: $ ./scripts/kernel-doc -Wall -none \ net/mptcp/ include/net/mptcp.h include/uapi/linux/mptcp.h \ include/trace/events/mptcp.h Warning: net/mptcp/token.c:108 No description found for return value of 'mptcp_token_new_request' Warning: net/mptcp/token.c:151 No description found for return value of 'mptcp_token_new_connect' Warning: net/mptcp/token.c:246 No description found for return value of 'mptcp_token_get_sock' Warning: net/mptcp/token.c:298 No description found for return value of 'mptcp_token_iter_next' Warning: net/mptcp/protocol.c:4431 No description found for return value of 'mptcp_splice_read' Warning: include/uapi/linux/mptcp_pm.h:13 missing initial short description on line: enum mptcp_event_type Address all of them: either by using the 'Return:' keyword, or by adding a missing initial short description. The MPTCP CI will soon report issues with kdoc to avoid introducing new issues and being flagged by the Netdev CI. Reviewed-by: Geliang Tang <geliang@kernel.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260205-net-mptcp-misc-fixes-6-19-rc8-v2-3-c2720ce75c34@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-06 20:35:06 -08:00
Bjorn Helgaas	93c398be49	Merge branch 'pci/controller/dwc' - Extend PCI_FIND_NEXT_CAP() and PCI_FIND_NEXT_EXT_CAP() to return a pointer to the preceding Capability (Qiang Yu) - Add dw_pcie_remove_capability() and dw_pcie_remove_ext_capability() to remove Capabilities that are advertised but not fully implemented (Qiang Yu) - Remove MSI and MSI-X Capabilities for DWC controllers in platforms that can't support them, so we automatically fall back to INTx (Qiang Yu) - Remove MSI-X and DPC Capabilities for Qualcomm platforms that advertise but don't support them (Qiang Yu) - Remove duplicate dw_pcie_ep_hide_ext_capability() function and replace with dw_pcie_remove_ext_capability() (Qiang Yu) - Add ASPM L1.1 and L1.2 Substates context to debugfs ltssm_status for drivers that support this (Shawn Lin) - Skip PME_Turn_Off broadcast and L2/L3 transition during suspend if link is not up to avoid an unnecessary timeout (Manivannan Sadhasivam) - Revert dw-rockchip, qcom, and DWC core changes that used link-up IRQs to trigger enumeration instead of waiting for link to be up because the PCI core doesn't allocate bus number space for hierarchies that might be attached (Niklas Cassel) - Make endpoint iATU entry for MSI permanent instead of programming it dynamically, which is slow and racy with respect to other concurrent traffic, e.g., eDMA (Koichiro Den) - Use iMSI-RX MSI target address when possible to fix endpoints using 32-bit MSI (Shawn Lin) - Make dw_pcie_ltssm_status_string() available and use it for logging errors in dw_pcie_wait_for_link() (Manivannan Sadhasivam) - Return -ENODEV when dw_pcie_wait_for_link() finds no devices, -EIO for device present but inactive, -ETIMEDOUT for other failures, so callers can handle these cases differently (Manivannan Sadhasivam) - Allow DWC host controller driver probe to continue if device is not found or found but inactive; only fail when there's an error with the link (Manivannan Sadhasivam) - For controllers like NXP i.MX6QP and i.MX7D, where LTSSM registers are not accessible after PME_Turn_Off, simply wait 10ms instead of polling for L2/L3 Ready (Richard Zhu) - Use multiple iATU entries to map large bridge windows and DMA ranges when necessary instead of failing (Samuel Holland) - Rename struct dw_pcie_rp.has_msi_ctrl to .use_imsi_rx for clarity (Qiang Yu) - Add EPC dynamic_inbound_mapping feature bit for Endpoint Controllers that can update BAR inbound address translation without requiring EPF driver to clear/reset the BAR first, and advertise it for DWC-based Endpoints (Koichiro Den) - Add EPC subrange_mapping feature bit for Endpoint Controllers that can map multiple independent inbound regions in a single BAR, implement subrange mapping, advertise it for DWC-based Endpoints, and add Endpoint selftests for it (Koichiro Den) - Allow overriding default BAR sizes for pci-epf-test (Niklas Cassel) - Make resizable BARs work for Endpoint multi-PF configurations; previously it only worked for PF 0 (Aksh Garg) - Fix Endpoint non-PF 0 support for BAR configuration, ATU mappings, and Address Match Mode (Aksh Garg) - Fix issues with outbound iATU index assignment that caused iATU index to be out of bounds (Niklas Cassel) - Clean up iATU index tracking to be consistent (Niklas Cassel) - Set up iATU when ECAM is enabled; previously IO and MEM outbound windows weren't programmed, and ECAM-related iATU entries weren't restored after suspend/resume, so config accesses failed (Krishna Chaitanya Chundru) * pci/controller/dwc: PCI: dwc: Fix missing iATU setup when ECAM is enabled PCI: dwc: Clean up iATU index usage in dw_pcie_iatu_setup() PCI: dwc: Fix msg_atu_index assignment PCI: dwc: ep: Add comment explaining controller level PTM access in multi PF setup PCI: dwc: ep: Add per-PF BAR and inbound ATU mapping support PCI: dwc: ep: Fix resizable BAR support for multi-PF configurations PCI: endpoint: pci-epf-test: Allow overriding default BAR sizes selftests: pci_endpoint: Add BAR subrange mapping test case misc: pci_endpoint_test: Add BAR subrange mapping test case PCI: endpoint: pci-epf-test: Add BAR subrange mapping test support Documentation: PCI: endpoint: Clarify pci_epc_set_bar() usage PCI: dwc: ep: Support BAR subrange inbound mapping via Address Match Mode iATU PCI: dwc: Advertise dynamic inbound mapping support PCI: endpoint: Add BAR subrange mapping support PCI: endpoint: Add dynamic_inbound_mapping EPC feature PCI: dwc: Rename dw_pcie_rp::has_msi_ctrl to dw_pcie_rp::use_imsi_rx for clarity PCI: dwc: Fix grammar and formatting for comment in dw_pcie_remove_ext_capability() PCI: dwc: Use multiple iATU windows for mapping large bridge windows and DMA ranges PCI: dwc: Remove duplicate dw_pcie_ep_hide_ext_capability() function PCI: dwc: Skip waiting for L2/L3 Ready if dw_pcie_rp::skip_l23_wait is true PCI: dwc: Fail dw_pcie_host_init() if dw_pcie_wait_for_link() returns -ETIMEDOUT PCI: dwc: Rework the error print of dw_pcie_wait_for_link() PCI: dwc: Rename and move ltssm_status_string() to pcie-designware.c PCI: dwc: Return -EIO from dw_pcie_wait_for_link() if device is not active PCI: dwc: Return -ENODEV from dw_pcie_wait_for_link() if device is not found PCI: dwc: Use cfg0_base as iMSI-RX target address to support 32-bit MSI devices PCI: dwc: ep: Cache MSI outbound iATU mapping Revert "PCI: dwc: Don't wait for link up if driver can detect Link Up event" Revert "PCI: qcom: Enumerate endpoints based on Link up event in 'global_irq' interrupt" Revert "PCI: qcom: Enable MSI interrupts together with Link up if 'Global IRQ' is supported" Revert "PCI: qcom: Don't wait for link if we can detect Link Up" Revert "PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ" Revert "PCI: dw-rockchip: Don't wait for link since we can detect Link Up" PCI: dwc: Skip PME_Turn_Off broadcast and L2/L3 transition during suspend if link is not up PCI: dw-rockchip: Change get_ltssm() to provide L1 Substates info PCI: dwc: Add L1 Substates context to ltssm_status of debugfs PCI: qcom: Remove DPC Extended Capability PCI: qcom: Remove MSI-X Capability for Root Ports PCI: dwc: Remove MSI/MSIX capability for Root Port if iMSI-RX is used as MSI controller PCI: dwc: Add new APIs to remove standard and extended Capability PCI: Add preceding capability position support in PCI_FIND_NEXT_*_CAP macros	2026-02-06 17:09:34 -06:00
Bjorn Helgaas	401b356520	Merge branch 'pci/trace' - Add generic RAS tracepoint for hotplug events (Shuai Xue) - Add RAS tracepoint for link speed changes (Shuai Xue) * pci/trace: Documentation: tracing: Add PCI tracepoint documentation PCI: trace: Add RAS tracepoint to monitor link speed changes PCI: trace: Add generic RAS tracepoint for hotplug event	2026-02-06 17:09:26 -06:00
Matthieu Buffet	bbb6f53e90	landlock: Minor reword of docs for TCP access rights - Move ABI requirement next to each access right to prepare adding more access rights; - Mention the possibility to remove the random component of a socket's ephemeral port choice within the netns-wide ephemeral port range, since it allows choosing the "random" ephemeral port. Signed-off-by: Matthieu Buffet <matthieu@buffet.re> Link: https://lore.kernel.org/r/20251212163704.142301-2-matthieu@buffet.re Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-02-06 17:54:40 +01:00
Günther Noack	42fc7e6543	landlock: Multithreading support for landlock_restrict_self() Introduce the LANDLOCK_RESTRICT_SELF_TSYNC flag. With this flag, a given Landlock ruleset is applied to all threads of the calling process, instead of only the current one. Without this flag, multithreaded userspace programs currently resort to using the nptl(7)/libpsx hack for multithreaded policy enforcement, which is also used by libcap and for setuid(2). Using this userspace-based scheme, the threads of a process enforce the same Landlock policy, but the resulting Landlock domains are still separate. The domains being separate causes multiple problems: * When using Landlock's "scoped" access rights, the domain identity is used to determine whether an operation is permitted. As a result, when using LANLDOCK_SCOPE_SIGNAL, signaling between sibling threads stops working. This is a problem for programming languages and frameworks which are inherently multithreaded (e.g. Go). * In audit logging, the domains of separate threads in a process will get logged with different domain IDs, even when they are based on the same ruleset FD, which might confuse users. Cc: Andrew G. Morgan <morgan@kernel.org> Cc: John Johansen <john.johansen@canonical.com> Cc: Paul Moore <paul@paul-moore.com> Suggested-by: Jann Horn <jannh@google.com> Signed-off-by: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20251127115136.3064948-2-gnoack@google.com [mic: Fix restrict_self_flags test, clean up Makefile, allign comments, reduce local variable scope, add missing includes] Closes: https://github.com/landlock-lsm/linux/issues/2 Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-02-06 17:54:37 +01:00
Jens Axboe	ed82f35b92	io_uring: allow registration of per-task restrictions Currently io_uring supports restricting operations on a per-ring basis. To use those, the ring must be setup in a disabled state by setting IORING_SETUP_R_DISABLED. Then restrictions can be set for the ring, and the ring can then be enabled. This commit adds support for IORING_REGISTER_RESTRICTIONS with ring_fd == -1, like the other "blind" register opcodes which work on the task rather than a specific ring. This allows registration of the same kind of restrictions as can been done on a specific ring, but with the task itself. Once done, any ring created will inherit these restrictions. If a restriction filter is registered with a task, then it's inherited on fork for its children. Children may only further restrict operations, not extend them. Inheriting restrictions include both the classic IORING_REGISTER_RESTRICTIONS based restrictions, as well as the BPF filters that have been registered with the task via IORING_REGISTER_BPF_FILTER. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-02-06 07:29:19 -07:00
Arnd Bergmann	f7ab71f178	KVM: s390: Add explicit padding to struct kvm_s390_keyop The newly added structure causes a warning about implied padding: ./usr/include/linux/kvm.h:1247:1: error: padding struct size to alignment boundary with 6 bytes [-Werror=padded] The padding can lead to leaking kernel data and ABI incompatibilies when used on x86. Neither of these is a problem in this specific patch, but it's best to avoid it and use explicit padding fields in general. Fixes: `0ee4ddc164` ("KVM: s390: Storage key manipulation IOCTL") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>	2026-02-06 11:40:36 +01:00
Joerg Roedel	ad09563660	Merge branches 'fixes', 'arm/smmu/updates', 'intel/vt-d', 'amd/amd-vi' and 'core' into next	2026-02-06 11:10:40 +01:00
Jakub Kicinski	333225e1e9	Some more changes, including pulls from drivers: - ath drivers: small features/cleanups - rtw drivers: mostly refactoring for rtw89 RTL8922DE support - mac80211: use hrtimers for CAC to avoid too long delays - cfg80211/mac80211: some initial UHR (Wi-Fi 8) support -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmmDNuEACgkQ10qiO8sP aADhvA/+J35p2CDkffi1KfZxxx1YdHAAlj1zjhjLzshCMCG3oWzLpOL7se5bgN/C axPLPbeCAXtsRXln083lbwtrSRexPHSVhelPDNtybLPEocQYrksV8a6V3eWXCNTR ymN4iDaO/K0gLkDRKH5T8lwZvJttA6iHi+Fm4ir+dsr0O5vwwe4CuAEPA1SuZ2rh 0lQMz6pEzsxq+sZX3p8SoBwXx147l0n6gwMNIgBTKo1tjZha4oaavdvcqq4zaZWV WCcg4YVA/dWHL0UuwtIF8uQADM43quegBBUFx63QgzfgcnHAnBk2Ckeein/bfvnv XOKlI4UJi1cxTkTJkDOrSn5IwBzVSlBXE3qEUKKnu5G3+ZgfdsnWmSPeTtOndvAE rgbwwZb2SKH1kCvL0FDZTwq/iR9KF60ZfhWIq9Sz7m6VZxJoR8QACHglYCysj2JB B1+oT53EIqP7Ob4s/GN2Yg9M0l4Lv3E6J9g6h3b8yeq9qEXVF8MaVN683rtNpec9 mUqLRlcoToB2W/qvEVESKj8jMvajYZ6TDoO7mSP3paTW3HgMC3wlPJlDc4Q/6h7e LAKEljXlv6ofNGCcCL37l6KATqSZpIZn+tpSqbELIirWlc/rnTIDU2qZRb7MA1e1 3lKdrS6pOXGS1GJr7HWuLb4cX1SukyXNeyIcZJlSFoxG4oDPvwI= =/NUu -----END PGP SIGNATURE----- Merge tag 'wireless-next-2026-02-04' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Johannes Berg says: ==================== Some more changes, including pulls from drivers: - ath drivers: small features/cleanups - rtw drivers: mostly refactoring for rtw89 RTL8922DE support - mac80211: use hrtimers for CAC to avoid too long delays - cfg80211/mac80211: some initial UHR (Wi-Fi 8) support * tag 'wireless-next-2026-02-04' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (59 commits) wifi: brcmsmac: phy: Remove unreachable error handling code wifi: mac80211: Add eMLSR/eMLMR action frame parsing support wifi: mac80211: add initial UHR support wifi: cfg80211: add initial UHR support wifi: ieee80211: add some initial UHR definitions wifi: mac80211: use wiphy_hrtimer_work for CAC timeout wifi: mac80211: correct ieee80211-{s1g/eht}.h include guard comments wifi: ath12k: clear stale link mapping of ahvif->links_map wifi: ath12k: Add support TX hardware queue stats wifi: ath12k: Add support RX PDEV stats wifi: ath12k: Fix index decrement when array_len is zero wifi: ath12k: support OBSS PD configuration for AP mode wifi: ath12k: add WMI support for spatial reuse parameter configuration dt-bindings: net: wireless: ath11k-pci: deprecate 'firmware-name' property wifi: ath11k: add usecase firmware handling based on device compatible wifi: ath10k: sdio: add missing lock protection in ath10k_sdio_fw_crashed_dump() wifi: ath10k: fix lock protection in ath10k_wmi_event_peer_sta_ps_state_chg() wifi: ath10k: snoc: support powering on the device via pwrseq wifi: rtw89: pci: warn if SPS OCP happens for RTL8922DE wifi: rtw89: pci: restore LDO setting after device resume ... ==================== Link: https://patch.msgid.link/20260204121143.181112-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-04 20:31:05 -08:00
Claudio Imbrenda	0ee4ddc164	KVM: s390: Storage key manipulation IOCTL Add a new IOCTL to allow userspace to manipulate storage keys directly. This will make it easier to write selftests related to storage keys. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>	2026-02-04 17:00:10 +01:00
Ranjani Sridharan	15a55ec2f8	ASoC: SOF: ipc4-topology: Add new tokens for pipeline direction Parse the pipeline direction from topology. The direction_valid token is required for backward-compatibility with older topologies that may not have the direction set for pipelines. This will be used when setting up pipelines to check if a pipeline is in the same direction as the requested params and skip those in the opposite direction like in the case of echo reference capture pipelines during playback. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://patch.msgid.link/20260204081833.16630-6-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2026-02-04 13:26:05 +00:00
Ranjani Sridharan	af0bc3ac9a	uapi: sound: sof: tokens: Add missing token for KCPS Align with the firmware and add the missing token for pipeline kcps. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://patch.msgid.link/20260204081833.16630-4-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>	2026-02-04 13:26:03 +00:00
Chia-Yu Chang	4fa4ac5e58	tcp: accecn: add tcpi_ecn_mode and tcpi_option2 in tcp_info Add 2-bit tcpi_ecn_mode feild within tcp_info to indicate which ECN mode is negotiated: ECN_MODE_DISABLED, ECN_MODE_RFC3168, ECN_MODE_ACCECN, or ECN_MODE_PENDING. This is done by utilizing available bits from tcpi_accecn_opt_seen (reduced from 16 bits to 2 bits) and tcpi_accecn_fail_mode (reduced from 16 bits to 4 bits). Also, an extra 24-bit tcpi_options2 field is identified to represent newer options and connection features, as all 8 bits of tcpi_options field have been used. Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Co-developed-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260131222515.8485-14-chia-yu.chang@nokia-bell-labs.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-02-03 15:13:25 +01:00
Peter Zijlstra	3e4067169c	Linux 6.19-rc8 -----BEGIN PGP SIGNATURE----- iQFSBAABCgA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAml/zSkeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG+bwIAJ0jbbeKDyeJJxPo 8PgScnPJx9vBL3hGpphZrhbV3GOe9bDhKM/0Xk9qMDbpAm9C6qiBMTiDWyvWv5Qi qzDlZfoymMaDLPMxw9WHjJ++i1Z2StNdrz57Vze98C3/iG6gBcKnUEUzvF9nigri HIoxoOKlbSXLPUIzt49xE7YX+CRJhLF/kXmfoauZn5ghpv+uqSpWvRbUQJa3dmc0 S4Ie/nbPtdVHmy1Fz9LJFDOzsdhGyjzHF4kc4shDkjAs8RAr8fJh74mQHO5a3MWA 3WZ7GAAAc4XXNqj76X2dnVlMWpQNJ4p2e+OalsuXGA6VQ7OgbrJGMX8P6dMFn5AF 8hFsXn4= =IdZ1 -----END PGP SIGNATURE----- Merge branch 'v6.19-rc8' Update to avoid conflicts with /urgent patches. Signed-off-by: Peter Zijlstra <peterz@infradead.org>	2026-02-03 12:04:13 +01:00
Mark Harmstone	8620da16fb	btrfs: allow mounting filesystems with remap-tree incompat flag If we encounter a filesystem with the remap-tree incompat flag set, validate its compatibility with the other flags, and load the remap tree using the values that have been added to the superblock. The remap-tree feature depends on the free-space-tree, but no-holes and block-group-tree have been made dependencies to reduce the testing matrix. Similarly I'm not aware of any reason why mixed-bg and zoned would be incompatible with remap-tree, but this is blocked for the time being until it can be fully tested. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Mark Harmstone <mark@harmstone.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-03 07:54:35 +01:00
Mark Harmstone	7977011460	btrfs: add extended version of struct block_group_item Add a struct btrfs_block_group_item_v2, which is used in the block group tree if the remap-tree incompat flag is set. This adds two new fields to the block group item: `remap_bytes` and `identity_remap_count`. `remap_bytes` records the amount of data that's physically within this block group, but nominally in another, remapped block group. This is necessary because this data will need to be moved first if this block group is itself relocated. If `remap_bytes` > 0, this is an indicator to the relocation thread that it will need to search the remap-tree for backrefs. A block group must also have `remap_bytes` == 0 before it can be dropped. `identity_remap_count` records how many identity remap items are located in the remap tree for this block group. When relocation is begun for this block group, this is set to the number of holes in the free-space tree for this range. As identity remaps are converted into actual remaps by the relocation process, this number is decreased. Once it reaches 0, either because of relocation or because extents have been deleted, the block group has been fully remapped and its chunk's device extents are removed. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Mark Harmstone <mark@harmstone.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-03 07:54:34 +01:00
Mark Harmstone	0b4d29fa98	btrfs: add METADATA_REMAP chunk type Add a new METADATA_REMAP chunk type, which is a metadata chunk that holds the remap tree. This is needed for bootstrapping purposes: the remap tree can't itself be remapped, and must be relocated the existing way, by COWing every leaf. The remap tree can't go in the SYSTEM chunk as space there is limited, because a copy of the chunk item gets placed in the superblock. The changes in fs/btrfs/volumes.h are because we're adding a new block group type bit after the profile bits, and so can no longer rely on the const_ilog2 trick. The sizing to 32MB per chunk, matching the SYSTEM chunk, is an estimate here, we can adjust it later if it proves to be too big or too small. This works out to be ~500,000 remap items, which for a 4KB block size covers ~2GB of remapped data in the worst case and ~500TB in the best case. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-03 07:54:27 +01:00
Mark Harmstone	ef6a31d035	btrfs: add definitions and constants for remap-tree Add an incompat flag for the new remap-tree feature, and the constants and definitions needed to support it. Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-02-03 07:54:02 +01:00
Babis Chalios	3b1526ddb2	ptp: vmclock: support device notifications Add optional support for device notifications in VMClock. When supported, the hypervisor will send a device notification every time it updates the seq_count to a new even value. Moreover, add support for poll() in VMClock as a means to propagate this notification to user space. poll() will return a POLLIN event to listeners every time seq_count changes to a value different than the one last seen (since open() or last read()/pread()). This means that when poll() returns a POLLIN event, listeners need to use read() to observe what has changed and update the reader's view of seq_count. In other words, after a poll() returned, all subsequent calls to poll() will immediately return with a POLLIN event until the listener calls read(). The device advertises support for the notification mechanism by setting flag VMCLOCK_FLAG_NOTIFICATION_PRESENT in vmclock_abi flags field. If the flag is not present the driver won't setup the ACPI notification handler and poll() will always immediately return POLLHUP. Signed-off-by: Babis Chalios <bchalios@amazon.es> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Takahiro Itazuri <itazur@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Tested-by: Takahiro Itazuri <itazur@amazon.com> Link: https://patch.msgid.link/20260130173704.12575-3-itazur@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-02 18:06:00 -08:00
Babis Chalios	3495064b6d	ptp: vmclock: add vm generation counter Similar to live migration, loading a VM from some saved state (aka snapshot) is also an event that calls for clock adjustments in the guest. However, guests might want to take more actions as a response to such events, e.g. as discarding UUIDs, resetting network connections, reseeding entropy pools, etc. These are actions that guests don't typically take during live migration, so add a new field in the vmclock_abi called vm_generation_counter which informs the guest about such events. Hypervisor advertises support for vm_generation_counter through the VMCLOCK_FLAG_VM_GEN_COUNTER_PRESENT flag. Users need to check the presence of this bit in vmclock_abi flags field before using this flag. Signed-off-by: Babis Chalios <bchalios@amazon.es> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Tested-by: Takahiro Itazur <itazur@amazon.com> Link: https://patch.msgid.link/20260130173704.12575-2-itazur@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-02-02 18:06:00 -08:00
Kalesh AP	13edc7d4e0	RDMA/bnxt_re: Report packet pacing capabilities when querying device Enable the support to report packet pacing capabilities from kernel to user space. Packet pacing allows to limit the rate to any number between the maximum and minimum. The capabilities are exposed to user space through query_device. The following capabilities are reported: 1. The maximum and minimum rate limit in kbps. 2. Bitmap showing which QP types support rate limit. Signed-off-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20260202133413.3182578-3-kalesh-anakkur.purayil@broadcom.com Reviewed-by: Anantha Prabhu <anantha.prabhu@broadcom.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2026-02-02 08:37:59 -05:00
Johannes Berg	072e6f7f41	wifi: cfg80211: add initial UHR support Add initial support for making UHR connections (or suppressing that), adding UHR capable stations on the AP side, encoding and decoding UHR MCSes (except rate calculation for the new MCSes 17, 19, 20 and 23) as well as regulatory support. Link: https://patch.msgid.link/20260130164259.54cc12fbb307.I26126bebd83c7ab17e99827489f946ceabb3521f@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2026-02-02 10:11:07 +01:00
Dave Airlie	a60f627cf4	amd-drm-next-6.20-2026-01-30: amdgpu: - Misc cleanups - SMU 13 fixes - SMU 14 fixes - GPUVM fault filter fix - USB4 fixes - DC FP guard fixes - Powergating fix - JPEG ring reset fix - RAS fixes - Xclk fix for soc21 APUs - Fix COND_EXEC handling for GC 11 - UserQ fixes - MQD size alignment fixes - SMU feature interface cleanup - GC 10-12 KGQ init fixes - GC 11-12 KGQ reset fixes amdkfd: - Fix device snapshot reporting - GC 12.1 trap handler fixes - MQD size alignment fixes -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCaXz3RgAKCRC93/aFa7yZ 2FonAQCqJopJBlBzJnRfwz+ua2LE6TNNECt3/iuJu1MhpEPE7AD+K/j67yoQQCGJ KMxhcwn32Cii6LOlFbRCMSO1d8rL9Qo= =Cg8B -----END PGP SIGNATURE----- Merge tag 'amd-drm-next-6.20-2026-01-30' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.20-2026-01-30: amdgpu: - Misc cleanups - SMU 13 fixes - SMU 14 fixes - GPUVM fault filter fix - USB4 fixes - DC FP guard fixes - Powergating fix - JPEG ring reset fix - RAS fixes - Xclk fix for soc21 APUs - Fix COND_EXEC handling for GC 11 - UserQ fixes - MQD size alignment fixes - SMU feature interface cleanup - GC 10-12 KGQ init fixes - GC 11-12 KGQ reset fixes amdkfd: - Fix device snapshot reporting - GC 12.1 trap handler fixes - MQD size alignment fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patch.msgid.link/20260130183257.28879-1-alexander.deucher@amd.com	2026-02-02 05:51:54 +10:00
Wang Yaxin	503efe850c	delayacct: add timestamp of delay max Problem ======= Commit `658eb5ab91` ("delayacct: add delay max to record delay peak") introduced the delay max for getdelays, which records abnormal latency peaks and helps us understand the magnitude of such delays. However, the peak latency value alone is insufficient for effective root cause analysis. Without the precise timestamp of when the peak occurred, we still lack the critical context needed to correlate it with other system events. Solution ======== To address this, we need to additionally record a precise timestamp when the maximum latency occurs. By correlating this timestamp with system logs and monitoring metrics, we can identify processes with abnormal resource usage at the same moment, which can help us to pinpoint root causes. Use Case ======== bash-4.4# ./getdelays -d -t 227 print delayacct stats ON TGID 227 CPU count real total virtual total delay total delay average delay max delay min delay max timestamp 46 188000000 192348334 4098012 0.089ms 0.429260ms 0.051205ms 2026-01-15T15:06:58 IO count delay total delay average delay max delay min delay max timestamp 0 0 0.000ms 0.000000ms 0.000000ms N/A SWAP count delay total delay average delay max delay min delay max timestamp 0 0 0.000ms 0.000000ms 0.000000ms N/A RECLAIM count delay total delay average delay max delay min delay max timestamp 0 0 0.000ms 0.000000ms 0.000000ms N/A THRAS HING count delay total delay average delay max delay min delay max timestamp 0 0 0.000ms 0.000000ms 0.000000ms N/A COMPACT count delay total delay average delay max delay min delay max timestamp 0 0 0.000ms 0.000000ms 0.000000ms N/A WPCOPY count delay total delay average delay max delay min delay max timestamp 182 19413338 0.107ms 0.547353ms 0.022462ms 2026-01-15T15:05:24 IRQ count delay total delay average delay max delay min delay max timestamp 0 0 0.000ms 0.000000ms 0.000000ms N/A Link: https://lkml.kernel.org/r/20260119100241520gWubW8-5QfhSf9gjqcc_E@zte.com.cn Signed-off-by: Wang Yaxin <wang.yaxin@zte.com.cn> Cc: Fan Yu <fan.yu9@zte.com.cn> Cc: Jonathan Corbet <corbet@lwn.net> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2026-01-31 16:16:06 -08:00
Ming Lei	8443e2087e	ublk: add UBLK_F_NO_AUTO_PART_SCAN feature flag Add a new feature flag UBLK_F_NO_AUTO_PART_SCAN to allow users to suppress automatic partition scanning when starting a ublk device. This is useful for some cases in which user don't want to scan partitions. Users still can manually trigger partition scanning later when appropriate using standard tools (e.g., partprobe, blockdev --rereadpt). Reported-by: Yoav Cohen <yoav@nvidia.com> Link: https://lore.kernel.org/linux-block/DM4PR12MB63280C5637917C071C2F0D65A9A8A@DM4PR12MB6328.namprd12.prod.outlook.com/ Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2026-01-31 06:36:41 -07:00
Jakub Kicinski	303c1a66a2	Another fairly large set of changes, notably: - cfg80211/mac80211 - most of EPPKE/802.1X over auth frames support - additional FTM capabilities - split up drop reasons better, removing generic RX_DROP - NAN cleanups/fixes - ath11k: - support for Channel Frequency Response measurement - ath12k: - support for the QCC2072 chipset - iwlwifi: - partial NAN support - UNII-9 support - some UHR/802.11bn FW APIs - remove most of MLO/EHT from iwlmvm (such devices use iwlmld) - rtw89: - preparations for RTL8922DE support -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAml7PQcACgkQ10qiO8sP aAAKiA/6AnyNxa0bX2VFsWYW6KJYnJBVNLlP2ghkV3uIWtJoZdXuQO+W8/cy9Cng yrhfPzNfT+2hqmxasxI0tND3H3tW9CqcwX80J84eP9JCpYuPept9uGpSxPQoQl5J Q2k9gX1NlO/SEa8/mOFDT4EmH0bQobxiN84kxSg6Riaazkj6ZjHVVm/3PgzNhxlA v77m5thlhopzYxKn38qA19E9uHSLcY7XwkeYOZDf00Zhgot29lmDeHOf39IH+HvI +a20q6tW59D7iX2IUyvLnWzFV1iEcJ6ONF/hYJ0r3TlfmX/NDWfOQxx87K8M1Tqh sMa+FGrFdqloE1aYi1l+9m6Wu30pHmh7vhlgskPffPmvG+RkCEQCg1Me7eoFOzTB 81K2CMJ34Cp9se+QdiBtY5GpRPZIOlFmY6ZVyZIoEXHkn6r0R94e6dsMZuFcqjv1 y1dzv7BnraVMAQcqwkE9pQtq6LeJoHl2OUT2JzjbKhQhivMf9YubPBZ2QC1LZdMg NYEX4XSeJ/etpUk1MZFnm5wOw545tMi3U2sAhpYWbE6UBPDrQBvYADqd3lq3DmWe BdCDHTbqMnAJ3C0xFEKTYTmVF8IoFt6eOclFUPw4Uhq+YmU9x8wx1yBQbF9TjyKU a/rDCahmryj5gwD0QFJKhdQjfKaQFVNZWZqaKaokM84+8kIdA2U= =70Rs -----END PGP SIGNATURE----- Merge tag 'wireless-next-2026-01-29' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Johannes Berg says: ==================== Another fairly large set of changes, notably: - cfg80211/mac80211 - most of EPPKE/802.1X over auth frames support - additional FTM capabilities - split up drop reasons better, removing generic RX_DROP - NAN cleanups/fixes - ath11k: - support for Channel Frequency Response measurement - ath12k: - support for the QCC2072 chipset - iwlwifi: - partial NAN support - UNII-9 support - some UHR/802.11bn FW APIs - remove most of MLO/EHT from iwlmvm (such devices use iwlmld) - rtw89: - preparations for RTL8922DE support * tag 'wireless-next-2026-01-29' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (184 commits) wifi: iwlegacy: add missing mutex protection in il4965_store_tx_power() wifi: iwlegacy: add missing mutex protection in il3945_store_measurement() wifi: mac80211: use u64_stats_t with u64_stats_sync properly wifi: p54: Fix memory leak in p54_beacon_update() wifi: cfg80211: treat deprecated INDOOR_SP_AP_OLD control value as LPI mode wifi: rtw88: sdio: Migrate to use sdio specific shutdown function wifi: rsi: sdio: Migrate to use sdio specific shutdown function sdio: Provide a bustype shutdown function wifi: nl80211/cfg80211: support operating as RSTA in PMSR FTM request wifi: nl80211/cfg80211: add negotiated burst period to FTM result wifi: nl80211/cfg80211: clarify periodic FTM parameters for non-EDCA based ranging wifi: nl80211/cfg80211: add new FTM capabilities wifi: iwlwifi: rename struct iwl_mcc_allowed_ap_type_cmd::offset_map wifi: iwlwifi: mvm: Remove link_id from time_events wifi: iwlwifi: mld: change cluster_id type to u8 array wifi: iwlwifi: support V13 of iwl_lari_config_change_cmd wifi: iwlwifi: split bios_value_u32 to separate the header wifi: iwlwifi: uefi: cache the DSM functions wifi: iwlwifi: acpi: cache the DSM functions wifi: iwlwifi: mvm: Cleanup MLO code ... ==================== Link: https://patch.msgid.link/20260129110136.176980-39-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-29 19:17:43 -08:00
Ivan Vecera	bc443c253f	dpll: expose fractional frequency offset in ppt Currently, the dpll subsystem exports the fractional frequency offset (FFO) in parts per million (ppm). This granularity is insufficient for high-precision synchronization scenarios which often require parts per trillion (ppt) resolution. Add a new netlink attribute DPLL_A_PIN_FRACTIONAL_FREQUENCY_OFFSET_PPT to expose the FFO in ppt. Update the dpll netlink core to expect the driver-provided FFO value to be in ppt. To maintain backward compatibility with existing userspace tools, populate the legacy DPLL_A_PIN_FRACTIONAL_FREQUENCY_OFFSET attribute by dividing the new ppt value by 1,000,000. Update the zl3073x and mlx5 drivers to provide the FFO value in ppt: - zl3073x: adjust the fixed-point calculation to produce ppt (10^12) instead of ppm (10^6). - mlx5: scale the existing ppm value by 1,000,000. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20260126162253.27890-1-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-29 18:21:16 -08:00
Jakub Kicinski	a010fe8d86	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.19-rc8). No adjacent changes, conflicts: drivers/net/ethernet/spacemit/k1_emac.c `2c84959167` ("net: spacemit: Check for netif_carrier_ok() in emac_stats_update()") `f66086798f` ("net: spacemit: Remove broken flow control support") https://lore.kernel.org/aXjAqZA3iEWD_DGM@sirena.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-29 17:28:54 -08:00
Koichiro Den	8cf82bb558	misc: pci_endpoint_test: Add BAR subrange mapping test case Add a new PCITEST_BAR_SUBRANGE ioctl to exercise EPC BAR subrange mapping end-to-end. The test programs a simple 2-subrange layout on the endpoint (via pci-epf-test) and verifies that: - the endpoint-provided per-subrange signature bytes are observed at the expected PCIe BAR offsets, and - writes to each subrange are routed to the correct backing region (i.e. the submap order is applied rather than accidentally working due to an identity mapping). Return -EOPNOTSUPP when the endpoint does not advertise subrange mapping, -ENODATA when the BAR is disabled, and -EBUSY when the BAR is reserved for the test register space. Signed-off-by: Koichiro Den <den@valinux.co.jp> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260124145012.2794108-8-den@valinux.co.jp	2026-01-29 17:42:29 -06:00
Andrey Albershteyn	0e6b7eae1f	fs: add FS_XFLAG_VERITY for fs-verity files fs-verity introduced inode flag for inodes with enabled fs-verity on them. This patch adds FS_XFLAG_VERITY file attribute which can be retrieved with FS_IOC_FSGETXATTR ioctl() and file_getattr() syscall. This flag is read-only and can not be set with corresponding set ioctl() and file_setattr(). The FS_IOC_SETFLAGS requires file to be opened for writing which is not allowed for verity files. The FS_IOC_FSSETXATTR and file_setattr() clears this flag from the user input. As this is now common flag for both flag interfaces (flags/xflags) add it to overlapping flags list to exclude it from overwrite. Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org> Link: https://patch.msgid.link/20260126115658.27656-2-aalbersh@kernel.org Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2026-01-29 16:00:57 +01:00
Deepak Gupta	2af7c9cf02	riscv/ptrace: expose riscv CFI status and state via ptrace and in core files Expose a new register type NT_RISCV_USER_CFI for risc-v CFI status and state. Intentionally, both landing pad and shadow stack status and state are rolled into the CFI state. Creating two different NT_RISCV_USER_XXX would not be useful and would waste a note type. Enabling, disabling and locking the CFI feature is not allowed via ptrace set interface. However, setting 'elp' state or setting shadow stack pointer are allowed via the ptrace set interface. It is expected that 'gdb' might need to fixup 'elp' state or 'shadow stack' pointer. Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-19-b55691eacf4f@rivosinc.com [pjw@kernel.org: updated to apply; cleaned patch description and comments; addressed checkpatch issues] Signed-off-by: Paul Walmsley <pjw@kernel.org>	2026-01-29 02:38:40 -07:00
Deepak Gupta	5ca243f6e3	prctl: add arch-agnostic prctl()s for indirect branch tracking Three architectures (x86, aarch64, riscv) have support for indirect branch tracking feature in a very similar fashion. On a very high level, indirect branch tracking is a CPU feature where CPU tracks branches which use a memory operand to transfer control. As part of this tracking, during an indirect branch, the CPU expects a landing pad instruction on the target PC, and if not found, the CPU raises some fault (architecture-dependent). x86 landing pad instr - 'ENDBRANCH' arch64 landing pad instr - 'BTI' riscv landing instr - 'lpad' Given that three major architectures have support for indirect branch tracking, this patch creates architecture-agnostic 'prctls' to allow userspace to control this feature. They are: - PR_GET_INDIR_BR_LP_STATUS: Get the current configured status for indirect branch tracking. - PR_SET_INDIR_BR_LP_STATUS: Set the configuration for indirect branch tracking. The following status options are allowed: - PR_INDIR_BR_LP_ENABLE: Enables indirect branch tracking on user thread. - PR_INDIR_BR_LP_DISABLE: Disables indirect branch tracking on user thread. - PR_LOCK_INDIR_BR_LP_STATUS: Locks configured status for indirect branch tracking for user thread. Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Zong Li <zong.li@sifive.com> Signed-off-by: Deepak Gupta <debug@rivosinc.com> Tested-by: Andreas Korb <andreas.korb@aisec.fraunhofer.de> # QEMU, custom CVA6 Tested-by: Valentin Haudiquet <valentin.haudiquet@canonical.com> Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-13-b55691eacf4f@rivosinc.com [pjw@kernel.org: cleaned up patch description, code comments] Signed-off-by: Paul Walmsley <pjw@kernel.org>	2026-01-29 02:36:32 -07:00
Eugenio Pérez	079212f687	vduse: add vq group asid support Add support for assigning Address Space Identifiers (ASIDs) to each VQ group. This enables mapping each group into a distinct memory space. The vq group to ASID association is protected by a rwlock now. But the mutex domain_lock keeps protecting the domains of all ASIDs, as some operations like the one related with the bounce buffer size still requires to lock all the ASIDs. Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20260119143306.1818855-12-eperezma@redhat.com>	2026-01-28 15:32:18 -05:00
Eugenio Pérez	9350a09afd	vduse: add vq group support This allows separate the different virtqueues in groups that shares the same address space. Asking the VDUSE device for the groups of the vq at the beginning as they're needed for the DMA API. Allocating 3 vq groups as net is the device that need the most groups: * Dataplane (guest passthrough) * CVQ * Shadowed vrings. Future versions of the series can include dynamic allocation of the groups array so VDUSE can declare more groups. Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20260119143306.1818855-4-eperezma@redhat.com>	2026-01-28 15:32:16 -05:00
Eugenio Pérez	a006ed4ecd	vduse: add v1 API definition This allows the kernel to detect whether the userspace VDUSE device supports the VQ group and ASID features. VDUSE devices that don't set the V1 API will not receive the new messages, and vdpa device will be created with only one vq group and asid. The next patches implement the new feature incrementally, only enabling the VDUSE device to set the V1 API version by the end of the series. Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20260119143306.1818855-3-eperezma@redhat.com>	2026-01-28 15:32:16 -05:00
Jeff Layton	d8316b837c	nfsd: add controls to set the minimum number of threads per pool Add a new "min_threads" variable to the nfsd_net, along with the corresponding netlink interface, to set that value from userland. Pass that value to svc_set_pool_threads() and svc_set_num_threads(). Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2026-01-28 10:15:42 -05:00

1 2 3 4 5 ...

15939 commits