This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
- minor preparations for SMP support
- SPARSE_IRQ support for kunit
- help output cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmjjjKQACgkQ10qiO8sP
aAA/Sw//SVjKUs6SVnaglxPQl/qRksULuakpejp2n6fdWL6EC25W6w1Zpl+iV0Yx
asMYb6u9cZQxk8bBYXg3luXQF9YdE7MorcaOBQKNxJX5PZheekexTjvV2An0RcM8
t/g4hQXMFN7xv/k7KQ+DTw9l2jEKK21MltsOvB6VXWyBQ+cnvR66nVzzootH9vNR
GELDhd/OFx8M9XMumTNLLHw4GsdLbtUKTFOxnjnBfJf4K0xa/O2wInE0/zrpnsZ4
M6Aj73nxeZ5peKIKmqgzxHcJiwjVx2jwqzwjE2EVmxlUNZjGZ9asTGMnmYHwd7YF
6Z+SrI6vQhBBC4SrDU2NEJxyNg876zjt+ua+D71IRjd0wQhms5nWJ1kDNmpQwWbh
r6RkbddPlpPASQne30EXXZKwOsYQc2IT3Yf1GkOa+2RzNu0WIhQUkEfi8J+n8Jle
ERPDdP9dKHuuPHmZ/CvjLm+PEIVkUKlq0R8+uiWFG8AbBjdyGlue5C4kixfhoZ6I
yuiO/GMRotNaG6Nlr5rfpuWSNSWjY5kNpMFsfV+kVUWPCJavk/MXFFiIZfS4SI12
0GzIE7k1rwKo52EsV4gMnRtuvPioJ/mz335RfxT9ZOfaed9hw/Cl3d8aIw7WuNXz
CFwHvSVe51fMaLJCLCiWKpBPcDioMymI72znWLqgHwqDvv9GJSM=
=lpSi
-----END PGP SIGNATURE-----
Merge tag 'uml-for-linux-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
Pull uml updates from Johannes Berg:
- minor preparations for SMP support
- SPARSE_IRQ support for kunit
- help output cleanups
* tag 'uml-for-linux-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux:
um: Remove unused ipi_pipe field from cpuinfo_um
um: Remove unused cpu_data and current_cpu_data macros
um: Stop tracking virtual CPUs via mm_cpumask()
um: Centralize stub size calculations
um: Remove outdated comment about STUB_DATA_PAGES
um: Remove unused offset and child_err fields from stub_data
um: Indent time-travel help messages
um: Fix help message for ssl-non-raw
um: vector: Fix indentation for help message
um: Add missing trailing newline to help messages
um: virtio-pci: implement .shutdown()
um: Support SPARSE_IRQ
Some help messages are missing a trailing newline. They should
end with two newlines, but only one is present. Add the missing
newline to make the --help output more readable.
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Instances are happier that way and it makes more sense anyway -
the only part of the result that is related to partition we are given
is the start sector, and that has been filled in by the caller.
Everything else is a function of the disk. Only one instance
(DASD) is ever looking at anything other than bdev->bd_disk and
that one is trivial to adjust.
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The ubd io thread and UML kernel thread share the same errno, which
can lead to conflicts when both call syscalls concurrently. Switch to
the pthread-based helper to address this issue.
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Link: https://patch.msgid.link/20250319135523.97050-3-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
BLK_MQ_F_SHOULD_MERGE is set for all tag_sets except those that purely
process passthrough commands (bsg-lib, ufs tmf, various nvme admin
queues) and thus don't even check the flag. Remove it to simplify the
driver interface.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20241219060214.1928848-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Currently, the initialization of the disk pointer in the ubd structure
is missing. It should be initialized with the allocated gendisk pointer
in ubd_add().
Fixes: 32621ad7a7 ("ubd: remove the ubd_gendisk array")
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Link: https://patch.msgid.link/20241104163203.435515-2-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The ubd io thread is not really a traditional thread. Set the
parent-death signal for it to ensure that it will be killed if
the UML kernel dies unexpectedly without proper cleanup.
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Link: https://patch.msgid.link/20241024142828.2612828-3-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
- Support for preemption
- i386 Rust support
- Huge cleanup by Benjamin Berg
- UBSAN support
- Removal of dead code
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAmahIpkWHHJpY2hhcmRA
c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wW4PD/wN03iDGNTPhegGgXTJTSwA8Gwk
i5JTEmhc84ifE9/bJpru8w4mcLMiWLWFIpF4bGcqfKLp67tTi3jn9Vk7ivaYkn2G
S875GqjdyqMVMfhJX+1qTxM6q/J5B7XGUpt1Zrot3AY1ANxnlwYscWX8jNvwmf+5
eCK9+xldkNWh1N67EjwsDgH6kkWyx3fcEe4E3gjXY0eSZtIwO/ZXYHSCSKznJOfu
iXo1Sx02w8TZp4tf/EwpWR1SMkPL23X8Of+rmiyI5udyLZixTnrFlclu8WUK4ZBO
ExYvOrzyYZ3E/mPFZf0E88h8xC3ETLsiHO3++JRAM1uDMp1+a6tPK7Bi6NTytemH
PIT++XRiORAbXu3aSTjpFDAhTHIMZ925eJMvQAtVhtAAwbkjSNh9NbusbMiucPNm
vvtYrEqYjPJpx+HRxy8kUywe/+jFLYofSDn6YrNRM+3HaM44YgkvbD6AOEMxWq19
YWkflmkDADez6eti03bAbiVuBB1v+Vnuz15ofrx45IUubb3uGVJYwEqQA5u8bAVr
H4NeIWDRpXOuYLgSyxRLFFVYhe6eAWbAXeSWBFxcGNDY6OBpMqr7kgV1mBOZtooK
8aBgZ0YcyiTpmiEevskkNWSBnqUMKIdztKkD7Db9HfCgd9yy7Vvfl+iLJTIqFJ5m
JxpvTy3it53ghQj40A==
=ybqq
-----END PGP SIGNATURE-----
Merge tag 'uml-for-linus-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
Pull UML updates from Richard Weinberger:
- Support for preemption
- i386 Rust support
- Huge cleanup by Benjamin Berg
- UBSAN support
- Removal of dead code
* tag 'uml-for-linus-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (41 commits)
um: vector: always reset vp->opened
um: vector: remove vp->lock
um: register power-off handler
um: line: always fill *error_out in setup_one_line()
um: remove pcap driver from documentation
um: Enable preemption in UML
um: refactor TLB update handling
um: simplify and consolidate TLB updates
um: remove force_flush_all from fork_handler
um: Do not flush MM in flush_thread
um: Delay flushing syscalls until the thread is restarted
um: remove copy_context_skas0
um: remove LDT support
um: compress memory related stub syscalls while adding them
um: Rework syscall handling
um: Add generic stub_syscall6 function
um: Create signal stack memory assignment in stub_data
um: Remove stub-data.h include from common-offsets.h
um: time-travel: fix signal blocking race/hang
um: time-travel: remove time_exit()
...
Conceptually, we want the memory mappings to always be up to date and
represent whatever is in the TLB. To ensure that, we need to sync them
over in the userspace case and for the kernel we need to process the
mappings.
The kernel will call flush_tlb_* if page table entries that were valid
before become invalid. Unfortunately, this is not the case if entries
are added.
As such, change both flush_tlb_* and set_ptes to track the memory range
that has to be synchronized. For the kernel, we need to execute a
flush_tlb_kern_* immediately but we can wait for the first page fault in
case of set_ptes. For userspace in contrast we only store that a range
of memory needs to be synced and do so whenever we switch to that
process.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Link: https://patch.msgid.link/20240703134536.1161108-13-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Commit fb5d1d389c ("ubd: open the backing files in ubd_add")
removed the last use of ubd_mutex.
Remove it.
Build and kernel startup test only.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20240505001508.255096-1-linux@treblig.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Move the nonrot flag into the queue_limits feature field so that it can
be set atomically with the queue frozen.
Use the chance to switch to defaulting to non-rotational and require
the driver to opt into rotational, which matches the polarity of the
sysfs interface.
For the z2ram, ps3vram, 2x memstick, ubiblock and dcssblk the new
rotational flag is not set as they clearly are not rotational despite
this being a behavior change. There are some other drivers that
unconditionally set the rotational flag to keep the existing behavior
as they arguably can be used on rotational devices even if that is
probably not their main use today (e.g. virtio_blk and drbd).
The flag is automatically inherited in blk_stack_limits matching the
existing behavior in dm and md.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20240617060532.127975-15-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Move the cache control settings into the queue_limits so that the flags
can be set atomically with the device queue frozen.
Add new features and flags field for the driver set flags, and internal
(usually sysfs-controlled) flags in the block layer. Note that we'll
eventually remove enough field from queue_limits to bring it back to the
previous size.
The disable flag is inverted compared to the previous meaning, which
means it now survives a rescan, similar to the max_sectors and
max_discard_sectors user limits.
The FLUSH and FUA flags are now inherited by blk_stack_limits, which
simplified the code in dm a lot, but also causes a slight behavior
change in that dm-switch and dm-unstripe now advertise a write cache
despite setting num_flush_bios to 0. The I/O path will handle this
gracefully, but as far as I can tell the lack of num_flush_bios
and thus flush support is a pre-existing data integrity bug in those
targets that really needs fixing, after which a non-zero num_flush_bios
should be required in dm for targets that map to underlying devices.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20240617060532.127975-14-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
A few drivers optimistically try to support discard, write zeroes and
secure erase and disable the features from the I/O completion handler
if the hardware can't support them. This disable can't be done using
the atomic queue limits API because the I/O completion handlers can't
take sleeping locks or freeze the queue. Keep the existing clearing
of the relevant field to zero, but replace the old blk_queue_max_*
APIs with new disable APIs that force the value to 0.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20240531074837.1648501-15-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Discard and Write Zeroes are different operation and implemented
by different fallocate opcodes for ubd. If one fails the other one
can work and vice versa.
Split the code to disable the operations in ubd_handler to only
disable the operation that actually failed.
Fixes: 50109b5a03 ("um: Add support for DISCARD in the UBD Driver")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Link: https://lore.kernel.org/r/20240531074837.1648501-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Instead of a separate handler function that leaves no work in the
interrupt hanler itself, split out a per-request end I/O helper and
clean up the coding style and variable naming while we're at it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Link: https://lore.kernel.org/r/20240531074837.1648501-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When kmalloc_array() fails to allocate memory, the ubd_init()
should return -ENOMEM instead of -1. So, fix it.
Fixes: f88f0bdfc3 ("um: UBD Improvements")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Richard Weinberger <richard@nod.at>
Opening the backing device only when the block device is opened is
a bit weird as no one configures block devices to not use them.
Opend them at add time, close them at remove time and remove the
now superflous opened counter as remove can simply check for
disk_openers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Richard Weinberger <richard@nod.at>
Link: https://lore.kernel.org/r/20240222072417.3773131-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
No need for it now, everything goes through the gendisk.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Richard Weinberger <richard@nod.at>
Link: https://lore.kernel.org/r/20240222072417.3773131-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
No need to delay this until open time.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Richard Weinberger <richard@nod.at>
Link: https://lore.kernel.org/r/20240222072417.3773131-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
No reason to delay this until open time.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Richard Weinberger <richard@nod.at>
Link: https://lore.kernel.org/r/20240222072417.3773131-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
No reason to delay this until open time.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Richard Weinberger <richard@nod.at>
Link: https://lore.kernel.org/r/20240222072417.3773131-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fold it into the only caller to remove lots of references to the
global ubd_devs array.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Richard Weinberger <richard@nod.at>
Link: https://lore.kernel.org/r/20240222072417.3773131-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
And add a disk pointer to the ubd structure instead to keep all
the per-device information together.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Richard Weinberger <richard@nod.at>
Link: https://lore.kernel.org/r/20240222072417.3773131-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pass the few limits ubd imposes directly to blk_mq_alloc_disk instead
of setting them one at a time.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20240215070300.2200308-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pass a queue_limits to blk_mq_alloc_disk and apply it if non-NULL. This
will allow allocating queues with valid queue limits instead of setting
the values one at a time later.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20240213073425.1621680-11-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The discard granularity now defaults to a single sector, so don't set
that value explicitly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Richard Weinberger <richard@nod.at>
Link: https://lore.kernel.org/r/20231228075545.362768-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The only overlap between the block open flags mapped into the fmode_t and
other uses of fmode_t are FMODE_READ and FMODE_WRITE. Define a new
blk_mode_t instead for use in blkdev_get_by_{dev,path}, ->open and
->ioctl and stop abusing fmode_t.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd]
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20230608110258.189493-28-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This code has been dead forever, make sure it doesn't show up in code
searches.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Acked-by: Richard Weinberger <richard@nod.at>
Link: https://lore.kernel.org/r/20230608110258.189493-25-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The mode argument to the ->release block_device_operation is never used,
so remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd]
Link: https://lore.kernel.org/r/20230608110258.189493-10-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
->open is only called on the whole device. Make that explicit by
passing a gendisk instead of the block_device.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Christian Brauner <brauner@kernel.org>
Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd]
Link: https://lore.kernel.org/r/20230608110258.189493-9-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Improve static type checking by using type enum req_op instead of int where
appropriate.
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-21-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_cleanup_disk is nothing but a trivial wrapper for put_disk now,
so remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20220619060552.1850436-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The discard_alignment queue limit is named a bit misleading means the
offset into the block device at which the discard granularity starts.
Setting it to the discard granularity as done by ubd is mostly harmless
but also useless.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220418045314.360785-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Just use a non-zero max_discard_sectors as an indicator for discard
support, similar to what is done for write zeroes.
The only places where needs special attention is the RAID5 driver,
which must clear discard support for security reasons by default,
even if the default stacking rules would allow for it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Call to fallocate with FALLOC_FL_PUNCH_HOLE on a device backed by a sparse
file can end up by missing data, zeroes data range, if the underlying file
is used with a tool like bmaptool which will referenced only used spaces.
Signed-off-by: Frédéric Danis <frederic.danis@collabora.com>
Acked-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
We never checked for errors on add_disk() as this function
returned void. Now that this is fixed, use the shiny new
error handling.
ubd_disk_register() never returned an error, so just fix
that now and let the caller handle the error condition.
Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20211015233028.2167651-8-mcgrof@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Drop various include not actually used in genhd.h itself, and
move the remaning includes closer together.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210920123328.1399408-15-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Use bvec_virt instead of open coding it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Link: https://lore.kernel.org/r/20210804095634.460779-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmDnGVYQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpv6UEAC78zkseI8TmKaowNfkz/+MkP9eSFb1pVn3
rxpbPOsZompHoZpeWt4oHL+3Rmm3a9iRo/APA2ELas4zvp+Q+6uG7eha2Dc4hUA9
YgeO4z9YfG8wQNZc3x7bncb6ZwqEE5nnbFe/m25SyrAZVLlZ7FKHxfoZDqjhlGFC
eLNiYO6vdvwgCoBMcotyCDttrPfEu6947/5vB1zevv57twdQQaEWGUhvyx1XrlDX
0YD5fmdOjNU2isgxt4xo2Ur2zL6w254/hvj58sV3Z7JfkJpI9DCK+ztKEfzuyEhA
WYz06rDAT1+1KuVLfowaZ+pYiPPOIsL0+QXI83r3nLaE7WGGlfS8Hmz//1FbziYs
ZSZI826kEN+/lKeWTcKOOMhmkYyXEFFuQZS34eg9KI4xwML8v+ILlHmcp+tjebw9
vzNF6f7N2ki+jnyxxyNxeMHxeAMWsqnIRROOhZg6bbs6UVNpDy4qRzpQaDOaJsVe
uSAQ6PTd/etR9KE+ClhLe6X7Rmp/lfZCPe64wqM/3k1qV2KWhE1fwCQO4c5o1MBN
rpk3Ef5PZYP3aakCvZnfcjMWlpZNbq/xMc6vPc+yq32akq1t1KbODVBiR5odcH0C
Gt5N11im50SO06haBt7EOe4JMQLbK5sxG15t4C6mNQZgPegGfaLlVkKpzIkOzUha
OkRofKMcDA==
=gHse
-----END PGP SIGNATURE-----
Merge tag 'block-5.14-2021-07-08' of git://git.kernel.dk/linux-block
Pull more block updates from Jens Axboe:
"A combination of changes that ended up depending on both the driver
and core branch (and/or the IDE removal), and a few late arriving
fixes. In detail:
- Fix io ticks wrap-around issue (Chunguang)
- nvme-tcp sock locking fix (Maurizio)
- s390-dasd fixes (Kees, Christoph)
- blk_execute_rq polling support (Keith)
- blk-cgroup RCU iteration fix (Yu)
- nbd backend ID addition (Prasanna)
- Partition deletion fix (Yufen)
- Use blk_mq_alloc_disk for mmc, mtip32xx, ubd (Christoph)
- Removal of now dead block request types due to IDE removal
(Christoph)
- Loop probing and control device cleanups (Christoph)
- Device uevent fix (Christoph)
- Misc cleanups/fixes (Tetsuo, Christoph)"
* tag 'block-5.14-2021-07-08' of git://git.kernel.dk/linux-block: (34 commits)
blk-cgroup: prevent rcu_sched detected stalls warnings while iterating blkgs
block: fix the problem of io_ticks becoming smaller
nvme-tcp: can't set sk_user_data without write_lock
loop: remove unused variable in loop_set_status()
block: remove the bdgrab in blk_drop_partitions
block: grab a device refcount in disk_uevent
s390/dasd: Avoid field over-reading memcpy()
dasd: unexport dasd_set_target_state
block: check disk exist before trying to add partition
ubd: remove dead code in ubd_setup_common
nvme: use return value from blk_execute_rq()
block: return errors from blk_execute_rq()
nvme: use blk_execute_rq() for passthrough commands
block: support polling through blk_execute_rq
block: remove REQ_OP_SCSI_{IN,OUT}
block: mark blk_mq_init_queue_data static
loop: rewrite loop_exit using idr_for_each_entry
loop: split loop_lookup
loop: don't allow deleting an unspecified loop device
loop: move loop_ctl_mutex locking into loop_add
...
Remove some leftovers of the fake major number parsing that cause
complains from some compilers.
Fixes: 2933a1b2c6f3 ("ubd: remove the code to register as the legacy IDE driver")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210628093937.1325608-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Use blk_mq_alloc_disk and blk_cleanup_disk to simplify the gendisk and
request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210614060759.3965724-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
With the legacy IDE driver long deprecated, and modern userspace being
much more flexible about dev_t assignments there is no reason to fake
a registration as the legacy IDE driver in ubd. This registeration
is a little problematic as it registers the same request_queue for
multiple gendisks, so just remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Link: https://lore.kernel.org/r/20210614060759.3965724-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
GCC assumes that stack is aligned to 16-byte on call sites [1].
Since GCC 8, GCC began using 16-byte aligned SSE instructions to
implement assignments to structs on stack. When
CC_OPTIMIZE_FOR_PERFORMANCE is enabled, this affects
os-Linux/sigio.c, write_sigio_thread:
struct pollfds *fds, tmp;
tmp = current_poll;
Note that struct pollfds is exactly 16 bytes in size.
GCC 8+ generates assembly similar to:
movdqa (%rdi),%xmm0
movaps %xmm0,-0x50(%rbp)
This is an issue, because movaps will #GP if -0x50(%rbp) is not
aligned to 16 bytes [2], and how rbp gets assigned to is via glibc
clone thread_start, then function prologue, going though execution
trace similar to (showing only relevant instructions):
sub $0x10,%rsi
mov %rcx,0x8(%rsi)
mov %rdi,(%rsi)
syscall
pop %rax
pop %rdi
callq *%rax
push %rbp
mov %rsp,%rbp
The stack pointer always points to the topmost element on stack,
rather then the space right above the topmost. On push, the
pointer decrements first before writing to the memory pointed to
by it. Therefore, there is no need to have the stack pointer
pointer always point to valid memory unless the stack is poped;
so the `- sizeof(void *)` in the code is unnecessary.
On the other hand, glibc reserves the 16 bytes it needs on stack
and pops itself, so by the call instruction the stack pointer
is exactly the caller-supplied sp. It then push the 16 bytes of
the return address and the saved stack pointer, so the base
pointer will be 16-byte aligned if and only if the caller
supplied sp is 16-byte aligned. Therefore, the caller must supply
a 16-byte aligned pointer, which `stack + UM_KERN_PAGE_SIZE`
already satisfies.
On a side note, musl is unaffected by this issue because it forces
16 byte alignment via `and $-16,%rsi` in its clone wrapper.
Similarly, glibc i386 is also unaffected because it has
`andl $0xfffffff0, %ecx`.
To reproduce this bug, enable CONFIG_UML_RTC and
CC_OPTIMIZE_FOR_PERFORMANCE. uml_rtc will call
add_sigio_fd which will then cause write_sigio_thread to either go
into segfault loop or panic with "Segfault with no mm".
Similarly, signal stacks will be aligned by the host kernel upon
signal delivery. `- sizeof(void *)` to sigaltstack is
unconventional and extraneous.
On a related note, initialization of longjmp buffers do require
`- sizeof(void *)`. This is to account for the return address
that would have been pushed to the stack at the call site.
The reason for uml to respect 16-byte alignment, rather than
telling GCC to assume 8-byte alignment like the host kernel since
commit d9b0cde91c ("x86-64, gcc: Use
-mpreferred-stack-boundary=3 if supported"), is because uml links
against libc. There is no reason to assume libc is also compiled
with that flag and assumes 8-byte alignment rather than 16-byte.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40838
[2] https://c9x.me/x86/html/file_module_x86_id_180.html
Signed-off-by: YiFei Zhu <zhuyifei1999@gmail.com>
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Richard Weinberger <richard@nod.at>
This reverts commit ef4459a6da ("um: allocate a guard page to
helper threads"), it's broken in multiple ways:
1) the free no longer matches the alloc; and
2) more importantly, the set_memory_ro() causes allocation of
page tables for the normal memory that doesn't have any,
and that later causes corruption and crashes (usually but
not always in vfree()).
We could fix the first bug and use vmalloc() to work around the
second, but set_memory_ro() actually doesn't do anything either
so I'll just revert that as well.
Reported-by: Benjamin Berg <benjamin@sipsolutions.net>
Fixes: ef4459a6da ("um: allocate a guard page to helper threads")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
This commit fixes a regression to handle command line parameters of ubd.
With a simple line "./linux ubd0="./disk-ext4.img", it fails at
ubd_setup_common(). The commit adds additional checks to the variables
in order to properly parse the paremeters which previously worked.
Fixes: ef3ba87cb7 ("um: ubd: Set device serial attribute from cmdline")
Cc: Christopher Obbard <chris.obbard@collabora.com>
Signed-off-by: Hajime Tazaki <thehajime@gmail.com>
Acked-by: Christopher Obbard <chris.obbard@collabora.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
We've been running into stack overflows in helper threads
corrupting memory (e.g. because somebody put printf() or
os_info() there), so to avoid those causing hard-to-debug
issues later on, allocate a guard page for helper thread
stacks and mark it read-only.
Unfortunately, the crash dump at that point is useless as
the stack tracer will try to backtrace the *kernel* thread,
not the helper thread, but at least we don't survive to a
random issue caused by corruption.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>