linux/drivers
Oak Zeng ae28e34400 drm/xe: Allow scratch page under fault mode for certain platform
Normally, a scratch page is not allowed when a VM operates in page
fault mode; i.e., in the existing code, DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE
and DRM_XE_VM_CREATE_FLAG_FAULT_MODE are mutually exclusive. The reason
is that fault mode relies on recoverable page faults to work, while a
scratch page can mute recoverable page faults.

On xe2 and xe3, an out-of-bounds prefetch can cause a page fault and
then a system hang, because xekmd can't resolve such a page fault. The
SYCL and OCL language runtimes require out-of-bounds prefetches to be
silently dropped without causing any functional problem, so the existing
behavior doesn't meet the language runtime requirements.

At the same time, HW prefetching can cause page fault interrupts. Because
of the page fault interrupt overhead (the GuC and KMD must both be involved
to fix the page fault), HW prefetching can be slowed by many orders of
magnitude.

Fix these problems by allowing a scratch page in fault mode on xe2 and
xe3. With a scratch page in place, HW prefetching can always hit the
scratch page instead of causing an interrupt.

A side effect is that a scratch page can hide application program errors:
out-of-bounds accesses by the application are hidden by the scratch page
mapping instead of being reported to the user.
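
The flag rule described above can be sketched as follows. This is a minimal
illustration, not the driver's code: the bit values given to the two flags are
placeholders, and enum sketch_platform and vm_create_flags_valid() are
hypothetical names introduced only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration only; the real definitions
 * live in the xe UAPI header and may differ. */
#define DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE (1u << 0)
#define DRM_XE_VM_CREATE_FLAG_FAULT_MODE   (1u << 2)

/* Hypothetical platform tag; the driver identifies platforms differently. */
enum sketch_platform {
	SKETCH_PRE_XE2,
	SKETCH_XE2,
	SKETCH_XE3,
};

/*
 * Hypothetical validity check for VM creation flags.
 * Scratch page and fault mode stay mutually exclusive, except on the
 * platforms the commit message names (xe2 and xe3).
 */
static bool vm_create_flags_valid(enum sketch_platform platform, uint32_t flags)
{
	bool scratch = (flags & DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE) != 0;
	bool fault = (flags & DRM_XE_VM_CREATE_FLAG_FAULT_MODE) != 0;

	if (scratch && fault)
		return platform == SKETCH_XE2 || platform == SKETCH_XE3;

	/* Either flag alone, or neither, is accepted on every platform. */
	return true;
}
```

Under these assumptions, requesting both flags is accepted for SKETCH_XE2 and
SKETCH_XE3 and rejected for SKETCH_PRE_XE2, while either flag on its own is
accepted everywhere.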

v2: Refine commit message (Thomas)

v3: Move the scratch page flag check to after scratch page wa (Thomas)

v4: drop NEEDS_SCRATCH macro (matt)
    Add a comment to DRM_XE_VM_CREATE_FLAG_SCRATCH_PAGE

Signed-off-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://lore.kernel.org/r/20250403165328.2438690-4-oak.zeng@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
2025-04-07 11:17:30 +05:30
accel accel/amdxdna: Check interrupt register before mailbox_rx_worker exits 2025-02-27 08:41:46 -06:00
accessibility
acpi ACPI: platform_profile: Add support for hidden choices 2025-03-04 20:45:34 +01:00
amba
android binderfs: fix use-after-free in binder_devices 2025-02-20 15:20:11 +01:00
ata ata fixes for 6.14-rc5 2025-03-01 08:59:29 -08:00
atm Get rid of 'remove_new' relic from platform driver struct 2024-12-01 15:12:43 -08:00
auxdisplay auxdisplay for v6.14-1 2025-01-24 08:03:52 -08:00
base Linux 6.14-rc6 2025-03-12 09:43:12 +10:00
bcma Get rid of 'remove_new' relic from platform driver struct 2024-12-01 15:12:43 -08:00
block block-6.14-20250306 2025-03-07 11:12:33 -10:00
bluetooth bluetooth: btusb: Initialize .owner field of force_poll_sync_fops 2025-02-27 16:50:05 -05:00
bus Linux 6.14-rc6 2025-03-12 09:43:12 +10:00
cache
cdrom treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
cdx cdx: Fix possible UAF error in driver_override_show() 2025-02-20 15:19:07 +01:00
char Char/Misc/IIO driver fixes for 6.14-rc6 2025-03-09 09:07:54 -10:00
clk The various patchsets are summarized below. Plus of course many 2025-01-26 18:36:23 -08:00
clocksource irqchip/jcore-aic, clocksource/drivers/jcore: Fix jcore-pit interrupt request 2025-02-17 23:27:49 +01:00
comedi
connector
counter module: Convert symbol namespace to string literal 2024-12-02 11:34:44 -08:00
cpufreq amd-pstate fixes 2/6/25 2025-02-06 20:39:43 +01:00
cpuidle More power management updates for 6.14-rc1 2025-01-30 15:10:34 -08:00
crypto crypto: ccp: Add external API interface for PSP module initialization 2025-02-14 18:39:19 -05:00
cxl cxl changes for v6.14 2025-01-29 11:23:22 -08:00
dax module: Convert symbol namespace to string literal 2024-12-02 11:34:44 -08:00
dca
devfreq Update devfreq next for v6.14 2025-01-13 20:48:34 +01:00
dio
dma dmaengine fixes for v6.14 2025-03-02 10:08:45 -08:00
dma-buf Merge drm/drm-next into drm-misc-next 2025-02-06 13:47:32 +01:00
dpll
edac EDAC/qcom: Correct interrupt enable register configuration 2025-02-14 20:36:11 +01:00
eisa
extcon Update extcon next for v6.14 2025-01-12 13:44:27 +01:00
firewire Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
firmware EFI fixes for v6.14 #2 2025-02-28 08:47:21 -08:00
fpga FPGA Manager changes for 6.14-rc1 2025-01-09 10:56:57 +01:00
fsi Get rid of 'remove_new' relic from platform driver struct 2024-12-01 15:12:43 -08:00
gnss
gpio gpio: rcar: Fix missing of_node_put() call 2025-03-06 15:51:27 +01:00
gpu drm/xe: Allow scratch page under fault mode for certain platform 2025-04-07 11:17:30 +05:30
greybus
hid hid-for-linus-2025030501 2025-03-05 07:46:59 -10:00
hsi Get rid of 'remove_new' relic from platform driver struct 2024-12-01 15:12:43 -08:00
hte Get rid of 'remove_new' relic from platform driver struct 2024-12-01 15:12:43 -08:00
hv treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
hwmon hwmon: fix a NULL vs IS_ERR_OR_NULL() check in xgene_hwmon_probe() 2025-03-03 06:04:34 -08:00
hwspinlock Get rid of 'remove_new' relic from platform driver struct 2024-12-01 15:12:43 -08:00
hwtracing intel_th: pci: Add Panther Lake-P/U support 2025-02-20 09:35:57 +01:00
i2c i2c: amd-asf: Fix EOI register write to enable successive interrupts 2025-02-26 23:28:41 +01:00
i3c I3C for 6.14 2025-01-24 15:48:01 -08:00
idle intel_idle: Handle older CPUs, which stop the TSC in deeper C states, correctly 2025-02-28 22:04:26 +01:00
iio iio: filter: admv8818: Force initialization of SDO 2025-02-08 12:46:32 +00:00
infiniband RDMA/bnxt_re: Fix the page details for the srq created by kernel consumers 2025-02-23 06:57:56 -05:00
input platform-drivers-x86 for v6.14-1 2025-01-24 07:18:39 -08:00
interconnect interconnect changes for 6.14 2025-01-16 14:01:40 +01:00
iommu iommu/vt-d: Fix suspicious RCU usage 2025-02-28 12:19:01 +01:00
ipack
irqchip irqchip/qcom-pdc: Workaround hardware register bug on X1E80100 2025-02-21 09:47:06 +01:00
isdn isdn: Remove unused get_Bprotocol4id() 2024-12-11 20:12:27 -08:00
leds Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
macintosh The various patchsets are summarized below. Plus of course many 2025-01-26 18:36:23 -08:00
mailbox mailbox: th1520: Fix memory corruption due to incorrect array size 2025-01-18 16:20:55 -06:00
mcb module: Convert symbol namespace to string literal 2024-12-02 11:34:44 -08:00
md - dm-vdo: add missing spin_lock_init 2025-02-24 16:29:48 -08:00
media media: cec: move driver for TDA9950 from drm/i2c 2025-02-13 00:17:42 +02:00
memory spi: Support DTR in spi-mem 2025-01-15 19:07:39 +01:00
memstick Char/Misc/IIO driver updates for 6.14-rc1 2025-01-27 16:51:51 -08:00
message Merge branch '6.13/scsi-fixes' into 6.14/scsi-staging 2025-01-10 15:20:30 -05:00
mfd mfd: syscon: Restore device_node_to_regmap() for non-syscon nodes 2025-02-11 14:53:39 +00:00
misc Revert "drivers/card_reader/rtsx_usb: Restore interrupt based detection" 2025-02-27 12:24:53 -08:00
mmc mmc: mtk-sd: Fix register settings for hs400(es) mode 2025-02-03 13:34:50 +01:00
most
mtd Fix writes on SST flashes 2025-02-19 14:38:47 +01:00
mux mux: constify mux class 2025-01-10 10:15:04 +01:00
net mctp i3c: handle NULL header address 2025-03-06 10:33:07 +01:00
nfc nfc: mrvl: Don't use "proxy" headers 2025-01-18 17:10:05 -08:00
ntb PCI: Remove devres from pci_intx() 2025-01-18 14:38:49 -06:00
nubus
nvdimm driver core: Constify API device_find_child() and adapt for various usages 2025-01-03 11:19:35 +01:00
nvme nvme-tcp: fix signedness bug in nvme_tcp_init_connection() 2025-03-05 10:37:01 -08:00
nvmem nvmem: core: improve range check for nvmem_cell_write() 2025-01-10 16:16:48 +01:00
of Revert "of: reserved-memory: Fix using wrong number of cells to get property 'alignment'" 2025-02-26 13:39:28 -06:00
opp Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
parisc Get rid of 'remove_new' relic from platform driver struct 2024-12-01 15:12:43 -08:00
parport serial: 8250_pci: Share WCH IDs with parport_serial driver 2024-12-04 16:42:55 +01:00
pci pci-v6.14-fixes-3 2025-02-14 16:49:07 -08:00
pcmcia Get rid of 'remove_new' relic from platform driver struct 2024-12-01 15:12:43 -08:00
peci module: Convert symbol namespace to string literal 2024-12-02 11:34:44 -08:00
perf treewide: const qualify ctl_tables where applicable 2025-01-28 13:48:37 +01:00
phy phy: tegra: xusb: reset VBUS & ID OVERRIDE 2025-02-14 18:03:05 +05:30
pinctrl pinctrl: pinconf-generic: Print unsigned value if a format is registered 2025-02-06 10:13:15 +01:00
platform ACPI fix for 6.14-rc6 2025-03-07 12:17:42 -10:00
pmdomain pmdomain: airoha: Fix compilation error with Clang-20 and Thumb2 mode 2025-01-21 10:45:24 +01:00
pnp
power power: supply: axp20x_battery: Fix fault handling for AXP717 2025-02-03 12:41:18 +01:00
powercap Merge branch 'pm-powercap' 2025-02-07 12:43:58 +01:00
pps pps: clients: gpio: Bypass edge's direction check when not needed 2025-01-10 16:12:33 +01:00
ps3
ptp ptp: vmclock: Remove goto-based cleanup logic 2025-02-11 10:20:52 +01:00
pwm Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
rapidio rapidio: add check for rio_add_net() in rio_scan_alloc_net() 2025-03-05 21:36:19 -08:00
ras x86/amd_nb: Move SMN access code to a new amd_node driver 2025-01-08 10:59:44 +01:00
regulator regulator: core: let dt properties override driver init_data 2025-02-11 16:29:01 +00:00
remoteproc remoteproc: st: Use syscon_regmap_lookup_by_phandle_args 2025-01-15 10:04:27 -07:00
reset soc: driver updates for 6.14 2025-01-24 14:56:59 -08:00
rpmsg driver core: Constify API device_find_child() and adapt for various usages 2025-01-03 11:19:35 +01:00
rtc RTC for 6.13 2025-01-30 17:50:02 -08:00
s390 Smaller than usual with no fixes from any subtree. 2025-02-20 10:19:54 -08:00
sbus Get rid of 'remove_new' relic from platform driver struct 2024-12-01 15:12:43 -08:00
scsi scsi: core: Clear driver private data when retrying request 2025-02-20 21:20:58 -05:00
sh sh updates for v6.13 2024-11-30 14:45:29 -08:00
siox
slimbus slimbus: messaging: Free transaction ID in delayed interrupt scenario 2025-02-20 15:19:51 +01:00
soc soc: loongson: loongson2_guts: Add check for devm_kstrdup() 2025-02-20 22:29:05 +01:00
soundwire soundwire updates for 6.14 2025-01-29 14:38:19 -08:00
spi spi: sn-f-ospi: Fix division by zero 2025-02-06 11:33:51 +00:00
spmi spmi: hisi-spmi-controller: Drop duplicated OF node assignment in spmi_controller_probe() 2025-01-17 12:58:49 +01:00
ssb
staging fbtft: Remove access to page->index 2025-03-05 08:38:21 +01:00
target Merge branch '6.14/scsi-queue' into 6.14/scsi-fixes 2025-02-03 16:28:51 -05:00
tc
tee tee: optee: Fix supplicant wait loop 2025-02-14 15:17:34 +01:00
thermal thermal: gov_power_allocator: Update total_weight on bind and cdev updates 2025-02-25 12:30:45 +01:00
thunderbolt Driver core and debugfs updates 2025-01-28 12:25:12 -08:00
tty Serial driver fixes for 6.14-rc3 2025-02-16 12:50:44 -08:00
ufs scsi: ufs: core: bsg: Fix crash when arpmb command fails 2025-02-20 22:18:24 -05:00
uio Char/Misc/IIO driver updates for 6.14-rc1 2025-01-27 16:51:51 -08:00
usb usb: typec: ucsi: Fix NULL pointer access 2025-03-06 16:55:46 +01:00
vdpa virtio: features, fixes, cleanups 2025-01-27 15:26:06 -08:00
vfio VFIO updates for v6.14-rc1 2025-01-28 14:16:46 -08:00
vhost vhost: return task creation error instead of NULL 2025-03-01 02:52:52 -05:00
video gpu: nova-core: add initial driver stub 2025-03-09 19:24:27 +01:00
virt Char/Misc/IIO driver fixes for 6.14-rc6 2025-03-09 09:07:54 -10:00
virtio virtio: features, fixes, cleanups 2025-01-27 15:26:06 -08:00
w1 1-Wire bus drivers for v6.14 2025-01-09 10:54:19 +01:00
watchdog linux-watchdog 6.14-rc1 tag 2025-01-25 16:19:10 -08:00
xen xen: branch for v6.14-rc3 2025-02-14 08:15:17 -08:00
zorro zorro: Constify 'struct bin_attribute' 2025-01-08 18:04:36 +01:00
Kconfig
Makefile