kernel_samsung_a53x/drivers/gpu/drm
Dave Airlie dc108b87d9 nouveau: fix instmem race condition around ptr stores
commit fff1386cc889d8fb4089d285f883f8cba62d82ce upstream.

When running a lot of VK CTS in parallel against nouveau, once every
few hours you might see a crash like this:

BUG: kernel NULL pointer dereference, address: 0000000000000008
PGD 8000000114e6e067 P4D 8000000114e6e067 PUD 109046067 PMD 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27
Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021
RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau]
Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1
RSP: 0000:ffffac20c5857838 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 00000000004d8001 RCX: 0000000000000001
RDX: 00000000004d8001 RSI: 00000000000006d8 RDI: ffffa07afe332180
RBP: 00000000000006d8 R08: ffffac20c5857ad0 R09: 0000000000ffff10
R10: 0000000000000001 R11: ffffa07af27e2de0 R12: 000000000000001c
R13: ffffac20c5857ad0 R14: ffffa07a96fe9040 R15: 000000000000001c
FS:  00007fe395eed7c0(0000) GS:ffffa07e2c980000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000011febe001 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:

...

 ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau]
 ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau]
 nvkm_vmm_iter+0x351/0xa20 [nouveau]
 ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau]
 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
 ? __lock_acquire+0x3ed/0x2170
 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
 nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau]
 ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau]
 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
 nvkm_vmm_map_locked+0x224/0x3a0 [nouveau]

Adding any sort of useful debug usually makes it go away, so I hand
wrote the function in a line, and debugged the asm.

Every so often pt->memory->ptrs is NULL. This ptrs pointer is set in
nv50_instobj_acquire, which is called from nvkm_kmap.

If Thread A and Thread B both get to nv50_instobj_acquire around
the same time, and Thread A hits the refcount_set line, and in
lockstep Thread B succeeds at refcount_inc_not_zero, there is a
chance the ptrs value won't have been stored yet, since refcount_set
is unordered. Force a memory barrier here. I initially picked smp_mb,
since we want it on all CPUs and it's a write followed by a read.

v2: use paired smp_rmb/smp_wmb.
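
The pairing described above can be sketched in user-space C11 as
follows. This is only an illustration under stated assumptions, not
the driver code: struct obj, obj_init, obj_acquire, inc_not_zero and
backing are hypothetical names, the spinlock stands in for the
driver's mutex, and the release/acquire fences play roughly the role
of smp_wmb()/smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the instobj: a published pointer plus a use count. */
struct obj {
	void *map;          /* plays the role of the ptrs/map pointer */
	atomic_uint maps;   /* plays the role of the refcount */
	atomic_flag lock;   /* stands in for the driver's mutex */
};

static int backing;     /* dummy target for the published pointer */

void obj_init(struct obj *o)
{
	o->map = NULL;
	atomic_init(&o->maps, 0);
	atomic_flag_clear(&o->lock);
}

/* Increment only if already non-zero, similar in spirit to refcount_inc_not_zero(). */
static bool inc_not_zero(atomic_uint *r)
{
	unsigned int old = atomic_load_explicit(r, memory_order_relaxed);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak_explicit(r, &old, old + 1,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return true;
	}
	return false;
}

void *obj_acquire(struct obj *o)
{
	void *map;

	assert(o != NULL);

	/* Fast path: the count is already non-zero. */
	if (inc_not_zero(&o->maps)) {
		/* Read side: pairs with the release fence below. */
		atomic_thread_fence(memory_order_acquire);
		return o->map;
	}

	/* Slow path: take the lock and re-check. */
	while (atomic_flag_test_and_set_explicit(&o->lock, memory_order_acquire))
		;
	if (!inc_not_zero(&o->maps)) {
		o->map = &backing;
		/* Write side: the pointer store must be visible before the count. */
		atomic_thread_fence(memory_order_release);
		atomic_store_explicit(&o->maps, 1, memory_order_relaxed);
	}
	map = o->map;
	atomic_flag_clear_explicit(&o->lock, memory_order_release);
	return map;
}
```

Without the two fences, a thread on the fast path could observe the
non-zero count and still read a stale (NULL) pointer, which is the
failure mode described above.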

Cc: <stable@vger.kernel.org>
Fixes: be55287aa5ba ("drm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobj")
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240411011510.2546857-1-airlied@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-19 11:32:23 +01:00
amd drm/amdgpu: validate the parameters of bo mapping operations more clearly 2024-11-19 11:32:23 +01:00
arc Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
arm drm/komeda: drop all currently held locks if deadlock happens 2024-11-18 11:43:13 +01:00
armada Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
aspeed Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
ast Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
atmel-hlcdc Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
bochs Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
bridge drm/bridge: nxp-ptn3460: simplify some error checking 2024-11-18 12:13:03 +01:00
etnaviv drm/etnaviv: Restore some id values 2024-11-19 09:22:33 +01:00
exynos drm/exynos: do not return negative values from .get_modes() 2024-11-19 09:22:36 +01:00
fsl-dcu Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
gma500 Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
hisilicon Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
i2c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
i810 Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
i915 drm/i915/gt: Reset queue_priority_hint on parking 2024-11-19 09:23:15 +01:00
imx drm/imx/ipuv3: do not return negative values from .get_modes() 2024-11-19 09:22:36 +01:00
ingenic Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
lib Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
lima drm/lima: fix a memleak in lima_heap_alloc 2024-11-19 08:44:51 +01:00
mcde Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
mediatek drm/mediatek: Fix a null pointer crash in mtk_drm_crtc_finish_page_flip 2024-11-19 08:44:55 +01:00
meson Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
mga Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
mgag200 Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
msm drm/msm/dpu: add division of drm_display_mode's hskew parameter 2024-11-19 08:44:55 +01:00
mxsfb Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
nouveau nouveau: fix instmem race condition around ptr stores 2024-11-19 11:32:23 +01:00
omapdrm Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
panel drm/panel: visionox-rm69299: don't unregister DSI device 2024-11-19 11:32:21 +01:00
panfrost Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
pl111 Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
qxl drm/qxl: fix UAF on handle creation 2024-11-18 12:12:11 +01:00
r128 Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
radeon drm/radeon/ni: Fix wrong firmware size logging in ni_init_microcode() 2024-11-19 08:44:53 +01:00
rcar-du Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
rockchip drm/rockchip: lvds: do not print scary message when probing defer 2024-11-19 08:44:51 +01:00
samsung exynos_gpu: Don't allow userspace to control freqs 2024-06-15 16:28:49 -03:00
savage Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
scheduler Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
selftests Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
shmobile Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
sis Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
sti Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
stm Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
sun4i Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
tdfx Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
tegra drm/tegra: put drm_gem_object ref on error in tegra_fb_create 2024-11-19 08:44:54 +01:00
tidss drm/tidss: Fix initial plane zpos values 2024-11-19 08:44:54 +01:00
tilcdc Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
tiny Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
ttm drm/vmwgfx: Fix some static checker warnings 2024-11-19 09:22:15 +01:00
tve200 Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
udl Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
v3d Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
vboxvideo Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
vc4 drm/vc4: hdmi: do not return negative values from .get_modes() 2024-11-19 09:22:36 +01:00
vgem Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
via Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
virtio Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
vkms Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
vmwgfx drm/vmwgfx: Fix possible null pointer derefence with invalid contexts 2024-11-19 09:22:15 +01:00
xen Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
xlnx Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
zte Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_agpsupport.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_atomic.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_atomic_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_atomic_state_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_atomic_uapi.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_auth.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_blend.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_bridge.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_bridge_connector.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_bufs.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_cache.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_client.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_client_modeset.c drm/client: Fully protect modes[] with dev->mode_config.mutex 2024-11-19 11:32:20 +01:00
drm_color_mgmt.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_connector.c drm/connector: Add support for out-of-band hotplug notification (v3) 2024-11-08 11:26:15 +01:00
drm_context.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_crtc.c drm/crtc: fix uninitialized variable use 2024-11-18 12:12:18 +01:00
drm_crtc_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_crtc_helper_internal.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_crtc_internal.h drm/connector: Add drm_connector_find_by_fwnode() function (v3) 2024-11-08 11:26:15 +01:00
drm_damage_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_debugfs.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_debugfs_crc.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_dma.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_dp_aux_dev.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_dp_cec.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_dp_dual_mode_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_dp_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_dp_mst_topology.c drm/dp_mst: Fix NULL deref in get_mst_branch_device_by_guid_helper() 2024-11-18 10:58:29 +01:00
drm_dp_mst_topology_internal.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_drv.c drm/drv: propagate errors from drm_modeset_register_all() 2024-11-18 12:12:40 +01:00
drm_dsc.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_dumb_buffers.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_edid.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_edid_load.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_encoder.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_encoder_slave.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_fb_cma_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_fb_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_file.c drm/drm_file: fix use of uninitialized variable 2024-11-18 12:13:17 +01:00
drm_flip_work.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_format_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_fourcc.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_framebuffer.c drm/framebuffer: Fix use of uninitialized variable 2024-11-18 12:13:18 +01:00
drm_gem.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_gem_cma_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_gem_framebuffer_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_gem_shmem_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_gem_ttm_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_gem_vram_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_hashtab.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_hdcp.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_internal.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_ioc32.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_ioctl.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_irq.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_kms_helper_common.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_lease.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_legacy.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_legacy_misc.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_lock.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_managed.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_memory.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_mipi_dbi.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_mipi_dsi.c drm/mipi-dsi: Fix detach call without attach 2024-11-18 12:13:18 +01:00
drm_mm.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_mode_config.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_mode_object.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_modes.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_modeset_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_modeset_lock.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_of.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_panel.c drm/panel: do not return negative error codes from drm_panel_get_modes() 2024-11-19 09:22:36 +01:00
drm_panel_orientation_quirks.c drm: panel-orientation-quirks: Add quirk for One Mix 2S 2024-11-08 11:26:17 +01:00
drm_pci.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_plane.c drm: Don't unref the same fb many times by mistake due to deadlock handling 2024-11-18 12:13:03 +01:00
drm_plane_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_prime.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_print.c drm: Stub out debug prints 2024-11-17 17:45:26 +01:00
drm_probe_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_property.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_rect.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_scatter.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_scdc_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_self_refresh_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_simple_kms_helper.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_syncobj.c drm/syncobj: call drm_syncobj_fence_add_wait when WAIT_AVAILABLE flag is set 2024-11-18 22:25:42 +01:00
drm_sysfs.c drm/connector: Add a fwnode pointer to drm_connector and register with ACPI (v2) 2024-11-08 11:26:14 +01:00
drm_trace.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_trace_points.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_vblank.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_vblank_work.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_vm.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_vma_manager.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drm_writeback.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Kconfig Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Makefile Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00