kernel_samsung_a53x/drivers/gpu/drm/i915/gt
Janusz Krzysztofik 0ecec1bda2 drm/i915/gt: Fix potential UAF by revoke of fence registers
commit 996c3412a06578e9d779a16b9e79ace18125ab50 upstream.

CI has been sporadically reporting the following issue triggered by
igt@i915_selftest@live@hangcheck on ADL-P and similar machines:

<6> [414.049203] i915: Running intel_hangcheck_live_selftests/igt_reset_evict_fence
...
<6> [414.068804] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
<6> [414.068812] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
<3> [414.070354] Unable to pin Y-tiled fence; err:-4
<3> [414.071282] i915_vma_revoke_fence:301 GEM_BUG_ON(!i915_active_is_idle(&fence->active))
...
<4>[  609.603992] ------------[ cut here ]------------
<2>[  609.603995] kernel BUG at drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c:301!
<4>[  609.604003] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4>[  609.604006] CPU: 0 PID: 268 Comm: kworker/u64:3 Tainted: G     U  W          6.9.0-CI_DRM_14785-g1ba62f8cea9c+ #1
<4>[  609.604008] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
<4>[  609.604010] Workqueue: i915 __i915_gem_free_work [i915]
<4>[  609.604149] RIP: 0010:i915_vma_revoke_fence+0x187/0x1f0 [i915]
...
<4>[  609.604271] Call Trace:
<4>[  609.604273]  <TASK>
...
<4>[  609.604716]  __i915_vma_evict+0x2e9/0x550 [i915]
<4>[  609.604852]  __i915_vma_unbind+0x7c/0x160 [i915]
<4>[  609.604977]  force_unbind+0x24/0xa0 [i915]
<4>[  609.605098]  i915_vma_destroy+0x2f/0xa0 [i915]
<4>[  609.605210]  __i915_gem_object_pages_fini+0x51/0x2f0 [i915]
<4>[  609.605330]  __i915_gem_free_objects.isra.0+0x6a/0xc0 [i915]
<4>[  609.605440]  process_scheduled_works+0x351/0x690
...

In the past, there were similar failures reported by CI from other IGT
tests, observed on other platforms.

Before commit 63baf4f3d587 ("drm/i915/gt: Only wait for GPU activity
before unbinding a GGTT fence"), i915_vma_revoke_fence() waited for
vma->active to become idle via fence_update().  That commit introduced
vma->fence->active so that fence_update() could wait selectively on
that tracker instead of vma->active, since only idleness of the fence
registers was needed.  Later, commit 0d86ee35097a ("drm/i915/gt: Make
fence revocation unequivocal") replaced the call to fence_update() in
i915_vma_revoke_fence() with a bare fence_write(), and also added the
GEM_BUG_ON(!i915_active_is_idle(&fence->active)) in front of it.  No
justification was provided for why vma->fence->active could be expected
to be idle without first waiting on it.

The issue is potentially caused by a race between revocation of fence
registers on one side and, on the other, sequential execution of signal
callbacks invoked on completion of a request that was using them, which
may still be in progress while the fence registers are being revoked.
Fix it by waiting for idleness of vma->fence->active in
i915_vma_revoke_fence().
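
The ordering problem described above can be illustrated with a small
userspace model.  This is only a sketch under stated assumptions, not
the driver code: every toy_* name is hypothetical, the callback queue is
a single-threaded stand-in for signal callbacks that have not run yet,
and a false return marks the point where the GEM_BUG_ON quoted in the
log would have fired.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CALLBACKS 8

/* Hypothetical stand-in for the activity tracker of a fence register. */
struct toy_active {
	int count;		/* outstanding users; 0 means idle */
	void (*pending[TOY_MAX_CALLBACKS])(struct toy_active *);
	size_t head, tail;	/* queue of signal callbacks not yet run */
};

/* Hypothetical stand-in for a fence register. */
struct toy_fence {
	struct toy_active active;
	bool programmed;	/* register currently maps a vma */
};

/* Signal callback: drops one reference after the request completed. */
static void toy_retire(struct toy_active *a)
{
	assert(a->count > 0);
	a->count--;
}

/* A request starts using the fence; its completion callback is queued. */
static bool toy_active_acquire(struct toy_active *a)
{
	if (a->tail == TOY_MAX_CALLBACKS)
		return false;
	a->count++;
	a->pending[a->tail++] = toy_retire;
	return true;
}

static bool toy_active_is_idle(const struct toy_active *a)
{
	return a->count == 0;
}

/* Run queued callbacks in order until the tracker reports idle. */
static void toy_active_wait(struct toy_active *a)
{
	while (!toy_active_is_idle(a) && a->head < a->tail)
		a->pending[a->head++](a);
}

/*
 * Revoke the fence.  Returns false if the tracker was still busy at
 * revocation time, which is where the model's idleness check fails.
 */
static bool toy_revoke_fence(struct toy_fence *f, bool wait_first)
{
	if (wait_first)
		toy_active_wait(&f->active);
	if (!toy_active_is_idle(&f->active))
		return false;
	f->programmed = false;	/* models clearing the register */
	return true;
}
```

In this model, calling toy_revoke_fence(&f, false) while a callback is
still queued fails the idleness check and leaves the register
programmed, whereas toy_revoke_fence(&f, true) drains the queue first
and then revokes.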

Fixes: 0d86ee35097a ("drm/i915/gt: Make fence revocation unequivocal")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10021
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.8+
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240603195446.297690-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit 24bb052d3dd499c5956abad5f7d8e4fd07da7fb1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-19 14:19:34 +01:00
selftests
shaders
uc
debugfs_engines.c
debugfs_engines.h
debugfs_gt.c
debugfs_gt.h
debugfs_gt_pm.c
debugfs_gt_pm.h
gen2_engine_cs.c
gen2_engine_cs.h
gen6_engine_cs.c
gen6_engine_cs.h
gen6_ppgtt.c
gen6_ppgtt.h
gen6_renderstate.c
gen7_renderclear.c
gen7_renderclear.h
gen7_renderstate.c
gen8_ppgtt.c
gen8_ppgtt.h
gen8_renderstate.c
gen9_renderstate.c
hsw_clear_kernel.c
intel_breadcrumbs.c
intel_breadcrumbs.h
intel_breadcrumbs_types.h
intel_context.c
intel_context.h
intel_context_param.c
intel_context_param.h
intel_context_sseu.c
intel_context_types.h
intel_engine.h
intel_engine_cs.c
intel_engine_heartbeat.c
intel_engine_heartbeat.h
intel_engine_pm.c
intel_engine_pm.h
intel_engine_types.h
intel_engine_user.c
intel_engine_user.h
intel_ggtt.c
intel_ggtt_fencing.c
intel_ggtt_fencing.h
intel_gpu_commands.h
intel_gt.c
intel_gt.h
intel_gt_buffer_pool.c
intel_gt_buffer_pool.h
intel_gt_buffer_pool_types.h
intel_gt_clock_utils.c
intel_gt_clock_utils.h
intel_gt_irq.c
intel_gt_irq.h
intel_gt_pm.c
intel_gt_pm.h
intel_gt_pm_irq.c
intel_gt_pm_irq.h
intel_gt_requests.c
intel_gt_requests.h
intel_gt_types.h
intel_gtt.c
intel_gtt.h
intel_llc.c
intel_llc.h
intel_llc_types.h
intel_lrc.c
intel_lrc.h
intel_lrc_reg.h
intel_mocs.c
intel_mocs.h
intel_ppgtt.c
intel_rc6.c
intel_rc6.h
intel_rc6_types.h
intel_renderstate.c
intel_renderstate.h
intel_reset.c
intel_reset.h
intel_reset_types.h
intel_ring.c
intel_ring.h
intel_ring_submission.c
intel_ring_types.h
intel_rps.c
intel_rps.h
intel_rps_types.h
intel_sseu.c
intel_sseu.h
intel_sseu_debugfs.c
intel_sseu_debugfs.h
intel_timeline.c
intel_timeline.h
intel_timeline_types.h
intel_workarounds.c
intel_workarounds.h
intel_workarounds_types.h
ivb_clear_kernel.c
mock_engine.c
mock_engine.h
selftest_context.c
selftest_engine.c
selftest_engine.h
selftest_engine_cs.c
selftest_engine_heartbeat.c
selftest_engine_heartbeat.h
selftest_engine_pm.c
selftest_gt_pm.c
selftest_hangcheck.c
selftest_llc.c
selftest_llc.h
selftest_lrc.c
selftest_mocs.c
selftest_rc6.c
selftest_rc6.h
selftest_reset.c
selftest_ring.c
selftest_ring_submission.c
selftest_rps.c
selftest_rps.h
selftest_timeline.c
selftest_workarounds.c
shmem_utils.c
shmem_utils.h
st_shmem_utils.c
sysfs_engines.c
sysfs_engines.h