0ecec1bda2
commit 996c3412a06578e9d779a16b9e79ace18125ab50 upstream.

CI has been sporadically reporting the following issue triggered by igt@i915_selftest@live@hangcheck on ADL-P and similar machines:

<6> [414.049203] i915: Running intel_hangcheck_live_selftests/igt_reset_evict_fence
...
<6> [414.068804] i915 0000:00:02.0: [drm] GT0: GUC: submission enabled
<6> [414.068812] i915 0000:00:02.0: [drm] GT0: GUC: SLPC enabled
<3> [414.070354] Unable to pin Y-tiled fence; err:-4
<3> [414.071282] i915_vma_revoke_fence:301 GEM_BUG_ON(!i915_active_is_idle(&fence->active))
...
<4>[ 609.603992] ------------[ cut here ]------------
<2>[ 609.603995] kernel BUG at drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c:301!
<4>[ 609.604003] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4>[ 609.604006] CPU: 0 PID: 268 Comm: kworker/u64:3 Tainted: G U W 6.9.0-CI_DRM_14785-g1ba62f8cea9c+ #1
<4>[ 609.604008] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
<4>[ 609.604010] Workqueue: i915 __i915_gem_free_work [i915]
<4>[ 609.604149] RIP: 0010:i915_vma_revoke_fence+0x187/0x1f0 [i915]
...
<4>[ 609.604271] Call Trace:
<4>[ 609.604273] <TASK>
...
<4>[ 609.604716] __i915_vma_evict+0x2e9/0x550 [i915]
<4>[ 609.604852] __i915_vma_unbind+0x7c/0x160 [i915]
<4>[ 609.604977] force_unbind+0x24/0xa0 [i915]
<4>[ 609.605098] i915_vma_destroy+0x2f/0xa0 [i915]
<4>[ 609.605210] __i915_gem_object_pages_fini+0x51/0x2f0 [i915]
<4>[ 609.605330] __i915_gem_free_objects.isra.0+0x6a/0xc0 [i915]
<4>[ 609.605440] process_scheduled_works+0x351/0x690
...

In the past, there were similar failures reported by CI from other IGT tests, observed on other platforms.

Before commit 63baf4f3d587 ("drm/i915/gt: Only wait for GPU activity before unbinding a GGTT fence"), i915_vma_revoke_fence() was waiting for idleness of vma->active via fence_update(). That commit introduced vma->fence->active in order for the fence_update() to be able to wait selectively on that one instead of vma->active since only idleness of fence registers was needed. But then, another commit 0d86ee35097a ("drm/i915/gt: Make fence revocation unequivocal") replaced the call to fence_update() in i915_vma_revoke_fence() with only fence_write(), and also added that GEM_BUG_ON(!i915_active_is_idle(&fence->active)) in front. No justification was provided on why we might then expect idleness of vma->fence->active without first waiting on it.

The issue can be potentially caused by a race among revocation of fence registers on one side and sequential execution of signal callbacks invoked on completion of a request that was using them on the other, still processed in parallel to revocation of those fence registers. Fix it by waiting for idleness of vma->fence->active in i915_vma_revoke_fence().

Fixes: 0d86ee35097a ("drm/i915/gt: Make fence revocation unequivocal")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10021
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: stable@vger.kernel.org # v5.8+
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240603195446.297690-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit 24bb052d3dd499c5956abad5f7d8e4fd07da7fb1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
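The ordering described in the commit message (wait for the fence's activity tracker to go idle, and only then assert idleness and revoke) can be illustrated with a small userspace toy model. This is a sketch under stated assumptions, not the kernel patch: every `toy_*` name is hypothetical, the "pending" counter merely stands in for completion callbacks that have not run yet, and the wait is modelled single-threaded by draining that counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an activity tracker; NOT struct i915_active. */
struct toy_active {
	int pending; /* completion callbacks that have not run yet */
};

/* Hypothetical stand-in for a fence register; NOT struct i915_fence_reg. */
struct toy_fence_reg {
	struct toy_active active;
	bool programmed; /* true while the register still holds state */
};

static bool toy_active_is_idle(const struct toy_active *a)
{
	return a->pending == 0;
}

/* Model "wait for idle" by letting every outstanding callback finish. */
static void toy_active_wait(struct toy_active *a)
{
	while (a->pending > 0)
		a->pending--;
}

/*
 * Revoke in the order the commit message argues for: wait first, then
 * rely on idleness. Without the wait, the assertion below could fire
 * whenever callbacks are still outstanding at revocation time.
 */
static void toy_revoke_fence(struct toy_fence_reg *fence)
{
	toy_active_wait(&fence->active);
	assert(toy_active_is_idle(&fence->active));
	fence->programmed = false;
}
```

Calling `toy_revoke_fence()` on a register with outstanding callbacks succeeds because the wait drains them first. Asserting idleness without waiting is the pattern the commit message identifies as unjustified.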
||
---|---|---|
display | ||
gem | ||
gt | ||
gvt | ||
selftests | ||
i915_active.c | ||
i915_active.h | ||
i915_active_types.h | ||
i915_buddy.c | ||
i915_buddy.h | ||
i915_cmd_parser.c | ||
i915_config.c | ||
i915_debugfs.c | ||
i915_debugfs.h | ||
i915_debugfs_params.c | ||
i915_debugfs_params.h | ||
i915_drv.c | ||
i915_drv.h | ||
i915_fixed.h | ||
i915_gem.c | ||
i915_gem.h | ||
i915_gem_evict.c | ||
i915_gem_gtt.c | ||
i915_gem_gtt.h | ||
i915_getparam.c | ||
i915_globals.c | ||
i915_globals.h | ||
i915_gpu_error.c | ||
i915_gpu_error.h | ||
i915_ioc32.c | ||
i915_ioc32.h | ||
i915_irq.c | ||
i915_irq.h | ||
i915_memcpy.c | ||
i915_memcpy.h | ||
i915_mitigations.c | ||
i915_mitigations.h | ||
i915_mm.c | ||
i915_params.c | ||
i915_params.h | ||
i915_pci.c | ||
i915_perf.c | ||
i915_perf.h | ||
i915_perf_types.h | ||
i915_pmu.c | ||
i915_pmu.h | ||
i915_priolist_types.h | ||
i915_pvinfo.h | ||
i915_query.c | ||
i915_query.h | ||
i915_reg.h | ||
i915_request.c | ||
i915_request.h | ||
i915_scatterlist.c | ||
i915_scatterlist.h | ||
i915_scheduler.c | ||
i915_scheduler.h | ||
i915_scheduler_types.h | ||
i915_selftest.h | ||
i915_suspend.c | ||
i915_suspend.h | ||
i915_sw_fence.c | ||
i915_sw_fence.h | ||
i915_sw_fence_work.c | ||
i915_sw_fence_work.h | ||
i915_switcheroo.c | ||
i915_switcheroo.h | ||
i915_syncmap.c | ||
i915_syncmap.h | ||
i915_sysfs.c | ||
i915_sysfs.h | ||
i915_trace.h | ||
i915_trace_points.c | ||
i915_user_extensions.c | ||
i915_user_extensions.h | ||
i915_utils.c | ||
i915_utils.h | ||
i915_vgpu.c | ||
i915_vgpu.h | ||
i915_vma.c | ||
i915_vma.h | ||
i915_vma_types.h | ||
intel_device_info.c | ||
intel_device_info.h | ||
intel_dram.c | ||
intel_dram.h | ||
intel_gvt.c | ||
intel_gvt.h | ||
intel_memory_region.c | ||
intel_memory_region.h | ||
intel_pch.c | ||
intel_pch.h | ||
intel_pm.c | ||
intel_pm.h | ||
intel_region_lmem.c | ||
intel_region_lmem.h | ||
intel_runtime_pm.c | ||
intel_runtime_pm.h | ||
intel_sideband.c | ||
intel_sideband.h | ||
intel_uncore.c | ||
intel_uncore.h | ||
intel_wakeref.c | ||
intel_wakeref.h | ||
intel_wopcm.c | ||
intel_wopcm.h | ||
Kconfig | ||
Kconfig.debug | ||
Kconfig.profile | ||
Kconfig.unstable | ||
Makefile | ||
vlv_suspend.c | ||
vlv_suspend.h |