Commit graph

16 commits

Author SHA1 Message Date
Jinjie Ruan
f900ab8c25 drm/msm/adreno: Use IRQF_NO_AUTOEN flag in request_irq()
[ Upstream commit 394679f322649d06fea3c646ba65f5a0887f52c3 ]

disable_irq() after request_irq() still has a time gap in which
interrupts can come. request_irq() with IRQF_NO_AUTOEN flag will
disable IRQ auto-enable when request IRQ.

Fixes: 4b565ca5a2cb ("drm/msm: Add A6XX device support")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Patchwork: https://patchwork.freedesktop.org/patch/614075/
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-17 13:24:05 +01:00
Ksawlii
37e508a03b Revert "drm/msm/adreno: Fix error return if missing firmware-name"
This reverts commit 114e57cc55.
2024-11-24 00:23:35 +01:00
Ksawlii
75d54cbe54 Revert "drm/msm: Fix incorrect file name output in adreno_request_fw()"
This reverts commit 4135fc06a3.
2024-11-24 00:23:25 +01:00
Ksawlii
8970e43037 Revert "drm/msm/a5xx: disable preemption in submits by default"
This reverts commit a5394161bf.
2024-11-24 00:23:25 +01:00
Ksawlii
f3d2c0c9ba Revert "drm/msm/a5xx: properly clear preemption records on resume"
This reverts commit fcb7824289.
2024-11-24 00:23:25 +01:00
Ksawlii
f007606619 Revert "drm/msm/a5xx: fix races in preemption evaluation stage"
This reverts commit 22746ff9a0.
2024-11-24 00:23:25 +01:00
Ksawlii
c18d247bc2 Revert "drm/msm: Drop priv->lastctx"
This reverts commit 77084c27bb.
2024-11-24 00:23:25 +01:00
Ksawlii
5aff368457 Revert "drm/msm/a5xx: workaround early ring-buffer emptiness check"
This reverts commit 2a3b8254f9.
2024-11-24 00:23:24 +01:00
Vladimir Lypak
2a3b8254f9 drm/msm/a5xx: workaround early ring-buffer emptiness check
[ Upstream commit a30f9f65b5ac82d4390548c32ed9c7f05de7ddf5 ]

There is another cause for soft lock-up of GPU in empty ring-buffer:
race between GPU executing last commands and CPU checking ring for
emptiness. On GPU side IRQ for retire is triggered by CACHE_FLUSH_TS
event and RPTR shadow (which is used to check ring emptiness) is updated
a bit later from CP_CONTEXT_SWITCH_YIELD. Thus if GPU is executing its
last commands slow enough or we check that ring too fast we will miss a
chance to trigger switch to lower priority ring because current ring isn't
empty just yet. This can escalate to lock-up situation described in
previous patch.
To work-around this issue we keep track of last submit sequence number
for each ring and compare it with one written to memptrs from GPU during
execution of CACHE_FLUSH_TS event.

Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/612047/
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-23 23:21:22 +01:00
Rob Clark
77084c27bb drm/msm: Drop priv->lastctx
[ Upstream commit 1d054c9b8457b56a651109fac21f56f46ccd46b2 ]

cur_ctx_seqno already does the same thing, but handles the edge cases
where a refcnt'd context can live after lastclose.  So let's not have
two ways to do the same thing.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org>
Link: https://lore.kernel.org/r/20211109181117.591148-3-robdclark@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Stable-dep-of: a30f9f65b5ac ("drm/msm/a5xx: workaround early ring-buffer emptiness check")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-23 23:21:22 +01:00
Vladimir Lypak
22746ff9a0 drm/msm/a5xx: fix races in preemption evaluation stage
[ Upstream commit ce050f307ad93bcc5958d0dd35fc276fd394d274 ]

On A5XX GPUs when preemption is used it's invietable to enter a soft
lock-up state in which GPU is stuck at empty ring-buffer doing nothing.
This appears as full UI lockup and not detected as GPU hang (because
it's not). This happens due to not triggering preemption when it was
needed. Sometimes this state can be recovered by some new submit but
generally it won't happen because applications are waiting for old
submits to retire.

One of the reasons why this happens is a race between a5xx_submit and
a5xx_preempt_trigger called from IRQ during submit retire. Former thread
updates ring->cur of previously empty and not current ring right after
latter checks it for emptiness. Then both threads can just exit because
for first one preempt_state wasn't NONE yet and for second one all rings
appeared to be empty.

To prevent such situations from happening we need to establish guarantee
for preempt_trigger to make decision after each submit or retire. To
implement this we serialize preemption initiation using spinlock. If
switch is already in progress we need to re-trigger preemption when it
finishes.

Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/612045/
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-23 23:21:22 +01:00
Vladimir Lypak
fcb7824289 drm/msm/a5xx: properly clear preemption records on resume
[ Upstream commit 64fd6d01a52904bdbda0ce810a45a428c995a4ca ]

Two fields of preempt_record which are used by CP aren't reset on
resume: "data" and "info". This is the reason behind faults which happen
when we try to switch to the ring that was active last before suspend.
In addition those faults can't be recovered from because we use suspend
and resume to do so (keeping values of those fields again).

Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets")
Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/612043/
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-23 23:21:22 +01:00
Vladimir Lypak
a5394161bf drm/msm/a5xx: disable preemption in submits by default
[ Upstream commit db9dec2db76146d65e1cfbb6afb2e2bd5dab67f8 ]

Fine grain preemption (switching from/to points within submits)
requires extra handling in command stream of those submits, especially
when rendering with tiling (using GMEM). However this handling is
missing at this point in mesa (and always was). For this reason we get
random GPU faults and hangs if more than one priority level is used
because local preemption is enabled prior to executing command stream
from submit.
With that said it was ahead of time to enable local preemption by
default considering the fact that even on downstream kernel it is only
enabled if requested via UAPI.

Fixes: a7a4c19c36de ("drm/msm/a5xx: fix setting of the CP_PREEMPT_ENABLE_LOCAL register")
Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/612041/
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-23 23:21:21 +01:00
Aleksandr Mishin
4135fc06a3 drm/msm: Fix incorrect file name output in adreno_request_fw()
[ Upstream commit e19366911340c2313a1abbb09c54eaf9bdea4f58 ]

In adreno_request_fw() when debugging information is printed to the log
after firmware load, an incorrect filename is printed. 'newname' is used
instead of 'fwname', so prefix "qcom/" is being added to filename.
Looks like "copy-paste" mistake.

Fix this mistake by replacing 'newname' with 'fwname'.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 2c41ef1b6f7d ("drm/msm/adreno: deal with linux-firmware fw paths")
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/602382/
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-23 23:21:21 +01:00
Rob Clark
114e57cc55 drm/msm/adreno: Fix error return if missing firmware-name
[ Upstream commit 624ab9cde26a9f150b4fd268b0f3dae3184dc40c ]

-ENODEV is used to signify that there is no zap shader for the platform,
and the CPU can directly take the GPU out of secure mode.  We want to
use this return code when there is no zap-shader node.  But not when
there is, but without a firmware-name property.  This case we want to
treat as-if the needed fw is not found.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/604564/
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-23 23:21:11 +01:00
Gabriel2392
7ed7ee9edf Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00