kernel_samsung_a53x/block
Omar Sandoval 6df069c465 blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race
commit e972b08b91ef48488bae9789f03cfedb148667fb upstream.

We're seeing crashes from rq_qos_wake_function that look like this:

  BUG: unable to handle page fault for address: ffffafe180a40084
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 100000067 P4D 100000067 PUD 10027c067 PMD 10115d067 PTE 0
  Oops: Oops: 0002 [#1] PREEMPT SMP PTI
  CPU: 17 UID: 0 PID: 0 Comm: swapper/17 Not tainted 6.12.0-rc3-00013-geca631b8fe80 #11
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
  RIP: 0010:_raw_spin_lock_irqsave+0x1d/0x40
  Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 62 97 30 4c 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 2c 0b 00
  RSP: 0018:ffffafe180580ca0 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffffafe180a3f7a8 RCX: 0000000000000011
  RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffafe180a40084
  RBP: 0000000000000000 R08: 00000000001e7240 R09: 0000000000000011
  R10: 0000000000000028 R11: 0000000000000888 R12: 0000000000000002
  R13: ffffafe180a40084 R14: 0000000000000000 R15: 0000000000000003
  FS:  0000000000000000(0000) GS:ffff9aaf1f280000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffffafe180a40084 CR3: 000000010e428002 CR4: 0000000000770ef0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   <IRQ>
   try_to_wake_up+0x5a/0x6a0
   rq_qos_wake_function+0x71/0x80
   __wake_up_common+0x75/0xa0
   __wake_up+0x36/0x60
   scale_up.part.0+0x50/0x110
   wb_timer_fn+0x227/0x450
   ...

So rq_qos_wake_function() calls wake_up_process(data->task), which calls
try_to_wake_up(), which faults in raw_spin_lock_irqsave(&p->pi_lock).

p comes from data->task, and data comes from the waitqueue entry, which
is stored on the waiter's stack in rq_qos_wait(). Analyzing the core
dump with drgn, I found that the waiter had already woken up and moved
on to a completely unrelated code path, clobbering what was previously
data->task. Meanwhile, the waker was passing the clobbered garbage in
data->task to wake_up_process(), leading to the crash.

What's happening is that in between rq_qos_wake_function() deleting the
waitqueue entry and calling wake_up_process(), rq_qos_wait() is finding
that it already got a token and returning. The race looks like this:

rq_qos_wait()                           rq_qos_wake_function()
==============================================================
prepare_to_wait_exclusive()
                                        data->got_token = true;
                                        list_del_init(&curr->entry);
if (data.got_token)
        break;
finish_wait(&rqw->wait, &data.wq);
  ^- returns immediately because
     list_empty_careful(&wq_entry->entry)
     is true
... return, go do something else ...
                                        wake_up_process(data->task)
                                          (NO LONGER VALID!)-^

Normally, finish_wait() is supposed to synchronize against the waker.
But, as noted above, it is returning immediately because the waitqueue
entry has already been removed from the waitqueue.

The bug is that rq_qos_wake_function() is accessing the waitqueue entry
AFTER deleting it. Note that autoremove_wake_function() wakes the waiter
and THEN deletes the waitqueue entry, which is the proper order.

Fix it by swapping the order. We also need to use
list_del_init_careful() to match the list_empty_careful() in
finish_wait().
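
The ordering argument above can be checked with a small user-space model. This is only a sketch under stated assumptions, not the kernel implementation: every name in it (model_wait_data, model_waiter_step, model_waker, model_run) is hypothetical, and the waiter is modeled as a step that may run between any two waker steps. The one property taken from the description above is that the waiter can leave rq_qos_wait() without synchronizing only once it has a token and its entry is off the waitqueue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_wait_data {
	bool got_token;   /* mirrors data->got_token */
	bool on_queue;    /* true while the entry is linked on the waitqueue */
	bool frame_alive; /* true while the waiter's stack frame is still valid */
};

/*
 * Waiter side. While the entry is still linked, finish_wait() has to
 * synchronize with the waker, so the waiter cannot leave. Once the entry
 * is unlinked and a token was granted, it returns and its stack is reused.
 */
static void model_waiter_step(struct model_wait_data *d)
{
	if (d->got_token && !d->on_queue)
		d->frame_alive = false;
}

/*
 * Waker side. Returns true if the read of data->task (the wake-up) happened
 * while the waiter's frame was still alive. The waiter is given a chance to
 * run after every waker step, which is the worst-case interleaving.
 */
static bool model_waker(struct model_wait_data *d, bool wake_before_delete)
{
	bool safe;

	assert(d->on_queue); /* the waker only sees entries that are queued */

	d->got_token = true;
	model_waiter_step(d);

	if (wake_before_delete) {
		safe = d->frame_alive; /* stands in for wake_up_process(data->task) */
		model_waiter_step(d);
		d->on_queue = false;   /* stands in for list_del_init_careful() */
	} else {
		d->on_queue = false;   /* stands in for list_del_init() */
		model_waiter_step(d);
		safe = d->frame_alive; /* stands in for wake_up_process(data->task) */
	}
	model_waiter_step(d);
	return safe;
}

/* Run one wake-up from a fresh state with the chosen ordering. */
static bool model_run(bool wake_before_delete)
{
	struct model_wait_data d = {
		.got_token = false,
		.on_queue = true,
		.frame_alive = true,
	};

	return model_waker(&d, wake_before_delete);
}
```

In this model, model_run(false) (delete, then wake) reports an unsafe access, while model_run(true) (wake, then delete) does not. The model is single-threaded and says nothing about memory ordering, which is what the list_del_init_careful() / list_empty_careful() pairing addresses in the real code.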

Fixes: 38cfb5a45ee0 ("blk-wbt: improve waking of tasks")
Cc: stable@vger.kernel.org
Signed-off-by: Omar Sandoval <osandov@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/d3bee2463a67b1ee597211823bf7ad3721c26e41.1729014591.git.osandov@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-23 23:21:55 +01:00
partitions block: fix potential invalid pointer dereference in blk_add_partition 2024-11-23 23:21:19 +01:00
badblocks.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
bfq-cgroup.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
bfq-iosched.c block, bfq: don't break merge chain in bfq_split_bfqq() 2024-11-23 23:21:19 +01:00
bfq-iosched.h block, bfq: save also injection state on queue merging 2024-11-19 17:43:15 +01:00
bfq-wf2q.c block, bfq: always inject I/O of queues blocked by wakers 2024-11-19 17:41:42 +01:00
bio-integrity.c block: initialize integrity buffer to zero before writing it to media 2024-11-23 23:20:59 +01:00
bio.c block: prevent an integer overflow in bvec_try_merge_hw_page 2024-11-18 12:13:14 +01:00
blk-cgroup-rwstat.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-cgroup-rwstat.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-cgroup.c cgroup: rstat: punt root-level optimization to individual controllers 2024-11-19 17:40:21 +01:00
blk-core.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-crypto-fallback.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-crypto-internal.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-crypto.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-exec.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-flush.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-integrity.c block: remove the blk_flush_integrity call in blk_integrity_unregister 2024-11-23 23:20:58 +01:00
blk-ioc.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-iocost.c blk_iocost: fix more out of bound shifts 2024-11-23 23:21:38 +01:00
blk-iolatency.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-ioprio.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-ioprio.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-lib.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-map.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-merge.c blk: Fix lock inversion between ioc lock and bfqd lock 2024-11-19 17:40:26 +01:00
blk-mq-cpumap.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-mq-debugfs-zoned.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-mq-debugfs.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-mq-debugfs.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-mq-pci.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-mq-rdma.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-mq-sched.c blk: Fix lock inversion between ioc lock and bfqd lock 2024-11-19 17:40:26 +01:00
blk-mq-sched.h blk: Fix lock inversion between ioc lock and bfqd lock 2024-11-19 17:40:26 +01:00
blk-mq-sysfs.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-mq-tag.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-mq-tag.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-mq-virtio.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-mq.c blk-mq: fix IO hang from sbitmap wakeup race 2024-11-18 12:13:20 +01:00
blk-mq.h blk: Fix lock inversion between ioc lock and bfqd lock 2024-11-19 17:40:26 +01:00
blk-pm.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-pm.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-rq-qos.c blk-rq-qos: fix crash on rq_qos_wait vs. rq_qos_wake_function race 2024-11-23 23:21:55 +01:00
blk-rq-qos.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-sec-stats.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-settings.c block: Clear zone limits for a non-zoned stacked queue 2024-11-19 09:22:16 +01:00
blk-stat.c block: prevent division by zero in blk_rq_stat_sum() 2024-11-19 09:23:14 +01:00
blk-stat.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-sysfs.c Revert "mm: apply init protection" 2024-11-19 18:15:13 +01:00
blk-throttle.c blk-throttle: fix lockdep warning of "cgroup_mutex or RCU read lock required!" 2024-11-18 12:11:56 +01:00
blk-timeout.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-wbt.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-wbt.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk-zoned.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
blk.h blk: Fix lock inversion between ioc lock and bfqd lock 2024-11-19 17:40:26 +01:00
bounce.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
bsg-lib.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
bsg.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
cmdline-parser.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
elevator.c block: Add default I/O scheduler option 2024-11-19 17:43:55 +01:00
genhd.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
ioctl.c block/ioctl: prefer different overflow check 2024-11-19 14:19:06 +01:00
ioprio.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Kconfig Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Kconfig.iosched block: Add default I/O scheduler option 2024-11-19 17:43:55 +01:00
keyslot-manager.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kyber-iosched.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Makefile Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
mq-deadline-cgroup.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
mq-deadline-cgroup.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
mq-deadline-main.c blk: Fix lock inversion between ioc lock and bfqd lock 2024-11-19 17:40:26 +01:00
opal_proto.h block: sed-opal: handle empty atoms when parsing response 2024-11-19 08:44:36 +01:00
scsi_ioctl.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
sed-opal.c block: sed-opal: handle empty atoms when parsing response 2024-11-19 08:44:36 +01:00
ssg-cgroup.c ssg: Set max available ratio to 25 2024-11-17 17:41:50 +01:00
ssg-cgroup.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
ssg-iosched.c block: ssg-iosched: adapt to new patches 2024-11-19 17:40:09 +01:00
t10-pi.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00