kernel_samsung_a53x/drivers
Shifeng Li 6c9ae14473 net/mlx5e: Fix a race in command alloc flow
commit 8f5100da56b3980276234e812ce98d8f075194cd upstream.

Fix a use-after-free of cmd->ent caused by a race on the command entry.
The race occurs when one command releases its last refcount and frees
its index and entry while another process, running the command flush
flow, takes a refcount on the same command entry. The process handling
the flush may see the command as needing to be flushed if the other
process has allocated ent->idx but has not yet stored ent in
cmd->ent_arr in cmd_work_handler(). Fix it by moving the assignment of
cmd->ent_arr under the spin lock.

[70013.081955] BUG: KASAN: use-after-free in mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core]
[70013.081967] Write of size 4 at addr ffff88880b1510b4 by task kworker/26:1/1433361
[70013.081968]
[70013.082028] Workqueue: events aer_isr
[70013.082053] Call Trace:
[70013.082067]  dump_stack+0x8b/0xbb
[70013.082086]  print_address_description+0x6a/0x270
[70013.082102]  kasan_report+0x179/0x2c0
[70013.082173]  mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core]
[70013.082267]  mlx5_cmd_flush+0x80/0x180 [mlx5_core]
[70013.082304]  mlx5_enter_error_state+0x106/0x1d0 [mlx5_core]
[70013.082338]  mlx5_try_fast_unload+0x2ea/0x4d0 [mlx5_core]
[70013.082377]  remove_one+0x200/0x2b0 [mlx5_core]
[70013.082409]  pci_device_remove+0xf3/0x280
[70013.082439]  device_release_driver_internal+0x1c3/0x470
[70013.082453]  pci_stop_bus_device+0x109/0x160
[70013.082468]  pci_stop_and_remove_bus_device+0xe/0x20
[70013.082485]  pcie_do_fatal_recovery+0x167/0x550
[70013.082493]  aer_isr+0x7d2/0x960
[70013.082543]  process_one_work+0x65f/0x12d0
[70013.082556]  worker_thread+0x87/0xb50
[70013.082571]  kthread+0x2e9/0x3a0
[70013.082592]  ret_from_fork+0x1f/0x40

The sequence of events leading to this error is as follows:

             aer_recover_work              |          ent->work
-------------------------------------------+------------------------------
aer_recover_work_func                      |
|- pcie_do_recovery                        |
  |- report_error_detected                 |
    |- mlx5_pci_err_detected               |cmd_work_handler
      |- mlx5_enter_error_state            |  |- cmd_alloc_index
        |- enter_error_state               |    |- lock cmd->alloc_lock
          |- mlx5_cmd_flush                |    |- clear_bit
            |- mlx5_cmd_trigger_completions|    |- unlock cmd->alloc_lock
              |- lock cmd->alloc_lock      |
              |- vector = ~dev->cmd.vars.bitmask
              |- for_each_set_bit          |
                |- cmd_ent_get(cmd->ent_arr[i]) (UAF)
              |- unlock cmd->alloc_lock    |  |- cmd->ent_arr[ent->idx]=ent

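The cmd->ent_arr[ent->idx] assignment and the bit clearing are not
performed together under cmd->alloc_lock in cmd_work_handler(): the bit
is cleared while the lock is held, but the assignment happens only after
the lock has been released.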
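
The locking pattern described above can be illustrated with a minimal
user-space sketch. This is not the mlx5 driver code: it assumes a
pthread mutex in place of the kernel spin lock, and the names
struct cmd, struct cmd_ent, MAX_CMDS, cmd_init(),
cmd_alloc_index_and_publish(), cmd_free_index() and
cmd_trigger_completions() are simplified, hypothetical stand-ins chosen
for illustration. It shows only the idea of clearing the bit and
publishing the entry inside one critical section, so that a flusher
holding the same lock never sees an allocated index without an entry.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_CMDS 8 /* hypothetical slot count, for illustration only */

struct cmd_ent {
    int idx;
    int refcnt;
};

struct cmd {
    pthread_mutex_t alloc_lock;        /* stands in for cmd->alloc_lock */
    unsigned long bitmask;             /* set bit = free slot */
    struct cmd_ent *ent_arr[MAX_CMDS]; /* published entries, by index */
};

void cmd_init(struct cmd *cmd)
{
    pthread_mutex_init(&cmd->alloc_lock, NULL);
    cmd->bitmask = (1UL << MAX_CMDS) - 1; /* all slots free */
    for (int i = 0; i < MAX_CMDS; i++)
        cmd->ent_arr[i] = NULL;
}

/* Claim a free index and publish the entry in the same critical section. */
int cmd_alloc_index_and_publish(struct cmd *cmd, struct cmd_ent *ent)
{
    int idx = -1;

    pthread_mutex_lock(&cmd->alloc_lock);
    for (int i = 0; i < MAX_CMDS; i++) {
        if (cmd->bitmask & (1UL << i)) {
            cmd->bitmask &= ~(1UL << i); /* mark slot as in use */
            ent->idx = i;
            cmd->ent_arr[i] = ent;       /* publish before unlocking */
            idx = i;
            break;
        }
    }
    pthread_mutex_unlock(&cmd->alloc_lock);
    return idx; /* -1 if no slot was free */
}

/* Unpublish the entry and return its index to the free pool. */
void cmd_free_index(struct cmd *cmd, int idx)
{
    pthread_mutex_lock(&cmd->alloc_lock);
    cmd->ent_arr[idx] = NULL;
    cmd->bitmask |= 1UL << idx;
    pthread_mutex_unlock(&cmd->alloc_lock);
}

/* Flush side: take a reference on every in-use entry; return the count. */
int cmd_trigger_completions(struct cmd *cmd)
{
    int seen = 0;

    pthread_mutex_lock(&cmd->alloc_lock);
    unsigned long vector = ~cmd->bitmask & ((1UL << MAX_CMDS) - 1);
    for (int i = 0; i < MAX_CMDS; i++) {
        if (vector & (1UL << i)) {
            /* Holds because allocation publishes under the same lock. */
            assert(cmd->ent_arr[i] != NULL);
            cmd->ent_arr[i]->refcnt++;
            seen++;
        }
    }
    pthread_mutex_unlock(&cmd->alloc_lock);
    return seen;
}
```

If the store to ent_arr were moved after the unlock in
cmd_alloc_index_and_publish(), the flusher could observe a cleared bit
whose slot still holds a stale or NULL pointer, which is the window the
diagram above shows.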

Fixes: 50b2412b7e78 ("net/mlx5: Avoid possible free of command entry while timeout comp handler")
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-19 11:32:38 +01:00
accessibility speakup: Avoid crash on very long word 2024-11-19 11:32:23 +01:00
acpi Revert "ACPI: PM: Block ASUS B1400CEAE from suspend to idle by default" 2024-11-19 09:23:14 +01:00
amba
android binder: check offset alignment in binder_get_object() 2024-11-19 11:32:22 +01:00
ata ata: sata_mv: Fix PCI device ID table declaration compilation warning 2024-11-19 09:23:10 +01:00
atm atm: idt77252: fix a memleak in open_card_ubr0 2024-11-18 12:13:24 +01:00
auxdisplay
base x86/rfds: Mitigate Register File Data Sampling (RFDS) 2024-11-19 09:22:40 +01:00
battery drivers: battery_v2: sec_battery: export {CURRENT/VOLTAGE}_MAX to sysfs 2024-11-17 17:43:14 +01:00
bcma
block aoe: fix the potential use-after-free problem in aoecmd_cfg_pkts 2024-11-19 08:44:37 +01:00
bluetooth Bluetooth: btintel: Fixe build regression 2024-11-19 09:23:15 +01:00
bts
bus bus: tegra-aconnect: Update dependency to ARCH_TEGRA 2024-11-19 08:44:45 +01:00
cdrom
char hwrng: core - Fix page fault dead lock on mmap-ed hwrng 2024-11-18 12:12:55 +01:00
clk clk: Get runtime PM before walking tree during disable_unused 2024-11-19 11:32:22 +01:00
clocksource clocksource/drivers/timer-atmel-tcb: Fix initialization on SAM9 hardware 2024-11-18 11:43:12 +01:00
connector
counter counter: microchip-tcb-capture: Fix the use of internal GCLK logic 2024-11-08 11:25:51 +01:00
cpufreq cpufreq: brcmstb-avs-cpufreq: fix up "add check for cpufreq_cpu_get's return value" 2024-11-19 09:22:38 +01:00
cpuidle
crypto crypto: qat - resolve race condition during AER recovery 2024-11-19 09:22:15 +01:00
dax
dca
devfreq PM / devfreq: Synchronize devfreq_monitor_[start/stop] 2024-11-18 12:13:09 +01:00
dio
dma dmaengine: tegra210-adma: Update dependency to ARCH_TEGRA 2024-11-19 08:44:51 +01:00
dma-buf
edac EDAC/thunderx: Fix possible out-of-bounds string access 2024-11-18 12:12:19 +01:00
eisa
extcon
fingerprint
firewire firewire: core: use long bus reset on gap count error 2024-11-19 08:44:36 +01:00
firmware efivarfs: Request at most 512 bytes for variable names 2024-11-19 09:22:41 +01:00
fpga
fsi
gnss
gpio gpio: fix resource unwinding order in error path 2024-11-18 23:18:30 +01:00
gpu nouveau: fix instmem race condition around ptr stores 2024-11-19 11:32:23 +01:00
greybus
gud
hid HID: lenovo: Add middleclick_workaround sysfs knob for cptkbd 2024-11-19 08:44:51 +01:00
hsi
hv Drivers: hv: vmbus: Drop error message when 'No request id available' 2024-11-18 23:19:53 +01:00
hwmon hwmon: (amc6821) add of_match table 2024-11-19 09:22:33 +01:00
hwspinlock
hwtracing coresight: etm4x: Fix width of CCITMIN field 2024-11-18 12:12:19 +01:00
i2c i2c: i801: Fix block process call transactions 2024-11-18 12:13:29 +01:00
i3c i3c: master: cdns: Update maximum prescaler value for i2c clock 2024-11-18 12:13:19 +01:00
ide
idle
ifconn
iio iio: accel: bma400: Fix a compilation problem 2024-11-18 12:13:31 +01:00
infiniband RDMA/mlx5: Fix port number for counter query in multi-port configuration 2024-11-19 11:32:21 +01:00
input Input: synaptics-rmi4 - fail probing if memory allocation for "phys" fails 2024-11-19 09:23:14 +01:00
interconnect interconnect: Treat xlate() returning NULL node as an error 2024-11-18 12:12:00 +01:00
iommu iommu/vt-d: Allocate local memory for page request queue 2024-11-19 11:32:20 +01:00
ipack
irqchip irqchip/mips-gic: Don't touch vl_map if a local interrupt is not routable 2024-11-18 22:25:34 +01:00
isdn
kperfmon Kperfmon: add xyunbound version 2024-06-15 16:28:49 -03:00
kq/mesh
leds leds: sgm3140: Add missing timer cleanup and flash gpio control 2024-11-19 08:44:56 +01:00
lightnvm
macintosh
mailbox mailbox: imx: fix suspend failue 2024-11-19 11:32:20 +01:00
mcb mcb: fix error handling for different scenarios when parsing 2024-11-18 11:43:25 +01:00
md dm integrity: fix out-of-range warning 2024-11-19 09:22:44 +01:00
media media: cec: core: remove length check of Timer Status 2024-11-19 11:32:19 +01:00
memory
memstick
message
mfd mfd: altera-sysmgr: Call of_node_put() only when of_parse_phandle() takes a ref 2024-11-19 08:44:54 +01:00
misc mei: me: disable RPL-S on SPS and IGN firmwares 2024-11-19 11:32:23 +01:00
mmc mmc: core: Avoid negative index with array access 2024-11-19 09:22:42 +01:00
most
mtd mtd: rawnand: meson: fix scrambling mode value in command macro 2024-11-19 09:22:16 +01:00
muic
mux
net net/mlx5e: Fix a race in command alloc flow 2024-11-19 11:32:38 +01:00
nfc NFC: trf7970a: disable all regulators on removal 2024-11-19 11:32:37 +01:00
ntb
nubus
nvdimm nd_btt: Make BTT lanes preemptible 2024-11-18 11:43:03 +01:00
nvme drivers/nvme: Add quirks for device 126f:2262 2024-11-19 09:23:15 +01:00
nvmem nvmem: meson-efuse: fix function pointer type mismatch 2024-11-19 09:22:34 +01:00
of of: dynamic: Synchronize of_changeset_destroy() with the devlink removals 2024-11-19 09:23:10 +01:00
opp OPP: debugfs: Fix warning around icc_get_name() 2024-11-19 08:44:49 +01:00
oprofile
parisc
parport parport: parport_serial: Add Brainboxes device IDs and geometry 2024-11-18 12:12:19 +01:00
pci Manual Revert: PCI/ASPM: Make Intel DG2 L1 acceptable latency unlimited 2024-11-19 10:37:22 +01:00
pcmcia pcmcia: ds: fix possible name leak in error path in pcmcia_device_add() 2024-11-18 11:43:06 +01:00
perf perf/arm-cmn: Fix the unhandled overflow status of counter 4 to 7 2024-11-08 11:24:52 +01:00
phy phy: tegra: xusb: Add API to retrieve the port number of phy 2024-11-19 09:22:34 +01:00
pinctrl pinctrl: renesas: checker: Limit cfg reg enum checks to provided IDs 2024-11-19 09:23:14 +01:00
platform platform/x86: touchscreen_dmi: Add an extra entry for a variant of the Chuwi Vi8 tablet 2024-11-19 09:23:14 +01:00
pnp PNP: ACPI: fix fortify warning 2024-11-18 12:13:09 +01:00
power power: supply: bq27xxx-i2c: Do not free non existing IRQ 2024-11-18 23:18:29 +01:00
powercap
pps
ps3
ptp ptp: annotate data-race around q->head and q->tail 2024-11-18 11:43:19 +01:00
pwm pwm: jz4740: Don't use dev_err_probe() in .request() 2024-11-18 12:12:47 +01:00
rapidio
ras
regulator regulator: pwm-regulator: Add validity checks in continuous .get_voltage 2024-11-18 22:25:33 +01:00
remoteproc remoteproc: stm32: fix phys_addr_t format string 2024-11-19 08:45:00 +01:00
reset reset: hisilicon: hi6220: fix Wvoid-pointer-to-enum-cast warning 2024-11-18 12:12:16 +01:00
rpmsg rpmsg: virtio: Free driver_override when rpmsg_remove() 2024-11-18 12:12:56 +01:00
rtc rtc: mt6397: select IRQ_DOMAIN instead of depending on it 2024-11-19 08:44:58 +01:00
s390 s390/zcrypt: fix reference counting on zcrypt card objects 2024-11-19 09:22:35 +01:00
samsung Fix clang 16 errors treewide 2024-06-15 16:28:48 -03:00
sbus
scsi scsi: sd: Fix wrong zone_write_granularity value during revalidate 2024-11-19 09:23:16 +01:00
sensorhub treewide: fix build errors 2024-06-15 16:21:17 -03:00
sensors
sfi
sh
siox
slimbus slimbus: core: Remove usage of the deprecated ida_simple_xx() API 2024-11-19 09:22:34 +01:00
soc soc: fsl: qbman: Use raw spinlock for cgr_lock 2024-11-19 09:22:35 +01:00
soundwire soundwire: stream: fix NULL pointer dereference for multi_link 2024-11-18 12:11:57 +01:00
spi spi: spi-mt65xx: Fix NULL pointer access in interrupt handler 2024-11-19 08:45:00 +01:00
spmi
spu_verify
ssb
staging comedi: vmk80xx: fix incomplete endpoint checking 2024-11-19 11:32:22 +01:00
sti
target scsi: target: core: Add TMF to tmr_list handling 2024-11-18 22:25:32 +01:00
tc
tee tee: optee: Fix kernel panic caused by incorrect error handling 2024-11-19 09:22:39 +01:00
thermal thermal: core: prevent potential string overflow 2024-11-18 11:42:50 +01:00
thunderbolt thunderbolt: Fix wake configurations after device unplug 2024-11-19 11:32:22 +01:00
tty serial: mxs-auart: add spinlock around changing cts state 2024-11-19 11:32:38 +01:00
uh
uio uio: Fix use-after-free in uio_open 2024-11-18 12:12:19 +01:00
usb usb: Disable USB3 LPM at shutdown 2024-11-19 11:32:23 +01:00
vdpa
vfio vfio/fsl-mc: Block calling interrupt handler without trigger 2024-11-19 09:22:45 +01:00
vhost vhost: Add smp_rmb() in vhost_vq_avail_empty() 2024-11-19 11:32:20 +01:00
vibrator
video fbmon: prevent division by zero in fb_videomode_from_videomode() 2024-11-19 09:23:15 +01:00
virt
virtio virtio: reenable config if freezing device failed 2024-11-19 09:23:15 +01:00
vision
vision3
visorbus
vlynq
vme
w1
watchdog watchdog: stm32_iwdg: initialize default timeout 2024-11-19 08:44:57 +01:00
xen xen/events: close evtchn after mapping cleanup 2024-11-19 09:22:39 +01:00
zorro
Kconfig drivers: add stub kperfmon 2024-06-15 16:28:49 -03:00
Kconfig.variant1
Makefile drivers: add stub kperfmon 2024-06-15 16:28:49 -03:00
Makefile.variant1