kernel_samsung_a53x/drivers/net
Shifeng Li 6c9ae14473 net/mlx5e: Fix a race in command alloc flow
commit 8f5100da56b3980276234e812ce98d8f075194cd upstream.

Fix a cmd->ent use after free due to a race on command entry.
Such race occurs when one of the commands releases its last refcount and
frees its index and entry while another process running command flush
flow takes refcount to this command entry. The process which handles
commands flush may see this command as needed to be flushed if the other
process allocated a ent->idx but didn't set ent to cmd->ent_arr in
cmd_work_handler(). Fix it by moving the assignment of cmd->ent_arr into
the spin lock.

[70013.081955] BUG: KASAN: use-after-free in mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core]
[70013.081967] Write of size 4 at addr ffff88880b1510b4 by task kworker/26:1/1433361
[70013.081968]
[70013.082028] Workqueue: events aer_isr
[70013.082053] Call Trace:
[70013.082067]  dump_stack+0x8b/0xbb
[70013.082086]  print_address_description+0x6a/0x270
[70013.082102]  kasan_report+0x179/0x2c0
[70013.082173]  mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core]
[70013.082267]  mlx5_cmd_flush+0x80/0x180 [mlx5_core]
[70013.082304]  mlx5_enter_error_state+0x106/0x1d0 [mlx5_core]
[70013.082338]  mlx5_try_fast_unload+0x2ea/0x4d0 [mlx5_core]
[70013.082377]  remove_one+0x200/0x2b0 [mlx5_core]
[70013.082409]  pci_device_remove+0xf3/0x280
[70013.082439]  device_release_driver_internal+0x1c3/0x470
[70013.082453]  pci_stop_bus_device+0x109/0x160
[70013.082468]  pci_stop_and_remove_bus_device+0xe/0x20
[70013.082485]  pcie_do_fatal_recovery+0x167/0x550
[70013.082493]  aer_isr+0x7d2/0x960
[70013.082543]  process_one_work+0x65f/0x12d0
[70013.082556]  worker_thread+0x87/0xb50
[70013.082571]  kthread+0x2e9/0x3a0
[70013.082592]  ret_from_fork+0x1f/0x40

The logical relationship of this error is as follows:

             aer_recover_work              |          ent->work
-------------------------------------------+------------------------------
aer_recover_work_func                      |
|- pcie_do_recovery                        |
  |- report_error_detected                 |
    |- mlx5_pci_err_detected               |cmd_work_handler
      |- mlx5_enter_error_state            |  |- cmd_alloc_index
        |- enter_error_state               |    |- lock cmd->alloc_lock
          |- mlx5_cmd_flush                |    |- clear_bit
            |- mlx5_cmd_trigger_completions|    |- unlock cmd->alloc_lock
              |- lock cmd->alloc_lock      |
              |- vector = ~dev->cmd.vars.bitmask
              |- for_each_set_bit          |
                |- cmd_ent_get(cmd->ent_arr[i]) (UAF)
              |- unlock cmd->alloc_lock    |  |- cmd->ent_arr[ent->idx]=ent

The cmd->ent_arr[ent->idx] assignment and the bit clearing are not
protected by the cmd->alloc_lock in cmd_work_handler().

Fixes: 50b2412b7e78 ("net/mlx5: Avoid possible free of command entry while timeout comp handler")
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Shifeng Li <lishifeng@sangfor.com.cn>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-19 11:32:38 +01:00
..
appletalk
arcnet arcnet: restoring support for multiple Sohard Arcnet cards 2024-11-18 12:11:39 +01:00
bonding bonding: remove print in bond_verify_device_path 2024-11-18 12:13:23 +01:00
caif
can
dropdump
dsa net: dsa: mt7530: prevent possible incorrect XTAL frequency selection 2024-11-19 08:44:59 +01:00
ethernet net/mlx5e: Fix a race in command alloc flow 2024-11-19 11:32:38 +01:00
fddi
fjes fjes: fix memleaks in fjes_hw_setup 2024-11-18 12:13:01 +01:00
hamradio
hippi
hyperv hv_netvsc: Register VF in netvsc_probe if NET_DEVICE_REGISTER missed 2024-11-18 23:19:52 +01:00
ieee802154
ipa
ipvlan ipvlan: add ipvlan_route_v6_outbound() helper 2024-11-18 11:43:19 +01:00
mdio
netdevsim
pcs
phy net: phy: dp83822: Fix RGMII TX delay configuration 2024-11-19 08:44:49 +01:00
plip
ppp ppp_async: limit MRU to 64K 2024-11-18 12:13:25 +01:00
slip
team team: Fix use-after-free when an option instance allocation fails 2024-11-18 12:11:57 +01:00
usb net: usb: ax88179_178a: stop lying about skb->truesize 2024-11-19 11:32:37 +01:00
vmxnet3
vxlan vxlan: drop packets from invalid src-address 2024-11-19 11:32:36 +01:00
wan
wimax
wireguard wireguard: netlink: access device through ctx instead of peer 2024-11-19 09:22:37 +01:00
wireless wifi: iwlwifi: mvm: remove old PASN station when adding a new one 2024-11-19 11:32:36 +01:00
xen-netback xen-netback: properly sync TX responses 2024-11-18 12:13:30 +01:00
bareudp.c
dummy.c
eql.c
geneve.c geneve: fix header validation in geneve[6]_xmit_skb 2024-11-19 11:32:19 +01:00
gtp.c net: gtp: Fix Use-After-Free in gtp_dellink 2024-11-19 11:32:37 +01:00
ifb.c
Kconfig
LICENSE.SRC
loopback.c
macsec.c
macvlan.c macvlan: Don't propagate promisc change to lower dev in passthru 2024-11-18 11:43:20 +01:00
macvtap.c
Makefile
mdio.c
mii.c
net_failover.c
netconsole.c
nlmon.c
ntb_netdev.c
rionet.c
sb1000.c
Space.c
sungem_phy.c
tap.c
thunderbolt.c
tun.c tun: limit printing rate when illegal packet received by tun dev 2024-11-19 11:32:21 +01:00
veth.c
virtio_net.c virtio_net: Fix "‘%d’ directive writing between 1 and 11 bytes into a region of size 10" warnings 2024-11-18 12:13:20 +01:00
vrf.c
vsockmon.c
xen-netfront.c