kernel_samsung_a53x/net/core
Taehee Yoo b8529bdf8a xdp: fix invalid wait context of page_pool_destroy()
[ Upstream commit 59a931c5b732ca5fc2ca727f5a72aeabaafa85ec ]

A driver that uses a page pool creates it with page_pool_create().
The reference count of a new page pool is 1.
A page pool is destroyed only when its reference count reaches 0.
page_pool_destroy() drops one reference and destroys the pool when the
count reaches 0.
When a page pool is destroyed, ->disconnect() is called, which is
mem_allocator_disconnect().
This function internally takes a mutex with mutex_lock().

If the driver uses XDP, it registers a memory model with
xdp_rxq_info_reg_mem_model().
xdp_rxq_info_reg_mem_model() internally takes an additional page pool
reference if the memory model is a page pool.
The reference count is now 2.

To destroy a page pool, the driver must call both page_pool_destroy()
and xdp_unreg_mem_model().
xdp_unreg_mem_model() internally calls page_pool_destroy().
Only page_pool_destroy() decreases the reference count.
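
The reference counting described above can be sketched as a small
userspace model. This is only an illustration under stated assumptions:
struct toy_pool and every toy_*() function are hypothetical stand-ins
and are not the kernel's page pool API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page pool: only what the sketch needs. */
struct toy_pool {
    int  user_cnt;      /* reference count */
    bool disconnected;  /* set when the ->disconnect() step would run */
};

/* Models page_pool_create(): a new pool starts with one reference. */
static void toy_pool_create(struct toy_pool *p)
{
    p->user_cnt = 1;
    p->disconnected = false;
}

/* Models xdp_rxq_info_reg_mem_model() for a page pool memory model. */
static void toy_reg_mem_model(struct toy_pool *p)
{
    p->user_cnt++;
}

/*
 * Models page_pool_destroy(): drop one reference and tear the pool
 * down only when the count reaches 0. Returns true on teardown.
 */
static bool toy_pool_destroy(struct toy_pool *p)
{
    assert(p->user_cnt > 0);
    if (--p->user_cnt > 0)
        return false;
    p->disconnected = true;
    return true;
}

/* Models xdp_unreg_mem_model(): it only forwards to the destroy path. */
static bool toy_unreg_mem_model(struct toy_pool *p)
{
    return toy_pool_destroy(p);
}
```

In this model, create followed by register leaves the count at 2, the
first toy_pool_destroy() returns false, and only the second drop, made
through toy_unreg_mem_model(), reaches the teardown step.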

If a driver calls page_pool_destroy() and then xdp_unreg_mem_model(),
an invalid wait context warning is triggered.
This is because xdp_unreg_mem_model() calls page_pool_destroy() while
holding rcu_read_lock(), and page_pool_destroy() internally takes a
mutex with mutex_lock(), which may sleep.
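
The problematic ordering can be sketched in userspace as follows. This
is a simplified model, not kernel code: the sim_*() helpers are
hypothetical, and the check only mimics the rule that a lock which may
sleep must not be taken inside an RCU read-side critical section.

```c
#include <assert.h>
#include <stdbool.h>

static int  sim_rcu_depth;     /* nesting of simulated rcu_read_lock() */
static bool sim_invalid_wait;  /* set when a sleeping lock is taken under RCU */

static void sim_rcu_read_lock(void)   { sim_rcu_depth++; }
static void sim_rcu_read_unlock(void) { sim_rcu_depth--; }

/* Stand-in for mutex_lock(): a mutex may sleep, so flag use under RCU. */
static void sim_mutex_lock(void)
{
    if (sim_rcu_depth > 0)
        sim_invalid_wait = true;
}

/* Stand-in for page_pool_destroy() dropping the last reference. */
static void sim_pool_destroy_last_ref(void)
{
    sim_mutex_lock();  /* the disconnect step takes a mutex */
}

/*
 * Ordering before the fix: the destroy call sits inside the read-side
 * critical section. Returns true if the model saw an invalid wait.
 */
static bool sim_unreg_before_fix(void)
{
    sim_invalid_wait = false;
    sim_rcu_read_lock();
    sim_pool_destroy_last_ref();
    sim_rcu_read_unlock();
    return sim_invalid_wait;
}
```

Calling sim_unreg_before_fix() reports a violation in this model, which
corresponds to the warning shown in the splat below.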

Splat looks like:
=============================
[ BUG: Invalid wait context ]
6.10.0-rc6+ #4 Tainted: G W
-----------------------------
ethtool/1806 is trying to lock:
ffffffff90387b90 (mem_id_lock){+.+.}-{4:4}, at: mem_allocator_disconnect+0x73/0x150
other info that might help us debug this:
context-{5:5}
3 locks held by ethtool/1806:
stack backtrace:
CPU: 0 PID: 1806 Comm: ethtool Tainted: G W 6.10.0-rc6+ #4 f916f41f172891c800f2fed
Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
Call Trace:
<TASK>
dump_stack_lvl+0x7e/0xc0
__lock_acquire+0x1681/0x4de0
? _printk+0x64/0xe0
? __pfx_mark_lock.part.0+0x10/0x10
? __pfx___lock_acquire+0x10/0x10
lock_acquire+0x1b3/0x580
? mem_allocator_disconnect+0x73/0x150
? __wake_up_klogd.part.0+0x16/0xc0
? __pfx_lock_acquire+0x10/0x10
? dump_stack_lvl+0x91/0xc0
__mutex_lock+0x15c/0x1690
? mem_allocator_disconnect+0x73/0x150
? __pfx_prb_read_valid+0x10/0x10
? mem_allocator_disconnect+0x73/0x150
? __pfx_llist_add_batch+0x10/0x10
? console_unlock+0x193/0x1b0
? lockdep_hardirqs_on+0xbe/0x140
? __pfx___mutex_lock+0x10/0x10
? tick_nohz_tick_stopped+0x16/0x90
? __irq_work_queue_local+0x1e5/0x330
? irq_work_queue+0x39/0x50
? __wake_up_klogd.part.0+0x79/0xc0
? mem_allocator_disconnect+0x73/0x150
mem_allocator_disconnect+0x73/0x150
? __pfx_mem_allocator_disconnect+0x10/0x10
? mark_held_locks+0xa5/0xf0
? rcu_is_watching+0x11/0xb0
page_pool_release+0x36e/0x6d0
page_pool_destroy+0xd7/0x440
xdp_unreg_mem_model+0x1a7/0x2a0
? __pfx_xdp_unreg_mem_model+0x10/0x10
? kfree+0x125/0x370
? bnxt_free_ring.isra.0+0x2eb/0x500
? bnxt_free_mem+0x5ac/0x2500
xdp_rxq_info_unreg+0x4a/0xd0
bnxt_free_mem+0x1356/0x2500
bnxt_close_nic+0xf0/0x3b0
? __pfx_bnxt_close_nic+0x10/0x10
? ethnl_parse_bit+0x2c6/0x6d0
? __pfx___nla_validate_parse+0x10/0x10
? __pfx_ethnl_parse_bit+0x10/0x10
bnxt_set_features+0x2a8/0x3e0
__netdev_update_features+0x4dc/0x1370
? ethnl_parse_bitset+0x4ff/0x750
? __pfx_ethnl_parse_bitset+0x10/0x10
? __pfx___netdev_update_features+0x10/0x10
? mark_held_locks+0xa5/0xf0
? _raw_spin_unlock_irqrestore+0x42/0x70
? __pm_runtime_resume+0x7d/0x110
ethnl_set_features+0x32d/0xa20

To fix this problem, use rhashtable_lookup_fast() instead of
rhashtable_lookup() under rcu_read_lock().
Using xa (the struct xdp_mem_allocator returned by the lookup) without
rcu_read_lock() is safe here.
xa is freed by __xdp_mem_allocator_rcu_free(), which is scheduled with
call_rcu() from mem_xa_remove().
mem_xa_remove() is called by page_pool_destroy() when the reference
count reaches 0.
xa is therefore already protected by the reference count mechanism in
the control plane.
So dropping rcu_read_lock() around page_pool_destroy() is safe.
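
The shape of the fix can be sketched with a userspace model. All
demo_*() names and the small array used as a lookup table are
hypothetical. The one kernel fact relied on is that
rhashtable_lookup_fast() takes and releases the RCU read lock around
the lookup itself. The NULL check and the omission of reference
counting are simplifications of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_allocator { int id; };

static int  demo_rcu_depth;        /* nesting of simulated rcu_read_lock() */
static bool demo_slept_under_rcu;  /* set if destroy runs under RCU */
static struct demo_allocator demo_table[] = { { 1 }, { 2 } };

static void demo_rcu_read_lock(void)   { demo_rcu_depth++; }
static void demo_rcu_read_unlock(void) { demo_rcu_depth--; }

/* Models rhashtable_lookup(): the caller must hold the read lock. */
static struct demo_allocator *demo_lookup(int id)
{
    size_t i;

    assert(demo_rcu_depth > 0);
    for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++)
        if (demo_table[i].id == id)
            return &demo_table[i];
    return NULL;
}

/* Models rhashtable_lookup_fast(): locks only around the lookup. */
static struct demo_allocator *demo_lookup_fast(int id)
{
    struct demo_allocator *xa;

    demo_rcu_read_lock();
    xa = demo_lookup(id);
    demo_rcu_read_unlock();
    return xa;
}

/* Models the destroy path, which may sleep on a mutex. */
static void demo_destroy(struct demo_allocator *xa)
{
    (void)xa;
    if (demo_rcu_depth > 0)
        demo_slept_under_rcu = true;
}

/* Ordering after the fix: destroy runs outside the read-side section. */
static bool demo_unreg_after_fix(int id)
{
    struct demo_allocator *xa = demo_lookup_fast(id);

    demo_slept_under_rcu = false;
    if (xa)
        demo_destroy(xa);
    return demo_slept_under_rcu;
}
```

Here demo_unreg_after_fix() returns false because the read lock is
already released when the destroy step runs. The model does not cover
the lifetime argument above, which rests on the reference count.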

Fixes: c3f812cea0d7 ("page_pool: do not release pool until inflight == 0.")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20240712095116.3801586-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-23 23:20:08 +01:00
bpf_sk_storage.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
datagram.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
datagram.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
dev.c net: give more chances to rcu in netdev_wait_allrefs_any() 2024-11-19 12:26:55 +01:00
dev_addr_lists.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
dev_ioctl.c net: dev: Convert sa_data to flexible array in struct sockaddr 2024-11-18 22:25:41 +01:00
devlink.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
drop_monitor.c drop_monitor: replace spin_lock by raw_spin_lock 2024-11-19 14:19:06 +01:00
dst.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
dst_cache.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
failover.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
fib_notifier.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
fib_rules.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
filter.c bpf: Add a check for struct bpf_fib_lookup size 2024-11-19 14:19:32 +01:00
flow_dissector.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
flow_offload.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
gen_estimator.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
gen_stats.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
gro_cells.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
hwbm.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
link_watch.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
lwt_bpf.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
lwtunnel.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Makefile Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
neighbour.c neighbour: Don't let neigh_forced_gc() disable preemption for long 2024-11-18 12:12:16 +01:00
net-procfs.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
net-sysfs.c Backport mac80211 patches from linux-6.1.y 2024-06-15 16:29:20 -03:00
net-sysfs.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
net-traces.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
net_namespace.c netns: Make get_net_ns() handle zero refcount net 2024-11-19 14:19:08 +01:00
netclassid_cgroup.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
netevent.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
netpoll.c netpoll: Fix race condition in netpoll_owner_active 2024-11-19 14:19:06 +01:00
netprio_cgroup.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
page_pool.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
pktgen.c net: pktgen: Fix interface flags printing 2024-11-08 11:26:11 +01:00
ptp_classifier.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
request_sock.c tcp: make sure init the accept_queue's spinlocks once 2024-11-18 12:12:59 +01:00
rtnetlink.c rtnetlink: Correct nested IFLA_VF_VLAN_LIST attribute validation 2024-11-19 11:32:45 +01:00
scm.c Revert "io_uring/unix: drop usage of io_uring socket" 2024-11-19 09:11:51 +01:00
secure_seq.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
skbuff.c kcov: Remove kcov include from sched.h and move it to its users. 2024-11-19 11:32:46 +01:00
skmsg.c bpf, sockmap: Fix sk->sk_forward_alloc warn_on in sk_stream_kill_queues 2024-11-19 14:19:42 +01:00
sock.c ipv6: Fix data races around sk->sk_prot. 2024-11-19 14:19:35 +01:00
sock_diag.c sock_diag: annotate data-races around sock_diag_handlers[family] 2024-11-19 08:44:38 +01:00
sock_map.c bpf, sockmap: Fix sk->sk_forward_alloc warn_on in sk_stream_kill_queues 2024-11-19 14:19:42 +01:00
sock_reuseport.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
stream.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
sysctl_net_core.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
timestamping.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
tso.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
utils.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
xdp.c xdp: fix invalid wait context of page_pool_destroy() 2024-11-23 23:20:08 +01:00