kernel_samsung_a53x/net
Jiri Wiesner 6f57e94334 net/ipv6: release expired exception dst cached in socket
[ Upstream commit 3301ab7d5aeb0fe270f73a3d4810c9d1b6a9f045 ]

Dst objects get leaked in ip6_negative_advice() when this function is
executed for an expired IPv6 route located in the exception table. There
are several conditions that must be fulfilled for the leak to occur:
* an ICMPv6 packet indicating a change of the MTU for the path is received,
  resulting in an exception dst being created
* a TCP connection that uses the exception dst for routing packets must
  start timing out so that TCP begins retransmissions
* after the exception dst expires, the FIB6 garbage collector must not run
  before TCP executes ip6_negative_advice() for the expired exception dst

When TCP executes ip6_negative_advice() for an exception dst that has
expired and if no other socket holds a reference to the exception dst, the
refcount of the exception dst is 2, which corresponds to the increment
made by dst_init() and the increment made by the TCP socket for which the
connection is timing out. The refcount made by the socket is never
released. The refcount of the dst is decremented in sk_dst_reset() but
that decrement is counteracted by a dst_hold() intentionally placed just
before the sk_dst_reset() in ip6_negative_advice(). After
ip6_negative_advice() has finished, there is no other object tied to the
dst. The socket lost its reference stored in sk_dst_cache and the dst is
no longer in the exception table. The exception dst becomes a leaked
object.

As a result of this dst leak, an unbalanced refcount is reported for the
loopback device of a net namespace being destroyed under kernels that do
not contain e5f80fcf869a ("ipv6: give an IPv6 dev to blackhole_netdev"):
unregister_netdevice: waiting for lo to become free. Usage count = 2

Fix the dst leak by removing the dst_hold() in ip6_negative_advice(). The
patch that introduced the dst_hold() in ip6_negative_advice() was
92f1655aa2b22 ("net: fix __dst_negative_advice() race"). But 92f1655aa2b22
merely refactored the code with regards to the dst refcount so the issue
was present even before 92f1655aa2b22. The bug was introduced in
54c1a859efd9f ("ipv6: Don't drop cache route entry unless timer actually
expired.") where the expired cached route is deleted and the sk_dst_cache
member of the socket is set to NULL by calling dst_negative_advice() but
the refcount belonging to the socket is left unbalanced.

The IPv4 version - ipv4_negative_advice() - is not affected by this bug.
When the TCP connection times out ipv4_negative_advice() merely resets the
sk_dst_cache of the socket while decrementing the refcount of the
exception dst.

Fixes: 92f1655aa2b22 ("net: fix __dst_negative_advice() race")
Fixes: 54c1a859efd9f ("ipv6: Don't drop cache route entry unless timer actually expired.")
Link: https://lore.kernel.org/netdev/20241113105611.GA6723@incl/T/#u
Signed-off-by: Jiri Wiesner <jwiesner@suse.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241128085950.GA4505@incl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-17 13:24:26 +01:00
..
6lowpan
9p 9p/xen: fix release of IRQ 2024-12-17 13:24:22 +01:00
802
8021q Revert "gro: remove rcu_read_lock/rcu_read_unlock from gro_receive handlers" 2024-11-24 00:23:41 +01:00
appletalk
atm
ax25 Revert "Make more sysctl constants read-only" 2024-12-03 19:56:17 +01:00
batman-adv batman-adv: fix random jitter calculation 2024-11-19 17:55:48 +01:00
bluetooth Bluetooth: Fix type of len in rfcomm_sock_getsockopt{,_old}() 2024-12-17 13:24:18 +01:00
bpf
bpfilter
bridge net: bridge: xmit: make sure we have at least eth header len bytes 2024-11-30 02:33:25 +01:00
caif
can can: j1939: j1939_session_new(): fix skb reference counting 2024-12-17 13:24:26 +01:00
ceph libceph: fix race between delayed_work() and ceph_monc_stop() 2024-11-19 14:19:45 +01:00
core bpf, sockmap: Fix sk_msg_reset_curr 2024-12-17 13:24:06 +01:00
dcb
dccp
decnet
dns_resolver
dsa
ethernet Revert "gro: remove rcu_read_lock/rcu_read_unlock from gro_receive handlers" 2024-11-24 00:23:41 +01:00
ethtool net: introduce a netdev feature for UDP GRO forwarding 2024-12-17 13:24:15 +01:00
hsr net: hsr: avoid potential out-of-bound access in fill_frame_info() 2024-12-17 13:24:26 +01:00
ieee802154
ife
ipv4 ipmr: fix tables suspicious RCU usage 2024-12-17 13:24:16 +01:00
ipv6 net/ipv6: release expired exception dst cached in socket 2024-12-17 13:24:26 +01:00
iucv Revert "net/iucv: fix use after free in iucv_sock_close()" 2024-11-24 00:23:55 +01:00
kcm kcm: Serialise kcm_sendmsg() for the same socket. 2024-11-23 23:20:48 +01:00
key
l2tp genetlink: hold RCU in genlmsg_mcast() 2024-11-23 23:21:59 +01:00
l3mdev net: Add l3mdev index to flow struct and avoid oif reset for port devices 2024-11-23 23:21:52 +01:00
lapb
llc
mac80211 mac80211: fix user-power when emulating chanctx 2024-12-17 13:23:57 +01:00
mac802154 Revert "net: mac802154: Fix racy device stats updates by DEV_STATS_INC() and DEV_STATS_ADD()" 2024-11-19 14:52:14 +01:00
mpls
mptcp Revert "mptcp: correct MPTCP_SUBFLOW_ATTR_SSN_OFFSET reserved size" 2024-11-24 00:23:53 +01:00
ncm
ncsi net/ncsi: Fix the multi thread manner of NCSI driver 2024-11-19 14:19:00 +01:00
netfilter netfilter: x_tables: fix LED ID check in led_tg_check() 2024-12-17 13:24:25 +01:00
netlabel
netlink netlink: terminate outstanding dump on socket close 2024-12-17 13:20:50 +01:00
netrom Revert "Make more sysctl constants read-only" 2024-12-03 19:56:17 +01:00
nfc
nsh
openvswitch
packet af_packet: Handle outgoing VLAN packets without hardware offloading 2024-11-23 23:20:12 +01:00
phonet Revert "Make more sysctl constants read-only" 2024-12-03 19:56:17 +01:00
psample
qrtr Revert "net: qrtr: Update packets cloning when broadcasting" 2024-11-24 00:23:18 +01:00
rds Revert "net:rds: Fix possible deadlock in rds_message_put" 2024-11-24 00:23:49 +01:00
rfkill net: rfkill: gpio: Add check for clk_enable() 2024-12-17 13:24:07 +01:00
rose Revert "Make more sysctl constants read-only" 2024-12-03 19:56:17 +01:00
rxrpc
sched net/sched: tbf: correct backlog statistic for GSO packets 2024-12-17 13:24:26 +01:00
sctp Revert "Make more sysctl constants read-only" 2024-12-03 19:56:17 +01:00
skb_tracer
smc Revert "net/smc: Allow SMC-D 1MB DMB allocations" 2024-11-24 00:23:56 +01:00
strparser
sunrpc sunrpc: clear XPRT_SOCK_UPD_TIMEOUT when reset transport 2024-12-17 13:24:23 +01:00
switchdev
tipc Revert "net: tipc: avoid possible garbage value" 2024-11-24 00:23:28 +01:00
tls
unix Revert "af_unix: Remove put_pid()/put_cred() in copy_peercred()." 2024-11-24 00:23:43 +01:00
vmw_vsock vsock/virtio: Initialization of the dangling pointer occurring in vsk->trans 2024-11-30 02:33:27 +01:00
wimax
wireless Revert "wifi: nl80211: don't give key data to userspace" 2024-11-24 00:23:55 +01:00
x25 Revert "Make more sysctl constants read-only" 2024-12-03 19:56:17 +01:00
xdp
xfrm xfrm: store and rely on direction to construct offload flags 2024-12-17 13:24:04 +01:00
compat.c
devres.c
Kconfig
Makefile
socket.c
sysctl_net.c
TEST_MAPPING