kernel_samsung_a53x/net
Jason Xing 5b2e4aef3f tcp: avoid reusing FIN_WAIT2 when trying to find port in connect() process
[ Upstream commit 0d9e5df4a257afc3a471a82961ace9a22b88295a ]

We found that a close-wait socket was reset by the other side because
a new connection reused the same port. This was unexpected, so we
investigated the underlying reason.

The following experiment was conducted in a test environment. We
limit the port range to 40000-40010 and delay the call to close()
after receiving a FIN from the active-close side, which makes it easy
to reproduce what happened in production.
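
The delayed close() on the passive side can be sketched in user space
roughly as follows. This is a minimal illustration, not the program used
for the experiment; `drain_then_close` and its parameters are hypothetical
names. It reads until the peer's FIN shows up as end-of-file, then waits
before closing, which keeps a TCP socket in CLOSE_WAIT (and the peer in
FIN_WAIT2) for the length of the delay. The port range would be narrowed
separately, for example through the `net.ipv4.ip_local_port_range` sysctl.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read from fd until end-of-file (the peer's FIN on a
 * TCP socket), wait delay_sec seconds, then close fd.
 * Returns the number of bytes drained, or -1 on a read error.
 */
long drain_then_close(int fd, unsigned int delay_sec)
{
    char buf[4096];
    long total = 0;

    assert(fd >= 0);

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));

        if (n > 0) {
            total += n;
            continue;
        }
        if (n == 0)
            break;              /* EOF: the peer has sent its FIN */
        if (errno == EINTR)
            continue;           /* interrupted, retry the read */
        close(fd);
        return -1;
    }

    /* While we sleep, a TCP socket stays in CLOSE_WAIT. */
    sleep(delay_sec);
    close(fd);
    return total;
}
```

A server would call this on the descriptor returned by accept(), with a
delay of a few seconds.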

Here are three connections captured by tcpdump:
127.0.0.1.40002 > 127.0.0.1.9999: Flags [S], seq 2965525191
127.0.0.1.9999 > 127.0.0.1.40002: Flags [S.], seq 2769915070
127.0.0.1.40002 > 127.0.0.1.9999: Flags [.], ack 1
127.0.0.1.40002 > 127.0.0.1.9999: Flags [F.], seq 1, ack 1
// a few seconds later, within 60 seconds
127.0.0.1.40002 > 127.0.0.1.9999: Flags [S], seq 2965590730
127.0.0.1.9999 > 127.0.0.1.40002: Flags [.], ack 2
127.0.0.1.40002 > 127.0.0.1.9999: Flags [R], seq 2965525193
// later, very quickly
127.0.0.1.40002 > 127.0.0.1.9999: Flags [S], seq 2965590730
127.0.0.1.9999 > 127.0.0.1.40002: Flags [S.], seq 3120990805
127.0.0.1.40002 > 127.0.0.1.9999: Flags [.], ack 1

As we can see, the first flow is reset because:
1) client starts a new connection (the second one)
2) client looks for a suitable port and picks one held by a timewait
   socket
   (its state is timewait, substate is fin_wait2)
3) client occupies that timewait port to send a SYN
4) server finds a corresponding close-wait socket in ehash table,
   then replies with a challenge ack
5) client sends an RST to terminate this old close-wait socket.
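
The sequence number of the RST in the capture can be checked by hand.
Assuming the usual TCP rules (a SYN and a FIN each consume one sequence
number, and a host in SYN-SENT answers an unacceptable ACK with an RST
whose sequence number equals that ACK value), the sketch below reproduces
the value seen in step 5. The function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Sequence space used by a connection that sent SYN, payload, then FIN. */
uint32_t next_seq_after_syn_fin(uint32_t isn, uint32_t payload_len)
{
    return isn + 1u + payload_len + 1u;  /* wraps modulo 2^32 like TCP */
}

/*
 * In SYN-SENT, with only the SYN sent, the single acceptable ACK value
 * is new_isn + 1.
 */
int ack_acceptable_in_syn_sent(uint32_t new_isn, uint32_t seg_ack)
{
    return seg_ack == new_isn + 1u;
}

/* An unacceptable ACK is answered with an RST carrying SEQ = SEG.ACK. */
uint32_t rst_seq_for_unacceptable_ack(uint32_t seg_ack)
{
    return seg_ack;
}

/* Check the model against the numbers in the tcpdump capture. */
void check_capture(void)
{
    const uint32_t old_isn = 2965525191u;  /* first SYN in the capture */
    const uint32_t new_isn = 2965590730u;  /* second SYN in the capture */
    uint32_t server_ack = next_seq_after_syn_fin(old_isn, 0);

    /* The server still acknowledges the old flow, not the new SYN. */
    assert(!ack_acceptable_in_syn_sent(new_isn, server_ack));
    /* The RST sequence number matches the one in the capture. */
    assert(rst_seq_for_unacceptable_ack(server_ack) == 2965525193u);
}
```

With the first connection's initial sequence number 2965525191 and no
payload, the old flow ends at 2965525193, which is the sequence number
carried by the RST.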

I don't think the port selection algorithm should choose a FIN_WAIT2
socket when tcp_tw_reuse is turned on, because unread data may remain
on the server side. If one side hasn't called close() yet, we should
not consider the connection expendable and treat it at will.

Even if the server sometimes cannot call close() as soon as we would
expect, its connection should not be terminated so easily, especially
not because of a second, unrelated connection.
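
The rule argued for here can be modelled in user space as below. This is
a simplified sketch of the decision, not the kernel code: the enum and
function names are made up for illustration, and the other conditions a
real reuse check would apply (such as timestamps) are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Substates a timewait entry can carry in this simplified model. */
enum tw_substate_model {
    MODEL_FIN_WAIT2,   /* we sent FIN, the peer has not called close() yet */
    MODEL_TIME_WAIT    /* both sides have finished closing */
};

/*
 * Decide whether connect() may take over the port of a timewait entry.
 * tw_reuse_enabled stands in for a non-zero tcp_tw_reuse setting.
 */
bool model_port_reusable(bool tw_reuse_enabled, enum tw_substate_model sub)
{
    assert(sub == MODEL_FIN_WAIT2 || sub == MODEL_TIME_WAIT);

    if (!tw_reuse_enabled)
        return false;
    /* The peer may still be in CLOSE_WAIT with unread data: never reuse. */
    if (sub == MODEL_FIN_WAIT2)
        return false;
    return true;
}
```

In this model only an entry whose substate is TIME_WAIT is eligible, and
only when reuse is enabled.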

After this patch, we can see the expected failure if we start a
connection when all the ports are occupied in fin_wait2 state:
"Ncat: Cannot assign requested address."
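
On Linux, "Cannot assign requested address" is the message text for
EADDRNOTAVAIL. The following sketch shows how that failure arises once
every port in the range is held in fin_wait2. It is a user-space model
with hypothetical names (`port_state_model`, `model_pick_port`), not the
kernel's port search.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Possible states of a local port in this simplified model. */
enum port_state_model {
    PORT_FREE,
    PORT_TW_FIN_WAIT2,   /* timewait entry, substate fin_wait2 */
    PORT_TW_TIME_WAIT    /* timewait entry, substate time_wait */
};

/*
 * Walk a table of local ports and return the index of the first usable
 * one, or -EADDRNOTAVAIL when none qualifies.
 */
int model_pick_port(const enum port_state_model *ports, size_t n,
                    int tw_reuse_enabled)
{
    assert(ports != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (ports[i] == PORT_FREE)
            return (int)i;
        if (ports[i] == PORT_TW_TIME_WAIT && tw_reuse_enabled)
            return (int)i;
        /* PORT_TW_FIN_WAIT2 is skipped regardless of tw_reuse_enabled. */
    }
    return -EADDRNOTAVAIL;
}
```

With eleven entries (40000 to 40010) all marked PORT_TW_FIN_WAIT2, the
function returns -EADDRNOTAVAIL even when reuse is enabled.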

Reported-by: Jade Dong <jadedong@tencent.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240823001152.31004-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-23 23:21:38 +01:00
..
6lowpan
9p net/9p: fix uninit-value in p9_client_rpc() 2024-11-19 12:27:18 +01:00
802
8021q gro: remove rcu_read_lock/rcu_read_unlock from gro_complete handlers 2024-11-23 23:21:04 +01:00
appletalk
atm
ax25
batman-adv batman-adv: fix random jitter calculation 2024-11-19 17:55:48 +01:00
bluetooth Bluetooth: L2CAP: Fix not validating setsockopt user input 2024-11-23 23:21:36 +01:00
bpf
bpfilter
bridge net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN 2024-11-23 23:21:04 +01:00
caif
can can: bcm: Clear bo->bcm_proc_read after remove_proc_entry(). 2024-11-23 23:21:18 +01:00
ceph libceph: fix race between delayed_work() and ceph_monc_stop() 2024-11-19 14:19:45 +01:00
core net: add more sanity checks to qdisc_pkt_len_init() 2024-11-23 23:21:35 +01:00
dcb
dccp
decnet
dns_resolver keys, dns: Fix size check of V1 server-list header 2024-11-18 12:12:43 +01:00
dsa
ethernet gro: remove rcu_read_lock/rcu_read_unlock from gro_complete handlers 2024-11-23 23:21:04 +01:00
ethtool ethtool: check device is present when getting link settings 2024-11-23 23:20:55 +01:00
hsr hsr: Handle failures in module init 2024-11-19 08:44:59 +01:00
ieee802154
ife
ipv4 tcp: avoid reusing FIN_WAIT2 when trying to find port in connect() process 2024-11-23 23:21:38 +01:00
ipv6 netfilter: nf_tables: prevent nf_skb_duplicated corruption 2024-11-23 23:21:35 +01:00
iucv s390/iucv: fix receive buffer virtual vs physical address confusion 2024-11-23 23:20:47 +01:00
kcm kcm: Serialise kcm_sendmsg() for the same socket. 2024-11-23 23:20:48 +01:00
key
l2tp l2tp: fix lockdep splat 2024-11-23 23:20:22 +01:00
l3mdev
lapb
llc llc: call sock_orphan() at release time 2024-11-18 12:13:22 +01:00
mac80211 wifi: mac80211: use two-phase skb reclamation in ieee80211_do_stop() 2024-11-23 23:21:18 +01:00
mac802154 Revert "net: mac802154: Fix racy device stats updates by DEV_STATS_INC() and DEV_STATS_ADD()" 2024-11-19 14:52:14 +01:00
mpls
mptcp mptcp: fix sometimes-uninitialized warning 2024-11-23 23:21:29 +01:00
ncm
ncsi net/ncsi: Fix the multi thread manner of NCSI driver 2024-11-19 14:19:00 +01:00
netfilter netfilter: ctnetlink: compile ctnetlink_label_size with CONFIG_NF_CONNTRACK_EVENTS 2024-11-23 23:21:28 +01:00
netlabel calipso: fix memory leak in netlbl_calipso_add_pass() 2024-11-18 12:12:25 +01:00
netlink netlink: hold nlk->cb_mutex longer in __netlink_dump_start() 2024-11-23 23:20:45 +01:00
netrom netrom: Fix a memory leak in nr_heartbeat_expiry() 2024-11-19 14:19:08 +01:00
nfc nfc: nci: Fix handling of zero-length payload packets in nci_rx_work() 2024-11-19 12:27:10 +01:00
nsh nsh: Restore skb->{protocol,data,mac_header} for outer header in nsh_gso_segment(). 2024-11-19 11:32:42 +01:00
openvswitch openvswitch: Set the skbuff pkt_type for proper pmtud support. 2024-11-19 12:27:09 +01:00
packet af_packet: Handle outgoing VLAN packets without hardware offloading 2024-11-23 23:20:12 +01:00
phonet phonet: fix rtm_phonet_notify() skb allocation 2024-11-19 11:32:46 +01:00
psample
qrtr net: qrtr: Update packets cloning when broadcasting 2024-11-23 23:21:28 +01:00
rds net:rds: Fix possible deadlock in rds_message_put 2024-11-23 23:20:54 +01:00
rfkill
rose
rxrpc rxrpc: Fix response to PING RESPONSE ACKs to a dead call 2024-11-18 12:13:25 +01:00
sched net: sched: consistently use rcu_replace_pointer() in taprio_change() 2024-11-23 23:21:38 +01:00
sctp sctp: set sk_state back to CLOSED if autobind fails in sctp_listen_start 2024-11-23 23:21:36 +01:00
skb_tracer
smc net/smc: set rmb's SG_MAX_SINGLE_ALLOC limitation only when CONFIG_ARCH_NO_SG_CHAIN is defined 2024-11-23 23:20:06 +01:00
strparser
sunrpc net, sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket 2024-11-23 23:21:09 +01:00
switchdev
tipc tipc: guard against string buffer overrun 2024-11-23 23:21:38 +01:00
tls tls: fix missing memory barrier in tls_init 2024-11-19 12:27:09 +01:00
unix af_unix: Remove put_pid()/put_cred() in copy_peercred(). 2024-11-23 23:21:03 +01:00
vmw_vsock virtio/vsock: fix logic which reduces credit update messages 2024-11-18 12:12:37 +01:00
wimax
wireless wifi: cfg80211: fix two more possible UBSAN-detected off-by-one errors 2024-11-23 23:21:18 +01:00
x25 net/x25: fix incorrect parameter validation in the x25_getsockopt() function 2024-11-19 08:44:50 +01:00
xdp xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING 2024-11-19 11:32:19 +01:00
xfrm net: fix __dst_negative_advice() race 2024-11-19 12:27:19 +01:00
compat.c
devres.c
Kconfig
Makefile
socket.c
sysctl_net.c
TEST_MAPPING