kernel_samsung_a53x/tools
Yonghong Song 5e5ec20502 selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT
[ Upstream commit 7015843afcaf68c132784c89528dfddc0005e483 ]

Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT
configs. In this particular case, the base VM is AMD with 166 cpus, and I
run selftests with regular qemu on top of that and indeed send_signal test
failed. I also tried with an Intel box with 80 cpus and there is no issue.

The main qemu command line includes:

  -enable-kvm -smp 16 -cpu host

The failure log looks like:

  $ ./test_progs -t send_signal
  [   48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225]
  [   48.503622] Modules linked in: bpf_testmod(O)
  [   48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G           O       6.9.0-08561-g2c1713a8f1c9-dirty #69
  [   48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
  [   48.511635] RIP: 0010:handle_softirqs+0x71/0x290
  [   48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb
  [   48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246
  [   48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0
  [   48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000
  [   48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000
  [   48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280
  [   48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
  [   48.528525] FS:  00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000
  [   48.531600] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0
  [   48.537538] Call Trace:
  [   48.537538]  <IRQ>
  [   48.537538]  ? watchdog_timer_fn+0x1cd/0x250
  [   48.539590]  ? lockup_detector_update_enable+0x50/0x50
  [   48.539590]  ? __hrtimer_run_queues+0xff/0x280
  [   48.542520]  ? hrtimer_interrupt+0x103/0x230
  [   48.544524]  ? __sysvec_apic_timer_interrupt+0x4f/0x140
  [   48.545522]  ? sysvec_apic_timer_interrupt+0x3a/0x90
  [   48.547612]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
  [   48.547612]  ? handle_softirqs+0x71/0x290
  [   48.547612]  irq_exit_rcu+0x63/0x80
  [   48.551585]  sysvec_apic_timer_interrupt+0x75/0x90
  [   48.552521]  </IRQ>
  [   48.553529]  <TASK>
  [   48.553529]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
  [   48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260
  [   48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74
  [   48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282
  [   48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620
  [   48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00
  [   48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000
  [   48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000
  [   48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00
  [   48.575614]  ? finish_task_switch.isra.0+0x8d/0x260
  [   48.576523]  __schedule+0x364/0xac0
  [   48.577535]  schedule+0x2e/0x110
  [   48.578555]  pipe_read+0x301/0x400
  [   48.579589]  ? destroy_sched_domains_rcu+0x30/0x30
  [   48.579589]  vfs_read+0x2b3/0x2f0
  [   48.579589]  ksys_read+0x8b/0xc0
  [   48.583590]  do_syscall_64+0x3d/0xc0
  [   48.583590]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
  [   48.586525] RIP: 0033:0x7f2f28703fa1
  [   48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0
  [   48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
  [   48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1
  [   48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006
  [   48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000
  [   48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
  [   48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0
  [   48.605527]  </TASK>

In the test, two processes are communicating through pipe. Further debugging
with strace found that the above splat is triggered as read() syscall could
not receive the data even if the corresponding write() syscall in another
process successfully wrote data into the pipe.

The failed subtest is "send_signal_perf". The corresponding perf event has
sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every
overflow event will trigger a call to the BPF program. So I suspect this may
overwhelm the system. So I increased the sample_period to 100,000 and the test
passed. The sample_period 10,000 still has the test failed.

In other parts of selftest, e.g., [1], sample_freq is used instead. So I
decided to use sample_freq = 1,000 since the test can pass as well.

  [1] https://lore.kernel.org/bpf/20240604070700.3032142-1-song@kernel.org/

Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240605201203.2603846-1-yonghong.song@linux.dev
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-23 23:20:23 +01:00
..
accounting Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
arch x86/insn: Fix PUSH instruction in x86 instruction decoder opcode map 2024-11-19 12:26:59 +01:00
bootconfig Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
bpf bpf: Fix potential integer overflow in resolve_btfids 2024-11-19 12:27:10 +01:00
build Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
cgroup Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
crypto Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
debugging Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
edid Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
firewire Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
firmware Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
gpio Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
hv Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
iio tools: iio: replace seekdir() in iio_generic_buffer 2024-11-19 09:23:15 +01:00
include hugetlb_encode.h: fix undefined behaviour (34 << 26) 2024-11-19 14:19:05 +01:00
io_uring Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kvm/kvm_stat Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
laptop Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
leds Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
lib libbpf: Fix no-args func prototype BTF dumping syntax 2024-11-23 23:20:16 +01:00
memory-model tools/memory-model: Fix bug in lock.cat 2024-11-23 23:20:13 +01:00
objtool exit: Rename module_put_and_exit to module_put_and_kthread_exit 2024-11-19 12:27:50 +01:00
pci Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
pcmcia Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
perf perf report: Fix condition in sort__sym_cmp() 2024-11-23 23:20:09 +01:00
power tools/power turbostat: Fix Bzy_MHz documentation typo 2024-11-19 11:32:44 +01:00
scripts Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
spi Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
testing selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT 2024-11-23 23:20:23 +01:00
thermal/tmon Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
time Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
usb Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
virtio Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
vm Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
wmi Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Makefile Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00