kernel_samsung_a53x/kernel
K Prateek Nayak d08554baac sched/core: Remove the unnecessary need_resched() check in nohz_csd_func()
[ Upstream commit ea9cffc0a154124821531991d5afdd7e8b20d7aa ]

The need_resched() check currently in nohz_csd_func() can be tracked
to have been added in scheduler_ipi() back in 2011 via commit
ca38062e57e9 ("sched: Use resched IPI to kick off the nohz idle balance")

Since then, it has travelled quite a bit but it seems like an idle_cpu()
check currently is sufficient to detect the need to bail out from an
idle load balancing. To justify this removal, consider all the following
case where an idle load balancing could race with a task wakeup:

o Since commit f3dd3f674555b ("sched: Remove the limitation of WF_ON_CPU
  on wakelist if wakee cpu is idle") a target perceived to be idle
  (target_rq->nr_running == 0) will return true for
  ttwu_queue_cond(target) which will offload the task wakeup to the idle
  target via an IPI.

  In all such cases target_rq->ttwu_pending will be set to 1 before
  queuing the wake function.

  If an idle load balance races here, following scenarios are possible:

  - The CPU is not in TIF_POLLING_NRFLAG mode in which case an actual
    IPI is sent to the CPU to wake it out of idle. If the
    nohz_csd_func() queues before sched_ttwu_pending(), the idle load
    balance will bail out since idle_cpu(target) returns 0 since
    target_rq->ttwu_pending is 1. If the nohz_csd_func() is queued after
    sched_ttwu_pending() it should see rq->nr_running to be non-zero and
    bail out of idle load balancing.

  - The CPU is in TIF_POLLING_NRFLAG mode and instead of an actual IPI,
    the sender will simply set TIF_NEED_RESCHED for the target to put it
    out of idle and flush_smp_call_function_queue() in do_idle() will
    execute the call function. Depending on the ordering of the queuing
    of nohz_csd_func() and sched_ttwu_pending(), the idle_cpu() check in
    nohz_csd_func() should either see target_rq->ttwu_pending = 1 or
    target_rq->nr_running to be non-zero if there is a genuine task
    wakeup racing with the idle load balance kick.

o The waker CPU perceives the target CPU to be busy
  (targer_rq->nr_running != 0) but the CPU is in fact going idle and due
  to a series of unfortunate events, the system reaches a case where the
  waker CPU decides to perform the wakeup by itself in ttwu_queue() on
  the target CPU but target is concurrently selected for idle load
  balance (XXX: Can this happen? I'm not sure, but we'll consider the
  mother of all coincidences to estimate the worst case scenario).

  ttwu_do_activate() calls enqueue_task() which would increment
  "rq->nr_running" post which it calls wakeup_preempt() which is
  responsible for setting TIF_NEED_RESCHED (via a resched IPI or by
  setting TIF_NEED_RESCHED on a TIF_POLLING_NRFLAG idle CPU) The key
  thing to note in this case is that rq->nr_running is already non-zero
  in case of a wakeup before TIF_NEED_RESCHED is set which would
  lead to idle_cpu() check returning false.

In all cases, it seems that need_resched() check is unnecessary when
checking for idle_cpu() first since an impending wakeup racing with idle
load balancer will either set the "rq->ttwu_pending" or indicate a newly
woken task via "rq->nr_running".

Chasing the reason why this check might have existed in the first place,
I came across  Peter's suggestion on the fist iteration of Suresh's
patch from 2011 [1] where the condition to raise the SCHED_SOFTIRQ was:

	sched_ttwu_do_pending(list);

	if (unlikely((rq->idle == current) &&
	    rq->nohz_balance_kick &&
	    !need_resched()))
		raise_softirq_irqoff(SCHED_SOFTIRQ);

Since the condition to raise the SCHED_SOFIRQ was preceded by
sched_ttwu_do_pending() (which is equivalent of sched_ttwu_pending()) in
the current upstream kernel, the need_resched() check was necessary to
catch a newly queued task. Peter suggested modifying it to:

	if (idle_cpu() && rq->nohz_balance_kick && !need_resched())
		raise_softirq_irqoff(SCHED_SOFTIRQ);

where idle_cpu() seems to have replaced "rq->idle == current" check.

Even back then, the idle_cpu() check would have been sufficient to catch
a new task being enqueued. Since commit b2a02fc43a1f ("smp: Optimize
send_call_function_single_ipi()") overloads the interpretation of
TIF_NEED_RESCHED for TIF_POLLING_NRFLAG idling, remove the
need_resched() check in nohz_csd_func() to raise SCHED_SOFTIRQ based
on Peter's suggestion.

Fixes: b2a02fc43a1f ("smp: Optimize send_call_function_single_ipi()")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241119054432.6405-3-kprateek.nayak@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-17 13:24:33 +01:00
..
bpf bpf: fix OOB devmap writes when deleting elements 2024-12-17 13:24:29 +01:00
cgroup cgroup/bpf: only cgroup v2 can be attached by bpf programs 2024-12-17 13:24:02 +01:00
configs Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
debug kdb: Use the passed prompt in kdb_position_cursor() 2024-11-23 23:20:16 +01:00
dma dma-debug: fix a possible deadlock on radix_lock 2024-12-17 13:24:30 +01:00
entry Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
events Revert "uprobes: Use kzalloc to allocate xol area" 2024-11-24 00:23:37 +01:00
futex futex: Don't include process MM in futex key on no-MMU 2024-11-18 11:42:47 +01:00
gcov gcov: add support for GCC 14 2024-11-19 14:19:10 +01:00
irq genirq/irqdesc: Honor caller provided affinity in alloc_desc() 2024-11-23 23:20:29 +01:00
kcsan kcsan: Turn report_filterlist_lock into a raw_spinlock 2024-12-17 13:24:29 +01:00
livepatch kallsyms: refactor {,module_}kallsyms_on_each_symbol 2024-11-19 12:27:32 +01:00
locking Revert "rtmutex: Drop rt_mutex::wait_lock before scheduling" 2024-11-24 00:23:36 +01:00
power power/wakelock: Add a timeout to wakelocks globally 2024-11-19 18:06:07 +01:00
printk printk: Silence useless system log spam 2024-11-19 18:04:40 +01:00
rcu rcu-tasks: Idle tasks on offline CPUs are in quiescent states 2024-12-17 13:23:58 +01:00
sched sched/core: Remove the unnecessary need_resched() check in nohz_csd_func() 2024-12-17 13:24:33 +01:00
time time: Fix references to _msecs_to_jiffies() handling of values 2024-12-17 13:24:00 +01:00
trace tracing: Use atomic64_inc_return() in trace_clock_counter() 2024-12-17 13:24:31 +01:00
acct.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
async.c async: Introduce async_schedule_dev_nocall() 2024-11-18 12:12:56 +01:00
audit.c audit: Send netlink ACK before setting connection in auditd_set 2024-11-18 12:13:09 +01:00
audit.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
audit_fsnotify.c fsnotify: make allow_dups a property of the group 2024-11-19 12:27:55 +01:00
audit_tree.c fsnotify: pass flags argument to fsnotify_alloc_group() 2024-11-19 12:27:55 +01:00
audit_watch.c fsnotify: pass flags argument to fsnotify_alloc_group() 2024-11-19 12:27:55 +01:00
auditfilter.c ima: Avoid blocking in RCU read-side critical section 2024-11-19 14:19:42 +01:00
auditsc.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
backtracetest.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
bounds.c bounds: Use the right number of bits for power-of-two CONFIG_NR_CPUS 2024-11-19 11:32:40 +01:00
capability.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
cfi.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
compat.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
configs.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
context_tracking.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
cpu.c cpu: Silence log spam when a CPU is brought up 2024-11-19 18:04:55 +01:00
cpu_pm.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
crash_core.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
crash_dump.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
cred.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
delayacct.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
dma.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
exec_domain.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
exit.c mm: optimize the redundant loop of mm_update_owner_next() 2024-11-19 14:19:41 +01:00
extable.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
fail_function.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
fork.c exec: Simplify unshare_files 2024-11-19 12:27:27 +01:00
freezer.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
gen_kheaders.sh kheaders: explicitly define file modes for archived headers 2024-11-19 14:19:30 +01:00
groups.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
hung_task.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
iomem.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
irq_work.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
jump_label.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kallsyms.c Revert "kallsyms: Make kallsyms_on_each_symbol generally available" 2024-11-24 00:22:59 +01:00
kcmp.c kcmp: In get_file_raw_ptr use task_lookup_fd_rcu 2024-11-19 12:27:27 +01:00
Kconfig.freezer Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Kconfig.hz Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Kconfig.locks Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Kconfig.preempt Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kcov.c kcov: don't lose track of remote references during softirqs 2024-11-19 14:19:10 +01:00
kexec.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kexec_core.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kexec_elf.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kexec_file.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kexec_internal.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kheaders.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kmod.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kprobes.c Revert "x86/ibt,ftrace: Search for __fentry__ location" 2024-11-24 00:23:31 +01:00
ksysfs.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kthread.c Revert "kthread: add kthread_work tracepoints" 2024-11-24 00:23:23 +01:00
latencytop.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Makefile kernel: Use the stock defconfig for /proc/config.gz 2024-06-15 16:20:14 -03:00
module-internal.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
module.c Revert "kallsyms: Make module_kallsyms_on_each_symbol generally available" 2024-11-24 00:22:59 +01:00
module_signature.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
module_signing.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
notifier.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
nsproxy.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
padata.c Revert "padata: Honor the caller's alignment in case of chunk_size 0" 2024-11-24 00:23:30 +01:00
panic.c panic: Flush kernel log buffer at the end 2024-11-19 09:23:11 +01:00
params.c params: lift param_set_uint_minmax to common code 2024-11-19 12:27:09 +01:00
pid.c kernel/pid.c: implement additional checks upon pidfd_create() parameters 2024-11-19 12:27:43 +01:00
pid_namespace.c zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING 2024-11-19 14:19:05 +01:00
profile.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
ptrace.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
range.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
reboot.c kernel/reboot: emergency_restart: Set correct system_state 2024-11-18 11:43:25 +01:00
regset.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
relay.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
resource.c Revert "resource: fix region_intersects() vs add_memory_driver_managed()" 2024-11-24 00:22:55 +01:00
rseq.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
scftorture.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
scs.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
seccomp.c seccomp: Invalidate seccomp mode to catch death failures 2024-11-18 22:25:35 +01:00
signal.c Revert "signal: Replace BUG_ON()s" 2024-11-24 00:23:07 +01:00
smp.c Revert "smp: Add missing destroy_work_on_stack() call in smp_call_on_cpu()" 2024-11-24 00:23:40 +01:00
smpboot.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
smpboot.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
softirq.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
stackleak.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
stacktrace.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
static_call.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
stop_machine.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
sys.c kernel/sys.c: implement custom uname override 2024-11-19 17:55:01 +01:00
sys_ni.c syscalls: fix compat_sys_io_pgetevents_time64 usage 2024-11-19 14:19:34 +01:00
sysctl-test.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
sysctl.c Revert "make sysctl constants read-only" 2024-12-03 19:57:26 +01:00
task_work.c task_work: Introduce task_work_cancel() again 2024-11-23 23:20:13 +01:00
taskstats.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
test_kprobes.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
torture.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
tracepoint.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
tsacct.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
ucount.c fanotify: configurable limits via sysfs 2024-11-19 12:27:37 +01:00
uid16.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
uid16.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
umh.c security: samsung: defex_lsm: nuke 2024-06-15 16:20:49 -03:00
up.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
user-return-notifier.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
user.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
user_namespace.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
usermode_driver.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
utsname.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
utsname_sysctl.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
watch_queue.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
watchdog.c watchdog: move softlockup_panic back to early_param 2024-11-18 11:43:21 +01:00
watchdog_hld.c watchdog/perf: properly initialize the turbo mode timestamp and rearm counter 2024-11-23 23:20:15 +01:00
workqueue.c Revert "workqueue: Make queue_rcu_work() use call_rcu_flush()" 2024-11-19 18:15:40 +01:00
workqueue_internal.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00