Commit graph

4384 commits

Author SHA1 Message Date
Sultan Alsawaf
debf1dee53 mbcache: Speed up cache entry creation
In order to prevent redundant entry creation by racing against itself,
mb_cache_entry_create scans through a large hash-list of all current
entries in order to see if another allocation for the requested new
entry has been made. Furthermore, it allocates memory for a new entry
before scanning through this hash-list, which results in that allocated
memory being discarded when the requested new entry is already present.
This happens more than half the time.

Speed up cache entry creation by keeping a small linked list of
requested new entries in progress, and scanning through that first
instead of the large hash-list. Additionally, don't bother allocating
memory for a new entry until it's known that the allocated memory will
be used.
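
A minimal sketch of the idea (names are illustrative, not the actual
patch; linux/list.h and linux/spinlock.h assumed): track in-flight
creations in a small list and consult it before allocating anything or
walking the full hash-list.

        struct mb_pending {
                struct list_head lnode;
                u32 key;
                u64 value;
        };

        static LIST_HEAD(mb_pending_list);
        static DEFINE_SPINLOCK(mb_pending_lock);

        /* Returns true if another creation of the same entry is in flight. */
        static bool mb_creation_in_progress(u32 key, u64 value)
        {
                struct mb_pending *p;
                bool busy = false;

                spin_lock(&mb_pending_lock);
                list_for_each_entry(p, &mb_pending_list, lnode) {
                        if (p->key == key && p->value == value) {
                                busy = true;
                                break;
                        }
                }
                spin_unlock(&mb_pending_lock);
                return busy;
        }

Only when this check comes back false does mb_cache_entry_create() need
to call kmem_cache_alloc() and fall through to the slower hash-list scan.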

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
2024-11-19 17:58:19 +01:00
Ksawlii
757f0039df ARM64: configs: enable Lazy RCU and regenerate 2024-11-19 17:58:10 +01:00
Tashfin Shakeer Rhythm
892398140b include/linux: lz4: Reduce LZ4 memory usage to 1KB
As per Sony, 1KB of memory for LZ4 is enough, as we only
use LZ4 for zRAM and our blocks are 4KB in size.

Therefore, reduce the LZ4 memory usage to 1KB. This can enhance
the speed of LZ4.
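
In include/linux/lz4.h, LZ4_MEMORY_USAGE follows an N -> 2^N bytes
formula (the stock default of 14 means 16KB), so the change presumably
boils down to:

        #define LZ4_MEMORY_USAGE 10     /* 2^10 = 1KB, down from 14 (16KB) */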

Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com>
Signed-off-by: Alexander Winkowski <dereference23@outlook.com>
2024-11-19 17:56:02 +01:00
Akinobu Mita
681fcbd50c batman-adv: fix random jitter calculation
[ Upstream commit 143cdd8f33909ff5a153e3f02048738c5964ba26 ]

batadv_iv_ogm_emit_send_time() attempts to calculate a random integer
in the range 'orig_interval +- BATADV_JITTER' with the lines below.

        msecs = atomic_read(&bat_priv->orig_interval) - BATADV_JITTER;
        msecs += (random32() % 2 * BATADV_JITTER);

But it actually gets 'orig_interval' or 'orig_interval - BATADV_JITTER',
because '%' and '*' have the same precedence and associate
left-to-right.

This adds parentheses at the appropriate position so that the code
matches the original intention.
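
With the parentheses in the intended position, the second line becomes:

        msecs += (random32() % (2 * BATADV_JITTER));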

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Antonio Quartulli <ordex@autistici.org>
Cc: Marek Lindner <lindner_marek@yahoo.de>
Cc: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Cc: Antonio Quartulli <ordex@autistici.org>
Cc: b.a.t.m.a.n@lists.open-mesh.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Carlos Jimenez (JavaShin-X) <javashin1986@gmail.com>
2024-11-19 17:55:48 +01:00
Jesse Chan
1b01b9bda3 f2fs: Enlarge min_fsync_blocks to 20
In OPPO's kernel:
enlarge min_fsync_blocks to optimize performance
  - yanwu@TECH.Storage.FS.oF2FS, 2019/08/12

Huawei is also doing this in their production kernel.

If this optimization is good for them and shipped
with their devices, it should be good for us.
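
Assuming the stock default of 8, the change presumably amounts to this
one-liner in fs/f2fs/f2fs.h:

        #define DEF_MIN_FSYNC_BLOCKS    20      /* was 8 */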

Signed-off-by: Jesse Chan <jc@linux.com>
Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>
2024-11-19 17:55:45 +01:00
Park Ju Hyung
414da43f9f f2fs: set ioprio of GC kthread to idle
GC should run as conservatively as possible to reduce latency spikes for the user.

Setting the ioprio to the idle class allows the kernel to schedule the GC
thread's I/O so that it does not affect any other process's I/O requests.
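
A sketch of how the GC kthread could drop its own I/O priority at the
start of gc_thread_func() (the exact call site is an assumption):

        /* Schedule this thread's I/O in the idle class */
        set_task_ioprio(current, IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0));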

Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
2024-11-19 17:55:41 +01:00
Park Ju Hyung
008f8557c0 f2fs: reduce timeout for uncongestion
On high fs utilization, congestion is hit quite frequently, and waiting for a
whopping 20ms is too expensive, especially on critical paths.

Reduce it to an amount that is unlikely to affect UI rendering paths.

The new times are as follows:
  100 Hz  => 1 jiffy   (effective: 10 ms)
  250 Hz  => 2 jiffies (effective: 8 ms)
  300 Hz  => 2 jiffies (effective: 6 ms)
  1000 Hz => 6 jiffies (effective: 6 ms)
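
These effective values are what a ~6 ms request rounded up to whole
jiffies produces, i.e. the wait presumably ends up looking like:

        congestion_wait(BLK_RW_ASYNC, msecs_to_jiffies(6));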

Co-authored-by: Danny Lin <danny@kdrag0n.dev>
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
2024-11-19 17:55:37 +01:00
Juhyung Park
8c11745023 kernel/sys.c: implement custom uname override
The uname system call will return CONFIG_UNAME_OVERRIDE_STRING in struct
new_utsname->release when it is called by a process whose cmdline contains
CONFIG_UNAME_OVERRIDE_TARGET.
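
A minimal sketch of such an override hooked into sys_newuname() (the
layout below is an assumption, not the actual patch):

        SYSCALL_DEFINE1(newuname, struct new_utsname __user *, name)
        {
                struct new_utsname tmp;
                char cmdline[256] = { 0 };

                down_read(&uts_sem);
                memcpy(&tmp, utsname(), sizeof(tmp));
                up_read(&uts_sem);

                /* Spoof the release string for matching processes only */
                if (get_cmdline(current, cmdline, sizeof(cmdline) - 1) > 0 &&
                    strstr(cmdline, CONFIG_UNAME_OVERRIDE_TARGET))
                        strscpy(tmp.release, CONFIG_UNAME_OVERRIDE_STRING,
                                sizeof(tmp.release));

                return copy_to_user(name, &tmp, sizeof(tmp)) ? -EFAULT : 0;
        }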

Signed-off-by: Juhyung Park <qkrwngud825@gmail.com>
2024-11-19 17:55:01 +01:00
Sultan Alsawaf
d9e7f45cc4 arm64: Disable GENERIC_IRQ_EFFECTIVE_AFF_MASK
The effective affinity mask causes a lot of bugs by virtue of many
set_irq_affinity handlers only setting an effective affinity mask for an
IRQ's parent but not the IRQ itself. Since this is a widespread issue that
would require manual fixing on every different SoC, just disable the
effective affinity mask altogether and use the first CPU in the configured
affinity mask.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
2024-11-19 17:54:22 +01:00
Sultan Alsawaf
01b5714d66 arm64: Align __arch_clear_user() to 16 bytes as per upstream
With significant code churn, the 'read' result from callbench can
regress from 4000 ns to 7000 ns, despite no changes directly affecting
the code paths exercised by callbench. This can also happen when playing
with compiler options that affect the kernel's size.

Upon further investigation, it turns out that /dev/zero, which callbench
uses for its file benchmarks, makes heavy use of clear_user() when
accessed via read(). When the regression occurs, __arch_clear_user()
goes from being 16-byte aligned to being 4-byte aligned.

A recent upstream change to arm64's clear_user() implementation, commit
344323e0428b ("arm64: Rewrite __arch_clear_user()"), mentions this:
  Apparently some folks examine large reads from /dev/zero closely enough
  to notice the loop being hot, so align it per the other critical loops
  (presumably around a typical instruction fetch granularity).

As such, make __arch_clear_user() 16-byte aligned to fix the regression
and match upstream.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
2024-11-19 17:54:00 +01:00
Sultan Alsawaf
9bf1253932 selinux: Remove audit dependency
Auditing comes with a lot of overhead due to string assembly via
vsnprintf. It isn't actually needed to make SELinux work, so remove
SELinux's artificial dependency on it to make it possible to use SELinux
without the unneeded overhead.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
2024-11-19 17:53:57 +01:00
Danny Lin
f590674f1e uid_sys_stats: Remove dependency on the profiling subsystem
Now that we have a simple task exit notifier system to notify
uid_sys_stats when tasks exit independently of the profiling subsystem,
remove this unnecessary dependency.

Test: /proc/uid_cputime shows valid stats with profiling disabled
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
2024-11-19 17:53:52 +01:00
Sultan Alsawaf
3a03f46735 mm: Disable proactive compaction by default
On-demand compaction works fine assuming that you don't have a need to
spam the page allocator nonstop for large order page allocations.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
2024-11-19 17:53:38 +01:00
Sultan Alsawaf
c3807312b1 iommu: pcie: Fix incorrect kmemleak_ignore() usage
kmemleak_ignore() shouldn't be used on the gen_pool allocations since they
aren't slab allocations. This leads to a flurry of warnings from kmemleak;
fix it by using kmemleak_ignore() correctly.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
2024-11-19 17:53:28 +01:00
Sultan Alsawaf
2f43de3476 dma-buf/sync_file: Speed up ioctl by omitting debug names
A lot of CPU time is wasted on allocating, populating, and copying
debug names back and forth with userspace when they're not actually
needed. We can't just remove the name buffers from the various sync data
structures, though, because we must preserve ABI compatibility with
userspace; instead, we can just pretend the name fields of the
user-shared structs aren't there. This massively reduces the size of the
memory allocated for these data structures and the amount of data passed
to and from userspace, and it eliminates a kzalloc() entirely from
sync_file_ioctl_fence_info(), thus improving graphics performance.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
2024-11-19 17:53:23 +01:00
Sultan Alsawaf
07a5ef1eeb qos: Don't disable interrupts while holding pm_qos_lock
None of the pm_qos functions actually run in interrupt context; if some
driver calls pm_qos_update_target in interrupt context then it's already
broken. There's no need to disable interrupts while holding pm_qos_lock,
so don't do it.
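
In essence, the critical sections around pm_qos_lock change from the
irqsave variants to plain locking (sketch):

        /* before: interrupts needlessly disabled */
        spin_lock_irqsave(&pm_qos_lock, flags);
        /* ... update the qos target ... */
        spin_unlock_irqrestore(&pm_qos_lock, flags);

        /* after: a plain spinlock is enough in process context */
        spin_lock(&pm_qos_lock);
        /* ... update the qos target ... */
        spin_unlock(&pm_qos_lock);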

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
2024-11-19 17:53:07 +01:00
Nahuel Gómez
27fe6f89a2 kernel: sched: ems: drop usage of SCHED_FEAT
We removed this.

../kernel/sched/ems/core.c:1370:23: error: use of undeclared identifier 'sched_feat_names'
 1370 |         index = match_string(sched_feat_names, __SCHED_FEAT_NR, "TTWU_QUEUE");
      |                              ^
../kernel/sched/ems/core.c:1370:41: error: use of undeclared identifier '__SCHED_FEAT_NR'
 1370 |         index = match_string(sched_feat_names, __SCHED_FEAT_NR, "TTWU_QUEUE");
      |                                                ^
../kernel/sched/ems/core.c:1372:23: error: use of undeclared identifier 'sched_feat_keys'
 1372 |                 static_key_disable(&sched_feat_keys[index]);
      |                                     ^
../kernel/sched/ems/core.c:1373:3: error: use of undeclared identifier 'sysctl_sched_features'; did you mean 'sysctl_sched_latency'?
 1373 |                 sysctl_sched_features &= ~(1UL << index);
      |                 ^~~~~~~~~~~~~~~~~~~~~
      |                 sysctl_sched_latency
../include/linux/sched/sysctl.h:29:21: note: 'sysctl_sched_latency' declared here
   29 | extern unsigned int sysctl_sched_latency;
      |                     ^
4 errors generated.

Signed-off-by: Nahuel Gómez <nahuelgomez329@gmail.com>
2024-11-19 17:52:14 +01:00
Ksawlii
89efaaeccf ARM64: configs: disable ZRAM_LRU_WRITEBACK 2024-11-19 17:51:54 +01:00
Ruchit
c94f14266e zram: Protect handle_decomp_fail behind a check
The previous definitions, as well as the creation of this, are locked behind CONFIG_ZRAM_LRU_WRITEBACK as well.

Change-Id: I869b5595f69cc481e93ca6862b460594762d9b25
Signed-off-by: Ruchit <risenid@duck.com>
2024-11-19 17:50:10 +01:00
Nahuel Gómez
2cb2ac56fc drivers: zram: also guard lzo_marker
../drivers/block/zram/zram_drv.c:62:22: error: unused variable 'lzo_marker' [-Werror,-Wunused-variable]
   62 | static unsigned char lzo_marker[4] = {0x11, 0x00, 0x00};
      |                      ^~~~~~~~~~
1 error generated.
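
The guard presumably mirrors the one around the variable's users:

        #ifdef CONFIG_ZRAM_LRU_WRITEBACK
        static unsigned char lzo_marker[4] = {0x11, 0x00, 0x00};
        #endif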

Signed-off-by: Nahuel Gómez <nahuelgomez329@gmail.com>
2024-11-19 17:49:47 +01:00
flar2
d1c1915a6c mmc: Disable crc check
Signed-off-by: flar2 <asegaert@gmail.com>
2024-11-19 17:47:04 +01:00
Pzqqt
8f30152c01 drivers: scsi: Reduce logspam 2024-11-19 17:47:00 +01:00
Pzqqt
1c7f2b3800 drivers: staging: Import Xiaomi's binder prio driver
- From branch: `liuqin-t-oss`

Signed-off-by: Pzqqt <821026875@qq.com>
2024-11-19 17:46:55 +01:00
jonascardoso
a37de4bafa slub: Optimized SLUB Memory Allocator
(cherry picked from commit 110e6c989068385cc84f71bb02bfda2b58e56a0f)
Signed-off-by: rk134 <rahul-k@bigdi.cc>
Signed-off-by: priiii1808 <priyanshusinghal0818@gmail.com>
2024-11-19 17:44:40 +01:00
Sultan Alsawaf
b8eba3b6e6 mm: kmemleak: Don't die when memory allocation fails
When memory is leaking, it's going to be harder to allocate more memory,
making it more likely for this failure condition inside of kmemleak to
manifest itself. This is extremely frustrating since kmemleak kills
itself upon the first instance of memory allocation failure.

Bypass that and make kmemleak more resilient when memory is running low.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: priiii1808 <priyanshusinghal0818@gmail.com>
2024-11-19 17:44:35 +01:00
Diab Neiroukh
1c19be24ee mm: oom_kill: Reduce some verbose logging
Signed-off-by: engstk <eng.stk@sapo.pt>
2024-11-19 17:44:31 +01:00
UtsavBalar1231
8ed372cd67 mm: page_alloc: Hardcode min_free_kbytes to 32768 kb
Change-Id: I08355acd995e956c63cc0d3f1587604e39f91269
Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>
2024-11-19 17:44:24 +01:00
Sultan Alsawaf
1dca369959 mm: Don't hog the CPU and zone lock in rmqueue_bulk()
There is noticeable scheduling latency and heavy zone lock contention
stemming from rmqueue_bulk's single hold of the zone lock while doing
its work, as seen with the preemptoff tracer. There's no actual need for
rmqueue_bulk() to hold the zone lock the entire time; it only does so
for supposed efficiency. As such, we can relax the zone lock and even
reschedule when IRQs are enabled in order to keep the scheduling delays
and zone lock contention at bay. Forward progress is still guaranteed,
as the zone lock can only be relaxed after page removal.

With this change, rmqueue_bulk() no longer appears as a serious offender
in the preemptoff tracer, and system latency is noticeably improved.
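
A sketch of the relaxation point inside rmqueue_bulk()'s loop, after a
page has already been taken off the free list (exact placement is an
assumption):

        /* Let other CPUs at the zone lock, and reschedule if we may */
        spin_unlock(&zone->lock);
        if (!irqs_disabled())
                cond_resched();
        spin_lock(&zone->lock);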

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
2024-11-19 17:44:18 +01:00
Juhyung Park
cfd1b6ca17 zsmalloc: backport from 5994eabf3bbb
Backport zsmalloc from commit 5994eabf3bbb ("merge mm-hotfixes-stable into
mm-stable to pick up depended-upon changes").

Signed-off-by: Juhyung Park <qkrwngud825@gmail.com>
2024-11-19 17:44:14 +01:00
Ben Gardon
e9933557cb locking/rwlocks: Add contention detection for rwlocks
rwlocks do not currently have any facility to detect contention
like spinlocks do. In order to allow users of rwlocks to better manage
latency, add contention detection for queued rwlocks.
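
The detection roughly amounts to peeking at the qrwlock's internal wait
queue and exposing it through a generic wrapper (sketch):

        /* asm-generic/qrwlock.h: waiters queue on wait_lock when contended */
        static inline int queued_rwlock_is_contended(struct qrwlock *lock)
        {
                return arch_spin_is_locked(&lock->wait_lock);
        }

        /* linux/rwlock.h */
        #define rwlock_is_contended(lock) \
                arch_rwlock_is_contended(&(lock)->raw_lock)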

CC: Ingo Molnar <mingo@redhat.com>
CC: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ben Gardon <bgardon@google.com>
Message-Id: <20210202185734.1680553-7-bgardon@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-11-19 17:44:08 +01:00
Minchan Kim
f19a9560cc locking/rwlocks: introduce write_lock_nested
This is in preparation for converting bit_spin_lock to rwlock in zsmalloc
so that multiple writers of zspages can run at the same time. Those
zspages are supposed to be different zspage instances, so this is not a
deadlock.  This patch adds write_lock_nested to support the case for
LOCKDEP.
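
A hypothetical usage example (the zspage field access is illustrative):
nest a second rwlock of the same lock class without a lockdep splat.

        write_lock(&src_zspage->lock);
        write_lock_nested(&dst_zspage->lock, SINGLE_DEPTH_NESTING);
        /* ... move objects between the two zspages ... */
        write_unlock(&dst_zspage->lock);
        write_unlock(&src_zspage->lock);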

[minchan@kernel.org: fix write_lock_nested for RT]
  Link: https://lkml.kernel.org/r/YZfrMTAXV56HFWJY@google.com
[bigeasy@linutronix.de: fixup write_lock_nested() implementation]
  Link: https://lkml.kernel.org/r/20211123170134.y6xb7pmpgdn4m3bn@linutronix.de

Link: https://lkml.kernel.org/r/20211115185909.3949505-8-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-11-19 17:44:05 +01:00
Sultan Alsawaf
d4bbaf5715 sched/core: Forbid Unity-based games from changing their CPU affinity
Unity-based games (such as Wild Rift) like to shoot themselves in the foot
by setting a nonsense CPU affinity, restricting the game to a narrow set of
CPU cores that it thinks are the "big" cores in a heterogeneous CPU. It
assumes that CPUs only have two performance domains (clusters), and
therefore royally mucks up games' CPU affinities on CPUs which have more
than two performance domains.

Check if a setaffinity target task is part of a Unity-based game and
silently ignore the setaffinity request so that it can't sabotage itself.
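
A sketch of the check early in sched_setaffinity() (the task-name match
is an assumption):

        /* Pretend success without touching the game's affinity */
        if (unlikely(!strcmp(p->comm, "UnityMain")))
                return 0;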

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
2024-11-19 17:43:59 +01:00
ztc1997
136bbfd757 block: Add default I/O scheduler option 2024-11-19 17:43:55 +01:00
Paolo Valente
fbbabdb3bc block, bfq: use half slice_idle as a threshold to check short ttime
The value of the I/O plugging (idling) timeout is used also as the
think-time threshold to decide whether a process has a short think
time.  In this respect, a good value of this timeout for rotational
drives is on the order of several ms. Yet, this is often too long a
time interval to be effective as a think-time threshold. This commit
mitigates this problem (by a lot, according to tests), by halving the
threshold.
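
Roughly, in bfq_update_has_short_ttime() the think-time mean is now
compared against half of the idling timeout (sketch):

        if (bfq_sample_valid(bfqq->ttime.ttime_samples) &&
            bfqq->ttime.ttime_mean > bfqd->bfq_slice_idle >> 1)
                has_short_ttime = false;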

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit b5f74ecacc3139ef873e69acc3aba28083ecc416)
(cherry picked from commit b1511c438e8a5668e6be04ad9107d6695332756c)
(cherry picked from commit 389992d9dc78340676248d0f01c7569b3db950ed)
(cherry picked from commit 49919eface6f4391cda0e77bcaad3e2786cbbab3)
(cherry picked from commit 87b015de51122ea9b5d9e56b846ae945db8444f0)
(cherry picked from commit 6ada34cdc94c89e97926a2d001412ecc027e1392)
(cherry picked from commit 2782bcc2919dd2a0a1d461d36c22338e67bc6327)
2024-11-19 17:43:46 +01:00
Paolo Valente
fe945719eb block, bfq: increase time window for waker detection
Tests on slower machines showed the current window to be way too
small. This commit increases it.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit ab1fb47e33dc7754a7593181ffe0742c7105ea9a)
(cherry picked from commit 0d1663f1922c5f6fb3a4b3cc5a3a861c765a3704)
(cherry picked from commit 85d9e1637a38d0cfdeba4e3847f1797dcd18da5d)
(cherry picked from commit 6bd707bb9a60e2bf0e680a271208f6c82a331571)
(cherry picked from commit 43755e08d048ccd6f3b2a3bbd34bea4a71c5bc12)
(cherry picked from commit b1a8cce9e99277ce53da20ab603473ad6c3e95d1)
(cherry picked from commit 74d27133a3261a296ddd98e9ff09d89bfab797bb)
2024-11-19 17:43:43 +01:00
Paolo Valente
7034a03ec0 block, bfq: do not raise non-default weights
BFQ heuristics try to detect interactive I/O, and raise the weight of
the queues containing such I/O. Yet, if the user also changes the
weight of a queue (i.e., changes the ioprio of the process
associated with that queue), then it is most likely better to prevent
the BFQ heuristics from silently changing that same weight.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 91b896f65d32610d6d58af02170b15f8d37a7702)
(cherry picked from commit cbbd2f045e60073978fe1b721c0953cd8762ecbb)
(cherry picked from commit 88b650c71f7d0d30ac2fa215a139d7a48d069cd9)
(cherry picked from commit 9a4725f0341c71a9b4f50f2d203f9740029e42e5)
(cherry picked from commit a2c57345ffa5404cefd3d43e2fd4e4492ac7c6e0)
(cherry picked from commit df56458ca85c681d163d879b832f868ed5044c8e)
(cherry picked from commit dfc085aad98db2bcabd2c438fcd722a90303e6cb)
2024-11-19 17:43:40 +01:00
Paolo Valente
4b23f1e69b block, bfq: do not expire a queue when it is the only busy one
This commit preserves I/O-dispatch plugging for a special symmetric
case that may suddenly turn into asymmetric: the case where only one
bfq_queue, say bfqq, is busy. In this case, not expiring bfqq does not
cause any harm to any other queues in terms of service guarantees. In
contrast, it avoids the following unlucky sequence of events: (1) bfqq
is expired, (2) a new queue with a lower weight than bfqq becomes busy
(or more queues), (3) the new queue is served until a new request
arrives for bfqq, (4) when bfqq is finally served, there are so many
requests of the new queue in the drive that the pending requests for
bfqq take a lot of time to be served. In particular, event (2) may
cause even already dispatched requests of bfqq to be delayed inside
the drive. So, to avoid this series of events, the scenario is
preventively declared asymmetric also if bfqq is the only busy
queue. By doing so, I/O-dispatch plugging is performed for bfqq.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 2391d13ed484df1515f0025458e1f82317823fab)
(cherry picked from commit 79827eb41d8fb0f838a2c592775a8e63caeb7c57)
(cherry picked from commit 41720669259995fb7f064fc0f988c9d228750b37)
(cherry picked from commit 07d273c955ea2c34a42f6de0f1e3f1bfb00c6ce1)
(cherry picked from commit 8034c856b8fcafbef405eedddc12bb0625e52a42)
(cherry picked from commit f49083d304bda30647196b550a109f528c8266dc)
(cherry picked from commit 8a597f0ab5e7e83bfa426d071185c3d3ce5fa535)
2024-11-19 17:43:34 +01:00
Paolo Valente
5238084cd8 block, bfq: save also injection state on queue merging
To prevent injection information from being lost on bfq_queue merging,
also the amount of service that a bfq_queue receives must be saved and
restored when the bfq_queue is merged and split, respectively.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 5a5436b98d5cd2714feaaa579cec49dd7f7057bb)
(cherry picked from commit 9372e98dc77c7f2ebbb808a60abb01f30d70d0bc)
(cherry picked from commit e6a5b66cfe56495f26182cfd2340e3336bb4b2b4)
(cherry picked from commit c579a3634d163ed05cc4ac258411f03db969926e)
(cherry picked from commit 359f87d07390f687634185b0dd9d6f106fb5afdd)
(cherry picked from commit d1d1f1336ed77b83e98d26175e196b45a28958f4)
(cherry picked from commit 0ff8068594640924e0cffe27d8b0273bb80d74ca)
2024-11-19 17:43:15 +01:00
Paolo Valente
0769622634 block, bfq: save also weight-raised service on queue merging
To prevent weight-raising information from being lost on bfq_queue merging,
also the amount of service that a bfq_queue receives must be saved and
restored when the bfq_queue is merged and split, respectively.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit e673914d52f913584cc4c454dfcff2e8eb04533f)
(cherry picked from commit 48f3cf9bb6ae73de3e8e6cad2e50c6e70a6cd33f)
(cherry picked from commit d947cf3f8bcbcbe2dd8f5eec82e83a35198f874b)
(cherry picked from commit 39b91f1f22265c70cdc48916ac694dad6c21c191)
(cherry picked from commit 421c82648e46467d29dc0b5cd5522f00a026083d)
(cherry picked from commit e9eecde7c67303c1dc87864c10c372019d609b0b)
(cherry picked from commit 41d4c63679c36dc63b4cc9be301ec8d8d518d33f)
2024-11-19 17:43:10 +01:00
Paolo Valente
5267faf794 block, bfq: fix switch back from soft-rt weight-raising
A bfq_queue may happen to be deemed as soft real-time while it is
still enjoying interactive weight-raising. If this happens because of
a false positive, then the bfq_queue is likely to lose its soft
real-time status soon. Upon losing such a status, the bfq_queue must
get back its interactive weight-raising, if its interactive period is
not over yet. But this case is not handled. This commit corrects this
error.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit d1f600fa4732dac36c71a03b790f0c829a076475)
(cherry picked from commit db891a7d6aed6cc37d681d2bbf6c9bd697059281)
(cherry picked from commit 647b877a9a8493df84a1d4abd94be089c8fed49b)
(cherry picked from commit 7eda6de0bbbfa1d05b8888b697d9b7aeffe4d64e)
(cherry picked from commit c1e076d9f4688c77dfa0f859060ae1f27a8d889e)
(cherry picked from commit db0058abb7534aeb0abebe01c65659aa3886de78)
(cherry picked from commit 40bc06529a2053ca0caf2053dd6f2a27bf7af916)
2024-11-19 17:42:58 +01:00
Paolo Valente
8b47ef547b block, bfq: re-evaluate convenience of I/O plugging on rq arrivals
Upon an I/O-dispatch attempt, BFQ may detect that it was better to
plug I/O dispatch, and to wait for a new request to arrive for the
currently in-service queue. But the arrival of a new request for an
empty bfq_queue, and thus the switch from idle to busy of the
bfq_queue, may cause the scenario to change, and make plugging no
longer needed for service guarantees, or more convenient for
throughput. In this case, keeping I/O-dispatch plugged would certainly
lower throughput.

To address this issue, this commit makes such a check, and stops
plugging I/O when plugging is no longer beneficial.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 7f1995c27b19060dbdff23442f375e3097c90707)
(cherry picked from commit 12ec5a8ca2486d06f880d41751383c0d9549ba49)
(cherry picked from commit 64c6efc5ccb01edf553487aff312c0b7110cb30f)
(cherry picked from commit 3e04c1949f447a8166fa6d6343bd5332d8c12a4b)
(cherry picked from commit 40a263c36cf2094311e8189b6e9173360a808b12)
(cherry picked from commit 61a02ce46503671c747e550a13972ca8abaf5030)
(cherry picked from commit 3707ff2d32dccd807b8e5e6885f07f3874c71180)
2024-11-19 17:42:55 +01:00
Pavel Begunkov
f029d24207 splice: don't generate zero-len segment bvecs
iter_file_splice_write() may spawn bvec segments with zero length. In
preparation for prohibiting them, filter them out by hand at the splice level.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 0f1d344feb534555a0dcd0beafb7211a37c5355e)
(cherry picked from commit 4c72fdc13bd20d10f59b8145627312814583a945)
(cherry picked from commit cba6a18da1cc8144a07ba6a4b03e8e8dc8d24428)
(cherry picked from commit 54a17499483118cd3c92feb747c88207ce30e9ce)
(cherry picked from commit 4dec661d05c16a8e62dd833262ff68ce3e466770)
(cherry picked from commit fe99d86b681099f662b2b01155b02b8476ff428d)
(cherry picked from commit aa033460cd26157fe81e829e4744b3396a09860b)
2024-11-19 17:42:24 +01:00
Pavel Begunkov
3c61c6aa45 bvec/iter: disallow zero-length segment bvecs
Zero-length bvec segments are allowed in general, but they are not handled
by bio and further down the block layer, so they get filtered out. This
inconsistency may be confusing and prevents optimisations. As zero-length
segments are useless and the places that were generating them have been
patched, declare them not allowed.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 9b2e0016d04c6542ace0128eb82ecb3b10c97e43)
(cherry picked from commit 87afbd40acbb99860f846ad6f199e62e93be96c2)
(cherry picked from commit f0677085687d50b5ecd6e7a2e19e4aff23251cb6)
(cherry picked from commit affb154c088db678d4a541f8a4080fa5088cb10b)
(cherry picked from commit 9b383b80e8432af1d0421acf9287076db26996d7)
(cherry picked from commit f643066fcac50220888ecfe9b86c5d895d621648)
(cherry picked from commit d2f588cf9664d76f78287142f505e4f375503ae6)
2024-11-19 17:42:21 +01:00
Christoph Hellwig
8ae63d0654 target/file: allocate the bvec array as part of struct target_core_file_cmd
This saves one memory allocation, and ensures the bvecs aren't freed
before the AIO completion.  This will allow the lower level code to be
optimized so that it can avoid allocating another bvec array.
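
Roughly, the bvec array becomes a flexible-array member of the
per-command structure and is allocated together with it (sketch):

        struct target_core_file_cmd {
                unsigned long   len;
                struct se_cmd   *cmd;
                struct kiocb    iocb;
                struct bio_vec  bvecs[];
        };

        aio_cmd = kmalloc(struct_size(aio_cmd, bvecs, sgl_nents), GFP_KERNEL);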

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit ecd7fba0ade1d6d8d49d320df9caf96922a376b2)
(cherry picked from commit 272d2ea22b0e3da786a506896e36d3a586e6c252)
(cherry picked from commit 83ff0aa1cc08c329feb0748c575810b3ce8c0077)
(cherry picked from commit d0dc27fcc3f57d556ce4468a060e54f25c7b91b0)
(cherry picked from commit 847a30a99fc4b11c9e6cf2ec049ca20a6da9c769)
(cherry picked from commit 3799ad215edeb9276c4d16150a33de916cfa4ea1)
(cherry picked from commit ee8f417b3276049e4f0bbadf4c4524f071de2361)
2024-11-19 17:42:15 +01:00
Pavel Begunkov
f6172ea41b iov_iter: optimise bvec iov_iter_advance()
iov_iter_advance() is heavily used, but implemented through generic
means. For bvecs there is a specifically crafted function for that, so
use bvec_iter_advance() instead; it's faster and slimmer.
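
The gist is an early dispatch for bvec-backed iterators (sketch; the
helper name is an assumption):

        void iov_iter_advance(struct iov_iter *i, size_t size)
        {
                if (unlikely(iov_iter_is_bvec(i))) {
                        /* specialised, bvec_iter_advance()-based path */
                        iov_iter_bvec_advance(i, size);
                        return;
                }
                /* ... generic handling for iovec/kvec/pipe ... */
        }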

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 54c8195b4ebe10af66b49ab9c809bc16939555fc)
(cherry picked from commit 8cac76228025fb022b1bb15e100efae8acde0425)
(cherry picked from commit c8b0dff6b5ac38ff23605bdae1c5bf62766d0fa3)
(cherry picked from commit 5bbff4ddbd3f87ddb409753269fa933109a99a7f)
(cherry picked from commit 689d9157a0b58f95cb2641a17226b023a1fb226a)
(cherry picked from commit 0df724cafe05ae311556249c7df0c2cd00e05007)
(cherry picked from commit ba5d942df07c03782ab2aa2b2dd1f7b96b3b5c52)
2024-11-19 17:42:10 +01:00
Jan Kara
b93af2c415 bfq: Use 'ttime' local variable
Use local variable 'ttime' instead of dereferencing bfqq.

Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 28c6def009192b673f92ea357dfb535ba15e00a4)
(cherry picked from commit bb2a213aa0a2b717c3a6e7848c6f82656d80897f)
(cherry picked from commit 2e0cfffb9a6da88cb1a786fb95618bfa714fea32)
(cherry picked from commit caff780963fdfda0ab456c24027298482d745b2f)
(cherry picked from commit b893b660ea8e998b760d48faeed2834e483158ad)
(cherry picked from commit 7e3d952af5fdcf6b02d01d55dbf658fbc2d67f41)
(cherry picked from commit 033b49f66e3808fead9e65e7c9417f26d423374f)
2024-11-19 17:42:05 +01:00
Joseph Qi
fdcb87e105 block/bfq: update comments and default value in docs for fifo_expire
Correct the comments, since bfq_fifo_expire[0] is for async requests,
while bfq_fifo_expire[1] is for sync requests.
Also update the docs: according to the source code, the default
fifo_expire_async is 250ms and fifo_expire_sync is 125ms.

Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 4168a8d27ed3a00f160e7f885c956f060d2a0741)
(cherry picked from commit a31ff2eb7d7cfa8331e513bb282f304117f18a77)
(cherry picked from commit a78637befaa4106f9858b3ad8e3273960d3de82b)
(cherry picked from commit bd8e7d3845c7a3b602aee361c7e3d0b5764ce060)
(cherry picked from commit a8543954accfadbb9a1cf1f64c6b3749ee3a629b)
(cherry picked from commit 960981f44b77dcd0d4e786aaef72d39057ccfc03)
(cherry picked from commit 50cfb4b6c1c2e4a3778f66510fee7a2e86e053f2)
2024-11-19 17:41:49 +01:00
Paolo Valente
8eb5a42575 block, bfq: always inject I/O of queues blocked by wakers
Suppose that I/O dispatch is plugged, to wait for new I/O for the
in-service bfq-queue, say bfqq.  Suppose then that there is a further
bfq_queue woken by bfqq, and that this woken queue has pending I/O. A
woken queue does not steal bandwidth from bfqq, because it soon runs
out of I/O if bfqq is not served. So there is virtually no risk
of loss of bandwidth for bfqq if this woken queue has I/O dispatched
while bfqq is waiting for new I/O. In contrast, this extra I/O
injection boosts throughput. This commit performs this extra
injection.

Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Link: https://lore.kernel.org/r/20210304174627.161-2-paolo.valente@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 2ec5a5c48373d4bc2f0699f86507a65bf0b9df35)
(cherry picked from commit 0750db9767232fc2e4850868e526f4b02ecfb247)
(cherry picked from commit 8676f43249bbb0478a8b18bd87703da59902dbfd)
(cherry picked from commit df655d250f253a2f8a6792569108f30a04b7b894)
(cherry picked from commit d76168c1c3805a2c948e7ff60c8eb341e2ff0013)
(cherry picked from commit f213ae4e575f8ed67ae065fe80d06dc957f0b068)
(cherry picked from commit eb1ff3ab6d66081fbaf007c6cfc1a5e841719c0c)
2024-11-19 17:41:42 +01:00
Jan Kara
9abe5bf065 bfq: Provide helper to generate bfqq name
Instead of having a helper formatting the bfqq pid, provide a helper to
generate the full bfqq name as used in the traces. It saves some code
duplication and will save more in the coming tracepoints.

Acked-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20211125133645.27483-6-jack@suse.cz
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 582f04e19ad7b41df993c669805e48a01bcd9c5b)
(cherry picked from commit e030e88a4c2e220366f3db1af33d72d9638f93b5)
(cherry picked from commit e925a5fdce15f914ec2386b03bf64242792acce0)
(cherry picked from commit 9265a0e6952305932aa2b5caf2183387859dcfce)
(cherry picked from commit 41794de36673c11faca8c57625dfa50b76edde20)
(cherry picked from commit 5e830976b50a9f0a2c927b02f921f0d6ae796183)
(cherry picked from commit b5344876556e4a62cac7905bf11ca7ccf8d16d6d)
2024-11-19 17:41:18 +01:00
Yahu Gao
1297c45dcc block/bfq_wf2q: correct weight to ioprio
The return value is ioprio * BFQ_WEIGHT_CONVERSION_COEFF or 0.
What we want is ioprio or 0.
Correct this by changing the calculation.
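
Assuming the usual conversion (weight = (IOPRIO_BE_NR - ioprio) *
BFQ_WEIGHT_CONVERSION_COEFF), the corrected inverse looks roughly like:

        static unsigned short bfq_weight_to_ioprio(int weight)
        {
                return max_t(int, 0,
                             IOPRIO_BE_NR - weight / BFQ_WEIGHT_CONVERSION_COEFF);
        }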

Signed-off-by: Yahu Gao <gaoyahu19@gmail.com>
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Link: https://lore.kernel.org/r/20220107065859.25689-1-gaoyahu19@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit bcd2be763252f3a4d5fc4d6008d4d96c601ee74b)
(cherry picked from commit 81806db867a17e49d37b1d556dd39f4da5227f56)
(cherry picked from commit aed9dbfda208b30130c64bac55570e2f89084d2b)
(cherry picked from commit 7158b54afec4b986d52cc646a5dffc30eac6dc19)
(cherry picked from commit fb4f80f773e0fc89f372c7afda9c8e9794849f67)
(cherry picked from commit 5ad409c78ed2bfca202490fa13f0a93c49f21382)
2024-11-19 17:40:48 +01:00