kernel_samsung_a53x

Author	SHA1	Message	Date
Ksawlii	110472ce93	Added KernelSU	2024-11-19 22:44:48 +01:00
Ksawlii	7902dbc903	Added KernelSu	2024-11-19 22:41:46 +01:00
Ksawlii	dd4f8d1c0e	FireAsf 2.0 Release	2024-11-19 19:26:09 +01:00
Ksawlii	c543c0da8f	Merge pull request #1 from Ksawlii/5.10.223-testing 5.10.223 + mods	2024-11-19 18:49:35 +01:00
Ksawlii	9f899d45ad	Revert "workqueue: Make queue_rcu_work() use call_rcu_flush()" This reverts commit `d0dc26b405`.	2024-11-19 18:15:40 +01:00
Ksawlii	7fcc5fcd13	Revert "mm: apply init protection" This reverts commit `bcec04dde1`.	2024-11-19 18:15:13 +01:00
Ksawlii	6f09981af2	Revert "kernel: sysctl: add init protection to common mm-related nodes" This reverts commit `7059d8baa3`.	2024-11-19 18:13:49 +01:00
Sultan Alsawaf	cc43b46500	configs: don't build kheaders_data.tar.xz	2024-11-19 18:06:37 +01:00
Zhen Lei	9fdbd3eed2	kallsyms: Improve the performance of kallsyms_lookup_name() Currently, to search for a symbol, we need to expand the symbols in 'kallsyms_names' one by one, and then use the expanded string for comparison. It's O(n). If we sort names in ascending order like addresses, we can also use binary search. It's O(log(n)). In order not to change the implementation of "/proc/kallsyms", the table kallsyms_names[] is still stored in a one-to-one correspondence with the address in ascending order. Add array kallsyms_seqs_of_names[], it's indexed by the sequence number of the sorted names, and the corresponding content is the sequence number of the sorted addresses. For example: Assume that the index of NameX in array kallsyms_seqs_of_names[] is 'i', the content of kallsyms_seqs_of_names[i] is 'k', then the corresponding address of NameX is kallsyms_addresses[k]. The offset in kallsyms_names[] is get_symbol_offset(k). Note that the memory usage will increase by (4 * kallsyms_num_syms) bytes, the next two patches will reduce (1 * kallsyms_num_syms) bytes and properly handle the case CONFIG_LTO_CLANG=y. Performance test results: (x86) Before: min=234, max=10364402, avg=5206926 min=267, max=11168517, avg=5207587 After: min=1016, max=90894, avg=7272 min=1014, max=93470, avg=7293 The average lookup performance of kallsyms_lookup_name() improved 715x. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>	2024-11-19 18:06:35 +01:00
LibXZR	fa04aad614	block: zram_drv: Allow creation of only one ZRAM device * Gotta store the pointer of the only ZRAM device for compaction * Also, more than one ZRAM device is useless Signed-off-by: Adithya R <gh0strider.2k18.reborn@gmail.com>	2024-11-19 18:06:30 +01:00
Matteo Croce	95dce0e9be	lib/string: optimized memmove When the destination buffer is before the source one, or when the buffers don't overlap, it's safe to use memcpy() instead, which is optimized to use the biggest data size possible. This "optimization" only covers a common case. In the future, proper code which does the same thing as memcpy() does but backwards can be done. Link: https://lkml.kernel.org/r/20210702123153.14093-3-mcroce@linux.microsoft.com Signed-off-by: Matteo Croce <mcroce@microsoft.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: David Laight <David.Laight@aculab.com> Cc: Drew Fustini <drew@beagleboard.org> Cc: Emil Renner Berthing <kernel@esmil.dk> Cc: Guo Ren <guoren@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nick Kossifidis <mick@ics.forth.gr> Cc: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com> Signed-off-by: celtare21 <celtare21@gmail.com>	2024-11-19 18:06:13 +01:00
Matteo Croce	e4f231582a	lib/string: optimized memset The generic memset is defined as a byte at a time write. This is always safe, but it's slower than a 4 byte or even 8 byte write. Write a generic memset which fills the data one byte at a time until the destination is aligned, then fills using the largest size allowed, and finally fills the remaining data one byte at a time. On a RISC-V machine the speed goes from 140 Mb/s to 241 Mb/s, and this is the binary size increase according to bloat-o-meter: Function old new delta memset 32 148 +116 Link: https://lkml.kernel.org/r/20210702123153.14093-4-mcroce@linux.microsoft.com Signed-off-by: Matteo Croce <mcroce@microsoft.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: David Laight <David.Laight@aculab.com> Cc: Drew Fustini <drew@beagleboard.org> Cc: Emil Renner Berthing <kernel@esmil.dk> Cc: Guo Ren <guoren@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nick Kossifidis <mick@ics.forth.gr> Cc: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jebaitedneko <Jebaitedneko@gmail.com> Signed-off-by: celtare21 <celtare21@gmail.com>	2024-11-19 18:06:10 +01:00
Panchajanya1999	82413308e6	power/wakelock: Add a timeout to wakelocks globally Few wakelocks tends to get stuck for no reason. Blocking them isn't necessary and sometimes blocking them breaks basic functionality. Wakelocks like "tx_swr_ctrl" tends to get stuck if we keep earphones connected and drops battery massively. Test: Keep earphones plugged in and leave device for few hours Expected result: No "tx_swr_ctrl" is being stuck. Actual result: Patch is working as expected. Change-Id: I5296990a84ab44cf6e449d6535b8b99408c415c8 Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live> Signed-off-by: Panchajanya1999 <kernel@panchajanya.dev> (cherry picked from commit c721867bf4dc2e2c316b2623ad97a28382af2c8c) (cherry picked from commit a5e999ea4df99f91b7b5aa5bab5b39123587424f)	2024-11-19 18:06:07 +01:00
Sultan Alsawaf	900245cda2	schedutil: Allow CPU frequency changes to be amended before they're set If the last CPU frequency selected isn't set before a new CPU frequency selection arrives, then use the new selection immediately to avoid using a stale frequency choice. This improves both performance and energy by more closely tracking the scheduler's latest decisions. Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>	2024-11-19 18:06:02 +01:00
Tyler Nijmeh	826d5e8824	irq: spurious: Disable IRQ debugging by default Signed-off-by: Tyler Nijmeh <tylernij@gmail.com> Signed-off-by: sohamxda7 <sensoham135@gmail.com> Signed-off-by: Oktapra Amtono <oktapra.amtono@gmail.com> Signed-off-by: Anush02198 <Anush.4376@gmail.com> Signed-off-by: Divyanshu-Modi <divyan.m05@gmail.com> Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com> Signed-off-by: NotZeetaa <rodrigo2005contente@gmail.com> Signed-off-by: priiii1808 <priyanshusinghal0818@gmail.com>	2024-11-19 18:05:57 +01:00
Sultan Alsawaf	ae0839f165	kernel: Don't allow IRQ affinity masks to have more than one CPU Even with an affinity mask that has multiple CPUs set, IRQs always run on the first CPU in their affinity mask. Drivers that register an IRQ affinity notifier (such as pm_qos) will therefore have an incorrect assumption of where an IRQ is affined. Fix the IRQ affinity mask deception by forcing it to only contain one set CPU. Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>	2024-11-19 18:05:54 +01:00
Sultan Alsawaf	5d83710a9b	kernel: Only set one CPU in the default IRQ affinity mask On ARM, IRQs are executed on the first CPU inside the affinity mask, so setting an affinity mask with more than one CPU set is deceptive and causes issues with pm_qos. To fix this, only set the CPU0 bit inside the affinity mask, since that's where IRQs will run by default. This is a follow-up to "kernel: Don't allow IRQ affinity masks to have more than one CPU". Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>	2024-11-19 18:05:50 +01:00
Sultan Alsawaf	bffb1b52f3	kernel: Warn when an IRQ's affinity notifier gets overwritten An IRQ affinity notifier getting overwritten can point to some annoying issues which need to be resolved, like multiple pm_qos objects being registered to the same IRQ. Print out a warning when this happens to aid debugging. Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>	2024-11-19 18:05:46 +01:00
Sultan Alsawaf	2e3484e48b	PM / freezer: Reduce freeze timeout to 1 second for Android Freezing processes on Android usually takes less than 100 ms, and if it takes longer than that to the point where the 20 second freeze timeout is reached, it's because the remaining processes to be frozen are deadlocked waiting for something from a process which is already frozen. There's no point in burning power trying to freeze for that long, so reduce the freeze timeout to a very generous 1 second for Android and don't let anything mess with it. Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>	2024-11-19 18:05:37 +01:00
xNombre	5d3ff5040f	alarmtimer: Minimize wakeup time Alarmtimer sets its wakeup timeout to 2s no matter the actual time to the nearest timer expiration. This can cause the device to be awake for longer than needed. To fix this, set the wakeup timeout to min + 1 ms for a safety margin. Tests revealed that the average timer expiration is 1150ms in the future, which suggests there is room available to minimize wakeup times. Before this change, the device would enter sleep no earlier than 2s after an alarmtimer suspend error (-EBUSY). With this change, the average suspend time after an alarmtimer suspend error went down to 1.5s with a minimum of 0.248ms (after filtering results higher than 2.6s). This should lead to noticeable power savings as Android uses alarmtimer quite frequently. Signed-off-by: Andrzej Perczak <linux@andrzejperczak.com> Signed-off-by: Zlatan Radovanovic <zlatan.radovanovic@fet.ba>	2024-11-19 18:05:33 +01:00
friedrich420	5afb8f94f1	Kernel/sched: Reduce Latency [Pafcholini] Signed-off-by: HolyAngel <slverwolf@gmail.com> Signed-off-by: Salllz <sal235222727@gmail.com> Signed-off-by: alanndz <alanndz7@gmail.com> Signed-off-by: Cyber Knight <cyberknight755@gmail.com> Signed-off-by: Little-W <1405481963@qq.com>	2024-11-19 18:05:31 +01:00
Yaroslav Furman	ec544c143c	PM / sleep: Skip OOM killer toggles when kernel is compiled for Android Android devices use LMK algorithms, so there's no reason to disable and enable the OOM killer when entering and exiting suspend. This is a fixed version of https://github.com/YaroST12/VIOLENT_kernel/commit/86e59a93b2ef Co-authored-by: Danny Lin <danny@kdrag0n.dev> Signed-off-by: Yaroslav Furman <yaro330@gmail.com> Signed-off-by: celtare21 <celtare21@gmail.com> Signed-off-by: Ren <89468157+Shirayuki39@users.noreply.github.com>	2024-11-19 18:05:27 +01:00
Sultan Alsawaf	419052d8e5	sched/fair: Compile out NUMA code entirely when NUMA is disabled Scheduler code is very hot and every little optimization counts. Instead of constantly checking sched_numa_balancing when NUMA is disabled, compile it out. Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>	2024-11-19 18:05:24 +01:00
Clement Courbet	d4b05cdad5	sched: Optimize __calc_delta() A significant portion of __calc_delta() time is spent in the loop shifting a u64 by 32 bits. Use `fls` instead of iterating. This is ~7x faster on benchmarks. The generic `fls` implementation (`generic_fls`) is still ~4x faster than the loop. Architectures that have a better implementation will make use of it. For example, on x86 we get an additional factor 2 in speed without dedicated implementation. On GCC, the asm versions of `fls` are about the same speed as the builtin. On Clang, the versions that use fls are more than twice as slow as the builtin. This is because the way the `fls` function is written, clang puts the value in memory: https://godbolt.org/z/EfMbYe. This bug is filed at https://bugs.llvm.org/show_bug.cgi?id=49406. ``` name cpu/op BM_Calc<__calc_delta_loop> 9.57ms ±12% BM_Calc<__calc_delta_generic_fls> 2.36ms ±13% BM_Calc<__calc_delta_asm_fls> 2.45ms ±13% BM_Calc<__calc_delta_asm_fls_nomem> 1.66ms ±12% BM_Calc<__calc_delta_asm_fls64> 2.46ms ±13% BM_Calc<__calc_delta_asm_fls64_nomem> 1.34ms ±15% BM_Calc<__calc_delta_builtin> 1.32ms ±11% ``` Signed-off-by: Clement Courbet <courbet@google.com> Signed-off-by: Josh Don <joshdon@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210303224653.2579656-1-joshdon@google.com	2024-11-19 18:05:19 +01:00
Qais Yousef	971267e87b	schedutil : cap iowait boost by uclamp_max Which is a backport of upstream fix: d37aee9018e6 ("sched/uclamp: Fix iowait boost escaping uclamp restriction") Bug: 261695814 Signed-off-by: Qais Yousef <qyousef@google.com> Change-Id: Ibe8175edb9dea35e325f1a6f4306885ab8b6b28a	2024-11-19 18:05:14 +01:00
Rohail33	ca3d31ea66	kernel: time: reduce ntp wakeups	2024-11-19 18:05:11 +01:00
Tyler Nijmeh	f40f9398a3	PM/Sleep: Start killing wakelocks after two minutes of idle (120s) Signed-off-by: Tyler Nijmeh <tylernij@gmail.com> Signed-off-by: ThunderStorms21th nalas <pinakastorm@gmail.com>	2024-11-19 18:05:05 +01:00
Sultan Alsawaf	25da1fb9b2	qos: Don't allow userspace to impose restrictions on CPU idle levels Giving userspace intimate control over CPU latency requirements is nonsense. Userspace can't even stop itself from being preempted, so there's no reason for it to have access to a mechanism primarily used to eliminate CPU delays on the order of microseconds. Remove userspace's ability to send pm_qos requests so that it can't hurt power consumption. Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> Signed-off-by: Panchajanya1999 <kernel@panchajanya.dev>	2024-11-19 18:05:02 +01:00
Sultan Alsawaf	74cbd01416	sched/core: Use SCHED_RR in place of SCHED_FIFO for all users Although SCHED_FIFO is a real-time scheduling policy, it can have bad results on system latency, since each SCHED_FIFO task will run to completion before yielding to another task. This can result in visible micro-stalls when a SCHED_FIFO task hogs the CPU for too long. On a system where latency is favored over throughput, using SCHED_RR is a better choice than SCHED_FIFO. Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> Signed-off-by: Oktapra Amtono <oktapra.amtono@gmail.com> Signed-off-by: CloudedQuartz <ravenklawasd@gmail.com>	2024-11-19 18:04:58 +01:00
Sultan Alsawaf	cda8f45b3b	cpu: Silence log spam when a CPU is brought up Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> Signed-off-by: celtare21 <celtare21@gmail.com> Signed-off-by: engstk <eng.stk@sapo.pt>	2024-11-19 18:04:55 +01:00
Yaroslav Furman	e7cede92a8	sched: core: silence no longer affine to cpu logspam Signed-off-by: engstk <eng.stk@sapo.pt>	2024-11-19 18:04:49 +01:00
Sultan Alsawaf	4861626fb1	schedutil: Don't affine sugov kthreads if DVFS is allowed from any CPU Restricting sugov kthreads to their respective CPUFreq policy's CPUs slows down schedutil's ability to switch frequencies. When DVFS is allowed from any CPU, allow respective sugov kthreads to run on any CPU for better performance. Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>	2024-11-19 18:04:45 +01:00
atndko	3a5f3cae8a	printk: Silence useless system log spam When charging, healthd and dashd will spam every several secs, it's sooooo noisy and useless. If you launch a userspace app, there will give a logd message, silence it. Signed-off-by: Wahid Khan <wahidzk0091@gmail.com> Signed-off-by: atndko <z1281552865@gmail.com> Signed-off-by: Vaisakh Murali <mvaisakh@statixos.com> Signed-off-by: Cyber Knight <cyberknight755@gmail.com>	2024-11-19 18:04:40 +01:00
Sultan Alsawaf	0b24a687cf	sched: Set sched_nr_migrate back to 32 on RT for Android Android isn't a real-time userspace and has lots of processes, which makes the normal sched_nr_migrate value of 32 more appealing. In addition, there's no observed latency reduction from using a sched_nr_migrate value of 8, probably because the shallowest idle state on mobile CPUs takes longer to enter/exit than it takes for the scheduler to do a load balance run, so our tail end latency is limited by cpuidle anyway.	2024-11-19 18:04:37 +01:00
Rafael J. Wysocki	bc903594c9	cpufreq: schedutil: Reduce frequencies slower The schedutil governor reduces frequencies too fast in some situations, which causes undesirable performance drops. To address that issue, make schedutil reduce the frequency slower by setting it to the average of the value chosen during the previous iteration of governor computations and the new one coming from its frequency selection formula. Link: https://bugzilla.kernel.org/show_bug.cgi?id=194963 Reported-by: John <john.ettedgui@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Cykeek <Cykeek@proton.me> Signed-off-by: negrroo <mohammedaelnaggar1@gmail.com> Signed-off-by: priiii1808 <priyanshusinghal0818@gmail.com>	2024-11-19 18:04:33 +01:00
Yaroslav Furman	04ccb84743	kernel: printk: suspend-resume stfu Signed-off-by: Yaroslav Furman <yaro330@gmail.com> Signed-off-by: Oktapra Amtono <oktapra.amtono@gmail.com> Signed-off-by: clarencelol <clarencekuiek@icloud.com> Signed-off-by: Anush02198 <Anush.4376@gmail.com> Signed-off-by: Divyanshu-Modi <divyan.m05@gmail.com> Signed-off-by: Tashfin Shakeer Rhythm <tashfinshakeerrhythm@gmail.com> Signed-off-by: NotZeetaa <rodrigo2005contente@gmail.com> Signed-off-by: priiii1808 <priyanshusinghal0818@gmail.com>	2024-11-19 18:04:28 +01:00
Cyber Knight	471bfb0e50	kernel/cpu: Silence abundance of logspam We don't really need to know if the CPU is getting disabled or enabled on a production device. Signed-off-by: Cyber Knight <cyberknight755@gmail.com> Signed-off-by: priiii1808 <priyanshusinghal0818@gmail.com>	2024-11-19 18:04:25 +01:00
Rob Burton	edc883311b	security: samsung: defex_lsm: nuke	2024-11-19 18:03:36 +01:00
Rafael J. Wysocki	a05fde58e7	cpuidle: menu: Take negative "sleep length" values into account Make the menu governor check the tick_nohz_get_next_hrtimer() return value so as to avoid dealing with negative "sleep length" values and make it use that value directly when the tick is stopped. While at it, rename local variable delta_next in menu_select() to delta_tick which better reflects its purpose. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2024-11-19 18:01:28 +01:00
ztc1997	712e020901	cpuidle: teo: Increase default rating We want the teo priority to be higher than menu but lower than qcom-cpu-lpm	2024-11-19 18:01:17 +01:00
Kevin Bracey	53460d53c6	lib/crc32: Make crc32_be weak for arch override crc32_le and __crc32c_le can be overridden - extend this to crc32_be. Signed-off-by: Kevin Bracey <kevin@bracey.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2024-11-19 17:59:39 +01:00
Gustavo A. R. Silva	314c35f2c9	lib: Fix fall-through warnings for Clang In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple warnings by explicitly adding multiple break statements instead of letting the code fall through to the next case, and by replacing a number of /* fall through */ comments with the new pseudo-keyword macro fallthrough. Notice that Clang doesn't recognize /* Fall through */ comments as implicit fall-through markings. Link: https://github.com/KSPP/linux/issues/115 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>	2024-11-19 17:59:22 +01:00
Zhen Lei	aa60ec0163	lib/decompressors: fix spelling mistakes Fix some spelling mistakes in comments: sentinal ==> sentinel compresed ==> compressed dependeny ==> dependency immediatelly ==> immediately dervied ==> derived splitted ==> split nore ==> not independed ==> independent asumed ==> assumed Link: https://lkml.kernel.org/r/20210604085656.12257-1-thunder.leizhen@huawei.com Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-11-19 17:59:19 +01:00
Sultan Alsawaf	867fdfeb8a	mm: Disable watermark boosting by default What watermark boosting does is preemptively fire up kswapd to free memory when there hasn't been an allocation failure. It does this by increasing kswapd's high watermark goal and then firing up kswapd. The reason why this causes freezes is because, with the increased high watermark goal, kswapd will steal memory from processes that need it in order to make forward progress. These processes will, in turn, try to allocate memory again, which will cause kswapd to steal necessary pages from those processes again, in a positive feedback loop known as page thrashing. When page thrashing occurs, your system is essentially livelocked until the necessary forward progress can be made to stop processes from trying to continuously allocate memory and trigger kswapd to steal it back. This problem already occurs with kswapd without watermark boosting, but it's usually only encountered on machines with a small amount of memory and/or a slow CPU. Watermark boosting just makes the existing problem worse enough to notice on higher spec'd machines. Disable watermark boosting by default since it's a total dumpster fire. I can't imagine why anyone would want to explicitly enable it, but the option is there in case someone does. Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> Signed-off-by: Juhyung Park <qkrwngud825@gmail.com> Signed-off-by: celtare21 <celtare21@gmail.com>	2024-11-19 17:59:15 +01:00
Alex Shi	09d9c4b0f2	mm/rmap: stop store reordering issue on page->mapping Hugh Dickins and Minchan Kim observed a long time issue which discussed here, but actually the mentioned fix in https://lore.kernel.org/lkml/20150504031722.GA2768@blaptop/ was missed. The store reordering may cause problem in the scenario: CPU 0 CPU1 do_anonymous_page page_add_new_anon_rmap() page->mapping = anon_vma + PAGE_MAPPING_ANON lru_cache_add_inactive_or_unevictable() spin_lock(lruvec->lock) SetPageLRU() spin_unlock(lruvec->lock) /* idle tracking judged it as LRU * page so pass the page in * page_idle_clear_pte_refs */ page_idle_clear_pte_refs rmap_walk if PageAnon(page) Johannes gave detailed examples how the store reordering could cause trouble: "The concern is the SetPageLRU may get reorder before 'page->mapping' setting, That would make CPU 1 will observe at page->mapping after observing PageLRU set on the page. 1. anon_vma + PAGE_MAPPING_ANON That's the in-order scenario and is fine. 2. NULL That's possible if the page->mapping store gets reordered to occur after SetPageLRU. That's fine too because we check for it. 3. anon_vma without the PAGE_MAPPING_ANON bit That would be a problem and could lead to all kinds of undesirable behavior including crashes and data corruption. Is it possible? AFAICT the compiler is allowed to tear the store to page->mapping and I don't see anything that would prevent it. That said, I also don't see how the reader testing PageLRU under the lru_lock would prevent that in the first place. AFAICT we need that WRITE_ONCE() around the page->mapping assignment." [alex.shi@linux.alibaba.com: updated for comments change from Johannes] Link: https://lkml.kernel.org/r/e66ef2e5-c74c-6498-e8b3-56c37b9d2d15@linux.alibaba.com Link: https://lkml.kernel.org/r/1604566549-62481-7-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Hugh Dickins <hughd@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: "Chen, Rong A" <rong.a.chen@intel.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Jann Horn <jannh@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mika Penttilä <mika.penttila@nextfour.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-11-19 17:59:04 +01:00
Alex Shi	e20dc86b22	mm/vmscan: remove unnecessary lruvec adding We don't have to add a freeable page into lru and then remove from it. This change saves a couple of actions and makes the moving more clear. The SetPageLRU needs to be kept before put_page_testzero for list integrity, otherwise: #0 move_pages_to_lru #1 release_pages if !put_page_testzero if (put_page_testzero()) !PageLRU //skip lru_lock SetPageLRU() list_add(&page->lru,) list_add(&page->lru,) [akpm@linux-foundation.org: coding style fixes] Link: https://lkml.kernel.org/r/1604566549-62481-6-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: "Chen, Rong A" <rong.a.chen@intel.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Jann Horn <jannh@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mika Penttilä <mika.penttila@nextfour.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-11-19 17:58:57 +01:00
Alex Shi	dcd1eceb6e	mm/lru: move lock into lru_note_cost We have to move lru_lock into lru_note_cost, since it cycle up on memcg tree, for future per lruvec lru_lock replace. It's a bit ugly and may cost a bit more locking, but benefit from multiple memcg locking could cover the lost. Link: https://lkml.kernel.org/r/1604566549-62481-11-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: "Chen, Rong A" <rong.a.chen@intel.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Jann Horn <jannh@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mika Penttilä <mika.penttila@nextfour.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-11-19 17:58:50 +01:00
Hugh Dickins	19e71f3ba5	mm: page_idle_get_page() does not need lru_lock It is necessary for page_idle_get_page() to recheck PageLRU() after get_page_unless_zero(), but holding lru_lock around that serves no useful purpose, and adds to lru_lock contention: delete it. See https://lore.kernel.org/lkml/20150504031722.GA2768@blaptop for the discussion that led to lru_lock there; but __page_set_anon_rmap() now uses WRITE_ONCE(), and I see no other risk in page_idle_clear_pte_refs() using rmap_walk() (beyond the risk of racing PageAnon->PageKsm, mostly but not entirely prevented by page_count() check in ksm.c's write_protect_page(): that risk being shared with page_referenced() and not helped by lru_lock). Link: https://lkml.kernel.org/r/1604566549-62481-8-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: "Huang, Ying" <ying.huang@intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Alex Shi <alex.shi@linux.alibaba.com> Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: "Chen, Rong A" <rong.a.chen@intel.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Jann Horn <jannh@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mika Penttilä <mika.penttila@nextfour.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-11-19 17:58:44 +01:00
Hugh Dickins	525918735b	mm/lru: revise the comments of lru_lock Since we changed the pgdat->lru_lock to lruvec->lru_lock, it's time to fix the incorrect comments in code. Also fixed some zone->lru_lock comment error from ancient time. etc. I struggled to understand the comment above move_pages_to_lru() (surely it never calls page_referenced()), and eventually realized that most of it had got separated from shrink_active_list(): move that comment back. Link: https://lkml.kernel.org/r/1604566549-62481-20-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Tejun Heo <tj@kernel.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Jann Horn <jannh@google.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Chen, Rong A" <rong.a.chen@intel.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mika Penttilä <mika.penttila@nextfour.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Shakeel Butt <shakeelb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yang Shi <yang.shi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-11-19 17:58:24 +01:00
Sultan Alsawaf	debf1dee53	mbcache: Speed up cache entry creation In order to prevent redundant entry creation by racing against itself, mb_cache_entry_create scans through a large hash-list of all current entries in order to see if another allocation for the requested new entry has been made. Furthermore, it allocates memory for a new entry before scanning through this hash-list, which results in that allocated memory being discarded when the requested new entry is already present. This happens more than half the time. Speed up cache entry creation by keeping a small linked list of requested new entries in progress, and scanning through that first instead of the large hash-list. Additionally, don't bother allocating memory for a new entry until it's known that the allocated memory will be used. Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>	2024-11-19 17:58:19 +01:00
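
The "lib/string: optimized memset" commit listed above (e4f231582a) describes a three-phase fill: single bytes until the destination is aligned, then the largest store size allowed, then single bytes for the tail. The following is a minimal user-space sketch of that idea, not the code from the commit. The function name `sketch_memset` is hypothetical, and the word store goes through `memcpy()` to stay within standard C aliasing rules, which is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration of an aligned, word-at-a-time memset. */
void *sketch_memset(void *dst, int c, size_t n)
{
	unsigned char *p = dst;
	const unsigned char byte = (unsigned char)c;
	uintptr_t word = byte;
	size_t i;

	assert(dst != NULL || n == 0);

	/* Replicate the fill byte into every byte of one machine word. */
	for (i = 1; i < sizeof(word); i <<= 1)
		word |= word << (i * 8);

	/* Head: write single bytes until p is word-aligned. */
	while (n > 0 && ((uintptr_t)p % sizeof(word)) != 0) {
		*p++ = byte;
		n--;
	}

	/* Body: write one full word per iteration. */
	while (n >= sizeof(word)) {
		memcpy(p, &word, sizeof(word)); /* compilers emit a single store */
		p += sizeof(word);
		n -= sizeof(word);
	}

	/* Tail: write the remaining bytes one at a time. */
	while (n > 0) {
		*p++ = byte;
		n--;
	}

	return dst;
}
```

Starting the fill at an unaligned offset, for example `sketch_memset(buf + 3, 0xAB, 29)`, exercises all three phases on a typical 64-bit machine.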


4433 commits