FireAsf 🔥
![]() [ Upstream commit 5ec8e8ea8b7783fab150cf86404fc38cb4db8800 ] The below race is observed on a PFN which falls into the device memory region with the system memory configuration where PFN's are such that [ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL]. Since normal zone start and end pfn contains the device memory PFN's as well, the compaction triggered will try on the device memory PFN's too though they end up in NOP(because pfn_to_online_page() returns NULL for ZONE_DEVICE memory sections). When from other core, the section mappings are being removed for the ZONE_DEVICE region, that the PFN in question belongs to, on which compaction is currently being operated is resulting into the kernel crash with CONFIG_SPASEMEM_VMEMAP enabled. The crash logs can be seen at [1]. compact_zone() memunmap_pages ------------- --------------- __pageblock_pfn_to_page ...... (a)pfn_valid(): valid_section()//return true (b)__remove_pages()-> sparse_remove_section()-> section_deactivate(): [Free the array ms->usage and set ms->usage = NULL] pfn_section_valid() [Access ms->usage which is NULL] NOTE: From the above it can be said that the race is reduced to between the pfn_valid()/pfn_section_valid() and the section deactivate with SPASEMEM_VMEMAP enabled. The commit b943f045a9af("mm/sparse: fix kernel crash with pfn_section_valid check") tried to address the same problem by clearing the SECTION_HAS_MEM_MAP with the expectation of valid_section() returns false thus ms->usage is not accessed. Fix this issue by the below steps: a) Clear SECTION_HAS_MEM_MAP before freeing the ->usage. b) RCU protected read side critical section will either return NULL when SECTION_HAS_MEM_MAP is cleared or can successfully access ->usage. c) Free the ->usage with kfree_rcu() and set ms->usage = NULL. No attempt will be made to access ->usage after this as the SECTION_HAS_MEM_MAP is cleared thus valid_section() return false. Thanks to David/Pavan for their inputs on this patch. [1] https://lore.kernel.org/linux-mm/994410bb-89aa-d987-1f50-f514903c55aa@quicinc.com/ On Snapdragon SoC, with the mentioned memory configuration of PFN's as [ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL], we are able to see bunch of issues daily while testing on a device farm. For this particular issue below is the log. Though the below log is not directly pointing to the pfn_section_valid(){ ms->usage;}, when we loaded this dump on T32 lauterbach tool, it is pointing. [ 540.578056] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 540.578068] Mem abort info: [ 540.578070] ESR = 0x0000000096000005 [ 540.578073] EC = 0x25: DABT (current EL), IL = 32 bits [ 540.578077] SET = 0, FnV = 0 [ 540.578080] EA = 0, S1PTW = 0 [ 540.578082] FSC = 0x05: level 1 translation fault [ 540.578085] Data abort info: [ 540.578086] ISV = 0, ISS = 0x00000005 [ 540.578088] CM = 0, WnR = 0 [ 540.579431] pstate: 82400005 (Nzcv daif +PAN -UAO +TCO -DIT -SSBSBTYPE=--) [ 540.579436] pc : __pageblock_pfn_to_page+0x6c/0x14c [ 540.579454] lr : compact_zone+0x994/0x1058 [ 540.579460] sp : ffffffc03579b510 [ 540.579463] x29: ffffffc03579b510 x28: 0000000000235800 x27:000000000000000c [ 540.579470] x26: 0000000000235c00 x25: 0000000000000068 x24:ffffffc03579b640 [ 540.579477] x23: 0000000000000001 x22: ffffffc03579b660 x21:0000000000000000 [ 540.579483] x20: 0000000000235bff x19: ffffffdebf7e3940 x18:ffffffdebf66d140 [ 540.579489] x17: 00000000739ba063 x16: 00000000739ba063 x15:00000000009f4bff [ 540.579495] x14: 0000008000000000 x13: 0000000000000000 x12:0000000000000001 [ 540.579501] x11: 0000000000000000 x10: 0000000000000000 x9 :ffffff897d2cd440 [ 540.579507] x8 : 0000000000000000 x7 : 0000000000000000 x6 :ffffffc03579b5b4 [ 540.579512] x5 : 0000000000027f25 x4 : ffffffc03579b5b8 x3 :0000000000000001 [ 540.579518] x2 : ffffffdebf7e3940 x1 : 0000000000235c00 x0 :0000000000235800 [ 540.579524] Call trace: [ 540.579527] __pageblock_pfn_to_page+0x6c/0x14c [ 540.579533] compact_zone+0x994/0x1058 [ 540.579536] try_to_compact_pages+0x128/0x378 [ 540.579540] __alloc_pages_direct_compact+0x80/0x2b0 [ 540.579544] __alloc_pages_slowpath+0x5c0/0xe10 [ 540.579547] __alloc_pages+0x250/0x2d0 [ 540.579550] __iommu_dma_alloc_noncontiguous+0x13c/0x3fc [ 540.579561] iommu_dma_alloc+0xa0/0x320 [ 540.579565] dma_alloc_attrs+0xd4/0x108 [quic_charante@quicinc.com: use kfree_rcu() in place of synchronize_rcu(), per David] Link: https://lkml.kernel.org/r/1698403778-20938-1-git-send-email-quic_charante@quicinc.com Link: https://lkml.kernel.org/r/1697202267-23600-1-git-send-email-quic_charante@quicinc.com Fixes: f46edbd1b151 ("mm/sparsemem: add helpers track active portions of a section at boot") Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> |
||
---|---|---|
.github/workflows | ||
android | ||
arch | ||
block | ||
certs | ||
crypto | ||
Documentation | ||
drivers | ||
firmware/tsp_goodix | ||
fs | ||
gki | ||
include | ||
init | ||
io_uring | ||
ipc | ||
kernel | ||
kernel_build | ||
kunitconfigs | ||
lib | ||
LICENSES | ||
mm | ||
net | ||
samples | ||
scripts | ||
security | ||
sound | ||
test | ||
tools | ||
usr | ||
virt | ||
build.config.aarch64 | ||
build.config.allmodconfig | ||
build.config.allmodconfig.aarch64 | ||
build.config.allmodconfig.arm | ||
build.config.allmodconfig.x86_64 | ||
build.config.amlogic | ||
build.config.arm | ||
build.config.common | ||
build.config.db845c | ||
build.config.erd8825_a25_s | ||
build.config.erd8825_s | ||
build.config.erd9925_evt0_s | ||
build.config.erd9925_evt0_s5300_s | ||
build.config.erd9925_s | ||
build.config.gki | ||
build.config.gki-debug.aarch64 | ||
build.config.gki-debug.x86_64 | ||
build.config.gki.aarch64 | ||
build.config.gki.aarch64.fips140 | ||
build.config.gki.aarch64.fips140_eval_testing | ||
build.config.gki.x86_64 | ||
build.config.gki_kasan | ||
build.config.gki_kasan.aarch64 | ||
build.config.gki_kasan.x86_64 | ||
build.config.gki_kprobes | ||
build.config.gki_kprobes.aarch64 | ||
build.config.gki_kprobes.x86_64 | ||
build.config.hikey960 | ||
build.config.khwasan | ||
build.config.mcd | ||
build.config.rockchip | ||
build.config.universal2100_s | ||
build.config.universal8825_s | ||
build.config.universal9925_evt0_s | ||
build.config.universal9925_s | ||
build.config.x86_64 | ||
build.sh | ||
COPYING | ||
CREDITS | ||
Kbuild | ||
Kconfig | ||
linux-stable.sh | ||
MAINTAINERS | ||
Makefile | ||
OWNERS | ||
README | ||
README.md | ||
vendor_boot_module_order_exynos2100.cfg | ||
vendor_boot_module_order_s5e8825.cfg | ||
vendor_boot_module_order_s5e9925.cfg | ||
vendor_module_list_s5e8825.cfg | ||
vendor_module_list_s5e9925.cfg | ||
vendor_module_list_s5e9925_b0s.cfg | ||
vendor_module_list_s5e9925_g0s.cfg | ||
vendor_module_list_s5e9925_r0s.cfg |
How do I submit patches to Android Common Kernels
-
BEST: Make all of your changes to upstream Linux. If appropriate, backport to the stable releases. These patches will be merged automatically in the corresponding common kernels. If the patch is already in upstream Linux, post a backport of the patch that conforms to the patch requirements below.
- Do not send patches upstream that contain only symbol exports. To be considered for upstream Linux,
additions of
EXPORT_SYMBOL_GPL()
require an in-tree modular driver that uses the symbol -- so include the new driver or changes to an existing driver in the same patchset as the export. - When sending patches upstream, the commit message must contain a clear case for why the patch is needed and beneficial to the community. Enabling out-of-tree drivers or functionality is not not a persuasive case.
- Do not send patches upstream that contain only symbol exports. To be considered for upstream Linux,
additions of
-
LESS GOOD: Develop your patches out-of-tree (from an upstream Linux point-of-view). Unless these are fixing an Android-specific bug, these are very unlikely to be accepted unless they have been coordinated with kernel-team@android.com. If you want to proceed, post a patch that conforms to the patch requirements below.
Common Kernel patch requirements
- All patches must conform to the Linux kernel coding standards and pass
script/checkpatch.pl
- Patches shall not break gki_defconfig or allmodconfig builds for arm, arm64, x86, x86_64 architectures (see https://source.android.com/setup/build/building-kernels)
- If the patch is not merged from an upstream branch, the subject must be tagged with the type of patch:
UPSTREAM:
,BACKPORT:
,FROMGIT:
,FROMLIST:
, orANDROID:
. - All patches must have a
Change-Id:
tag (see https://gerrit-review.googlesource.com/Documentation/user-changeid.html) - If an Android bug has been assigned, there must be a
Bug:
tag. - All patches must have a
Signed-off-by:
tag by the author and the submitter
Additional requirements are listed below based on patch type
Requirements for backports from mainline Linux: UPSTREAM:
, BACKPORT:
- If the patch is a cherry-pick from Linux mainline with no changes at all
- tag the patch subject with
UPSTREAM:
. - add upstream commit information with a
(cherry picked from commit ...)
line - Example:
- if the upstream commit message is
- tag the patch subject with
important patch from upstream
This is the detailed description of the important patch
Signed-off-by: Fred Jones <fred.jones@foo.org>
- then Joe Smith would upload the patch for the common kernel as
UPSTREAM: important patch from upstream
This is the detailed description of the important patch
Signed-off-by: Fred Jones <fred.jones@foo.org>
Bug: 135791357
Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
(cherry picked from commit c31e73121f4c1ec41143423ac6ce3ce6dafdcec1)
Signed-off-by: Joe Smith <joe.smith@foo.org>
- If the patch requires any changes from the upstream version, tag the patch with
BACKPORT:
instead ofUPSTREAM:
.- use the same tags as
UPSTREAM:
- add comments about the changes under the
(cherry picked from commit ...)
line - Example:
- use the same tags as
BACKPORT: important patch from upstream
This is the detailed description of the important patch
Signed-off-by: Fred Jones <fred.jones@foo.org>
Bug: 135791357
Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
(cherry picked from commit c31e73121f4c1ec41143423ac6ce3ce6dafdcec1)
[joe: Resolved minor conflict in drivers/foo/bar.c ]
Signed-off-by: Joe Smith <joe.smith@foo.org>
Requirements for other backports: FROMGIT:
, FROMLIST:
,
- If the patch has been merged into an upstream maintainer tree, but has not yet
been merged into Linux mainline
- tag the patch subject with
FROMGIT:
- add info on where the patch came from as
(cherry picked from commit <sha1> <repo> <branch>)
. This must be a stable maintainer branch (not rebased, so don't uselinux-next
for example). - if changes were required, use
BACKPORT: FROMGIT:
- Example:
- if the commit message in the maintainer tree is
- tag the patch subject with
important patch from upstream
This is the detailed description of the important patch
Signed-off-by: Fred Jones <fred.jones@foo.org>
- then Joe Smith would upload the patch for the common kernel as
FROMGIT: important patch from upstream
This is the detailed description of the important patch
Signed-off-by: Fred Jones <fred.jones@foo.org>
Bug: 135791357
(cherry picked from commit 878a2fd9de10b03d11d2f622250285c7e63deace
https://git.kernel.org/pub/scm/linux/kernel/git/foo/bar.git test-branch)
Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
Signed-off-by: Joe Smith <joe.smith@foo.org>
- If the patch has been submitted to LKML, but not accepted into any maintainer tree
- tag the patch subject with
FROMLIST:
- add a
Link:
tag with a link to the submittal on lore.kernel.org - add a
Bug:
tag with the Android bug (required for patches not accepted into a maintainer tree) - if changes were required, use
BACKPORT: FROMLIST:
- Example:
- tag the patch subject with
FROMLIST: important patch from upstream
This is the detailed description of the important patch
Signed-off-by: Fred Jones <fred.jones@foo.org>
Bug: 135791357
Link: https://lore.kernel.org/lkml/20190619171517.GA17557@someone.com/
Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
Signed-off-by: Joe Smith <joe.smith@foo.org>
Requirements for Android-specific patches: ANDROID:
- If the patch is fixing a bug to Android-specific code
- tag the patch subject with
ANDROID:
- add a
Fixes:
tag that cites the patch with the bug - Example:
- tag the patch subject with
ANDROID: fix android-specific bug in foobar.c
This is the detailed description of the important fix
Fixes: 1234abcd2468 ("foobar: add cool feature")
Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
Signed-off-by: Joe Smith <joe.smith@foo.org>
- If the patch is a new feature
- tag the patch subject with
ANDROID:
- add a
Bug:
tag with the Android bug (required for android-specific features)
- tag the patch subject with