From 53cef2d952c4b017776ee1a7a8db61dfed97ff6d Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Wed, 11 Sep 2024 17:11:23 -0700 Subject: [PATCH] mm: avoid leaving partial pfn mappings around in error case commit 79a61cc3fc0466ad2b7b89618a6157785f0293b3 upstream. As Jann points out, PFN mappings are special, because unlike normal memory mappings, there is no lifetime information associated with the mapping - it is just a raw mapping of PFNs with no reference counting of a 'struct page'. That's all very much intentional, but it does mean that it's easy to mess up the cleanup in case of errors. Yes, a failed mmap() will always eventually clean up any partial mappings, but without any explicit lifetime in the page table mapping itself, it's very easy to do the error handling in the wrong order. In particular, it's easy to mistakenly free the physical backing store before the page tables are actually cleaned up and (temporarily) have stale dangling PTE entries. To make this situation less error-prone, just make sure that any partial pfn mapping is torn down early, before any other error handling. Reported-and-tested-by: Jann Horn Cc: Andrew Morton Cc: Jason Gunthorpe Cc: Simona Vetter Signed-off-by: Linus Torvalds Signed-off-by: Harshvardhan Jha Signed-off-by: Greg Kroah-Hartman --- mm/memory.c | 27 ++++++++++++++++++++++----- 1 file changed, 22 insertions(+), 5 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 53d8f6b1b..1ebc74116 100755 --- a/mm/memory.c +++ b/mm/memory.c @@ -2345,11 +2345,7 @@ static inline int remap_p4d_range(struct mm_struct *mm, pgd_t *pgd, return 0; } -/* - * Variant of remap_pfn_range that does not call track_pfn_remap. The caller - * must have pre-validated the caching bits of the pgprot_t. - */ -int remap_pfn_range_notrack(struct vm_area_struct *vma, unsigned long addr, +static int remap_pfn_range_internal(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn, unsigned long size, pgprot_t prot) { pgd_t *pgd; @@ -2402,6 +2398,27 @@ int remap_pfn_range_notrack(struct vm_area_struct *vma, unsigned long addr, return 0; } +/* + * Variant of remap_pfn_range that does not call track_pfn_remap. The caller + * must have pre-validated the caching bits of the pgprot_t. + */ +int remap_pfn_range_notrack(struct vm_area_struct *vma, unsigned long addr, + unsigned long pfn, unsigned long size, pgprot_t prot) +{ + int error = remap_pfn_range_internal(vma, addr, pfn, size, prot); + + if (!error) + return 0; + + /* + * A partial pfn range mapping is dangerous: it does not + * maintain page reference counts, and callers may free + * pages due to the error. So zap it early. + */ + zap_page_range_single(vma, addr, size, NULL); + return error; +} + /** * remap_pfn_range - remap kernel memory to userspace * @vma: user vma to map to