[ Upstream commit 907c3fe532253a6ef4eb9c4d67efb71fab58c706 ]
When doing fast_commit replay an infinite loop may occur due to an
uninitialized extent_status struct. ext4_ext_determine_insert_hole() does
not detect the replay and calls ext4_es_find_extent_range(), which will
return immediately without initializing the 'es' variable.
Because 'es' contains garbage, an integer overflow may happen causing an
infinite loop in this function, easily reproducible using fstest generic/039.
This commit fixes this issue by unconditionally initializing the structure
in function ext4_es_find_extent_range().
Thanks to Zhang Yi, for figuring out the real problem!
Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Link: https://patch.msgid.link/20240515082857.32730-1-luis.henriques@linux.dev
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6c120399cde6b1b5cf65ce403765c579fb3d3e50 ]
Now ext4_es_insert_extent() never return error, so make it return void.
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230424033846.4732-12-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Stable-dep-of: acf795dc161f ("ext4: convert to exclusive lock while inserting delalloc extents")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8e387c89e96b9543a339f84043cf9df15fed2632 ]
__insert_pending() allocate memory in atomic context, so the allocation
could fail, but we are not handling that failure now. It could lead
ext4_es_remove_extent() to get wrong reserved clusters, and the global
data blocks reservation count will be incorrect. The same to
extents_status entry preallocation, preallocate pending entry out of the
i_es_lock with __GFP_NOFAIL, make sure __insert_pending() and
__revise_pending() always succeeds.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20230824092619.1327976-3-yi.zhang@huaweicloud.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 768d612f79822d30a1e7d132a4d4b05337ce42ec ]
Yikebaer reported an issue:
==================================================================
BUG: KASAN: slab-use-after-free in ext4_es_insert_extent+0xc68/0xcb0
fs/ext4/extents_status.c:894
Read of size 4 at addr ffff888112ecc1a4 by task syz-executor/8438
CPU: 1 PID: 8438 Comm: syz-executor Not tainted 6.5.0-rc5 #1
Call Trace:
[...]
kasan_report+0xba/0xf0 mm/kasan/report.c:588
ext4_es_insert_extent+0xc68/0xcb0 fs/ext4/extents_status.c:894
ext4_map_blocks+0x92a/0x16f0 fs/ext4/inode.c:680
ext4_alloc_file_blocks.isra.0+0x2df/0xb70 fs/ext4/extents.c:4462
ext4_zero_range fs/ext4/extents.c:4622 [inline]
ext4_fallocate+0x251c/0x3ce0 fs/ext4/extents.c:4721
[...]
Allocated by task 8438:
[...]
kmem_cache_zalloc include/linux/slab.h:693 [inline]
__es_alloc_extent fs/ext4/extents_status.c:469 [inline]
ext4_es_insert_extent+0x672/0xcb0 fs/ext4/extents_status.c:873
ext4_map_blocks+0x92a/0x16f0 fs/ext4/inode.c:680
ext4_alloc_file_blocks.isra.0+0x2df/0xb70 fs/ext4/extents.c:4462
ext4_zero_range fs/ext4/extents.c:4622 [inline]
ext4_fallocate+0x251c/0x3ce0 fs/ext4/extents.c:4721
[...]
Freed by task 8438:
[...]
kmem_cache_free+0xec/0x490 mm/slub.c:3823
ext4_es_try_to_merge_right fs/ext4/extents_status.c:593 [inline]
__es_insert_extent+0x9f4/0x1440 fs/ext4/extents_status.c:802
ext4_es_insert_extent+0x2ca/0xcb0 fs/ext4/extents_status.c:882
ext4_map_blocks+0x92a/0x16f0 fs/ext4/inode.c:680
ext4_alloc_file_blocks.isra.0+0x2df/0xb70 fs/ext4/extents.c:4462
ext4_zero_range fs/ext4/extents.c:4622 [inline]
ext4_fallocate+0x251c/0x3ce0 fs/ext4/extents.c:4721
[...]
==================================================================
The flow of issue triggering is as follows:
1. remove es
raw es es removed es1
|-------------------| -> |----|.......|------|
2. insert es
es insert es1 merge with es es1 merge with es and free es1
|----|.......|------| -> |------------|------| -> |-------------------|
es merges with newes, then merges with es1, frees es1, then determines
if es1->es_len is 0 and triggers a UAF.
The code flow is as follows:
ext4_es_insert_extent
es1 = __es_alloc_extent(true);
es2 = __es_alloc_extent(true);
__es_remove_extent(inode, lblk, end, NULL, es1)
__es_insert_extent(inode, &newes, es1) ---> insert es1 to es tree
__es_insert_extent(inode, &newes, es2)
ext4_es_try_to_merge_right
ext4_es_free_extent(inode, es1) ---> es1 is freed
if (es1 && !es1->es_len)
// Trigger UAF by determining if es1 is used.
We determine whether es1 or es2 is used immediately after calling
__es_remove_extent() or __es_insert_extent() to avoid triggering a
UAF if es1 or es2 is freed.
Reported-by: Yikebaer Aizezi <yikebaer61@gmail.com>
Closes: https://lore.kernel.org/lkml/CALcu4raD4h9coiyEBL4Bm0zjDwxC2CyPiTwsP3zFuhot6y9Beg@mail.gmail.com
Fixes: 2a69c450083d ("ext4: using nofail preallocation in ext4_es_insert_extent()")
Cc: stable@kernel.org
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230815070808.3377171-1-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Stable-dep-of: 8e387c89e96b ("ext4: make sure allocate pending entry not fail")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2a69c450083db164596c75c0f5b4d9c4c0e18eba ]
Similar to in ext4_es_insert_delayed_block(), we use preallocations that
do not fail to avoid inconsistencies, but we do not care about es that are
not must be kept, and we return 0 even if such es memory allocation fails.
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230424033846.4732-9-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Stable-dep-of: 8e387c89e96b ("ext4: make sure allocate pending entry not fail")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4a2d98447b37bcb68a7f06a1078edcb4f7e6ce7e ]
Similar to in ext4_es_remove_extent(), we use a no-fail preallocation
to avoid inconsistencies, except that here we may have to preallocate
two extent_status.
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230424033846.4732-8-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Stable-dep-of: 8e387c89e96b ("ext4: make sure allocate pending entry not fail")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e9fe2b882bd5b26b987c9ba110c2222796f72af5 ]
If __es_remove_extent() returns an error it means that when splitting
extent, allocating an extent that must be kept failed, where returning
an error directly would cause the extent tree to be inconsistent. So we
use GFP_NOFAIL to pre-allocate an extent_status and pass it to
__es_remove_extent() to avoid this problem.
In addition, since the allocated memory is outside the i_es_lock, the
extent_status tree may change and the pre-allocated extent_status is
no longer needed, so we release the pre-allocated extent_status when
es->es_len is not initialized.
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230424033846.4732-7-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Stable-dep-of: 8e387c89e96b ("ext4: make sure allocate pending entry not fail")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bda3efaf774fb687c2b7a555aaec3006b14a8857 ]
When splitting extent, if the second extent can not be dropped, we return
-ENOMEM and use GFP_NOFAIL to preallocate an extent_status outside of
i_es_lock and pass it to __es_remove_extent() to be used as the second
extent. This ensures that __es_remove_extent() is executed successfully,
thus ensuring consistency in the extent status tree. If the second extent
is not undroppable, we simply drop it and return 0. Then retry is no longer
necessary, remove it.
Now, __es_remove_extent() will always remove what it should, maybe more.
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230424033846.4732-6-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Stable-dep-of: 8e387c89e96b ("ext4: make sure allocate pending entry not fail")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 95f0b320339a977cf69872eac107122bf536775d ]
Pass a extent_status pointer prealloc to __es_insert_extent(). If the
pointer is non-null, it is used directly when a new extent_status is
needed to avoid memory allocation failures.
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230424033846.4732-5-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Stable-dep-of: 8e387c89e96b ("ext4: make sure allocate pending entry not fail")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 73a2f033656be11298912201ad50615307b4477a ]
Factor out __es_alloc_extent() and __es_free_extent(), which only allocate
and free extent_status in these two helpers.
The ext4_es_alloc_extent() function is split into __es_alloc_extent()
and ext4_es_init_extent(). In __es_alloc_extent() we allocate memory using
GFP_KERNEL | __GFP_NOFAIL | __GFP_ZERO if the memory allocation cannot
fail, otherwise we use GFP_ATOMIC. and the ext4_es_init_extent() is used to
initialize extent_status and update related variables after a successful
allocation.
This is to prepare for the use of pre-allocated extent_status later.
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230424033846.4732-4-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Stable-dep-of: 8e387c89e96b ("ext4: make sure allocate pending entry not fail")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9649eb18c6288f514cacffdd699d5cd999c2f8f6 ]
In the extent status tree, we have extents which we can just drop without
issues and extents we must not drop - this depends on the extent's status
- currently ext4_es_is_delayed() extents must stay, others may be dropped.
A helper function is added to help determine if the current extent can
be dropped, although only ext4_es_is_delayed() extents cannot be dropped
currently.
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20230424033846.4732-3-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Stable-dep-of: 8e387c89e96b ("ext4: make sure allocate pending entry not fail")
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 40ea98396a3659062267d1fe5f99af4f7e4f05e3 upstream.
When big allocate feature is enabled, we need to count and update
reserved clusters before removing a delayed only extent_status entry.
{init|count|get}_rsvd() have already done this, but the start block
number of this counting isn't correct in the following case.
lblk end
| |
v v
-------------------------
| | orig_es
-------------------------
^ ^
len1 is 0 | len2 |
If the start block of the orig_es entry founded is bigger than lblk, we
passed lblk as start block to count_rsvd(), but the length is correct,
finally, the range to be counted is offset. This patch fix this by
passing the start blocks to 'orig_es->lblk + len1'.
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20230824092619.1327976-2-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>