kernel_samsung_a53x

Author	SHA1	Message	Date
Ksawlii	4e72477dc9	Revert "nfsd: fix refcount leak when file is unhashed after being found" This reverts commit `5d70e0c71b`.	2024-11-24 00:23:21 +01:00
Ksawlii	6407d85670	Revert "NFSD: Mark filecache "down" if init fails" This reverts commit `504cb79c21`.	2024-11-24 00:22:57 +01:00
Chuck Lever	504cb79c21	NFSD: Mark filecache "down" if init fails [ Upstream commit dc0d0f885aa422f621bc1c2124133eff566b0bc8 ] NeilBrown says: > The handling of NFSD_FILE_CACHE_UP is strange. nfsd_file_cache_init() > sets it, but doesn't clear it on failure. So if nfsd_file_cache_init() > fails for some reason, nfsd_file_cache_shutdown() would still try to > clean up if it was called. Reported-by: NeilBrown <neilb@suse.de> Fixes: c7b824c3d06c ("NFSD: Replace the "init once" mechanism") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-23 23:21:50 +01:00
Jeff Layton	5d70e0c71b	nfsd: fix refcount leak when file is unhashed after being found [ Upstream commit 8a7926176378460e0d91e02b03f0ff20a8709a60 ] If we wait_for_construction and find that the file is no longer hashed, and we're going to retry the open, the old nfsd_file reference is currently leaked. Put the reference before retrying. Fixes: c6593366c0bf ("nfsd: don't kill nfsd_files because of lease break error") Signed-off-by: Jeff Layton <jlayton@kernel.org> Tested-by: Youzhong Yang <youzhong@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-23 23:21:25 +01:00
Jeff Layton	047ae79564	nfsd: remove unneeded EEXIST error check in nfsd_do_file_acquire [ Upstream commit 81a95c2b1d605743220f28db04b8da13a65c4059 ] Given that we do the search and insertion while holding the i_lock, I don't think it's possible for us to get EEXIST here. Remove this case. Fixes: c6593366c0bf ("nfsd: don't kill nfsd_files because of lease break error") Signed-off-by: Jeff Layton <jlayton@kernel.org> Tested-by: Youzhong Yang <youzhong@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-23 23:21:25 +01:00
Dai Ngo	dd72e526a9	NFSD: Fix problem of COMMIT and NFS4ERR_DELAY in infinite loop [ Upstream commit 147abcacee33781e75588869e944ddb07528a897 ] The following request sequence to the same file causes the NFS client and server getting into an infinite loop with COMMIT and NFS4ERR_DELAY: OPEN REMOVE WRITE COMMIT Problem reported by recall11, recall12, recall14, recall20, recall22, recall40, recall42, recall48, recall50 of nfstest suite. This patch restores the handling of race condition in nfsd_file_do_acquire with unlink to that prior of the regression. Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache") Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:32 +01:00
Jeff Layton	430ccfd32c	nfsd: simplify the delayed disposal list code [ Upstream commit 92e4a6733f922f0fef1d0995f7b2d0eaff86c7ea ] When queueing a dispose list to the appropriate "freeme" lists, it pointlessly queues the objects one at a time to an intermediate list. Remove a few helpers and just open code a list_move to make it more clear and efficient. Better document the resulting functions with kerneldoc comments. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:32 +01:00
Chuck Lever	fc693e3ffc	NFSD: Convert filecache to rhltable [ Upstream commit c4c649ab413ba6a785b25f0edbb12f617c87db2a ] While we were converting the nfs4_file hashtable to use the kernel's resizable hashtable data structure, Neil Brown observed that the list variant (rhltable) would be better for managing nfsd_file items as well. The nfsd_file hash table will contain multiple entries for the same inode -- these should be kept together on a list. And, it could be possible for exotic or malicious client behavior to cause the hash table to resize itself on every insertion. A nice simplification is that rhltable_lookup() can return a list that contains only nfsd_file items that match a given inode, which enables us to eliminate specialized hash table helper functions and use the default functions provided by the rhashtable implementation). Since we are now storing nfsd_file items for the same inode on a single list, that effectively reduces the number of hash entries that have to be tracked in the hash table. The mininum bucket count is therefore lowered. Light testing with fstests generic/531 show no regressions. Suggested-by: Neil Brown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:32 +01:00
Jeff Layton	c28473c81d	nfsd: allow reaping files still under writeback [ Upstream commit dcb779fcd4ed5984ad15991d574943d12a8693d1 ] On most filesystems, there is no reason to delay reaping an nfsd_file just because its underlying inode is still under writeback. nfsd just relies on client activity or the local flusher threads to do writeback. The main exception is NFS, which flushes all of its dirty data on last close. Add a new EXPORT_OP_FLUSH_ON_CLOSE flag to allow filesystems to signal that they do this, and only skip closing files under writeback on such filesystems. Also, remove a redundant NULL file pointer check in nfsd_file_check_writeback, and clean up nfs's export op flag definitions. Signed-off-by: Jeff Layton <jlayton@kernel.org> Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com> [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:32 +01:00
Jeff Layton	1dd7dbd153	nfsd: update comment over __nfsd_file_cache_purge [ Upstream commit 972cc0e0924598cb293b919d39c848dc038b2c28 ] Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:31 +01:00
Jeff Layton	a7a5c3a7f9	nfsd: don't take/put an extra reference when putting a file [ Upstream commit b2ff1bd71db2a1b193a6dde0845adcd69cbcf75e ] The last thing that filp_close does is an fput, so don't bother taking and putting the extra reference. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:31 +01:00
Jeff Layton	f8d546333b	nfsd: add some comments to nfsd_file_do_acquire [ Upstream commit b680cb9b737331aad271feebbedafb865504e234 ] David Howells mentioned that he found this bit of code confusing, so sprinkle in some comments to clarify. Reported-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:31 +01:00
Jeff Layton	ccc3df7195	nfsd: don't kill nfsd_files because of lease break error [ Upstream commit c6593366c0bf222be9c7561354dfb921c611745e ] An error from break_lease is non-fatal, so we needn't destroy the nfsd_file in that case. Just put the reference like we normally would and return the error. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:31 +01:00
Jeff Layton	743db303f1	nfsd: simplify test_bit return in NFSD_FILE_KEY_FULL comparator [ Upstream commit d69b8dbfd0866abc5ec84652cc1c10fc3d4d91ef ] test_bit returns bool, so we can just compare the result of that to the key->gc value without the "!!". Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:31 +01:00
Jeff Layton	c27712e985	nfsd: NFSD_FILE_KEY_INODE only needs to find GC'ed entries [ Upstream commit 6c31e4c98853a4ba47355ea151b36a77c42b7734 ] Since v4 files are expected to be long-lived, there's little value in closing them out of the cache when there is conflicting access. Change the comparator to also match the gc value in the key. Change both of the current users of that key to set the gc value in the key to "true". Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:31 +01:00
Jeff Layton	171671afc3	nfsd: don't open-code clear_and_wake_up_bit [ Upstream commit b8bea9f6cdd7236c7c2238d022145e9b2f8aac22 ] Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:31 +01:00
Jeff Layton	c9cd722efa	nfsd: don't fsync nfsd_files on last close [ Upstream commit 4c475eee02375ade6e864f1db16976ba0d96a0a2 ] Most of the time, NFSv4 clients issue a COMMIT before the final CLOSE of an open stateid, so with NFSv4, the fsync in the nfsd_file_free path is usually a no-op and doesn't block. We have a customer running knfsd over very slow storage (XFS over Ceph RBD). They were using the "async" export option because performance was more important than data integrity for this application. That export option turns NFSv4 COMMIT calls into no-ops. Due to the fsync in this codepath however, their final CLOSE calls would still stall (since a CLOSE effectively became a COMMIT). I think this fsync is not strictly necessary. We only use that result to reset the write verifier. Instead of fsync'ing all of the data when we free an nfsd_file, we can just check for writeback errors when one is acquired and when it is freed. If the client never comes back, then it'll never see the error anyway and there is no point in resetting it. If an error occurs after the nfsd_file is removed from the cache but before the inode is evicted, then it will reset the write verifier on the next nfsd_file_acquire, (since there will be an unseen error). The only exception here is if something else opens and fsyncs the file during that window. Given that local applications work with this limitation today, I don't see that as an issue. Link: https://bugzilla.redhat.com/show_bug.cgi?id=2166658 Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache") Reported-and-tested-by: Pierguido Lambri <plambri@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:30 +01:00
Jeff Layton	d32a91f5cf	nfsd: allow nfsd_file_get to sanely handle a NULL pointer [ Upstream commit 70f62231cdfd52357836733dd31db787e0412ab2 ] ...and remove some now-useless NULL pointer checks in its callers. Suggested-by: NeilBrown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:30 +01:00
Jeff Layton	7566172f2f	nfsd: don't free files unconditionally in __nfsd_file_cache_purge [ Upstream commit 4bdbba54e9b1c769da8ded9abd209d765715e1d6 ] nfsd_file_cache_purge is called when the server is shutting down, in which case, tearing things down is generally fine, but it also gets called when the exports cache is flushed. Instead of walking the cache and freeing everything unconditionally, handle it the same as when we have a notification of conflicting access. Fixes: ac3a2585f018 ("nfsd: rework refcounting in filecache") Reported-by: Ruben Vestergaard <rubenv@drcmr.dk> Reported-by: Torkil Svensgaard <torkil@drcmr.dk> Reported-by: Shachar Kagan <skagan@nvidia.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Tested-by: Shachar Kagan <skagan@nvidia.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:30 +01:00
Jeff Layton	9696c3cb78	nfsd: fix handling of cached open files in nfsd4_open codepath [ Upstream commit 0b3a551fa58b4da941efeb209b3770868e2eddd7 ] Commit fb70bf124b05 ("NFSD: Instantiate a struct file when creating a regular NFSv4 file") added the ability to cache an open fd over a compound. There are a couple of problems with the way this currently works: It's racy, as a newly-created nfsd_file can end up with its PENDING bit cleared while the nf is hashed, and the nf_file pointer is still zeroed out. Other tasks can find it in this state and they expect to see a valid nf_file, and can oops if nf_file is NULL. Also, there is no guarantee that we'll end up creating a new nfsd_file if one is already in the hash. If an extant entry is in the hash with a valid nf_file, nfs4_get_vfs_file will clobber its nf_file pointer with the value of op_file and the old nf_file will leak. Fix both issues by making a new nfsd_file_acquirei_opened variant that takes an optional file pointer. If one is present when this is called, we'll take a new reference to it instead of trying to open the file. If the nfsd_file already has a valid nf_file, we'll just ignore the optional file and pass the nfsd_file back as-is. Also rework the tracepoints a bit to allow for an "opened" variant and don't try to avoid counting acquisitions in the case where we already have a cached open file. Fixes: fb70bf124b05 ("NFSD: Instantiate a struct file when creating a regular NFSv4 file") Cc: Trond Myklebust <trondmy@hammerspace.com> Reported-by: Stanislav Saner <ssaner@redhat.com> Reported-and-Tested-by: Ruben Vestergaard <rubenv@drcmr.dk> Reported-and-Tested-by: Torkil Svensgaard <torkil@drcmr.dk> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:29 +01:00
Jeff Layton	853853fc1e	nfsd: rework refcounting in filecache [ Upstream commit ac3a2585f018f10039b4a856dcb122da88c1c1c9 ] The filecache refcounting is a bit non-standard for something searchable by RCU, in that we maintain a sentinel reference while it's hashed. This in turn requires that we have to do things differently in the "put" depending on whether its hashed, which we believe to have led to races. There are other problems in here too. nfsd_file_close_inode_sync can end up freeing an nfsd_file while there are still outstanding references to it, and there are a number of subtle ToC/ToU races. Rework the code so that the refcount is what drives the lifecycle. When the refcount goes to zero, then unhash and rcu free the object. A task searching for a nfsd_file is allowed to bump its refcount, but only if it's not already 0. Ensure that we don't make any other changes to it until a reference is held. With this change, the LRU carries a reference. Take special care to deal with it when removing an entry from the list, and ensure that we only repurpose the nf_lru list_head when the refcount is 0 to ensure exclusive access to it. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:29 +01:00
Chuck Lever	60297b262f	NFSD: Add an nfsd_file_fsync tracepoint [ Upstream commit d7064eaf688cfe454c50db9f59298463d80d403c ] Add a tracepoint to capture the number of filecache-triggered fsync calls and which files needed it. Also, record when an fsync triggers a write verifier reset. Examples: <...>-97 [007] 262.505611: nfsd_file_free: inode=0xffff888171e08140 ref=0 flags=GC may=WRITE nf_file=0xffff8881373d2400 <...>-97 [007] 262.505612: nfsd_file_fsync: inode=0xffff888171e08140 ref=0 flags=GC may=WRITE nf_file=0xffff8881373d2400 ret=0 <...>-97 [007] 262.505623: nfsd_file_free: inode=0xffff888171e08dc0 ref=0 flags=GC may=WRITE nf_file=0xffff8881373d1e00 <...>-97 [007] 262.505624: nfsd_file_fsync: inode=0xffff888171e08dc0 ref=0 flags=GC may=WRITE nf_file=0xffff8881373d1e00 ret=0 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:28 +01:00
Jeff Layton	0bb0fff135	nfsd: fix up the filecache laundrette scheduling [ Upstream commit 22ae4c114f77b55a4c5036e8f70409a0799a08f8 ] We don't really care whether there are hashed entries when it comes to scheduling the laundrette. They might all be non-gc entries, after all. We only want to schedule it if there are entries on the LRU. Switch to using list_lru_count, and move the check into nfsd_file_gc_worker. The other callsite in nfsd_file_put doesn't need to count entries, since it only schedules the laundrette after adding an entry to the LRU. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:28 +01:00
Jeff Layton	2cb80c1130	nfsd: reorganize filecache.c [ Upstream commit 8214118589881b2d390284410c5ff275e7a5e03c ] In a coming patch, we're going to rework how the filecache refcounting works. Move some code around in the function to reduce the churn in the later patches, and rename some of the functions with (hopefully) clearer names: nfsd_file_flush becomes nfsd_file_fsync, and nfsd_file_unhash_and_dispose is renamed to nfsd_file_unhash_and_queue. Also, the nfsd_file_put_final tracepoint is renamed to nfsd_file_free, to better match the name of the function from which it's called. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:28 +01:00
Jeff Layton	09ea1b1def	nfsd: remove the pages_flushed statistic from filecache [ Upstream commit 1f696e230ea5198e393368b319eb55651828d687 ] We're counting mapping->nrpages, but not all of those are necessarily dirty. We don't really have a simple way to count just the dirty pages, so just remove this stat since it's not accurate. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:28 +01:00
Chuck Lever	b4bd3bc3d3	NFSD: Fix licensing header in filecache.c [ Upstream commit 3f054211b29c0fa06dfdcab402c795fd7e906be1 ] Add a missing SPDX header. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:28 +01:00
Chuck Lever	747260445d	NFSD: Flesh out a documenting comment for filecache.c [ Upstream commit b3276c1f5b268ff56622e9e125b792b4c3dc03ac ] Record what we've learned recently about the NFSD filecache in a documenting comment so our future selves don't forget what all this is for. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:26 +01:00
Chuck Lever	e4ab944aae	NFSD: Add an NFSD_FILE_GC flag to enable nfsd_file garbage collection [ Upstream commit 4d1ea8455716ca070e3cd85767e6f6a562a58b1b ] NFSv4 operations manage the lifetime of nfsd_file items they use by means of NFSv4 OPEN and CLOSE. Hence there's no need for them to be garbage collected. Introduce a mechanism to enable garbage collection for nfsd_file items used only by NFSv2/3 callers. Note that the change in nfsd_file_put() ensures that both CLOSE and DELEGRETURN will actually close out and free an nfsd_file on last reference of a non-garbage-collected file. Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=394 Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:26 +01:00
Chuck Lever	bfcd8d8cf5	NFSD: Revert "NFSD: NFSv4 CLOSE should release an nfsd_file immediately" [ Upstream commit dcf3f80965ca787c70def402cdf1553c93c75529 ] This reverts commit 5e138c4a750dc140d881dab4a8804b094bbc08d2. That commit attempted to make files available to other users as soon as all NFSv4 clients were done with them, rather than waiting until the filecache LRU had garbage collected them. It gets the reference counting wrong, for one thing. But it also misses that DELEGRETURN should release a file in the same fashion. In fact, any nfsd_file_put() on an file held open by an NFSv4 client needs potentially to release the file immediately... Clear the way for implementing that idea. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:26 +01:00
Jeff Layton	e234918ef2	nfsd: fix use-after-free in nfsd_file_do_acquire tracepoint [ Upstream commit bdd6b5624c62d0acd350d07564f1c82fe649235f ] When we fail to insert into the hashtable with a non-retryable error, we'll free the object and then goto out_status. If the tracepoint is enabled, it'll end up accessing the freed object when it tries to grab the fields out of it. Set nf to NULL after freeing it to avoid the issue. Fixes: 243a5263014a ("nfsd: rework hashtable handling in nfsd_do_file_acquire") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:25 +01:00
Jeff Layton	cd363ec97d	nfsd: fix net-namespace logic in __nfsd_file_cache_purge [ Upstream commit d3aefd2b29ff5ffdeb5c06a7d3191a027a18cdb8 ] If the namespace doesn't match the one in "net", then we'll continue, but that doesn't cause another rhashtable_walk_next call, so it will loop infinitely. Fixes: ce502f81ba88 ("NFSD: Convert the filecache to use rhashtable") Reported-by: Petr Vorel <pvorel@suse.cz> Link: https://lore.kernel.org/ltp/Y1%2FP8gDAcWC%2F+VR3@pevik/ Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:25 +01:00
Jeff Layton	b32940c723	nfsd: rework hashtable handling in nfsd_do_file_acquire [ Upstream commit 243a5263014a30436c93ed3f1f864c1da845455e ] nfsd_file is RCU-freed, so we need to hold the rcu_read_lock long enough to get a reference after finding it in the hash. Take the rcu_read_lock() and call rhashtable_lookup directly. Switch to using rhashtable_lookup_insert_key as well, and use the usual retry mechanism if we hit an -EEXIST. Rename the "retry" bool to open_retry, and eliminiate the insert_err goto target. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:25 +01:00
Jeff Layton	750b504bd3	nfsd: fix nfsd_file_unhash_and_dispose [ Upstream commit 8d0d254b15cc5b7d46d85fb7ab8ecede9575e672 ] nfsd_file_unhash_and_dispose() is called for two reasons: We're either shutting down and purging the filecache, or we've gotten a notification about a file delete, so we want to go ahead and unhash it so that it'll get cleaned up when we close. We're either walking the hashtable or doing a lookup in it and we don't take a reference in either case. What we want to do in both cases is to try and unhash the object and put it on the dispose list if that was successful. If it's no longer hashed, then we don't want to touch it, with the assumption being that something else is already cleaning up the sentinel reference. Instead of trying to selectively decrement the refcount in this function, just unhash it, and if that was successful, move it to the dispose list. Then, the disposal routine will just clean that up as usual. Also, just make this a void function, drop the WARN_ON_ONCE, and the comments about deadlocking since the nature of the purported deadlock is no longer clear. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:25 +01:00
ChenXiaoSong	dd6e1f79a4	nfsd: use DEFINE_SHOW_ATTRIBUTE to define nfsd_file_cache_stats_fops [ Upstream commit 1342f9dd3fc219089deeb2620f6790f19b4129b1 ] Use DEFINE_SHOW_ATTRIBUTE helper macro to simplify the code. Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:24 +01:00
Chuck Lever	c035874cc9	NFSD: Ensure nf_inode is never dereferenced [ Upstream commit 427f5f83a3191cbf024c5aea6e5b601cdf88d895 ] The documenting comment for struct nf_file states: /* * A representation of a file that has been opened by knfsd. These are hashed * in the hashtable by inode pointer value. Note that this object doesn't * hold a reference to the inode by itself, so the nf_inode pointer should * never be dereferenced, only used for comparison. */ Replace the two existing dereferences to make the comment always true. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:01 +01:00
Chuck Lever	276969802c	NFSD: NFSv4 CLOSE should release an nfsd_file immediately [ Upstream commit 5e138c4a750dc140d881dab4a8804b094bbc08d2 ] The last close of a file should enable other accessors to open and use that file immediately. Leaving the file open in the filecache prevents other users from accessing that file until the filecache garbage-collects the file -- sometimes that takes several seconds. Reported-by: Wang Yugui <wangyugui@e16-tech.com> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?387 Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:01 +01:00
Chuck Lever	450cd975fe	NFSD: Move nfsd_file_trace_alloc() tracepoint [ Upstream commit b40a2839470cd62ed68c4a32d72a18ee8975b1ac ] Avoid recording the allocation of an nfsd_file item that is immediately released because a matching item was already inserted in the hash. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:01 +01:00
Chuck Lever	96ab7b7d71	NFSD: Separate tracepoints for acquire and create [ Upstream commit be0230069fcbf7d332d010b57c1d0cfd623a84d6 ] These tracepoints collect different information: the create case does not open a file, so there's no nf_file available. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:01 +01:00
Chuck Lever	8135682ff8	NFSD: Clean up unused code after rhashtable conversion [ Upstream commit 0ec8e9d1539a7b8109a554028bbce441052f847e ] Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:01 +01:00
Chuck Lever	fb4d9b9e07	NFSD: Convert the filecache to use rhashtable [ Upstream commit ce502f81ba884c1fe45dc0ebddbcaaa4ec0fc5fb ] Enable the filecache hash table to start small, then grow with the workload. Smaller server deployments benefit because there should be lower memory utilization. Larger server deployments should see improved scaling with the number of open files. Suggested-by: Jeff Layton <jlayton@kernel.org> Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:01 +01:00
Chuck Lever	304260c184	NFSD: Set up an rhashtable for the filecache [ Upstream commit fc22945ecc2a0a028f3683115f98a922d506c284 ] Add code to initialize and tear down an rhashtable. The rhashtable is not used yet. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:01 +01:00
Chuck Lever	5d5d11d1ab	NFSD: Replace the "init once" mechanism [ Upstream commit c7b824c3d06c85e054caf86e227255112c5e3c38 ] In a moment, the nfsd_file_hashtbl global will be replaced with an rhashtable. Replace the one or two spots that need to check if the hash table is available. We can easily reuse the SHUTDOWN flag for this purpose. Document that this mechanism relies on callers to hold the nfsd_mutex to prevent init, shutdown, and purging to run concurrently. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:01 +01:00
Chuck Lever	c5c427f95c	NFSD: Remove nfsd_file::nf_hashval [ Upstream commit f0743c2b25c65debd4f599a7c861428cd9de5906 ] The value in this field can always be computed from nf_inode, thus it is no longer used. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:01 +01:00
Chuck Lever	76fe92d538	NFSD: nfsd_file_hash_remove can compute hashval [ Upstream commit cb7ec76e73ff6640241c8f1f2f35c81d4005a2d6 ] Remove an unnecessary use of nf_hashval. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:01 +01:00
Chuck Lever	d365398d77	NFSD: Refactor __nfsd_file_close_inode() [ Upstream commit a845511007a63467fee575353c706806c21218b1 ] The code that computes the hashval is the same in both callers. To prevent them from going stale, reframe the documenting comments to remove descriptions of the underlying hash table structure, which is about to be replaced. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:01 +01:00
Chuck Lever	adfe497c23	NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode [ Upstream commit 8755326399f471ec3b31e2ab8c5074c0d28a0fb5 ] Remove an unnecessary usage of nf_hashval. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:01 +01:00
Chuck Lever	be3cb94280	NFSD: Remove lockdep assertion from unhash_and_release_locked() [ Upstream commit f53cef15dddec7203df702cdc62e554190385450 ] IIUC, holding the hash bucket lock is needed only in nfsd_file_unhash, and there is already a lockdep assertion there. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:01 +01:00
Chuck Lever	4f7dbf0537	NFSD: No longer record nf_hashval in the trace log [ Upstream commit 54f7df7094b329ca35d9f9808692bb16c48b13e9 ] I'm about to replace nfsd_file_hashtbl with an rhashtable. The individual hash values will no longer be visible or relevant, so remove them from the tracepoints. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:00 +01:00
Chuck Lever	1df5db6e78	NFSD: Never call nfsd_file_gc() in foreground paths [ Upstream commit 6df19411367a5fb4ef61854cbd1af269c077f917 ] The checks in nfsd_file_acquire() and nfsd_file_put() that directly invoke filecache garbage collection are intended to keep cache occupancy between a low- and high-watermark. The reason to limit the capacity of the filecache is to keep filecache lookups reasonably fast. However, invoking garbage collection at those points has some undesirable negative impacts. Files that are held open by NFSv4 clients often push the occupancy of the filecache over these watermarks. At that point: - Every call to nfsd_file_acquire() and nfsd_file_put() results in an LRU walk. This has the same effect on lookup latency as long chains in the hash table. - Garbage collection will then run on every nfsd thread, causing a lot of unnecessary lock contention. - Limiting cache capacity pushes out files used only by NFSv3 clients, which are the type of files the filecache is supposed to help. To address those negative impacts, remove the direct calls to the garbage collector. Subsequent patches will address maintaining lookup efficiency as cache capacity increases. Suggested-by: Wang Yugui <wangyugui@e16-tech.com> Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:00 +01:00
Chuck Lever	1d683366d4	NFSD: Fix the filecache LRU shrinker [ Upstream commit edead3a55804739b2e4af0f35e9c7326264e7b22 ] Without LRU item rotation, the shrinker visits only a few items on the end of the LRU list, and those would always be long-term OPEN files for NFSv4 workloads. That makes the filecache shrinker completely ineffective. Adopt the same strategy as the inode LRU by using LRU_ROTATE. Suggested-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:28:00 +01:00

1 2

79 commits