kernel_samsung_a53x

Author	SHA1	Message	Date
Yang Li	aa0a36f0de	fanotify: remove variable set but not used [ Upstream commit 217663f101a56ef77f82273818253fff082bf503 ] The code that uses the pointer info has been removed in 7326e382c21e ("fanotify: report old and/or new parent+name in FAN_RENAME event"). and fanotify_event_info() doesn't change 'event', so the declaration and assignment of info can be removed. Eliminate the following clang warning: fs/notify/fanotify/fanotify_user.c:161:24: warning: variable ‘info’ set but not used Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:53 +01:00
J. Bruce Fields	7973cfdbf4	nfsd: fix crash on COPY_NOTIFY with special stateid [ Upstream commit 074b07d94e0bb6ddce5690a9b7e2373088e8b33a ] RTM says "If the special ONE stateid is passed to nfs4_preprocess_stateid_op(), it returns status=0 but does not set *cstid. nfsd4_copy_notify() depends on stid being set if status=0, and thus can crash if the client sends the right COPY_NOTIFY RPC." RFC 7862 says "The cna_src_stateid MUST refer to either open or locking states provided earlier by the server. If it is invalid, then the operation MUST fail." The RFC doesn't specify an error, and the choice doesn't matter much as this is clearly illegal client behavior, but bad_stateid seems reasonable. Simplest is just to guarantee that nfs4_preprocess_stateid_op, called with non-NULL cstid, errors out if it can't return a stateid. Reported-by: rtm@csail.mit.edu Fixes: 624322f1adc5 ("NFSD add COPY_NOTIFY operation") Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Olga Kornievskaia <kolga@netapp.com> Tested-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:53 +01:00
Chuck Lever	8cabb46f6c	NFSD: Move fill_pre_wcc() and fill_post_wcc() [ Upstream commit fcb5e3fa012351f3b96024c07bc44834c2478213 ] These functions are related to file handle processing and have nothing to do with XDR encoding or decoding. Also they are no longer NFSv3-specific. As a clean-up, move their definitions to a more appropriate location. WCC is also an NFSv3-specific term, so rename them as general-purpose helpers. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:53 +01:00
Chuck Lever	124a3cdf27	Revert "nfsd: skip some unnecessary stats in the v4 case" [ Upstream commit 58f258f65267542959487dbe8b5641754411843d ] On the wire, I observed NFSv4 OPEN(CREATE) operations sometimes returning a reasonable-looking value in the cinfo.before field and zero in the cinfo.after field. RFC 8881 Section 10.8.1 says: > When a client is making changes to a given directory, it needs to > determine whether there have been changes made to the directory by > other clients. It does this by using the change attribute as > reported before and after the directory operation in the associated > change_info4 value returned for the operation. and > ... The post-operation change > value needs to be saved as the basis for future change_info4 > comparisons. A good quality client implementation therefore saves the zero cinfo.after value. During a subsequent OPEN operation, it will receive a different non-zero value in the cinfo.before field for that directory, and it will incorrectly believe the directory has changed, triggering an undesirable directory cache invalidation. There are filesystem types where fs_supports_change_attribute() returns false, tmpfs being one. On NFSv4 mounts, this means the fh_getattr() call site in fill_pre_wcc() and fill_post_wcc() is never invoked. Subsequently, nfsd4_change_attribute() is invoked with an uninitialized @stat argument. In fill_pre_wcc(), @stat contains stale stack garbage, which is then placed on the wire. In fill_post_wcc(), ->fh_post_wc is all zeroes, so zero is placed on the wire. Both of these values are meaningless. This fix can be applied immediately to stable kernels. Once there are more regression tests in this area, this optimization can be attempted again. Fixes: 428a23d2bf0c ("nfsd: skip some unnecessary stats in the v4 case") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
Chuck Lever	57927de983	NFSD: Trace boot verifier resets [ Upstream commit 75acacb6583df0b9328dc701d8eeea05af49b8b5 ] According to commit bbf2f098838a ("nfsd: Reset the boot verifier on all write I/O errors"), the Linux NFS server forces all clients to resend pending unstable writes if any server-side write or commit operation encounters an error (say, ENOSPC). This is a rare and quite exceptional event that could require administrative recovery action, so it should be made trace-able. Example trace event: nfsd-938 [002] 7174.945558: nfsd_writeverf_reset: boot_time= 61cc920d xid=0xdcd62036 error=-28 new verifier=0x08aecc6142515904 [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
Chuck Lever	3b06528222	NFSD: Rename boot verifier functions [ Upstream commit 3988a57885eeac05ef89f0ab4d7e47b52fbcf630 ] Clean up: These functions handle what the specs call a write verifier, which in the Linux NFS server implementation is now divorced from the server's boot instance [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
Chuck Lever	0c1abc4a75	NFSD: Clean up the nfsd_net::nfssvc_boot field [ Upstream commit 91d2e9b56cf5c80f9efc530d494968369a8a0e0d ] There are two boot-time fields in struct nfsd_net: one called boot_time and one called nfssvc_boot. The latter is used only to form write verifiers, but its documenting comment declares: /* Time of server startup */ Since commit 27c438f53e79 ("nfsd: Support the server resetting the boot verifier"), this field can be reset at any time; it's no longer tied to server restart. So that comment is stale. Also, according to pahole, struct timespec64 is 16 bytes long on x86_64. The nfssvc_boot field is used only to form a write verifier, which is 8 bytes long. Let's clarify this situation by manufacturing an 8-byte verifier in nfs_reset_boot_verifier() and storing only that in struct nfsd_net. We're grabbing 128 bits of time, so compress all of those into a 64-bit verifier instead of throwing out the high-order bits. In the future, the siphash_key can be re-used for other hashed objects per-nfsd_net. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
Chuck Lever	85f8d7896b	NFSD: Write verifier might go backwards [ Upstream commit cdc556600c0133575487cc69fb3128440b3c3e92 ] When vfs_iter_write() starts to fail because a file system is full, a bunch of writes can fail at once with ENOSPC. These writes repeatedly invoke nfsd_reset_boot_verifier() in quick succession. Ensure that the time it grabs doesn't go backwards due to an ntp adjustment going on at the same time. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
Trond Myklebust	8651e227f2	nfsd: Add a tracepoint for errors in nfsd4_clone_file_range() [ Upstream commit a2f4c3fa4db94ba44d32a72201927cfd132a8e82 ] Since a clone error commit can cause the boot verifier to change, we should trace those errors. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [ cel: Addressed a checkpatch.pl splat in fs/nfsd/vfs.h ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
Chuck Lever	d4c52471e1	NFSD: De-duplicate net_generic(nf->nf_net, nfsd_net_id) [ Upstream commit 2c445a0e72cb1fbfbdb7f9473c53556ee27c1d90 ] Since this pointer is used repeatedly, move it to a stack variable. [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
Chuck Lever	d60b181028	NFSD: De-duplicate net_generic(SVC_NET(rqstp), nfsd_net_id) [ Upstream commit fb7622c2dbd1aa41133a8c73e1137b833c074519 ] Since this pointer is used repeatedly, move it to a stack variable. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
Chuck Lever	15dea95570	NFSD: Clean up nfsd_vfs_write() [ Upstream commit 33388b3aefefd4d83764dab8038cb54068161a44 ] The RWF_SYNC and !RWF_SYNC arms are now exactly alike except that the RWF_SYNC arm resets the boot verifier twice in a row. Fix that redundancy and de-duplicate the code. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
Jeff Layton	f9c454e287	nfsd: Retry once in nfsd_open on an -EOPENSTALE return [ Upstream commit 12bcbd40fd931472c7fc9cf3bfe66799ece93ed8 ] If we get back -EOPENSTALE from an NFSv4 open, then we either got some unhandled error or the inode we got back was not the same as the one associated with the dentry. We really have no recourse in that situation other than to retry the open, and if it fails to just return nfserr_stale back to the client. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
Jeff Layton	e6931a1dd7	nfsd: Add errno mapping for EREMOTEIO [ Upstream commit a2694e51f60c5a18c7e43d1a9feaa46d7f153e65 ] The NFS client can occasionally return EREMOTEIO when signalling issues with the server. ...map to NFSERR_IO. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
Peng Tao	40a15a91ce	nfsd: map EBADF [ Upstream commit b3d0db706c77d02055910fcfe2f6eb5155ff9d5e ] Now that we have open file cache, it is possible that another client deletes the file and DP will not know about it. Then IO to MDS would fail with BADSTATEID and knfsd would start state recovery, which should fail as well and then nfs read/write will fail with EBADF. And it triggers a WARN() in nfserrno(). -----------[ cut here ]------------ WARNING: CPU: 0 PID: 13529 at fs/nfsd/nfsproc.c:758 nfserrno+0x58/0x70 [nfsd]() nfsd: non-standard errno: -9 modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_connt pata_acpi floppy CPU: 0 PID: 13529 Comm: nfsd Tainted: G W 4.1.5-00307-g6e6579b #7 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014 0000000000000000 00000000464e6c9c ffff88079085fba8 ffffffff81789936 0000000000000000 ffff88079085fc00 ffff88079085fbe8 ffffffff810a08ea ffff88079085fbe8 ffff88080f45c900 ffff88080f627d50 ffff880790c46a48 all Trace: [<ffffffff81789936>] dump_stack+0x45/0x57 [<ffffffff810a08ea>] warn_slowpath_common+0x8a/0xc0 [<ffffffff810a0975>] warn_slowpath_fmt+0x55/0x70 [<ffffffff81252908>] ? splice_direct_to_actor+0x148/0x230 [<ffffffffa02fb8c0>] ? fsid_source+0x60/0x60 [nfsd] [<ffffffffa02f9918>] nfserrno+0x58/0x70 [nfsd] [<ffffffffa02fba57>] nfsd_finish_read+0x97/0xb0 [nfsd] [<ffffffffa02fc7a6>] nfsd_splice_read+0x76/0xa0 [nfsd] [<ffffffffa02fcca1>] nfsd_read+0xc1/0xd0 [nfsd] [<ffffffffa0233af2>] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc] [<ffffffffa03073da>] nfsd3_proc_read+0xba/0x150 [nfsd] [<ffffffffa02f7a03>] nfsd_dispatch+0xc3/0x210 [nfsd] [<ffffffffa0233af2>] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc] [<ffffffffa0232913>] svc_process_common+0x453/0x6f0 [sunrpc] [<ffffffffa0232cc3>] svc_process+0x113/0x1b0 [sunrpc] [<ffffffffa02f740f>] nfsd+0xff/0x170 [nfsd] [<ffffffffa02f7310>] ? nfsd_destroy+0x80/0x80 [nfsd] [<ffffffff810bf3a8>] kthread+0xd8/0xf0 [<ffffffff810bf2d0>] ? kthread_create_on_node+0x1b0/0x1b0 [<ffffffff817912a2>] ret_from_fork+0x42/0x70 [<ffffffff810bf2d0>] ? kthread_create_on_node+0x1b0/0x1b0 Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
Chuck Lever	2543173ec6	NFSD: Fix zero-length NFSv3 WRITEs [ Upstream commit 6a2f774424bfdcc2df3e17de0cefe74a4269cad5 ] The Linux NFS server currently responds to a zero-length NFSv3 WRITE request with NFS3ERR_IO. It responds to a zero-length NFSv4 WRITE with NFS4_OK and count of zero. RFC 1813 says of the WRITE procedure's @count argument: count The number of bytes of data to be written. If count is 0, the WRITE will succeed and return a count of 0, barring errors due to permissions checking. RFC 8881 has similar language for NFSv4, though NFSv4 removed the explicit @count argument because that value is already contained in the opaque payload array. The synthetic client pynfs's WRT4 and WRT15 tests do emit zero- length WRITEs to exercise this spec requirement. Commit fdec6114ee1f ("nfsd4: zero-length WRITE should succeed") addressed the same problem there with the same fix. But interestingly the Linux NFS client does not appear to emit zero- length WRITEs, instead squelching them. I'm not aware of a test that can generate such WRITEs for NFSv3, so I wrote a naive C program to generate a zero-length WRITE and test this fix. Fixes: 8154ef2776aa ("NFSD: Clean up legacy NFS WRITE argument XDR decoders") Reported-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
Vasily Averin	065246aa5b	nfsd4: add refcount for nfsd4_blocked_lock [ Upstream commit 47446d74f1707049067fee038507cdffda805631 ] nbl allocated in nfsd4_lock can be released by a several ways: directly in nfsd4_lock(), via nfs4_laundromat(), via another nfs command RELEASE_LOCKOWNER or via nfsd4_callback. This structure should be refcounted to be used and released correctly in all these cases. Refcount is initialized to 1 during allocation and is incremented when nbl is added into nbl_list/nbl_lru lists. Usually nbl is linked into both lists together, so only one refcount is used for both lists. However nfsd4_lock() should keep in mind that nbl can be present in one of lists only. This can happen if nbl was handled already by nfs4_laundromat/nfsd4_callback/etc. Refcount is decremented if vfs_lock_file() returns FILE_LOCK_DEFERRED, because nbl can be handled already by nfs4_laundromat/nfsd4_callback/etc. Refcount is not changed in find_blocked_lock() because of it reuses counter released after removing nbl from lists. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
J. Bruce Fields	26d8c7f66c	nfs: block notification on fs with its own ->lock [ Upstream commit 40595cdc93edf4110c0f0c0b06f8d82008f23929 ] NFSv4.1 supports an optional lock notification feature which notifies the client when a lock comes available. (Normally NFSv4 clients just poll for locks if necessary.) To make that work, we need to request a blocking lock from the filesystem. We turned that off for NFS in commit f657f8eef3ff ("nfs: don't atempt blocking locks on nfs reexports") [sic] because it actually blocks the nfsd thread while waiting for the lock. Thanks to Vasily Averin for pointing out that NFS isn't the only filesystem with that problem. Any filesystem that leaves ->lock NULL will use posix_lock_file(), which does the right thing. Simplest is just to assume that any filesystem that defines its own ->lock is not safe to request a blocking lock from. So, this patch mostly reverts commit f657f8eef3ff ("nfs: don't atempt blocking locks on nfs reexports") [sic] and commit b840be2f00c0 ("lockd: don't attempt blocking locks on nfs reexports"), and instead uses a check of ->lock (Vasily's suggestion) to decide whether to support blocking lock notifications on a given filesystem. Also add a little documentation. Perhaps someday we could add back an export flag later to allow filesystems with "good" ->lock methods to support blocking lock notifications. Reported-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> [ cel: Description rewritten to address checkpatch nits ] [ cel: Fixed warning when SUNRPC debugging is disabled ] [ cel: Fixed NULL check ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:52 +01:00
Chuck Lever	4555855ccc	NFSD: De-duplicate nfsd4_decode_bitmap4() [ Upstream commit cd2e999c7c394ae916d8be741418b3c6c1dddea8 ] Clean up. Trond points out that xdr_stream_decode_uint32_array() does the same thing as nfsd4_decode_bitmap4(). Suggested-by: Trond Myklebust <trondmy@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:51 +01:00
J. Bruce Fields	80b0c59bf7	nfsd: improve stateid access bitmask documentation [ Upstream commit 3dcd1d8aab00c5d3a0a3725253c86440b1a0f5a7 ] The use of the bitmaps is confusing. Add a cross-reference to make it easier to find the existing comment. Add an updated reference with URL to make it quicker to look up. And a bit more editorializing about the value of this. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:51 +01:00
Chuck Lever	0b68d96667	NFSD: Combine XDR error tracepoints [ Upstream commit 70e94d757b3e1f46486d573729d84c8955c81dce ] Clean up: The garbage_args and cant_encode tracepoints report the same information as each other, so combine them into a single tracepoint class to reduce code duplication and slightly reduce the size of trace.o. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:51 +01:00
NeilBrown	fefe2352c2	NFSD: simplify per-net file cache management [ Upstream commit 1463b38e7cf34d4cc60f41daff459ad807b2e408 ] We currently have a 'laundrette' for closing cached files - a different work-item for each network-namespace. These 'laundrettes' (aka struct nfsd_fcache_disposal) are currently on a list, and are freed using rcu. The list is not necessary as we have a per-namespace structure (struct nfsd_net) which can hold a link to the nfsd_fcache_disposal. The use of kfree_rcu is also unnecessary as the cache is cleaned of all files associated with a given namespace, and no new files can be added, before the nfsd_fcache_disposal is freed. So add a '->fcache_disposal' link to nfsd_net, and discard the list management and rcu usage. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:51 +01:00
Jiapeng Chong	f6eb876315	NFSD: Fix inconsistent indenting [ Upstream commit 1e37d0e5bda45881eea1bec4b812def72c7d4aea ] Eliminate the follow smatch warning: fs/nfsd/nfs4xdr.c:4766 nfsd4_encode_read_plus_hole() warn: inconsistent indenting. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:51 +01:00
Chuck Lever	ee471f1758	NFSD: Remove be32_to_cpu() from DRC hash function [ Upstream commit 7578b2f628db27281d3165af0aa862311883a858 ] Commit 7142b98d9fd7 ("nfsd: Clean up drc cache in preparation for global spinlock elimination"), billed as a clean-up, added be32_to_cpu() to the DRC hash function without explanation. That commit removed two comments that state that byte-swapping in the hash function is unnecessary without explaining whether there was a need for that change. On some Intel CPUs, the swab32 instruction is known to cause a CPU pipeline stall. be32_to_cpu() does not add extra randomness, since the hash multiplication is done /before/ shifting to the high-order bits of the result. As a micro-optimization, remove the unnecessary transform from the DRC hash function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:51 +01:00
NeilBrown	8eaf23410b	NFS: switch the callback service back to non-pooled. [ Upstream commit 23a1a573c61ccb5e7829c1f5472d3e025293a031 ] Now that thread management is consistent there is no need for nfs-callback to use svc_create_pooled() as introduced in Commit df807fffaabd ("NFSv4.x/callback: Create the callback service through svc_create_pooled"). So switch back to svc_create(). If service pools were configured, but the number of threads were left at '1', nfs callback may not work reliably when svc_create_pooled() is used. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:51 +01:00
NeilBrown	1d1b43f1b8	lockd: use svc_set_num_threads() for thread start and stop [ Upstream commit 6b044fbaab02292fedb17565dbb3f2528083b169 ] svc_set_num_threads() does everything that lockd_start_svc() does, except set sv_maxconn. It also (when passed 0) finds the threads and stops them with kthread_stop(). So move the setting for sv_maxconn, and use svc_set_num_thread() We now don't need nlmsvc_task. Now that we use svc_set_num_threads() it makes sense to set svo_module. This request that the thread exists with module_put_and_exit(). Also fix the documentation for svo_module to make this explicit. svc_prepare_thread is now only used where it is defined, so it can be made static. Signed-off-by: NeilBrown <neilb@suse.de> [ cel: address merge conflict with fd2468fa1301 ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:51 +01:00
NeilBrown	a747f45557	lockd: rename lockd_create_svc() to lockd_get() [ Upstream commit ecd3ad68d2c6d3ae178a63a2d9a02c392904fd36 ] lockd_create_svc() already does an svc_get() if the service already exists, so it is more like a "get" than a "create". So: - Move the increment of nlmsvc_users into the function as well - rename to lockd_get(). It is now the inverse of lockd_put(). Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:51 +01:00
NeilBrown	c3b60ee331	lockd: introduce lockd_put() [ Upstream commit 865b674069e05e5779fcf8cf7a166d2acb7e930b ] There is some cleanup that is duplicated in lockd_down() and the failure path of lockd_up(). Factor these out into a new lockd_put() and call it from both places. lockd_put() does not take the mutex - that must be held by the caller. It decrements nlmsvc_users and if that reaches zero, it cleans up. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:51 +01:00
NeilBrown	ee2eaac7ff	lockd: move svc_exit_thread() into the thread [ Upstream commit 6a4e2527a63620a820c4ebf3596b57176da26fb3 ] The normal place to call svc_exit_thread() is from the thread itself just before it exists. Do this for lockd. This means that nlmsvc_rqst is not used out side of lockd_start_svc(), so it can be made local to that function, and renamed to 'rqst'. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:51 +01:00
NeilBrown	4007458676	lockd: move lockd_start_svc() call into lockd_create_svc() [ Upstream commit b73a2972041bee70eb0cbbb25fa77828c63c916b ] lockd_start_svc() only needs to be called once, just after the svc is created. If the start fails, the svc is discarded too. It thus makes sense to call lockd_start_svc() from lockd_create_svc(). This allows us to remove the test against nlmsvc_rqst at the start of lockd_start_svc() - it must always be NULL. lockd_up() only held an extra reference on the svc until a thread was created - then it dropped it. The thread - and thus the extra reference - will remain until kthread_stop() is called. Now that the thread is created in lockd_create_svc(), the extra reference can be dropped there. So the 'serv' variable is no longer needed in lockd_up(). Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
NeilBrown	01350704c3	lockd: simplify management of network status notifiers [ Upstream commit 5a8a7ff57421b7de3ae72019938ffb5daaee36e7 ] Now that the network status notifiers use nlmsvc_serv rather then nlmsvc_rqst the management can be simplified. Notifier unregistration synchronises with any pending notifications so providing we unregister before nlm_serv is freed no further interlock is required. So we move the unregister call to just before the thread is killed (which destroys the service) and just before the service is destroyed in the failure-path of lockd_up(). Then nlm_ntf_refcnt and nlm_ntf_wq can be removed. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
NeilBrown	1ad588ca8c	lockd: introduce nlmsvc_serv [ Upstream commit 2840fe864c91a0fe822169b1fbfddbcac9aeac43 ] lockd has two globals - nlmsvc_task and nlmsvc_rqst - but mostly it wants the 'struct svc_serv', and when it doesn't want it exactly it can get to what it wants from the serv. This patch is a first step to removing nlmsvc_task and nlmsvc_rqst. It introduces nlmsvc_serv to store the 'struct svc_serv*'. This is set as soon as the serv is created, and cleared only when it is destroyed. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
NeilBrown	d69c68e4b0	NFSD: simplify locking for network notifier. [ Upstream commit d057cfec4940ce6eeffa22b4a71dec203b06cd55 ] nfsd currently maintains an open-coded read/write semaphore (refcount and wait queue) for each network namespace to ensure the nfs service isn't shut down while the notifier is running. This is excessive. As there is unlikely to be contention between notifiers and they run without sleeping, a single spinlock is sufficient to avoid problems. Signed-off-by: NeilBrown <neilb@suse.de> [ cel: ensure nfsd_notifier_lock is static ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
NeilBrown	ab30e7bffb	SUNRPC: discard svo_setup and rename svc_set_num_threads_sync() [ Upstream commit 3ebdbe5203a874614819700d3f470724cb803709 ] The ->svo_setup callback serves no purpose. It is always called from within the same module that chooses which callback is needed. So discard it and call the relevant function directly. Now that svc_set_num_threads() is no longer used remove it and rename svc_set_num_threads_sync() to remove the "_sync" suffix. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
NeilBrown	873ea429f8	NFSD: Make it possible to use svc_set_num_threads_sync [ Upstream commit 3409e4f1e8f239f0ed81be0b068ecf4e73e2e826 ] nfsd cannot currently use svc_set_num_threads_sync. It instead uses svc_set_num_threads which does not wait for threads to all exit, and has a separate mechanism (nfsd_shutdown_complete) to wait for completion. The reason that nfsd is unlike other services is that nfsd threads can exit separately from svc_set_num_threads being called - they die on receipt of SIGKILL. Also, when the last thread exits, the service must be shut down (sockets closed). For this, the nfsd_mutex needs to be taken, and as that mutex needs to be held while svc_set_num_threads is called, the one cannot wait for the other. This patch changes the nfsd thread so that it can drop the ref on the service without blocking on nfsd_mutex, so that svc_set_num_threads_sync can be used: - if it can drop a non-last reference, it does that. This does not trigger shutdown and does not require a mutex. This will likely happen for all but the last thread signalled, and for all threads being shut down by nfsd_shutdown_threads() - if it can get the mutex without blocking (trylock), it does that and then drops the reference. This will likely happen for the last thread killed by SIGKILL - Otherwise there might be an unrelated task holding the mutex, possibly in another network namespace, or nfsd_shutdown_threads() might be just about to get a reference on the service, after which we can drop ours safely. We cannot conveniently get wakeup notifications on these events, and we are unlikely to need to, so we sleep briefly and check again. With this we can discard nfsd_shutdown_complete and nfsd_complete_shutdown(), and switch to svc_set_num_threads_sync. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
NeilBrown	92793de4f5	NFSD: narrow nfsd_mutex protection in nfsd thread [ Upstream commit 9d3792aefdcda71d20c2b1ecc589c17ae71eb523 ] There is nothing happening in the start of nfsd() that requires protection by the mutex, so don't take it until shutting down the thread - which does still require protection - but only for nfsd_put(). Signed-off-by: NeilBrown <neilb@suse.de> [ cel: address merge conflict with fd2468fa1301 ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
NeilBrown	f1f6cafeb9	SUNRPC: use sv_lock to protect updates to sv_nrthreads. [ Upstream commit 2a36395fac3b72771f87c3ee4387e3a96d85a7cc ] Using sv_lock means we don't need to hold the service mutex over these updates. In particular, svc_exit_thread() no longer requires synchronisation, so threads can exit asynchronously. Note that we could use an atomic_t, but as there are many more read sites than writes, that would add unnecessary noise to the code. Some reads are already racy, and there is no need for them to not be. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
NeilBrown	37d53c0889	nfsd: make nfsd_stats.th_cnt atomic_t [ Upstream commit 9b6c8c9bebccd5fb785c306b948c08874a88874d ] This allows us to move the updates for th_cnt out of the mutex. This is a step towards reducing mutex coverage in nfsd(). Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
NeilBrown	58232d691d	SUNRPC: stop using ->sv_nrthreads as a refcount [ Upstream commit ec52361df99b490f6af412b046df9799b92c1050 ] The use of sv_nrthreads as a general refcount results in clumsy code, as is seen by various comments needed to explain the situation. This patch introduces a 'struct kref' and uses that for reference counting, leaving sv_nrthreads to be a pure count of threads. The kref is managed particularly in svc_get() and svc_put(), and also nfsd_put(); svc_destroy() now takes a pointer to the embedded kref, rather than to the serv. nfsd allows the svc_serv to exist with ->sv_nrhtreads being zero. This happens when a transport is created before the first thread is started. To support this, a 'keep_active' flag is introduced which holds a ref on the svc_serv. This is set when any listening socket is successfully added (unless there are running threads), and cleared when the number of threads is set. So when the last thread exits, the nfs_serv will be destroyed. The use of 'keep_active' replaces previous code which checked if there were any permanent sockets. We no longer clear ->rq_server when nfsd() exits. This was done to prevent svc_exit_thread() from calling svc_destroy(). Instead we take an extra reference to the svc_serv to prevent svc_destroy() from being called. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
NeilBrown	a1ed765f29	SUNRPC/NFSD: clean up get/put functions. [ Upstream commit 8c62d12740a1450d2e8456d5747f440e10db281a ] svc_destroy() is poorly named - it doesn't necessarily destroy the svc, it might just reduce the ref count. nfsd_destroy() is poorly named for the same reason. This patch: - removes the refcount functionality from svc_destroy(), moving it to a new svc_put(). Almost all previous callers of svc_destroy() now call svc_put(). - renames nfsd_destroy() to nfsd_put() and improves the code, using the new svc_destroy() rather than svc_put() - removes a few comments that explain the important for balanced get/put calls. This should be obvious. The only non-trivial part of this is that svc_destroy() would call svc_sock_update() on a non-final decrement. It can no longer do that, and svc_put() isn't really a good place of it. This call is now made from svc_exit_thread() which seems like a good place. This makes the call before sv_nrthreads is decremented rather than after. This is not particularly important as the call just sets a flag which causes sv_nrthreads set be checked later. A subsequent patch will improve the ordering. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
NeilBrown	6f24d342b6	SUNRPC: change svc_get() to return the svc. [ Upstream commit df5e49c880ea0776806b8a9f8ab95e035272cf6f ] It is common for 'get' functions to return the object that was 'got', and there are a couple of places where users of svc_get() would be a little simpler if svc_get() did that. Make it so. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
NeilBrown	d5ed87db21	NFSD: handle errors better in write_ports_addfd() [ Upstream commit 89b24336f03a8ba560e96b0c47a8434a7fa48e3c ] If write_ports_add() fails, we shouldn't destroy the serv, unless we had only just created it. So if there are any permanent sockets already attached, leave the serv in place. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
Chuck Lever	853a7127ee	NFSD: Fix sparse warning [ Upstream commit c2f1c4bd20621175c581f298b4943df0cffbd841 ] /home/cel/src/linux/linux/fs/nfsd/nfs4proc.c:1539:24: warning: incorrect type in assignment (different base types) /home/cel/src/linux/linux/fs/nfsd/nfs4proc.c:1539:24: expected restricted __be32 [usertype] status /home/cel/src/linux/linux/fs/nfsd/nfs4proc.c:1539:24: got int Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
Eric W. Biederman	83327c859e	exit: Rename module_put_and_exit to module_put_and_kthread_exit [ Upstream commit ca3574bd653aba234a4b31955f2778947403be16 ] Update module_put_and_exit to call kthread_exit instead of do_exit. Change the name to reflect this change in functionality. All of the users of module_put_and_exit are causing the current kthread to exit so this change makes it clear what is happening. There is no functional change. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:50 +01:00
Amir Goldstein	6a733645cb	fanotify: wire up FAN_RENAME event [ Upstream commit 8cc3b1ccd930fe6971e1527f0c4f1bdc8cb56026 ] FAN_RENAME is the successor of FAN_MOVED_FROM and FAN_MOVED_TO and can be used to get the old and new parent+name information in a single event. FAN_MOVED_FROM and FAN_MOVED_TO are still supported for backward compatibility, but it makes little sense to use them together with FAN_RENAME in the same group. FAN_RENAME uses special info type records to report the old and new parent+name, so reporting only old and new parent id is less useful and was not implemented. Therefore, FAN_REANAME requires a group with flag FAN_REPORT_NAME. Link: https://lore.kernel.org/r/20211129201537.1932819-12-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:49 +01:00
Amir Goldstein	874b554e0d	fanotify: report old and/or new parent+name in FAN_RENAME event [ Upstream commit 7326e382c21e9c23c89c88369afdc90b82a14da8 ] In the special case of FAN_RENAME event, we report old or new or both old and new parent+name. A single info record will be reported if either the old or new dir is watched and two records will be reported if both old and new dir (or their filesystem) are watched. The old and new parent+name are reported using new info record types FAN_EVENT_INFO_TYPE_{OLD,NEW}_DFID_NAME, so if a single info record is reported, it is clear to the application, to which dir entry the fid+name info is referring to. Link: https://lore.kernel.org/r/20211129201537.1932819-11-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:49 +01:00
Amir Goldstein	7763fe9c2c	fanotify: record either old name new name or both for FAN_RENAME [ Upstream commit 2bfbcccde6e7a787feabad4645f628f963fe0663 ] We do not want to report the dirfid+name of a directory whose inode/sb are not watched, because watcher may not have permissions to see the directory content. Use an internal iter_info to indicate to fanotify_alloc_event() which marks of this group are watching FAN_RENAME, so it can decide if we need to record only the old parent+name, new parent+name or both. Link: https://lore.kernel.org/r/20211129201537.1932819-10-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> [JK: Modified code to pass around only mask of mark types matching generated event] Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:49 +01:00
Amir Goldstein	f981c1c4ef	fanotify: record old and new parent and name in FAN_RENAME event [ Upstream commit 3982534ba5ce45e890b2f5ef5e7372c1accd14c7 ] In the special case of FAN_RENAME event, we record both the old and new parent and name. Link: https://lore.kernel.org/r/20211129201537.1932819-9-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:49 +01:00
Amir Goldstein	d2063e86ff	fanotify: support secondary dir fh and name in fanotify_info [ Upstream commit 3cf984e950c1c3f41d407ed31db33beb996be132 ] Allow storing a secondary dir fh and name tupple in fanotify_info. This will be used to store the new parent and name information in FAN_RENAME event. Link: https://lore.kernel.org/r/20211129201537.1932819-8-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:49 +01:00
Amir Goldstein	80cc43b67c	fanotify: use helpers to parcel fanotify_info buffer [ Upstream commit 1a9515ac9e55e68d733bab81bd408463ab1e25b1 ] fanotify_info buffer is parceled into variable sized records, so the records must be written in order: dir_fh, file_fh, name. Use helpers to assert that order and make fanotify_alloc_name_event() a bit more generic to allow empty dir_fh record and to allow expanding to more records (i.e. name2) soon. Link: https://lore.kernel.org/r/20211129201537.1932819-7-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:49 +01:00
Amir Goldstein	30f3e4f1c3	fanotify: use macros to get the offset to fanotify_info buffer [ Upstream commit 2d9374f095136206a02eb0b6cd9ef94632c1e9f7 ] The fanotify_info buffer contains up to two file handles and a name. Use macros to simplify the code that access the different items within the buffer. Add assertions to verify that stored fh len and name len do not overflow the u8 stored value in fanotify_info header. Remove the unused fanotify_info_len() helper. Link: https://lore.kernel.org/r/20211129201537.1932819-6-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:49 +01:00
Amir Goldstein	14391c9f8b	fsnotify: generate FS_RENAME event with rich information [ Upstream commit e54183fa7047c15819bc155f4c58501d9a9a3489 ] The dnotify FS_DN_RENAME event is used to request notification about a move within the same parent directory and was always coupled with the FS_MOVED_FROM event. Rename the FS_DN_RENAME event flag to FS_RENAME, decouple it from FS_MOVED_FROM and report it with the moved dentry instead of the moved inode, so it has the information about both old and new parent and name. Generate the FS_RENAME event regardless of same parent dir and apply the "same parent" rule in the generic fsnotify_handle_event() helper that is used to call backends with ->handle_inode_event() method (i.e. dnotify). The ->handle_inode_event() method is not rich enough to report both old and new parent and name anyway. The enriched event is reported to fanotify over the ->handle_event() method with the old and new dir inode marks in marks array slots for ITER_TYPE_INODE and a new iter type slot ITER_TYPE_INODE2. The enriched event will be used for reporting old and new parent+name to fanotify groups with FAN_RENAME events. Link: https://lore.kernel.org/r/20211129201537.1932819-5-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:49 +01:00
Amir Goldstein	4add76eb2d	fanotify: introduce group flag FAN_REPORT_TARGET_FID [ Upstream commit d61fd650e9d206a71fda789f02a1ced4b19944c4 ] FAN_REPORT_FID is ambiguous in that it reports the fid of the child for some events and the fid of the parent for create/delete/move events. The new FAN_REPORT_TARGET_FID flag is an implicit request to report the fid of the target object of the operation (a.k.a the child inode) also in create/delete/move events in addition to the fid of the parent and the name of the child. To reduce the test matrix for uninteresting use cases, the new FAN_REPORT_TARGET_FID flag requires both FAN_REPORT_NAME and FAN_REPORT_FID. The convenience macro FAN_REPORT_DFID_NAME_TARGET combines FAN_REPORT_TARGET_FID with all the required flags. Link: https://lore.kernel.org/r/20211129201537.1932819-4-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:49 +01:00
Amir Goldstein	98a9aee8d9	fsnotify: separate mark iterator type from object type enum [ Upstream commit 1c9007d62bea6fd164285314f7553f73e5308863 ] They are two different types that use the same enum, so this confusing. Use the object type to indicate the type of object mark is attached to and the iter type to indicate the type of watch. A group can have two different watches of the same object type (parent and child watches) that match the same event. Link: https://lore.kernel.org/r/20211129201537.1932819-3-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:49 +01:00
Amir Goldstein	39d9bf2bd4	fsnotify: clarify object type argument [ Upstream commit ad69cd9972e79aba103ba5365de0acd35770c265 ] In preparation for separating object type from iterator type, rename some 'type' arguments in functions to 'obj_type' and remove the unused interface to clear marks by object type mask. Link: https://lore.kernel.org/r/20211129201537.1932819-2-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:49 +01:00
Chuck Lever	cc51856c28	NFSD: Fix READDIR buffer overflow [ Upstream commit 53b1119a6e5028b125f431a0116ba73510d82a72 ] If a client sends a READDIR count argument that is too small (say, zero), then the buffer size calculation in the new init_dirlist helper functions results in an underflow, allowing the XDR stream functions to write beyond the actual buffer. This calculation has always been suspect. NFSD has never sanity- checked the READDIR count argument, but the old entry encoders managed the problem correctly. With the commits below, entry encoding changed, exposing the underflow to the pointer arithmetic in xdr_reserve_space(). Modern NFS clients attempt to retrieve as much data as possible for each READDIR request. Also, we have no unit tests that exercise the behavior of READDIR at the lower bound of @count values. Thus this case was missed during testing. Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Fixes: f5dcccd647da ("NFSD: Update the NFSv2 READDIR entry encoder to use struct xdr_stream") Fixes: 7f87fc2d34d4 ("NFSD: Update NFSv3 READDIR entry encoders to use struct xdr_stream") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:49 +01:00
Chuck Lever	a0d4e4fa69	NFSD: Fix exposure in nfsd4_decode_bitmap() [ Upstream commit c0019b7db1d7ac62c711cda6b357a659d46428fe ] rtm@csail.mit.edu reports: > nfsd4_decode_bitmap4() will write beyond bmval[bmlen-1] if the RPC > directs it to do so. This can cause nfsd4_decode_state_protect4_a() > to write client-supplied data beyond the end of > nfsd4_exchange_id.spo_must_allow[] when called by > nfsd4_decode_exchange_id(). Rewrite the loops so nfsd4_decode_bitmap() cannot iterate beyond @bmlen. Reported by: rtm@csail.mit.edu Fixes: d1c263a031e8 ("NFSD: Replace READ* macros in nfsd4_decode_fattr()") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:49 +01:00
J. Bruce Fields	95b2ff0c63	nfsd4: remove obselete comment [ Upstream commit 80479eb862102f9513e93fcf726c78cc0be2e3b2 ] Mandatory locking has been removed. And the rest of this comment is redundant with the code. Reported-by: Jeff layton <jlayton@kernel.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:48 +01:00
Changcheng Deng	1a337a93b1	NFSD:fix boolreturn.cocci warning [ Upstream commit 291cd656da04163f4bba67953c1f2f823e0d1231 ] ./fs/nfsd/nfssvc.c: 1072: 8-9: :WARNING return of 0/1 in function 'nfssvc_decode_voidarg' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:48 +01:00
J. Bruce Fields	39e2c3b956	nfsd: update create verifier comment [ Upstream commit 2336d696862186fd4a6ddd1ea0cb243b3e32847c ] I don't know if that Solaris behavior matters any more or if it's still possible to look up that bug ID any more. The XFS behavior's definitely still relevant, though; any but the most recent XFS filesystems will lose the top bits. Reported-by: Frank S. Filz <ffilzlnx@mindspring.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:48 +01:00
Chuck Lever	a90593a639	SUNRPC: Change return value type of .pc_encode [ Upstream commit 130e2054d4a652a2bd79fb1557ddcd19c053cb37 ] Returning an undecorated integer is an age-old trope, but it's not clear (even to previous experts in this code) that the only valid return values are 1 and 0. These functions do not return a negative errno, rpc_stat value, or a positive length. Document there are only two valid return values by having .pc_encode return only true or false. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:48 +01:00
Chuck Lever	ad7cd61954	SUNRPC: Replace the "__be32 p" parameter to .pc_encode [ Upstream commit fda494411485aff91768842c532f90fb8eb54943 ] The passed-in value of the "__be32 p" parameter is now unused in every server-side XDR encoder, and can be removed. Note also that there is a line in each encoder that sets up a local pointer to a struct xdr_stream. Passing that pointer from the dispatcher instead saves one line per encoder function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:48 +01:00
Chuck Lever	efcbe86a29	NFSD: Save location of NFSv4 COMPOUND status [ Upstream commit 3b0ebb255fdc49a3d340846deebf045ef58ec744 ] Refactor: Currently nfs4svc_encode_compoundres() relies on the NFS dispatcher to pass in the buffer location of the COMPOUND status. Instead, save that buffer location in struct nfsd4_compoundres. The compound tag follows immediately after. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:48 +01:00
Chuck Lever	876bbff0f3	SUNRPC: Change return value type of .pc_decode [ Upstream commit c44b31c263798ec34614dd394c31ef1a2e7e716e ] Returning an undecorated integer is an age-old trope, but it's not clear (even to previous experts in this code) that the only valid return values are 1 and 0. These functions do not return a negative errno, rpc_stat value, or a positive length. Document there are only two valid return values by having .pc_decode return only true or false. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:48 +01:00
Chuck Lever	6e864f8e82	SUNRPC: Replace the "__be32 p" parameter to .pc_decode [ Upstream commit 16c663642c7ec03cd4cee5fec520bb69e97babe4 ] The passed-in value of the "__be32 p" parameter is now unused in every server-side XDR decoder, and can be removed. Note also that there is a line in each decoder that sets up a local pointer to a struct xdr_stream. Passing that pointer from the dispatcher instead saves one line per decoder function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:48 +01:00
Chuck Lever	252ec73572	NFSD: Have legacy NFSD WRITE decoders use xdr_stream_subsegment() [ Upstream commit dae9a6cab8009e526570e7477ce858dcdfeb256e ] Refactor. Now that the NFSv2 and NFSv3 XDR decoders have been converted to use xdr_streams, the WRITE decoder functions can use xdr_stream_subsegment() to extract the WRITE payload into its own xdr_buf, just as the NFSv4 WRITE XDR decoder currently does. That makes it possible to pass the first kvec, pages array + length, page_base, and total payload length via a single function parameter. The payload's page_base is not yet assigned or used, but will be in subsequent patches. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> [ cel: adjusted to apply to v5.10.y ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:48 +01:00
Colin Ian King	d7091c9370	NFSD: Initialize pointer ni with NULL and not plain integer 0 [ Upstream commit 8e70bf27fd20cc17e87150327a640e546bfbee64 ] Pointer ni is being initialized with plain integer zero. Fix this by initializing with NULL. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:48 +01:00
Chuck Lever	810f34ab39	NFSD: Optimize DRC bucket pruning [ Upstream commit 8847ecc9274a14114385d1cb4030326baa0766eb ] DRC bucket pruning is done by nfsd_cache_lookup(), which is part of every NFSv2 and NFSv3 dispatch (ie, it's done while the client is waiting). I added a trace_printk() in prune_bucket() to see just how long it takes to prune. Here are two ends of the spectrum: prune_bucket: Scanned 1 and freed 0 in 90 ns, 62 entries remaining prune_bucket: Scanned 2 and freed 1 in 716 ns, 63 entries remaining ... prune_bucket: Scanned 75 and freed 74 in 34149 ns, 1 entries remaining Pruning latency is noticeable on fast transports with fast storage. By noticeable, I mean that the latency measured here in the worst case is the same order of magnitude as the round trip time for cached server operations. We could do something like moving expired entries to an expired list and then free them later instead of freeing them right in prune_bucket(). But simply limiting the number of entries that can be pruned by a lookup is simple and retains more entries in the cache, making the DRC somewhat more effective. Comparison with a 70/30 fio 8KB 12 thread direct I/O test: Before: write: IOPS=61.6k, BW=481MiB/s (505MB/s)(14.1GiB/30001msec); 0 zone resets WRITE: 1848726 ops (30%) avg bytes sent per op: 8340 avg bytes received per op: 136 backlog wait: 0.635158 RTT: 0.128525 total execute time: 0.827242 (milliseconds) After: write: IOPS=63.0k, BW=492MiB/s (516MB/s)(14.4GiB/30001msec); 0 zone resets WRITE: 1891144 ops (30%) avg bytes sent per op: 8340 avg bytes received per op: 136 backlog wait: 0.616114 RTT: 0.126842 total execute time: 0.805348 (milliseconds) Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:47 +01:00
Chuck Lever	af8db3370b	SUNRPC: Trace calls to .rpc_call_done [ Upstream commit b40887e10dcacc5e8ae3c1a99dcba20877c4831b ] Introduce a single tracepoint that can replace simple dprintk call sites in upper layer "rpc_call_done" callbacks. Example: kworker/u24:2-1254 [001] 771.026677: rpc_stats_latency: task:00000001@00000002 xid=0x16a6f3c0 rpcbindv2 GETPORT backlog=446 rtt=101 execute=555 kworker/u24:2-1254 [001] 771.026677: rpc_task_call_done: task:00000001@00000002 flags=ASYNC\|DYNAMIC\|SOFT\|SOFTCONN\|SENT runstate=RUNNING\|ACTIVE status=0 action=rpcb_getport_done kworker/u24:2-1254 [001] 771.026678: rpcb_setport: task:00000001@00000002 status=0 port=20048 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:47 +01:00
Gabriel Krisman Bertazi	7e0583ad21	fanotify: Allow users to request FAN_FS_ERROR events [ Upstream commit 9709bd548f11a092d124698118013f66e1740f9b ] Wire up the FAN_FS_ERROR event in the fanotify_mark syscall, allowing user space to request the monitoring of FAN_FS_ERROR events. These events are limited to filesystem marks, so check it is the case in the syscall handler. Link: https://lore.kernel.org/r/20211025192746.66445-29-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:47 +01:00
Gabriel Krisman Bertazi	1a132ffbdc	fanotify: Emit generic error info for error event [ Upstream commit 130a3c742107acff985541c28360c8b40203559c ] The error info is a record sent to users on FAN_FS_ERROR events documenting the type of error. It also carries an error count, documenting how many errors were observed since the last reporting. Link: https://lore.kernel.org/r/20211025192746.66445-28-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:47 +01:00
Gabriel Krisman Bertazi	2db57ea6b8	fanotify: Report fid info for file related file system errors [ Upstream commit 936d6a38be39177495af38497bf8da1c6128fa1b ] Plumb the pieces to add a FID report to error records. Since all error event memory must be pre-allocated, we pre-allocate the maximum file handle size possible, such that it should always fit. For errors that don't expose a file handle, report it with an invalid FID. Internally we use zero-length FILEID_ROOT file handle for passing the information (which we report as zero-length FILEID_INVALID file handle to userspace) so we update the handle reporting code to deal with this case correctly. Link: https://lore.kernel.org/r/20211025192746.66445-27-krisman@collabora.com Link: https://lore.kernel.org/r/20211025192746.66445-25-krisman@collabora.com Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> [Folded two patches into 2 to make series bisectable] Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:47 +01:00
Gabriel Krisman Bertazi	a081701164	fanotify: WARN_ON against too large file handles [ Upstream commit 572c28f27a269f88e2d8d7b6b1507f114d637337 ] struct fanotify_error_event, at least, is preallocated and isn't able to to handle arbitrarily large file handles. Future-proof the code by complaining loudly if a handle larger than MAX_HANDLE_SZ is ever found. Link: https://lore.kernel.org/r/20211025192746.66445-26-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:47 +01:00
Gabriel Krisman Bertazi	84116516b3	fanotify: Add helpers to decide whether to report FID/DFID [ Upstream commit 4bd5a5c8e6e5cd964e9738e6ef87f6c2cb453edf ] Now that there is an event that reports FID records even for a zeroed file handle, wrap the logic that deides whether to issue the records into helper functions. This shouldn't have any impact on the code, but simplifies further patches. Link: https://lore.kernel.org/r/20211025192746.66445-24-krisman@collabora.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	c58c13c897	fanotify: Wrap object_fh inline space in a creator macro [ Upstream commit 2c5069433a3adc01ff9c5673567961bb7f138074 ] fanotify_error_event would duplicate this sequence of declarations that already exist elsewhere with a slight different size. Create a helper macro to avoid code duplication. Link: https://lore.kernel.org/r/20211025192746.66445-23-krisman@collabora.com Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	a39e7db901	fanotify: Support merging of error events [ Upstream commit 8a6ae64132fd27a944faed7bc38484827609eb76 ] Error events (FAN_FS_ERROR) against the same file system can be merged by simply iterating the error count. The hash is taken from the fsid, without considering the FH. This means that only the first error object is reported. Link: https://lore.kernel.org/r/20211025192746.66445-22-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	d48295523b	fanotify: Support enqueueing of error events [ Upstream commit 83e9acbe13dc1b767f91b5c1350f7a65689b26f6 ] Once an error event is triggered, enqueue it in the notification group, similarly to what is done for other events. FAN_FS_ERROR is not handled specially, since the memory is now handled by a preallocated mempool. For now, make the event unhashed. A future patch implements merging of this kind of event. Link: https://lore.kernel.org/r/20211025192746.66445-21-krisman@collabora.com Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	0944770f2a	fanotify: Pre-allocate pool of error events [ Upstream commit 734a1a5eccc5f7473002b0669f788e135f1f64aa ] Pre-allocate slots for file system errors to have greater chances of succeeding, since error events can happen in GFP_NOFS context. This patch introduces a group-wide mempool of error events, shared by all FAN_FS_ERROR marks in this group. Link: https://lore.kernel.org/r/20211025192746.66445-20-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	250e21a93d	fanotify: Reserve UAPI bits for FAN_FS_ERROR [ Upstream commit 8d11a4f43ef4679be0908026907a7613b33d7127 ] FAN_FS_ERROR allows reporting of event type FS_ERROR to userspace, which is a mechanism to report file system wide problems via fanotify. This commit preallocate userspace visible bits to match the FS_ERROR event. Link: https://lore.kernel.org/r/20211025192746.66445-19-krisman@collabora.com Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	6a9162472e	fanotify: Require fid_mode for any non-fd event [ Upstream commit 4fe595cf1c80e7a5af4d00c4da29def64aff57a2 ] Like inode events, FAN_FS_ERROR will require fid mode. Therefore, convert the verification during fanotify_mark(2) to require fid for any non-fd event. This means fid_mode will not only be required for inode events, but for any event that doesn't provide a descriptor. Link: https://lore.kernel.org/r/20211025192746.66445-17-krisman@collabora.com Suggested-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	fcb2539997	fanotify: Encode empty file handle when no inode is provided [ Upstream commit 272531ac619b374ab474e989eb387162fded553f ] Instead of failing, encode an invalid file handle in fanotify_encode_fh if no inode is provided. This bogus file handle will be reported by FAN_FS_ERROR for non-inode errors. Link: https://lore.kernel.org/r/20211025192746.66445-16-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	bdbcd2bdff	fanotify: Allow file handle encoding for unhashed events [ Upstream commit 74fe4734897a2da2ae2a665a5e622cd490d36eaf ] Allow passing a NULL hash to fanotify_encode_fh and avoid calculating the hash if not needed. Link: https://lore.kernel.org/r/20211025192746.66445-15-krisman@collabora.com Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	dce74e3d63	fanotify: Support null inode event in fanotify_dfid_inode [ Upstream commit 12f47bf0f0990933d95d021d13d31bda010648fd ] FAN_FS_ERROR doesn't support DFID, but this function is still called for every event. The problem is that it is not capable of handling null inodes, which now can happen in case of superblock error events. For this case, just returning dir will be enough. Link: https://lore.kernel.org/r/20211025192746.66445-14-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	391f3d77ed	fsnotify: Pass group argument to free_event [ Upstream commit 330ae77d2a5b0af32c0f29e139bf28ec8591de59 ] For group-wide mempool backed events, like FS_ERROR, the free_event callback will need to reference the group's mempool to free the memory. Wire that argument into the current callers. Link: https://lore.kernel.org/r/20211025192746.66445-13-krisman@collabora.com Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	6554de7632	fsnotify: Protect fsnotify_handle_inode_event from no-inode events [ Upstream commit 24dca90590509a7a6cbe0650100c90c5b8a3468a ] FAN_FS_ERROR allows events without inodes - i.e. for file system-wide errors. Even though fsnotify_handle_inode_event is not currently used by fanotify, this patch protects other backends from cases where neither inode or dir are provided. Also document the constraints of the interface (inode and dir cannot be both NULL). Link: https://lore.kernel.org/r/20211025192746.66445-12-krisman@collabora.com Suggested-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	8c21886caa	fsnotify: Retrieve super block from the data field [ Upstream commit 29335033c574a15334015d8c4e36862cff3d3384 ] Some file system events (i.e. FS_ERROR) might not be associated with an inode or directory. For these, we can retrieve the super block from the data field. But, since the super_block is available in the data field on every event type, simplify the code to always retrieve it from there, through a new helper. Link: https://lore.kernel.org/r/20211025192746.66445-11-krisman@collabora.com Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	ee7eadf543	fsnotify: Add wrapper around fsnotify_add_event [ Upstream commit 1ad03c3a326a86e259389592117252c851873395 ] fsnotify_add_event is growing in number of parameters, which in most case are just passed a NULL pointer. So, split out a new fsnotify_insert_event function to clean things up for users who don't need an insert hook. Link: https://lore.kernel.org/r/20211025192746.66445-10-krisman@collabora.com Suggested-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	024cc835e3	fsnotify: Add helper to detect overflow_event [ Upstream commit 808967a0a4d2f4ce6a2005c5692fffbecaf018c1 ] Similarly to fanotify_is_perm_event and friends, provide a helper predicate to say whether a mask is of an overflow event. Link: https://lore.kernel.org/r/20211025192746.66445-9-krisman@collabora.com Suggested-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:46 +01:00
Gabriel Krisman Bertazi	e0ec734dce	inotify: Don't force FS_IN_IGNORED [ Upstream commit e0462f91d24756916fded4313d508e0fc52f39c9 ] According to Amir: "FS_IN_IGNORED is completely internal to inotify and there is no need to set it in i_fsnotify_mask at all, so if we remove the bit from the output of inotify_arg_to_mask() no functionality will change and we will be able to overload the event bit for FS_ERROR." This is done in preparation to overload FS_ERROR with the notification mechanism in fanotify. Link: https://lore.kernel.org/r/20211025192746.66445-8-krisman@collabora.com Suggested-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:45 +01:00
Gabriel Krisman Bertazi	06a2d124a1	fanotify: Split fsid check from other fid mode checks [ Upstream commit 8299212cbdb01a5867e230e961f82e5c02a6de34 ] FAN_FS_ERROR will require fsid, but not necessarily require the filesystem to expose a file handle. Split those checks into different functions, so they can be used separately when setting up an event. While there, update a comment about tmpfs having 0 fsid, which is no longer true. Link: https://lore.kernel.org/r/20211025192746.66445-7-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:45 +01:00
Gabriel Krisman Bertazi	ee1aa7e25e	fanotify: Fold event size calculation to its own function [ Upstream commit b9928e80dda84b349ba8de01780b9bef2fc36ffa ] Every time this function is invoked, it is immediately added to FAN_EVENT_METADATA_LEN, since there is no need to just calculate the length of info records. This minor clean up folds the rest of the calculation into the function, which now operates in terms of events, returning the size of the entire event, including metadata. Link: https://lore.kernel.org/r/20211025192746.66445-6-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:45 +01:00
Gabriel Krisman Bertazi	b2d167a9e8	fsnotify: Don't insert unmergeable events in hashtable [ Upstream commit cc53b55f697fe5aa98bdbfdfe67c6401da242155 ] Some events, like the overflow event, are not mergeable, so they are not hashed. But, when failing inside fsnotify_add_event for lack of space, fsnotify_add_event() still calls the insert hook, which adds the overflow event to the merge list. Add a check to prevent any kind of unmergeable event to be inserted in the hashtable. Fixes: 94e00d28a680 ("fsnotify: use hash table for faster events merge") Link: https://lore.kernel.org/r/20211025192746.66445-5-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:45 +01:00
Trond Myklebust	f0e4ee5057	nfsd: Fix a warning for nfsd_file_close_inode [ Upstream commit 19598141f40dff728dd50799e510805261f48850 ] Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:45 +01:00
Chuck Lever	e7fce2315c	NLM: Fix svcxdr_encode_owner() [ Upstream commit 89c485c7a3ecbc2ebd568f9c9c2edf3a8cf7485b ] Dai Ngo reports that, since the XDR overhaul, the NLM server crashes when the TEST procedure wants to return NLM_DENIED. There is a bug in svcxdr_encode_owner() that none of our standard test cases found. Replace the open-coded function with a call to an appropriate pre-fabricated XDR helper. Reported-by: Dai Ngo <Dai.Ngo@oracle.com> Fixes: a6a63ca5652e ("lockd: Common NLM XDR helpers") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:45 +01:00
Amir Goldstein	79f5155462	fsnotify: fix sb_connectors leak [ Upstream commit 4396a73115fc8739083536162e2228c0c0c3ed1a ] Fix a leak in s_fsnotify_connectors counter in case of a race between concurrent add of new fsnotify mark to an object. The task that lost the race fails to drop the counter before freeing the unused connector. Following umount() hangs in fsnotify_sb_delete()/wait_var_event(), because s_fsnotify_connectors never drops to zero. Fixes: ec44610fe2b8 ("fsnotify: count all objects with attached connectors") Reported-by: Murphy Zhou <jencce.kernel@gmail.com> Link: https://lore.kernel.org/linux-fsdevel/20210907063338.ycaw6wvhzrfsfdlp@xzhoux.usersys.redhat.com/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:45 +01:00
Chuck Lever	0d6eaa40de	NFS: Remove unused callback void decoder [ Upstream commit c35a810ce59524971c4a3b45faed4d0121e5a305 ] Clean up: The callback RPC dispatcher no longer invokes these call outs, although svc_process_common() relies on seeing a .pc_encode function. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:45 +01:00
Chuck Lever	a8dde82b00	NFS: Add a private local dispatcher for NFSv4 callback operations [ Upstream commit 7d34c96217cf3c2d37ca0a56ca0bc3c3bef1e189 ] The client's NFSv4 callback service is the only remaining user of svc_generic_dispatch(). Note that the NFSv4 callback service doesn't use the .pc_encode and .pc_decode callouts in any substantial way, so they are removed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:45 +01:00
Chuck Lever	83656eacdb	SUNRPC: Eliminate the RQ_AUTHERR flag [ Upstream commit 9082e1d914f8b27114352b1940bbcc7522f682e7 ] Now that there is an alternate method for returning an auth_stat value, replace the RQ_AUTHERR flag with use of that new method. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:45 +01:00
Chuck Lever	9705fedbeb	SUNRPC: Set rq_auth_stat in the pg_authenticate() callout [ Upstream commit 5c2465dfd457f3015eebcc3ace50570e1d896aeb ] In a few moments, rq_auth_stat will need to be explicitly set to rpc_auth_ok before execution gets to the dispatcher. svc_authenticate() already sets it, but it often gets reset to rpc_autherr_badcred right after that call, even when authentication is successful. Let's ensure that the pg_authenticate callout and svc_set_client() set it properly in every case. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:45 +01:00
J. Bruce Fields	c2cd7e6a64	nfs: don't allow reexport reclaims [ Upstream commit bb0a55bb7148a49e549ee992200860e7a040d3a5 ] In the reexport case, nfsd is currently passing along locks with the reclaim bit set. The client sends a new lock request, which is granted if there's currently no conflict--even if it's possible a conflicting lock could have been briefly held in the interim. We don't currently have any way to safely grant reclaim, so for now let's just deny them all. I'm doing this by passing the reclaim bit to nfs and letting it fail the call, with the idea that eventually the client might be able to do something more forgiving here. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-11-19 12:27:44 +01:00

1 2 3 4 5 ...

735 commits