feat(filer): add lazy directory listing for remote mounts (#8615)

* feat(filer): add lazy directory listing for remote mounts

Directory listings on remote mounts previously only queried the local filer store. With lazy mounts the listing was empty; with eager mounts it went stale over time.

Add on-demand directory listing that fetches from remote and caches results with a 5-minute TTL:

- Add `ListDirectory` to `RemoteStorageClient` interface (delimiter-based, single-level listing, separate from recursive `Traverse`)
- Implement in S3, GCS, and Azure backends using each platform's hierarchical listing API
- Add `maybeLazyListFromRemote` to filer: before each directory listing, check if the directory is under a remote mount with an expired cache, fetch from remote, persist entries to the local store, then let existing listing logic run on the populated store
- Use singleflight to deduplicate concurrent requests for the same directory
- Skip local-only entries (no RemoteEntry) to avoid overwriting unsynced uploads
- Errors are logged and swallowed (availability over consistency)

* refactor: extract xattr key to constant xattrRemoteListingSyncedAt

* feat: make listing cache TTL configurable per mount via listing_cache_ttl_seconds

Add listing_cache_ttl_seconds field to RemoteStorageLocation protobuf. When 0 (default), lazy directory listing is disabled for that mount. When >0, enables on-demand directory listing with the specified TTL.

Expose as -listingCacheTTL flag on remote.mount command.

* refactor: address review feedback for lazy directory listing

- Add context.Context to ListDirectory interface and all implementations
- Capture startTime before remote call for accurate TTL tracking
- Simplify S3 ListDirectory using ListObjectsV2PagesWithContext
- Make maybeLazyListFromRemote return void (errors always swallowed)
- Remove redundant trailing-slash path manipulation in caller
- Update tests to match new signatures

* When an existing entry has Remote != nil, we should merge remote metadata into it rather than replacing it.

* fix(gcs): wrap ListDirectory iterator error with context

The raw iterator error was returned without bucket/path context, making it harder to debug. Wrap it consistently with the S3 pattern.

* fix(s3): guard against nil pointer dereference in Traverse and ListDirectory

Some S3-compatible backends may return nil for LastModified, Size, or ETag fields. Check for nil before dereferencing to prevent panics.

* fix(filer): remove blanket 2-minute timeout from lazy listing context

Individual SDK operations (S3, GCS, Azure) already have per-request timeouts and retry policies. The blanket timeout could cut off large directory listings mid-operation even though individual pages were succeeding.

* fix(filer): preserve trace context in lazy listing with WithoutCancel

Use context.WithoutCancel(ctx) instead of context.Background() so trace/span values from the incoming request are retained for distributed tracing, while still decoupling cancellation.

* fix(filer): use Store.FindEntry for internal lookups, add Uid/Gid to files, fix updateDirectoryListingSyncedAt

- Use f.Store.FindEntry instead of f.FindEntry for staleness check and child lookups to avoid unnecessary lazy-fetch overhead
- Set OS_UID/OS_GID on new file entries for consistency with directories
- In updateDirectoryListingSyncedAt, use Store.UpdateEntry for existing directories instead of CreateEntry to avoid deleteChunksIfNotNew and NotifyUpdateEvent side effects

* fix(filer): distinguish not-found from store errors in lazy listing

Previously, any error from Store.FindEntry was treated as "not found," which could cause entry recreation/overwrite on transient DB failures. Now check for filer_pb.ErrNotFound explicitly and skip entries or bail out on real store errors.

* refactor(filer): use errors.Is for ErrNotFound comparisons
@@ -457,12 +457,13 @@ func (x *RemoteStorageMapping) GetPrimaryBucketStorageName() string {
 }
 
 type RemoteStorageLocation struct {
-	state protoimpl.MessageState `protogen:"open.v1"`
-	Name string `protobuf:"bytes,1,opt,name=name,proto3" json:"name,omitempty"`
-	Bucket string `protobuf:"bytes,2,opt,name=bucket,proto3" json:"bucket,omitempty"`
-	Path string `protobuf:"bytes,3,opt,name=path,proto3" json:"path,omitempty"`
-	unknownFields protoimpl.UnknownFields
-	sizeCache protoimpl.SizeCache
+	state protoimpl.MessageState `protogen:"open.v1"`
+	Name string `protobuf:"bytes,1,opt,name=name,proto3" json:"name,omitempty"`
+	Bucket string `protobuf:"bytes,2,opt,name=bucket,proto3" json:"bucket,omitempty"`
+	Path string `protobuf:"bytes,3,opt,name=path,proto3" json:"path,omitempty"`
+	ListingCacheTtlSeconds int32 `protobuf:"varint,4,opt,name=listing_cache_ttl_seconds,json=listingCacheTtlSeconds,proto3" json:"listing_cache_ttl_seconds,omitempty"` // 0 = disabled; >0 enables on-demand directory listing with this TTL in seconds
+	unknownFields protoimpl.UnknownFields
+	sizeCache protoimpl.SizeCache
 }
 
 func (x *RemoteStorageLocation) Reset() {
@@ -516,6 +517,13 @@ func (x *RemoteStorageLocation) GetPath() string {
 	return ""
 }
 
+func (x *RemoteStorageLocation) GetListingCacheTtlSeconds() int32 {
+	if x != nil {
+		return x.ListingCacheTtlSeconds
+	}
+	return 0
+}
+
 var File_remote_proto protoreflect.FileDescriptor
 
 const file_remote_proto_rawDesc = "" +
@@ -573,11 +581,12 @@ const file_remote_proto_rawDesc = "" +
 	"\x1bprimary_bucket_storage_name\x18\x02 \x01(\tR\x18primaryBucketStorageName\x1a]\n" +
 	"\rMappingsEntry\x12\x10\n" +
 	"\x03key\x18\x01 \x01(\tR\x03key\x126\n" +
-	"\x05value\x18\x02 \x01(\v2 .remote_pb.RemoteStorageLocationR\x05value:\x028\x01\"W\n" +
+	"\x05value\x18\x02 \x01(\v2 .remote_pb.RemoteStorageLocationR\x05value:\x028\x01\"\x92\x01\n" +
 	"\x15RemoteStorageLocation\x12\x12\n" +
 	"\x04name\x18\x01 \x01(\tR\x04name\x12\x16\n" +
 	"\x06bucket\x18\x02 \x01(\tR\x06bucket\x12\x12\n" +
-	"\x04path\x18\x03 \x01(\tR\x04pathBP\n" +
+	"\x04path\x18\x03 \x01(\tR\x04path\x129\n" +
+	"\x19listing_cache_ttl_seconds\x18\x04 \x01(\x05R\x16listingCacheTtlSecondsBP\n" +
 	"\x10seaweedfs.clientB\n" +
 	"FilerProtoZ0github.com/seaweedfs/seaweedfs/weed/pb/remote_pbb\x06proto3"
 