19 Commits

Author SHA1 Message Date
Chris Lu
f3c5ba3cd6 feat(filer): add lazy directory listing for remote mounts (#8615)
* feat(filer): add lazy directory listing for remote mounts

Directory listings on remote mounts previously only queried the local
filer store. With lazy mounts the listing was empty; with eager mounts
it went stale over time.

Add on-demand directory listing that fetches from remote and caches
results with a 5-minute TTL:

- Add `ListDirectory` to `RemoteStorageClient` interface (delimiter-based,
  single-level listing, separate from recursive `Traverse`)
- Implement in S3, GCS, and Azure backends using each platform's
  hierarchical listing API
- Add `maybeLazyListFromRemote` to filer: before each directory listing,
  check if the directory is under a remote mount with an expired cache,
  fetch from remote, persist entries to the local store, then let existing
  listing logic run on the populated store
- Use singleflight to deduplicate concurrent requests for the same directory
- Skip local-only entries (no RemoteEntry) to avoid overwriting unsynced uploads
- Errors are logged and swallowed (availability over consistency)
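
The "skip local-only entries" rule can be illustrated with a minimal sketch. The `entry` and `store` types and the `persistRemoteListing` function are hypothetical stand-ins invented for this example, not the filer's actual types; the sketch only assumes that an entry without remote metadata represents an unsynced local upload.

```go
package main

import "fmt"

// entry is a hypothetical, simplified stand-in for a filer entry.
type entry struct {
	Name     string
	Size     int64
	IsRemote bool // true when the entry carries remote metadata
}

// store is a hypothetical in-memory stand-in for the local filer store.
type store map[string]entry

// persistRemoteListing copies remote entries into the local store and
// returns how many entries were written. An entry that already exists
// locally without remote metadata is treated as an unsynced upload and
// is left untouched.
func persistRemoteListing(local store, remote []entry) int {
	written := 0
	for _, r := range remote {
		if existing, ok := local[r.Name]; ok && !existing.IsRemote {
			continue // local-only entry: do not overwrite
		}
		r.IsRemote = true
		local[r.Name] = r
		written++
	}
	return written
}

func main() {
	local := store{
		"draft.txt": {Name: "draft.txt", Size: 10},              // local-only upload
		"old.bin":   {Name: "old.bin", Size: 1, IsRemote: true}, // previously synced
	}
	remote := []entry{
		{Name: "draft.txt", Size: 99},
		{Name: "old.bin", Size: 2},
		{Name: "new.csv", Size: 3},
	}
	n := persistRemoteListing(local, remote)
	fmt.Println(n, local["draft.txt"].Size, local["old.bin"].Size)
}
```

In this sketch `draft.txt` keeps its local size, while `old.bin` and `new.csv` take the remote values.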

* refactor: extract xattr key to constant xattrRemoteListingSyncedAt

* feat: make listing cache TTL configurable per mount via listing_cache_ttl_seconds

Add listing_cache_ttl_seconds field to RemoteStorageLocation protobuf.
When 0 (default), lazy directory listing is disabled for that mount.
When >0, enables on-demand directory listing with the specified TTL.

Expose as -listingCacheTTL flag on remote.mount command.

* refactor: address review feedback for lazy directory listing

- Add context.Context to ListDirectory interface and all implementations
- Capture startTime before remote call for accurate TTL tracking
- Simplify S3 ListDirectory using ListObjectsV2PagesWithContext
- Make maybeLazyListFromRemote return void (errors always swallowed)
- Remove redundant trailing-slash path manipulation in caller
- Update tests to match new signatures

* When an existing entry has Remote != nil, we should merge remote metadata into it rather than replacing it.

* fix(gcs): wrap ListDirectory iterator error with context

The raw iterator error was returned without bucket/path context,
making it harder to debug. Wrap it consistently with the S3 pattern.

* fix(s3): guard against nil pointer dereference in Traverse and ListDirectory

Some S3-compatible backends may return nil for LastModified, Size, or
ETag fields. Check for nil before dereferencing to prevent panics.
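
A nil guard of this kind might look like the following sketch. The `object` struct is a hypothetical stand-in for an SDK listing result with pointer fields, and `derefOr` is an illustrative generic helper (Go 1.18+), not code from the repository.

```go
package main

import (
	"fmt"
	"time"
)

// object is a hypothetical stand-in for an SDK listing result whose
// fields are pointers and may be nil on some S3-compatible backends.
type object struct {
	Size         *int64
	ETag         *string
	LastModified *time.Time
}

// derefOr returns *p, or fallback when p is nil.
func derefOr[T any](p *T, fallback T) T {
	if p == nil {
		return fallback
	}
	return *p
}

// describe reads every field through derefOr, so a missing field yields
// a zero value instead of a nil pointer dereference panic.
func describe(o object) string {
	mtime := derefOr(o.LastModified, time.Time{})
	return fmt.Sprintf("size=%d etag=%q hasMtime=%t",
		derefOr(o.Size, 0), derefOr(o.ETag, ""), !mtime.IsZero())
}

func main() {
	size, etag := int64(5), "abc"
	fmt.Println(describe(object{}))                         // all fields nil
	fmt.Println(describe(object{Size: &size, ETag: &etag})) // LastModified still nil
}
```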

* fix(filer): remove blanket 2-minute timeout from lazy listing context

Individual SDK operations (S3, GCS, Azure) already have per-request
timeouts and retry policies. The blanket timeout could cut off large
directory listings mid-operation even though individual pages were
succeeding.

* fix(filer): preserve trace context in lazy listing with WithoutCancel

Use context.WithoutCancel(ctx) instead of context.Background() so
trace/span values from the incoming request are retained for
distributed tracing, while still decoupling cancellation.
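
The difference is easy to see in a small sketch using only the standard library (`context.WithoutCancel` requires Go 1.21 or later). The `traceKey` type and the `"trace-123"` value are made up for the example.

```go
package main

import (
	"context"
	"fmt"
)

// traceKey is a hypothetical context key standing in for trace/span values.
type traceKey struct{}

// demo cancels a parent context and reports what a context derived with
// context.WithoutCancel still sees afterwards.
func demo() (parentCancelled bool, detachedErr error, traceID any) {
	parent, cancel := context.WithCancel(
		context.WithValue(context.Background(), traceKey{}, "trace-123"))
	detached := context.WithoutCancel(parent)
	cancel()
	return parent.Err() != nil, detached.Err(), detached.Value(traceKey{})
}

func main() {
	// The parent is cancelled, the detached context is not, and the
	// value stored on the parent is still visible through it.
	fmt.Println(demo())
}
```

With `context.Background()` in place of `context.WithoutCancel(parent)`, the lookup of `traceKey{}` would return nil.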

* fix(filer): use Store.FindEntry for internal lookups, add Uid/Gid to files, fix updateDirectoryListingSyncedAt

- Use f.Store.FindEntry instead of f.FindEntry for staleness check and
  child lookups to avoid unnecessary lazy-fetch overhead
- Set OS_UID/OS_GID on new file entries for consistency with directories
- In updateDirectoryListingSyncedAt, use Store.UpdateEntry for existing
  directories instead of CreateEntry to avoid deleteChunksIfNotNew and
  NotifyUpdateEvent side effects

* fix(filer): distinguish not-found from store errors in lazy listing

Previously, any error from Store.FindEntry was treated as "not found,"
which could cause entry recreation/overwrite on transient DB failures.
Now check for filer_pb.ErrNotFound explicitly; on any other store
error, skip the entry or bail out.
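
A minimal sketch of the three-way distinction follows. `errNotFound` and `errTransient` are hypothetical stand-ins (the former for `filer_pb.ErrNotFound`), and the `action` values are invented for illustration; the real code's control flow may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins: errNotFound plays the role of the not-found
// sentinel, errTransient the role of a transient database failure.
var (
	errNotFound  = errors.New("not found")
	errTransient = errors.New("connection reset")
)

type action string

const (
	actionUpdate action = "update" // entry exists: merge into it
	actionCreate action = "create" // entry is genuinely absent
	actionSkip   action = "skip"   // store failed: do not touch the entry
)

// decide maps the error from a store lookup to what the caller should do.
// Only a (possibly wrapped) not-found error permits creating the entry.
func decide(lookupErr error) action {
	switch {
	case lookupErr == nil:
		return actionUpdate
	case errors.Is(lookupErr, errNotFound):
		return actionCreate
	default:
		return actionSkip
	}
}

func main() {
	wrapped := fmt.Errorf("find /buckets/a: %w", errNotFound)
	fmt.Println(decide(nil), decide(wrapped), decide(errTransient))
}
```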

* refactor(filer): use errors.Is for ErrNotFound comparisons
2026-03-13 09:36:54 -07:00
Peter Dodd
0910252e31 feat: add statfile remote storage (#8443)
* feat: add statfile; add error for remote storage misses

* feat: statfile implementations for storage providers

* test: add unit tests for StatFile method across providers

Add comprehensive unit tests for the StatFile implementation covering:
- S3: interface compliance and error constant accessibility
- Azure: interface compliance, error constants, and field population
- GCS: interface compliance, error constants, error detection, and field population

Also fix variable shadowing issue in S3 and Azure StatFile implementations where
named return parameters were being shadowed by local variable declarations.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: address StatFile review feedback

- Use errors.New for ErrRemoteObjectNotFound sentinel
- Fix S3 HeadObject 404 detection to use awserr.Error code check
- Remove hollow field-population tests that tested nothing
- Remove redundant stdlib error detection tests
- Trim verbose doc comment on ErrRemoteObjectNotFound

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: address second round of StatFile review feedback

- Rename interface assertion tests to TestXxxRemoteStorageClientImplementsInterface
- Delegate readFileRemoteEntry to StatFile in all three providers
- Revert S3 404 detection to RequestFailure.StatusCode() check
- Fix double-slash in GCS error message format string
- Add storage type prefix to S3 error message for consistency

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: comments

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-25 10:24:06 -08:00
Peter Dodd
4d513a2b3d feat(gcs): add application default credentials fallback support (#8161)
* feat(gcs): add application default credentials fallback support

* refactor

* Update weed/remote_storage/gcs/gcs_storage_client.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Chris Lu <chris.lu@gmail.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-01-29 09:57:49 -08:00
Chris Lu
269092c8c3 fix(gcs): resolve credential conflict in remote storage mount (#8013)
* fix(gcs): resolve credential conflict in remote storage mount

Manually handle GCS credentials to avoid conflict with automatic discovery.
Fixes #8007

* fix(gcs): use %w for error wrapping in gcs_storage_client.go

Address review feedback to use idiomatic error wrapping.
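
As a brief illustration of why `%w` matters: it keeps the original error in the chain so callers can still match it with `errors.Is`, whereas `%v` flattens it to text. The `errDenied` sentinel, the function names, and the message format below are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errDenied is a hypothetical sentinel error used only for this example.
var errDenied = errors.New("permission denied")

// wrapV formats the cause with %v: only the text survives.
func wrapV(bucket string, err error) error {
	return fmt.Errorf("gcs list %s: %v", bucket, err)
}

// wrapW formats the cause with %w: the cause stays in the error chain.
func wrapW(bucket string, err error) error {
	return fmt.Errorf("gcs list %s: %w", bucket, err)
}

// isDenied reports whether errDenied appears anywhere in err's chain.
func isDenied(err error) bool {
	return errors.Is(err, errDenied)
}

func main() {
	fmt.Println(isDenied(wrapV("b", errDenied))) // false: chain was lost
	fmt.Println(isDenied(wrapW("b", errDenied))) // true: chain preserved
}
```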
2026-01-12 12:22:42 -08:00
Chris Lu
69553e5ba6 convert error formatting to %w everywhere (#6995) 2025-07-16 23:39:27 -07:00
chrislu
4193dafce1 azure metadata: skip metadata prefixed with "X-"
fix https://github.com/seaweedfs/seaweedfs/issues/3875
2022-11-02 21:42:02 -07:00
chrislu
26dbc6c905 move to https://github.com/seaweedfs/seaweedfs 2022-07-29 00:17:28 -07:00
Eng Zer Jun
a23bcbb7ec refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.
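
For reference, the common replacements look like this in a small self-contained sketch. The `roundTrip` helper is hypothetical and not taken from the commit; the mappings in the comments are the standard Go 1.16 equivalents.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes data to a temporary file and reads it back using the
// io and os functions that replace their io/ioutil counterparts.
func roundTrip(data string) (string, error) {
	dir, err := os.MkdirTemp("", "ioutil-demo") // was ioutil.TempDir
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "f.txt")
	if err := os.WriteFile(path, []byte(data), 0o600); err != nil { // was ioutil.WriteFile
		return "", err
	}
	b, err := os.ReadFile(path) // was ioutil.ReadFile
	if err != nil {
		return "", err
	}
	all, err := io.ReadAll(strings.NewReader(string(b))) // was ioutil.ReadAll
	if err != nil {
		return "", err
	}
	return string(all), nil
}

func main() {
	got, err := roundTrip("hello")
	fmt.Println(got, err)
}
```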

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-10-14 12:27:58 +08:00
Chris Lu
bbc77f7af4 fix compilation 2021-09-03 22:56:59 -07:00
Chris Lu
0652805236 cloud drive: add createBucket() deleteBucket() 2021-09-03 22:30:55 -07:00
Chris Lu
83cd0fc739 cloud drive: add list buckets 2021-09-03 20:42:02 -07:00
Chris Lu
7ce97b59d8 go fmt 2021-09-01 02:45:42 -07:00
Chris Lu
a31f2907f0 cloud drive: filer.remote.sync supports removing folders 2021-08-29 18:46:28 -07:00
Chris Lu
001a472057 cloud mount: remote storage support hdfs 2021-08-29 18:41:29 -07:00
Chris Lu
05a648bb96 refactor: separating out remote.proto 2021-08-26 15:18:34 -07:00
Chris Lu
a78d0227cd adjust package name 2021-08-23 23:19:31 -07:00
Chris Lu
12631a3f5b cloud drive: gcs simplify a little bit 2021-08-23 14:43:01 -07:00
Chris Lu
95e2b83ca5 fix format 2021-08-23 00:49:59 -07:00
Chris Lu
258063de26 cloud drive: add google cloud storage 2021-08-23 00:29:27 -07:00