Commit Graph

32 Commits

Author SHA1 Message Date
G-OD
504b258258 s3: fix remote object not caching (#7790)
* s3: fix remote object not caching

* s3: address review comments for remote object caching

- Fix leading slash in object name by using strings.TrimPrefix
- Return cached entry from CacheRemoteObjectToLocalCluster to get updated local chunk locations
- Reuse existing helper function instead of inline gRPC call

* s3/filer: add singleflight deduplication for remote object caching

- Add singleflight.Group to FilerServer to deduplicate concurrent cache operations
- Wrap CacheRemoteObjectToLocalCluster with singleflight to ensure only one
  caching operation runs per object when multiple clients request the same file
- Add early-return check for already-cached objects
- S3 API calls filer gRPC with timeout and graceful fallback on error
- Clear negative bucket cache when bucket is created via weed shell
- Add integration tests for remote cache with singleflight deduplication

This benefits all clients (S3, HTTP, Hadoop) accessing remote-mounted objects
by preventing redundant cache operations and improving concurrent access performance.

Fixes: https://github.com/seaweedfs/seaweedfs/discussions/7599

* fix: data race in concurrent remote object caching

- Add mutex to protect chunks slice from concurrent append
- Add mutex to protect fetchAndWriteErr from concurrent read/write
- Fix incorrect error check (was checking assignResult.Error instead of parseErr)
- Rename inner variable to avoid shadowing fetchAndWriteErr

* fix: address code review comments

- Remove duplicate remote caching block in GetObjectHandler, keep only singleflight version
- Add mutex protection for concurrent chunk slice and error access (data race fix)
- Use lazy initialization for S3 client in tests to avoid panic during package load
- Fix markdown linting: add language specifier to code fence, blank lines around tables
- Add 'all' target to Makefile as alias for test-with-server
- Remove unused 'util' import

* style: remove emojis from test files

* fix: add defensive checks and sort chunks by offset

- Add nil check and type assertion check for singleflight result
- Sort chunks by offset after concurrent fetching to maintain file order

* fix: improve test diagnostics and path normalization

- runWeedShell now returns error for better test diagnostics
- Add all targets to .PHONY in Makefile (logs-primary, logs-remote, health)
- Strip leading slash from normalizedObject to avoid double slashes in path

---------

Co-authored-by: chrislu <chris.lu@gmail.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
2025-12-16 12:41:04 -08:00
Aleksey Kosov
283d9e0079 Add context with request (#6824) 2025-05-28 11:34:02 -07:00
chrislu
464611f614 optionally skip deleting file chunks 2024-06-15 11:39:48 -07:00
chrislu
677cfb8ad1 refactor 2024-06-15 09:39:49 -07:00
zemul
0bf56298d5 fix chunk.ModifiedTsNs (#4264)
* fix

* fix mtime s > ns

---------

Co-authored-by: zemul <zhouzemiao@ihuman.com>
2023-03-02 08:24:36 -08:00
chrislu
e1ca6308cb add chunk etag when downloading from remote storage
fix https://github.com/seaweedfs/seaweedfs/issues/3987
2022-12-10 21:49:07 -08:00
chrislu
70a4c98b00 refactor filer_pb.Entry and filer.Entry to use GetChunks()
for later locking on reading chunks
2022-11-15 06:33:36 -08:00
chrislu
ea2637734a refactor filer proto chunk variable from mtime to modified_ts_ns 2022-10-28 12:53:19 -07:00
chrislu
eaeb141b09 move proto package 2022-08-17 12:05:07 -07:00
chrislu
26dbc6c905 move to https://github.com/seaweedfs/seaweedfs 2022-07-29 00:17:28 -07:00
chrislu
9f9ef1340c use streaming mode for long poll grpc calls
streaming mode would create separate grpc connections for each call.
this is to ensure the long poll connections are properly closed.
2021-12-26 00:15:03 -08:00
banjiaojuhao
08336be92e filer server: allow upload file to specific dataNode 2021-12-22 21:57:26 +08:00
Chris Lu
88ff8fc27b ensure uploaded chunks are deleted on error 2021-11-29 00:28:26 -08:00
Chris Lu
24858507cc rename API to avoid confusion 2021-10-30 19:27:25 -07:00
Chris Lu
2789d10342 go fmt 2021-09-14 10:37:06 -07:00
Chris Lu
e5fc35ed0c change server address from string to a type 2021-09-12 22:47:52 -07:00
Chris Lu
0207f5fe9b replicated remote.cache 2021-09-08 15:54:55 -07:00
Chris Lu
2b1feb732c remote.cache supports replication 2021-09-06 18:30:44 -07:00
Chris Lu
7ce97b59d8 go fmt 2021-09-01 02:45:42 -07:00
Chris Lu
6a0bb7106b cloud drive: parallelize remote storage downloading 2021-08-26 16:16:26 -07:00
Chris Lu
05a648bb96 refactor: separating out remote.proto 2021-08-26 15:18:34 -07:00
Chris Lu
e2aa3cf63b fix go test 2021-08-15 23:20:46 -07:00
Chris Lu
c34747c79d rename, fix wrong logic. 2021-08-14 21:46:34 -07:00
Chris Lu
889b143fa7 adjust modification detection logic 2021-08-14 15:44:47 -07:00
Chris Lu
f365af81c2 parallelize remote content fetching 2021-08-14 15:41:37 -07:00
Chris Lu
53e66980b2 add comments 2021-08-14 15:16:10 -07:00
Chris Lu
9921801e0c Revert "use default or path-specific setting for cache replication level"
This reverts commit ba6923b223.
2021-08-14 15:14:26 -07:00
Chris Lu
ba6923b223 use default or path-specific setting for cache replication level 2021-08-14 15:14:01 -07:00
Chris Lu
69655ba8e5 mount: cache on reading remote storage 2021-08-09 22:11:57 -07:00
Chris Lu
9096f6f4f7 cache: set upper limit of chunk size 2021-08-09 15:08:53 -07:00
Chris Lu
a6be2520c9 fix 2021-08-09 14:37:25 -07:00
Chris Lu
713c035a6e shell: remote.cache remote.uncache 2021-08-09 14:35:18 -07:00