# Remote Object Cache Integration Tests
This directory contains integration tests for the remote object caching feature with singleflight deduplication.
## Test Flow

Each test follows this pattern:

1. **Write to local** - Upload data to the primary SeaweedFS cluster (local storage)
2. **Uncache** - Push the data to remote storage and remove the local chunks
3. **Read** - Read the data back, which triggers caching from remote back to local
This tests the full remote caching workflow including singleflight deduplication.
## Architecture

```text
┌─────────────────────────────────────────────────────────────┐
│                         Test Client                         │
│                                                             │
│ 1. PUT data to primary SeaweedFS                            │
│ 2. remote.cache.uncache (push to remote, purge local)       │
│ 3. GET data (triggers caching from remote)                  │
│ 4. Verify singleflight deduplication                        │
└──────────────────────────────┬──────────────────────────────┘
                               │
              ┌────────────────┴────────────────┐
              ▼                                 ▼
┌──────────────────────────────────┐  ┌──────────────────────────────────┐
│        Primary SeaweedFS         │  │         Remote SeaweedFS         │
│           (port 8333)            │  │           (port 8334)            │
│                                  │  │                                  │
│ - Being tested                   │  │ - Acts as "remote" S3            │
│ - Has remote storage mounted     │─▶│ - Receives uncached data         │
│ - Caches remote objects          │  │ - Serves data for caching        │
│ - Singleflight deduplication     │  │                                  │
└──────────────────────────────────┘  └──────────────────────────────────┘
```
## What's Being Tested

- **Basic Remote Caching**: the Write → Uncache → Read workflow
- **Singleflight Deduplication**: concurrent reads trigger only ONE caching operation
- **Large Object Caching**: 5 MB files cache correctly
- **Range Requests**: partial reads work against cached objects
- **Not Found Handling**: a proper error is returned for non-existent objects
## Quick Start

### Run Full Test Suite (Recommended)

```shell
# Build SeaweedFS, start both servers, run the tests, then stop the servers
make test-with-server
```
### Manual Steps

```shell
# 1. Build the SeaweedFS binary
make build-weed

# 2. Start the remote SeaweedFS cluster (acts as "remote" storage)
make start-remote

# 3. Start the primary SeaweedFS cluster (the one being tested)
make start-primary

# 4. Configure the remote storage mount
make setup-remote

# 5. Run the tests
make test

# 6. Clean up
make clean
```
## Configuration

### Primary SeaweedFS (Being Tested)
| Service | Port |
|---|---|
| S3 API | 8333 |
| Filer | 8888 |
| Master | 9333 |
| Volume | 8080 |
### Remote SeaweedFS (Remote Storage)
| Service | Port |
|---|---|
| S3 API | 8334 |
| Filer | 8889 |
| Master | 9334 |
| Volume | 8081 |
## Makefile Targets

```shell
make help             # Show all available targets
make build-weed       # Build the SeaweedFS binary
make start-remote     # Start the remote SeaweedFS cluster
make start-primary    # Start the primary SeaweedFS cluster
make setup-remote     # Configure the remote storage mount
make test             # Run the tests
make test-with-server # Full automated test workflow
make logs             # Show server logs
make health           # Check server status
make clean            # Stop the servers and clean up
```
## Test Details

### TestRemoteCacheBasic

Basic workflow test:

1. Write an object to the primary cluster (local)
2. Uncache it (push to remote, remove local chunks)
3. Read it (triggers caching from remote)
4. Read it again (served from the local cache, so it should be faster)
### TestRemoteCacheConcurrent

Singleflight deduplication test:

1. Write a 1 MB object
2. Uncache it to remote
3. Launch 10 concurrent reads
4. All reads should succeed with the correct data
5. Only ONE caching operation should run (singleflight)
### TestRemoteCacheLargeObject

Large-file test (5 MB) verifying that chunked transfer works correctly.
### TestRemoteCacheRangeRequest

Verifies that HTTP range requests work correctly after caching.
### TestRemoteCacheNotFound

Verifies proper error handling for non-existent objects.
## Troubleshooting

### View logs

```shell
make logs         # Show recent logs from both servers
make logs-primary # Follow the primary server's logs in real time
make logs-remote  # Follow the remote server's logs in real time
```
### Check server health

```shell
make health
```
### Clean up and retry

```shell
make clean
make test-with-server
```