# Remote Object Cache Integration Tests

This directory contains integration tests for the remote object caching feature with singleflight deduplication.

## Test Flow

Each test follows this pattern:

1. **Write to local** - Upload data to the primary SeaweedFS cluster (local storage)
2. **Uncache** - Push the data to remote storage and remove the local chunks
3. **Read** - Read the data back (triggers caching from remote back to local)

This tests the full remote caching workflow, including singleflight deduplication.
## Architecture
```text
┌─────────────────────────────────────────────────────────────────┐
│                           Test Client                           │
│                                                                 │
│  1. PUT data to primary SeaweedFS                               │
│  2. remote.cache.uncache (push to remote, purge local)          │
│  3. GET data (triggers caching from remote)                     │
│  4. Verify singleflight deduplication                           │
└──────────────────────────────────┬──────────────────────────────┘
                                   │
                 ┌─────────────────┴─────────────────┐
                 ▼                                   ▼
┌────────────────────────────────────┐  ┌────────────────────────────────┐
│         Primary SeaweedFS          │  │        Remote SeaweedFS        │
│            (port 8333)             │  │          (port 8334)           │
│                                    │  │                                │
│  - Being tested                    │  │  - Acts as "remote" S3         │
│  - Has remote storage mounted      │─▶│  - Receives uncached data      │
│  - Caches remote objects           │  │  - Serves data for caching     │
│  - Singleflight deduplication      │  │                                │
└────────────────────────────────────┘  └────────────────────────────────┘
```
## What's Being Tested

1. **Basic Remote Caching**: Write → Uncache → Read workflow
2. **Singleflight Deduplication**: Concurrent reads trigger only ONE caching operation
3. **Large Object Caching**: 5 MB files cache correctly
4. **Range Requests**: Partial reads work with cached objects
5. **Not Found Handling**: Proper error for non-existent objects

## Quick Start

### Run the Full Test Suite (Recommended)
```bash
# Build SeaweedFS, start both servers, run tests, stop servers
make test-with-server
```
### Manual Steps

```bash
# 1. Build the SeaweedFS binary
make build-weed

# 2. Start the remote SeaweedFS (acts as "remote" storage)
make start-remote

# 3. Start the primary SeaweedFS (the cluster being tested)
make start-primary

# 4. Configure the remote storage mount
make setup-remote

# 5. Run the tests
make test

# 6. Clean up
make clean
```
## Configuration

### Primary SeaweedFS (Being Tested)

| Service | Port |
|---------|------|
| S3 API  | 8333 |
| Filer   | 8888 |
| Master  | 9333 |
| Volume  | 8080 |
### Remote SeaweedFS (Remote Storage)

| Service | Port |
|---------|------|
| S3 API  | 8334 |
| Filer   | 8889 |
| Master  | 9334 |
| Volume  | 8081 |
## Makefile Targets

```bash
make help             # Show all available targets
make build-weed       # Build the SeaweedFS binary
make start-remote     # Start the remote SeaweedFS
make start-primary    # Start the primary SeaweedFS
make setup-remote     # Configure the remote storage mount
make test             # Run the tests
make test-with-server # Full automated test workflow
make logs             # Show server logs
make health           # Check server status
make clean            # Stop servers and clean up
```
## Test Details

### TestRemoteCacheBasic

Basic workflow test:

1. Write an object to the primary cluster (local)
2. Uncache it (push to remote, remove local chunks)
3. Read it (triggers caching from remote)
4. Read it again (served from the local cache, so it should be faster)
### TestRemoteCacheConcurrent

Singleflight deduplication test:

1. Write a 1 MB object
2. Uncache it to remote
3. Launch 10 concurrent reads
4. All reads should succeed with the correct data
5. Only ONE caching operation should run (singleflight)
### TestRemoteCacheLargeObject

Large-file test (5 MB) to verify that chunked transfer works correctly.

### TestRemoteCacheRangeRequest

Verifies that HTTP range requests work correctly after caching.
### TestRemoteCacheNotFound

Verifies proper error handling for non-existent objects.
## Troubleshooting

### View logs

```bash
make logs           # Show recent logs from both servers
make logs-primary   # Follow the primary server logs in real time
make logs-remote    # Follow the remote server logs in real time
```
### Check server health

```bash
make health
```
### Clean up and retry

```bash
make clean
make test-with-server
```