seaweedFS/test/s3/remote_cache
G-OD 504b258258 s3: fix remote object not caching (#7790)
* s3: fix remote object not caching

* s3: address review comments for remote object caching

- Fix leading slash in object name by using strings.TrimPrefix
- Return cached entry from CacheRemoteObjectToLocalCluster to get updated local chunk locations
- Reuse existing helper function instead of inline gRPC call

* s3/filer: add singleflight deduplication for remote object caching

- Add singleflight.Group to FilerServer to deduplicate concurrent cache operations
- Wrap CacheRemoteObjectToLocalCluster with singleflight to ensure only one
  caching operation runs per object when multiple clients request the same file
- Add early-return check for already-cached objects
- S3 API calls filer gRPC with timeout and graceful fallback on error
- Clear negative bucket cache when bucket is created via weed shell
- Add integration tests for remote cache with singleflight deduplication

This benefits all clients (S3, HTTP, Hadoop) accessing remote-mounted objects
by preventing redundant cache operations and improving concurrent access performance.

Fixes: https://github.com/seaweedfs/seaweedfs/discussions/7599

* fix: data race in concurrent remote object caching

- Add mutex to protect chunks slice from concurrent append
- Add mutex to protect fetchAndWriteErr from concurrent read/write
- Fix incorrect error check (was checking assignResult.Error instead of parseErr)
- Rename inner variable to avoid shadowing fetchAndWriteErr

* fix: address code review comments

- Remove duplicate remote caching block in GetObjectHandler, keep only singleflight version
- Add mutex protection for concurrent chunk slice and error access (data race fix)
- Use lazy initialization for S3 client in tests to avoid panic during package load
- Fix markdown linting: add language specifier to code fence, blank lines around tables
- Add 'all' target to Makefile as alias for test-with-server
- Remove unused 'util' import

* style: remove emojis from test files

* fix: add defensive checks and sort chunks by offset

- Add nil check and type assertion check for singleflight result
- Sort chunks by offset after concurrent fetching to maintain file order

* fix: improve test diagnostics and path normalization

- runWeedShell now returns error for better test diagnostics
- Add all targets to .PHONY in Makefile (logs-primary, logs-remote, health)
- Strip leading slash from normalizedObject to avoid double slashes in path

---------

Co-authored-by: chrislu <chris.lu@gmail.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
2025-12-16 12:41:04 -08:00

Remote Object Cache Integration Tests

This directory contains integration tests for the remote object caching feature with singleflight deduplication.

Test Flow

Each test follows this pattern:

  1. Write to local - Upload data to primary SeaweedFS (local storage)
  2. Uncache - Push data to remote storage and remove local chunks
  3. Read - Read data (triggers caching from remote back to local)

This exercises the full remote caching workflow, including singleflight deduplication.
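The three steps can be pictured with a toy in-memory model. This is only an illustrative sketch: the `store` type and its methods are hypothetical and stand in for the primary cluster and its remote mount; the real tests talk to running SeaweedFS servers.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("object not found")

// store is a hypothetical, in-memory model of a primary cluster
// that has remote storage mounted.
type store struct {
	local    map[string][]byte // chunks held on the primary
	remote   map[string][]byte // objects held in remote storage
	cacheOps int               // how many times a read had to cache from remote
}

func newStore() *store {
	return &store{local: map[string][]byte{}, remote: map[string][]byte{}}
}

// Put writes an object to local storage only (step 1).
func (s *store) Put(key string, data []byte) { s.local[key] = data }

// Uncache pushes the object to remote storage and drops the local copy (step 2).
func (s *store) Uncache(key string) error {
	data, ok := s.local[key]
	if !ok {
		return errNotFound
	}
	s.remote[key] = data
	delete(s.local, key)
	return nil
}

// Get serves from local storage; on a miss it caches the object
// from remote storage first (step 3).
func (s *store) Get(key string) ([]byte, error) {
	if data, ok := s.local[key]; ok {
		return data, nil
	}
	data, ok := s.remote[key]
	if !ok {
		return nil, errNotFound
	}
	s.cacheOps++
	s.local[key] = data
	return data, nil
}

func main() {
	s := newStore()
	s.Put("bucket/obj", []byte("hello"))
	if err := s.Uncache("bucket/obj"); err != nil {
		fmt.Println("uncache failed:", err)
		return
	}
	first, _ := s.Get("bucket/obj")  // cached from remote
	second, _ := s.Get("bucket/obj") // served locally
	fmt.Println(string(first), string(second), "cache operations:", s.cacheOps)
}
```

In this model the second read does not touch remote storage, which is the behaviour the basic test expects from the real system.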

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Test Client                               │
│                                                                  │
│    1. PUT data to primary SeaweedFS                             │
│    2. remote.cache.uncache (push to remote, purge local)        │
│    3. GET data (triggers caching from remote)                   │
│    4. Verify singleflight deduplication                         │
└──────────────────────────────────┬──────────────────────────────┘
                                   │
                 ┌─────────────────┴─────────────────┐
                 ▼                                   ▼
┌────────────────────────────────────┐   ┌────────────────────────────────┐
│     Primary SeaweedFS              │   │     Remote SeaweedFS           │
│        (port 8333)                 │   │        (port 8334)             │
│                                    │   │                                │
│  - Being tested                    │   │  - Acts as "remote" S3         │
│  - Has remote storage mounted      │──▶│  - Receives uncached data      │
│  - Caches remote objects           │   │  - Serves data for caching     │
│  - Singleflight deduplication      │   │                                │
└────────────────────────────────────┘   └────────────────────────────────┘

What's Being Tested

  1. Basic Remote Caching: Write → Uncache → Read workflow
  2. Singleflight Deduplication: Concurrent reads of the same object trigger only ONE caching operation
  3. Large Object Caching: 5MB files cache correctly
  4. Range Requests: Partial reads work with cached objects
  5. Not Found Handling: Proper error for non-existent objects

Quick Start

# Build SeaweedFS, start both servers, run tests, stop servers
make test-with-server

Manual Steps

# 1. Build SeaweedFS binary
make build-weed

# 2. Start remote SeaweedFS (acts as "remote" storage)
make start-remote

# 3. Start primary SeaweedFS (the one being tested)
make start-primary

# 4. Configure remote storage mount
make setup-remote

# 5. Run tests
make test

# 6. Clean up
make clean

Configuration

Primary SeaweedFS (Being Tested)

Service   Port
S3 API    8333
Filer     8888
Master    9333
Volume    8080

Remote SeaweedFS (Remote Storage)

Service   Port
S3 API    8334
Filer     8889
Master    9334
Volume    8081

Makefile Targets

make help           # Show all available targets
make build-weed     # Build SeaweedFS binary
make start-remote   # Start remote SeaweedFS
make start-primary  # Start primary SeaweedFS
make setup-remote   # Configure remote storage mount
make test           # Run tests
make test-with-server  # Full automated test workflow
make logs           # Show server logs
make health         # Check server status
make clean          # Stop servers and clean up

Test Details

TestRemoteCacheBasic

Basic workflow test:

  1. Write object to primary (local)
  2. Uncache (push to remote, remove local chunks)
  3. Read (triggers caching from remote)
  4. Read again (served from the local cache, so it should be faster)

TestRemoteCacheConcurrent

Singleflight deduplication test:

  1. Write 1MB object
  2. Uncache to remote
  3. Launch 10 concurrent reads
  4. All should succeed with correct data
  5. Only ONE caching operation should run (singleflight)
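The deduplication idea can be sketched with the standard library alone. The `group` type below is a simplified stand-in for `singleflight.Group`, not the filer's actual code, and the object key and the `simulateConcurrentReads` helper are hypothetical. To keep the outcome deterministic, the simulated cache operation stays open until every other reader has joined it.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call is one in-flight execution shared by every caller of the same key.
type call struct {
	wg   sync.WaitGroup
	val  string
	err  error
	dups int // callers that joined while the call was running
}

// group is a minimal, illustrative stand-in for singleflight.Group.
type group struct {
	mu sync.Mutex
	m  map[string]*call
}

// Do runs fn at most once at a time per key. Callers that arrive while
// fn is running block and receive the same result.
func (g *group) Do(key string, fn func() (string, error)) (string, error) {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	if c, ok := g.m[key]; ok {
		c.dups++
		g.mu.Unlock()
		c.wg.Wait()
		return c.val, c.err
	}
	c := new(call)
	c.wg.Add(1)
	g.m[key] = c
	g.mu.Unlock()

	c.val, c.err = fn()

	g.mu.Lock()
	delete(g.m, key)
	g.mu.Unlock()
	c.wg.Done()
	return c.val, c.err
}

// waiting reports how many callers are currently blocked on key.
func (g *group) waiting(key string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	if c, ok := g.m[key]; ok {
		return c.dups
	}
	return 0
}

// simulateConcurrentReads starts n readers of one object and returns
// how many times the simulated cache operation actually ran.
func simulateConcurrentReads(n int) int64 {
	const key = "/buckets/demo/object" // hypothetical object path
	var (
		g    group
		runs int64
		wg   sync.WaitGroup
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.Do(key, func() (string, error) {
				atomic.AddInt64(&runs, 1)
				// Keep the operation open until all other readers have joined.
				for g.waiting(key) < n-1 {
					time.Sleep(time.Millisecond)
				}
				return "cached", nil
			})
		}()
	}
	wg.Wait()
	return runs
}

func main() {
	fmt.Println("cache operations:", simulateConcurrentReads(10))
}
```

With ten readers the sketch reports a single cache operation, mirroring steps 3 to 5 above.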

TestRemoteCacheLargeObject

Large file test (5 MB) that verifies chunked transfer works correctly.
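The commit notes for this feature mention two safeguards for chunked caching: a mutex around the shared chunk slice, and sorting chunks by offset after concurrent fetching. A minimal sketch of that pattern follows; the `chunk` type, the `fetchChunks` helper, and the 2 MB chunk size are assumptions made for illustration, and no data is actually transferred.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// chunk is a hypothetical stand-in for a stored file chunk record.
type chunk struct {
	Offset int64
	Size   int64
}

// fetchChunks simulates fetching fixed-size chunks concurrently and
// returns the chunk records in file order.
func fetchChunks(totalSize, chunkSize int64) []chunk {
	var (
		mu     sync.Mutex // protects chunks from concurrent append
		chunks []chunk
		wg     sync.WaitGroup
	)
	for off := int64(0); off < totalSize; off += chunkSize {
		size := chunkSize
		if off+size > totalSize {
			size = totalSize - off // last chunk may be shorter
		}
		wg.Add(1)
		go func(off, size int64) {
			defer wg.Done()
			// A real implementation would copy bytes from remote storage here.
			mu.Lock()
			chunks = append(chunks, chunk{Offset: off, Size: size})
			mu.Unlock()
		}(off, size)
	}
	wg.Wait()

	// Goroutines finish in arbitrary order, so restore file order.
	sort.Slice(chunks, func(i, j int) bool {
		return chunks[i].Offset < chunks[j].Offset
	})
	return chunks
}

func main() {
	for _, c := range fetchChunks(5<<20, 2<<20) {
		fmt.Printf("offset=%d size=%d\n", c.Offset, c.Size)
	}
}
```

Without the final sort, the slice order would depend on goroutine scheduling, and a reader assembling the file from it could see chunks out of order.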

TestRemoteCacheRangeRequest

Tests that HTTP range requests work correctly after caching.
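For readers unfamiliar with range requests, here is a self-contained sketch of what such a check looks like at the HTTP level. It uses an in-process `httptest` server as a stand-in for the S3 endpoint, so the `rangeGet` and `demo` helpers are illustrative only and do not reflect how the actual test is written.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// rangeGet requests bytes [start, end] (inclusive) of url and returns
// the status code and the body that came back.
func rangeGet(url string, start, end int) (int, []byte, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, nil, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, body, err
}

// demo serves data from a throwaway local server and reads one range of it.
func demo(data []byte, start, end int) (int, []byte, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// ServeContent handles the Range header for us.
		http.ServeContent(w, r, "object", time.Time{}, bytes.NewReader(data))
	}))
	defer srv.Close()
	return rangeGet(srv.URL, start, end)
}

func main() {
	status, body, err := demo([]byte("0123456789abcdef"), 4, 7)
	fmt.Println(status, string(body), err)
}
```

A successful partial read returns status 206 (Partial Content) and only the requested bytes, here `4567`.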

TestRemoteCacheNotFound

Tests that reading a non-existent object returns a proper error.

Troubleshooting

View logs

make logs           # Show recent logs from both servers
make logs-primary   # Follow primary logs in real-time
make logs-remote    # Follow remote logs in real-time

Check server health

make health
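What `make health` checks is defined in the Makefile and is not shown here. As a rough standalone alternative, the sketch below only tests whether each port from the Configuration section accepts TCP connections; the `reachable` helper and the `localhost` host are assumptions.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// reachable reports whether a TCP connection to addr succeeds within timeout.
func reachable(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	// Ports taken from the Configuration section; the host is assumed.
	services := []struct{ name, addr string }{
		{"primary S3", "localhost:8333"},
		{"primary filer", "localhost:8888"},
		{"primary master", "localhost:9333"},
		{"primary volume", "localhost:8080"},
		{"remote S3", "localhost:8334"},
		{"remote filer", "localhost:8889"},
		{"remote master", "localhost:9334"},
		{"remote volume", "localhost:8081"},
	}
	for _, s := range services {
		status := "DOWN"
		if reachable(s.addr, time.Second) {
			status = "up"
		}
		fmt.Printf("%-15s %-15s %s\n", s.name, s.addr, status)
	}
}
```

An open port only shows that a process is listening; it does not confirm that the remote storage mount is configured.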

Clean up and retry

make clean
make test-with-server