seaweedFS/test/s3/sse
Chris Lu 1b1e5f69a2 Add TUS protocol support for resumable uploads (#7592)
* Add TUS protocol integration tests

This commit adds integration tests for the TUS (resumable upload) protocol
in preparation for implementing TUS support in the filer.

Test coverage includes:
- OPTIONS handler for capability discovery
- Basic single-request upload
- Chunked/resumable uploads
- HEAD requests for offset tracking
- DELETE for upload cancellation
- Error handling (invalid offsets, missing uploads)
- Creation-with-upload extension
- Resume after interruption simulation

Tests are skipped in short mode and require a running SeaweedFS cluster.

* Add TUS session storage types and utilities

Implements TUS upload session management:
- TusSession struct for tracking upload state
- Session creation with directory-based storage
- Session persistence using filer entries
- Session retrieval and offset updates
- Session deletion with chunk cleanup
- Upload completion with chunk assembly into final file

Session data is stored in /.uploads.tus/{upload-id}/ directory,
following the pattern used by S3 multipart uploads.

* Add TUS HTTP handlers

Implements TUS protocol HTTP handlers:
- tusHandler: Main entry point routing requests
- tusOptionsHandler: Capability discovery (OPTIONS)
- tusCreateHandler: Create new upload (POST)
- tusHeadHandler: Get upload offset (HEAD)
- tusPatchHandler: Upload data at offset (PATCH)
- tusDeleteHandler: Cancel upload (DELETE)
- tusWriteData: Upload data to volume servers

Features:
- Supports creation-with-upload extension
- Validates TUS protocol headers
- Offset conflict detection
- Automatic upload completion when size is reached
- Metadata parsing from Upload-Metadata header

* Wire up TUS protocol routes in filer server

Add TUS handler route (/.tus/) to the filer HTTP server.
The TUS route is registered before the catch-all route to ensure
proper routing of TUS protocol requests.

TUS protocol is now accessible at:
- OPTIONS /.tus/ - Capability discovery
- POST /.tus/{path} - Create upload
- HEAD /.tus/.uploads/{id} - Get offset
- PATCH /.tus/.uploads/{id} - Upload data
- DELETE /.tus/.uploads/{id} - Cancel upload
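As a client-side sketch of driving these routes (the host/port, the `/.tus` prefix, and the upload-id path below are illustrative; the header names come from the TUS 1.0.0 core protocol, not from SeaweedFS-specific code):

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"strconv"
)

// newTusCreateRequest builds the POST that creates a resumable upload.
// Tus-Resumable and Upload-Length are required by the TUS core protocol.
func newTusCreateRequest(base string, uploadLen int64) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, base, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Tus-Resumable", "1.0.0")
	req.Header.Set("Upload-Length", strconv.FormatInt(uploadLen, 10))
	return req, nil
}

// newTusPatchRequest builds the PATCH that appends data at a given offset.
// The server rejects the request if Upload-Offset does not match its state.
func newTusPatchRequest(uploadURL string, offset int64, data []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPatch, uploadURL, bytes.NewReader(data))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Tus-Resumable", "1.0.0")
	req.Header.Set("Upload-Offset", strconv.FormatInt(offset, 10))
	req.Header.Set("Content-Type", "application/offset+octet-stream")
	return req, nil
}

func main() {
	create, _ := newTusCreateRequest("http://127.0.0.1:8888/.tus/dir/bigfile.bin", 1<<20)
	patch, _ := newTusPatchRequest("http://127.0.0.1:8888/.tus/.uploads/abc123", 0, []byte("hello"))
	fmt.Println(create.Header.Get("Upload-Length"), patch.Header.Get("Upload-Offset"))
}
```

A real client would loop: HEAD to read the current Upload-Offset, then PATCH from that offset until the full Upload-Length is stored.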

* Improve TUS integration test setup

Add comprehensive Makefile for TUS tests with targets:
- test-with-server: Run tests with automatic server management
- test-basic/chunked/resume/errors: Specific test categories
- manual-start/stop: For development testing
- debug-logs/status: For debugging
- ci-test: For CI/CD pipelines

Update README.md with:
- Detailed TUS protocol documentation
- All endpoint descriptions with headers
- Usage examples with curl commands
- Architecture diagram
- Comparison with S3 multipart uploads

Follows the pattern established by other tests in the test/ folder.

* Fix TUS integration tests and creation-with-upload

- Fix test URLs to use full URLs instead of relative paths
- Fix creation-with-upload to refresh session before completing
- Fix Makefile to properly handle test cleanup
- Add FullURL helper function to TestCluster

* Add TUS protocol tests to GitHub Actions CI

- Add tus-tests.yml workflow that runs on PRs and pushes
- Runs when TUS-related files are modified
- Automatic server management for integration testing
- Upload logs on failure for debugging

* Make TUS base path configurable via CLI

- Add -tus.path CLI flag to filer command
- TUS is disabled by default (empty path)
- Example: -tus.path=/.tus to enable at /.tus endpoint
- Update test Makefile to use -tus.path flag
- Update README with TUS enabling instructions

* Rename -tus.path to -tusBasePath with default .tus

- Rename CLI flag from -tus.path to -tusBasePath
- Default to .tus (TUS enabled by default)
- Add -filer.tusBasePath option to weed server command
- Properly handle path prefix (prepend / if missing)

* Address code review comments

- Sort chunks by offset before assembling final file
- Use chunk.Offset directly instead of recalculating
- Return error on invalid file ID instead of skipping
- Require Content-Length header for PATCH requests
- Use fs.option.Cipher for encryption setting
- Detect MIME type from data using http.DetectContentType
- Fix concurrency group for push events in workflow
- Use os.Interrupt instead of Kill for graceful shutdown in tests

* fmt

* Address remaining code review comments

- Fix potential open redirect vulnerability by sanitizing uploadLocation path
- Add language specifier to README code block
- Handle os.Create errors in test setup
- Use waitForHTTPServer instead of time.Sleep for master/volume readiness
- Improve test reliability and debugging

* Address critical and high-priority review comments

- Add per-session locking to prevent race conditions in updateTusSessionOffset
- Stream data directly to volume server instead of buffering entire chunk
- Only buffer 512 bytes for MIME type detection, then stream remaining data
- Clean up session locks when session is deleted

* Fix race condition to work across multiple filer instances

- Store each chunk as a separate file entry instead of updating session JSON
- Chunk file names encode offset, size, and fileId for atomic storage
- getTusSession loads chunks from directory listing (atomic read)
- Eliminates read-modify-write race condition across multiple filers
- Remove in-memory mutex that only worked for single filer instance

* Address code review comments: fix variable shadowing, sniff size, and test stability

- Rename path variable to reqPath to avoid shadowing path package
- Make sniff buffer size respect contentLength (read at most contentLength bytes)
- Handle Content-Length < 0 in creation-with-upload (return error for chunked encoding)
- Fix test cluster: use temp directory for filer store, add startup delay

* Fix test stability: increase cluster stabilization delay to 5 seconds

The tests were intermittently failing because the volume server needed more
time to create volumes and register with the master. Increasing the delay
from 2 to 5 seconds fixes the flaky test behavior.

* Address PR review comments for TUS protocol support

- Fix strconv.Atoi error handling in test file (lines 386, 747)
- Fix lossy fileId encoding: use base64 instead of underscore replacement
- Add pagination support for ListDirectoryEntries in getTusSession
- Batch delete chunks instead of one-by-one in deleteTusSession

* Address additional PR review comments for TUS protocol

- Fix UploadAt timestamp: use entry.Crtime instead of time.Now()
- Remove redundant JSON content in chunk entry (metadata in filename)
- Refactor tusWriteData to stream in 4MB chunks to avoid OOM on large uploads
- Pass filer.Entry to parseTusChunkPath to preserve actual upload time

* Address more PR review comments for TUS protocol

- Normalize TUS path once in filer_server.go, store in option.TusPath
- Remove redundant path normalization from TUS handlers
- Remove goto statement in tusCreateHandler, simplify control flow

* Remove unnecessary mutexes in tusWriteData

The upload loop is sequential, so uploadErrLock and chunksLock are not needed.

* Rename updateTusSessionOffset to saveTusChunk

Remove unused newOffset parameter and rename function to better reflect its purpose.

* Improve TUS upload performance and add path validation

- Reuse operation.Uploader across sub-chunks for better connection reuse
- Guard against TusPath='/' to prevent hijacking all filer routes

* Address PR review comments for TUS protocol

- Fix critical chunk filename parsing: use strings.Cut instead of SplitN
  to correctly handle base64-encoded fileIds that may contain underscores
- Rename tusPath to tusBasePath for naming consistency across codebase
- Add background garbage collection for expired TUS sessions (runs hourly)
- Improve error messages with %w wrapping for better debuggability

* Address additional TUS PR review comments

- Fix tusBasePath default to use leading slash (/.tus) for consistency
- Add chunk contiguity validation in completeTusUpload to detect gaps/overlaps
- Fix offset calculation to find maximum contiguous range from 0, not just last chunk
- Return 413 Request Entity Too Large instead of silently truncating content
- Document tusChunkSize rationale (4MB balances memory vs request overhead)
- Fix Makefile xargs portability by removing GNU-specific -r flag
- Add explicit -tusBasePath flag to integration test for robustness
- Fix README example to use /.uploads/tus path format

* Revert log_buffer changes (moved to separate PR)

* Minor style fixes from PR review

- Simplify tusBasePath flag description to use example format
- Add 'TUS upload' prefix to session not found error message
- Remove duplicate tusChunkSize comment
- Capitalize warning message for consistency
- Add grep filter to Makefile xargs for better empty input handling
2025-12-14 21:56:07 -08:00

S3 Server-Side Encryption (SSE) Integration Tests

This directory contains comprehensive integration tests for SeaweedFS S3 API Server-Side Encryption functionality. These tests validate the complete end-to-end encryption/decryption pipeline from S3 API requests through filer metadata storage.

Overview

The SSE integration tests cover three main encryption methods:

  • SSE-C (Customer-Provided Keys): Client provides encryption keys via request headers
  • SSE-KMS (Key Management Service): Server manages encryption keys through a KMS provider
  • SSE-S3 (Server-Managed Keys): Server automatically manages encryption keys

🆕 Real KMS Integration

The tests now include real KMS integration with OpenBao, providing:

  • Actual encryption/decryption operations (not mock keys)
  • Multiple KMS keys for different security levels
  • Per-bucket KMS configuration testing
  • Performance benchmarking with real KMS operations

See README_KMS.md for detailed KMS integration documentation.

Why Integration Tests Matter

These integration tests were created to close a critical gap in test coverage. While the SeaweedFS codebase had comprehensive unit tests for SSE components, it lacked integration tests that validated the complete request flow:

Client Request → S3 API → Filer Storage → Metadata Persistence → Retrieval → Decryption

The Bug These Tests Would Have Caught

A critical bug was discovered where:

  • S3 API correctly encrypted data and sent metadata headers to the filer
  • Filer did not process SSE metadata headers, losing all encryption metadata
  • Objects could be encrypted but never decrypted (metadata was lost)

Unit tests passed because they tested components in isolation, but the integration was broken. These integration tests specifically validate that:

  1. Encryption metadata is correctly sent to the filer
  2. Filer properly processes and stores the metadata
  3. Objects can be successfully retrieved and decrypted
  4. Copy operations preserve encryption metadata
  5. Multipart uploads maintain encryption consistency

Test Structure

Core Integration Tests

Basic Functionality

  • TestSSECIntegrationBasic - Basic SSE-C PUT/GET cycle
  • TestSSEKMSIntegrationBasic - Basic SSE-KMS PUT/GET cycle

Data Size Validation

  • TestSSECIntegrationVariousDataSizes - SSE-C with various data sizes (0B to 1MB)
  • TestSSEKMSIntegrationVariousDataSizes - SSE-KMS with various data sizes

Object Copy Operations

  • TestSSECObjectCopyIntegration - SSE-C object copying (key rotation, encryption changes)
  • TestSSEKMSObjectCopyIntegration - SSE-KMS object copying

Multipart Uploads

  • TestSSEMultipartUploadIntegration - SSE multipart uploads for large objects

Error Conditions

  • TestSSEErrorConditions - Invalid keys, malformed requests, error handling

Performance Tests

  • BenchmarkSSECThroughput - SSE-C performance benchmarking
  • BenchmarkSSEKMSThroughput - SSE-KMS performance benchmarking

Running Tests

Prerequisites

  1. Build SeaweedFS: Ensure the weed binary is built and available in PATH

    cd /path/to/seaweedfs
    make
    
  2. Dependencies: Tests use the AWS SDK for Go v2 and testify; both are fetched automatically via Go modules

Quick Test

Run basic SSE integration tests:

make test-basic

Comprehensive Testing

Run all SSE integration tests:

make test

Specific Test Categories

make test-ssec      # SSE-C tests only
make test-ssekms    # SSE-KMS tests only  
make test-copy      # Copy operation tests
make test-multipart # Multipart upload tests
make test-errors    # Error condition tests

Performance Testing

make benchmark      # Performance benchmarks
make perf          # Various data size performance tests

KMS Integration Testing

make setup-openbao          # Set up OpenBao KMS
make test-with-kms          # Run all SSE tests with real KMS
make test-ssekms-integration # Run SSE-KMS with OpenBao only
make clean-kms             # Clean up KMS environment

Development Testing

make manual-start   # Start SeaweedFS for manual testing
# ... run manual tests ...
make manual-stop    # Stop and cleanup

Test Configuration

Default Configuration

The tests use these default settings:

  • S3 Endpoint: http://127.0.0.1:8333
  • Access Key: some_access_key1
  • Secret Key: some_secret_key1
  • Region: us-east-1
  • Bucket Prefix: test-sse-

Custom Configuration

Override defaults via environment variables:

S3_PORT=8444 FILER_PORT=8889 make test

Test Environment

Each test run:

  1. Starts a complete SeaweedFS cluster (master, volume, filer, s3)
  2. Configures KMS support for SSE-KMS tests
  3. Creates temporary buckets with unique names
  4. Runs tests with real HTTP requests
  5. Cleans up all test artifacts

Test Data Coverage

Data Sizes Tested

  • 0 bytes: Empty files (edge case)
  • 1 byte: Minimal data
  • 16 bytes: Single AES block
  • 31 bytes: Just under two blocks
  • 32 bytes: Exactly two blocks
  • 100 bytes: Small file
  • 1 KB: Small text file
  • 8 KB: Medium file
  • 64 KB: Large file
  • 1 MB: Very large file

Encryption Key Scenarios

  • SSE-C: Random 256-bit keys, key rotation, wrong keys
  • SSE-KMS: Various key IDs, encryption contexts, bucket keys
  • Copy Operations: Same key, different keys, encryption transitions
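For the SSE-C scenarios, the key material travels in three request headers: the algorithm name (AES256), the base64-encoded 256-bit key, and a base64-encoded MD5 of the raw key that S3 uses to detect corruption in transit. A minimal sketch of deriving those header values (the helper name is illustrative, not the actual test code):

```go
package main

import (
	"crypto/md5"
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// sseCHeaders derives the three SSE-C header values from a raw 256-bit key:
// the algorithm, the base64-encoded key, and the base64-encoded MD5 digest
// of the raw (not encoded) key bytes.
func sseCHeaders(key []byte) (alg, keyB64, keyMD5 string) {
	sum := md5.Sum(key)
	return "AES256",
		base64.StdEncoding.EncodeToString(key),
		base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	key := make([]byte, 32) // random 256-bit customer-provided key
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	alg, k, m := sseCHeaders(key)
	fmt.Println(alg, len(k), len(m)) // AES256 44 24
}
```

The same three values must be resent on GET; the "wrong key" error tests work by resending a different key, which fails the server-side MD5/decryption check.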

Critical Test Scenarios

Metadata Persistence Validation

The integration tests specifically validate scenarios that would catch metadata storage bugs:

// 1. Upload with SSE-C
client.PutObject(ctx, &s3.PutObjectInput{..., SSECustomerKey: key})  // ← Metadata sent to filer

// 2. Retrieve with SSE-C
client.GetObject(ctx, &s3.GetObjectInput{..., SSECustomerKey: key})  // ← Metadata retrieved from filer

// 3. Verify decryption works
assert.Equal(t, originalData, decryptedData)                         // ← Would fail if metadata were lost

Content-Length Validation

Tests verify that Content-Length headers are correct, which would catch bugs related to IV handling:

assert.Equal(t, int64(originalSize), resp.ContentLength)  // ← Would catch IV-in-stream bugs
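As a self-contained illustration of why this check is effective (assuming a CTR-style length-preserving cipher for the sketch, which is not necessarily the exact construction SeaweedFS uses internally): the ciphertext is exactly as long as the plaintext, so an IV accidentally written into the data stream shows up as a fixed-size length mismatch.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

// encryptCTR encrypts plaintext with AES-CTR. CTR is a stream mode, so the
// output is exactly the same length as the input: no padding, no embedded IV.
func encryptCTR(key, iv, plaintext []byte) []byte {
	block, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	out := make([]byte, len(plaintext))
	cipher.NewCTR(block, iv).XORKeyStream(out, plaintext)
	return out
}

func main() {
	plaintext := make([]byte, 100)
	ct := encryptCTR(make([]byte, 32), make([]byte, aes.BlockSize), plaintext)

	// Correct behavior: stored length == plaintext length (100).
	// A buggy implementation that prepends the IV to the data stream would
	// report len(plaintext)+aes.BlockSize (116) instead, which the
	// Content-Length assertion above detects immediately.
	fmt.Println(len(ct), len(ct)+aes.BlockSize)
}
```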

Debugging

View Logs

make debug-logs     # Show recent log entries
make debug-status   # Show process and port status

Manual Testing

make manual-start   # Start SeaweedFS
# Test with S3 clients, curl, etc.
make manual-stop    # Cleanup

Integration Test Benefits

These integration tests provide:

  1. End-to-End Validation: Complete request pipeline testing
  2. Metadata Persistence: Validates filer storage/retrieval of encryption metadata
  3. Real Network Communication: Uses actual HTTP requests and responses
  4. Production-Like Environment: Full SeaweedFS cluster with all components
  5. Regression Protection: Prevents critical integration bugs
  6. Performance Baselines: Benchmarking for performance monitoring

Continuous Integration

For CI/CD pipelines, use:

make ci-test        # Quick tests suitable for CI
make stress         # Stress testing for stability validation

Key Differences from Unit Tests

Aspect       | Unit Tests           | Integration Tests
------------ | -------------------- | -------------------------
Scope        | Individual functions | Complete request pipeline
Dependencies | Mocked/simulated     | Real SeaweedFS cluster
Network      | None                 | Real HTTP requests
Storage      | In-memory            | Real filer database
Metadata     | Manual simulation    | Actual storage/retrieval
Speed        | Fast (milliseconds)  | Slower (seconds)
Coverage     | Component logic      | System integration

Conclusion

These integration tests ensure that SeaweedFS SSE functionality works correctly in production-like environments. They complement the existing unit tests by validating that all components work together properly, providing confidence that encryption/decryption operations will succeed for real users.

Most importantly, these tests would have immediately caught the critical filer metadata storage bug that was previously undetected, demonstrating the crucial importance of integration testing for distributed systems.