* fix: volume balance detection now returns multiple tasks per run (#8551)

  Previously, detectForDiskType() returned at most 1 balance task per disk type, making the MaxJobsPerDetection setting ineffective. The detection loop now iterates within each disk type, planning multiple moves until the imbalance drops below threshold or maxResults is reached. Effective volume counts are adjusted after each planned move so the algorithm correctly re-evaluates which server is overloaded.

* fix: factor pending tasks into destination scoring and use UnixNano for task IDs

  - Use UnixNano instead of Unix for task IDs to avoid collisions when multiple tasks are created within the same second
  - Adjust calculateBalanceScore to include LoadCount (pending + assigned tasks) in the utilization estimate, so the destination picker avoids stacking multiple planned moves onto the same target disk

* test: add comprehensive balance detection tests for complex scenarios

  Cover multi-server convergence, max-server shifting, destination spreading, pre-existing pending task skipping, no-duplicate-volume invariant, and parameterized convergence verification across different cluster shapes and thresholds.
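The multi-move loop described above can be sketched in isolation. This is a minimal model, not the SeaweedFS implementation: `planMoves`, the threshold semantics, and the max/min selection are illustrative stand-ins for detectForDiskType's real logic.

```go
package main

import "fmt"

type move struct{ from, to string }

// planMoves repeatedly shifts one volume from the fullest server toward the
// emptiest one, adjusting the effective counts after each planned move so the
// next iteration re-evaluates which server is overloaded. It stops when the
// max/avg imbalance ratio drops below threshold, no single move can help, or
// maxResults is reached (maxResults <= 0 means no cap).
func planMoves(counts map[string]int, threshold float64, maxResults int) []move {
	if len(counts) == 0 {
		return nil
	}
	// work on a copy so the caller's counts are untouched
	eff := make(map[string]int, len(counts))
	total := 0
	for s, c := range counts {
		eff[s] = c
		total += c
	}
	avg := float64(total) / float64(len(eff))

	var moves []move
	for maxResults <= 0 || len(moves) < maxResults {
		var maxS, minS string
		for s := range eff {
			if maxS == "" || eff[s] > eff[maxS] {
				maxS = s
			}
			if minS == "" || eff[s] < eff[minS] {
				minS = s
			}
		}
		// balanced enough, or a move would just swap the roles
		if float64(eff[maxS])/avg <= threshold || eff[maxS]-eff[minS] <= 1 {
			break
		}
		moves = append(moves, move{from: maxS, to: minS})
		eff[maxS]-- // adjust effective counts so later iterations
		eff[minS]++ // see the move as already done
	}
	return moves
}

func main() {
	moves := planMoves(map[string]int{"srv1": 9, "srv2": 1, "srv3": 2}, 1.1, 0)
	fmt.Println(len(moves))
}
```

With 9/1/2 volumes and a 1.1 threshold the sketch plans five moves, converging to 4/4/4; passing a positive maxResults caps the result, which is the behavior the fix restores.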
* fix: address PR review findings in balance detection

  - hasMore flag: compute from len(results) >= maxResults so the scheduler knows more pages may exist, matching vacuum/EC handler pattern
  - Exhausted server fallthrough: when no eligible volumes remain on the current maxServer (all have pending tasks) or destination planning fails, mark the server as exhausted and continue to the next overloaded server instead of stopping the entire detection loop
  - Return canonical destination server ID directly from createBalanceTask instead of resolving via findServerIDByAddress, eliminating the fragile address→ID lookup for adjustment tracking
  - Fix bestScore sentinel: use math.Inf(-1) instead of -1.0 so disks with negative scores (high pending load, same rack/DC) are still selected as the best available destination
  - Add TestDetection_ExhaustedServerFallsThrough covering the scenario where the top server's volumes are all blocked by pre-existing tasks

* test: fix computeEffectiveCounts and add len guard in no-duplicate test

  - computeEffectiveCounts now takes a servers slice to seed counts for all known servers (including empty ones) and uses an address→ID map from the topology spec instead of scanning metrics, so destination servers with zero initial volumes are tracked correctly
  - TestDetection_NoDuplicateVolumesAcrossIterations now asserts len > 1 before checking duplicates, so the test actually fails if Detection regresses to returning a single task

* fix: remove redundant HasAnyTask check in createBalanceTask

  The HasAnyTask check in createBalanceTask duplicated the same check already performed in detectForDiskType's volume selection loop. Since detection runs single-threaded (MaxDetectionConcurrency: 1), no race can occur between the two points.
* fix: consistent hasMore pattern and remove double-counted LoadCount in scoring

  - Adopt vacuum_handler's hasMore pattern: over-fetch by 1, check len > maxResults, and truncate, for consistent truncation semantics
  - Remove the direct LoadCount penalty in calculateBalanceScore since LoadCount is already factored into effectiveVolumeCount for utilization scoring; bump the utilization weight from 40 to 50 to compensate for the removed 10-point load penalty

* fix: handle zero maxResults as no-cap, emit trace after trim, seed empty servers

  - When MaxResults is 0 (omitted), treat it as no explicit cap instead of defaulting to 1; only apply the +1 over-fetch probe when the caller supplies a positive limit
  - Move decision trace emission after hasMore/trim so the trace accurately reflects the returned proposals
  - Seed serverVolumeCounts from ActiveTopology so servers that have a matching disk type but zero volumes are included in the imbalance calculation and MinServerCount check

* fix: nil-guard clusterInfo, uncap legacy DetectionFunc, deterministic disk type order

  - Add an early nil guard for clusterInfo in Detection to prevent panics in downstream helpers (detectForDiskType, createBalanceTask)
  - Change the register.go DetectionFunc wrapper from maxResults=1 to 0 (no cap) so the legacy code path returns all detected tasks
  - Sort disk type keys before iteration so results are deterministic when maxResults spans multiple disk types (HDD/SSD)

* fix: don't over-fetch in stateful detection to avoid orphaned pending tasks

  Detection registers planned moves in ActiveTopology via AddPendingTask, so requesting maxResults+1 would create an extra pending task that gets discarded during trim. Use len(results) >= maxResults as the hasMore signal instead, which is correct since Detection already caps internally.
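The two hasMore conventions discussed above can be contrasted in a compact sketch. Function names and the `produce` callback are hypothetical; the point is the pagination shape, not the actual handler APIs.

```go
package main

import "fmt"

// statelessPage over-fetches by one: ask the producer for maxResults+1 items,
// then truncate. Safe only when producing an extra item has no side effects.
func statelessPage(produce func(limit int) []int, maxResults int) (results []int, hasMore bool) {
	results = produce(maxResults + 1)
	if len(results) > maxResults {
		return results[:maxResults], true
	}
	return results, false
}

// statefulPage never over-fetches: each produced item registers state (like
// AddPendingTask), so an extra probe item would be orphaned on trim. hasMore
// is approximated as "the producer filled the whole page".
func statefulPage(produce func(limit int) []int, maxResults int) (results []int, hasMore bool) {
	results = produce(maxResults)
	return results, len(results) >= maxResults
}

func main() {
	// a toy producer with 5 items available
	produce := func(limit int) []int {
		all := []int{1, 2, 3, 4, 5}
		if limit < len(all) {
			return all[:limit]
		}
		return all
	}
	r1, more1 := statelessPage(produce, 3)
	r2, more2 := statefulPage(produce, 3)
	fmt.Println(len(r1), more1, len(r2), more2)
}
```

Note the trade-off: the stateful approximation reports hasMore=true even when the page happens to end exactly at the last item, which is why a later change moved to an explicit truncated flag.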
* fix: return explicit truncated flag from Detection instead of approximating

  Detection now returns (results, truncated, error) where truncated is true only when the loop stopped because it hit maxResults, not when it ran out of work naturally. This eliminates false hasMore signals when detection happens to produce exactly maxResults results by resolving the imbalance.

* cleanup: simplify detection logic and remove redundancies

  - Remove redundant clusterInfo nil check in detectForDiskType since Detection already guards against nil clusterInfo
  - Remove the adjustments loop for destination servers not in serverVolumeCounts; topology seeding ensures all servers with a matching disk type are already present
  - Merge the two-loop min/max calculation into a single loop: min across all servers, max only among non-exhausted servers
  - Replace magic number 100 with len(metrics) for minC initialization in convergence test

* fix: accurate truncation flag, deterministic server order, indexed volume lookup

  - Track a balanced flag to distinguish "hit maxResults cap" from "cluster balanced at exactly maxResults"; truncated is only true when there's genuinely more work to do
  - Sort servers for deterministic iteration and tie-breaking when multiple servers have equal volume counts
  - Pre-index volumes by server with per-server cursors to avoid O(maxResults * volumes) rescanning on each iteration
  - Add truncation flag assertions to RespectsMaxResults test: true when capped, false when detection finishes naturally

* fix: seed trace server counts from ActiveTopology to match detection logic

  The decision trace was building serverVolumeCounts only from metrics, missing zero-volume servers seeded from ActiveTopology by Detection. This could cause the trace to report wrong server counts, incorrect imbalance ratios, or spurious "too few servers" messages. Pass activeTopology into the trace function and seed server counts the same way Detection does.
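The truncated-flag semantics can be isolated into a tiny sketch (hypothetical `drain`, standing in for the detection loop): the flag is set only when the cap stops the loop while work remains, so producing exactly maxResults by reaching balance reports false.

```go
package main

import "fmt"

// drain consumes work items until either the work runs out (the cluster is
// balanced) or maxResults items were produced. truncated is true only when
// the cap fired with work still left over, never when the work simply ended.
func drain(pending int, maxResults int) (results int, truncated bool) {
	for pending > 0 {
		if maxResults > 0 && results >= maxResults {
			return results, true // hit the cap with work left over
		}
		pending--
		results++
	}
	return results, false // ran out of work naturally
}

func main() {
	fmt.Println(drain(5, 3)) // capped: more work remains
	fmt.Println(drain(3, 3)) // exactly the cap, but balanced: not truncated
	fmt.Println(drain(2, 3)) // under the cap
}
```

The middle case is the one a naive `len(results) >= maxResults` check gets wrong: three results out of three pending items is a complete run, not a truncated one.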
* fix: don't exhaust server on per-volume planning failure, sort volumes by ID

  - When createBalanceTask returns nil, continue to the next volume on the same server instead of marking the entire server as exhausted. The failure may be volume-specific (not found in topology, pending task registration failed) and other volumes on the server may still be viable candidates.
  - Sort each server's volume slice by VolumeID after pre-indexing so volume selection is fully deterministic regardless of input order.

* fix: use require instead of assert to prevent nil dereference panic in CORS test

  The test used assert.NoError (non-fatal) for GetBucketCors, then immediately accessed getResp.CORSRules. When the API returns an error, getResp is nil, causing a panic. Switch to require.NoError/NotNil/Len so the test stops before dereferencing a nil response.

* fix: deterministic disk tie-breaking and stronger pre-existing task test

  - Sort available disks by NodeID then DiskID before scoring so destination selection is deterministic when two disks score equally
  - Add task count bounds assertion to SkipsPreExistingPendingTasks test: with 15 of 20 volumes already having pending tasks, at most 5 new tasks should be created and at least 1 (imbalance still exists)

* fix: seed adjustments from existing pending/assigned tasks to prevent over-scheduling

  Detection now calls ActiveTopology.GetTaskServerAdjustments() to initialize the adjustments map with source/destination deltas from existing pending and assigned balance tasks. This ensures effectiveCounts reflects in-flight moves, preventing the algorithm from planning additional moves in the same direction when prior moves already address the imbalance. Added GetTaskServerAdjustments(taskType) to ActiveTopology, which iterates pending and assigned tasks, decrementing source servers and incrementing destination servers for the given task type.
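The adjustment seeding in the last item can be modeled minimally. This is a hypothetical reduction: the real ActiveTopology tracks far richer task state, while here tasks are plain structs and `taskServerAdjustments` only mirrors the source/destination delta idea.

```go
package main

import "fmt"

type task struct {
	taskType string
	source   string
	dest     string
}

// taskServerAdjustments mirrors the idea behind GetTaskServerAdjustments:
// for every in-flight task of the given type, the source server is about to
// lose a volume (-1) and the destination is about to gain one (+1). Seeding
// effective counts with these deltas keeps detection from re-planning moves
// that prior tasks already cover.
func taskServerAdjustments(tasks []task, taskType string) map[string]int {
	adj := make(map[string]int)
	for _, t := range tasks {
		if t.taskType != taskType {
			continue
		}
		adj[t.source]--
		adj[t.dest]++
	}
	return adj
}

func main() {
	inFlight := []task{
		{taskType: "balance", source: "srv1", dest: "srv2"},
		{taskType: "balance", source: "srv1", dest: "srv3"},
		{taskType: "vacuum", source: "srv2", dest: "srv2"}, // ignored: wrong type
	}
	fmt.Println(taskServerAdjustments(inFlight, "balance"))
}
```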
CORS Integration Tests for SeaweedFS S3 API
This directory contains comprehensive integration tests for the CORS (Cross-Origin Resource Sharing) functionality in SeaweedFS S3 API.
Overview
The CORS integration tests validate the complete CORS implementation including:
- CORS configuration management (PUT/GET/DELETE)
- CORS rule validation
- CORS middleware behavior
- Caching functionality
- Error handling
- Real-world CORS scenarios
Prerequisites
- Go 1.19+: For building SeaweedFS and running tests
- Network Access: Tests use `localhost:8333` by default
- System Dependencies: `curl` and `netstat` for health checks
Quick Start
The tests now automatically start their own SeaweedFS server, so you don't need to manually start one.
1. Run All Tests with Managed Server
```bash
# Run all tests with automatic server management
make test-with-server

# Run core CORS tests only
make test-cors-quick

# Run comprehensive CORS tests
make test-cors-comprehensive
```
2. Manual Server Management
If you prefer to manage the server manually:
```bash
# Start server
make start-server

# Run tests (assuming server is running)
make test-cors-simple

# Stop server
make stop-server
```
3. Individual Test Categories
```bash
# Run specific test types
make test-basic-cors      # Basic CORS configuration
make test-preflight-cors  # Preflight OPTIONS requests
make test-actual-cors     # Actual CORS request handling
make test-origin-matching # Origin matching logic
make test-header-matching # Header matching logic
make test-method-matching # Method matching logic
make test-multiple-rules  # Multiple CORS rules
make test-validation      # CORS validation
make test-caching         # CORS caching behavior
make test-error-handling  # Error handling
```
Test Server Management
The tests use a comprehensive server management system similar to other SeaweedFS integration tests:
Server Configuration
- S3 Port: 8333 (configurable via `S3_PORT`)
- Master Port: 9333
- Volume Port: 8080
- Filer Port: 8888
- Metrics Port: 9324
- Data Directory: `./test-volume-data` (auto-created)
- Log File: `weed-test.log`
Server Lifecycle
- Build: Automatically builds `../../../weed/weed_binary`
- Start: Launches SeaweedFS with S3 API enabled
- Health Check: Waits up to 90 seconds for server to be ready
- Test: Runs the requested tests
- Stop: Gracefully shuts down the server
- Cleanup: Removes temporary files and data
Available Commands
```bash
# Server management
make start-server            # Start SeaweedFS server
make stop-server             # Stop SeaweedFS server
make health-check            # Check server health
make logs                    # View server logs

# Test execution
make test-with-server        # Full test cycle with server management
make test-cors-simple        # Run tests without server management
make test-cors-quick         # Run core tests only
make test-cors-comprehensive # Run all tests

# Development
make dev-start               # Start server for development
make dev-test                # Run development tests
make build-weed              # Build SeaweedFS binary
make check-deps              # Check dependencies

# Maintenance
make clean                   # Clean up all artifacts
make coverage                # Generate coverage report
make fmt                     # Format code
make lint                    # Run linter
```
Test Configuration
Default Configuration
The tests use these default settings (configurable via environment variables):
```bash
WEED_BINARY=../../../weed/weed_binary
S3_PORT=8333
TEST_TIMEOUT=10m
TEST_PATTERN=TestCORS
```
Configuration File
The `test_config.json` file contains the S3 client configuration:
```json
{
  "endpoint": "http://localhost:8333",
  "access_key": "some_access_key1",
  "secret_key": "some_secret_key1",
  "region": "us-east-1",
  "bucket_prefix": "test-cors-",
  "use_ssl": false,
  "skip_verify_ssl": true
}
```
Troubleshooting
Compilation Issues
If you encounter compilation errors, the most common issues are:
- AWS SDK v2 Type Mismatches: The `MaxAgeSeconds` field in `types.CORSRule` expects `int32`, not `*int32`. Use direct values like `3600` instead of `aws.Int32(3600)`.
- Field Name Issues: The `GetBucketCorsOutput` type has a `CORSRules` field directly, not a `CORSConfiguration` field.
Example fix:
```go
// ❌ Incorrect
MaxAgeSeconds: aws.Int32(3600),
assert.Len(t, getResp.CORSConfiguration.CORSRules, 1)

// ✅ Correct
MaxAgeSeconds: 3600,
assert.Len(t, getResp.CORSRules, 1)
```
Server Issues
- Server Won't Start

  ```bash
  # Check for port conflicts
  netstat -tlnp | grep 8333

  # View server logs
  make logs

  # Force cleanup
  make clean
  ```

- Test Failures

  ```bash
  # Run with server management
  make test-with-server

  # Run specific test
  make test-basic-cors

  # Check server health
  make health-check
  ```

- Connection Issues

  ```bash
  # Verify server is running
  curl -s http://localhost:8333

  # Check server logs
  tail -f weed-test.log
  ```
Performance Issues
If tests are slow or timing out:
```bash
# Increase timeout
export TEST_TIMEOUT=30m
make test-with-server

# Run quick tests only
make test-cors-quick

# Check server resources
make debug-status
```
Test Coverage
Core Functionality Tests
1. CORS Configuration Management (TestCORSConfigurationManagement)
- PUT CORS configuration
- GET CORS configuration
- DELETE CORS configuration
- Configuration updates
- Error handling for non-existent configurations
2. Multiple CORS Rules (TestCORSMultipleRules)
- Multiple rules in single configuration
- Rule precedence and ordering
- Complex rule combinations
3. CORS Validation (TestCORSValidation)
- Invalid HTTP methods
- Empty origins validation
- Negative MaxAge validation
- Rule limit validation
4. Wildcard Support (TestCORSWithWildcards)
- Wildcard origins (`*`, `https://*.example.com`)
- Wildcard headers (`*`)
- Wildcard expose headers
5. Rule Limits (TestCORSRuleLimit)
- Maximum 100 rules per configuration
- Rule limit enforcement
- Large configuration handling
6. Error Handling (TestCORSErrorHandling)
- Non-existent bucket operations
- Invalid configurations
- Malformed requests
HTTP-Level Tests
1. Preflight Requests (TestCORSPreflightRequest)
- OPTIONS request handling
- CORS headers in preflight responses
- Access-Control-Request-Method validation
- Access-Control-Request-Headers validation
2. Actual Requests (TestCORSActualRequest)
- CORS headers in actual responses
- Origin validation for real requests
- Proper expose headers handling
3. Origin Matching (TestCORSOriginMatching)
- Exact origin matching
- Wildcard origin matching (`*`)
- Subdomain wildcard matching (`https://*.example.com`)
- Non-matching origins (should be rejected)
4. Header Matching (TestCORSHeaderMatching)
- Wildcard header matching (`*`)
- Specific header matching
- Case-insensitive matching
- Disallowed headers
5. Method Matching (TestCORSMethodMatching)
- Allowed methods verification
- Disallowed methods rejection
- Method-specific CORS behavior
6. Multiple Rules (TestCORSMultipleRulesMatching)
- Rule precedence and selection
- Multiple rules with different configurations
- Complex rule interactions
Integration Tests
1. Caching (TestCORSCaching)
- CORS configuration caching
- Cache invalidation
- Cache performance
2. Object Operations (TestCORSObjectOperations)
- CORS with actual S3 operations
- PUT/GET/DELETE objects with CORS
- CORS headers in object responses
3. Without Configuration (TestCORSWithoutConfiguration)
- Behavior when no CORS configuration exists
- Default CORS behavior
- Graceful degradation
Development
Running Tests During Development
```bash
# Start server for development
make dev-start

# Run quick test
make dev-test

# View logs in real-time
make logs
```
Adding New Tests
- Follow the existing naming convention (`TestCORSXxxYyy`)
- Use the helper functions (`getS3Client`, `createTestBucket`, etc.)
- Add cleanup with `defer cleanupTestBucket(t, client, bucketName)`
- Include proper error checking with `require.NoError(t, err)`
- Use assertions with `assert.Equal(t, expected, actual)`
- Add the test to the appropriate Makefile target
Code Quality
```bash
# Format code
make fmt

# Run linter
make lint

# Generate coverage report
make coverage
```
Performance Notes
- Tests create and destroy buckets for each test case
- Large configuration tests may take several minutes
- Server startup typically takes 15-30 seconds
- Tests run in parallel where possible for efficiency
Integration with SeaweedFS
These tests validate the CORS implementation in:
- `weed/s3api/cors/` - Core CORS package
- `weed/s3api/s3api_bucket_cors_handlers.go` - HTTP handlers
- `weed/s3api/s3api_server.go` - Router integration
- `weed/s3api/s3api_bucket_config.go` - Configuration management
The tests ensure AWS S3 API compatibility and proper CORS behavior across all supported scenarios.