s3: Add SOSAPI support for Veeam integration (#7899)

* s3api: Add SOSAPI core implementation and tests

Implement Smart Object Storage API (SOSAPI) support for Veeam integration.

- Add s3api_sosapi.go with XML structures and handlers for system.xml and capacity.xml
- Implement virtual object detection and dynamic XML generation
- Add capacity retrieval via gRPC (to be optimized in follow-up)
- Include comprehensive unit tests covering detection, XML generation, and edge cases

This enables Veeam Backup & Replication to discover SeaweedFS capabilities and capacity.
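The virtual object detection described above can be sketched roughly as follows. This is a minimal illustration, not the code in s3api_sosapi.go: the constant and function names are hypothetical, and it assumes any folder starting with `.system-` qualifies, whereas the real implementation may match one fixed folder name.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical constants: the commit message only describes the paths as
// ".system-*/system.xml" and ".system-*/capacity.xml".
const (
	systemFolderPrefix = ".system-"
	systemXMLName      = "system.xml"
	capacityXMLName    = "capacity.xml"
)

// isVirtualSOSAPIObject reports whether an object key names one of the two
// virtual SOSAPI documents located directly under a ".system-*" folder.
func isVirtualSOSAPIObject(key string) bool {
	key = strings.TrimPrefix(key, "/")
	folder, name, found := strings.Cut(key, "/")
	if !found || !strings.HasPrefix(folder, systemFolderPrefix) {
		return false
	}
	if len(folder) == len(systemFolderPrefix) {
		return false // ".system-" with no suffix is not a valid folder here
	}
	return name == systemXMLName || name == capacityXMLName
}

func main() {
	keys := []string{
		".system-1234/system.xml",
		".system-1234/capacity.xml",
		".system-1234/nested/system.xml",
		"data/system.xml",
	}
	for _, key := range keys {
		fmt.Printf("%-32s %v\n", key, isVirtualSOSAPIObject(key))
	}
}
```

Because the check runs on the key alone, a handler can short-circuit before any storage lookup is attempted.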

* s3api: Integrate SOSAPI handlers into GetObject and HeadObject

Add early interception for SOSAPI virtual objects in GetObjectHandler and HeadObjectHandler.

- Check for SOSAPI objects (.system-*/system.xml, .system-*/capacity.xml) before normal processing
- Delegate to handleSOSAPIGetObject and handleSOSAPIHeadObject when detected
- Ensures virtual objects are served without hitting the storage layer

* s3api: Allow anonymous access to SOSAPI virtual objects

Enable discovery of SOSAPI capabilities without requiring credentials.

- Modify AuthWithPublicRead to bypass auth for SOSAPI objects if bucket exists
- Supports Veeam's initial discovery phase before full IAM setup
- Validates bucket existence to prevent information disclosure

* s3api: Fix SOSAPI capacity retrieval to use proper master connection

Fix gRPC error by connecting directly to master servers instead of through filer.

- Use pb.WithOneOfGrpcMasterClients with s3a.option.Masters
- Matches pattern used in bucket_size_metrics.go
- Resolves "unknown service master_pb.Seaweed" error
- Gracefully handles missing master configuration

* Merge origin/master and implement robust SOSAPI capacity logic

- Resolved merge conflict in s3api_sosapi.go
- Replaced global Statistics RPC with VolumeList (topology) for accurate bucket-specific 'Used' calculation
- Added bucket quota support (report quota as Capacity if set)
- Implemented cluster-wide capacity calculation from topology when no quota
- Added unit tests for topology capacity and usage calculations

* s3api: Remove anonymous access to SOSAPI virtual objects

Reverts the implicit public access for system.xml and capacity.xml.
Requests to these objects now require standard S3 authentication,
unless the bucket has a public-read policy.

* s3api: Refactor SOSAPI handlers to use http.ServeContent

- Consolidate handleSOSAPIGetObject and handleSOSAPIHeadObject into serveSOSAPI
- Use http.ServeContent for standard Range, HEAD, and ETag handling
- Remove manual range request handler and reduce code duplication

* s3api: Unify SOSAPI request handling

- Replaced handleSOSAPIGetObject and handleSOSAPIHeadObject with a single HandleSOSAPI function
- Updated call sites in s3api_object_handlers.go
- Simplifies logic and ensures consistent handling for both GET and HEAD requests via http.ServeContent

* s3api: Restore distinct SOSAPI GET/HEAD handlers

- Reverted the unified handler to enforce distinct behavior for GET and HEAD
- GET: Supports Range requests via http.ServeContent
- HEAD: Explicitly ignores Range requests (matches MinIO behavior) and writes headers only

* s3api: Refactor SOSAPI handlers to eliminate duplication

- Extracted shared content generation logic into generateSOSAPIContent helper
- handleSOSAPIGetObject: Uses http.ServeContent (supports Range requests)
- handleSOSAPIHeadObject: Manually sets headers (no Range, no body)
- Maintains distinct behavior while following the DRY principle

* s3api: Remove low-value SOSAPI tests

Removed tests that validate standard library behavior or trivial constant checks:
- TestSOSAPIConstants (string prefix/suffix checks)
- TestSystemInfoXMLRootElement (redundant with TestGenerateSystemXML)
- TestSOSAPIXMLContentType (tests httptest, not our code)
- TestHTTPTimeFormat (tests standard library)
- TestCapacityInfoXMLStruct (tests Go's XML marshaling)

Kept tests that validate actual business logic and edge cases.

* s3api: Use consistent S3-compliant error responses in SOSAPI

Replaced http.Error() with s3err.WriteErrorResponse() for internal errors
to ensure all SOSAPI errors return S3-compliant XML instead of plain text.

* s3api: Return error when no masters configured for SOSAPI capacity

Changed getCapacityInfo to return an error instead of silently returning
zero capacity when no master servers are configured. This helps surface
configuration issues rather than masking them.

* s3api: Use collection name with FilerGroup prefix for SOSAPI capacity

Fixed collectBucketUsageFromTopology to use s3a.getCollectionName(bucket)
instead of raw bucket name. This ensures collection comparisons match actual
volume collection names when FilerGroup prefix is configured.
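To illustrate why the raw bucket name can fail to match, here is a small sketch of a prefixing helper. The body of `s3a.getCollectionName` is not shown in this commit, so the `collectionName` function and the underscore separator below are assumptions made purely for illustration.

```go
package main

import "fmt"

// collectionName is a hypothetical stand-in for s3a.getCollectionName.
// It assumes the filer group, when set, is joined to the bucket name with
// an underscore.
func collectionName(filerGroup, bucket string) string {
	if filerGroup == "" {
		return bucket
	}
	return filerGroup + "_" + bucket
}

func main() {
	// With a filer group configured, volumes carry the prefixed name, so
	// comparing against the bare bucket name would never match.
	fmt.Println(collectionName("", "backups"))
	fmt.Println(collectionName("groupA", "backups"))
}
```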

* s3api: Apply PR review feedback for SOSAPI

- Renamed `bucket` parameter to `collectionName` in collectBucketUsageFromTopology for clarity
- Changed error checks from `==` to `errors.Is()` for better wrapped error handling
- Added `errors` import

* s3api: Avoid variable shadowing in SOSAPI capacity retrieval

Refactored `getCapacityInfo` to use distinct variable names for errors
to improve code clarity and avoid unintentional shadowing of the
return parameter.
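The shadowing hazard this change avoids can be reproduced in a few lines. The functions below are hypothetical and unrelated to the real `getCapacityInfo` body; they only show how `:=` inside a block hides a named return value, and how a distinct name sidesteps the problem.

```go
package main

import (
	"errors"
	"fmt"
)

func step() error { return errors.New("step failed") }

// shadowed declares a second err with := inside the block, so the named
// result is never assigned and the caller sees nil.
func shadowed(run bool) (err error) {
	if run {
		err := step() // new variable scoped to this block
		_ = err
	}
	return err
}

// distinct uses a separate name for the inner error and assigns the named
// result explicitly, so the failure reaches the caller.
func distinct(run bool) (err error) {
	if run {
		stepErr := step()
		if stepErr != nil {
			err = fmt.Errorf("capacity: %w", stepErr)
		}
	}
	return err
}

func main() {
	fmt.Println("shadowed:", shadowed(true))
	fmt.Println("distinct:", distinct(true))
}
```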
This commit is contained in:
Chris Lu
2025-12-28 14:07:58 -08:00
committed by GitHub
parent e8baeb3616
commit 2b529e310d
3 changed files with 225 additions and 271 deletions

@@ -5,6 +5,8 @@ import (
 	"net/http/httptest"
 	"strings"
 	"testing"
+	"github.com/seaweedfs/seaweedfs/weed/pb/master_pb"
 )
 func TestIsSOSAPIObject(t *testing.T) {
@@ -134,91 +136,7 @@ func TestGenerateSystemXML(t *testing.T) {
 	}
 }
-func TestCapacityInfoXMLStruct(t *testing.T) {
-	// Test that CapacityInfo can be marshaled correctly
-	ci := CapacityInfo{
-		Capacity:  1000000,
-		Available: 800000,
-		Used:      200000,
-	}
-	xmlData, err := xml.Marshal(&ci)
-	if err != nil {
-		t.Fatalf("xml.Marshal failed: %v", err)
-	}
-	// Verify roundtrip
-	var parsed CapacityInfo
-	if err := xml.Unmarshal(xmlData, &parsed); err != nil {
-		t.Fatalf("xml.Unmarshal failed: %v", err)
-	}
-	if parsed.Capacity != ci.Capacity {
-		t.Errorf("Capacity = %d, want %d", parsed.Capacity, ci.Capacity)
-	}
-	if parsed.Available != ci.Available {
-		t.Errorf("Available = %d, want %d", parsed.Available, ci.Available)
-	}
-	if parsed.Used != ci.Used {
-		t.Errorf("Used = %d, want %d", parsed.Used, ci.Used)
-	}
-}
-func TestSOSAPIConstants(t *testing.T) {
-	// Verify constants are correctly set
-	if !strings.HasPrefix(sosAPISystemXML, sosAPISystemFolder) {
-		t.Errorf("sosAPISystemXML should start with sosAPISystemFolder")
-	}
-	if !strings.HasPrefix(sosAPICapacityXML, sosAPISystemFolder) {
-		t.Errorf("sosAPICapacityXML should start with sosAPISystemFolder")
-	}
-	if !strings.HasSuffix(sosAPISystemXML, "system.xml") {
-		t.Errorf("sosAPISystemXML should end with 'system.xml'")
-	}
-	if !strings.HasSuffix(sosAPICapacityXML, "capacity.xml") {
-		t.Errorf("sosAPICapacityXML should end with 'capacity.xml'")
-	}
-	// Protocol version should be quoted per SOSAPI spec
-	if !strings.HasPrefix(sosAPIProtocolVersion, "\"") || !strings.HasSuffix(sosAPIProtocolVersion, "\"") {
-		t.Errorf("sosAPIProtocolVersion should be quoted, got: %s", sosAPIProtocolVersion)
-	}
-}
-func TestSystemInfoXMLRootElement(t *testing.T) {
-	xmlData, err := generateSystemXML()
-	if err != nil {
-		t.Fatalf("generateSystemXML() failed: %v", err)
-	}
-	xmlStr := string(xmlData)
-	// Verify root element name
-	if !strings.Contains(xmlStr, "<SystemInfo>") {
-		t.Error("XML should contain <SystemInfo> root element")
-	}
-	// Verify required elements
-	requiredElements := []string{
-		"<ProtocolVersion>",
-		"<ModelName>",
-		"<ProtocolCapabilities>",
-		"<CapacityInfo>",
-	}
-	for _, elem := range requiredElements {
-		if !strings.Contains(xmlStr, elem) {
-			t.Errorf("XML should contain %s element", elem)
-		}
-	}
-}
-// TestSOSAPIHandlerIntegration tests the basic handler flow without a full server
 func TestSOSAPIObjectDetectionEdgeCases(t *testing.T) {
-	// Test various edge cases for object detection
 	edgeCases := []struct {
 		object   string
 		expected bool
@@ -244,32 +162,87 @@ func TestSOSAPIObjectDetectionEdgeCases(t *testing.T) {
 	}
 }
-// TestSOSAPIHandlerReturnsXMLContentType verifies content-type setting logic
-func TestSOSAPIXMLContentType(t *testing.T) {
-	// Create a mock response writer to check headers
-	w := httptest.NewRecorder()
+func TestCollectBucketUsageFromTopology(t *testing.T) {
+	topo := &master_pb.TopologyInfo{
+		DataCenterInfos: []*master_pb.DataCenterInfo{
+			{
+				RackInfos: []*master_pb.RackInfo{
+					{
+						DataNodeInfos: []*master_pb.DataNodeInfo{
+							{
+								DiskInfos: map[string]*master_pb.DiskInfo{
+									"hdd": {
+										VolumeInfos: []*master_pb.VolumeInformationMessage{
+											{Id: 1, Size: 100, Collection: "bucket1"},
+											{Id: 2, Size: 200, Collection: "bucket2"},
+											{Id: 3, Size: 300, Collection: "bucket1"},
+											{Id: 1, Size: 100, Collection: "bucket1"}, // Duplicate (replica), should be ignored
+										},
+									},
+								},
+							},
+						},
+					},
+				},
+			},
+		},
+	}
-	// Simulate what the handler should set
-	w.Header().Set("Content-Type", "application/xml")
+	usage := collectBucketUsageFromTopology(topo, "bucket1")
+	expected := int64(400) // 100 + 300
+	if usage != expected {
+		t.Errorf("collectBucketUsageFromTopology = %d, want %d", usage, expected)
+	}
-	contentType := w.Header().Get("Content-Type")
-	if contentType != "application/xml" {
-		t.Errorf("Content-Type = %q, want 'application/xml'", contentType)
+	usage2 := collectBucketUsageFromTopology(topo, "bucket2")
+	expected2 := int64(200)
+	if usage2 != expected2 {
+		t.Errorf("collectBucketUsageFromTopology = %d, want %d", usage2, expected2)
 	}
 }
-func TestHTTPTimeFormat(t *testing.T) {
-	// Verify the Last-Modified header format is correct for HTTP
-	w := httptest.NewRecorder()
-	w.Header().Set("Last-Modified", "Sat, 28 Dec 2024 20:00:00 GMT")
-	lastMod := w.Header().Get("Last-Modified")
-	if lastMod == "" {
-		t.Error("Last-Modified header should be set")
+func TestCalculateClusterCapacity(t *testing.T) {
+	topo := &master_pb.TopologyInfo{
+		DataCenterInfos: []*master_pb.DataCenterInfo{
+			{
+				RackInfos: []*master_pb.RackInfo{
+					{
+						DataNodeInfos: []*master_pb.DataNodeInfo{
+							{
+								DiskInfos: map[string]*master_pb.DiskInfo{
+									"hdd": {
+										MaxVolumeCount:  100,
+										FreeVolumeCount: 40,
+									},
+								},
+							},
+							{
+								DiskInfos: map[string]*master_pb.DiskInfo{
+									"hdd": {
+										MaxVolumeCount:  200,
+										FreeVolumeCount: 160,
+									},
+								},
+							},
+						},
+					},
+				},
+			},
+		},
 	}
-	// HTTP date should contain day of week
-	if !strings.Contains(lastMod, "Dec") {
-		t.Errorf("Last-Modified should contain month, got: %s", lastMod)
+	volumeSizeLimitMb := uint64(1000) // 1GB
+	volumeSizeBytes := int64(1000) * 1024 * 1024
+	total, available := calculateClusterCapacity(topo, volumeSizeLimitMb)
+	expectedTotal := int64(300) * volumeSizeBytes
+	expectedAvailable := int64(200) * volumeSizeBytes
+	if total != expectedTotal {
+		t.Errorf("calculateClusterCapacity total = %d, want %d", total, expectedTotal)
+	}
+	if available != expectedAvailable {
+		t.Errorf("calculateClusterCapacity available = %d, want %d", available, expectedAvailable)
 	}
 }