Commit Graph

12536 Commits

Author SHA1 Message Date
LeeXN
c909724bf1 Fix: prevent panic when swap file creation fails (#7957)
* Fix: prevent panic when swap file creation fails

* weed mount: fix race condition in swap file initialization

Ensure thread-safe access to sf.file and other state in NewSwapFileChunk
and FreeResource by using sf.chunkTrackingLock consistently. Also
set sf.file to nil after closing to prevent reuse.

* weed mount: improve swap directory creation logic

- Check error for os.MkdirAll and log it if it fails.
- Use 0700 permissions for the swap directory for better security.
- Improve error logging context.

* weed mount: add unit tests for swap file creation

Add tests to verify:
- Concurrent initialization of the swap file.
- Correct directory permissions (0700).
- Automatic directory recreation if deleted.

* weed mount: fix thread-safety in swap file unit tests

Use atomic.Uint32 to track failures within goroutines in
TestSwapFile_NewSwapFileChunk_Concurrent to avoid unsafe calls to
t.Errorf from multiple goroutines.

* weed mount: simplify swap file creation logic

Refactor the directory check and retry logic for better readability and
to avoid re-using the main error variable for directory creation errors.
Remove redundant error logging.

* weed mount: improve error checking in swap file tests

Explicitly check if NewSwapFileChunk returns nil to provide more
informative failures.

* weed mount: update DirtyPages interface to return error

Propagate errors from SaveDataAt when swap file creation fails. This
prevents potential panics in the write path.

* weed mount: handle AddPage errors in write paths

Update ChunkedDirtyPages and PageWriter to propagate errors and update
WFS.Write and WFS.CopyFileRange to return fuse.EIO on failure.

* weed mount: update swap directory creation error message

Change "recreate" to "create/recreate" to better reflect that this path
is also taken during the initial creation of the swap directory.

---------

Co-authored-by: lixiang58 <lixiang58@lenovo.com>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
2026-01-04 01:12:37 -08:00
Chris Lu
0b5a65e00b avoid extra missing shard warning
fix https://github.com/seaweedfs/seaweedfs/issues/7956
2026-01-04 00:38:53 -08:00
Chris Lu
63b2fe0d76 fix: EC UI template error when viewing shard details (#7955)
* fix: EC UI template error when viewing shard details

Fixed field name mismatch in volume.html where it was using .ShardDetails
instead of .Shards. Added a robust type conversion wrapper in templates.go
to handle int64 to uint64 conversion for bytesToHumanReadable.
Added regression test to ensure future stability.

* refactor: improve bytesToHumanReadable and test robustness

- Handled more integer types (uint32, int32, uint) in bytesToHumanReadable.
- Improved volume_test.go to verify both shards are formatted correctly.

* refactor: add bounds checking to bytesToHumanReadable

Added checks for negative values in signed integer types to avoid incorrect
formatting when converting to uint64.
Addressed feedback from coderabbitai.
2026-01-03 22:45:48 -08:00
Chris Lu
0647bc24d5 s3api: fix authentication bypass and potential SIGSEGV (Issue #7912) (#7954)
* s3api: fix authentication bypass and potential SIGSEGV

* s3api: improve security tests with positive cases and nil identity guards

* s3api: fix secondary authentication bypass in AuthSignatureOnly

* s3api: refactor account loading and refine security tests based on review feedback

* s3api: refine security tests with realistic signature failures

* Update weed/s3api/auth_security_test.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-01-03 22:08:34 -08:00
Chris Lu
54de32f207 Support AWS standard IAM role ARN formats (issue #7946) (#7948)
* fix(iam): support both AWS standard and legacy IAM role ARN formats

Fix issue #7946 where SeaweedFS only recognized legacy IAM role ARN format
(arn:aws:iam::role/RoleName) but not the standard AWS format with account ID
(arn:aws:iam::ACCOUNT:role/RoleName). This was breaking EKS pod identity
integration which expects the standard format.

Changes:
- Update ExtractRoleNameFromArn() to handle both formats by searching for
  'role/' marker instead of matching a fixed prefix
- Update ExtractRoleNameFromPrincipal() to clearly document both STS and IAM
  formats it supports with or without account ID
- Simplify role ARN validation in validateRoleAssumptionForWebIdentity() and
  validateRoleAssumptionForCredentials() to use the extraction function
- Add comprehensive test coverage with 25 test cases covering both formats

The fix maintains backward compatibility with legacy format while adding
support for standard AWS format with account ID.

Fixes: https://github.com/seaweedfs/seaweedfs/issues/7946

* docs: improve docstring coverage for ARN utility functions

- Add comprehensive package-level documentation
- Enhance ExtractRoleNameFromPrincipal docstring with parameter and return descriptions
- Enhance ExtractRoleNameFromArn docstring with detailed format documentation
- Add docstrings to test functions explaining test coverage
- Update all docstrings to 80%+ coverage for code review compliance

* refactor: improve ARN parsing code maintainability and error messages

- Define constants for ARN prefixes and markers (stsPrefix, stsAssumedRoleMarker, iamPrefix, iamRoleMarker)
- Replace hardcoded magic strings with named constants in ExtractRoleNameFromPrincipal and ExtractRoleNameFromArn
- Enhance error messages in sts_service.go to show expected ARN format when validation fails
- Error message now shows: 'arn:aws:iam::[ACCOUNT_ID:]role/ROLE_NAME' format
- Improves code readability and maintainability
- Facilitates future ARN format changes and debugging

* feat: add structured ARN type for better debugging and extensibility

Implements Option 2 (Structured ARN Type) from ARN handling comparison:

New Features:
- ARNInfo struct with Original, RoleName, AccountID, and Format fields
- ARNFormat enum (Legacy, Standard, Invalid) for type-safe format tracking
- ParseRoleARN() function for structured IAM role ARN parsing
- ParsePrincipalARN() function for structured STS/IAM principal parsing

Benefits:
- Better debugging: Can see original ARN, extracted components, and format type
- Extensible: Easy to add more fields (Region, Service, etc.) in future
- Type-safe: Format is an enum, not a string
- Backward compatible: Kept original string-based functions

STS Service Updates:
- Uses ParseRoleARN() for structured validation
- Logs ARN components at V(4) level for debugging (role, account, format)
- Better error context when validation fails

Test Coverage:
- 7 new tests for ParseRoleARN (legacy, standard, invalid formats)
- 7 new tests for ParsePrincipalARN (STS/IAM, legacy/standard)
- All 39 existing tests still pass
- Total: 53 ARN-related tests

Comparison with MinIO:
- More flexible: Supports both AWS formats (MinIO only supports MinIO format)
- Better tested: 53 tests vs MinIO's 8 tests
- Structured like MinIO but more practical for AWS use cases

* security: fix ARN parsing to prevent malicious ARN acceptance

Fix critical security vulnerability where malicious ARNs could bypass validation:
- ARNs like 'arn:aws:iam::123456789012:user/role/malicious' were incorrectly accepted
- The previous implementation used strings.Index to find 'role/' anywhere in the ARN
- This allowed non-role resource types to be accepted if they contained 'role/' in their path

Changes:
1. Updated ExtractRoleNameFromArn() to validate resource type is exactly 'role/'
2. Updated ExtractRoleNameFromPrincipal() to validate resource type is exactly 'assumed-role/'
3. Updated ParseRoleARN() to validate structure before extracting fields
4. Updated ParsePrincipalARN() to validate structure before extracting fields
5. Added 6 security test cases to prevent regression

The fix validates ARN structure by:
- Splitting on ':' to separate account ID from resource type
- Verifying resource type starts with exact marker ('role/' or 'assumed-role/')
- Only then extracting role name, account ID, and format

All 59 tests pass, including new security tests that verify malicious ARNs are rejected.

Fixes: GitHub Copilot review #3624499048

* test: add test cases for empty role names and improve validation

Address review feedback to improve edge case coverage:

1. Added test case for standard format with empty role name
   - TestExtractRoleNameFromArn: arn:aws:iam::123456789012:role/
   - TestParseRoleARN: arn:aws:iam::123456789012:role/

2. Added empty role name validation for STS ARNs in ParsePrincipalARN
   - Now matches ParseRoleARN behavior
   - Prevents ARNs like arn:aws:sts::assumed-role/ from having valid Format

3. Added test cases for empty STS role names
   - TestParsePrincipalARN: arn:aws:sts::assumed-role/
   - TestParsePrincipalARN: arn:aws:sts::123456789012:assumed-role/

All 65 tests pass (15 for ExtractRoleNameFromArn, 10 for ExtractRoleNameFromPrincipal,
8 for ParseRoleARN, 9 for ParsePrincipalARN, 4 security user ARNs, 2 security STS,
plus existing tests).

* refactor: simplify ARNInfo by removing Format enum

Remove ARNFormat enum (ARNFormatLegacy, ARNFormatStandard, ARNFormatInvalid)
as it's not needed for backward compatibility. Simplifications:

1. Removed ARNFormat type and all format constants
2. Removed Format field from ARNInfo struct
3. Validation now checks if RoleName is empty (simpler and clearer)
4. AccountID presence already distinguishes legacy (empty) from standard (non-empty) formats
5. Updated STS service to check RoleName emptiness instead of Format field
6. Improved debug logging to explicitly show "(legacy format)" or "(standard format)"

Benefits:
- Simpler code with fewer concepts
- AccountID field already provides format information
- Validation is clearer: empty RoleName = invalid ARN
- All 65 tests still pass

This change maintains the same functionality while reducing code complexity.
No backward compatibility concerns as the structured ARN parsing is new.

* test: add comprehensive edge case tests for ARN parsing

Add 4 new test functions covering:
- Multiple role markers in paths (e.g., role/role/name)
- Consecutive slashes in role paths (preserved as valid components)
- Special characters valid in AWS role names (+=,.@-_)
- Extremely long role names near AWS limits

These tests verify the parser's resilience to edge cases and ensure
proper handling of various valid role name formats and special characters.
2026-01-03 19:00:04 -08:00
Taylor Jasko
6a9860098f fix: correcting S3 nil cipher dereference in filer init (#7952)
Resolves the following error reported in #7949:
```
I0103 21:38:30.230662 s3.go:275 Starting S3 API Server with standard IAM
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x38ca961]

goroutine 102 [running]:
github.com/seaweedfs/seaweedfs/weed/command.(*S3Options).startS3Server(0x7caf840)
    /go/src/github.com/seaweedfs/seaweedfs/weed/command/s3.go:295 +0x741
github.com/seaweedfs/seaweedfs/weed/command.runFiler.func1(...)
    /go/src/github.com/seaweedfs/seaweedfs/weed/command/filer.go:244
created by github.com/seaweedfs/seaweedfs/weed/command.runFiler in goroutine 1
    /go/src/github.com/seaweedfs/seaweedfs/weed/command/filer.go:242 +0x353
```
2026-01-03 18:53:38 -08:00
Chris Lu
25975bacfb fix(gcs): resolve credential conflict and improve backup logging (#7951)
* fix(gcs): resolve credential conflict and improve backup logging

- Workaround GCS SDK's "multiple credential options" error by manually constructing an authenticated HTTP client.
- Include source entry path in filer backup error logs for better visibility on missing volumes/404s.

* fix: address PR review feedback

- Add nil check for EventNotification in getSourceKey
- Avoid reassigning google_application_credentials parameter in gcs_sink.go

* fix(gcs): return errors instead of calling glog.Fatalf in initialize

Adheres to Go best practices and allows for more graceful failure handling by callers.

* read from bind ip
2026-01-03 14:41:25 -08:00
Chris Lu
23fc3f2621 Fix AWS SDK Signature V4 with STS credentials (issue #7941) (#7944)
* Add documentation for issue #7941 fix

* ensure auth

* rm FIX_ISSUE_7941.md

* Integrate STS session token validation into V4 signature verification

- Check for X-Amz-Security-Token header in verifyV4Signature
- Call validateSTSSessionToken for STS requests
- Skip regular access key lookup and expiration check for STS sessions

* Fix variable scoping in verifyV4Signature for STS session token validation

* Add ErrExpiredToken error for better AWS S3 compatibility with STS session tokens

* Support STS session token in query parameters for presigned URLs

* Fix nil pointer dereference in validateSTSSessionToken

* Enhance STS token validation with detailed error diagnostics and logging

* Fix missing credentials in STSSessionClaims.ToSessionInfo()

* test: Add comprehensive STS session claims validation tests

- TestSTSSessionClaimsToSessionInfo: Validates basic claims conversion
- TestSTSSessionClaimsToSessionInfoCredentialGeneration: Verifies credential generation
- TestSTSSessionClaimsToSessionInfoPreservesAllFields: Ensures all fields are preserved
- TestSTSSessionClaimsToSessionInfoEmptyFields: Tests handling of empty/nil fields
- TestSTSSessionClaimsToSessionInfoCredentialExpiration: Validates expiration handling

All tests pass with proper timing tolerance for credential generation.

* perf: Reuse CredentialGenerator instance for STS session claims

Optimize ToSessionInfo() to reuse a package-level defaultCredentialGenerator
instead of allocating a new CredentialGenerator on every call. This reduces
allocation overhead since this method is called frequently during signature
verification (potentially once per request).

The CredentialGenerator is stateless and deterministic, making it safe to
reuse across concurrent calls without synchronization.

* refactor: Surface credential generation errors and remove sensitive logging

Two improvements to error handling and security:

1. weed/iam/sts/session_claims.go:
   - Add logging for credential generation failures in ToSessionInfo()
   - Wrap errors with context (session ID) to aid debugging
   - Use glog.Warningf() to surface errors instead of silently swallowing them
   - Add fmt import for error wrapping

2. weed/s3api/auth_signature_v4.go:
   - Remove debug logging of actual access key IDs (glog.V(2) call)
   - Security improvement: avoid exposing sensitive access keys even at debug level
   - Keep warning-level logging that shows only count of available keys

This ensures credential generation failures are observable while protecting
sensitive authentication material from logs.

* test: Verify deterministic credential generation in session claims tests

Update TestSTSSessionClaimsToSessionInfoCredentialGeneration to properly verify
deterministic credential generation:

- Remove misleading comment about 'randomness' - parts of credentials ARE deterministic
- Add assertions that AccessKeyId is identical for same SessionId (hash-based, deterministic)
- Add assertions that SessionToken is identical for same SessionId (hash-based, deterministic)
- Verify Expiration matches when SessionId is identical
- Document that SecretAccessKey is NOT deterministic (uses random.Read)
- Truncate expiresAt to second precision to avoid timing issues

This test now properly verifies that the deterministic components of credential
generation work correctly while acknowledging the cryptographic randomness of
the secret access key.

* test(sts): Assert credentials expiration relative to now in credential expiration tests

Replace wallclock assertions comparing tc.expiresAt to time.Now() (which only verified test setup)
with assertions that check sessionInfo.Credentials.Expiration relative to time.Now(), thus
exercising the code under test. Include clarifying comment for intent.

* feat(sts): Add IsExpired helpers and use them in expiration tests

- Add Credentials.IsExpired() and SessionInfo.IsExpired() in new file session_helpers.go.
- Update TestSTSSessionClaimsToSessionInfoCredentialExpiration to use helpers for clearer intent.

* test: revert test-only IsExpired helpers; restore direct expiration assertions

Remove session_helpers.go and update TestSTSSessionClaimsToSessionInfoCredentialExpiration to assert against sessionInfo.Credentials.Expiration directly as requested by reviewer.,

* fix(s3api): restore error return when access key not found

Critical fix: The previous cleanup of sensitive logging inadvertently removed
the error return statement when access key lookup fails. This caused the code
to continue and call isCredentialExpired() on nil pointer, crashing the server.

This explains EOF errors in CORS tests - server was panicking on requests
with invalid keys.

* fix(sts): make secret access key deterministic based on sessionId

CRITICAL FIX: The secret access key was being randomly generated, causing
signature verification failures when the same session token was used twice:

1. AssumeRoleWithWebIdentity generates random secret key X
2. Client signs request using secret key X
3. Server validates token, regenerates credentials via ToSessionInfo()
4. ToSessionInfo() calls generateSecretAccessKey(), which generates random key Y
5. Server tries to verify signature using key Y, but signature was made with X
6. Signature verification fails (SignatureDoesNotMatch)

Solution: Make generateSecretAccessKey() deterministic by using SHA256 hash
of 'secret-key:' + sessionId, just like generateAccessKeyId() already does.

This ensures:
- AssumeRoleWithWebIdentity generates deterministic secret key from sessionId
- ToSessionInfo() regenerates the same secret key from the same sessionId
- Client signature verification succeeds because keys match

Fixes: AWS SDK v2 CORS tests failing with 'ExpiredToken' errors
Affected files:
- weed/iam/sts/token_utils.go: Updated generateSecretAccessKey() signature
  and implementation to be deterministic
- Updated GenerateTemporaryCredentials() to pass sessionId parameter

Tests: All 54 STS tests pass with this fix

* test(sts): add comprehensive secret key determinism test coverage

Updated tests to verify that secret access keys are now deterministic:

1. Updated TestSTSSessionClaimsToSessionInfoCredentialGeneration:
   - Changed comment from 'NOT deterministic' to 'NOW deterministic'
   - Added assertion that same sessionId produces identical secret key
   - Explains why this is critical for signature verification

2. Added TestSecretAccessKeyDeterminism (new dedicated test):
   - Verifies secret key is identical across multiple calls with same sessionId
   - Verifies access key ID and session token are also identical
   - Verifies different sessionIds produce different credentials
   - Includes detailed comments explaining why determinism is critical

These tests ensure that the STS implementation correctly regenerates
deterministic credentials during signature verification. Without
determinism, signature verification would always fail because the
server would use different secret keys than the client used to sign.

* refactor(sts): add explicit zero-time expiration handling

Improved defensive programming in IsExpired() methods:

1. Credentials.IsExpired():
   - Added explicit check for zero-time expiration (time.Time{})
   - Treats uninitialized credentials as expired
   - Prevents accidentally treating uninitialized creds as valid

2. SessionInfo.IsExpired():
   - Added same explicit zero-time check
   - Treats uninitialized sessions as expired
   - Protects against bugs where sessions might not be properly initialized

This is important because time.Now().After(time.Time{}) returns true,
but explicitly checking for zero time makes the intent clear and helps
catch initialization bugs during code review and debugging.

* refactor(sts): remove unused IsExpired() helper functions

The session_helpers.go file contained two unused IsExpired() methods:
- Credentials.IsExpired()
- SessionInfo.IsExpired()

These were never called anywhere in the codebase. The actual expiration
checks use:
- isCredentialExpired() in weed/s3api/auth_credentials.go (S3 auth)
- Direct time.Now().After() checks

Removing unused code improves code clarity and reduces maintenance burden.

* fix(auth): pass STS session token to IAM authorization for V4 signature auth

CRITICAL FIX: Session tokens were not being passed to the authorization
check when using AWS Signature V4 authentication with STS credentials.

The bug:
1. AWS SDK sends request with X-Amz-Security-Token header (V4 signature)
2. validateSTSSessionToken validates the token, creates Identity with PrincipalArn
3. authorizeWithIAM only checked X-SeaweedFS-Session-Token (JWT auth header)
4. Since it was empty, fell into 'static V4' branch which set SessionToken = ''
5. AuthorizeAction returned ErrAccessDenied because SessionToken was empty

The fix (in authorizeWithIAM):
- Check X-SeaweedFS-Session-Token first (JWT auth)
- If empty, fallback to X-Amz-Security-Token header (V4 STS auth)
- If still empty, check X-Amz-Security-Token query param (presigned URLs)
- When session token is found with PrincipalArn, use 'STS V4 signature' path
- Only use 'static V4' path when there's no session token

This ensures:
- JWT Bearer auth with session tokens works (existing path)
- STS V4 signature auth with session tokens works (new path)
- Static V4 signature auth without session tokens works (existing path)

Logging updated to distinguish:
- 'JWT-based IAM authorization'
- 'STS V4 signature IAM authorization' (new)
- 'static V4 signature IAM authorization' (clarified)

* test(s3api): add comprehensive STS session token authorization test coverage

Added new test file auth_sts_v4_test.go with comprehensive tests for the
STS session token authorization fix:

1. TestAuthorizeWithIAMSessionTokenExtraction:
   - Verifies X-SeaweedFS-Session-Token is extracted from JWT auth headers
   - Verifies X-Amz-Security-Token is extracted from V4 STS auth headers
   - Verifies X-Amz-Security-Token is extracted from query parameters (presigned URLs)
   - Verifies JWT tokens take precedence when both are present
   - Regression test for the bug where V4 STS tokens were not being passed to authorization

2. TestSTSSessionTokenIntoCredentials:
   - Verifies STS credentials have all required fields (AccessKeyId, SecretAccessKey, SessionToken)
   - Verifies deterministic generation from sessionId (same sessionId = same credentials)
   - Verifies different sessionIds produce different credentials
   - Critical for signature verification: same session must regenerate same secret key

3. TestActionConstantsForV4Auth:
   - Verifies S3 action constants are available for authorization checks
   - Ensures ACTION_READ, ACTION_WRITE, etc. are properly defined

These tests ensure that:
- V4 Signature auth with STS tokens properly extracts and uses session tokens
- Session tokens are prioritized correctly (JWT > X-Amz-Security-Token header > query param)
- STS credentials are deterministically generated for signature verification
- The fix for passing STS session tokens to authorization is properly covered

All 3 test functions pass (6 test cases total).

* refactor(s3api): improve code quality and performance

- Rename authorization path constants to avoid conflict with existing authType enum
- Replace nested if/else with clean switch statement in authorizeWithIAM()
- Add determineIAMAuthPath() helper for clearer intent and testability
- Optimize key counting in auth_signature_v4.go: remove unnecessary slice allocation
- Fix timing assertion in session_claims_test.go: use WithinDuration for symmetric tolerance

These changes improve code readability, maintainability, and performance while
maintaining full backward compatibility and test coverage.

* refactor(s3api): use typed iamAuthPath for authorization path constants

- Define iamAuthPath as a named string type (similar to existing authType enum)
- Update constants to use explicit type: iamAuthPathJWT, iamAuthPathSTS_V4, etc.
- Update determineIAMAuthPath() to return typed iamAuthPath
- Improves type safety and prevents accidental string value misuse
2026-01-03 10:09:59 -08:00
Chris Lu
24556ebdcc Refine Bucket Size Metrics: Logical and Physical Size (#7943)
* refactor: implement logical size calculation with replication factor using dedicated helper

* ui: update bucket list to show logical/physical size
2026-01-02 18:28:00 -08:00
Chris Lu
b97d17f79f Standardize -ip.bind flags to default to empty and fall back to -ip (#7945)
* Add documentation for issue #7941 fix

* rm FIX_ISSUE_7941.md

* Standardize -ip.bind flags to default to empty string and fall back to -ip option

- Change s3 command -ip.bind default logic to use -ip instead of localhost
- Change sftp command -ip.bind default to empty and fall back to 0.0.0.0
- Update help text for consistency

* Fix compilation error: add -ip flag to s3 command and update bindIp fallback

* Revert -ip flag addition for s3 command, set bindIp fallback to 0.0.0.0

* Update s3 command -ip.bind help text to reflect correct default behavior
2026-01-02 18:23:17 -08:00
Chris Lu
4d4b2e2d4a add debug messages 2026-01-02 17:15:51 -08:00
Chris Lu
f2373f9e8d fix: directory incorrectly listed as object in S3 ListObjects (#7939)
* fix: directory incorrectly listed as object in S3 ListObjects

Regular directories (without MIME type) were only added to CommonPrefixes
when delimiter was exactly '/'. This caused directories to be silently
skipped for other delimiter values.

Changed the condition from 'delimiter == "/"' to 'delimiter != ""' to
ensure directories are correctly added to CommonPrefixes for any delimiter.

Fixes issue where directories like 'data/file.vhd' were being returned as
objects instead of prefixes in ListObjects responses.

* fix: complete the directory listing fix for all delimiters

Address reviewer feedback:
- Changed doListFilerEntries line 549 from 'delimiter != "/"' to 'delimiter == ""'
  This ensures directories are yielded to the callback for ANY delimiter, not just "/"
- Parameterized test to verify fix works with multiple delimiters (/, _, :)

The previous fix only addressed line 260 but line 549 was still causing
recursion for non-"/" delimiters, preventing directories from being
added to CommonPrefixes.

* docs: update test comment to reflect multiple delimiters

Address reviewer feedback - clarify that the test verifies behavior
for any non-empty delimiter, not just '/'.

* docs: clarify test comment with delimiter examples

Add specific examples of delimiters ('/', '_', ':') to make it clear
that the test verifies behavior with multiple delimiter types.

* fix: revert line 549 to original logic, only line 260 needed changing

The fix for directories being listed as objects only required changing
line 260 from 'delimiter == "/"' to 'delimiter != ""'.

Line 549 should remain as 'delimiter != "/"' to allow recursion for
delimiters that don't exist in paths (e.g., delimiter=z for paths like
b/a/c). This is correct S3 behavior.

Updated test to only verify delimiter="/" since other delimiters should
recurse into directories to find actual files.

* docs: clarify test scope in directory listing test
2026-01-02 15:52:37 -08:00
Chris Lu
0f786cf0d2 Fix S3 list objects marker adjustment for delimiters (#7938) 2026-01-02 13:34:35 -08:00
Sheya Bernstein
c0188db7cc chart: Set admin metrics port to http port (#7936)
* chart: Set admin metrics port to http port

* remove metrics reference
2026-01-02 12:15:33 -08:00
Chris Lu
7b20d7da3c Delete SSE-C_IMPLEMENTATION.md 2026-01-02 11:34:20 -08:00
Chris Lu
fca0a38435 Update s3api_object_handlers.go 2026-01-02 11:19:29 -08:00
Chris Lu
87b71029f7 4.05 2026-01-01 20:39:22 -08:00
Chris Lu
ade2e58cf4 refactoring 2026-01-01 19:20:59 -08:00
Chris Lu
4e2af080df optimize: enable immediate EC shard reporting during startup (#7933)
* optimize: enable immediate EC shard reporting during startup

Ported the immediate EC shard reporting feature from Enterprise to Community version.
This allows the master to be notified about EC shards immediately during volume server startup,
instead of waiting for the first heartbeat.

Changes:
1. Updated NewStore to initialize notification channels BEFORE loading volumes (fixes potential nil panic).
2. Added ecShardNotifyHandler to report EC shards to NewEcShardsChan during startup.
3. Implemented non-blocking channel send for EC reporting to prevent deadlock when loading many EC shards (fixing the enterprise bug 17ac1290c).
4. Updated DiskLocation and EC loading logic to support the callback.

This optimization improves cluster state consistency and startup speed for EC-heavy clusters.

* optimize: report actual EC shard size during startup

* optimize: increase notification channel buffer size to 1024

* optimize: fix variable shadowing in store.go
2026-01-01 15:39:54 -08:00
Sheya Bernstein
6f28cb7f87 helm: Support multiple hosts for S3 ingress (#7931) 2026-01-01 07:41:53 -08:00
Chris Lu
c405ff1374 feat(iam): add TLS configuration support for OIDC provider (#7929)
* feat(iam): add TLS configuration support for OIDC provider

Adds tlsCaCert and tlsInsecureSkipVerify options to OIDC provider configuration to allow using custom CA certificates and skipping verification in development environments.

* fix: use SystemCertPool for custom CA and add security warning

- Use x509.SystemCertPool() to preserve trust in public CAs
- Add warning log when TLSInsecureSkipVerify is enabled
- Addresses code review feedback from gemini-code-assist

* docs: enhance TLS configuration field documentation

- Add explicit warning about TLSInsecureSkipVerify production usage
- Clarify TLSCACert is for custom/self-signed certificates

* security: enforce TLS 1.2 minimum version

- Set MinVersion to TLS 1.2 to prevent downgrade attacks
- Ensures secure communication with OIDC providers

* security: validate CA cert path is absolute

- Add filepath.IsAbs check before reading CA certificate
- Prevents reading unintended files from relative paths
- Fail fast on misconfigured paths
2025-12-31 14:19:40 -08:00
Chris Lu
998bcf2b3f classify grpc errors 2025-12-31 13:40:14 -08:00
Chris Lu
f7f133166a adjust fuse logs 2025-12-31 13:04:05 -08:00
Chris Lu
60707f99d8 customizable adminServer 2025-12-31 12:02:16 -08:00
Chris Lu
31a4f57cd9 Fix: Add -admin.grpc flag to worker for explicit gRPC port (#7926) (#7927)
* Fix: Add -admin.grpc flag to worker for explicit gRPC port configuration

* Fix(helm): Add adminGrpcServer to worker configuration

* Refactor: Support host:port.grpcPort address format, revert -admin.grpc flag

* Helm: Conditionally append grpcPort to worker admin address

* weed/admin: fix "send on closed channel" panic in worker gRPC server

Make unregisterWorker connection-aware to prevent closing channels
belonging to newer connections.

* weed/worker: improve gRPC client stability and logging

- Fix goroutine leak in reconnection logic
- Refactor reconnection loop to exit on success and prevent busy-waiting
- Add session identification and enhanced logging to client handlers
- Use constant for internal reset action and remove unused variables

* weed/worker: fix worker state initialization and add lifecycle logs

- Revert workerState to use running boolean correctly
- Prevent handleStart failing by checking running state instead of startTime
- Add more detailed logs for worker startup events
2025-12-31 11:55:09 -08:00
Chris Lu
5a135f8c5a fuse: add FUSE performance options to weed fuse command (#7925)
This adds support for the new FUSE performance options to the 'weed fuse' command,
matching the functionality available in 'weed mount'.

Added options:
- writebackCache: Enable FUSE writeback cache for improved write performance
- asyncDio: Enable async direct I/O for better concurrency
- cacheSymlink: Enable symlink caching to reduce metadata lookups
- sys.novncache: (macOS only) Disable vnode name caching to avoid stale data

These options can now be used with mount -t weed:
  mount -t weed fuse /mnt -o "filer=localhost:8888,writebackCache=true,asyncDio=true"

This ensures feature parity between 'weed mount' and 'weed fuse' commands.
2025-12-31 01:04:16 -08:00
Chris Lu
099da9332b mount: add -sys.novncache flag for macOS vnode cache control (#7924)
* mount: add -asyncDio flag for async direct I/O

This adds support for async direct I/O via the -asyncDio flag.

Async DIO enables the FUSE_CAP_ASYNC_DIO capability, allowing the kernel
to perform direct I/O operations asynchronously. This improves concurrency
for applications that use O_DIRECT flag.

Benefits:
- Better concurrency for direct I/O operations
- Improved performance for applications using O_DIRECT
- Reduced blocking on I/O operations

Use cases:
- Database workloads that use direct I/O
- Applications that bypass page cache intentionally
- High-performance I/O scenarios

Implementation inspired by JuiceFS which enables this capability
for improved I/O performance.

Usage:
  weed mount -filer=localhost:8888 -dir=/mnt/seaweedfs -asyncDio

* mount: add all remaining FUSE options (asyncDio, cacheSymlink, novncache)

This combines the remaining three FUSE mount options on top of the merged writebackCache PR:

1. asyncDio: Enable async direct I/O for better concurrency
2. cacheSymlink: Enable symlink caching to reduce metadata lookups
3. novncache: (macOS only) Disable vnode name caching to avoid stale data

All options use the function parameter 'option' instead of global 'mountOptions'.
2025-12-31 00:30:32 -08:00
Chris Lu
b107c45176 mount: add -cacheSymlink flag for symlink caching (#7923)
* mount: add -asyncDio flag for async direct I/O

This adds support for async direct I/O via the -asyncDio flag.

Async DIO enables the FUSE_CAP_ASYNC_DIO capability, allowing the kernel
to perform direct I/O operations asynchronously. This improves concurrency
for applications that use O_DIRECT flag.

Benefits:
- Better concurrency for direct I/O operations
- Improved performance for applications using O_DIRECT
- Reduced blocking on I/O operations

Use cases:
- Database workloads that use direct I/O
- Applications that bypass page cache intentionally
- High-performance I/O scenarios

Implementation inspired by JuiceFS which enables this capability
for improved I/O performance.

Usage:
  weed mount -filer=localhost:8888 -dir=/mnt/seaweedfs -asyncDio

* mount: add all remaining FUSE options (asyncDio, cacheSymlink, novncache)

This combines the remaining three FUSE mount options on top of the merged writebackCache PR:

1. asyncDio: Enable async direct I/O for better concurrency
2. cacheSymlink: Enable symlink caching to reduce metadata lookups
3. novncache: (macOS only) Disable vnode name caching to avoid stale data

All options use the function parameter 'option' instead of global 'mountOptions'.
2025-12-31 00:30:22 -08:00
Chris Lu
9072e1d38a mount: add -asyncDio flag for async direct I/O (#7922)
* mount: add -asyncDio flag for async direct I/O

This adds support for async direct I/O via the -asyncDio flag.

Async DIO enables the FUSE_CAP_ASYNC_DIO capability, allowing the kernel
to perform direct I/O operations asynchronously. This improves concurrency
for applications that use O_DIRECT flag.

Benefits:
- Better concurrency for direct I/O operations
- Improved performance for applications using O_DIRECT
- Reduced blocking on I/O operations

Use cases:
- Database workloads that use direct I/O
- Applications that bypass page cache intentionally
- High-performance I/O scenarios

Implementation inspired by JuiceFS which enables this capability
for improved I/O performance.

Usage:
  weed mount -filer=localhost:8888 -dir=/mnt/seaweedfs -asyncDio

* mount: add all remaining FUSE options (asyncDio, cacheSymlink, novncache)

This combines the remaining three FUSE mount options on top of the merged writebackCache PR:

1. asyncDio: Enable async direct I/O for better concurrency
2. cacheSymlink: Enable symlink caching to reduce metadata lookups
3. novncache: (macOS only) Disable vnode name caching to avoid stale data

All options use the function parameter 'option' instead of global 'mountOptions'.
2025-12-31 00:30:12 -08:00
Chris Lu
1424fe6ed5 mount: add -writebackCache flag for FUSE writeback caching (#7921)
* mount: add -writebackCache flag for FUSE writeback caching

This adds support for FUSE writeback caching via the -writebackCache flag.

Writeback caching buffers writes in the kernel page cache before flushing
to the filesystem. This significantly improves performance for workloads
with many small writes by reducing the number of write syscalls.

Benefits:
- Improved write performance for small files (2-5x faster)
- Reduced latency for write-heavy workloads
- Better handling of bursty write patterns

Trade-offs:
- Data may be lost if system crashes before kernel flushes
- Not recommended for critical data without proper fsync usage
- Disabled by default for safety

Inspired by JuiceFS implementation which uses the same FUSE option.

Usage:
  weed mount -filer=localhost:8888 -dir=/mnt/seaweedfs -writebackCache

* Apply suggestion from @gemini-code-assist[bot]

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-30 23:23:02 -08:00
Chris Lu
8f182e68e8 Update README.md 2025-12-30 23:00:24 -08:00
cduk
568f1fe5b1 fix: include DiskType in metadata log volume assignment (#7918)
When writing metadata logs to /topics/.system/log, the filer was not
respecting the disk type configuration from path-specific rules
(fs.configure). This caused volume assignment failures when volume
servers used a specific disk type (e.g., "ssd") because the assign
request defaulted to empty disk type.

The fix adds DiskType to the VolumeAssignRequest in the filer's
metadata log write path, ensuring that path-specific disk type
configurations are properly honored for internal system writes.

Fixes errors like:
  "metadata log write failed /topics/.system/log/...: AssignVolume:
   failed to find writable volumes for collection"

Signed-off-by: Charles Darke <s.cduk@toodevious.com>
Co-authored-by: Charles Darke <s.cduk@toodevious.com>
2025-12-30 17:32:33 -08:00
Lisandro Pin
91fcc60898 Have volume.list account for EC shards when computing disk usage. (#7909) 2025-12-30 17:27:11 -08:00
dependabot[bot]
0995214b37 chore(deps): bump github.com/parquet-go/parquet-go from 0.25.1 to 0.26.3 (#7908)
Bumps [github.com/parquet-go/parquet-go](https://github.com/parquet-go/parquet-go) from 0.25.1 to 0.26.3.
- [Release notes](https://github.com/parquet-go/parquet-go/releases)
- [Changelog](https://github.com/parquet-go/parquet-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/parquet-go/parquet-go/compare/v0.25.1...v0.26.3)

---
updated-dependencies:
- dependency-name: github.com/parquet-go/parquet-go
  dependency-version: 0.26.3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-30 16:36:12 -08:00
dependabot[bot]
b1a9f344fe chore(deps): bump gocloud.dev/pubsub/natspubsub from 0.43.0 to 0.44.0 (#7907)
Bumps [gocloud.dev/pubsub/natspubsub](https://github.com/google/go-cloud) from 0.43.0 to 0.44.0.
- [Release notes](https://github.com/google/go-cloud/releases)
- [Commits](https://github.com/google/go-cloud/compare/v0.43.0...v0.44.0)

---
updated-dependencies:
- dependency-name: gocloud.dev/pubsub/natspubsub
  dependency-version: 0.44.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-30 16:36:03 -08:00
dependabot[bot]
ea4a2422f4 chore(deps): bump github.com/schollz/progressbar/v3 from 3.18.0 to 3.19.0 (#7906)
chore(deps): bump github.com/schollz/progressbar/v3

Bumps [github.com/schollz/progressbar/v3](https://github.com/schollz/progressbar) from 3.18.0 to 3.19.0.
- [Release notes](https://github.com/schollz/progressbar/releases)
- [Commits](https://github.com/schollz/progressbar/compare/v3.18.0...v3.19.0)

---
updated-dependencies:
- dependency-name: github.com/schollz/progressbar/v3
  dependency-version: 3.19.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-30 16:35:57 -08:00
dependabot[bot]
4391cff2e8 chore(deps): bump google.golang.org/api from 0.247.0 to 0.258.0 (#7905)
Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.247.0 to 0.258.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
- [Commits](https://github.com/googleapis/google-api-go-client/compare/v0.247.0...v0.258.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-version: 0.258.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-30 16:35:49 -08:00
dependabot[bot]
7bd9f6b5d8 chore(deps): bump modernc.org/sqlite from 1.39.0 to 1.42.2 (#7904)
Bumps [modernc.org/sqlite](https://gitlab.com/cznic/sqlite) from 1.39.0 to 1.42.2.
- [Commits](https://gitlab.com/cznic/sqlite/compare/v1.39.0...v1.42.2)

---
updated-dependencies:
- dependency-name: modernc.org/sqlite
  dependency-version: 1.42.2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-30 16:35:41 -08:00
Chris Lu
e3db95e0c1 Fix: Route unauthenticated specific STS requests to STS handler correctly (#7920)
* Fix STS Access Denied for AssumeRoleWithWebIdentity (Issue #7917)

* Fix logging regression: ensure IAM status is logged even if STS is enabled

* Address PR feedback: fix duplicate log, clarify comments, add comprehensive routing tests

* Add edge case test: authenticated STS action routes to IAM (auth precedence)
2025-12-30 16:34:05 -08:00
Chris Lu
b034cf188e Fix: trim prefix slash in ListObjectVersionsHandler (#7919)
* Fix: trim prefix slash in ListObjectVersionsHandler

* Add test for ListObjectVersions prefix handling

Test validates that prefix normalization works correctly with and without
leading slashes, ensuring the fix for /Veeam/Archive/ style prefixes.

* Simplify prefix test to validate normalization logic

The test now validates that the prefix normalization (TrimPrefix) works
correctly and that normalized prefixes match paths as expected. This is
a focused unit test that validates the core fix without requiring complex
mocking of the filer client.

* Enhance prefix test with full matchesPrefixFilter logic

Added test cases for directory traversal including:
- Directory matching with trailing slash
- canDescend logic for recursive directory search
- Full simulation of matchesPrefixFilter behavior

This provides more comprehensive coverage of the prefix normalization
fix and ensures it works correctly for both files and directories.
2025-12-30 14:54:37 -08:00
ai8future
73098c9792 filer.meta.backup: add -excludePaths flag to skip paths from backup (#7916)
* filer.meta.backup: add -excludePaths flag to skip paths from backup

Add a new -excludePaths flag that accepts comma-separated path prefixes
to exclude from backup operations. This enables selective backup when
certain directories (e.g., legacy buckets) should be skipped.

Usage:
  weed filer.meta.backup -filerDir=/buckets     -excludePaths=/buckets/legacy1,/buckets/legacy2     -config=backup.toml

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* filer.meta.backup: address code review feedback for -excludePaths

Fixes based on CodeRabbit and Gemini review:
- Cache parsed exclude paths in struct (performance)
- TrimSpace and skip empty entries (handles "a,,b" and "a, b")
- Add trailing slash for directory boundary matching (prevents
  /buckets/legacy matching /buckets/legacy_backup)
- Validate paths start with '/' and warn if not
- Log excluded paths at startup for debugging
- Fix rename handling: check both old and new paths, handle all
  four combinations correctly
- Add docstring to shouldExclude()
- Update UsageLine and Long description with new flag

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* filer.meta.backup: address nitpick feedback

- Clarify directory boundary matching behavior in help text
- Add warning when root path '/' is excluded (would exclude everything)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* includePrefixes and excludePrefixes

---------

Co-authored-by: C Shaw <cliffshaw@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
2025-12-30 14:28:50 -08:00
Chris Lu
7a18c3a16f Fix critical authentication bypass vulnerability (#7912) (#7915)
* Fix critical authentication bypass vulnerability (#7912)

The isRequestPostPolicySignatureV4() function was incorrectly returning
true for ANY POST request with multipart/form-data content type,
causing all such requests to bypass authentication in authRequest().

This allowed unauthenticated access to S3 API endpoints, as reported
in issue #7912 where any credentials (or no credentials) were accepted.

The fix removes isRequestPostPolicySignatureV4() entirely, preventing
authTypePostPolicy from ever being set. PostPolicy signature verification
is still properly handled in PostPolicyBucketHandler via
doesPolicySignatureMatch().

Fixes #7912

* add AuthPostPolicy

* refactor

* Optimizing Auth Credentials

* Update auth_credentials.go

* Update auth_credentials.go
2025-12-30 12:40:59 -08:00
Chris Lu
808205e38f s3: implement Bucket Owner Enforced for object ownership (#7913)
* s3: implement Bucket Owner Enforced for object ownership

Objects uploaded by service accounts (or any user) are now owned by
the bucket owner when the bucket has BucketOwnerEnforced ownership
policy (the modern AWS default since April 2023).

This provides a more intuitive ownership model where users expect
objects created by their service accounts to be owned by themselves.

- Modified setObjectOwnerFromRequest to check bucket ObjectOwnership
- When BucketOwnerEnforced: use bucket owner's account ID
- When ObjectWriter: use uploader's account ID (backward compatible)

* s3: add nil check and fix ownership logic hole

- Add nil check for bucketRegistry before calling GetBucketMetadata
- Fix logic hole where objects could be created without owner when
  BucketOwnerEnforced is set but bucket owner is nil
- Refactor to ensure objects always have an owner by falling back
  to uploader when bucket owner is unavailable
- Improve logging to distinguish between different fallback scenarios

Addresses code review feedback from Gemini on PR #7913

* s3: add comprehensive tests for object ownership logic

Add unit tests for setObjectOwnerFromRequest covering:
- BucketOwnerEnforced: uses bucket owner
- ObjectWriter: uses uploader
- BucketOwnerPreferred: uses uploader
- Nil owner fallback scenarios
- Bucket metadata errors
- Nil bucketRegistry
- Empty account ID handling

All 8 test cases pass, verifying correct ownership assignment
in all scenarios including edge cases.
2025-12-29 23:54:00 -08:00
Chris Lu
b6d99f1c9e Admin: Add Service Account Management UI (#7902)
* admin: add Service Account management UI

Add admin UI for managing service accounts:

New files:
- handlers/service_account_handlers.go - HTTP handlers
- dash/service_account_management.go - CRUD operations
- view/app/service_accounts.templ - UI template

Changes:
- dash/types.go - Add ServiceAccount and related types
- handlers/admin_handlers.go - Register routes and handlers
- view/layout/layout.templ - Add sidebar navigation link

Service accounts are stored as special identities with "sa:" prefix
in their name, using ABIA access key prefix. They can be created,
listed, enabled/disabled, and deleted through the admin UI.

Features:
- Create service accounts linked to parent users
- View and manage service account status
- Delete service accounts
- Service accounts inherit parent user permissions

Note: STS configuration is read-only (configured via JSON file).
Full STS integration requires changes from PR #7901.

* admin: use dropdown for parent user selection

Change the Parent User field from text input to dropdown when
creating a service account. The dropdown is populated with all
existing Object Store users.

Changes:
- Add AvailableUsers field to ServiceAccountsData type
- Populate available users in getServiceAccountsData handler
- Update template to use <select> element with user options

* admin: show secret access key on service account creation

Display both access key and secret access key when creating a
service account, with proper AWS CLI usage instructions.

Changes:
- Add SecretAccessKey field to ServiceAccount type (only populated on creation)
- Return secret key from CreateServiceAccount
- Add credentials modal with copy-to-clipboard buttons
- Show AWS CLI usage example with actual credentials
- Modal is non-dismissible until user confirms they saved credentials

The secret key is only shown once during creation for security.
After creation, only the access key ID is visible in the list.

* admin: address code review comments for service account management

- Persist creation dates in identity actions (createdAt:timestamp)
- Replace magic number slicing with len(accessKeyPrefix)
- Add bounds checking after strings.SplitN
- Use accessKeyPrefix constant instead of hardcoded "ABIA"

Creation dates are now stored as actions (e.g., "createdAt:1735473600")
and will persist across restarts. Helper functions getCreationDate()
and setCreationDate() manage the timestamp storage.

Addresses review comments from gemini-code-assist[bot] and coderabbitai[bot]

* admin: fix XSS vulnerabilities in service account details

Replace innerHTML with template literals with safe DOM creation.
The createSADetailsContent function now uses createElement and
textContent to prevent XSS attacks from malicious service account
data (id, description, parent_user, etc.).

Also added try-catch for date parsing to prevent exceptions on
malformed input.

Addresses security review comments from coderabbitai[bot]

* admin: add context.Context to service account management methods

Addressed PR #7902 review feedback:

1. All service account management methods now accept context.Context as
   first parameter to enable cancellation, deadlines, and tracing
2. Removed all context.Background() calls
3. Updated handlers to pass c.Request.Context() from HTTP requests

Methods updated:
- GetServiceAccounts
- GetServiceAccountDetails
- CreateServiceAccount
- UpdateServiceAccount
- DeleteServiceAccount
- GetServiceAccountByAccessKey

Note: Creation date persistence was already implemented using the
createdAt:<timestamp> action pattern as suggested in the review.

* admin: fix render flow to prevent partial HTML writes

Fixed ShowServiceAccounts handler to render template to an in-memory
buffer first before writing to the response. This prevents partial HTML
writes followed by JSON error responses, which would result in invalid
mixed content.

Changes:
- Render to bytes.Buffer first
- Only write to c.Writer if render succeeds
- Use c.AbortWithStatus on error instead of attempting JSON response
- Prevents any additional headers/body writes after partial write

* admin: fix error handling, date validation, and event parameters

Addressed multiple code review issues:

1. Proper 404 vs 500 error handling:
   - Added ErrServiceAccountNotFound sentinel error
   - GetServiceAccountDetails now wraps errors with sentinel
   - Handler uses errors.Is() to distinguish not-found from internal errors
   - Returns 404 only for missing resources, 500 for other errors
   - Logs internal errors before returning 500

2. Date validation in JavaScript:
   - Validate expiration date before using it
   - Check !isNaN(date.getTime()) to ensure valid date
   - Return validation error if date is invalid
   - Prevents invalid Date construction

3. Event parameter handling:
   - copyToClipboard now accepts event parameter
   - Updated onclick attributes to pass event object
   - Prevents reliance on window.event
   - More explicit and reliable event handling

* admin: replace deprecated execCommand with Clipboard API

Replaced deprecated document.execCommand('copy') with modern
navigator.clipboard.writeText() API for better security and UX.

Changes:
- Made copyToClipboard async to support Clipboard API
- Use navigator.clipboard.writeText() as primary method
- Fallback to execCommand if Clipboard API fails (older browsers)
- Added console warning when fallback is used
- Maintains same visual feedback behavior

* admin: improve security and UX for error handling

Addressed code review feedback:

1. Security: Remove sensitive error details from API responses
   - CreateServiceAccount: Return generic error message
   - UpdateServiceAccount: Return generic error message
   - DeleteServiceAccount: Return generic error message
   - Detailed errors still logged server-side via glog.Errorf()
   - Prevents exposure of internal system details to clients

2. UX: Replace alert() with Bootstrap toast notifications
   - Implemented showToast() function using Bootstrap 5 toasts
   - Non-blocking, modern notification system
   - Auto-dismiss after 5 seconds
   - Proper HTML escaping to prevent XSS
   - Toast container positioned at top-right
   - Success (green) and error (red) variants

* admin: complete error handling improvements

Addressed remaining security review feedback:

1. GetServiceAccounts: Remove error details from response
   - Log errors server-side via glog.Errorf()
   - Return generic error message to client

2. UpdateServiceAccount & DeleteServiceAccount:
   - Wrap not-found errors with ErrServiceAccountNotFound sentinel
   - Enables proper 404 vs 500 distinction in handlers

3. Update & Delete handlers:
   - Added errors.Is() check for ErrServiceAccountNotFound
   - Return 404 for missing resources
   - Return 500 for internal errors with logging
   - Consistent with GetServiceAccountDetails behavior

All handlers now properly distinguish not-found (404) from internal
errors (500) and never expose sensitive error details to clients.

* admin: implement expiration support and improve code quality

Addressed final code review feedback:

1. Expiration Support:
   - Added expiration helper functions (getExpiration, setExpiration)
   - Implemented expiration in CreateServiceAccount
   - Implemented expiration in UpdateServiceAccount
   - Added Expiration field to ServiceAccount struct
   - Parse and validate RFC3339 expiration dates

2. Constants for Magic Strings:
   - Added StatusActive, StatusInactive constants
   - Added disabledAction, serviceAccountPrefix constants
   - Replaced all magic strings with constants throughout
   - Improves maintainability and prevents typos

3. Helper Function to Reduce Duplication:
   - Created identityToServiceAccount() helper
   - Reduces code duplication across Get/Update/Delete methods
   - Centralizes ServiceAccount struct building logic

4. Fixed time.Now() Fallback:
   - Changed from time.Now() to time.Time{} for legacy accounts
   - Prevents creation date from changing on each fetch
   - UI can display zero time as "N/A" or blank

All code quality issues addressed!

* admin: fix StatusActive reference in handler

Use dash.StatusActive to properly reference the constant from the dash package.

* admin: regenerate templ files

Regenerated all templ Go files after recent template changes.
The AWS CLI usage example already uses proper <pre><code> formatting
which preserves line breaks for better readability.

* admin: add explicit white-space CSS to AWS CLI example

Added style="white-space: pre-wrap;" to the pre tag to ensure
line breaks are preserved and displayed correctly in all browsers.
This forces the browser to respect the newlines in the code block.

* admin: fix AWS CLI example to display on separate lines

Replaced pre/code block with individual div elements for each line.
This ensures each command displays on its own line regardless of
how templ processes whitespace. Each line is now a separate div
with font-monospace styling for code appearance.

* make

* admin: filter service accounts from parent user dropdown

Service accounts should not appear as selectable parent users when
creating new service accounts. Added filter to GetObjectStoreUsers()
to skip identities with "sa:" prefix, ensuring only actual IAM users
are shown in the parent user dropdown.

* admin: address code review feedback

- Use constants for magic strings in service account management
- Add Expiration field to service account responses
- Add nil checks and context propagation
- Improve templates (date validation, async clipboard, toast notifications)

* Update service_accounts_templ.go
2025-12-29 22:57:02 -08:00
Chris Lu
ae9a943ef6 IAM: Add Service Account Support (#7744) (#7901)
* iam: add ServiceAccount protobuf schema

Add ServiceAccount message type to iam.proto with support for:
- Unique ID and parent user linkage
- Optional expiration timestamp
- Separate credentials (access key/secret)
- Action restrictions (subset of parent)
- Enable/disable status

This is the first step toward implementing issue #7744
(IAM Service Account Support).

* iam: add service account response types

Add IAM API response types for service account operations:
- ServiceAccountInfo struct for marshaling account details
- CreateServiceAccountResponse
- DeleteServiceAccountResponse
- ListServiceAccountsResponse
- GetServiceAccountResponse
- UpdateServiceAccountResponse

Also add type aliases in iamapi package for backwards compatibility.

Part of issue #7744 (IAM Service Account Support).

* iam: implement service account API handlers

Add CRUD operations for service accounts:
- CreateServiceAccount: Creates service account with ABIA key prefix
- DeleteServiceAccount: Removes service account and parent linkage
- ListServiceAccounts: Lists all or filtered by parent user
- GetServiceAccount: Retrieves service account details
- UpdateServiceAccount: Modifies status, description, expiration

Service accounts inherit parent user's actions by default and
support optional expiration timestamps.

Part of issue #7744 (IAM Service Account Support).

* sts: add AssumeRoleWithWebIdentity HTTP endpoint

Add STS API HTTP endpoint for AWS SDK compatibility:
- Create s3api_sts.go with HTTP handlers matching AWS STS spec
- Support AssumeRoleWithWebIdentity action with JWT token
- Return XML response with temporary credentials (AccessKeyId,
  SecretAccessKey, SessionToken) matching AWS format
- Register STS route at POST /?Action=AssumeRoleWithWebIdentity

This enables AWS SDKs (boto3, AWS CLI, etc.) to obtain temporary
S3 credentials using OIDC/JWT tokens.

Part of issue #7744 (IAM Service Account Support).

* test: add service account and STS integration tests

Add integration tests for new IAM features:

s3_service_account_test.go:
- TestServiceAccountLifecycle: Create, Get, List, Update, Delete
- TestServiceAccountValidation: Error handling for missing params

s3_sts_test.go:
- TestAssumeRoleWithWebIdentityValidation: Parameter validation
- TestAssumeRoleWithWebIdentityWithMockJWT: JWT token handling

Tests skip gracefully when SeaweedFS is not running or when IAM
features are not configured.

Part of issue #7744 (IAM Service Account Support).

* iam: address code review comments

- Add constants for service account ID and key lengths
- Use strconv.ParseInt instead of fmt.Sscanf for better error handling
- Allow clearing descriptions by checking key existence in url.Values
- Replace magic numbers (12, 20, 40) with named constants

Addresses review comments from gemini-code-assist[bot]

* test: add proper error handling in service account tests

Use require.NoError(t, err) for io.ReadAll and xml.Unmarshal
to prevent silent failures and ensure test reliability.

Addresses review comment from gemini-code-assist[bot]

* test: add proper error handling in STS tests

Use require.NoError(t, err) for io.ReadAll and xml.Unmarshal
to prevent silent failures and ensure test reliability.
Repeated this fix throughout the file.

Addresses review comment from gemini-code-assist[bot] in PR #7901.

* iam: address additional code review comments

- Specific error code mapping for STS service errors
- Distinguish between Sender and Receiver error types in STS responses
- Add nil checks for credentials in List/GetServiceAccount
- Validate expiration date is in the future
- Improve integration test error messages (include response body)
- Add credential verification step in service account tests

Addresses remaining review comments from gemini-code-assist[bot] across multiple files.

* iam: fix shared slice reference in service account creation

Copy parent's actions to create an independent slice for the service
account instead of sharing the underlying array. This prevents
unexpected mutations when the parent's actions are modified later.

Addresses review comment from coderabbitai[bot] in PR #7901.

* iam: remove duplicate unused constant

Removed redundant iamServiceAccountKeyPrefix as ServiceAccountKeyPrefix
is already defined and used.

Addresses remaining cleanup task.

* sts: document limitation of string-based error mapping

Added TODO comment explaining that the current string-based error
mapping approach is fragile and should be replaced with typed errors
from the STS service in a future refactoring.

This addresses the architectural concern raised in code review while
deferring the actual implementation to a separate PR to avoid scope
creep in the current service account feature addition.

* iam: fix remaining review issues

- Add future-date validation for expiration in UpdateServiceAccount
- Reorder tests so credential verification happens before deletion
- Fix compilation error by using correct JWT generation methods

Addresses final review comments from coderabbitai[bot].

* iam: fix service account access key length

The access key IDs were incorrectly generated with 24 characters
instead of the AWS-standard 20 characters. This was caused by
generating 20 random characters and then prepending the 4-character
ABIA prefix.

Fixed by subtracting the prefix length from AccessKeyLength, so the
final key is: ABIA (4 chars) + random (16 chars) = 20 chars total.

This ensures compatibility with S3 clients that validate key length.

* test: add comprehensive service account security tests

Added comprehensive integration tests for service account functionality:

- TestServiceAccountS3Access: Verify SA credentials work for S3 operations
- TestServiceAccountExpiration: Test expiration date validation and enforcement
- TestServiceAccountInheritedPermissions: Verify parent-child relationship
- TestServiceAccountAccessKeyFormat: Validate AWS-compatible key format (ABIA prefix, 20 char length)

These tests ensure SeaweedFS service accounts are compatible with AWS
conventions and provide robust security coverage.

* iam: remove unused UserAccessKeyPrefix constant

Code cleanup to remove unused constants.

* iam: remove unused iamCommonResponse type alias

Code cleanup to remove unused type aliases.

* iam: restore and use UserAccessKeyPrefix constant

Restored UserAccessKeyPrefix constant and updated s3api tests to use it
instead of hardcoded strings for better maintainability and consistency.

* test: improve error handling in service account security tests

Added explicit error checking for io.ReadAll and xml.Unmarshal in
TestServiceAccountExpiration to ensure failures are reported correctly and
cleanup is performed only when appropriate. Also added logging for failed
responses.

* test: use t.Cleanup for reliable resource cleanup

Replaced defer with t.Cleanup to ensure service account cleanup runs even
when require.NoError fails. Also switched from manual error checking to
require.NoError for more idiomatic testify usage.

* iam: add CreatedBy field and optimize identity lookups

- Added createdBy parameter to CreateServiceAccount to track who created each service account
- Extract creator identity from request context using GetIdentityNameFromContext
- Populate created_by field in ServiceAccount protobuf
- Added findIdentityByName helper function to optimize identity lookups
- Replaced nested loops with O(n) helper function calls in CreateServiceAccount and DeleteServiceAccount

This addresses code review feedback for better auditing and performance.

* iam: prevent user deletion when service accounts exist

Following AWS IAM behavior, prevent deletion of users that have active
service accounts. This ensures explicit cleanup and prevents orphaned
service account resources with invalid ParentUser references.

Users must delete all associated service accounts before deleting the
parent user, providing safer resource management.

* sts: enhance TODO with typed error implementation guidance

Updated TODO comment with detailed implementation approach for replacing
string-based error matching with typed errors using errors.Is(). This
provides a clear roadmap for a follow-up PR to improve error handling
robustness and maintainability.

* iam: add operational limits for service account creation

Added AWS IAM-compatible safeguards to prevent resource exhaustion:
- Maximum 100 service accounts per user (LimitExceededException)
- Maximum 1000 character description length (InvalidInputException)

These limits prevent accidental or malicious resource exhaustion while
not impacting legitimate use cases.

* iam: add missing operational limit constants

Added MaxServiceAccountsPerUser and MaxDescriptionLength constants that
were referenced in the previous commit but not defined.

* iam: enforce service account expiration during authentication

CRITICAL SECURITY FIX: Expired service account credentials were not being
rejected during authentication, allowing continued access after expiration.

Changes:
- Added Expiration field to Credential struct
- Populate expiration when loading service accounts from configuration
- Check expiration in all authentication paths (V2 and V4 signatures)
- Return ErrExpiredToken for expired credentials

This ensures expired service accounts are properly rejected at authentication
time, matching AWS IAM behavior and preventing unauthorized access.

* iam: fix error code for expired service account credentials

Use ErrAccessDenied instead of non-existent ErrExpiredToken for expired
service account credentials. This provides appropriate access denial for
expired credentials while maintaining AWS-compatible error responses.

* iam: fix remaining ErrExpiredToken references

Replace all remaining instances of non-existent ErrExpiredToken with
ErrAccessDenied for expired service account credentials.

* iam: apply AWS-standard key format to user access keys

Updated CreateAccessKey to generate AWS-standard 20-character access keys
with AKIA prefix for regular users, matching the format used for service
accounts. This ensures consistency across all access key types and full
AWS compatibility.

- Access keys: AKIA + 16 random chars = 20 total (was 21 chars, no prefix)
- Secret keys: 40 random chars (was 42, now matches AWS standard)
- Uses AccessKeyLength and UserAccessKeyPrefix constants

* sts: replace fragile string-based error matching with typed errors

Implemented robust error handling using typed errors and errors.Is() instead
of fragile strings.Contains() matching. This decouples the HTTP layer from
service implementation details and prevents errors from being miscategorized
if error messages change.

Changes:
- Added typed error variables to weed/iam/sts/constants.go:
  * ErrTypedTokenExpired
  * ErrTypedInvalidToken
  * ErrTypedInvalidIssuer
  * ErrTypedInvalidAudience
  * ErrTypedMissingClaims

- Updated STS service to wrap provider authentication errors with typed errors
- Replaced strings.Contains() with errors.Is() in HTTP layer for error checking
- Removed TODO comment as the improvement is now implemented

This makes error handling more maintainable and reliable.

* sts: eliminate all string-based error matching with provider-level typed errors

Completed the typed error implementation by adding provider-level typed errors
and updating provider implementations to return them. This eliminates ALL
fragile string matching throughout the entire error handling stack.

Changes:
- Added typed error definitions to weed/iam/providers/errors.go:
  * ErrProviderTokenExpired
  * ErrProviderInvalidToken
  * ErrProviderInvalidIssuer
  * ErrProviderInvalidAudience
  * ErrProviderMissingClaims

- Updated OIDC provider to wrap JWT validation errors with typed provider errors
- Replaced strings.Contains() with errors.Is() in STS service for error mapping
- Complete error chain: Provider -> STS -> HTTP layer, all using errors.Is()

This provides:
- Reliable error classification independent of error message content
- Type-safe error checking throughout the stack
- No order-dependent string matching
- Maintainable error handling that won't break with message changes

* oidc: use jwt.ErrTokenExpired instead of string matching

Replaced the last remaining string-based error check with the JWT library's
exported typed error. This makes the error detection independent of error
message content and more robust against library updates.

Changed from:
  strings.Contains(errMsg, "expired")
To:
  errors.Is(err, jwt.ErrTokenExpired)

This completes the elimination of ALL string-based error matching throughout
the entire authentication stack.

* iam: add description length validation to UpdateServiceAccount

Fixed inconsistency where UpdateServiceAccount didn't validate description
length against MaxDescriptionLength, allowing operational limits to be
bypassed during updates.

Now validates that updated descriptions don't exceed 1000 characters,
matching the validation in CreateServiceAccount.

* iam: refactor expiration check into helper method

Extracted duplicated credential expiration check logic into a helper method
to reduce code duplication and improve maintainability.

Added Credential.isCredentialExpired() method and replaced 5 instances of
inline expiration checks across auth_signature_v2.go and auth_signature_v4.go.

* iam: address critical Copilot security and consistency feedback

Fixed three critical issues identified by Copilot code review:

1. SECURITY: Prevent loading disabled service account credentials
   - Added check to skip disabled service accounts during credential loading
   - Disabled accounts can no longer authenticate

2. Add DurationSeconds validation for STS AssumeRoleWithWebIdentity
   - Enforce AWS-compatible range: 900-43200 seconds (15 min - 12 hours)
   - Returns proper error for out-of-range values

3. Fix expiration update consistency in UpdateServiceAccount
   - Added key existence check like Description field
   - Allows explicit clearing of expiration by setting to empty string
   - Distinguishes between "not updating" and "clearing expiration"

* sts: remove unused durationSecondsStr variable

Fixed build error from unused variable after refactoring duration parsing.

* iam: address remaining Copilot feedback and remove dead code

Completed remaining Copilot code review items:

1. Remove unused getPermission() method (dead code)
   - Method was defined but never called anywhere

2. Improve slice modification safety in DeleteServiceAccount
   - Replaced append-with-slice-operations with filter pattern
   - Avoids potential issues from mutating slice during iteration

3. Fix route registration order
   - Moved STS route registration BEFORE IAM route
   - Prevents IAM route from intercepting STS requests
   - More specific route (with query parameter) now registered first

* iam: improve expiration validation and test cleanup robustness

Addressed additional Copilot feedback:

1. Make expiration validation more explicit
   - Added explicit check for negative values
   - Added comment clarifying that 0 is allowed to clear expiration
   - Improves code readability and intent

2. Fix test cleanup order in s3_service_account_test.go
   - Track created service accounts in a slice
   - Delete all service accounts before deleting parent user
   - Prevents DeleteConflictException during cleanup
   - More robust cleanup even if test fails mid-execution

Note: s3_service_account_security_test.go already had correct cleanup
order due to LIFO defer execution.

* test: remove redundant variable assignments

Removed duplicate assignments of createdSAId, createdAccessKeyId, and
createdSecretAccessKey on lines 148-150 that were already assigned on
lines 132-134.
2025-12-29 20:17:23 -08:00
Chris Lu
288ba5fec8 mount: let filer handle chunk deletion decision (#7900)
* mount: let filer handle chunk deletion decision

Remove chunk deletion decision from FUSE mount's Unlink operation.
Previously, the mount decided whether to delete chunks based on
its locally cached entry's HardLinkCounter, which could be stale.

Now always pass isDeleteData=true and let the filer make the
authoritative decision based on its own data. This prevents
potential inconsistencies when:
- The FUSE mount's cached entry is stale
- Race conditions occur between multiple mounts
- Direct filer operations change hard link counts

* filer: check hard link counter before deleting chunks

When deleting an entry, only delete the underlying chunks if:
1. It is not a hard link
2. OR it is the last hard link (counter <= 1)

This protects against data loss when a client (like FUSE mount)
requests chunk deletion for a file that has multiple hard links.
2025-12-28 23:22:13 -08:00
Chris Lu
29a245052e tidy 2025-12-28 19:31:54 -08:00
Lisandro Pin
6b98b52acc Fix reporting of EC shard sizes from nodes to masters. (#7835)
SeaweedFS tracks EC shard sizes on topology data stuctures, but this information is never
relayed to master servers :( The end result is that commands reporting disk usage, such
as `volume.list` and `cluster.status`, yield incorrect figures when EC shards are present.

As an example for a simple 5-node test cluster, before...

```
> volume.list
Topology volumeSizeLimit:30000 MB hdd(volume:6/40 active:6 free:33 remote:0)
  DataCenter DefaultDataCenter hdd(volume:6/40 active:6 free:33 remote:0)
    Rack DefaultRack hdd(volume:6/40 active:6 free:33 remote:0)
      DataNode 192.168.10.111:9001 hdd(volume:1/8 active:1 free:7 remote:0)
        Disk hdd(volume:1/8 active:1 free:7 remote:0) id:0
          volume id:3  size:88967096  file_count:172  replica_placement:2  version:3  modified_at_second:1766349617
          ec volume id:1 collection: shards:[1 5]
        Disk hdd total size:88967096 file_count:172
      DataNode 192.168.10.111:9001 total size:88967096 file_count:172
  DataCenter DefaultDataCenter hdd(volume:6/40 active:6 free:33 remote:0)
    Rack DefaultRack hdd(volume:6/40 active:6 free:33 remote:0)
      DataNode 192.168.10.111:9002 hdd(volume:2/8 active:2 free:6 remote:0)
        Disk hdd(volume:2/8 active:2 free:6 remote:0) id:0
          volume id:2  size:77267536  file_count:166  replica_placement:2  version:3  modified_at_second:1766349617
          volume id:3  size:88967096  file_count:172  replica_placement:2  version:3  modified_at_second:1766349617
          ec volume id:1 collection: shards:[0 4]
        Disk hdd total size:166234632 file_count:338
      DataNode 192.168.10.111:9002 total size:166234632 file_count:338
  DataCenter DefaultDataCenter hdd(volume:6/40 active:6 free:33 remote:0)
    Rack DefaultRack hdd(volume:6/40 active:6 free:33 remote:0)
      DataNode 192.168.10.111:9003 hdd(volume:1/8 active:1 free:7 remote:0)
        Disk hdd(volume:1/8 active:1 free:7 remote:0) id:0
          volume id:2  size:77267536  file_count:166  replica_placement:2  version:3  modified_at_second:1766349617
          ec volume id:1 collection: shards:[2 6]
        Disk hdd total size:77267536 file_count:166
      DataNode 192.168.10.111:9003 total size:77267536 file_count:166
  DataCenter DefaultDataCenter hdd(volume:6/40 active:6 free:33 remote:0)
    Rack DefaultRack hdd(volume:6/40 active:6 free:33 remote:0)
      DataNode 192.168.10.111:9004 hdd(volume:2/8 active:2 free:6 remote:0)
        Disk hdd(volume:2/8 active:2 free:6 remote:0) id:0
          volume id:2  size:77267536  file_count:166  replica_placement:2  version:3  modified_at_second:1766349617
          volume id:3  size:88967096  file_count:172  replica_placement:2  version:3  modified_at_second:1766349617
          ec volume id:1 collection: shards:[3 7]
        Disk hdd total size:166234632 file_count:338
      DataNode 192.168.10.111:9004 total size:166234632 file_count:338
  DataCenter DefaultDataCenter hdd(volume:6/40 active:6 free:33 remote:0)
    Rack DefaultRack hdd(volume:6/40 active:6 free:33 remote:0)
      DataNode 192.168.10.111:9005 hdd(volume:0/8 active:0 free:8 remote:0)
        Disk hdd(volume:0/8 active:0 free:8 remote:0) id:0
          ec volume id:1 collection: shards:[8 9 10 11 12 13]
        Disk hdd total size:0 file_count:0
    Rack DefaultRack total size:498703896 file_count:1014
  DataCenter DefaultDataCenter total size:498703896 file_count:1014
total size:498703896 file_count:1014
```

...and after:

```
> volume.list
Topology volumeSizeLimit:30000 MB hdd(volume:6/40 active:6 free:33 remote:0)
  DataCenter DefaultDataCenter hdd(volume:6/40 active:6 free:33 remote:0)
    Rack DefaultRack hdd(volume:6/40 active:6 free:33 remote:0)
      DataNode 192.168.10.111:9001 hdd(volume:1/8 active:1 free:7 remote:0)
        Disk hdd(volume:1/8 active:1 free:7 remote:0) id:0
          volume id:2  size:81761800  file_count:161  replica_placement:2  version:3  modified_at_second:1766349495
          ec volume id:1 collection: shards:[1 5 9] sizes:[1:8.00 MiB 5:8.00 MiB 9:8.00 MiB] total:24.00 MiB
        Disk hdd total size:81761800 file_count:161
      DataNode 192.168.10.111:9001 total size:81761800 file_count:161
  DataCenter DefaultDataCenter hdd(volume:6/40 active:6 free:33 remote:0)
    Rack DefaultRack hdd(volume:6/40 active:6 free:33 remote:0)
      DataNode 192.168.10.111:9002 hdd(volume:1/8 active:1 free:7 remote:0)
        Disk hdd(volume:1/8 active:1 free:7 remote:0) id:0
          volume id:3  size:88678712  file_count:170  replica_placement:2  version:3  modified_at_second:1766349495
          ec volume id:1 collection: shards:[11 12 13] sizes:[11:8.00 MiB 12:8.00 MiB 13:8.00 MiB] total:24.00 MiB
        Disk hdd total size:88678712 file_count:170
      DataNode 192.168.10.111:9002 total size:88678712 file_count:170
  DataCenter DefaultDataCenter hdd(volume:6/40 active:6 free:33 remote:0)
    Rack DefaultRack hdd(volume:6/40 active:6 free:33 remote:0)
      DataNode 192.168.10.111:9003 hdd(volume:2/8 active:2 free:6 remote:0)
        Disk hdd(volume:2/8 active:2 free:6 remote:0) id:0
          volume id:2  size:81761800  file_count:161  replica_placement:2  version:3  modified_at_second:1766349495
          volume id:3  size:88678712  file_count:170  replica_placement:2  version:3  modified_at_second:1766349495
          ec volume id:1 collection: shards:[0 4 8] sizes:[0:8.00 MiB 4:8.00 MiB 8:8.00 MiB] total:24.00 MiB
        Disk hdd total size:170440512 file_count:331
      DataNode 192.168.10.111:9003 total size:170440512 file_count:331
  DataCenter DefaultDataCenter hdd(volume:6/40 active:6 free:33 remote:0)
    Rack DefaultRack hdd(volume:6/40 active:6 free:33 remote:0)
      DataNode 192.168.10.111:9004 hdd(volume:2/8 active:2 free:6 remote:0)
        Disk hdd(volume:2/8 active:2 free:6 remote:0) id:0
          volume id:2  size:81761800  file_count:161  replica_placement:2  version:3  modified_at_second:1766349495
          volume id:3  size:88678712  file_count:170  replica_placement:2  version:3  modified_at_second:1766349495
          ec volume id:1 collection: shards:[2 6 10] sizes:[2:8.00 MiB 6:8.00 MiB 10:8.00 MiB] total:24.00 MiB
        Disk hdd total size:170440512 file_count:331
      DataNode 192.168.10.111:9004 total size:170440512 file_count:331
  DataCenter DefaultDataCenter hdd(volume:6/40 active:6 free:33 remote:0)
    Rack DefaultRack hdd(volume:6/40 active:6 free:33 remote:0)
      DataNode 192.168.10.111:9005 hdd(volume:0/8 active:0 free:8 remote:0)
        Disk hdd(volume:0/8 active:0 free:8 remote:0) id:0
          ec volume id:1 collection: shards:[3 7] sizes:[3:8.00 MiB 7:8.00 MiB] total:16.00 MiB
        Disk hdd total size:0 file_count:0
    Rack DefaultRack total size:511321536 file_count:993
  DataCenter DefaultDataCenter total size:511321536 file_count:993
total size:511321536 file_count:993
```
2025-12-28 19:30:42 -08:00
Chris Lu
2b529e310d s3: Add SOSAPI support for Veeam integration (#7899)
* s3api: Add SOSAPI core implementation and tests

Implement Smart Object Storage API (SOSAPI) support for Veeam integration.

- Add s3api_sosapi.go with XML structures and handlers for system.xml and capacity.xml
- Implement virtual object detection and dynamic XML generation
- Add capacity retrieval via gRPC (to be optimized in follow-up)
- Include comprehensive unit tests covering detection, XML generation, and edge cases

This enables Veeam Backup & Replication to discover SeaweedFS capabilities and capacity.

* s3api: Integrate SOSAPI handlers into GetObject and HeadObject

Add early interception for SOSAPI virtual objects in GetObjectHandler and HeadObjectHandler.

- Check for SOSAPI objects (.system-*/system.xml, .system-*/capacity.xml) before normal processing
- Delegate to handleSOSAPIGetObject and handleSOSAPIHeadObject when detected
- Ensures virtual objects are served without hitting storage layer

* s3api: Allow anonymous access to SOSAPI virtual objects

Enable discovery of SOSAPI capabilities without requiring credentials.

- Modify AuthWithPublicRead to bypass auth for SOSAPI objects if bucket exists
- Supports Veeam's initial discovery phase before full IAM setup
- Validates bucket existence to prevent information disclosure

* s3api: Fix SOSAPI capacity retrieval to use proper master connection

Fix gRPC error by connecting directly to master servers instead of through filer.

- Use pb.WithOneOfGrpcMasterClients with s3a.option.Masters
- Matches pattern used in bucket_size_metrics.go
- Resolves "unknown service master_pb.Seaweed" error
- Gracefully handles missing master configuration

* Merge origin/master and implement robust SOSAPI capacity logic

- Resolved merge conflict in s3api_sosapi.go
- Replaced global Statistics RPC with VolumeList (topology) for accurate bucket-specific 'Used' calculation
- Added bucket quota support (report quota as Capacity if set)
- Implemented cluster-wide capacity calculation from topology when no quota
- Added unit tests for topology capacity and usage calculations

* s3api: Remove anonymous access to SOSAPI virtual objects

Reverts the implicit public access for system.xml and capacity.xml.
Requests to these objects now require standard S3 authentication,
unless the bucket has a public-read policy.

* s3api: Refactor SOSAPI handlers to use http.ServeContent

- Consolidate handleSOSAPIGetObject and handleSOSAPIHeadObject into serveSOSAPI
- Use http.ServeContent for standard Range, HEAD, and ETag handling
- Remove manual range request handler and reduce code duplication

* s3api: Unify SOSAPI request handling

- Replaced handleSOSAPIGetObject and handleSOSAPIHeadObject with single HandleSOSAPI function
- Updated call sites in s3api_object_handlers.go
- Simplifies logic and ensures consistent handling for both GET and HEAD requests via http.ServeContent

* s3api: Restore distinct SOSAPI GET/HEAD handlers

- Reverted unified handler to enforce distinct behavior for GET and HEAD
- GET: Supports Range requests via http.ServeContent
- HEAD: Explicitly ignores Range requests (matches MinIO behavior) and writes headers only

* s3api: Refactor SOSAPI handlers to eliminate duplication

- Extracted shared content generation logic into generateSOSAPIContent helper
- handleSOSAPIGetObject: Uses http.ServeContent (supports Range requests)
- handleSOSAPIHeadObject: Manually sets headers (no Range, no body)
- Maintains distinct behavior while following DRY principle

* s3api: Remove low-value SOSAPI tests

Removed tests that validate standard library behavior or trivial constant checks:
- TestSOSAPIConstants (string prefix/suffix checks)
- TestSystemInfoXMLRootElement (redundant with TestGenerateSystemXML)
- TestSOSAPIXMLContentType (tests httptest, not our code)
- TestHTTPTimeFormat (tests standard library)
- TestCapacityInfoXMLStruct (tests Go's XML marshaling)

Kept tests that validate actual business logic and edge cases.

* s3api: Use consistent S3-compliant error responses in SOSAPI

Replaced http.Error() with s3err.WriteErrorResponse() for internal errors
to ensure all SOSAPI errors return S3-compliant XML instead of plain text.

* s3api: Return error when no masters configured for SOSAPI capacity

Changed getCapacityInfo to return an error instead of silently returning
zero capacity when no master servers are configured. This helps surface
configuration issues rather than masking them.

* s3api: Use collection name with FilerGroup prefix for SOSAPI capacity

Fixed collectBucketUsageFromTopology to use s3a.getCollectionName(bucket)
instead of raw bucket name. This ensures collection comparisons match actual
volume collection names when FilerGroup prefix is configured.

* s3api: Apply PR review feedback for SOSAPI

- Renamed `bucket` parameter to `collectionName` in collectBucketUsageFromTopology for clarity
- Changed error checks from `==` to `errors.Is()` for better wrapped error handling
- Added `errors` import

* s3api: Avoid variable shadowing in SOSAPI capacity retrieval

Refactored `getCapacityInfo` to use distinct variable names for errors
to improve code clarity and avoid unintentional shadowing of the
return parameter.
2025-12-28 14:07:58 -08:00
Chris Lu
e8baeb3616 s3api: Allow anonymous access to SOSAPI virtual objects
Enable discovery of SOSAPI capabilities without requiring credentials.

- Modify AuthWithPublicRead to bypass auth for SOSAPI objects if bucket exists
- Supports Veeam's initial discovery phase before full IAM setup
- Validates bucket existence to prevent information disclosure
2025-12-28 12:57:01 -08:00