* fix: decrypt SSE-encrypted objects in S3 replication sink
* fix: add SSE decryption support to GCS, Azure, B2, Local sinks
* fix: return error instead of warning for SSE-C objects during replication
* fix: close readers after upload to prevent resource leaks
* fix: return error for unknown SSE types instead of passing through ciphertext
* refactor(repl_util): extract CloseReader/CloseMaybeDecryptedReader helpers
The io.Closer close-on-error and defer-close pattern was duplicated in
copyWithDecryption and the S3 sink. Extract exported helpers to keep a
single implementation and prevent future divergence.
* fix(repl_util): warn on mixed SSE types across chunks in detectSSEType
detectSSEType previously returned the SSE type of the first encrypted
chunk without inspecting the rest. If an entry somehow has chunks with
different SSE types, only the first type's decryption would be applied.
Now scans all chunks and logs a warning on mismatch.
* fix(repl_util): decrypt inline SSE objects during replication
Small SSE-encrypted objects stored in entry.Content were being copied
as ciphertext because:
1. detectSSEType only checked chunk metadata, but inline objects have
no chunks — now falls back to checking entry.Extended for SSE keys
2. Non-S3 sinks short-circuited on len(entry.Content)>0, bypassing
the decryption path — now call MaybeDecryptContent before writing
Adds MaybeDecryptContent helper for decrypting inline byte content.
* fix(repl_util): add KMS initialization for replication SSE decryption
SSE-KMS decryption was not wired up for filer.backup — the only
initialization was for SSE-S3 key manager. CreateSSEKMSDecryptedReader
requires a global KMS provider which is only loaded by the S3 API
auth-config path.
Add InitializeSSEForReplication helper that initializes both SSE-S3
(from filer KEK) and SSE-KMS (from Viper config [kms] section /
WEED_KMS_* env vars). Replace the SSE-S3-only init in filer_backup.go.
* fix(replicator): initialize SSE decryption for filer.replicate
The SSE decryption setup was only added to filer_backup.go, but the
notification-based replicator (filer.replicate) uses the same sinks
and was missing the required initialization. Add SSE init in
NewReplicator so filer.replicate can decrypt SSE objects.
* refactor(repl_util): fold entry param into CopyFromChunkViews
Remove the CopyFromChunkViewsWithEntry wrapper and add the entry
parameter directly to CopyFromChunkViews, since all callers already
pass it.
* fix(repl_util): guard SSE init with sync.Once, error on mixed SSE types
InitializeWithFiler overwrites the global superKey on every call.
Wrap InitializeSSEForReplication with sync.Once so repeated calls
(e.g. from NewReplicator) are safe.
detectSSEType now returns an error instead of logging a warning when
chunks have inconsistent SSE types, so replication aborts rather than
silently applying the wrong decryption to some chunks.
* fix(repl_util): allow SSE init retry, detect conflicting metadata, add tests
- Replace sync.Once with mutex+bool so transient failures (e.g. filer
unreachable) don't permanently prevent initialization. Only successful
init flips the flag; failed attempts allow retries.
- Remove v.IsSet("kms") guard that prevented env-only KMS configs
(WEED_KMS_*) from being detected. Always attempt KMS loading and let
LoadConfigurations handle "no config found".
- detectSSEType now checks for conflicting extended metadata keys
(e.g. both SeaweedFSSSES3Key and SeaweedFSSSEKMSKey present) and
returns an error instead of silently picking the first match.
- Add table-driven tests for detectSSEType, MaybeDecryptReader, and
MaybeDecryptContent covering plaintext, uniform SSE, mixed chunks,
inline SSE via extended metadata, conflicting metadata, and SSE-C.
* test(repl_util): add SSE-S3 and SSE-KMS integration tests
Add round-trip encryption/decryption tests:
- SSE-S3: encrypt with CreateSSES3EncryptedReader, decrypt with
CreateSSES3DecryptedReader, verify plaintext matches
- SSE-KMS: encrypt with AES-CTR, wire a mock KMSProvider via
SetGlobalKMSProvider, build serialized KMS metadata, verify
MaybeDecryptReader and MaybeDecryptContent produce correct plaintext
Fix existing tests to check io.ReadAll errors.
* test(repl_util): exercise full SSE-S3 path through MaybeDecryptReader
Replace direct CreateSSES3DecryptedReader calls with end-to-end tests
that go through MaybeDecryptReader → decryptSSES3 →
DeserializeSSES3Metadata → GetSSES3IV → CreateSSES3DecryptedReader.
Uses WEED_S3_SSE_KEK env var + a mock filer client to initialize the
global key manager with a test KEK, then SerializeSSES3Metadata to
build proper envelope-encrypted metadata. Cleanup restores the key
manager state.
* fix(localsink): write to temp file to prevent truncated replicas
The local sink truncated the destination file before writing content.
If decryption or chunk copy failed, the file was left empty/truncated,
destroying the previous replica.
Write to a temp file in the same directory and atomically rename on
success. On any error the temp file is cleaned up and the existing
replica is untouched.
---------
Co-authored-by: Chris Lu <chris.lu@gmail.com>
* fix(gcssink): prevent empty object finalization on write failure
The GCS writer was created unconditionally with defer wc.Close(),
which finalizes the upload even when content decryption or copy
fails. This silently overwrites valid objects with empty data.
Remove the unconditional defer, explicitly close on success to
propagate errors, and delete the object on write failure.
* fix(gcssink): use context cancellation instead of obj.Delete on failure
obj.Delete() after a failed write would delete the existing object at
that key, causing data loss on updates. Use a cancelable context
instead — cancelling before Close() aborts the GCS upload without
touching any pre-existing object.
* fix(gcs): resolve credential conflict and improve backup logging
- Work around the GCS SDK's "multiple credential options" error by manually constructing an authenticated HTTP client.
- Include source entry path in filer backup error logs for better visibility on missing volumes/404s.
* fix: address PR review feedback
- Add nil check for EventNotification in getSourceKey
- Avoid reassigning google_application_credentials parameter in gcs_sink.go
* fix(gcs): return errors instead of calling glog.Fatalf in initialize
Adheres to Go best practices and allows for more graceful failure handling by callers.
* read from bind ip
* Replace removeDuplicateSlashes with NormalizeObjectKey
Use s3_constants.NormalizeObjectKey instead of removeDuplicateSlashes in most places
for consistency. NormalizeObjectKey handles both duplicate slash removal and ensures
the path starts with '/', providing more complete normalization.
* Fix double slash issues after NormalizeObjectKey
After using NormalizeObjectKey, object keys have a leading '/'. This commit ensures:
- getVersionedObjectDir strips leading slash before concatenation
- getEntry calls receive names without leading slash
- String concatenation with '/' doesn't create '//' paths
This prevents path construction errors like:
/buckets/bucket//object (wrong)
/buckets/bucket/object (correct)
* ensure object key leading "/"
* fix compilation
* fix: Strip leading slash from object keys in S3 API responses
After introducing NormalizeObjectKey, all internal object keys have a
leading slash. However, S3 API responses must return keys without
leading slashes to match AWS S3 behavior.
Fixed in three functions:
- addVersion: Strip slash for version list entries
- processRegularFile: Strip slash for regular file entries
- processExplicitDirectory: Strip slash for directory entries
This ensures ListObjectVersions and similar APIs return keys like
'bar' instead of '/bar', matching S3 API specifications.
* fix: Normalize keyMarker for consistent pagination comparison
The S3 API provides keyMarker without a leading slash (e.g., 'object-001'),
but after introducing NormalizeObjectKey, all internal object keys have
leading slashes (e.g., '/object-001').
When comparing keyMarker < normalizedObjectKey in shouldSkipObjectForMarker,
the ASCII value of '/' (47) is less than 'o' (111), causing all objects
to be incorrectly skipped during pagination. This resulted in page 2 and
beyond returning 0 results.
Fix: Normalize the keyMarker when creating versionCollector so comparisons
work correctly with normalized object keys.
Fixes pagination tests:
- TestVersioningPaginationOver1000Versions
- TestVersioningPaginationMultipleObjectsManyVersions
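
The ordering mismatch can be reproduced with plain string comparison. sortsAfterMarker is a hypothetical helper for illustration and is not the real shouldSkipObjectForMarker logic.

```go
package main

import "fmt"

// sortsAfterMarker reports whether key sorts strictly after marker in
// byte order, which is how Go compares strings.
func sortsAfterMarker(key, marker string) bool {
	return key > marker
}

func main() {
	marker := "object-001" // as supplied by the S3 client, no leading slash

	// With a leading slash, '/' (47) sorts before 'o' (111), so every key
	// appears to come before the marker.
	fmt.Println(sortsAfterMarker("/object-002", marker)) // false

	// With both sides in the same form the comparison behaves as expected.
	fmt.Println(sortsAfterMarker("object-002", marker)) // true
}
```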
* refactor: Change NormalizeObjectKey to return keys without leading slash
BREAKING STRATEGY CHANGE:
Previously, NormalizeObjectKey added a leading slash to all object keys,
which required stripping it when returning keys to S3 API clients and
caused complexity in marker normalization for pagination.
NEW STRATEGY:
- NormalizeObjectKey now returns keys WITHOUT leading slash (e.g., 'foo/bar' not '/foo/bar')
- This matches the S3 API format directly
- All path concatenations now explicitly add '/' between bucket and object
- No need to strip slashes in responses or normalize markers
Changes:
1. Modified NormalizeObjectKey to strip leading slash instead of adding it
2. Fixed all path concatenations to use:
- BucketsPath + '/' + bucket + '/' + object
instead of:
- BucketsPath + '/' + bucket + object
3. Reverted response key stripping in:
- addVersion()
- processRegularFile()
- processExplicitDirectory()
4. Reverted keyMarker normalization in findVersionsRecursively()
5. Updated matchesPrefixFilter() to work with keys without leading slash
6. Fixed paths in handlers:
- s3api_object_handlers.go (GetObject, HeadObject, cacheRemoteObjectForStreaming)
- s3api_object_handlers_postpolicy.go
- s3api_object_handlers_tagging.go
- s3api_object_handlers_acl.go
- s3api_version_id.go (getVersionedObjectDir, getVersionIdFormat)
- s3api_object_versioning.go (getObjectVersionList, updateLatestVersionAfterDeletion)
All versioning tests pass including pagination stress tests.
* adjust format
* Update post policy tests to match new NormalizeObjectKey behavior
- Update TestPostPolicyKeyNormalization to expect keys without leading slashes
- Update TestNormalizeObjectKey to expect keys without leading slashes
- Update TestPostPolicyFilenameSubstitution to expect keys without leading slashes
- Update path construction in tests to use new pattern: BucketsPath + '/' + bucket + '/' + object
* Fix ListObjectVersions prefix filtering
Stop adding a leading slash to the prefix parameter so that .versions
directories are filtered correctly when listing object versions with a
specific prefix.
The prefix parameter should match entry paths relative to bucket root.
Adding a leading slash was breaking the prefix filter for paginated requests.
Fixes pagination issue where second page returned 0 versions instead of
continuing with remaining versions.
* no leading slash
* Fix urlEscapeObject to add leading slash for filer paths
NormalizeObjectKey now returns keys without leading slashes to match S3 API format.
However, urlEscapeObject is used for filer paths which require leading slashes.
Add leading slash back after normalization to ensure filer paths are correct.
Fixes TestS3ApiServer_toFilerPath test failures.
* adjust tests
* normalize
* Fix: Normalize prefixes and markers in LIST operations using NormalizeObjectKey
Ensure consistent key normalization across all S3 operations (GET, PUT, LIST).
Previously, LIST operations were not applying the same normalization rules
(handling backslashes, duplicate slashes, leading slashes) as GET/PUT operations.
Changes:
- Updated normalizePrefixMarker() to call NormalizeObjectKey for both prefix and marker
- This ensures prefixes with leading slashes, backslashes, or duplicate slashes are
handled consistently with how object keys are normalized
- Fixes Parquet test failures where pads.write_dataset creates implicit directory
structures that couldn't be discovered by subsequent LIST operations
- Added TestPrefixNormalizationInList and TestListPrefixConsistency tests
All existing LIST tests continue to pass with the normalization improvements.
* Add debugging logging to LIST operations to track prefix normalization
* Fix: Remove leading slash addition from GetPrefix to work with NormalizeObjectKey
The NormalizeObjectKey function removes leading slashes to match S3 API format
(e.g., 'foo/bar' not '/foo/bar'). However, GetPrefix was adding a leading slash
back, which caused LIST operations to fail with incorrect path handling.
Now GetPrefix only normalizes duplicate slashes without adding a leading slash,
which allows NormalizeObjectKey changes to work correctly for S3 LIST operations.
All Parquet integration tests now pass (20/20).
* Fix: Handle object paths without leading slash in checkDirectoryObject
NormalizeObjectKey() removes the leading slash to match S3 API format.
However, checkDirectoryObject() was assuming the object path has a leading
slash when processing directory markers (paths ending with '/').
Now we ensure the object has a leading slash before processing it for
filer operations.
Fixes implicit directory marker test (explicit_dir/) while keeping
Parquet integration tests passing (20/20).
All tests pass:
- Implicit directory tests: 6/6
- Parquet integration tests: 20/20
* Fix: Handle explicit directory markers with trailing slashes
Explicit directory markers created with put_object(Key='dir/', ...) are stored
in the filer with the trailing slash as part of the name. The checkDirectoryObject()
function now checks for both:
1. Explicit directories: lookup with trailing slash preserved (e.g., 'explicit_dir/')
2. Implicit directories: lookup without trailing slash (e.g., 'implicit_dir')
This ensures both types of directory markers are properly recognized.
All tests pass:
- Implicit directory tests: 6/6 (including explicit directory marker test)
- Parquet integration tests: 20/20
* Fix: Preserve trailing slash in NormalizeObjectKey
NormalizeObjectKey now preserves trailing slashes when normalizing object keys.
This is important for explicit directory markers like 'explicit_dir/' which rely
on the trailing slash to be recognized as directory objects.
The normalization process:
1. Notes if trailing slash was present
2. Removes duplicate slashes and converts backslashes
3. Removes leading slash for S3 API format
4. Restores trailing slash if it was in the original
This ensures explicit directory markers created with put_object(Key='dir/', ...)
are properly normalized and can be looked up by their exact name.
All tests pass:
- Implicit directory tests: 6/6
- Parquet integration tests: 20/20
* clean object
* Fix: Don't restore trailing slash if result is empty
When normalizing paths that are only slashes (e.g., '///', '/'), the function
should return an empty string, not a single slash. The fix ensures we only
restore the trailing slash if the result is non-empty.
This fixes the 'just_slashes' test case:
- Input: '///'
- Expected: ''
- Previous: '/'
- Fixed: ''
All tests now pass:
- Unit tests: TestNormalizeObjectKey (13/13)
- Implicit directory tests: 6/6
- Parquet integration tests: 20/20
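
The normalization rules described in the last few commits (no leading slash, collapsed duplicate slashes, backslashes converted, trailing slash preserved only for a non-empty result) can be sketched as below. normalizeKeySketch is a hypothetical reimplementation for illustration, not the actual s3_constants.NormalizeObjectKey; in particular it assumes only a trailing '/' in the original input counts as a trailing slash.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeKeySketch mimics the described behavior of NormalizeObjectKey.
func normalizeKeySketch(key string) string {
	// 1. Note whether the original key ended with a slash.
	hadTrailingSlash := strings.HasSuffix(key, "/")

	// 2. Convert backslashes and drop empty segments, which removes
	//    duplicate slashes as well as the leading slash (step 3).
	key = strings.ReplaceAll(key, "\\", "/")
	var parts []string
	for _, p := range strings.Split(key, "/") {
		if p != "" {
			parts = append(parts, p)
		}
	}
	result := strings.Join(parts, "/")

	// 4. Restore the trailing slash, but never for an empty result.
	if hadTrailingSlash && result != "" {
		result += "/"
	}
	return result
}

func main() {
	for _, in := range []string{"/foo//bar", "explicit_dir/", "a\\b", "///", "/"} {
		fmt.Printf("%q -> %q\n", in, normalizeKeySketch(in))
	}
}
```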
* prefixEndsOnDelimiter
* Update s3api_object_handlers_list.go
* Update s3api_object_handlers_list.go
* handle create directory