Fix SSE-S3 copy: preserve encryption metadata and set chunk SSE type (#7598)

* Fix SSE-S3 copy: preserve encryption metadata and set chunk SSE type

Fixes GitHub #7562: Copying objects between encrypted buckets was failing.

Root causes:
1. processMetadataBytes was re-adding SSE headers from source entry, undoing
   the encryption header filtering. Now uses dstEntry.Extended which is
   already filtered.

2. SSE-S3 streaming copy returned nil metadata. Now properly generates and
   returns SSE-S3 destination metadata (SeaweedFSSSES3Key, AES256 header)
   via ExecuteStreamingCopyWithMetadata.

3. Chunks created during streaming copy didn't have SseType set. Now sets
   SseType and per-chunk SseMetadata with chunk-specific IVs for SSE-S3,
   enabling proper decryption on GetObject.

* Address review: make SSE-S3 metadata serialization failures fatal errors

- In executeEncryptCopy: return error instead of just logging if
  SerializeSSES3Metadata fails
- In createChunkFromData: return error if chunk SSE-S3 metadata
  serialization fails

This ensures objects/chunks are never created without proper encryption
metadata, preventing unreadable/corrupted data.

* fmt

* Refactor: reuse function names instead of creating WithMetadata variants

- Change ExecuteStreamingCopy to return (*EncryptionSpec, error) directly
- Remove ExecuteStreamingCopyWithMetadata wrapper
- Change executeStreamingReencryptCopy to return (*EncryptionSpec, error)
- Remove executeStreamingReencryptCopyWithMetadata wrapper
- Update callers to ignore encryption spec with _ where not needed

* Add TODO documenting large file SSE-S3 copy limitation

The streaming copy approach encrypts the entire stream with a single IV
but stores data in chunks with per-chunk IVs. This causes decryption
issues for large files. Small inline files work correctly.

This is a known architectural issue that needs separate work to fix.

* Use chunk-by-chunk encryption for SSE-S3 copy (consistent with SSE-C/SSE-KMS)

Instead of streaming encryption (which had IV mismatch issues for multi-chunk
files), SSE-S3 now uses the same chunk-by-chunk approach as SSE-C and SSE-KMS:

1. Extended copyMultipartCrossEncryption to handle SSE-S3:
   - Added SSE-S3 source decryption in copyCrossEncryptionChunk
   - Added SSE-S3 destination encryption with per-chunk IVs
   - Added object-level metadata generation for SSE-S3 destinations

2. Updated routing in executeEncryptCopy/executeDecryptCopy/executeReencryptCopy
   to use copyMultipartCrossEncryption for all SSE-S3 scenarios

3. Removed streaming copy functions (shouldUseStreamingCopy,
   executeStreamingReencryptCopy) as they're no longer used

4. Added large file (1MB) integration test to verify chunk-by-chunk copy works

This ensures consistent behavior across all SSE types and fixes data corruption
that occurred with large files in the streaming copy approach.

* fmt

* fmt

* Address review: fail explicitly if SSE-S3 metadata is missing

Instead of silently ignoring missing SSE-S3 metadata (which could create
unreadable objects), now explicitly fail the copy operation with a clear
error message if:
- First chunk is missing
- First chunk doesn't have SSE-S3 type
- First chunk has empty SSE metadata
- Deserialization fails

* Address review: improve comment to reflect full scope of chunk creation

* Address review: fail explicitly if baseIV is empty for SSE-S3 chunk encryption

If DestinationIV is not set when encrypting SSE-S3 chunks, the chunk would
be created without SseMetadata, causing GetObject decryption to fail later.
Now fails explicitly with a clear error message.

Note: calculateIVWithOffset returns ([]byte, int) not ([]byte, error) - the
int is a skip amount for intra-block alignment, not an error code.

* Address review: handle 0-byte files in SSE-S3 copy

For 0-byte files, there are no chunks to get metadata from. Generate an IV
for the object-level metadata to ensure even empty files are properly marked
as SSE-S3 encrypted.

Also validate that we don't have a non-empty file with no chunks (which
would indicate an internal error).
This commit is contained in:
Chris Lu
2025-12-02 09:24:31 -08:00
committed by GitHub
parent 099a351f3b
commit 733ca8e6df
4 changed files with 244 additions and 115 deletions

View File

@@ -199,7 +199,9 @@ func (s3a *S3ApiServer) CopyObjectHandler(w http.ResponseWriter, r *http.Request
}
// Process metadata and tags and apply to destination
processedMetadata, tagErr := processMetadataBytes(r.Header, entry.Extended, replaceMeta, replaceTagging)
// Use dstEntry.Extended (already filtered) as the source, not entry.Extended,
// to preserve the encryption header filtering. Fixes GitHub #7562.
processedMetadata, tagErr := processMetadataBytes(r.Header, dstEntry.Extended, replaceMeta, replaceTagging)
if tagErr != nil {
s3err.WriteErrorResponse(w, r, s3err.ErrInvalidCopySource)
return
@@ -1522,7 +1524,7 @@ func (s3a *S3ApiServer) copyMultipartSSECChunk(chunk *filer_pb.FileChunk, copySo
}
// copyMultipartCrossEncryption handles all cross-encryption and decrypt-only copy scenarios
// This unified function supports: SSE-C↔SSE-KMS, SSE-C→Plain, SSE-KMS→Plain
// This unified function supports: SSE-C↔SSE-KMSSSE-S3, and any→Plain
func (s3a *S3ApiServer) copyMultipartCrossEncryption(entry *filer_pb.Entry, r *http.Request, state *EncryptionState, dstBucket, dstPath string) ([]*filer_pb.FileChunk, map[string][]byte, error) {
var dstChunks []*filer_pb.FileChunk
@@ -1531,6 +1533,7 @@ func (s3a *S3ApiServer) copyMultipartCrossEncryption(entry *filer_pb.Entry, r *h
var destKMSKeyID string
var destKMSEncryptionContext map[string]string
var destKMSBucketKeyEnabled bool
var destSSES3Key *SSES3Key
if state.DstSSEC {
var err error
@@ -1544,7 +1547,13 @@ func (s3a *S3ApiServer) copyMultipartCrossEncryption(entry *filer_pb.Entry, r *h
if err != nil {
return nil, nil, fmt.Errorf("failed to parse destination SSE-KMS headers: %w", err)
}
} else {
} else if state.DstSSES3 {
// Generate SSE-S3 key for destination
var err error
destSSES3Key, err = GenerateSSES3Key()
if err != nil {
return nil, nil, fmt.Errorf("failed to generate SSE-S3 key: %w", err)
}
}
// Parse source encryption parameters
@@ -1563,12 +1572,18 @@ func (s3a *S3ApiServer) copyMultipartCrossEncryption(entry *filer_pb.Entry, r *h
var err error
if chunk.GetSseType() == filer_pb.SSEType_SSE_C {
copiedChunk, err = s3a.copyCrossEncryptionChunk(chunk, sourceSSECKey, destSSECKey, destKMSKeyID, destKMSEncryptionContext, destKMSBucketKeyEnabled, dstPath, dstBucket, state)
copiedChunk, err = s3a.copyCrossEncryptionChunk(chunk, sourceSSECKey, destSSECKey, destKMSKeyID, destKMSEncryptionContext, destKMSBucketKeyEnabled, destSSES3Key, dstPath, dstBucket, state)
} else if chunk.GetSseType() == filer_pb.SSEType_SSE_KMS {
copiedChunk, err = s3a.copyCrossEncryptionChunk(chunk, nil, destSSECKey, destKMSKeyID, destKMSEncryptionContext, destKMSBucketKeyEnabled, dstPath, dstBucket, state)
copiedChunk, err = s3a.copyCrossEncryptionChunk(chunk, nil, destSSECKey, destKMSKeyID, destKMSEncryptionContext, destKMSBucketKeyEnabled, destSSES3Key, dstPath, dstBucket, state)
} else if chunk.GetSseType() == filer_pb.SSEType_SSE_S3 {
copiedChunk, err = s3a.copyCrossEncryptionChunk(chunk, nil, destSSECKey, destKMSKeyID, destKMSEncryptionContext, destKMSBucketKeyEnabled, destSSES3Key, dstPath, dstBucket, state)
} else {
// Unencrypted chunk, copy directly
copiedChunk, err = s3a.copySingleChunk(chunk, dstPath)
// Unencrypted chunk - may need encryption if destination requires it
if state.DstSSEC || state.DstSSEKMS || state.DstSSES3 {
copiedChunk, err = s3a.copyCrossEncryptionChunk(chunk, nil, destSSECKey, destKMSKeyID, destKMSEncryptionContext, destKMSBucketKeyEnabled, destSSES3Key, dstPath, dstBucket, state)
} else {
copiedChunk, err = s3a.copySingleChunk(chunk, dstPath)
}
}
if err != nil {
@@ -1619,6 +1634,40 @@ func (s3a *S3ApiServer) copyMultipartCrossEncryption(entry *filer_pb.Entry, r *h
} else {
glog.Errorf("Failed to serialize SSE-KMS metadata: %v", serErr)
}
} else if state.DstSSES3 && destSSES3Key != nil {
// For SSE-S3 destination, create object-level metadata
var sses3Metadata *SSES3Key
if len(dstChunks) == 0 {
// Handle 0-byte files - generate IV for metadata even though there's no content to encrypt
if entry.Attributes.FileSize != 0 {
return nil, nil, fmt.Errorf("internal error: no chunks created for non-empty SSE-S3 destination object")
}
// Generate IV for 0-byte object metadata
iv := make([]byte, s3_constants.AESBlockSize)
if _, err := io.ReadFull(rand.Reader, iv); err != nil {
return nil, nil, fmt.Errorf("generate IV for 0-byte object: %w", err)
}
destSSES3Key.IV = iv
sses3Metadata = destSSES3Key
} else {
// For non-empty objects, use the first chunk's metadata
if dstChunks[0].GetSseType() != filer_pb.SSEType_SSE_S3 || len(dstChunks[0].GetSseMetadata()) == 0 {
return nil, nil, fmt.Errorf("internal error: first chunk is missing expected SSE-S3 metadata for destination object")
}
keyManager := GetSSES3KeyManager()
var err error
sses3Metadata, err = DeserializeSSES3Metadata(dstChunks[0].GetSseMetadata(), keyManager)
if err != nil {
return nil, nil, fmt.Errorf("failed to deserialize SSE-S3 metadata from first chunk: %w", err)
}
}
// Use the derived key with its IV for object-level metadata
keyData, serErr := SerializeSSES3Metadata(sses3Metadata)
if serErr != nil {
return nil, nil, fmt.Errorf("failed to serialize SSE-S3 metadata: %w", serErr)
}
dstMetadata[s3_constants.SeaweedFSSSES3Key] = keyData
dstMetadata[s3_constants.AmzServerSideEncryption] = []byte("AES256")
}
// For unencrypted destination, no metadata needed (dstMetadata remains empty)
@@ -1626,7 +1675,7 @@ func (s3a *S3ApiServer) copyMultipartCrossEncryption(entry *filer_pb.Entry, r *h
}
// copyCrossEncryptionChunk handles copying a single chunk with cross-encryption support
func (s3a *S3ApiServer) copyCrossEncryptionChunk(chunk *filer_pb.FileChunk, sourceSSECKey *SSECustomerKey, destSSECKey *SSECustomerKey, destKMSKeyID string, destKMSEncryptionContext map[string]string, destKMSBucketKeyEnabled bool, dstPath, dstBucket string, state *EncryptionState) (*filer_pb.FileChunk, error) {
func (s3a *S3ApiServer) copyCrossEncryptionChunk(chunk *filer_pb.FileChunk, sourceSSECKey *SSECustomerKey, destSSECKey *SSECustomerKey, destKMSKeyID string, destKMSEncryptionContext map[string]string, destKMSBucketKeyEnabled bool, destSSES3Key *SSES3Key, dstPath, dstBucket string, state *EncryptionState) (*filer_pb.FileChunk, error) {
// Create destination chunk
dstChunk := s3a.createDestinationChunk(chunk, chunk.Offset, chunk.Size)
@@ -1726,6 +1775,30 @@ func (s3a *S3ApiServer) copyCrossEncryptionChunk(chunk *filer_pb.FileChunk, sour
previewLen = len(finalData)
}
} else if chunk.GetSseType() == filer_pb.SSEType_SSE_S3 {
// Decrypt SSE-S3 source
if len(chunk.GetSseMetadata()) == 0 {
return nil, fmt.Errorf("SSE-S3 chunk missing per-chunk metadata")
}
keyManager := GetSSES3KeyManager()
sourceSSEKey, err := DeserializeSSES3Metadata(chunk.GetSseMetadata(), keyManager)
if err != nil {
return nil, fmt.Errorf("failed to deserialize SSE-S3 metadata: %w", err)
}
decryptedReader, decErr := CreateSSES3DecryptedReader(bytes.NewReader(encryptedData), sourceSSEKey, sourceSSEKey.IV)
if decErr != nil {
return nil, fmt.Errorf("create SSE-S3 decrypted reader: %w", decErr)
}
decryptedData, readErr := io.ReadAll(decryptedReader)
if readErr != nil {
return nil, fmt.Errorf("decrypt SSE-S3 chunk data: %w", readErr)
}
finalData = decryptedData
glog.V(4).Infof("Decrypted SSE-S3 chunk, size: %d", len(finalData))
} else {
// Source is unencrypted
finalData = encryptedData
@@ -1787,6 +1860,36 @@ func (s3a *S3ApiServer) copyCrossEncryptionChunk(chunk *filer_pb.FileChunk, sour
dstChunk.SseMetadata = kmsMetadata
glog.V(4).Infof("Re-encrypted chunk with SSE-KMS")
} else if state.DstSSES3 && destSSES3Key != nil {
// Encrypt with SSE-S3
encryptedReader, iv, encErr := CreateSSES3EncryptedReader(bytes.NewReader(finalData), destSSES3Key)
if encErr != nil {
return nil, fmt.Errorf("create SSE-S3 encrypted reader: %w", encErr)
}
reencryptedData, readErr := io.ReadAll(encryptedReader)
if readErr != nil {
return nil, fmt.Errorf("re-encrypt with SSE-S3: %w", readErr)
}
finalData = reencryptedData
// Create per-chunk SSE-S3 metadata with chunk-specific IV
chunkSSEKey := &SSES3Key{
Key: destSSES3Key.Key,
KeyID: destSSES3Key.KeyID,
Algorithm: destSSES3Key.Algorithm,
IV: iv,
}
sses3Metadata, err := SerializeSSES3Metadata(chunkSSEKey)
if err != nil {
return nil, fmt.Errorf("serialize SSE-S3 metadata: %w", err)
}
dstChunk.SseType = filer_pb.SSEType_SSE_S3
dstChunk.SseMetadata = sses3Metadata
glog.V(4).Infof("Re-encrypted chunk with SSE-S3")
}
// For unencrypted destination, finalData remains as decrypted plaintext