s3: fix PutObject ETag format for multi-chunk uploads (#7771)
* s3: fix PutObject ETag format for multi-chunk uploads

  Fix issue #7768: AWS S3 SDK for Java fails with 'Invalid base 16 character: -' when performing PutObject on files that are internally auto-chunked.

  The issue was that SeaweedFS returned a composite ETag format (`<md5hash>-<count>`) for regular PutObject when the file was split into multiple chunks due to auto-chunking. However, per the AWS S3 spec, the composite ETag format should only be used for multipart uploads (the CreateMultipartUpload/UploadPart/CompleteMultipartUpload API). Regular PutObject should always return a pure MD5 hash as the ETag, regardless of how the file is stored internally.

  The fix ensures the MD5 hash is always stored in entry.Attributes.Md5 for regular PutObject operations, so filer.ETag() returns the pure MD5 hash instead of falling back to the ETagChunks() composite format.

* test: add comprehensive ETag format tests for issue #7768

  Add integration tests to ensure PutObject ETag format compatibility.

  Go tests (test/s3/etag/):
  - TestPutObjectETagFormat_SmallFile: 1KB single chunk
  - TestPutObjectETagFormat_LargeFile: 10MB auto-chunked (critical for #7768)
  - TestPutObjectETagFormat_ExtraLargeFile: 25MB multi-chunk
  - TestMultipartUploadETagFormat: verify composite ETag for multipart
  - TestPutObjectETagConsistency: ETag consistency across PUT/HEAD/GET
  - TestETagHexValidation: simulate AWS SDK v2 hex decoding
  - TestMultipleLargeFileUploads: stress test multiple large uploads

  Java tests (other/java/s3copier/):
  - Update pom.xml to include AWS SDK v2 (2.20.127)
  - Add ETagValidationTest.java with comprehensive SDK v2 tests
  - Add README.md documenting SDK versions and test coverage

  Documentation:
  - Add test/s3/SDK_COMPATIBILITY.md documenting validated SDK versions
  - Add test/s3/etag/README.md explaining test coverage

  These tests ensure that large-file PutObject (>8MB) returns pure MD5 ETags (not the composite format), which is required for AWS SDK v2 compatibility.
* fix: lower Java version requirement to 11 for CI compatibility

* address CodeRabbit review comments

  - s3_etag_test.go: handle rand.Read error, fix multipart part-count logging
  - Makefile: add 'all' target, pass S3_ENDPOINT to test commands
  - SDK_COMPATIBILITY.md: add language tag to fenced code block
  - ETagValidationTest.java: add pagination to cleanup logic
  - README.md: clarify that Go SDK tests are in a separate location

* ci: add s3copier ETag validation tests to Java integration tests

  - Enable S3 API (-s3 -s3.port=8333) in the SeaweedFS test server
  - Add S3 API readiness check to the wait loop
  - Add a step to run ETagValidationTest from s3copier

  This ensures the fix for issue #7768 is continuously tested against AWS SDK v2 for Java in CI.

* ci: add S3 config with credentials for s3copier tests

  - Add -s3.config pointing to docker/compose/s3.json
  - Add -s3.allowDeleteBucketNotEmpty for test cleanup
  - Set S3_ACCESS_KEY and S3_SECRET_KEY env vars for tests

* ci: pass S3 config as Maven system properties

  Pass S3_ENDPOINT, S3_ACCESS_KEY, S3_SECRET_KEY via -D flags so they're available via System.getProperty() in Java tests.
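The hex-decoding failure described above can be sketched as follows. This is an illustrative re-implementation of the kind of base-16 check the SDK performs, not the SDK's actual code; the `EtagHexCheck` class and `hexToBytes` method are invented for the example:

```java
public class EtagHexCheck {
    // Decode a hex string, rejecting non-hex characters the way a strict
    // checksum validator does (illustrative only, not the SDK's code).
    static byte[] hexToBytes(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < hex.length(); i += 2) {
            int hi = Character.digit(hex.charAt(i), 16);
            int lo = Character.digit(hex.charAt(i + 1), 16);
            if (hi < 0 || lo < 0) {
                char bad = hi < 0 ? hex.charAt(i) : hex.charAt(i + 1);
                throw new IllegalArgumentException("Invalid base 16 character: '" + bad + "'");
            }
            out[i / 2] = (byte) ((hi << 4) + lo);
        }
        return out;
    }

    public static void main(String[] args) {
        // A pure MD5 ETag (what PutObject must return) decodes cleanly.
        hexToBytes("d41d8cd98f00b204e9800998ecf8427e");
        // A composite ETag (multipart format) fails on the '-'.
        try {
            hexToBytes("d41d8cd98f00b204e9800998ecf8427e-2");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Invalid base 16 character: '-'
        }
    }
}
```

Under this model, the old SeaweedFS behavior (composite ETag on auto-chunked PutObject) trips the decoder on the `-`, which is exactly the exception reported in issue #7768.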
other/java/s3copier/README.md (new file, 110 lines)

@@ -0,0 +1,110 @@
# SeaweedFS S3 Java SDK Compatibility Tests

This project contains Java-based integration tests for SeaweedFS S3 API compatibility.

## Overview

Tests are provided for both AWS SDK v1 and v2 to ensure compatibility with the SDK versions commonly used in production.

## SDK Versions

| SDK | Version | Notes |
|-----|---------|-------|
| AWS SDK v1 for Java | 1.12.600 | Legacy SDK, less strict ETag validation |
| AWS SDK v2 for Java | 2.20.127 | Modern SDK with strict checksum validation |

## Running Tests

### Prerequisites

1. SeaweedFS running with S3 API enabled:

```bash
weed server -s3
```

2. Java 11+ and Maven

### Run All Tests

```bash
mvn test
```

### Run Specific Tests

```bash
# Run only ETag validation tests (AWS SDK v2)
mvn test -Dtest=ETagValidationTest

# Run with a custom endpoint
mvn test -Dtest=ETagValidationTest -DS3_ENDPOINT=http://localhost:8333
```

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `S3_ENDPOINT` | `http://127.0.0.1:8333` | S3 API endpoint URL |
| `S3_ACCESS_KEY` | `some_access_key1` | Access key ID |
| `S3_SECRET_KEY` | `some_secret_key1` | Secret access key |
| `S3_REGION` | `us-east-1` | AWS region |

## Test Coverage

### ETagValidationTest (AWS SDK v2)

Tests for [GitHub Issue #7768](https://github.com/seaweedfs/seaweedfs/issues/7768) - ETag format validation.

| Test | Description |
|------|-------------|
| `testSmallFilePutObject` | Verify small files return a pure MD5 ETag |
| `testLargeFilePutObject_Issue7768` | **Critical**: verify large files (>8MB) return a pure MD5 ETag |
| `testExtraLargeFilePutObject` | Verify very large files (>24MB) return a pure MD5 ETag |
| `testMultipartUploadETag` | Verify multipart uploads return a composite ETag |
| `testETagConsistency` | Verify ETag consistency across PUT/HEAD/GET |
| `testMultipleLargeFileUploads` | Stress test with multiple large uploads |

### Background: Issue #7768

AWS SDK v2 for Java includes checksum validation that decodes the ETag as hexadecimal. When SeaweedFS returned composite ETags (`<md5>-<count>`) for regular `PutObject` on internally auto-chunked files, the SDK failed with:

```
java.lang.IllegalArgumentException: Invalid base 16 character: '-'
```

**Per AWS S3 specification:**

- `PutObject`: ETag is always a pure MD5 hex string (32 chars)
- `CompleteMultipartUpload`: ETag is composite format (`<md5>-<partcount>`)

The fix ensures SeaweedFS follows this specification.
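The two formats can be told apart mechanically. A minimal sketch using the same regular expressions the test suite uses (the `EtagFormats` class name is invented for this example; S3 responses wrap the ETag in double quotes, so the patterns treat the quotes as optional):

```java
import java.util.regex.Pattern;

public class EtagFormats {
    // Same patterns as in ETagValidationTest; surrounding quotes are optional.
    static final Pattern PURE_MD5 = Pattern.compile("^\"?[a-f0-9]{32}\"?$");
    static final Pattern COMPOSITE = Pattern.compile("^\"?[a-f0-9]{32}-\\d+\"?$");

    public static void main(String[] args) {
        String putETag = "\"d41d8cd98f00b204e9800998ecf8427e\"";         // PutObject
        String multipartETag = "\"d41d8cd98f00b204e9800998ecf8427e-3\""; // CompleteMultipartUpload
        System.out.println(PURE_MD5.matcher(putETag).matches());         // true
        System.out.println(COMPOSITE.matcher(multipartETag).matches());  // true
        System.out.println(PURE_MD5.matcher(multipartETag).matches());   // false
    }
}
```

Before the fix, an auto-chunked PutObject response would have matched `COMPOSITE` instead of `PURE_MD5`, which is what the large-file tests below assert against.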
## Project Structure

```
src/
├── main/java/com/seaweedfs/s3/
│   ├── PutObject.java                 # Example PutObject with SDK v1
│   └── HighLevelMultipartUpload.java
└── test/java/com/seaweedfs/s3/
    ├── PutObjectTest.java             # Basic SDK v1 test
    └── ETagValidationTest.java        # Comprehensive SDK v2 ETag tests
```

## Validated SDK Versions

This Java test project validates:

- ✅ AWS SDK v2 for Java 2.20.127+
- ✅ AWS SDK v1 for Java 1.12.600+

Go SDK validation is performed by separate test suites:

- See [Go ETag Tests](/test/s3/etag/) for AWS SDK v2 for Go tests
- See [test/s3/SDK_COMPATIBILITY.md](/test/s3/SDK_COMPATIBILITY.md) for the full SDK compatibility matrix

## Related

- [GitHub Issue #7768](https://github.com/seaweedfs/seaweedfs/issues/7768)
- [AWS S3 ETag Documentation](https://docs.aws.amazon.com/AmazonS3/latest/API/API_Object.html)
- [Go ETag Tests](/test/s3/etag/)
- [SDK Compatibility Matrix](/test/s3/SDK_COMPATIBILITY.md)
@@ -6,18 +6,29 @@
   <packaging>jar</packaging>
   <version>1.0-SNAPSHOT</version>
   <properties>
-    <maven.compiler.source>18</maven.compiler.source>
-    <maven.compiler.target>18</maven.compiler.target>
+    <maven.compiler.source>11</maven.compiler.source>
+    <maven.compiler.target>11</maven.compiler.target>
+    <aws.sdk.v1.version>1.12.600</aws.sdk.v1.version>
+    <aws.sdk.v2.version>2.20.127</aws.sdk.v2.version>
   </properties>
   <name>copier</name>
   <url>http://maven.apache.org</url>

   <dependencyManagement>
     <dependencies>
+      <!-- AWS SDK v1 BOM -->
       <dependency>
         <groupId>com.amazonaws</groupId>
         <artifactId>aws-java-sdk-bom</artifactId>
-        <version>1.11.327</version>
+        <version>${aws.sdk.v1.version}</version>
         <type>pom</type>
         <scope>import</scope>
       </dependency>
+      <!-- AWS SDK v2 BOM -->
+      <dependency>
+        <groupId>software.amazon.awssdk</groupId>
+        <artifactId>bom</artifactId>
+        <version>${aws.sdk.v2.version}</version>
+        <type>pom</type>
+        <scope>import</scope>
+      </dependency>
@@ -25,15 +36,50 @@
   </dependencyManagement>

   <dependencies>
+    <!-- AWS SDK v1 (for backward compatibility with existing code) -->
     <dependency>
       <groupId>com.amazonaws</groupId>
       <artifactId>aws-java-sdk-s3</artifactId>
     </dependency>
+
+    <!-- AWS SDK v2 (for modern tests with checksum validation) -->
+    <dependency>
+      <groupId>software.amazon.awssdk</groupId>
+      <artifactId>s3</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>software.amazon.awssdk</groupId>
+      <artifactId>s3-transfer-manager</artifactId>
+    </dependency>
+
+    <!-- Test dependencies -->
     <dependency>
       <groupId>junit</groupId>
       <artifactId>junit</artifactId>
-      <version>4.13.1</version>
+      <version>4.13.2</version>
       <scope>test</scope>
     </dependency>
+    <dependency>
+      <groupId>org.junit.jupiter</groupId>
+      <artifactId>junit-jupiter</artifactId>
+      <version>5.10.0</version>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>org.assertj</groupId>
+      <artifactId>assertj-core</artifactId>
+      <version>3.24.2</version>
+      <scope>test</scope>
+    </dependency>
   </dependencies>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-surefire-plugin</artifactId>
+        <version>3.2.2</version>
+      </plugin>
+    </plugins>
+  </build>
 </project>
@@ -0,0 +1,439 @@
|
||||
package com.seaweedfs.s3;
|
||||
|
||||
import org.junit.jupiter.api.*;
|
||||
import org.junit.jupiter.api.condition.EnabledIfEnvironmentVariable;
|
||||
import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
|
||||
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
|
||||
import software.amazon.awssdk.core.sync.RequestBody;
|
||||
import software.amazon.awssdk.regions.Region;
|
||||
import software.amazon.awssdk.services.s3.S3Client;
|
||||
import software.amazon.awssdk.services.s3.model.*;
|
||||
|
||||
import java.net.URI;
|
||||
import java.security.MessageDigest;
|
||||
import java.security.SecureRandom;
|
||||
import java.util.ArrayList;
|
||||
import java.util.List;
|
||||
import java.util.UUID;
|
||||
import java.util.regex.Pattern;
|
||||
|
||||
import static org.assertj.core.api.Assertions.*;
|
||||
|
||||
/**
|
||||
* AWS SDK v2 Integration Tests for S3 ETag Format Validation.
|
||||
*
|
||||
* These tests verify that SeaweedFS returns correct ETag formats that are
|
||||
* compatible with AWS SDK v2's checksum validation.
|
||||
*
|
||||
* Background (GitHub Issue #7768):
|
||||
* AWS SDK v2 for Java validates ETags as hexadecimal MD5 hashes for PutObject
|
||||
* responses. If the ETag contains non-hex characters (like '-' in composite
|
||||
* format), the SDK fails with "Invalid base 16 character: '-'".
|
||||
*
|
||||
* Per AWS S3 specification:
|
||||
* - Regular PutObject: ETag is always a pure MD5 hex string (32 chars)
|
||||
* - CompleteMultipartUpload: ETag is composite format "<md5>-<partcount>"
|
||||
*
|
||||
* To run these tests:
|
||||
* mvn test -Dtest=ETagValidationTest -DS3_ENDPOINT=http://localhost:8333
|
||||
*
|
||||
* Or set environment variable:
|
||||
* export S3_ENDPOINT=http://localhost:8333
|
||||
* mvn test -Dtest=ETagValidationTest
|
||||
*/
|
||||
@TestInstance(TestInstance.Lifecycle.PER_CLASS)
|
||||
@DisplayName("S3 ETag Format Validation Tests (AWS SDK v2)")
|
||||
class ETagValidationTest {
|
||||
|
||||
// Configuration - can be overridden via system properties or environment variables
|
||||
private static final String DEFAULT_ENDPOINT = "http://127.0.0.1:8333";
|
||||
private static final String DEFAULT_ACCESS_KEY = "some_access_key1";
|
||||
private static final String DEFAULT_SECRET_KEY = "some_secret_key1";
|
||||
private static final String DEFAULT_REGION = "us-east-1";
|
||||
|
||||
// Auto-chunking threshold in SeaweedFS (must match s3api_object_handlers_put.go)
|
||||
private static final int AUTO_CHUNK_SIZE = 8 * 1024 * 1024; // 8MB
|
||||
|
||||
// Test sizes
|
||||
private static final int SMALL_FILE_SIZE = 1024; // 1KB
|
||||
private static final int LARGE_FILE_SIZE = 10 * 1024 * 1024; // 10MB (triggers auto-chunking)
|
||||
private static final int XL_FILE_SIZE = 25 * 1024 * 1024; // 25MB (multiple chunks)
|
||||
private static final int MULTIPART_PART_SIZE = 5 * 1024 * 1024; // 5MB per part
|
||||
|
||||
// ETag format patterns
|
||||
private static final Pattern PURE_MD5_PATTERN = Pattern.compile("^\"?[a-f0-9]{32}\"?$");
|
||||
private static final Pattern COMPOSITE_PATTERN = Pattern.compile("^\"?[a-f0-9]{32}-\\d+\"?$");
|
||||
|
||||
private S3Client s3Client;
|
||||
private String testBucketName;
|
||||
private final SecureRandom random = new SecureRandom();
|
||||
|
||||
@BeforeAll
|
||||
void setUp() {
|
||||
String endpoint = getConfig("S3_ENDPOINT", DEFAULT_ENDPOINT);
|
||||
String accessKey = getConfig("S3_ACCESS_KEY", DEFAULT_ACCESS_KEY);
|
||||
String secretKey = getConfig("S3_SECRET_KEY", DEFAULT_SECRET_KEY);
|
||||
String region = getConfig("S3_REGION", DEFAULT_REGION);
|
||||
|
||||
System.out.println("Connecting to S3 endpoint: " + endpoint);
|
||||
|
||||
s3Client = S3Client.builder()
|
||||
.endpointOverride(URI.create(endpoint))
|
||||
.credentialsProvider(StaticCredentialsProvider.create(
|
||||
AwsBasicCredentials.create(accessKey, secretKey)))
|
||||
.region(Region.of(region))
|
||||
.forcePathStyle(true) // Required for SeaweedFS
|
||||
.build();
|
||||
|
||||
// Create test bucket
|
||||
testBucketName = "test-etag-" + UUID.randomUUID().toString().substring(0, 8);
|
||||
s3Client.createBucket(CreateBucketRequest.builder()
|
||||
.bucket(testBucketName)
|
||||
.build());
|
||||
|
||||
System.out.println("Created test bucket: " + testBucketName);
|
||||
}
|
||||
|
||||
@AfterAll
|
||||
void tearDown() {
|
||||
if (s3Client != null && testBucketName != null) {
|
||||
try {
|
||||
// Delete all objects with pagination
|
||||
String continuationToken = null;
|
||||
do {
|
||||
ListObjectsV2Response listResp = s3Client.listObjectsV2(
|
||||
ListObjectsV2Request.builder()
|
||||
.bucket(testBucketName)
|
||||
.continuationToken(continuationToken)
|
||||
.build());
|
||||
for (S3Object obj : listResp.contents()) {
|
||||
s3Client.deleteObject(DeleteObjectRequest.builder()
|
||||
.bucket(testBucketName)
|
||||
.key(obj.key())
|
||||
.build());
|
||||
}
|
||||
continuationToken = listResp.nextContinuationToken();
|
||||
} while (continuationToken != null);
|
||||
|
||||
// Abort any multipart uploads
|
||||
ListMultipartUploadsResponse mpResp = s3Client.listMultipartUploads(
|
||||
ListMultipartUploadsRequest.builder().bucket(testBucketName).build());
|
||||
for (MultipartUpload upload : mpResp.uploads()) {
|
||||
s3Client.abortMultipartUpload(AbortMultipartUploadRequest.builder()
|
||||
.bucket(testBucketName)
|
||||
.key(upload.key())
|
||||
.uploadId(upload.uploadId())
|
||||
.build());
|
||||
}
|
||||
|
||||
// Delete bucket
|
||||
s3Client.deleteBucket(DeleteBucketRequest.builder()
|
||||
.bucket(testBucketName)
|
||||
.build());
|
||||
|
||||
System.out.println("Cleaned up test bucket: " + testBucketName);
|
||||
} catch (Exception e) {
|
||||
System.err.println("Warning: Failed to cleanup test bucket: " + e.getMessage());
|
||||
}
|
||||
s3Client.close();
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("Small file PutObject should return pure MD5 ETag")
|
||||
void testSmallFilePutObject() throws Exception {
|
||||
byte[] testData = generateRandomData(SMALL_FILE_SIZE);
|
||||
String expectedMD5 = calculateMD5Hex(testData);
|
||||
String objectKey = "small-file-" + UUID.randomUUID() + ".bin";
|
||||
|
||||
PutObjectResponse response = s3Client.putObject(
|
||||
PutObjectRequest.builder()
|
||||
.bucket(testBucketName)
|
||||
.key(objectKey)
|
||||
.build(),
|
||||
RequestBody.fromBytes(testData));
|
||||
|
||||
String etag = response.eTag();
|
||||
System.out.println("Small file ETag: " + etag + " (expected MD5: " + expectedMD5 + ")");
|
||||
|
||||
assertThat(etag)
|
||||
.describedAs("Small file ETag should be pure MD5")
|
||||
.matches(PURE_MD5_PATTERN);
|
||||
assertThat(cleanETag(etag))
|
||||
.describedAs("ETag should match calculated MD5")
|
||||
.isEqualTo(expectedMD5);
|
||||
assertThat(etag)
|
||||
.describedAs("ETag should not contain hyphen")
|
||||
.doesNotContain("-");
|
||||
}
|
||||
|
||||
/**
|
||||
* Critical test for GitHub Issue #7768.
|
||||
*
|
||||
* This test uploads a file larger than the auto-chunking threshold (8MB),
|
||||
* which triggers SeaweedFS to split the file into multiple internal chunks.
|
||||
*
|
||||
* Previously, this caused SeaweedFS to return a composite ETag like
|
||||
* "d41d8cd98f00b204e9800998ecf8427e-2", which AWS SDK v2 rejected because
|
||||
* it validates the ETag as hexadecimal and '-' is not a valid hex character.
|
||||
*
|
||||
* The fix ensures that regular PutObject always returns a pure MD5 ETag,
|
||||
* regardless of internal chunking.
|
||||
*/
|
||||
@Test
|
||||
@DisplayName("Large file PutObject (>8MB) should return pure MD5 ETag - Issue #7768")
|
||||
void testLargeFilePutObject_Issue7768() throws Exception {
|
||||
byte[] testData = generateRandomData(LARGE_FILE_SIZE);
|
||||
String expectedMD5 = calculateMD5Hex(testData);
|
||||
String objectKey = "large-file-" + UUID.randomUUID() + ".bin";
|
||||
|
||||
System.out.println("Uploading large file (" + LARGE_FILE_SIZE + " bytes, " +
|
||||
"> " + AUTO_CHUNK_SIZE + " byte auto-chunk threshold)...");
|
||||
|
||||
// This is where Issue #7768 would manifest - SDK v2 validates ETag
|
||||
PutObjectResponse response = s3Client.putObject(
|
||||
PutObjectRequest.builder()
|
||||
.bucket(testBucketName)
|
||||
.key(objectKey)
|
||||
.build(),
|
||||
RequestBody.fromBytes(testData));
|
||||
|
||||
String etag = response.eTag();
|
||||
int expectedChunks = (LARGE_FILE_SIZE / AUTO_CHUNK_SIZE) + 1;
|
||||
System.out.println("Large file ETag: " + etag +
|
||||
" (expected MD5: " + expectedMD5 + ", internal chunks: ~" + expectedChunks + ")");
|
||||
|
||||
// These assertions would fail before the fix
|
||||
assertThat(etag)
|
||||
.describedAs("Large file PutObject ETag MUST be pure MD5 (not composite)")
|
||||
.matches(PURE_MD5_PATTERN);
|
||||
assertThat(etag)
|
||||
.describedAs("Large file ETag should NOT be composite format")
|
||||
.doesNotMatch(COMPOSITE_PATTERN);
|
||||
assertThat(etag)
|
||||
.describedAs("ETag should not contain hyphen for regular PutObject")
|
||||
.doesNotContain("-");
|
||||
assertThat(cleanETag(etag))
|
||||
.describedAs("ETag should match calculated MD5")
|
||||
.isEqualTo(expectedMD5);
|
||||
|
||||
// Verify hex decoding works (this is what fails in Issue #7768)
|
||||
assertThatCode(() -> hexToBytes(cleanETag(etag)))
|
||||
.describedAs("ETag should be valid hexadecimal (AWS SDK v2 validation)")
|
||||
.doesNotThrowAnyException();
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("Extra large file PutObject (>24MB) should return pure MD5 ETag")
|
||||
void testExtraLargeFilePutObject() throws Exception {
|
||||
byte[] testData = generateRandomData(XL_FILE_SIZE);
|
||||
String expectedMD5 = calculateMD5Hex(testData);
|
||||
String objectKey = "xl-file-" + UUID.randomUUID() + ".bin";
|
||||
|
||||
int expectedChunks = (XL_FILE_SIZE / AUTO_CHUNK_SIZE) + 1;
|
||||
System.out.println("Uploading XL file (" + XL_FILE_SIZE + " bytes, ~" +
|
||||
expectedChunks + " internal chunks)...");
|
||||
|
||||
PutObjectResponse response = s3Client.putObject(
|
||||
PutObjectRequest.builder()
|
||||
.bucket(testBucketName)
|
||||
.key(objectKey)
|
||||
.build(),
|
||||
RequestBody.fromBytes(testData));
|
||||
|
||||
String etag = response.eTag();
|
||||
System.out.println("XL file ETag: " + etag);
|
||||
|
||||
assertThat(etag)
|
||||
.describedAs("XL file PutObject ETag MUST be pure MD5")
|
||||
.matches(PURE_MD5_PATTERN);
|
||||
assertThat(cleanETag(etag))
|
||||
.describedAs("ETag should match calculated MD5")
|
||||
.isEqualTo(expectedMD5);
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("Multipart upload should return composite ETag")
|
||||
void testMultipartUploadETag() throws Exception {
|
||||
int totalSize = 15 * 1024 * 1024; // 15MB = 3 parts
|
||||
byte[] testData = generateRandomData(totalSize);
|
||||
String objectKey = "multipart-file-" + UUID.randomUUID() + ".bin";
|
||||
|
||||
System.out.println("Performing multipart upload (" + totalSize + " bytes)...");
|
||||
|
||||
// Initiate multipart upload
|
||||
CreateMultipartUploadResponse createResp = s3Client.createMultipartUpload(
|
||||
CreateMultipartUploadRequest.builder()
|
||||
.bucket(testBucketName)
|
||||
.key(objectKey)
|
||||
.build());
|
||||
String uploadId = createResp.uploadId();
|
||||
|
||||
List<CompletedPart> completedParts = new ArrayList<>();
|
||||
int partNumber = 1;
|
||||
|
||||
// Upload parts
|
||||
for (int offset = 0; offset < totalSize; offset += MULTIPART_PART_SIZE) {
|
||||
int end = Math.min(offset + MULTIPART_PART_SIZE, totalSize);
|
||||
byte[] partData = new byte[end - offset];
|
||||
System.arraycopy(testData, offset, partData, 0, partData.length);
|
||||
|
||||
UploadPartResponse uploadResp = s3Client.uploadPart(
|
||||
UploadPartRequest.builder()
|
||||
.bucket(testBucketName)
|
||||
.key(objectKey)
|
||||
.uploadId(uploadId)
|
||||
.partNumber(partNumber)
|
||||
.build(),
|
||||
RequestBody.fromBytes(partData));
|
||||
|
||||
completedParts.add(CompletedPart.builder()
|
||||
.partNumber(partNumber)
|
||||
.eTag(uploadResp.eTag())
|
||||
.build());
|
||||
partNumber++;
|
||||
}
|
||||
|
||||
// Complete multipart upload
|
||||
CompleteMultipartUploadResponse completeResp = s3Client.completeMultipartUpload(
|
||||
CompleteMultipartUploadRequest.builder()
|
||||
.bucket(testBucketName)
|
||||
.key(objectKey)
|
||||
.uploadId(uploadId)
|
||||
.multipartUpload(CompletedMultipartUpload.builder()
|
||||
.parts(completedParts)
|
||||
.build())
|
||||
.build());
|
||||
|
||||
String etag = completeResp.eTag();
|
||||
System.out.println("Multipart upload ETag: " + etag + " (" + completedParts.size() + " parts)");
|
||||
|
||||
// Multipart uploads SHOULD have composite ETag
|
||||
assertThat(etag)
|
||||
.describedAs("Multipart upload ETag SHOULD be composite format")
|
||||
.matches(COMPOSITE_PATTERN);
|
||||
assertThat(etag)
|
||||
.describedAs("Multipart ETag should contain hyphen")
|
||||
.contains("-");
|
||||
|
||||
// Verify part count in ETag
|
||||
String[] parts = cleanETag(etag).split("-");
|
||||
assertThat(parts).hasSize(2);
|
||||
assertThat(parts[1])
|
||||
.describedAs("Part count in ETag should match uploaded parts")
|
||||
.isEqualTo(String.valueOf(completedParts.size()));
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("ETag should be consistent across PUT, HEAD, and GET")
|
||||
void testETagConsistency() throws Exception {
|
||||
byte[] testData = generateRandomData(LARGE_FILE_SIZE);
|
||||
String objectKey = "consistency-test-" + UUID.randomUUID() + ".bin";
|
||||
|
||||
// PUT
|
||||
PutObjectResponse putResp = s3Client.putObject(
|
||||
PutObjectRequest.builder()
|
||||
.bucket(testBucketName)
|
||||
.key(objectKey)
|
||||
.build(),
|
||||
RequestBody.fromBytes(testData));
|
||||
String putETag = putResp.eTag();
|
||||
|
||||
// HEAD
|
||||
HeadObjectResponse headResp = s3Client.headObject(
|
||||
HeadObjectRequest.builder()
|
||||
.bucket(testBucketName)
|
||||
.key(objectKey)
|
||||
.build());
|
||||
String headETag = headResp.eTag();
|
||||
|
||||
// GET
|
||||
GetObjectResponse getResp = s3Client.getObject(
|
||||
GetObjectRequest.builder()
|
||||
.bucket(testBucketName)
|
||||
.key(objectKey)
|
||||
.build())
|
||||
.response();
|
||||
String getETag = getResp.eTag();
|
||||
|
||||
System.out.println("PUT ETag: " + putETag + ", HEAD ETag: " + headETag + ", GET ETag: " + getETag);
|
||||
|
||||
assertThat(putETag).isEqualTo(headETag);
|
||||
assertThat(putETag).isEqualTo(getETag);
|
||||
}
|
||||
|
||||
@Test
|
||||
@DisplayName("Multiple large file uploads should all return pure MD5 ETags")
|
||||
void testMultipleLargeFileUploads() throws Exception {
|
||||
int numFiles = 3;
|
||||
|
||||
for (int i = 0; i < numFiles; i++) {
|
||||
byte[] testData = generateRandomData(LARGE_FILE_SIZE);
|
||||
String expectedMD5 = calculateMD5Hex(testData);
|
||||
String objectKey = "multi-large-" + i + "-" + UUID.randomUUID() + ".bin";
|
||||
|
||||
PutObjectResponse response = s3Client.putObject(
|
||||
PutObjectRequest.builder()
|
||||
.bucket(testBucketName)
|
||||
.key(objectKey)
|
||||
.build(),
|
||||
RequestBody.fromBytes(testData));
|
||||
|
||||
String etag = response.eTag();
|
||||
System.out.println("File " + i + " ETag: " + etag);
|
||||
|
||||
assertThat(etag)
|
||||
.describedAs("File " + i + " ETag should be pure MD5")
|
||||
.matches(PURE_MD5_PATTERN);
|
||||
assertThat(cleanETag(etag))
|
||||
.describedAs("File " + i + " ETag should match MD5")
|
||||
.isEqualTo(expectedMD5);
|
||||
|
||||
// Validate hex decoding
|
||||
assertThatCode(() -> hexToBytes(cleanETag(etag)))
|
||||
.doesNotThrowAnyException();
|
||||
}
|
||||
}
|
||||
|
||||
// Helper methods
|
||||
|
||||
private String getConfig(String key, String defaultValue) {
|
||||
String value = System.getProperty(key);
|
||||
if (value == null) {
|
||||
value = System.getenv(key);
|
||||
}
|
||||
return value != null ? value : defaultValue;
|
||||
}
|
||||
|
||||
private byte[] generateRandomData(int size) {
|
||||
byte[] data = new byte[size];
|
||||
random.nextBytes(data);
|
||||
return data;
|
||||
}
|
||||
|
||||
private String calculateMD5Hex(byte[] data) throws Exception {
|
||||
MessageDigest md = MessageDigest.getInstance("MD5");
|
||||
byte[] digest = md.digest(data);
|
||||
StringBuilder sb = new StringBuilder();
|
||||
for (byte b : digest) {
|
||||
sb.append(String.format("%02x", b));
|
||||
}
|
||||
return sb.toString();
|
||||
}
|
||||
|
||||
private String cleanETag(String etag) {
|
||||
if (etag == null) return null;
|
||||
return etag.replace("\"", "");
|
||||
}
|
||||
|
||||
private byte[] hexToBytes(String hex) {
|
||||
int len = hex.length();
|
||||
byte[] data = new byte[len / 2];
|
||||
for (int i = 0; i < len; i += 2) {
|
||||
data[i / 2] = (byte) ((Character.digit(hex.charAt(i), 16) << 4)
|
||||
+ Character.digit(hex.charAt(i + 1), 16));
|
||||
}
|
||||
return data;
|
||||
}
|
||||
}