593 Commits

Author SHA1 Message Date
Chris Lu
0d8588e3ae S3: Implement IAM defaults and STS signing key fallback (#8348)
* S3: Implement IAM defaults and STS signing key fallback logic

* S3: Refactor startup order to init SSE-S3 key manager before IAM

* S3: Derive STS signing key from KEK using HKDF for security isolation

* S3: Document STS signing key fallback in security.toml

* fix(s3api): refine anonymous access logic and secure-by-default behavior

- Initialize anonymous identity by default in `NewIdentityAccessManagement` to prevent nil pointer exceptions.
- Ensure `ReplaceS3ApiConfiguration` preserves the anonymous identity if not present in the new configuration.
- Update `NewIdentityAccessManagement` signature to accept `filerClient`.
- In legacy mode (no policy engine), anonymous defaults to Deny (no actions), preserving secure-by-default behavior.
- Use specific `LookupAnonymous` method instead of generic map lookup.
- Update tests to accommodate signature changes and verify improved anonymous handling.

* feat(s3api): make IAM configuration optional

- Start S3 API server without a configuration file if `EnableIam` option is set.
- Default to `Allow` effect for policy engine when no configuration is provided (Zero-Config mode).
- Handle empty configuration path gracefully in `loadIAMManagerFromConfig`.
- Add integration test `iam_optional_test.go` to verify empty config behavior.

* fix(iamapi): fix signature mismatch in NewIdentityAccessManagementWithStore

* fix(iamapi): properly initialize FilerClient instead of passing nil

* fix(iamapi): properly initialize filer client for IAM management

- Instead of passing `nil`, construct a `wdclient.FilerClient` using the provided `Filers` addresses.
- Ensure `NewIdentityAccessManagementWithStore` receives a valid `filerClient` to avoid potential nil pointer dereferences or limited functionality.

* clean: remove dead code in s3api_server.go

* refactor(s3api): improve IAM initialization, safety and anonymous access security

* fix(s3api): ensure IAM config loads from filer after client init

* fix(s3): resolve test failures in integration, CORS, and tagging tests

- Fix CORS tests by providing explicit anonymous permissions config
- Fix S3 integration tests by setting admin credentials in init
- Align tagging test credentials in CI with IAM defaults
- Added goroutine to retry IAM config load in iamapi server

* fix(s3): allow anonymous access to health targets and S3 Tables when identities are present

* fix(ci): use /healthz for Caddy health check in awscli tests

* iam, s3api: expose DefaultAllow from IAM and Policy Engine

This allows checking the global "Open by Default" configuration from
other components like S3 Tables.

* s3api/s3tables: support DefaultAllow in permission logic and handler

Updated CheckPermissionWithContext to respect the DefaultAllow flag
in PolicyContext. This enables "Open by Default" behavior for
unauthenticated access in zero-config environments. Added a targeted
unit test to verify the logic.

* s3api/s3tables: propagate DefaultAllow through handlers

Propagated the DefaultAllow flag to individual handlers for
namespaces, buckets, tables, policies, and tagging. This ensures
consistent "Open by Default" behavior across all S3 Tables API
endpoints.

* s3api: wire up DefaultAllow for S3 Tables API initialization

Updated registerS3TablesRoutes to query the global IAM configuration
and set the DefaultAllow flag on the S3 Tables API server. This
completes the end-to-end propagation required for anonymous access in
zero-config environments. Added a SetDefaultAllow method to
S3TablesApiServer to facilitate this.

* s3api: fix tests by adding DefaultAllow to mock IAM integrations

The IAMIntegration interface was updated to include DefaultAllow(),
breaking several mock implementations in tests. This commit fixes
the build errors by adding the missing method to the mocks.

* env

* ensure ports

* env

* env

* fix default allow

* add one more test using non-anonymous user

* debug

* add more debug

* less logs
2026-02-16 13:59:13 -08:00
Chris Lu
f1bf60d288 faster 2026-02-13 14:09:30 -08:00
Chris Lu
beeb375a88 Add volume server integration test suite and CI workflow (#8322)
* docs(volume_server): add integration test development plan

* test(volume_server): add integration harness and profile matrix

* test(volume_server/http): add admin and options integration coverage

* test(volume_server/grpc): add state and status integration coverage

* test(volume_server): auto-build weed binary and harden cluster startup

* test(volume_server/http): add upload read range head delete coverage

* test(volume_server/grpc): expand admin lifecycle and state coverage

* docs(volume_server): update progress tracker for implemented tests

* test(volume_server/http): cover if-none-match and invalid-range branches

* test(volume_server/grpc): add batch delete integration coverage

* docs(volume_server): log latest HTTP and gRPC test coverage

* ci(volume_server): run volume server integration tests in github actions

* test(volume_server/grpc): add needle status configure ping and leave coverage

* docs(volume_server): record additional grpc coverage progress

* test(volume_server/grpc): add vacuum integration coverage

* docs(volume_server): record vacuum test coverage progress

* test(volume_server/grpc): add read and write needle blob error-path coverage

* docs(volume_server): record data rw grpc coverage progress

* test(volume_server/http): add jwt auth integration coverage

* test(volume_server/grpc): add sync copy and stream error-path coverage

* docs(volume_server): record jwt and sync/copy test coverage

* test(volume_server/grpc): add scrub and query integration coverage

* test(volume_server/grpc): add volume tail sender and receiver coverage

* docs(volume_server): record scrub query and tail test progress

* test(volume_server/grpc): add readonly writable and collection lifecycle coverage

* test(volume_server/http): add public-port cors and method parity coverage

* test(volume_server/grpc): add blob meta and read-all success path coverage

* test(volume_server/grpc): expand scrub and query variation coverage

* test(volume_server/grpc): add tiering and remote fetch error-path coverage

* test(volume_server/http): add unchanged write and delete edge-case coverage

* test(volume_server/grpc): add ping unknown and unreachable target coverage

* test(volume_server/grpc): add volume delete only-empty variation coverage

* test(volume_server/http): add jwt fid-mismatch auth coverage

* test(volume_server/grpc): add scrub ec auto-select empty coverage

* test(volume_server/grpc): stabilize ping timestamp assertion

* docs(volume_server): update integration coverage progress log

* test(volume_server/grpc): add tier remote backend and config variation coverage

* docs(volume_server): record tier remote variation progress

* test(volume_server/grpc): add incremental copy and receive-file protocol coverage

* test(volume_server/http): add read path shape and if-modified-since coverage

* test(volume_server/grpc): add copy-file compaction and receive-file success coverage

* test(volume_server/http): add passthrough headers and static asset coverage

* test(volume_server/grpc): add ping filer unreachable coverage

* docs(volume_server): record copy receive and http variant progress

* test(volume_server/grpc): add erasure coding maintenance and missing-path coverage

* docs(volume_server): record initial erasure coding rpc coverage

* test(volume_server/http): add multi-range multipart response coverage

* docs(volume_server): record multi-range http coverage progress

* test(volume_server/grpc): add query empty-stripe no-match coverage

* docs(volume_server): record query no-match stream behavior coverage

* test(volume_server/http): add upload throttling timeout and replicate bypass coverage

* docs(volume_server): record upload throttling coverage progress

* test(volume_server/http): add download throttling timeout coverage

* docs(volume_server): record download throttling coverage progress

* test(volume_server/http): add jwt wrong-cookie fid mismatch coverage

* docs(volume_server): record jwt wrong-cookie mismatch coverage

* test(volume_server/http): add jwt expired-token rejection coverage

* docs(volume_server): record jwt expired-token coverage

* test(volume_server/http): add jwt query and cookie transport coverage

* docs(volume_server): record jwt token transport coverage

* test(volume_server/http): add jwt token-source precedence coverage

* docs(volume_server): record jwt token-source precedence coverage

* test(volume_server/http): add jwt header-over-cookie precedence coverage

* docs(volume_server): record jwt header cookie precedence coverage

* test(volume_server/http): add jwt query-over-cookie precedence coverage

* docs(volume_server): record jwt query cookie precedence coverage

* test(volume_server/grpc): add setstate version mismatch and nil-state coverage

* docs(volume_server): record setstate validation coverage

* test(volume_server/grpc): add readonly persist-true lifecycle coverage

* docs(volume_server): record readonly persist variation coverage

* test(volume_server/http): add options origin cors header coverage

* docs(volume_server): record options origin cors coverage

* test(volume_server/http): add trace unsupported-method parity coverage

* docs(volume_server): record trace method parity coverage

* test(volume_server/grpc): add batch delete cookie-check variation coverage

* docs(volume_server): record batch delete cookie-check coverage

* test(volume_server/grpc): add admin lifecycle missing and maintenance variants

* docs(volume_server): record admin lifecycle edge-case coverage

* test(volume_server/grpc): add mixed batch delete status matrix coverage

* docs(volume_server): record mixed batch delete matrix coverage

* test(volume_server/http): add jwt-profile ui access gating coverage

* docs(volume_server): record jwt ui-gating http coverage

* test(volume_server/http): add propfind unsupported-method parity coverage

* docs(volume_server): record propfind method parity coverage

* test(volume_server/grpc): add volume configure success and rollback-path coverage

* docs(volume_server): record volume configure branch coverage

* test(volume_server/grpc): add volume needle status missing-path coverage

* docs(volume_server): record volume needle status error-path coverage

* test(volume_server/http): add readDeleted query behavior coverage

* docs(volume_server): record readDeleted http behavior coverage

* test(volume_server/http): add delete ts override parity coverage

* docs(volume_server): record delete ts parity coverage

* test(volume_server/grpc): add invalid blob/meta offset coverage

* docs(volume_server): record invalid blob/meta offset coverage

* test(volume_server/grpc): add read-all mixed volume abort coverage

* docs(volume_server): record read-all mixed-volume abort coverage

* test(volume_server/http): assert head response body parity

* docs(volume_server): record head body parity assertion

* test(volume_server/grpc): assert status state and memory payload completeness

* docs(volume_server): record volume server status payload coverage

* test(volume_server/grpc): add batch delete chunk-manifest rejection coverage

* docs(volume_server): record batch delete chunk-manifest coverage

* test(volume_server/grpc): add query cookie-mismatch eof parity coverage

* docs(volume_server): record query cookie-mismatch parity coverage

* test(volume_server/grpc): add ping master success target coverage

* docs(volume_server): record ping master success coverage

* test(volume_server/http): add head if-none-match conditional parity

* docs(volume_server): record head if-none-match parity coverage

* test(volume_server/http): add head if-modified-since parity coverage

* docs(volume_server): record head if-modified-since parity coverage

* test(volume_server/http): add connect unsupported-method parity coverage

* docs(volume_server): record connect method parity coverage

* test(volume_server/http): assert options allow-headers cors parity

* docs(volume_server): record options allow-headers coverage

* test(volume_server/framework): add dual volume cluster integration harness

* test(volume_server/http): add missing-local read mode proxy redirect local coverage

* docs(volume_server): record read mode missing-local matrix coverage

* test(volume_server/http): add download over-limit replica proxy fallback coverage

* docs(volume_server): record download replica fallback coverage

* test(volume_server/http): add missing-local readDeleted proxy redirect parity coverage

* docs(volume_server): record missing-local readDeleted mode coverage

* test(volume_server/framework): add single-volume cluster with filer harness

* test(volume_server/grpc): add ping filer success target coverage

* docs(volume_server): record ping filer success coverage

* test(volume_server/http): add proxied-loop guard download timeout coverage

* docs(volume_server): record proxied-loop download coverage

* test(volume_server/http): add disabled upload and download limit coverage

* docs(volume_server): record disabled throttling path coverage

* test(volume_server/grpc): add idempotent volume server leave coverage

* docs(volume_server): record leave idempotence coverage

* test(volume_server/http): add redirect collection query preservation coverage

* docs(volume_server): record redirect collection query coverage

* test(volume_server/http): assert admin server headers on status and health

* docs(volume_server): record admin server header coverage

* test(volume_server/http): assert healthz request-id echo parity

* docs(volume_server): record healthz request-id parity coverage

* test(volume_server/http): add over-limit invalid-vid download branch coverage

* docs(volume_server): record over-limit invalid-vid branch coverage

* test(volume_server/http): add public-port static asset coverage

* docs(volume_server): record public static endpoint coverage

* test(volume_server/http): add public head method parity coverage

* docs(volume_server): record public head parity coverage

* test(volume_server/http): add throttling wait-then-proceed path coverage

* docs(volume_server): record throttling wait-then-proceed coverage

* test(volume_server/http): add read cookie-mismatch not-found coverage

* docs(volume_server): record read cookie-mismatch coverage

* test(volume_server/http): add throttling timeout-recovery coverage

* docs(volume_server): record throttling timeout-recovery coverage

* test(volume_server/grpc): add ec generate mount info unmount lifecycle coverage

* docs(volume_server): record ec positive lifecycle coverage

* test(volume_server/grpc): add ec shard read and blob delete lifecycle coverage

* docs(volume_server): record ec shard read/blob delete lifecycle coverage

* test(volume_server/grpc): add ec rebuild and to-volume error branch coverage

* docs(volume_server): record ec rebuild and to-volume branch coverage

* test(volume_server/grpc): add ec shards-to-volume success roundtrip coverage

* docs(volume_server): record ec shards-to-volume success coverage

* test(volume_server/grpc): add ec receive and copy-file missing-source coverage

* docs(volume_server): record ec receive and copy-file coverage

* test(volume_server/grpc): add ec last-shard delete cleanup coverage

* docs(volume_server): record ec last-shard delete cleanup coverage

* test(volume_server/grpc): add volume copy success path coverage

* docs(volume_server): record volume copy success coverage

* test(volume_server/grpc): add volume copy overwrite-destination coverage

* docs(volume_server): record volume copy overwrite coverage

* test(volume_server/http): add write error-path variant coverage

* docs(volume_server): record http write error-path coverage

* test(volume_server/http): add conditional header precedence coverage

* docs(volume_server): record conditional header precedence coverage

* test(volume_server/http): add oversized combined range guard coverage

* docs(volume_server): record oversized range guard coverage

* test(volume_server/http): add image resize and crop read coverage

* docs(volume_server): record image transform coverage

* test(volume_server/http): add chunk-manifest expansion and bypass coverage

* docs(volume_server): record chunk-manifest read coverage

* test(volume_server/http): add compressed read encoding matrix coverage

* docs(volume_server): record compressed read matrix coverage

* test(volume_server/grpc): add tail receiver source replication coverage

* docs(volume_server): record tail receiver replication coverage

* test(volume_server/grpc): add tail sender large-needle chunking coverage

* docs(volume_server): record tail sender chunking coverage

* test(volume_server/grpc): add ec-backed volume needle status coverage

* docs(volume_server): record ec-backed needle status coverage

* test(volume_server/grpc): add ec shard copy from peer success coverage

* docs(volume_server): record ec shard copy success coverage

* test(volume_server/http): add chunk-manifest delete child cleanup coverage

* docs(volume_server): record chunk-manifest delete cleanup coverage

* test(volume_server/http): add chunk-manifest delete failure-path coverage

* docs(volume_server): record chunk-manifest delete failure coverage

* test(volume_server/grpc): add ec shard copy source-unavailable coverage

* docs(volume_server): record ec shard copy source-unavailable coverage

* parallel
2026-02-13 00:40:56 -08:00
Chris Lu
796f23f68a Fix STS InvalidAccessKeyId and request body consumption issues (#8328)
* Fix STS InvalidAccessKeyId and request body consumption in Lakekeeper integration test

* Remove debug prints

* Add Lakekeeper integration tests to CI

* Fix connection refused in CI by binding to 0.0.0.0

* Add timeout to docker run in Lakekeeper integration test

* Update weed/s3api/auth_credentials.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-12 17:37:07 -08:00
Chris Lu
c1a9263e37 Fix STS AssumeRole with POST body param (#8320)
* Fix STS AssumeRole with POST body param and add integration test

* Add STS integration test to CI workflow

* Address code review feedback: fix HPP vulnerability and style issues

* Refactor: address code review feedback

- Fix HTTP Parameter Pollution vulnerability in UnifiedPostHandler
- Refactor permission check logic for better readability
- Extract test helpers to testutil/docker.go to reduce duplication
- Clean up imports and simplify context setting

* Add SigV4-style test variant for AssumeRole POST body routing

- Added ActionInBodyWithSigV4Style test case to validate real-world scenario
- Test confirms routing works correctly for AWS SigV4-signed requests
- Addresses code review feedback about testing with SigV4 signatures

* Fix: always set identity in context when non-nil

- Ensure UnifiedPostHandler always calls SetIdentityInContext when identity is non-nil
- Only call SetIdentityNameInContext when identity.Name is non-empty
- This ensures downstream handlers (embeddedIam.DoActions) always have access to identity
- Addresses potential issue where empty identity.Name would skip context setting
2026-02-12 12:04:07 -08:00
Chris Lu
b8ef48c8f1 Add RisingWave catalog tests (#8308)
* Add RisingWave catalog tests for S3 tables

* Add RisingWave catalog integration tests to CI workflow

* Refactor RisingWave catalog tests based on PR feedback

* Address PR feedback: optimize checks, cleanup logs

* fix tests

* consistent
2026-02-11 22:00:06 -08:00
Chris Lu
ac242d04ee one time manual run 2026-02-11 13:20:52 -08:00
Chris Lu
21543134c8 fix manual build process 2026-02-11 13:20:41 -08:00
Chris Lu
2d97685390 ci: fix container_release_unified manual dispatch and workflow parsing (#8293)
ci: fix unified release workflow dispatch matrix filtering
2026-02-10 13:15:52 -08:00
Chris Lu
b73bd08470 ci: move manual container builds to unified release workflow (#8290)
* ci: move manual dev container build into unified release workflow

* ci: make unified manual container build release-tag based
2026-02-10 12:47:39 -08:00
Chris Lu
b261c89675 Fix RocksDB container build compatibility and add manual rocksdb dispatch (#8288)
* Fix RocksDB build compatibility and add manual rocksdb trigger

* Upgrade RocksDB defaults and keep grocksdb v1.10.7

* Add manual latest-image trigger inputs for ref and variant

* Allow manual latest build to set image tag and source ref

* Fix manual variant selection using setup job matrix output
2026-02-10 11:48:42 -08:00
Chris Lu
c9428c2c0f Modify AI review comments checklist in PR template
Updated AI code review checklist to include potential additional reviews.
2026-02-09 08:58:35 -08:00
Chris Lu
458c12fb99 test: add Spark S3 integration regression for issue #8234 (#8249)
* test: add Spark S3 regression integration test for issue 8234

* Update test/s3/spark/setup_test.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-02-08 21:13:31 -08:00
Chris Lu
403592bb9f Add Spark Iceberg catalog integration tests and CI support (#8242)
* Add Spark Iceberg catalog integration tests and CI support

Implement comprehensive integration tests for Spark with SeaweedFS Iceberg REST catalog:
- Basic CRUD operations (Create, Read, Update, Delete) on Iceberg tables
- Namespace (database) management
- Data insertion, querying, and deletion
- Time travel capabilities via snapshot versioning
- Compatible with SeaweedFS S3 and Iceberg REST endpoints

Tests mirror the structure of existing Trino integration tests but use Spark's
Python SQL API and PySpark for testing.

Add GitHub Actions CI job for spark-iceberg-catalog-tests in s3-tables-tests.yml
to automatically run Spark integration tests on pull requests.

* fmt

* Fix Spark integration tests - code review feedback

* go mod tidy

* Add go mod tidy step to integration test jobs

Add 'go mod tidy' step before test runs for all integration test jobs:
- s3-tables-tests
- iceberg-catalog-tests
- trino-iceberg-catalog-tests
- spark-iceberg-catalog-tests

This ensures dependencies are clean before running tests.

* Fix remaining Spark operations test issues

Address final code review comments:

Setup & Initialization:
- Add waitForSparkReady() helper function that polls Spark readiness
  with backoff instead of hardcoded 10-second sleep
- Extract setupSparkTestEnv() helper to reduce boilerplate duplication
  between TestSparkCatalogBasicOperations and TestSparkTimeTravel
- Both tests now use helpers for consistent, reliable setup

Assertions & Validation:
- Make setup-critical operations (namespace, table creation, initial
  insert) use t.Fatalf instead of t.Errorf to fail fast
- Validate setupSQL output in TestSparkTimeTravel and fail if not
  'Setup complete'
- Add validation after second INSERT in TestSparkTimeTravel:
  verify row count increased to 2 before time travel test
- Add context to error messages with namespace and tableName params

Code Quality:
- Remove code duplication between test functions
- All critical paths now properly validated
- Consistent error handling throughout

* Fix go vet errors in S3 Tables tests

Fixes:
1. setup_test.go (Spark):
   - Add missing import: github.com/testcontainers/testcontainers-go/wait
   - Use wait.ForLog instead of undefined testcontainers.NewLogStrategy
   - Remove unused strings import

2. trino_catalog_test.go:
   - Use net.JoinHostPort instead of fmt.Sprintf for address formatting
   - Properly handles IPv6 addresses by wrapping them in brackets

* Use weed mini for simpler SeaweedFS startup

Replace complex multi-process startup (master, volume, filer, s3)
with single 'weed mini' command that starts all services together.

Benefits:
- Simpler, more reliable startup
- Single weed mini process vs 4 separate processes
- Automatic coordination between components
- Better port management with no manual coordination

Changes:
- Remove separate master, volume, filer process startup
- Use weed mini with -master.port, -filer.port, -s3.port flags
- Keep Iceberg REST as separate service (still needed)
- Increase timeout to 15s for port readiness (weed mini startup)
- Remove volumePort and filerProcess fields from TestEnvironment
- Simplify cleanup to only handle two processes (mini, iceberg rest)

* Clean up dead code and temp directory leaks

Fixes:

1. Remove dead s3Process field and cleanup:
   - weed mini bundles S3 gateway, no separate process needed
   - Removed s3Process field from TestEnvironment
   - Removed unnecessary s3Process cleanup code

2. Fix temp config directory leak:
   - Add sparkConfigDir field to TestEnvironment
   - Store returned configDir in writeSparkConfig
   - Clean up sparkConfigDir in Cleanup() with os.RemoveAll
   - Prevents accumulation of temp directories in test runs

3. Simplify Cleanup:
   - Now handles only necessary processes (weed mini, iceberg rest)
   - Removes both seaweedfsDataDir and sparkConfigDir
   - Cleaner shutdown sequence

* Use weed mini's built-in Iceberg REST and fix python binary

Changes:
- Add -s3.port.iceberg flag to weed mini for built-in Iceberg REST Catalog
- Remove separate 'weed server' process for Iceberg REST
- Remove icebergRestProcess field from TestEnvironment
- Simplify Cleanup() to only manage weed mini + Spark
- Add port readiness check for iceberg REST from weed mini
- Set Spark container Cmd to '/bin/sh -c sleep 3600' to keep it running
- Change python to python3 in container.Exec calls

This simplifies to truly one all-in-one weed mini process (master, filer, s3,
iceberg-rest) plus just the Spark container.

* go fmt

* clean up

* bind on a non-loopback IP for container access, aligned Iceberg metadata saves/locations with table locations, and reworked Spark time travel to use TIMESTAMP AS OF   with safe timestamp extraction.

* shared mini start

* Fixed internal directory creation under /buckets so .objects paths can auto-create without failing bucket-name validation, which restores table bucket object writes

* fix path

  Updated table bucket objects to write under `/buckets/<bucket>` and saved Iceberg metadata there, adjusting Spark time-travel timestamp to committed_at +1s. Rebuilt the weed binary (`go
  install ./weed`) and confirmed passing tests for Spark and Trino with focused test commands.

* Updated table bucket creation to stop creating /buckets/.objects and switched Trino REST warehouse to s3://<bucket> to match Iceberg layout.

* Stabilize S3Tables integration tests

* Fix timestamp extraction and remove dead code in bucketDir

* Use table bucket as warehouse in s3tables tests

* Update trino_blog_operations_test.go

* adds the CASCADE option to handle any remaining table metadata/files in the schema directory

* skip namespace not empty
2026-02-08 10:03:53 -08:00
Chris Lu
c284e51d20 fix: multipart upload ETag calculation (#8238)
* fix multipart etag

* address comments

* clean up

* clean up

* optimization

* address comments

* unquoted etag

* dedup

* upgrade

* clean

* etag

* return quoted tag

* quoted etag

* debug

* s3api: unify ETag retrieval and quoting across handlers

Refactor newListEntry to take *S3ApiServer and use getObjectETag,
and update setResponseHeaders to use the same logic. This ensures
consistent ETags are returned for both listing and direct access.

* s3api: implement ListObjects deduplication for versioned buckets

Handle duplicate entries between the main path and the .versions
directory by prioritizing the latest version when bucket versioning
is enabled.

* s3api: cleanup stale main file entries during versioned uploads

Add explicit deletion of pre-existing "main" files when creating new
versions in versioned buckets. This prevents stale entries from
appearing in bucket listings and ensures consistency.

* s3api: fix cleanup code placement in versioned uploads

Correct the placement of rm calls in completeMultipartUpload and
putVersionedObject to ensure stale main files are properly deleted
during versioned uploads.

* s3api: improve getObjectETag fallback for empty ExtETagKey

Ensure that when ExtETagKey exists but contains an empty value,
the function falls through to MD5/chunk-based calculation instead
of returning an empty string.

* s3api: fix test files for new newListEntry signature

Update test files to use the new newListEntry signature where the
first parameter is *S3ApiServer. Created mockS3ApiServer to properly
test owner display name lookup functionality.

* s3api: use filer.ETag for consistent Md5 handling in getEtagFromEntry

Change getEtagFromEntry fallback to use filer.ETag(entry) instead of
filer.ETagChunks to ensure legacy entries with Attributes.Md5 are
handled consistently with the rest of the codebase.

* s3api: optimize list logic and fix conditional header logging

- Hoist bucket versioning check out of per-entry callback to avoid
  repeated getVersioningState calls
- Extract appendOrDedup helper function to eliminate duplicate
  dedup/append logic across multiple code paths
- Change If-Match mismatch logging from glog.Errorf to glog.V(3).Infof
  and remove DEBUG prefix for consistency

* s3api: fix test mock to properly initialize IAM accounts

Fixed nil pointer dereference in TestNewListEntryOwnerDisplayName by
directly initializing the IdentityAccessManagement.accounts map in the
test setup. This ensures newListEntry can properly look up account
display names without panicking.

* cleanup

* s3api: remove premature main file cleanup in versioned uploads

Removed incorrect cleanup logic that was deleting main files during
versioned uploads. This was causing test failures because it deleted
objects that should have been preserved as null versions when
versioning was first enabled. The deduplication logic in listing is
sufficient to handle duplicate entries without deleting files during
upload.

* s3api: add empty-value guard to getEtagFromEntry

Added the same empty-value guard used in getObjectETag to prevent
returning quoted empty strings. When ExtETagKey exists but is empty,
the function now falls through to filer.ETag calculation instead of
returning "".

* s3api: fix listing of directory key objects with matching prefix

Revert prefix handling logic to use strings.TrimPrefix instead of
checking HasPrefix with empty string result. This ensures that when a
directory key object exactly matches the prefix (e.g. prefix="dir/",
object="dir/"), it is correctly handled as a regular entry instead of
being skipped or incorrectly processed as a common prefix. Also fixed
missing variable definition.

* s3api: refactor list inline dedup to use appendOrDedup helper

Refactored the inline deduplication logic in listFilerEntries to use the
shared appendOrDedup helper function. This ensures consistent behavior
and reduces code duplication.

* test: fix port allocation race in s3tables integration test

Updated startMiniCluster to find all required ports simultaneously using
findAvailablePorts instead of sequentially. This prevents race conditions
where the OS reallocates a port that was just released, causing multiple
services (e.g. Filer and Volume) to be assigned the same port and fail
to start.
2026-02-06 21:54:43 -08:00
Chris Lu
a3b83f8808 test: add Trino Iceberg catalog integration test (#8228)
* test: add Trino Iceberg catalog integration test

- Create test/s3/catalog_trino/trino_catalog_test.go with TestTrinoIcebergCatalog
- Tests integration between Trino SQL engine and SeaweedFS Iceberg REST catalog
- Starts weed mini with all services and Trino in Docker container
- Validates Iceberg catalog schema creation and listing operations
- Uses native S3 filesystem support in Trino with path-style access
- Add workflow job to s3-tables-tests.yml for CI execution

* fix: preserve AWS environment credentials when replacing S3 configuration

When S3 configuration is loaded from filer/db, it replaces the identities list
and inadvertently removes AWS_ACCESS_KEY_ID credentials that were added from
environment variables. This caused auth to remain disabled even though valid
credentials were present.

Fix by preserving environment-based identities when replacing the configuration
and re-adding them after the replacement. This ensures environment credentials
persist across configuration reloads and properly enable authentication.

* fix: use correct ServerAddress format with gRPC port encoding

The admin server couldn't connect to master because the master address
was missing the gRPC port information. Use pb.NewServerAddress() which
properly encodes both HTTP and gRPC ports in the address string.

Changes:
- weed/command/mini.go: Use pb.NewServerAddress for master address in admin
- test/s3/policy/policy_test.go: Store and use gRPC ports for master/filer addresses

This fix applies to:
1. Admin server connection to master (mini.go)
2. Test shell commands that need master/filer addresses (policy_test.go)

* move

* move

* fix: always include gRPC port in server address encoding

The NewServerAddress() function was omitting the gRPC port from the address
string when it matched the port+10000 convention. However, gRPC port allocation
doesn't always follow this convention - when the calculated port is busy, an
alternative port is allocated.

This caused a bug where:
1. Master's gRPC port was allocated as 50661 (sequential, not port+10000)
2. Address was encoded as '192.168.1.66:50660' (gRPC port omitted)
3. Admin client called ToGrpcAddress() which assumed port+10000 offset
4. Admin tried to connect to 60660 but master was on 50661 → connection failed

Fix: Always include explicit gRPC port in address format (host:httpPort.grpcPort)
unless gRPC port is 0. This makes addresses unambiguous and works regardless of
the port allocation strategy used.

Impacts: All server-to-server gRPC connections now use properly formatted addresses.

* test: fix Iceberg REST API readiness check

The Iceberg REST API endpoints require authentication. When checked without
credentials, the API returns 403 Forbidden (not 401 Unauthorized).  The
readiness check now accepts both auth error codes (401/403) as indicators
that the service is up and ready, it just needs credentials.

This fixes the 'Iceberg REST API did not become ready' test failure.

* Fix AWS SigV4 signature verification for base64-encoded payload hashes

   AWS SigV4 canonical requests must use hex-encoded SHA256 hashes,
   but the X-Amz-Content-Sha256 header may be transmitted as base64.

   Changes:
   - Added normalizePayloadHash() function to convert base64 to hex
   - Call normalizePayloadHash() in extractV4AuthInfoFromHeader()
   - Added encoding/base64 import

   Fixes 403 Forbidden errors on POST requests to Iceberg REST API
   when clients send base64-encoded content hashes in the header.

   Impacted services: Iceberg REST API, S3Tables

* Fix AWS SigV4 signature verification for base64-encoded payload hashes

   AWS SigV4 canonical requests must use hex-encoded SHA256 hashes,
   but the X-Amz-Content-Sha256 header may be transmitted as base64.

   Changes:
   - Added normalizePayloadHash() function to convert base64 to hex
   - Call normalizePayloadHash() in extractV4AuthInfoFromHeader()
   - Added encoding/base64 import
   - Removed unused fmt import

   Fixes 403 Forbidden errors on POST requests to Iceberg REST API
   when clients send base64-encoded content hashes in the header.

   Impacted services: Iceberg REST API, S3Tables

* pass sigv4

* s3api: fix identity preservation and logging levels

- Ensure environment-based identities are preserved during config replacement
- Update accessKeyIdent and nameToIdentity maps correctly
- Downgrade informational logs to V(2) to reduce noise

* test: fix trino integration test and s3 policy test

- Pin Trino image version to 479
- Fix port binding to 0.0.0.0 for Docker connectivity
- Fix S3 policy test hang by correctly assigning MiniClusterCtx
- Improve port finding robustness in policy tests

* ci: pre-pull trino image to avoid timeouts

- Pull trinodb/trino:479 after Docker setup
- Ensure image is ready before integration tests start

* iceberg: remove unused checkAuth and improve logging

- Remove unused checkAuth method
- Downgrade informational logs to V(2)
- Ensure loggingMiddleware uses a status writer for accurate reported codes
- Narrow catch-all route to avoid interfering with other subsystems

* iceberg: fix build failure by removing unused s3api import

* Update iceberg.go

* use warehouse

* Update trino_catalog_test.go
2026-02-06 13:12:25 -08:00
Chris Lu
b244bb58aa s3tables: redesign Iceberg REST Catalog using iceberg-go and automate integration tests (#8197)
* full integration with iceberg-go

* Table Commit Operations (handleUpdateTable)

* s3tables: fix Iceberg v2 compliance and namespace properties

This commit ensures SeaweedFS Iceberg REST Catalog is compliant with
Iceberg Format Version 2 by:
- Using iceberg-go's table.NewMetadataWithUUID for strict v2 compliance.
- Explicitly initializing namespace properties to empty maps.
- Removing omitempty from required Iceberg response fields.
- Fixing CommitTableRequest unmarshaling using table.Requirements and table.Updates.

* s3tables: automate Iceberg integration tests

- Added Makefile for local test execution and cluster management.
- Added docker-compose for PyIceberg compatibility kit.
- Added Go integration test harness for PyIceberg.
- Updated GitHub CI to run Iceberg catalog tests automatically.

* s3tables: update PyIceberg test suite for compatibility

- Updated test_rest_catalog.py to use latest PyIceberg transaction APIs.
- Updated Dockerfile to include pyarrow and pandas dependencies.
- Improved namespace and table handling in integration tests.

* s3tables: address review feedback on Iceberg Catalog

- Implemented robust metadata version parsing and incrementing.
- Ensured table metadata changes are persisted during commit (handleUpdateTable).
- Standardized namespace property initialization for consistency.
- Fixed unused variable and incorrect struct field build errors.

* s3tables: finalize Iceberg REST Catalog and optimize tests

- Implemented robust metadata versioning and persistence.
- Standardized namespace property initialization.
- Optimized integration tests using pre-built Docker image.
- Added strict property persistence validation to test suite.
- Fixed build errors from previous partial updates.

* Address PR review: fix Table UUID stability, implement S3Tables UpdateTable, and support full metadata persistence individually

* fix: Iceberg catalog stable UUIDs, metadata persistence, and file writing

- Ensure table UUIDs are stable (do not regenerate on load).
- Persist full table metadata (Iceberg JSON) in s3tables extended attributes.
- Add `MetadataVersion` to explicitly track version numbers, replacing regex parsing.
- Implement `saveMetadataFile` to persist metadata JSON files to the Filer on commit.
- Update `CreateTable` and `UpdateTable` handlers to use the new logic.

* test: bind weed mini to 0.0.0.0 in integration tests to fix Docker connectivity

* Iceberg: fix metadata handling in REST catalog

- Add nil guard in createTable
- Fix updateTable to correctly load existing metadata from storage
- Ensure full metadata persistence on updates
- Populate loadTable result with parsed metadata

* S3Tables: add auth checks and fix response fields in UpdateTable

- Add CheckPermissionWithContext to UpdateTable handler
- Include TableARN and MetadataLocation in UpdateTable response
- Use ErrCodeConflict (409) for version token mismatches

* Tests: improve Iceberg catalog test infrastructure and cleanup

- Makefile: use PID file for precise process killing
- test_rest_catalog.py: remove unused variables and fix f-strings

* Iceberg: fix variable shadowing in UpdateTable

- Rename inner loop variable `req` to `requirement` to avoid shadowing outer request variable

* S3Tables: simplify MetadataVersion initialization

- Use `max(req.MetadataVersion, 1)` instead of anonymous function

* Tests: remove unicode characters from S3 tables integration test logs

- Remove unicode checkmarks from test output for cleaner logs

* Iceberg: improve metadata persistence robustness

- Fix MetadataLocation in LoadTableResult to fallback to generated location
- Improve saveMetadataFile to ensure directory hierarchy existence and robust error handling
2026-02-03 15:30:04 -08:00
Chris Lu
2bb21ea276 feat: Add Iceberg REST Catalog server and admin UI (#8175)
* feat: Add Iceberg REST Catalog server

Implement Iceberg REST Catalog API on a separate port (default 8181)
that exposes S3 Tables metadata through the Apache Iceberg REST protocol.

- Add new weed/s3api/iceberg package with REST handlers
- Implement /v1/config endpoint returning catalog configuration
- Implement namespace endpoints (list/create/get/head/delete)
- Implement table endpoints (list/create/load/head/delete/update)
- Add -port.iceberg flag to S3 standalone server (s3.go)
- Add -s3.port.iceberg flag to combined server mode (server.go)
- Add -s3.port.iceberg flag to mini cluster mode (mini.go)
- Support prefix-based routing for multiple catalogs

The Iceberg REST server reuses S3 Tables metadata storage under
/table-buckets and enables DuckDB, Spark, and other Iceberg clients
to connect to SeaweedFS as a catalog.

* feat: Add Iceberg Catalog pages to admin UI

Add admin UI pages to browse Iceberg catalogs, namespaces, and tables.

- Add Iceberg Catalog menu item under Object Store navigation
- Create iceberg_catalog.templ showing catalog overview with REST info
- Create iceberg_namespaces.templ listing namespaces in a catalog
- Create iceberg_tables.templ listing tables in a namespace
- Add handlers and routes in admin_handlers.go
- Add Iceberg data provider methods in s3tables_management.go
- Add Iceberg data types in types.go

The Iceberg Catalog pages provide visibility into the same S3 Tables
data through an Iceberg-centric lens, including REST endpoint examples
for DuckDB and PyIceberg.

* test: Add Iceberg catalog integration tests and reorg s3tables tests

- Reorganize existing s3tables tests to test/s3tables/table-buckets/
- Add new test/s3tables/catalog/ for Iceberg REST catalog tests
- Add TestIcebergConfig to verify /v1/config endpoint
- Add TestIcebergNamespaces to verify namespace listing
- Add TestDuckDBIntegration for DuckDB connectivity (requires Docker)
- Update CI workflow to use new test paths

* fix: Generate proper random UUIDs for Iceberg tables

Address code review feedback:
- Replace placeholder UUID with crypto/rand-based UUID v4 generation
- Add detailed TODO comments for handleUpdateTable stub explaining
  the required atomic metadata swap implementation

* fix: Serve Iceberg on localhost listener when binding to different interface

Address code review feedback: properly serve the localhost listener
when the Iceberg server is bound to a non-localhost interface.

* ci: Add Iceberg catalog integration tests to CI

Add new job to run Iceberg catalog tests in CI, along with:
- Iceberg package build verification
- Iceberg unit tests
- Iceberg go vet checks
- Iceberg format checks

* fix: Address code review feedback for Iceberg implementation

- fix: Replace hardcoded account ID with s3_constants.AccountAdminId in buildTableBucketARN()
- fix: Improve UUID generation error handling with deterministic fallback (timestamp + PID + counter)
- fix: Update handleUpdateTable to return HTTP 501 Not Implemented instead of fake success
- fix: Better error handling in handleNamespaceExists to distinguish 404 from 500 errors
- fix: Use relative URL in template instead of hardcoded localhost:8181
- fix: Add HTTP timeout to test's waitForService function to avoid hangs
- fix: Use dynamic ephemeral ports in integration tests to avoid flaky parallel failures
- fix: Add Iceberg port to final port configuration logging in mini.go

* fix: Address critical issues in Iceberg implementation

- fix: Cache table UUIDs to ensure persistence across LoadTable calls
  The UUID now remains stable for the lifetime of the server session.
  TODO: For production, UUIDs should be persisted in S3 Tables metadata.

- fix: Remove redundant URL-encoded namespace parsing
  mux router already decodes %1F to \x1F before passing to handlers.
  Redundant ReplaceAll call could cause bugs with literal %1F in namespace.

* fix: Improve test robustness and reduce code duplication

- fix: Make DuckDB test more robust by failing on unexpected errors
  Instead of silently logging errors, now explicitly check for expected
  conditions (extension not available) and skip the test appropriately.

- fix: Extract username helper method to reduce duplication
  Created getUsername() helper in AdminHandlers to avoid duplicating
  the username retrieval logic across Iceberg page handlers.

* fix: Add mutex protection to table UUID cache

Protects concurrent access to the tableUUIDs map with sync.RWMutex.
Uses read-lock for fast path when UUID already cached, and write-lock
for generating new UUIDs. Includes double-check pattern to handle race
condition between read-unlock and write-lock.

* style: fix go fmt errors

* feat(iceberg): persist table UUID in S3 Tables metadata

* feat(admin): configure Iceberg port in Admin UI and commands

* refactor: address review comments (flags, tests, handlers)

- command/mini: fix tracking of explicit s3.port.iceberg flag
- command/admin: add explicit -iceberg.port flag
- admin/handlers: reuse getUsername helper
- tests: use 127.0.0.1 for ephemeral ports and os.Stat for file size check

* test: check error from FileStat in verify_gc_empty_test
2026-02-02 23:12:13 -08:00
Chris Lu
3dcaee56aa Revert "ci: Pin GitHub Actions to commit SHAs for s3-tables-tests"
This reverts commit 01da26fbcb.
2026-01-28 17:43:11 -08:00
Chris Lu
01da26fbcb ci: Pin GitHub Actions to commit SHAs for s3-tables-tests
Update all action refs to use pinned commit SHAs instead of floating tags:
- actions/checkout: @v6 → @8e8c483 (v4)
- actions/setup-go: @v6 → @0c52d54 (v5)
- actions/upload-artifact: @v6 → @65d8626 (v4)

Pinned SHAs improve reproducibility and reduce supply chain risk by
preventing accidental or malicious changes in action releases. Aligns
with repository conventions used in other workflows (e.g., go.yml).
2026-01-28 17:40:40 -08:00
Chris Lu
33da87452b Refine S3 Tables implementation to address code review feedback
- Standardize namespace representation to []string
- Improve listing logic with pagination and StartFromFileName
- Enhance error handling with sentinel errors and robust checks
- Add JSON encoding error logging
- Fix CI workflow to use gofmt -l
- Standardize timestamps in directory creation
- Validate single-level namespaces
2026-01-28 10:04:27 -08:00
Chris Lu
24c78d524c ci: fail s3 tables tests if any command in pipeline fails 2026-01-28 09:37:51 -08:00
Chris Lu
05c184b610 workflow: fix go install path to ./weed 2026-01-28 01:35:24 -08:00
Chris Lu
96a6e4c551 workflow: remove emojis from echo statements 2026-01-28 01:29:42 -08:00
Chris Lu
f4e472d396 workflow: fix s3 tables tests path and working directory
The workflow was failing because it was running inside 'weed' directory,
but the tests are at the repository root. Removed working-directory
default and updated relative paths to weed source.
2026-01-28 01:28:47 -08:00
Chris Lu
0dcb175514 ci: add S3 Tables integration tests to GitHub Actions
- Create new workflow for S3 Tables integration testing
- Add build verification job for s3tables package and s3api integration
- Add format checking for S3 Tables code
- Add go vet checks for code quality
- Workflow runs on all pull requests
- Includes test output logging and artifact upload on failure
2026-01-28 00:57:53 -08:00
Chris Lu
e86e65e5ab fix #8081: build latest container is missing latest_large_disk (#8145)
* fix #8081: build latest container is missing latest_large_disk

* fix: simplify QEMU setup condition in container_latest.yml matrix
2026-01-27 19:05:29 -08:00
dependabot[bot]
b502411884 chore(deps): bump actions/checkout from 4 to 6 (#8121)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Commits](https://github.com/actions/checkout/compare/v4...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-26 11:16:37 -08:00
dependabot[bot]
16dc90e3bd chore(deps): bump actions/setup-go from 5 to 6 (#8124)
Bumps [actions/setup-go](https://github.com/actions/setup-go) from 5 to 6.
- [Release notes](https://github.com/actions/setup-go/releases)
- [Commits](https://github.com/actions/setup-go/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/setup-go
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-26 11:16:28 -08:00
dependabot[bot]
6714973ffe chore(deps): bump actions/upload-artifact from 4 to 6 (#8125)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4 to 6.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v4...v6)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-26 11:16:19 -08:00
Chris Lu
6bf088cec9 IAM Policy Management via gRPC (#8109)
* Add IAM gRPC service definition

- Add GetConfiguration/PutConfiguration for config management
- Add CreateUser/GetUser/UpdateUser/DeleteUser/ListUsers for user management
- Add CreateAccessKey/DeleteAccessKey/GetUserByAccessKey for access key management
- Methods mirror existing IAM HTTP API functionality

* Add IAM gRPC handlers on filer server

- Implement IamGrpcServer with CredentialManager integration
- Handle configuration get/put operations
- Handle user CRUD operations
- Handle access key create/delete operations
- All methods delegate to CredentialManager for actual storage

* Wire IAM gRPC service to filer server

- Add CredentialManager field to FilerOption and FilerServer
- Import credential store implementations in filer command
- Initialize CredentialManager from credential.toml if available
- Register IAM gRPC service on filer gRPC server
- Enable credential management via gRPC alongside existing filer services

* Regenerate IAM protobuf with gRPC service methods

* iam_pb: add Policy Management to protobuf definitions

* credential: implement PolicyManager in credential stores

* filer: implement IAM Policy Management RPCs

* shell: add s3.policy command

* test: add integration test for s3.policy

* test: fix compilation errors in policy_test

* pb

* fmt

* test

* weed shell: add -policies flag to s3.configure

This allows linking/unlinking IAM policies to/from identities
directly from the s3.configure command.

* test: verify s3.configure policy linking and fix port allocation

- Added test case for linking policies to users via s3.configure
- Implemented findAvailablePortPair to ensure HTTP and gRPC ports
  are both available, avoiding conflicts with randomized port assignments.
- Updated assertion to match jsonpb output (policyNames)

* credential: add StoreTypeGrpc constant

* credential: add IAM gRPC store boilerplate

* credential: implement identity methods in gRPC store

* credential: implement policy methods in gRPC store

* admin: use gRPC credential store for AdminServer

This ensures that all IAM and policy changes made through the Admin UI
are persisted via the Filer's IAM gRPC service instead of direct file manipulation.

* shell: s3.configure use granular IAM gRPC APIs instead of full config patching

* shell: s3.configure use granular IAM gRPC APIs

* shell: replace deprecated ioutil with os in s3.policy

* filer: use gRPC FailedPrecondition for unconfigured credential manager

* test: improve s3.policy integration tests and fix error checks

* ci: add s3 policy shell integration tests to github workflow

* filer: fix LoadCredentialConfiguration error handling

* credential/grpc: propagate unmarshal errors in GetPolicies

* filer/grpc: improve error handling and validation

* shell: use gRPC status codes in s3.configure

* credential: document PutPolicy as create-or-replace

* credential/postgres: reuse CreatePolicy in PutPolicy to deduplicate logic

* shell: add timeout context and strictly enforce flags in s3.policy

* iam: standardize policy content field naming in gRPC and proto

* shell: extract slice helper functions in s3.configure

* filer: map credential store errors to gRPC status codes

* filer: add input validation for UpdateUser and CreateAccessKey

* iam: improve validation in policy and config handlers

* filer: ensure IAM service registration by defaulting credential manager

* credential: add GetStoreName method to manager

* test: verify policy deletion in integration test
2026-01-25 13:39:30 -08:00
Chris Lu
535be3096b Add AWS IAM integration tests and refactor admin authorization (#8098)
* Add AWS IAM integration tests and refactor admin authorization
- Added AWS IAM management integration tests (User, AccessKey, Policy)
- Updated test framework to support IAM client creation with JWT/OIDC
- Refactored s3api authorization to be policy-driven for IAM actions
- Removed hardcoded role name checks for admin privileges
- Added new tests to GitHub Actions basic test matrix

* test(s3/iam): add UpdateUser and UpdateAccessKey tests and fix nil pointer dereference

* feat(s3api): add DeletePolicy and update tests with cleanup logic

* test(s3/iam): use t.Cleanup for managed policy deletion in CreatePolicy test
2026-01-23 16:41:51 -08:00
Chris Lu
d664ca5ed3 fix: IAM authentication with AWS Signature V4 and environment credentials (#8099)
* fix: IAM authentication with AWS Signature V4 and environment credentials

Three key fixes for authenticated IAM requests to work:

1. Fix request body consumption before signature verification
   - iamMatcher was calling r.ParseForm() which consumed POST body
   - This broke AWS Signature V4 verification on subsequent reads
   - Now only check query string in matcher, preserving body for verification
   - File: weed/s3api/s3api_server.go

2. Preserve environment variable credentials across config reloads
   - After IAM mutations, config reload overwrote env var credentials
   - Extract env var loading into loadEnvironmentVariableCredentials()
   - Call after every config reload to persist credentials
   - File: weed/s3api/auth_credentials.go

3. Add authenticated IAM tests and test infrastructure
   - New TestIAMAuthenticated suite with AWS SDK + Signature V4
   - Dynamic port allocation for independent test execution
   - Flag reset to prevent state leakage between tests
   - CI workflow to run S3 and IAM tests separately
   - Files: test/s3/example/*, .github/workflows/s3-example-integration-tests.yml

All tests pass:
- TestIAMCreateUser (unauthenticated)
- TestIAMAuthenticated (with AWS Signature V4)
- S3 integration tests

* fmt

* chore: rename test/s3/example to test/s3/normal

* simplify: CI runs all integration tests in single job

* Update s3-example-integration-tests.yml

* ci: run each test group separately to avoid raft registry conflicts
2026-01-23 16:27:42 -08:00
Chris Lu
b0b7bd0273 Add check for AI code review comments in PR template
Updated the pull request template to include a new check for addressing AI code review comments.
2026-01-23 12:19:00 -08:00
Chris Lu
13dcf445a4 Fix maintenance worker panic and add EC integration tests (#8068)
* Fix nil pointer panic in maintenance worker when receiving empty task assignment

When a worker requests a task and none are available, the admin server
sends an empty TaskAssignment message. The worker was attempting to log
the task details without checking if the TaskId was empty, causing a
nil pointer dereference when accessing taskAssign.Params.VolumeId.

This fix adds a check for empty TaskId before processing the assignment,
preventing worker crashes and improving stability in production environments.

* Add EC integration test for admin-worker maintenance system

Adds comprehensive integration test that verifies the end-to-end flow
of erasure coding maintenance tasks:
- Admin server detects volumes needing EC encoding
- Workers register and receive task assignments
- EC encoding is executed and verified in master topology
- File read-back validation confirms data integrity

The test uses unique absolute working directories for each worker to
prevent ID conflicts and ensure stable worker registration. Includes
proper cleanup and process management for reliable test execution.

* Improve maintenance system stability and task deduplication

- Add cross-type task deduplication to prevent concurrent maintenance
  operations on the same volume (EC, balance, vacuum)
- Implement HasAnyTask check in ActiveTopology for better coordination
- Increase RequestTask timeout from 5s to 30s to prevent unnecessary
  worker reconnections
- Add TaskTypeNone sentinel for generic task checks
- Update all task detectors to use HasAnyTask for conflict prevention
- Improve config persistence and schema handling

* Add GitHub Actions workflow for EC integration tests

Adds CI workflow that runs EC integration tests on push and pull requests
to master branch. The workflow:
- Triggers on changes to admin, worker, or test files
- Builds the weed binary
- Runs the EC integration test suite
- Uploads test logs as artifacts on failure for debugging

This ensures the maintenance system remains stable and worker-admin
integration is validated in CI.

* go version 1.24

* address comments

* Update maintenance_integration.go

* support seconds

* ec prioritize over balancing in tests
2026-01-20 15:07:43 -08:00
Chris Lu
bfd267bfd7 removing the problematic Microsoft package sources 2026-01-18 20:41:27 -08:00
Chris Lu
f8d4583ecd Enhance PR template with AI checks
Added checks for AI generated PRs to the template.
2026-01-16 11:49:46 -08:00
Chris Lu
ee3813787e feat(s3api): Implement S3 Policy Variables (#8039)
* feat: Add AWS IAM Policy Variables support to S3 API

Implements policy variables for dynamic access control in bucket policies.

Supported variables:
- aws:username - Extracted from principal ARN
- aws:userid - User identifier (same as username in SeaweedFS)
- aws:principaltype - IAMUser, IAMRole, or AssumedRole
- jwt:* - Any JWT claim (e.g., jwt:preferred_username, jwt:sub)

Key changes:
- Added PolicyVariableRegex to detect ${...} patterns
- Extended CompiledStatement with DynamicResourcePatterns, DynamicPrincipalPatterns, DynamicActionPatterns
- Added Claims field to PolicyEvaluationArgs for JWT claim access
- Implemented SubstituteVariables() for variable replacement from context and JWT claims
- Implemented extractPrincipalVariables() for ARN parsing
- Updated EvaluateConditions() to support variable substitution
- Comprehensive unit and integration tests

Resolves #8037

* feat: Add LDAP and PrincipalAccount variable support

Completes future enhancements for policy variables:

- Added ldap:* variable support for LDAP claims
  - ldap:username - LDAP username from claims
  - ldap:dn - LDAP distinguished name from claims
  - ldap:* - Any LDAP claim

- Added aws:PrincipalAccount extraction from ARN
  - Extracts account ID from principal ARN
  - Available as ${aws:PrincipalAccount} in policies

Updated SubstituteVariables() to check LDAP claims
Updated extractPrincipalVariables() to extract account ID
Added comprehensive tests for new variables

* feat(s3api): implement IAM policy variables core logic and optimization

* feat(s3api): integrate policy variables with S3 authentication and handlers

* test(s3api): add integration tests for policy variables

* cleanup: remove unused policy conversion files

* Add S3 policy variables integration tests and path support

- Add comprehensive integration tests for policy variables
- Test username isolation, JWT claims, LDAP claims
- Add support for IAM paths in principal ARN parsing
- Add tests for principals with paths

* Fix IAM Role principal variable extraction

IAM Roles should not have aws:userid or aws:PrincipalAccount
according to AWS behavior. Only IAM Users and Assumed Roles
should have these variables.

Fixes TestExtractPrincipalVariables test failures.

* Security fixes and bug fixes for S3 policy variables

SECURITY FIXES:
- Prevent X-SeaweedFS-Principal header spoofing by clearing internal
  headers at start of authentication (auth_credentials.go)
- Restrict policy variable substitution to safe allowlist to prevent
  client header injection (iam/policy/policy_engine.go)
- Add core policy validation before storing bucket policies

BUG FIXES:
- Remove unused sid variable in evaluateStatement
- Fix LDAP claim lookup to check both prefixed and unprefixed keys
- Add ValidatePolicy call in PutBucketPolicyHandler

These fixes prevent privilege escalation via header injection and
ensure only validated identity claims are used in policy evaluation.

* Additional security fixes and code cleanup

SECURITY FIXES:
- Fixed X-Forwarded-For spoofing by only trusting proxy headers from
  private/localhost IPs (s3_iam_middleware.go)
- Changed context key from "sourceIP" to "aws:SourceIp" for proper
  policy variable substitution

CODE IMPROVEMENTS:
- Kept aws:PrincipalAccount for IAM Roles to support condition evaluations
- Removed redundant STS principaltype override
- Removed unused service variable
- Cleaned up commented-out debug logging statements
- Updated tests to reflect new IAM Role behavior

These changes prevent IP spoofing attacks and ensure policy variables
work correctly with the safe allowlist.

* Add security documentation for ParseJWTToken

Added comprehensive security comments explaining that ParseJWTToken
is safe despite parsing without verification because:
- It's only used for routing to the correct verification method
- All code paths perform cryptographic verification before trusting claims
- OIDC tokens: validated via validateExternalOIDCToken
- STS tokens: validated via ValidateSessionToken

Enhanced function documentation with clear security warnings about
proper usage to prevent future misuse.

* Fix IP condition evaluation to use aws:SourceIp key

Fixed evaluateIPCondition in IAM policy engine to use "aws:SourceIp"
instead of "sourceIP" to match the updated extractRequestContext.

This fixes the failing IP-restricted role test where IP-based policy
conditions were not being evaluated correctly.

Updated all test cases to use the correct "aws:SourceIp" key.

* Address code review feedback: optimize and clarify

PERFORMANCE IMPROVEMENT:
- Optimized expandPolicyVariables to use regexp.ReplaceAllStringFunc
  for single-pass variable substitution instead of iterating through
  all safe variables. This improves performance from O(n*m) to O(m)
  where n is the number of safe variables and m is the pattern length.

CODE CLARITY:
- Added detailed comment explaining LDAP claim fallback mechanism
  (checks both prefixed and unprefixed keys for compatibility)
- Enhanced TODO comment for trusted proxy configuration with rationale
  and recommendations for supporting cloud load balancers, CDNs, and
  complex network topologies

All tests passing.

* Address Copilot code review feedback

BUG FIXES:
- Fixed type switch for int/int32/int64 - separated into individual cases
  since interface type switches only match the first type in multi-type cases
- Fixed grammatically incorrect error message in types.go

CODE QUALITY:
- Removed duplicate Resource/NotResource validation (already in ValidateStatement)
- Added comprehensive comment explaining isEnabled() logic and security implications
- Improved trusted proxy NOTE comment to be more concise while noting limitations

All tests passing.

* Fix test failures after extractSourceIP security changes

Updated tests to work with the security fix that only trusts
X-Forwarded-For/X-Real-IP headers from private IP addresses:

- Set RemoteAddr to 127.0.0.1 in tests to simulate trusted proxy
- Changed context key from "sourceIP" to "aws:SourceIp"
- Added test case for untrusted proxy (public RemoteAddr)
- Removed invalid ValidateStatement call (validation happens in ValidatePolicy)

All tests now passing.

* Address remaining Gemini code review feedback

CODE SAFETY:
- Deep clone Action field in CompileStatement to prevent potential data races
  if the original policy document is modified after compilation

TEST CLEANUP:
- Remove debug logging (fmt.Fprintf) from engine_notresource_test.go
- Remove unused imports in engine_notresource_test.go

All tests passing.

* Fix insecure JWT parsing in IAM auth flow

SECURITY FIX:
- Renamed ParseJWTToken to ParseUnverifiedJWTToken with explicit security warnings.
- Refactored AuthenticateJWT to use the trusted SessionInfo returned by ValidateSessionToken
  instead of relying on unverified claims from the initial parse.
- Refactored ValidatePresignedURLWithIAM to reuse the robust AuthenticateJWT logic, removing
  duplicated and insecure manual token parsing.

This ensures all identity information (Role, Principal, Subject) used for authorization
decisions is derived solely from cryptographically verified tokens.

* Security: Fix insecure JWT claim extraction in policy engine

- Refactored EvaluatePolicy to accept trusted claims from verified Identity instead of parsing unverified tokens
- Updated AuthenticateJWT to populate Claims in IAMIdentity from verified sources (SessionInfo/ExternalIdentity)
- Updated s3api_server and handlers to pass claims correctly
- Improved isPrivateIP to support IPv6 loopback, link-local, and ULA
- Fixed flaky distributed_session_consistency test with retry logic

* fix(iam): populate Subject in STSSessionInfo to ensure correct identity propagation

This fixes the TestS3IAMAuthentication/valid_jwt_token_authentication failure by ensuring the session subject (sub) is correctly mapped to the internal SessionInfo struct, allowing bucket ownership validation to succeed.

* Optimized isPrivateIP

* Create s3-policy-tests.yml

* fix tests

* fix tests

* tests(s3/iam): simplify policy to resource-based \ (step 1)

* tests(s3/iam): add explicit Deny NotResource for isolation (step 2)

* fixes

* policy: skip resource matching for STS trust policies to allow AssumeRole evaluation

* refactor: remove debug logging and hoist policy variables for performance

* test: fix TestS3IAMBucketPolicyIntegration cleanup to handle per-subtest object lifecycle

* test: fix bucket name generation to comply with S3 63-char limit

* test: skip TestS3IAMPolicyEnforcement until role setup is implemented

* test: use weed mini for simpler test server deployment

Replace 'weed server' with 'weed mini' for IAM tests to avoid port binding issues
and simplify the all-in-one server deployment. This improves test reliability
and execution time.

* security: prevent allocation overflow in policy evaluation

Add maxPoliciesForEvaluation constant to cap the number of policies evaluated
in a single request. This prevents potential integer overflow when allocating
slices for policy lists that may be influenced by untrusted input.

Changes:
- Add const maxPoliciesForEvaluation = 1024 to set an upper bound
- Validate len(policies) < maxPoliciesForEvaluation before appending bucket policy
- Use append() instead of make([]string, len+1) to avoid arithmetic overflow
- Apply fix to both IsActionAllowed policy evaluation paths
2026-01-16 11:12:28 -08:00
Chris Lu
15ca301e43 Fix flaky EC integration tests by collecting server logs on failure (#7969)
* Fix flaky EC integration tests by collecting server logs on failure

The EC Integration Tests were experiencing flaky timeouts with errors like
"error reading from server: EOF" and master client reconnection attempts.
When tests failed, server logs were not collected, making debugging difficult.

Changes:
- Updated all test functions to use t.TempDir() instead of os.MkdirTemp()
  and manual cleanup. t.TempDir() automatically preserves directories when
  tests fail, ensuring logs are available for debugging.
- Modified GitHub Actions workflow to collect server logs from temp
  directories when tests fail, including master.log and volume*.log files.
- Added explicit log collection step that searches for test temp directories
  and copies them to artifacts for upload.

This will make debugging flaky test failures much easier by providing access
to the actual server logs showing what went wrong.

* Fix find command precedence in log collection

The -type d flag only applied to the first -name predicate because -o
has lower precedence than the implicit AND. Grouped the -name predicates
with escaped parentheses so -type d applies to all directory name patterns.
2026-01-05 12:05:31 -08:00
Chris Lu
8d6bcddf60 Add S3 volume encryption support with -s3.encryptVolumeData flag (#7890)
* Add S3 volume encryption support with -s3.encryptVolumeData flag

This change adds volume-level encryption support for S3 uploads, similar
to the existing -filer.encryptVolumeData option. Each chunk is encrypted
with its own auto-generated CipherKey when the flag is enabled.

Changes:
- Add -s3.encryptVolumeData flag to weed s3, weed server, and weed mini
- Wire Cipher option through S3ApiServer and ChunkedUploadOption
- Add integration tests for multi-chunk range reads with encryption
- Tests verify encryption works across chunk boundaries

Usage:
  weed s3 -encryptVolumeData
  weed server -s3 -s3.encryptVolumeData
  weed mini -s3.encryptVolumeData

Integration tests:
  go test -v -tags=integration -timeout 5m ./test/s3/sse/...

* Add GitHub Actions CI for S3 volume encryption tests

- Add test-volume-encryption target to Makefile that starts server with -s3.encryptVolumeData
- Add s3-volume-encryption job to GitHub Actions workflow
- Tests run with integration build tag and 10m timeout
- Server logs uploaded on failure for debugging

* Fix S3 client credentials to use environment variables

The test was using hardcoded credentials "any"/"any" but the Makefile
sets AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY to "some_access_key1"/
"some_secret_key1". Updated getS3Client() to read from environment
variables with fallback to "any"/"any" for manual testing.

* Change bucket creation errors from skip to fatal

Tests should fail, not skip, when bucket creation fails. This ensures
that credential mismatches and other configuration issues are caught
rather than silently skipped.

* Make copy and multipart test jobs fail instead of succeed

Changed exit 0 to exit 1 for s3-sse-copy-operations and s3-sse-multipart
jobs. These jobs document known limitations but should fail to ensure
the issues are tracked and addressed, not silently ignored.

* Hardcode S3 credentials to match Makefile

Changed from environment variables to hardcoded credentials
"some_access_key1"/"some_secret_key1" to match the Makefile
configuration. This ensures tests work reliably.

* fix Double Encryption

* fix Chunk Size Mismatch

* Added IsCompressed

* is gzipped

* fix copying

* only perform HEAD request when len(cipherKey) > 0

* Revert "Make copy and multipart test jobs fail instead of succeed"

This reverts commit bc34a7eb3c103ae7ab2000da2a6c3925712eb226.

* fix security vulnerability

* fix security

* Update s3api_object_handlers_copy.go

* Update s3api_object_handlers_copy.go

* jwt to get content length
2025-12-27 00:09:14 -08:00
Chris Lu
1261e93ef2 fix: comprehensive go vet error fixes and add CI enforcement (#7861)
* fix: use keyed fields in struct literals

- Replace unsafe reflect.StringHeader/SliceHeader with safe unsafe.String/Slice (weed/query/sqltypes/unsafe.go)
- Add field names to Type_ScalarType struct literals (weed/mq/schema/schema_builder.go)
- Add Duration field name to FlexibleDuration struct literals across test files
- Add field names to bson.D struct literals (weed/filer/mongodb/mongodb_store_kv.go)

Fixes go vet warnings about unkeyed struct literals.

* fix: remove unreachable code

- Remove unreachable return statements after infinite for loops
- Remove unreachable code after if/else blocks where all paths return
- Simplify recursive logic by removing unnecessary for loop (inode_to_path.go)
- Fix Type_ScalarType literal to use enum value directly (schema_builder.go)
- Call onCompletionFn on stream error (subscribe_session.go)

Files fixed:
- weed/query/sqltypes/unsafe.go
- weed/mq/schema/schema_builder.go
- weed/mq/client/sub_client/connect_to_sub_coordinator.go
- weed/filer/redis3/ItemList.go
- weed/mq/client/agent_client/subscribe_session.go
- weed/mq/broker/broker_grpc_pub_balancer.go
- weed/mount/inode_to_path.go
- weed/util/skiplist/name_list.go

* fix: avoid copying lock values in protobuf messages

- Use proto.Merge() instead of direct assignment to avoid copying sync.Mutex in S3ApiConfiguration (iamapi_server.go)
- Add explicit comments noting that channel-received values are already copies before taking addresses (volume_grpc_client_to_master.go)

The protobuf messages contain sync.Mutex fields from the message state, which should not be copied.
Using proto.Merge() properly merges messages without copying the embedded mutex.

* fix: correct byte array size for uint32 bit shift operations

The generateAccountId() function only needs 4 bytes to create a uint32 value.
Changed from allocating 8 bytes to 4 bytes to match the actual usage.

This fixes go vet warning about shifting 8-bit values (bytes) by more than 8 bits.

* fix: ensure context cancellation on all error paths

In broker_client_subscribe.go, ensure subscriberCancel() is called on all error return paths:
- When stream creation fails
- When partition assignment fails
- When sending initialization message fails

This prevents context leaks when an error occurs during subscriber creation.

* fix: ensure subscriberCancel called for CreateFreshSubscriber stream.Send error

Ensure subscriberCancel() is called when stream.Send fails in CreateFreshSubscriber.

* ci: add go vet step to prevent future lint regressions

- Add go vet step to GitHub Actions workflow
- Filter known protobuf lock warnings (MessageState sync.Mutex)
  These are expected in generated protobuf code and are safe
- Prevents accumulation of go vet errors in future PRs
- Step runs before build to catch issues early

* fix: resolve remaining syntax and logic errors in vet fixes

- Fixed syntax errors in filer_sync.go caused by missing closing braces
- Added missing closing brace for if block and function
- Synchronized fixes to match previous commits on branch

* fix: add missing return statements to daemon functions

- Add 'return false' after infinite loops in filer_backup.go and filer_meta_backup.go
- Satisfies declared bool return type signatures
- Maintains consistency with other daemon functions (runMaster, runFilerSynchronize, runWorker)
- While unreachable, explicitly declares the return satisfies function signature contract

* fix: add nil check for onCompletionFn in SubscribeMessageRecord

- Check if onCompletionFn is not nil before calling it
- Prevents potential panic if nil function is passed
- Matches pattern used in other callback functions

* docs: clarify unreachable return statements in daemon functions

- Add comments documenting that return statements satisfy function signature
- Explains that these returns follow infinite loops and are unreachable
- Improves code clarity for future maintainers
2025-12-23 14:48:50 -08:00
Chris Lu
621ff124f0 fix: ensure Helm chart is published only after container images are available (#7859)
fix: consolidate Helm chart release with container image build

Resolve issue #7855 by consolidating the Helm chart release workflow
with the container image build workflow. This ensures perfect alignment:

1. Container images are built and pushed to GHCR
2. Images are copied from GHCR to Docker Hub
3. Helm chart is published only after step 2 completes

Previously, the Helm chart was published immediately on tag push before
images were available in Docker Hub, causing deployment failures.

Changes:
- Added helm-release job to container_release_unified.yml that depends
  on copy-to-dockerhub job
- Removed helm_chart_release.yml workflow (consolidated into unified release)

Benefits:
- No race conditions between image push and chart publication
- Users can deploy immediately after release
- Single source of truth for release process
- Clearer job dependencies and execution flow
2025-12-23 10:33:21 -08:00
Chris Lu
683e3d06a4 go mod tidy 2025-12-22 18:48:13 -08:00
Chris Lu
aaa6de7712 Increase timeout from 5m to 10m for S3 HTTPS test workflow 2025-12-22 15:57:32 -08:00
chrislu
4a764dbb37 fmt 2025-12-19 15:33:16 -08:00
G-OD
504b258258 s3: fix remote object not caching (#7790)
* s3: fix remote object not caching

* s3: address review comments for remote object caching

- Fix leading slash in object name by using strings.TrimPrefix
- Return cached entry from CacheRemoteObjectToLocalCluster to get updated local chunk locations
- Reuse existing helper function instead of inline gRPC call

* s3/filer: add singleflight deduplication for remote object caching

- Add singleflight.Group to FilerServer to deduplicate concurrent cache operations
- Wrap CacheRemoteObjectToLocalCluster with singleflight to ensure only one
  caching operation runs per object when multiple clients request the same file
- Add early-return check for already-cached objects
- S3 API calls filer gRPC with timeout and graceful fallback on error
- Clear negative bucket cache when bucket is created via weed shell
- Add integration tests for remote cache with singleflight deduplication

This benefits all clients (S3, HTTP, Hadoop) accessing remote-mounted objects
by preventing redundant cache operations and improving concurrent access performance.

Fixes: https://github.com/seaweedfs/seaweedfs/discussions/7599

* fix: data race in concurrent remote object caching

- Add mutex to protect chunks slice from concurrent append
- Add mutex to protect fetchAndWriteErr from concurrent read/write
- Fix incorrect error check (was checking assignResult.Error instead of parseErr)
- Rename inner variable to avoid shadowing fetchAndWriteErr

* fix: address code review comments

- Remove duplicate remote caching block in GetObjectHandler, keep only singleflight version
- Add mutex protection for concurrent chunk slice and error access (data race fix)
- Use lazy initialization for S3 client in tests to avoid panic during package load
- Fix markdown linting: add language specifier to code fence, blank lines around tables
- Add 'all' target to Makefile as alias for test-with-server
- Remove unused 'util' import

* style: remove emojis from test files

* fix: add defensive checks and sort chunks by offset

- Add nil check and type assertion check for singleflight result
- Sort chunks by offset after concurrent fetching to maintain file order

* fix: improve test diagnostics and path normalization

- runWeedShell now returns error for better test diagnostics
- Add all targets to .PHONY in Makefile (logs-primary, logs-remote, health)
- Strip leading slash from normalizedObject to avoid double slashes in path

---------

Co-authored-by: chrislu <chris.lu@gmail.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
2025-12-16 12:41:04 -08:00
chrislu
daa3af826f ci: fix stress tests by adding server start/stop 2025-12-16 00:02:00 -08:00
chrislu
aff144f8b5 ci: run versioning stress tests on all PRs, not just master pushes 2025-12-15 23:42:50 -08:00
chrislu
8236df1368 ci: enable pagination stress tests in GitHub CI
Add pagination stress tests (>1000 versions) to the S3 versioning stress
test job in GitHub CI. These tests run on master branch pushes to validate
that ListObjectVersions correctly handles objects with more than 1000
versions using pagination.
2025-12-15 23:11:24 -08:00
Chris Lu
44beb42eb9 s3: fix PutObject ETag format for multi-chunk uploads (#7771)
* s3: fix PutObject ETag format for multi-chunk uploads

Fix issue #7768: AWS S3 SDK for Java fails with 'Invalid base 16
character: -' when performing PutObject on files that are internally
auto-chunked.

The issue was that SeaweedFS returned a composite ETag format
(<md5hash>-<count>) for regular PutObject when the file was split
into multiple chunks due to auto-chunking. However, per AWS S3 spec,
the composite ETag format should only be used for multipart uploads
(CreateMultipartUpload/UploadPart/CompleteMultipartUpload API).

Regular PutObject should always return a pure MD5 hash as the ETag,
regardless of how the file is stored internally.

The fix ensures the MD5 hash is always stored in entry.Attributes.Md5
for regular PutObject operations, so filer.ETag() returns the pure
MD5 hash instead of falling back to ETagChunks() composite format.

* test: add comprehensive ETag format tests for issue #7768

Add integration tests to ensure PutObject ETag format compatibility:

Go tests (test/s3/etag/):
- TestPutObjectETagFormat_SmallFile: 1KB single chunk
- TestPutObjectETagFormat_LargeFile: 10MB auto-chunked (critical for #7768)
- TestPutObjectETagFormat_ExtraLargeFile: 25MB multi-chunk
- TestMultipartUploadETagFormat: verify composite ETag for multipart
- TestPutObjectETagConsistency: ETag consistency across PUT/HEAD/GET
- TestETagHexValidation: simulate AWS SDK v2 hex decoding
- TestMultipleLargeFileUploads: stress test multiple large uploads

Java tests (other/java/s3copier/):
- Update pom.xml to include AWS SDK v2 (2.20.127)
- Add ETagValidationTest.java with comprehensive SDK v2 tests
- Add README.md documenting SDK versions and test coverage

Documentation:
- Add test/s3/SDK_COMPATIBILITY.md documenting validated SDK versions
- Add test/s3/etag/README.md explaining test coverage

These tests ensure large file PutObject (>8MB) returns pure MD5 ETags
(not composite format), which is required for AWS SDK v2 compatibility.

* fix: lower Java version requirement to 11 for CI compatibility

* address CodeRabbit review comments

- s3_etag_test.go: Handle rand.Read error, fix multipart part-count logging
- Makefile: Add 'all' target, pass S3_ENDPOINT to test commands
- SDK_COMPATIBILITY.md: Add language tag to fenced code block
- ETagValidationTest.java: Add pagination to cleanup logic
- README.md: Clarify Go SDK tests are in separate location

* ci: add s3copier ETag validation tests to Java integration tests

- Enable S3 API (-s3 -s3.port=8333) in SeaweedFS test server
- Add S3 API readiness check to wait loop
- Add step to run ETagValidationTest from s3copier

This ensures the fix for issue #7768 is continuously tested
against AWS SDK v2 for Java in CI.

* ci: add S3 config with credentials for s3copier tests

- Add -s3.config pointing to docker/compose/s3.json
- Add -s3.allowDeleteBucketNotEmpty for test cleanup
- Set S3_ACCESS_KEY and S3_SECRET_KEY env vars for tests

* ci: pass S3 config as Maven system properties

Pass S3_ENDPOINT, S3_ACCESS_KEY, S3_SECRET_KEY via -D flags
so they're available via System.getProperty() in Java tests
2025-12-15 12:43:33 -08:00