* S3: Implement IAM defaults and STS signing key fallback logic
* S3: Refactor startup order to init SSE-S3 key manager before IAM
* S3: Derive STS signing key from KEK using HKDF for security isolation
* S3: Document STS signing key fallback in security.toml
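The HKDF derivation mentioned above can be sketched with the standard library alone. This is a minimal illustration of RFC 5869 extract-then-expand with HMAC-SHA256, not the code from this change: the helper names, the `sts-signing-key` info label, the 32-byte output length, and the placeholder KEK are all assumptions.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hkdfSHA256 implements HKDF extract-then-expand (RFC 5869) with HMAC-SHA256.
// length must not exceed 255*32 bytes, the RFC limit for SHA-256.
func hkdfSHA256(ikm, salt, info []byte, length int) []byte {
	if length > 255*sha256.Size {
		panic("hkdf: requested length too large")
	}
	// Extract: PRK = HMAC(salt, IKM). A missing salt defaults to 32 zero bytes.
	if len(salt) == 0 {
		salt = make([]byte, sha256.Size)
	}
	extract := hmac.New(sha256.New, salt)
	extract.Write(ikm)
	prk := extract.Sum(nil)

	// Expand: T(i) = HMAC(PRK, T(i-1) || info || i), concatenated until long enough.
	var okm, prev []byte
	for counter := byte(1); len(okm) < length; counter++ {
		expand := hmac.New(sha256.New, prk)
		expand.Write(prev)
		expand.Write(info)
		expand.Write([]byte{counter})
		prev = expand.Sum(nil)
		okm = append(okm, prev...)
	}
	return okm[:length]
}

// deriveSTSSigningKey is a hypothetical helper: it binds the derived key to a
// purpose label so that it is cryptographically separate from the KEK itself.
func deriveSTSSigningKey(kek []byte) []byte {
	return hkdfSHA256(kek, nil, []byte("sts-signing-key"), 32)
}

func main() {
	kek := []byte("0123456789abcdef0123456789abcdef") // placeholder KEK, not a real key
	fmt.Println(hex.EncodeToString(deriveSTSSigningKey(kek)))
}
```

Because the info label takes part in the expand step, a key derived for a different purpose from the same KEK will differ, which is the isolation property the commit title refers to.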
* fix(s3api): refine anonymous access logic and secure-by-default behavior
- Initialize anonymous identity by default in `NewIdentityAccessManagement` to prevent nil pointer dereferences.
- Ensure `ReplaceS3ApiConfiguration` preserves the anonymous identity if not present in the new configuration.
- Update `NewIdentityAccessManagement` signature to accept `filerClient`.
- In legacy mode (no policy engine), anonymous defaults to Deny (no actions), preserving secure-by-default behavior.
- Use specific `LookupAnonymous` method instead of generic map lookup.
- Update tests to accommodate signature changes and verify improved anonymous handling.
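A rough sketch of the anonymous-identity behavior described in this commit, using simplified stand-in types. The `identity` and `iam` types and their methods are hypothetical and only mirror the ideas above: an anonymous identity always exists, it has no actions by default, and a configuration replace keeps the previous one when the new configuration does not define it.

```go
package main

import "fmt"

// identity and iam are simplified, hypothetical stand-ins for the real types.
type identity struct {
	name    string
	actions []string
}

// allows reports whether the identity has been granted the given action.
func (id *identity) allows(action string) bool {
	for _, a := range id.actions {
		if a == action {
			return true
		}
	}
	return false
}

type iam struct {
	identities map[string]*identity
	anonymous  *identity
}

// newIAM always creates an anonymous identity with no actions (deny by default),
// so later lookups never return nil.
func newIAM() *iam {
	return &iam{
		identities: map[string]*identity{},
		anonymous:  &identity{name: "anonymous"},
	}
}

// replace swaps in a new identity list but keeps the current anonymous
// identity when the new configuration does not define one.
func (m *iam) replace(next []*identity) {
	identities := make(map[string]*identity, len(next))
	anon := m.anonymous
	for _, id := range next {
		if id.name == "anonymous" {
			anon = id
			continue
		}
		identities[id.name] = id
	}
	m.identities, m.anonymous = identities, anon
}

// lookupAnonymous is a dedicated accessor instead of a generic map lookup.
func (m *iam) lookupAnonymous() *identity { return m.anonymous }

func main() {
	m := newIAM()
	m.replace([]*identity{{name: "admin", actions: []string{"Admin"}}})
	fmt.Println(m.lookupAnonymous() != nil, m.lookupAnonymous().allows("Read"))
}
```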
* feat(s3api): make IAM configuration optional
- Start S3 API server without a configuration file if `EnableIam` option is set.
- Default to `Allow` effect for policy engine when no configuration is provided (Zero-Config mode).
- Handle empty configuration path gracefully in `loadIAMManagerFromConfig`.
- Add integration test `iam_optional_test.go` to verify empty config behavior.
* fix(iamapi): fix signature mismatch in NewIdentityAccessManagementWithStore
* fix(iamapi): properly initialize FilerClient instead of passing nil
* fix(iamapi): properly initialize filer client for IAM management
- Instead of passing `nil`, construct a `wdclient.FilerClient` using the provided `Filers` addresses.
- Ensure `NewIdentityAccessManagementWithStore` receives a valid `filerClient` to avoid potential nil pointer dereferences or limited functionality.
* clean: remove dead code in s3api_server.go
* refactor(s3api): improve IAM initialization, safety and anonymous access security
* fix(s3api): ensure IAM config loads from filer after client init
* fix(s3): resolve test failures in integration, CORS, and tagging tests
- Fix CORS tests by providing explicit anonymous permissions config
- Fix S3 integration tests by setting admin credentials in init
- Align tagging test credentials in CI with IAM defaults
- Add a goroutine to retry the IAM config load in the iamapi server
* fix(s3): allow anonymous access to health targets and S3 Tables when identities are present
* fix(ci): use /healthz for Caddy health check in awscli tests
* iam, s3api: expose DefaultAllow from IAM and Policy Engine
This allows checking the global "Open by Default" configuration from
other components like S3 Tables.
* s3api/s3tables: support DefaultAllow in permission logic and handler
Updated CheckPermissionWithContext to respect the DefaultAllow flag
in PolicyContext. This enables "Open by Default" behavior for
unauthenticated access in zero-config environments. Added a targeted
unit test to verify the logic.
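One way to picture the DefaultAllow fallback is the sketch below. It assumes an evaluation order that the commit message does not spell out (explicit deny, then explicit allow, then ownership, then the default), and every type and name in it is hypothetical rather than taken from the s3tables package.

```go
package main

import "fmt"

// effect is the outcome of evaluating any attached policy for a request.
type effect int

const (
	noMatch effect = iota // no policy statement applies
	explicitAllow
	explicitDeny
)

// policyContext is a hypothetical, trimmed-down permission context.
type policyContext struct {
	principal    string // empty for unauthenticated requests
	owner        string
	defaultAllow bool // the global "Open by Default" flag
}

// checkPermission falls back to defaultAllow only when nothing else decides.
func checkPermission(ctx policyContext, policyResult effect) bool {
	switch {
	case policyResult == explicitDeny:
		return false // an explicit deny always wins
	case policyResult == explicitAllow:
		return true
	case ctx.principal != "" && ctx.principal == ctx.owner:
		return true // owners may access their own resources
	default:
		return ctx.defaultAllow
	}
}

func main() {
	anon := policyContext{owner: "alice", defaultAllow: true}
	fmt.Println(checkPermission(anon, noMatch))      // zero-config: allowed
	fmt.Println(checkPermission(anon, explicitDeny)) // still denied
}
```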
* s3api/s3tables: propagate DefaultAllow through handlers
Propagated the DefaultAllow flag to individual handlers for
namespaces, buckets, tables, policies, and tagging. This ensures
consistent "Open by Default" behavior across all S3 Tables API
endpoints.
* s3api: wire up DefaultAllow for S3 Tables API initialization
Updated registerS3TablesRoutes to query the global IAM configuration
and set the DefaultAllow flag on the S3 Tables API server. This
completes the end-to-end propagation required for anonymous access in
zero-config environments. Added a SetDefaultAllow method to
S3TablesApiServer to facilitate this.
* s3api: fix tests by adding DefaultAllow to mock IAM integrations
The IAMIntegration interface was updated to include DefaultAllow(),
breaking several mock implementations in tests. This commit fixes
the build errors by adding the missing method to the mocks.
* env
* ensure ports
* env
* env
* fix default allow
* add one more test using non-anonymous user
* debug
* add more debug
* less logs
* Add Trino blog operations test
* Update test/s3tables/catalog_trino/trino_blog_operations_test.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* feat: add table bucket path helpers and filer operations
- Add table object root and table location mapping directories
- Implement ensureDirectory, upsertFile, deleteEntryIfExists helpers
- Support table location bucket mapping for S3 access
* feat: manage table bucket object roots on creation/deletion
- Create .objects directory for table buckets on creation
- Clean up table object bucket paths on deletion
- Enable S3 operations on table bucket object roots
* feat: add table location mapping for Iceberg REST
- Track table location bucket mappings when tables are created/updated/deleted
- Enable location-based routing for S3 operations on table data
* feat: route S3 operations to table bucket object roots
- Route table-s3 bucket names to mapped table paths
- Route table buckets to object root directories
- Support table location bucket mapping lookup
* feat: emit table-s3 locations from Iceberg REST
- Generate unique table-s3 bucket names with UUID suffix
- Store table metadata under table bucket paths
- Return table-s3 locations for Trino compatibility
* fix: handle missing directories in S3 list operations
- Propagate ErrNotFound from ListEntries for non-existent directories
- Treat missing directories as empty results for list operations
- Fixes Trino non-empty location checks on table creation
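The "missing directory means empty listing" rule can be sketched as follows. `errNotFound` stands in for `filer_pb.ErrNotFound`, and `fakeList` is a made-up lister used only to exercise the three cases; neither is the real filer API.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errNotFound  = errors.New("not found")   // stand-in for filer_pb.ErrNotFound
	errTransient = errors.New("unavailable") // any other backend failure
)

// lister abstracts a directory listing call against the filer.
type lister func(dir string) ([]string, error)

// listOrEmpty treats a missing directory as an empty result and
// propagates every other error unchanged.
func listOrEmpty(list lister, dir string) ([]string, error) {
	entries, err := list(dir)
	if errors.Is(err, errNotFound) {
		return []string{}, nil
	}
	if err != nil {
		return nil, err
	}
	return entries, nil
}

// fakeList is a hypothetical lister covering the three interesting cases.
func fakeList(dir string) ([]string, error) {
	switch dir {
	case "/buckets/b/data":
		return []string{"a.parquet"}, nil
	case "/buckets/b/flaky":
		return nil, errTransient
	default:
		return nil, fmt.Errorf("list %s: %w", dir, errNotFound)
	}
}

func main() {
	entries, err := listOrEmpty(fakeList, "/buckets/b/new-table")
	fmt.Println(len(entries), err)
}
```

Using `errors.Is` rather than `==` keeps the check working when the not-found error has been wrapped on its way up the call stack.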
* test: improve Trino CSV parsing for single-value results
- Sanitize Trino output to skip jline warnings
- Handle single-value CSV results without header rows
- Strip quotes from numeric values in tests
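A small sketch of the parsing approach described above. It assumes the noise lines start with `WARNING` and that a single-value result arrives as one quoted CSV field with no header; the function names are illustrative, not the ones in the test suite.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// sanitizeOutput drops blank lines and lines that look like terminal
// warnings (assumed here to start with "WARNING").
func sanitizeOutput(out string) []string {
	var lines []string
	for _, line := range strings.Split(out, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "WARNING") {
			continue
		}
		lines = append(lines, line)
	}
	return lines
}

// parseSingleInt reads a single-value result such as `"3"`. It takes the last
// remaining line, so it works with or without a header row, and strips the
// CSV quotes before converting.
func parseSingleInt(out string) (int, error) {
	lines := sanitizeOutput(out)
	if len(lines) == 0 {
		return 0, errors.New("no result rows")
	}
	value := strings.Trim(lines[len(lines)-1], `"`)
	return strconv.Atoi(value)
}

func main() {
	out := "WARNING: Unable to create a system terminal\n\"3\"\n"
	n, err := parseSingleInt(out)
	fmt.Println(n, err)
}
```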
* refactor: use bucket path helpers throughout S3 API
- Replace direct bucket path operations with helper functions
- Leverage centralized table bucket routing logic
- Improve maintainability with consistent path resolution
* fix: add table bucket cache and improve filer error handling
- Cache table bucket lookups to reduce filer overhead on repeated checks
- Use filer_pb.CreateEntry and filer_pb.UpdateEntry helpers to check resp.Error
- Fix delete order in handler_bucket_get_list_delete: delete table object before directory
- Make location mapping errors best-effort: log and continue, don't fail API
- Update table location mappings to delete stale prior bucket mappings on update
- Add a 1-second sleep before the timestamp time-travel query to ensure the timestamp is in the past
- Fix CSV parsing: examine all lines instead of skipping the first; handle single-value rows
* fix: properly handle stale metadata location mapping cleanup
- Capture oldMetadataLocation before mutation in handleUpdateTable
- Update updateTableLocationMapping to accept both old and new locations
- Use passed-in oldMetadataLocation to detect location changes
- Delete stale mapping only when location actually changes
- Pass empty string for oldLocation in handleCreateTable (new tables have no prior mapping)
- Improve logging to show old -> new location transitions
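The old/new location handling can be illustrated with an in-memory map. This is only a sketch: it assumes locations look like `s3://<bucket>/...` and that the mapping is keyed by that bucket name, and `bucketOf` and `updateMapping` are hypothetical helpers, not the real `updateTableLocationMapping`.

```go
package main

import (
	"fmt"
	"strings"
)

// bucketOf extracts the bucket component from an s3://bucket/key location.
func bucketOf(location string) string {
	rest := strings.TrimPrefix(location, "s3://")
	return strings.SplitN(rest, "/", 2)[0]
}

// updateMapping records newLocation's bucket for tablePath and removes the
// mapping for oldLocation only when the bucket actually changed. An empty
// oldLocation means the table is new and has no prior mapping.
func updateMapping(m map[string]string, oldLocation, newLocation, tablePath string) {
	newBucket := bucketOf(newLocation)
	if oldLocation != "" {
		if oldBucket := bucketOf(oldLocation); oldBucket != newBucket {
			delete(m, oldBucket) // stale mapping from before the move
		}
	}
	m[newBucket] = tablePath
}

func main() {
	m := map[string]string{}
	path := "/buckets/warehouse/ns/orders"
	updateMapping(m, "", "s3://orders-1111/metadata/v1.metadata.json", path)
	updateMapping(m, "s3://orders-1111/metadata/v1.metadata.json",
		"s3://orders-2222/metadata/v2.metadata.json", path)
	fmt.Println(m) // only the orders-2222 mapping remains
}
```

Capturing the old location before the table entry is mutated matters here: once the entry has been updated, the old and new values are identical and the stale mapping can no longer be detected.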
* refactor: cleanup imports and cache design
- Remove unused 'sync' import from bucket_paths.go
- Use filer_pb.UpdateEntry helper in setExtendedAttribute and deleteExtendedAttribute for consistent error handling
- Add dedicated tableBucketCache map[string]bool to BucketRegistry instead of mixing concerns with metadataCache
- Improve cache separation: table buckets cache is now separate from bucket metadata cache
* fix: improve cache invalidation and add transient error handling
Cache invalidation (critical fix):
- Add tableLocationCache to BucketRegistry for location mapping lookups
- Clear tableBucketCache and tableLocationCache in RemoveBucketMetadata
- Prevents stale cache entries when buckets are deleted/recreated
Transient error handling:
- Only cache table bucket lookups when conclusive (found or ErrNotFound)
- Skip caching on transient errors (network, permission, etc)
- Prevents marking real table buckets as non-table due to transient failures
Performance optimization:
- Cache tableLocationDir results to avoid repeated filer RPCs on hot paths
- tableLocationDir now checks cache before making expensive filer lookups
- Cache stores empty string for 'not found' to avoid redundant lookups
Code clarity:
- Add comment to deleteDirectory explaining DeleteEntry response lacks Error field
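The caching rules in this commit can be sketched as a small cache that stores only conclusive answers. The type and method names are hypothetical and the lookup function stands in for the filer RPC; the real `BucketRegistry` is more involved.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	errNotFound  = errors.New("not found")   // stand-in for filer_pb.ErrNotFound
	errTransient = errors.New("unavailable") // e.g. a network or permission error
)

// tableBucketCache remembers only conclusive answers about whether a
// bucket is a table bucket.
type tableBucketCache struct {
	mu      sync.Mutex
	entries map[string]bool
}

func newTableBucketCache() *tableBucketCache {
	return &tableBucketCache{entries: map[string]bool{}}
}

// isTableBucket consults the cache first. It caches "found" and "not found",
// but never a transient failure, so a flaky lookup cannot mark a real table
// bucket as a regular one.
func (c *tableBucketCache) isTableBucket(bucket string, lookup func(string) (bool, error)) (bool, error) {
	c.mu.Lock()
	cached, ok := c.entries[bucket]
	c.mu.Unlock()
	if ok {
		return cached, nil
	}
	found, err := lookup(bucket)
	if errors.Is(err, errNotFound) {
		found, err = false, nil // conclusive: the bucket is not a table bucket
	}
	if err != nil {
		return false, err // transient: report it and leave the cache untouched
	}
	c.mu.Lock()
	c.entries[bucket] = found
	c.mu.Unlock()
	return found, nil
}

// invalidate drops the entry, for example when a bucket is deleted.
func (c *tableBucketCache) invalidate(bucket string) {
	c.mu.Lock()
	delete(c.entries, bucket)
	c.mu.Unlock()
}

func main() {
	c := newTableBucketCache()
	lookup := func(string) (bool, error) { return false, errNotFound }
	fmt.Println(c.isTableBucket("plain-bucket", lookup))
}
```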
* go fmt
* fix: mirror transient error handling in tableLocationDir and optimize bucketDir
Transient error handling:
- tableLocationDir now only caches definitive results
- Mirrors isTableBucket behavior to prevent treating transient errors as permanent misses
- Improves reliability on flaky systems or during recovery
Performance optimization:
- bucketDir avoids redundant isTableBucket call via bucketRoot
- Directly use s3a.option.BucketsPath for regular buckets
- Saves one cache lookup for every non-table bucket operation
* fix: revert bucketDir optimization to preserve bucketRoot logic
The optimization to directly use BucketsPath bypassed bucketRoot's logic
and caused issues with S3 list operations on delimiter+prefix cases.
Revert to using path.Join(s3a.bucketRoot(bucket), bucket) which properly
handles all bucket types and ensures consistent path resolution across
the codebase.
The slight performance cost of an extra cache lookup is worth the correctness
and consistency benefits.
* feat: move table buckets under /buckets
Add a table-bucket marker attribute, reuse bucket metadata cache for table bucket detection, and update list/validation/UI/test paths to treat table buckets as /buckets entries.
* Fix S3 Tables code review issues
- handler_bucket_create.go: Fix bucket existence check to properly validate
entryResp.Entry before setting s3BucketExists flag (nil Entry should not
indicate existing bucket)
- bucket_paths.go: Add clarifying comment to bucketRoot() explaining unified
buckets root path for all bucket types
- file_browser_data.go: Optimize by extracting table bucket check early to
avoid redundant WithFilerClient call
* Fix list prefix delimiter handling
* Handle list errors conservatively
* Fix Trino FOR TIMESTAMP query - use past timestamp
Iceberg requires the timestamp to be strictly in the past.
Use current_timestamp - interval '1' second instead of current_timestamp.
* full integration with iceberg-go
* Table Commit Operations (handleUpdateTable)
* s3tables: fix Iceberg v2 compliance and namespace properties
This commit ensures SeaweedFS Iceberg REST Catalog is compliant with
Iceberg Format Version 2 by:
- Using iceberg-go's table.NewMetadataWithUUID for strict v2 compliance.
- Explicitly initializing namespace properties to empty maps.
- Removing omitempty from required Iceberg response fields.
- Fixing CommitTableRequest unmarshaling using table.Requirements and table.Updates.
* s3tables: automate Iceberg integration tests
- Added Makefile for local test execution and cluster management.
- Added docker-compose for PyIceberg compatibility kit.
- Added Go integration test harness for PyIceberg.
- Updated GitHub CI to run Iceberg catalog tests automatically.
* s3tables: update PyIceberg test suite for compatibility
- Updated test_rest_catalog.py to use latest PyIceberg transaction APIs.
- Updated Dockerfile to include pyarrow and pandas dependencies.
- Improved namespace and table handling in integration tests.
* s3tables: address review feedback on Iceberg Catalog
- Implemented robust metadata version parsing and incrementing.
- Ensured table metadata changes are persisted during commit (handleUpdateTable).
- Standardized namespace property initialization for consistency.
- Fixed unused variable and incorrect struct field build errors.
* s3tables: finalize Iceberg REST Catalog and optimize tests
- Implemented robust metadata versioning and persistence.
- Standardized namespace property initialization.
- Optimized integration tests using pre-built Docker image.
- Added strict property persistence validation to test suite.
- Fixed build errors from previous partial updates.
* Address PR review: fix Table UUID stability, implement S3Tables UpdateTable, and support full metadata persistence individually
* fix: Iceberg catalog stable UUIDs, metadata persistence, and file writing
- Ensure table UUIDs are stable (do not regenerate on load).
- Persist full table metadata (Iceberg JSON) in s3tables extended attributes.
- Add `MetadataVersion` to explicitly track version numbers, replacing regex parsing.
- Implement `saveMetadataFile` to persist metadata JSON files to the Filer on commit.
- Update `CreateTable` and `UpdateTable` handlers to use the new logic.
* test: bind weed mini to 0.0.0.0 in integration tests to fix Docker connectivity
* Iceberg: fix metadata handling in REST catalog
- Add nil guard in createTable
- Fix updateTable to correctly load existing metadata from storage
- Ensure full metadata persistence on updates
- Populate loadTable result with parsed metadata
* S3Tables: add auth checks and fix response fields in UpdateTable
- Add CheckPermissionWithContext to UpdateTable handler
- Include TableARN and MetadataLocation in UpdateTable response
- Use ErrCodeConflict (409) for version token mismatches
* Tests: improve Iceberg catalog test infrastructure and cleanup
- Makefile: use PID file for precise process killing
- test_rest_catalog.py: remove unused variables and fix f-strings
* Iceberg: fix variable shadowing in UpdateTable
- Rename inner loop variable `req` to `requirement` to avoid shadowing outer request variable
* S3Tables: simplify MetadataVersion initialization
- Use `max(req.MetadataVersion, 1)` instead of anonymous function
* Tests: remove unicode characters from S3 tables integration test logs
- Remove unicode checkmarks from test output for cleaner logs
* Iceberg: improve metadata persistence robustness
- Fix MetadataLocation in LoadTableResult to fall back to the generated location
- Improve saveMetadataFile to ensure directory hierarchy existence and robust error handling
* Add shared s3tables manager
* Add s3tables shell commands
* Add s3tables admin API
* Add s3tables admin UI
* Fix admin s3tables namespace create
* Rename table buckets menu
* Centralize s3tables tag validation
* Reuse s3tables manager in admin
* Extract s3tables list limit
* Add s3tables bucket ARN helper
* Remove write middleware from s3tables APIs
* Fix bucket link and policy hint
* Fix table tag parsing and nav link
* Disable namespace table link on invalid ARN
* Improve s3tables error decode
* Return flag parse errors for s3tables tag
* Accept query params for namespace create
* Bind namespace create form data
* Read s3tables JS data from DOM
* s3tables: allow empty region ARN
* shell: pass s3tables account id
* shell: require account for table buckets
* shell: use bucket name for namespaces
* shell: use bucket name for tables
* shell: use bucket name for tags
* admin: add table buckets links in file browser
* s3api: reuse s3tables tag validation
* admin: harden s3tables UI handlers
* fix admin list table buckets
* allow admin s3tables access
* validate s3tables bucket tags
* log s3tables bucket metadata errors
* rollback table bucket on owner failure
* show s3tables bucket owner
* add s3tables iam conditions
* Add s3tables user permissions UI
* Authorize s3tables using identity actions
* Add s3tables permissions to user modal
* Disambiguate bucket scope in user permissions
* Block table bucket names that match S3 buckets
* Pretty-print IAM identity JSON
* Include tags in s3tables permission context
* admin: refactor S3 Tables inline JavaScript into a separate file
* s3tables: extend IAM policy condition operators support
* shell: use LookupEntry wrapper for s3tables bucket conflict check
* admin: handle buildBucketPermissions validation in create/update flows
Move bucket name extraction outside the if/else block in
extractResourceOwnerAndBucket since the logic is identical for both
ResourceTypeTable and ResourceTypeBucket cases. This reduces code
duplication and improves maintainability.
The extraction pattern (parts[1] from /tables/{bucket}/...) works for
both resource types, so it's now performed once before the type-specific
metadata unmarshaling.
Replace error-swallowing pattern where all errors from getExtendedAttribute
were ignored for bucket policy reads. Now properly distinguish between:
- ErrAttributeNotFound: Policy not found is expected; continue with empty policy
- Other errors: Return internal server error and stop processing
Applied fix to all bucket policy reads in:
- handleDeleteTableBucketPolicy (line 220)
- handleTagResource (line 313)
- handleUntagResource (line 405)
- handleListTagsForResource (line 488)
- And additional occurrences in closures
This prevents silent failures and ensures policy-related errors are surfaced
to callers rather than being silently ignored.
Create extractResourceOwnerAndBucket() helper to consolidate the repeated pattern
of unmarshaling metadata and extracting bucket name from resource path. This
pattern was duplicated in handleTagResource, handleListTagsForResource, and
handleUntagResource. Update all three handlers to use the helper.
Also update remaining uses of getPrincipalFromRequest() (in handler_bucket_create,
handler_bucket_get_list_delete, handler_namespace) to use getAccountID() after
consolidating the two identical methods.
Update handleListTagsForResource to fetch and pass bucket policy to
CheckPermission, matching the behavior of handleTagResource/handleUntagResource.
This enables bucket-policy-based permission grants to be evaluated for
ListTagsForResource, not just ownership-based checks.
Refactored resolveResourcePath to return resource type, enabling accurate
NoSuchBucket vs NoSuchTable error codes. Added existence checks before
deleting policies.
- Add authorization checks to all S3 Tables handlers (policy, table ops) to enforce security
- Improve error handling to distinguish between NotFound (404) and InternalError (500)
- Fix directory FileMode usage in filer_ops
- Improve test randomness for version tokens
- Update permissions comments to acknowledge IAM gaps
- Implement strict table name validation (prevent path traversal and enforce allowed characters)
- Add nil checks for entry.Entry in all listing loops to prevent panics
- Propagate backend errors instead of swallowing them or assuming 404
- Correctly map filer_pb.ErrNotFound to appropriate S3 error codes
- Standardize existence checks across bucket, namespace, and table handlers
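As an illustration of the strict table name validation listed above, an allowlist check rejects path traversal as a side effect, because `/` and `.` are simply not permitted characters. The exact rule below (lowercase letters, digits, and underscores, starting with a letter or digit, at most 255 characters) is an assumption for the sketch, not necessarily the rule this change enforces.

```go
package main

import (
	"fmt"
	"regexp"
)

// tableNamePattern is an assumed allowlist: 1 to 255 characters, lowercase
// letters, digits, and underscores, starting with a letter or digit.
var tableNamePattern = regexp.MustCompile(`^[a-z0-9][a-z0-9_]{0,254}$`)

// validateTableName returns an error for any name outside the allowlist.
// Names such as "../x" or "a/b" fail because '.' and '/' are not allowed.
func validateTableName(name string) error {
	if !tableNamePattern.MatchString(name) {
		return fmt.Errorf("invalid table name %q", name)
	}
	return nil
}

func main() {
	for _, name := range []string{"orders_2024", "../secrets", "a/b", ""} {
		fmt.Printf("%q valid=%v\n", name, validateTableName(name) == nil)
	}
}
```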