seaweedFS

Author	SHA1	Message	Date
Chris Lu	a3b83f8808	test: add Trino Iceberg catalog integration test (#8228 ) * test: add Trino Iceberg catalog integration test - Create test/s3/catalog_trino/trino_catalog_test.go with TestTrinoIcebergCatalog - Tests integration between Trino SQL engine and SeaweedFS Iceberg REST catalog - Starts weed mini with all services and Trino in Docker container - Validates Iceberg catalog schema creation and listing operations - Uses native S3 filesystem support in Trino with path-style access - Add workflow job to s3-tables-tests.yml for CI execution * fix: preserve AWS environment credentials when replacing S3 configuration When S3 configuration is loaded from filer/db, it replaces the identities list and inadvertently removes AWS_ACCESS_KEY_ID credentials that were added from environment variables. This caused auth to remain disabled even though valid credentials were present. Fix by preserving environment-based identities when replacing the configuration and re-adding them after the replacement. This ensures environment credentials persist across configuration reloads and properly enable authentication. * fix: use correct ServerAddress format with gRPC port encoding The admin server couldn't connect to master because the master address was missing the gRPC port information. Use pb.NewServerAddress() which properly encodes both HTTP and gRPC ports in the address string. Changes: - weed/command/mini.go: Use pb.NewServerAddress for master address in admin - test/s3/policy/policy_test.go: Store and use gRPC ports for master/filer addresses This fix applies to: 1. Admin server connection to master (mini.go) 2. Test shell commands that need master/filer addresses (policy_test.go) * move * move * fix: always include gRPC port in server address encoding The NewServerAddress() function was omitting the gRPC port from the address string when it matched the port+10000 convention. However, gRPC port allocation doesn't always follow this convention - when the calculated port is busy, an alternative port is allocated. This caused a bug where: 1. Master's gRPC port was allocated as 50661 (sequential, not port+10000) 2. Address was encoded as '192.168.1.66:50660' (gRPC port omitted) 3. Admin client called ToGrpcAddress() which assumed port+10000 offset 4. Admin tried to connect to 60660 but master was on 50661 → connection failed Fix: Always include explicit gRPC port in address format (host:httpPort.grpcPort) unless gRPC port is 0. This makes addresses unambiguous and works regardless of the port allocation strategy used. Impacts: All server-to-server gRPC connections now use properly formatted addresses. * test: fix Iceberg REST API readiness check The Iceberg REST API endpoints require authentication. When checked without credentials, the API returns 403 Forbidden (not 401 Unauthorized). The readiness check now accepts both auth error codes (401/403) as indicators that the service is up and ready, it just needs credentials. This fixes the 'Iceberg REST API did not become ready' test failure. * Fix AWS SigV4 signature verification for base64-encoded payload hashes AWS SigV4 canonical requests must use hex-encoded SHA256 hashes, but the X-Amz-Content-Sha256 header may be transmitted as base64. Changes: - Added normalizePayloadHash() function to convert base64 to hex - Call normalizePayloadHash() in extractV4AuthInfoFromHeader() - Added encoding/base64 import Fixes 403 Forbidden errors on POST requests to Iceberg REST API when clients send base64-encoded content hashes in the header. Impacted services: Iceberg REST API, S3Tables * Fix AWS SigV4 signature verification for base64-encoded payload hashes AWS SigV4 canonical requests must use hex-encoded SHA256 hashes, but the X-Amz-Content-Sha256 header may be transmitted as base64. Changes: - Added normalizePayloadHash() function to convert base64 to hex - Call normalizePayloadHash() in extractV4AuthInfoFromHeader() - Added encoding/base64 import - Removed unused fmt import Fixes 403 Forbidden errors on POST requests to Iceberg REST API when clients send base64-encoded content hashes in the header. Impacted services: Iceberg REST API, S3Tables * pass sigv4 * s3api: fix identity preservation and logging levels - Ensure environment-based identities are preserved during config replacement - Update accessKeyIdent and nameToIdentity maps correctly - Downgrade informational logs to V(2) to reduce noise * test: fix trino integration test and s3 policy test - Pin Trino image version to 479 - Fix port binding to 0.0.0.0 for Docker connectivity - Fix S3 policy test hang by correctly assigning MiniClusterCtx - Improve port finding robustness in policy tests * ci: pre-pull trino image to avoid timeouts - Pull trinodb/trino:479 after Docker setup - Ensure image is ready before integration tests start * iceberg: remove unused checkAuth and improve logging - Remove unused checkAuth method - Downgrade informational logs to V(2) - Ensure loggingMiddleware uses a status writer for accurate reported codes - Narrow catch-all route to avoid interfering with other subsystems * iceberg: fix build failure by removing unused s3api import * Update iceberg.go * use warehouse * Update trino_catalog_test.go	2026-02-06 13:12:25 -08:00
Chris Lu	b244bb58aa	s3tables: redesign Iceberg REST Catalog using iceberg-go and automate integration tests (#8197 ) * full integration with iceberg-go * Table Commit Operations (handleUpdateTable) * s3tables: fix Iceberg v2 compliance and namespace properties This commit ensures SeaweedFS Iceberg REST Catalog is compliant with Iceberg Format Version 2 by: - Using iceberg-go's table.NewMetadataWithUUID for strict v2 compliance. - Explicitly initializing namespace properties to empty maps. - Removing omitempty from required Iceberg response fields. - Fixing CommitTableRequest unmarshaling using table.Requirements and table.Updates. * s3tables: automate Iceberg integration tests - Added Makefile for local test execution and cluster management. - Added docker-compose for PyIceberg compatibility kit. - Added Go integration test harness for PyIceberg. - Updated GitHub CI to run Iceberg catalog tests automatically. * s3tables: update PyIceberg test suite for compatibility - Updated test_rest_catalog.py to use latest PyIceberg transaction APIs. - Updated Dockerfile to include pyarrow and pandas dependencies. - Improved namespace and table handling in integration tests. * s3tables: address review feedback on Iceberg Catalog - Implemented robust metadata version parsing and incrementing. - Ensured table metadata changes are persisted during commit (handleUpdateTable). - Standardized namespace property initialization for consistency. - Fixed unused variable and incorrect struct field build errors. * s3tables: finalize Iceberg REST Catalog and optimize tests - Implemented robust metadata versioning and persistence. - Standardized namespace property initialization. - Optimized integration tests using pre-built Docker image. - Added strict property persistence validation to test suite. - Fixed build errors from previous partial updates. * Address PR review: fix Table UUID stability, implement S3Tables UpdateTable, and support full metadata persistence individually * fix: Iceberg catalog stable UUIDs, metadata persistence, and file writing - Ensure table UUIDs are stable (do not regenerate on load). - Persist full table metadata (Iceberg JSON) in s3tables extended attributes. - Add `MetadataVersion` to explicitly track version numbers, replacing regex parsing. - Implement `saveMetadataFile` to persist metadata JSON files to the Filer on commit. - Update `CreateTable` and `UpdateTable` handlers to use the new logic. * test: bind weed mini to 0.0.0.0 in integration tests to fix Docker connectivity * Iceberg: fix metadata handling in REST catalog - Add nil guard in createTable - Fix updateTable to correctly load existing metadata from storage - Ensure full metadata persistence on updates - Populate loadTable result with parsed metadata * S3Tables: add auth checks and fix response fields in UpdateTable - Add CheckPermissionWithContext to UpdateTable handler - Include TableARN and MetadataLocation in UpdateTable response - Use ErrCodeConflict (409) for version token mismatches * Tests: improve Iceberg catalog test infrastructure and cleanup - Makefile: use PID file for precise process killing - test_rest_catalog.py: remove unused variables and fix f-strings * Iceberg: fix variable shadowing in UpdateTable - Rename inner loop variable `req` to `requirement` to avoid shadowing outer request variable * S3Tables: simplify MetadataVersion initialization - Use `max(req.MetadataVersion, 1)` instead of anonymous function * Tests: remove unicode characters from S3 tables integration test logs - Remove unicode checkmarks from test output for cleaner logs * Iceberg: improve metadata persistence robustness - Fix MetadataLocation in LoadTableResult to fallback to generated location - Improve saveMetadataFile to ensure directory hierarchy existence and robust error handling	2026-02-03 15:30:04 -08:00
Chris Lu	79722bcf30	Add s3tables shell and admin UI (#8172 ) * Add shared s3tables manager * Add s3tables shell commands * Add s3tables admin API * Add s3tables admin UI * Fix admin s3tables namespace create * Rename table buckets menu * Centralize s3tables tag validation * Reuse s3tables manager in admin * Extract s3tables list limit * Add s3tables bucket ARN helper * Remove write middleware from s3tables APIs * Fix bucket link and policy hint * Fix table tag parsing and nav link * Disable namespace table link on invalid ARN * Improve s3tables error decode * Return flag parse errors for s3tables tag * Accept query params for namespace create * Bind namespace create form data * Read s3tables JS data from DOM * s3tables: allow empty region ARN * shell: pass s3tables account id * shell: require account for table buckets * shell: use bucket name for namespaces * shell: use bucket name for tables * shell: use bucket name for tags * admin: add table buckets links in file browser * s3api: reuse s3tables tag validation * admin: harden s3tables UI handlers * fix admin list table buckets * allow admin s3tables access * validate s3tables bucket tags * log s3tables bucket metadata errors * rollback table bucket on owner failure * show s3tables bucket owner * add s3tables iam conditions * Add s3tables user permissions UI * Authorize s3tables using identity actions * Add s3tables permissions to user modal * Disambiguate bucket scope in user permissions * Block table bucket names that match S3 buckets * Pretty-print IAM identity JSON * Include tags in s3tables permission context * admin: refactor S3 Tables inline JavaScript into a separate file * s3tables: extend IAM policy condition operators support * shell: use LookupEntry wrapper for s3tables bucket conflict check * admin: handle buildBucketPermissions validation in create/update flows	2026-01-30 22:57:05 -08:00
Chris Lu	25b0f86bda	s3tables: Fix ownership consistency across handlers Address three related ownership consistency issues: 1. CreateNamespace now sets OwnerAccountID to bucketMetadata.OwnerAccountID instead of request principal. This prevents namespaces created by delegated callers (via bucket policy) from becoming unmanageable, since ListNamespaces filters by bucket owner. 2. CreateTable now: - Fetches bucket metadata to use correct owner for bucket policy evaluation - Uses namespaceMetadata.OwnerAccountID for namespace policy checks - Uses bucketMetadata.OwnerAccountID for bucket policy checks - Sets table OwnerAccountID to namespaceMetadata.OwnerAccountID (inherited) 3. GetTable now: - Fetches bucket metadata to use correct owner for bucket policy evaluation - Uses metadata.OwnerAccountID for table policy checks - Uses bucketMetadata.OwnerAccountID for bucket policy checks This ensures: - Bucket owner retains implicit "owner always allowed" behavior even when evaluating bucket policies - Ownership hierarchy is consistent (namespace owned by bucket, table owned by namespace) - Cross-principal delegation via policies doesn't break ownership chains	2026-01-28 18:03:47 -08:00
Chris Lu	2c45b69775	s3tables: Fix remaining policy error handling in namespace and bucket handlers Replace silent error swallowing (err == nil) with proper error distinction for bucket policy reads. Now properly checks ErrAttributeNotFound and propagates other errors as internal server errors. Fixed 5 locations: - handleCreateNamespace (policy fetch) - handleDeleteNamespace (policy fetch) - handleListNamespaces (policy fetch) - handleGetNamespace (policy fetch) - handleGetTableBucket (policy fetch) This prevents masking of filer issues when policies cannot be read due to I/O errors or other transient failures.	2026-01-28 17:39:44 -08:00
Chris Lu	a27f6527ab	s3tables: Extract resource owner and bucket extraction into helper method Create extractResourceOwnerAndBucket() helper to consolidate the repeated pattern of unmarshaling metadata and extracting bucket name from resource path. This pattern was duplicated in handleTagResource, handleListTagsForResource, and handleUntagResource. Update all three handlers to use the helper. Also update remaining uses of getPrincipalFromRequest() (in handler_bucket_create, handler_bucket_get_list_delete, handler_namespace) to use getAccountID() after consolidating the two identical methods.	2026-01-28 16:24:07 -08:00
Chris Lu	2d556ac2a5	S3 Tables API now properly enforces resource policies addressing the critical security gap where policies were created but never evaluated.	2026-01-28 16:15:34 -08:00
Chris Lu	4d4af0589b	s3tables: standardize access denied errors using ErrAccessDenied constant	2026-01-28 14:33:01 -08:00
Chris Lu	43aebc10da	s3tables: enforce strict resource ownership and implement result filtering for namespaces	2026-01-28 13:59:24 -08:00
Chris Lu	1697ec862f	ownerAccountID	2026-01-28 13:54:49 -08:00
Chris Lu	ef0bae45e3	s3tables: refactor permission checks to use resource owner in namespace handlers	2026-01-28 13:50:16 -08:00
Chris Lu	12c1190a5c	s3tables: update namespace handlers for multi-account support Updated namespace creation to use authenticated account ID for ownership and unified permission checks across all namespace operations to use the correct account principal.	2026-01-28 13:25:27 -08:00
Chris Lu	5eed1874a9	s3tables: return 404 in handleDeleteNamespace if namespace not found	2026-01-28 12:46:20 -08:00
Chris Lu	babf1b06ac	s3tables: implement token-based pagination for namespace listing	2026-01-28 12:30:33 -08:00
Chris Lu	d8c7c16aad	S3 Tables: fix gRPC stream loop handling in namespace handlers - Correctly handle io.EOF in handleListNamespaces and handleDeleteNamespace. - Propagate other errors to prevent silent failures or accidental data loss. - Added necessary io import.	2026-01-28 12:13:29 -08:00
Chris Lu	1c0d37e15a	s3tables: improve error handling and permission logic - Update handleGetNamespace to distinguish between 404 and 500 errors - Refactor CanManagePolicy to use CheckPermission for consistent enforcement - Ensure empty identities are correctly handled in policy management checks	2026-01-28 11:39:28 -08:00
Chris Lu	62a1178a0b	s3tables: improve robustness, security, and error propagation in handlers - Implement strict table name validation (prevention of path traversal and character enforcement) - Add nil checks for entry.Entry in all listing loops to prevent panics - Propagate backend errors instead of swallowing them or assuming 404 - Correctly map filer_pb.ErrNotFound to appropriate S3 error codes - Standardize existence checks across bucket, namespace, and table handlers	2026-01-28 11:37:02 -08:00
Chris Lu	04514071a7	s3tables: implement granular authorization and refine error responses - Remove mandatory ACTION_ADMIN at the router level - Enforce granular permissions in bucket and namespace handlers - Prioritize AccountID in ExtractPrincipalFromContext for ARN matching - Distinguish between 404 (NoSuchBucket) and 500 (InternalError) in metadata lookups - Clean up unused imports in s3api_tables.go	2026-01-28 11:31:38 -08:00
Chris Lu	2c551dad5d	s3tables: fix pagination and enhance error handling in list/delete operations - Fix InclusiveStartFrom logic to ensure exclusive start on continued pages - Prevent duplicates in bucket, namespace, and table listings - Fail fast on listing errors during bucket and namespace deletion - Stop swallowing errors in handleListTables and return proper HTTP error responses	2026-01-28 10:36:28 -08:00
Chris Lu	33da87452b	Refine S3 Tables implementation to address code review feedback - Standardize namespace representation to []string - Improve listing logic with pagination and StartFromFileName - Enhance error handling with sentinel errors and robust checks - Add JSON encoding error logging - Fix CI workflow to use gofmt -l - Standardize timestamps in directory creation - Validate single-level namespaces	2026-01-28 10:04:27 -08:00
Chris Lu	b30631c3b5	s3tables: propagate request context to filer operations	2026-01-28 09:38:01 -08:00
Chris Lu	ef3873b616	s3tables: add error handling for json.Marshal calls - Add error handling in handler_namespace.go (metadata marshaling) - Add error handling in handler_table.go (metadata and tags marshaling) - Add error handling in handler_policy.go (tag marshaling in TagResource and UntagResource) - Return proper errors with context instead of silently ignoring failures	2026-01-28 01:13:42 -08:00
Chris Lu	3b1920cf43	s3tables: add handler_ prefix to operation handler files - Rename bucket_create.go → handler_bucket_create.go - Rename bucket_get_list_delete.go → handler_bucket_get_list_delete.go - Rename namespace.go → handler_namespace.go - Rename table.go → handler_table.go - Rename policy.go → handler_policy.go Improves file organization by clearly identifying handler implementations. No code changes, refactoring only.	2026-01-28 01:00:00 -08:00

23 Commits