s3tables: redesign Iceberg REST Catalog using iceberg-go and automate integration tests (#8197)
* full integration with iceberg-go

* Table Commit Operations (handleUpdateTable)

* s3tables: fix Iceberg v2 compliance and namespace properties

This commit ensures SeaweedFS Iceberg REST Catalog is compliant with Iceberg Format Version 2 by:
- Using iceberg-go's table.NewMetadataWithUUID for strict v2 compliance.
- Explicitly initializing namespace properties to empty maps.
- Removing omitempty from required Iceberg response fields.
- Fixing CommitTableRequest unmarshaling using table.Requirements and table.Updates.

* s3tables: automate Iceberg integration tests

- Added Makefile for local test execution and cluster management.
- Added docker-compose for PyIceberg compatibility kit.
- Added Go integration test harness for PyIceberg.
- Updated GitHub CI to run Iceberg catalog tests automatically.

* s3tables: update PyIceberg test suite for compatibility

- Updated test_rest_catalog.py to use latest PyIceberg transaction APIs.
- Updated Dockerfile to include pyarrow and pandas dependencies.
- Improved namespace and table handling in integration tests.

* s3tables: address review feedback on Iceberg Catalog

- Implemented robust metadata version parsing and incrementing.
- Ensured table metadata changes are persisted during commit (handleUpdateTable).
- Standardized namespace property initialization for consistency.
- Fixed unused variable and incorrect struct field build errors.

* s3tables: finalize Iceberg REST Catalog and optimize tests

- Implemented robust metadata versioning and persistence.
- Standardized namespace property initialization.
- Optimized integration tests using pre-built Docker image.
- Added strict property persistence validation to test suite.
- Fixed build errors from previous partial updates.

* Address PR review: fix Table UUID stability, implement S3Tables UpdateTable, and support full metadata persistence individually

* fix: Iceberg catalog stable UUIDs, metadata persistence, and file writing

- Ensure table UUIDs are stable (do not regenerate on load).
- Persist full table metadata (Iceberg JSON) in s3tables extended attributes.
- Add `MetadataVersion` to explicitly track version numbers, replacing regex parsing.
- Implement `saveMetadataFile` to persist metadata JSON files to the Filer on commit.
- Update `CreateTable` and `UpdateTable` handlers to use the new logic.

* test: bind weed mini to 0.0.0.0 in integration tests to fix Docker connectivity

* Iceberg: fix metadata handling in REST catalog

- Add nil guard in createTable
- Fix updateTable to correctly load existing metadata from storage
- Ensure full metadata persistence on updates
- Populate loadTable result with parsed metadata

* S3Tables: add auth checks and fix response fields in UpdateTable

- Add CheckPermissionWithContext to UpdateTable handler
- Include TableARN and MetadataLocation in UpdateTable response
- Use ErrCodeConflict (409) for version token mismatches

* Tests: improve Iceberg catalog test infrastructure and cleanup

- Makefile: use PID file for precise process killing
- test_rest_catalog.py: remove unused variables and fix f-strings

* Iceberg: fix variable shadowing in UpdateTable

- Rename inner loop variable `req` to `requirement` to avoid shadowing outer request variable

* S3Tables: simplify MetadataVersion initialization

- Use `max(req.MetadataVersion, 1)` instead of anonymous function

* Tests: remove unicode characters from S3 tables integration test logs

- Remove unicode checkmarks from test output for cleaner logs

* Iceberg: improve metadata persistence robustness

- Fix MetadataLocation in LoadTableResult to fallback to generated location
- Improve saveMetadataFile to ensure directory hierarchy existence and robust error handling
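
The version handling described in the message above (an explicit `MetadataVersion` that is floored at 1 and incremented on commit, with a 409 conflict on version token mismatch) can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `tableState`, `commit`, `normalizeMetadataVersion`, `metadataFileName`, `errConflict`, and the `v%d.metadata.json` naming scheme are all hypothetical names and assumptions introduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the 409 Conflict that the commit message says is
// returned on version token mismatches. Hypothetical name.
var errConflict = errors.New("version token mismatch (409 Conflict)")

// tableState is a hypothetical, trimmed-down view of the stored table record.
type tableState struct {
	VersionToken    string
	MetadataVersion int
}

// normalizeMetadataVersion floors the version at 1, which is the same effect
// as the max(req.MetadataVersion, 1) expression mentioned in the message.
func normalizeMetadataVersion(v int) int {
	if v < 1 {
		return 1
	}
	return v
}

// metadataFileName builds a file name from the explicit version number
// instead of parsing it back out of a path with a regex.
// The naming scheme is an assumption made for this sketch.
func metadataFileName(version int) string {
	return fmt.Sprintf("v%d.metadata.json", version)
}

// commit applies optimistic concurrency: the caller must present the token it
// last read; on success the version is incremented and a new token is stored.
func commit(cur tableState, expectedToken, newToken string) (tableState, error) {
	if expectedToken != cur.VersionToken {
		return cur, errConflict
	}
	return tableState{
		VersionToken:    newToken,
		MetadataVersion: normalizeMetadataVersion(cur.MetadataVersion) + 1,
	}, nil
}

func main() {
	cur := tableState{VersionToken: "token-a"} // MetadataVersion unset (0)
	next, err := commit(cur, "token-a", "token-b")
	fmt.Println(next.MetadataVersion, metadataFileName(next.MetadataVersion), err)

	// A second writer still holding the old token is rejected.
	_, err = commit(next, "token-a", "token-c")
	fmt.Println(err)
}
```

Storing the counter explicitly means the next metadata file name can be derived without inspecting existing paths, which appears to be the motivation for replacing regex parsing.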
@@ -79,18 +79,18 @@ func parseTableFromARN(arn string) (bucketName, namespace, tableName string, err
 
 // Path helpers
 
-// getTableBucketPath returns the filer path for a table bucket
-func getTableBucketPath(bucketName string) string {
+// GetTableBucketPath returns the filer path for a table bucket
+func GetTableBucketPath(bucketName string) string {
 	return path.Join(TablesPath, bucketName)
 }
 
-// getNamespacePath returns the filer path for a namespace
-func getNamespacePath(bucketName, namespace string) string {
+// GetNamespacePath returns the filer path for a namespace
+func GetNamespacePath(bucketName, namespace string) string {
 	return path.Join(TablesPath, bucketName, namespace)
 }
 
-// getTablePath returns the filer path for a table
-func getTablePath(bucketName, namespace, tableName string) string {
+// GetTablePath returns the filer path for a table
+func GetTablePath(bucketName, namespace, tableName string) string {
 	return path.Join(TablesPath, bucketName, namespace, tableName)
 }
 
@@ -118,6 +118,7 @@ type tableMetadataInternal struct {
 	ModifiedAt time.Time `json:"modifiedAt"`
 	OwnerAccountID string `json:"ownerAccountId"`
 	VersionToken string `json:"versionToken"`
+	MetadataVersion int `json:"metadataVersion"`
 	MetadataLocation string `json:"metadataLocation,omitempty"`
 	Metadata *TableMetadata `json:"metadata,omitempty"`
 }