feat: Add Iceberg REST Catalog server and admin UI (#8175)
* feat: Add Iceberg REST Catalog server

Implement Iceberg REST Catalog API on a separate port (default 8181) that
exposes S3 Tables metadata through the Apache Iceberg REST protocol.

- Add new weed/s3api/iceberg package with REST handlers
- Implement /v1/config endpoint returning catalog configuration
- Implement namespace endpoints (list/create/get/head/delete)
- Implement table endpoints (list/create/load/head/delete/update)
- Add -port.iceberg flag to S3 standalone server (s3.go)
- Add -s3.port.iceberg flag to combined server mode (server.go)
- Add -s3.port.iceberg flag to mini cluster mode (mini.go)
- Support prefix-based routing for multiple catalogs

The Iceberg REST server reuses S3 Tables metadata storage under
/table-buckets and enables DuckDB, Spark, and other Iceberg clients to
connect to SeaweedFS as a catalog.

* feat: Add Iceberg Catalog pages to admin UI

Add admin UI pages to browse Iceberg catalogs, namespaces, and tables.

- Add Iceberg Catalog menu item under Object Store navigation
- Create iceberg_catalog.templ showing catalog overview with REST info
- Create iceberg_namespaces.templ listing namespaces in a catalog
- Create iceberg_tables.templ listing tables in a namespace
- Add handlers and routes in admin_handlers.go
- Add Iceberg data provider methods in s3tables_management.go
- Add Iceberg data types in types.go

The Iceberg Catalog pages provide visibility into the same S3 Tables data
through an Iceberg-centric lens, including REST endpoint examples for
DuckDB and PyIceberg.

* test: Add Iceberg catalog integration tests and reorg s3tables tests

- Reorganize existing s3tables tests to test/s3tables/table-buckets/
- Add new test/s3tables/catalog/ for Iceberg REST catalog tests
- Add TestIcebergConfig to verify /v1/config endpoint
- Add TestIcebergNamespaces to verify namespace listing
- Add TestDuckDBIntegration for DuckDB connectivity (requires Docker)
- Update CI workflow to use new test paths

* fix: Generate proper random UUIDs for Iceberg tables

Address code review feedback:

- Replace placeholder UUID with crypto/rand-based UUID v4 generation
- Add detailed TODO comments for handleUpdateTable stub explaining the
  required atomic metadata swap implementation

* fix: Serve Iceberg on localhost listener when binding to different interface

Address code review feedback: properly serve the localhost listener when
the Iceberg server is bound to a non-localhost interface.

* ci: Add Iceberg catalog integration tests to CI

Add new job to run Iceberg catalog tests in CI, along with:

- Iceberg package build verification
- Iceberg unit tests
- Iceberg go vet checks
- Iceberg format checks

* fix: Address code review feedback for Iceberg implementation

- fix: Replace hardcoded account ID with s3_constants.AccountAdminId in
  buildTableBucketARN()
- fix: Improve UUID generation error handling with deterministic fallback
  (timestamp + PID + counter)
- fix: Update handleUpdateTable to return HTTP 501 Not Implemented instead
  of fake success
- fix: Better error handling in handleNamespaceExists to distinguish 404
  from 500 errors
- fix: Use relative URL in template instead of hardcoded localhost:8181
- fix: Add HTTP timeout to test's waitForService function to avoid hangs
- fix: Use dynamic ephemeral ports in integration tests to avoid flaky
  parallel failures
- fix: Add Iceberg port to final port configuration logging in mini.go

* fix: Address critical issues in Iceberg implementation

- fix: Cache table UUIDs to ensure persistence across LoadTable calls
  The UUID now remains stable for the lifetime of the server session.
  TODO: For production, UUIDs should be persisted in S3 Tables metadata.
- fix: Remove redundant URL-encoded namespace parsing
  mux router already decodes %1F to \x1F before passing to handlers.
  Redundant ReplaceAll call could cause bugs with literal %1F in namespace.

* fix: Improve test robustness and reduce code duplication

- fix: Make DuckDB test more robust by failing on unexpected errors
  Instead of silently logging errors, now explicitly check for expected
  conditions (extension not available) and skip the test appropriately.
- fix: Extract username helper method to reduce duplication
  Created getUsername() helper in AdminHandlers to avoid duplicating the
  username retrieval logic across Iceberg page handlers.

* fix: Add mutex protection to table UUID cache

Protects concurrent access to the tableUUIDs map with sync.RWMutex. Uses
read-lock for fast path when UUID already cached, and write-lock for
generating new UUIDs. Includes double-check pattern to handle race
condition between read-unlock and write-lock.

* style: fix go fmt errors

* feat(iceberg): persist table UUID in S3 Tables metadata

* feat(admin): configure Iceberg port in Admin UI and commands

* refactor: address review comments (flags, tests, handlers)

- command/mini: fix tracking of explicit s3.port.iceberg flag
- command/admin: add explicit -iceberg.port flag
- admin/handlers: reuse getUsername helper
- tests: use 127.0.0.1 for ephemeral ports and os.Stat for file size check

* test: check error from FileStat in verify_gc_empty_test
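
The UUID-related fixes listed above (crypto/rand-based UUID v4 generation, a cache guarded by sync.RWMutex, and the double-check between read-unlock and write-lock) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this commit: the names uuidCache, newUUIDv4, and getOrCreate are hypothetical, and the deterministic fallback mentioned above is left out.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"sync"
)

// newUUIDv4 returns a random version 4 UUID string built from crypto/rand.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version nibble to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// uuidCache is a hypothetical stand-in for the tableUUIDs map and its mutex.
type uuidCache struct {
	mu    sync.RWMutex
	uuids map[string]string
}

func newUUIDCache() *uuidCache {
	return &uuidCache{uuids: make(map[string]string)}
}

// getOrCreate returns the cached UUID for key, generating one on first use.
func (c *uuidCache) getOrCreate(key string) (string, error) {
	// Fast path: read lock only.
	c.mu.RLock()
	id, ok := c.uuids[key]
	c.mu.RUnlock()
	if ok {
		return id, nil
	}

	// Slow path: take the write lock, then check again, because another
	// goroutine may have stored a UUID between RUnlock and Lock.
	c.mu.Lock()
	defer c.mu.Unlock()
	if existing, found := c.uuids[key]; found {
		return existing, nil
	}
	id, err := newUUIDv4()
	if err != nil {
		return "", err
	}
	c.uuids[key] = id
	return id, nil
}

func main() {
	cache := newUUIDCache()
	first, _ := cache.getOrCreate("bucket/ns/table")
	second, _ := cache.getOrCreate("bucket/ns/table")
	fmt.Println(first == second) // the UUID is stable across calls
}
```

A cache like this only keeps UUIDs stable for the lifetime of the process, which is why a later item in this message persists the UUID in S3 Tables metadata instead.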
@@ -104,11 +104,12 @@ type AdminServer struct {
 	collectionStatsCacheThreshold time.Duration
 
 	s3TablesManager *s3tables.Manager
+	icebergPort     int
 }
 
 // Type definitions moved to types.go
 
-func NewAdminServer(masters string, templateFS http.FileSystem, dataDir string) *AdminServer {
+func NewAdminServer(masters string, templateFS http.FileSystem, dataDir string, icebergPort int) *AdminServer {
 	grpcDialOption := security.LoadClientTLS(util.GetViper(), "grpc.admin")
 
 	// Create master client with multiple master support
@@ -136,6 +137,7 @@ func NewAdminServer(masters string, templateFS http.FileSystem, dataDir string)
 		configPersistence:             NewConfigPersistence(dataDir),
 		collectionStatsCacheThreshold: 30 * time.Second,
 		s3TablesManager:               newS3TablesManager(),
+		icebergPort:                   icebergPort,
 	}
 
 	// Initialize topic retention purger
@@ -166,6 +166,85 @@ func (s *AdminServer) GetS3TablesTablesData(ctx context.Context, bucketArn, name
 	}, nil
 }
 
+// Iceberg Catalog data providers
+
+// GetIcebergCatalogData returns the Iceberg catalog overview data.
+// Each S3 Table Bucket is exposed as an Iceberg catalog.
+func (s *AdminServer) GetIcebergCatalogData(ctx context.Context) (IcebergCatalogData, error) {
+	bucketsData, err := s.GetS3TablesBucketsData(ctx)
+	if err != nil {
+		return IcebergCatalogData{}, err
+	}
+
+	catalogs := make([]IcebergCatalogInfo, 0, len(bucketsData.Buckets))
+	for _, bucket := range bucketsData.Buckets {
+		catalogs = append(catalogs, IcebergCatalogInfo{
+			Name:           bucket.Name,
+			ARN:            bucket.ARN,
+			OwnerAccountID: bucket.OwnerAccountID,
+			CreatedAt:      bucket.CreatedAt,
+		})
+	}
+
+	return IcebergCatalogData{
+		Catalogs:      catalogs,
+		TotalCatalogs: len(catalogs),
+		IcebergPort:   s.icebergPort, // Use the port passed to AdminServer
+		LastUpdated:   time.Now(),
+	}, nil
+}
+
+// GetIcebergNamespacesData returns namespaces for an Iceberg catalog.
+func (s *AdminServer) GetIcebergNamespacesData(ctx context.Context, catalogName, bucketArn string) (IcebergNamespacesData, error) {
+	nsData, err := s.GetS3TablesNamespacesData(ctx, bucketArn)
+	if err != nil {
+		return IcebergNamespacesData{}, err
+	}
+
+	namespaces := make([]IcebergNamespaceInfo, 0, len(nsData.Namespaces))
+	for _, ns := range nsData.Namespaces {
+		name := ""
+		if len(ns.Namespace) > 0 {
+			name = strings.Join(ns.Namespace, ".")
+		}
+		namespaces = append(namespaces, IcebergNamespaceInfo{
+			Name:      name,
+			CreatedAt: ns.CreatedAt,
+		})
+	}
+
+	return IcebergNamespacesData{
+		CatalogName:     catalogName,
+		Namespaces:      namespaces,
+		TotalNamespaces: len(namespaces),
+		LastUpdated:     time.Now(),
+	}, nil
+}
+
+// GetIcebergTablesData returns tables for an Iceberg namespace.
+func (s *AdminServer) GetIcebergTablesData(ctx context.Context, catalogName, bucketArn, namespace string) (IcebergTablesData, error) {
+	tablesData, err := s.GetS3TablesTablesData(ctx, bucketArn, namespace)
+	if err != nil {
+		return IcebergTablesData{}, err
+	}
+
+	tables := make([]IcebergTableInfo, 0, len(tablesData.Tables))
+	for _, t := range tablesData.Tables {
+		tables = append(tables, IcebergTableInfo{
+			Name:      t.Name,
+			CreatedAt: t.CreatedAt,
+		})
+	}
+
+	return IcebergTablesData{
+		CatalogName:   catalogName,
+		NamespaceName: namespace,
+		Tables:        tables,
+		TotalTables:   len(tables),
+		LastUpdated:   time.Now(),
+	}, nil
+}
+
 // API handlers
 
 func (s *AdminServer) ListS3TablesBucketsAPI(c *gin.Context) {
@@ -596,3 +596,46 @@ type STSConfigData struct {
 	Providers   []string  `json:"providers,omitempty"`
 	LastUpdated time.Time `json:"last_updated"`
 }
+
+// Iceberg Catalog types
+type IcebergCatalogInfo struct {
+	Name           string    `json:"name"`
+	ARN            string    `json:"arn"`
+	OwnerAccountID string    `json:"owner_account_id"`
+	CreatedAt      time.Time `json:"created_at"`
+}
+
+type IcebergCatalogData struct {
+	Username      string               `json:"username"`
+	Catalogs      []IcebergCatalogInfo `json:"catalogs"`
+	TotalCatalogs int                  `json:"total_catalogs"`
+	IcebergPort   int                  `json:"iceberg_port"`
+	LastUpdated   time.Time            `json:"last_updated"`
+}
+
+type IcebergNamespaceInfo struct {
+	Name      string    `json:"name"`
+	CreatedAt time.Time `json:"created_at"`
+}
+
+type IcebergNamespacesData struct {
+	Username        string                 `json:"username"`
+	CatalogName     string                 `json:"catalog_name"`
+	Namespaces      []IcebergNamespaceInfo `json:"namespaces"`
+	TotalNamespaces int                    `json:"total_namespaces"`
+	LastUpdated     time.Time              `json:"last_updated"`
+}
+
+type IcebergTableInfo struct {
+	Name      string    `json:"name"`
+	CreatedAt time.Time `json:"created_at"`
+}
+
+type IcebergTablesData struct {
+	Username      string             `json:"username"`
+	CatalogName   string             `json:"catalog_name"`
+	NamespaceName string             `json:"namespace_name"`
+	Tables        []IcebergTableInfo `json:"tables"`
+	TotalTables   int                `json:"total_tables"`
+	LastUpdated   time.Time          `json:"last_updated"`
+}
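
GetIcebergNamespacesData joins the levels of a multi-level namespace with "." for display in the admin UI, while the commit message notes that the REST path carries the levels separated by the unit separator (%1F, already decoded to \x1F by the router). A minimal sketch of the two representations follows; the helper names displayName and parsePathNamespace are hypothetical and do not appear in this commit.

```go
package main

import (
	"fmt"
	"strings"
)

// displayName mirrors the strings.Join(ns.Namespace, ".") call in the diff:
// namespace levels are shown dot-separated, and an empty namespace yields "".
func displayName(levels []string) string {
	return strings.Join(levels, ".")
}

// parsePathNamespace (hypothetical) splits an already-decoded path segment
// into namespace levels on the unit separator \x1F.
func parsePathNamespace(decoded string) []string {
	if decoded == "" {
		return nil
	}
	return strings.Split(decoded, "\x1f")
}

func main() {
	levels := parsePathNamespace("analytics\x1fevents")
	fmt.Println(displayName(levels)) // prints "analytics.events"
}
```

Because the display form uses ".", a namespace level that itself contains a dot cannot be told apart from two levels in the UI; the \x1F form stays unambiguous.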