feat: Add Iceberg REST Catalog server and admin UI (#8175)
* feat: Add Iceberg REST Catalog server

  Implement Iceberg REST Catalog API on a separate port (default 8181) that exposes S3 Tables metadata through the Apache Iceberg REST protocol.

  - Add new weed/s3api/iceberg package with REST handlers
  - Implement /v1/config endpoint returning catalog configuration
  - Implement namespace endpoints (list/create/get/head/delete)
  - Implement table endpoints (list/create/load/head/delete/update)
  - Add -port.iceberg flag to S3 standalone server (s3.go)
  - Add -s3.port.iceberg flag to combined server mode (server.go)
  - Add -s3.port.iceberg flag to mini cluster mode (mini.go)
  - Support prefix-based routing for multiple catalogs

  The Iceberg REST server reuses S3 Tables metadata storage under /table-buckets and enables DuckDB, Spark, and other Iceberg clients to connect to SeaweedFS as a catalog.

* feat: Add Iceberg Catalog pages to admin UI

  Add admin UI pages to browse Iceberg catalogs, namespaces, and tables.

  - Add Iceberg Catalog menu item under Object Store navigation
  - Create iceberg_catalog.templ showing catalog overview with REST info
  - Create iceberg_namespaces.templ listing namespaces in a catalog
  - Create iceberg_tables.templ listing tables in a namespace
  - Add handlers and routes in admin_handlers.go
  - Add Iceberg data provider methods in s3tables_management.go
  - Add Iceberg data types in types.go

  The Iceberg Catalog pages provide visibility into the same S3 Tables data through an Iceberg-centric lens, including REST endpoint examples for DuckDB and PyIceberg.

* test: Add Iceberg catalog integration tests and reorg s3tables tests

  - Reorganize existing s3tables tests to test/s3tables/table-buckets/
  - Add new test/s3tables/catalog/ for Iceberg REST catalog tests
  - Add TestIcebergConfig to verify /v1/config endpoint
  - Add TestIcebergNamespaces to verify namespace listing
  - Add TestDuckDBIntegration for DuckDB connectivity (requires Docker)
  - Update CI workflow to use new test paths

* fix: Generate proper random UUIDs for Iceberg tables

  Address code review feedback:

  - Replace placeholder UUID with crypto/rand-based UUID v4 generation
  - Add detailed TODO comments for handleUpdateTable stub explaining the required atomic metadata swap implementation

* fix: Serve Iceberg on localhost listener when binding to different interface

  Address code review feedback: properly serve the localhost listener when the Iceberg server is bound to a non-localhost interface.

* ci: Add Iceberg catalog integration tests to CI

  Add new job to run Iceberg catalog tests in CI, along with:

  - Iceberg package build verification
  - Iceberg unit tests
  - Iceberg go vet checks
  - Iceberg format checks

* fix: Address code review feedback for Iceberg implementation

  - fix: Replace hardcoded account ID with s3_constants.AccountAdminId in buildTableBucketARN()
  - fix: Improve UUID generation error handling with deterministic fallback (timestamp + PID + counter)
  - fix: Update handleUpdateTable to return HTTP 501 Not Implemented instead of fake success
  - fix: Better error handling in handleNamespaceExists to distinguish 404 from 500 errors
  - fix: Use relative URL in template instead of hardcoded localhost:8181
  - fix: Add HTTP timeout to test's waitForService function to avoid hangs
  - fix: Use dynamic ephemeral ports in integration tests to avoid flaky parallel failures
  - fix: Add Iceberg port to final port configuration logging in mini.go

* fix: Address critical issues in Iceberg implementation

  - fix: Cache table UUIDs to ensure persistence across LoadTable calls

    The UUID now remains stable for the lifetime of the server session. TODO: For production, UUIDs should be persisted in S3 Tables metadata.

  - fix: Remove redundant URL-encoded namespace parsing

    mux router already decodes %1F to \x1F before passing to handlers. Redundant ReplaceAll call could cause bugs with literal %1F in namespace.

* fix: Improve test robustness and reduce code duplication

  - fix: Make DuckDB test more robust by failing on unexpected errors

    Instead of silently logging errors, now explicitly check for expected conditions (extension not available) and skip the test appropriately.

  - fix: Extract username helper method to reduce duplication

    Created getUsername() helper in AdminHandlers to avoid duplicating the username retrieval logic across Iceberg page handlers.

* fix: Add mutex protection to table UUID cache

  Protects concurrent access to the tableUUIDs map with sync.RWMutex. Uses read-lock for fast path when UUID already cached, and write-lock for generating new UUIDs. Includes double-check pattern to handle race condition between read-unlock and write-lock.

* style: fix go fmt errors

* feat(iceberg): persist table UUID in S3 Tables metadata

* feat(admin): configure Iceberg port in Admin UI and commands

* refactor: address review comments (flags, tests, handlers)

  - command/mini: fix tracking of explicit s3.port.iceberg flag
  - command/admin: add explicit -iceberg.port flag
  - admin/handlers: reuse getUsername helper
  - tests: use 127.0.0.1 for ephemeral ports and os.Stat for file size check

* test: check error from FileStat in verify_gc_empty_test
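
The UUID items above (crypto/rand-based v4 generation, then an RWMutex-protected cache with a double-check) can be illustrated with a small standalone sketch. This is not the code from this commit: the names `newUUIDv4`, `uuidCache`, and the `"bucket/ns/table"` key are hypothetical, and the deterministic fallback mentioned in the message is left out.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"sync"
)

// newUUIDv4 returns a random RFC 4122 version 4 UUID as a 36-character string.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version nibble to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// uuidCache is a hypothetical stand-in for the per-table UUID cache.
type uuidCache struct {
	mu  sync.RWMutex
	ids map[string]string
}

func newUUIDCache() *uuidCache {
	return &uuidCache{ids: make(map[string]string)}
}

// get returns the cached UUID for key, generating one on first use.
func (c *uuidCache) get(key string) (string, error) {
	// Fast path: a read lock is enough when the UUID already exists.
	c.mu.RLock()
	id, ok := c.ids[key]
	c.mu.RUnlock()
	if ok {
		return id, nil
	}

	// Slow path: take the write lock, then check again, because another
	// goroutine may have stored a UUID between RUnlock and Lock.
	c.mu.Lock()
	defer c.mu.Unlock()
	if id, ok := c.ids[key]; ok {
		return id, nil
	}
	id, err := newUUIDv4()
	if err != nil {
		return "", err
	}
	c.ids[key] = id
	return id, nil
}

func main() {
	c := newUUIDCache()
	first, err := c.get("bucket/ns/table")
	if err != nil {
		panic(err)
	}
	second, _ := c.get("bucket/ns/table")
	fmt.Println(first, first == second)
}
```

A cache like this only keeps the UUID stable within one process, which is why a later item in this message persists the UUID in S3 Tables metadata instead.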
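
The namespace fix in the message (the router already decodes %1F, so a second ReplaceAll is harmful) is easier to see with a concrete input. The sketch below uses only the standard library; `decodeSegment`, `parseNamespace`, and `parseNamespaceDoubleDecode` are hypothetical names, and `url.PathUnescape` stands in for the decoding that the router performs.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// nsSep is the unit separator that joins the levels of a multi-level namespace.
const nsSep = "\x1f"

// decodeSegment stands in for the single decoding pass done by the router.
func decodeSegment(raw string) (string, error) {
	return url.PathUnescape(raw)
}

// parseNamespace splits an already-decoded path segment into namespace levels.
func parseNamespace(decoded string) []string {
	return strings.Split(decoded, nsSep)
}

// parseNamespaceDoubleDecode reproduces the redundant ReplaceAll step
// that the commit message describes as removed.
func parseNamespaceDoubleDecode(decoded string) []string {
	return strings.Split(strings.ReplaceAll(decoded, "%1F", nsSep), nsSep)
}

func main() {
	// "a%1Fb" on the wire is the two-level namespace a, b.
	seg, _ := decodeSegment("a%1Fb")
	fmt.Printf("%q\n", parseNamespace(seg))

	// "x%251Fy" on the wire is ONE level whose name contains a literal "%1F".
	lit, _ := decodeSegment("x%251Fy")
	fmt.Printf("%q\n", parseNamespace(lit))             // stays one level
	fmt.Printf("%q\n", parseNamespaceDoubleDecode(lit)) // wrongly split in two
}
```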
@@ -42,6 +42,7 @@ type AdminOptions struct {
 	readOnlyUser *string
 	readOnlyPassword *string
 	dataDir *string
+	icebergPort *int
 }
 
 func init() {
@@ -56,6 +57,7 @@ func init() {
 	a.adminPassword = cmdAdmin.Flag.String("adminPassword", "", "admin interface password (if empty, auth is disabled)")
 	a.readOnlyUser = cmdAdmin.Flag.String("readOnlyUser", "", "read-only user username (optional, for view-only access)")
 	a.readOnlyPassword = cmdAdmin.Flag.String("readOnlyPassword", "", "read-only user password (optional, for view-only access; requires adminPassword to be set)")
+	a.icebergPort = cmdAdmin.Flag.Int("iceberg.port", 8181, "Iceberg REST Catalog port (0 to hide in UI)")
 }
 
 var cmdAdmin = &Command{
@@ -211,7 +213,7 @@ func runAdmin(cmd *Command, args []string) bool {
 	}()
 
 	// Start the admin server with all masters (UI enabled by default)
-	err := startAdminServer(ctx, a, true)
+	err := startAdminServer(ctx, a, true, *a.icebergPort)
 	if err != nil {
 		fmt.Printf("Admin server error: %v\n", err)
 		return false
@@ -222,7 +224,7 @@ func runAdmin(cmd *Command, args []string) bool {
 }
 
 // startAdminServer starts the actual admin server
-func startAdminServer(ctx context.Context, options AdminOptions, enableUI bool) error {
+func startAdminServer(ctx context.Context, options AdminOptions, enableUI bool, icebergPort int) error {
 	// Set Gin mode
 	gin.SetMode(gin.ReleaseMode)
 
@@ -281,7 +283,7 @@ func startAdminServer(ctx context.Context, options AdminOptions, enableUI bool)
 	}
 
 	// Create admin server
-	adminServer := dash.NewAdminServer(*options.master, nil, dataDir)
+	adminServer := dash.NewAdminServer(*options.master, nil, dataDir, icebergPort)
 
 	// Show discovered filers
 	filers := adminServer.GetAllFilers()
@@ -222,6 +222,7 @@ func initMiniS3Flags() {
 	miniS3Options.port = cmdMini.Flag.Int("s3.port", 8333, "s3 server http listen port")
 	miniS3Options.portHttps = cmdMini.Flag.Int("s3.port.https", 0, "s3 server https listen port")
 	miniS3Options.portGrpc = cmdMini.Flag.Int("s3.port.grpc", 0, "s3 server grpc listen port")
+	miniS3Options.portIceberg = cmdMini.Flag.Int("s3.port.iceberg", 8181, "Iceberg REST Catalog server listen port (0 to disable)")
 	miniS3Options.domainName = cmdMini.Flag.String("s3.domainName", "", "suffix of the host name in comma separated list, {bucket}.{domainName}")
 	miniS3Options.allowedOrigins = cmdMini.Flag.String("s3.allowedOrigins", "*", "comma separated list of allowed origins")
 	miniS3Options.tlsPrivateKey = cmdMini.Flag.String("s3.key.file", "", "path to the TLS private key file")
@@ -463,6 +464,14 @@ func ensureAllPortsAvailableOnIP(bindIp string) error {
 			flagName string
 			grpcPtr *int
 		}{miniS3Options.port, "S3", "s3.port", miniS3Options.portGrpc})
+		if miniS3Options.portIceberg != nil && *miniS3Options.portIceberg > 0 {
+			portConfigs = append(portConfigs, struct {
+				port *int
+				name string
+				flagName string
+				grpcPtr *int
+			}{miniS3Options.portIceberg, "Iceberg", "s3.port.iceberg", nil})
+		}
 	}
 	portConfigs = append(portConfigs, struct {
 		port *int
@@ -510,9 +519,13 @@ func ensureAllPortsAvailableOnIP(bindIp string) error {
 	initializeGrpcPortsOnIP(bindIp)
 
 	// Log the final port configuration
-	glog.Infof("Final port configuration - Master: %d, Filer: %d, Volume: %d, S3: %d, WebDAV: %d, Admin: %d",
+	icebergPortStr := "disabled"
+	if miniS3Options.portIceberg != nil && *miniS3Options.portIceberg > 0 {
+		icebergPortStr = fmt.Sprintf("%d", *miniS3Options.portIceberg)
+	}
+	glog.Infof("Final port configuration - Master: %d, Filer: %d, Volume: %d, S3: %d, Iceberg: %s, WebDAV: %d, Admin: %d",
 		*miniMasterOptions.port, *miniFilerOptions.port, *miniOptions.v.port,
-		*miniS3Options.port, *miniWebDavOptions.port, *miniAdminOptions.port)
+		*miniS3Options.port, icebergPortStr, *miniWebDavOptions.port, *miniAdminOptions.port)
 
 	// Log gRPC ports too (now finalized)
 	glog.Infof("gRPC port configuration - Master: %d, Filer: %d, Volume: %d, S3: %d, Admin: %d",
@@ -704,7 +717,7 @@ func runMini(cmd *Command, args []string) bool {
 	// Capture which port flags were explicitly passed on CLI BEFORE config file is applied
 	// This is necessary to distinguish user-specified ports from defaults or config file options
 	explicitPortFlags = make(map[string]bool)
-	portFlagNames := []string{"master.port", "filer.port", "volume.port", "s3.port", "webdav.port", "admin.port", "s3.iam.readOnly"}
+	portFlagNames := []string{"master.port", "filer.port", "volume.port", "s3.port", "s3.port.iceberg", "webdav.port", "admin.port", "s3.iam.readOnly"}
 	for _, flagName := range portFlagNames {
 		explicitPortFlags[flagName] = isFlagPassed(flagName)
 	}
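
The hunk above relies on telling an explicitly passed flag apart from one left at its default. The body of `isFlagPassed` is not part of this diff, so the following is only a sketch of one common way to do it with the standard `flag` package, using `FlagSet.Visit`, which visits only flags that were set. The names `flagPassed` and `explicitFlags` are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
)

// flagPassed reports whether name was set explicitly on the command line.
// Visit only walks flags that were actually set, so defaults are skipped.
func flagPassed(fs *flag.FlagSet, name string) bool {
	found := false
	fs.Visit(func(f *flag.Flag) {
		if f.Name == name {
			found = true
		}
	})
	return found
}

// explicitFlags parses args against two example port flags and records,
// for each requested name, whether it was passed explicitly.
func explicitFlags(args []string, names ...string) (map[string]bool, error) {
	fs := flag.NewFlagSet("mini", flag.ContinueOnError)
	fs.Int("s3.port", 8333, "s3 server http listen port")
	fs.Int("s3.port.iceberg", 8181, "Iceberg REST Catalog server listen port (0 to disable)")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	explicit := make(map[string]bool, len(names))
	for _, n := range names {
		explicit[n] = flagPassed(fs, n)
	}
	return explicit, nil
}

func main() {
	explicit, err := explicitFlags([]string{"-s3.port.iceberg=9191"}, "s3.port", "s3.port.iceberg")
	if err != nil {
		panic(err)
	}
	fmt.Println(explicit["s3.port"], explicit["s3.port.iceberg"])
}
```

With a check like this, leaving `s3.port.iceberg` out of the list of tracked names would make an explicitly chosen Iceberg port look the same as the default 8181.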
@@ -982,7 +995,11 @@ func startMiniAdminWithWorker(allServicesReady chan struct{}) {
 
 	// Start admin server in background
 	go func() {
-		if err := startAdminServer(ctx, miniAdminOptions, *miniEnableAdminUI); err != nil {
+		var icebergPort int
+		if miniS3Options.portIceberg != nil {
+			icebergPort = *miniS3Options.portIceberg
+		}
+		if err := startAdminServer(ctx, miniAdminOptions, *miniEnableAdminUI, icebergPort); err != nil {
 			glog.Errorf("Admin server error: %v", err)
 		}
 	}()
@@ -23,6 +23,7 @@ import (
 	"github.com/seaweedfs/seaweedfs/weed/pb/filer_pb"
 	"github.com/seaweedfs/seaweedfs/weed/pb/s3_pb"
 	"github.com/seaweedfs/seaweedfs/weed/s3api"
+	"github.com/seaweedfs/seaweedfs/weed/s3api/iceberg"
 	"github.com/seaweedfs/seaweedfs/weed/s3api/s3err"
 	"github.com/seaweedfs/seaweedfs/weed/security"
 	stats_collect "github.com/seaweedfs/seaweedfs/weed/stats"
@@ -41,6 +42,7 @@ type S3Options struct {
 	port *int
 	portHttps *int
 	portGrpc *int
+	portIceberg *int
 	config *string
 	iamConfig *string
 	domainName *string
@@ -74,6 +76,7 @@ func init() {
 	s3StandaloneOptions.port = cmdS3.Flag.Int("port", 8333, "s3 server http listen port")
 	s3StandaloneOptions.portHttps = cmdS3.Flag.Int("port.https", 0, "s3 server https listen port")
 	s3StandaloneOptions.portGrpc = cmdS3.Flag.Int("port.grpc", 0, "s3 server grpc listen port")
+	s3StandaloneOptions.portIceberg = cmdS3.Flag.Int("port.iceberg", 8181, "Iceberg REST Catalog server listen port (0 to disable)")
 	s3StandaloneOptions.domainName = cmdS3.Flag.String("domainName", "", "suffix of the host name in comma separated list, {bucket}.{domainName}")
 	s3StandaloneOptions.allowedOrigins = cmdS3.Flag.String("allowedOrigins", "*", "comma separated list of allowed origins")
 	s3StandaloneOptions.dataCenter = cmdS3.Flag.String("dataCenter", "", "prefer to read and write to volumes in this data center")
@@ -312,6 +315,11 @@ func (s3opt *S3Options) startS3Server() bool {
 	}
 	defer s3ApiServer.Shutdown()
 
+	// Start Iceberg REST Catalog server if enabled
+	if *s3opt.portIceberg > 0 {
+		go s3opt.startIcebergServer(s3ApiServer)
+	}
+
 	if runtime.GOOS != "windows" {
 		localSocket := *s3opt.localSocket
 		if localSocket == "" {
@@ -464,3 +472,40 @@ func (s3opt *S3Options) startS3Server() bool {
 	return true
 
 }
+
+// startIcebergServer starts the Iceberg REST Catalog server on a separate port.
+func (s3opt *S3Options) startIcebergServer(s3ApiServer *s3api.S3ApiServer) {
+	icebergRouter := mux.NewRouter().SkipClean(true)
+
+	// Create Iceberg server using the S3ApiServer as filer client
+	icebergServer := iceberg.NewServer(s3ApiServer)
+	icebergServer.RegisterRoutes(icebergRouter)
+
+	listenAddress := fmt.Sprintf("%s:%d", *s3opt.bindIp, *s3opt.portIceberg)
+	icebergListener, icebergLocalListener, err := util.NewIpAndLocalListeners(
+		*s3opt.bindIp, *s3opt.portIceberg, time.Duration(*s3opt.idleTimeout)*time.Second)
+	if err != nil {
+		glog.Fatalf("Iceberg REST Catalog listener on %s error: %v", listenAddress, err)
+	}
+
+	glog.V(0).Infof("Start Iceberg REST Catalog Server at http://%s", listenAddress)
+
+	httpS := newHttpServer(icebergRouter, nil)
+	if MiniClusterCtx != nil {
+		go func() {
+			<-MiniClusterCtx.Done()
+			httpS.Shutdown(context.Background())
+		}()
+	}
+	// Serve on localhost as well if we're bound to a different interface
+	if icebergLocalListener != nil {
+		go func() {
+			if err := httpS.Serve(icebergLocalListener); err != nil && err != http.ErrServerClosed {
+				glog.V(0).Infof("Iceberg localhost listener error: %v", err)
+			}
+		}()
+	}
+	if err = httpS.Serve(icebergListener); err != nil && err != http.ErrServerClosed {
+		glog.Fatalf("Iceberg REST Catalog Server Fail to serve: %v", err)
+	}
+}
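
startIcebergServer above serves one HTTP server on two listeners (the bound interface plus localhost) and treats http.ErrServerClosed as a normal shutdown. The same pattern can be tried in isolation with the standard library alone. In this sketch `serveOnAll`, `fetch`, and `demo` are hypothetical names, and both listeners use 127.0.0.1 with ephemeral ports purely so the example runs anywhere; it does not use `util.NewIpAndLocalListeners` or `newHttpServer`.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

// serveOnAll serves srv on every listener in its own goroutine and
// reports only unexpected errors (ErrServerClosed means a clean shutdown).
func serveOnAll(srv *http.Server, listeners ...net.Listener) <-chan error {
	errs := make(chan error, len(listeners))
	for _, l := range listeners {
		go func(l net.Listener) {
			if err := srv.Serve(l); err != nil && err != http.ErrServerClosed {
				errs <- err
			}
		}(l)
	}
	return errs
}

// fetch performs a GET with a timeout and returns the response body.
func fetch(url string) (string, error) {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// demo starts one server on two loopback listeners, queries both, then shuts down.
func demo() ([]string, error) {
	primary, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	local, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		primary.Close()
		return nil, err
	}

	mux := http.NewServeMux()
	mux.HandleFunc("/ping", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "pong")
	})
	srv := &http.Server{Handler: mux}
	serveOnAll(srv, primary, local)
	// Shutdown closes both listeners; each Serve call then returns ErrServerClosed.
	defer srv.Shutdown(context.Background())

	var bodies []string
	for _, l := range []net.Listener{primary, local} {
		body, err := fetch("http://" + l.Addr().String() + "/ping")
		if err != nil {
			return nil, err
		}
		bodies = append(bodies, body)
	}
	return bodies, nil
}

func main() {
	bodies, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(bodies)
}
```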
@@ -158,6 +158,7 @@ func init() {
 	s3Options.port = cmdServer.Flag.Int("s3.port", 8333, "s3 server http listen port")
 	s3Options.portHttps = cmdServer.Flag.Int("s3.port.https", 0, "s3 server https listen port")
 	s3Options.portGrpc = cmdServer.Flag.Int("s3.port.grpc", 0, "s3 server grpc listen port")
+	s3Options.portIceberg = cmdServer.Flag.Int("s3.port.iceberg", 8181, "Iceberg REST Catalog server listen port (0 to disable)")
 	s3Options.domainName = cmdServer.Flag.String("s3.domainName", "", "suffix of the host name in comma separated list, {bucket}.{domainName}")
 	s3Options.allowedOrigins = cmdServer.Flag.String("s3.allowedOrigins", "*", "comma separated list of allowed origins")
 	s3Options.tlsPrivateKey = cmdServer.Flag.String("s3.key.file", "", "path to the TLS private key file")
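
Once the catalog listens on the port chosen by the flag above, the commit message says /v1/config returns the catalog configuration. The handler itself is not shown in this diff, so the following is only a sketch of the minimal response shape, assuming the `defaults` and `overrides` objects from the Iceberg REST specification. `catalogConfig`, `handleConfig`, and `fetchConfig` are hypothetical names, and the handler is exercised in-process with `httptest` rather than on port 8181.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// catalogConfig models the two objects a client expects from GET /v1/config.
type catalogConfig struct {
	Defaults  map[string]string `json:"defaults"`
	Overrides map[string]string `json:"overrides"`
}

// handleConfig answers GET /v1/config with an empty configuration.
func handleConfig(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodGet {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	// Empty maps (not nil) so the JSON contains {} rather than null.
	json.NewEncoder(w).Encode(catalogConfig{
		Defaults:  map[string]string{},
		Overrides: map[string]string{},
	})
}

// fetchConfig calls the handler in-process and decodes its response.
func fetchConfig(method string) (int, catalogConfig, error) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(method, "/v1/config", nil)
	handleConfig(rec, req)
	var cfg catalogConfig
	if rec.Code != http.StatusOK {
		return rec.Code, cfg, nil
	}
	err := json.NewDecoder(rec.Body).Decode(&cfg)
	return rec.Code, cfg, err
}

func main() {
	code, cfg, err := fetchConfig(http.MethodGet)
	if err != nil {
		panic(err)
	}
	fmt.Println(code, len(cfg.Defaults), len(cfg.Overrides))
}
```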