feat: Add Iceberg REST Catalog server and admin UI (#8175)

* feat: Add Iceberg REST Catalog server

Implement Iceberg REST Catalog API on a separate port (default 8181) that exposes S3 Tables metadata through the Apache Iceberg REST protocol.

- Add new weed/s3api/iceberg package with REST handlers
- Implement /v1/config endpoint returning catalog configuration
- Implement namespace endpoints (list/create/get/head/delete)
- Implement table endpoints (list/create/load/head/delete/update)
- Add -port.iceberg flag to S3 standalone server (s3.go)
- Add -s3.port.iceberg flag to combined server mode (server.go)
- Add -s3.port.iceberg flag to mini cluster mode (mini.go)
- Support prefix-based routing for multiple catalogs

The Iceberg REST server reuses S3 Tables metadata storage under /table-buckets and enables DuckDB, Spark, and other Iceberg clients to connect to SeaweedFS as a catalog.

* feat: Add Iceberg Catalog pages to admin UI

Add admin UI pages to browse Iceberg catalogs, namespaces, and tables.

- Add Iceberg Catalog menu item under Object Store navigation
- Create iceberg_catalog.templ showing catalog overview with REST info
- Create iceberg_namespaces.templ listing namespaces in a catalog
- Create iceberg_tables.templ listing tables in a namespace
- Add handlers and routes in admin_handlers.go
- Add Iceberg data provider methods in s3tables_management.go
- Add Iceberg data types in types.go

The Iceberg Catalog pages provide visibility into the same S3 Tables data through an Iceberg-centric lens, including REST endpoint examples for DuckDB and PyIceberg.

* test: Add Iceberg catalog integration tests and reorg s3tables tests

- Reorganize existing s3tables tests to test/s3tables/table-buckets/
- Add new test/s3tables/catalog/ for Iceberg REST catalog tests
- Add TestIcebergConfig to verify /v1/config endpoint
- Add TestIcebergNamespaces to verify namespace listing
- Add TestDuckDBIntegration for DuckDB connectivity (requires Docker)
- Update CI workflow to use new test paths

* fix: Generate proper random UUIDs for Iceberg tables

Address code review feedback:

- Replace placeholder UUID with crypto/rand-based UUID v4 generation
- Add detailed TODO comments for handleUpdateTable stub explaining the required atomic metadata swap implementation

* fix: Serve Iceberg on localhost listener when binding to different interface

Address code review feedback: properly serve the localhost listener when the Iceberg server is bound to a non-localhost interface.

* ci: Add Iceberg catalog integration tests to CI

Add new job to run Iceberg catalog tests in CI, along with:

- Iceberg package build verification
- Iceberg unit tests
- Iceberg go vet checks
- Iceberg format checks

* fix: Address code review feedback for Iceberg implementation

- fix: Replace hardcoded account ID with s3_constants.AccountAdminId in buildTableBucketARN()
- fix: Improve UUID generation error handling with deterministic fallback (timestamp + PID + counter)
- fix: Update handleUpdateTable to return HTTP 501 Not Implemented instead of fake success
- fix: Better error handling in handleNamespaceExists to distinguish 404 from 500 errors
- fix: Use relative URL in template instead of hardcoded localhost:8181
- fix: Add HTTP timeout to test's waitForService function to avoid hangs
- fix: Use dynamic ephemeral ports in integration tests to avoid flaky parallel failures
- fix: Add Iceberg port to final port configuration logging in mini.go

* fix: Address critical issues in Iceberg implementation

- fix: Cache table UUIDs to ensure persistence across LoadTable calls
  The UUID now remains stable for the lifetime of the server session.
  TODO: For production, UUIDs should be persisted in S3 Tables metadata.
- fix: Remove redundant URL-encoded namespace parsing
  mux router already decodes %1F to \x1F before passing to handlers.
  Redundant ReplaceAll call could cause bugs with literal %1F in namespace.

* fix: Improve test robustness and reduce code duplication

- fix: Make DuckDB test more robust by failing on unexpected errors
  Instead of silently logging errors, now explicitly check for expected conditions (extension not available) and skip the test appropriately.
- fix: Extract username helper method to reduce duplication
  Created getUsername() helper in AdminHandlers to avoid duplicating the username retrieval logic across Iceberg page handlers.

* fix: Add mutex protection to table UUID cache

Protects concurrent access to the tableUUIDs map with sync.RWMutex. Uses read-lock for fast path when UUID already cached, and write-lock for generating new UUIDs. Includes double-check pattern to handle race condition between read-unlock and write-lock.

* style: fix go fmt errors

* feat(iceberg): persist table UUID in S3 Tables metadata

* feat(admin): configure Iceberg port in Admin UI and commands

* refactor: address review comments (flags, tests, handlers)

- command/mini: fix tracking of explicit s3.port.iceberg flag
- command/admin: add explicit -iceberg.port flag
- admin/handlers: reuse getUsername helper
- tests: use 127.0.0.1 for ephemeral ports and os.Stat for file size check

* test: check error from FileStat in verify_gc_empty_test
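
Illustration of the intermediate UUID cache described in the message above (crypto/rand UUID v4, an RWMutex read-lock fast path, and a double-check under the write lock). This is only a minimal sketch: the names tableUUIDCache, getOrCreate, and newUUIDv4 are hypothetical and are not taken from the weed/s3api/iceberg package, and a later change in this same commit persists the UUID in S3 Tables metadata instead of relying on a session cache.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"sync"
)

// newUUIDv4 returns a random version 4 UUID string built from crypto/rand.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version nibble to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// tableUUIDCache is a hypothetical in-memory cache keyed by table path.
type tableUUIDCache struct {
	mu    sync.RWMutex
	uuids map[string]string
}

func newTableUUIDCache() *tableUUIDCache {
	return &tableUUIDCache{uuids: make(map[string]string)}
}

// getOrCreate returns the cached UUID for key, generating one on first use.
func (c *tableUUIDCache) getOrCreate(key string) (string, error) {
	// Fast path: many readers can hold the read lock at once.
	c.mu.RLock()
	id, ok := c.uuids[key]
	c.mu.RUnlock()
	if ok {
		return id, nil
	}

	// Slow path: take the write lock, then check again, because another
	// goroutine may have stored a UUID between RUnlock and Lock.
	c.mu.Lock()
	defer c.mu.Unlock()
	if id, ok := c.uuids[key]; ok {
		return id, nil
	}
	id, err := newUUIDv4()
	if err != nil {
		return "", err
	}
	c.uuids[key] = id
	return id, nil
}

func main() {
	cache := newTableUUIDCache()
	results := make([]string, 8)

	var wg sync.WaitGroup
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// The error is ignored here for brevity; real code should handle it.
			results[i], _ = cache.getOrCreate("bucket/ns/table")
		}(i)
	}
	wg.Wait()

	stable := true
	for _, r := range results {
		if r != results[0] {
			stable = false
		}
	}
	fmt.Println("all callers saw the same UUID:", stable)
}
```

Without the second lookup under the write lock, two goroutines that both miss on the read path could each generate a UUID, and the later writer would overwrite the value already handed out to the first caller.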
2026-02-02 23:12:13 -08:00
parent 330bd92ddc
commit 2bb21ea276
59 changed files with 3436 additions and 818 deletions
--- a/weed/storage/verify_gc_empty_test.go
+++ b/weed/storage/verify_gc_empty_test.go
@@ -0,0 +1,74 @@
+package storage
+
+import (
+	"os"
+	"testing"
+
+	"github.com/seaweedfs/seaweedfs/weed/storage/needle"
+	"github.com/seaweedfs/seaweedfs/weed/storage/super_block"
+)
+
+func TestCompactionToEmpty(t *testing.T) {
+	dir := t.TempDir()
+
+	// 1. Create a new volume
+	v, err := NewVolume(dir, dir, "", 678, NeedleMapInMemory, &super_block.ReplicaPlacement{}, &needle.TTL{}, 0, needle.GetCurrentVersion(), 0, 0)
+	if err != nil {
+		t.Fatalf("volume creation: %v", err)
+	}
+	defer v.Close()
+
+	// 2. Write a few needles
+	numNeedles := 5
+	for i := 1; i <= numNeedles; i++ {
+		n := newRandomNeedle(uint64(i))
+		_, _, _, err := v.writeNeedle2(n, true, false)
+		if err != nil {
+			t.Fatalf("write needle %d: %v", i, err)
+		}
+	}
+
+	// 3. Delete all of them
+	for i := 1; i <= numNeedles; i++ {
+		n := newEmptyNeedle(uint64(i))
+		_, err := v.deleteNeedle2(n)
+		if err != nil {
+			t.Fatalf("delete needle %d: %v", i, err)
+		}
+	}
+
+	// 4. Run compaction
+	err = v.Compact2(0, 0, nil)
+	if err != nil {
+		t.Fatalf("compaction: %v", err)
+	}
+
+	// 5. Commit compaction
+	err = v.CommitCompact()
+	if err != nil {
+		t.Fatalf("commit compaction: %v", err)
+	}
+
+	// 6. Verify the resulting .dat file size
+	datSize, _, err := v.FileStat()
+	if err != nil {
+		t.Fatalf("file stat: %v", err)
+	}
+	if datSize != super_block.SuperBlockSize {
+		t.Errorf("expected dat file size %d, got %d", super_block.SuperBlockSize, datSize)
+	}
+
+	// 7. Verify the needle map reports no live entries
+	if v.nm.FileCount() != 0 {
+		t.Errorf("expected 0 files in needle map, got %d", v.nm.FileCount())
+	}
+
+	// 8. Verify the .dat file on disk exists and holds only the superblock
+	info, err := os.Stat(v.FileName(".dat"))
+	if err != nil {
+		t.Fatalf("stat dat file: %v", err)
+	}
+	if info.Size() != int64(super_block.SuperBlockSize) {
+		t.Fatalf("dat file physical size mismatch: expected %d, got %d", super_block.SuperBlockSize, info.Size())
+	}
+}