feat: Add Iceberg REST Catalog server and admin UI (#8175)

* feat: Add Iceberg REST Catalog server

Implement Iceberg REST Catalog API on a separate port (default 8181)
that exposes S3 Tables metadata through the Apache Iceberg REST protocol.

- Add new weed/s3api/iceberg package with REST handlers
- Implement /v1/config endpoint returning catalog configuration
- Implement namespace endpoints (list/create/get/head/delete)
- Implement table endpoints (list/create/load/head/delete/update)
- Add -port.iceberg flag to S3 standalone server (s3.go)
- Add -s3.port.iceberg flag to combined server mode (server.go)
- Add -s3.port.iceberg flag to mini cluster mode (mini.go)
- Support prefix-based routing for multiple catalogs

The Iceberg REST server reuses S3 Tables metadata storage under
/table-buckets and enables DuckDB, Spark, and other Iceberg clients
to connect to SeaweedFS as a catalog.

* feat: Add Iceberg Catalog pages to admin UI

Add admin UI pages to browse Iceberg catalogs, namespaces, and tables.

- Add Iceberg Catalog menu item under Object Store navigation
- Create iceberg_catalog.templ showing catalog overview with REST info
- Create iceberg_namespaces.templ listing namespaces in a catalog
- Create iceberg_tables.templ listing tables in a namespace
- Add handlers and routes in admin_handlers.go
- Add Iceberg data provider methods in s3tables_management.go
- Add Iceberg data types in types.go

The Iceberg Catalog pages provide visibility into the same S3 Tables
data through an Iceberg-centric lens, including REST endpoint examples
for DuckDB and PyIceberg.

* test: Add Iceberg catalog integration tests and reorganize s3tables tests

- Reorganize existing s3tables tests to test/s3tables/table-buckets/
- Add new test/s3tables/catalog/ for Iceberg REST catalog tests
- Add TestIcebergConfig to verify /v1/config endpoint
- Add TestIcebergNamespaces to verify namespace listing
- Add TestDuckDBIntegration for DuckDB connectivity (requires Docker)
- Update CI workflow to use new test paths

* fix: Generate proper random UUIDs for Iceberg tables

Address code review feedback:
- Replace placeholder UUID with crypto/rand-based UUID v4 generation
- Add detailed TODO comments for handleUpdateTable stub explaining
  the required atomic metadata swap implementation
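
A crypto/rand-based UUID v4 generator of the kind described above can be sketched as follows. This is an assumption about the general technique, not a copy of the committed code; `newUUIDv4` is a hypothetical name.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUIDv4 returns a random RFC 4122 version 4 UUID in the canonical
// 8-4-4-4-12 lowercase hex form.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version nibble to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits (10xx)
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	id, err := newUUIDv4()
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```

Returning the error instead of hiding it leaves the caller free to decide on a fallback strategy.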

* fix: Serve Iceberg on localhost listener when binding to a different interface

Address code review feedback: properly serve the localhost listener
when the Iceberg server is bound to a non-localhost interface.

* ci: Add Iceberg catalog integration tests to CI

Add new job to run Iceberg catalog tests in CI, along with:
- Iceberg package build verification
- Iceberg unit tests
- Iceberg go vet checks
- Iceberg format checks

* fix: Address code review feedback for Iceberg implementation

- fix: Replace hardcoded account ID with s3_constants.AccountAdminId in buildTableBucketARN()
- fix: Improve UUID generation error handling with deterministic fallback (timestamp + PID + counter)
- fix: Update handleUpdateTable to return HTTP 501 Not Implemented instead of fake success
- fix: Better error handling in handleNamespaceExists to distinguish 404 from 500 errors
- fix: Use relative URL in template instead of hardcoded localhost:8181
- fix: Add HTTP timeout to test's waitForService function to avoid hangs
- fix: Use dynamic ephemeral ports in integration tests to avoid flaky parallel failures
- fix: Add Iceberg port to final port configuration logging in mini.go

* fix: Address critical issues in Iceberg implementation

- fix: Cache table UUIDs so repeated LoadTable calls return the same value
  The cache is held in memory, so a UUID stays stable for the lifetime of
  the server session but not across restarts.
  TODO: For production, UUIDs should be persisted in S3 Tables metadata.

- fix: Remove redundant URL-encoded namespace parsing
  mux router already decodes %1F to \x1F before passing to handlers.
  Redundant ReplaceAll call could cause bugs with literal %1F in namespace.

* fix: Improve test robustness and reduce code duplication

- fix: Make DuckDB test more robust by failing on unexpected errors
  Instead of silently logging errors, now explicitly check for expected
  conditions (extension not available) and skip the test appropriately.

- fix: Extract username helper method to reduce duplication
  Created getUsername() helper in AdminHandlers to avoid duplicating
  the username retrieval logic across Iceberg page handlers.

* fix: Add mutex protection to table UUID cache

Protects concurrent access to the tableUUIDs map with sync.RWMutex.
Uses read-lock for fast path when UUID already cached, and write-lock
for generating new UUIDs. Includes double-check pattern to handle race
condition between read-unlock and write-lock.

* style: fix go fmt errors

* feat(iceberg): persist table UUID in S3 Tables metadata

* feat(admin): configure Iceberg port in Admin UI and commands

* refactor: address review comments (flags, tests, handlers)

- command/mini: fix tracking of explicit s3.port.iceberg flag
- command/admin: add explicit -iceberg.port flag
- admin/handlers: reuse getUsername helper
- tests: use 127.0.0.1 for ephemeral ports and os.Stat for file size check

* test: check error from FileStat in verify_gc_empty_test
Chris Lu
2026-02-02 23:12:13 -08:00
committed by GitHub
parent 330bd92ddc
commit 2bb21ea276
59 changed files with 3436 additions and 818 deletions


@@ -33,7 +33,7 @@ jobs:
- name: Run S3 Tables Integration Tests
timeout-minutes: 25
-working-directory: test/s3tables
+working-directory: test/s3tables/table-buckets
run: |
set -x
set -o pipefail
@@ -51,7 +51,7 @@ jobs:
- name: Show test output on failure
if: failure()
-working-directory: test/s3tables
+working-directory: test/s3tables/table-buckets
run: |
echo "=== Test Output ==="
if [ -f test-output.log ]; then
@@ -66,7 +66,64 @@ jobs:
uses: actions/upload-artifact@v6
with:
name: s3-tables-test-logs
-path: test/s3tables/test-output.log
+path: test/s3tables/table-buckets/test-output.log
retention-days: 3
iceberg-catalog-tests:
name: Iceberg Catalog Integration Tests
runs-on: ubuntu-22.04
timeout-minutes: 30
steps:
- name: Check out code
uses: actions/checkout@v6
- name: Set up Go
uses: actions/setup-go@v6
with:
go-version-file: 'go.mod'
id: go
- name: Install SeaweedFS
run: |
go install -buildvcs=false ./weed
- name: Run Iceberg Catalog Integration Tests
timeout-minutes: 25
working-directory: test/s3tables/catalog
run: |
set -x
set -o pipefail
echo "=== System Information ==="
uname -a
free -h
df -h
echo "=== Starting Iceberg Catalog Tests ==="
# Run Iceberg catalog integration tests
go test -v -timeout 20m . 2>&1 | tee test-output.log || {
echo "Iceberg catalog integration tests failed"
exit 1
}
- name: Show test output on failure
if: failure()
working-directory: test/s3tables/catalog
run: |
echo "=== Test Output ==="
if [ -f test-output.log ]; then
tail -200 test-output.log
fi
echo "=== Process information ==="
ps aux | grep -E "(weed|test|docker)" || true
- name: Upload test logs on failure
if: failure()
uses: actions/upload-artifact@v6
with:
name: iceberg-catalog-test-logs
path: test/s3tables/catalog/test-output.log
retention-days: 3
s3-tables-build-verification:
@@ -114,6 +171,26 @@ jobs:
}
echo "S3 Tables unit tests passed"
- name: Verify Iceberg Package Builds
run: |
set -x
echo "=== Building Iceberg package ==="
go build ./weed/s3api/iceberg || {
echo "Iceberg package build failed"
exit 1
}
echo "Iceberg package built successfully"
- name: Run Go Tests for Iceberg Package
run: |
set -x
echo "=== Running Go unit tests for Iceberg ==="
go test -v -race -timeout 5m ./weed/s3api/iceberg/... || {
echo "Iceberg unit tests failed"
exit 1
}
echo "Iceberg unit tests passed"
s3-tables-fmt-check:
name: S3 Tables Format Check
runs-on: ubuntu-22.04
@@ -153,6 +230,18 @@ jobs:
fi
echo "All S3 Tables test files are properly formatted"
- name: Check Iceberg Format
run: |
set -x
echo "=== Checking Iceberg Go format ==="
unformatted=$(gofmt -l ./weed/s3api/iceberg)
if [ -n "$unformatted" ]; then
echo "Go format check failed for Iceberg - files need formatting"
echo "$unformatted"
exit 1
fi
echo "All Iceberg files are properly formatted"
s3-tables-vet:
name: S3 Tables Go Vet Check
runs-on: ubuntu-22.04
@@ -187,3 +276,13 @@ jobs:
exit 1
}
echo "go vet checks passed for tests"
- name: Run Go Vet on Iceberg
run: |
set -x
echo "=== Running go vet on Iceberg package ==="
go vet ./weed/s3api/iceberg/... || {
echo "go vet check failed for Iceberg"
exit 1
}
echo "go vet checks passed for Iceberg"