Commit Graph

2 Commits

Author SHA1 Message Date
Chris Lu
36c469e34e Enforce IAM for S3 Tables bucket creation (#8388)
* Enforce IAM for s3tables bucket creation

* Prefer IAM path when policies exist

* Ensure IAM enforcement honors default allow

* address comments

* Reused the precomputed principal when setting tableBucketMetadata.OwnerAccountID, avoiding the redundant getAccountID call.

* get identity

* fix

* dedup

* fix

* comments

* fix tests

* update iam config

* go fmt

* fix ports

* fix flags

* mini clean shutdown

* Revert "update iam config"

This reverts commit ca48fdbb0afa45657823d98657556c0bbf24f239.

Revert "mini clean shutdown"

This reverts commit 9e17f6baffd5dd7cc404d831d18dd618b9fe5049.

Revert "fix flags"

This reverts commit e9e7b29d2f77ee5cb82147d50621255410695ee3.

Revert "go fmt"

This reverts commit bd3241960b1d9484b7900190773b0ecb3f762c9a.

* test/s3tables: share single weed mini per test package via TestMain

Previously each top-level test function in the catalog and s3tables
package started and stopped its own weed mini instance. This caused
failures when a prior instance wasn't cleanly stopped before the next
one started (port conflicts, leaked global state).

Changes:
- catalog/iceberg_catalog_test.go: introduce TestMain that starts one
  shared TestEnvironment (external weed binary) before all tests and
  tears it down after. All individual test functions now use sharedEnv.
  Added randomSuffix() for unique resource names across tests.
- catalog/pyiceberg_test.go: updated to use sharedEnv instead of
  per-test environments.
- catalog/pyiceberg_test_helpers.go -> pyiceberg_test_helpers_test.go:
  renamed to a _test.go file so it can access TestEnvironment which is
  defined in a test file.
- table-buckets/setup.go: add package-level sharedCluster variable.
- table-buckets/s3tables_integration_test.go: introduce TestMain that
  starts one shared TestCluster before all tests. TestS3TablesIntegration
  now uses sharedCluster. Extract startMiniClusterInDir (no *testing.T)
  for TestMain use. TestS3TablesCreateBucketIAMPolicy keeps its own
  cluster (different IAM config). Remove miniClusterMutex (no longer
  needed). Fix Stop() to not panic when t is nil."

* delete

* parse

* default allow should work with anonymous

* fix port

* iceberg route

The failures are from Iceberg REST using the default bucket warehouse when no prefix is provided. Your tests create random buckets, so /v1/namespaces was looking in warehouse and failing. I updated the tests to use the prefixed Iceberg routes (/v1/{bucket}/...) via a small helper.

* test(s3tables): fix port conflicts and IAM ARN matching in integration tests

- Pass -master.dir explicitly to prevent filer store directory collision
  between shared cluster and per-test clusters running in the same process
- Pass -volume.port.public and -volume.publicUrl to prevent the global
  publicPort flag (mutated from 0 → concrete port by first cluster) from
  being reused by a second cluster, causing 'address already in use'
- Remove the flag-reset loop in Stop() that reset global flag values while
  other goroutines were reading them (race → panic)
- Fix IAM policy Resource ARN in TestS3TablesCreateBucketIAMPolicy to use
  wildcards (arn:aws:s3tables:*:*:bucket/<name>) because the handler
  generates ARNs with its own DefaultRegion (us-east-1) and principal name
  ('admin'), not the test constants testRegion/testAccountID
2026-02-19 22:52:05 -08:00
Chris Lu
2bb21ea276 feat: Add Iceberg REST Catalog server and admin UI (#8175)
* feat: Add Iceberg REST Catalog server

Implement Iceberg REST Catalog API on a separate port (default 8181)
that exposes S3 Tables metadata through the Apache Iceberg REST protocol.

- Add new weed/s3api/iceberg package with REST handlers
- Implement /v1/config endpoint returning catalog configuration
- Implement namespace endpoints (list/create/get/head/delete)
- Implement table endpoints (list/create/load/head/delete/update)
- Add -port.iceberg flag to S3 standalone server (s3.go)
- Add -s3.port.iceberg flag to combined server mode (server.go)
- Add -s3.port.iceberg flag to mini cluster mode (mini.go)
- Support prefix-based routing for multiple catalogs

The Iceberg REST server reuses S3 Tables metadata storage under
/table-buckets and enables DuckDB, Spark, and other Iceberg clients
to connect to SeaweedFS as a catalog.

* feat: Add Iceberg Catalog pages to admin UI

Add admin UI pages to browse Iceberg catalogs, namespaces, and tables.

- Add Iceberg Catalog menu item under Object Store navigation
- Create iceberg_catalog.templ showing catalog overview with REST info
- Create iceberg_namespaces.templ listing namespaces in a catalog
- Create iceberg_tables.templ listing tables in a namespace
- Add handlers and routes in admin_handlers.go
- Add Iceberg data provider methods in s3tables_management.go
- Add Iceberg data types in types.go

The Iceberg Catalog pages provide visibility into the same S3 Tables
data through an Iceberg-centric lens, including REST endpoint examples
for DuckDB and PyIceberg.

* test: Add Iceberg catalog integration tests and reorg s3tables tests

- Reorganize existing s3tables tests to test/s3tables/table-buckets/
- Add new test/s3tables/catalog/ for Iceberg REST catalog tests
- Add TestIcebergConfig to verify /v1/config endpoint
- Add TestIcebergNamespaces to verify namespace listing
- Add TestDuckDBIntegration for DuckDB connectivity (requires Docker)
- Update CI workflow to use new test paths

* fix: Generate proper random UUIDs for Iceberg tables

Address code review feedback:
- Replace placeholder UUID with crypto/rand-based UUID v4 generation
- Add detailed TODO comments for handleUpdateTable stub explaining
  the required atomic metadata swap implementation

* fix: Serve Iceberg on localhost listener when binding to different interface

Address code review feedback: properly serve the localhost listener
when the Iceberg server is bound to a non-localhost interface.

* ci: Add Iceberg catalog integration tests to CI

Add new job to run Iceberg catalog tests in CI, along with:
- Iceberg package build verification
- Iceberg unit tests
- Iceberg go vet checks
- Iceberg format checks

* fix: Address code review feedback for Iceberg implementation

- fix: Replace hardcoded account ID with s3_constants.AccountAdminId in buildTableBucketARN()
- fix: Improve UUID generation error handling with deterministic fallback (timestamp + PID + counter)
- fix: Update handleUpdateTable to return HTTP 501 Not Implemented instead of fake success
- fix: Better error handling in handleNamespaceExists to distinguish 404 from 500 errors
- fix: Use relative URL in template instead of hardcoded localhost:8181
- fix: Add HTTP timeout to test's waitForService function to avoid hangs
- fix: Use dynamic ephemeral ports in integration tests to avoid flaky parallel failures
- fix: Add Iceberg port to final port configuration logging in mini.go

* fix: Address critical issues in Iceberg implementation

- fix: Cache table UUIDs to ensure persistence across LoadTable calls
  The UUID now remains stable for the lifetime of the server session.
  TODO: For production, UUIDs should be persisted in S3 Tables metadata.

- fix: Remove redundant URL-encoded namespace parsing
  mux router already decodes %1F to \x1F before passing to handlers.
  Redundant ReplaceAll call could cause bugs with literal %1F in namespace.

* fix: Improve test robustness and reduce code duplication

- fix: Make DuckDB test more robust by failing on unexpected errors
  Instead of silently logging errors, now explicitly check for expected
  conditions (extension not available) and skip the test appropriately.

- fix: Extract username helper method to reduce duplication
  Created getUsername() helper in AdminHandlers to avoid duplicating
  the username retrieval logic across Iceberg page handlers.

* fix: Add mutex protection to table UUID cache

Protects concurrent access to the tableUUIDs map with sync.RWMutex.
Uses read-lock for fast path when UUID already cached, and write-lock
for generating new UUIDs. Includes double-check pattern to handle race
condition between read-unlock and write-lock.

* style: fix go fmt errors

* feat(iceberg): persist table UUID in S3 Tables metadata

* feat(admin): configure Iceberg port in Admin UI and commands

* refactor: address review comments (flags, tests, handlers)

- command/mini: fix tracking of explicit s3.port.iceberg flag
- command/admin: add explicit -iceberg.port flag
- admin/handlers: reuse getUsername helper
- tests: use 127.0.0.1 for ephemeral ports and os.Stat for file size check

* test: check error from FileStat in verify_gc_empty_test
2026-02-02 23:12:13 -08:00