seaweedFS/weed/s3api/s3tables/iceberg_layout_test.go
Chris Lu 5a0204310c Add Iceberg admin UI (#8246)
* Add Iceberg table details view

* Enhance Iceberg catalog browsing UI

* Fix Iceberg UI security and logic issues

- Fix selectSchema() and partitionFieldsFromFullMetadata() to always search for matching IDs instead of checking != 0
- Fix snapshotsFromFullMetadata() to defensive-copy before sorting to prevent mutating caller's slice
- Fix XSS vulnerabilities in s3tables.js: replace innerHTML with textContent/createElement for user-controlled data
- Fix deleteIcebergTable() to redirect to namespace tables list on details page instead of reloading
- Fix data-bs-target in iceberg_namespaces.templ: remove templ.SafeURL for CSS selector
- Add catalogName to delete modal data attributes for proper redirect
- Remove unused hidden inputs from create table form (icebergTableBucketArn, icebergTableNamespace)
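
The defensive-copy fix for snapshotsFromFullMetadata() can be illustrated with a minimal sketch. The `snapshot` type, the `sortedSnapshots` name, and the newest-first ordering are assumptions made for illustration; the actual function in this commit may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// snapshot is a hypothetical stand-in for the snapshot record shown in the UI.
type snapshot struct {
	ID          int64
	TimestampMs int64
}

// sortedSnapshots copies the input before sorting, so the caller's slice
// keeps its original order. Newest-first ordering is an assumption.
func sortedSnapshots(in []snapshot) []snapshot {
	out := make([]snapshot, len(in))
	copy(out, in)
	sort.Slice(out, func(i, j int) bool {
		return out[i].TimestampMs > out[j].TimestampMs
	})
	return out
}

func main() {
	orig := []snapshot{{ID: 1, TimestampMs: 100}, {ID: 2, TimestampMs: 300}, {ID: 3, TimestampMs: 200}}
	sorted := sortedSnapshots(orig)
	// The first element of orig is unchanged; sorted starts with the newest snapshot.
	fmt.Println(orig[0].ID, sorted[0].ID)
}
```

Sorting `in` directly would reorder the backing array shared with the caller, which is the mutation the bullet above describes.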

* Regenerate templ files for Iceberg UI updates

* Support complex Iceberg type objects in schema

Change Type field from string to json.RawMessage in both IcebergSchemaFieldInfo
and internal icebergSchemaField to properly handle Iceberg spec's complex type
objects (e.g. {"type": "struct", "fields": [...]}). Currently test data
only shows primitive string types, but this change makes the implementation
defensively robust for future complex types by preserving the exact JSON
representation. Add typeToString() helper and update schema extraction
functions to marshal string types as JSON. Update template to convert
json.RawMessage to string for display.
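
A rough sketch of how a typeToString() helper over json.RawMessage might behave, assuming primitive types arrive as JSON strings and anything else is shown as '(complex)' (as a later entry in this message describes). The body below is illustrative only and is not taken from the commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// typeToString is a sketch of the helper named in the commit message.
// A JSON string such as "long" is returned unquoted; any other JSON value
// (for example a struct type object) is reported as "(complex)".
func typeToString(raw json.RawMessage) string {
	if len(raw) == 0 {
		return ""
	}
	var s string
	if err := json.Unmarshal(raw, &s); err == nil {
		return s
	}
	return "(complex)"
}

func main() {
	fmt.Println(typeToString(json.RawMessage(`"long"`)))
	fmt.Println(typeToString(json.RawMessage(`{"type":"struct","fields":[]}`)))
}
```

Keeping the field as json.RawMessage preserves the exact JSON from the metadata file, so the display decision is deferred to this one helper.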

* Regenerate templ files for Type field changes

* templ

* Fix additional Iceberg UI issues from code review

- Fix lazy-load flag that was set before async operation completed, preventing retries
  on error; now sets loaded flag only after successful load and throws error to caller
  for proper error handling and UI updates
- Add zero-time guards for CreatedAt and ModifiedAt fields in table details to avoid
  displaying Go zero-time values; render dash when time is zero
- Add URL path escaping for all catalog/namespace/table names in URLs to prevent
  malformed URLs when names contain special characters like /, ?, or #
- Remove redundant innerHTML clear in loadIcebergNamespaceTables that cleared twice
  before appending the table list
- Fix selectSnapshotForMetrics to remove != 0 guard for consistency with selectSchema
  fix; now always searches for CurrentSnapshotID without zero-value gate
- Enhance typeToString() helper to display '(complex)' for non-primitive JSON types
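
The zero-time guard and URL path escaping described above could look roughly like the following. Only the CreatedAt and ModifiedAt field names come from this message; the `tableDetails` type, the helper names, the time format, and the route layout are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"time"
)

// tableDetails is a hypothetical view model for the table details page.
type tableDetails struct {
	CreatedAt  time.Time
	ModifiedAt time.Time
}

// displayTime renders a dash for Go's zero time instead of a year-1 timestamp.
func displayTime(t time.Time) string {
	if t.IsZero() {
		return "-"
	}
	return t.Format("2006-01-02 15:04:05")
}

// tableDetailsURL escapes each name as a single path segment, so characters
// such as /, ?, and # cannot change the URL structure. The route is assumed.
func tableDetailsURL(catalog, namespace, table string) string {
	return "/catalogs/" + url.PathEscape(catalog) +
		"/namespaces/" + url.PathEscape(namespace) +
		"/tables/" + url.PathEscape(table)
}

func main() {
	var d tableDetails // both times are zero values here
	fmt.Println(displayTime(d.CreatedAt))
	fmt.Println(tableDetailsURL("cat", "a/b", "t?1#x"))
}
```

url.PathEscape is used rather than url.QueryEscape because the names are placed in path segments, where a literal slash would otherwise be read as a separator.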

* Regenerate templ files for Phase 3 updates

* Fix template generation to use correct file paths

Run templ generate from repo root instead of weed/admin directory to ensure
generated _templ.go files have correct repo-root-relative paths in error messages
(e.g., 'weed/admin/view/app/iceberg_table_details.templ' instead of
'app/iceberg_table_details.templ'). This ensures both 'make admin-generate'
at repo root and 'make generate' in weed/admin directory produce identical
output with consistent file path references.

* Regenerate template files with correct path references

* Validate S3 Tables names in UI

- Add client-side validation for table bucket and namespace names to surface
  errors for invalid characters (dots/underscores) before submission
- Use HTML validity messages with reportValidity for immediate feedback
- Update namespace helper text to reflect actual constraints (single-level,
  lowercase letters, numbers, and underscores)

* Regenerate templ files for namespace helper text

* Fix Iceberg catalog REST link and actions

* Disallow S3 object access on table buckets

* Validate Iceberg layout for table bucket objects

* Fix REST API link to /v1/config

* merge iceberg page with table bucket page

* Allowed Trino/Iceberg stats files in metadata validation

* fixes

  - Backend/data handling:
      - Normalized Iceberg type display and fallback handling in weed/admin/dash/s3tables_management.go.
      - Fixed snapshot fallback pointer semantics in weed/admin/dash/s3tables_management.go.
      - Added CSRF token generation/propagation/validation for namespace create/delete in:
          - weed/admin/dash/csrf.go
          - weed/admin/dash/auth_middleware.go
          - weed/admin/dash/middleware.go
          - weed/admin/dash/s3tables_management.go
          - weed/admin/view/layout/layout.templ
          - weed/admin/static/js/s3tables.js
  - UI/template fixes:
      - Zero-time guards for CreatedAt fields in:
          - weed/admin/view/app/iceberg_namespaces.templ
          - weed/admin/view/app/iceberg_tables.templ
      - Fixed invalid templ-in-script interpolation and host/port rendering in:
          - weed/admin/view/app/iceberg_catalog.templ
          - weed/admin/view/app/s3tables_buckets.templ
      - Added data-catalog-name consistency on Iceberg delete action in weed/admin/view/app/iceberg_tables.templ.
      - Updated retry wording in weed/admin/static/js/s3tables.js.
      - Regenerated all affected _templ.go files.
  - S3 API/comment follow-ups:
      - Reused cached table-bucket validator in weed/s3api/bucket_paths.go.
      - Added validation-failure debug logging in weed/s3api/s3api_object_handlers_tagging.go.
      - Added multipart path-validation design comment in weed/s3api/s3api_object_handlers_multipart.go.
  - Build tooling:
      - Fixed templ generate working directory issues in weed/admin/Makefile (watch + pattern rule).

* populate data

* test/s3tables: harden populate service checks

* admin: skip table buckets in object-store bucket list

* admin sidebar: move object store to top-level links

* admin iceberg catalog: guard zero times and escape links

* admin forms: add csrf/error handling and client-side name validation

* admin s3tables: fix namespace delete modal redeclaration

* admin: replace native confirm dialogs with modal helpers

* admin modal-alerts: remove noisy confirm usage console log

* reduce logs

* test/s3tables: use partitioned tables in trino and spark populate

* admin file browser: normalize filer ServerAddress for HTTP parsing
2026-02-08 20:06:32 -08:00


package s3tables

import (
	"testing"
)

func TestIcebergLayoutValidator_ValidateFilePath(t *testing.T) {
	v := NewIcebergLayoutValidator()

	tests := []struct {
		name    string
		path    string
		wantErr bool
	}{
		// Valid metadata files
		{"valid metadata v1", "metadata/v1.metadata.json", false},
		{"valid metadata v123", "metadata/v123.metadata.json", false},
		{"valid snapshot manifest", "metadata/snap-123-1-abc12345-1234-5678-9abc-def012345678.avro", false},
		{"valid manifest file", "metadata/abc12345-1234-5678-9abc-def012345678-m0.avro", false},
		{"valid general manifest", "metadata/abc12345-1234-5678-9abc-def012345678.avro", false},
		{"valid version hint", "metadata/version-hint.text", false},
		{"valid uuid metadata", "metadata/abc12345-1234-5678-9abc-def012345678.metadata.json", false},
		{"valid trino stats", "metadata/20260208_212535_00007_bn4hb-d3599c32-1709-4b94-b6b2-1957b6d6db04.stats", false},

		// Valid data files
		{"valid parquet file", "data/file.parquet", false},
		{"valid orc file", "data/file.orc", false},
		{"valid avro data file", "data/file.avro", false},
		{"valid parquet with path", "data/00000-0-abc12345.parquet", false},

		// Valid partitioned data
		{"valid partitioned parquet", "data/year=2024/file.parquet", false},
		{"valid multi-partition", "data/year=2024/month=01/file.parquet", false},
		{"valid bucket subdirectory", "data/bucket0/file.parquet", false},

		// Directories only
		{"metadata directory bare", "metadata", true},
		{"data directory bare", "data", true},
		{"metadata directory with slash", "metadata/", false},
		{"data directory with slash", "data/", false},

		// Invalid paths
		{"empty path", "", true},
		{"invalid top dir", "invalid/file.parquet", true},
		{"root file", "file.parquet", true},
		{"invalid metadata file", "metadata/random.txt", true},
		{"nested metadata directory", "metadata/nested/v1.metadata.json", true},
		{"nested metadata directory no file", "metadata/nested/", true},
		{"metadata subdir no slash", "metadata/nested", true},
		{"invalid data file", "data/file.csv", true},
		{"invalid data file json", "data/file.json", true},

		// Partition/subdirectory without trailing slashes
		{"partition directory no slash", "data/year=2024", false},
		{"data subdirectory no slash", "data/my_subdir", false},
		{"multi-level partition", "data/event_date=2025-01-01/hour=00/file.parquet", false},
		{"multi-level partition directory", "data/event_date=2025-01-01/hour=00/", false},
		{"multi-level partition directory no slash", "data/event_date=2025-01-01/hour=00", false},

		// Double slashes
		{"data double slash", "data//file.parquet", true},
		{"data redundant slash", "data/year=2024//file.parquet", true},
		{"metadata redundant slash", "metadata//v1.metadata.json", true},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			err := v.ValidateFilePath(tt.path)
			if (err != nil) != tt.wantErr {
				t.Errorf("ValidateFilePath(%q) error = %v, wantErr %v", tt.path, err, tt.wantErr)
			}
		})
	}
}

func TestIcebergLayoutValidator_PartitionPaths(t *testing.T) {
	v := NewIcebergLayoutValidator()

	validPaths := []string{
		"data/year=2024/file.parquet",
		"data/date=2024-01-15/file.parquet",
		"data/category=electronics/file.parquet",
		"data/user_id=12345/file.parquet",
		"data/region=us-east-1/file.parquet",
		"data/year=2024/month=01/day=15/file.parquet",
	}

	for _, path := range validPaths {
		if err := v.ValidateFilePath(path); err != nil {
			t.Errorf("ValidateFilePath(%q) should be valid, got error: %v", path, err)
		}
	}
}

func TestTableBucketFileValidator_ValidateTableBucketUpload(t *testing.T) {
	v := NewTableBucketFileValidator()

	tests := []struct {
		name    string
		path    string
		wantErr bool
	}{
		// Non-table bucket paths should pass (no validation)
		{"regular bucket path", "/buckets/mybucket/file.txt", false},
		{"filer path", "/home/user/file.txt", false},

		// Table bucket structure paths (creating directories)
		{"table bucket root", "/buckets/mybucket", false},
		{"namespace dir", "/buckets/mybucket/myns", false},
		{"table dir", "/buckets/mybucket/myns/mytable", false},
		{"table dir trailing slash", "/buckets/mybucket/myns/mytable/", false},

		// Valid table bucket file uploads
		{"valid parquet upload", "/buckets/mybucket/myns/mytable/data/file.parquet", false},
		{"valid metadata upload", "/buckets/mybucket/myns/mytable/metadata/v1.metadata.json", false},
		{"valid trino stats upload", "/buckets/mybucket/myns/mytable/metadata/20260208_212535_00007_bn4hb-d3599c32-1709-4b94-b6b2-1957b6d6db04.stats", false},
		{"valid partitioned data", "/buckets/mybucket/myns/mytable/data/year=2024/file.parquet", false},

		// Invalid table bucket file uploads
		{"invalid file type", "/buckets/mybucket/myns/mytable/data/file.csv", true},
		{"invalid top-level dir", "/buckets/mybucket/myns/mytable/invalid/file.parquet", true},
		{"root file in table", "/buckets/mybucket/myns/mytable/file.parquet", true},

		// Empty segment cases
		{"empty bucket", "/buckets//myns/mytable/data/file.parquet", true},
		{"empty namespace", "/buckets/mybucket//mytable/data/file.parquet", true},
		{"empty table", "/buckets/mybucket/myns//data/file.parquet", true},
		{"empty bucket dir", "/buckets//", true},
		{"empty namespace dir", "/buckets/mybucket//", true},
		{"table double slash bypass", "/buckets/mybucket/myns/mytable//data/file.parquet", true},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			err := v.ValidateTableBucketUpload(tt.path)
			if (err != nil) != tt.wantErr {
				t.Errorf("ValidateTableBucketUpload(%q) error = %v, wantErr %v", tt.path, err, tt.wantErr)
			}
		})
	}
}

func TestIsTableBucketPath(t *testing.T) {
	tests := []struct {
		path string
		want bool
	}{
		{"/buckets/mybucket", true},
		{"/buckets/mybucket/ns/table/data/file.parquet", true},
		{"/home/user/file.txt", false},
		{"buckets/mybucket", false}, // missing leading slash
	}

	for _, tt := range tests {
		t.Run(tt.path, func(t *testing.T) {
			if got := IsTableBucketPath(tt.path); got != tt.want {
				t.Errorf("IsTableBucketPath(%q) = %v, want %v", tt.path, got, tt.want)
			}
		})
	}
}

func TestGetTableInfoFromPath(t *testing.T) {
	tests := []struct {
		path          string
		wantBucket    string
		wantNamespace string
		wantTable     string
	}{
		{"/buckets/mybucket/myns/mytable/data/file.parquet", "mybucket", "myns", "mytable"},
		{"/buckets/mybucket/myns/mytable", "mybucket", "myns", "mytable"},
		{"/buckets/mybucket/myns", "mybucket", "myns", ""},
		{"/buckets/mybucket", "mybucket", "", ""},
		{"/home/user/file.txt", "", "", ""},
	}

	for _, tt := range tests {
		t.Run(tt.path, func(t *testing.T) {
			bucket, namespace, table := GetTableInfoFromPath(tt.path)
			if bucket != tt.wantBucket || namespace != tt.wantNamespace || table != tt.wantTable {
				t.Errorf("GetTableInfoFromPath(%q) = (%q, %q, %q), want (%q, %q, %q)",
					tt.path, bucket, namespace, table, tt.wantBucket, tt.wantNamespace, tt.wantTable)
			}
		})
	}
}
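
The table in TestGetTableInfoFromPath pins down the expected parsing behavior. One way a function could satisfy those five cases is sketched below; `tableInfoFromPath` and `bucketsPrefix` are hypothetical names, and the real GetTableInfoFromPath in the s3tables package may be implemented differently (for example, with a configurable buckets path).

```go
package main

import (
	"fmt"
	"strings"
)

// bucketsPrefix is assumed from the test paths; it is not taken from the package.
const bucketsPrefix = "/buckets/"

// tableInfoFromPath is an illustrative sketch, not the package's function.
// It returns the first three path segments after the buckets prefix, leaving
// later ones empty when the path is shorter, and returns all empty strings
// for paths outside the buckets tree.
func tableInfoFromPath(p string) (bucket, namespace, table string) {
	if !strings.HasPrefix(p, bucketsPrefix) {
		return "", "", ""
	}
	// Split into at most 4 parts: bucket, namespace, table, remainder.
	parts := strings.SplitN(strings.TrimPrefix(p, bucketsPrefix), "/", 4)
	if len(parts) > 0 {
		bucket = parts[0]
	}
	if len(parts) > 1 {
		namespace = parts[1]
	}
	if len(parts) > 2 {
		table = parts[2]
	}
	return bucket, namespace, table
}

func main() {
	b, ns, tbl := tableInfoFromPath("/buckets/mybucket/myns/mytable/data/file.parquet")
	fmt.Printf("%q %q %q\n", b, ns, tbl)
}
```

This sketch does not reject empty segments such as "/buckets//myns"; the upload validator tests above suggest that kind of check lives in a separate validation step.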