Commit Graph

7 Commits

Author SHA1 Message Date
Chris Lu
e6ee293c17 Add table operations test (#8241)
* Add Trino blog operations test

* Update test/s3tables/catalog_trino/trino_blog_operations_test.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* feat: add table bucket path helpers and filer operations

- Add table object root and table location mapping directories
- Implement ensureDirectory, upsertFile, deleteEntryIfExists helpers
- Support table location bucket mapping for S3 access

* feat: manage table bucket object roots on creation/deletion

- Create .objects directory for table buckets on creation
- Clean up table object bucket paths on deletion
- Enable S3 operations on table bucket object roots

* feat: add table location mapping for Iceberg REST

- Track table location bucket mappings when tables are created/updated/deleted
- Enable location-based routing for S3 operations on table data

* feat: route S3 operations to table bucket object roots

- Route table-s3 bucket names to mapped table paths
- Route table buckets to object root directories
- Support table location bucket mapping lookup

* feat: emit table-s3 locations from Iceberg REST

- Generate unique table-s3 bucket names with UUID suffix
- Store table metadata under table bucket paths
- Return table-s3 locations for Trino compatibility

* fix: handle missing directories in S3 list operations

- Propagate ErrNotFound from ListEntries for non-existent directories
- Treat missing directories as empty results for list operations
- Fixes Trino non-empty location checks on table creation

* test: improve Trino CSV parsing for single-value results

- Sanitize Trino output to skip jline warnings
- Handle single-value CSV results without header rows
- Strip quotes from numeric values in tests

* refactor: use bucket path helpers throughout S3 API

- Replace direct bucket path operations with helper functions
- Leverage centralized table bucket routing logic
- Improve maintainability with consistent path resolution

* fix: add table bucket cache and improve filer error handling

- Cache table bucket lookups to reduce filer overhead on repeated checks
- Use filer_pb.CreateEntry and filer_pb.UpdateEntry helpers to check resp.Error
- Fix delete order in handler_bucket_get_list_delete: delete table object before directory
- Make location mapping errors best-effort: log and continue, don't fail API
- Update table location mappings to delete stale prior bucket mappings on update
- Add 1-second sleep before timestamp time travel query to ensure timestamps are in past
- Fix CSV parsing: examine all lines, not skip first; handle single-value rows

* fix: properly handle stale metadata location mapping cleanup

- Capture oldMetadataLocation before mutation in handleUpdateTable
- Update updateTableLocationMapping to accept both old and new locations
- Use passed-in oldMetadataLocation to detect location changes
- Delete stale mapping only when location actually changes
- Pass empty string for oldLocation in handleCreateTable (new tables have no prior mapping)
- Improve logging to show old -> new location transitions

* refactor: cleanup imports and cache design

- Remove unused 'sync' import from bucket_paths.go
- Use filer_pb.UpdateEntry helper in setExtendedAttribute and deleteExtendedAttribute for consistent error handling
- Add dedicated tableBucketCache map[string]bool to BucketRegistry instead of mixing concerns with metadataCache
- Improve cache separation: table buckets cache is now separate from bucket metadata cache

* fix: improve cache invalidation and add transient error handling

Cache invalidation (critical fix):
- Add tableLocationCache to BucketRegistry for location mapping lookups
- Clear tableBucketCache and tableLocationCache in RemoveBucketMetadata
- Prevents stale cache entries when buckets are deleted/recreated

Transient error handling:
- Only cache table bucket lookups when conclusive (found or ErrNotFound)
- Skip caching on transient errors (network, permission, etc)
- Prevents marking real table buckets as non-table due to transient failures

Performance optimization:
- Cache tableLocationDir results to avoid repeated filer RPCs on hot paths
- tableLocationDir now checks cache before making expensive filer lookups
- Cache stores empty string for 'not found' to avoid redundant lookups

Code clarity:
- Add comment to deleteDirectory explaining DeleteEntry response lacks Error field

* go fmt

* fix: mirror transient error handling in tableLocationDir and optimize bucketDir

Transient error handling:
- tableLocationDir now only caches definitive results
- Mirrors isTableBucket behavior to prevent treating transient errors as permanent misses
- Improves reliability on flaky systems or during recovery

Performance optimization:
- bucketDir avoids redundant isTableBucket call via bucketRoot
- Directly use s3a.option.BucketsPath for regular buckets
- Saves one cache lookup for every non-table bucket operation

* fix: revert bucketDir optimization to preserve bucketRoot logic

The optimization to directly use BucketsPath bypassed bucketRoot's logic
and caused issues with S3 list operations on delimiter+prefix cases.

Revert to using path.Join(s3a.bucketRoot(bucket), bucket) which properly
handles all bucket types and ensures consistent path resolution across
the codebase.

The slight performance cost of an extra cache lookup is worth the correctness
and consistency benefits.

* feat: move table buckets under /buckets

Add a table-bucket marker attribute, reuse bucket metadata cache for table bucket detection, and update list/validation/UI/test paths to treat table buckets as /buckets entries.

* Fix S3 Tables code review issues

- handler_bucket_create.go: Fix bucket existence check to properly validate
  entryResp.Entry before setting s3BucketExists flag (nil Entry should not
  indicate existing bucket)
- bucket_paths.go: Add clarifying comment to bucketRoot() explaining unified
  buckets root path for all bucket types
- file_browser_data.go: Optimize by extracting table bucket check early to
  avoid redundant WithFilerClient call

* Fix list prefix delimiter handling

* Handle list errors conservatively

* Fix Trino FOR TIMESTAMP query - use past timestamp

Iceberg requires the timestamp to be strictly in the past.
Use current_timestamp - interval '1' second instead of current_timestamp.

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-02-07 13:27:47 -08:00
promalert
9012069bd7 chore: execute goimports to format the code (#7983)
* chore: execute goimports to format the code

Signed-off-by: promalert <promalert@outlook.com>

* goimports -w .

---------

Signed-off-by: promalert <promalert@outlook.com>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
2026-01-07 13:06:08 -08:00
Chris Lu
5f77f87335 S3: S3 Object Retention API to include XML namespace support (#7517)
* Refactor S3 Object Retention API to include XML namespace support and improve compatibility with Veeam. Updated XML tags to remove hardcoded namespaces and added test cases for retention and legal hold configurations without namespaces.

* Added XMLNS field setting in both places
2025-11-20 11:42:22 -08:00
Chris Lu
bdc20d1c1e S3: load bucket object locking configuration if not found in cache (#7422)
* load bucket object locking configuration if not found in cache

* fix cache building, more specific error, add back metrics
2025-10-31 22:35:09 -07:00
Chris Lu
0e4d803896 refactor (#6999)
* fix GetObjectLockConfigurationHandler

* cache and use bucket object lock config

* subscribe to bucket configuration changes

* increase bucket config cache TTL

* refactor

* Update weed/s3api/s3api_server.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* avoid duplidated work

* rename variable

* Update s3api_object_handlers_put.go

* fix routing

* admin ui and api handler are consistent now

* use fields instead of xml

* fix test

* address comments

* Update weed/s3api/s3api_object_handlers_put.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update test/s3/retention/s3_retention_test.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update weed/s3api/object_lock_utils.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* change error style

* errorf

* read entry once

* add s3 tests for object lock and retention

* use marker

* install s3 tests

* Update s3tests.yml

* Update s3tests.yml

* Update s3tests.conf

* Update s3tests.conf

* address test errors

* address test errors

With these fixes, the s3-tests should now:
 Return InvalidBucketState (409 Conflict) for object lock operations on invalid buckets
 Return MalformedXML for invalid retention configurations
 Include VersionId in response headers when available
 Return proper HTTP status codes (403 Forbidden for retention mode changes)
 Handle all object lock validation errors consistently

* fixes

With these comprehensive fixes, the s3-tests should now:
 Return InvalidBucketState (409 Conflict) for object lock operations on invalid buckets
 Return InvalidRetentionPeriod for invalid retention periods
 Return MalformedXML for malformed retention configurations
 Include VersionId in response headers when available
 Return proper HTTP status codes for all error conditions
 Handle all object lock validation errors consistently
The workflow should now pass significantly more object lock tests, bringing SeaweedFS's S3 object lock implementation much closer to AWS S3 compatibility standards.

* fixes

With these final fixes, the s3-tests should now:
 Return MalformedXML for ObjectLockEnabled: 'Disabled'
 Return MalformedXML when both Days and Years are specified in retention configuration
 Return InvalidBucketState (409 Conflict) when trying to suspend versioning on buckets with object lock enabled
 Handle all object lock validation errors consistently with proper error codes

* constants and fixes

 Return InvalidRetentionPeriod for invalid retention values (0 days, negative years)
 Return ObjectLockConfigurationNotFoundError when object lock configuration doesn't exist
 Handle all object lock validation errors consistently with proper error codes

* fixes

 Return MalformedXML when both Days and Years are specified in the same retention configuration
 Return 400 (Bad Request) with InvalidRequest when object lock operations are attempted on buckets without object lock enabled
 Handle all object lock validation errors consistently with proper error codes

* fixes

 Return 409 (Conflict) with InvalidBucketState for bucket-level object lock configuration operations on buckets without object lock enabled
 Allow increasing retention periods and overriding retention with same/later dates
 Only block decreasing retention periods without proper bypass permissions
 Handle all object lock validation errors consistently with proper error codes

* fixes

 Include VersionId in multipart upload completion responses when versioning is enabled
 Block retention mode changes (GOVERNANCE ↔ COMPLIANCE) without bypass permissions
 Handle all object lock validation errors consistently with proper error codes
 Pass the remaining object lock tests

* fix tests

* fixes

* pass tests

* fix tests

* fixes

* add error mapping

* Update s3tests.conf

* fix test_object_lock_put_obj_lock_invalid_days

* fixes

* fix many issues

* fix test_object_lock_delete_multipart_object_with_legal_hold_on

* fix tests

* refactor

* fix test_object_lock_delete_object_with_retention_and_marker

* fix tests

* fix tests

* fix tests

* fix test itself

* fix tests

* fix test

* Update weed/s3api/s3api_object_retention.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* reduce logs

* address comments

* refactor

* rename

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-19 00:49:56 -07:00
Chris Lu
26403e8a0d Test object lock and retention (#6997)
* fix GetObjectLockConfigurationHandler

* cache and use bucket object lock config

* subscribe to bucket configuration changes

* increase bucket config cache TTL

* refactor

* Update weed/s3api/s3api_server.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* avoid duplidated work

* rename variable

* Update s3api_object_handlers_put.go

* fix routing

* admin ui and api handler are consistent now

* use fields instead of xml

* fix test

* address comments

* Update weed/s3api/s3api_object_handlers_put.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update test/s3/retention/s3_retention_test.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update weed/s3api/object_lock_utils.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* change error style

* errorf

* read entry once

* add s3 tests for object lock and retention

* use marker

* install s3 tests

* Update s3tests.yml

* Update s3tests.yml

* Update s3tests.conf

* Update s3tests.conf

* address test errors

* address test errors

With these fixes, the s3-tests should now:
 Return InvalidBucketState (409 Conflict) for object lock operations on invalid buckets
 Return MalformedXML for invalid retention configurations
 Include VersionId in response headers when available
 Return proper HTTP status codes (403 Forbidden for retention mode changes)
 Handle all object lock validation errors consistently

* fixes

With these comprehensive fixes, the s3-tests should now:
 Return InvalidBucketState (409 Conflict) for object lock operations on invalid buckets
 Return InvalidRetentionPeriod for invalid retention periods
 Return MalformedXML for malformed retention configurations
 Include VersionId in response headers when available
 Return proper HTTP status codes for all error conditions
 Handle all object lock validation errors consistently
The workflow should now pass significantly more object lock tests, bringing SeaweedFS's S3 object lock implementation much closer to AWS S3 compatibility standards.

* fixes

With these final fixes, the s3-tests should now:
 Return MalformedXML for ObjectLockEnabled: 'Disabled'
 Return MalformedXML when both Days and Years are specified in retention configuration
 Return InvalidBucketState (409 Conflict) when trying to suspend versioning on buckets with object lock enabled
 Handle all object lock validation errors consistently with proper error codes

* constants and fixes

 Return InvalidRetentionPeriod for invalid retention values (0 days, negative years)
 Return ObjectLockConfigurationNotFoundError when object lock configuration doesn't exist
 Handle all object lock validation errors consistently with proper error codes

* fixes

 Return MalformedXML when both Days and Years are specified in the same retention configuration
 Return 400 (Bad Request) with InvalidRequest when object lock operations are attempted on buckets without object lock enabled
 Handle all object lock validation errors consistently with proper error codes

* fixes

 Return 409 (Conflict) with InvalidBucketState for bucket-level object lock configuration operations on buckets without object lock enabled
 Allow increasing retention periods and overriding retention with same/later dates
 Only block decreasing retention periods without proper bypass permissions
 Handle all object lock validation errors consistently with proper error codes

* fixes

 Include VersionId in multipart upload completion responses when versioning is enabled
 Block retention mode changes (GOVERNANCE ↔ COMPLIANCE) without bypass permissions
 Handle all object lock validation errors consistently with proper error codes
 Pass the remaining object lock tests

* fix tests

* fixes

* pass tests

* fix tests

* fixes

* add error mapping

* Update s3tests.conf

* fix test_object_lock_put_obj_lock_invalid_days

* fixes

* fix many issues

* fix test_object_lock_delete_multipart_object_with_legal_hold_on

* fix tests

* refactor

* fix test_object_lock_delete_object_with_retention_and_marker

* fix tests

* fix tests

* fix tests

* fix test itself

* fix tests

* fix test

* Update weed/s3api/s3api_object_retention.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* reduce logs

* address comments

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-18 22:25:58 -07:00
Chris Lu
c6a22ce43a Fix get object lock configuration handler (#6996)
* fix GetObjectLockConfigurationHandler

* cache and use bucket object lock config

* subscribe to bucket configuration changes

* increase bucket config cache TTL

* refactor

* Update weed/s3api/s3api_server.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* avoid duplidated work

* rename variable

* Update s3api_object_handlers_put.go

* fix routing

* admin ui and api handler are consistent now

* use fields instead of xml

* fix test

* address comments

* Update weed/s3api/s3api_object_handlers_put.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update test/s3/retention/s3_retention_test.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update weed/s3api/object_lock_utils.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* change error style

* errorf

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-18 02:19:50 -07:00