6 Commits

Author SHA1 Message Date
Chris Lu
1d0361d936 Fix: Eliminate duplicate versioned objects in S3 list operations (#7850)
* Fix: Eliminate duplicate versioned objects in S3 list operations

- Move versioned directory processing outside of pagination loop to process only once
- Add deduplication during .versions directory collection phase
- Fix directory handling to not add directories to results in recursive mode
- Directly add versioned entries to contents array instead of using callback

Fixes issue where AWS S3 list operations returned duplicated versioned objects
(e.g., 1000 duplicate entries from 4 unique objects). Now correctly returns only
the unique logical entries without duplication.

Verified with:
  aws s3api list-objects --endpoint-url http://localhost:8333 --bucket pm-itatiaiucu-01
Returns exactly 4 entries (ClientInfo.xml and Repository from 2 Veeam backup folders)

* Refactor: Process .versions directories immediately when encountered

Instead of collecting .versions directories and processing them after the
pagination loop, process them immediately when encountered during traversal.

Benefits:
- Simpler code: removed versionedDirEntry struct and collection array
- More efficient: no need to store and iterate through collected entries
- Same O(V) complexity but with less memory overhead
- Clearer logic: processing happens in one pass during traversal

Since each .versions directory is only visited once during recursive
traversal (we never traverse into them), there's no need for deferred
processing or deduplication.
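
The single-pass idea described above can be sketched as follows. This is a minimal illustration, not the code from the commit: the `entry` type, the `listLogical` function, and the in-memory tree are hypothetical stand-ins for filer entries, and the only detail taken from the message is that a `.versions` directory is handled when it is encountered and never traversed.

```go
package main

import (
	"fmt"
	"strings"
)

// entry is a hypothetical in-memory stand-in for a filer entry.
type entry struct {
	name     string
	isDir    bool
	children []entry
}

// versionsSuffix is the directory suffix named in the commit message.
const versionsSuffix = ".versions"

// listLogical walks the tree once and returns one key per logical object.
// A directory named "<key>.versions" is reported as "<key>" and never entered,
// so its stored versions cannot show up as extra results.
func listLogical(dir string, entries []entry) []string {
	var keys []string
	for _, e := range entries {
		path := dir + e.name
		switch {
		case e.isDir && strings.HasSuffix(e.name, versionsSuffix):
			// Process immediately: emit the logical key, skip the children.
			keys = append(keys, strings.TrimSuffix(path, versionsSuffix))
		case e.isDir:
			// Ordinary directory: recurse, but do not list the directory itself.
			keys = append(keys, listLogical(path+"/", e.children)...)
		default:
			keys = append(keys, path)
		}
	}
	return keys
}

func main() {
	tree := []entry{
		{name: "backup1", isDir: true, children: []entry{
			{name: "ClientInfo.xml.versions", isDir: true, children: []entry{
				{name: "v1"}, {name: "v2"},
			}},
			{name: "Repository.versions", isDir: true, children: []entry{{name: "v1"}}},
		}},
		{name: "plain.txt"},
	}
	for _, k := range listLogical("", tree) {
		fmt.Println(k)
	}
}
```

Because each `.versions` directory is reached exactly once in this walk, no collection array or later deduplication step is needed in the sketch.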

* Add comprehensive tests for versioned objects list

- TestListObjectsWithVersionedObjects: Tests listing with various delimiters
- TestVersionedObjectsNoDuplication: Core test validating no 250x duplication
- TestVersionedObjectsWithDeleteMarker: Tests delete marker filtering
- TestVersionedObjectsMaxKeys: Tests pagination with versioned objects
- TestVersionsDirectoryNotTraversed: Ensures .versions never traversed
- Fix existing test signature to match updated doListFilerEntries

* style: Fix formatting alignment in versioned objects tests

* perf: Optimize path extraction using string indexing

Replace multiple strings.Split/Join calls with efficient strings.Index
slicing to extract bucket-relative path from directory string.

Reduces unnecessary allocations and improves performance in versioned
objects listing path construction.

* refactor: Address code review feedback from Gemini Code Assist

1. Fix misleading comment about versioned directory processing location.
   Versioned directories are processed immediately in doListFilerEntries,
   not deferred to ListObjectsV1Handler.

2. Simplify path extraction logic using explicit bucket path construction
   instead of index-based string slicing for better readability and
   maintainability.

3. Add clarifying comment to test callback explaining why production logic
   is duplicated - necessary because listFilerEntries is not easily testable
   with filer client injection.
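
For item 2, explicit bucket path construction might look like the sketch below. The function name `relativeToBucket`, its parameters, and the `/buckets` root used in the example are assumptions made for illustration; the commit message does not show the actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// relativeToBucket returns dir relative to the bucket root, or "" for the
// bucket root itself. bucketsPath and bucket are hypothetical parameters.
// If dir is not under the bucket, it is returned unchanged.
func relativeToBucket(bucketsPath, bucket, dir string) string {
	// Build the bucket path explicitly instead of slicing by index.
	bucketPath := strings.TrimSuffix(bucketsPath, "/") + "/" + bucket
	if dir == bucketPath {
		return ""
	}
	return strings.TrimPrefix(dir, bucketPath+"/")
}

func main() {
	fmt.Println(relativeToBucket("/buckets", "photos", "/buckets/photos/2025/12"))
}
```

Compared with index-based slicing, the prefix is named once and the intent (strip the bucket path) is visible at the call to `strings.TrimPrefix`.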

* fmt

* refactor: Address code review feedback from Copilot

- Fix misleading comment about versioned directory processing location
  (note that processing happens within doListFilerEntries, not at top level)
- Add maxKeys validation checks in all test callbacks for consistency
- Add maxKeys check before calling eachEntryFn for versioned objects
- Improve test documentation to clarify testing approach and avoid apologetic tone

* refactor: Address code review feedback from Gemini Code Assist

- Remove redundant maxKeys check before eachEntryFn call on line 541
  (the loop already checks maxKeys <= 0 at line 502, ensuring quota exists)
- Fix pagination pattern consistency in all test callbacks
  - TestVersionedObjectsNoDuplication: Use cursor.maxKeys <= 0 check and decrement
  - TestVersionedObjectsWithDeleteMarker: Use cursor.maxKeys <= 0 check and decrement
  - TestVersionsDirectoryNotTraversed: Use cursor.maxKeys <= 0 check and decrement
- Ensures consistent pagination logic across all callbacks matching production behavior

* refactor: Address code review suggestions for code quality

- Adjust log verbosity from V(5) to V(4) for file additions to reduce noise
  while maintaining useful debug output during troubleshooting
- Remove unused isRecursive parameter from doListFilerEntries function
  signature and all call sites (not used for any logic decisions)
- Consolidate redundant comments about versioned directory handling
  to reduce documentation duplication

These changes improve code maintainability and clarity.

* fmt

* refactor: Add pagination test and optimize stream processing

- Add comprehensive test validation to TestVersionedObjectsMaxKeys
  that verifies truncation is correctly set when maxKeys is exhausted
  with more entries available, ensuring proper pagination state

- Optimize stream processing in doListFilerEntries by using 'break'
  instead of 'continue' when quota is exhausted (cursor.maxKeys <= 0)
  This avoids receiving and discarding entries from the stream when
  we've already reached the requested limit, improving efficiency
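
The effect of `break` versus `continue` can be sketched with a toy stream. Everything here is hypothetical (`sliceStream`, `cursor`, `collect`); only the `maxKeys <= 0` check and the truncation flag come from the message above.

```go
package main

import "fmt"

// sliceStream is a hypothetical stand-in for a streaming list response.
type sliceStream struct {
	items []string
	pos   int // number of entries handed out so far
}

// Recv returns the next entry, or false when the stream is drained.
func (s *sliceStream) Recv() (string, bool) {
	if s.pos >= len(s.items) {
		return "", false
	}
	item := s.items[s.pos]
	s.pos++
	return item, true
}

// cursor mirrors the cursor.maxKeys idea; the type itself is an assumption.
type cursor struct {
	maxKeys     int
	isTruncated bool
}

// collect reads entries until the quota is used up. On exhaustion it marks
// the result as truncated and breaks, so the rest of the stream is not read.
func collect(s *sliceStream, c *cursor) []string {
	var out []string
	for {
		name, ok := s.Recv()
		if !ok {
			break
		}
		if c.maxKeys <= 0 {
			c.isTruncated = true
			break // with "continue", every remaining entry would be received and dropped
		}
		out = append(out, name)
		c.maxKeys--
	}
	return out
}

func main() {
	s := &sliceStream{items: []string{"a", "b", "c", "d", "e"}}
	c := &cursor{maxKeys: 2}
	fmt.Println(collect(s, c), c.isTruncated, s.pos)
}
```

In this sketch only one entry beyond the quota is received, which is enough to know that more results exist.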
2025-12-22 15:50:13 -08:00
Chris Lu
f63d9ad390 s3api: fix bucket-root listing w/ delimiter (#7827)
* s3api: fix bucket-root listing w/ delimiter

* test: improve mock robustness for bucket-root listing test

- Make testListEntriesStream implement interface explicitly without embedding
- Add prefix filtering logic to testFilerClient to simulate real filer behavior
- Special-case prefix='/' to not filter for bucket root compatibility
- Add required imports for metadata and strings packages

This addresses review comments about test mock brittleness and accuracy.

* test: add clarifying comment for mock filtering behavior

Add detailed comment explaining which ListEntriesRequest parameters
are implemented (Prefix) vs ignored (Limit, StartFromFileName, etc.)
in the test mock to improve code documentation and future maintenance.
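
A mock that honors only the prefix, with the bucket-root special case, could be sketched like this. The helper name `filterByPrefix` is hypothetical and the real `testFilerClient` is not shown in the message; treating an empty prefix as "no filter" is an additional assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// filterByPrefix keeps the names that start with prefix.
// A prefix of "/" (bucket root) or "" (assumed) disables filtering.
func filterByPrefix(names []string, prefix string) []string {
	if prefix == "" || prefix == "/" {
		return names
	}
	var out []string
	for _, n := range names {
		if strings.HasPrefix(n, prefix) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	names := []string{"logs", "logs-old", "data"}
	fmt.Println(filterByPrefix(names, "logs"))
	fmt.Println(filterByPrefix(names, "/"))
}
```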

* logging

* less logs

* less check if already locked
2025-12-20 00:59:10 -08:00
chrislu
a4df110e77 address List permission
fix https://github.com/seaweedfs/seaweedfs/issues/7039
2025-07-28 02:39:41 -07:00
Chris Lu
33b9017b48 fix listing objects (#7008)
* fix listing objects

* add more list testing

* address comments

* fix next marker

* fix isTruncated in listing

* fix tests

* address tests

* Update s3api_object_handlers_multipart.go

* fixes

* store json into bucket content, for tagging and cors

* switch bucket metadata from json to proto

* fix

* Update s3api_bucket_config.go

* fix test issue

* fix test_bucket_listv2_delimiter_prefix

* Update cors.go

* skip special characters

* passing listing

* fix test_bucket_list_delimiter_prefix

* ok. fix the xsd generated go code now

* fix cors tests

* fix test

* fix test_bucket_list_unordered and test_bucket_listv2_unordered

do not accept the allow-unordered and delimiter parameter combination

* fix test_bucket_list_objects_anonymous and test_bucket_listv2_objects_anonymous

The tests test_bucket_list_objects_anonymous and test_bucket_listv2_objects_anonymous were failing because they try to set the bucket ACL to public-read, but SeaweedFS only supported the private ACL.

- Updated PutBucketAclHandler to use the existing ExtractAcl function, which already supports all standard S3 canned ACLs
- Replaced the hardcoded private-only ACL check with proper ACL parsing that handles public-read, public-read-write, authenticated-read, bucket-owner-read, bucket-owner-full-control, etc.
- Added unit tests to verify all standard canned ACLs are accepted
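
To illustrate the difference from a private-only check, here is a minimal lookup over the canned ACL names spelled out above. It is not the ExtractAcl function, whose signature is not shown here; `cannedACLs` and `isCannedACL` are hypothetical names, and the set is limited to the names this message lists.

```go
package main

import "fmt"

// cannedACLs holds only the canned ACL names listed in the commit message.
var cannedACLs = map[string]bool{
	"private":                   true,
	"public-read":               true,
	"public-read-write":         true,
	"authenticated-read":        true,
	"bucket-owner-read":         true,
	"bucket-owner-full-control": true,
}

// isCannedACL reports whether name is one of the known canned ACLs.
func isCannedACL(name string) bool {
	return cannedACLs[name]
}

func main() {
	for _, acl := range []string{"private", "public-read", "world-writable"} {
		fmt.Printf("%s accepted=%v\n", acl, isCannedACL(acl))
	}
}
```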

* fix list unordered

The test is expecting the error code to be InvalidArgument instead of InvalidRequest

* allow anonymous listing (and head, get)

* fix test_bucket_list_maxkeys_invalid

Invalid values: max-keys=blah → Returns ErrInvalidMaxKeys (HTTP 400)
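
A minimal sketch of this kind of validation, under stated assumptions: `parseMaxKeys` and the sentinel error are hypothetical names, the default of 1000 is S3's documented default for max-keys, and rejecting negative values is an assumption not stated in the message.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// errInvalidMaxKeys is a hypothetical sentinel standing in for the
// ErrInvalidMaxKeys code (HTTP 400) named in the commit message.
var errInvalidMaxKeys = errors.New("invalid max-keys")

// defaultMaxKeys is the S3 default when the parameter is absent.
const defaultMaxKeys = 1000

// parseMaxKeys converts the raw query value into a key limit.
func parseMaxKeys(raw string) (int, error) {
	if raw == "" {
		return defaultMaxKeys, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil || n < 0 { // negative rejection is an assumption
		return 0, errInvalidMaxKeys
	}
	return n, nil
}

func main() {
	for _, raw := range []string{"", "25", "blah"} {
		n, err := parseMaxKeys(raw)
		fmt.Printf("max-keys=%q -> %d, err=%v\n", raw, n, err)
	}
}
```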

* updating IsPublicRead when parsing acl

* more logs

* CORS Test Fix

* fix test_bucket_list_return_data

* default to private

* fix test_bucket_list_delimiter_not_skip_special

* default no acl

* add debug logging

* more logs

* use basic http client

remove logs also

* fixes

* debug

* Update stats.go

* debugging

* fix anonymous test expectation

anonymous user can read, as configured in s3 json.
2025-07-22 01:07:15 -07:00
Konstantin Lebedev
f77eee667d add s3test for sql (#5718)
* add s3test for sql

* fix test test_bucket_listv2_delimiter_basic for s3

* fix action s3tests

* regen s3 api xsd

* rm minor s3 test test_bucket_listv2_fetchowner_defaultempty

* add docs

* without xmlns
2024-07-04 11:00:41 -07:00
chrislu
a1b59948cc rename files
2024-04-29 05:33:56 -07:00