seaweedFS/weed/s3api/s3api_version_id_test.go at 4e2af080df55a5eb2cc0ff9a4b786ca5219d7210


Chris Lu 414cda4215 fix: S3 versioning memory leak in ListObjectVersions pagination (#7813)

* fix: S3 versioning memory leak in ListObjectVersions pagination

This commit fixes a memory leak in S3 versioning buckets where
ListObjectVersions with pagination (key-marker set) collected ALL
versions in the bucket before filtering, so memory usage grew as O(N)
in the total number of versions.

Root cause:
- When keyMarker was set, maxCollect was set to 0 (unlimited)
- This caused findVersionsRecursively to traverse the entire bucket
- All versions were collected into memory, sorted, then filtered

Fix:
- Updated findVersionsRecursively to accept keyMarker and versionIdMarker
- Skips objects/versions before the marker during recursion, instead of
  filtering them out after collection
- Always respects maxCollect limit (never unlimited)
- Memory usage is now O(maxKeys) instead of O(total versions)
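
The idea behind the fix can be illustrated with a minimal sketch. This is not
the code from the commit: the `version` type and the `collectAfterMarker`
function are hypothetical, and the sketch assumes entries arrive in listing
order (by key, then newest version first) and that results resume strictly
after the key-marker/version-id-marker pair.

```go
package main

// version is a hypothetical, simplified stand-in for one object version.
type version struct {
	Key       string
	VersionID string
}

// collectAfterMarker scans entries in listing order and returns at most
// maxCollect entries that come after the (keyMarker, versionIDMarker) pair.
// Entries before the marker are skipped during the scan, so the result
// slice never holds more than maxCollect items.
func collectAfterMarker(all []version, keyMarker, versionIDMarker string, maxCollect int) []version {
	if maxCollect <= 0 {
		return nil // the limit is always respected; there is no "unlimited" mode
	}
	out := make([]version, 0, maxCollect)
	// With no version marker, every version of the marker key is skipped.
	seenMarker := versionIDMarker == ""
	for _, v := range all {
		if len(out) >= maxCollect {
			break // stop early instead of collecting everything
		}
		if keyMarker != "" {
			if v.Key < keyMarker {
				continue // key sorts before the marker
			}
			if v.Key == keyMarker {
				if versionIDMarker == "" {
					continue // resume with the next key
				}
				if !seenMarker {
					if v.VersionID == versionIDMarker {
						seenMarker = true
					}
					continue // skip up to and including the marker version
				}
			}
		}
		out = append(out, v)
	}
	return out
}
```

In this sketch the output is capped at maxCollect entries regardless of how
many versions precede the marker, which is the property the fix describes.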

Refactoring:
- Introduced versionCollector struct to encapsulate collection state
- Extracted helper methods for cleaner, more testable code:
  - matchesPrefixFilter: prefix matching logic
  - shouldSkipObjectForMarker: keyMarker filtering
  - shouldSkipVersionForMarker: versionIdMarker filtering
  - processVersionsDirectory: .versions directory handling
  - processExplicitDirectory: S3 directory object handling
  - processRegularFile: pre-versioning file handling
  - collectVersions: main recursive collection loop
  - processDirectory: directory entry dispatch
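
As an illustration of the kind of small, testable helper this refactoring
describes, here is a sketch of prefix handling during a recursive walk. The
function names `keyMatchesPrefix` and `shouldDescend` are hypothetical and are
not the helpers from the commit; the sketch assumes keys use "/" as the
directory separator.

```go
package main

import "strings"

// keyMatchesPrefix reports whether an object key satisfies the requested
// listing prefix. An empty prefix matches every key.
func keyMatchesPrefix(key, prefix string) bool {
	return strings.HasPrefix(key, prefix)
}

// shouldDescend reports whether a directory could contain keys that start
// with prefix. dirPath is relative to the bucket root with no trailing slash.
// A directory is worth visiting if its path already matches the prefix, or
// if the prefix points somewhere inside the directory.
func shouldDescend(dirPath, prefix string) bool {
	if prefix == "" {
		return true
	}
	dir := dirPath + "/"
	return strings.HasPrefix(dir, prefix) || strings.HasPrefix(prefix, dir)
}
```

Pruning directories this way is one plausible route to fewer lookups during
traversal, since subtrees that cannot match are never listed.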

This reduces the high QPS on 'find' and 'prefixList' operations
by skipping irrelevant objects during traversal.

Fixes customer-reported memory leak with high find/prefixList QPS
in Grafana for S3 versioning buckets.

* s3: infer version ID format from ExtLatestVersionIdKey metadata

Simplified version format detection:
- Removed ExtVersionIdFormatKey - no longer needed
- getVersionIdFormat() now infers format from ExtLatestVersionIdKey
- Uses isNewFormatVersionId() to check if latest version uses inverted format

This approach is simpler because:
- ExtLatestVersionIdKey is already stored in .versions directory metadata
- No need for separate format metadata field
- Format is naturally determined by the existing version IDs
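
A rough sketch of inferring the format from the latest version ID follows.
Everything in it is an assumption made for illustration: the metadata key
string, the layout of a version ID (16 hex digits of nanosecond timestamp at
the start), the inversion as `math.MaxInt64 - timestamp`, and the midpoint
threshold. The real `getVersionIdFormat()` and `isNewFormatVersionId()` may
work differently.

```go
package main

import (
	"math"
	"strconv"
)

// latestVersionIDKey is a placeholder name, not the real value of
// ExtLatestVersionIdKey.
const latestVersionIDKey = "latestVersionId"

// looksInverted guesses whether a version ID uses an inverted timestamp.
// Assumption: the first 16 hex digits encode either a raw nanosecond
// timestamp or math.MaxInt64 minus that timestamp. Present-day raw
// timestamps fall well below the midpoint, inverted ones well above it.
func looksInverted(versionID string) bool {
	if len(versionID) < 16 {
		return false
	}
	ts, err := strconv.ParseUint(versionID[:16], 16, 64)
	if err != nil {
		return false
	}
	return ts > math.MaxInt64/2
}

// inferInverted reads the latest version ID from directory metadata and
// infers the format from it. known is false when no latest version ID is
// stored, in which case the caller has to pick a default.
func inferInverted(meta map[string][]byte) (inverted, known bool) {
	id, ok := meta[latestVersionIDKey]
	if !ok || len(id) == 0 {
		return false, false
	}
	return looksInverted(string(id)), true
}
```

The appeal of this style of detection is that it needs no extra stored field:
the existing latest version ID already carries enough information.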

2025-12-18 02:52:50 -08:00

11 KiB
