Commit Graph

13 Commits

Author SHA1 Message Date
Chris Lu
f7c27cc81f go fmt 2026-02-20 18:40:47 -08:00
Chris Lu
b57429ef2e Switch empty-folder cleanup to bucket policy (#8292)
* Fix Spark _temporary cleanup and add issue #8285 regression test

* Generalize empty folder cleanup for Spark temp artifacts

* Revert synchronous folder pruning and add cleanup diagnostics

* Add actionable empty-folder cleanup diagnostics

* Fix Spark temp marker cleanup in async folder cleaner

* Fix Spark temp cleanup with implicit directory markers

* Keep explicit directory markers non-implicit

* logging

* more logs

* Switch empty-folder cleanup to bucket policy

* Seaweed-X-Amz-Allow-Empty-Folders

* less logs

* go vet

* less logs

* refactoring
2026-02-10 18:38:38 -08:00
Chris Lu
8880f9932f filer: auto clean empty implicit s3 folders (#8051)
* filer: auto clean empty s3 implicit folders

Explicitly tag implicitly created S3 folders (parent directories from object uploads) with 'Seaweed-X-Amz-Implicit-Dir'.

Update EmptyFolderCleaner to check for this attribute and cache the result efficiently.

* filer: correctly handle nil attributes in empty folder cleaner cache

* filer: refine implicit tagging logic

Prevent tagging buckets as implicit directories. Reduce code duplication.

* filer: safeguard GetEntryAttributes against nil entry and not found error

* filer: move ErrNotFound handling to EmptyFolderCleaner

* filer: add comment to explain level > 3 check for implicit directories
2026-01-17 22:10:15 -08:00
promalert
9012069bd7 chore: execute goimports to format the code (#7983)
* chore: execute goimports to format the code

Signed-off-by: promalert <promalert@outlook.com>

* goimports -w .

---------

Signed-off-by: promalert <promalert@outlook.com>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
2026-01-07 13:06:08 -08:00
Chris Lu
bccef78082 fix: reduce N+1 queries in S3 versioned object list operations (#7814)
* fix: achieve single-scan efficiency for S3 versioned object listing

When listing objects in a versioning-enabled bucket, the original code
triggered multiple getEntry calls per versioned object (up to 12 with
retries), causing excessive 'find' operations visible in Grafana and
leading to high memory usage.

This fix achieves single-scan efficiency by caching list metadata
(size, ETag, mtime, owner) directly in the .versions directory:

1. Add new Extended keys for caching list metadata in .versions dir
2. Update upload/copy/multipart paths to cache metadata when creating versions
3. Update getLatestVersionEntryFromDirectoryEntry to use cached metadata
   (zero getEntry calls when cache is available)
4. Update updateLatestVersionAfterDeletion to maintain cache consistency

Performance improvement for N versioned objects:
- Before: N×1 to N×12 find operations per list request
- After: 0 extra find operations (all metadata from single scan)

This matches the efficiency of normal (non-versioned) object listing.

* Update s3api_object_versioning.go

* s3api: fix ETag handling for versioned objects and simplify delete marker creation

- Add Md5 attribute to synthetic logicalEntry for single-part uploads to ensure
  filer.ETag() returns correct value in ListObjects response
- Simplify delete marker creation by initializing entry directly in mkFile callback
- Add bytes and encoding/hex imports for ETag parsing

* s3api: preserve default attributes in delete marker mkFile callback

Only modify Mtime field instead of replacing the entire Attributes struct,
preserving default values like Crtime, FileMode, Uid, and Gid that mkFile
initializes.

* s3api: fix ETag handling in newListEntry for multipart uploads

Prioritize ExtETagKey from Extended attributes before falling back to
filer.ETag(). This properly handles multipart upload ETags (format: md5-parts)
for versioned objects, where the synthetic entry has cached ETag metadata
but no chunks to calculate from.

* s3api: reduce code duplication in delete marker creation

Extract deleteMarkerExtended map to be reused in both mkFile callback
and deleteMarkerEntry construction.

* test: add multipart upload versioning tests for ETag verification

Add tests to verify that multipart uploaded objects in versioned buckets
have correct ETags when listed:

- TestMultipartUploadVersioningListETag: Basic multipart upload with 2 parts
- TestMultipartUploadMultipleVersionsListETag: Multiple multipart versions
- TestMixedSingleAndMultipartVersionsListETag: Mix of single-part and multipart

These tests cover a bug where synthetic entries for versioned objects
didn't include proper ETag handling for multipart uploads.

* test: add delete marker test for multipart uploaded versioned objects

TestMultipartUploadDeleteMarkerListBehavior verifies:
- Delete marker creation hides object from ListObjectsV2
- ListObjectVersions shows both version and delete marker
- Version ETag (multipart format) is preserved after delete marker
- Object can be accessed by version ID after delete marker
- Removing delete marker restores object visibility

* refactor: address code review feedback

- test: use assert.ElementsMatch for ETag verification (more idiomatic)
- s3api: optimize newListEntry ETag logic (check ExtETagKey first)
- s3api: fix edge case in ETag parsing (>= 2 instead of > 2)

* s3api: prevent stale cached metadata and preserve existing extended attrs

- setCachedListMetadata: clear old cached keys before setting new values
  to prevent stale data when new version lacks certain fields (e.g., owner)
- createDeleteMarker: merge extended attributes instead of overwriting
  to preserve any existing metadata on the entry

* s3api: extract clearCachedVersionMetadata to reduce code duplication

- clearCachedVersionMetadata: clears only metadata fields (size, mtime, etag, owner, deleteMarker)
- clearCachedListMetadata: now reuses clearCachedVersionMetadata + clears ID/filename
- setCachedListMetadata: uses clearCachedVersionMetadata (not clearCachedListMetadata
  because caller has already set ID/filename)

* s3api: share timestamp between version entry and cache entry

Capture versionMtime once before mkFile and reuse for both:
- versionEntry.Attributes.Mtime in the mkFile callback
- versionEntryForCache.Attributes.Mtime for list caching

This keeps list vs. HEAD LastModified timestamps aligned.

* s3api: remove amzAccountId variable shadowing in multipart upload

Extract amzAccountId before mkFile callback and reuse in both places,
similar to how versionMtime is handled. Avoids confusion from
redeclaring the same variable.
2025-12-18 17:44:27 -08:00
Konstantin Lebedev
084b377f87 do delete expired entries on s3 list request (#7426)
* do delete expired entries on s3 list request
https://github.com/seaweedfs/seaweedfs/issues/6837

* disable delete expires s3 entry in filer

* pass opt allowDeleteObjectsByTTL to all servers

* delete on get and head

* add lifecycle expiration s3 tests

* fix opt allowDeleteObjectsByTTL for server

* fix test lifecycle expiration

* fix IsExpired

* fix locationPrefix for updateEntriesTTL

* fix s3tests

* resolv  coderabbitai

* GetS3ExpireTime on filer

* go mod

* clear TtlSeconds for volume

* move s3 delete expired entry to filer

* filer delete meta and data

* del unusing func removeExpiredObject

* test s3 put

* test s3 put multipart

* allowDeleteObjectsByTTL by default

* fix pipline tests

* rm dublicate SeaweedFSExpiresS3

* revert expiration tests

* fix updateTTL

* rm log

* resolv comment

* fix delete version object

* fix S3Versioning

* fix delete on FindEntry

* fix delete chunks

* fix sqlite not support concurrent writes/reads

* move deletion out of listing transaction; delete entries and empty folders

* Revert "fix sqlite not support concurrent writes/reads"

This reverts commit 5d5da14e0ed91c613fe5c0ed058f58bb04fba6f0.

* clearer handling on recursive empty directory deletion

* handle listing errors

* strut copying

* reuse code to delete empty folders

* use iterative approach with a queue to avoid recursive WithFilerClient calls

* stop a gRPC stream from the client-side callback is to return a specific error, e.g., io.EOF

* still issue UpdateEntry when the flag must be added

* errors join

* join path

* cleaner

* add context, sort directories by depth (deepest first) to avoid redundant checks

* batched operation, refactoring

* prevent deleting bucket

* constant

* reuse code

* more logging

* refactoring

* s3 TTL time

* Safety check

---------

Co-authored-by: chrislu <chris.lu@gmail.com>
2025-11-05 22:05:54 -08:00
Chris Lu
c6a22ce43a Fix get object lock configuration handler (#6996)
* fix GetObjectLockConfigurationHandler

* cache and use bucket object lock config

* subscribe to bucket configuration changes

* increase bucket config cache TTL

* refactor

* Update weed/s3api/s3api_server.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* avoid duplidated work

* rename variable

* Update s3api_object_handlers_put.go

* fix routing

* admin ui and api handler are consistent now

* use fields instead of xml

* fix test

* address comments

* Update weed/s3api/s3api_object_handlers_put.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update test/s3/retention/s3_retention_test.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update weed/s3api/object_lock_utils.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* change error style

* errorf

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-18 02:19:50 -07:00
Chris Lu
dde1cf63c2 S3 Object Lock: ensure x-amz-bucket-object-lock-enabled header (#6990)
* ensure x-amz-bucket-object-lock-enabled header

* fix tests

* combine 2 metadata changes into one

* address comments

* Update s3api_bucket_handlers.go

* Update weed/s3api/s3api_bucket_handlers.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update test/s3/retention/object_lock_reproduce_test.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update test/s3/retention/object_lock_validation_test.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update test/s3/retention/s3_bucket_object_lock_test.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update weed/s3api/s3api_bucket_handlers.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update weed/s3api/s3api_bucket_handlers.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update test/s3/retention/s3_bucket_object_lock_test.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update weed/s3api/s3api_bucket_handlers.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* package name

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-15 23:21:58 -07:00
Chris Lu
7cb1ca1308 Add policy engine (#6970) 2025-07-13 16:21:36 -07:00
Chris Lu
1549ee2e15 implement PubObjectRetention and WORM (#6969)
* implement PubObjectRetention and WORM

* Update s3_worm_integration_test.go

* avoid previous buckets

* Update s3-versioning-tests.yml

* address comments

* address comments

* rename to ExtObjectLockModeKey

* only checkObjectLockPermissions if versioningEnabled

* address comments

* comments

* Revert "comments"

This reverts commit 6736434176f86c6e222b867777324b17c2de716f.

* Update s3api_object_handlers_skip.go

* Update s3api_object_retention_test.go

* add version id to ObjectIdentifier

* address comments

* add comments

* Add proper error logging for timestamp parsing failures

* address comments

* add version id to the error

* Update weed/s3api/s3api_object_retention_test.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update weed/s3api/s3api_object_retention.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* constants

* fix comments

* address comments

* address comment

* refactor out handleObjectLockAvailabilityCheck

* errors.Is ErrBucketNotFound

* better error checking

* address comments

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-12 21:58:55 -07:00
Chris Lu
cf5a24983a S3: add object versioning (#6945)
* add object versioning

* add missing file

* Update weed/s3api/s3api_object_versioning.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update weed/s3api/s3api_object_versioning.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update weed/s3api/s3api_object_versioning.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* ListObjectVersionsResult is better to show multiple version entries

* fix test

* Update weed/s3api/s3api_object_handlers_put.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update weed/s3api/s3api_object_versioning.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* multiple improvements

* move PutBucketVersioningHandler into weed/s3api/s3api_bucket_handlers.go file
* duplicated code for reading bucket config, versioningEnabled, etc. try to use functions
* opportunity to cache bucket config

* error handling if bucket is not found

* in case bucket is not found

* fix build

* add object versioning tests

* remove non-existent tests

* add tests

* add versioning tests

* skip a new test

* ensure .versions directory exists before saving info into it

* fix creating version entry

* logging on creating version directory

* Update s3api_object_versioning_test.go

* retry and wait for directory creation

* revert add more logging

* Update s3api_object_versioning.go

* more debug messages

* clean up logs, and touch directory correctly

* log the .versions creation and then parent directory listing

* use mkFile instead of touch

touch is for update

* clean up data

* add versioning test in go

* change location

* if modified, latest version is moved to .versions directory, and create a new latest version

 Core versioning functionality: WORKING
TestVersioningBasicWorkflow - PASS
TestVersioningDeleteMarkers - PASS
TestVersioningMultipleVersionsSameObject - PASS
TestVersioningDeleteAndRecreate - PASS
TestVersioningListWithPagination - PASS
 Some advanced features still failing:
ETag calculation issues (using mtime instead of proper MD5)
Specific version retrieval (EOF error)
Version deletion (internal errors)
Concurrent operations (race conditions)

* calculate multi chunk md5

Test Results - All Passing:
 TestBucketListReturnDataVersioning - PASS
 TestVersioningCreateObjectsInOrder - PASS
 TestVersioningBasicWorkflow - PASS
 TestVersioningMultipleVersionsSameObject - PASS
 TestVersioningDeleteMarkers - PASS

* dedupe

* fix TestVersioningErrorCases

* fix eof error of reading old versions

* get specific version also check current version

* enable integration tests for versioning

* trigger action to work for now

* Fix GitHub Actions S3 versioning tests workflow

- Fix syntax error (incorrect indentation)
- Update directory paths from weed/s3api/versioning_tests/ to test/s3/versioning/
- Add push trigger for add-object-versioning branch to enable CI during development
- Update artifact paths to match correct directory structure

* Improve CI robustness for S3 versioning tests

Makefile improvements:
- Increase server startup timeout from 30s to 90s for CI environments
- Add progressive timeout reporting (logs at 30s, full logs at 90s)
- Better error handling with server logs on failure
- Add server PID tracking for debugging
- Improved test failure reporting

GitHub Actions workflow improvements:
- Increase job timeouts to account for CI environment delays
- Add system information logging (memory, disk space)
- Add detailed failure reporting with server logs
- Add process and network diagnostics on failure
- Better error messaging and log collection

These changes should resolve the 'Server failed to start within 30 seconds' issue
that was causing the CI tests to fail.

* adjust testing volume size

* Update Makefile

* Update Makefile

* Update Makefile

* Update Makefile

* Update s3-versioning-tests.yml

* Update s3api_object_versioning.go

* Update Makefile

* do not clean up

* log received version id

* more logs

* printout response

* print out list version response

* use tmp files when put versioned object

* change to versions folder layout

* Delete weed-test.log

* test with mixed versioned and unversioned objects

* remove versionDirCache

* remove unused functions

* remove unused function

* remove fallback checking

* minor

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-09 01:51:45 -07:00
LHHDZ
d21e2f523d split ExtAcpKey to ExtAmzOwnerKey and ExtAmzAclKey to avoid unn… (#3824)
split `ExtAcpKey` to `ExtAmzOwnerKey` and `ExtAmzAclKey` to avoid unnecessary `json.Unmarshal()` call

Signed-off-by: changlin.shi <changlin.shi@ly.com>

Signed-off-by: changlin.shi <changlin.shi@ly.com>
2022-10-11 20:14:14 -07:00
LHHDZ
3de1e19780 s3: sync bucket info from filer (#3759) 2022-09-29 12:29:01 -07:00