seaweedFS/weed/plugin/worker/lifecycle/rules_test.go
Chris Lu 9c3bc138a0 lifecycle worker: scan-time rule evaluation for object expiration (#8809)
* s3api: extend lifecycle XML types with NoncurrentVersionExpiration, AbortIncompleteMultipartUpload

Add missing S3 lifecycle rule types to the XML data model:
- NoncurrentVersionExpiration with NoncurrentDays and NewerNoncurrentVersions
- NoncurrentVersionTransition with NoncurrentDays and StorageClass
- AbortIncompleteMultipartUpload with DaysAfterInitiation
- Filter.ObjectSizeGreaterThan and ObjectSizeLessThan
- And.ObjectSizeGreaterThan and ObjectSizeLessThan
- Filter.UnmarshalXML to properly parse Tag, And, and size filter elements

Each new type follows the existing set-field pattern for conditional
XML marshaling. No behavior changes - these types are not yet wired
into handlers or the lifecycle worker.

* s3lifecycle: add lifecycle rule evaluator package

New package weed/s3api/s3lifecycle/ provides a pure-function lifecycle
rule evaluation engine. The evaluator accepts flattened Rule structs and
ObjectInfo metadata, and returns the appropriate Action.

Components:
- evaluator.go: Evaluate() for per-object actions with S3 priority
  ordering (delete marker > noncurrent version > current expiration),
  ShouldExpireNoncurrentVersion() with NewerNoncurrentVersions support,
  EvaluateMPUAbort() for multipart upload rules
- filter.go: prefix, tag, and size-based filter matching
- tags.go: ExtractTags() extracts S3 tags from filer Extended metadata,
  HasTagRules() for scan-time optimization
- version_time.go: GetVersionTimestamp() extracts timestamps from
  SeaweedFS version IDs (both old and new format)

Comprehensive test coverage: 54 tests covering all action types,
filter combinations, edge cases, and version ID formats.
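The filter matching described for filter.go can be illustrated with a minimal sketch. The `rule` and `object` types and the `matches` function below are hypothetical stand-ins, not the package's actual API; the sketch assumes a zero size bound means "no constraint" and that size comparisons are strict, as in S3.

```go
package main

import (
	"fmt"
	"strings"
)

// rule and object are hypothetical, trimmed-down stand-ins for the
// package's Rule and ObjectInfo types; the field names are assumptions.
type rule struct {
	Prefix          string
	Tags            map[string]string
	SizeGreaterThan int64 // 0 is treated as "no lower bound" in this sketch
	SizeLessThan    int64 // 0 is treated as "no upper bound" in this sketch
}

type object struct {
	Key  string
	Size int64
	Tags map[string]string
}

// matches reports whether obj satisfies every constraint set on r.
func matches(r rule, obj object) bool {
	if !strings.HasPrefix(obj.Key, r.Prefix) {
		return false
	}
	// Every tag required by the rule must be present with the same value.
	for k, v := range r.Tags {
		if got, ok := obj.Tags[k]; !ok || got != v {
			return false
		}
	}
	if r.SizeGreaterThan > 0 && obj.Size <= r.SizeGreaterThan {
		return false
	}
	if r.SizeLessThan > 0 && obj.Size >= r.SizeLessThan {
		return false
	}
	return true
}

func main() {
	r := rule{Prefix: "data/", Tags: map[string]string{"env": "staging"}, SizeGreaterThan: 1024}
	big := object{Key: "data/a.bin", Size: 4096, Tags: map[string]string{"env": "staging"}}
	small := object{Key: "data/b.bin", Size: 512, Tags: map[string]string{"env": "staging"}}
	fmt.Println(matches(r, big), matches(r, small))
}
```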

* s3api: add UnmarshalXML for Expiration, Transition, ExpireDeleteMarker

Add UnmarshalXML methods that set the internal 'set' flag during XML
parsing. Previously these flags were only set programmatically, causing
XML round-trip to drop elements. This ensures lifecycle configurations
stored as XML survive unmarshal/marshal cycles correctly.

Add comprehensive XML round-trip tests for all lifecycle rule types
including NoncurrentVersionExpiration, AbortIncompleteMultipartUpload,
Filter with Tag/And/size constraints, and a complete Terraform-style
lifecycle configuration.
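The round-trip problem above can be reproduced with a small sketch of the set-flag pattern. The `Expiration` and `Rule` types here are simplified, hypothetical versions of the s3api types, and `roundTrip` is a helper invented for illustration; only the `encoding/xml` calls are real.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Expiration is a hypothetical, simplified stand-in for the s3api type.
// The unexported set flag records whether the element was present.
type Expiration struct {
	Days int `xml:"Days,omitempty"`
	set  bool
}

// MarshalXML writes the element only when it was explicitly set.
func (e Expiration) MarshalXML(enc *xml.Encoder, start xml.StartElement) error {
	if !e.set {
		return nil // writing nothing omits the element
	}
	type plain Expiration // no methods, so no recursion into MarshalXML
	return enc.EncodeElement(plain(e), start)
}

// UnmarshalXML decodes the element and marks it as set, so that a later
// marshal does not silently drop it.
func (e *Expiration) UnmarshalXML(dec *xml.Decoder, start xml.StartElement) error {
	type plain Expiration
	var p plain
	if err := dec.DecodeElement(&p, &start); err != nil {
		return err
	}
	*e = Expiration(p)
	e.set = true
	return nil
}

type Rule struct {
	XMLName    xml.Name   `xml:"Rule"`
	ID         string     `xml:"ID"`
	Expiration Expiration `xml:"Expiration"`
}

// roundTrip unmarshals and re-marshals a rule (illustrative helper).
func roundTrip(in string) (string, error) {
	var r Rule
	if err := xml.Unmarshal([]byte(in), &r); err != nil {
		return "", err
	}
	out, err := xml.Marshal(r)
	return string(out), err
}

func main() {
	out, err := roundTrip(`<Rule><ID>r</ID><Expiration><Days>30</Days></Expiration></Rule>`)
	fmt.Println(out, err)
}
```

Without the `e.set = true` line in `UnmarshalXML`, the re-marshaled rule in this sketch would lose its `<Expiration>` element.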

* s3lifecycle: address review feedback

- Fix version_time.go overflow: guard timestampPart > MaxInt64 before
  the inversion subtraction to prevent uint64 wrap
- Make all expiry checks inclusive (!now.Before instead of now.After)
  so actions trigger at the exact scheduled instant
- Add NoncurrentIndex to ObjectInfo so Evaluate() can properly handle
  NewerNoncurrentVersions via ShouldExpireNoncurrentVersion()
- Add test for high-bit overflow version ID

* s3lifecycle: guard ShouldExpireNoncurrentVersion against zero SuccessorModTime

Add early return when obj.IsLatest or obj.SuccessorModTime.IsZero()
to prevent premature expiration of versions with uninitialized
successor timestamps (zero value would compute to epoch, always expired).
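A minimal sketch of that guard, assuming a noncurrent version becomes due N days after its successor's modification time. The `versionInfo` type and `shouldExpireNoncurrent` function are hypothetical simplifications and ignore NewerNoncurrentVersions.

```go
package main

import (
	"fmt"
	"time"
)

const day = 24 * time.Hour

// versionInfo is a hypothetical subset of the fields on ObjectInfo.
type versionInfo struct {
	IsLatest         bool
	SuccessorModTime time.Time // when a newer version replaced this one
}

// sampleNow is an arbitrary "current time" used by the example.
var sampleNow = time.Date(2026, 3, 28, 0, 0, 0, 0, time.UTC)

// shouldExpireNoncurrent returns early for the latest version and for
// versions whose successor time was never populated. Without the IsZero
// guard, the zero time lies so far in the past that the version would
// always look expired.
func shouldExpireNoncurrent(v versionInfo, noncurrentDays int, now time.Time) bool {
	if v.IsLatest || v.SuccessorModTime.IsZero() {
		return false
	}
	due := v.SuccessorModTime.Add(time.Duration(noncurrentDays) * day)
	return !now.Before(due)
}

func main() {
	unknown := versionInfo{} // successor time not populated
	old := versionInfo{SuccessorModTime: sampleNow.Add(-8 * day)}
	fmt.Println(shouldExpireNoncurrent(unknown, 7, sampleNow))
	fmt.Println(shouldExpireNoncurrent(old, 7, sampleNow))
}
```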

* lifecycle worker: detect buckets with lifecycle XML, not just filer.conf TTLs

Update the detection phase to check for stored lifecycle XML in bucket
metadata (key: s3-bucket-lifecycle-configuration-xml) in addition to
filer.conf TTL entries. A bucket is proposed for lifecycle processing if
it has lifecycle XML OR filer.conf TTLs (backward compatible).

New proposal parameters:
- has_lifecycle_xml: whether the bucket has stored lifecycle XML
- versioning_status: the bucket's versioning state (Enabled/Suspended/"")

These parameters will be used by the execution phase (subsequent PR)
to determine which evaluation path to use.
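The detection rule amounts to a simple OR. The sketch below assumes bucket metadata is available as a `map[string][]byte`; the `needsLifecycle` function and its parameters are hypothetical, and only the metadata key comes from the description above.

```go
package main

import "fmt"

// lifecycleXMLKey is the bucket metadata key named in the commit message.
const lifecycleXMLKey = "s3-bucket-lifecycle-configuration-xml"

// needsLifecycle is a hypothetical helper: a bucket is proposed when it has
// stored lifecycle XML or filer.conf TTL entries. The second result mirrors
// the has_lifecycle_xml proposal parameter.
func needsLifecycle(extended map[string][]byte, hasTTLRules bool) (propose, hasXML bool) {
	hasXML = len(extended[lifecycleXMLKey]) > 0
	return hasXML || hasTTLRules, hasXML
}

func main() {
	meta := map[string][]byte{lifecycleXMLKey: []byte("<LifecycleConfiguration/>")}
	propose, hasXML := needsLifecycle(meta, false)
	fmt.Println(propose, hasXML)
}
```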

* lifecycle worker: update detection function comment to reflect XML support

* lifecycle worker: add lifecycle XML parsing and rule conversion

Add rules.go with:
- parseLifecycleXML() converts stored lifecycle XML to evaluator-friendly
  s3lifecycle.Rule structs, handling Filter.Prefix, Filter.Tag, Filter.And,
  size constraints, NoncurrentVersionExpiration, AbortIncompleteMultipartUpload,
  Expiration.Date, and ExpiredObjectDeleteMarker
- loadLifecycleRulesFromBucket() reads lifecycle XML from bucket metadata
- parseExpirationDate() supports RFC3339 and ISO 8601 date-only formats

Comprehensive tests for all XML variants, filter types, and date formats.
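One way to accept both date formats is to try the layouts in order. This is a sketch under that assumption, not the actual parseExpirationDate; `parseDateSketch` is a hypothetical name, and the date-only form is interpreted as midnight UTC.

```go
package main

import (
	"fmt"
	"time"
)

// parseDateSketch tries RFC 3339 first, then a date-only layout.
// time.Parse returns UTC when the layout carries no zone information.
func parseDateSketch(s string) (time.Time, error) {
	if t, err := time.Parse(time.RFC3339, s); err == nil {
		return t, nil
	}
	t, err := time.Parse("2006-01-02", s)
	if err != nil {
		return time.Time{}, fmt.Errorf("unrecognized expiration date %q", s)
	}
	return t, nil
}

func main() {
	for _, in := range []string{"2026-06-01T00:00:00Z", "2026-06-01", "not-a-date"} {
		t, err := parseDateSketch(in)
		fmt.Println(in, "->", t, err)
	}
}
```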

* lifecycle worker: add scan-time rule evaluation for object expiration

Update executeLifecycleForBucket to try lifecycle XML evaluation first,
falling back to TTL-only evaluation when no lifecycle XML exists.

New listExpiredObjectsByRules() function:
- Walks the bucket directory tree
- Builds s3lifecycle.ObjectInfo from each filer entry
- Calls s3lifecycle.Evaluate() to check lifecycle rules
- Skips objects already handled by TTL fast path (TtlSec set)
- Extracts tags only when rules use tag-based filters (optimization)
- Skips .uploads and .versions directories (handled by other phases)

Supports Expiration.Days, Expiration.Date, Filter.Prefix, Filter.Tag,
Filter.And, and Filter.ObjectSize* in the scan-time evaluation path.
Existing TTL-based path remains for backward compatibility.

* lifecycle worker: address review feedback

- Use sentinel error (errLimitReached) instead of string matching
  for scan limit detection
- Fix loadLifecycleRulesFromBucket path: use bucketsPath directly
  as directory for LookupEntry instead of path.Dir which produced
  the wrong parent
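The sentinel-error approach can be sketched as follows. Only the name `errLimitReached` comes from the change description; the `collect` and `hitLimit` functions and the error text are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errLimitReached signals that a scan stopped because it hit its limit.
var errLimitReached = errors.New("scan limit reached")

// collect is a hypothetical scan loop that stops once limit keys are
// gathered and reports that condition by wrapping the sentinel error.
func collect(keys []string, limit int) ([]string, error) {
	var out []string
	for _, k := range keys {
		if len(out) >= limit {
			return out, fmt.Errorf("stopping before %q: %w", k, errLimitReached)
		}
		out = append(out, k)
	}
	return out, nil
}

// hitLimit uses errors.Is, which keeps working if the message changes or
// the error is wrapped, unlike matching on err.Error().
func hitLimit(err error) bool {
	return errors.Is(err, errLimitReached)
}

func main() {
	got, err := collect([]string{"a", "b", "c"}, 2)
	if hitLimit(err) {
		fmt.Printf("stopped early with %d keys\n", len(got))
	} else if err != nil {
		fmt.Println("scan failed:", err)
	}
}
```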

* lifecycle worker: fix And filter detection for size-only constraints

The And branch condition only triggered when Prefix or Tags were present,
missing the case where And contains only ObjectSizeGreaterThan or
ObjectSizeLessThan without a prefix or tags.
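The corrected condition can be pictured with a small sketch. The `andFilter` type and `andHasConstraints` function are hypothetical simplifications of the parsed XML types; the point is only that size bounds count as constraints on their own.

```go
package main

import "fmt"

// andFilter is a hypothetical, flattened view of a parsed <And> element.
type andFilter struct {
	Prefix                string
	Tags                  map[string]string
	ObjectSizeGreaterThan int64
	ObjectSizeLessThan    int64
}

// andHasConstraints reports whether the And element carries anything to
// apply. Checking only Prefix and Tags would miss size-only filters.
func andHasConstraints(a andFilter) bool {
	return a.Prefix != "" ||
		len(a.Tags) > 0 ||
		a.ObjectSizeGreaterThan > 0 ||
		a.ObjectSizeLessThan > 0
}

func main() {
	sizeOnly := andFilter{ObjectSizeGreaterThan: 1024, ObjectSizeLessThan: 1 << 20}
	fmt.Println(andHasConstraints(sizeOnly), andHasConstraints(andFilter{}))
}
```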

* lifecycle worker: address review feedback round 3

- rules.go: pass through Filter-level size constraints when Tag is
  present without And (Tag+size combination was dropping sizes)
- execution.go: add doc comment to listExpiredObjectsByRules noting
  that it handles non-versioned objects only; versioned objects are
  handled by processVersionsDirectory
- rules_test.go: add bounds checks before indexing rules[0]

---------

Co-authored-by: Copilot <copilot@github.com>
2026-03-28 11:39:50 -07:00

257 lines
7.0 KiB
Go

package lifecycle

import (
	"testing"
	"time"
)

func TestParseLifecycleXML_CompleteConfig(t *testing.T) {
	xml := []byte(`<LifecycleConfiguration>
	<Rule>
		<ID>rotation</ID>
		<Filter><Prefix></Prefix></Filter>
		<Status>Enabled</Status>
		<Expiration><Days>30</Days></Expiration>
		<NoncurrentVersionExpiration>
			<NoncurrentDays>7</NoncurrentDays>
			<NewerNoncurrentVersions>2</NewerNoncurrentVersions>
		</NoncurrentVersionExpiration>
		<AbortIncompleteMultipartUpload>
			<DaysAfterInitiation>3</DaysAfterInitiation>
		</AbortIncompleteMultipartUpload>
	</Rule>
</LifecycleConfiguration>`)
	rules, err := parseLifecycleXML(xml)
	if err != nil {
		t.Fatalf("parseLifecycleXML: %v", err)
	}
	if len(rules) != 1 {
		t.Fatalf("expected 1 rule, got %d", len(rules))
	}
	r := rules[0]
	if r.ID != "rotation" {
		t.Errorf("expected ID 'rotation', got %q", r.ID)
	}
	if r.Status != "Enabled" {
		t.Errorf("expected Status 'Enabled', got %q", r.Status)
	}
	if r.ExpirationDays != 30 {
		t.Errorf("expected ExpirationDays=30, got %d", r.ExpirationDays)
	}
	if r.NoncurrentVersionExpirationDays != 7 {
		t.Errorf("expected NoncurrentVersionExpirationDays=7, got %d", r.NoncurrentVersionExpirationDays)
	}
	if r.NewerNoncurrentVersions != 2 {
		t.Errorf("expected NewerNoncurrentVersions=2, got %d", r.NewerNoncurrentVersions)
	}
	if r.AbortMPUDaysAfterInitiation != 3 {
		t.Errorf("expected AbortMPUDaysAfterInitiation=3, got %d", r.AbortMPUDaysAfterInitiation)
	}
}

func TestParseLifecycleXML_PrefixFilter(t *testing.T) {
	xml := []byte(`<LifecycleConfiguration>
	<Rule>
		<ID>logs</ID>
		<Status>Enabled</Status>
		<Filter><Prefix>logs/</Prefix></Filter>
		<Expiration><Days>7</Days></Expiration>
	</Rule>
</LifecycleConfiguration>`)
	rules, err := parseLifecycleXML(xml)
	if err != nil {
		t.Fatalf("parseLifecycleXML: %v", err)
	}
	if len(rules) != 1 {
		t.Fatalf("expected 1 rule, got %d", len(rules))
	}
	if rules[0].Prefix != "logs/" {
		t.Errorf("expected Prefix='logs/', got %q", rules[0].Prefix)
	}
}

func TestParseLifecycleXML_LegacyPrefix(t *testing.T) {
	// Old-style <Prefix> at rule level instead of inside <Filter>.
	xml := []byte(`<LifecycleConfiguration>
	<Rule>
		<ID>old</ID>
		<Status>Enabled</Status>
		<Prefix>archive/</Prefix>
		<Expiration><Days>90</Days></Expiration>
	</Rule>
</LifecycleConfiguration>`)
	rules, err := parseLifecycleXML(xml)
	if err != nil {
		t.Fatalf("parseLifecycleXML: %v", err)
	}
	if len(rules) != 1 {
		t.Fatalf("expected 1 rule, got %d", len(rules))
	}
	if rules[0].Prefix != "archive/" {
		t.Errorf("expected Prefix='archive/', got %q", rules[0].Prefix)
	}
}

func TestParseLifecycleXML_TagFilter(t *testing.T) {
	xml := []byte(`<LifecycleConfiguration>
	<Rule>
		<ID>tag-rule</ID>
		<Status>Enabled</Status>
		<Filter>
			<Tag><Key>env</Key><Value>dev</Value></Tag>
		</Filter>
		<Expiration><Days>1</Days></Expiration>
	</Rule>
</LifecycleConfiguration>`)
	rules, err := parseLifecycleXML(xml)
	if err != nil {
		t.Fatalf("parseLifecycleXML: %v", err)
	}
	if len(rules) != 1 {
		t.Fatalf("expected 1 rule, got %d", len(rules))
	}
	r := rules[0]
	if len(r.FilterTags) != 1 || r.FilterTags["env"] != "dev" {
		t.Errorf("expected FilterTags={env:dev}, got %v", r.FilterTags)
	}
}

func TestParseLifecycleXML_AndFilter(t *testing.T) {
	xml := []byte(`<LifecycleConfiguration>
	<Rule>
		<ID>and-rule</ID>
		<Status>Enabled</Status>
		<Filter>
			<And>
				<Prefix>data/</Prefix>
				<Tag><Key>env</Key><Value>staging</Value></Tag>
				<ObjectSizeGreaterThan>1024</ObjectSizeGreaterThan>
			</And>
		</Filter>
		<Expiration><Days>14</Days></Expiration>
	</Rule>
</LifecycleConfiguration>`)
	rules, err := parseLifecycleXML(xml)
	if err != nil {
		t.Fatalf("parseLifecycleXML: %v", err)
	}
	if len(rules) != 1 {
		t.Fatalf("expected 1 rule, got %d", len(rules))
	}
	r := rules[0]
	if r.Prefix != "data/" {
		t.Errorf("expected Prefix='data/', got %q", r.Prefix)
	}
	if r.FilterTags["env"] != "staging" {
		t.Errorf("expected tag env=staging, got %v", r.FilterTags)
	}
	if r.FilterSizeGreaterThan != 1024 {
		t.Errorf("expected FilterSizeGreaterThan=1024, got %d", r.FilterSizeGreaterThan)
	}
}

func TestParseLifecycleXML_ExpirationDate(t *testing.T) {
	xml := []byte(`<LifecycleConfiguration>
	<Rule>
		<ID>date-rule</ID>
		<Status>Enabled</Status>
		<Filter><Prefix></Prefix></Filter>
		<Expiration><Date>2026-06-01T00:00:00Z</Date></Expiration>
	</Rule>
</LifecycleConfiguration>`)
	rules, err := parseLifecycleXML(xml)
	if err != nil {
		t.Fatalf("parseLifecycleXML: %v", err)
	}
	// Bounds check before indexing rules[0], so a parse that yields no
	// rules fails the test instead of panicking.
	if len(rules) != 1 {
		t.Fatalf("expected 1 rule, got %d", len(rules))
	}
	expected := time.Date(2026, 6, 1, 0, 0, 0, 0, time.UTC)
	if !rules[0].ExpirationDate.Equal(expected) {
		t.Errorf("expected ExpirationDate=%v, got %v", expected, rules[0].ExpirationDate)
	}
}

func TestParseLifecycleXML_ExpiredObjectDeleteMarker(t *testing.T) {
	xml := []byte(`<LifecycleConfiguration>
	<Rule>
		<ID>marker-cleanup</ID>
		<Status>Enabled</Status>
		<Filter><Prefix></Prefix></Filter>
		<Expiration><ExpiredObjectDeleteMarker>true</ExpiredObjectDeleteMarker></Expiration>
	</Rule>
</LifecycleConfiguration>`)
	rules, err := parseLifecycleXML(xml)
	if err != nil {
		t.Fatalf("parseLifecycleXML: %v", err)
	}
	// Bounds check before indexing rules[0], so a parse that yields no
	// rules fails the test instead of panicking.
	if len(rules) != 1 {
		t.Fatalf("expected 1 rule, got %d", len(rules))
	}
	if !rules[0].ExpiredObjectDeleteMarker {
		t.Error("expected ExpiredObjectDeleteMarker=true")
	}
}

func TestParseLifecycleXML_MultipleRules(t *testing.T) {
	xml := []byte(`<LifecycleConfiguration>
	<Rule>
		<ID>rule1</ID>
		<Status>Enabled</Status>
		<Filter><Prefix>logs/</Prefix></Filter>
		<Expiration><Days>7</Days></Expiration>
	</Rule>
	<Rule>
		<ID>rule2</ID>
		<Status>Disabled</Status>
		<Filter><Prefix>temp/</Prefix></Filter>
		<Expiration><Days>1</Days></Expiration>
	</Rule>
	<Rule>
		<ID>rule3</ID>
		<Status>Enabled</Status>
		<Filter><Prefix></Prefix></Filter>
		<Expiration><Days>365</Days></Expiration>
	</Rule>
</LifecycleConfiguration>`)
	rules, err := parseLifecycleXML(xml)
	if err != nil {
		t.Fatalf("parseLifecycleXML: %v", err)
	}
	if len(rules) != 3 {
		t.Fatalf("expected 3 rules, got %d", len(rules))
	}
	if rules[1].Status != "Disabled" {
		t.Errorf("expected rule2 Status=Disabled, got %q", rules[1].Status)
	}
}

func TestParseExpirationDate(t *testing.T) {
	tests := []struct {
		name    string
		input   string
		want    time.Time
		wantErr bool
	}{
		{"rfc3339_utc", "2026-06-01T00:00:00Z", time.Date(2026, 6, 1, 0, 0, 0, 0, time.UTC), false},
		{"rfc3339_offset", "2026-06-01T00:00:00+05:00", time.Date(2026, 6, 1, 0, 0, 0, 0, time.FixedZone("", 5*3600)), false},
		{"date_only", "2026-06-01", time.Date(2026, 6, 1, 0, 0, 0, 0, time.UTC), false},
		{"invalid", "not-a-date", time.Time{}, true},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got, err := parseExpirationDate(tt.input)
			if (err != nil) != tt.wantErr {
				t.Errorf("parseExpirationDate(%q) error = %v, wantErr %v", tt.input, err, tt.wantErr)
				return
			}
			if !tt.wantErr && !got.Equal(tt.want) {
				t.Errorf("parseExpirationDate(%q) = %v, want %v", tt.input, got, tt.want)
			}
		})
	}
}