Files
seaweedFS/weed/command/filer_backup.go
Chris Lu d75162370c Fix trust policy wildcard principal handling (#7970)
* Fix trust policy wildcard principal handling

This change fixes the trust policy validation to properly support
AWS-standard wildcard principals like {"Federated": "*"}.

Previously, the evaluatePrincipalValue() function would check for
context existence before evaluating wildcards, causing wildcard
principals to fail when the context key didn't exist. This forced
users to use the plain "*" workaround instead of the more specific
{"Federated": "*"} format.

Changes:
- Modified evaluatePrincipalValue() to check for "*" FIRST before
  validating against context
- Added support for wildcards in principal arrays
- Added comprehensive tests for wildcard principal handling
- All existing tests continue to pass (no regressions)

This matches AWS IAM behavior where "*" in a principal field means
"allow any value" without requiring context validation.

Fixes: https://github.com/seaweedfs/seaweedfs/issues/7917

* Refactor: Move Principal matching to PolicyEngine

This refactoring consolidates all policy evaluation logic into the
PolicyEngine, improving code organization and eliminating duplication.

Changes:
- Added matchesPrincipal() and evaluatePrincipalValue() to PolicyEngine
- Added EvaluateTrustPolicy() method for direct trust policy evaluation
- Updated statementMatches() to check Principal field when present
- Made resource matching optional (trust policies don't have Resources)
- Simplified evaluateTrustPolicy() in iam_manager.go to delegate to PolicyEngine
- Removed ~170 lines of duplicate code from iam_manager.go

Benefits:
- Single source of truth for all policy evaluation
- Better code reusability and maintainability
- Consistent evaluation rules for all policy types
- Easier to test and debug

All tests pass with no regressions.

* Make PolicyEngine AWS-compatible and add unit tests

Changes:
1. AWS-Compatible Context Keys:
   - Changed "seaweed:FederatedProvider" -> "aws:FederatedProvider"
   - Changed "seaweed:AWSPrincipal" -> "aws:PrincipalArn"
   - Changed "seaweed:ServicePrincipal" -> "aws:PrincipalServiceName"
   - This ensures 100% AWS compatibility for trust policies

2. Added Comprehensive Unit Tests:
   - TestPrincipalMatching: 8 test cases for Principal matching
   - TestEvaluatePrincipalValue: 7 test cases for value evaluation
   - TestTrustPolicyEvaluation: 6 test cases for trust policy evaluation
   - TestGetPrincipalContextKey: 4 test cases for context key mapping
   - Total: 25 new unit tests for PolicyEngine

All tests pass:
- Policy engine tests: 54 passed
- Integration tests: 9 passed
- Total: 63 tests passing

* Update context keys to standard AWS/OIDC formats

Replaced remaining seaweed: context keys with standard AWS and OIDC
keys to ensure 100% compatibility with AWS IAM policies.

Mappings:
- seaweed:TokenIssuer -> oidc:iss
- seaweed:Issuer -> oidc:iss
- seaweed:Subject -> oidc:sub
- seaweed:SourceIP -> aws:SourceIp

Also updated unit tests to reflect these changes.

All 63 tests pass successfully.

* Add advanced policy tests for variable substitution and conditions

Added comprehensive tests inspired by AWS IAM patterns:
- TestPolicyVariableSubstitution: Tests ${oidc:sub} variable in resources
- TestConditionWithNumericComparison: Tests sts:DurationSeconds condition
- TestMultipleConditionOperators: Tests combining StringEquals and StringLike

Results:
- TestMultipleConditionOperators:  All 3 subtests pass
- Other tests reveal need for sts:DurationSeconds context population

These tests validate the PolicyEngine's ability to handle complex
AWS-compatible policy scenarios.

* Fix federated provider context and add DurationSeconds support

Changes:
- Use iss claim as aws:FederatedProvider (AWS standard)
- Add sts:DurationSeconds to trust policy evaluation context
- TestPolicyVariableSubstitution now passes 

Remaining work:
- TestConditionWithNumericComparison partially works (1/3 pass)
- Need to investigate NumericLessThanEquals evaluation

* Update trust policies to use issuer URL for AWS compatibility

Changed trust policy from using provider name ("test-oidc") to
using the issuer URL ("https://test-issuer.com") to match AWS
standard behavior where aws:FederatedProvider contains the OIDC
issuer URL.

Test Results:
- 10/12 test suites passing
- TestFullOIDCWorkflow:  All subtests pass
- TestPolicyEnforcement:  All subtests pass
- TestSessionExpiration:  Pass
- TestPolicyVariableSubstitution:  Pass
- TestMultipleConditionOperators:  All subtests pass

Remaining work:
- TestConditionWithNumericComparison needs investigation
- One subtest in TestTrustPolicyValidation needs fix

* Fix S3 API tests for AWS compatibility

Updated all S3 API tests to use AWS-compatible context keys and
trust policy principals:

Changes:
- seaweed:SourceIP → aws:SourceIp (IP-based conditions)
- Federated: "test-oidc" → "https://test-issuer.com" (trust policies)

Test Results:
- TestS3EndToEndWithJWT:  All 13 subtests pass
- TestIPBasedPolicyEnforcement:  All 3 subtests pass

This ensures policies are 100% AWS-compatible and portable.

* Fix ValidateTrustPolicy for AWS compatibility

Updated ValidateTrustPolicy method to check for:
- OIDC: issuer URL ("https://test-issuer.com")
- LDAP: provider name ("test-ldap")
- Wildcard: "*"

Test Results:
- TestTrustPolicyValidation:  All 3 subtests pass

This ensures trust policy validation uses the same AWS-compatible
principals as the PolicyEngine.

* Fix multipart and presigned URL tests for AWS compatibility

Updated trust policies in:
- s3_multipart_iam_test.go
- s3_presigned_url_iam_test.go

Changed "Federated": "test-oidc" → "https://test-issuer.com"

Test Results:
- TestMultipartIAMValidation:  All 7 subtests pass
- TestPresignedURLIAMValidation:  All 4 subtests pass
- TestPresignedURLGeneration:  All 4 subtests pass
- TestPresignedURLExpiration:  All 4 subtests pass
- TestPresignedURLSecurityPolicy:  All 4 subtests pass

All S3 API tests now use AWS-compatible trust policies.

* Fix numeric condition evaluation and trust policy validation interface

Major updates to ensure robust AWS-compatible policy evaluation:
1.  **Policy Engine**: Added support for `int` and `int64` types in `evaluateNumericCondition`, fixing issues where raw numbers in policy documents caused evaluation failures.
2.  **Trust Policy Validation**: Updated `TrustPolicyValidator` interface and `STSService` to propagate `DurationSeconds` correctly during the double-validation flow (Validation -> STS -> Validation callback).
3.  **IAM Manager**: Updated implementation to match the new interface and correctly pass `sts:DurationSeconds` context key.

Test Results:
- TestConditionWithNumericComparison:  All 3 subtests pass
- All IAM and S3 integration tests pass (100%)

This resolves the final edge case with DurationSeconds numeric conditions.

* Fix MockTrustPolicyValidator interface and unreachable code warnings

Updates:
1. Updated MockTrustPolicyValidator.ValidateTrustPolicyForWebIdentity to match new interface signature with durationSeconds parameter
2. Removed unreachable code after infinite loops in filer_backup.go and filer_meta_backup.go to satisfy linter

Test Results:
- All STS tests pass 
- Build warnings resolved 

* Refactor matchesPrincipal to consolidate array handling logic

Consolidated duplicated logic for []interface{} and []string types by converting them to a unified []interface{} upfront.

* Fix malformed AWS docs URL in iam_manager.go comment

* dup

* Enhance IAM integration tests with negative cases and interface array support

Added test cases to TestTrustPolicyWildcardPrincipal to:
1. Verify rejection of roles when principal context does not match (negative test)
2. Verify support for principal arrays as []interface{} (simulating JSON unmarshaled roles)

* Fix syntax errors in filer_backup and filer_meta_backup

Restored missing closing braces for for-loops and re-added return statements.
The previous attempt to remove unreachable code accidentally broke the function structure.
Build now passes successfully.
2026-01-05 15:55:24 -08:00

222 lines
8.3 KiB
Go

package command
import (
"errors"
"fmt"
"regexp"
"strings"
"time"
"github.com/seaweedfs/seaweedfs/weed/glog"
"github.com/seaweedfs/seaweedfs/weed/pb"
"github.com/seaweedfs/seaweedfs/weed/pb/filer_pb"
"github.com/seaweedfs/seaweedfs/weed/replication/source"
"github.com/seaweedfs/seaweedfs/weed/security"
"github.com/seaweedfs/seaweedfs/weed/util"
"github.com/seaweedfs/seaweedfs/weed/util/http"
"google.golang.org/grpc"
)
type FilerBackupOptions struct {
isActivePassive *bool
filer *string
path *string
excludePaths *string
excludeFileName *string
debug *bool
proxyByFiler *bool
doDeleteFiles *bool
disableErrorRetry *bool
ignore404Error *bool
timeAgo *time.Duration
retentionDays *int
}
var (
filerBackupOptions FilerBackupOptions
)
func init() {
cmdFilerBackup.Run = runFilerBackup // break init cycle
filerBackupOptions.filer = cmdFilerBackup.Flag.String("filer", "localhost:8888", "filer of one SeaweedFS cluster")
filerBackupOptions.path = cmdFilerBackup.Flag.String("filerPath", "/", "directory to sync on filer")
filerBackupOptions.excludePaths = cmdFilerBackup.Flag.String("filerExcludePaths", "", "exclude directories to sync on filer")
filerBackupOptions.excludeFileName = cmdFilerBackup.Flag.String("filerExcludeFileName", "", "exclude file names that match the regexp to sync on filer")
filerBackupOptions.proxyByFiler = cmdFilerBackup.Flag.Bool("filerProxy", false, "read and write file chunks by filer instead of volume servers")
filerBackupOptions.doDeleteFiles = cmdFilerBackup.Flag.Bool("doDeleteFiles", false, "delete files on the destination")
filerBackupOptions.debug = cmdFilerBackup.Flag.Bool("debug", false, "debug mode to print out received files")
filerBackupOptions.timeAgo = cmdFilerBackup.Flag.Duration("timeAgo", 0, "start time before now. \"300ms\", \"1.5h\" or \"2h45m\". Valid time units are \"ns\", \"us\" (or \"µs\"), \"ms\", \"s\", \"m\", \"h\"")
filerBackupOptions.retentionDays = cmdFilerBackup.Flag.Int("retentionDays", 0, "incremental backup retention days")
filerBackupOptions.disableErrorRetry = cmdFilerBackup.Flag.Bool("disableErrorRetry", false, "disables errors retry, only logs will print")
filerBackupOptions.ignore404Error = cmdFilerBackup.Flag.Bool("ignore404Error", true, "ignore 404 errors from filer")
}
var cmdFilerBackup = &Command{
UsageLine: "filer.backup -filer=<filerHost>:<filerPort> ",
Short: "resume-able continuously replicate files from a SeaweedFS cluster to another location defined in replication.toml",
Long: `resume-able continuously replicate files from a SeaweedFS cluster to another location defined in replication.toml
filer.backup listens on filer notifications. If any file is updated, it will fetch the updated content,
and write to the destination. This is to replace filer.replicate command since additional message queue is not needed.
If restarted and "-timeAgo" is not set, the synchronization will resume from the previous checkpoints, persisted every minute.
A fresh sync will start from the earliest metadata logs. To reset the checkpoints, just set "-timeAgo" to a high value.
`,
}
func runFilerBackup(cmd *Command, args []string) bool {
util.LoadSecurityConfiguration()
util.LoadConfiguration("replication", true)
grpcDialOption := security.LoadClientTLS(util.GetViper(), "grpc.client")
clientId := util.RandomInt32()
var clientEpoch int32
for {
clientEpoch++
err := doFilerBackup(grpcDialOption, &filerBackupOptions, clientId, clientEpoch)
if err != nil {
glog.Errorf("backup from %s: %v", *filerBackupOptions.filer, err)
time.Sleep(1747 * time.Millisecond)
}
}
return false
}
const (
BackupKeyPrefix = "backup."
)
func doFilerBackup(grpcDialOption grpc.DialOption, backupOption *FilerBackupOptions, clientId int32, clientEpoch int32) error {
// find data sink
dataSink := findSink(util.GetViper())
if dataSink == nil {
return fmt.Errorf("no data sink configured in replication.toml")
}
sourceFiler := pb.ServerAddress(*backupOption.filer)
sourcePath := *backupOption.path
excludePaths := util.StringSplit(*backupOption.excludePaths, ",")
var reExcludeFileName *regexp.Regexp
if *backupOption.excludeFileName != "" {
var err error
if reExcludeFileName, err = regexp.Compile(*backupOption.excludeFileName); err != nil {
return fmt.Errorf("error compile regexp %v for exclude file name: %+v", *backupOption.excludeFileName, err)
}
}
timeAgo := *backupOption.timeAgo
targetPath := dataSink.GetSinkToDirectory()
debug := *backupOption.debug
// get start time for the data sink
startFrom := time.Unix(0, 0)
sinkId := util.HashStringToLong(dataSink.GetName() + dataSink.GetSinkToDirectory())
if timeAgo.Milliseconds() == 0 {
lastOffsetTsNs, err := getOffset(grpcDialOption, sourceFiler, BackupKeyPrefix, int32(sinkId))
if err != nil {
glog.V(0).Infof("starting from %v", startFrom)
} else {
startFrom = time.Unix(0, lastOffsetTsNs)
glog.V(0).Infof("resuming from %v", startFrom)
}
} else {
startFrom = time.Now().Add(-timeAgo)
glog.V(0).Infof("start time is set to %v", startFrom)
}
// create filer sink
filerSource := &source.FilerSource{}
filerSource.DoInitialize(
sourceFiler.ToHttpAddress(),
sourceFiler.ToGrpcAddress(),
sourcePath,
*backupOption.proxyByFiler)
dataSink.SetSourceFiler(filerSource)
var processEventFn func(*filer_pb.SubscribeMetadataResponse) error
if *backupOption.ignore404Error {
processEventFnGenerated := genProcessFunction(sourcePath, targetPath, excludePaths, reExcludeFileName, dataSink, *backupOption.doDeleteFiles, debug)
processEventFn = func(resp *filer_pb.SubscribeMetadataResponse) error {
err := processEventFnGenerated(resp)
if err == nil {
return nil
}
// ignore HTTP 404 from remote reads
if errors.Is(err, http.ErrNotFound) {
glog.V(0).Infof("got 404 error for %s, ignore it: %s", getSourceKey(resp), err.Error())
return nil
}
// also ignore missing volume/lookup errors coming from LookupFileId or vid map
errStr := err.Error()
if strings.Contains(errStr, "LookupFileId") || (strings.Contains(errStr, "volume id") && strings.Contains(errStr, "not found")) {
glog.V(0).Infof("got missing-volume error for %s, ignore it: %s", getSourceKey(resp), errStr)
return nil
}
return err
}
} else {
processEventFn = genProcessFunction(sourcePath, targetPath, excludePaths, reExcludeFileName, dataSink, *backupOption.doDeleteFiles, debug)
}
processEventFnWithOffset := pb.AddOffsetFunc(processEventFn, 3*time.Second, func(counter int64, lastTsNs int64) error {
glog.V(0).Infof("backup %s progressed to %v %0.2f/sec", sourceFiler, time.Unix(0, lastTsNs), float64(counter)/float64(3))
return setOffset(grpcDialOption, sourceFiler, BackupKeyPrefix, int32(sinkId), lastTsNs)
})
if dataSink.IsIncremental() && *filerBackupOptions.retentionDays > 0 {
go func() {
for {
now := time.Now()
time.Sleep(time.Hour * 24)
key := util.Join(targetPath, now.Add(-1*time.Hour*24*time.Duration(*filerBackupOptions.retentionDays)).Format("2006-01-02"))
_ = dataSink.DeleteEntry(util.Join(targetPath, key), true, true, nil)
glog.V(0).Infof("incremental backup delete directory:%s", key)
}
}()
}
prefix := sourcePath
if !strings.HasSuffix(prefix, "/") {
prefix = prefix + "/"
}
eventErrorType := pb.RetryForeverOnError
if *backupOption.disableErrorRetry {
eventErrorType = pb.TrivialOnError
}
metadataFollowOption := &pb.MetadataFollowOption{
ClientName: "backup_" + dataSink.GetName(),
ClientId: clientId,
ClientEpoch: clientEpoch,
SelfSignature: 0,
PathPrefix: prefix,
AdditionalPathPrefixes: nil,
DirectoriesToWatch: nil,
StartTsNs: startFrom.UnixNano(),
StopTsNs: 0,
EventErrorType: eventErrorType,
}
return pb.FollowMetadata(sourceFiler, grpcDialOption, metadataFollowOption, processEventFnWithOffset)
}
func getSourceKey(resp *filer_pb.SubscribeMetadataResponse) string {
if resp == nil || resp.EventNotification == nil {
return ""
}
message := resp.EventNotification
if message.NewEntry != nil {
return string(util.FullPath(message.NewParentPath).Child(message.NewEntry.Name))
}
if message.OldEntry != nil {
return string(util.FullPath(resp.Directory).Child(message.OldEntry.Name))
}
return ""
}