feat(s3api): Implement S3 Policy Variables (#8039)

* feat: Add AWS IAM Policy Variables support to S3 API Implements policy variables for dynamic access control in bucket policies. Supported variables: - aws:username - Extracted from principal ARN - aws:userid - User identifier (same as username in SeaweedFS) - aws:principaltype - IAMUser, IAMRole, or AssumedRole - jwt:* - Any JWT claim (e.g., jwt:preferred_username, jwt:sub) Key changes: - Added PolicyVariableRegex to detect ${...} patterns - Extended CompiledStatement with DynamicResourcePatterns, DynamicPrincipalPatterns, DynamicActionPatterns - Added Claims field to PolicyEvaluationArgs for JWT claim access - Implemented SubstituteVariables() for variable replacement from context and JWT claims - Implemented extractPrincipalVariables() for ARN parsing - Updated EvaluateConditions() to support variable substitution - Comprehensive unit and integration tests Resolves #8037 * feat: Add LDAP and PrincipalAccount variable support Completes future enhancements for policy variables: - Added ldap:* variable support for LDAP claims - ldap:username - LDAP username from claims - ldap:dn - LDAP distinguished name from claims - ldap:* - Any LDAP claim - Added aws:PrincipalAccount extraction from ARN - Extracts account ID from principal ARN - Available as ${aws:PrincipalAccount} in policies Updated SubstituteVariables() to check LDAP claims Updated extractPrincipalVariables() to extract account ID Added comprehensive tests for new variables * feat(s3api): implement IAM policy variables core logic and optimization * feat(s3api): integrate policy variables with S3 authentication and handlers * test(s3api): add integration tests for policy variables * cleanup: remove unused policy conversion files * Add S3 policy variables integration tests and path support - Add comprehensive integration tests for policy variables - Test username isolation, JWT claims, LDAP claims - Add support for IAM paths in principal ARN parsing - Add tests for principals with paths * Fix IAM Role principal variable extraction IAM Roles should not have aws:userid or aws:PrincipalAccount according to AWS behavior. Only IAM Users and Assumed Roles should have these variables. Fixes TestExtractPrincipalVariables test failures. * Security fixes and bug fixes for S3 policy variables SECURITY FIXES: - Prevent X-SeaweedFS-Principal header spoofing by clearing internal headers at start of authentication (auth_credentials.go) - Restrict policy variable substitution to safe allowlist to prevent client header injection (iam/policy/policy_engine.go) - Add core policy validation before storing bucket policies BUG FIXES: - Remove unused sid variable in evaluateStatement - Fix LDAP claim lookup to check both prefixed and unprefixed keys - Add ValidatePolicy call in PutBucketPolicyHandler These fixes prevent privilege escalation via header injection and ensure only validated identity claims are used in policy evaluation. * Additional security fixes and code cleanup SECURITY FIXES: - Fixed X-Forwarded-For spoofing by only trusting proxy headers from private/localhost IPs (s3_iam_middleware.go) - Changed context key from "sourceIP" to "aws:SourceIp" for proper policy variable substitution CODE IMPROVEMENTS: - Kept aws:PrincipalAccount for IAM Roles to support condition evaluations - Removed redundant STS principaltype override - Removed unused service variable - Cleaned up commented-out debug logging statements - Updated tests to reflect new IAM Role behavior These changes prevent IP spoofing attacks and ensure policy variables work correctly with the safe allowlist. * Add security documentation for ParseJWTToken Added comprehensive security comments explaining that ParseJWTToken is safe despite parsing without verification because: - It's only used for routing to the correct verification method - All code paths perform cryptographic verification before trusting claims - OIDC tokens: validated via validateExternalOIDCToken - STS tokens: validated via ValidateSessionToken Enhanced function documentation with clear security warnings about proper usage to prevent future misuse. * Fix IP condition evaluation to use aws:SourceIp key Fixed evaluateIPCondition in IAM policy engine to use "aws:SourceIp" instead of "sourceIP" to match the updated extractRequestContext. This fixes the failing IP-restricted role test where IP-based policy conditions were not being evaluated correctly. Updated all test cases to use the correct "aws:SourceIp" key. * Address code review feedback: optimize and clarify PERFORMANCE IMPROVEMENT: - Optimized expandPolicyVariables to use regexp.ReplaceAllStringFunc for single-pass variable substitution instead of iterating through all safe variables. This improves performance from O(n*m) to O(m) where n is the number of safe variables and m is the pattern length. CODE CLARITY: - Added detailed comment explaining LDAP claim fallback mechanism (checks both prefixed and unprefixed keys for compatibility) - Enhanced TODO comment for trusted proxy configuration with rationale and recommendations for supporting cloud load balancers, CDNs, and complex network topologies All tests passing. * Address Copilot code review feedback BUG FIXES: - Fixed type switch for int/int32/int64 - separated into individual cases since interface type switches only match the first type in multi-type cases - Fixed grammatically incorrect error message in types.go CODE QUALITY: - Removed duplicate Resource/NotResource validation (already in ValidateStatement) - Added comprehensive comment explaining isEnabled() logic and security implications - Improved trusted proxy NOTE comment to be more concise while noting limitations All tests passing. * Fix test failures after extractSourceIP security changes Updated tests to work with the security fix that only trusts X-Forwarded-For/X-Real-IP headers from private IP addresses: - Set RemoteAddr to 127.0.0.1 in tests to simulate trusted proxy - Changed context key from "sourceIP" to "aws:SourceIp" - Added test case for untrusted proxy (public RemoteAddr) - Removed invalid ValidateStatement call (validation happens in ValidatePolicy) All tests now passing. * Address remaining Gemini code review feedback CODE SAFETY: - Deep clone Action field in CompileStatement to prevent potential data races if the original policy document is modified after compilation TEST CLEANUP: - Remove debug logging (fmt.Fprintf) from engine_notresource_test.go - Remove unused imports in engine_notresource_test.go All tests passing. * Fix insecure JWT parsing in IAM auth flow SECURITY FIX: - Renamed ParseJWTToken to ParseUnverifiedJWTToken with explicit security warnings. - Refactored AuthenticateJWT to use the trusted SessionInfo returned by ValidateSessionToken instead of relying on unverified claims from the initial parse. - Refactored ValidatePresignedURLWithIAM to reuse the robust AuthenticateJWT logic, removing duplicated and insecure manual token parsing. This ensures all identity information (Role, Principal, Subject) used for authorization decisions is derived solely from cryptographically verified tokens. * Security: Fix insecure JWT claim extraction in policy engine - Refactored EvaluatePolicy to accept trusted claims from verified Identity instead of parsing unverified tokens - Updated AuthenticateJWT to populate Claims in IAMIdentity from verified sources (SessionInfo/ExternalIdentity) - Updated s3api_server and handlers to pass claims correctly - Improved isPrivateIP to support IPv6 loopback, link-local, and ULA - Fixed flaky distributed_session_consistency test with retry logic * fix(iam): populate Subject in STSSessionInfo to ensure correct identity propagation This fixes the TestS3IAMAuthentication/valid_jwt_token_authentication failure by ensuring the session subject (sub) is correctly mapped to the internal SessionInfo struct, allowing bucket ownership validation to succeed. * Optimized isPrivateIP * Create s3-policy-tests.yml * fix tests * fix tests * tests(s3/iam): simplify policy to resource-based \ (step 1) * tests(s3/iam): add explicit Deny NotResource for isolation (step 2) * fixes * policy: skip resource matching for STS trust policies to allow AssumeRole evaluation * refactor: remove debug logging and hoist policy variables for performance * test: fix TestS3IAMBucketPolicyIntegration cleanup to handle per-subtest object lifecycle * test: fix bucket name generation to comply with S3 63-char limit * test: skip TestS3IAMPolicyEnforcement until role setup is implemented * test: use weed mini for simpler test server deployment Replace 'weed server' with 'weed mini' for IAM tests to avoid port binding issues and simplify the all-in-one server deployment. This improves test reliability and execution time. * security: prevent allocation overflow in policy evaluation Add maxPoliciesForEvaluation constant to cap the number of policies evaluated in a single request. This prevents potential integer overflow when allocating slices for policy lists that may be influenced by untrusted input. Changes: - Add const maxPoliciesForEvaluation = 1024 to set an upper bound - Validate len(policies) < maxPoliciesForEvaluation before appending bucket policy - Use append() instead of make([]string, len+1) to avoid arithmetic overflow - Apply fix to both IsActionAllowed policy evaluation paths
2026-01-16 11:12:28 -08:00
parent b49f3ce6d3
commit ee3813787e
38 changed files with 2766 additions and 1288 deletions
--- a/weed/s3api/s3_iam_middleware.go
+++ b/weed/s3api/s3_iam_middleware.go
@@ -18,6 +18,26 @@ import (
 	"github.com/seaweedfs/seaweedfs/weed/s3api/s3err"
 )

+// privateNetworks contains pre-parsed private IP ranges for efficient lookups
+var privateNetworks []*net.IPNet
+
+func init() {
+	// Private IPv4 ranges (RFC1918) and IPv6 Unique Local Addresses (ULA)
+	privateRanges := []string{
+		"10.0.0.0/8",     // IPv4 private
+		"172.16.0.0/12",  // IPv4 private
+		"192.168.0.0/16", // IPv4 private
+		"fc00::/7",       // IPv6 Unique Local Addresses (ULA)
+	}
+
+	for _, cidr := range privateRanges {
+		_, network, err := net.ParseCIDR(cidr)
+		if err == nil {
+			privateNetworks = append(privateNetworks, network)
+		}
+	}
+}
+
 // IAMIntegration defines the interface for IAM integration
 type IAMIntegration interface {
 	AuthenticateJWT(ctx context.Context, r *http.Request) (*IAMIdentity, s3err.ErrorCode)
@@ -73,14 +93,21 @@ func (s3iam *S3IAMIntegration) AuthenticateJWT(ctx context.Context, r *http.Requ
 		return nil, s3err.ErrAccessDenied
 	}

-	// Try to parse as STS session token first
-	tokenClaims, err := parseJWTToken(sessionToken)
+	// SECURITY NOTE: ParseJWTToken parses without cryptographic verification
+	// This is SAFE because we only use the unverified claims to route to the correct
+	// verification method. All code paths below perform full cryptographic verification:
+	// - OIDC tokens: validated via validateExternalOIDCToken (line 98)
+	// - STS tokens: validated via ValidateSessionToken (line 156)
+	// The unverified issuer claim is only used for routing, never for authorization.
+	tokenClaims, err := ParseUnverifiedJWTToken(sessionToken)
 	if err != nil {
 		glog.V(3).Infof("Failed to parse JWT token: %v", err)
 		return nil, s3err.ErrAccessDenied
 	}

 	// Determine token type by issuer claim (more robust than checking role claim)
+	// We use the unverified claims ONLY for routing to the correct verification method.
+	// We DO NOT use these claims for building the identity.
 	issuer, issuerOk := tokenClaims["iss"].(string)
 	if !issuerOk {
 		glog.V(3).Infof("Token missing issuer claim - invalid JWT")
@@ -116,59 +143,47 @@ func (s3iam *S3IAMIntegration) AuthenticateJWT(ctx context.Context, r *http.Requ
 				EmailAddress: identity.UserID + "@oidc.local",
 				Id:           identity.UserID,
 			},
+			Claims: map[string]interface{}{
+				"sub":  identity.UserID,
+				"role": identity.RoleArn,
+			},
 		}, s3err.ErrNone
 	}

-	// This is an STS-issued token - extract STS session information
-
-	// Extract role claim from STS token
-	roleName, roleOk := tokenClaims["role"].(string)
-	if !roleOk || roleName == "" {
-		glog.V(3).Infof("STS token missing role claim")
-		return nil, s3err.ErrAccessDenied
-	}
-
-	sessionName, ok := tokenClaims["snam"].(string)
-	if !ok || sessionName == "" {
-		sessionName = "jwt-session" // Default fallback
-	}
-
-	subject, ok := tokenClaims["sub"].(string)
-	if !ok || subject == "" {
-		subject = "jwt-user" // Default fallback
-	}
-
-	// Use the principal ARN directly from token claims, or build it if not available
-	principalArn, ok := tokenClaims["principal"].(string)
-	if !ok || principalArn == "" {
-		// Fallback: extract role name from role ARN and build principal ARN
-		roleNameOnly := roleName
-		if strings.Contains(roleName, "/") {
-			parts := strings.Split(roleName, "/")
-			roleNameOnly = parts[len(parts)-1]
-		}
-		principalArn = fmt.Sprintf("arn:aws:sts::assumed-role/%s/%s", roleNameOnly, sessionName)
-	}
-
-	// Validate the JWT token directly using STS service (avoid circular dependency)
-	// Note: We don't call IsActionAllowed here because that would create a circular dependency
-	// Authentication should only validate the token, authorization happens later
-	_, err = s3iam.stsService.ValidateSessionToken(ctx, sessionToken)
+	// This is an STS-issued token - validate with STS service
+	// ValidateSessionToken performs cryptographic verification and extraction of trusted claims
+	sessionInfo, err := s3iam.stsService.ValidateSessionToken(ctx, sessionToken)
 	if err != nil {
 		glog.V(3).Infof("STS session validation failed: %v", err)
 		return nil, s3err.ErrAccessDenied
 	}

-	// Create IAM identity from validated token
+	// Create claims map starting with request context (which holds custom claims)
+	claims := make(map[string]interface{})
+	if sessionInfo.RequestContext != nil {
+		for k, v := range sessionInfo.RequestContext {
+			claims[k] = v
+		}
+	}
+
+	// Add standard claims
+	claims["sub"] = sessionInfo.Subject
+	claims["role"] = sessionInfo.RoleArn
+	claims["principal"] = sessionInfo.Principal
+	claims["snam"] = sessionInfo.SessionName
+
+	// Create IAM identity from VALIDATED session info
+	// We use the trusted data returned by the STS service, not the unverified token claims
 	identity := &IAMIdentity{
-		Name:         subject,
-		Principal:    principalArn,
+		Name:         sessionInfo.Subject,
+		Principal:    sessionInfo.Principal,
 		SessionToken: sessionToken,
 		Account: &Account{
-			DisplayName:  roleName,
-			EmailAddress: subject + "@seaweedfs.local",
-			Id:           subject,
+			DisplayName:  sessionInfo.SessionName,
+			EmailAddress: sessionInfo.Subject + "@seaweedfs.local",
+			Id:           sessionInfo.Subject,
 		},
+		Claims: claims,
 	}

 	glog.V(3).Infof("JWT authentication successful for principal: %s", identity.Principal)
@@ -199,6 +214,35 @@ func (s3iam *S3IAMIntegration) AuthorizeAction(ctx context.Context, identity *IA
 	// Extract request context for policy conditions
 	requestContext := extractRequestContext(r)

+	// Add s3:prefix to request context based on object key
+	// This ensures that policy conditions referencing s3:prefix (like StringLike)
+	// work correctly for both ListObjects (where objectKey is the prefix) and
+	// object operations (where we treat the object key as the prefix for matching)
+	if objectKey != "" && objectKey != "/" {
+		requestContext["s3:prefix"] = objectKey
+	}
+
+	// Add identity claims to request context for policy variables
+	// Only add claim keys if they don't already exist (to avoid overwriting request-derived context)
+	if identity.Claims != nil {
+		for k, v := range identity.Claims {
+			// Only add the claim if this key doesn't already exist in request context
+			if _, exists := requestContext[k]; !exists {
+				requestContext[k] = v
+			}
+
+			// If the claim doesn't have a namespace prefix (e.g. "email"), add "jwt:" prefix
+			// This allows ${jwt:email} or ${jwt:preferred_username} to work
+			// Only add namespaced version if it doesn't already exist
+			if !strings.Contains(k, ":") {
+				jwtKey := "jwt:" + k
+				if _, exists := requestContext[jwtKey]; !exists {
+					requestContext[jwtKey] = v
+				}
+			}
+		}
+	}
+
 	// Determine the specific S3 action based on the HTTP request details
 	specificAction := ResolveS3Action(r, string(action), bucket, objectKey)

@@ -240,6 +284,7 @@ type IAMIdentity struct {
 	SessionToken string
 	Account      *Account
 	PolicyNames  []string
+	Claims       map[string]interface{}
 }

 // IsAdmin checks if the identity has admin privileges
@@ -406,9 +451,10 @@ func extractRequestContext(r *http.Request) map[string]interface{} {
 	context := make(map[string]interface{})

 	// Extract source IP for IP-based conditions
+	// Use AWS-compatible key name for policy variable substitution
 	sourceIP := extractSourceIP(r)
 	if sourceIP != "" {
-		context["sourceIP"] = sourceIP
+		context["aws:SourceIp"] = sourceIP
 	}

 	// Extract user agent
@@ -428,42 +474,86 @@ func extractRequestContext(r *http.Request) map[string]interface{} {
 }

 // extractSourceIP extracts the real source IP from the request
+// SECURITY: Prioritizes RemoteAddr over client-controlled headers to prevent spoofing
+// Only trusts X-Forwarded-For/X-Real-IP if RemoteAddr appears to be from a trusted proxy
 func extractSourceIP(r *http.Request) string {
-	// Check X-Forwarded-For header (most common for proxied requests)
-	if forwardedFor := r.Header.Get("X-Forwarded-For"); forwardedFor != "" {
-		// X-Forwarded-For can contain multiple IPs, take the first one
-		if ips := strings.Split(forwardedFor, ","); len(ips) > 0 {
-			return strings.TrimSpace(ips[0])
+	// Always start with RemoteAddr as the most trustworthy source
+	remoteIP := r.RemoteAddr
+	if ip, _, err := net.SplitHostPort(remoteIP); err == nil {
+		remoteIP = ip
+	}
+
+	// NOTE: The current heuristic of using isPrivateIP assumes reverse proxies are on a
+	// private/local network. This may be insufficient for some cloud, CDN, or multi-tier
+	// proxy deployments where proxies terminate connections from public IPs. In such
+	// environments, deployment-specific controls (e.g., network ACLs or proxy configs)
+	// should be used to ensure only trusted components can set forwarding headers.
+	// Future enhancements may introduce an explicit, configurable trusted proxy CIDR list.
+	isTrustedProxy := isPrivateIP(remoteIP)
+
+	if isTrustedProxy {
+		// Check X-Real-IP header first (single IP, more reliable than X-Forwarded-For)
+		if realIP := r.Header.Get("X-Real-IP"); realIP != "" {
+			return strings.TrimSpace(realIP)
+		}
+
+		// Check X-Forwarded-For header (can contain multiple IPs, take the first one)
+		if forwardedFor := r.Header.Get("X-Forwarded-For"); forwardedFor != "" {
+			if ips := strings.Split(forwardedFor, ","); len(ips) > 0 {
+				return strings.TrimSpace(ips[0])
+			}
 		}
 	}

-	// Check X-Real-IP header
-	if realIP := r.Header.Get("X-Real-IP"); realIP != "" {
-		return strings.TrimSpace(realIP)
-	}
-
-	// Fall back to RemoteAddr
-	if ip, _, err := net.SplitHostPort(r.RemoteAddr); err == nil {
-		return ip
-	}
-
-	return r.RemoteAddr
+	// Fall back to RemoteAddr (most secure)
+	return remoteIP
 }

-// parseJWTToken parses a JWT token and returns its claims without verification
-// Note: This is for extracting claims only. Verification is done by the IAM system.
-func parseJWTToken(tokenString string) (jwt.MapClaims, error) {
+// isPrivateIP checks if an IP is in a private range (localhost or RFC1918)
+func isPrivateIP(ipStr string) bool {
+	ip := net.ParseIP(ipStr)
+	if ip == nil {
+		return false
+	}
+
+	// Check for localhost and link-local addresses (IPv4/IPv6)
+	if ip.IsLoopback() || ip.IsLinkLocalUnicast() || ip.IsLinkLocalMulticast() {
+		return true
+	}
+
+	// Check against pre-parsed private CIDR ranges
+	for _, network := range privateNetworks {
+		if network.Contains(ip) {
+			return true
+		}
+	}
+
+	return false
+}
+
+// ParseUnverifiedJWTToken parses a JWT token and returns its claims WITHOUT cryptographic verification
+//
+// SECURITY WARNING: This function does NOT validate the token signature!
+// It should ONLY be used for:
+// 1. Routing tokens to the appropriate verification method (e.g., checking issuer to determine STS vs OIDC)
+// 2. Extracting claims for logging/debugging AFTER the token has been cryptographically verified
+//
+// NEVER use the returned claims for authorization decisions without first calling a proper
+// verification function like ValidateSessionToken() or validateExternalOIDCToken().
+func ParseUnverifiedJWTToken(tokenString string) (jwt.MapClaims, error) {
+	// Parse token without verification to get claims
+	// This token IS NOT VERIFIED at this stage.
+	// It is only used to peek at claims (like issuer) to determine which verification key/strategy to use.
 	token, _, err := new(jwt.Parser).ParseUnverified(tokenString, jwt.MapClaims{})
 	if err != nil {
-		return nil, fmt.Errorf("failed to parse JWT token: %v", err)
+		return nil, err
 	}

-	claims, ok := token.Claims.(jwt.MapClaims)
-	if !ok {
-		return nil, fmt.Errorf("invalid token claims")
+	if claims, ok := token.Claims.(jwt.MapClaims); ok {
+		return claims, nil
 	}

-	return claims, nil
+	return nil, fmt.Errorf("invalid token claims")
 }

 // minInt returns the minimum of two integers