seaweedFS/weed/iam/sts/distributed_sts_test.go
Chris Lu 06391701ed Add AssumeRole and AssumeRoleWithLDAPIdentity STS actions (#8003)
* test: add integration tests for AssumeRole and AssumeRoleWithLDAPIdentity STS actions

- Add s3_sts_assume_role_test.go with comprehensive tests for AssumeRole:
  * Parameter validation (missing RoleArn, RoleSessionName, invalid duration)
  * AWS SigV4 authentication with valid/invalid credentials
  * Temporary credential generation and usage

- Add s3_sts_ldap_test.go with tests for AssumeRoleWithLDAPIdentity:
  * Parameter validation (missing LDAP credentials, RoleArn)
  * LDAP authentication scenarios (valid/invalid credentials)
  * Integration with LDAP server (when configured)

- Update Makefile with new test targets:
  * test-sts: run all STS tests
  * test-sts-assume-role: run AssumeRole tests only
  * test-sts-ldap: run LDAP STS tests only
  * test-sts-suite: run tests with full service lifecycle

- Enhance setup_all_tests.sh:
  * Add OpenLDAP container setup for LDAP testing
  * Create test LDAP users (testuser, ldapadmin)
  * Set LDAP environment variables for tests
  * Update cleanup to remove LDAP container

- Fix setup_keycloak.sh:
  * Enable verbose error logging for realm creation
  * Improve error diagnostics

Tests use a fail-fast approach (t.Fatal) when the server is not configured,
ensuring clear feedback when infrastructure is missing.

* feat: implement AssumeRole and AssumeRoleWithLDAPIdentity STS actions

Implement two new STS actions to match MinIO's STS feature set:

**AssumeRole Implementation:**
- Add handleAssumeRole with full AWS SigV4 authentication
- Integrate with existing IAM infrastructure via verifyV4Signature
- Validate required parameters (RoleArn, RoleSessionName)
- Validate DurationSeconds (900-43200 seconds range)
- Generate temporary credentials with expiration
- Return AWS-compatible XML response
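
As a rough illustration of the DurationSeconds check listed above, here is a minimal sketch. The helper name `parseDurationSketch` and the 3600-second default are assumptions made for this example; only the 900-43200 range comes from the list above.

```go
package main

import (
	"fmt"
	"strconv"
)

const (
	// Bounds taken from the commit message (900-43200 seconds).
	minDurationSeconds = 900
	maxDurationSeconds = 43200
	// Assumed default for this sketch; the source does not state one.
	defaultDurationSeconds = 3600
)

// parseDurationSketch is a hypothetical helper. It returns the assumed default
// when the parameter is absent and rejects non-numeric or out-of-range values.
func parseDurationSketch(raw string) (int64, error) {
	if raw == "" {
		return defaultDurationSeconds, nil
	}
	seconds, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid DurationSeconds %q: %w", raw, err)
	}
	if seconds < minDurationSeconds || seconds > maxDurationSeconds {
		return 0, fmt.Errorf("DurationSeconds must be between %d and %d, got %d",
			minDurationSeconds, maxDurationSeconds, seconds)
	}
	return seconds, nil
}
```

Calling `parseDurationSketch("899")` returns an error, while `parseDurationSketch("900")` is accepted.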

**AssumeRoleWithLDAPIdentity Implementation:**
- Add handleAssumeRoleWithLDAPIdentity handler (stub)
- Validate LDAP-specific parameters (LDAPUsername, LDAPPassword)
- Validate common STS parameters (RoleArn, RoleSessionName, DurationSeconds)
- Return proper error messages for missing LDAP provider
- Ready for LDAP provider integration

**Routing Fixes:**
- Add explicit routes for AssumeRole and AssumeRoleWithLDAPIdentity
- Prevent IAM handler from intercepting authenticated STS requests
- Ensure proper request routing priority

**Handler Infrastructure:**
- Add IAM field to STSHandlers for SigV4 verification
- Update NewSTSHandlers to accept IAM reference
- Add STS-specific error codes and response types
- Implement writeSTSErrorResponse for AWS-compatible errors

The AssumeRole action is fully functional and tested.
AssumeRoleWithLDAPIdentity requires LDAP provider implementation.

* fix: update IAM matcher to exclude STS actions from interception

Update the IAM handler matcher to check for STS actions (AssumeRole,
AssumeRoleWithWebIdentity, AssumeRoleWithLDAPIdentity) and exclude them
from IAM handler processing. This allows STS requests to be handled by
the STS fallback handler even when they include AWS SigV4 authentication.

The matcher now parses the form data to check the Action parameter and
returns false for STS actions, ensuring they are routed to the correct
handler.

Note: This is a work-in-progress fix. Tests are still showing some
routing issues that need further investigation.

* fix: address PR review security issues for STS handlers

This commit addresses all critical security issues from PR review:

Security Fixes:
- Use crypto/rand for cryptographically secure credential generation
  instead of time.Now().UnixNano() (fixes predictable credentials)
- Add sts:AssumeRole permission check via VerifyActionPermission to
  prevent unauthorized role assumption
- Generate proper session tokens using crypto/rand instead of
  placeholder strings
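
To show why the switch away from time.Now().UnixNano() matters, here is a minimal sketch of credential generation backed by crypto/rand. The type `tempCredentials`, both function names, and the byte lengths are assumptions for illustration and are not taken from the actual handlers.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomHex reads n bytes from the operating system CSPRNG and hex-encodes
// them, producing a string of 2*n characters.
func randomHex(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("reading random bytes: %w", err)
	}
	return hex.EncodeToString(buf), nil
}

// tempCredentials is a hypothetical container used only in this sketch.
type tempCredentials struct {
	AccessKeyID     string
	SecretAccessKey string
}

// newTempCredentials builds credentials from random bytes. The lengths
// (10 and 20 bytes) are arbitrary choices for the example.
func newTempCredentials() (tempCredentials, error) {
	id, err := randomHex(10)
	if err != nil {
		return tempCredentials{}, err
	}
	secret, err := randomHex(20)
	if err != nil {
		return tempCredentials{}, err
	}
	return tempCredentials{AccessKeyID: id, SecretAccessKey: secret}, nil
}
```

Unlike a timestamp-derived value, two credentials generated in the same nanosecond are still independent and cannot be guessed from the request time.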

Code Quality Improvements:
- Refactor DurationSeconds parsing into reusable parseDurationSeconds()
  helper function used by all three STS handlers
- Create generateSecureCredentials() helper for consistent and secure
  temporary credential generation
- Fix iamMatcher to check query string as fallback when Action not
  found in form data

LDAP Provider Implementation:
- Add go-ldap/ldap/v3 dependency
- Create LDAPProvider implementing IdentityProvider interface with
  full LDAP authentication support (connect, bind, search, groups)
- Update ProviderFactory to create real LDAP providers
- Wire LDAP provider into AssumeRoleWithLDAPIdentity handler

Test Infrastructure:
- Add LDAP user creation verification step in setup_all_tests.sh

* fix: address PR feedback (Round 2) - config validation & provider improvements

- Implement `validateLDAPConfig` in `ProviderFactory`
- Improve `LDAPProvider.Initialize`:
  - Support `connectionTimeout` parsing (string/int/float) from config map
  - Warn if `BindDN` is present but `BindPassword` is empty
- Improve `LDAPProvider.GetUserInfo`:
  - Add fallback to `searchUserGroups` if `memberOf` returns no groups (consistent with Authenticate)

* fix: address PR feedback (Round 3) - LDAP connection improvements & build fix

- Improve `LDAPProvider` connection handling:
  - Use `net.Dialer` with configured timeout for connection establishment
  - Enforce TLS 1.2+ (`MinVersion: tls.VersionTLS12`) for both LDAPS and StartTLS
- Fix build error in `s3api_sts.go` (format verb for ErrorCode)

* fix: address PR feedback (Round 4) - LDAP hardening, Authz check & Routing fix

- LDAP Provider Hardening:
  - Prevent re-initialization
  - Enforce single user match in `GetUserInfo` (was explicit only in Authenticate)
  - Ensure connection closure if StartTLS fails
- STS Handlers:
  - Add robust provider detection using type assertion
  - **Security**: Implement authorization check (`VerifyActionPermission`) after LDAP authentication
- Routing:
  - Update tests to reflect that STS actions are handled by STS handler, not generic IAM

* fix: address PR feedback (Round 5) - JWT tokens, ARN formatting, PrincipalArn

CRITICAL FIXES:
- Replace standalone credential generation with STS service JWT tokens
  - handleAssumeRole now generates proper JWT session tokens
  - handleAssumeRoleWithLDAPIdentity now generates proper JWT session tokens
  - Session tokens can be validated across distributed instances

- Fix ARN formatting in responses
  - Extract role name from ARN using utils.ExtractRoleNameFromArn()
  - Prevents malformed ARNs like "arn:aws:sts::assumed-role/arn:aws:iam::..."

- Add configurable AccountId for federated users
  - Add AccountId field to STSConfig (defaults to "111122223333")
  - PrincipalArn now uses configured account ID instead of hardcoded "aws"
  - Enables proper trust policy validation
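
The ARN fix can be pictured with the sketch below. Both functions are hypothetical stand-ins (the commit itself uses utils.ExtractRoleNameFromArn, whose implementation is not shown here), and the output layout follows the usual AWS assumed-role ARN shape.

```go
package main

import (
	"fmt"
	"strings"
)

// roleNameFromArnSketch returns everything after ":role/" in a role ARN,
// or an empty string when the marker is absent.
func roleNameFromArnSketch(roleArn string) string {
	const marker = ":role/"
	idx := strings.Index(roleArn, marker)
	if idx < 0 {
		return ""
	}
	return roleArn[idx+len(marker):]
}

// assumedRoleArnSketch embeds only the role name, not the whole role ARN,
// which is what avoids the nested "assumed-role/arn:aws:iam::..." form.
func assumedRoleArnSketch(accountID, roleArn, sessionName string) string {
	return fmt.Sprintf("arn:aws:sts::%s:assumed-role/%s/%s",
		accountID, roleNameFromArnSketch(roleArn), sessionName)
}
```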

IMPROVEMENTS:
- Sanitize LDAP authentication error messages (don't leak internal details)
- Remove duplicate comment in provider detection
- Add utils import for ARN parsing utilities

* feat: implement LDAP connection pooling to prevent resource exhaustion

PERFORMANCE IMPROVEMENT:
- Add connection pool to LDAPProvider (default size: 10 connections)
- Reuse LDAP connections across authentication requests
- Prevent file descriptor exhaustion under high load

IMPLEMENTATION:
- connectionPool struct with channel-based connection management
- getConnection(): retrieves from pool or creates new connection
- returnConnection(): returns healthy connections to pool
- createConnection(): establishes new LDAP connection with TLS support
- Close(): cleanup method to close all pooled connections
- Connection health checking (IsClosing()) before reuse

BENEFITS:
- Reduced connection overhead (no TCP handshake per request)
- Better resource utilization under load
- Prevents "too many open files" errors
- Non-blocking pool operations (creates new conn if pool empty)
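
A channel-based pool with non-blocking get and return, as outlined above, might look like this minimal sketch. All names are assumptions for illustration; `pooledConn` models only the two connection methods the description relies on, and `fakeConn` is a stand-in so the example runs without an LDAP server.

```go
package main

// pooledConn is a hypothetical interface covering the methods the pool needs.
type pooledConn interface {
	IsClosing() bool
	Close() error
}

// fakeConn is a stand-in connection for demonstration only.
type fakeConn struct{ closed bool }

func (c *fakeConn) IsClosing() bool { return c.closed }
func (c *fakeConn) Close() error    { c.closed = true; return nil }

// connPool keeps idle connections in a buffered channel.
type connPool struct {
	conns  chan pooledConn
	create func() (pooledConn, error)
}

func newConnPool(size int, create func() (pooledConn, error)) *connPool {
	return &connPool{conns: make(chan pooledConn, size), create: create}
}

// get returns a healthy idle connection if one is available and otherwise
// creates a new one. It never blocks waiting on the pool.
func (p *connPool) get() (pooledConn, error) {
	for {
		select {
		case c := <-p.conns:
			if c.IsClosing() {
				c.Close() // discard unhealthy connection and try again
				continue
			}
			return c, nil
		default:
			return p.create()
		}
	}
}

// put returns a healthy connection to the pool, closing it when the pool is
// full or the connection is no longer usable.
func (p *connPool) put(c pooledConn) {
	if c == nil {
		return
	}
	if c.IsClosing() {
		c.Close()
		return
	}
	select {
	case p.conns <- c:
	default:
		c.Close() // pool is full
	}
}

// closeAll drains the pool and closes every idle connection.
func (p *connPool) closeAll() {
	for {
		select {
		case c := <-p.conns:
			c.Close()
		default:
			return
		}
	}
}
```

Because both `get` and `put` use `select` with a `default` branch, a burst of requests creates extra connections instead of queueing, and the surplus is closed when it is returned.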

* fix: correct TokenGenerator access in STS handlers

CRITICAL FIX:
- Make TokenGenerator public in STSService (was private tokenGenerator)
- Update all references from Config.TokenGenerator to TokenGenerator
- Remove TokenGenerator from STSConfig (it belongs in STSService)

This fixes the "NotImplemented" errors in distributed and Keycloak tests.
The issue was that Round 5 changes tried to access Config.TokenGenerator
which didn't exist - TokenGenerator is a field in STSService, not STSConfig.

The TokenGenerator is properly initialized in STSService.Initialize() and
is now accessible for JWT token generation in AssumeRole handlers.

* fix: update tests to use public TokenGenerator field

Following the change to make TokenGenerator public in STSService,
this commit updates the test files to reference the correct public field name.
This resolves compilation errors in the IAM STS test suite.

* fix: update distributed tests to use valid Keycloak users

Updated s3_iam_distributed_test.go to use 'admin-user' and 'read-user'
which exist in the standard Keycloak setup provided by setup_keycloak.sh.
This resolves 'unknown test user' errors in distributed integration tests.

* fix: ensure iam_config.json exists in setup target for CI

The GitHub Actions workflow calls 'make setup' which was not creating
iam_config.json, causing the server to start without IAM integration
enabled (iamIntegration = nil), resulting in NotImplemented errors.

Now 'make setup' copies iam_config.local.json to iam_config.json if
it doesn't exist, ensuring IAM is properly configured in CI.

* fix(iam/ldap): fix connection pool race and rebind corruption

- Add atomic 'closed' flag to connection pool to prevent racing on Close()
- Rebind authenticated user connections back to service account before returning to pool
- Close connections on error instead of returning potentially corrupted state to pool

* fix(iam/ldap): populate standard TokenClaims fields in ValidateToken

- Set Subject, Issuer, Audience, IssuedAt, and ExpiresAt to satisfy the interface
- Use time.Time for timestamps as required by TokenClaims struct
- Default to 1 hour TTL for LDAP tokens

* fix(s3api): include account ID in STS AssumedRoleUser ARN

- Consistent with AWS, include the account ID in the assumed-role ARN
- Use the configured account ID from STS service if available, otherwise default to '111122223333'
- Apply to both AssumeRole and AssumeRoleWithLDAPIdentity handlers
- Also update .gitignore to ignore IAM test environment files

* refactor(s3api): extract shared STS credential generation logic

- Move common logic for session claims and credential generation to prepareSTSCredentials
- Update handleAssumeRole and handleAssumeRoleWithLDAPIdentity to use the helper
- Remove stale comments referencing outdated line numbers

* feat(iam/ldap): make pool size configurable and add audience support

- Add PoolSize to LDAPConfig (default 10)
- Add Audience to LDAPConfig to align with OIDC validation
- Update initialization and ValidateToken to use new fields

* update tests

* debug

* chore(iam): cleanup debug prints and fix test config port

* refactor(iam): use mapstructure for LDAP config parsing

* feat(sts): implement strict trust policy validation for AssumeRole

* test(iam): refactor STS tests to use AWS SDK signer

* test(s3api): implement ValidateTrustPolicyForPrincipal in MockIAMIntegration

* fix(s3api): ensure IAM matcher checks query string on ParseForm error

* fix(sts): use crypto/rand for secure credentials and extract constants

* fix(iam): fix ldap connection leaks and add insecure warning

* chore(iam): improved error wrapping and test parameterization

* feat(sts): add support for LDAPProviderName parameter

* Update weed/iam/ldap/ldap_provider.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update weed/s3api/s3api_sts.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix(sts): use STSErrSTSNotReady when LDAP provider is missing

* fix(sts): encapsulate TokenGenerator in STSService and add getter

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-01-12 10:45:24 -08:00


package sts

import (
	"context"
	"testing"
	"time"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

// TestDistributedSTSService verifies that multiple STS instances with identical configurations
// behave consistently across distributed environments.
func TestDistributedSTSService(t *testing.T) {
	ctx := context.Background()

	// Common configuration for all instances
	commonConfig := &STSConfig{
		TokenDuration:    FlexibleDuration{time.Hour},
		MaxSessionLength: FlexibleDuration{12 * time.Hour},
		Issuer:           "distributed-sts-test",
		SigningKey:       []byte("test-signing-key-32-characters-long"),
		Providers: []*ProviderConfig{
			{
				Name:    "keycloak-oidc",
				Type:    "oidc",
				Enabled: true,
				Config: map[string]interface{}{
					"issuer":   "http://keycloak:8080/realms/seaweedfs-test",
					"clientId": "seaweedfs-s3",
					"jwksUri":  "http://keycloak:8080/realms/seaweedfs-test/protocol/openid-connect/certs",
				},
			},
			{
				Name:    "disabled-ldap",
				Type:    "oidc", // OIDC type used as a placeholder; this entry is disabled and must not be loaded
				Enabled: false,
				Config: map[string]interface{}{
					"issuer":   "ldap://company.com",
					"clientId": "ldap-client",
				},
			},
		},
	}

	// Create multiple STS instances simulating distributed deployment
	instance1 := NewSTSService()
	instance2 := NewSTSService()
	instance3 := NewSTSService()

	// Initialize all instances with identical configuration
	err := instance1.Initialize(commonConfig)
	require.NoError(t, err, "Instance 1 should initialize successfully")
	err = instance2.Initialize(commonConfig)
	require.NoError(t, err, "Instance 2 should initialize successfully")
	err = instance3.Initialize(commonConfig)
	require.NoError(t, err, "Instance 3 should initialize successfully")

	// Manually register mock providers for testing (not available in production)
	mockProviderConfig := map[string]interface{}{
		"issuer":   "http://localhost:9999",
		"clientId": "test-client",
	}
	mockProvider1, err := createMockOIDCProvider("test-mock-provider", mockProviderConfig)
	require.NoError(t, err)
	mockProvider2, err := createMockOIDCProvider("test-mock-provider", mockProviderConfig)
	require.NoError(t, err)
	mockProvider3, err := createMockOIDCProvider("test-mock-provider", mockProviderConfig)
	require.NoError(t, err)
	instance1.RegisterProvider(mockProvider1)
	instance2.RegisterProvider(mockProvider2)
	instance3.RegisterProvider(mockProvider3)

	// Verify all instances have identical provider configurations
	t.Run("provider_consistency", func(t *testing.T) {
		// All instances should have the same number of providers
		// (one enabled in the config plus the manually registered mock)
		assert.Len(t, instance1.providers, 2, "Instance 1 should have 2 enabled providers")
		assert.Len(t, instance2.providers, 2, "Instance 2 should have 2 enabled providers")
		assert.Len(t, instance3.providers, 2, "Instance 3 should have 2 enabled providers")

		// All instances should have the same provider names
		instance1Names := instance1.getProviderNames()
		instance2Names := instance2.getProviderNames()
		instance3Names := instance3.getProviderNames()
		assert.ElementsMatch(t, instance1Names, instance2Names, "Instance 1 and 2 should have same providers")
		assert.ElementsMatch(t, instance2Names, instance3Names, "Instance 2 and 3 should have same providers")

		// Verify specific providers exist on all instances
		expectedProviders := []string{"keycloak-oidc", "test-mock-provider"}
		assert.ElementsMatch(t, instance1Names, expectedProviders, "Instance 1 should have expected providers")
		assert.ElementsMatch(t, instance2Names, expectedProviders, "Instance 2 should have expected providers")
		assert.ElementsMatch(t, instance3Names, expectedProviders, "Instance 3 should have expected providers")

		// Verify disabled providers are not loaded
		assert.NotContains(t, instance1Names, "disabled-ldap", "Disabled providers should not be loaded")
		assert.NotContains(t, instance2Names, "disabled-ldap", "Disabled providers should not be loaded")
		assert.NotContains(t, instance3Names, "disabled-ldap", "Disabled providers should not be loaded")
	})

	// Test token generation consistency across instances
	t.Run("token_generation_consistency", func(t *testing.T) {
		sessionId := "test-session-123"
		expiresAt := time.Now().Add(time.Hour)

		// Generate tokens from different instances
		token1, err1 := instance1.GetTokenGenerator().GenerateSessionToken(sessionId, expiresAt)
		token2, err2 := instance2.GetTokenGenerator().GenerateSessionToken(sessionId, expiresAt)
		token3, err3 := instance3.GetTokenGenerator().GenerateSessionToken(sessionId, expiresAt)
		require.NoError(t, err1, "Instance 1 token generation should succeed")
		require.NoError(t, err2, "Instance 2 token generation should succeed")
		require.NoError(t, err3, "Instance 3 token generation should succeed")

		// The tokens may differ (for example, in their issued-at timestamps), so this
		// subtest only checks that each instance produced a non-empty token.
		// Cross-instance validity is covered by the next subtest.
		assert.NotEmpty(t, token1)
		assert.NotEmpty(t, token2)
		assert.NotEmpty(t, token3)
	})

	// Test token validation consistency: any instance should validate tokens from any other instance
	t.Run("cross_instance_token_validation", func(t *testing.T) {
		sessionId := "cross-validation-session"
		expiresAt := time.Now().Add(time.Hour)

		// Generate token on instance 1
		token, err := instance1.GetTokenGenerator().GenerateSessionToken(sessionId, expiresAt)
		require.NoError(t, err)

		// Validate on all instances
		claims1, err1 := instance1.GetTokenGenerator().ValidateSessionToken(token)
		claims2, err2 := instance2.GetTokenGenerator().ValidateSessionToken(token)
		claims3, err3 := instance3.GetTokenGenerator().ValidateSessionToken(token)
		require.NoError(t, err1, "Instance 1 should validate token from instance 1")
		require.NoError(t, err2, "Instance 2 should validate token from instance 1")
		require.NoError(t, err3, "Instance 3 should validate token from instance 1")

		// All instances should extract the same session ID
		assert.Equal(t, sessionId, claims1.SessionId)
		assert.Equal(t, sessionId, claims2.SessionId)
		assert.Equal(t, sessionId, claims3.SessionId)
		assert.Equal(t, claims1.SessionId, claims2.SessionId)
		assert.Equal(t, claims2.SessionId, claims3.SessionId)
	})

	// Test provider access consistency
	t.Run("provider_access_consistency", func(t *testing.T) {
		// All instances should be able to access the same providers
		provider1, exists1 := instance1.providers["test-mock-provider"]
		provider2, exists2 := instance2.providers["test-mock-provider"]
		provider3, exists3 := instance3.providers["test-mock-provider"]
		assert.True(t, exists1, "Instance 1 should have test-mock-provider")
		assert.True(t, exists2, "Instance 2 should have test-mock-provider")
		assert.True(t, exists3, "Instance 3 should have test-mock-provider")
		assert.Equal(t, provider1.Name(), provider2.Name())
		assert.Equal(t, provider2.Name(), provider3.Name())

		// Test authentication with the mock provider on all instances
		testToken := "valid_test_token"
		identity1, err1 := provider1.Authenticate(ctx, testToken)
		identity2, err2 := provider2.Authenticate(ctx, testToken)
		identity3, err3 := provider3.Authenticate(ctx, testToken)
		require.NoError(t, err1, "Instance 1 provider should authenticate successfully")
		require.NoError(t, err2, "Instance 2 provider should authenticate successfully")
		require.NoError(t, err3, "Instance 3 provider should authenticate successfully")

		// All instances should return identical identity information
		assert.Equal(t, identity1.UserID, identity2.UserID)
		assert.Equal(t, identity2.UserID, identity3.UserID)
		assert.Equal(t, identity1.Email, identity2.Email)
		assert.Equal(t, identity2.Email, identity3.Email)
		assert.Equal(t, identity1.Provider, identity2.Provider)
		assert.Equal(t, identity2.Provider, identity3.Provider)
	})
}

// TestSTSConfigurationValidation tests configuration validation for distributed deployments
func TestSTSConfigurationValidation(t *testing.T) {
	t.Run("consistent_signing_keys_required", func(t *testing.T) {
		// Different signing keys should result in incompatible token validation
		config1 := &STSConfig{
			TokenDuration:    FlexibleDuration{time.Hour},
			MaxSessionLength: FlexibleDuration{12 * time.Hour},
			Issuer:           "test-sts",
			SigningKey:       []byte("signing-key-1-32-characters-long"),
		}
		config2 := &STSConfig{
			TokenDuration:    FlexibleDuration{time.Hour},
			MaxSessionLength: FlexibleDuration{12 * time.Hour},
			Issuer:           "test-sts",
			SigningKey:       []byte("signing-key-2-32-characters-long"), // Different key!
		}

		instance1 := NewSTSService()
		instance2 := NewSTSService()
		err1 := instance1.Initialize(config1)
		err2 := instance2.Initialize(config2)
		require.NoError(t, err1)
		require.NoError(t, err2)

		// Generate token on instance 1
		sessionId := "test-session"
		expiresAt := time.Now().Add(time.Hour)
		token, err := instance1.GetTokenGenerator().GenerateSessionToken(sessionId, expiresAt)
		require.NoError(t, err)

		// Instance 1 should validate its own token
		_, err = instance1.GetTokenGenerator().ValidateSessionToken(token)
		assert.NoError(t, err, "Instance 1 should validate its own token")

		// Instance 2 should reject token from instance 1 (different signing key)
		_, err = instance2.GetTokenGenerator().ValidateSessionToken(token)
		assert.Error(t, err, "Instance 2 should reject token with different signing key")
	})

	t.Run("consistent_issuer_required", func(t *testing.T) {
		// Different issuers should result in incompatible tokens
		commonSigningKey := []byte("shared-signing-key-32-characters-lo")
		config1 := &STSConfig{
			TokenDuration:    FlexibleDuration{time.Hour},
			MaxSessionLength: FlexibleDuration{12 * time.Hour},
			Issuer:           "sts-instance-1",
			SigningKey:       commonSigningKey,
		}
		config2 := &STSConfig{
			TokenDuration:    FlexibleDuration{time.Hour},
			MaxSessionLength: FlexibleDuration{12 * time.Hour},
			Issuer:           "sts-instance-2", // Different issuer!
			SigningKey:       commonSigningKey,
		}

		instance1 := NewSTSService()
		instance2 := NewSTSService()
		err1 := instance1.Initialize(config1)
		err2 := instance2.Initialize(config2)
		require.NoError(t, err1)
		require.NoError(t, err2)

		// Generate token on instance 1
		sessionId := "test-session"
		expiresAt := time.Now().Add(time.Hour)
		token, err := instance1.GetTokenGenerator().GenerateSessionToken(sessionId, expiresAt)
		require.NoError(t, err)

		// Instance 2 should reject token due to issuer mismatch
		// (Even though signing key is the same, issuer validation will fail)
		_, err = instance2.GetTokenGenerator().ValidateSessionToken(token)
		assert.Error(t, err, "Instance 2 should reject token with different issuer")
	})
}

// TestProviderFactoryDistributed tests the provider factory in distributed scenarios
func TestProviderFactoryDistributed(t *testing.T) {
	factory := NewProviderFactory()

	// Simulate configuration that would be identical across all instances
	configs := []*ProviderConfig{
		{
			Name:    "production-keycloak",
			Type:    "oidc",
			Enabled: true,
			Config: map[string]interface{}{
				"issuer":       "https://keycloak.company.com/realms/seaweedfs",
				"clientId":     "seaweedfs-prod",
				"clientSecret": "super-secret-key",
				"jwksUri":      "https://keycloak.company.com/realms/seaweedfs/protocol/openid-connect/certs",
				"scopes":       []string{"openid", "profile", "email", "roles"},
			},
		},
		{
			Name:    "backup-oidc",
			Type:    "oidc",
			Enabled: false, // Disabled by default
			Config: map[string]interface{}{
				"issuer":   "https://backup-oidc.company.com",
				"clientId": "seaweedfs-backup",
			},
		},
	}

	// Create providers multiple times (simulating multiple instances)
	providers1, err1 := factory.LoadProvidersFromConfig(configs)
	providers2, err2 := factory.LoadProvidersFromConfig(configs)
	providers3, err3 := factory.LoadProvidersFromConfig(configs)
	require.NoError(t, err1, "First load should succeed")
	require.NoError(t, err2, "Second load should succeed")
	require.NoError(t, err3, "Third load should succeed")

	// All instances should have same provider counts
	assert.Len(t, providers1, 1, "First instance should have 1 enabled provider")
	assert.Len(t, providers2, 1, "Second instance should have 1 enabled provider")
	assert.Len(t, providers3, 1, "Third instance should have 1 enabled provider")

	// All instances should have same provider names
	names1 := make([]string, 0, len(providers1))
	names2 := make([]string, 0, len(providers2))
	names3 := make([]string, 0, len(providers3))
	for name := range providers1 {
		names1 = append(names1, name)
	}
	for name := range providers2 {
		names2 = append(names2, name)
	}
	for name := range providers3 {
		names3 = append(names3, name)
	}
	assert.ElementsMatch(t, names1, names2, "Instance 1 and 2 should have same provider names")
	assert.ElementsMatch(t, names2, names3, "Instance 2 and 3 should have same provider names")

	// Verify specific providers
	expectedProviders := []string{"production-keycloak"}
	assert.ElementsMatch(t, names1, expectedProviders, "Should have expected enabled providers")

	// Verify disabled providers are not included
	assert.NotContains(t, names1, "backup-oidc", "Disabled providers should not be loaded")
	assert.NotContains(t, names2, "backup-oidc", "Disabled providers should not be loaded")
	assert.NotContains(t, names3, "backup-oidc", "Disabled providers should not be loaded")
}