seaweedFS

Author	SHA1	Message	Date
Chris Lu	74b5c57dcf	credential/filer_etc: migrate to multi-file identity storage	2026-01-25 13:48:46 -08:00
Chris Lu	6bf088cec9	IAM Policy Management via gRPC (#8109 ) * Add IAM gRPC service definition - Add GetConfiguration/PutConfiguration for config management - Add CreateUser/GetUser/UpdateUser/DeleteUser/ListUsers for user management - Add CreateAccessKey/DeleteAccessKey/GetUserByAccessKey for access key management - Methods mirror existing IAM HTTP API functionality * Add IAM gRPC handlers on filer server - Implement IamGrpcServer with CredentialManager integration - Handle configuration get/put operations - Handle user CRUD operations - Handle access key create/delete operations - All methods delegate to CredentialManager for actual storage * Wire IAM gRPC service to filer server - Add CredentialManager field to FilerOption and FilerServer - Import credential store implementations in filer command - Initialize CredentialManager from credential.toml if available - Register IAM gRPC service on filer gRPC server - Enable credential management via gRPC alongside existing filer services * Regenerate IAM protobuf with gRPC service methods * iam_pb: add Policy Management to protobuf definitions * credential: implement PolicyManager in credential stores * filer: implement IAM Policy Management RPCs * shell: add s3.policy command * test: add integration test for s3.policy * test: fix compilation errors in policy_test * pb * fmt * test * weed shell: add -policies flag to s3.configure This allows linking/unlinking IAM policies to/from identities directly from the s3.configure command. * test: verify s3.configure policy linking and fix port allocation - Added test case for linking policies to users via s3.configure - Implemented findAvailablePortPair to ensure HTTP and gRPC ports are both available, avoiding conflicts with randomized port assignments. - Updated assertion to match jsonpb output (policyNames) * credential: add StoreTypeGrpc constant * credential: add IAM gRPC store boilerplate * credential: implement identity methods in gRPC store * credential: implement policy methods in gRPC store * admin: use gRPC credential store for AdminServer This ensures that all IAM and policy changes made through the Admin UI are persisted via the Filer's IAM gRPC service instead of direct file manipulation. * shell: s3.configure use granular IAM gRPC APIs instead of full config patching * shell: s3.configure use granular IAM gRPC APIs * shell: replace deprecated ioutil with os in s3.policy * filer: use gRPC FailedPrecondition for unconfigured credential manager * test: improve s3.policy integration tests and fix error checks * ci: add s3 policy shell integration tests to github workflow * filer: fix LoadCredentialConfiguration error handling * credential/grpc: propagate unmarshal errors in GetPolicies * filer/grpc: improve error handling and validation * shell: use gRPC status codes in s3.configure * credential: document PutPolicy as create-or-replace * credential/postgres: reuse CreatePolicy in PutPolicy to deduplicate logic * shell: add timeout context and strictly enforce flags in s3.policy * iam: standardize policy content field naming in gRPC and proto * shell: extract slice helper functions in s3.configure * filer: map credential store errors to gRPC status codes * filer: add input validation for UpdateUser and CreateAccessKey * iam: improve validation in policy and config handlers * filer: ensure IAM service registration by defaulting credential manager * credential: add GetStoreName method to manager * test: verify policy deletion in integration test	2026-01-25 13:39:30 -08:00
Lisandro Pin	59d40f7186	Return volume server state flags via `VolumeServerStatus()` RPCs. (#8016 )	2026-01-24 21:45:23 -08:00
Chris Lu	8814c2a07d	iam: support ForAnyValue and ForAllValues condition set operators (#8105 ) * iam: support ForAnyValue and ForAllValues condition set operators This implementation adds support for AWS-style IAM condition set operators `ForAnyValue:` and `ForAllValues:`. These are essential for trust policies that evaluate collection-based claims like `oidc:roles` or groups. - Updated EvaluateStringCondition to handle set operators. - Added set operator support to numeric, date, and boolean conditions. - ForAnyValue matches if any request value matches any condition value (default). - ForAllValues matches if every request value matches at least one condition value. * iam: add test suite for condition set operators * iam: ensure ForAllValues is vacuously true for all condition types Aligned Numeric, Date, and Boolean conditions with AWS IAM behavior where ForAllValues returns true when the request context values are empty. * iam: add Date vacuously true test case for ForAllValues * iam: expand policy variables in case-insensitive string conditions Added expandPolicyVariables support to evaluateStringConditionIgnoreCase to ensure consistency with case-sensitive counterparts. * iam: fix negation issues in string set operators Refactored EvaluateStringCondition and evaluateStringConditionIgnoreCase to evaluate operators (including negation) per context value before aggregating. This ensures StringNotEquals and StringNotLike work correctly with ForAllValues and ForAnyValue. * iam: add []string support for Date and Boolean context values Ensures consistency with Numeric conditions by allowing context values to be provided as slices of strings, which is common in JSON/OIDC claims. * iam: simplify redundant type check in policy engine The `evaluateStringConditionIgnoreCase` function had a redundant type check for `string` in the `default` block of a type switch that already handled the `string` case. * iam: remove outdated "currently fails" comment in negation tests * iam: add StringLikeIgnoreCase condition support * iam: explicitly handle empty context sets for ForAnyValue AWS IAM treats empty request sets as "no match" for ForAnyValue. Added an explicit check and comment to make this behavior clear. * iam: refactor EvaluateStringCondition to expand policy variables once Avoid redundant calls to expandPolicyVariables by expanding them once per condition value instead of inside awsIAMMatch or in the exact matching branch. * iam: fix StringLike case sensitivity to match AWS IAM specs StringLike and StringNotLike condition operators are case-sensitive in AWS IAM. Changed the implementation to use filepath.Match for case-sensitive wildcard matching instead of the case-insensitive awsIAMMatch. * iam: integrate StringLike case-sensitivity test into suite Integrated the case-sensitivity verification into condition_set_test.go and updated the consistency test to use StringLikeIgnoreCase to maintain its case-insensitive matching verification. * iam: fix NumericNotEquals logic to follow "not equal to any" semantics Updated evaluateNumericCondition to correctly handle NumericNotEquals by ensuring a context value matches only if it is not equal to ANY of the provided expected values. Also added support for []string expected values. * iam: fix DateNotEquals logic and integrate tests Updated evaluateDateCondition to correctly handle DateNotEquals logic. Integrated the new test cases for NumericNotEquals and DateNotEquals into condition_set_test.go. * iam: fix validation error in integrated NotEquals tests Added missing Resource field to IAM policy statements in condition_set_test.go to satisfy validation requirements. * iam: add set operator support for IP and Null conditions Implemented ForAllValues and ForAnyValue support for IpAddress, NotIpAddress, and Null condition operators. Also added test coverage for ForAnyValue with an empty context to ensure correct behavior. * iam: refine IP condition evaluation to handle multiple policy value types Updated evaluateIPCondition to correctly handle string, []string, and []interface{} values for IP address conditions in policy documents. Added IpAddress:SingleStringValue test case to verify consistency. * iam: refine Null and case-insensitive string conditions - Reverted evaluateNullCondition to standard AWS behavior (no set operators). - Refactored evaluateStringConditionIgnoreCase to use idiomatic helpers (strings.EqualFold and AwsWildcardMatch). - Cleaned up tests in condition_set_test.go. * iam: normalize policy value handling across condition evaluators - Implemented normalizeRanges helper for consistent IP range extraction. - Expanded type switches in IP, Bool, and String condition evaluators to support string, []string, and []interface{} policy values. - Fixed ForAnyValue bool matching to support string slices. - Added targeted tests for []string policy values in condition_set_test.go. * iam: refactor IP condition to support arbitrary context keys Refactored evaluateIPCondition to iterate through all keys in the condition block instead of hardcoding aws:SourceIp. This ensures consistency with other condition types and allows custom context keys. Added IpAddress:CustomContextKey test case to verify the change.	2026-01-24 13:34:49 -08:00
Chris Lu	d3f79d4c38	Update detection.go	2026-01-23 21:38:51 -08:00
Chris Lu	6394e2f6a5	Fix IAM OIDC role mapping and OIDC claims in trust policy (#8104 ) * Fix IAM OIDC role mapping and OIDC claims in trust policy * Address PR review: Add config safety checks and refactor tests	2026-01-23 21:35:26 -08:00
Chris Lu	57a16b0b87	Improve error handling in GetObjectStoreUsers per PR review	2026-01-23 20:34:39 -08:00
Chris Lu	e559b8df37	Refactor Admin UI to use unified IAM storage and add Shutdown hook	2026-01-23 20:29:21 -08:00
Chris Lu	81009c1a81	Refactor IAM Storage: Multi-File Backend & Unified Interface (#8102 ) Refactor IAM Shutdown to use sync.Once for thread safety	2026-01-23 20:27:22 -08:00
Chris Lu	f6318edbc9	Refactor Admin UI to use unified IAM storage and add MultipleFileStore (#8101 ) * Refactor Admin UI to use unified IAM storage and add MultipleFileStore * Address PR feedback: fix renames, error handling, and sync logic in FilerMultipleStore * Address refined PR feedback: safe rename order, rollback logic, and structural sync refinement * Optimize LoadConfiguration: use streaming callback for memory efficiency * Refactor UpdateUser: log rollback failures during rename * Implement PolicyManager for FilerMultipleStore * include the filer_multiple backend configuration * Implement cross-S3 synchronization and proper shutdown for all IAM backends * Extract Admin UI refactoring to a separate PR	2026-01-23 20:12:59 -08:00
Chris Lu	535be3096b	Add AWS IAM integration tests and refactor admin authorization (#8098 ) * Add AWS IAM integration tests and refactor admin authorization - Added AWS IAM management integration tests (User, AccessKey, Policy) - Updated test framework to support IAM client creation with JWT/OIDC - Refactored s3api authorization to be policy-driven for IAM actions - Removed hardcoded role name checks for admin privileges - Added new tests to GitHub Actions basic test matrix * test(s3/iam): add UpdateUser and UpdateAccessKey tests and fix nil pointer dereference * feat(s3api): add DeletePolicy and update tests with cleanup logic * test(s3/iam): use t.Cleanup for managed policy deletion in CreatePolicy test	2026-01-23 16:41:51 -08:00
Chris Lu	25a4691135	Update store_ec_recovery_test.go	2026-01-23 16:38:36 -08:00
Chris Lu	d664ca5ed3	fix: IAM authentication with AWS Signature V4 and environment credentials (#8099 ) * fix: IAM authentication with AWS Signature V4 and environment credentials Three key fixes for authenticated IAM requests to work: 1. Fix request body consumption before signature verification - iamMatcher was calling r.ParseForm() which consumed POST body - This broke AWS Signature V4 verification on subsequent reads - Now only check query string in matcher, preserving body for verification - File: weed/s3api/s3api_server.go 2. Preserve environment variable credentials across config reloads - After IAM mutations, config reload overwrote env var credentials - Extract env var loading into loadEnvironmentVariableCredentials() - Call after every config reload to persist credentials - File: weed/s3api/auth_credentials.go 3. Add authenticated IAM tests and test infrastructure - New TestIAMAuthenticated suite with AWS SDK + Signature V4 - Dynamic port allocation for independent test execution - Flag reset to prevent state leakage between tests - CI workflow to run S3 and IAM tests separately - Files: test/s3/example/, .github/workflows/s3-example-integration-tests.yml All tests pass: - TestIAMCreateUser (unauthenticated) - TestIAMAuthenticated (with AWS Signature V4) - S3 integration tests fmt * chore: rename test/s3/example to test/s3/normal * simplify: CI runs all integration tests in single job * Update s3-example-integration-tests.yml * ci: run each test group separately to avoid raft registry conflicts	2026-01-23 16:27:42 -08:00
Chris Lu	b203ed4124	Fix imbalance detection disk type grouping and volume grow errors (#8097 ) * Fix imbalance detection disk type grouping and volume grow errors This PR addresses two issues: 1. Imbalance Detection: Previously, balance detection did not verify disk types, leading to false positives when comparing heterogenous nodes (e.g. SSD vs HDD). Logic is now updated to group volumes by DiskType before calculating imbalance. 2. Volume Grow Errors: Fixed a variable scope issue in master_grpc_server_volume.go and added a pre-check for available space to prevent 'only 0 volumes left' error logs when a disk type is full or abandoned. Included units tests for the detection logic. * Refactor balance detection loop into detectForDiskType * Fix potential panic in volume grow logic by checking replica placement parse error	2026-01-23 12:25:11 -08:00
Lisandro Pin	7e81c0bf0d	Clarfiy errors upon needle CRC mismatches. (#8096 )	2026-01-23 10:48:29 -08:00
Chris Lu	e717a63665	Fix EC shard recovery with improved diagnostics (#8091 ) * storage: fix EC shard recovery with improved diagnostics and logging - Fix buffer size mismatch in ReconstructData call - Add detailed logging of available and missing shards - Improve error messages when recovery is impossible - Add unit tests for EC recovery shard counting logic * test: refine EC recovery unit tests - Remove redundant tests that only validate setup - Use standard strings.Contains instead of custom recursive helper * adjust tests and minor improvement	2026-01-22 20:34:19 -08:00
Chris Lu	bc1113208d	fix: S3 listing NextMarker missing intermediate directory component (#8089 ) * fix: S3 listing NextMarker missing intermediate directory component When listing with nested prefixes like "character/member/", the NextMarker was incorrectly constructed as "character/res024/" instead of "character/member/res024/", causing continuation requests to fail. Root cause: The code at line 331 was constructing NextMarker as: nextMarker = requestDir + "/" + nextMarker This worked when nextMarker already contained the full relative path, but failed when it was just the entry name from the innermost recursion. Fix: Include the prefix component when constructing NextMarker: if prefix != "" { nextMarker = requestDir + "/" + prefix + "/" + nextMarker } This ensures the full path is always constructed correctly for both: - CommonPrefix entries (directories) - Regular entries (files) Also includes fix for cursor.prefixEndsOnDelimiter state leak that was causing sibling directories to be incorrectly listed. * test: add regression tests for NextMarker construction Add comprehensive unit tests to verify NextMarker is correctly constructed with nested prefixes. Tests cover: - Regular entries with nested prefix (character/member/res024) - CommonPrefix entries (directories) - Edge cases (no requestDir, no prefix, deeply nested) These tests ensure the fix prevents regression of the bug where NextMarker was missing intermediate directory components.	2026-01-22 16:56:35 -08:00
Chris Lu	066410dbd0	Fix S3 Gateway Read Failover #8076 (#8087 ) * fix s3 read failover #8076 - Implement cache invalidation in vidMapClient - Add retry logic in shared PrepareStreamContentWithThrottler - Update S3 Gateway to use FilerClient directly for invalidation support - Remove obsolete simpleMasterClient struct * improve observability for chunk re-lookup failures Added a warning log when volume location re-lookup fails after cache invalidation in PrepareStreamContentWithThrottler. * address code review feedback - Prevent infinite retry loops by comparing old/new URLs before retry - Update fileId2Url map after successful re-lookup for subsequent references - Add comprehensive test coverage for failover logic - Add tests for InvalidateCache method * Fix: prevent data duplication in stream retry and improve VidMap robustness * Cleanup: remove redundant check in InvalidateCache	2026-01-22 14:07:24 -08:00
Chris Lu	2e9a7e13e2	cast i to int64 first, ensuring the calculation happens in 64-bit space fix https://github.com/seaweedfs/seaweedfs/issues/8086	2026-01-22 14:05:45 -08:00
Chris Lu	5472061231	Fix: Populate Claims from STS session RequestContext for policy variable substitution (#8082 ) * Fix: Populate Claims from STS session RequestContext for policy variable substitution When using STS temporary credentials (from AssumeRoleWithWebIdentity) with AWS Signature V4 authentication, JWT claims like preferred_username were not available for bucket policy variable substitution (e.g., ${jwt:preferred_username}). Root Cause: - STS session tokens store user claims in the req_ctx field (added in PR #8079) - validateSTSSessionToken() created Identity but didn't populate Claims field - authorizeWithIAM() created IAMIdentity but didn't copy Claims - Policy engine couldn't resolve ${jwt:} variables without claims Changes: 1. auth_signature_v4.go: Extract claims from sessionInfo.RequestContext and populate Identity.Claims in validateSTSSessionToken() 2. auth_credentials.go: Copy Claims when creating IAMIdentity in authorizeWithIAM() 3. auth_sts_identity_test.go: Add TestSTSIdentityClaimsPopulation to verify claims are properly populated from RequestContext This enables bucket policies with JWT claim variables to work correctly with STS temporary credentials obtained via AssumeRoleWithWebIdentity. Fixes #8037 Refactor: Idiomatic map population for STS claims	2026-01-21 18:36:24 -08:00
Chris Lu	51735e667c	Fix S3 conditional writes with versioning (Issue #8073 ) (#8080 ) * Fix S3 conditional writes with versioning (Issue #8073) Refactors conditional header checks to properly resolve the latest object version when versioning is enabled. This prevents incorrect validation against non-versioned root objects. * Add integration test for S3 conditional writes with versioning (Issue #8073) * Refactor: Propagate internal errors in conditional header checks - Make resolveObjectEntry return errors from isVersioningConfigured - Update checkConditionalHeaders checks to return 500 on internal resolve errors * Refactor: Stricter error handling and test assertions - Propagate internal errors in checkConditionalHeadersWithGetter functions - Enforce strict 412 PreconditionFailed check in integration test Perf: Add early return for conditional headers + safety improvements - Add fast path to skip resolveObjectEntry when no conditional headers present - Avoids expensive getLatestObjectVersion retries in common case - Add nil checks before dereferencing pointers in integration test - Fix grammar in test comments - Remove duplicate comment in resolveObjectEntry * Refactor: Use errors.Is for robust ErrNotFound checking - Update checkConditionalHeaders* to use errors.Is(err, filer_pb.ErrNotFound) - Update resolveObjectEntry to use errors.Is for wrapped error compatibility - Remove duplicate comment lines in s3api handlers * Perf: Optimize resolveObjectEntry for conditional checks - Refactor getLatestObjectVersion to doGetLatestObjectVersion supporting variable retries - Use 1-retry path in resolveObjectEntry to avoid exponential backoff latency * Test: Enhance integration test with content verification - Verify actual object content equals expected content after successful conditional write - Add missing io and errors imports to test file * Refactor: Final refinements based on feedback - Optimize header validation by passing parsed headers to avoid redundant parsing - Simplify integration test assertions using require.Error and assert.True - Fix build errors in s3api handler and test imports * Test: Use smithy.APIError for robust error code checking - Replace string-based error checking with structured API error - Add smithy-go import for AWS SDK v2 error handling * Test: Use types.PreconditionFailed and handle io.ReadAll error - Replace smithy.APIError with more specific types.PreconditionFailed - Add proper error handling for io.ReadAll in content verification * Refactor: Use combined error checking and add nil guards - Use smithy.APIError with ErrorCode() for robust error checking - Add nil guards for entry.Attributes before accessing Mtime - Prevents potential panics when Attributes is uninitialized	2026-01-21 16:36:18 -08:00
粒粒橙	52882aed70	fix(s3api): missing `Vary: Origin` header on non-CORS and `OPTIONS` requests (#8072 ) * fix: Refactor CORS middleware to consistently apply the `Vary: Origin` header when a configuration exists and streamline request processing logic. * fix: Add Vary: Origin header to CORS OPTIONS responses and refactor request handling for clarity and correctness. * fix: update CORS middleware tests to correctly parse and check for 'Origin' in Vary header. * refactor: extract `hasVaryOrigin` helper function to simplify Vary header checks in tests. * test: Remove `Vary: Origin` header from CORS test expectations. * refactor: consolidate CORS request handling into a new `processCORS` method using a `next` callback.	2026-01-21 14:04:57 -08:00
Chris Lu	cd2e93bf2b	fix: propagate OIDC attributes to STS session token for IAM policies (#8079 ) * fix: propagate OIDC attributes to STS session token * refactor: apply PR suggestions for STS session claims	2026-01-21 13:27:33 -08:00
Chris Lu	16c8aac7c9	minor	2026-01-21 13:05:28 -08:00
Chris Lu	7d788ae73c	Fix: S3 CORS headers missing for non-existent buckets (#8078 ) Fix S3 CORS for non-existent buckets Enable fallback to global CORS configuration when a bucket is not found (s3err.ErrNoSuchBucket). This ensures consistent CORS behavior and prevents information disclosure.	2026-01-21 12:50:51 -08:00
Chris Lu	3f879b8d2b	copy the aws keys	2026-01-20 18:12:32 -08:00
Chris Lu	3d1f710485	remove the .versions directory when all versions are deleted	2026-01-20 18:08:40 -08:00
Chris Lu	f6a2ef11ff	Fix CORS headers not applied to non-existent bucket responses (#8070 ) Fixes #8065 Problem: - CORS headers were only applied after checking bucket existence - Non-existent buckets returned responses without CORS headers - This caused CORS preflight failures and information disclosure vulnerability - Unauthenticated users could infer bucket existence from CORS header presence Solution: - Moved CORS evaluation before bucket existence check in middleware - CORS headers now applied consistently regardless of bucket existence - Preflight requests succeed for non-existent buckets (matching AWS S3) - Actual requests still return NoSuchBucket error but with CORS headers Changes: - Modified Handler() and HandleOptionsRequest() in middleware.go - Added comprehensive test suite for non-existent bucket scenarios - All 39 tests passing (31 existing + 8 new) Security Impact: - Prevents information disclosure about bucket existence - Bucket existence cannot be inferred from CORS header presence/absence AWS S3 Compatibility: - Improved compatibility with AWS S3 CORS behavior - Preflight requests now succeed for non-existent buckets	2026-01-20 16:15:46 -08:00
Chris Lu	13dcf445a4	Fix maintenance worker panic and add EC integration tests (#8068 ) * Fix nil pointer panic in maintenance worker when receiving empty task assignment When a worker requests a task and none are available, the admin server sends an empty TaskAssignment message. The worker was attempting to log the task details without checking if the TaskId was empty, causing a nil pointer dereference when accessing taskAssign.Params.VolumeId. This fix adds a check for empty TaskId before processing the assignment, preventing worker crashes and improving stability in production environments. * Add EC integration test for admin-worker maintenance system Adds comprehensive integration test that verifies the end-to-end flow of erasure coding maintenance tasks: - Admin server detects volumes needing EC encoding - Workers register and receive task assignments - EC encoding is executed and verified in master topology - File read-back validation confirms data integrity The test uses unique absolute working directories for each worker to prevent ID conflicts and ensure stable worker registration. Includes proper cleanup and process management for reliable test execution. * Improve maintenance system stability and task deduplication - Add cross-type task deduplication to prevent concurrent maintenance operations on the same volume (EC, balance, vacuum) - Implement HasAnyTask check in ActiveTopology for better coordination - Increase RequestTask timeout from 5s to 30s to prevent unnecessary worker reconnections - Add TaskTypeNone sentinel for generic task checks - Update all task detectors to use HasAnyTask for conflict prevention - Improve config persistence and schema handling * Add GitHub Actions workflow for EC integration tests Adds CI workflow that runs EC integration tests on push and pull requests to master branch. The workflow: - Triggers on changes to admin, worker, or test files - Builds the weed binary - Runs the EC integration test suite - Uploads test logs as artifacts on failure for debugging This ensures the maintenance system remains stable and worker-admin integration is validated in CI. * go version 1.24 * address comments * Update maintenance_integration.go * support seconds * ec prioritize over balancing in tests	2026-01-20 15:07:43 -08:00
KyoungYun-K	59dfe047b6	Support for cacheMetaTtlSec option in fuse command (#8063 )	2026-01-19 22:52:47 -08:00
Chris Lu	bc8a077561	Fix: Propagate OIDC claims for dynamic IAM policies (#8060 ) Fix: Propagate OIDC claims to IAM identity for dynamic policy variables Fixes #8037. Ensures additional OIDC claims (like preferred_username) are preserved in ExternalIdentity attributes and propagated to IAM tokens, enabling substitution in dynamic policies.	2026-01-19 13:39:18 -08:00
Chris Lu	bc64ed51c5	Fix CopyObject If-Match ETag mismatch by copying Md5 attribute (#8053 )	2026-01-18 20:28:01 -08:00
Chris Lu	bc853bdee5	4.07	2026-01-18 15:48:09 -08:00
Chris Lu	ce8e2db893	Merge branch 'master' of https://github.com/seaweedfs/seaweedfs	2026-01-18 15:04:59 -08:00
Chris Lu	3e5d34dd67	skip md5 validation if Content-MD5 is not provided	2026-01-18 15:04:56 -08:00
SoSweetHam	2662420194	fix(s3api): correct wildcard matching (#8052 ) * fix(s3api): correct wildcard matching * chore(tests): add multi-slash test case in ref. to cases provided here https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_resource.html\#reference_policies_elements_resource_wildcards * fix: gemini suggestions	2026-01-18 14:54:03 -08:00
Chris Lu	753e1db096	Prevent split-brain: Persistent ClusterID and Join Validation (#8022 ) * Prevent split-brain: Persistent ClusterID and Join Validation - Persist ClusterId in Raft store to survive restarts. - Validate ClusterId on Raft command application (piggybacked on MaxVolumeId). - Prevent masters with conflicting ClusterIds from joining/operating together. - Update Telemetry to report the persistent ClusterId. * Refine ClusterID validation based on feedback - Improved error message in cluster_commands.go. - Added ClusterId mismatch check in RaftServer.Recovery. * Handle Raft errors and support Hashicorp Raft for ClusterId - Check for errors when persisting ClusterId in legacy Raft. - Implement ClusterId generation and persistence for Hashicorp Raft leader changes. - Ensure consistent error logging. * Refactor ClusterId validation - Centralize ClusterId mismatch check in Topology.SetClusterId. - Simplify MaxVolumeIdCommand.Apply and RaftServer.Recovery to rely on SetClusterId. * Fix goroutine leak and add timeout - Handle channel closure in Hashicorp Raft leader listener. - Add timeout to Raft Apply call to prevent blocking. * Fix deadlock in legacy Raft listener - Wrap ClusterId generation/persistence in a goroutine to avoid blocking the Raft event loop (deadlock). * Rename ClusterId to SystemId - Renamed ClusterId to SystemId across the codebase (protobuf, topology, server, telemetry). - Regenerated telemetry.pb.go with new field. * Rename SystemId to TopologyId - Rename to SystemId was intermediate step. - Final name is TopologyId for the persistent cluster identifier. - Updated protobuf, topology, raft server, master server, and telemetry. * Optimize Hashicorp Raft listener - Integrated TopologyId generation into existing monitorLeaderLoop. - Removed extra goroutine in master_server.go. * Fix optimistic TopologyId update - Removed premature local state update of TopologyId in master_server.go and raft_hashicorp.go. - State is now solely updated via the Raft state machine Apply/Restore methods after consensus. * Add explicit log for recovered TopologyId - Added glog.V(0) info log in RaftServer.Recovery to print the recovered TopologyId on startup. * Add Raft barrier to prevent TopologyId race condition - Implement ensureTopologyId helper method - Send no-op MaxVolumeIdCommand to sync Raft log before checking TopologyId - Ensures persisted TopologyId is recovered before generating new one - Prevents race where generation happens during log replay * Serialize TopologyId generation with mutex - Add topologyIdGenLock mutex to MasterServer struct - Wrap ensureTopologyId method with lock to prevent concurrent generation - Fixes race where event listener and manual leadership check both generate IDs - Second caller waits for first to complete and sees the generated ID * Add TopologyId recovery logging to Apply method - Change log level from V(1) to V(0) for visibility - Log 'Recovered TopologyId' when applying from Raft log - Ensures recovery is visible whether from snapshot or log replay - Matches Recovery() method logging for consistency * Fix Raft barrier timing issue - Add 100ms delay after barrier command to ensure log application completes - Add debug logging to track barrier execution and TopologyId state - Return early if barrier command fails - Prevents TopologyId generation before old logs are fully applied * ensure leader * address comments * address comments * redundant * clean up * double check * refactoring * comment	2026-01-18 14:02:34 -08:00
Chris Lu	ce23c4fca7	missing changes	2026-01-17 23:15:33 -08:00
Chris Lu	b8dc8d12f2	ErrNoSuchKey should not be reported as an error in the logs	2026-01-17 23:07:49 -08:00
Chris Lu	8880f9932f	filer: auto clean empty implicit s3 folders (#8051 ) * filer: auto clean empty s3 implicit folders Explicitly tag implicitly created S3 folders (parent directories from object uploads) with 'Seaweed-X-Amz-Implicit-Dir'. Update EmptyFolderCleaner to check for this attribute and cache the result efficiently. * filer: correctly handle nil attributes in empty folder cleaner cache * filer: refine implicit tagging logic Prevent tagging buckets as implicit directories. Reduce code duplication. * filer: safeguard GetEntryAttributes against nil entry and not found error * filer: move ErrNotFound handling to EmptyFolderCleaner * filer: add comment to explain level > 3 check for implicit directories	2026-01-17 22:10:15 -08:00
Chris Lu	1dedc8daf9	adjust logs	2026-01-17 18:40:48 -08:00
Chris Lu	6bc5a64a98	Add access key status management to Admin UI (#8050 ) * Add access key status management to Admin UI - Add Status field to AccessKeyInfo struct - Implement UpdateAccessKeyStatus API endpoint - Add status dropdown in access keys modal - Fix modal backdrop issue by using refreshAccessKeysList helper - Status can be toggled between Active and Inactive * Replace magic strings with constants for access key status - Define AccessKeyStatusActive and AccessKeyStatusInactive constants in admin_data.go - Define STATUS_ACTIVE and STATUS_INACTIVE constants in JavaScript - Replace all hardcoded 'Active' and 'Inactive' strings with constants - Update error messages to use constants for consistency * Remove duplicate manageAccessKeys function definition * Add security improvements to access key status management - Add status validation in UpdateAccessKeyStatus to prevent invalid values - Fix XSS vulnerability by replacing inline onchange with data attributes - Add delegated event listener for status select changes - Add URL encoding to API request path segments	2026-01-17 18:18:32 -08:00
Chris Lu	dbde8983a7	Fix bucket permission persistence in Admin UI (#8049 ) Fix bucket permission persistence and security issues (#7226) Security Fixes: - Fix XSS vulnerability in showModal by using DOM methods instead of template strings for title - Add escapeHtmlForAttribute helper to properly escape all HTML entities (&, <, >, ", ') - Fix XSS in showSecretKey and showNewAccessKeyModal by using proper HTML escaping - Fix XSS in createAccessKeysContent by replacing inline onclick with data attributes and event delegation Code Cleanup: - Remove debug label "(DEBUG)" from page header - Remove debug console.log statements from buildBucketPermissionsNew - Remove dead functions: addBucketPermissionRow, removeBucketPermissionRow, parseBucketPermissions, buildBucketPermissions Validation Improvements: - Add validation in handleUpdateUser to prevent empty permissions submission - Update buildBucketPermissionsNew to return null when no buckets selected (instead of empty array) - Add proper error messages for validation failures UI Improvements: - Enhanced access key management with proper modals and copy buttons - Improved copy-to-clipboard functionality with fallbacks Fixes #7226	2026-01-17 12:54:21 -08:00
Chris Lu	796a911cb3	Prevent bucket renaming in filer, fuse mount, and S3 (#8048 ) * prevent bucket renaming in filer, fuse mount, s3 * refactor CanRename to support context propagation * harden bucket rename validation to fail closed on find error	2026-01-16 19:48:09 -08:00
Chris Lu	a473278bfa	Fix: Fail fast on unsupported volume versions (#8047 ) * Fix: Fail fast when initializing volume with Version 0 * Fix: Fail fast when loading unsupported volume version (e.g. 0 or 4) * Refactor: Use IsSupportedVersion helper function for version validation	2026-01-16 19:19:18 -08:00
Chris Lu	0a46577700	Fix #8040 : Support '_default' keyword in collectionPattern to match default collection (#8046 ) * Fix #8040: Support 'default' keyword in collectionPattern to match default collection The default collection in SeaweedFS is represented as an empty string internally. Previously, it was impossible to specifically target only the default collection because: - Empty collectionPattern matched ALL collections (filter was skipped) - Using collectionPattern="default" tried to match the literal string "default" This commit adds special handling for the keyword "default" in collectionPattern across multiple shell commands: - volume.tier.move - volume.list - volume.fix.replication - volume.configure.replication Now users can use -collectionPattern="default" to specifically target volumes in the default collection (empty collection name), while maintaining backward compatibility where empty pattern matches all collections. Updated help text to document this feature. * Update compileCollectionPattern to support 'default' keyword This extends the fix to all commands that use regex-based collection pattern matching: - ec.encode - ec.decode - volume.tier.download - volume.balance The compileCollectionPattern function now treats "default" as a special keyword that compiles to the regex "^$" (matching empty strings), making it consistent with the other commands that use filepath.Match. * Use CollectionDefault constant instead of hardcoded "default" string Refactored the collection pattern matching logic to use a central constant CollectionDefault defined in weed/shell/common.go. This improves maintainability and ensures consistency across all shell commands. * Address PR review feedback: simplify logic and use '_default' keyword Changes: 1. Changed CollectionDefault from "default" to "_default" to avoid collision with literal collection names 2. Simplified pattern matching logic to reduce code duplication across all affected commands 3. Fixed error handling in command_volume_tier_move.go to properly propagate filepath.Match errors instead of swallowing them 4. Updated documentation to clarify how to match a literal "default" collection using regex patterns like "^default$" This addresses all feedback from PR review comments. * Remove unnecessary documentation about matching literal 'default' Since we changed the keyword to '_default', users can now simply use 'default' to match a literal collection named "default". The previous documentation about using regex patterns was confusing and no longer needed. * Fix error propagation and empty pattern handling 1. command_volume_tier_move.go: Added early termination check after eachDataNode callback to stop processing remaining nodes if a pattern matching error occurred, improving efficiency 2. command_volume_configure_replication.go: Fixed empty pattern handling to match all collections (collectionMatched = true when pattern is empty), mirroring the behavior in other commands These changes address the remaining PR review feedback.	2026-01-16 12:31:48 -08:00
Chris Lu	ee3813787e	feat(s3api): Implement S3 Policy Variables (#8039 ) * feat: Add AWS IAM Policy Variables support to S3 API Implements policy variables for dynamic access control in bucket policies. Supported variables: - aws:username - Extracted from principal ARN - aws:userid - User identifier (same as username in SeaweedFS) - aws:principaltype - IAMUser, IAMRole, or AssumedRole - jwt:* - Any JWT claim (e.g., jwt:preferred_username, jwt:sub) Key changes: - Added PolicyVariableRegex to detect ${...} patterns - Extended CompiledStatement with DynamicResourcePatterns, DynamicPrincipalPatterns, DynamicActionPatterns - Added Claims field to PolicyEvaluationArgs for JWT claim access - Implemented SubstituteVariables() for variable replacement from context and JWT claims - Implemented extractPrincipalVariables() for ARN parsing - Updated EvaluateConditions() to support variable substitution - Comprehensive unit and integration tests Resolves #8037 * feat: Add LDAP and PrincipalAccount variable support Completes future enhancements for policy variables: - Added ldap:* variable support for LDAP claims - ldap:username - LDAP username from claims - ldap:dn - LDAP distinguished name from claims - ldap:* - Any LDAP claim - Added aws:PrincipalAccount extraction from ARN - Extracts account ID from principal ARN - Available as ${aws:PrincipalAccount} in policies Updated SubstituteVariables() to check LDAP claims Updated extractPrincipalVariables() to extract account ID Added comprehensive tests for new variables * feat(s3api): implement IAM policy variables core logic and optimization * feat(s3api): integrate policy variables with S3 authentication and handlers * test(s3api): add integration tests for policy variables * cleanup: remove unused policy conversion files * Add S3 policy variables integration tests and path support - Add comprehensive integration tests for policy variables - Test username isolation, JWT claims, LDAP claims - Add support for IAM paths in principal ARN parsing - Add tests for principals with paths * Fix IAM Role principal variable extraction IAM Roles should not have aws:userid or aws:PrincipalAccount according to AWS behavior. Only IAM Users and Assumed Roles should have these variables. Fixes TestExtractPrincipalVariables test failures. * Security fixes and bug fixes for S3 policy variables SECURITY FIXES: - Prevent X-SeaweedFS-Principal header spoofing by clearing internal headers at start of authentication (auth_credentials.go) - Restrict policy variable substitution to safe allowlist to prevent client header injection (iam/policy/policy_engine.go) - Add core policy validation before storing bucket policies BUG FIXES: - Remove unused sid variable in evaluateStatement - Fix LDAP claim lookup to check both prefixed and unprefixed keys - Add ValidatePolicy call in PutBucketPolicyHandler These fixes prevent privilege escalation via header injection and ensure only validated identity claims are used in policy evaluation. * Additional security fixes and code cleanup SECURITY FIXES: - Fixed X-Forwarded-For spoofing by only trusting proxy headers from private/localhost IPs (s3_iam_middleware.go) - Changed context key from "sourceIP" to "aws:SourceIp" for proper policy variable substitution CODE IMPROVEMENTS: - Kept aws:PrincipalAccount for IAM Roles to support condition evaluations - Removed redundant STS principaltype override - Removed unused service variable - Cleaned up commented-out debug logging statements - Updated tests to reflect new IAM Role behavior These changes prevent IP spoofing attacks and ensure policy variables work correctly with the safe allowlist. * Add security documentation for ParseJWTToken Added comprehensive security comments explaining that ParseJWTToken is safe despite parsing without verification because: - It's only used for routing to the correct verification method - All code paths perform cryptographic verification before trusting claims - OIDC tokens: validated via validateExternalOIDCToken - STS tokens: validated via ValidateSessionToken Enhanced function documentation with clear security warnings about proper usage to prevent future misuse. * Fix IP condition evaluation to use aws:SourceIp key Fixed evaluateIPCondition in IAM policy engine to use "aws:SourceIp" instead of "sourceIP" to match the updated extractRequestContext. This fixes the failing IP-restricted role test where IP-based policy conditions were not being evaluated correctly. Updated all test cases to use the correct "aws:SourceIp" key. * Address code review feedback: optimize and clarify PERFORMANCE IMPROVEMENT: - Optimized expandPolicyVariables to use regexp.ReplaceAllStringFunc for single-pass variable substitution instead of iterating through all safe variables. This improves performance from O(nm) to O(m) where n is the number of safe variables and m is the pattern length. CODE CLARITY: - Added detailed comment explaining LDAP claim fallback mechanism (checks both prefixed and unprefixed keys for compatibility) - Enhanced TODO comment for trusted proxy configuration with rationale and recommendations for supporting cloud load balancers, CDNs, and complex network topologies All tests passing. Address Copilot code review feedback BUG FIXES: - Fixed type switch for int/int32/int64 - separated into individual cases since interface type switches only match the first type in multi-type cases - Fixed grammatically incorrect error message in types.go CODE QUALITY: - Removed duplicate Resource/NotResource validation (already in ValidateStatement) - Added comprehensive comment explaining isEnabled() logic and security implications - Improved trusted proxy NOTE comment to be more concise while noting limitations All tests passing. * Fix test failures after extractSourceIP security changes Updated tests to work with the security fix that only trusts X-Forwarded-For/X-Real-IP headers from private IP addresses: - Set RemoteAddr to 127.0.0.1 in tests to simulate trusted proxy - Changed context key from "sourceIP" to "aws:SourceIp" - Added test case for untrusted proxy (public RemoteAddr) - Removed invalid ValidateStatement call (validation happens in ValidatePolicy) All tests now passing. * Address remaining Gemini code review feedback CODE SAFETY: - Deep clone Action field in CompileStatement to prevent potential data races if the original policy document is modified after compilation TEST CLEANUP: - Remove debug logging (fmt.Fprintf) from engine_notresource_test.go - Remove unused imports in engine_notresource_test.go All tests passing. * Fix insecure JWT parsing in IAM auth flow SECURITY FIX: - Renamed ParseJWTToken to ParseUnverifiedJWTToken with explicit security warnings. - Refactored AuthenticateJWT to use the trusted SessionInfo returned by ValidateSessionToken instead of relying on unverified claims from the initial parse. - Refactored ValidatePresignedURLWithIAM to reuse the robust AuthenticateJWT logic, removing duplicated and insecure manual token parsing. This ensures all identity information (Role, Principal, Subject) used for authorization decisions is derived solely from cryptographically verified tokens. * Security: Fix insecure JWT claim extraction in policy engine - Refactored EvaluatePolicy to accept trusted claims from verified Identity instead of parsing unverified tokens - Updated AuthenticateJWT to populate Claims in IAMIdentity from verified sources (SessionInfo/ExternalIdentity) - Updated s3api_server and handlers to pass claims correctly - Improved isPrivateIP to support IPv6 loopback, link-local, and ULA - Fixed flaky distributed_session_consistency test with retry logic * fix(iam): populate Subject in STSSessionInfo to ensure correct identity propagation This fixes the TestS3IAMAuthentication/valid_jwt_token_authentication failure by ensuring the session subject (sub) is correctly mapped to the internal SessionInfo struct, allowing bucket ownership validation to succeed. * Optimized isPrivateIP * Create s3-policy-tests.yml * fix tests * fix tests * tests(s3/iam): simplify policy to resource-based \ (step 1) * tests(s3/iam): add explicit Deny NotResource for isolation (step 2) * fixes * policy: skip resource matching for STS trust policies to allow AssumeRole evaluation * refactor: remove debug logging and hoist policy variables for performance * test: fix TestS3IAMBucketPolicyIntegration cleanup to handle per-subtest object lifecycle * test: fix bucket name generation to comply with S3 63-char limit * test: skip TestS3IAMPolicyEnforcement until role setup is implemented * test: use weed mini for simpler test server deployment Replace 'weed server' with 'weed mini' for IAM tests to avoid port binding issues and simplify the all-in-one server deployment. This improves test reliability and execution time. * security: prevent allocation overflow in policy evaluation Add maxPoliciesForEvaluation constant to cap the number of policies evaluated in a single request. This prevents potential integer overflow when allocating slices for policy lists that may be influenced by untrusted input. Changes: - Add const maxPoliciesForEvaluation = 1024 to set an upper bound - Validate len(policies) < maxPoliciesForEvaluation before appending bucket policy - Use append() instead of make([]string, len+1) to avoid arithmetic overflow - Apply fix to both IsActionAllowed policy evaluation paths	2026-01-16 11:12:28 -08:00
Chris Lu	7eb90fdfd7	Enhance EC balancing to separate parity and data shards (#8038 ) * Enhance EC balancing to separate parity and data shards across racks * Rename avoidRacks to antiAffinityRacks for clarity * Implement server-level EC separation for parity/data shards * Optimize EC balancing: consolidate helpers and extract two-pass selection logic * Add comprehensive edge case tests for EC balancing logic * Apply code review feedback: rename select_(), add divide-by-zero guard, fix comment * Remove unused parameters from doBalanceEcShardsWithinOneRack and add explicit anti-affinity check * Add disk-level anti-affinity for data/parity shard separation - Modified pickBestDiskOnNode to accept shardId and dataShardCount - Implemented explicit anti-affinity: 1000-point penalty for placing data shards on disks with parity (and vice versa) - Updated all call sites including balancing and evacuation - For evacuation, disabled anti-affinity by passing dataShardCount=0	2026-01-15 12:43:44 -08:00
Chris Lu	905e7e72d9	Add remote.copy.local command to copy local files to remote storage (#8033 ) * Add remote.copy.local command to copy local files to remote storage This new command solves the issue described in GitHub Discussion #8031 where files exist locally but are not synced to remote storage due to missing filer logs. Features: - Copies local-only files to remote storage - Supports file filtering (include/exclude patterns) - Dry run mode to preview actions - Configurable concurrency for performance - Force update option for existing remote files - Comprehensive error handling with retry logic Usage: remote.copy.local -dir=/path/to/mount/dir [options] This addresses the need to manually sync files when filer logs were deleted or when local files were never synced to remote storage. * shell: rename commandRemoteLocalSync to commandRemoteCopyLocal * test: add comprehensive remote cache integration tests * shell: fix forceUpdate logic in remote.copy.local The previous logic only allowed force updates when localEntry.RemoteEntry was not nil, which defeated the purpose of using -forceUpdate to fix inconsistencies where local metadata might be missing. Now -forceUpdate will overwrite remote files whenever they exist, regardless of local metadata state. * shell: fix code review issues in remote.copy.local - Return actual error from flag parsing instead of swallowing it - Use sync.Once to safely capture first error in concurrent operations - Add atomic counter to track actual successful copies - Protect concurrent writes to output with mutex to prevent interleaving - Fix path matching to prevent false positives with sibling directories (e.g., /mnt/remote2 no longer matches /mnt/remote) * test: address code review nitpicks in integration tests - Improve create_bucket error handling to fail on real errors - Fix test assertions to properly verify expected failures - Use case-insensitive string matching for error detection - Replace weak logging-only tests with proper assertions - Remove extra blank line in Makefile * test: remove redundant edge case tests Removed 5 tests that were either duplicates or didn't assert meaningful behavior: - TestEdgeCaseEmptyDirectory (duplicate of TestRemoteCopyLocalEmptyDirectory) - TestEdgeCaseRapidCacheUncache (no meaningful assertions) - TestEdgeCaseConcurrentCommands (only logs errors, no assertions) - TestEdgeCaseInvalidPaths (no security assertions) - TestEdgeCaseFileNamePatterns (duplicate of pattern tests in cache tests) Kept valuable stress tests: nested directories, special characters, very large files (100MB), many small files (100), and zero-byte files. * test: fix CI failures by forcing localhost IP advertising Added -ip=127.0.0.1 flag to both primary and remote weed mini commands to prevent IP auto-detection issues in CI environments. Without this flag, the master would advertise itself using the actual IP (e.g., 10.1.0.17) while binding to 127.0.0.1, causing connection refused errors when other services tried to connect to the gRPC port. * test: address final code review issues - Add proper error assertions for concurrent commands test - Require errors for invalid path tests instead of just logging - Remove unused 'match' field from pattern test struct - Add dry-run output assertion to verify expected behavior - Simplify redundant condition in remote.copy.local (remove entry.RemoteEntry check) * test: fix remote.configure tests to match actual validation rules - Use only letters in remote names (no numbers) to match validation - Relax missing parameter test expectations since validation may not be strict - Generate unique names using letter suffix instead of numbers * shell: rename pathToCopyCopy to localPath for clarity Improved variable naming in concurrent copy loop to make the code more readable and less repetitive. * test: fix remaining test failures - Remove strict error requirement for invalid paths (commands handle gracefully) - Fix TestRemoteUncacheBasic to actually test uncache instead of cache - Use simple numeric names for remote.configure tests (testcfg1234 format) to avoid validation issues with letter-only or complex name generation * test: use only letters in remote.configure test names The validation regex ^[A-Za-z][A-Za-z0-9]$ requires names to start with a letter, but using static letter-only names avoids any potential issues with the validation. test: remove quotes from -name parameter in remote.configure tests Single quotes were being included as part of the name value, causing validation failures. Changed from -name='testremote' to -name=testremote. * test: fix remote.configure assertion to be flexible about JSON formatting Changed from checking exact JSON format with specific spacing to just checking if the name appears in the output, since JSON formatting may vary (e.g., "name": "value" vs "name": "value").	2026-01-15 00:52:57 -08:00
Jaehoon Kim	f2e7af257d	Fix volume.fsck -forcePurging -reallyDeleteFromVolume to fail fast on filer traversal errors (#8015 ) * Add TraverseBfsWithContext and fix race conditions in error handling - Add TraverseBfsWithContext function to support context cancellation - Fix race condition in doTraverseBfsAndSaving using atomic.Bool and sync.Once - Improve error handling with fail-fast behavior and proper error propagation - Update command_volume_fsck to use error-returning saveFn callback - Enhance error messages in readFilerFileIdFile with detailed context * refactoring * fix error format * atomic * filer_pb: make enqueue return void * shell: simplify fs.meta.save error handling * filer_pb: handle enqueue return value * Revert "atomic" This reverts commit 712648bc354b186d6654fdb8a46fd4848fdc4e00. * shell: refine fs.meta.save logic --------- Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-01-14 21:37:50 -08:00

1 2 3 4 5 ...

8041 Commits