seaweedFS

Author	SHA1	Message	Date
Chris Lu	e596542295	Move SQL engine and PostgreSQL server to their own binaries (#8417) * Drop SQL engine and PostgreSQL server * Split SQL tooling into weed-db and weed-sql * move * fix building	2026-02-23 16:27:08 -08:00
Chris Lu	e4b70c2521	go fix	2026-02-20 18:42:00 -08:00
Chris Lu	8ec9ff4a12	Refactor plugin system and migrate worker runtime (#8369 ) * admin: add plugin runtime UI page and route wiring * pb: add plugin gRPC contract and generated bindings * admin/plugin: implement worker registry, runtime, monitoring, and config store * admin/dash: wire plugin runtime and expose plugin workflow APIs * command: add flags to enable plugin runtime * admin: rename remaining plugin v2 wording to plugin * admin/plugin: add detectable job type registry helper * admin/plugin: add scheduled detection and dispatch orchestration * admin/plugin: prefetch job type descriptors when workers connect * admin/plugin: add known job type discovery API and UI * admin/plugin: refresh design doc to match current implementation * admin/plugin: enforce per-worker scheduler concurrency limits * admin/plugin: use descriptor runtime defaults for scheduler policy * admin/ui: auto-load first known plugin job type on page open * admin/plugin: bootstrap persisted config from descriptor defaults * admin/plugin: dedupe scheduled proposals by dedupe key * admin/ui: add job type and state filters for plugin monitoring * admin/ui: add per-job-type plugin activity summary * admin/plugin: split descriptor read API from schema refresh * admin/ui: keep plugin summary metrics global while tables are filtered * admin/plugin: retry executor reservation before timing out * admin/plugin: expose scheduler states for monitoring * admin/ui: show per-job-type scheduler states in plugin monitor * pb/plugin: rename protobuf package to plugin * admin/plugin: rename pluginRuntime wiring to plugin * admin/plugin: remove runtime naming from plugin APIs and UI * admin/plugin: rename runtime files to plugin naming * admin/plugin: persist jobs and activities for monitor recovery * admin/plugin: lease one detector worker per job type * admin/ui: show worker load from plugin heartbeats * admin/plugin: skip stale workers for detector and executor picks * plugin/worker: add plugin worker command 
and stream runtime scaffold * plugin/worker: implement vacuum detect and execute handlers * admin/plugin: document external vacuum plugin worker starter * command: update plugin.worker help to reflect implemented flow * command/admin: drop legacy Plugin V2 label * plugin/worker: validate vacuum job type and respect min interval * plugin/worker: test no-op detect when min interval not elapsed * command/admin: document plugin.worker external process * plugin/worker: advertise configured concurrency in hello * command/plugin.worker: add jobType handler selection * command/plugin.worker: test handler selection by job type * command/plugin.worker: persist worker id in workingDir * admin/plugin: document plugin.worker jobType and workingDir flags * plugin/worker: support cancel request for in-flight work * plugin/worker: test cancel request acknowledgements * command/plugin.worker: document workingDir and jobType behavior * plugin/worker: emit executor activity events for monitor * plugin/worker: test executor activity builder * admin/plugin: send last successful run in detection request * admin/plugin: send cancel request when detect or execute context ends * admin/plugin: document worker cancel request responsibility * admin/handlers: expose plugin scheduler states API in no-auth mode * admin/handlers: test plugin scheduler states route registration * admin/plugin: keep worker id on worker-generated activity records * admin/plugin: test worker id propagation in monitor activities * admin/dash: always initialize plugin service * command/admin: remove plugin enable flags and default to enabled * admin/dash: drop pluginEnabled constructor parameter * admin/plugin UI: stop checking plugin enabled state * admin/plugin: remove docs for plugin enable flags * admin/dash: remove unused plugin enabled check method * admin/dash: fallback to in-memory plugin init when dataDir fails * admin/plugin API: expose worker gRPC port in status * command/plugin.worker: resolve admin gRPC 
port via plugin status * split plugin UI into overview/configuration/monitoring pages * Update layout_templ.go * add volume_balance plugin worker handler * wire plugin.worker CLI for volume_balance job type * add erasure_coding plugin worker handler * wire plugin.worker CLI for erasure_coding job type * support multi-job handlers in plugin worker runtime * allow plugin.worker jobType as comma-separated list * admin/plugin UI: rename to Workers and simplify config view * plugin worker: queue detection requests instead of capacity reject * Update plugin_worker.go * plugin volume_balance: remove force_move/timeout from worker config UI * plugin erasure_coding: enforce local working dir and cleanup * admin/plugin UI: rename admin settings to job scheduling * admin/plugin UI: persist and robustly render detection results * admin/plugin: record and return detection trace metadata * admin/plugin UI: show detection process and decision trace * plugin: surface detector decision trace as activities * mini: start a plugin worker by default * admin/plugin UI: split monitoring into detection and execution tabs * plugin worker: emit detection decision trace for EC and balance * admin workers UI: split monitoring into detection and execution pages * plugin scheduler: skip proposals for active assigned/running jobs * admin workers UI: add job queue tab * plugin worker: add dummy stress detector and executor job type * admin workers UI: reorder tabs to detection queue execution * admin workers UI: regenerate plugin template * plugin defaults: include dummy stress and add stress tests * plugin dummy stress: rotate detection selections across runs * plugin scheduler: remove cross-run proposal dedupe * plugin queue: track pending scheduled jobs * plugin scheduler: wait for executor capacity before dispatch * plugin scheduler: skip detection when waiting backlog is high * plugin: add disk-backed job detail API and persistence * admin ui: show plugin job detail modal from job id links * 
plugin: generate unique job ids instead of reusing proposal ids * plugin worker: emit heartbeats on work state changes * plugin registry: round-robin tied executor and detector picks * add temporary EC overnight stress runner * plugin job details: persist and render EC execution plans * ec volume details: color data and parity shard badges * shard labels: keep parity ids numeric and color-only distinction * admin: remove legacy maintenance UI routes and templates * admin: remove dead maintenance endpoint helpers * Update layout_templ.go * remove dummy_stress worker and command support * refactor plugin UI to job-type top tabs and sub-tabs * migrate weed worker command to plugin runtime * remove plugin.worker command and keep worker runtime with metrics * update helm worker args for jobType and execution flags * set plugin scheduling defaults to global 16 and per-worker 4 * stress: fix RPC context reuse and remove redundant variables in ec_stress_runner * admin/plugin: fix lifecycle races, safe channel operations, and terminal state constants * admin/dash: randomize job IDs and fix priority zero-value overwrite in plugin API * admin/handlers: implement buffered rendering to prevent response corruption * admin/plugin: implement debounced persistence flusher and optimize BuildJobDetail memory lookups * admin/plugin: fix priority overwrite and implement bounded wait in scheduler reserve * admin/plugin: implement atomic file writes and fix run record side effects * admin/plugin: use P prefix for parity shard labels in execution plans * admin/plugin: enable parallel execution for cancellation tests * admin: refactor time.Time fields to pointers for better JSON omitempty support * admin/plugin: implement pointer-safe time assignments and comparisons in plugin core * admin/plugin: fix time assignment and sorting logic in plugin monitor after pointer refactor * admin/plugin: update scheduler activity tracking to use time pointers * admin/plugin: fix time-based run history 
trimming after pointer refactor * admin/dash: fix JobSpec struct literal in plugin API after pointer refactor * admin/view: add D/P prefixes to EC shard badges for UI consistency * admin/plugin: use lifecycle-aware context for schema prefetching * Update ec_volume_details_templ.go * admin/stress: fix proposal sorting and log volume cleanup errors * stress: refine ec stress runner with math/rand and collection name - Added Collection field to VolumeEcShardsDeleteRequest for correct filename construction. - Replaced crypto/rand with seeded math/rand PRNG for bulk payloads. - Added documentation for EcMinAge zero-value behavior. - Added logging for ignored errors in volume/shard deletion. * admin: return internal server error for plugin store failures Changed error status code from 400 Bad Request to 500 Internal Server Error for failures in GetPluginJobDetail to correctly reflect server-side errors. * admin: implement safe channel sends and graceful shutdown sync - Added sync.WaitGroup to Plugin struct to manage background goroutines. - Implemented safeSendCh helper using recover() to prevent panics on closed channels. - Ensured Shutdown() waits for all background operations to complete. * admin: robustify plugin monitor with nil-safe time and record init - Standardized nil-safe assignment for time.Time pointers (CreatedAt, UpdatedAt, CompletedAt). - Ensured persistJobDetailSnapshot initializes new records correctly if they don't exist on disk. - Fixed debounced persistence to trigger immediate write on job completion. admin: improve scheduler shutdown behavior and logic guards - Replaced brittle error string matching with explicit r.shutdownCh selection for shutdown detection. - Removed redundant nil guard in buildScheduledJobSpec. - Standardized WaitGroup usage for schedulerLoop. * admin: implement deep copy for job parameters and atomic write fixes - Implemented deepCopyGenericValue and used it in cloneTrackedJob to prevent shared state. 
- Ensured atomicWriteFile creates parent directories before writing. * admin: remove unreachable branch in shard classification Removed an unreachable 'totalShards <= 0' check in classifyShardID as dataShards and parityShards are already guarded. * admin: secure UI links and use canonical shard constants - Added rel="noopener noreferrer" to external links for security. - Replaced magic number 14 with erasure_coding.TotalShardsCount. - Used renderEcShardBadge for missing shard list consistency. * admin: stabilize plugin tests and fix regressions - Composed a robust plugin_monitor_test.go to handle asynchronous persistence. - Updated all time.Time literals to use timeToPtr helper. - Added explicit Shutdown() calls in tests to synchronize with debounced writes. - Fixed syntax errors and orphaned struct literals in tests. * Potential fix for code scanning alert no. 278: Slice memory allocation with excessive size value Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * Potential fix for code scanning alert no. 283: Uncontrolled data used in path expression Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * admin: finalize refinements for error handling, scheduler, and race fixes - Standardized HTTP 500 status codes for store failures in plugin_api.go. - Tracked scheduled detection goroutines with sync.WaitGroup for safe shutdown. - Fixed race condition in safeSendDetectionComplete by extracting channel under lock. - Implemented deep copy for JobActivity details. - Used defaultDirPerm constant in atomicWriteFile. 
* test(ec): migrate admin dockertest to plugin APIs * admin/plugin_api: fix RunPluginJobTypeAPI to return 500 for server-side detection/filter errors * admin/plugin_api: fix ExecutePluginJobAPI to return 500 for job execution failures * admin/plugin_api: limit parseProtoJSONBody request body to 1MB to prevent unbounded memory usage * admin/plugin: consolidate regex to package-level validJobTypePattern; add char validation to sanitizeJobID * admin/plugin: fix racy Shutdown channel close with sync.Once * admin/plugin: track sendLoop and recv goroutines in WorkerStream with r.wg * admin/plugin: document writeProtoFiles atomicity — .pb is source of truth, .json is human-readable only * admin/plugin: extract activityLess helper to deduplicate nil-safe OccurredAt sort comparators * test/ec: check http.NewRequest errors to prevent nil req panics * test/ec: replace deprecated ioutil/math/rand, fix stale step comment 5.1→3.1 * plugin(ec): raise default detection and scheduling throughput limits * topology: include empty disks in volume list and EC capacity fallback * topology: remove hard 10-task cap for detection planning * Update ec_volume_details_templ.go * adjust default * fix tests --------- Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>	2026-02-18 13:42:41 -08:00
Chris Lu	3300874cb5	filer: add default log purging to master maintenance scripts (#8359) * filer: add default log purging to master maintenance scripts * filer: fix default maintenance scripts to include full set of tasks * filer: refactor maintenance scripts to avoid duplication	2026-02-16 16:58:15 -08:00
Chris Lu	0d8588e3ae	S3: Implement IAM defaults and STS signing key fallback (#8348 ) * S3: Implement IAM defaults and STS signing key fallback logic * S3: Refactor startup order to init SSE-S3 key manager before IAM * S3: Derive STS signing key from KEK using HKDF for security isolation * S3: Document STS signing key fallback in security.toml * fix(s3api): refine anonymous access logic and secure-by-default behavior - Initialize anonymous identity by default in `NewIdentityAccessManagement` to prevent nil pointer exceptions. - Ensure `ReplaceS3ApiConfiguration` preserves the anonymous identity if not present in the new configuration. - Update `NewIdentityAccessManagement` signature to accept `filerClient`. - In legacy mode (no policy engine), anonymous defaults to Deny (no actions), preserving secure-by-default behavior. - Use specific `LookupAnonymous` method instead of generic map lookup. - Update tests to accommodate signature changes and verify improved anonymous handling. * feat(s3api): make IAM configuration optional - Start S3 API server without a configuration file if `EnableIam` option is set. - Default to `Allow` effect for policy engine when no configuration is provided (Zero-Config mode). - Handle empty configuration path gracefully in `loadIAMManagerFromConfig`. - Add integration test `iam_optional_test.go` to verify empty config behavior. * fix(iamapi): fix signature mismatch in NewIdentityAccessManagementWithStore * fix(iamapi): properly initialize FilerClient instead of passing nil * fix(iamapi): properly initialize filer client for IAM management - Instead of passing `nil`, construct a `wdclient.FilerClient` using the provided `Filers` addresses. - Ensure `NewIdentityAccessManagementWithStore` receives a valid `filerClient` to avoid potential nil pointer dereferences or limited functionality. 
* clean: remove dead code in s3api_server.go * refactor(s3api): improve IAM initialization, safety and anonymous access security * fix(s3api): ensure IAM config loads from filer after client init * fix(s3): resolve test failures in integration, CORS, and tagging tests - Fix CORS tests by providing explicit anonymous permissions config - Fix S3 integration tests by setting admin credentials in init - Align tagging test credentials in CI with IAM defaults - Added goroutine to retry IAM config load in iamapi server * fix(s3): allow anonymous access to health targets and S3 Tables when identities are present * fix(ci): use /healthz for Caddy health check in awscli tests * iam, s3api: expose DefaultAllow from IAM and Policy Engine This allows checking the global "Open by Default" configuration from other components like S3 Tables. * s3api/s3tables: support DefaultAllow in permission logic and handler Updated CheckPermissionWithContext to respect the DefaultAllow flag in PolicyContext. This enables "Open by Default" behavior for unauthenticated access in zero-config environments. Added a targeted unit test to verify the logic. * s3api/s3tables: propagate DefaultAllow through handlers Propagated the DefaultAllow flag to individual handlers for namespaces, buckets, tables, policies, and tagging. This ensures consistent "Open by Default" behavior across all S3 Tables API endpoints. * s3api: wire up DefaultAllow for S3 Tables API initialization Updated registerS3TablesRoutes to query the global IAM configuration and set the DefaultAllow flag on the S3 Tables API server. This completes the end-to-end propagation required for anonymous access in zero-config environments. Added a SetDefaultAllow method to S3TablesApiServer to facilitate this. * s3api: fix tests by adding DefaultAllow to mock IAM integrations The IAMIntegration interface was updated to include DefaultAllow(), breaking several mock implementations in tests. 
This commit fixes the build errors by adding the missing method to the mocks. * env * ensure ports * env * env * fix default allow * add one more test using non-anonymous user * debug * add more debug * less logs	2026-02-16 13:59:13 -08:00
Lisandro Pin	0721e3c1e9	Rework volume compaction (a.k.a. vacuuming) logic to cleanly support new parameters. (#8337) We'll leverage this to support an "ignore broken needles" option, necessary to properly recover damaged volumes, as described in https://github.com/seaweedfs/seaweedfs/issues/7442#issuecomment-3897784283 .	2026-02-16 02:15:14 -08:00
Chris Lu	b57429ef2e	Switch empty-folder cleanup to bucket policy (#8292) * Fix Spark _temporary cleanup and add issue #8285 regression test * Generalize empty folder cleanup for Spark temp artifacts * Revert synchronous folder pruning and add cleanup diagnostics * Add actionable empty-folder cleanup diagnostics * Fix Spark temp marker cleanup in async folder cleaner * Fix Spark temp cleanup with implicit directory markers * Keep explicit directory markers non-implicit * logging * more logs * Switch empty-folder cleanup to bucket policy * Seaweed-X-Amz-Allow-Empty-Folders * less logs * go vet * less logs * refactoring	2026-02-10 18:38:38 -08:00
Chris Lu	ba8e2aaae9	Fix master leader election when grpc ports change (#8272) * Fix master leader detection when grpc ports change * Canonicalize self peer entry to avoid raft self-alias panic * Normalize and deduplicate master peer addresses	2026-02-09 18:13:02 -08:00
Chris Lu	e6ee293c17	Add table operations test (#8241 ) * Add Trino blog operations test * Update test/s3tables/catalog_trino/trino_blog_operations_test.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * feat: add table bucket path helpers and filer operations - Add table object root and table location mapping directories - Implement ensureDirectory, upsertFile, deleteEntryIfExists helpers - Support table location bucket mapping for S3 access * feat: manage table bucket object roots on creation/deletion - Create .objects directory for table buckets on creation - Clean up table object bucket paths on deletion - Enable S3 operations on table bucket object roots * feat: add table location mapping for Iceberg REST - Track table location bucket mappings when tables are created/updated/deleted - Enable location-based routing for S3 operations on table data * feat: route S3 operations to table bucket object roots - Route table-s3 bucket names to mapped table paths - Route table buckets to object root directories - Support table location bucket mapping lookup * feat: emit table-s3 locations from Iceberg REST - Generate unique table-s3 bucket names with UUID suffix - Store table metadata under table bucket paths - Return table-s3 locations for Trino compatibility * fix: handle missing directories in S3 list operations - Propagate ErrNotFound from ListEntries for non-existent directories - Treat missing directories as empty results for list operations - Fixes Trino non-empty location checks on table creation * test: improve Trino CSV parsing for single-value results - Sanitize Trino output to skip jline warnings - Handle single-value CSV results without header rows - Strip quotes from numeric values in tests * refactor: use bucket path helpers throughout S3 API - Replace direct bucket path operations with helper functions - Leverage centralized table bucket routing logic - Improve maintainability with consistent path 
resolution * fix: add table bucket cache and improve filer error handling - Cache table bucket lookups to reduce filer overhead on repeated checks - Use filer_pb.CreateEntry and filer_pb.UpdateEntry helpers to check resp.Error - Fix delete order in handler_bucket_get_list_delete: delete table object before directory - Make location mapping errors best-effort: log and continue, don't fail API - Update table location mappings to delete stale prior bucket mappings on update - Add 1-second sleep before timestamp time travel query to ensure timestamps are in past - Fix CSV parsing: examine all lines, not skip first; handle single-value rows * fix: properly handle stale metadata location mapping cleanup - Capture oldMetadataLocation before mutation in handleUpdateTable - Update updateTableLocationMapping to accept both old and new locations - Use passed-in oldMetadataLocation to detect location changes - Delete stale mapping only when location actually changes - Pass empty string for oldLocation in handleCreateTable (new tables have no prior mapping) - Improve logging to show old -> new location transitions * refactor: cleanup imports and cache design - Remove unused 'sync' import from bucket_paths.go - Use filer_pb.UpdateEntry helper in setExtendedAttribute and deleteExtendedAttribute for consistent error handling - Add dedicated tableBucketCache map[string]bool to BucketRegistry instead of mixing concerns with metadataCache - Improve cache separation: table buckets cache is now separate from bucket metadata cache * fix: improve cache invalidation and add transient error handling Cache invalidation (critical fix): - Add tableLocationCache to BucketRegistry for location mapping lookups - Clear tableBucketCache and tableLocationCache in RemoveBucketMetadata - Prevents stale cache entries when buckets are deleted/recreated Transient error handling: - Only cache table bucket lookups when conclusive (found or ErrNotFound) - Skip caching on transient errors (network, 
permission, etc) - Prevents marking real table buckets as non-table due to transient failures Performance optimization: - Cache tableLocationDir results to avoid repeated filer RPCs on hot paths - tableLocationDir now checks cache before making expensive filer lookups - Cache stores empty string for 'not found' to avoid redundant lookups Code clarity: - Add comment to deleteDirectory explaining DeleteEntry response lacks Error field * go fmt * fix: mirror transient error handling in tableLocationDir and optimize bucketDir Transient error handling: - tableLocationDir now only caches definitive results - Mirrors isTableBucket behavior to prevent treating transient errors as permanent misses - Improves reliability on flaky systems or during recovery Performance optimization: - bucketDir avoids redundant isTableBucket call via bucketRoot - Directly use s3a.option.BucketsPath for regular buckets - Saves one cache lookup for every non-table bucket operation * fix: revert bucketDir optimization to preserve bucketRoot logic The optimization to directly use BucketsPath bypassed bucketRoot's logic and caused issues with S3 list operations on delimiter+prefix cases. Revert to using path.Join(s3a.bucketRoot(bucket), bucket) which properly handles all bucket types and ensures consistent path resolution across the codebase. The slight performance cost of an extra cache lookup is worth the correctness and consistency benefits. * feat: move table buckets under /buckets Add a table-bucket marker attribute, reuse bucket metadata cache for table bucket detection, and update list/validation/UI/test paths to treat table buckets as /buckets entries. 
* Fix S3 Tables code review issues - handler_bucket_create.go: Fix bucket existence check to properly validate entryResp.Entry before setting s3BucketExists flag (nil Entry should not indicate existing bucket) - bucket_paths.go: Add clarifying comment to bucketRoot() explaining unified buckets root path for all bucket types - file_browser_data.go: Optimize by extracting table bucket check early to avoid redundant WithFilerClient call * Fix list prefix delimiter handling * Handle list errors conservatively * Fix Trino FOR TIMESTAMP query - use past timestamp Iceberg requires the timestamp to be strictly in the past. Use current_timestamp - interval '1' second instead of current_timestamp. --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-02-07 13:27:47 -08:00
Chris Lu	a3b83f8808	test: add Trino Iceberg catalog integration test (#8228 ) * test: add Trino Iceberg catalog integration test - Create test/s3/catalog_trino/trino_catalog_test.go with TestTrinoIcebergCatalog - Tests integration between Trino SQL engine and SeaweedFS Iceberg REST catalog - Starts weed mini with all services and Trino in Docker container - Validates Iceberg catalog schema creation and listing operations - Uses native S3 filesystem support in Trino with path-style access - Add workflow job to s3-tables-tests.yml for CI execution * fix: preserve AWS environment credentials when replacing S3 configuration When S3 configuration is loaded from filer/db, it replaces the identities list and inadvertently removes AWS_ACCESS_KEY_ID credentials that were added from environment variables. This caused auth to remain disabled even though valid credentials were present. Fix by preserving environment-based identities when replacing the configuration and re-adding them after the replacement. This ensures environment credentials persist across configuration reloads and properly enable authentication. * fix: use correct ServerAddress format with gRPC port encoding The admin server couldn't connect to master because the master address was missing the gRPC port information. Use pb.NewServerAddress() which properly encodes both HTTP and gRPC ports in the address string. Changes: - weed/command/mini.go: Use pb.NewServerAddress for master address in admin - test/s3/policy/policy_test.go: Store and use gRPC ports for master/filer addresses This fix applies to: 1. Admin server connection to master (mini.go) 2. Test shell commands that need master/filer addresses (policy_test.go) * move * move * fix: always include gRPC port in server address encoding The NewServerAddress() function was omitting the gRPC port from the address string when it matched the port+10000 convention. 
However, gRPC port allocation doesn't always follow this convention - when the calculated port is busy, an alternative port is allocated. This caused a bug where: 1. Master's gRPC port was allocated as 50661 (sequential, not port+10000) 2. Address was encoded as '192.168.1.66:50660' (gRPC port omitted) 3. Admin client called ToGrpcAddress() which assumed port+10000 offset 4. Admin tried to connect to 60660 but master was on 50661 → connection failed Fix: Always include explicit gRPC port in address format (host:httpPort.grpcPort) unless gRPC port is 0. This makes addresses unambiguous and works regardless of the port allocation strategy used. Impacts: All server-to-server gRPC connections now use properly formatted addresses. * test: fix Iceberg REST API readiness check The Iceberg REST API endpoints require authentication. When checked without credentials, the API returns 403 Forbidden (not 401 Unauthorized). The readiness check now accepts both auth error codes (401/403) as indicators that the service is up and ready, it just needs credentials. This fixes the 'Iceberg REST API did not become ready' test failure. * Fix AWS SigV4 signature verification for base64-encoded payload hashes AWS SigV4 canonical requests must use hex-encoded SHA256 hashes, but the X-Amz-Content-Sha256 header may be transmitted as base64. Changes: - Added normalizePayloadHash() function to convert base64 to hex - Call normalizePayloadHash() in extractV4AuthInfoFromHeader() - Added encoding/base64 import Fixes 403 Forbidden errors on POST requests to Iceberg REST API when clients send base64-encoded content hashes in the header. Impacted services: Iceberg REST API, S3Tables * Fix AWS SigV4 signature verification for base64-encoded payload hashes AWS SigV4 canonical requests must use hex-encoded SHA256 hashes, but the X-Amz-Content-Sha256 header may be transmitted as base64. 
Changes: - Added normalizePayloadHash() function to convert base64 to hex - Call normalizePayloadHash() in extractV4AuthInfoFromHeader() - Added encoding/base64 import - Removed unused fmt import Fixes 403 Forbidden errors on POST requests to Iceberg REST API when clients send base64-encoded content hashes in the header. Impacted services: Iceberg REST API, S3Tables * pass sigv4 * s3api: fix identity preservation and logging levels - Ensure environment-based identities are preserved during config replacement - Update accessKeyIdent and nameToIdentity maps correctly - Downgrade informational logs to V(2) to reduce noise * test: fix trino integration test and s3 policy test - Pin Trino image version to 479 - Fix port binding to 0.0.0.0 for Docker connectivity - Fix S3 policy test hang by correctly assigning MiniClusterCtx - Improve port finding robustness in policy tests * ci: pre-pull trino image to avoid timeouts - Pull trinodb/trino:479 after Docker setup - Ensure image is ready before integration tests start * iceberg: remove unused checkAuth and improve logging - Remove unused checkAuth method - Downgrade informational logs to V(2) - Ensure loggingMiddleware uses a status writer for accurate reported codes - Narrow catch-all route to avoid interfering with other subsystems * iceberg: fix build failure by removing unused s3api import * Update iceberg.go * use warehouse * Update trino_catalog_test.go	2026-02-06 13:12:25 -08:00
Chris Lu	a04e8dd00b	Support Linux file/dir ACL in weed mount (#8233) * Support Linux file/dir ACL in weed mount #8229 * Update weed/command/mount_std.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-02-06 11:33:36 -08:00
Chris Lu	e39a4c2041	fix flaky test	2026-02-04 23:16:31 -08:00
Chris Lu	bd4e7ff14e	command: fix s3 panic in filer command (#8208) command: fix s3 panic in filer command due to uninitialized options This fixes a nil pointer dereference panic when starting the S3 server via the `weed filer` command by correctly initializing `iamReadOnly` and `portIceberg` flags. Relates to #8200	2026-02-04 10:37:20 -08:00
Chris Lu	72a8f598f2	Fix Maintenance Task Sorting and Refactor Log Persistence (#8199 ) * fix float stepping * do not auto refresh * only logs when non 200 status * fix maintenance task sorting and cleanup redundant handler logic * Refactor log retrieval to persist to disk and fix slowness - Move log retrieval to disk-based persistence in GetMaintenanceTaskDetail - Implement background log fetching on task completion in worker_grpc_server.go - Implement async background refresh for in-progress tasks - Completely remove blocking gRPC calls from the UI path to fix 10s timeouts - Cleanup debug logs and performance profiling code * Ensure consistent deterministic sorting in config_persistence cleanup * Replace magic numbers with constants and remove debug logs - Added descriptive constants for truncation limits and timeouts in admin_server.go and worker_grpc_server.go - Replaced magic numbers with these constants throughout the codebase - Verified removal of stdout debug printing - Ensured consistent truncation logic during log persistence * Address code review feedback on history truncation and logging logic - Fix AssignmentHistory double-serialization by copying task in GetMaintenanceTaskDetail - Fix handleTaskCompletion logging logic (mutually exclusive success/failure logs) - Remove unused Timeout field from LogRequestContext and sync select timeouts with constants - Ensure AssignmentHistory is only provided in the top-level field for better JSON structure * Implement goroutine leak protection and request deduplication - Add request deduplication in RequestTaskLogs to prevent multiple concurrent fetches for the same task - Implement safe cleanup in timeout handlers to avoid race conditions in pendingLogRequests map - Add a 10s cooldown for background log refreshes in GetMaintenanceTaskDetail to prevent spamming - Ensure all persistent log-fetching goroutines are bounded and efficiently managed * Fix potential nil pointer panics in maintenance handlers - Add nil 
checks for adminServer in ShowTaskDetail, ShowMaintenanceWorkers, and UpdateTaskConfig - Update getMaintenanceQueueData to return a descriptive error instead of nil when adminServer is uninitialized - Ensure internal helper methods consistently check for adminServer initialization before use * Strictly enforce disk-only log reading - Remove background log fetching from GetMaintenanceTaskDetail to prevent timeouts and network calls during page view - Remove unused lastLogFetch tracking fields to clean up dead code - Ensure logs are only updated upon task completion via handleTaskCompletion * Refactor GetWorkerLogs to read from disk - Update /api/maintenance/workers/:id/logs endpoint to use configPersistence.LoadTaskExecutionLogs - Remove synchronous gRPC call RequestTaskLogs to prevent timeouts and bad gateway errors - Ensure consistent log retrieval behavior across the application (disk-only) * Fix timestamp parsing in log viewer - Update task_detail.templ JS to handle both ISO 8601 strings and Unix timestamps - Fix "Invalid time value" error when displaying logs fetched from disk - Regenerate templates * master: fallback to HDD if SSD volumes are full in Assign * worker: improve EC detection logging and fix skip counters * worker: add Sync method to TaskLogger interface * worker: implement Sync and ensure logs are flushed before task completion * admin: improve task log retrieval with retries and better timeouts * admin: robust timestamp parsing in task detail view	2026-02-04 08:48:55 -08:00
Chris Lu	2ff1cd9fc9	format	2026-02-03 18:39:01 -08:00
Chris Lu	1274cf038c	s3: enforce authentication and JSON error format for Iceberg REST Catalog (#8192 ) * s3: enforce authentication and JSON error format for Iceberg REST Catalog * s3/iceberg: align error exception types with OpenAPI spec examples * s3api: refactor AuthenticateRequest to return identity object * s3/iceberg: propagate full identity object to request context * s3/iceberg: differentiate NotAuthorizedException and ForbiddenException * s3/iceberg: reject requests if authenticator is nil to prevent auth bypass * s3/iceberg: refactor Auth middleware to build context incrementally and use switch for error mapping * s3api: update misleading comment for authRequestWithAuthType * s3api: return ErrAccessDenied if IAM is not configured to prevent auth bypass * s3/iceberg: optimize context update in Auth middleware * s3api: export CanDo for external authorization use * s3/iceberg: enforce identity-based authorization in all API handlers * s3api: fix compilation errors by updating internal CanDo references * s3/iceberg: robust identity validation and consistent action usage in handlers * s3api: complete CanDo rename across tests and policy engine integration * s3api: fix integration tests by allowing admin access when auth is disabled and explicit gRPC ports * duckdb * create test bucket	2026-02-03 11:55:12 -08:00
Chris Lu	2bb21ea276	feat: Add Iceberg REST Catalog server and admin UI (#8175 ) * feat: Add Iceberg REST Catalog server Implement Iceberg REST Catalog API on a separate port (default 8181) that exposes S3 Tables metadata through the Apache Iceberg REST protocol. - Add new weed/s3api/iceberg package with REST handlers - Implement /v1/config endpoint returning catalog configuration - Implement namespace endpoints (list/create/get/head/delete) - Implement table endpoints (list/create/load/head/delete/update) - Add -port.iceberg flag to S3 standalone server (s3.go) - Add -s3.port.iceberg flag to combined server mode (server.go) - Add -s3.port.iceberg flag to mini cluster mode (mini.go) - Support prefix-based routing for multiple catalogs The Iceberg REST server reuses S3 Tables metadata storage under /table-buckets and enables DuckDB, Spark, and other Iceberg clients to connect to SeaweedFS as a catalog. * feat: Add Iceberg Catalog pages to admin UI Add admin UI pages to browse Iceberg catalogs, namespaces, and tables. - Add Iceberg Catalog menu item under Object Store navigation - Create iceberg_catalog.templ showing catalog overview with REST info - Create iceberg_namespaces.templ listing namespaces in a catalog - Create iceberg_tables.templ listing tables in a namespace - Add handlers and routes in admin_handlers.go - Add Iceberg data provider methods in s3tables_management.go - Add Iceberg data types in types.go The Iceberg Catalog pages provide visibility into the same S3 Tables data through an Iceberg-centric lens, including REST endpoint examples for DuckDB and PyIceberg. 
* test: Add Iceberg catalog integration tests and reorg s3tables tests - Reorganize existing s3tables tests to test/s3tables/table-buckets/ - Add new test/s3tables/catalog/ for Iceberg REST catalog tests - Add TestIcebergConfig to verify /v1/config endpoint - Add TestIcebergNamespaces to verify namespace listing - Add TestDuckDBIntegration for DuckDB connectivity (requires Docker) - Update CI workflow to use new test paths * fix: Generate proper random UUIDs for Iceberg tables Address code review feedback: - Replace placeholder UUID with crypto/rand-based UUID v4 generation - Add detailed TODO comments for handleUpdateTable stub explaining the required atomic metadata swap implementation * fix: Serve Iceberg on localhost listener when binding to different interface Address code review feedback: properly serve the localhost listener when the Iceberg server is bound to a non-localhost interface. * ci: Add Iceberg catalog integration tests to CI Add new job to run Iceberg catalog tests in CI, along with: - Iceberg package build verification - Iceberg unit tests - Iceberg go vet checks - Iceberg format checks * fix: Address code review feedback for Iceberg implementation - fix: Replace hardcoded account ID with s3_constants.AccountAdminId in buildTableBucketARN() - fix: Improve UUID generation error handling with deterministic fallback (timestamp + PID + counter) - fix: Update handleUpdateTable to return HTTP 501 Not Implemented instead of fake success - fix: Better error handling in handleNamespaceExists to distinguish 404 from 500 errors - fix: Use relative URL in template instead of hardcoded localhost:8181 - fix: Add HTTP timeout to test's waitForService function to avoid hangs - fix: Use dynamic ephemeral ports in integration tests to avoid flaky parallel failures - fix: Add Iceberg port to final port configuration logging in mini.go * fix: Address critical issues in Iceberg implementation - fix: Cache table UUIDs to ensure persistence across LoadTable calls The 
UUID now remains stable for the lifetime of the server session. TODO: For production, UUIDs should be persisted in S3 Tables metadata. - fix: Remove redundant URL-encoded namespace parsing mux router already decodes %1F to \x1F before passing to handlers. Redundant ReplaceAll call could cause bugs with literal %1F in namespace. * fix: Improve test robustness and reduce code duplication - fix: Make DuckDB test more robust by failing on unexpected errors Instead of silently logging errors, now explicitly check for expected conditions (extension not available) and skip the test appropriately. - fix: Extract username helper method to reduce duplication Created getUsername() helper in AdminHandlers to avoid duplicating the username retrieval logic across Iceberg page handlers. * fix: Add mutex protection to table UUID cache Protects concurrent access to the tableUUIDs map with sync.RWMutex. Uses read-lock for fast path when UUID already cached, and write-lock for generating new UUIDs. Includes double-check pattern to handle race condition between read-unlock and write-lock. * style: fix go fmt errors * feat(iceberg): persist table UUID in S3 Tables metadata * feat(admin): configure Iceberg port in Admin UI and commands * refactor: address review comments (flags, tests, handlers) - command/mini: fix tracking of explicit s3.port.iceberg flag - command/admin: add explicit -iceberg.port flag - admin/handlers: reuse getUsername helper - tests: use 127.0.0.1 for ephemeral ports and os.Stat for file size check * test: check error from FileStat in verify_gc_empty_test	2026-02-02 23:12:13 -08:00
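The table UUID cache described in the commit above (crypto/rand-based UUID v4, sync.RWMutex, read-lock fast path, double-check under the write lock) could look roughly like the sketch below. This is not the SeaweedFS code: `uuidCache`, `getOrCreate`, and `newUUIDv4` are hypothetical names chosen for illustration, and the later commit step that persists UUIDs in S3 Tables metadata is not modeled.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"sync"
)

// newUUIDv4 returns a random RFC 4122 version 4 UUID string.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// uuidCache is a hypothetical stand-in for the per-table UUID cache.
type uuidCache struct {
	mu    sync.RWMutex
	uuids map[string]string
}

func newUUIDCache() *uuidCache {
	return &uuidCache{uuids: make(map[string]string)}
}

// getOrCreate returns the cached UUID for key, generating one on first use.
func (c *uuidCache) getOrCreate(key string) (string, error) {
	// Fast path: shared read lock when the UUID is already cached.
	c.mu.RLock()
	id, ok := c.uuids[key]
	c.mu.RUnlock()
	if ok {
		return id, nil
	}

	// Slow path: exclusive lock, then check again, because another
	// goroutine may have stored a UUID between RUnlock and Lock.
	c.mu.Lock()
	defer c.mu.Unlock()
	if id, ok := c.uuids[key]; ok {
		return id, nil
	}
	id, err := newUUIDv4()
	if err != nil {
		return "", err
	}
	c.uuids[key] = id
	return id, nil
}

func main() {
	c := newUUIDCache()
	a, _ := c.getOrCreate("namespace/table")
	b, _ := c.getOrCreate("namespace/table")
	fmt.Println(a == b) // the second call hits the cache
}
```

The second lookup under the write lock is what keeps two concurrent callers from generating different UUIDs for the same table.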
Chris Lu	2ee6e4f391	mount: refresh and evict hot dir cache (#8174) * mount: refresh and evict hot dir cache * mount: guard dir update window and extend TTL * mount: reuse timestamp for cache mark * Apply suggestion from @gemini-code-assist[bot] Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * mount: make dir cache tuning configurable * mount: dedupe dir update notices * mount: restore invalidate-all cache helper * mount: keep hot dir tuning constants * mount: centralize cache state reset * mount: mark refresh completion time * mount: allow disabling idle eviction --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-01-31 13:46:37 -08:00
Chris Lu	c106532b79	fix: prevent MiniClusterCtx race conditions in command shutdown Capture global MiniClusterCtx into local variables before goroutine/select evaluation to prevent nil dereference/data race when context is reset to nil after nil check. Applied to filer, master, volume, and s3 commands.	2026-01-28 19:42:16 -08:00
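The fix in the commit above (copy the global context into a local variable before the nil check and the select) can be illustrated with a small sketch. The names `clusterCtx`, `waitForShutdown`, and `demo` are hypothetical and only stand in for MiniClusterCtx and the command shutdown paths; the real code may differ.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// clusterCtx is a hypothetical package-level context that other code
// may reset to nil at any time (for example during test teardown).
var clusterCtx context.Context

// waitForShutdown captures the global once, so the nil check and the
// select both see the same value even if the global is reset later.
func waitForShutdown(timeout time.Duration) string {
	ctx := clusterCtx // capture before checking or selecting
	if ctx == nil {
		return "no context"
	}
	select {
	case <-ctx.Done():
		return "cancelled"
	case <-time.After(timeout):
		return "timeout"
	}
}

// demo installs an already-cancelled context and waits on it.
func demo() string {
	ctx, cancel := context.WithCancel(context.Background())
	clusterCtx = ctx
	cancel()
	return waitForShutdown(time.Second)
}

func main() {
	fmt.Println(demo())
}
```

Capturing the value removes the window between the nil check and the use. The read of the global itself is still unsynchronized in this sketch, so code that writes the variable concurrently would additionally need a mutex or an atomic value.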
Chris Lu	580c2b4ad4	command: fix stale error variable logging in filer serving goroutines - Use local 'err' variable instead of stale 'e' from outer scope - Applied to both TLS and non-TLS paths for local listener	2026-01-28 11:27:18 -08:00
Chris Lu	01c17478ae	command: implement graceful shutdown for mini cluster - Introduce MiniClusterCtx to coordinate shutdown across mini services - Update Master, Volume, Filer, S3, and WebDAV servers to respect context cancellation - Ensure all resources are cleaned up properly during test teardown - Integrate MiniClusterCtx in s3tables integration tests	2026-01-28 10:36:19 -08:00
Chris Lu	6542d1e0aa	Enable weed fuse on FreeBSD (#8146) * Enable weed fuse on FreeBSD * Update weed/command/fuse_notsupported.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * Update weed/command/fuse_std.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-01-27 21:37:23 -08:00
MorezMartin	20952aa514	Fix jwt error in admin UI (#8140 ) * add jwt token in weed admin headers requests * add jwt token to header for download * :s/upload/download * filer_signing.read despite of filer_signing key * finalize filer_browser_handlers.go * admin: add JWT authorization to file browser handlers * security: fix typos in JWT read validation descriptions * Move security.toml to example and secure keys * security: address PR feedback on JWT enforcement and example keys * security: refactor JWT logic and improve example keys readability * Update docker/Dockerfile.local Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Chris Lu <chris.lu@gmail.com> Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-01-27 17:27:02 -08:00
Chris Lu	551a31e156	Implement IAM propagation to S3 servers (#8130 ) * Implement IAM propagation to S3 servers - Add PropagatingCredentialStore to propagate IAM changes to S3 servers via gRPC - Add Policy management RPCs to S3 proto and S3ApiServer - Update CredentialManager to use PropagatingCredentialStore when MasterClient is available - Wire FilerServer to enable propagation * Implement parallel IAM propagation and fix S3 cluster registration - Parallelized IAM change propagation with 10s timeout. - Refined context usage in PropagatingCredentialStore. - Added S3Type support to cluster node management. - Enabled S3 servers to register with gRPC address to the master. - Ensured IAM configuration reload after policy updates via gRPC. * Optimize IAM propagation with direct in-memory cache updates * Secure IAM propagation: Use metadata to skip persistence only on propagation * pb: refactor IAM and S3 services for unidirectional IAM propagation - Move SeaweedS3IamCache service from iam.proto to s3.proto. - Remove legacy IAM management RPCs and empty SeaweedS3 service from s3.proto. - Enforce that S3 servers only use the synchronization interface. * pb: regenerate Go code for IAM and S3 services Updated generated code following the proto refactoring of IAM synchronization services. * s3api: implement read-only mode for Embedded IAM API - Add readOnly flag to EmbeddedIamApi to reject write operations via HTTP. - Enable read-only mode by default in S3ApiServer. - Handle AccessDenied error in writeIamErrorResponse. - Embed SeaweedS3IamCacheServer in S3ApiServer. * credential: refactor PropagatingCredentialStore for unidirectional IAM flow - Update to use s3_pb.SeaweedS3IamCacheClient for propagation to S3 servers. - Propagate full Identity object via PutIdentity for consistency. - Remove redundant propagation of specific user/account/policy management RPCs. - Add timeout context for propagation calls. 
* s3api: implement SeaweedS3IamCacheServer for unidirectional sync - Update S3ApiServer to implement the cache synchronization gRPC interface. - Methods (PutIdentity, RemoveIdentity, etc.) now perform direct in-memory cache updates. - Register SeaweedS3IamCacheServer in command/s3.go. - Remove registration for the legacy and now empty SeaweedS3 service. * s3api: update tests for read-only IAM and propagation - Added TestEmbeddedIamReadOnly to verify rejection of write operations in read-only mode. - Update test setup to pass readOnly=false to NewEmbeddedIamApi in routing tests. - Updated EmbeddedIamApiForTest helper with read-only checks matching production behavior. * s3api: add back temporary debug logs for IAM updates Log IAM updates received via: - gRPC propagation (PutIdentity, PutPolicy, etc.) - Metadata configuration reloads (LoadS3ApiConfigurationFromCredentialManager) - Core identity management (UpsertIdentity, RemoveIdentity) * IAM: finalize propagation fix with reduced logging and clarified architecture * Allow configuring IAM read-only mode for S3 server integration tests * s3api: add defensive validation to UpsertIdentity * s3api: fix log message to reference correct IAM read-only flag * test/s3/iam: ensure WaitForS3Service checks for IAM write permissions * test: enable writable IAM in Makefile for integration tests * IAM: add GetPolicy/ListPolicies RPCs to s3.proto * S3: add GetBucketPolicy and ListBucketPolicies helpers * S3: support storing generic IAM policies in IdentityAccessManagement * S3: implement IAM policy RPCs using IdentityAccessManagement * IAM: fix stale user identity on rename propagation	2026-01-26 22:59:43 -08:00
Chris Lu	5a7c74feac	migrate IAM policies to multi-file storage (#8114 ) * Add IAM gRPC service definition - Add GetConfiguration/PutConfiguration for config management - Add CreateUser/GetUser/UpdateUser/DeleteUser/ListUsers for user management - Add CreateAccessKey/DeleteAccessKey/GetUserByAccessKey for access key management - Methods mirror existing IAM HTTP API functionality * Add IAM gRPC handlers on filer server - Implement IamGrpcServer with CredentialManager integration - Handle configuration get/put operations - Handle user CRUD operations - Handle access key create/delete operations - All methods delegate to CredentialManager for actual storage * Wire IAM gRPC service to filer server - Add CredentialManager field to FilerOption and FilerServer - Import credential store implementations in filer command - Initialize CredentialManager from credential.toml if available - Register IAM gRPC service on filer gRPC server - Enable credential management via gRPC alongside existing filer services * Regenerate IAM protobuf with gRPC service methods * fix: compilation error in DeleteUser * fix: address code review comments for IAM migration * feat: migrate policies to multi-file layout and fix identity duplicated content * refactor: remove configuration.json and migrate Service Accounts to multi-file layout * refactor: standardize Service Accounts as distinct store entities and fix Admin Server persistence * config: set ServiceAccountsDirectory to /etc/iam/service_accounts * Fix Chrome dialog auto-dismiss with Bootstrap modals - Add modal-alerts.js library with Bootstrap modal replacements - Replace all 15 confirm() calls with showConfirm/showDeleteConfirm - Auto-override window.alert() for all alert() calls - Fixes Chrome 132+ aggressively blocking native dialogs * Upgrade Bootstrap from 5.3.2 to 5.3.8 * Fix syntax error in object_store_users.templ - remove duplicate closing braces * create policy * display errors * migrate to multi-file policies * address PR 
feedback: use showDeleteConfirm and showErrorMessage in policies.templ, refine migration check * Update policies_templ.go * add service account to iam grpc * iam: fix potential path traversal in policy names by validating name pattern * iam: add GetServiceAccountByAccessKey to CredentialStore interface * iam: implement service account support for PostgresStore Includes full CRUD operations and efficient lookup by access key. * iam: implement GetServiceAccountByAccessKey for filer_etc, grpc, and memory stores Provides efficient lookup of service accounts by access key where possible, with linear scan fallbacks for file-based stores. * iam: remove filer_multiple support Deleted its implementation and references in imports, scaffold config, and core interface constants. Redundant with filer_etc. * clear comment * dash: robustify service account construction - Guard against nil sa.Credential when constructing responses - Fix Expiration logic to only set if > 0, avoiding Unix epoch 1970 - Ensure consistency across Get, Create, and Update handlers * credential/filer_etc: improve error propagation in configuration handlers - Return error from loadServiceAccountsFromMultiFile to callers - Ensure listEntries errors in SaveConfiguration (cleanup logic) are propagated unless they are "not found" failures. - Fixes potential silent failures during IAM configuration sync. * credential/filer_etc: add existence check to CreateServiceAccount Ensures consistency with other stores by preventing accidental overwrite of existing service accounts during creation. * credential/memory: improve store robustness and Reset logic - Enforce ID immutability in UpdateServiceAccount to prevent orphans - Update Reset() to also clear the policies map, ensuring full state cleanup for tests. 
* dash: improve service account robustness and policy docs - Wrap parent user lookup errors to preserve context - Strictly validate Status field in UpdateServiceAccount - Add deprecation comments to legacy policy management methods * credential/filer_etc: protect against path traversal in service accounts Implemented ID validation (alphanumeric, underscores, hyphens) and applied it to Get, Save, and Delete operations to ensure no directory traversal via saId.json filenames. * credential/postgres: improve robustness and cleanup comments - Removed brainstorming comments in GetServiceAccountByAccessKey - Added missing rows.Err() check during iteration - Properly propagate Scan and Unmarshal errors instead of swallowing them * admin: unify UI alerts and confirmations using Bootstrap modals - Updated modal-alerts.js with improved automated alert type detection - Replaced native alert() and confirm() with showAlert(), showConfirm(), and showDeleteConfirm() across various Templ components - Improved UX for delete operations by providing better context and styling - Ensured consistent error reporting across IAM and Maintenance views * admin: additional UI consistency fixes for alerts and confirmations - Replaced native alert() and confirm() with Bootstrap modals in: - EC volumes (repair flow) - Collection details (repair flow) - File browser (properties and delete) - Maintenance config schema (save and reset) - Improved delete confirmation in file browser with item context - Ensured consistent success/error/info styling for all feedbacks * make * iam: add GetServiceAccountByAccessKey RPC and update GetConfiguration * iam: implement GetServiceAccountByAccessKey on server and client * iam: centralize policy and service account validation * iam: optimize MemoryStore service account lookups with indexing * iam: fix postgres service_accounts table and optimize lookups * admin: refactor modal alerts and clean up dashboard logic * admin: fix EC shards table layout mismatch * 
admin: URL-encode IAM path parameters for safety * admin: implement pauseWorker logic in maintenance view * iam: add rows.Err() check to postgres ListServiceAccounts * iam: standardize ErrServiceAccountNotFound across credential stores * iam: map ErrServiceAccountNotFound to codes.NotFound in DeleteServiceAccount * iam: refine service account store logic, errors and schema * iam: add validation to GetServiceAccountByAccessKey * admin: refine modal titles and ensure URL safety * admin: address bot review comments for alerts and async usage * iam: fix syntax error by restoring missing function declaration * [FilerEtcStore] improve error handling in CreateServiceAccount Refine error handling to provide clearer messages when checking for existing service accounts. * [PostgresStore] add nil guards and validation to service account methods Ensure input parameters are not nil and required IDs are present to prevent runtime panics and ensure data integrity. * [JS] add shared IAM utility script Consolidate common IAM operations like deleteUser and deleteAccessKey into a shared utility script for better maintainability. * [View] include shared IAM utilities in layout Include iam-utils.js in the main layout to make IAM functions available across all administrative pages. * [View] refactor IAM logic and restore async in EC Shards view Remove redundant local IAM functions and ensure that delete confirmation callbacks are properly marked as async. * [View] consolidate IAM logic in Object Store Users view Remove redundant local definitions of deleteUser and deleteAccessKey, relying on the shared utilities instead. * [View] update generated templ files for UI consistency * credential/postgres: remove redundant name column from service_accounts table The id is already used as the unique identifier and was being copied to the name column. This removes the name column from the schema and updates the INSERT/UPDATE queries. 
* credential/filer_etc: improve logging for policy migration failures Added Errorf log if AtomicRenameEntry fails during migration to ensure visibility of common failure points. * credential: allow uppercase characters in service account ID username Updated ServiceAccountIdPattern to allow [A-Za-z0-9_-]+ for the username component, matching the actual service account creation logic which uses the parent user name directly. * Update object_store_users_templ.go * admin: fix ec_shards pagination to handle numeric page arguments Updated goToPage in cluster_ec_shards.templ to accept either an Event or a numeric page argument. This prevents errors when goToPage(1) is called directly. Corrected both the .templ source and generated Go code. * credential/filer_etc: improve service account storage robustness Added nil guard to saveServiceAccount, updated GetServiceAccount to return ErrServiceAccountNotFound for empty data, and improved deleteServiceAccount to handle response-level Filer errors.	2026-01-26 11:28:23 -08:00
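The path traversal protection mentioned in the commit above (IDs limited to alphanumerics, underscores, and hyphens before they become saId.json filenames) might be sketched as follows. The pattern and the helper `serviceAccountPath` are assumptions made for illustration; the actual ServiceAccountIdPattern in SeaweedFS may be stricter or structured differently. Only the directory /etc/iam/service_accounts is taken from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"path"
	"regexp"
)

// idPattern is a hypothetical whitelist: letters, digits, '_' and '-'.
// Anything else, including '/', '.', and empty strings, is rejected.
var idPattern = regexp.MustCompile(`^[A-Za-z0-9_-]+$`)

// serviceAccountsDir is the directory named in the commit message.
const serviceAccountsDir = "/etc/iam/service_accounts"

// serviceAccountPath validates id and only then builds the file path,
// so an id such as "../secret" never reaches path.Join.
func serviceAccountPath(id string) (string, error) {
	if !idPattern.MatchString(id) {
		return "", errors.New("invalid service account id")
	}
	return path.Join(serviceAccountsDir, id+".json"), nil
}

func main() {
	for _, id := range []string{"sa_Alice-1", "../secret", ""} {
		p, err := serviceAccountPath(id)
		fmt.Printf("%q -> %q, err=%v\n", id, p, err)
	}
}
```

Validating before joining matters because path.Join cleans ".." segments, which would otherwise resolve to a file outside the service accounts directory.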
Chris Lu	6bf088cec9	IAM Policy Management via gRPC (#8109 ) * Add IAM gRPC service definition - Add GetConfiguration/PutConfiguration for config management - Add CreateUser/GetUser/UpdateUser/DeleteUser/ListUsers for user management - Add CreateAccessKey/DeleteAccessKey/GetUserByAccessKey for access key management - Methods mirror existing IAM HTTP API functionality * Add IAM gRPC handlers on filer server - Implement IamGrpcServer with CredentialManager integration - Handle configuration get/put operations - Handle user CRUD operations - Handle access key create/delete operations - All methods delegate to CredentialManager for actual storage * Wire IAM gRPC service to filer server - Add CredentialManager field to FilerOption and FilerServer - Import credential store implementations in filer command - Initialize CredentialManager from credential.toml if available - Register IAM gRPC service on filer gRPC server - Enable credential management via gRPC alongside existing filer services * Regenerate IAM protobuf with gRPC service methods * iam_pb: add Policy Management to protobuf definitions * credential: implement PolicyManager in credential stores * filer: implement IAM Policy Management RPCs * shell: add s3.policy command * test: add integration test for s3.policy * test: fix compilation errors in policy_test * pb * fmt * test * weed shell: add -policies flag to s3.configure This allows linking/unlinking IAM policies to/from identities directly from the s3.configure command. * test: verify s3.configure policy linking and fix port allocation - Added test case for linking policies to users via s3.configure - Implemented findAvailablePortPair to ensure HTTP and gRPC ports are both available, avoiding conflicts with randomized port assignments. 
- Updated assertion to match jsonpb output (policyNames) * credential: add StoreTypeGrpc constant * credential: add IAM gRPC store boilerplate * credential: implement identity methods in gRPC store * credential: implement policy methods in gRPC store * admin: use gRPC credential store for AdminServer This ensures that all IAM and policy changes made through the Admin UI are persisted via the Filer's IAM gRPC service instead of direct file manipulation. * shell: s3.configure use granular IAM gRPC APIs instead of full config patching * shell: s3.configure use granular IAM gRPC APIs * shell: replace deprecated ioutil with os in s3.policy * filer: use gRPC FailedPrecondition for unconfigured credential manager * test: improve s3.policy integration tests and fix error checks * ci: add s3 policy shell integration tests to github workflow * filer: fix LoadCredentialConfiguration error handling * credential/grpc: propagate unmarshal errors in GetPolicies * filer/grpc: improve error handling and validation * shell: use gRPC status codes in s3.configure * credential: document PutPolicy as create-or-replace * credential/postgres: reuse CreatePolicy in PutPolicy to deduplicate logic * shell: add timeout context and strictly enforce flags in s3.policy * iam: standardize policy content field naming in gRPC and proto * shell: extract slice helper functions in s3.configure * filer: map credential store errors to gRPC status codes * filer: add input validation for UpdateUser and CreateAccessKey * iam: improve validation in policy and config handlers * filer: ensure IAM service registration by defaulting credential manager * credential: add GetStoreName method to manager * test: verify policy deletion in integration test	2026-01-25 13:39:30 -08:00
Chris Lu	e559b8df37	Refactor Admin UI to use unified IAM storage and add Shutdown hook	2026-01-23 20:29:21 -08:00
Chris Lu	f6318edbc9	Refactor Admin UI to use unified IAM storage and add MultipleFileStore (#8101 ) * Refactor Admin UI to use unified IAM storage and add MultipleFileStore * Address PR feedback: fix renames, error handling, and sync logic in FilerMultipleStore * Address refined PR feedback: safe rename order, rollback logic, and structural sync refinement * Optimize LoadConfiguration: use streaming callback for memory efficiency * Refactor UpdateUser: log rollback failures during rename * Implement PolicyManager for FilerMultipleStore * include the filer_multiple backend configuration * Implement cross-S3 synchronization and proper shutdown for all IAM backends * Extract Admin UI refactoring to a separate PR	2026-01-23 20:12:59 -08:00
KyoungYun-K	59dfe047b6	Support for cacheMetaTtlSec option in fuse command (#8063)	2026-01-19 22:52:47 -08:00
Jaehoon Kim	f2e7af257d	Fix volume.fsck -forcePurging -reallyDeleteFromVolume to fail fast on filer traversal errors (#8015 ) * Add TraverseBfsWithContext and fix race conditions in error handling - Add TraverseBfsWithContext function to support context cancellation - Fix race condition in doTraverseBfsAndSaving using atomic.Bool and sync.Once - Improve error handling with fail-fast behavior and proper error propagation - Update command_volume_fsck to use error-returning saveFn callback - Enhance error messages in readFilerFileIdFile with detailed context * refactoring * fix error format * atomic * filer_pb: make enqueue return void * shell: simplify fs.meta.save error handling * filer_pb: handle enqueue return value * Revert "atomic" This reverts commit 712648bc354b186d6654fdb8a46fd4848fdc4e00. * shell: refine fs.meta.save logic --------- Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-01-14 21:37:50 -08:00
Walnuts	691aea84c3	feat: add TLS configuration options for Cassandra2 store (#7998) * feat: add TLS configuration options for Cassandra2 store Signed-off-by: walnuts1018 <r.juglans.1018@gmail.com> * fix: use 9142 port in tls connection Signed-off-by: walnuts1018 <r.juglans.1018@gmail.com> * Align the setting field names with gocql's SSLOpts. Signed-off-by: walnuts1018 <r.juglans.1018@gmail.com> * Removed: store.cluster.Port = 9142 * chore: update gocql dependency to v2 * refactor: improve Cassandra TLS configuration and port logic * docs: update filer.toml scaffold with ssl_enable_host_verification --------- Signed-off-by: walnuts1018 <r.juglans.1018@gmail.com> Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-01-14 17:59:59 -08:00
Chris Lu	1ea6b0c0d9	cleanup: deduplicate environment variable credential loading Previously, `weed mini` logic duplicated the credential loading process by creating a temporary IAM config file from environment variables. `auth_credentials.go` also had fallback logic to load these variables. This change: 1. Updates `auth_credentials.go` to always check for and merge AWS environment variable credentials (`AWS_ACCESS_KEY_ID`, etc.) into the identity list. This ensures they are available regardless of whether other configurations (static file or filer) are loaded. 2. Removes the redundant file creation logic from `weed/command/mini.go`. 3. Updates `weed mini` user messages to accurately reflect that credentials are loaded from environment variables in-memory. This results in a cleaner implementation where `weed/s3api` manages all credential loading logic, and `weed mini` simply relies on it.	2026-01-08 20:35:37 -08:00
Chris Lu	bd237999bb	weed mini can optionally skip s3	2026-01-08 10:05:42 -08:00
promalert	9012069bd7	chore: execute goimports to format the code (#7983) * chore: execute goimports to format the code Signed-off-by: promalert <promalert@outlook.com> * goimports -w . --------- Signed-off-by: promalert <promalert@outlook.com> Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-01-07 13:06:08 -08:00
Chris Lu	d4ecfaeda7	Enable writeback_cache and async_dio FUSE options (#7980 ) * Enable writeback_cache and async_dio FUSE options Fixes #7978 - Update mount_std.go to use EnableWriteback and EnableAsyncDio from go-fuse - Add go.mod replace directive to use local go-fuse with capability support - Remove temporary workaround that disabled these options This enables proper FUSE kernel capability negotiation for writeback cache and async direct I/O, improving performance for small writes and concurrent direct I/O operations. * Address PR review comments - Remove redundant nil checks for writebackCache and asyncDio flags - Update go.mod replace directive to use seaweedfs/go-fuse fork instead of local path * Add TODO comment for go.mod replace directive The replace directive must use a local path until seaweedfs/go-fuse#1 is merged. After merge, this should be updated to use the proper version. * Use seaweedfs/go-fuse v2.9.0 instead of local repository Replace local path with seaweedfs/go-fuse v2.9.0 fork which includes the writeback_cache and async_dio capability support. * Use github.com/seaweedfs/go-fuse/v2 directly without replace directive - Updated all imports to use github.com/seaweedfs/go-fuse/v2 - Removed replace directive from go.mod - Using seaweedfs/go-fuse v2.0.0-20260106181308-87f90219ce09 which includes: * writeback_cache and async_dio support * Corrected module path * Update to seaweedfs/go-fuse v2.9.1 Use v2.9.1 tag which includes the corrected module path (github.com/seaweedfs/go-fuse/v2) along with writeback_cache and async_dio support.	2026-01-06 10:50:54 -08:00
Chris Lu	e10f11b480	opt: reduce ShardsInfo memory usage with bitmap and sorted slice (#7974 ) * opt: reduce ShardsInfo memory usage with bitmap and sorted slice - Replace map[ShardId]ShardInfo with sorted []ShardInfo slice - Add ShardBits (uint32) bitmap for O(1) existence checks - Use binary search for O(log n) lookups by shard ID - Maintain sorted order for efficient iteration - Add comprehensive unit tests and benchmarks Memory savings: - Map overhead: ~48 bytes per entry eliminated - Pointers: 8 bytes per entry eliminated - Total: ~56 bytes per shard saved Performance improvements: - Has(): O(1) using bitmap - Size(): O(log n) using binary search (was O(1), acceptable tradeoff) - Count(): O(1) using popcount on bitmap - Iteration: Faster due to cache locality refactor: add methods to ShardBits type - Add Has(), Set(), Clear(), and Count() methods to ShardBits - Simplify ShardsInfo methods by using ShardBits methods - Improves code readability and encapsulation * opt: use ShardBits directly in ShardsCountFromVolumeEcShardInformationMessage Avoid creating a full ShardsInfo object just to count shards. Directly cast vi.EcIndexBits to ShardBits and use Count() method. * opt: use strings.Builder in ShardsInfo.String() for efficiency * refactor: change AsSlice to return []ShardInfo (values instead of pointers) This completes the memory optimization by avoiding unnecessary pointer slices and potential allocations. * refactor: rename ShardsCountFromVolumeEcShardInformationMessage to GetShardCount * fix: prevent deadlock in Add and Subtract methods Copy shards data from 'other' before releasing its lock to avoid potential deadlock when a.Add(b) and b.Add(a) are called concurrently. The previous implementation held other's lock while calling si.Set/Delete, which acquires si's lock. This could deadlock if two goroutines tried to add/subtract each other concurrently. 
* opt: avoid unnecessary locking in constructor functions ShardsInfoFromVolume and ShardsInfoFromVolumeEcShardInformationMessage now build shards slice and bitmap directly without calling Set(), which acquires a lock on every call. Since the object is local and not yet shared, locking is unnecessary and adds overhead. This improves performance during object construction. * fix: rename 'copy' variable to avoid shadowing built-in function The variable name 'copy' in TestShardsInfo_Copy shadowed the built-in copy() function, which is confusing and bad practice. Renamed to 'siCopy'. * opt: use math/bits.OnesCount32 and reorganize types 1. Replace manual popcount loop with math/bits.OnesCount32 for better performance and idiomatic Go code 2. Move ShardSize type definition to ec_shards_info.go for better code organization since it's primarily used there * refactor: Set() now accepts ShardInfo for future extensibility Changed Set(id ShardId, size ShardSize) to Set(shard ShardInfo) to support future additions to ShardInfo without changing the API. This makes the code more extensible as new fields can be added to ShardInfo (e.g., checksum, location, etc.) without breaking the Set API. * refactor: move ShardInfo and ShardSize to separate file Created ec_shard_info.go to hold the basic shard types (ShardInfo and ShardSize) for better code organization and separation of concerns. * refactor: add ShardInfo constructor and helper functions Added NewShardInfo() constructor and IsValid() method to better encapsulate ShardInfo creation and validation. Updated code to use the constructor for cleaner, more maintainable code. * fix: update remaining Set() calls to use NewShardInfo constructor Fixed compilation errors in storage and shell packages where Set() calls were not updated to use the new NewShardInfo() constructor. 
* fix: remove unreachable code in filer backup commands Removed unreachable return statements after infinite loops in filer_backup.go and filer_meta_backup.go to fix compilation errors. * fix: rename 'new' variable to avoid shadowing built-in Renamed 'new' to 'result' in MinusParityShards, Plus, and Minus methods to avoid shadowing Go's built-in new() function. * fix: update remaining test files to use NewShardInfo constructor Fixed Set() calls in command_volume_list_test.go and ec_rebalance_slots_test.go to use NewShardInfo() constructor.	2026-01-06 00:09:52 -08:00
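The bitmap-plus-sorted-slice layout described in the commit above (#7974) can be sketched roughly as follows. This is a minimal illustration, not the SeaweedFS implementation: the type and method names echo the commit message, but the field names, signatures, and the `sizeOf` helper are assumptions made for this example.

```go
package main

import (
	"fmt"
	"math/bits"
	"sort"
)

// ShardBits is a hypothetical 32-bit bitmap: bit i is set when shard i exists.
type ShardBits uint32

// Has reports whether shard id is present (O(1)).
func (b ShardBits) Has(id uint8) bool { return id < 32 && b&(1<<id) != 0 }

// Set returns a copy of the bitmap with shard id marked present.
func (b ShardBits) Set(id uint8) ShardBits { return b | (1 << id) }

// Clear returns a copy of the bitmap with shard id removed.
func (b ShardBits) Clear(id uint8) ShardBits { return b &^ (1 << id) }

// Count returns the number of shards present, using a popcount.
func (b ShardBits) Count() int { return bits.OnesCount32(uint32(b)) }

// ShardInfo is an assumed minimal record: a shard id and its size in bytes.
type ShardInfo struct {
	ID   uint8
	Size int64
}

// sizeOf looks up a shard in a slice kept sorted by ID (O(log n)).
// It returns 0 when the shard is absent.
func sizeOf(shards []ShardInfo, id uint8) int64 {
	i := sort.Search(len(shards), func(i int) bool { return shards[i].ID >= id })
	if i < len(shards) && shards[i].ID == id {
		return shards[i].Size
	}
	return 0
}

func main() {
	b := ShardBits(0).Set(3).Set(10)
	shards := []ShardInfo{{ID: 3, Size: 1024}, {ID: 10, Size: 2048}}
	fmt.Println(b.Has(3), b.Has(4), b.Count(), sizeOf(shards, 10))
}
```

The bitmap answers existence and count queries without touching the slice, while the sorted slice keeps per-shard data contiguous for iteration. That is the tradeoff the commit message describes: size lookups become O(log n) in exchange for dropping the map overhead.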
Chris Lu	d75162370c	Fix trust policy wildcard principal handling (#7970 ) * Fix trust policy wildcard principal handling This change fixes the trust policy validation to properly support AWS-standard wildcard principals like {"Federated": "*"}. Previously, the evaluatePrincipalValue() function would check for context existence before evaluating wildcards, causing wildcard principals to fail when the context key didn't exist. This forced users to use the plain "*" workaround instead of the more specific {"Federated": "*"} format. Changes: - Modified evaluatePrincipalValue() to check for "*" FIRST before validating against context - Added support for wildcards in principal arrays - Added comprehensive tests for wildcard principal handling - All existing tests continue to pass (no regressions) This matches AWS IAM behavior where "*" in a principal field means "allow any value" without requiring context validation. Fixes: https://github.com/seaweedfs/seaweedfs/issues/7917 Refactor: Move Principal matching to PolicyEngine This refactoring consolidates all policy evaluation logic into the PolicyEngine, improving code organization and eliminating duplication. Changes: - Added matchesPrincipal() and evaluatePrincipalValue() to PolicyEngine - Added EvaluateTrustPolicy() method for direct trust policy evaluation - Updated statementMatches() to check Principal field when present - Made resource matching optional (trust policies don't have Resources) - Simplified evaluateTrustPolicy() in iam_manager.go to delegate to PolicyEngine - Removed ~170 lines of duplicate code from iam_manager.go Benefits: - Single source of truth for all policy evaluation - Better code reusability and maintainability - Consistent evaluation rules for all policy types - Easier to test and debug All tests pass with no regressions. * Make PolicyEngine AWS-compatible and add unit tests Changes: 1. 
AWS-Compatible Context Keys: - Changed "seaweed:FederatedProvider" -> "aws:FederatedProvider" - Changed "seaweed:AWSPrincipal" -> "aws:PrincipalArn" - Changed "seaweed:ServicePrincipal" -> "aws:PrincipalServiceName" - This ensures 100% AWS compatibility for trust policies 2. Added Comprehensive Unit Tests: - TestPrincipalMatching: 8 test cases for Principal matching - TestEvaluatePrincipalValue: 7 test cases for value evaluation - TestTrustPolicyEvaluation: 6 test cases for trust policy evaluation - TestGetPrincipalContextKey: 4 test cases for context key mapping - Total: 25 new unit tests for PolicyEngine All tests pass: - Policy engine tests: 54 passed - Integration tests: 9 passed - Total: 63 tests passing * Update context keys to standard AWS/OIDC formats Replaced remaining seaweed: context keys with standard AWS and OIDC keys to ensure 100% compatibility with AWS IAM policies. Mappings: - seaweed:TokenIssuer -> oidc:iss - seaweed:Issuer -> oidc:iss - seaweed:Subject -> oidc:sub - seaweed:SourceIP -> aws:SourceIp Also updated unit tests to reflect these changes. All 63 tests pass successfully. * Add advanced policy tests for variable substitution and conditions Added comprehensive tests inspired by AWS IAM patterns: - TestPolicyVariableSubstitution: Tests ${oidc:sub} variable in resources - TestConditionWithNumericComparison: Tests sts:DurationSeconds condition - TestMultipleConditionOperators: Tests combining StringEquals and StringLike Results: - TestMultipleConditionOperators: ✅ All 3 subtests pass - Other tests reveal need for sts:DurationSeconds context population These tests validate the PolicyEngine's ability to handle complex AWS-compatible policy scenarios. 
* Fix federated provider context and add DurationSeconds support Changes: - Use iss claim as aws:FederatedProvider (AWS standard) - Add sts:DurationSeconds to trust policy evaluation context - TestPolicyVariableSubstitution now passes ✅ Remaining work: - TestConditionWithNumericComparison partially works (1/3 pass) - Need to investigate NumericLessThanEquals evaluation * Update trust policies to use issuer URL for AWS compatibility Changed trust policy from using provider name ("test-oidc") to using the issuer URL ("https://test-issuer.com") to match AWS standard behavior where aws:FederatedProvider contains the OIDC issuer URL. Test Results: - 10/12 test suites passing - TestFullOIDCWorkflow: ✅ All subtests pass - TestPolicyEnforcement: ✅ All subtests pass - TestSessionExpiration: ✅ Pass - TestPolicyVariableSubstitution: ✅ Pass - TestMultipleConditionOperators: ✅ All subtests pass Remaining work: - TestConditionWithNumericComparison needs investigation - One subtest in TestTrustPolicyValidation needs fix * Fix S3 API tests for AWS compatibility Updated all S3 API tests to use AWS-compatible context keys and trust policy principals: Changes: - seaweed:SourceIP → aws:SourceIp (IP-based conditions) - Federated: "test-oidc" → "https://test-issuer.com" (trust policies) Test Results: - TestS3EndToEndWithJWT: ✅ All 13 subtests pass - TestIPBasedPolicyEnforcement: ✅ All 3 subtests pass This ensures policies are 100% AWS-compatible and portable. * Fix ValidateTrustPolicy for AWS compatibility Updated ValidateTrustPolicy method to check for: - OIDC: issuer URL ("https://test-issuer.com") - LDAP: provider name ("test-ldap") - Wildcard: "*" Test Results: - TestTrustPolicyValidation: ✅ All 3 subtests pass This ensures trust policy validation uses the same AWS-compatible principals as the PolicyEngine. 
Fix multipart and presigned URL tests for AWS compatibility Updated trust policies in: - s3_multipart_iam_test.go - s3_presigned_url_iam_test.go Changed "Federated": "test-oidc" → "https://test-issuer.com" Test Results: - TestMultipartIAMValidation: ✅ All 7 subtests pass - TestPresignedURLIAMValidation: ✅ All 4 subtests pass - TestPresignedURLGeneration: ✅ All 4 subtests pass - TestPresignedURLExpiration: ✅ All 4 subtests pass - TestPresignedURLSecurityPolicy: ✅ All 4 subtests pass All S3 API tests now use AWS-compatible trust policies. * Fix numeric condition evaluation and trust policy validation interface Major updates to ensure robust AWS-compatible policy evaluation: 1. Policy Engine: Added support for `int` and `int64` types in `evaluateNumericCondition`, fixing issues where raw numbers in policy documents caused evaluation failures. 2. Trust Policy Validation: Updated `TrustPolicyValidator` interface and `STSService` to propagate `DurationSeconds` correctly during the double-validation flow (Validation -> STS -> Validation callback). 3. IAM Manager: Updated implementation to match the new interface and correctly pass `sts:DurationSeconds` context key. Test Results: - TestConditionWithNumericComparison: ✅ All 3 subtests pass - All IAM and S3 integration tests pass (100%) This resolves the final edge case with DurationSeconds numeric conditions. * Fix MockTrustPolicyValidator interface and unreachable code warnings Updates: 1. Updated MockTrustPolicyValidator.ValidateTrustPolicyForWebIdentity to match new interface signature with durationSeconds parameter 2. Removed unreachable code after infinite loops in filer_backup.go and filer_meta_backup.go to satisfy linter Test Results: - All STS tests pass ✅ - Build warnings resolved ✅ * Refactor matchesPrincipal to consolidate array handling logic Consolidated duplicated logic for []interface{} and []string types by converting them to a unified []interface{} upfront. 
* Fix malformed AWS docs URL in iam_manager.go comment * dup * Enhance IAM integration tests with negative cases and interface array support Added test cases to TestTrustPolicyWildcardPrincipal to: 1. Verify rejection of roles when principal context does not match (negative test) 2. Verify support for principal arrays as []interface{} (simulating JSON unmarshaled roles) * Fix syntax errors in filer_backup and filer_meta_backup Restored missing closing braces for for-loops and re-added return statements. The previous attempt to remove unreachable code accidentally broke the function structure. Build now passes successfully.	2026-01-05 15:55:24 -08:00
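The wildcard-first ordering and the unified array handling described in the commit above (#7970) might look roughly like the sketch below. The function `matchPrincipalValue`, its parameters, and the context map shape are hypothetical names chosen for illustration. They are not the actual `evaluatePrincipalValue()` or `matchesPrincipal()` signatures in SeaweedFS.

```go
package main

import "fmt"

// matchPrincipalValue is a hypothetical helper. It reports whether a policy
// principal value (a string, []string, or []interface{}) matches the value
// stored under key in the request context.
func matchPrincipalValue(principal interface{}, ctx map[string]interface{}, key string) bool {
	// Normalize every accepted shape to []interface{} up front.
	var values []interface{}
	switch v := principal.(type) {
	case string:
		values = []interface{}{v}
	case []string:
		for _, s := range v {
			values = append(values, s)
		}
	case []interface{}:
		values = v
	default:
		return false
	}

	// Check for the wildcard FIRST, before requiring the context key to exist.
	for _, v := range values {
		if s, ok := v.(string); ok && s == "*" {
			return true
		}
	}

	// Otherwise the context key must be present and equal to one of the values.
	actual, ok := ctx[key].(string)
	if !ok {
		return false
	}
	for _, v := range values {
		if s, ok := v.(string); ok && s == actual {
			return true
		}
	}
	return false
}

func main() {
	ctx := map[string]interface{}{"aws:FederatedProvider": "https://test-issuer.com"}
	fmt.Println(matchPrincipalValue("*", nil, "aws:FederatedProvider"))
	fmt.Println(matchPrincipalValue([]interface{}{"https://test-issuer.com"}, ctx, "aws:FederatedProvider"))
	fmt.Println(matchPrincipalValue("https://other.example", ctx, "aws:FederatedProvider"))
}
```

Checking the wildcard before the context lookup is what lets a wildcard principal succeed even when the context key is missing, which is the failure mode the commit says it fixes.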
Chris Lu	d15f32ae46	feat: add flags to disable WebDAV and Admin UI in weed mini (#7971 ) * feat: add flags to disable WebDAV and Admin UI in weed mini - Add -webdav flag (default: true) to optionally disable WebDAV server - Add -admin.ui flag (default: true) to optionally disable Admin UI only (server still runs) - Conditionally skip WebDAV service startup based on flag - Pass disableUI flag to SetupRoutes to skip UI route registration - Admin server still runs for gRPC and API access when UI is disabled Addresses issue from https://github.com/seaweedfs/seaweedfs/pull/7833#issuecomment-3711924150 * refactor: use positive enableUI parameter instead of disableUI across admin server and handlers * docs: update mini welcome message to list enabled components * chore: remove unused welcomeMessageTemplate constant * docs: split S3 credential message into separate sb.WriteString calls	2026-01-05 13:10:11 -08:00
Chris Lu	8269dc136d	simplify	2026-01-04 11:26:21 -08:00
Taylor Jasko	6a9860098f	fix: correcting S3 nil cipher dereference in filer init (#7952 ) Resolves the following error reported in #7949: ``` I0103 21:38:30.230662 s3.go:275 Starting S3 API Server with standard IAM panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x38ca961] goroutine 102 [running]: github.com/seaweedfs/seaweedfs/weed/command.(*S3Options).startS3Server(0x7caf840) /go/src/github.com/seaweedfs/seaweedfs/weed/command/s3.go:295 +0x741 github.com/seaweedfs/seaweedfs/weed/command.runFiler.func1(...) /go/src/github.com/seaweedfs/seaweedfs/weed/command/filer.go:244 created by github.com/seaweedfs/seaweedfs/weed/command.runFiler in goroutine 1 /go/src/github.com/seaweedfs/seaweedfs/weed/command/filer.go:242 +0x353 ```	2026-01-03 18:53:38 -08:00
Chris Lu	25975bacfb	fix(gcs): resolve credential conflict and improve backup logging (#7951 ) * fix(gcs): resolve credential conflict and improve backup logging - Workaround GCS SDK's "multiple credential options" error by manually constructing an authenticated HTTP client. - Include source entry path in filer backup error logs for better visibility on missing volumes/404s. * fix: address PR review feedback - Add nil check for EventNotification in getSourceKey - Avoid reassigning google_application_credentials parameter in gcs_sink.go * fix(gcs): return errors instead of calling glog.Fatalf in initialize Adheres to Go best practices and allows for more graceful failure handling by callers. * read from bind ip	2026-01-03 14:41:25 -08:00
Chris Lu	b97d17f79f	Standardize -ip.bind flags to default to empty and fall back to -ip (#7945 ) * Add documentation for issue #7941 fix * rm FIX_ISSUE_7941.md * Standardize -ip.bind flags to default to empty string and fall back to -ip option - Change s3 command -ip.bind default logic to use -ip instead of localhost - Change sftp command -ip.bind default to empty and fall back to 0.0.0.0 - Update help text for consistency * Fix compilation error: add -ip flag to s3 command and update bindIp fallback * Revert -ip flag addition for s3 command, set bindIp fallback to 0.0.0.0 * Update s3 command -ip.bind help text to reflect correct default behavior	2026-01-02 18:23:17 -08:00
Chris Lu	31a4f57cd9	Fix: Add -admin.grpc flag to worker for explicit gRPC port (#7926 ) (#7927 ) * Fix: Add -admin.grpc flag to worker for explicit gRPC port configuration * Fix(helm): Add adminGrpcServer to worker configuration * Refactor: Support host:port.grpcPort address format, revert -admin.grpc flag * Helm: Conditionally append grpcPort to worker admin address * weed/admin: fix "send on closed channel" panic in worker gRPC server Make unregisterWorker connection-aware to prevent closing channels belonging to newer connections. * weed/worker: improve gRPC client stability and logging - Fix goroutine leak in reconnection logic - Refactor reconnection loop to exit on success and prevent busy-waiting - Add session identification and enhanced logging to client handlers - Use constant for internal reset action and remove unused variables * weed/worker: fix worker state initialization and add lifecycle logs - Revert workerState to use running boolean correctly - Prevent handleStart failing by checking running state instead of startTime - Add more detailed logs for worker startup events	2025-12-31 11:55:09 -08:00
Chris Lu	5a135f8c5a	fuse: add FUSE performance options to weed fuse command (#7925 ) This adds support for the new FUSE performance options to the 'weed fuse' command, matching the functionality available in 'weed mount'. Added options: - writebackCache: Enable FUSE writeback cache for improved write performance - asyncDio: Enable async direct I/O for better concurrency - cacheSymlink: Enable symlink caching to reduce metadata lookups - sys.novncache: (macOS only) Disable vnode name caching to avoid stale data These options can now be used with mount -t weed: mount -t weed fuse /mnt -o "filer=localhost:8888,writebackCache=true,asyncDio=true" This ensures feature parity between 'weed mount' and 'weed fuse' commands.	2025-12-31 01:04:16 -08:00
Chris Lu	9072e1d38a	mount: add -asyncDio flag for async direct I/O (#7922 ) * mount: add -asyncDio flag for async direct I/O This adds support for async direct I/O via the -asyncDio flag. Async DIO enables the FUSE_CAP_ASYNC_DIO capability, allowing the kernel to perform direct I/O operations asynchronously. This improves concurrency for applications that use O_DIRECT flag. Benefits: - Better concurrency for direct I/O operations - Improved performance for applications using O_DIRECT - Reduced blocking on I/O operations Use cases: - Database workloads that use direct I/O - Applications that bypass page cache intentionally - High-performance I/O scenarios Implementation inspired by JuiceFS which enables this capability for improved I/O performance. Usage: weed mount -filer=localhost:8888 -dir=/mnt/seaweedfs -asyncDio * mount: add all remaining FUSE options (asyncDio, cacheSymlink, novncache) This combines the remaining three FUSE mount options on top of the merged writebackCache PR: 1. asyncDio: Enable async direct I/O for better concurrency 2. cacheSymlink: Enable symlink caching to reduce metadata lookups 3. novncache: (macOS only) Disable vnode name caching to avoid stale data All options use the function parameter 'option' instead of global 'mountOptions'.	2025-12-31 00:30:12 -08:00
Chris Lu	1424fe6ed5	mount: add -writebackCache flag for FUSE writeback caching (#7921 ) * mount: add -writebackCache flag for FUSE writeback caching This adds support for FUSE writeback caching via the -writebackCache flag. Writeback caching buffers writes in the kernel page cache before flushing to the filesystem. This significantly improves performance for workloads with many small writes by reducing the number of write syscalls. Benefits: - Improved write performance for small files (2-5x faster) - Reduced latency for write-heavy workloads - Better handling of bursty write patterns Trade-offs: - Data may be lost if system crashes before kernel flushes - Not recommended for critical data without proper fsync usage - Disabled by default for safety Inspired by JuiceFS implementation which uses the same FUSE option. Usage: weed mount -filer=localhost:8888 -dir=/mnt/seaweedfs -writebackCache * Apply suggestion from @gemini-code-assist[bot] Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-30 23:23:02 -08:00
ai8future	73098c9792	filer.meta.backup: add -excludePaths flag to skip paths from backup (#7916 ) * filer.meta.backup: add -excludePaths flag to skip paths from backup Add a new -excludePaths flag that accepts comma-separated path prefixes to exclude from backup operations. This enables selective backup when certain directories (e.g., legacy buckets) should be skipped. Usage: weed filer.meta.backup -filerDir=/buckets -excludePaths=/buckets/legacy1,/buckets/legacy2 -config=backup.toml 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * filer.meta.backup: address code review feedback for -excludePaths Fixes based on CodeRabbit and Gemini review: - Cache parsed exclude paths in struct (performance) - TrimSpace and skip empty entries (handles "a,,b" and "a, b") - Add trailing slash for directory boundary matching (prevents /buckets/legacy matching /buckets/legacy_backup) - Validate paths start with '/' and warn if not - Log excluded paths at startup for debugging - Fix rename handling: check both old and new paths, handle all four combinations correctly - Add docstring to shouldExclude() - Update UsageLine and Long description with new flag 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * filer.meta.backup: address nitpick feedback - Clarify directory boundary matching behavior in help text - Add warning when root path '/' is excluded (would exclude everything) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * includePrefixes and excludePrefixes --------- Co-authored-by: C Shaw <cliffshaw@users.noreply.github.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: Chris Lu <chris.lu@gmail.com>	2025-12-30 14:28:50 -08:00
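The exclude-path handling described in the commit above (#7916) covers trimming, skipping empty entries, and matching on a directory boundary. A minimal sketch of those three steps follows. `parseExcludePaths` and this `shouldExclude` signature are assumptions for illustration. The commit names a `shouldExclude()` function, but its real signature is not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// parseExcludePaths splits a comma-separated flag value, trims whitespace,
// drops empty entries, and appends a trailing slash so that matching happens
// on directory boundaries. (Hypothetical helper.)
func parseExcludePaths(raw string) []string {
	var prefixes []string
	for _, p := range strings.Split(raw, ",") {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		if !strings.HasSuffix(p, "/") {
			p += "/"
		}
		prefixes = append(prefixes, p)
	}
	return prefixes
}

// shouldExclude reports whether fullPath is an excluded directory itself
// or lies underneath one. (Hypothetical signature.)
func shouldExclude(fullPath string, prefixes []string) bool {
	for _, p := range prefixes {
		if fullPath+"/" == p || strings.HasPrefix(fullPath, p) {
			return true
		}
	}
	return false
}

func main() {
	prefixes := parseExcludePaths("/buckets/legacy1, ,/buckets/legacy2")
	fmt.Println(prefixes)
	fmt.Println(shouldExclude("/buckets/legacy1/a.txt", prefixes))
	fmt.Println(shouldExclude("/buckets/legacy1_backup/a.txt", prefixes))
}
```

The trailing slash is what keeps `/buckets/legacy1` from also excluding `/buckets/legacy1_backup`, the case called out in the review feedback.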
Sheya Bernstein	915a7d4a54	feat: Add probes to worker service (#7896 ) * feat: Add probes to worker service * feat: Add probes to worker service * Merge branch 'master' into pr/7896 * refactor --------- Co-authored-by: Chris Lu <chris.lu@gmail.com>	2025-12-27 13:40:05 -08:00
Chris Lu	8d6bcddf60	Add S3 volume encryption support with -s3.encryptVolumeData flag (#7890 ) * Add S3 volume encryption support with -s3.encryptVolumeData flag This change adds volume-level encryption support for S3 uploads, similar to the existing -filer.encryptVolumeData option. Each chunk is encrypted with its own auto-generated CipherKey when the flag is enabled. Changes: - Add -s3.encryptVolumeData flag to weed s3, weed server, and weed mini - Wire Cipher option through S3ApiServer and ChunkedUploadOption - Add integration tests for multi-chunk range reads with encryption - Tests verify encryption works across chunk boundaries Usage: weed s3 -encryptVolumeData weed server -s3 -s3.encryptVolumeData weed mini -s3.encryptVolumeData Integration tests: go test -v -tags=integration -timeout 5m ./test/s3/sse/... * Add GitHub Actions CI for S3 volume encryption tests - Add test-volume-encryption target to Makefile that starts server with -s3.encryptVolumeData - Add s3-volume-encryption job to GitHub Actions workflow - Tests run with integration build tag and 10m timeout - Server logs uploaded on failure for debugging * Fix S3 client credentials to use environment variables The test was using hardcoded credentials "any"/"any" but the Makefile sets AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY to "some_access_key1"/ "some_secret_key1". Updated getS3Client() to read from environment variables with fallback to "any"/"any" for manual testing. * Change bucket creation errors from skip to fatal Tests should fail, not skip, when bucket creation fails. This ensures that credential mismatches and other configuration issues are caught rather than silently skipped. * Make copy and multipart test jobs fail instead of succeed Changed exit 0 to exit 1 for s3-sse-copy-operations and s3-sse-multipart jobs. These jobs document known limitations but should fail to ensure the issues are tracked and addressed, not silently ignored. 
* Hardcode S3 credentials to match Makefile Changed from environment variables to hardcoded credentials "some_access_key1"/"some_secret_key1" to match the Makefile configuration. This ensures tests work reliably. * fix Double Encryption * fix Chunk Size Mismatch * Added IsCompressed * is gzipped * fix copying * only perform HEAD request when len(cipherKey) > 0 * Revert "Make copy and multipart test jobs fail instead of succeed" This reverts commit bc34a7eb3c103ae7ab2000da2a6c3925712eb226. * fix security vulnerability * fix security * Update s3api_object_handlers_copy.go * Update s3api_object_handlers_copy.go * jwt to get content length	2025-12-27 00:09:14 -08:00
Chris Lu	935f41bff6	filer.backup: ignore missing volume/lookup errors when -ignore404Error is set (#7889 ) * filer.backup: ignore missing volume/lookup errors when -ignore404Error is set (#7888) * simplify	2025-12-26 15:44:30 -08:00

1369 Commits