Commit Graph

12941 Commits

Author SHA1 Message Date
Sheya Bernstein
d8b8f0dffd fix(helm): add missing app.kubernetes.io/instance label to volume service (#8403) 2026-02-22 07:20:38 -08:00
Chris Lu
8e25c55bfb S3: Truncate timestamps to milliseconds for CopyObjectResult and CopyPartResult (#8398)
* S3: Truncate timestamps to milliseconds for CopyObjectResult and CopyPartResult

Fixes #8394

* S3: Address nitpick comments in copy handlers

- Synchronize Mtime and LastModified by capturing time once\n- Optimize copyChunksForRange loop\n- Use built-in min/max\n- Remove dead previewLen code
2026-02-20 21:01:31 -08:00
Chris Lu
e4b70c2521 go fix 2026-02-20 18:42:00 -08:00
Chris Lu
f7c27cc81f go fmt 2026-02-20 18:40:47 -08:00
Chris Lu
66680c58b7 consistent time 2026-02-20 18:40:27 -08:00
Chris Lu
2a1ae896e4 helm: refine openshift-values.yaml for assigned UID ranges (#8396)
* helm: refine openshift-values.yaml to remove hardcoded UIDs

Remove hardcoded runAsUser, runAsGroup, and fsGroup from the
openshift-values.yaml example. This allows OpenShift's admission
controller to automatically assign a valid UID from the namespace's
allocated range, avoiding "forbidden" errors when UID 1000 is
outside the permissible range.

Updates #8381, #8390.

* helm: fix volume.logs and add consistent security context comments

* Update README.md
2026-02-20 12:05:57 -08:00
Chris Lu
bd0b1fe9d5 S3 IAM: Added ListPolicyVersions and GetPolicyVersion support (#8395)
* test(s3/iam): add managed policy CRUD lifecycle integration coverage

* s3/iam: add ListPolicyVersions and GetPolicyVersion support

* test(s3/iam): cover ListPolicyVersions and GetPolicyVersion
2026-02-20 11:04:18 -08:00
Richard Chen Zheng
964a8f5fde Allow user to define access and secret key via values (#8389)
* Allow user to define admin access and secret key via values

* Add comments to values.yaml

* Add support for read for consistency

* Simplify templating

* Add checksum to s3 config

* Update comments

* Revert "Add checksum to s3 config"

This reverts commit d21a7038a86ae2adf547730b2cb6f455dcd4ce70.
2026-02-20 00:37:54 -08:00
Chris Lu
40cc0e04a6 docker: fix entrypoint chown guard; helm: add openshift-values.yaml (#8390)
* Enforce IAM for s3tables bucket creation

* Prefer IAM path when policies exist

* Ensure IAM enforcement honors default allow

* address comments

* Reused the precomputed principal when setting tableBucketMetadata.OwnerAccountID, avoiding the redundant getAccountID call.

* get identity

* fix

* dedup

* fix

* comments

* fix tests

* update iam config

* go fmt

* fix ports

* fix flags

* mini clean shutdown

* Revert "update iam config"

This reverts commit ca48fdbb0afa45657823d98657556c0bbf24f239.

Revert "mini clean shutdown"

This reverts commit 9e17f6baffd5dd7cc404d831d18dd618b9fe5049.

Revert "fix flags"

This reverts commit e9e7b29d2f77ee5cb82147d50621255410695ee3.

Revert "go fmt"

This reverts commit bd3241960b1d9484b7900190773b0ecb3f762c9a.

* test/s3tables: share single weed mini per test package via TestMain

Previously each top-level test function in the catalog and s3tables
package started and stopped its own weed mini instance. This caused
failures when a prior instance wasn't cleanly stopped before the next
one started (port conflicts, leaked global state).

Changes:
- catalog/iceberg_catalog_test.go: introduce TestMain that starts one
  shared TestEnvironment (external weed binary) before all tests and
  tears it down after. All individual test functions now use sharedEnv.
  Added randomSuffix() for unique resource names across tests.
- catalog/pyiceberg_test.go: updated to use sharedEnv instead of
  per-test environments.
- catalog/pyiceberg_test_helpers.go -> pyiceberg_test_helpers_test.go:
  renamed to a _test.go file so it can access TestEnvironment which is
  defined in a test file.
- table-buckets/setup.go: add package-level sharedCluster variable.
- table-buckets/s3tables_integration_test.go: introduce TestMain that
  starts one shared TestCluster before all tests. TestS3TablesIntegration
  now uses sharedCluster. Extract startMiniClusterInDir (no *testing.T)
  for TestMain use. TestS3TablesCreateBucketIAMPolicy keeps its own
  cluster (different IAM config). Remove miniClusterMutex (no longer
  needed). Fix Stop() to not panic when t is nil."

* delete

* parse

* default allow should work with anonymous

* fix port

* iceberg route

The failures are from Iceberg REST using the default bucket warehouse when no prefix is provided. Your tests create random buckets, so /v1/namespaces was looking in warehouse and failing. I updated the tests to use the prefixed Iceberg routes (/v1/{bucket}/...) via a small helper.

* test(s3tables): fix port conflicts and IAM ARN matching in integration tests

- Pass -master.dir explicitly to prevent filer store directory collision
  between shared cluster and per-test clusters running in the same process
- Pass -volume.port.public and -volume.publicUrl to prevent the global
  publicPort flag (mutated from 0 → concrete port by first cluster) from
  being reused by a second cluster, causing 'address already in use'
- Remove the flag-reset loop in Stop() that reset global flag values while
  other goroutines were reading them (race → panic)
- Fix IAM policy Resource ARN in TestS3TablesCreateBucketIAMPolicy to use
  wildcards (arn:aws:s3tables:*:*:bucket/<name>) because the handler
  generates ARNs with its own DefaultRegion (us-east-1) and principal name
  ('admin'), not the test constants testRegion/testAccountID

* docker: fix entrypoint chown guard; helm: add openshift-values.yaml

Fix a regression in entrypoint.sh where the DATA_UID/DATA_GID
ownership comparison was dropped, causing chown -R /data to run
unconditionally on every container start even when ownership was
already correct. Restore the guard so the recursive chown is
skipped when the seaweed user already owns /data — making startup
faster on subsequent runs and a no-op on OpenShift/PVC deployments
where fsGroup has already set correct ownership.

Add k8s/charts/seaweedfs/openshift-values.yaml: an example Helm
overrides file for deploying SeaweedFS on OpenShift (or any cluster
enforcing the Kubernetes restricted Pod Security Standard). Replaces
hostPath volumes with PVCs, sets runAsUser/fsGroup to 1000
(the seaweed user baked into the image), drops all capabilities,
disables privilege escalation, and enables RuntimeDefault seccomp —
satisfying OpenShift's default restricted SCC without needing a
custom SCC or root access.

Fixes #8381"
2026-02-20 00:35:42 -08:00
Michał Szynkiewicz
2f837c4780 Fix error on deleting non-empty bucket (#8376)
* Move check for non-empty bucket deletion out of `WithFilerClient` call

* Added proper checking if a bucket has "user" objects
2026-02-19 22:56:50 -08:00
Chris Lu
36c469e34e Enforce IAM for S3 Tables bucket creation (#8388)
* Enforce IAM for s3tables bucket creation

* Prefer IAM path when policies exist

* Ensure IAM enforcement honors default allow

* address comments

* Reused the precomputed principal when setting tableBucketMetadata.OwnerAccountID, avoiding the redundant getAccountID call.

* get identity

* fix

* dedup

* fix

* comments

* fix tests

* update iam config

* go fmt

* fix ports

* fix flags

* mini clean shutdown

* Revert "update iam config"

This reverts commit ca48fdbb0afa45657823d98657556c0bbf24f239.

Revert "mini clean shutdown"

This reverts commit 9e17f6baffd5dd7cc404d831d18dd618b9fe5049.

Revert "fix flags"

This reverts commit e9e7b29d2f77ee5cb82147d50621255410695ee3.

Revert "go fmt"

This reverts commit bd3241960b1d9484b7900190773b0ecb3f762c9a.

* test/s3tables: share single weed mini per test package via TestMain

Previously each top-level test function in the catalog and s3tables
package started and stopped its own weed mini instance. This caused
failures when a prior instance wasn't cleanly stopped before the next
one started (port conflicts, leaked global state).

Changes:
- catalog/iceberg_catalog_test.go: introduce TestMain that starts one
  shared TestEnvironment (external weed binary) before all tests and
  tears it down after. All individual test functions now use sharedEnv.
  Added randomSuffix() for unique resource names across tests.
- catalog/pyiceberg_test.go: updated to use sharedEnv instead of
  per-test environments.
- catalog/pyiceberg_test_helpers.go -> pyiceberg_test_helpers_test.go:
  renamed to a _test.go file so it can access TestEnvironment which is
  defined in a test file.
- table-buckets/setup.go: add package-level sharedCluster variable.
- table-buckets/s3tables_integration_test.go: introduce TestMain that
  starts one shared TestCluster before all tests. TestS3TablesIntegration
  now uses sharedCluster. Extract startMiniClusterInDir (no *testing.T)
  for TestMain use. TestS3TablesCreateBucketIAMPolicy keeps its own
  cluster (different IAM config). Remove miniClusterMutex (no longer
  needed). Fix Stop() to not panic when t is nil."

* delete

* parse

* default allow should work with anonymous

* fix port

* iceberg route

The failures are from Iceberg REST using the default bucket warehouse when no prefix is provided. Your tests create random buckets, so /v1/namespaces was looking in warehouse and failing. I updated the tests to use the prefixed Iceberg routes (/v1/{bucket}/...) via a small helper.

* test(s3tables): fix port conflicts and IAM ARN matching in integration tests

- Pass -master.dir explicitly to prevent filer store directory collision
  between shared cluster and per-test clusters running in the same process
- Pass -volume.port.public and -volume.publicUrl to prevent the global
  publicPort flag (mutated from 0 → concrete port by first cluster) from
  being reused by a second cluster, causing 'address already in use'
- Remove the flag-reset loop in Stop() that reset global flag values while
  other goroutines were reading them (race → panic)
- Fix IAM policy Resource ARN in TestS3TablesCreateBucketIAMPolicy to use
  wildcards (arn:aws:s3tables:*:*:bucket/<name>) because the handler
  generates ARNs with its own DefaultRegion (us-east-1) and principal name
  ('admin'), not the test constants testRegion/testAccountID
2026-02-19 22:52:05 -08:00
Chris Lu
a2005cb2a6 fix: resolve gRPC DNS resolution issues in Kubernetes #8384 (#8387)
* fix: resolve gRPC DNS resolution issues in Kubernetes #8384

- Replace direct `grpc.NewClient` calls with `pb.GrpcDial` for consistent connection establishment
- Fix async DNS resolution behavior in K8s with `ndots:5`
- Ensure high-level components use established helper for reliable networking

* refactor: refine gRPC DNS fix and add documentation

- Use instance's grpcDialOption in BrokerClient.ConfigureTopic
- Add detailed comments to GrpcDial explaining Kubernetes DNS resolution rationale

* fix: ensure proper context propagation in broker_client gRPC calls

- Pass the provided `ctx` to `pb.GrpcDial` in `ConfigureTopic` and `GetUnflushedMessages`
- Ensures that timeouts and cancellations are correctly honored during connection establishment

* docs: refine gRPC resolver documentation and cleanup dead code

- Enhanced documentation for `GrpcDial` with explicit warnings about global state mutation when using `resolver.SetDefaultScheme("passthrough")`.
- Recommended `passthrough:///` prefix as the primary migration path for `grpc.NewClient`.
- Removed dead commented-out code for `grpc.WithBlock()` and `grpc.WithTimeout()`.
2026-02-19 15:46:02 -08:00
Chris Lu
e9c45144cf Implement managed policy storage (#8385)
* Persist managed IAM policies

* Add IAM list/get policy integration test

* Faster marker lookup and cleanup

* Handle delete conflict and improve listing

* Add delete-in-use policy integration test

* Stabilize policy ID and guard path prefix

* Tighten CreatePolicy guard and reload

* Add ListPolicyNames to credential store
2026-02-19 14:21:19 -08:00
Chris Lu
5ecee9e64d s3: fix signature mismatch with non-standard ports and capitalized host (#8386)
* s3: fix signature mismatch with non-standard ports and capitalized host

- ensure host header extraction is case-insensitive in SignedHeaders
- prioritize non-standard ports in X-Forwarded-Host over default ports in X-Forwarded-Port
- add regression tests for both scenarios

fixes https://github.com/seaweedfs/seaweedfs/issues/8382

* simplify
2026-02-19 14:17:31 -08:00
Konstantin Lebedev
01b3125815 [shell]: volume balance capacity by min volume density (#8026)
volume balance by min volume density and active volumes
2026-02-19 13:30:59 -08:00
Chris Lu
7b8df39cf7 s3api: add AttachUserPolicy/DetachUserPolicy/ListAttachedUserPolicies (#8379)
* iam: add XML responses for managed user policy APIs

* s3api: implement attach/detach/list attached user policies

* s3api: add embedded IAM tests for managed user policies

* iam: update CredentialStore interface and Manager for managed policies

Updated the `CredentialStore` interface to include `AttachUserPolicy`,
`DetachUserPolicy`, and `ListAttachedUserPolicies` methods.
The `CredentialManager` was updated to delegate these calls to the store.
Added common error variables for policy management.

* iam: implement managed policy methods in MemoryStore

Implemented `AttachUserPolicy`, `DetachUserPolicy`, and
`ListAttachedUserPolicies` in the MemoryStore.
Also ensured deep copying of identities includes PolicyNames.

* iam: implement managed policy methods in PostgresStore

Modified Postgres schema to include `policy_names` JSONB column in `users`.
Implemented `AttachUserPolicy`, `DetachUserPolicy`, and `ListAttachedUserPolicies`.
Updated user CRUD operations to handle policy names persistence.

* iam: implement managed policy methods in remaining stores

Implemented user policy management in:
- `FilerEtcStore` (partial implementation)
- `IamGrpcStore` (delegated via GetUser/UpdateUser)
- `PropagatingCredentialStore` (to broadcast updates)
Ensures cluster-wide consistency for policy attachments.

* s3api: refactor EmbeddedIamApi to use managed policy APIs

- Refactored `AttachUserPolicy`, `DetachUserPolicy`, and `ListAttachedUserPolicies`
  to use `e.credentialManager` directly.
- Fixed a critical error suppression bug in `ExecuteAction` that always
  returned success even on failure.
- Implemented robust error matching using string comparison fallbacks.
- Improved consistency by reloading configuration after policy changes.

* s3api: update and refine IAM integration tests

- Updated tests to use a real `MemoryStore`-backed `CredentialManager`.
- Refined test configuration synchronization using `sync.Once` and
  manual deep-copying to prevent state corruption.
- Improved `extractEmbeddedIamErrorCodeAndMessage` to handle more XML
  formats robustly.
- Adjusted test expectations to match current AWS IAM behavior.

* fix compilation

* visibility

* ensure 10 policies

* reload

* add integration tests

* Guard raft command registration

* Allow IAM actions in policy tests

* Validate gRPC policy attachments

* Revert Validate gRPC policy attachments

* Tighten gRPC policy attach/detach

* Improve IAM managed policy handling

* Improve managed policy filters
2026-02-19 12:26:27 -08:00
dependabot[bot]
6787dccace build(deps): bump filippo.io/edwards25519 from 1.1.0 to 1.1.1 (#8383)
Bumps [filippo.io/edwards25519](https://github.com/FiloSottile/edwards25519) from 1.1.0 to 1.1.1.
- [Commits](https://github.com/FiloSottile/edwards25519/compare/v1.1.0...v1.1.1)

---
updated-dependencies:
- dependency-name: filippo.io/edwards25519
  dependency-version: 1.1.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-19 10:51:27 -08:00
Chris Lu
d1fecdface Fix IAM defaults and S3Tables IAM regression (#8374)
* Fix IAM defaults and s3tables identities

* Refine S3Tables identity tests

* Clarify identity tests
2026-02-18 18:20:03 -08:00
Chris Lu
38e14a867b fix: cancel volume server requests on client disconnect during S3 downloads (#8373)
* fix: cancel volume server requests on client disconnect during S3 downloads

- Use http.NewRequestWithContext in ReadUrlAsStream so in-flight volume
  server requests are properly aborted when the client disconnects and
  the request context is canceled
- Distinguish context-canceled errors (client disconnect, expected) from
  real server errors in streamFromVolumeServers; log at V(3) instead of
  ERROR to reduce noise from client-side disconnects (e.g. Nginx upstream
  timeout, browser cancel, curl --max-time)

Fixes: streamFromVolumeServers: streamFn failed...context canceled"

* fixup: separate Canceled/DeadlineExceeded log severity in streamFromVolumeServers

- context.Canceled → V(3) Infof "client disconnected" (expected, no noise)
- context.DeadlineExceeded → Warningf "server-side deadline exceeded" (unexpected, needs attention)
- all other errors → Errorf (unchanged)"
2026-02-18 17:14:54 -08:00
Chris Lu
eda4a000cc Revert "Fix IAM defaults and s3tables identities"
This reverts commit bf71fe0039.
2026-02-18 16:23:13 -08:00
Chris Lu
bf71fe0039 Fix IAM defaults and s3tables identities 2026-02-18 16:21:48 -08:00
Michał Szynkiewicz
53048ffffb Add md5 checksum validation support on PutObject and UploadPart (#8367)
* Add md5 checksum validation support on PutObject and UploadPart

Per the S3 specification, when a client sends a Content-MD5 header, the server must compare it against the MD5 of the received body and return BadDigest (HTTP 400) if they don't match.

SeaweedFS was silently accepting objects with incorrect Content-MD5 headers, which breaks data integrity verification for clients that rely on this feature (e.g. boto3). The error infrastructure (ErrBadDigest, ErrMsgBadDigest) already existed from PR #7306 but was never wired to an actual check.

This commit adds MD5 verification in putToFiler after the body is streamed and the MD5 is computed, and adds Content-MD5 header validation to PutObjectPartHandler (matching PutObjectHandler). Orphaned chunks are cleaned up on mismatch.

Refs: https://github.com/seaweedfs/seaweedfs/discussions/3908

* handle SSE, add uploadpart test

* s3 integration test: fix typo and add multipart upload checksum test

* s3api: move validateContentMd5 after GetBucketAndObject in PutObjectPartHandler

* s3api: move validateContentMd5 after GetBucketAndObject in PutObjectHandler

* s3api: fix MD5 validation for SSE uploads and logging in putToFiler

* add SSE test with checksum validation - mostly ai-generated

* Update s3_integration_test.go

* Address S3 integration test feedback: fix typos, rename variables, add verification steps, and clean up comments.

---------

Co-authored-by: Chris Lu <chris.lu@gmail.com>
2026-02-18 15:40:08 -08:00
Аlexey Medvedev
6a3a97333f Add support for TLS in gRPC communication between worker and volume server (#8370)
* Add support for TLS in gRPC communication between worker and volume server

* address comments

* worker: capture shared grpc.DialOption in BalanceTask registration closure

* worker: capture shared grpc.DialOption in ErasureCodingTask registration closure

* worker: capture shared grpc.DialOption in VacuumTask registration closure

* worker: use grpc.worker security configuration section for tasks

* plugin/worker: fix compilation errors by passing grpc.DialOption to task constructors

* plugin/worker: prevent double-counting in EC skip counters

---------

Co-authored-by: Chris Lu <chris.lu@gmail.com>
2026-02-18 15:39:53 -08:00
Chris Lu
8ec9ff4a12 Refactor plugin system and migrate worker runtime (#8369)
* admin: add plugin runtime UI page and route wiring

* pb: add plugin gRPC contract and generated bindings

* admin/plugin: implement worker registry, runtime, monitoring, and config store

* admin/dash: wire plugin runtime and expose plugin workflow APIs

* command: add flags to enable plugin runtime

* admin: rename remaining plugin v2 wording to plugin

* admin/plugin: add detectable job type registry helper

* admin/plugin: add scheduled detection and dispatch orchestration

* admin/plugin: prefetch job type descriptors when workers connect

* admin/plugin: add known job type discovery API and UI

* admin/plugin: refresh design doc to match current implementation

* admin/plugin: enforce per-worker scheduler concurrency limits

* admin/plugin: use descriptor runtime defaults for scheduler policy

* admin/ui: auto-load first known plugin job type on page open

* admin/plugin: bootstrap persisted config from descriptor defaults

* admin/plugin: dedupe scheduled proposals by dedupe key

* admin/ui: add job type and state filters for plugin monitoring

* admin/ui: add per-job-type plugin activity summary

* admin/plugin: split descriptor read API from schema refresh

* admin/ui: keep plugin summary metrics global while tables are filtered

* admin/plugin: retry executor reservation before timing out

* admin/plugin: expose scheduler states for monitoring

* admin/ui: show per-job-type scheduler states in plugin monitor

* pb/plugin: rename protobuf package to plugin

* admin/plugin: rename pluginRuntime wiring to plugin

* admin/plugin: remove runtime naming from plugin APIs and UI

* admin/plugin: rename runtime files to plugin naming

* admin/plugin: persist jobs and activities for monitor recovery

* admin/plugin: lease one detector worker per job type

* admin/ui: show worker load from plugin heartbeats

* admin/plugin: skip stale workers for detector and executor picks

* plugin/worker: add plugin worker command and stream runtime scaffold

* plugin/worker: implement vacuum detect and execute handlers

* admin/plugin: document external vacuum plugin worker starter

* command: update plugin.worker help to reflect implemented flow

* command/admin: drop legacy Plugin V2 label

* plugin/worker: validate vacuum job type and respect min interval

* plugin/worker: test no-op detect when min interval not elapsed

* command/admin: document plugin.worker external process

* plugin/worker: advertise configured concurrency in hello

* command/plugin.worker: add jobType handler selection

* command/plugin.worker: test handler selection by job type

* command/plugin.worker: persist worker id in workingDir

* admin/plugin: document plugin.worker jobType and workingDir flags

* plugin/worker: support cancel request for in-flight work

* plugin/worker: test cancel request acknowledgements

* command/plugin.worker: document workingDir and jobType behavior

* plugin/worker: emit executor activity events for monitor

* plugin/worker: test executor activity builder

* admin/plugin: send last successful run in detection request

* admin/plugin: send cancel request when detect or execute context ends

* admin/plugin: document worker cancel request responsibility

* admin/handlers: expose plugin scheduler states API in no-auth mode

* admin/handlers: test plugin scheduler states route registration

* admin/plugin: keep worker id on worker-generated activity records

* admin/plugin: test worker id propagation in monitor activities

* admin/dash: always initialize plugin service

* command/admin: remove plugin enable flags and default to enabled

* admin/dash: drop pluginEnabled constructor parameter

* admin/plugin UI: stop checking plugin enabled state

* admin/plugin: remove docs for plugin enable flags

* admin/dash: remove unused plugin enabled check method

* admin/dash: fallback to in-memory plugin init when dataDir fails

* admin/plugin API: expose worker gRPC port in status

* command/plugin.worker: resolve admin gRPC port via plugin status

* split plugin UI into overview/configuration/monitoring pages

* Update layout_templ.go

* add volume_balance plugin worker handler

* wire plugin.worker CLI for volume_balance job type

* add erasure_coding plugin worker handler

* wire plugin.worker CLI for erasure_coding job type

* support multi-job handlers in plugin worker runtime

* allow plugin.worker jobType as comma-separated list

* admin/plugin UI: rename to Workers and simplify config view

* plugin worker: queue detection requests instead of capacity reject

* Update plugin_worker.go

* plugin volume_balance: remove force_move/timeout from worker config UI

* plugin erasure_coding: enforce local working dir and cleanup

* admin/plugin UI: rename admin settings to job scheduling

* admin/plugin UI: persist and robustly render detection results

* admin/plugin: record and return detection trace metadata

* admin/plugin UI: show detection process and decision trace

* plugin: surface detector decision trace as activities

* mini: start a plugin worker by default

* admin/plugin UI: split monitoring into detection and execution tabs

* plugin worker: emit detection decision trace for EC and balance

* admin workers UI: split monitoring into detection and execution pages

* plugin scheduler: skip proposals for active assigned/running jobs

* admin workers UI: add job queue tab

* plugin worker: add dummy stress detector and executor job type

* admin workers UI: reorder tabs to detection queue execution

* admin workers UI: regenerate plugin template

* plugin defaults: include dummy stress and add stress tests

* plugin dummy stress: rotate detection selections across runs

* plugin scheduler: remove cross-run proposal dedupe

* plugin queue: track pending scheduled jobs

* plugin scheduler: wait for executor capacity before dispatch

* plugin scheduler: skip detection when waiting backlog is high

* plugin: add disk-backed job detail API and persistence

* admin ui: show plugin job detail modal from job id links

* plugin: generate unique job ids instead of reusing proposal ids

* plugin worker: emit heartbeats on work state changes

* plugin registry: round-robin tied executor and detector picks

* add temporary EC overnight stress runner

* plugin job details: persist and render EC execution plans

* ec volume details: color data and parity shard badges

* shard labels: keep parity ids numeric and color-only distinction

* admin: remove legacy maintenance UI routes and templates

* admin: remove dead maintenance endpoint helpers

* Update layout_templ.go

* remove dummy_stress worker and command support

* refactor plugin UI to job-type top tabs and sub-tabs

* migrate weed worker command to plugin runtime

* remove plugin.worker command and keep worker runtime with metrics

* update helm worker args for jobType and execution flags

* set plugin scheduling defaults to global 16 and per-worker 4

* stress: fix RPC context reuse and remove redundant variables in ec_stress_runner

* admin/plugin: fix lifecycle races, safe channel operations, and terminal state constants

* admin/dash: randomize job IDs and fix priority zero-value overwrite in plugin API

* admin/handlers: implement buffered rendering to prevent response corruption

* admin/plugin: implement debounced persistence flusher and optimize BuildJobDetail memory lookups

* admin/plugin: fix priority overwrite and implement bounded wait in scheduler reserve

* admin/plugin: implement atomic file writes and fix run record side effects

* admin/plugin: use P prefix for parity shard labels in execution plans

* admin/plugin: enable parallel execution for cancellation tests

* admin: refactor time.Time fields to pointers for better JSON omitempty support

* admin/plugin: implement pointer-safe time assignments and comparisons in plugin core

* admin/plugin: fix time assignment and sorting logic in plugin monitor after pointer refactor

* admin/plugin: update scheduler activity tracking to use time pointers

* admin/plugin: fix time-based run history trimming after pointer refactor

* admin/dash: fix JobSpec struct literal in plugin API after pointer refactor

* admin/view: add D/P prefixes to EC shard badges for UI consistency

* admin/plugin: use lifecycle-aware context for schema prefetching

* Update ec_volume_details_templ.go

* admin/stress: fix proposal sorting and log volume cleanup errors

* stress: refine ec stress runner with math/rand and collection name

- Added Collection field to VolumeEcShardsDeleteRequest for correct filename construction.
- Replaced crypto/rand with seeded math/rand PRNG for bulk payloads.
- Added documentation for EcMinAge zero-value behavior.
- Added logging for ignored errors in volume/shard deletion.

* admin: return internal server error for plugin store failures

Changed error status code from 400 Bad Request to 500 Internal Server Error for failures in GetPluginJobDetail to correctly reflect server-side errors.

* admin: implement safe channel sends and graceful shutdown sync

- Added sync.WaitGroup to Plugin struct to manage background goroutines.
- Implemented safeSendCh helper using recover() to prevent panics on closed channels.
- Ensured Shutdown() waits for all background operations to complete.

* admin: robustify plugin monitor with nil-safe time and record init

- Standardized nil-safe assignment for *time.Time pointers (CreatedAt, UpdatedAt, CompletedAt).
- Ensured persistJobDetailSnapshot initializes new records correctly if they don't exist on disk.
- Fixed debounced persistence to trigger immediate write on job completion.

* admin: improve scheduler shutdown behavior and logic guards

- Replaced brittle error string matching with explicit r.shutdownCh selection for shutdown detection.
- Removed redundant nil guard in buildScheduledJobSpec.
- Standardized WaitGroup usage for schedulerLoop.

* admin: implement deep copy for job parameters and atomic write fixes

- Implemented deepCopyGenericValue and used it in cloneTrackedJob to prevent shared state.
- Ensured atomicWriteFile creates parent directories before writing.

* admin: remove unreachable branch in shard classification

Removed an unreachable 'totalShards <= 0' check in classifyShardID as dataShards and parityShards are already guarded.

* admin: secure UI links and use canonical shard constants

- Added rel="noopener noreferrer" to external links for security.
- Replaced magic number 14 with erasure_coding.TotalShardsCount.
- Used renderEcShardBadge for missing shard list consistency.

* admin: stabilize plugin tests and fix regressions

- Composed a robust plugin_monitor_test.go to handle asynchronous persistence.
- Updated all time.Time literals to use timeToPtr helper.
- Added explicit Shutdown() calls in tests to synchronize with debounced writes.
- Fixed syntax errors and orphaned struct literals in tests.

* Potential fix for code scanning alert no. 278: Slice memory allocation with excessive size value

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Potential fix for code scanning alert no. 283: Uncontrolled data used in path expression

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* admin: finalize refinements for error handling, scheduler, and race fixes

- Standardized HTTP 500 status codes for store failures in plugin_api.go.
- Tracked scheduled detection goroutines with sync.WaitGroup for safe shutdown.
- Fixed race condition in safeSendDetectionComplete by extracting channel under lock.
- Implemented deep copy for JobActivity details.
- Used defaultDirPerm constant in atomicWriteFile.

* test(ec): migrate admin dockertest to plugin APIs

* admin/plugin_api: fix RunPluginJobTypeAPI to return 500 for server-side detection/filter errors

* admin/plugin_api: fix ExecutePluginJobAPI to return 500 for job execution failures

* admin/plugin_api: limit parseProtoJSONBody request body to 1MB to prevent unbounded memory usage

* admin/plugin: consolidate regex to package-level validJobTypePattern; add char validation to sanitizeJobID

* admin/plugin: fix racy Shutdown channel close with sync.Once

* admin/plugin: track sendLoop and recv goroutines in WorkerStream with r.wg

* admin/plugin: document writeProtoFiles atomicity — .pb is source of truth, .json is human-readable only

* admin/plugin: extract activityLess helper to deduplicate nil-safe OccurredAt sort comparators

* test/ec: check http.NewRequest errors to prevent nil req panics

* test/ec: replace deprecated ioutil/math/rand, fix stale step comment 5.1→3.1

* plugin(ec): raise default detection and scheduling throughput limits

* topology: include empty disks in volume list and EC capacity fallback

* topology: remove hard 10-task cap for detection planning

* Update ec_volume_details_templ.go

* adjust default

* fix tests

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2026-02-18 13:42:41 -08:00
github-pawo
5463038760 Remove trailing spaces (line 53) in seaweedfs-dev-compose.yml (#8365) 2026-02-18 07:32:07 -08:00
github-pawo
828cbabb55 Add Admin UI to Docker Compose files (#8364) 2026-02-18 01:05:54 -08:00
Chris Lu
5919f519fd fix: allow overriding Enterprise image name using Helm #8361 (#8363)
* fix: allow overriding Enterprise image name using Helm #8361

* refactor: flatten image name construction logic for better readability
2026-02-17 13:49:16 -08:00
Chris Lu
63f641a6c9 Merge branch 'master' of https://github.com/seaweedfs/seaweedfs 2026-02-16 17:01:26 -08:00
Chris Lu
3c3a78d08e 4.13 2026-02-16 17:01:19 -08:00
Chris Lu
3300874cb5 filer: add default log purging to master maintenance scripts (#8359)
* filer: add default log purging to master maintenance scripts

* filer: fix default maintenance scripts to include full set of tasks

* filer: refactor maintenance scripts to avoid duplication
2026-02-16 16:58:15 -08:00
dependabot[bot]
bddd7960c1 build(deps): bump org.apache.avro:avro from 1.11.4 to 1.11.5 in /test/java/spark (#8358)
build(deps): bump org.apache.avro:avro in /test/java/spark

Bumps org.apache.avro:avro from 1.11.4 to 1.11.5.

---
updated-dependencies:
- dependency-name: org.apache.avro:avro
  dependency-version: 1.11.5
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
2026-02-16 15:28:55 -08:00
Lisandro Pin
a9d12a0792 Implement full scrubbing for EC volumes (#8318)
Implement full scrubbing for EC volumes.
2026-02-16 15:09:01 -08:00
Chris Lu
564fc56698 Update docker-compose.yml 2026-02-16 15:01:09 -08:00
Lisandro Pin
11fdb68281 Fix superblock write error checks on volume compaction. (#8352) 2026-02-16 14:44:37 -08:00
Chris Lu
35ad7d08a5 remove debug 2026-02-16 14:03:02 -08:00
Chris Lu
0d8588e3ae S3: Implement IAM defaults and STS signing key fallback (#8348)
* S3: Implement IAM defaults and STS signing key fallback logic

* S3: Refactor startup order to init SSE-S3 key manager before IAM

* S3: Derive STS signing key from KEK using HKDF for security isolation

* S3: Document STS signing key fallback in security.toml

* fix(s3api): refine anonymous access logic and secure-by-default behavior

- Initialize anonymous identity by default in `NewIdentityAccessManagement` to prevent nil pointer exceptions.
- Ensure `ReplaceS3ApiConfiguration` preserves the anonymous identity if not present in the new configuration.
- Update `NewIdentityAccessManagement` signature to accept `filerClient`.
- In legacy mode (no policy engine), anonymous defaults to Deny (no actions), preserving secure-by-default behavior.
- Use specific `LookupAnonymous` method instead of generic map lookup.
- Update tests to accommodate signature changes and verify improved anonymous handling.

* feat(s3api): make IAM configuration optional

- Start S3 API server without a configuration file if `EnableIam` option is set.
- Default to `Allow` effect for policy engine when no configuration is provided (Zero-Config mode).
- Handle empty configuration path gracefully in `loadIAMManagerFromConfig`.
- Add integration test `iam_optional_test.go` to verify empty config behavior.

* fix(iamapi): fix signature mismatch in NewIdentityAccessManagementWithStore

* fix(iamapi): properly initialize FilerClient instead of passing nil

* fix(iamapi): properly initialize filer client for IAM management

- Instead of passing `nil`, construct a `wdclient.FilerClient` using the provided `Filers` addresses.
- Ensure `NewIdentityAccessManagementWithStore` receives a valid `filerClient` to avoid potential nil pointer dereferences or limited functionality.

* clean: remove dead code in s3api_server.go

* refactor(s3api): improve IAM initialization, safety and anonymous access security

* fix(s3api): ensure IAM config loads from filer after client init

* fix(s3): resolve test failures in integration, CORS, and tagging tests

- Fix CORS tests by providing explicit anonymous permissions config
- Fix S3 integration tests by setting admin credentials in init
- Align tagging test credentials in CI with IAM defaults
- Added goroutine to retry IAM config load in iamapi server

* fix(s3): allow anonymous access to health targets and S3 Tables when identities are present

* fix(ci): use /healthz for Caddy health check in awscli tests

* iam, s3api: expose DefaultAllow from IAM and Policy Engine

This allows checking the global "Open by Default" configuration from
other components like S3 Tables.

* s3api/s3tables: support DefaultAllow in permission logic and handler

Updated CheckPermissionWithContext to respect the DefaultAllow flag
in PolicyContext. This enables "Open by Default" behavior for
unauthenticated access in zero-config environments. Added a targeted
unit test to verify the logic.

* s3api/s3tables: propagate DefaultAllow through handlers

Propagated the DefaultAllow flag to individual handlers for
namespaces, buckets, tables, policies, and tagging. This ensures
consistent "Open by Default" behavior across all S3 Tables API
endpoints.

* s3api: wire up DefaultAllow for S3 Tables API initialization

Updated registerS3TablesRoutes to query the global IAM configuration
and set the DefaultAllow flag on the S3 Tables API server. This
completes the end-to-end propagation required for anonymous access in
zero-config environments. Added a SetDefaultAllow method to
S3TablesApiServer to facilitate this.

* s3api: fix tests by adding DefaultAllow to mock IAM integrations

The IAMIntegration interface was updated to include DefaultAllow(),
breaking several mock implementations in tests. This commit fixes
the build errors by adding the missing method to the mocks.

* env

* ensure ports

* env

* env

* fix default allow

* add one more test using non-anonymous user

* debug

* add more debug

* less logs
2026-02-16 13:59:13 -08:00
dependabot[bot]
cc58272219 build(deps): bump github.com/klauspost/compress from 1.18.3 to 1.18.4 (#8353)
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.18.3 to 1.18.4.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Commits](https://github.com/klauspost/compress/compare/v1.18.3...v1.18.4)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-version: 1.18.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-16 10:58:13 -08:00
dependabot[bot]
5be4ee9892 build(deps): bump github.com/redis/go-redis/v9 from 9.17.2 to 9.18.0 (#8356)
Bumps [github.com/redis/go-redis/v9](https://github.com/redis/go-redis) from 9.17.2 to 9.18.0.
- [Release notes](https://github.com/redis/go-redis/releases)
- [Changelog](https://github.com/redis/go-redis/blob/master/RELEASE-NOTES.md)
- [Commits](https://github.com/redis/go-redis/compare/v9.17.2...v9.18.0)

---
updated-dependencies:
- dependency-name: github.com/redis/go-redis/v9
  dependency-version: 9.18.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-16 10:57:54 -08:00
dependabot[bot]
22e74221cb build(deps): bump github.com/getsentry/sentry-go from 0.40.0 to 0.42.0 (#8357)
Bumps [github.com/getsentry/sentry-go](https://github.com/getsentry/sentry-go) from 0.40.0 to 0.42.0.
- [Release notes](https://github.com/getsentry/sentry-go/releases)
- [Changelog](https://github.com/getsentry/sentry-go/blob/master/CHANGELOG.md)
- [Commits](https://github.com/getsentry/sentry-go/compare/v0.40.0...v0.42.0)

---
updated-dependencies:
- dependency-name: github.com/getsentry/sentry-go
  dependency-version: 0.42.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-16 10:57:43 -08:00
dependabot[bot]
cc80641be1 build(deps): bump github.com/mattn/go-sqlite3 from 1.14.33 to 1.14.34 (#8355)
Bumps [github.com/mattn/go-sqlite3](https://github.com/mattn/go-sqlite3) from 1.14.33 to 1.14.34.
- [Release notes](https://github.com/mattn/go-sqlite3/releases)
- [Commits](https://github.com/mattn/go-sqlite3/compare/v1.14.33...v1.14.34)

---
updated-dependencies:
- dependency-name: github.com/mattn/go-sqlite3
  dependency-version: 1.14.34
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-16 10:57:31 -08:00
dependabot[bot]
927c906379 build(deps): bump github.com/Azure/azure-sdk-for-go/sdk/storage/azblob from 1.6.3 to 1.6.4 (#8354)
build(deps): bump github.com/Azure/azure-sdk-for-go/sdk/storage/azblob

Bumps [github.com/Azure/azure-sdk-for-go/sdk/storage/azblob](https://github.com/Azure/azure-sdk-for-go) from 1.6.3 to 1.6.4.
- [Release notes](https://github.com/Azure/azure-sdk-for-go/releases)
- [Commits](https://github.com/Azure/azure-sdk-for-go/compare/sdk/storage/azblob/v1.6.3...sdk/storage/azblob/v1.6.4)

---
updated-dependencies:
- dependency-name: github.com/Azure/azure-sdk-for-go/sdk/storage/azblob
  dependency-version: 1.6.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-16 10:57:17 -08:00
Lisandro Pin
0721e3c1e9 Rework volume compaction (a.k.a vacuuming) logic to cleanly support new parameters. (#8337)
We'll leverage on this to support a "ignore broken needles" option, necessary
to properly recover damaged volumes, as described in
https://github.com/seaweedfs/seaweedfs/issues/7442#issuecomment-3897784283 .
2026-02-16 02:15:14 -08:00
Chris Lu
703d5e27b3 Fix S3 ListObjectsV2 recursion issue (#8347)
* Fix S3 ListObjectsV2 recursion issue (#8346)

Removed aggressive Limit=1 optimization in doListFilerEntries that caused missed directory entries when prefix ended with a delimiter. Added regression tests to verify deep directory traversal.

* Address PR comments: condense test comments
2026-02-15 10:52:10 -08:00
Chris Lu
e863767ac7 cleanup(iam): final removal of temporary debug logging from STS and S3 API 2026-02-14 22:15:06 -08:00
Chris Lu
e29a7f1741 cleanup(iam): remove temporary debug logging from STS and S3 API (redo) 2026-02-14 22:14:33 -08:00
Chris Lu
cf8e383e1e STS: Fallback to Caller Identity when RoleArn is missing in AssumeRole (#8345)
* s3api: make RoleArn optional in AssumeRole

* s3api: address PR feedback for optional RoleArn

* iam: add configurable default role for AssumeRole

* S3 STS: Use caller identity when RoleArn is missing

- Fallback to PrincipalArn/Context in AssumeRole if RoleArn is empty

- Handle User ARNs in prepareSTSCredentials

- Fix PrincipalArn generation for env var credentials

* Test: Add unit test for AssumeRole caller identity fallback

* fix(s3api): propagate admin permissions to assumed role session when using caller identity fallback

* STS: Fix is_admin propagation and optimize IAM policy evaluation for assumed roles

- Restore is_admin propagation via JWT req_ctx
- Optimize IsActionAllowed to skip role lookups for admin sessions
- Ensure session policies are still applied for downscoping
- Remove debug logging
- Fix syntax errors in cleanup

* fix(iam): resolve STS policy bypass for admin sessions

- Fixed IsActionAllowed in iam_manager.go to correctly identify and validate internal STS tokens, ensuring session policies are enforced.
- Refactored VerifyActionPermission in auth_credentials.go to properly handle session tokens and avoid legacy authorization short-circuits.
- Added debug logging for better tracing of policy evaluation and session validation.
2026-02-14 22:00:59 -08:00
Chris Lu
f49f6c6876 FUSE mount: fix failed git clone (#8344)
tests: reset MemoryStore to avoid test pollution; fix port reservation to prevent duplicate ports in mini

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-14 00:28:20 -08:00
Chris Lu
7799915e50 Fix IAM identity loss on S3 restart migration (#8343)
* Fix IAM reload after legacy config migration

Handle legacy identity.json metadata events by reloading from the credential manager instead of parsing event content, and watch the correct /etc/iam multi-file directories so identity changes are applied.

Add regression tests for legacy deletion and /etc/iam/identities change events.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix auth_credentials_subscribe_test helper to not pollute global memory store

The SaveConfiguration call was affecting other tests. Use local credential manager and ReplaceS3ApiConfiguration instead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix IAM event watching: subscribe to IAM directories and improve directory matching

- Add /etc/iam and its subdirectories (identities, policies, service_accounts) to directoriesToWatch
- Fix directory matching to avoid false positives from sibling directories
  - Use exact match or prefix with trailing slash instead of plain HasPrefix
  - Prevents matching hypothetical /etc/iam/identities_backup directory

This ensures IAM config change events are actually delivered to the handler.

* fix tests

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-13 22:49:27 -08:00
Chris Lu
c090604143 Add UpdateAccessKey support to IAM API (#8342)
* Add UpdateAccessKey support to IAM API

* simplify
2026-02-13 21:11:07 -08:00
Chris Lu
f44e25b422 fix(iam): ensure access key status is persisted and defaulted to Active (#8341)
* Fix master leader election startup issue

Fixes #error-log-leader-not-selected-yet

* not useful test

* fix(iam): ensure access key status is persisted and defaulted to Active

* make pb

* update tests

* using constants
2026-02-13 20:28:41 -08:00