seaweedFS

Author	SHA1	Message	Date
Chris Lu	6213daf118	4.18	2026-04-01 17:42:41 -07:00
Chris Lu	5fa5507234	Add Prometheus metric to count upload errors (#8788) Add Prometheus metric to count upload errors (#8775) Add SeaweedFS_upload_error_total counter labeled by HTTP status code, so operators can alert on write/replication failures. Code "0" indicates a transport error (no HTTP response received). Also add an "Upload Errors" panel to the Grafana dashboard.	2026-03-26 16:58:05 -07:00
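The labeling rule in the commit above (one counter series per HTTP status code, with "0" reserved for transport errors) can be sketched as follows. This is a minimal illustration only: it uses a plain in-memory `Counter` as a stand-in for a Prometheus counter, and `record_upload_error` is a hypothetical name, not a SeaweedFS function.

```python
from collections import Counter
from typing import Optional

# Hypothetical in-memory stand-in for a counter with a single "code" label.
upload_error_total: Counter = Counter()


def record_upload_error(status_code: Optional[int]) -> str:
    """Count one failed upload and return the label value that was used.

    status_code is None when no HTTP response was received at all
    (a transport error), which the commit reports under code "0".
    """
    label = "0" if status_code is None else str(status_code)
    upload_error_total[label] += 1
    return label


# Two server-side failures and one transport failure.
record_upload_error(500)
record_upload_error(500)
record_upload_error(None)
```

Keeping transport failures under a distinct label value lets an alert rule separate "the server answered with an error" from "the server never answered".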
Andreas Røste	79f4a4579f	feat(k8s): added possibility to specify service.type for multiple ser… (#8372 ) * feat(k8s): added possibility to specify service.type for multiple services in helm chart * fix(k8s): removed headless (clusterIP: None) from services * fix(k8s): keep master and filer services headless for StatefulSet compatibility Master and filer services must remain headless (clusterIP: None) because their StatefulSets reference them via serviceName for stable pod DNS. Revert the service.type change for these two services and remove their unused service config from values.yaml. S3 and SFTP remain configurable. --------- Co-authored-by: Andreas Røste <andreas2101@gmail.com> Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-03-25 11:30:14 -07:00
Chris Lu	5e76f55077	fix(helm): namespace app-specific global values under global.seaweedfs (#8700) * fix(helm): namespace app-specific values under global.seaweedfs Move all app-specific values from the global namespace to global.seaweedfs.* to avoid polluting the shared .Values.global namespace when the chart is used as a subchart. Standard Helm conventions (global.imageRegistry, global.imagePullSecrets) remain at the global level as they are designed to be shared across subcharts. Fixes seaweedfs/seaweedfs#8699 BREAKING CHANGE: global values have been restructured. Users must update their values files to use the new paths: - global.registry → global.imageRegistry - global.repository → global.seaweedfs.image.repository - global.imageName → global.seaweedfs.image.name - global.<key> → global.seaweedfs.<key> (for all other app-specific values) * fix(ci): update helm CI tests to use new global.seaweedfs.* value paths Update all --set flags in helm_ci.yml to use the new namespaced global.seaweedfs.* paths matching the values.yaml restructuring. * fix(ci): install Claude Code via npm to avoid install.sh 403 The claude-code-action's built-in installer uses `curl https://claude.ai/install.sh | bash` which can fail with 403. Due to the pipe, bash exits 0 on empty input, masking the curl failure and leaving the `claude` binary missing. Work around this by installing Claude Code via npm before invoking the action, and passing the executable path via path_to_claude_code_executable. * revert: remove claude-code-review.yml changes from this PR The claude-code-action OIDC token exchange validates that the workflow file matches the version on the default branch. Modifying it in a PR causes the review job to fail with "Workflow validation failed". The Claude Code install fix will need to be applied directly to master or in a separate PR.
* fix: update stale references to old global.* value paths - admin-statefulset.yaml: fix fail message to reference global.seaweedfs.masterServer - values.yaml: fix comment to reference image.name instead of imageName - helm_ci.yml: fix diagnostic message to reference global.seaweedfs.enableSecurity * feat(helm): add backward-compat shim for old global.* value paths Add _compat.tpl with a seaweedfs.compat helper that detects old-style global.* keys (e.g. global.enableSecurity, global.registry) and merges them into the new global.seaweedfs.* namespace. Since the old keys no longer have defaults in values.yaml, their presence means the user explicitly provided them. The helper uses in-place mutation via `set` so all templates see the merged values. This ensures existing deployments using old value paths continue to work without changes after upgrading. * fix: update stale comment references in values.yaml Update comments referencing global.enableSecurity and global.masterServer to the new global.seaweedfs.* paths. --------- Co-authored-by: Copilot <copilot@github.com>	2026-03-19 13:00:48 -07:00
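The backward-compat shim described in the commit above can be modeled as a merge over the `global` values map. The sketch below is an illustration in Python rather than the chart's actual Go template: `apply_compat` is a hypothetical name, the rename table follows the BREAKING CHANGE note, and the choice that an explicitly set new-style key wins over an old-style one is an assumption, since the commit message does not say how conflicts are resolved.

```python
import copy

# Renames listed in the commit's BREAKING CHANGE note.
EXPLICIT_RENAMES = {
    "registry": ("imageRegistry",),  # stays at the global level
    "repository": ("seaweedfs", "image", "repository"),
    "imageName": ("seaweedfs", "image", "name"),
}
# Keys that legitimately live at the shared global level.
SHARED_KEYS = {"imageRegistry", "imagePullSecrets", "seaweedfs"}


def apply_compat(global_values: dict) -> dict:
    """Return a copy of the values with old-style global.* keys moved
    to their new locations under global.seaweedfs.*."""
    merged = copy.deepcopy(global_values)
    for key in list(global_values):
        if key in SHARED_KEYS:
            continue
        value = merged.pop(key)
        path = EXPLICIT_RENAMES.get(key, ("seaweedfs", key))
        node = merged
        for part in path[:-1]:
            node = node.setdefault(part, {})
        # Assumption: a new-style key that is already set takes priority.
        node.setdefault(path[-1], value)
    return merged


old_style = {"registry": "registry.example.com", "enableSecurity": True}
new_style = apply_compat(old_style)
```

Here `new_style` holds `imageRegistry` at the global level and `enableSecurity` under the `seaweedfs` key, which is the layout the templates read after the restructuring.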
Jayshan Raghunandan	1f1eac4f08	feat: improve aio support for admin/volume ingress and fix UI links (#8679 ) * feat: improve allInOne mode support for admin/volume ingress and fix master UI links - Add allInOne support to admin ingress template, matching the pattern used by filer and s3 ingress templates (or-based enablement with ternary service name selection) - Add allInOne support to volume ingress template, which previously required volume.enabled even when the volume server runs within the allInOne pod - Expose admin ports in allInOne deployment and service when allInOne.admin.enabled is set - Add allInOne.admin config section to values.yaml (enabled by default, ports inherit from admin.) - Fix legacy master UI templates (master.html, masterNewRaft.html) to prefer PublicUrl over internal Url when linking to volume server UI. The new admin UI already handles this correctly. fix: revert admin allInOne changes and fix PublicUrl in admin dashboard The admin binary (`weed admin`) is a separate process that cannot run inside `weed server` (allInOne mode). Revert the admin-related allInOne helm chart changes that caused 503 errors on admin ingress. Fix bug in cluster_topology.go where VolumeServer.PublicURL was set to node.Id (internal pod address) instead of the actual public URL. Add public_url field to DataNodeInfo proto message so the topology gRPC response carries the public URL set via -volume.publicUrl flag. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: use HTTP /dir/status to populate PublicUrl in admin dashboard The gRPC DataNodeInfo proto does not include PublicUrl, so the admin dashboard showed internal pod IPs instead of the configured public URL. Fetch PublicUrl from the master's /dir/status HTTP endpoint and apply it in both GetClusterTopology and GetClusterVolumeServers code paths. Also reverts the unnecessary proto field additions from the previous commit and cleans up a stray blank line in all-in-one-service.yml. 
* fix: apply PublicUrl link fix to masterNewRaft.html Match the same conditional logic already applied to master.html — prefer PublicUrl when set and different from Url. * fix: add HTTP timeout and status check to fetchPublicUrlMap Use a 5s-timeout client instead of http.DefaultClient to prevent blocking indefinitely when the master is unresponsive. Also check the HTTP status code before attempting to parse the response body. * fix: fall back to node address when PublicUrl is empty Prevents blank links in the admin dashboard when PublicUrl is not configured, such as in standalone or mixed-version clusters. * fix: log io.ReadAll error in fetchPublicUrlMap --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-03-18 13:20:55 -07:00
hoppla20	d34da671eb	fix(chart): bucket hook (#8680) * fix(chart): add imagePullPolicy and imagePullSecret to bucket-hook * chore(chart): add configurable bucket hook resources * fix(chart): add createBucketsHook value to allInOne and filer s3 blocks	2026-03-18 12:58:29 -07:00
hoppla20	d79e82ee60	fix(chart): missing resources on volume statefulset initContainer (#8678) * fix(chart): missing resources on volume statefulset initContainer * chore(chart): use own resources for idx-vol-move initContainer * chore(chart): improve comment for idxMoveResources value	2026-03-18 12:30:18 -07:00
hoppla20	efe722c18c	fix(chart): all in one maxVolumes value (#8683) * fix(chart): all-in-one deployment maxVolumes value * chore(chart): improve readability * fix(chart): maxVolume nil value check * fix(chart): guard against nil/empty volume.dataDirs before calling first Without this check, `first` errors when volume.dataDirs is nil or empty, causing a template render failure for users who omit the setting entirely. --------- Co-authored-by: Copilot <copilot@github.com>	2026-03-18 12:19:46 -07:00
Copilot	7174760a5d	helm: add urlPrefix support for admin UI behind reverse proxy subpath	2026-03-17 18:28:40 -07:00
Lukas Kallies	729df9c375	Update admin UI secret example to match (#8618)	2026-03-13 10:37:47 -07:00
Moray Baruh	3fe5a7d761	Fix misuse of $__interval instead of $__rate_interval in Grafana panels (#8617)	2026-03-13 07:54:03 -07:00
Chris Lu	0443b66a75	fix(helm): trim whitespace before s3 TLS args to prevent command breakage (#8614 ) * fix(helm): trim whitespace before s3 TLS args to prevent command breakage (#8613) When global.enableSecurity is enabled, the `{{ include }}` call for s3 TLS args lacked the leading dash (`{{-`), producing an extra blank line in the rendered shell command. This broke shell continuation and caused the filer (and s3/all-in-one) to crash because arguments after the blank line were silently dropped. * ci(helm): assert no blank lines in security+S3 command blocks Renders the chart with global.enableSecurity=true and S3 enabled for normal mode (filer + s3 deployments) and all-in-one mode, then parses every /bin/sh -ec command block and fails if any contains blank lines. This catches the whitespace regression from #8613 where a missing {{- dash on the seaweedfs.s3.tlsArgs include produced a blank line that broke shell continuation. * ci(helm): enable S3 in all-in-one security render test The s3.tlsArgs include is gated by allInOne.s3.enabled, so without this flag the all-in-one command block wasn't actually exercising the TLS args path.	2026-03-12 15:35:22 -07:00
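The CI assertion described in the commit above (fail when a rendered shell command block contains a blank line) can be sketched as a small check. This is an illustration only: `find_blank_lines` is a hypothetical helper, the sample commands are invented, and the actual CI job parses rendered chart output rather than string literals.

```python
from typing import List


def find_blank_lines(command: str) -> List[int]:
    """Return the 1-based line numbers of blank lines in a command block."""
    return [
        number
        for number, line in enumerate(command.splitlines(), start=1)
        if line.strip() == ""
    ]


# A blank line after a trailing backslash ends the shell command early,
# so the flag on the last line is no longer passed to the program.
broken = "exec weed filer \\\n  -port=8888 \\\n\n  -s3.port=8333"
fixed = "exec weed filer \\\n  -port=8888 \\\n  -s3.port=8333"

problems = find_blank_lines(broken)
```

A check like this would fail the build when `problems` is non-empty, which is the condition the missing `{{-` whitespace trim produced.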
Chris Lu	bfd0d5c084	fix(helm): use componentName for all service names to fix truncation mismatch (#8612 ) * fix(helm): use componentName for all service names to fix truncation mismatch (#8610) PR #8143 updated statefulsets and deployments to use the componentName helper (which truncates the fullname before appending the suffix), but left service definitions using the old `printf + trunc 63` pattern. When release names are long enough, these two strategies produce different names, causing DNS resolution failures (e.g., S3 cannot find the filer-client service and falls back to localhost:8888). Unify all service name definitions and cluster address helpers to use the componentName helper consistently. * refactor(helm): simplify cluster address helpers with ternary * test(helm): add regression test for service name truncation with long release names Renders the chart with a >63-char fullname in both normal and all-in-one modes, then asserts that Service metadata.name values match the hostnames produced by cluster.masterAddress, cluster.filerAddress, and the S3 deployment's -filer= argument. Prevents future truncation/DNS mismatch regressions like #8610. * fix(helm-ci): limit S3_FILER_HOST extraction to first match	2026-03-12 11:59:24 -07:00
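The mismatch described in the commit above comes from truncating at different points. The sketch below contrasts the two strategies. The function names are hypothetical, and the exact truncation budget and trailing-dash trimming are assumptions chosen for illustration; the real helpers live in the chart's templates.

```python
MAX_LEN = 63  # Kubernetes limits DNS labels to 63 characters


def old_service_name(fullname: str, suffix: str) -> str:
    """Old pattern: append the suffix, then truncate the whole name."""
    return f"{fullname}-{suffix}"[:MAX_LEN].rstrip("-")


def component_name(fullname: str, suffix: str) -> str:
    """New pattern: truncate the fullname first so the suffix survives.

    The budget calculation is an assumption made for this sketch.
    """
    budget = MAX_LEN - len(suffix) - 1
    return f"{fullname[:budget].rstrip('-')}-{suffix}"


short = "seaweedfs"
long_name = "a" * 60

# Short names agree under both strategies.
same = old_service_name(short, "filer-client") == component_name(short, "filer-client")

# Long names diverge: the old pattern cuts into the suffix.
old = old_service_name(long_name, "filer-client")
new = component_name(long_name, "filer-client")
```

With the 60-character name, `old` ends in `-fi` while `new` ends in `-filer-client`. A workload that builds its hostname one way cannot resolve a Service named the other way, which matches the DNS failure the commit reports.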
Chris Lu	e4a77b8b16	feat(admin): support env var and security.toml for credentials (#8606 ) * feat(security): add [admin] section to security.toml scaffold Add admin credential fields (user, password, readonly.user, readonly.password) to security.toml. Via viper's WEED_ env prefix and AutomaticEnv(), these are automatically overridable as WEED_ADMIN_USER, WEED_ADMIN_PASSWORD, etc. Ref: https://github.com/seaweedfs/seaweedfs/discussions/8586 * feat(admin): support env var and security.toml fallbacks for credentials Add applyViperFallback() to read admin credentials from security.toml / WEED_* environment variables when CLI flags are not explicitly set. This allows systems like NixOS to pass secrets via env vars instead of CLI flags, which appear in process listings. Precedence: CLI flag > env var / security.toml > default value. Also change -adminUser default from "admin" to "" so that credentials are fully opt-in. Ref: https://github.com/seaweedfs/seaweedfs/discussions/8586 * feat(helm): use WEED_ env vars for admin credentials instead of CLI flags Rename SEAWEEDFS_ADMIN_USER/PASSWORD to WEED_ADMIN_USER/PASSWORD so viper picks them up natively. Remove -adminUser/-adminPassword shell expansion from command args since the Go binary now reads these directly via viper. * docs(admin): document env var and security.toml credential support Add environment variable mapping table, security.toml example, and precedence rules to the admin README. * style(security): use nested [admin.readonly] table in security.toml Use a nested TOML table instead of dotted keys for the readonly credentials. More idiomatic and easier to read; no change in how Viper parses it. * fix(admin): use util.GetViper() for env var support and fix README example applyViperFallback() was using viper.GetString() directly, which bypasses the WEED_ env prefix and AutomaticEnv setup that only happens in util.GetViper(). 
Switch to util.GetViper().GetString() so WEED_ADMIN_* environment variables are actually picked up. Also fix the README example to include WEED_ADMIN_USER alongside WEED_ADMIN_PASSWORD, since runAdmin() rejects an empty username when a password is set. * fix(admin): restore default adminUser to "admin" Defaulting adminUser to "" broke the common flow of setting only WEED_ADMIN_PASSWORD — runAdmin() rejects an empty username when a password is set. Restore "admin" as the default so that setting only the password works out of the box. * docs(admin): align README security.toml example with scaffold format Use nested [admin.readonly] table instead of flat dotted keys to match the format in weed/command/scaffold/security.toml. * docs(admin): remove README.md in favor of wiki page Admin documentation lives at the wiki (Admin-UI.md). Remove the in-repo README to avoid maintaining duplicate docs. --------- Co-authored-by: Copilot <copilot@github.com>	2026-03-11 17:40:24 -07:00
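The precedence stated in the commit above (CLI flag, then env var or security.toml, then default) can be sketched as a lookup chain. This is a simplified model, not the `applyViperFallback()` implementation: `resolve_setting` is a hypothetical name, and placing the environment variable ahead of the file value follows Viper's documented ordering of env over config file.

```python
from typing import Mapping, Optional


def resolve_setting(
    flag_value: Optional[str],
    env_name: str,
    config_key: str,
    default: str,
    env: Mapping[str, str],
    file_config: Mapping[str, str],
) -> str:
    """Pick a value using: explicit CLI flag > env var > config file > default.

    flag_value is None when the flag was not explicitly set on the command line.
    """
    if flag_value is not None:
        return flag_value
    if env_name in env:
        return env[env_name]
    if config_key in file_config:
        return file_config[config_key]
    return default


# Only the password is supplied via the environment; the user falls back
# to the default "admin", as restored by the final fix in this commit.
env = {"WEED_ADMIN_PASSWORD": "s3cret"}
user = resolve_setting(None, "WEED_ADMIN_USER", "admin.user", "admin", env, {})
password = resolve_setting(None, "WEED_ADMIN_PASSWORD", "admin.password", "", env, {})
```

Reading secrets from the environment keeps them out of the process argument list, which is the motivation the commit gives for the change.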
Mohamed Sekour	1df6821ec6	Fix topologySpreadConstraints key in sftp-deployment.yaml (#8600)	2026-03-11 03:05:23 -07:00
Chris Lu	4a5243886a	4.17	2026-03-11 02:29:24 -07:00
Chris Lu	8ad58e7002	4.16	2026-03-09 21:52:43 -07:00
Chris Lu	6c7fe87a72	helm: add s3.tlsSecret for custom S3 HTTPS certificate (#8582 ) * helm: add s3.tlsSecret to allow custom TLS certificate for S3 HTTPS endpoint Allow users to specify an external Kubernetes TLS secret for the S3 HTTPS endpoint instead of using the internal self-signed client certificate. This enables using publicly trusted certificates (e.g. from Let's Encrypt) so S3 clients don't need to trust the internal CA. The new s3.tlsSecret value is supported in the standalone S3 gateway, filer with embedded S3, and all-in-one deployment templates. Closes #8581 * refactor: extract S3 TLS helpers to reduce duplication Move repeated S3 TLS cert/key logic into shared helper templates (seaweedfs.s3.tlsArgs, seaweedfs.s3.tlsVolumeMount, seaweedfs.s3.tlsVolume) in _helpers.tpl, and use them across all three deployment templates. * helm: add allInOne.s3.trafficDistribution support Add the missing allInOne.s3.trafficDistribution branch to the seaweedfs.trafficDistribution helper and wire it into the all-in-one service template, mirroring the existing s3-service.yaml behavior. PreferClose is auto-converted to PreferSameZone on k8s >=1.35. * fix: scope S3 TLS mounts to S3-enabled pods and simplify trafficDistribution helper - Wrap S3 TLS volume/volumeMount includes in allInOne.s3.enabled and filer.s3.enabled guards so the custom TLS secret is only mounted when S3 is actually enabled in that deployment mode. - Refactor seaweedfs.trafficDistribution helper to accept an explicit value+Capabilities dict instead of walking multiple .Values paths, making each call site responsible for passing its own setting.	2026-03-09 14:24:42 -07:00
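The refactored trafficDistribution helper in the commit above takes an explicit value plus cluster capabilities. Its conversion rule can be sketched as a pure function. The name `effective_traffic_distribution` and the `(major, minor)` tuple representation are assumptions for illustration; the chart itself reads the Kubernetes version from Helm's Capabilities object.

```python
from typing import Tuple


def effective_traffic_distribution(value: str, kube_version: Tuple[int, int]) -> str:
    """Convert PreferClose to PreferSameZone on Kubernetes 1.35 or newer.

    Any other value is passed through unchanged.
    """
    if value == "PreferClose" and kube_version >= (1, 35):
        return "PreferSameZone"
    return value


on_new_cluster = effective_traffic_distribution("PreferClose", (1, 35))
on_old_cluster = effective_traffic_distribution("PreferClose", (1, 34))
```

Passing the value in explicitly, rather than having the helper walk several `.Values` paths, makes each call site responsible for its own setting, as the commit describes.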
Surote	bfc430afbd	Update helm for support on OpenShift to have data replication and replicas for master, filer and volume (#8543)	2026-03-07 05:30:23 -08:00
Chris Lu	fcd5de9710	Fix YAML parse error in post-install-bucket-hook template (#8523 ) The 'set -o pipefail' line was improperly indented outside the YAML block scalar, causing a parse error when s3.enabled=true and s3.createBuckets were populated. Moved the line to the beginning of the script block with correct indentation (12 spaces). Fixes #8520 Co-authored-by: Claude Haiku 4.5 <noreply@anthropic.com>	2026-03-05 10:01:31 -08:00
Steven Crespo	b6f6f0187e	Add before-hook-creation delete policy to bucket-hook Job (#8519)	2026-03-05 06:28:03 -08:00
Chris Lu	b3f7472fd3	4.15	2026-03-04 22:13:57 -08:00
Chris Lu	7799804200	4.14 Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-04 19:22:39 -08:00
Chris Lu	1a3e3100d0	Helm: set serviceAccountName independent of cluster role (#8495) * Add stale job expiry and expire API * Add expire job button * helm: decouple serviceAccountName from cluster role --------- Co-authored-by: Copilot <copilot@github.com>	2026-03-03 12:13:18 -08:00
Surote	3db05f59f0	Feat: update openshift helm value to support seaweed s3 (#8494) feat: update openshift helm values Update helm values for openshift to enable/disable s3 and change log to `emptydir` instead of `hostpath`	2026-03-03 01:11:01 -08:00
Chris Lu	2644816692	helm: avoid duplicate env var keys in workload env lists (#8488 ) * helm: dedupe merged extraEnvironmentVars in workloads * address comments Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com> * range Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com> * helm: reuse merge helper for extraEnvironmentVars --------- Co-authored-by: Copilot <copilot@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-02 12:10:57 -08:00
Kirill Ilin	ae02d47433	helm: add optional parameters to COSI BucketClass (#8453 ) Add cosi.bucketClassParameters to allow passing arbitrary parameters to the default BucketClass resource. This enables use cases like tiered storage where a diskType parameter needs to be set on the BucketClass to route objects to specific volume servers. When bucketClassParameters is empty (default), the BucketClass is rendered without a parameters block, preserving backward compatibility. Signed-off-by: Kirill Ilin <stitch14@yandex.ru> Co-authored-by: Claude <noreply@anthropic.com>	2026-02-26 12:19:07 -08:00
Chris Lu	9b6fc49946	Chart createBuckets config #8368: Add TTL, Object Lock, and Versioning support (#8375) * Chart createBuckets config #8368: Add TTL, Object Lock, and Versioning support * Update weed/shell/command_s3_bucket_versioning.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> * address comments * address comments * go fmt * fix failures are still treated like “bucket not found” --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-02-26 11:56:10 -08:00
Peter Dodd	f4af1cc0ba	feat(helm): annotations for service account (#8429)	2026-02-24 07:35:13 -08:00
Sheya Bernstein	d8b8f0dffd	fix(helm): add missing app.kubernetes.io/instance label to volume service (#8403)	2026-02-22 07:20:38 -08:00
Chris Lu	2a1ae896e4	helm: refine openshift-values.yaml for assigned UID ranges (#8396 ) * helm: refine openshift-values.yaml to remove hardcoded UIDs Remove hardcoded runAsUser, runAsGroup, and fsGroup from the openshift-values.yaml example. This allows OpenShift's admission controller to automatically assign a valid UID from the namespace's allocated range, avoiding "forbidden" errors when UID 1000 is outside the permissible range. Updates #8381, #8390. * helm: fix volume.logs and add consistent security context comments * Update README.md	2026-02-20 12:05:57 -08:00
Richard Chen Zheng	964a8f5fde	Allow user to define access and secret key via values (#8389 ) * Allow user to define admin access and secret key via values * Add comments to values.yaml * Add support for read for consistency * Simplify templating * Add checksum to s3 config * Update comments * Revert "Add checksum to s3 config" This reverts commit d21a7038a86ae2adf547730b2cb6f455dcd4ce70.	2026-02-20 00:37:54 -08:00
Chris Lu	40cc0e04a6	docker: fix entrypoint chown guard; helm: add openshift-values.yaml (#8390 ) * Enforce IAM for s3tables bucket creation * Prefer IAM path when policies exist * Ensure IAM enforcement honors default allow * address comments * Reused the precomputed principal when setting tableBucketMetadata.OwnerAccountID, avoiding the redundant getAccountID call. * get identity * fix * dedup * fix * comments * fix tests * update iam config * go fmt * fix ports * fix flags * mini clean shutdown * Revert "update iam config" This reverts commit ca48fdbb0afa45657823d98657556c0bbf24f239. Revert "mini clean shutdown" This reverts commit 9e17f6baffd5dd7cc404d831d18dd618b9fe5049. Revert "fix flags" This reverts commit e9e7b29d2f77ee5cb82147d50621255410695ee3. Revert "go fmt" This reverts commit bd3241960b1d9484b7900190773b0ecb3f762c9a. * test/s3tables: share single weed mini per test package via TestMain Previously each top-level test function in the catalog and s3tables package started and stopped its own weed mini instance. This caused failures when a prior instance wasn't cleanly stopped before the next one started (port conflicts, leaked global state). Changes: - catalog/iceberg_catalog_test.go: introduce TestMain that starts one shared TestEnvironment (external weed binary) before all tests and tears it down after. All individual test functions now use sharedEnv. Added randomSuffix() for unique resource names across tests. - catalog/pyiceberg_test.go: updated to use sharedEnv instead of per-test environments. - catalog/pyiceberg_test_helpers.go -> pyiceberg_test_helpers_test.go: renamed to a _test.go file so it can access TestEnvironment which is defined in a test file. - table-buckets/setup.go: add package-level sharedCluster variable. - table-buckets/s3tables_integration_test.go: introduce TestMain that starts one shared TestCluster before all tests. TestS3TablesIntegration now uses sharedCluster. Extract startMiniClusterInDir (no testing.T) for TestMain use. 
TestS3TablesCreateBucketIAMPolicy keeps its own cluster (different IAM config). Remove miniClusterMutex (no longer needed). Fix Stop() to not panic when t is nil." delete * parse * default allow should work with anonymous * fix port * iceberg route The failures are from Iceberg REST using the default bucket warehouse when no prefix is provided. Your tests create random buckets, so /v1/namespaces was looking in warehouse and failing. I updated the tests to use the prefixed Iceberg routes (/v1/{bucket}/...) via a small helper. * test(s3tables): fix port conflicts and IAM ARN matching in integration tests - Pass -master.dir explicitly to prevent filer store directory collision between shared cluster and per-test clusters running in the same process - Pass -volume.port.public and -volume.publicUrl to prevent the global publicPort flag (mutated from 0 → concrete port by first cluster) from being reused by a second cluster, causing 'address already in use' - Remove the flag-reset loop in Stop() that reset global flag values while other goroutines were reading them (race → panic) - Fix IAM policy Resource ARN in TestS3TablesCreateBucketIAMPolicy to use wildcards (arn:aws:s3tables:::bucket/<name>) because the handler generates ARNs with its own DefaultRegion (us-east-1) and principal name ('admin'), not the test constants testRegion/testAccountID * docker: fix entrypoint chown guard; helm: add openshift-values.yaml Fix a regression in entrypoint.sh where the DATA_UID/DATA_GID ownership comparison was dropped, causing chown -R /data to run unconditionally on every container start even when ownership was already correct. Restore the guard so the recursive chown is skipped when the seaweed user already owns /data — making startup faster on subsequent runs and a no-op on OpenShift/PVC deployments where fsGroup has already set correct ownership. 
Add k8s/charts/seaweedfs/openshift-values.yaml: an example Helm overrides file for deploying SeaweedFS on OpenShift (or any cluster enforcing the Kubernetes restricted Pod Security Standard). Replaces hostPath volumes with PVCs, sets runAsUser/fsGroup to 1000 (the seaweed user baked into the image), drops all capabilities, disables privilege escalation, and enables RuntimeDefault seccomp — satisfying OpenShift's default restricted SCC without needing a custom SCC or root access. Fixes #8381"	2026-02-20 00:35:42 -08:00
Chris Lu	8ec9ff4a12	Refactor plugin system and migrate worker runtime (#8369 ) * admin: add plugin runtime UI page and route wiring * pb: add plugin gRPC contract and generated bindings * admin/plugin: implement worker registry, runtime, monitoring, and config store * admin/dash: wire plugin runtime and expose plugin workflow APIs * command: add flags to enable plugin runtime * admin: rename remaining plugin v2 wording to plugin * admin/plugin: add detectable job type registry helper * admin/plugin: add scheduled detection and dispatch orchestration * admin/plugin: prefetch job type descriptors when workers connect * admin/plugin: add known job type discovery API and UI * admin/plugin: refresh design doc to match current implementation * admin/plugin: enforce per-worker scheduler concurrency limits * admin/plugin: use descriptor runtime defaults for scheduler policy * admin/ui: auto-load first known plugin job type on page open * admin/plugin: bootstrap persisted config from descriptor defaults * admin/plugin: dedupe scheduled proposals by dedupe key * admin/ui: add job type and state filters for plugin monitoring * admin/ui: add per-job-type plugin activity summary * admin/plugin: split descriptor read API from schema refresh * admin/ui: keep plugin summary metrics global while tables are filtered * admin/plugin: retry executor reservation before timing out * admin/plugin: expose scheduler states for monitoring * admin/ui: show per-job-type scheduler states in plugin monitor * pb/plugin: rename protobuf package to plugin * admin/plugin: rename pluginRuntime wiring to plugin * admin/plugin: remove runtime naming from plugin APIs and UI * admin/plugin: rename runtime files to plugin naming * admin/plugin: persist jobs and activities for monitor recovery * admin/plugin: lease one detector worker per job type * admin/ui: show worker load from plugin heartbeats * admin/plugin: skip stale workers for detector and executor picks * plugin/worker: add plugin worker command 
and stream runtime scaffold * plugin/worker: implement vacuum detect and execute handlers * admin/plugin: document external vacuum plugin worker starter * command: update plugin.worker help to reflect implemented flow * command/admin: drop legacy Plugin V2 label * plugin/worker: validate vacuum job type and respect min interval * plugin/worker: test no-op detect when min interval not elapsed * command/admin: document plugin.worker external process * plugin/worker: advertise configured concurrency in hello * command/plugin.worker: add jobType handler selection * command/plugin.worker: test handler selection by job type * command/plugin.worker: persist worker id in workingDir * admin/plugin: document plugin.worker jobType and workingDir flags * plugin/worker: support cancel request for in-flight work * plugin/worker: test cancel request acknowledgements * command/plugin.worker: document workingDir and jobType behavior * plugin/worker: emit executor activity events for monitor * plugin/worker: test executor activity builder * admin/plugin: send last successful run in detection request * admin/plugin: send cancel request when detect or execute context ends * admin/plugin: document worker cancel request responsibility * admin/handlers: expose plugin scheduler states API in no-auth mode * admin/handlers: test plugin scheduler states route registration * admin/plugin: keep worker id on worker-generated activity records * admin/plugin: test worker id propagation in monitor activities * admin/dash: always initialize plugin service * command/admin: remove plugin enable flags and default to enabled * admin/dash: drop pluginEnabled constructor parameter * admin/plugin UI: stop checking plugin enabled state * admin/plugin: remove docs for plugin enable flags * admin/dash: remove unused plugin enabled check method * admin/dash: fallback to in-memory plugin init when dataDir fails * admin/plugin API: expose worker gRPC port in status * command/plugin.worker: resolve admin gRPC 
port via plugin status * split plugin UI into overview/configuration/monitoring pages * Update layout_templ.go * add volume_balance plugin worker handler * wire plugin.worker CLI for volume_balance job type * add erasure_coding plugin worker handler * wire plugin.worker CLI for erasure_coding job type * support multi-job handlers in plugin worker runtime * allow plugin.worker jobType as comma-separated list * admin/plugin UI: rename to Workers and simplify config view * plugin worker: queue detection requests instead of capacity reject * Update plugin_worker.go * plugin volume_balance: remove force_move/timeout from worker config UI * plugin erasure_coding: enforce local working dir and cleanup * admin/plugin UI: rename admin settings to job scheduling * admin/plugin UI: persist and robustly render detection results * admin/plugin: record and return detection trace metadata * admin/plugin UI: show detection process and decision trace * plugin: surface detector decision trace as activities * mini: start a plugin worker by default * admin/plugin UI: split monitoring into detection and execution tabs * plugin worker: emit detection decision trace for EC and balance * admin workers UI: split monitoring into detection and execution pages * plugin scheduler: skip proposals for active assigned/running jobs * admin workers UI: add job queue tab * plugin worker: add dummy stress detector and executor job type * admin workers UI: reorder tabs to detection queue execution * admin workers UI: regenerate plugin template * plugin defaults: include dummy stress and add stress tests * plugin dummy stress: rotate detection selections across runs * plugin scheduler: remove cross-run proposal dedupe * plugin queue: track pending scheduled jobs * plugin scheduler: wait for executor capacity before dispatch * plugin scheduler: skip detection when waiting backlog is high * plugin: add disk-backed job detail API and persistence * admin ui: show plugin job detail modal from job id links * 
plugin: generate unique job ids instead of reusing proposal ids * plugin worker: emit heartbeats on work state changes * plugin registry: round-robin tied executor and detector picks * add temporary EC overnight stress runner * plugin job details: persist and render EC execution plans * ec volume details: color data and parity shard badges * shard labels: keep parity ids numeric and color-only distinction * admin: remove legacy maintenance UI routes and templates * admin: remove dead maintenance endpoint helpers * Update layout_templ.go * remove dummy_stress worker and command support * refactor plugin UI to job-type top tabs and sub-tabs * migrate weed worker command to plugin runtime * remove plugin.worker command and keep worker runtime with metrics * update helm worker args for jobType and execution flags * set plugin scheduling defaults to global 16 and per-worker 4 * stress: fix RPC context reuse and remove redundant variables in ec_stress_runner * admin/plugin: fix lifecycle races, safe channel operations, and terminal state constants * admin/dash: randomize job IDs and fix priority zero-value overwrite in plugin API * admin/handlers: implement buffered rendering to prevent response corruption * admin/plugin: implement debounced persistence flusher and optimize BuildJobDetail memory lookups * admin/plugin: fix priority overwrite and implement bounded wait in scheduler reserve * admin/plugin: implement atomic file writes and fix run record side effects * admin/plugin: use P prefix for parity shard labels in execution plans * admin/plugin: enable parallel execution for cancellation tests * admin: refactor time.Time fields to pointers for better JSON omitempty support * admin/plugin: implement pointer-safe time assignments and comparisons in plugin core * admin/plugin: fix time assignment and sorting logic in plugin monitor after pointer refactor * admin/plugin: update scheduler activity tracking to use time pointers * admin/plugin: fix time-based run history 
trimming after pointer refactor * admin/dash: fix JobSpec struct literal in plugin API after pointer refactor * admin/view: add D/P prefixes to EC shard badges for UI consistency * admin/plugin: use lifecycle-aware context for schema prefetching * Update ec_volume_details_templ.go * admin/stress: fix proposal sorting and log volume cleanup errors * stress: refine ec stress runner with math/rand and collection name - Added Collection field to VolumeEcShardsDeleteRequest for correct filename construction. - Replaced crypto/rand with seeded math/rand PRNG for bulk payloads. - Added documentation for EcMinAge zero-value behavior. - Added logging for ignored errors in volume/shard deletion. * admin: return internal server error for plugin store failures Changed error status code from 400 Bad Request to 500 Internal Server Error for failures in GetPluginJobDetail to correctly reflect server-side errors. * admin: implement safe channel sends and graceful shutdown sync - Added sync.WaitGroup to Plugin struct to manage background goroutines. - Implemented safeSendCh helper using recover() to prevent panics on closed channels. - Ensured Shutdown() waits for all background operations to complete. * admin: robustify plugin monitor with nil-safe time and record init - Standardized nil-safe assignment for time.Time pointers (CreatedAt, UpdatedAt, CompletedAt). - Ensured persistJobDetailSnapshot initializes new records correctly if they don't exist on disk. - Fixed debounced persistence to trigger immediate write on job completion. admin: improve scheduler shutdown behavior and logic guards - Replaced brittle error string matching with explicit r.shutdownCh selection for shutdown detection. - Removed redundant nil guard in buildScheduledJobSpec. - Standardized WaitGroup usage for schedulerLoop. * admin: implement deep copy for job parameters and atomic write fixes - Implemented deepCopyGenericValue and used it in cloneTrackedJob to prevent shared state. 
- Ensured atomicWriteFile creates parent directories before writing. * admin: remove unreachable branch in shard classification Removed an unreachable 'totalShards <= 0' check in classifyShardID as dataShards and parityShards are already guarded. * admin: secure UI links and use canonical shard constants - Added rel="noopener noreferrer" to external links for security. - Replaced magic number 14 with erasure_coding.TotalShardsCount. - Used renderEcShardBadge for missing shard list consistency. * admin: stabilize plugin tests and fix regressions - Composed a robust plugin_monitor_test.go to handle asynchronous persistence. - Updated all time.Time literals to use timeToPtr helper. - Added explicit Shutdown() calls in tests to synchronize with debounced writes. - Fixed syntax errors and orphaned struct literals in tests. * Potential fix for code scanning alert no. 278: Slice memory allocation with excessive size value Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * Potential fix for code scanning alert no. 283: Uncontrolled data used in path expression Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com> * admin: finalize refinements for error handling, scheduler, and race fixes - Standardized HTTP 500 status codes for store failures in plugin_api.go. - Tracked scheduled detection goroutines with sync.WaitGroup for safe shutdown. - Fixed race condition in safeSendDetectionComplete by extracting channel under lock. - Implemented deep copy for JobActivity details. - Used defaultDirPerm constant in atomicWriteFile. 
* test(ec): migrate admin dockertest to plugin APIs * admin/plugin_api: fix RunPluginJobTypeAPI to return 500 for server-side detection/filter errors * admin/plugin_api: fix ExecutePluginJobAPI to return 500 for job execution failures * admin/plugin_api: limit parseProtoJSONBody request body to 1MB to prevent unbounded memory usage * admin/plugin: consolidate regex to package-level validJobTypePattern; add char validation to sanitizeJobID * admin/plugin: fix racy Shutdown channel close with sync.Once * admin/plugin: track sendLoop and recv goroutines in WorkerStream with r.wg * admin/plugin: document writeProtoFiles atomicity — .pb is source of truth, .json is human-readable only * admin/plugin: extract activityLess helper to deduplicate nil-safe OccurredAt sort comparators * test/ec: check http.NewRequest errors to prevent nil req panics * test/ec: replace deprecated ioutil/math/rand, fix stale step comment 5.1→3.1 * plugin(ec): raise default detection and scheduling throughput limits * topology: include empty disks in volume list and EC capacity fallback * topology: remove hard 10-task cap for detection planning * Update ec_volume_details_templ.go * adjust default * fix tests --------- Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>	2026-02-18 13:42:41 -08:00
Chris Lu	5919f519fd	fix: allow overriding Enterprise image name using Helm #8361 (#8363 ) * fix: allow overriding Enterprise image name using Helm #8361 * refactor: flatten image name construction logic for better readability	2026-02-17 13:49:16 -08:00
Chris Lu	3c3a78d08e	4.13	2026-02-16 17:01:19 -08:00
Lukas	abd681b54b	Fix service name in the worker deployment (seaweedfs#8314) (#8315 ) Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>	2026-02-12 14:22:42 -08:00
Chris Lu	6bd6bba594	Fix inconsistent admin argument in worker pods (#8316 ) * Fix inconsistent admin argument in worker pods * Use seaweedfs.componentName for admin service naming	2026-02-12 09:50:53 -08:00
Chris Lu	af8273386d	4.12	2026-02-09 18:15:19 -08:00
Chris Lu	cb9e21cdc5	Normalize hashicorp raft peer ids (#8253 ) * Normalize raft voter ids * 4.11 * Update raft_hashicorp.go	2026-02-09 07:46:34 -08:00
Chris Lu	5a279c4d2f	fmt	2026-02-08 21:19:00 -08:00
Chris Lu	0c89185291	4.10	2026-02-08 21:16:58 -08:00
Nikita	c44716f9af	helm: add a trafficDistribution field to an s3 service (#8232 ) helm: add trafficDistribution field to s3 service Signed-off-by: nbykov0 <166552198+nbykov0@users.noreply.github.com>	2026-02-06 10:47:39 -08:00
Yalın Doğu Şahin	ef3b5f7efa	helm/add iceberg rest catalog ingress for s3 (#8205 ) * helm: add Iceberg REST catalog support to S3 service * helm: add Iceberg REST catalog support to S3 service * add ingress for iceberg catalog endpoint * helm: conditionally render ingressClassName in s3-iceberg-ingress.yaml * helm: refactor s3-iceberg-ingress.yaml to use named template for paths * helm: remove unused $serviceName variable in s3-iceberg-ingress.yaml --------- Co-authored-by: yalin.sahin <yalin.sahin@tradition.ch> Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-02-04 12:00:59 -08:00
Chris Lu	5a5cc38692	4.09	2026-02-03 17:56:25 -08:00
Yalın Doğu Şahin	47fc9e771f	helm: add Iceberg REST catalog support to S3 service (#8193 ) * helm: add Iceberg REST catalog support to S3 service * helm: add Iceberg REST catalog support to S3 service --------- Co-authored-by: yalin.sahin <yalin.sahin@tradition.ch>	2026-02-03 13:44:52 -08:00
Chris Lu	ba8816e2e1	4.08	2026-02-02 20:36:03 -08:00
Emanuele Leopardi	51ef39fc76	Update Helm hook annotations for post-install and upgrade (#8150 ) * Update Helm hook annotations for post-install and upgrade I believe it makes sense to allow this job to run also after installation. Assuming weed shell is idempotent, and assuming someone wants to add a new bucket after the initial installation, it makes sense to trigger the job again. * Add check for existing buckets before creation * Enhances S3 bucket existence check Improves the reliability of checking for existing S3 buckets in the post-install hook. The previous `grep -w` command could lead to imprecise matches. This update extracts only the bucket name and performs an exact, whole-line match to ensure accurate detection of existing buckets. This prevents potential issues with redundant creation attempts or false negatives. * Currently Bucket Creation is ignored if filer.s3.enabled is disabled This commit enables bucket creation on both scenarios,i.e. if any of filer.s3.enabled or s3.enabled are used. --------- Co-authored-by: Emanuele <emanuele.leopardi@tset.com>	2026-01-28 13:08:20 -08:00
Chris Lu	4f5f1f6be7	refactor(helm): Unified Naming Truncation and Bug Fixes (#8143 ) * refactor(helm): add componentName helper for truncation * fix(helm): unify ingress backend naming with truncation * fix(helm): unify statefulset/deployment naming with truncation * fix(helm): add missing labels to services for servicemonitor discovery * chore(helm): secure secrets and add upgrade notes * fix(helm): truncate context instead of suffix in componentName * revert(docs): remove upgrade notes per feedback * fix(helm): use componentName for COSI serviceAccountName * helm: update master -ip to use component name for correct truncation * helm: refactor masterServers helper to use truncated component names * helm: update volume -ip to use component name and cleanup redundant printf * helm: refine helpers with robustness check and updated docs	2026-01-27 17:45:16 -08:00
MorezMartin	20952aa514	Fix jwt error in admin UI (#8140 ) * add jwt token in weed admin headers requests * add jwt token to header for download * :s/upload/download * filer_signing.read despite of filer_signing key * finalize filer_browser_handlers.go * admin: add JWT authorization to file browser handlers * security: fix typos in JWT read validation descriptions * Move security.toml to example and secure keys * security: address PR feedback on JWT enforcement and example keys * security: refactor JWT logic and improve example keys readability * Update docker/Dockerfile.local Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Chris Lu <chris.lu@gmail.com> Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-01-27 17:27:02 -08:00

314 Commits