121 Commits

Author SHA1 Message Date
Andreas Røste
79f4a4579f feat(k8s): added possibility to specify service.type for multiple ser… (#8372)
* feat(k8s): added possibility to specify service.type for multiple services in helm chart

* fix(k8s): removed headless (clusterIP: None) from services

* fix(k8s): keep master and filer services headless for StatefulSet compatibility

Master and filer services must remain headless (clusterIP: None) because
their StatefulSets reference them via serviceName for stable pod DNS.
Revert the service.type change for these two services and remove their
unused service config from values.yaml. S3 and SFTP remain configurable.

---------

Co-authored-by: Andreas Røste <andreas2101@gmail.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
2026-03-25 11:30:14 -07:00
Chris Lu
5e76f55077 fix(helm): namespace app-specific global values under global.seaweedfs (#8700)
* fix(helm): namespace app-specific values under global.seaweedfs

Move all app-specific values from the global namespace to
global.seaweedfs.* to avoid polluting the shared .Values.global
namespace when the chart is used as a subchart.

Standard Helm conventions (global.imageRegistry, global.imagePullSecrets)
remain at the global level as they are designed to be shared across
subcharts.

Fixes seaweedfs/seaweedfs#8699

BREAKING CHANGE: global values have been restructured. Users must update
their values files to use the new paths:
- global.registry → global.imageRegistry
- global.repository → global.seaweedfs.image.repository
- global.imageName → global.seaweedfs.image.name
- global.<key> → global.seaweedfs.<key> (for all other app-specific values)

* fix(ci): update helm CI tests to use new global.seaweedfs.* value paths

Update all --set flags in helm_ci.yml to use the new namespaced
global.seaweedfs.* paths matching the values.yaml restructuring.

* fix(ci): install Claude Code via npm to avoid install.sh 403

The claude-code-action's built-in installer uses
`curl https://claude.ai/install.sh | bash` which can fail with 403.
Due to the pipe, bash exits 0 on empty input, masking the curl failure
and leaving the `claude` binary missing.

Work around this by installing Claude Code via npm before invoking the
action, and passing the executable path via path_to_claude_code_executable.

* revert: remove claude-code-review.yml changes from this PR

The claude-code-action OIDC token exchange validates that the workflow
file matches the version on the default branch. Modifying it in a PR
causes the review job to fail with "Workflow validation failed".

The Claude Code install fix will need to be applied directly to master
or in a separate PR.

* fix: update stale references to old global.* value paths

- admin-statefulset.yaml: fix fail message to reference
  global.seaweedfs.masterServer
- values.yaml: fix comment to reference image.name instead of imageName
- helm_ci.yml: fix diagnostic message to reference
  global.seaweedfs.enableSecurity

* feat(helm): add backward-compat shim for old global.* value paths

Add _compat.tpl with a seaweedfs.compat helper that detects old-style
global.* keys (e.g. global.enableSecurity, global.registry) and merges
them into the new global.seaweedfs.* namespace.

Since the old keys no longer have defaults in values.yaml, their
presence means the user explicitly provided them. The helper uses
in-place mutation via `set` so all templates see the merged values.

This ensures existing deployments using old value paths continue to
work without changes after upgrading.

* fix: update stale comment references in values.yaml

Update comments referencing global.enableSecurity and global.masterServer
to the new global.seaweedfs.* paths.

---------

Co-authored-by: Copilot <copilot@github.com>
2026-03-19 13:00:48 -07:00
hoppla20
d34da671eb fix(chart): bucket hook (#8680)
* fix(chart): add imagePullPolicy and imagePullSecret to bucket-hook

* chore(chart): add configurable bucket hook resources

* fix(chart): add createBucketsHook value to allInOne and filer s3 blocks
2026-03-18 12:58:29 -07:00
hoppla20
d79e82ee60 fix(chart): missing resources on volume statefulset initContainer (#8678)
* fix(chart): missing resources on volume statefulset initContainer

* chore(chart): use own resources for idx-vol-move initContainer

* chore(chart): improve comment for idxMoveResources value
2026-03-18 12:30:18 -07:00
Copilot
7174760a5d helm: add urlPrefix support for admin UI behind reverse proxy subpath 2026-03-17 18:28:40 -07:00
Chris Lu
6c7fe87a72 helm: add s3.tlsSecret for custom S3 HTTPS certificate (#8582)
* helm: add s3.tlsSecret to allow custom TLS certificate for S3 HTTPS endpoint

Allow users to specify an external Kubernetes TLS secret for the S3
HTTPS endpoint instead of using the internal self-signed client
certificate. This enables using publicly trusted certificates (e.g.
from Let's Encrypt) so S3 clients don't need to trust the internal CA.

The new s3.tlsSecret value is supported in the standalone S3 gateway,
filer with embedded S3, and all-in-one deployment templates.

Closes #8581

* refactor: extract S3 TLS helpers to reduce duplication

Move repeated S3 TLS cert/key logic into shared helper templates
(seaweedfs.s3.tlsArgs, seaweedfs.s3.tlsVolumeMount, seaweedfs.s3.tlsVolume)
in _helpers.tpl, and use them across all three deployment templates.

* helm: add allInOne.s3.trafficDistribution support

Add the missing allInOne.s3.trafficDistribution branch to the
seaweedfs.trafficDistribution helper and wire it into the all-in-one
service template, mirroring the existing s3-service.yaml behavior.
PreferClose is auto-converted to PreferSameZone on k8s >=1.35.

* fix: scope S3 TLS mounts to S3-enabled pods and simplify trafficDistribution helper

- Wrap S3 TLS volume/volumeMount includes in allInOne.s3.enabled and
  filer.s3.enabled guards so the custom TLS secret is only mounted
  when S3 is actually enabled in that deployment mode.
- Refactor seaweedfs.trafficDistribution helper to accept an explicit
  value+Capabilities dict instead of walking multiple .Values paths,
  making each call site responsible for passing its own setting.
2026-03-09 14:24:42 -07:00
Kirill Ilin
ae02d47433 helm: add optional parameters to COSI BucketClass (#8453)
Add cosi.bucketClassParameters to allow passing arbitrary parameters
to the default BucketClass resource. This enables use cases like
tiered storage where a diskType parameter needs to be set on the
BucketClass to route objects to specific volume servers.

When bucketClassParameters is empty (default), the BucketClass is
rendered without a parameters block, preserving backward compatibility.

Signed-off-by: Kirill Ilin <stitch14@yandex.ru>
Co-authored-by: Claude <noreply@anthropic.com>
2026-02-26 12:19:07 -08:00
Chris Lu
9b6fc49946 Chart createBuckets config #8368: Add TTL, Object Lock, and Versioning support (#8375)
* Chart createBuckets config #8368: Add TTL, Object Lock, and Versioning support

* Update weed/shell/command_s3_bucket_versioning.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* address comments

* address comments

* go fmt

* fix failures are still treated like “bucket not found”

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-02-26 11:56:10 -08:00
Peter Dodd
f4af1cc0ba feat(helm): annotations for service account (#8429) 2026-02-24 07:35:13 -08:00
Richard Chen Zheng
964a8f5fde Allow user to define access and secret key via values (#8389)
* Allow user to define admin access and secret key via values

* Add comments to values.yaml

* Add support for read for consistency

* Simplify templating

* Add checksum to s3 config

* Update comments

* Revert "Add checksum to s3 config"

This reverts commit d21a7038a86ae2adf547730b2cb6f455dcd4ce70.
2026-02-20 00:37:54 -08:00
Chris Lu
8ec9ff4a12 Refactor plugin system and migrate worker runtime (#8369)
* admin: add plugin runtime UI page and route wiring

* pb: add plugin gRPC contract and generated bindings

* admin/plugin: implement worker registry, runtime, monitoring, and config store

* admin/dash: wire plugin runtime and expose plugin workflow APIs

* command: add flags to enable plugin runtime

* admin: rename remaining plugin v2 wording to plugin

* admin/plugin: add detectable job type registry helper

* admin/plugin: add scheduled detection and dispatch orchestration

* admin/plugin: prefetch job type descriptors when workers connect

* admin/plugin: add known job type discovery API and UI

* admin/plugin: refresh design doc to match current implementation

* admin/plugin: enforce per-worker scheduler concurrency limits

* admin/plugin: use descriptor runtime defaults for scheduler policy

* admin/ui: auto-load first known plugin job type on page open

* admin/plugin: bootstrap persisted config from descriptor defaults

* admin/plugin: dedupe scheduled proposals by dedupe key

* admin/ui: add job type and state filters for plugin monitoring

* admin/ui: add per-job-type plugin activity summary

* admin/plugin: split descriptor read API from schema refresh

* admin/ui: keep plugin summary metrics global while tables are filtered

* admin/plugin: retry executor reservation before timing out

* admin/plugin: expose scheduler states for monitoring

* admin/ui: show per-job-type scheduler states in plugin monitor

* pb/plugin: rename protobuf package to plugin

* admin/plugin: rename pluginRuntime wiring to plugin

* admin/plugin: remove runtime naming from plugin APIs and UI

* admin/plugin: rename runtime files to plugin naming

* admin/plugin: persist jobs and activities for monitor recovery

* admin/plugin: lease one detector worker per job type

* admin/ui: show worker load from plugin heartbeats

* admin/plugin: skip stale workers for detector and executor picks

* plugin/worker: add plugin worker command and stream runtime scaffold

* plugin/worker: implement vacuum detect and execute handlers

* admin/plugin: document external vacuum plugin worker starter

* command: update plugin.worker help to reflect implemented flow

* command/admin: drop legacy Plugin V2 label

* plugin/worker: validate vacuum job type and respect min interval

* plugin/worker: test no-op detect when min interval not elapsed

* command/admin: document plugin.worker external process

* plugin/worker: advertise configured concurrency in hello

* command/plugin.worker: add jobType handler selection

* command/plugin.worker: test handler selection by job type

* command/plugin.worker: persist worker id in workingDir

* admin/plugin: document plugin.worker jobType and workingDir flags

* plugin/worker: support cancel request for in-flight work

* plugin/worker: test cancel request acknowledgements

* command/plugin.worker: document workingDir and jobType behavior

* plugin/worker: emit executor activity events for monitor

* plugin/worker: test executor activity builder

* admin/plugin: send last successful run in detection request

* admin/plugin: send cancel request when detect or execute context ends

* admin/plugin: document worker cancel request responsibility

* admin/handlers: expose plugin scheduler states API in no-auth mode

* admin/handlers: test plugin scheduler states route registration

* admin/plugin: keep worker id on worker-generated activity records

* admin/plugin: test worker id propagation in monitor activities

* admin/dash: always initialize plugin service

* command/admin: remove plugin enable flags and default to enabled

* admin/dash: drop pluginEnabled constructor parameter

* admin/plugin UI: stop checking plugin enabled state

* admin/plugin: remove docs for plugin enable flags

* admin/dash: remove unused plugin enabled check method

* admin/dash: fallback to in-memory plugin init when dataDir fails

* admin/plugin API: expose worker gRPC port in status

* command/plugin.worker: resolve admin gRPC port via plugin status

* split plugin UI into overview/configuration/monitoring pages

* Update layout_templ.go

* add volume_balance plugin worker handler

* wire plugin.worker CLI for volume_balance job type

* add erasure_coding plugin worker handler

* wire plugin.worker CLI for erasure_coding job type

* support multi-job handlers in plugin worker runtime

* allow plugin.worker jobType as comma-separated list

* admin/plugin UI: rename to Workers and simplify config view

* plugin worker: queue detection requests instead of capacity reject

* Update plugin_worker.go

* plugin volume_balance: remove force_move/timeout from worker config UI

* plugin erasure_coding: enforce local working dir and cleanup

* admin/plugin UI: rename admin settings to job scheduling

* admin/plugin UI: persist and robustly render detection results

* admin/plugin: record and return detection trace metadata

* admin/plugin UI: show detection process and decision trace

* plugin: surface detector decision trace as activities

* mini: start a plugin worker by default

* admin/plugin UI: split monitoring into detection and execution tabs

* plugin worker: emit detection decision trace for EC and balance

* admin workers UI: split monitoring into detection and execution pages

* plugin scheduler: skip proposals for active assigned/running jobs

* admin workers UI: add job queue tab

* plugin worker: add dummy stress detector and executor job type

* admin workers UI: reorder tabs to detection queue execution

* admin workers UI: regenerate plugin template

* plugin defaults: include dummy stress and add stress tests

* plugin dummy stress: rotate detection selections across runs

* plugin scheduler: remove cross-run proposal dedupe

* plugin queue: track pending scheduled jobs

* plugin scheduler: wait for executor capacity before dispatch

* plugin scheduler: skip detection when waiting backlog is high

* plugin: add disk-backed job detail API and persistence

* admin ui: show plugin job detail modal from job id links

* plugin: generate unique job ids instead of reusing proposal ids

* plugin worker: emit heartbeats on work state changes

* plugin registry: round-robin tied executor and detector picks

* add temporary EC overnight stress runner

* plugin job details: persist and render EC execution plans

* ec volume details: color data and parity shard badges

* shard labels: keep parity ids numeric and color-only distinction

* admin: remove legacy maintenance UI routes and templates

* admin: remove dead maintenance endpoint helpers

* Update layout_templ.go

* remove dummy_stress worker and command support

* refactor plugin UI to job-type top tabs and sub-tabs

* migrate weed worker command to plugin runtime

* remove plugin.worker command and keep worker runtime with metrics

* update helm worker args for jobType and execution flags

* set plugin scheduling defaults to global 16 and per-worker 4

* stress: fix RPC context reuse and remove redundant variables in ec_stress_runner

* admin/plugin: fix lifecycle races, safe channel operations, and terminal state constants

* admin/dash: randomize job IDs and fix priority zero-value overwrite in plugin API

* admin/handlers: implement buffered rendering to prevent response corruption

* admin/plugin: implement debounced persistence flusher and optimize BuildJobDetail memory lookups

* admin/plugin: fix priority overwrite and implement bounded wait in scheduler reserve

* admin/plugin: implement atomic file writes and fix run record side effects

* admin/plugin: use P prefix for parity shard labels in execution plans

* admin/plugin: enable parallel execution for cancellation tests

* admin: refactor time.Time fields to pointers for better JSON omitempty support

* admin/plugin: implement pointer-safe time assignments and comparisons in plugin core

* admin/plugin: fix time assignment and sorting logic in plugin monitor after pointer refactor

* admin/plugin: update scheduler activity tracking to use time pointers

* admin/plugin: fix time-based run history trimming after pointer refactor

* admin/dash: fix JobSpec struct literal in plugin API after pointer refactor

* admin/view: add D/P prefixes to EC shard badges for UI consistency

* admin/plugin: use lifecycle-aware context for schema prefetching

* Update ec_volume_details_templ.go

* admin/stress: fix proposal sorting and log volume cleanup errors

* stress: refine ec stress runner with math/rand and collection name

- Added Collection field to VolumeEcShardsDeleteRequest for correct filename construction.
- Replaced crypto/rand with seeded math/rand PRNG for bulk payloads.
- Added documentation for EcMinAge zero-value behavior.
- Added logging for ignored errors in volume/shard deletion.

* admin: return internal server error for plugin store failures

Changed error status code from 400 Bad Request to 500 Internal Server Error for failures in GetPluginJobDetail to correctly reflect server-side errors.

* admin: implement safe channel sends and graceful shutdown sync

- Added sync.WaitGroup to Plugin struct to manage background goroutines.
- Implemented safeSendCh helper using recover() to prevent panics on closed channels.
- Ensured Shutdown() waits for all background operations to complete.

* admin: robustify plugin monitor with nil-safe time and record init

- Standardized nil-safe assignment for *time.Time pointers (CreatedAt, UpdatedAt, CompletedAt).
- Ensured persistJobDetailSnapshot initializes new records correctly if they don't exist on disk.
- Fixed debounced persistence to trigger immediate write on job completion.

* admin: improve scheduler shutdown behavior and logic guards

- Replaced brittle error string matching with explicit r.shutdownCh selection for shutdown detection.
- Removed redundant nil guard in buildScheduledJobSpec.
- Standardized WaitGroup usage for schedulerLoop.

* admin: implement deep copy for job parameters and atomic write fixes

- Implemented deepCopyGenericValue and used it in cloneTrackedJob to prevent shared state.
- Ensured atomicWriteFile creates parent directories before writing.

* admin: remove unreachable branch in shard classification

Removed an unreachable 'totalShards <= 0' check in classifyShardID as dataShards and parityShards are already guarded.

* admin: secure UI links and use canonical shard constants

- Added rel="noopener noreferrer" to external links for security.
- Replaced magic number 14 with erasure_coding.TotalShardsCount.
- Used renderEcShardBadge for missing shard list consistency.

* admin: stabilize plugin tests and fix regressions

- Composed a robust plugin_monitor_test.go to handle asynchronous persistence.
- Updated all time.Time literals to use timeToPtr helper.
- Added explicit Shutdown() calls in tests to synchronize with debounced writes.
- Fixed syntax errors and orphaned struct literals in tests.

* Potential fix for code scanning alert no. 278: Slice memory allocation with excessive size value

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Potential fix for code scanning alert no. 283: Uncontrolled data used in path expression

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* admin: finalize refinements for error handling, scheduler, and race fixes

- Standardized HTTP 500 status codes for store failures in plugin_api.go.
- Tracked scheduled detection goroutines with sync.WaitGroup for safe shutdown.
- Fixed race condition in safeSendDetectionComplete by extracting channel under lock.
- Implemented deep copy for JobActivity details.
- Used defaultDirPerm constant in atomicWriteFile.

* test(ec): migrate admin dockertest to plugin APIs

* admin/plugin_api: fix RunPluginJobTypeAPI to return 500 for server-side detection/filter errors

* admin/plugin_api: fix ExecutePluginJobAPI to return 500 for job execution failures

* admin/plugin_api: limit parseProtoJSONBody request body to 1MB to prevent unbounded memory usage

* admin/plugin: consolidate regex to package-level validJobTypePattern; add char validation to sanitizeJobID

* admin/plugin: fix racy Shutdown channel close with sync.Once

* admin/plugin: track sendLoop and recv goroutines in WorkerStream with r.wg

* admin/plugin: document writeProtoFiles atomicity — .pb is source of truth, .json is human-readable only

* admin/plugin: extract activityLess helper to deduplicate nil-safe OccurredAt sort comparators

* test/ec: check http.NewRequest errors to prevent nil req panics

* test/ec: replace deprecated ioutil/math/rand, fix stale step comment 5.1→3.1

* plugin(ec): raise default detection and scheduling throughput limits

* topology: include empty disks in volume list and EC capacity fallback

* topology: remove hard 10-task cap for detection planning

* Update ec_volume_details_templ.go

* adjust default

* fix tests

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2026-02-18 13:42:41 -08:00
Yalın Doğu Şahin
ef3b5f7efa helm/add iceberg rest catalog ingress for s3 (#8205)
* helm: add Iceberg REST catalog support to S3 service

* helm: add Iceberg REST catalog support to S3 service

* add ingress for iceberg catalog endpoint

* helm: conditionally render ingressClassName in s3-iceberg-ingress.yaml

* helm: refactor s3-iceberg-ingress.yaml to use named template for paths

* helm: remove unused $serviceName variable in s3-iceberg-ingress.yaml

---------

Co-authored-by: yalin.sahin <yalin.sahin@tradition.ch>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
2026-02-04 12:00:59 -08:00
Yalın Doğu Şahin
47fc9e771f helm: add Iceberg REST catalog support to S3 service (#8193)
* helm: add Iceberg REST catalog support to S3 service

* helm: add Iceberg REST catalog support to S3 service

---------

Co-authored-by: yalin.sahin <yalin.sahin@tradition.ch>
2026-02-03 13:44:52 -08:00
Emanuele Leopardi
51ef39fc76 Update Helm hook annotations for post-install and upgrade (#8150)
* Update Helm hook annotations for post-install and upgrade

I believe it makes sense to allow this job to run also after installation. Assuming weed shell is idempotent, and assuming someone wants to add a new bucket after the initial installation, it makes sense to trigger the job again.

* Add check for existing buckets before creation

* Enhances S3 bucket existence check

Improves the reliability of checking for existing S3 buckets in the post-install hook.

The previous `grep -w` command could lead to imprecise matches. This update extracts only the bucket name and performs an exact, whole-line match to ensure accurate detection of existing buckets. This prevents potential issues with redundant creation attempts or false negatives.

* Currently Bucket Creation is ignored if filer.s3.enabled is disabled

This commit enables bucket creation on both scenarios,i.e. if any of filer.s3.enabled or s3.enabled are used.

---------

Co-authored-by: Emanuele <emanuele.leopardi@tset.com>
2026-01-28 13:08:20 -08:00
Yalın Doğu Şahin
d345752e3d Feature/volume ingress (#8084) 2026-01-22 06:48:29 -08:00
Sheya Bernstein
8740a087b9 fix: apply tpl function to all component extraEnvironmentVars (#8001) 2026-01-11 12:14:16 -08:00
MorezMartin
629d9479a1 Fix jwt error in Filer pod (k8s) (#7960)
* Avoid JWT error on liveprobeness

* fix jwt error

* address comments

* lint

---------

Co-authored-by: Chris Lu <chris.lu@gmail.com>
2026-01-04 12:05:31 -08:00
Sheya Bernstein
c0188db7cc chart: Set admin metrics port to http port (#7936)
* chart: Set admin metrics port to http port

* remove metrics reference
2026-01-02 12:15:33 -08:00
Sheya Bernstein
6f28cb7f87 helm: Support multiple hosts for S3 ingress (#7931) 2026-01-01 07:41:53 -08:00
Chris Lu
31a4f57cd9 Fix: Add -admin.grpc flag to worker for explicit gRPC port (#7926) (#7927)
* Fix: Add -admin.grpc flag to worker for explicit gRPC port configuration

* Fix(helm): Add adminGrpcServer to worker configuration

* Refactor: Support host:port.grpcPort address format, revert -admin.grpc flag

* Helm: Conditionally append grpcPort to worker admin address

* weed/admin: fix "send on closed channel" panic in worker gRPC server

Make unregisterWorker connection-aware to prevent closing channels
belonging to newer connections.

* weed/worker: improve gRPC client stability and logging

- Fix goroutine leak in reconnection logic
- Refactor reconnection loop to exit on success and prevent busy-waiting
- Add session identification and enhanced logging to client handlers
- Use constant for internal reset action and remove unused variables

* weed/worker: fix worker state initialization and add lifecycle logs

- Revert workerState to use running boolean correctly
- Prevent handleStart failing by checking running state instead of startTime
- Add more detailed logs for worker startup events
2025-12-31 11:55:09 -08:00
Sheya Bernstein
915a7d4a54 feat: Add probes to worker service (#7896)
* feat: Add probes to worker service

* feat: Add probes to worker service

* Merge branch 'master' into pr/7896

* refactor

---------

Co-authored-by: Chris Lu <chris.lu@gmail.com>
2025-12-27 13:40:05 -08:00
Sheya Bernstein
911aca74f3 Support volume server ID in Helm chart (#7867)
helm: Support volume server ID
2025-12-24 10:52:40 -08:00
Chris Lu
88ed187c27 fix(worker): add metrics HTTP server and health checks for Kubernetes (#7860)
* feat(worker): add metrics HTTP server and debug profiling support

- Add -metricsPort flag to enable Prometheus metrics endpoint
- Add -metricsIp flag to configure metrics server bind address
- Implement /metrics endpoint for Prometheus-compatible metrics
- Implement /health endpoint for Kubernetes readiness/liveness probes
- Add -debug flag to enable pprof debugging server
- Add -debug.port flag to configure debug server port
- Fix stats package import naming conflict by using alias
- Update usage examples to show new flags

Fixes #7843

* feat(helm): add worker metrics and health check support

- Update worker readiness probe to use httpGet on /health endpoint
- Update worker liveness probe to use httpGet on /health endpoint
- Add metricsPort flag to worker command in deployment template
- Support both httpGet and tcpSocket probe types for backward compatibility
- Update values.yaml with health check configuration

This enables Kubernetes pod lifecycle management for worker components through
proper health checks on the new metrics HTTP endpoint.

* feat(mini): align all services to share single debug and metrics servers

- Disable S3's separate debug server in mini mode (port 6060 now shared by all)
- Add metrics server startup to embedded worker for health monitoring
- All services now share the single metrics port (9327) and single debug port (6060)
- Consistent pattern with master, filer, volume, webdav services

* fix(worker): fix variable shadowing in health check handler

- Rename http.ResponseWriter parameter from 'w' to 'rw' to avoid shadowing
  the outer 'w *worker.Worker' parameter
- Prevents potential bugs if future code tries to use worker state in handler
- Improves code clarity and follows Go best practices

* fix(worker): remove unused worker parameter in metrics server

- Change 'w *worker.Worker' parameter to '_' as it's not used
- Clarifies intent that parameter is intentionally unused
- Follows Go best practices and improves code clarity

* fix(helm): fix trailing backslash syntax errors in worker command

- Fix conditional backslash placement to prevent shell syntax errors
- Only add backslash when metricsPort OR extraArgs are present
- Prevents worker pod startup failures due to malformed command arguments
- Ensures proper shell command parsing regardless of configuration state

* refactor(worker): use standard stats.StartMetricsServer for consistency

- Replace custom metrics server implementation with stats.StartMetricsServer
  to match pattern used in master, volume, s3, filer_sync components
- Simplifies code and improves maintainability
- Uses glog.Fatal for errors (consistent with other SeaweedFS components)
- Remove unused net/http and prometheus/promhttp imports
- Automatically provides /metrics and /health endpoints via standard implementation
2025-12-23 11:46:34 -08:00
Chris Lu
4f382b77c8 helm: fix admin secret template paths and remove duplicate (#7690)
* add admin and worker to helm charts

* workers are stateless, admin is stateful

* removed the duplicate admin-deployment.yaml

* address comments

* address comments

* purge

* Update README.md

* Update k8s/charts/seaweedfs/templates/admin/admin-ingress.yaml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* address comments

* address comments

* supports Kubernetes versions from v1.14 to v1.30+, ensuring broad compatibility

* add probe for workers

* address comments

* add a todo

* chore: trigger CI

* use port name for probes in admin statefulset

* add secrets to admin helm chart

* fix error .Values.admin.secret.existingSecret

* helm: fix admin secret template paths and remove duplicate

- Fix value paths to use .Values.admin.secret.existingSecret instead of .Values.existingSecret
- Use templated secret name {{ template "seaweedfs.name" . }}-admin-secret
- Add .Values.admin.enabled check to admin-secret.yaml
- Remove duplicate admin-secret.yaml from templates/ root

* helm: address PR review feedback

- Only pass adminUser/adminPassword args when auth is enabled (fixes regression)
- Use $adminSecretName variable to reduce duplication (DRY)
- Only create admin-secret when adminPassword is set
- Add documentation comments for existingSecret, userKey, pwKey fields
- Clarify that empty adminPassword disables authentication

* helm: quote admin credentials to handle spaces

* helm: fix yaml lint errors (comment spacing, trailing blank line)

* helm: add validation for existingSecret requiring userKey and pwKey

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Ubuntu <morez.martin@gmail.com>
2025-12-09 16:57:11 -08:00
Chris Lu
80c7de8d76 Helm Charts: add admin and worker to helm charts (#7688)
* add admin and worker to helm charts

* workers are stateless, admin is stateful

* removed the duplicate admin-deployment.yaml

* address comments

* address comments

* purge

* Update README.md

* Update k8s/charts/seaweedfs/templates/admin/admin-ingress.yaml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* address comments

* address comments

* supports Kubernetes versions from v1.14 to v1.30+, ensuring broad compatibility

* add probe for workers

* address comments

* add a todo

* chore: trigger CI

* use port name for probes in admin statefulset

* fix: remove trailing blank line in values.yaml

* address code review feedback

- Quote admin credentials in shell command to handle special characters
- Remove unimplemented capabilities (remote, replication) from worker defaults
- Add security note about admin password character restrictions

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-12-09 16:34:07 -08:00
chrislu
e28629bb5f reduce minFreeSpacePercent to 1
addressing https://github.com/seaweedfs/seaweedfs/issues/7110#issuecomment-3622472545
2025-12-08 01:34:53 -08:00
chrislu
5167bbd2a9 Remove deprecated allowEmptyFolder CLI option
The allowEmptyFolder option is no longer functional because:
1. The code that used it was already commented out
2. Empty folder cleanup is now handled asynchronously by EmptyFolderCleaner

The CLI flags are kept for backward compatibility but marked as deprecated
and ignored. This removes:
- S3ApiServerOption.AllowEmptyFolder field
- The actual usage in s3api_object_handlers_list.go
- Helm chart values and template references
- References in test Makefiles and docker-compose files
2025-12-06 21:54:12 -08:00
Chris Lu
62a83ed469 helm: enhance all-in-one deployment configuration (#7639)
* helm: enhance all-in-one deployment configuration

Fixes #7110

This PR addresses multiple issues with the all-in-one Helm chart configuration:

## New Features

### Configurable Replicas
- Added `allInOne.replicas` (was hardcoded to 1)

### S3 Gateway Configuration
- Added full S3 config under `allInOne.s3`:
  - port, httpsPort, domainName, allowEmptyFolder
  - enableAuth, existingConfigSecret, auditLogConfig
  - createBuckets for declarative bucket creation

### SFTP Server Configuration
- Added full SFTP config under `allInOne.sftp`:
  - port, sshPrivateKey, hostKeysFolder, authMethods
  - maxAuthTries, bannerMessage, loginGraceTime
  - clientAliveInterval, clientAliveCountMax, enableAuth

### Command Line Arguments
- Added `allInOne.extraArgs` for custom CLI arguments

### Update Strategy
- Added `allInOne.updateStrategy.type` (Recreate/RollingUpdate)

### Secret Environment Variables
- Added `allInOne.secretExtraEnvironmentVars` for injecting secrets

### Ingress Support
- Added `allInOne.ingress` with S3, filer, and master sub-configs

### Storage Options
- Enhanced `allInOne.data` with existingClaim support
- Added PVC template for persistentVolumeClaim type

## CI Enhancements
- Added comprehensive tests for all-in-one configurations
- Tests cover replicas, S3, SFTP, extraArgs, strategies, PVC, ingress

* helm: add real cluster deployment tests to CI

- Deploy all-in-one cluster with S3 enabled on kind cluster
- Test Master API (/cluster/status endpoint)
- Test Filer API (file upload/download)
- Test S3 API (/status endpoint)
- Test S3 operations with AWS CLI:
  - Create/delete buckets
  - Upload/download/delete objects
  - Verify file content integrity

* helm: simplify CI and remove all-in-one ingress

Address review comments:
- Remove detailed all-in-one template rendering tests from CI
- Remove real cluster deployment tests from CI
- Remove all-in-one ingress template and values configuration

Keep the core improvements:
- allInOne.replicas configuration
- allInOne.s3.* full configuration
- allInOne.sftp.* full configuration
- allInOne.extraArgs support
- allInOne.updateStrategy configuration
- allInOne.secretExtraEnvironmentVars support

* helm: address review comments

- Fix post-install-bucket-hook.yaml: add filer.s3.enableAuth and
  filer.s3.existingConfigSecret to or statements for consistency
- Fix all-in-one-deployment.yaml: use default function for s3.domainName
- Fix all-in-one-deployment.yaml: use hasKey function for s3.allowEmptyFolder

* helm: clarify updateStrategy multi-replica behavior

Expand comment to warn users that RollingUpdate with multiple replicas
requires shared storage (ReadWriteMany) to avoid data loss.

* helm: address gemini-code-assist review comments

- Make PVC accessModes configurable to support ReadWriteMany for
  multi-replica deployments (defaults to ReadWriteOnce)
- Use configured readiness probe paths in post-install bucket hook
  instead of hardcoded paths, respecting custom configurations

* helm: simplify allowEmptyFolder logic using coalesce

Use coalesce function for cleaner template code as suggested in review.

* helm: fix extraArgs trailing backslash issue

Remove trailing backslash after the last extraArgs argument to avoid
shell syntax error. Use counter to only add backslash between arguments.

* helm: fix fallback logic for allInOne s3/sftp configuration

Changes:
- Set allInOne.s3.* and allInOne.sftp.* override parameters to null by default
  This allows proper inheritance from global s3.* and sftp.* settings
- Fix allowEmptyFolder logic to use explicit nil checking instead of coalesce
  The coalesce/default functions treat 'false' as empty, causing incorrect
  fallback behavior when users want to explicitly set false values

Addresses review feedback about default value conflicts with fallback logic.

* helm: fix exec in bucket creation loop causing premature termination

Remove 'exec' from the range loops that create and configure S3 buckets.
The exec command replaces the current shell process, causing the script
to terminate after the first bucket, preventing creation/configuration
of subsequent buckets.

* helm: quote extraArgs to handle arguments with spaces

Use the quote function to ensure each item in extraArgs is treated as
a single, complete argument even if it contains spaces.

* helm: make s3/filer ingress work for both normal and all-in-one modes

Modified s3-ingress.yaml and filer-ingress.yaml to dynamically select
the service name based on deployment mode:
- Normal mode: points to seaweedfs-s3 / seaweedfs-filer services
- All-in-one mode: points to seaweedfs-all-in-one service

This eliminates the need for separate all-in-one ingress templates.
Users can now use the standard s3.ingress and filer.ingress settings
for both deployment modes.

* helm: fix allInOne.data.size and storageClass to use null defaults

Change size and storageClass from empty strings to null so the template
defaults (10Gi for size, cluster default for storageClass) will apply
correctly. Empty strings prevent the Helm | default function from working.

* helm: fix S3 ingress to include standalone S3 gateway case

Add s3.enabled check to the $s3Enabled logic so the ingress works for:
1. Standalone S3 gateway (s3.enabled)
2. S3 on Filer (filer.s3.enabled) when not in all-in-one mode
3. S3 in all-in-one mode (allInOne.s3.enabled)
2025-12-06 18:54:28 -08:00
Chris Lu
3183a49698 fix: S3 downloads failing after idle timeout (#7626)
* fix: S3 downloads failing after idle timeout (#7618)

The idle timeout was incorrectly terminating active downloads because
read and write deadlines were managed independently. During a download,
the server writes data but rarely reads, so the read deadline would
expire even though the connection was actively being used.

Changes:
1. Simplify to single Timeout field - since this is a 'no activity timeout'
   where any activity extends the deadline, separate read/write timeouts
   are unnecessary. Now uses SetDeadline() which sets both at once.

2. Implement proper 'no activity timeout' - any activity (read or write)
   now extends the deadline. The connection only times out when there's
   genuinely no activity in either direction.

3. Increase default S3 idleTimeout from 10s to 120s for additional safety
   margin when fetching chunks from slow storage backends.

Fixes #7618

* Update weed/util/net_timeout.go

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-04 18:31:46 -08:00
Chris Lu
268cc84e8c [helm] Fix liveness/readiness probe scheme path in templates (#7616)
Fix the templates to read scheme from httpGet.scheme instead of the
probe level, matching the structure defined in values.yaml.

This ensures that changing *.livenessProbe.httpGet.scheme or
*.readinessProbe.httpGet.scheme in values.yaml now correctly affects
the rendered manifests.

Affected components: master, filer, volume, s3, all-in-one

Fixes #7615
2025-12-03 18:53:06 -08:00
IvanHunters
e5521673eb Helm Charts: add certificate duration and renewBefore options (#7563)
* Helm Charts: add certificate duration and renewBefore options

Signed-off-by: ohotnikov.ivan <ohotnikov.ivan@e-queo.net>

* use .Values.global.certificates instead

certificates ca

---------

Signed-off-by: ohotnikov.ivan <ohotnikov.ivan@e-queo.net>
Co-authored-by: ohotnikov.ivan <ohotnikov.ivan@e-queo.net>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
2025-11-27 14:22:20 -08:00
Chris Lu
f00cd38393 certificates ca 2025-11-27 14:17:37 -08:00
Federico A. Corazza
17b23f61e1 Don't make nginx the default ingress controller (#7436) 2025-11-04 13:44:29 -08:00
chrislu
f82c69b9a5 revert back s3 in helm chart to false
fix https://github.com/seaweedfs/seaweedfs/issues/7375
2025-10-27 17:23:31 -07:00
chrislu
4b76b2ad3c fix lint 2025-10-26 23:20:20 -07:00
Philipp Kraus
bf29963f75 ingress config (#7319)
* ingress config

* fixing issues

* prefix path type

For the S3 ingress path /, using pathType: Prefix is more explicit and standard-compliant for matching all subpaths. While ImplementationSpecific might work similarly with your current Ingress controller (often defaulting to a prefix match when use-regex is not enabled), Prefix clearly states the intent and improves portability across different Ingress controllers.

---------

Co-authored-by: Philipp Kraus <philipp.kraus@flashpixx.de>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
2025-10-16 12:40:31 -07:00
Andrei Kvapil
d0a338684c Helm: allow specifying extraArgs for s3 (#7294)
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
2025-10-08 14:26:52 -07:00
chrislu
b3a401d9f9 setting the nodeSelector defaults to empty for all components, so pods can schedule on any compatible node architecture.
fix https://github.com/seaweedfs/seaweedfs/issues/7215
2025-09-09 08:07:37 -07:00
Benjamin Reed
b3b1316b54 fix missing support for .Values.global.repository (#7195)
* fix missing support for .Values.global.repository

* rework based on gemini feedback to handle repository+imageName more cleanly

* use base rather than last + splitList
2025-09-04 22:28:21 -07:00
Chris Lu
87fe03f2c4 k8s: resizeHook avoids bitnami in values.yaml (#7181)
Update values.yaml
2025-08-29 21:14:44 -07:00
Devin Lauderdale
92cebe12f0 chore: remove default replica count for all-in-one deployment (#7111) 2025-08-07 21:18:17 -07:00
Andrei Kvapil
39b574f3c5 [cosi] Update sidecar (#6993) 2025-07-16 13:51:30 -07:00
Andrei Kvapil
660941138b Introduce named volumes in Helm chart (#6972) 2025-07-14 11:00:02 -07:00
Yixing Cheng
5a7d226d93 chore: keep master statefulSet chart up-to-date (#6903)
This patch adds some missing master options to the helm chart of master statefulSet.
2025-06-20 17:30:17 -07:00
Chris Lu
2b3385e201 Helm Charts: add ip bind for filer (#6902)
add ip bind for filer

fix https://github.com/seaweedfs/seaweedfs/issues/6900
2025-06-20 10:46:57 -07:00
Chris Lu
f52134f9a1 adding metricsIp in Helm chart (#6897) 2025-06-19 22:52:19 -07:00
Mohamed Sekour
27a392f706 Fix sftp performances and add seaweedfs all-in-one deployment (#6792)
* improve perfs & fix rclone & refactoring
Signed-off-by: Mohamed Sekour <mohamed.sekour@exfo.com>

* improve perfs on download + add seaweedfs all-in-one deployment

Signed-off-by: Mohamed Sekour <mohamed.sekour@exfo.com>

* use helper for topologySpreadConstraints and fix create home dir of sftp users

Signed-off-by: Mohamed Sekour <mohamed.sekour@exfo.com>

* fix helm lint

Signed-off-by: Mohamed Sekour <mohamed.sekour@exfo.com>

* add missing ctx param

Signed-off-by: Mohamed Sekour <mohamed.sekour@exfo.com>

---------

Signed-off-by: Mohamed Sekour <mohamed.sekour@exfo.com>
2025-05-26 00:50:48 -07:00
Mohamed Sekour
93aed187e9 Add SFTP Server Support (#6753)
* Add SFTP Server Support

Signed-off-by: Mohamed Sekour <mohamed.sekour@exfo.com>

* fix s3 tests and helm lint

Signed-off-by: Mohamed Sekour <mohamed.sekour@exfo.com>

* increase  helm chart version

* adjust version

---------

Signed-off-by: Mohamed Sekour <mohamed.sekour@exfo.com>
Co-authored-by: chrislu <chris.lu@gmail.com>
2025-05-05 11:43:49 -07:00
klinch0
ffe6d928e3 feature/add-cosi-resources (#6638) 2025-03-17 07:32:17 -07:00
Manuel Leonhardt
7766e9729f Fix typos and YAML syntax issues (#6628)
* chore: remove trailing colon

Fixes a typo that might confuse users who simply uncomment or copy the
example, leading them to encounter invalid YAML.

* fix: using seaweedfs-s3-secret as default secret for COSI deployment

The default secret name containing the seaweedfs_s3_config secret key
is called "seaweedfs-s3-secret" throughout the configuration. This fix
ensures the COSI driver deployment uses the same consistent name.

* chore!: fix typo

BREAKING CHANGE: Changes name of key in helm-values.
2025-03-13 09:19:22 -07:00