18 Commits

Author SHA1 Message Date
Chris Lu
5e76f55077 fix(helm): namespace app-specific global values under global.seaweedfs (#8700)
* fix(helm): namespace app-specific values under global.seaweedfs

Move all app-specific values from the global namespace to
global.seaweedfs.* to avoid polluting the shared .Values.global
namespace when the chart is used as a subchart.

Standard Helm conventions (global.imageRegistry, global.imagePullSecrets)
remain at the global level as they are designed to be shared across
subcharts.

Fixes seaweedfs/seaweedfs#8699

BREAKING CHANGE: global values have been restructured. Users must update
their values files to use the new paths:
- global.registry → global.imageRegistry
- global.repository → global.seaweedfs.image.repository
- global.imageName → global.seaweedfs.image.name
- global.<key> → global.seaweedfs.<key> (for all other app-specific values)

* fix(ci): update helm CI tests to use new global.seaweedfs.* value paths

Update all --set flags in helm_ci.yml to use the new namespaced
global.seaweedfs.* paths matching the values.yaml restructuring.

* fix(ci): install Claude Code via npm to avoid install.sh 403

The claude-code-action's built-in installer uses
`curl https://claude.ai/install.sh | bash` which can fail with 403.
Due to the pipe, bash exits 0 on empty input, masking the curl failure
and leaving the `claude` binary missing.

Work around this by installing Claude Code via npm before invoking the
action, and passing the executable path via path_to_claude_code_executable.

* revert: remove claude-code-review.yml changes from this PR

The claude-code-action OIDC token exchange validates that the workflow
file matches the version on the default branch. Modifying it in a PR
causes the review job to fail with "Workflow validation failed".

The Claude Code install fix will need to be applied directly to master
or in a separate PR.

* fix: update stale references to old global.* value paths

- admin-statefulset.yaml: fix fail message to reference
  global.seaweedfs.masterServer
- values.yaml: fix comment to reference image.name instead of imageName
- helm_ci.yml: fix diagnostic message to reference
  global.seaweedfs.enableSecurity

* feat(helm): add backward-compat shim for old global.* value paths

Add _compat.tpl with a seaweedfs.compat helper that detects old-style
global.* keys (e.g. global.enableSecurity, global.registry) and merges
them into the new global.seaweedfs.* namespace.

Since the old keys no longer have defaults in values.yaml, their
presence means the user explicitly provided them. The helper uses
in-place mutation via `set` so all templates see the merged values.

This ensures existing deployments using old value paths continue to
work without changes after upgrading.

* fix: update stale comment references in values.yaml

Update comments referencing global.enableSecurity and global.masterServer
to the new global.seaweedfs.* paths.

---------

Co-authored-by: Copilot <copilot@github.com>
2026-03-19 13:00:48 -07:00
Jayshan Raghunandan
1f1eac4f08 feat: improve aio support for admin/volume ingress and fix UI links (#8679)
* feat: improve allInOne mode support for admin/volume ingress and fix master UI links

- Add allInOne support to admin ingress template, matching the pattern
  used by filer and s3 ingress templates (or-based enablement with
  ternary service name selection)
- Add allInOne support to volume ingress template, which previously
  required volume.enabled even when the volume server runs within the
  allInOne pod
- Expose admin ports in allInOne deployment and service when
  allInOne.admin.enabled is set
- Add allInOne.admin config section to values.yaml (enabled by default,
  ports inherit from admin.*)
- Fix legacy master UI templates (master.html, masterNewRaft.html) to
  prefer PublicUrl over internal Url when linking to volume server UI.
  The new admin UI already handles this correctly.

* fix: revert admin allInOne changes and fix PublicUrl in admin dashboard

The admin binary (`weed admin`) is a separate process that cannot run
inside `weed server` (allInOne mode). Revert the admin-related allInOne
helm chart changes that caused 503 errors on admin ingress.

Fix bug in cluster_topology.go where VolumeServer.PublicURL was set to
node.Id (internal pod address) instead of the actual public URL. Add
public_url field to DataNodeInfo proto message so the topology gRPC
response carries the public URL set via -volume.publicUrl flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use HTTP /dir/status to populate PublicUrl in admin dashboard

The gRPC DataNodeInfo proto does not include PublicUrl, so the admin dashboard showed internal pod IPs instead of the configured public URL.
Fetch PublicUrl from the master's /dir/status HTTP endpoint and apply it
in both GetClusterTopology and GetClusterVolumeServers code paths.

Also reverts the unnecessary proto field additions from the previous
commit and cleans up a stray blank line in all-in-one-service.yml.

* fix: apply PublicUrl link fix to masterNewRaft.html

Match the same conditional logic already applied to master.html —
prefer PublicUrl when set and different from Url.

* fix: add HTTP timeout and status check to fetchPublicUrlMap

Use a 5s-timeout client instead of http.DefaultClient to prevent
blocking indefinitely when the master is unresponsive. Also check
the HTTP status code before attempting to parse the response body.

* fix: fall back to node address when PublicUrl is empty

Prevents blank links in the admin dashboard when PublicUrl is not
configured, such as in standalone or mixed-version clusters.

* fix: log io.ReadAll error in fetchPublicUrlMap

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
2026-03-18 13:20:55 -07:00
hoppla20
d79e82ee60 fix(chart): missing resources on volume statefulset initContainer (#8678)
* fix(chart): missing resources on volume statefulset initContainer

* chore(chart): use own resources for idx-vol-move initContainer

* chore(chart): improve comment for idxMoveResources value
2026-03-18 12:30:18 -07:00
Chris Lu
bfd0d5c084 fix(helm): use componentName for all service names to fix truncation mismatch (#8612)
* fix(helm): use componentName for all service names to fix truncation mismatch (#8610)

PR #8143 updated statefulsets and deployments to use the componentName
helper (which truncates the fullname before appending the suffix), but
left service definitions using the old `printf + trunc 63` pattern.
When release names are long enough, these two strategies produce
different names, causing DNS resolution failures (e.g., S3 cannot
find the filer-client service and falls back to localhost:8888).

Unify all service name definitions and cluster address helpers to use
the componentName helper consistently.

* refactor(helm): simplify cluster address helpers with ternary

* test(helm): add regression test for service name truncation with long release names

Renders the chart with a >63-char fullname in both normal and all-in-one
modes, then asserts that Service metadata.name values match the hostnames
produced by cluster.masterAddress, cluster.filerAddress, and the S3
deployment's -filer= argument. Prevents future truncation/DNS mismatch
regressions like #8610.

* fix(helm-ci): limit S3_FILER_HOST extraction to first match
2026-03-12 11:59:24 -07:00
Chris Lu
1a3e3100d0 Helm: set serviceAccountName independent of cluster role (#8495)
* Add stale job expiry and expire API

* Add expire job button

* helm: decouple serviceAccountName from cluster role

---------

Co-authored-by: Copilot <copilot@github.com>
2026-03-03 12:13:18 -08:00
Chris Lu
2644816692 helm: avoid duplicate env var keys in workload env lists (#8488)
* helm: dedupe merged extraEnvironmentVars in workloads

* address comments

Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>

* range

Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>

* helm: reuse merge helper for extraEnvironmentVars

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-02 12:10:57 -08:00
Sheya Bernstein
d8b8f0dffd fix(helm): add missing app.kubernetes.io/instance label to volume service (#8403) 2026-02-22 07:20:38 -08:00
Chris Lu
4f5f1f6be7 refactor(helm): Unified Naming Truncation and Bug Fixes (#8143)
* refactor(helm): add componentName helper for truncation

* fix(helm): unify ingress backend naming with truncation

* fix(helm): unify statefulset/deployment naming with truncation

* fix(helm): add missing labels to services for servicemonitor discovery

* chore(helm): secure secrets and add upgrade notes

* fix(helm): truncate context instead of suffix in componentName

* revert(docs): remove upgrade notes per feedback

* fix(helm): use componentName for COSI serviceAccountName

* helm: update master -ip to use component name for correct truncation

* helm: refactor masterServers helper to use truncated component names

* helm: update volume -ip to use component name and cleanup redundant printf

* helm: refine helpers with robustness check and updated docs
2026-01-27 17:45:16 -08:00
Chris Lu
c9c91ba568 Refactor Helm chart to use dynamic names for resources (#8142)
* Refactor Helm chart to use dynamic names for resources

* ensure name length
2026-01-27 12:52:06 -08:00
Yalın Doğu Şahin
d345752e3d Feature/volume ingress (#8084) 2026-01-22 06:48:29 -08:00
Vladimir Shishkaryov
b49f3ce6d3 fix(chart): place backoffLimit correctly in resize hook (#8036)
Signed-off-by: Vladimir Shishkaryov <vladimir@jckls.com>
2026-01-15 12:45:49 -08:00
Sheya Bernstein
8740a087b9 fix: apply tpl function to all component extraEnvironmentVars (#8001) 2026-01-11 12:14:16 -08:00
Sheya Bernstein
911aca74f3 Support volume server ID in Helm chart (#7867)
helm: Support volume server ID
2025-12-24 10:52:40 -08:00
Chris Lu
d5f21fd8ba fix: add missing backslash for volume extraArgs in helm chart (#7676)
Fixes #7467

The -mserver argument line in volume-statefulset.yaml was missing a
trailing backslash, which prevented extraArgs from being passed to
the weed volume process.

Also:
- Extracted master server list generation logic into shared helper
  templates in _helpers.tpl for better maintainability
- Updated all occurrences of deprecated -mserver flag to -master
  across docker-compose files, test files, and documentation
2025-12-08 23:21:02 -08:00
Chris Lu
268cc84e8c [helm] Fix liveness/readiness probe scheme path in templates (#7616)
Fix the templates to read scheme from httpGet.scheme instead of the
probe level, matching the structure defined in values.yaml.

This ensures that changing *.livenessProbe.httpGet.scheme or
*.readinessProbe.httpGet.scheme in values.yaml now correctly affects
the rendered manifests.

Affected components: master, filer, volume, s3, all-in-one

Fixes #7615
2025-12-03 18:53:06 -08:00
Dennis Witt
8fe14d1368 fix(helm): set securitycontext for idx move initcontainer if enabled (#7331) 2025-10-16 12:24:41 -07:00
Cristian Chiru
e030530aab Fix volume annotations in volume-servicemonitor.yaml (#7193)
* Update volume annotations in servicemonitor.yaml

* Idiomatic annotations handling in volume-servicemonitor.yaml
2025-09-03 00:34:39 -07:00
Devin Lauderdale
fae416586b Move helm templates into folders (#7113)
* refactor: move helm templates into respective service folders

* fix: update template path reference in filer-statefulset for s3-secret
2025-08-08 10:36:01 -07:00