Commit Graph

17 Commits

Author SHA1 Message Date
Jayshan Raghunandan
1f1eac4f08 feat: improve aio support for admin/volume ingress and fix UI links (#8679)
* feat: improve allInOne mode support for admin/volume ingress and fix master UI links

- Add allInOne support to admin ingress template, matching the pattern
  used by filer and s3 ingress templates (or-based enablement with
  ternary service name selection)
- Add allInOne support to volume ingress template, which previously
  required volume.enabled even when the volume server runs within the
  allInOne pod
- Expose admin ports in allInOne deployment and service when
  allInOne.admin.enabled is set
- Add allInOne.admin config section to values.yaml (enabled by default,
  ports inherit from admin.*)
- Fix legacy master UI templates (master.html, masterNewRaft.html) to
  prefer PublicUrl over internal Url when linking to volume server UI.
  The new admin UI already handles this correctly.

* fix: revert admin allInOne changes and fix PublicUrl in admin dashboard

The admin binary (`weed admin`) is a separate process that cannot run
inside `weed server` (allInOne mode). Revert the admin-related allInOne
helm chart changes that caused 503 errors on admin ingress.

Fix bug in cluster_topology.go where VolumeServer.PublicURL was set to
node.Id (internal pod address) instead of the actual public URL. Add
public_url field to DataNodeInfo proto message so the topology gRPC
response carries the public URL set via -volume.publicUrl flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use HTTP /dir/status to populate PublicUrl in admin dashboard

The gRPC DataNodeInfo proto does not include PublicUrl, so the admin dashboard showed internal pod IPs instead of the configured public URL.
Fetch PublicUrl from the master's /dir/status HTTP endpoint and apply it
in both GetClusterTopology and GetClusterVolumeServers code paths.

Also reverts the unnecessary proto field additions from the previous
commit and cleans up a stray blank line in all-in-one-service.yml.

* fix: apply PublicUrl link fix to masterNewRaft.html

Match the same conditional logic already applied to master.html —
prefer PublicUrl when set and different from Url.

* fix: add HTTP timeout and status check to fetchPublicUrlMap

Use a 5s-timeout client instead of http.DefaultClient to prevent
blocking indefinitely when the master is unresponsive. Also check
the HTTP status code before attempting to parse the response body.

* fix: fall back to node address when PublicUrl is empty

Prevents blank links in the admin dashboard when PublicUrl is not
configured, such as in standalone or mixed-version clusters.

* fix: log io.ReadAll error in fetchPublicUrlMap

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
2026-03-18 13:20:55 -07:00
hoppla20
d79e82ee60 fix(chart): missing resources on volume statefulset initContainer (#8678)
* fix(chart): missing resources on volume statefulset initContainer

* chore(chart): use own resources for idx-vol-move initContainer

* chore(chart): improve comment for idxMoveResources value
2026-03-18 12:30:18 -07:00
Chris Lu
bfd0d5c084 fix(helm): use componentName for all service names to fix truncation mismatch (#8612)
* fix(helm): use componentName for all service names to fix truncation mismatch (#8610)

PR #8143 updated statefulsets and deployments to use the componentName
helper (which truncates the fullname before appending the suffix), but
left service definitions using the old `printf + trunc 63` pattern.
When release names are long enough, these two strategies produce
different names, causing DNS resolution failures (e.g., S3 cannot
find the filer-client service and falls back to localhost:8888).

Unify all service name definitions and cluster address helpers to use
the componentName helper consistently.

* refactor(helm): simplify cluster address helpers with ternary

* test(helm): add regression test for service name truncation with long release names

Renders the chart with a >63-char fullname in both normal and all-in-one
modes, then asserts that Service metadata.name values match the hostnames
produced by cluster.masterAddress, cluster.filerAddress, and the S3
deployment's -filer= argument. Prevents future truncation/DNS mismatch
regressions like #8610.

* fix(helm-ci): limit S3_FILER_HOST extraction to first match
2026-03-12 11:59:24 -07:00
Chris Lu
1a3e3100d0 Helm: set serviceAccountName independent of cluster role (#8495)
* Add stale job expiry and expire API

* Add expire job button

* helm: decouple serviceAccountName from cluster role

---------

Co-authored-by: Copilot <copilot@github.com>
2026-03-03 12:13:18 -08:00
Chris Lu
2644816692 helm: avoid duplicate env var keys in workload env lists (#8488)
* helm: dedupe merged extraEnvironmentVars in workloads

* address comments

Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>

* range

Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>

* helm: reuse merge helper for extraEnvironmentVars

---------

Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-02 12:10:57 -08:00
Sheya Bernstein
d8b8f0dffd fix(helm): add missing app.kubernetes.io/instance label to volume service (#8403) 2026-02-22 07:20:38 -08:00
Chris Lu
4f5f1f6be7 refactor(helm): Unified Naming Truncation and Bug Fixes (#8143)
* refactor(helm): add componentName helper for truncation

* fix(helm): unify ingress backend naming with truncation

* fix(helm): unify statefulset/deployment naming with truncation

* fix(helm): add missing labels to services for servicemonitor discovery

* chore(helm): secure secrets and add upgrade notes

* fix(helm): truncate context instead of suffix in componentName

* revert(docs): remove upgrade notes per feedback

* fix(helm): use componentName for COSI serviceAccountName

* helm: update master -ip to use component name for correct truncation

* helm: refactor masterServers helper to use truncated component names

* helm: update volume -ip to use component name and cleanup redundant printf

* helm: refine helpers with robustness check and updated docs
2026-01-27 17:45:16 -08:00
Chris Lu
c9c91ba568 Refactor Helm chart to use dynamic names for resources (#8142)
* Refactor Helm chart to use dynamic names for resources

* ensure name length
2026-01-27 12:52:06 -08:00
Yalın Doğu Şahin
d345752e3d Feature/volume ingress (#8084) 2026-01-22 06:48:29 -08:00
Vladimir Shishkaryov
b49f3ce6d3 fix(chart): place backoffLimit correctly in resize hook (#8036)
Signed-off-by: Vladimir Shishkaryov <vladimir@jckls.com>
2026-01-15 12:45:49 -08:00
Sheya Bernstein
8740a087b9 fix: apply tpl function to all component extraEnvironmentVars (#8001) 2026-01-11 12:14:16 -08:00
Sheya Bernstein
911aca74f3 Support volume server ID in Helm chart (#7867)
helm: Support volume server ID
2025-12-24 10:52:40 -08:00
Chris Lu
d5f21fd8ba fix: add missing backslash for volume extraArgs in helm chart (#7676)
Fixes #7467

The -mserver argument line in volume-statefulset.yaml was missing a
trailing backslash, which prevented extraArgs from being passed to
the weed volume process.

Also:
- Extracted master server list generation logic into shared helper
  templates in _helpers.tpl for better maintainability
- Updated all occurrences of deprecated -mserver flag to -master
  across docker-compose files, test files, and documentation
2025-12-08 23:21:02 -08:00
Chris Lu
268cc84e8c [helm] Fix liveness/readiness probe scheme path in templates (#7616)
Fix the templates to read scheme from httpGet.scheme instead of the
probe level, matching the structure defined in values.yaml.

This ensures that changing *.livenessProbe.httpGet.scheme or
*.readinessProbe.httpGet.scheme in values.yaml now correctly affects
the rendered manifests.

Affected components: master, filer, volume, s3, all-in-one

Fixes #7615
2025-12-03 18:53:06 -08:00
Dennis Witt
8fe14d1368 fix(helm): set securitycontext for idx move initcontainer if enabled (#7331) 2025-10-16 12:24:41 -07:00
Cristian Chiru
e030530aab Fix volume annotations in volume-servicemonitor.yaml (#7193)
* Update volume annotations in servicemonitor.yaml

* Idiomatic annotations handling in volume-servicemonitor.yaml
2025-09-03 00:34:39 -07:00
Devin Lauderdale
fae416586b Move helm templates into folders (#7113)
* refactor: move helm templates into respective service folders

* fix: update template path reference in filer-statefulset for s3-secret
2025-08-08 10:36:01 -07:00