13 Commits

Author SHA1 Message Date
Chris Lu
5e76f55077 fix(helm): namespace app-specific global values under global.seaweedfs (#8700)
* fix(helm): namespace app-specific values under global.seaweedfs

Move all app-specific values from the global namespace to
global.seaweedfs.* to avoid polluting the shared .Values.global
namespace when the chart is used as a subchart.

Standard Helm conventions (global.imageRegistry, global.imagePullSecrets)
remain at the global level as they are designed to be shared across
subcharts.

Fixes seaweedfs/seaweedfs#8699

BREAKING CHANGE: global values have been restructured. Users must update
their values files to use the new paths:
- global.registry → global.imageRegistry
- global.repository → global.seaweedfs.image.repository
- global.imageName → global.seaweedfs.image.name
- global.<key> → global.seaweedfs.<key> (for all other app-specific values)
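
The mapping above can be sketched in Python against an already-parsed
values dict. This is only an illustration: `migrate_globals`, the
`SHARED_KEYS` set, and the rule that a new-style key wins when both are
present are assumptions, not the chart's actual behavior.

```python
import copy

# Keys that stay directly under .Values.global (shared Helm conventions named
# in the commit message), plus the new namespace itself. Assumed for this sketch.
SHARED_KEYS = {"imageRegistry", "imagePullSecrets", "seaweedfs"}


def migrate_globals(values):
    """Return a copy of `values` with old global.* keys moved to the new paths."""
    result = copy.deepcopy(values)
    old_global = result.get("global", {})
    new_global = {k: v for k, v in old_global.items() if k in SHARED_KEYS}
    app = new_global.setdefault("seaweedfs", {})

    for key, value in old_global.items():
        if key in SHARED_KEYS:
            continue
        if key == "registry":
            # global.registry -> global.imageRegistry
            new_global.setdefault("imageRegistry", value)
        elif key == "repository":
            # global.repository -> global.seaweedfs.image.repository
            app.setdefault("image", {}).setdefault("repository", value)
        elif key == "imageName":
            # global.imageName -> global.seaweedfs.image.name
            app.setdefault("image", {}).setdefault("name", value)
        else:
            # global.<key> -> global.seaweedfs.<key>
            app.setdefault(key, value)

    result["global"] = new_global
    return result
```

Because `setdefault` is used throughout, a value already present at the
new path is left untouched in this sketch.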

* fix(ci): update helm CI tests to use new global.seaweedfs.* value paths

Update all --set flags in helm_ci.yml to use the new namespaced
global.seaweedfs.* paths matching the values.yaml restructuring.

* fix(ci): install Claude Code via npm to avoid install.sh 403

The claude-code-action's built-in installer uses
`curl https://claude.ai/install.sh | bash` which can fail with 403.
Due to the pipe, bash exits 0 on empty input, masking the curl failure
and leaving the `claude` binary missing.

Work around this by installing Claude Code via npm before invoking the
action, and passing the executable path via path_to_claude_code_executable.

* revert: remove claude-code-review.yml changes from this PR

The claude-code-action OIDC token exchange validates that the workflow
file matches the version on the default branch. Modifying it in a PR
causes the review job to fail with "Workflow validation failed".

The Claude Code install fix will need to be applied directly to master
or in a separate PR.

* fix: update stale references to old global.* value paths

- admin-statefulset.yaml: fix fail message to reference
  global.seaweedfs.masterServer
- values.yaml: fix comment to reference image.name instead of imageName
- helm_ci.yml: fix diagnostic message to reference
  global.seaweedfs.enableSecurity

* feat(helm): add backward-compat shim for old global.* value paths

Add _compat.tpl with a seaweedfs.compat helper that detects old-style
global.* keys (e.g. global.enableSecurity, global.registry) and merges
them into the new global.seaweedfs.* namespace.

Since the old keys no longer have defaults in values.yaml, their
presence means the user explicitly provided them. The helper uses
in-place mutation via `set` so all templates see the merged values.

This ensures existing deployments using old value paths continue to
work without changes after upgrading.

* fix: update stale comment references in values.yaml

Update comments referencing global.enableSecurity and global.masterServer
to the new global.seaweedfs.* paths.

---------

Co-authored-by: Copilot <copilot@github.com>
2026-03-19 13:00:48 -07:00
Jayshan Raghunandan
1f1eac4f08 feat: improve aio support for admin/volume ingress and fix UI links (#8679)
* feat: improve allInOne mode support for admin/volume ingress and fix master UI links

- Add allInOne support to admin ingress template, matching the pattern
  used by filer and s3 ingress templates (or-based enablement with
  ternary service name selection)
- Add allInOne support to volume ingress template, which previously
  required volume.enabled even when the volume server runs within the
  allInOne pod
- Expose admin ports in allInOne deployment and service when
  allInOne.admin.enabled is set
- Add allInOne.admin config section to values.yaml (enabled by default,
  ports inherit from admin.*)
- Fix legacy master UI templates (master.html, masterNewRaft.html) to
  prefer PublicUrl over internal Url when linking to volume server UI.
  The new admin UI already handles this correctly.

* fix: revert admin allInOne changes and fix PublicUrl in admin dashboard

The admin binary (`weed admin`) is a separate process that cannot run
inside `weed server` (allInOne mode). Revert the admin-related allInOne
helm chart changes that caused 503 errors on admin ingress.

Fix bug in cluster_topology.go where VolumeServer.PublicURL was set to
node.Id (internal pod address) instead of the actual public URL. Add
public_url field to DataNodeInfo proto message so the topology gRPC
response carries the public URL set via -volume.publicUrl flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use HTTP /dir/status to populate PublicUrl in admin dashboard

The gRPC DataNodeInfo proto does not include PublicUrl, so the admin
dashboard showed internal pod IPs instead of the configured public URL.
Fetch PublicUrl from the master's /dir/status HTTP endpoint and apply it
in both GetClusterTopology and GetClusterVolumeServers code paths.

Also reverts the unnecessary proto field additions from the previous
commit and cleans up a stray blank line in all-in-one-service.yml.

* fix: apply PublicUrl link fix to masterNewRaft.html

Match the same conditional logic already applied to master.html —
prefer PublicUrl when set and different from Url.

* fix: add HTTP timeout and status check to fetchPublicUrlMap

Use a 5s-timeout client instead of http.DefaultClient to prevent
blocking indefinitely when the master is unresponsive. Also check
the HTTP status code before attempting to parse the response body.

* fix: fall back to node address when PublicUrl is empty

Prevents blank links in the admin dashboard when PublicUrl is not
configured, such as in standalone or mixed-version clusters.
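
The link selection described in these commits (prefer PublicUrl when it
is set and differs from Url, otherwise fall back to the node address)
can be sketched as below. The function name is hypothetical and the
sketch is in Python for brevity; the real change lives in the Go code
and HTML templates.

```python
def volume_server_link(url, public_url):
    """Pick the address to link to for a volume server."""
    # Prefer the public URL when it is set and differs from the internal one.
    if public_url and public_url != url:
        return public_url
    # Otherwise fall back to the node's own address so the link is never blank.
    return url
```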

* fix: log io.ReadAll error in fetchPublicUrlMap

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
2026-03-18 13:20:55 -07:00
hoppla20
efe722c18c fix(chart): all in one maxVolumes value (#8683)
* fix(chart): all-in-one deployment maxVolumes value

* chore(chart): improve readability

* fix(chart): maxVolume nil value check

* fix(chart): guard against nil/empty volume.dataDirs before calling first

Without this check, `first` errors when volume.dataDirs is nil or empty,
causing a template render failure for users who omit the setting entirely.

---------

Co-authored-by: Copilot <copilot@github.com>
2026-03-18 12:19:46 -07:00
Chris Lu
0443b66a75 fix(helm): trim whitespace before s3 TLS args to prevent command breakage (#8614)
* fix(helm): trim whitespace before s3 TLS args to prevent command breakage (#8613)

When global.enableSecurity is enabled, the `{{ include }}` call for
s3 TLS args lacked the leading dash (`{{-`), producing an extra blank
line in the rendered shell command. This broke shell continuation and
caused the filer (and s3/all-in-one) to crash because arguments after
the blank line were silently dropped.

* ci(helm): assert no blank lines in security+S3 command blocks

Renders the chart with global.enableSecurity=true and S3 enabled for
normal mode (filer + s3 deployments) and all-in-one mode, then parses
every /bin/sh -ec command block and fails if any contains blank lines.

This catches the whitespace regression from #8613 where a missing {{-
dash on the seaweedfs.s3.tlsArgs include produced a blank line that
broke shell continuation.
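
A minimal Python sketch of the kind of check described here, assuming
the command block has already been extracted from the rendered manifest
as a single string. `blank_line_numbers` is a hypothetical helper, and
the sample command is illustrative only; the CI workflow may implement
the assertion differently.

```python
def blank_line_numbers(command):
    """Return 1-based line numbers of blank lines inside a rendered command."""
    # Ignore newlines that merely pad the start or end of the block.
    lines = command.strip("\n").split("\n")
    return [i for i, line in enumerate(lines, start=1) if line.strip() == ""]


# A blank line after a trailing backslash ends the shell command early,
# so every argument after it is dropped.
good = "exec weed filer \\\n  -port=8888 \\\n  -s3"
bad = "exec weed filer \\\n  -port=8888 \\\n\n  -s3"
```

With these samples, `blank_line_numbers(good)` is empty and
`blank_line_numbers(bad)` reports line 3.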

* ci(helm): enable S3 in all-in-one security render test

The s3.tlsArgs include is gated by allInOne.s3.enabled, so without
this flag the all-in-one command block wasn't actually exercising the
TLS args path.
2026-03-12 15:35:22 -07:00
Chris Lu
bfd0d5c084 fix(helm): use componentName for all service names to fix truncation mismatch (#8612)
* fix(helm): use componentName for all service names to fix truncation mismatch (#8610)

PR #8143 updated statefulsets and deployments to use the componentName
helper (which truncates the fullname before appending the suffix), but
left service definitions using the old `printf + trunc 63` pattern.
When release names are long enough, these two strategies produce
different names, causing DNS resolution failures (e.g., S3 cannot
find the filer-client service and falls back to localhost:8888).

Unify all service name definitions and cluster address helpers to use
the componentName helper consistently.
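
To see why the two strategies diverge, here is a rough Python model of
both. The exact arithmetic inside the componentName helper (how much
room it reserves and whether it trims a trailing dash) is an assumption
of this sketch, as are the function names.

```python
MAX_NAME = 63  # Kubernetes limit for DNS label names


def name_trunc_after(fullname, suffix):
    """Old pattern: join first, then cut the whole string to 63 characters."""
    return f"{fullname}-{suffix}"[:MAX_NAME].rstrip("-")


def name_trunc_before(fullname, suffix):
    """componentName-style: shorten fullname so the suffix always survives."""
    room = MAX_NAME - len(suffix) - 1  # leave space for "-" and the suffix
    return f"{fullname[:room].rstrip('-')}-{suffix}"


long_name = "a" * 60
old_style = name_trunc_after(long_name, "filer-client")   # suffix is cut off
new_style = name_trunc_before(long_name, "filer-client")  # suffix is intact
```

For short release names both functions return the same string, which is
why the mismatch only shows up once the fullname approaches the limit.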

* refactor(helm): simplify cluster address helpers with ternary

* test(helm): add regression test for service name truncation with long release names

Renders the chart with a >63-char fullname in both normal and all-in-one
modes, then asserts that Service metadata.name values match the hostnames
produced by cluster.masterAddress, cluster.filerAddress, and the S3
deployment's -filer= argument. Prevents future truncation/DNS mismatch
regressions like #8610.

* fix(helm-ci): limit S3_FILER_HOST extraction to first match
2026-03-12 11:59:24 -07:00
Chris Lu
6c7fe87a72 helm: add s3.tlsSecret for custom S3 HTTPS certificate (#8582)
* helm: add s3.tlsSecret to allow custom TLS certificate for S3 HTTPS endpoint

Allow users to specify an external Kubernetes TLS secret for the S3
HTTPS endpoint instead of using the internal self-signed client
certificate. This enables using publicly trusted certificates (e.g.
from Let's Encrypt) so S3 clients don't need to trust the internal CA.

The new s3.tlsSecret value is supported in the standalone S3 gateway,
filer with embedded S3, and all-in-one deployment templates.

Closes #8581

* refactor: extract S3 TLS helpers to reduce duplication

Move repeated S3 TLS cert/key logic into shared helper templates
(seaweedfs.s3.tlsArgs, seaweedfs.s3.tlsVolumeMount, seaweedfs.s3.tlsVolume)
in _helpers.tpl, and use them across all three deployment templates.

* helm: add allInOne.s3.trafficDistribution support

Add the missing allInOne.s3.trafficDistribution branch to the
seaweedfs.trafficDistribution helper and wire it into the all-in-one
service template, mirroring the existing s3-service.yaml behavior.
PreferClose is auto-converted to PreferSameZone on k8s >=1.35.
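
The version-dependent conversion can be modelled roughly as follows. The
function name and the way the Kubernetes version is passed in are
assumptions of this Python sketch; the helper itself is a Helm template
that reads the cluster capabilities.

```python
def traffic_distribution(value, kube_major, kube_minor):
    """Return the trafficDistribution value to render for a given cluster version."""
    # On Kubernetes 1.35 and later, PreferClose is rendered as PreferSameZone.
    if value == "PreferClose" and (kube_major, kube_minor) >= (1, 35):
        return "PreferSameZone"
    # Any other value, or an older cluster, is passed through unchanged.
    return value
```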

* fix: scope S3 TLS mounts to S3-enabled pods and simplify trafficDistribution helper

- Wrap S3 TLS volume/volumeMount includes in allInOne.s3.enabled and
  filer.s3.enabled guards so the custom TLS secret is only mounted
  when S3 is actually enabled in that deployment mode.
- Refactor seaweedfs.trafficDistribution helper to accept an explicit
  value+Capabilities dict instead of walking multiple .Values paths,
  making each call site responsible for passing its own setting.
2026-03-09 14:24:42 -07:00
Chris Lu
c9c91ba568 Refactor Helm chart to use dynamic names for resources (#8142)
* Refactor Helm chart to use dynamic names for resources

* ensure name length
2026-01-27 12:52:06 -08:00
Sheya Bernstein
8740a087b9 fix: apply tpl function to all component extraEnvironmentVars (#8001) 2026-01-11 12:14:16 -08:00
chrislu
5167bbd2a9 Remove deprecated allowEmptyFolder CLI option
The allowEmptyFolder option is no longer functional because:
1. The code that used it was already commented out
2. Empty folder cleanup is now handled asynchronously by EmptyFolderCleaner

The CLI flags are kept for backward compatibility but marked as deprecated
and ignored. This removes:
- S3ApiServerOption.AllowEmptyFolder field
- The actual usage in s3api_object_handlers_list.go
- Helm chart values and template references
- References in test Makefiles and docker-compose files
2025-12-06 21:54:12 -08:00
Chris Lu
62a83ed469 helm: enhance all-in-one deployment configuration (#7639)
* helm: enhance all-in-one deployment configuration

Fixes #7110

This PR addresses multiple issues with the all-in-one Helm chart configuration:

## New Features

### Configurable Replicas
- Added `allInOne.replicas` (was hardcoded to 1)

### S3 Gateway Configuration
- Added full S3 config under `allInOne.s3`:
  - port, httpsPort, domainName, allowEmptyFolder
  - enableAuth, existingConfigSecret, auditLogConfig
  - createBuckets for declarative bucket creation

### SFTP Server Configuration
- Added full SFTP config under `allInOne.sftp`:
  - port, sshPrivateKey, hostKeysFolder, authMethods
  - maxAuthTries, bannerMessage, loginGraceTime
  - clientAliveInterval, clientAliveCountMax, enableAuth

### Command Line Arguments
- Added `allInOne.extraArgs` for custom CLI arguments

### Update Strategy
- Added `allInOne.updateStrategy.type` (Recreate/RollingUpdate)

### Secret Environment Variables
- Added `allInOne.secretExtraEnvironmentVars` for injecting secrets

### Ingress Support
- Added `allInOne.ingress` with S3, filer, and master sub-configs

### Storage Options
- Enhanced `allInOne.data` with existingClaim support
- Added PVC template for persistentVolumeClaim type

## CI Enhancements
- Added comprehensive tests for all-in-one configurations
- Tests cover replicas, S3, SFTP, extraArgs, strategies, PVC, ingress

* helm: add real cluster deployment tests to CI

- Deploy all-in-one cluster with S3 enabled on kind cluster
- Test Master API (/cluster/status endpoint)
- Test Filer API (file upload/download)
- Test S3 API (/status endpoint)
- Test S3 operations with AWS CLI:
  - Create/delete buckets
  - Upload/download/delete objects
  - Verify file content integrity

* helm: simplify CI and remove all-in-one ingress

Address review comments:
- Remove detailed all-in-one template rendering tests from CI
- Remove real cluster deployment tests from CI
- Remove all-in-one ingress template and values configuration

Keep the core improvements:
- allInOne.replicas configuration
- allInOne.s3.* full configuration
- allInOne.sftp.* full configuration
- allInOne.extraArgs support
- allInOne.updateStrategy configuration
- allInOne.secretExtraEnvironmentVars support

* helm: address review comments

- Fix post-install-bucket-hook.yaml: add filer.s3.enableAuth and
  filer.s3.existingConfigSecret to or statements for consistency
- Fix all-in-one-deployment.yaml: use default function for s3.domainName
- Fix all-in-one-deployment.yaml: use hasKey function for s3.allowEmptyFolder

* helm: clarify updateStrategy multi-replica behavior

Expand comment to warn users that RollingUpdate with multiple replicas
requires shared storage (ReadWriteMany) to avoid data loss.

* helm: address gemini-code-assist review comments

- Make PVC accessModes configurable to support ReadWriteMany for
  multi-replica deployments (defaults to ReadWriteOnce)
- Use configured readiness probe paths in post-install bucket hook
  instead of hardcoded paths, respecting custom configurations

* helm: simplify allowEmptyFolder logic using coalesce

Use coalesce function for cleaner template code as suggested in review.

* helm: fix extraArgs trailing backslash issue

Remove trailing backslash after the last extraArgs argument to avoid
shell syntax error. Use counter to only add backslash between arguments.
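
The idea of placing the continuation only between arguments can be
sketched in Python with a join instead of a counter. `render_command`
and the sample arguments are hypothetical; the chart does this inside
the Helm template.

```python
def render_command(base, extra_args):
    """Render a multi-line shell command from a base command and extra args."""
    # The separator (space, backslash, newline, indent) is only placed
    # between items, so the last argument never gets a trailing backslash.
    return " \\\n  ".join([base, *extra_args])
```

For example, `render_command("weed server", ["-a=1", "-b=2"])` yields
three lines, and only the first two end with a backslash.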

* helm: fix fallback logic for allInOne s3/sftp configuration

Changes:
- Set allInOne.s3.* and allInOne.sftp.* override parameters to null by default
  This allows proper inheritance from global s3.* and sftp.* settings
- Fix allowEmptyFolder logic to use explicit nil checking instead of coalesce
  The coalesce/default functions treat 'false' as empty, causing incorrect
  fallback behavior when users want to explicitly set false values

Addresses review feedback about default value conflicts with fallback logic.
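
The pitfall is easy to reproduce outside Helm. In this Python sketch,
`or` stands in for the `default`/`coalesce` behavior of treating false
as empty, and `None` stands in for a null value; both function names are
hypothetical.

```python
def resolve_with_default(override, fallback):
    """Mimics default/coalesce: any falsy override is replaced by the fallback."""
    return override or fallback


def resolve_with_nil_check(override, fallback):
    """Only a missing (None) override falls back; an explicit False is kept."""
    return fallback if override is None else override
```

With a fallback of `True`, an explicit `False` override is lost by the
first function and preserved by the second.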

* helm: fix exec in bucket creation loop causing premature termination

Remove 'exec' from the range loops that create and configure S3 buckets.
The exec command replaces the current shell process, causing the script
to terminate after the first bucket, preventing creation/configuration
of subsequent buckets.

* helm: quote extraArgs to handle arguments with spaces

Use the quote function to ensure each item in extraArgs is treated as
a single, complete argument even if it contains spaces.

* helm: make s3/filer ingress work for both normal and all-in-one modes

Modified s3-ingress.yaml and filer-ingress.yaml to dynamically select
the service name based on deployment mode:
- Normal mode: points to seaweedfs-s3 / seaweedfs-filer services
- All-in-one mode: points to seaweedfs-all-in-one service

This eliminates the need for separate all-in-one ingress templates.
Users can now use the standard s3.ingress and filer.ingress settings
for both deployment modes.

* helm: fix allInOne.data.size and storageClass to use null defaults

Change size and storageClass from empty strings to null so the template
defaults (10Gi for size, cluster default for storageClass) will apply
correctly. Empty strings prevent the Helm | default function from working.

* helm: fix S3 ingress to include standalone S3 gateway case

Add s3.enabled check to the $s3Enabled logic so the ingress works for:
1. Standalone S3 gateway (s3.enabled)
2. S3 on Filer (filer.s3.enabled) when not in all-in-one mode
3. S3 in all-in-one mode (allInOne.s3.enabled)
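
Read literally, the three cases combine as in the Python sketch below.
The parameter names mirror the values paths, but the function itself and
the exact boolean structure are assumptions; the template's `$s3Enabled`
expression may be arranged differently.

```python
def s3_ingress_enabled(s3_enabled, filer_s3_enabled,
                       all_in_one_enabled, all_in_one_s3_enabled):
    """Decide whether the S3 ingress should be rendered."""
    standalone = s3_enabled                                    # case 1
    on_filer = filer_s3_enabled and not all_in_one_enabled     # case 2
    in_all_in_one = all_in_one_enabled and all_in_one_s3_enabled  # case 3
    return standalone or on_filer or in_all_in_one
```
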
2025-12-06 18:54:28 -08:00
Chris Lu
268cc84e8c [helm] Fix liveness/readiness probe scheme path in templates (#7616)
Fix the templates to read scheme from httpGet.scheme instead of the
probe level, matching the structure defined in values.yaml.

This ensures that changing *.livenessProbe.httpGet.scheme or
*.readinessProbe.httpGet.scheme in values.yaml now correctly affects
the rendered manifests.

Affected components: master, filer, volume, s3, all-in-one

Fixes #7615
2025-12-03 18:53:06 -08:00
Chris Lu
8ed1b104ce WEED_CLUSTER_SW_* Environment Variables should not be passed to allIn… (#7217)
* WEED_CLUSTER_SW_* Environment Variables should not be passed to allInOne config

* address comment

* address comments

Fixed filtering logic: replaced specific key matching with regex patterns that catch all WEED_CLUSTER_*_MASTER and WEED_CLUSTER_*_FILER variables.
Corrected merge precedence: fixed the merge order so global environment variables properly override allInOne variables.
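
A rough Python sketch of the two points above. The regular expression,
the function name, and the order of merging and filtering are
assumptions made for illustration; the actual template logic is not
shown in this log.

```python
import re

# Assumed pattern: any WEED_CLUSTER_<name>_MASTER or WEED_CLUSTER_<name>_FILER key.
CLUSTER_VAR = re.compile(r"^WEED_CLUSTER_.*_(MASTER|FILER)$")


def all_in_one_env(global_env, all_in_one_env_vars):
    """Build the environment for the all-in-one pod."""
    # Later entries win in a dict merge, so global values override allInOne ones.
    merged = {**all_in_one_env_vars, **global_env}
    # Drop cluster master/filer addresses, which the all-in-one pod should not get.
    return {k: v for k, v in merged.items() if not CLUSTER_VAR.match(k)}
```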

* refactoring
2025-09-09 08:48:34 -07:00
Devin Lauderdale
fae416586b Move helm templates into folders (#7113)
* refactor: move helm templates into respective service folders

* fix: update template path reference in filer-statefulset for s3-secret
2025-08-08 10:36:01 -07:00