SEAWEEDFS - helm chart (2.x+)

Getting Started

Add the helm repo

helm repo add seaweedfs https://seaweedfs.github.io/seaweedfs/helm

Install the helm chart

helm install seaweedfs seaweedfs/seaweedfs
helm install --values=values.yaml seaweedfs seaweedfs/seaweedfs

Info:

  • master/filer/volume are stateful sets with anti-affinity on the hostname, so your deployment will be spread out and highly available.
  • the chart can use memsql (MySQL-compatible) as the filer backend to enable HA (multiple filer instances) plus the backup/HA capabilities memsql provides.
  • the mysql user/password are created in a k8s secret (secret-seaweedfs-db.yaml) and injected into the filer via environment variables.
  • cert config exists and can be enabled, but it has not been tested; it requires cert-manager to be installed.

Prerequisites

Database

leveldb is the default database; it supports multiple filer replicas that sync automatically, with some limitations.

When those limitations apply, or for a large number of filer replicas, an external datastore is recommended, such as a MySQL-compatible database configured via filer.extraEnvironmentVars in values.yaml. This database should be pre-configured and initialized by running:

CREATE TABLE IF NOT EXISTS `filemeta` (
  `dirhash`   BIGINT NOT NULL       COMMENT 'first 64 bits of MD5 hash value of directory field',
  `name`      VARCHAR(766) NOT NULL COMMENT 'directory or file name',
  `directory` TEXT NOT NULL         COMMENT 'full path to parent directory',
  `meta`      LONGBLOB,
  PRIMARY KEY (`dirhash`, `name`)
) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;

An alternative database (e.g. leveldb, postgres) can also be configured following the instructions at filer.extraEnvironmentVars.
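
As a sketch of what that configuration can look like, the connection is typically wired up through filer.extraEnvironmentVars using SeaweedFS's WEED_<SECTION>_<KEY> environment variable convention. The variable names, hostname, and credentials below are illustrative assumptions; check your chart version's values.yaml for the authoritative list:

filer:
  extraEnvironmentVars:
    WEED_MYSQL_ENABLED: "true"
    WEED_MYSQL_HOSTNAME: "mysql.database.svc"  # placeholder hostname for your MySQL service
    WEED_MYSQL_PORT: "3306"
    WEED_MYSQL_DATABASE: "sw_database"
    WEED_MYSQL_USERNAME: "seaweedfs"
    WEED_MYSQL_PASSWORD: "changeme"            # prefer referencing the generated k8s secret instead
    WEED_LEVELDB2_ENABLED: "false"             # disable the default leveldb store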

Node Labels

Kubernetes nodes can have labels which help define which node (host) will run which pod.

Here is an example:

  • s3/filer/master needs the label sw-backend=true
  • volume needs the label sw-volume=true

To label a node so it can run all pod types in k8s:

kubectl label node YOUR_NODE_NAME sw-volume=true sw-backend=true
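
These labels are matched by each component's nodeSelector in values.yaml. As a sketch only (the default keys and the multi-line string form vary between chart versions, so verify against your values.yaml):

master:
  nodeSelector: |
    sw-backend: "true"
volume:
  nodeSelector: |
    sw-volume: "true"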

On a production k8s deployment you will want each pod to run on a different host, especially the volume servers and the masters. All pods (master/volume/filer) should have anti-affinity rules to disallow running multiple pods of the same component on the same host.

If you still want to run multiple pods of the same component (master/volume/filer) on the same host, set/update the corresponding affinity rule in values.yaml to an empty one:

affinity: ""

PVC - storage class

The volume stateful set supports k8s PVCs. The current example uses the simple local-path-provisioner from Rancher (included with k3d/k3s): https://github.com/rancher/local-path-provisioner

You can use any storage class you like; just set the correct storage class for your deployment.
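
For example, a minimal sketch of pointing the volume data at a PVC-backed storage class (the field names mirror the admin data example later in this README; verify them against your values.yaml):

volume:
  data:
    type: "persistentVolumeClaim"
    size: "100Gi"
    storageClass: "local-path"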

Current instance config (AIO, all-in-one):

1 instance for each type (master/filer+s3/volume)

You can update the replica count for each node type in values.yaml; you may need to add more nodes with the corresponding labels if applicable.

Most of the configuration is available through values.yaml. Any pull requests to expand functionality or usability are greatly appreciated; every pull request must pass chart-testing.

S3 configuration

To enable an S3 endpoint for your filer with a default install, add the following to your values.yaml:

filer:
  s3:
    enabled: true
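
Once enabled, the S3 gateway listens on port 8333 by default. A quick way to smoke-test it from your workstation (the service name below is an assumption; check kubectl get svc for the actual one in your release):

kubectl -n seaweedfs port-forward svc/seaweedfs-filer 8333:8333   # run in a separate terminal
aws --endpoint-url http://127.0.0.1:8333 s3 ls                    # assumes the AWS CLI has (any) credentials configured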

Enabling Authentication to S3

To enable authentication for S3, you have two options:

  • let the helm chart create an admin user as well as a read-only user
  • provide your own s3 config.json file via an existing Kubernetes Secret

Use the default credentials for S3

Example parameters for your values.yaml:

filer:
  s3:
    enabled: true
    enableAuth: true
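
With enableAuth: true and no existingConfigSecret, the chart generates credentials and stores them in a Kubernetes secret. The secret and key names below are assumptions for illustration; inspect the rendered secret in your namespace for the exact names:

kubectl -n seaweedfs get secret seaweedfs-s3-secret \
  -o jsonpath='{.data.admin_access_key_id}' | base64 -d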

Provide your own credentials for S3

Example parameters for your values.yaml:

filer:
  s3:
    enabled: true
    enableAuth: true
    existingConfigSecret: my-s3-secret

Example of an existing secret with your S3 config, creating an admin user and a read-only user, both with credentials:

---
# Source: seaweedfs/templates/seaweedfs-s3-secret.yaml
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: my-s3-secret
  namespace: seaweedfs
  labels:
    app.kubernetes.io/name: seaweedfs
    app.kubernetes.io/component: s3
stringData:
  # this key must be an inline json config file
  seaweedfs_s3_config: '{"identities":[{"name":"anvAdmin","credentials":[{"accessKey":"snu8yoP6QAlY0ne4","secretKey":"PNzBcmeLNEdR0oviwm04NQAicOrDH1Km"}],"actions":["Admin","Read","Write"]},{"name":"anvReadOnly","credentials":[{"accessKey":"SCigFee6c5lbi04A","secretKey":"kgFhbT38R8WUYVtiFQ1OiSVOrYr3NKku"}],"actions":["Read"]}]}'
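
Instead of writing the inline JSON by hand, you can also create an equivalent secret from a local config file; for example (the file name is illustrative):

kubectl -n seaweedfs create secret generic my-s3-secret \
  --from-file=seaweedfs_s3_config=s3_config.json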

Admin Component

The admin component provides a modern web-based administration interface for managing SeaweedFS clusters. It includes:

  • Dashboard: Real-time cluster status and metrics
  • Volume Management: Monitor volume servers, capacity, and health
  • File Browser: Browse and manage files in the filer
  • Maintenance Operations: Trigger maintenance tasks via workers
  • Object Store Management: Create and manage buckets with web interface

Enabling Admin

To enable the admin interface, add the following to your values.yaml:

admin:
  enabled: true
  port: 23646
  grpcPort: 33646  # For worker connections
  adminUser: "admin"
  adminPassword: "your-secure-password"  # Leave empty to disable auth
  
  # Optional: persist admin data
  data:
    type: "persistentVolumeClaim"
    size: "10Gi"
    storageClass: "your-storage-class"
  
  # Optional: enable ingress
  ingress:
    enabled: true
    host: "admin.seaweedfs.local"
    className: "nginx"

The admin interface will be available at http://<admin-service>:23646 (or via ingress). Workers connect to the admin server via gRPC on port 33646.
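
Without an ingress, you can reach the UI with a port-forward (the service name below is an assumption based on the release name; check kubectl get svc):

kubectl -n seaweedfs port-forward svc/seaweedfs-admin 23646:23646
# then open http://localhost:23646 in your browser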

Admin Authentication

If adminPassword is set, the admin interface requires authentication:

  • Username: Value of adminUser (default: admin)
  • Password: Value of adminPassword

If adminPassword is empty or not set, the admin interface runs without authentication (not recommended for production).

Admin Data Persistence

The admin component can store configuration and maintenance data. You can configure storage in several ways:

  • emptyDir (default): Data is lost when the pod restarts
  • persistentVolumeClaim: Data persists across pod restarts
  • hostPath: Data stored on the host filesystem
  • existingClaim: Use an existing PVC (see the sketch after this list)
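
For example, to reuse a pre-provisioned PVC (the claimName key mirrors the worker data example below and is an assumption for the admin block; verify against your values.yaml):

admin:
  data:
    type: "existingClaim"
    claimName: "seaweedfs-admin-data"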

Worker Component

Workers are maintenance agents that execute cluster maintenance tasks such as vacuum, volume balancing, and erasure coding. Workers connect to the admin server via gRPC and receive task assignments.

Enabling Workers

To enable workers, add the following to your values.yaml:

worker:
  enabled: true
  replicas: 2  # Scale based on workload
  capabilities: "vacuum,balance,erasure_coding"  # Tasks this worker can handle
  maxConcurrent: 3  # Maximum concurrent tasks per worker
  
  # Working directory for task execution
  # Default: "/tmp/seaweedfs-worker"
  # Note: /tmp is ephemeral - use persistent storage (hostPath/existingClaim) for long-running tasks
  workingDir: "/tmp/seaweedfs-worker"
  
  # Optional: configure admin server address
  # If not specified, auto-discovers from admin service in the same namespace by looking for
  # a service named "<release-name>-admin" (e.g., "seaweedfs-admin").
  # Auto-discovery only works if the admin is in the same namespace and same Helm release.
  # For cross-namespace or separate release scenarios, explicitly set this value.
  # Example: If main SeaweedFS is deployed in "production" namespace:
  #   adminServer: "seaweedfs-admin.production.svc:33646"
  adminServer: ""
  
  # Workers need storage for task execution
  # Note: Workers use a Deployment, which does not support `volumeClaimTemplates` 
  # for dynamic PVC creation per pod. To use persistent storage, you must 
  # pre-provision a PersistentVolumeClaim and use `type: "existingClaim"`.
  data:
    type: "emptyDir"  # Options: "emptyDir", "hostPath", or "existingClaim"
    hostPathPrefix: /storage  # For hostPath
    # claimName: "worker-pvc"  # For existingClaim with pre-provisioned PVC
  
  # Resource limits for worker pods
  resources:
    requests:
      cpu: "500m"
      memory: "512Mi"
    limits:
      cpu: "2"
      memory: "2Gi"

Worker Capabilities

Workers can be configured with different capabilities:

  • vacuum: Reclaim deleted file space
  • balance: Balance volumes across volume servers
  • erasure_coding: Handle erasure coding operations

You can configure workers with all capabilities or create specialized worker pools with specific capabilities.

Worker Deployment Strategy

For production deployments, consider:

  1. Multiple Workers: Deploy 2+ worker replicas for high availability
  2. Resource Allocation: Workers need sufficient CPU/memory for maintenance tasks
  3. Storage: Workers need temporary storage for vacuum and balance operations (size depends on volume size)
  4. Specialized Workers: Create separate worker deployments for different capabilities if needed

Example specialized worker configuration:

For specialized worker pools, deploy separate Helm releases with different capabilities:

values-worker-vacuum.yaml (for vacuum operations):

# Disable all other components, enable only workers
master:
  enabled: false
volume:
  enabled: false
filer:
  enabled: false
s3:
  enabled: false
admin:
  enabled: false

worker:
  enabled: true
  replicas: 2
  capabilities: "vacuum"
  maxConcurrent: 2
  # REQUIRED: Point to the admin service of your main SeaweedFS release
  # Replace <namespace> with the namespace where your main seaweedfs is deployed
  # Example: If deploying in namespace "production":
  #   adminServer: "seaweedfs-admin.production.svc:33646"
  adminServer: "seaweedfs-admin.<namespace>.svc:33646"

values-worker-balance.yaml (for balance operations):

# Disable all other components, enable only workers
master:
  enabled: false
volume:
  enabled: false
filer:
  enabled: false
s3:
  enabled: false
admin:
  enabled: false

worker:
  enabled: true
  replicas: 1
  capabilities: "balance"
  maxConcurrent: 1
  # REQUIRED: Point to the admin service of your main SeaweedFS release
  # Replace <namespace> with the namespace where your main seaweedfs is deployed
  # Example: If deploying in namespace "production":
  #   adminServer: "seaweedfs-admin.production.svc:33646"
  adminServer: "seaweedfs-admin.<namespace>.svc:33646"

Deploy the specialized workers as separate releases:

# Deploy vacuum workers
helm install seaweedfs-worker-vacuum seaweedfs/seaweedfs -f values-worker-vacuum.yaml

# Deploy balance workers
helm install seaweedfs-worker-balance seaweedfs/seaweedfs -f values-worker-balance.yaml

Enterprise

For enterprise users, please visit seaweedfs.com for the SeaweedFS Enterprise Edition, which has a self-healing storage format with better data protection.