* Enforce IAM for s3tables bucket creation
* Prefer IAM path when policies exist
* Ensure IAM enforcement honors default allow
* address comments
* Reused the precomputed principal when setting tableBucketMetadata.OwnerAccountID, avoiding the redundant getAccountID call.
* get identity
* fix
* dedup
* fix
* comments
* fix tests
* update iam config
* go fmt
* fix ports
* fix flags
* mini clean shutdown
* Revert "update iam config"
This reverts commit ca48fdbb0afa45657823d98657556c0bbf24f239.
Revert "mini clean shutdown"
This reverts commit 9e17f6baffd5dd7cc404d831d18dd618b9fe5049.
Revert "fix flags"
This reverts commit e9e7b29d2f77ee5cb82147d50621255410695ee3.
Revert "go fmt"
This reverts commit bd3241960b1d9484b7900190773b0ecb3f762c9a.
* test/s3tables: share single weed mini per test package via TestMain
Previously each top-level test function in the catalog and s3tables
package started and stopped its own weed mini instance. This caused
failures when a prior instance wasn't cleanly stopped before the next
one started (port conflicts, leaked global state).
Changes:
- catalog/iceberg_catalog_test.go: introduce TestMain that starts one
shared TestEnvironment (external weed binary) before all tests and
tears it down after. All individual test functions now use sharedEnv.
Added randomSuffix() for unique resource names across tests.
- catalog/pyiceberg_test.go: updated to use sharedEnv instead of
per-test environments.
- catalog/pyiceberg_test_helpers.go -> pyiceberg_test_helpers_test.go:
renamed to a _test.go file so it can access TestEnvironment which is
defined in a test file.
- table-buckets/setup.go: add package-level sharedCluster variable.
- table-buckets/s3tables_integration_test.go: introduce TestMain that
starts one shared TestCluster before all tests. TestS3TablesIntegration
now uses sharedCluster. Extract startMiniClusterInDir (no *testing.T)
for TestMain use. TestS3TablesCreateBucketIAMPolicy keeps its own
cluster (different IAM config). Remove miniClusterMutex (no longer
needed). Fix Stop() to not panic when t is nil.
* delete
* parse
* default allow should work with anonymous
* fix port
* iceberg route
The failures came from the Iceberg REST catalog using the default warehouse bucket when no prefix is provided. The tests create random buckets, so /v1/namespaces was looking in the default warehouse and failing. I updated the tests to use the prefixed Iceberg routes (/v1/{bucket}/...) via a small helper.
* test(s3tables): fix port conflicts and IAM ARN matching in integration tests
- Pass -master.dir explicitly to prevent filer store directory collision
between shared cluster and per-test clusters running in the same process
- Pass -volume.port.public and -volume.publicUrl to prevent the global
publicPort flag (mutated from 0 → concrete port by first cluster) from
being reused by a second cluster, causing 'address already in use'
- Remove the flag-reset loop in Stop() that reset global flag values while
other goroutines were reading them (race → panic)
- Fix IAM policy Resource ARN in TestS3TablesCreateBucketIAMPolicy to use
wildcards (arn:aws:s3tables:*:*:bucket/<name>) because the handler
generates ARNs with its own DefaultRegion (us-east-1) and principal name
('admin'), not the test constants testRegion/testAccountID
* docker: fix entrypoint chown guard; helm: add openshift-values.yaml
Fix a regression in entrypoint.sh where the DATA_UID/DATA_GID
ownership comparison was dropped, causing chown -R /data to run
unconditionally on every container start even when ownership was
already correct. Restore the guard so the recursive chown is
skipped when the seaweed user already owns /data — making startup
faster on subsequent runs and a no-op on OpenShift/PVC deployments
where fsGroup has already set correct ownership.
Add k8s/charts/seaweedfs/openshift-values.yaml: an example Helm
overrides file for deploying SeaweedFS on OpenShift (or any cluster
enforcing the Kubernetes restricted Pod Security Standard). Replaces
hostPath volumes with PVCs, sets runAsUser/fsGroup to 1000
(the seaweed user baked into the image), drops all capabilities,
disables privilege escalation, and enables RuntimeDefault seccomp —
satisfying OpenShift's default restricted SCC without needing a
custom SCC or root access.
Fixes #8381
SEAWEEDFS - helm chart (2.x+)
Add the helm repo
helm repo add seaweedfs https://seaweedfs.github.io/seaweedfs/helm
Install the helm chart
helm install seaweedfs seaweedfs/seaweedfs
(Recommended) Provide values.yaml
helm install --values=values.yaml seaweedfs seaweedfs/seaweedfs
Info:
- master/filer/volume are stateful sets with anti-affinity on the hostname, so your deployment will be spread/HA.
- the chart uses memsql (MySQL-compatible) as the filer backend to enable HA (multiple filer instances) and the backup/HA capabilities memsql provides.
- the MySQL user/password are created in a k8s secret (default: <release>-seaweedfs-db-secret) and injected into the filer via environment variables.
- cert config exists and can be enabled, but it has not been tested; it requires cert-manager to be installed.
Prerequisites
Database
leveldb is the default filer store; it supports multiple filer replicas that sync automatically, with some limitations.
When those limitations apply, or for a large number of filer replicas, an external datastore is recommended,
such as a MySQL-compatible database, configured in values.yaml under filer.extraEnvironmentVars.
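A minimal sketch of pointing the filer at an external MySQL-compatible database through filer.extraEnvironmentVars; the WEED_MYSQL_* variable names follow the chart's sample values.yaml, but verify them against your chart version, and the hostname, database, and credentials below are placeholders:
filer:
  extraEnvironmentVars:
    WEED_LEVELDB2_ENABLED: "false"            # turn off the default embedded store
    WEED_MYSQL_ENABLED: "true"
    WEED_MYSQL_HOSTNAME: "mysql.example.svc"  # placeholder host
    WEED_MYSQL_PORT: "3306"
    WEED_MYSQL_DATABASE: "seaweedfs"
    WEED_MYSQL_USERNAME: "seaweedfs"
    WEED_MYSQL_PASSWORD: "change-me"          # better sourced from a k8s secret
    WEED_MYSQL_CONNECTION_MAX_IDLE: "5"
    WEED_MYSQL_CONNECTION_MAX_OPEN: "75"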
This database should be pre-configured and initialized. If using the default db-init-config, the configmap name is now dynamic (e.g., <release>-seaweedfs-db-init-config). You can override this name via filer.dbInitConfigName.
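For example, to keep a fixed ConfigMap name instead of the release-derived default, a small sketch using the filer.dbInitConfigName override mentioned above (the ConfigMap name is a placeholder):
filer:
  dbInitConfigName: "seaweedfs-db-init-config"  # name of your own db-init ConfigMap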
To initialize manually:
CREATE TABLE IF NOT EXISTS `filemeta` (
  `dirhash` BIGINT NOT NULL COMMENT 'first 64 bits of MD5 hash value of directory field',
  `name` VARCHAR(766) NOT NULL COMMENT 'directory or file name',
  `directory` TEXT NOT NULL COMMENT 'full path to parent directory',
  `meta` LONGBLOB,
  PRIMARY KEY (`dirhash`, `name`)
) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
An alternative database (e.g. leveldb, postgres) can also be configured following the instructions at filer.extraEnvironmentVars.
Node Labels
Kubernetes nodes can have labels which help define which node (host) will run which pod:
Here is an example:
- s3/filer/master needs the label sw-backend=true
- volume needs the label sw-volume=true
To label a node so it can run all pod types in k8s:
kubectl label node YOUR_NODE_NAME sw-volume=true sw-backend=true
In a production k8s deployment you will want each pod on a different host, especially the volume servers and the masters; all pods (master/volume/filer) should have anti-affinity rules that disallow running multiple component pods on the same host.
If you still want to run multiple pods of the same component (master/volume/filer) on the same host, set the corresponding affinity rule in values.yaml to an empty one:
affinity: ""
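The affinity rule lives under the component's block in values.yaml; for example, to allow multiple masters on one host (a sketch, accepting the reduced fault tolerance):
master:
  affinity: ""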
PVC - storage class
The volume stateful set supports k8s PVCs; the current example uses the simple local-path-provisioner from Rancher (included with k3d / k3s): https://github.com/rancher/local-path-provisioner
You can use any storage class you like; just set the correct storage class for your deployment.
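A minimal sketch of a PVC-backed volume server, assuming the volume data block follows the same type/size/storageClass layout used by the admin example later in this document; check your chart version's values.yaml for the exact keys, and treat the size and storage class below as placeholders:
volume:
  data:
    type: "persistentVolumeClaim"
    size: "100Gi"               # placeholder capacity
    storageClass: "local-path"  # e.g. the Rancher local-path provisioner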
Current instance config (AIO):
1 instance for each type (master/filer+s3/volume).
You can update the replica count for each node type in values.yaml; add more nodes with the corresponding labels if applicable.
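For example, a sketch of raising the replica counts per component (the counts are placeholders; masters are usually run as an odd number for quorum):
master:
  replicas: 3
filer:
  replicas: 2
volume:
  replicas: 3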
Most of the configuration is available through values.yaml; any pull requests to expand functionality or usability are greatly appreciated. Any pull request must pass chart-testing.
S3 configuration
To enable an s3 endpoint for your filer with a default install add the following to your values.yaml:
filer:
  s3:
    enabled: true
Enabling Authentication to S3
To enable authentication for S3, you have two options:
- let the helm chart create an admin user as well as a read only user
- provide your own s3 config.json file via an existing Kubernetes Secret
Use the default credentials for S3
Example parameters for your values.yaml:
filer:
  s3:
    enabled: true
    enableAuth: true
Provide your own credentials for S3
Example parameters for your values.yaml:
filer:
  s3:
    enabled: true
    enableAuth: true
    existingConfigSecret: my-s3-secret
Example of an existing secret with your s3 config that creates an admin user and a read-only user, both with credentials:
---
# Source: seaweedfs/templates/seaweedfs-s3-secret.yaml
apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: my-s3-secret
  namespace: seaweedfs
  labels:
    app.kubernetes.io/name: seaweedfs
    app.kubernetes.io/component: s3
stringData:
  # this key must be an inline json config file
  seaweedfs_s3_config: '{"identities":[{"name":"anvAdmin","credentials":[{"accessKey":"snu8yoP6QAlY0ne4","secretKey":"PNzBcmeLNEdR0oviwm04NQAicOrDH1Km"}],"actions":["Admin","Read","Write"]},{"name":"anvReadOnly","credentials":[{"accessKey":"SCigFee6c5lbi04A","secretKey":"kgFhbT38R8WUYVtiFQ1OiSVOrYr3NKku"}],"actions":["Read"]}]}'
Admin Component
The admin component provides a modern web-based administration interface for managing SeaweedFS clusters. It includes:
- Dashboard: Real-time cluster status and metrics
- Volume Management: Monitor volume servers, capacity, and health
- File Browser: Browse and manage files in the filer
- Maintenance Operations: Trigger maintenance tasks via workers
- Object Store Management: Create and manage buckets with web interface
Enabling Admin
To enable the admin interface, add the following to your values.yaml:
admin:
  enabled: true
  port: 23646
  grpcPort: 33646 # For worker connections
  adminUser: "admin"
  adminPassword: "your-secure-password" # Leave empty to disable auth
  # Optional: persist admin data
  data:
    type: "persistentVolumeClaim"
    size: "10Gi"
    storageClass: "your-storage-class"
  # Optional: enable ingress
  ingress:
    enabled: true
    host: "admin.seaweedfs.local"
    className: "nginx"
The admin interface will be available at http://<admin-service>:23646 (or via ingress). Workers connect to the admin server via gRPC on port 33646.
Admin Authentication
If adminPassword is set, the admin interface requires authentication:
- Username: value of adminUser (default: admin)
- Password: value of adminPassword
If adminPassword is empty or not set, the admin interface runs without authentication (not recommended for production).
Admin Data Persistence
The admin component can store configuration and maintenance data. You can configure storage in several ways:
- emptyDir (default): Data is lost when pod restarts
- persistentVolumeClaim: Data persists across pod restarts
- hostPath: Data stored on the host filesystem
- existingClaim: Use an existing PVC
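A sketch of pointing the admin data at a pre-provisioned PVC, assuming admin.data accepts the same type/claimName pair shown for workers below; the claim name is hypothetical:
admin:
  enabled: true
  data:
    type: "existingClaim"
    claimName: "seaweedfs-admin-data"  # hypothetical pre-provisioned PVC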
Worker Component
Workers are maintenance agents that execute cluster maintenance tasks such as vacuum, volume balancing, and erasure coding. Workers connect to the admin server via gRPC and receive task assignments.
Enabling Workers
To enable workers, add the following to your values.yaml:
worker:
  enabled: true
  replicas: 2 # Scale based on workload
  jobType: "vacuum,volume_balance,erasure_coding" # Job types this worker can handle
  maxDetect: 1 # Maximum concurrent detection requests
  maxExecute: 4 # Maximum concurrent execution jobs per worker
  # Working directory for task execution
  # Default: "/tmp/seaweedfs-worker"
  # Note: /tmp is ephemeral - use persistent storage (hostPath/existingClaim) for long-running tasks
  workingDir: "/tmp/seaweedfs-worker"
  # Optional: configure admin server address
  # If not specified, auto-discovers from admin service in the same namespace by looking for
  # a service named "<release-name>-admin" (e.g., "seaweedfs-admin").
  # Auto-discovery only works if the admin is in the same namespace and same Helm release.
  # For cross-namespace or separate release scenarios, explicitly set this value.
  # Example: If main SeaweedFS is deployed in "production" namespace:
  # adminServer: "seaweedfs-admin.production.svc:33646"
  adminServer: ""
  # Workers need storage for task execution
  # Note: Workers use a Deployment, which does not support `volumeClaimTemplates`
  # for dynamic PVC creation per pod. To use persistent storage, you must
  # pre-provision a PersistentVolumeClaim and use `type: "existingClaim"`.
  data:
    type: "emptyDir" # Options: "emptyDir", "hostPath", or "existingClaim"
    hostPathPrefix: /storage # For hostPath
    # claimName: "worker-pvc" # For existingClaim with pre-provisioned PVC
  # Resource limits for worker pods
  resources:
    requests:
      cpu: "500m"
      memory: "512Mi"
    limits:
      cpu: "2"
      memory: "2Gi"
Worker Job Types
Workers can be configured with different job types:
- vacuum: Reclaim deleted file space
- volume_balance: Balance volumes across volume servers
- erasure_coding: Handle erasure coding operations
You can configure workers with all job types or create specialized worker pools with specific job types.
Worker Deployment Strategy
For production deployments, consider:
- Multiple Workers: Deploy 2+ worker replicas for high availability
- Resource Allocation: Workers need sufficient CPU/memory for maintenance tasks
- Storage: Workers need temporary storage for vacuum and balance operations (size depends on volume size)
- Specialized Workers: Create separate worker deployments for different job types if needed
Example specialized worker configuration:
For specialized worker pools, deploy separate Helm releases with different job types:
values-worker-vacuum.yaml (for vacuum operations):
# Disable all other components, enable only workers
master:
  enabled: false
volume:
  enabled: false
filer:
  enabled: false
s3:
  enabled: false
admin:
  enabled: false
worker:
  enabled: true
  replicas: 2
  jobType: "vacuum"
  maxExecute: 2
  # REQUIRED: Point to the admin service of your main SeaweedFS release
  # Replace <namespace> with the namespace where your main seaweedfs is deployed
  # Example: If deploying in namespace "production":
  # adminServer: "seaweedfs-admin.production.svc:33646"
  adminServer: "seaweedfs-admin.<namespace>.svc:33646"
values-worker-balance.yaml (for balance operations):
# Disable all other components, enable only workers
master:
  enabled: false
volume:
  enabled: false
filer:
  enabled: false
s3:
  enabled: false
admin:
  enabled: false
worker:
  enabled: true
  replicas: 1
  jobType: "volume_balance"
  maxExecute: 1
  # REQUIRED: Point to the admin service of your main SeaweedFS release
  # Replace <namespace> with the namespace where your main seaweedfs is deployed
  # Example: If deploying in namespace "production":
  # adminServer: "seaweedfs-admin.production.svc:33646"
  adminServer: "seaweedfs-admin.<namespace>.svc:33646"
Deploy the specialized workers as separate releases:
# Deploy vacuum workers
helm install seaweedfs-worker-vacuum seaweedfs/seaweedfs -f values-worker-vacuum.yaml
# Deploy balance workers
helm install seaweedfs-worker-balance seaweedfs/seaweedfs -f values-worker-balance.yaml
Enterprise
For enterprise users, please visit seaweedfs.com for the SeaweedFS Enterprise Edition, which has a self-healing storage format with better data protection.