seaweedFS/test/s3/iam/test_config.json
Chris Lu f5c666052e feat: add S3 bucket size and object count metrics (#7776)
* feat: add S3 bucket size and object count metrics

Adds periodic collection of bucket size metrics:
- SeaweedFS_s3_bucket_size_bytes: logical size (deduplicated across replicas)
- SeaweedFS_s3_bucket_physical_size_bytes: physical size (including replicas)
- SeaweedFS_s3_bucket_object_count: object count (deduplicated)

Collection runs every 1 minute via background goroutine that queries
filer Statistics RPC for each bucket's collection.

Also adds Grafana dashboard panels for:
- S3 Bucket Size (logical vs physical)
- S3 Bucket Object Count

* address PR comments: fix bucket size metrics collection

1. Fix collectCollectionInfoFromMaster to use master VolumeList API
   - Now properly queries master for topology info
   - Uses WithMasterClient to get volume list from master
   - Correctly calculates logical vs physical size based on replication

2. Return error when filerClient is nil to trigger fallback
   - Changed from 'return nil, nil' to 'return nil, error'
   - Ensures fallback to filer stats is properly triggered

3. Implement pagination in listBucketNames
   - Added listBucketPageSize constant (1000)
   - Uses StartFromFileName for pagination
   - Continues fetching until fewer entries than limit returned

4. Handle NewReplicaPlacementFromByte error and prevent division by zero
   - Check error return from NewReplicaPlacementFromByte
   - Default to 1 copy if error occurs
   - Add explicit check for copyCount == 0

* simplify bucket size metrics: remove filer fallback, align with quota enforcement

- Remove fallback to filer Statistics RPC
- Use only master topology for collection info (same as s3.bucket.quota.enforce)
- Updated comments to clarify this runs the same collection logic as quota enforcement
- Simplified code by removing collectBucketSizeFromFilerStats

* use s3a.option.Masters directly instead of querying filer

* address PR comments: fix dashboard overlaps and improve metrics collection

Grafana dashboard fixes:
- Fix overlapping panels 55 and 59 in grafana_seaweedfs.json (moved 59 to y=30)
- Fix grid collision in k8s dashboard (moved panel 72 to y=48)
- Aggregate bucket metrics with max() by (bucket) for multi-instance S3 gateways

Go code improvements:
- Add graceful shutdown support via context cancellation
- Use ticker instead of time.Sleep for better shutdown responsiveness
- Distinguish EOF from actual errors in stream handling

* improve bucket size metrics: multi-master failover and proper error handling

- Initial delay now respects context cancellation using select with time.After
- Use WithOneOfGrpcMasterClients for multi-master failover instead of hardcoding Masters[0]
- Properly propagate stream errors instead of just logging them (EOF vs real errors)

* improve bucket size metrics: distributed lock and volume ID deduplication

- Add distributed lock (LiveLock) so only one S3 instance collects metrics at a time
- Add IsLocked() method to LiveLock for checking lock status
- Fix deduplication: use volume ID tracking instead of dividing by copyCount
  - Previous approach gave wrong results if replicas were missing
  - Now tracks seen volume IDs and counts each volume only once
- Physical size still includes all replicas for accurate disk usage reporting

* rename lock to s3.leader

* simplify: remove StartBucketSizeMetricsCollection wrapper function

* fix data race: use atomic operations for LiveLock.isLocked field

- Change isLocked from bool to int32
- Use atomic.LoadInt32/StoreInt32 for all reads/writes
- Sync shared isLocked field in StartLongLivedLock goroutine

* add nil check for topology info to prevent panic

* fix bucket metrics: use Ticker for consistent intervals, fix pagination logic

- Use time.Ticker instead of time.After for consistent interval execution
- Fix pagination: count all entries (not just directories) for proper termination
- Update lastFileName for all entries to prevent pagination issues

* address PR comments: remove redundant atomic store, propagate context

- Remove redundant atomic.StoreInt32 in StartLongLivedLock (AttemptToLock already sets it)
- Propagate context through metrics collection for proper cancellation on shutdown
  - collectAndUpdateBucketSizeMetrics now accepts ctx
  - collectCollectionInfoFromMaster uses ctx for VolumeList RPC
  - listBucketNames uses ctx for ListEntries RPC
2025-12-15 19:23:25 -08:00

322 lines
7.9 KiB
JSON

{
  "identities": [
    {
      "name": "testuser",
      "credentials": [
        {
          "accessKey": "test-access-key",
          "secretKey": "test-secret-key"
        }
      ],
      "actions": ["Admin"]
    },
    {
      "name": "readonlyuser",
      "credentials": [
        {
          "accessKey": "readonly-access-key",
          "secretKey": "readonly-secret-key"
        }
      ],
      "actions": ["Read"]
    },
    {
      "name": "writeonlyuser",
      "credentials": [
        {
          "accessKey": "writeonly-access-key",
          "secretKey": "writeonly-secret-key"
        }
      ],
      "actions": ["Write"]
    }
  ],
  "iam": {
    "enabled": true,
    "sts": {
      "tokenDuration": "15m",
      "issuer": "seaweedfs-sts",
      "signingKey": "dGVzdC1zaWduaW5nLWtleS0zMi1jaGFyYWN0ZXJzLWxvbmc="
    },
    "policy": {
      "defaultEffect": "Deny"
    },
    "providers": {
      "oidc": {
        "test-oidc": {
          "issuer": "http://localhost:8080/.well-known/openid_configuration",
          "clientId": "test-client-id",
          "jwksUri": "http://localhost:8080/jwks",
          "userInfoUri": "http://localhost:8080/userinfo",
          "roleMapping": {
            "rules": [
              {
                "claim": "groups",
                "claimValue": "admins",
                "roleName": "S3AdminRole"
              },
              {
                "claim": "groups",
                "claimValue": "users",
                "roleName": "S3ReadOnlyRole"
              },
              {
                "claim": "groups",
                "claimValue": "writers",
                "roleName": "S3WriteOnlyRole"
              }
            ]
          },
          "claimsMapping": {
            "email": "email",
            "displayName": "name",
            "groups": "groups"
          }
        }
      },
      "ldap": {
        "test-ldap": {
          "server": "ldap://localhost:389",
          "baseDN": "dc=example,dc=com",
          "bindDN": "cn=admin,dc=example,dc=com",
          "bindPassword": "admin-password",
          "userFilter": "(uid=%s)",
          "groupFilter": "(memberUid=%s)",
          "attributes": {
            "email": "mail",
            "displayName": "cn",
            "groups": "memberOf"
          },
          "roleMapping": {
            "rules": [
              {
                "claim": "groups",
                "claimValue": "cn=admins,ou=groups,dc=example,dc=com",
                "roleName": "S3AdminRole"
              },
              {
                "claim": "groups",
                "claimValue": "cn=users,ou=groups,dc=example,dc=com",
                "roleName": "S3ReadOnlyRole"
              }
            ]
          }
        }
      }
    },
    "policyStore": {}
  },
  "roles": {
    "S3AdminRole": {
      "trustPolicy": {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Principal": {
              "Federated": ["test-oidc", "test-ldap"]
            },
            "Action": "sts:AssumeRoleWithWebIdentity"
          }
        ]
      },
      "attachedPolicies": ["S3AdminPolicy"],
      "description": "Full administrative access to S3 resources"
    },
    "S3ReadOnlyRole": {
      "trustPolicy": {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Principal": {
              "Federated": ["test-oidc", "test-ldap"]
            },
            "Action": "sts:AssumeRoleWithWebIdentity"
          }
        ]
      },
      "attachedPolicies": ["S3ReadOnlyPolicy"],
      "description": "Read-only access to S3 resources"
    },
    "S3WriteOnlyRole": {
      "trustPolicy": {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Principal": {
              "Federated": ["test-oidc", "test-ldap"]
            },
            "Action": "sts:AssumeRoleWithWebIdentity"
          }
        ]
      },
      "attachedPolicies": ["S3WriteOnlyPolicy"],
      "description": "Write-only access to S3 resources"
    }
  },
  "policies": {
    "S3AdminPolicy": {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["s3:*"],
          "Resource": [
            "arn:aws:s3:::*",
            "arn:aws:s3:::*/*"
          ]
        }
      ]
    },
    "S3ReadOnlyPolicy": {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:GetObject",
            "s3:GetObjectVersion",
            "s3:ListBucket",
            "s3:ListBucketVersions",
            "s3:GetBucketLocation",
            "s3:GetBucketVersioning"
          ],
          "Resource": [
            "arn:aws:s3:::*",
            "arn:aws:s3:::*/*"
          ]
        }
      ]
    },
    "S3WriteOnlyPolicy": {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:PutObjectAcl",
            "s3:DeleteObject",
            "s3:DeleteObjectVersion",
            "s3:InitiateMultipartUpload",
            "s3:UploadPart",
            "s3:CompleteMultipartUpload",
            "s3:AbortMultipartUpload",
            "s3:ListMultipartUploadParts"
          ],
          "Resource": [
            "arn:aws:s3:::*/*"
          ]
        }
      ]
    },
    "S3BucketManagementPolicy": {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:CreateBucket",
            "s3:DeleteBucket",
            "s3:GetBucketPolicy",
            "s3:PutBucketPolicy",
            "s3:DeleteBucketPolicy",
            "s3:GetBucketVersioning",
            "s3:PutBucketVersioning"
          ],
          "Resource": [
            "arn:aws:s3:::*"
          ]
        }
      ]
    },
    "S3IPRestrictedPolicy": {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["s3:*"],
          "Resource": [
            "arn:aws:s3:::*",
            "arn:aws:s3:::*/*"
          ],
          "Condition": {
            "IpAddress": {
              "aws:SourceIp": ["192.168.1.0/24", "10.0.0.0/8"]
            }
          }
        }
      ]
    },
    "S3TimeBasedPolicy": {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["s3:GetObject", "s3:ListBucket"],
          "Resource": [
            "arn:aws:s3:::*",
            "arn:aws:s3:::*/*"
          ],
          "Condition": {
            "DateGreaterThan": {
              "aws:CurrentTime": "2023-01-01T00:00:00Z"
            },
            "DateLessThan": {
              "aws:CurrentTime": "2025-12-31T23:59:59Z"
            }
          }
        }
      ]
    }
  },
  "bucketPolicyExamples": {
    "PublicReadPolicy": {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "PublicReadGetObject",
          "Effect": "Allow",
          "Principal": "*",
          "Action": "s3:GetObject",
          "Resource": "arn:aws:s3:::example-bucket/*"
        }
      ]
    },
    "DenyDeletePolicy": {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "DenyDeleteOperations",
          "Effect": "Deny",
          "Principal": "*",
          "Action": ["s3:DeleteObject", "s3:DeleteBucket"],
          "Resource": [
            "arn:aws:s3:::example-bucket",
            "arn:aws:s3:::example-bucket/*"
          ]
        }
      ]
    },
    "IPRestrictedAccessPolicy": {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "IPRestrictedAccess",
          "Effect": "Allow",
          "Principal": "*",
          "Action": ["s3:GetObject", "s3:PutObject"],
          "Resource": "arn:aws:s3:::example-bucket/*",
          "Condition": {
            "IpAddress": {
              "aws:SourceIp": ["203.0.113.0/24"]
            }
          }
        }
      ]
    }
  }
}