Go to file

Chris Lu f41925b60b Embed IAM API into S3 server (#7740 )

* Embed IAM API into S3 server

This change simplifies the S3 and IAM deployment by embedding the IAM API
directly into the S3 server, following the patterns used by MinIO and Ceph RGW.

Changes:
- Add -iam flag to S3 server (enabled by default)
- Create embedded IAM API handler in s3api package
- Register IAM routes (POST to /) in S3 server when enabled
- Deprecate standalone 'weed iam' command with warning

Benefits:
- Single binary, single port for both S3 and IAM APIs
- Simpler deployment and configuration
- Shared credential manager between S3 and IAM
- Backward compatible: 'weed iam' still works with deprecation warning

Usage:
- weed s3 -port=8333          # S3 + IAM on same port (default)
- weed s3 -iam=false          # S3 only, disable embedded IAM
- weed iam -port=8111         # Deprecated, shows warning

* Fix nil pointer panic: add s3.iam flag to weed server command

The enableIam field was not initialized when running S3 via 'weed server',
causing a nil pointer dereference when checking *s3opt.enableIam.

* Fix nil pointer panic: add s3.iam flag to weed filer command

The enableIam field was not initialized when running S3 via 'weed filer -s3',
causing a nil pointer dereference when checking *s3opt.enableIam.

* Add integration tests for embedded IAM API

Tests cover:
- CreateUser, ListUsers, GetUser, UpdateUser, DeleteUser
- CreateAccessKey, DeleteAccessKey, ListAccessKeys
- CreatePolicy, PutUserPolicy, GetUserPolicy
- Implicit username extraction from authorization header
- Full user lifecycle workflow test

These tests validate the embedded IAM API functionality that was
added in the S3 server, ensuring IAM operations work correctly
when served from the same port as S3.

* Security: Use crypto/rand for IAM credential generation

SECURITY FIX: Replace math/rand with crypto/rand for generating
access keys and secret keys.

Using math/rand is not cryptographically secure and can lead to
predictable credentials. This change:

1. Replaces math/rand with crypto/rand in both:
   - weed/s3api/s3api_embedded_iam.go (embedded IAM)
   - weed/iamapi/iamapi_management_handlers.go (standalone IAM)

2. Removes the seededRand variable that was initialized with
   time-based seed (predictable)

3. Updates StringWithCharset/iamStringWithCharset to:
   - Use crypto/rand.Int() for secure random index generation
   - Return an error for proper error handling

4. Updates CreateAccessKey to handle the new error return

5. Updates DoActions handlers to propagate errors properly

* Fix critical bug: DeleteUserPolicy was deleting entire user instead of policy

BUG FIX: DeleteUserPolicy was incorrectly deleting the entire user
identity from s3cfg.Identities instead of just clearing the user's
inline policy (Actions).

Before (wrong):
  s3cfg.Identities = append(s3cfg.Identities[:i], s3cfg.Identities[i+1:]...)

After (correct):
  ident.Actions = nil

Also:
- Added proper iamDeleteUserPolicyResponse / DeleteUserPolicyResponse types
- Fixed return type from iamPutUserPolicyResponse to iamDeleteUserPolicyResponse

Affected files:
- weed/s3api/s3api_embedded_iam.go (embedded IAM)
- weed/iamapi/iamapi_management_handlers.go (standalone IAM)
- weed/iamapi/iamapi_response.go (response types)

* Add tests for DeleteUserPolicy to prevent regression

Added two tests:
1. TestEmbeddedIamDeleteUserPolicy - Verifies that:
   - User is NOT deleted (identity still exists)
   - Credentials are NOT deleted
   - Only Actions (policy) are cleared to nil

2. TestEmbeddedIamDeleteUserPolicyUserNotFound - Verifies:
   - Returns 404 when user doesn't exist

These tests ensure the bug fixed in the previous commit
(deleting user instead of policy) doesn't regress.

* Fix race condition: Add mutex lock to IAM DoActions

The DoActions function performs a read-modify-write operation on the
shared IAM configuration without any locking. This could lead to race
conditions and data loss if multiple requests modify the IAM config
concurrently.

Added mutex lock at the start of DoActions in both:
- weed/s3api/s3api_embedded_iam.go (embedded IAM)
- weed/iamapi/iamapi_management_handlers.go (standalone IAM)

The lock protects the entire read-modify-write cycle:
1. GetS3ApiConfiguration (read)
2. Modify s3cfg based on action
3. PutS3ApiConfiguration (write)

* Fix action comparison and document CreatePolicy limitation

1. Replace reflect.DeepEqual with order-independent string slice comparison
   - Added iamStringSlicesEqual/stringSlicesEqual helper functions
   - Prevents duplicate policy statements when actions are in different order

2. Document CreatePolicy limitation in embedded IAM
   - Added TODO comment explaining that managed policies are not persisted
   - Users should use PutUserPolicy for inline policies

3. Fix deadlock in standalone IAM's CreatePolicy
   - Removed nested lock acquisition (DoActions already holds the lock)

Files changed:
- weed/s3api/s3api_embedded_iam.go
- weed/iamapi/iamapi_management_handlers.go

* Add rate limiting to embedded IAM endpoint

Apply circuit breaker rate limiting to the IAM endpoint to prevent abuse.
Also added request tracking for IAM operations.

The IAM endpoint now follows the same pattern as other S3 endpoints:
- track() for request metrics
- s3a.iam.Auth() for authentication
- s3a.cb.Limit() for rate limiting

* Fix handleImplicitUsername to properly look up username from AccessKeyId

According to AWS spec, when UserName is not specified in an IAM request,
IAM should determine the username implicitly based on the AccessKeyId
signing the request.

Previously, the code incorrectly extracted s[2] (region field) from the
SigV4 credential string as the username. This fix:

1. Extracts the AccessKeyId from s[0] of the credential string
2. Looks up the AccessKeyId in the credential store using LookupByAccessKey
3. Uses the identity's Name field as the username if found

Also:
- Added exported LookupByAccessKey wrapper method to IdentityAccessManagement
- Updated tests to verify correct access key lookup behavior
- Applied fix to both embedded IAM and standalone IAM implementations

* Fix CreatePolicy to not trigger unnecessary save

CreatePolicy validates the policy document and returns metadata but does not
actually store the policy (SeaweedFS uses inline policies attached via
PutUserPolicy). However, 'changed' was left as true, triggering an unnecessary
save operation.

Set changed = false after successful CreatePolicy validation in both embedded
IAM and standalone IAM implementations.

* Improve embedded IAM test quality

- Remove unused mock types (mockCredentialManager, mockEmbeddedIamApi)
- Use proto.Clone instead of proto.Merge for proper deep copy semantics
- Replace brittle regex-based XML error extraction with proper XML unmarshalling
- Remove unused regexp import
- Add state and field assertions to tests:
  - CreateUser: verify username in response and user persisted in config
  - ListUsers: verify response contains expected users
  - GetUser: verify username in response
  - CreatePolicy: verify policy metadata in response
  - PutUserPolicy: verify actions were attached to user
  - CreateAccessKey: verify credentials in response and persisted in config

* Remove shared test state and improve executeEmbeddedIamRequest

- Remove package-level embeddedIamApi variable to avoid shared test state
- Update executeEmbeddedIamRequest to accept API instance as parameter
- Only call xml.Unmarshal when v != nil, making nil-v cases explicit
- Return unmarshal error properly instead of always returning it
- Update all tests to create their own EmbeddedIamApiForTest instance
- Each test now has isolated state, preventing test interdependencies

* Add comprehensive test coverage for embedded IAM

Added tests for previously uncovered functions:
- iamStringSlicesEqual: 0% → 100%
- iamMapToStatementAction: 40% → 100%
- iamMapToIdentitiesAction: 30% → 70%
- iamHash: 100%
- iamStringWithCharset: 85.7%
- GetPolicyDocument: 75% → 100%
- CreatePolicy: 91.7% → 100%
- DeleteUser: 83.3% → 100%
- GetUser: 83.3% → 100%
- ListAccessKeys: 55.6% → 88.9%

New test cases for helper functions, error handling, and edge cases.

* Document IAM code duplication and reference GitHub issue #7747

Added comments to both IAM implementations noting the code duplication
and referencing the tracking issue for future refactoring:
- weed/s3api/s3api_embedded_iam.go (embedded IAM)
- weed/iamapi/iamapi_management_handlers.go (standalone IAM)

See: https://github.com/seaweedfs/seaweedfs/issues/7747

* Implement granular IAM authorization for self-service operations

Previously, all IAM actions required ACTION_ADMIN permission, which was
overly restrictive. This change implements AWS-like granular permissions:

Self-service operations (allowed without admin for own resources):
- CreateAccessKey (on own user)
- DeleteAccessKey (on own user)
- ListAccessKeys (on own user)
- GetUser (on own user)
- UpdateAccessKey (on own user)

Admin-only operations:
- CreateUser, DeleteUser, UpdateUser
- PutUserPolicy, GetUserPolicy, DeleteUserPolicy
- CreatePolicy
- ListUsers
- Operations on other users

The new AuthIam middleware:
1. Authenticates the request (signature verification)
2. Parses the IAM Action and target UserName
3. For self-service actions, allows if user is operating on own resources
4. For all other actions or operations on other users, requires admin

* Fix misleading comment in standalone IAM CreatePolicy

The comment incorrectly stated that CreatePolicy only validates the policy
document. In the standalone IAM server, CreatePolicy actually persists
the policy via iama.s3ApiConfig.PutPolicies(). The changed flag is false
because it doesn't modify s3cfg.Identities, not because nothing is stored.

* Simplify IAM auth and add RequestId to responses

- Remove redundant ACTION_ADMIN fallback in AuthIam: The action parameter
  in authRequest is for permission checking, not signature verification.
  If auth fails with ACTION_READ, it will fail with ACTION_ADMIN too.

- Add SetRequestId() call before writing IAM responses for AWS compatibility.
  All IAM response structs embed iamCommonResponse which has SetRequestId().

* Address code review feedback for IAM implementation

1. auth_credentials.go: Add documentation warning that LookupByAccessKey
   returns internal pointers that should not be mutated.

2. iamapi_management_handlers.go & s3api_embedded_iam.go: Add input guards
   for StringWithCharset/iamStringWithCharset when length <= 0 or charset
   is empty to avoid runtime errors from rand.Int.

3. s3api_embedded_iam_test.go: Don't ignore xml.Marshal errors in test
   DoActions handler. Return proper error response if marshaling fails.

4. s3api_embedded_iam_test.go: Use obviously fake access key IDs
   (AKIATESTFAKEKEY*) to avoid CI secret scanner false positives.

* Address code review feedback for IAM implementation (batch 2)

1. iamapi/iamapi_management_handlers.go:
   - Redact Authorization header log (security: avoid exposing signature)
   - Add nil-guard for iama.iam before LookupByAccessKey call

2. iamapi/iamapi_test.go:
   - Replace real-looking access keys with obviously fake ones
     (AKIATESTFAKEKEY*) to avoid CI secret scanner false positives

3. s3api/s3api_embedded_iam.go - CreateUser:
   - Validate UserName is not empty (return ErrCodeInvalidInputException)
   - Check for duplicate users (return ErrCodeEntityAlreadyExistsException)

4. s3api/s3api_embedded_iam.go - CreateAccessKey:
   - Return ErrCodeNoSuchEntityException if user doesn't exist
   - Removed implicit user creation behavior

5. s3api/s3api_embedded_iam.go - getActions:
   - Fix S3 ARN parsing for bucket/path patterns
   - Handle mybucket, mybucket/*, mybucket/path/* correctly
   - Return error if no valid actions found in policy

6. s3api/s3api_embedded_iam.go - handleImplicitUsername:
   - Redact Authorization header log
   - Add nil-guard for e.iam

7. s3api/s3api_embedded_iam.go - DoActions:
   - Reload in-memory IAM maps after credential mutations
   - Call LoadS3ApiConfigurationFromCredentialManager after save

8. s3api/auth_credentials.go - AuthSignatureOnly:
   - Add new signature-only authentication method
   - Bypasses S3 authorization checks for IAM operations
   - Used by AuthIam to properly separate signature verification
     from IAM-specific permission checks

* Fix nil pointer dereference and error handling in IAM

1. AuthIam: Add nil check for identity after AuthSignatureOnly
   - AuthSignatureOnly can return nil identity with ErrNone for
     authTypePostPolicy or authTypeStreamingUnsigned
   - Now returns ErrAccessDenied if identity is nil

2. writeIamErrorResponse: Add missing error code cases
   - ErrCodeEntityAlreadyExistsException -> HTTP 409 Conflict
   - ErrCodeInvalidInputException -> HTTP 400 Bad Request

3. UpdateUser: Use consistent error handling
   - Changed from direct ErrInvalidRequest to writeIamErrorResponse
   - Now returns correct HTTP status codes based on error type

* Add IAM config reload to standalone IAM server after mutations

Match the behavior of embedded IAM (s3api_embedded_iam.go) by reloading
the in-memory identity maps after persisting configuration changes.
This ensures newly created access keys are visible to LookupByAccessKey
immediately without requiring a server restart.

* Minor improvements to test helpers and log masking

1. iamapi_test.go: Update mustMarshalJSON to use t.Helper() and t.Fatal()
   instead of panic() for better test diagnostics

2. s3api_embedded_iam.go: Mask access key in 'not found' log message
   to avoid exposing full access key IDs in logs

* Mask access key in standalone IAM log message for consistency

Match the embedded IAM version by masking the access key ID in
the 'not found' log message (show only first 4 chars).

2025-12-14 16:02:06 -08:00

.github

fix: admin UI bucket deletion with filer group configured (#7735 )

2025-12-13 19:04:12 -08:00

docker

docker: add curl for HTTPS healthcheck support (#7709 )

2025-12-10 12:54:20 -08:00

k8s/charts

fix worker -admin -adminServer error (#7706 )

2025-12-10 12:56:09 -08:00

note

Correct gopher on SVG logo (#5833 )

2024-07-29 09:13:41 -07:00

other

fix: add missing backslash for volume extraArgs in helm chart (#7676 )

2025-12-08 23:21:02 -08:00

postgres-examples

Message Queue: Add sql querying (#7185 )

2025-09-09 01:01:03 -07:00

seaweedfs-rdma-sidecar

fix: add missing backslash for volume extraArgs in helm chart (#7676 )

2025-12-08 23:21:02 -08:00

snap

move to https://github.com/seaweedfs/seaweedfs

2022-07-29 00:17:28 -07:00

telemetry

Add Kafka Gateway (#7231 )

2025-10-13 18:05:17 -07:00

test

fix: admin UI bucket deletion with filer group configured (#7735 )

2025-12-13 19:04:12 -08:00

unmaintained

Migrate from deprecated azure-storage-blob-go to modern Azure SDK (#7310 )

2025-10-08 23:12:03 -07:00

util

util: added gostd script

2019-04-30 03:23:20 +00:00

weed

Embed IAM API into S3 server (#7740 )

2025-12-14 16:02:06 -08:00

.gitignore

Embed IAM API into S3 server (#7740 )

2025-12-14 16:02:06 -08:00

backers.md

chore: add nimbus web services to backers.md (#4769 )

2023-08-20 15:31:23 -07:00

BUCKET_POLICY_ENGINE_INTEGRATION.md

S3: Enforce bucket policy (#7471 )

2025-11-12 22:14:50 -08:00

CODE_OF_CONDUCT.md

add code of conduct (#4109 )

2023-01-05 11:01:22 -08:00

DESIGN.md

Admin: misc improvements on admin server and workers. EC now works. (#7055 )

2025-07-30 12:38:03 -07:00

go.mod

chore(deps): bump github.com/quic-go/quic-go from 0.54.1 to 0.57.0 (#7718 )

2025-12-11 10:58:43 -08:00

go.sum

chore(deps): bump github.com/quic-go/quic-go from 0.54.1 to 0.57.0 (#7718 )

2025-12-11 10:58:43 -08:00

LICENSE

Update LICENSE, fix copyright license year (#6405 )

2025-01-01 01:55:42 -08:00

Makefile

Remove deprecated allowEmptyFolder CLI option

2025-12-06 21:54:12 -08:00

README.md

fix: add missing backslash for volume extraArgs in helm chart (#7676 )

2025-12-08 23:21:02 -08:00

SQL_FEATURE_PLAN.md

Message Queue: Add sql querying (#7185 )

2025-09-09 01:01:03 -07:00

SSE-C_IMPLEMENTATION.md

S3 API: Add SSE-KMS (#7144 )

2025-08-21 08:28:07 -07:00

SeaweedFS

SeaweedFS is an independent Apache-licensed open source project with its ongoing development made possible entirely thanks to the support of these awesome backers. If you'd like to grow SeaweedFS even stronger, please consider joining our sponsors on Patreon.

Your support will be really appreciated by me and other supporters!

Quick Start

Quick Start for S3 API on Docker

docker run -p 8333:8333 chrislusf/seaweedfs server -s3

Quick Start with Single Binary

Download the latest binary from https://github.com/seaweedfs/seaweedfs/releases and unzip a single binary file weed or weed.exe. Or run go install github.com/seaweedfs/seaweedfs/weed@latest.
export AWS_ACCESS_KEY_ID=admin ; export AWS_SECRET_ACCESS_KEY=key as the admin credentials to access the object store.
Run weed server -dir=/some/data/dir -s3 to start one master, one volume server, one filer, and one S3 gateway.

Also, to increase capacity, just add more volume servers by running weed volume -dir="/some/data/dir2" -master="<master_host>:9333" -port=8081 locally, or on a different machine, or on thousands of machines. That is it!

Quick Start SeaweedFS S3 on AWS

Setup fast production-ready SeaweedFS S3 on AWS with cloudformation

Introduction

SeaweedFS is a simple and highly scalable distributed file system. There are two objectives:

to store billions of files!
to serve the files fast!

SeaweedFS started as an Object Store to handle small files efficiently. Instead of managing all file metadata in a central master, the central master only manages volumes on volume servers, and these volume servers manage files and their metadata. This relieves concurrency pressure from the central master and spreads file metadata into volume servers, allowing faster file access (O(1), usually just one disk read operation).

There is only 40 bytes of disk storage overhead for each file's metadata. It is so simple with O(1) disk reads that you are welcome to challenge the performance with your actual use cases.

SeaweedFS started by implementing Facebook's Haystack design paper. Also, SeaweedFS implements erasure coding with ideas from f4: Facebook’s Warm BLOB Storage System, and has a lot of similarities with Facebook’s Tectonic Filesystem

On top of the object store, optional Filer can support directories and POSIX attributes. Filer is a separate linearly-scalable stateless server with customizable metadata stores, e.g., MySql, Postgres, Redis, Cassandra, HBase, Mongodb, Elastic Search, LevelDB, RocksDB, Sqlite, MemSql, TiDB, Etcd, CockroachDB, YDB, etc.

For any distributed key value stores, the large values can be offloaded to SeaweedFS. With the fast access speed and linearly scalable capacity, SeaweedFS can work as a distributed Key-Large-Value store.

SeaweedFS can transparently integrate with the cloud. With hot data on local cluster, and warm data on the cloud with O(1) access time, SeaweedFS can achieve both fast local access time and elastic cloud storage capacity. What's more, the cloud storage access API cost is minimized. Faster and cheaper than direct cloud storage!

System	File Metadata	File Content Read	POSIX	REST API	Optimized for large number of small files
SeaweedFS	lookup volume id, cacheable	O(1) disk seek		Yes	Yes
SeaweedFS Filer	Linearly Scalable, Customizable	O(1) disk seek	FUSE	Yes	Yes
GlusterFS	hashing		FUSE, NFS
Ceph	hashing + rules		FUSE	Yes
MooseFS	in memory		FUSE		No
MinIO	separate meta file for each file			Yes	No

SeaweedFS	comparable to Ceph	advantage
Master	MDS	simpler
Volume	OSD	optimized for small files
Filer	Ceph FS	linearly scalable, Customizable, O(1) or O(logN)

README.md Unescape Escape

SeaweedFS

Sponsor SeaweedFS via Patreon

Gold Sponsors

Table of Contents

Quick Start

Quick Start for S3 API on Docker

Quick Start with Single Binary

Quick Start SeaweedFS S3 on AWS

Introduction

Features

Additional Features

Filer Features

Kubernetes

Example: Using Seaweed Object Store

Start Master Server

Start Volume Servers

Write File

Save File Id

Read File

Rack-Aware and Data Center-Aware Replication

Allocate File Key on Specific Data Center

Other Features

Object Store Architecture

Master Server and Volume Server

Write and Read files

Storage Size

Saving memory

Tiered Storage to the cloud

Compared to Other File Systems

Compared to HDFS

Compared to GlusterFS, Ceph

Compared to GlusterFS

Compared to MooseFS

Compared to Ceph

Compared to MinIO

Dev Plan

Installation Guide

Disk Related Topics

Hard Drive Performance

Solid State Disk

Benchmark

Run WARP and launch a mixed benchmark.

Enterprise

License

Stargazers over time

README.md