* s3: use url.PathUnescape for X-Amz-Copy-Source header (#8544)
The X-Amz-Copy-Source header is a URL-encoded path, not a query string.
Using url.QueryUnescape incorrectly converts literal '+' characters to
spaces, which can cause object key mismatches during copy operations.
Switch to url.PathUnescape in CopyObjectHandler, CopyObjectPartHandler,
and pathToBucketObjectAndVersion to correctly handle special characters
like '!', '+', and other RFC 3986 sub-delimiters that S3 clients may
percent-encode (e.g. '!' as %21).
* s3: add path validation to CopyObjectPartHandler
CopyObjectPartHandler was missing the validateTableBucketObjectPath
checks that CopyObjectHandler has, allowing potential path traversal
in the source bucket/object of copy part requests.
* s3: fix case-sensitive HeadersRegexp for copy source routing
The HeadersRegexp for X-Amz-Copy-Source used `%2F` which only matched
uppercase hex encoding. RFC 3986 allows both `%2F` and `%2f`, so
clients sending lowercase percent-encoding would bypass the copy
handler and hit PutObjectHandler instead. Add (?i) flag for
case-insensitive matching.
Also add test coverage for the versionId branch in
pathToBucketObjectAndVersion and for lowercase %2f routing.
* filer: expose metadata events and list snapshots
* mount: invalidate hot directory caches
* mount: read hot directories directly from filer
* mount: add sequenced metadata cache applier
* mount: apply metadata responses through cache applier
* mount: replay snapshot-consistent directory builds
* mount: dedupe self metadata events
* mount: factor directory build cleanup
* mount: replace proto marshal dedup with composite key and ring buffer
The dedup logic was doing a full deterministic proto.Marshal on every
metadata event just to produce a dedup key. Replace with a cheap
composite string key (TsNs|Directory|OldName|NewName).
Also replace the sliding-window slice (which leaked the backing array
unboundedly) with a fixed-size ring buffer that reuses the same array.
* filer: remove mutex and proto.Clone from request-scoped MetadataEventSink
MetadataEventSink is created per-request and only accessed by the
goroutine handling the gRPC call. The mutex and double proto.Clone
(once in Record, once in Last) were unnecessary overhead on every
filer write operation. Store the pointer directly instead.
* mount: skip proto.Clone for caller-owned metadata events
Add ApplyMetadataResponseOwned that takes ownership of the response
without cloning. Local metadata events (mkdir, create, flush, etc.)
are freshly constructed and never shared, so the clone is unnecessary.
* filer: only populate MetadataEvent on successful DeleteEntry
Avoid calling eventSink.Last() on error paths where the sink may
contain a partial event from an intermediate child deletion during
recursive deletes.
* mount: avoid map allocation in collectDirectoryNotifications
Replace the map with a fixed-size array and linear dedup. There are
at most 3 directories to notify (old parent, new parent, new child
if directory), so a 3-element array avoids the heap allocation on
every metadata event.
* mount: fix potential deadlock in enqueueApplyRequest
Release applyStateMu before the blocking channel send. Previously,
if the channel was full (cap 128), the send would block while holding
the mutex, preventing Shutdown from acquiring it to set applyClosed.
* mount: restore signature-based self-event filtering as fast path
Re-add the signature check that was removed when content-based dedup
was introduced. Checking signatures is O(1) on a small slice and
avoids enqueuing and processing events that originated from this
mount instance. The content-based dedup remains as a fallback.
* filer: send snapshotTsNs only in first ListEntries response
The snapshot timestamp is identical for every entry in a single
ListEntries stream. Sending it in every response message wastes
wire bandwidth for large directories. The client already reads
it only from the first response.
* mount: exit read-through mode after successful full directory listing
MarkDirectoryRefreshed was defined but never called, so directories
that entered read-through mode (hot invalidation threshold) stayed
there permanently, hitting the filer on every readdir even when cold.
Call it after a complete read-through listing finishes.
* mount: include event shape and full paths in dedup key
The previous dedup key only used Names, which could collapse distinct
rename targets. Include the event shape (C/D/U/R), source directory,
new parent path, and both entry names so structurally different events
are never treated as duplicates.
* mount: drain pending requests on shutdown in runApplyLoop
After receiving the shutdown sentinel, drain any remaining requests
from applyCh non-blockingly and signal each with errMetaCacheClosed
so callers waiting on req.done are released.
* mount: include IsDirectory in synthetic delete events
metadataDeleteEvent now accepts an isDirectory parameter so the
applier can distinguish directory deletes from file deletes. Rmdir
passes true, Unlink passes false.
* mount: fall back to synthetic event when MetadataEvent is nil
In mknod and mkdir, if the filer response omits MetadataEvent (e.g.
older filer without the field), synthesize an equivalent local
metadata event so the cache is always updated.
* mount: make Flush metadata apply best-effort after successful commit
After filer_pb.CreateEntryWithResponse succeeds, the entry is
persisted. Don't fail the Flush syscall if the local metadata cache
apply fails — log and invalidate the directory cache instead.
Also fall back to a synthetic event when MetadataEvent is nil.
* mount: make Rename metadata apply best-effort
The rename has already succeeded on the filer by the time we apply
the local metadata event. Log failures instead of returning errors
that would be dropped by the caller anyway.
* mount: make saveEntry metadata apply best-effort with fallback
After UpdateEntryWithResponse succeeds, treat local metadata apply
as non-fatal. Log and invalidate the directory cache on failure.
Also fall back to a synthetic event when MetadataEvent is nil.
* filer_pb: preserve snapshotTsNs on error in ReadDirAllEntriesWithSnapshot
Return the snapshot timestamp even when the first page fails, so
callers receive the snapshot boundary when partial data was received.
* filer: send snapshot token for empty directory listings
When no entries are streamed, send a final ListEntriesResponse with
only SnapshotTsNs so clients always receive the snapshot boundary.
* mount: distinguish not-found vs transient errors in lookupEntry
Return fuse.EIO for non-not-found filer errors instead of
unconditionally returning ENOENT, so transient failures don't
masquerade as missing entries.
* mount: make CacheRemoteObject metadata apply best-effort
The file content has already been cached successfully. Don't fail
the read if the local metadata cache update fails.
* mount: use consistent snapshot for readdir in direct mode
Capture the SnapshotTsNs from the first loadDirectoryEntriesDirect
call and store it on the DirectoryHandle. Subsequent batch loads
pass this stored timestamp so all batches use the same snapshot.
Also export DoSeaweedListWithSnapshot so mount can use it directly
with snapshot passthrough.
* filer_pb: fix test fake to send SnapshotTsNs only on first response
Match the server behavior: only the first ListEntriesResponse in a
page carries the snapshot timestamp, subsequent entries leave it zero.
* Fix nil pointer dereference in ListEntries stream consumers
Remove the empty-directory snapshot-only response from ListEntries
that sent a ListEntriesResponse with Entry==nil, which crashed every
raw stream consumer that assumed resp.Entry is always non-nil.
Also add defensive nil checks for resp.Entry in all raw ListEntries
stream consumers across: S3 listing, broker topic lookup, broker
topic config, admin dashboard, topic retention, hybrid message
scanner, Kafka integration, and consumer offset storage.
* Add nil guards for resp.Entry in remaining ListEntries stream consumers
Covers: S3 object lock check, MQ management dashboard (version/
partition/offset loops), and topic retention version loop.
* Make applyLocalMetadataEvent best-effort in Link and Symlink
The filer operations already succeeded; failing the syscall because
the local cache apply failed is wrong. Log a warning and invalidate
the parent directory cache instead.
* Make applyLocalMetadataEvent best-effort in Mkdir/Rmdir/Mknod/Unlink
The filer RPC already committed; don't fail the syscall when the
local metadata cache apply fails. Log a warning and invalidate the
parent directory cache to force a re-fetch on next access.
* flushFileMetadata: add nil-fallback for metadata event and best-effort apply
Synthesize a metadata event when resp.GetMetadataEvent() is nil
(matching doFlush), and make the apply best-effort with cache
invalidation on failure.
* Prevent double-invocation of cleanupBuild in doEnsureVisited
Add a cleanupDone guard so the deferred cleanup and inline error-path
cleanup don't both call DeleteFolderChildren/AbortDirectoryBuild.
* Fix comment: signature check is O(n) not O(1)
* Prevent deferred cleanup after successful CompleteDirectoryBuild
Set cleanupDone before returning from the success path so the
deferred context-cancellation check cannot undo a published build.
* Invalidate parent directory caches on rename metadata apply failure
When applyLocalMetadataEvent fails during rename, invalidate the
source and destination parent directory caches so subsequent accesses
trigger a re-fetch from the filer.
* Add event nil-fallback and cache invalidation to Link and Symlink
Synthesize metadata events when the server doesn't return one, and
invalidate parent directory caches on apply failure.
* Match requested partition when scanning partition directories
Parse the partition range format (NNNN-NNNN) and match against the
requested partition parameter instead of using the first directory.
* Preserve snapshot timestamp across empty directory listings
Initialize actualSnapshotTsNs from the caller-requested value so it
isn't lost when the server returns no entries. Re-add the server-side
snapshot-only response for empty directories (all raw stream consumers
now have nil guards for Entry).
* Fix CreateEntry error wrapping to support errors.Is/errors.As
Use errors.New + %w instead of %v for resp.Error so callers can
unwrap the underlying error.
* Fix object lock pagination: only advance on non-nil entries
Move entriesReceived inside the nil check so nil entries don't
cause repeated ListEntries calls with the same lastFileName.
* Guard Attributes nil check before accessing Mtime in MQ management
* Do not send nil-Entry response for empty directory listings
The snapshot-only ListEntriesResponse (with Entry == nil) for empty
directories breaks consumers that treat any received response as an
entry (Java FilerClient, S3 listing). The Go client-side
DoSeaweedListWithSnapshot already preserves the caller-requested
snapshot via actualSnapshotTsNs initialization, so the server-side
send is unnecessary.
* Fix review findings: subscriber dedup, invalidation normalization, nil guards, shutdown race
- Remove self-signature early-return in processEventFn so all events
flow through the applier (directory-build buffering sees self-originated
events that arrive after a snapshot)
- Normalize NewParentPath in collectEntryInvalidations to avoid duplicate
invalidations when NewParentPath is empty (same-directory update)
- Guard resp.Entry.Attributes for nil in admin_server.go and
topic_retention.go to prevent panics on entries without attributes
- Fix enqueueApplyRequest race with shutdown by using select on both
applyCh and applyDone, preventing sends after the apply loop exits
- Add cleanupDone check to deferred cleanup in meta_cache_init.go for
clarity alongside the existing guard in cleanupBuild
- Add empty directory test case for snapshot consistency
* Propagate authoritative metadata event from CacheRemoteObjectToLocalCluster and generate client-side snapshot for empty directories
- Add metadata_event field to CacheRemoteObjectToLocalClusterResponse
proto so the filer-emitted event is available to callers
- Use WithMetadataEventSink in the server handler to capture the event
from NotifyUpdateEvent and return it on the response
- Update filehandle_read.go to prefer the RPC's metadata event over
a locally fabricated one, falling back to metadataUpdateEvent when
the server doesn't provide one (e.g., older filers)
- Generate a client-side snapshot cutoff in DoSeaweedListWithSnapshot
when the server sends no snapshot (empty directory), so callers like
CompleteDirectoryBuild get a meaningful boundary for filtering
buffered events
* Skip directory notifications for dirs being built to prevent mid-build cache wipe
When a metadata event is buffered during a directory build,
applyMetadataSideEffects was still firing noteDirectoryUpdate for the
building directory. If the directory accumulated enough updates to
become "hot", markDirectoryReadThrough would call DeleteFolderChildren,
wiping entries that EnsureVisited had already inserted. The build would
then complete and mark the directory cached with incomplete data.
Fix by using applyMetadataSideEffectsSkippingBuildingDirs for buffered
events, which suppresses directory notifications for dirs currently in
buildingDirs while still applying entry invalidations.
* Add test for directory notification suppression during active build
TestDirectoryNotificationsSuppressedDuringBuild verifies that metadata
events targeting a directory under active EnsureVisited build do NOT
fire onDirectoryUpdate for that directory. In production, this prevents
markDirectoryReadThrough from calling DeleteFolderChildren mid-build,
which would wipe entries already inserted by the listing.
The test inserts an entry during a build, sends multiple metadata events
for the building directory, asserts no notifications fired for it,
verifies the entry survives, and confirms buffered events are replayed
after CompleteDirectoryBuild.
* Fix create invalidations, build guard, event shape, context, and snapshot error path
- collectEntryInvalidations: invalidate FUSE kernel cache on pure
create events (OldEntry==nil && NewEntry!=nil), not just updates
and deletes
- completeDirectoryBuildNow: only call markCachedFn when an active
build existed (state != nil), preventing an unpopulated directory
from being marked as cached
- Add metadataCreateEvent helper that produces a create-shaped event
(NewEntry only, no OldEntry) and use it in mkdir, mknod, symlink,
and hardlink create fallback paths instead of metadataUpdateEvent
which incorrectly set both OldEntry and NewEntry
- applyMetadataResponseEnqueue: use context.Background() for the
queued mutation so a cancelled caller context cannot abort the
apply loop mid-write
- DoSeaweedListWithSnapshot: move snapshot initialization before
ListEntries call so the error path returns the preserved snapshot
instead of 0
* Fix review findings: test loop, cache race, context safety, snapshot consistency
- Fix build test loop starting at i=1 instead of i=0, missing new-0.txt verification
- Re-check IsDirectoryCached after cache miss to avoid ENOENT race with markDirectoryReadThrough
- Use context.Background() in enqueueAndWait so caller cancellation can't abort build/complete mid-way
- Pass dh.snapshotTsNs in skip-batch loadDirectoryEntriesDirect for snapshot consistency
- Prefer resp.MetadataEvent over fallback in Unlink event derivation
- Add comment on MetadataEventSink.Record single-event assumption
* Fix empty-directory snapshot clock skew and build cancellation race
Empty-directory snapshot: Remove client-side time.Now() synthesis when
the server returns no entries. Instead return snapshotTsNs=0, and in
completeDirectoryBuildNow replay ALL buffered events when snapshot is 0.
This eliminates the clock-skew bug where a client ahead of the filer
would filter out legitimate post-list events.
Build cancellation: Use context.Background() for BeginDirectoryBuild
and CompleteDirectoryBuild calls in doEnsureVisited, so errgroup
cancellation doesn't cause enqueueAndWait to return early and trigger
cleanupBuild while the operation is still queued.
* Add tests for empty-directory build replay and cancellation resilience
TestEmptyDirectoryBuildReplaysAllBufferedEvents: verifies that when
CompleteDirectoryBuild receives snapshotTsNs=0 (empty directory, no
server snapshot), ALL buffered events are replayed regardless of their
TsNs values — no clock-skew-sensitive filtering occurs.
TestBuildCompletionSurvivesCallerCancellation: verifies that once
CompleteDirectoryBuild is enqueued, a cancelled caller context does not
prevent the build from completing. The apply loop runs with
context.Background(), so the directory becomes cached and buffered
events are replayed even when the caller gives up waiting.
* Fix directory subtree cleanup, Link rollback, test robustness
- applyMetadataResponseLocked: when a directory entry is deleted or
moved, call DeleteFolderChildren on the old path so cached descendants
don't leak as stale entries.
- Link: save original HardLinkId/Counter before mutation. If
CreateEntryWithResponse fails after the source was already updated,
rollback the source entry to its original state via UpdateEntry.
- TestBuildCompletionSurvivesCallerCancellation: replace fixed
time.Sleep(50ms) with a deadline-based poll that checks
IsDirectoryCached in a loop, failing only after 2s timeout.
- TestReadDirAllEntriesWithSnapshotEmptyDirectory: assert that
ListEntries was actually invoked on the mock client so the test
exercises the RPC path.
- newMetadataEvent: add early return when both oldEntry and newEntry are
nil to avoid emitting events with empty Directory.
---------
Co-authored-by: Copilot <copilot@github.com>
* ec: fall back to data dir when ecx file not found in idx dir (#8540)
When -dir.idx is configured after EC encoding, the .ecx/.ecj files
remain in the data directory. NewEcVolume now falls back to the data
directory when the index file is not found in dirIdx.
* ec: add fallback logging and improved error message for ecx lookup
* ec: preserve configured dirIdx, track actual ecx location separately
The previous fallback set ev.dirIdx = dir when finding .ecx in the data
directory, which corrupted IndexBaseFileName() for future writes (e.g.,
WriteIdxFileFromEcIndex during EC-to-volume conversion would write the
.idx file to the data directory instead of the configured index directory).
Introduce ecxActualDir to track where .ecx/.ecj were actually found,
used only by FileName() for cleanup/destroy. IndexBaseFileName() continues
to use the configured dirIdx for new file creation.
* ec: check both idx and data dirs for .ecx in all cleanup and lookup paths
When -dir.idx is configured after EC encoding, .ecx/.ecj files may
reside in the data directory. Several code paths only checked
l.IdxDirectory, causing them to miss these files:
- removeEcVolumeFiles: now removes .ecx/.ecj from both directories
- loadExistingVolume: ecx existence check falls back to data dir
- deleteEcShardIdsForEachLocation: ecx existence check and cleanup
both cover the data directory
- VolumeEcShardsRebuild: ecx lookup falls back to data directory
so RebuildEcxFile operates on the correct file
* request_id: add shared request middleware
* s3err: preserve request ids in responses and logs
* iam: reuse request ids in XML responses
* sts: reuse request ids in XML responses
* request_id: drop legacy header fallback
* request_id: use AWS-style request id format
* iam: fix AWS-compatible XML format for ErrorResponse and field ordering
- ErrorResponse uses bare <RequestId> at root level instead of
<ResponseMetadata> wrapper, matching the AWS IAM error response spec
- Move CommonResponse to last field in success response structs so
<ResponseMetadata> serializes after result elements
- Add randomness to request ID generation to avoid collisions
- Add tests for XML ordering and ErrorResponse format
* iam: remove duplicate error_response_test.go
Test is already covered by responses_test.go.
* address PR review comments
- Guard against typed nil pointers in SetResponseRequestID before
interface assertion (CodeRabbit)
- Use regexp instead of strings.Index in test helpers for extracting
request IDs (Gemini)
* request_id: prevent spoofing, fix nil-error branch, thread reqID to error writers
- Ensure() now always generates a server-side ID, ignoring client-sent
x-amz-request-id headers to prevent request ID spoofing. Uses a
private context key (contextKey{}) instead of the header string.
- writeIamErrorResponse in both iamapi and embedded IAM now accepts
reqID as a parameter instead of calling Ensure() internally, ensuring
a single request ID per request lifecycle.
- The nil-iamError branch in writeIamErrorResponse now writes a 500
Internal Server Error response instead of returning silently.
- Updated tests to set request IDs via context (not headers) and added
tests for spoofing prevention and context reuse.
* sts: add request-id consistency assertions to ActionInBody tests
* test: update admin test to expect server-generated request IDs
The test previously sent a client x-amz-request-id header and expected
it echoed back. Since Ensure() now ignores client headers to prevent
spoofing, update the test to verify the server returns a non-empty
server-generated request ID instead.
* iam: add generic WithRequestID helper alongside reflection-based fallback
Add WithRequestID[T] that uses generics to take the address of a value
type, satisfying the pointer receiver on SetRequestId without reflection.
The existing SetResponseRequestID is kept for the two call sites that
operate on interface{} (from large action switches where the concrete
type varies at runtime). Generics cannot replace reflection there since
Go cannot infer type parameters from interface{}.
* Remove reflection and generics from request ID setting
Call SetRequestId directly on concrete response types in each switch
branch before boxing into interface{}, eliminating the need for
WithRequestID (generics) and SetResponseRequestID (reflection).
* iam: return pointer responses in action dispatch
* Fix IAM error handling consistency and ensure request IDs on all responses
- UpdateUser/CreatePolicy error branches: use writeIamErrorResponse instead
of s3err.WriteErrorResponse to preserve IAM formatting and request ID
- ExecuteAction: accept reqID parameter and generate one if empty, ensuring
every response carries a RequestId regardless of caller
* Clean up inline policies on DeleteUser and UpdateUser rename
DeleteUser: remove InlinePolicies[userName] from policy storage before
removing the identity, so policies are not orphaned.
UpdateUser: move InlinePolicies[userName] to InlinePolicies[newUserName]
when renaming, so GetUserPolicy/DeleteUserPolicy work under the new name.
Both operations persist the updated policies and return an error if
the storage write fails, preventing partial state.
* Places the CommonResponse struct at the end of all IAM responses, rather than the start.
* iam: fix error response request id layout
* iam: add XML ordering regression test
* iam: share request id generation
---------
Co-authored-by: Aaron Segal <aaron.segal@rpsolutions.com>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
Previously, evaluateIAMPolicies created a new PolicyEngine and re-parsed
the JSON policy document for every policy on every request. This adds a
shared iamPolicyEngine field that caches compiled policies, kept in sync
by PutPolicy, DeletePolicy, and bulk config reload paths.
- PutPolicy deletes the old cache entry before setting the new one, so a
parse failure on update does not leave a stale allow.
- Log warnings when policy compilation fails instead of silently
discarding errors.
- Add test for valid-to-invalid policy update regression.
* admin: seed admin_script plugin config from master maintenance scripts
When the admin server starts, fetch the maintenance scripts configuration
from the master via GetMasterConfiguration. If the admin_script plugin
worker does not already have a saved config, use the master's scripts as
the default value. This enables seamless migration from master.toml
[master.maintenance] to the admin script plugin worker.
Changes:
- Add maintenance_scripts and maintenance_sleep_minutes fields to
GetMasterConfigurationResponse in master.proto
- Populate the new fields from viper config in master_grpc_server.go
- On admin server startup, fetch the master config and seed the
admin_script plugin config if no config exists yet
- Strip lock/unlock commands from the master scripts since the admin
script worker handles locking automatically
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address review comments on admin_script seeding
- Replace TOCTOU race (separate Load+Save) with atomic
SaveJobTypeConfigIfNotExists on ConfigStore and Plugin
- Replace ineffective polling loop with single GetMaster call using
30s context timeout, since GetMaster respects context cancellation
- Add unit tests for SaveJobTypeConfigIfNotExists (in-memory + on-disk)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: apply maintenance script defaults in gRPC handler
The gRPC handler for GetMasterConfiguration read maintenance scripts
from viper without calling SetDefault, relying on startAdminScripts
having run first. If the admin server calls GetMasterConfiguration
before startAdminScripts sets the defaults, viper returns empty
strings and the seeding is silently skipped.
Apply SetDefault in the gRPC handler itself so it is self-contained.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Revert "fix: apply maintenance script defaults in gRPC handler"
This reverts commit 068a5063303f6bc34825a07bb681adfa67e6f9de.
* fix: use atomic save in ensureJobTypeConfigFromDescriptor
ensureJobTypeConfigFromDescriptor used a separate Load + Save, racing
with seedAdminScriptFromMaster. If the descriptor defaults (empty
script) were saved first, SaveJobTypeConfigIfNotExists in the seeding
goroutine would see an existing config and skip, losing the master's
maintenance scripts.
Switch to SaveJobTypeConfigIfNotExists so both paths are atomic. Whichever
wins, the other is a safe no-op.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: fetch master scripts inline during config bootstrap, not in goroutine
Replace the seedAdminScriptFromMaster goroutine with a
ConfigDefaultsProvider callback. When the plugin bootstraps
admin_script defaults from the worker descriptor, it calls the
provider which fetches maintenance scripts from the master
synchronously. This eliminates the race between the seeding
goroutine and the descriptor-based config bootstrap.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* skip commented lock unlock
Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>
* reduce grpc calls
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: maintenance task topology lookup, retry, and stale task cleanup
1. Strip gRPC port from ServerAddress in SyncTask using ToHttpAddress()
so task targets match topology disk keys (NodeId format).
2. Skip capacity check when topology has no disks yet (startup race
where tasks are loaded from persistence before first topology update).
3. Don't retry permanent errors like "volume not found" - these will
never succeed on retry.
4. Cancel all pending tasks for each task type before re-detection,
ensuring stale proposals from previous cycles are cleaned up.
This prevents stale tasks from blocking new detection and from
repeatedly failing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* logs
Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>
* less lock scope
Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* feat: Implement IAM managed policy operations (GetPolicy, ListPolicies, DeletePolicy, AttachUserPolicy, DetachUserPolicy)
- Add response type aliases in iamapi_response.go for managed policy operations
- Implement 6 handler methods in iamapi_management_handlers.go:
- GetPolicy: Lookup managed policy by ARN
- DeletePolicy: Remove managed policy
- ListPolicies: List all managed policies
- AttachUserPolicy: Attach managed policy to user, aggregating inline + managed actions
- DetachUserPolicy: Detach managed policy from user
- ListAttachedUserPolicies: List user's attached managed policies
- Add computeAllActionsForUser() to aggregate actions from both inline and managed policies
- Wire 6 new DoActions switch cases for policy operations
- Add comprehensive tests for all new handlers
- Fixes#8506
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address PR review feedback for IAM managed policy operations
- Add parsePolicyArn() helper with proper ARN prefix validation, replacing
fragile strings.Split parsing in GetPolicy, DeletePolicy, AttachUserPolicy,
and DetachUserPolicy
- DeletePolicy now detaches the policy from all users and recomputes their
aggregated actions, preventing stale permissions after deletion
- Set changed=true for DeletePolicy DoActions case so identity updates persist
- Make PolicyId consistent: CreatePolicy now uses Hash(&policyName) matching
GetPolicy and ListPolicies
- Remove redundant nil map checks (Go handles nil map lookups safely)
- DRY up action deduplication in computeAllActionsForUser with addUniqueActions
closure
- Add tests for invalid/empty ARN rejection and DeletePolicy identity cleanup
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: add integration tests for managed policy lifecycle (#8506)
Add two integration tests covering the user-reported use case where
managed policy operations returned 500 errors:
- TestS3IAMManagedPolicyLifecycle: end-to-end workflow matching the
issue report — CreatePolicy, ListPolicies, GetPolicy, AttachUserPolicy,
ListAttachedUserPolicies, idempotent re-attach, DeletePolicy while
attached (expects DeleteConflict), DetachUserPolicy, DeletePolicy,
and verification that deleted policy is gone
- TestS3IAMManagedPolicyErrorCases: covers error paths — nonexistent
policy/user for GetPolicy, DeletePolicy, AttachUserPolicy,
DetachUserPolicy, and ListAttachedUserPolicies
Also fixes DeletePolicy to reject deletion when policy is still attached
to a user (AWS-compatible DeleteConflictException), and adds the 409
status code mapping for DeleteConflictException in the error response
handler.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: nil map panic in CreatePolicy, add PolicyId test assertions
- Initialize policies.Policies map in CreatePolicy if nil (prevents panic
when no policies exist yet); also handle filer_pb.ErrNotFound like other
callers
- Add PolicyId assertions in TestGetPolicy and TestListPolicies to lock in
the consistent Hash(&policyName) behavior
- Remove redundant time.Sleep calls from new integration tests (startMiniCluster
already blocks on waitForS3Ready)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: PutUserPolicy and DeleteUserPolicy now preserve managed policy actions
PutUserPolicy and DeleteUserPolicy were calling computeAggregatedActionsForUser
(inline-only), overwriting ident.Actions and dropping managed policy actions.
Both now call computeAllActionsForUser which unions inline + managed actions.
Add TestManagedPolicyActionsPreservedAcrossInlineMutations regression test:
attaches a managed policy, adds an inline policy (verifies both actions present),
deletes the inline policy, then asserts managed policy actions still persist.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: PutUserPolicy verifies user exists before persisting inline policy
Previously the inline policy was written to storage before checking if the
target user exists in s3cfg.Identities, leaving orphaned policy data when
the user was absent. Now validates the user first, returning
NoSuchEntityException immediately if not found.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: prevent stale/lost actions on computeAllActionsForUser failure
- PutUserPolicy: on recomputation failure, preserve existing ident.Actions
instead of falling back to only the current inline policy's actions
- DeleteUserPolicy: on recomputation failure, preserve existing ident.Actions
instead of assigning nil (which wiped all permissions)
- AttachUserPolicy: roll back ident.PolicyNames and return error if
action recomputation fails, keeping identity consistent
- DetachUserPolicy: roll back ident.PolicyNames and return error if
GetPolicies or action recomputation fails
- Add doc comment on newTestIamApiServer noting it only sets s3ApiConfig
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: add error code and header constants for GetObjectAttributes
Add ErrInvalidAttributeName error code and header constants
(X-Amz-Object-Attributes, X-Amz-Max-Parts, X-Amz-Part-Number-Marker,
X-Amz-Delete-Marker) needed by the S3 GetObjectAttributes API.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: implement GetObjectAttributes handler
Add GetObjectAttributesHandler that returns selected object metadata
(ETag, Checksum, StorageClass, ObjectSize, ObjectParts) without
returning the object body. Follows the same versioning and conditional
header patterns as HeadObjectHandler.
The handler parses the X-Amz-Object-Attributes header to determine
which attributes to include in the XML response, and supports
ObjectParts pagination via X-Amz-Max-Parts and X-Amz-Part-Number-Marker.
Ref: https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObjectAttributes.html
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: register GetObjectAttributes route
Register the GET /{object}?attributes route for the
GetObjectAttributes API, placed before other object query
routes to ensure proper matching.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: add integration tests for GetObjectAttributes
Test coverage:
- Basic: simple object with all attribute types
- MultipartObject: multipart upload with parts pagination
- SelectiveAttributes: requesting only specific attributes
- InvalidAttribute: server rejects invalid attribute names
- NonExistentObject: returns NoSuchKey for missing objects
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: add versioned object test for GetObjectAttributes
Test puts two versions of the same object and verifies that:
- GetObjectAttributes returns the latest version by default
- GetObjectAttributes with versionId returns the specific version
- ObjectSize and VersionId are correct for each version
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: fix combined conditional header evaluation per RFC 7232
Per RFC 7232:
- Section 3.4: If-Unmodified-Since MUST be ignored when If-Match is
present (If-Match is the more accurate replacement)
- Section 3.3: If-Modified-Since MUST be ignored when If-None-Match is
present (If-None-Match is the more accurate replacement)
Previously, all four conditional headers were evaluated independently.
This caused incorrect 412 responses when If-Match succeeded but
If-Unmodified-Since failed (should return 200 per AWS S3 behavior).
Fix applied to both validateConditionalHeadersForReads (GET/HEAD) and
validateConditionalHeaders (PUT) paths.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: add conditional header combination tests for GetObjectAttributes
Test the RFC 7232 combined conditional header semantics:
- If-Match=true + If-Unmodified-Since=false => 200 (If-Unmodified-Since ignored)
- If-None-Match=false + If-Modified-Since=true => 304 (If-Modified-Since ignored)
- If-None-Match=true + If-Modified-Since=false => 200 (If-Modified-Since ignored)
- If-Match=true + If-Unmodified-Since=true => 200
- If-Match=false => 412 regardless
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: document Checksum attribute as not yet populated
Checksum is accepted in validation (so clients requesting it don't get
a 400 error, matching AWS behavior for objects without checksums) but
SeaweedFS does not yet store S3 checksums. Add a comment explaining
this and noting where to populate it when checksum storage is added.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: add s3:GetObjectAttributes IAM action for ?attributes query
Previously, GET /{object}?attributes resolved to s3:GetObject via the
fallback path since resolveFromQueryParameters had no case for the
"attributes" query parameter.
Add S3_ACTION_GET_OBJECT_ATTRIBUTES constant ("s3:GetObjectAttributes")
and a branch in resolveFromQueryParameters to return it for GET requests
with the "attributes" query parameter, so IAM policies can distinguish
GetObjectAttributes from GetObject.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: evaluate conditional headers after version resolution
Move conditional header evaluation (If-Match, If-None-Match, etc.) to
after the version resolution step in GetObjectAttributesHandler. This
ensures that when a specific versionId is requested, conditions are
checked against the correct version entry rather than always against
the latest version.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: use bounded HTTP client in GetObjectAttributes tests
Replace http.DefaultClient with a timeout-aware http.Client (10s) in
the signedGetObjectAttributes helper and testGetObjectAttributesInvalid
to prevent tests from hanging indefinitely.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: check attributes query before versionId in action resolver
Move the GetObjectAttributes action check before the versionId check
in resolveFromQueryParameters. This fixes GET /bucket/key?attributes&versionId=xyz
being incorrectly classified as s3:GetObjectVersion instead of
s3:GetObjectAttributes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: add tests for versioned conditional headers and action resolver
Add integration test that verifies conditional headers (If-Match,
If-None-Match) are evaluated against the requested version entry, not
the latest version. This covers the fix in 55c409dec.
Add unit test for ResolveS3Action verifying that the attributes query
parameter takes precedence over versionId, so GET ?attributes&versionId
resolves to s3:GetObjectAttributes. This covers the fix in b92c61c95.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: guard negative chunk indices and rename PartsCount field
Add bounds checks for b.StartChunk >= 0 and b.EndChunk >= 0 in
buildObjectAttributesParts to prevent panics from corrupted metadata
with negative index values.
Rename ObjectAttributesParts.PartsCount to TotalPartsCount to match
the AWS SDK v2 Go field naming convention, while preserving the XML
element name "PartsCount" via the struct tag.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* s3api: reject malformed max-parts and part-number-marker headers
Return ErrInvalidMaxParts and ErrInvalidPartNumberMarker when the
X-Amz-Max-Parts or X-Amz-Part-Number-Marker headers contain
non-integer or negative values, matching ListObjectPartsHandler
behavior. Previously these were silently ignored with defaults.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* pb: add job type max runtime setting
* plugin: default job type max runtime
* plugin: redesign scheduler loop
* admin ui: update scheduler settings
* plugin: fix scheduler loop state name
* plugin scheduler: restore backlog skip
* plugin scheduler: drop legacy detection helper
* admin api: require scheduler config body
* admin ui: preserve detection interval on save
* plugin scheduler: use job context and drain cancels
* plugin scheduler: respect detection intervals
* plugin scheduler: gate runs and drain queue
* ec test: reuse req/resp vars
* ec test: add scheduler debug logs
* Adjust scheduler idle sleep and initial run delay
* Clear pending job queue before scheduler runs
* Log next detection time in EC integration test
* Improve plugin scheduler debug logging in EC test
* Expose scheduler next detection time
* Log scheduler next detection time in EC test
* Wake scheduler on config or worker updates
* Expose scheduler sleep interval in UI
* Fix scheduler sleep save value selection
* Set scheduler idle sleep default to 613s
* Show scheduler next run time in plugin UI
---------
Co-authored-by: Copilot <copilot@github.com>
* Add remote storage index for lazy metadata pull
Introduces remoteStorageIndex, which maintains a map of filer directory
to remote storage client/location, refreshed periodically from the
filer's mount mappings. Provides lazyFetchFromRemote, ensureRemoteEntryInFiler,
and isRemoteBacked on S3ApiServer as integration points for handler-level
work in a follow-up PR. Nothing is wired into the server yet.
Made-with: Cursor
* Add unit tests for remote storage index and wire field into S3ApiServer
Adds tests covering isEmpty, findForPath (including longest-prefix
resolution), and isRemoteBacked. Also removes a stray PR review
annotation from the index file and adds the remoteStorageIdx field
to S3ApiServer so the package compiles ahead of the wiring PR.
Made-with: Cursor
* Address review comments on remote storage index
- Use filer_pb.CreateEntry helper so resp.Error is checked, not just the RPC error
- Extract keepPrev closure to remove duplicated error-handling in refresh loop
- Add comment explaining availability-over-consistency trade-off on filer save failure
Made-with: Cursor
* Move lazy metadata pull from S3 API to filer
- Add maybeLazyFetchFromRemote in filer: on FindEntry miss, stat remote
and CreateEntry when path is under a remote mount
- Use singleflight for dedup; context guard prevents CreateEntry recursion
- Availability-over-consistency: return in-memory entry if CreateEntry fails
- Add longest-prefix test for nested mounts in remote_storage_test.go
- Remove remoteStorageIndex, lazyFetchFromRemote, ensureRemoteEntryInFiler,
doLazyFetch from s3api; filer now owns metadata operations
- Add filer_lazy_remote_test.go with tests for hit, miss, not-found,
CreateEntry failure, longest-prefix, and FindEntry integration
Made-with: Cursor
* Address review: fix context guard test, add FindMountDirectory comment, remove dead code
Made-with: Cursor
* Nitpicks: restore prev maker in registerStubMaker, instance-scope lazyFetchGroup, nil-check remoteEntry
Made-with: Cursor
* Fix remotePath when mountDir is root: ensure relPath has leading slash
Made-with: Cursor
* filer: decouple lazy-fetch persistence from caller context
Use context.Background() inside the singleflight closure for CreateEntry
so persistence is not cancelled when the winning request's context is
cancelled. Fixes CreateEntry failing for all waiters when the first
caller times out.
Made-with: Cursor
* filer: remove redundant Mode bitwise OR with zero
Made-with: Cursor
* filer: use bounded context for lazy-fetch persistence
Replace context.Background() with context.WithTimeout(30s) and defer
cancel() to prevent indefinite blocking and release resources.
Made-with: Cursor
* filer: use checked type assertion for singleflight result
Made-with: Cursor
* filer: rename persist context vars to avoid shadowing function parameter
Made-with: Cursor
* Add stale job expiry and expire API
* Add expire job button
* Add test hook and coverage for ExpirePluginJobAPI
* Document scheduler filtering side effect and reuse helper
* Restore job spec proposal test
* Regenerate plugin template output
---------
Co-authored-by: Copilot <copilot@github.com>
* Add volume dir tags to topology
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add preferred tag config for EC
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Prioritize EC destinations by tags
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add EC placement planner tag tests
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Refactor EC placement tests to reuse buildActiveTopology
Remove buildActiveTopologyWithDiskTags helper function and consolidate
tag setup inline in test cases. Tests now use UpdateTopology to apply
tags after topology creation, reusing the existing buildActiveTopology
function rather than duplicating its logic.
All tag scenario tests pass:
- TestECPlacementPlannerPrefersTaggedDisks
- TestECPlacementPlannerFallsBackWhenTagsInsufficient
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Consolidate normalizeTagList into shared util package
Extract normalizeTagList from three locations (volume.go,
detection.go, erasure_coding_handler.go) into new weed/util/tag.go
as exported NormalizeTagList function. Replace all duplicate
implementations with imports and calls to util.NormalizeTagList.
This improves code reuse and maintainability by centralizing
tag normalization logic.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add PreferredTags to EC config persistence
Add preferred_tags field to ErasureCodingTaskConfig protobuf with field
number 5. Update GetConfigSpec to include preferred_tags field in the
UI configuration schema. Add PreferredTags to ToTaskPolicy to serialize
config to protobuf. Add PreferredTags to FromTaskPolicy to deserialize
from protobuf with defensive copy to prevent external mutation.
This allows EC preferred tags to be persisted and restored across
worker restarts.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add defensive copy for Tags slice in DiskLocation
Copy the incoming tags slice in NewDiskLocation instead of storing
by reference. This prevents external callers from mutating the
DiskLocation.Tags slice after construction, improving encapsulation
and preventing unexpected changes to disk metadata.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add doc comment to buildCandidateSets method
Document the tiered candidate selection and fallback behavior. Explain
that for a planner with preferredTags, it accumulates disks matching
each tag in order into progressively larger tiers, emits a candidate
set once a tier reaches shardsNeeded, and finally falls back to the
full candidates set if preferred-tag tiers are insufficient.
This clarifies the intended semantics for future maintainers.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Apply final PR review fixes
1. Update parseVolumeTags to replicate single tag entry to all folders
instead of leaving some folders with nil tags. This prevents nil
pointer dereferences when processing folders without explicit tags.
2. Add defensive copy in ToTaskPolicy for PreferredTags slice to match
the pattern used in FromTaskPolicy, preventing external mutation of
the returned TaskPolicy.
3. Add clarifying comment in buildCandidateSets explaining that the
shardsNeeded <= 0 branch is a defensive check for direct callers,
since selectDestinations guarantees shardsNeeded > 0.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix nil pointer dereference in parseVolumeTags
Ensure all folder tags are initialized to either normalized tags or
empty slices, not nil. When multiple tag entries are provided and there
are more folders than entries, remaining folders now get empty slices
instead of nil, preventing nil pointer dereference in downstream code.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fix NormalizeTagList to return empty slice instead of nil
Change NormalizeTagList to always return a non-nil slice. When all tags
are empty or whitespace after normalization, return an empty slice
instead of nil. This prevents nil pointer dereferences in downstream
code that expects a valid (possibly empty) slice.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add nil safety check for v.tags pointer
Add a safety check to handle the case where v.tags might be nil,
preventing a nil pointer dereference. If v.tags is nil, use an empty
string instead. This is defensive programming to prevent panics in
edge cases.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add volume.tags flag to weed server and weed mini commands
Add the volume.tags CLI option to both the 'weed server' and 'weed mini'
commands. This allows users to specify disk tags when running the
combined server modes, just like they can with 'weed volume'.
The flag uses the same format and description as the volume command:
comma-separated tag groups per data dir with ':' separators
(e.g. fast:ssd,archive).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* refactor ec shard distribution
* fix shard assignment merge and mount errors
* fix mount error aggregation scope
* make WithFields compatible and wrap errors
* Prevent concurrent maintenance tasks per volume
* fix panic
* fix(s3api): correctly extract host header port when X-Forwarded-Port is present
* test(s3api): add test cases for misreported X-Forwarded-Port
* set working directory
* consolidate to worker directory
* working directory
* correct directory name
* refactoring to use wildcard matcher
* simplify
* cleaning ec working directory
* fix reference
* clean
* adjust test
* feat: add customizable plugin display names and weights
- Add weight field to JobTypeCapability proto message
- Modify ListKnownJobTypes() to return JobTypeInfo with display names and weights
- Modify ListPluginJobTypes() to return JobTypeInfo instead of string
- Sort plugins by weight (descending) then alphabetically
- Update admin API to return enriched job type metadata
- Update plugin UI template to display names instead of IDs
- Consolidate API by reusing existing function names instead of suffixed variants
* perf: optimize plugin job type capability lookup and add null-safe parsing
- Pre-calculate job type capabilities in a map to reduce O(n*m) nested loops
to O(n+m) lookup time in ListKnownJobTypes()
- Add parseJobTypeItem() helper function for null-safe job type item parsing
- Refactor plugin.templ to use parseJobTypeItem() in all job type access points
(hasJobType, applyInitialNavigation, ensureActiveNavigation, renderTopTabs)
- Deterministic capability resolution by using first worker's capability
* templ
* refactor: use parseJobTypeItem helper consistently in plugin.templ
Replace duplicated job type extraction logic at line 1296-1298 with
parseJobTypeItem() helper function for consistency and maintainability.
* improve: prefer richer capability metadata and add null-safety checks
- Improve capability selection in ListKnownJobTypes() to prefer capabilities
with non-empty DisplayName and higher Weight across all workers instead of
first-wins approach. Handles mixed-version clusters better.
- Add defensive null checks in renderJobTypeSummary() to safely access
parseJobTypeItem() result before property access
- Ensures malformed or missing entries won't break the rendering pipeline
* fix: preserve existing DisplayName when merging capabilities
Fix capability merge logic to respect existing DisplayName values:
- If existing has DisplayName but candidate doesn't, preserve existing
- If existing doesn't have DisplayName but candidate does, use candidate
- Only use Weight comparison if DisplayName status is equal
- Prevents higher-weight capabilities with empty DisplayName from
overriding capabilities with non-empty DisplayName
* feat: drop table location mapping support
Disable external metadata locations for S3 Tables and remove the table location
mapping index entirely. Table metadata must live under the table bucket paths,
so lookups no longer use mapping directories.
Changes:
- Remove mapping lookup and cache from bucket path resolution
- Reject metadataLocation in CreateTable and UpdateTable
- Remove mapping helpers and tests
* compile
* refactor
* fix: accept metadataLocation in S3 Tables API requests
We removed the external table location mapping feature, but still need to
accept and store metadataLocation values from clients like Trino. The mapping
feature was an internal implementation detail that mapped external buckets to
internal table paths. The metadataLocation field itself is part of the S3 Tables
API and should be preserved.
* fmt
* fix: handle MetadataLocation in UpdateTable requests
Mirror handleCreateTable behavior by updating metadata.MetadataLocation
when req.MetadataLocation is provided in UpdateTable requests. This ensures
table metadata location can be updated, not just set during creation.
* fix: move table location mappings to /etc/s3tables to avoid bucket name validation
Fixes#8362 - table location mappings were stored under /buckets/.table-location-mappings
which fails bucket name validation because it starts with a dot. Moving them to
/etc/s3tables resolves the migration error for upgrades.
Changes:
- Table location mappings now stored under /etc/s3tables
- Ensure parent /etc directory exists before creating /etc/s3tables
- Normal writes go to new location only (no legacy compatibility)
- Removed bucket name validation exception for old location
* refactor: simplify lookupTableLocationMapping by removing redundant mappingPath parameter
The mappingPath function parameter was redundant as the path can be derived
from mappingDir and bucket using path.Join. This simplifies the code and
reduces the risk of path mismatches between parameters.
* Fix S3 signature verification behind reverse proxies
When SeaweedFS is deployed behind a reverse proxy (e.g. nginx, Kong,
Traefik), AWS S3 Signature V4 verification fails because the Host header
the client signed with (e.g. "localhost:9000") differs from the Host
header SeaweedFS receives on the backend (e.g. "seaweedfs:8333").
This commit adds a new -s3.externalUrl parameter (and S3_EXTERNAL_URL
environment variable) that tells SeaweedFS what public-facing URL clients
use to connect. When set, SeaweedFS uses this host value for signature
verification instead of the Host header from the incoming request.
New parameter:
-s3.externalUrl (flag) or S3_EXTERNAL_URL (environment variable)
Example: -s3.externalUrl=http://localhost:9000
Example: S3_EXTERNAL_URL=https://s3.example.com
The environment variable is particularly useful in Docker/Kubernetes
deployments where the external URL is injected via container config.
The flag takes precedence over the environment variable when both are set.
At startup, the URL is parsed and default ports are stripped to match
AWS SDK behavior (port 80 for HTTP, port 443 for HTTPS), so
"http://s3.example.com:80" and "http://s3.example.com" are equivalent.
Bugs fixed:
- Default port stripping was removed by a prior PR, causing signature
mismatches when clients connect on standard ports (80/443)
- X-Forwarded-Port was ignored when X-Forwarded-Host was not present
- Scheme detection now uses proper precedence: X-Forwarded-Proto >
TLS connection > URL scheme > "http"
- Test expectations for standard port stripping were incorrect
- expectedHost field in TestSignatureV4WithForwardedPort was declared
but never actually checked (self-referential test)
* Add Docker integration test for S3 proxy signature verification
Docker Compose setup with nginx reverse proxy to validate that the
-s3.externalUrl parameter (or S3_EXTERNAL_URL env var) correctly
resolves S3 signature verification when SeaweedFS runs behind a proxy.
The test uses nginx proxying port 9000 to SeaweedFS on port 8333,
with X-Forwarded-Host/Port/Proto headers set. SeaweedFS is configured
with -s3.externalUrl=http://localhost:9000 so it uses "localhost:9000"
for signature verification, matching what the AWS CLI signs with.
The test can be run with aws CLI on the host or without it by using
the amazon/aws-cli Docker image with --network host.
Test covers: create-bucket, list-buckets, put-object, head-object,
list-objects-v2, get-object, content round-trip integrity,
delete-object, and delete-bucket — all through the reverse proxy.
* Create s3-proxy-signature-tests.yml
* fix CLI
* fix CI
* Update s3-proxy-signature-tests.yml
* address comments
* Update Dockerfile
* add user
* no need for fuse
* Update s3-proxy-signature-tests.yml
* debug
* weed mini
* fix health check
* health check
* fix health checking
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Chris Lu <chris.lu@gmail.com>
* Make EC detection context aware
* Update register.go
* Speed up EC detection planning
* Add tests for EC detection planner
* optimizations
detection.go: extracted ParseCollectionFilter (exported) and feed it into the detection loop so both detection and tracing share the same parsing/whitelisting logic; the detection loop now iterates on a sorted list of volume IDs, checks the context at every iteration, and only sets hasMore when there are still unprocessed groups after hitting maxResults, keeping runtime bounded while still scheduling planned tasks before returning the results.
erasure_coding_handler.go: dropped the duplicated inline filter parsing in emitErasureCodingDetectionDecisionTrace and now reuse erasurecodingtask.ParseCollectionFilter, and the summary suffix logic now only accounts for the hasMore case that can actually happen.
detection_test.go: updated the helper topology builder to use master_pb.VolumeInformationMessage (matching the current protobuf types) and tightened the cancellation/max-results tests so they reliably exercise the detection logic (cancel before calling Detection, and provide enough disks so one result is produced before the limit).
* use working directory
* fix compilation
* fix compilation
* rename
* go vet
* fix getenv
* address comments, fix error
* Fix SFTP file upload failures with JWT filer tokens (issue #8425)
When JWT authentication is enabled for filer operations via jwt.filer_signing.*
configuration, SFTP server file upload requests were rejected because they lacked
JWT authorization headers.
Changes:
- Added JWT signing key and expiration fields to SftpServer struct
- Modified putFile() to generate and include JWT tokens in upload requests
- Enhanced SFTPServiceOptions with JWT configuration fields
- Updated SFTP command startup to load and pass JWT config to service
This allows SFTP uploads to authenticate with JWT-enabled filers, consistent
with how other SeaweedFS components (S3 API, file browser) handle filer auth.
Fixes#8425
* Apply suggestion from @gemini-code-assist[bot]
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* Allow multipart upload operations when s3:PutObject is authorized
Multipart upload is an implementation detail of putting objects, not a separate
permission. When a policy grants s3:PutObject, it should implicitly allow:
- s3:CreateMultipartUpload
- s3:UploadPart
- s3:CompleteMultipartUpload
- s3:AbortMultipartUpload
- s3:ListParts
This fixes a compatibility issue where clients like PyArrow that use multipart
uploads by default would fail even though the role had s3:PutObject permission.
The session policy intersection still applies - both the identity-based policy
AND session policy must allow s3:PutObject for multipart operations to work.
Implementation:
- Added constants for S3 multipart action strings
- Added multipartActionSet to efficiently check if action is multipart-related
- Updated MatchesAction method to implicitly grant multipart when PutObject allowed
* Update weed/s3api/policy_engine/types.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* Add s3:ListMultipartUploads to multipart action set
Include s3:ListMultipartUploads in the multipartActionSet so that listing
multipart uploads is implicitly granted when s3:PutObject is authorized.
ListMultipartUploads is a critical part of the multipart upload workflow,
allowing clients to query in-progress uploads before completing them.
Changes:
- Added s3ListMultipartUploads constant definition
- Included s3ListMultipartUploads in multipartActionSet initialization
- Existing references to multipartActionSet automatically now cover ListMultipartUploads
All policy engine tests pass (0.351s execution time)
* Refactor: reuse multipart action constants from s3_constants package
Remove duplicate constant definitions from policy_engine/types.go and import
the canonical definitions from s3api/s3_constants/s3_actions.go instead.
This eliminates duplication and ensures a single source of truth for
multipart action strings:
- ACTION_CREATE_MULTIPART_UPLOAD
- ACTION_UPLOAD_PART
- ACTION_COMPLETE_MULTIPART
- ACTION_ABORT_MULTIPART
- ACTION_LIST_PARTS
- ACTION_LIST_MULTIPART_UPLOADS
All policy engine tests pass (0.350s execution time)
* Fix S3_ACTION_LIST_MULTIPART_UPLOADS constant value
Move S3_ACTION_LIST_MULTIPART_UPLOADS from bucket operations to multipart
operations section and change value from 's3:ListBucketMultipartUploads' to
's3:ListMultipartUploads' to match the action strings used in policy_engine
and s3_actions.go.
This ensures consistent action naming across all S3 constant definitions.
* refactor names
* Fix S3 action constant mismatches and MatchesAction early return bug
Fix two critical issues in policy engine:
1. S3Actions map had incorrect multipart action mappings:
- 'ListMultipartUploads' was 's3:ListMultipartUploads' (should be 's3:ListBucketMultipartUploads')
- 'ListParts' was 's3:ListParts' (should be 's3:ListMultipartUploadParts')
These mismatches caused authorization checks to fail for list operations
2. CompiledStatement.MatchesAction() had early return bug:
- Previously returned true immediately upon first direct action match
- This prevented scanning remaining matchers for s3:PutObject permission
- Now scans ALL matchers before returning, tracking both direct match and PutObject grant
- Ensures multipart operations inherit s3:PutObject authorization even when
explicitly requested action doesn't match (e.g., s3:ListMultipartUploadParts)
Changes:
- Track matchedAction flag to defer
Fix two critical issues in policy engine:
1. S3Actions map had incorrect multipart action mappings:
- 'ListMultipartUploads' was 's3:ListMultipartUplPer
1. S3Actions map had incorrect multiparAll - 'ListMultipartUploads(0.334s execution time)
* Refactor S3Actions map to use s3_constants
Replace hardcoded action strings in the S3Actions map with references to
canonical S3_ACTION_* constants from s3_constants/s3_action_strings.go.
Benefits:
- Single source of truth for S3 action values
- Eliminates string duplication across codebase
- Ensures consistency between policy engine and middleware
- Reduces maintenance burden when action strings need updates
All policy engine tests pass (0.334s execution time)
* Remove unused S3Actions map
The S3Actions map in types.go was never referenced anywhere in the codebase.
All action mappings are handled by GetActionMappings() in integration.go instead.
This removes 42 lines of dead code.
* Fix test: reload configuration function must also reload IAM state
TestEmbeddedIamAttachUserPolicyRefreshesIAM was failing because the test's
reloadConfigurationFunc only updated mockConfig but didn't reload the actual IAM
state. When AttachUserPolicy calls refreshIAMConfiguration(), it would use the
test's incomplete reload function instead of the real LoadS3ApiConfigurationFromCredentialManager().
Fixed by making the test's reloadConfigurationFunc also call
e.iam.LoadS3ApiConfigurationFromCredentialManager() so lookupByIdentityName()
sees the updated policy attachments.
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* feat: add statfile; add error for remote storage misses
* feat: statfile implementations for storage providers
* test: add unit tests for StatFile method across providers
Add comprehensive unit tests for the StatFile implementation covering:
- S3: interface compliance and error constant accessibility
- Azure: interface compliance, error constants, and field population
- GCS: interface compliance, error constants, error detection, and field population
Also fix variable shadowing issue in S3 and Azure StatFile implementations where
named return parameters were being shadowed by local variable declarations.
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix: address StatFile review feedback
- Use errors.New for ErrRemoteObjectNotFound sentinel
- Fix S3 HeadObject 404 detection to use awserr.Error code check
- Remove hollow field-population tests that tested nothing
- Remove redundant stdlib error detection tests
- Trim verbose doc comment on ErrRemoteObjectNotFound
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix: address second round of StatFile review feedback
- Rename interface assertion tests to TestXxxRemoteStorageClientImplementsInterface
- Delegate readFileRemoteEntry to StatFile in all three providers
- Revert S3 404 detection to RequestFailure.StatusCode() check
- Fix double-slash in GCS error message format string
- Add storage type prefix to S3 error message for consistency
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix: comments
---------
Co-authored-by: Cursor <cursoragent@cursor.com>