seaweedFS

Author	SHA1	Message	Date
Chris Lu	ced2236cc6	Adjust rename events metadata format (#8854 ) * rename metadata events * fix subscription filter to use NewEntry.Name for rename path matching The server-side subscription filter constructed the new path using OldEntry.Name instead of NewEntry.Name when checking if a rename event's destination matches the subscriber's path prefix. This could cause events to be incorrectly filtered when a rename changes the file name. * fix bucket events to handle rename of bucket directories onBucketEvents only checked IsCreate and IsDelete. A bucket directory rename via AtomicRenameEntry now emits a single rename event (both OldEntry and NewEntry non-nil), which matched neither check. Handle IsRename by deleting the old bucket and creating the new one. * fix replicator to handle rename events across directory boundaries Two issues fixed: 1. The replicator filtered events by checking if the key (old path) was under the source directory. Rename events now use the old path as key, so renames from outside into the watched directory were silently dropped. Now both old and new paths are checked, and cross-boundary renames are converted to create or delete. 2. NewParentPath was passed to the sink without remapping to the sink's target directory structure, causing the sink to write entries at the wrong location. Now NewParentPath is remapped alongside the key. * fix filer sync to handle rename events crossing directory boundaries The early directory-prefix filter only checked resp.Directory (old parent). Rename events now carry the old parent as Directory, so renames from outside the source path into it were dropped before reaching the existing cross-boundary handling logic. Check both old and new directories against sourcePath and excludePaths so the downstream old-key/new-key logic can properly convert these to create or delete operations. * fix metadata event path matching * fix metadata event consumers for rename targets * Fix replication rename target keys Logical rename events now reach replication sinks with distinct source and target paths.\n\nHandle non-filer sinks as delete-plus-create on the translated target key, and make the rename fallback path create at the translated target key too.\n\nAdd focused tests covering non-filer renames, filer rename updates, and the fallback path.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix filer sync rename path scoping Use directory-boundary matching instead of raw prefix checks when classifying source and target paths during filer sync.\n\nAlso apply excludePaths per side so renames across excluded boundaries downgrade cleanly to create/delete instead of being misclassified as in-scope updates.\n\nAdd focused tests for boundary matching and rename classification.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix replicator directory boundary checks Use directory-boundary matching instead of raw prefix checks when deciding whether a source or target path is inside the watched tree or an excluded subtree.\n\nThis prevents sibling paths such as /foo and /foobar from being misclassified during rename handling, and preserves the earlier rename-target-key fix.\n\nAdd focused tests for boundary matching and rename classification across sibling/excluded directories.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix etc-remote rename-out handling Use boundary-safe source/target directory membership when classifying metadata events under DirectoryEtcRemote.\n\nThis prevents rename-out events from being processed as config updates, while still treating them as removals where appropriate for the remote sync and remote gateway command paths.\n\nAdd focused tests for update/removal classification and sibling-prefix handling.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Defer rename events until commit Queue logical rename metadata events during atomic and streaming renames and publish them only after the transaction commits successfully.\n\nThis prevents subscribers from seeing delete or logical rename events for operations that later fail during delete or commit.\n\nAlso serialize notification.Queue swaps in rename tests and add failure-path coverage.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Skip descendant rename target lookups Avoid redundant target lookups during recursive directory renames once the destination subtree is known absent.\n\nThe recursive move path now inserts known-absent descendants directly, and the test harness exercises prefixed directory listing so the optimization is covered by a directory rename regression test.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Tighten rename review tests Return filer_pb.ErrNotFound from the bucket tracking store test stub so it follows the FilerStore contract, and add a webhook filter case for same-name renames across parent directories.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix HardLinkId format verb in InsertEntryKnownAbsent error HardLinkId is a byte slice. %d prints each byte as a decimal number which is not useful for an identifier. Use %x to match the log line two lines above. * only skip descendant target lookup when source and dest use same store moveFolderSubEntries unconditionally passed skipTargetLookup=true for every descendant. This is safe when all paths resolve to the same underlying store, but with path-specific store configuration a child's destination may map to a different backend that already holds an entry at that path. Use FilerStoreWrapper.SameActualStore to check per-child and fall back to the full CreateEntry path when stores differ. * add nil and create edge-case tests for metadata event scope helpers * extract pathIsEqualOrUnder into util.IsEqualOrUnder Identical implementations existed in both replication/replicator.go and command/filer_sync.go. Move to util.IsEqualOrUnder (alongside the existing FullPath.IsUnder) and remove the duplicates. * use MetadataEventTargetDirectory for new-side directory in filer sync The new-side directory checks and sourceNewKey computation used message.NewParentPath directly. If NewParentPath were empty (legacy events, older filer versions during rolling upgrades), sourceNewKey would be wrong (/filename instead of /dir/filename) and the UpdateEntry parent path rewrite would panic on slice bounds. Derive targetDir once from MetadataEventTargetDirectory, which falls back to resp.Directory when NewParentPath is empty, and use it consistently for all new-side checks and the sink parent path.	2026-03-30 18:25:11 -07:00
Chris Lu	92c2fc0d52	Add insecure_skip_verify option for HTTPS client in security.toml (#8781 ) * Add -insecureSkipVerify flag and config option for filer.sync HTTPS connections When using filer.sync between clusters with different CAs (e.g., separate OpenShift clusters), TLS certificate verification fails with "x509: certificate signed by unknown authority". This adds two ways to skip TLS certificate verification: 1. CLI flag: `weed filer.sync -insecureSkipVerify ...` 2. Config option: `insecure_skip_verify = true` under [https.client] in security.toml Closes #8778 * Add insecure_skip_verify option for HTTPS client in security.toml When using filer.sync between clusters with different CAs (e.g., separate OpenShift clusters), TLS certificate verification fails. Adding insecure_skip_verify = true under [https.client] in security.toml allows skipping TLS certificate verification. The option is read during global HTTP client initialization so it applies to all HTTPS connections including filer.sync proxy reads and writes. Closes #8778 --------- Co-authored-by: Copilot <copilot@github.com>	2026-03-26 11:42:47 -07:00
Chris Lu	2877febd73	S3: fix silent PutObject failure and enforce 1024-byte key limit (#8764 ) * S3: add KeyTooLongError error code Add ErrKeyTooLongError (HTTP 400, code "KeyTooLongError") to match the standard AWS S3 error for object keys that exceed length limits. * S3: fix silent PutObject failure when entry name exceeds max_file_name_length putToFiler called client.CreateEntry() directly and discarded the gRPC response. The filer embeds application errors like "entry name too long" in resp.Error (not as gRPC transport errors), so the error was silently swallowed and clients received HTTP 200 with an ETag for objects that were never stored. Switch to the filer_pb.CreateEntry() helper which properly checks resp.Error, and map "entry name too long" to KeyTooLongError (HTTP 400). To avoid fragile string parsing across the gRPC boundary, define shared error message constants in weed/util/constants and use them in both the filer (producing errors) and S3 API (matching errors). Switch filerErrorToS3Error to use strings.Contains/HasSuffix with these constants so matches work regardless of any wrapper prefix. Apply filerErrorToS3Error to the mkdir path for directory markers. Fixes #8759 * S3: enforce 1024-byte maximum object key length AWS S3 limits object keys to 1024 bytes. Add early validation on write paths (PutObject, CopyObject, CreateMultipartUpload) to reject keys exceeding the limit with the standard KeyTooLongError (HTTP 400). The key length check runs before bucket auto-creation to prevent overlong keys from triggering unnecessary side effects. Also use filerErrorToS3Error for CopyObject's mkFile error paths so name-too-long errors from the filer return KeyTooLongError instead of InternalError. Ref #8758 * S3: add handler-level tests for key length validation and error mapping Add tests for filerErrorToS3Error mapping "entry name too long" to KeyTooLongError, including a regression test for the CreateEntry-prefixed "existing ... is a directory" form. Add handler-level integration tests that exercise PutObjectHandler, CopyObjectHandler, and NewMultipartUploadHandler via httptest, verifying HTTP 400 and KeyTooLongError XML response for overlong keys and acceptance of keys at the 1024-byte limit.	2026-03-24 13:35:28 -07:00
Chris Lu	6bf654c25c	fix: keep metadata subscriptions progressing (#8730 ) (#8746 ) * fix: keep metadata subscriptions progressing (#8730) * test: cancel slow metadata writers with parent context * filer: ignore missing persisted log chunks	2026-03-23 15:26:54 -07:00
Chris Lu	08b79a30f6	Fix lock table shared wait condition (#8707 ) * Fix lock table shared wait condition (#8696) * Refactor lock table waiter check * Add exclusive lock wait helper test	2026-03-19 16:08:24 -07:00
Chris Lu	b665c329bc	fix(replication): resume partial chunk reads on EOF instead of re-downloading (#8607 ) * fix(replication): resume partial chunk reads on EOF instead of re-downloading When replicating chunks and the source connection drops mid-transfer, accumulate the bytes already received and retry with a Range header to fetch only the remaining bytes. This avoids re-downloading potentially large chunks from scratch on each retry, reducing load on busy source servers and speeding up recovery. * test(replication): add tests for downloadWithRange including gzip partial reads Tests cover: - No offset (no Range header sent) - With offset (Range header verified) - Content-Disposition filename extraction - Partial read + resume: server drops connection mid-transfer, client resumes with Range from the offset of received bytes - Gzip partial read + resume: first response is gzip-encoded (Go auto- decompresses), connection drops, resume request gets decompressed data (Go doesn't add Accept-Encoding when Range is set, so the server decompresses), combined bytes match original * fix(replication): address PR review comments - Consolidate downloadWithRange into DownloadFile with optional offset parameter (variadic), eliminating code duplication (DRY) - Validate HTTP response status: require 206 + correct Content-Range when offset > 0, reject when server ignores Range header - Use if/else for fullData assignment for clarity - Add test for rejected Range (server returns 200 instead of 206) * refactor(replication): remove unused ReplicationSource interface The interface was never referenced and its signature didn't match the actual FilerSource.ReadPart method. --------- Co-authored-by: Copilot <copilot@github.com>	2026-03-11 22:38:22 -07:00
Chris Lu	4a5243886a	4.17	2026-03-11 02:29:24 -07:00
Chris Lu	8ad58e7002	4.16	2026-03-09 21:52:43 -07:00
Chris Lu	2ec0a67ee3	master: return 503/Unavailable during topology warmup after leader change (#8529 ) * master: return 503/Unavailable during topology warmup after leader change After a master restart or leader change, the topology is empty until volume servers reconnect and send heartbeats. During this warmup window (3 heartbeat intervals = 15 seconds), volume lookups that fail now return 503 Service Unavailable (HTTP) or gRPC Unavailable instead of 404 Not Found, signaling clients to retry with other masters. * master: skip warmup 503 on fresh start and single-master setups - Check MaxVolumeId > 0 to distinguish restart from fresh start (MaxVolumeId is Raft-persisted, so 0 means no prior data) - Check peer count > 1 so single-master deployments aren't affected (no point suggesting "retry with other masters" if there are none) * master: address review feedback and block assigns during warmup - Protect LastLeaderChangeTime with dedicated mutex (fix data race) - Extract warmup multiplier as WarmupPulseMultiplier constant - Derive Retry-After header from pulse config instead of hardcoding - Only trigger warmup 503 for "not found" errors, not parse errors - Return nil response (not partial) on gRPC Unavailable - Add doc comments to IsWarmingUp, getter/setter, WarmupDuration - Block volume assign requests (HTTP and gRPC) during warmup, since the topology is incomplete and assignments would be unreliable - Skip warmup behavior for single-master setups (no peers to retry) * master: apply warmup to all setups, skip only on fresh start Single-master restarts still have an empty topology until heartbeats arrive, so warmup protection should apply there too. The only case to skip is a fresh cluster start (MaxVolumeId == 0), which already has no volumes to look up. - Remove GetMasterCount() > 1 guard from all warmup checks - Remove now-unused GetMasterCount helper - Update error messages to "topology is still loading" (not "retry with other masters" which doesn't apply to single-master) * master: add client-side retry on Unavailable for lookup and assign The server-side 503/Unavailable during warmup needs client cooperation. Previously, LookupVolumeIds and Assign would immediately propagate the error without retry. Now both paths retry with exponential backoff (1s -> 1.5s -> ... up to 6s) when receiving Unavailable, respecting context cancellation. This covers the warmup window where the master's topology is still loading after a restart or leader change. * master: seed warmup timestamp in legacy raft path at setup The legacy raft path only set lastLeaderChangeTime inside the event listener callback, which could fire after IsLeader() was already observed as true in SetRaftServer. Seed the timestamp at setup time (matching the hashicorp path) so IsWarmingUp() is active immediately. * master: fix assign retry loop to cover full warmup window The retry loop used waitTime <= maxWaitTime as a stop condition, causing it to give up after ~13s while warmup lasts 15s. Now cap each individual sleep at maxWaitTime but keep retrying until the context is cancelled. * master: preserve gRPC status in lookup retry and fix retry window Return the raw gRPC error instead of wrapping with fmt.Errorf so status.FromError() can extract the status code. Use proper gRPC status check (codes.Unavailable) instead of string matching. Also cap individual sleep at maxWaitTime while retrying until ctx is done. * master: use gRPC status code instead of string matching in assign retry Use status.FromError/codes.Unavailable instead of brittle strings.Contains for detecting retriable gRPC errors in the assign retry loop. * master: use remaining warmup duration for Retry-After header Set Retry-After to the remaining warmup time instead of the full warmup duration, so clients don't wait longer than necessary. * master: reset ret.Replicas before populating from assign response Clear Replicas slice before appending to prevent duplicate entries when the assign response is retried or when alternative requests are attempted. * master: add unit tests for warmup retry behavior Test that Assign() and LookupVolumeIds() retry on codes.Unavailable and stop promptly when the context is cancelled. * master: record leader change time before initialization work Move SetLastLeaderChangeTime() to fire immediately when the leader change event is received, before DoBarrier(), EnsureTopologyId(), and updatePeers(), so the warmup clock starts at the true moment of leadership transition. * master: use topology warmup duration in volume growth wait loop Replace hardcoded constants.VolumePulsePeriod * 2 with topo.IsWarmingUp() and topo.WarmupDuration() so the growth wait stays in sync with the configured warmup window. Remove unused constants import. * master: resolve master before creating RPC timeout context Move GetMaster() call before context.WithTimeout() so master resolution blocking doesn't consume the gRPC call timeout. * master: use NotFound flag instead of string matching for volume lookup Add a NotFound field to LookupResult and set it in findVolumeLocation when a volume is genuinely missing. Update HTTP and gRPC warmup checks to use this flag instead of strings.Contains on the error message. * master: bound assign retry loop to 30s for deadline-free contexts Without a context deadline, the Unavailable retry loop could spin forever. Add a maxRetryDuration of 30s so the loop gives up even when no context deadline is set. * master: strengthen assign retry cancellation test Verify the retry loop actually retried (callCount > 1) and that the returned error is context.DeadlineExceeded, not just any error. * master: extract shared retry-with-backoff utility Add util.RetryWithBackoff for context-aware, bounded retry with exponential backoff. Refactor both Assign() and LookupVolumeIds() to use it instead of duplicating the retry/sleep/backoff logic. * master: cap waitTime in RetryWithBackoff to prevent unbounded growth Cap the backoff waitTime at maxWaitTime so it doesn't grow indefinitely in long-running retry scenarios. * master: only return Unavailable during warmup when all lookups failed For batched LookupVolume requests, return partial results when some volumes are found. Only return codes.Unavailable when no volumes were successfully resolved, so clients benefit from partial results instead of retrying unnecessarily. * master: set retriable error message in 503 response body When returning 503 during warmup, replace the "not found" error in the JSON body with "service warming up, please retry" so clients don't treat it as a permanent error. * master: guard empty master address in LookupVolumeIds If GetMaster() returns empty (no master found or ctx cancelled), return an appropriate error instead of dialing an empty address. Returns ctx.Err() if context is done, otherwise codes.Unavailable to trigger retry. * master: add comprehensive tests for RetryWithBackoff Test success after retries, non-retryable error handling, context cancellation, and maxDuration cap with context.Background(). * master: enforce hard maxDuration bound in RetryWithBackoff Use a deadline instead of elapsed-time check so the last sleep is capped to remaining time. This prevents the total retry duration from overshooting maxDuration by up to one full backoff interval. * master: respect fresh-start bypass in RemainingWarmupDuration Check IsWarmingUp() first (which returns false when MaxVolumeId==0) so RemainingWarmupDuration returns 0 on fresh clusters. * master: round up Retry-After seconds to avoid underestimating Use math.Ceil so fractional remaining seconds (e.g. 1.9s) round up to the next integer (2) instead of flooring down (1). * master: tighten batch lookup warmup to all-NotFound only Only return codes.Unavailable when every requested volume ID was a transient not-found. Mixed cases with non-NotFound errors now return the response with per-volume error details preserved. * master: reduce retry log noise and fix timer leak Lower per-attempt retry log from V(0) to V(1) to reduce noise during warmup. Replace time.After with time.NewTimer to avoid lingering timers when context is cancelled. * master: add per-attempt timeout for assign RPC Use a 10s per-attempt timeout so a single slow RPC can't consume the entire 30s retry budget when ctx has no deadline. * master: share single 30s retry deadline across assign request entries The Assign() function iterates over primary and fallback requests, previously giving each its own 30s RetryWithBackoff budget. With a primary + fallback, the total could reach 60s. Compute one deadline up front and pass the remaining budget to each RetryWithBackoff call so the entire Assign() call stays within a single 30s cap. * master: strengthen context-cancel test with DeadlineExceeded and retry assertions Assert errors.Is(err, context.DeadlineExceeded) to verify the error is specifically from the context deadline, and check callCount > 1 to prove retries actually occurred before cancellation. Mirrors the pattern used in TestAssignStopsOnContextCancel. * master: bound GetMaster with per-attempt timeout in LookupVolumeIds GetMaster() calls WaitUntilConnected() which can block indefinitely if no master is available. Previously it used the outer ctx, so a slow master resolution could consume the entire RetryWithBackoff budget in a single attempt. Move the per-attempt timeoutCtx creation before the GetMaster call so both master resolution and the gRPC LookupVolume RPC share one grpcTimeout-bounded attempt. * master: use deadline-aware context for assign retry budget The shared 30s deadline only limited RetryWithBackoff's internal wall-clock tracking, but per-attempt contexts were still derived from the original ctx and could run for up to 10s even when the budget was nearly exhausted. Create a deadlineCtx from the computed deadline and derive both RetryWithBackoff and per-attempt timeouts from it so all operations honor the shared 30s cap. * master: skip warmup gate for empty lookup requests When VolumeOrFileIds is empty, notFoundCount == len(req.VolumeOrFileIds) is 0 == 0 which is true, causing empty lookup batches during warmup to return codes.Unavailable and be retried endlessly. Add a len(req.VolumeOrFileIds) > 0 guard so empty requests pass through. * master: validate request fields before warmup gate in Assign Move Replication and Ttl parsing before the IsWarmingUp() check so invalid inputs get a proper validation error instead of being masked by codes.Unavailable during warmup. Pure syntactic validation does not depend on topology state and should run first. * master: check deadline and context before starting retry attempt RetryWithBackoff only checked the deadline and context after an attempt completed or during the sleep select. If the deadline expired or context was canceled during sleep, the next iteration would still call operation() before detecting it. Add pre-operation checks so no new attempt starts after the budget is exhausted. * master: always return ctx.Err() on context cancellation in RetryWithBackoff When ctx.Err() is non-nil, the pre-operation check was returning lastErr instead of ctx.Err(). This broke callers checking errors.Is(err, context.DeadlineExceeded) and contradicted the documented contract. Always return ctx.Err() so the cancellation reason is properly surfaced. * master: handle warmup errors in StreamAssign without killing the stream StreamAssign was returning codes.Unavailable errors from Assign directly, which terminates the gRPC stream and breaks pooled connections. Instead, return transient errors as in-band error responses so the stream survives warmup periods. Also reset assignClient in doAssign on Send/Recv failures so a broken stream doesn't leave the proxy permanently dead. * master: wait for warmup before slot search in findAndGrow findEmptySlotsForOneVolume was called before the warmup wait loop, selecting slots from an incomplete topology. Move the warmup wait before slot search so volume placement uses the fully warmed-up topology with all servers registered. * master: add Retry-After header to /dir/assign warmup response The /dir/lookup handler already sets Retry-After during warmup but /dir/assign did not, leaving HTTP clients without guidance on when to retry. Add the same header using RemainingWarmupDuration(). * master: only seed warmup timestamp on leader at startup SetLastLeaderChangeTime was called unconditionally for both leader and follower nodes. Followers don't need warmup state, and the leader change event listener handles real elections. Move the seed into the IsLeader() block so only the startup leader gets warmup initialized. * master: preserve codes.Unavailable for StreamAssign warmup errors in doAssign StreamAssign returns transient warmup errors as in-band AssignResponse.Error messages. doAssign was converting these to plain fmt.Errorf, losing the codes.Unavailable classification needed for the caller's retry logic. Detect warmup error messages and wrap them as status.Error(codes.Unavailable) so RetryWithBackoff can retry.	2026-03-08 16:05:45 -07:00
Chris Lu	540fc97e00	s3/iam: reuse one request id per request (#8538 ) * request_id: add shared request middleware * s3err: preserve request ids in responses and logs * iam: reuse request ids in XML responses * sts: reuse request ids in XML responses * request_id: drop legacy header fallback * request_id: use AWS-style request id format * iam: fix AWS-compatible XML format for ErrorResponse and field ordering - ErrorResponse uses bare <RequestId> at root level instead of <ResponseMetadata> wrapper, matching the AWS IAM error response spec - Move CommonResponse to last field in success response structs so <ResponseMetadata> serializes after result elements - Add randomness to request ID generation to avoid collisions - Add tests for XML ordering and ErrorResponse format * iam: remove duplicate error_response_test.go Test is already covered by responses_test.go. * address PR review comments - Guard against typed nil pointers in SetResponseRequestID before interface assertion (CodeRabbit) - Use regexp instead of strings.Index in test helpers for extracting request IDs (Gemini) * request_id: prevent spoofing, fix nil-error branch, thread reqID to error writers - Ensure() now always generates a server-side ID, ignoring client-sent x-amz-request-id headers to prevent request ID spoofing. Uses a private context key (contextKey{}) instead of the header string. - writeIamErrorResponse in both iamapi and embedded IAM now accepts reqID as a parameter instead of calling Ensure() internally, ensuring a single request ID per request lifecycle. - The nil-iamError branch in writeIamErrorResponse now writes a 500 Internal Server Error response instead of returning silently. - Updated tests to set request IDs via context (not headers) and added tests for spoofing prevention and context reuse. * sts: add request-id consistency assertions to ActionInBody tests * test: update admin test to expect server-generated request IDs The test previously sent a client x-amz-request-id header and expected it echoed back. Since Ensure() now ignores client headers to prevent spoofing, update the test to verify the server returns a non-empty server-generated request ID instead. * iam: add generic WithRequestID helper alongside reflection-based fallback Add WithRequestID[T] that uses generics to take the address of a value type, satisfying the pointer receiver on SetRequestId without reflection. The existing SetResponseRequestID is kept for the two call sites that operate on interface{} (from large action switches where the concrete type varies at runtime). Generics cannot replace reflection there since Go cannot infer type parameters from interface{}. * Remove reflection and generics from request ID setting Call SetRequestId directly on concrete response types in each switch branch before boxing into interface{}, eliminating the need for WithRequestID (generics) and SetResponseRequestID (reflection). * iam: return pointer responses in action dispatch * Fix IAM error handling consistency and ensure request IDs on all responses - UpdateUser/CreatePolicy error branches: use writeIamErrorResponse instead of s3err.WriteErrorResponse to preserve IAM formatting and request ID - ExecuteAction: accept reqID parameter and generate one if empty, ensuring every response carries a RequestId regardless of caller * Clean up inline policies on DeleteUser and UpdateUser rename DeleteUser: remove InlinePolicies[userName] from policy storage before removing the identity, so policies are not orphaned. UpdateUser: move InlinePolicies[userName] to InlinePolicies[newUserName] when renaming, so GetUserPolicy/DeleteUserPolicy work under the new name. Both operations persist the updated policies and return an error if the storage write fails, preventing partial state.	2026-03-06 15:22:39 -08:00
Chris Lu	b3f7472fd3	4.15	2026-03-04 22:13:57 -08:00
Chris Lu	7799804200	4.14 Co-Authored-By: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-04 19:22:39 -08:00
Chris Lu	f5c35240be	Add volume dir tags and EC placement priority (#8472 ) * Add volume dir tags to topology Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add preferred tag config for EC Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Prioritize EC destinations by tags Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add EC placement planner tag tests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Refactor EC placement tests to reuse buildActiveTopology Remove buildActiveTopologyWithDiskTags helper function and consolidate tag setup inline in test cases. Tests now use UpdateTopology to apply tags after topology creation, reusing the existing buildActiveTopology function rather than duplicating its logic. All tag scenario tests pass: - TestECPlacementPlannerPrefersTaggedDisks - TestECPlacementPlannerFallsBackWhenTagsInsufficient Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Consolidate normalizeTagList into shared util package Extract normalizeTagList from three locations (volume.go, detection.go, erasure_coding_handler.go) into new weed/util/tag.go as exported NormalizeTagList function. Replace all duplicate implementations with imports and calls to util.NormalizeTagList. This improves code reuse and maintainability by centralizing tag normalization logic. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add PreferredTags to EC config persistence Add preferred_tags field to ErasureCodingTaskConfig protobuf with field number 5. Update GetConfigSpec to include preferred_tags field in the UI configuration schema. Add PreferredTags to ToTaskPolicy to serialize config to protobuf. Add PreferredTags to FromTaskPolicy to deserialize from protobuf with defensive copy to prevent external mutation. This allows EC preferred tags to be persisted and restored across worker restarts. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add defensive copy for Tags slice in DiskLocation Copy the incoming tags slice in NewDiskLocation instead of storing by reference. This prevents external callers from mutating the DiskLocation.Tags slice after construction, improving encapsulation and preventing unexpected changes to disk metadata. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add doc comment to buildCandidateSets method Document the tiered candidate selection and fallback behavior. Explain that for a planner with preferredTags, it accumulates disks matching each tag in order into progressively larger tiers, emits a candidate set once a tier reaches shardsNeeded, and finally falls back to the full candidates set if preferred-tag tiers are insufficient. This clarifies the intended semantics for future maintainers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Apply final PR review fixes 1. Update parseVolumeTags to replicate single tag entry to all folders instead of leaving some folders with nil tags. This prevents nil pointer dereferences when processing folders without explicit tags. 2. Add defensive copy in ToTaskPolicy for PreferredTags slice to match the pattern used in FromTaskPolicy, preventing external mutation of the returned TaskPolicy. 3. Add clarifying comment in buildCandidateSets explaining that the shardsNeeded <= 0 branch is a defensive check for direct callers, since selectDestinations guarantees shardsNeeded > 0. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix nil pointer dereference in parseVolumeTags Ensure all folder tags are initialized to either normalized tags or empty slices, not nil. When multiple tag entries are provided and there are more folders than entries, remaining folders now get empty slices instead of nil, preventing nil pointer dereference in downstream code. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix NormalizeTagList to return empty slice instead of nil Change NormalizeTagList to always return a non-nil slice. When all tags are empty or whitespace after normalization, return an empty slice instead of nil. This prevents nil pointer dereferences in downstream code that expects a valid (possibly empty) slice. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add nil safety check for v.tags pointer Add a safety check to handle the case where v.tags might be nil, preventing a nil pointer dereference. If v.tags is nil, use an empty string instead. This is defensive programming to prevent panics in edge cases. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Add volume.tags flag to weed server and weed mini commands Add the volume.tags CLI option to both the 'weed server' and 'weed mini' commands. This allows users to specify disk tags when running the combined server modes, just like they can with 'weed volume'. The flag uses the same format and description as the volume command: comma-separated tag groups per data dir with ':' separators (e.g. fast:ssd,archive). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <copilot@github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-01 10:22:00 -08:00
Chris Lu	4f647e1036	Worker set its working directory (#8461 ) * set working directory * consolidate to worker directory * working directory * correct directory name * refactoring to use wildcard matcher * simplify * cleaning ec working directory * fix reference * clean * adjust test	2026-02-27 12:22:21 -08:00
Chris Lu	e4b70c2521	go fix	2026-02-20 18:42:00 -08:00
Konstantin Lebedev	01b3125815	[shell]: volume balance capacity by min volume density (#8026 ) volume balance by min volume density and active volumes	2026-02-19 13:30:59 -08:00
Chris Lu	38e14a867b	fix: cancel volume server requests on client disconnect during S3 downloads (#8373 ) * fix: cancel volume server requests on client disconnect during S3 downloads - Use http.NewRequestWithContext in ReadUrlAsStream so in-flight volume server requests are properly aborted when the client disconnects and the request context is canceled - Distinguish context-canceled errors (client disconnect, expected) from real server errors in streamFromVolumeServers; log at V(3) instead of ERROR to reduce noise from client-side disconnects (e.g. Nginx upstream timeout, browser cancel, curl --max-time) Fixes: streamFromVolumeServers: streamFn failed...context canceled" * fixup: separate Canceled/DeadlineExceeded log severity in streamFromVolumeServers - context.Canceled → V(3) Infof "client disconnected" (expected, no noise) - context.DeadlineExceeded → Warningf "server-side deadline exceeded" (unexpected, needs attention) - all other errors → Errorf (unchanged)"	2026-02-18 17:14:54 -08:00
Chris Lu	3c3a78d08e	4.13	2026-02-16 17:01:19 -08:00
Chris Lu	1e4f30c56f	pb: fix IPv6 double brackets in ServerAddress formatting (#8329 ) * pb: fix IPv6 double brackets in ServerAddress formatting * pb: refactor IPv6 tests into table-driven test * util: add JoinHostPortStr and use it in pb to avoid unsafe port parsing	2026-02-12 18:11:03 -08:00
Chris Lu	b57429ef2e	Switch empty-folder cleanup to bucket policy (#8292 ) * Fix Spark _temporary cleanup and add issue #8285 regression test * Generalize empty folder cleanup for Spark temp artifacts * Revert synchronous folder pruning and add cleanup diagnostics * Add actionable empty-folder cleanup diagnostics * Fix Spark temp marker cleanup in async folder cleaner * Fix Spark temp cleanup with implicit directory markers * Keep explicit directory markers non-implicit * logging * more logs * Switch empty-folder cleanup to bucket policy * Seaweed-X-Amz-Allow-Empty-Folders * less logs * go vet * less logs * refactoring	2026-02-10 18:38:38 -08:00
Chris Lu	af8273386d	4.12	2026-02-09 18:15:19 -08:00
Chris Lu	cb9e21cdc5	Normalize hashicorp raft peer ids (#8253 ) * Normalize raft voter ids * 4.11 * Update raft_hashicorp.go	2026-02-09 07:46:34 -08:00
Chris Lu	0c89185291	4.10	2026-02-08 21:16:58 -08:00
Chris Lu	5a5cc38692	4.09	2026-02-03 17:56:25 -08:00
Chris Lu	330bd92ddc	4.08	2026-02-02 20:44:13 -08:00
Chris Lu	ba8816e2e1	4.08	2026-02-02 20:36:03 -08:00
Chris Lu	bc853bdee5	4.07	2026-01-18 15:48:09 -08:00
Chris Lu	ce6e9be66b	4.06	2026-01-10 12:08:16 -08:00
Chris Lu	379c032868	Fix chown Input/output error on large file sets (#7996 ) * Fix chown Input/output error on large file sets (Fixes #7911) Implemented retry logic for MySQL/MariaDB backend to handle transient errors like deadlocks and timeouts. * Fix syntax error: missing closing brace * Refactor: Use %w for error wrapping and errors.As for extraction * Fix: Disable retry logic inside transactions	2026-01-09 18:02:59 -08:00
promalert	9012069bd7	chore: execute goimports to format the code (#7983 ) * chore: execute goimports to format the code Signed-off-by: promalert <promalert@outlook.com> * goimports -w . --------- Signed-off-by: promalert <promalert@outlook.com> Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-01-07 13:06:08 -08:00
Chris Lu	0e9f433ec4	refactoring	2026-01-04 11:40:42 -08:00
Chris Lu	87b71029f7	4.05	2026-01-01 20:39:22 -08:00
Chris Lu	f7f133166a	adjust fuse logs	2025-12-31 13:04:05 -08:00
ai8future	73098c9792	filer.meta.backup: add -excludePaths flag to skip paths from backup (#7916 ) * filer.meta.backup: add -excludePaths flag to skip paths from backup Add a new -excludePaths flag that accepts comma-separated path prefixes to exclude from backup operations. This enables selective backup when certain directories (e.g., legacy buckets) should be skipped. Usage: weed filer.meta.backup -filerDir=/buckets -excludePaths=/buckets/legacy1,/buckets/legacy2 -config=backup.toml 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * filer.meta.backup: address code review feedback for -excludePaths Fixes based on CodeRabbit and Gemini review: - Cache parsed exclude paths in struct (performance) - TrimSpace and skip empty entries (handles "a,,b" and "a, b") - Add trailing slash for directory boundary matching (prevents /buckets/legacy matching /buckets/legacy_backup) - Validate paths start with '/' and warn if not - Log excluded paths at startup for debugging - Fix rename handling: check both old and new paths, handle all four combinations correctly - Add docstring to shouldExclude() - Update UsageLine and Long description with new flag 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * filer.meta.backup: address nitpick feedback - Clarify directory boundary matching behavior in help text - Add warning when root path '/' is excluded (would exclude everything) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * includePrefixes and excludePrefixes --------- Co-authored-by: C Shaw <cliffshaw@users.noreply.github.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: Chris Lu <chris.lu@gmail.com>	2025-12-30 14:28:50 -08:00
Chris Lu	1261e93ef2	fix: comprehensive go vet error fixes and add CI enforcement (#7861 ) * fix: use keyed fields in struct literals - Replace unsafe reflect.StringHeader/SliceHeader with safe unsafe.String/Slice (weed/query/sqltypes/unsafe.go) - Add field names to Type_ScalarType struct literals (weed/mq/schema/schema_builder.go) - Add Duration field name to FlexibleDuration struct literals across test files - Add field names to bson.D struct literals (weed/filer/mongodb/mongodb_store_kv.go) Fixes go vet warnings about unkeyed struct literals. * fix: remove unreachable code - Remove unreachable return statements after infinite for loops - Remove unreachable code after if/else blocks where all paths return - Simplify recursive logic by removing unnecessary for loop (inode_to_path.go) - Fix Type_ScalarType literal to use enum value directly (schema_builder.go) - Call onCompletionFn on stream error (subscribe_session.go) Files fixed: - weed/query/sqltypes/unsafe.go - weed/mq/schema/schema_builder.go - weed/mq/client/sub_client/connect_to_sub_coordinator.go - weed/filer/redis3/ItemList.go - weed/mq/client/agent_client/subscribe_session.go - weed/mq/broker/broker_grpc_pub_balancer.go - weed/mount/inode_to_path.go - weed/util/skiplist/name_list.go * fix: avoid copying lock values in protobuf messages - Use proto.Merge() instead of direct assignment to avoid copying sync.Mutex in S3ApiConfiguration (iamapi_server.go) - Add explicit comments noting that channel-received values are already copies before taking addresses (volume_grpc_client_to_master.go) The protobuf messages contain sync.Mutex fields from the message state, which should not be copied. Using proto.Merge() properly merges messages without copying the embedded mutex. * fix: correct byte array size for uint32 bit shift operations The generateAccountId() function only needs 4 bytes to create a uint32 value. Changed from allocating 8 bytes to 4 bytes to match the actual usage. This fixes go vet warning about shifting 8-bit values (bytes) by more than 8 bits. * fix: ensure context cancellation on all error paths In broker_client_subscribe.go, ensure subscriberCancel() is called on all error return paths: - When stream creation fails - When partition assignment fails - When sending initialization message fails This prevents context leaks when an error occurs during subscriber creation. * fix: ensure subscriberCancel called for CreateFreshSubscriber stream.Send error Ensure subscriberCancel() is called when stream.Send fails in CreateFreshSubscriber. * ci: add go vet step to prevent future lint regressions - Add go vet step to GitHub Actions workflow - Filter known protobuf lock warnings (MessageState sync.Mutex) These are expected in generated protobuf code and are safe - Prevents accumulation of go vet errors in future PRs - Step runs before build to catch issues early * fix: resolve remaining syntax and logic errors in vet fixes - Fixed syntax errors in filer_sync.go caused by missing closing braces - Added missing closing brace for if block and function - Synchronized fixes to match previous commits on branch * fix: add missing return statements to daemon functions - Add 'return false' after infinite loops in filer_backup.go and filer_meta_backup.go - Satisfies declared bool return type signatures - Maintains consistency with other daemon functions (runMaster, runFilerSynchronize, runWorker) - While unreachable, explicitly declares the return satisfies function signature contract * fix: add nil check for onCompletionFn in SubscribeMessageRecord - Check if onCompletionFn is not nil before calling it - Prevents potential panic if nil function is passed - Matches pattern used in other callback functions * docs: clarify unreachable return statements in daemon functions - Add comments documenting that return statements satisfy function signature - Explains that these returns follow infinite loops and are unreachable - Improves code clarity for future maintainers	2025-12-23 14:48:50 -08:00
Chris Lu	8d75290601	4.04	2025-12-22 23:46:30 -08:00
Chris Lu	ed1da07665	Add consistent -debug and -debug.port flags to commands (#7816 ) * Add consistent -debug and -debug.port flags to commands Add -debug and -debug.port flags to weed master, weed volume, weed s3, weed mq.broker, and weed filer.sync commands for consistency with weed filer. When -debug is enabled, an HTTP server starts on the specified port (default 6060) serving runtime profiling data at /debug/pprof/. For mq.broker, replaced the older -port.pprof flag with the new -debug and -debug.port pattern for consistency. * Update weed/util/grace/pprof.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-18 17:44:36 -08:00
chrislu	bcce8d164c	4.03	2025-12-15 01:06:55 -08:00
Chris Lu	221b352593	fix: handle ResumeFromDiskError gracefully in LoopProcessLogData (#7753 ) When ReadFromBuffer returns ResumeFromDiskError, the function now: - Attempts to read from disk if ReadFromDiskFn is available - Checks if the client is still connected via waitForDataFn - Waits for notification or short timeout before retrying - Continues the loop instead of immediately returning the error This fixes TestNewLogBufferFirstBuffer which was failing because the function returned too early before data was available in the buffer.	2025-12-14 21:52:50 -08:00
Chris Lu	e8b7347031	Reduce memory allocations in hot paths (#7725 ) * filer: reduce allocations in MatchStorageRule Optimize MatchStorageRule to avoid allocations in common cases: - Return singleton emptyPathConf when no rules match (zero allocations) - Return existing rule directly when only one rule matches (zero allocations) - Only allocate and merge when multiple rules match (rare case) Based on heap profile analysis showing 111MB allocated from 1.64M calls to this function during 180 seconds of operation. * filer: add fast path for getActualStore when no path-specific stores Add hasPathSpecificStore flag to FilerStoreWrapper to skip the MatchPrefix() call and []byte(path) conversion when no path-specific stores are configured (the common case). Based on heap profile analysis showing 1.39M calls to this function during 180 seconds of operation, each requiring a string-to-byte slice conversion for the MatchPrefix call. * filer/foundationdb: use sync.Pool for tuple allocation in genKey Use sync.Pool to reuse tuple.Tuple slices in genKey(), reducing allocation overhead for every FoundationDB operation. Based on heap profile analysis showing 102MB allocated from 1.79M calls to genKey() during 180 seconds of operation. The Pack() call still allocates internally, but this reduces the tuple slice allocation overhead by ~50%. * filer: use sync.Pool for protobuf Entry and FuseAttributes Add pooling for filer_pb.Entry and filer_pb.FuseAttributes in EncodeAttributesAndChunks and DecodeAttributesAndChunks to reduce allocations during filer store operations. Changes: - Add pbEntryPool with pre-allocated FuseAttributes - Add EntryAttributeToExistingPb for in-place attribute conversion - Update ToExistingProtoEntry to reuse existing Attributes when available Based on heap profile showing: - EncodeAttributesAndChunks: 69.5MB cumulative - DecodeAttributesAndChunks: 46.5MB cumulative - EntryAttributeToPb: 47.5MB flat allocations * log_buffer: use sync.Pool for LogEntry in readTs Add logEntryPool to reuse filer_pb.LogEntry objects in readTs(), which is called frequently during binary search in ReadFromBuffer. This function only needs the TsNs field from the unmarshaled entry, so pooling the LogEntry avoids repeated allocations. Based on heap profile showing readTs with 188MB cumulative allocations from timestamp lookups during log buffer reads. * pb: reduce gRPC metadata allocations in interceptor Optimize requestIDUnaryInterceptor and WithGrpcClient to reduce metadata allocations on every gRPC request: - Use AppendToOutgoingContext instead of NewOutgoingContext + New() This avoids creating a new map[string]string for single key-value pairs - Check FromIncomingContext return value before using metadata Based on heap profile showing metadata operations contributing 0.45GB (10.5%) of allocations, with requestIDUnaryInterceptor being the main source at 0.44GB cumulative. Expected reduction: ~0.2GB from avoiding map allocations per request. * filer/log_buffer: address code review feedback - Use proto.Reset() instead of manual field clearing in resetLogEntry for more idiomatic and comprehensive state clearing - Add resetPbEntry() call before pool return in error path for consistency with success path in DecodeAttributesAndChunks * log_buffer: reduce PreviousBufferCount from 32 to 4 Reduce the number of retained previous buffers from 32 to 4. Each buffer is 8MB, so this reduces the maximum retained memory from 256MB to 32MB for previous buffers. Most subscribers catch up quickly, so 4 buffers (32MB) should be sufficient while significantly reducing memory footprint. * filer/foundationdb: use defer for tuple pool cleanup in genKey Refactor genKey to use defer for returning the pooled tuple. This ensures the pooled object is always returned even if store.seaweedfsDir.Pack panics, making the code more robust. Also simplifies the code by removing the temporary variable. * filer: early-stop MatchStorageRule prescan after 2 matches Stop the prescan callback after finding 2 matches since we only need to know if there are 0, 1, or multiple matches. This avoids unnecessarily scanning the rest of the trie when many rules exist. * fix: address critical code review issues filer_conf.go: - Remove mutable singleton emptyPathConf that could corrupt shared state - Return fresh copy for no-match case and cloned copy for single-match case - Add clonePathConf helper to create shallow copies safely grpc_client_server.go: - Remove incorrect AppendToOutgoingContext call in server interceptor (that API is for outbound client calls, not server-side handlers) - Rely on request_id.Set and SetTrailer for request ID propagation * fix: treat FilerConf_PathConf as immutable Fix callers that were incorrectly mutating the returned PathConf: - filer_server_handlers_write.go: Use local variable for MaxFileNameLength instead of mutating the shared rule - command_s3_bucket_quota_check.go: Create new PathConf explicitly when modifying config instead of mutating the returned one This allows MatchStorageRule to safely return the singleton or direct references without copying, restoring the memory optimization. Callers must NOT mutate the returned FilerConf_PathConf. filer: add ClonePathConf helper for creating mutable copies Add reusable ClonePathConf function that creates a mutable copy of a PathConf. This is useful when callers need to modify config before calling SetLocationConf. Update command_s3_bucket_quota_check.go to use the new helper. Also fix redundant return statement in DeleteLocationConf. * fmt * filer: fix protobuf pool reset to clear internal fields Address code review feedback: 1. resetPbEntry/resetFuseAttributes: Use struct assignment (e = T{}) instead of field-by-field reset to clear protobuf internal fields (unknownFields, sizeCache) that would otherwise accumulate across pool reuses, causing data corruption or memory bloat. 2. EntryAttributeToExistingPb: Add nil guard for attr parameter to prevent panic if caller passes nil. log_buffer: reset logEntry before pool return in error path For consistency with success path, reset the logEntry before putting it back in the pool in the error path. This prevents the pooled object from holding references to partially unmarshaled data. * filer: optimize MatchStorageRule and document ClonePathConf 1. Avoid double []byte(path) conversion in multi-match case by converting once and reusing pathBytes. 2. Add IMPORTANT comment to ClonePathConf documenting that it must be kept in sync with filer_pb.FilerConf_PathConf fields when the protobuf evolves. * filer/log_buffer: fix data race and use defer for pool cleanup 1. entry_codec.go EncodeAttributesAndChunks: Fix critical data race - proto.Marshal may return a slice sharing memory with the message. Copy the data before returning message to pool to prevent corruption. 2. entry_codec.go DecodeAttributesAndChunks: Use defer for cleaner pool management, ensuring message is always returned to pool. 3. log_buffer.go readTs: Use defer for pool cleanup, removing duplicated resetLogEntry/Put calls in success and error paths. * filer: fix ClonePathConf field order and add comprehensive test 1. Fix field order in ClonePathConf to match protobuf struct definition (WormGracePeriodSeconds before WormRetentionTimeSeconds). 2. Add TestClonePathConf that constructs a fully-populated PathConf, calls ClonePathConf, and asserts equality of all exported fields. This will catch future schema drift when new fields are added. 3. Add TestClonePathConfNil to verify nil handling. * filer: use reflection in ClonePathConf test to detect schema drift Replace hardcoded field comparisons with reflection-based comparison. This automatically catches: 1. New fields added to the protobuf but not copied in ClonePathConf 2. Missing non-zero test values for any exported field The test iterates over all exported fields using reflect and compares src vs clone values, failing if any field differs. * filer: update EntryAttributeToExistingPb comment to reflect nil handling The function safely handles nil attr by returning early, but the comment incorrectly stated 'attr must not be nil'. Update comment to accurately describe the defensive behavior. * Fix review feedback: restore request ID propagation and remove redundant resets 1. grpc_client_server.go: Restore AppendToOutgoingContext for request ID so handlers making downstream gRPC calls will automatically propagate the request ID to downstream services. 2. entry_codec.go: Remove redundant resetPbEntry calls after Get. The defer block ensures reset before Put, so next Get receives clean object. 3. log_buffer.go: Remove redundant resetLogEntry call after Get for same reason - defer already handles reset before Put.	2025-12-12 12:51:48 -08:00
Chris Lu	d970c15d71	fix: prevent filer.backup stall in single-filer setups (#7695 ) * fix: prevent filer.backup stall in single-filer setups (#4977) When MetaAggregator.MetaLogBuffer is empty (which happens in single-filer setups with no peers), ReadFromBuffer was returning nil error, causing LoopProcessLogData to enter an infinite wait loop on ListenersCond. This fix returns ResumeFromDiskError instead, allowing SubscribeMetadata to loop back and read from persisted logs on disk. This ensures filer.backup continues processing events even when the in-memory aggregator buffer is empty. Fixes #4977 * test: add integration tests for metadata subscription Add integration tests for metadata subscription functionality: - TestMetadataSubscribeBasic: Tests basic subscription and event receiving - TestMetadataSubscribeSingleFilerNoStall: Regression test for #4977, verifies subscription doesn't stall under high load in single-filer setups - TestMetadataSubscribeResumeFromDisk: Tests resuming subscription from disk Related to #4977 * ci: add GitHub Actions workflow for metadata subscribe tests Add CI workflow that runs on: - Push/PR to master affecting filer, log_buffer, or metadata subscribe code - Runs the integration tests for metadata subscription - Uploads logs on failure for debugging Related to #4977 * fix: use multipart form-data for file uploads in integration tests The filer expects multipart/form-data for file uploads, not raw POST body. This fixes the 'Content-Type isn't multipart/form-data' error. * test: use -peers=none for faster master startup * test: add -peers=none to remaining master startup in ec tests * fix: use filer HTTP port 8888, WithFilerClient adds 10000 for gRPC WithFilerClient calls ToGrpcAddress() which adds 10000 to the port. Passing 18888 resulted in connecting to 28888. Use 8888 instead. * test: add concurrent writes and million updates tests - TestMetadataSubscribeConcurrentWrites: 50 goroutines writing 20 files each - TestMetadataSubscribeMillionUpdates: 1 million metadata entries via gRPC (metadata only, no actual file content for speed) * fix: address PR review comments - Handle os.MkdirAll errors explicitly instead of ignoring - Handle log file creation errors with proper error messages - Replace silent event dropping with 100ms timeout and warning log * Update metadata_subscribe_integration_test.go	2025-12-09 20:15:35 -08:00
chrislu	805950b401	4.02	2025-12-08 01:38:16 -08:00
Chris Lu	28ac536280	fix: normalize Windows backslash paths in weed admin file uploads (#7636 ) fix: normalize Windows backslash paths in file uploads When uploading files from a Windows client to a Linux server, file paths containing backslashes were not being properly interpreted as directory separators. This caused files intended for subdirectories to be created in the root directory with backslashes in their filenames. Changes: - Add util.CleanWindowsPath and util.CleanWindowsPathBase helper functions in weed/util/fullpath.go for reusable path normalization - Use path.Join/path.Clean/path.Base instead of filepath equivalents for URL path semantics (filepath is OS-specific) - Apply normalization in weed admin handlers and filer upload parsing Fixes #7628	2025-12-05 17:40:32 -08:00
Chris Lu	5c1de633cb	mount: improve read throughput with parallel chunk fetching (#7627 ) * filer: remove lock contention during chunk download This addresses issue #7504 where a single weed mount FUSE instance does not fully utilize node network bandwidth when reading large files. The SingleChunkCacher was holding a mutex during the entire HTTP download, causing readers to block until the download completed. This serialized chunk reads even when multiple goroutines were downloading in parallel. Changes: - Add sync.Cond to SingleChunkCacher for efficient waiting - Move HTTP download outside the critical section in startCaching() - Use condition variable in readChunkAt() to wait for download completion - Add isComplete flag to track download state Now multiple chunk downloads can proceed truly in parallel, and readers wait efficiently using the condition variable instead of blocking on a mutex held during I/O operations. Ref: #7504 * filer: parallel chunk fetching within doReadAt This addresses issue #7504 by enabling parallel chunk downloads within a single read operation. Previously, doReadAt() processed chunks sequentially in a loop, meaning each chunk had to be fully downloaded before the next one started. This left significant network bandwidth unused when chunks resided on different volume servers. Changes: - Collect all chunk read tasks upfront - Use errgroup to fetch multiple chunks in parallel - Each chunk reads directly into its correct buffer position - Limit concurrency to prefetchCount (min 4) to avoid overwhelming the system - Handle gaps and zero-filling before parallel fetch - Trigger prefetch after parallel reads complete For a read spanning N chunks on different volume servers, this can now utilize up to N times the bandwidth of a single connection. Ref: #7504 * http: direct buffer read to reduce memory copies This addresses issue #7504 by reducing memory copy overhead during chunk downloads. Previously, RetriedFetchChunkData used ReadUrlAsStream which: 1. Allocated a 64KB intermediate buffer 2. Read data in 64KB chunks 3. Called a callback to copy each chunk to the destination For a 16MB chunk, this meant 256 copy operations plus the callback overhead. Profiling showed significant time spent in memmove. Changes: - Add readUrlDirectToBuffer() that reads directly into the destination - Add retriedFetchChunkDataDirect() for unencrypted, non-gzipped chunks - Automatically use direct read path when possible (cipher=nil, gzip=false) - Use http.NewRequestWithContext for proper cancellation For unencrypted chunks (the common case), this eliminates the intermediate buffer entirely, reading HTTP response bytes directly into the final destination buffer. Ref: #7504 * address review comments - Use channel (done) instead of sync.Cond for download completion signaling This integrates better with context cancellation patterns - Remove redundant groupErr check in reader_at.go (errors are already captured in task.err) - Remove buggy URL encoding logic from retriedFetchChunkDataDirect (The existing url.PathEscape on full URL is a pre-existing bug that should be fixed separately) * address review comments (round 2) - Return io.ErrUnexpectedEOF when HTTP response is truncated This prevents silent data corruption from incomplete reads - Simplify errgroup error handling by using g.Wait() error directly Remove redundant task.err field and manual error aggregation loop - Define minReadConcurrency constant instead of magic number 4 Improves code readability and maintainability Note: Context propagation to startCaching() is intentionally NOT changed. The downloaded chunk is a shared resource that may be used by multiple readers. Using context.Background() ensures the download completes even if one reader cancels, preventing data loss for other waiting readers. * http: inject request ID for observability in direct read path Add request_id.InjectToRequest() call to readUrlDirectToBuffer() for consistency with ReadUrlAsStream path. This ensures full-chunk reads carry the same tracing/correlation headers for server logs and metrics. * filer: consistent timestamp handling in sequential read path Use max(ts, task.chunk.ModifiedTsNs) in sequential path to match parallel path behavior. Also update ts before error check so that on failure, the returned timestamp reflects the max of all chunks processed so far. * filer: document why context.Background() is used in startCaching Add comment explaining the intentional design decision: the downloaded chunk is a shared resource that may be used by multiple concurrent readers. Using context.Background() ensures the download completes even if one reader cancels, preventing errors for other waiting readers. * filer: propagate context for reader cancellation Address review comment: pass context through ReadChunkAt call chain so that a reader can cancel its wait for a download. The key distinction is: - Download uses context.Background() - shared resource, always completes - Reader wait uses request context - can be cancelled individually If a reader cancels, it stops waiting and returns ctx.Err(), but the download continues to completion for other readers waiting on the same chunk. This properly handles the shared resource semantics while still allowing individual reader cancellation. * filer: use defer for close(done) to guarantee signal on panic Move close(s.done) to a defer statement at the start of startCaching() to ensure the completion signal is always sent, even if an unexpected panic occurs. This prevents readers from blocking indefinitely. * filer: remove unnecessary code - Remove close(s.cacheStartedCh) in destroy() - the channel is only used for one-time synchronization, closing it provides no benefit - Remove task := task loop variable capture - Go 1.22+ fixed loop variable semantics, this capture is no longer necessary (go.mod specifies Go 1.24.0) * filer: restore fallback to chunkCache when cacher returns no data Fix critical issue where ReadChunkAt would return 0,nil immediately if SingleChunkCacher couldn't provide data for the requested offset, without trying the chunkCache fallback. Now if cacher.readChunkAt returns n=0 and err=nil, we fall through to try chunkCache. * filer: add comprehensive tests for ReaderCache Tests cover: - Context cancellation while waiting for download - Fallback to chunkCache when cacher returns n=0, err=nil - Multiple concurrent readers waiting for same chunk - Partial reads at different offsets - Downloader cleanup when exceeding cache limit - Done channel signaling (no hangs on completion) * filer: prioritize done channel over context cancellation If data is already available (done channel closed), return it even if the reader's context is also cancelled. This avoids unnecessary errors when the download has already completed. * filer: add lookup error test and document test limitations Add TestSingleChunkCacherLookupError to test error handling when lookup fails. Document that full HTTP integration tests for SingleChunkCacher require global HTTP client initialization which is complex in unit tests. The download path is tested via FUSE integration tests. * filer: add tests that exercise SingleChunkCacher concurrency logic Add tests that use blocking lookupFileIdFn to exercise the actual SingleChunkCacher wait/cancellation logic: - TestSingleChunkCacherContextCancellationDuringLookup: tests reader cancellation while lookup is blocked - TestSingleChunkCacherMultipleReadersWaitForDownload: tests multiple readers waiting on the same download - TestSingleChunkCacherOneReaderCancelsOthersContinue: tests that when one reader cancels, other readers continue waiting These tests properly exercise the done channel wait/cancel logic without requiring HTTP calls - the blocking lookup simulates a slow download.	2025-12-04 23:40:56 -08:00
Chris Lu	3183a49698	fix: S3 downloads failing after idle timeout (#7626 ) * fix: S3 downloads failing after idle timeout (#7618) The idle timeout was incorrectly terminating active downloads because read and write deadlines were managed independently. During a download, the server writes data but rarely reads, so the read deadline would expire even though the connection was actively being used. Changes: 1. Simplify to single Timeout field - since this is a 'no activity timeout' where any activity extends the deadline, separate read/write timeouts are unnecessary. Now uses SetDeadline() which sets both at once. 2. Implement proper 'no activity timeout' - any activity (read or write) now extends the deadline. The connection only times out when there's genuinely no activity in either direction. 3. Increase default S3 idleTimeout from 10s to 120s for additional safety margin when fetching chunks from slow storage backends. Fixes #7618 * Update weed/util/net_timeout.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-12-04 18:31:46 -08:00
Chris Lu	5ed0b00fb9	Support separate volume server ID independent of RPC bind address (#7609 ) * pb: add id field to Heartbeat message for stable volume server identification This adds an 'id' field to the Heartbeat protobuf message that allows volume servers to identify themselves independently of their IP:port address. Ref: https://github.com/seaweedfs/seaweedfs/issues/7487 * storage: add Id field to Store struct Add Id field to Store struct and include it in CollectHeartbeat(). The Id field provides a stable volume server identity independent of IP:port. Ref: https://github.com/seaweedfs/seaweedfs/issues/7487 * topology: support id-based DataNode identification Update GetOrCreateDataNode to accept an id parameter for stable node identification. When id is provided, the DataNode can maintain its identity even when its IP address changes (e.g., in Kubernetes pod reschedules). For backward compatibility: - If id is provided, use it as the node ID - If id is empty, fall back to ip:port Ref: https://github.com/seaweedfs/seaweedfs/issues/7487 * volume: add -id flag for stable volume server identity Add -id command line flag to volume server that allows specifying a stable identifier independent of the IP address. This is useful for Kubernetes deployments with hostPath volumes where pods can be rescheduled to different nodes while the persisted data remains on the original node. Usage: weed volume -id=node-1 -ip=10.0.0.1 ... If -id is not specified, it defaults to ip:port for backward compatibility. Fixes https://github.com/seaweedfs/seaweedfs/issues/7487 * server: add -volume.id flag to weed server command Support the -volume.id flag in the all-in-one 'weed server' command, consistent with the standalone 'weed volume' command. Usage: weed server -volume.id=node-1 ... Ref: https://github.com/seaweedfs/seaweedfs/issues/7487 * topology: add test for id-based DataNode identification Test the key scenarios: 1. Create DataNode with explicit id 2. Same id with different IP returns same DataNode (K8s reschedule) 3. IP/PublicUrl are updated when node reconnects with new address 4. Different id creates new DataNode 5. Empty id falls back to ip:port (backward compatibility) Ref: https://github.com/seaweedfs/seaweedfs/issues/7487 * pb: add address field to DataNodeInfo for proper node addressing Previously, DataNodeInfo.Id was used as the node address, which worked when Id was always ip:port. Now that Id can be an explicit string, we need a separate Address field for connection purposes. Changes: - Add 'address' field to DataNodeInfo protobuf message - Update ToDataNodeInfo() to populate the address field - Update NewServerAddressFromDataNode() to use Address (with Id fallback) - Fix LookupEcVolume to use dn.Url() instead of dn.Id() Ref: https://github.com/seaweedfs/seaweedfs/issues/7487 * fix: trim whitespace from volume server id and fix test - Trim whitespace from -id flag to treat ' ' as empty - Fix store_load_balancing_test.go to include id parameter in NewStore call Ref: https://github.com/seaweedfs/seaweedfs/issues/7487 * refactor: extract GetVolumeServerId to util package Move the volume server ID determination logic to a shared utility function to avoid code duplication between volume.go and rack.go. Ref: https://github.com/seaweedfs/seaweedfs/issues/7487 * fix: improve transition logic for legacy nodes - Use exact ip:port match instead of net.SplitHostPort heuristic - Update GrpcPort and PublicUrl during transition for consistency - Remove unused net import Ref: https://github.com/seaweedfs/seaweedfs/issues/7487 * fix: add id normalization and address change logging - Normalize id parameter at function boundary (trim whitespace) - Log when DataNode IP:Port changes (helps debug K8s pod rescheduling) Ref: https://github.com/seaweedfs/seaweedfs/issues/7487	2025-12-02 22:08:11 -08:00
Chris Lu	7e15a4abe2	4.01	2025-11-27 11:39:05 -08:00
Chris Lu	cd2fac4551	S3: pass HTTP 429 from volume servers to S3 clients (#7556 ) With the recent changes (commit `c1b8d4bf0`) that made S3 directly access volume servers instead of proxying through filer, we need to properly handle HTTP 429 (Too Many Requests) errors from volume servers. This change ensures that when volume servers rate limit requests with HTTP 429, the S3 API properly translates this to an S3-compatible error response (ErrRequestBytesExceed with HTTP 503) instead of returning a generic InternalError. Changes: - Add ErrTooManyRequests sentinel error in weed/util/http - Detect HTTP 429 in ReadUrlAsStream and wrap with ErrTooManyRequests - Check for ErrTooManyRequests in GetObjectHandler and map to S3 error - Return ErrRequestBytesExceed (HTTP 503) for rate limiting scenarios This addresses the same issue as PR #7482 but for the new direct volume server access path instead of the filer proxy path. Fixes: Rate limiting errors from volume servers being masked as 500	2025-11-26 13:03:09 -08:00
Chris Lu	5f7a292334	add build info metrics (#7525 ) * add build info metrics * unused * metrics on build * size limit * once	2025-11-21 16:55:28 -08:00
Chris Lu	ca84a8a713	S3: Directly read write volume servers (#7481 ) * Lazy Versioning Check, Conditional SSE Entry Fetch, HEAD Request Optimization * revert Reverted the conditional versioning check to always check versioning status Reverted the conditional SSE entry fetch to always fetch entry metadata Reverted the conditional versioning check to always check versioning status Reverted the conditional SSE entry fetch to always fetch entry metadata * Lazy Entry Fetch for SSE, Skip Conditional Header Check * SSE-KMS headers are present, this is not an SSE-C request (mutually exclusive) * SSE-C is mutually exclusive with SSE-S3 and SSE-KMS * refactor * Removed Premature Mutual Exclusivity Check * check for the presence of the X-Amz-Server-Side-Encryption header * not used * fmt * directly read write volume servers * HTTP Range Request Support * set header * md5 * copy object * fix sse * fmt * implement sse * sse continue * fixed the suffix range bug (bytes=-N for "last N bytes") * debug logs * Missing PartsCount Header * profiling * url encoding * test_multipart_get_part * headers * debug * adjust log level * handle part number * Update s3api_object_handlers.go * nil safety * set ModifiedTsNs * remove * nil check * fix sse header * same logic as filer * decode values * decode ivBase64 * s3: Fix SSE decryption JWT authentication and streaming errors Critical fix for SSE (Server-Side Encryption) test failures: 1. JWT Authentication Bug (Root Cause): - Changed from GenJwtForFilerServer to GenJwtForVolumeServer - S3 API now uses correct JWT when directly reading from volume servers - Matches filer's authentication pattern for direct volume access - Fixes 'unexpected EOF' and 500 errors in SSE tests 2. Streaming Error Handling: - Added error propagation in getEncryptedStreamFromVolumes goroutine - Use CloseWithError() to properly communicate stream failures - Added debug logging for streaming errors 3. Response Header Timing: - Removed premature WriteHeader(http.StatusOK) call - Let Go's http package write status automatically on first write - Prevents header lock when errors occur during streaming 4. Enhanced SSE Decryption Debugging: - Added IV/Key validation and logging for SSE-C, SSE-KMS, SSE-S3 - Better error messages for missing or invalid encryption metadata - Added glog.V(2) debugging for decryption setup This fixes SSE integration test failures where encrypted objects could not be retrieved due to volume server authentication failures. The JWT bug was causing volume servers to reject requests, resulting in truncated/empty streams (EOF) or internal errors. * s3: Fix SSE multipart upload metadata preservation Critical fix for SSE multipart upload test failures (SSE-C and SSE-KMS): Root Cause - Incomplete SSE Metadata Copying: The old code only tried to copy 'SeaweedFSSSEKMSKey' from the first part to the completed object. This had TWO bugs: 1. Wrong Constant Name (Key Mismatch Bug): - Storage uses: SeaweedFSSSEKMSKeyHeader = 'X-SeaweedFS-SSE-KMS-Key' - Old code read: SeaweedFSSSEKMSKey = 'x-seaweedfs-sse-kms-key' - Result: SSE-KMS metadata was NEVER copied → 500 errors 2. Missing SSE-C and SSE-S3 Headers: - SSE-C requires: IV, Algorithm, KeyMD5 - SSE-S3 requires: encrypted key data + standard headers - Old code: copied nothing for SSE-C/SSE-S3 → decryption failures Fix - Complete SSE Header Preservation: Now copies ALL SSE headers from first part to completed object: - SSE-C: SeaweedFSSSEIV, CustomerAlgorithm, CustomerKeyMD5 - SSE-KMS: SeaweedFSSSEKMSKeyHeader, AwsKmsKeyId, ServerSideEncryption - SSE-S3: SeaweedFSSSES3Key, ServerSideEncryption Applied consistently to all 3 code paths: 1. Versioned buckets (creates version file) 2. Suspended versioning (creates main object with null versionId) 3. Non-versioned buckets (creates main object) Why This Is Correct: The headers copied EXACTLY match what putToFiler stores during part upload (lines 496-521 in s3api_object_handlers_put.go). This ensures detectPrimarySSEType() can correctly identify encrypted multipart objects and trigger inline decryption with proper metadata. Fixes: TestSSEMultipartUploadIntegration (SSE-C and SSE-KMS subtests) * s3: Add debug logging for versioning state diagnosis Temporary debug logging to diagnose test_versioning_obj_plain_null_version_overwrite_suspended failure. Added glog.V(0) logging to show: 1. setBucketVersioningStatus: when versioning status is changed 2. PutObjectHandler: what versioning state is detected (Enabled/Suspended/none) 3. PutObjectHandler: which code path is taken (putVersionedObject vs putSuspendedVersioningObject) This will help identify if: - The versioning status is being set correctly in bucket config - The cache is returning stale/incorrect versioning state - The switch statement is correctly routing to suspended vs enabled handlers * s3: Enhanced versioning state tracing for suspended versioning diagnosis Added comprehensive logging across the entire versioning state flow: PutBucketVersioningHandler: - Log requested status (Enabled/Suspended) - Log when calling setBucketVersioningStatus - Log success/failure of status change setBucketVersioningStatus: - Log bucket and status being set - Log when config is updated - Log completion with error code updateBucketConfig: - Log versioning state being written to cache - Immediate cache verification after Set - Log if cache verification fails getVersioningState: - Log bucket name and state being returned - Log if object lock forces VersioningEnabled - Log errors This will reveal: 1. If PutBucketVersioning(Suspended) is reaching the handler 2. If the cache update succeeds 3. What state getVersioningState returns during PUT 4. Any cache consistency issues Expected to show why bucket still reports 'Enabled' after 'Suspended' call. * s3: Add SSE chunk detection debugging for multipart uploads Added comprehensive logging to diagnose why TestSSEMultipartUploadIntegration fails: detectPrimarySSEType now logs: 1. Total chunk count and extended header count 2. All extended headers with 'sse'/'SSE'/'encryption' in the name 3. For each chunk: index, SseType, and whether it has metadata 4. Final SSE type counts (SSE-C, SSE-KMS, SSE-S3) This will reveal if: - Chunks are missing SSE metadata after multipart completion - Extended headers are copied correctly from first part - The SSE detection logic is working correctly Expected to show if chunks have SseType=0 (none) or proper SSE types set. * s3: Trace SSE chunk metadata through multipart completion and retrieval Added end-to-end logging to track SSE chunk metadata lifecycle: During Multipart Completion (filer_multipart.go): 1. Log finalParts chunks BEFORE mkFile - shows SseType and metadata 2. Log versionEntry.Chunks INSIDE mkFile callback - shows if mkFile preserves SSE info 3. Log success after mkFile completes During GET Retrieval (s3api_object_handlers.go): 1. Log retrieved entry chunks - shows SseType and metadata after retrieval 2. Log detected SSE type result This will reveal at which point SSE chunk metadata is lost: - If finalParts have SSE metadata but versionEntry.Chunks don't → mkFile bug - If versionEntry.Chunks have SSE metadata but retrieved chunks don't → storage/retrieval bug - If chunks never have SSE metadata → multipart completion SSE processing bug Expected to show chunks with SseType=NONE during retrieval even though they were created with proper SseType during multipart completion. * s3: Fix SSE-C multipart IV base64 decoding bug Critical Bug Found: SSE-C multipart uploads were failing because: Root Cause: - entry.Extended[SeaweedFSSSEIV] stores base64-encoded IV (24 bytes for 16-byte IV) - SerializeSSECMetadata expects raw IV bytes (16 bytes) - During multipart completion, we were passing base64 IV directly → serialization error Error Message: "Failed to serialize SSE-C metadata for chunk in part X: invalid IV length: expected 16 bytes, got 24" Fix: - Base64-decode IV before passing to SerializeSSECMetadata - Added error handling for decode failures Impact: - SSE-C multipart uploads will now correctly serialize chunk metadata - Chunks will have proper SSE metadata for decryption during GET This fixes the SSE-C subtest of TestSSEMultipartUploadIntegration. SSE-KMS still has a separate issue (error code 23) being investigated. * fixes * kms sse * handle retry if not found in .versions folder and should read the normal object * quick check (no retries) to see if the .versions/ directory exists * skip retry if object is not found * explicit update to avoid sync delay * fix map update lock * Remove fmt.Printf debug statements * Fix SSE-KMS multipart base IV fallback to fail instead of regenerating * fmt * Fix ACL grants storage logic * header handling * nil handling * range read for sse content * test range requests for sse objects * fmt * unused code * upload in chunks * header case * fix url * bucket policy error vs bucket not found * jwt handling * fmt * jwt in request header * Optimize Case-Insensitive Prefix Check * dead code * Eliminated Unnecessary Stream Prefetch for Multipart SSE * range sse * sse * refactor * context * fmt * fix type * fix SSE-C IV Mismatch * Fix Headers Being Set After WriteHeader * fix url parsing * propergate sse headers * multipart sse-s3 * aws sig v4 authen * sse kms * set content range * better errors * Update s3api_object_handlers_copy.go * Update s3api_object_handlers.go * Update s3api_object_handlers.go * avoid magic number * clean up * Update s3api_bucket_policy_handlers.go * fix url parsing * context * data and metadata both use background context * adjust the offset * SSE Range Request IV Calculation * adjust logs * IV relative to offset in each part, not the whole file * collect logs * offset * fix offset * fix url * logs * variable * jwt * Multipart ETag semantics: conditionally set object-level Md5 for single-chunk uploads only. * sse * adjust IV and offset * multipart boundaries * ensures PUT and GET operations return consistent ETags * Metadata Header Case * CommonPrefixes Sorting with URL Encoding * always sort * remove the extra PathUnescape call * fix the multipart get part ETag * the FileChunk is created without setting ModifiedTsNs * Sort CommonPrefixes lexicographically to match AWS S3 behavior * set md5 for multipart uploads * prevents any potential data loss or corruption in the small-file inline storage path * compiles correctly * decryptedReader will now be properly closed after use * Fixed URL encoding and sort order for CommonPrefixes * Update s3api_object_handlers_list.go * SSE-x Chunk View Decryption * Different IV offset calculations for single-part vs multipart objects * still too verbose in logs * less logs * ensure correct conversion * fix listing * nil check * minor fixes * nil check * single character delimiter * optimize * range on empty object or zero-length * correct IV based on its position within that part, not its position in the entire object * adjust offset * offset Fetch FULL encrypted chunk (not just the range) Adjust IV by PartOffset/ChunkOffset only Decrypt full chunk Skip in the DECRYPTED stream to reach OffsetInChunk * look breaking * refactor * error on no content * handle intra-block byte skipping * Incomplete HTTP Response Error Handling * multipart SSE * Update s3api_object_handlers.go * address comments * less logs * handling directory * Optimized rejectDirectoryObjectWithoutSlash() to avoid unnecessary lookups * Revert "handling directory" This reverts commit 3a335f0ac33c63f51975abc63c40e5328857a74b. * constant * Consolidate nil entry checks in GetObjectHandler * add range tests * Consolidate redundant nil entry checks in HeadObjectHandler * adjust logs * SSE type * large files * large files Reverted the plain-object range test * ErrNoEncryptionConfig * Fixed SSERangeReader Infinite Loop Vulnerability * Fixed SSE-KMS Multipart ChunkReader HTTP Body Leak * handle empty directory in S3, added PyArrow tests * purge unused code * Update s3_parquet_test.py * Update requirements.txt * According to S3 specifications, when both partNumber and Range are present, the Range should apply within the selected part's boundaries, not to the full object. * handle errors * errors after writing header * https * fix: Wait for volume assignment readiness before running Parquet tests The test-implicit-dir-with-server test was failing with an Internal Error because volume assignment was not ready when tests started. This fix adds a check that attempts a volume assignment and waits for it to succeed before proceeding with tests. This ensures that: 1. Volume servers are registered with the master 2. Volume growth is triggered if needed 3. The system can successfully assign volumes for writes Fixes the timeout issue where boto3 would retry 4 times and fail with 'We encountered an internal error, please try again.' * sse tests * store derived IV * fix: Clean up gRPC ports between tests to prevent port conflicts The second test (test-implicit-dir-with-server) was failing because the volume server's gRPC port (18080 = VOLUME_PORT + 10000) was still in use from the first test. The cleanup code only killed HTTP port processes, not gRPC port processes. Added cleanup for gRPC ports in all stop targets: - Master gRPC: MASTER_PORT + 10000 (19333) - Volume gRPC: VOLUME_PORT + 10000 (18080) - Filer gRPC: FILER_PORT + 10000 (18888) This ensures clean state between test runs in CI. * add import * address comments * docs: Add placeholder documentation files for Parquet test suite Added three missing documentation files referenced in test/s3/parquet/README.md: 1. TEST_COVERAGE.md - Documents 43 total test cases (17 Go unit tests, 6 Python integration tests, 20 Python end-to-end tests) 2. FINAL_ROOT_CAUSE_ANALYSIS.md - Explains the s3fs compatibility issue with PyArrow, the implicit directory problem, and how the fix works 3. MINIO_DIRECTORY_HANDLING.md - Compares MinIO's directory handling approach with SeaweedFS's implementation Each file contains: - Title and overview - Key technical details relevant to the topic - TODO sections for future expansion These placeholder files resolve the broken README links and provide structure for future detailed documentation. * clean up if metadata operation failed * Update s3_parquet_test.py * clean up * Update Makefile * Update s3_parquet_test.py * Update Makefile * Handle ivSkip for non-block-aligned offsets * Update README.md * stop volume server faster * stop volume server in 1 second * different IV for each chunk in SSE-S3 and SSE-KMS * clean up if fails * testing upload * error propagation * fmt * simplify * fix copying * less logs * endian * Added marshaling error handling * handling invalid ranges * error handling for adding to log buffer * fix logging * avoid returning too quickly and ensure proper cleaning up * Activity Tracking for Disk Reads * Cleanup Unused Parameters * Activity Tracking for Kafka Publishers * Proper Test Error Reporting * refactoring * less logs * less logs * go fmt * guard it with if entry.Attributes.TtlSec > 0 to match the pattern used elsewhere. * Handle bucket-default encryption config errors explicitly for multipart * consistent activity tracking * obsolete code for s3 on filer read/write handlers * Update weed/s3api/s3api_object_handlers_list.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2025-11-18 23:18:35 -08:00

1 2 3 4 5 ...

906 Commits