seaweedFS

Author	SHA1	Message	Date
Chris Lu	4c88fbfd5e	Fix nil pointer crash during concurrent vacuum compaction (#8592) * check for nil needle map before compaction sync When CommitCompact runs concurrently, it sets v.nm = nil under dataFileAccessLock. CompactByIndex does not hold that lock, so v.nm.Sync() can hit a nil pointer. Add an early nil check to return an error instead of crashing. Fixes #8591 * guard copyDataBasedOnIndexFile size check against nil needle map The post-compaction size validation at line 538 accesses v.nm.ContentSize() and v.nm.DeletedSize(). If CommitCompact has concurrently set v.nm to nil, this causes a SIGSEGV. Skip the validation when v.nm is nil since the actual data copy uses local needle maps (oldNm/newNm) and is unaffected. Fixes #8591 * use atomic.Bool for compaction flags to prevent concurrent vacuum races The isCompacting and isCommitCompacting flags were plain bools read and written from multiple goroutines without synchronization. This allowed concurrent vacuums on the same volume to pass the guard checks and run simultaneously, leading to the nil pointer crash. Using atomic.Bool with CompareAndSwap ensures only one compaction or commit can run per volume at a time. Fixes #8591 * use go-version-file in CI workflows instead of hardcoded versions Use go-version-file: 'go.mod' so CI automatically picks up the Go version from go.mod, avoiding future version drift. Reordered checkout before setup-go in go.yml and e2e.yml so go.mod is available. Removed the now-unused GO_VERSION env vars. * capture v.nm locally in CompactByIndex to close TOCTOU race A bare nil check on v.nm followed by v.nm.Sync() has a race window where CommitCompact can set v.nm = nil between the two. Snapshot the pointer into a local variable so the nil check and Sync operate on the same reference. * add dynamic timeouts to plugin worker vacuum gRPC calls All vacuum gRPC calls used context.Background() with no deadline, so the plugin scheduler's execution timeout could kill a job while a large volume compact was still in progress. Use volume-size-scaled timeouts matching the topology vacuum approach: 3 min/GB for compact, 1 min/GB for check, commit, and cleanup. Fixes #8591 * Revert "add dynamic timeouts to plugin worker vacuum gRPC calls" This reverts commit 80951934c37416bc4f6c1472a5d3f8d204a637d9. * unify compaction lifecycle into single atomic flag Replace separate isCompacting and isCommitCompacting flags with a single isCompactionInProgress atomic.Bool. This ensures CompactBy*, CommitCompact, Close, and Destroy are mutually exclusive — only one can run at a time per volume. Key changes: - All entry points use CompareAndSwap(false, true) to claim exclusive access. CompactByVolumeData and CompactByIndex now also guard v.nm and v.DataBackend with local captures. - Close() waits for the flag outside dataFileAccessLock to avoid deadlocking with CommitCompact (which holds the flag while waiting for the lock). It claims the flag before acquiring the lock so no new compaction can start. - Destroy() uses CAS instead of a racy Load check, preventing concurrent compaction from racing with volume teardown. - unmountVolumeByCollection no longer deletes from the map; DeleteCollectionFromDiskLocation removes entries only after successful Destroy, preventing orphaned volumes on failure. Fixes #8591	2026-03-10 13:31:45 -07:00
Jaehoon Kim	d22e3d3495	Fix uncleanable orphans issue with `volume.fsck -forcePurging` (#7332) - Modified `needle_map_memory.go` to include needles with size=0 during needle map loading - Updated `volume_write.go` to handle size=0 needles in delete operations	2025-10-16 12:21:51 -07:00
Chris Lu	69553e5ba6	convert error fromating to %w everywhere (#6995)	2025-07-16 23:39:27 -07:00
Chris Lu	90802cb201	revert part of `d8c574a5ef` (#6829)	2025-06-01 12:27:49 -07:00
dongxufeng	ff878a542d	correctly report volume with input/output error to master (#6790) * correctly capture io error and report to master * code fix * check io error by error.Is --------- Co-authored-by: dongxu_feng <dongxu_feng@intsig.net>	2025-05-15 00:56:43 -07:00
chrislu	d8c574a5ef	fix fsync logic	2025-05-14 01:33:36 -07:00
chrislu	f3dde99796	adjust error message	2024-12-05 09:33:50 -08:00
dsd	1e13b6879c	fix(volume): to avoid duplicate write a same needle (#6138) fix WriteNeedleBlob to avoid duplicate write a same needle Co-authored-by: 邓书东 <shudong_deng@hhnb2024010108.intsig.com> Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>	2024-10-18 01:20:50 -07:00
skycope	0e8a54f6f6	fix write volume over size MaxPossibleVolumeSize (#5190) Co-authored-by: Yang Wang <yangwang@weride.ai>	2024-01-11 20:23:46 -08:00
chrislu	c278bac263	avoid nil needle map fix https://github.com/seaweedfs/seaweedfs/issues/4640	2023-07-07 22:16:58 -07:00
柏杰	0b0fb9b9e4	avoid data race read volume.IsEmpty (#4574) * avoid data race read volume.IsEmpty - avoid phantom read isEmpty for onlyEmpty - use `v.DataBackend.GetStat()` in v.dataFileAccessLock scope * add Destroy(onlyEmpty: true) test * add Destroy(onlyEmpty: false) test * remove unused `IsEmpty()` * change literal `8` to `SuperBlockSize`	2023-06-14 14:39:58 -07:00
chrislu	d880fc2bb3	fix merge	2022-10-23 18:26:22 -07:00
Konstantin Lebedev	6253058d9d	ensure monotonic n.AppendAtNs in each place (#3880) https://github.com/seaweedfs/seaweedfs/issues/3852 Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>	2022-10-23 18:24:52 -07:00
chrislu	f2d9049e6a	fix size variable	2022-10-23 13:05:37 -07:00
chrislu	184fbb6c50	volume server: remote tier volumes only soft delete in local index fix https://github.com/seaweedfs/seaweedfs/issues/3889	2022-10-23 13:04:38 -07:00
chrislu	9c8678ded9	ensure monotonic n.AppendAtNs fix https://github.com/seaweedfs/seaweedfs/issues/3852	2022-10-13 23:15:00 -07:00
Ryan Russell	277976bd76	refactor(storage): readability improvements (#3703) Signed-off-by: Ryan Russell <git@ryanrussell.org>	2022-09-16 02:43:17 -07:00
chrislu	26dbc6c905	move to https://github.com/seaweedfs/seaweedfs	2022-07-29 00:17:28 -07:00
chrislu	37ab8909b0	use two flags: v.isCompacting and v.isCommitCompacting	2022-04-26 23:28:34 -07:00
Chris Lu	78e8ddf910	Only when tailing volume, the zero-ed cookie should skip checking. This only happens when checkCookie == false and fsync == false.	2021-08-13 02:09:35 -07:00
Chris Lu	a8617c1a39	tail volume: fix zero cookie problem from batch deletion	2021-08-13 01:54:35 -07:00
Chris Lu	b465095db1	shell: add volume.check.disk to fix inconsistency for replicated volumes fix https://github.com/chrislusf/seaweedfs/issues/1923	2021-03-22 00:03:16 -07:00
Chris Lu	102a951377	refactor, split into 2 files	2021-03-21 13:05:53 -07:00

23 Commits