61 Commits

Author SHA1 Message Date
Chris Lu
baae672b6f feat: auto-disable master vacuum when plugin worker is active (#8624)
* feat: auto-disable master vacuum when plugin vacuum worker is active

When a vacuum-capable plugin worker connects to the admin server, the
admin server calls DisableVacuum on the master to prevent the automatic
scheduled vacuum from conflicting with the plugin worker's vacuum. When
the worker disconnects, EnableVacuum is called to restore the default
behavior. A safety net in the topology refresh loop re-enables vacuum
if the admin server disconnects without cleanup.

* rename isAdminServerConnected to isAdminServerConnectedFunc

* add 5s timeout to DisableVacuum/EnableVacuum gRPC calls

Prevents the monitor goroutine from blocking indefinitely if the
master is unresponsive.

* track plugin ownership of vacuum disable to avoid overriding operator

- Add vacuumDisabledByPlugin flag to Topology, set when DisableVacuum
  is called while admin server is connected (i.e., by plugin monitor)
- Safety net only re-enables vacuum when it was disabled by plugin,
  not when an operator intentionally disabled it via shell command
- EnableVacuum clears the plugin flag

* extract syncVacuumState for testability, add fake toggler tests

Extract the single sync step into syncVacuumState() with a
vacuumToggler interface. Add TestSyncVacuumState with a fake
toggler that verifies disable/enable calls on state transitions.

* use atomic.Bool for isDisableVacuum and vacuumDisabledByPlugin

Both fields are written by gRPC handlers and read by the vacuum
goroutine, causing a data race. Use atomic.Bool with Store/Load
for thread-safe access.

* use explicit by_plugin field instead of connection heuristic

Add by_plugin bool to DisableVacuumRequest proto so the caller
declares intent explicitly. The admin server monitor sets it to
true; shell commands leave it false. This prevents an operator's
intentional disable from being auto-reversed by the safety net.

* use setter for admin server callback instead of function parameter

Move isAdminServerConnected from StartRefreshWritableVolumes
parameter to Topology.SetAdminServerConnectedFunc() setter.
Keeps the function signature stable and decouples the topology
layer from the admin server concept.

* suppress repeated log messages on persistent sync failures

Add retrying parameter to syncVacuumState so the initial
state transition is logged at V(0) but subsequent retries
of the same transition are silent until the call succeeds.

* clear plugin ownership flag on manual DisableVacuum

Prevents stale plugin flag from causing incorrect auto-enable
when an operator manually disables vacuum after a plugin had
previously disabled it.

* add by_plugin to EnableVacuumRequest for symmetric ownership tracking

Plugin-driven EnableVacuum now only re-enables if the plugin was
the one that disabled it. If an operator manually disabled vacuum
after the plugin, the plugin's EnableVacuum is a no-op. This
prevents the plugin monitor from overriding operator intent on
worker disconnect.

* use cancellable context for monitorVacuumWorker goroutine

Replace context.Background() with a cancellable context stored
as bgCancel on AdminServer. Shutdown() calls bgCancel() so
monitorVacuumWorker exits cleanly via ctx.Done().

* track operator and plugin vacuum disables independently

Replace single isDisableVacuum flag with two independent flags:
vacuumDisabledByOperator and vacuumDisabledByPlugin. Each caller
only flips its own flag. The effective disabled state is the OR
of both. This prevents a plugin connect/disconnect cycle from
overriding an operator's manual disable, and vice versa.

* fix safety net to clear plugin flag, not operator flag

The safety net should call EnableVacuumByPlugin() to clear only
the plugin disable flag when the admin server disconnects. The
previous call to EnableVacuum() incorrectly cleared the operator
flag instead.
2026-03-13 22:49:12 -07:00
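
The final design in the commit above (two independent disable flags, stored as atomic.Bool, with the effective state being the OR of both) can be sketched roughly as follows. This is a minimal illustration, not the actual SeaweedFS code: the vacuumState type, DisableVacuumByPlugin, IsVacuumDisabled, and safetyNet are hypothetical names, while the two flag names, EnableVacuum, and EnableVacuumByPlugin are taken from the commit message.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// vacuumState is a hypothetical stand-in for the vacuum-related
// fields the commit message says live on Topology.
type vacuumState struct {
	vacuumDisabledByOperator atomic.Bool // flipped only by operator (shell) calls
	vacuumDisabledByPlugin   atomic.Bool // flipped only by the plugin monitor
}

// Each caller only touches its own flag.
func (v *vacuumState) DisableVacuum()         { v.vacuumDisabledByOperator.Store(true) }
func (v *vacuumState) EnableVacuum()          { v.vacuumDisabledByOperator.Store(false) }
func (v *vacuumState) DisableVacuumByPlugin() { v.vacuumDisabledByPlugin.Store(true) } // assumed name
func (v *vacuumState) EnableVacuumByPlugin()  { v.vacuumDisabledByPlugin.Store(false) }

// IsVacuumDisabled reports the effective state: the OR of both flags.
func (v *vacuumState) IsVacuumDisabled() bool {
	return v.vacuumDisabledByOperator.Load() || v.vacuumDisabledByPlugin.Load()
}

// safetyNet models the refresh-loop check: if the admin server is gone,
// clear only the plugin flag and leave the operator flag untouched.
func (v *vacuumState) safetyNet(adminServerConnected bool) {
	if !adminServerConnected && v.vacuumDisabledByPlugin.Load() {
		v.EnableVacuumByPlugin()
	}
}

func main() {
	var v vacuumState
	v.DisableVacuum()         // operator disables via shell
	v.DisableVacuumByPlugin() // plugin worker connects
	v.safetyNet(false)        // admin server disappears without cleanup

	fmt.Println(v.IsVacuumDisabled()) // true: the operator's disable still holds
	v.EnableVacuum()
	fmt.Println(v.IsVacuumDisabled()) // false
}
```

Keeping one flag per owner means neither side needs to know why the other disabled vacuum, which is what lets a plugin connect/disconnect cycle leave an operator's choice intact.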
dsd
72af97162f [shell] feat:stop vacuum immediately once volume.vacuum.disable was executed (#6375)
stop vacuum immediately once volume.vacuum.disable was executed

Co-authored-by: dsd <dsd2019@foxmail.com>
2024-12-18 11:56:40 -08:00
chrislu
4463296811 add parallel vacuuming 2024-08-21 22:53:54 -07:00
chrislu
b3696024d1 add warning for not enough copies when skipping vacuuming volumes
fix https://github.com/seaweedfs/seaweedfs/issues/5906
2024-08-20 09:39:35 -07:00
wusong
9bdbf9c880 revert #4491 (#4550)
Co-authored-by: wang wusong <wangwusong@virtaitech.com>
2023-06-06 00:17:51 -07:00
wusong
26f15d0079 Fix no more writable volumes by delay judgment (#4548)
* fix no more writable volumes while disk free space is sufficient by time delay

* reset

---------

Co-authored-by: wang wusong <wangwusong@virtaitech.com>
2023-06-05 10:17:21 -07:00
wusong
8fffe3e822 fix no more writable volumes while disk free space is sufficient (#4491)
Co-authored-by: wang wusong <wangwusong@virtaitech.com>
2023-05-21 22:18:50 -07:00
wusong
2e240704ab Writables inconsistency (#4417)
fix: inconsistent read and write permissions between master and volume server

Signed-off-by: Wusong Wang <wangwusong@virtaitech.com>
Co-authored-by: Wusong Wang <wangwusong@virtaitech.com>
2023-04-21 00:14:41 -07:00
Konstantin Lebedev
409c9328de [master] avoid vacuum if not enough replica copies (#3924)
avoid vacuum if not enough replica copies
2022-10-30 20:34:19 -07:00
Konstantin Lebedev
fc65122766 rename to LoadAvg_1M 2022-08-01 21:32:21 +05:00
Konstantin Lebedev
5209ebbeef remove percent 2022-08-01 20:40:38 +05:00
Konstantin Lebedev
78cbd8002f revert Sleep 2022-08-01 20:21:23 +05:00
Konstantin Lebedev
cd5c7ad052 move to github.com/seaweedfs/seaweedfs 2022-08-01 16:36:32 +05:00
Konstantin Lebedev
3c75479e2b Merge branch 'master' into gentle_vacuum
# Conflicts:
#	weed/pb/messaging_pb/messaging.pb.go
#	weed/pb/messaging_pb/messaging_grpc.pb.go
#	weed/pb/s3_pb/s3.pb.go
#	weed/pb/volume_server_pb/volume_server.pb.go
#	weed/server/volume_grpc_vacuum.go
2022-08-01 14:45:22 +05:00
Konstantin Lebedev
c0d92f61a1 comment 2022-08-01 14:40:42 +05:00
Konstantin Lebedev
1d29f67c02 revert disk stats 2022-08-01 14:29:41 +05:00
chrislu
26dbc6c905 move to https://github.com/seaweedfs/seaweedfs 2022-07-29 00:17:28 -07:00
Konstantin Lebedev
2f0dda384d vacuum show LA 2022-07-29 11:59:33 +05:00
chrislu
48382676d2 fix filtering by volume id 2022-07-08 10:29:24 -07:00
zzq09494
9df5ad5309 fix: vacuum create a lot of connections quickly 2022-06-22 09:57:22 +08:00
Konstantin Lebedev
36c5a59ed8 add help 2022-04-18 19:36:14 +05:00
Konstantin Lebedev
1e35b4929f shell vacuum volume by collection and volume id 2022-04-18 18:40:58 +05:00
chrislu
9f9ef1340c use streaming mode for long poll grpc calls
streaming mode would create separate grpc connections for each call.
this is to ensure the long poll connections are properly closed.
2021-12-26 00:15:03 -08:00
Chris Lu
3be3c17f59 volume vacuum: avoid timeout with streaming progress report
fix https://github.com/chrislusf/seaweedfs/issues/2396
2021-10-24 01:55:34 -07:00
Chris Lu
e5fc35ed0c change server address from string to a type 2021-09-12 22:47:52 -07:00
Chris Lu
2270737344 volume: avoid fixed vacuum timeout for large volumes
1GB for 3 minutes, about 5.7MB/s
2021-02-22 12:52:37 -08:00
Chris Lu
003b6245e7 fix nil 2020-12-02 00:09:19 -08:00
Chris Lu
965413c21b shell: add volume.vacuum command 2020-11-28 23:18:02 -08:00
Chris Lu
410b818aa7 master: avoid timer leakage 2020-10-19 14:24:57 -07:00
Chris Lu
c7d7b1a0f6 Merge pull request #1485 from LIBA-S/fix_oversized
Correct the oversized state of volume after compaction
2020-09-23 19:24:30 -07:00
LIBA-S
eecd6b5d35 Fix a race condition when handling VolumeLocationList 2020-09-23 20:56:51 +08:00
LIBA-S
0157798ebf Correct the oversized state of volume after compaction 2020-09-23 20:27:42 +08:00
Chris Lu
c3cb6fa1d7 volume: compaction can cause readonly volumes
address https://github.com/chrislusf/seaweedfs/issues/1233
2020-03-17 09:43:57 -07:00
Chris Lu
3cc9e85895 volume: vacuum pass preallocate variable 2020-03-13 16:17:44 -07:00
Chris Lu
c90eb0da1f volume: handling readonly volumes after compaction
ensure readonly volumes are not added as writable
2020-03-13 15:41:27 -07:00
Chris Lu
892e726eb9 avoid reusing context object
fix https://github.com/chrislusf/seaweedfs/issues/1182
2020-02-25 21:50:12 -08:00
Chris Lu
72a64a5cf8 use the same context object in order to retry 2020-01-26 14:42:11 -08:00
zhangsong
e83c36e26f fix the bug of volume never being vacuumed 2019-12-02 13:25:32 +08:00
divinerapier
5656d43264 can not break out of for-select block
Signed-off-by: divinerapier <poriter.coco@gmail.com>
2019-11-20 08:25:29 +08:00
Chris Lu
79762385bd master: ensure only one exclusive vacuum process
fix https://github.com/chrislusf/seaweedfs/issues/1011
2019-07-21 21:49:10 -07:00
Chris Lu
4b15c8f0c4 volume: lock writables changes 2019-07-21 13:49:09 -07:00
Chris Lu
e5506152c0 refactoring 2019-04-18 21:43:36 -07:00
Chris Lu
95e0520182 weed volume: add grpc operation to replicate a volume to local 2019-03-23 11:33:34 -07:00
Chris Lu
e108688990 avoid grpc 5 seconds timeout
some operations may take longer than 5 seconds.

only keep the timeout for raft operations
2019-02-20 01:01:01 -08:00
Chris Lu
77b9af531d adding grpc mutual tls 2019-02-18 12:11:52 -08:00
Sergey
aa5ccff6d2 fixing of typos 2019-02-06 18:59:15 +05:00
bingoohuang
ab6be025d7 go fmt and fix some typo 2019-01-17 09:17:19 +08:00
Chris Lu
2d23d86fd3 no timeout for volume vacuum
revert changes on volume vacuum timeout from https://github.com/chrislusf/seaweedfs/pull/829
2019-01-10 09:07:40 -08:00
chenwanli
0a3e83a36a Set timeout for master and volume non-streaming rpc 2019-01-10 19:41:03 +08:00
Chris Lu
03cfb4267f adjust vacuum logging 2018-12-31 00:06:52 -08:00