Commit Graph

107 Commits

Author SHA1 Message Date
Chris Lu
891a2fb6eb Admin: misc improvements on admin server and workers. EC now works. (#7055)
* initial design

* added simulation as tests

* reorganized the codebase to move the simulation framework and tests into their own dedicated package

* integration test. ec worker task

* remove "enhanced" reference

* start master, volume servers, filer

Current Status
 Master: Healthy and running (port 9333)
 Filer: Healthy and running (port 8888)
 Volume Servers: All 6 servers running (ports 8080-8085)
🔄 Admin/Workers: Will start when dependencies are ready

* generate write load

* tasks are assigned

* admin start with grpc port. worker has its own working directory

* Update .gitignore

* working worker and admin. Task detection is not working yet.

* compiles, detection uses volumeSizeLimitMB from master

* compiles

* worker retries connecting to admin

* build and restart

* rendering pending tasks

* skip task ID column

* sticky worker id

* test canScheduleTaskNow

* worker reconnect to admin

* clean up logs

* worker register itself first

* worker can run ec work and report status

but:
1. one volume should not be repeatedly worked on.
2. ec shards need to be distributed and source data should be deleted.

* move ec task logic

* listing ec shards

* local copy, ec. Need to distribute.

* ec is mostly working now

* distribution of ec shards needs improvement
* need configuration to enable ec

* show ec volumes

* interval field UI component

* rename

* integration test with vacuuming

* garbage percentage threshold

* fix warning

* display ec shard sizes

* fix ec volumes list

* Update ui.go

* show default values

* ensure correct default value

* MaintenanceConfig use ConfigField

* use schema defined defaults

* config

* reduce duplication

* refactor to use BaseUIProvider

* each task register its schema

* checkECEncodingCandidate use ecDetector

* use vacuumDetector

* use volumeSizeLimitMB

* remove

remove

* remove unused

* refactor

* use new framework

* remove v2 reference

* refactor

* left menu can scroll now

* The maintenance manager was not being initialized when no data directory was configured for persistent storage.

* saving config

* Update task_config_schema_templ.go

* enable/disable tasks

* protobuf encoded task configurations

* fix system settings

* use ui component

* remove logs

* interface{} Reduction

* reduce interface{}

* reduce interface{}

* avoid from/to map

* reduce interface{}

* refactor

* keep it DRY

* added logging

* debug messages

* debug level

* debug

* show the log caller line

* use configured task policy

* log level

* handle admin heartbeat response

* Update worker.go

* fix EC rack and dc count

* Report task status to admin server

* fix task logging, simplify interface checking, use erasure_coding constants

* factor in empty volume server during task planning

* volume.list adds disk id

* track disk id also

* fix locking scheduled and manual scanning

* add active topology

* simplify task detector

* ec task completed, but shards are not showing up

* implement ec in ec_typed.go

* adjust log level

* dedup

* implementing ec copying shards and only ecx files

* use disk id when distributing ec shards

🎯 Planning: ActiveTopology creates DestinationPlan with specific TargetDisk
📦 Task Creation: maintenance_integration.go creates ECDestination with DiskId
🚀 Task Execution: EC task passes DiskId in VolumeEcShardsCopyRequest
💾 Volume Server: Receives disk_id and stores shards on specific disk (vs.store.Locations[req.DiskId])
📂 File System: EC shards and metadata land in the exact disk directory planned

* Delete original volume from all locations

* clean up existing shard locations

* local encoding and distributing

* Update docker/admin_integration/EC-TESTING-README.md

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* check volume id range

* simplify

* fix tests

* fix types

* clean up logs and tests

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-07-30 12:38:03 -07:00
chrislu
c602f53a6e tail-volume-uses-the-source-volume-version 2025-06-16 22:46:13 -07:00
chrislu
96632a34b1 add version to volume proto 2025-06-16 22:05:06 -07:00
chrislu
9873b033d1 backward compatible vif loading 2024-10-28 19:44:30 -07:00
chrislu
2f3d820f52 rename proto field
This should not have any impact.
2024-10-24 21:36:56 -07:00
chrislu
ae5bd0667a rename proto field from DestroyTime to expire_at_sec
For a TTL volume converted into an EC volume, this change may leave the volume in place.
2024-10-24 21:35:11 -07:00
Max Denushev
d056c0ddf2 fix(volume): don't persist RO state in specific cases (#6058)
* fix(volume): don't persist RO state in specific cases

* fix(volume): writable always persist
2024-09-24 16:15:54 -07:00
Bruce
f9e141a412 persist readonly state to volume info (#5977) 2024-09-05 07:58:24 -07:00
augustazz
0b00706454 EC volume supports expiration and displays expiration message when executing volume.list (#5895)
* ec volume expire

* volume.list show DestroyTime

* comments

* code optimization

---------

Co-authored-by: xuwenfeng <xuwenfeng1@zto.com>
2024-08-16 00:20:00 -07:00
chrislu
fdf7193ae7 rename 2024-08-13 13:59:24 -07:00
chrislu
07f4998188 add dat file size into vif for EC 2024-08-13 13:56:00 -07:00
Konstantin Lebedev
2b3e39397e fix: skipping checking active volumes with the same number of files at the moment (#4893)
* fix: skipping checking active volumes with the same number of files at the moment
 https://github.com/seaweedfs/seaweedfs/issues/4140

* refactor with comments
https://github.com/seaweedfs/seaweedfs/issues/4140

* add TestShouldSkipVolume

---------

Co-authored-by: Konstantin Lebedev <9497591+kmlebedev@users.noreply.github.co>
2023-10-09 09:57:26 -07:00
Konstantin Lebedev
25535e9c36 Delete volume is empty (#4561)
* use onlyEmpty for deleteVolume
https://github.com/seaweedfs/seaweedfs/issues/4559

* fix IsEmpty

* fix test

---------

Co-authored-by: Konstantin Lebedev <9497591+kmlebedev@users.noreply.github.co>
2023-06-12 10:42:44 -07:00
wusong
26f15d0079 Fix no more writable volumes by delay judgment (#4548)
* fix "no more writable volumes" while disk free space is sufficient, by adding a time delay

* reset

---------

Co-authored-by: wang wusong <wangwusong@virtaitech.com>
2023-06-05 10:17:21 -07:00
chrislu
e1ca6308cb add chunk etag when downloading from remote storage
fix https://github.com/seaweedfs/seaweedfs/issues/3987
2022-12-10 21:49:07 -08:00
James Hartig
81624de27b Include name/mime in ReadAllNeedles (#4005) 2022-11-23 15:59:38 -08:00
James Hartig
4c85da7844 Include meta in ReadAllNeedles (#3991)
This is useful for doing backups on the data so we can accurately store the
last modified time, the compression state, and verify the crc.

Previously we were doing VolumeNeedleStatus and then an HTTP request which
needlessly read from the dat file twice.
2022-11-20 20:19:41 -08:00
Eric Yang
51d462f204 ADHOC: volume fsck using append at ns (#3906)
* ADHOC: volume fsck using append at ns

* nit

* nit

Co-authored-by: root <root@HQ-10MSTD3EY.roblox.local>
2022-10-24 22:09:38 -07:00
chrislu
de286fe662 shell: volume.move handles volume moved to cloud tier
fix https://github.com/seaweedfs/seaweedfs/issues/3803
2022-10-16 17:52:22 -07:00
Konstantin Lebedev
2f72103c83 avoid load volume file with BytesOffset mismatch (#3841)
* avoid load volume file with BytesOffset mismatch

https://github.com/seaweedfs/seaweedfs/issues/2966

* set BytesOffset if there is no VolumeInfoFile

* typos fail => failed

* exit if bytesOffset mismatch
2022-10-14 00:18:09 -07:00
chrislu
dcd0743a35 remove unused ReadNeedleBlobRequest.needle_id
fix https://github.com/seaweedfs/seaweedfs/issues/3853
2022-10-13 23:10:46 -07:00
Ryan Russell
12914af4d8 Character readability (#3678)
* refactor(pb): `quote_charactoer` -> `quote_character`

Signed-off-by: Ryan Russell <git@ryanrussell.org>

* refactor(volume_server): `QuoteCharactoer` -> `QuoteCharacter`

Signed-off-by: Ryan Russell <git@ryanrussell.org>

* refactor(volume_server): `quoteCharactoer` -> `quoteCharacter`

Signed-off-by: Ryan Russell <git@ryanrussell.org>

Signed-off-by: Ryan Russell <git@ryanrussell.org>
2022-09-14 13:09:53 -07:00
Eric Yang
b324a6536c ADHOC: add read needle meta grpc (#3581)
* ADHOC: add read needle meta grpc

* add test

* nit

Co-authored-by: root <root@HQ-10MSTD3EY.roblox.local>
2022-09-06 23:51:27 -07:00
qzh
74b53729e1 feat(weed.move): add a speed limit parameter of moving files (#3478)
* feat(weed.move): add a speed limit parameter of moving files

* fix(weed.move): set the default value of ioBytePerSecond to vs.compactionBytePerSecond

Co-authored-by: zhihao.qu <zhihao.qu@ly.com>
2022-08-21 23:08:31 -07:00
Konstantin Lebedev
fc65122766 rename to LoadAvg_1M 2022-08-01 21:32:21 +05:00
Konstantin Lebedev
5209ebbeef remove percent 2022-08-01 20:40:38 +05:00
Konstantin Lebedev
3c75479e2b Merge branch 'master' into gentle_vacuum
# Conflicts:
#	weed/pb/messaging_pb/messaging.pb.go
#	weed/pb/messaging_pb/messaging_grpc.pb.go
#	weed/pb/s3_pb/s3.pb.go
#	weed/pb/volume_server_pb/volume_server.pb.go
#	weed/server/volume_grpc_vacuum.go
2022-08-01 14:45:22 +05:00
chrislu
26dbc6c905 move to https://github.com/seaweedfs/seaweedfs 2022-07-29 00:17:28 -07:00
Konstantin Lebedev
2f0dda384d vacuum show LA 2022-07-29 11:59:33 +05:00
chrislu
d12f431d98 collect volume server status 2022-06-12 11:56:23 -07:00
chrislu
b4be56bb3b add timing info during ping operation 2022-04-16 12:45:49 -07:00
chrislu
800cbc004c volume server adds ping function 2022-04-01 16:37:06 -07:00
Chris Lu
5435027ff0 volume copy: stream out copying progress and avoid grpc request timeout
fix https://github.com/chrislusf/seaweedfs/issues/2386
2021-10-24 02:52:56 -07:00
Chris Lu
3be3c17f59 volume vacuum: avoid timeout with streaming progress report
fix https://github.com/chrislusf/seaweedfs/issues/2396
2021-10-24 01:55:34 -07:00
Chris Lu
225b019fe0 stream read multiple volumes in a volume server 2021-09-27 02:51:31 -07:00
Chris Lu
c4d7ee6c5c volume server: read all files in a volume 2021-09-27 01:45:32 -07:00
Chris Lu
e5fc35ed0c change server address from string to a type 2021-09-12 22:47:52 -07:00
Chris Lu
2b1feb732c remote.cache supports replication 2021-09-06 18:30:44 -07:00
Chris Lu
d1a4e19a3f volume: copy file also copies modification time
to ensure ttl can work well
2021-09-01 02:42:57 -07:00
Chris Lu
05a648bb96 refactor: separating out remote.proto 2021-08-26 15:18:34 -07:00
Chris Lu
713c035a6e shell: remote.cache remote.uncache 2021-08-09 14:35:18 -07:00
Chris Lu
270770d7d7 refactor 2021-08-07 14:18:53 -07:00
Chris Lu
9df7d16791 read <- remote_storage 2021-07-31 22:39:38 -07:00
Chris Lu
b465095db1 shell: add volume.check.disk to fix inconsistency for replicated volumes
fix https://github.com/chrislusf/seaweedfs/issues/1923
2021-03-22 00:03:16 -07:00
Chris Lu
770393a48c volume: add capability to change disk type when moving a volume 2021-02-09 23:58:08 -08:00
Chris Lu
2e8dba571b adjust volume server UI 2020-12-14 00:51:57 -08:00
Chris Lu
94525aa0fd allocate volume by disk type 2020-12-13 23:08:21 -08:00
Chris Lu
5d6753fb98 shell: add volumeServer.leave command 2020-09-13 21:25:51 -07:00
James Hartig
3ccfa4c6ad Added VolumeMarkWritable and VolumeStatus grpc methods
This is necessary for copy to mark as read-only and then restore the
original state afterwards.
2020-08-19 11:42:56 -04:00
James Hartig
229f11c660 Added VolumeNeedleStatus volume server grpc method
This is needed for the diffing tool to get the cookie for a needle
2020-07-22 15:02:21 -04:00