Files
seaweedFS/docker/admin_integration/docker-compose-ec-test.yml
Chris Lu 891a2fb6eb Admin: misc improvements on admin server and workers. EC now works. (#7055)
* initial design

* added simulation as tests

* reorganized the codebase to move the simulation framework and tests into their own dedicated package

* integration test. ec worker task

* remove "enhanced" reference

* start master, volume servers, filer

Current Status
- Master: healthy and running (port 9333)
- Filer: healthy and running (port 8888)
- Volume Servers: all 6 servers running (ports 8080-8085)
- Admin/Workers: will start when dependencies are ready

* generate write load

* tasks are assigned

* admin starts with grpc port. worker has its own working directory

* Update .gitignore

* working worker and admin. Task detection is not working yet.

* compiles, detection uses volumeSizeLimitMB from master

* compiles

* worker retries connecting to admin

* build and restart

* rendering pending tasks

* skip task ID column

* sticky worker id

* test canScheduleTaskNow

* worker reconnect to admin

* clean up logs

* worker register itself first

* worker can run ec work and report status

but:
1. one volume should not be repeatedly worked on.
2. ec shards need to be distributed and the source data should be deleted.
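
The first point can be addressed by tracking in-flight work per volume. A minimal Go sketch of the idea, assuming a hypothetical `taskTracker` (this is illustrative, not the actual SeaweedFS admin-server implementation):

```go
package main

import (
	"fmt"
	"sync"
)

// taskTracker is a hypothetical sketch of how an admin server could avoid
// assigning the same volume to EC work twice: keep the set of volume ids
// with in-flight tasks and reject duplicate claims.
type taskTracker struct {
	mu       sync.Mutex
	inFlight map[uint32]bool
}

func newTaskTracker() *taskTracker {
	return &taskTracker{inFlight: make(map[uint32]bool)}
}

// tryClaim returns true only for the first claim on a volume id;
// later claims are rejected until release is called.
func (t *taskTracker) tryClaim(volumeId uint32) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.inFlight[volumeId] {
		return false
	}
	t.inFlight[volumeId] = true
	return true
}

func (t *taskTracker) release(volumeId uint32) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.inFlight, volumeId)
}

func main() {
	tracker := newTaskTracker()
	fmt.Println(tracker.tryClaim(7)) // first claim succeeds: true
	fmt.Println(tracker.tryClaim(7)) // duplicate is rejected: false
	tracker.release(7)
	fmt.Println(tracker.tryClaim(7)) // claimable again after release: true
}
```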

* move ec task logic

* listing ec shards

* local copy, ec. Need to distribute.

* ec is mostly working now

* distribution of ec shards needs improvement
* need configuration to enable ec

* show ec volumes

* interval field UI component

* rename

* integration test with vacuuming

* garbage percentage threshold

* fix warning

* display ec shard sizes

* fix ec volumes list

* Update ui.go

* show default values

* ensure correct default value

* MaintenanceConfig use ConfigField

* use schema defined defaults

* config

* reduce duplication

* refactor to use BaseUIProvider

* each task register its schema

* checkECEncodingCandidate use ecDetector

* use vacuumDetector

* use volumeSizeLimitMB

* remove

* remove unused

* refactor

* use new framework

* remove v2 reference

* refactor

* left menu can scroll now

* The maintenance manager was not being initialized when no data directory was configured for persistent storage.

* saving config

* Update task_config_schema_templ.go

* enable/disable tasks

* protobuf encoded task configurations

* fix system settings

* use ui component

* remove logs

* interface{} Reduction

* reduce interface{}

* reduce interface{}

* avoid from/to map

* reduce interface{}

* refactor

* keep it DRY

* added logging

* debug messages

* debug level

* debug

* show the log caller line

* use configured task policy

* log level

* handle admin heartbeat response

* Update worker.go

* fix EC rack and dc count

* Report task status to admin server

* fix task logging, simplify interface checking, use erasure_coding constants

* factor in empty volume server during task planning

* volume.list adds disk id

* track disk id also

* fix locking scheduled and manual scanning

* add active topology

* simplify task detector

* ec task completed, but shards are not showing up

* implement ec in ec_typed.go

* adjust log level

* dedup

* implementing ec copying shards and only ecx files

* use disk id when distributing ec shards

🎯 Planning: ActiveTopology creates DestinationPlan with specific TargetDisk
📦 Task Creation: maintenance_integration.go creates ECDestination with DiskId
🚀 Task Execution: EC task passes DiskId in VolumeEcShardsCopyRequest
💾 Volume Server: Receives disk_id and stores shards on specific disk (vs.store.Locations[req.DiskId])
📂 File System: EC shards and metadata land in the exact disk directory planned
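
The disk-targeted flow above can be sketched in Go. All types and field names below are simplified stand-ins for illustration, not the actual SeaweedFS API; the real change adds a disk_id field to the VolumeEcShardsCopyRequest protobuf so the planned target disk survives from planning through execution:

```go
package main

import "fmt"

// DestinationPlan is an illustrative stand-in for the plan ActiveTopology
// produces: it pins the EC shards to a specific disk on a target server.
type DestinationPlan struct {
	TargetServer string
	TargetDisk   uint32
}

// VolumeEcShardsCopyRequest loosely mirrors the request the EC task sends;
// carrying DiskId lets the volume server store shards under
// store.Locations[req.DiskId] instead of picking a disk itself.
type VolumeEcShardsCopyRequest struct {
	VolumeId uint32
	ShardIds []uint32
	DiskId   uint32
}

// buildCopyRequest propagates the planned disk id into the copy request,
// so the file system placement matches what planning decided.
func buildCopyRequest(volumeId uint32, shardIds []uint32, plan DestinationPlan) VolumeEcShardsCopyRequest {
	return VolumeEcShardsCopyRequest{
		VolumeId: volumeId,
		ShardIds: shardIds,
		DiskId:   plan.TargetDisk,
	}
}

func main() {
	plan := DestinationPlan{TargetServer: "volume3:8080", TargetDisk: 1}
	req := buildCopyRequest(42, []uint32{0, 1, 2}, plan)
	fmt.Printf("copy volume %d shards %v to disk %d\n", req.VolumeId, req.ShardIds, req.DiskId)
}
```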

* Delete original volume from all locations

* clean up existing shard locations

* local encoding and distributing

* Update docker/admin_integration/EC-TESTING-README.md

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* check volume id range

* simplify

* fix tests

* fix types

* clean up logs and tests

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-07-30 12:38:03 -07:00

name: admin_integration

networks:
  seaweed_net:
    driver: bridge

services:
  master:
    image: chrislusf/seaweedfs:local
    ports:
      - "9333:9333"
      - "19333:19333"
    command: "master -ip=master -mdir=/data -volumeSizeLimitMB=50"
    environment:
      - WEED_MASTER_VOLUME_GROWTH_COPY_1=1
      - WEED_MASTER_VOLUME_GROWTH_COPY_2=2
      - WEED_MASTER_VOLUME_GROWTH_COPY_OTHER=1
    volumes:
      - ./data/master:/data
    networks:
      - seaweed_net

  volume1:
    image: chrislusf/seaweedfs:local
    ports:
      - "8080:8080"
      - "18080:18080"
    command: "volume -mserver=master:9333 -ip=volume1 -dir=/data -max=10"
    depends_on:
      - master
    volumes:
      - ./data/volume1:/data
    networks:
      - seaweed_net

  volume2:
    image: chrislusf/seaweedfs:local
    ports:
      - "8081:8080"
      - "18081:18080"
    command: "volume -mserver=master:9333 -ip=volume2 -dir=/data -max=10"
    depends_on:
      - master
    volumes:
      - ./data/volume2:/data
    networks:
      - seaweed_net

  volume3:
    image: chrislusf/seaweedfs:local
    ports:
      - "8082:8080"
      - "18082:18080"
    command: "volume -mserver=master:9333 -ip=volume3 -dir=/data -max=10"
    depends_on:
      - master
    volumes:
      - ./data/volume3:/data
    networks:
      - seaweed_net

  volume4:
    image: chrislusf/seaweedfs:local
    ports:
      - "8083:8080"
      - "18083:18080"
    command: "volume -mserver=master:9333 -ip=volume4 -dir=/data -max=10"
    depends_on:
      - master
    volumes:
      - ./data/volume4:/data
    networks:
      - seaweed_net

  volume5:
    image: chrislusf/seaweedfs:local
    ports:
      - "8084:8080"
      - "18084:18080"
    command: "volume -mserver=master:9333 -ip=volume5 -dir=/data -max=10"
    depends_on:
      - master
    volumes:
      - ./data/volume5:/data
    networks:
      - seaweed_net

  volume6:
    image: chrislusf/seaweedfs:local
    ports:
      - "8085:8080"
      - "18085:18080"
    command: "volume -mserver=master:9333 -ip=volume6 -dir=/data -max=10"
    depends_on:
      - master
    volumes:
      - ./data/volume6:/data
    networks:
      - seaweed_net

  filer:
    image: chrislusf/seaweedfs:local
    ports:
      - "8888:8888"
      - "18888:18888"
    command: "filer -master=master:9333 -ip=filer"
    depends_on:
      - master
    volumes:
      - ./data/filer:/data
    networks:
      - seaweed_net

  admin:
    image: chrislusf/seaweedfs:local
    ports:
      - "23646:23646" # HTTP admin interface (default port)
      - "33646:33646" # gRPC worker communication (23646 + 10000)
    command: "admin -port=23646 -masters=master:9333 -dataDir=/data"
    depends_on:
      - master
      - filer
    volumes:
      - ./data/admin:/data
    networks:
      - seaweed_net

  worker1:
    image: chrislusf/seaweedfs:local
    command: "-v=2 worker -admin=admin:23646 -capabilities=erasure_coding,vacuum -maxConcurrent=2"
    depends_on:
      - admin
    volumes:
      - ./data/worker1:/data
    networks:
      - seaweed_net
    environment:
      - WORKER_ID=worker-1

  worker2:
    image: chrislusf/seaweedfs:local
    command: "-v=2 worker -admin=admin:23646 -capabilities=erasure_coding,vacuum -maxConcurrent=2"
    depends_on:
      - admin
    volumes:
      - ./data/worker2:/data
    networks:
      - seaweed_net
    environment:
      - WORKER_ID=worker-2

  worker3:
    image: chrislusf/seaweedfs:local
    command: "-v=2 worker -admin=admin:23646 -capabilities=erasure_coding,vacuum -maxConcurrent=2"
    depends_on:
      - admin
    volumes:
      - ./data/worker3:/data
    networks:
      - seaweed_net
    environment:
      - WORKER_ID=worker-3

  load_generator:
    image: chrislusf/seaweedfs:local
    entrypoint: ["/bin/sh"]
    command: >
      -c "
      echo 'Starting load generator...';
      sleep 30;
      echo 'Generating continuous load with 50MB volume limit...';
      while true; do
      echo 'Writing test files...';
      echo 'Test file content at $(date)' | /usr/bin/weed upload -server=master:9333;
      sleep 5;
      echo 'Deleting some files...';
      echo 'fs.rm /test_file_*' | /usr/bin/weed shell -master=master:9333 || true;
      sleep 10;
      done
      "
    depends_on:
      - master
      - filer
      - admin
    networks:
      - seaweed_net

  monitor:
    image: alpine:latest
    entrypoint: ["/bin/sh"]
    command: >
      -c "
      apk add --no-cache curl jq;
      echo 'Starting cluster monitor...';
      sleep 30;
      while true; do
      echo '=== Cluster Status $(date) ===';
      echo 'Master status:';
      curl -s http://master:9333/cluster/status | jq '.IsLeader, .Peers' || echo 'Master not ready';
      echo;
      echo 'Admin status:';
      curl -s http://admin:23646/ | grep -o 'Admin.*Interface' || echo 'Admin not ready';
      echo;
      echo 'Volume count by server:';
      curl -s http://master:9333/vol/status | jq '.Volumes | length' || echo 'Volumes not ready';
      echo;
      sleep 60;
      done
      "
    depends_on:
      - master
      - admin
      - filer
    networks:
      - seaweed_net

  vacuum-tester:
    image: chrislusf/seaweedfs:local
    entrypoint: ["/bin/sh"]
    command: >
      -c "
      echo 'Installing dependencies for vacuum testing...';
      apk add --no-cache jq curl go bash;
      echo 'Vacuum tester ready...';
      echo 'Use: docker-compose exec vacuum-tester sh';
      echo 'Available commands: go, weed, curl, jq, bash, sh';
      sleep infinity
      "
    depends_on:
      - master
      - admin
      - filer
    volumes:
      - .:/testing
    working_dir: /testing
    networks:
      - seaweed_net
    environment:
      - MASTER_HOST=master:9333
      - ADMIN_HOST=admin:23646