* fix: EC rebalance fails with replica placement 000

This PR fixes several issues with EC shard distribution:

1. Pre-flight check before EC encoding
   - Verify the target disk type has capacity before encoding starts
   - Prevents encoding shards only to fail during rebalance
   - Shows a helpful error when the wrong diskType is specified (e.g., ssd when volumes are on hdd)

2. Fix EC rebalance with replica placement 000
   - When DiffRackCount=0, shards should be distributed freely across racks
   - The '000' placement means 'no volume replication needed', because EC provides its own redundancy
   - Previously, all racks were skipped with the error 'shards X > replica placement limit (0)'

3. Add unit tests for EC rebalance slot calculation
   - TestECRebalanceWithLimitedSlots: documents the limited-slots scenario
   - TestECRebalanceZeroFreeSlots: reproduces the 0-free-slots error

4. Add Makefile for manual EC testing
   - make setup: start the cluster and populate data
   - make shell: open a weed shell for EC commands
   - make clean: stop the cluster and clean up

* fix: default -rebalance to true for ec.encode

The -rebalance flag defaulted to false, which meant ec.encode would only print shard moves without actually executing them. This is a poor default, since the whole point of EC encoding is to distribute shards across servers for fault tolerance.

Now -rebalance defaults to true, so shards are actually distributed after encoding. Users can pass -rebalance=false to see what would happen without making changes.

* test/erasure_coding: improve Makefile safety and docs

- Narrow the pkill pattern for volume servers to match TEST_DIR instead of a port pattern, avoiding accidental kills of unrelated SeaweedFS processes
- Document external dependencies (curl, jq) in the header comments

* shell: refactor buildRackWithEcShards to reuse buildEcShards

Extract the common shard-bit construction logic to avoid duplication between the buildEcShards and buildRackWithEcShards helpers.
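The '000' fix above hinges on how the XYZ replication string translates into a per-rack shard limit. A minimal Go sketch of that idea, with hypothetical type and helper names (the actual SeaweedFS code is structured differently):

```go
package main

import "fmt"

// replicaPlacement mirrors the XYZ replication string:
// X = copies on other data centers, Y = copies on other racks,
// Z = copies on other servers in the same rack.
type replicaPlacement struct {
	DiffDataCenterCount int
	DiffRackCount       int
	SameRackCount       int
}

// parsePlacement reads a three-digit string such as "000" or "010".
func parsePlacement(s string) replicaPlacement {
	return replicaPlacement{
		DiffDataCenterCount: int(s[0] - '0'),
		DiffRackCount:       int(s[1] - '0'),
		SameRackCount:       int(s[2] - '0'),
	}
}

// rackShardLimit returns how many EC shards a single rack may hold.
// The fix described above: when DiffRackCount == 0 ("000"), EC shards
// are unconstrained, because erasure coding already provides redundancy.
// Before the fix, the limit evaluated to 0 and every rack was rejected
// with 'shards X > replica placement limit (0)'.
func rackShardLimit(p replicaPlacement, totalShards int) int {
	if p.DiffRackCount == 0 {
		return totalShards // no rack constraint for EC
	}
	// Illustrative: one local copy plus DiffRackCount rack copies.
	return p.DiffRackCount + 1
}

func main() {
	p := parsePlacement("000")
	fmt.Println(rackShardLimit(p, 14)) // all 14 shards may land anywhere
}
```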
* shell: update test for EC replication 000 behavior

When DiffRackCount=0 (replication "000"), EC shards should be distributed freely across racks, since erasure coding provides its own redundancy. Update the test expectation to reflect this behavior.

* erasure_coding: add distribution package for proportional EC shard placement

Add a new reusable package for EC shard distribution that:
- Supports configurable EC ratios (not hard-coded 10+4)
- Distributes shards proportionally based on the replication policy
- Provides fault tolerance analysis
- Prefers moving parity shards to keep data shards spread out

Key components:
- ECConfig: configurable data/parity shard counts
- ReplicationConfig: parsed XYZ replication policy
- ECDistribution: target shard counts per DC/rack/node
- Rebalancer: plans shard moves with a parity-first strategy

This enables seaweed-enterprise custom EC ratios and weed worker integration while maintaining a clean, testable architecture.

* shell: integrate distribution package for EC rebalancing

Add shell wrappers around the distribution package:
- ProportionalECRebalancer: plans moves using distribution.Rebalancer
- NewProportionalECRebalancerWithConfig: supports custom EC configs
- GetDistributionSummary/GetFaultToleranceAnalysis: helper functions

The shell layer converts between EcNode types and the generic TopologyNode types used by the distribution package.

* test setup

* ec: improve data and parity shard distribution across racks

- Add a shardsByTypePerRack helper to track data vs. parity shards
- Rewrite doBalanceEcShardsAcrossRacks for two-pass balancing:
  1. Balance data shards (0-9) evenly, max ceil(10/6)=2 per rack
  2. Balance parity shards (10-13) evenly, max ceil(4/6)=1 per rack
- Add balanceShardTypeAcrossRacks for generic shard-type balancing
- Add pickRackForShardType to select a destination with room for that type
- Add unit tests verifying even data/parity distribution

This ensures even read load during normal operation by spreading both data and parity shards across all available racks.

* ec: make data/parity shard counts configurable in ecBalancer

- Add dataShardCount and parityShardCount fields to the ecBalancer struct
- Add getDataShardCount() and getParityShardCount() methods with defaults
- Replace direct constant usage with the configurable methods
- Fix an unused-variable warning for parityPerRack

This allows seaweed-enterprise to use custom EC ratios while defaulting to the standard 10+4 scheme.

* Address PR 7812 review comments

Makefile improvements:
- Save the PID of each volume server for precise termination
- Use PID-based killing in the stop target, with a pkill fallback
- Use more specific pkill patterns with TEST_DIR paths

Documentation:
- Document the jq dependency in README.md

Rebalancer fix:
- Fix duplicate shard count updates in applyMovesToAnalysis
- All planners (DC/rack/node) update counts inline during planning
- Remove the duplicate updates from applyMovesToAnalysis to avoid double-counting

* test/erasure_coding: use mktemp for test file template

Use mktemp instead of the hardcoded /tmp/testfile_template.bin path to provide better isolation for concurrent test runs.
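The per-rack caps in the two-pass balancer come from ceiling division over the rack count. A small Go sketch of that arithmetic (helper names are illustrative, not the actual SeaweedFS functions):

```go
package main

import "fmt"

// ceilDiv computes ceil(a/b) for positive integers without floating point.
func ceilDiv(a, b int) int { return (a + b - 1) / b }

// maxPerRack is the cap the two-pass balancer applies to one shard type:
// data and parity shards are capped independently, so each type spreads
// as evenly as possible across the available racks.
func maxPerRack(shardCount, rackCount int) int {
	return ceilDiv(shardCount, rackCount)
}

func main() {
	// With the default 10+4 EC scheme and 6 racks:
	fmt.Println(maxPerRack(10, 6)) // data shards:   ceil(10/6) = 2 per rack
	fmt.Println(maxPerRack(4, 6))  // parity shards: ceil(4/6)  = 1 per rack
}
```

Capping each type separately is what prevents the degenerate layout where one rack holds several parity shards while another holds none, which would skew read load after a failure.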
134 lines
2.5 KiB
Plaintext
.goxc*
vendor
tags
*.swp
### OSX template
.DS_Store
.AppleDouble
.LSOverride

# Icon must end with two \r
Icon

# Thumbnails
._*

# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns

# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk
### JetBrains template
# Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio

*.iml

## Directory-based project format:
.idea/
# if you remove the above rule, at least ignore the following:

# User-specific stuff:
# .idea/workspace.xml
# .idea/tasks.xml
# .idea/dictionaries

# Sensitive or high-churn files:
# .idea/dataSources.ids
# .idea/dataSources.xml
# .idea/sqlDataSources.xml
# .idea/dynamic.xml
# .idea/uiDesigner.xml

# Gradle:
# .idea/gradle.xml
# .idea/libraries

# Mongo Explorer plugin:
# .idea/mongoSettings.xml

## vscode
.vscode
## File-based project format:
*.ipr
*.iws

## Plugin-specific files:

# IntelliJ
/out/

# mpeltonen/sbt-idea plugin
.idea_modules/

# JIRA plugin
atlassian-ide-plugin.xml

# Crashlytics plugin (for Android Studio and IntelliJ)
com_crashlytics_export_strings.xml
crashlytics.properties
crashlytics-build.properties

workspace/

test_data
build
target
*.class
other/java/hdfs/dependency-reduced-pom.xml

# binary file
weed/weed
docker/weed

# test generated files
weed/*/*.jpg
docker/weed_sub
docker/weed_pub
weed/mq/schema/example.parquet
docker/agent_sub_record
test/mq/bin/consumer
test/mq/bin/producer
test/producer
bin/weed
weed_binary
/test/s3/copying/filerldb2
/filerldb2
/test/s3/retention/test-volume-data
test/s3/cors/weed-test.log
test/s3/cors/weed-server.pid
/test/s3/cors/test-volume-data
test/s3/cors/cors.test
/test/s3/retention/filerldb2
test/s3/retention/weed-server.pid
test/s3/retention/weed-test.log
/test/s3/versioning/test-volume-data
test/s3/versioning/weed-test.log
/docker/admin_integration/data
docker/agent_pub_record
docker/admin_integration/weed-local
/seaweedfs-rdma-sidecar/bin
/test/s3/encryption/filerldb2
/test/s3/sse/filerldb2
test/s3/sse/weed-test.log
ADVANCED_IAM_DEVELOPMENT_PLAN.md
/test/s3/iam/test-volume-data
*.log
weed-iam
test/kafka/kafka-client-loadtest/weed-linux-arm64
/test/tus/filerldb2
coverage.out
/test/s3/remote_cache/test-primary-data
/test/s3/remote_cache/test-remote-data
test/s3/remote_cache/remote-server.pid
test/s3/remote_cache/primary-server.pid
/test/erasure_coding/filerldb2