Files
seaweedFS/weed/server/volume_grpc_erasure_coding.go
Chris Lu 208d7f24f4 Erasure Coding: Ec refactoring (#7396)
* refactor: add ECContext structure to encapsulate EC parameters

- Create ec_context.go with ECContext struct
- NewDefaultECContext() creates context with default 10+4 configuration
- Helper methods: CreateEncoder(), ToExt(), String()
- Foundation for cleaner function signatures
- No behavior change, still uses hardcoded 10+4

* refactor: update ec_encoder.go to use ECContext

- Add WriteEcFilesWithContext() and RebuildEcFilesWithContext() functions
- Keep old functions for backward compatibility (call new versions)
- Update all internal functions to accept ECContext parameter
- Use ctx.DataShards, ctx.ParityShards, ctx.TotalShards consistently
- Use ctx.CreateEncoder() instead of hardcoded reedsolomon.New()
- Use ctx.ToExt() for shard file extensions
- No behavior change, still uses default 10+4 configuration

* refactor: update ec_volume.go to use ECContext

- Add ECContext field to EcVolume struct
- Initialize ECContext with default configuration in NewEcVolume()
- Update LocateEcShardNeedleInterval() to use ECContext.DataShards
- Phase 1: Always uses default 10+4 configuration
- No behavior change

* refactor: add EC shard count fields to VolumeInfo protobuf

- Add data_shards_count field (field 8) to VolumeInfo message
- Add parity_shards_count field (field 9) to VolumeInfo message
- Fields are optional, 0 means use default (10+4)
- Backward compatible: fields added at end
- Phase 1: Foundation for future customization

* refactor: regenerate protobuf Go files with EC shard count fields

- Regenerated volume_server_pb/*.go with new EC fields
- DataShardsCount and ParityShardsCount accessors added to VolumeInfo
- No behavior change, fields not yet used

* refactor: update VolumeEcShardsGenerate to use ECContext

- Create ECContext with default configuration in VolumeEcShardsGenerate
- Use ecCtx.TotalShards and ecCtx.ToExt() in cleanup
- Call WriteEcFilesWithContext() instead of WriteEcFiles()
- Save EC configuration (DataShardsCount, ParityShardsCount) to VolumeInfo
- Log EC context being used
- Phase 1: Always uses default 10+4 configuration
- No behavior change

* fmt

* refactor: update ec_test.go to use ECContext

- Update TestEncodingDecoding to create and use ECContext
- Update validateFiles() to accept ECContext parameter
- Update removeGeneratedFiles() to use ctx.TotalShards and ctx.ToExt()
- Test passes with default 10+4 configuration

* refactor: use EcShardConfig message instead of separate fields

* optimize: pre-calculate row sizes in EC encoding loop

* refactor: replace TotalShards field with Total() method

- Remove TotalShards field from ECContext to avoid field drift
- Add Total() method that computes DataShards + ParityShards
- Update all references to use ctx.Total() instead of ctx.TotalShards
- Read EC config from VolumeInfo when loading EC volumes
- Read data shard count from .vif in VolumeEcShardsToVolume
- Use >= instead of > for exact boundary handling in encoding loops

* optimize: simplify VolumeEcShardsToVolume to use existing EC context

- Remove redundant CollectEcShards call
- Remove redundant .vif file loading
- Use v.ECContext.DataShards directly (already loaded by NewEcVolume)
- Slice tempShards instead of collecting again

* refactor: rename MaxShardId to MaxShardCount for clarity

- Change from MaxShardId=31 to MaxShardCount=32
- Eliminates confusing +1 arithmetic (MaxShardId+1)
- More intuitive: MaxShardCount directly represents the limit

fix: support custom EC ratios beyond 14 shards in VolumeEcShardsToVolume

- Add MaxShardId constant (31, since ShardBits is uint32)
- Use MaxShardId+1 (32) instead of TotalShardsCount (14) for tempShards buffer
- Prevents panic when slicing for volumes with >14 total shards
- Critical fix for custom EC configurations like 20+10

* fix: add validation for EC shard counts from VolumeInfo

- Validate DataShards/ParityShards are positive and within MaxShardCount
- Prevent zero or invalid values that could cause divide-by-zero
- Fallback to defaults if validation fails, with warning log
- VolumeEcShardsGenerate now preserves existing EC config when regenerating
- Critical safety fix for corrupted or legacy .vif files

* fix: RebuildEcFiles now loads EC config from .vif file

- Critical: RebuildEcFiles was always using default 10+4 config
- Now loads actual EC config from .vif file when rebuilding shards
- Validates config before use (positive shards, within MaxShardCount)
- Falls back to default if .vif missing or invalid
- Prevents data corruption when rebuilding custom EC volumes

* add: defensive validation for dataShards in VolumeEcShardsToVolume

- Validate dataShards > 0 and <= MaxShardCount before use
- Prevents panic from corrupted or uninitialized ECContext
- Returns clear error message instead of panic
- Defense-in-depth: validates even though upstream should catch issues

* fix: replace TotalShardsCount with MaxShardCount for custom EC ratio support

Critical fixes to support custom EC ratios > 14 shards:

disk_location_ec.go:
- validateEcVolume: Check shards 0-31 instead of 0-13 during validation
- removeEcVolumeFiles: Remove shards 0-31 instead of 0-13 during cleanup

ec_volume_info.go ShardBits methods:
- ShardIds(): Iterate up to MaxShardCount (32) instead of TotalShardsCount (14)
- ToUint32Slice(): Iterate up to MaxShardCount (32)
- IndexToShardId(): Iterate up to MaxShardCount (32)
- MinusParityShards(): Remove shards 10-31 instead of 10-13 (added note about Phase 2)
- Minus() shard size copy: Iterate up to MaxShardCount (32)
- resizeShardSizes(): Iterate up to MaxShardCount (32)

Without these changes:
- Custom EC ratios > 14 total shards would fail validation on startup
- Shards 14-31 would never be discovered or cleaned up
- ShardBits operations would miss shards >= 14

These changes are backward compatible - MaxShardCount (32) includes
the default TotalShardsCount (14), so existing 10+4 volumes work as before.

* fix: replace TotalShardsCount with MaxShardCount in critical data structures

Critical fixes for buffer allocations and loops that must support
custom EC ratios up to 32 shards:

Data Structures:
- store_ec.go:354: Buffer allocation for shard recovery (bufs array)
- topology_ec.go:14: EcShardLocations.Locations fixed array size
- command_ec_rebuild.go:268: EC shard map allocation
- command_ec_common.go:626: Shard-to-locations map allocation

Shard Discovery Loops:
- ec_task.go:378: Loop to find generated shard files
- ec_shard_management.go: All 8 loops that check/count EC shards

These changes are critical because:
1. Buffer allocations sized to 14 would cause index-out-of-bounds panics
   when accessing shards 14-31
2. Fixed arrays sized to 14 would truncate shard location data
3. Loops limited to 0-13 would never discover/manage shards 14-31

Note: command_ec_encode.go:208 intentionally NOT changed - it creates
shard IDs to mount after encoding. In Phase 1 we always generate 14
shards, so this remains TotalShardsCount and will be made dynamic in
Phase 2 based on actual EC context.

Without these fixes, custom EC ratios > 14 total shards would cause:
- Runtime panics (array index out of bounds)
- Data loss (shards 14-31 never discovered/tracked)
- Incomplete shard management (missing shards not detected)

* refactor: move MaxShardCount constant to ec_encoder.go

Moved MaxShardCount from ec_volume_info.go to ec_encoder.go to group it
with other shard count constants (DataShardsCount, ParityShardsCount,
TotalShardsCount). This improves code organization and makes it easier
to understand the relationship between these constants.

Location: ec_encoder.go line 22, between TotalShardsCount and MinTotalDisks

* improve: add defensive programming and better error messages for EC

Code review improvements from CodeRabbit:

1. ShardBits Guardrails (ec_volume_info.go):
   - AddShardId, RemoveShardId: Reject shard IDs >= MaxShardCount
   - HasShardId: Return false for out-of-range shard IDs
   - Prevents silent no-ops from bit shifts with invalid IDs

2. Future-Proof Regex (disk_location_ec.go):
   - Updated regex from \.ec[0-9][0-9] to \.ec\d{2,3}
   - Now matches .ec00 through .ec999 (currently .ec00-.ec31 used)
   - Supports future increases to MaxShardCount beyond 99

3. Better Error Messages (volume_grpc_erasure_coding.go):
   - Include valid range (1..32) in dataShards validation error
   - Helps operators quickly identify the problem

4. Validation Before Save (volume_grpc_erasure_coding.go):
   - Validate ECContext (DataShards > 0, ParityShards > 0, Total <= MaxShardCount)
   - Log EC config being saved to .vif for debugging
   - Prevents writing invalid configs to disk

These changes improve robustness and debuggability without changing
core functionality.

* fmt

* fix: critical bugs from code review + clean up comments

Critical bug fixes:
1. command_ec_rebuild.go: Fixed indentation causing compilation error
   - Properly nested if/for blocks in registerEcNode

2. ec_shard_management.go: Fixed isComplete logic incorrectly using MaxShardCount
   - Changed from MaxShardCount (32) back to TotalShardsCount (14)
   - Default 10+4 volumes were being incorrectly reported as incomplete
   - Missing shards 14-31 were being incorrectly reported as missing
   - Fixed in 4 locations: volume completeness checks and getMissingShards

3. ec_volume_info.go: Fixed MinusParityShards removing too many shards
   - Changed from MaxShardCount (32) back to TotalShardsCount (14)
   - Was incorrectly removing shard IDs 10-31 instead of just 10-13

Comment cleanup:
- Removed Phase 1/Phase 2 references (development plan context)
- Replaced with clear statements about default 10+4 configuration
- SeaweedFS repo uses fixed 10+4 EC ratio, no phases needed

Root cause: Over-aggressive replacement of TotalShardsCount with MaxShardCount.
MaxShardCount (32) is the limit for buffer allocations and shard ID loops,
but TotalShardsCount (14) must be used for default EC configuration logic.

* fix: add defensive bounds checks and compute actual shard counts

Critical fixes from code review:

1. topology_ec.go: Add defensive bounds checks to AddShard/DeleteShard
   - Prevent panic when shardId >= MaxShardCount (32)
   - Return false instead of crashing on out-of-range shard IDs

2. command_ec_common.go: Fix doBalanceEcShardsAcrossRacks
   - Was using hardcoded TotalShardsCount (14) for all volumes
   - Now computes actual totalShardsForVolume from rackToShardCount
   - Fixes incorrect rebalancing for volumes with custom EC ratios
   - Example: 5+2=7 shards would incorrectly use 14 as average

These fixes improve robustness and prepare for future custom EC ratios
without changing current behavior for default 10+4 volumes.

Note: MinusParityShards and ec_task.go intentionally NOT changed for
seaweedfs repo - these will be enhanced in seaweed-enterprise repo
where custom EC ratio configuration is added.

* fmt

* style: make MaxShardCount type casting explicit in loops

Improved code clarity by explicitly casting MaxShardCount to the
appropriate type when used in loop comparisons:

- ShardId comparisons: Cast to ShardId(MaxShardCount)
- uint32 comparisons: Cast to uint32(MaxShardCount)

Changed in 5 locations:
- Minus() loop (line 90)
- ShardIds() loop (line 143)
- ToUint32Slice() loop (line 152)
- IndexToShardId() loop (line 219)
- resizeShardSizes() loop (line 248)

This makes the intent explicit and improves type safety readability.
No functional changes - purely a style improvement.
2025-10-27 22:13:31 -07:00

554 lines
18 KiB
Go

package weed_server
import (
"context"
"fmt"
"io"
"math"
"os"
"path"
"strings"
"time"
"github.com/seaweedfs/seaweedfs/weed/glog"
"github.com/seaweedfs/seaweedfs/weed/operation"
"github.com/seaweedfs/seaweedfs/weed/pb"
"github.com/seaweedfs/seaweedfs/weed/pb/volume_server_pb"
"github.com/seaweedfs/seaweedfs/weed/storage"
"github.com/seaweedfs/seaweedfs/weed/storage/erasure_coding"
"github.com/seaweedfs/seaweedfs/weed/storage/needle"
"github.com/seaweedfs/seaweedfs/weed/storage/types"
"github.com/seaweedfs/seaweedfs/weed/storage/volume_info"
"github.com/seaweedfs/seaweedfs/weed/util"
)
/*
Steps to apply erasure coding to .dat .idx files
0. ensure the volume is readonly
1. client call VolumeEcShardsGenerate to generate the .ecx and .ec00 ~ .ec13 files
2. client ask master for possible servers to hold the ec files
3. client call VolumeEcShardsCopy on above target servers to copy ec files from the source server
4. target servers report the new ec files to the master
5. master stores vid -> [14]*DataNode
6. client checks master. If all 14 slices are ready, delete the original .idx, .idx files
*/
// VolumeEcShardsGenerate generates the .ecx and .ec00 ~ .ec13 files
func (vs *VolumeServer) VolumeEcShardsGenerate(ctx context.Context, req *volume_server_pb.VolumeEcShardsGenerateRequest) (*volume_server_pb.VolumeEcShardsGenerateResponse, error) {
glog.V(0).Infof("VolumeEcShardsGenerate: %v", req)
v := vs.store.GetVolume(needle.VolumeId(req.VolumeId))
if v == nil {
return nil, fmt.Errorf("volume %d not found", req.VolumeId)
}
baseFileName := v.DataFileName()
if v.Collection != req.Collection {
return nil, fmt.Errorf("existing collection:%v unexpected input: %v", v.Collection, req.Collection)
}
// Create EC context - prefer existing .vif config if present (for regeneration scenarios)
ecCtx := erasure_coding.NewDefaultECContext(req.Collection, needle.VolumeId(req.VolumeId))
if volumeInfo, _, found, _ := volume_info.MaybeLoadVolumeInfo(baseFileName + ".vif"); found && volumeInfo.EcShardConfig != nil {
ds := int(volumeInfo.EcShardConfig.DataShards)
ps := int(volumeInfo.EcShardConfig.ParityShards)
// Validate and use existing EC config
if ds > 0 && ps > 0 && ds+ps <= erasure_coding.MaxShardCount {
ecCtx.DataShards = ds
ecCtx.ParityShards = ps
glog.V(0).Infof("Using existing EC config for volume %d: %s", req.VolumeId, ecCtx.String())
} else {
glog.Warningf("Invalid EC config in .vif for volume %d (data=%d, parity=%d), using defaults", req.VolumeId, ds, ps)
}
} else {
glog.V(0).Infof("Using default EC config for volume %d: %s", req.VolumeId, ecCtx.String())
}
shouldCleanup := true
defer func() {
if !shouldCleanup {
return
}
for i := 0; i < ecCtx.Total(); i++ {
os.Remove(baseFileName + ecCtx.ToExt(i))
}
os.Remove(v.IndexFileName() + ".ecx")
}()
// write .ec00 ~ .ec[TotalShards-1] files using context
if err := erasure_coding.WriteEcFilesWithContext(baseFileName, ecCtx); err != nil {
return nil, fmt.Errorf("WriteEcFilesWithContext %s: %v", baseFileName, err)
}
// write .ecx file
if err := erasure_coding.WriteSortedFileFromIdx(v.IndexFileName(), ".ecx"); err != nil {
return nil, fmt.Errorf("WriteSortedFileFromIdx %s: %v", v.IndexFileName(), err)
}
// write .vif files
var expireAtSec uint64
if v.Ttl != nil {
ttlSecond := v.Ttl.ToSeconds()
if ttlSecond > 0 {
expireAtSec = uint64(time.Now().Unix()) + ttlSecond //calculated expiration time
}
}
volumeInfo := &volume_server_pb.VolumeInfo{Version: uint32(v.Version())}
volumeInfo.ExpireAtSec = expireAtSec
datSize, _, _ := v.FileStat()
volumeInfo.DatFileSize = int64(datSize)
// Validate EC configuration before saving to .vif
if ecCtx.DataShards <= 0 || ecCtx.ParityShards <= 0 || ecCtx.Total() > erasure_coding.MaxShardCount {
return nil, fmt.Errorf("invalid EC config before saving: data=%d, parity=%d, total=%d (max=%d)",
ecCtx.DataShards, ecCtx.ParityShards, ecCtx.Total(), erasure_coding.MaxShardCount)
}
// Save EC configuration to VolumeInfo
volumeInfo.EcShardConfig = &volume_server_pb.EcShardConfig{
DataShards: uint32(ecCtx.DataShards),
ParityShards: uint32(ecCtx.ParityShards),
}
glog.V(1).Infof("Saving EC config to .vif for volume %d: %d+%d (total: %d)",
req.VolumeId, ecCtx.DataShards, ecCtx.ParityShards, ecCtx.Total())
if err := volume_info.SaveVolumeInfo(baseFileName+".vif", volumeInfo); err != nil {
return nil, fmt.Errorf("SaveVolumeInfo %s: %v", baseFileName, err)
}
shouldCleanup = false
return &volume_server_pb.VolumeEcShardsGenerateResponse{}, nil
}
// VolumeEcShardsRebuild generates the any of the missing .ec00 ~ .ec13 files
func (vs *VolumeServer) VolumeEcShardsRebuild(ctx context.Context, req *volume_server_pb.VolumeEcShardsRebuildRequest) (*volume_server_pb.VolumeEcShardsRebuildResponse, error) {
glog.V(0).Infof("VolumeEcShardsRebuild: %v", req)
baseFileName := erasure_coding.EcShardBaseFileName(req.Collection, int(req.VolumeId))
var rebuiltShardIds []uint32
for _, location := range vs.store.Locations {
_, _, existingShardCount, err := checkEcVolumeStatus(baseFileName, location)
if err != nil {
return nil, err
}
if existingShardCount == 0 {
continue
}
if util.FileExists(path.Join(location.IdxDirectory, baseFileName+".ecx")) {
// write .ec00 ~ .ec13 files
dataBaseFileName := path.Join(location.Directory, baseFileName)
if generatedShardIds, err := erasure_coding.RebuildEcFiles(dataBaseFileName); err != nil {
return nil, fmt.Errorf("RebuildEcFiles %s: %v", dataBaseFileName, err)
} else {
rebuiltShardIds = generatedShardIds
}
indexBaseFileName := path.Join(location.IdxDirectory, baseFileName)
if err := erasure_coding.RebuildEcxFile(indexBaseFileName); err != nil {
return nil, fmt.Errorf("RebuildEcxFile %s: %v", dataBaseFileName, err)
}
break
}
}
return &volume_server_pb.VolumeEcShardsRebuildResponse{
RebuiltShardIds: rebuiltShardIds,
}, nil
}
// VolumeEcShardsCopy copy the .ecx and some ec data slices
func (vs *VolumeServer) VolumeEcShardsCopy(ctx context.Context, req *volume_server_pb.VolumeEcShardsCopyRequest) (*volume_server_pb.VolumeEcShardsCopyResponse, error) {
glog.V(0).Infof("VolumeEcShardsCopy: %v", req)
var location *storage.DiskLocation
// Use disk_id if provided (disk-aware storage)
if req.DiskId > 0 || (req.DiskId == 0 && len(vs.store.Locations) > 0) {
// Validate disk ID is within bounds
if int(req.DiskId) >= len(vs.store.Locations) {
return nil, fmt.Errorf("invalid disk_id %d: only have %d disks", req.DiskId, len(vs.store.Locations))
}
// Use the specific disk location
location = vs.store.Locations[req.DiskId]
glog.V(1).Infof("Using disk %d for EC shard copy: %s", req.DiskId, location.Directory)
} else {
// Fallback to old behavior for backward compatibility
if req.CopyEcxFile {
location = vs.store.FindFreeLocation(func(location *storage.DiskLocation) bool {
return location.DiskType == types.HardDriveType
})
} else {
location = vs.store.FindFreeLocation(func(location *storage.DiskLocation) bool {
return true
})
}
if location == nil {
return nil, fmt.Errorf("no space left")
}
}
dataBaseFileName := storage.VolumeFileName(location.Directory, req.Collection, int(req.VolumeId))
indexBaseFileName := storage.VolumeFileName(location.IdxDirectory, req.Collection, int(req.VolumeId))
err := operation.WithVolumeServerClient(true, pb.ServerAddress(req.SourceDataNode), vs.grpcDialOption, func(client volume_server_pb.VolumeServerClient) error {
// copy ec data slices
for _, shardId := range req.ShardIds {
if _, err := vs.doCopyFile(client, true, req.Collection, req.VolumeId, math.MaxUint32, math.MaxInt64, dataBaseFileName, erasure_coding.ToExt(int(shardId)), false, false, nil); err != nil {
return err
}
}
if req.CopyEcxFile {
// copy ecx file
if _, err := vs.doCopyFile(client, true, req.Collection, req.VolumeId, math.MaxUint32, math.MaxInt64, indexBaseFileName, ".ecx", false, false, nil); err != nil {
return err
}
}
if req.CopyEcjFile {
// copy ecj file
if _, err := vs.doCopyFile(client, true, req.Collection, req.VolumeId, math.MaxUint32, math.MaxInt64, indexBaseFileName, ".ecj", true, true, nil); err != nil {
return err
}
}
if req.CopyVifFile {
// copy vif file
if _, err := vs.doCopyFile(client, true, req.Collection, req.VolumeId, math.MaxUint32, math.MaxInt64, dataBaseFileName, ".vif", false, true, nil); err != nil {
return err
}
}
return nil
})
if err != nil {
return nil, fmt.Errorf("VolumeEcShardsCopy volume %d: %v", req.VolumeId, err)
}
return &volume_server_pb.VolumeEcShardsCopyResponse{}, nil
}
// VolumeEcShardsDelete local delete the .ecx and some ec data slices if not needed
// the shard should not be mounted before calling this.
func (vs *VolumeServer) VolumeEcShardsDelete(ctx context.Context, req *volume_server_pb.VolumeEcShardsDeleteRequest) (*volume_server_pb.VolumeEcShardsDeleteResponse, error) {
bName := erasure_coding.EcShardBaseFileName(req.Collection, int(req.VolumeId))
glog.V(0).Infof("ec volume %s shard delete %v", bName, req.ShardIds)
for _, location := range vs.store.Locations {
if err := deleteEcShardIdsForEachLocation(bName, location, req.ShardIds); err != nil {
glog.Errorf("deleteEcShards from %s %s.%v: %v", location.Directory, bName, req.ShardIds, err)
return nil, err
}
}
return &volume_server_pb.VolumeEcShardsDeleteResponse{}, nil
}
func deleteEcShardIdsForEachLocation(bName string, location *storage.DiskLocation, shardIds []uint32) error {
found := false
indexBaseFilename := path.Join(location.IdxDirectory, bName)
dataBaseFilename := path.Join(location.Directory, bName)
if util.FileExists(path.Join(location.IdxDirectory, bName+".ecx")) {
for _, shardId := range shardIds {
shardFileName := dataBaseFilename + erasure_coding.ToExt(int(shardId))
if util.FileExists(shardFileName) {
found = true
os.Remove(shardFileName)
}
}
}
if !found {
return nil
}
hasEcxFile, hasIdxFile, existingShardCount, err := checkEcVolumeStatus(bName, location)
if err != nil {
return err
}
if hasEcxFile && existingShardCount == 0 {
if err := os.Remove(indexBaseFilename + ".ecx"); err != nil {
return err
}
os.Remove(indexBaseFilename + ".ecj")
if !hasIdxFile {
// .vif is used for ec volumes and normal volumes
os.Remove(dataBaseFilename + ".vif")
}
}
return nil
}
func checkEcVolumeStatus(bName string, location *storage.DiskLocation) (hasEcxFile bool, hasIdxFile bool, existingShardCount int, err error) {
// check whether to delete the .ecx and .ecj file also
fileInfos, err := os.ReadDir(location.Directory)
if err != nil {
return false, false, 0, err
}
if location.IdxDirectory != location.Directory {
idxFileInfos, err := os.ReadDir(location.IdxDirectory)
if err != nil {
return false, false, 0, err
}
fileInfos = append(fileInfos, idxFileInfos...)
}
for _, fileInfo := range fileInfos {
if fileInfo.Name() == bName+".ecx" || fileInfo.Name() == bName+".ecj" {
hasEcxFile = true
continue
}
if fileInfo.Name() == bName+".idx" {
hasIdxFile = true
continue
}
if strings.HasPrefix(fileInfo.Name(), bName+".ec") {
existingShardCount++
}
}
return hasEcxFile, hasIdxFile, existingShardCount, nil
}
func (vs *VolumeServer) VolumeEcShardsMount(ctx context.Context, req *volume_server_pb.VolumeEcShardsMountRequest) (*volume_server_pb.VolumeEcShardsMountResponse, error) {
glog.V(0).Infof("VolumeEcShardsMount: %v", req)
for _, shardId := range req.ShardIds {
err := vs.store.MountEcShards(req.Collection, needle.VolumeId(req.VolumeId), erasure_coding.ShardId(shardId))
if err != nil {
glog.Errorf("ec shard mount %v: %v", req, err)
} else {
glog.V(2).Infof("ec shard mount %v", req)
}
if err != nil {
return nil, fmt.Errorf("mount %d.%d: %v", req.VolumeId, shardId, err)
}
}
return &volume_server_pb.VolumeEcShardsMountResponse{}, nil
}
func (vs *VolumeServer) VolumeEcShardsUnmount(ctx context.Context, req *volume_server_pb.VolumeEcShardsUnmountRequest) (*volume_server_pb.VolumeEcShardsUnmountResponse, error) {
glog.V(0).Infof("VolumeEcShardsUnmount: %v", req)
for _, shardId := range req.ShardIds {
err := vs.store.UnmountEcShards(needle.VolumeId(req.VolumeId), erasure_coding.ShardId(shardId))
if err != nil {
glog.Errorf("ec shard unmount %v: %v", req, err)
} else {
glog.V(2).Infof("ec shard unmount %v", req)
}
if err != nil {
return nil, fmt.Errorf("unmount %d.%d: %v", req.VolumeId, shardId, err)
}
}
return &volume_server_pb.VolumeEcShardsUnmountResponse{}, nil
}
func (vs *VolumeServer) VolumeEcShardRead(req *volume_server_pb.VolumeEcShardReadRequest, stream volume_server_pb.VolumeServer_VolumeEcShardReadServer) error {
ecVolume, found := vs.store.FindEcVolume(needle.VolumeId(req.VolumeId))
if !found {
return fmt.Errorf("VolumeEcShardRead not found ec volume id %d", req.VolumeId)
}
ecShard, found := ecVolume.FindEcVolumeShard(erasure_coding.ShardId(req.ShardId))
if !found {
return fmt.Errorf("not found ec shard %d.%d", req.VolumeId, req.ShardId)
}
if req.FileKey != 0 {
_, size, _ := ecVolume.FindNeedleFromEcx(types.Uint64ToNeedleId(req.FileKey))
if size.IsDeleted() {
return stream.Send(&volume_server_pb.VolumeEcShardReadResponse{
IsDeleted: true,
})
}
}
bufSize := req.Size
if bufSize > BufferSizeLimit {
bufSize = BufferSizeLimit
}
buffer := make([]byte, bufSize)
startOffset, bytesToRead := req.Offset, req.Size
for bytesToRead > 0 {
// min of bytesToRead and bufSize
bufferSize := bufSize
if bufferSize > bytesToRead {
bufferSize = bytesToRead
}
bytesread, err := ecShard.ReadAt(buffer[0:bufferSize], startOffset)
// println("read", ecShard.FileName(), "startOffset", startOffset, bytesread, "bytes, with target", bufferSize)
if bytesread > 0 {
if int64(bytesread) > bytesToRead {
bytesread = int(bytesToRead)
}
err = stream.Send(&volume_server_pb.VolumeEcShardReadResponse{
Data: buffer[:bytesread],
})
if err != nil {
// println("sending", bytesread, "bytes err", err.Error())
return err
}
startOffset += int64(bytesread)
bytesToRead -= int64(bytesread)
}
if err != nil {
if err != io.EOF {
return err
}
return nil
}
}
return nil
}
func (vs *VolumeServer) VolumeEcBlobDelete(ctx context.Context, req *volume_server_pb.VolumeEcBlobDeleteRequest) (*volume_server_pb.VolumeEcBlobDeleteResponse, error) {
glog.V(0).Infof("VolumeEcBlobDelete: %v", req)
resp := &volume_server_pb.VolumeEcBlobDeleteResponse{}
for _, location := range vs.store.Locations {
if localEcVolume, found := location.FindEcVolume(needle.VolumeId(req.VolumeId)); found {
_, size, _, err := localEcVolume.LocateEcShardNeedle(types.NeedleId(req.FileKey), needle.Version(req.Version))
if err != nil {
return nil, fmt.Errorf("locate in local ec volume: %w", err)
}
if size.IsDeleted() {
return resp, nil
}
err = localEcVolume.DeleteNeedleFromEcx(types.NeedleId(req.FileKey))
if err != nil {
return nil, err
}
break
}
}
return resp, nil
}
// VolumeEcShardsToVolume generates the .idx, .dat files from .ecx, .ecj and .ec01 ~ .ec14 files
func (vs *VolumeServer) VolumeEcShardsToVolume(ctx context.Context, req *volume_server_pb.VolumeEcShardsToVolumeRequest) (*volume_server_pb.VolumeEcShardsToVolumeResponse, error) {
glog.V(0).Infof("VolumeEcShardsToVolume: %v", req)
// Collect all EC shards (NewEcVolume will load EC config from .vif into v.ECContext)
// Use MaxShardCount (32) to support custom EC ratios up to 32 total shards
tempShards := make([]string, erasure_coding.MaxShardCount)
v, found := vs.store.CollectEcShards(needle.VolumeId(req.VolumeId), tempShards)
if !found {
return nil, fmt.Errorf("ec volume %d not found", req.VolumeId)
}
if v.Collection != req.Collection {
return nil, fmt.Errorf("existing collection:%v unexpected input: %v", v.Collection, req.Collection)
}
// Use EC context (already loaded from .vif) to determine data shard count
dataShards := v.ECContext.DataShards
// Defensive validation to prevent panics from corrupted ECContext
if dataShards <= 0 || dataShards > erasure_coding.MaxShardCount {
return nil, fmt.Errorf("invalid data shard count %d for volume %d (must be 1..%d)", dataShards, req.VolumeId, erasure_coding.MaxShardCount)
}
shardFileNames := tempShards[:dataShards]
glog.V(1).Infof("Using EC config from volume %d: %d data shards", req.VolumeId, dataShards)
// Verify all data shards are present
for shardId := 0; shardId < dataShards; shardId++ {
if shardFileNames[shardId] == "" {
return nil, fmt.Errorf("ec volume %d missing shard %d", req.VolumeId, shardId)
}
}
dataBaseFileName, indexBaseFileName := v.DataBaseFileName(), v.IndexBaseFileName()
// calculate .dat file size
datFileSize, err := erasure_coding.FindDatFileSize(dataBaseFileName, indexBaseFileName)
if err != nil {
return nil, fmt.Errorf("FindDatFileSize %s: %v", dataBaseFileName, err)
}
// write .dat file from .ec00 ~ .ec09 files
if err := erasure_coding.WriteDatFile(dataBaseFileName, datFileSize, shardFileNames); err != nil {
return nil, fmt.Errorf("WriteDatFile %s: %v", dataBaseFileName, err)
}
// write .idx file from .ecx and .ecj files
if err := erasure_coding.WriteIdxFileFromEcIndex(indexBaseFileName); err != nil {
return nil, fmt.Errorf("WriteIdxFileFromEcIndex %s: %v", v.IndexBaseFileName(), err)
}
return &volume_server_pb.VolumeEcShardsToVolumeResponse{}, nil
}
func (vs *VolumeServer) VolumeEcShardsInfo(ctx context.Context, req *volume_server_pb.VolumeEcShardsInfoRequest) (*volume_server_pb.VolumeEcShardsInfoResponse, error) {
glog.V(0).Infof("VolumeEcShardsInfo: volume %d", req.VolumeId)
var ecShardInfos []*volume_server_pb.EcShardInfo
// Find the EC volume
for _, location := range vs.store.Locations {
if v, found := location.FindEcVolume(needle.VolumeId(req.VolumeId)); found {
// Get shard details from the EC volume
shardDetails := v.ShardDetails()
for _, shardDetail := range shardDetails {
ecShardInfo := &volume_server_pb.EcShardInfo{
ShardId: uint32(shardDetail.ShardId),
Size: int64(shardDetail.Size),
Collection: v.Collection,
}
ecShardInfos = append(ecShardInfos, ecShardInfo)
}
break
}
}
return &volume_server_pb.VolumeEcShardsInfoResponse{
EcShardInfos: ecShardInfos,
}, nil
}