* dlm: replace modulo hashing with consistent hash ring

  Introduce HashRing with virtual nodes (CRC32-based consistent hashing) to replace the modulo-based hashKeyToServer. When a filer node is removed, only keys that hashed to that node are remapped to the next server on the ring, leaving all other mappings stable. This is the foundation for backup replication: the successor on the ring is always the natural takeover node.

* dlm: add Generation and IsBackup fields to Lock

  Lock now carries IsBackup (whether this node holds the lock as a backup replica) and Generation (a monotonic fencing token that increments on each fresh acquisition and stays the same on renewal). Add helper methods: AllLocks, PromoteLock, DemoteLock, InsertBackupLock, RemoveLock, GetLock.

* dlm: add ReplicateLock RPC and generation/is_backup proto fields

  Add a generation field to LockResponse for fencing tokens, and generation and is_backup fields to the Lock message. Add a ReplicateLock RPC for primary-to-backup lock replication, along with ReplicateLockRequest/ReplicateLockResponse messages.

* dlm: add async backup replication to DistributedLockManager

  Route lock/unlock via the consistent hash ring's GetPrimaryAndBackup(). After a successful lock or unlock on the primary, asynchronously replicate the operation to the backup server via the ReplicateFunc callback. Single-server deployments skip replication.

* dlm: add ReplicateLock handler and backup-aware topology changes

  Add a ReplicateLock gRPC handler for primary-to-backup replication. Revise OnDlmChangeSnapshot to handle three cases on topology change:
  - Promote backup locks when this node becomes primary
  - Demote primary locks when this node becomes backup
  - Transfer locks when this node is neither primary nor backup

  Wire up SetupDlmReplication during filer server initialization.

* dlm: expose generation fencing token in lock client

  LiveLock now captures the generation from LockResponse and exposes it via the Generation() method. Consumers can use this as a fencing token to detect stale lock holders.

* dlm: update empty folder cleaner to use consistent hash ring

  Replace the local modulo-based hashKeyToServer with LockRing.GetPrimary(), which uses the shared consistent hash ring for folder ownership.

* dlm: add unit tests for consistent hash ring

  Test basic operations, consistency on server removal (only keys from the removed server move), the backup-is-successor property (the backup becomes the new primary when the primary is removed), and key distribution balance.

* dlm: add integration tests for lock replication failure scenarios

  Test cases:
  - Primary crash with backup promotion (backup has a valid token)
  - Backup crash with the primary continuing
  - Both primary and backup crash (lock lost, re-acquirable)
  - Rolling restart across all nodes
  - Generation fencing token increments on new acquisition
  - Replication failure (primary still works independently)
  - Unlock replicates the deletion to the backup
  - Lock survives server addition (topology change)
  - Consistent hashing minimal disruption (only the removed server's keys move)

* dlm: address PR review findings

  1. Causal replication ordering: add a per-lock sequence number (Seq) that increments on every mutation. The backup rejects incoming mutations with seq <= current seq, preventing stale async replications from overwriting newer state. Unlock replication also carries seq and is rejected if stale.
  2. Demote-after-handoff: OnDlmChangeSnapshot now transfers the lock to the new primary first and only demotes to backup after a successful TransferLocks RPC. If the transfer fails, the lock stays primary on this node.
  3. SetSnapshot candidateServers leak: replace the candidateServers map entirely instead of appending, so removed servers don't linger.
  4. TransferLocks preserves Generation and Seq: InsertLock now accepts generation and seq parameters. After accepting a transferred lock, the receiving node re-replicates to its backup.
  5. Rolling restart test: add a re-replication step after promotion and assert survivedCount > 0. Add TestDLM_StaleReplicationRejected.
  6. Mixed-version upgrade note: add a comment on HashRing documenting that all filer nodes must be upgraded together.

* dlm: serve renewals locally during transfer window on node join

  When a new node joins and steals hash ranges from surviving nodes, there is a window between the ring update and the lock transfer where the client gets redirected to a node that does not have the lock yet. Fix: if the ring says primary != self but we still hold the lock locally (non-backup, matching token), serve the renewal/unlock here rather than redirecting. The lock will be transferred by OnDlmChangeSnapshot, and subsequent requests will go to the new primary once the transfer completes.

  Add tests:
  - TestDLM_NodeDropAndJoin_OwnershipDisruption: measures disruption when a node drops and a new one joins (14/100 surviving-node locks disrupted, all handled by the transfer logic)
  - TestDLM_RenewalDuringTransferWindow: verifies renewal succeeds on the old primary during the transfer window

* dlm: master-managed lock ring with stabilization batching

  The master now owns the lock ring membership. Instead of filers independently reacting to individual ClusterNodeUpdate add/remove events, the master:
  1. Tracks filer membership in LockRingManager
  2. Batches rapid changes with a 1-second stabilization timer (e.g., a node drop + join within 1 second becomes a single ring update)
  3. Broadcasts the complete ring snapshot atomically via the new LockRingUpdate message in KeepConnectedResponse

  Filers receive the ring as a complete snapshot and apply it via SetSnapshot, ensuring all filers converge to the same ring state without intermediate churn. This eliminates the double-churn problem where a rapid drop + join would fire two separate ring mutations, each triggering lock transfers and disrupting ownership on surviving nodes.

* dlm: track ring version, reject stale updates, remove dead code

  SetSnapshot now takes a version parameter from the master. Stale updates (version < current) are rejected, preventing reordered messages from overwriting a newer ring state. Version 0 is always accepted for bootstrap. Remove AddServer/RemoveServer from LockRing: the ring is now exclusively managed by the master via SetSnapshot. Remove the candidateServers map that was only used by those methods.

* dlm: fix SelectLocks data race, advance generation on backup insert

  - SelectLocks: change RLock to Lock, since the function deletes map entries, which is a write operation and causes a data race under RLock.
  - InsertBackupLock: advance nextGeneration to at least the incoming generation, so that after failover promotion, new lock acquisitions get a generation strictly greater than any replicated lock.
  - Bump the replication failure log from V(1) to Warningf for production visibility.

* dlm: fix SetSnapshot race, test reliability, timer edge cases

  - SetSnapshot: hold the LockRing lock through both the version update and Ring.SetServers() so they are atomic. This prevents a concurrent caller from seeing the new version but applying stale servers.
  - Transfer window test: search for a key that actually moves primary when filer4 joins, instead of relying on a fixed key that may not.
  - renewLock redirect: pass the existing token to the new primary instead of an empty string, so redirected renewals work correctly.
  - scheduleBroadcast: check the timer.Stop() return value. If the timer already fired, the callback picks up the latest state.
  - FlushPending: only broadcast if timer.Stop() returns true (the timer was still pending). If false, the callback is already running.
  - Fix test comment: "idempotent" → "accepted, state-changing".

* dlm: use wall-clock nanoseconds for lock ring version

  The lock ring version was an in-memory counter that reset to 0 on master restart. A filer that had seen version 5 would reject version 1 from the restarted master. Fix: use time.Now().UnixNano() as the version. This survives master restarts without persistence: the restarted master produces a version greater than any pre-restart value.

* dlm: treat expired lock owners as missing

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* dlm: reject stale lock transfers

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* dlm: order replication by generation

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* dlm: bootstrap lock ring on reconnect

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
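The consistent hash ring from the first commit can be sketched in a few lines of Go. This is a minimal illustration only, assuming CRC32 hashing and virtual-node labels of the form `server#i`; the names (`hashRing`, `newHashRing`, `GetPrimaryAndBackup`) echo the description above but are not the actual SeaweedFS implementation.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
)

// hashRing places virtualNodes points per server on a ring keyed by
// CRC32. Lookups walk clockwise from the key's hash.
type hashRing struct {
	points []uint32          // sorted hash points on the ring
	owners map[uint32]string // hash point -> server
}

func newHashRing(servers []string, virtualNodes int) *hashRing {
	r := &hashRing{owners: map[uint32]string{}}
	for _, s := range servers {
		for i := 0; i < virtualNodes; i++ {
			p := crc32.ChecksumIEEE([]byte(fmt.Sprintf("%s#%d", s, i)))
			r.points = append(r.points, p)
			r.owners[p] = s
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// GetPrimaryAndBackup returns the first ring point at or after the
// key's hash (primary) and the next point owned by a different server
// (backup) -- the natural takeover node when the primary is removed.
func (r *hashRing) GetPrimaryAndBackup(key string) (primary, backup string) {
	if len(r.points) == 0 {
		return "", ""
	}
	h := crc32.ChecksumIEEE([]byte(key))
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	for n := 0; n < len(r.points); n++ {
		s := r.owners[r.points[(i+n)%len(r.points)]]
		if primary == "" {
			primary = s
		} else if s != primary {
			return primary, s
		}
	}
	return primary, "" // single-server ring: no backup
}

func main() {
	r := newHashRing([]string{"filer1:8888", "filer2:8888", "filer3:8888"}, 100)
	p, b := r.GetPrimaryAndBackup("/topics/example")
	fmt.Println("primary:", p, "backup:", b)
}
```

Because removing a server only deletes its own points, keys owned by other servers keep their primary, which is the "minimal disruption" property the unit tests assert.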
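The fencing-token pattern enabled by `Generation()` can be sketched as follows: a hypothetical downstream resource remembers the highest generation it has seen and rejects writes carrying an older one, so a holder paused across a failover cannot clobber newer state. `fencedResource` is illustrative, not part of SeaweedFS.

```go
package main

import "fmt"

// fencedResource accepts a write only if its fencing token (the lock
// generation) is at least as new as any token seen before. Renewals
// keep the same generation, so equal generations are still accepted;
// only a strictly older one indicates a stale holder.
type fencedResource struct {
	highestSeen int64
}

func (r *fencedResource) Write(generation int64, mutate func()) bool {
	if generation < r.highestSeen {
		return false // stale holder: the lock was re-acquired since
	}
	r.highestSeen = generation
	mutate()
	return true
}

func main() {
	r := &fencedResource{}
	state := ""
	fmt.Println(r.Write(3, func() { state = "from gen 3" })) // true
	fmt.Println(r.Write(2, func() { state = "from gen 2" })) // false: fenced off
	fmt.Println(state)                                       // from gen 3
}
```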
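The per-lock Seq ordering from review finding 1 amounts to a last-writer-wins check on the backup. A minimal sketch, with illustrative names (`backupStore`, `ApplyReplication`) rather than the real handler:

```go
package main

import "fmt"

// backupStore keeps the last applied sequence number per lock name and
// drops any replicated mutation (including unlocks) that is not
// strictly newer than what it already applied.
type backupStore struct {
	seq  map[string]int64 // last applied seq per lock name
	held map[string]bool  // whether a backup replica is currently held
}

func newBackupStore() *backupStore {
	return &backupStore{seq: map[string]int64{}, held: map[string]bool{}}
}

// ApplyReplication returns false when the incoming mutation is stale,
// i.e. an earlier async replication that arrived out of order.
func (b *backupStore) ApplyReplication(name string, seq int64, isUnlock bool) bool {
	if seq <= b.seq[name] {
		return false
	}
	b.seq[name] = seq
	b.held[name] = !isUnlock
	return true
}

func main() {
	b := newBackupStore()
	fmt.Println(b.ApplyReplication("lock-a", 1, false)) // true: first mutation
	fmt.Println(b.ApplyReplication("lock-a", 3, false)) // true: newer seq
	fmt.Println(b.ApplyReplication("lock-a", 2, false)) // false: stale, rejected
	fmt.Println(b.ApplyReplication("lock-a", 4, true))  // true: unlock, newer seq
	fmt.Println(b.held["lock-a"])                       // false: replica removed
}
```

Keeping the seq recorded even after an unlock is what lets a late-arriving lock mutation with a smaller seq be rejected rather than resurrecting a deleted lock.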
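The version-gated ring snapshot from the last two ring commits reduces to a small comparison on the filer side. A sketch under the stated rules (reject version < current, always accept version 0 for bootstrap, stamp broadcasts with wall-clock nanoseconds); the type and function names are illustrative:

```go
package main

import (
	"fmt"
	"time"
)

// lockRing is the filer-side view of ring membership; SetSnapshot
// applies a master broadcast only if it is not older than what we have.
type lockRing struct {
	version int64
	servers []string
}

// SetSnapshot rejects stale (reordered) updates. Version 0 is always
// accepted so a freshly started filer can bootstrap.
func (r *lockRing) SetSnapshot(version int64, servers []string) bool {
	if version != 0 && version < r.version {
		return false
	}
	r.version = version
	r.servers = servers
	return true
}

// newRingVersion stamps a broadcast with wall-clock nanoseconds, so a
// restarted master produces versions above any pre-restart value
// without persisting a counter.
func newRingVersion() int64 {
	return time.Now().UnixNano()
}

func main() {
	r := &lockRing{}
	v := newRingVersion()
	fmt.Println(r.SetSnapshot(v, []string{"filer1", "filer2"})) // true
	fmt.Println(r.SetSnapshot(v-1, []string{"filer1"}))         // false: stale
}
```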
599 lines
16 KiB
Protocol Buffer
syntax = "proto3";

package filer_pb;

option go_package = "github.com/seaweedfs/seaweedfs/weed/pb/filer_pb";
option java_package = "seaweedfs.client";
option java_outer_classname = "FilerProto";

//////////////////////////////////////////////////

service SeaweedFiler {

  rpc LookupDirectoryEntry (LookupDirectoryEntryRequest) returns (LookupDirectoryEntryResponse) {
  }

  rpc ListEntries (ListEntriesRequest) returns (stream ListEntriesResponse) {
  }

  rpc CreateEntry (CreateEntryRequest) returns (CreateEntryResponse) {
  }

  rpc UpdateEntry (UpdateEntryRequest) returns (UpdateEntryResponse) {
  }

  rpc AppendToEntry (AppendToEntryRequest) returns (AppendToEntryResponse) {
  }

  rpc DeleteEntry (DeleteEntryRequest) returns (DeleteEntryResponse) {
  }

  rpc AtomicRenameEntry (AtomicRenameEntryRequest) returns (AtomicRenameEntryResponse) {
  }

  rpc StreamRenameEntry (StreamRenameEntryRequest) returns (stream StreamRenameEntryResponse) {
  }

  rpc StreamMutateEntry (stream StreamMutateEntryRequest) returns (stream StreamMutateEntryResponse) {
  }

  rpc AssignVolume (AssignVolumeRequest) returns (AssignVolumeResponse) {
  }

  rpc LookupVolume (LookupVolumeRequest) returns (LookupVolumeResponse) {
  }

  rpc CollectionList (CollectionListRequest) returns (CollectionListResponse) {
  }

  rpc DeleteCollection (DeleteCollectionRequest) returns (DeleteCollectionResponse) {
  }

  rpc Statistics (StatisticsRequest) returns (StatisticsResponse) {
  }

  rpc Ping (PingRequest) returns (PingResponse) {
  }

  rpc GetFilerConfiguration (GetFilerConfigurationRequest) returns (GetFilerConfigurationResponse) {
  }

  rpc TraverseBfsMetadata (TraverseBfsMetadataRequest) returns (stream TraverseBfsMetadataResponse) {
  }

  rpc SubscribeMetadata (SubscribeMetadataRequest) returns (stream SubscribeMetadataResponse) {
  }

  rpc SubscribeLocalMetadata (SubscribeMetadataRequest) returns (stream SubscribeMetadataResponse) {
  }

  rpc KvGet (KvGetRequest) returns (KvGetResponse) {
  }

  rpc KvPut (KvPutRequest) returns (KvPutResponse) {
  }

  rpc CacheRemoteObjectToLocalCluster (CacheRemoteObjectToLocalClusterRequest) returns (CacheRemoteObjectToLocalClusterResponse) {
  }

  rpc DistributedLock(LockRequest) returns (LockResponse) {
  }

  rpc DistributedUnlock(UnlockRequest) returns (UnlockResponse) {
  }

  rpc FindLockOwner(FindLockOwnerRequest) returns (FindLockOwnerResponse) {
  }

  // distributed lock management internal use only
  rpc TransferLocks(TransferLocksRequest) returns (TransferLocksResponse) {
  }

  rpc ReplicateLock(ReplicateLockRequest) returns (ReplicateLockResponse) {
  }

}

//////////////////////////////////////////////////

message LookupDirectoryEntryRequest {
  string directory = 1;
  string name = 2;
}

message LookupDirectoryEntryResponse {
  Entry entry = 1;
}

message ListEntriesRequest {
  string directory = 1;
  string prefix = 2;
  string startFromFileName = 3;
  bool inclusiveStartFrom = 4;
  uint32 limit = 5;
  int64 snapshot_ts_ns = 6;
}

message ListEntriesResponse {
  Entry entry = 1;
  int64 snapshot_ts_ns = 2;
}

message RemoteEntry {
  string storage_name = 1;
  int64 last_local_sync_ts_ns = 2;
  string remote_e_tag = 3;
  int64 remote_mtime = 4;
  int64 remote_size = 5;
}

message Entry {
  string name = 1;
  bool is_directory = 2;
  repeated FileChunk chunks = 3;
  FuseAttributes attributes = 4;
  map<string, bytes> extended = 5;
  bytes hard_link_id = 7;
  int32 hard_link_counter = 8; // only exists in hard link meta data
  bytes content = 9; // if not empty, the file content

  RemoteEntry remote_entry = 10;
  int64 quota = 11; // for bucket only. Positive/Negative means enabled/disabled.
  int64 worm_enforced_at_ts_ns = 12;
}

message FullEntry {
  string dir = 1;
  Entry entry = 2;
}

message EventNotification {
  Entry old_entry = 1;
  Entry new_entry = 2;
  bool delete_chunks = 3;
  string new_parent_path = 4;
  bool is_from_other_cluster = 5;
  repeated int32 signatures = 6;
}

enum SSEType {
  NONE = 0;    // No server-side encryption
  SSE_C = 1;   // Server-Side Encryption with Customer-Provided Keys
  SSE_KMS = 2; // Server-Side Encryption with KMS-Managed Keys
  SSE_S3 = 3;  // Server-Side Encryption with S3-Managed Keys
}

message FileChunk {
  string file_id = 1; // to be deprecated
  int64 offset = 2;
  uint64 size = 3;
  int64 modified_ts_ns = 4;
  string e_tag = 5;
  string source_file_id = 6; // to be deprecated
  FileId fid = 7;
  FileId source_fid = 8;
  bytes cipher_key = 9;
  bool is_compressed = 10;
  bool is_chunk_manifest = 11; // content is a list of FileChunks
  SSEType sse_type = 12; // Server-side encryption type
  bytes sse_metadata = 13; // Serialized SSE metadata for this chunk (SSE-C, SSE-KMS, or SSE-S3)
}

message FileChunkManifest {
  repeated FileChunk chunks = 1;
}

message FileId {
  uint32 volume_id = 1;
  uint64 file_key = 2;
  fixed32 cookie = 3;
}

message FuseAttributes {
  uint64 file_size = 1;
  int64 mtime = 2; // unix time in seconds
  uint32 file_mode = 3;
  uint32 uid = 4;
  uint32 gid = 5;
  int64 crtime = 6; // unix time in seconds
  string mime = 7;
  int32 ttl_sec = 10;
  string user_name = 11; // for hdfs
  repeated string group_name = 12; // for hdfs
  string symlink_target = 13;
  bytes md5 = 14;
  uint32 rdev = 16;
  uint64 inode = 17;
}

message CreateEntryRequest {
  string directory = 1;
  Entry entry = 2;
  bool o_excl = 3;
  bool is_from_other_cluster = 4;
  repeated int32 signatures = 5;
  bool skip_check_parent_directory = 6;
}

// Structured error codes for filer entry operations.
// Values are stable — do not reorder or reuse numbers.
enum FilerError {
  OK = 0;
  ENTRY_NAME_TOO_LONG = 1;   // name exceeds max_file_name_length
  PARENT_IS_FILE = 2;        // parent path component is a file, not a directory
  EXISTING_IS_DIRECTORY = 3; // cannot overwrite directory with file
  EXISTING_IS_FILE = 4;      // cannot overwrite file with directory
  ENTRY_ALREADY_EXISTS = 5;  // O_EXCL and entry already exists
}

message CreateEntryResponse {
  string error = 1; // kept for human readability + backward compat
  SubscribeMetadataResponse metadata_event = 2;
  FilerError error_code = 3; // machine-readable error code
}
message UpdateEntryRequest {
  string directory = 1;
  Entry entry = 2;
  bool is_from_other_cluster = 3;
  repeated int32 signatures = 4;
  map<string, bytes> expected_extended = 5;
}

message UpdateEntryResponse {
  SubscribeMetadataResponse metadata_event = 1;
}

message AppendToEntryRequest {
  string directory = 1;
  string entry_name = 2;
  repeated FileChunk chunks = 3;
}

message AppendToEntryResponse {
}

message DeleteEntryRequest {
  string directory = 1;
  string name = 2;
  // bool is_directory = 3;
  bool is_delete_data = 4;
  bool is_recursive = 5;
  bool ignore_recursive_error = 6;
  bool is_from_other_cluster = 7;
  repeated int32 signatures = 8;
  int64 if_not_modified_after = 9;
}

message DeleteEntryResponse {
  string error = 1;
  SubscribeMetadataResponse metadata_event = 2;
}

message AtomicRenameEntryRequest {
  string old_directory = 1;
  string old_name = 2;
  string new_directory = 3;
  string new_name = 4;
  repeated int32 signatures = 5;
}

message AtomicRenameEntryResponse {
}

message StreamRenameEntryRequest {
  string old_directory = 1;
  string old_name = 2;
  string new_directory = 3;
  string new_name = 4;
  repeated int32 signatures = 5;
}

message StreamRenameEntryResponse {
  string directory = 1;
  EventNotification event_notification = 2;
  int64 ts_ns = 3;
}

message AssignVolumeRequest {
  int32 count = 1;
  string collection = 2;
  string replication = 3;
  int32 ttl_sec = 4;
  string data_center = 5;
  string path = 6;
  string rack = 7;
  string data_node = 9;
  string disk_type = 8;
}

message AssignVolumeResponse {
  string file_id = 1;
  int32 count = 4;
  string auth = 5;
  string collection = 6;
  string replication = 7;
  string error = 8;
  Location location = 9;
}

message LookupVolumeRequest {
  repeated string volume_ids = 1;
}

message Locations {
  repeated Location locations = 1;
}

message Location {
  string url = 1;
  string public_url = 2;
  uint32 grpc_port = 3;
  string data_center = 4;
}

message LookupVolumeResponse {
  map<string, Locations> locations_map = 1;
}

message Collection {
  string name = 1;
}

message CollectionListRequest {
  bool include_normal_volumes = 1;
  bool include_ec_volumes = 2;
}

message CollectionListResponse {
  repeated Collection collections = 1;
}

message DeleteCollectionRequest {
  string collection = 1;
}

message DeleteCollectionResponse {
}

message StatisticsRequest {
  string replication = 1;
  string collection = 2;
  string ttl = 3;
  string disk_type = 4;
}

message StatisticsResponse {
  uint64 total_size = 4;
  uint64 used_size = 5;
  uint64 file_count = 6;
}

message PingRequest {
  string target = 1; // default to ping itself
  string target_type = 2;
}

message PingResponse {
  int64 start_time_ns = 1;
  int64 remote_time_ns = 2;
  int64 stop_time_ns = 3;
}

message GetFilerConfigurationRequest {
}

message GetFilerConfigurationResponse {
  repeated string masters = 1;
  string replication = 2;
  string collection = 3;
  uint32 max_mb = 4;
  string dir_buckets = 5;
  bool cipher = 7;
  int32 signature = 8;
  string metrics_address = 9;
  int32 metrics_interval_sec = 10;
  string version = 11;
  string cluster_id = 12;
  string filer_group = 13;
  int32 major_version = 14;
  int32 minor_version = 15;
}

message SubscribeMetadataRequest {
  string client_name = 1;
  string path_prefix = 2;
  int64 since_ns = 3;
  int32 signature = 4;
  repeated string path_prefixes = 6;
  int32 client_id = 7;
  int64 until_ns = 8;
  int32 client_epoch = 9;
  repeated string directories = 10; // exact directory to watch
  bool client_supports_batching = 11; // client can unpack SubscribeMetadataResponse.events
  bool client_supports_metadata_chunks = 12; // client can read log file chunks from volume servers
}

message SubscribeMetadataResponse {
  string directory = 1;
  EventNotification event_notification = 2;
  int64 ts_ns = 3;
  repeated SubscribeMetadataResponse events = 4; // batch of additional events (backlog catch-up)
  repeated LogFileChunkRef log_file_refs = 5; // log file chunk refs for client direct-read
}

// A persisted log file that the client can read directly from volume servers.
// The file format is: [4-byte size | protobuf LogEntry] repeated.
// Each LogEntry.Data contains a marshaled SubscribeMetadataResponse.
message LogFileChunkRef {
  repeated FileChunk chunks = 1; // chunk references (fids) to read from volume servers
  int64 file_ts_ns = 2; // minute-level timestamp of the log file
  string filer_id = 3; // filer signature suffix from log filename
}

message TraverseBfsMetadataRequest {
  string directory = 1;
  repeated string excluded_prefixes = 2;
}

message TraverseBfsMetadataResponse {
  string directory = 1;
  Entry entry = 2;
}

message LogEntry {
  int64 ts_ns = 1;
  int32 partition_key_hash = 2;
  bytes data = 3;
  bytes key = 4;
  int64 offset = 5; // Sequential offset within partition
}

message KeepConnectedRequest {
  string name = 1;
  uint32 grpc_port = 2;
  repeated string resources = 3;
}

message KeepConnectedResponse {
}

message LocateBrokerRequest {
  string resource = 1;
}

message LocateBrokerResponse {
  bool found = 1;
  // if found, send the exact address
  // if not found, send the full list of existing brokers
  message Resource {
    string grpc_addresses = 1;
    int32 resource_count = 2;
  }
  repeated Resource resources = 2;
}
/////////////////////////
// Key-Value operations
/////////////////////////

message KvGetRequest {
  bytes key = 1;
}

message KvGetResponse {
  bytes value = 1;
  string error = 2;
}

message KvPutRequest {
  bytes key = 1;
  bytes value = 2;
}

message KvPutResponse {
  string error = 1;
}

/////////////////////////
// path-based configurations
/////////////////////////

message FilerConf {
  int32 version = 1;
  message PathConf {
    string location_prefix = 1;
    string collection = 2;
    string replication = 3;
    string ttl = 4;
    string disk_type = 5;
    bool fsync = 6;
    uint32 volume_growth_count = 7;
    bool read_only = 8;
    string data_center = 9;
    string rack = 10;
    string data_node = 11;
    uint32 max_file_name_length = 12;
    bool disable_chunk_deletion = 13;
    bool worm = 14;
    uint64 worm_grace_period_seconds = 15;
    uint64 worm_retention_time_seconds = 16;
  }
  repeated PathConf locations = 2;
}

/////////////////////////
// Remote Storage related
/////////////////////////

message CacheRemoteObjectToLocalClusterRequest {
  string directory = 1;
  string name = 2;
  int32 chunk_concurrency = 3; // parallel chunk downloads per file, 0 = default (8)
  int32 download_concurrency = 4; // multipart download concurrency per chunk (if supported by remote storage), 0 = default (5 for S3)
}

message CacheRemoteObjectToLocalClusterResponse {
  Entry entry = 1;
  SubscribeMetadataResponse metadata_event = 2;
}

/////////////////////////
// distributed lock management
/////////////////////////

message LockRequest {
  string name = 1;
  int64 seconds_to_lock = 2;
  string renew_token = 3;
  bool is_moved = 4;
  string owner = 5;
}

message LockResponse {
  string renew_token = 1;
  string lock_owner = 2;
  string lock_host_moved_to = 3;
  string error = 4;
  int64 generation = 5;
}

message UnlockRequest {
  string name = 1;
  string renew_token = 2;
  bool is_moved = 3;
}

message UnlockResponse {
  string error = 1;
  string moved_to = 2;
}

message FindLockOwnerRequest {
  string name = 1;
  bool is_moved = 2;
}

message FindLockOwnerResponse {
  string owner = 1;
}

message Lock {
  string name = 1;
  string renew_token = 2;
  int64 expired_at_ns = 3;
  string owner = 4;
  int64 generation = 5;
  bool is_backup = 6;
  int64 seq = 7;
}

message TransferLocksRequest {
  repeated Lock locks = 1;
}

message TransferLocksResponse {
}

message ReplicateLockRequest {
  string name = 1;
  string renew_token = 2;
  int64 expired_at_ns = 3;
  string owner = 4;
  int64 generation = 5;
  bool is_unlock = 6;
  int64 seq = 7;
}

message ReplicateLockResponse {
}

//////////////////////////////////////////////////
// StreamMutateEntry: ordered bidirectional streaming for all filer mutations.
// All create/update/delete/rename operations from a single mount go through
// one stream, preserving mutation ordering and eliminating per-request
// connection overhead.

message StreamMutateEntryRequest {
  uint64 request_id = 1;
  oneof request {
    CreateEntryRequest create_request = 2;
    UpdateEntryRequest update_request = 3;
    DeleteEntryRequest delete_request = 4;
    StreamRenameEntryRequest rename_request = 5;
  }
}

message StreamMutateEntryResponse {
  uint64 request_id = 1;
  bool is_last = 2; // always true except for rename, which sends multiple events
  oneof response {
    CreateEntryResponse create_response = 3;
    UpdateEntryResponse update_response = 4;
    DeleteEntryResponse delete_response = 5;
    StreamRenameEntryResponse rename_response = 6;
  }
  string error = 7; // human-readable error message when the operation failed
  int32 errno = 8; // POSIX errno (e.g. ENOENT=2, ENOTEMPTY=66) for direct FUSE status mapping
}