Files
seaweedFS/weed/server/master_grpc_server_raft_test.go
Chris Lu f4cdfcc5fd Add cluster.raft.leader.transfer command for graceful leader change (#7819)
* proto: add RaftLeadershipTransfer RPC for forced leader change

Add new gRPC RPC and messages for leadership transfer:
- RaftLeadershipTransferRequest: optional target_id and target_address
- RaftLeadershipTransferResponse: previous_leader and new_leader

This enables graceful leadership transfer before master maintenance,
reducing errors in filers during planned maintenance windows.

Ref: https://github.com/seaweedfs/seaweedfs/issues/7527

* proto: regenerate Go files for RaftLeadershipTransfer

Generated from master.proto changes.

* master: implement RaftLeadershipTransfer gRPC handler

Add gRPC handler for leadership transfer with support for:
- Transfer to any eligible follower (when target_id is empty)
- Transfer to a specific server (when target_id and target_address are provided)

Uses hashicorp/raft LeadershipTransfer() and LeadershipTransferToServer() APIs.

Returns the previous and new leader in the response.

* shell: add cluster.raft.leader.transfer command

Add weed shell command for graceful leadership transfer:
- Displays current cluster status before transfer
- Supports auto-selection of target (any eligible follower)
- Supports targeted transfer with -id and -address flags
- Provides clear feedback on success/failure with troubleshooting tips

Usage:
  cluster.raft.leader.transfer
  cluster.raft.leader.transfer -id <server_id> -address <grpc_address>

* master: add unit tests for raft gRPC handlers

Add tests covering:
- RaftLeadershipTransfer with no raft initialized
- RaftLeadershipTransfer with target_id but no address
- RaftListClusterServers with no raft initialized
- RaftAddServer with no raft initialized
- RaftRemoveServer with no raft initialized

These tests verify error handling when raft is not configured.

* shell: add tests for cluster.raft.leader.transfer command

Add tests covering:
- Command name and help text validation
- HasTag returns false for ResourceHeavy
- Validation of -id without -address
- Argument parsing with unknown flags

* master: clarify that leadership transfer requires -raftHashicorp

The default raft implementation (seaweedfs/raft, a goraft fork) does not
support graceful leadership transfer. This feature is only available when
using hashicorp raft (-raftHashicorp=true).

Update error messages and help text to make this requirement clear:
- gRPC handler returns specific error for goraft users
- Shell command help text notes the requirement
- Added test for goraft case

* test: use strings.Contains instead of custom helper

Replace custom contains/containsHelper functions with the standard
library strings.Contains for better maintainability.

* shell: return flag parsing errors instead of swallowing them

- Return the error from flag.Parse() instead of returning nil
- Update test to explicitly assert error for unknown flags

* test: document integration test scenarios for Raft leadership transfer

Add comments explaining:
- Why these unit tests only cover 'Raft not initialized' scenarios
- What integration tests should cover (with multi-master cluster)
- hashicorp/raft uses concrete types that cannot be easily mocked

* fix: address reviewer feedback on tests and leader routing

- Remove misleading tests that couldn't properly validate their
  documented behavior without a real Raft cluster:
  - TestRaftLeadershipTransfer_GoraftNotSupported
  - TestRaftLeadershipTransfer_ValidationTargetIdWithoutAddress

- Change WithClient(false) to WithClient(true) for RaftLeadershipTransfer
  RPC to ensure the request is routed to the current leader

* Improve cluster.raft.transferLeader command

- Rename command from cluster.raft.leader.transfer to cluster.raft.transferLeader
- Add symmetric validation: -id and -address must be specified together
- Handle case where same leader is re-elected after transfer
- Add test for -address without -id validation
- Add docker compose file for 5-master raft cluster testing
2025-12-19 00:15:39 -08:00

110 lines
3.0 KiB
Go

package weed_server
// These tests cover the Raft gRPC handlers in scenarios where Raft is not initialized
// (single master mode). Testing with an initialized Raft cluster requires integration
// tests with a multi-master setup, as hashicorp/raft uses concrete types that cannot
// be easily mocked.
//
// Integration tests for RaftLeadershipTransfer should cover:
// - Successful leadership transfer to any follower (auto-selection)
// - Successful leadership transfer to a specific target server
// - Error when caller is not the current leader
// - Error when target server is not a voting member
// - Error when target server is unreachable
//
// These scenarios are best tested with test/multi_master/ integration tests
// using a real 3-node master cluster with -raftHashicorp=true.
import (
"context"
"strings"
"testing"
"github.com/seaweedfs/seaweedfs/weed/pb/master_pb"
"github.com/seaweedfs/seaweedfs/weed/topology"
)
func TestRaftLeadershipTransfer_NoRaft(t *testing.T) {
// Test case: raft not initialized (single master mode)
ms := &MasterServer{
Topo: topology.NewTopology("test", nil, 0, 0, false),
}
ctx := context.Background()
req := &master_pb.RaftLeadershipTransferRequest{}
_, err := ms.RaftLeadershipTransfer(ctx, req)
if err == nil {
t.Error("expected error when raft is not initialized")
}
expectedMsg := "single master mode"
if err != nil && !strings.Contains(err.Error(), expectedMsg) {
t.Errorf("expected error message to contain %q, got %q", expectedMsg, err.Error())
}
}
func TestRaftListClusterServers_NoRaft(t *testing.T) {
// Test case: raft not initialized returns empty response
ms := &MasterServer{
Topo: topology.NewTopology("test", nil, 0, 0, false),
}
ctx := context.Background()
req := &master_pb.RaftListClusterServersRequest{}
resp, err := ms.RaftListClusterServers(ctx, req)
if err != nil {
t.Errorf("unexpected error: %v", err)
}
if resp == nil {
t.Error("expected non-nil response")
}
if len(resp.ClusterServers) != 0 {
t.Errorf("expected empty cluster servers, got %d", len(resp.ClusterServers))
}
}
func TestRaftAddServer_NoRaft(t *testing.T) {
// Test case: raft not initialized returns empty response
ms := &MasterServer{
Topo: topology.NewTopology("test", nil, 0, 0, false),
}
ctx := context.Background()
req := &master_pb.RaftAddServerRequest{
Id: "test-server",
Address: "localhost:19333",
Voter: true,
}
resp, err := ms.RaftAddServer(ctx, req)
if err != nil {
t.Errorf("unexpected error: %v", err)
}
if resp == nil {
t.Error("expected non-nil response")
}
}
func TestRaftRemoveServer_NoRaft(t *testing.T) {
// Test case: raft not initialized returns empty response
ms := &MasterServer{
Topo: topology.NewTopology("test", nil, 0, 0, false),
}
ctx := context.Background()
req := &master_pb.RaftRemoveServerRequest{
Id: "test-server",
Force: true,
}
resp, err := ms.RaftRemoveServer(ctx, req)
if err != nil {
t.Errorf("unexpected error: %v", err)
}
if resp == nil {
t.Error("expected non-nil response")
}
}