Commit Graph

112 Commits

Author SHA1 Message Date
Chris Lu
5ed0b00fb9 Support separate volume server ID independent of RPC bind address (#7609)
* pb: add id field to Heartbeat message for stable volume server identification

This adds an 'id' field to the Heartbeat protobuf message that allows
volume servers to identify themselves independently of their IP:port address.

Ref: https://github.com/seaweedfs/seaweedfs/issues/7487

* storage: add Id field to Store struct

Add Id field to Store struct and include it in CollectHeartbeat().
The Id field provides a stable volume server identity independent of IP:port.

Ref: https://github.com/seaweedfs/seaweedfs/issues/7487

* topology: support id-based DataNode identification

Update GetOrCreateDataNode to accept an id parameter for stable node
identification. When id is provided, the DataNode can maintain its identity
even when its IP address changes (e.g., in Kubernetes pod reschedules).

For backward compatibility:
- If id is provided, use it as the node ID
- If id is empty, fall back to ip:port

Ref: https://github.com/seaweedfs/seaweedfs/issues/7487

* volume: add -id flag for stable volume server identity

Add -id command line flag to volume server that allows specifying a stable
identifier independent of the IP address. This is useful for Kubernetes
deployments with hostPath volumes where pods can be rescheduled to different
nodes while the persisted data remains on the original node.

Usage: weed volume -id=node-1 -ip=10.0.0.1 ...

If -id is not specified, it defaults to ip:port for backward compatibility.

Fixes https://github.com/seaweedfs/seaweedfs/issues/7487

* server: add -volume.id flag to weed server command

Support the -volume.id flag in the all-in-one 'weed server' command,
consistent with the standalone 'weed volume' command.

Usage: weed server -volume.id=node-1 ...

Ref: https://github.com/seaweedfs/seaweedfs/issues/7487

* topology: add test for id-based DataNode identification

Test the key scenarios:
1. Create DataNode with explicit id
2. Same id with different IP returns same DataNode (K8s reschedule)
3. IP/PublicUrl are updated when node reconnects with new address
4. Different id creates new DataNode
5. Empty id falls back to ip:port (backward compatibility)

Ref: https://github.com/seaweedfs/seaweedfs/issues/7487
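
The scenarios listed above can be illustrated with a small sketch. This is not the SeaweedFS topology code: `dataNode`, `rack`, and `getOrCreate` are hypothetical names, and the real `GetOrCreateDataNode` signature is not shown in this log. The sketch assumes nodes are indexed by id and that an empty id falls back to ip:port.

```go
package main

import "fmt"

// dataNode is a hypothetical stand-in for a topology DataNode.
type dataNode struct {
	ID   string
	IP   string
	Port int
}

// rack is a hypothetical container that indexes nodes by their ID.
type rack struct {
	nodes map[string]*dataNode
}

func newRack() *rack { return &rack{nodes: make(map[string]*dataNode)} }

// getOrCreate returns the node registered under id, creating it if needed.
// An empty id falls back to "ip:port". A known id that reconnects from a new
// address keeps its identity and has its address refreshed.
func (r *rack) getOrCreate(id, ip string, port int) *dataNode {
	if id == "" {
		id = fmt.Sprintf("%s:%d", ip, port)
	}
	if dn, ok := r.nodes[id]; ok {
		dn.IP, dn.Port = ip, port
		return dn
	}
	dn := &dataNode{ID: id, IP: ip, Port: port}
	r.nodes[id] = dn
	return dn
}

func main() {
	r := newRack()
	a := r.getOrCreate("node-1", "10.0.0.1", 8080)
	b := r.getOrCreate("node-1", "10.0.0.2", 8080) // same id, new IP
	c := r.getOrCreate("", "10.0.0.3", 8080)       // legacy node, no id
	fmt.Println(a == b, b.IP, c.ID)                // prints: true 10.0.0.2 10.0.0.3:8080
}
```

Keying the map by id rather than by address is what lets a rescheduled pod reattach to its existing entry.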

* pb: add address field to DataNodeInfo for proper node addressing

Previously, DataNodeInfo.Id was used as the node address, which worked
when Id was always ip:port. Now that Id can be an explicit string,
we need a separate Address field for connection purposes.

Changes:
- Add 'address' field to DataNodeInfo protobuf message
- Update ToDataNodeInfo() to populate the address field
- Update NewServerAddressFromDataNode() to use Address (with Id fallback)
- Fix LookupEcVolume to use dn.Url() instead of dn.Id()

Ref: https://github.com/seaweedfs/seaweedfs/issues/7487
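
The Address-with-Id-fallback rule can be sketched as follows. `nodeInfo` and `dialAddress` are hypothetical names used only for illustration; the generated `DataNodeInfo` type and `NewServerAddressFromDataNode()` are not reproduced here.

```go
package main

import "fmt"

// nodeInfo is a hypothetical, trimmed-down stand-in for DataNodeInfo.
type nodeInfo struct {
	Id      string // stable identity; may be an explicit string such as "node-1"
	Address string // where to connect; assumed empty when sent by an older server
}

// dialAddress picks the address to connect to: Address when present,
// otherwise Id, which older servers always set to ip:port.
func dialAddress(n nodeInfo) string {
	if n.Address != "" {
		return n.Address
	}
	return n.Id
}

func main() {
	fmt.Println(dialAddress(nodeInfo{Id: "node-1", Address: "10.0.0.1:8080"})) // prints: 10.0.0.1:8080
	fmt.Println(dialAddress(nodeInfo{Id: "10.0.0.2:8080"}))                    // prints: 10.0.0.2:8080
}
```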

* fix: trim whitespace from volume server id and fix test

- Trim whitespace from -id flag to treat ' ' as empty
- Fix store_load_balancing_test.go to include id parameter in NewStore call

Ref: https://github.com/seaweedfs/seaweedfs/issues/7487

* refactor: extract GetVolumeServerId to util package

Move the volume server ID determination logic to a shared utility function
to avoid code duplication between volume.go and rack.go.

Ref: https://github.com/seaweedfs/seaweedfs/issues/7487
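
A shared helper of this kind might look like the sketch below. The name `volumeServerID` and its parameters are assumptions; the actual signature of `GetVolumeServerId` is not shown in this log. It combines the two rules stated in this commit: trim whitespace from the id, and fall back to ip:port when the id is empty.

```go
package main

import (
	"fmt"
	"strings"
)

// volumeServerID is a hypothetical helper: it returns the trimmed id when one
// is given, and "ip:port" otherwise. IPv6 bracketing is deliberately ignored.
func volumeServerID(id, ip string, port int) string {
	id = strings.TrimSpace(id)
	if id != "" {
		return id
	}
	return fmt.Sprintf("%s:%d", ip, port)
}

func main() {
	fmt.Println(volumeServerID("node-1", "10.0.0.1", 8080)) // prints: node-1
	fmt.Println(volumeServerID(" ", "10.0.0.1", 8080))      // prints: 10.0.0.1:8080
}
```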

* fix: improve transition logic for legacy nodes

- Use exact ip:port match instead of net.SplitHostPort heuristic
- Update GrpcPort and PublicUrl during transition for consistency
- Remove unused net import

Ref: https://github.com/seaweedfs/seaweedfs/issues/7487

* fix: add id normalization and address change logging

- Normalize id parameter at function boundary (trim whitespace)
- Log when DataNode IP:Port changes (helps debug K8s pod rescheduling)

Ref: https://github.com/seaweedfs/seaweedfs/issues/7487
2025-12-02 22:08:11 -08:00
wyang
a7973ed7d1 fix deadlock hang when broadcast to clients (#6184)
Fix deadlock when broadcasting to clients.

When the master transfers leadership, the old master disconnects from all
filers and volume servers. In a large cluster, the number of broadcast
messages can exceed the channel's capacity of 100. If KeepConnected is not
reading from the channel during the disconnect, the broadcast blocks and
deadlocks, and the whole cluster stops serving.
2024-11-03 23:20:48 -08:00
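
The failure mode described in a7973ed7d1 (a sender blocking on a full buffered channel that nobody is reading) can be illustrated with a short sketch. This is not the fix applied in that commit, whose diff is not shown here; it demonstrates one general technique, a non-blocking send, and `trySend` and `clientChan` are hypothetical names.

```go
package main

import "fmt"

// trySend attempts a non-blocking send. It reports false when the buffer is
// full instead of blocking the caller forever.
func trySend(ch chan<- string, msg string) bool {
	select {
	case ch <- msg:
		return true
	default:
		return false
	}
}

func main() {
	clientChan := make(chan string, 100) // capacity taken from the commit message
	sent, dropped := 0, 0
	for i := 0; i < 150; i++ { // no receiver is running, as during a disconnect
		if trySend(clientChan, fmt.Sprintf("update-%d", i)) {
			sent++
		} else {
			dropped++
		}
	}
	fmt.Println(sent, dropped) // prints: 100 50
}
```

A plain `clientChan <- msg` in the same loop would block on the 101st message. The non-blocking form trades that hang for dropped messages, so it only suits updates the receiver can recover from another source.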
LHHDZ
4dc33cc143 fix unclaimed spaces calculation when volumePreallocate is enabled (#6063)
the calculation of `unclaimedSpaces` only needs to subtract `unusedSpace` when `preallocate` is not enabled.

Signed-off-by: LHHDZ <shichanglin5@qq.com>
2024-09-24 23:04:18 -07:00
augustazz
db833abfa2 fix ec volume lookup data sync (#5900) 2024-08-16 06:08:33 -07:00
steve.wei
2150289442 fix: Ensure that the clientAddress is unique (#5655) 2024-06-07 09:13:03 -07:00
Gaspare Iengo
dc6b750424 Fix panic (#5654) 2024-06-06 18:59:50 -07:00
steve.wei
d8da4bbaa7 Set the capacity of clientChan to 10000 (#5647) 2024-06-05 05:41:46 -07:00
LHHDZ
36b5b713ba fix deadlock caused by message chan blocked (#5639) 2024-06-03 07:42:40 -07:00
chrislu
3e7a92061b pass along volume server grpc port
fix https://github.com/seaweedfs/seaweedfs/issues/5617
2024-05-29 10:41:33 -07:00
chrislu
364bb6c7b4 avoid ticker leak 2024-05-24 17:15:12 -07:00
Konstantin Lebedev
a7fc723ae0 chore: add status code for request_total metrics (#5188) 2024-01-10 10:05:27 -08:00
Konstantin Lebedev
5db25a8f2a metric shows who is currently blocking the cluster or not (#3799)
* master_admin_lock shows whether the cluster is currently locked
https://github.com/seaweedfs/seaweedfs/issues/3452

* fix metric MasterAdminLock
2022-10-07 13:26:29 -07:00
Konstantin Lebedev
721c6197f9 skip deltaBeat if dn is zero (#3630)
* skip deltaBeat
https://github.com/seaweedfs/seaweedfs/issues/3629

* fix GrpcPort

* skip url :0

* skip empty DataCenter or Rack

* skip empty heartbeat Ip

* delete msg: add DataCenter

* comment todo

* fix
2022-09-11 22:31:53 -07:00
Konstantin Lebedev
31d2f77ceb refactor https://github.com/seaweedfs/seaweedfs/pull/3616 (#3625) 2022-09-07 23:23:33 -07:00
famosss
449582343f fix: Sometimes a nil pointer exception is thrown (#3618) 2022-09-07 18:57:13 -07:00
famosss
9678fc2106 fix: volume heartbeat processing error (#3616) 2022-09-07 09:48:51 -07:00
famosss
dc4037925d fix: Build DeletedVids before reset dn's children (#3530) 2022-08-26 22:52:08 -07:00
Konstantin Lebedev
4d08393b7c filer prefer volume server in same data center (#3405)
* initial prefer same data center
https://github.com/seaweedfs/seaweedfs/issues/3404

* GetDataCenter

* prefer same data center for ReplicationSource

* GetDataCenterId

* remove glog
2022-08-04 17:35:00 -07:00
chrislu
26dbc6c905 move to https://github.com/seaweedfs/seaweedfs 2022-07-29 00:17:28 -07:00
chrislu
bb01b68fa0 refactor 2022-07-28 23:24:38 -07:00
chrislu
68065128b8 add dc and rack 2022-07-28 23:22:51 -07:00
chrislu
3828b8ce87 "github.com/chrislusf/raft" => "github.com/seaweedfs/raft" 2022-07-27 12:12:40 -07:00
Konstantin Lebedev
6c20a3b622 avoid setting currentMaster to k8s svc.local discovery service domains
https://github.com/chrislusf/seaweedfs/issues/2589
2022-06-27 21:47:05 +05:00
chrislu
444ac21050 go fmt 2022-06-11 09:51:11 -07:00
guol-fnst
b12944f9c6 fix naming convention
notify volume server of duplicate directories
improve searching efficiency
2022-05-17 15:41:49 +08:00
guol-fnst
8fab39e775 rename UUID file
fix typo
move locationUUID into DiskLocation
2022-05-17 15:41:47 +08:00
guol-fnst
de6aa9cce8 avoid duplicated volume directory 2022-05-16 19:33:51 +08:00
chrislu
94635e9b5c filer: add filer group 2022-05-01 21:59:16 -07:00
shibinbin
c20e1edd99 fix: master loses some volumes 2022-04-07 15:18:28 +08:00
chrislu
bc888226fc erasure coding: tracking encoded/decoded volumes
If an EC shard is created but not spread to other servers, the masterclient would think this shard is not located here.
2022-04-05 19:03:02 -07:00
Konstantin Lebedev
c9952759c4 metrics master is leader 2022-01-24 20:13:07 +05:00
Konstantin Lebedev
28efe31524 new master metrics 2022-01-24 19:09:43 +05:00
Chris Lu
330d1fde7f send peers info to filers 2021-11-06 11:29:50 -07:00
Chris Lu
4b9c42996a refactor grpc API 2021-11-05 18:11:40 -07:00
Chris Lu
5ea86ef1da Revert "master: rename grpc function KeepConnected() to SubscribeVolumeLocationUpdates()"
This reverts commit af71ae11aa.
2021-11-05 17:52:15 -07:00
Chris Lu
af71ae11aa master: rename grpc function KeepConnected() to SubscribeVolumeLocationUpdates() 2021-11-03 01:09:48 -07:00
Chris Lu
5160eb08f7 shell: optionally read filer address from master 2021-11-02 23:38:45 -07:00
Chris Lu
2789d10342 go fmt 2021-09-14 10:37:06 -07:00
Chris Lu
e5fc35ed0c change server address from string to a type 2021-09-12 22:47:52 -07:00
Chris Lu
7591336a22 log format 2021-09-11 14:27:57 -07:00
Chris Lu
5496d68f6a increase counter only if not early terminated 2021-09-11 02:05:55 -07:00
Chris Lu
0128239c0f handle ipv6 addresses 2021-09-07 16:43:54 -07:00
Chris Lu
006c01a519 fix format 2021-09-05 16:18:50 -07:00
Chris Lu
65af3cf4df master: disconnect only the phantom volume server
fix https://github.com/chrislusf/seaweedfs/issues/2311
2021-09-05 15:20:03 -07:00
Chris Lu
8126ab4b5d rename 2021-08-14 05:03:45 -07:00
Chris Lu
5469019852 adjust data type 2021-08-12 17:54:34 -07:00
Chris Lu
ee6c67682c minor 2021-06-12 02:52:41 -07:00
Chris Lu
5d931eff27 avoid possible nil
fix https://github.com/chrislusf/seaweedfs/issues/1928

The nil was because of `dn.Parent().UnlinkChildNode(dn.Id())` in topo.UnRegisterDataNode() function, when the dn leaves the cluster.
2021-03-22 13:24:07 -07:00
Chris Lu
f8446b42ab this can compile now!!! 2021-02-16 02:47:02 -08:00
Chris Lu
d156c74ec0 volume server set volume type and heartbeat to the master 2020-12-13 03:11:24 -08:00