In our Elasticsearch cluster, the shutdown status of the cold-node-2 node remains in the IN_PROGRESS state, preventing it from being taken offline.
The root cause of this issue lies in the following function:
func (h Health) HasShardActivity() bool {
	return h.TimedOut || // make sure request did not time out (i.e. no pending events)
		h.NumberOfInFlightFetch > 0 || // no shards being fetched
		h.InitializingShards > 0 || // no shards initializing
		h.RelocatingShards > 0 // no shards relocating
}
The variable h.RelocatingShards represents the total number of relocating shards across the entire cluster. Therefore, even when the shards on cold-node-2 have completed migration, if other nodes in the cluster are still relocating shards, the HasShardActivity function will continuously return true. As a result, the node shutdown is delayed due to ongoing shard activity elsewhere in the cluster.
In some scenarios, for example when hot nodes are continuously migrating data to cold nodes, the cluster-wide relocating count never drops to zero, so the node intended for shutdown may never be taken offline, even after its own shards have finished relocating.
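To illustrate the behaviour I would expect, here is a rough sketch of a node-scoped check. It is not the existing implementation: the `Shard` type, its fields, and `nodeHasShardActivity` are hypothetical names made up for this example, and it assumes the caller already has a per-shard listing that records the state and the source and target nodes of each shard copy.

```go
package main

// Shard is a hypothetical, simplified view of one shard copy.
// It is an assumption for this sketch, not a type from the real code base.
type Shard struct {
	Index          string
	State          string // e.g. "STARTED", "INITIALIZING", "RELOCATING"
	Node           string // node currently holding this shard copy
	RelocatingNode string // target node while relocating, empty otherwise
}

// nodeHasShardActivity reports whether any initializing or relocating
// shard involves the given node, either as the current holder or as the
// relocation target. Activity on other nodes is ignored.
func nodeHasShardActivity(shards []Shard, node string) bool {
	for _, s := range shards {
		// Skip shards that do not touch the node we care about.
		if s.Node != node && s.RelocatingNode != node {
			continue
		}
		if s.State == "INITIALIZING" || s.State == "RELOCATING" {
			return true
		}
	}
	return false
}
```

With a check like this, a relocation from a hot node to cold-node-1 would not count as activity for cold-node-2, so cold-node-2 could be reported as finished as soon as none of its own shards are moving.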
Question: Why can't the node be taken offline as soon as its own shard migration is complete?
I would like the node to be shut down once its shard data migration is finished, but this is not currently happening. Is there a specific reason why Elasticsearch doesn't allow for shutting down a node immediately after its own shard migration completes?