Custom Scheduler Replica Propagation #6045

Open
LavredisG opened this issue Jan 14, 2025 · 10 comments
Labels
kind/question Indicates an issue that is a support question.

Comments

@LavredisG
Contributor

Since ReplicaScheduling step is not customisable on the Scheduling Process, what is the way to go if I were to do that? If for example I score some clusters based on custom metrics and I would like to assign replicas to them based on these scores, is it possible (if it's not, where would cluster scores be used, because I am struggling to come up with a use case other than replica propagation)? Or would it require something like custom controllers to do that?

@LavredisG LavredisG added the kind/question Indicates an issue that is a support question. label Jan 14, 2025
@chaosi-zju
Member

There are two scenarios:

  1. The first scheduling is based on custom metrics, and thereafter, changes in the metrics do not affect the replica allocation results.
  2. The first scheduling is based on custom metrics, and if the metrics later change significantly, the replica allocation should be adjusted accordingly.

Which scenario applies to you?

@LavredisG
Contributor Author

I would expect to have a Deployment as Input {CPU, RAM, Replicas, DelayThreshold}, that would have to be scheduled on a multi-cluster environment based on dynamic metrics such as expected incoming traffic, network delays, resource consumption etc. So, at its simplest form I would be interested in the 1st case where only replica propagation matters, but if you can dive a bit deeper and also explain how this would expand to adjust replica allocation dynamically and not only statically when first scheduling, I'd be grateful, since I am not 100% sure which use case we will end up following.

@chaosi-zju
Member

chaosi-zju commented Jan 15, 2025

For the 1st case, you can try implementing a custom scheduler-estimator.

As you know, with the dynamic weight scheduling strategy, the scheduler-estimator calculates MaxAvailableReplicas for each member cluster, and the scheduler then divides the replicas among the clusters using those MaxAvailableReplicas values as weights.

Currently, the scheduler-estimator determines MaxAvailableReplicas based solely on the cluster's available CPU and memory and the pods' resource requirements. You might consider customizing a scheduler-estimator to use your own metrics when evaluating MaxAvailableReplicas for each cluster, which would allow a more accurate estimate of reasonable allocation ratios across clusters.

@LavredisG
Contributor Author

  1. According to this, the current implementation uses scheduler-estimator only when the Type is Divided and Preference is Aggregated as per my understanding, is that correct? I am thinking that I would probably need Divided/Weighted/DynamicWeight for my case, so should I go for a custom scheduler-estimator or a custom factor for the DynamicWeight?

Image

  2. Regarding Cluster Resource Modeling: since Customized Cluster Resource Modeling has been the default since Karmada 1.4, is General Cluster Modeling useless in any version after v1.4?

Image

@chaosi-zju
Member

According to this, the current implementation uses scheduler-estimator only when the Type is Divided and Preference is Aggregated as per my understanding, is that correct?

No, scheduler-estimator serves both Divided/Aggregated and Divided/DynamicWeight; the document is a bit ambiguous.

is the General Cluster Modeling useless if you use any version after v1.4?

I didn't get your doubts. In fact, there can be multiple estimators working in the scheduler (General Cluster Modeling is a general estimator and scheduler-estimator is an accurate estimator).

The relationship between different estimators is:

    // Get the minimum value of MaxAvailableReplicas in terms of all estimators.
    estimators := estimatorclient.GetReplicaEstimators()
    ctx := context.WithValue(context.TODO(), util.ContextKeyObject,
        fmt.Sprintf("kind=%s, name=%s/%s", spec.Resource.Kind, spec.Resource.Namespace, spec.Resource.Name))
    for name, estimator := range estimators {
        res, err := estimator.MaxAvailableReplicas(ctx, clusters, spec.ReplicaRequirements)
        if err != nil {
            klog.Errorf("Max cluster available replicas error: %v", err)
            continue
        }
        klog.V(4).Infof("Invoked MaxAvailableReplicas of estimator %s for workload(%s, kind=%s, %s): %v", name,
            spec.Resource.APIVersion, spec.Resource.Kind, namespacedKey, res)
        for i := range res {
            if res[i].Replicas == estimatorclient.UnauthenticReplica {
                continue
            }
            if availableTargetClusters[i].Name == res[i].Name && availableTargetClusters[i].Replicas > res[i].Replicas {
                availableTargetClusters[i].Replicas = res[i].Replicas
            }
        }
    }

@LavredisG
Contributor Author

I didn't get your doubts. In fact, there can be multiple estimators working in the scheduler (General Cluster Modeling is a general estimator and scheduler-estimator is an accurate estimator).

I mean to say that since the scheduler-estimator was created to "fix" the problems that the general estimator had, is there any use case for the general estimator anymore?

@LavredisG
Contributor Author

Is it normal that the scheduler-estimator was working even without using the hack/deploy-scheduler-estimator.sh script? I mean to say that propagating a resource with either AvailableReplicas or Aggregated the distribution would be correct, as if the scheduler-estimator was already there. Is that expected?

@chaosi-zju
Member

chaosi-zju commented Jan 24, 2025

I mean to say that since the scheduler-estimator was created to "fix" the problems that the general estimator had, is there any use case for the general estimator anymore?

Sorry, this got buried in my notifications.

The general estimator is still useful. It serves as a default fallback estimator with lower overhead:

  • scheduler-estimator needs extra components and gRPC access, which some users see as too costly, so they prefer not to install it.
  • When scheduler-estimator fails, a general fallback is still available.

@chaosi-zju
Member

Is it normal that the scheduler-estimator was working even without using the hack/deploy-scheduler-estimator.sh script?

I still don't understand what you mean.

The script is just one way to install the component. There are many installation methods, and we only care about whether the component exists.

@LavredisG
Contributor Author

LavredisG commented Jan 24, 2025

Sorry, this got buried in my notifications

No problem, all good!

Ok I will try to explain it better. I was propagating a deployment using either Aggregate or Weighted/Dynamic/AvailableReplicas for replica propagation, but without having deployed the scheduler-estimator with the script provided (I had set up karmada and joined member clusters but there were no scheduler-estimator pods running for the members). Both of these worked without the scheduler-estimator as they would if I had the scheduler estimator, meaning that they correctly assigned the pods to each cluster. Is the default ClusterResourceModel used in that case when we haven't deployed the estimator?
