In our load scenarios, the fleet is CPU-bound with quite low memory usage. Alibaba, however, reports in their recent RunD paper that although a "typical" server runs around 2k function instances, its CPU utilization remains below 50%.
One needs to evaluate CPU and memory utilization using their CPU/memory ratio (100 vCPUs, probably with SMT enabled, to 384 GB, i.e., roughly 1 to 4), which is similar to our expectation for AWS (m5.metal instance: 48 vCPUs with SMT disabled to 384 GB).
Please configure a similar ratio by disabling cores or occupying memory, and run the slowdown sweep experiment again.
The expected outcome would be refined CPU quotas derived from the memory usage, delivering CPU and memory utilization levels similar to those reported by the providers.
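If the "occupying memory" route is taken, a minimal sketch of a node-local memory hog could look like the following (a standalone Go helper assumed purely for illustration; the flag name and default are hypothetical). It allocates the requested amount and touches every page so the memory is actually resident, shrinking the free RAM per vCPU toward the target ratio:

```go
// memhog.go: occupy a fixed amount of RAM on a worker node so that the
// effective CPU-to-memory ratio seen by the experiment matches the target
// (e.g., 1 vCPU to 4 GB). Assumes swap is disabled on the node.
package main

import (
	"flag"
	"fmt"
	"time"
)

func main() {
	gib := flag.Int("gib", 1, "amount of memory to occupy, in GiB")
	flag.Parse()

	const pageSize = 4096
	block := make([]byte, *gib<<30)

	// Touch one byte per page so the allocation is backed by physical memory,
	// not just reserved virtual address space.
	for i := 0; i < len(block); i += pageSize {
		block[i] = 1
	}

	fmt.Printf("occupying %d GiB; press Ctrl+C to release\n", *gib)
	for {
		time.Sleep(time.Hour)
		block[0]++ // keep the slice referenced so the GC never frees it
	}
}
```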
As I looked into the function image code, it does not actually use the amount of memory the caller requests; it just reports the requested amount without allocating it. @cvetkovic, am I correct? I see you were the last one to change this part of the image code.
If so, there is no reason to try other CPU/memory ratios, because right now memory is only used for the pod itself, not for imitating the "useful" work. Memory allocation should be fixed in the first place; after that, we would be able to continue throttling the server by memory consumption.
@leokondrashov The way we do it currently is that we give a hint to the kube-scheduler through CPU and MEM requests. This guarantees that the pod will get at least the amount of resources specified in the requests. This value is also used by Linux cgroups for resource throttling/multiplexing under resource overcommitment. Limits mean that a pod will be evicted if it uses more resources than specified in the limits.
Here is the way we calculate these values. OVERCOMMITMENT_RATIO is 10.
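For readers outside the project, a rough sketch of such a scheme (assuming, purely for illustration, that requests are the user-requested resources divided by OVERCOMMITMENT_RATIO and limits are the full requested amount; this may not match the loader's exact formula):

```go
// Sketch of deriving Kubernetes resource requests/limits from a trace
// function's requested CPU (millicores) and memory (MiB). The division by
// the overcommitment ratio for requests and the use of the full amount as
// the limit are assumptions for illustration only.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

const overcommitmentRatio = 10

func buildResources(cpuMilli, memMiB int64) corev1.ResourceRequirements {
	return corev1.ResourceRequirements{
		// Requests: a hint to kube-scheduler; the pod is guaranteed at least this much.
		Requests: corev1.ResourceList{
			corev1.ResourceCPU:    *resource.NewMilliQuantity(cpuMilli/overcommitmentRatio, resource.DecimalSI),
			corev1.ResourceMemory: *resource.NewQuantity(memMiB/overcommitmentRatio*1024*1024, resource.BinarySI),
		},
		// Limits: the hard cap enforced via cgroups by the kubelet.
		Limits: corev1.ResourceList{
			corev1.ResourceCPU:    *resource.NewMilliQuantity(cpuMilli, resource.DecimalSI),
			corev1.ResourceMemory: *resource.NewQuantity(memMiB*1024*1024, resource.BinarySI),
		},
	}
}

func main() {
	r := buildResources(1000, 2048) // e.g., a function asking for 1 vCPU and 2 GiB
	fmt.Printf("cpu request=%s limit=%s, mem request=%s limit=%s\n",
		r.Requests.Cpu(), r.Limits.Cpu(), r.Requests.Memory(), r.Limits.Memory())
}
```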
We currently do not allocate any memory in the functions that run, as we experienced a lot of timeouts when memory allocation was done. Malloc can take a lot of time when the memory chunk is large, which does not fit into a single-digit-millisecond function execution time.
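As a rough illustration of that cost (a standalone sketch, not the function image's code): allocating a chunk and touching every page to make it resident can easily exceed a single-digit-millisecond budget once the chunk reaches hundreds of MiB:

```go
// Sketch measuring how long it takes to allocate and touch a memory chunk,
// roughly the cost a function would pay if it actually backed the requested
// memory with physical pages on every invocation.
package main

import (
	"fmt"
	"time"
)

func allocateAndTouch(bytes int) time.Duration {
	start := time.Now()
	buf := make([]byte, bytes)
	for i := 0; i < len(buf); i += 4096 { // one write per page to force a page fault
		buf[i] = 1
	}
	return time.Since(start)
}

func main() {
	for _, mib := range []int{16, 64, 256} {
		fmt.Printf("%4d MiB: %v\n", mib, allocateAndTouch(mib<<20))
	}
}
```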
I think this feature of the load generator is the most sensitive in terms of designing it properly. One needs to fully understand what K8s offers, and we didn't have much time to explore this. What we have currently is just a temporary solution that Dmitrii and I agreed upon while redesigning the loader.
Let me know if you want to have a chat on this issue sometime.