
Need to study CPU-to-memory utilization based on the data from providers #144

Open

ustiugov opened this issue Feb 9, 2023 · 3 comments

@ustiugov (Member) commented Feb 9, 2023

In our load scenarios, the fleet is CPU bound with quite low memory usage. Alibaba, however, reports in their recent RunD paper that although a "typical" server runs about 2k function instances, its CPU utilization remains below 50%.

One needs to evaluate CPU and memory utilization using Alibaba's CPU-to-memory ratio (100 vCPU, probably with SMT enabled, to 384 GB, i.e., roughly 1 to 4), which is similar to what we expect for AWS (m5.metal instance: 48 vCPU with SMT disabled to 384 GB).
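As a quick sanity check of the "similar ratio" claim, a minimal sketch in Go; the numbers are taken from the paragraph above, and the conversion of 100 vCPU with SMT to roughly 50 physical cores is an assumption:

```go
package main

import "fmt"

func main() {
	// Alibaba (RunD): 100 vCPU with SMT, i.e. roughly 50 physical cores, and 384 GB.
	fmt.Printf("Alibaba:      %.2f GB per vCPU, %.1f GB per physical core\n", 384.0/100.0, 384.0/50.0)
	// AWS m5.metal with SMT disabled: 48 physical cores and 384 GB.
	fmt.Printf("AWS m5.metal: %.1f GB per physical core\n", 384.0/48.0)
}
```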

Please configure a similar ratio by disabling cores or occupying memory, and run the slowdown sweep experiment again.
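For the "disabling cores" option, one illustrative way (an assumption on my part, not part of the loader) is to offline logical CPUs through the Linux sysfs hotplug interface on the worker node; this requires root, and cpu0 typically cannot be offlined:

```go
package main

import (
	"fmt"
	"os"
)

// offlineCPU takes a logical CPU offline via the Linux sysfs hotplug interface.
func offlineCPU(id int) error {
	path := fmt.Sprintf("/sys/devices/system/cpu/cpu%d/online", id)
	return os.WriteFile(path, []byte("0"), 0644)
}

func main() {
	// Example: offline logical CPUs 48-95 to emulate a 48-core node.
	for id := 48; id < 96; id++ {
		if err := offlineCPU(id); err != nil {
			fmt.Fprintf(os.Stderr, "cpu%d: %v\n", id, err)
		}
	}
}
```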

The expected outcome is a set of refined CPU quotas, derived from memory usage, that would deliver CPU and memory utilization levels similar to those reported by the providers.
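One possible reading of "CPU quotas derived from the memory usage" is a quota proportional to the function's memory footprint on a node with the ratio above; this is a sketch of that assumption only, not a decision made in this issue:

```go
package main

import "fmt"

// cpuQuotaFromMemory is a hypothetical helper: it assigns a function a CPU quota
// proportional to its memory footprint, given the node's vCPU-to-GB ratio.
func cpuQuotaFromMemory(funcMemGB, nodeVCPU, nodeMemGB float64) float64 {
	return funcMemGB * nodeVCPU / nodeMemGB
}

func main() {
	// Example: a 400 MB function on a 100 vCPU / 384 GB node gets ~0.1 vCPU.
	fmt.Printf("%.3f vCPU\n", cpuQuotaFromMemory(0.4, 100, 384))
}
```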

@leokondrashov (Contributor)

Looking into the function image code, it does not actually use the amount of memory the caller requests; it only reports the requested amount without allocating it.
@cvetkovic, am I correct? I saw you were the last one to change this part of the image code.

If so, there is no reason to try a different CPU/memory ratio, because memory is currently used only by the pod itself, not for imitating the "useful" work. Memory allocation should be fixed first; after that we would be able to throttle the server by memory consumption.
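For reference, making a function actually occupy the requested memory would mean allocating a buffer and touching every page, since untouched pages are not backed by physical memory. A minimal sketch, not the actual function-image code:

```go
package main

import "time"

// occupy allocates sizeMB of memory and touches one byte per 4 KiB page so the
// pages become resident, then holds the buffer for the given duration.
func occupy(sizeMB int, hold time.Duration) {
	const pageSize = 4096
	buf := make([]byte, sizeMB*1024*1024)
	for i := 0; i < len(buf); i += pageSize {
		buf[i] = 1 // force physical allocation of the page
	}
	time.Sleep(hold)
	_ = buf // keep the buffer referenced until the hold period ends
}

func main() {
	occupy(256, 100*time.Millisecond)
}
```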

@ustiugov (Member Author)

@cvetkovic is that the case in the main branch? I missed this change somehow

@cvetkovic (Contributor) commented Feb 14, 2023

@leokondrashov The way we currently do it is that we give a hint to the kube-scheduler through CPU and memory requests. This guarantees that the pod will get at least the amount of resources specified in the requests. These values are also used by Linux cgroups for resource throttling/multiplexing under resource overcommitment. Limits mean that a pod will be evicted if it uses more resources than specified in the limits.

Here is the way we calculate these values. OVERCOMMITMENT_RATIO is 10.
[screenshot of the request/limit calculation in the loader code]
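The exact formula is only in the screenshot, so the block below is a guess at its shape rather than a copy of the loader code: it assumes the limits come from the trace values and the requests are the limits scaled down by OVERCOMMITMENT_RATIO, expressed with the client-go resource types:

```go
package example

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

const overcommitmentRatio = 10

// resourceRequirements builds requests/limits for a function pod. Assumption
// (not verbatim loader code): requests = limits / OVERCOMMITMENT_RATIO.
func resourceRequirements(cpuMilli, memMiB int64) corev1.ResourceRequirements {
	return corev1.ResourceRequirements{
		Limits: corev1.ResourceList{
			corev1.ResourceCPU:    *resource.NewMilliQuantity(cpuMilli, resource.DecimalSI),
			corev1.ResourceMemory: *resource.NewQuantity(memMiB*1024*1024, resource.BinarySI),
		},
		Requests: corev1.ResourceList{
			corev1.ResourceCPU:    *resource.NewMilliQuantity(cpuMilli/overcommitmentRatio, resource.DecimalSI),
			corev1.ResourceMemory: *resource.NewQuantity(memMiB*1024*1024/overcommitmentRatio, resource.BinarySI),
		},
	}
}
```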

We currently do not allocate any memory in the running functions because we experienced a lot of timeouts when memory allocation was enabled. Allocating a large memory chunk can take a long time and does not fit into a single-digit-millisecond function execution time.
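To illustrate the timing concern, a small harness in the spirit of the earlier allocation sketch; the absolute numbers depend on the machine, but touching even a few hundred MB typically takes tens of milliseconds:

```go
package main

import (
	"fmt"
	"time"
)

// touch allocates sizeMB and writes one byte per 4 KiB page,
// returning how long the allocation plus page faults take.
func touch(sizeMB int) time.Duration {
	start := time.Now()
	buf := make([]byte, sizeMB*1024*1024)
	for i := 0; i < len(buf); i += 4096 {
		buf[i] = 1
	}
	return time.Since(start)
}

func main() {
	for _, mb := range []int{64, 256, 1024} {
		fmt.Printf("%4d MB: %v\n", mb, touch(mb))
	}
}
```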

I think this feature of the load generator is the most sensitive one in terms of designing it properly. One needs to fully understand what K8s offers, and we did not have much time to explore this. What we have currently is just a temporary solution that Dmitrii and I agreed upon while redesigning the loader.

Let me know if you want to have a chat on this issue sometime.
