-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Loading status checks…
summary for Google's RPC paper
Max Liu
authored and
Max Liu
committed
May 14, 2024
1 parent
5f5e98f
commit 16168b8
Showing
2 changed files
with
21 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
--- | ||
title: "A Cloud-Scale Characterization of Remote Procedure Calls" | ||
layout: post | ||
--- | ||
|
||
|
||
Research Question(s): RPC is a key enabler for cloud-scale distributed applications and its use increases rapidly. Characterizing RPCs usage in the cloud helps with better understanding of cloud applications, yielding insights for optimizations. This paper presents methodology and results of characterizing RPCs at Google. | ||
|
||
|
||
Key Contributions: By analyzing monitoring datasets of metrics, traces, and CPU cycles spent on processing RPCs, the authors show that 1) the characters of latency, frequency, size, and nested hierarchy of over 10,000 distinct RPC methods and find that on average RPCs at Google operate at millisecond timescale (not microseconds which was the focus in many prior work) and kilobyte sizes. The RPC traces are wider than deeper, which is consistent with prior work; 2) At the tail, RPC latency tax, which is defined as non-application latencies, dominates a request's latency, indicating the importance of optimization of RPC; 3) They further breakdown the RPC latency tax into nine components and study latency variations within and across clusters, concluding that high server and memory utilization correlates with high variation within clusters. 4) Lastly, they find that RPC CPU utilization is heavily tailed. Also, most CPU cycles are consumed by a few services. Both results reveal the need for method-specific hardware optimization. | ||
|
||
|
||
Opportunities for future work: This work studies RPCs sent over TCP, leaving it a future work to study RPCs over RDMA, an important alternative transport. Also, inherently, the results hold at Google, but unclear if other cloud providers have similar concerns. | ||
|
||
|
||
Presenter: Max Liu |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters