added notes
nukemberg committed Oct 7, 2015
1 parent 13b5a3c commit 17e88c9
Showing 2 changed files with 22 additions and 2 deletions.
20 changes: 20 additions & 0 deletions index.html
@@ -56,6 +56,8 @@ <h1 style="font-size: 120px;">The Math of Reliability</h1>
<aside class="notes">
<ul>
<li>Reliability is intimately linked to culture, but that's a different talk</li>
<li>Purpose of this talk: get people to think about reliability analytically</li>
<li>You can't "bolt on" reliability</li>
<li>Who likes math?</li>
</ul>
</aside>
@@ -370,13 +372,24 @@ <h2>Queuing delay</h2>
<p style="font-size: 70%;">&rho; - system utilization</p>
<img src="images/queue-latency.svg" style="border: none; background: none; box-shadow: none;" height="50%" width="50%" alt="">
<h3 class="fragment">Throttle your system!</h3>
<aside class="notes">
<ul>
<li>If utilization goes above ~80%, latency starts rising fast</li>
</ul>
</aside>
</section>
<section>
<div data-svg-fragment="images/queue.svg#[*|label=Layer_1]">
<div class="fragment" title="[*|label=Layer_2]"></div>
<div class="fragment" title="[*|label=Layer_3]"></div>
</div>
<h3 class="fragment">Backpressure</h3>
<aside class="notes">
<ul>
<li>load will queue inside your system</li>
<li>limit internal queues and apply backpressure</li>
</ul>
</aside>
</section>
<section>
<h2>Little's Law</h2>
@@ -386,6 +399,13 @@ <h2>Little's Law</h2>
<div class="fragment" title="[*|label=Layer_2]"></div>
</div>
<p>$L_i = L_j \rightarrow \frac {\lambda_i} {\lambda_j} = \frac {W_j} {W_i}$</p>
<aside class="notes">
<ul>
<li>What happens when one process fails and returns errors at 1/100 of the normal latency?</li>
<li>How do you deal with this?</li>
<li>Throttle according to "normal" throughput</li>
</ul>
</aside>
</section>
<section>
<h2>Feedback loops</h2>
4 changes: 2 additions & 2 deletions reliability.ipynb
@@ -505,9 +505,9 @@
"\n",
"Let's explore the consequences of the law, focusing our attention on a cluster of machines behind a least-connections load balancer. The load balancer tries to keep L equal between servers, so that for any two servers $L_i = L_j$. It follows that $\\lambda_i W_i = \\lambda_j W_j \\Rightarrow \\frac {\\lambda_i} {\\lambda_j} = \\frac {W_j} {W_i}$\n",
"\n",
"Now consider a service failure on one server: the server starts returning 50x errors with very small latency, perhaps $W_{normal}/10$. Our equation tells us that it will consume 10x throughput as any other server! This means that the failure of a very small portion of your cluster can manifest in a very severe failure raising your overall error rate to 50% or even more!!!\n",
"Now consider a service failure on one server: the server starts returning 50x errors with very low latency, perhaps $W_{normal}/100$. Our equation tells us that it will consume 100x the throughput of any other server! This means that the failure of a very small portion of your cluster can manifest as a very severe failure, raising your overall error rate to 50% or even more!\n",
"\n",
"This highlights the importance of properly throttling server based on \"normal\" workload."
"This highlights the importance of properly throttling servers based on \"normal\" workload."
]
}
],
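
The claim in the notebook cell above can be checked with a little arithmetic. The following is a minimal sketch, not code from the notebook: the function name `error_share` and its parameters (including the optional `cap`, which stands in for throttling at "normal" throughput) are hypothetical. It assumes the balancer keeps the same number of in-flight requests $L$ on every server, so by Little's Law each server's throughput is $\lambda = L / W$.

```python
def error_share(n_servers, n_failed, speedup, cap=None):
    """Fraction of total throughput that lands on failed servers.

    Throughput is measured in units of L / W_normal, so a healthy
    server handles 1.0 and a failed server that answers `speedup`
    times faster handles `speedup` (Little's Law: lambda = L / W).

    `cap` is a hypothetical per-server throttle, expressed as a
    multiple of normal throughput; None means no throttling.
    """
    if n_servers <= 0 or not 0 <= n_failed <= n_servers:
        raise ValueError("need 0 <= n_failed <= n_servers and n_servers > 0")
    if speedup <= 0:
        raise ValueError("speedup must be positive")

    failed_rate = speedup if cap is None else min(speedup, cap)
    healthy = (n_servers - n_failed) * 1.0
    failed = n_failed * failed_rate
    return failed / (healthy + failed)


if __name__ == "__main__":
    # One failed server out of 100, answering 100x faster than normal.
    print("unthrottled:", error_share(100, 1, 100))
    # Same failure, but each server is throttled to its normal throughput.
    print("throttled:  ", error_share(100, 1, 100, cap=1.0))
```

With 100 servers and a single one failing at 1/100 of the normal latency, the failed server takes 100 of 199 throughput units, which is just over 50% of all requests. Capping every server at its normal throughput brings that share back down to 1 in 100, which is the point of throttling based on "normal" workload.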