This repository contains our code and presentation for comparing the effectiveness of the entangled polynomial code with that of a naive repetition code for straggler mitigation in distributed matrix multiplication, done as part of the course project for EE 605 (Error Correcting Codes), Autumn 2022, IIT Bombay.
Authors: Ankit Kumar Misra and Dhruva Dhingra
This work is based on the paper "Straggler Mitigation in Distributed Matrix Multiplication: Fundamental Limits and Optimal Coding" by Yu, Maddah-Ali, and Avestimehr. We implemented their ideas and reviewed their methods, discovering two flaws in the polynomial code design, as explained in our presentation.
Organization:
-
src/
contains:-
polynomial_code.py
: Functionpolynomial_matmul
takes matrices$A$ and$B$ (along with other parameters), uses entangled polynomial coding to perform distributed matrix multiplicationavg_over
number of times, and returns the average error, average execution time, and average preprocessing time. -
redundant_code.py
: Functionredundant_matmul
takes matrices$A$ and$B$ (along with other parameters), uses repetitive coding to perform distributed matrix multiplicationavg_over
number of times, and returns the average error and average execution time. -
plot.py
:- Generates a plot of error versus recovery threshold for two entangled polynomial codes, one having equally spaced nodes and the other having Chebyshev nodes.
- Generates a plot of execution times and preprocessing times versus matrix size, using entangled polynomial coding and repetitive coding.
-
-
plots/
contains sample plots generated by our code. -
presentation/
contains our project presentation's LaTeX source and PDF, generated using Beamer.