A list of blogs, videos, and other content offering advice on building an experimentation platform. Most of it comes from companies such as Netflix, Airbnb, Microsoft, Facebook, Google, Booking, Stitch Fix, and Convoy, which have built some of the most advanced experimentation platforms in the world.
- Building our Centralized Experimental Platform (Stitch Fix)
- Reimagining Experimental Analysis at Netflix
- Scaling Airbnb's Experimentation Platform
- How We Scaled Experimentation at Hulu
- Supporting Rapid Product Iteration with an Experimentation Analysis Platform
- How we Reimagined A/B Testing at Squarespace
- Modern Experimentation Platforms (Podcast by Ex-Airbnb Data Scientist)
- Democratizing Online Experiments (Booking)
- Building a Culture of Experimentation (Harvard Business Review)
- It takes a Flywheel to Fly: Kickstarting and Growing the A/B testing Momentum at Scale (Microsoft)
- A Culture of Learning (Netflix)
- Top Challenges from the first Practical Online Controlled Experiments Summit
- Building a culture of experimentation (VistaPrint)
- The experimentation culture at HelloFresh
- How Etsy Handles Peeking in A/B Testing
- Peeking problem – the fatal mistake in A/B testing and experimentation
- How to Double A/B Testing Speed with CUPED
- How to speed up your AB test (Faire)
- How to speed up your AB test - Part 2, Outlier Capping and CUPED (Faire)
- Improving Experimental Power through Control Using Predictions as Covariate (CUPAC) (DoorDash)
- Increasing the sensitivity of A/B tests by utilizing the variance estimates of experimental units (Facebook)
- Improving Online Experiment Capacity by 4X with Parallelization and Increased Sensitivity
- The 4 Principles DoorDash Used to Increase Its Logistics Experiment Capacity by 1000% (DoorDash)
- How Booking.com Increases the Power of Online Experiments with CUPED
- Reducing A/B Test Measurement Variance by 30% (TripAdvisor)
- Multi-Armed Bandits and the Stitch Fix Experimentation Platform
- There's More to Experimentation Than A/B
- Multi Arm Bandit Algorithms (VWO)
- Quasi Experiments at Netflix
- Key Challenges with Quasi Experiments at Netflix
- How to Use Quasi-experiments and Counterfactuals to Build Great Products
- Counterfactual Inference Presentation (NeurIPS 2018)
- Why Tenant-Randomized A/B Test is Challenging and Tenant-Pairing May Not Work (Microsoft)
- Interleaving in Online Controlled Experiments (Netflix)
- Switchback Tests and Randomized Experimentation Under Network Effects at DoorDash
- Analyzing Switchback Experiments by Cluster Robust Standard Error to Prevent False Positive Results (DoorDash)
- Experiment Rigor for Switchback Experiment Analysis (DoorDash)
- Experimentation in a Ridesharing Marketplace (Lyft) - Part 1, 2, and 3. Covers a range of topics but heavily focused on assignment and measurement challenges in network-effect businesses.
- Why it matters where you randomize users in A/B Experiments
- Streaming Video Experimentation at Netflix: Visualizing Practical and Statistical Significance - Good overview of quantile functions and comparisons
- Data Compression for Large-Scale Streaming Experimentation (Netflix) - More on quantile functions and comparisons
- The Power of Bayesian A/B Testing (Convoy)
- Formulas for Bayesian A/B Testing
- Is Bayesian A/B Testing Immune to Peeking? Not Exactly (StackExchange)
- Easy Evaluation of Decision Rules in Bayesian A/B testing
- Cracking Correlated Observations in A/B Tests with Mixed Model Effects (Convoy)
- Exploring Bayesian A/B Testing with Simulations (Faire)
- Why You Should Switch to Bayesian A/B Testing (Wix)
- How to do Bayesian A/B Testing Fast (Wix)
- Detecting and Avoiding Bucket Imbalance in A/B Tests (Twitter)
- How not to run an A/B Test
- Airbnb Growth Principles for Effective Experimentation
- The What and Why of Experimentation at Twitter
- Zulily's Experimentation Journey
- Good Experiment, Bad Experiment (Reforge)
- Patterns of Trustworthy Experiments - Pre-Experiment Stage (Microsoft)
- You Cannot Be Data Driven Without Experimentation (Reforge)
- Reforge Experimentation Course
- Experimentation is a Major Focus of Data Science Across Netflix
- Experimentation Works: The Surprising Power of Business Experiments (Book)
- Leaky Abstractions in Online Experimentation Platforms
- Inferring the effect of an event using CausalImpact by Kay Brodersen (Google)
- Causal Impact - R Library
- Causal Models at Lyft
- Estimating Mechanisms of Change at Booking
- CausalML Library (Uber)
- EconML Library (Microsoft)
- A Dirty Dozen: Twelve Common Metric Interpretation Pitfalls in Online Controlled Experiments (Microsoft)
- Novelty/Primacy Effect Detection in Randomized Online Controlled Experiments (Microsoft)
- Improving the Sensitivity of Online Controlled Experiments: Case Studies at Netflix
- Improving the Sensitivity of Online Controlled Experiments by Utilizing Pre-Experiment Data (Microsoft)
- Peeking at A/B Tests (Optimizely)
- Graph cluster randomization: network exposure to multiple universes (Facebook)
- Machine Learning for Variance Reduction in Online Experiments