Analyses
Framing and approaches.
Framing
Tukey, Design Thinking, and Better Questions
In my view, the most useful thing a data scientist can do is to devote serious effort towards improving the quality and sharpness of the question being asked.
The Ten Fallacies of Data Science
There is a hidden gap between the idealized view of the world presented to data-science students and recent hires, and the problems they actually face when grappling with real-world data science in industry.
It’s entirely possible to trust an analysis but not believe the final conclusions.
20 Questions to Ask Prior to Starting Data Analysis
Data-Informed Product Building
Our goal is to give you an understanding of how a product evolves from infancy to maturity; a holistic sense of the product metric ecosystem of growth, engagement and monetization; a framework to define goals for your company; and a toolkit you can use to analyze your product’s performance against those goals.
Model Tuning and the Bias-Variance Tradeoff
Uncertainty + Visualization, Explained
Uncertainty + Visualization, Explained (Part 2: Continuous Encodings)
10 Reads for Data Scientists Getting Started with Business Models
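The bias-variance entry above is easy to make concrete: on the same noisy data, an underfit model has high error everywhere, while an overfit one drives training error down as test error climbs. A minimal sketch with polynomial regression (the data, seed, and degrees are illustrative choices of mine, not taken from the linked article):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: one period of a sine wave plus noise, split into
# alternating train/test halves.
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, size=x.size)
x_tr, y_tr, x_te, y_te = x[::2], y[::2], x[1::2], y[1::2]

def poly_mse(degree):
    """Fit a polynomial of the given degree to the training half;
    return (train MSE, test MSE)."""
    coefs = np.polyfit(x_tr, y_tr, degree)
    mse = lambda xs, ys: float(np.mean((np.polyval(coefs, xs) - ys) ** 2))
    return mse(x_tr, y_tr), mse(x_te, y_te)

# Degree 1 underfits (high bias), 3 is about right, 12 overfits
# (high variance): train error keeps falling, test error does not.
results = {d: poly_mse(d) for d in (1, 3, 12)}
for d, (tr, te) in results.items():
    print(f"degree {d:2d}: train MSE {tr:.3f}  test MSE {te:.3f}")
```

Tuning a model is choosing the degree-3 row: the flexibility that minimizes test error, not train error.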
Approaches
Conversion rates – you are (most likely) computing them wrong
Modeling conversion rates and saving millions of dollars using Kaplan-Meier and gamma distributions
You’re all calculating churn rates wrong
The Power User Curve: The best way to understand your most engaged users
What to do when your metrics dip
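The conversion-rate and churn posts above all make a censoring argument: recent signups have not had time to convert yet, so the naive "conversions / signups" ratio is biased low, and survival-analysis tools such as Kaplan-Meier correct for it. A minimal sketch of that correction on simulated data (the from-scratch estimator and the simulation parameters are mine, not the articles'):

```python
import numpy as np

def kaplan_meier(duration, converted):
    """Kaplan-Meier estimate of the cumulative conversion curve.
    duration[i]: time to conversion, or -- for users who have not
    converted yet -- how long we have observed them (censored).
    converted[i]: True if the user converted at duration[i]."""
    duration = np.asarray(duration, float)
    converted = np.asarray(converted, bool)
    times = np.unique(duration[converted])
    surviving = 1.0                                   # P(not yet converted)
    curve = []
    for t in times:
        at_risk = np.sum(duration >= t)               # still observed, unconverted
        events = np.sum((duration == t) & converted)  # conversions at exactly t
        surviving *= 1.0 - events / at_risk
        curve.append(1.0 - surviving)                 # P(converted by t)
    return times, np.array(curve)

# Simulated signups: everyone converts eventually (exponential, mean
# 30 days), but each account has only been observed for 0-60 days.
rng = np.random.default_rng(1)
n = 5000
time_to_convert = rng.exponential(30, n)
observed_for = rng.uniform(0, 60, n)
converted = time_to_convert <= observed_for
duration = np.where(converted, time_to_convert, observed_for)

naive = converted.mean()     # "conversions / signups": biased low by censoring
times, curve = kaplan_meier(duration, converted)
km_30 = curve[np.searchsorted(times, 30) - 1]  # KM estimate of P(convert <= 30d)
true_30 = 1 - np.exp(-30 / 30)                 # exact value in this simulation
print(f"naive rate {naive:.3f}  KM 30-day rate {km_30:.3f}  truth {true_30:.3f}")
```

The naive rate lands well below the true 30-day conversion probability, while the Kaplan-Meier estimate recovers it; a production analysis would reach for a library such as lifelines rather than this hand-rolled loop.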
Experiments
Framing, approaches, pitfalls.
Framing
The Engineering Problem of A/B Testing
Leaky Abstractions In Online Experimentation Platforms
Common statistical tests are linear models (or: how to teach stats)
North Star or sign post metrics: which should one optimize?
Misadventures in experiments for growth
The Agony and Ecstasy of Building with Data
Is Bayesian A/B Testing Immune to Peeking? Not Exactly
Approaches
How Etsy Handles Peeking in A/B Testing
AB Testing 101: What I wish I knew about AB testing when I started my career
Suffering from a Non-inferiority Complex?
Analyzing Experiment Outcomes: Beyond Average Treatment Effects
Experimentation & Measurement for Search Engine Optimization
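Several items above (Etsy's peeking post, the Bayesian-peeking piece) revolve around one mechanic: if you re-test as data accumulates and stop at the first p < 0.05, your false-positive rate is far above 5%. A small A/A simulation makes that concrete (the batch size, number of looks, and known-variance z-test are illustrative assumptions, not any article's setup):

```python
import math
import numpy as np

rng = np.random.default_rng(2)

def simulate(n_looks=20, n_per_look=100, alpha=0.05, n_sims=2000):
    """A/A test: both arms draw from N(0, 1), so every 'significant'
    result is a false positive. Compare stopping at the first peek
    with p < alpha against a single fixed-horizon test."""
    peeked = fixed = 0
    for _ in range(n_sims):
        a = rng.normal(0, 1, (n_looks, n_per_look))
        b = rng.normal(0, 1, (n_looks, n_per_look))
        stopped_early = False
        for k in range(1, n_looks + 1):
            n = k * n_per_look
            # Two-sample z-test with known unit variance.
            z = (a[:k].mean() - b[:k].mean()) / math.sqrt(2.0 / n)
            p = math.erfc(abs(z) / math.sqrt(2))   # two-sided p-value
            if p < alpha:
                stopped_early = True
                break
        peeked += stopped_early
        # Fixed horizon: only the test on the full sample counts.
        n = n_looks * n_per_look
        z = (a.mean() - b.mean()) / math.sqrt(2.0 / n)
        fixed += math.erfc(abs(z) / math.sqrt(2)) < alpha
    return peeked / n_sims, fixed / n_sims

peek_fpr, fixed_fpr = simulate()
print(f"false-positive rate with peeking:   {peek_fpr:.3f}")
print(f"false-positive rate, fixed horizon: {fixed_fpr:.3f}")
```

The fixed-horizon test holds its nominal 5% error rate; the peeking experimenter declares a winner in a far larger share of these no-effect experiments, which is the problem sequential corrections (and Etsy's approach above) exist to fix.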
Pitfalls
Our key point here is that it is possible to have multiple potential comparisons, in the sense of a data analysis whose details are highly contingent on data, without the researcher performing any conscious procedure of fishing or examining multiple p-values.
Statistical Paradises and Paradoxes in Big Data (pdf)
Why Most Published Research Findings Are False
The Control Group is Out of Control
Decrease your confidence about most things if you’re not sure that you’ve investigated every piece of evidence.