Statistics and Probability for Data Science
Develop statistical intuition for inference, experimentation, and uncertainty-aware decisions.
t-Tests and ANOVA: Comparing Groups Like a Data Scientist (But Friendlier)
"If confidence intervals tell you where the true effect might live, t-tests and ANOVA tell you whether it's time to stop guessing and start reporting."
You already learned about confidence intervals and the basics of hypothesis testing. Now we're moving from single-parameter thinking to comparing groups. This is where t-tests and ANOVA shine — they answer questions like: "Are these two teaching methods really different?" or "Do these three marketing campaigns produce different click-through rates?" — but with math, dignity, and a little sass.
Why this matters (and where you’ll use it)
- A/B tests in product experiments (two groups) → t-tests.
- Comparing multiple conditions (3+ versions) → ANOVA.
- Before you build fancy ML models, you often want to check whether simple group differences exist. It's faster, interpretable, and often more convincing.
We also build on your visualization skills from the "Data Visualization and Storytelling" module — think boxplots, violin plots, and swarm plots as your pre-test detectives.
Quick roadmap: Which test when
- One-sample t-test: Compare sample mean to a known value (e.g., does average test score differ from 70?).
- Independent (two-sample) t-test: Compare means of two independent groups (A vs B). Use Welch's t-test if variances differ.
- Paired t-test: Compare repeated measures (before vs after on same subjects).
- One-way ANOVA: Compare means across 3+ groups. If significant, follow up with post-hoc tests (Tukey HSD).
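The simplest entry in the roadmap above is the one-sample t-test; here is a minimal sketch using simulated exam scores (the data and the reference value of 70 are illustrative, not from a real dataset):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
scores = rng.normal(72, 10, 40)  # simulated exam scores (illustrative)

# H0: mean score == 70  vs  H1: mean score != 70
tstat, pval = stats.ttest_1samp(scores, popmean=70)
print(f"t={tstat:.3f}, p={pval:.3f}")
```

The same `scipy.stats` module covers the rest of the roadmap: `ttest_ind` for independent groups, `ttest_rel` for paired data, and `f_oneway` for a quick one-way ANOVA.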
Assumptions (so you don’t get fooled)
- Independence of observations (design issue).
- Normality of the sampling distribution (t/ANOVA robust for moderate n, but check residuals).
- Homoscedasticity (equal variances) for classic t-test/ANOVA — if not met, use Welch's t-test or Welch ANOVA, or non-parametric tests.
If assumptions seem violated: transform the data, use non-parametric alternatives (Mann–Whitney U, Wilcoxon, Kruskal–Wallis), or bootstrap the difference.
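A quick way to check the equal-variance assumption, and to fall back to a rank-based test when things look dubious, sketched with simulated data (the group parameters are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(75, 10, 30)
b = rng.normal(80, 25, 30)  # deliberately larger spread

# Levene's test: H0 is equal variances; a small p suggests heteroscedasticity
_, p_levene = stats.levene(a, b)
print(f"Levene p={p_levene:.3f}")

# If normality looks doubtful too, a rank-based alternative to the t-test:
u_stat, p_mw = stats.mannwhitneyu(a, b, alternative="two-sided")
print(f"Mann-Whitney U={u_stat:.1f}, p={p_mw:.3f}")
```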
Real-world analogy (because metaphors make us remember things)
Imagine three bands playing in different rooms and you want to know if the crowd noise level differs. Listening to each once is risky — you might catch a quiet song. So you sample multiple songs (observations). A t-test is like comparing two rooms; ANOVA is comparing all rooms at once. If ANOVA says "hey, something’s different," you then pin down which rooms differ (post-hoc tests).
Small, practical Python examples (useful cheat-sheet)
First, always visualize before testing.
```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from scipy import stats

np.random.seed(0)

# Two simulated groups with different means and spreads
group_a = np.random.normal(75, 10, 30)
group_b = np.random.normal(80, 12, 28)

sns.boxplot(data=[group_a, group_b])
plt.xticks([0, 1], ['A', 'B'])
plt.title('Scores by Group')
plt.show()

# Independent t-test (Welch: does not assume equal variances)
tstat, pval = stats.ttest_ind(group_a, group_b, equal_var=False)
print(f"t={tstat:.3f}, p={pval:.3f}")
```
Interpretation: a small p-value (< 0.05) → reject the null hypothesis of equal means. Tie this back to the confidence intervals you learned earlier — if the CI for the difference in means excludes 0, that's consistent with a significant test at the same level.
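That CI for the difference can be computed by hand with the Welch–Satterthwaite degrees of freedom; a sketch using the same simulated groups (nothing here comes from a real dataset):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(75, 10, 30)
group_b = rng.normal(80, 12, 28)

diff = group_a.mean() - group_b.mean()

# Per-group variance of the mean, then the standard error of the difference
va = group_a.var(ddof=1) / len(group_a)
vb = group_b.var(ddof=1) / len(group_b)
se = np.sqrt(va + vb)

# Welch-Satterthwaite approximation to the degrees of freedom
dof = (va + vb) ** 2 / (va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1))

tcrit = stats.t.ppf(0.975, dof)  # two-sided 95%
print(f"95% CI for mean difference: [{diff - tcrit * se:.2f}, {diff + tcrit * se:.2f}]")
```

If that interval excludes 0, the Welch t-test at the 5% level will agree.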
Paired t-test (pre-post)
```python
pre = np.random.normal(70, 8, 30)
post = pre + np.random.normal(3, 6, 30)  # same subjects, shifted by a treatment effect

tstat, pval = stats.ttest_rel(post, pre)
print(f"t={tstat:.3f}, p={pval:.3f}")
```
Use the paired test when measurements are linked — the same subjects measured twice.
One-way ANOVA + Tukey post-hoc (statsmodels)
```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Simulate a third group
group_c = np.random.normal(78, 11, 27)

df = pd.DataFrame({
    'score': np.concatenate([group_a, group_b, group_c]),
    'group': ['A'] * len(group_a) + ['B'] * len(group_b) + ['C'] * len(group_c)
})

model = smf.ols('score ~ C(group)', data=df).fit()
print(anova_lm(model))

# If the ANOVA p-value < 0.05, run Tukey's HSD to see which pairs differ
print(pairwise_tukeyhsd(df['score'], df['group']))
```
Effect sizes (don’t just report p-values like a robot)
- Cohen's d for two groups — tells how large the difference is in SD units.
- d ~ 0.2 small, 0.5 medium, 0.8 large.
- Eta-squared (η²) for ANOVA — proportion of variance explained by group.
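Eta-squared can be read straight off the ANOVA table as SS_between / SS_total; a sketch using the statsmodels workflow from above (simulated scores, hypothetical group labels):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "score": np.concatenate([rng.normal(m, 10, 30) for m in (75, 80, 78)]),
    "group": ["A"] * 30 + ["B"] * 30 + ["C"] * 30,
})

model = smf.ols("score ~ C(group)", data=df).fit()
table = anova_lm(model)

# eta^2 = SS_between / (SS_between + SS_residual)
eta_sq = table["sum_sq"]["C(group)"] / table["sum_sq"].sum()
print(f"eta-squared = {eta_sq:.3f}")
```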
Quick Cohen's d (pooled):
```python
def cohens_d(x, y):
    """Cohen's d using the pooled standard deviation."""
    nx, ny = len(x), len(y)
    sx2, sy2 = x.var(ddof=1), y.var(ddof=1)
    pooled = np.sqrt(((nx - 1) * sx2 + (ny - 1) * sy2) / (nx + ny - 2))
    return (x.mean() - y.mean()) / pooled

print(cohens_d(group_a, group_b))
```
When ANOVA says yes — how to proceed
- Check assumptions & residuals.
- If ANOVA p < alpha, run pairwise comparisons with multiple comparison correction (Tukey HSD is a standard choice).
- Report effect sizes and confidence intervals for pairwise differences.
- Visualize group means with error bars or notched boxplots to help storytelling.
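The error bars in that plot are just per-group means with t-based 95% half-widths; the numbers behind them can be computed directly (simulated groups as above):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
groups = {"A": rng.normal(75, 10, 30),
          "B": rng.normal(80, 12, 28),
          "C": rng.normal(78, 11, 27)}

for name, x in groups.items():
    mean = x.mean()
    sem = stats.sem(x)                            # standard error of the mean
    half = sem * stats.t.ppf(0.975, len(x) - 1)   # 95% half-width
    print(f"{name}: {mean:.1f} ± {half:.1f}")
```

Feed these means and half-widths to `plt.errorbar` (or let `sns.pointplot` compute them) for the final chart.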
Common pitfalls
- Running multiple t-tests instead of ANOVA when you have 3+ groups increases false positives.
- Relying on p-values without effect sizes and CIs — asterisks are popular, but effect sizes and intervals carry the real insight.
- Ignoring independence (paired vs independent is a design decision, not a calculation choice).
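The first pitfall is easy to quantify: with three pairwise t-tests each at α = 0.05, the chance of at least one false positive under the null is about 1 − 0.95³ ≈ 0.14. A quick simulation (illustrative; all groups are drawn from the same distribution, so any "significant" pair is a false positive):

```python
import numpy as np
from itertools import combinations
from scipy import stats

rng = np.random.default_rng(1)
false_hits = 0
n_sims = 2000
for _ in range(n_sims):
    # Three groups with identical distributions
    groups = [rng.normal(0, 1, 30) for _ in range(3)]
    pvals = [stats.ttest_ind(x, y).pvalue for x, y in combinations(groups, 2)]
    if min(pvals) < 0.05:
        false_hits += 1

print(f"Family-wise false positive rate: {false_hits / n_sims:.3f}")
```

The simulated rate lands well above the nominal 0.05 — which is exactly why ANOVA plus a corrected post-hoc procedure exists.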
Key takeaways (so this sticks in your brain like a catchy chorus)
- t-tests = compare two means; ANOVA = compare 3+ means.
- Always visualize first — your plots are the honest friends of your tests.
- Check assumptions: normality, independence, homoscedasticity. If violated, consider robust or non-parametric methods.
- Report p-values, confidence intervals, and effect sizes — that trio tells the full story.
"Statistics without context is like coffee without caffeine: technically present, spiritually absent."
Go run some tests on your A/B dataset. Plot the distributions, pick the correct test, and then celebrate responsibly when the results are significant (or when they make you re-think your experiment). You’re now equipped to move from hypothesis testing to insights — and that’s the real win.
Summary: t-tests and ANOVA are your bread-and-butter tools for comparing group means. Use visuals from the Data Visualization module, recall confidence intervals for interpretation, and keep your assumptions in check. Happy testing (and may your p-values behave)!