Statistics and Probability for Data Science
Develop statistical intuition for inference, experimentation, and uncertainty-aware decisions.
Nonparametric Tests: When Your Data Hates Normality (and What to Do About It)
“The t-test and ANOVA are great — until your data shows up in sweatpants.”
You just learned t-tests, ANOVA, and confidence intervals — neat tools when your residuals are behaving (i.e., roughly normal, homoscedastic, independent). But real-world data often refuses to dress up for the occasion: skewed distributions, outliers, ordinal measures, tiny sample sizes. That’s where nonparametric tests come in — the toolbox for when assumptions fail, or when you just don’t trust means.
What are nonparametric tests? (Short, sweet, and dramatic)
Nonparametric tests are statistical methods that don't assume a specific parametric form (like normality) for the population distribution. Instead of modeling means under normal assumptions, they often use medians, ranks, or resampling. They’re robust, flexible, and a little rebellious.
Why they matter: In data science you’ll see skewed revenue, ordinal survey responses (“agree / neutral / disagree”), and tiny A/B test buckets. Use nonparametric tests when parametric assumptions (normality, equal variances) are violated or when your metric is ordinal.
Quick reminder: When you might prefer nonparametric over t-tests/ANOVA
- Small sample sizes (n < ~30) and non-normal distributions
- Heavy outliers that distort means
- Ordinal data (Likert scales)
- Heteroscedasticity that can’t be fixed by transformation
Remember: earlier, we used visual checks (from Data Visualization & Storytelling) — histograms, QQ-plots, boxplots, and violin plots — to verify normality and spot outliers. If those visuals scream “nope”, nonparametric tests are your friend.
The main nonparametric tests you’ll use (and when to pick them)
Mann–Whitney U test (a.k.a. Wilcoxon rank-sum)
- Use: Compare two independent groups (like two separate A/B variants) when you can’t assume normality.
- Works with: continuous or ordinal data.
- Intuition: Rank all observations across both groups; test whether ranks differ between groups.
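The rank intuition is easy to check by hand. A minimal sketch (with made-up numbers) that computes the U statistic from pooled ranks and confirms it matches SciPy:

```python
import numpy as np
from scipy import stats

# hypothetical small samples (illustrative values only)
group_A = np.array([3.1, 4.8, 2.9, 5.6])
group_B = np.array([6.2, 7.1, 5.9])

# rank all observations pooled across both groups
ranks = stats.rankdata(np.concatenate([group_A, group_B]))
ranks_A = ranks[:len(group_A)]

# U for group A = its rank sum minus the minimum possible rank sum
U_A = ranks_A.sum() - len(group_A) * (len(group_A) + 1) / 2

res = stats.mannwhitneyu(group_A, group_B, alternative='two-sided')
print(U_A, res.statistic)  # both 0.0 here: every A value ranks below every B value
```

With this toy data the groups are completely separated, so U hits its minimum of 0 — the extreme case the test is sensitive to.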
Wilcoxon signed-rank test
- Use: Paired data (before/after on same users) where normality for paired differences is suspect.
- Intuition: Rank absolute differences and consider signs — testing if median difference is zero.
Kruskal–Wallis H test
- Use: More than two independent groups (nonparametric alternative to one-way ANOVA).
- Intuition: Generalizes rank-based comparison to k groups.
Friedman test
- Use: Repeated measures (like ANOVA repeated measures) when normality fails.
- Intuition: Ranks within each block (subject) and compares treatments across blocks.
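SciPy exposes this as `friedmanchisquare`, which takes one sequence per treatment with rows aligned by subject. A small sketch with invented ratings (five subjects, three treatments — values are illustrative):

```python
from scipy import stats

# hypothetical: 5 subjects each rate 3 treatments (position i = subject i)
treat_1 = [4, 5, 3, 4, 5]
treat_2 = [2, 3, 2, 3, 3]
treat_3 = [5, 5, 4, 5, 4]

# ranks within each subject, then compares rank sums across treatments
stat, p = stats.friedmanchisquare(treat_1, treat_2, treat_3)
print(f"chi2={stat:.2f}, p={p:.3f}")
```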
Spearman rank correlation
- Use: Correlation when linearity or normality is questionable; measures monotonic relationships.
Sign test
- Use: Extremely simple paired test based solely on direction (sign) of differences — very robust but low power.
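SciPy has no dedicated sign-test function, but the test reduces to an exact binomial test on the count of positive differences (a sketch with toy paired scores):

```python
from scipy import stats

# hypothetical paired scores for the same users before/after a change
before = [3, 4, 2, 5, 3, 4, 2, 3]
after  = [4, 5, 3, 5, 4, 5, 3, 4]

diffs = [a - b for a, b in zip(after, before)]
n_pos = sum(d > 0 for d in diffs)
n = sum(d != 0 for d in diffs)  # zero differences (ties) are dropped

# under H0 (median difference = 0), n_pos ~ Binomial(n, 0.5)
res = stats.binomtest(n_pos, n, p=0.5, alternative='two-sided')
print(res.pvalue)
```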
Bootstrap methods (nonparametric resampling)
- Use: Estimate confidence intervals for medians, percentiles, or complex statistics when analytic CIs aren’t available.
- Intuition: Resample your data with replacement many times and use the empirical distribution to build CIs — remember our earlier lessons on confidence intervals? Bootstrapping builds them without parametric assumptions.
How to choose: a mini decision tree
- Is your outcome ordinal or non-normal? → Consider nonparametric.
- Are groups independent? → Mann–Whitney (2 groups) or Kruskal–Wallis (k groups).
- Are observations paired/repeated? → Wilcoxon signed-rank or Friedman.
- Need correlation? → Spearman.
- Want CI for median or complex stat? → Bootstrap.
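The tree above is small enough to encode as a toy helper — the function name and arguments here are purely illustrative, not a library API:

```python
def suggest_test(paired: bool, n_groups: int = 2,
                 want_correlation: bool = False, want_ci: bool = False) -> str:
    """Encode the mini decision tree above (illustrative sketch only)."""
    if want_ci:
        return "bootstrap"
    if want_correlation:
        return "Spearman"
    if paired:
        return "Wilcoxon signed-rank" if n_groups == 2 else "Friedman"
    return "Mann-Whitney U" if n_groups == 2 else "Kruskal-Wallis"

print(suggest_test(paired=False, n_groups=3))  # Kruskal-Wallis
```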
Tiny Python recipes (so you can stop reading and start testing)
Note: We previously used Matplotlib/Seaborn to explore distributions. Visualize first! Then run these.
- Mann–Whitney U (scipy)
from scipy import stats
u_stat, p = stats.mannwhitneyu(group_A, group_B, alternative='two-sided')  # two independent groups
- Wilcoxon signed-rank (paired)
w_stat, p = stats.wilcoxon(before, after)  # tests whether the median paired difference is zero
- Kruskal–Wallis (k groups)
h_stat, p = stats.kruskal(group1, group2, group3)  # k independent groups
- Spearman correlation
rho, p = stats.spearmanr(x, y)  # rank-based, captures monotonic association
- Simple bootstrap for median CI
import numpy as np
# resample the median 5000 times, then take the 2.5th and 97.5th percentiles
boots = [np.median(np.random.choice(data, size=len(data), replace=True)) for _ in range(5000)]
ci = np.percentile(boots, [2.5, 97.5])  # 95% bootstrap CI for the median
Tip: SciPy functions return p-values like the parametric tests; interpret them the same way, but remember nonparametric tests often have less power (harder to detect small effects).
Practical examples (real talk)
- You have customer satisfaction scores (1–5 Likert). Comparing two design prototypes? Use Mann–Whitney rather than t-test.
- You’re comparing revenue per user but distributions are long-tailed. Consider median differences with bootstrap CIs.
- A/B test with users paired across time (same users before/after feature). Wilcoxon signed-rank beats paired t-test if differences are non-normal.
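For the long-tailed revenue case, a bootstrap CI for the difference in medians might look like this — the lognormal data, sample sizes, and 2,000 resamples are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
# synthetic long-tailed "revenue per user" for two variants
rev_A = rng.lognormal(mean=3.0, sigma=1.0, size=500)
rev_B = rng.lognormal(mean=3.2, sigma=1.0, size=500)

# resample each group independently and record the median difference
boot_diffs = [
    np.median(rng.choice(rev_B, size=rev_B.size, replace=True))
    - np.median(rng.choice(rev_A, size=rev_A.size, replace=True))
    for _ in range(2000)
]
ci = np.percentile(boot_diffs, [2.5, 97.5])  # 95% CI for the median difference
print(ci)
```

If the CI excludes zero, you can report a robust difference without leaning on means that the long tail would distort.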
Pitfalls and things your stats professor will quietly sigh about
- Nonparametric doesn't mean assumption-free. Many rely on exchangeability and independence.
- You lose power: nonparametric tests can need larger samples to detect the same effect size.
- Reporting: Don’t just give p-values. Report effect sizes (median difference, rank-biserial correlation) and visuals (boxplots, violin plots, bootstrap CIs).
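For Mann–Whitney, a common effect size is the rank-biserial correlation, r = 1 − 2U/(n₁n₂). A sketch with toy values (group contents are illustrative):

```python
from scipy import stats

# hypothetical groups (illustrative values)
group_A = [3.1, 4.8, 2.9, 5.6]
group_B = [6.2, 7.1, 5.9, 6.8]

res = stats.mannwhitneyu(group_A, group_B, alternative='two-sided')
n1, n2 = len(group_A), len(group_B)
r = 1 - 2 * res.statistic / (n1 * n2)  # |r| near 1 = near-complete separation
print(r)  # 1.0 here: every A value is below every B value
```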
Visuals + robust statistics = honest storytelling. You’ve already learned to communicate insights with Matplotlib/Seaborn — now use those plots to explain why you chose nonparametric methods.
Quick summary — the nonparametric cheat-sheet
- Use nonparametric tests when normality or equal variance assumptions fail, or data is ordinal.
- Mann–Whitney (two independent groups), Wilcoxon (paired), Kruskal–Wallis (k groups), Friedman (repeated), Spearman (correlation), Bootstrap (CIs).
- Visualize first. Report effect sizes and CIs (bootstrap if necessary).
"This is the moment where the concept finally clicks": nonparametric tests are not weaker cousins of t-tests — they’re the rugged off-road vehicles for messy, real-world data. When parametric roads disappear, nonparametric tools keep you moving.
Final lab assignment (tiny, satisfying)
- Load a skewed dataset (e.g., revenue per user), plot distribution (hist, boxplot, violin).
- Compare two groups with both t-test and Mann–Whitney. Report both p-values and a bootstrap CI for the median difference. Explain why results differ.
- Write 2–3 sentences justifying the test you chose for a stakeholder who only skims emails.
Keep your plots clean, your explanations crisp, and your statistical choices defensible. If a stakeholder asks why you didn’t use a t-test, show them the violin plot. If they still ask, show them the bootstrap CI and smile.