
Python for Data Science, AI & Development

Statistics and Probability for Data Science


Develop statistical intuition for inference, experimentation, and uncertainty-aware decisions.


Nonparametric Tests for Data Science: When and How to Use

Nonparametric Tests: When Your Data Hates Normality (and What to Do About It)

“The t-test and ANOVA are great — until your data shows up in sweatpants.”

You just learned t-tests, ANOVA, and confidence intervals — neat tools when your residuals are behaving (i.e., roughly normal, homoscedastic, independent). But real-world data often refuses to dress up for the occasion: skewed distributions, outliers, ordinal measures, tiny sample sizes. That’s where nonparametric tests come in — the toolbox for when assumptions fail, or when you just don’t trust means.


What are nonparametric tests? (Short, sweet, and dramatic)

Nonparametric tests are statistical methods that don't assume a specific parametric form (like normality) for the population distribution. Instead of modeling means under normal assumptions, they often use medians, ranks, or resampling. They’re robust, flexible, and a little rebellious.

Why they matter: In data science you’ll see skewed revenue, ordinal survey responses (“agree / neutral / disagree”), and tiny A/B test buckets. Use nonparametric tests when parametric assumptions (normality, equal variances) are violated or when your metric is ordinal.


Quick reminder: When you might prefer nonparametric over t-tests/ANOVA

  • Small sample sizes (n < ~30) and non-normal distributions
  • Heavy outliers that distort means
  • Ordinal data (Likert scales)
  • Heteroscedasticity that can’t be fixed by transformation

Remember: earlier, we used visual checks (from Data Visualization & Storytelling) — histograms, QQ-plots, boxplots, and violin plots — to verify normality and spot outliers. If those visuals scream “nope”, nonparametric tests are your friend.
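As a numeric companion to those visual checks, skewness and a normality test can flag trouble programmatically. A minimal sketch, assuming simulated lognormal "revenue" data (the parameters and seed are made up for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Simulated "revenue per user": lognormal data is right-skewed on purpose
revenue = rng.lognormal(mean=3.0, sigma=1.0, size=200)

skewness = stats.skew(revenue)                    # ~0 for normal data, large if skewed
shapiro_stat, shapiro_p = stats.shapiro(revenue)  # tiny p-value => "nope, not normal"

print(f"skewness={skewness:.2f}, Shapiro-Wilk p={shapiro_p:.2e}")
```

Numbers like these back up what the histogram and QQ-plot already show; they don't replace the plots.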


The main nonparametric tests you’ll use (and when to pick them)

  1. Mann–Whitney U test (a.k.a. Wilcoxon rank-sum)

    • Use: Compare two independent groups (like two separate A/B variants) when you can’t assume normality.
    • Works with: continuous or ordinal data.
    • Intuition: Rank all observations across both groups; test whether ranks differ between groups.
  2. Wilcoxon signed-rank test

    • Use: Paired data (before/after on same users) where normality for paired differences is suspect.
    • Intuition: Rank absolute differences and consider signs — testing if median difference is zero.
  3. Kruskal–Wallis H test

    • Use: More than two independent groups (nonparametric alternative to one-way ANOVA).
    • Intuition: Generalizes rank-based comparison to k groups.
  4. Friedman test

    • Use: Repeated measures (like ANOVA repeated measures) when normality fails.
    • Intuition: Ranks within each block (subject) and compares treatments across blocks.
  5. Spearman rank correlation

    • Use: Correlation when linearity or normality is questionable; measures monotonic relationships.
  6. Sign test (a very simple test of the median)

    • Use: Extremely simple paired test based solely on direction (sign) of differences — very robust but low power.
  7. Bootstrap methods (nonparametric resampling)

    • Use: Estimate confidence intervals for medians, percentiles, or complex statistics when analytic CIs aren’t available.
    • Intuition: Resample your data with replacement many times and use the empirical distribution to build CIs — remember our earlier lessons on confidence intervals? Bootstrapping builds them without parametric assumptions.
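To make the rank intuition behind Mann–Whitney concrete, you can reproduce the U statistic by hand from pooled ranks. A sketch with toy numbers (not real data):

```python
import numpy as np
from scipy import stats

group_A = np.array([1.2, 3.4, 2.2, 5.1, 2.9])
group_B = np.array([2.0, 4.8, 6.3, 5.5, 7.1])

# Pool both groups, rank everything together, then sum the ranks belonging to A
ranks = stats.rankdata(np.concatenate([group_A, group_B]))
R_A = ranks[:len(group_A)].sum()
n_A = len(group_A)
U_by_hand = R_A - n_A * (n_A + 1) / 2  # classic rank-sum formula for U

U_scipy, p = stats.mannwhitneyu(group_A, group_B, alternative='two-sided')
print(U_by_hand, U_scipy)
```

Same statistic both ways: the test never looks at the raw values, only at who outranks whom.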

How to choose: a mini decision tree

  1. Is your outcome ordinal or non-normal? → Consider nonparametric.
  2. Are groups independent? → Mann–Whitney (2 groups) or Kruskal–Wallis (k groups).
  3. Are observations paired/repeated? → Wilcoxon signed-rank or Friedman.
  4. Need correlation? → Spearman.
  5. Want CI for median or complex stat? → Bootstrap.
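The decision tree above is mechanical enough to encode as a tiny helper. `choose_test` is a hypothetical function name for this sketch, not a library API:

```python
def choose_test(paired: bool, n_groups: int,
                want_correlation: bool = False,
                want_median_ci: bool = False) -> str:
    """Map the mini decision tree to a recommended nonparametric tool."""
    if want_median_ci:
        return "bootstrap"
    if want_correlation:
        return "spearman"
    if paired:
        return "wilcoxon signed-rank" if n_groups == 2 else "friedman"
    return "mann-whitney" if n_groups == 2 else "kruskal-wallis"

print(choose_test(paired=False, n_groups=2))  # mann-whitney
print(choose_test(paired=True, n_groups=3))   # friedman
```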

Tiny Python recipes (so you can stop reading and start testing)

Note: We previously used Matplotlib/Seaborn to explore distributions. Visualize first! Then run these.

  • Mann–Whitney U (scipy)

      from scipy import stats
      u_stat, p = stats.mannwhitneyu(group_A, group_B, alternative='two-sided')

  • Wilcoxon signed-rank (paired)

      w_stat, p = stats.wilcoxon(before, after)

  • Kruskal–Wallis (k groups)

      h_stat, p = stats.kruskal(group1, group2, group3)

  • Spearman correlation

      rho, p = stats.spearmanr(x, y)

  • Simple bootstrap for a median CI

      import numpy as np
      boots = [np.median(np.random.choice(data, size=len(data), replace=True))
               for _ in range(5000)]
      ci = np.percentile(boots, [2.5, 97.5])

Tip: These SciPy functions return p-values just like their parametric counterparts, and you interpret them the same way. But remember that nonparametric tests often have less power when parametric assumptions actually hold, so small effects are harder to detect.


Practical examples (real talk)

  • You have customer satisfaction scores (1–5 Likert). Comparing two design prototypes? Use Mann–Whitney rather than t-test.
  • You’re comparing revenue per user but distributions are long-tailed. Consider median differences with bootstrap CIs.
  • A/B test with users paired across time (same users before/after feature). Wilcoxon signed-rank beats paired t-test if differences are non-normal.

Pitfalls and things your stats professor will quietly sigh about

  • Nonparametric doesn't mean assumption-free. Many rely on exchangeability and independence.
  • You lose power: nonparametric tests can need larger samples to detect the same effect size.
  • Reporting: Don’t just give p-values. Report effect sizes (median difference, rank-biserial correlation) and visuals (boxplots, violin plots, bootstrap CIs).
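The rank-biserial correlation mentioned above follows directly from the Mann–Whitney U statistic. A sketch with toy data; note the sign convention used here (positive means group_A tends to be larger) is one of several in use:

```python
import numpy as np
from scipy import stats

group_A = np.array([1, 2, 3, 4, 5], dtype=float)
group_B = np.array([3, 4, 5, 6, 7], dtype=float)

U, p = stats.mannwhitneyu(group_A, group_B, alternative='two-sided')
n_A, n_B = len(group_A), len(group_B)

# Rank-biserial r: +1 if every A value beats every B value, -1 if the reverse
rank_biserial = 2 * U / (n_A * n_B) - 1
print(f"U={U}, rank-biserial r={rank_biserial:.2f}")
```

A negative r here says group_A tends to rank below group_B, which is an effect size a stakeholder can actually picture, unlike a bare p-value.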

Visuals + robust statistics = honest storytelling. You’ve already learned to communicate insights with Matplotlib/Seaborn — now use those plots to explain why you chose nonparametric methods.


Quick summary — the nonparametric cheat-sheet

  • Use nonparametric tests when normality or equal variance assumptions fail, or data is ordinal.
  • Mann–Whitney (two independent groups), Wilcoxon (paired), Kruskal–Wallis (k groups), Friedman (repeated), Spearman (correlation), Bootstrap (CIs).
  • Visualize first. Report effect sizes and CIs (bootstrap if necessary).

"This is the moment where the concept finally clicks": nonparametric tests are not weaker cousins of t-tests — they’re the rugged off-road vehicles for messy, real-world data. When parametric roads disappear, nonparametric tools keep you moving.


Final lab assignment (tiny, satisfying)

  1. Load a skewed dataset (e.g., revenue per user), plot distribution (hist, boxplot, violin).
  2. Compare two groups with both t-test and Mann–Whitney. Report both p-values and a bootstrap CI for the median difference. Explain why results differ.
  3. Write 2–3 sentences justifying the test you chose for a stakeholder who only skims emails.

Keep your plots clean, your explanations crisp, and your statistical choices defensible. If a stakeholder asks why you didn’t use a t-test, show them the violin plot. If they still ask, show them the bootstrap CI and smile.
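A starting skeleton for the lab, assuming simulated lognormal "revenue" in place of a real dataset (group sizes, parameters, and seed are arbitrary; the plotting from step 1 is left to you):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
control = rng.lognormal(mean=3.0, sigma=1.0, size=80)
variant = rng.lognormal(mean=3.2, sigma=1.0, size=80)

# Step 2: run both tests on the same (skewed) data
t_stat, t_p = stats.ttest_ind(control, variant)
u_stat, u_p = stats.mannwhitneyu(control, variant, alternative='two-sided')

# Bootstrap 95% CI for the difference in group medians
diffs = [
    np.median(rng.choice(variant, size=len(variant), replace=True))
    - np.median(rng.choice(control, size=len(control), replace=True))
    for _ in range(5000)
]
ci_low, ci_high = np.percentile(diffs, [2.5, 97.5])

print(f"t-test p={t_p:.3f}, Mann-Whitney p={u_p:.3f}, "
      f"median-diff 95% CI=({ci_low:.1f}, {ci_high:.1f})")
```

If the two p-values disagree, the skewed tails are usually the culprit: the t-test is chasing means that a few large values drag around, while the rank test is not.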
