
Supervised Machine Learning: Regression and Classification

Exploratory Data Analysis for Predictive Modeling


EDA methods tailored to supervised tasks to reveal signal, distribution shifts, and modeling risks.


Detecting Nonlinearity and Heteroscedasticity — the plot twist your model didn't see coming

"If your residual plot looks like a hairball, your model is lying to you." — probably me, loudly, in a data lab

You already explored the target distribution visually and checked for class imbalance and target weirdness (see: Visualization for Regression Targets; Visualization for Class Imbalance). You also did the sensible stuff in Data Wrangling and Feature Engineering: cleaned, encoded, scaled, and guarded against leakage. Nice. Now it's time for the emotional heart-to-heart with your model: ask whether the relationship you're modeling is linear enough to justify a plain old linear model, and whether the model's errors behave themselves.

Why this matters

  • Nonlinearity means your predictor and target have a relationship that isn't a straight line. If your model pretends it is linear, you'll get biased predictions. That feeling when your friend says they 'only drink water' and then chugs an espresso shot — same betrayal.
  • Heteroscedasticity means the variance of errors changes across levels of a predictor. If you ignore it, your uncertainty estimates and hypothesis tests will be wrong; confidence intervals will be lying little confidence liars.

Quick checklist (what we will do)

  1. Visual checks: residual vs fitted, grouped variance plots, scale-location plot.
  2. Formal tests: Breusch-Pagan, White, Goldfeld-Quandt.
  3. Remedial actions: transforms, polynomials/splines, GAMs, weighted methods, heteroscedasticity-robust inference.
  4. Special note: classification models have their own nonlinearity issues (link function, calibration).

Visual detective work (start here)

Why start visually? Because numbers lie and plots tell the truth. Visual checks are quick and often decisive.

1) Residuals vs Fitted

  • Plot: residuals on y-axis, fitted values on x-axis.
  • What to look for: a funnel shape (widening or narrowing) indicates heteroscedasticity. A systematic curve pattern indicates nonlinearity.

Code sketch:

# Python sketch. Assumes `model` is already fitted and X, y are in scope.
import matplotlib.pyplot as plt

fitted = model.predict(X)
resid = y - fitted
plt.scatter(fitted, resid, alpha=0.6)
plt.axhline(0, color='k', linestyle='--')
plt.xlabel('Fitted values')
plt.ylabel('Residuals')
plt.title('Residuals vs Fitted')
plt.show()

Ask: Is the cloud centered around zero and uniform? If not, you found a problem.

2) Scale-Location (Spread) Plot

  • Plot sqrt(|residuals|) against fitted values.
  • This makes heteroscedasticity patterns more visible.
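As a minimal sketch, on synthetic data whose noise scale grows with x by construction (so the spread trend is guaranteed to show up); swap in your own fitted values and residuals:

```python
# Scale-location plot on synthetic heteroscedastic data (illustrative only).
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless-safe backend; omit when running interactively
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 500)
y = 2.0 * x + rng.normal(0, 0.5 * x)    # error sd grows with x

slope, intercept = np.polyfit(x, y, 1)  # simple least-squares line
fitted = slope * x + intercept
resid = y - fitted

plt.scatter(fitted, np.sqrt(np.abs(resid)), alpha=0.5)
plt.xlabel("Fitted values")
plt.ylabel("sqrt(|residual|)")
plt.title("Scale-location plot")
plt.show()
```

An upward drift in this point cloud is the funnel from the residuals-vs-fitted plot made explicit.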

3) Residuals vs Predictor

Plot residuals against each important predictor. If you see curvature, your linear terms are missing the beat.

4) Binned variance plot

Group data by quantiles of a predictor, compute variance of residuals per bin, and plot. This clarifies trends when scatter is noisy.
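A sketch of that binning step, again on synthetic data where the variance trend is known; quantile edges via np.quantile are one reasonable choice:

```python
# Binned residual variance: variance per quantile bin of a predictor.
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless-safe backend; omit when running interactively
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 2000)
y = 1.5 * x + rng.normal(0, 0.3 * x)    # variance increases with x

slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)

n_bins = 10
edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
bin_idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
bin_var = np.array([resid[bin_idx == b].var() for b in range(n_bins)])

plt.plot(range(n_bins), bin_var, marker="o")
plt.xlabel("Quantile bin of x")
plt.ylabel("Residual variance")
plt.show()
```

A monotone climb across bins is hard to dismiss as noise, even when the raw scatter looks ambiguous.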


Formal statistical tests (they won't replace plots)

  • Breusch-Pagan test: tests whether residual variance can be explained by predictors. Good general-purpose test.
  • White test: allows for nonlinearity in variance, tests more general specifications.
  • Goldfeld-Quandt test: compares variance across two subsamples; useful if you suspect variance increases with a predictor.

Remember: tests can be sensitive to non-normality of errors and outliers. Use them as complements to plots, not scripture.


Detecting nonlinearity more formally

  • Partial dependence plots (PDPs) and individual conditional expectation (ICE) plots: great for black-box models, but useful even with linear models to see if relationship looks straight.
  • Component plus residual (partial residual) plots: reveal whether adding polynomial terms might help.
  • Correlation + scatter + loess smoother: fit a lowess curve; if it bulges, you need nonlinear features.

Quick code idea for lowess:

# Assumes 1-D arrays x and y are in scope
from statsmodels.nonparametric.smoothers_lowess import lowess
import matplotlib.pyplot as plt

smoothed = lowess(y, x, frac=0.3)         # returns (x, yhat) pairs sorted by x
plt.plot(smoothed[:, 0], smoothed[:, 1])  # plot the sorted pairs, not raw x
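The component-plus-residual idea can also be built by hand: for feature j, plot resid + beta_j * x_j against x_j, and curvature tells you the linear term is inadequate. A sketch on synthetic data with a truly quadratic effect:

```python
# Partial residual (component + residual) plot, built by hand on synthetic data.
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless-safe backend; omit when running interactively
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
x1 = rng.uniform(-3, 3, 400)
x2 = rng.normal(size=400)
y = x1**2 + 0.5 * x2 + rng.normal(0, 0.3, 400)   # true x1 effect is quadratic

X = np.column_stack([np.ones_like(x1), x1, x2])  # [intercept, x1, x2]
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

partial_resid = resid + beta[1] * x1             # add back x1's linear component
plt.scatter(x1, partial_resid, alpha=0.4)
plt.xlabel("x1")
plt.ylabel("Partial residual")
plt.show()
```

The U-shape that appears here says: add an x1^2 term (or a spline in x1) before trusting the linear fit.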

Remedies and when to use them

| Problem | Quick fix | When to use |
|---|---|---|
| Nonlinearity (mild) | Add polynomial terms (x^2, x^3) | Shape is a simple curve, few features |
| Nonlinearity (complex) | Splines, regression trees, GAMs | Curve is wiggly or you want interpretable smoothness |
| Heteroscedasticity | Transform the target (log, Box-Cox) or Weighted Least Squares | Variance grows with level; a transform can stabilize it |
| Heteroscedasticity (inference) | Robust SEs (HC0-HC3), bootstrap | You only need correct CIs/p-values |

Notes:

  • Transforming the target can fix both nonlinearity and heteroscedasticity at once (log often tames multiplicative error patterns). But remember interpretability changes.
  • Weighted Least Squares gives more weight to observations with lower variance; requires estimating a weight function (often via modeling residual variance).
  • Generalized Additive Models (GAMs) are elegant: they model nonlinearity with smooth functions and can also model variance if extended (e.g., mgcv in R can fit location-scale models).

Classification models: the twist

You still care about nonlinearity: if the logit link doesn't fit, predicted probabilities can be systematically off (miscalibration). Diagnostics:

  • Calibration plot: bin predicted probabilities and compare the mean predicted probability in each bin to the observed positive rate.
  • Residual-like checks: deviance residuals vs predictors.

Remedies: add nonlinear terms, use tree-based models, or recalibrate probabilities (isotonic regression, Platt scaling).
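A scikit-learn sketch of both the check and the fix. Gaussian naive Bayes is a convenient example of an often-miscalibrated model; calibration_curve exposes the problem and CalibratedClassifierCV with isotonic regression repairs it. The dataset is synthetic via make_classification:

```python
# Calibration check and isotonic recalibration with scikit-learn.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=4000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

base = GaussianNB().fit(X_tr, y_tr)
prob_base = base.predict_proba(X_te)[:, 1]
frac_pos, mean_pred = calibration_curve(y_te, prob_base, n_bins=10)  # plot these two

calibrated = CalibratedClassifierCV(GaussianNB(), method="isotonic", cv=5).fit(X_tr, y_tr)
prob_cal = calibrated.predict_proba(X_te)[:, 1]

brier_base = brier_score_loss(y_te, prob_base)
brier_cal = brier_score_loss(y_te, prob_cal)
print("Brier before:", brier_base, "after:", brier_cal)
```

Plotting frac_pos against mean_pred (with the diagonal for reference) gives the calibration plot described above; the Brier score summarizes the same gap as one number.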


Practical workflow (do this in order)

  1. Fit your baseline model (after proper splitting/cross-validation!).
  2. Plot residuals vs fitted and residuals vs key predictors. Ask: curve? funnel? both?
  3. Fit a lowess smoother or partial residual plot to confirm nonlinearity.
  4. Run Breusch-Pagan to test heteroscedasticity if visual signs exist.
  5. Try a simple transform (log or Box-Cox). Re-evaluate.
  6. If transform insufficient, try polynomial/spline or a flexible model like GAM or tree ensembles.
  7. For inference, switch to robust SEs or WLS as needed.
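A sketch of step 5 on synthetic multiplicative-noise data: compare top-bin vs bottom-bin residual variance before and after a log transform. The helper binned_var_ratio is ad hoc for this example, not a library function:

```python
# Does a log transform stabilize the residual spread? Synthetic check.
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(1, 10, 1000)
y = np.exp(0.3 * x) * rng.lognormal(0.0, 0.3, 1000)  # multiplicative errors

def binned_var_ratio(x, resid, n_bins=5):
    """Residual variance in the top x-bin divided by the bottom x-bin."""
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    return resid[idx == n_bins - 1].var() / resid[idx == 0].var()

b1, b0 = np.polyfit(x, y, 1)            # raw scale: funnel-shaped residuals
ratio_raw = binned_var_ratio(x, y - (b1 * x + b0))

log_y = np.log(y)                       # log scale: linear and homoscedastic
c1, c0 = np.polyfit(x, log_y, 1)
ratio_log = binned_var_ratio(x, log_y - (c1 * x + c0))

print(f"top/bottom variance ratio: raw={ratio_raw:.1f}, log={ratio_log:.1f}")
```

A ratio near 1 after transforming is your green light; if it stays large, move on to steps 6-7.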

Closing mic drop

Nonlinearity and heteroscedasticity are not bugs in the data, they're features of reality refusing to be simplified. Your job is to listen: plot, test, and adapt. Start with visual empathy, then apply formal tools, and only then choose a remedy that balances accuracy and interpretability.

Key takeaways:

  • Always look at residuals; they will whisper the truth long before your metrics scream it.
  • Use transforms, splines, GAMs, or robust methods depending on severity and your goals.
  • For classification, pay special attention to calibration and link function adequacy.

Final thought: models are like friends — they work best when you accept their quirks and tailor your expectations. Fit the relationship, not the ego.
