
Python for Data Science, AI & Development

Machine Learning with scikit-learn


Build, tune, and evaluate models using scikit-learn pipelines with reproducible ML workflows.


Regression Metrics in scikit-learn — How to judge continuous predictions

You already learned how to evaluate classifiers and split your data cleanly, so you know how not to lie with accuracy. Welcome to the regression edition, where everything is a number and everyone cries into spreadsheets.

In the previous sections we covered classification metrics and cross-validation strategies. We also built statistical intuition from the "Statistics and Probability for Data Science" chapter — so you already understand variance, bias, sampling noise, and why confidence matters. Now we apply that intuition to regression metrics — the measures that tell you whether your model’s continuous predictions are useful.


What are regression metrics and why they matter

  • Regression metrics quantify the difference between predicted and actual continuous values.
  • They answer questions like: How wrong is my model on average? How much of the variance did it explain? Is the error sensitive to outliers?

Why this matters (real-world examples):

  • Predicting house prices: a high RMSE can mean you are off by tens of thousands of dollars on a typical sale. Not good.
  • Forecasting demand: a biased model could lead to understock or waste.
  • Scientific measurements: understanding uncertainty ties back to inference and hypothesis testing from earlier chapters.

Key metrics you’ll use (and when to use them)

1) Mean Squared Error (MSE)

Definition: average of squared residuals.

Why it’s used:

  • Penalizes larger errors strongly (because of squaring).
  • Mathematically convenient (differentiable) — often the loss used to train regressors.

Sklearn: mean_squared_error(y_true, y_pred)

Pros: smooth and optimizable. Cons: reported in the squared units of the target, which makes it awkward to interpret.
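
To see the effect of squaring, compare two prediction sets with the same total absolute error. A minimal sketch with made-up numbers:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([10.0, 20.0, 30.0])

# Same total absolute error (6.0), distributed differently.
pred_spread = y_true + np.array([2.0, 2.0, 2.0])  # three moderate misses
pred_spiked = y_true + np.array([0.0, 0.0, 6.0])  # one large miss

mse_spread = mean_squared_error(y_true, pred_spread)  # (4 + 4 + 4) / 3 = 4.0
mse_spiked = mean_squared_error(y_true, pred_spiked)  # (0 + 0 + 36) / 3 = 12.0
print(mse_spread, mse_spiked)
```

MAE would rate both models identically (2.0); MSE triples for the spiked one, because the squaring concentrates the penalty on the single large miss.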

2) Root Mean Squared Error (RMSE)

Definition: sqrt(MSE). Same units as the target — easier to interpret.

Use when: you care about large errors and want interpretable scale.

Sklearn: mean_squared_error(y_true, y_pred, squared=False) (on scikit-learn ≥ 1.4, prefer root_mean_squared_error(y_true, y_pred); the squared parameter is deprecated there)

3) Mean Absolute Error (MAE)

Definition: average absolute difference between predictions and truth.

Why it’s useful:

  • Robust to outliers compared to MSE/RMSE.
  • Interpretable: average absolute error in same units as target.

Sklearn: mean_absolute_error(y_true, y_pred)
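
A quick way to feel the difference between MAE and RMSE: nine small errors plus one outlier. Synthetic numbers, just for illustration:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.zeros(10)
y_pred = np.array([1.0] * 9 + [10.0])  # nine unit errors, one outlier

mae = mean_absolute_error(y_true, y_pred)         # (9 * 1 + 10) / 10 = 1.9
rmse = mean_squared_error(y_true, y_pred) ** 0.5  # sqrt((9 * 1 + 100) / 10) ≈ 3.3
print(mae, rmse)
```

The single outlier barely moves MAE but nearly doubles RMSE relative to it, which is exactly the robustness trade-off described above.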

4) R-squared (R²)

Definition (intuitively): proportion of variance in y explained by the model.

  • R² = 1 → perfect fit.
  • R² = 0 → the model does no better than predicting the mean.
  • R² < 0 → the model is worse than the mean predictor (this can and does happen).

Sklearn: r2_score(y_true, y_pred)

Caveat: R² doesn't tell you about bias, heteroscedasticity, or goodness for forecasting — it’s about variance explained on the evaluation data.
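
A tiny demonstration of the edge cases above, using hand-picked numbers:

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([1.0, 2.0, 3.0, 4.0])

# Predicting the mean of y_true yields R² = 0 by definition.
mean_pred = np.full_like(y_true, y_true.mean())
r2_mean = r2_score(y_true, mean_pred)  # 0.0

# A systematically biased model does worse than the mean: R² < 0.
bad_pred = y_true + 3.0
r2_bad = r2_score(y_true, bad_pred)
print(r2_mean, r2_bad)
```

The biased predictor tracks the shape of y perfectly but misses by a constant offset, and R² punishes it below zero anyway, a reminder that R² compares you to the mean predictor, not to perfection.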

5) Explained Variance

Definition: how much of the variance of y is captured by the predictions (similar to R² but subtly different for some edge cases).

Sklearn: explained_variance_score(y_true, y_pred)

6) Mean Absolute Percentage Error (MAPE)

Definition: mean of |(y_true - y_pred) / y_true|.

Be careful: division by zero issues, and it punishes small true values heavily. Use only when target is strictly positive and percent error interpretation is desired.

Sklearn: mean_absolute_percentage_error(y_true, y_pred)
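
Here is how the small-denominator problem bites: the same absolute error on large and tiny targets (synthetic values):

```python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error

offset = 0.5  # identical absolute error on every point

y_large = np.array([100.0, 200.0, 300.0])
y_small = np.array([0.1, 0.2, 0.3])

mape_large = mean_absolute_percentage_error(y_large, y_large + offset)
mape_small = mean_absolute_percentage_error(y_small, y_small + offset)
print(mape_large, mape_small)  # the small-target MAPE is ~1000x larger
```

Same absolute miss, wildly different MAPE, so only reach for it when the percent-error framing genuinely matches the business question.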


Quick comparison table

Metric   Sensitive to outliers?    Units            Good when...
MSE      Yes (high)                squared units    you want to penalize big errors / use as a training loss
RMSE     Yes (high)                original units   you want interpretability plus a penalty on large errors
MAE      Moderate                  original units   you want robustness to outliers
R²       N/A                       unitless         you want to measure variance explained
MAPE     Yes, and scale-sensitive  percent          you want a percent-error reading (positive targets only)

Code cheat-sheet (scikit-learn)

from sklearn.metrics import (
    mean_squared_error, mean_absolute_error,
    r2_score, explained_variance_score,
    mean_absolute_percentage_error
)

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

mse = mean_squared_error(y_true, y_pred)
rmse = mean_squared_error(y_true, y_pred, squared=False)
mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)

print('MSE', mse, 'RMSE', rmse, 'MAE', mae, 'R2', r2)

Micro explanation: squared=False returns RMSE directly; on scikit-learn ≥ 1.4, prefer root_mean_squared_error(y_true, y_pred), since the squared parameter is deprecated there.


Practical pitfalls and statistical connections

  • Scale dependence: MSE/RMSE/MAE are in the units of the target. If you standardize/scale your target, metric values change — compare across models on the same target scale only.
  • Outliers: MSE/RMSE exaggerate outliers (squared term). If your residuals have heavy tails (remember distributional intuition from Statistics chapter), MAE or median absolute error may be better.
  • R² misinterpretation: High R² doesn't guarantee low errors if your target has low variance. Conversely, negative R² signals the model is worse than predicting the mean.
  • Cross-validation / scoring API: scikit-learn’s cross_val_score and GridSearchCV expect a score where higher is better. Many regression metrics are loss-like (lower is better). Sklearn exposes negative versions: e.g., scoring='neg_mean_squared_error'. After CV you’ll usually take the negative of the reported score to interpret MSE.

Example:

from sklearn.model_selection import cross_val_score
scores = cross_val_score(model, X, y, scoring='neg_root_mean_squared_error', cv=5)
rmse_cv = -scores.mean()

  • Heteroscedasticity: If residual variance changes across X, pointwise error metrics miss that structure. Visualize residuals! The statistical tools you learned (residual plots, tests for equal variance) help diagnose this.
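
A numeric sketch of that diagnostic (no plotting): generate synthetic residuals whose spread grows with x, then compare spread across the range. All names and numbers here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 500)
# Residuals whose standard deviation grows with x: heteroscedastic by construction.
residuals = rng.normal(0.0, 0.1 + 0.2 * x)

# A single overall error number hides this; binned spread reveals it.
low_std = residuals[x < 5].std()    # spread in the low-x half
high_std = residuals[x >= 5].std()  # spread in the high-x half
print(low_std, high_std)
```

The overall RMSE of these residuals would look like one number, while the binned view shows the high-x half is clearly noisier, which is exactly the structure a residual plot would make visible.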

Practical workflow — what to report and why

  1. Always report at least two metrics: one scale-aware (RMSE or MAE) and one relative (R²).
  2. Use RMSE when big errors are especially bad; use MAE for robustness.
  3. If your business cares about percentages, use MAPE only when target > 0 and you understand its bias.
  4. Always show residual plots and distribution (histogram or QQ-plot). Metrics alone lie.
  5. When tuning with cross-validation, use sklearn’s negative-loss scoring and convert back to positive losses for interpretation.
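
The reporting advice above can be packaged as a small helper. This regression_report function is a hypothetical convenience for illustration, not a scikit-learn API:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

def regression_report(y_true, y_pred):
    """Return two scale-aware metrics (RMSE, MAE) and one relative metric (R²)."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "rmse": float(np.sqrt(mse)),
        "mae": float(mean_absolute_error(y_true, y_pred)),
        "r2": float(r2_score(y_true, y_pred)),
    }

report = regression_report([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0])
print(report)
```

Returning a dict keeps the report easy to log, compare across models, or dump to JSON alongside residual plots.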

Quick checklist before you ship a model

  • Did you compute RMSE and MAE (or whichever suits your objective)?
  • Did you compute R² or explained variance for relative performance?
  • Did you plot residuals and check for heteroscedasticity?
  • Did you compare against a simple baseline (mean predictor) and confirm positive gain?
  • Did you think about outliers and whether your metric is robust to them?
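
For the baseline check, scikit-learn's DummyRegressor predicts the training mean; any real model should beat it. A sketch on synthetic data (evaluated in-sample for brevity; use a held-out split in practice):

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(42)
X = rng.uniform(0.0, 10.0, size=(200, 1))
y = 3.0 * X.ravel() + rng.normal(0.0, 1.0, size=200)  # linear signal + noise

baseline = DummyRegressor(strategy="mean").fit(X, y)  # always predicts mean(y)
model = LinearRegression().fit(X, y)

mae_baseline = mean_absolute_error(y, baseline.predict(X))
mae_model = mean_absolute_error(y, model.predict(X))
print(mae_baseline, mae_model)
```

If your model's error is not clearly below the dummy's, the metrics are telling you it has learned essentially nothing.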

Final takeaways — the memorable insight

Metrics are not just numbers, they're narratives. RMSE screams about big mistakes, MAE speaks for the typical prediction, and R² tells you how much of the story your model explains. Use more than one metric, visualize residuals, and always compare to a naive baseline.

"A single metric gives you a number. A set of metrics with residual plots gives you truth."

If you want, next we can: code a small function that returns a neat metric report for any regression model (with plots), or walk through interpreting metrics from a real dataset — your call.

