
Supervised Machine Learning: Regression and Classification
Exploratory Data Analysis for Predictive Modeling


EDA methods tailored to supervised tasks to reveal signal, distribution shifts, and modeling risks.


Visualization for Regression Targets — The Fun, Scientific Part of Staring at Charts

You're past the low-level reconnaissance: you've looked at univariate distributions and summary statistics, peeked at pairwise relationships and correlations, and wrestled with messy data during Data Wrangling and Feature Engineering. Now it's showtime: visualize how predictors actually relate to your continuous target so your model doesn't learn nonsense.


Why this chapter matters (quick reminder)

You're not just making pretty plots. Good visual exploration:

  • Reveals nonlinearities you should model (or transform).
  • Exposes heteroscedasticity (variance changing with X) and outliers that ruin MSEs.
  • Suggests useful feature transformations or binnings without leaking the test set.

Think of this as the tactical reconnaissance before you deploy predictive artillery.


Core plots and when to use them

Below: the high-utility toolkit for regression-target visualization, with quick notes on the what, why, and how.

1) Scatter plot + smoother (LOESS / LOWESS)

  • Use when predictor is continuous. Shows shape: linear, quadratic, plateau, threshold.
  • Plot recipe: raw points (alpha ≤ 0.2 for big datasets) + a smoothed trend line + a linear fit for comparison.

Code sketch (seaborn):

import seaborn as sns

sns.scatterplot(x='sqft', y='price', data=df, alpha=0.3)  # raw points
# LOWESS smoother on top (requires statsmodels to be installed)
sns.regplot(x='sqft', y='price', data=df, lowess=True, scatter=False, color='red')

2) Hexbin / 2D density for overplotting

  • For huge datasets: scatter becomes soup. Use hexbin (matplotlib) or sns.kdeplot(… , fill=True).
  • Colors convey density; still overlay smoothing if useful.
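A minimal sketch of the hexbin approach on synthetic data — the `x`/`y` arrays here are placeholders for your own predictor and target:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs in scripts/CI
import matplotlib.pyplot as plt

# Synthetic stand-ins for a predictor and a continuous target
rng = np.random.default_rng(0)
x = rng.normal(size=50_000)
y = 2 * x + rng.normal(size=50_000)

fig, ax = plt.subplots()
hb = ax.hexbin(x, y, gridsize=40, mincnt=1, cmap="viridis")  # skip empty cells
fig.colorbar(hb, ax=ax, label="points per hex")
ax.set_xlabel("x")
ax.set_ylabel("y")
```

With `mincnt=1`, empty hexes are dropped entirely, which keeps the background clean instead of painting it with zero-count cells.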

3) Binned-aggregates (bin numeric X -> mean Y + CI)

  • When you want an interpretable summary: bin X (quantiles or equal-width), plot mean(Y) with error bars or boxplots.
  • Great for revealing monotonic trends hidden by noise.

Pseudocode:

import numpy as np
import pandas as pd

df['x_bin'] = pd.qcut(df['x'], 10)  # decile bins of x
agg = df.groupby('x_bin', observed=True)['y'].agg(['mean', 'std', 'count'])
agg['se'] = agg['std'] / np.sqrt(agg['count'])  # standard error per bin

4) Residual vs Fitted (early model-based EDA)

  • Fit a quick simple model (linear, tree) and plot residuals vs fitted values.
  • Use to catch heteroscedasticity, nonlinearity, and clusters of bad fits.

import matplotlib.pyplot as plt
import seaborn as sns

fitted = model.predict(X)   # predictions from the quick baseline model
resid = y - fitted          # residuals: actual minus predicted
sns.scatterplot(x=fitted, y=resid, alpha=0.3)
plt.axhline(0, color='k', linestyle='--')  # reference line at zero residual

Note: model-based plots are allowed in EDA — but be explicit about what you fit and why.

5) Categorical X vs Continuous Y: boxplots, violin + swarm

  • For categorical predictors: use boxplots to see medians and spread; violins to see distribution shape; add swarm/jittered points when not too many observations.
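A hedged sketch of the violin-plus-counts idea on made-up data — the `ptype`/`price` columns are invented for illustration:

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt
import seaborn as sns

rng = np.random.default_rng(3)
# Invented example: price distributions differ by property type
df = pd.DataFrame({
    "ptype": np.repeat(["condo", "house", "villa"], 300),
    "price": np.concatenate([
        rng.normal(200, 30, 300),    # condos: cheap, tight spread
        rng.normal(350, 60, 300),    # houses: mid-range, wider
        rng.normal(600, 120, 300),   # villas: expensive, widest
    ]),
})

fig, ax = plt.subplots()
sns.violinplot(data=df, x="ptype", y="price", inner="box", ax=ax)
counts = df["ptype"].value_counts()  # annotate counts so rare groups stand out
for i, cat in enumerate(["condo", "house", "villa"]):
    ax.text(i, df["price"].max(), f"n={counts[cat]}", ha="center")
```

The count annotations matter: a violin drawn from 12 observations looks just as confident as one drawn from 1,200.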

6) Interaction plots / Facets

  • Facet by a categorical variable to visualize conditional relationships (e.g., sqft vs price by neighborhood).
  • Use sns.FacetGrid to make clean multi-panel comparisons.
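One way this might look with `sns.FacetGrid`, on synthetic data where the slope genuinely differs by neighborhood (column names and coefficients are illustrative):

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend
import seaborn as sns

rng = np.random.default_rng(5)
n = 400
# Invented data: price-per-sqft slope differs by neighborhood
df = pd.DataFrame({
    "sqft": rng.uniform(500, 3000, 2 * n),
    "neighborhood": np.repeat(["downtown", "suburb"], n),
})
slope = np.where(df["neighborhood"] == "downtown", 0.5, 0.2)
df["price"] = slope * df["sqft"] + rng.normal(scale=100, size=2 * n)

# One panel per neighborhood: the differing slopes become obvious
g = sns.FacetGrid(df, col="neighborhood", height=3)
g.map_dataframe(sns.scatterplot, x="sqft", y="price", alpha=0.3)
g.set_axis_labels("sqft", "price")
```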

7) Transformations: before-and-after plots

  • If target or predictor is skewed, visualize relationship before and after log / Box–Cox / Yeo–Johnson transforms.
  • Plot both panels side-by-side: often linearity improves and variance stabilizes.
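A quick numeric check of the same idea: skewness before and after a log transform on a simulated right-skewed target (the prices are generated, not real data):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Simulated right-skewed target, shaped roughly like house prices
price = pd.Series(rng.lognormal(mean=12, sigma=0.5, size=10_000))

raw_skew = price.skew()           # strongly positive for lognormal data
log_skew = np.log(price).skew()   # near zero after the transform
print(f"skew before log: {raw_skew:.2f}, after log: {log_skew:.2f}")
```

Pair this number with the side-by-side plots: the statistic confirms what the panels show, and it's easy to drop into a report.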

Practical recipe: A step-by-step checklist (what to plot, in order)

  1. Reconfirm the target distribution (histogram, skewness). If heavily skewed, consider a log transform and re-plot.
  2. For each continuous predictor X_i:
    • Scatter + LOESS vs target (subsample if >100k rows).
    • Hexbin/2D density if overplot.
    • Binned mean ± SE to get a smoothed, interpretable signal.
  3. For categorical predictors C_j:
    • Boxplot and violin of target by category. Add count annotation.
    • If many categories, sort by median target and consider grouping rare levels into "other".
  4. Quick linear model fit per predictor (univariate) → store slope, R², and residual plot. This helps prioritize features.
  5. Multi-panel faceting for plausible interactions (e.g., sqft * neighborhood).
  6. Check heteroscedasticity: residuals vs fitted (from a quick multivariate model).
  7. Inspect high-leverage/outlier points (scatter + label suspicious ids). Decide: fix, transform, or keep.
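Step 4 above might look like this sketch: fit a one-variable linear model per predictor and rank by R² (the data and coefficients are fabricated for the example):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 2_000
df = pd.DataFrame({
    "sqft": rng.uniform(500, 4000, n),
    "age": rng.uniform(0, 100, n),
    "noise": rng.normal(size=n),
})
# Target driven mostly by sqft, weakly by age, not at all by noise
y = 0.002 * df["sqft"] - 0.01 * df["age"] + rng.normal(scale=0.5, size=n)

rows = []
for col in df.columns:
    slope, intercept = np.polyfit(df[col], y, 1)   # univariate linear fit
    pred = slope * df[col] + intercept
    r2 = 1 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()
    rows.append({"feature": col, "slope": slope, "r2": r2})

ranking = pd.DataFrame(rows).sort_values("r2", ascending=False)
print(ranking)
```

This only captures linear univariate signal — a feature with a strong U-shaped relationship would score near zero here, which is exactly why you pair it with the scatter + LOESS plots.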

Table: Which plot to pick (cheat sheet)

| Goal | Predictor type | Plot(s) |
| --- | --- | --- |
| See shape of relationship | continuous | Scatter + LOESS; hexbin if dense |
| Quick summary of trend | continuous | Binned mean ± CI |
| Categorical effect | categorical | Boxplot / violin + swarm |
| Heteroscedasticity / nonlinearity | any (model-based) | Residual vs fitted |
| Interactions | mix | FacetGrid / interaction plots |

Little heuristics and gotchas (because data punishes the unwise)

  • Use alpha and point size to combat overplotting; sample for exploratory speed but keep a reproducible sample seed.
  • Label your axes and include units. "x" and "y" are not good labels, humans.
  • Be conservative with transformations — always back-transform interpretation for stakeholders.
  • When you bin, test different bin widths. Binning can create false plateau illusions.
  • Watch leakage: do not create features that use future knowledge of the target. Feature engineering should be reproducible in deployment (see previous Data Wrangling notes).
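The bin-width warning is easy to demonstrate: with a wavy signal, coarse quantile bins flatten the trend while finer bins recover it (everything below is synthetic):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
x = rng.uniform(0, 10, 5_000)
df = pd.DataFrame({"x": x, "y": np.sin(x) + rng.normal(scale=0.3, size=5_000)})

# Same data, two bin counts: compare how much trend each summary preserves
spreads = {}
for q in (4, 20):
    bins = pd.qcut(df["x"], q)
    means = df.groupby(bins, observed=True)["y"].mean()
    spreads[q] = means.max() - means.min()
    print(f"{q:>2} bins: spread of binned means = {spreads[q]:.2f}")
```

With 4 bins, each bin averages over most of a full sine cycle and the oscillation largely cancels out; 20 bins track the wave. Always try at least two bin counts before trusting a binned summary.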

Example — Quick walkthrough (housing prices)

  1. You've already seen price distribution (skewed right). You log-transform price.
  2. Plot sqft vs log(price): LOESS shows diminishing returns after ~3000 sqft.
  3. Facet by neighborhood: the slope differs — interaction suspected.
  4. Bin sqft into deciles and plot mean log(price) ± SE: nicer for a report.
  5. Fit a quick tree; residual vs fitted shows pockets of large residuals in high-priced neighborhoods → maybe missing a prestige variable.

Final takeaways (bite-sized and motivational)

  • Visualization is hypothesis generation: use plots to suggest transformations, interactions, and missing features — not just to confirm your biases.
  • Combine raw-point plots with summarized plots (bins, means) and model-based checks (residuals) for a 3D perspective.
  • Always keep deployability in mind: any transformation you plan to use must be reproducible and not leak future info.

Visualization doesn't make models for you, but good visual work saves you from building models that lie.

Next steps:

  • After these visuals, you should be ready to: build candidate features (informed by observed shapes), select models that can capture the observed nonlinearity (splines, trees, GAMs), and design cross-validation that respects the structures you discovered (groups, time, neighborhoods).

"Go make one scatterplot that changes your model’s life. Preferably two."
