Exploratory Data Analysis for Predictive Modeling
EDA methods tailored to supervised tasks to reveal signal, distribution shifts, and modeling risks.
Visualization for Regression Targets
Visualization for Regression Targets — The Fun, Scientific Part of Staring at Charts
You're past the low-level reconnaissance: you've looked at univariate distributions (Position 1), peeked at pairwise correlations (Position 2), and wrestled with messy data during Data Wrangling and Feature Engineering. Now it's showtime: visualize how predictors actually relate to your continuous target so your model doesn't learn nonsense.
Why this chapter matters (quick reminder)
You're not just making pretty plots. Good visual exploration:
- Reveals nonlinearities you should model (or transform).
- Exposes heteroscedasticity (variance changing with X) and outliers that distort MSE-based fits.
- Suggests useful feature transformations or binnings without leaking the test set.
Think of this as the tactical reconnaissance before you deploy predictive artillery.
Core plots and when to use them
Below: the high-utility toolkit for regression-target visualization, with quick notes on the what, why, and how.
1) Scatter plot + smoother (LOESS / LOWESS)
- Use when predictor is continuous. Shows shape: linear, quadratic, plateau, threshold.
- Plot dogma: raw points (alpha < 0.2 for big datasets) + smooth trend line + linear fit for comparison.
Code sketch (seaborn; `lowess=True` requires statsmodels installed):
import matplotlib.pyplot as plt
import seaborn as sns
sns.scatterplot(x='sqft', y='price', data=df, alpha=0.3)  # raw points, faded to fight overplotting
sns.regplot(x='sqft', y='price', data=df, lowess=True, scatter=False, color='red')  # LOWESS trend
plt.show()
2) Hexbin / 2D density for overplotting
- For huge datasets: scatter becomes soup. Use hexbin (matplotlib) or sns.kdeplot(… , fill=True).
- Colors convey density; still overlay smoothing if useful.
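A minimal hexbin sketch on synthetic data (the `sqft`/`price` columns and the size of 50,000 points are illustrative assumptions, not from a real dataset):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # fixed seed for a reproducible synthetic example
sqft = rng.uniform(500, 4000, 50_000)
price = 100 * sqft + rng.normal(0, 50_000, sqft.size)  # noisy linear relation

fig, ax = plt.subplots()
hb = ax.hexbin(sqft, price, gridsize=40, cmap="viridis", mincnt=1)  # hide empty hexes
fig.colorbar(hb, ax=ax, label="count per hex")
ax.set_xlabel("sqft")
ax.set_ylabel("price")
```

With `mincnt=1`, every plotted point lands in exactly one displayed hex, so the color scale reads directly as point density.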
3) Binned-aggregates (bin numeric X -> mean Y + CI)
- When you want an interpretable summary: bin X (quantiles or equal-width), plot mean(Y) with error bars or boxplots.
- Great for revealing monotonic trends hidden by noise.
Code sketch (pandas):
df['x_bin'] = pd.qcut(df['x'], 10, duplicates='drop')  # decile bins
agg = df.groupby('x_bin', observed=True)['y'].agg(['mean', 'std', 'count'])
agg['se'] = agg['std'] / np.sqrt(agg['count'])  # standard error of each bin mean
4) Residual vs Fitted (early model-based EDA)
- Fit a quick simple model (linear, tree) and plot residuals vs fitted values.
- Use to catch heteroscedasticity, nonlinearity, and clusters of bad fits.
Code sketch (model-based):
fitted = model.predict(X)  # any quick fit, e.g. sklearn LinearRegression or a shallow tree
resid = y - fitted
sns.scatterplot(x=fitted, y=resid, alpha=0.3)
plt.axhline(0, color='k', linestyle='--')  # residuals should scatter evenly around zero
Note: model-based plots are allowed in EDA — but be explicit about what you fit and why.
5) Categorical X vs Continuous Y: boxplots, violin + swarm
- For categorical predictors: use boxplots to see medians and spread; violins to see distribution shape; add swarm/jittered points when not too many observations.
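A short sketch of boxplot-plus-points for a categorical predictor, on synthetic data (the `neighborhood`/`price` names are hypothetical):

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)  # reproducible synthetic example
df = pd.DataFrame({
    "neighborhood": rng.choice(["A", "B", "C"], 300),
    "price": rng.lognormal(mean=12, sigma=0.4, size=300),
})

ax = sns.boxplot(x="neighborhood", y="price", data=df)
# overlay jittered raw points -- only sensible when counts per category are modest
sns.stripplot(x="neighborhood", y="price", data=df, color="k", alpha=0.3, ax=ax)
counts = df["neighborhood"].value_counts()  # useful for annotating n per category
```

Annotating `counts` on the plot guards against over-reading a box drawn from a handful of observations.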
6) Interaction plots / Facets
- Facet by a categorical variable to visualize conditional relationships (e.g., sqft vs price by neighborhood).
- Use sns.FacetGrid to make clean multi-panel comparisons.
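A faceting sketch with `sns.FacetGrid`, again on synthetic data with hypothetical column names; the per-neighborhood slope difference is built in so the panels have something to show:

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "sqft": rng.uniform(500, 4000, n),
    "neighborhood": rng.choice(["A", "B", "C"], n),
})
# neighborhood "A" gets a steeper price/sqft slope -> a visible interaction
df["price"] = np.where(df["neighborhood"] == "A", 200, 120) * df["sqft"] \
    + rng.normal(0, 40_000, n)

g = sns.FacetGrid(df, col="neighborhood", col_order=["A", "B", "C"], sharey=True)
g.map_dataframe(sns.scatterplot, x="sqft", y="price", alpha=0.3)
g.set_axis_labels("sqft", "price")
```

Shared y-axes (`sharey=True`) keep the panels directly comparable, which is the whole point of faceting for interactions.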
7) Transformations: before-and-after plots
- If target or predictor is skewed, visualize relationship before and after log / Box–Cox / Yeo–Johnson transforms.
- Plot both panels side-by-side: often linearity improves and variance stabilizes.
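A before-and-after sketch for a skewed target, using a log transform on synthetic lognormal prices (parameters chosen for illustration):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
price = pd.Series(rng.lognormal(mean=12, sigma=0.5, size=5_000))  # right-skewed

skew_raw = price.skew()           # sample skewness before transform
skew_log = np.log(price).skew()   # and after log -- roughly symmetric now

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.hist(price, bins=50)
ax1.set_title(f"raw price (skew={skew_raw:.2f})")
ax2.hist(np.log(price), bins=50)
ax2.set_title(f"log(price) (skew={skew_log:.2f})")
```

Printing the skewness in the panel titles makes the "did the transform help?" question answerable at a glance.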
Practical recipe: A step-by-step checklist (what to plot, in order)
- Reconfirm target distribution from Position 1 (histogram, skewness). If heavy skew, consider log transform and re-plot.
- For each continuous predictor X_i:
- Scatter + LOESS vs target (subsample if >100k rows).
- Hexbin/2D density if overplot.
- Binned mean ± SE to get a smoothed, interpretable signal.
- For categorical predictors C_j:
- Boxplot and violin of target by category. Add count annotation.
- If many categories, sort by median target and consider grouping rare levels into "other".
- Quick linear model fit per predictor (univariate) → store slope, R², and residual plot. This helps prioritize features.
- Multi-panel faceting for plausible interactions (e.g., sqft * neighborhood).
- Check heteroscedasticity: residuals vs fitted (from a quick multivariate model).
- Inspect high-leverage/outlier points (scatter + label suspicious ids). Decide: fix, transform, or keep.
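The univariate-fit-per-predictor step above can be sketched as follows; the data and column names are synthetic, and one deliberately uninformative feature (`age`) is included so the ranking has something to separate:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1_000
df = pd.DataFrame({
    "sqft": rng.uniform(500, 4000, n),
    "age": rng.uniform(0, 100, n),  # carries no signal by construction
})
df["price"] = 150 * df["sqft"] + rng.normal(0, 30_000, n)

rows = []
for col in ["sqft", "age"]:
    slope, intercept = np.polyfit(df[col], df["price"], 1)  # univariate linear fit
    r2 = np.corrcoef(df[col], df["price"])[0, 1] ** 2       # univariate R^2
    rows.append({"feature": col, "slope": slope, "r2": r2})

ranking = pd.DataFrame(rows).sort_values("r2", ascending=False)
```

A table like `ranking` is a crude but fast prioritizer: features with near-zero univariate R² still deserve a residual-plot look (they may matter only in interactions), but the top of the list tells you where to spend plotting time first.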
Table: Which plot to pick (cheat sheet)
| Goal | Predictor type | Plot(s) |
|---|---|---|
| See shape of relationship | continuous | Scatter + LOESS; Hexbin if dense |
| Quick summary of trend | continuous | Binned mean ± CI |
| Categorical effect | categorical | Boxplot / violin + swarm |
| Heteroscedasticity / nonlinearity | any (model-based) | Residual vs fitted |
| Interactions | mix | FacetGrid / interaction plots |
Little heuristics and gotchas (because data punishes the unwise)
- Use alpha and point size to combat overplotting; sample for exploratory speed but keep a reproducible sample seed.
- Label your axes and include units. "x" and "y" mean nothing to the humans reading your plots.
- Be conservative with transformations — always back-transform interpretation for stakeholders.
- When you bin, test different bin widths. Binning can create false plateau illusions.
- Watch leakage: do not create features that use future knowledge of the target. Feature engineering should be reproducible in deployment (see previous Data Wrangling notes).
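The reproducible-sample heuristic above is one line of pandas; a fixed `random_state` guarantees you explore the same subsample every run:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": np.arange(1_000_000)})  # stand-in for a large dataset

SEED = 42  # any fixed value; the point is that it is pinned
sub = df.sample(n=100_000, random_state=SEED)
sub_again = df.sample(n=100_000, random_state=SEED)  # identical subsample
```

Pinning the seed means a colleague (or future you) sees exactly the plots you saw, instead of a different random slice.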
Example — Quick walkthrough (housing prices)
- You've already seen price distribution (skewed right). You log-transform price.
- Plot sqft vs log(price): LOESS shows diminishing returns after ~3000 sqft.
- Facet by neighborhood: the slope differs — interaction suspected.
- Bin sqft into deciles and plot mean log(price) ± SE: nicer for a report.
- Fit a quick tree; residual vs fitted shows pockets of large residuals in high-priced neighborhoods → maybe missing a prestige variable.
Final takeaways (bite-sized and motivational)
- Visualization is hypothesis generation: use plots to suggest transformations, interactions, and missing features — not just to confirm your biases.
- Combine raw-point plots with summarized plots (bins, means) and model-based checks (residuals) for a 3D perspective.
- Always keep deployability in mind: any transformation you plan to use must be reproducible and not leak future info.
Visualization doesn't make models for you, but good visual work saves you from building models that lie.
Next steps:
- After these visuals, you should be ready to: build candidate features (informed by observed shapes), select models that can capture the observed nonlinearity (splines, trees, GAMs), and design cross-validation that respects the structures you discovered (groups, time, neighborhoods).
"Go make one scatterplot that changes your model’s life. Preferably two."