© 2026 jypi. All rights reserved.

CFA Level 1
Quantitative Methods

Fundamentals of quantitative analysis used in finance.

Correlation and Regression

Regression with Sass: CFA Quant Methods Crash
intermediate
humorous
finance
quantitative methods
gpt-5-mini
Correlation and Regression — The Sexy Side of Numbers (Actually Useful for CFA L1)

"Correlation is the gossip of data; regression is the confession booth." — Probably me, right now.

You just finished Probability Concepts and Statistical Inference — so you know about distributions, sampling variability, and hypothesis testing. Now we move to the duo that lets you quantify relationships between variables: correlation (how tight the gossip circle is) and regression (who influences whom — or at least who looks like they do). This is crucial for finance: think factor models, forecasting returns, or just making your Excel look impressively academic. But remember Ethics 101: correlation ≠ causation — misuse here is a fast route to misleading clients (and failing ethics questions).


1) Correlation: The Short Summary

What it is: A standardized measure of linear association between two variables. The most common: Pearson correlation coefficient (r).

  • Range: -1 to +1.
    • r = +1 perfect positive linear relationship
    • r = -1 perfect negative linear relationship
    • r ≈ 0 little-to-no linear relationship
  • Formula (conceptual):
r = cov(X, Y) / (σ_X * σ_Y)
  • Interpretation: If r = 0.8, X and Y move together strongly in a linear sense. If r = 0.2, weak linear association — but there might still be a non-linear relationship.
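To make the formula concrete, here is a minimal sketch that computes Pearson's r straight from the definition. It borrows the four toy observations from the table in section 5; the helper names (`mean`, `sample_cov`, `pearson_r`) are made up for illustration, and four data points are far too few for real work.

```python
import math

# Toy monthly excess returns (%) from the table in section 5.
market = [2.0, -1.0, 1.5, 0.0]
stock = [3.0, -1.5, 1.0, 0.2]

def mean(values):
    return sum(values) / len(values)

def sample_cov(x, y):
    """Sample covariance with the n - 1 divisor."""
    x_bar, y_bar = mean(x), mean(y)
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (len(x) - 1)

def pearson_r(x, y):
    """r = cov(X, Y) / (sigma_X * sigma_Y)."""
    sd_x = math.sqrt(sample_cov(x, x))  # cov(X, X) is the variance of X
    sd_y = math.sqrt(sample_cov(y, y))
    return sample_cov(x, y) / (sd_x * sd_y)

r = pearson_r(market, stock)
print(f"r = {r:.4f}")
```

On these numbers r lands close to 0.95, which the heuristics that follow would call strong.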

Quick heuristics (context matters!):

  • |r| < 0.3 — weak
  • 0.3 ≤ |r| < 0.6 — moderate
  • |r| ≥ 0.6 — strong

Ask yourself: Is the correlation economically meaningful, or just statistically significant because my sample is huge? Large N can make tiny r significant. That's where your Statistical Inference lessons kick in.
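A quick sketch of why sample size matters, using the standard t-statistic for testing whether a correlation is zero, t = r * sqrt(n - 2) / sqrt(1 - r^2). The inputs (r = 0.05, n = 30 and n = 10,000) are hypothetical numbers chosen only to show the effect.

```python
import math

def corr_t_stat(r, n):
    """t-statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical inputs: the same tiny correlation at two sample sizes.
t_small = corr_t_stat(0.05, 30)
t_large = corr_t_stat(0.05, 10_000)

print(f"n = 30:     t = {t_small:.2f}")
print(f"n = 10,000: t = {t_large:.2f}")
```

The correlation is identical in both cases; only n changed. With the large sample the t-statistic clears the usual 5% two-tailed cutoff of roughly 1.96, even though r = 0.05 explains a quarter of one percent of the variance.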

Nonparametric alternative

  • Spearman rank correlation: measures monotonic relationships (good when data aren't linear or are ordinal).
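Here is a small sketch of Spearman's rank correlation, assuming no tied values so the shortcut formula 1 - 6 Σd² / (n(n² - 1)) applies. The data (y = x cubed) are hypothetical, picked because the relationship is perfectly monotonic but obviously not a straight line.

```python
def ranks(values):
    """Rank from 1 (smallest) to n; assumes no ties for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rho(x, y):
    """Spearman rho via the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
print(rho)
```

Because the ranks of x and y match exactly, rho is 1 here even though a Pearson r on the raw values would be below 1. Real data with ties needs average ranks instead of this shortcut.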

2) Simple Linear Regression: The Basics

Model:

Y = β0 + β1 X + ε
  • β1 (slope): expected change in Y for a one-unit change in X (ceteris paribus).
  • β0 (intercept): predicted value of Y when X = 0 (may be meaningless if X = 0 lies outside the data range).

Estimation (OLS): choose the β-hats that minimize the sum of squared residuals.

Formulas (for simple regression):

β1_hat = Σ(x_i - x̄)(y_i - ȳ) / Σ(x_i - x̄)^2
β0_hat = ȳ - β1_hat * x̄
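The two formulas translate almost line for line into code. This is a minimal sketch, not a library routine: the function name `ols_simple` is invented here, and the data are a hypothetical exact line (y = 1 + 2x) so you can see the formulas recover the known intercept and slope.

```python
def ols_simple(x, y):
    """Return (b0_hat, b1_hat) for the simple regression y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula.
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    b1_hat = s_xy / s_xx
    b0_hat = y_bar - b1_hat * x_bar
    return b0_hat, b1_hat

# Hypothetical data lying exactly on y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]

b0_hat, b1_hat = ols_simple(x, y)
print(f"intercept = {b0_hat:.2f}, slope = {b1_hat:.2f}")
```

With noisy data the estimates will not match the true line exactly, which is the whole reason standard errors exist.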

Good to know: under the Gauss–Markov assumptions, OLS is unbiased and has the smallest variance among linear unbiased estimators (BLUE). Section 4 summarizes the assumptions to check.

Partitioning variance: SST = SSR + SSE

  • SST (total) = Σ(y_i - ȳ)^2
  • SSR (explained by model) = Σ(ŷ_i - ȳ)^2
  • SSE (residual) = Σ(y_i - ŷ_i)^2

R-squared: R² = SSR / SST, the proportion of the variance in Y explained by X. In simple linear regression, R² equals the squared Pearson correlation, r².
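The decomposition is easy to check numerically. A short sketch, again leaning on the toy returns from the table in section 5 (variable names are illustrative only):

```python
# Toy monthly excess returns (%) from the table in section 5.
market = [2.0, -1.0, 1.5, 0.0]
stock = [3.0, -1.5, 1.0, 0.2]

n = len(market)
x_bar = sum(market) / n
y_bar = sum(stock) / n

# OLS fit using the simple-regression formulas.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(market, stock))
s_xx = sum((x - x_bar) ** 2 for x in market)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in market]

# The three sums of squares.
sst = sum((y - y_bar) ** 2 for y in stock)
ssr = sum((f - y_bar) ** 2 for f in fitted)
sse = sum((y - f) ** 2 for y, f in zip(stock, fitted))

r_squared = ssr / sst
print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}")
print(f"R^2 = {r_squared:.4f}")
```

SSR and SSE add back up to SST (to floating-point precision), and R² comes out near 0.90 on these four points. The identity relies on the model having an intercept.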

Note: A high R² isn't an automatic green light. Check residuals, think economics/logic, and watch for overfitting.


3) Hypothesis testing in regression

Test slope = 0 (no linear relationship):

t = β1_hat / SE(β1_hat)

Compare t to t-critical or compute a p-value. This ties directly to your Statistical Inference knowledge: sampling distributions, t-statistics, and confidence intervals.

Confidence interval for β1: β1_hat ± t_(α/2, n-2) * SE(β1_hat).
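A sketch of the slope test and confidence interval on the toy data from section 5. Two assumptions to flag: SE(β1_hat) is computed with the textbook formula sqrt(s² / Σ(x_i - x̄)²) where s² = SSE / (n - 2), and the critical value 4.303 is the two-tailed 5% t-table entry for 2 degrees of freedom, hardcoded because the standard library has no t-distribution.

```python
import math

# Toy monthly excess returns (%) from the table in section 5.
market = [2.0, -1.0, 1.5, 0.0]
stock = [3.0, -1.5, 1.0, 0.2]

n = len(market)
x_bar = sum(market) / n
y_bar = sum(stock) / n
s_xx = sum((x - x_bar) ** 2 for x in market)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(market, stock))

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual variance uses n - 2 degrees of freedom (two estimated parameters).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(market, stock))
se_b1 = math.sqrt(sse / (n - 2) / s_xx)

# Test H0: beta1 = 0 and build the 95% confidence interval.
t_stat = b1 / se_b1
T_CRIT_DF2 = 4.303  # two-tailed 5% critical value, 2 df, from a t-table
ci_low = b1 - T_CRIT_DF2 * se_b1
ci_high = b1 + T_CRIT_DF2 * se_b1
reject_h0 = abs(t_stat) > T_CRIT_DF2

print(f"b1 = {b1:.4f}, SE = {se_b1:.4f}, t = {t_stat:.2f}")
print(f"95% CI: ({ci_low:.3f}, {ci_high:.3f}), reject H0: {reject_h0}")
```

A useful humility check: the slope looks large, yet with only four observations the t-statistic falls just short of the critical value and the interval straddles zero. Tiny samples buy you very wide intervals.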

Prediction vs. Estimation:

  • Confidence interval: for the mean E[Y|X=x0]
  • Prediction interval: for an individual Y at X = x0 (wider because it also includes the residual variance of a single observation)
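The difference between the two intervals shows up in their standard errors. This sketch uses the standard simple-regression formulas, SE_mean = s * sqrt(1/n + (x0 - x̄)² / Σ(x_i - x̄)²) and SE_pred = s * sqrt(1 + 1/n + (x0 - x̄)² / Σ(x_i - x̄)²), on the toy data from section 5. The forecast point x0 = 1.0 is a hypothetical choice.

```python
import math

# Toy monthly excess returns (%) from the table in section 5.
market = [2.0, -1.0, 1.5, 0.0]
stock = [3.0, -1.5, 1.0, 0.2]

n = len(market)
x_bar = sum(market) / n
y_bar = sum(stock) / n
s_xx = sum((x - x_bar) ** 2 for x in market)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(market, stock))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual variance s^2 = SSE / (n - 2).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(market, stock))
s2 = sse / (n - 2)

x0 = 1.0  # hypothetical market excess return to forecast at
y0_hat = b0 + b1 * x0
leverage = 1 / n + (x0 - x_bar) ** 2 / s_xx

se_mean = math.sqrt(s2 * leverage)        # uncertainty about E[Y | X = x0]
se_pred = math.sqrt(s2 * (1 + leverage))  # adds one observation's noise

print(f"forecast = {y0_hat:.3f}")
print(f"SE (mean) = {se_mean:.3f}, SE (prediction) = {se_pred:.3f}")
```

Multiply either standard error by the appropriate t critical value to get the interval. The prediction version is always wider, and both widen as x0 moves away from x̄.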

4) Assumptions (the LINE checklist) and what breaks

  • Linearity: relationship is linear in parameters
  • Independence of errors: no autocorrelation
  • Normality of errors (for small-sample inference)
  • Equal variance (homoskedasticity)

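If assumptions are violated, the damage depends on which one: a wrong functional form or an omitted variable biases the estimates, while heteroskedasticity and autocorrelation typically leave the coefficients unbiased but inefficient and make the usual SEs (and so the t-tests and confidence intervals) misleading.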

Common problems and quick remedies:

  • Heteroskedasticity → use robust (White) standard errors
  • Autocorrelation (time series) → detect with the Durbin–Watson test; remedy with AR models or Newey–West SEs
  • Multicollinearity (in multiple regression) → large SEs, unstable β-hats; check VIFs (>10 is suspicious)
  • Omitted variable bias → estimate may be biased; think carefully about causal structure

Omitted variable bias formula (simple intuition):

Bias(β1_hat) = β2 * [Cov(X1, X2) / Var(X1)]

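Meaning: β2 is the effect of the omitted variable X2 on Y. If X2 affects Y (β2 ≠ 0) and is correlated with the included X1, your β1_hat is biased, and the sign of the bias is the sign of β2 × Cov(X1, X2).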
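You can watch the bias formula work on made-up data. In this sketch everything is hypothetical: the true model is y = 2·x1 + 3·x2 with no noise, and we then "forget" x2 and regress y on x1 alone.

```python
def cov(a, b):
    """Sample covariance with the n - 1 divisor."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((p - a_bar) * (q - b_bar) for p, q in zip(a, b)) / (len(a) - 1)

# Hypothetical data: x2 is positively correlated with x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
beta1, beta2 = 2.0, 3.0  # true coefficients, chosen for illustration
y = [beta1 * a + beta2 * b for a, b in zip(x1, x2)]

# Short regression: y on x1 only, so x2 is omitted.
short_slope = cov(x1, y) / cov(x1, x1)

# Bias predicted by the omitted-variable formula.
predicted_bias = beta2 * cov(x1, x2) / cov(x1, x1)

print(f"short-regression slope = {short_slope:.2f} (true beta1 = {beta1})")
print(f"actual bias = {short_slope - beta1:.2f}, formula = {predicted_bias:.2f}")
```

The short regression credits x1 with the effect of x2 that travels alongside it, and the overshoot matches the formula exactly because the example has no noise.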


5) Practical finance example (mini)

Imagine regressing a stock's excess return (Y) on market excess return (X) — the CAPM spirit.

  • β1_hat is the stock's beta (systematic risk).
  • Test H0: β1 = 1 (is the stock as risky as the market?) with a t-test, using t = (β1_hat - 1) / SE(β1_hat). This is a hypothesis test you've seen in Statistical Inference.
  • A low R² doesn't mean beta is useless: beta may still be a key parameter for risk.

Table (toy data):

Month   Market Excess (%)   Stock Excess (%)
1        2.0                 3.0
2       -1.0                -1.5
3        1.5                 1.0
4        0.0                 0.2

(You'd compute β1_hat using the formulas above — practice this in Excel or your calculator.)
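If you want to check your calculator work, here is a sketch that runs the toy table through the formulas and tests H0: β1 = 1. As before, the critical value 4.303 (two-tailed 5%, 2 degrees of freedom) is a hardcoded t-table entry, and the variable names are illustrative.

```python
import math

# Toy data from the table above (monthly excess returns, %).
market = [2.0, -1.0, 1.5, 0.0]
stock = [3.0, -1.5, 1.0, 0.2]

n = len(market)
x_bar = sum(market) / n
y_bar = sum(stock) / n
s_xx = sum((x - x_bar) ** 2 for x in market)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(market, stock))

beta = s_xy / s_xx            # slope = the stock's estimated beta
alpha = y_bar - beta * x_bar  # intercept

sse = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(market, stock))
se_beta = math.sqrt(sse / (n - 2) / s_xx)

# H0: beta = 1, so subtract the hypothesized value before dividing by the SE.
t_stat = (beta - 1.0) / se_beta
T_CRIT_DF2 = 4.303  # two-tailed 5% critical value, 2 df, from a t-table
reject_h0 = abs(t_stat) > T_CRIT_DF2

print(f"beta = {beta:.4f}, alpha = {alpha:.4f}")
print(f"t (H0: beta = 1) = {t_stat:.2f}, reject H0: {reject_h0}")
```

The estimated beta is about 1.29, but the t-statistic is below 1, so four months of data give no basis for claiming the stock is riskier than the market.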


6) Ethics: Don’t be that analyst who lies with statistics

  • Never imply causation from correlation without a defensible causal model.
  • Don’t cherry-pick variables or time periods to produce a headline-grabbing R².
  • Disclose model limitations: sample period, data snooping, and assumption checks.

If your regression magically predicts everything with R² = 0.99, either you’ve discovered a financial miracle or you accidentally leaked future information into your predictor. Probable guilty party: look-ahead bias or data leakage.


7) Quick checklist before you report regression results

  1. Plot data and residuals (visualize before you worship a number).
  2. Check linearity and influential points (Cook’s distance).
  3. Test for heteroskedasticity and autocorrelation if time series.
  4. Consider multicollinearity in multivariate models (VIFs).
  5. Report β-hats, SEs, t-stats, p-values, R² (and adj. R²), and prediction vs confidence intervals.
  6. Be upfront about potential omitted variables and causality limits.

Closing: TL;DR (with Flair)

  • Correlation tells you about co-movement, not cause.
  • Regression estimates marginal effects and lets you test hypotheses (bring your t-tests!).
  • Assumptions matter — violate them and your inference is a house of cards.
  • Ethics matters — statistical glamour without transparency = investor harm and exam failure.

Final pep talk: Run your regressions, but don’t worship coefficients. Combine math with economic sense, check assumptions, and always ask: Does this story make sense outside the sample? If not, don’t publish it; fix it.

Version note: This builds on the probability and inference foundations you’ve already learned — now you get to apply those tests to relationships between variables and ask the ethical questions that separate decent analysts from dinner-table anecdote-sellers.
