Risk, Return, and Probability
Statistical foundations for evaluating investments and quantifying uncertainty.
Distributions and normality tests
Quick promise: We’ll make the boring math feel like a late-night epiphany. Also: you will never look at a histogram the same way again.
You already learned how to compute expected return and variance, and how covariance helps diversification. Now we ask the critical follow-up: what do those numbers mean if the shape of the returns distribution is weird? That’s where distributions and normality tests come in — the plumbing behind whether your mean-variance math is trustworthy or just a pretty spreadsheet lying to your face.
What is this topic and why it matters
When we say an asset’s returns are "normal" we mean the familiar bell curve is a decent description of the data. Why care? Because many tools in investment management (VaR, confidence intervals for mean returns, some optimization assumptions) either assume normality or behave very differently when returns are skewed or heavy-tailed.
Remember: expected return and variance capture location and dispersion, but not shape. Covariance tells you how assets move together on average, but it only summarizes linear co-movement and says nothing about whether assets crash together in the tails. If returns have fat tails or skewness, your risk estimates and diversification gains can be badly biased.
How do we check if returns are normal?
Broad strategy:
- Visual checks (fast, intuitive)
- Summary statistics (skewness & kurtosis)
- Formal statistical tests (p-values and decision rules)
1) Visual checks
- Histogram with a fitted normal curve.
- Q–Q plot (quantile-quantile): if points fall on the straight reference line (the 45° line when returns are standardized), good; departures at the tails = trouble.
Pro tip: Visuals show where the problem is (left tail? right tail? both?) which is crucial for risk decisions.
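To see what a Q–Q plot actually compares, here is a minimal sketch that builds the point pairs by hand. The helper name `qq_points`, the plotting-position convention `(i - 0.5) / n`, and the synthetic Student-t "returns" are all assumptions made for illustration; they are not part of the original recipe.

```python
import numpy as np
from scipy import stats

def qq_points(returns):
    """Return (theoretical, sample) quantile pairs for a normal Q-Q plot."""
    x = np.sort(np.asarray(returns, dtype=float))
    n = x.size
    # Plotting positions (i - 0.5) / n: one common convention among several.
    probs = (np.arange(1, n + 1) - 0.5) / n
    theoretical = stats.norm.ppf(probs)        # standard normal quantiles
    sample = (x - x.mean()) / x.std(ddof=1)    # standardized, sorted returns
    return theoretical, sample

# Synthetic fat-tailed "returns" (Student-t, 3 degrees of freedom), illustration only.
rng = np.random.default_rng(0)
demo_returns = 0.01 * rng.standard_t(df=3, size=2000)

theo, samp = qq_points(demo_returns)
# Gap between sample and theoretical quantiles at the two extremes.
# A negative left gap and a positive right gap would point to fat tails.
print("left-tail gap :", samp[0] - theo[0])
print("right-tail gap:", samp[-1] - theo[-1])
```

Plotting `samp` against `theo` as a scatter gives the Q–Q plot; because the sample is standardized here, the 45° line is the natural reference.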
2) Summary measures: skewness & kurtosis
- Skewness measures asymmetry. Negative skew = long left tail (bad for long investors).
- Excess kurtosis measures tail heaviness relative to normal (0 for normal). Positive excess kurtosis = fat tails.
Excel: SKEW(range), KURT(range)
Python (pandas): returns.skew(), returns.kurtosis(). Like Excel's KURT, pandas reports excess kurtosis, so a normal sample gives roughly 0.
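If the one-liners feel like a black box, the sketch below computes both measures from central moments and compares them with library output. The function name `moment_skew_kurt` and the simulated sample are assumptions for illustration. SciPy's defaults use the same plain moment estimators; pandas applies small-sample bias corrections, so its numbers differ slightly.

```python
import numpy as np
import pandas as pd
from scipy import stats

def moment_skew_kurt(x):
    """Skewness and excess kurtosis from central moments (no bias correction)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2 = np.mean(d**2)   # second central moment (population variance)
    m3 = np.mean(d**3)   # third central moment
    m4 = np.mean(d**4)   # fourth central moment
    skew = m3 / m2**1.5
    excess_kurt = m4 / m2**2 - 3.0   # subtract 3 so a normal gives 0
    return skew, excess_kurt

# Simulated data, for illustration only.
rng = np.random.default_rng(1)
sample = rng.standard_t(df=5, size=1000)

s, k = moment_skew_kurt(sample)
print("by hand:", s, k)
print("scipy  :", stats.skew(sample), stats.kurtosis(sample))  # moment estimators by default
ser = pd.Series(sample)
print("pandas :", ser.skew(), ser.kurtosis())                  # bias-corrected estimators
```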
3) Formal tests
Short cheat-sheet of common tests:
| Test | Null hypothesis | Sensitive to | Good for | Weakness |
|---|---|---|---|---|
| Shapiro–Wilk | data ~ Normal | general departures in shape | Small to moderate samples, where it has good power | With large n it flags trivial deviations |
| Jarque–Bera | data ~ Normal | skewness & excess kurtosis | Financial series (common) | Asymptotic (chi-square); unreliable in small samples |
| D’Agostino’s K^2 (normaltest) | data ~ Normal | skewness + kurtosis | General-purpose | Similar caveats to Jarque–Bera |
| Anderson–Darling | data ~ Normal | tails | Detecting departures in the tails | Compared against tabulated critical values rather than a single p-value |
| Kolmogorov–Smirnov | data ~ fully specified distribution | overall CDF differences, mostly near the center | Flexible one-sample test against any distribution | Needs a fully specified distribution; estimating parameters from the same data distorts the p-value |
Remember: tests give p-values and depend on sample size. A huge sample almost always rejects strict normality for tiny deviations; a tiny sample may not reject big departures.
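The sample-size effect is easy to see with Jarque–Bera, whose statistic is JB = n/6 · (S² + K²/4), where S is skewness and K is excess kurtosis, compared against a chi-square distribution with 2 degrees of freedom. The sketch below is an illustration under assumptions: the helper name `jarque_bera_by_hand` and the mildly fat-tailed simulated data (Student-t with 10 degrees of freedom) are mine, not from the lesson.

```python
import numpy as np
from scipy import stats

def jarque_bera_by_hand(x):
    """Jarque-Bera statistic and asymptotic p-value from skewness and excess kurtosis."""
    x = np.asarray(x, dtype=float)
    n = x.size
    s = stats.skew(x)        # moment-based skewness
    k = stats.kurtosis(x)    # moment-based excess kurtosis
    jb = n / 6.0 * (s**2 + k**2 / 4.0)
    p_value = stats.chi2.sf(jb, df=2)   # upper tail of chi-square with 2 d.o.f.
    return jb, p_value

# Mildly non-normal simulated data: same distribution, two sample sizes.
rng = np.random.default_rng(2)
mild = rng.standard_t(df=10, size=100_000)

for n in (50, 100_000):
    jb, p = jarque_bera_by_hand(mild[:n])
    print(f"n={n:>7}: JB={jb:10.2f}, p-value={p:.4f}")
```

The statistic scales with n, so the same modest excess kurtosis that a small sample may shrug off becomes an overwhelming rejection once n is large.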
How to run quick checks: Excel vs Python
Excel (fast starter):
- Use Data Analysis Toolpak for histogram.
- Use SKEW and KURT for stats.
- For Q–Q plot: compute theoretical normal quantiles (NORM.S.INV) and plot against sorted returns.
- Formal tests: there’s no built-in Shapiro/Jarque in vanilla Excel. Use add-ins or switch to Python for robust testing.
Python (recommended for reproducible work):
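import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
r = returns.dropna() # pandas Series of returns (assumed to be loaded already)
# Quick stats (pandas kurtosis is excess kurtosis)
print(r.mean(), r.std(), r.skew(), r.kurtosis())
# Visuals: histogram and Q–Q plot side by side
fig, (ax_hist, ax_qq) = plt.subplots(1, 2, figsize=(10, 4))
r.hist(bins=40, ax=ax_hist)
stats.probplot(r, dist='norm', plot=ax_qq)
plt.show()
# Tests
print('Shapiro:', stats.shapiro(r)) # strongest in small samples
print('Jarque-Bera:', stats.jarque_bera(r)) # classic in finance
print("D'Agostino:", stats.normaltest(r)) # K^2 test
print('Anderson-Darling:', stats.anderson(r, dist='norm')) # compare statistic to critical values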
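Note: if you go the stats.kstest route against 'norm', standardize first (subtract mean, divide by std). The usual KS p-value is then only approximate, because the mean and std were estimated from the same data; the Lilliefors variant of the test corrects for this.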
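A minimal sketch of the KS route, assuming simulated returns in place of real data (the variable names are illustrative). It also shows that standardizing first is equivalent to passing the estimated mean and standard deviation to `kstest` through `args`.

```python
import numpy as np
from scipy import stats

# Simulated fat-tailed "returns", for illustration only.
rng = np.random.default_rng(3)
sample_returns = 0.01 * rng.standard_t(df=4, size=1500)

mu_hat = sample_returns.mean()
sd_hat = sample_returns.std(ddof=1)

# Route 1: standardize, then test against the standard normal.
z = (sample_returns - mu_hat) / sd_hat
ks_stat, ks_p = stats.kstest(z, "norm")

# Route 2: leave the data alone and pass loc/scale to the normal CDF.
ks_stat_alt, ks_p_alt = stats.kstest(sample_returns, "norm", args=(mu_hat, sd_hat))

print("standardized :", ks_stat, ks_p)
print("args=(mu, sd):", ks_stat_alt, ks_p_alt)
```

Treat the p-value as a rough guide only: with estimated parameters the plain KS test tends to be too lenient.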
Real-world examples & interpretation
Imagine daily returns for Stock A vs. Stock B.
- Stock A: skewness ≈ -0.2, excess kurtosis ≈ 0.1. Close to normal; tests are unlikely to reject normality at moderate n.
- Stock B: skewness ≈ -1.5, excess kurtosis ≈ 8. A pronounced left tail and fat tails; Jarque–Bera and Anderson–Darling will reject strongly.
Implication: For Stock B, a mean-variance optimizer might pick it because of a decent mean and moderate variance — but it hides frequent sharp drops. VaR computed assuming normal returns will underestimate the probability of big losses.
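To make the VaR point concrete, the sketch below compares a normal-assumption VaR with a historical (empirical quantile) VaR on simulated fat-tailed returns. Everything here is an assumption for illustration: the function names, the Student-t data with 5 degrees of freedom, and the confidence levels. It is not Stock B's data.

```python
import numpy as np
from scipy import stats

def normal_var(returns, alpha):
    """VaR at tail probability alpha, assuming returns are normal (positive = loss)."""
    mu = np.mean(returns)
    sigma = np.std(returns, ddof=1)
    return -(mu + sigma * stats.norm.ppf(alpha))

def historical_var(returns, alpha):
    """VaR at tail probability alpha from the empirical quantile (positive = loss)."""
    return -np.quantile(returns, alpha)

# Simulated fat-tailed daily "returns", for illustration only.
rng = np.random.default_rng(4)
fat_tailed = 0.01 * rng.standard_t(df=5, size=100_000)

for alpha in (0.05, 0.01, 0.001):
    print(
        f"tail prob {alpha:>5}: "
        f"normal VaR={normal_var(fat_tailed, alpha):.4f}, "
        f"historical VaR={historical_var(fat_tailed, alpha):.4f}"
    )
```

The underestimate shows up deep in the tail (99% and beyond). At milder levels such as 95% the gap can shrink or even reverse, which is one reason a single VaR number can look reassuring when it should not.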
Ask yourself: would you rather trust a single variance number or visualize tail behavior? The latter usually wins.
Why people keep misunderstanding this
Because textbooks love normality. It's neat, analytic, and makes math doable. But markets are messy: fat tails, skewness, volatility clustering (heteroskedasticity), and regime shifts. If you treat return distributions as tidy bell curves, your risk measures will be optimistic and your panic will be dramatic when reality bites.
Also: passing a normality test doesn't mean "safe". Failing doesn't mean your model is useless — it means adjust your tools.
Practical rules and alternatives
- Use visuals + skewness/kurtosis always.
- Use Jarque–Bera or Anderson–Darling for financial returns as routine checks.
- If non-normal:
  - Consider Student-t models for heavy tails.
  - Use robust risk measures: historical or bootstrap VaR, expected shortfall (CVaR).
  - Model time-varying volatility (GARCH) instead of assuming iid.
  - Use EVT (extreme value theory) for tail risk analysis.
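A few of the alternatives above can be sketched with NumPy and SciPy alone. This is a rough illustration under assumptions: the data are simulated, the helper names `historical_var_es` and `bootstrap_var_interval` are hypothetical, and the bootstrap resamples returns as if they were iid, which ignores volatility clustering. GARCH and EVT need dedicated tooling and are not shown.

```python
import numpy as np
from scipy import stats

# Simulated fat-tailed daily "returns", for illustration only.
rng = np.random.default_rng(5)
returns = 0.01 * rng.standard_t(df=4, size=20_000)

# 1) Student-t fit by maximum likelihood: degrees of freedom, location, scale.
df_hat, loc_hat, scale_hat = stats.t.fit(returns)
print(f"fitted t: df={df_hat:.2f}, loc={loc_hat:.5f}, scale={scale_hat:.5f}")

# 2) Historical VaR and expected shortfall (both reported as positive losses).
def historical_var_es(r, alpha=0.01):
    r = np.asarray(r, dtype=float)
    q = np.quantile(r, alpha)      # empirical alpha-quantile of returns
    var = -q
    es = -r[r <= q].mean()         # average loss at or beyond the VaR threshold
    return var, es

var_99, es_99 = historical_var_es(returns, alpha=0.01)
print(f"99% VaR={var_99:.4f}, 99% ES={es_99:.4f}")

# 3) Bootstrap interval for historical VaR (iid resampling: an assumption).
def bootstrap_var_interval(r, alpha=0.01, n_boot=500, seed=0):
    r = np.asarray(r, dtype=float)
    boot_rng = np.random.default_rng(seed)
    draws = [
        -np.quantile(boot_rng.choice(r, size=r.size, replace=True), alpha)
        for _ in range(n_boot)
    ]
    return np.percentile(draws, [2.5, 97.5])

lo, hi = bootstrap_var_interval(returns)
print(f"95% bootstrap interval for 99% VaR: [{lo:.4f}, {hi:.4f}]")
```

Expected shortfall is never smaller than VaR at the same level, because it averages the losses beyond the VaR cutoff; the bootstrap interval gives a sense of how noisy the tail estimate is.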
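Remember: mean-variance optimization only uses the first two moments. If skewness and kurtosis are far from normal, the optimizer simply ignores them, so variance understates the risk you actually bear and the "optimal" weights can load up on hidden tail risk.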
Closing — key takeaways
- Distributions and normality tests are the gatekeepers between tidy theory and messy markets.
- Always start with visuals, add skewness/kurtosis, then run formal tests — but interpret p-values with sample size in mind.
- If returns are non-normal (very likely for many assets), don’t pretend: use t-distributions, bootstrap methods, EVT, or conditional volatility models.
Final dramatic insight (because I promised a little chaos): if your portfolio strategy depends on "small probabilities of disaster," then the shape of the distribution is the hero or villain. Learn to read the shape like a weather report — it tells you whether calm is real or just the eye before the storm.
Go run that Q–Q plot. It’s a tiny habit that will save your portfolio's dignity.