Data Modeling Fundamentals
Build robust and efficient semantic models: relationships, cardinality, star schemas, normalization considerations, and sizing strategies.
Content
Tables and Relationships
Versions:
Watch & Learn
Tables and Relationships — Data Modeling Fundamentals (Power BI)
You’ve already tamed messy source data with Power Query, turned it into reliable dataflows, and even whispered sweet nothings to M code so your ETL runs on autopilot. Now: welcome to the backstage of your dataset — the place where tables actually talk to each other. This is where analysis becomes fast, accurate, and not embarrassing.
Why this matters (and why your visuals care)
If Power Query is the chef prepping ingredients, data modeling is the recipe. Tables by themselves are good ingredients; relationships stitch them into something edible. A bad or missing relationship = incorrect measures, broken filters, performance nightmares, and dashboards that lie with confidence.
We’re building on the work you already did in Power Query: clean keys, consistent types, and reusable dataflows are the foundations that make relationships reliable. If you skipped keyed IDs or normalized a table into 17 pieces for fun, relationships will be your next battleground.
Core concepts (the cliff notes your future self will thank you for)
Tables
- Fact tables: Wide, transaction-level, numeric metrics (sales, clicks, events). They’re the heavy lifters.
- Dimension tables: Descriptive context (customers, products, dates). They answer the “who/what/where/when.”
Design tip: Keep dimensions small and stable. Power Query should have given them consistent surrogate keys and clean columns.
Keys (the glue)
- Surrogate keys (recommended): Numeric, simple, stable IDs you create in Power Query when source keys are messy.
- Natural keys: Real-world IDs (SKU, email). Fine, but fragile.
If your keys are inconsistent across tables, a relationship is a rumor, not a contract. Fix keys upstream in your dataflows.
Relationships — what can go wrong and how to reason about it
Cardinality (the relationship types)
| Type | Meaning | Typical example |
|---|---|---|
| One-to-many | One row in Table A matches many rows in Table B | Customer -> Sales |
| Many-to-one | The inverse view of above | Sales -> Product |
| One-to-one | Rare; usually implies denormalization possible | SKU -> SKU master data |
| Many-to-many | Both sides have duplicates; use carefully | Product tags <-> Products (or use bridge table) |
Cross-filter direction
- Single: Filters flow from one side (usually Dimension) to the other (Fact). This is the common, safe default.
- Both (bi-directional): Filters flow both ways. Useful for certain relationship patterns but can create ambiguous filter contexts and hurt performance.
Active vs Inactive relationships
- Only one active relationship between two tables at a time. Inactive ones can be activated temporarily in DAX with USERELATIONSHIP(). Handy for alternate date relationships (Order Date vs Ship Date).
Practical patterns and examples (aka, when to do what)
Star schema (the golden rule)
- Central fact table(s) + surrounding dimension tables.
- Simplifies relationships (one-to-many from each dimension to fact).
- Great for performance and intuitive DAX.
Imagine a star like a celebrity’s entourage: the celebrity (Fact) is surrounded by loyal dimension tables who only influence the celeb, not each other.
Snowflake (when you normalize too far)
- Dimensions are normalized into sub-dimensions.
- Can reduce redundancy but often complicates DAX and slows queries. Use sparingly.
Bridge tables (for many-to-many)
- If two tables both contain duplicates and need to relate, create a bridge (junction) table with unique rows for combinations.
- Power BI now supports many-to-many relationships, but bridges often give clarity and better control.
Walkthrough: Common scenarios and how to fix them
Missing join keys after transformations
- Fix in Power Query: ensure consistent types (Text vs Whole Number), trim whitespace, pad zeros if necessary.
- Example: Sales[ProductID] should match Product[ProductID] exactly. If not, add a step in M to clean.
Ambiguous relationship causing incorrect totals
- Use a star schema; remove unnecessary relationships. Prefer single-direction filters unless you know the implications.
Multiple date columns (OrderDate, ShipDate) in fact
- Create a Date dimension and relate it once to the fact via the primary date. Use USERELATIONSHIP() in DAX to temporarily activate alternate dates.
Example DAX using USERELATIONSHIP():
Total Sales by Ship Date =
CALCULATE(
[Total Sales],
USERELATIONSHIP('Date'[Date], 'Sales'[ShipDate])
)
- Need to get a lookup value from another table
- Use RELATED() when you’re on the many-side of a one-to-many relationship:
Product Category = RELATED(Product[Category])
If RELATED() returns blank, your relationship is probably missing or inactive.
Performance & maintainability tips (because dashboards get judged harshly)
- Prefer numeric surrogate keys for joins; they’re faster than text.
- Keep dimension tables small and wide, not skinny and long.
- Avoid unnecessary bi-directional filters; they can explode calculation time.
- Push cleaning and key creation into your reusable dataflows (you already built those — good job!). That makes your model easier to update and avoids surprises.
Troubleshooting checklist (quick and dirty)
- Are datatypes identical between key columns? (Yes/No)
- Are keys unique on the one-side? (Yes/No)
- Is there an active relationship? (Yes/No)
- Is cross-filter direction appropriate? (Single/Both)
- Are there unintended relationships between dimension tables? (Yes/No)
If any answer is No, start in Power Query to fix keys and in Model view to adjust cardinality/direction.
Final pep talk + big-picture takeaway
You’ve already done the heavy lifting in Power Query: cleaning, folding when possible, and building reusable dataflows. Those steps are your secret sauce — they make the modeling step not just possible but stable.
Data modeling in Power BI is less about clever DAX spells and more about good structure: clear fact/dimension separation, reliable keys, appropriate cardinality, and careful filter direction. When you get this right, your measures behave, your visuals tell the truth, and you sleep better.
Remember: relationships are contracts between tables. Treat keys like promises. Keep them clean, consistent, and honored.
Quick reference: When to use what
- Use a star schema for most analytic models.
- Use bridge tables for intentional many-to-many relationships.
- Use USERELATIONSHIP for alternate relationships (date types).
- Use RELATED/RELATEDTABLE to traverse relationships in DAX.
Go forth, connect your tables with intention, and make your dashboards un-lie-able.
Want a next step? Build a small practice model: Customer (dimension), Product (dimension), Date (dimension), Sales (fact). Create relationships, then try an inactive relationship for Ship Date and exercise USERELATIONSHIP. It’s like bench-pressing for your brain.
Comments (0)
Please sign in to leave a comment.
No comments yet. Be the first to comment!