Courses/Power BI/Power Query and Data Transformation

Power Query and Data Transformation

723 views

Master data ingestion and shaping using Power Query: cleaning, transforming, combining queries, and using M code for repeatable ETL.

Content

5 of 10

Splitting, Merging, and Appending Queries

Power Query — Split/Merge/Append: The No-BS Guide

130 views

intermediate

humorous

visual

education theory

gpt-5-mini

130 views

Versions:

Power Query — Split/Merge/Append: The No-BS Guide

Watch & Learn

AI-discovered learning video

YouTube

Start learning for free

Bookmark content and pick up later
AI-generated study materials
Flashcards, timelines, and more
Progress tracking and certificates

Free to join · No credit card required

Splitting, Merging, and Appending Queries — Power Query Mastery (aka Data Surgery)

"Data is messy. Power Query is your scalpel. Try not to nick a cardinality." — Probably a wise BI TA

You're already off the starter pack: you know how to connect to data sources and how to coerce types and fix text/number/date transformations. Now we get into the surgical toolkit: splitting columns into useful pieces, merging datasets like a careful matchmaker, and appending tables into one glorious stack. These operations are the practical glue that turns raw sources into analysis-ready models.

Why this matters: Splitting makes fields analysable (first/last names, years, ZIP codes), merging brings related data together (customer records + transactions), and appending aggregates repeating sources (monthly files → one table). Do them well, and your visuals stop lying to you.

1) Split Columns — Chop with intention

Imagine you have a "FullName" column: "Alexandra Lopez". You can't pivot by first name if it's trapped in a single cell. Splitting extracts structure.

When to split:

Delimiter-based data (names separated by spaces, addresses with commas)
Fixed-width fields (legacy systems that cram 10 chars per field)
Patterns (uppercase boundaries, digit vs non-digit)

Common methods in Power Query UI:

Split by delimiter (space, comma, comma+space)
Split by number of characters / positions
Split into rows (turns one row into many)
Split by lowercase-to-uppercase or by digit/non-digit (advanced)

Quick example (M):

Table.SplitColumn(
  Source,
  "FullName",
  Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv),
  {"FirstName","LastName"}
)

Pro tips:

After splitting, set data types (remember Data Type Detection & Conversion). FirstName should be text; BirthYear should be Int32.
Trim whitespace (Transform -> Format -> Trim) before merging on split fields.
If you need only the last token (e.g., file extensions), choose "Split by delimiter -> Right-most delimiter".

2) Merge Queries — The joinery that makes insights possible

Merging in Power Query = SQL joins. But with more drama and a GUI.

Classic analogy: You have a transactions table (lots of rows) and a customers table (unique customers). Merge them to attach customer attributes to each transaction.

Join kinds (short table):

Join kind	What it returns	Use when...
Left Outer	All rows from left, matching from right	You have a primary fact table and want lookup fields
Right Outer	All rows from right, matching from left	Rare in practice; swap tables instead
Inner	Only matching rows	You want only intersections (e.g., valid keys)
Full Outer	All rows from both	Reconcile two datasets to see mismatches
Anti (Left/Right)	Non-matching rows	Find or remove orphan records

Power Query UI steps:

Home -> Merge Queries (or Merge Queries as New)
Select tables and matching columns (hold Ctrl for multi-column keys)
Pick Join Kind
Expand the new column to select fields to bring in

M example (M code):

Table.ExpandTableColumn(
  Table.NestedJoin(
    SalesTable,
    {"CustomerID"},
    CustomersTable,
    {"CustomerID"},
    "CustomerInfo",
    JoinKind.LeftOuter
  ),
  "CustomerInfo",
  {"Name","Segment"},
  {"CustomerName","CustomerSegment"}
)

Important nuances:

Data types matter. If the customer ID in one table is text and in the other is number, the merge yields zero matches. Convert types first.
Trim and clean key columns. Leading/trailing spaces and invisible characters are silent killers.
Cardinality matters. Many-to-many merges can explode rows unexpectedly. Know which side is unique.
Fuzzy merge exists when keys aren't exact (typos), but use sparingly — it’s computationally heavy and approximate.

3) Append Queries — Stack like pancakes

Appending is straightforward: you're stacking tables with the same shape. Use this for combining multiple period files or similar-shape datasets.

When to append:

Monthly CSVs with same columns
Split extracts across regions that should be one report

How-to:

Home -> Append Queries (or Append as New)
Choose two or three+ tables

M snippet:

Table.Combine({SalesJan, SalesFeb, SalesMar})

Checklist before appending:

Column names should match exactly. If column order differs but names match, Power Query aligns by name.
Types should be consistent (or re-typed after append). Prefer setting types after the final append to avoid repeated conversions.
Remove leftover metadata columns from source files (like "Source.Name") if not needed.

Power tip: Use the built-in "Combine Files" experience when connecting to a folder — it automates the pattern: sample file -> transform -> apply to all files.

4) Quick comparisons: Split vs Merge vs Append

Operation	Conceptual	Common outcome
Split	Break one column into many	More granular columns (e.g., FirstName, LastName)
Merge	Bring columns from another table	Enrich table with lookup attributes
Append	Stack tables vertically	One big table with more rows

5) Workflow & Performance Best Practices

Clean and filter early: remove unneeded rows/columns before heavy merges/appends.
Detect or set data types after major combining steps (set types once at the end to avoid wasted computation).
Use Reference queries instead of Duplicate when branching transformations — it keeps evaluations leaner.
Avoid merging on computed columns if you can create a stable key in the source. Query folding matters: when queries fold to the source, heavy operations can be pushed to the server.
If merging many-to-many, consider aggregating one side first to control row explosion.

Common gotchas (and how to avoid crying):

"Why am I getting no matches?" — data type mismatch or leftover spaces. Convert and Trim.
"My append has mismatched columns." — rename columns consistently before append.
"Merge exploded rows." — check cardinality: did you unknowingly merge a one-to-many on both sides?
"Performance is slow." — filter before merging; minimize columns; consider incremental refresh for large datasets.

Closing — Key takeaways

Split to expose structure. Merge to enrich. Append to consolidate.
Always clean keys and align data types before merging or appending — you learned type hygiene earlier, remember?
Use staging/reference queries, set types at the right time, and keep performance in mind.

Final lab assignment (tiny challenge):

Connect to a folder of monthly sales CSVs (your "Connecting to Data Sources" skill).
Use Combine Files to produce one table. Clean columns, split "FullName" into two, and merge customer demographics from a separate customers query.
Verify joins by counting unmatched keys (Left Anti Join).

Do this well and your reports will stop telling fibs. Do this poorly and the slicers will shame you.

Version note: you just graduated from "I can connect to data" to "I can shape data like an artist with a very precise scalpel." Keep slicing.

Flashcards

Mind Map

Speed Challenge

Comments (0)

Please sign in to leave a comment.

No comments yet. Be the first to comment!

Ready to practice?

Study with flashcards, timelines, and more

Earn certificates for completed courses

Bookmark content for later reference

Track your progress across all topics