Power Query and Data Transformation
Master data ingestion and shaping using Power Query: cleaning, transforming, combining queries, and using M code for repeatable ETL.
Content
Text, Number and Date Transformations
Versions:
Watch & Learn
Text, Number, and Date Transformations — The Practical Gym for Your Messy Data
"Power Query is less about magic, more about muscle: the more reps you do on text, number, and date moves, the cleaner your reports." — Someone who drinks too much coffee and loves M code
You're already past the boring part: connecting to data sources and letting Power Query peek at your files. You also know about Applied Steps and Query Settings (so name your steps like a responsible adult), and you've seen Power Query's eager little Data Type Detection (which, bless it, is often wrong). Now we're doing the actual work: turning that chaotic blob of text, numbers, and dates into something your visuals will thank you for.
Why this matters: messy text breaks joins and labels, incorrectly typed numbers wreck measures, and mis-parsed dates turn time intelligence into a crime scene. Fix these in Power Query and your DAX life becomes peaceful.
Quick roadmap (so you don't get lost)
- Text transformations — cleanup, standardize, parse
- Number transformations — coerce, compute, round, fix precision
- Date transformations — parse correctly, create time keys, calculate durations
- Best practices tying back to Applied Steps and Data Type Detection
1) Text Transformations — Because data often speaks a different dialect of human
Common problems: leading/trailing spaces, inconsistent case, phone numbers with punctuation, numbers and currencies stored as text, delimiters inside fields.
Key operations (UI + M snippets):
Trim / Clean — remove invisible nasties
- UI: Transform > Format > Trim / Clean
- M: Text.Trim(" hello ") => "hello"
Change Case — Proper, Upper, Lower
- M: Text.Proper("mARY aNN") => "Mary Ann"
Split Column — by delimiter, positions
- Great for "Full Name" -> "First", "Last" or parsing address lines
Extract / Replace / Remove — regex-like cleaning via Text.Replace, Text.Select
- M: Text.Replace([Phone], "(", "")
Parse numbers from text — strip currency symbols and parse
- M: Number.From(Text.Select([Price], {"0".."9","."}))
Concatenate (build labels)
- M: Text.Combine({[FirstName],[LastName]}, " ")
Real-world example: Sales CSV where Amount is "$1,234.56" and Region codes are "US - West". Use Text.Replace to remove symbols and commas, then Number.From to coerce to numeric.
Quick tip: always Preview after major replace operations — sometimes you remove the decimal point and create disasters.
2) Number Transformations — precision, formatting, and sanity checks
Common problems: numbers stored as text, percentages stored as strings, inconsistent decimal separators, division by zero risks.
Useful operations:
- Change Type: Number.Type or Decimal.Type — remember Data Type Detection may set this incorrectly; explicitly set types in a dedicated step.
- Rounding: Number.Round, Number.RoundDown, Number.RoundUp
- M: Number.Round([Value], 2)
- Safe arithmetic: handle nulls and zeros
- M: if [Denominator] = 0 or [Denominator] = null then null else [Numerator]/[Denominator]
- Normalization: scale, z-score, or currency conversion prior to modeling
Example: A column showing "5%" as text. Remove "%" and divide by 100:
let
cleaned = Text.Replace([PctText], "%", ""),
asNumber = Number.From(cleaned)/100
in
asNumber
Performance note: prefer Table.TransformColumns with a transformation function rather than adding many custom columns — it keeps memory usage lower.
3) Date Transformations — time to stop arguing with years
Dates are where things go wrong dramatically: locale mismatches (mm/dd vs dd/mm), datetimes vs dates, epoch timestamps, and fiscal calendars.
Essential moves:
- Parse reliably: Date.FromText with culture if format differs
- M: Date.FromText("31/12/2020", [Format="dd/MM/yyyy"]) — or use Date.From if default works
- Extract parts: Date.Year, Date.Month, Date.Day, Date.DayOfWeek, Date.WeekOfYear
- Start/End periods: Date.StartOfMonth, Date.EndOfQuarter
- Durations: Duration.Days, Date.AddDays, Date.AddMonths
- Create a Date table: never rely on source dates for time intelligence; generate a calendar in Power Query
Example: Parsing a timestamp stored as 20200215 (YYYYMMDD integer):
let
txt = Text.From([YYYYMMDD]),
dt = Date.FromText(Text.Middle(txt,0,4) & "-" & Text.Middle(txt,4,2) & "-" & Text.End(txt,2))
in
dt
Deal with timezones: strip time or use DateTimeZone.SwitchZone if you must preserve zone-aware times.
4) Troubleshooting & Best Practices (tie-back to Applied Steps + Type Detection)
- Name your steps! (Applied Steps > click gear > rename). "Step 6" is not a résumé-worthy name.
- Do heavy text cleanup before type conversion when you must remove symbols or parse numbers/dates. If Data Type Detection already coerced a column, you can right-click and Change Type again — but consider removing the autoType step and redoing it after transforms.
- Keep an eye on locale: in the Navigator/Source step you might need to set Locale for CSVs (especially for decimals and dates).
- Use error-check steps: Table.SelectRows where Value.Is([Col], type number) = false to find bad rows.
- Performance: prefer column-wise transforms (Table.TransformColumns) and avoid row-by-row custom functions unless necessary.
Handy M Snippets Cheat Sheet
// Remove currency and parse to number
Number.From(Text.Select([Price], {"0".."9","."}))
// Safe divide
if [Den] = 0 or [Den] = null then null else [Num]/[Den]
// Extract Year from YYYYMMDD int
Date.FromText(Text.Middle(Text.From([YYYYMMDD]),0,4) & "-" & Text.Middle(Text.From([YYYYMMDD]),4,2) & "-" & Text.End(Text.From([YYYYMMDD]),2))
Common Gotchas (so you don’t cry at 2am)
- CSV with comma decimals in Europe vs comma delimiters: set the file locale at import.
- Percentages stored as "12%" — Text -> remove % -> divide by 100.
- Null vs empty string: Text.Trim("") is "" (not null). Use null coalescing if you need nulls.
- Auto type detection turned your SKU into a number and dropped leading zeros. Check the Applied Steps and fix the type at the right moment.
Closing — Practical Checklist & Final Thought
- Trim & clean text columns
- Parse numeric values (remove symbols) and set numeric types after cleaning
- Parse dates with locale/format awareness, build a Date table
- Rename steps and keep type conversion explicit (don't blindly trust auto-detect)
- Test for errors and edge cases (empty strings, special chars, unexpected formats)
Final insight: Transforming text, numbers, and dates isn't glamorous, but it's the work that makes your dashboards credible. Think of Power Query as your workshop — the tools are simple, but the craft separates the reports that mislead from the reports that inform.
Go forth, rename your steps, banish stray spaces, and build a date table like your KPIs depend on it (because they do).
Comments (0)
Please sign in to leave a comment.
No comments yet. Be the first to comment!