Python for Data Science, AI & Development

Data Structures and Iteration


Use Python collections and iteration patterns to write expressive, efficient, and readable data-oriented code.


Dictionaries and Dict Comprehensions

Dictionaries and Dict Comprehensions in Python for Data
Dictionaries and Dict Comprehensions — Fast, Friendly, and Functional

"If lists are grocery bags and tuples are sealed Tupperware, dictionaries are labeled spice jars — wildly useful when you need to find the thing that matches a name."

You're coming from Python Foundations for Data Work and have already met lists (and their shiny list comprehensions) and tuples (those immutably reliable friends). Now we move to the structure that makes lookups instantaneous and your code smell less like a dumpster fire: dictionaries and dict comprehensions.


What is a dictionary and why it matters for data work

  • Dictionary: a mutable mapping of keys → values. Keys must be hashable (strings, numbers, and tuples of hashable items are fine; lists are not).
  • Where it appears in data tasks:
    • Feature lookup: map category → index or one-hot vector
    • Frequency tables: token → count
    • Metadata: column_name → dtype / normalization factor
    • Fast joins/merges when you don't want the overhead of pandas

If lists are great for ordered sequences and tuples guarantee safety (immutability), dictionaries are unbeatable for keyed access — O(1) average-time lookups. That’s why they’re everywhere in data pipelines.
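To make the "fast joins" point concrete, here is a minimal sketch of a dictionary-based lookup join. The records, the field names (`user_id`, `name`, `amount`), and the `"unknown"` fallback are all invented for illustration.

```python
# Hypothetical records; field names are illustrative only.
users = [
    {"user_id": 1, "name": "Ada"},
    {"user_id": 2, "name": "Grace"},
]
orders = [
    {"user_id": 2, "amount": 30.0},
    {"user_id": 1, "amount": 12.5},
    {"user_id": 3, "amount": 99.0},  # no matching user
]

# Build the index once: user_id -> name (one pass over users).
name_by_id = {u["user_id"]: u["name"] for u in users}

# Each lookup is O(1) on average, so the join is a single pass over orders
# instead of rescanning the users list for every order.
joined = [
    {**o, "name": name_by_id.get(o["user_id"], "unknown")}
    for o in orders
]
print(joined)
```

Scanning `users` for every order would cost roughly len(users) × len(orders) comparisons; with the index, the cost is about len(users) + len(orders).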


Quick reminders from earlier (lists & tuples)

  • You used list comprehensions to transform sequences: [x**2 for x in nums]. Expect the same elegant expressiveness with dict comprehensions: {k: v for ...}.
  • Tuples can be used as dictionary keys because they’re immutable and hashable (as long as every element inside is hashable too); lists cannot.
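A quick sketch of what "hashable" means in practice, using made-up sales figures keyed by a (store, month) tuple. The variable names are hypothetical.

```python
# Composite key: (store, month) -> sales. The data is made up for illustration.
sales = {("north", "2024-01"): 120, ("south", "2024-01"): 95}
sales[("north", "2024-02")] = 130   # tuples of hashable items work as keys

try:
    sales[["north", "2024-03"]] = 140   # a list is mutable, so it is unhashable
except TypeError as err:
    list_key_error = str(err)

try:
    sales[("north", ["2024-03"])] = 140  # a tuple that contains a list fails too
except TypeError as err:
    nested_key_error = str(err)

print(list_key_error)
print(nested_key_error)
```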

Basic dictionary usage (the easy bits)

Create from literals or two lists:

# literal
d = {'a': 1, 'b': 2}

# from two lists
cols = ['id', 'name', 'age']
values = [101, 'Ada', 29]
row = dict(zip(cols, values))  # {'id':101,'name':'Ada','age':29}

Access safely:

# square brackets raise KeyError when the key is missing
try:
    x = d['c']
except KeyError:
    x = None

# safe with default
x = d.get('c', 0)  # 0, because 'c' is not in d
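One subtlety worth seeing once: `get` only falls back to the default when the key is absent, not when the stored value is empty. A small sketch with a hypothetical `row` dict:

```python
# Hypothetical row where one value is present but empty.
row = {"id": 101, "age": None}

print(row.get("age", 0))      # None: the key exists, so the default is ignored
print(row.get("salary", 0))   # 0: the key is missing, so the default is used

# Membership tests tell the two cases apart.
has_age = "age" in row        # True
has_salary = "salary" in row  # False

# setdefault inserts the default when the key is missing, then returns the value.
salary = row.setdefault("salary", 0)
```

After the last line, `row` contains a `"salary"` key, which is the difference from `get`: `get` never modifies the dictionary.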

Update/merge:

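d.update({'b': 3, 'c': 4})  # mutates d in place; the existing 'b' is overwritten
# Python 3.5+: new_d = {**d, **other} builds a new dict (other is any second dict)
# Python 3.9+: new_d = d | other does the same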
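The difference between merging in place and building a new dict is easy to miss, so here is a small sketch. The `defaults` and `overrides` dicts are hypothetical settings, not tied to any particular library.

```python
defaults = {"sep": ",", "header": True, "encoding": "utf-8"}
overrides = {"sep": ";", "nrows": 100}

# Unpacking builds a new dict; on a key collision the right-most dict wins.
merged = {**defaults, **overrides}

# update() changes the dict in place and returns None,
# so copy first when the original must stay untouched.
in_place = dict(defaults)
in_place.update(overrides)

print(merged)
print(defaults["sep"])  # still ","
```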

Iteration patterns — choose your weapon:

for k in d:                             # keys
    print(k)
for v in d.values():                    # values
    print(v)
for k, v in d.items():                  # both
    print(k, v)
for i, (k, v) in enumerate(d.items()):  # index + items
    print(i, k, v)

Sort while iterating:

for k in sorted(d):
    print(k, d[k])
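`sorted(d)` orders by key. In data work you often want to order by value instead (for example, most frequent first). A minimal sketch, with a made-up `counts` dict:

```python
from operator import itemgetter

counts = {"b": 7, "a": 3, "c": 0}

# Sort the (key, value) pairs by the value, largest first.
by_value_desc = sorted(counts.items(), key=itemgetter(1), reverse=True)

# Rebuild a dict in that order (dicts keep insertion order in Python 3.7+).
ranked = dict(by_value_desc)

# Keep only the two most frequent keys.
top2 = [k for k, _ in by_value_desc[:2]]

print(by_value_desc)
print(top2)
```

`itemgetter(1)` picks the second element of each pair; `lambda kv: kv[1]` would do the same job.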

Dict comprehensions: list comprehension's wilder cousin

Syntax mirrors list comprehensions but builds a mapping:

# basic: feature -> normalized value
counts = {'a': 3, 'b': 7, 'c': 0}
total = sum(counts.values())
norm = {k: v/total for k, v in counts.items()}

Filter while building:

# keep only frequent features
freq_filtered = {k: v for k, v in counts.items() if v >= 2}

Conditionals inside values:

# bucketize
buckets = {k: ('high' if v > 5 else 'low') for k, v in counts.items()}

Grouping/inverting (a loop first, then a nested comprehension):

# invert mapping: value -> list of keys that had that value
inv = {}
for k, v in d.items():
    inv.setdefault(v, []).append(k)

# same mapping with a dict + list comprehension; slower, because it
# rescans d once per distinct value:
inv = {v: [k for k, val in d.items() if val == v] for v in set(d.values())}

When to prefer dict comprehension: when you can build the mapping in a single, readable expression. If you need complex aggregation, a loop or collections.defaultdict/Counter is often clearer.
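As a rough illustration of that split, the sketch below aggregates with a loop and `defaultdict`, then finishes with a dict comprehension. The `rows` data and the variable names are invented.

```python
from collections import defaultdict

# Hypothetical (category, price) observations.
rows = [("fruit", 1.2), ("veg", 0.8), ("fruit", 2.0), ("veg", 1.0), ("fruit", 0.4)]

# Aggregation step: a plain loop with defaultdict keeps running totals readable.
totals = defaultdict(float)
n_seen = defaultdict(int)
for category, price in rows:
    totals[category] += price
    n_seen[category] += 1

# Final transform: one dict comprehension turns totals into means.
mean_price = {cat: totals[cat] / n_seen[cat] for cat in totals}
print(mean_price)
```

Cramming the running totals into a single comprehension is possible, but it would rescan `rows` once per category and be harder to read.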


Data-science flavored examples (so you can flex in notebooks)

  1. Map categorical values to indices (useful before feeding into models):
cats = ['apple', 'banana', 'apple', 'cherry']
cat_to_idx = {cat: i for i, cat in enumerate(sorted(set(cats)))}
# {'apple': 0, 'banana': 1, 'cherry': 2}
  2. Frequency counts, the idiomatic way (Counter) vs. a manual dict:
from collections import Counter
Counter(cats)  # quickest

# manual (good exercise):
counts = {}
for c in cats:
    counts[c] = counts.get(c, 0) + 1

# Normalize with dict comprehension (compute the total once, not once per key)
total = sum(counts.values())
normalized = {k: v / total for k, v in counts.items()}
  3. Feature engineering: rename columns
raw_cols = ['Age (yrs)', 'Salary USD']
clean = {c: c.lower().replace(' ', '_').replace('(', '').replace(')', '')
         for c in raw_cols}
# {'Age (yrs)': 'age_yrs', 'Salary USD': 'salary_usd'}
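Building a mapping is only half the job; the sketch below shows one way such mappings might be applied. The `new_values` list, the `-1` sentinel for unseen categories, and the `raw_row` dict are assumptions made for illustration.

```python
cats = ['apple', 'banana', 'apple', 'cherry']
cat_to_idx = {cat: i for i, cat in enumerate(sorted(set(cats)))}

# Encode new data; -1 is an arbitrary sentinel for categories never seen before.
new_values = ['banana', 'durian', 'apple']
encoded = [cat_to_idx.get(c, -1) for c in new_values]

# Inverse mapping, handy for decoding model outputs back to labels.
idx_to_cat = {i: cat for cat, i in cat_to_idx.items()}

# Apply a rename mapping to the keys of a row dict;
# keys without an entry in the mapping are kept as they are.
clean = {'Age (yrs)': 'age_yrs', 'Salary USD': 'salary_usd'}
raw_row = {'Age (yrs)': 29, 'Salary USD': 52000}
renamed = {clean.get(k, k): v for k, v in raw_row.items()}

print(encoded)
print(renamed)
```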

Advanced tips & gotchas

  • Keys must be hashable: strings, numbers, and tuples of hashable items are OK; lists, sets, and dicts are not allowed.
  • If you need multiple values per key, store lists or use defaultdict(list).
  • Performance: dict lookups are O(1) on average — perfect for joins and lookups.
  • Beware colliding keys when merging: for a key present in both dicts, the value from the later dict overwrites the earlier one.
  • For frequency tasks, prefer collections.Counter or defaultdict for clarity and speed.
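For the "multiple values per key" tip, a minimal sketch with `defaultdict(list)`. The `(user, tag)` pairs are made up.

```python
from collections import defaultdict

# Hypothetical (user, tag) pairs.
pairs = [("ada", "python"), ("bob", "sql"), ("ada", "numpy"), ("bob", "sql")]

# defaultdict(list) creates the empty list on first access to a key.
tags_by_user = defaultdict(list)
for user, tag in pairs:
    tags_by_user[user].append(tag)

# Use a set instead when duplicates should collapse.
unique_tags = defaultdict(set)
for user, tag in pairs:
    unique_tags[user].add(tag)

# Convert back to a plain dict so that later lookups of missing keys
# raise KeyError instead of silently inserting empty containers.
tags_by_user = dict(tags_by_user)
print(tags_by_user)
```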

Quick comparison: list vs tuple vs dict (in one glance)

  • List: ordered, mutable — good for sequences
  • Tuple: ordered, immutable — safe as dict keys
  • Dictionary: insertion-ordered (Python 3.7+) key→value mapping, accessed by key rather than by position; good for fast lookups and labeled data

Best practices for data projects

  • Use dict comprehensions for readable mapping transforms and small lookups.
  • Use Counter/defaultdict for aggregations; use comprehensions for final transformations.
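  • Keep keys simple and consistent (strings or tuples). Instances of custom classes make fragile keys: by default they hash by identity, so an equal-looking object rebuilt after pickling or in a new session will not find the old entry.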
  • Document what keys mean — dictionaries are flexible but can become cryptic messes if keys are used inconsistently.

Key takeaways

  • Dictionaries are the go-to structure for labeled, fast-access data.
  • Dict comprehensions give you declarative power like list comprehensions, letting you map and filter in one line.
  • Use tuple keys when you need composite keys (they're immutable and hashable). Use defaultdict/Counter when aggregating.

"Think of a dict as the indexed index of your data — you can call things by name instead of rummaging through every row."

Go practice: convert a CSV header & row into a dict (zip), then write a dict comprehension to normalize numeric columns and filter out low-quality features. That combo bridges your Python Foundations into real, clean data work.
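If you want something to compare your attempt against, here is one possible sketch of that exercise. Everything specific in it is an assumption: the two-line CSV text, the rule that an empty field counts as "low quality", and the per-column maxima in `col_max`.

```python
import csv
import io

# Hypothetical two-line CSV; a real file would be opened with open(path, newline="").
raw = "id,age,salary,notes\n101,29,52000,\n"
reader = csv.reader(io.StringIO(raw))
header = next(reader)
values = next(reader)

# Header + row -> dict.
row = dict(zip(header, values))

# Keep non-empty fields only (an empty string stands in for "low quality" here).
row = {k: v for k, v in row.items() if v != ""}

# Assumed per-column maxima used for scaling; these numbers are made up.
col_max = {"age": 100, "salary": 200000}

# Scale the numeric columns to the 0..1 range; leave everything else untouched.
scaled = {k: (float(v) / col_max[k] if k in col_max else v) for k, v in row.items()}
print(scaled)
```

Note that `csv.reader` returns every field as a string, which is why the comprehension converts with `float` before dividing.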


Happy mapping. When in doubt, enumerate + items() + a bit of comprehension will rescue 90% of your code smell.
