Data Structures and Iteration
Use Python collections and iteration patterns to write expressive, efficient, and readable data-oriented code.
Enumerate and Zip — The Dynamic Duo for Python Iteration
You already know about iterables, iterators, and how generators lazily hand you values like a careful bartender handing out shots. Now meet the reliable wingmates who let you track positions and pair things up without turning your data into a chaotic pile of lists: enumerate and zip.
Why these matter (and where you'll use them)
- Enumerate: when you need the index and the value — think labeling rows, logging where a bad value showed up, or applying different rules by position.
- Zip: when you want to iterate multiple sequences in parallel — think pairing features with labels, merging column-wise data, or stepping through two streams together.
In data work you'll use these to: pair features and targets, align columns from different sources, iterate generators while keeping track of position, and produce tidy, readable loops instead of index gymnastics.
enumerate — counting without the noise
What it is
enumerate(iterable, start=0) returns an enumerate object (an iterator) that yields pairs: (index, value).
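To make the (index, value) behaviour concrete, here is a minimal sketch of a generator that behaves roughly like enumerate. The name my_enumerate is hypothetical and exists only for illustration; in real code you would use the built-in.

```python
def my_enumerate(iterable, start=0):
    """Hypothetical stand-in for the built-in enumerate, for illustration only."""
    index = start
    for value in iterable:
        # Hand back the running counter together with the current item.
        yield index, value
        index += 1


names = ['Ada', 'Grace', 'Katherine']
numbered = list(my_enumerate(names, start=1))
print(numbered)
# [(1, 'Ada'), (2, 'Grace'), (3, 'Katherine')]
```

Because the sketch is a generator, nothing is computed until a loop asks for the next pair, which mirrors the lazy behaviour of the real enumerate.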
Why use it?
- Avoid range(len(...)) and ugly indexing.
- Works with generators/iterators without materializing lists (we're still memory friendly; remember the generator lessons!).
Examples
names = ['Ada', 'Grace', 'Katherine']
for i, name in enumerate(names, start=1):
    print(i, name)
# 1 Ada
# 2 Grace
# 3 Katherine
Micro explanation: start sets the first index — handy when your data has 1-based IDs.
Enumerate over a generator (no list conversion):
def row_stream():
    for n in range(1000000):
        yield f"row_{n}"

for idx, row in enumerate(row_stream()):
    if idx >= 3:
        break
    print(idx, row)
This keeps memory low — you're just pulling values as needed.
zip — parallel iteration and pairing
What it is
zip(*iterables) returns an iterator of tuples where the i-th tuple contains the i-th element from each iterable.
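The "i-th element from each iterable" rule can be sketched as a small generator. The name my_zip is hypothetical and the code is a simplified illustration of the behaviour, not the actual implementation of the built-in.

```python
def my_zip(*iterables):
    """Hypothetical stand-in for the built-in zip, for illustration only."""
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return  # no inputs: nothing to yield
    while True:
        row = []
        for it in iterators:
            try:
                row.append(next(it))
            except StopIteration:
                return  # one input ran out, so the pairing stops here
        yield tuple(row)


print(list(my_zip([1, 2, 3], 'ab')))
# [(1, 'a'), (2, 'b')]
```

Each tuple is built only when requested, and the loop ends as soon as any input is exhausted.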
Typical uses
- Combine columns: features and targets.
- Walk two lists together: predicted vs actual.
- Transpose matrix-like lists with the zip(*rows) trick.
Examples
xs = [1, 2, 3]
ys = [10, 20, 30]
for x, y in zip(xs, ys):
    print(x, y)
# 1 10
# 2 20
# 3 30
Unzip (the neat inverse trick):
pairs = [(1, 'a'), (2, 'b'), (3, 'c')]
nums, letters = zip(*pairs)
# nums -> (1,2,3), letters -> ('a','b','c')
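One edge case worth knowing: unpacking zip(*pairs) into two names fails when pairs is empty, because zip() with no arguments yields nothing. A minimal sketch of a guard follows; unzip_pairs is a hypothetical helper name, not a standard-library function.

```python
def unzip_pairs(pairs):
    """Hypothetical helper: split (a, b) pairs into two tuples, tolerating empty input."""
    pairs = list(pairs)  # materialize so we can test for emptiness
    if not pairs:
        return (), ()
    firsts, seconds = zip(*pairs)
    return firsts, seconds


nums, letters = unzip_pairs([(1, 'a'), (2, 'b'), (3, 'c')])
print(nums, letters)
# (1, 2, 3) ('a', 'b', 'c')

print(unzip_pairs([]))
# ((), ())
```

Note that the results are tuples, not lists; wrap them in list(...) if you need to mutate them.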
Watch out: zip truncates
If iterables differ in length, zip stops at the shortest one. If you want to preserve length and fill with a default, use itertools.zip_longest.
from itertools import zip_longest
for a, b in zip_longest([1, 2], [10], fillvalue=None):
    print(a, b)
# 1 10
# 2 None
Combine enumerate + zip — elegant and common in data work
You often want the index while iterating multiple sequences: say you want to compare predicted and actual labels and report the first few mismatches with their positions.
Example:
preds = [0, 1, 0, 1]
actual = [0, 0, 0, 1]
for i, (p, a) in enumerate(zip(preds, actual)):  # neat tuple unpacking
    if p != a:
        print(f"Mismatch at {i}: pred={p} actual={a}")
This pattern keeps loops readable and intention explicit.
Micro explanation: enumerate returns indices lazily; zip pairs items lazily. You keep streaming-style efficiency if preds / actual are generators.
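As a sketch of that streaming behaviour, the example below reports only the first few mismatches between two generators. The mismatches helper and the synthetic streams are assumptions made up for illustration; itertools.islice stops the pipeline after three results, so neither generator is consumed further.

```python
from itertools import islice


def mismatches(preds, actual):
    """Yield (index, pred, actual) for each position where the two disagree."""
    for i, (p, a) in enumerate(zip(preds, actual)):
        if p != a:
            yield i, p, a


# Synthetic streams: predictions alternate 0, 1, 0, 1, ...; the truth is always 0.
pred_stream = (n % 2 for n in range(1_000_000))
actual_stream = (0 for _ in range(1_000_000))

first_three = list(islice(mismatches(pred_stream, actual_stream), 3))
print(first_three)
# [(1, 1, 0), (3, 1, 0), (5, 1, 0)]
```

No list of a million items is ever built; only the handful of values needed to find three mismatches is pulled from each stream.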
Practical patterns and one-liners you should memorize
Replace index gymnastics:
- Bad: for i in range(len(values)): do stuff with values[i]
- Good: for i, v in enumerate(values): do stuff with i and v
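A side-by-side sketch of the two styles, using a made-up values list. It also shows a practical difference: the index-based version needs len(), which a generator does not support, while enumerate works on any iterable.

```python
values = [3, -1, 4, -1, 5]

# Index-based version: works on lists, but repeats values[i] and needs len().
negatives_by_index = []
for i in range(len(values)):
    if values[i] < 0:
        negatives_by_index.append(i)

# enumerate version: index and value arrive together.
negatives = [i for i, v in enumerate(values) if v < 0]
print(negatives_by_index, negatives)
# [1, 3] [1, 3]

# A generator has no len(), so only the enumerate style works on it.
stream = (v for v in values)
try:
    len(stream)
    has_len = True
except TypeError:
    has_len = False

negatives_from_stream = [i for i, v in enumerate(stream) if v < 0]
print(has_len, negatives_from_stream)
# False [1, 3]
```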
Parallel transformations:
# Assumes df is a pandas DataFrame that already has these three columns.
cols = ['age', 'income', 'score']
transforms = [lambda x: x, lambda x: x / 1000, lambda x: (x - 50) / 10]
for name, fn in zip(cols, transforms):
    df[name] = df[name].apply(fn)
Transpose a list of rows to columns:
rows = [[1, 'a'], [2, 'b'], [3, 'c']]
cols = list(zip(*rows))
# cols -> [(1,2,3), ('a','b','c')]
Advanced tips and gotchas
- enumerate returns an iterator object — if you need to index into it you must materialize: list(enumerate(...)). But prefer processing on the fly.
- zip with different lengths quietly truncates — that's a common source of bugs when joining misaligned data from different sources.
- If you're zipping many iterables and memory matters, ensure each is an iterator/generator, not an expanded list.
- When using with pandas, prefer vectorized operations, but enumerate + zip remains useful when you must apply custom logic row-wise or column-wise.
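The first gotcha can be seen in a few lines. This is a small sketch with a made-up names list: an enumerate object cannot be indexed, and it is exhausted after one pass, so materialize it once if you need random access or repeated passes.

```python
names = ['Ada', 'Grace', 'Katherine']

numbered = enumerate(names, start=1)

# An enumerate object cannot be indexed directly.
try:
    numbered[0]
    indexable = True
except TypeError:
    indexable = False

first_pass = list(numbered)   # consumes the iterator
second_pass = list(numbered)  # nothing is left on a second pass

# Materialize once if you need random access or repeated passes.
lookup = list(enumerate(names, start=1))

print(indexable)    # False
print(first_pass)   # [(1, 'Ada'), (2, 'Grace'), (3, 'Katherine')]
print(second_pass)  # []
print(lookup[1])    # (2, 'Grace')
```

The same single-pass rule applies to zip objects.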
Why this matters in the broader course
You already learned how iterators and generators let you handle streams efficiently. enumerate and zip are the everyday control tools that make those streams useful: tracking position and pairing streams. They let you write loops that are expressive, safe, and memory-conscious — which is exactly what you want in data pipelines, ETL jobs, and reproducible analyses.
"This is the moment where the concept finally clicks: generators give you values; enumerate and zip help you organize them."
Quick reference
- enumerate(iterable, start=0) -> iterator of (index, value)
- zip(*iterables) -> iterator of tuples (one element from each input)
- zip_longest for unmatched lengths
- unzip with zip(*pairs)
Key takeaways
- Use enumerate to get an index without ugly range(len(...)) code. It's generator-friendly.
- Use zip to iterate multiple sequences in parallel and to transpose/unzip data.
- Combine them when you need indices while walking multiple streams — it's clean, efficient, and readable.
Final memorable image: think of a data pipeline like a conveyor belt. Generators are the belt — moving items one by one. Zip grabs two belts and places items side by side. Enumerate sticks little post-it notes on the belt so you always know which slot you’re looking at. Elegant, practical, and slightly therapeutic.
Continue next: we'll see how to use these with more advanced iterator combinators (itertools) and how generator expressions and lazy mapping play with these patterns — the perfect follow-up to the generators and yield material you just read.