Python Foundations for Data Work
Master core Python syntax and tooling for data tasks, from environments and notebooks to clean, reliable scripts.
Conditionals and Control Flow — Make Your Data Decide
You already met the gang: Booleans and Logic (we learned how True/False and and/or/not behave) and Numbers and Arithmetic (how to compare magnitudes). Now it’s time to give that logic muscles: conditionals and control flow. This is where your code starts making decisions like a slightly overcaffeinated data analyst.
"This is the moment where the concept finally clicks: code that chooses — not just calculates."
Why conditionals matter for data work
- Filtering bad rows, branching pipelines, feature engineering rules, early exits when a file is missing — all of these are control-flow problems.
- Conditionals let your scripts react to the data, not just passively compute values.
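Imagine a data pipeline that tries to normalize values but runs into None, NaN, or negative values. Without conditionals, it either crashes (arithmetic on None raises a TypeError) or quietly produces garbage (NaN spreads through every calculation it touches). With conditionals, your pipeline says, "Not today, NaN," and either fixes the value or skips the row.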
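A minimal sketch of that idea, under a few assumptions: the function name normalize_readings is made up for illustration, and this version chooses to skip bad values rather than repair them. A real pipeline might log or impute instead.

```python
import math

def normalize_readings(values, max_value):
    """Scale usable readings by max_value; skip None, NaN, and negatives."""
    result = []
    for v in values:
        if v is None:
            continue  # missing reading: arithmetic on None would raise TypeError
        if math.isnan(v):
            continue  # NaN would silently propagate through the division
        if v < 0:
            continue  # treated as invalid in this sketch (an assumption)
        result.append(v / max_value)
    return result

print(normalize_readings([50, None, float("nan"), -3, 100], max_value=100))
# prints [0.5, 1.0]
```

Each check is a small guard: the row either passes all of them or is skipped before it can break the arithmetic.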
The basics: if, elif, else
    x = 42
    if x < 0:
        print('negative')
    elif x == 0:
        print('zero')
    else:
        print('positive')
Micro explanation:
- if tests a condition; if True, its block runs.
- elif (else-if) lets you chain checks.
- else is the default path when none of the above conditions are True.
Quick tip
Use comparisons from the Numbers lesson (like <, >, ==) combined with Booleans/Logic (and/or/not). Those two earlier lessons are your conditionals' fuel.
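Here is one way the two lessons might combine in practice. The function classify_reading and its low/high defaults are hypothetical, chosen only to show comparisons joined with or inside an if/elif/else chain. Note that only the first matching branch runs, so the order of the checks matters.

```python
def classify_reading(value, low=0, high=100):
    """Return a label for one reading; the first matching branch wins."""
    if value is None:
        return "missing"
    elif value < low or value > high:
        return "out_of_range"
    elif value == low or value == high:
        return "boundary"
    else:
        return "ok"

for sample in [None, -5, 0, 42, 100, 250]:
    print(sample, "->", classify_reading(sample))
```

The None check comes first on purpose: comparing None with a number using < would raise a TypeError, so the later branches rely on that guard.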
Truthiness and falsiness (why Python sometimes surprises you)
Python evaluates many objects as True or False without explicit True/False values. Know these common falsy items:
- False
- None
- 0, 0.0
- empty sequences/collections: '', [], (), {}
So these are equivalent to False in conditionals:
    if []:
        print('this will NOT run')
    if 0:
        print('this will also NOT run')
Table: common checks
| Object | Boolean value |
|---|---|
| [] | False |
| [1] | True |
| '' | False |
| '0' | True |
| None | False |
Why do people misunderstand this? Because they assume only True/False are valid, but Python is pragmatic: non-empty means useful.
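You can check any object's truthiness yourself with the built-in bool(). The short sketch below prints a few cases, including two that often surprise people in data work: NaN is truthy, and a legitimate zero is falsy. The helper describe_count is a hypothetical example of telling "no value" apart from "zero".

```python
samples = [[], [1], "", "0", None, 0, 0.0, float("nan"), {}, {"a": 1}]
for obj in samples:
    print(repr(obj), "->", bool(obj))

def describe_count(count):
    """Distinguish a missing value from a real zero."""
    if count is None:          # explicit check: 0 must not be treated as missing
        return "missing"
    return f"count={count}"

print(describe_count(0))     # count=0
print(describe_count(None))  # missing
```

If describe_count had used "if not count", a real count of 0 would have been reported as missing. Truthiness is convenient for "is this container empty?", but for numbers an explicit comparison is usually safer.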
Combining conditions: short-circuiting and order
Remember logical operators from Booleans & Logic:
- and stops at the first False (short-circuit)
- or stops at the first True
    def get_first_positive(nums):
        if nums and nums[0] > 0:
            return nums[0]
        return None
Here nums and nums[0] > 0 is safe because if nums is an empty list (falsy), Python stops and does not evaluate nums[0] > 0 (avoids IndexError).
Order matters in conditions. Put cheap/fast checks first and expensive ones later, especially if they guard against errors.
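To make the ordering advice concrete, the sketch below counts how often a slow check actually runs. The names expensive_check and is_interesting are hypothetical stand-ins, and the "expensive" work here is just a sum.

```python
calls = {"expensive": 0}

def expensive_check(row):
    """Stand-in for a slow validation (hypothetical helper)."""
    calls["expensive"] += 1
    return sum(row["values"]) > 10

def is_interesting(row):
    # Cheap guards first: they also protect the lookups that follow.
    return bool(row) and "values" in row and expensive_check(row)

rows = [None, {}, {"values": [1, 2]}, {"values": [5, 6, 7]}]
flags = [is_interesting(r) for r in rows]
print(flags)               # [False, False, False, True]
print(calls["expensive"])  # 2
```

Four rows went in, but the slow check ran only twice, because and stopped early for the None row and the empty dict.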
Chained comparisons and why they’re neat in data checks
Python supports chained comparisons:
    if 0 <= value < 100:
        print('value is a valid percentage-like number')
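This reads like natural language and is clearer than if (0 <= value) and (value < 100):. It also evaluates the middle expression only once, which matters when that expression is a function call rather than a plain variable.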
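A small experiment shows the difference when the middle of the chain is a function call. The read_sensor function is a hypothetical stand-in that records every call.

```python
lookups = []

def read_sensor():
    """Hypothetical sensor read that records each call."""
    lookups.append("read")
    return 73

chained = 0 <= read_sensor() < 100
calls_chained = len(lookups)

lookups.clear()
spelled_out = (0 <= read_sensor()) and (read_sensor() < 100)
calls_spelled = len(lookups)

print(chained, calls_chained)      # True 1
print(spelled_out, calls_spelled)  # True 2
```

Both forms give the same answer here, but the chained version reads the sensor once while the spelled-out version reads it twice.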
Conditional expressions (ternary) — concise decisions
Compact inline conditional:
    status = 'ok' if error_count == 0 else 'needs_attention'
Good for quick labels, not for multi-line logic. For data labeling, ternaries are a nice tool for succinct transformations.
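For example, a conditional expression fits naturally inside a comprehension when labeling several items at once. The step names and error counts below are made-up sample data.

```python
error_counts = {"load": 0, "clean": 3, "export": 0}

statuses = {
    step: ("ok" if count == 0 else "needs_attention")
    for step, count in error_counts.items()
}
print(statuses)
# {'load': 'ok', 'clean': 'needs_attention', 'export': 'ok'}
```

Once a label needs three or more outcomes, a small function with if/elif/else is usually easier to read than nested conditional expressions.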
Control flow beyond conditionals: loops, break, continue, and guard clauses
- for and while iterate. Use break to stop early and continue to skip the current iteration.
- Guard clauses are early returns in functions that keep code readable.
Example: scanning rows and stopping when a critical error appears
    for row in rows:
        if row.get('critical_error'):
            report(row)
            break  # stop scanning: we already found the big issue
        if not valid(row):
            continue  # skip bad rows
        process(row)
Micro explanation: guard clauses (if not valid(row): continue) reduce nesting and make code easier to follow.
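The loop above depends on report, valid, and process, which the lesson does not define. The self-contained sketch below swaps in simple stand-ins (all of them assumptions) so the break and continue behavior can be observed directly.

```python
def scan_rows(rows):
    """Process valid rows until a critical error is seen."""
    processed = []
    reported = None
    for row in rows:
        if row.get("critical_error"):
            reported = row              # stands in for report(row)
            break                       # nothing after this row is examined
        if row.get("value") is None:    # stands in for `not valid(row)`
            continue                    # skip this row, keep scanning
        processed.append(row["value"] * 2)  # stands in for process(row)
    return processed, reported

rows = [
    {"value": 1},
    {"value": None},
    {"value": 3},
    {"critical_error": True},
    {"value": 5},
]
processed, reported = scan_rows(rows)
print(processed)  # [2, 6]
print(reported)   # {'critical_error': True}
```

The row with value 5 is never processed, because break ended the loop at the critical error.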
Conditionals in data transformations
Common pattern: create masks or filtered lists.
List comprehension filtering:
    values = [10, -1, None, 25, 0]
    # Check for None explicitly, then keep strictly positive values.
    clean = [v for v in values if v is not None and v > 0]
    # clean -> [10, 25]
Numpy/pandas masks (conceptual — pandas specifics later):
- You’ll use boolean arrays to filter rows, e.g. df[df['score'] > threshold].
- Remember operator precedence: when combining masks in pandas, use & and | with parentheses: (cond1) & (cond2).
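Until the pandas lesson, the mask idea can be tried with the standard library alone. This is a pure-Python analogue, not pandas code: the masks are plain lists of booleans, they are combined element by element with and, and itertools.compress keeps the items whose mask entry is True. The scores and ages are made-up sample data.

```python
from itertools import compress

scores = [55, 91, 78, 30, 88]
ages = [17, 34, 22, 41, 19]

# One boolean per row, like a pandas mask.
high_score = [s > 75 for s in scores]
adult = [a >= 18 for a in ages]

# Element-wise "and" of the two masks.
both = [h and a for h, a in zip(high_score, adult)]

selected = list(compress(scores, both))
print(both)      # [False, True, True, False, True]
print(selected)  # [91, 78, 88]
```

In pandas the element-wise combination is written with & instead, and the parentheses are needed because & binds more tightly than comparison operators such as >.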
Common gotchas and best practices
- Don’t compare floats with == for equality. Use tolerances from the Numbers lesson: abs(a - b) < epsilon.
- Use is None to check for None, not == None.
- Beware mutable default arguments in functions when using conditionals that mutate state.
- Keep condition blocks small. If they grow, extract a function — easier to test and read.
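The first three gotchas can be seen in a few lines. The sketch uses math.isclose from the standard library as a ready-made tolerance check, and append_flag is a hypothetical function showing the usual None-default pattern.

```python
import math

total = 0.1 + 0.2
print(total == 0.3)              # False: binary floats cannot store 0.1 exactly
print(math.isclose(total, 0.3))  # True: equal within a relative tolerance

def append_flag(flag, flags=None):
    """Use None as the default, then create a fresh list on each call."""
    if flags is None:
        flags = []
    flags.append(flag)
    return flags

print(append_flag("a"))  # ['a']
print(append_flag("b"))  # ['b']
```

Had the signature been flags=[] instead, both calls would have shared one list and the second call would have returned ['a', 'b'].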
Advanced note: pattern matching (Python 3.10+)
There’s a newer match/case syntax for structural pattern matching. It’s powerful for complex data patterns, but start with classic if/elif/else first. Think of match as the fancy tool for when your data has many structured variants.
Final checklist (so you don’t mess up in production)
- Use boolean checks to validate inputs early.
- Guard against None/empty data before indexing.
- Prefer chained comparisons for ranges.
- Use short-circuiting to avoid unnecessary or unsafe computations.
- Keep condition blocks readable; extract helper functions.
Key takeaways
- Conditionals are the decision-making center of data code — they keep pipelines robust.
- Combine what you learned about Booleans and Numbers to write safe, expressive checks.
- Short-circuiting and truthiness are your friends when used intentionally; they’re dangerous when used carelessly.
Memorable insight: think of conditionals as the data pipeline's bouncer. They decide who enters (valid rows), who gets a second look (edge cases), and who gets kicked out (bad data). Teach the bouncer well, and your analysis party stays classy.
Try this quick exercise
Given a list of temperature readings (floats, possibly None or negative values representing faulty sensors), write a function that returns the first valid reading between -50 and 60 (inclusive). Use guards and chained comparisons. Bonus: use a tolerance to ignore readings that are effectively 0 because of floating point noise.
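If you want something to compare against after trying it yourself, here is one possible sketch. It rests on a few assumptions: the valid range is taken as the rule (so a negative reading inside -50..60 counts as valid), NaN is treated as faulty, and the bonus is read as "skip readings within a small tolerance of zero". The name first_valid_reading and the default tolerance are illustrative choices.

```python
import math

def first_valid_reading(readings, low=-50.0, high=60.0, zero_tol=1e-9):
    """Return the first reading within [low, high], or None if there is none."""
    if not readings:          # guard: None or empty list
        return None
    for r in readings:
        if r is None or math.isnan(r):
            continue          # faulty or missing reading
        if math.isclose(r, 0.0, abs_tol=zero_tol):
            continue          # bonus: treat near-zero noise as unusable
        if low <= r <= high:  # chained comparison, inclusive at both ends
            return r
    return None

print(first_valid_reading([None, -80.0, 1e-12, 21.5, 19.0]))  # 21.5
```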
Happy branching. Code that thinks is code that helps.