Python Essentials for AI
Refresh core Python features and patterns most useful for AI and data-intensive programming.
Data Types
Python Data Types — the tiny brain cells of your AI code
"If code is a city, data types are the zoning laws. Break them and chaos (or a TypeError) ensues."
You already set up your environment and tooling (Orientation and Python Environment Setup) and revisited syntax essentials (Python Syntax Review). Nice. Now we teach your variables to behave. In this lesson on Python Data Types you'll learn what Python stores, how it stores it, and why that matters for AI workloads.
What is a data type and why should an aspiring AI practitioner care?
- Definition: A data type tells Python what kind of value a variable holds and what operations make sense on it (like adding numbers but not concatenating a list to a float — please don’t try that at home).
- In AI, data types matter because:
  - They affect memory usage (a float32 array needs half the memory of the same array stored as float64).
  - They change behavior (mutable vs. immutable; we’ll get into that drama below).
  - Libraries (NumPy, pandas, PyTorch) expect specific types; supply the wrong one and your model will sulk (or crash).
Imagine giving a neural network a Python string instead of a float tensor. That’s like feeding a blender a phone. Neither will end well.
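Here is a minimal sketch of what "operations that make sense" looks like in practice. The values are made up for illustration; the point is that Python raises a TypeError rather than guessing what you meant.

```python
# Hypothetical values, chosen only to illustrate type mismatches.
readings = [0.5, 1.5]
scale = 2.0

try:
    readings + scale          # list + float is not defined
except TypeError as exc:
    list_error = str(exc)

try:
    "3" + 4                   # str + int is not defined either
except TypeError as exc:
    str_error = str(exc)

# Being explicit about types makes the intent clear.
scaled = [r * scale for r in readings]
total = int("3") + 4

print(scaled)  # [1.0, 3.0]
print(total)   # 7
```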
Core built-in types (your essential toolkit)
Numeric types
- int — integers like 42
- float — decimals like 3.14
- complex — complex numbers (rare in basic ML)
x = 7 # int
y = 3.1415 # float
z = 1+2j # complex
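A short sketch of how these numeric types behave, since the differences show up quickly in numeric code. Nothing here is specific to any library; it only uses the standard `math` module.

```python
import math

big = 2 ** 100             # ints grow as needed, so there is no overflow
ratio = 7 / 2              # true division always returns a float: 3.5
floored = 7 // 2           # floor division of two ints stays an int: 3

# Floats are binary approximations, so compare them with a tolerance.
approx_sum = 0.1 + 0.2
exactly_equal = approx_sum == 0.3               # False
close_enough = math.isclose(approx_sum, 0.3)    # True

magnitude = abs(3 + 4j)    # abs() of a complex number is its modulus: 5.0
```

In practice this is why loss values and metrics are usually compared with a tolerance rather than with `==`.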
Boolean
- bool — True or False
is_training = True
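One handy detail: `bool` is a subclass of `int`, so `True` counts as 1 and `False` as 0. The sketch below uses that to compute accuracy on a tiny, made-up batch (the predictions and labels are hypothetical).

```python
# Hypothetical predictions and labels for a tiny batch.
predictions = [1, 0, 1, 1]
labels = [1, 0, 0, 1]

matches = [p == y for p, y in zip(predictions, labels)]  # list of bools
num_correct = sum(matches)            # True counts as 1, False as 0
accuracy = num_correct / len(labels)

print(matches)      # [True, True, False, True]
print(accuracy)     # 0.75
```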
Text
- str — text data
name = 'Ada'
Strings are sequences: you can index them, slice them, and whisper to them gently.
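A quick sketch of indexing and slicing, using an example name chosen for illustration. It also shows that strings are immutable: you build new strings instead of editing one in place.

```python
name = "Ada Lovelace"

first_char = name[0]            # 'A'
first_word = name[:3]           # 'Ada'
last_word = name[-8:]           # 'Lovelace'
tokens = name.lower().split()   # ['ada', 'lovelace']

# Strings are immutable: item assignment raises TypeError.
try:
    name[0] = "a"
except TypeError:
    changed = False
else:
    changed = True
```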
None
- NoneType — represents “nothing” or unknown/missing values
result = None
Use None to initialize placeholders, and test for it with is None rather than ==.
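A minimal sketch of the placeholder pattern. The function name `best_score` is hypothetical; the thing to notice is the `is None` check, which stays correct even when a legitimate value such as 0.0 is falsy.

```python
def best_score(scores):
    """Return the highest score, or None if there are no scores."""
    best = None                      # placeholder: nothing seen yet
    for s in scores:
        if best is None or s > best:
            best = s
    return best

print(best_score([0.2, 0.9, 0.5]))  # 0.9
print(best_score([]))               # None
```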
Sequences and collections
- list — ordered, mutable sequence: [1, 2, 3]
- tuple — ordered, immutable sequence: (1, 2, 3)
- set — unordered collection of unique elements: {1, 2}
- dict — key:value mapping (the Swiss Army knife): {'a': 1}
L = [1, 2, 3]
T = (1, 2, 3)
S = {1, 2}
D = {'x': 10, 'y': 20}
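To make the four containers a bit more concrete, here is a sketch of the operation each one is typically used for. The token list and the config keys are invented for illustration.

```python
tokens = ["cat", "dog", "cat"]      # list: ordered, allows duplicates
tokens.append("bird")               # lists grow in place

shape = (28, 28)                    # tuple: a fixed record
height, width = shape               # tuple unpacking

unique_tokens = set(tokens)         # set: duplicates removed
has_dog = "dog" in unique_tokens    # fast membership check

config = {"lr": 0.01, "epochs": 5}  # dict: key -> value (hypothetical config)
config["batch_size"] = 32           # add a new key
lr = config.get("lr")               # 0.01
momentum = config.get("momentum", 0.9)  # fallback when the key is missing
```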
Mutability: the soap opera of types
- Mutable types can be changed in-place (lists, dicts, sets).
- Immutable types cannot be changed; operations produce new objects (ints, floats, strings, tuples).
Why it matters: mutable objects shared across functions can produce surprising side effects.
a = [1, 2, 3]
b = a
b.append(4)
# a is now [1, 2, 3, 4] — surprise! a and b point to the same list
If you don’t want b to modify a, copy the list: b = a.copy() or b = list(a).
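One caveat worth a sketch: `.copy()` and `list(a)` make shallow copies. They protect the outer list, but nested lists are still shared, which matters for things like batches of sequences. The standard library's `copy.deepcopy` handles the nested case.

```python
import copy

a = [1, 2, 3]
b = a.copy()        # new outer list
b.append(4)
# a is still [1, 2, 3]

# Caveat: .copy() is shallow, so nested lists are still shared.
batch = [[1, 2], [3, 4]]
shallow = batch.copy()
deep = copy.deepcopy(batch)

shallow[0].append(99)   # also changes batch[0]
deep[1].append(-1)      # does not touch batch

print(batch)  # [[1, 2, 99], [3, 4]]
```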
Type checking and conversion (be explicit, not mysterious)
- Check a type: type(x) or isinstance(x, list)
- Convert types: int(), float(), str(), list(), tuple()
n = '42'
num = int(n) # 42
is_num = isinstance(num, int) # True
Be careful: int('3.14') raises ValueError. Convert to float first, then to int if that’s what you want. Keep in mind that int() truncates toward zero; it does not round.
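The two-step conversion can be sketched as follows. The helper `to_int_or_none` is hypothetical (not part of any library); it is one way to handle messy input such as "n/a" without crashing a preprocessing loop.

```python
text = "3.14"

as_float = float(text)          # 3.14
as_int = int(as_float)          # 3  (int() truncates toward zero)
rounded = round(float("2.7"))   # 3  (round() goes to the nearest int)
negative = int(-2.7)            # -2, not -3

def to_int_or_none(value):
    """Hypothetical helper: return an int, or None if parsing fails."""
    try:
        return int(float(value))
    except (TypeError, ValueError, OverflowError):
        return None

print(to_int_or_none("42"))     # 42
print(to_int_or_none("n/a"))    # None
```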
Why NumPy arrays matter for AI (and how they relate to built-in types)
Standard Python containers are great for small scripts, but AI code runs on numeric arrays where performance matters.
- NumPy arrays: homogeneous, fixed-type, vectorized operations — essential for fast linear algebra.
- A Python list of floats is not the same as a NumPy ndarray.
import numpy as np
arr = np.array([1.0, 2.0, 3.0]) # dtype float64
arr2 = np.array([1, 2, 3], dtype=np.float32)
Tip: Always check arr.dtype and arr.shape when debugging unexpected results in model training.
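A small sketch of that debugging habit, assuming NumPy is installed. The array sizes are arbitrary; `nbytes` shows how much the dtype choice costs in memory.

```python
import numpy as np

arr64 = np.zeros((1000, 10))                     # default dtype is float64
arr32 = arr64.astype(np.float32)                 # explicit down-cast

print(arr64.dtype, arr64.shape, arr64.nbytes)    # float64 (1000, 10) 80000
print(arr32.dtype, arr32.shape, arr32.nbytes)    # float32 (1000, 10) 40000

# Integer input gives an integer array unless you ask for floats.
ints = np.array([1, 2, 3])
floats = np.array([1, 2, 3], dtype=np.float32)
print(ints.dtype.kind, floats.dtype.kind)        # i f
```

The exact integer width (for example int64 vs. int32) can depend on the platform and NumPy version, which is one more reason to pass `dtype` explicitly.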
Examples: Little AI-friendly snippets
- Converting raw text features into numeric (toy example):
raw = ['cat', 'dog', 'cat']
# naive mapping
vocab = {'cat': 0, 'dog': 1}
encoded = [vocab[w] for w in raw] # [0, 1, 0]
- Padding sequences: notice types
seq = [1, 2, 3]
padded = seq + [0] * (5 - len(seq)) # list concatenation
- Use NumPy for vector math
import numpy as np
v1 = np.array([1.0, 2.0])
v2 = np.array([3.0, 4.0])
print(v1 + v2) # elementwise sum; prints [4. 6.]
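For contrast, here is a sketch of the same operators applied to plain Python lists. It assumes NumPy is installed; the variable names are illustrative. This difference is a common source of silent bugs when a list sneaks in where an array was expected.

```python
import numpy as np

l1 = [1.0, 2.0]
l2 = [3.0, 4.0]

concatenated = l1 + l2                  # [1.0, 2.0, 3.0, 4.0]
repeated = l1 * 2                       # [1.0, 2.0, 1.0, 2.0]

summed = np.array(l1) + np.array(l2)    # elementwise: array([4., 6.])
doubled = np.array(l1) * 2              # elementwise: array([2., 4.])
dot = float(np.dot(l1, l2))             # 1*3 + 2*4 = 11.0
```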
Common mistakes (and how to avoid them)
- Passing Python lists to libraries that expect ndarrays; convert with np.array().
- Confusing mutable defaults in function signatures:
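# BAD: the default list is created once and shared across calls
def add_item(x, lst=[]):
    lst.append(x)
    return lst
# GOOD: use None as the default and create a fresh list inside
def add_item(x, lst=None):
    if lst is None:
        lst = []
    lst.append(x)
    return lst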
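- Assuming the string '0' behaves like the number 0. It does not: '0' == 0 is False, and any non-empty string (including '0') is truthy. Convert explicitly with int() or float().
- Ignoring dtypes: using float64 when float32 would suffice (and save memory/GPU bandwidth).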
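To see the mutable-default trap in action, here is a small sketch. The two versions are renamed `add_item_bad` and `add_item_good` (names chosen for this example only) so they can live side by side.

```python
def add_item_bad(x, lst=[]):       # the default list is created once
    lst.append(x)
    return lst

def add_item_good(x, lst=None):    # a fresh list on every call
    if lst is None:
        lst = []
    lst.append(x)
    return lst

first_bad = add_item_bad(1)
second_bad = add_item_bad(2)
print(second_bad)                   # [1, 2]
print(first_bad is second_bad)      # True

print(add_item_good(1))             # [1]
print(add_item_good(2))             # [2]
```

The bad version keeps accumulating items because every call without `lst` reuses the same list object.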
Quick cheatsheet table
| Concept | Mutable? | Use-case in AI |
|---|---|---|
| int, float | No | counters, scalars, hyperparams |
| bool | No | flags, masks |
| str | No | labels, file paths |
| list | Yes | small collections, dynamic buffers |
| tuple | No | fixed records, keys |
| set | Yes | unique tokens, membership checks |
| dict | Yes | configs, feature maps |
| np.ndarray | Yes (elements change in place; dtype is fixed) | tensors, matrices, vectorized ops |
Short practice exercises (do them in your Jupyter/VS Code now)
- Create a list of ints, convert to a NumPy float32 array, compute the mean.
- Write a function that safely appends a value to an optional list argument (hint: mutable default trap).
- Given a dict of label->index, convert a list of labels to one-hot NumPy arrays.
If you get stuck, remember: this course’s "Asking for Help" guidelines are your friend.
Closing: Key takeaways
- Data types are the little rules that govern what operations make sense on your data.
- For AI, prefer NumPy/PyTorch tensors for heavy numeric work, but understand the Python built-ins for glue logic.
- Watch mutability and dtypes — they’re the usual suspects when bugs appear.
Final thought: learning data types is like learning what each ingredient does in a kitchen. You might make a salad with a hammer if you don’t know the difference, but your model won’t taste any better.
Keep these rules in your mental toolbelt as we move from syntax to data pipelines and model training. Next up: using these types practically in data preprocessing for your course project.