Numerical Computing with NumPy

Leverage NumPy for fast array programming, broadcasting, vectorization, and linear algebra operations.

NumPy Indexing and Slicing: Practical Guide for Data Science

NumPy Indexing and Slicing — Find the Data Fast (and Keep It)

Imagine your ndarray is a layered cake. You don't need to eat the whole cake — you want the chocolate layer, the corner piece, or every second berry on top. Indexing and slicing are your fork and laser scalpel.

Why this matters (building on what you already know)

You already know how to create ndarrays (see: ndarray Creation) and why dtype choices matter (see: Dtypes and Casting). Now you need to get at the data inside efficiently. Indexing and slicing let you: extract features, create masks, prepare batches for models, and write vectorized operations instead of painful Python loops (remember Data Structures and Iteration?). This is where speed and readability meet.

Quick vocabulary — the cheat sheet

Indexing: selecting individual elements (like arr[2, 3]).
Slicing: selecting ranges with start:stop:step (like arr[:, 1:5:2]).
Fancy indexing / advanced indexing: using integer arrays or boolean arrays to select elements; the result is always a copy.
View vs Copy: basic slices produce views (no copy), so changes made through a slice affect the original. Fancy indexing and boolean masking produce copies.

Basic indexing: the coordinates of your data

NumPy indexing looks like nested Python lists but with extra power.

import numpy as np
arr = np.arange(12).reshape(3,4)
# arr = [[ 0  1  2  3]
#        [ 4  5  6  7]
#        [ 8  9 10 11]]
print(arr[1,2])   # 6 -> row 1, col 2
print(arr[1])     # [4 5 6 7] -> row slice (1D view)
print(arr[1, :2]) # [4 5] -> row 1, first two columns

Micro explanation: a single integer index removes that dimension from the result. Use a slice (for example arr[1:2] instead of arr[1]) to keep the dimension when you need it.
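
To make the "collapsing" behavior concrete, here is a minimal sketch that compares shapes for integer indices and length-1 slices on the same 3x4 array used above. The variable names are illustrative only.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

row_1d = arr[1]        # integer index drops the row axis
row_2d = arr[1:2]      # length-1 slice keeps the row axis
col_1d = arr[:, 2]     # integer index drops the column axis
col_2d = arr[:, 2:3]   # length-1 slice keeps the column axis

print(row_1d.shape)    # (4,)
print(row_2d.shape)    # (1, 4)
print(col_1d.shape)    # (3,)
print(col_2d.shape)    # (3, 1)
```

The values are the same in each pair; only the number of axes differs, which matters once you start broadcasting or feeding arrays to code that expects 2D input.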

Slicing: start:stop:step — the range operator

Syntax: start:stop:step (stop is exclusive).
Negative indices count from the end.
Negative step reverses.

print(arr[::-1])    # reverse rows (step = -1)
print(arr[:, ::-1])  # reverse columns
print(arr[0:3:2])   # every other row: rows 0 and 2

Why is stop exclusive? Half-open intervals keep the arithmetic simple: with a step of 1, the slice has exactly stop - start elements, and arr[:k] and arr[k:] split an array with no overlap and no gap.
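
A short sketch of negative indices and the half-open rule on a 1D array. The array and the split point k are made up for illustration.

```python
import numpy as np

a = np.arange(10)            # [0 1 2 3 4 5 6 7 8 9]

last = a[-1]                 # 9: index -1 is the final element
tail = a[-3:]                # [7 8 9]: the last three elements
middle = a[2:7]              # [2 3 4 5 6]: 7 - 2 = 5 elements
evens_reversed = a[-2::-2]   # [8 6 4 2 0]: start at index 8, step back by 2

# Half-open intervals split cleanly at any point k
k = 4
head, rest = a[:k], a[k:]
print(len(head) + len(rest) == len(a))   # True
```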

Boolean masking — select by condition (very Data Science)

Create a boolean array from a condition and use it to filter rows or elements. This is essential for feature selection and cleaning.

ages = np.array([18, 22, 15, 45, 34])
adult_mask = ages >= 18
adults = ages[adult_mask]   # [18 22 45 34]

# Chain with other arrays
scores = np.array([55, 80, 40, 90, 70])
print(scores[ages >= 18])   # scores for adults

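Micro explanation: reading with a boolean mask (adults = ages[adult_mask]) returns a copy, so modifying that result won't change the original array. Assigning through a mask (ages[adult_mask] = 0) is different: it writes directly into the original.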
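
Two things tend to come up right after your first mask: combining conditions, and the difference between reading and assigning through a mask. The sketch below reuses the ages and scores arrays from above; the threshold of 70 and the -1 placeholder are arbitrary choices for the example.

```python
import numpy as np

ages = np.array([18, 22, 15, 45, 34])
scores = np.array([55, 80, 40, 90, 70])

# Combine conditions with & (and), | (or), ~ (not).
# Parentheses are required because & binds tighter than >=.
mask = (ages >= 18) & (scores >= 70)
selected = scores[mask]      # [80 90 70]

# Reading through a mask gives a copy: editing it leaves scores alone
selected[0] = 0
print(scores)                # [55 80 40 90 70]

# Assigning through a mask writes into the original array
scores[ages < 18] = -1
print(scores)                # [55 80 -1 90 70]
```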

Fancy indexing: pick arbitrary items

Use integer arrays (or lists) to select arbitrary rows/cols.

arr = np.arange(16).reshape(4,4)
rows = np.array([0,2])
cols = np.array([1,3])
print(arr[rows])        # selects rows 0 and 2
print(arr[rows, cols])  # selects elements (0,1) and (2,3) -> [1 11]

Important: fancy indexing produces a copy, not a view.
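
A common surprise: arr[rows, cols] pairs the indices up element by element, so it does not give you the block where those rows and columns intersect. If the block is what you want, one option is np.ix_, which builds broadcastable index arrays. A minimal sketch on the same 4x4 array:

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)
rows = np.array([0, 2])
cols = np.array([1, 3])

# Paired indices: elements (0,1) and (2,3)
pairs = arr[rows, cols]
print(pairs)                 # [ 1 11]

# Grid of rows {0, 2} crossed with columns {1, 3}
grid = arr[np.ix_(rows, cols)]
print(grid)
# [[ 1  3]
#  [ 9 11]]

# Same grid, written with an explicit extra axis on the row indices
same_grid = arr[rows[:, None], cols]
```

Both forms are still fancy indexing, so grid and same_grid are copies.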

View vs Copy — the gotcha you must remember

Slices (using :) -> a view. Modifying the view changes the original.
Fancy indexing and boolean masks -> copies. Modifying the result does not affect the original (assigning through the index, as in a[mask] = 0, still does).

a = np.arange(6)
view = a[2:5]
view[0] = 999
print(a)        # a changed -> [  0   1 999   3   4   5]

b = a[[0,1]]
b[0] = -1
print(a)        # a unchanged by the fancy-indexing copy -> [  0   1 999   3   4   5]

If you want an independent copy use .copy():

safe = a[2:5].copy()
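
If you are ever unsure whether you are holding a view or a copy, you can ask NumPy directly. This sketch uses np.shares_memory and the .base attribute; the variable names are illustrative.

```python
import numpy as np

a = np.arange(6)

sliced = a[2:5]            # basic slice
picked = a[[2, 3, 4]]      # fancy indexing
masked = a[a >= 2]         # boolean mask
copied = a[2:5].copy()     # explicit copy of a slice

print(np.shares_memory(a, sliced))   # True: a view onto a's buffer
print(np.shares_memory(a, picked))   # False: a copy
print(np.shares_memory(a, masked))   # False: a copy
print(np.shares_memory(a, copied))   # False: a copy

# A view remembers the array that owns its data
print(sliced.base is a)              # True
```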

Dimension tricks: np.newaxis, None, and Ellipsis

Increase dims: arr[:, np.newaxis] or arr[:, None] adds an axis (useful for broadcasting).
Ellipsis (...) expands to as many ':' slices as needed to cover the axes you did not index explicitly, which is handy for higher-rank arrays.

v = np.array([1,2,3])
print(v.shape)             # (3,)
v2 = v[:, None]            # shape (3,1)
# Ellipsis:
big = np.zeros((2,3,4,5))
print(big[0, ..., 1].shape)  # (3, 4): same as big[0, :, :, 1]

Use these when preparing data: many machine learning libraries (scikit-learn, for example) expect 2D inputs of shape (n_samples, n_features).
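
As a sketch of that preparation step, here is a single 1D feature turned into a one-column 2D array, plus two more Ellipsis forms. The heights values and the batch shape are invented for the example.

```python
import numpy as np

# Hypothetical single feature measured on 3 samples
heights = np.array([1.60, 1.75, 1.82])
print(heights.shape)           # (3,)

# Two equivalent ways to get shape (n_samples, 1)
X = heights[:, None]
X_alt = heights.reshape(-1, 1)
print(X.shape)                 # (3, 1)

# Ellipsis stands in for every axis you do not name
batch = np.zeros((2, 3, 4, 5))
first_last = batch[0, ..., 1]  # same as batch[0, :, :, 1]
last_axis = batch[..., 0]      # same as batch[:, :, :, 0]
print(first_last.shape)        # (3, 4)
print(last_axis.shape)         # (2, 3, 4)
```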

Common patterns you'll use daily

select a column: X[:, 2]
select columns 1..3: X[:, 1:4]
select rows satisfying condition: X[X[:,0] > 0]
add axis for broadcasting: x[:, None] + y[None, :]

Task	Indexing pattern	Returns
Row slice	arr[2] or arr[2,:]	1D view
Column slice	arr[:,2]	1D view
Submatrix	arr[1:3, 2:4]	2D view
Arbitrary picks	arr[[0,2,3]]	copy
Condition	arr[arr>0]	copy
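
Here are the daily patterns above run on a small made-up matrix X and two short vectors, so you can see the shapes that come back.

```python
import numpy as np

# Hypothetical data: 3 samples, 3 features
X = np.array([[ 1.0, 10.0, 100.0],
              [-2.0, 20.0, 200.0],
              [ 3.0, 30.0, 300.0]])

third_col = X[:, 2]               # [100. 200. 300.]
cols_1_to_2 = X[:, 1:3]           # shape (3, 2)
positive_rows = X[X[:, 0] > 0]    # rows whose first column is > 0

# Broadcasting a column vector against a row vector
x = np.array([0, 10, 20])
y = np.array([1, 2])
table = x[:, None] + y[None, :]   # shape (3, 2)
print(table)
# [[ 1  2]
#  [11 12]
#  [21 22]]
```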

Performance notes (brief but crucial)

Views are cheap (no copy). Use slices when possible for memory-critical workloads.
Fancy indexing and boolean masks copy data; they can be expensive on large arrays.
Prefer vectorized boolean masks to Python loops: the comparison and the selection run in compiled C code over the array's memory instead of being interpreted one element at a time.
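
If you want to check the loop-versus-mask claim on your own machine, a rough sketch with timeit looks like this. The array size, the seed, and the two helper functions are arbitrary choices for the example, and the timings will vary by hardware, so no numbers are shown.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
values = rng.normal(size=100_000)

def positives_loop(v):
    # Element-by-element filtering in pure Python
    return np.array([x for x in v if x > 0])

def positives_mask(v):
    # Vectorized filtering with a boolean mask
    return v[v > 0]

# Both approaches select the same elements in the same order
same = np.array_equal(positives_loop(values), positives_mask(values))
print(same)   # True

loop_s = timeit.timeit(lambda: positives_loop(values), number=5)
mask_s = timeit.timeit(lambda: positives_mask(values), number=5)
print(f"loop: {loop_s:.4f}s  mask: {mask_s:.4f}s")
```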

Closing: key takeaways (so you can flex in interviews)

Slicing gives views; fancy indexing/boolean masks give copies — remember this or you'll debug for hours.
Use negative indices and negative steps to count from the end or reverse quickly.
np.newaxis (or None) and Ellipsis are small helpers that unlock broadcasting and concise slicing in high dimensions.
Combine indexing with what you learned about dtypes and ndarray creation: correct dtype + correct slice = fast, memory-efficient pipelines.

"When you can slice the data the right way, you don't need to loop; you just need to think." — something your future self will be grateful you learned.