© 2026 jypi. All rights reserved.

Python for Data Science, AI & Development
Numerical Computing with NumPy

Leverage NumPy for fast array programming, broadcasting, vectorization, and linear algebra operations.

Indexing and Slicing

NumPy Indexing and Slicing: Practical Guide for Data Science
NumPy Indexing and Slicing — Find the Data Fast (and Keep It)

Imagine your ndarray is a layered cake. You don't need to eat the whole cake — you want the chocolate layer, the corner piece, or every second berry on top. Indexing and slicing are your fork and laser scalpel.


Why this matters (building on what you already know)

You already know how to create ndarrays (see: ndarray Creation) and why dtype choices matter (see: Dtypes and Casting). Now you need to get at the data inside efficiently. Indexing and slicing let you: extract features, create masks, prepare batches for models, and write vectorized operations instead of painful Python loops (remember Data Structures and Iteration?). This is where speed and readability meet.


Quick vocabulary — the cheat sheet

  • Indexing: selecting individual elements (like arr[2, 3]).
  • Slicing: selecting ranges with start:stop:step (like arr[:, 1:5:2]).
  • Fancy indexing / advanced indexing: using integer arrays or boolean arrays to select elements; always returns a copy.
  • View vs Copy: basic slices produce views (no data is copied), so changes made through a slice affect the original. Fancy indexing and boolean masking produce copies.

Basic indexing: the coordinates of your data

NumPy indexing resembles indexing nested Python lists, but a single pair of brackets takes one index per axis: arr[1, 2] rather than arr[1][2].

import numpy as np
arr = np.arange(12).reshape(3,4)
# arr = [[ 0  1  2  3]
#        [ 4  5  6  7]
#        [ 8  9 10 11]]
print(arr[1,2])   # 6 -> row 1, col 2
print(arr[1])     # [4 5 6 7] -> row slice (1D view)
print(arr[1, :2]) # [4 5] -> row 1, first two columns

Micro explanation: an integer index removes that dimension from the result. Use a slice (even one of length 1) to keep the dimension when needed.
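To make the dimensionality rule concrete, here is a small sketch that reuses the 3x4 arr from above and compares integer indexing with length-1 slices. The variable names (row_1d, row_2d, col_1d, col_2d) are illustrative only.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

row_1d = arr[1]        # integer index: the row axis is dropped
row_2d = arr[1:2]      # length-1 slice: the row axis is kept
col_1d = arr[:, 2]     # integer index on axis 1: the column axis is dropped
col_2d = arr[:, 2:3]   # length-1 slice on axis 1: the column axis is kept

print(row_1d.shape)    # (4,)
print(row_2d.shape)    # (1, 4)
print(col_1d.shape)    # (3,)
print(col_2d.shape)    # (3, 1)
```

The values are the same in each pair; only the shape differs, which matters once the result is fed into code that expects a 2D array.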


Slicing: start:stop:step — the range operator

  • Syntax: start:stop:step (stop is exclusive).
  • Negative indices count from the end.
  • Negative step reverses.

print(arr[::-1])    # reverse rows (step = -1)
print(arr[:, ::-1])  # reverse columns
print(arr[0:3:2])   # every other row: rows 0 and 2

Why is stop exclusive? With half-open intervals, a step-1 slice has length stop - start, and adjacent slices such as a[:k] and a[k:] split an array with no overlap and no gap.
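The slicing rules above (exclusive stop, negative indices, negative step) can be checked on a small 1D array. This is a minimal sketch; the array a and the variable names are made up for illustration.

```python
import numpy as np

a = np.arange(10)            # [0 1 2 3 4 5 6 7 8 9]

middle = a[2:7]              # stop is exclusive, so length is 7 - 2 = 5
last_three = a[-3:]          # negative start counts from the end
head, tail = a[:4], a[4:]    # adjacent half-open slices cover a exactly once
backwards = a[::-3]          # negative step walks from the end toward the start

print(middle)       # [2 3 4 5 6]
print(last_three)   # [7 8 9]
print(backwards)    # [9 6 3 0]
print(len(head) + len(tail) == len(a))  # True
```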


Boolean masking — select by condition (very Data Science)

Create a boolean array from a condition and use it to filter rows or elements. This is essential for feature selection and cleaning.

ages = np.array([18, 22, 15, 45, 34])
adult_mask = ages >= 18
adults = ages[adult_mask]   # [18 22 45 34]

# Chain with other arrays
scores = np.array([55, 80, 40, 90, 70])
print(scores[ages >= 18])   # scores for adults

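Micro explanation: reading with a boolean mask returns a copy, so modifying the result won't change the original array. Assigning through a mask (for example ages[adult_mask] = 0) does write to the original.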
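The difference between reading with a mask and assigning through one is easy to get wrong, so here is a short sketch using the same ages and scores arrays. The young_adults name and the age cutoff of 30 are assumptions added for the example.

```python
import numpy as np

ages = np.array([18, 22, 15, 45, 34])
scores = np.array([55, 80, 40, 90, 70])

# Reading with a mask gives an independent copy.
adults = ages[ages >= 18]
adults[0] = 0                      # ages is not affected
print(ages)                        # [18 22 15 45 34]

# Assigning through a mask writes into the original array.
scores[ages < 18] = 0
print(scores)                      # [55 80  0 90 70]

# Combine conditions with & (and), | (or), ~ (not); parentheses are required.
young_adults = ages[(ages >= 18) & (ages < 30)]
print(young_adults)                # [18 22]
```

The parentheses matter because & binds more tightly than the comparison operators in Python.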


Fancy indexing: pick arbitrary items

Use integer arrays (or lists) to select arbitrary rows/cols.

arr = np.arange(16).reshape(4,4)
rows = np.array([0,2])
cols = np.array([1,3])
print(arr[rows])        # selects rows 0 and 2
print(arr[rows, cols])  # selects elements (0,1) and (2,3) -> [1 11]

Important: fancy indexing produces a copy, not a view.
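A common surprise is that arr[rows, cols] pairs the indices up rather than selecting a block. If the goal is the sub-grid formed by those rows and columns, one option is np.ix_, sketched below with the same arr, rows, and cols (the names pairs and block are illustrative).

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)
rows = [0, 2]
cols = [1, 3]

pairs = arr[rows, cols]            # elements (0,1) and (2,3)
block = arr[np.ix_(rows, cols)]    # every combination of rows and cols

print(pairs)   # [ 1 11]
print(block)
# [[ 1  3]
#  [ 9 11]]
```

Chaining arr[rows][:, cols] gives the same block, at the cost of an intermediate copy.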


View vs Copy — the gotcha you must remember

  • Basic slices (using :) -> always a view. Modifying the view changes the original.
  • Fancy indexing and boolean masks -> copies. Modifying them does not affect the original.

a = np.arange(6)
view = a[2:5]
view[0] = 999
print(a)        # a changed -> [  0   1 999   3   4   5]

b = a[[0,1]]
b[0] = -1
print(a)        # a unchanged by fancy indexing copy

If you want an independent copy use .copy():

safe = a[2:5].copy()
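When it is unclear whether two arrays share data, np.shares_memory and the .base attribute can help. The sketch below is one way to check, on a fresh array a; the names view, fancy, and safe are illustrative.

```python
import numpy as np

a = np.arange(6)
view = a[2:5]           # basic slice
fancy = a[[2, 3, 4]]    # fancy indexing, same elements
safe = a[2:5].copy()    # explicit copy

print(np.shares_memory(a, view))    # True
print(np.shares_memory(a, fancy))   # False
print(np.shares_memory(a, safe))    # False
print(view.base is a)               # True: the view refers back to a
```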

Dimension tricks: np.newaxis, None, and Ellipsis

  • Increase dims: arr[:, np.newaxis] or arr[:, None] adds an axis (useful for broadcasting).
  • Ellipsis ... expands to as many ':' as needed to cover the remaining axes of a higher-rank array.

v = np.array([1,2,3])
print(v.shape)             # (3,)
v2 = v[:, None]            # shape (3,1)
# Ellipsis:
big = np.zeros((2,3,4,5))
print(big[0, ..., 1].shape)  # (3, 4) -> same as big[0, :, :, 1]

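Use these when preparing data: many machine learning APIs (scikit-learn estimators, for example) expect 2D input of shape (n_samples, n_features), so a single 1D feature has to be turned into a column first.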
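As a sketch of how these helpers are typically used, the example below turns a 1D feature into a column and builds an outer sum by broadcasting. The arrays x and y and their values are made up for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])      # one feature, three samples
y = np.array([10.0, 20.0])

X_col = x[:, None]                 # shape (3, 1): (n_samples, n_features)
X_alt = x.reshape(-1, 1)           # same result, written with reshape

# Broadcasting a (3, 1) column against a (1, 2) row gives a (3, 2) table.
table = x[:, None] + y[None, :]

print(X_col.shape)   # (3, 1)
print(table)
# [[11. 21.]
#  [12. 22.]
#  [13. 23.]]
```

Both x[:, None] and x.reshape(-1, 1) return views here, so no data is copied.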


Common patterns you'll use daily

  • select a column: X[:, 2]
  • select columns 1..3: X[:, 1:4]
  • select rows satisfying condition: X[X[:,0] > 0]
  • add axis for broadcasting: x[:, None] + y[None, :]

Task             Indexing pattern       Returns
Row              arr[2] or arr[2, :]    1D view
Column           arr[:, 2]              1D view
Submatrix        arr[1:3, 2:4]          2D view
Arbitrary rows   arr[[0, 2, 3]]         copy
Condition        arr[arr > 0]           1D copy

Performance notes (brief but crucial)

  • Views are cheap (no copy). Use slices when possible for memory-critical workloads.
  • Fancy indexing and boolean masks copy data; they can be expensive on large arrays.
  • Prefer vectorized boolean masks to Python loops: the comparison and the selection run in compiled C loops over the array buffer instead of element by element in the Python interpreter.
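To get a feel for the last point, the sketch below filters the same array with a Python loop and with a boolean mask, then times both with timeit. The array size, the 0.5 threshold, and the function names are arbitrary choices for illustration, and the timings depend on your machine.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
values = rng.random(100_000)

def loop_filter(arr):
    # Element-by-element filtering in the Python interpreter.
    return [v for v in arr if v > 0.5]

def mask_filter(arr):
    # Vectorized comparison plus boolean indexing.
    return arr[arr > 0.5]

loop_result = np.array(loop_filter(values))
mask_result = mask_filter(values)
print(np.array_equal(loop_result, mask_result))  # True: same elements, same order

loop_time = timeit.timeit(lambda: loop_filter(values), number=5)
mask_time = timeit.timeit(lambda: mask_filter(values), number=5)
print(f"loop: {loop_time:.4f}s  mask: {mask_time:.4f}s")
```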

Closing: key takeaways (so you can flex in interviews)

  • Slicing gives views; fancy indexing/boolean masks give copies — remember this or you'll debug for hours.
  • Use negative indices and negative steps to count from the end or reverse quickly.
  • np.newaxis (or None) and Ellipsis are small helpers that unlock broadcasting and concise slicing in high dimensions.
  • Combine indexing with what you learned about dtypes and ndarray creation: correct dtype + correct slice = fast, memory-efficient pipelines.

"When you can slice the data the right way, you don't need to loop; you just need to think." — something your future self will be grateful you learned.


Try this (2-minute practice)

  1. Create a (1000, 10) array of random floats.
  2. Extract rows where column 0 > 0.5 and column 3 < 0.2.
  3. From that subset, take every other column starting at column 1.

If you get stuck, remember: boolean masks combine with & and parentheses, and columns slice with start:stop:step.
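If you want to compare notes afterwards, here is one possible solution sketch. It assumes np.random.default_rng with a fixed seed so the run is repeatable; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((1000, 10))                 # step 1: (1000, 10) random floats in [0, 1)

mask = (X[:, 0] > 0.5) & (X[:, 3] < 0.2)   # step 2: combine conditions with &
subset = X[mask]                           # rows that satisfy both (a copy)

result = subset[:, 1::2]                   # step 3: columns 1, 3, 5, 7, 9

# Steps 2 and 3 can also be written as a single indexing expression.
one_step = X[mask, 1::2]

print(subset.shape[0] == mask.sum())       # True
print(result.shape[1])                     # 5
print(np.array_equal(result, one_step))    # True
```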

Happy slicing — may your views be fast and your copies deliberate.
