Python Slicing and Views Explained for Data Science

Slicing and Views — Why Your Slice Might Be a Copy, a View, or a Tiny Betrayal

Ever sliced a list, changed the slice, and watched the original remain stubbornly intact, then sliced a NumPy array, changed the slice, and watched the original scream in pain? Welcome to the dramatic world of slicing and views in Python. This is where memory, performance, and subtle bugs meet in a smoky jazz club and occasionally fight.

"Slicing isn't just syntax — it's a contract about who owns the data."

We're building directly on what you already know: Python Foundations for Data Work (your toolbox and IDE habits), plus earlier Data Structures topics like Dictionaries and Sets. Those taught you what collections are. Now we’ll learn how slicing behaves differently across types, why it matters for data science, and how to avoid nasty surprises when manipulating data.

What is slicing? Quick refresher

Slicing is the sequence operation using [start:stop:step]. It creates a subsequence. Syntax summary:

seq[start:stop] — elements start..stop-1
seq[start:stop:step] — with step (skip or reverse if negative)
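Omitted indices use defaults (start=0, stop=len(seq), step=1). With a negative step the defaults flip: the slice starts at the last element and runs back through the first.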

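Micro explanation: Under the hood, Python hands the object a slice object (slice(start, stop, step)) through its __getitem__ method, and each type's __getitem__ is free to interpret that slice any way it likes. That freedom is why the same syntax can copy in one type and share memory in another.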

s = [0, 1, 2, 3, 4, 5]
print(s[1:5:2])  # [1, 3]
print(s[::-1])   # [5, 4, 3, 2, 1, 0]  (reverse)
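
To make the slice object concrete, here is a minimal sketch. ShowSlice is a hypothetical class invented for this illustration; slice.indices is the standard method that resolves omitted values for a sequence of a given length.

```python
class ShowSlice:
    """Hypothetical helper: returns whatever key Python passes to __getitem__."""

    def __getitem__(self, key):
        return key


probe = ShowSlice()
print(probe[1:5:2])   # slice(1, 5, 2)
print(probe[::-1])    # slice(None, None, -1)

# slice.indices(length) fills in the defaults for a sequence of that length.
forward = slice(None, None, None).indices(6)
backward = slice(None, None, -1).indices(6)
print(forward)    # (0, 6, 1)
print(backward)   # (5, -1, -1)
```

Note how the negative step changes the resolved start and stop, which is why s[::-1] reverses the whole sequence.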

Copy vs View: The core distinction

Copy: a new object with new memory. Mutating the slice does not affect the original.
View: a different object sharing the same memory. Mutating the slice does affect the original.

Why this is important: in data work you either want fast, memory-efficient views or safe independent copies. Pick the right tool.

How common types behave

Type	Slice returns	Mutable?	Notes
list	new list (shallow copy)	yes	Slicing makes a fresh list; id differs. Elements are shared references.
tuple	new tuple (copy)	no	Tuples are immutable; slice gives a new tuple.
str	new str (copy)	no	Immutable, new object.
bytes	new bytes (copy)	no	Immutable. Use bytearray for mutability.
bytearray	new bytearray (copy)	yes	Slicing copies. Wrap it in memoryview(obj) and slice that to get a view.
NumPy ndarray	view (basic slicing)	yes	Basic slicing returns a view (no copy); fancy or boolean indexing returns a copy.
pandas DataFrame	sometimes view, sometimes copy	yes	Beware of chained indexing; use .loc/.iloc and .copy() if needed.

Micro tip: List slices are safe but costly for large data; NumPy slices are cheap but can bite you if you mutate them unintentionally.
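
The bytearray row of the table can be checked with the standard library alone. A small sketch (variable names are illustrative): slicing the bytearray copies, while slicing a memoryview wrapped around it shares the buffer.

```python
buf = bytearray(b"abcdef")

chunk = buf[1:4]               # slicing a bytearray copies the bytes
chunk[0] = ord("X")
print(buf)                     # bytearray(b'abcdef'), original untouched

window = memoryview(buf)[1:4]  # slicing a memoryview shares the buffer
window[0] = ord("X")
print(buf)                     # bytearray(b'aXcdef'), original changed
print(bytes(window))           # b'Xcd'

window.release()               # release the view before resizing buf
```

While a memoryview is alive, the underlying bytearray cannot be resized, so releasing the view when you are done is a sensible habit.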

Examples you will absolutely make at 2AM

Python list: a safe copy

L = list(range(10))
sub = L[2:6]
sub[0] = 999
print(L)   # original unchanged
print(sub) # mutated

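Lists make a fresh object. This is predictable and safe, with one caveat: the copy is shallow, so the new list holds references to the same element objects. For flat lists of numbers or strings that never matters; for nested lists, mutating an inner object through the slice is still visible in the original. Copying big lists repeatedly is also slow.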
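
The shallow-copy caveat is easiest to see with nested lists. A minimal sketch with made-up data, using copy.deepcopy from the standard library when full independence is needed:

```python
import copy

rows = [[1, 2], [3, 4], [5, 6]]

shallow = rows[0:2]          # new outer list, same inner lists
shallow[0] = [0, 0]          # rebinding a slot: rows is unaffected
shallow[1].append(99)        # mutating a shared inner list: rows sees it
print(rows)                  # [[1, 2], [3, 4, 99], [5, 6]]

deep = copy.deepcopy(rows[0:2])   # fully independent copy
deep[1].append(-1)
print(rows)                  # [[1, 2], [3, 4, 99], [5, 6]]
```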

NumPy arrays: efficient views (and dangerous magic)

import numpy as np
A = np.arange(10)
view = A[2:6]
view[0] = 999
print(A)    # A is changed! view shares memory with A
print(view)

# If you need an independent copy, ask for one explicitly:
independent = A[2:6].copy()
independent[0] = -1
print(A)    # not affected by this assignment (A[2] is still 999 from above)

Check whether two arrays share memory:

np.shares_memory(A, view)  # True for views

And internals: for a true view, view.base is the array that owns the data (here, A). For an array that owns its own data, .base is None.
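
Put together, a short diagnostic might look like the sketch below. It assumes NumPy is installed; the variable names are illustrative.

```python
import numpy as np

A = np.arange(10)
view = A[2:6]                  # basic slice
independent = A[2:6].copy()    # explicit copy

print(view.base is A)                     # True
print(independent.base is None)           # True
print(view.flags["OWNDATA"])              # False
print(independent.flags["OWNDATA"])       # True
print(np.shares_memory(A, view))          # True
print(np.shares_memory(A, independent))   # False
```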

Fancy indexing vs slicing in NumPy

Fancy indexing with arrays of indices returns a copy (not a view):

indices = np.array([1,3,5])
sel = A[indices]  # copy, not a view

This trip-up is common: slicing (A[1:6]) -> view; fancy indexing (A[[1,3,5]]) -> copy.
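
The difference shows up as soon as you write to the result. A minimal sketch, assuming NumPy is installed; it also covers boolean masks, which behave like fancy indexing:

```python
import numpy as np

A = np.arange(10)

sliced = A[1:6]              # basic slicing: view
fancy = A[[1, 3, 5]]         # integer-array indexing: copy
masked = A[A % 2 == 1]       # boolean-mask indexing: copy

fancy[0] = -1                # changes only the copies
masked[0] = -1
print(A[1])                  # 1

sliced[0] = 100              # writes through to A
print(A[1])                  # 100

# Assigning through a fancy index on the left-hand side does modify A in place.
A[[3, 5]] = 0
print(A.tolist())            # [0, 100, 2, 0, 4, 0, 6, 7, 8, 9]
```

The last step is worth remembering: the copy only appears when you read with a fancy index. Writing with one goes straight into the original array.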

Pandas: the land of ambiguity (aka SettingWithCopyWarning)

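Whether a DataFrame selection is a view or a copy depends on the operation and on the internal memory layout, so it is hard to predict from the code alone. Pandas warns you with SettingWithCopyWarning when it suspects you're assigning to a copy. (Under Copy-on-Write, opt-in in pandas 2.x and the default in 3.0, a selection always behaves like a copy.)

Bad pattern (assigning into the result of an earlier selection, the same trap as chained indexing):

import pandas as pd

df = pd.DataFrame({'col': [-1, 2, 3]})  # illustrative data; any frame with a numeric 'col' works
subset = df[df['col'] > 0]
subset['new'] = 1  # might be assigning to a copy -> warning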

Better: use .loc and explicitly copy when required:

subset = df.loc[df['col'] > 0].copy()
subset['new'] = 1  # safe; no surprises

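Rule of thumb: if you plan to modify the subset, call .copy() on the DataFrame slice. If you actually want to modify the original, do it in a single step with df.loc[mask, column] = value.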
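
Both halves of that rule fit in a few lines. A sketch assuming pandas is installed; the column names and data are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"col": [-1, 2, 3]})   # illustrative data
mask = df["col"] > 0

# Independent working copy: changes stay in `subset`.
subset = df.loc[mask].copy()
subset["new"] = 1
print("new" in df.columns)     # False

# To change the original, assign in a single .loc call.
df["flag"] = 0
df.loc[mask, "flag"] = 1
print(df["flag"].tolist())     # [0, 1, 1]
```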

When to prefer views vs copies

Use views when: data is large, you need speed and lower memory usage, and you won't accidentally mutate the original (or you intend to). Common in model inference, windowing and feature selection.
Use copies when: you need safe independent manipulations without side effects (data cleaning, feature engineering drafts).

Performance note: every copy of a large array costs an allocation plus a full pass over the data, and doing that repeatedly inside a loop can dominate a pipeline's runtime and memory footprint. Use views and explicit copies consciously.
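
Windowing is a typical case where views pay off. The sketch below assumes NumPy 1.20 or later, which provides numpy.lib.stride_tricks.sliding_window_view; the signal and window size are made up. It compares copy-based windows with a single read-only view and makes no timing claims, so measure in your own pipeline.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

signal = np.arange(10, dtype=float)

# Copy-based windows: one new array per window.
copied = np.array([signal[i:i + 3].copy() for i in range(len(signal) - 2)])

# View-based windows: a read-only view over the same buffer, no per-window copy.
windows = sliding_window_view(signal, 3)

print(windows.shape)                      # (8, 3)
print(np.shares_memory(signal, windows))  # True
print(np.shares_memory(signal, copied))   # False
print(np.array_equal(windows, copied))    # True
print(windows.mean(axis=1)[:3])           # [1. 2. 3.]
```

Because the view is read-only by default, it cannot mutate the original by accident, which makes it a reasonably safe default for feature windows.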

Practical patterns for data science

For feature slicing in NumPy: prefer views by default, but call .copy() if you'll mutate.
In pandas, avoid chained indexing. Use df.loc[row_mask, col_list] and .copy() when you plan to change values.
Use memoryview(bytearray) when you need a buffer-like view into binary data.
Check np.shares_memory or ndarray.base when debugging mysterious mutations.

Short debugging checklist

If a change to your slice unexpectedly modifies the original: you probably have a view.
If slicing is slow and memory-heavy: you are probably copying large objects; consider views.
If pandas warns SettingWithCopyWarning: make a deliberate .copy() or use .loc properly.

Key takeaways (so you can recite at the next study group)

Slicing semantics differ by type: lists and strings create copies (shallow ones, for lists); basic NumPy slices are views, while fancy and boolean indexing return copies; pandas can be ambiguous.
Views save memory and time — but they share data, so mutating a view mutates the original.
When in doubt, copy explicitly with .copy() if you need an independent object.

"Treat slices like borrowing a book from a friend: if you dog-ear the pages (mutate), you should know whose book it is."

If you're building pipelines from the previous topics (dict-driven feature maps, set-based de-duplication), keep this in mind: choosing copy vs view affects both correctness and performance. Now go slice responsibly — and remember to .copy() when you're messy.