Python Foundations for Data Work
Master core Python syntax and tooling for data tasks, from environments and notebooks to clean, reliable scripts.
Working in Jupyter and VS Code — Practical Workflow for Python Data Work
You already installed Python, pip/conda, and the basic tooling (see "Installing Python and Tooling"). Great — now let's make those tools actually useful instead of just being decorative icons on your desktop.
Why this matters (skip if you already love notebooks)
If "Installing Python and Tooling" was you laying the foundation bricks, this guide shows how to build the house where your data lives. Jupyter is the fast, interactive sketchpad for experiments and exploration. VS Code is the workshop and control center for building, debugging, and productionizing code. Mastering both gives you flexibility: quick experiments that scale into reproducible projects.
Jupyter vs VS Code — when to use which
Jupyter Notebook / JupyterLab
- Best for: exploratory data analysis (EDA), visualization, interactive demos, teaching, and iterative experiments.
- Strengths: cell-based execution, inline plots, rich markdown + LaTeX, rapid iteration.
VS Code (with Python + Jupyter extensions)
- Best for: building larger scripts, debugging, version-controlled projects, converting notebooks to modules, and mixed workflows (notebooks + .py files).
- Strengths: integrated terminal, debugger, extensions, Git integration, improved refactoring.
Think of Jupyter as the sketchbook and VS Code as the lab. One is for scribbling brilliant ideas, the other for turning scribbles into something you can ship without crying at pipeline failures.
Quick start: open a notebook in VS Code and JupyterLab
JupyterLab (classic route)
- Activate your environment (you learned to create it in the previous lesson):
# venv example
python -m venv .venv
source .venv/bin/activate # macOS / Linux
.\.venv\Scripts\activate # Windows PowerShell
# or conda
conda activate myenv
- Install Jupyter and start:
pip install jupyterlab
jupyter lab
- Open the browser UI, create a new notebook, and enjoy cell-based glory.
VS Code (recommended single-app workflow)
- Install the Python and Jupyter extensions.
- Open your project folder, select the interpreter (bottom-left), and open/create a .ipynb file.
- Run cells inside VS Code, use the interactive window, and use the variable explorer.
Kernels and environments — the secret to reproducibility
- A kernel is the process that executes your notebook's code. Kernels map to specific Python environments.
- Always select the kernel that matches your project's virtual environment. This avoids the disaster of notebooks using a global, outdated package when you thought you were in a project-specific env.
Micro explanation: If you change dependencies, create a fresh environment and register it as a kernel:
pip install ipykernel
python -m ipykernel install --user --name=myenv --display-name "Python (myenv)"
Now choose Python (myenv) as the kernel in Jupyter or VS Code.
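A quick sanity check is to confirm, from inside a notebook cell, which interpreter the kernel is actually using. This minimal sketch works in any kernel; for a project-specific environment the path should point inside your .venv (or conda env):

```python
import sys

# The interpreter behind the current kernel. If this points at a global
# Python when you expected your project env, you picked the wrong kernel.
print(sys.executable)

# Inside a venv, sys.prefix differs from sys.base_prefix, so this flag
# tells you whether the kernel lives in a virtual environment at all.
in_venv = sys.prefix != sys.base_prefix
print("running in a virtual environment:", in_venv)
```

Run this as the first cell of a new notebook whenever a mysterious ImportError appears; nine times out of ten the answer is right there in the path.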
Productivity tips & powerful shortcuts
- Run cell: Shift+Enter
- Run cell and insert below: Alt+Enter
- Restart kernel + run all (Jupyter) — the sanity-check combo to make sure everything runs from a clean state.
- In VS Code: use the Debug Cell command to step through cell execution with breakpoints.
Handy cell magics:
# install packages from inside a notebook (safer than pip in OS shell sometimes)
%pip install pandas seaborn
# time a statement
%timeit df['x'].mean()
Pro tip: use %pip and %conda magics inside notebooks so the install affects the notebook kernel, not the system shell.
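The reason %pip is the safer choice: it installs via the kernel's own interpreter rather than whatever pip happens to be first on your shell's PATH. A rough sketch of the equivalence, here just querying pip's version through the kernel's interpreter as a harmless demonstration:

```python
import subprocess
import sys

# %pip install <pkg> behaves roughly like "sys.executable -m pip install <pkg>",
# which guarantees the package lands in the environment the notebook uses.
result = subprocess.run(
    [sys.executable, "-m", "pip", "--version"],
    capture_output=True,
    text=True,
)
print(result.stdout.strip())
```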
Debugging and inspecting data
- VS Code lets you set breakpoints inside cells and step through code, which is a game changer compared to print-debugging in notebooks.
- Use the Variables pane (VS Code) or df.head() / df.info() / df.describe() in notebooks.
- For DataFrames, use the Data Viewer (click the little grid icon in VS Code) or df.head() in JupyterLab to inspect columns, dtypes, and missing values visually.
Example: debug a cell
# In VS Code, right-click the cell and choose 'Debug Cell'
# Then set breakpoints inside functions and step through
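The inspection helpers above are easiest to see on a tiny example. Here is a toy DataFrame (column names invented for illustration) with a deliberate missing value, so info() and isna() have something to report:

```python
import pandas as pd

# Toy DataFrame with one missing value to inspect
df = pd.DataFrame({
    "x": [1.0, 2.0, None, 4.0],
    "label": ["a", "b", "c", "d"],
})

df.head()        # first rows -- what the Data Viewer shows graphically
df.info()        # dtypes and non-null counts per column
df.describe()    # summary statistics for numeric columns

# Count missing values per column -- the gap info() hints at
missing = df.isna().sum()
print(missing)
```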
Converting notebooks and preparing for production
- Convert to script:
jupyter nbconvert --to script notebook.ipynb
- Or in VS Code, use the export-to-script command: it creates # %% cell markers that let you run code interactively in an editor.
- For production: move heavy logic into .py modules and import them in a lightweight notebook. Notebooks should be orchestration + visualization, not monoliths of business logic.
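The exported .py format uses # %% comment markers to delimit cells, so VS Code can still run it cell by cell while plain Python runs it top to bottom. A minimal sketch of the pattern (the clean function is a made-up stand-in for real logic):

```python
# %% [markdown]
# Heavy logic lives in plain functions so it can be imported and tested.

# %%
def clean(values):
    """Drop None entries -- stand-in for real preprocessing logic."""
    return [v for v in values if v is not None]

# %%
# This cell plays the "orchestration" role a lightweight notebook would.
cleaned = clean([1, None, 2, None, 3])
print(cleaned)  # -> [1, 2, 3]
```

Once clean() graduates into a real module, the notebook shrinks to an import plus a few plotting cells, which is exactly where you want it.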
Version control & collaboration
- Notebooks are JSON — messy for diffs. Two strategies:
- Use tools like nbdime for cleaner diffs.
- Keep core logic in .py files and use notebooks mostly for EDA and reports.
- VS Code integrates Git, so commit notebooks, but consider clearing outputs before committing to reduce noise:
jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace notebook.ipynb
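Since a notebook is plain JSON, the same cleanup can be sketched with the standard library alone. This is an illustrative helper, not how nbconvert is implemented:

```python
import json

def clear_outputs(nb: dict) -> dict:
    """Strip outputs and execution counts from code cells, in place."""
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return nb

# Tiny in-memory notebook to demonstrate the effect
nb = {
    "cells": [
        {"cell_type": "code", "outputs": [{"text": "42"}], "execution_count": 3},
        {"cell_type": "markdown", "source": "# Notes"},
    ]
}
clear_outputs(nb)
print(json.dumps(nb, indent=1))
```

Markdown cells pass through untouched; only code-cell outputs and counts are cleared, which is what keeps diffs small.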
Security & sanity checks
- Never run untrusted notebooks. They can execute arbitrary code on your machine.
- Restart the kernel and run all cells to ensure reproducibility before sharing results.
Quote to remember:
"Notebooks are wonderful until they are not reproducible. Clean kernels are your friend."
Common pitfalls and how to avoid them
- Pitfall: notebook uses a different interpreter than your project. Fix: always verify the kernel/interpreter.
- Pitfall: large outputs committed to Git. Fix: clear outputs or use .gitattributes to handle notebooks.
- Pitfall: installing packages with !pip (shell) vs %pip (magic). Prefer %pip so the package goes into the active kernel environment.
Quick checklist for day-to-day data work
- Activate the correct environment / select the correct kernel
- Use %pip to install packages inside notebooks
- Restart kernel + Run all before sharing
- Move reusable code into .py modules and import
- Use VS Code debugger for complex issues
- Use nbconvert/nbdime or keep outputs out of Git to reduce merge pain
Key takeaways
- Jupyter = exploration; VS Code = development. Use both and let each do what it does best.
- Manage kernels/environments explicitly — reproducibility depends on it.
- Use VS Code's debugger, variable explorer, and Git integration to take notebooks from ad-hoc to production-ready.
Final memorable insight:
You don't have to choose only one tool. Think of notebooks as your creative lab notebooks and VS Code as the precision toolset that makes experiments reliable and shareable. Together, they make you not just a tinkerer, but a reproducible data scientist.
Tags: beginner, data-science, python, humorous