Python Foundations for Data Work
Master core Python syntax and tooling for data tasks, from environments and notebooks to clean, reliable scripts.
Installing Python and Tooling
This is the moment where your laptop stops being a sad spreadsheet machine and becomes a data scientist's spaceship.
Why this matters (and why you should care)
If you're taking a course called 'Python for Data Science, AI & Development', the first real-world barrier is not theory — it's tools. You can know linear regression in your bones, but if your environment is broken you will cry in terminal-font. Installing Python and the right tooling gets you from theory to experiments, reproducible analysis, and shiny notebooks.
In this guide you'll learn practical, battle-tested steps to install Python and set up the tools data people actually use: package managers, virtual environments, Jupyter, an editor, and quick troubleshooting.
What we'll cover
- How to get Python on Windows, macOS, and Linux
- Package and version managers: pip vs conda, plus pyenv for Python versions
- Virtual environments: venv, conda envs, pipenv/poetry (short primer)
- Essential packages and Jupyter
- Editor/IDE recommendations and useful extensions
- Common trouble spots and fixes
Step 1 — Install Python (the short routes)
Windows
- Easiest: Download the official installer from python.org and check 'Add Python to PATH' during install (recent installers label this box 'Add python.exe to PATH').
- Alternative for power users: Install Chocolatey and run:
choco install python
macOS
- Easiest: Use Homebrew:
brew install python
- Or download from python.org if you prefer GUI installers.
Linux (Ubuntu/Debian)
sudo apt update
sudo apt install python3 python3-venv python3-pip
After installing, verify:
python --version
python3 --version
pip --version
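If only python3 works on your system, that's okay: many macOS and Linux systems ship python3 but no plain python command, because python historically pointed to Python 2. On Windows, the python.org installer also provides the py launcher, so py --version is another way to check.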
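The same check can be done from inside Python, which is handy when you are not sure which interpreter a command actually launches. This is a minimal sketch using only the standard library; the function name describe_interpreter and the 3.8 minimum are illustrative assumptions, not requirements from this guide:

```python
import shutil
import sys


def describe_interpreter(min_version=(3, 8)):
    """Summarize the running interpreter and what the shell would find on PATH.

    min_version is an illustrative threshold, not an official requirement.
    """
    return {
        # Absolute path of the interpreter executing this code
        "executable": sys.executable,
        # Version as a plain string such as "3.11.4"
        "version": ".".join(str(part) for part in sys.version_info[:3]),
        "meets_minimum": sys.version_info[:2] >= tuple(min_version),
        # shutil.which returns None when a command is not on PATH
        "on_path": {
            name: shutil.which(name)
            for name in ("python", "python3", "pip", "pip3")
        },
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If "executable" and the paths under "on_path" point to different installations, that mismatch is usually the reason a package installs fine but fails to import.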
Step 2 — Choose a package + environment strategy (don’t panic; here's a simple rule)
Rule of thumb: For beginners and most data science workflows, use either conda (Anaconda/Miniconda) or Python + venv + pip. Pick one and be consistent.
Option A: Miniconda/Conda (recommended for data)
- Conda manages Python versions and binary packages cleanly (great for numpy/pandas/scikit-learn where compiled libs matter).
- Install Miniconda (lightweight) from the official site.
Quick commands:
conda create -n ds-env python=3.10
conda activate ds-env
conda install numpy pandas scikit-learn jupyterlab matplotlib seaborn
Option B: Python + venv + pip
- Use builtin venv to isolate projects, pip to install packages from PyPI.
python -m venv .venv
# Activate (Mac/Linux)
source .venv/bin/activate
# Activate (Windows PowerShell)
.\.venv\Scripts\Activate.ps1
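# Upgrade pip through the interpreter; a bare "pip install --upgrade pip" can fail on Windows
python -m pip install --upgrade pip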
pip install numpy pandas scikit-learn jupyterlab matplotlib seaborn
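After either option, it can help to confirm that the libraries landed in the environment you think is active. The sketch below uses only the standard library; check_packages is a hypothetical helper name, and the mapping from PyPI names to import names (for example, scikit-learn imports as sklearn) is written out by hand:

```python
from importlib import metadata
from importlib.util import find_spec

# PyPI distribution name -> top-level import name (they sometimes differ)
CORE_PACKAGES = {
    "numpy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "jupyterlab": "jupyterlab",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
}


def check_packages(packages=CORE_PACKAGES):
    """Return {distribution name: installed version, or None if missing}."""
    report = {}
    for dist_name, import_name in packages.items():
        if find_spec(import_name) is None:
            report[dist_name] = None  # not importable from this interpreter
            continue
        try:
            report[dist_name] = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            # Importable, but no distribution metadata was found
            report[dist_name] = "unknown"
    return report


if __name__ == "__main__":
    for name, version in check_packages().items():
        print(f"{name}: {version or 'MISSING'}")
```

Run it with the environment activated. A row marked MISSING usually means the package was installed into a different interpreter than the one running the script.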
Quick note on pyenv
- Use pyenv when you need several Python versions side by side on one machine (e.g., 3.8 for an old project and 3.11 for a new one). pyenv only manages interpreters, so combine it with virtual environments for per-project packages.
Step 3 — Install Jupyter and your notebook tooling
Jupyter is essential for data work — interactive exploration, plots, and documentation in one place.
Using pip:
pip install jupyterlab
jupyter lab
Using conda:
conda install -c conda-forge jupyterlab
jupyter lab
Tip: JupyterLab is the modern interface; you can also install the classic notebook if you prefer.
Step 4 — Pick an editor/IDE and recommended extensions
- VS Code (lightweight, extensible): install the Python extension and the Jupyter extension.
- PyCharm (full-featured IDE): great for larger projects.
- Optional: install Git and set up a GitHub account early.
VS Code quick setup:
- Install VS Code
- Install the 'Python' and 'Jupyter' extensions
- Open your project folder, select the interpreter (click the Python version in the status bar, or run 'Python: Select Interpreter' from the Command Palette), and choose your venv or conda env
Troubleshooting & common gotchas
- PATH not set: If python or pip is not found, add the install directory to PATH or use the python3/pip3 commands.
- DLL errors on Windows with compiled packages: use conda, which provides compatible binaries.
- Wrong interpreter in VS Code: select the correct interpreter manually.
- Virtual environment not activated: your prompt should change; if not, activate it explicitly before installing packages.
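For the last two items, you can ask Python directly which environment it is running in. This is a small sketch that assumes standard behavior: inside a venv, sys.prefix differs from sys.base_prefix, and conda activate sets the CONDA_DEFAULT_ENV environment variable. The name active_environment is illustrative:

```python
import os
import sys


def active_environment(environ=None):
    """Describe the environment this interpreter runs in.

    Returns a (kind, location) tuple where kind is "venv", "conda",
    or "system". environ defaults to os.environ and is a parameter
    only so the function is easy to test.
    """
    environ = os.environ if environ is None else environ
    # A venv points sys.prefix at the environment directory while
    # sys.base_prefix keeps pointing at the base installation.
    if sys.prefix != sys.base_prefix:
        return ("venv", sys.prefix)
    # conda sets this variable when an environment is activated
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return ("conda", conda_env)
    return ("system", sys.prefix)


if __name__ == "__main__":
    kind, location = active_environment()
    print(f"{kind}: {location}")
```

The conda branch only reports what the shell has activated, so it is a hint rather than proof. Comparing the result with sys.executable is a reasonable cross-check.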
Quick fix for permissions when pip fails:
# Prefer user-level installs or virtual envs; avoid sudo with pip
python -m pip install --user package-name
Best practices to avoid future pain
- Use virtual environments per project. Do NOT install every package globally.
- Pin dependencies with a requirements.txt or environment.yml.
- Use version control (git) from day one.
- Keep your Python version updated for new projects (but match old projects' versions when needed).
Example: save dependencies
# pip
pip freeze > requirements.txt
# conda
conda env export > environment.yml
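Once a requirements.txt exists, a quick script can flag drift between the pinned versions and what is actually installed. This is a rough sketch that only understands simple name==version lines (no extras, environment markers, or URLs); parse_pins and find_mismatches are hypothetical helpers, not part of pip:

```python
from importlib import metadata
from pathlib import Path


def parse_pins(text):
    """Parse simple 'name==version' lines; skip blanks, comments, other forms."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed)} for packages that differ.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    requirements = Path("requirements.txt")
    if requirements.exists():
        pins = parse_pins(requirements.read_text(encoding="utf-8"))
        for name, (pinned, installed) in find_mismatches(pins).items():
            print(f"{name}: pinned {pinned}, installed {installed}")
```

An empty result means the active environment matches the pins the parser could read; anything more complex than name==version is silently skipped.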
Quick checklist — go-time
- Install Python or Miniconda
- Create & activate a virtual environment
- Install JupyterLab + core libraries (numpy, pandas, matplotlib, scikit-learn)
- Install VS Code and the Python/Jupyter extensions
- Initialize git in your project folder
Final takeaways
- Python tooling is an investment. Spend 30–60 minutes now to set it up right and save hours later.
- Choose conda if you want fewer headaches with compiled libraries; choose venv/pip for minimalism and PyPI access.
- Virtual environments = your future sanity. Use them.
Remember: installing Python is not glamorous, but it is the bootstrap ritual that turns your computer into a lab. Start with a clean environment, document your steps, and you'll be doing reproducible data work faster than you can say 'import pandas as pd'.
Tags: beginner, python, data-science, tooling