Python Foundations for Data Work
Master core Python syntax and tooling for data tasks, from environments and notebooks to clean, reliable scripts.
Running Scripts and Notebooks — Practical Ways to Execute Python for Data Work
"Not everything you do in a notebook should live in a notebook forever — and not every script should feel like a nuclear launch script."
You already installed Python and the tooling, and you've been tinkering in Jupyter and VS Code (great — those foundations matter). Now let's bridge the gap between interactive exploration and repeatable execution: how to run Python code reliably, whether it's a .py script, a Jupyter notebook cell, or an automated pipeline.
Why this matters (quick elevator pitch)
- Notebooks = glorious playgrounds for exploration, visualization, and storytelling. Great for iteration and demos.
- Scripts = repeatable, testable, and automatable. Great for production, scheduled jobs, and CI/CD.
Most data work needs both: you prototype in notebooks, then extract the solid logic into scripts or modules for reproducibility and automation. This lesson shows how to run both cleanly, and how to convert between them when needed.
Quick decision guide: Notebook vs Script (TL;DR)
- Use a notebook when: exploration, visualization, narrative (reports), teaching.
- Use a script when: production tasks, scheduled jobs, CLI tools, heavier compute.
Imagine the notebook as a Swiss Army knife in a messy lab coat — amazing for discovery. Scripts are the sterilized lab bench where you run the final experiment again and again without surprise.
Running a .py script: command line basics
Simple: open a terminal (you've already installed Python and set up your environment), then:

```shell
python my_script.py
```
Handy variations:
- Run a module: `python -m package.module` (useful for package entry points)
- Make it executable (Unix): put a shebang at the top and run `chmod +x`:
```python
#!/usr/bin/env python3
# my_script.py
if __name__ == '__main__':
    print('Hello from script')
```

```shell
chmod +x my_script.py
./my_script.py
```
Use `if __name__ == '__main__'`
This little pattern prevents top-level code from running when your file is imported as a module — essential for testability and reuse.
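As a minimal sketch, here is a hypothetical `greet.py` that is safe both to import and to run directly:

```python
# greet.py — hypothetical module: importable AND directly runnable
def greet(name):
    # Reusable logic lives in a function, not at module top level
    return f'Hello, {name}!'

if __name__ == '__main__':
    # Executes only via `python greet.py`, never on `import greet`
    print(greet('script user'))
```

Another file can now do `from greet import greet` without triggering the print.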
Passing arguments (argparse) — make your scripts configurable
```python
# cli_example.py
import argparse

def main(path, verbose=False):
    if verbose:
        print('Running verbosely')
    print(f'Processing {path}')

if __name__ == '__main__':
    p = argparse.ArgumentParser()
    p.add_argument('path')
    p.add_argument('--verbose', action='store_true')
    args = p.parse_args()
    main(args.path, args.verbose)
```
Run it: `python cli_example.py data/input.csv --verbose`
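argparse also handles typed options with defaults. A hypothetical variant of the script above (the `--limit` flag is an invented example, not from the original):

```python
import argparse

# Hypothetical extension of cli_example.py with a typed, defaulted option
parser = argparse.ArgumentParser(description='Process a data file')
parser.add_argument('path')
parser.add_argument('--limit', type=int, default=10, help='max rows to process')
parser.add_argument('--verbose', action='store_true')

# Passing an explicit list to parse_args() lets you exercise the parser
# without a real command line — handy for tests
args = parser.parse_args(['data/input.csv', '--limit', '5'])
print(args.path, args.limit, args.verbose)
```

Because `--limit` has `type=int`, argparse converts and validates the value for you; a non-numeric argument fails with a clear error.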
Virtual environments and correct interpreter (short reminder)
You learned installing Python and tooling earlier. Always run scripts with the interpreter that has the packages you expect. In practice:
- Activate your virtualenv (venv/conda) before running scripts
- In VS Code, select the correct interpreter from the Status Bar (or via Command Palette: "Python: Select Interpreter")
This avoids the classic "it works on my machine" disaster.
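A minimal Unix-shell sketch of the venv workflow (assumes `python3` is on your PATH; on Windows the activation script is `.venv\Scripts\activate`):

```shell
# Create an isolated environment next to your project
python3 -m venv .venv

# Activate it, so `python` and `pip` resolve inside .venv
. .venv/bin/activate

# Now installs and runs use the environment's interpreter
python -c 'import sys; print(sys.prefix)'
```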
Running code in VS Code (interactive + script) — brief recap
From the module on Working in Jupyter and VS Code: VS Code supports both notebooks and scripts. Useful features:
- Run a script with the Run button or Debug (breakpoints, step-through)
- Run cells in a `.py` file with `# %%` cell markers (interactive Python)
- Use a `launch.json` for custom runs (arguments via `args`, environment variables via `env`, working directory via `cwd`)
Example launch snippet (conceptual):

```json
{
    "name": "Run my script",
    "type": "debugpy",
    "request": "launch",
    "program": "${workspaceFolder}/cli_example.py",
    "args": ["data/input.csv", "--verbose"]
}
```
Running notebooks: cells, kernels, and automation
- In Jupyter Notebook or JupyterLab: run cells (Shift+Enter); restart the kernel if you hit stale or inconsistent state.
- In VS Code: notebooks run using kernels too; you can run cells inline or export.
Pro tip: restart the kernel and run all cells before sharing or exporting — that ensures the notebook runs from a fresh state.
Convert and run programmatically
- Export to script: `jupyter nbconvert --to script notebook.ipynb` (good for extracting code)
- Execute a notebook end-to-end: `jupyter nbconvert --to notebook --execute notebook.ipynb --output executed.ipynb`
- Parameterize and run: use Papermill to inject parameters and execute notebooks (great for batch reports):

```shell
pip install papermill
papermill template.ipynb output.ipynb -p data_path data/input.csv
```
From interactive prototyping to reproducible runs
- Move core logic into functions and modules. Notebooks should mostly orchestrate and visualize.
- Add tests for functions (pytest). Scripts can call those functions.
- Pin environments: requirements.txt or environment.yml (conda). Save kernelspecs for notebooks.
- Use logging instead of print for scripts.
"Turn messy notebook cells into neat functions — your future self will send you thank-you emojis."
Scheduling & automation
- Unix cron example (runs daily at 3am):
`0 3 * * * /path/to/venv/bin/python /path/to/script.py >> /var/log/myjob.log 2>&1`
- Windows Task Scheduler: point to python.exe and pass the script as an argument.
- Containers: wrap your script in Docker for portable, reproducible runs.
Short list of gotchas and best practices
- Avoid long-running, stateful computations inside notebook cells for production logic.
- Keep secrets out of code; use environment variables or secret stores.
- Use `requirements.txt` or `poetry`/`pipenv` for dependency management.
- Version your notebooks with tools like nbdime, or convert to scripts for diffs.
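To keep secrets out of code, read them from the environment at startup. A small sketch — the helper name and environment variable are hypothetical:

```python
import os

def get_secret(name):
    # Hypothetical helper: fail fast with a clear error
    # instead of shipping a hard-coded default
    value = os.environ.get(name)
    if value is None:
        raise KeyError(f'environment variable {name} is not set')
    return value
```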
Quick reference commands
- Run script: `python my_script.py`
- Make executable: `chmod +x my_script.py` (with shebang)
- Run module: `python -m package.module`
- Run notebook: `jupyter nbconvert --to notebook --execute notebook.ipynb`
- Parameterize notebook: `papermill template.ipynb out.ipynb -p param val`
Takeaways (the stuff you want to remember)
- Use notebooks for exploration and story-telling, scripts for repeatability and automation.
- Structure code into functions and modules so the same logic can run in both contexts.
- Always run code with the right virtual environment or interpreter (you already set this up when installing tooling).
- Use nbconvert/papermill to automate notebooks; use cron/Task Scheduler or Docker for scheduling and portability.
This is where your workflow becomes professional: fast experimentation in notebooks, steadfast reproducibility in scripts. Go forth, refactor those glorious notebook experiments into clean, runnable, and schedulable code — and have fun doing it.
Tags: beginner, runnable code, reproducibility