
Python for Data Science, AI & Development

Python Foundations for Data Work


Master core Python syntax and tooling for data tasks, from environments and notebooks to clean, reliable scripts.


Modules and Imports — Practical Tools for Reusable Data Code

You've already seen how to wrap logic in functions and document them with docstrings, and how conditionals steer program flow. Now imagine those tidy, well-documented functions living in a well-kept apartment building where anyone on your team (or your future self at 3 AM) can borrow them — that's what modules and imports give you in Python.

"This is the moment where the concept finally clicks: functions are friends, modules are neighborhoods, and imports are the subway."


What a module is (and why you care)

  • Module = a single Python file (e.g., utils.py) containing values, functions, and classes.
  • Package = a folder containing modules plus an __init__.py file (which makes the folder importable as one unit).

Why it matters for data work:

  • Reuse common ETL functions across projects (cleaning, validation, feature engineering).
  • Keep notebooks readable by importing tested utilities instead of copy-pasting code.
  • Share code across teams and version control.

Real-world analogy

Think of a module as a cookbook chapter called "data_cleaning.py". Instead of rewriting the recipe each time, you import the recipe and apply it. Your data becomes edible faster.


Basic import patterns (and when to use them)

  1. Import the whole module:
import math_utils
math_utils.normalize_column(df, "age")
  • Keeps the namespace explicit (recommended for clarity).
  2. Import with an alias (very common in data science):
import numpy as np
import pandas as pd
  • Shortens long names; standard conventions (np, pd) aid readability.
  3. Import specific names:
from math_utils import normalize_column, scale_values
normalize_column(df, "age")
  • Good for grabbing just what you need; watch for name collisions.
  4. Wildcard import (don't do this in production):
from math_utils import *  # makes debugging a nightmare
  • Pollutes your namespace and hides where symbols come from.
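To make the wildcard hazard concrete, here is a runnable sketch using two standard-library modules that both export a `log` function; the second wildcard import silently shadows the first:

```python
# Two wildcard imports: math and cmath both export `log`, and the
# later import silently wins -- there is no warning at all.
from math import *   # brings in the real-valued log
from cmath import *  # also defines log; it now shadows math.log

# math.log(-1) would raise ValueError, but the shadowing cmath.log
# happily returns a complex result instead:
print(log(-1))  # 3.141592653589793j
```

With explicit imports (`import math`, `import cmath`) the two functions stay distinguishable and the surprise never happens.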

Example: building a tiny module (and using docstrings)

Create a file math_utils.py:

# math_utils.py
"""Utilities for numeric columns in dataframes.

Functions:
- mean_ignore_na
- normalize_column
"""

import numpy as np

def mean_ignore_na(arr):
    """Return mean ignoring missing values."""
    return np.nanmean(arr)


def normalize_column(df, col):
    """Scale a DataFrame column to zero mean and unit variance.

    Note: modifies df in place and also returns it.
    """
    mu = mean_ignore_na(df[col])
    sigma = np.nanstd(df[col])  # standard deviation, ignoring NaNs
    df[col] = (df[col] - mu) / sigma
    return df


if __name__ == "__main__":
    # quick local tests — won't run when imported
    import pandas as pd
    df = pd.DataFrame({"x": [1, 2, None, 4]})
    print(normalize_column(df, "x"))

Notes:

  • We used docstrings in the module and functions — remember your Functions & Docstrings lesson.
  • The if __name__ == "__main__": block is a great place for small demos or smoke tests.
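A quick way to see the mechanism behind that guard: `__name__` is set to the module's dotted name when a file is imported, and to `"__main__"` when the file is executed directly. A tiny check using the standard library:

```python
# __name__ for an imported module is the module's own name; in the
# file being run directly it is "__main__", which is what the guard
# in math_utils.py tests.
import math

print(math.__name__)  # math
print(__name__)       # "__main__" when this file is executed directly
```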

Packages and relative imports (project structure)

Example layout for a data pipeline:

project/
  data_pipeline/
    __init__.py
    extract.py
    transform.py
    load.py
  scripts/
    run_pipeline.py

Inside transform.py you can reference sibling module functions with relative imports:

# transform.py
from .extract import load_raw_data
from .load import save_clean

Relative imports are perfect when organizing a package that will be installed or reused.


How Python finds modules

Python searches directories in sys.path. Typical entries include:

  • Directory containing the running script.
  • Entries from the PYTHONPATH environment variable.
  • Standard library directories and site-packages.

import sys
print(sys.path)

If your module isn't found, check your current working directory and virtual environment.
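As a runnable illustration of both ideas, package layout with a relative import and sys.path controlling what is findable, the sketch below builds a throwaway package in a temp directory. All the names here (demo_pkg, helpers, core, double) are invented for the demo:

```python
# Build a tiny package on disk, put its parent directory on sys.path,
# and import it. This mirrors the data_pipeline layout in miniature.
import os
import sys
import tempfile
import importlib

root = tempfile.mkdtemp()
pkg = os.path.join(root, "demo_pkg")
os.makedirs(pkg)

# an empty __init__.py marks the folder as a package
open(os.path.join(pkg, "__init__.py"), "w").close()

with open(os.path.join(pkg, "helpers.py"), "w") as f:
    f.write("def double(x):\n    return 2 * x\n")

# core.py reaches its sibling module with a relative import
with open(os.path.join(pkg, "core.py"), "w") as f:
    f.write("from .helpers import double\n")

sys.path.insert(0, root)  # now Python's module search can find demo_pkg
core = importlib.import_module("demo_pkg.core")
print(core.double(21))  # 42
```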


Tips and best practices for data work

  • Use explicit imports in scripts and notebooks to make origins clear.
  • Follow common alias conventions (np, pd, plt) to help collaborators.
  • Keep utility modules small and focused — e.g., cleaning.py, viz.py, metrics.py.
  • Use if __name__ == "__main__": for module-level quick checks and demos.
  • Avoid from module import * — it's a readability and debugging hazard.
  • For long imports that slow startup, consider lazy imports inside functions:
def heavy_transform(df):
    import pyarrow as pa  # only imported when needed
    ...
  • Use virtual environments and requirements.txt (or pyproject.toml) to lock dependencies for reproducibility.
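For the last point, a minimal requirements.txt covering the libraries mentioned in this lesson might look like the following (the version pins are illustrative, not prescriptive):

```text
numpy>=1.24
pandas>=2.0
pyarrow>=14.0
```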

Interactive workflow: editing a module while a REPL is open

When you change a module during an interactive session (like a Jupyter notebook), reload it:

import importlib
import math_utils
importlib.reload(math_utils)

This avoids restarting the kernel for small edits.
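In Jupyter specifically, IPython's autoreload extension can automate this, refreshing edited modules before each cell runs:

```text
%load_ext autoreload
%autoreload 2   # reload all modules (except those excluded) before running code
```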


Watch out for circular imports

If module A imports B and B imports A at top-level, Python can get stuck. Fixes:

  • Move imports inside functions (deferred import).
  • Refactor shared code into a new module C that both can import.
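The deferred-import fix can be seen end to end in this runnable sketch: it writes two interdependent modules to a temp directory, with the back-reference imported inside a function so the cycle never triggers. The module names (a_mod, b_mod) are invented:

```python
# a_mod imports b_mod at top level; b_mod needs a_mod, but only
# inside a function, so by the time the deferred import runs,
# a_mod is fully loaded and no circular-import error occurs.
import os
import sys
import tempfile
import importlib

root = tempfile.mkdtemp()

with open(os.path.join(root, "a_mod.py"), "w") as f:
    f.write(
        "import b_mod\n"
        "GREETING = 'hello'\n"
        "def run():\n"
        "    return b_mod.shout()\n"
    )

with open(os.path.join(root, "b_mod.py"), "w") as f:
    f.write(
        "def shout():\n"
        "    import a_mod  # deferred: a_mod is fully loaded by now\n"
        "    return a_mod.GREETING.upper()\n"
    )

sys.path.insert(0, root)
a_mod = importlib.import_module("a_mod")
print(a_mod.run())  # HELLO
```

If b_mod instead did `import a_mod` at top level, the two files would import each other before either finished loading, and one would see a half-initialized module.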

Quick practical checklist before you commit code

  • Are imports explicit and clear?
  • Are utility functions documented with docstrings?
  • Did you avoid wildcard imports?
  • Is package layout logical for reuse?
  • Are heavy imports deferred if they slow tests or CLI tools?

Key takeaways

  • Modules let you bundle and reuse functions, classes, and constants — essential for tidy data code.
  • Use explicit imports to keep namespaces clear; alias standard libs (np, pd) for readability.
  • if __name__ == "__main__": is your friend for quick demos and module-level tests.
  • Keep packages well-structured and prefer relative imports inside a project.
  • For interactive work, use importlib.reload() to refresh edited modules.

Remember: good modules are like tidy toolboxes. When your workflow needs a hammer, you should be able to pull one out without rummaging through a pile of half-broken scripts.

"The best code is the code you can find in the dark at 2 AM." — maybe you, after writing a solid module structure.


