Full Stack AI and Data Science Professional
Foundations of AI and Data Science


Core concepts, roles, workflows, and ethics that frame end‑to‑end AI projects.

The No-Chill CLI Crash Course

Command Line Essentials: The Power Tools Your GUI Was Hiding From You

"The command line is like the gym for your brain — minimal decor, no distractions, wildly effective. Also a little scary until you learn where the weights go."

We just wrangled environments and dependencies, and had a civil-yet-spicy debate about notebooks vs scripts. Now it’s time to learn the thing that stitches those worlds together: the command line. The CLI is how you glue workflows, automate the boring parts, and yeet friction out of your data life. If you’ve ever thought, "There must be a faster way," the CLI politely says, "There is."


What Even Is a Shell (And Why Should AI People Care)?

  • A shell is your text-based interface to the computer. Common ones:
    • bash/zsh (macOS/Linux)
    • PowerShell (Windows)
  • You type commands; it does your bidding (usually). This is where you:
    • Spin up/activate environments
    • Run scripts and notebooks
    • Inspect data files quickly
    • Fetch datasets and wire up pipelines

Expert take: If your workflow can’t be expressed on the command line, it’ll be hard to automate, version, and scale. GUI clicks don’t commit to Git.
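To make that concrete: here's a sketch of a tiny workflow captured as a shell script (the file names are illustrative, and the seq line just manufactures fake data so the example runs anywhere). Save it as, say, run_pipeline.sh, commit it, rerun it: that's the whole pitch.

```shell
#!/usr/bin/env bash
# An illustrative three-step "pipeline" as a script: because it's plain
# text, it can be committed to Git, reviewed, and rerun anywhere.
set -eu                                    # stop on errors and unset variables

mkdir -p data/raw data/processed           # ensure the directories exist
seq 1 200 > data/raw/input.csv             # stand-in for a real dataset
head -n 100 data/raw/input.csv > data/processed/sample.csv   # take a sample
wc -l data/processed/sample.csv            # report the sample's row count
```

Run it with bash run_pipeline.sh (or chmod +x it first); every step is now documented, versionable, and repeatable.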


Navigating Like a Pro (aka: Stop Getting Lost)

You live in a filesystem. Know the neighborhood.

  • pwd — print working directory (where am I?)
  • ls -lah — list files (show me everything, including hidden dotfiles)
  • cd path/to/place — go somewhere
  • cd .. — go up one level; cd ~ — go home
  • mkdir -p data/raw — make directories, parents included
  • touch notes.txt — create an empty file
  • cp src.py backup/src.py — copy; mv a b — move/rename
  • rm file; rm -r folder — remove (careful)
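Strung together, those commands make a short session. A sketch (the directory and file names are made up):

```shell
mkdir -p projects/demo/data          # nested directories in one go
cd projects/demo                     # move into the project
pwd                                  # confirm where we are
touch notes.txt                      # create an empty file
cp notes.txt data/notes-backup.txt   # copy it into data/
ls -lah data/                        # list it, sizes and dotfiles included
cd ../..                             # go back up two levels
```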

Paths & globs you will meet:

  • . = current dir, .. = parent, ~ = home
  • *.csv matches all CSVs; with mkdir -p, data/{raw,processed} expands to both paths (brace expansion), creating two dirs at once
  • Quote paths with spaces: cd 'My Data'
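A self-contained demo of globs and brace expansion (bash/zsh; the files exist only for the demo):

```shell
mkdir -p glob-demo/{raw,processed}            # brace expansion: two dirs, one command
touch glob-demo/raw/a.csv glob-demo/raw/b.csv glob-demo/raw/readme.txt
ls glob-demo/raw/*.csv                        # the glob matches only the CSVs
mkdir -p 'glob-demo/My Data'                  # quotes keep the space in one argument
ls glob-demo                                  # raw, processed, and 'My Data'
```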

Quick peek at files:

  • head -n 5 big.csv — first 5 lines
  • tail -n 5 big.csv — last 5 lines
  • wc -l big.csv — how many rows
  • du -sh data/ — folder size

Pipes, Redirection, and The Art of Doing 5 Things At Once

  • > redirect output to a file; >> append
  • | pipe output of one command into the next

Examples you’ll use on day one:

# Count unique values in a column (CSV, comma-separated)
cut -d, -f3 data.csv | sort | uniq -c | sort -nr | head

# Save the first 1000 rows of a huge file
head -n 1000 big.csv > sample.csv

# Log output while still seeing it in the terminal
python train.py | tee logs/train.out

Working with compressed files:

zcat big.csv.gz | head
zgrep -i 'error' logs.gz

Your pipeline is a conveyor belt. Each command adds a transformation. Lego, but for text.


Find Stuff Fast: grep, find, jq (Your New Besties)

  • grep -R 'pattern' . — search recursively for text in files
  • grep -R --line-number --ignore-case 'todo' src/
  • find . -maxdepth 2 -name '*.ipynb' — find notebooks nearby (options like -maxdepth go before tests like -name)

For JSON (APIs, logs), meet jq:

# Pretty-print JSON
echo '{"acc":0.91,"loss":0.23}' | jq .

# Extract a field from a JSONL dataset
jq -r '.label' data.jsonl | sort | uniq -c

Lightweight text surgery:

# Replace tabs with commas in a TSV
sed 's/\t/,/g' data.tsv > data.csv

# Sum the 2nd column (numbers only)
awk -F, '{sum += $2} END {print sum}' data.csv

Why do people keep misunderstanding this? Because grep/awk/sed look like line noise. But they’re fast, composable, and perfect for quick checks without spinning up Python.


Environments & Dependencies — But Make It CLI

Remember our environment saga? Here’s the command-line muscle behind it.

Conda:

conda create -n ds-env python=3.11
conda activate ds-env
conda install numpy pandas scikit-learn
conda env export > environment.yml

venv + pip:

python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt
pip freeze > requirements.txt

Path sanity checks:

which python         # macOS/Linux
where python         # Windows
python -c 'import sys; print(sys.executable)'

Environment variables (for API keys, secrets):

export OPENAI_API_KEY=sk-...
export WANDB_PROJECT=my-experiment
python train.py

Pro-tip: use a .env file with a loader (e.g., python-dotenv) or direnv so you don’t accidentally leak secrets in bash history.
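A minimal sketch of the .env pattern using nothing but the shell (the key values here are dummies; real secrets never get committed):

```shell
# Write a throwaway .env file (in real projects, add .env to .gitignore)
cat > .env <<'EOF'
API_KEY=sk-not-a-real-key
WANDB_PROJECT=my-experiment
EOF

set -a           # auto-export every variable assigned from here on
. ./.env         # source the file: each KEY=value line becomes an env var
set +a           # stop auto-exporting

echo "$WANDB_PROJECT"   # prints: my-experiment
```

This handles simple KEY=value lines; for quoting edge cases and per-directory loading, tools like python-dotenv or direnv are sturdier.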


Notebooks vs Scripts: Command-Line Edition

  • Start a notebook server:
jupyter lab  # or: jupyter notebook
  • Run a notebook headlessly (great for CI):
jupyter nbconvert --to notebook --execute notebook.ipynb --output executed.ipynb
  • Run a script with arguments:
python train.py --epochs 10 --lr 3e-4 --data data/processed
  • Make a script directly executable:
# In train.py, the very first line must be the shebang (no space after #):
#!/usr/bin/env python

chmod +x train.py
./train.py --help

Notebooks are for exploration; scripts are for repeatability. The CLI is how you move from vibes to verified.


Git, Quickly (Because Future You Deserves Nice Things)

git init
git status
git add src/ notebook.ipynb requirements.txt
git commit -m 'Add baseline model'
  • Use .gitignore to avoid committing gigantic datasets and environment folders:
# .gitignore
*.pyc
.venv/
__pycache__/
.env
/data/

Bonus: git lfs for large artifacts, or use dataset registries and keep repos lean.


Fetch Data Like a Hacker (Legally)

curl -L -o data/raw/housing.csv https://example.com/housing.csv
wget -P data/raw https://example.com/housing.csv

# Test an API and parse JSON
curl -s 'https://api.example.com/items?limit=5' | jq '.items[] | {id, name}'

Remote machines:

ssh user@server
scp model.pkl user@server:/home/user/models/

Permissions, Sudo, and Other Spicy Buttons

  • Who am I? whoami
  • What’s executable? ls -l
  • Make it executable: chmod u+x script.sh
  • Ownership: chown user:group file

Use sudo sparingly. If you need it to install Python packages, consider fixing your environment instead.

A good rule: if a command makes you sweat, try a dry-run or read the --help first.
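Here's what those permission bits look like in motion, using a throwaway script:

```shell
printf '#!/usr/bin/env bash\necho hello\n' > hello.sh   # a trivial two-line script
ls -l hello.sh        # typically -rw-r--r--: readable, but not executable
chmod u+x hello.sh    # grant the owner (u) execute (x) permission
ls -l hello.sh        # now -rwxr--r--: the x bit appears in the owner slot
./hello.sh            # prints: hello
```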


Customize Your Shell (Treat Yo’Self)

  • Add aliases and functions in ~/.bashrc or ~/.zshrc:
alias gs='git status'
alias ll='ls -lah'
function mkcd() { mkdir -p "$1" && cd "$1"; }
  • Persistent environment setup:
export PYTHONBREAKPOINT=ipdb.set_trace
export PIP_INDEX_URL=https://pypi.org/simple

Reload with source ~/.zshrc (or open a new terminal).


Cross-Platform Notes (So You Don’t Cry Later)

  • Windows: PowerShell is not bash. Install WSL for a Linux-like environment.
  • Paths: Windows uses backslashes; bash uses slashes. Many tools expect /.
  • Quoting rules differ; when scripts must run everywhere, prefer Python entrypoints.

Cheat Sheet: Commands You’ll Actually Use

Command            What it does                Why a data person cares
ls -lah            List files with sizes       Spot giant CSVs before RAM screams
head/tail          Peek at files               Sanity-check data quickly
wc -l              Count lines                 Instant row count
cut/sort/uniq      Column ops + dedupe         Explore categories and frequency
grep -R            Search text recursively     Find code, configs, log patterns
find               Locate files by name/type   Hunt notebooks or models
jq                 JSON query                  APIs, logs, configs at speed
conda/venv         Manage environments         Reproducible science
python script.py   Run scripts                 Batch jobs, automation
jupyter nbconvert  Execute notebooks           CI and reproducibility
curl/wget          Download data               Pipeline inputs
git                Version control             Collaborate without chaos

Small Frictions That Cause Big Headaches (and Fixes)

  • Spaces in filenames? Use quotes: cd 'My Data'
  • Accidentally nuked a folder with rm -r? rm is permanent, so there's no undo; for a safety net next time, use trash-cli, which moves files to the system trash instead.
  • Mysterious 'command not found'? Check echo $PATH. If the tool's directory isn't on PATH, reinstall the tool or add that directory to PATH.
  • Python mismatch? Run which python, then python -V, and activate the right environment.
  • Slow notebook? Check running processes with top or htop (htop usually needs installing first), and watch that memory.
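For the 'command not found' case in particular, a couple of one-liners usually crack it:

```shell
echo "$PATH" | tr ':' '\n'   # show every PATH directory on its own line
command -v ls                # print the full path the shell will actually run
type ls                      # or: is this name an alias, builtin, or file?
```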

Try This Mini-Workflow

# 1) Create project skeleton
mkdir -p ds-project/{data/raw,data/processed,src,notebooks}
cd ds-project

# 2) Environment
python -m venv .venv && source .venv/bin/activate
pip install pandas scikit-learn jupyter

# 3) Get data
curl -L -o data/raw/titanic.csv https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv

# 4) Quick checks
wc -l data/raw/titanic.csv
head data/raw/titanic.csv | cut -d, -f3 | sort | uniq -c

# 5) Start notebook for exploration
jupyter lab

If it feels smooth, you’ve tasted CLI power. If it feels chaotic, that’s normal — you just leveled up from tourist to apprentice.


Wrap-Up: The CLI Is Your Exoskeleton

  • The command line gives you speed, automation, and reproducibility.
  • Environments, notebooks, and scripts all become more useful when you can glue them with pipes, redirection, and a few trusty utilities.
  • Your future self (and your teammates) will thank you for commands that can be documented, versioned, and rerun.

Final insight: Tools change; text interfaces endure. Learn the CLI once, and every new stack bows a little faster.

Now go open a terminal and make your computer do tricks.
