Foundations of Generative AI
Establish how modern LLMs generate text, the role of tokens and probabilities, and the constraints that shape prompt behavior.
Tokens and Tokenization
Tokens and Tokenization — The Tiny Building Blocks That Run Large Models
"If transformers are the brains, tokens are the neurons — tiny, weird, and absolutely essential."
You just read about transformer internals and where deep learning sits in the stack (shout-out to the previous modules). Now it’s time to zoom in even closer: how do words, punctuation, and even emojis become something a model can actually compute with? Welcome to the gritty little world of tokens and tokenization.
Hook: Imagine building IKEA furniture without screws
You get a box of parts, but nothing is labeled. Some things look like planks, some like bolts — and you call customer support. That’s a model without tokenization. Tokens are the screws and bolts that let the machine assemble language into meaning.
Why this matters: tokenization determines how input is chopped, how many tokens your prompt costs, how the model generalizes to rare words, and how outputs can get weirdly split. For prompt engineering, tokenization is a silent contract between you and the model.
What is a token? What is tokenization?
- Token: a discrete unit the model uses as input/output. Could be a whole word, part of a word, a punctuation mark, or even a byte sequence.
- Tokenization: the process that maps raw text (human language) into a sequence of tokens.
Big idea: tokens are not the same as words. The word "unbelievable" might be one token, two, or five tokens depending on the tokenizer. That affects both cost (token limits) and performance.
Common tokenization strategies (simple cheat-sheet)
| Type | What it does | Pros | Cons |
|---|---|---|---|
| Character-level | Splits into individual characters | No OOV (out-of-vocab), simple | Long sequences, inefficient |
| Word-level | Splits on whitespace/punctuation | Intuitive, short tokens | Huge vocab, fails on rare words/languages |
| Subword (BPE, WordPiece, Unigram) | Breaks words into common subparts | Compact vocab, handles rare words | Can split inside morphemes, non-intuitive breaks |
| Byte-level | Encodes bytes directly (e.g., UTF-8) | Language-agnostic, robust | Less human-readable tokens |
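The strategies in the table can be sketched with toy splitters. These are illustrative one-liners, not production tokenizers (real subword tokenizers learn their vocabulary from data):

```python
# Toy versions of three tokenization strategies from the table above.
text = "unbelievable results!"

char_tokens = list(text)                  # character-level: one token per character
word_tokens = text.split()                # word-level: split on whitespace
byte_tokens = list(text.encode("utf-8"))  # byte-level: raw UTF-8 bytes

print(len(char_tokens))  # 21
print(len(word_tokens))  # 2
print(len(byte_tokens))  # 21 (pure ASCII: one byte per character)
```

Note how character-level and byte-level counts agree here only because the text is ASCII; an emoji would be 1 character but 4 UTF-8 bytes.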
Quick explainer of subword algorithms
- BPE (Byte-Pair Encoding): start with chars, iteratively merge most frequent pairs into new tokens. Good balance of vocab size vs coverage.
- WordPiece: similar to BPE but optimized differently (used in some BERT models).
- Unigram: probabilistic, chooses token set that maximizes likelihood under a unigram model.
- Byte-level BPE: tokenization over raw bytes so it can represent any unicode without special handling (used by some GPT models).
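The core of BPE training is simple enough to sketch. Here is a toy version of a single merge step (find the most frequent adjacent pair and fuse it); real implementations also track a vocabulary and merge rules for later encoding:

```python
from collections import Counter

def bpe_merge_step(tokens):
    """One BPE training step: merge the most frequent adjacent token pair."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return tokens
    (a, b) = max(pairs, key=pairs.get)  # most frequent pair (first seen wins ties)
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
            merged.append(a + b)  # fuse the pair into one new token
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

tokens = list("low lower lowest")  # start at character level
for _ in range(2):
    tokens = bpe_merge_step(tokens)
print(tokens)  # 'l'+'o' then 'lo'+'w' have merged into a reusable 'low' token
```

After two merges, the shared stem "low" has become a single token, which is exactly why BPE compresses common word parts so well.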
Real-world analogies (because metaphors stick)
- Tokens are LEGO bricks. Words can be big bricks or tiny bricks. Subword tokenizers give you flexible brick sizes so you can build rare or complex words without an infinite toy box.
- Tokenization is like cutting a loaf of bread. Too thick: you can’t butter evenly. Too thin: you’re chewing forever.
What tokenization looks like (examples)
Input: I'm learning to code 🤖 — and I love it!
Possible tokens (subword/BPE-style): ['I', "'m", ' learning', ' to', ' code', ' 🤖', ' —', ' and', ' I', ' love', ' it', '!']
Token count: ~12 (varies by tokenizer; the emoji and em dash may each split into several byte-level tokens)
A concrete version using the tiktoken library (`pip install tiktoken`); the exact splits shown are for the cl100k_base encoding and will differ for other models:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
ids = enc.encode("I'm learning to code 🤖 — and I love it!")

print(len(ids))                        # token count for this encoding
print([enc.decode([i]) for i in ids])  # per-token strings; the emoji and em dash
                                       # may decode as partial-byte fragments
```
Notice how punctuation, emoji, and contractions can be split into separate tokens. That affects generation: the model learns patterns over specific token sequences, so an unexpected split (say, a contraction breaking differently than in training data) can change fluency.
Tokenization and prompt engineering — the tricks you actually need
- Token budget matters: model limits are in tokens, not characters. A dense Unicode string can cost far more tokens than its character count suggests.
- Watch out for surprising splits: long compound words or rare proper nouns might become many tokens. That eats your budget and can harm performance.
- Special tokens: some models reserve special tokens (e.g., `<|endoftext|>` or chat-role markers) for system signals. Know them — they might be counted against your budget or reserved.
- Whitespace is meaningful: many tokenizers treat leading spaces differently, which can change completions. For example, 'hello' vs ' hello' might tokenize differently.
- Language and script effects: tokenizers tuned on English can perform worse on languages with different morphology (e.g., agglutinative languages) unless byte-level or multilingual tokenizers are used.
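A quick budget check can catch oversized prompts before they hit the API. This sketch uses the common (English-only, approximate) ~4-characters-per-token heuristic; the limit and reserve numbers are placeholder assumptions — always verify with the real tokenizer:

```python
# Rough token-budget estimate using the ~4 chars/token heuristic for English.
# This is a sanity check, not a substitute for running the actual tokenizer.

def estimate_tokens(text, chars_per_token=4):
    return max(1, round(len(text) / chars_per_token))

def fits_budget(prompt, max_tokens=4096, reserve_for_output=512):
    """Leave headroom for the model's response, not just the prompt."""
    return estimate_tokens(prompt) <= max_tokens - reserve_for_output

prompt = "Summarize the following report in three bullet points."
print(estimate_tokens(prompt), fits_budget(prompt))
```

The heuristic undercounts badly for emoji-heavy, non-Latin, or code-heavy text, which is exactly when you should reach for the real tokenizer instead.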
Mini case study: Why a single character can explode token count
Consider code snippets or hex dumps. A JSON blob with lots of short keys can tokenize into many small subwords or bytes. That translates into higher cost and latency. When building prompts that include long data, think about compression or summarization before sending.
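Two cheap wins for JSON payloads are dropping fields the model doesn't need and using compact separators. A minimal sketch (the record and field names here are made up for illustration):

```python
import json

# Hypothetical record with fields the model doesn't need to see
record = {"user_id": 12345, "name": "Ada", "active": True,
          "debug_trace": "stack frames...", "internal_flags": [0] * 20}

pretty = json.dumps(record, indent=2)  # what often gets pasted into prompts

# Keep only the fields the task needs, and drop the whitespace
trimmed = {k: v for k, v in record.items() if k in {"user_id", "name", "active"}}
compact = json.dumps(trimmed, separators=(",", ":"))

print(len(pretty), len(compact))  # fewer characters generally means fewer tokens
```

Character count is only a proxy for token count, but for JSON the two usually shrink together.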
Diagnostic moves (How to inspect tokenization)
- Always run your tokenizer on representative prompts and count tokens before sending to the model.
- Use tokenizer.debug/encode methods in SDKs to see how text maps to tokens.
- Try alternate phrasings to reduce token count: for example, 'cannot' often tokenizes to fewer tokens than 'can not'. Small merges add up over long prompts.
Quick checklist:
- Did I include unexpected whitespace or hidden characters? (copy-paste gremlins)
- Are there many rare names or emojis? They cost tokens.
- Do I need byte-level safety for non-Latin scripts?
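The first checklist item — copy-paste gremlins — is easy to automate. A small diagnostic, assuming you just want to flag the usual invisible suspects (non-breaking spaces, zero-width characters, BOMs):

```python
import unicodedata

# Common invisible characters that sneak in via copy-paste
SUSPECTS = {"\u00a0", "\u200b", "\u200e", "\ufeff"}  # nbsp, ZWSP, LRM, BOM

def find_hidden_chars(text):
    """Return (index, codepoint, name) for invisible/format characters."""
    return [(i, hex(ord(c)), unicodedata.name(c, "UNKNOWN"))
            for i, c in enumerate(text)
            if c in SUSPECTS or unicodedata.category(c) == "Cf"]

print(find_hidden_chars("hello\u200bworld"))  # zero-width space at index 5
```

Run this over any prompt that tokenizes to more tokens than you expect — hidden characters are a frequent culprit.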
Expert take: "Tokenization is not just an implementation detail. It's a design decision that shapes model behavior, costs, and fairness across languages."
Closing — TL;DR and Actionable Takeaways
- Tokens are the atoms of language models; tokenization is the chemistry that makes atoms usable.
- Prefer subword/byte-level tokenizers for modern models: they balance vocab size and coverage.
- Always inspect tokenization for your prompts — it can save you money and improve results.
- Be mindful of special tokens, whitespace sensitivity, and multilingual quirks.
Parting challenge: take your favorite prompt and run it through the tokenizer. How many tokens does it produce? Where are the splits? Tweak the text to halve the token count. That tiny exercise will instantly make you a sharper prompt engineer.
Version note: This builds on the transformer internals you saw earlier (attention needs sequence indices, and tokens are the sequence). Next up: how token embeddings convert tokens into vectors the transformer can actually reason about.