
Generative AI: Prompt Engineering Basics
Safety, Ethics, and Risk Mitigation


Build safe prompts that reduce harm, protect privacy, handle sensitive content, and maintain accountability.


Hallucination Containment — Stop the LLM from Making Stuff Up (A Tactical Guide)

"Hallucinations are like your chat buddy who confidently tells you they once had tea with Beethoven." — Expert TA, approximately losing it


Hook: Why we care (and why your app will get sued otherwise)

You already handled copyright landmines and scrubbed PII (nice work). You also set up evaluation and monitoring so you can measure when the model goes off the rails. Now meet the specific monster that eats all those safeguards for breakfast: hallucinations — when a model confidently outputs false, unverifiable, or fabricated information.

Hallucinations matter because they break trust, propagate misinformation, and can cause real-world harm (wrong medical advice, fake citations, false attributions). Containment is the logical next step after monitoring: once you can measure output quality, you must actively reduce and manage factual errors.


What is a hallucination, really? Quick taxonomy

  • Factual hallucination: The model states a verifiable fact that is false (e.g., wrong dates, invented studies).
  • Attribution hallucination: The model invents a source, quote, or author.
  • Contextual hallucination: The model applies true facts to the wrong context (e.g., misattributing a drug dosage to a different drug).
  • Speculative output: Creative output presented as fact.

Why taxonomy matters: different hallucinations require different fixes.


Core strategies — the containment toolkit

Below are practical techniques, from lowest-effort prompt tweaks to full system redesigns.

1) Grounding with Retrieval (RAG)

  • Use a retrieval step to fetch relevant documents and provide them as context to the generator.
  • Instruct the model to answer only from those documents and to cite them.

Why it helps: instead of guessing, the model is forced to reference source text.

Example prompt pattern (conceptual):

System: You are a factual assistant. Only use the documents provided. If the document does not support a claim, say you cannot verify.
User: Based on the documents below, answer... [documents inserted]
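The pattern above can be sketched as a small prompt-assembly helper. This is a minimal sketch, not a specific library's API: `build_grounded_prompt` is a hypothetical name, and the message-dict shape assumes a chat-style model client.

```python
# Sketch: assemble a grounded prompt from retrieved documents.
# build_grounded_prompt is a hypothetical helper; the messages list
# assumes a chat-style API with "system" and "user" roles.
def build_grounded_prompt(question: str, documents: list[str]) -> list[dict]:
    # Label each document so citations can reference it by number.
    context = "\n\n".join(
        f"[Doc {i + 1}] {doc}" for i, doc in enumerate(documents)
    )
    return [
        {"role": "system", "content": (
            "You are a factual assistant. Only use the documents provided. "
            "If the documents do not support a claim, say you cannot verify it."
        )},
        {"role": "user", "content": f"Documents:\n{context}\n\nQuestion: {question}"},
    ]

msgs = build_grounded_prompt(
    "When was the drug approved?",
    ["The FDA approved it in 2019."],
)
```

The numbered `[Doc N]` labels give the model a stable handle for citations, which the verification step later can match against.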

2) Ask for citations and evidence (and verify them)

Have the model attach explicit citations and a short supporting quote for each claim. Then run an automated citation verifier that checks that each quoted passage actually appears in the cited source.
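The quote-containment check can be automated with simple string matching. A minimal sketch, assuming claims arrive as dicts with `text`, `source_id`, and `quote` fields (the field names are an assumption, not a standard):

```python
import re

def quote_appears(quote: str, source_text: str) -> bool:
    """Check that a model-provided quote actually appears in the cited
    source, ignoring case and whitespace differences."""
    norm = lambda s: re.sub(r"\s+", " ", s).strip().lower()
    return norm(quote) in norm(source_text)

def verify_citations(claims: list[dict], sources: dict[str, str]) -> list[dict]:
    """Each claim is {'text': ..., 'source_id': ..., 'quote': ...};
    sources maps source_id to its full text. Returns each claim with a
    'supported' flag attached."""
    results = []
    for c in claims:
        src = sources.get(c["source_id"], "")  # missing source => unsupported
        results.append({**c, "supported": quote_appears(c["quote"], src)})
    return results
```

Exact-substring matching is deliberately strict: it misses paraphrased support, but a fabricated quote will never pass it, which is the failure mode this check targets.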

3) Verifier / Critic Models (two-step)

Generate an answer, then pass its claims to a smaller verifier model that checks each claim against the sources and flags unsupported ones. This is cheaper than human review and often effective.
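A minimal sketch of the two-step loop. The sentence-based claim splitter is deliberately naive (real systems typically use a model for claim extraction), and `verify_claim` is a hypothetical callable standing in for the verifier model:

```python
# Two-step generate-then-verify loop (sketch). verify_claim is a
# hypothetical stand-in for a smaller verifier model.
def extract_claims(answer: str) -> list[str]:
    # Naive splitter: one claim per sentence.
    return [s.strip() for s in answer.split(".") if s.strip()]

def check_answer(answer: str, docs: str, verify_claim) -> dict:
    claims = extract_claims(answer)
    unsupported = [c for c in claims if not verify_claim(c, docs)]
    return {"claims": claims, "unsupported": unsupported, "ok": not unsupported}
```

In practice the `verify_claim` call is a prompt to a cheap model asking "is this claim supported by these documents, yes or no", but any checker with the same signature slots in.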

4) Conservative response policies

Design prompts that penalize confident guessing: instruct the model to answer "I don't know" or ask for clarification when evidence is lacking.

5) Structured output and schema enforcement

Require JSON outputs with fields like claim[], source[], confidence_score. Schema validation makes hallucinations easier to detect automatically.
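A minimal validation sketch for that schema. The field names (`claims`, `sources`, `confidence_score`) mirror the ones above but the exact shape is an assumption; adapt it to your contract:

```python
import json

# Required fields and their expected types (assumed schema, not a standard).
REQUIRED = {"claims": list, "sources": list, "confidence_score": (int, float)}

def validate_output(raw: str):
    """Return (parsed_dict, None) on success or (None, error_message)
    when the model's output violates the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None, "not valid JSON"
    for field, typ in REQUIRED.items():
        if field not in data or not isinstance(data[field], typ):
            return None, f"bad or missing field: {field}"
    if not 0.0 <= data["confidence_score"] <= 1.0:
        return None, "confidence_score out of range"
    return data, None
```

A failed validation is itself a useful signal: outputs that break the schema can be retried or routed to the conservative-answer path rather than shipped.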

6) Human-in-the-loop and escalation

For high-risk domains, route outputs with low verifier confidence to human reviewers before release.

7) Model selection and prompting choices

  • Avoid creative temperature for factual tasks.
  • Prefer models tuned for instruction-following and factuality.
  • Use chain-of-thought sparingly: it helps reasoning but sometimes increases hallucination; instead, use concise reasoning followed by verifier checks.

Quick comparative table: mitigation strategies vs tradeoffs

Strategy | Strengths | Weaknesses
Retrieval-augmented generation | Strong grounding; provides sources | Requires a good index; stale or low-quality docs cause problems
Verifier model | Catches unsupported claims quickly | Extra compute; the verifier itself can be wrong
Conservative prompts ("I don't know") | Reduces confidently wrong answers | Can increase the non-answer rate; poor UX
Human review | Best safety for critical outputs | Slow and expensive

How to measure success (metrics & evaluation)

You already know how to measure quality. Here are hallucination-specific metrics to add to your dashboard:

  • Hallucination rate: proportion of outputs containing at least one unsupported factual claim (human-evaluated or using a trusted ground truth).
  • Claim precision: fraction of claimed facts that are supported by sources.
  • Verifier precision/recall: how often the automated verifier catches hallucinations without false alarms.
  • Escalation rate: percent of outputs routed to humans.
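Three of the four metrics above can be computed directly from a batch of evaluated samples (verifier precision/recall needs a separate labeled ground-truth set). A minimal sketch, assuming each sample records total claims, supported claims, and whether it was escalated:

```python
def hallucination_metrics(evaluated: list[dict]) -> dict:
    """evaluated: one dict per sampled output, e.g.
    {'claims_total': 4, 'claims_supported': 3, 'escalated': False}.
    Field names are assumptions for this sketch."""
    n = len(evaluated)
    # An output "hallucinates" if at least one claim is unsupported.
    with_unsupported = sum(
        1 for e in evaluated if e["claims_supported"] < e["claims_total"]
    )
    total_claims = sum(e["claims_total"] for e in evaluated)
    supported = sum(e["claims_supported"] for e in evaluated)
    return {
        "hallucination_rate": with_unsupported / n,
        "claim_precision": supported / total_claims,
        "escalation_rate": sum(e["escalated"] for e in evaluated) / n,
    }
```

Feeding these numbers into your existing dashboard turns hallucination containment from a one-time fix into a monitored property.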

Evaluation approach:

  1. Sample outputs periodically and run both automated checks and targeted human evaluation (rubric: verifiability, source accuracy, severity).
  2. Track trends, not single instances. Use alerting thresholds for sudden drift increases.

Pipeline pseudocode (practical blueprint)

query = user_query
docs = retrieve(index, query)
candidate = generate(query, docs)
claims = extract_claims(candidate)
verification = verify(claims, docs)
if verification.low_confidence or verification.failed:
    if is_high_risk(query):
        escalate_to_human(candidate)
    else:
        respond_with_conservative_answer()
else:
    return attach_citations(candidate, verification.sources)

This gives you a traceable workflow and clear intervention points.


Prompt patterns and micro-habits that help

  • Start system messages with: Only use the provided sources to answer factual questions.
  • Require a final line: Sources consulted: [list].
  • When evidence is lacking, instruct the model to state up front that it cannot verify the claim and to suggest next steps.

Example micro-prompt:

"Answer using only the documents below. For any factual claim, include a citation and a quoted line. If a claim is unsupported, say 'Unable to verify.'"


Risk management & ethics notes

  • Hallucinations that invent legal citations or medical advice can have outsized harm. Treat those outputs conservatively.
  • Hallucinations may also create fake authorship, leading to copyright/attribution problems—this ties directly into your previous work on licensing.
  • Privacy angle: a hallucination that invents a person's PII is dangerous; keep the model conservative around named individuals.

Final checklist (deploy-ready)

  • Retrieval pipeline connected and tested for staleness
  • Verifier model in production with tracked metrics
  • Schema for responses with citations and confidence
  • Human escalation for high-risk queries
  • Monitoring and alerts for hallucination rate drift
  • UX policy for "I don't know" answers

Closing riff (the part that sticks)

You measured quality. Now stop the model from making things up. Grounding, verification, conservative policies, and human escalation are the four pillars. Nothing removes hallucinations entirely, but containment makes your system reliable, auditable, and ethically sane. Treat hallucination containment like plumbing: boring until it bursts, then everything floods. Do the boring work now.

"Contain the hallucinations; don't invite them to dinner." — TA, finishing the checklist and powering down dramatic sarcasm
