Tools, Functions, and Agentic Workflows
Integrate function calling and tools, design planner–executor patterns, and manage errors, scopes, and observability.
Grounding via External Tools — The Reality-Check Toolkit
"If an LLM is a genius with a faulty memory, external tools are the receipts you actually want to trust."
You already know about planner-executor architectures and tool selection prompts (nice recall — you paid attention in Positions 4 and 3). Now we level up: how to ground model outputs using external tools so answers stop inventing things and start citing real stuff, crunching real numbers, and behaving like responsible adults.
What is grounding (brief, dramatic, and useful)
Grounding via external tools means: when the model needs facts, numbers, or senses about the world, it doesn't just guess — it calls an external system (search, database, calculator, API, code executor, sensor) and uses that result as evidence.
Why this matters: Without grounding you get hallucinations; with grounding you get verifiable outputs, provenance, and a better chance of compliance with safety and privacy constraints we covered earlier.
The toolbox — types of grounding tools (and when to prefer them)
| Tool Type | Best for | Strengths | Weaknesses |
|---|---|---|---|
| Retrieval / Search (RAG) | Factual, document-backed answers | High fidelity to source docs, citeable | Requires good retrieval + snippet selection |
| Knowledge Bases / DBs | Structured facts, inventories | Deterministic, queryable | Schema dependency, may be stale |
| Calculators / Code Execution | Math, simulations, transformations | Precise numeric outputs | Security concerns if arbitrary code |
| APIs (e.g., weather, finance) | Real-time facts, authenticated data | Live, authoritative | Rate limits, privacy of queries |
| Browsers / Scrapers | Latest news or web-only content | Up-to-date | Brittle to site changes |
| Verifiers / Consistency Checkers | Cross-checking outputs | Reduces hallucinations | Needs good heuristics |
| Sensors / Agents | Physical state, IoT | Ground truth from environment | Hardware latency, trust |
Patterns for grounding (practical architectures)
Retrieval-Augmented Generation (RAG)
- Retrieve top-k documents, pass as context, generate answer + citations.
- Good for explainable factual Q&A.
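The retrieve-then-generate shape can be sketched in a few lines of Python. This is a toy keyword-overlap retriever standing in for a real vector store or search index; the `Document` type and corpus are illustrative, not a real API.

```python
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    text: str

def retrieve_top_k(query: str, corpus: list, k: int = 3) -> list:
    """Rank documents by query-term overlap (a stand-in for vector search)."""
    terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda d: len(terms & set(d.text.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_grounded_prompt(query: str, docs: list) -> str:
    """Pack retrieved snippets into the context with citeable IDs."""
    context = "\n".join(f"[{d.doc_id}] {d.text}" for d in docs)
    return (
        "Answer using ONLY the sources below and cite their IDs.\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )

corpus = [
    Document("doc1", "Data portability lets users move data between services."),
    Document("doc2", "Rate limits protect APIs from unbounded queries."),
]
docs = retrieve_top_k("what is data portability", corpus, k=1)
prompt = build_grounded_prompt("what is data portability", docs)
```

The key design choice: the model never sees the whole corpus, only the top-k snippets, each tagged with an ID it can cite back.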
Function/Tool Calling with Structured Outputs
- Model returns a tool-call intent (which tool + args). System executes tool and returns structured result. Model composes final response.
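The intent-then-dispatch loop looks roughly like this. The registry, tool name, and `eval_expr` helper are hypothetical; the point is that the model only *proposes* a call, and your system parses, dispatches, and returns a structured result.

```python
import json

def eval_expr(expr: str) -> float:
    """Toy whitelisted evaluator; use a real expression parser in production."""
    if not set(expr) <= set("0123456789+-*/(). "):
        raise ValueError("disallowed characters in expression")
    return eval(expr)

# Hypothetical tool registry mapping tool names to executors.
TOOLS = {
    "calculator": lambda args: {"value": eval_expr(args["expression"])},
}

def handle_tool_call(model_output: str) -> dict:
    """Parse the model's tool-call intent, dispatch it, return a structured result."""
    intent = json.loads(model_output)  # expected shape: {"tool": ..., "args": {...}}
    result = TOOLS[intent["tool"]](intent["args"])
    return {"tool": intent["tool"], "result": result}
```

The structured result then goes back into the model's context so it can compose the final response from evidence, not memory.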
Planner–Executor (builds on Position 4)
- Planner decides which tools to call and in what order. Executor runs them, returns results. Planner integrates and re-plans if verification fails.
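A stripped-down planner–executor loop with re-planning on verification failure might look like this. The tool names (`primary_search`, `secondary_search`) are placeholders, and `executor` is a stub; a real system would call actual services.

```python
def planner(query: str, failed: set) -> list:
    """Pick the minimal toolset; add a redundant source only after a failure."""
    plan = ["primary_search"]
    if "primary_search" in failed:
        plan.append("secondary_search")
    return plan

def executor(tool: str, query: str) -> dict:
    """Stand-in for real tool execution."""
    return {"tool": tool, "answer": f"result for {query!r} via {tool}"}

def run(query: str, verify, max_rounds: int = 3):
    """Plan, execute, verify; re-plan on failure, give up after max_rounds."""
    failed = set()
    for _ in range(max_rounds):
        for tool in planner(query, failed):
            result = executor(tool, query)
            if verify(result):
                return result
            failed.add(tool)
    return None  # caller escalates to human review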
Verification Loop
- After generating, call verifier tools or re-query independent sources. If results disagree or confidence is low, escalate (more tools, human-in-the-loop).
A sample grounded workflow (step-by-step)
Imagine: user asks for the latest regulation clause about data portability in EU law.
- Planner: parse user intent and identify need for primary sources.
- Tool selection prompt: choose 'legal search API' + 'document retrieval' + 'citation generator'.
- Executor: call legal search API, fetch statute text, extract relevant clause with offsets.
- Verifier: run a second search across a different provider or use a validator to ensure no misquote.
- Composer: produce answer with exact quote, link, and short interpretation. Add limitation notice if uncertain.
Code-style prompt template (planner -> executor handoff):
Planner output format:
```json
{
  "tool": "legal_search",
  "query": "EU data portability clause text",
  "max_results": 3,
  "confidence_threshold": 0.7
}
```
Executor returns structured JSON with source URLs, text ranges, and a hash for provenance.
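A hash-for-provenance record can be sketched with the standard library; the field names here are illustrative, not a fixed schema.

```python
import hashlib
import time

def provenance_record(url: str, excerpt: str, start: int, end: int) -> dict:
    """Structured executor result: source, text range, and a content hash
    so the quoted excerpt can be re-verified later."""
    return {
        "source_url": url,
        "excerpt_range": [start, end],
        "excerpt": excerpt,
        "retrieved_at": time.time(),
        "content_sha256": hashlib.sha256(excerpt.encode("utf-8")).hexdigest(),
    }

def verify_excerpt(record: dict) -> bool:
    """Re-hash the stored excerpt and compare against the recorded hash."""
    digest = hashlib.sha256(record["excerpt"].encode("utf-8")).hexdigest()
    return digest == record["content_sha256"]
```

The hash means any later tampering with the quoted text is detectable in an audit, without re-fetching the source.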
Prompt templates & pragmatic hints
- Tool selection prompt (compact):
You are the planner. Given the user query and safety constraints, choose the minimal set of tools to ground the response. Return JSON: {tools: [{name, reason, args}], safety_checks: [..]}
- Composer instruction: always include provenance metadata: source_title, url, timestamp, excerpt_range, trust_score.
- For function calling, use strict schemas (JSON Schema) so downstream code knows what to expect and parsers stay robust.
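A stdlib-only sketch of strict checking for the planner payload above (a production system would use a proper JSON Schema validator or typed models; the schema dict here is an assumption matching the earlier example):

```python
# Expected shape of a planner tool call: exact keys, exact types.
PLANNER_SCHEMA = {
    "tool": str,
    "query": str,
    "max_results": int,
    "confidence_threshold": float,
}

def validate_tool_call(payload: dict) -> dict:
    """Reject extra keys, missing keys, and wrong types before dispatch."""
    extra = set(payload) - set(PLANNER_SCHEMA)
    missing = set(PLANNER_SCHEMA) - set(payload)
    if extra or missing:
        raise ValueError(f"extra keys: {extra or None}, missing keys: {missing or None}")
    for key, expected in PLANNER_SCHEMA.items():
        if not isinstance(payload[key], expected):
            raise TypeError(f"{key} must be {expected.__name__}")
    return payload
```

Strictness pays off downstream: the executor never has to guess what a half-formed tool call meant.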
Mitigating hallucination and errors (verification strategies)
- Consensus: query two independent sources and require agreement.
- Redundancy: use both a KB and a live API for critical facts.
- Sanity checks: numeric checks (does 2+2=4?) and domain rules (dates must be plausible).
- External verifiers: grammar or legal validators that assert compliance to standards.
- Human escalation: when confidence < threshold or when PII/safety issues arise (remember the safety module).
Quick pseudocode for verification loop:
```python
def verify_and_compose(tool_call, threshold=0.7):
    result = executor.run(tool_call)
    independent = executor.run(tool_call.with_alternate_source())
    if verifier.agree(result, independent) < threshold:
        planner.add_tool("human_review")
        return None
    return composer.format_with_provenance(result)
```
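The consensus and sanity-check strategies from the list above can be made concrete in a few lines; the tolerance and date bounds here are illustrative choices, not fixed rules.

```python
def consensus(a, b, tolerance: float = 0.01) -> bool:
    """Agree if answers match exactly, or numerically within tolerance."""
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return abs(a - b) <= tolerance
    return str(a).strip().lower() == str(b).strip().lower()

def plausible_year(year: int) -> bool:
    """Domain sanity check: reject obviously impossible years."""
    return 1000 <= year <= 2100
```

Cheap checks like these catch a surprising share of hallucinations before anything reaches a human reviewer.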
Privacy, safety, and auditability (builds on Safety, Ethics, and Risk Mitigation)
- Data minimization: send only necessary fields to external APIs.
- PII redaction: detect and redact before querying external tools.
- Consent & policy: ensure user consent when querying 3rd-party services with personal data.
- Logging & provenance: log tool calls with timestamps, request hashes, and identities for audits.
- Rate limits & throttling: avoid unbounded queries that leak behavior or hit quotas.
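A toy redaction pass, run before anything leaves for an external tool. The two regex patterns are illustrative, not an exhaustive PII detector; real systems use dedicated detection services.

```python
import re

# Illustrative patterns only: email addresses and US-style phone numbers.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace detected PII with typed placeholders before tool calls."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Typed placeholders (rather than blanks) keep the query intelligible to the external tool while minimizing what it learns.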
Checklist (quick):
- Minimal data sent to tools
- PII detection active
- Tool provenance recorded
- Human fallback for low-confidence outputs
Common mistakes (and how to avoid them)
- Mistake: calling too many tools by default. Fix: planner should prefer the minimal sufficient toolset.
- Mistake: trusting a single retrieval snippet. Fix: cite full source and cross-check.
- Mistake: ignoring latency. Fix: parallelize non-dependent calls and use caching for repeated queries.
- Mistake: forgetting to sanitize tool outputs. Fix: always parse and validate tool returns before composing.
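The latency fixes above (parallelize independent calls, cache repeated queries) can be sketched with stdlib tools; `fetch` here is a hypothetical stand-in for a slow network-bound tool call.

```python
import concurrent.futures
from functools import lru_cache

@lru_cache(maxsize=256)
def fetch(query: str) -> str:
    """Cached stand-in for a tool call; imagine a network round-trip here."""
    return f"result for {query}"

def fetch_all(queries: list) -> list:
    """Run non-dependent calls concurrently instead of one after another."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(fetch, queries))
```

Only parallelize calls with no data dependency between them; calls whose arguments come from another call's output must stay sequenced.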
Why do people keep misunderstanding this? Because grounding looks like plumbing, and we humans prefer elegant words over boring pipes. But the plumbing is where reliability lives.
Tiny reference table: grounding strength vs latency (handy quick-guide)
| Grounding Strength | Typical Latency | Use When |
|---|---|---|
| Very high (authoritative API, legal DB) | Medium | Compliance-critical answers |
| High (multiple independent retrievals) | Medium-High | Factual Q&A, explainers |
| Medium (single retrieval) | Low | Low-risk info, drafts |
| Low (model-only) | Instant | Creative writing, brainstorming |
Closing — TL;DR and actionable next steps
- Grounding = use tools as reality-checks; it's not optional for trustworthy systems.
- Planner decides; executor runs; verifier tests; composer produces with provenance.
- Always prioritize minimal, auditable tool calls and protect privacy.
Action items for your next lab:
- Implement a planner that returns JSON tool calls (use a strict schema).
- Add a verifier that cross-checks with a second source.
- Log provenance and implement a human escalation path for low-confidence outputs.
Final thought: models are poets; tools are witnesses. If you want truth, make the poet swear an oath in front of the witnesses.