
Generative AI: Prompt Engineering Basics
Tools, Functions, and Agentic Workflows


Integrate function calling and tools, design planner–executor patterns, and manage errors, scopes, and observability.



Grounding via External Tools — The Reality-Check Toolkit

"If an LLM is a genius with a faulty memory, external tools are the receipts you actually want to trust."

You already know about planner-executor architectures and tool selection prompts (nice recall — you paid attention in Positions 4 and 3). Now we level up: how to ground model outputs using external tools so answers stop inventing things and start citing real stuff, crunching real numbers, and behaving like responsible adults.


What is grounding (brief, dramatic, and useful)

Grounding via external tools means: when the model needs facts, numbers, or senses about the world, it doesn't just guess — it calls an external system (search, database, calculator, API, code executor, sensor) and uses that result as evidence.

Why this matters: Without grounding you get hallucinations; with grounding you get verifiable outputs, provenance, and a better chance of compliance with safety and privacy constraints we covered earlier.


The toolbox — types of grounding tools (and when to prefer them)

Tool Type | Best for | Strengths | Weaknesses
--- | --- | --- | ---
Retrieval / Search (RAG) | Factual, document-backed answers | High fidelity to source docs, citeable | Requires good retrieval + snippet selection
Knowledge Bases / DBs | Structured facts, inventories | Deterministic, queryable | Schema dependency, may be stale
Calculators / Code Execution | Math, simulations, transformations | Precise numeric outputs | Security concerns if arbitrary code
APIs (e.g., weather, finance) | Real-time facts, authenticated data | Live, authoritative | Rate limits, privacy of queries
Browsers / Scrapers | Latest news or web-only content | Up-to-date | Fragile, brittle to site changes
Verifiers / Consistency Checkers | Cross-checking outputs | Reduce hallucinations | Needs good heuristics
Sensors / Agents | Physical state, IoT | Ground truth from environment | Hardware latency, trust

Patterns for grounding (practical architectures)

  1. Retrieval-Augmented Generation (RAG)

    • Retrieve top-k documents, pass as context, generate answer + citations.
    • Good for explainable factual Q&A.
  2. Function/Tool Calling with Structured Outputs

    • Model returns a tool-call intent (which tool + args). System executes tool and returns structured result. Model composes final response.
  3. Planner–Executor (builds on Position 4)

    • Planner decides which tools to call and in what order. Executor runs them, returns results. Planner integrates and re-plans if verification fails.
  4. Verification Loop

    • After generating, call verifier tools or re-query independent sources. If the results mismatch or confidence is low, escalate (more tools, human-in-the-loop).
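Pattern 2 (function/tool calling with structured outputs) can be sketched as a small dispatch loop. This is a minimal illustration, not a real framework: the `calculator` tool and the `{"tool": ..., "args": ...}` intent format are assumptions standing in for whatever tools and schema your system defines.

```python
import json
import operator

# Hypothetical tool registry; a real system would register search, DB, and API tools.
OPS = {"add": operator.add, "mul": operator.mul}

def calculator(args: dict) -> dict:
    return {"result": OPS[args["op"]](args["a"], args["b"])}

TOOLS = {"calculator": calculator}

def run_tool_call(model_output: str) -> dict:
    """Parse the model's tool-call intent, execute the tool, return a structured result."""
    call = json.loads(model_output)          # expected shape: {"tool": ..., "args": {...}}
    tool = TOOLS.get(call["tool"])
    if tool is None:
        return {"ok": False, "error": f"unknown tool: {call['tool']}"}
    try:
        return {"ok": True, "tool": call["tool"], "data": tool(call["args"])}
    except Exception as exc:                 # surface tool failures instead of crashing
        return {"ok": False, "tool": call["tool"], "error": str(exc)}
```

The structured `{"ok": ...}` envelope is what lets the model compose a final response without guessing whether the tool succeeded.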

A sample grounded workflow (step-by-step)

Imagine: user asks for the latest regulation clause about data portability in EU law.

  1. Planner: parse user intent and identify need for primary sources.
  2. Tool selection prompt: choose 'legal search API' + 'document retrieval' + 'citation generator'.
  3. Executor: call legal search API, fetch statute text, extract relevant clause with offsets.
  4. Verifier: run a second search across a different provider or use a validator to ensure no misquote.
  5. Composer: produce answer with exact quote, link, and short interpretation. Add limitation notice if uncertain.
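The five steps above can be wired together end-to-end. In this sketch, `legal_search` and `second_provider_search` are hypothetical stubs standing in for real search APIs, and the "verifier" is a simple excerpt comparison; a production verifier would be far more forgiving of formatting differences.

```python
def legal_search(query: str) -> dict:
    # Stub for step 3: a real executor would call a legal search API here.
    return {"url": "https://example.org/statute#clause-20",
            "excerpt": "clause text about data portability"}

def second_provider_search(query: str) -> dict:
    # Stub for step 4: an independent provider used only for cross-checking.
    return {"excerpt": "clause text about data portability"}

def grounded_answer(query: str) -> dict:
    primary = legal_search(query)               # step 3: fetch the primary source
    check = second_provider_search(query)       # step 4: verify against a second source
    verified = primary["excerpt"] == check["excerpt"]
    answer = f'"{primary["excerpt"]}" (source: {primary["url"]})'
    if not verified:                            # step 5: limitation notice if uncertain
        answer += " [caution: independent sources disagree]"
    return {"answer": answer, "verified": verified}
```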

Code-style prompt template (planner -> executor handoff):

Planner output format:
{
  "tool": "legal_search",
  "query": "EU data portability clause text",
  "max_results": 3,
  "confidence_threshold": 0.7
}

Executor returns structured JSON with source URLs, text ranges, and a hash for provenance.
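The provenance hash mentioned above can be as simple as a SHA-256 digest of the quoted excerpt, so anyone auditing the answer later can detect if the quote drifted from the source. The field names below are illustrative, not a fixed schema.

```python
import hashlib

def provenance_record(url: str, source_text: str, start: int, end: int) -> dict:
    """Build a provenance entry for an excerpt, as the executor might return it."""
    excerpt = source_text[start:end]
    digest = hashlib.sha256(excerpt.encode("utf-8")).hexdigest()
    return {
        "source_url": url,
        "excerpt_range": [start, end],   # character offsets into the source text
        "sha256": digest,                # lets auditors detect a misquote later
    }
```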


Prompt templates & pragmatic hints

  • Tool selection prompt (compact):
You are the planner. Given the user query and safety constraints, choose the minimal set of tools to ground the response. Return JSON: {tools: [{name, reason, args}], safety_checks: [..]}
  • Composer instruction: always include provenance metadata: source_title, url, timestamp, excerpt_range, trust_score.

  • For function calling, use strict schemas (JSON Schemas) so downstream code knows what to expect and parsers stay robust.
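As a sketch of that last point, here is a minimal hand-rolled check of a planner tool call against the field spec from the earlier template. In practice you would use a full JSON Schema validator; the `TOOL_CALL_SPEC` names here simply mirror the example planner output above.

```python
# Expected fields and types for the planner's tool-call JSON (illustrative spec).
TOOL_CALL_SPEC = {
    "tool": str,
    "query": str,
    "max_results": int,
    "confidence_threshold": float,
}

def validate_tool_call(call: dict) -> list[str]:
    """Return a list of problems; an empty list means the call matches the spec."""
    errors = [f"missing field: {k}" for k in TOOL_CALL_SPEC if k not in call]
    errors += [
        f"bad type for {k}: expected {t.__name__}"
        for k, t in TOOL_CALL_SPEC.items()
        if k in call and not isinstance(call[k], t)
    ]
    return errors
```

Rejecting a malformed call before execution is much cheaper than debugging a tool that ran with garbage arguments.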


Mitigating hallucination and errors (verification strategies)

  • Consensus: query two independent sources and require agreement.
  • Redundancy: use both a KB and a live API for critical facts.
  • Sanity checks: numeric checks (does 2+2=4?) and domain rules (dates must be plausible).
  • External verifiers: grammar or legal validators that assert compliance with standards.
  • Human escalation: when confidence < threshold or when PII/safety issues arise (remember the safety module).

Quick pseudocode for verification loop:

result = executor.run(tool_call)
agreement = verifier.agree(result, independent_call)
if agreement < threshold:
    planner.add_tool('human_review')  # escalate the low-confidence result
else:
    return composer.format_with_provenance(result)

Privacy, safety, and auditability (builds on Safety, Ethics, and Risk Mitigation)

  • Data minimization: send only necessary fields to external APIs.
  • PII redaction: detect and redact before querying external tools.
  • Consent & policy: ensure user consent when querying 3rd-party services with personal data.
  • Logging & provenance: log tool calls with timestamps, request hashes, and identities for audits.
  • Rate limits & throttling: avoid unbounded queries that leak behavior or hit quotas.
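The PII redaction bullet can be illustrated with a toy regex pass run before any text leaves for an external tool. The two patterns below are deliberately simplistic assumptions; production systems use dedicated PII detectors with far broader coverage.

```python
import re

# Illustrative patterns only: real detectors cover names, addresses, IDs, and more.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace detected PII with labeled placeholders before querying external tools."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```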

Checklist (quick):

  • Minimal data sent to tools
  • PII detection active
  • Tool provenance recorded
  • Human fallback for low-confidence outputs

Common mistakes (and how to avoid them)

  • Mistake: calling too many tools by default. Fix: planner should prefer the minimal sufficient toolset.
  • Mistake: trusting a single retrieval snippet. Fix: cite full source and cross-check.
  • Mistake: ignoring latency. Fix: parallelize non-dependent calls and use caching for repeated queries.
  • Mistake: forgetting to sanitize tool outputs. Fix: always parse and validate tool returns before composing.
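The latency fix above (parallelize non-dependent calls, cache repeats) can be sketched with the standard library. `fetch` here is a hypothetical stand-in for a slow network-bound tool call.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

@lru_cache(maxsize=256)              # cache: repeated identical queries are free
def fetch(query: str) -> str:
    # Stub for a slow, network-bound tool call.
    return f"result for {query}"

def fetch_all(queries: list[str]) -> list[str]:
    """Run non-dependent tool calls in parallel instead of one after another."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(fetch, queries))
```

Threads suit I/O-bound tool calls; for CPU-bound work you would reach for processes instead.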

Why do people keep misunderstanding this? Because grounding looks like plumbing, and we humans prefer elegant words over boring pipes. But the plumbing is where reliability lives.


Tiny reference table: grounding strength vs latency (handy quick-guide)

Grounding Strength | Typical Latency | Use When
--- | --- | ---
Very high (authoritative API, legal DB) | Medium | Compliance-critical answers
High (multiple independent retrievals) | Medium-High | Factual Q&A, explainers
Medium (single retrieval) | Low | Low-risk info, drafts
Low (model-only) | Instant | Creative writing, brainstorming

Closing — TL;DR and actionable next steps

  • Grounding = use tools as reality-checks; it's not optional for trustworthy systems.
  • Planner decides; executor runs; verifier tests; composer produces with provenance.
  • Always prioritize minimal, auditable tool calls and protect privacy.

Action items for your next lab:

  1. Implement a planner that returns JSON tool calls (use a strict schema).
  2. Add a verifier that cross-checks with a second source.
  3. Log provenance and implement a human escalation path for low-confidence outputs.

Final thought: models are poets; tools are witnesses. If you want truth, make the poet swear an oath in front of the witnesses.
