Safety, Ethics, and Risk Mitigation
Build safe prompts that reduce harm, protect privacy, handle sensitive content, and maintain accountability.
Privacy and PII Handling — The "Don't Leak the Secrets" Chapter
You already learned about steering models away from harmful content and checking for bias. Now let's stop them from becoming accidental gossip machines.
This lesson sits in the Safety, Ethics, and Risk Mitigation module right after Harmful Content Avoidance and Bias and Fairness Controls. It also builds on Evaluation, Metrics, and Quality Control — because you need ways to measure privacy risk as much as you need ways to reduce it.
Why privacy matters here (beyond the legal paperwork)
- PII (Personally Identifiable Information) leakage can cause real-world harm: identity theft, doxxing, reputational damage.
- Models trained on careless data or given sloppy prompts can regurgitate secrets, no matter how sternly your system prompt plays bouncer.
- Privacy risk is both a design constraint and an ongoing monitoring challenge: like a slow leak, it only gets worse if you ignore it.
Privacy isn't just compliance checkboxes. It's trust. If your product leaks a customer's data, you don't just get fines — you lose your reputation and users.
Core principles (the things you should tattoo on your team handbook)
- Data minimization: Only collect what you absolutely need.
- Purpose limitation: Use data only for the stated purpose and no sneaky backdoor features.
- Anonymize and redact: Remove or reduce direct identifiers before use.
- Differential privacy and synthetic data: Add noise or generate synthetic datasets when possible.
- Human-in-the-loop for risky outputs: Make humans gatekeeper for high-risk responses.
- Logging and monitoring: Track what gets asked, what the model returns, and who accessed data.
Prompt engineering dos and don'ts (practical, real-world examples)
Don't: feed raw user conversations into prompts
Bad:
User: Here is my insurance claim with SSN 123-45-6789 and email bob@example.com. Draft a reply.
Prompt sent to model: Use the following conversation to draft a reply: [full conversation with SSN].
Why bad: raw PII in prompt -> model may echo it. Also increases exposure in logs.
Do: strip or tokenize sensitive fields, and use placeholders
Better:
Input: {name: '[REDACTED_NAME]', ssn: '[REDACTED_SSN]', email: '[REDACTED_EMAIL]', claim: 'roof damage after hail'}
System: Draft a customer-facing reply that addresses the claim, without revealing or inferring any PII. Use placeholders for names.
Why better: model never sees the real SSN and is instructed to avoid inference.
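Here is a minimal sketch of that redact-before-prompting flow in Python. The regex patterns and placeholder names are illustrative assumptions, not a complete PII taxonomy; a production system would use a fuller detector (covered below).

```python
import re

# Illustrative patterns only; a production system needs a fuller detector.
PLACEHOLDERS = {
    "[REDACTED_SSN]": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "[REDACTED_EMAIL]": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}

def redact(text: str) -> str:
    """Swap direct identifiers for placeholders before the model sees them."""
    for placeholder, pattern in PLACEHOLDERS.items():
        text = pattern.sub(placeholder, text)
    return text

claim = ("Here is my insurance claim with SSN 123-45-6789 "
         "and email bob@example.com. Draft a reply.")
prompt = ("Draft a customer-facing reply that addresses the claim, without "
          "revealing or inferring any PII. Use placeholders for names.\n\n"
          + redact(claim))
# The model never receives the real SSN or email address.
```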
Don't: ask the model to infer hidden fields
Bad prompt pattern:
This ticket mentions 'the client'. Who is the client? Give me any likely identifiers.
This encourages hallucination or reconstruction of PII. Avoid at all costs.
Detection and redaction strategies (how to catch leaks before they go live)
Regex and heuristics: fast, simple patterns for emails, credit cards, SSNs, phone numbers.
Examples (simple patterns):
- Email: `[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`
- US SSN-ish: `[0-9]{3}-[0-9]{2}-[0-9]{4}`
- Phone (digits): `(?:\+?1[-. ]?)?[0-9]{3}[-. ]?[0-9]{3}[-. ]?[0-9]{4}`
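A quick sketch wiring these patterns into a scanner (assumes Python 3.9+; these are the same heuristics as above and will miss obfuscated or paraphrased PII):

```python
import re

# Heuristic patterns: fast edge filtering, not a guarantee.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"(?:\+?1[-. ]?)?\d{3}[-. ]?\d{3}[-. ]?\d{4}"),
}

def scan_for_pii(text: str) -> dict[str, list[str]]:
    """Return all heuristic PII matches in `text`, keyed by type."""
    hits = {kind: p.findall(text) for kind, p in PII_PATTERNS.items()}
    return {kind: found for kind, found in hits.items() if found}

print(scan_for_pii("Reach me at bob@example.com or 555-867-5309."))
# {'email': ['bob@example.com'], 'phone': ['555-867-5309']}
```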
Token / embedding-based detection: Flag outputs semantically similar to known identifiers using embeddings and approximate match thresholds.
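As a sketch of that idea, assuming the sentence-transformers library is available; the model name, the hypothetical identifier list, and the 0.85 cutoff are all assumptions to calibrate on your own data:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Known identifiers you must never leak (hypothetical examples).
KNOWN = ["Bob Smith", "bob@example.com", "claim #4471, Acme Insurance"]

model = SentenceTransformer("all-MiniLM-L6-v2")  # small general-purpose encoder
known_vecs = model.encode(KNOWN, normalize_embeddings=True)

def resembles_known_pii(candidate: str, threshold: float = 0.85) -> bool:
    """Flag text whose embedding sits close to a known identifier."""
    vec = model.encode([candidate], normalize_embeddings=True)[0]
    return float(np.max(known_vecs @ vec)) >= threshold

print(resembles_known_pii("B. Smith"))  # can catch paraphrased or partial leaks
```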
Model-based PII detectors: A smaller dedicated classifier fine-tuned to mark strings as PII or non-PII.
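A sketch using a general-purpose NER model as a stand-in detector via Hugging Face transformers; the model name here is just an example, and a real deployment would fine-tune on labeled PII data:

```python
from transformers import pipeline

# General NER model standing in for a purpose-built PII classifier.
detector = pipeline("token-classification",
                    model="dslim/bert-base-NER",
                    aggregation_strategy="simple")

for entity in detector("Please refund Bob Smith in Austin, Texas."):
    print(entity["entity_group"], entity["word"], round(entity["score"], 2))
# e.g. PER Bob Smith 0.99 / LOC Austin 0.99
```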
Human review: For high-risk cases (medical, financial), route outputs through human review before delivery.
Table: Pros and cons summary
| Method | Speed | Accuracy | When to use |
|---|---|---|---|
| Regex / heuristics | Very fast | Low to medium (false positives/negatives) | Edge filtering, quick blocking |
| Embedding similarity | Medium | Medium-high | Catch masked or paraphrased PII |
| Classifier detector | Medium | High (with training) | Production gating |
| Human-in-the-loop | Slow | Very high | High-risk outputs |
Measuring privacy risk (builds on Evaluation, Metrics, and QC)
You already track model quality and drift. Now add privacy-specific metrics:
- PII leakage rate: fraction of responses that contain detected PII.
- False positive / false negative rates of your PII detector (test with labeled datasets).
- Exposure score: combine severity (SSN > email > name), frequency, and user impact into a single risk score.
- Time-to-remediation: how long from detection to mitigation.
Set automated alerts for threshold breaches and visualize in dashboards. Close the loop: if leakage spikes, trigger data audits and model retraining with redaction.
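A sketch of the leakage rate and exposure score (false positive/negative rates need a labeled test set and are omitted). The severity weights, patterns, and inlined detector are assumptions; swap in your production detector:

```python
import re

# Assumed severity weights and a minimal inline detector for illustration.
SEVERITY = {"ssn": 10, "phone": 3, "email": 2}
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
}

def detect(text: str) -> dict:
    return {k: p.findall(text) for k, p in PATTERNS.items() if p.findall(text)}

def leakage_rate(responses: list[str]) -> float:
    """PII leakage rate: fraction of responses with any detected PII."""
    return sum(bool(detect(r)) for r in responses) / max(len(responses), 1)

def exposure_score(responses: list[str]) -> float:
    """Severity-weighted count of PII hits across a batch of responses."""
    return sum(SEVERITY.get(kind, 1) * len(hits)
               for r in responses for kind, hits in detect(r).items())

batch = ["Your claim is approved.", "Contact bob@example.com for details."]
print(leakage_rate(batch), exposure_score(batch))  # 0.5 2
```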
Advanced techniques (brief, actionable intros)
- Differential privacy: Inject calibrated noise into model training or query outputs so that individual records cannot be reverse-engineered. Good for analytics and training on sensitive corpora.
- Synthetic data: Train/test on synthetic datasets that capture distributional features without real identifiers.
- Tokenization / vaulting: Store real PII in a secure vault; pass only references into prompts. Resolve tokens only in secure back-end contexts, not in models (sketched below).
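A minimal vault sketch showing the token flow; the in-memory dict is a stand-in for what should be an encrypted, access-controlled service:

```python
import uuid

_vault: dict[str, str] = {}  # stand-in; use an encrypted vault service in production

def tokenize(value: str) -> str:
    """Store the real value; hand back an opaque reference token."""
    token = f"[PII:{uuid.uuid4().hex[:8]}]"
    _vault[token] = value
    return token

def detokenize(token: str) -> str:
    """Resolve a token back to the real value. Back-end only, never in prompts."""
    return _vault[token]

ssn_token = tokenize("123-45-6789")
prompt = f"Draft a reply to the customer referenced as {ssn_token} about their claim."
# The model sees only the opaque token; resolution happens after generation,
# inside a secure back-end context.
```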
Incident response and governance (yes, plan for failure)
- Prepare a response playbook: contain, assess, notify, remediate.
- Maintain consent logs and data provenance (who uploaded what, when, for what purpose).
- Understand the legal landscape: GDPR (right to be forgotten), HIPAA for health data, etc. The law changes frequently, so assign a legal contact.
- Run regular red-team exercises: simulate attackers trying to extract PII via adversarial prompts.
Quote to remember:
"You will not prevent every leak, but you can design to make leaks rare, detectable, and fixable."
Quick checklist for prompt engineers
- Before sending user-provided text to a model (see the end-to-end sketch below):
  - Remove or redact direct identifiers.
  - Replace them with placeholders and keep the mapping in a secure vault if needed.
  - Add system instructions to never infer or output PII.
  - Pass outputs through automated PII detectors.
  - Escalate to human review if the risk level is high.
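Tying the checklist together, a hedged end-to-end sketch: `redact()` and `scan_for_pii()` are the earlier sketches, while `call_model()` and `route_to_human_review()` are hypothetical stand-ins for your LLM client and review queue.

```python
def call_model(system: str, prompt: str) -> str:
    raise NotImplementedError("replace with your LLM client")  # hypothetical

def route_to_human_review(output: str) -> str:
    raise NotImplementedError("replace with your review queue")  # hypothetical

def safe_complete(user_text: str, high_risk: bool = False) -> str:
    cleaned = redact(user_text)                    # steps 1-2: redact, placeholder
    system = "Never infer or output PII. Use placeholders for names."  # step 3
    output = call_model(system=system, prompt=cleaned)
    if scan_for_pii(output):                       # step 4: automated detection
        output = redact(output)                    # or block the response outright
    return route_to_human_review(output) if high_risk else output  # step 5
```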
Closing — the moral of the story
Privacy in prompt engineering is not a single trick. It is an engineering culture: minimize data, test like you're being audited by a hostile AI, and measure everything. You already know how to measure model quality; now measure how often your model tries to be a tattletale.
Parting line (dramatic): Treat your model like a party guest with a big mouth. Don't let it overhear the secrets.