Structuring Outputs and Formats
Specify output schemas, enforce structure, and design responses for easy parsing, scoring, and downstream use.
XML, CSV, and TSV Outputs
Structuring Outputs: XML, CSV, and TSV (the three formats your data hoarder aunt will ask for)
You already mastered Headings & Sections and JSON with schema enforcement. Great. Now we’re slinging plain-text formats that actually get emailed, parsed, and occasionally break production. Let’s make them predictable.
Why bother with XML / CSV / TSV?
- CSV/TSV = tabular, lightweight, spreadsheet-friendly.
- XML = hierarchical, verbose, excellent for nested data and strict document formats.
Think of JSON as your developer BFF (structured, strict), and these three as the formats your company's legacy codebase still trusts at 3 AM.
We’ll build on your previous knowledge of schema enforcement (you know: "Make the model follow a structure or I will cry"). Now learn how to prompt so the model outputs XML, CSV, or TSV that you can actually validate and ingest.
Quick comparison (because your brain likes neat tables)
| Format | Best for | Strengths | Typical pitfalls |
|---|---|---|---|
| CSV | Flat tables (rows/columns) | Human-readable, Excel-friendly, tiny | Commas in fields, newlines, quoting rules |
| TSV | Flat tables with safer delimiter | Tabs rarely appear in data; fewer quoting issues | Tabs in text, inconsistent quoting practices |
| XML | Nested/hierarchical data | Namespaces, attributes, schemas (XSD) | Verbosity, encoding & escaping, complexity |
Core principles that apply across formats
- Tell the model the schema. You did this with JSON schemas — do the same here. If you expect columns/fields, list them.
- Provide examples (Zero/One/Few-shot) — examples steer token generation. High-quality exemplars reduce ambiguity.
- Be explicit about escaping rules. Ask for CSV with RFC 4180 quoting, TSV with tabs and newlines escaped as \t and \n, or XML with CDATA sections (or entity escaping) around special characters.
- Ask for a single, exact output block. End the prompt with: "Only output the data block, no explanations." Then parse the result programmatically.
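Even with a "data block only" instruction, a response can arrive wrapped in a markdown fence or with a stray sentence around it. Here is a minimal sketch of the "parse programmatically" step. The helper name `extract_block` and the assumption that stray wrapping shows up as a triple-backtick fence are illustrative, not something the lesson prescribes.

```python
import re

# Built at runtime so this example never contains a literal fence itself.
FENCE = "`" * 3


def extract_block(response: str) -> str:
    """Return the data block from a model response.

    If the response contains a fenced block, return only its contents.
    Otherwise return the response with leading/trailing newlines removed.
    """
    pattern = re.escape(FENCE) + r"[\w-]*\n(.*?)\n?" + re.escape(FENCE)
    match = re.search(pattern, response, flags=re.DOTALL)
    if match:
        return match.group(1)
    # Strip only newlines: a trailing tab can be a real (empty) TSV field.
    return response.strip("\r\n")


wrapped = "Here you go:\n" + FENCE + "csv\nname,age\nAlice,29\n" + FENCE + "\nDone."
print(extract_block(wrapped))
```

Whatever comes out of a helper like this still needs to go through a real parser for the target format; the extraction step only removes the wrapping.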
CSV: Recipe + gotchas
Prompt recipe (zero-shot)
Produce CSV only. Header: name,age,notes
Use RFC4180 quoting for fields that contain commas, quotes, or newlines.
Rows:
- Name: Alice, Age: 29, Notes: Loves, commas
- Name: Bob "The Builder", Age: 34, Notes: Multi-line\nEngineer
Only output the CSV block.
Expected output:
name,age,notes
Alice,29,"Loves, commas"
"Bob ""The Builder""",34,"Multi-line
Engineer"
Key tips:
- Double any double-quotes inside a field.
- Newlines inside fields require the field to be quoted.
- If the consumer needs it, spell out the dialect: delimiter ",", quote character '"', embedded quotes escaped by doubling.
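To check that output like the block above really is ingestible, you can run it through a CSV parser. This is a small sketch using Python's standard `csv` module, whose default dialect uses doubled quotes and accepts newlines inside quoted fields. The variable `model_output` stands in for whatever the model returned, and `parse_csv` is a hypothetical helper name.

```python
import csv
import io

# Stand-in for the model's response to the recipe above.
model_output = (
    'name,age,notes\n'
    'Alice,29,"Loves, commas"\n'
    '"Bob ""The Builder""",34,"Multi-line\nEngineer"\n'
)


def parse_csv(text: str) -> list[dict[str, str]]:
    """Parse CSV text into one dict per record, keyed by the header row."""
    # newline="" leaves line endings untouched so the csv module can
    # handle newlines that appear inside quoted fields.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return list(reader)


rows = parse_csv(model_output)
for row in rows:
    print(row)
```

All values come back as strings, so `age` still needs an explicit conversion if you want an integer.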
One-/Few-shot to improve behavior
Show one correct example with a tricky field (comma + quote + newline). Then ask for new rows. This reduces model inventiveness.
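One way to make sure the example you show is itself correct is to generate it with a CSV writer instead of typing it by hand. The sketch below does that with the standard `csv` module. The helper names, the prompt wording, and the sample people (Eve, Dan) are all made up for illustration, and it writes "\n" line endings for readability where RFC 4180 specifies CRLF.

```python
import csv
import io


def to_csv(header: list[str], rows: list[list[str]]) -> str:
    """Serialize rows with minimal quoting and doubled internal quotes."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def build_one_shot_prompt(header: list[str], example_row: list[str], new_entries: str) -> str:
    """Build a prompt that shows one correctly quoted row before the real task."""
    example = to_csv(header, [example_row])
    return (
        "Produce CSV only. Use RFC4180 quoting.\n"
        "Example of correct output:\n"
        + example
        + "Now format these entries the same way:\n"
        + new_entries
        + "\nOnly output the CSV block."
    )


# One tricky row covering comma + quote + newline in a single example.
tricky = ['Eve "E", Jr.', "41", "Line one\nLine two"]
prompt = build_one_shot_prompt(
    ["name", "age", "notes"], tricky, "- Name: Dan, Age: 50, Notes: none"
)
print(prompt)
```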
TSV: Less drama, different traps
TSV is like CSV after a spa day — less drama but still human.
Prompt recipe
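Output TSV only. Separate columns with a single tab character. Header columns: id, text, tags
Escape any literal tabs in text fields as "\t" (backslash+t) and any literal newlines as "\n" (backslash+n).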
Rows:
- id: 1, text: "Hello\tWorld", tags: "greeting"
- id: 2, text: "Line1\nLine2", tags: "multi"
Expected output:
id	text	tags
1	Hello\tWorld	greeting
2	Line1\nLine2	multi
Notes:
- Parsers disagree on quoted fields and on raw tabs inside values. Decide up front whether fields carry literal tabs or encoded sequences like "\t", and state that choice in the prompt.
- If your consumers live in Excel, TSV often imports more cleanly than CSV when the data contains commas.
XML: The drama queen (but also the power move)
XML is for when your data has hierarchy, attributes, or requires strict validation with XSD.
Prompt recipe
Produce XML only. Root element: <people>
Each person: <person id="..."> with child elements <name>, <age>, <notes>.
Wrap any text that may contain "<" or "&" in CDATA.
People:
- id: p1, name: Alice, age: 29, notes: Loves <coffee> & commas
- id: p2, name: Bob, age: 34, notes: Works\non projects
Expected output:
<people>
  <person id="p1">
    <name>Alice</name>
    <age>29</age>
    <notes><![CDATA[Loves <coffee> & commas]]></notes>
  </person>
  <person id="p2">
    <name>Bob</name>
    <age>34</age>
    <notes>Works
on projects</notes>
  </person>
</people>
Tips:
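- Use CDATA for arbitrary text containing markup-like characters. A CDATA section cannot itself contain the sequence "]]>", so ask for entity escaping (&lt;, &amp;) instead if that sequence can occur in your data.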
- Decide attributes vs child elements up front and document it in the prompt.
- If you need strict validation, include an XSD or require XSD-conformant output.
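A quick way to confirm the output is at least well-formed and has the expected shape is to parse it. The sketch below uses `xml.etree.ElementTree` from Python's standard library, which checks well-formedness and reads CDATA as ordinary text but does not validate against an XSD (that needs a third-party library such as lxml or xmlschema). The helper `parse_people` and the `REQUIRED_CHILDREN` tuple are illustrative names based on the recipe above.

```python
import xml.etree.ElementTree as ET

# Stand-in for the model's response to the recipe above.
model_output = """<people>
  <person id="p1">
    <name>Alice</name>
    <age>29</age>
    <notes><![CDATA[Loves <coffee> & commas]]></notes>
  </person>
  <person id="p2">
    <name>Bob</name>
    <age>34</age>
    <notes>Works
on projects</notes>
  </person>
</people>"""

REQUIRED_CHILDREN = ("name", "age", "notes")


def parse_people(xml_text: str) -> list[dict[str, str]]:
    """Parse the <people> document and check each <person> has the agreed shape."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError if not well-formed
    if root.tag != "people":
        raise ValueError(f"unexpected root element: {root.tag}")
    people = []
    for person in root.findall("person"):
        person_id = person.get("id")
        if person_id is None:
            raise ValueError("<person> is missing its id attribute")
        record = {"id": person_id}
        for tag in REQUIRED_CHILDREN:
            child = person.find(tag)
            if child is None:
                raise ValueError(f"person {person_id} is missing <{tag}>")
            record[tag] = child.text or ""
        people.append(record)
    return people


people = parse_people(model_output)
print(people)
```

If the model forgets the CDATA wrapper around "Loves <coffee> & commas", `ET.fromstring` raises `ET.ParseError`, which is exactly the failure you want to catch before ingestion.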
Prompt templates (copy-paste-ready)
- CSV strict:
Format the following entries as CSV only. Header: {headers}
Use RFC4180 quoting. Double internal quotes. Quote fields with commas, quotes, or newlines.
Entries:
{entries}
Only output the CSV.
- XML strict:
Produce XML only. Root: {root}. Element template: {element_template}.
Wrap text containing '<', '>' or '&' inside CDATA. Do not include comments or prose.
Entries:
{entries}
- TSV safe:
Output TSV only. Header: {headers}
Replace literal tabs in fields with the two-character sequence \t. Replace newlines with \n.
Entries:
{entries}
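The {headers} and {entries} placeholders map directly onto Python's `str.format`. As a rough sketch, here is the CSV strict template filled in from a list of records; `render_entries` and `build_csv_prompt` are hypothetical helpers, and the "- key: value" entry layout simply mirrors the recipes earlier in this lesson.

```python
CSV_STRICT = (
    "Format the following entries as CSV only. Header: {headers}\n"
    "Use RFC4180 quoting. Double internal quotes. "
    "Quote fields with commas, quotes, or newlines.\n"
    "Entries:\n"
    "{entries}\n"
    "Only output the CSV."
)


def render_entries(records: list[dict[str, str]]) -> str:
    """Render each record as a '- key: value, key: value' line."""
    lines = []
    for record in records:
        pairs = ", ".join(f"{key}: {value}" for key, value in record.items())
        lines.append(f"- {pairs}")
    return "\n".join(lines)


def build_csv_prompt(headers: list[str], records: list[dict[str, str]]) -> str:
    """Fill the CSV strict template with a header list and rendered entries."""
    return CSV_STRICT.format(
        headers=",".join(headers), entries=render_entries(records)
    )


prompt = build_csv_prompt(
    ["name", "age", "notes"],
    [{"name": "Alice", "age": "29", "notes": "Loves, commas"}],
)
print(prompt)
```

Only the template string is scanned for placeholders, so braces inside the entry values pass through unchanged.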
When to include examples (Few-shot wisdom)
- Zero-shot: fine for trivial, strongly specified formats.
- One-shot: use when you need the model to mimic a non-obvious quoting/escape style.
- Few-shot (2–5 examples): use for complex nested XML patterns or CSV quirks (embedded JSON in a CSV cell, anyone?).
Order matters: put the cleanest, most canonical example first. If the target dialect differs from the canonical one, add a second example written in that dialect.
Final checklist before you hit send
- Did you specify a single block-only output? (Yes/No)
- Did you provide header/field names exactly? (Yes/No)
- Did you define escaping rules and/or show examples? (Yes/No)
- Will your downstream parser expect literal tabs/newlines or escaped sequences? (Decide now.)
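The same questions can be asked of the model's response once it arrives. This is a small sketch of a post-hoc check for CSV, assuming you know the exact header you asked for; `check_csv_output` is a hypothetical helper that returns a list of problems instead of raising, so you can log them or feed them into a retry prompt.

```python
import csv
import io


def check_csv_output(text: str, expected_header: list[str]) -> list[str]:
    """Return a list of problems found; an empty list means the block passed."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows:
        return ["output is empty"]
    problems = []
    # Header must match the requested field names exactly, in order.
    if rows[0] != expected_header:
        problems.append(f"header mismatch: got {rows[0]}, expected {expected_header}")
    # Every record must have the same number of fields as the header.
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(expected_header):
            problems.append(
                f"record {number}: expected {len(expected_header)} fields, got {len(row)}"
            )
    return problems


good = 'name,age,notes\nAlice,29,"Loves, commas"\n'
bad = "name,age,notes\nAlice,29,Loves, commas\n"  # unquoted comma adds a field
print(check_csv_output(good, ["name", "age", "notes"]))
print(check_csv_output(bad, ["name", "age", "notes"]))
```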
Closing (mic drop)
If JSON/schema enforcement is your helmet for structure and safety, then CSV/TSV/XML are your tactical tools for the battlefield of spreadsheets and legacy systems. Be explicit, give examples, and demand exactness. The model loves a script — give it one and it will behave. Your future self (and the ops team) will thank you.
Key takeaway: Tell. Show. Enforce. Tell the schema, show high-quality examples when needed, and enforce parsing rules in your prompt.
Version note: This lesson builds on our earlier JSON/schema enforcement and headings lessons — use the same strictness you applied to JSON when you prompt for these older, messier formats.