Designing JSON Schemas the AI can extract reliably
The schema you pass to /extract is also a prompt. Every field name, description, and type hint is read by the model. A well-shaped schema dramatically improves accuracy.
Field naming
Use names that read like English. The model is much better at productName than pn or name1.
{
  "type": "object",
  "properties": {
    "productName": { "type": "string" },
    "currentPrice": { "type": "number" }
  }
}
Field descriptions
Add a description to any field whose meaning is ambiguous. "price" could mean MSRP, current price, or per-unit price, so be explicit:
{
  "currentPrice": {
    "type": "number",
    "description": "Final price after discount, in USD"
  }
}
Required vs optional
Mark as required only the fields you truly need. If the model can't find a required field, the entire extraction fails, so use required fields sparingly.
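As a rough sketch of that advice, the schema below requires only the one field the caller cannot do without and leaves the rest optional. It assumes /extract honors the standard JSON Schema required keyword, which this page does not confirm; discountCode is a hypothetical field added for illustration.

```json
{
  "$comment": "Illustrative sketch: assumes /extract follows the standard JSON Schema 'required' keyword; discountCode is hypothetical",
  "type": "object",
  "properties": {
    "productName": { "type": "string" },
    "currentPrice": { "type": "number" },
    "discountCode": {
      "type": "string",
      "description": "Promo code shown on the page, if any"
    }
  },
  "required": ["productName"]
}
```

With this shape, a page that shows no price or promo code should still yield a result containing productName, rather than failing outright.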
Nesting
Prefer one level of nesting where useful (address.city). Deeper nesting (3+ levels) tends to hurt extraction quality.
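A minimal sketch of one-level nesting, built around the address.city example above. The storeName and postalCode fields are hypothetical and only there to give the object some shape.

```json
{
  "$comment": "Illustrative sketch: one level of nesting; storeName and postalCode are hypothetical fields",
  "type": "object",
  "properties": {
    "storeName": { "type": "string" },
    "address": {
      "type": "object",
      "properties": {
        "city": { "type": "string" },
        "postalCode": { "type": "string" }
      }
    }
  }
}
```

If a draft schema goes three or more levels deep, one option is to flatten the inner objects into descriptively named top-level fields (for example, addressCity).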
Common pitfalls
- Using ambiguous types (string for numbers like "$19.99"); prefer number and let the model parse
- Vague enums without descriptions
- Required fields that aren't actually present on every page
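The first two pitfalls could be avoided along these lines. This is a sketch only: the availability field, its enum values, and the wording of the descriptions are hypothetical, and whether /extract reads enum descriptions in exactly this way is an assumption.

```json
{
  "$comment": "Illustrative sketch: availability and its enum values are hypothetical",
  "type": "object",
  "properties": {
    "currentPrice": {
      "type": "number",
      "description": "Final price after discount, in USD, as a bare number (19.99, not \"$19.99\")"
    },
    "availability": {
      "type": "string",
      "enum": ["in_stock", "out_of_stock", "preorder"],
      "description": "in_stock: can be ordered now; out_of_stock: listed but not purchasable; preorder: purchasable before release"
    }
  }
}
```

Neither field is listed as required here, so pages that omit a price or availability indicator would not cause the extraction to fail.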
This page is being expanded with a schema cookbook — check back soon.