OkraPDF

Output Schema — Runtime Demo

Store reproducible extraction recipes, materialize outputs with audit trails, serve publicly from R2 without waking the Durable Object.

SDKextract + materialize
Durable ObjectSQLite + R2 write
R2public read, no DO wake
1
Register Output Profile
PUT

Define the extraction recipe: JSON Schema for the output shape, a prompt for the LLM, and which model to use. Stored in DO SQLite.

PUT /document/{id}/output-profile/invoice
Request Body
{ "schema": { "type": "object", "properties": { "vendor": { "type": "string" }, "total": { "type": "number" }, "line_items": { "type": "array" } } }, "prompt": "Extract invoice fields including vendor, date, total, and line items.", "model": "claude-sonnet-4-5-20250929" }
Response
2
Materialize Output
PUT

After LLM extraction, write the validated data and full audit trail. DO stores in SQLite (source of truth) and writes data-only blob to R2 (serving layer).

PUT /document/{id}/output/invoice
Request Body
{ "data": { "vendor": "Acme Corp", "total": 1234.56, "date": "2026-01-15", "line_items": [ { "desc": "Widget", "qty": 10, "price": 123.456 } ] }, "audit": { "model": "claude-sonnet-4-5-20250929", "prompt": "Extract invoice fields...", "raw_response": "{\"vendor\":\"Acme Corp\",...}" } }
Response
3
Read via Durable Object
GET

Authenticated read from DO SQLite. Returns the validated extraction data. Used by SDK internals and admin tooling.

GET /document/{id}/output/invoice
Response
4
Read Audit Trail
GET

Every materialized output is traceable: see which model ran, what prompt was sent, and the raw LLM response before JSON parse.

GET /document/{id}/output/invoice/audit
Response
5
Public R2 Read
GET

Public URL served directly from R2. No authentication required. The Durable Object never wakes — zero compute cost on read.

GET /v1/documents/{id}/o_invoice/data.json
Response

Lifecycle Complete

Output profile and materialized data stored in DO SQLite (source of truth with audit trail). Data-only blob written to R2 (serving layer). Public reads at /v1/documents/{id}/o_invoice/data.json hit R2 directly — the Durable Object never wakes.