Reliable structured extraction from any LLM. Pass a pydantic model and a prompt. structext calls the model, parses the JSON, validates it against your schema, and on a validation failure re-asks the model with the exact errors appended, until it gets a valid object or runs out of retries.
Raw model JSON breaks your code the moment the model returns a near-miss:
import json
resp = client.messages.create(...) # the model returns:
# {"name": "Ada Lovelace", "age": "thirty-six", "email": "ada@example.com"}
data = json.loads(resp.content[0].text)
contact = Contact(**data) # ValidationError: age is not an intWith structext, the same near-miss is caught, fed back to the model, and fixed automatically:
from structext import Extractor, AnthropicClient
result = Extractor(AnthropicClient(), Contact).extract(
"Extract the contact from: Ada Lovelace, 36, ada@example.com"
)
contact = result.data # a validated Contact, typedThe FakeClient replays a scripted conversation, so this runs offline and is fully deterministic. It scripts a malformed reply on attempt 1 and a valid one on attempt 2, exercising the retry path:
import json
from pydantic import BaseModel, Field
from structext import Extractor, FakeClient
class Contact(BaseModel):
name: str
age: int = Field(ge=0)
email: str
malformed = '{"name": "Ada", "age": 36, "email":' # truncated
valid = json.dumps({"name": "Ada", "age": 36, "email": "ada@example.com"})
client = FakeClient([malformed, valid], model="claude-sonnet-4-6")
result = Extractor(client, Contact, max_retries=2).extract("Extract the contact.")
print(result.data) # name='Ada' age=36 email='ada@example.com'
print(result.attempts) # 2 (failed once, then succeeded)
print(result.usage.format_table())Or run the bundled demo command, which does exactly this:
structext
pip:
pip install structext # core (FakeClient, extractor, pydantic)
pip install "structext[anthropic]" # adds the real Anthropic adapter
uv:
uv add structext
uv add "structext[anthropic]"
The anthropic SDK is imported lazily, so the package imports and the test suite runs with no API key and no network.
Extractor.extract(prompt) runs one loop:
- Call. Send a system prompt (your schema as JSON Schema) plus the user prompt to the
LLMClient. - Parse. Pull a JSON object out of the reply. Bare JSON and JSON inside a
```jsonfence both work. - Validate. Validate against your pydantic model with
model_validate_json. - Retry on failure. If parsing or validation fails, append the model's bad reply and the exact validation errors as a new user turn, then call again. Repeat up to
max_retriestimes. - Return or raise. On success, return an
ExtractionResult(the typed object, the usage log, and the attempt count). If every attempt fails, raiseExtractionError, which carries the last raw output and the last validation errors.
Every call is recorded in a UsageLog with input/output token counts and an estimated USD cost from a small built-in pricing table for current Claude models.
LLMClient is a tiny protocol: a model property and a complete(system, messages) method returning text and token counts. Implement it for any provider. structext ships two: AnthropicClient (default, claude-sonnet-4-6) and FakeClient (deterministic, for tests and offline demos).
The retry path is measured by the test suite itself, not asserted in prose. The headline number:
| Metric | Value | How it is produced |
|---|---|---|
| Tests passing | 15/15 | pytest -q over tests/, fully offline |
| Retry recovery | malformed JSON on attempt 1 is recovered on attempt 2 | test_retry_on_invalid_then_valid drives FakeClient(["<malformed>", "<valid>"]) and asserts result.attempts == 2 |
| Cost math | $18.00 for 1M input + 1M output on claude-sonnet-4-6 |
test_estimate_cost_matches_pricing_table checks estimate_cost against the $3/$15 per-1M-token rate |
Reproduce it:
python3 -m venv .venv
.venv/bin/pip install -e ".[dev]"
.venv/bin/pytest -q
- The pricing table is a static snapshot of public Claude rates. Update
MODEL_PRICINGwhen rates change; an unknown model logs a$0.00cost and is flaggedpriced=Falserather than erroring. - Token counts from
FakeClientare a length-based estimate (~4 characters per token), not a real tokenizer. They are deterministic for tests, not billing-accurate. Real counts come from the provider viaAnthropicClient. - Extraction targets a single JSON object per call. Streaming and multi-object batch extraction are out of scope.
- Retries cost real tokens. Each retry is another model call; set
max_retrieswith that in mind. - structext does not constrain decoding at the token level. It validates after the fact and retries. For hard guarantees, pair it with a provider that supports schema-constrained output.
MIT. Copyright (c) Allan Paulo de Souza. See LICENSE.