SDKs
Python
Idiomatic Python for the Thunderbit Open API
An official SDK is on the way. Until then, the Thunderbit API is a plain HTTP/JSON REST interface, so httpx (or requests) is all you need.
Install
pip install httpx
Setup
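import os
import httpx

API = "https://openapi.thunderbit.com/openapi/v1"
# Raises KeyError if THUNDERBIT_API_KEY is not set in the environment.
H = {"Authorization": f"Bearer {os.environ['THUNDERBIT_API_KEY']}"}
# One shared client reuses connections across calls.
client = httpx.Client(base_url=API, headers=H, timeout=60.0)
Distill a page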
resp = client.post("/distill", json={"url": "https://thunderbit.com/playground"})
resp.raise_for_status()
print(resp.json()["data"]["markdown"])
Extract structured data
resp = client.post("/extract", json={
    "url": "https://example.com/product/iphone-15-pro",
    # JSON Schema describing the fields to pull from the page.
    "schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "price": {"type": "number"},
        },
        "required": ["name", "price"],
    },
})
resp.raise_for_status()
print(resp.json()["data"])
Async
For high-throughput pipelines, switch to httpx.AsyncClient:
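import asyncio
import httpx

async def distill_many(urls: list[str]) -> list[str]:
    # Reuses API and H from the setup step; all requests run concurrently.
    async with httpx.AsyncClient(headers=H, timeout=60.0) as client:
        tasks = [client.post(f"{API}/distill", json={"url": u}) for u in urls]
        resps = await asyncio.gather(*tasks)
    for r in resps:
        r.raise_for_status()  # fail loudly on any non-2xx response
    return [r.json()["data"]["markdown"] for r in resps]
When you have more than 10 URLs, prefer /batch/distill over fanning out individual calls yourself; see Batch Job Lifecycle.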
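For fan-outs that stay under that threshold, it may still be worth capping how many requests are in flight at once. The sketch below shows one way to do that with asyncio.Semaphore. The helper name gather_bounded and the default limit of 5 are illustrative assumptions, not part of the Thunderbit API and not a documented rate limit.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def gather_bounded(
    items: list[T],
    worker: Callable[[T], Awaitable[R]],
    limit: int = 5,  # assumed value; tune it to your own quota
) -> list[R]:
    """Run worker(item) for every item, with at most `limit` running at once.

    Results come back in the same order as `items`.
    """
    sem = asyncio.Semaphore(limit)

    async def run(item: T) -> R:
        # Each task waits here until one of the `limit` slots frees up.
        async with sem:
            return await worker(item)

    return await asyncio.gather(*(run(item) for item in items))
```

Inside an `async with httpx.AsyncClient(...) as client:` block, the worker could be a small coroutine function that awaits `client.post(f"{API}/distill", json={"url": u})` and returns the response.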
The official Python SDK is in development. Stay tuned.
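Until the SDK ships, a thin wrapper can keep call sites tidy. The following is a minimal sketch, assuming only the /distill and /extract endpoints and the `data` response envelope shown on this page. The class name ThunderbitClient and its method names are hypothetical and do not preview the official SDK.

```python
from typing import Any, Protocol


class _Response(Protocol):
    def raise_for_status(self) -> Any: ...
    def json(self) -> Any: ...


class _HTTPClient(Protocol):
    # Matches the keyword-only `json` argument of httpx.Client.post.
    def post(self, url: str, *, json: Any) -> _Response: ...


class ThunderbitClient:
    """Hypothetical convenience wrapper; not an official SDK class."""

    def __init__(self, http: _HTTPClient) -> None:
        # Expects a client already configured with base URL and auth headers.
        self._http = http

    def _post(self, path: str, payload: dict[str, Any]) -> Any:
        resp = self._http.post(path, json=payload)
        resp.raise_for_status()  # surface HTTP errors instead of parsing them
        return resp.json()["data"]

    def distill(self, url: str) -> str:
        """Return the page at `url` as Markdown."""
        return self._post("/distill", {"url": url})["markdown"]

    def extract(self, url: str, schema: dict[str, Any]) -> Any:
        """Return data extracted from `url` according to a JSON Schema."""
        return self._post("/extract", {"url": url, "schema": schema})
```

With the `client` built in the setup step, `ThunderbitClient(client).distill("https://thunderbit.com/playground")` would replace the three-line distill call. Taking the HTTP client as a constructor argument keeps the wrapper easy to test with a stub.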