Getting Started

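Thunderbit Open API turns any web page into clean, structured data that LLMs can actually use, while transparently handling JavaScript rendering, anti-bot protection, geo-routing, and proxy rotation.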

Quick Start

Walk through the full flow in five minutes, with cURL, Python, and Node.js examples.

API Reference

Endpoints, error codes, and retry policy.

Why Thunderbit

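Pain point	Without Thunderbit	With Thunderbit
JavaScript-heavy SPAs	Self-host headless Chrome, tune timeouts, chase memory leaks	`renderMode: "full"`
CAPTCHAs / anti-bot walls	Rotate proxies, solve puzzles, watch your IPs get burned	We handle it for you
Geo-blocked content	Maintain a proxy pool per country	`countryCode: "DE"`
HTML noise (ads, navigation, pop-ups)	Hand-write readability heuristics for every site	Markdown with the noise stripped automatically
Structured extraction	Train extractors, maintain CSS selectors that break every week	JSON Schema → JSON output
Scaling to 10k+ URLs	Build your own queue, retries, deduplication, and status dashboard	Batch endpoints + webhooks
LLM token cost	Feed the model raw HTML and pay for it	Pre-distilled Markdown, 5–10x fewer tokens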

Three Core Endpoints

🔥 Distill: web page → clean Markdown

curl -X POST https://openapi.thunderbit.com/openapi/v1/distill \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/article"}'

Returns LLM-ready Markdown with metadata stripped, using 5–10x fewer tokens than the raw HTML.
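For readers working in Python, the same call can be sketched with the standard library alone. The endpoint URL and headers come from the cURL example; passing `renderMode` and `countryCode` as top-level body fields is an assumption based on the comparison table, and the helper names `build_distill_request` and `send` are hypothetical. The response shape is not documented here, so the sketch simply returns the parsed JSON.

```python
import json
import urllib.request

# Base URL taken from the cURL example above.
API_BASE = "https://openapi.thunderbit.com/openapi/v1"


def build_distill_request(api_key, url, **options):
    """Build (but do not send) a POST request for the Distill endpoint.

    Extra keyword arguments (e.g. renderMode, countryCode) are merged into
    the JSON body. Placing them at the top level is an assumption.
    """
    body = {"url": url, **options}
    return urllib.request.Request(
        f"{API_BASE}/distill",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Prepare a request for a JavaScript-heavy page routed through Germany.
req = build_distill_request(
    "YOUR_API_KEY",
    "https://example.com/article",
    renderMode="full",
    countryCode="DE",
)
```

Calling `send(req)` with a real API key performs the request; building and sending are kept separate so the payload can be inspected or logged first.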

🧠 Extract: JSON Schema → structured fields

curl -X POST https://openapi.thunderbit.com/openapi/v1/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/product",
    "schema": {
      "type": "object",
      "properties": {
        "name":  { "type": "string" },
        "price": { "type": "number" }
      },
      "required": ["name", "price"]
    }
  }'

The AI reads the description of each field in your schema, so the more specific you make it, the better ("product MSRP in USD before discount" beats "price").
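To illustrate the point about descriptions, here is a minimal sketch that builds the Extract request body with a `description` on every field, using the standard JSON Schema `description` keyword. The helper names `field` and `build_extract_body` are hypothetical, and the wording of the `name` description is only an example.

```python
import json


def field(json_type, description):
    """Return a JSON Schema property with a type and a description."""
    return {"type": json_type, "description": description}


def build_extract_body(url, fields):
    """Build the Extract request body; every listed field is marked required."""
    return {
        "url": url,
        "schema": {
            "type": "object",
            "properties": fields,
            "required": list(fields),
        },
    }


body = build_extract_body(
    "https://example.com/product",
    {
        # Specific descriptions tell the model exactly which value to pick.
        "name": field("string", "Product title as shown on the page"),
        "price": field("number", "product MSRP in USD before discount"),
    },
)

# Serialized form, ready to pass as the -d payload or an HTTP request body.
print(json.dumps(body, indent=2))
```

Keeping the descriptions next to the field definitions makes it easy to tighten the wording when a field comes back wrong, without touching the rest of the request.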

⚡ Batch: up to 100 URLs, asynchronous with a webhook

curl -X POST https://openapi.thunderbit.com/openapi/v1/batch/distill \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://example.com/page1", "https://example.com/page2"],
    "webhook": {
      "url":    "https://your-server.com/webhook/distill",
      "secret": "whsec_your_secret_key"
    }
  }'
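Because a batch accepts at most 100 URLs, a longer list has to be split before submission. The sketch below shows one way to do that, assuming each chunk is sent as its own batch request with the same webhook settings. The helper names are hypothetical, and the client-side deduplication is an optional convenience rather than something the API is stated to require.

```python
MAX_BATCH_SIZE = 100  # per-request limit stated for the Batch endpoint


def chunk_urls(urls, size=MAX_BATCH_SIZE):
    """Deduplicate URLs (keeping first-seen order) and split them into chunks."""
    unique = list(dict.fromkeys(urls))
    return [unique[i:i + size] for i in range(0, len(unique), size)]


def build_batch_bodies(urls, webhook_url, webhook_secret, size=MAX_BATCH_SIZE):
    """Return one batch request body per chunk of at most `size` URLs."""
    return [
        {
            "urls": chunk,
            "webhook": {"url": webhook_url, "secret": webhook_secret},
        }
        for chunk in chunk_urls(urls, size)
    ]


# 250 distinct pages plus one duplicate of the first page.
pages = [f"https://example.com/page{i}" for i in range(1, 251)]
pages.append("https://example.com/page1")

bodies = build_batch_bodies(
    pages,
    "https://your-server.com/webhook/distill",
    "whsec_your_secret_key",
)
print([len(b["urls"]) for b in bodies])  # → [100, 100, 50]
```

Each body can then be posted to the batch endpoint shown above; results arrive at the webhook URL asynchronously.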