SDKs

Rust

Idiomatic Rust for the Thunderbit Open API

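Built on reqwest + serde + tokio. Async by default; for fan-out, combine it with futures::stream::iter (this needs the futures crate in addition to the dependencies below).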

Cargo.toml

[dependencies]
reqwest = { version = "0.12", features = ["json"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tokio = { version = "1", features = ["full"] }

Setup

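use reqwest::Client;
use serde_json::json;

const API: &str = "https://openapi.thunderbit.com/openapi/v1";

// Build a client with a 60-second request timeout.
fn client() -> Client {
    Client::builder()
        .timeout(std::time::Duration::from_secs(60))
        .build()
        .expect("failed to build HTTP client")
}

// Authorization header value; panics if THUNDERBIT_API_KEY is unset.
fn auth() -> String {
    let key = std::env::var("THUNDERBIT_API_KEY").expect("THUNDERBIT_API_KEY is not set");
    format!("Bearer {key}")
}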
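The helpers above panic when the key is missing, which is fine for a quick script but awkward in a long-running service. Below is a minimal sketch of a non-panicking variant using only the standard library. The names `endpoint` and `bearer_from` are hypothetical helpers introduced for illustration; they are not part of any Thunderbit SDK.

```rust
use std::env;

const API: &str = "https://openapi.thunderbit.com/openapi/v1";

/// Hypothetical helper: join the base URL and an endpoint path,
/// tolerating a leading slash on the path.
fn endpoint(path: &str) -> String {
    format!("{API}/{}", path.trim_start_matches('/'))
}

/// Hypothetical helper: build the Authorization header value from a raw key,
/// rejecting empty or whitespace-only keys.
fn bearer_from(key: &str) -> Result<String, String> {
    let key = key.trim();
    if key.is_empty() {
        return Err("THUNDERBIT_API_KEY is empty".to_string());
    }
    Ok(format!("Bearer {key}"))
}

/// Read THUNDERBIT_API_KEY from the environment and return an error
/// instead of panicking when it is unset or empty.
fn auth() -> Result<String, String> {
    let key = env::var("THUNDERBIT_API_KEY")
        .map_err(|e| format!("THUNDERBIT_API_KEY: {e}"))?;
    bearer_from(&key)
}
```

With this variant, a call site would write `.header("Authorization", auth()?)` inside a function whose error type can be built from a `String`, such as `Box<dyn std::error::Error>`.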

Distill a page

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let res: serde_json::Value = client()
        .post(format!("{API}/distill"))
        .header("Authorization", auth())
        .json(&json!({ "url": "https://thunderbit.com/playground" }))
        .send()
        .await?
        .error_for_status()?
        .json()
        .await?;
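    // as_str() prints the raw Markdown; printing the Value directly emits a quoted, escaped JSON string.
    println!("{}", res["data"]["markdown"].as_str().unwrap_or_default());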
    Ok(())
}

Extract structured data

let res: serde_json::Value = client()
    .post(format!("{API}/extract"))
    .header("Authorization", auth())
    .json(&json!({
        "url": "https://example.com/product/iphone-15-pro",
        "schema": {
            "type": "object",
            "properties": {
                "name":  { "type": "string" },
                "price": { "type": "number" }
            },
            "required": ["name", "price"]
        }
    }))
    .send().await?.error_for_status()?.json().await?;

Batch fan-out

To run many single-URL distill calls concurrently (instead of going through the batch endpoint), use a bounded JoinSet:

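use std::sync::Arc;
use tokio::{sync::Semaphore, task::JoinSet};

let c = client();                          // one shared client; clones reuse its connection pool
let permits = Arc::new(Semaphore::new(8)); // concurrency cap; 8 is an illustrative value
let mut set = JoinSet::new();
for url in urls {
    let (c, permits) = (c.clone(), permits.clone());
    set.spawn(async move {
        // Hold a permit for the duration of the request so at most 8 run at once.
        let _permit = permits.acquire_owned().await.expect("semaphore is never closed");
        c.post(format!("{API}/distill"))
            .header("Authorization", auth())
            .json(&json!({ "url": url }))
            .send().await?.error_for_status()?
            .json::<serde_json::Value>().await
    });
}
while let Some(res) = set.join_next().await { /* … */ }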
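Each item yielded by `set.join_next()` is a nested result: the outer layer reports whether the task itself finished (it fails on panic or cancellation), and the inner layer is the request outcome. One possible way to keep the two failure modes apart is sketched below. `Tally` and `record` are hypothetical names, and the function is generic so that it does not depend on the concrete tokio or reqwest error types.

```rust
/// Hypothetical counters for one fan-out run.
#[derive(Debug, Default, PartialEq)]
struct Tally {
    ok: usize,
    request_failed: usize,
    task_failed: usize,
}

/// Classify one joined task result and update the tally.
/// `J` stands in for the join error, `E` for the request error.
/// Returns the payload only when both layers succeeded.
fn record<T, E, J>(tally: &mut Tally, joined: Result<Result<T, E>, J>) -> Option<T> {
    match joined {
        Ok(Ok(value)) => {
            tally.ok += 1;
            Some(value)
        }
        Ok(Err(_)) => {
            // The task ran to completion, but the HTTP call or decoding failed.
            tally.request_failed += 1;
            None
        }
        Err(_) => {
            // The task panicked or was cancelled before producing a result.
            tally.task_failed += 1;
            None
        }
    }
}
```

In the loop above this would be called as `record(&mut tally, res)`, collecting the returned `Some(value)` items as needed.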

With more than 10 URLs, prefer /batch/distill over fan-out; see Batch Job Lifecycle.
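The 10-URL guideline can be encoded as a small decision helper so that callers do not hard-code the threshold in several places. This is only a sketch: `Strategy`, `FAN_OUT_MAX`, and `choose_strategy` are hypothetical names, and the threshold is taken from the guidance above rather than from any documented API limit.

```rust
/// Hypothetical type: how a set of URLs should be submitted.
#[derive(Debug, PartialEq)]
enum Strategy {
    /// Individual /distill calls issued concurrently.
    FanOut,
    /// A single /batch/distill job.
    Batch,
}

/// Assumed threshold: up to 10 URLs fan out, anything above goes to batch.
const FAN_OUT_MAX: usize = 10;

/// Pick a submission strategy from the number of URLs.
fn choose_strategy(url_count: usize) -> Strategy {
    if url_count > FAN_OUT_MAX {
        Strategy::Batch
    } else {
        Strategy::FanOut
    }
}
```

A caller would branch on `choose_strategy(urls.len())` before deciding whether to spawn tasks or submit a batch job.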

The official Rust SDK is under development. Stay tuned.