Integrations

LangChain

Use Thunderbit as a Document loader or Tool inside a LangChain agent

Plug Thunderbit into a LangChain pipeline as a Document loader (for RAG ingestion) or as a Tool (for agent-driven web research).

Installation

pip install langchain-core httpx

As a Document loader

from langchain_core.documents import Document
import httpx

API = "https://openapi.thunderbit.com/openapi/v1"
H = {"Authorization": "Bearer YOUR_API_KEY"}

class ThunderbitLoader:
    def __init__(self, urls: list[str]):
        self.urls = urls

    def load(self) -> list[Document]:
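        resp = httpx.post(f"{API}/batch/distill",
                          headers=H,
                          json={"urls": self.urls,
                                "include": ["metadata"]},
                          timeout=60.0)
        resp.raise_for_status()
        job = resp.json()
        # The batch job is asynchronous: poll until its status is COMPLETED
        # before reading results (see the Batch Job Lifecycle guide).
        # Polling is omitted here, so `job` must be refreshed before the return.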
        return [
            Document(page_content=r["markdown"],
                     metadata={"source": r["url"], **r.get("metadata", {})})
            for r in job["data"]["results"] if r["status"] == "SUCCEEDED"
        ]

docs = ThunderbitLoader(["https://docs.example.com"]).load()
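
The loader above leaves the polling step as a comment. A minimal sketch of that step follows, using only the standard library. The status-fetching call is deliberately passed in as `fetch_job`, a hypothetical callable, because the endpoint for reading a batch job is described in the Batch Job Lifecycle guide and is not shown here. The location of the status at `payload["data"]["status"]`, the `interval`, and the `timeout` are assumptions for illustration; only the `COMPLETED` value comes from this page. The sketch does not handle a failed job status.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_job: Callable[[], dict],
    done_status: str = "COMPLETED",
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_job() repeatedly until its status equals done_status.

    fetch_job is a hypothetical callable returning the latest job payload;
    the status is assumed to live at payload["data"]["status"].
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        if job.get("data", {}).get("status") == done_status:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job not {done_status} after {timeout} seconds")
        # Wait before asking again so the API is not hammered.
        sleep(interval)
```

Inside `load()`, the payload returned by this helper would replace `job` before the list of Documents is built. Injecting `sleep` keeps the helper easy to test without real delays.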

Pass docs to your usual LangChain text splitter and vector store.
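
To make the splitting step concrete, here is a rough stand-in for what a text splitter does to the returned Markdown before it is embedded: fixed-size chunks with some overlap, each keeping the source URL. This is not LangChain's implementation. In a real pipeline a class such as `RecursiveCharacterTextSplitter` from the `langchain-text-splitters` package would do this with smarter boundaries. The function name `chunk_markdown`, the dict layout, and the default sizes are assumptions chosen for illustration.

```python
def chunk_markdown(
    text: str,
    source: str,
    chunk_size: int = 1000,
    overlap: int = 100,
) -> list[dict]:
    """Split text into overlapping chunks, carrying the source URL along."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append({
            "page_content": text[start:start + chunk_size],
            "metadata": {"source": source, "start": start},
        })
        # Stop once this chunk already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means that a sentence cut at a chunk boundary still appears whole in one of the two neighboring chunks, which tends to help retrieval.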

As an agent Tool

import httpx
from langchain_core.tools import tool

# API and H are the base URL and auth headers defined in the Document loader example.

@tool
def read_url(url: str) -> str:
    """Fetch a URL and return clean Markdown for the agent to read.

    Use for any web research task: docs, articles, search results, product pages.
    """
    resp = httpx.post(f"{API}/distill",
                      headers=H,
                      json={"url": url, "renderMode": "basic"},
                      timeout=60.0)
    resp.raise_for_status()
    return resp.json()["data"]["markdown"]

# Pass [read_url] into create_react_agent / AgentExecutor / etc.
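
The tool above returns the full page, which for long pages may take up more of an agent's context than is useful. One possible guard, sketched under assumptions, caps the returned text. The response shape (`data.markdown`) is taken from the tool code above; the helper name `markdown_for_agent`, the `max_chars` default, and the `[truncated]` marker are hypothetical and not part of the Thunderbit API.

```python
def markdown_for_agent(payload: dict, max_chars: int = 8000) -> str:
    """Return the page Markdown, cut to max_chars with a visible marker."""
    markdown = payload["data"]["markdown"]
    if len(markdown) <= max_chars:
        return markdown
    # Tell the agent explicitly that the page was cut short.
    return markdown[:max_chars] + "\n\n[truncated]"
```

Inside `read_url`, the final line could then become `return markdown_for_agent(resp.json())`.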

Related

This integration is being expanded with a langchain-thunderbit package. Check back soon.