AI-Powered Web Scraper API

Zero Maintenance. Ever.

One API call turns any webpage into Markdown or tables. Feed your agent live web data, build RAG pipelines, and enrich databases while we handle the infrastructure.

Trusted by 10,000+ users worldwide

BCG, Harvard University, Adidas, Patagonia, MIT, Carvana, Armis, Sam's Club

Up and running in minutes

Try it in your terminal right now.

URL to Markdown
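import os

import requests

# Read the API key from the environment (the variable name is an assumption).
API_KEY = os.environ["THUNDERBIT_API_KEY"]

resp = requests.post(
    "https://open.thunderbit.com/v1/distill",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"url": "https://example.com/article"},
    timeout=30,
)
resp.raise_for_status()  # stop here on an HTTP error instead of parsing an error body

# The distilled page comes back under data.markdown.
markdown = resp.json()["data"]["markdown"]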
Core API

Two core capabilities

Distill for clean content, Extract for structured data

Distill
URL → Markdown
Strips ads, nav, and noise — keeps only the content that matters
Full JS rendering and anti-bot bypass built in
Batch up to 100 URLs per request
Extract
URL + Schema → JSON / CSV
One schema works across all websites — no per-site maintenance
Survives site redesigns automatically
Batch up to 50 URLs per request
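
The batch limits above (100 URLs per Distill request, 50 per Extract request) mean that longer URL lists have to be split on the client side. The following is a minimal sketch of that splitting step only. The helper name `batched` and the sample URLs are illustrative, and because this page does not show how a batch is submitted, the sketch stops before any request is made.

```python
from typing import Iterator, List

# Per-request batch limits quoted in the list above.
DISTILL_BATCH_LIMIT = 100
EXTRACT_BATCH_LIMIT = 50


def batched(urls: List[str], limit: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each holding at most `limit` items."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    for start in range(0, len(urls), limit):
        yield urls[start:start + limit]


# Hypothetical input: 230 article URLs.
urls = [f"https://example.com/article/{i}" for i in range(230)]

distill_batches = list(batched(urls, DISTILL_BATCH_LIMIT))
extract_batches = list(batched(urls, EXTRACT_BATCH_LIMIT))

print([len(b) for b in distill_batches])  # [100, 100, 30]
print([len(b) for b in extract_batches])  # [50, 50, 50, 50, 30]
```

The same list therefore needs three Distill requests but five Extract requests.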
Advantages

Why use Thunderbit

The scraping / data extraction infrastructure your AI agent deserves

Define what, not how
No CSS selectors, no XPath, no per-site rules. Describe the data you need with a JSON Schema — AI figures out where it lives and how to get it.
One schema, every website
The same schema works across e-commerce sites, sales listings, or any other URL you throw at it. Adding a new data source is a config change, not an engineering sprint.
Stays working when sites break
Traditional scrapers die on every redesign. Thunderbit reads meaning, not DOM structure — so extraction keeps working even when the HTML changes underneath.
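
To make "define what, not how" concrete, here is a sketch of what a JSON Schema for product data might look like, reused unchanged for two unrelated sites. The field names, the example URLs, and the `schema` request key are all assumptions for illustration. Only the `url` key appears in the Distill example on this page, and the real Extract request format may differ.

```python
import json

# Hypothetical schema: field names are illustrative, not taken from Thunderbit's docs.
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string", "description": "Product name"},
        "price": {"type": "number", "description": "Current price"},
        "currency": {"type": "string", "description": "ISO 4217 code, e.g. USD"},
        "in_stock": {"type": "boolean", "description": "Whether it can be ordered"},
    },
    "required": ["title", "price"],
}


def build_extract_payload(url: str, schema: dict) -> dict:
    """Pair a URL with a schema. The "schema" key is an assumed request field."""
    return {"url": url, "schema": schema}


# Adding a source means adding a URL; the schema itself does not change.
sources = [
    "https://shop-a.example.com/item/123",
    "https://shop-b.example.com/products/widget",
]
payloads = [build_extract_payload(u, PRODUCT_SCHEMA) for u in sources]

print(json.dumps(payloads[0], indent=2))
```

Note that the schema describes only what the data means (names, types, descriptions). Nothing in it refers to selectors or page structure.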
Industries

Use cases

What you can build with Thunderbit

AI Agents with Web Access
Give your agent the ability to read and understand any webpage. One API call returns structured context, ready for your agent's next step.
RAG & Knowledge Bases
Distill any URL into clean Markdown and feed it straight into your vector database. No HTML parsing, no content cleaning scripts.
Turn Any Website into an API
Define a schema, point at a URL, get JSON back. Build a product price API, a job listing API, or a news feed API — without writing a single scraper.
Database Enrichment
Keep your database fresh with live web data. Pull company profiles, contact info, or listing details on a schedule — schema stays the same even when sources change.
Competitive Monitoring
Track prices, inventory, reviews, or content changes across hundreds of pages. Same schema, same pipeline, add new sources in seconds.
Dataset Building
Build training sets, evaluation benchmarks, or research datasets from the open web. Batch process thousands of URLs into consistently structured output.
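
For the RAG use case above, distilled Markdown usually has to be cut into chunks before it is embedded. The following is a rough sketch of one common approach, splitting at headings and capping chunk length. The function name, the 1,000-character cap, and the sample text are assumptions, and the embedding and vector database steps are left out because they depend on your stack.

```python
import re
from typing import List

HEADING = re.compile(r"#{1,6} ")  # ATX heading: 1-6 '#' characters, then a space


def split_markdown_by_heading(markdown: str, max_chars: int = 1000) -> List[str]:
    """Split Markdown into one section per heading, then cap each piece at max_chars."""
    sections: List[str] = []
    current: List[str] = []
    for line in markdown.splitlines():
        # A heading starts a new section, unless nothing has been collected yet.
        if HEADING.match(line) and current:
            sections.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current).strip())

    chunks: List[str] = []
    for section in sections:
        if not section:
            continue
        # Hard cap so a single long section cannot exceed the embedding input size.
        for start in range(0, len(section), max_chars):
            chunks.append(section[start:start + max_chars])
    return chunks


# Stand-in for the `markdown` string returned by a Distill call.
sample = "# Title\n\nIntro paragraph.\n\n## Details\n\nMore text.\n"
chunks = split_markdown_by_heading(sample)
print(chunks)  # ['# Title\n\nIntro paragraph.', '## Details\n\nMore text.']
```

This simple version treats any line starting with `#` and a space as a heading, so comment lines inside fenced code blocks would also trigger a split.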

We build Thunderbit on this API

The same API you're looking at powers Thunderbit's Chrome Extension and web app — used by 100,000+ users to extract tens of millions of pages every month.
This isn't a side project. It's the infrastructure we bet our own product on.

Plan

Pricing

Start free, pay as you grow

Free
A lightweight way to try scraping. No cost, no card, no hassle.
600 units / one-time
$0 one-time
Distill 600 pages
Extract 30 pages
2 concurrent requests
Starter
Great for side projects and small tools. Fast, simple, no overkill.
60,000 API units / year
$16/month
Billed yearly. All units upfront.
Distill 60,000 pages
Extract 3,000 pages
30 concurrent requests
Basic support
Pro (Most popular)
Built for high volume and speed. Thunderbit at full force.
600,000 API units / year
$40/month
Billed yearly. All units upfront.
Available unit tiers: 600K, 1200K, 2400K, 4800K
Distill 600,000 pages
Extract 30,000 pages
50 concurrent requests
Priority support
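
The plan figures above imply a unit cost per page, although the page does not state one. The sketch below derives it from the listed numbers and uses it to size a workload. The 1-unit and 20-unit rates are inferred, not quoted, and the sample workload is hypothetical.

```python
# Plan figures copied from the pricing list above.
PLANS = {
    "Free": {"units": 600, "distill_pages": 600, "extract_pages": 30},
    "Starter": {"units": 60_000, "distill_pages": 60_000, "extract_pages": 3_000},
    "Pro": {"units": 600_000, "distill_pages": 600_000, "extract_pages": 30_000},
}
MONTHLY_USD = {"Starter": 16, "Pro": 40}  # billed yearly, per the plan descriptions


def inferred_rates(plan: dict) -> tuple:
    """Return (units per Distill page, units per Extract page) implied by a plan."""
    return (
        plan["units"] // plan["distill_pages"],
        plan["units"] // plan["extract_pages"],
    )


rates = {name: inferred_rates(plan) for name, plan in PLANS.items()}
print(rates)  # every plan implies (1, 20)


def units_needed(distill_pages: int, extract_pages: int,
                 distill_rate: int = 1, extract_rate: int = 20) -> int:
    """Estimate units for a mixed workload, using the inferred rates as defaults."""
    return distill_pages * distill_rate + extract_pages * extract_rate


# Hypothetical yearly workload: 10,000 Distill pages plus 2,000 Extract pages.
workload = units_needed(10_000, 2_000)
print(workload)  # 50000, which fits within Starter's 60,000 units

yearly_usd = {name: price * 12 for name, price in MONTHLY_USD.items()}
print(yearly_usd)  # {'Starter': 192, 'Pro': 480}
```

If the inferred rates hold, an Extract-heavy workload uses up units about twenty times faster than a Distill-only one, which is worth checking before picking a tier.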

Frequently asked questions

Everything you need to know about the product and billing.