Question 1

What's the difference between Distill and Extract?

Accepted Answer

Distill converts any URL into clean Markdown, stripping ads, navigation, and noise. Extract takes a URL plus a JSON Schema and returns structured JSON or CSV data. Use Distill for content ingestion (RAG, knowledge bases) and Extract for structured data collection (prices, listings, contacts).
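To make the "URL plus a JSON Schema" input concrete, here is a minimal sketch of what an Extract-style request body might look like. The field names `url` and `schema`, the helper `build_extract_request`, and the example product fields are assumptions for illustration, not Thunderbit's documented request format. Only the schema itself follows standard JSON Schema.

```python
import json


def build_extract_request(url: str, schema: dict) -> dict:
    """Pair a target URL with the JSON Schema describing the wanted fields.

    NOTE: "url" and "schema" are hypothetical field names chosen for this
    sketch; check the real API reference for the actual request shape.
    """
    return {"url": url, "schema": schema}


# A standard JSON Schema describing one product record (illustrative fields).
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["name", "price"],
}

payload = build_extract_request("https://example.com/product/123", product_schema)

# Serialize to the JSON text that would be sent as the request body.
print(json.dumps(payload, indent=2))
```

A Distill-style request would need only the URL, since its output is Markdown rather than schema-shaped data.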

Question 2

Does it work with JavaScript-heavy sites?

Accepted Answer

Yes. Thunderbit's API has full JavaScript rendering and anti-bot bypass built in. It handles SPAs, dynamic content, and pages that require JS execution to load data.

Question 3

Will extraction break when a site redesigns?

Accepted Answer

No. Thunderbit reads meaning, not DOM structure. Traditional scrapers rely on CSS selectors and XPath expressions, which tend to break when a site's markup changes. Thunderbit's AI understands the semantic content of the page, so extraction keeps working even when the HTML changes underneath.

Question 4

What is the confidence score?

Accepted Answer

The confidence score indicates how certain Thunderbit's AI is about the extracted data. It helps you programmatically decide whether to trust a result or flag it for review.
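As a rough sketch of the "trust or flag for review" decision, the snippet below splits results on a threshold. It assumes each result is a dict with a `confidence` key on a 0 to 1 scale. Both the key name and the scale are assumptions, as are the `0.8` threshold and the sample records, which are made up for illustration.

```python
def triage(results, threshold=0.8):
    """Split results into (accepted, needs_review) by confidence score.

    Assumes each result carries a numeric "confidence" in [0, 1]; this
    field name and scale are hypothetical. Results with no score are
    sent to review rather than silently trusted.
    """
    accepted, needs_review = [], []
    for result in results:
        score = result.get("confidence")
        if score is not None and score >= threshold:
            accepted.append(result)
        else:
            needs_review.append(result)
    return accepted, needs_review


# Made-up sample records, for illustration only.
sample = [
    {"data": {"price": 19.99}, "confidence": 0.97},
    {"data": {"price": 5.00}, "confidence": 0.42},
    {"data": {"price": 12.50}},  # no score reported
]

accepted, needs_review = triage(sample)
print(len(accepted), "accepted;", len(needs_review), "flagged for review")
```

The right threshold depends on how costly a wrong value is for your use case, so treat it as a tunable parameter.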

Question 5

How long do batch jobs take?

Accepted Answer

Batch processing time depends on the number of URLs and how complex the pages are. Distill supports up to 100 URLs per request, and Extract supports up to 50 URLs per request. Most batch jobs complete within minutes.
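Given the per-request limits above, a longer URL list has to be split before submission. The helper below is one simple way to do that. The constant names and the `chunk` function are this sketch's own, and it makes no network calls.

```python
# Per-request URL limits as stated in the answer above.
DISTILL_MAX_URLS = 100
EXTRACT_MAX_URLS = 50


def chunk(urls, size):
    """Split a list of URLs into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [urls[i:i + size] for i in range(0, len(urls), size)]


# 230 placeholder URLs, for illustration only.
urls = [f"https://example.com/page/{i}" for i in range(230)]

distill_batches = chunk(urls, DISTILL_MAX_URLS)
extract_batches = chunk(urls, EXTRACT_MAX_URLS)

print([len(b) for b in distill_batches])  # [100, 100, 30]
print([len(b) for b in extract_batches])  # [50, 50, 50, 50, 30]
```

Each batch can then be submitted as its own request, with the order of URLs preserved across batches.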

AI-Powered Web Scraper API

Zero maintenance. Forever.

Trusted by 100,000+ users worldwide

Up and running in minutes

Two core capabilities

Why use Thunderbit

Use cases

We build Thunderbit on this API

Pricing

FAQ
