Webhooks

Receive batch job completion notifications

For batch jobs longer than a minute, webhooks are cheaper and faster than polling. Thunderbit POSTs to your URL when a job reaches a terminal state.

Configure on submission

{
  "urls": ["https://example.com/page1"],
  "webhook": {
    "url": "https://your-server.com/api/webhook/distill",
    "secret": "whsec_your_secret_key",
    "headers": { "X-Custom-Auth": "your-token" }
  }
}

The webhook.headers field accepts a map of custom headers, but it is reserved for future use and is not yet sent by the server. Plan your verification only against the headers documented below.
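As a minimal sketch, the submission body shown above could be assembled in Python as follows. The helper name build_submission and the THUNDERBIT_WEBHOOK_SECRET environment variable are hypothetical. Only the urls, webhook.url, and webhook.secret fields come from this guide, and webhook.headers is left out because it is not sent yet.

```python
import json
import os


def build_submission(urls, webhook_url, secret):
    """Return a batch submission body with a webhook attached.

    Hypothetical helper; the field names follow the example above.
    """
    return {
        "urls": list(urls),
        "webhook": {"url": webhook_url, "secret": secret},
    }


# Assumption: the secret is kept in an environment variable of your choosing.
secret = os.environ.get("THUNDERBIT_WEBHOOK_SECRET", "whsec_your_secret_key")
body = build_submission(
    ["https://example.com/page1"],
    "https://your-server.com/api/webhook/distill",
    secret,
)
print(json.dumps(body, indent=2))
```

Keeping the secret out of source code makes it easier to rotate without a redeploy.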

Request headers

Every delivery includes the following headers. X-Webhook-Signature is sent only when a secret is configured.

Header                Value
Content-Type          application/json
X-Webhook-Event       batch.completed
X-Webhook-Timestamp   Unix epoch in milliseconds
X-Webhook-Signature   sha256=<base64-encoded HMAC-SHA256>

Payload

{
  "id": "batch_a1b2c3d4...",
  "jobType": "batch_distill",
  "status": "COMPLETED",
  "total": 50,
  "completed": 49,
  "failed": 1,
  "creditsUsed": 100,
  "createdAt": "2026-04-26T10:00:00Z",
  "completedAt": "2026-04-26T10:05:23Z"
}
  • jobType is batch_distill or batch_extract.
  • status is one of COMPLETED, FAILED, or CANCELLED.

Payloads are intentionally small. After receiving the callback, pull the full results via GET /batch/distill/{id} (or GET /batch/extract/{id} for extract jobs).
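One way to pick the right results endpoint is to map jobType to the path, sketched below. The helper name results_path is hypothetical, and the API base URL and authentication are not covered in this section, so the function returns only the path.

```python
from urllib.parse import quote

# Paths as documented above, keyed by the payload's jobType.
RESULT_PATHS = {
    "batch_distill": "/batch/distill/{id}",
    "batch_extract": "/batch/extract/{id}",
}


def results_path(payload: dict) -> str:
    """Return the GET path for the full results of a webhook payload."""
    template = RESULT_PATHS.get(payload.get("jobType"))
    if template is None:
        raise ValueError(f"unknown jobType: {payload.get('jobType')!r}")
    # Percent-encode the id so it is safe to place in a URL path segment.
    return template.format(id=quote(payload["id"], safe=""))


print(results_path({"id": "batch_example", "jobType": "batch_extract"}))
```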

Signature verification

When secret is set, every delivery includes an X-Webhook-Signature header.

  • Algorithm: HMAC-SHA256
  • String to sign: <X-Webhook-Timestamp>.<raw-json-body> (a literal . between the two)
  • Output format: sha256=<base64-encoded-hash> (Base64, not hex)

Compute the HMAC over the raw request bytes exactly as received (do not parse and re-serialize the JSON), then compare the result to the header value with a constant-time comparison.

import base64
import hashlib
import hmac

def verify(raw_body: bytes, timestamp: str, signature: str, secret: str) -> bool:
    # String to sign: "<timestamp>.<raw body>", built from the bytes as received.
    base = timestamp.encode("utf-8") + b"." + raw_body
    digest = hmac.new(secret.encode("utf-8"), base, hashlib.sha256).digest()
    expected = b"sha256=" + base64.b64encode(digest)
    # Compare as bytes in constant time; compare_digest rejects non-ASCII str input.
    return hmac.compare_digest(expected, signature.encode("utf-8"))
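To exercise a verifier locally, it can help to produce signatures with the same scheme. The sketch below follows the string-to-sign and output format described above. The function name sign, the test secret, and the sample body are illustrative assumptions, not values produced by the service. It also shows why the raw bytes matter: re-serializing the JSON changes the bytes and therefore the signature.

```python
import base64
import hashlib
import hmac
import json


def sign(raw_body: bytes, timestamp: str, secret: str) -> str:
    """Return 'sha256=<base64 HMAC-SHA256>' over '<timestamp>.<raw_body>'."""
    message = timestamp.encode("utf-8") + b"." + raw_body
    digest = hmac.new(secret.encode("utf-8"), message, hashlib.sha256).digest()
    return "sha256=" + base64.b64encode(digest).decode("ascii")


# Illustrative values only.
raw = b'{"id":"batch_test","status":"COMPLETED"}'
ts = "1777197923000"
good = sign(raw, ts, "whsec_test")

# json.dumps inserts spaces after ':' and ',', so the bytes no longer match.
reserialized = json.dumps(json.loads(raw)).encode("utf-8")
mismatch = sign(reserialized, ts, "whsec_test")
print(good == mismatch)  # prints False
```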

Replay protection

Reject the request if |now - X-Webhook-Timestamp| > 60000 (60-second window). The timestamp is in milliseconds since the Unix epoch. Never trust an unsigned webhook in production.
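The window check above can be sketched as a small function. The name is_fresh and the injectable now_ms parameter (useful for testing) are assumptions; the 60000 ms tolerance comes from the rule above.

```python
import time
from typing import Optional

TOLERANCE_MS = 60_000  # 60-second window, in milliseconds


def is_fresh(timestamp_header: Optional[str], now_ms: Optional[int] = None) -> bool:
    """Return True if the X-Webhook-Timestamp value is within the window."""
    try:
        sent_ms = int(timestamp_header)
    except (TypeError, ValueError):
        return False  # missing or malformed header
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000  # current time in milliseconds
    return abs(now_ms - sent_ms) <= TOLERANCE_MS


print(is_fresh("1777197923000", now_ms=1777197923000 + 60_001))  # prints False
```

Run this check only after the signature has been verified, since the timestamp is trustworthy only because it is part of the signed string.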

Retry behavior

  • Per-delivery timeout: 30 seconds
  • Max retries: 3
  • Backoff: exponential, starting at 1 second, capped at 10 seconds
  • Total end-to-end window: ~120 seconds

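If all retries fail, the job is still complete on our side, and you can fetch its results with the GET endpoint described above. Because a retry can deliver the same notification more than once, your endpoint must be idempotent (use id as the dedupe key).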
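A minimal sketch of deduplicating on id is shown below. The SeenJobs class and handle_delivery function are hypothetical names, and the in-memory set assumes a single process; a real deployment would more likely rely on a unique key in a shared store.

```python
import threading


class SeenJobs:
    """In-memory record of processed job ids (assumption: single process)."""

    def __init__(self):
        self._ids = set()
        self._lock = threading.Lock()

    def claim(self, job_id: str) -> bool:
        """Return True only for the first caller to claim this id."""
        with self._lock:
            if job_id in self._ids:
                return False
            self._ids.add(job_id)
            return True

    def release(self, job_id: str) -> None:
        """Forget an id so that a later retry can process it."""
        with self._lock:
            self._ids.discard(job_id)


def handle_delivery(payload: dict, seen: SeenJobs, process) -> bool:
    """Run process(payload) at most once per job id.

    Returns True if the payload was processed, False if it was a duplicate.
    """
    job_id = payload["id"]
    if not seen.claim(job_id):
        return False
    try:
        process(payload)
    except Exception:
        seen.release(job_id)  # let a retried delivery try again
        raise
    return True


seen = SeenJobs()
processed = []
delivery = {"id": "batch_example", "status": "COMPLETED"}
first = handle_delivery(delivery, seen, processed.append)
second = handle_delivery(delivery, seen, processed.append)  # simulated retry
print(first, second, len(processed))  # prints: True False 1
```

Releasing the id when processing raises keeps a failed attempt from permanently blocking a later retry of the same job.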