The playground

Try any of 1042 endpoints — live.

Pick an endpoint, load a working example, tweak the params, and send — no signup to try. Results render the way the data deserves; raw JSON, headers & code are one tab away.

Playground demo key · api.reefapi.com
POST /web-extract/v1/crawl · 1 credit

Crawl a site starting from a seed URL (up to 25 pages): follows internal links breadth-first with configurable depth, include/exclude URL patterns, and returns every visited page in the formats you choose. Ideal for content indexing, site audits, and building knowledge bases from documentation or blog sites.
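The traversal described above can be illustrated with a small offline sketch. It is not the service's implementation: `crawl_order` and the in-memory `links` map are hypothetical stand-ins for real page fetches. It also assumes that patterns are tested with `re.search` against the URL path and that the seed itself is always visited.

```python
import re
from collections import deque
from urllib.parse import urlparse


def crawl_order(seed, links, max_pages=10, max_depth=2, include=(), exclude=()):
    """Return the URLs a breadth-first crawl would visit, in visit order.

    `links` is a hypothetical map of URL -> list of outgoing URLs,
    used here instead of fetching pages over the network.
    """
    def allowed(url):
        path = urlparse(url).path
        # Assumption: include patterns act as an allow-list when present.
        if include and not any(re.search(p, path) for p in include):
            return False
        return not any(re.search(p, path) for p in exclude)

    visited = []
    seen = {seed}
    queue = deque([(seed, 0)])  # (url, depth from the seed)
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:
            continue  # do not follow links past the depth limit
        for nxt in links.get(url, []):
            if nxt not in seen and allowed(nxt):
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return visited


site = {
    "https://example.com/": ["https://example.com/docs", "https://example.com/blog"],
    "https://example.com/docs": ["https://example.com/docs/a", "https://example.com/login"],
    "https://example.com/blog": ["https://example.com/blog/1"],
}
print(crawl_order("https://example.com/", site, max_pages=4, exclude=[r"^/login"]))
```

Because the queue is first-in, first-out, every page at depth 1 is visited before any page at depth 2, so a low page cap favors pages close to the seed.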

Parameters

The seed URL to crawl. Full URL or bare domain (https:// assumed). Only http and https are accepted; private, internal, and metadata targets are blocked to prevent SSRF.

Maximum number of pages to crawl in one call (1–25, default 10). Increase for broader site coverage.

Maximum link depth from the seed URL (0–5, default 2).

Only follow links on the seed's registrable domain (default true).

Regex patterns; only URLs whose path matches are crawled.

Regex patterns; URLs whose path matches are skipped.

Which outputs to return (array or comma-separated string). Any of: markdown, text, html (cleaned main content), rawHtml, metadata, links, images, jsonld. Default: markdown and metadata. Unknown values are ignored. For screenshots, use the web-capture engine.

Per-request timeout in seconds (3–60, default 25).
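The parameter rules above can be mirrored client-side before sending a request. The following is a minimal sketch under stated assumptions: `normalize_url`, `normalize_formats`, and `check_range` are hypothetical helper names, format names are treated as case-sensitive, and the address check only catches literal IPs. The real SSRF protection runs on the server and is not reproduced here.

```python
import ipaddress
from urllib.parse import urlparse

# Output names listed in the parameter description above.
KNOWN_FORMATS = ("markdown", "text", "html", "rawHtml",
                 "metadata", "links", "images", "jsonld")


def normalize_url(url):
    """Assume https:// for bare domains; reject other schemes and non-public IP literals."""
    if "://" not in url:
        url = "https://" + url
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"unsupported URL: {url!r}")
    try:
        ip = ipaddress.ip_address(parsed.hostname)
    except ValueError:
        return url  # a hostname, not an IP literal; a full check would also resolve DNS
    if not ip.is_global:
        raise ValueError(f"blocked non-public address: {parsed.hostname}")
    return url


def normalize_formats(formats=None):
    """Accept a list or comma-separated string, drop unknown names, default to markdown + metadata."""
    if formats is None:
        return ["markdown", "metadata"]
    if isinstance(formats, str):
        formats = formats.split(",")
    names = [name.strip() for name in formats]
    return [name for name in names if name in KNOWN_FORMATS]


def check_range(label, value, low, high):
    """Raise if a numeric parameter falls outside its documented range."""
    if not low <= value <= high:
        raise ValueError(f"{label} must be between {low} and {high}, got {value}")
    return value


print(normalize_url("example.com"))
print(normalize_formats("markdown, links, screenshot"))
print(check_range("max_pages", 10, 1, 25))
```

Validating locally gives faster feedback on obvious mistakes, but the server's response remains the authority on what is accepted.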

request preview
curl -X POST https://api.reefapi.com/web-extract/v1/crawl \
  -H "x-api-key: $REEF_KEY" \
  -H "content-type: application/json" \
  -d '{"url":"https://example.com","max_pages":2,"max_depth":1}'
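For callers who prefer Python, the same request can be built with the standard library. This is a sketch rather than an official client: `build_crawl_request` and `send` are hypothetical names, the key is read from the `REEF_KEY` environment variable as in the curl preview, and the limits are sent as JSON numbers on the assumption that the API accepts them. The response is assumed to be JSON; its shape is not documented on this page.

```python
import json
import os
import urllib.request

API_URL = "https://api.reefapi.com/web-extract/v1/crawl"


def build_crawl_request(url, max_pages=10, max_depth=2, api_key=None):
    """Build (but do not send) the POST request from the curl preview."""
    payload = {"url": url, "max_pages": max_pages, "max_depth": max_depth}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-api-key": api_key or os.environ.get("REEF_KEY", ""),
            "content-type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and parse the body as JSON (assumed response format)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


req = build_crawl_request("https://example.com", max_pages=2, max_depth=1)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Calling `send(req)` performs the live request and consumes a credit, so the sketch only prints what would be sent.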
