Change-Detect API
/change-detect/v1/snapshot (1 credit)

Fetch a URL and return a noise-resistant fingerprint: content_hash (normalised main content with rotating tokens stripped), structural_hash (DOM skeleton), text_hash, title, meta_description, main_content (boilerplate-stripped), word_count, and links[]. Store the returned snapshot or hash and pass it to diff or monitor_check later.

| Parameter | Required | Allowed / range | Description |
|---|---|---|---|
| url | required | — | The page to fingerprint. Full URL (https://example.com/page) or bare domain (example.com → https:// assumed). Only http/https; private / internal / cloud-metadata targets are SSRF-blocked. Alias: target. |
| selector | optional | — | Scope detection to a region instead of the whole page. Accepts a FULL CSS selector ('div.price-box > span.amount', '.product:nth-child(2)', '[data-testid=total]') OR an XPath ('//div[@id="price"]', '//table//tr[2]/td[3]') — lxml-backed. If the selector matches nothing the whole page is used and `selector_matched:false` is reported. |
| follow_redirects = true | optional | — | Follow redirects (http→https, apex→www) before fingerprinting the final URL (default true). Each hop is independently SSRF-validated. |
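
A minimal Python sketch of preparing a snapshot call, assuming the endpoint accepts a JSON POST body with the `x-api-key` header, as the curl example on this page does. The helper names `snapshot_payload` and `build_request` are illustrative and not part of the API; the request is only constructed here, not sent.

```python
import json
import urllib.request

# Host and path prefix taken from the curl example on this page.
BASE_URL = "https://api.reefapi.com/change-detect/v1"


def snapshot_payload(url, selector=None, follow_redirects=True):
    """Build the JSON body for /snapshot; optional fields are omitted when unset."""
    payload = {"url": url}
    if selector is not None:
        payload["selector"] = selector
    if not follow_redirects:
        payload["follow_redirects"] = False  # true is the documented default
    return payload


def build_request(endpoint, payload, api_key):
    """Prepare (but do not send) a POST request for one endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "content-type": "application/json"},
        method="POST",
    )


# Bare domain plus an XPath selector scoped to a price element.
req = build_request(
    "snapshot",
    snapshot_payload("example.com", selector="//div[@id='price']"),
    "YOUR_KEY",  # placeholder, not a real key
)
```

To send it, pass `req` to `urllib.request.urlopen` and decode the response body as JSON.
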
/change-detect/v1/diff (1 credit)

Diff a page against a prior baseline (a snapshot object, a content_hash, or raw text), or compare two live URLs (url + url2). Returns changed (bool), change_ratio (0..1), change_type (text, structural, both, or none), added[]/removed[]/modified[] blocks, and a human-readable summary, all computed on the noise-resistant normalised content.

| Parameter | Required | Allowed / range | Description |
|---|---|---|---|
| url | required | — | The page to fingerprint. Full URL (https://example.com/page) or bare domain (example.com → https:// assumed). Only http/https; private / internal / cloud-metadata targets are SSRF-blocked. Alias: target. |
| baseline | optional | — | The prior state to diff against: a snapshot object returned by a previous `snapshot` call, OR a 64-hex content_hash, OR raw prior text/HTML. (diff needs `baseline` OR `url2`.) |
| url2 | optional | — | Second URL — when comparing two live pages (A vs B) instead of a page against a stored baseline. SSRF-guarded like `url`. |
| selector | optional | — | Scope detection to a region instead of the whole page. Accepts a FULL CSS selector ('div.price-box > span.amount', '.product:nth-child(2)', '[data-testid=total]') OR an XPath ('//div[@id="price"]', '//table//tr[2]/td[3]') — lxml-backed. If the selector matches nothing the whole page is used and `selector_matched:false` is reported. |
| mode = text | optional | text · structural · full · links | What to compare: text (main content, default), structural (DOM skeleton), full (both), or links (outbound link set). |
| follow_redirects = true | optional | — | Follow redirects (http→https, apex→www) before fingerprinting the final URL (default true). Each hop is independently SSRF-validated. |
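
The two ways of calling diff (page against a stored baseline, or two live URLs) can be sketched as a small payload builder. This is an illustration under assumptions: `diff_payload` is a hypothetical helper, and the check only enforces what the table states (at least one of `baseline` or `url2`, and a `mode` from the allowed set). Whether the service accepts both together is not stated, so the sketch does not forbid it.

```python
ALLOWED_MODES = {"text", "structural", "full", "links"}


def diff_payload(url, baseline=None, url2=None, selector=None, mode="text"):
    """Build the JSON body for /diff, enforcing the documented constraints."""
    if baseline is None and url2 is None:
        raise ValueError("diff needs either baseline or url2")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")
    payload = {"url": url, "mode": mode}
    if baseline is not None:
        payload["baseline"] = baseline  # snapshot object, 64-hex hash, or raw text
    if url2 is not None:
        payload["url2"] = url2
    if selector is not None:
        payload["selector"] = selector
    return payload


# Page against a stored content_hash (placeholder value, not a real hash).
stored_hash = "ab" * 32
against_baseline = diff_payload("https://example.com/pricing", baseline=stored_hash)

# Two live pages compared on their outbound link sets.
a_vs_b = diff_payload("https://example.com/a", url2="https://example.com/b", mode="links")
```
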
/change-detect/v1/monitor_check (1 credit)

The stateless half of monitoring: re-fetch the URL and report quickly whether it has changed since a known hash. Pass baseline_hash, plus an optional selector and mode. Returns changed (bool), current_hash, and a short diff_summary. This is the cheap "did it change?" path.

| Parameter | Required | Allowed / range | Description |
|---|---|---|---|
| url | required | — | The page to fingerprint. Full URL (https://example.com/page) or bare domain (example.com → https:// assumed). Only http/https; private / internal / cloud-metadata targets are SSRF-blocked. Alias: target. |
| baseline_hash | required | — | The content_hash (or text_hash) from a prior snapshot. monitor_check re-fetches the URL and tells you, fast, whether it changed since this hash. 64-hex. |
| selector | optional | — | Scope detection to a region instead of the whole page. Accepts a FULL CSS selector ('div.price-box > span.amount', '.product:nth-child(2)', '[data-testid=total]') OR an XPath ('//div[@id="price"]', '//table//tr[2]/td[3]') — lxml-backed. If the selector matches nothing the whole page is used and `selector_matched:false` is reported. |
| mode = text | optional | text · structural · full · links | What to compare: text (main content, default), structural (DOM skeleton), full (both), or links (outbound link set). |
| follow_redirects = true | optional | — | Follow redirects (http→https, apex→www) before fingerprinting the final URL (default true). Each hop is independently SSRF-validated. |
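
Because monitor_check is stateless, the caller keeps the hash between runs. A rough sketch of that bookkeeping follows; `monitor_check_payload` and `next_baseline` are hypothetical helpers, the sample result is hand-written to mirror the documented fields, and accepting either letter case in the hash is an assumption.

```python
import re

# Assumption: the service accepts hex digits in either letter case.
HEX64 = re.compile(r"[0-9a-fA-F]{64}")


def monitor_check_payload(url, baseline_hash, selector=None, mode="text"):
    """Build the JSON body for /monitor_check after a local sanity check on the hash."""
    if not HEX64.fullmatch(baseline_hash):
        raise ValueError("baseline_hash must be 64 hexadecimal characters")
    payload = {"url": url, "baseline_hash": baseline_hash, "mode": mode}
    if selector is not None:
        payload["selector"] = selector
    return payload


def next_baseline(baseline_hash, data):
    """Return the hash to store for the next check, given the call's result object."""
    if data["changed"]:
        return data["current_hash"]
    return baseline_hash


old = "0" * 64
payload = monitor_check_payload("https://example.com/pricing", old)

# Hand-written result shaped like the documented fields, not real API output.
sample = {"changed": True, "current_hash": "f" * 64, "diff_summary": "price changed"}
new = next_baseline(old, sample)
```

Storing `new` after each run means the next call reports only changes made since the most recent check.
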
/change-detect/v1/extract (1 credit)

Main-content extraction only: boilerplate (nav, header, footer, ads, cookie notices, comments) is stripped, leaving clean text, structured blocks, and links. Useful for feeding clean input to diffs or for getting just the article. No fingerprint comparison is performed.

| Parameter | Required | Allowed / range | Description |
|---|---|---|---|
| url | required | — | The page to fingerprint. Full URL (https://example.com/page) or bare domain (example.com → https:// assumed). Only http/https; private / internal / cloud-metadata targets are SSRF-blocked. Alias: target. |
| selector | optional | — | Scope detection to a region instead of the whole page. Accepts a FULL CSS selector ('div.price-box > span.amount', '.product:nth-child(2)', '[data-testid=total]') OR an XPath ('//div[@id="price"]', '//table//tr[2]/td[3]') — lxml-backed. If the selector matches nothing the whole page is used and `selector_matched:false` is reported. |
| follow_redirects = true | optional | — | Follow redirects (http→https, apex→www) before fingerprinting the final URL (default true). Each hop is independently SSRF-validated. |
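
The selector fallback described in the table above is easy to miss: a selector that matches nothing silently widens the result to the whole page. A small guard might look like the following, assuming `selector_matched` appears at the top level of the result object and may be absent when no selector was sent. `scoped_or_warn` is a hypothetical helper name.

```python
import warnings


def scoped_or_warn(data, selector):
    """Return True when the result is scoped to `selector`, False on whole-page fallback."""
    # Assumption: a missing selector_matched key means no fallback happened.
    if selector is not None and data.get("selector_matched") is False:
        warnings.warn(f"selector {selector!r} matched nothing; whole page was used")
        return False
    return True


# Hand-written result objects for illustration only.
fallback = {"selector_matched": False}
scoped = {"selector_matched": True}
```

Treating a fallback as a warning rather than an error keeps monitoring alive when a page redesign renames a class, while still flagging that the comparison is no longer scoped.
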
/change-detect/v1/batch (1 credit)

Snapshot up to 20 URLs concurrently in one call (each independently SSRF-guarded, with shared concurrency). Returns a compact fingerprint per URL.

| Parameter | Required | Allowed / range | Description |
|---|---|---|---|
| urls | required | — | List of URLs to snapshot in one call (max 20); each is independently SSRF-guarded and fetched concurrently. |
| selector | optional | — | Scope detection to a region instead of the whole page. Accepts a FULL CSS selector ('div.price-box > span.amount', '.product:nth-child(2)', '[data-testid=total]') OR an XPath ('//div[@id="price"]', '//table//tr[2]/td[3]') — lxml-backed. If the selector matches nothing the whole page is used and `selector_matched:false` is reported. |
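
Given the 20-URL limit per call, a longer list has to be split client-side. One way to do that is sketched below; `batch_payloads` is an illustrative helper, not part of the API.

```python
MAX_BATCH = 20  # documented per-call limit


def batch_payloads(urls, selector=None):
    """Split `urls` into /batch request bodies of at most MAX_BATCH URLs each."""
    payloads = []
    for start in range(0, len(urls), MAX_BATCH):
        payload = {"urls": urls[start:start + MAX_BATCH]}
        if selector is not None:
            payload["selector"] = selector
        payloads.append(payload)
    return payloads


urls = [f"https://example.com/page/{i}" for i in range(45)]
payloads = batch_payloads(urls)
print([len(p["urls"]) for p in payloads])  # [20, 20, 5]
```
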

Example request:

curl -X POST https://api.reefapi.com/change-detect/v1/snapshot \
  -H "x-api-key: $REEF_KEY" \
  -H "content-type: application/json" \
  -d '{"url":"https://example.com"}'

Example response:

{
"ok": true,
"data": { /* the result */ },
"meta": {
"latency_ms": 240,
"record_count": 12,
"completeness_pct": 100
},
"error": null
}
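
Every endpoint appears to share this envelope, so unwrapping it once is convenient. The sketch below assumes `ok` is false on failure and treats `error` as opaque, since its shape is not documented here; `ChangeDetectError` and `unwrap` are hypothetical names.

```python
import json


class ChangeDetectError(Exception):
    """Raised when the envelope reports failure (hypothetical helper, not part of the API)."""


def unwrap(envelope):
    """Return (data, meta) from a response envelope, or raise if ok is false."""
    if not envelope.get("ok"):
        # The shape of `error` is not documented here, so it is passed through as-is.
        raise ChangeDetectError(envelope.get("error"))
    return envelope["data"], envelope.get("meta", {})


# Envelope shaped like the example above, with a placeholder result object.
raw = (
    '{"ok": true, "data": {"changed": false}, '
    '"meta": {"latency_ms": 240, "record_count": 12, "completeness_pct": 100}, '
    '"error": null}'
)
data, meta = unwrap(json.loads(raw))
```
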