# ScrAPI

> Privacy-first web scraping API with browser automation, captcha solving, proxy rotation, and AI integration

https://scrapi.tech

## Overview

ScrAPI is a privacy-first, cloud-based web scraping API designed for developers who need reliable, scalable data extraction from any website. It handles the hard parts of web scraping (bot detection bypass, CAPTCHA solving, JavaScript rendering, IP rotation, and geotargeting) so you can focus on your data. ScrAPI does not log, store, or sell any data extracted through the service.

## API Endpoint

Primary endpoint: `https://api.scrapi.tech/v1/scrape`

- Supports GET (basic scraping) and POST (advanced options including browser commands)
- Authentication: API key via `apiKey` URL parameter or `X-API-KEY` header
- OpenAPI specification: https://api.scrapi.tech/scalar/v1
- Interactive playground: https://scrapi.tech/playground

## Core Capabilities

### Scraping Features

- **Universal website scraping**: Extract data from any URL (static or dynamic)
- **Real browser rendering**: Optional headless browser with full JavaScript execution for SPAs, React, Vue, Angular sites
- **Multiple output formats**: JSON (default), raw HTML, or clean Markdown responses; Markdown is ideal for LLM and RAG pipeline consumption
- **Response selectors**: Extract specific content using CSS selectors or XPath queries to reduce payload size
- **Unlimited concurrency & bandwidth**: No arbitrary limits on paid plans
- **Automatic retries**: Smart retry logic on transient failures

### Browser Automation Commands

- **Click elements** (`{ "click": "#buttonId" }`): Automate button clicks and navigation
- **Fill forms** (`{ "input": { "input[name='email']": "value" } }`): Enter text with human-like typing
- **Select dropdowns** (`{ "select": { "select[name='country']": "USA" } }`): Choose dropdown options
- **Scroll pages** (`{ "scroll": 1000 }`): Scroll to trigger lazy-loaded content
- **Wait for elements** (`{ "waitfor": "#results" }`): Delay until specific elements appear
- **Execute JavaScript** (`{ "javascript": "console.log('hello')" }`): Run arbitrary JS in browser context
- **Timed waits** (`{ "wait": 5000 }`): Pause execution for the given number of milliseconds (max 15000)
- **Human-like behavior**: All commands execute with realistic mouse movements, variable typing speed, and natural delays
- **Complex workflows**: Chain multiple commands for login flows, multi-step processes, cookie consent, pagination

### Anti-Detection & Privacy

- **Advanced bot detection bypass**: State-of-the-art techniques to defeat anti-bot systems
- **Automatic CAPTCHA solving**: Supports reCAPTCHA v2/v3 (click, invisible, enterprise), hCaptcha (normal, invisible), and Cloudflare (Turnstile, challenge pages)
- **Ads and trackers disabled**: Removes ads and web trackers from browser sessions
- **No data logging**: Extracted content is never stored, inspected, or sold
- **Anti-fingerprinting**: Realistic browser fingerprints, viewports, and user agents

### Proxy Options

- **Free proxy**: Random anonymous proxies for testing (0 extra credits)
- **Data center proxy**: Fast, reliable proxy pool (5 extra credits per request)
- **Residential proxy**: Premium proxies with widest geolocation coverage (10 extra credits per request)
- **Tor proxy**: Access to .onion hidden services and maximum anonymity (1 extra credit per request)
- **Custom proxy**: Use your own proxy infrastructure via `customProxyUrl` (0 extra credits)
- **Geotargeting**: Route requests through specific countries (ISO 3166-1 alpha-3 codes) and cities using `proxyCountry` and `proxyCity`
- **Automatic IP rotation**: Built-in rotation with 10-minute persistence when using sessions

### Visual Capture

- **Screenshots**: Capture full-page PNG screenshots of the final rendered page (2 extra credits)
- **PDF generation**: Convert any web page to a downloadable PDF document (2 extra credits)
- **Video recording**: Record the full browser session as a WEBM video (3 extra credits)
- All capture files are stored temporarily, with download URLs provided in the response

### Session Management

- **Persistent sessions**: Use `sessionId` to maintain the same IP address, user agent, and cookies across multiple requests
- **Cookie forwarding**: Response cookies are returned and can be reused; sessions handle this automatically
- **Custom cookies and headers**: Set request-specific authentication tokens, session IDs, or site configuration

### Integration Options

- **RESTful API**: Standard HTTP interface for any programming language
- **Webhooks/callbacks**: Async scraping with results POSTed to your endpoint; includes status polling via reference ID
- **Official .NET SDK**: Available via NuGet (`dotnet add package ScrAPI`)
- **Auto-generated SDKs**: Go, Java, PHP, Python, Ruby, TypeScript, Dart (via Microsoft Kiota from OpenAPI spec)
- **MCP Server**: Model Context Protocol server for AI agent integration, available via Docker, NPX, or cloud (SSE/HTTP)
- **Interactive playground**: Test requests and generate code samples at https://scrapi.tech/playground

## Request Parameters

**Required:**

- `url` (string): Target website URL

**Optional:**

- `useBrowser` (boolean): Enable headless browser for JavaScript rendering (5 credits)
- `solveCaptchas` (boolean): Automatically detect and solve CAPTCHAs (30 credits per captcha; enables browser)
- `includeScreenshot` (boolean): Capture PNG screenshot of final page (2 credits; enables browser)
- `includePdf` (boolean): Generate PDF of final page (2 credits; enables browser)
- `includeVideo` (boolean): Record WEBM video of browser session (3 credits; enables browser)
- `proxyType` (string): Proxy type, one of `Free`, `DataCenter`, `Residential`, `Tor`
- `proxyCountry` (string): ISO 3166-1 alpha-3 country code for geotargeting (requires `proxyType`)
- `proxyCity` (string): City name for fine-grained geotargeting
- `customProxyUrl` (string): Your own proxy URL (`protocol://username:password@host:port`)
- `requestMethod` (string): HTTP method, one of GET (default), POST, PUT, DELETE, HEAD, PATCH (non-browser only)
- `responseFormat` (string): Response format, one of `JSON` (default), `HTML`, or `Markdown`
- `responseSelector` (string): CSS selector or XPath query to extract specific content
- `callbackUrl` (string): Webhook URL to receive async results via POST
- `cookies` (object/string): Custom cookies as key/value pairs
- `headers` (object/string): Custom HTTP headers as key/value pairs
- `sessionId` (string): Session identifier for persistent state across requests
- `acceptDialogs` (boolean): Accept popup dialogs instead of cancelling (default: false)
- `browserCommands` (array): Browser automation commands (POST method only)

## Credit System

ScrAPI uses credit-based billing. Credits are additive per request. No credits are consumed on failed requests.

| Feature | Credits |
|---|---|
| HTTP client (no browser) | 1 |
| Real browser | 5 |
| Screenshot | 2 |
| PDF generation | 2 |
| Video recording | 3 |
| Data center proxy | 5 |
| Residential proxy | 10 |
| Tor proxy | 1 |
| Captcha solved | 30 per captcha |
| Free proxy | 0 |
| Custom proxy | 0 |

## Use Cases

- Scraping sites with anti-bot protection and bot detection
- Extracting data from JavaScript-heavy single-page applications (SPAs)
- Price monitoring and competitive intelligence across regions
- RAG (Retrieval-Augmented Generation) pipeline content ingestion
- AI agent web research via MCP Server integration
- Content aggregation and summarization using Markdown output
- Form automation and login workflows
- Market research and data analytics
- Visual regression monitoring with screenshots
- Report generation and archival with PDF capture
- Multi-step web crawling with session persistence
- Ad verification and geotargeted content validation
- Dark web research via Tor proxy access

## Pricing

- **Free/Test Key**: 1 concurrent request, 20 requests/day
- **Pay-as-you-go**: Credits never expire, flexible for small projects
- **Subscription plans**: Regular usage with unlimited concurrency
- **Enterprise**: Custom solutions with dedicated support

## Documentation Resources

- Main documentation: https://scrapi.tech/docs
- Getting started guide: https://scrapi.tech/docs/api_details/v1_scrape
- Credit usage and pricing: https://scrapi.tech/docs/credits
- Real browser rendering: https://scrapi.tech/docs/api_details/v1_scrape/use_browser
- Browser commands: https://scrapi.tech/docs/api_details/v1_scrape/browser_commands
- Captcha solving: https://scrapi.tech/docs/api_details/v1_scrape/solve_captchas
- Proxy options: https://scrapi.tech/docs/api_details/v1_scrape/free_proxy
- Geotargeting: https://scrapi.tech/docs/api_details/v1_scrape/geotargeting
- Response formats: https://scrapi.tech/docs/api_details/v1_scrape/html_markdown_response
- Response selectors: https://scrapi.tech/docs/api_details/v1_scrape/response_selector
- Screenshots, PDF, video: https://scrapi.tech/docs/api_details/v1_scrape/screenshot
- Cookies and headers: https://scrapi.tech/docs/api_details/v1_scrape/cookies_headers
- Webhooks/callbacks: https://scrapi.tech/docs/api_details/v1_scrape/callback_webhook
- Sessions: https://scrapi.tech/docs/api_details/v1_scrape/session_id
- MCP Server: https://scrapi.tech/docs/mcp_server
- SDK / API clients: https://scrapi.tech/docs/sdk_client
- Available countries: https://scrapi.tech/docs/api_details/available_countries
- Available cities: https://scrapi.tech/docs/api_details/available_cities
- Credit balance API: https://scrapi.tech/docs/api_details/credit_balance
- OpenAPI reference: https://api.scrapi.tech/scalar/v1

## Support

Contact: hello@scrapi.tech

Website: https://scrapi.tech
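
## Example: Building a Request and Estimating Credits

The request parameters and credit table described in this document can be combined into a rough sketch like the one below. It is an illustration and does not use any official client. The helper names `estimate_credits` and `build_request` are hypothetical, and so are two assumptions: that the POST endpoint accepts the options as a JSON body, and that the browser cost (5) replaces the plain HTTP-client cost (1) rather than adding to it. The request is only constructed, not sent, and the OpenAPI specification remains the authority on the actual request shape.

```python
import json
import urllib.request

API_URL = "https://api.scrapi.tech/v1/scrape"  # primary endpoint named in this document

# Extra credits per proxy type, copied from the credit table.
PROXY_CREDITS = {"Free": 0, "DataCenter": 5, "Residential": 10, "Tor": 1}

# Options that the parameter list says also enable the browser.
BROWSER_IMPLYING = ("solveCaptchas", "includeScreenshot", "includePdf", "includeVideo")


def estimate_credits(options, captchas_solved=0):
    """Estimate credits for one successful request (assumed additive model)."""
    uses_browser = options.get("useBrowser", False) or any(
        options.get(name, False) for name in BROWSER_IMPLYING
    )
    # Assumption: the browser cost replaces the HTTP-client cost.
    total = 5 if uses_browser else 1
    if options.get("includeScreenshot"):
        total += 2
    if options.get("includePdf"):
        total += 2
    if options.get("includeVideo"):
        total += 3
    # A custom proxy costs 0 extra credits, so only proxyType matters here.
    total += PROXY_CREDITS.get(options.get("proxyType"), 0)
    if options.get("solveCaptchas"):
        total += 30 * captchas_solved
    return total


def build_request(api_key, options):
    """Build (but do not send) a POST request; the JSON body is an assumption."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(options).encode("utf-8"),
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


options = {
    "url": "https://example.com",
    "useBrowser": True,
    "includeScreenshot": True,
    "proxyType": "Residential",
    "browserCommands": [{"click": "#buttonId"}, {"waitfor": "#results"}],
}
request = build_request("YOUR_API_KEY", options)
print(request.get_method(), request.full_url)
print(estimate_credits(options))  # 5 (browser) + 2 (screenshot) + 10 (residential) = 17
```

Passing the prepared object to `urllib.request.urlopen(request)` would send it. Because captcha credits depend on how many captchas are actually solved, `captchas_solved` is left as an explicit argument instead of being guessed.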