ScrapeOps Proxy API in n8n

The Proxy API is ScrapeOps' flagship service, providing intelligent proxy aggregation for reliable web scraping. It automatically rotates each request across multiple proxy providers to maximize the chance of a successful response.

Full Documentation

For complete Proxy API documentation, see the Proxy API Aggregator docs.

Basic Configuration

Setting Up a Proxy API Request

  1. Add a ScrapeOps node to your workflow
  2. Select Proxy API as the API type
  3. Configure the basic parameters:
Parameter     Required   Description
URL           Yes        The target URL to scrape
Method        Yes        HTTP method (GET or POST)
Return Type   No         Response format (Default or JSON)

Simple Example

Here's a basic configuration to scrape a webpage:

API Type: Proxy API
URL: https://example.com
Method: GET
Return Type: Default
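
Under the hood, the node calls the ScrapeOps proxy endpoint with your API key and the target URL. As a rough sketch (endpoint and parameter names follow the public ScrapeOps docs; YOUR_API_KEY is a placeholder for the key stored in your n8n credentials), the configuration above is equivalent to:

// Rough sketch of the request the node makes for the configuration above.
const params = new URLSearchParams({
  api_key: "YOUR_API_KEY",    // placeholder for your ScrapeOps key
  url: "https://example.com", // the target URL
});

const response = await fetch(`https://proxy.scrapeops.io/v1/?${params}`);
const html = await response.text(); // Default return type: raw HTML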

Advanced Options

The Proxy API offers extensive customization through advanced options. All options are configured in the n8n node's Advanced Options section:

Complete Advanced Options Reference

Option                Type     Description                          Default   Example Values
Follow Redirects      Boolean  Follow HTTP redirects                true      true, false
Keep Headers          Boolean  Use your custom headers              false     true, false
Initial Status Code   Boolean  Return initial status code           false     true, false
Final Status Code     Boolean  Return final status code             false     true, false
Optimize Request      Boolean  Auto-optimize settings               false     true, false
Max Request Cost      Number   Max credits to use (with optimize)   0         10, 50, 100
Render JavaScript     Boolean  Enable headless browser              false     true, false
Wait Time             Number   Wait before capture (ms)             0         3000, 5000
Wait For              String   CSS selector to wait for             -         .product-title, #content
Scroll                Number   Scroll pixels before capture         0         1000, 2000
Screenshot            Boolean  Return base64 screenshot             false     true, false
Device Type           String   Device emulation                     desktop   desktop, mobile
Premium Proxies       String   Premium level                        level_1   level_1, level_2
Residential Proxies   Boolean  Use residential IPs                  false     true, false
Mobile Proxies        Boolean  Use mobile IPs                       false     true, false
Session Number        Number   Sticky session ID                    0         12345, 67890
Country               String   Geo-targeting country                -         us, gb, de, fr, ca, au, jp, in
Bypass                String   Anti-bot bypass level                -         cloudflare_level_1 to cloudflare_level_3, datadome, perimeterx, incapsula, generic_level_1 to generic_level_4
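
Each option maps to a query parameter on the underlying Proxy API request, which the node assembles for you. As an illustrative sketch (parameter names such as render_js, wait_for, country, and residential follow the ScrapeOps query-string convention):

// Sketch: how a few advanced options translate to Proxy API query parameters.
const params = new URLSearchParams({
  api_key: "YOUR_API_KEY",      // placeholder
  url: "https://example.com/products",
  render_js: "true",            // Render JavaScript
  wait_for: ".product-title",   // Wait For
  country: "us",                // Country
  residential: "true",          // Residential Proxies
});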

Common Configuration Examples

JavaScript Rendering for Dynamic Sites:

Render JavaScript: true
Wait Time: 3000
Wait For: .product-title

For detailed JavaScript rendering options, see Headless Browser and Wait For docs.

Geo-Targeting:

Country: us

For detailed country codes and geo-targeting options, see Country Geo-targeting docs.

Anti-Bot Bypass:

Bypass: cloudflare_level_1

For complete anti-bot bypass options, see Anti-Bot Bypass docs.

Residential Proxies:

Residential Proxies: true
Premium Proxies: level_2

For more proxy types and options, see Residential Proxies and Premium Proxies docs.


Custom Headers & Cookies

Configure custom headers (enable Keep Headers so your headers are forwarded to the target site):

{
  "User-Agent": "Custom Bot 1.0",
  "Accept-Language": "en-US",
  "X-Custom-Header": "value"
}

Configure custom cookies:

{
  "session_id": "abc123",
  "preferences": "dark_mode"
}
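
When calling the Proxy API directly, these travel as ordinary request headers once Keep Headers is enabled, with cookies in a standard Cookie header. A minimal sketch (keep_headers follows the ScrapeOps query-string convention):

// Sketch: forwarding custom headers and cookies on a direct API call.
const params = new URLSearchParams({
  api_key: "YOUR_API_KEY", // placeholder
  url: "https://example.com",
  keep_headers: "true",    // required so your own headers are used
});

await fetch(`https://proxy.scrapeops.io/v1/?${params}`, {
  headers: {
    "User-Agent": "Custom Bot 1.0",
    "Accept-Language": "en-US",
    "X-Custom-Header": "value",
    "Cookie": "session_id=abc123; preferences=dark_mode",
  },
});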

JavaScript Scenarios

Execute browser actions before returning response:

[
  {"action": "click", "selector": ".load-more"},
  {"action": "wait", "timeout": 2000},
  {"action": "scroll", "pixels": 500},
  {"action": "screenshot", "fullpage": true}
]

Working with POST Requests

For POST requests, the node uses the input data from the previous node:

  1. Connect a node that outputs JSON data
  2. Configure ScrapeOps node:
    • Method: POST
    • The JSON from the previous node becomes the request body

Example workflow:

[Set Node] → [ScrapeOps Node]
outputs {data}   sends POST with {data}
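
For example, assuming the Set node outputs {"query": "laptops"} (a hypothetical payload), the request the node issues looks roughly like this:

// Sketch: the POST the node makes for the workflow above, assuming the
// Set node output {"query": "laptops"} (hypothetical payload).
const params = new URLSearchParams({
  api_key: "YOUR_API_KEY", // placeholder
  url: "https://example.com/search",
});

await fetch(`https://proxy.scrapeops.io/v1/?${params}`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ query: "laptops" }), // the previous node's JSON
});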

Response Handling

Default Response

Returns raw HTML or file content:

<html>
  <head>...</head>
  <body>...</body>
</html>

JSON Response

When Return Type: JSON is selected:

{
  "status": 200,
  "headers": {...},
  "body": "<html>...",
  "url": "https://example.com",
  "screenshot": "base64_data"  // included if Screenshot is enabled
}
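
A Code node placed after the ScrapeOps node can then pull out the fields you need. A minimal sketch, assuming Return Type: JSON and the response shape above:

// n8n Code node sketch: extract key fields from the JSON response.
const resp = $input.first().json;

return [{
  json: {
    ok: resp.status === 200,
    finalUrl: resp.url,
    bodyLength: (resp.body ?? "").length,
    hasScreenshot: Boolean(resp.screenshot),
  },
}];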

Error Handling

Common status codes and their meanings:

Status   Meaning          Action
200      Success          Process response
404      Not Found        Valid response (charged)
400      Bad Request      Check parameters
401      Invalid API Key  Check credentials
429      Rate Limited     Reduce request rate
500      Proxy Error      Retry or contact support
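
In a workflow you can branch on these codes with an IF node, or fail fast in a Code node. A minimal sketch mirroring the table above (assumes Return Type: JSON so the status field is available):

// n8n Code node sketch: stop on non-retryable errors, pass the rest through.
const { status } = $input.first().json;

if (status === 400) throw new Error("Bad request - check parameters");
if (status === 401) throw new Error("Invalid API key - check credentials");
if (status === 429) throw new Error("Rate limited - reduce request rate");

return $input.all(); // 200 (and charged 404) responses continue downstream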

Best Practices

1. Start Simple

Begin with basic requests and add complexity:

1. Basic GET request
2. Add JavaScript rendering if needed
3. Add wait conditions
4. Enable anti-bot bypass if blocked

2. Optimize for Cost

  • Enable Optimize Request for automatic cost optimization
  • Set Max Request Cost to cap the credits spent per request
  • Only enable the features you actually need

3. Handle Dynamic Content

For JavaScript-heavy sites:

Render JavaScript: true
Wait Time: 3000
Wait For: .main-content
Scroll: 1000

4. Session Management

For multi-step scraping:

Session Number: 12345
// Use same number across requests
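
For instance, a login request and a follow-up account-page request can share one proxy IP by reusing the same session number. A sketch against the underlying API (target URLs are hypothetical):

// Sketch: two sequential requests pinned to the same proxy via one session number.
const base = new URLSearchParams({
  api_key: "YOUR_API_KEY", // placeholder
  session_number: "12345", // reuse the same value across related calls
});

for (const target of ["https://example.com/login", "https://example.com/account"]) {
  base.set("url", target);
  await fetch(`https://proxy.scrapeops.io/v1/?${base}`); // same exit IP
}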

Integration Examples

E-commerce Price Monitoring

URL: https://amazon.com/dp/B08N5WRWNW
Render JavaScript: true
Wait For: .price-block
Country: us
Return Type: JSON

News Article Scraping

URL: https://news-site.com/article
Follow Redirects: true
Device Type: mobile
Premium Proxies: level_2

Form Submission

Method: POST
URL: https://example.com/search
Custom Headers: {"Content-Type": "application/json"}
// Input data from previous node

Debugging Tips

1. Enable Transparent Status Codes

Initial Status Code: true
Final Status Code: true

2. Test with Simple Sites First

Use http://httpbin.org for testing:

  • /headers - See request headers
  • /ip - Check proxy IP
  • /user-agent - Verify user agent
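
As a quick smoke test, fetching httpbin's /ip endpoint through the proxy should report an IP other than your own. A sketch against the underlying API:

// Sketch: confirm requests are routed through a proxy by checking the origin IP.
const params = new URLSearchParams({
  api_key: "YOUR_API_KEY", // placeholder
  url: "http://httpbin.org/ip",
});

const res = await fetch(`https://proxy.scrapeops.io/v1/?${params}`);
console.log(await res.text()); // {"origin": "<proxy IP>"} - should not be your own IP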

3. Check Response Headers

Look for clues in response headers:

  • X-Proxy-Provider - Which proxy was used
  • X-Request-Cost - Credits consumed
  • X-Status-Code - Original status code

Common Issues and Solutions

Issue: JavaScript Not Loading

Solution: Increase Wait Time or use a Wait For selector


Issue: Getting Blocked

Solution: Enable appropriate bypass level or use residential proxies


Issue: Inconsistent Data

Solution: Use session numbers for sticky sessions


Issue: High Credit Usage

Solution: Enable Optimize Request and set Max Request Cost


Next Steps

Need help? Check the ScrapeOps Dashboard for detailed request logs and debugging information.