ScrapeOps n8n Examples
Learn how to build powerful web scraping workflows with real-world examples. Each example includes step-by-step instructions and can be imported directly into your n8n instance.
Quick Start Examples
Example 1: Simple Web Page Scraper
Goal: Scrape a basic web page and save to Google Sheets
Workflow:
[Manual Trigger] → [ScrapeOps Proxy] → [Google Sheets]
Configuration:
1. ScrapeOps Node (Proxy API)
   - URL: https://example.com/products
   - Method: GET
   - Return Type: Default
2. Google Sheets Node
   - Operation: Append
   - Sheet: Web Scraping Results
   - Data: {{ $json }}
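For reference, the same request can be made directly against the ScrapeOps Proxy API from an n8n Code node. This is a minimal sketch, assuming the Code node's built-in this.helpers.httpRequest helper and a placeholder YOUR_API_KEY; check the ScrapeOps Proxy API docs for the full parameter list:

// Direct call to the ScrapeOps Proxy API - equivalent to the ScrapeOps node above
const response = await this.helpers.httpRequest({
  method: 'GET',
  url: 'https://proxy.scrapeops.io/v1/',
  qs: {
    api_key: 'YOUR_API_KEY',               // placeholder - your ScrapeOps API key
    url: 'https://example.com/products'    // the page to scrape
  }
});

// The proxied page HTML, ready to append to Google Sheets
return { body: response };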
Example 2: Amazon Price Tracker
Goal: Monitor Amazon product prices daily
Workflow:
[Schedule Trigger] → [ScrapeOps Data API] → [Compare Prices] → [Email Alert]
Configuration:
1. Schedule Trigger
   - Interval: Daily at 9 AM
2. ScrapeOps Node (Data API)
   - Domain: Amazon
   - API Type: Product API
   - Input Type: ASIN
   - ASIN: B08N5WRWNW
3. Function Node (Compare Prices)

   const currentPrice = $json.data.price;
   const targetPrice = 40.00;
   return {
     priceDropped: currentPrice < targetPrice,
     currentPrice: currentPrice,
     savings: targetPrice - currentPrice
   };
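A follow-up Code node can turn that output into the email alert. This is a minimal sketch reusing the field names produced by Compare Prices above; the subject and body wording are assumptions, and an IF node can branch on sendAlert before the email node:

// Build the alert message for the Email node; an IF node can branch on sendAlert
const { priceDropped, currentPrice, savings } = $json;

return {
  sendAlert: priceDropped,
  subject: `Price drop alert: now $${currentPrice.toFixed(2)}`,
  body: `The tracked product dropped below the $40.00 target. ` +
        `Current price: $${currentPrice.toFixed(2)} (savings: $${savings.toFixed(2)}).`
};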
Advanced Workflows
Competitive Price Monitoring
Goal: Track competitor prices across multiple products
graph LR
A[Schedule] --> B[Get Products]
B --> C[Loop]
C --> D[ScrapeOps Data API]
D --> E[Process Price]
E --> F[Update Database]
F --> C
C --> G[Generate Report]
G --> H[Send Email]
Implementation:
1. MySQL Node - Get Products

   SELECT asin, product_name, target_price
   FROM products_to_monitor
   WHERE active = 1

2. Loop Node
   - Items: {{ $node["MySQL"].json }}
3. ScrapeOps Node (Data API)
   - Domain: Amazon
   - API Type: Product API
   - Input Type: ASIN
   - ASIN: {{ $json.asin }}
4. Code Node - Process Price

   const item = $node["Loop"].json;
   const apiResponse = $json;
   return {
     asin: item.asin,
     product_name: item.product_name,
     current_price: apiResponse.data.price,
     target_price: item.target_price,
     below_target: apiResponse.data.price < item.target_price,
     timestamp: new Date().toISOString()
   };
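The "Generate Report" step can then summarize all processed items in a single Code node. A minimal sketch, assuming each incoming item has the shape produced by Process Price above:

// Summarize the monitored products for the report email
const items = $input.all().map(item => item.json);
const belowTarget = items.filter(p => p.below_target);

return {
  checked: items.length,
  below_target_count: belowTarget.length,
  deals: belowTarget.map(p =>
    `${p.product_name} (${p.asin}): $${p.current_price} vs target $${p.target_price}`
  ),
  generated_at: new Date().toISOString()
};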
Job Listing Aggregator
Goal: Collect job listings from Indeed and filter by criteria
Workflow Structure:
1. Search Keywords → Indeed Search → Parse Results
2. For Each Result → Get Job Details → Filter Criteria
3. Matching Jobs → Format Data → Send to Slack
Key Nodes Configuration:
1. ScrapeOps Node (Proxy API) - Search
   - URL: https://indeed.com/jobs?q=data+scientist&l=New+York
   - Method: GET
   - Render JavaScript: true
   - Wait For: .jobsearch-ResultsList
2. ScrapeOps Node (Parser API)
   - Domain: Indeed
   - Page Type: Search Page
   - URL: {{ $node["Proxy_Search"].json.url }}
   - HTML: {{ $node["Proxy_Search"].json.body }}
3. Filter Node

   // Filter jobs by salary and experience
   return $json.data.jobs.filter(job => {
     const salary = parseInt(job.salary?.replace(/\D/g, '') || '0', 10) || 0;
     const isRemote = job.location?.includes('Remote');
     const isEntry = !job.description?.includes('Senior');
     return salary > 80000 && (isRemote || isEntry);
   });
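The matching jobs can then be formatted for the "Send to Slack" step. A minimal Code node sketch; the field names (title, company, salary, location, url) are assumptions to adjust against whatever the Indeed parser actually returns:

// Format the filtered jobs as a single Slack message
const jobs = $input.all().map(item => item.json);

const lines = jobs.map(job =>
  `• ${job.title} at ${job.company} - ${job.salary || 'salary n/a'} (${job.location})\n  ${job.url}`
);

return {
  text: `Found ${jobs.length} matching data scientist roles:\n\n${lines.join('\n')}`
};

The Slack node can then use {{ $json.text }} as its message field.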
E-commerce Inventory Monitor
Goal: Track product availability and alert on restocks
Complete Workflow:
// 1. Schedule Trigger - Every 30 minutes
// 2. Get Products to Monitor
// (scrapeWithProxy, parseWalmartProduct and sendEmail below are placeholders for
// the ScrapeOps Proxy node, ScrapeOps Parser node and Email node in the workflow)
const products = [
  {
    name: "PlayStation 5",
    url: "https://walmart.com/ip/playstation-5/123456",
    notify_email: "gamer@example.com",
    was_in_stock: false
  },
  {
    name: "Xbox Series X",
    url: "https://walmart.com/ip/xbox-series-x/789012",
    notify_email: "gamer@example.com",
    was_in_stock: false
  }
];
// 3. Loop through products
for (const product of products) {
// 4. ScrapeOps Proxy API
const html = await scrapeWithProxy(product.url);
// 5. ScrapeOps Parser API
const parsed = await parseWalmartProduct(html);
// 6. Check availability
if (parsed.data.in_stock && !product.was_in_stock) {
// 7. Send notification
await sendEmail({
to: product.notify_email,
subject: `${product.name} is back in stock!`,
body: `Price: $${parsed.data.price}\nLink: ${product.url}`
});
// 8. Update status
product.was_in_stock = true;
}
}
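Note that was_in_stock as written only lives for a single execution. One way to persist it between scheduled runs is n8n workflow static data; a minimal Code node sketch, where the stockState key and the name/in_stock field names are assumptions (static data is only saved on active executions, not manual test runs):

// Persist stock state across runs so restocks are only announced once
const staticData = $getWorkflowStaticData('global');
staticData.stockState = staticData.stockState || {};

const productName = $json.name;      // product being checked
const inStockNow = $json.in_stock;   // from the parsed product data

const wasInStock = staticData.stockState[productName] === true;
const justRestocked = inStockNow && !wasInStock;

// Remember the latest state for the next run
staticData.stockState[productName] = inStockNow;

return { productName, inStockNow, justRestocked };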
Integration Patterns
Pattern 1: Scrape → Transform → Store
Use Case: Regular data collection for analysis
[Trigger: Schedule] → [ScrapeOps: Get Data] → [Transform: Clean/Format] → [Database: PostgreSQL]
Example Implementation:
// Transform Node
const rawData = $json;
// Clean and structure data
return {
product_id: rawData.data.asin,
title: rawData.data.title?.trim(),
price: parseFloat(rawData.data.price),
rating: parseFloat(rawData.data.rating),
review_count: parseInt(rawData.data.review_count),
scraped_at: new Date().toISOString(),
marketplace: 'amazon_us'
};
Pattern 2: Monitor → Compare → Act
Use Case: Price drop alerts, stock notifications
[ScrapeOps: Current Data] → [Get Previous: Database] → [Compare: Changes?] → [Conditional: Email/Slack]
Comparison Logic:
const current = $node["ScrapeOps"].json;
const previous = $node["Database"].json;
const changes = {
price_changed: current.price !== previous.price,
price_direction: current.price > previous.price ? 'up' : 'down',
price_difference: Math.abs(current.price - previous.price),
percent_change: ((current.price - previous.price) / previous.price * 100).toFixed(2)
};
return changes.price_difference > 5 ? changes : null;
Pattern 3: Aggregate → Analyze → Report
Use Case: Market research, competitive analysis
[Multiple Scrapers: Parallel Runs] → [Merge Data: Combine CSV] → [Analytics: Statistics] → [Report: Dashboard]
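The "Analytics" step can be a single Code node computing summary statistics over the merged items. A minimal sketch, assuming each item carries a numeric price field (rename it to whatever your scrapers return):

// Basic price statistics across the merged scraper outputs
const prices = $input.all()
  .map(item => parseFloat(item.json.price))
  .filter(p => !Number.isNaN(p));

if (prices.length === 0) {
  return { sample_size: 0 };
}

const sum = prices.reduce((a, b) => a + b, 0);

return {
  sample_size: prices.length,
  min_price: Math.min(...prices),
  max_price: Math.max(...prices),
  avg_price: +(sum / prices.length).toFixed(2),
  generated_at: new Date().toISOString()
};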
Error Handling Examples
Retry Logic with Exponential Backoff
// Error Handler Node
const maxRetries = 3;
const currentRetry = $node["ScrapeOps"].error?.retryCount || 0;
if (currentRetry < maxRetries) {
// Calculate delay
const delay = Math.pow(2, currentRetry) * 1000; // 1s, 2s, 4s
// Wait before retry
await new Promise(resolve => setTimeout(resolve, delay));
// Retry with incremented count
return {
retry: true,
retryCount: currentRetry + 1
};
} else {
// Log failure and continue
return {
failed: true,
error: $node["ScrapeOps"].error?.message
};
}
Fallback Data Sources
// Primary scraper failed, try alternative
if ($node["ScrapeOps_Primary"].error) {
// Use alternative marketplace
const alternativeUrl = $json.url.replace('amazon.com', 'amazon.co.uk');
// Trigger backup scraper
return {
useBackup: true,
backupUrl: alternativeUrl
};
}
Performance Optimization
Batch Processing Example
// Split large lists into batches
const items = $node["Get_Items"].json;
const batchSize = 10;
const batches = [];
for (let i = 0; i < items.length; i += batchSize) {
batches.push(items.slice(i, i + batchSize));
}
// Process each batch sequentially, with a delay between batches
// (processItem is a placeholder for your per-item scraping logic)
const allResults = [];
for (const batch of batches) {
  // Process the items in this batch in parallel
  const results = await Promise.all(
    batch.map(item => processItem(item))
  );
  allResults.push(...results);
  // Add delay between batches to respect rate limits
  await new Promise(resolve => setTimeout(resolve, 2000));
}
return allResults;
Caching Implementation
// Check cache before scraping
const cacheKey = `product_${asin}_${country}`;
const cached = await getFromCache(cacheKey);
if (cached && cached.timestamp > Date.now() - 3600000) {
// Use cached data (less than 1 hour old)
return cached.data;
} else {
// Scrape fresh data
const fresh = await scrapeProduct(asin, country);
// Update cache
await saveToCache(cacheKey, fresh);
return fresh;
}
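getFromCache and saveToCache are placeholders. One way to back them inside a Code node is workflow static data; a minimal sketch that matches the entry shape (data, timestamp) and one-hour TTL used above (static data is only saved on active executions):

// Simple cache helpers backed by n8n workflow static data
const staticData = $getWorkflowStaticData('global');
staticData.cache = staticData.cache || {};

const getFromCache = (key) => staticData.cache[key] || null;

const saveToCache = (key, data) => {
  staticData.cache[key] = { data, timestamp: Date.now() };
};
// (then use them exactly as in the block above)

Static data is shared across all runs of the workflow, so prune stale keys if the cache grows large.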
Complete Workflow Examples
1. Amazon to Shopify Product Importer
Workflow Overview:
- Read Amazon ASINs from CSV
- Get product details via Data API
- Transform to Shopify format (see the sketch below)
- Create/Update Shopify products
- Log results
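For the transform step, a minimal Code node sketch that maps the Data API product fields used elsewhere in this guide onto Shopify product fields (title, body_html, vendor, variants); the exact mapping is an assumption to adapt to your catalogue:

// Map an Amazon Data API product onto the Shopify product format
const p = $json.data;

return {
  product: {
    title: p.title,
    body_html: p.description || '',
    vendor: p.brand || 'Unknown',
    tags: (p.categories || []).join(', '),
    variants: [
      {
        sku: p.asin,
        price: p.price,
        inventory_management: 'shopify'
      }
    ]
  }
};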
2. Real Estate Price Tracker
Workflow Overview:
- Search Redfin for properties
- Parse search results
- Get detailed property info
- Compare with historical data
- Generate market report
3. Review Sentiment Analyzer
Workflow Overview:
- Get product reviews from Amazon
- Parse review content
- Run sentiment analysis (see the sketch below)
- Calculate aggregate scores
- Create visual report
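For the sentiment step, a deliberately simple keyword-based scorer can stand in until you wire up a real NLP service. A minimal Code node sketch, assuming each incoming item has a review_text field from the parser:

// Placeholder sentiment scorer: counts positive vs negative keywords per review
const positive = ['great', 'excellent', 'love', 'perfect', 'recommend'];
const negative = ['bad', 'poor', 'broken', 'disappointed', 'return'];

return $input.all().map(item => {
  const text = (item.json.review_text || '').toLowerCase();
  const score =
    positive.filter(w => text.includes(w)).length -
    negative.filter(w => text.includes(w)).length;

  return {
    json: {
      ...item.json,
      sentiment_score: score,
      sentiment: score > 0 ? 'positive' : score < 0 ? 'negative' : 'neutral'
    }
  };
});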
Tips for Building Workflows
1. Start Simple
- Test with single items before loops
- Add complexity incrementally
- Use manual triggers during development
2. Handle Errors Gracefully
- Add error branches to critical nodes
- Log errors for debugging
- Implement retry mechanisms
3. Optimize for Performance
- Use caching where appropriate
- Batch similar requests
- Add delays to respect rate limits
4. Monitor and Maintain
- Set up alerts for failures
- Track success rates
- Review logs regularly
Getting Help
Resources
- n8n Documentation: docs.n8n.io
- ScrapeOps Support: support@scrapeops.io
- Community Forum: community.n8n.io
Common Issues
- Rate Limits: Add delays between requests
- Dynamic Content: Enable JavaScript rendering
- Authentication: Use session management
- Large Data: Implement pagination (see the sketch below)
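A minimal pagination sketch for a Code node, assuming the target site takes a page query parameter and YOUR_API_KEY is your ScrapeOps key (adjust maxPages and the URL pattern to the site you scrape):

// Fetch several result pages through the Proxy API, one item per page
const maxPages = 5;
const pages = [];

for (let page = 1; page <= maxPages; page++) {
  const html = await this.helpers.httpRequest({
    method: 'GET',
    url: 'https://proxy.scrapeops.io/v1/',
    qs: {
      api_key: 'YOUR_API_KEY',
      url: `https://example.com/products?page=${page}`
    }
  });
  pages.push({ json: { page, html } });

  // Small delay between pages to respect rate limits
  await new Promise(resolve => setTimeout(resolve, 1000));
}

return pages;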
Ready to build your own workflows? Start with our templates or create from scratch!