ScrapeOps Parser API in n8n
The Parser API extracts structured data from the HTML of supported websites. Instead of writing complex parsing logic yourself, send the HTML to the Parser API and receive clean, structured JSON back.
For complete Parser API documentation, see the Parser API docs.
How It Works
The Parser API follows a two-step process:
- Fetch HTML - Get the HTML content (using Proxy API or other methods)
- Parse HTML - Send HTML to Parser API for structured extraction
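The two steps can be sketched as a single request builder for the parse call. Note that the endpoint URL and body field names below are illustrative assumptions, not the official request format; check the Parser API docs for the exact parameters.

```javascript
// Build the Parser API call for step 2. NOTE: the endpoint URL and the
// body field names here are illustrative assumptions, not the official
// request format -- check the Parser API docs before relying on them.
function buildParserRequest(apiKey, { domain, pageType, url, html }) {
  return {
    method: 'POST',
    endpoint: 'https://parser.scrapeops.io/v1/', // assumed endpoint
    body: {
      api_key: apiKey,
      domain,              // e.g. "amazon"
      page_type: pageType, // e.g. "product"
      url,                 // original page URL
      html,                // HTML fetched in step 1
    },
  };
}
```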
Supported Websites
The Parser API currently supports:
| Website | Supported Page Types | Documentation |
| --- | --- | --- |
| Amazon | Product Pages, Search Pages, Reviews Pages | Amazon Parser |
| eBay | Product Pages, Search Pages, Category Pages, Store Pages, Feedback Pages | eBay Parser |
| Walmart | Product Pages, Search Pages, Category Pages, Reviews Pages, Browse Pages, Shop Pages | Walmart Parser |
| Indeed | Job Detail Pages, Search Pages, Company Pages | Indeed Parser |
| Redfin | Property Pages, Search Pages, Agent Pages | Redfin Parser |
Basic Configuration
Setting Up Parser API
- Add a ScrapeOps node to your workflow
- Select Parser API as the API type
- Configure the parameters:
| Parameter | Required | Description |
| --- | --- | --- |
| Domain | Yes | Website to parse (amazon, ebay, etc.) |
| Page Type | Yes | Type of page being parsed |
| URL | Yes | Original URL of the page |
| HTML Content | Yes | HTML to parse |
Parser API Workflow Pattern
The typical workflow for using the Parser API:
```
[ScrapeOps Proxy API] → [ScrapeOps Parser API]
      (get HTML)            (parse to JSON)
```
Example: Amazon Product Parser
Step 1: Fetch HTML with Proxy API
Node 1: ScrapeOps (Proxy API)
- URL: https://www.amazon.com/dp/B08N5WRWNW
- Method: GET
- Render JavaScript: true
Step 2: Parse HTML
Node 2: ScrapeOps (Parser API)
- Domain: Amazon
- Page Type: Product Page
- URL: {{ $node["ScrapeOps"].json.url }}
- HTML Content: {{ $node["ScrapeOps"].json.body }}
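Between the two nodes it can be worth guarding against empty proxy responses. A hypothetical Code-node check along these lines passes items to the Parser node only when the proxy output actually contains HTML (the `json.body` field name matches the expressions above):

```javascript
// Guard between the Proxy API and Parser API nodes: only forward items
// whose proxy response body looks like an HTML document.
function hasHtmlBody(item) {
  const body = item.json && item.json.body;
  return typeof body === 'string' && body.includes('<html');
}
```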
Domain-Specific Configurations
Amazon Parser
Product Pages
- Extracts: title, price, rating, description, images, features
- Example URL:
https://www.amazon.com/dp/ASIN
Search Pages
- Extracts: product list with prices, ratings, prime status
- Example URL:
https://www.amazon.com/s?k=keyword
Reviews Pages
- Extracts: review text, ratings, dates, helpful votes
- Example URL:
https://www.amazon.com/product-reviews/ASIN
eBay Parser
Product Pages
- Extracts: title, price, seller info, shipping, item specifics
- Example URL:
https://www.ebay.com/itm/123456
Search Pages
- Extracts: listings with prices, conditions, shipping options
- Example URL:
https://www.ebay.com/sch/i.html?_nkw=keyword
Store Pages
- Extracts: store info, featured products, categories
- Example URL:
https://www.ebay.com/str/storename
Walmart Parser
Product Pages
- Extracts: title, price, availability, specifications
- Example URL:
https://www.walmart.com/ip/name/12345
Category Pages
- Extracts: product grid, filters, pagination
- Example URL:
https://www.walmart.com/browse/category/id
Indeed Parser
Job Detail Pages
- Extracts: job title, company, description, requirements
- Example URL:
https://www.indeed.com/viewjob?jk=jobkey
Company Pages
- Extracts: company info, ratings, reviews
- Example URL:
https://www.indeed.com/cmp/company
Redfin Parser
Property Pages
- Extracts: price, details, features, agent info
- Example URL:
https://www.redfin.com/state/city/address
Search Pages
- Extracts: property listings, filters, map data
- Example URL:
https://www.redfin.com/city/state/filter
Working with Parser Responses
Successful Response Structure
```json
{
  "status": "valid",
  "url": "https://original-url.com",
  "data": {
    // Structured data specific to page type
  }
}
```
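A minimal consumer of this structure checks the `status` field before touching `data`, and surfaces the failing URL otherwise:

```javascript
// Unwrap a Parser API response: treat anything other than status
// "valid" as a failure and report which URL failed to parse.
function extractData(response) {
  if (response.status !== 'valid') {
    throw new Error(`Parse failed for ${response.url}: ${response.status}`);
  }
  return response.data;
}
```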
Amazon Product Response Example
```json
{
  "status": "valid",
  "url": "https://www.amazon.com/dp/B08N5WRWNW",
  "data": {
    "title": "Echo Dot (4th Gen)",
    "price": 49.99,
    "rating": 4.7,
    "review_count": 525841,
    "features": [
      "Voice control your entertainment",
      "Control smart home devices"
    ],
    "images": ["url1", "url2", "url3"],
    "availability": "In Stock"
  }
}
```
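Downstream nodes should not assume every field is present on every product. A small normalizer with defaults, keyed to the fields in the example above, keeps later nodes simple:

```javascript
// Normalize a parsed product record, filling defaults for fields that
// may be missing on some pages.
function summarizeProduct(data) {
  return {
    title: data.title ?? 'Unknown product',
    price: typeof data.price === 'number' ? data.price : null,
    rating: data.rating ?? null,
  };
}
```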
Advanced Workflows
Multi-Page Parsing
Parse search results then individual products:
```
1. [Proxy API - Search] → [Parser API - Search]
         ↓                        ↓
     Search HTML             Product URLs
                                  ↓
2. [Loop over URLs] → [Proxy API - Product] → [Parser API - Product]
                             ↓                        ↓
                        Product HTML           Structured Data
```
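The hand-off between stage 1 and stage 2 is pulling product URLs out of the parsed search results. The `results` field name below is a guess at the search parser's output shape, used only to illustrate the pattern:

```javascript
// Collect product URLs from parsed search data for the stage-2 loop.
// The `results` array and `url` field names are assumed, not the
// documented response shape.
function collectProductUrls(searchData) {
  const results = Array.isArray(searchData.results) ? searchData.results : [];
  return results.map(r => r.url).filter(Boolean);
}
```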
Combining with Other Nodes
Store parsed data in Google Sheets:
```
[Parser API] → [Google Sheets]
     ↓               ↓
 Parsed Data    Append Rows
```
Send notifications for price drops:
```
[Parser API] → [IF Node] → [Slack]
     ↓             ↓           ↓
   Price    Price < Target   Alert
```
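The IF-node condition in the price-drop flow boils down to one predicate, which could also live in a Code node:

```javascript
// Mirror of the IF node condition: alert when the parsed price exists
// and has dropped below the target threshold.
function shouldAlert(parsed, targetPrice) {
  const price = parsed?.data?.price;
  return typeof price === 'number' && price < targetPrice;
}
```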
Error Handling
Common Parser Errors
| Error | Cause | Solution |
| --- | --- | --- |
| Invalid HTML | Malformed HTML input | Ensure the complete HTML document is passed |
| Unsupported Page | Wrong page type selected | Verify the URL matches the selected page type |
| Parser Timeout | HTML too large | Reduce the HTML size before parsing |
| Missing Elements | Page structure changed | Contact support for parser updates |
Performance Optimization
Reduce HTML Size
Only send necessary HTML sections:
- Remove scripts and styles
- Extract main content area
- Use specific selectors
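The first two points can be approximated with a small regex-based cleanup before sending HTML to the parser. This is a best-effort sketch, not a real HTML parser, so verify it does not strip content your target parser needs:

```javascript
// Rough size reduction: drop <script> and <style> blocks and HTML
// comments before sending the page to the Parser API.
function stripNoise(html) {
  return html
    .replace(/<script\b[\s\S]*?<\/script>/gi, '')
    .replace(/<style\b[\s\S]*?<\/style>/gi, '')
    .replace(/<!--[\s\S]*?-->/g, '');
}
```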
Batch Processing
Process multiple pages efficiently:
- Collect all URLs first
- Fetch HTML in parallel
- Parse in batches
- Handle results together
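Collecting URLs first and fetching in parallel usually means chunking the URL list so each batch stays within sensible concurrency limits, e.g.:

```javascript
// Split URLs into fixed-size batches; each batch can then be fetched
// in parallel (e.g. with Promise.all) without flooding the proxy.
function toBatches(urls, size) {
  const batches = [];
  for (let i = 0; i < urls.length; i += size) {
    batches.push(urls.slice(i, i + size));
  }
  return batches;
}
```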
Caching Strategies
Avoid re-parsing unchanged content:
- Check last-modified headers
- Store parsed results
- Compare checksums
Integration Examples
Price Monitoring Workflow
1. Schedule Trigger (Daily)
2. Get Product URLs from Database
3. Loop through URLs:
- Fetch HTML (Proxy API)
- Parse Product (Parser API)
- Compare with Previous Price
- Update Database
4. Send Summary Email
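The "Compare with Previous Price" step is the only non-trivial logic in this workflow; one way to sketch it is a helper that computes the change and flags drops worth including in the summary email:

```javascript
// Compare the newly parsed price against the stored one; round to
// cents to avoid floating-point noise in the reported change.
function comparePrice(previous, current) {
  const change = +(current - previous).toFixed(2);
  return { change, dropped: change < 0 };
}
```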
Review Analysis Workflow
1. Get Product ASIN
2. Fetch Reviews HTML (Proxy API)
3. Parse Reviews (Parser API)
4. Sentiment Analysis (AI Node)
5. Generate Report
6. Save to Dashboard
Job Aggregator Workflow
1. Search Indeed for Keywords
2. Parse Search Results
3. For Each Job:
- Fetch Job Details
- Parse Job Info
- Check Against Criteria
4. Send Matching Jobs to Slack
Best Practices
1. Always Validate Input
```
// Ensure HTML exists
{{ $json.body && $json.body.length > 0 }}
```
2. Handle Missing Data
```
// Provide defaults
{{ $json.data?.price || "Price not available" }}
```
3. Monitor Parser Performance
- Track success rates
- Log parsing errors
- Set up alerts for failures
4. Respect Rate Limits
- Don't overload parsers
- Implement delays between requests
- Use bulk operations when available
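For delays between requests, n8n's Wait node is the usual tool; inside a Code node, a promise-based sleep achieves the same pacing:

```javascript
// Fixed delay between requests; await sleep(1000) pauses for ~1s
// before the next fetch/parse call.
function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
```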
Next Steps
- Explore the Data API for direct data access
- See practical examples of parser workflows
- Learn about Proxy API for fetching HTML
Need specific parser support? Contact support@scrapeops.io with example URLs.