ScrapeOps Parser API in n8n

The Parser API extracts structured data from the HTML of supported websites. Instead of writing and maintaining complex parsing logic yourself, send the HTML to the Parser API and receive clean, structured JSON back.

Full Documentation

For complete Parser API documentation, see the Parser API docs.


How It Works

The Parser API follows a two-step process:

  1. Fetch HTML - Get the HTML content (using Proxy API or other methods)
  2. Parse HTML - Send HTML to Parser API for structured extraction
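
In plain code, the two steps reduce to "get HTML, then build a parse request from it". The sketch below only assembles the request payload from the four parameters listed under Basic Configuration; the field names are illustrative, not the official wire format, and actually sending the request is left to the node.

```javascript
// Sketch only: assemble the Parser API parameters (Domain, Page Type, URL,
// HTML Content) into a single request payload. Field names are illustrative.
function buildParserRequest(domain, pageType, url, html) {
  if (!html || html.length === 0) {
    throw new Error('No HTML to parse — fetch the page first (step 1)');
  }
  return {
    domain: domain,      // e.g. "amazon"
    page_type: pageType, // e.g. "product"
    url: url,            // original URL of the page
    html: html,          // raw HTML from step 1
  };
}
```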

Supported Websites

The Parser API currently supports:

| Website | Supported Page Types                                                                  | Documentation  |
| ------- | ------------------------------------------------------------------------------------- | -------------- |
| Amazon  | Product Pages, Search Pages, Reviews Pages                                            | Amazon Parser  |
| eBay    | Product Pages, Search Pages, Category Pages, Store Pages, Feedback Pages              | eBay Parser    |
| Walmart | Product Pages, Search Pages, Category Pages, Reviews Pages, Browse Pages, Shop Pages  | Walmart Parser |
| Indeed  | Job Detail Pages, Search Pages, Company Pages                                         | Indeed Parser  |
| Redfin  | Property Pages, Search Pages, Agent Pages                                             | Redfin Parser  |

Basic Configuration

Setting Up Parser API

  1. Add a ScrapeOps node to your workflow
  2. Select Parser API as the API type
  3. Configure the parameters:
| Parameter    | Required | Description                           |
| ------------ | -------- | ------------------------------------- |
| Domain       | Yes      | Website to parse (amazon, ebay, etc.) |
| Page Type    | Yes      | Type of page being parsed             |
| URL          | Yes      | Original URL of the page              |
| HTML Content | Yes      | HTML to parse                         |

Parser API Workflow Pattern

The typical workflow for using the Parser API:

[ScrapeOps Proxy API]  →  [ScrapeOps Parser API]
       Get HTML                 Parse to JSON

Example: Amazon Product Parser

Step 1: Fetch HTML with Proxy API

Node 1: ScrapeOps (Proxy API)
- URL: https://www.amazon.com/dp/B08N5WRWNW
- Method: GET
- Render JavaScript: true

Step 2: Parse HTML

Node 2: ScrapeOps (Parser API)
- Domain: Amazon
- Page Type: Product Page
- URL: {{ $node["ScrapeOps"].json.url }}
- HTML Content: {{ $node["ScrapeOps"].json.body }}

Domain-Specific Configurations

Amazon Parser

Product Pages

  • Extracts: title, price, rating, description, images, features
  • Example URL: https://www.amazon.com/dp/ASIN

Search Pages

  • Extracts: product list with prices, ratings, prime status
  • Example URL: https://www.amazon.com/s?k=keyword

Reviews Pages

  • Extracts: review text, ratings, dates, helpful votes
  • Example URL: https://www.amazon.com/product-reviews/ASIN

eBay Parser

Product Pages

  • Extracts: title, price, seller info, shipping, item specifics
  • Example URL: https://www.ebay.com/itm/123456

Search Pages

  • Extracts: listings with prices, conditions, shipping options
  • Example URL: https://www.ebay.com/sch/i.html?_nkw=keyword

Store Pages

  • Extracts: store info, featured products, categories
  • Example URL: https://www.ebay.com/str/storename

Walmart Parser

Product Pages

  • Extracts: title, price, availability, specifications
  • Example URL: https://www.walmart.com/ip/name/12345

Category Pages

  • Extracts: product grid, filters, pagination
  • Example URL: https://www.walmart.com/browse/category/id

Indeed Parser

Job Detail Pages

  • Extracts: job title, company, description, requirements
  • Example URL: https://www.indeed.com/viewjob?jk=jobkey

Company Pages

  • Extracts: company info, ratings, reviews
  • Example URL: https://www.indeed.com/cmp/company

Redfin Parser

Property Pages

  • Extracts: price, details, features, agent info
  • Example URL: https://www.redfin.com/state/city/address

Search Pages

  • Extracts: property listings, filters, map data
  • Example URL: https://www.redfin.com/city/state/filter

Working with Parser Responses

Successful Response Structure

{
  "status": "valid",
  "url": "https://original-url.com",
  "data": {
    // Structured data specific to page type
  }
}
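
Before touching anything in `data`, guard on `status`. A minimal sketch (the `"valid"` value comes from the response structure above; the helper name is ours):

```javascript
// Guard: only return response.data when the parser reports a valid parse,
// so downstream nodes never read fields from a failed response.
function extractData(response) {
  if (response.status !== 'valid') {
    throw new Error(`Parser returned status "${response.status}" for ${response.url}`);
  }
  return response.data;
}
```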

Amazon Product Response Example

{
  "status": "valid",
  "url": "https://www.amazon.com/dp/B08N5WRWNW",
  "data": {
    "title": "Echo Dot (4th Gen)",
    "price": 49.99,
    "rating": 4.7,
    "review_count": 525841,
    "features": [
      "Voice control your entertainment",
      "Control smart home devices"
    ],
    "images": ["url1", "url2", "url3"],
    "availability": "In Stock"
  }
}

Advanced Workflows

Multi-Page Parsing

Parse search results then individual products:

1. [Proxy API - Search] → [Parser API - Search]
            ↓                        ↓
       Search HTML             Product URLs

2. [Loop over URLs] → [Proxy API - Product] → [Parser API - Product]
                               ↓                         ↓
                          Product HTML            Structured Data
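
Written as plain async code, the two-stage pattern looks roughly like this. In n8n each call is its own node; `fetchHtml` and `parseHtml` are hypothetical stand-ins for the Proxy API and Parser API calls, and `product_urls` is an assumed field name on the parsed search result.

```javascript
// Sketch of the search-then-products crawl. Stage 1 parses the search page
// for product URLs; stage 2 fetches and parses each product page in turn.
async function crawlSearch(searchUrl, fetchHtml, parseHtml) {
  const searchHtml = await fetchHtml(searchUrl);
  const search = await parseHtml('search', searchUrl, searchHtml);

  const products = [];
  for (const url of search.product_urls) {
    const html = await fetchHtml(url);
    products.push(await parseHtml('product', url, html));
  }
  return products;
}
```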

Combining with Other Nodes

Store parsed data in Google Sheets:

[Parser API] → [Google Sheets]
      ↓               ↓
 Parsed Data     Append Rows
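
Sheets wants flat columns, while the parser returns nested JSON, so a small flattening step in between helps. The field names below follow the Amazon response example above; the helper itself is ours, and missing fields become empty cells rather than breaking the row.

```javascript
// Flatten a parsed product into one flat object, ready for a Google Sheets
// "Append Row" operation.
function toSheetRow(parsed) {
  const d = parsed.data || {};
  return {
    url: parsed.url ?? '',
    title: d.title ?? '',
    price: d.price ?? '',
    rating: d.rating ?? '',
    review_count: d.review_count ?? '',
  };
}
```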

Send notifications for price drops:

[Parser API] → [IF Node] → [Slack]
      ↓            ↓           ↓
    Price    Price < Target  Alert
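
The IF node's condition, written out as a function: alert only when the parser actually returned a numeric price below the target, so a failed parse never triggers a false alarm.

```javascript
// Guarded price-drop check. Optional chaining keeps this safe when the
// parser returned no data at all.
function shouldAlert(parsed, targetPrice) {
  const price = parsed?.data?.price;
  return typeof price === 'number' && price < targetPrice;
}
```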

Error Handling

Common Parser Errors

| Error            | Cause                    | Solution                       |
| ---------------- | ------------------------ | ------------------------------ |
| Invalid HTML     | Malformed HTML input     | Ensure complete HTML is passed |
| Unsupported Page | Wrong page type selected | Verify URL matches page type   |
| Parser Timeout   | HTML too large           | Reduce HTML size or simplify   |
| Missing Elements | Page structure changed   | Contact support for updates    |

Performance Optimization

Reduce HTML Size

Only send necessary HTML sections:

  • Remove scripts and styles
  • Extract main content area
  • Use specific selectors
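
A rough way to do the first trim is a regex pass that drops scripts, styles, and comments before the HTML is sent on. Treat this as a size heuristic, not a real HTML sanitizer; regexes cannot parse arbitrary HTML.

```javascript
// Heuristic trim: remove <script>/<style> blocks and HTML comments, which
// the parser does not need, to shrink the payload.
function stripNoise(html) {
  return html
    .replace(/<script\b[\s\S]*?<\/script>/gi, '')
    .replace(/<style\b[\s\S]*?<\/style>/gi, '')
    .replace(/<!--[\s\S]*?-->/g, '');
}
```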

Batch Processing

Process multiple pages efficiently:

  1. Collect all URLs first
  2. Fetch HTML in parallel
  3. Parse in batches
  4. Handle results together
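
Steps 2 and 3 can be sketched as fixed-size parallel batches, so a long URL list never fires hundreds of requests at once. Here `worker` is any async function, e.g. one Proxy API + Parser API round trip per URL.

```javascript
// Process items in parallel batches of `batchSize`, collecting all results
// in order. Each batch completes before the next one starts.
async function processInBatches(items, batchSize, worker) {
  const results = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    results.push(...(await Promise.all(batch.map(worker))));
  }
  return results;
}
```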

Caching Strategies

Avoid re-parsing unchanged content:

  • Check last-modified headers
  • Store parsed results
  • Compare checksums

Integration Examples

Price Monitoring Workflow

1. Schedule Trigger (Daily)
2. Get Product URLs from Database
3. Loop through URLs:
- Fetch HTML (Proxy API)
- Parse Product (Parser API)
- Compare with Previous Price
- Update Database
4. Send Summary Email
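
The "Compare with Previous Price" step can be written as a small function that reports the change, so the summary email lists only products that actually moved. The return shape is our own convention.

```javascript
// Compare two prices and describe the change. Non-numeric inputs (e.g. a
// failed parse) are reported as "no change" rather than crashing the loop.
function comparePrice(previous, current) {
  if (typeof previous !== 'number' || typeof current !== 'number') {
    return { changed: false };
  }
  return {
    changed: previous !== current,
    dropped: current < previous,
    delta: +(current - previous).toFixed(2), // rounded to cents
  };
}
```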

Review Analysis Workflow

1. Get Product ASIN
2. Fetch Reviews HTML (Proxy API)
3. Parse Reviews (Parser API)
4. Sentiment Analysis (AI Node)
5. Generate Report
6. Save to Dashboard

Job Aggregator Workflow

1. Search Indeed for Keywords
2. Parse Search Results
3. For Each Job:
- Fetch Job Details
- Parse Job Info
- Check Against Criteria
4. Send Matching Jobs to Slack
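
Step 3's "Check Against Criteria" might look like the function below. The job field names (`title`, `description`) follow the Indeed parser fields listed above; the criteria shape is a made-up example.

```javascript
// Keep a job only if every required keyword appears in its title or
// description (case-insensitive).
function matchesCriteria(job, criteria) {
  const text = `${job.title || ''} ${job.description || ''}`.toLowerCase();
  return criteria.keywords.every((k) => text.includes(k.toLowerCase()));
}
```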

Best Practices

1. Always Validate Input

// Ensure HTML exists
{{ $json.body && $json.body.length > 0 }}

2. Handle Missing Data

// Provide defaults
{{ $json.data?.price || "Price not available" }}

3. Monitor Parser Performance

  • Track success rates
  • Log parsing errors
  • Set up alerts for failures

4. Respect Rate Limits

  • Don't overload parsers
  • Implement delays between requests
  • Use bulk operations when available
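
A minimal throttle sketch, in case you drive requests from a Code node rather than n8n's built-in batching: run items sequentially with a fixed delay between them. `worker` stands in for one fetch + parse call.

```javascript
// Sequential, delayed processing: one request at a time, `delayMs` apart.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function throttled(items, worker, delayMs) {
  const results = [];
  for (const item of items) {
    results.push(await worker(item));
    if (delayMs > 0) await sleep(delayMs);
  }
  return results;
}
```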

Next Steps

Need specific parser support? Contact support@scrapeops.io with example URLs.