
Auto Extract

For certain highly popular domains, the ScrapeOps Proxy API can extract the data from the HTML response and return it in JSON format. To enable this, add the auto_extract query parameter to your request and set it to the parser you want to use.

The following parsers are currently supported, listed with the domains and page types each one works on:

| key | Domains | Cost | Page Types |
|-----|---------|------|------------|
| amazon | Amazon.com & subdomains | 1 | Extracts data from Amazon Product, Search, Category, and Review pages. |
| google | Google.com & subdomains | 25 | Extracts data from Google Search results and Google Shopping. |

Beta Version

This functionality is currently in Beta. It has been tested and used in production; however, there may still be some issues with the parsers. If you notice any, please let us know.

We will also be adding more parsers in the coming weeks.

To use Auto Extract when scraping an Amazon.com domain, add auto_extract=amazon to your request and the data will be extracted from the HTML for you.
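If a scraper handles URLs from several sites, the parser key can be chosen from the target URL's hostname. The following is a minimal sketch, not part of the ScrapeOps API: the names `PARSER_DOMAINS`, `pick_parser`, and `build_params` are hypothetical helpers, and the domain matching assumes that "subdomains" means hosts ending in `amazon.com` or `google.com`.

```python
from typing import Optional
from urllib.parse import urlparse

# Parser keys and base domains, taken from the table of supported parsers.
# Assumption: only the exact domain and its subdomains qualify.
PARSER_DOMAINS = {
    "amazon": "amazon.com",
    "google": "google.com",
}


def pick_parser(target_url: str) -> Optional[str]:
    """Return the auto_extract key for target_url, or None if no parser applies."""
    host = (urlparse(target_url).hostname or "").lower()
    for key, domain in PARSER_DOMAINS.items():
        if host == domain or host.endswith("." + domain):
            return key
    return None


def build_params(api_key: str, target_url: str) -> dict:
    """Build the query parameters, adding auto_extract only when a parser matches."""
    params = {"api_key": api_key, "url": target_url}
    parser = pick_parser(target_url)
    if parser is not None:
        params["auto_extract"] = parser
    return params
```

The resulting dictionary can be passed as the `params` argument of `requests.get`. URLs with no matching parser are sent without auto_extract, so the response for those would be whatever the proxy normally returns.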

For example, here is how you would extract data from the following Amazon Product Page:


curl -k "https://proxy.scrapeops.io/v1/?api_key=YOUR_API_KEY&url=https://www.amazon.com/dp/B08BNQ9GS1&auto_extract=amazon"
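The curl command above passes the target URL as-is, which works for a simple product URL. A target URL that carries its own query string (a search page, for instance) generally needs to be percent-encoded so that its parameters are not confused with the proxy's. The sketch below builds the request URL with Python's standard library; `build_request_url` is a hypothetical helper and the Google search URL is only an illustrative target.

```python
from urllib.parse import urlencode

PROXY_ENDPOINT = "https://proxy.scrapeops.io/v1/"


def build_request_url(api_key: str, target_url: str, parser: str) -> str:
    """Return the proxy URL with every query value percent-encoded."""
    query = urlencode({
        "api_key": api_key,
        "url": target_url,       # encoded so its own '?' and '&' stay inside this value
        "auto_extract": parser,
    })
    return f"{PROXY_ENDPOINT}?{query}"


# Illustrative target with its own query string (hypothetical search URL).
print(build_request_url(
    "YOUR_API_KEY",
    "https://www.google.com/search?q=tile+mate",
    "google",
))
```

The printed URL can then be used in place of the hand-written one in the curl command.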

The API will return a JSON response with three top-level keys (data, status, and url):


{
  "data": {
    "aplus_present": true,
    "availability_status": "In Stock.",
    "average_rating": 4.6,
    "brand": "Visit the Tile Store",
    "brand_url": "https://www.amazon.com/stores/Tile/page/60382F7B-6515-4574-B7CC-5436875D54D3?ref_=ast_bln",
    "customization_options": {
      "Size": [
        {
          "asin": "B08BNVQDWT",
          "image": null,
          "is_selected": true,
          "value": "2-pack"
        },
        {
          "asin": "B07W73PTJB",
          "image": null,
          "is_selected": false,
          "value": "4-pack"
        }
      ],
      "Style": [
        {
          "asin": "B08BNVQDWT",
          "image": null,
          "is_selected": true,
          "value": "Mate"
        },
        {
          "asin": "B07W86YZM7",
          "image": null,
          "is_selected": false,
          "value": "Mate + Slim"
        }
      ]
    },
    "fast_track_message": "",
    "feature_bullets": [
      "FIND KEYS, BAGS & MORE -- Directly attach Mate to everyday things like keys, bags and other stuff you need to keep track of regularly and use our free app on iOS or Android to find them. Keep track of more for less or give them to your friends and family as a gift.",
      "FIND NEARBY -- Use the Tile app to ring your Mate when it’s within 200 ft. or ask your Smart Home device to find it for you. Tile works with Amazon Alexa, Google Assistant, Xfinity, and Siri.",
      "FIND FAR AWAY -- When outside of Bluetooth range, use the Tile app to view your Tile’s most recent location or enlist the secure and anonymous help of the Tile Network to aid in your search.",
      "FIND YOUR PHONE -- Use your Tile to find your phone, even when it’s on silent.",
      "UPGRADE YOUR FINDING EXPERIENCE -- Subscribe to Premium or Premium Protect for proactive finding features and enhanced services including Item Reimbursement, Smart Alerts, and Free Battery Replacement."
    ],
    "fulfilled_by_amazon": true,
    "full_description": "Directly attach Mate to your everyday things like keys, backpacks and other stuff you need to keep track of regularly. You’ll gain peace of mind knowing you can open the free Tile app and tap ‘Find’ to locate your stuff. Tile requires installation of the Tile App on iOS or Android, registration for a Tile account and acceptance of Tile’s Privacy Policy and Terms of Service (available at Tile). Payment required to access additional Premium services.",
    "images": [
      "https://m.media-amazon.com/images/I/3148kifx0zL.jpg",
      "https://m.media-amazon.com/images/I/41wxkOLO+uL.jpg",
      "https://m.media-amazon.com/images/I/41-hEvEmPFL.jpg",
      "https://m.media-amazon.com/images/I/51pI30yIhmL.jpg",
      "https://m.media-amazon.com/images/I/41++6qPoFLL.jpg",
      "https://m.media-amazon.com/images/I/41bO2K6vIAL.jpg"
    ],
    "list_price": "$47.99",
    "model": "RE-19002",
    "name": "Tile Mate (2020) 2-Pack - Discontinued by Manufacturer",
    "pricing": "$44.95",
    "product_category": "",
    "product_information": {
      "ASIN": "B08BNVQDWT",
      "Batteries": "2 Lithium Metal batteries required. (included)",
      "Customer Reviews": {
        "ratings_count": 24342,
        "stars": "4.6 out of 5 stars"
      },
      "Date First Available": "October 6, 2020",
      "Item Weight": "0.54 ounces",
      "Item model number": "RE-19002",
      "Manufacturer": "Tile",
      "Product Dimensions": "1.38 x 0.24 x 1.38 inches"
    },
    "seller_id": "A3PUKTJIAB9Y0H",
    "seller_name": "alwayz-on-sale",
    "shipping_price": "FREE",
    "small_description": "About this item \n \nFIND KEYS, BAGS & MORE -- Directly attach Mate to everyday things like keys, bags and other stuff you need to keep track of regularly and use our free app on iOS or Android to find them. Keep track of more for less or give them to your friends and family as a gift. FIND NEARBY -- Use the Tile app to ring your Mate when it’s within 200 ft. or ask your Smart Home device to find it for you. Tile works with Amazon Alexa, Google Assistant, Xfinity, and Siri. FIND FAR AWAY -- When outside of Bluetooth range, use the Tile app to view your Tile’s most recent location or enlist the secure and anonymous help of the Tile Network to aid in your search. FIND YOUR PHONE -- Use your Tile to find your phone, even when it’s on silent. UPGRADE YOUR FINDING EXPERIENCE -- Subscribe to Premium or Premium Protect for proactive finding features and enhanced services including Item Reimbursement, Smart Alerts, and Free Battery Replacement.",
    "total_answered_questions": 239,
    "total_reviews": 24342
  },
  "status": "parse_successful",
  "url": "https://www.amazon.com/dp/B08BNQ9GS1"
}

The following is an example of how to integrate this into a Python Requests-based scraper:


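import requests

# Send the target URL through the ScrapeOps proxy; Requests URL-encodes the params.
response = requests.get(
    url='https://proxy.scrapeops.io/v1/',
    params={
        'api_key': 'YOUR_API_KEY',
        'url': 'https://www.amazon.com/dp/B08BNQ9GS1',
        'auto_extract': 'amazon',  # parser key for Amazon pages
    },
)

# The extracted data is returned as JSON.
print(response.json())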
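
Since the parsers are in Beta, it may be worth checking the status field before reading from data. The sketch below assumes only what the sample response above shows: a top-level status of "parse_successful" and a data object with keys such as name, pricing, average_rating, and total_reviews. Other status values are not documented here, so anything else is treated as a failure. `extract_product_summary` is a hypothetical helper name.

```python
def extract_product_summary(payload: dict) -> dict:
    """Pick a few product fields out of an Auto Extract response.

    Raises ValueError when the status is not "parse_successful"
    (assumption: any other status means the data cannot be relied on).
    """
    status = payload.get("status")
    if status != "parse_successful":
        raise ValueError(f"unexpected status: {status!r}")

    data = payload.get("data") or {}
    # Use .get() so a field missing from a given page yields None instead of KeyError.
    return {
        "name": data.get("name"),
        "price": data.get("pricing"),
        "rating": data.get("average_rating"),
        "reviews": data.get("total_reviews"),
    }


# Trimmed copy of the sample response shown earlier, used here instead of a live call.
sample = {
    "status": "parse_successful",
    "url": "https://www.amazon.com/dp/B08BNQ9GS1",
    "data": {
        "name": "Tile Mate (2020) 2-Pack - Discontinued by Manufacturer",
        "pricing": "$44.95",
        "average_rating": 4.6,
        "total_reviews": 24342,
    },
}
print(extract_product_summary(sample))
```

In a real scraper, the function would be called with `response.json()` from the request above rather than with the hard-coded sample.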