
Introduction

Welcome to the ScrapeOps documentation pages. Here you will find information on how to integrate and use our products.


💻 Demo

🔗 ScrapeOps Dashboard Demo


💻 ScrapeOps Proxy API Aggregator

ScrapeOps Proxy API Aggregator is an easy-to-use proxy that gives you access to the best-performing Proxy APIs via a single endpoint. We take care of finding the best proxies, so you can focus on the data.

To use the ScrapeOps Proxy API Aggregator, you first need an API key which you can get by signing up for a free account here.

🚀 Getting Started

To make requests, send the URL you want to scrape to the ScrapeOps Proxy API endpoint https://proxy.scrapeops.io/v1/, adding your API key and the target URL to the request using the api_key and url query parameters:


curl -k "https://proxy.scrapeops.io/v1/?api_key=YOUR_API_KEY&url=http://httpbin.org/anything"
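
If you are working in Python, here is a minimal sketch of the same request using the requests library (the endpoint and the api_key and url parameters match the curl example above; the target URL http://httpbin.org/anything is just an example):

import requests

# Send the target URL to the ScrapeOps Proxy API endpoint,
# passing your API key and the URL as query parameters.
response = requests.get(
    "https://proxy.scrapeops.io/v1/",
    params={
        "api_key": "YOUR_API_KEY",
        "url": "http://httpbin.org/anything",
    },
)

print(response.status_code)  # response code returned for the request
print(response.text)         # raw HTML content of the target URL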

After receiving a response from one of our proxy providers, the ScrapeOps Proxy API Aggregator responds with the raw HTML content of the target URL along with a response code:


<html>
<head>
...
</head>
<body>
...
</body>
</html>

With the ScrapeOps Proxy API Aggregator you are only charged for successful requests (200 and 404 status codes).

To learn how to use the ScrapeOps Proxy API Aggregator and customise it to your requirements, check out the QuickStart Guide.


🏠 ScrapeOps Residential Proxy Aggregator

ScrapeOps Residential Proxy Aggregator is an easy-to-use proxy that gives you access to the best-performing Residential Proxy providers via a single proxy port. We take care of finding the best proxies, so you can focus on the data.

To use the ScrapeOps Residential Proxy Aggregator, you first need an API key which you can get by signing up for a free account here.

🚀 Getting Started

To make requests, set your proxy to the ScrapeOps Residential Proxy Port http://scrapeops:YOUR_API_KEY@residential-proxy.scrapeops.io:8181 and send the URLs you want to scrape through it.

The username for the proxy is scrapeops and the password is your API key.


curl -x "http://scrapeops:YOUR_API_KEY@residential-proxy.scrapeops.io:8181" "https://httpbin.org/ip"


Here are the individual connection details:

  • Proxy: residential-proxy.scrapeops.io
  • Port: 8181
  • Username: scrapeops
  • Password: YOUR_API_KEY
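
If you are integrating from Python, here is a minimal sketch of routing a requests call through the residential proxy port using the connection details above (the target URL https://httpbin.org/ip is just an example):

import requests

# Build the proxy URL from the connection details above:
# username scrapeops, password YOUR_API_KEY.
proxy_url = "http://scrapeops:YOUR_API_KEY@residential-proxy.scrapeops.io:8181"

# Route both HTTP and HTTPS traffic through the residential proxy port.
proxies = {
    "http": proxy_url,
    "https": proxy_url,
}

response = requests.get("https://httpbin.org/ip", proxies=proxies)
print(response.text)  # the IP returned should be the proxy's, not your own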

With the ScrapeOps Residential Proxy Aggregator you are charged for bandwidth consumed.

To learn how to use the ScrapeOps Residential Proxy Aggregator and customise it to your requirements, check out the QuickStart Guide.


📊 ScrapeOps Monitoring

ScrapeOps Monitoring is a monitoring tool purpose-built for web scraping. With a simple 30-second install of one of our SDKs, your scraper's performance & error stats will be automatically aggregated and shipped to your ScrapeOps dashboard.

Features & Functionality

ScrapeOps Monitoring gives you the following features & functionality:

  • Scrapy Job Stats & Visualisation

    • 📈 Individual Job Progress Stats
    • 📊 Compare Jobs versus Historical Jobs
    • 💯 Job Stats Tracked
      • Pages Scraped & Missed
      • Items Parsed & Missed
      • Item Field Coverage
      • Runtimes
      • Response Status Codes
      • Success Rates & Average Latencies
      • Errors & Warnings
      • Bandwidth
  • Health Checks & Alerts

    • 🔍 Custom Spider & Job Health Checks
    • 📦 Out of the Box Alerts - Slack (More coming soon!)
    • 📑 Daily Scraping Reports

🚀 Getting Started

To use ScrapeOps Monitoring, you first need to create a free account and get your free API key.

Currently ScrapeOps integrates with both Python Requests & Python Scrapy scrapers:

  1. Python Requests Integration
  2. Python Scrapy Integration

More ScrapeOps Monitoring integrations are on the way.
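
To give a feel for how small the install is, a Scrapy integration typically comes down to a few lines in your project's settings.py. The setting and extension names below are assumptions based on the scrapeops-scrapy SDK, so treat this as a sketch and follow the Python Scrapy Integration guide for the exact configuration:

# settings.py (sketch; assumes the scrapeops-scrapy SDK is installed)
SCRAPEOPS_API_KEY = "YOUR_API_KEY"

EXTENSIONS = {
    # ScrapeOps monitoring extension that aggregates job stats
    # and ships them to your ScrapeOps dashboard.
    "scrapeops_scrapy.extension.ScrapeOpsMonitor": 500,
}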


ScrapeOps Server Manager & Scheduler

ScrapeOps Server Manager & Job Scheduler is an easy-to-use server integration that enables you to deploy, manage and schedule your scrapers from the ScrapeOps dashboard.

There are two options to integrate ScrapeOps with your servers:

  1. Via SSH (Recommended)
  2. Via Scrapyd Server HTTP Endpoints (Only Applicable to Python Scrapy)
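
For context on option 2, Scrapyd exposes a JSON HTTP API that ScrapeOps can call to schedule and manage jobs. As a rough sketch, scheduling a job against that API directly with Python looks like this (the server address localhost:6800 and the project and spider names are placeholders):

import requests

# Scrapyd's schedule.json endpoint starts a run of the given spider.
response = requests.post(
    "http://localhost:6800/schedule.json",
    data={"project": "myproject", "spider": "myspider"},
)
print(response.json())  # e.g. {"status": "ok", "jobid": "..."}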

Features & Functionality

ScrapeOps Server Manager & Job Scheduler gives you the following features & functionality:

  • SSH Server Management
    • 🔗 Integrate With Any SSH-Capable Server
    • 🕷 Deploy scrapers directly from GitHub to your servers.
    • Schedule Periodic Jobs
  • ScrapyD Cluster Management
    • 🔗 Integrate With ScrapyD Servers
    • Schedule Periodic Jobs
    • 💯 All Scrapyd JSON API Endpoints Supported
    • 🔐 Secure Your ScrapyD with BasicAuth, HTTPS or Whitelisted IPs

To learn how to set up the ScrapeOps server integration, check out this guide.