
Oxylabs Residential Proxies: Web Scraping Guide

Oxylabs is a leading provider of premium proxy services and web scraping solutions. They offer a vast pool of IP addresses to help businesses gather data without getting blocked. Their services include residential proxies, data center proxies, and web scraping tools.

This guide aims to help you understand and effectively use Oxylabs residential proxies for web scraping, with detailed instructions, code examples, and practical tips.

Need help scraping the web?

Then check out ScrapeOps, the complete toolkit for web scraping.


TLDR: How to Integrate Oxylabs Residential Proxies

For a quick integration guide:

  1. Install Python Requests

    pip install requests
  2. Set Up Oxylabs Residential Proxy

    import requests

    username = "customer-USER"
    password = "Your_password"
    proxy = "pr.oxylabs.io:7777"

    proxies = {
        'http': f'http://{username}:{password}@{proxy}',
        'https': f'http://{username}:{password}@{proxy}'
    }

    response = requests.get("https://example.com", proxies=proxies)
    print(response.text)

In this script, we:

  1. Set Credentials and Proxy: We enter our Oxylabs credentials (username and password) and the proxy address (pr.oxylabs.io:7777).
  2. Configure Proxies: Next, we create the proxies dictionary to set up the HTTP and HTTPS proxies using our credentials.
  3. Send a Request: Then, we use the requests.get method to send a request to the target website (https://example.com) through the configured proxies.
  4. Print the Response: Finally, we print the response text from the target website.
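As a quick sanity check that traffic is actually routed through the proxy, you can send a request to an IP-echo service and confirm the reported address differs from your own. The sketch below wraps the proxy configuration from the steps above in a small helper; `httpbin.org/ip` is used purely as an illustrative echo endpoint, and the network call is left commented out so you can run it deliberately:

```python
import requests

def build_proxies(username: str, password: str, proxy: str) -> dict:
    """Build a requests-style proxies mapping for an authenticated proxy."""
    entry = f"http://{username}:{password}@{proxy}"
    return {"http": entry, "https": entry}

proxies = build_proxies("customer-USER", "Your_password", "pr.oxylabs.io:7777")

# httpbin.org/ip echoes the IP the request arrived from; through a working
# residential proxy it should differ from your own address.
# response = requests.get("https://httpbin.org/ip", proxies=proxies, timeout=10)
# print(response.json())
```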

Understanding Residential Proxies

Residential proxies are useful for various online activities where anonymity or geo-restricted access is important.

Here’s a breakdown of what residential proxies are and why they are valuable:

What Are Residential Proxies?

Residential proxies act as intermediaries between you and target websites, making your web traffic appear as if it’s coming from a real residential address.

When you use a residential proxy, websites see your requests as coming from a legitimate user, increasing your chances of bypassing blocks and avoiding detection.

Why Are Residential Proxies Important?

Residential proxies offer a layer of anonymity and security, essential for various online activities. They help you:

  • Bypass Geo-Restrictions: Access content restricted to certain locations.
  • Avoid IP Bans: Appear as different users, reducing the risk of being blocked.
  • Improve Data Accuracy: Gather more accurate data for market research and analysis.

Types of Residential Proxies

To better understand the options available, let’s compare the two main types of residential proxies: rotating and static.

Rotating residential proxies and static residential proxies differ primarily in how they manage IP addresses.

  1. Rotating Residential Proxies: Rotating residential proxies automatically change the IP address at regular intervals or with each request. This means that each request appears to come from a different IP address.

  2. Static residential proxies: On the other hand, static residential proxies maintain the same IP address for the duration of the session or until manually changed.

Here’s a breakdown of their features:

| Feature | Rotating Residential Proxies | Static Residential Proxies |
| --- | --- | --- |
| IP Address | Changes with each request | Remains the same for an extended period |
| Anonymity | High | Moderate |
| Speed | Slower due to rotation process | Generally faster |
| Management | Complex | Simpler |
| Risk of Detection | Lower | Higher |
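If you want sticky behavior from a rotating pool, Oxylabs supports pinning a session to one IP via a session flag in the proxy username. The sketch below assumes the `-sessid-` flag described in their session-control documentation; confirm the exact syntax for your plan before relying on it:

```python
import random
import string

def sticky_entry(username: str, password: str, session_id: str) -> str:
    """Build a proxy URL that pins requests to one exit IP for the session.
    The -sessid- flag is an assumption based on Oxylabs' session-control
    docs; verify the syntax for your plan."""
    return f"http://customer-{username}-sessid-{session_id}:{password}@pr.oxylabs.io:7777"

# A random alphanumeric session id; reusing the same id keeps the same exit IP.
session_id = "".join(random.choices(string.ascii_lowercase + string.digits, k=10))
entry = sticky_entry("USER", "Your_password", session_id)
proxies = {"http": entry, "https": entry}
```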

Residential vs. Data Center Proxies

Now, let’s look at how residential proxies compare with data center proxies. This will help you decide which type suits your needs better.

Residential proxies and datacenter proxies differ significantly in their origins and how they are perceived by websites. Residential proxies use IP addresses assigned by Internet Service Providers (ISPs) to homeowners, meaning they are associated with real residential locations.

In contrast, datacenter proxies originate from data centers and cloud service providers rather than ISPs. These IP addresses are not tied to a physical location or residential address.

| Feature | Residential Proxies | Data Center Proxies |
| --- | --- | --- |
| IP Source | Real residential addresses from ISPs | Data centers and cloud service providers |
| Anonymity | High | Lower |
| Speed | Variable, often slower | Generally faster |
| Cost | Higher | Lower |
| Detection Risk | Lower | Higher |
| Effectiveness for Geo-Access | High | Lower |

When Are Residential Proxies Useful?

Residential proxies are beneficial in various scenarios:

  • Web Scraping and Data Collection: Ensure you get accurate data without being blocked by anti-scraping measures.
  • SEO and SERP Analysis: Gather precise search engine results from different locations.
  • Social Media Monitoring: Track social media activities and trends without being flagged.
  • Ad Verification: Check if your ads are displayed correctly in different regions.
  • Geo-Restricted Content Access: Stream content or access websites restricted to specific regions.

Why Choose Oxylabs Residential Proxies?

Oxylabs is a well-known provider of residential proxies, offering a reliable and high-performance solution for users who need access to a vast pool of real IP addresses.

Here are several reasons why Oxylabs' residential proxies stand out:

  • Largest Residential Proxy Network: Oxylabs boasts one of the largest residential proxy networks in the world, with over 100 million IPs from real residential locations across 195 countries.
  • High Anonymity and Security: With Oxylabs residential proxies, you can maintain a high level of anonymity as their IPs are sourced from legitimate residential devices.
  • Global Coverage: Oxylabs provides residential proxies in nearly every country, ensuring that users have access to a broad range of geographic locations.
  • Easy Integration and Use: Oxylabs’ residential proxies are easy to integrate into a wide range of applications and platforms.

Oxylabs is an excellent choice for anyone looking for reliable and high-quality residential proxies. With its massive proxy pool, global coverage, flexible pricing, and robust support, it caters to both small users and enterprises.

Let's check out the pricing of Oxylabs.


Oxylabs Residential Proxy Pricing

Oxylabs offers several pricing plans for their Residential Proxies, catering to different needs.

Pricing Table

Here is the updated pricing structure of Oxylabs residential proxies:

| Plan | Price per GB | Monthly Price |
| --- | --- | --- |
| 7-day free trial (for verified company registrations) | FREE | FREE |
| Pay as you go (up to 50 GB per month) | $8 /GB | No commitment |
| 13 GB | $7.75 /GB | $99 + VAT, billed monthly |
| 40 GB | $7.5 /GB | $300 + VAT, billed monthly |
| 86 GB | $6.98 /GB | $600 + VAT, billed monthly |
| 133 GB | $6.02 /GB | $800 + VAT, billed monthly |
| 318 GB | $5.5 /GB | $1,750 + VAT, billed monthly |
| 1 TB | $4 /GB | $4,000 + VAT, billed monthly |

The above pricing structure shows that charges are primarily based on the bandwidth used. Each plan specifies a price per gigabyte (GB), allowing you to pay according to your data consumption.

For example, the "Pay as you go" plan costs $8 per GB, while larger plans like the 1 TB plan are priced at $4 per GB.

The Pay-As-You-Go plan offers flexibility, letting you pay based on the amount of data you use without committing to a fixed monthly fee. This is ideal for you if you have variable data needs or you prefer not to commit to a monthly subscription.
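The per-GB arithmetic is easy to sanity-check in a few lines of Python (plan figures copied from the table above):

```python
# Price per GB for selected plans, taken from the pricing table above.
price_per_gb = {
    "pay_as_you_go": 8.00,
    "40_gb": 7.50,
    "1_tb": 4.00,
}

def monthly_cost(gb_used: float, rate: float) -> float:
    """Bandwidth-based cost for a pay-per-GB plan."""
    return gb_used * rate

# 25 GB on Pay-As-You-Go costs 25 * $8 = $200.
print(monthly_cost(25, price_per_gb["pay_as_you_go"]))  # 200.0
```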

Pricing Comparison

Generally speaking, when proxy providers offer plans around $2-3 per GB, they are considered cheap. If they offer smaller plans in the $6-8 per GB range, they are more expensive.

For a detailed comparison of Oxylabs and other residential proxy providers, you can use our Proxy Comparison page. Check it out here.


Setting Up Oxylabs Residential Proxies

Creating an Oxylabs Account

To get started, visit the registration page. You can register with your Google account or with a username, email, and password. We'll go with option 2.

Fill in your details then click on the "Register" button.

Register

After you complete the registration, Oxylabs will send a verification link to the email address you just registered.

Verification E-mail Sent

Check your email "Inbox" and you will find an unread email entitled "Activate your Oxylabs account". Click on the "Activate your account" button in the body of the message.

Activate Your Account

And voila, you have completed the registration!

Registration Completed

Now you can subscribe to residential proxies.

Use Residential Proxies

Purchase Residential Proxies from Oxylabs

Visit the pricing page. Select a proxy plan that suits you and click on the respective "Buy now" button. For example, let's go with the Pay-As-You-Go plan.

After clicking the "Buy now" button, we are redirected to another page to specify the type of user we are and if we have changed our mind, we can pick a different plan.

We are a "Regular" user and will stick with the Pay-As-You-Go plan. Choose 8 GB and click the "Buy now" button again.

Choose 8GB Residential Proxies

Go with the default of "1GB" traffic and click on the "Continue" button.

Traffic 1GB

Select payment method, agree with terms and conditions then click on the "Continue" button.

Select Your Payment Method

Fill in your payment details then click on the "Pay" button.

After successful payment, you will be redirected to a page to create your proxy user.

Create Proxy User

Alternatively, you can create or update your proxy users from your dashboard.

Manage Proxy Users

Now you can authenticate your requests as shown below.


Authentication

When using Oxylabs to access proxy services, you have two primary methods for authenticating your requests:

  1. username and password or
  2. whitelisting an IP address.

These methods ensure that your requests are securely routed through Oxylabs' proxy servers.

Method 1: Username & Password Authentication

This method involves providing your Oxylabs account's username and password as part of your proxy configuration. This is a straightforward approach and is ideal when you need to use the proxy from multiple locations or devices.

Here’s a step-by-step guide to using username and password authentication:

  1. Install the required packages: Ensure you have the requests library and python-dotenv to manage environment variables.

    pip install requests python-dotenv
  2. Set up environment variables: Store your Oxylabs username and password in a .env file for security. Create a .env file in your project directory and store your proxy user credentials:

    OXYLABS_USERNAME=your_oxylabs_username
    OXYLABS_PASSWORD=your_oxylabs_password
  3. Load environment variables and configure the proxy: Use the dotenv library to load these variables into your script and configure the proxy settings.

    import requests
    from dotenv import load_dotenv
    import os

    load_dotenv()

    username = os.getenv("OXYLABS_USERNAME")
    password = os.getenv("OXYLABS_PASSWORD")
    proxy = "pr.oxylabs.io:7777"

    proxies = {
        'http': f'http://{username}:{password}@{proxy}',
        'https': f'http://{username}:{password}@{proxy}'
    }

    response = requests.get("https://example.com", proxies=proxies)
    print(response.text)
  4. Run your script: Execute your script to send a request through the Oxylabs proxy with authentication.

Note: Despite setting everything correctly, you may encounter a 407 error when sending requests via your new proxy user's credentials for the first time.

Authentication Error

If that happens, change the proxy user's password and rerun your script. For more error codes and their solutions, check the Error Codes section.

Method 2: Whitelisting an IP Address

Alternatively, you can whitelist your IP address with Oxylabs. This method is beneficial if you are accessing the proxy from a static IP address and prefer not to use credentials in your code.

Here’s how you can set up IP whitelisting:

  1. Log in to the Oxylabs Dashboard: First, log in to your Oxylabs account.
  2. Navigate to Whitelisting: On the left-hand side, select Residential Proxies and then Whitelist.
  3. Edit Whitelist: Click on Edit whitelist.

Whitelist

  4. Add IP Addresses: Enter up to 10 IP addresses in IPv4 format (xx.xx.xx.xx) and click "Submit".

Note: Ensure that these IP addresses are yours and that you are not using a proxy or VPN service when adding them. The SOCKS5 protocol does not support whitelisted IPs. Use other supported protocols for whitelisting.

Finding Your IP Address

Before whitelisting, you need to know your current IP address. Disconnect from any proxies or VPNs and then visit Oxylabs IP Location. The page will display your current IP address in JSON format.
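You can also fetch your address programmatically before whitelisting. The snippet below validates the IPv4 shape locally; the `ip.oxylabs.io/location` URL and its JSON schema are assumptions based on the page referenced above, so inspect the response in your browser first (and run the lookup without any proxy or VPN active):

```python
import re
import requests

IPV4_RE = re.compile(r"^(\d{1,3})(\.\d{1,3}){3}$")

def is_ipv4(addr: str) -> bool:
    """Rough IPv4 shape check before pasting an address into the whitelist."""
    return bool(IPV4_RE.match(addr)) and all(0 <= int(p) <= 255 for p in addr.split("."))

# Endpoint and schema are assumptions -- verify in your browser first.
# data = requests.get("https://ip.oxylabs.io/location", timeout=10).json()
# print(data)
```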

Code Example Using Whitelisted IP

Once you have whitelisted your IP addresses, you can configure your proxy without providing credentials:

import requests

# Configure proxy settings with whitelisted IP
proxy = "pr.oxylabs.io:7777"

proxies = {
    'http': f'http://{proxy}',
    'https': f'http://{proxy}'
}

# Make a request through the Oxylabs residential proxy
response = requests.get("https://example.com", proxies=proxies)

# Print the response content
print(response.text)

In the script above:

  • We set up the proxies dictionary to configure the HTTP and HTTPS proxies using only the proxy address, as your IP is already whitelisted.
  • We use the requests.get function to send a GET request to "https://example.com" through the configured proxy. The response from the server is printed to the console.

Basic Request Using Oxylabs Residential Proxies

To make requests using Oxylabs' residential proxies, you need to configure your request to route through their proxies.

Here’s a straightforward process to make requests using Oxylabs residential proxies:

  1. Set Up Your Environment: Ensure you have the requests library installed, which allows you to send HTTP requests easily.

    pip install requests
  2. Configure Proxy Settings: Use your Oxylabs credentials (username and password) and proxy address to configure the requests library. You need to specify the proxy server in the request settings.

  3. Make a Request: Send HTTP requests through the configured proxy to access web resources.

Code Example Using Python Requests

Let’s look at an example of how we can configure and use Oxylabs residential proxies in a Python script with the requests library:

import requests
from dotenv import load_dotenv
import os

# Load environment variables from .env file
load_dotenv()

# Retrieve Oxylabs credentials from environment variables
username = os.getenv("OXYLABS_USERNAME")
password = os.getenv("OXYLABS_PASSWORD")
proxy = "pr.oxylabs.io:7777"

# Configure proxy settings
proxies = {
    'http': f'http://{username}:{password}@{proxy}',
    'https': f'http://{username}:{password}@{proxy}'
}

# Make a request through the Oxylabs residential proxy
response = requests.get("https://datadome.co/", proxies=proxies)

# Print the response content
print(response.text)

In the script above:

  • We use the dotenv library to load our Oxylabs username and password from a .env file. This practice keeps our credentials secure and out of our code.
  • Then, we set up the proxies dictionary to configure the HTTP and HTTPS proxies using our Oxylabs credentials and proxy address.
  • After, we use the requests.get function to send a GET request to "https://datadome.co/" through the configured proxy.
  • Finally, we print the response from the server to the console.

Handling Proxy Errors

When using proxies, you may encounter errors such as connection timeouts or proxy failures. To handle these gracefully:

try:
    response = requests.get("https://example.com", proxies=proxies, timeout=10)
    response.raise_for_status()
except requests.exceptions.ProxyError:
    print("Proxy error occurred. Please check your proxy settings.")
except requests.exceptions.Timeout:
    print("The request timed out. Try increasing the timeout or check your internet connection.")
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")

This code catches common exceptions, providing informative messages that help diagnose and fix issues with proxy configurations.
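Transient proxy errors often clear on a retry. A small exponential-backoff wrapper (illustrative glue code, not part of the requests API) can be layered on the pattern above:

```python
import time
import requests

def backoff_delays(retries: int, base: float = 2.0) -> list:
    """Delays before each retry: 1s, 2s, 4s, ... for base=2."""
    return [base ** i for i in range(retries - 1)]

def get_with_retries(url: str, proxies: dict, retries: int = 3, base: float = 2.0):
    """GET through a proxy, retrying transient failures with exponential backoff."""
    delays = backoff_delays(retries, base)
    for attempt in range(retries):
        try:
            response = requests.get(url, proxies=proxies, timeout=10)
            response.raise_for_status()
            return response
        except (requests.exceptions.ProxyError, requests.exceptions.Timeout):
            if attempt == retries - 1:
                raise  # out of retries; surface the error to the caller
            time.sleep(delays[attempt])
```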


Country Geotargeting

Geotargeting allows you to connect to proxy servers in specific geographic locations, enabling you to bypass geo-restrictions and access content as if you were a local user in that area.

This capability is particularly useful for tasks such as market research, competitor analysis, and testing localized content.

Oxylabs provides extensive support for country-level geotargeting with their residential proxies. Their network covers 195 countries, offering a wide range of options for accessing location-specific content.

Top 10 Countries Supported by Oxylabs

Here's a table showing 10 popular countries supported by Oxylabs, along with their specific entry nodes:

| Country | Entry Node |
| --- | --- |
| USA | us-pr.oxylabs.io:10000 |
| Canada | ca-pr.oxylabs.io:30000 |
| Great Britain | gb-pr.oxylabs.io:20000 |
| Germany | de-pr.oxylabs.io:30000 |
| France | fr-pr.oxylabs.io:40000 |
| Spain | es-pr.oxylabs.io:10000 |
| Italy | it-pr.oxylabs.io:20000 |
| Sweden | se-pr.oxylabs.io:30000 |
| Greece | gr-pr.oxylabs.io:40000 |
| Portugal | pt-pr.oxylabs.io:10000 |

Using Country-Specific Proxies

To use country-specific proxies with Oxylabs, you have two main options:

  1. Country-Specific Entry Nodes: You can connect to a specific country's proxy pool by using the country's dedicated entry node. For example, to use a proxy from the USA, you would connect to us-pr.oxylabs.io:10000.
  2. Country Code Parameter: Alternatively, you can add a cc flag to the authorization header, specifying the desired country code. This method allows you to use the main entry point while still targeting a specific country.

Let's examine both approaches:

Method A: Country-Specific Entry Nodes

import requests
from dotenv import load_dotenv
import os

load_dotenv()

username = os.getenv("OXYLABS_USERNAME")
password = os.getenv("OXYLABS_PASSWORD")

# Target country (Germany)
proxy = "de-pr.oxylabs.io:30000"

proxies = {
    'http': f'http://{username}:{password}@{proxy}',
    'https': f'http://{username}:{password}@{proxy}'
}

response = requests.get("https://example.com", proxies=proxies)
print(response.text)

Country Geotargeting

Method B: Country Code Parameter

from dotenv import load_dotenv
import os
import requests

load_dotenv()

username = os.getenv("OXYLABS_USERNAME")
password = os.getenv("OXYLABS_PASSWORD")

# Target country (Germany)
country = 'DE'

entry = f'http://customer-{username}-cc-{country}:{password}@pr.oxylabs.io:7777'

proxies = {
    'http': entry,
    'https': entry,
}

response = requests.get('https://example.com', proxies=proxies)
print(response.text)

Country Geotargeting

Both methods achieve the same goal of routing requests through a proxy in a specific country (Germany in these examples). However, they differ in how they specify the target country:

  • Method A uses a country-specific entry node (de-pr.oxylabs.io:30000 for Germany).
  • Method B uses the general entry point (pr.oxylabs.io:7777) and specifies the country in the username, where cc stands for "country code".

Both methods are valid and have their use cases. Method A might be preferred when consistently targeting the same country, while Method B offers more flexibility for frequently changing target countries.
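A small helper makes Method B easy to reuse across countries. `geo_entry` below is hypothetical glue code, not part of any Oxylabs SDK; it simply builds the username format shown above:

```python
def geo_entry(username: str, password: str, country: str = None,
              host: str = "pr.oxylabs.io", port: int = 7777) -> str:
    """Build a proxy URL, optionally adding the cc country flag (Method B)."""
    user = f"customer-{username}"
    if country:
        user += f"-cc-{country}"
    return f"http://{user}:{password}@{host}:{port}"

# Target Germany via the general entry point:
entry = geo_entry("USER", "Your_password", country="DE")
proxies = {"http": entry, "https": entry}
```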


City Geotargeting

City-level geotargeting allows you to connect to proxy servers in specific cities, enabling you to access hyper-local content and conduct precise market research.

This capability is particularly useful for tasks such as local SEO analysis, city-specific price monitoring, and testing localized advertising campaigns.

Oxylabs supports city-level targeting within their residential proxy network, offering an impressive level of granularity for your geotargeting needs.

Top 10 Cities Supported by Oxylabs

While Oxylabs supports every city in the world, here's a table showing 10 popular cities you can target, along with their country codes:

| City | Country | Code Parameter |
| --- | --- | --- |
| New York | US | cc-US-city-new_york |
| London | GB | cc-GB-city-london |
| Paris | FR | cc-FR-city-paris |
| Berlin | DE | cc-DE-city-berlin |
| Madrid | ES | cc-ES-city-madrid |
| Rome | IT | cc-IT-city-rome |
| Stockholm | SE | cc-SE-city-stockholm |
| Athens | GR | cc-GR-city-athens |
| Lisbon | PT | cc-PT-city-lisbon |
| Toronto | CA | cc-CA-city-toronto |

It's important to note that while Oxylabs supports all cities worldwide, the availability of proxies in a specific city at any given time may vary due to the dynamic nature of residential proxies.

Using City-Specific Proxies

To use city-specific proxies with Oxylabs, you need to add both the country code (cc) and city parameters to your request. The format is as follows:

cc-[COUNTRY_CODE]-city-[CITY_NAME]

For example, to target Munich, Germany, you would use: cc-DE-city-munich

Here's a Python code example demonstrating how to use Oxylabs residential proxies with city-specific targeting:

import requests
from dotenv import load_dotenv
import os

load_dotenv()

username = os.getenv("OXYLABS_USERNAME")
password = os.getenv("OXYLABS_PASSWORD")

# Target city (Munich, Germany)
country = 'DE'
city = 'munich'

entry = f'http://customer-{username}-cc-{country}-city-{city}:{password}@pr.oxylabs.io:7777'

proxies = {
    'http': entry,
    'https': entry,
}

response = requests.get('https://example.com', proxies=proxies)
print(response.text)

City Geotargeting

Explanation:

  1. We import the necessary libraries and load environment variables.
  2. We set up our Oxylabs credentials and specify the target country and city (Munich, Germany in this case).
  3. We configure the proxy URL, including both the country code and city in the username.
  4. We make a request to https://example.com using the configured proxy.
  5. Finally, we print the response.

By changing the country and city variables, you can easily target different cities for your requests. This allows you to access city-specific content or test your applications from various urban locations around the world.
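Cycling through several cities is then just a matter of rebuilding the entry string per target. `city_entry` below is an illustrative helper that applies the `cc-[COUNTRY_CODE]-city-[CITY_NAME]` format from above:

```python
def city_entry(username: str, password: str, country: str, city: str) -> str:
    """Proxy URL with the cc + city flags, per the format described above."""
    user = f"customer-{username}-cc-{country}-city-{city}"
    return f"http://{user}:{password}@pr.oxylabs.io:7777"

# Rebuild the entry for each target city (requests left commented out):
for country, city in [("DE", "munich"), ("FR", "paris"), ("IT", "rome")]:
    entry = city_entry("USER", "Your_password", country, city)
    # proxies = {"http": entry, "https": entry}
    # response = requests.get("https://example.com", proxies=proxies)
    print(entry)
```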

Remember that while Oxylabs supports a vast number of cities, the availability of proxies in very specific locations may vary. For a complete list of supported cities, you can refer to the City_list.csv file provided by Oxylabs.


How to Use Static Proxies

Static proxies, also known as ISP proxies, offer a powerful combination of residential and datacenter proxies' strengths. These proxies give you a consistent IP address that remains the same across sessions, providing the high anonymity of residential proxies and the speed of datacenter proxies.

You'll find static proxies particularly useful when you need reliability and a stable IP for long-term tasks.

Key Benefits of Static Proxies

By using static proxies, you gain several advantages:

  • High Speed: Since static proxies originate from datacenter infrastructure, you get fast and efficient connections.
  • Enhanced Anonymity: With IPs assigned by ISPs, these proxies provide a higher level of legitimacy and anonymity.
  • Reliability: You can count on static proxies for consistent performance, thanks to their ISP-backed infrastructure.
  • No IP Rotation: You don't have to worry about rotating IPs, which simplifies long sessions or tasks that require a consistent IP.
  • Private Access: You can set up static proxies as private, ensuring that only you use that specific IP address.

Common Use Cases for Static Proxies

Static proxies are particularly beneficial when IP rotation isn't an option or could cause disruptions:

  • E-commerce Activities: When you're making purchases or managing accounts on e-commerce sites, IP rotation might lead to blocks or bans. Static proxies help you maintain a continuous session, preventing such issues.
  • Social Media Management: If you need to create and manage multiple social media accounts, static proxies provide a stable IP that avoids re-authentication problems.
  • Brand Protection: You can use static proxies to monitor the web for brand abuse, like copyright infringement, without being detected as a bot.
  • Web Scraping: Static proxies enable you to scrape the web quickly and reliably, while appearing as a real user, making it less likely to be flagged or blocked.

Example of Using Static Proxies with Python

Here’s how we can set up and use static proxies in Python:

import requests
from dotenv import load_dotenv
import os

load_dotenv()

username = os.getenv("OXYLABS_USERNAME")
password = os.getenv("OXYLABS_PASSWORD")

# Static proxy server configuration
proxy = "isp.oxylabs.io:8001"

proxies = {
    'http': f'http://{username}:{password}@{proxy}',
    'https': f'http://{username}:{password}@{proxy}'
}

# Making a request through the static proxy
response = requests.get("https://example.com", proxies=proxies)
print(response.text)

Explanation:

  1. We start by importing the necessary libraries and loading our environment variables.
  2. We then set up the static proxy configuration, using a proxy server (isp.oxylabs.io:8001). This server ensures that our IP remains consistent across all sessions. You can buy Oxylabs Static IP proxies here.
  3. Finally, we make a request to https://example.com through the static proxy and print the response.

By using static proxies in this way, we combine speed, anonymity, and reliability, making them a great choice for various online tasks.


Error Codes

When using a proxy, you might encounter various HTTP error codes that indicate issues with your connection.

Below is a list of common proxy error codes along with explanations and suggested solutions to help you manage and resolve these issues effectively:

| Error Code | Explanation | Solution |
| --- | --- | --- |
| 100 - Continue | The server has received the request header, and you can proceed with sending the body of the request. | Typically, no action is needed unless additional instructions are given. |
| 101 - Switching Protocols | The server is switching communication protocols as requested by the client. | No action needed; the server has acknowledged the protocol switch. |
| 102 - Processing (WebDAV) | The server is processing a complex request and has not yet completed it. | Wait for the server to complete the request processing. |
| 103 - Early Hints | The server is about to send a final response and provides preliminary information. | No action needed; the final response will follow. |
| 301 - Moved Permanently | The requested resource has been permanently moved to a new URL. | Follow the new URL provided by the server. |
| 305 - Use Proxy | The requested resource can only be accessed via a proxy. | Connect to the specified proxy server and retry the request. |
| 306 - Switch Proxy | The client should use a different proxy server for the request. | Connect using a different proxy server. |
| 307 - Temporary Redirect | The client is temporarily redirected to a different location. | Follow the redirect and make the request again. |
| 400 - Bad Request | The request contains errors or malformed syntax. | Review and correct the request, then try again. |
| 401 - Unauthorized | Authentication is required to access the resource. | Provide the necessary authorization details. |
| 403 - Forbidden | Access to the requested resource is forbidden. | Verify permissions and credentials, and ensure you are authorized to access the resource. |
| 404 - Not Found | The requested resource is not available at the specified URL. | Double-check the URL and try again. |
| 407 - Proxy Authentication Required | Authentication with the proxy server is required. | Update proxy server settings with correct credentials and whitelisted IPs. |
| 408 - Request Timeout | The server timed out waiting for the client’s request. | Check your internet connection and retry the request. |
| 429 - Too Many Requests | Too many requests have been sent in a short period from the same IP. | Rotate IP addresses and introduce time delays between requests. |
| 502 - Bad Gateway | The proxy or gateway received an invalid response from the upstream server. | Clear cache and cookies, and try changing DNS settings. |
| 503 - Service Unavailable | The server is currently unable to handle the request, possibly due to overload or maintenance. | Rotate your IP address or try using a different proxy server. |

Understanding these error codes and implementing the appropriate solutions will help ensure smoother operations and fewer interruptions during your scraping activities.
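When scripting against these codes, a coarse triage of the table is often enough. The category names below are illustrative, not a standard:

```python
# Rough triage map derived from the error-code table above.
RETRYABLE = {408, 429, 502, 503}   # transient: back off, rotate, retry
AUTH = {401, 407}                  # fix credentials or whitelist first

def classify(status: int) -> str:
    """Return a coarse handling strategy for a proxy response status."""
    if status in AUTH:
        return "fix-credentials"
    if status in RETRYABLE:
        return "retry"
    if 300 <= status < 400:
        return "follow-redirect"
    if status >= 400:
        return "fix-request"
    return "ok"

print(classify(407))  # fix-credentials
```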


KYC (Know-Your-Customer) Verification

Oxylabs upholds strict Know-Your-Customer (KYC) standards to safeguard customers, consumers, and the internet from malicious use of their solutions.

Oxylabs uses a multi-layered KYC approach that moves each customer through several distinct stages, depending on an automated and manual initial assessment. The end goal is the same in every case: to create accountability and identify the person(s) intending to purchase their services.

  1. Initial Assessment: Customers provide personal and business information, verified through independent sources. The goal is to ensure accountability and prevent misuse.

  2. Verification: Depending on risk factors, ID verification, compliance calls, and risk questionnaires may be required.

  3. Post-onboarding: Continued due diligence is conducted to ensure that services are used as agreed. Regardless of the results gleaned from the KYC questionnaire, some use cases are outright forbidden and are not subject to assessment or negotiations.

Check out the KYC policy of Oxylabs to get more information about the process.


Implementing Oxylabs Residential Proxies in Web Scraping

In this section, we will demonstrate how to use Oxylabs residential proxies with various libraries. We'll focus on US geotargeting and rotating proxies, using the same example across different libraries to scrape the title text from https://example.com.

Python Requests

To integrate Oxylabs proxies with Python Requests, follow these steps:

  1. Set up the proxy and authentication:

    username = "customer-USER"
    password = "Your_password"
    proxy = "pr.oxylabs.io:7777"

    proxies = {
        'http': f'http://{username}:{password}@{proxy}',
        'https': f'http://{username}:{password}@{proxy}'
    }
  2. Use the proxy to scrape the title:

    """
    Run `pip install requests beautifulsoup4` to install the libraries
    """
    import requests
    from bs4 import BeautifulSoup

    url = "https://example.com"
    response = requests.get(url, proxies=proxies)
    soup = BeautifulSoup(response.content, "html.parser")
    title = soup.title.string
    print(title)

In this example, we first configure the proxy settings with the required authentication. Then, we use requests.get to fetch the webpage content and BeautifulSoup to extract the title text.

Oxylabs Residential Proxies with Python Requests

Python Selenium

To set up Oxylabs proxies with Selenium for browser automation, follow these steps:

  1. Download the Proxy Auth Extension: Get authentication extension from GitHub.

  2. Configure the proxy and authentication:

    """
    Run `pip install selenium webdriver-manager` to install the libraries.
    """
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service as ChromeService
    from selenium.webdriver.chrome.options import Options
    from webdriver_manager.chrome import ChromeDriverManager
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.by import By
    from proxy_auth_extension import create_proxy_auth_extension
    import os

    username = "customer-USER"
    password = "Your_password"
    proxy_host = "pr.oxylabs.io"
    proxy_port = 7777

    proxy_auth_extension = create_proxy_auth_extension(proxy_host, int(proxy_port), username, password)

    # Set up Chrome options
    chrome_options = Options()
    chrome_options.add_extension(proxy_auth_extension)
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--disable-dev-shm-usage')
    chrome_options.add_argument('--disable-blink-features=AutomationControlled')

    driver = webdriver.Chrome(service=ChromeService(ChromeDriverManager().install()), options=chrome_options)

    driver.get('https://example.com')

    WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.TAG_NAME, "title")))

    # Scrape the title
    title = driver.title
    print(f"Scraped Title: {title}")
    os.remove(proxy_auth_extension)

We configure the Selenium WebDriver to use Oxylabs proxies by adding the proxy authentication extension to Chrome options. We then navigate to the target URL, extract the title text, and remove the extension.
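The `create_proxy_auth_extension` helper imported above is not shown in the guide. One common way to implement it is to package a tiny Chrome extension that fixes the proxy server and answers the authentication challenge. The sketch below is an assumption, not Oxylabs' code, and it uses manifest v2 (newer Chrome releases are phasing MV2 out, so you may need to adapt it to manifest v3):

```python
import zipfile

def create_proxy_auth_extension(proxy_host, proxy_port, username, password,
                                path="proxy_auth_extension.zip"):
    """Package a minimal Chrome extension that sets a fixed proxy and
    answers the proxy's authentication challenge."""
    manifest = """{
        "version": "1.0.0",
        "manifest_version": 2,
        "name": "Oxylabs Proxy Auth",
        "permissions": ["proxy", "webRequest", "webRequestBlocking", "<all_urls>"],
        "background": {"scripts": ["background.js"]}
    }"""
    background = f"""
    chrome.proxy.settings.set({{
        value: {{
            mode: "fixed_servers",
            rules: {{ singleProxy: {{ scheme: "http", host: "{proxy_host}", port: {proxy_port} }} }}
        }},
        scope: "regular"
    }}, function() {{}});

    chrome.webRequest.onAuthRequired.addListener(
        function(details) {{
            return {{ authCredentials: {{ username: "{username}", password: "{password}" }} }};
        }},
        {{ urls: ["<all_urls>"] }},
        ["blocking"]
    );
    """
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("manifest.json", manifest)
        zf.writestr("background.js", background)
    return path
```

Saved as `proxy_auth_extension.py` next to the Selenium script, this matches the `from proxy_auth_extension import create_proxy_auth_extension` import used above.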

Oxylabs Residential Proxies with Python Selenium

Python Scrapy

To integrate Oxylabs proxies with Scrapy for web scraping, follow these steps:

  1. Enable Scrapy's built-in proxy middleware in settings:

    # settings.py
    DOWNLOADER_MIDDLEWARES = {
        'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 1,
    }
  2. Create a Scrapy spider that attaches the proxy to each request and scrapes the title:

    import scrapy

    class ExampleSpider(scrapy.Spider):
        name = "example"
        start_urls = ["https://example.com"]

        def start_requests(self):
            # HttpProxyMiddleware picks the proxy up from request meta
            proxy = 'http://customer-USER:Your_password@pr.oxylabs.io:7777'
            for url in self.start_urls:
                yield scrapy.Request(url, meta={'proxy': proxy})

        def parse(self, response):
            title = response.xpath('//title/text()').get()
            self.log(f'Title: {title}')

In this example, we enable Scrapy's built-in HttpProxyMiddleware in the settings file and set the proxy on each request through its meta dictionary (Scrapy does not read a custom HTTP_PROXY setting on its own). The spider then fetches the target URL and extracts the title text using XPath.
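If you would rather not repeat the proxy in every spider, a small custom downloader middleware can attach it to all requests project-wide. A minimal sketch follows; the `myproject.middlewares` module path is an assumption, so adjust it to your project layout:

```python
# Register in settings.py (module path is hypothetical):
# DOWNLOADER_MIDDLEWARES = {"myproject.middlewares.OxylabsProxyMiddleware": 350}

class OxylabsProxyMiddleware:
    """Attach the Oxylabs residential proxy to every outgoing request."""

    proxy_url = "http://customer-USER:Your_password@pr.oxylabs.io:7777"

    def process_request(self, request, spider):
        # Scrapy's HttpProxyMiddleware honours request.meta["proxy"]
        request.meta.setdefault("proxy", self.proxy_url)
        return None  # let the request continue through the middleware chain
```

Using `setdefault` means an individual request can still override the proxy by setting its own `meta["proxy"]` value.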

Oxylabs Residential Proxies with Python Scrapy

Node.js Puppeteer

To set up Oxylabs proxies with Puppeteer for browser automation, follow these steps:

import puppeteer from 'puppeteer';

const run = async () => {
  try {
    const browser = await puppeteer.launch({
      args: [
        `--proxy-server=pr.oxylabs.io:7777`
      ]
    });

    const page = await browser.newPage();
    await page.authenticate({
      username: 'customer-USER',
      password: 'Your_password'
    });

    await page.goto('https://example.com');
    const title = await page.title();
    console.log('Page title:', title);

    await browser.close();
  } catch (error) {
    console.error('An error occurred:', error);
  }
};

run();

We configure Puppeteer to use Oxylabs proxies by launching the browser with the proxy server argument. We then authenticate the proxy, navigate to the target URL, and extract the title text.

Oxylabs Residential Proxies with NodeJS Puppeteer

Node.js Playwright

To set up Oxylabs proxies with Playwright for browser automation, follow these steps:

import { chromium } from 'playwright'; // Run `npx playwright install` to download headless browsers

const run = async () => {
  const browser = await chromium.launch({
    proxy: {
      server: 'http://pr.oxylabs.io:7777',
      username: 'customer-USER',
      password: 'Your_password'
    }
  });
  const page = await browser.newPage();
  await page.goto('https://example.com');
  const title = await page.title();
  console.log(title);
  await browser.close();
};

run();

We configure Playwright to use Oxylabs proxies by passing the proxy server and credentials directly to the launch options. We then navigate to the target URL and extract the title text.

Oxylabs Residential Proxies with NodeJS Playwright


Case Study: Scrape Amazon Prices with Oxylabs Proxies

In this case study, we will scrape price information for a product on Amazon's Spanish and Portuguese websites using Puppeteer and Oxylabs proxies. This demonstrates how regional pricing strategies can be observed by changing IP addresses.

We will:

  1. Configure Puppeteer to use Oxylabs proxies by launching the browser with the proxy server argument,
  2. Authenticate the proxy,
  3. Navigate to the target URL, and
  4. Extract the product title and price.

Code Example

import puppeteer from 'puppeteer';
import dotenv from 'dotenv';
import { fileURLToPath } from 'url';
import { dirname, join } from 'path';

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);
dotenv.config({ path: join(__dirname, '.env') });

const username = process.env.OXYLABS_USERNAME;
const password = process.env.OXYLABS_PASSWORD;

if (!username || !password) {
  console.error('Please set OXYLABS_USERNAME and OXYLABS_PASSWORD in your .env file');
  process.exit(1);
}

const spanish_proxy = "es-pr.oxylabs.io:10000";
const portuguese_proxy = "pt-pr.oxylabs.io:10000";

async function scrapeAmazonPrice(proxy, url) {
  const browser = await puppeteer.launch({
    args: [`--proxy-server=${proxy}`],
    headless: false, // Set to true for production
  });

  try {
    const page = await browser.newPage();
    await page.authenticate({ username, password });

    console.log(`Navigating to ${url} using proxy ${proxy}`);
    await page.goto(url, { waitUntil: 'networkidle2', timeout: 60000 });

    // Handle the cookie consent pop-up
    try {
      console.log('Checking for pop-up...');
      await page.waitForSelector('input[id="sp-cc-accept"]', { timeout: 5000 });
      console.log('Pop-up found. Clicking "Aceptar" button...');
      await page.click('input[id="sp-cc-accept"]');
      await page.waitForNavigation({ waitUntil: 'networkidle2' });
      console.log('Pop-up handled successfully.');
    } catch (error) {
      console.log('No pop-up found or unable to click. Proceeding with scraping.');
    }

    // Wait for the price element to load
    await page.waitForSelector('.a-price', { timeout: 30000 });

    // Extract the price
    const price = await page.evaluate(() => {
      const priceElement = document.querySelector('.a-price .a-offscreen');
      if (!priceElement) {
        console.log('Price element not found. HTML:', document.body.innerHTML);
        return 'Price not found';
      }
      return priceElement.textContent.trim();
    });

    // Extract the product title
    const title = await page.evaluate(() => {
      const titleElement = document.querySelector('#productTitle');
      return titleElement ? titleElement.textContent.trim() : 'Title not found';
    });

    console.log(`Scraped data - Title: ${title}, Price: ${price}`);
    return { title, price };
  } catch (error) {
    console.error('An error occurred:', error);
    return { title: 'Error', price: 'Error', error: error.message };
  } finally {
    await browser.close();
  }
}

async function compareAmazonPrices(productUrl) {
  console.log('Scraping with Spanish IP...');
  const spanishResult = await scrapeAmazonPrice(spanish_proxy, productUrl);

  console.log('Scraping with Portuguese IP...');
  const portugueseResult = await scrapeAmazonPrice(portuguese_proxy, productUrl);

  console.log('\nResults:');
  console.log('Spanish IP:', spanishResult); // e.g., 21.20 EUR
  console.log('Portuguese IP:', portugueseResult); // e.g., 14.75 EUR

  // Check for scraping failures before comparing prices
  if (spanishResult.price === 'Price not found' || portugueseResult.price === 'Price not found') {
    console.log('\nUnable to compare prices due to scraping issues.');
  } else if (spanishResult.price !== portugueseResult.price) {
    console.log('\nThe price differs based on the IP address used.');
  } else {
    console.log('\nThe price is the same for both IP addresses.');
  }
}

// Example usage
const amazonProductUrl = 'https://www.amazon.es/Harry-Potter-Crochet-Kits/dp/1684128870';
compareAmazonPrices(amazonProductUrl);
  • Environment Setup:
    • We import necessary modules and configure environment variables using dotenv.
    • Oxylabs credentials (username and password) are fetched from a .env file.
  • Proxy and Target URL:
    • The function scrapeAmazonPrice is configured to scrape Amazon product data while using Spanish and Portuguese proxies.
    • Puppeteer handles proxy server arguments, allowing for requests from different locations.
  • Data Extraction:
    • The product title and price are extracted from the Amazon product page.
    • Puppeteer interacts with pop-ups, which can often appear on Amazon pages.

Comparison of Prices: Spain vs Portugal

This example compares the price of the product Harry Potter Crochet Kit from Amazon's Spanish and Portuguese versions.

  • Spanish Price: €21,20
  • Portuguese Price: €14,75

The results show that the price may vary by region, highlighting the importance of regional pricing strategies. Several factors can impact pricing, such as:

  1. Regional Pricing Strategies: Like many companies, Amazon adjusts prices based on region to reflect local market conditions, taxes, and shipping costs.
  2. Local Competition and Demand: Local demand and competition can influence pricing, causing variations between regions.
  3. Currency and Economic Conditions: Exchange rates and the general economic conditions of a region can also lead to different pricing models.
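
The scraper above returns prices as locale-formatted strings (e.g. "21,20 €"), so comparing them as text only tells you whether they differ. To compute the actual gap, the strings can be normalized to numbers first. A small sketch (the `parse_eur_price` helper is our own illustration):

```python
def parse_eur_price(text):
    """Convert a European-formatted price string like '21,20 €' or
    '€14,75' to a float for numeric comparison."""
    cleaned = (text.replace("€", "").replace("EUR", "").strip()
                   .replace(".", "")    # drop thousands separators
                   .replace(",", "."))  # decimal comma -> decimal point
    return float(cleaned)

spanish = parse_eur_price("21,20 €")
portuguese = parse_eur_price("€14,75")
difference = round(spanish - portuguese, 2)
print(f"Price difference: {difference} EUR")  # Price difference: 6.45 EUR
```

With numeric values you can also flag only differences above a threshold, rather than reacting to every minor fluctuation.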

Troubleshooting Tips

  • Verify Regional Content: Ensure that you are accessing the correct regional content by checking the website's URL and any location settings.
  • Use Accurate Proxies: For accurate results, use residential proxies specific to the country you are targeting.
  • Monitor Element Changes: Website structure might differ across regions; ensure your selectors are correctly targeting the desired elements.
  • Analyze HTTP Responses: Check for any region-specific redirects or content adjustments in the HTTP responses.
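
To inspect region-specific redirects, you can summarize the hops a request went through. The sketch below is a small helper of our own; with Python Requests, the `history` argument would typically be built as `[(r.status_code, r.headers.get("Location")) for r in response.history]`:

```python
def summarize_redirects(history, final_url):
    """Summarize redirect hops from an HTTP response.

    history: list of (status_code, location) tuples for each redirect hop.
    """
    lines = [f"{status} -> {location}" for status, location in history]
    lines.append(f"final: {final_url}")
    return lines

# Example with a single hypothetical 301 hop:
for line in summarize_redirects([(301, "https://www.amazon.es/")],
                                "https://www.amazon.es/"):
    print(line)
```

If the final URL lands on a different regional domain than the one you requested, the proxy's geolocation (or the site's geo-routing) is likely the cause.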

Alternative: ScrapeOps Residential Proxy Aggregator

The ScrapeOps Residential Proxy Aggregator offers a compelling alternative to traditional proxy providers.

With its unique features and competitive pricing, it stands out as a robust solution for web scraping needs. Here’s why you should consider using it:

  1. Competitive Pricing: ScrapeOps offers lower pricing, allowing you to maximize your budget while maintaining high quality.
  2. Flexible Plans: With ScrapeOps, you have access to a wider variety of plans, including smaller, more affordable options tailored to your needs. The best part? You can start using the proxies with a free trial account.
  3. Enhanced Reliability: By leveraging multiple proxy providers through a single port, ScrapeOps offers greater reliability. If one provider faces issues, your requests can seamlessly switch to another, ensuring continuous access.

Using ScrapeOps with Python Requests

Here is an example of how to use the ScrapeOps Residential Proxy Aggregator with Python Requests:

import requests
from bs4 import BeautifulSoup
from dotenv import load_dotenv
import os

load_dotenv()

username = 'scrapeops'
api_key = os.getenv("SCRAPEOPS_API_KEY")
proxy = 'residential-proxy.scrapeops.io'
port = 8181

proxies = {
    "http": f"http://{username}:{api_key}@{proxy}:{port}",
    "https": f"http://{username}:{api_key}@{proxy}:{port}"
}

response = requests.get('https://plainenglish.io/', proxies=proxies)

# Check if the request was successful
if response.status_code == 200:
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find the first 10 items with class 'mob-col-100'
    items = soup.find_all(class_='mob-col-100', limit=10)
    for i, item in enumerate(items, 1):
        print(f"Item {i}: {item.get_text(strip=True)}")
else:
    print(f"Failed to retrieve content. Status code: {response.status_code}")

Explanation:

Proxy Configuration:

  • We set up the proxy configuration by specifying the username (scrapeops), the API key (api_key), the proxy server (residential-proxy.scrapeops.io), and the port (8181). This configuration allows us to route our HTTP and HTTPS requests through the ScrapeOps Residential Proxy.
  • Our proxy dictionary proxies contains both http and https proxy settings, ensuring that all types of requests are routed through the proxy.

Authentication and Routing:

  • We use basic authentication (http://username:api_key@proxy:port). We combine the username and api_key with the proxy server and port to create the authenticated proxy URL.
  • This setup ensures that our requests made using the requests library are authenticated and routed through ScrapeOps’ residential proxies. This provides access to multiple IP addresses, making it harder for target websites to block our requests.

Request Execution:

  • We use the requests.get function to make a GET request to https://plainenglish.io/ with the proxies parameter. This means our request is sent via the ScrapeOps Residential Proxy.
  • By using the proxy, our request benefits from automatic IP rotation, residential IPs, and other anti-bot measures provided by ScrapeOps. This enhances the success rate of our web scraping and reduces the risk of being blocked.

Handling the Response:

  • We check the response from the target website for a successful status code (200). If the request is successful, we parse the response content using BeautifulSoup to extract and print the first 10 items with the class mob-col-100.
  • If the request fails, we print the status code, indicating an issue with retrieving the content. This could be due to various reasons like network issues, proxy configuration problems, or target site restrictions.
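
Transient failures (timeouts, temporary proxy errors, the occasional non-200 status) are common in proxied scraping, so a simple retry loop often recovers them. A minimal sketch under our own assumptions (the helper name and backoff values are illustrative):

```python
import time
import requests

def get_with_retries(url, proxies, attempts=3, backoff=2.0):
    """Retry a proxied GET on transient failures (connection errors or
    non-200 responses), waiting a little longer before each retry."""
    for attempt in range(1, attempts + 1):
        try:
            response = requests.get(url, proxies=proxies, timeout=30)
            if response.status_code == 200:
                return response
            print(f"Attempt {attempt} failed with status {response.status_code}")
        except requests.RequestException as exc:
            print(f"Attempt {attempt} raised {exc!r}")
        if attempt < attempts:
            time.sleep(backoff * attempt)
    return None  # all attempts exhausted
```

Because the ScrapeOps aggregator rotates IPs between requests, a retried request is usually sent from a different residential IP, which is often enough to get past a one-off block.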

ScrapeOps Residential Proxy Aggregator

Apart from enjoying one of the most reliable and affordable residential proxy services, you can start using the ScrapeOps Residential Proxy Aggregator without paying anything: the free trial includes 100MB of bandwidth.

For more details, visit the documentation.


Ethical Considerations

When using residential proxies for web scraping, it's crucial to consider the ethical implications and legal responsibilities.

  • Ethical Sourcing of Proxies: When choosing a proxy provider, it's crucial to consider if they promote their proxies as being ethically sourced. This means ensuring that the underlying IP holder has opted in for their IP address to be used for data gathering.
  • Oxylabs' Policies: Oxylabs is a strong advocate of ethical business practices, operating strictly within the capacities of an established legitimate proxy pool. This ensures that their residential proxies are ethically sourced and that end-users have given documented and explicit consent.
  • Importance of Scraping Ethically: Oxylabs has established clear standards for residential proxy acquisition. They emphasize fairness and transparency in their operations, ensuring that residential proxies are obtained with the full consent of the IP holders. Their approach includes rewarding network participants and maintaining high standards of ethics and transparency throughout the procurement process.
  • User Consent and Awareness: Oxylabs' policies require that people who choose to share their unused internet traffic are presented with clear information about their participation. The intention to share internet traffic with third parties must be explicitly stated in the Terms and Conditions.
  • Supplier Vetting: Oxylabs has a strict vetting process for their residential proxy providers. They have set explicit contractual obligations to ensure that end-users are aware and that their consent is documented. They are committed to terminating collaborations with providers who fail to meet these high standards.

For more information, refer to Oxylabs' Residential Proxy Pool Handbook and their whitepaper on proxy procurement processes and policies.


Conclusion

Residential proxies are crucial for web scraping, offering anonymity and helping to bypass geo-restrictions and anti-bot defenses. Throughout this guide, we've explored the world of residential proxies, with a particular focus on Oxylabs' offerings.

Oxylabs provides a robust residential proxy service with global coverage and advanced targeting options, while practical examples showed how to integrate them with popular scraping tools.

As you embark on your web scraping projects, we encourage you to implement residential proxies to enhance your data gathering capabilities. By using residential proxies, you can effectively avoid IP bans, overcome rate limiting, and access geo-restricted content. However, always remember to scrape responsibly, respecting website terms of service and adhering to ethical guidelines.


More Web Scraping Guides

At ScrapeOps, we offer a wide range of learning resources for every skill level—whether you're just starting out or an experienced developer, we've got something for you.

If you would like to learn more about Web Scraping with Python, then be sure to check out Python Web Scraping Playbook or check out one of our more in-depth guides: