Geonode Residential Proxies: Web Scraping Guide

Residential proxies offer a powerful way to appear as genuine users while accessing restricted content and bypassing anti-scraping defenses. Geonode’s residential proxies help ensure high success rates for web scraping, enabling access to geo-restricted content while reducing the risk of detection and IP bans. This guide walks you through integrating Geonode's proxies into your web scraping projects.

TLDR: How to Integrate Geonode Residential Proxy

To integrate Geonode Residential Proxy into your Python web scraping project:
  • Sign up for a Geonode account and complete the KYC process.
  • Obtain your credentials (username and password) from the dashboard.
  • Use this code snippet to make requests via Geonode’s proxy:
import requests

# Proxy configuration
username = "YOUR_USERNAME"
password = "YOUR_PASSWORD"
endpoint = "YOUR_ENDPOINT"

proxy_url = f"http://{username}:{password}@{endpoint}"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

# Make a request
url = "http://google.com"
response = requests.get(url, proxies=proxies)

print(f"Status Code: {response.status_code}")
print(f"Content: {response.text}")

Understanding Residential Proxies

A residential proxy is a type of proxy server that routes your internet traffic through an IP address assigned to a physical device, such as a home computer or mobile phone, rather than a data center. These IP addresses are tied to real residential internet service providers, making the traffic appear as if it originates from a genuine household.

Why Are Residential Proxies Important?

Residential proxies provide enhanced privacy, reliability, and accessibility for various online tasks, especially in environments where anonymity, security, and the ability to bypass restrictions are critical. Here are some common use cases:
  • Market Research: Scraping pricing and competitor data without being flagged.
  • Ad Verification: Ensuring ads are displayed correctly across different regions.
  • Accessing Geo-restricted Content: Viewing content restricted to specific locations.
  • E-commerce: Managing multiple accounts or monitoring product availability.
  • SEO & Analytics: Collecting accurate SERP data without triggering captcha or blocks.
There are two primary types of residential proxies offered by Geonode: Rotating and Static.

Rotating Residential Proxies

Rotating residential proxies dynamically change their IP address, either at fixed intervals or after each request, enhancing anonymity and reducing the likelihood of being blocked by target sites.
Pros:
  • IPs originate from real residential devices, offering high authenticity.
  • Access to a broad range of IPs across multiple countries.
  • Easily bypasses geo-restrictions and anti-scraping defenses.
Cons:
  • Frequent IP changes can disrupt session consistency, which may be an issue for tasks needing continuous sessions.
  • Typically slower and costlier than data center proxies.
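To make the rotating behavior concrete, here is a minimal sketch of how a requests-style proxies dictionary for an authenticated rotating proxy is built. The credential and endpoint values are placeholders, not real Geonode values; with a rotating endpoint, every request sent through the same dictionary can exit from a different IP.

```python
def build_proxies(username: str, password: str, endpoint: str) -> dict:
    """Build a requests-style proxies dict for an authenticated proxy."""
    proxy_url = f"http://{username}:{password}@{endpoint}"
    return {"http": proxy_url, "https": proxy_url}

# With a rotating endpoint, each request may exit through a different IP.
# Uncomment to try it (requires the requests library and valid credentials):
# import requests
# proxies = build_proxies("YOUR_USERNAME", "YOUR_PASSWORD", "YOUR_ENDPOINT")
# for _ in range(3):
#     print(requests.get("https://httpbin.org/ip", proxies=proxies).json()["origin"])

print(build_proxies("YOUR_USERNAME", "YOUR_PASSWORD", "YOUR_ENDPOINT")["http"])
```

Because the rotation happens on Geonode's side, your code stays identical whether the endpoint rotates per request or per interval.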

Static Residential Proxies

Static residential proxies, also known as ISP proxies, retain a consistent IP address assigned by an ISP. This stability provides reliability and consistency, as the IP doesn’t rotate during usage.
Pros:
  • ISP-assigned IPs offer enhanced stability and dependability.
  • Faster and generally more reliable than rotating residential proxies.
  • High-quality IPs that improve performance.
Cons:
  • More costly due to the ISP-provided IPs.
  • Limited country coverage compared to rotating proxies.

Residential vs. Data Center Proxies

Residential and datacenter proxies are both tools for masking IP addresses and enhancing privacy, but they differ in origin, use cases, and reliability. It’s important to understand the differences between residential and data center proxies when choosing a solution for your web scraping needs. Each type provides distinct benefits and potential drawbacks.
| Feature | Residential Proxies | Data Center Proxies |
| --- | --- | --- |
| Source | Real residential devices (computers, smartphones) | Data centers (servers) |
| IP Authenticity | High (appears as legitimate users) | Lower (appears as servers) |
| Anonymity | High | Medium |
| Risk of IP Bans | Low | Higher |
| Speed | Typically slower | Generally faster |
| Cost | Higher | Lower |
| IP Rotation | Available (rotating proxies) | Available |
| Stability | Potentially less stable (rotating IPs) | Generally more stable |
| Best Used For | Accessing protected/geo-restricted content, ad verification | Large-scale scraping, tasks needing high speed |
| Availability | Based on ISP partnerships | More widely available |
  • Residential Proxies: Best for tasks requiring undetectable, authentic IPs, especially for geo-sensitive or heavily monitored platforms.
  • Datacenter Proxies: Ideal for cost-effective, high-speed automation where detection isn’t a major concern.
Choose residential for undetectable, location-sensitive tasks; choose datacenter for cost-efficient, speed-focused tasks.

When Are Residential Proxies Useful?

Residential proxies are valuable for numerous scenarios where mimicking legitimate user behavior and avoiding detection are essential. Here are some core use cases:

Web Scraping and Data Collection

Residential proxies enable secure and efficient data extraction by emulating genuine user traffic, bypassing geo-restrictions and anti-scraping measures.

SEO and SERP Analysis

Residential proxies provide accurate, location-specific search engine results, supporting targeted SEO strategies while avoiding detection and blocks.

Social Media Monitoring

Residential proxies ensure uninterrupted tracking of brand mentions, trends, and competitors by distributing requests across multiple IPs without risking account bans.

Ad Verification

Residential proxies verify ad placements, accuracy, and regional targeting to ensure campaigns reach intended audiences and maintain quality standards.

Geo-Restricted Content Access

Residential proxies bypass location restrictions, allowing seamless access to region-specific content like streaming services and localized websites globally.

Why Use Geonode Residential Proxies?

Geonode’s residential proxies provide robust support across various applications, enhancing performance and bypassing restrictions. Here are some reasons to use Geonode residential proxies:
  • Large IP Pool: Access millions of residential IP addresses worldwide, enabling diverse IP use for price comparisons, market research, and large-scale scraping without blocks.
  • Worldwide Coverage: Connect from over 160 countries, perfect for localized data collection and geo-targeting across regions.
  • Unlimited Concurrent Requests: Run multiple scraping tasks simultaneously, improving efficiency by monitoring numerous websites at once.
  • Flexible Session Control: Choose between sticky or rotating IP sessions, minimizing detection risk for large-scale scraping operations.
  • SOCKS5 & HTTPS Protocols: Supports advanced protocols for secure, high-performance streaming and scraping, ensuring faster and more reliable connections.
  • Customizable Rotation: Adjustable IP rotation based on time or request count, reducing potential detection by aligning with website rate limits.
  • Ethical Sourcing: IPs are ethically sourced from consenting users worldwide, ensuring compliance with global standards like GDPR and CCPA.
  • High Uptime & Reliability: 99.9% uptime ensures continuous data collection with minimal interruptions, maximizing efficiency.
  • Pay-as-You-Go Pricing: Flexible, usage-based billing provides affordable access without subscription commitments, fitting both large and small projects.
  • Reputation & Trust: Highly rated across trusted review platforms, offering a reliable and ethical proxy solution.
  • Global Network: Proxies sourced from 190+ countries, ensuring accurate data collection and regional targeting.
  • ZenShield and Repocket Partnerships: Ethical sourcing solutions through partnerships, supporting trust and integrity in proxy usage.

Geonode Residential Proxy Pricing

Geonode provides flexible and affordable pricing for their residential proxies, designed to accommodate users from small-scale scrapers to large businesses. Here’s an overview of Geonode's pricing options and benefits:

Pricing Structure

  • Pay-As-You-Go (PAYG): Only pay for the data you actually use, making this a cost-effective choice for varying data needs without committing to a monthly subscription.
  • Membership Discounts: Save up to 40% with membership plans, offering substantial discounts for high-volume users.
  • Static & Rotating Options: Choose between IPs that remain static for up to 24 hours or rotating proxies, based on your project’s requirements.

Pricing Table

| Plan Name | Data Volume (GB) | Cost per GB (Standard) | Cost per GB (with Membership) | Description |
| --- | --- | --- | --- | --- |
| PAYG | 1 GB | $4.00 | $1.00 | Simple pay-as-you-go plan, no subscription needed |
| Startup | Up to 200 GB | $4.00 | $2.00 | Ideal for smaller projects, includes 20% discount |
| Emerging | Up to 1000 GB | $4.00 | $1.50 | Mid-sized plan with 30% discount for scale |
| Scale | Up to 2000 GB | $4.00 | $1.00 | Large plan with a 40% discount for high-volume use |
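As a quick sanity check on the table above, the membership rates can be turned into a small cost estimator. This is a sketch using only the numbers listed; actual billing may differ.

```python
# Membership cost-per-GB rates, taken from the pricing table above.
RATES_PER_GB = {
    "PAYG": 1.00,
    "Startup": 2.00,
    "Emerging": 1.50,
    "Scale": 1.00,
}

def estimate_cost(plan: str, gb: float) -> float:
    """Estimated cost in USD for `gb` of traffic on a given membership plan."""
    return round(RATES_PER_GB[plan] * gb, 2)

print(estimate_cost("Startup", 50))  # 50 GB on the Startup membership rate → 100.0
```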

Flexible Billing and Plan Modifications

Geonode’s PAYG structure ensures you pay only for the bandwidth you use, making it ideal for projects with fluctuating data needs. Membership discounts provide additional savings, and you can easily adjust or cancel your plan anytime to align with changing requirements.

Comparison to Other Providers

Generally speaking, when proxy providers offer plans around $2-3 per GB, they are considered cheap. If they offer smaller plans in the $6-8 per GB range, they are more expensive. Geonode stands out with its affordable pay-as-you-go pricing, especially for high-volume users who benefit from up to 40% discounts. This pricing approach is significantly more budget-friendly compared to some competitors like Bright Data, whose rates start at a higher price per GB. Geonode’s competitive pricing and flexible plans make it a popular choice for those seeking cost-effective residential proxies. If you'd like to shop around for other residential providers, we built a tool for that here.

Setting Up Geonode Residential Proxies

Setting up Geonode residential proxies is a fairly simple process: sign up, purchase a $1 trial, choose a plan, set up authentication, and integrate the proxies into your web scraping scripts.

Creating a Geonode Account

To get started with Geonode, follow these steps to create an account and set up your residential proxies:
  1. Visit the Geonode website and sign up using your Google account or email.
  2. Before you can use the proxies, you need to subscribe or start a $1 trial. Click the arrow button on the Residential Proxies card.
  3. This takes you to the Premium Residential window. To start a trial, click "Try with $1 Trial" for any of the subscription types.
  4. Enter your card details and start the trial.
  5. Your trial now runs for 3 days, but you still need to purchase the data you want to use (during the trial, this purchase is free).
  6. Since you are on a trial, you will have $5 in your wallet to buy data and try it out. Select the wallet, agree to the terms and conditions, and click Confirm payment. If you want more data later, you can also buy it with a card or crypto.

Configuring Proxies

Now that your account is set up, you can start using the residential proxies.

Access Your Zones:

  • To access different zones and locations around the world, go to "Proxy configuration" on the dashboard. From there you can select any country, its state, or its city, and specify a port from 9000 to 9010. This port is used later when you get the endpoints for the proxies.

Configure Your Zone:

  • Based on the port you selected for a zone/location, select an ENDPOINT with that port. You will use that ENDPOINT later when you configure the proxy.
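In code, pairing the host with the port you configured for a zone might look like the following sketch. The host name and helper are illustrative (not an official Geonode API); the 9000–9010 range comes from the dashboard description above.

```python
def build_endpoint(host: str, port: int) -> str:
    """Combine a proxy host with the zone port chosen in the dashboard."""
    if not 9000 <= port <= 9010:
        raise ValueError("Residential zone ports range from 9000 to 9010")
    return f"{host}:{port}"

# Illustrative host; use the ENDPOINT shown in your own dashboard.
print(build_endpoint("premium-residential.geonode.com", 9001))
```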

Proxy Integration

Geonode simplifies proxy integration for web scraping, offering features like automatic IP rotation and a flexible selection of proxy types. Let's walk step by step through setting up and using Geonode proxies in Python.

Steps to Set Up a Proxy Rotator with Geonode

  1. Get your ENDPOINT from the dashboard.
  2. Integrate with Python's requests library: Use the requests library to send HTTP requests via Geonode’s proxies. This setup lets you control the IP address for each request, facilitating web scraping and bypassing location restrictions.
  3. Monitor and manage proxies: Geonode provides tools to track proxy performance, allowing you to make real-time adjustments based on success rates and response times.

Basic Python Proxy Integration Example

Here's how to perform a basic HTTP request through a proxy using Python’s requests library.
import requests

# Define the proxy server with IP and port
proxies = {
    'http': 'http://YOUR_ENDPOINT',
    'https': 'https://YOUR_ENDPOINT'
}

# Send a GET request using the proxy
response = requests.get('https://www.example.com', proxies=proxies)

# Check response status and print the content
print(response.status_code)
print(response.text)
In this example:
  • A proxies dictionary specifies the proxy server for both HTTP and HTTPS requests.
  • We pass proxies as a parameter in the requests.get() method.
  • Finally, the response’s status code and content are printed, which confirms whether the proxy setup is functioning correctly.

Additional Features and Use Cases

Using Geonode proxies with Python’s requests library is a robust way to ensure anonymity and bypass geo-restrictions. With options for automatic IP rotation and customizable session controls, Geonode proxies are ideal for complex web scraping projects. By leveraging these features, you can optimize your scraping tasks and maintain high success rates across different websites.

Authentication

To authenticate a request using Geonode residential proxies, you need your credentials: your username, password, and endpoint. These can be found in the Access Parameters tab of the proxy product; your USERNAME and PASSWORD are under "Credentials" on the dashboard. For example, if you are using Python with the requests library, you can configure the proxies as follows:
import requests
from bs4 import BeautifulSoup

# Proxy configuration
username = "YOUR_USERNAME"
password = "YOUR_PASSWORD"
endpoint = "premium-residential.geonode.com:9001"

proxy_url = f"http://{username}:{password}@{endpoint}"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

# Make a request
url = "http://google.com"
response = requests.get(url, proxies=proxies)

s = BeautifulSoup(response.content, 'html.parser')
print(response.status_code)
print(s.text)

# From here you can process the data as needed
In the code above:
  • We import the requests and BeautifulSoup libraries.
  • Initialize username, password, and endpoint variables (with placeholders for credentials).
  • Send an HTTP GET request using the proxy.
  • Parse the HTML response using BeautifulSoup.
  • Print the status code and parsed text content.

Basic Request Using Geonode Residential Proxies

To make requests through Geonode’s residential proxies, you need to configure your request with the correct proxy details and authentication. Here’s how to set up the Geonode proxy with Python's requests library.
  1. Install the requests Library: Ensure you have the requests library installed.
  2. Set Up Proxy Details: Use your Geonode credentials, including username, password, and endpoint URL, to configure the proxy.
  3. Send the Request: Use requests to route your HTTP requests through the Geonode proxy.
Here’s an example of using Geonode’s residential proxies in a web scraping setup with Python:
import requests
from bs4 import BeautifulSoup

# Geonode Proxy configuration
username = "YOUR_USERNAME"  # Replace with your Geonode username
password = "YOUR_PASSWORD"  # Replace with your Geonode password
endpoint = "premium-residential.geonode.com:9001"

proxy_url = f"http://{username}:{password}@{endpoint}"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

# Target URL
url = "http://books.toscrape.com/"
response = requests.get(url, proxies=proxies)

# Parse and extract data using BeautifulSoup
soup = BeautifulSoup(response.content, 'html.parser')
links = soup.find_all('a')
for link in links:
    print(link.get('href'))
After successfully running the code above, the script prints every link URL found on the page.

Country Geotargeting

Country-level geotargeting ensures that you access location-specific content, making it valuable for market research, localized content analysis, and price comparisons where region-based restrictions may apply. Geonode provides a broad selection of proxies across 190+ countries worldwide, enabling users to geotarget their web scraping to specific regions.

Using Country-Specific Proxies

To use a country-specific proxy, select the country in the Geonode dashboard and assign it to a specific port, which can then be referenced in your code. Here’s a Python example of how to use Geonode's residential proxies with the requests library for a geotargeted request.
import requests

# Geonode Proxy configuration
username = "YOUR_USERNAME"   # Replace with your Geonode username
password = "YOUR_PASSWORD"   # Replace with your Geonode password
endpoint = "premium-residential.geonode.com:9001"  # Use the assigned country-specific port if applicable

proxy_url = f"http://{username}:{password}@{endpoint}"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

# Check the exit IP
response_ip = requests.get("https://httpbin.org/ip", proxies=proxies)
ip = response_ip.json()['origin']

# Get detailed info
response_info = requests.get(f"http://ipinfo.io/{ip}", proxies=proxies)
info = response_info.json()

print(f"IP: {info['ip']}")
print(f"Country: {info['country']}")
The script above:
  • Configures the proxy: the proxy URL is generated from your Geonode credentials (username, password, and endpoint). For country-specific access, select the required country in Geonode’s dashboard and assign it to the chosen port.
  • Makes a request: both HTTP and HTTPS proxy options are set up, then a request to httpbin.org/ip retrieves the exit IP.
  • Verifies the location: the IP is looked up on ipinfo.io to confirm the proxy’s country.
Geonode’s setup allows you to easily access region-restricted data using proxies from specific countries, which makes it ideal for diverse geolocation-based data collection needs.

City Geotargeting

City targeting allows you to route web traffic through specific cities, providing highly localized data and more granular control over scraping, SEO, and marketing efforts. Geonode does not currently support city targeting, but it is expected to be available soon. The process will be the same:
  • You can select the state and the city from where you select a country.
  • Then specify the port.
  • Use that port in the ENDPOINT.

How To Use Static Proxies

Static proxies are a type of proxy that uses fixed, unchanging IP addresses, meaning that once you are assigned a static IP, it remains the same across multiple sessions or requests. This contrasts with dynamic proxies (such as rotating proxies), where IP addresses change frequently during each session or request. Static proxies offer a consistent, long-term solution for applications that need reliable access with a fixed IP, but they can be more easily detected and blocked than rotating proxies.

Key Benefits of Static Proxies

  • Consistency and Reliability: Provide a fixed IP address, ensuring that your connection remains stable over time.
  • Better Performance for Long Sessions: Better suited for extended, uninterrupted sessions where a persistent connection is needed.
  • Avoiding Geo-Blocking or IP Bans: Help you avoid being blocked or banned from websites that limit access based on changing IPs or that may flag IPs that are frequently rotated.
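To illustrate the long-session benefit, here is a sketch of pairing a static proxy with a persistent requests.Session, so cookies and the exit IP stay together across calls. The sticky endpoint shown is the one used later in this guide; the helper name is our own, not a Geonode API.

```python
import requests

def make_sticky_session(username: str, password: str, endpoint: str) -> requests.Session:
    """A requests.Session routed through a single static/sticky proxy,
    so cookies and the exit IP persist across calls."""
    proxy_url = f"http://{username}:{password}@{endpoint}"
    session = requests.Session()
    session.proxies = {"http": proxy_url, "https": proxy_url}
    return session

# Creating the session performs no network traffic; later calls through it
# (e.g. session.get("https://example.com/login")) reuse the same proxy.
s = make_sticky_session("YOUR_USERNAME", "YOUR_PASSWORD",
                        "premium-residential.geonode.com:10000")
print(s.proxies["http"])
```

This pattern is useful for login-then-browse flows, where switching IPs mid-session would invalidate the login.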

Common Use Cases for Static Proxies

In summary, static proxies are perfect for operations requiring stability and consistent access from the same IP address. They are particularly useful for tasks like account management, localized access, and avoiding detection while performing web scraping or other long-term online activities.
  • SEO and SERP Tracking: For consistently tracking search engine results from the same location.
  • Social Media Automation: Managing multiple social media accounts from a fixed location.
  • Web Scraping: When scraping data from websites that may block rotating IPs or when a consistent IP is needed to avoid detection.
  • Testing: Static proxies are useful for testing applications that require a stable connection from a specific location.

Example of Using Static Proxies with Python

  1. Creating a Proxy Zone
To start using Geonode’s static, or “sticky,” proxies, head to the Geonode dashboard and select your desired proxy endpoint.
  2. Configuring Your Proxy
In your dashboard, choose the proxy type and endpoint settings, where you’ll also find country-specific sticky options. Assign the chosen endpoint to a specific port.
  3. Making Requests with Sticky Proxies
Once your Geonode sticky proxy configuration is set up, you can use the following Python code to make requests with a static IP.
import requests

# Geonode Proxy configuration
username = "YOUR_USERNAME"   # Replace with your Geonode username
password = "YOUR_PASSWORD"   # Replace with your Geonode password
endpoint = "premium-residential.geonode.com:10000"  # Sticky endpoint

proxy_url = f"http://{username}:{password}@{endpoint}"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

# Make request to check IP
response_ip = requests.get("https://httpbin.org/ip", proxies=proxies)
ip = response_ip.json()['origin']

# Get detailed info
response_info = requests.get(f"http://ipinfo.io/{ip}", proxies=proxies)
info = response_info.json()

print(f"IP: {info['ip']}")
print(f"Country: {info['country']}")
The script above:
  • Configure Proxy: Create the proxy URL using your Geonode credentials and the assigned endpoint.
  • Request Static IP: Make requests to httpbin.org/ip to confirm the IP is stable across calls.
  • Detailed Information: Look up the returned IP on ipinfo.io to verify its geolocation.
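One way to double-check stickiness is to collect the exit IP over several calls and confirm it never changes. A minimal helper for that check (the network calls are shown only as comments; pair it with the `proxies` dict from the script above):

```python
def is_sticky(observed_ips) -> bool:
    """True when every observed exit IP is identical."""
    return len(set(observed_ips)) == 1

# With the `proxies` dict from the script above, you could collect IPs like:
# import requests
# ips = [requests.get("https://httpbin.org/ip", proxies=proxies).json()["origin"]
#        for _ in range(3)]
# print("Sticky:", is_sticky(ips))

print(is_sticky(["203.0.113.7", "203.0.113.7", "203.0.113.7"]))  # → True
```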
The following Geonode-based variant simplifies proxy management by reading endpoints from a file and using a predefined username and password. Each proxy entry in RESIDENTIAL-PREMIUM-sticky.txt contains only the host and port, so the credentials are added directly in the code. Here’s the code:
import pprint
import random
import requests

# Geonode Proxy configuration (replace with your own credentials)
username = "YOUR_USERNAME"  # Replace with your Geonode username
password = "YOUR_PASSWORD"  # Replace with your Geonode password

def get_proxies_from_file(file_path):
    """Reads proxies from the specified file and returns a list of proxies."""
    with open(file_path, 'r') as file:
        proxies = [line.strip() for line in file.readlines()]
    return proxies

def make_request_with_proxy(proxy):
    """Makes a request to check IP and fetches geolocation information."""
    # Construct the proxy URL with authentication
    proxy_url = f'http://{username}:{password}@{proxy}'

    proxies = {
        'http': proxy_url,
        'https': proxy_url
    }

    # Get IP address from httpbin
    ip_response = requests.get("https://httpbin.org/ip", proxies=proxies)
    ip_data = ip_response.json()
    print(f"IP Address: {ip_data['origin']}")

    # Get detailed info from IP geolocation service
    geo_response = requests.get("http://ipinfo.io", proxies=proxies)
    pprint.pprint(geo_response.json())

def main():
    proxy_file = 'RESIDENTIAL-PREMIUM-sticky.txt'  # Path to your .txt file with Geonode proxies
    proxies = get_proxies_from_file(proxy_file)

    if not proxies:
        print("No proxies found in the file.")
        return

    selected_proxy = random.choice(proxies)
    print(f"Using proxy: {selected_proxy}")

    make_request_with_proxy(selected_proxy)

if __name__ == "__main__":
    main()
A short explanation of the script above:
  1. get_proxies_from_file: Reads the proxy list from RESIDENTIAL-PREMIUM-sticky.txt. Each line should contain a proxy in the format host:port.
  2. make_request_with_proxy: Adds the Geonode username and password directly to each proxy URL for authentication. This function:
    • Retrieves the IP address from the proxy.
    • Retrieves additional geolocation data using the IP.
  3. main function: Manages reading the proxies, selecting one at random, and initiating the request.
  • Each line in RESIDENTIAL-PREMIUM-sticky.txt should have the format:
    host:port
  • Replace username and password in the code with your Geonode account details for authentication.

Error Codes

When using Geonode proxies, certain response codes indicate issues or errors you might encounter during requests. These codes can help you quickly identify problems and apply fixes for a more seamless experience.
| Error Code | Meaning | Cause | Solution |
| --- | --- | --- | --- |
| 407 Proxy Authentication Required | Authentication is required. | Incorrect or missing proxy credentials. | Verify that your authentication settings (username and password or IP whitelisting) are correct. Update credentials if necessary to match your Geonode account. |
| 461 Sticky Port Session Limit Reached | You’ve reached the sticky port session limit for your plan. | Exceeding the session limit for your subscription package. | Upgrade your plan to increase the session limit if more capacity is needed. |
| 462 Sticky Port Session Unsupported | Sticky port sessions are not supported. | Attempting to use sticky sessions where only rotating ports are allowed. | Switch to rotating ports if sticky ports are unavailable with your plan. |
| 466 Limit Reached | The request limit for your subscription has been reached. | Exceeding the allotted number of requests. | Consider upgrading to a higher package to extend request limits. |
| 561 Proxy Unreachable | Unable to connect to the proxy server. | Network or server configuration issues. | Retry the request after a short wait. Ensure your proxy configuration is correct. |
| 468 No Available Proxy | No proxy is currently available. | Limited proxy availability or network congestion. | Wait and retry the request. Contact Geonode support if the issue persists. |
| 464 Host Not Allowed | The target host or domain is restricted. | Trying to access a host outside the allowed range. | Confirm that your target host aligns with your Geonode permissions. |
| 470 Account Blocked | Your account has been blocked. | Potential misuse or security reasons. | Contact Geonode support to resolve the issue if you believe it’s in error. |
| 403 Forbidden | The request is invalid or access is restricted. | Issues with request configuration or access restrictions. | Review your request configurations to ensure they’re valid. |
| 465 City/State Not Found | The specified location couldn’t be resolved. | Invalid or unsupported location configuration. | Confirm and adjust the location details in your request. |
| 463 City/State Unsupported | The location configurations in the request are unsupported. | Location not allowed in the current package or configuration. | Verify if the desired location is permitted with your package. |
| 401 Unauthorized | Unauthorized request. | Potential account or configuration issues. | Contact Geonode support for assistance. |
| 411 Account Blocked | The account is restricted. | Potential policy violation or administrative lock. | Reach out to customer support to discuss reactivation options. |
| 471 Inactive Port | The requested port is inactive. | Attempting to use an inactive port. | Switch to an active port. |
| 429 Too Many Requests with Incorrect Authentication | Rate limit exceeded due to multiple failed authentication attempts. | Incorrect credentials or rapid authentication failures. | Use correct credentials and space out requests to avoid temporary blocks. |
Understanding and resolving these response codes can help maintain effective and efficient operations while using Geonode proxies. For more details, please refer to the official Geonode Response Code Documentation.
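In practice, you can bake these codes into a small retry policy: transient errors are retried, while credential or account errors fail fast. Below is a sketch; the grouping of codes (561/468/429 as retryable, 407/401/470/411 as fatal) is one reasonable reading of the table above, not an official Geonode recommendation. `fetch` is any zero-argument callable returning an object with a `status_code` attribute, e.g. `lambda: requests.get(url, proxies=proxies)`.

```python
import time

# Transient proxy errors worth retrying vs. errors where retrying won't help.
RETRYABLE = {561, 468, 429}   # proxy unreachable / no proxy available / rate limited
FATAL = {407, 401, 470, 411}  # credential or account problems

def request_with_retry(fetch, max_attempts: int = 3, delay: float = 0.0):
    """Call `fetch()` (which returns an object with .status_code), retrying
    transient proxy errors and failing fast on auth/account errors."""
    last = None
    for _ in range(max_attempts):
        last = fetch()
        if last.status_code in FATAL:
            raise RuntimeError(
                f"Fatal proxy error {last.status_code}; check credentials/account")
        if last.status_code not in RETRYABLE:
            return last  # success or a non-proxy error: hand it back to the caller
        time.sleep(delay)
    return last  # exhausted attempts; caller sees the last transient error
```

With a real request, this would look like `request_with_retry(lambda: requests.get(url, proxies=proxies), delay=2.0)`.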

KYC Verification

Geonode requires all new users to complete a KYC (Know Your Customer) verification process before accessing its proxy services. This measure ensures that user activities align with Geonode’s policies and are conducted in a legal and ethical manner. The KYC process with Geonode includes:
  • Personal Review: A compliance officer personally evaluates each customer’s intended use case.
  • Ongoing Monitoring: User accounts are monitored continually to ensure compliance with declared use cases.
Geonode strictly prohibits certain activities, including:
  • Adult content
  • Gambling
  • Cryptocurrency-related activities
These standards ensure compliance with Geonode's commitment to ethical use. Any requests that do not align with these policies may be declined or suspended. For more information, refer to Geonode’s full Terms & Conditions.

Implementing Geonode Residential Proxies in Web Scraping

Geonode residential proxies can be integrated with popular Python libraries for web scraping.

Python Requests

Here's how to integrate Geonode proxies with requests:
import requests
from bs4 import BeautifulSoup

# Geonode Proxy configuration
username = "YOUR_USERNAME"  # Replace with your Geonode username
password = "YOUR_PASSWORD"  # Replace with your Geonode password
endpoint = "premium-residential.geonode.com:10000"  # Geonode endpoint for premium residential proxies

# Define the proxy URL
proxy_url = f"http://{username}:{password}@{endpoint}"

# Set up proxies
proxies = {
    "http": proxy_url,
    "https": proxy_url
}

# Define the URL to scrape
url = "http://books.toscrape.com/"

# Send a GET request through the Geonode proxy
response = requests.get(url, proxies=proxies)

# Parse the HTML content with BeautifulSoup
soup = BeautifulSoup(response.content, "html.parser")

# Find all the links on the web page
links = soup.find_all("a")

# Print each link found
for link in links:
    print(link.get("href"))
In this code:
  • We configure the Geonode proxy by setting up the proxy URL with the username and password.
  • Requests are sent through the specified proxy, enabling access to restricted sites and allowing IP rotation if needed.
  • The code fetches and parses all links from the target website using BeautifulSoup.

Python Selenium

Selenium does not natively support authenticated proxies, and every paid proxy service requires authentication (username/password or similar). You can technically work around this with SeleniumWire, but SeleniumWire has been deprecated, so relying on it creates security and compatibility problems that will only worsen over time. For browser automation with authenticated proxies, prefer Playwright or Puppeteer, covered below.

Python Scrapy

To integrate Geonode proxies with Scrapy for web scraping, follow these steps:
  1. Create a new Scrapy project:
    scrapy startproject <project_name>
  2. Inside your Scrapy project, create a spider using Geonode proxies:
    import scrapy

    # Geonode Proxy configuration
    username = "YOUR_USERNAME"
    password = "YOUR_PASSWORD"

    class GeonodeScrapyExampleSpider(scrapy.Spider):
        name = "GeonodeScrapyExample"

        def start_requests(self):
            request = scrapy.Request(url="http://example.com", callback=self.parse)
            request.meta['proxy'] = f"http://{username}:{password}@premium-residential.geonode.com:10000"
            yield request

        def parse(self, response):
            print(response.body)
To run your Scrapy spider:
scrapy runspider <Pythonfilename.py>

Node.js Playwright

To use Geonode proxies with Playwright for automated browsing:
  1. Install Playwright:
    npm install playwright
  2. Set up Playwright with Geonode proxy settings:
    const playwright = require('playwright');

    const options = {
        proxy: {
            server: 'http://premium-residential.geonode.com:10000',
            username: 'YOUR_USERNAME',
            password: 'YOUR_PASSWORD'
        }
    };

    (async () => {
        const browser = await playwright.chromium.launch(options);
        const page = await browser.newPage();

        await page.goto('http://example.com');

        const content = await page.content();
        console.log(content);

        await browser.close();
    })();
Save the script as scrape.js and execute it:
node scrape.js

Node.js Puppeteer

To set up Geonode proxies with Puppeteer for browser automation:
  1. Install Puppeteer:
    npm install puppeteer
  2. Set up the script with Geonode proxy authentication:
    const puppeteer = require('puppeteer');
    (async () => {
      const browser = await puppeteer.launch({
        headless: false,
        args: ['--proxy-server=premium-residential.geonode.com:10000']
      });

      const page = await browser.newPage();

      await page.authenticate({
        username: 'YOUR_USERNAME',
        password: 'YOUR_PASSWORD'
      });

      await page.goto('http://example.com');
      await page.screenshot({ path: 'example.png' });

      await browser.close();
    })();
Save the script to a file (e.g., scrape.js) and run it:
node scrape.js

Case Study - Scraping Amazon.es

In this case study, we'll show why web scraping eCommerce sites matters when you need to target specific geographic regions. A shopper in Portugal may see different products and prices than one in Spain. To demonstrate this, we'll scrape Amazon.es through Geonode residential proxies with Python, using Spanish and Portuguese geo-targeted IPs, and compare the product price and availability each location sees. The proxy configuration sits at the top of the script, which then scrapes the product information from Amazon:
import requests
from bs4 import BeautifulSoup
from dataclasses import dataclass
from typing import Dict

# Proxy configuration
PROXY_CONFIG = {
    "host": "premium-residential.geonode.com",
    "port": "10000",
    "username": "YOUR_USERNAME",  # Replace with your Geonode username
    "password": "YOUR_PASSWORD"   # Replace with your Geonode password
}

# Product URL
PRODUCT_URL = 'https://www.amazon.es/Taurus-WC12T-termoel%C3%A9ctrica-Aislamiento-Temperatura/dp/B093GXXKRL/ref=lp_14565165031_1_2'

@dataclass
class ProductInfo:
    title: str
    price: str
    availability: str

def get_proxy_url(country: str) -> str:
    # Generate Geonode proxy URL with country targeting
    return f"http://{PROXY_CONFIG['username']}-country-{country}:{PROXY_CONFIG['password']}@{PROXY_CONFIG['host']}:{PROXY_CONFIG['port']}"

def get_headers() -> Dict[str, str]:
    # HTTP headers to simulate a browser request
    return {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36',
        'Accept-Language': 'es-ES,es;q=0.9',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8',
        'Accept-Encoding': 'gzip, deflate, br',
        'Referer': 'https://www.amazon.es/'
    }

def scrape_amazon(country: str) -> ProductInfo:
    proxy_url = get_proxy_url(country)
    proxies = {"http": proxy_url, "https": proxy_url}

    try:
        response = requests.get(PRODUCT_URL, proxies=proxies, headers=get_headers(), verify=False, timeout=30)
        response.raise_for_status()

        soup = BeautifulSoup(response.content, 'html.parser')

        title = soup.select_one('#productTitle').text.strip() if soup.select_one('#productTitle') else "Title not found"
        price = soup.select_one('.a-price-whole').text.strip() if soup.select_one('.a-price-whole') else "Price not found"
        availability = soup.select_one('#availability span').text.strip() if soup.select_one('#availability span') else "Availability not found"

        return ProductInfo(title, price, availability)
    except requests.RequestException as e:
        print(f"An error occurred while scraping with {country} IP: {str(e)}")
        return ProductInfo("Error", "Error", "Error")

def main():
    countries = {
        "SPAIN": "es",
        "PORTUGAL": "pt"
    }

    results = {country_code: scrape_amazon(country_code) for country_code in countries.values()}

    for country_name, country_code in countries.items():
        info = results[country_code]
        print(f"\n{country_name} IP Results:")
        print(f"Title: {info.title}")
        print(f"Price: {info.price}€")
        print(f"Availability: {info.availability}")

if __name__ == "__main__":
    main()
This script is designed to scrape product information from an Amazon page using Geonode residential proxies and to display the product details for different countries. Looking at our results:
  • We observe that the product price is slightly higher when accessed from a Portuguese IP (222€) compared to a Spanish IP (199€).
  • This demonstrates Amazon's dynamic pricing strategy, where prices may vary based on the user's location.
This example shows how using residential proxies can reveal differences in pricing and availability on e-commerce platforms based on location. By leveraging Geonode’s network of residential IPs, businesses and researchers can gain valuable insights into regional pricing strategies and inventory allocations, aiding in market research and competitive analysis.
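To quantify the regional gap the case study surfaced, you can normalise the scraped price strings before comparing them. A small helper sketch — the helper names are ours, and the comma-decimal handling assumes European formatting of the kind Amazon.es returns:

```python
def parse_price(raw: str) -> float:
    """Convert a scraped European price string (e.g. '222', '199,99',
    '1.199,00') into a float: drop thousands dots, turn the decimal
    comma into a point, strip any currency symbol."""
    cleaned = raw.strip().replace("€", "").replace(".", "").replace(",", ".")
    return float(cleaned)

def regional_price_gap(price_a: str, price_b: str) -> float:
    """Absolute difference between two scraped prices."""
    return abs(parse_price(price_a) - parse_price(price_b))

# The case study's Portuguese vs Spanish prices:
print(regional_price_gap("222", "199"))  # → 23.0
```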

Tips for Troubleshooting Common Issues

  • Verify Proxy Configuration: Double-check that the proxy details (host, port, username, and password) are correct. Incorrect or outdated proxy credentials often lead to connection errors.
  • Confirm Proxy Location: Ensure the proxy is located in a region supported by the target website. Using proxies from unsupported regions might not bypass geo-restrictions effectively.
  • Monitor IP Rotation: Geonode proxies may rotate IP addresses. Make sure the proxy in use is stable and provides consistent access from the intended region.
  • Implement Error Handling for Proxy Failures: Add error handling to your scraping code to handle cases where a proxy might fail or return an invalid response. This can help you troubleshoot issues more efficiently.
  • Test Multiple Proxies: If one proxy doesn’t work, try different proxies to determine if the issue lies with the specific proxy server or your configuration.
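The last two tips can be combined into one small fallback helper: try each proxy in turn, handle failures, and only give up when every proxy has been exhausted. A sketch — the injectable `get` parameter exists purely to make the logic easy to test; `requests.get` is the default:

```python
import requests

def fetch_with_proxy_fallback(url, proxy_urls, get=requests.get, timeout=30):
    """Try each proxy in order; return the first successful response.
    Raises RuntimeError only after every proxy has failed."""
    last_error = None
    for proxy_url in proxy_urls:
        proxies = {"http": proxy_url, "https": proxy_url}
        try:
            response = get(url, proxies=proxies, timeout=timeout)
            response.raise_for_status()
            return response
        except requests.RequestException as exc:
            last_error = exc  # remember the failure, fall through to the next proxy
    raise RuntimeError(f"All {len(proxy_urls)} proxies failed; last error: {last_error}")
```

If the first proxy consistently fails while later ones succeed, the problem is that proxy server; if they all fail, suspect your configuration or credentials.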

Alternative: ScrapeOps Residential Proxy Aggregator

If you're seeking a robust and affordable solution for web scraping, the ScrapeOps Residential Proxy Aggregator serves as an excellent alternative to conventional proxy providers. This service consolidates proxies from various sources, granting you access to an extensive array of IP addresses with exceptional flexibility and dependability.

Why Choose ScrapeOps Residential Proxy Aggregator?

  1. Cost-Effective Solutions: ScrapeOps is recognized for its affordability. Its pricing typically undercuts that of many standard proxy providers, allowing you to scrape more efficiently without overspending.
  2. Versatile Plans: In contrast to many proxy services that offer rigid plans, ScrapeOps presents a variety of options, including smaller plans designed for specific requirements. This adaptability lets you choose a plan that aligns with your project goals and budget.
  3. Increased Reliability: By sourcing proxies from multiple providers, ScrapeOps delivers enhanced reliability. Rather than depending on a single provider, you benefit from a diverse range of proxies accessible through one port, minimizing downtime and connectivity problems for a more stable scraping experience.

Using ScrapeOps Residential Proxy Aggregator with Python Requests

Here’s how to implement the ScrapeOps Residential Proxy Aggregator with Python’s requests library:
import requests
api_key = 'YOUR_API_KEY'
target_url = 'https://httpbin.org/ip'
proxy_url = f'http://scrapeops:{api_key}@residential-proxy.scrapeops.io:8181'

proxies = {
    'http': proxy_url,
    'https': proxy_url,
}

response = requests.get(
    url=target_url,
    proxies=proxies,
    timeout=120,
)
print('Body:', response.content)
The code snippet above demonstrates how to send a URL to the ScrapeOps Proxy port. ScrapeOps handles proxy selection and rotation for you, so you only need to provide the URL you wish to scrape. You can explore the free trial, which includes 500MB of free bandwidth credits with no credit card required.

Ethical Considerations

Geonode emphasizes that its residential proxies come from IP holders who have voluntarily opted in to participate, so data gathering through these IPs is conducted with transparency and consent. This commitment to ethical sourcing addresses concerns about privacy and the responsible use of IP addresses. When using Geonode's residential proxies for web scraping, users should be mindful of several important responsibilities and policies:
  • Respect Website Terms of Service: Websites establish terms of service to govern access and usage of their data. Scraping activities should align with these terms to prevent legal issues or bans. While Geonode’s proxies provide anonymity, users are still obligated to follow the legal and ethical guidelines of each target site.
  • Avoid Abusive Practices: Users should refrain from actions that may be perceived as abusive, such as excessive scraping that could overload a website's servers or invasive data extraction. Responsible proxy use means respecting the limits set by target websites.
  • Prioritize Transparency and Consent: While Geonode ensures ethically sourced proxies, users should also be transparent about their data collection activities when feasible. Informing site owners about the purpose and scope of data collection can foster trust and open channels for potential collaboration.

Importance of Scraping Responsibly

Responsible scraping is essential for several reasons. Adhering to website terms of service helps avoid legal issues and preserves access to critical data sources. Handling personal data responsibly upholds privacy standards and ethical principles, avoiding misuse of sensitive information. Compliance with laws like GDPR is crucial to avoid legal penalties, especially when dealing with data related to EU citizens. Additionally, responsible scraping minimizes server load on target websites, helping maintain site functionality for other users. Practicing ethical scraping helps build trust and protects the reputation of both the scraper and their organization.
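One concrete habit that follows from the server-load point is pacing your requests. A minimal throttle sketch — the helper name, the default delay, and the injectable `sleep`/`fetch` callables are our own illustrative choices:

```python
import time

def polite_fetch_all(urls, fetch, delay_seconds=2.0, sleep=time.sleep):
    """Fetch URLs one at a time with a fixed pause between consecutive
    requests, keeping load on the target server low. `fetch` is any
    callable (e.g. requests.get); `sleep` is injectable for testing."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_seconds)  # pause only between requests, not before the first
        results.append(fetch(url))
    return results
```

In production you would typically add jitter to the delay and back off further on 429 responses, but even a fixed pause is a large improvement over hammering a site in a tight loop.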

Conclusion

Implementing residential proxies in web scraping projects is essential for efficient, ethical, and reliable data collection. By using Geonode residential proxies, you can address common challenges like IP bans, geo-restrictions, and anti-bot defenses. This enables more effective data gathering, access to diverse datasets, and the ability to scale your projects seamlessly while ensuring anonymity and compliance with ethical standards.

More Python Web Scraping Guides

Want to take your scraping skills to the next level? Check out Python Web Scraping Playbook or these additional guides: