PacketStream Residential Proxies: Web Scraping Guide
PacketStream is a leading provider of residential proxy services, offering a vast network of authentic IP addresses from real residential internet service providers (ISPs). Their service enables you to access the internet as if you were a regular user from over 190 countries, granting you unparalleled global access for your web scraping and data collection needs.
In this comprehensive guide, we'll walk you through everything you need to know about using PacketStream's residential proxies for your web scraping projects.
- TLDR: How to Integrate PacketStream Residential Proxy?
- Understanding Residential Proxies
- Why Use PacketStream Residential Proxies?
- PacketStream Residential Proxy Pricing
- Setting Up PacketStream Residential Proxies
- Authentication
- Basic Request Using PacketStream Residential Proxies
- Country Geotargeting
- City Geotargeting
- Error Codes
- Implementing PacketStream Residential Proxies in Web Scraping
- Case Study: Scrape Amazon.es Prices with PacketStream Proxies
- Alternative: ScrapeOps Residential Proxy Aggregator
- Ethical Considerations and Legal Guidelines
- Conclusion
- More Web Scraping Guides
Need help scraping the web?
Then check out ScrapeOps, the complete toolkit for web scraping.
TLDR: How to Integrate PacketStream Residential Proxy?
To quickly start using PacketStream's residential proxies in your Python scripts, follow these steps:
- Sign up for a PacketStream account and add funds to your balance.
- Retrieve your proxy credentials from the Network Access page.
- Use the following code snippet to make requests through PacketStream's proxy network:
import requests
proxy_host = 'proxy.packetstream.io'
proxy_port = '31112'
proxy_username = 'your_username'
proxy_password = 'your_password'
proxies = {
'http': f'http://{proxy_username}:{proxy_password}@{proxy_host}:{proxy_port}',
'https': f'http://{proxy_username}:{proxy_password}@{proxy_host}:{proxy_port}'
}
url = 'http://example.com'
response = requests.get(url, proxies=proxies)
print(f"Response status code: {response.status_code}")
print(f"Response content length: {len(response.text)}")
- Set up the proxy configuration using your PacketStream credentials, then make a request to the target website through the proxy.
- The script prints the status code and the length of the response content, which confirms that your request is being routed through PacketStream's residential proxy network.
- Remember to replace `your_username` and `your_password` with your actual PacketStream credentials.
Understanding Residential Proxies
Residential proxies are IP addresses assigned by Internet Service Providers (ISPs) to real residential devices, such as laptops and smartphones. These proxies make your web traffic appear as though it's coming from a real user in a specific location.
When you use a residential proxy, your requests are routed through a legitimate residential IP, making it harder for target websites to detect automated actions or block your IP address.
Residential proxies act as intermediaries between you and the websites you access. When you send a request:
- It first passes through the proxy server, which forwards it to the target site using a residential IP.
- The target site responds to the proxy, which then relays the information back to you.
This conceals your real IP address and location, reducing the likelihood of IP bans or restrictions.
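The relay described above is easy to wire up in Python. The helper below is a minimal sketch (the function name is ours; the host and port match the examples later in this guide, and the credentials are placeholders):

```python
def packetstream_proxies(username: str, password: str,
                         host: str = "proxy.packetstream.io",
                         port: int = 31112) -> dict:
    """Build the proxies mapping the requests library uses to route traffic."""
    url = f"http://{username}:{password}@{host}:{port}"
    return {"http": url, "https": url}

proxies = packetstream_proxies("your_username", "your_password")
print(proxies["https"])

# With valid credentials, the target site would see the residential exit IP
# rather than yours:
# requests.get("https://ipv4.icanhazip.com", proxies=proxies, timeout=10)
```

Once built, the same `proxies` dict can be passed to any `requests` call, so the rest of your scraping code doesn't need to know a proxy is involved.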
Types of Residential Proxies
There are two main types of residential proxies, each suited to different use cases:
1. Rotating Residential Proxies
These proxies automatically change the IP address after every request or at set intervals. They're ideal for tasks requiring numerous requests without detection.
Pros:
- Excellent for large-scale web scraping since each request uses a different IP.
- Lowers the risk of IP bans.
Cons:
- Not ideal for long sessions requiring the same IP.
- Slightly more complex when needing persistent access.
2. Static Residential Proxies
Static residential proxies maintain the same IP address throughout the session, making them useful for activities like logging into accounts.
Pros:
- Ensures a consistent IP address for long sessions, great for tasks that require continuity.
- Perfect for simulating a persistent user identity.
Cons:
- Greater risk of getting the IP banned if it's overused.
- Not suited for high-volume scraping where multiple requests are necessary.
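A quick way to see the rotating/static difference in practice is to fetch your apparent IP several times and count distinct values: a rotating endpoint should show several, a static one just one. The sketch below uses a stubbed IP pool so it runs without credentials; in real use, `fetch_ip` would be a `requests.get` call to an IP-echo service routed through the proxy:

```python
def distinct_exit_ips(fetch_ip, n: int):
    """Call fetch_ip() n times and report how many distinct IPs were seen."""
    ips = [fetch_ip() for _ in range(n)]
    return len(set(ips)), ips

# Stub pool standing in for a rotating gateway. In practice, fetch_ip would
# request https://ipv4.icanhazip.com through the proxy and return the body.
pool = iter(["203.0.113.7", "198.51.100.4", "203.0.113.9"])
count, ips = distinct_exit_ips(lambda: next(pool), 3)
print(f"{count} distinct IPs across {len(ips)} requests")
```

With a static proxy, the same check would report a single distinct IP across all requests.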
Residential vs. Data Center Proxies
Residential and data center proxies differ mainly in origin, speed, reliability, and use cases. Understanding the difference between them is key to choosing the right solution for your needs:
Residential Proxies:
- Pros: More difficult to detect, as they use IPs from real residential ISPs; less likely to trigger IP bans or CAPTCHAs.
- Cons: More expensive and slower due to the use of residential networks.
Data Center Proxies:
- Pros: Faster and cheaper; perfect for tasks not requiring stealth, such as testing or accessing non-sensitive websites.
- Cons: Easier to detect and block since many data center IPs are flagged as non-residential.
Here's a comparison to help you understand how they differ:
Feature | Residential Proxies | Data Center Proxies |
---|---|---|
Source | Real ISP-provided IPs | Data center IPs |
Detection Risk | Lower, harder to detect | Higher, easier to detect |
Speed | Generally slower | Generally faster |
Cost | Higher | Lower |
Best For | Stealthy tasks, geo-blocked sites | Fast tasks, large-scale scraping |
Common Use Cases for Residential Proxies
Residential proxies are particularly useful in scenarios where stealth, reliability, and geo-targeting are critical. Some typical applications include:
- Web Scraping and Data Collection: Helps avoid detection and blocks on websites with strict anti-bot measures.
- SEO and SERP Analysis: Allows you to gather accurate search engine results from specific locations.
- Social Media Monitoring: Facilitates tracking and monitoring without being flagged by platforms.
- Ad Verification: Enables checking how ads are displayed across different regions.
- Geo-Restricted Content Access: Provides access to region-locked content by simulating access from a specified location.
In these use cases, residential proxies offer a higher degree of anonymity and accuracy, ensuring uninterrupted access to your target websites.
Why Use PacketStream Residential Proxies?
PacketStream is a powerful, reliable, and flexible choice for web scraping, offering unparalleled access to real residential IPs, cost-effective scalability, and optimized performance for even the most complex scraping operations.
Here are some reasons you can choose PacketStream:
Real Residential IPs for Genuine Anonymity
PacketStream provides authentic residential IP addresses directly sourced from real residential ISPs. This ensures your requests are indistinguishable from genuine user traffic, minimizing the risk of detection and blacklisting.
Unrestricted Access from Over 190 Countries
PacketStream’s global reach is another key differentiator. With IPs from over 190 countries, including hard-to-access regions, you can collect localized data and bypass geo-blocks with ease.
This extensive network opens the door to truly global web scraping, allowing you to extract critical market intelligence or monitor competitor activities in regions where access may otherwise be restricted.
Cost-Effective and Scalable Solutions
PacketStream’s pay-as-you-go pricing model is designed with flexibility in mind. You only pay for the bandwidth you use. This feature is particularly valuable for businesses of all sizes, from startups needing smaller-scale data collection to enterprises requiring massive scalability.
The absence of pre-set bandwidth limits or connection caps allows PacketStream to seamlessly support high-volume web scraping operations, dynamically adjusting to your project’s evolving demands without compromising performance.
High Uptime and Robust Network Performance
Where PacketStream truly shines is in its high uptime and low failure rates. Built with reliability at its core, PacketStream’s infrastructure ensures your web scraping tasks are not interrupted by network downtime or IP churn.
This robust network design makes PacketStream an ideal solution for long-running data extraction projects, guaranteeing you have the stability needed for mission-critical operations.
Intelligent Routing for Optimized Speed
PacketStream enhances web scraping efficiency through its intelligent routing system, which minimizes latency and ensures faster access to target websites.
For scraping operations that rely on real-time data, such as financial market analysis or dynamic pricing, PacketStream’s optimized proxies reduce response times, helping you gather the information you need without delay.
This combination of speed and reliability ensures higher success rates in retrieving data, allowing you to make faster, better-informed business decisions.
Seamless Integration and Compatibility
PacketStream’s residential proxies are designed to integrate effortlessly with popular scraping tools, automation frameworks, and browsers. Whether you're using Python Requests, Selenium, or Scrapy, PacketStream’s proxies fit seamlessly into your existing workflows.
The platform also offers advanced targeting options, such as rotating or static proxies, enabling users to choose the best solution for their specific scraping needs—whether you need consistent access through a single IP or want to rotate IPs to avoid detection.
PacketStream Residential Proxy Pricing
PacketStream stands out for its simple, pay-as-you-go pricing model, offering one of the most competitive rates in the residential proxy market.
Unlike many other providers that bundle their services into pre-set plans or packages, PacketStream's model is based entirely on the amount of bandwidth you use, giving you flexibility and control over your spending.
Pricing Structure: Pay-As-You-Go
PacketStream charges $1 per GB, with a minimum purchase of 50GB blocks. There are no separate fees for the number of IP addresses used, nor do they impose limits on the number of concurrent connections.
This approach eliminates the need to commit to a fixed monthly plan or worry about additional costs for scaling up your usage. Whether you're running a small project or a large-scale scraping operation, you only pay for the bandwidth you consume, making this model ideal for businesses that require cost-effective, scalable proxy solutions.
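The arithmetic is simple enough to sketch. Using the flat $1/GB rate and 50GB minimum purchase described above (the helper function is our own illustration):

```python
PRICE_PER_GB = 1.00      # PacketStream's flat pay-as-you-go rate
MIN_PURCHASE_GB = 50     # minimum purchase block

def purchase_cost(gb_needed: float) -> float:
    """Dollar cost to cover gb_needed, respecting the 50GB minimum purchase."""
    return max(gb_needed, MIN_PURCHASE_GB) * PRICE_PER_GB

print(purchase_cost(30))   # below the minimum, so you still buy 50GB -> 50.0
print(purchase_cost(120))  # -> 120.0
```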
How Does PacketStream Compare to Other Providers?
PacketStream’s pricing is among the most affordable in the residential proxy market. Many other providers charge upwards of $6 to $8 per GB for smaller plans, which can quickly add up for businesses with high data requirements.
By comparison, PacketStream's flat rate of $1 per GB is a significant bargain, especially for companies that require large-scale bandwidth.
PacketStream is highlighted as one of the most cost-effective options on our Proxy Comparison Page, where you can evaluate pricing and features across top providers.
Setting Up PacketStream Residential Proxies
Setting up PacketStream residential proxies is straightforward, especially if you're familiar with proxy configurations. Here’s a step-by-step guide to get you started:
- Visit the Sign-up page.
- Enter your details and then click the "Submit" button. You will be redirected to the dashboard after successful sign-up.
- On the sidebar menu, top up your account by clicking the "Deposit" sub-menu.
- After that, you will be able to access your proxy details found under the "Network Access" submenu.
Alternatively, you can fill out this form to request a free trial before making a deposit.
Authentication
When using PacketStream's residential proxies, proper authentication is crucial to ensure secure and authorized access to the proxy network.
PacketStream offers Username and Password Authentication as the primary method of authentication. When you sign up for an account, you're provided with a unique username and password for proxy access.
To use this method in your Python scripts, you need to include your credentials in the proxy URL.
Here's an example of how to set this up:
import requests
proxy_host = 'proxy.packetstream.io'
proxy_port = '31112'
proxy_username = 'your_username'
proxy_password = 'your_password'
proxies = {
'http': f'http://{proxy_username}:{proxy_password}@{proxy_host}:{proxy_port}',
'https': f'http://{proxy_username}:{proxy_password}@{proxy_host}:{proxy_port}'
}
url = 'http://example.com'
response = requests.get(url, proxies=proxies)
print(f"Response status code: {response.status_code}")
In this example, we're including the username and password directly in the proxy URL. This method is simple and works well for most use cases.
Basic Request Using PacketStream Residential Proxies
Making requests through PacketStream's residential proxies is straightforward and can be done using popular libraries like Python's Requests.
Here’s a simple example demonstrating how to make a basic HTTP request through PacketStream's residential proxies:
import requests
# Define your proxy credentials
proxy_host = 'proxy.packetstream.io'
proxy_port = '31112'
proxy_username = 'your_username'
proxy_password = 'your_password'
# Set up the proxies dictionary
proxies = {
'http': f'http://{proxy_username}:{proxy_password}@{proxy_host}:{proxy_port}',
'https': f'http://{proxy_username}:{proxy_password}@{proxy_host}:{proxy_port}'
}
# Make a request through the proxy
response = requests.get('https://example.com', proxies=proxies)
# Print the response
print(response.text)
This code establishes a connection through PacketStream’s residential proxy, allowing you to send requests as if you were using a regular residential IP.
Country Geotargeting
Geotargeting at the country level allows you to access content tailored to specific geographic regions.
This can be crucial for various applications such as:
- Accessing Geo-Restricted Content: Geotargeted proxies allow users to access content restricted to specific countries, like streaming and local news.
- Localized Marketing and Ad Verification: Marketers can verify ads as they appear to users in different regions, ensuring accuracy and relevance.
- Accurate Web Scraping and Data Collection: Geotargeted proxies enable scraping region-specific data, like prices and product availability, for more accurate insights.
- SEO and SERP Tracking: Proxies allow companies to track keyword rankings in different countries, essential for global SEO optimization.
- Avoiding Geo-Blocking and Reduced Risk of Detection: Country-targeted proxies help bypass geo-blocks, reducing the likelihood of detection when accessing restricted sites.
- Localized User Testing and UX Optimization: Developers can test and optimize user experiences by viewing content as it appears in various countries.
- Market Research and Competitor Analysis: Proxies let businesses analyze competitor strategies and consumer behavior in specific regions for informed decision-making.
- Enhanced Security and Privacy: Geotargeted proxies provide additional anonymity, helping users protect their privacy when browsing from restrictive regions.
PacketStream supports proxies from over 190 countries, enabling you to leverage IPs from various locations and providing flexibility in your data collection and browsing activities.
To use country-specific proxies, simply select your desired country from the dropdown menu on your PacketStream dashboard.
This allows you to route your requests through an IP that corresponds to that country.
Here’s how to modify the previous code example to specify a country:
import requests
proxy_host = 'proxy.packetstream.io'
proxy_port = '31112'
proxy_username = 'your_username'
proxy_password = 'your_password'
country = 'Canada' # Example for Canada
proxies = {
'http': f'http://{proxy_username}:{proxy_password}_country-{country}@{proxy_host}:{proxy_port}',
'https': f'http://{proxy_username}:{proxy_password}_country-{country}@{proxy_host}:{proxy_port}'
}
response = requests.get('https://ipv4.icanhazip.com', proxies=proxies)
print(response.text) # This will show the IP address used for the request
We use the `requests` library to direct our traffic through a Canadian proxy by setting up the necessary proxy details in the `proxies` dictionary and making a simple GET request to check the IP address.
City Geotargeting
City-level geotargeting is beneficial for applications that require hyper-localized data, such as market research, local SEO, and ad verification. By targeting specific cities, you can gain insights into regional trends and behaviors.
Currently, PacketStream does not allow you to choose proxies from different cities directly. However, you can select from various countries to enhance your targeting strategy.
Error Codes
When using proxies, you may encounter various error codes, including both standard HTTP status codes and specific proxy-related errors.
Here’s a breakdown of frequently encountered error codes, their meanings, and some recommended solutions.
Error Code | Meaning | Description | Potential Solution |
---|---|---|---|
400 | Bad Request | Indicates an incorrect request format, missing parameters, or malformed URL. | Verify the URL format and ensure request syntax aligns with PacketStream's requirements. |
403 | Access Denied | The target site or PacketStream blocks access, often due to restricted content or region. | Check target site permissions or contact PacketStream support if you believe the block is unintended. |
407 | Proxy Authentication Error | Incorrect username or password during authentication with PacketStream. | Double-check PacketStream proxy credentials for accuracy. |
490 | Invalid Access Point | The requested country or region has no available IPs, or the location is incorrectly specified. | Confirm location settings in PacketStream and ensure IP availability for the target region. |
500 | Internal Server Error | Temporary server issue on PacketStream’s end, often resolves automatically. | Retry after a short delay, as PacketStream may resolve the issue on their end. |
502 | Bad Gateway | The assigned IP is no longer available or the target server refused the connection. | Switch to a new IP in PacketStream or wait for an IP reassignment. |
503 | Service Unavailable | Temporary error indicating PacketStream or the target website is under high load. | Wait and retry after a few moments, as it may be a transient issue. |
522 | Connection Timeout | PacketStream proxy server did not receive a response from the target within the expected timeframe. | Retry, as this may be due to network latency; consider switching IPs if timeouts persist. |
One specific error to be aware of is Cloudflare Error 1015, known as the “You are being rate limited” error. This occurs when too many requests are made from a single IP within a short timeframe. It’s a protective measure designed to prevent abuse and maintain website performance.
Understanding error codes and their implications can help you manage your proxy usage effectively and enhance your overall web scraping experience.
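Several of the codes above (500, 502, 503, 522, and rate-limit responses) are transient, so a retry with exponential backoff often resolves them automatically. Here's a provider-agnostic sketch: `get` is any callable returning a response-like object, such as a `requests.get` wrapper preconfigured with your PacketStream proxies dict.

```python
import random
import time

RETRYABLE = {429, 500, 502, 503, 522}  # transient statuses worth retrying

def fetch_with_retries(get, url, max_retries=4, base_delay=1.0):
    """Retry transient proxy/server errors with exponential backoff.

    `get` is any callable returning an object with a .status_code attribute,
    e.g. a requests.get wrapper configured with your proxies dict.
    """
    response = None
    for attempt in range(max_retries + 1):
        response = get(url)
        if response.status_code not in RETRYABLE:
            return response
        if attempt < max_retries:
            # Backoff of 1s, 2s, 4s, ... plus jitter so that parallel
            # workers don't all retry at the same instant.
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
    return response
```

For 407 (authentication) and 400 (malformed request) errors, retrying won't help; fix the credentials or request format instead.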
Implementing PacketStream Residential Proxies in Web Scraping
PacketStream’s residential proxies offer both rotating and sticky IPs, allowing you to switch between multiple IP addresses or maintain a persistent connection depending on your scraping needs.
Below, we will explore how you can integrate PacketStream proxies into popular web scraping libraries.
Python Requests
PacketStream proxies can be easily integrated with the `requests` library in Python, which is widely used for making HTTP requests. Let's walk through an example using US-based rotating proxies:
import requests
username = 'your_proxy_username'
password = 'your_proxy_password'
country = 'UnitedStates'
port = '31112'
proxy = f'http://{username}:{password}_country-{country}@proxy.packetstream.io:{port}'
proxies = {
'http': proxy,
'https': proxy,
}
response = requests.get('http://example.com', proxies=proxies)
print(response.text)
In this example, we are using the `requests` library to send a GET request through a US-based PacketStream proxy. You can replace the `username`, `password`, and country value to target a specific location or rotate IPs as needed. This integration is ideal for lightweight scraping tasks.
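If you scrape the same targets from several locations, the country suffix can be factored into a small helper so you can loop over locations cleanly. The function name is our own; the URL format matches the geotargeting examples in this guide:

```python
def country_proxy(username: str, password: str, country: str,
                  host: str = "proxy.packetstream.io", port: int = 31112) -> str:
    """Build a PacketStream proxy URL targeting a specific country."""
    return f"http://{username}:{password}_country-{country}@{host}:{port}"

for country in ["UnitedStates", "Canada", "Germany"]:
    url = country_proxy("your_proxy_username", "your_proxy_password", country)
    proxies = {"http": url, "https": url}
    # With real credentials, you could verify the exit location:
    # print(requests.get("https://ipv4.icanhazip.com", proxies=proxies).text)
    print(country, "->", url)
```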
Python Selenium
SeleniumWire was a web scraping staple for many years, since vanilla Selenium has no support for authenticated proxies. Some sad news: SeleniumWire has been deprecated. With all of that said, it is still technically possible to integrate PacketStream residential proxies via SeleniumWire, but we strongly recommend against it.
When you decide to use SeleniumWire, you are vulnerable to the following risks:
- Security: Browsers and their drivers (such as ChromeDriver and GeckoDriver) receive security patches regularly. SeleniumWire no longer does, so known security holes go unfixed.
- Dependency Issues: SeleniumWire is no longer maintained. In time, it may not be able to keep up with its dependencies as they get updated. Broken dependencies can be a source of unending headaches for anyone in software development.
- Compatibility: As the web itself gets updated, SeleniumWire doesn't. Since it no longer receives updates, you may experience broken functionality and unexpected behavior.
As time goes on, the probability of all these problems increases. If you understand the risks but still wish to use SeleniumWire, you can view a guide on that here.
Depending on when you're reading this, the code example below may or may not work. As mentioned above, we strongly recommend against using SeleniumWire because of its deprecation, but if you decide to do so anyway, here you go. We are not responsible for any damage that this may cause to your machine or your privacy.
from seleniumwire import webdriver
username = 'your_proxy_username'
password = 'your_proxy_password'
country = 'UnitedStates'
port = '31112'
proxy = f'http://{username}:{password}_country-{country}@proxy.packetstream.io:{port}'
## Define Your Proxy Endpoints
proxy_options = {
"proxy": {
"http": proxy,
"https": proxy,
"no_proxy": "localhost:127.0.0.1"
}
}
## Set Up Selenium Chrome driver
driver = webdriver.Chrome(seleniumwire_options=proxy_options)
## Send Request Using Proxy
driver.get('https://httpbin.org/ip')
- We set up our proxy URL the same way we did with Python Requests: `f'http://{username}:{password}_country-{country}@proxy.packetstream.io:{port}'`.
- We assign this URL to both the `http` and `https` protocols of our proxy settings.
- `driver = webdriver.Chrome(seleniumwire_options=proxy_options)` tells `webdriver` to open Chrome with our custom `seleniumwire_options`.
Python Scrapy
Scrapy, a powerful web scraping framework, can also utilize PacketStream proxies to anonymize and rotate your requests. There are several ways to integrate Scrapy with a proxy; in the example below, we integrate the proxy directly into our spider.
Create a new Scrapy project.
scrapy startproject scrapy_packetstream
Then, from inside the spiders folder of your new Scrapy project, copy and paste the code below into a new spider file. You can call this one example.py.
First, we create variables for all of our credentials and settings: `proxy_host`, `proxy_port`, `proxy_username`, and `proxy_password`. We put all of these together to create our `proxy_url`.
When making our requests, we route them through the proxy using `request.meta['proxy'] = proxy_url`.
import scrapy
proxy_host = 'proxy.packetstream.io'
proxy_port = '31112'
proxy_username = 'your_username'
proxy_password = 'your_password'
proxy_url = f'http://{proxy_username}:{proxy_password}@{proxy_host}:{proxy_port}'
class ExampleSpider(scrapy.Spider):
name = "example"
def start_requests(self):
request = scrapy.Request(url="http://lumtest.com/myip.json", callback=self.parse)
request.meta['proxy'] = proxy_url
yield request
def parse(self, response):
print(response.body)
In this example, we do the following to create and use our proxy connection:
- Create variables to hold our configuration and settings: `proxy_host`, `proxy_port`, `proxy_username`, and `proxy_password`.
- Using those variables, construct a `proxy_url`: `proxy_url = f'http://{proxy_username}:{proxy_password}@{proxy_host}:{proxy_port}'`.
- When making our requests, connect through the proxy by setting `request.meta['proxy'] = proxy_url`.
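As an alternative to setting the proxy inside the spider, the same `request.meta['proxy']` mechanism can be applied project-wide with a small downloader middleware. This is a sketch: the class name and middleware priority below are our own choices, not part of PacketStream's or Scrapy's defaults.

```python
# middlewares.py -- applies the PacketStream proxy to every request
# in the project, so individual spiders don't need proxy logic.

class PacketStreamProxyMiddleware:
    PROXY_URL = "http://your_username:your_password@proxy.packetstream.io:31112"

    def process_request(self, request, spider):
        # Scrapy routes any request carrying meta['proxy'] through that proxy.
        request.meta["proxy"] = self.PROXY_URL
        return None  # continue normal downloader processing

# Enable it in settings.py (priority 350 is an arbitrary illustrative value):
# DOWNLOADER_MIDDLEWARES = {
#     "scrapy_packetstream.middlewares.PacketStreamProxyMiddleware": 350,
# }
```

The per-spider approach shown above is simpler for one-off spiders; the middleware approach keeps proxy configuration in one place when a project has many spiders.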
NodeJs Puppeteer
If you’re working with Puppeteer for headless browser automation, PacketStream proxies can easily be integrated by specifying the proxy server in your browser launch options.
To start, you'll need to make a new JavaScript project.
mkdir puppeteer-packetstream
Initialize the project.
cd puppeteer-packetstream
npm init --y
Install Puppeteer.
npm install puppeteer
Now, create a new JavaScript file and paste the code below into it. Remember to replace `your_username` and `your_password`.
const puppeteer = require('puppeteer');
const proxy_host = 'proxy.packetstream.io';
const proxy_port = '31112';
const proxy_username = 'your_username';
const proxy_password = 'your_password';
const proxy_url = `http://${proxy_host}:${proxy_port}`;
(async () => {
const browser = await puppeteer.launch({
args: [`--proxy-server=${proxy_url}`]
});
const page = await browser.newPage();
await page.authenticate({
username: proxy_username,
password: proxy_password
});
await page.goto('http://lumtest.com/myip.json');
await page.screenshot({path: 'puppeteer.png'});
await browser.close();
})();
After the script runs, open puppeteer.png to see the IP address reported by lumtest.com, confirming the request went through the proxy.
NodeJs Playwright
Playwright, another popular tool for browser automation, works similarly to Puppeteer. Here's how to set up PacketStream proxies with Playwright.
Create a new project folder.
mkdir playwright-packetstream
Initialize a new JS project.
cd playwright-packetstream
npm init --y
Install Playwright.
npm install playwright
npx playwright install
Now that you've got your new project set up, create a new JS file and paste the following code into it. Remember to replace `your_username` and `your_password` with your actual username and password.
const playwright = require('playwright');
const proxy_host = 'proxy.packetstream.io';
const proxy_port = '31112';
const proxy_username = 'your_username';
const proxy_password = 'your_password';
const options = {
proxy: {
server: `http://${proxy_host}:${proxy_port}`,
username: proxy_username,
password: proxy_password
}
};
(async () => {
const browser = await playwright.chromium.launch(options);
const page = await browser.newPage();
await page.goto('http://lumtest.com/myip.json');
await page.screenshot({ path: "playwright.png" })
await browser.close();
})();
When the browser is launched, requests are made through our proxy instead of our actual IP address. This setup is strikingly similar to our setup with Puppeteer.
After running the script, open playwright.png to see the IP information returned by lumtest.com.
Case Study: Scrape Amazon.es Prices with PacketStream Proxies
In this case study, we demonstrate how to scrape price information for a product from different regional Amazon websites using Python Requests and PacketStream proxies.
In the example below, we're going to scrape our location information and the first price from Amazon.es using two different proxies in two different countries.
In the script below:
- We first set up a proxy in Portugal.
- Once our proxy is set up, we make a request to Amazon.es to search for portátil (laptop).
- Once we receive our response, we scrape our location information and the first price that appears on the page.
- After we've scraped this data through the Portuguese proxy, we set up a new Spanish proxy connection.
- We then perform the exact same scrape through this connection.
- Finally, we print the results from each proxy to the terminal.
import requests
from bs4 import BeautifulSoup
proxy_host = 'proxy.packetstream.io'
proxy_port = '31112'
proxy_username = 'your_username'
proxy_password = 'your_password'
country = 'Portugal'
proxy_url = f'http://{proxy_username}:{proxy_password}_country-{country}@{proxy_host}:{proxy_port}'
proxies = {
"http": proxy_url,
"https": proxy_url
}
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36"
}
print("----------------------PT-------------------")
response = requests.get('https://www.amazon.es/s?k=port%C3%A1til', proxies=proxies, headers=headers)
soup = BeautifulSoup(response.text, "html.parser")
location = soup.select_one("span[id='glow-ingress-line2']")
print("Location:", location.text)
first_price_holder = soup.select_one("span[class='a-price']")
first_price = first_price_holder.select_one("span[class='a-offscreen']")
print("First Price:", first_price.text)
print("----------------------ES---------------------")
country = "Spain"
proxy_url = f'http://{proxy_username}:{proxy_password}_country-{country}@{proxy_host}:{proxy_port}'
proxies = {
"http": proxy_url,
"https": proxy_url
}
response = requests.get('https://www.amazon.es/s?k=port%C3%A1til', proxies=proxies, headers=headers)
soup = BeautifulSoup(response.text, "html.parser")
location = soup.select_one("div[id='glow-ingress-block']")
print("Location:", location.text.strip())
first_price_holder = soup.select_one("span[class='a-price']")
first_price = first_price_holder.select_one("span[class='a-offscreen']")
print("First Price:", first_price.text)
- We set our credentials and connection information like we did earlier in the geotargeting example.
- After setting our credentials and connection information, we format a string to create our `proxy_url`.
- We set a realistic browser User-Agent header to make our traffic appear more normal.
- With all of these custom settings, we make our request to Amazon.
- Once we've finished this process for Portugal, we reset our proxy and repeat the process with a Spanish proxy connection.
Here is our output from running this code.
----------------------PT-------------------
Location:
Portugal
First Price: 243,96 €
----------------------ES---------------------
Location: Entrega en Mollet De... 08100
Actualizar ubicación
First Price: 279,99 €
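Notice that the prices come back as locale-formatted strings like 243,96 €. If you want to compare them numerically across regions, they need parsing first. Here's a small helper of our own, assuming the European format Amazon.es uses (. as thousands separator, , as decimal separator):

```python
def parse_eur_price(text: str) -> float:
    """Convert a European-formatted price string like '243,96 €' to a float."""
    # Strip the currency symbol and any non-breaking spaces.
    cleaned = text.replace("\xa0", "").replace("€", "").strip()
    # Drop thousands separators, then swap the decimal comma for a point.
    return float(cleaned.replace(".", "").replace(",", "."))

print(parse_eur_price("243,96 €"))    # 243.96
print(parse_eur_price("1.279,99 €"))  # 1279.99
```

With both prices as floats, computing the cross-region difference is a simple subtraction.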
Alternative: ScrapeOps Residential Proxy Aggregator
While PacketStream offers robust proxy services, the ScrapeOps Residential Proxy Aggregator presents a compelling alternative that's worth considering for your web scraping needs.
Why Use ScrapeOps Residential Proxy Aggregator?
- Competitive Pricing: ScrapeOps often provides more affordable rates compared to traditional proxy providers, allowing you to optimize your scraping budget.
- Flexible Plan Options: With a variety of plans, including smaller-scale options, ScrapeOps caters to diverse scraping requirements and project sizes.
- Multi-Provider Access: Unlike single-provider solutions, ScrapeOps gives you access to over 20 residential proxy providers through one proxy port, significantly boosting reliability and performance.
- Smart Proxy Optimization: The service continuously monitors and adjusts proxy performance, ensuring you always use the best-performing proxies without manual intervention.
Using ScrapeOps Residential Proxy Aggregator with Python Requests
Here's an example of how to use the ScrapeOps Residential Proxy Aggregator with Python Requests:
import requests
from bs4 import BeautifulSoup
from dotenv import load_dotenv
import os
load_dotenv()
username = 'scrapeops'
api_key = os.getenv("SCRAPEOPS_API_KEY")
proxy = 'residential-proxy.scrapeops.io'
port = 8181
proxies = {
"http": f"http://{username}:{api_key}@{proxy}:{port}",
"https": f"http://{username}:{api_key}@{proxy}:{port}"
}
response = requests.get('https://example.com', proxies=proxies)
if response.status_code == 200:
soup = BeautifulSoup(response.text, 'html.parser')
title = soup.find('title')
print(f"Page Title: {title.string if title else 'Not found'}")
else:
print(f"Failed to retrieve content. Status code: {response.status_code}")
This script demonstrates how to set up and use ScrapeOps proxies with Python Requests, including proper authentication and error handling.
You can start using ScrapeOps Residential Proxy Aggregator without any upfront cost. Take advantage of the free trial, which offers 500MB of free bandwidth to test the service.
For more detailed information, check out this comprehensive documentation.
Ethical Considerations and Legal Guidelines
When using residential proxies like those provided by PacketStream, it's crucial to consider the ethical implications and legal responsibilities:
- Ethical Sourcing: While PacketStream doesn't explicitly market their proxies as ethically sourced, they do have a system where users opt in to share their bandwidth. It's important to understand how the IPs you're using are obtained.
- PacketStream Policies: Familiarize yourself with PacketStream's terms of service and acceptable use policies. Users are responsible for ensuring their usage complies with these guidelines.
- Responsible Scraping: Always scrape responsibly by:
  - Respecting website terms of service and robots.txt files
  - Implementing rate limiting to avoid overwhelming target servers
  - Handling any personal data in compliance with relevant laws like GDPR
  - Using appropriate user agents and not misrepresenting your bot as a human user
- Legal Compliance: Ensure your scraping activities comply with all applicable laws and regulations in your jurisdiction and the jurisdictions of your target websites.
Conclusion
In this guide, we have explored residential proxies, highlighting PacketStream's offerings alongside alternatives like ScrapeOps. Residential proxies are essential for web scraping due to their ability to provide anonymity and bypass restrictions. PacketStream is noted for its competitive pricing and global reach, with examples provided for integrating it into popular scraping tools.
The ScrapeOps Residential Proxy Aggregator was also recommended for its flexibility across multiple providers. Emphasizing the importance of ethical use and legal compliance, the guide encourages responsible scraping practices to maintain effectiveness and avoid issues like IP bans and rate limiting.
More Python Web Scraping Guides
For more web scraping tutorials, check out the Python Web Scraping Playbook along with these useful guides: