Oxylabs Mobile Proxies: Web Scraping Guide
Oxylabs, one of our providers, is renowned for their proxy and scraping products. They offer a variety of products such as Residential Proxies, Mobile Proxies, Datacenter Proxies, Dedicated Datacenter Proxies, ISP (internet service provider) Proxies and Web Unblocker. When you take your business to Oxylabs, you can trust that you'll be getting a reliable product at a decent price.
For the remainder of this article, we're going to explore Oxylabs Mobile Proxies. We'll go through the process from start to finish.
TLDR: How to Integrate Oxylabs Mobile Proxy?
Integration with Oxylabs Mobile Proxies is pretty standard.
- Once you've got an account and you've created your Mobile Proxy, you need to save your USERNAME and PASSWORD.
- These are used to construct your proxy_url: http://customer-{USERNAME}:{PASSWORD}@pr.oxylabs.io:7777
Once you've got this, connecting to it is pretty simple.
Our code example below sets up a connection using Oxylabs Mobile Proxies.
- Once we've got our proxy_url, we create a dict object to represent our proxies.
- Within this dict, we set our http and https protocols to our proxy_url.
- As we make any HTTP requests, we pass proxies=proxies in as a keyword argument. This tells requests to use that url for our proxy connection.
import requests

USERNAME = "your-username"
PASSWORD = "your-password"
proxy_url = f"http://customer-{USERNAME}:{PASSWORD}@pr.oxylabs.io:7777"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

result = requests.get("https://ip.oxylabs.io/location", proxies=proxies)
print(result.text)
- First, we set up our connection variables: USERNAME and PASSWORD.
- We combine these variables to create our proxy_url: http://customer-{USERNAME}:{PASSWORD}@pr.oxylabs.io:7777
- We create a dict object to hold our http and https proxies. We then assign both of them to our proxy_url.
- We send a GET request to the url using the proxies keyword argument.
- Once we've received our response, we print it to the terminal.
Like most proxy connections, Oxylabs Mobile Proxies require us to create a special url using our username and password.
Once we've got that, we pass our proxies in as a keyword argument to Python Requests.
Understanding Mobile Proxies
What Are Mobile Proxies?
Mobile proxies are proxy servers that route internet traffic through real mobile devices connected to mobile networks (3G, 4G, 5G).
These proxies use IP addresses assigned by mobile carriers, making them appear as legitimate users to websites and online platforms.
Why Are Mobile Proxies Important?
Mobile proxies are powerful tools for tasks requiring high trust, anonymity, and access to geo-restricted content.
Their dynamic nature and association with real mobile carriers make them essential for online activities like web scraping, social media management, and ad verification.
While they come at a higher cost, the benefits often outweigh the investment for users who need reliability and security.
Datacenter vs Premium
When talking about proxies, Datacenter and Premium proxies typically come to mind. Generally, there are three main types of proxy products: Datacenter, Premium, and Managed.
- Datacenter proxies are hosted inside an actual datacenter.
- Premium proxies are composed of both Residential and Mobile proxies.
- Managed proxies usually use a combination of datacenter and premium IP addresses.
In the next few sections, we'll go over the differences between Datacenter and Premium proxies. Since Premium proxies include both Residential and Mobile proxies, we'll then go over the difference between Residential and Mobile proxies.
Datacenter
Pros
- Price: Datacenter Proxies are often very cheap. Oxylabs Datacenter Proxies start at $0.65/GB and as your plan tier increases, they get even cheaper. You also get the option to pay per IP address as well. When paying per IP, plans start at $1.20 per IP.
- Speed: Datacenter proxies offer unmatched performance. With a Datacenter Proxy, you're guaranteed quality hardware with a decent internet connection.
- Availability: Datacenters host warehouses of machines each with their own IP addresses. With a stable internet connection and so many machines to use, datacenter proxies give us access to very large IP pools with unmatched reliability.
Cons
- Blocking: When dealing with more complex sites, they sometimes block datacenter IP addresses. This can make some sites very difficult to scrape if you're limited to datacenter IP addresses.
- Less Geotargeting Support: When using a datacenter proxy, you'll often get support for geotargeting, but it is very limited compared to what you can get when using a premium proxy.
- Less Anonymity: Our IP always shows up inside of a datacenter. Because of this, we don't blend in as well with normal traffic.
Premium
Pros
- Anonymity: Premium Proxies give you access to a regular IP address on a real device. This makes it relatively easy for your scraper to blend in with other traffic.
- Better Access: As we mentioned earlier, datacenter proxies can get blocked. Sites typically don't block Premium Proxies because they're composed of real users with standard internet connections.
Cons
- Price: Premium Proxies are far more expensive than datacenter solutions. If you're on a bandwidth based plan, 1GB costs $1.65 when using a Datacenter Proxy. That same bandwidth from Oxylabs costs about $9 when using a Premium Proxy.
- Speed: When it comes to Premium Proxies, performance is often subpar. When you use a premium proxy, you're getting a regular IP address on a real device. This means that we're limited not only by the compute power (possibly an outdated smartphone), but by the internet connection as well.
Residential proxies are ideal for SERP results, ad verification, social media monitoring/scraping and much more.
Mobile vs Residential
Let's take a look at the two types of Premium Proxies: Residential and Mobile.
Just like we looked at pros and cons for Datacenter and Premium Proxies above, we'll do that here for Residential and Mobile as well. While they carry many of the same benefits and often get lumped into the same group (Premium Proxies), there are some key differences you should be aware of.
Residential
Pros
- Authenticity: Residential proxies give you a real IP address. This makes you far more difficult to detect in comparison to a Datacenter proxy.
- Geotargeting: When using residential proxies, you often get much finer control over your location. Sometimes, you can even narrow down your location to a specific city.
Cons
- Relatively Static: ISPs (internet service providers) sometimes rotate IP addresses. However, they don't do this often. Whether you keep your IP for a week, a month, or even a year, this can often make an IP ban difficult to get past... especially if you're using a dedicated proxy.
- Sometimes More Detectable: It's a very uncommon practice, but there are some sites that check for a mobile only environment. They do this by checking both your IP address and user-agent string.
Mobile
Pros
- Dynamic IP Addresses: Mobile phones receive new IP addresses all the time. Phones move when people move. This causes them to change connection all the time. Whenever your cell phone connects to a different tower, you get a new IP address. If you walk from one store into another, you often get a new IP address.
- Carrier Network Address Translation: This method is a bit different than the IP rotation we mentioned above. With some carriers, many devices sometimes share the same network IP address. This gives you a large anonymity pool because there are loads of other phones all using your same IP address.
- Social Media/Mobile First Platforms: In 2024, many people do most of their browsing on a smartphone rather than a computer. Because of this, many sites (like social media sites) have adopted a mobile first approach. Mobile proxies make us look far more legitimate on these sites.
Cons
- Reliability: As we've already talked about, mobile phones change IP addresses and networks all the time. This can lead to reliability issues.
- Price: Oxylabs supposedly charges $8/GB for Mobile Proxies when using the Pay As You Go plan. However, when purchasing 1GB, it cost me $9. The extra dollar isn't a big deal. Perhaps it is a round-up after adding tax or maybe they need to update the site.
Why Use Oxylabs Mobile Proxies?
With Oxylabs, we get quick access to Mobile Proxies at a very reasonable price. They advertise their pricing plans at $8/GB for the Pay As You Go plan. At their highest tier ($299/month), your bandwidth costs only $7.50/GB.
Along with this simple pricing structure, Oxylabs gives us access to stable mobile IP addresses with country and city geotargeting.
Oxylabs Mobile Proxies give us access to pretty much every site on the web.
Oxylabs Mobile Proxy Pricing
Their pricing plan is pretty straightforward for Mobile Proxies. They offer three separate plans. These plans are tailored so that you can use this product regardless of your bandwidth needs.
As we mentioned earlier, the lowest tier is the Pay As You Go plan. The highest tier comes with 40GB.
For more information about these plans, you can take a look at the table below.
If you're just interested in exploring scraping, or running a small daily scrape, Pay As You Go is definitely the plan for you. If you need a lot of data, go ahead and select one of their high tier plans.
| Plan | Monthly Price | Price per GB |
|---|---|---|
| Pay As You Go | N/A | $8 |
| 13GB | $99 + tax | $7.75 |
| 40GB | $299 + tax | $7.50 |
Generally speaking, when proxy providers offer plans around $2-3 per GB, they are considered cheap. If they offer smaller plans in the $6-8 per GB range, they are more expensive. Oxylabs offers a pay-as-you-go plan that is expensive compared to other alternatives in the market.
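To put these per-GB rates in context, here is a rough sketch of a bandwidth cost estimate. The page count and average page size are made-up inputs, and estimate_cost is a hypothetical helper rather than anything Oxylabs provides. The only figure taken from this article is the $8/GB Pay As You Go rate.

```python
def estimate_cost(pages: int, avg_page_mb: float, price_per_gb: float) -> float:
    """Rough traffic cost in USD for a scrape, ignoring retries and tax."""
    # Assumption: 1 GB is treated as 1024 MB; a provider may bill differently.
    total_gb = pages * avg_page_mb / 1024
    return round(total_gb * price_per_gb, 2)


# Hypothetical workload: 10,000 pages at roughly 2 MB each on Pay As You Go.
print(estimate_cost(pages=10_000, avg_page_mb=2.0, price_per_gb=8.00))
```

Swapping in the per-GB rate of a higher tier shows how much a larger plan would actually save for the same workload.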
We also built a tool that helps you compare Oxylabs with all sorts of other proxy providers. This makes it easier to shop around and ensure that you're getting the best plan for your needs.
You can use our comparison tool here. This tool is built for anybody looking to shop for a proxy provider.
Setting Up Oxylabs Mobile Proxies
- For starters, if you don't have one already, you'll need to set up an account. As you might have noticed in the screenshot from their front page, click Try Oxylabs today.
- Afterward, you can create an account using either Google or with a traditional email and password.
- Once you've got an account, you'll be taken to your account dashboard. Here, you can navigate to the Mobile Proxy option under the Proxies dropdown. Once you've selected it, go ahead and click Buy now.
- After clicking the Buy now button, you'll be prompted to set up your plan. In the example below, we select the $9/GB Pay As You Go plan.
- Next, you'll be prompted to actually purchase your traffic. Go ahead and click Continue.
- Next, you'll be prompted to select a payment method. Choose whichever method works for you and enter your required information. After you pay, the payment takes a second to process. Once it's done processing, you'll be prompted to set up your new Mobile Proxy. Click on the button titled Let's start.
- After you've finished setting up your username and password for the proxy, you'll be given a cURL command that you can use to test everything out. Copy and paste the command into your terminal and hit Enter.
- After running the cURL command listed above, you should receive a response similar to the one below. If you do, this means that everything is working.
{"ip":"172.58.180.148","providers":{"dbip":{"country":"US","asn":"AS21928","org_name":"T-Mobile USA, Inc.","city":"Jenks","zip_code":"","time_zone":"","meta":"\u003ca href='https://db-ip.com'\u003eIP Geolocation by DB-IP\u003c/a\u003e"},"ip2location":{"country":"US","asn":"","org_name":"","city":"Dallas","zip_code":"75201","time_zone":"-06:00","meta":"This site or product includes IP2Location LITE data available from \u003ca href=\"https://lite.ip2location.com\"\u003ehttps://lite.ip2location.com\u003c/a\u003e."},"ipinfo":{"country":"US","asn":"AS21928","org_name":"T-Mobile USA, Inc.","city":"","zip_code":"","time_zone":"","meta":"\u003cp\u003eIP address data powered by \u003ca href=\"https://ipinfo.io\" \u003eIPinfo\u003c/a\u003e\u003c/p\u003e"},"maxmind":{"country":"US","asn":"AS21928","org_name":"T-MOBILE-AS21928","city":"Dallas","zip_code":"","time_zone":"-06:00","meta":"This product includes GeoLite2 Data created by MaxMind, available from https://www.maxmind.com."}}}
Authentication
We get the option to authenticate through both username and password as well as IP whitelisting. This tutorial is going to focus on username and password authentication, but we will also walk you through the process of whitelisting an IP address.
First, click on Whitelist.
Now, click the Edit whitelist button.
Next, you'll be prompted to enter any IPs you'd like to whitelist line by line.
As you may have noticed either in the TLDR or the Setting Up sections, when you authenticate with your username and password, it gets passed into the URL like this:
http://customer-{USERNAME}:{PASSWORD}@pr.oxylabs.io:7777
- Our username is always laid out as follows: customer-YOUR_USERNAME.
- We first put a flag for our username, customer, and then we attach our username to it: -YOUR_USERNAME.
- Then, we append a colon (:) and add our password after that: customer-{USERNAME}:{PASSWORD}.
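One practical wrinkle with this layout: a password containing characters such as @ or : would break the URL. The sketch below shows one way to guard against that by percent-encoding the credentials. build_proxy_url is a hypothetical helper written for this guide, and the password is a made-up example. Requests should decode the encoded credentials again when it authenticates with the proxy.

```python
from urllib.parse import quote


def build_proxy_url(username: str, password: str,
                    host: str = "pr.oxylabs.io", port: int = 7777) -> str:
    """Build the proxy URL, percent-encoding characters that would break it."""
    user = quote(f"customer-{username}", safe="")
    secret = quote(password, safe="")
    return f"http://{user}:{secret}@{host}:{port}"


# Made-up password containing both "@" and ":".
print(build_proxy_url("your-username", "p@ss:word"))
```

If your credentials only contain letters, digits and hyphens, the result is identical to the plain f-string used throughout this article.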
Basic Request Using Oxylabs Mobile Proxies
By this point, you should already know that we authenticate with our username and password. Now, let's make a basic request using Python. Our requests will go to:
https://ip.oxylabs.io/location
This API endpoint tells us all sorts of stuff about our IP and location information.
- First, we save our USERNAME and PASSWORD variables.
- Then, we use them to build our proxy_url: http://customer-{USERNAME}:{PASSWORD}@pr.oxylabs.io:7777. We use a dict and assign this url to both our http and https protocols.
- Finally, we pass this dict into our proxies argument: proxies=proxies.
import requests

USERNAME = "your-username"
PASSWORD = "your-password"
proxy_url = f"http://customer-{USERNAME}:{PASSWORD}@pr.oxylabs.io:7777"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

result = requests.get("https://ip.oxylabs.io/location", proxies=proxies)
print(result.text)
Here's some sample output from the code above.
{"ip":"24.206.23.54","providers":{"dbip":{"country":"BS","asn":"AS15146","org_name":"Cable Bahamas","city":"Killarney","zip_code":"","time_zone":"","meta":"\u003ca href='https://db-ip.com'\u003eIP Geolocation by DB-IP\u003c/a\u003e"},"ip2location":{"country":"BS","asn":"","org_name":"","city":"Nassau","zip_code":"-","time_zone":"-05:00","meta":"This site or product includes IP2Location LITE data available from \u003ca href=\"https://lite.ip2location.com\"\u003ehttps://lite.ip2location.com\u003c/a\u003e."},"ipinfo":{"country":"BS","asn":"AS15146","org_name":"Cable Bahamas","city":"","zip_code":"","time_zone":"","meta":"\u003cp\u003eIP address data powered by \u003ca href=\"https://ipinfo.io\" \u003eIPinfo\u003c/a\u003e\u003c/p\u003e"},"maxmind":{"country":"BS","asn":"AS15146","org_name":"CABLEBAHAMAS","city":"Nassau","zip_code":"","time_zone":"-05:00","meta":"This product includes GeoLite2 Data created by MaxMind, available from https://www.maxmind.com."}}}
In the JSON above, our country shows up as BS, or The Bahamas. The org_name for our ISP is Cable Bahamas. Our proxy is in the city of Nassau. Our time_zone is -05:00, which means UTC-5.
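Reading that JSON by eye gets tedious, so here is a minimal sketch of pulling the same fields out in code. The sample string is a trimmed copy of the output above (the meta, asn and zip_code fields are dropped), and summarize_location is a hypothetical helper. With a live proxy, you would pass result.json() instead.

```python
import json

# Trimmed copy of the sample response shown above.
sample = """
{"ip": "24.206.23.54",
 "providers": {
   "dbip": {"country": "BS", "org_name": "Cable Bahamas", "city": "Killarney", "time_zone": ""},
   "ip2location": {"country": "BS", "org_name": "", "city": "Nassau", "time_zone": "-05:00"},
   "ipinfo": {"country": "BS", "org_name": "Cable Bahamas", "city": "", "time_zone": ""},
   "maxmind": {"country": "BS", "org_name": "CABLEBAHAMAS", "city": "Nassau", "time_zone": "-05:00"}
 }}
"""


def summarize_location(payload: dict) -> dict:
    """Collect the distinct non-empty values the providers report per field."""
    summary = {"ip": payload["ip"]}
    for field in ("country", "org_name", "city", "time_zone"):
        values = {p.get(field, "") for p in payload["providers"].values()}
        summary[field] = sorted(v for v in values if v)
    return summary


summary = summarize_location(json.loads(sample))
for field, values in summary.items():
    print(field, values)
```

Notice that the providers do not always agree: one of them reports Killarney rather than Nassau, which is worth keeping in mind when you check geotargeting results.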
Country Geotargeting
Country geotargeting is the ability to route internet traffic through an IP address that originates from a specific country.
Country geotargeting allows users to appear as though they are accessing the internet from mobile networks in a chosen country.
This feature provides several benefits:
- Access Geo-Restricted Content: Unlock websites, services, or platforms restricted to specific countries.
- Localized Marketing and SEO: Test ads, keywords, and content tailored for regional audiences.
- Ad Verification: Ensure ads display correctly in target countries while detecting fraud.
- Market Research: Gather accurate insights on regional competitors and trends.
- Bypass Regional Restrictions: Overcome censorship and firewalls in restricted regions.
- Social Media and E-Commerce Management: Safely manage accounts and campaigns for specific countries.
- Improved Anonymity and Security: Appear as a legitimate user from the selected region while maintaining privacy.
Geotargeting with Oxylabs Mobile proxies is top notch. Oxylabs Mobile Proxies give us access to over 140 countries. This allows us to target specific information when scraping.
To geotarget by country, we use the cc flag followed by our country code.
The code below is structured much like the basic request we used earlier, except this time, we create a COUNTRY variable and add it to our proxy_url.
import requests

USERNAME = "your-username"
PASSWORD = "your-password"
COUNTRY = "IT"

proxy_url = f"http://customer-{USERNAME}-cc-{COUNTRY}:{PASSWORD}@pr.oxylabs.io:7777"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

result = requests.get("https://ip.oxylabs.io/location", proxies=proxies)
print(result.text)
The major change to this code is simply the cc flag. We pass a country code along with this flag, and then we get routed through that country. When we run this code, we get the result you can see below.
If you look at the JSON, our country shows up as IT, or Italy. Our city is Rome. Our geotargeting is working.
{"ip":"5.91.137.97","providers":{"dbip":{"country":"IT","asn":"AS30722","org_name":"Vodafone Italia S.p.A.","city":"Rome","zip_code":"","time_zone":"","meta":"\u003ca href='https://db-ip.com'\u003eIP Geolocation by DB-IP\u003c/a\u003e"},"ip2location":{"country":"IT","asn":"","org_name":"","city":"Ripa","zip_code":"00135","time_zone":"+01:00","meta":"This site or product includes IP2Location LITE data available from \u003ca href=\"https://lite.ip2location.com\"\u003ehttps://lite.ip2location.com\u003c/a\u003e."},"ipinfo":{"country":"IT","asn":"AS30722","org_name":"Vodafone Italia S.p.A.","city":"","zip_code":"","time_zone":"","meta":"\u003cp\u003eIP address data powered by \u003ca href=\"https://ipinfo.io\" \u003eIPinfo\u003c/a\u003e\u003c/p\u003e"},"maxmind":{"country":"IT","asn":"AS30722","org_name":"Vodafone Italia S.p.A.","city":"Rome","zip_code":"","time_zone":"+01:00","meta":"This product includes GeoLite2 Data created by MaxMind, available from https://www.maxmind.com."}}}
City Geotargeting
City geotargeting is routing internet traffic through an IP address associated with a specific city, allowing users to access content, services, or ads that are tailored to that urban location.
This technique is commonly used to simulate user activity from a particular city, ensuring highly localized interactions with websites and online platforms.
With their Mobile Proxies, Oxylabs also gives us support for city level geotargeting. Many residential and mobile providers support this, but few offer reliable support with it. Oxylabs is one of the few that actually gives us consistent support for multiple cities.
To use this type of geotargeting, we need to add the city flag to our country geotargeted URL:
http://customer-{USERNAME}-cc-{COUNTRY}-city-{CITY}:{PASSWORD}@pr.oxylabs.io:7777
When using city geotargeting, we need to pass both the cc (country) and city (city) flags into our url.
Take a look at the example below.
import requests

USERNAME = "your-username"
PASSWORD = "your-password"
COUNTRY = "US"
CITY = "los_angeles"

proxy_url = f"http://customer-{USERNAME}-cc-{COUNTRY}-city-{CITY}:{PASSWORD}@pr.oxylabs.io:7777"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

result = requests.get("https://ip.oxylabs.io/location", proxies=proxies)
print(result.text)
When running this code, you'll get a JSON response similar to the one you see below.
{"ip":"166.199.100.19","providers":{"dbip":{"country":"US","asn":"AS7018","org_name":"AT\u0026T Services, Inc.","city":"Tallahassee","zip_code":"","time_zone":"","meta":"\u003ca href='https://db-ip.com'\u003eIP Geolocation by DB-IP\u003c/a\u003e"},"ip2location":{"country":"US","asn":"","org_name":"","city":"Gainesville","zip_code":"32601","time_zone":"-05:00","meta":"This site or product includes IP2Location LITE data available from \u003ca href=\"https://lite.ip2location.com\"\u003ehttps://lite.ip2location.com\u003c/a\u003e."},"ipinfo":{"country":"US","asn":"AS7018","org_name":"AT\u0026T Services, Inc.","city":"","zip_code":"","time_zone":"","meta":"\u003cp\u003eIP address data powered by \u003ca href=\"https://ipinfo.io\" \u003eIPinfo\u003c/a\u003e\u003c/p\u003e"},"maxmind":{"country":"US","asn":"AS7018","org_name":"ATT-INTERNET4","city":"Los Angeles","zip_code":"","time_zone":"-08:00","meta":"This product includes GeoLite2 Data created by MaxMind, available from https://www.maxmind.com."}}}
Near the bottom of the JSON, in the maxmind entry, you can see that our country is US. Our city is showing up as Los Angeles. This city level geotargeting is working.
In 4 out of 5 runs, our maxmind location showed up in the city of Los Angeles. One time, it showed up as Garden Grove.
According to Google Maps, Garden Grove is just outside Los Angeles.
Oxylabs Mobile Proxies give us consistently good results when using city level geotargeting.
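If you want to repeat this kind of spot check yourself, the sketch below tallies the city one provider reports over several runs. tally_cities is a hypothetical helper, and the fake fetch function simply replays the 4-out-of-5 result described above so the example runs offline. With a real proxy, the fetch function would return the parsed JSON from https://ip.oxylabs.io/location.

```python
from collections import Counter


def tally_cities(fetch_location, runs: int = 5, provider: str = "maxmind") -> Counter:
    """Call fetch_location `runs` times and count the city one provider reports."""
    counts = Counter()
    for _ in range(runs):
        payload = fetch_location()
        city = payload["providers"][provider].get("city") or "unknown"
        counts[city] += 1
    return counts


# Offline stand-in that replays the result described above. In practice, use:
# lambda: requests.get("https://ip.oxylabs.io/location", proxies=proxies).json()
replay = iter(["Los Angeles"] * 4 + ["Garden Grove"])
result = tally_cities(lambda: {"providers": {"maxmind": {"city": next(replay)}}})
print(result.most_common())
```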
City level geotargeting gives us access to hyper localized content. When you're dealing with local content, you can extract the following types of data at a local level. This allows you to collect and manage your precious data at a much more granular level.
- Local Ads
- Local Businesses
- Local Social Media
- Local Events
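Since the cc and city flags are just pieces appended to the username, it can help to assemble them in one place. The sketch below is one way to do that. build_username is a hypothetical helper, and the city formatting (lowercase with underscores) is an assumption based on the los_angeles value used in this article, not a documented rule.

```python
from typing import Optional


def build_username(username: str, country: Optional[str] = None,
                   city: Optional[str] = None) -> str:
    """Assemble customer-{USERNAME}[-cc-{COUNTRY}][-city-{CITY}]."""
    if city and not country:
        raise ValueError("city targeting needs a country code as well")
    parts = [f"customer-{username}"]
    if country:
        parts.append(f"cc-{country.upper()}")
    if city:
        # Assumption: multi-word cities are written like "los_angeles".
        parts.append(f"city-{city.strip().lower().replace(' ', '_')}")
    return "-".join(parts)


print(build_username("your-username", "us", "Los Angeles"))
```

The returned string goes in front of the colon in the proxy url, in place of the hand-written customer-{USERNAME}-cc-{COUNTRY}-city-{CITY} portion.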
Error Codes
When something goes wrong, error codes will either tell us exactly what we need to know, or they will point us in the right direction.
A status code of 200 tells us that our request was successful. Other status codes take a little more work. Each one comes with its own message, which we can use to troubleshoot and fix the error.
| Status Code | Type | Description |
|---|---|---|
| 200 | Success | Everything works as expected. |
| 400 | Bad Request | Your request is likely formatted incorrectly. |
| 403 | Access Denied | Add HTTP/HTTPS to your request or check your url. |
| 407 | Proxy Authentication Required | Authentication failed or too many requests. |
| 500 | Internal Server Error | There was an issue with the Oxylabs server, try again. |
| 502 | Bad Response, Bad Gateway | Invalid response or timeout from the target server. |
Status codes are essential for troubleshooting. When you encounter an error, look up the status code and respond accordingly.
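As a rough illustration of acting on these codes, the sketch below retries errors that look transient (500, 502) and stops immediately on ones that need a fix on our side (400, 403, 407). fetch_with_retries is a hypothetical helper, and the split between retryable and fatal codes is our own reading of the table, not an Oxylabs recommendation. The fake responses let the example run without a proxy.

```python
import time
from types import SimpleNamespace

# Codes from the table above that retrying will not fix.
FATAL = {400, 403, 407}


def fetch_with_retries(send, max_tries=3, delay=1.0):
    """Call send() until it returns 200, hits a fatal code, or runs out of tries.

    `send` is any zero-argument callable returning an object with a
    `status_code` attribute, such as a lambda wrapping requests.get().
    Returns (last_response, tries_used).
    """
    response, attempt = None, 0
    for attempt in range(1, max_tries + 1):
        response = send()
        if response.status_code == 200 or response.status_code in FATAL:
            break
        if attempt < max_tries:
            time.sleep(delay)  # 500, 502 and other codes are treated as transient
    return response, attempt


# Offline stand-in for real requests: two server errors, then success.
codes = iter([502, 500, 200])
response, tries = fetch_with_retries(
    lambda: SimpleNamespace(status_code=next(codes)), delay=0
)
print(response.status_code, tries)
```

With a real proxy, send could be lambda: requests.get(url, proxies=proxies, timeout=30).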
KYC Verification
Most reputable Mobile and Residential providers use some type of KYC (Know Your Customer) verification.
Oxylabs uses an ongoing KYC process. While it's not as stringent as Bright Data's KYC policy, you do need to enter some personal information when signing up, and Oxylabs continuously monitors your usage of their products.
If you use their products for anything suspicious or illegal, they reserve the right to suspend or ban you completely. This might sound a bit extreme, but it's not.
Their premium bandwidth is provided by real people lending out real bandwidth and a small amount of their compute power.
Implementing Oxylabs Mobile Proxies in Web Scraping
We've gotten a pretty good idea of how to use Oxylabs Mobile Proxies. Next, it's time to implement these proxies using some of the more popular web scraping frameworks.
In Python, we'll show some code examples for Requests, SeleniumWire (DEPRECATED), and Scrapy. Next, we'll go through how to implement them in NodeJS with Puppeteer and Playwright.
By the end of this section, you'll be able to connect to Oxylabs Mobile Proxies easily.
Python Requests
Take a look at our example from our TLDR section.
- We create our configuration variables, then we use them to build a proxy_url.
- Then, we assign that proxy_url to the http and https protocols of our proxies dict.
- When making our requests, we pass proxies=proxies in as one of our keyword arguments.
This tells Python Requests to forward all of these requests through the proxy connection we set.
import requests

USERNAME = "your-username"
PASSWORD = "your-password"
proxy_url = f"http://customer-{USERNAME}:{PASSWORD}@pr.oxylabs.io:7777"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

result = requests.get("https://ip.oxylabs.io/location", proxies=proxies)
print(result.text)
- Like before, we set up our configuration variables.
- We create a proxy_url that holds those variables: http://customer-{USERNAME}:{PASSWORD}@pr.oxylabs.io:7777
- We then create a dict object that holds both our http and https proxies.
- When making our requests, we make sure to pass proxies=proxies. This tells Python Requests to use the dict object we created for our proxy settings.
- Once we get our response back, we print it to the terminal.
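When a scraper makes many requests, passing proxies=proxies every time gets repetitive. A minimal sketch of the alternative is to attach the proxies to a requests.Session once. The credentials are the same placeholders used above, and the actual request is commented out so the snippet runs without a live proxy. One caveat from the Requests documentation: proxies set through environment variables can override session.proxies, so passing proxies= per request remains the safer choice on machines where those variables are set.

```python
import requests

USERNAME = "your-username"
PASSWORD = "your-password"
proxy_url = f"http://customer-{USERNAME}:{PASSWORD}@pr.oxylabs.io:7777"

session = requests.Session()
# Requests made through this session pick up these proxy settings by default.
session.proxies.update({"http": proxy_url, "https": proxy_url})

# Uncomment with real credentials to send a request through the proxy:
# print(session.get("https://ip.oxylabs.io/location", timeout=30).text)
print(session.proxies["https"])
```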
Python Selenium
Vanilla Selenium does not support authenticated proxies. For years, SeleniumWire has been the go-to when you want to use proxies with Selenium.
Sadly, SeleniumWire has been deprecated. It hasn't received any code updates in two years and will likely never receive them again. However, it is still technically possible to use Oxylabs Mobile Proxies via SeleniumWire, but we strongly advise against it.
When you decide to use SeleniumWire, you are vulnerable to the following risks:
- Security: Browsers are updated with security patches regularly. Without these updates, SeleniumWire may have security holes that maintained browsers and drivers such as Chromedriver or Geckodriver have already fixed.
- Dependency Issues: SeleniumWire is no longer maintained. In time, it may not be able to keep up with its dependencies as they get updated. Broken dependencies can be a source of unending headache for anyone in software development.
- Compatibility: As the web itself gets updated, SeleniumWire doesn't. Regular browsers are updated all the time. Since SeleniumWire no longer receives updates, you may experience broken functionality and unexpected behavior.
As time goes on, the probability of all these problems increases. If you understand the risks but still wish to use SeleniumWire, you can view a guide on that here.
Depending on your time of reading, the code example below may or may not work. As mentioned above, we strongly recommend against using SeleniumWire because of its deprecation, but if you decide to do so anyway, here you go.
We are not responsible for any damage that this may cause to your machine or your privacy. As well, the example below does not contain any retry logic for bad responses. If you wish to do so, you can add this in yourself.
from seleniumwire import webdriver

USERNAME = "your-username"
PASSWORD = "your-password"
proxy_url = f"http://customer-{USERNAME}:{PASSWORD}@pr.oxylabs.io:7777"

proxy_options = {
    "proxy": {
        "http": proxy_url,
        "https": proxy_url,
        # Comma-separated list of hosts that should bypass the proxy
        "no_proxy": "localhost,127.0.0.1"
    }
}

driver = webdriver.Chrome(seleniumwire_options=proxy_options)
driver.get('https://httpbin.org/ip')
# Close the browser once the page has loaded
driver.quit()
- We build our proxy_url exactly the way we have throughout this article.
- We assign this url to both the http and https protocols of our proxy settings.
- driver = webdriver.Chrome(seleniumwire_options=proxy_options) tells webdriver to open Chrome with our custom seleniumwire_options.
Python Scrapy
Now it's time to do the same thing using Scrapy. In this example, we add our proxy connection right into our spider.
To start, we need to make a new Scrapy project.
scrapy startproject mobile
Then, from within your new Scrapy project, create a new Python file inside the spiders folder with the following code.
import scrapy

USERNAME = "your-username"
PASSWORD = "your-password"
proxy_url = f"http://customer-{USERNAME}:{PASSWORD}@pr.oxylabs.io:7777"

class ExampleSpider(scrapy.Spider):
    name = "mobile_proxy"

    def start_requests(self):
        request = scrapy.Request(url="https://httpbin.org/ip", callback=self.parse)
        request.meta['proxy'] = proxy_url
        yield request

    def parse(self, response):
        print(response.body)
You can run this spider with the following command.
scrapy crawl mobile_proxy
- First, we create our configuration variables, just like we did before.
- Once again, we create the same basic proxy_url.
- From inside our start_requests method, we assign our proxy_url to request.meta['proxy']. This tells Scrapy to forward all of this spider's requests through the proxy_url we created earlier.
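Setting request.meta['proxy'] by hand works for one request, but larger spiders usually want it applied everywhere. One possible approach is a small downloader middleware, sketched below. OxylabsProxyMiddleware and the OXYLABS_PROXY_URL setting are names made up for this example, and the method signature follows the long-standing process_request(request, spider) form, so check it against the Scrapy version you run. The class is plain Python, and the demo at the bottom uses a stand-in request object so it can run without Scrapy installed.

```python
from types import SimpleNamespace


class OxylabsProxyMiddleware:
    """Hypothetical downloader middleware that attaches the proxy to requests."""

    def __init__(self, proxy_url):
        self.proxy_url = proxy_url

    @classmethod
    def from_crawler(cls, crawler):
        # OXYLABS_PROXY_URL is a made-up setting name for this sketch.
        return cls(crawler.settings.get("OXYLABS_PROXY_URL"))

    def process_request(self, request, spider):
        # Keep any proxy a spider has already chosen for this request.
        request.meta.setdefault("proxy", self.proxy_url)
        return None  # let Scrapy continue processing the request


# settings.py (assuming the "mobile" project created above):
# OXYLABS_PROXY_URL = "http://customer-your-username:your-password@pr.oxylabs.io:7777"
# DOWNLOADER_MIDDLEWARES = {"mobile.middlewares.OxylabsProxyMiddleware": 350}

# Stand-in request object so the sketch runs without Scrapy.
demo_request = SimpleNamespace(meta={})
OxylabsProxyMiddleware("http://proxy.example:7777").process_request(demo_request, None)
print(demo_request.meta)
```

The priority of 350 is chosen so this runs before Scrapy's built-in HttpProxyMiddleware (750 by default), which is the component that reads meta['proxy'].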
NodeJS Puppeteer
Here, we'll implement the same proxy connection using Puppeteer. Puppeteer has built-in support for authentication. This gives us a slightly different way to use our configuration variables.
Create a new folder.
mkdir puppeteer-mobile
cd into the new folder and create a new JavaScript project.
cd puppeteer-mobile
npm init --y
Next, we need to install Puppeteer.
npm install puppeteer
Next, create a new JavaScript file. Copy and paste the code below into your new file.
const puppeteer = require("puppeteer");

const USERNAME = "your-username";
const PASSWORD = "your-password";
const proxy_url = "pr.oxylabs.io:7777";

const FULL_USER = `customer-${USERNAME}`;

(async () => {
    const browser = await puppeteer.launch({
        args: [`--proxy-server=http://${proxy_url}`]
    });

    const page = await browser.newPage();

    await page.authenticate({
        username: FULL_USER,
        password: PASSWORD
    });

    await page.goto('http://lumtest.com/myip.json');
    await page.screenshot({path: 'puppeteer.png'});

    await browser.close();
})();
- First, we create our config variables: USERNAME and PASSWORD. We declare all of these as constants.
- Next, we create a FULL_USER, which combines our USERNAME string with the customer- flag.
- We add our proxy url to our launch args: args: [`--proxy-server=http://${proxy_url}`].
- We pass the following credentials into page.authenticate():
  - username: FULL_USER
  - password: PASSWORD
Puppeteer offers great proxy support out of the box. We combine our USERNAME with the customer flag to create our FULL_USER. We then use both our FULL_USER and PASSWORD to authenticate the proxy.
The screenshot below came from running our Puppeteer code above.
NodeJS Playwright
Connecting to our proxy via Playwright is going to be almost identical. Puppeteer and Playwright actually share a common origin in Chrome's DevTools.
If you look at the code example below, Playwright also has designated spots for our username and password. Follow the steps below to get up and running.
Create a new project folder.
mkdir playwright-mobile
cd into the new folder and initialize a JavaScript project.
cd playwright-mobile
npm init --y
Install Playwright.
npm install playwright
npx playwright install
Next, you can copy/paste the code below into a JavaScript file.
const playwright = require("playwright");

const USERNAME = "your-username";
const PASSWORD = "your-password";
const proxy_url = "pr.oxylabs.io:7777";

const FULL_USER = `customer-${USERNAME}`;

const options = {
    proxy: {
        server: `http://${proxy_url}`,
        username: FULL_USER,
        password: PASSWORD
    }
};

(async () => {
    const browser = await playwright.chromium.launch(options);
    const page = await browser.newPage();

    await page.goto('http://lumtest.com/myip.json');

    await page.screenshot({ path: "playwright.png" });

    await browser.close();
})();
- Just like with Puppeteer, we first set up our configuration variables.
- We create a proxy object with the following fields:
  - server: http://${proxy_url}
  - username: FULL_USER
  - password: PASSWORD
Like Puppeteer, Playwright comes packed with top notch support for authenticated proxies. You can view an example response in the screenshot below.
Case Study: Scrape Weather.com
We're going to do a small case study. We'll scrape a small amount of metadata from weather.com: just the site description. We're going to do this using both a Portuguese proxy and a US based one.
We'll learn a couple things about our proxies from this case study.
- First, we should get different content based on our location.
- We're also going to count our tries. This will allow us to benchmark the proxy quality in each country. Ideally, both proxies should get our content on the first try, but that isn't always the case.
Below is the code for our case study.
- We start by using urllib3 to disable some SSL warnings that make it difficult to read our output.
- In the code below, we set up our first proxy connection in Portugal.
- Then, we scrape the site description. We also count the number of tries it takes to get a successful response.
- Afterward, we perform the same scrape using a US-based proxy. This allows us to compare the quality of both proxy connections.
import requests
import urllib3
from bs4 import BeautifulSoup

# Silence the warnings triggered by verify=False
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

USERNAME = "your-username"
PASSWORD = "your-password"
COUNTRY = "PT"

# The -cc-{COUNTRY} flag geotargets the proxy
proxy_url = f"http://customer-{USERNAME}-cc-{COUNTRY}:{PASSWORD}@pr.oxylabs.io:7777"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36"
}

print("----------------------PT-------------------")

success = False
tries = 1
while not success:
    response = requests.get("https://weather.com", proxies=proxies, headers=headers, verify=False)
    if response.status_code != 200:
        print(f"Failed to get site, Status Code: {response.status_code}")
        tries += 1
        continue

    soup = BeautifulSoup(response.text, "html.parser")
    description_tag = soup.select_one("meta[name='description']")
    description = description_tag.get("content")

    print("Description:", description)
    print("Total tries:", tries)
    success = True

print("----------------------US---------------------")

COUNTRY = "US"

# Rebuild the proxy URL with the new country code
proxy_url = f"http://customer-{USERNAME}-cc-{COUNTRY}:{PASSWORD}@pr.oxylabs.io:7777"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

success = False
tries = 1
while not success:
    response = requests.get("https://weather.com", proxies=proxies, headers=headers, verify=False)
    if response.status_code != 200:
        print(f"Failed to get site, Status Code: {response.status_code}")
        tries += 1
        continue

    soup = BeautifulSoup(response.text, "html.parser")
    description_tag = soup.select_one("meta[name='description']")
    description = description_tag.get("content")

    print("Description:", description)
    print("Total tries", tries)
    success = True
Here are some key things you should notice in this code:
- urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning) disables SSL warnings. This will make our output a bit easier to read.
- After setting up each proxy connection, we do the following:
  - First, we create our success and tries variables.
  - As long as our operation hasn't succeeded, we continue trying and we increment tries for each unsuccessful try.
  - Once successful, we locate the page description with soup.select_one("meta[name='description']").
  - We finally print our description and tries to the terminal.
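The retry loop in the case study keeps going until it gets a 200, so a proxy that never succeeds would loop forever. One way to bound it is sketched below. This is a hedged illustration rather than part of the original code: the `fetch_with_retries` helper, its `max_tries` parameter, and the stand-in `fetch` callable are all assumptions made for the example. The request is passed in as a function so that the retry logic can be tried out without a live proxy.

```python
from typing import Callable, Optional, Tuple


def fetch_with_retries(fetch: Callable[[], Tuple[int, str]],
                       max_tries: int = 5) -> Tuple[Optional[str], int]:
    """Call fetch() until it returns status 200 or max_tries is reached.

    fetch must return (status_code, body). The result is (body, tries),
    where body is None if every attempt failed.
    """
    for tries in range(1, max_tries + 1):
        status_code, body = fetch()
        if status_code == 200:
            return body, tries
        print(f"Failed to get site, Status Code: {status_code}")
    return None, max_tries


# Stand-in for a real request: fails twice, then succeeds
responses = iter([(503, ""), (429, ""), (200, "<html>ok</html>")])
body, tries = fetch_with_retries(lambda: next(responses))
print("Total tries:", tries)
```

With a real proxy, `fetch` could wrap the `requests.get(...)` call from the case study and return `(response.status_code, response.text)`.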
If you run this code, you should receive something similar to this.
----------------------PT-------------------
Description: Previsão e condições meteorológicas e radar Doppler de hoje e de hoje à noite para Socorro, Lisboa por The Weather Channel e Weather.com
Total tries: 1
----------------------US---------------------
Description: The Weather Channel and weather.com provide a national and local weather forecast for cities, as well as weather radar, report and hurricane coverage
Total tries 1
Take a look at our site descriptions for each proxy.
- Portugal: Previsão e condições meteorológicas e radar Doppler de hoje e de hoje à noite para Socorro, Lisboa por The Weather Channel e Weather.com.
- United States: The Weather Channel and weather.com provide a national and local weather forecast for cities, as well as weather radar, report and hurricane coverage.
Now, take a look at our total tries for each proxy.
- Portugal: 1
- United States: 1
Both of our proxies were able to retrieve the site on the first try. We ran this code 5 times to make sure. On each run, we received our responses on the first try. With other mobile proxy providers, we tend to get errors or inconsistent connections more often than that.
Based on these runs, Oxylabs provides a steady, reliable mobile connection in both of the countries we geotargeted.
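If you repeat the benchmark yourself, it helps to summarize the tries from each run instead of eyeballing them. The sketch below assumes you have recorded the number of tries per run in a list. The `summarize` helper is hypothetical, and the sample lists simply encode the five first-try runs described above.

```python
from statistics import mean
from typing import Dict, List


def summarize(tries_per_run: List[int]) -> Dict[str, float]:
    """Summarize how many tries each run needed before a 200 response."""
    runs = len(tries_per_run)
    first_try = sum(1 for t in tries_per_run if t == 1)
    return {
        "runs": runs,
        "first_try_rate": first_try / runs,
        "avg_tries": mean(tries_per_run),
    }


# Five runs per country, each succeeding on the first try
results = {"PT": [1, 1, 1, 1, 1], "US": [1, 1, 1, 1, 1]}

for country, tries in results.items():
    print(country, summarize(tries))
```

Five runs is a small sample, so treat these numbers as a quick sanity check rather than a full benchmark.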
Alternative: ScrapeOps Residential Proxy Aggregator
Mobile proxies from Oxylabs are pretty reliable, but they're expensive.
At ScrapeOps, we offer a Residential Proxy Aggregator that also gives you access to mobile proxies. With our Residential Proxy Aggregator, you get the same anonymity and anti-bot resistance that you get from Oxylabs Mobile Proxies.
ScrapeOps Residential Proxy Aggregator provides access to the top 20 residential proxy providers through a single port, ensuring a high success rate for web scraping tasks. It automatically switches proxies to avoid blocks, optimizing performance and cost with flexible pricing plans.
- Access to the top 20 residential proxy providers, including Smartproxy, Bright Data, and Oxylabs.
- 98% success rate due to automatic proxy switching.
- Bypasses anti-bot measures and avoids blocks.
- Optimizes performance and cost by monitoring proxy performance and pricing.
- Flexible pricing plans starting at $15 per month, with up to $999 for higher usage.
- 500 MB of free bandwidth credits to start.
Our plans range in price from free (yes... FREE) all the way to $999 per month. We have 8 different choices available.
The table below outlines our pricing.
| Monthly Price | Bandwidth | Cost per GB |
|---|---|---|
| Free | 100MB | $0 |
| $15 | 3GB | $5 |
| $45 | 10GB | $4.50 |
| $99 | 25GB | $3.96 |
| $149 | 50GB | $2.98 |
| $249 | 100GB | $2.49 |
| $449 | 200GB | $2.25 |
| $999 | 500GB | $2 |
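The per-GB figures in the table are simply the monthly price divided by the included bandwidth, rounded to the nearest cent. The short sketch below reproduces that arithmetic for the paid plans. The `PLANS` list and `cost_per_gb` helper are names made up for this illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# (monthly price in USD, bandwidth in GB) for the paid plans in the table
PLANS = [(15, 3), (45, 10), (99, 25), (149, 50), (249, 100), (449, 200), (999, 500)]


def cost_per_gb(price: int, gigabytes: int) -> Decimal:
    """Return price / bandwidth, rounded half-up to two decimal places."""
    raw = Decimal(price) / Decimal(gigabytes)
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for price, gigabytes in PLANS:
    print(f"${price}/month for {gigabytes}GB works out to ${cost_per_gb(price, gigabytes)}/GB")
```

Decimal is used instead of float so that a value like 449 / 200 = 2.245 rounds to 2.25 as shown in the table.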
On top of all this, Oxylabs is one of our many providers. You can start your free trial here. It comes with 500MB of free bandwidth.
Once you've got your free trial, you can copy and paste the code below to check your proxy connection.
import requests

API_KEY = "your-super-secret-api-key"

# mobile=true asks ScrapeOps for a mobile IP address
proxy_url = f"http://scrapeops.mobile=true:{API_KEY}@residential-proxy.scrapeops.io:8181"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

response = requests.get("https://lumtest.com/myip.json", proxies=proxies, verify=False)
print(response.text)
In the code above, we do the following.
- Our only config variable is our API_KEY.
- We use our API_KEY to build our proxy_url: http://scrapeops.mobile=true:{API_KEY}@residential-proxy.scrapeops.io:8181.
- The flag, mobile=true, tells ScrapeOps that we want a mobile IP address.
- We check our IP information by making a GET: requests.get("https://lumtest.com/myip.json", proxies=proxies, verify=False).
- Finally, we print it to the terminal. You should get an output similar to what you see below. We removed the SSL warning to make the output more legible.
{"country":"VN","asn":{"asnum":45899,"org_name":"VNPT Corp"},"geo":{"city":"Huế","region":"26","region_name":"Thừa Thiên Huế Province","postal_code":"530000","latitude":16.3322,"longitude":107.5864,"tz":"Asia/Bangkok","lum_city":"hue","lum_region":"26"}}
Take a look at the org_name, VNPT Corp. VNPT Corp is a telecommunications provider of both broadband and 5G mobile data. We're getting mobile results when passing in the mobile flag. Our mobile bandwidth comes in at a far better price with a low tier paid plan of $5/GB and a top tier plan of $2/GB.
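Rather than reading the raw JSON by eye, you can pull out the fields that matter. The sketch below parses a trimmed copy of the sample response shown above. The `summarize_ip_info` helper is hypothetical, and it assumes the response keeps the same `country`, `asn.org_name` and `geo.city` keys.

```python
import json
from typing import Dict, Optional

# Trimmed copy of the sample response from the IP check
sample = (
    '{"country":"VN","asn":{"asnum":45899,"org_name":"VNPT Corp"},'
    '"geo":{"city":"Huế","region":"26","tz":"Asia/Bangkok"}}'
)


def summarize_ip_info(raw: str) -> Dict[str, Optional[str]]:
    """Extract the country, carrier name and city from the IP check JSON."""
    data = json.loads(raw)
    return {
        "country": data.get("country"),
        "org_name": data.get("asn", {}).get("org_name"),
        "city": data.get("geo", {}).get("city"),
    }


info = summarize_ip_info(sample)
print(info)
```

In a live script, you would pass `response.text` in place of `sample`. Whether a given org_name belongs to a mobile carrier still needs a manual check.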
Ethical Considerations and Legal Guidelines
Oxylabs is incredibly committed to providing ethically sourced proxies. As we mentioned earlier, they keep an ongoing KYC process for anyone using their products.
The screenshot below actually comes from their landing page. If you're interested in more about their proxy sourcing process, you can get their whitepaper here.
When mobile and residential proxies are sourced, they come from real people using real devices on their real internet connection. Ethical sourcing of residential proxies means that everyone providing bandwidth knows they're providing bandwidth.
Datacenter proxies, by contrast, come from a datacenter. There is no way that such a proxy could come from a user unknowingly running software on their smartphone.
Legal
Don't use a proxy provider to break laws. As you already know, it's illegal. You should consider something else as well: it harms everyone involved. It harms the proxy provider. It eventually harms you too. If you break the law using a proxy, first, it will be traced to the proxy provider. Then, the provider will trace it to your account using either your API key or your username and password.
This creates problems for both you and your proxy service.
- Don't use proxies to access illegal content: This can bring you legal fines or even jail time depending on what you're trying to access.
- Don't scrape and disseminate other people's private data: It varies by jurisdiction, but this is also a highly illegal and dangerous practice. Doxxing private data can lead to heavy fines and possibly jail or prison time.
Ethical
When we scrape the web, legality shouldn't be our only concern. We need to think about right and wrong. When something is legal, it isn't always right. Nobody wants to be in the next headline about ethically questionable data harvesting. This sort of reputational damage can completely destroy a company.
- Social Media Monitoring: Social media stalking can be a very destructive and disrespectful behavior. How would you feel if someone used data collection methods on your account?
- Respect Site Policies: Failure to respect a site's policies can get your account suspended or banned. It can even lead to legal troubles for those of you who sign and violate a terms of service agreement.
Conclusion
Mobile proxies are a great way to access the web when scraping. With Oxylabs Mobile Proxies, you get all the benefits of residential proxies and some additional benefits of having a mobile IP address.
By this point, you should also have a solid grasp of how to implement Oxylabs Mobile Proxies using Python Requests, Scrapy, NodeJS Puppeteer and NodeJS Playwright.
You can view the full documentation for Oxylabs Mobile Proxies here.
More Cool Articles
If you're in the mood to keep reading, we've got a ton of content that you can learn from here at ScrapeOps. Whether you're a seasoned dev or you're brand new to web scraping, we've got something useful for you.
We love scraping so much that we wrote the Python Web Scraping Playbook.
If you want to learn more, take a look at the articles below.