Bright Data Mobile Proxies: Web Scraping Guide
Bright Data is one of our best providers here at ScrapeOps. They offer a variety of mobile, residential, datacenter and managed proxy products. When you choose Bright Data, you get access to top tier proxy solutions at a reasonable price. On top of all this, Bright Data offers some of the best reliability and redundancy in this industry.
Throughout the rest of this article, we'll go through Bright Data's mobile proxies in great detail.
TLDR: How to Integrate Bright Data Mobile Proxy?
Mobile proxies with Bright Data work much like any other proxy integration. You need a `USERNAME`, a `PASSWORD`, and a `ZONE`.
There is one thing you need to look out for when using their mobile proxies: status code 502.
Mobile proxies tend to give us an unreliable connection. To get around this, we need some basic retry logic.
The code below sets up a proxy connection with Bright Data's Mobile Proxies. To handle possible 502 errors, we use `try` and `except` within a `while` loop. This allows us to continually retry our request until we receive a successful response.
```python
import requests
import json

USERNAME = "your-username"
ZONE = "mobile_proxy1"
PASSWORD = "your-password"

proxy_url = f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@brd.superproxy.io:22225"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

success = False

while not success:
    result = requests.get("http://geo.brdtest.com/mygeo.json", proxies=proxies)
    try:
        if result.status_code != 200:
            raise Exception("Failed to find proxies")
        print(result.text)
        success = True
    except Exception as e:
        print(f"Request Failed, {e}, Status: {result.status_code}")
```
- First, we set up our connection variables: `USERNAME`, `ZONE`, and `PASSWORD`.
- We then put them together to create a `proxy_url`: `http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@brd.superproxy.io:22225`.
- We create a `dict` object to hold our `http` and `https` proxies, and assign both of them to our `proxy_url`.
- `while` we have not received a successful response, we attempt to do the following:
  - Send a GET request to the url.
  - Within our `try` block, we check the status of the response. If the status code is not `200`, we throw an error. If we receive a `200`, we print the result.
  - Our `except` block is used to handle any errors. We print an error message and our status code: `Request Failed, {e}, Status: {result.status_code}`.
When using Bright Data's Mobile proxies, you need to have at least some error handling in your code. Mobile networks can be unreliable and we need to take this into account.
The snippet below shows some example output from the code above. As you can see, we receive two failed responses before finally getting a good one.
```
Request Failed, Failed to find proxies, Status: 502
Request Failed, Failed to find proxies, Status: 502
{"country":"FR","asn":{"asnum":51207,"org_name":"Free Mobile SAS"},"geo":{"city":"","region":"IDF","region_name":"Île-de-France","postal_code":"","latitude":48.8611,"longitude":2.3269,"tz":"Europe/Paris","lum_region":"idf"}}
```
Understanding Mobile Proxies
What Are Mobile Proxies?
Mobile proxies are proxy servers that route internet traffic through real mobile devices connected to mobile networks (like 3G, 4G, or 5G), using IP addresses assigned by mobile carriers.
These proxies allow users to mask their real IP address and appear as though their internet activity is originating from a legitimate mobile user.
Why Are Mobile Proxies Important?
Mobile proxies are different from traditional datacenter or residential proxies because they utilize mobile carrier networks, making the IP addresses more difficult to detect and block.
This makes mobile proxies a powerful tool for online activities that require high anonymity, geo-targeting, and security, such as web scraping, ad verification, social media management, and bypassing geo-restrictions.
Datacenter vs Premium
When we talk about proxies, we usually think of Datacenter and Premium proxies. In general, these proxies fall into three separate categories: Datacenter, Premium, and Managed.
- Datacenter proxies are hosted inside a datacenter.
- Premium proxies are either residential or mobile. These proxies are best for sites that practice IP blocking.
- Managed proxies tend to use a combination of datacenter and premium IP addresses.
The ScrapeOps Proxy Aggregator is an example of a managed proxy. It first tries to GET a response using a datacenter proxy. If the proxy fails, it will retry with a premium proxy (with no additional charge!).
First, we'll go over the differences between Datacenter and Premium proxies. Since Premium proxies include both Residential and Mobile proxies, we'll then go over the difference between Residential and Mobile proxies.
Datacenter
Pros
- **Price:** Datacenter proxies are incredibly cheap. If you choose to use their Datacenter proxies, Bright Data only charges $0.60 per GB.
- **Speed:** Datacenter proxies give us unparalleled performance. When you use this type of proxy, it gets hosted in an actual datacenter. Datacenters use some of the best hardware with a fast and stable internet connection.
- **Availability:** Datacenters are huge. They host tons and tons of machines, each with its own IP address. With a stable internet connection and so many machines to use, datacenter proxies give us access to very large IP pools with unmatched reliability.
Cons
- **Blocking:** When you're dealing with more difficult sites, they tend to block datacenter IP addresses. If you don't have a residential or mobile IP, they assume you're up to no good.
- **Less Geotargeting Support:** When using a datacenter proxy, you'll often get support for geotargeting, but it is very limited compared to what you can get when using a premium proxy.
- **Less Anonymity:** Our IP always shows up inside of a datacenter. Because of this, we don't blend in as well with normal traffic.
Premium
Pros
- **Anonymity:** With a Premium proxy, you get a regular IP address on a real device. This makes it relatively easy to blend in.
- **Better Access:** As mentioned earlier, datacenter proxies can often get blocked. Sites generally don't block residential or mobile IPs because this is the traffic they want.
Cons
- **Price:** Premium proxies are far more expensive than their datacenter counterparts. With Bright Data, both their residential and mobile proxies can cost as much as $8.40 per GB. This is more than 10 times the price of their Datacenter products.
- **Speed:** With premium proxies, speed is not as good. When you're connected through a premium proxy, you're getting a regular IP address on a real device. This means that we're limited not only by the compute power (possibly an outdated smartphone), but by the internet connection as well.
Residential proxies are ideal for SERP results, ad verification, social media monitoring/scraping and much more.
Mobile vs Residential
Let's take a look at the two types of Premium Proxies: Residential and Mobile. Just like we looked at pros and cons above, we'll do that here as well. While they carry many of the same benefits and often get lumped into the same group (Premium Proxies), there are some important differences you should take note of.
Residential
Pros
- **Authenticity:** Residential proxies give you a real IP address. This makes you far more difficult to detect.
- **Geotargeting:** When using residential proxies, you often get much finer control over your location. Sometimes, you can even narrow down your location to a specific city.
Cons
- **Relatively Static:** ISPs (internet service providers) do sometimes rotate our IP addresses. However, they don't do this often. Whether you keep your IP for a week, a month or even a year, this can often make an IP ban difficult to get past... especially if you're using a dedicated proxy.
- **Sometimes More Detectable:** This is an incredibly uncommon practice, but there are some sites that look for a mobile-only environment. They do this by checking both your IP address and user-agent string.
Mobile
Pros
- **Rotating IP Addresses:** Phones get new IP addresses all the time. Phones move around a lot, which causes them to change connections constantly. Whenever your cell phone connects to a different tower, you get a new IP address. If you walk from one store into another, you often get a new IP address.
- **Carrier Network Address Translation:** This method is a bit different than the IP rotation mentioned above. With this feature, many devices share the same network IP address. This gives you a high-anonymity pool because there are loads of other phones all using your same IP address.
- **Social Media/Mobile-First Platforms:** Nowadays, many people rarely use a computer; their smartphone provides everything they need internet-wise. Because of this, many sites (social media in particular) have adopted a mobile-first approach. Mobile proxies make us look far more legitimate on these sites.
Cons
- **Reliability:** As mentioned earlier, mobile phones change IP addresses and networks all the time. This leads to reliability issues. As you might have noticed in our TLDR section, it took us several retries to even get a good response through the proxy.
- **Price:** While Bright Data charges $8.40/GB for both its Mobile and Residential products, some providers charge more for mobile. As we've already talked about, we need to make multiple requests. Depending on your provider, this can eat up your bandwidth, which leads to higher cost.
Why Use Bright Data Mobile Proxies?
Bright Data gives us access to mobile proxies at a pretty reasonable price. When you pay their $8.40/GB, you are not locked into a monthly commitment, and you have a very low barrier to entry.
In comparison, depending on your product, some providers will leave you with a $50 barrier to entry!
Bright Data's Mobile Proxies give us access to pretty much every site on the web. On top of that we can often get a good mobile experience when scraping mobile-first sites.
Bright Data Mobile Proxy Pricing
Their Mobile Proxy pricing is pretty straightforward. Bright Data offers four different plans:
- Pay As You Go,
- 69GB,
- 158GB, and
- 339GB.
On the Pay As You Go plan, you need to deposit money into your account, but you are only charged based on the bandwidth you use. If you only use $0.01 worth of bandwidth, at the end of the month, $0.01 will be deducted from your account balance.
Take a look at the screenshot below from their Invoices section. As you can see below, I was charged $0.07 for September and $0.01 for my usage during October. Bright Data uses very fair and transparent pricing.
If you'd like to know more about their plans in general, take a look at the table below.
As mentioned, their cheapest plan is Pay as You Go, and their top tier plan is the 339GB plan coming in at a whopping $1999 per month. This pricing structure allows them to meet your needs regardless of what they are.
If you're just interested in exploring scraping, or running a small daily scrape, Pay As You Go is definitely the plan for you. If you need a lot of data, go ahead and select one of their high tier plans.
| Plan | Monthly Price | Price per GB |
|------|---------------|--------------|
| Pay As You Go | N/A | $8.40 |
| 69GB | $499 + tax | $7.14 |
| 158GB | $999 + tax | $6.30 |
| 339GB | $1999 + tax | $5.88 |
Generally speaking, when proxy providers offer plans around $2-3 per GB, they are considered cheap. If they offer smaller plans in the $6-8 per GB range, they are more expensive.
BrightData's pricing is on the higher side for smaller plans, starting at $7.14/GB, which is typically considered expensive.
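As a quick sanity check on those numbers, the Pay As You Go rate of $8.40/GB implies a break-even point for each committed plan. The sketch below is our own arithmetic based on the table above, not an official Bright Data calculator:

```python
# Rough break-even calculator using the prices from the table above.
# This is our own arithmetic, not a Bright Data tool.
PAYG_PER_GB = 8.40  # Pay As You Go rate, $/GB

def break_even_gb(plan_price: float) -> float:
    """GB of monthly usage above which a committed plan beats Pay As You Go."""
    return plan_price / PAYG_PER_GB

for plan, price in [("69GB", 499), ("158GB", 999), ("339GB", 1999)]:
    print(f"{plan} plan pays off above ~{break_even_gb(price):.1f} GB/month")
```

In other words, the 69GB plan only makes sense if you expect to burn roughly 60GB or more per month; below that, Pay As You Go is cheaper.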
We also built a tool where you can compare Bright Data with other proxy providers. This helps you shop around and ensure that you're getting the best value.
You can use our comparison tool here. This tool is built for anybody looking to shop for a proxy provider.
Setting Up Bright Data Mobile Proxies
Here, let's go through the process of getting Bright Data Mobile Proxies. We'll start by signing up. Then, we'll set up a zone and add it to our plan. Finally, we'll test out our new proxy.
- To start, head on over to https://brightdata.com. You can create an account using a username and password, or you can simply sign up using your Google account.
- Next, once you're logged in, click on the Add button and select Mobile proxies.
- Now, you should be taken to a screen where you can configure your mobile proxies. In the shot below, we make a new zone called `mobile_proxy1` and keep the default settings. Once you're finished, click on the Add button.
- If you look at our final screenshot below, you'll see a basic request builder and some of our basic proxy information. There is also a toggle switch to turn it on and off. In the example below, it is switched off. Make sure you toggle your proxy on.
- For convenience, here's a closer look at the cURL command in the screenshot above. If you choose to run it, make sure to replace `YOUR_USERNAME` with your actual username and `YOUR_PASSWORD` with your actual password.

```
curl -i --proxy brd.superproxy.io:22225 --proxy-user brd-customer-YOUR_USERNAME-zone-mobile_proxy1:YOUR_PASSWORD -k "https://geo.brdtest.com/welcome.txt"
```
- If you run the cURL command listed above, you'll get a response similar to this. As you can see below, we get all sorts of information about our proxy and some basic usage examples as well. They also give us information on how to receive a JSON response and a link to their docs.
```
Welcome to Bright Data! Here are your proxy details
Country: ZA
City: Johannesburg
Region: GP
Postal Code: 2041
Latitude: -26.2309
Longitude: 28.0583
Timezone: Africa/Johannesburg
ASN number: 37168
ASN Organization name: CELL-C

Common usage examples:
[USERNAME]-country-us:[PASSWORD] // US based Proxy
[USERNAME]-country-us-state-ny:[PASSWORD] // US proxy from NY
[USERNAME]-asn-56386:[PASSWORD] // proxy from ASN 56386
[USERNAME]-ip-1.1.1.1.1:[PASSWORD] // proxy from dedicated pool

To get a simple JSON response, use https://geo.brdtest.com/mygeo.json .
More examples on https://docs.brightdata.com/api-reference/introduction
```
Authentication
When using any of Bright Data's proxies (mobile, residential or datacenter), we have the option to authenticate with our username and password, or to also whitelist an IP address.
The rest of this tutorial will use username/password authentication, but we'll walk you through the process of whitelisting an IP as well. If your IP has been whitelisted, you won't actually need to authenticate.
First, click on Whitelisted IPs.
Next, you'll need to click Add allowed IPs.
Afterward, simply enter the IP address you wish to whitelist.
To authenticate via username and password, we actually need to include these directly in our URL. You might have noticed this above in the setup section. Our authentication is laid out like this:

```
brd-customer-YOUR_USERNAME-zone-mobile_proxy1:YOUR_PASSWORD
```

When we stick this inside a fully formatted URL, it looks like this:

```
http://brd-customer-YOUR_USERNAME-zone-mobile_proxy1:YOUR_PASSWORD@brd.superproxy.io:22225
```
- Our username is always laid out as follows: `brd-customer-YOUR_USERNAME`.
- We first put a flag for our username, `brd-customer`, and then attach our username to it: `-YOUR_USERNAME`.
- We also have a `zone` flag followed by the name of the zone we created during our initial setup.
- `:` is then followed by our password.
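To tie the pieces together, here's a small helper of our own (not part of any Bright Data SDK) that assembles the full proxy URL from the components described above. The optional `country` argument adds the `-country-{cc}` geotargeting flag covered later in this article:

```python
# Hypothetical helper that builds a Bright Data proxy URL from its parts.
def build_proxy_url(username: str, zone: str, password: str,
                    country: str = None,
                    host: str = "brd.superproxy.io", port: int = 22225) -> str:
    user = f"brd-customer-{username}-zone-{zone}"
    if country:
        user += f"-country-{country}"  # optional geotargeting flag
    return f"http://{user}:{password}@{host}:{port}"

print(build_proxy_url("your-username", "mobile_proxy1", "your-password"))
```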
Basic Request Using Bright Data Mobile Proxies
As mentioned above, we authenticate with our username and password. Time to learn how to make a basic request through this proxy using Python. We'll make our requests to `https://geo.brdtest.com/mygeo.json`. This will also yield basic information about our proxy.
```python
import requests
import random

USERNAME = "your-username"
ZONE = "mobile_proxy1"
PASSWORD = "your-password"

proxy_url = f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@brd.superproxy.io:22225"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

result = requests.get("http://geo.brdtest.com/mygeo.json", proxies=proxies)
print(result.text)
```
If you decide to run the code above, you'll get a response similar to this.
```
{"country":"IT","asn":{"asnum":30722,"org_name":"Vodafone Italia S.p.A."},"geo":{"city":"Rome","region":"62","region_name":"Lazio","postal_code":"00125","latitude":41.8904,"longitude":12.5126,"tz":"Europe/Rome","lum_city":"rome","lum_region":"62"}}
```
As you can see in the JSON above, our `country` is IT, or Italy. We are located in the `city` of Rome. Our `postal_code` is 00125.
Along with these very important bits of information, we get information about our service provider, timezone, and much more.
Country Geotargeting
One of the really nice things we get with Bright Data is country geotargeting. When we use geotargeting, we can select the country our proxy is located in. This allows us to target specific information when scraping.
If you're scraping an online store, you don't want page 1 of your scrape to be for the US and page 2 to be set up for China. When we can consistently choose our location, we can retrieve our data consistently as well.
NOTE: Mobile proxies can often be inconsistent. When using geotargeting with mobile proxies, always make sure to print the status code of your response. If you receive a code 502, Bright Data was likely unable to find a proxy in your country of choice. You can view more information about that in their docs here.
If you are consistently receiving 502, add retry logic, change your country or remove the country flag altogether.
You can also add some basic error handling. In the code example below, we use the same basic logic, but we tell Python to retry the request until it's successful.
```python
import requests
import json

USERNAME = "your-username"
ZONE = "mobile_proxy1"
PASSWORD = "your-password"
COUNTRY = "us"

proxy_url = f"http://brd-customer-{USERNAME}-zone-{ZONE}-country-{COUNTRY}:{PASSWORD}@brd.superproxy.io:22225"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

success = False

while not success:
    result = requests.get("http://geo.brdtest.com/mygeo.json", proxies=proxies)
    try:
        if result.status_code != 200:
            raise Exception("Failed to find proxies")
        print(result.text)
        success = True
    except Exception as e:
        print(f"Request Failed, {e}, Status: {result.status_code}")
```
After running this code, this is the output we received. As you can see below, we receive 5 bad responses before getting a proper response for our proxy connection.
```
Request Failed, Failed to find proxies, Status: 502
Request Failed, Failed to find proxies, Status: 502
Request Failed, Failed to find proxies, Status: 502
Request Failed, Failed to find proxies, Status: 502
Request Failed, Failed to find proxies, Status: 502
{"country":"US","asn":{"asnum":7018,"org_name":"ATT-INTERNET4"},"geo":{"city":"","region":"WA","region_name":"Washington","postal_code":"","latitude":47.6034,"longitude":-122.3414,"tz":"America/Los_Angeles","lum_region":"wa"}}
```
When we use this type of error handling, our scraper will continue until it makes a successful request through the proxy.
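One caveat with the loop above: if no proxy in your chosen country ever becomes available, it will spin forever. A bounded variant (our own sketch, using a hypothetical `fetch_with_retries` helper) caps the attempts and fails loudly instead:

```python
import requests

def fetch_with_retries(url: str, proxies: dict, max_retries: int = 5):
    """Retry a proxied GET until we see a 200, up to max_retries attempts."""
    last_status = None
    for attempt in range(1, max_retries + 1):
        try:
            result = requests.get(url, proxies=proxies, timeout=30)
            last_status = result.status_code
            if result.status_code == 200:
                return result
            print(f"Attempt {attempt} failed, Status: {last_status}")
        except requests.RequestException as e:
            print(f"Attempt {attempt} errored: {e}")
    raise RuntimeError(f"All {max_retries} attempts failed, last status: {last_status}")
```

If you're consistently hitting the retry cap, that's your cue to change the country flag or drop it altogether, as mentioned above.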
City Geotargeting
City geotargeting allows users to route their internet traffic through IP addresses associated with specific cities. It enables access to localized content, ads, or services within a particular city, simulating a user’s presence in that urban location.
For city geotargeting, you need to use a Dedicated Proxy, not a Shared one. For this, you need to go to the Configure section for your mobile proxy zone (in our case, `mobile_proxy1`). When you use a dedicated proxy, you pay per IP address rather than for bandwidth.
Since this feature involves purchasing a separate plan, we won't have any code examples. However, you can learn more about this feature here. For city level geotargeting, you need to purchase Bright Data's Dedicated Mobile Proxies.
City level geotargeting gives us access to hyper localized content. When you're dealing with local content, you can extract the following types of data at a local level. This allows you to collect and manage your precious data at a much more granular level.
- Local Ads
- Local Businesses
- Local Social Media
- Local Events
Error Codes
In web development, error codes are some of our most important problem solving tools. Most of us already know that a status code of 200 indicates a successful request. Other codes can be quite a bit more tricky.
When you receive an HTTP error, you need to look up the error code and solve it accordingly. We already went over status 502; Bright Data lists some additional error codes here.
| Status Code | Meaning | Troubleshooting |
|-------------|---------|-----------------|
| 200 | Success | Everything works as expected. |
| 400 | Bad Request | Your account is likely suspended; check your balance. |
| 401 | Auth Failed | Your IP address has been blacklisted. |
| 403 | No Protocol/Destination Host | Add HTTP/HTTPS to your request or check your url. |
| 407 | Proxy Authentication Required | Double-check your proxy credentials; they're wrong or missing. |
| 429 | Too Many Requests | Slow down your requests. |
| 502 | Bad Gateway | Invalid response from the target server. |
Status codes are imperative. When you encounter an error, you need to look up the status code and troubleshoot accordingly.
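If you want your scraper to log something more actionable than a bare number, you can turn that table into a lookup. The mapping below is our own paraphrase of the table above, not an official Bright Data error list:

```python
# Advice strings paraphrased from the error-code table above.
ADVICE = {
    400: "Bad Request: account may be suspended; check your balance.",
    401: "Auth Failed: your IP address may be blacklisted.",
    403: "No Protocol/Destination Host: add http(s):// or check your url.",
    407: "Proxy Authentication Required: double-check your proxy credentials.",
    429: "Too Many Requests: slow down your requests.",
    502: "Bad Gateway: invalid response from the target; retry or change country.",
}

def explain(status_code: int) -> str:
    """Map a status code to a troubleshooting hint."""
    return ADVICE.get(status_code, f"Unhandled status {status_code}: check the docs.")
```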
KYC Verification
Bright Data runs an incredibly stringent KYC (Know Your Customer) policy. Almost all mobile and residential providers use an ongoing KYC process. This entails monitoring your usage and ensuring that you're not up to anything nefarious. If there is suspicious activity coming from your account, you will likely be banned or suspended.
Bright Data takes this a step further. When you use Bright Data's mobile and residential proxies, for full access to the product, you actually need to get a video call with one of their representatives.
While it might seem like a bit of a pain, this process is important. It allows Bright Data to ensure that anyone providing mobile or residential bandwidth is treated fairly and that the quality of their proxies is maintained.
Implementing Bright Data Mobile Proxies in Web Scraping
Now that we're familiar with Bright Data's Mobile Proxies, let's go through and implement them using some different frameworks.
In Python, we'll show some examples for Requests, SeleniumWire (DEPRECATED), and Scrapy. Next, we'll go through how to implement them in NodeJS with Puppeteer and Playwright.
By the end of this section, you'll be able to connect to Bright Data's Mobile Proxies easily.
Python Requests
Since we've used Python Requests since the beginning of this article, that's where we'll start. This is the example from our TLDR section.
We setup our basic proxy and then use retry logic to make sure that we can get our response.
```python
import requests
import json

USERNAME = "your-username"
ZONE = "mobile_proxy1"
PASSWORD = "your-password"

proxy_url = f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@brd.superproxy.io:22225"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

success = False

while not success:
    result = requests.get("http://geo.brdtest.com/mygeo.json", proxies=proxies)
    try:
        if result.status_code != 200:
            raise Exception("Failed to find proxies")
        print(result.text)
        success = True
    except Exception as e:
        print(f"Request Failed, {e}, Status: {result.status_code}")
```
- Like before, we set up our configuration variables.
- We combine all these variables into our `proxy_url`: `http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@brd.superproxy.io:22225`.
- We then create a `dict` object that holds both our `http` and `https` proxies.
- When making our requests, we make sure to pass `proxies=proxies`. This tells Python Requests to use the `dict` object we created for our proxy settings.
- Since our connection is prone to 502 errors, we use a `while` loop with some basic retry logic to keep trying our request until we get a proper response.
Python Selenium
For years, SeleniumWire has been the default option when you want to use proxies with Selenium.
If you didn't already know, Vanilla Selenium does not support authenticated proxies. Sadly, SeleniumWire has been deprecated. It hasn't received any code updates in two years and will likely never receive them again. However, it is still technically possible to integrate Bright Data Mobile Proxies via SeleniumWire, but we strongly advise against it.
When you decide to use SeleniumWire, you are vulnerable to the following risks:
- **Security:** Browsers are updated with security patches regularly. Without these patches, your browser will have security holes that have already been fixed in up-to-date browsers and drivers such as ChromeDriver or GeckoDriver.
- **Dependency Issues:** SeleniumWire is no longer maintained. In time, it may not be able to keep up with its dependencies as they get updated. Broken dependencies can be a source of unending headache for anyone in software development.
- **Compatibility:** As the web itself gets updated, SeleniumWire doesn't. Regular browsers are updated all the time. Since SeleniumWire no longer receives updates, you may experience broken functionality and unexpected behavior.
As time goes on, the probability of all these problems increases. If you understand the risks but still wish to use SeleniumWire, you can view a guide on that here.
Depending on your time of reading, the code example below may or may not work. As mentioned above, we strongly recommend against using SeleniumWire because of its deprecation, but if you decide to do so anyway, here you go. We are not responsible for any damage that this may cause to your machine or your privacy. As well, the example below does not contain any retry logic for bad responses. If you wish to do so, you can add this in yourself.
```python
from seleniumwire import webdriver

USERNAME = "your-username"
ZONE = "mobile_proxy1"
PASSWORD = "your-password"

proxy_url = f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@brd.superproxy.io:22225"

proxy_options = {
    "proxy": {
        "http": proxy_url,
        "https": proxy_url,
        "no_proxy": "localhost,127.0.0.1"
    }
}

driver = webdriver.Chrome(seleniumwire_options=proxy_options)

driver.get('https://httpbin.org/ip')
```
- We build our `proxy_url` exactly the way we have throughout this article.
- We assign this url to both the `http` and `https` protocols of our proxy settings.
- `driver = webdriver.Chrome(seleniumwire_options=proxy_options)` tells `webdriver` to open Chrome with our custom `seleniumwire_options`.
Python Scrapy
In the example below, we'll setup a new Scrapy project using our mobile proxy. Scrapy already has some builtin retry logic, so we don't need to worry about writing any ourselves.
To start, we need to make a new Scrapy project.
```
scrapy startproject mobile
```
Then, from within your new Scrapy project, create a new Python file inside the spiders folder with the following code.
```python
import scrapy

USERNAME = "your-username"
ZONE = "mobile_proxy1"
PASSWORD = "your-password"

proxy_url = f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@brd.superproxy.io:22225"

class ExampleSpider(scrapy.Spider):
    name = "mobile_proxy"

    def start_requests(self):
        request = scrapy.Request(url="https://httpbin.org/ip", callback=self.parse)
        request.meta['proxy'] = proxy_url
        yield request

    def parse(self, response):
        print(response.body)
```
You can run this spider with the following command.
```
scrapy crawl mobile_proxy
```
- First, we create our configuration variables.
- Once again, we create the same `proxy_url`.
- From inside `start_requests`, we assign our `proxy_url` to `request.meta['proxy']`. This tells Scrapy that all of this spider's requests are to be made through the `proxy_url` we created earlier.
- Scrapy has builtin retry logic, so we don't need to code any ourselves.
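If the default retry behavior isn't aggressive enough for a flaky mobile connection, Scrapy's built-in `RetryMiddleware` can be tuned through settings. The values below are illustrative; the setting names (`RETRY_ENABLED`, `RETRY_TIMES`, `RETRY_HTTP_CODES`) are real Scrapy settings you can place in `settings.py` or in a spider's `custom_settings`:

```python
# Illustrative retry tuning for a spider running through a mobile proxy.
custom_settings = {
    "RETRY_ENABLED": True,
    "RETRY_TIMES": 5,                     # retry each request up to 5 extra times
    "RETRY_HTTP_CODES": [502, 503, 429],  # include the 502s mobile proxies emit
}
```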
NodeJS Puppeteer
Connecting to these proxies with Puppeteer is pretty simple as well. We still use our configuration variables, but we use them in a slightly different way.
Create a new folder, `cd` into it, and create a new JavaScript project.

```
cd puppeteer-mobile
npm init --y
```
Next, install Puppeteer with `npm install puppeteer`.
Then, create a new JavaScript file and copy and paste the code below into it.
```javascript
const puppeteer = require("puppeteer");

const USERNAME = "your-username";
const ZONE = "mobile_proxy1";
const PASSWORD = "your-password";

const FULL_USER = `brd-customer-${USERNAME}-zone-${ZONE}`;

(async () => {
    var success = false;
    const browser = await puppeteer.launch({
        args: ["--proxy-server=http://brd.superproxy.io:22225"]
    });
    while (!success) {
        try {
            const page = await browser.newPage();

            await page.authenticate({
                username: FULL_USER,
                password: PASSWORD
            });

            var badResponse = false;

            page.on("response", async (response) => {
                const status = response.status();
                if (status !== 200) {
                    badResponse = true;
                    console.log(`Failed to get the page, Status Code: ${status}`);
                }
            });

            await page.goto('http://lumtest.com/myip.json');

            if (badResponse) {
                throw new Error("bad response");
            }

            await page.screenshot({path: 'puppeteer.png'});
            success = true;
            console.log("Screenshot successful");
        } catch {
            console.log("Failed to connect to mobile proxy, retrying");
        }
    }

    await browser.close();
})();
```
- First, we create our config variables: `USERNAME`, `ZONE`, and `PASSWORD`. All of these get declared as constants.
- Next, we create a `FULL_USER`, which combines our `USERNAME` string with our `ZONE`.
- We add our proxy url to our launch args: `args: ["--proxy-server=http://brd.superproxy.io:22225"]`.
- We pass the following credentials into `page.authenticate()`:
  - `username: FULL_USER`
  - `password: PASSWORD`
- We use `page.on()` to analyze our response:
  - If our `response.status()` is not `200`, we set `badResponse` to `true`.
  - This `page.on()` logic executes whenever we GET a page. If `badResponse` is `true`, we throw an error.
Puppeteer offers some really strong proxy support out of the box. We combine our `USERNAME` and `ZONE` to create our `FULL_USER`. We then use both our `FULL_USER` and `PASSWORD` to authenticate the proxy.
The screenshot below came from running the Puppeteer code above.
Just like with our Requests example earlier, we receive some error codes and our error handling kicks in. Take a look at our console output. We wind up with 5 failed attempts before we get an actual response.
```
Failed to get the page, Status Code: 307
Failed to get the page, Status Code: 307
Failed to get the page, Status Code: 502
Failed to connect to mobile proxy, retrying
Failed to get the page, Status Code: 502
Failed to connect to mobile proxy, retrying
Failed to get the page, Status Code: 502
Failed to connect to mobile proxy, retrying
Screenshot successful
```
NodeJS Playwright
Playwright integration is going to be very similar to Puppeteer. The two actually share a common origin in Chrome's DevTools Protocol. Our concepts and syntax here are going to be almost identical. Follow the steps below to get up and running.
Create a new project folder, `cd` into it, and initialize a JavaScript project.

```
cd playwright-mobile
npm init --y
```
Install Playwright.

```
npm install playwright
npx playwright install
```
Next, copy and paste the code below into a new JavaScript file.
```javascript
const playwright = require("playwright");

const USERNAME = "your-username";
const ZONE = "mobile_proxy1";
const PASSWORD = "your-password";

const FULL_USER = `brd-customer-${USERNAME}-zone-${ZONE}`;

const options = {
    proxy: {
        server: "http://brd.superproxy.io:22225",
        username: FULL_USER,
        password: PASSWORD
    }
};

(async () => {
    const browser = await playwright.chromium.launch(options);
    var success = false;
    while (!success) {
        const page = await browser.newPage();
        try {
            const response = await page.goto('http://lumtest.com/myip.json');
            if (response.status() !== 200) {
                throw new Error(`Failed response, Status Code: ${response.status()}`);
            }
            await page.screenshot({ path: "playwright.png" });
            console.log("screenshot successful");
            success = true;
        } catch (error) {
            console.log("ERROR:", error);
        } finally {
            await page.close();
        }
    }
    await browser.close();
})();
```
- Like our Puppeteer example, we first set up our configuration variables.
- We create a `proxy` object with the following fields:
  - `server: "http://brd.superproxy.io:22225"`
  - `username: FULL_USER`
  - `password: PASSWORD`
- Our retry logic is quite a bit easier to implement with Playwright. With Playwright, `page.goto()` returns a `response` object. This allows us to check the status code in a much easier way (like we did with Requests).
Like Puppeteer, Playwright gives great support for authenticated proxies. However, our retry logic is much easier to implement thanks to Playwright's `response` object. You can view an example response in the screenshot below.
Case Study: Scrape Weather.com
Let's perform a little case study. Here, we're going to scrape a small amount of metadata from weather.com. We'll just scrape the description. We'll scrape this description using both a Portuguese location and a US based location. We'll count the number of tries it takes as well.
This case study should shed light on a couple of things.
- First, we should get different content based on our location.
- As well, we're going to count our tries. This will allow us to benchmark the proxy quality in each country.
Take a look at the code below.
- We start by disabling some warnings to prevent SSL errors from clogging up the console. If you wish, you can also manually download and install Bright Data's SSL certificate here.
- In the code below, we setup our first proxy connection in Portugal.
- Then, we scrape the site description.
- We count the tries and then print both the description and tries taken to the terminal.
- Afterward, we perform the same exercise using a US based proxy. This allows us to compare the quality of both proxies.
```python
import requests
import urllib3
from bs4 import BeautifulSoup

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

USERNAME = "your-username"
ZONE = "mobile_proxy1"
PASSWORD = "your-password"
COUNTRY = "pt"

proxy_url = f"http://brd-customer-{USERNAME}-zone-{ZONE}-country-{COUNTRY}:{PASSWORD}@brd.superproxy.io:33335"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36"
}

print("----------------------PT-------------------")

success = False
tries = 1
while not success:
    response = requests.get("https://weather.com", proxies=proxies, headers=headers, verify=False)
    if response.status_code != 200:
        print(f"Failed to get site, Status Code: {response.status_code}")
        tries += 1
        continue

    soup = BeautifulSoup(response.text, "html.parser")
    description_tag = soup.select_one("meta[name='description']")
    description = description_tag.get("content")

    print("Description:", description)
    print("Total tries:", tries)
    success = True

print("----------------------US---------------------")

COUNTRY = "us"

proxy_url = f"http://brd-customer-{USERNAME}-zone-{ZONE}-country-{COUNTRY}:{PASSWORD}@brd.superproxy.io:33335"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

success = False
tries = 1
while not success:
    response = requests.get("https://weather.com", proxies=proxies, headers=headers, verify=False)
    if response.status_code != 200:
        print(f"Failed to get site, Status Code: {response.status_code}")
        tries += 1
        continue

    soup = BeautifulSoup(response.text, "html.parser")
    description_tag = soup.select_one("meta[name='description']")
    description = description_tag.get("content")

    print("Description:", description)
    print("Total tries", tries)
    success = True
```
Here are some key things you should notice in this code:
- `urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)` is used to disable any SSL warnings we might receive.
- After setting up each proxy connection, we do the following:
  - Set both our `success` and `tries` variables.
  - While the operation hasn't succeeded, we keep trying and increment `tries` for each unsuccessful attempt.
  - Once a try is successful, we find the page description with `soup.select_one("meta[name='description']")`.
  - Finally, we print the `tries` and `description` to the terminal.
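The extraction step described above can be sketched as a small helper. Note that the case study code assumes the meta tag always exists; this hedged version (the `get_description` name is our own) returns `None` when a page has no description tag instead of raising an `AttributeError`:

```python
from bs4 import BeautifulSoup

# Hypothetical helper mirroring the extraction step above, with a guard
# for pages that have no meta description tag.
def get_description(html):
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.select_one("meta[name='description']")
    # tag.get("content") would fail on a page with no description tag,
    # so return None in that case instead.
    return tag.get("content") if tag else None
```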
If you run this code, you should receive something similar to this.
----------------------PT-------------------
Description: Previsão e condições meteorológicas e radar Doppler de hoje e de hoje à noite para Socorro, Lisboa por The Weather Channel e Weather.com
Total tries: 1
----------------------US---------------------
Failed to get site, Status Code: 502
Failed to get site, Status Code: 502
Description: The Weather Channel and weather.com provide a national and local weather forecast for cities, as well as weather radar, report and hurricane coverage
Total tries 3
Take a look at our site descriptions for each proxy.
- Portugal: Previsão e condições meteorológicas e radar Doppler de hoje e de hoje à noite para Socorro, Lisboa por The Weather Channel e Weather.com.
- United States: The Weather Channel and weather.com provide a national and local weather forecast for cities, as well as weather radar, report and hurricane coverage.
Now, take a look at our total tries for each proxy.
- Portugal: 1
- United States: 3
To summarize our results:
- When using a Portuguese proxy:
- Our site comes written in Portuguese.
- Our Portuguese proxy was able to get a proper response on the first try.
- With our US based proxy:
- Our site comes written in English.
- The US based proxy took three trials to get a proper response.
The reliability of your proxy changes based on your location.
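For reference, the country-targeted connection string used in this case study can be sketched as a small builder function. This is just an illustration of the URL format shown above (the function name is our own, and port 33335 matches the case study code; the earlier TLDR example used port 22225):

```python
# Hypothetical helper illustrating the Bright Data connection string used
# in this case study. The country flag is optional.
def build_proxy_url(username, zone, password, country=None):
    user = f"brd-customer-{username}-zone-{zone}"
    if country:
        # Appending -country-<code> requests an IP in that country.
        user += f"-country-{country}"
    return f"http://{user}:{password}@brd.superproxy.io:33335"

# Example: a Portuguese mobile connection, as in the case study.
proxy_url = build_proxy_url("your-username", "mobile_proxy1", "your-password", country="pt")
```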
Alternative: ScrapeOps Residential Proxy Aggregator
Mobile proxies from Bright Data are a good deal, but our residential proxies are a pretty good deal too. At ScrapeOps, we offer a Residential Proxy Aggregator that also gives you access to mobile proxies. With it, you get the same anonymity and anti-bot resistance that you get from Bright Data's Mobile Proxies.
ScrapeOps Residential Proxy Aggregator provides access to the top 20 residential proxy providers through a single port, ensuring a high success rate for web scraping tasks. It automatically switches proxies to avoid blocks, optimizing performance and cost with flexible pricing plans.
- Access to the top 20 residential proxy providers, including Smartproxy, Bright Data, and Oxylabs.
- 98% success rate due to automatic proxy switching.
- Bypasses anti-bot measures and avoids blocks.
- Optimizes performance and cost by monitoring proxy performance and pricing.
- Flexible pricing plans starting at $15 per month, with up to $999 for higher usage.
- 500 MB of free bandwidth credits to start.
Our plans range in price from free (yes... FREE) all the way to $999 per month. We have 8 different choices available.
The table below outlines our pricing.
| Monthly Price | Bandwidth | Price Per GB |
| --- | --- | --- |
| Free | 100MB | $0 |
| $15 | 3GB | $5 |
| $45 | 10GB | $4.50 |
| $99 | 25GB | $3.96 |
| $149 | 50GB | $2.98 |
| $249 | 100GB | $2.49 |
| $449 | 200GB | $2.25 |
| $999 | 500GB | $2 |
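As a quick sanity check, the per-GB column in the table above is simply the monthly price divided by the plan's bandwidth, rounded to cents (so a row may differ from the listed figure by a cent):

```python
# Rough sanity check of the Price Per GB column: monthly price divided
# by bandwidth, rounded to cents.
def price_per_gb(monthly_price, gb):
    return round(monthly_price / gb, 2)

print(price_per_gb(15, 3))   # 5.0
print(price_per_gb(99, 25))  # 3.96
```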
As we've mentioned earlier, Bright Data is one of our many providers! When you sign up for ScrapeOps, you get access to not only Bright Data, but a ton more proxy providers.
Go ahead and start your free trial here.
Once you've got your free trial, you can copy and paste the code below to check your proxy connection.
```python
import requests

API_KEY = "your-super-secret-api-key"

proxy_url = f"http://scrapeops.mobile=true:{API_KEY}@residential-proxy.scrapeops.io:8181"

proxies = {
    "http": proxy_url,
    "https": proxy_url
}

response = requests.get("https://lumtest.com/myip.json", proxies=proxies, verify=False)
print(response.text)
```
In the code above, we do the following.
- Create our configuration variable: `API_KEY`.
- We then build our `proxy_url`: `http://scrapeops.mobile=true:{API_KEY}@residential-proxy.scrapeops.io:8181`.
- The flag, `mobile=true`, tells ScrapeOps that we want a mobile IP address.
- We check our IP information by making a GET request: `requests.get("https://lumtest.com/myip.json", proxies=proxies, verify=False)`.
- Finally, we print it to the terminal. You should get an output similar to what you see below. We removed the SSL warning to make the output more legible.
{"country":"VN","asn":{"asnum":7552,"org_name":"Viettel Group"},"geo":{"city":"Ho Chi Minh City","region":"SG","region_name":"Ho Chi Minh","postal_code":"700000","latitude":10.8217,"longitude":106.6254,"tz":"Asia/Ho_Chi_Minh","lum_city":"hochiminhcity","lum_region":"sg"}}
Take a look at the `org_name`, `Viettel Group`. Viettel Group is a 4G and 5G mobile provider. As you can see, we're getting mobile results with no retry logic required.
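If you'd rather check the carrier programmatically than by eye, a minimal sketch (using a trimmed copy of the payload above) might look like:

```python
import json

# Parse a trimmed copy of the myip.json payload shown above and read the
# ASN org_name to confirm we were handed a mobile carrier IP.
payload = '{"country":"VN","asn":{"asnum":7552,"org_name":"Viettel Group"}}'
data = json.loads(payload)
print("Carrier:", data["asn"]["org_name"])  # prints "Carrier: Viettel Group"
```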
In our initial testing, even though the product is still in beta, our Residential Proxy Aggregator gives us a more consistent mobile connection than Bright Data's Mobile Proxies.
On top of that, we provide bandwidth at a far better price with a low tier paid plan of $5/GB and a top tier plan of $2/GB.
Ethical Considerations and Legal Guidelines
As far as ethical providers go, Bright Data is probably the most committed one in the industry. They have one of the toughest KYC processes around.
On top of that, anyone providing proxy bandwidth is doing so knowingly. You can read more about Bright Data's commitment to ethical proxies here.
When mobile and residential proxies are sourced, they come from real people using real devices on their real internet connection. Ethical sourcing of residential proxies means that everyone providing bandwidth knows they're providing bandwidth.
Datacenter proxies, by contrast, come from a datacenter, so there is no way our proxy could come from a user unknowingly running software on their smartphone.
Legal
It's a bad idea to break the law using any proxy provider. Beyond the obvious consequences, here's something you might not have considered: it harms everyone involved. It harms the proxy provider, and it eventually harms you too.
If you break the law with a proxy, first, it will be traced to the proxy provider. Then, the provider will trace it to your account using either your API key or your username and password.
This creates problems for both you and your proxy service.
- Don't use proxies to access illegal content: This can bring you legal fines or even jail time.
- Don't scrape and disseminate other people's private data: Depending on your jurisdiction, this is also a highly illegal and dangerous practice. Doxxing private data can lead to heavy fines and possibly jail/prison time.
Ethical
When scraping, we don't need to worry just about what's legal and illegal. We need to think about right and wrong. When something is legal, it's not necessarily right. Nobody wants to be in the next headline about ethically questionable data collection. Things like this can destroy your company and eventually your reputation.
-
Social Media Monitoring: Social media stalking can be a very destructive and disrespectful behavior. How would you feel if someone used data collection methods on your account?
-
Respect Site Policies: Failure to respect a site's policies can get your account suspended/banned. It can even lead to legal troubles for those of you who sign and violate a terms of service agreement.
Conclusion
Mobile proxies give us a very unique way to access the web with great anti-bot resistance. At this point, you should understand that datacenter proxies are an excellent value, but sometimes they just don't get the job done.
With Bright Data's Mobile Proxies, you get the benefits of residential proxies and you get some additional benefits of having a mobile IP address as well.
You should also have a decent understanding of how to implement Bright Data's Mobile Proxies using Python Requests, Scrapy, NodeJS Puppeteer and NodeJS Playwright. You can view the full documentation for Bright Data's Mobile Proxies here.
More Cool Articles
Wanna keep reading? At ScrapeOps we've got a ton of content that you can learn from.
Whether you're a seasoned dev or you're brand new to web scraping, we've got something useful for you. We love scraping so much that we wrote the Python Web Scraping Playbook.
If you want to learn more, take a look at the articles below.