

Bright Data Datacenter Proxies: Web Scraping Guide

Bright Data provides proxy products of all kinds. They boast access to a Scraping Browser, Residential Proxies, ISP Proxies, Datacenter Proxies, Mobile Proxies, a Web Unlocker, and a SERP API.

Today, we're going to test out their Datacenter Proxies. Datacenter proxies offer speed, reliability, and stability. We'll go through the whole process, from signing up all the way to using the proxy in production.

Need help scraping the web?

Then check out ScrapeOps, the complete toolkit for web scraping.


TLDR: How to Integrate Bright Data Datacenter Proxy?

Getting set up with Datacenter Proxies from Bright Data is really easy. Once you've got your USERNAME, ZONE, and PASSWORD, simply plug them into the file below. This code sets up a proxy, then checks both your actual IP address and your proxied IP so you can compare them and make sure everything's working.

import requests

USERNAME = "your-username"
ZONE = "your-zone-name"
PASSWORD = "your-password"
HOSTNAME = "brd.superproxy.io"
PORT = 22225

proxies = {
    "http": f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}",
    "https": f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}"
}

url = "https://httpbin.org/ip"

# First, check your real IP without the proxy...
actual_ip = requests.get(url)

# ...then check the IP you get when routing through Bright Data.
proxied_ip = requests.get(url, proxies=proxies)

print(actual_ip.text)

print(proxied_ip.text)
  • First, we set up our configuration variables: USERNAME, ZONE, PASSWORD, HOSTNAME, and PORT.
  • We create a dict object holding our http and https proxies. We set them both to f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}".
  • Next, we get our actual_ip and our proxied_ip from httpbin.
  • Once we've retrieved both IP addresses, we print them to the console to compare.

Understanding Datacenter Proxies

What Are Datacenter Proxies?

Datacenter proxies route your traffic through IP addresses hosted on servers in commercial datacenters, rather than through addresses assigned to homes by an ISP. Some sites block datacenter IP addresses, but not many do. For the vast majority of the web, datacenter proxies are more than sufficient.

There are two main types of proxies in the industry: Datacenter and Premium. Premium proxies are made up of actual mobile and residential IP addresses. They tend to cost quite a bit more and they are significantly slower than datacenter proxies.

In the next few sections, we're going to go over the differences between residential and datacenter proxies and the reasons to choose each type.


Datacenter vs Residential

Datacenter

Pros

  • Price: Datacenter proxies are cheap... really cheap. When using residential proxies, it's not uncommon to pay up to $8/GB.

  • Speed: These proxies are hosted inside actual datacenters. Datacenters usually use top of the line internet connections and hardware. This can really increase the speed of our scrape.

  • Availability: Datacenter proxies usually operate with a much larger pool of IP addresses.

Cons

  • Blocking: Some sites block datacenter IPs by default. This makes some sites more difficult to scrape when using a datacenter proxy.

  • Less Geotargeting Support: While we often get the option to choose our location with datacenter proxies, the traffic still doesn't look quite as natural as it does with a residential proxy. We can choose our location, but the IP still shows up as belonging to a datacenter.

  • Less Anonymity: Since your request isn't tied to an individual residential IP address, your proxy can easily be traced back to a datacenter. This doesn't reveal your identity, but it does reveal that the request isn't coming from a standard residential location. Your request isn't tied to some random household; it's tied to a company.

Residential

Pros

  • Anonymity: Residential proxies do offer a higher degree of anonymity. Since you're getting an IP address tied to an actual house, your traffic blends in much more.

  • Better Access: There are quite a few sites that block datacenter IP addresses. If you're trying to access a site that blocks them, you need to use a residential IP address.

Cons

  • Price: Residential Proxies are far more expensive than their Datacenter counterparts. Bright Data charges $8.40/GB on their Pay As You Go plan!

  • Speed: Residential proxies are often slower than their datacenter counterparts. You're not always tied to a state-of-the-art machine with the best connection; you're tied to a real residential device on a residential internet connection.

Residential proxies are ideal for SERP results, ad verification, social media monitoring/scraping and much more.


Why Use Bright Data Datacenter Proxies?

When we decide to use Datacenter Proxies from Bright Data, we get some pretty decent perks. For starters, their datacenter proxies are dirt cheap. You can get started for $0.60/GB. That's less than 1/10 the price of the residential proxies. On top of that, we get access to free geotargeting!

  • Datacenter Proxies with Bright Data are very affordable.

  • We can use geotargeting to select which country we want to appear in.


Bright Data Datacenter Proxy Pricing

The pricing plans for Bright Data Datacenter Proxies are pretty straightforward. For people just looking to test it out, they offer a Pay As You Go Plan. If you're looking to use these proxies at more of an industrial level, they offer monthly plans as well.

The table below outlines their pricing plans.

| Plan | Monthly Cost | Cost Per GB |
|------|--------------|-------------|
| Pay As You Go | N/A | $0.60 |
| 1 TB | $499 + tax | $0.51 |
| 2 TB | $999 + tax | $0.45 |
| 5 TB | $1,999 + tax | $0.42 |

While the higher tier plans definitely require a pretty large commitment, these prices are pretty good. If you want to compare them to other providers, take a look here. We actually built a tool to help you shop for the best proxy provider to meet your needs.


Setting Up Bright Data Datacenter Proxies

Signing up for Bright Data is pretty simple. You can create an account directly, or sign up using Google, GitHub, or your email address.


Once you've signed up, you'll need to click on My Zones. If you're new to Bright Data, your dashboard will look similar to the one below, but you won't have any zones yet.


Click on Add and then select Datacenter Proxies.


You'll be given the option to customize your proxies if you want. Here, I decided to use just standard Datacenter Proxies and pay per bandwidth. As you can see in the screenshot, Datacenter proxies are dirt cheap. Our cost is only $0.60/GB!



Authentication

Authentication is relatively simple. We can authenticate our requests either with a username and password, or by whitelisting an IP address. To whitelist an IP address, first click on your new datacenter_proxy zone and click Whitelisted IPs.

When you whitelist an IP address, it no longer requires authentication.

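With a whitelisted IP, the proxy URL doesn't need credentials embedded in it at all. Here's a minimal sketch, assuming the machine you're running on has its IP whitelisted in your zone (check your zone's access details for the exact host and port):

import requests

HOSTNAME = "brd.superproxy.io"
PORT = 22225

# No username or password in the URL: the whitelisted IP is the authentication.
proxies = {
    "http": f"http://{HOSTNAME}:{PORT}",
    "https": f"http://{HOSTNAME}:{PORT}"
}

print(requests.get("https://httpbin.org/ip", proxies=proxies).text)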

In our standard requests, we authenticate with a username and a password. Take a look at the request below.

import requests

USERNAME = "your-username"
ZONE = "your-zone-name"
PASSWORD = "your-password"
HOSTNAME = "brd.superproxy.io"
PORT = 22225

proxies = {
    "http": f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}",
    "https": f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}"
}

url = "https://httpbin.org/ip"

actual_ip = requests.get(url)

proxied_ip = requests.get(url, proxies=proxies)

print(actual_ip.text)

print(proxied_ip.text)

In the example above, we go through and check our actual IP address against our proxied one. We use httpbin's /ip endpoint to do this.

  • First, we set up all the basic pieces of our url: USERNAME, ZONE, PASSWORD, HOSTNAME, and PORT.
  • Next, we create a proxies dictionary and set both our http and https proxies to f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}".
  • We get our actual IP address with requests.get(url).
  • We then get our proxied IP with requests.get(url, proxies=proxies).
  • Once we've finished retrieving our information, we print it to the terminal so we can compare.

In the screenshot below, you can verify that this proxy connection is working. Our proxied request yields different results than our standard one.



Basic Request Using Bright Data Datacenter Proxies

In the Authentication section above, you already learned how to make basic requests. For consistency, we'll show an example of that here as well. This is a more simplified version of our request. We already know that the proxy connection is working. No need to test it against our real IP address again.

import requests

USERNAME = "your-username"
ZONE = "your-zone-name"
PASSWORD = "your-password"
HOSTNAME = "brd.superproxy.io"
PORT = 22225

proxies = {
    "http": f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}",
    "https": f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}"
}

url = "https://httpbin.org/ip"

proxied_ip = requests.get(url, proxies=proxies)
print("Proxy Location:", proxied_ip.json()["origin"])

This example is a little bit cleaner than our last one. Here's what's changed.

  • We omit the check on our actual IP address.
  • After we get our response, we parse the JSON body with .json() and print our IP address: print("Proxy Location:", proxied_ip.json()["origin"]).

All in all, we're still in pretty simple territory here. In the coming sections we'll take a look at some of Bright Data's more advanced functionality when it comes to using Datacenter IPs.


Country Geotargeting

Using Bright Data's Datacenter Proxies, we get access to geolocation targeting. This is a super useful feature: when we set a specific geolocation, Bright Data automatically routes our request through an IP address in that location. We set a custom location using the country flag. If we want a US-based IP address, we pass country-us into our url.

Take a look at the example below.

import requests

USERNAME = "your-username"
ZONE = "your-zone-name"
PASSWORD = "your-password"
HOSTNAME = "brd.superproxy.io"
PORT = 22225
COUNTRY = "us"

proxies = {
    "http": f"http://brd-customer-{USERNAME}-zone-{ZONE}-country-{COUNTRY}:{PASSWORD}@{HOSTNAME}:{PORT}",
    "https": f"http://brd-customer-{USERNAME}-zone-{ZONE}-country-{COUNTRY}:{PASSWORD}@{HOSTNAME}:{PORT}"
}

url = "https://httpbin.org/ip"

proxied_ip = requests.get(url, proxies=proxies)
print("Proxy Location:", proxied_ip.json()["origin"])

We ran this code and got the following output.


Now, we need to verify that this IP address is in our selected country (us). If you look at the screenshot below, our location shows up as Wilmington, Delaware. Our geotargeting is working.


The country flag is the key difference in this example. To set a country, you pass a country code along with the country flag; in this case, we pass us. There are a ton of options available. We break them down in the table below.

Country Codes

| Country | Country Code |
|---------|--------------|
| Albania | al |
| Argentina | ar |
| Armenia | am |
| Australia | au |
| Austria | at |
| Azerbaijan | az |
| Bangladesh | bd |
| Belarus | by |
| Belgium | be |
| Bolivia | bo |
| Brazil | br |
| Bulgaria | bg |
| Cambodia | kh |
| Canada | ca |
| Chile | cl |
| China | cn |
| Colombia | co |
| Costa Rica | cr |
| Croatia | hr |
| Cyprus | cy |
| Czech Republic | cz |
| Denmark | dk |
| Dominican Republic | do |
| Ecuador | ec |
| Egypt | eg |
| Estonia | ee |
| Finland | fi |
| France | fr |
| Georgia | ge |
| Germany | de |
| Great Britain | gb |
| Greece | gr |
| Guatemala | gt |
| Hong Kong | hk |
| Hungary | hu |
| Iceland | is |
| India | in |
| Indonesia | id |
| Ireland | ie |
| Isle of Man | im |
| Israel | il |
| Italy | it |
| Jamaica | jm |
| Japan | jp |
| Jordan | jo |
| Kazakhstan | kz |
| Kyrgyzstan | kg |
| Laos | la |
| Latvia | lv |
| Lithuania | lt |
| Luxembourg | lu |
| Malaysia | my |
| Mexico | mx |
| Moldova | md |
| Netherlands | nl |
| New Zealand | nz |
| Norway | no |
| Peru | pe |
| Philippines | ph |
| Russia | ru |
| Saudi Arabia | sa |
| Singapore | sg |
| South Korea | kr |
| Spain | es |
| Sri Lanka | lk |
| Sweden | se |
| Switzerland | ch |
| Taiwan | tw |
| Tajikistan | tj |
| Thailand | th |
| Turkey | tr |
| Turkmenistan | tm |
| Ukraine | ua |
| United Arab Emirates | ae |
| United States | us |
| Uzbekistan | uz |
| Vietnam | vn |
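Since only the -country-{code} piece of the username changes, it's convenient to wrap the proxy construction in a small helper. Here's a minimal sketch using the same credentials and URL format as the examples above:

import requests

USERNAME = "your-username"
ZONE = "your-zone-name"
PASSWORD = "your-password"
HOSTNAME = "brd.superproxy.io"
PORT = 22225

def proxies_for_country(country_code):
    # Insert the country flag into the username portion of the proxy url.
    proxy_url = f"http://brd-customer-{USERNAME}-zone-{ZONE}-country-{country_code}:{PASSWORD}@{HOSTNAME}:{PORT}"
    return {"http": proxy_url, "https": proxy_url}

# Spot-check a few of the codes from the table above.
for code in ["us", "de", "jp"]:
    response = requests.get("https://httpbin.org/ip", proxies=proxies_for_country(code))
    print(code, response.json()["origin"])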

City Geotargeting

City-level geotargeting can be a very useful feature, but with datacenter proxies you typically don't get support for it. If you want to geotarget a specific city, it's best to choose a residential service. Bright Data also has a residential proxy service; you can sign up for a free trial here.

When we use datacenter proxies, more often than not, we need to make do with country-level geotargeting. If you need city-level geotargeting, it's best to sign up for a Residential Proxy service (see the sketch after the list below). These products give us much better support for city geotargeting and allow us to scrape all sorts of local data such as:

  • Local Ads
  • Local Businesses
  • Local Social Media
  • Local Events
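For reference, here's what city targeting looks like on Bright Data's residential product. This is a sketch based on Bright Data's documented flag pattern (a -city flag appended after the country flag, with city names lowercase and spaces removed); double-check their docs and your zone settings before relying on it:

import requests

USERNAME = "your-username"
ZONE = "your-residential-zone"  # a residential zone, not the datacenter zone
PASSWORD = "your-password"
HOSTNAME = "brd.superproxy.io"
PORT = 22225

# Hypothetical example: target Los Angeles, US. City names are lowercase
# with spaces removed (e.g. "losangeles", "sanfrancisco").
proxy_url = f"http://brd-customer-{USERNAME}-zone-{ZONE}-country-us-city-losangeles:{PASSWORD}@{HOSTNAME}:{PORT}"
proxies = {"http": proxy_url, "https": proxy_url}

print(requests.get("http://lumtest.com/myip.json", proxies=proxies).text)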

Error Codes

In their FAQ section, Bright Data only lists two error codes that you might receive: status 403 and status 502. Of course, status 200 still means that our request was successful. The table below is small, but it should give you some insight into any problems you may face when using Bright Data Datacenter Proxies.

| Status Code | Type | Description |
|-------------|------|-------------|
| 200 | Success | Everything worked as expected! |
| 403 | Forbidden | You are forbidden from accessing this URL. |
| 502 | Bad Gateway | Bright Data failed to get a response from the server. |
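Because a 502 often just means Bright Data couldn't reach the target that time, while a 403 won't fix itself, it can be worth wrapping your requests in a small retry loop. Here's a minimal sketch that reuses a proxies dict like the ones above:

import time
import requests

def get_with_retries(url, proxies, max_retries=3):
    # Retry transient failures (like 502) with exponential backoff,
    # but fail fast on 403 since retrying won't help.
    for attempt in range(max_retries):
        response = requests.get(url, proxies=proxies)
        if response.status_code == 200:
            return response
        if response.status_code == 403:
            raise Exception(f"403 Forbidden when fetching {url}")
        time.sleep(2 ** attempt)
    raise Exception(f"Giving up on {url} after {max_retries} attempts")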

KYC Verification

For datacenter proxies, Bright Data does not require a KYC process. However, if you decide to use their residential services, they have quite a stringent KYC process. It even includes a live video call! You can read more about the residential KYC process here.



Implementing Bright Data Datacenter Proxies in Web Scraping

Now, let's take a look at all the different ways we can integrate with Datacenter Proxies from Bright Data. In the coming sections, we'll show you several Python integrations and a couple using NodeJS as well. This should leave you well equipped to get started with these proxies on your own.

Python Requests

If you've been following along, you've already seen integration using Python Requests. For consistency, we're going to post an example of it here anyway.

import requests

USERNAME = "your-username"
ZONE = "your-zone-name"
PASSWORD = "your-password"
HOSTNAME = "brd.superproxy.io"
PORT = 22225

proxies = {
    "http": f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}",
    "https": f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}"
}

url = "https://httpbin.org/ip"

proxied_ip = requests.get(url, proxies=proxies)
print("Proxy Location:", proxied_ip.json()["origin"])
  • Once we've got our credentials, we write our proxy url like this: f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}"
  • We then create a dict object that holds both our http and https proxies.
  • When making our requests, we make sure to pass proxies=proxies. This tells Python Requests to use the dict object we created for our proxy settings.

Python Selenium

SeleniumWire has always been a tried and true method for using authenticated proxies with Selenium. As you may or may not know, vanilla Selenium does not support authenticated proxies. Even worse, SeleniumWire has been deprecated! This being said, it is still technically possible to integrate Bright Data Datacenter Proxies via SeleniumWire, but we highly advise against it.

When you decide to use SeleniumWire, you are vulnerable to the following risks:

  • Security: Browsers and drivers like ChromeDriver or GeckoDriver receive regular security patches. Since SeleniumWire no longer does, your browsing sessions are left with security holes that have already been fixed elsewhere.

  • Dependency Issues: SeleniumWire is no longer maintained. In time, it may not be able to keep up with its dependencies as they get updated. Broken dependencies can be a source of unending headache for anyone in software development.

  • Compatibility: As the web itself gets updated, SeleniumWire doesn't. Regular browsers are updated all the time. Since SeleniumWire no longer receives updates, you may experience broken functionality and unexpected behavior.

As time goes on, the probability of all these problems increases. If you understand the risks but still wish to use SeleniumWire, you can view a guide on that here.

Depending on when you're reading this, the code example below may or may not work. As mentioned above, we strongly recommend against using SeleniumWire because of its deprecation, but if you decide to do so anyway, here you go. We are not responsible for any damage that this may cause to your machine or your privacy.

from seleniumwire import webdriver

USERNAME = "your-username"
ZONE = "your-zone-name"
PASSWORD = "your-password"
HOSTNAME = "brd.superproxy.io"
PORT = 22225


## Define Your Proxy Endpoints
proxy_options = {
    "proxy": {
        "http": f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}",
        "https": f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}",
        "no_proxy": "localhost,127.0.0.1"
    }
}

## Set Up Selenium Chrome driver
driver = webdriver.Chrome(seleniumwire_options=proxy_options)

## Send Request Using Proxy
driver.get('https://httpbin.org/ip')
  • We set up our url the same way we did with Python Requests: f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}".
  • We assign this url to both the http and https protocols of our proxy settings.
  • driver = webdriver.Chrome(seleniumwire_options=proxy_options) tells webdriver to open Chrome with our custom seleniumwire_options.

Python Scrapy

Using these Datacenter Proxies with Scrapy is really straightforward. There are many ways to do it. In the example below, we'll setup our proxy from within our spider.

To start, we need to make a new Scrapy project.

scrapy startproject datacenter

Then, from within your new Scrapy project, create a new Python file inside the spiders folder with the following code.

import scrapy

USERNAME = "your-username"
ZONE = "your-zone-name"
PASSWORD = "your-password"
HOSTNAME = "brd.superproxy.io"
PORT = 22225


class BrightdataScrapyExampleSpider(scrapy.Spider):
    name = "datacenter_proxy"

    def start_requests(self):
        request = scrapy.Request(url="https://httpbin.org/ip", callback=self.parse)
        request.meta['proxy'] = f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}"
        yield request

    def parse(self, response):
        print(response.body)
  • We construct our url the same way we did in the previous two examples: f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}".
  • Inside of our start_requests method, we assign this url to request.meta['proxy'].
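As mentioned above, there are other ways to wire this up. If you'd rather apply the proxy to every request in a project instead of inside each spider, a small downloader middleware is a common pattern. Here's a minimal sketch (the module and class names are just illustrative):

# middlewares.py
USERNAME = "your-username"
ZONE = "your-zone-name"
PASSWORD = "your-password"
HOSTNAME = "brd.superproxy.io"
PORT = 22225

class BrightDataProxyMiddleware:
    def process_request(self, request, spider):
        # Scrapy's built-in HttpProxyMiddleware picks up request.meta['proxy'].
        request.meta["proxy"] = f"http://brd-customer-{USERNAME}-zone-{ZONE}:{PASSWORD}@{HOSTNAME}:{PORT}"

Then register it in settings.py:

DOWNLOADER_MIDDLEWARES = {
    "datacenter.middlewares.BrightDataProxyMiddleware": 350,
}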

NodeJS Puppeteer

Now, we're going to run the same setup for NodeJS Puppeteer. Similar to Scrapy, we need to create a new project first. Follow the steps below to get up and running in minutes.

Create a new folder.

mkdir puppeteer-datacenter

cd into the new folder and create a new JavaScript project.

cd puppeteer-datacenter
npm init -y

Next, we need to install Puppeteer.

npm install puppeteer

Next, from within your new JavaScript project, copy/paste the code below into a new .js file.

const puppeteer = require('puppeteer');

const PORT = 22225;
const USERNAME = "your-username";
const PASSWORD = "your-password";
const ZONE = "your-zone-name";

(async () => {
    const browser = await puppeteer.launch({
        args: [`--proxy-server=brd.superproxy.io:${PORT}`]
    });

    const page = await browser.newPage();

    await page.authenticate({
        username: `brd-customer-${USERNAME}-zone-${ZONE}`,
        password: PASSWORD
    });

    await page.goto('http://lumtest.com/myip.json');
    await page.screenshot({path: 'puppeteer.png'});

    await browser.close();
})();
  • First, we declare all of our configuration variables as constants: PORT, USERNAME, PASSWORD, ZONE.
  • We set our proxy server when we launch the browser by passing args: [`--proxy-server=brd.superproxy.io:${PORT}`].
  • We add our USERNAME to the authentication: brd-customer-${USERNAME}-zone-${ZONE}.
  • We also add our PASSWORD to the authentication: password: PASSWORD.

Puppeteer offers first-class support for proxy integration right out of the box. With Puppeteer's built-in authenticate() method, we even have a dedicated spot for both our USERNAME and our PASSWORD. The screenshot from this code is shown below.


NodeJS Playwright

Integration with Playwright is almost identical to Puppeteer; the two actually share a common origin in Chrome's DevTools Protocol. The steps below should look at least somewhat familiar, though they differ slightly at the end.

Create a new project folder.

mkdir playwright-datacenter

cd into the new folder and initialize a JavaScript project.

cd playwright-datacenter
npm init -y

Install Playwright.

npm install playwright
npx playwright install

Next, you can copy/paste the code below into a JavaScript file.

const playwright = require('playwright');

const PORT = 22225;
const USERNAME = "your-username";
const PASSWORD = "your-password";
const ZONE = "your-zone-name";

const options = {
    proxy: {
        server: `http://brd.superproxy.io:${PORT}`,
        username: `brd-customer-${USERNAME}-zone-${ZONE}`,
        password: PASSWORD
    }
};

(async () => {
    const browser = await playwright.chromium.launch(options);
    const page = await browser.newPage();

    await page.goto('http://lumtest.com/myip.json');

    await page.screenshot({ path: "playwright.png" });

    await browser.close();
})();
  • Like our Puppeteer example, we first set up our configuration variables: PORT, USERNAME, PASSWORD, ZONE.
  • We create a proxy object with the following fields:
  • server: `http://brd.superproxy.io:${PORT}`
  • username: `brd-customer-${USERNAME}-zone-${ZONE}`
  • password: PASSWORD

When we set up our proxy with Playwright, we get solid support and easy configuration. You can view the resulting screenshot from this code below.



Case Study: Scrape The Guardian

Sites with super strict anti-bots will automatically block our datacenter proxy, and that's normal. A datacenter proxy is meant for more general sites. Datacenter proxies are far cheaper and more efficient; residential proxies are designed as more of a fallback in the event that datacenter proxies don't work.

Now, we're going to scrape The Guardian. This scrape is more about showing you concepts than gathering vast amounts of data. In the code below, we first set up a proxy based in the US. We perform a GET request to verify our location information, and then we make a GET request to The Guardian. We print our location information and find the navbar from The Guardian's front page. After this initial run, we reset our connection to use an IP inside the UK. If our proxies are working, we'll receive different output from each proxy.

Take a look at the code below.

import requests
from bs4 import BeautifulSoup

#basic config for your proxy
USERNAME = "your-username"
ZONE = "your-zone-name"
PASSWORD = "your-password"
HOSTNAME = "brd.superproxy.io"
PORT = 22225
COUNTRY = "us"

#set the initial connection
proxies = {
    "http": f"http://brd-customer-{USERNAME}-zone-{ZONE}-country-{COUNTRY}:{PASSWORD}@{HOSTNAME}:{PORT}",
    "https": f"http://brd-customer-{USERNAME}-zone-{ZONE}-country-{COUNTRY}:{PASSWORD}@{HOSTNAME}:{PORT}"
}

print("----------------------US---------------------")


location_info = requests.get("http://lumtest.com/myip.json", proxies=proxies)
print(location_info.text)

response = requests.get('https://www.theguardian.com/', proxies=proxies)

soup = BeautifulSoup(response.text, "html.parser")

subnav = soup.select_one("div[data-testid='sub-nav']")
print(subnav.text)


print("----------------------UK---------------------")

#reset the country variable
COUNTRY = "gb"

#reset the proxies
proxies = {
    "http": f"http://brd-customer-{USERNAME}-zone-{ZONE}-country-{COUNTRY}:{PASSWORD}@{HOSTNAME}:{PORT}",
    "https": f"http://brd-customer-{USERNAME}-zone-{ZONE}-country-{COUNTRY}:{PASSWORD}@{HOSTNAME}:{PORT}"
}

location_info = requests.get("http://lumtest.com/myip.json", proxies=proxies)
print(location_info.text)

response = requests.get("https://www.theguardian.com/", proxies=proxies)

soup = BeautifulSoup(response.text, "html.parser")

subnav = soup.select_one("div[data-testid='sub-nav']")

print(subnav.text)

There are some subtle differences you should notice above from our earlier geotargeting example:

  • location_info = requests.get("http://lumtest.com/myip.json", proxies=proxies) is used after setting up each proxy connection. This simply verifies our location before fetching the target site.
  • COUNTRY: this variable changes later in the code so we can reset the proxy.
  • proxies: after resetting our country, we rebuild the proxies dict.

When we run the code, we get the following output.

----------------------US---------------------
{"ip":"134.199.83.172","country":"US","asn":{"asnum":20473,"org_name":"AS-VULTR"},"geo":{"city":"Los Angeles","region":"CA","region_name":"California","postal_code":"90017","latitude":34.0514,"longitude":-118.2707,"tz":"America/Los_Angeles","lum_city":"losangeles","lum_region":"ca"}}
USUS elections 2024WorldEnvironmentUkraineSoccerBusinessTechScienceNewslettersWellness
----------------------UK---------------------
{"ip":"92.43.85.194","country":"GB","asn":{"asnum":207990,"org_name":"HostRoyale Technologies Pvt Ltd"},"geo":{"city":"London","region":"ENG","region_name":"England","postal_code":"EC4R","latitude":51.5088,"longitude":-0.093,"tz":"Europe/London","lum_city":"london","lum_region":"eng"}}
WorldUKClimate crisisUkraineEnvironmentScienceGlobal developmentFootballTechBusinessObituaries

First, we'll look at the location comparison here. We cleaned up the important information from the JSON and made it a little easier to read. Our US proxy is located in California and our UK proxy is located in London.

| Proxy Country | Region Name | City |
|---------------|-------------|------|
| US | California | Los Angeles |
| UK | England | London |

Now let's take a closer look at our navbar text from each run.

  • us: USUS elections 2024WorldEnvironmentUkraineSoccerBusinessTechScienceNewslettersWellness
  • gb: WorldUKClimate crisisUkraineEnvironmentScienceGlobal developmentFootballTechBusinessObituaries

Let's make these a little easier to read.

  • us: US | US elections | 2024 | World | Environment | Ukraine | Soccer | Business | Tech | Science | Newsletters | Wellness
  • gb: World | UK | Climate crisis | Ukraine | Environment | Science | Global development | Football | Tech | Business | Obituaries
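If you want the script itself to produce that readable form, you can join the individual link texts instead of dumping subnav.text. A quick sketch, assuming the navbar items are anchor tags inside the sub-nav div:

# Join the text of each link in the sub-nav with a separator.
items = [a.get_text(strip=True) for a in subnav.select("a")]
print(" | ".join(items))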

As you can see, there are some differences in the way the navbar is laid out. On the US site, the top left-hand corner of the navbar holds US followed by US elections 2024. In the UK, the viewer's attention is prioritized a bit differently: The Guardian knows that the average UK reader is probably less concerned with the US and its elections, so they instead see World followed by UK.

Many websites will prioritize your attention differently based on your location.


Alternative: ScrapeOps Proxy Aggregator

While Bright Data's Datacenter Proxies are quite a deal, we offer a different product with even better features for a pretty good price! Take a look at the ScrapeOps Proxy Aggregator. With our Proxy Aggregator, instead of paying for bandwidth, you pay per request. On top of that, you only pay for successful requests.

Our Proxy Aggregator automatically selects the best proxy for you based on our datacenter pools. We source these pools from tons of different providers. If a request fails using a datacenter proxy, we actually retry it using a premium (residential or mobile) proxy for you with no additional charge!

The table below outlines our pricing.

| Monthly Price | API Credits | Basic Request Cost |
|---------------|-------------|--------------------|
| $9 | 25,000 | $0.00036 |
| $15 | 50,000 | $0.0003 |
| $19 | 100,000 | $0.00019 |
| $29 | 250,000 | $0.000116 |
| $54 | 500,000 | $0.000108 |
| $99 | 1,000,000 | $0.000099 |
| $199 | 2,000,000 | $0.0000995 |
| $254 | 3,000,000 | $0.000084667 |

All of these plans offer the following awesome features:

  • JavaScript Rendering
  • Screenshot Capability
  • Country Geotargeting
  • Residential and Mobile Proxies
  • Anti-bot Bypass
  • Custom Headers
  • Sticky Sessions
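Most of these features are toggled with query-string parameters on the proxy URL. As a sketch, here's how JavaScript rendering, residential proxies, and geotargeting look; the parameter names below come from the ScrapeOps docs, but double-check them against the current documentation:

import requests
from urllib.parse import urlencode

API_KEY = "your-super-secret-api-key"

payload = {
    "api_key": API_KEY,
    "url": "https://quotes.toscrape.com/js/",  # a JS-rendered page
    "render_js": "true",      # render the page in a headless browser
    "residential": "true",    # route through the residential pool
    "country": "us"           # country geotargeting
}

response = requests.get("https://proxy.scrapeops.io/v1/?" + urlencode(payload))
print(response.status_code)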

Along with all of these features, Bright Data is one of our providers! When you sign up for ScrapeOps, you get access to proxies from Bright Data and numerous other providers!

Go ahead and sign up for a free trial account here.

Once you've got your free trial, you can copy and paste the code below to check your proxy connection.

import requests
from urllib.parse import urlencode

API_KEY = "your-super-secret-api-key"
LOCATION = "us"

def get_scrapeops_url(url, location=LOCATION):
    payload = {
        "api_key": API_KEY,
        "url": url,
        "country": location
    }
    proxy_url = "https://proxy.scrapeops.io/v1/?" + urlencode(payload)
    return proxy_url

response = requests.get(get_scrapeops_url("http://lumtest.com/myip.json"))
print(response.text)

In the code above, we do the following.

  • Create our configuration variables: API_KEY and LOCATION.
  • Write a get_scrapeops_url() function. This function takes all of our parameters along with a target url and wraps it into a ScrapeOps Proxied url. This is an incredibly easy way to scrape and it makes our proxy code much more modular.
  • Check our IP info with response = requests.get(get_scrapeops_url("http://lumtest.com/myip.json")).
  • Finally, we print it to the terminal. You should get an output similar to this.
{"country":"US","asn":{"asnum":26832,"org_name":"RICAWEBSERVICES"},"geo":{"city":"Dallas","region":"TX","region_name":"Texas","postal_code":"75247","latitude":32.8137,"longitude":-96.8704,"tz":"America/Chicago","lum_city":"dallas","lum_region":"tx"}}

Legal

Bright Data is one of the most ethical proxy companies around. Their proxies come entirely from ethical sources, and they do not condone using their product for illegal or immoral behavior.

Don't use proxy providers to break laws. Doing so is illegal, and it harms everyone involved: it harms the proxy provider, and it eventually harms you too. If you do something illegal using a proxy, your action will first be traced to the proxy provider, and then to your account.

This sort of thing creates problems for both you and the proxy service.

  • Don't use residential proxies to access illegal content: These actions can come with intense legal penalties and even prison or jail time depending on the severity of the offense.

  • Don't scrape and disseminate other people's private data: Depending on what jurisdiction you're dealing with, this is also a highly illegal and dangerous practice. Doxxing private data can also lead to heavy fines and possibly jail/prison time.

Ethical

When we scrape, we don't just need to think about legality, we also need to make some ethical considerations. Just because something is legal doesn't mean that it's morally right. Nobody wants to be in the next headline about unethical practices.

  • Social Media Monitoring: Social media stalking is destructive and disrespectful behavior. How would you feel if someone used data collection methods on your account?

  • Respect Site Policies: Failure to respect a site's policies can get your account suspended/banned. It can even lead to legal troubles for those of you who sign and violate a terms of service agreement.


Conclusion

You've made it to the end! You should have a solid understanding of what Bright Data's Datacenter proxies are capable of. You should know that they're noticeably faster than residential proxies. You should also have a decent understanding of how to implement them in Python Requests, Scrapy, NodeJS Puppeteer and NodeJS Playwright.

As an added bonus, you also learned how to set up a basic proxy connection using the ScrapeOps Proxy Aggregator. You learned about all the features we have and how reasonably priced we are. Take this new knowledge and go build something with Bright Data's Datacenter Proxies or the ScrapeOps Proxy Aggregator.


More Cool Articles

If you want to read more, we've got a ton of content. Whether you're a seasoned dev or brand new to coding, we've got something useful for you. We love scraping so much that we wrote the Python Web Scraping Playbook. If you want to learn more, take a look at the guides below.