
Python Selenium VS Python Requests Compared

Python Selenium and Python Requests are two of the most common libraries used in the web scraping community to retrieve pages from a website and extract the data you want.

As Selenium and Python Requests are very different in terms of their intended use cases and functionality, it is common for developers to wonder which library is the better option for their web scraping use case.

To help you answer this common question, in this guide we're going to walk through what each library is, when you should use each one, and the downsides of each option.

Both Python Selenium and Requests are very well established libraries that are used in numerous use cases. However, for the purposes of this guide we're going to compare them from a web scraping & botting perspective.


TLDR Python Selenium vs Python Requests

Python Selenium and Python Requests are two great libraries for web scraping. However, they each have their own pros and cons, and are each ideally suited to particular types of web scraping.

  • Python Selenium: Is a browser automation library that allows you to scrape websites with a full browser. It renders the page and allows you to interact with the website and extract the data you need.
  • Python Requests: Is an HTTP client that allows you to send GET, POST and other HTTP requests to HTTP endpoints so you can retrieve HTML, JSON and XML data for parsing with another library.

Here are the situations when you should consider using each library:

Use Selenium when:

  • You need to render dynamic pages with a browser.
  • You need to interact a lot with the website (click, scroll, etc.) to access the data you need.
  • You need to make automated bots that work behind logins.
  • You need to take screenshots of pages.
  • You need to scrape heavily protected websites.

Use Requests when:

  • You are scraping at a large scale.
  • You are scraping API endpoints.
  • You want to minimize scraping costs.
  • You want to scrape as fast as possible.

What Is Python Selenium?

Python Selenium is the Python implementation of the popular browser automation library Selenium.

Launched in 2002, Selenium is a browser automation library that allows you to load a web page in a browser, interact with the page as a normal user would using pre-coded steps, and extract the data you need.

Selenium was originally designed for software QA testing and browser automation. However, because of its ease of use, Selenium has been heavily adopted by developers building web scrapers and automated bots.

A Selenium web scraper that scrapes the header from QuotesToScrape.com would look something like this:


from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

## Set Up Selenium Chrome Driver
webdriver_path = "path_to_installed_driver_location"
driver = webdriver.Chrome(service=Service(webdriver_path))

## Send Request To Page
driver.get('http://quotes.toscrape.com/page/1/')

## Extract H1 Tag From HTML
h1_text = driver.find_element(By.TAG_NAME, 'h1').text

print(h1_text)

driver.quit()

Selenium boasts a wide range of features that you can use in your scrapers, a few of which are illustrated in the sketch after this list:

  • Headless Browser: Selenium can open web pages in all the popular browsers (Chrome, Firefox, etc.) exactly as a real user would see them. And if performance is a concern, you can run Selenium in headless mode (renders the page but doesn't launch a visual browser).
  • Page Interactions: With Selenium you can code your scraper to interact with the page like a real user would: click a button, fill out a form, or scroll the page.
  • Data Extraction: Selenium has a built-in HTML parsing API, so you can easily extract the data you want from the page (no need for other libraries).
  • Waits: You can tell Selenium to wait for specific elements to appear on the page before taking an action.
  • Screenshots: Selenium allows you to automate taking screenshots of pages.
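
To illustrate a few of these features, here is a minimal sketch that loads QuotesToScrape in headless Chrome, waits for the quotes to appear, clicks the "Next" pagination link, and takes a screenshot. It assumes Selenium 4 (which can manage the browser driver itself), and the CSS selectors are taken from that demo page:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

## Run Chrome in headless mode (renders the page without a visual browser)
options = webdriver.ChromeOptions()
options.add_argument("--headless=new")
driver = webdriver.Chrome(options=options)

driver.get('http://quotes.toscrape.com/page/1/')

## Wait up to 10 seconds for the quote elements to appear before extracting them
quotes = WebDriverWait(driver, 10).until(
    EC.presence_of_all_elements_located((By.CSS_SELECTOR, '.quote .text'))
)
print([quote.text for quote in quotes])

## Click the "Next" pagination link like a real user would
driver.find_element(By.CSS_SELECTOR, 'li.next a').click()

## Take a screenshot of the rendered page
driver.save_screenshot('quotes_page_2.png')

driver.quit()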

Selenium is a very well established library and as a result a large developer ecosystem has built up around it, meaning there are lots of extensions, guides and support materials available online if you need them.


What Is Python Requests?

Python Requests is a battle-tested and simple-to-use HTTP client for Python.

What this means is that with Python Requests you can send GET, POST, PUT, HEAD, DELETE and OPTIONS requests to an HTTP server to retrieve the content you want to scrape.

Python Requests is by far the most widely used HTTP client for Python, with millions of developers using it to make HTTP requests. As a result, it is the most popular library used amongst Python web scrapers.

A Python Requests based web scraper that scrapes the header from QuotesToScrape.com would look something like this:


import requests
from bs4 import BeautifulSoup

## Make HTTP Request
response = requests.get('http://quotes.toscrape.com/page/1/')

## Extract Data From HTML
soup = BeautifulSoup(response.text, "html.parser")
h1_text = soup.find('h1').text

print(h1_text)

Unlike Selenium, Python Requests is just an HTTP client, so you will need to combine it with a parsing library like BeautifulSoup, Parsel or lxml to extract data from HTML pages.

Python Requests boasts a wide range of features that you can use in your scrapers, a few of which are illustrated in the sketch after this list:

  • HTTP Requests: With Python Requests you can send GET, POST, PUT, HEAD, DELETE and OPTIONS requests to an HTTP server to retrieve the content you want to scrape.
  • Sessions: You can use Sessions to persist cookies and reuse the same underlying connection across multiple requests, increasing the speed of your scrapers.
  • Headers: Python Requests makes it very easy for you to control what headers and user-agents you send with your requests (useful for avoiding anti-bots).
  • Proxies: With Python Requests it is very easy to route your requests through proxies to avoid anti-bot systems that aim to block you from scraping their website.
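
Here is a minimal sketch that combines these features: a Session with a custom user-agent header, routed through a proxy. The proxy URL and credentials are placeholders, so swap in your own provider's details:

import requests

## Reuse one Session so cookies & the underlying connection persist across requests
session = requests.Session()

## Set a custom user-agent header (useful for avoiding basic anti-bot checks)
session.headers.update({
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'
})

## Route requests through a proxy (placeholder URL -- swap in your own provider)
proxies = {
    'http': 'http://username:password@proxy.example.com:8000',
    'https': 'http://username:password@proxy.example.com:8000',
}

response = session.get('http://quotes.toscrape.com/page/1/', proxies=proxies)
print(response.status_code)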

When Should You Use Selenium Over Python Requests?

Python Selenium is a powerful browser automation library that can be very useful for web scraping. However, it really shines in the following situations:

  1. Require Full Browser
  2. Heavy Page Interaction
  3. Automated Bots
  4. Taking Screenshots
  5. Scraping Protected Websites
  6. Small Scale Scraping

More details below:

Situation #1: Require Full Browser

As Selenium is an automated browser and Python Requests is a simple HTTP client, Selenium is the better option if you are scraping dynamic pages that need to be rendered client-side before all the data is shown.

This is typical for websites that use modern web frameworks like AngularJS, ReactJS, VueJS, etc. but don't server-side render their pages before returning them to the user.

Situation #2: Heavy Page Interaction

If your use case requires a lot of page interactions (like clicking buttons, toggling dropdowns, or submitting forms) then Selenium is the way to go. You often can replicate these interactions with Python Requests, but doing so requires a lot of reverse engineering.

Situation #3: Automated Bots

If you need to make an automated bot that works behind a website's login then Selenium is much better placed than Python Requests to do the job.

With Selenium it is much easier to manage the process of logging into websites and navigating behind the login compared to Python Requests.
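
As a minimal sketch, here is how a Selenium bot could log in via the demo login form at QuotesToScrape (which accepts any credentials) and then navigate behind the login. The element locators are based on that demo page, so adjust them for your target website:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()

## Open the login page and fill out the login form
driver.get('http://quotes.toscrape.com/login')
driver.find_element(By.ID, 'username').send_keys('demo_user')
driver.find_element(By.ID, 'password').send_keys('demo_password')
driver.find_element(By.CSS_SELECTOR, 'input[type="submit"]').click()

## The browser now holds the login session, so navigate behind the login
driver.get('http://quotes.toscrape.com/')
print(driver.page_source[:500])

driver.quit()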

Situation #4: Taking Screenshots

If your use case requires you to take screenshots of the page then Selenium is a great option as it has built-in screenshot functionality, whereas taking screenshots isn't possible with Python Requests.

Situation #5: Scraping Protected Websites

A lot of websites have started using sophisticated anti-bot protection systems like Cloudflare and DataDome to stop people from scraping their content.

Whilst there are ways to scrape these websites with Python Requests, Selenium is a good option if you don't want to rely on any 3rd party tools and would rather try your hand at bypassing these anti-bot systems yourself.

You can use fortified versions of Selenium like the Selenium Undetected ChromeDriver, which is designed to bypass some of these anti-bot systems. Check out our guide for more information about how to use the Undetected ChromeDriver.

Situation #6: Small Scale Scraping

If you are doing small scale scraping then the advantages and disadvantages of Selenium over Python Requests are less relevant, especially if you don't need to use proxies (more on this later).


When Should You Use Python Requests Over Selenium?

Python Requests is a lightweight and battle-tested HTTP client that is great for web scraping. However, it really shines in the following situations:

  1. Large scale web scraping
  2. Scraping API endpoints
  3. Minimizing Costs
  4. Faster Scraping

More details below:

Situation #1: Large Scale Web Scraping

If you need to scrape data at scale then using Python Requests is definitely the way to go. With Python Requests you can create very efficient and scalable web scrapers that only download the data from the website that you absolutely need.

Unlike Selenium, each page you request with Python Requests will only return the initial HTML response and won't download any unwanted images, CSS files, tracking cookies, etc.

Plus, if you reverse engineer the target website you can often send the requests to non-public internal API endpoints that give you the data you need in an easy-to-parse JSON format.

Situation #2: Scraping API Endpoints

Python Requests should be your go-to option if you need to scrape API endpoints (public or internal) with GET or POST requests. While technically possible, Python Selenium is completely unsuited to this use case.
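
For example, here is a sketch that queries the JSON endpoint backing QuotesToScrape's infinite-scroll page. The response structure used below (a quotes list with text and author fields) is an assumption based on that demo endpoint, so always inspect the JSON your target API actually returns:

import requests

## Request the JSON API endpoint directly instead of parsing HTML
response = requests.get('http://quotes.toscrape.com/api/quotes', params={'page': 1})
data = response.json()

## Loop through the returned quotes (structure assumed from the demo endpoint)
for quote in data['quotes']:
    print(quote['author']['name'], '-', quote['text'])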

Situation #3: Minimizing Costs

If minimizing your scraping costs is a big concern for you then you should use Python Requests over Selenium in your scrapers.

As we will discuss in the downsides of using Selenium section, Selenium can dramatically increase your costs versus Python Requests, as it consumes a lot more memory and bandwidth when downloading pages.

With Python Requests, you will be able to keep your server and proxy costs to the absolute minimum, saving you money.

Situation #4: Faster Scraping

If speed is a big concern for you then go with Python Requests. A scraper built with Python Requests will be able to download web pages much faster than Selenium, as it doesn't need to open a browser and download all the additional resources required to render each page.

Python Requests Sessions

If you really want to optimize your Python Requests scrapers for speed then you should take a look at its Sessions functionality. Using the same Session for multiple requests can dramatically lower the latency of each request, as the underlying TCP connection is reused.
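
As a rough sketch, you can measure the difference yourself by timing the same requests with and without a Session (exact timings will vary with your network):

import time
import requests

urls = [f'http://quotes.toscrape.com/page/{i}/' for i in range(1, 6)]

## Without a Session: each request sets up a new TCP connection
start = time.time()
for url in urls:
    requests.get(url)
print(f"Without Session: {time.time() - start:.2f}s")

## With a Session: the underlying TCP connection is reused across requests
start = time.time()
with requests.Session() as session:
    for url in urls:
        session.get(url)
print(f"With Session: {time.time() - start:.2f}s")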


Downsides Of Using Selenium Over Python Requests

There are a number of downsides to using Python Selenium over Python Requests that you should be aware of, especially if you are planning to use Selenium to scrape at scale:

Downside #1: Memory & Bandwidth Consumption

As Selenium loads each page in a browser (full browser or headless) it consumes a lot more memory & bandwidth than making the same request to a website with Python Requests would.

When Selenium loads a page it also loads things like CSS files, images, JS scripts, tracking cookies, etc., meaning that each page you request generates tens or hundreds of additional requests, which download additional data that needs to be stored in memory.

Python Requests Comparison

In comparison, when you request a URL with Python Requests it will only load that specific URL, not all the additional files and scripts that Selenium does.

Downloading these extra files (especially images) can dramatically increase the memory and bandwidth consumption of your scraper.

This increased memory & bandwidth consumption can cause further issues like scraper stability, increased proxy costs and slower scraping.

Downside #2: Scraper Stability

Because your scraper requires much more memory to operate, it can create stability issues if you are deploying these scrapers on remote servers or virtual machines.

Every server or virtual machine has a memory capacity, but if your scraper exceeds this limit then it can crash the server or virtual machine.

Because of this you will need to provision larger servers with more memory capacity and/or modify your scraper to use less memory.

Downside #3: Increased Proxy Costs

Another drawback to Selenium making tens or hundreds of additional requests for each page you load versus Python Requests is that this can dramatically increase your proxy costs, depending on which type of proxy solution you use.

Selenium can increase your proxy costs if you are using either one of these proxy pricing methods:

  • Pay Per GB Consumed: If you are using a proxy that charges by the amount of bandwidth you consume (GBs) then using Selenium will be a lot more expensive than Python Requests, as by default Selenium will download all the images, CSS files, etc. when loading the page, which you will have to pay for.
  • Pay Per Successful Request: If you are using an API-style proxy that charges for each successful request, then using Selenium over Python Requests will dramatically increase your proxy consumption, as you will be charged for each of the tens or hundreds of requests Selenium sends to load a single web page. In effect, you could be spending 10-100 times more than with Python Requests.

If either of these situations applies to you then it is recommended to use Python Requests over Selenium.

Downside #4: Slower Scraping

The final downside to using Selenium is that it is much slower than using Python Requests. Again, this is largely due to Selenium loading all the additional files required to fully render the page.

Depending on how you configure your scraper, Selenium based scrapers can also be slower because of the extra time it takes for Selenium to launch a browser at startup. However, you can largely negate this issue by optimizing your code (for example, by reusing the same browser instance across pages).


Downsides Of Using Python Requests Over Selenium

There are a number of downsides to using Python Requests over Python Selenium that you should be aware of when choosing which library to go with:

Downside #1: Dynamic Pages

Certain websites use modern web frameworks like AngularJS, ReactJS, VueJS, etc. and require the client to render the page content for all the data to be present.

In situations like this, if you try to use Python Requests to scrape the page, the data likely won't be present in the HTML response (though sometimes it is hidden in a JSON blob) as Python Requests won't render the page.

You can overcome this by using a proxy solution with built-in JS rendering functionality, like the ScrapeOps Proxy Aggregator. However, if you aren't using a smart proxy then you won't be able to access the data you need with Python Requests.
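
As a sketch, a JS-rendering proxy is typically used by passing your target URL and a rendering flag as query parameters. The endpoint and parameter names below follow the ScrapeOps Proxy Aggregator docs as we understand them, so double-check them against the current documentation:

import requests

## Route the request through a JS-rendering proxy so the page is rendered
## in a browser before the HTML is returned to you
response = requests.get(
    'https://proxy.scrapeops.io/v1/',
    params={
        'api_key': 'YOUR_API_KEY',  ## placeholder API key
        'url': 'http://quotes.toscrape.com/js/',  ## the page you want rendered
        'render_js': True,
    },
)
print(response.status_code)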

Downside #2: Interacting With Pages

Python Requests doesn't allow you to interact with the page: you can't click on page elements, scroll the page, or fill out forms like you would in Selenium.

Instead, with Python Requests you have to reverse engineer how the website works: figure out how to send form data to POST endpoints to interact with forms, and identify the hidden API endpoints that populate infinite scroll data feeds.

Downside #3: Navigating & Logging Into Websites

Whereas navigating and logging into websites with Selenium is pretty straightforward, with Python Requests you will have to reverse engineer the way the website paginates content and how its login system works.
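
As a minimal sketch, here is what reverse engineering a login form with Python Requests can look like, using the QuotesToScrape demo login form, which includes a hidden CSRF token (the field names are taken from that demo page, so inspect your target site's form to find its own):

import requests
from bs4 import BeautifulSoup

session = requests.Session()

## Step 1: Fetch the login page and extract the hidden CSRF token from the form
login_page = session.get('http://quotes.toscrape.com/login')
soup = BeautifulSoup(login_page.text, 'html.parser')
csrf_token = soup.find('input', {'name': 'csrf_token'})['value']

## Step 2: POST the form fields (including the token) to log in
session.post('http://quotes.toscrape.com/login', data={
    'csrf_token': csrf_token,
    'username': 'demo_user',
    'password': 'demo_password',
})

## The Session now holds the login cookies for all subsequent requests
print(session.cookies.get_dict())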

Scraping Behind Logins

When it comes to scraping behind logins, it is sometimes easier to use a headless browser (like Selenium) to log into a website and retrieve the login cookies, then use these cookies in your Python Requests scraper to access data behind the login, instead of trying to reverse engineer the login system.

So it is possible to combine a Selenium scraper with a Python Requests scraper to get the best of both worlds.
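
Here is a minimal sketch of that hybrid approach, again using the QuotesToScrape demo login (any credentials work there): Selenium handles the login, then the cookies are copied into a Requests Session for the actual scraping:

import requests
from selenium import webdriver
from selenium.webdriver.common.by import By

## Step 1: Log in with Selenium and grab the session cookies
driver = webdriver.Chrome()
driver.get('http://quotes.toscrape.com/login')
driver.find_element(By.ID, 'username').send_keys('demo_user')
driver.find_element(By.ID, 'password').send_keys('demo_password')
driver.find_element(By.CSS_SELECTOR, 'input[type="submit"]').click()
selenium_cookies = driver.get_cookies()
driver.quit()

## Step 2: Copy the cookies into a Requests Session & scrape behind the login
session = requests.Session()
for cookie in selenium_cookies:
    session.cookies.set(cookie['name'], cookie['value'])

response = session.get('http://quotes.toscrape.com/')
print(response.status_code)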

Downside #4: Can't Screenshot The Page

With Python Requests, you can't take a screenshot of the target page as it doesn't render the page in a browser. You would need to load the page into a browser instance (like Selenium) and have it render the page and take the screenshot for you.


More Web Scraping Tutorials

So that's a comparison of Python Selenium and Python Requests: what they are, when you should use each one, and the downsides of going with each option.

If you would like to learn more about Web Scraping, then be sure to check out The Web Scraping Playbook.

Or check out one of our more in-depth guides: