The NodeJS Playwright Guide
Playwright is a Node.js automation library built by Microsoft. It offers a high-level, user-friendly API for automating tasks and interacting with dynamic web pages.
Playwright communicates directly with the browser (Chromium, WebKit, or Firefox), providing a smooth experience for tasks such as DOM interaction and navigation. It gives you programmatic control over all three browsers and can run them in headless mode, which usually results in faster execution.
Headless browsers are those without a graphical user interface (GUI). They operate in the background and render pages without a visible display, thereby making them faster and more efficient.
In this tutorial, we'll take you through:
- How To Install NodeJS Playwright
- How To Use Playwright
- How To Scrape Pages With Playwright
- How To Wait For The Page To Load
- How To Click On Buttons With Playwright
- How To Scroll The Page With Playwright
- How To Take Screenshots With Playwright
- How To Use A Proxy With Playwright
- More Playwright Functionality
- More Web Scraping Tutorials
How To Install NodeJS Playwright
Before you install Playwright, make sure Node.js is installed on your system. To install Node.js, go to the Node.js website and install the most recent version. Now let's install and configure Playwright.
Open the terminal and create a new folder for your project with any name (in our case, playwright_guide).
mkdir playwright_guide
Now move into the new directory with the cd command:
cd playwright_guide
Great, you're now in the right directory. Run the following command to initialize the package.json file:
npm init -y
Next, install the latest version of Playwright using the following command:
npm install playwright@latest
You also need to install a browser to use with Playwright. Here's how to install Chromium:
npm install playwright-chromium
This is how the installation process looks.
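Once the install finishes, a quick check like the one below can confirm that Node.js is able to load the package from this folder. This is only a sketch: the function name isPlaywrightInstalled is made up for illustration, and the check verifies the npm package only, not that a browser binary was downloaded.

```javascript
// Sanity check: can Node.js resolve the "playwright" package from this folder?
// Dynamic import() is used so that a missing package is reported as `false`
// instead of crashing the script.
async function isPlaywrightInstalled() {
  try {
    const playwright = await import("playwright");
    // The package exposes one launcher object per browser engine.
    return typeof playwright.chromium?.launch === "function";
  } catch (error) {
    return false;
  }
}

isPlaywrightInstalled().then((installed) => {
  console.log(
    installed
      ? "Playwright is installed."
      : "Playwright was not found; re-run the npm install step."
  );
});
```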

Next, open the package.json file and add "type": "module". This tells Node.js to treat .js files as ES modules, which is what allows the import statements used throughout this guide.

How To Use Playwright
We’ll use the Quotes to Scrape website (quotes.toscrape.com) to understand Playwright. This website is designed for practicing web scraping and is easy to use and navigate.
Before we jump into the code, create a JavaScript file (index.js) in the directory we created above, add the following code, and run it with node index.js:
// index.js
// Import chromium from Playwright module
import { chromium } from "playwright";

// Define a function to scrape quotes from a website
const scrapeData = async () => {
  // Launch a new chromium browser instance
  const browser = await chromium.launch({
    headless: false // Set to true to run in headless mode
  });
  // Open a new page in the browser
  const page = await browser.newPage();
  // Navigate to the URL of the website you want to scrape
  await page.goto("http://quotes.toscrape.com/");
  // Take a screenshot of the webpage
  await page.screenshot({ path: 'screenshot.png' });
  // Close the browser instance
  await browser.close();
};

// Call the scrapeData function to initiate the scraping process
scrapeData();
Here’s the code result:

This code takes a screenshot of a web page. The scrapeData() function launches a new Chromium browser instance with chromium.launch() and sets headless to false so that the browser window is visible while the script runs.
Next, the function creates a new page in the browser using browser.newPage(). It then passes the webpage URL to the page.goto() function to navigate there. The function then captures a screenshot of the page using the page.screenshot() function.
Finally, the function closes the browser instance by calling the browser.close() method.
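If page.goto() or page.screenshot() throws, the script above never reaches browser.close(). One way to guard against that is a try/finally wrapper, sketched below. The helper names buildLaunchOptions and withPage, and the HEADED environment variable, are assumptions made for this illustration and are not part of Playwright. The sketch also loads Playwright with a dynamic import() so the helpers can be defined without launching anything.

```javascript
// Hypothetical helper: derive launch options from an environment-like object.
// HEADED=1 is a convention invented for this sketch, not a Playwright setting.
function buildLaunchOptions(env = {}) {
  return { headless: env.HEADED !== "1" };
}

// Hypothetical helper: open a page, run `task` on it, and always close the browser.
async function withPage(url, task, env = process.env) {
  const { chromium } = await import("playwright");
  const browser = await chromium.launch(buildLaunchOptions(env));
  try {
    const page = await browser.newPage();
    await page.goto(url);
    return await task(page);
  } finally {
    // Runs even if goto() or task() throws, so no browser process is left behind.
    await browser.close();
  }
}
```

With this in place, the screenshot example could be written as withPage("http://quotes.toscrape.com/", (page) => page.screenshot({ path: "screenshot.png" })).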
Remember, if you want to use Playwright with a different browser, say Firefox, you need to install a wrapper library for it first. You can learn more here.
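Switching engines in code then comes down to picking a different launcher object. The sketch below assumes the playwright package is being used (it exports chromium, firefox, and webkit) and that the chosen engine has already been installed. The names SUPPORTED_BROWSERS, normalizeBrowserName, and launchBrowser are hypothetical helpers for this example.

```javascript
// The three launcher objects exported by the "playwright" package.
const SUPPORTED_BROWSERS = ["chromium", "firefox", "webkit"];

// Hypothetical helper: validate and normalize a user-supplied browser name.
function normalizeBrowserName(name = "chromium") {
  const normalized = String(name).trim().toLowerCase();
  if (!SUPPORTED_BROWSERS.includes(normalized)) {
    throw new Error(
      `Unsupported browser "${name}"; expected one of ${SUPPORTED_BROWSERS.join(", ")}`
    );
  }
  return normalized;
}

// Hypothetical helper: launch the requested engine with the given options.
async function launchBrowser(name, options = {}) {
  const playwright = await import("playwright");
  const browserType = playwright[normalizeBrowserName(name)];
  return browserType.launch(options);
}
```

For example, launchBrowser("firefox", { headless: true }) would resolve to a Firefox browser instance, while an unknown name fails early with a readable error.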
How To Scrape Pages With Playwright
Playwright is commonly used for web scraping. Let's scrape the first quote from the website. As shown in the image below, each quote sits in a parent element with the class quote, which contains child elements with classes such as text (class="text"), author (class="author"), and tags (class="tags").

In the code below, the querySelector() method first selects the first element on the page that matches the .quote class. We then call querySelector() on that element with .text, .author, and .tags to extract the quote text, its author, and its tags.
// Import chromium from Playwright module
import { chromium } from "playwright";

// Define a function to handle web scraping
const scrapeData = async () => {
  // Launch a new browser instance
  const browser = await chromium.launch({
    headless: false // Set to true to run in headless mode
  });
  // Create a new page in the browser
  const page = await browser.newPage();
  // Navigate to the target URL
  await page.goto("http://quotes.toscrape.com/");
  // Extract data from the web page
  const quotes = await page.evaluate(() => {
    // Use querySelector to select an element on the web page based on its CSS selector
    const quote = document.querySelector(".quote");
    // Extract the text of the quote
    const text = quote.querySelector(".text").innerText;
    // Extract the author of the quote
    const author = quote.querySelector(".author").innerText;
    // Extract the tags associated with the quote
    const tags = quote.querySelector(".tags").innerText;
    return { text, author, tags };
  });
  // Print the scraped data (quote, author, tags)
  console.log(quotes);
  // Close the browser instance
  await browser.close();
};

// Call the scrapeData function to initiate the scraping process
scrapeData();
Here’s the code result:
{
text: '“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”',
author: 'Albert Einstein',
tags: 'Tags: change deep-thoughts thinking world'
}
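The same selectors can be applied to every quote on the page rather than only the first one. The sketch below is one possible way to do that, using Playwright's locator.evaluateAll() to run a mapping function over all .quote elements. The function names mapQuoteNodes and scrapeAllQuotes are made up for this example, and the tag parsing assumes the "Tags: ..." text format shown in the output above.

```javascript
// Runs inside the browser page, so it must not reference anything defined outside itself.
function mapQuoteNodes(nodes) {
  // Return the trimmed text of a child element, or null if the element is missing.
  const textOf = (root, selector) => {
    const element = root.querySelector(selector);
    return element ? element.innerText.trim() : null;
  };

  return nodes.map((node) => {
    const rawTags = textOf(node, ".tags") || "";
    return {
      text: textOf(node, ".text"),
      author: textOf(node, ".author"),
      // Drop the leading "Tags:" label and split the rest on whitespace.
      tags: rawTags.replace(/^Tags:\s*/, "").split(/\s+/).filter(Boolean),
    };
  });
}

// Hypothetical helper: expects a Playwright page that has already navigated to the site.
async function scrapeAllQuotes(page) {
  // evaluateAll passes every element matching the locator to the callback as an array.
  return page.locator(".quote").evaluateAll(mapQuoteNodes);
}
```

Inside scrapeData(), the page.evaluate() call could then be replaced with const quotes = await scrapeAllQuotes(page);, which yields an array with one object per quote and the tags as an array of strings.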