Puppeteer Guide: How To Take Screenshots
Taking screenshots is a fundamental aspect of web scraping and testing with Puppeteer. Screenshots not only serve as a valuable tool for debugging and analysis but also document the state of a webpage at a specific point in time.
In this guide, we will focus on how to take screenshots of web pages using Puppeteer and we'll walk you through:
- How To Take Screenshots With Puppeteer
- Use Cases For Taking Screenshots of Web Pages
- How to Take Screenshot of the Full Page
- How to Take Screenshot of a Specific Area or ViewPort
- How to Take Screenshot of a Specific DOM Element
- How to Take Multiple Screenshots
- How to Manage Screenshot Quality?
- How to Manage Screenshot Resolution?
- Generate PDFs Of Web Pages
- Fixing Common Errors While Using page.screenshot() Method
- Conclusion
How To Take Screenshots With Puppeteer
To take a screenshot of a web page, you can use the page.screenshot() method. The screenshot() method accepts an object with different properties that control various aspects of the resulting image:
Property | Description | Default value |
---|---|---|
path | The file path to save the image. If no path is provided, the image won't be saved to the disk. | - |
type | Specifies the format of the screenshot. Can be png, jpeg, or webp. | png |
quality | Quality of the image, between 0-100. Not applicable to png images. | - |
fullPage | When true, takes screenshot of the full page. | false |
clip | Specifies the area of the page to clip. The area is specified by the x, y, width, and height parameters. | - |
omitBackground | Hides the default white background and allows capturing screenshots with transparency. | false |
These properties offer flexibility in capturing screenshots tailored to your specific needs, whether it's capturing the entire page, a specific region, adjusting image quality, or enabling transparency.
Before you start writing code, make sure that Node.js is installed on your machine. To install Puppeteer, run this command:
npm i puppeteer
Now, let's capture a screenshot of the home page of QuotesToScrape:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch({headless: true});
const page = await browser.newPage();
await page.goto('https://quotes.toscrape.com');
await page.screenshot({ path: 'example.png' });
await browser.close();
})()
In the code above, we created a simple Puppeteer script that automates a headless browser to navigate to a webpage, take a screenshot, and then close the browser.
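Because the path property is optional, a screenshot can also stay in memory instead of being written to disk. The following is a minimal sketch of that idea, assuming a Puppeteer page that has already navigated is passed in. The helper names captureBase64 and toDataUri are made up for illustration and are not part of Puppeteer; the sketch relies on the encoding: 'base64' option of page.screenshot().

```javascript
// Hypothetical helpers: the names are illustrative, not part of Puppeteer's API.

// Capture the current viewport as a base64 string without writing a file.
// `page` is assumed to be a Puppeteer Page that has already navigated.
async function captureBase64(page) {
  // With no `path`, nothing is saved to disk; `encoding: 'base64'`
  // makes screenshot() resolve to a string instead of binary data.
  return page.screenshot({ type: 'png', encoding: 'base64' });
}

// Wrap a base64 payload in a data URI, e.g. for embedding in an HTML report.
function toDataUri(base64, mimeType = 'image/png') {
  return `data:${mimeType};base64,${base64}`;
}

// Example usage inside an async function, after page.goto(...):
//   const base64 = await captureBase64(page);
//   const uri = toDataUri(base64);
```

Keeping the image in memory can be handy when the screenshot is uploaded or embedded right away and a file on disk would only be a temporary artifact.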
Use Cases For Taking Screenshots of Web Pages
Here are several use cases for capturing screenshots of web pages:
- Automated Testing
  - Visual Regression Testing: Capture screenshots to detect visual changes in a web page over time.
  - UI Testing: Verify that user interfaces display correctly and consistently.
  - Cross-Browser Testing: Ensure web pages look and function as expected across different browsers.
- Web Scraping and Monitoring
  - Data Extraction: Use screenshots to gather visual data from dynamic or interactive web content.
  - Content Monitoring: Monitor web pages for changes and capture screenshots to track content updates.
- Headless Browser Interactions
  - Capturing State Changes: Take screenshots during headless browser interactions to document different states.
  - Debugging: Use screenshots as part of the debugging process to visually inspect page behavior.
- Documentation and Reporting
  - Documentation: Include screenshots in documentation to provide visual context or step-by-step guides.
  - Reporting: Use screenshots for visual reporting, showcasing specific web page states or issues.
How to Take Screenshot of the Full Page
To take a screenshot of the full page, pass the fullPage: true option to the screenshot() method. Here is an example:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch({headless: true});
const page = await browser.newPage();
await page.goto('https://quotes.toscrape.com');
await page.screenshot({ path: 'full-page.png', fullPage: true });
await browser.close();
})()
How to Take Screenshot of a Specific Area or ViewPort
The viewport is the portion of a web page that a user can see in their browser window without scrolling. To take a screenshot at a specific viewport size, you first need to set the viewport size with the setViewport() method.
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
await page.setViewport({ width: 1920, height: 1080 });
await page.goto('https://quotes.toscrape.com');
await page.screenshot({ path: 'viewport.png' });
await browser.close();
})()
To capture a screenshot of a specific area within the viewport, pass the clip option. It takes an object with x, y, width, and height properties. You can use these properties to specify the exact area you want to capture.
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
await page.setViewport({ width: 1920, height: 1080 });
await page.goto('https://quotes.toscrape.com');
await page.screenshot({ path: 'clip.png', clip: {
x: 0,
y: 0,
width: 1920,
height: 400
}});
await browser.close();
})()
How to Take Screenshot of a Specific DOM Element
To take a screenshot of a specific element, you first need to select that element using the page.$() method, which takes a CSS selector as a parameter.
Once you have selected an element, you can call the screenshot() method on it, just like we did with the page. Let's capture a screenshot of the first quote from the QuotesToScrape website:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
await page.goto('https://quotes.toscrape.com');
const element = await page.$("div.quote:nth-child(1)");
await element.screenshot({ path: 'element.png' });
await browser.close();
})()
Avoid using the fullPage and clip options when using the screenshot() method on elements. Otherwise, you will get a "'clip' and 'fullPage' are exclusive" error.
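When you need some surrounding context around an element, one option is to compute a clip region from the element's bounding box and pass it to page.screenshot() rather than to the element's own screenshot() method. The sketch below assumes a Puppeteer page is passed in and that the page has not been scrolled, so the box returned by boundingBox() lines up with the page coordinates used by clip. The helper names padBox and screenshotWithPadding, and the default padding of 10 pixels, are illustrative assumptions.

```javascript
// Hypothetical helpers: names and the padding default are illustrative.

// Grow a bounding box by `padding` pixels on every side, without letting
// the top-left corner move above or to the left of the page origin.
function padBox(box, padding) {
  const x = Math.max(0, box.x - padding);
  const y = Math.max(0, box.y - padding);
  const right = box.x + box.width + padding;
  const bottom = box.y + box.height + padding;
  return { x, y, width: right - x, height: bottom - y };
}

// Screenshot an element plus some surrounding context.
// `page` is assumed to be a Puppeteer Page that has already navigated
// and has not been scrolled.
async function screenshotWithPadding(page, selector, path, padding = 10) {
  const element = await page.$(selector);
  if (!element) {
    // page.$() resolves to null when nothing matches the selector.
    throw new Error(`No element matches selector: ${selector}`);
  }
  const box = await element.boundingBox();
  if (!box) {
    // boundingBox() resolves to null when the element is not visible.
    throw new Error(`Element is not visible: ${selector}`);
  }
  await page.screenshot({ path, clip: padBox(box, padding) });
}

// Example usage inside an async function, after page.goto(...):
//   await screenshotWithPadding(page, 'div.quote', 'quote-padded.png', 20);
```

The explicit null checks also turn a missing selector into a readable error message instead of a TypeError from calling a method on null.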
How to Take Multiple Screenshots
You might want to capture multiple screenshots of different web pages for comparison or documentation purposes. To do this, create an array containing the URLs of the desired web pages, then loop over the array, visiting each page and capturing a screenshot.
We will put all the screenshots in a folder named "screenshots".
mkdir screenshots
Here is an example that captures the home pages of several websites with a single Puppeteer script, so you can compare how they appear on screen:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
await page.setViewport({ width: 1920, height: 1080 });
const urls = [
'https://www.google.com',
'https://www.bing.com',
'https://www.opera.com'
];
let id = 1;
for (const url of urls) {
await page.goto(url);
await page.screenshot({ path: `screenshots/IMG_0${id++}.png` });
}
await browser.close();
})()
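The loop above stops at the first page that fails to load. As a rough sketch of a more forgiving variant, the helper below names each file after the URL's hostname, creates the output folder itself, and collects failures instead of aborting. It assumes a Puppeteer page is passed in; fileNameForUrl and captureAll are hypothetical names chosen for this example.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: derive a file name from the URL's hostname,
// e.g. 'https://www.bing.com' becomes 'www.bing.com.png'.
function fileNameForUrl(url) {
  return `${new URL(url).hostname}.png`;
}

// Visit each URL in turn and save one screenshot per page into `dir`.
// Failures are collected and returned instead of aborting the whole run.
// `page` is assumed to be a Puppeteer Page.
async function captureAll(page, urls, dir = 'screenshots') {
  // Create the folder if needed (same effect as `mkdir screenshots`).
  fs.mkdirSync(dir, { recursive: true });
  const failed = [];
  for (const url of urls) {
    try {
      await page.goto(url);
      await page.screenshot({ path: path.join(dir, fileNameForUrl(url)) });
    } catch (err) {
      failed.push({ url, message: err.message });
    }
  }
  return failed;
}

// Example usage inside an async function:
//   const failed = await captureAll(page, ['https://www.google.com', 'https://www.bing.com']);
//   if (failed.length > 0) console.warn('Some pages failed:', failed);
```

Naming files after the hostname makes it easier to tell the screenshots apart than a running counter, at the cost of overwriting earlier files when two URLs share a hostname.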
How to Manage the Screenshot Quality?
To adjust the quality of the screenshot, you can use the quality option in the screenshot() method. You can specify a value between 0 and 100, where 100 is the highest quality. The option applies only to jpeg and webp images, not to png, so the example saves a JPEG:
await page.screenshot({ path: 'screenshot.jpg', type: 'jpeg', quality: 50 });
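As the options table notes, quality is not applicable to png images, so it only makes sense together with the jpeg or webp type. One way to guard against mixing them up is a small helper that builds the options object from the file name. This is only a sketch; lossyScreenshotOptions is a hypothetical name and the validation rules are this example's own assumptions, not Puppeteer behavior.

```javascript
// Hypothetical helper: builds screenshot options for a lossy format and
// refuses other extensions, since `quality` does not apply to PNG images.
function lossyScreenshotOptions(filePath, quality) {
  const ext = filePath.split('.').pop().toLowerCase();
  let type;
  if (ext === 'jpg' || ext === 'jpeg') {
    type = 'jpeg';
  } else if (ext === 'webp') {
    type = 'webp';
  } else {
    throw new Error(`quality is only supported for jpeg and webp, got .${ext}`);
  }
  if (!Number.isInteger(quality) || quality < 0 || quality > 100) {
    throw new RangeError('quality must be an integer between 0 and 100');
  }
  return { path: filePath, type, quality };
}

// Example usage inside an async function, after page.goto(...):
//   await page.screenshot(lossyScreenshotOptions('screenshot.jpg', 50));
```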
How to Manage the Screenshot Resolution?
The pixel dimensions of a screenshot in Puppeteer are determined by the viewport size. If you need a larger screenshot, set the viewport to a larger width and height. This simulates a larger screen and captures more pixels. To render the same layout at a higher pixel density instead, setViewport() also accepts a deviceScaleFactor option.
await page.setViewport({ width: 1920, height: 1080 });
await page.screenshot({ path: 'screenshot.png' });
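A larger viewport changes the layout as well as the pixel count. If the goal is a sharper image of the same layout, the deviceScaleFactor option of setViewport() can be used instead. The sketch below assumes a Puppeteer page is passed in; screenshotPixelSize and captureHiDpi are hypothetical helper names, and the default of 2 is just an example value.

```javascript
// Hypothetical helper: pixel dimensions of a viewport screenshot for a
// given viewport configuration (CSS size multiplied by the scale factor).
function screenshotPixelSize({ width, height, deviceScaleFactor = 1 }) {
  return { width: width * deviceScaleFactor, height: height * deviceScaleFactor };
}

// Set a viewport with a device scale factor and capture it.
// `page` is assumed to be a Puppeteer Page.
async function captureHiDpi(
  page,
  path,
  viewport = { width: 1920, height: 1080, deviceScaleFactor: 2 }
) {
  await page.setViewport(viewport);
  await page.screenshot({ path });
  // Returned for convenience so callers know the expected image size.
  return screenshotPixelSize(viewport);
}

// Example usage inside an async function:
//   const size = await captureHiDpi(page, 'hidpi.png');
//   console.log(size.width, size.height);
```

With the defaults used here, the page is laid out as on a 1920x1080 screen while the saved image is 3840x2160 pixels.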
Generate PDFs Of Web Pages
While both PDFs and screenshots serve as visual representations of web content, there are specific scenarios where opting for PDFs is more advantageous:
- Textual Content and Readability: PDFs maintain the integrity of textual content, ensuring that it remains selectable and searchable. This is particularly useful when preserving the readability of textual information is crucial.
- Print-Friendly Format: PDFs are inherently designed for print, making them an ideal choice when users may need to print or share documents in a format that retains the intended layout and structure.
- Multi-Page Content: When web pages contain lengthy or multi-page content, generating a PDF provides a more concise and organized way to capture the entire content flow.
- Interactive Elements: PDFs can embed interactive elements, such as hyperlinks and form fields, providing a more immersive and functional representation of web content.
PDF generation is accomplished using the page.pdf() method. This method accepts an object with options that control margins, page ranges, paper format, and more. You can explore these options in the Puppeteer documentation for page.pdf(). Here is an example that generates a PDF of the QuotesToScrape website in "A4" format:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch({headless: true});
const page = await browser.newPage();
await page.goto('https://quotes.toscrape.com');
await page.pdf({ path: 'file.pdf', format: 'A4' });
await browser.close();
})();
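Beyond format, page.pdf() accepts options such as margin, printBackground, and landscape. The sketch below shows one possible combination, assuming a Puppeteer page is passed in. The helper names pdfOptions and savePdf and the 1cm margins are illustrative choices, not recommendations from the Puppeteer documentation.

```javascript
// Hypothetical helper: one possible set of PDF options.
function pdfOptions(path) {
  return {
    path,
    format: 'A4',
    landscape: false,
    // Include CSS background colors and images, which are omitted by default.
    printBackground: true,
    // Margins accept values with units, such as '1cm' or '10mm'.
    margin: { top: '1cm', right: '1cm', bottom: '1cm', left: '1cm' },
  };
}

// Save the current page as a PDF using the options above.
// `page` is assumed to be a Puppeteer Page that has already navigated.
async function savePdf(page, path) {
  await page.pdf(pdfOptions(path));
}

// Example usage inside an async function, after page.goto(...):
//   await savePdf(page, 'file.pdf');
```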
Fixing Common Errors While Using page.screenshot() Method
Let's explore some common errors that might occur while taking screenshots with Puppeteer's screenshot() method and understand how to fix them.
- Invalid Selector Error: When you're capturing a screenshot of an element and the selector does not match any element on the page, page.$() returns null, and calling screenshot() on that null value throws an error. To fix this, inspect the page in the browser's developer tools and make sure that the selector is correct and matches an element on the page.
- Timeout Error: If the page takes too long to load, Puppeteer may throw a timeout error. The default navigation timeout is 30 seconds. You can increase the limit by passing a timeout option, in milliseconds, to the page.goto() method. Here is how to set the timeout to 60 seconds:
await page.goto('https://quotes.toscrape.com', { timeout: 60000 });
- File Path Issues: If you're having trouble saving the screenshot to a file, make sure that the file path is correct and that you have write permissions to the directory. You can also try using an absolute file path instead of a relative file path.
- Viewport Not Set Error: Puppeteer uses a default viewport of 800x600 unless you change it. If a full-page screenshot fails with a "viewport not set" error, or comes out at an unexpected width, set the viewport size explicitly using the setViewport() method. For example, to set the viewport size to 1920x1080, you can use the following code:
await page.setViewport({ width: 1920, height: 1080 });
await page.screenshot({ path: 'screenshot.png', fullPage: true });
- Resource Load Issues: If the screenshot is blank or incomplete, it may be because the page contains dynamic content that takes time to load. To fix this, you can pass the waitUntil: 'networkidle0' option to the page.goto() method. networkidle0 considers navigation to be finished when there have been no network connections for at least 500 milliseconds.
await page.goto(url, { waitUntil: 'networkidle0' });
await page.screenshot({ path: 'screenshot.png' });
- Image Not Loading: You can use the page.waitForSelector() method to wait for a specific element to appear on the page before capturing a screenshot. Here's an example that waits for an image to become visible before taking the screenshot:
await page.waitForSelector('img', {
visible: true
});
await page.screenshot({ path: 'screenshot.png' });
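The visible option makes waitForSelector() wait until the first matching image is present in the DOM and visible. It does not by itself guarantee that every image on the page has finished loading.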
- Slow Screenshot Times: If the screenshot is taking too long to capture, you can try reducing its quality with the quality option, which requires a jpeg or webp image. For example, you can reduce the quality of the image to 50%.
await page.screenshot({ path: 'screenshot.jpg', type: 'jpeg', quality: 50 });
Alternatively, you can try reducing the size of the viewport to capture fewer pixels. Try setting the viewport to a smaller size like 800x600.
await page.setViewport({ width: 800, height: 600 });
await page.screenshot({ path: 'screenshot.png' });
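For the image loading case, waiting for a single img element does not cover pages with many images. A possible alternative, sketched below, is to wait until every image currently in the document reports complete by using page.waitForFunction(). The sketch assumes a Puppeteer page is passed in; waitForImages and screenshotWhenImagesLoaded are hypothetical names, and the 30 second default timeout is an arbitrary choice for this example.

```javascript
// Hypothetical helper: wait until every <img> currently in the document
// has finished loading. `page` is assumed to be a Puppeteer Page.
async function waitForImages(page, timeout = 30000) {
  await page.waitForFunction(
    // Runs inside the browser. `complete` is also true for images that
    // failed to load, so a broken image does not block the wait.
    () => Array.from(document.images).every((img) => img.complete),
    { timeout }
  );
}

// Wait for the images, then capture the screenshot.
async function screenshotWhenImagesLoaded(page, path) {
  await waitForImages(page);
  await page.screenshot({ path });
}

// Example usage inside an async function, after page.goto(...):
//   await screenshotWhenImagesLoaded(page, 'screenshot.png');
```

Images that are lazy-loaded and still off screen may not report complete until they are scrolled into view, so this approach is best suited to pages that load their images eagerly.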
Conclusion
In this guide, we explored how to capture web page screenshots efficiently. Puppeteer proves versatile, from basic usage to advanced techniques like capturing full-page and specific element screenshots.
We highlighted the use cases of screenshots and PDFs in testing, monitoring, and documentation. The guide concludes with some tips to resolve most common errors that might occur while taking screenshots, ensuring a smooth experience. Mastering Puppeteer's screenshot and PDF generating capabilities enhances web development workflows and user communication.
More Web Scraping Tutorials
If you would like to learn more about Web Scraping with Puppeteer, then be sure to check out The Puppeteer Web Scraping Playbook.