Playwright Guide: Downloading A File
Downloading files is a common requirement in many web scraping scenarios, and being able to handle file downloads effectively is crucial for successful scraping processes.
This is an in-depth tutorial that covers everything from finding elements on the page to finding URLs with Playwright and downloading a ton of them with Axios.
- TLDR: How to Download A File Using Playwright
- Why Automated File Downloads Matter
- Understanding File Downloads
- Downloading a File
- Setting Custom Download Behaviour
- Handling File Download Dialogs
- Downloading Multiple Files
- Downloading All Images From a Webpage
- Monitoring Download Progress
- Verifying File Download
- Downloading Very Large Files with Playwright
- Handling Different File Types
- Conclusion
- More Playwright Web Scraping Guides
TLDR: How to Download A File Using Playwright
Today, we're going to learn how to download files using Playwright. There is actually more than one way to do this, and we're going to explore them all with an excruciating level of detail.
Don't have time for the whole thing? Take a look at this script:
const playwright = require("playwright");
const path = require("path");
async function main() {
//create a browser object
const browser = await playwright.chromium.launch();
//set a new context for the browser
const context = await browser.newContext({
acceptDownloads: true
});
//open a new page with our custom context
const page = await context.newPage();
//navigate to the site
await page.goto("https://nakamotoinstitute.org/library/bitcoin")
//xpath variable
const xPath = "/html/body/main/article/div/a[2]"
//create a promise that resolves once a download event has taken place
const downloadPromise = page.waitForEvent("download");
//locate the link by its xpath
const link = page.locator(`xpath=${xPath}`);
//const link = page.locator(`href="https://nakamotoinstitute.org/library/bitcoin"`)
//click the link
await link.click();
//create a download object out of our promise
const download = await downloadPromise;
//create a path to the file
const filePath = path.join(process.cwd(), download.suggestedFilename());
//save the file at the specified path
await download.saveAs(filePath);
//close the browser
await browser.close();
}
main();
When downloading a file, you need to ensure that you take the following steps:
- Create a Promise object using page.waitForEvent("download").
- Begin your download. The example above uses click() to do so, but you can also do this with page.goto().
- Create a download object from your Promise.
- Save your download at a specified path with download.saveAs().
Why Automated File Downloads Matter
Why would anybody want to automate file downloads?
Actually there are many reasons to automate your downloads. The list below outlines some of the reasons you may wish to automate a file download.
- Web Scraping for Content Aggregation: Quite often, data collection is the main reason that we'd need to scrape a site. If we need to aggregate relevant content, automating the download is much faster and more efficient than manually downloading all the content.
- Social Media Data Collection: Companies routinely scrape the social media posts of their competitors. Scraping in this scenario makes it quick and easy to build a summary of what a competitor is doing.
- Email Attachment Automation: In larger companies especially, it is very important to back up data. What better way to back up email attachments than to build a bot that does this in seconds?
- Continuous Integration and Deployment (CI/CD): Owners of large download sites that change often may wish to create a suite of tests that checks all the downloads. Automated testing is great because when you change a large piece of software (your website), you know immediately if your changes break anything, and if your tests are written correctly, you'll know where things are breaking.
Understanding File Downloads
To understand file downloads, first we need a basic understanding of HTTP requests. There are four main requests that you should be aware of in web development:
- GET: gets information from a server
- POST: posts information to a server.
- PUT: modifies existing content on a server.
- DELETE: deletes content from a server.
In this article, we're really only going to focus on GET requests. GET is very self-explanatory. You send a GET request whenever you wish to get content. When you go to a website, your browser sends a GET request to the domain. After receiving the request, the server finds the required information (in this case an HTML file), and sends it back to you.
The same happens when we download a file. The main difference in the process:
- When we're viewing a site: Our browser reads the file and displays the HTML content to us as a website.
- When we're downloading: Instead of being read and displayed by the browser, our file is saved to the local machine to be viewed later.
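To make the "GET, then save" idea concrete, here is a minimal sketch that uses only Node's built-in http and fs modules, without Playwright. The helper name saveWithGet() is hypothetical and not part of any library. It assumes a plain HTTP url; for HTTPS, the built-in https module offers the same get() call.

```javascript
const http = require("http");
const fs = require("fs");

// Hypothetical helper: send a GET request and save the response body to disk
function saveWithGet(url, filePath) {
  return new Promise((resolve, reject) => {
    http.get(url, (response) => {
      // anything other than 200 OK is treated as a failed download
      if (response.statusCode !== 200) {
        response.resume(); // discard the body so the socket is released
        reject(new Error(`Unexpected status code: ${response.statusCode}`));
        return;
      }
      // instead of displaying the content, write it to the local machine
      const writer = fs.createWriteStream(filePath);
      response.pipe(writer);
      writer.on("finish", () => resolve(filePath));
      writer.on("error", reject);
    }).on("error", reject);
  });
}
```

The only difference between "viewing" and "downloading" in this sketch is what we do with the response body: a browser would render it, while here it is piped straight into a file.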
Downloading a File in Playwright
Downloading a file is actually quite simple. There are only a few steps we need to follow:
- Find the download button or download url.
- Start the download by either clicking the button or sending a GET request to the url.
- Wait for the download to complete.
- Verify that the download took place.
There are numerous other things we can do in order to tweak and customize this process, but the steps listed above are the most important. We always need to find, start, wait, and verify.
Locating the Download Link or Button
As mentioned above, our first step is to find the download. Using Playwright, we can find pretty much anything using its ID, class, or XPath. The code below locates the download link by its XPath, then reads its text and its href:
const playwright = require("playwright");
async function main() {
//create a browser object
const browser = await playwright.chromium.launch();
//set a new context for the browser
const context = await browser.newContext({
acceptDownloads: true
});
//open a new page with our custom context
const page = await context.newPage();
//navigate to the site
await page.goto("https://nakamotoinstitute.org/library/bitcoin/");
//xpath variable
const xPath = ("/html/body/main/article/div/a[2]");
//locate the link by its xpath
const link = page.locator(`xpath=${xPath}`);
//await the link's text content
const downloadText = await link.textContent();
//log the link's text content
console.log("Link download text:", downloadText);
//get the href
const href = await link.getAttribute("href");
//log the href
console.log("Actual link:", href);
//close the browser
await browser.close();
}
main();
In the code above, we:
- Import Playwright with require()
- Create an async function, main()
- Open a browser with playwright.chromium.launch()
- Allow our browser to download files with browser.newContext()
- Create a page object with context.newPage()
- Navigate to the site with page.goto()
- Find the link by its xpath with page.locator(`xpath=${xPath}`) and log its text to the console
- Use link.getAttribute("href") to get the actual href of the link and print it to the console
- Close the browser with browser.close()
If you would like to get more information about how to select/locate elements with CSS selectors, check our How To Find Elements by CSS Selector Playwright Guide.
Clicking the Download Link or Button
Now that we know how to find elements on the page, we need to find and click the download button. Let's create some code that finds the download link and clicks on it with click(). While this example is similar to our first one, pay attention to the differences. They're very important.
const playwright = require("playwright");
const path = require("path");
async function main() {
//create a browser object
const browser = await playwright.chromium.launch();
//set a new context for the browser
const context = await browser.newContext({
acceptDownloads: true
});
//open a new page with our custom context
const page = await context.newPage();
//navigate to the site
await page.goto("https://nakamotoinstitute.org/library/bitcoin")
//xpath variable
const xPath = "/html/body/main/article/div/a[2]"
//create a promise that resolves once a download event has taken place
const downloadPromise = page.waitForEvent("download");
//locate the link by its xpath
const link = page.locator(`xpath=${xPath}`);
//const link = page.locator(`href="https://nakamotoinstitute.org/library/bitcoin"`)
//click the link
await link.click();
//create a download object out of our promise
const download = await downloadPromise;
//create a path to the file
const filePath = path.join(process.cwd(), download.suggestedFilename());
//save the file at the specified path
await download.saveAs(filePath);
//close the browser
await browser.close();
}
main();
Key differences from our first example:
- const downloadPromise = page.waitForEvent("download"); creates a Promise object that cannot resolve until a "download" event has taken place.
- const download = await downloadPromise; creates a download object out of our Promise object.
- const filePath = path.join(process.cwd(), download.suggestedFilename()); joins our current working directory (cwd) and the suggested filename of the download object to create a path for our new file.
- await download.saveAs(filePath); saves the file at the path that we just created.
Setting a Custom Download Behaviour
If you followed the example above, you've already set a custom download behaviour. Setting custom behaviours consists of the following steps:
- Set the download path: We build the path on our filesystem where the file should be saved. We'll use this later on.
- Create an event listener: If you recall from the previous example, we used page.waitForEvent("download") to create an event listener. This creates a Promise which resolves once the event has taken place.
- Manage the completion of the download: await download.saveAs(filePath); tells Playwright to await the download object and save it at the path that we created in Step 1.
Let's take our previous example and expand upon it a bit more.
const playwright = require("playwright");
const path = require("path");
const fs = require("fs");
async function main() {
//create a browser object
const browser = await playwright.chromium.launch();
//get the current directory
const currentFolder = process.cwd();
//name of the downloads folder
const downloadsFolder = "./pdfDownloads";
//if the downloads folder doesn't exist, create it
if (!fs.existsSync(downloadsFolder)) {
fs.mkdirSync(downloadsFolder)
}
//path for our downloads
const downloadsPath = path.join(currentFolder, downloadsFolder);
//set a new context for the browser
const context = await browser.newContext({
acceptDownloads: true,
});
//open a new page with our custom context
const page = await context.newPage();
//navigate to the site
await page.goto("https://nakamotoinstitute.org/library/bitcoin/");
//xpath variable
const xPath = "/html/body/main/article/div/a[2]";
//create a promise that resolves once a download event has taken place
const downloadPromise = page.waitForEvent("download");
//xpath of the download button
await page.locator(`xpath=${xPath}`).click();
//create a download object from the promise
const download = await downloadPromise;
//save the file to our new custom pdfDownloads folder
await download.saveAs(path.join(downloadsPath, "CUSTOM-BITCOIN.pdf"));
//close the browser
await browser.close();
}
main();
Here are some of the important differences you should notice with this script:
- require("fs") imports the filesystem module.
- if (!fs.existsSync(downloadsFolder)) tells NodeJS to check whether the folder named by our downloadsFolder variable exists.
- If it doesn't exist, we create it with fs.mkdirSync(downloadsFolder).
- We then make a new path by combining our current folder and the downloadsFolder with path.join().
- We then create another path when saving our file by joining downloadsPath with "CUSTOM-BITCOIN.pdf". This path gets passed into download.saveAs() to tell Playwright to save our download into this new custom folder with the new custom name.
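The folder check and the path building can be folded into one small function. The sketch below is only an illustration: buildDownloadPath() is a hypothetical helper name, and it assumes you want every download to land directly inside one folder.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: make sure the folder exists, then return a full file path
function buildDownloadPath(folder, fileName) {
  // { recursive: true } creates missing parent folders and does nothing if the folder already exists
  fs.mkdirSync(folder, { recursive: true });
  // path.basename() drops any directory parts, so the file stays inside `folder`
  return path.join(folder, path.basename(fileName));
}
```

With a helper like this, the save step could be written as await download.saveAs(buildDownloadPath("./pdfDownloads", download.suggestedFilename())), or with a custom name such as "CUSTOM-BITCOIN.pdf" in place of the suggested one.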
Handling File Download Dialogs
Sometimes when handling a file download, you may need to click pop-ups or close them out. The example below shows you how to handle a pop-up with Playwright. We simply locate the element as we did earlier when learning how to locate elements, then we click the button.
const playwright = require("playwright");
async function main() {
//create a browser object
const browser = await playwright.chromium.launch();
//set a new context for the browser
const context = await browser.newContext({
acceptDownloads: true,
});
//open a new page with our custom context
const page = await context.newPage();
//navigate to the site
await page.goto("https://designsystem.digital.gov/components/modal/");
//xpath to launch the modal
const xPathModalLauncher = ("/html/body/div[2]/div/main/div/div[2]/div/div[1]/a");
//click to launch the modal
await page.locator(`xpath=${xPathModalLauncher}`).click();
//take a screenshot while the modal is open
await page.screenshot({ path:"before-accepting.png" });
//xpath to the "continue without saving" button
const xPathModalCloser = "/html/body/div[4]/div/div/div/div/div[2]/ul/li[1]/button"
//close the modal
await page.locator(`xpath=${xPathModalCloser}`).click()
//take a screenshot after clicking the "continue without saving" button
await page.screenshot({ path: "after-accepting.png" });
//close the browser
await browser.close();
}
main();
Notice the similarities between this example and our first:
- We find the button to launch the pop-up using page.locator().
- We also find the "continue without saving" button using page.locator().
- We then click() on the button to take care of the pop-up.
Here is the screenshot before clicking the button:
Here is the screenshot after the button has been clicked:
Downloading Multiple Files
Perhaps the most important reason to download with Playwright: we can download a lot of stuff...fast. In the example below, we navigate to a page of sample PDF documents and attempt to download every PDF file linked from it.
const { chromium } = require('playwright');
const path = require('path');
async function downloadFile(context, linkHref) {
//open a new page
const page = await context.newPage();
try {
const [download] = await Promise.all([
//create a download listener
page.waitForEvent("download"),
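//go to the download url; goto() can reject when the navigation turns into a download, so that error is ignored
page.goto(linkHref).catch(() => {}),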
]);
const downloadPath = path.join(process.cwd(), download.suggestedFilename());
await download.saveAs(downloadPath);
console.log(`Downloaded file saved to ${downloadPath}`);
} catch (err) {
console.log(`Failed to download from ${linkHref}: ${err}`);
} finally {
//close the page so open tabs don't pile up
await page.close();
}
}
async function main() {
//open a new browser
const browser = await chromium.launch();
//create a new context that accepts downloads
const context = await browser.newContext({
acceptDownloads: true,
});
//open a new page
const page = await context.newPage();
//base url
const url = "https://file-examples.com/index.php/sample-documents-download/sample-pdf-download/";
//navigate to the url
await page.goto(url);
//find every pdf link on the page
const listOfLinks = await page.$$eval("a[href$='.pdf']", links => links.map(link => link.href));
//create an array of async download promises
const downloadPromises = listOfLinks.map(linkHref => downloadFile(context, linkHref));
//await all the promises
await Promise.all(downloadPromises);
//close the browser
await browser.close();
}
main();
Key differences you should notice in the code above:
- downloadFile() is an async function we use to download files.
- const downloadPromises = listOfLinks.map(linkHref => downloadFile(context, linkHref)) runs downloadFile() on each of the links in the list.
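Mapping over every link starts all of the downloads at once, which can mean a lot of open pages on a long list. If that becomes a problem, one option is to cap how many run at the same time. The sketch below is a plain JavaScript illustration with no Playwright calls in it; mapWithLimit() is a hypothetical helper name, and the limit you choose is up to you.

```javascript
// Hypothetical helper: run `worker` over `items`, with at most `limit` running at once
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let nextIndex = 0;

  // each runner keeps pulling the next unclaimed item until none are left
  async function runner() {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      results[index] = await worker(items[index], index);
    }
  }

  // start up to `limit` runners and wait for all of them to finish
  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

In the script above, this could replace the map() and Promise.all() pair with something like await mapWithLimit(listOfLinks, 3, linkHref => downloadFile(context, linkHref)). Results come back in the same order as the input list.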
Downloading All the Images from a WebPage
Now that we've downloaded all the PDFs from a site, let's take this a step further. With the help of axios, we're going to download all images from a website asynchronously. In the code below, we create a function that asynchronously downloads an image. Then we take it a notch further and create an array of Promise objects, one per image, and wait until all of them have settled.
const { chromium } = require('playwright');
const axios = require('axios');
const fs = require('fs');
const path = require('path');
//if we don't have a download folder, create one
const downloadDir = path.join(__dirname, 'downloads');
if (!fs.existsSync(downloadDir)) {
fs.mkdirSync(downloadDir, { recursive: true });
}
//function to download an image
async function downloadImage(url, filepath) {
//get the image as a stream
const response = await axios({
url,
method: 'GET',
responseType: 'stream',
});
//write the stream
const writer = fs.createWriteStream(filepath);
//pipe the writer object
response.data.pipe(writer);
//return a new Promise object
return new Promise((resolve, reject) => {
//if we succeed, resolve the Promise
writer.on("finish", () => {
writer.close(); // Close the stream
resolve();
});
//if we fail, reject the Promise
writer.on("error", reject);
});
}
//main function
async function main() {
//launch chromium
const browser = await chromium.launch();
//open a new page
const page = await browser.newPage();
//navigate to the site
await page.goto('https://www.unsplash.com');
//get all image urls on the page, skipping empty and non-http(s) sources such as data: URIs
const imageUrls = await page.$$eval('img', images => images.map(img => img.src).filter(src => src.startsWith('http')));
//create a Promise for each image in our list
const downloadPromises = imageUrls.map(url => {
//name of the image
const imageName = path.basename(new URL(url).pathname);
//path to the image
const filepath = path.join(downloadDir, `${imageName}.png`);
//return the downloaded image and log it to the console
return downloadImage(url, filepath).then(() => {
console.log(`Downloaded image: ${imageName}`);
//if we encounter an error, log it to the console
}).catch(error => {
console.error(`Error downloading ${imageName}:`, error);
});
});
//wait for ALL downloadPromises to resolve or reject
await Promise.all(downloadPromises);
//close the browser
await browser.close();
}
main();
In the code above, we:
- Create a function to download images asynchronously, downloadImage().
- After finding the url of each image (similar to how we found all the PDF links), we create an array, downloadPromises, which contains a Promise for each image we'd like to download.
- When downloading each image, we pass the url and our expected filepath into the download function. If the image downloads, it is saved with a .png extension inside of our downloads folder.
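The script above appends .png to every image name, so a url ending in photo.jpg would be saved as photo.jpg.png. If you'd rather keep an extension that is already in the url, a small helper like the one sketched here could be used instead. The name fileNameFromUrl() and the "download" fallback name are assumptions made for this illustration.

```javascript
const path = require("path");

// Hypothetical helper: derive a file name from a url, adding an extension only when one is missing
function fileNameFromUrl(url, fallbackExt = ".png") {
  // url paths always use forward slashes, so use the posix flavour of path
  const baseName = path.posix.basename(new URL(url).pathname) || "download";
  // keep an existing extension; otherwise append the fallback
  return path.posix.extname(baseName) ? baseName : `${baseName}${fallbackExt}`;
}

console.log(fileNameFromUrl("https://example.com/images/cat.jpg?w=200")); // cat.jpg
console.log(fileNameFromUrl("https://example.com/photo-123")); // photo-123.png
```

Note that this only looks at the url. It does not inspect the file contents, so a server that serves a JPEG from an extension-less url would still end up with the fallback extension.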
Monitoring Download Progress
In the example below, we once again create a function to download a file using axios. When we create a download stream, we receive our data in chunks. If we know the total size of the stream, we can update our progress each time we receive a new chunk. The code below does exactly that.
const axios = require("axios");
const { chromium } = require("playwright");
const fs = require("fs");
const path = require("path");
//function to download a file and log the progress
async function downloadFileWithProgress(url, outputPath) {
//make a get request
const response = await axios({
url,
method: "GET",
responseType: "stream",
});
//save the content length
const totalLength = response.headers["content-length"];
//start at zero
let receivedLength = 0;
//when we receive a chunk
response.data.on("data", (chunk) => {
//add the chunk to our received length
receivedLength += chunk.length;
//log the current received length and the percentage of the total
console.log(`Received ${receivedLength} of ${totalLength} bytes (${((receivedLength / totalLength) * 100).toFixed(2)}%)`);
});
//create a write stream
const writer = fs.createWriteStream(outputPath);
//pipe the output
response.data.pipe(writer);
//return a promise
return new Promise((resolve, reject) => {
writer.on("finish", resolve);
writer.on("error", reject);
});
}
//main function
async function main() {
//launch chromium
const browser = await chromium.launch();
//open a new page
const page = await browser.newPage();
//go to the site
await page.goto("https://bitcoin.org/");
//url of the file we'd like to download
const fileUrl = "https://bitcoin.org/bitcoin.pdf";
//path for the download
const downloadPath = path.join(__dirname, "bitcoin.pdf");
//try to download the file
try {
await downloadFileWithProgress(fileUrl, downloadPath);
console.log("Download completed");
//log any errors
} catch (error) {
console.error("Download failed:", error);
}
//close the browser
await browser.close();
}
main();
In the code example above, we follow the same basic structure as our example that downloads images. The major differences here lie in the downloadFileWithProgress() function. Key points in this function:
- We make a GET request to the download url and set responseType to "stream".
- We find the content-length from the headers. This tells us the full size of the file. We save this as a variable, totalLength.
- We create a counter, receivedLength, and start it at zero.
- Each time we receive a data chunk, we update receivedLength and log our download progress to the console.
- As the chunks arrive, they are piped into a write stream that writes the file to outputPath.
- If everything runs properly, we resolve the Promise. If the writer returns an error, we reject it.
From inside the main function, we call it and get an output that looks like this:
As you can see from the screenshot, each time we received a chunk, our progress was logged to the console. After we finished the download, we log "Download completed".
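One thing to watch for: not every server sends a content-length header, and in that case the percentage in the log line would come out as NaN. The sketch below shows one way to guard against that. formatProgress() is a hypothetical helper, not something from axios, and the wording of the fallback message is just an example.

```javascript
// Hypothetical helper: build the progress line, coping with a missing content-length header
function formatProgress(receivedLength, contentLengthHeader) {
  // headers arrive as strings (or undefined), so convert before doing math
  const totalLength = Number.parseInt(contentLengthHeader, 10);
  if (!Number.isFinite(totalLength) || totalLength <= 0) {
    return `Received ${receivedLength} bytes (total size unknown)`;
  }
  const percent = ((receivedLength / totalLength) * 100).toFixed(2);
  return `Received ${receivedLength} of ${totalLength} bytes (${percent}%)`;
}

console.log(formatProgress(512, "2048")); // Received 512 of 2048 bytes (25.00%)
console.log(formatProgress(512, undefined)); // Received 512 bytes (total size unknown)
```

Inside the "data" handler, the console.log() call could then become console.log(formatProgress(receivedLength, response.headers["content-length"])).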
Verifying File Download
There are multiple ways to ensure that your download took place. We can either do this manually with the File Explorer or automate this process with code. In this section we'll take a look at how to do both. Now let's verify the download from our previous example.
Verify With Your File Explorer
To check with your file explorer, simply open the folder where you downloaded the file. If your download succeeded, you should be able to open the file.
Step 1: Open the Folder
Take a look at the screenshot below. The downloaded file is highlighted.
Step 2: Open the File
Next, we open our file. If you downloaded a PDF, open it with a PDF viewer; if you downloaded an image, open it with an image viewer, and so on.
Verify Using Your Code
Now let's take a look at how we can verify this with our code. The code below is largely the same as our previous example, but at the end, we double check to ensure that the file exists.
const axios = require('axios');
const { chromium } = require('playwright');
const fs = require('fs');
const path = require('path');
//function to download a file and log the progress
async function downloadFileWithProgress(url, outputPath) {
//make a get request
const response = await axios({
url,
method: 'GET',
responseType: 'stream',
});
//save the content length
const totalLength = response.headers['content-length'];
//start at zero
let receivedLength = 0;
//when we receive a chunk
response.data.on('data', (chunk) => {
//add the chunk to our received length
receivedLength += chunk.length;
//log the current received length and the percentage of the total
console.log(`Received ${receivedLength} of ${totalLength} bytes (${((receivedLength / totalLength) * 100).toFixed(2)}%)`);
});
//create a write stream
const writer = fs.createWriteStream(outputPath);
//pipe the output
response.data.pipe(writer);
//return a promise
return new Promise((resolve, reject) => {
writer.on('finish', resolve);
writer.on('error', reject);
});
}
//main function
async function main() {
//launch chromium
const browser = await chromium.launch();
//open a new page
const page = await browser.newPage();
//go to the site
await page.goto("https://bitcoin.org/");
//url of the file we'd like to download
const fileUrl = "https://bitcoin.org/bitcoin.pdf";
//path for the download
const downloadPath = path.join(__dirname, "bitcoin.pdf");
//try to download the file
try {
await downloadFileWithProgress(fileUrl, downloadPath);
console.log("Download completed");
//log any errors
} catch (error) {
console.error('Download failed:', error);
}
//check if the new file exists
const fileExists = fs.existsSync(downloadPath);
//log the result of the check
if (fileExists) {
console.log(`File found at: ${downloadPath}`);
} else {
console.log(`Failed to find file at: ${downloadPath}`);
}
//close the browser
await browser.close();
}
main();
In this example, we:
- Download the file as we previously did. This informs us of the progress and logs a message once the download completes.
- After completing the download, we create a boolean using fs.existsSync() to check the existence of the file.
- Then, we log the result of the check along with the downloadPath variable to the console.
Take a look at the screenshot below. Our code works! The file finishes downloading and tells us when it's done. After completing the download, we double-check that the file exists, and it does.
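An existence check tells us a file was written, but not that it is the file we wanted. A blocked request can still leave an HTML error page on disk under a .pdf name. As a slightly stricter check, the sketch below also confirms that the file is not empty and that it starts with the bytes %PDF-, which is how PDF files begin. looksLikePdf() is a hypothetical helper name, and the check is deliberately shallow: it does not validate the rest of the document.

```javascript
const fs = require("fs");

// Hypothetical helper: check that a file exists, is not empty, and starts with the PDF signature
function looksLikePdf(filePath) {
  if (!fs.existsSync(filePath)) return false;
  if (fs.statSync(filePath).size === 0) return false;
  const fd = fs.openSync(filePath, "r");
  try {
    // read only the first five bytes of the file
    const header = Buffer.alloc(5);
    fs.readSync(fd, header, 0, 5, 0);
    return header.toString("latin1") === "%PDF-";
  } finally {
    // always release the file descriptor
    fs.closeSync(fd);
  }
}
```

In the script above, this could sit next to the fs.existsSync() call, for example by logging looksLikePdf(downloadPath) after the download finishes.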
Increasing Download Speed
Playwright's built-in download handling is tied to browser download events, so it is not the most convenient tool for downloading a bunch of files concurrently. Recall the example where we downloaded all those Unsplash images. We only used Playwright to scrape the url of each image. We actually used axios to download the images.
Take a closer look at this function:
//function to download an image
async function downloadImage(url, filepath) {
//get the image as a stream
const response = await axios({
url,
method: 'GET',
responseType: 'stream',
});
//write the stream
const writer = fs.createWriteStream(filepath);
//pipe the writer object
response.data.pipe(writer);
//return a new Promise object
return new Promise((resolve, reject) => {
//if we succeed, resolve the Promise
writer.on("finish", () => {
writer.close(); // Close the stream
resolve();
});
//if we fail, reject the Promise
writer.on("error", reject);
});
}
This function returns a Promise object. While the download is in flight, the rest of the code in our script continues to execute... But how do we access the results when the Promise resolves?
Take a look at this code from that same example:
const downloadPromises = imageUrls.map(url => {
//name of the image
const imageName = path.basename(new URL(url).pathname);
//path to the image
const filepath = path.join(downloadDir, `${imageName}.png`);
//return the downloaded image and log it to the console
return downloadImage(url, filepath).then(() => {
console.log(`Downloaded image: ${imageName}`);
//if we encounter an error, log it to the console
}).catch(error => {
console.error(`Error downloading ${imageName}:`, error);
});
});
//wait for ALL downloadPromises to resolve or reject
await Promise.all(downloadPromises);
In this portion, we create an array of Promise objects, downloadPromises. The downloads behind these promises all run concurrently. To wait for the whole array to finish, we await Promise.all(downloadPromises);. Because each Promise has its own .catch() handler, a failed download is logged instead of rejecting, so execution pauses at this line until every download has either succeeded or failed.
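If you'd rather not attach a .catch() to every Promise, JavaScript's built-in Promise.allSettled() waits for every Promise to finish and reports which ones failed. The sketch below is one way to turn that into a short summary; summarizeDownloads() is a hypothetical helper name, and the shape of the summary object is just an example.

```javascript
// Hypothetical helper: wait for every download and count successes and failures
async function summarizeDownloads(downloadPromises) {
  // allSettled never rejects; each entry is { status: "fulfilled" } or { status: "rejected", reason }
  const results = await Promise.allSettled(downloadPromises);
  const failed = results.filter((result) => result.status === "rejected");
  return {
    succeeded: results.length - failed.length,
    failed: failed.length,
    // keep a readable message for each failure
    reasons: failed.map((result) =>
      result.reason instanceof Error ? result.reason.message : String(result.reason)
    ),
  };
}
```

With this approach, the array passed in would hold the raw downloadImage(url, filepath) promises, without the .then() and .catch() chain.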
Downloading Very Large Files With Playwright
When downloading extremely large files, it is actually better to find the download url and use a dedicated download manager than to simply hang there with Playwright waiting for the file to finish downloading.
You can use Free Download Manager or a similar tool to easily accomplish your download. Download managers like this can support pausing the download, retries and sometimes even multithreaded downloads.
If you need to download large files, it's much better to use an existing solution built for the problem than to "reinvent the wheel".
Handling Different File Types
When downloading different file types, you should always use a strategy tailored to your individual needs. When downloading images, axios was our best bet. This allowed us to download all of our files at once. When the same strategy was used for downloading PDF files, we actually received many errors from the server and got blocked on more than one occasion.
If you choose to use a regular HTTP client (such as axios), you may need to ensure that you're sending the right headers and possibly even using fake user agents. This all depends on the type of file that you're downloading. On the web, images are often the easiest to download without getting blocked. Any time you view an image on the web, a GET request is already sent to the server hosting that image in order to retrieve it. When a server receives a bunch of requests for images, this is pretty standard. When a server receives a bunch of requests for other types of documents, this doesn't look normal and you can be spotted as a bot.
Handling Authentication for Secure Downloads
Sometimes, you'll need to log in to a website in order to download its content. Playwright offers a great API for filling and submitting forms on the web. The example below shows you how to fill in a simple login form.
const playwright = require("playwright");
async function main() {
//create a browser object
const browser = await playwright.chromium.launch();
//set a new context for the browser
const context = await browser.newContext({
acceptDownloads: true
});
//open a new page with our custom context
const page = await context.newPage();
//navigate to the site
await page.goto("https://quotes.toscrape.com/login");
//take a screenshot before filling the forms
await page.screenshot({ path: "before-login.png" });
//find the username
const usernameBox = page.locator("#username");
//find the password
const passwordBox = page.locator("#password");
//fill the username
await usernameBox.fill("ScrapeOps");
//fill the password
await passwordBox.fill("MY-SUPER-SECRET-PASSWORD");
//take a screenshot after filling the forms
await page.screenshot({ path: "after-login.png" });
//close the browser
await browser.close();
}
main();
This example uses the same locator method we've used previously to find elements on the page. There are a couple things you really need to pay attention to here:
- usernameBox.fill() fills the username box
- passwordBox.fill() fills the password box
Here is the page before entering the information:
Here it is after:
Error Handling for Download Failures
As you probably noticed, in most of our in-depth examples, we used JavaScript's built-in error handling. In this tutorial we've used error handling for the following:
- Incomplete Downloads: In the example where we showed the download progress, this snippet handled incomplete downloads:
try {
await downloadFileWithProgress(fileUrl, downloadPath);
console.log("Download completed");
//log any errors
} catch (error) {
console.error('Download failed:', error);
}
- Incorrect File Format: When downloading images and PDFs, we controlled the file extension in the code:
- Images
//in this example, we automatically add the .png file extension
const filepath = path.join(downloadDir, `${imageName}.png`);
- PDFs
//create a list ONLY containing links with the .pdf extension
const listOfLinks = await page.$$eval("a[href$='.pdf']", links => links.map(link => link.href));
There are many ways to check file extensions in your code and most of them involve only basic string operations. Take the extra minute to ensure that your file extensions are correct.
- Network Latency: Network latency can definitely present difficulties. If you run into timeout errors, it is very easy to manually change your timeout. We set our timeout in milliseconds.
//disable the timeout entirely
await page.goto('https://example.com', { timeout: 0 });
//wait up to 60 seconds before timing out
await page.goto('https://example.com', { timeout: 60000 });
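Raising the timeout helps with slow responses, but a flaky connection can still fail outright. A common companion to try/catch is to retry the failed step a few times with a growing pause in between. The sketch below is a generic illustration in plain JavaScript; withRetries() and its option names are hypothetical and not part of Playwright or axios.

```javascript
// Hypothetical helper: retry an async task, doubling the wait after each failed attempt
async function withRetries(task, { retries = 3, delayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      // the first attempt is attempt 0, followed by up to `retries` more
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        // wait delayMs, then 2x, then 4x, and so on before the next attempt
        await new Promise((resolve) => setTimeout(resolve, delayMs * 2 ** attempt));
      }
    }
  }
  // every attempt failed, so surface the last error to the caller
  throw lastError;
}
```

A download call from earlier in this guide could be wrapped as await withRetries(() => downloadFileWithProgress(fileUrl, downloadPath)). Keep the retry count small so that a file that is genuinely missing fails quickly.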
Conclusion
Congratulations! You've made it to the end of this tutorial. You should now have a decent grasp of the following topics:
- Locating elements with Playwright
- Clicking buttons with Playwright
- Downloading files with Playwright
- Taking screenshots with Playwright
- Downloading files with Axios
- Basic error handling in JavaScript
- Basic filesystem operations in JavaScript
Take your new knowledge and go build something. You've now got a great start on not just the fundamentals of Playwright, but Web Development fundamentals as well. If you'd like to learn more, take a look at the following links: