Bypass CAPTCHAs With Puppeteer

Scraping and crawling data from websites is crucial for performing various operations at large scale, but anti-bot technologies like CAPTCHAs have made the scraping process more challenging.

CAPTCHAs are obstacles that keep bots out, but if your scraper bot can appear human to the target website, it can bypass the CAPTCHA challenge and scrape the desired data. Fortunately, there are techniques to bypass CAPTCHAs during web scraping.


What Are CAPTCHAs

CAPTCHA stands for "Completely Automated Public Turing Test to Tell Computers and Humans Apart". It is a security measure that distinguishes humans from bots.

Websites often use CAPTCHAs to block automated attacks and keep bots from accessing their pages.

CAPTCHAs are easy for humans to solve but difficult for machines to understand. For example, in the image below, the user must check the box to prove that they are human.

Bypass CAPTCHAs With Puppeteer - What are CAPTCHAs? - Not a robot


Can CAPTCHAs Be Bypassed?

Yes, CAPTCHAs can be bypassed, but it is challenging.

CAPTCHAs are designed to be difficult for computers to solve, but not impossible. There are two approaches:

  1. Avoid being shown the CAPTCHA by making your requests look as much like a real human user as possible.
  2. Solve the CAPTCHA with your own CAPTCHA solving system or using a 3rd party CAPTCHA solver.

The best way to deal with CAPTCHAs is to prevent them from appearing in the first place. If a CAPTCHA does appear, retry the request (by reloading the page or submitting the request again).
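
The retry advice above could look like the following in Puppeteer. This is only a minimal sketch: `gotoUntilNoCaptcha` and the default `captchaSelector` are hypothetical names chosen for illustration, and the selector would need to be adapted to the CAPTCHA widget the target site actually uses. It relies on `page.goto()` and `page.$()` from the Puppeteer API.

```javascript
// Hypothetical helper: load a URL and retry while a CAPTCHA widget is detected.
// The default `captchaSelector` is an assumption; adapt it to the target site.
async function gotoUntilNoCaptcha(page, url, options = {}) {
  const {
    maxAttempts = 3,
    captchaSelector = 'iframe[src*="recaptcha"], iframe[src*="hcaptcha"]',
  } = options;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await page.goto(url);
    // page.$ resolves to null when nothing matches the selector
    const captcha = await page.$(captchaSelector);
    if (captcha === null) {
      return attempt; // no CAPTCHA detected, the page is usable
    }
  }
  throw new Error(`CAPTCHA still shown after ${maxAttempts} attempts`);
}
```

The function returns the number of attempts it needed, which can be logged to see how often a site challenges the scraper.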


What are the Different Types of CAPTCHAs?

Having a thorough understanding of the different types of CAPTCHA is crucial when scraping.

We’ll focus on the five most common types of CAPTCHAs.

Alphanumeric CAPTCHAs

Alphanumeric CAPTCHAs are a mix of random letters and numbers that users must type to proceed. They are displayed as stylized or distorted images of alphanumeric characters, and the user is asked to enter these characters in a text box.

Bypass CAPTCHAs With Puppeteer - Different Types of CAPTCHAs - Alphanumeric CAPTCHAs

reCAPTCHAs

reCAPTCHA is a free Google service that helps protect websites from spam and abuse by distinguishing between human users and automated bots.

There are two versions of reCAPTCHA:

  1. reCAPTCHA v2: also known as "I'm not a robot" reCAPTCHA, requires users to click on a checkbox to verify that they are human.
  2. reCAPTCHA v3: the newest type of reCAPTCHA from Google. It does not require any user interaction. Instead, it assigns each interaction a score between 0.0 (very likely a bot) and 1.0 (very likely a human), and the site decides how to respond.

Bypass CAPTCHAs With Puppeteer - Different Types of CAPTCHAs - reCAPTCHAs

hCAPTCHAs

hCaptcha is a relatively new, popular CAPTCHA service that works much like reCAPTCHA.

When you click the checkbox, it presents a challenge to verify that you are human, such as selecting specific images, solving a puzzle, or answering a question. This makes it harder for bots to pass.

Bypass CAPTCHAs With Puppeteer - Different Types of CAPTCHAs - hCAPTCHAs

Cloudflare Turnstile

Cloudflare Turnstile is Cloudflare's CAPTCHA alternative. It aims to stop abuse and confirm that visitors are real without the data privacy concerns or poor user experience of traditional CAPTCHAs. Turnstile appears in two forms:

  1. A standalone CAPTCHA widget placed on a page of a website protects a form from automated submission.

  2. Turnstile CAPTCHA on Cloudflare Challenge pages. Here, you'll first need to wait a few seconds in the Cloudflare waiting room. During that time, your browser solves challenges to prove you're not a robot.

Bypass CAPTCHAs With Puppeteer - Different Types of CAPTCHAs - Cloudflare Turnstile

Audio CAPTCHAs

These CAPTCHAs use audio instead of images. They work by playing a short audio clip and then asking users to enter what they hear from the audio clip into a text area. It is usually a series of letters or numbers.

The audio clip may be distorted or have background noise added to make it more difficult for bots.

Bypass CAPTCHAs With Puppeteer - Different Types of CAPTCHAs - Audio CAPTCHAs

Approaches To Dealing With CAPTCHAs

There are two approaches to dealing with CAPTCHAs.

  1. Solve the CAPTCHA itself.
  • This is typically when you need to solve an embedded CAPTCHA while performing common tasks such as logging into a website, submitting a form, or scraping data.
  • There are various ways to solve CAPTCHAs: you can use free and open source CAPTCHA solving libraries, third-party CAPTCHA solving services, or Puppeteer to solve different types of CAPTCHAs.
  2. Avoid CAPTCHAs by making your requests look as much like a real human user as possible.
  • You can use a service like ScrapeOps Proxy Aggregator to avoid anti-bot challenges, or enable stealth mode, which applies various evasion techniques.
  • You can also rotate real headers and proxies, or use headless browsers. Check our Web Scraping Without Getting Blocked guide to improve your web scrapers.

We’ll discuss both options in detail in this guide.


How to Solve CAPTCHAs?

There are many ways to solve CAPTCHAs, such as using free and open-source CAPTCHA-solving libraries or paid third-party CAPTCHA-solving services.

Free & Open Source CAPTCHA Solving Libraries

Here are some open-source CAPTCHA-solving libraries available.

  • arunpatala/captcha:
    • Using the Torch machine learning library, the author created a dataset of 10,000 samples, each with 5 characters.
    • The dataset contains all the effects and noises available in the library to make it more challenging.
    • The goal is to break Simple Captcha, a Java-based CAPTCHA software.
  • zakizhou/CAPTCHA:
    • This is a small convolutional neural network built with TensorFlow to recognize CAPTCHAs.
    • For simplicity, the images will only contain four digits with noise.
  • nladuo/captcha-break:
    • CAPTCHA breaking based on OpenCV, Tesseract-OCR, and a machine learning algorithm.
  • ypwhs/captcha_break:
    • This project will use Keras to build a deep convolutional neural network to identify CAPTCHA verification codes.
  • ptigas/simple-captcha-solver:
    • This is a simple solver for very specific and easy-to-solve CAPTCHAs.
    • The procedure for solving captchas is as follows:
      1. Move each letter across the image and calculate the difference of the pixels for each position, then sum them.
      2. This gives you a score for each position, indicating how well the letter (mask) fits the letter behind it.
      3. Store the position with the highest score for each letter.
      4. Sort the letters by score, taking the top five results (since our captcha is five letters).
      5. Finally, sort the letters by position. The result is the CAPTCHA text.
  • rickyhan/SimGAN-Captcha:
    • It helps you solve captchas without manually labeling a training set.
    • By using a captcha synthesizer and a refiner trained with GANs, it is feasible to generate synthesized training pairs for classifying captchas.
  • arunpatala/captcha.irctc:
    • This reads IRCTC captchas with 98% accuracy using deep learning.
    • IRCTC is a popular travel website in India where people book train tickets.
    • Due to the high demand for tickets, booking during peak hours requires a captcha image containing letters that humans must enter to book the ticket.
    • This is supposed to stop automated software from booking tickets.
  • JackonYang/captcha-tensorflow:
    • Solves image CAPTCHAs using TensorFlow and a CNN model with 90%+ accuracy.
  • skyduy/CNN_keras:
    • Using a convolutional neural network built with Keras, the model achieves 95% accuracy in recognizing single letters from a dataset of about 5000 samples.
  • PatrickLib/captcha_recognize:
    • This image recognition captcha does not require image segmentation.
    • It achieves 99.7% accuracy with 50000 training samples and 52.1% accuracy with 100000 training samples.
  • zhengwh/captcha-svm:
    • It can identify simple verification strings and solve simple captchas using a support vector machine (SVM).
  • chxj1992/captcha_cracker:
    • This is a simple implementation of a verification code recognition function in Keras using a convolutional neural network model.
  • chxj1992/slide_captcha_cracker:
    • This project uses a simple image edge detection algorithm to locate the sliding verification code puzzle in the background image.
    • The code is implemented mainly with OpenCV to process and position the images.
  • JasonLiTW/simple-railway-captcha-solver#english-version:
    • This is a simple railway captcha solver.
    • It uses a simple convolutional neural network to solve the captcha (as shown above) on the Taiwan railway booking website.
    • Currently, the accuracy of a single digit on the validation set is about 98.84%, and the overall accuracy is 91.13% (successfully recognizing 6 digits at once).
  • lllcho/CAPTCHA-breaking:
    • Breaks simple CAPTCHAs using Python, Keras, and OpenCV.
  • ecthros/uncaptcha:
    • It can defeat Google's audio reCAPTCHA with 85% accuracy by correctly identifying spoken numbers and programmatically passing the reCAPTCHA program, fooling the site into thinking the bot is a human.
  • dessant/buster:
    • This free tool helps humans solve difficult captchas by completing reCAPTCHA audio challenges using speech recognition. It is available for Chrome, Edge, and Firefox.
  • kerlomz/captcha_trainer:
    • This deep learning-based image verification code solution can quickly eliminate various interference situations such as character adhesion, overlap, perspective deformation, blur, and noise.
    • It is sufficient to solve the most complex verification code scenarios on the market and is also currently used in other OCR scenes.
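
The five-step procedure listed for ptigas/simple-captcha-solver can be sketched in a few lines. This is an illustration of the idea, not the project's actual code. It assumes the image and the letter masks are 2-D arrays of 0/1 pixels with the same height, that each letter appears at most once, and that the score is the negated sum of absolute pixel differences (so a perfect fit scores highest). The names `fitScore` and `solveByTemplateMatching` are hypothetical.

```javascript
// Score how well a letter mask fits the image at horizontal offset x.
// Higher is better; a pixel-for-pixel match gives the maximum score of zero.
function fitScore(image, mask, x) {
  let diff = 0;
  for (let row = 0; row < mask.length; row++) {
    for (let col = 0; col < mask[row].length; col++) {
      diff += Math.abs(image[row][x + col] - mask[row][col]);
    }
  }
  return -diff;
}

// Slide every mask across the image, keep each letter's best position,
// take the `length` best-scoring letters, then read them left to right.
function solveByTemplateMatching(image, masks, length) {
  const width = image[0].length;
  const best = Object.entries(masks).map(([letter, mask]) => {
    let top = { letter, x: 0, score: -Infinity };
    for (let x = 0; x + mask[0].length <= width; x++) {
      const score = fitScore(image, mask, x);
      if (score > top.score) top = { letter, x, score };
    }
    return top;
  });

  return best
    .sort((a, b) => b.score - a.score) // best-fitting letters first
    .slice(0, length)                  // keep as many letters as the CAPTCHA has
    .sort((a, b) => a.x - b.x)         // restore left-to-right order
    .map((hit) => hit.letter)
    .join("");
}
```

This kind of matching only works when the CAPTCHA uses a fixed font with little distortion, which is why the project describes itself as a solver for very specific and easy CAPTCHAs.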

3rd Party CAPTCHA Solving Services

These services use real people or sophisticated algorithms to solve CAPTCHAs.

Once you integrate them into your Puppeteer script, you can automatically send the CAPTCHA image to these services, receive the solution, and input it into the page.

Popular services include:

  • 2Captcha: It can help solve reCAPTCHA v2, v2 callback, v2 invisible, v3, and Enterprise. It also supports bypassing hCaptchas. Its price starts at $1.00 for 1,000 solved CAPTCHAs, and its auto captcha solver response time is less than 12 seconds.
  • Anti-Captcha: 100% of CAPTCHAs are solved by human workers from around the world, starting at $0.50 per 1,000 images.
  • DeathByCaptcha: A CAPTCHA-solving service that supports a wide range of CAPTCHA types. Its prices per 1,000 solved CAPTCHAs are:
    • Normal CAPTCHAs: $0.99-$2
    • Cloudflare Turnstile: $2.89
    • reCAPTCHA v2/v3 (including invisible CAPTCHAs): $2.89
    • hCaptcha: $3.99

How to Solve CAPTCHAs With Puppeteer?

Puppeteer does not solve CAPTCHAs on its own, but you can pair it with a CAPTCHA-solving service such as the 2Captcha API.

First of all, sign up on 2Captcha to get an API key.

Bypass CAPTCHAs With Puppeteer - Solve CAPTCHAs With Puppeteer - 2Captcha API

2Captcha is a paid service, so you must add funds to your account before you can send CAPTCHAs to the API.

Bypass CAPTCHAs With Puppeteer - Solve CAPTCHAs With Puppeteer - Zero balance

Solving Text CAPTCHAs With Puppeteer

Solving text CAPTCHAs with Puppeteer takes three steps.

  1. First, capture the text CAPTCHA on the webpage.

  2. Then, make a POST request to the 2Captcha API. The response contains an ID for the submitted CAPTCHA.

  3. After that, make a GET request with that ID and your 2Captcha key to get the text of the CAPTCHA.

The walkthrough below covers each step.

First, install the puppeteer and request modules.

npm install puppeteer request

Now, let's write a script that opens the website you want to scrape, takes a screenshot of the CAPTCHA, and sends it to the 2Captcha service.

const puppeteer = require('puppeteer');
const request = require('request');

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
  });
  const page = await browser.newPage();

  // Step 1: Navigate to the page with the CAPTCHA
  await page.goto('https://2captcha.com/demo/normal');

  // Step 2: Take a screenshot of the page showing the CAPTCHA
  const screenshot = await page.screenshot();

  // Step 3: Convert the screenshot to a base64 encoded string
  const image = Buffer.from(screenshot).toString('base64');

  // Step 4: Send the image to the 2Captcha API for CAPTCHA solving
  request.post({
    url: 'http://2captcha.com/in.php',
    formData: {
      key: 'your_2captcha_api_key',
      method: 'base64',
      body: image
    }
  }, async (error, response, body) => {
    if (error) {
      console.error(error);
      return;
    }
    // Steps 5 to 8 continue here
  });
})();

Now, capture the API response and extract the CAPTCHA ID from it. Then, make a request to get the solution to the CAPTCHA.

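// Step 5: Get the CAPTCHA ID from the 2Captcha API response
const captchaId = body.split('|')[1];

// Give 2Captcha time to solve the CAPTCHA before asking for the answer
await new Promise((resolve) => setTimeout(resolve, 15000));

// Step 6: Request the CAPTCHA solution from the 2Captcha API
request.get({
  url: `http://2captcha.com/res.php?key=your_2captcha_api_key&action=get&id=${captchaId}`
}, async (error, response, body) => {
  if (error) {
    console.error(error);
    return;
  }
  // Steps 7 and 8 continue here
});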

Now, extract the CAPTCHA solution and type it in the text field. Then, click the submit button to check if the CAPTCHA is correct.

// Step 7: Get the CAPTCHA solution from the 2Captcha API response
const captchaText = body.split('|')[1];
// Step 8: Use the CAPTCHA solution in your Puppeteer script
await page.type('#simple-captcha-field', captchaText);
await page.click('button[type="submit"]');

Here’s the complete code:

const puppeteer = require("puppeteer");
const request = require("request");

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
  });
  const page = await browser.newPage();

  // Step 1: Navigate to the page with the CAPTCHA
  await page.goto("https://2captcha.com/demo/normal");

  // Step 2: Take a screenshot of the page showing the CAPTCHA
  const screenshot = await page.screenshot();

  // Step 3: Convert the screenshot to a base64 encoded string
  const image = Buffer.from(screenshot).toString("base64");

  // Step 4: Send the image to the 2Captcha API for CAPTCHA solving
  request.post(
    {
      url: "http://2captcha.com/in.php",
      formData: {
        key: "your_2captcha_api_key",
        method: "base64",
        body: image,
      },
    },
    async (error, response, body) => {
      if (error) {
        console.error(error);
      } else {
        // Step 5: Get the CAPTCHA ID from the 2Captcha API response
        const captchaId = body.split("|")[1];

        // Give 2Captcha time to solve the CAPTCHA before asking for the answer
        await new Promise((resolve) => setTimeout(resolve, 15000));

        // Step 6: Request the CAPTCHA solution from the 2Captcha API
        request.get(
          {
            url: `http://2captcha.com/res.php?key=your_2captcha_api_key&action=get&id=${captchaId}`,
          },
          async (error, response, body) => {
            if (error) {
              console.error(error);
            } else {
              // Step 7: Get the CAPTCHA solution from the 2Captcha API response
              const captchaText = body.split("|")[1];
              // Step 8: Use the CAPTCHA solution in your Puppeteer script
              await page.type("#simple-captcha-field", captchaText);
              await page.click(`button[type="submit"]`);
            }
            // Step 9: Close the Puppeteer browser
            // await browser.close();
          }
        );
      }
    }
  );
})();

The CAPTCHA is now solved using Puppeteer and 2Captcha.

Bypass CAPTCHAs With Puppeteer - Solve CAPTCHAs With Puppeteer - Text captcha passed
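
A single request to res.php can arrive before the solution is ready. In that case 2Captcha replies with `CAPCHA_NOT_READY` instead of `OK|<text>`, and `body.split("|")[1]` is undefined. A more robust approach is to poll until the answer is available. The sketch below shows one way to do that. `parseReply` and `pollForSolution` are hypothetical helper names, and `fetchReply` is an assumed async function that you supply (for example, a wrapper around the GET request to res.php) that resolves to the raw response body.

```javascript
// Parse a plain-text 2Captcha reply such as "OK|abc123".
function parseReply(body) {
  const [status, value] = body.trim().split("|");
  if (status === "OK") return { ready: true, value };
  if (status === "CAPCHA_NOT_READY") return { ready: false };
  // Anything else is treated as an error code returned by the API
  throw new Error(`2Captcha error: ${body.trim()}`);
}

// Call fetchReply repeatedly until the solution is ready or attempts run out.
async function pollForSolution(fetchReply, options = {}) {
  const { intervalMs = 5000, maxAttempts = 12 } = options;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const reply = parseReply(await fetchReply());
    if (reply.ready) return reply.value;
    // Not ready yet: wait before asking again
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("Timed out waiting for the CAPTCHA solution");
}
```

The interval and attempt limit are arbitrary defaults. Tune them to the response times you observe for the CAPTCHA type you are solving.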

Solving reCAPTCHAs With Puppeteer

To solve reCAPTCHAs, you can use the free and open-source puppeteer-extra-plugin-recaptcha plugin, which can solve reCAPTCHA and hCaptcha challenges automatically.

We’ll be using this plugin with the 2Captcha API-based CAPTCHA-solving service, as it comes with 2Captcha integration.

To get started, install puppeteer-extra and puppeteer-extra-plugin-recaptcha using the following command:

npm install puppeteer puppeteer-extra puppeteer-extra-plugin-recaptcha

The code is simple. It calls page.solveRecaptchas() to automatically solve the reCAPTCHA challenge on the webpage. The script then clicks the submit button and uses page.waitForNavigation() to wait for the resulting navigation to complete.

const puppeteer = require("puppeteer-extra");
const RecaptchaPlugin = require("puppeteer-extra-plugin-recaptcha");

// Configure the Recaptcha plugin with the 2Captcha provider and API token
puppeteer.use(
  RecaptchaPlugin({
    provider: {
      id: "2captcha",
      token: "YOUR_2CAPTCHA_API_TOKEN",
    },
    visualFeedback: true, // Enable visual feedback for solving reCAPTCHAs
  })
);

// Launch a browser instance (headless: false keeps the window visible)
puppeteer
  .launch({
    headless: false,
  })
  .then(async (browser) => {
    // Create a new page
    const page = await browser.newPage();

    // Navigate to a webpage containing a reCAPTCHA challenge
    await page.goto("https://www.google.com/recaptcha/api2/demo");

    // Automatically solve the reCAPTCHA challenge
    await page.solveRecaptchas();

    // Click the submit button and wait for the resulting navigation to complete
    await Promise.all([
      page.waitForNavigation(),
      page.click(`#recaptcha-demo-submit`),
    ]);

    // Take a screenshot of the response page
    await page.screenshot({
      path: "result.png",
      fullPage: true,
    });

    // Close the browser
    await browser.close();
  });

Here’s the code output:

Bypass CAPTCHAs With Puppeteer - Solve CAPTCHAs With Puppeteer - Recaptcha solve
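
The script above ignores the value that page.solveRecaptchas() resolves to, so a failed solve only shows up later as a navigation timeout. The sketch below checks the result before continuing. It assumes, based on the plugin's documented usage, that the resolved object exposes a `solved` array and an `error` field. Treat that shape as an assumption and verify it against the plugin version you install. `solveCaptchasOrThrow` is a hypothetical helper name.

```javascript
// Hypothetical wrapper around the plugin's page.solveRecaptchas().
// Assumes the resolved object has `solved` (array) and `error` properties.
async function solveCaptchasOrThrow(page) {
  const { solved = [], error } = await page.solveRecaptchas();
  if (error) {
    throw new Error(`CAPTCHA solving failed: ${error.message || error}`);
  }
  if (solved.length === 0) {
    throw new Error("No CAPTCHA was solved on this page");
  }
  return solved.length; // number of CAPTCHAs solved
}
```

In the script, `await solveCaptchasOrThrow(page)` would take the place of the bare `await page.solveRecaptchas()` call, so the submit click only runs after a successful solve.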

Solving hCAPTCHAs With Puppeteer

The process for solving hCaptchas is almost the same as for reCAPTCHAs. The plugin setup stays the same; you only need to change the URL of the page that contains the hCaptcha and the selector of its submit button.

const puppeteer = require("puppeteer-extra");
const RecaptchaPlugin = require("puppeteer-extra-plugin-recaptcha");

// Configure the Recaptcha plugin with the 2Captcha provider and API token
puppeteer.use(
  RecaptchaPlugin({
    provider: {
      id: "2captcha",
      token: "YOUR_2CAPTCHA_API_TOKEN",
    },
    visualFeedback: true, // Enable visual feedback for solving hCAPTCHAs
  })
);

// Launch a browser instance (headless: false keeps the window visible)
puppeteer
  .launch({
    headless: false,
  })
  .then(async (browser) => {
    // Create a new page
    const page = await browser.newPage();

    // Navigate to a webpage containing a hCAPTCHA challenge
    await page.goto("https://2captcha.com/demo/hcaptcha");

    // Automatically solve the hCAPTCHA challenge
    await page.solveRecaptchas();

    // Click the submit button and wait for the resulting navigation to complete
    await Promise.all([
      page.waitForNavigation(),
      page.click(`button[type="submit"]`),
    ]);

    // Close the browser
    await browser.close();
  });

Here’s the code output:

Bypass CAPTCHAs With Puppeteer - Solve CAPTCHAs With Puppeteer - hcaptcha solved

Solving Audio CAPTCHAs With Puppeteer

Puppeteer and 2Captcha together can help you solve audio captchas. The recognition of audio is fully automated and is performed by a neural network trained for speech recognition.

You just need to submit a file for recognition (a base64-encoded audio file in MP3 format).

The following code is very similar to the code for solving text captchas.

// Require the puppeteer, request, and fs libraries
const puppeteer = require("puppeteer");
const request = require("request");
const fs = require("fs");

// Create an async function to run the code
(async () => {
  // Launch Puppeteer in headless mode
  const browser = await puppeteer.launch({
    headless: true,
  });

  // Create a new Puppeteer page
  const page = await browser.newPage();

  // Get the path to the audio file
  const path = "C:/pupp_captcha/audio.mp3";

  // Read the audio file as a base64 encoded string
  const audio = fs.readFileSync(path, "base64");

  // Make a POST request to the 2captcha API to solve the audio captcha
  request.post(
    {
      url: "http://2captcha.com/in.php",
      formData: {
        key: "YOUR_2CAPTCHA_API_TOKEN",
        method: "audio",
        body: audio,
        lang: "en",
      },
    },
    async (error, response, body) => {
      // If the request failed, log the error
      if (error) {
        console.error(error);
        return;
      }

      // Get the captcha ID from the response body
      const captchaId = body.split("|")[1];

      // Wait for 10 seconds so 2Captcha has time to process the audio
      await new Promise((resolve) => setTimeout(resolve, 10000));

      // Make a GET request to the 2captcha API to get the solved captcha text
      request.get(
        {
          url: `http://2captcha.com/res.php?key=YOUR_2CAPTCHA_API_TOKEN&action=get&id=${captchaId}`,
        },
        async (error, response, body) => {
          // If the request failed, log the error
          if (error) {
            console.error(error);
            return;
          }

          // Get the solved captcha text from the response body
          const captchaText = body.split("|")[1];

          // Log the solved captcha text
          console.log("Audio to text: ", captchaText);

          // Close the Puppeteer browser
          await browser.close();
        }
      );
    }
  );
})();

The script follows three steps:

  1. First, convert your audio file to base64-encoded format.
  2. Then, make a POST request to get the captcha ID.
  3. Finally, make a GET request with the captcha ID to get your answer.

How To Avoid Triggering CAPTCHAs

Anti-bot systems show CAPTCHAs when they suspect that a request isn’t coming from a real human user. To avoid triggering them, make your scraper look and behave like a real user.

Use Stealth Mode

Stealth mode can make your bot look more like a human visitor, which helps it avoid CAPTCHAs on protected websites. It applies a set of evasion techniques that hide common signs of browser automation, so the site is less likely to load a CAPTCHA in the first place.

Stealth mode is not built into Puppeteer itself. It is enabled through a plugin for puppeteer-extra called puppeteer-extra-plugin-stealth.

Use & Rotate Real Headers

To avoid being flagged as a bot, use real headers, including user agents that match your actual browser and operating system.

Websites can detect requests from Puppeteer by its default user agent, so use a pool of recent and popular user agents. The puppeteer-extra-plugin-anonymize-ua plugin anonymizes the user agent header sent by Puppeteer.

You can visit useragentstring.com to see the UA for your web browsing environment.

Bypass CAPTCHAs With Puppeteer - Use & Rotate Real Headers - User agent

Sending too many requests with the same HTTP headers is suspicious, since a real user wouldn't visit 500 pages in two minutes.

To avoid attracting attention, rotate your headers. You can find the latest user agents for web browsers and operating systems here.
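
As a rough illustration of header rotation, the sketch below picks a user agent at random for each new page and applies it with `page.setUserAgent()` and `page.setExtraHTTPHeaders()`, both part of the Puppeteer Page API. The `USER_AGENTS` pool, the `Accept-Language` value, and the helper names are assumptions made for this example. In practice you would maintain a larger pool of current user-agent strings.

```javascript
// Illustrative pool only; replace with current, real user-agent strings.
const USER_AGENTS = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15",
];

// Pick one element at random; `random` is injectable to make testing easy.
function pickRandom(items, random = Math.random) {
  return items[Math.floor(random() * items.length)];
}

// Apply a randomly chosen user agent and a matching header to a Puppeteer page.
async function applyRandomHeaders(page, userAgents = USER_AGENTS) {
  const userAgent = pickRandom(userAgents);
  await page.setUserAgent(userAgent);
  await page.setExtraHTTPHeaders({ "Accept-Language": "en-US,en;q=0.9" });
  return userAgent;
}
```

Call `applyRandomHeaders(page)` right after `browser.newPage()` and before `page.goto()`, so the first request already carries the rotated headers.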

Use Rotating Proxies

To avoid detection by websites, web scrapers need to rotate their IP addresses.

This precaution is necessary because websites can track the number of requests that are coming from a particular IP address and block scrapers that make a large number of requests.

Rotating HTTP headers can also help to make a web scraper look more like a real browser, but this is not enough to prevent detection if the scraper's IP address remains the same.

Some websites impose rate limits on the number of requests a single IP address can make within a certain timeframe.

The puppeteer-extra-plugin-proxy plugin adds proxy support to Puppeteer, which helps you stay under these limits.
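
Proxy rotation can also be done without a plugin by passing Chromium's `--proxy-server` flag at launch. The following is a minimal round-robin sketch. The helper names and the proxy addresses in the usage note are hypothetical placeholders, not real endpoints.

```javascript
// Return a function that cycles through the proxy list in order.
function createProxyRotator(proxies) {
  if (proxies.length === 0) {
    throw new Error("At least one proxy is required");
  }
  let index = 0;
  return () => {
    const proxy = proxies[index % proxies.length];
    index += 1;
    return proxy;
  };
}

// Build Puppeteer launch options that route traffic through the given proxy.
function launchOptionsFor(proxy) {
  return { args: [`--proxy-server=${proxy}`] };
}
```

For example, with `const nextProxy = createProxyRotator(["http://proxy-a.example:8000", "http://proxy-b.example:8000"])`, each call to `puppeteer.launch(launchOptionsFor(nextProxy()))` starts a browser behind the next proxy in the list.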

Implement Headless Browsers

Browser automation tools such as Puppeteer and Selenium can help you solve CAPTCHAs or avoid them entirely. Because they drive a real browser and can simulate human interactions, the target website is more likely to treat the visitor as human.

The headless mode of browser automation tools allows them to run without a graphical user interface (GUI). It can be useful for web scraping since it saves resources and speeds up the browser.

Avoid Fingerprinting

Websites can fingerprint your browser. Browser fingerprinting is a technique used by websites to collect information regarding the configuration of your browser and device.

By randomizing viewport size, navigator plugins, and other detectable features (such as fonts or user agents), it is possible to prevent browser fingerprinting.
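
Viewport randomization is the simplest of these to demonstrate. The sketch below generates a viewport within common desktop ranges, which can then be passed to `page.setViewport()` from the Puppeteer API. The size ranges and helper names are assumptions chosen for illustration.

```javascript
// Random integer in [min, max]; `random` is injectable for deterministic tests.
function randomInt(min, max, random = Math.random) {
  return Math.floor(random() * (max - min + 1)) + min;
}

// Build a viewport object with a randomized desktop-like size.
function randomViewport(random = Math.random) {
  return {
    width: randomInt(1280, 1920, random),
    height: randomInt(720, 1080, random),
  };
}

// Usage inside a Puppeteer script:
//   await page.setViewport(randomViewport());
```

Viewport size is only one fingerprinting signal, so this does not prevent fingerprinting by itself.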

Make Your Scraper Look Like a Real User

It is important to mimic human behavior and avoid patterns when web scraping to avoid detection. Websites monitor user navigation, hover elements, and even click coordinates to analyze user behavior.

A web scraper that exhibits robotic behavior might be detected and blocked because it makes too many requests too quickly or clicks on links in a predictable pattern.

Some actions that web scrapers can take to avoid detection include:

  • Use random time intervals between actions to simulate human behavior.
  • Space out requests with random delays to avoid making too many at once.
  • Rotate IP addresses between requests by using a proxy pool.
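
Random pauses need nothing beyond plain JavaScript. The helpers below are a small sketch with hypothetical names and arbitrary default ranges. `humanPause` waits for a random time between the given bounds.

```javascript
// Random delay in milliseconds within [minMs, maxMs).
function randomDelayMs(minMs, maxMs, random = Math.random) {
  return Math.floor(minMs + random() * (maxMs - minMs));
}

// Promise-based sleep.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Pause for a random, human-like interval and report how long it waited.
async function humanPause(minMs = 1000, maxMs = 4000) {
  const ms = randomDelayMs(minMs, maxMs);
  await sleep(ms);
  return ms;
}
```

Calling `await humanPause()` between `page.click()`, `page.type()`, and `page.goto()` calls breaks up the fixed timing that makes automated sessions easy to spot.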

How To Avoid CAPTCHAs Using ScrapeOps Proxy Aggregator

You can avoid anti-bot CAPTCHAs using a service like the ScrapeOps proxy aggregator. ScrapeOps takes care of proxy selection and rotation, so you only need to send the URL you want to scrape.

ScrapeOps Proxy Aggregator is an all-in-one proxy API that allows you to use over 20 proxy providers from a single API.

To integrate the ScrapeOps proxy with Puppeteer, you just need to define the proxy port settings, set Puppeteer to ignore HTTPS errors and configure the proxy authorization.

const puppeteer = require("puppeteer");
const cheerio = require("cheerio");

// ScrapeOps proxy configuration
const PROXY_USERNAME = "scrapeops.headless_browser_mode=true";
const PROXY_PASSWORD = "YOUR_API_KEY"; // <-- enter your API_Key here
const PROXY_SERVER = "proxy.scrapeops.io";
const PROXY_SERVER_PORT = "5353";

(async () => {
  const browser = await puppeteer.launch({
    ignoreHTTPSErrors: true,
    args: [`--proxy-server=http://${PROXY_SERVER}:${PROXY_SERVER_PORT}`],
  });
  const page = await browser.newPage();

  // Authenticate against the proxy with the ScrapeOps credentials
  await page.authenticate({
    username: PROXY_USERNAME,
    password: PROXY_PASSWORD,
  });

  try {
    await page.goto("https://quotes.toscrape.com/page/1/", {
      timeout: 180000,
    });
    const bodyHTML = await page.evaluate(() => document.body.innerHTML);
    const $ = cheerio.load(bodyHTML);

    // Scrape quotes and authors
    const quotes = [];
    $(".quote").each((index, element) => {
      const quoteText = $(element).find(".text").text().trim();
      const author = $(element).find(".author").text().trim();
      quotes.push({
        quote: quoteText,
        author: author,
      });
    });
    console.log("Quotes:", quotes);
  } catch (err) {
    console.log(err);
  }

  await browser.close();
})();

Bypassing CAPTCHAs can compromise the security of websites and services. Bots that bypass CAPTCHAs can steal sensitive information, such as account credentials, credit card numbers, and other personal data.

When you register on a website, you agree to its terms of service, which typically prohibit the use of bots and other automated tools.

Bypassing CAPTCHAs could potentially harm websites and services. It can result in data scraping, spamming, denial-of-service attacks, and other malicious activities.

Bypassing CAPTCHAs can also have serious consequences for you, including legal ones. Your account could be suspended or terminated, or the website may take other action against you.


More Web Scraping Guides

In this tutorial, you've learned to bypass various types of captchas using Puppeteer and third-party tools. We covered text, audio, reCAPTCHA, and hCaptcha solutions with Puppeteer and the 2Captcha API.

Additionally, we explored open-source captcha-solving libraries. You also gained insights into avoiding captchas and the legal and ethical aspects of bypassing them.

If you'd like to learn more about scraping but don't know where to start, try one of these: