

Updated May 13, 2026

How To Solve 403 Forbidden Errors When Web Scraping (2026)

TL;DR — A 403 Forbidden error in web scraping means the website has identified your request as a bot and refused to serve the page. The fix is almost always one of four layers: set a real browser User-Agent, send the full browser header set, rotate through residential or mobile proxies, and — for sites behind Cloudflare or DataDome — match the TLS/JA3 fingerprint of a real browser (most easily by using curl_cffi or a smart proxy API that handles it for you).

Most of the time the underlying cause is not the URL itself — the same URL works fine in a browser — it is the website's anti-bot system flagging your scraper. 403 Forbidden Errors are particularly common when scraping sites protected by Cloudflare, DataDome, or PerimeterX, as all three return a 403 when their detection logic fires.

This 2026 guide walks through the full debugging workflow and shows you exactly what to change at each layer.

Let's begin...


Quick reference: 403 causes & fixes

If you only have time to read one section, this is it. The same six root causes account for the vast majority of 403 Forbidden responses returned to scrapers in 2026:

| # | Symptom | Most likely cause | Fix |
|---|---------|-------------------|-----|
| 1 | 403 on the very first request, browser works fine | Library default User-Agent (e.g. python-requests/2.x, axios/1.x) gives you away | Set a current Chrome/Firefox/Safari User-Agent header on every request |
| 2 | 403 even with a real User-Agent | Other browser headers (Accept, Accept-Language, Sec-Fetch-*) are missing or inconsistent | Send the full set of headers a real browser sends, and make sure the OS/version implied by them matches the User-Agent |
| 3 | First few requests succeed, then 403 | IP-based rate limiting | Slow the crawl down, back off exponentially on 403/429, and rotate proxies |
| 4 | 403 from every IP you try, even with perfect headers | IP reputation: the proxy range is flagged as datacenter/bot traffic | Switch to residential or mobile proxies |
| 5 | 403 specifically from Cloudflare/DataDome with a challenge page | TLS/JA3 fingerprint or HTTP/2 settings frame doesn't match a real browser | Use curl_cffi (Python), tls-client (Go/Node), or a smart proxy API |
| 6 | 403 on URLs that genuinely require auth | The URL really is permission-protected | Authenticate (cookies / OAuth / API key); this is the only 403 that is not anti-bot related |
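For row 3 in particular, the back-off part of the fix only takes a few lines. Here is a minimal sketch using plain Python Requests; the retry count and delays are illustrative and should be tuned to the target site:

import random
import time

import requests

def get_with_backoff(url, headers=None, max_retries=4):
    """Retry a request, backing off whenever the site returns 403/429."""
    r = None
    for attempt in range(max_retries):
        r = requests.get(url, headers=headers)
        if r.status_code not in (403, 429):
            return r
        # Honour Retry-After if the server sends a numeric value,
        # otherwise back off exponentially with a little jitter.
        retry_after = r.headers.get('Retry-After')
        if retry_after and retry_after.isdigit():
            delay = int(retry_after)
        else:
            delay = (2 ** attempt) + random.random()
        time.sleep(delay)
    return r  # still blocked after max_retries attempts

r = get_with_backoff('http://quotes.toscrape.com/page/1/')
print(r.status_code)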

The rest of this guide drills into each fix in turn — starting with the easiest.

Need help scraping the web?

Then check out ScrapeOps, the complete toolkit for web scraping.


Easy Way To Solve 403 Forbidden Errors When Web Scraping

If the URL you are trying to scrape is normally accessible in a browser but you are getting 403 Forbidden errors, it is likely that the website is flagging your spider as a scraper and blocking your requests.

To avoid getting detected we need to optimise our spiders to bypass anti-bot countermeasures by:

  • Using Fake User Agents
  • Optimizing Request Headers
  • Using Proxies

We will discuss these below, however, the easiest way to fix this problem is to use a smart proxy solution like the ScrapeOps Proxy Aggregator.

ScrapeOps Proxy Aggregator

With the ScrapeOps Proxy Aggregator you simply need to send your requests to the ScrapeOps proxy endpoint and our Proxy Aggregator will optimise your request with the best user-agent, header and proxy configuration to ensure you don't get 403 errors from your target website.

Simply get your free API key by signing up for a free account here and edit your scraper as follows:


import requests
from urllib.parse import urlencode

API_KEY = 'YOUR_API_KEY'

def get_scrapeops_url(url):
    payload = {'api_key': API_KEY, 'url': url}
    proxy_url = 'https://proxy.scrapeops.io/v1/?' + urlencode(payload)
    return proxy_url

r = requests.get(get_scrapeops_url('http://quotes.toscrape.com/page/1/'))
print(r.text)

If you are getting blocked by Cloudflare, then you can simply activate ScrapeOps' Cloudflare Bypass by adding bypass=cloudflare_level_1 to the request:


import requests
from urllib.parse import urlencode

API_KEY = 'YOUR_API_KEY'

def get_scrapeops_url(url):
    payload = {'api_key': API_KEY, 'url': url, 'bypass': 'cloudflare_level_1'}
    proxy_url = 'https://proxy.scrapeops.io/v1/?' + urlencode(payload)
    return proxy_url

r = requests.get(get_scrapeops_url('http://example.com/'))
print(r.text)

tip

Cloudflare is the most common anti-bot system being used by websites today, and bypassing it depends on which security settings the website has enabled.

To combat this, we offer 3 different Cloudflare bypasses designed to solve the Cloudflare challenges at each security level.

| Security Level | Bypass | API Credits | Description |
|----------------|--------|-------------|-------------|
| Low | cloudflare_level_1 | 10 | Use to bypass Cloudflare protected sites with low security settings enabled. |
| Medium | cloudflare_level_2 | 35 | Use to bypass Cloudflare protected sites with medium security settings enabled. On large plans the credit multiple will be increased to maintain a flat rate of $3.50 per thousand requests. |
| High | cloudflare_level_3 | 50 | Use to bypass Cloudflare protected sites with high security settings enabled. On large plans the credit multiple will be increased to maintain a flat rate of $4 per thousand requests. |

You can check out the full documentation here.
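If you are not sure which level a site needs, one practical pattern is to start with the cheapest bypass and only escalate on failure. Here is a minimal sketch using the bypass values from the table above; the 200-check is an assumption, so consult the docs for the exact responses the API returns when a bypass fails:

import requests
from urllib.parse import urlencode

API_KEY = 'YOUR_API_KEY'

# Documented bypass values, ordered from cheapest to most expensive.
BYPASS_LEVELS = ['cloudflare_level_1', 'cloudflare_level_2', 'cloudflare_level_3']

def fetch_with_escalating_bypass(url):
    """Try each Cloudflare bypass level in turn until a request succeeds."""
    r = None
    for level in BYPASS_LEVELS:
        payload = {'api_key': API_KEY, 'url': url, 'bypass': level}
        r = requests.get('https://proxy.scrapeops.io/v1/?' + urlencode(payload))
        if r.status_code == 200:  # assumption: a failed bypass returns a non-200 status
            break
    return r

r = fetch_with_escalating_bypass('http://example.com/')
print(r.status_code)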

Or if you would prefer to try to optimize your user-agent, headers and proxy configuration yourself then read on and we will explain how to do it.


Use Fake User Agents

The most common reason for a website to block a web scraper and return a 403 error is that you are telling the website you are a scraper in the user-agent you send with your requests.

By default, most HTTP libraries (Python Requests, Scrapy, NodeJS Axios, etc.) either don't attach real browser headers to your requests or include headers that identify the library being used. Both immediately tell the website you are trying to scrape that you are a scraper, not a real user.

For example, let's send a request to http://httpbin.org/headers with the Python Requests library using the default setting:


import requests

r = requests.get('http://httpbin.org/headers')
print(r.text)

You will get a response like this that shows what headers we sent to the website:


{
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.26.0"
  }
}

Here we can see that our request using the Python Requests library appends very few headers to the request, and even identifies itself as the Python Requests library in the User-Agent header.


"User-Agent": "python-requests/2.26.0",

This tells the website that your requests are coming from a scraper, so it is very easy for them to block your requests and return a 403 status code.

Solution

The solution to this problem is to configure your scraper to send a fake user-agent with every request. This way it is harder for the website to tell if your requests are coming from a scraper or a real user.

Here is how you would send a fake user agent when making a request with Python Requests.


import requests

HEADERS = {'User-Agent': 'Mozilla/5.0 (iPad; CPU OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148'}

r = requests.get('http://quotes.toscrape.com/page/1/', headers=HEADERS)
print(r.text)

Here we are making our request look like it is coming from an iPad, which will increase the chances of the request getting through.

This will only work for relatively small scrapes; if you use the same user-agent on every single request, a website with a more sophisticated anti-bot solution can still easily detect your scraper.

To solve this when scraping at scale, we need to maintain a large list of user-agents and pick a different one for each request.


import requests
import random

user_agents_list = [
    'Mozilla/5.0 (iPad; CPU OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.83 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36'
]

r = requests.get('http://quotes.toscrape.com/page/1/', headers={'User-Agent': random.choice(user_agents_list)})
print(r.text)

Now, a random user-agent will be picked for each request we make.
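If you would rather not curate this list by hand, a library such as fake-useragent can supply current real-world browser strings. A minimal sketch, assuming the package is installed (its string pool and API can change between releases):

# pip install fake-useragent
import requests
from fake_useragent import UserAgent

ua = UserAgent()  # maintains a pool of real-world browser user-agent strings

r = requests.get(
    'http://quotes.toscrape.com/page/1/',
    headers={'User-Agent': ua.random},  # a different random browser UA on each access
)
print(r.status_code)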


Optimize Request Headers

In a lot of cases, just adding fake user-agents to your requests will solve the 403 Forbidden Error. However, if the website has a more sophisticated anti-bot detection system in place, you will also need to optimize the request headers.

By default, most HTTP clients will only send basic request headers along with your requests such as Accept, Accept-Language, and User-Agent.


Accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
Accept-Language: 'en'
User-Agent: 'python-requests/2.26.0'

In contrast, here are the request headers a Chrome browser running on a macOS machine would send:


Connection: 'keep-alive'
Cache-Control: 'max-age=0'
sec-ch-ua: '" Not A;Brand";v="99", "Chromium";v="99", "Google Chrome";v="99"'
sec-ch-ua-mobile: '?0'
sec-ch-ua-platform: '"macOS"'
Upgrade-Insecure-Requests: '1'
User-Agent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.83 Safari/537.36'
Accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'
Sec-Fetch-Site: 'none'
Sec-Fetch-Mode: 'navigate'
Sec-Fetch-User: '?1'
Sec-Fetch-Dest: 'document'
Accept-Encoding: 'gzip, deflate, br'
Accept-Language: 'en-GB,en-US;q=0.9,en;q=0.8'

If the website is really trying to prevent web scrapers from accessing its content, then it will be analysing the request headers to make sure the other headers match the user-agent you set, and that the request includes the other common headers a real browser would send.

Solution

To solve this, we need to make sure we optimize the request headers, including making sure the fake user-agent is consistent with the other headers.

This is a big topic, so if you would like to learn more about header optimization then check out our guide to header optimization.

However, to summarize, we don't just want to send a fake user-agent when making a request but the full set of headers web browsers normally send when visiting websites.

Here is a quick example of adding optimized headers to our requests:


import requests

HEADERS = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:98.0) Gecko/20100101 Firefox/98.0",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
"Accept-Language": "en-US,en;q=0.5",
"Accept-Encoding": "gzip, deflate",
"Connection": "keep-alive",
"Upgrade-Insecure-Requests": "1",
"Sec-Fetch-Dest": "document",
"Sec-Fetch-Mode": "navigate",
"Sec-Fetch-Site": "none",
"Sec-Fetch-User": "?1",
"Cache-Control": "max-age=0",
}

r = requests.get('http://quotes.toscrape.com/page/1/', headers=HEADERS)
print(r.text)

Here we are adding the same optimized header set with a fake user-agent to every request. However, when scraping at scale you will need a list of these optimized headers and to rotate through them, as sketched below.
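Here is a rough sketch of that rotation. The two profiles below reuse the example header values from this guide and are purely illustrative; in practice you would maintain many more and keep each one internally consistent:

import random

import requests

# Each profile keeps a User-Agent together with the companion headers a real
# browser of that type would send, so the set stays internally consistent.
header_profiles = [
    {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:98.0) Gecko/20100101 Firefox/98.0",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.5",
        "Accept-Encoding": "gzip, deflate",
        "Connection": "keep-alive",
        "Upgrade-Insecure-Requests": "1",
    },
    {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.83 Safari/537.36",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8",
        "Accept-Language": "en-GB,en-US;q=0.9,en;q=0.8",
        "Accept-Encoding": "gzip, deflate, br",
        "Connection": "keep-alive",
        "Upgrade-Insecure-Requests": "1",
    },
]

r = requests.get(
    'http://quotes.toscrape.com/page/1/',
    headers=random.choice(header_profiles),  # pick a different full profile per request
)
print(r.status_code)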


Match The Browser TLS Fingerprint (JA3 / JA4)

If you've done everything above — set a current User-Agent, sent the full browser header set, used good proxies — and you are still getting 403s from Cloudflare or DataDome, the problem is almost always your TLS fingerprint.

Every HTTPS client (Python Requests, Node Axios, Go net/http, Java HttpClient) negotiates the TLS handshake slightly differently from a real browser. The exact cipher suite list, the order of TLS extensions, the HTTP/2 settings frame and the way it handles HTTP/2 pseudo-headers all combine into a fingerprint — typically expressed as a JA3 or JA4 hash. Cloudflare, DataDome, Akamai Bot Manager and PerimeterX all compare incoming JA3/JA4 hashes against known-browser values and return 403 when they don't match — regardless of how perfect your headers look.

You can verify this is your problem by sending a request to a TLS fingerprint debugging service and checking the JA3 hash it reports back.

If the JA3 hash you see is something like 8d9989e9d52b73c5c2e1c9f2dd47ce69 (the canonical Python Requests fingerprint at the time of writing), you've found your problem. Real Chrome on macOS produces a completely different hash, and that's what Cloudflare expects.

Solution

The fix is to use an HTTP client that impersonates a real browser at the TLS layer, not just at the header layer. The three most widely used options in 2026 are:

| Language | Library | What it does |
|----------|---------|--------------|
| Python | curl_cffi | Drop-in replacement for requests that uses libcurl with browser-impersonating TLS. Pass impersonate="chrome124" (or similar) to match a specific browser version. |
| Python | hrequests | Wraps a Go TLS client (originally tls-client) with a requests-like API. |
| Go / Node | tls-client | Underlying Go library used by many other tools. |

Here is the minimal Python fix using curl_cffi:


# pip install curl_cffi
from curl_cffi import requests

# `impersonate` makes the TLS handshake, HTTP/2 settings frame and
# default header order all match a real Chrome 124 install.
r = requests.get(
    "https://nowsecure.nl/",
    impersonate="chrome124",
)
print(r.status_code, len(r.text))

For most Cloudflare-protected sites this alone is enough to get a 200 response. For DataDome-protected sites, you usually need TLS impersonation plus residential proxies — DataDome scores IP reputation aggressively.
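If you want to sketch that combination yourself, curl_cffi accepts the same proxies mapping as Python Requests. The proxy endpoint and credentials below are placeholders for whatever your residential provider gives you:

# pip install curl_cffi
from curl_cffi import requests

# Placeholder residential proxy endpoint -- substitute your provider's credentials.
proxies = {
    "http": "http://USERNAME:PASSWORD@residential-proxy.example.com:8000",
    "https": "http://USERNAME:PASSWORD@residential-proxy.example.com:8000",
}

# Browser-grade TLS fingerprint plus a residential exit IP.
r = requests.get(
    "https://example.com/",
    impersonate="chrome124",
    proxies=proxies,
)
print(r.status_code)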

When to skip this step entirely

Maintaining TLS impersonation across Chrome versions is real engineering work — browser fingerprints change with every Chrome release and the libraries above ship updates a few weeks behind. If you don't want to maintain that yourself, the ScrapeOps Proxy Aggregator handles TLS fingerprinting transparently when you set the bypass parameter — see the Quick reference table above and the Easy Way section.


Use Rotating Proxies

If the above solutions don't work then it is highly likely that the server has flagged your IP address as being used by a scraper and is either throttling your requests or completely blocking them.

This is especially likely if you are scraping at larger volumes, as it is easy for websites to detect scrapers if they are getting an unnaturally large amount of requests from the same IP address.

Solution

You will need to send your requests through a rotating proxy pool.

Here is how you could do it with Python Requests:


import requests
from itertools import cycle

list_proxy = [
    'http://Username:Password@IP1:20000',
    'http://Username:Password@IP2:20000',
    'http://Username:Password@IP3:20000',
    'http://Username:Password@IP4:20000',
]

proxy_cycle = cycle(list_proxy)

for i in range(1, 10):
    proxy = next(proxy_cycle)
    print(proxy)
    proxies = {
        "http": proxy,
        "https": proxy,
    }
    r = requests.get(url='http://quotes.toscrape.com/page/1/', proxies=proxies)
    print(r.text)

Now, your request will be routed through a different proxy with each request.

You will also need to incorporate the rotating user-agents we showed previously; otherwise, even when using a proxy, we would still be telling the website that our requests come from a scraper, not a real user. A combined sketch follows below.
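Putting both together, here is a minimal sketch that rotates the proxy and the user-agent on every request (the proxy endpoints and user-agent strings are the same placeholders used earlier):

import random
from itertools import cycle

import requests

user_agents_list = [
    'Mozilla/5.0 (iPad; CPU OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.83 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36',
]

proxy_cycle = cycle([
    'http://Username:Password@IP1:20000',
    'http://Username:Password@IP2:20000',
])

for page in range(1, 10):
    proxy = next(proxy_cycle)                                   # rotate the exit IP
    headers = {'User-Agent': random.choice(user_agents_list)}   # rotate the user-agent
    r = requests.get(
        f'http://quotes.toscrape.com/page/{page}/',
        headers=headers,
        proxies={'http': proxy, 'https': proxy},
    )
    print(page, r.status_code)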

If you need help finding the best & cheapest proxies for your particular use case then check out our proxy comparison tool here.

Alternatively, you could just use the ScrapeOps Proxy Aggregator as we discussed previously.


Framework-Specific Guides

The general guidance above applies to every HTTP client. If you are using a specific framework, these focused guides go into the exact code changes needed:


Frequently Asked Questions

What causes a 403 Forbidden error when web scraping?

Almost always it means the website has detected that the request is automated and refused to serve the page. The most common triggers are a non-browser User-Agent, an incomplete header set, an IP address that's been rate-limited or blacklisted, and a TLS/JA3 fingerprint that doesn't match a real browser — which is why default Python Requests calls are routinely blocked by Cloudflare and DataDome even when the headers look correct.

How do I fix a 403 Forbidden error when web scraping?

Work through four layers in order: (1) set a realistic browser User-Agent, (2) send the full set of browser headers, (3) route through rotating residential or mobile proxies, and (4) if you're still seeing 403s from sites behind Cloudflare or DataDome, switch from `requests` to `curl_cffi` (or use a smart proxy API that handles the TLS fingerprint for you).

Does a 403 error mean the website is blocking my scraper?

Usually yes. A 403 Forbidden response in web scraping means the server understood the request but is refusing to authorize it. If the URL works in a normal browser, the site is blocking your scraper specifically. Genuine permission-based 403s — where the resource truly requires login — are less common and you'd normally see them in a browser too.

What is the difference between a 401 and a 403 error?

A 401 Unauthorized means the server needs you to authenticate. A 403 Forbidden means the server has identified the requester (with or without credentials) and refused access anyway. In scraping you almost always see 403 — the site has identified your client as a bot and refuses regardless of who you say you are.

Why does Python Requests get a 403 when the same page works in my browser?

Because the website is comparing dozens of signals — User-Agent, header order, TLS handshake, HTTP/2 settings frame, IP reputation — and Python Requests gives off a different fingerprint than Chrome on every one of them. Closing the gap means setting real browser headers in the right order, matching the TLS fingerprint with curl_cffi or a TLS-impersonating library, and proxying through a residential IP.

How do I avoid 403 errors when scraping at scale?

Rotate everything: User-Agent, full header set, proxy IP, and ideally TLS fingerprint. Keep request volume per IP below the site's rate-limit threshold, respect Retry-After when present, exponentially back off on 403/429 responses, and use a smart proxy aggregator for sites with active bot-management products like Cloudflare, DataDome, PerimeterX, or Akamai.


More Web Scraping Tutorials

So that's how you can solve 403 Forbidden Errors when you get them.

If you would like to know more about bypassing the most common anti-bots then check out our bypass guides here:

Or if you would like to learn more about Web Scraping, then be sure to check out The Web Scraping Playbook.

Or check out one of our more in-depth guides: