NodeJS Fake User-Agents: How to Manage User Agents When Scraping
To use fake user-agents with a NodeJS HTTP client, you just need to set a User-Agent header in your request options. Here is how to do it with each of the most popular clients:
- Node-Fetch
- Axios
- SuperAgent
- Got
- Request Promise
With Node-Fetch, you just need to define a user-agent in a headers object and pass it into the headers attribute in your request options.
import fetch from 'node-fetch';
(async () => {
const url = 'http://httpbin.org/headers'
const options = {
method: 'GET',
headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36'
}
}
try{
const response = await fetch(url, options)
const data = await response.json()
console.log(data);
} catch (error){
console.log('error', error)
}
})();
With NodeJS Axios, you just need to define a user-agent in a headers object and pass it into the headers attribute in your axios options.
const axios = require('axios');
(async () => {
const url = 'http://httpbin.org/headers'
const options = {
headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36'
}
}
try {
const response = await axios.get(url, options)
console.log(response.data);
} catch (error){
console.log('error', error)
}
})();
With NodeJS SuperAgent, you first need to define a user-agent in a headers object. Then you simply call the set method on the request with the headers object.
const request = require('superagent');
(async () => {
const url = 'http://httpbin.org/headers';
const headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36'
}
try{
const response = await request.get(url)
.set(headers)
console.log(response.body);
} catch (error){
console.log('error', error);
}
})();
With NodeJS Got, you just need to define a user-agent in a headers object and pass it into the headers attribute in your got request options.
import got from 'got';
(async () => {
const url = 'https://httpbin.org/headers';
const options = {
headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36'
}
};
try {
const response = await got.get(url, options);
console.log(response.body);
} catch (error) {
console.error(error);
}
})();
With NodeJS Request Promise, you just need to define a user-agent in a headers object and pass it into the headers attribute in your request options.
import request from 'request-promise';
(async () => {
const options = {
method: 'GET',
url: 'http://httpbin.org/headers',
headers: {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36'}
}
try{
const response = await request(options)
console.log(response);
} catch (error){
console.log('error', error)
}
})();
One of the most common reasons for getting blocked whilst web scraping is using bad user-agents.
However, integrating fake user-agents into your NodeJS web scrapers is very easy.
So in this guide, we will go through:
- What Are Fake User-Agents?
- How To Set A Fake User-Agent
- How To Rotate User-Agents
- How To Manage Thousands of Fake User-Agents
- Why Use Fake Browser Headers
- ScrapeOps Fake Browser Headers API
First, let's quickly go over some of the very basics.
Need help scraping the web?
Then check out ScrapeOps, the complete toolkit for web scraping.
What Are Fake User-Agents?
User Agents are strings that let the website you are scraping identify the application, operating system (OSX/Windows/Linux), browser (Chrome/Firefox/Internet Explorer), etc. of the user sending a request to their website. They are sent to the server as part of the request headers.
Here is an example User agent sent when you visit a website with a Chrome browser:
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.82 Safari/537.36'
When scraping a website, you also need to set user-agents on every request as otherwise the website may block your requests because it knows you aren't a real user.
In the case of most NodeJS HTTP clients like Node-Fetch, SuperAgent, Got, Axios, and Request-Promise, their default settings either clearly identify the library making the request in the user-agent string or send no realistic browser user-agent at all.
How each popular Node.js HTTP client behaves out-of-the-box is different, so use the tabs below to see the default header each one sends (or doesn’t send) and why you need to override it.
- Node-Fetch
- Axios
- SuperAgent
- Got
- Request-Promise
"User-Agent": "node-fetch"
This user-agent clearly identifies your requests as being made by the Node-Fetch library, so the website can easily block you from scraping the site.
That is why we need to manage the user-agents our NodeJS HTTP clients send with our requests. (A quick way to check what your own client sends by default is sketched just after this list of defaults.)
'User-Agent': 'axios/<version>',
Axios identifies itself in the user-agent by default, which is just as suspicious - real browsers never send a library name in this field.
To avoid easy detection you need to supply (and regularly rotate) a realistic UA string when using Axios.
'User-Agent': '',
Sites flag blank headers as bot-like, so be sure to set a proper UA - and rotate it - on every SuperAgent request.
'user-agent': 'got (https://github.com/sindresorhus/got)',
This default instantly tips websites off that the traffic isn't from a normal browser session.
Override the header (and rotate) when scraping with Got.
'User-Agent': '',
Again, this is a red flag for most sites' anti-bot systems.
Always provide a realistic, rotating UA string when working with Request-Promise.
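If you want to confirm what your own HTTP client sends by default, a quick check (sketched here with Node-Fetch, but the same idea works with any of the clients above) is to request http://httpbin.org/headers without setting any headers and inspect the echoed values:

```javascript
import fetch from 'node-fetch';

(async () => {
  try {
    // No custom headers are set, so httpbin echoes back whatever the client sends by default.
    const response = await fetch('http://httpbin.org/headers');
    const data = await response.json();
    console.log(data.headers['User-Agent']); // the library's default user-agent, if it sets one
  } catch (error) {
    console.log('error', error);
  }
})();
```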
How To Set A Fake User-Agent
Setting a fake user-agent is very easy.
- Node-Fetch
- Axios
- SuperAgent
- Got
- Request Promise
import fetch from 'node-fetch';
(async () => {
const url = 'http://httpbin.org/headers'
const options = {
method: 'GET',
headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36'
}
}
try{
const response = await fetch(url, options)
const data = await response.json()
console.log(data);
} catch (error){
console.log('error', error)
}
})();
From here Node-Fetch will use the above user-agent to make the request. All request headers, including User-Agent, are reflected in the response JSON, which is loaded with the response.json method.
Like Request-Promise and Node-Fetch, setting Axios to use a fake user-agent just requires us to create an options object and include a user-agent in the headers parameter, then add this options object to the get request.
const axios = require('axios');
(async () => {
const url = 'http://httpbin.org/headers'
const options = {
headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36'
}
}
try{
const response = await axios.get(url, options)
console.log(response.data);
} catch (error){
console.log('error', error)
}
})();
From here Axios will use the above user-agent to make the request.
With SuperAgent, you just need to define a user-agent in a headers object and pass it to the set method call.
const request = require('superagent');
(async () => {
const url = 'http://httpbin.org/headers';
const headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36'
}
try{
const response = await request.get(url)
.set(headers)
console.log(response.body);
} catch (error){
console.log('error', error);
}
})();
From here SuperAgent will use the above user-agent to make the request. All request headers, including User-Agent, are reflected in the response JSON, which can be accessed with response.body.
With Got, you just need to define a user-agent in a headers object and pass it into the headers attribute in your request options.
import got from 'got';
(async () => {
const url = 'https://httpbin.org/headers';
const options = {
headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36'
}
};
try {
const response = await got.get(url, options);
console.log(response.body);
} catch (error) {
console.error(error);
}
})();
From here Got will use the above user-agent to make the request. All request headers, including User-Agent, are reflected in the response JSON, which can be accessed with response.body.
With NodeJS Request-Promise, you just need to define a user-agent in a headers object and pass it into the headers attribute in your request options.
import request from 'request-promise';
(async () => {
const options = {
method: 'GET',
url: 'http://httpbin.org/headers',
headers: {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36'}
}
try{
const response = await request(options)
console.log(response);
} catch (error){
console.log('error', error)
}
})();
From here Request-Promise will use the above user-agent to make the request.
How To Rotate User-Agents
Rotating through user-agents is also pretty straightforward when using NodeJS libraries. We just need a list of user-agents in our scraper and use a random one with every request.
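Before looking at the per-client examples, note that the core of the rotation logic is nothing more than keeping an array of user-agent strings and picking a random index before each request. Here is a minimal sketch (the getRandomUserAgent helper name is just for illustration):

```javascript
// A small pool of user-agent strings to rotate through.
const userAgents = [
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36',
  'Mozilla/5.0 (iPhone; CPU iPhone OS 14_4_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1'
];

// Return a random user-agent from the pool for each request.
function getRandomUserAgent(list) {
  return list[Math.floor(Math.random() * list.length)];
}

console.log(getRandomUserAgent(userAgents));
```

The examples below apply the same idea with each HTTP client: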
- Node-Fetch
- Axios
- SuperAgent
- Got
- Request Promise
import fetch from 'node-fetch';
const userAgents = [
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36',
'Mozilla/5.0 (iPhone; CPU iPhone OS 14_4_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1',
'Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36 Edg/87.0.664.75',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18363'
];
const url = 'http://httpbin.org/headers'
const headers = {
'User-Agent': userAgents[Math.floor(Math.random() * userAgents.length)] // pick a random user-agent from the list
};
const options = {
method: 'GET',
headers: headers
};
fetch(url, options)
.then(async response => {
console.log(await response.json());
})
.catch(error => {
console.error(error);
});
const axios = require('axios');
const userAgents = [
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36',
'Mozilla/5.0 (iPhone; CPU iPhone OS 14_4_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1',
'Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36 Edg/87.0.664.75',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18363'
];
const url = 'http://httpbin.org/headers';
const options = {
headers: {
'User-Agent': userAgents[Math.floor(Math.random() * userAgents.length)] // pick a random user-agent from the list
}
};
(async () => {
try {
const response = await axios.get(url, options);
console.log(response.data);
} catch(error) {
console.error('error', error);
};
})();
const request = require('superagent');
const userAgents = [
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36',
'Mozilla/5.0 (iPhone; CPU iPhone OS 14_4_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1',
'Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36 Edg/87.0.664.75',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18363'
];
const url = 'http://httpbin.org/headers'
const headers = {
'User-Agent': userAgents[Math.floor(Math.random() * userAgents.length)] // pick a random user-agent from the list
};
request.get(url)
.set(headers)
.then(async response => {
console.log(await response.body);
})
.catch(error => {
console.error(error);
});
import got from 'got';
const userAgents = [
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36',
'Mozilla/5.0 (iPhone; CPU iPhone OS 14_4_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1',
'Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36 Edg/87.0.664.75',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18363'
];
const url = 'http://httpbin.org/headers';
const options = {
headers: {
'User-Agent': userAgents[Math.floor(Math.random() * userAgents.length)] // pick a random user-agent from the list
}
};
got.get(url, options)
.then(response => {
console.log(response.body);
})
.catch(error => {
console.error(error);
});
const rp = require('request-promise');
const userAgents = [
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36',
'Mozilla/5.0 (iPhone; CPU iPhone OS 14_4_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1',
'Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36 Edg/87.0.664.75',
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18363'
];
const headers = {
'User-Agent': userAgents[Math.floor(Math.random() * userAgents.length)] // pick a random user-agent from the list
};
const options = {
uri: 'http://httpbin.org/headers',
headers: headers,
json: true
};
rp(options)
.then(response => {
console.log(response);
})
.catch(error => {
console.error(error);
});
This works but it has drawbacks: we would need to build and keep an up-to-date list of user-agents ourselves.
How To Manage Thousands of Fake User-Agents
A better approach would be to use a free user-agent API like ScrapeOps Fake User-Agent API to download an up-to-date user-agent list when your scraper starts up and then pick a random user-agent for each request.
To use the ScrapeOps Fake User-Agents API you just need to send a request to the API endpoint to retrieve a list of user-agents.
http://headers.scrapeops.io/v1/user-agents?api_key=YOUR_API_KEY
To use the ScrapeOps Fake User-Agent API, you first need an API key which you can get by signing up for a free account here.
Example response from the API:
{
"result": [
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.5 Safari/605.1.15",
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.53 Safari/537.36",
"Mozilla/5.0 (Windows NT 10.0; Windows; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.114 Safari/537.36",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/603.3.8 (KHTML, like Gecko) Version/10.1.2 Safari/603.3.8",
"Mozilla/5.0 (Windows NT 10.0; Windows; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.114 Safari/537.36",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Safari/605.1.15",
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.53 Safari/537.36",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Safari/605.1.15",
"Mozilla/5.0 (Windows NT 10.0; Windows; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.114 Safari/537.36",
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.53 Safari/537.36"
]
}
To integrate the Fake User-Agent API you should configure your scraper to retrieve a batch of the most up-to-date user-agents when the scraper starts and then configure your scraper to pick a random user-agent from this list for each request.
Here are examples of different scraper integrations:
- Node-Fetch
- Axios
- SuperAgent
- Got
- Request Promise
import fetch from 'node-fetch';
const SCRAPEOPS_API_KEY = 'YOUR_API_KEY';
const url = `http://headers.scrapeops.io/v1/user-agents?api_key=${SCRAPEOPS_API_KEY}`
async function getUserAgentList() {
const response = await fetch(url);
const data = await response.json()
return data.result || [];
}
function getRandomUserAgent(userAgentList) {
const randomIndex = Math.floor(Math.random() * userAgentList.length);
return userAgentList[randomIndex];
}
const urlList = [
'https://jsonplaceholder.typicode.com/posts/1',
'https://jsonplaceholder.typicode.com/posts/2',
'https://jsonplaceholder.typicode.com/posts/3'
];
(async () => {
try {
const userAgentList = await getUserAgentList();
for (const url of urlList) {
const headers = {
'User-Agent': getRandomUserAgent(userAgentList)
};
const options = {
headers: headers
};
const response = await fetch(url, options);
const data = await response.json()
console.log(data);
}
} catch (error) {
console.error(error);
}
})();
const axios = require('axios');
const SCRAPEOPS_API_KEY = 'YOUR_API_KEY';
async function getUserAgentList() {
const url = `http://headers.scrapeops.io/v1/user-agents?api_key=${SCRAPEOPS_API_KEY}`
const response = await axios.get(url);
return response.data.result || [];
}
function getRandomUserAgent(userAgentList) {
const randomIndex = Math.floor(Math.random() * userAgentList.length);
return userAgentList[randomIndex];
}
const urlList = [
'https://jsonplaceholder.typicode.com/posts/1',
'https://jsonplaceholder.typicode.com/posts/2',
'https://jsonplaceholder.typicode.com/posts/3'
];
(async () => {
try {
const userAgentList = await getUserAgentList();
for (const url of urlList) {
const headers = {
'User-Agent': getRandomUserAgent(userAgentList)
};
const options = {
headers: headers
};
const response = await axios.get(url, options);
console.log(response.data);
}
} catch (error) {
console.error('error', error);
}
})();
const request = require('superagent');
const SCRAPEOPS_API_KEY = 'YOUR_API_KEY';
const url = `http://headers.scrapeops.io/v1/user-agents?api_key=${SCRAPEOPS_API_KEY}`
async function getUserAgentList() {
const response = await request.get(url);
const data = response.body
return data.result || [];
}
function getRandomUserAgent(userAgentList) {
const randomIndex = Math.floor(Math.random() * userAgentList.length);
return userAgentList[randomIndex];
}
const urlList = [
'https://jsonplaceholder.typicode.com/posts/1',
'https://jsonplaceholder.typicode.com/posts/2',
'https://jsonplaceholder.typicode.com/posts/3'
];
(async () => {
try {
const userAgentList = await getUserAgentList();
for (const url of urlList) {
const headers = {
'User-Agent': getRandomUserAgent(userAgentList)
};
const response = await request.get(url)
.set(headers);
console.log(response.body);
}
} catch (error) {
console.error(error);
}
})();
import got from 'got';
const SCRAPEOPS_API_KEY = 'YOUR_API_KEY';
const url = `http://headers.scrapeops.io/v1/user-agents?api_key=${SCRAPEOPS_API_KEY}`
async function getUserAgentList() {
const data = await got.get(url).json(); // parse the JSON response body
return data.result || [];
}
function getRandomUserAgent(userAgentList) {
const randomIndex = Math.floor(Math.random() * userAgentList.length);
return userAgentList[randomIndex];
}
const urlList = [
'https://jsonplaceholder.typicode.com/posts/1',
'https://jsonplaceholder.typicode.com/posts/2',
'https://jsonplaceholder.typicode.com/posts/3'
];
(async () => {
try {
const userAgentList = await getUserAgentList();
for (const url of urlList) {
const options = {
headers: {
'User-Agent': getRandomUserAgent(userAgentList)
}
}
const response = await got.get(url, options);
console.log(response.body);
}
} catch (error) {
console.error(error);
}
})();
const rp = require('request-promise');
const SCRAPEOPS_API_KEY = 'YOUR_API_KEY';
async function getUserAgentList() {
const options = {
uri: `http://headers.scrapeops.io/v1/user-agents?api_key=${SCRAPEOPS_API_KEY}`,
json: true
};
const response = await rp(options);
return response.result || [];
}
function getRandomUserAgent(userAgentList) {
const randomIndex = Math.floor(Math.random() * userAgentList.length);
return userAgentList[randomIndex];
}
const urlList = [
'https://example.com/1',
'https://example.com/2',
'https://example.com/3',
];
(async () => {
try {
const userAgentList = await getUserAgentList();
for (const url of urlList) {
const headers = {
'User-Agent': getRandomUserAgent(userAgentList)
};
const options = {
uri: url,
headers: headers
};
const response = await rp(options);
console.log(response);
}
} catch (error) {
console.error(error);
}
})();
Here the scraper will use a random user-agent for each request.
Why Use Fake Browser Headers
For simple websites, setting an up-to-date user-agent should allow you to scrape pretty reliably.
However, a lot of popular websites are increasingly using sophisticated anti-bot technologies to try and prevent developers from scraping data from their websites.
These anti-bot solutions not only look at your request's user-agent when analysing the request, but also the other headers a real browser normally sends.
By using a full set of browser headers you make your requests look more like real user requests, and as a result harder to detect.
Here are example headers when using a Chrome browser on a MacOS machine:
sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="99", "Google Chrome";v="99"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "macOS"
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.83 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Sec-Fetch-Site: none
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Accept-Encoding: gzip, deflate, br
Accept-Language: en-GB,en-US;q=0.9,en;q=0.8
As we can see, real browsers don't just send User-Agent strings but also a number of other headers that are used to identify and customize the request.
So to improve the reliability of our scrapers we should also include these headers when making requests.
You could build a list of fake browser headers yourself, or you could use the ScrapeOps Fake Browser Headers API to get an up-to-date list every time your scraper starts up.
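For example, here is a rough sketch (using Axios, with the header values copied from the Chrome example above) of attaching a full browser header set to a request:

```javascript
const axios = require('axios');

// Full Chrome-on-macOS header set, copied from the example above.
const browserHeaders = {
  'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.83 Safari/537.36',
  'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
  'Accept-Encoding': 'gzip, deflate, br',
  'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8',
  'Upgrade-Insecure-Requests': '1',
  'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="99", "Google Chrome";v="99"',
  'sec-ch-ua-mobile': '?0',
  'sec-ch-ua-platform': '"macOS"',
  'Sec-Fetch-Site': 'none',
  'Sec-Fetch-Mode': 'navigate',
  'Sec-Fetch-User': '?1',
  'Sec-Fetch-Dest': 'document'
};

(async () => {
  try {
    // httpbin echoes the headers it received, so we can confirm what was sent.
    const response = await axios.get('http://httpbin.org/headers', { headers: browserHeaders });
    console.log(response.data);
  } catch (error) {
    console.log('error', error);
  }
})();
```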
ScrapeOps Fake Browser Headers API
The ScrapeOps Fake Browser Headers API is a free API that returns a list of optimized fake browser headers that you can use in your web scrapers to avoid blocks/bans and improve the reliability of your scrapers.
API Endpoint:
http://headers.scrapeops.io/v1/browser-headers?api_key=YOUR_API_KEY
Response:
{
"result": [
{
"upgrade-insecure-requests": "1",
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Windows; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.114 Safari/537.36",
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
"sec-ch-ua": "\".Not/A)Brand\";v=\"99\", \"Google Chrome\";v=\"103\", \"Chromium\";v=\"103\"",
"sec-ch-ua-mobile": "?0",
"sec-ch-ua-platform": "\"Windows\"",
"sec-fetch-site": "none",
"sec-fetch-mod": "",
"sec-fetch-user": "?1",
"accept-encoding": "gzip, deflate, br",
"accept-language": "bg-BG,bg;q=0.9,en-US;q=0.8,en;q=0.7"
},
{
"upgrade-insecure-requests": "1",
"user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.53 Safari/537.36",
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
"sec-ch-ua": "\".Not/A)Brand\";v=\"99\", \"Google Chrome\";v=\"103\", \"Chromium\";v=\"103\"",
"sec-ch-ua-mobile": "?0",
"sec-ch-ua-platform": "\"Linux\"",
"sec-fetch-site": "none",
"sec-fetch-mod": "",
"sec-fetch-user": "?1",
"accept-encoding": "gzip, deflate, br",
"accept-language": "fr-CH,fr;q=0.9,en-US;q=0.8,en;q=0.7"
}
]
}
To use the ScrapeOps Fake Browser Headers API, you first need an API key which you can get by signing up for a free account here.
To integrate the Fake Browser Headers API you should configure your scraper to retrieve a batch of the most up-to-date headers when the scraper starts and then configure your scraper to pick a random header from this list for each request.
Here are examples of different scraper integrations:
- Node-Fetch
- Axios
- SuperAgent
- Got
- Request Promise
import fetch from 'node-fetch';
const SCRAPEOPS_API_KEY = 'YOUR_API_KEY';
const url = `http://headers.scrapeops.io/v1/browser-headers?api_key=${SCRAPEOPS_API_KEY}`
async function getHeadersList() {
const response = await fetch(url);
const data = await response.json()
return data.result || [];
}
function getRandomHeader(headerList) {
const randomIndex = Math.floor(Math.random() * headerList.length);
return headerList[randomIndex];
}
const urlList = [
'https://jsonplaceholder.typicode.com/posts/1',
'https://jsonplaceholder.typicode.com/posts/2',
'https://jsonplaceholder.typicode.com/posts/3'
];
(async () => {
try {
const headerList = await getHeadersList();
for (const url of urlList) {
const headers = getRandomHeader(headerList);
const options = {
headers: headers
};
const response = await fetch(url, options);
const data = await response.json()
console.log(data);
}
} catch (error) {
console.error(error);
}
})();
const axios = require('axios');
const SCRAPEOPS_API_KEY = 'YOUR_API_KEY';
async function getHeadersList() {
const url = `http://headers.scrapeops.io/v1/browser-headers?api_key=${SCRAPEOPS_API_KEY}`
const response = await axios.get(url);
return response.data.result || [];
}
function getRandomHeader(headerList) {
const randomIndex = Math.floor(Math.random() * headerList.length);
return headerList[randomIndex];
}
const urlList = [
'https://jsonplaceholder.typicode.com/posts/1',
'https://jsonplaceholder.typicode.com/posts/2',
'https://jsonplaceholder.typicode.com/posts/3'
];
(async () => {
try {
const headerList = await getHeadersList();
for (const url of urlList) {
const headers = getRandomHeader(headerList);
const options = {
headers: headers
};
const response = await axios.get(url, options);
console.log(response.data);
}
} catch (error) {
console.error(error);
}
})();
const request = require('superagent');
const SCRAPEOPS_API_KEY = 'YOUR_API_KEY';
const url = `http://headers.scrapeops.io/v1/browser-headers?api_key=${SCRAPEOPS_API_KEY}`;
async function getHeadersList() {
const response = await request.get(url);
return response.body.result || [];
}
function getRandomHeader(headerList) {
const randomIndex = Math.floor(Math.random() * headerList.length);
return headerList[randomIndex];
}
const urlList = [
'https://jsonplaceholder.typicode.com/posts/1',
'https://jsonplaceholder.typicode.com/posts/2',
'https://jsonplaceholder.typicode.com/posts/3'
];
(async () => {
try {
const headerList = await getHeadersList();
for (const url of urlList) {
const headers = getRandomHeader(headerList);
const response = await request.get(url)
.set(headers);
console.log(response.body);
}
} catch (error) {
console.error(error);
}
})();
import got from 'got';
const SCRAPEOPS_API_KEY = 'YOUR_API_KEY';
const url = `http://headers.scrapeops.io/v1/browser-headers?api_key=${SCRAPEOPS_API_KEY}`;
async function getHeadersList() {
const data = await got.get(url).json(); // parse the JSON response body
return data.result || [];
}
function getRandomHeader(headerList) {
const randomIndex = Math.floor(Math.random() * headerList.length);
return headerList[randomIndex];
}
const urlList = [
'https://jsonplaceholder.typicode.com/posts/1',
'https://jsonplaceholder.typicode.com/posts/2',
'https://jsonplaceholder.typicode.com/posts/3'
];
(async () => {
try {
const headerList = await getHeadersList();
for (const url of urlList) {
const options = {
headers: getRandomHeader(headerList)
};
const response = await got.get(url, options);
console.log(response.body);
}
} catch (error) {
console.error(error);
}
})();
const rp = require('request-promise');
const SCRAPEOPS_API_KEY = 'YOUR_API_KEY';
async function getHeadersList() {
const options = {
uri: `http://headers.scrapeops.io/v1/browser-headers?api_key=${SCRAPEOPS_API_KEY}`,
json: true
};
const response = await rp(options);
return response.result || [];
}
function getRandomHeader(headerList) {
const randomIndex = Math.floor(Math.random() * headerList.length);
return headerList[randomIndex];
}
const urlList = [
'https://example.com/1',
'https://example.com/2',
'https://example.com/3',
];
(async () => {
try {
const headerList = await getHeadersList();
for (const url of urlList) {
const headers = getRandomHeader(headerList);
const options = {
uri: url,
headers: headers
};
const response = await rp(options);
console.log(response);
}
} catch (error) {
console.error(error);
}
})();
For more information, check out the Fake Browser Headers API documentation.
More Web Scraping Tutorials
So that's why you need to use fake user-agents when scraping and how you can manage them with NodeJS.
If you would like to learn more about Web Scraping, then be sure to check out The Web Scraping Playbook.
Or check out one of our more in-depth guides: