NodeJS: Retry Failed Requests

In this guide for The NodeJs Web Scraping Playbook, we will look at how to configure the NodeJS Got, Node-Fetch, Axios, SuperAgent, and Request-Promise libraries to retry failed requests so you can build a more reliable system.

There are a couple of ways to approach this, so in this guide we will walk you through the two most common ways to retry failed requests and show you how to use them with the NodeJS libraries:

Retry Failed Requests Using Retry Library
Build Your Own Retry Logic Wrapper

Let's begin...

Retry Failed Requests Using Retry Library

Here we use the Retry package to define the retry logic and trigger any retries on failed requests.

Here is an example:

Got
Node-Fetch
Axios
SuperAgent
Request Promise

import got from 'got';
import retry from 'retry';

const retryOptions = {
  retries: 5,
  factor: 2,
  minTimeout: 1000,
  maxTimeout: 10000,
  randomize: true,
  statusCode: [429, 500, 502, 503, 504]
};

const retryOperation = retry.operation(retryOptions);

const url = 'http://quotes.toscrape.com/';

retryOperation.attempt(async function(currentAttempt) {
  try {
    const response = await got.get(url);
    const html = response.body;
    console.log(html);
  } catch (error) {
    if (retryOperation.retry(error)) {
      console.log(`Retry attempt: ${currentAttempt}`);
      return;
    }
    console.error(`Maximum number of retries reached. Error: ${error}`);
  }
});

In the above code, we use the node.js Got library to send HTTP requests with retry functionality. We also utilize the retry package to control the retry behavior.

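We define the retry options: the maximum number of retries, the factor by which the retry timeout grows, and the minimum and maximum timeout values. Note that statusCode is not among the options documented by the retry package, so the list of status codes is not acted on by retry itself. A retry is scheduled only when an error is passed to retryOperation.retry, which here happens whenever the request throws. The timing options used above are: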

retries: The maximum amount of times to retry the operation. Default is 10. Setting this to 1 means do it once, then retry it once.
factor: The exponential factor to use. Default is 2.
minTimeout: The number of milliseconds before starting the first retry. Default is 1000.
maxTimeout: The maximum number of milliseconds between two retries. Default is Infinity.
randomize: Randomizes the timeouts by multiplying with a factor between 1 and 2. Default is false.

The formula used to calculate the individual timeouts is:

Math.min(random * minTimeout * Math.pow(factor, attempt), maxTimeout)
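
To see what this formula produces for the options used above, the schedule can be worked out by hand. The following is a sketch that re-implements the formula for illustration rather than calling the retry package. The function name computeTimeouts is an assumption, and randomize is left off so the output is deterministic:

```javascript
// Illustrative re-implementation of the timeout formula (not the retry package itself).
function computeTimeouts({ retries, factor, minTimeout, maxTimeout, randomize = false }) {
  const timeouts = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    // With randomize enabled, the multiplier falls between 1 and 2.
    const random = randomize ? Math.random() + 1 : 1;
    const timeout = Math.round(random * minTimeout * Math.pow(factor, attempt));
    // Cap each delay at maxTimeout.
    timeouts.push(Math.min(timeout, maxTimeout));
  }
  return timeouts;
}

const schedule = computeTimeouts({
  retries: 5,
  factor: 2,
  minTimeout: 1000,
  maxTimeout: 10000,
});

console.log(schedule); // [ 1000, 2000, 4000, 8000, 10000 ]
```

With these values the fifth delay would be 16000 ms, so it is capped at maxTimeout.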

The retryOperation.attempt function handles the retry logic. Inside the function, we make the GET request using got.get method to the specified URL. If an error occurs, we check if a retry should be attempted using retryOperation.retry.

If a retry is required, we log the attempt number and make another attempt. If the maximum number of retries is reached, we log an error message.
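
To retry only on particular status codes, the check can be made in the catch block before calling retryOperation.retry. A minimal sketch, assuming an error shaped like the one Got throws for HTTP errors (with error.response.statusCode). The helper name shouldRetry and the fake error objects are illustrative and not part of either library:

```javascript
// Status codes treated as temporary failures.
const RETRYABLE_STATUS_CODES = [429, 500, 502, 503, 504];

// Hypothetical helper: decide whether a failed request is worth retrying.
function shouldRetry(error, retryableStatusCodes = RETRYABLE_STATUS_CODES) {
  // No response at all means a network-level failure (DNS, timeout, reset).
  if (!error.response) {
    return true;
  }
  return retryableStatusCodes.includes(error.response.statusCode);
}

// Fake error objects standing in for what an HTTP client might throw.
const networkError = new Error('socket hang up');
const serverError = Object.assign(new Error('503'), { response: { statusCode: 503 } });
const notFound = Object.assign(new Error('404'), { response: { statusCode: 404 } });

console.log(shouldRetry(networkError)); // true
console.log(shouldRetry(serverError)); // true
console.log(shouldRetry(notFound)); // false
```

Inside the catch block this could be used as `if (shouldRetry(error) && retryOperation.retry(error)) { ... }`, so that a 404 is reported immediately instead of being retried.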

import fetch from 'node-fetch';
import retry from 'retry';

const retryOptions = {
  retries: 5,
  factor: 2,
  minTimeout: 1000,
  maxTimeout: 10000,
  randomize: true,
  statusCode: [429, 500, 502, 503, 504]
};

const retryOperation = retry.operation(retryOptions);

const url = 'http://quotes.toscrape.com/';

retryOperation.attempt(async function(currentAttempt) {
  try {
    const response = await fetch(url);
    const html = await response.text();
    console.log(html);
  } catch (error) {
    if (retryOperation.retry(error)) {
      console.log(`Retry attempt: ${currentAttempt}`);
      return;
    }
    console.error(`Maximum number of retries reached. Error: ${error}`);
  }
});

In the above code, we use the node-fetch library to send HTTP requests with retry functionality. We also utilize the retry package to control the retry behavior.

The retryOperation.attempt function handles the retry logic. Inside the function, we make the GET request using fetch method (node-fetch) to the specified URL. If an error occurs, we check if a retry should be attempted using retryOperation.retry.

If a retry is required, we log the attempt number and make another attempt. If the maximum number of retries is reached, we log an error message.
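
One caveat: fetch resolves its promise for any HTTP response, including a 500, so the catch block above only runs for network-level failures. Below is a sketch of one way to route bad status codes into the retry path, assuming a response object with ok, status and statusText properties as in the Fetch API. HttpStatusError and throwIfNotOk are hypothetical names:

```javascript
// Hypothetical error type carrying the HTTP status of a failed response.
class HttpStatusError extends Error {
  constructor(status, statusText) {
    super(`HTTP ${status} ${statusText}`);
    this.name = 'HttpStatusError';
    this.status = status;
  }
}

// Throw for non-2xx responses so the surrounding try/catch sees them as failures.
function throwIfNotOk(response) {
  if (!response.ok) {
    throw new HttpStatusError(response.status, response.statusText);
  }
  return response;
}

// Plain objects standing in for fetch responses, so the sketch runs offline.
const okResponse = { ok: true, status: 200, statusText: 'OK' };
const badResponse = { ok: false, status: 503, statusText: 'Service Unavailable' };

console.log(throwIfNotOk(okResponse).status); // 200

try {
  throwIfNotOk(badResponse);
} catch (error) {
  console.log(error.message); // HTTP 503 Service Unavailable
}
```

In the attempt callback this could be used as `const response = throwIfNotOk(await fetch(url));`.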

const axios = require('axios');
const retry = require('retry');

const retryOptions = {
  retries: 5,
  factor: 2,
  minTimeout: 1000,
  maxTimeout: 10000,
  randomize: true
};

const retryOperation = retry.operation(retryOptions);

const url = 'http://quotes.toscrape.com/';

retryOperation.attempt(async function(currentAttempt) {
  try {
    // axios rejects on network errors and, by default, on non-2xx responses
    const response = await axios.get(url);
    const html = response.data;
    console.log(html);
  } catch (error) {
    if (retryOperation.retry(error)) {
      console.log(`Retry attempt: ${currentAttempt}`);
      return;
    }
    console.error(`Maximum number of retries reached. Error: ${error}`);
  }
});

In the above code, we use the axios library to send HTTP requests with retry functionality. We also utilize the retry package to control the retry behavior.

The retryOperation.attempt function handles the retry logic. Inside the function, we make the GET request using the axios.get method to the specified URL. If an error occurs, we check if a retry should be attempted using retryOperation.retry.

If a retry is required, we log the attempt number and make another attempt. If the maximum number of retries is reached, we log an error message.

const request = require('superagent');
const retry = require('retry');

const retryOptions = {
  retries: 5,
  factor: 2,
  minTimeout: 1000,
  maxTimeout: 10000,
  randomize: true,
  statusCode: [429, 500, 502, 503, 504]
};

const retryOperation = retry.operation(retryOptions);

const url = 'http://quotes.toscrape.com/';

retryOperation.attempt(async function(currentAttempt) {
  try {
    const response = await request.get(url);
    const html = response.text;
    console.log(html);
  } catch (error) {
    if (retryOperation.retry(error)) {
      console.log(`Retry attempt: ${currentAttempt}`);
      return;
    }
    console.error(`Maximum number of retries reached. Error: ${error}`);
  }
});

In the above code, we use the node.js SuperAgent library to send HTTP requests with retry functionality. We also utilize the retry package to control the retry behavior.

The retryOperation.attempt function handles the retry logic. Inside the function, we make the GET request using request.get method (from superagent) to the specified URL. If an error occurs, we check if a retry should be attempted using retryOperation.retry.

If a retry is required, we log the attempt number and make another attempt. If the maximum number of retries is reached, we log an error message.

// request-promise needs the request package installed as a peer dependency,
// but it does not have to be required here
const rp = require('request-promise');
const retry = require('retry');

const retryOptions = {
  retries: 5,
  factor: 2,
  minTimeout: 1000,
  maxTimeout: 10000,
  randomize: true,
  statusCode: [429, 500, 502, 503, 504]
};

const retryOperation = retry.operation(retryOptions);

const url = 'http://quotes.toscrape.com/';

retryOperation.attempt(async function(currentAttempt) {
  try {
    const response = await rp(url);
    console.log(response);
  } catch (error) {
    if (retryOperation.retry(error)) {
      console.log(`Retry attempt: ${currentAttempt}`);
      return;
    }
    console.error(`Maximum number of retries reached. Error: ${error}`);
  }
});

In the above code, we use the request-promise library to send HTTP requests with retry functionality. We also utilize the retry package to control the retry behavior.

The retryOperation.attempt function handles the retry logic. Inside the function, we make the GET request using request-promise rp to the specified URL. If an error occurs, we check if a retry should be attempted using retryOperation.retry.

If a retry is required, we log the attempt number and make another attempt. If the maximum number of retries is reached, we log an error message.

Build Your Own Retry Logic Wrapper

Another method of retrying failed requests with NodeJS Libraries is to build your own retry logic around your request functions.

Got
Node-Fetch
Axios
SuperAgent
Request Promise

import got from 'got';
const NUM_RETRIES = 3;

(async () => {
  let response;

  for (let i = 0; i < NUM_RETRIES; i++) {
    try {
      response = await got.get('http://quotes.toscrape.com');
      if (response.statusCode === 200) {
        // Escape the loop if a successful response is returned
        break;
      }
    } catch (error) {
      // Absence of response field in error object indicates network error
      const networkError = error.response === undefined;
      if (networkError) {
        // Handle network errors
        continue;
      }
      response = error.response;
      if (response.statusCode === 404) {
        break;
      }
    }
  }

  // Do something with the successful response
  if (response && response.statusCode === 200) {
    // Perform actions with the successful response
    console.log(response.body)
  }
})();

In the above code, we use the got.get method to send HTTP requests and handle retries. We initialize a variable response to store the response from the successful request.

We then use a for loop with a maximum of NUM_RETRIES iterations. Inside the loop, we make a GET request using got.get to the specified URL. If the response status code is either 200 or 404, we break out of the loop.

If a connection error occurs, we catch the error and continue to the next iteration.

Note that got throws an error when the response status code is not a successful one (for example a 404 or a 500), so we have to handle the 404 error in the catch block. In this case, we also set response to error.response.

Finally, after the loop, we check if the response variable is not null and has a status code of 200. If these conditions are met, you can perform actions with the successful response.

The advantage of this approach is that you have a lot of control over what is a failed response.

Above we only look at the response code to decide whether to retry the request. However, we could adapt this to also check the response body and make sure the HTML is valid.

Below we will add an additional check to make sure the HTML response doesn't contain a ban page.

import got from 'got';

const NUM_RETRIES = 3;

(async () => {
  let response;
  let validResponse = false;

  for (let i = 0; i < NUM_RETRIES; i++) {
    try {
      response = await got.get('http://quotes.toscrape.com');
      const html = response.body;
      if (response.statusCode === 200 && !html.includes('<title>Robot or human?</title>')) {
        // Break the loop if a successful response is returned and the ban page marker is not present
        validResponse = true;
        break;
      }
    } catch (error) {
      // Absence of response field in error object indicates network error
      const networkError = error.response === undefined;
      if (networkError) {
        // Handle network errors
        continue;
      }
      response = error.response;
      if (response.statusCode === 404) {
        validResponse = true;
        break;
      }
    }
  }

  // Do something with the successful response
  if (response && validResponse && response.statusCode === 200) {
    // Perform actions with the successful response
    console.log(response.body)
  }
})();
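
Since the loop has the same shape for every library, it can be factored into a reusable function. The following is a sketch under assumptions: retryRequest, requestFn and isValid are hypothetical names, and the fake request function simulates two connection failures so the example runs without network access:

```javascript
// Hypothetical wrapper: call requestFn up to numRetries times until isValid accepts the result.
async function retryRequest(requestFn, { numRetries = 3, isValid = () => true } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= numRetries; attempt++) {
    try {
      const response = await requestFn(attempt);
      if (isValid(response)) {
        return response;
      }
      lastError = new Error(`Invalid response on attempt ${attempt}`);
    } catch (error) {
      // Remember the failure and fall through to the next attempt.
      lastError = error;
    }
  }
  throw lastError;
}

// Fake request that fails twice, then returns a 200 response.
let calls = 0;
async function flakyRequest() {
  calls += 1;
  if (calls < 3) {
    throw new Error('connection reset');
  }
  return { statusCode: 200, body: '<html>quotes</html>' };
}

const result = retryRequest(flakyRequest, {
  numRetries: 3,
  isValid: (response) => response.statusCode === 200,
});

result.then((response) => {
  console.log(calls, response.statusCode); // 3 200
});
```

The isValid callback is where a ban page check or any other content validation would go, while requestFn could wrap got.get or any of the other libraries.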

import fetch from 'node-fetch';

const NUM_RETRIES = 3;

(async () => {
  let response;

  for (let i = 0; i < NUM_RETRIES; i++) {
    try {
      response = await fetch('http://quotes.toscrape.com/');
      if (response.status === 200 || response.status === 404) {
        // Escape the loop if a successful response is returned
        break;
      }
    } catch (error) {
      // node-fetch reports network failures (DNS, connection reset, etc.) as FetchError
      if (error.name === 'FetchError') {
        // Handle connection errors
        continue;
      }
    }
  }

  // Do something with the successful response
  if (response && response.status === 200) {
    // Perform actions with the successful response
    console.log(await response.text())
  }
})();

In the above code, we use the fetch method from node-fetch library to send HTTP requests and handle retries. We initialize a variable response to store the response from the successful request.

We then use a for loop with a maximum of NUM_RETRIES iterations. Inside the loop, we make a GET request using fetch (node-fetch) to the specified URL. If the response status code is either 200 or 404, we break out of the loop.

If a connection error occurs, we catch the error and continue to the next iteration.

Finally, after the loop, we check if the response variable is not null and has a status code of 200. If these conditions are met, you can perform actions with the successful response.

The advantage of this approach is that you have a lot of control over what is a failed response.

Above we only look at the response code to decide whether to retry the request. However, we could adapt this to also check the response body and make sure the HTML is valid.

Below we will add an additional check to make sure the HTML response doesn't contain a ban page.

import fetch from 'node-fetch';

const NUM_RETRIES = 3;

(async () => {
  let response;
  let responseText;
  let validResponse = false;

  for (let i = 0; i < NUM_RETRIES; i++) {
    try {
      response = await fetch('http://quotes.toscrape.com/');
      responseText = await response.text();
      if (response.status === 200 && !responseText.includes('<title>Robot or human?</title>')) {
        // Break the loop if a successful response is returned and the ban page marker is not present
        validResponse = true;
        break;
      }

      if (response.status === 404) {
        // Break the loop if a 404 page is returned
        validResponse = true;
        break;
      }
    } catch (error) {
      if (error.name === 'FetchError') {
        // Handle connection errors
        continue;
      }
    }
  }

  // Do something with the successful response
  if (response && validResponse && response.status === 200) {
    // Perform actions with the successful response
    console.log(responseText);
  }
})();
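
The loops in this section retry immediately after a failure. A pause between attempts is usually gentler on the target server and gives temporary problems time to clear. A minimal sketch, assuming hypothetical helpers sleep and backoffDelay with illustrative base and cap values:

```javascript
// Resolve after the given number of milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Exponential delay for a zero-based attempt index, capped at maxDelayMs.
function backoffDelay(attempt, baseDelayMs = 500, maxDelayMs = 5000) {
  return Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
}

console.log([0, 1, 2, 3, 4].map((attempt) => backoffDelay(attempt)));
// [ 500, 1000, 2000, 4000, 5000 ]
```

Inside the for loop, `await sleep(backoffDelay(i));` at the end of a failed iteration would space the attempts out.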

const axios = require('axios');

const NUM_RETRIES = 3;

(async () => {
  let response;

  for (let i = 0; i < NUM_RETRIES; i++) {
    try {
      response = await axios.get('http://quotes.toscrape.com/');
      if (response.status === 200) {
        // Escape the loop if a successful response is returned
        break;
      }
    } catch (error) {
      // Absence of response field in error object indicates network error
      const networkError = error.response === undefined
      if (networkError) {
        // Handle connection errors
        continue;
      }
      response = error.response
      if (error.response.status === 404) {
        break;
      }
    }
  }

  // Do something with the successful response
  if (response && response.status === 200) {
    // Perform actions with the successful response
    console.log(response.data)
  }
})();

In the above code, we use the axios.get method from axios library to send HTTP requests and handle retries. We initialize a variable response to store the response from the successful request.

We then use a for loop with a maximum of NUM_RETRIES iterations. Inside the loop, we make a GET request using axios.get (axios) to the specified URL. If the response status code is either 200 or 404, we break out of the loop.

If a connection error occurs, we catch the error and continue to the next iteration.

Finally, after the loop, we check if the response variable is not null and has a status code of 200. If these conditions are met, you can perform actions with the successful response.

The advantage of this approach is that you have a lot of control over what is a failed response.

Above we only look at the response code to decide whether to retry the request. However, we could adapt this to also check the response body and make sure the HTML is valid.

Below we will add an additional check to make sure the HTML response doesn't contain a ban page.

const axios = require('axios');

const NUM_RETRIES = 3;

(async () => {
  let response;
  let validResponse = false;

  for (let i = 0; i < NUM_RETRIES; i++) {
    try {
      response = await axios('http://quotes.toscrape.com/');

      if (response.status === 200 && !response.data.includes('<title>Robot or human?</title>')) {
        // Break the loop if a successful response is returned and the ban page marker is not present
        validResponse = true;
        break;
      }

    } catch (error) {
      // Absence of response field in error object indicates network error
      const networkError = error.response === undefined;
      if (networkError) {
        // Handle connection errors
        continue;
      }
      // Keep the error response so a stale response from an earlier attempt is not reused
      response = error.response;
      if (response.status === 404) {
        // Break the loop if a 404 page is returned
        validResponse = true;
        break;
      }
    }
  }

  // Do something with the successful response
  if (response && validResponse && response.status === 200) {
    // Perform actions with the successful response
    console.log(response.data);
  }
})();
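
With axios, the 404 handling can also be moved out of the catch block by using the validateStatus request option, which decides which status codes resolve the promise instead of rejecting it. A sketch, where the accepted codes are an assumption chosen to match the loop above. The axios call is left as a comment so the snippet runs without the library installed:

```javascript
// Request config: resolve (rather than reject) for 200 and 404 responses only.
const requestConfig = {
  validateStatus: (status) => status === 200 || status === 404,
};

// Usage inside the retry loop (requires axios):
//   response = await axios.get('http://quotes.toscrape.com/', requestConfig);

console.log(requestConfig.validateStatus(200)); // true
console.log(requestConfig.validateStatus(404)); // true
console.log(requestConfig.validateStatus(503)); // false
```

Any other status code still rejects, so those responses continue to reach the catch block and get retried.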

const request = require('superagent');
const NUM_RETRIES = 3;

(async () => {
  let response;

  for (let i = 0; i < NUM_RETRIES; i++) {
    try {
      response = await request.get('http://quotes.toscrape.com/');
      if (response.status === 200) {
        // Escape the loop if a successful response is returned
        break;
      }
    } catch (error) {
      // Absence of response field in error object indicates network error
      const networkError = error.response === undefined;
      if (networkError) {
        // Handle connection errors
        continue;
      }
      response = error.response;
      if (response.status === 404) {
        break;
      }
    }
  }

  // Do something with the successful response
  if (response && response.status === 200) {
    // Perform actions with the successful response
    console.log(response.text)
  }
})();

In the above code, we use the request.get method from superagent library to send HTTP requests and handle retries. We initialize a variable response to store the response from the successful request.

We then use a for loop with a maximum of NUM_RETRIES iterations. Inside the loop, we make a GET request using request.get (superagent) to the specified URL. If the response status code is either 200 or 404, we break out of the loop.

If a connection error occurs, we catch the error and continue to the next iteration.

Finally, after the loop, we check if the response variable is not null and has a status code of 200. If these conditions are met, you can perform actions with the successful response.

The advantage of this approach is that you have a lot of control over what is a failed response.

Above we only look at the response code to decide whether to retry the request. However, we could adapt this to also check the response body and make sure the HTML is valid.

Below we will add an additional check to make sure the HTML response doesn't contain a ban page.

const request = require('superagent');

const NUM_RETRIES = 3;

(async () => {
  let response;
  let validResponse = false;

  for (let i = 0; i < NUM_RETRIES; i++) {
    try {
      response = await request.get('http://quotes.toscrape.com/');
      const html = response.text;
      if (response.status === 200 && !html.includes('<title>Robot or human?</title>')) {
        // Break the loop if a successful response is returned and the ban page marker is not present
        validResponse = true;
        break;
      }
    } catch (error) {
      // Absence of response field in error object indicates network error
      const networkError = error.response === undefined;
      if (networkError) {
        // Handle connection errors
        continue;
      }
      response = error.response;
      if (response.status === 404) {
        // Break the loop if a 404 page is returned
        validResponse = true;
        break;
      }
    }
  }

  // Do something with the successful response
  if (response && validResponse && response.status === 200) {
    // Perform actions with the successful response
    console.log(response.text);
  }
})();

const rp = require('request-promise');

const NUM_RETRIES = 3;

(async () => {
  let response;

  for (let i = 0; i < NUM_RETRIES; i++) {
    try {
      // resolveWithFullResponse exposes statusCode (by default only the body is returned);
      // simple: false stops non-2xx responses such as 404 from being rejected
      response = await rp({
        uri: 'http://quotes.toscrape.com/',
        resolveWithFullResponse: true,
        simple: false,
      });
      if (response.statusCode === 200 || response.statusCode === 404) {
        // Escape the loop if a 200 or 404 response is returned
        break;
      }
    } catch (error) {
      if (error.name === 'RequestError') {
        // Handle connection errors
        continue;
      }
    }
  }

  // Do something with the successful response
  if (response && response.statusCode === 200) {
    // Perform actions with the successful response
    console.log(response.body);
  }
})();

In the above code, we use the request-promise library to send HTTP requests and handle retries. We initialize a variable response to store the response from the successful request.

We then use a for loop with a maximum of NUM_RETRIES iterations. Inside the loop, we make a GET request using rp (request-promise) to the specified URL. If the response status code is either 200 or 404, we break out of the loop.

If a connection error occurs, we catch the error and continue to the next iteration.

Finally, after the loop, we check if the response variable is not null and has a status code of 200. If these conditions are met, you can perform actions with the successful response.

The advantage of this approach is that you have a lot of control over what is a failed response.

Above we only look at the response code to decide whether to retry the request. However, we could adapt this to also check the response body and make sure the HTML is valid.

Below we will add an additional check to make sure the HTML response doesn't contain a ban page.

const rp = require('request-promise');

const NUM_RETRIES = 3;

(async () => {
  let response;
  let validResponse = false;

  for (let i = 0; i < NUM_RETRIES; i++) {
    try {
      // resolveWithFullResponse exposes statusCode and body (by default only the body is returned);
      // simple: false stops non-2xx responses such as 404 from being rejected
      response = await rp({
        uri: 'http://quotes.toscrape.com/',
        resolveWithFullResponse: true,
        simple: false,
      });

      if (response.statusCode === 200 && !response.body.includes('<title>Robot or human?</title>')) {
        // Break the loop if a successful response is returned and the ban page marker is not present
        validResponse = true;
        break;
      }

      if (response.statusCode === 404) {
        // Break the loop if a 404 page is returned
        validResponse = true;
        break;
      }

    } catch (error) {
      if (error.name === 'RequestError') {
        // Handle connection errors
        continue;
      }
    }
  }

  // Do something with the successful response
  if (response && validResponse && response.statusCode === 200) {
    // Perform actions with the successful response
    console.log(response.body);
  }
})();

In this example, we also check the successful 200 status code responses to make sure they don't contain a ban page.

"<title>Robot or human?</title>"

If it does then the code will retry the request.
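
Ban pages vary between sites, so the check can be easier to maintain as a small function with a list of markers. A sketch, assuming a hypothetical isBanPage helper. Only the first marker comes from the examples above, and the second is made up for illustration:

```javascript
// Markers that indicate a block or ban page.
const BAN_PAGE_MARKERS = [
  '<title>Robot or human?</title>',
  'Access Denied', // illustrative assumption, not taken from a real site
];

// Hypothetical helper: true if the HTML contains any known ban marker.
function isBanPage(html, markers = BAN_PAGE_MARKERS) {
  return markers.some((marker) => html.includes(marker));
}

console.log(isBanPage('<html><title>Robot or human?</title></html>')); // true
console.log(isBanPage('<html><title>Quotes to Scrape</title></html>')); // false
```

In the retry loops, the condition would then become a 200 status code combined with `!isBanPage(html)`.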
