
Golang Colly Fake Headers Integration

The following are two examples of how to integrate the Fake Browser Headers API and the Fake User-Agent API into your Go Colly based web scrapers.


Go Colly Fake User-Agent API Integration

To integrate the Fake User-Agent API, configure your scraper to retrieve a batch of the most up-to-date user-agents when it starts, then pick a random user-agent from this list for each request.

Here is an example Go Colly scraper integration:


package main

import (
    "bytes"
    "encoding/json"
    "log"
    "math/rand"
    "net/http"
    "time"

    "github.com/gocolly/colly"
)

// FakeUserAgentResponse mirrors the JSON body returned by the User-Agent API.
type FakeUserAgentResponse struct {
    Result []string `json:"result"`
}

// RandomString returns a random entry from userAgentList.
// The list must be non-empty, because rand.Intn panics when given 0.
func RandomString(userAgentList []string) string {
    randomIndex := rand.Intn(len(userAgentList))
    return userAgentList[randomIndex]
}

// GetUserAgentList fetches a batch of user-agents, or returns nil on failure.
func GetUserAgentList() []string {

    // ScrapeOps User-Agent API Endpoint
    scrapeopsAPIKey := "YOUR_API_KEY"
    scrapeopsAPIEndpoint := "http://headers.scrapeops.io/v1/user-agents?api_key=" + scrapeopsAPIKey

    client := &http.Client{
        Timeout: 10 * time.Second,
    }

    // Make Request
    resp, err := client.Get(scrapeopsAPIEndpoint)
    if err != nil {
        log.Printf("user-agent API request failed: %v", err)
        return nil
    }
    // Close the body on every path, including non-200 responses
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        log.Printf("user-agent API returned status %d", resp.StatusCode)
        return nil
    }

    // Convert Body To JSON
    var fakeUserAgentResponse FakeUserAgentResponse
    if err := json.NewDecoder(resp.Body).Decode(&fakeUserAgentResponse); err != nil {
        log.Printf("could not decode user-agent API response: %v", err)
        return nil
    }
    return fakeUserAgentResponse.Result
}

func main() {
    // Instantiate default collector
    c := colly.NewCollector(colly.AllowURLRevisit())

    // Get Fake User Agents From API
    userAgentList := GetUserAgentList()

    // Set Random Fake User Agent
    c.OnRequest(func(r *colly.Request) {
        // Keep Colly's default User-Agent if the API returned nothing
        if len(userAgentList) > 0 {
            r.Headers.Set("User-Agent", RandomString(userAgentList))
        }
    })

    // Print the Response
    c.OnResponse(func(r *colly.Response) {
        log.Printf("%s\n", bytes.Replace(r.Body, []byte("\n"), nil, -1))
    })

    // Fetch httpbin.org/headers five times
    for i := 0; i < 5; i++ {
        c.Visit("http://httpbin.org/headers")
    }
}
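
Picking a random entry with rand.Intn panics when the list is empty, which can happen if the API request fails at startup. The following is a minimal sketch of one way to guard against that. The helper name pickUserAgent and the fallback string are assumptions made for illustration and are not part of the ScrapeOps API or Colly.

```go
package main

import (
    "fmt"
    "math/rand"
)

// fallbackUserAgent is a placeholder value chosen for this sketch.
const fallbackUserAgent = "ExampleBrowser/1.0"

// pickUserAgent is a hypothetical helper: it returns a random entry from
// userAgents, or fallback when the list is empty.
func pickUserAgent(userAgents []string, fallback string) string {
    if len(userAgents) == 0 {
        return fallback
    }
    return userAgents[rand.Intn(len(userAgents))]
}

func main() {
    // Empty list: the fallback is returned instead of panicking.
    fmt.Println(pickUserAgent(nil, fallbackUserAgent))

    // Single-entry list: that entry is always returned.
    fmt.Println(pickUserAgent([]string{"OnlyAgent/2.0"}, fallbackUserAgent))
}
```

Inside a Colly OnRequest callback, the result of such a helper could be passed to r.Headers.Set("User-Agent", ...) in place of a direct random pick.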


Go Colly Fake Browser Headers API Integration

To integrate the Fake Browser Headers API, configure your scraper to retrieve a batch of the most up-to-date headers when it starts, then pick a random set of headers from this list for each request.

Here is an example Go Colly scraper integration:


package main

import (
    "bytes"
    "encoding/json"
    "log"
    "math/rand"
    "net/http"
    "time"

    "github.com/gocolly/colly"
)

// FakeBrowserHeadersResponse mirrors the JSON body returned by the Browser Headers API.
type FakeBrowserHeadersResponse struct {
    Result []map[string]string `json:"result"`
}

// RandomHeader returns a random header set from headersList.
// The list must be non-empty, because rand.Intn panics when given 0.
func RandomHeader(headersList []map[string]string) map[string]string {
    randomIndex := rand.Intn(len(headersList))
    return headersList[randomIndex]
}

// GetHeadersList fetches a batch of browser header sets, or returns nil on failure.
func GetHeadersList() []map[string]string {

    // ScrapeOps Browser Headers API Endpoint
    scrapeopsAPIKey := "YOUR_API_KEY"
    scrapeopsAPIEndpoint := "http://headers.scrapeops.io/v1/browser-headers?api_key=" + scrapeopsAPIKey

    client := &http.Client{
        Timeout: 10 * time.Second,
    }

    // Make Request
    resp, err := client.Get(scrapeopsAPIEndpoint)
    if err != nil {
        log.Printf("browser headers API request failed: %v", err)
        return nil
    }
    // Close the body on every path, including non-200 responses
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        log.Printf("browser headers API returned status %d", resp.StatusCode)
        return nil
    }

    // Convert Body To JSON
    var fakeBrowserHeadersResponse FakeBrowserHeadersResponse
    if err := json.NewDecoder(resp.Body).Decode(&fakeBrowserHeadersResponse); err != nil {
        log.Printf("could not decode browser headers API response: %v", err)
        return nil
    }
    return fakeBrowserHeadersResponse.Result
}

func main() {
    // Instantiate default collector
    c := colly.NewCollector(colly.AllowURLRevisit())

    // Get Fake Browser Headers From API
    headersList := GetHeadersList()

    // Set Random Fake Browser Headers
    c.OnRequest(func(r *colly.Request) {
        // Keep Colly's default headers if the API returned nothing
        if len(headersList) == 0 {
            return
        }
        randomHeader := RandomHeader(headersList)
        for key, value := range randomHeader {
            r.Headers.Set(key, value)
        }
    })

    // Print the Response
    c.OnResponse(func(r *colly.Response) {
        log.Printf("%s\n", bytes.Replace(r.Body, []byte("\n"), nil, -1))
    })

    // Fetch httpbin.org/headers five times
    for i := 0; i < 5; i++ {
        c.Visit("http://httpbin.org/headers")
    }
}

Here the scraper will use a random set of browser headers for each request.


API Parameters

The following is a list of API parameters that you can include with your requests to customise the header list response.

Parameter | Description
--- | ---
api_key | Required. Your ScrapeOps API key. A free API key is available.
num_results | Number of results to return. By default the API returns a list of 10 results; set num_results to change that number, up to a maximum of 100.
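
As a rough illustration of passing these parameters, the sketch below builds the request URL with Go's net/url package rather than string concatenation, so values are escaped correctly. The helper name buildEndpoint is hypothetical, and the value 50 is an arbitrary example.

```go
package main

import (
    "fmt"
    "net/url"
    "strconv"
)

// buildEndpoint is a hypothetical helper (not part of the ScrapeOps docs).
// It appends api_key and num_results to base as query parameters.
func buildEndpoint(base, apiKey string, numResults int) (string, error) {
    u, err := url.Parse(base)
    if err != nil {
        return "", err
    }
    q := u.Query()
    q.Set("api_key", apiKey)
    q.Set("num_results", strconv.Itoa(numResults))
    u.RawQuery = q.Encode() // Encode sorts parameters by key
    return u.String(), nil
}

func main() {
    endpoint, err := buildEndpoint("http://headers.scrapeops.io/v1/user-agents", "YOUR_API_KEY", 50)
    if err != nil {
        fmt.Println("invalid endpoint:", err)
        return
    }
    fmt.Println(endpoint)
    // Prints: http://headers.scrapeops.io/v1/user-agents?api_key=YOUR_API_KEY&num_results=50
}
```

The same helper would work for the browser-headers endpoint by changing the base URL.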