Java OkHttp Library: How to Use & Rotate Proxies
To use a proxy with the Java OkHttp library, set a proxy variable to a new Proxy instance created with Proxy.Type.HTTP (the proxy scheme) as the first argument and an InetSocketAddress object constructed from your proxyHost and proxyPort as the second argument. Then call .proxy(proxy) on the OkHttpClient.Builder object so that your client uses the defined proxy for every request.
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.util.concurrent.TimeUnit;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;

public class OkHttpProxy {
    public static String proxyHost = "111.43.105.50";
    public static int proxyPort = 9091;

    public static void main(String[] args) throws Exception {
        Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(proxyHost, proxyPort));
        OkHttpClient client = new OkHttpClient.Builder()
                .proxy(proxy)
                .readTimeout(30, TimeUnit.SECONDS)
                .build();
        Request request = new Request.Builder().url("https://httpbin.org/ip").build();
        Response response = client.newCall(request).execute();
        System.out.println("Response body: " + response.body().string());
    }
}
In this guide for The OkHttp Web Scraping Playbook, we will look at how to integrate the 3 most common types of proxies into an OkHttp-based web scraper.
Using proxies with the OkHttp library allows you to spread your requests over multiple IP addresses, making it harder for websites to detect and block your web scrapers.
We will walk you through the 3 most common proxy integration methods and show you how to use them with OkHttp:
- Using Proxy IPs With OkHttp
- Proxy Authentication With OkHttp
- The 3 Most Common Proxy Formats
- Proxy Integration #1: Rotating Through Proxy IP List
- Proxy Integration #2: Using Proxy Gateway
- Proxy Integration #3: Using Proxy API Endpoint
Let's begin...
Using Proxy IPs With OkHttp
Using a proxy with OkHttp is straightforward. First, define your proxy settings using the Proxy class from the java.net package: instantiate a new Proxy object named proxy, passing the proxy scheme (Proxy.Type.HTTP) as the first constructor argument and an InetSocketAddress constructed from your proxyHost and proxyPort as the second.
Then, to configure your client to use the proxy we just defined, call the proxy method of your OkHttpClient.Builder instance with the proxy object.
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.util.concurrent.TimeUnit;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;

public class OkHttpProxy {
    public static String proxyHost = "111.43.105.50";
    public static int proxyPort = 9091;

    public static void main(String[] args) throws Exception {
        Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(proxyHost, proxyPort));
        OkHttpClient client = new OkHttpClient.Builder()
                .proxy(proxy)
                .readTimeout(30, TimeUnit.SECONDS)
                .build();
        Request request = new Request.Builder()
                .url("https://httpbin.org/ip")
                .build();
        Response response = client.newCall(request).execute();
        System.out.println("Response body: " + response.body().string());
    }
}
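Proxy lists often mix HTTP and SOCKS proxies, and java.net.Proxy has a separate Proxy.Type for each. As a rough sketch, the helper below maps a scheme string to the matching type before building the Proxy object. ProxyFactory and its create method are hypothetical names introduced for this example, not part of OkHttp or the JDK. It uses InetSocketAddress.createUnresolved so that no DNS lookup happens when the object is built; new InetSocketAddress(host, port), as used above, is an equally valid choice.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.util.Locale;

// Hypothetical helper (not an OkHttp or JDK class): picks the Proxy.Type from a scheme string.
final class ProxyFactory {
    private ProxyFactory() {}

    static Proxy create(String scheme, String host, int port) {
        Proxy.Type type;
        switch (scheme.toLowerCase(Locale.ROOT)) {
            case "http":
                type = Proxy.Type.HTTP;
                break;
            case "socks":
            case "socks4":
            case "socks5":
                type = Proxy.Type.SOCKS;
                break;
            default:
                throw new IllegalArgumentException("Unsupported proxy scheme: " + scheme);
        }
        // createUnresolved stores the host name as-is, without a DNS lookup at construction time.
        return new Proxy(type, InetSocketAddress.createUnresolved(host, port));
    }
}
```

The returned object can be passed to the proxy method of OkHttpClient.Builder in the same way as the proxy variable in the example above.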
Proxy Authentication With OkHttp
Some proxy IPs require authentication in the form of a username and password. In this section you'll see how to authenticate with your proxy in OkHttp.
First, set an authenticator variable to a new instance of Authenticator() { ... }, which is an anonymous class that implements the Authenticator interface.
Inside this class, override the authenticate method with your custom authentication logic. When the authenticator is registered as a proxy authenticator, OkHttp calls authenticate whenever the proxy answers with a 407 Proxy Authentication Required response (an authenticator registered for the origin server is called on 401 Unauthorized instead).
Inside the authenticate method, you modify the original request so that it satisfies the authentication challenge in the response. First create a Base64-encoded credential from your proxy username and password using the Credentials.basic method. Then set the Proxy-Authorization header of the original request to credential and return the modified request.
Finally, to configure your client to use the authenticated proxy, call the proxyAuthenticator method of the OkHttpClient.Builder with the authenticator object.
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.util.concurrent.TimeUnit;
import okhttp3.Authenticator;
import okhttp3.Credentials;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;
import okhttp3.Route;

public class AuthProxy {
    public static String proxyHost = "example.com";
    public static int proxyPort = 80;
    public static String username = "USERNAME";
    public static String password = "PASSWORD";

    public static void main(String[] args) throws Exception {
        InetSocketAddress proxyAddress = new InetSocketAddress(proxyHost, proxyPort);
        Proxy proxy = new Proxy(Proxy.Type.HTTP, proxyAddress);
        Authenticator authenticator = new Authenticator() {
            @Override
            public Request authenticate(Route route, Response response) {
                String credential = Credentials.basic(username, password);
                return response.request().newBuilder()
                        .header("Proxy-Authorization", credential)
                        .build();
            }
        };
        OkHttpClient client = new OkHttpClient.Builder()
                .proxy(proxy)
                .proxyAuthenticator(authenticator)
                .readTimeout(30, TimeUnit.SECONDS)
                .build();
        Request request = new Request.Builder().url("https://httpbin.org/ip").build();
        Response response = client.newCall(request).execute();
        System.out.println("Response body: " + response.body().string());
    }
}
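For reference, the value that ends up in the Proxy-Authorization header follows the HTTP Basic scheme: the word Basic, a space, and the Base64 encoding of username:password. The sketch below reproduces that value with only the JDK, purely to show what is sent over the wire. BasicCredential is a hypothetical class name used for illustration; in an OkHttp project you would keep calling Credentials.basic.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration class: rebuilds an HTTP Basic credential string with the JDK only.
final class BasicCredential {
    private BasicCredential() {}

    static String basic(String username, String password) {
        // The Basic scheme joins the two values with a colon before encoding them.
        String userAndPassword = username + ":" + password;
        byte[] bytes = userAndPassword.getBytes(StandardCharsets.ISO_8859_1);
        return "Basic " + Base64.getEncoder().encodeToString(bytes);
    }
}
```

Calling BasicCredential.basic("user", "pass") returns "Basic dXNlcjpwYXNz". Base64 is an encoding, not encryption, so anyone who can read the header can recover the username and password.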
The 3 Most Common Proxy Formats
That covers the basics of integrating a proxy into OkHttp. In the next sections we will show you how to integrate OkHttp with the 3 most common proxy formats:
- Rotating Through List of Proxy IPs
- Using Proxy Gateways
- Using Proxy APIs
A couple of years ago, proxy providers would sell you a list of proxy IP addresses, and you would configure your scraper to rotate through them, using a new one with each request.
Today, however, more and more proxy providers no longer sell raw lists of proxy IP addresses. Instead, they provide access to their proxy pools via proxy gateways or proxy API endpoints.
We will look at how to integrate with all 3 proxy formats.
If you are looking to find a good proxy provider then check out our web scraping proxy comparison tool where you can compare the plans of all the major proxy providers.
Proxy Integration #1: Rotating Through Proxy IP List
Here a proxy provider gives you a list of proxy IP addresses, and you configure your scraper to rotate through them, selecting a new IP address for every request.
The proxy list you receive can come in different formats. The simplest is a list of proxy URLs (scheme://username:password@host:port):
[
'http://username:password@85.237.57.198:20000',
'http://username:password@85.237.57.198:21000',
'http://username:password@85.237.57.198:22000',
'http://username:password@85.237.57.198:23000',
]
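Each entry in this format can be split into its parts with java.net.URI from the JDK. The sketch below is one possible way to do it. ProxyUrlParts is a hypothetical class introduced for this example, and it assumes that the username itself contains no colon, so everything after the first colon in the user info is treated as the password.

```java
import java.net.URI;

// Hypothetical value class: holds the pieces of a scheme://username:password@host:port proxy URL.
final class ProxyUrlParts {
    final String scheme;
    final String host;
    final int port;
    final String username; // null when the URL carries no credentials
    final String password; // null when the URL carries no password

    private ProxyUrlParts(String scheme, String host, int port, String username, String password) {
        this.scheme = scheme;
        this.host = host;
        this.port = port;
        this.username = username;
        this.password = password;
    }

    static ProxyUrlParts parse(String proxyUrl) {
        URI uri = URI.create(proxyUrl);
        if (uri.getHost() == null || uri.getPort() == -1) {
            throw new IllegalArgumentException("Proxy URL needs a host and a port: " + proxyUrl);
        }
        String username = null;
        String password = null;
        String userInfo = uri.getUserInfo();
        if (userInfo != null) {
            // Split on the first colon only, so a password may itself contain colons.
            int colon = userInfo.indexOf(':');
            if (colon >= 0) {
                username = userInfo.substring(0, colon);
                password = userInfo.substring(colon + 1);
            } else {
                username = userInfo;
            }
        }
        return new ProxyUrlParts(uri.getScheme(), uri.getHost(), uri.getPort(), username, password);
    }
}
```

Checking for a missing user info section up front avoids a NullPointerException when a list mixes authenticated and unauthenticated proxies.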
To integrate them into our scrapers, we need to configure our code to pick a random proxy from this list every time we make a request.
In our Java OkHttp scraper we could do it like this:
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import okhttp3.Authenticator;
import okhttp3.Credentials;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;
import okhttp3.Route;

public class ProxyRotation {
    public static String[] proxyList = new String[] {
        "http://username:password@85.237.57.198:20000",
        "http://username:password@85.237.57.198:21000",
        "http://username:password@85.237.57.198:22000",
        "http://username:password@85.237.57.198:23000"
    };

    public static String getRandomProxy(String[] proxyList) {
        int rnd = new Random().nextInt(proxyList.length);
        return proxyList[rnd];
    }

    public static void main(String[] args) throws Exception {
        // pick a random proxy url and construct a new URL instance from it
        URL proxyUrl = new URL(getRandomProxy(proxyList));

        // parse proxy details
        String userInfo = proxyUrl.getUserInfo();
        int delimiterIndex = userInfo.indexOf(":");
        String proxyHost = proxyUrl.getHost();
        int proxyPort = proxyUrl.getPort();
        String username = userInfo.substring(0, delimiterIndex);
        String password = userInfo.substring(delimiterIndex + 1);

        // configure client to use proxy
        Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(proxyHost, proxyPort));
        Authenticator authenticator = new Authenticator() {
            @Override
            public Request authenticate(Route route, Response response) {
                return response.request().newBuilder()
                        .header("Proxy-Authorization", Credentials.basic(username, password))
                        .build();
            }
        };
        OkHttpClient client = new OkHttpClient.Builder()
                .proxy(proxy)
                .proxyAuthenticator(authenticator)
                .readTimeout(30, TimeUnit.SECONDS)
                .build();
        Request request = new Request.Builder()
                .url("https://httpbin.org/ip")
                .build();
        Response response = client.newCall(request).execute();
        System.out.println("Response body: " + response.body().string());
    }
}
This is a simplistic example, as when scraping at scale we would also need to build a mechanism to monitor the performance of each individual IP address and remove it from the proxy rotation if it got banned or blocked.
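As a minimal sketch of such a mechanism, the class below extends the JDK's java.net.ProxySelector, which can be handed to OkHttp through the proxySelector method of OkHttpClient.Builder in place of a single proxy. It hands out proxies in round-robin order and skips any proxy whose failure count has reached a limit. RotatingProxySelector, maxFailures and reportFailure are hypothetical names introduced here, and throwing when every proxy is exhausted is a choice of this sketch, made so that requests never silently fall back to a direct connection.

```java
import java.io.IOException;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.SocketAddress;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: round-robin proxy rotation that drops proxies after repeated failures.
final class RotatingProxySelector extends ProxySelector {
    private final List<Proxy> proxies;
    private final int maxFailures;
    private final Map<SocketAddress, AtomicInteger> failures = new ConcurrentHashMap<>();
    private final AtomicInteger cursor = new AtomicInteger();

    RotatingProxySelector(List<Proxy> proxies, int maxFailures) {
        if (proxies.isEmpty()) {
            throw new IllegalArgumentException("proxy list is empty");
        }
        this.proxies = new ArrayList<>(proxies);
        this.maxFailures = maxFailures;
    }

    @Override
    public List<Proxy> select(URI uri) {
        // Try each proxy at most once per call, starting from the rotating cursor.
        for (int attempt = 0; attempt < proxies.size(); attempt++) {
            int index = Math.floorMod(cursor.getAndIncrement(), proxies.size());
            Proxy candidate = proxies.get(index);
            if (failureCount(candidate.address()) < maxFailures) {
                return Collections.singletonList(candidate);
            }
        }
        throw new IllegalStateException("All proxies have been removed from rotation");
    }

    @Override
    public void connectFailed(URI uri, SocketAddress address, IOException failure) {
        // Invoked by the HTTP client when a connection to the proxy could not be established.
        reportFailure(address);
    }

    // Call this yourself when a response suggests a ban, e.g. an HTTP 403 or 429.
    void reportFailure(SocketAddress address) {
        failures.computeIfAbsent(address, key -> new AtomicInteger()).incrementAndGet();
    }

    int failureCount(SocketAddress address) {
        AtomicInteger count = failures.get(address);
        return count == null ? 0 : count.get();
    }
}
```

Two caveats apply. A ban usually shows up as an HTTP status code rather than a connection error, so connectFailed alone will not catch it, which is why reportFailure is exposed separately. Also, an HTTP client that pools connections may reuse an existing connection for several requests, so the rotation may happen per new connection rather than strictly per request.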
Proxy Integration #2: Using Proxy Gateway
Increasingly, a lot of proxy providers aren't selling lists of proxy IP addresses anymore. Instead, they give you access to their proxy pools via a proxy gateway.
Here, you only have to integrate a single proxy into your OkHttp scraper and the proxy provider will manage the proxy rotation, selection, cleaning, etc. on their end for you.
This is the most common way to use residential and mobile proxies, and is becoming increasingly common when using datacenter proxies too.
Here is an example of how to integrate BrightData's residential proxy gateway into our OkHttp scraper:
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.util.concurrent.TimeUnit;
import okhttp3.Authenticator;
import okhttp3.Credentials;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;
import okhttp3.Route;

public class BrightDataProxy {
    /**
     * Initializing global variables for bright data proxy url parameters
     * a typical url looks like this: http://USERNAME:PASSWORD@zproxy.lum-superproxy.io:22225
     */
    public static String BRIGHTDATA_USERNAME = "USERNAME";
    public static String BRIGHTDATA_PASSWORD = "PASSWORD";
    public static String BRIGHTDATA_HOSTNAME = "zproxy.lum-superproxy.io";
    public static int BRIGHTDATA_PORT = 22225;

    public static void main(String[] args) throws Exception {
        Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(BRIGHTDATA_HOSTNAME, BRIGHTDATA_PORT));
        Authenticator authenticator = new Authenticator() {
            @Override
            public Request authenticate(Route route, Response response) {
                return response.request().newBuilder()
                        .header("Proxy-Authorization", Credentials.basic(BRIGHTDATA_USERNAME, BRIGHTDATA_PASSWORD))
                        .build();
            }
        };
        OkHttpClient client = new OkHttpClient.Builder()
                .proxy(proxy)
                .proxyAuthenticator(authenticator)
                .readTimeout(30, TimeUnit.SECONDS)
                .build();
        Request request = new Request.Builder().url("https://httpbin.org/ip").build();
        Response response = client.newCall(request).execute();
        System.out.println("Response body: " + response.body().string());
    }
}
As you can see, it is much easier to integrate than using a proxy list as you don't have to worry about implementing all the proxy rotation logic.
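One way to check that a gateway really is rotating IPs on its side is to send several requests to https://httpbin.org/ip through it and count how many distinct origin values come back. The sketch below covers only the counting step, using the JDK alone. OriginCounter is a hypothetical helper, it assumes each response body is a small JSON object with an origin field as httpbin returns, and it uses a regular expression only to avoid pulling in a JSON library; a real project would more likely use a proper JSON parser.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects the distinct "origin" values found in httpbin.org/ip response bodies.
final class OriginCounter {
    private static final Pattern ORIGIN = Pattern.compile("\"origin\"\\s*:\\s*\"([^\"]+)\"");

    private OriginCounter() {}

    static Set<String> distinctOrigins(List<String> responseBodies) {
        Set<String> origins = new LinkedHashSet<>();
        for (String body : responseBodies) {
            Matcher matcher = ORIGIN.matcher(body);
            if (matcher.find()) {
                origins.add(matcher.group(1));
            }
        }
        return origins;
    }
}
```

If you pass in the bodies of, say, ten responses and the returned set contains only one address, the gateway is probably keeping you on the same exit IP, which some providers do by default for a session.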
Proxy Integration #3: Using Proxy API Endpoint
Recently, a lot of proxy providers have started offering smart proxy APIs that manage your proxy infrastructure for you, rotating proxies and headers so you can focus on extracting the data you need.
Here you typically send the URL you want to scrape to their API endpoint and then they will return the HTML response to you.
Although every proxy API provider has a slightly different API integration, they are all very similar and are very easy to integrate with.
Here is an example of how to integrate with the ScrapeOps Proxy Manager:
import java.net.URLEncoder;
import java.util.concurrent.TimeUnit;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;

public class ScrapeOpsProxyAPI {
    public static String SCRAPEOPS_API_KEY = "your_api_key";

    public static void main(String[] args) throws Exception {
        OkHttpClient client = new OkHttpClient.Builder()
                .readTimeout(30, TimeUnit.SECONDS)
                .build();
        String targetUrl = "https://httpbin.org/ip";
        // Percent-encode the target URL so that any query string or special
        // characters it contains are passed intact as the single url parameter
        String encodedTargetUrl = URLEncoder.encode(targetUrl, "UTF-8");
        String proxyAPIUrl = String.format("https://proxy.scrapeops.io/v1?api_key=%s&url=%s", SCRAPEOPS_API_KEY, encodedTargetUrl);
        Request request = new Request.Builder()
                .url(proxyAPIUrl)
                .build();
        Response response = client.newCall(request).execute();
        System.out.println("Response body: " + response.body().string());
    }
}
Here you simply send the targetUrl you want to scrape to the ScrapeOps API endpoint in the url query param, along with your SCRAPEOPS_API_KEY in the api_key param, and ScrapeOps will deal with finding the best proxy for that domain and return the HTML response to you.
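A target URL that carries its own query string (for example ?q=shoes&page=2) needs to be percent-encoded before it is placed in the url param, otherwise its & and = characters are read as separate parameters of the API request. The sketch below shows one way to build such a request URL with only the JDK. ProxyApiUrl and its build method are hypothetical names, and the only parameter names it is used with here are api_key and url from the example above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical helper: appends percent-encoded query parameters to an API endpoint.
final class ProxyApiUrl {
    private ProxyApiUrl() {}

    static String build(String endpoint, Map<String, String> params) {
        StringBuilder url = new StringBuilder(endpoint);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(encode(entry.getKey()))
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            // Form-style encoding: reserved characters become %XX and spaces become '+'.
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Passing the parameters in a LinkedHashMap keeps them in insertion order, so api_key stays in front of url in the resulting string. The result can be given directly to the url method of Request.Builder.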
You can get your own free API key with 1,000 free requests by signing up here.
More Web Scraping Tutorials
So that's how you can integrate proxies into your OkHttp scrapers.
If you would like to learn more about Web Scraping, then be sure to check out The Web Scraping Playbook.
Or check out one of our more in-depth guides: