Follow the steps below to reduce the chance of getting blocked while scraping data with WebHarvy and to stay anonymous.
1. Use the Inject pauses during mining feature to avoid sending continuous page requests to web servers over long periods. This reduces the chance of being detected and blocked by web servers, but it is not always effective, and your identity is still not hidden from the web server.
2. Select the Disable cookies while mining option in Browser Settings. Websites can use cookies stored locally by the browser to get details about your previous visits. When this option is enabled, WebHarvy periodically deletes browser cookies during mining.
We also recommend selecting the Enable custom user agent string option so that the scraping browser mimics a standard browser such as Chrome or Edge.
3. The Scrape via Proxy Server feature allows you to access and scrape websites through proxy servers, thereby maintaining anonymity while scraping data.
You may also use a VPN instead of proxies to anonymously scrape websites.
To configure this feature, open WebHarvy Settings and select the Proxy Settings tab. You may provide either a single proxy address or a list of proxy addresses.
If you select the 'Rotate proxies' option, WebHarvy periodically switches to the next proxy server in the list, so that each one is used in turn. Otherwise, only the first proxy in the list is used.
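For readers who want to see the ideas behind the three steps above in code, here is a minimal Python sketch of pausing between requests, sending a browser-like user agent without storing cookies, and rotating through a proxy list. It only illustrates the concepts and is not how WebHarvy implements them. The proxy addresses, the user agent string, the pause range and all function names are placeholder assumptions, and the sketch switches proxy on every request, which is a simplification.

```python
import itertools
import random
import time
import urllib.request

# Hypothetical values for illustration only; replace them with your own.
PROXIES = ["http://203.0.113.10:8080", "http://203.0.113.11:8080"]
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def proxy_cycle(proxies, rotate=True):
    """Return an endless iterator over the proxies to use.

    With rotate=True the list is cycled in order; otherwise only the
    first proxy in the list is ever returned.
    """
    if not proxies:
        raise ValueError("at least one proxy is required")
    return itertools.cycle(proxies if rotate else proxies[:1])


def build_opener(proxy):
    """Build an opener that routes requests through `proxy`, sends a
    browser-like user agent and keeps no cookies (no cookie handler
    is installed, so nothing is stored between requests)."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    opener.addheaders = [("User-Agent", USER_AGENT)]
    return opener


def fetch_all(urls, proxies=PROXIES, rotate=True, pause_range=(2.0, 5.0)):
    """Fetch each URL through the next proxy, pausing between requests."""
    pages = []
    chooser = proxy_cycle(proxies, rotate)
    for url in urls:
        opener = build_opener(next(chooser))
        with opener.open(url, timeout=30) as response:
            pages.append(response.read())
        # Randomized pause so requests are not sent back to back.
        time.sleep(random.uniform(*pause_range))
    return pages
```

Calling `fetch_all(["http://example.com/page1", "http://example.com/page2"])` would send the first request through the first proxy and the second request through the second one, waiting a few seconds after each request.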
How to obtain proxy server addresses?
Both free and paid proxy servers are available on the internet. You can find them with a Google search.
Free proxies are often slow and unreliable, and may cause the mining process to terminate early. For this reason, we do not recommend using free proxies with WebHarvy.
Our recommendation
You can choose any proxy or VPN service to perform web scraping anonymously. We highly recommend that you use the free trial offered by most services before purchasing, to verify that the service (proxy or VPN) works well with the websites from which you intend to extract data.
The link below lists some of the proxy services that we have tested and recommend using with WebHarvy for anonymous web scraping.
Proxy Server Recommendations for Web Scraping
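The check recommended above, verifying during a free trial that a proxy works with your target website, can also be scripted. The following is a minimal sketch using only the Python standard library. The function name `proxy_works`, the timeout and the success criterion (any 2xx response) are assumptions made for illustration, and this is not a WebHarvy feature.

```python
import http.client
import urllib.error
import urllib.request


def proxy_works(proxy, test_url, timeout=15):
    """Return True if `test_url` can be fetched through `proxy`.

    Any network error, timeout or HTTP error status counts as a failure.
    """
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(test_url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False
```

For example, `proxy_works("http://203.0.113.10:8080", "https://example.com/")` tries a placeholder proxy address against a placeholder site. In practice you would pass the address given by the proxy service and a page from the website you intend to scrape.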
Please contact our support team if you need assistance or have any questions.