'Follow this link' option
To gather more detailed data by following a link within the page, click the link. In the resulting Capture window, click the 'Follow this link' button.
When the 'Follow this link' button is clicked, WebHarvy navigates to the page that the link points to. Once the new page has loaded, you can select more data items to scrape by clicking on them.
If the 'Follow this link' option is disabled, you may use the 'Click' option listed under 'More Options' to click the link, load the linked page and then extract data. This is useful if you need to navigate links within product details pages or select tabs within a details page before extracting data.
Please make sure that you use the 'Click' option only when the 'Follow this link' option is disabled.
Follow URLs present in HTML
WebHarvy can be configured to follow links (absolute as well as relative URLs) present in the HTML code of the selected content. This option can be used when the 'Follow this link' or 'Click' options are not enabled or do not result in loading the required page. It is particularly useful when trying to capture data from popups.
During configuration, click on the link/image/button/element where the URL is embedded. In the resulting Capture window, click 'More Options' and select the 'Capture HTML' option to display the HTML code in the preview area. You may sometimes need to apply the 'Capture more content' option before selecting the 'Capture HTML' option to make sure that the HTML displayed in the preview area contains the URL to be opened.
Once the HTML containing the URL is displayed, click 'More Options' and select the 'Apply Regular Expression' option to capture the URL.
Using the correct RegEx string, capture the URL from the HTML code displayed in the preview.
Once the URL (relative or absolute) is captured and displayed in the preview area, the 'Follow this link' option will be enabled. Click the 'Follow this link' option to load the URL. If the 'Follow this link' option is disabled, click 'More Options' and select the 'Click' option.
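To get a feel for what a suitable RegEx string might look like, the capture step can be tried outside WebHarvy first. The following is only a minimal sketch in Python: the HTML fragment, the page address and the function name extract_url are made up for illustration, and Python's re module is not the regular expression engine WebHarvy itself uses, so a pattern may need adjusting before it is pasted into the 'Apply Regular Expression' dialog. The sketch assumes the URL sits in a double-quoted href attribute and that the part inside the parentheses (the first group) is the text to capture.

```python
import re
from urllib.parse import urljoin

# Hypothetical HTML, standing in for what 'Capture HTML' might display
# in the preview area. Not taken from any real page.
html = '<a class="details" href="/products/item?id=42">View details</a>'

# Hypothetical address of the page being configured. It is only needed
# here to turn a relative URL into an absolute one.
page_url = "https://www.example.com/catalog/list.html"

# Match the first double-quoted href attribute; group 1 holds the URL.
HREF_PATTERN = re.compile(r'href="([^"]*)"')


def extract_url(html_fragment, base_url):
    """Return the first href value as an absolute URL, or None if absent."""
    match = HREF_PATTERN.search(html_fragment)
    if match is None:
        return None
    # urljoin leaves absolute URLs unchanged and resolves relative ones
    # against the base address.
    return urljoin(base_url, match.group(1))


print(extract_url(html, page_url))
# prints https://www.example.com/products/item?id=42
```

If the URL is embedded elsewhere, for example in a single-quoted attribute or inside a script call that opens a popup, the pattern has to be changed to match the surrounding text in that HTML.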
WebHarvy can automatically follow links in web pages and capture data from the resulting pages.