Being a visual web scraper, WebHarvy lets you select most of the data you need from web pages with simple mouse clicks (click on the data > select the required option from the Capture window). Sometimes, however, the layout of product tiles on ecommerce websites varies from product to product: some products may have a discounted price, while others may carry a sponsored or special tag. In such cases, when you click to select data like price and images in the normal way, some rows of the data column may be blank. The following technique can be used to overcome this problem.
Scraping price from product tiles
Use the following method only if the normal ‘click and select data’ method fails to get all product prices.
Step 1
Click on the title of the first product.
Step 2
Click on the Capture More Content option in the Capture window toolbar once or twice, until the entire product tile text (including the price) is displayed in the preview area.
Step 3
Click on the Apply RegEx option in the Capture window and select the regex string for getting the price from the dropdown.
Step 4
Apply the selected RegEx and then click on the main ‘Capture Text’ button once the price is displayed in the preview area.
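To illustrate why a regular expression applied to the whole tile text is more robust than clicking on the price directly, here is a minimal Python sketch. The tile texts and the pattern are assumptions made for illustration; the pattern is not necessarily the regex string offered in WebHarvy's dropdown.

```python
import re

# Hypothetical tile texts, similar to what the preview area might show after
# the selection has been expanded to the whole product tile.
# The second tile has an extra "Sponsored" line, so the price sits at a
# different position within the tile.
tile_plain = "Acme Wireless Mouse\n$19.99\nFree delivery"
tile_sponsored = "Acme Keyboard\nSponsored\n$1,299.00\nFree delivery"

# Illustrative pattern (an assumption): a currency symbol followed by digits,
# optional thousands separators and optional two-digit decimals.
PRICE_PATTERN = re.compile(r"[$€£]\s?(\d+(?:,\d{3})*(?:\.\d{2})?)")

def first_price(text):
    """Return the first price found in the tile text, or None if there is none."""
    match = PRICE_PATTERN.search(text)
    return match.group(1) if match else None

print(first_price(tile_plain))
print(first_price(tile_sponsored))
```

Because the pattern searches the text of the entire tile, it finds the price regardless of which other elements (tags, badges, discount labels) appear around it.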
Scraping images or image URLs from product tiles
Normally, images can be selected for scraping by clicking directly on them during configuration and then clicking the Capture Image button. If the normal method does not select all product images, the workaround given below can be followed.
Step 1
Click on the title of the first product.
Step 2
Apply the Capture More Content option as many times as required, until the entire product tile text is displayed in the preview area.
Step 3
Click on the Capture HTML option in the Capture window toolbar.
Step 4
Click on the Apply RegEx option in the Capture window toolbar and, in the resulting window, select the regex string for getting the image URL from HTML.
Step 5
Apply the selected regex string and click on the ‘Capture Image’ button to download the product image or to scrape its URL.
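The idea behind extracting an image URL from the tile's HTML can be sketched in Python as follows. The HTML snippet, the example.com URL and the pattern are all hypothetical and serve only to illustrate the technique; the regex string that WebHarvy provides may differ.

```python
import re

# Hypothetical HTML of one product tile, similar to what Capture HTML might
# show in the preview area.
tile_html = (
    '<div class="tile">'
    '<span class="badge">Sponsored</span>'
    '<img class="thumb" src="https://example.com/images/mouse.jpg" alt="Acme Wireless Mouse">'
    '<h2>Acme Wireless Mouse</h2>'
    '</div>'
)

# Illustrative pattern (an assumption): capture the value of the src
# attribute of the first <img> tag, whatever attributes precede it.
IMG_SRC_PATTERN = re.compile(r'<img[^>]*?\ssrc="([^"]+)"', re.IGNORECASE)

def first_image_url(html):
    """Return the src of the first <img> tag in the HTML, or None if absent."""
    match = IMG_SRC_PATTERN.search(html)
    return match.group(1) if match else None

print(first_image_url(tile_html))
```

As with the price, matching against the HTML of the whole tile means the image is found even when extra elements, such as the badge in this sample, change the tile layout.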
Video
The following video shows an example in which image URLs and product prices are selected using the method described above.