<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" > <channel> <title>WebHarvy Blog</title> <atom:link href="https://www.webharvy.com/blog/feed/" rel="self" type="application/rss+xml" /> <link>https://www.webharvy.com/blog/</link> <description>Web Scraping Made Easy</description> <lastBuildDate>Thu, 07 Nov 2024 04:44:37 +0000</lastBuildDate> <language>en-US</language> <sy:updatePeriod> hourly </sy:updatePeriod> <sy:updateFrequency> 1 </sy:updateFrequency> <generator>https://wordpress.org/?v=6.7</generator> <image> <url>https://www.webharvy.com/blog/wp-content/uploads/2020/12/webharvy.png</url> <title>WebHarvy Blog</title> <link>https://www.webharvy.com/blog/</link> <width>32</width> <height>32</height> </image> <item> <title>WebHarvy 7.4 – Improvements in Page Scroll, Input Text, Image Download</title> <link>https://www.webharvy.com/blog/webharvy-7-4-changes/</link> <dc:creator><![CDATA[admin]]></dc:creator> <pubDate>Thu, 07 Nov 2024 04:41:44 +0000</pubDate> <category><![CDATA[Release update]]></category> <guid isPermaLink="false">https://www.webharvy.com/blog/?p=1655</guid> <description><![CDATA[<p>The following are the main changes in WebHarvy release 7.4.0.228. Input text dynamically updates the page content Often, when you enter text in an input field, the page updates dynamically. For example, other text on the page or items in a dropdown list may update instantly as you type. Previously, simulating this effect required using ... 
<a title="WebHarvy 7.4 – Improvements in Page Scroll, Input Text, Image Download" class="read-more" href="https://www.webharvy.com/blog/webharvy-7-4-changes/" aria-label="Read more about WebHarvy 7.4 – Improvements in Page Scroll, Input Text, Image Download">Read more</a></p> <p>The post <a href="https://www.webharvy.com/blog/webharvy-7-4-changes/">WebHarvy 7.4 – Improvements in Page Scroll, Input Text, Image Download</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>The following are the main changes in WebHarvy release 7.4.0.228.</p> <h2 class="wp-block-heading">Input text dynamically updates the page content</h2> <p>Often, when you enter text in an input field, the page updates dynamically. For example, other text on the page or items in a dropdown list may update instantly as you type. Previously, simulating this effect required using <a href="https://www.webharvy.com/docs/selecting-data.html#RunScript">JavaScript</a>. We have updated the input text functionality so that this is no longer required. Now, the page automatically updates whenever text is entered in an input field using the <a href="https://www.webharvy.com/docs/selecting-data.html#InputText">Input Text</a> option in the Capture window. </p> <figure class="wp-block-image size-full"><img fetchpriority="high" decoding="async" width="465" height="339" src="https://www.webharvy.com/blog/wp-content/uploads/2024/11/image.png" alt="Input Text Dynamically Updates the Page Content" class="wp-image-1656" srcset="https://www.webharvy.com/blog/wp-content/uploads/2024/11/image.png 465w, https://www.webharvy.com/blog/wp-content/uploads/2024/11/image-300x219.png 300w" sizes="(max-width: 465px) 100vw, 465px" /></figure> <p></p> <h2 class="wp-block-heading">XML configuration files are now auto-indented </h2> <p>WebHarvy saves configurations as XML files. 
These XML configuration files are now auto-indented to enhance readability, making it easier to inspect and edit them in a text editor.</p> <figure class="wp-block-image size-full"><img decoding="async" width="696" height="727" src="https://www.webharvy.com/blog/wp-content/uploads/2024/11/image-1.png" alt="XML Configuration Files Auto Indented" class="wp-image-1657" srcset="https://www.webharvy.com/blog/wp-content/uploads/2024/11/image-1.png 696w, https://www.webharvy.com/blog/wp-content/uploads/2024/11/image-1-287x300.png 287w" sizes="(max-width: 696px) 100vw, 696px" /></figure> <h2 class="wp-block-heading">Page Scrolling Improved</h2> <p>The Page Scroll functionality now scrolls more smoothly. This ensures that content further down the page loads correctly as the page is scrolled down. </p> <h2 class="wp-block-heading">Download Multiple Images using RegEx</h2> <p>The <a href="https://www.webharvy.com/docs/selecting-data.html#ScrapeByRegEx">RegEx multi-match</a> option allows you to select multiple image URLs from the page’s underlying HTML. When multiple image URLs appear in the preview area of the Capture window, the <a href="https://www.webharvy.com/docs/selecting-data.html#ScrapeImage">Capture Image</a> option will be enabled, allowing you to download all of them. </p> <figure class="wp-block-image size-full"><img decoding="async" width="539" height="596" src="https://www.webharvy.com/blog/wp-content/uploads/2024/11/image-2.png" alt="Scrape Multiple Images using Regular Expressions." 
class="wp-image-1658" srcset="https://www.webharvy.com/blog/wp-content/uploads/2024/11/image-2.png 539w, https://www.webharvy.com/blog/wp-content/uploads/2024/11/image-2-271x300.png 271w" sizes="(max-width: 539px) 100vw, 539px" /></figure> <p>The post <a href="https://www.webharvy.com/blog/webharvy-7-4-changes/">WebHarvy 7.4 – Improvements in Page Scroll, Input Text, Image Download</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></content:encoded> </item> <item> <title>How to Scrape Conrad.de Product Data using WebHarvy</title> <link>https://www.webharvy.com/blog/how-to-scrape-conrad-de-product-data-using-webharvy/</link> <dc:creator><![CDATA[admin]]></dc:creator> <pubDate>Tue, 03 Sep 2024 06:04:07 +0000</pubDate> <category><![CDATA[Case Studies]]></category> <category><![CDATA[Web Scraping Workshop]]></category> <category><![CDATA[WebHarvy]]></category> <category><![CDATA[conrad.de]]></category> <guid isPermaLink="false">https://www.webharvy.com/blog/?p=1619</guid> <description><![CDATA[<p>Scraping product data from Conrad.de, a leading electronics retailer, can provide valuable insights for market research, competitive analysis, and inventory management. Using WebHarvy, a visual point-and-click web scraping tool, makes this process easy and efficient, allowing you to easily extract detailed product information without needing any coding skills. This guide will show you how to ... 
<a title="How to Scrape Conrad.de Product Data using WebHarvy" class="read-more" href="https://www.webharvy.com/blog/how-to-scrape-conrad-de-product-data-using-webharvy/" aria-label="Read more about How to Scrape Conrad.de Product Data using WebHarvy">Read more</a></p> <p>The post <a href="https://www.webharvy.com/blog/how-to-scrape-conrad-de-product-data-using-webharvy/">How to Scrape Conrad.de Product Data using WebHarvy</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>Scraping product data from <a href="https://www.conrad.de">Conrad.de</a>, a leading electronics retailer, can provide valuable insights for market research, competitive analysis, and inventory management. <a href="https://www.webharvy.com">WebHarvy</a>, a visual point-and-click web scraping tool, makes this process easy and efficient, allowing you to extract detailed product information without any coding skills. This guide will show you how to use WebHarvy to automate data collection from Conrad.de, saving time and enhancing your data-driven strategies.</p> <figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper"> <iframe loading="lazy" title="Scrape Product Data from Conrad.de using WebHarvy" width="900" height="506" src="https://www.youtube.com/embed/_T2k0y-8HxM?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe> </div></figure> <h3 class="wp-block-heading"><strong>Step 1: Install and Launch WebHarvy</strong></h3> <p><a href="https://www.webharvy.com/download.html">Download</a> and install the WebHarvy software from our website. 
Once installed, launch the application to get started with the scraping process.</p> <h3 class="wp-block-heading"><strong>Step 2: Load the website and start configuration</strong></h3> <p>Load the page from which you wish to scrape data within WebHarvy’s browser and <a href="https://www.webharvy.com/tour.html">start configuration</a>. </p> <figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="546" src="https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-1024x546.png" alt="" class="wp-image-1620" srcset="https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-1024x546.png 1024w, https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-300x160.png 300w, https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-768x409.png 768w, https://www.webharvy.com/blog/wp-content/uploads/2024/09/image.png 1280w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure> <p></p> <h3 class="wp-block-heading"><strong>Step 3: Configure Pagination </strong></h3> <p>To scrape product data from multiple pages, you need to <a href="https://www.webharvy.com/tour3.html">configure pagination</a>. Click on the title of the first product and select <a href="https://www.webharvy.com/tour1.html#ScrollList">More Options > Scroll List</a> from the resulting Capture window. 
When the page scrolls down to the bottom, click on the link to load the next page and <a href="https://www.webharvy.com/tour3.html#Pagination">set it as the next page link</a>.</p> <figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="867" height="515" src="https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-1.png" alt="" class="wp-image-1621" srcset="https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-1.png 867w, https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-1-300x178.png 300w, https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-1-768x456.png 768w" sizes="auto, (max-width: 867px) 100vw, 867px" /></figure> <p></p> <h3 class="wp-block-heading"><strong>Step 4: Select Data to Scrape</strong></h3> <p>WebHarvy operates in point-and-click mode. Hover over the product data you want to scrape (such as product names, prices, descriptions, and images) and <a href="https://www.webharvy.com/tour1.html">click to select it</a>. WebHarvy will automatically detect similar data across the page and highlight it. Select the Capture Text option from Capture window to select the text of the clicked item (and subsequent items).</p> <figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="907" height="665" src="https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-2.png" alt="" class="wp-image-1622" srcset="https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-2.png 907w, https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-2-300x220.png 300w, https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-2-768x563.png 768w" sizes="auto, (max-width: 907px) 100vw, 907px" /></figure> <h3 class="wp-block-heading"><strong>Step 5: Follow product links and scrape additional data</strong></h3> <p>WebHarvy allows you to follow product links and select data from product details pages. 
For this, click on the title/link of the first product and select the <a href="https://www.webharvy.com/tour2.html">Follow this link</a> option from the Capture window. Once the product details page is loaded, you can click and select additional data. </p> <h3 class="wp-block-heading"><strong>Step 6: Stop Configuration and Start Mining</strong></h3> <p>Click on the “Stop Configuration” button to stop the configuration process. You may optionally save the configuration so that it can be run or edited later. Click on the <a href="https://www.webharvy.com/tour5.html">Start Mine</a> button to start mining data.</p> <figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="596" src="https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-3-1024x596.png" alt="" class="wp-image-1623" srcset="https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-3-1024x596.png 1024w, https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-3-300x175.png 300w, https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-3-768x447.png 768w, https://www.webharvy.com/blog/wp-content/uploads/2024/09/image-3.png 1131w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure> <p>The mined data can be <a href="https://www.webharvy.com/tour6.html">saved to a file or to a database</a>. </p> <h2 class="wp-block-heading">Try it!</h2> <p>Download and try the free evaluation version of WebHarvy to see if it solves your web scraping requirements. <a href="https://www.webharvy.com/articles/getting-started.html">To get started, follow this link</a>. 
</p> <p>The post <a href="https://www.webharvy.com/blog/how-to-scrape-conrad-de-product-data-using-webharvy/">How to Scrape Conrad.de Product Data using WebHarvy</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></content:encoded> </item> <item> <title>WebHarvy 7.3 – Keywords via Input-Text, Miner Options saved in Configuration etc.</title> <link>https://www.webharvy.com/blog/webharvy-7-3-keywords-via-input-text-miner-options-saved-in-configuration-etc/</link> <dc:creator><![CDATA[admin]]></dc:creator> <pubDate>Tue, 18 Jun 2024 06:52:23 +0000</pubDate> <category><![CDATA[Release update]]></category> <guid isPermaLink="false">https://www.webharvy.com/blog/?p=1615</guid> <description><![CDATA[<p>The following are the main changes in Version 7.3 of WebHarvy. Support for adding keywords via the ‘Input-Text’ option There are websites where the search functionality is implemented such that the search keyword which user enters does not appear in the URL or POST data of the search results page. In these cases, if you ... <a title="WebHarvy 7.3 – Keywords via Input-Text, Miner Options saved in Configuration etc." class="read-more" href="https://www.webharvy.com/blog/webharvy-7-3-keywords-via-input-text-miner-options-saved-in-configuration-etc/" aria-label="Read more about WebHarvy 7.3 – Keywords via Input-Text, Miner Options saved in Configuration etc.">Read more</a></p> <p>The post <a href="https://www.webharvy.com/blog/webharvy-7-3-keywords-via-input-text-miner-options-saved-in-configuration-etc/">WebHarvy 7.3 – Keywords via Input-Text, Miner Options saved in Configuration etc.</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>The following are the main changes in Version 7.3 of WebHarvy. 
</p> <h2 class="wp-block-heading">Support for adding keywords via the ‘Input-Text’ option</h2> <p>On some websites, the search keyword that the user enters does not appear in the URL or POST data of the search results page. In these cases, if you use the <a href="https://www.webharvy.com/tour1.html#InputText">‘Input Text’</a> Capture window option to enter the keyword in the search box and then perform the search during configuration, keywords can be added to the configuration later using the ‘<a href="https://www.webharvy.com/tour71.html#addkeywordslater">Add Keywords</a>’ functionality. </p> <h2 class="wp-block-heading">Miner options saved in the configuration file</h2> <p>The following miner options are now saved in the configuration file, so that each configuration can have its own specific values for these settings. Previously, global settings were used for all configurations.</p> <ol class="wp-block-list"> <li><a href="https://www.webharvy.com/tour81.html#AdvancedMinerOptions">Advanced Miner Options</a></li> <li><a href="https://www.webharvy.com/tour81.html#MinerSettings">Page Load Timeout</a></li> <li><a href="https://www.webharvy.com/tour81.html#MinerSettings">Script Load Wait Time</a></li> </ol> <h2 class="wp-block-heading">Category tagging is now done using the full category path for multi-level category scraping</h2> <p>For <a href="https://www.webharvy.com/tour7.html#MultiLevelCategory">multi-level category scraping</a>, the category tagging column is now filled with the full category path (example: main category, sub category 1, sub category 2, final category). Previously, the <a href="https://www.webharvy.com/tour81.html#CategoryKWSettings">category column</a> was filled with the final page URL instead of category names. </p> <h2 class="wp-block-heading">Updated Browser</h2> <p>The Chromium browser that WebHarvy uses internally has been updated to the latest version. 
This solves issues with Cloudflare protection reported on some websites. </p> <p>The post <a href="https://www.webharvy.com/blog/webharvy-7-3-keywords-via-input-text-miner-options-saved-in-configuration-etc/">WebHarvy 7.3 – Keywords via Input-Text, Miner Options saved in Configuration etc.</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></content:encoded> </item> <item> <title>Scraping StatScore.com Live Scores from ScoreFrame using WebHarvy</title> <link>https://www.webharvy.com/blog/scraping-statscore-live-scores/</link> <dc:creator><![CDATA[admin]]></dc:creator> <pubDate>Mon, 29 Jan 2024 15:07:47 +0000</pubDate> <category><![CDATA[Case Studies]]></category> <category><![CDATA[Web Scraping Workshop]]></category> <category><![CDATA[WebHarvy]]></category> <category><![CDATA[statscore]]></category> <guid isPermaLink="false">https://www.webharvy.com/blog/?p=1599</guid> <description><![CDATA[<p>StatScore.com is a sports data and statistics website, which displays live score data, standings and match statistics for various sports, leagues and matches. In this article you will learn how to scrape live score data from StatScore’s ScoreFrame product. The easiest and fastest way to scrape data from any website is to use a web ... 
<a title="Scraping StatScore.com Live Scores from ScoreFrame using WebHarvy" class="read-more" href="https://www.webharvy.com/blog/scraping-statscore-live-scores/" aria-label="Read more about Scraping StatScore.com Live Scores from ScoreFrame using WebHarvy">Read more</a></p> <p>The post <a href="https://www.webharvy.com/blog/scraping-statscore-live-scores/">Scraping StatScore.com Live Scores from ScoreFrame using WebHarvy</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></description> <content:encoded><![CDATA[ <p><a href="https://www.statscore.com/">StatScore.com</a> is a sports data and statistics website that displays live score data, standings and match statistics for various sports, leagues and matches. In this article you will learn how to scrape live score data from StatScore’s ScoreFrame product. </p> <p>The easiest and fastest way to scrape data from any website is to use web scraping software. <a href="https://www.webharvy.com/index.html">WebHarvy</a> is a visual web scraping tool that can be used to scrape data from any website. WebHarvy allows you to select the data you need to scrape with simple mouse clicks. WebHarvy can be used to <a href="https://www.webharvy.com/articles/scraping-sports-betting-odds.html">scrape sports analytics</a> and betting data from various websites like <a href="https://www.webharvy.com/articles/scraping-oddsportal.html">OddsPortal</a>, <a href="https://www.webharvy.com/articles/scraping-flashscore.html">FlashScore</a>, <a href="https://www.webharvy.com/articles/scraping-betexplorer.html">BetExplorer</a>, <a href="https://www.webharvy.com/blog/scrape-whoscored-live-scores/">WhoScored</a> etc. </p> <h2 class="wp-block-heading">Video Demonstration</h2> <p>The following video shows how WebHarvy can be used to scrape live score data from StatScore. 
</p> <figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper"> <iframe loading="lazy" title="Scraping StatScore.com | ScoreFrame | Today's Matches" width="900" height="506" src="https://www.youtube.com/embed/G5O2UKPUfvA?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe> </div><figcaption class="wp-element-caption">Scraping StatScore Live Scores using WebHarvy</figcaption></figure> <p></p> <h2 class="wp-block-heading">Steps to follow to scrape StatScore Live Scores</h2> <ul class="wp-block-list"> <li><a href="https://www.webharvy.com/download.html">Download</a> and install WebHarvy on your computer. </li> <li>Open WebHarvy and navigate to the StatScore.com <a href="https://www.statscore.com/products/scoreframe/">ScoreFrame</a> page from which you need to scrape data.</li> <li>Open <a href="https://www.webharvy.com/tour81.html#AdvancedMinerOptions">WebHarvy Settings > Advanced Miner Options</a> and select value ‘3’ for ‘Minimum number of items required in a list’. Apply changes.</li> </ul> <figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="578" src="https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-1-1024x578.png" alt="" class="wp-image-1600" srcset="https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-1-1024x578.png 1024w, https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-1-300x169.png 300w, https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-1-768x433.png 768w, https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-1.png 1276w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure> <p></p> <ul class="wp-block-list"> <li>If you scroll down the page, you can see that after the first few matches, the rest are shown in a collapsed layout. 
We need to expand them so that data for all matches listed on the page can be scraped.</li> <li><a href="https://www.webharvy.com/tour.html">Start Configuration</a></li> <li>Click anywhere on the page and select the <a href="https://www.webharvy.com/tour1.html#RunScript">More Options > Run Script</a> option from the resulting Capture window. Paste and run the following code, which will expand all matches. </li> </ul> <pre class="wp-block-code"><code>var els = document.getElementsByClassName('fa fa-angle-double-down'); for (var i = els.length - 1; i >= 0; i--) { if(!els[i].className.includes('180')) { els[i].parentElement.click(); } }</code></pre> <ul class="wp-block-list"> <li>Now you can start selecting data. The match time text can be directly clicked and selected using the ‘<a href="https://www.webharvy.com/tour1.html#ScrapeText">Capture Text</a>’ option in the Capture window. </li> <li>To select the home and away team names, select the entire block text first using the ‘<a href="https://www.webharvy.com/tour1.html#ScrapeMore">Capture More Content</a>’ option and then highlight the required portion (team name) before clicking on the ‘<a href="https://www.webharvy.com/tour1.html#ScrapeText">Capture Text</a>’ button.</li> </ul> <figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="913" height="581" src="https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-2.png" alt="" class="wp-image-1601" srcset="https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-2.png 913w, https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-2-300x191.png 300w, https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-2-768x489.png 768w" sizes="auto, (max-width: 913px) 100vw, 913px" /></figure> <p></p> <ul class="wp-block-list"> <li>The scores can be selected by clicking directly on their respective cells and using the Capture Text option. 
</li> <li>Once you have finished selecting all required data, Stop Configuration.</li> <li>Save the configuration so that it can be edited or run later.</li> <li>Click the <a href="https://www.webharvy.com/tour5.html">Start Mine</a> menu bar button to start mining data.</li> </ul> <figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="578" src="https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-4-1024x578.png" alt="" class="wp-image-1603" srcset="https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-4-1024x578.png 1024w, https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-4-300x169.png 300w, https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-4-768x433.png 768w, https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-4.png 1276w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure> <p></p> <h2 class="wp-block-heading">Try it yourself</h2> <p>We highly recommend that you try this yourself after downloading and installing the free evaluation version of WebHarvy. To get started, please <a href="https://www.webharvy.com/articles/getting-started.html">follow this link</a>. </p> <p>The post <a href="https://www.webharvy.com/blog/scraping-statscore-live-scores/">Scraping StatScore.com Live Scores from ScoreFrame using WebHarvy</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></content:encoded> </item> <item> <title>WebHarvy 7.2 Release Update</title> <link>https://www.webharvy.com/blog/webharvy-7-2-release-update/</link> <dc:creator><![CDATA[admin]]></dc:creator> <pubDate>Wed, 10 Jan 2024 12:05:38 +0000</pubDate> <category><![CDATA[Release update]]></category> <category><![CDATA[WebHarvy]]></category> <category><![CDATA[new release]]></category> <guid isPermaLink="false">https://www.webharvy.com/blog/?p=1587</guid> <description><![CDATA[<p>The following are the changes in this version. 1. 
Updated quick start guide with useful links and demonstration/tutorial video search for various websites. 2. For multi-level category scraping, when the ‘Tag with category’ option is enabled in WebHarvy Settings, instead of filling the last column in data table with only the final sub-category name, the ... <a title="WebHarvy 7.2 Release Update" class="read-more" href="https://www.webharvy.com/blog/webharvy-7-2-release-update/" aria-label="Read more about WebHarvy 7.2 Release Update">Read more</a></p> <p>The post <a href="https://www.webharvy.com/blog/webharvy-7-2-release-update/">WebHarvy 7.2 Release Update</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>The following are the changes in this version.</p> <p>1. Updated quick start guide with useful links and demonstration/tutorial video search for various websites.</p> <figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="634" src="https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-1024x634.png" alt="" class="wp-image-1588" srcset="https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-1024x634.png 1024w, https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-300x186.png 300w, https://www.webharvy.com/blog/wp-content/uploads/2024/01/image-768x475.png 768w, https://www.webharvy.com/blog/wp-content/uploads/2024/01/image.png 1388w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure> <p></p> <p>2. 
For <a href="https://www.webharvy.com/tour7.html#MultiLevelCategory">multi-level category scraping</a>, when the ‘Tag with category’ option is enabled in <a href="https://www.webharvy.com/tour81.html#CategoryKWSettings">WebHarvy Settings</a>, instead of filling the last column in data table with only the final sub-category name, the full category path starting from main category to final sub category is filled (eg: Main Category, Sub Category 1, Sub Category 2, Final Category). </p> <p>3. Updated the internal browser to <a href="https://www.chromium.org/Home/">Chromium</a> V117</p> <p>4. Pagination (loading next page data) made faster. </p> <p>5. Updated installer and other third party libraries used. </p> <p>The latest version of WebHarvy is available for download at <a href="https://www.webharvy.com/download.html ">https://www.webharvy.com/download.html</a></p> <p>The post <a href="https://www.webharvy.com/blog/webharvy-7-2-release-update/">WebHarvy 7.2 Release Update</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></content:encoded> </item> <item> <title>Scrape GitHub Release Notes</title> <link>https://www.webharvy.com/blog/scrape-github-release-notes/</link> <dc:creator><![CDATA[admin]]></dc:creator> <pubDate>Fri, 17 Nov 2023 06:47:24 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.webharvy.com/blog/?p=1579</guid> <description><![CDATA[<p>This article demonstrates how WebHarvy can be used to scrape GitHub release notes. With WebHarvy, it is possible to efficiently scrape release details like version numbers and release notes from multiple pages. WebHarvy is a generic web scraping software which can be used to scrape data from any website. Steps to follow The first step ... 
<a title="Scrape GitHub Release Notes" class="read-more" href="https://www.webharvy.com/blog/scrape-github-release-notes/" aria-label="Read more about Scrape GitHub Release Notes">Read more</a></p> <p>The post <a href="https://www.webharvy.com/blog/scrape-github-release-notes/">Scrape GitHub Release Notes</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>This article demonstrates how <a href="https://www.webharvy.com/index.html">WebHarvy</a> can be used to scrape <a href="https://github.com">GitHub</a> release notes. With WebHarvy, it is possible to efficiently scrape release details like version numbers and release notes from multiple pages.</p> <figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="494" src="https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-1024x494.png" alt="" class="wp-image-1580" srcset="https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-1024x494.png 1024w, https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-300x145.png 300w, https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-768x370.png 768w, https://www.webharvy.com/blog/wp-content/uploads/2023/11/image.png 1064w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure> <p></p> <p>WebHarvy is generic web scraping software that can be used to scrape data from any website. </p> <h2 class="wp-block-heading">Steps to follow</h2> <p>The first step is to <a href="https://www.webharvy.com/download.html">download</a> and install WebHarvy on your computer, if you have not done so already. Then load the page from which you need to scrape data within WebHarvy’s configuration browser.</p> <p>Once the page has been loaded, click on the <strong>Start </strong>button to <a href="https://www.webharvy.com/tour.html">start configuration</a>. 
Once in configuration mode, you can click and select any data item (text or image) which you wish to scrape. </p> <figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="692" src="https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-1-1024x692.png" alt="" class="wp-image-1581" srcset="https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-1-1024x692.png 1024w, https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-1-300x203.png 300w, https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-1-768x519.png 768w, https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-1.png 1423w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure> <p></p> <p>Clicking on any data item on page will bring up a <a href="https://www.webharvy.com/tour1.html">Capture window</a> with various options. Select the <strong>Capture Text </strong>option to select the text of the clicked item. Details like version number and release note text can be selected for scraping in this manner.</p> <figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="655" src="https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-2-1024x655.png" alt="" class="wp-image-1582" srcset="https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-2-1024x655.png 1024w, https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-2-300x192.png 300w, https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-2-768x491.png 768w, https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-2.png 1486w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure> <p></p> <p>While selecting release notes, if the entire block of text is not selected, you can apply <strong><a href="https://www.webharvy.com/tour1.html#ScrapeMore">Capture More Content</a> </strong>option multiple times till the desired portion is selected. 
</p> <p>To configure pagination, that is, to teach WebHarvy how to scrape data from multiple pages, scroll down to the bottom of the page and click the link that loads the next page (either the ‘next’ link or the direct link to page number 2). Then, from the resulting Capture window, select the <strong><a href="https://www.webharvy.com/tour3.html">Set as Next Page link</a></strong> option. </p> <p>Once all data has been selected, click <strong>Stop Configuration</strong> and then <strong><a href="https://www.webharvy.com/tour5.html">Start Mine</a></strong>.</p> <figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="690" src="https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-3-1024x690.png" alt="" class="wp-image-1583" srcset="https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-3-1024x690.png 1024w, https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-3-300x202.png 300w, https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-3-768x517.png 768w, https://www.webharvy.com/blog/wp-content/uploads/2023/11/image-3.png 1289w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure> <p></p> <h2 class="wp-block-heading">Try WebHarvy</h2> <p>You may download and try the 15-day free evaluation version of WebHarvy by visiting the following link. 
If you have any questions, please feel free to reach out to our support.</p> <p><a href="https://www.webharvy.com/articles/getting-started.html">https://www.webharvy.com/articles/getting-started.html</a></p> <p>The post <a href="https://www.webharvy.com/blog/scrape-github-release-notes/">Scrape GitHub Release Notes</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></content:encoded> </item> <item> <title>How to run WebHarvy on your Mac?</title> <link>https://www.webharvy.com/blog/how-to-run-webharvy-on-your-mac/</link> <dc:creator><![CDATA[admin]]></dc:creator> <pubDate>Mon, 13 Nov 2023 10:17:26 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.webharvy.com/blog/?p=1575</guid> <description><![CDATA[<p>WebHarvy is a Windows application – it requires a Windows PC or Laptop to run. But this fact does not prevent you from running WebHarvy on your M1 or M2 Mac. WebHarvy can be run on macOS using Parallels software. Parallels allows you to run Windows on your Mac. Downloading, installing and configuring Windows on ... <a title="How to run WebHarvy on your Mac?" class="read-more" href="https://www.webharvy.com/blog/how-to-run-webharvy-on-your-mac/" aria-label="Read more about How to run WebHarvy on your Mac?">Read more</a></p> <p>The post <a href="https://www.webharvy.com/blog/how-to-run-webharvy-on-your-mac/">How to run WebHarvy on your Mac?</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>WebHarvy is a Windows application – it requires a Windows PC or Laptop to run. But this fact does not prevent you from running WebHarvy on your M1 or M2 Mac. WebHarvy can be run on macOS using <a href="https://www.parallels.com/">Parallels</a> software. </p> <p>Parallels allows you to run Windows on your Mac. Downloading, installing and configuring Windows on your Mac is very easy using Parallels. 
You can start running WebHarvy on your Mac with a few mouse clicks using Parallels. </p> <h2 class="wp-block-heading">Installing Windows 11 on your M1/M2 Mac </h2> <p>The video below shows how you can install Windows 11 on your Mac using Parallels.</p> <figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper"> <iframe loading="lazy" title="How to install Windows 11 on M1/M2 Macs using Parallels 18" width="900" height="506" src="https://www.youtube.com/embed/-pysqMlhWQE?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe> </div></figure> <p></p> <h2 class="wp-block-heading">Installing and running WebHarvy on Mac</h2> <p></p> <p>Once you have installed Windows 11 via Parallels, installing WebHarvy is just like installing any other application on a normal PC. <a href="https://www.webharvy.com/download.html">Download the WebHarvy installer</a> and complete the installation. The video below shows WebHarvy running on an M1 MacBook Pro. </p> <figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-4-3 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper"> <iframe loading="lazy" title="WebHarvy running on M1 MacBook Pro via Parallels" width="900" height="675" src="https://www.youtube.com/embed/WdGlTe3NcYM?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe> </div></figure> <p></p> <h2 class="wp-block-heading">Questions?</h2> <p>If you have any questions, please feel free to reach out to our <a href="https://www.webharvy.com/support.html">tech support team</a>. 
</p> <p>The post <a href="https://www.webharvy.com/blog/how-to-run-webharvy-on-your-mac/">How to run WebHarvy on your Mac?</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></content:encoded> </item> <item> <title>Scraping TrustPilot Reviews, Ratings and Contact Details</title> <link>https://www.webharvy.com/blog/scraping-trustpilot-reviews-ratings-and-contact-details/</link> <dc:creator><![CDATA[admin]]></dc:creator> <pubDate>Thu, 26 Oct 2023 09:43:21 +0000</pubDate> <category><![CDATA[Uncategorized]]></category> <guid isPermaLink="false">https://www.webharvy.com/blog/?p=1569</guid> <description><![CDATA[<p>Trustpilot is an online review platform that allows consumers to leave feedback and reviews about their experiences with various products and services. In this article you will learn how to scrape TrustPilot app reviews, ratings and contact details using WebHarvy. WebHarvy is an easy to use web scraping software using which data can be extracted ... <a title="Scraping TrustPilot Reviews, Ratings and Contact Details" class="read-more" href="https://www.webharvy.com/blog/scraping-trustpilot-reviews-ratings-and-contact-details/" aria-label="Read more about Scraping TrustPilot Reviews, Ratings and Contact Details">Read more</a></p> <p>The post <a href="https://www.webharvy.com/blog/scraping-trustpilot-reviews-ratings-and-contact-details/">Scraping TrustPilot Reviews, Ratings and Contact Details</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></description> <content:encoded><![CDATA[ <p><a href="https://www.trustpilot.com/">Trustpilot</a> is an online review platform that allows consumers to leave feedback and reviews about their experiences with various products and services. 
In this article, you will learn how to scrape TrustPilot app reviews, ratings and contact details using <a href="https://www.webharvy.com/index.html">WebHarvy</a>.</p> <p>WebHarvy is an easy-to-use web scraping software with which data can be extracted from any website. WebHarvy allows you to select the data which you need via an intuitive point-and-click user interface. </p> <p>First, <a href="https://www.webharvy.com/download.html">download</a> and install WebHarvy on your computer. The app looks like a browser, within which you can load and navigate web pages. You then teach the software the location of the data which you wish to extract by clicking on it during a <a href="https://www.webharvy.com/demo.html">configuration phase</a>. </p> <h2 class="wp-block-heading">Video: Scraping TrustPilot Review Data</h2> <p>The following video shows how WebHarvy can be used to scrape data from TrustPilot. Details like product/service name, reviews and ratings, and contact details including email, phone and address can be scraped from TrustPilot as demonstrated in the video.</p> <figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper"> <iframe loading="lazy" title="Scraping TrustPilot.com | Ratings, Reviews, Email, Phone etc." width="900" height="506" src="https://www.youtube.com/embed/w-mVrZbpPow?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe> </div></figure> <p>As you can see, WebHarvy allows you to click and select the data items which you wish to scrape from a web page. WebHarvy automatically detects and scrapes patterns of data occurring in list or table format. Each listing <a href="https://www.webharvy.com/tour2.html">link can be followed</a> to load its details page, from where more data can be selected. 
Scraping <a href="https://www.webharvy.com/tour3.html">data from multiple pages</a> of listings is also supported.</p> <h2 class="wp-block-heading">Try WebHarvy</h2> <p>We recommend that you download and try the free evaluation version of WebHarvy available on our website. Please follow the link below to get started.</p> <p><a href="https://www.webharvy.com/articles/getting-started.html">Getting started with WebHarvy</a></p> <p>If you have any questions, please do not hesitate to <a href="https://www.webharvy.com/support.html">contact our support</a> team. </p> <p>The post <a href="https://www.webharvy.com/blog/scraping-trustpilot-reviews-ratings-and-contact-details/">Scraping TrustPilot Reviews, Ratings and Contact Details</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></content:encoded> </item> <item> <title>Scrape LandWatch Property Listings Data</title> <link>https://www.webharvy.com/blog/scrape-landwatch-property-data/</link> <dc:creator><![CDATA[admin]]></dc:creator> <pubDate>Fri, 13 Oct 2023 05:05:12 +0000</pubDate> <category><![CDATA[Case Studies]]></category> <category><![CDATA[Web Scraping Workshop]]></category> <category><![CDATA[WebHarvy]]></category> <category><![CDATA[landwatch]]></category> <category><![CDATA[real estate]]></category> <guid isPermaLink="false">https://www.webharvy.com/blog/?p=1558</guid> <description><![CDATA[<p>LandWatch.com is one of the largest rural property listing websites. It is a part of Land.com network. LandWatch allows users to search for a variety of land types including agricultural land, recreational land, undeveloped land and other types of real estate. In this article you will learn how to scrape LandWatch.com property listing data. Details ... 
<a title="Scrape LandWatch Property Listings Data" class="read-more" href="https://www.webharvy.com/blog/scrape-landwatch-property-data/" aria-label="Read more about Scrape LandWatch Property Listings Data">Read more</a></p> <p>The post <a href="https://www.webharvy.com/blog/scrape-landwatch-property-data/">Scrape LandWatch Property Listings Data</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></description> <content:encoded><![CDATA[ <p><a href="https://www.landwatch.com/">LandWatch.com</a> is one of the largest rural property listing websites. It is part of the <a href="https://www.land.com/">Land.com</a> network. LandWatch allows users to search for a variety of land types including agricultural land, recreational land, undeveloped land and other types of real estate. In this article, you will learn how to scrape LandWatch.com property listing data. Details like address, price, availability status and seller/agent contact details can be scraped from LandWatch.com.</p> <h2 class="wp-block-heading">Using WebHarvy for scraping LandWatch.com</h2> <p>We are going to use <a href="https://www.webharvy.com/index.html">WebHarvy</a> to scrape data from LandWatch.com property listing pages. WebHarvy is a <a href="https://www.webharvy.com/demo.html">visual web scraper</a> which can be used to scrape data from any website. It is very easy to use and allows you to select the data to scrape from web pages with simple mouse clicks. </p> <p>WebHarvy can also be used to <a href="https://www.webharvy.com/articles/scraping-real-estate.html">scrape data from other real estate websites</a> like <a href="https://www.webharvy.com/articles/scraping-realtor-real-estate-listings.html">Realtor</a>, <a href="https://www.webharvy.com/articles/scrape-trulia-real-estate.html">Trulia</a>, <a href="https://www.webharvy.com/articles/scraping-zillow-real-estate-data.html">Zillow</a> and RedFin. 
If you have not used WebHarvy before, we recommend that you refer to our <a href="https://www.webharvy.com/articles/getting-started.html">getting started guide</a>.</p> <h2 class="wp-block-heading">Video: How to scrape LandWatch?</h2> <p>The following video shows how WebHarvy can be used to scrape data from LandWatch.com. As you can see, WebHarvy contains a built-in web browser in which you can load and navigate web pages. The data which you need to scrape can be selected using simple mouse clicks. </p> <figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper"> <iframe loading="lazy" title="How to Scrape LandWatch Property Listings Data | WebHarvy" width="900" height="506" src="https://www.youtube.com/embed/Lx-77zO5_kU?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe> </div></figure> <h2 class="wp-block-heading">Steps to follow to scrape LandWatch</h2> <ol class="wp-block-list"> <li><a href="https://www.webharvy.com/download.html">Download</a> and install WebHarvy</li> <li>Open WebHarvy and load the landwatch.com property listings page</li> <li><a href="https://www.webharvy.com/tour.html">Start Configuration</a></li> <li>Select the first listing and apply the <a href="https://www.webharvy.com/tour1.html#ScrollList">Scroll List</a> option from the Capture window. This will smoothly scroll the page down so that all listings are correctly loaded</li> <li>Click the link to load the second page and <a href="https://www.webharvy.com/tour3.html">set it as the next page link</a></li> <li>Scroll back up to the first listing and start selecting data</li> <li>Details like address, price and URL can be selected by clicking directly on the text. 
WebHarvy will show a <a href="https://www.webharvy.com/tour1.html">Capture window</a> with various options whenever you click on any data item displayed on the page. To capture the text of the clicked item, select the ‘Capture Text’ option. </li> <li>Use the <a href="https://www.webharvy.com/tour2.html">Follow this link</a> option to follow links and scrape data from property details pages</li> <li>The <a href="https://www.webharvy.com/tour1.html#ScrapeFollowingText">Capture Following Text</a> option can be used whenever the data which you need to scrape appears after a heading</li> <li>Stop Configuration</li> <li><a href="https://www.webharvy.com/tour4.html">Save Configuration</a></li> <li><a href="https://www.webharvy.com/tour5.html">Start Mine</a></li> </ol> <h2 class="wp-block-heading">Questions</h2> <p>If you have any questions, please <a href="https://www.webharvy.com/support.html">contact our support</a> or refer to our <a href="https://www.webharvy.com/articles/troubleshoot.html">troubleshooting</a> / <a href="https://www.webharvy.com/articles/howto.html">how-to</a> guides. A 15-day <a href="https://www.webharvy.com/download.html">free trial version of WebHarvy</a> can be downloaded from our website. 
</p> <p>The post <a href="https://www.webharvy.com/blog/scrape-landwatch-property-data/">Scrape LandWatch Property Listings Data</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></content:encoded> </item> <item> <title>How to scrape Google News without writing any code?</title> <link>https://www.webharvy.com/blog/how-to-scrape-google-news-without-writing-any-code/</link> <dc:creator><![CDATA[admin]]></dc:creator> <pubDate>Tue, 03 Oct 2023 10:41:44 +0000</pubDate> <category><![CDATA[Case Studies]]></category> <category><![CDATA[How To]]></category> <category><![CDATA[Web Scraping Workshop]]></category> <category><![CDATA[WebHarvy]]></category> <category><![CDATA[google news]]></category> <guid isPermaLink="false">https://www.webharvy.com/blog/?p=1555</guid> <description><![CDATA[<p>In this article you will learn how to scrape data from Google News (news.google.com), without writing any code, using WebHarvy. WebHarvy is a visual web scraping software which can be used to scrape data from any website. WebHarvy You will need to download and install WebHarvy in your computer. WebHarvy allows you to select data ... <a title="How to scrape Google News without writing any code?" class="read-more" href="https://www.webharvy.com/blog/how-to-scrape-google-news-without-writing-any-code/" aria-label="Read more about How to scrape Google News without writing any code?">Read more</a></p> <p>The post <a href="https://www.webharvy.com/blog/how-to-scrape-google-news-without-writing-any-code/">How to scrape Google News without writing any code?</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></description> <content:encoded><![CDATA[ <p>In this article you will learn how to scrape data from <a href="https://news.google.com/">Google News</a> (<a href="https://news.google.com/">news.google.com</a>), without writing any code, using WebHarvy. 
<a href="https://www.webharvy.com/">WebHarvy</a> is a visual web scraping software which can be used to scrape data from any website. </p> <h2 class="wp-block-heading">WebHarvy</h2> <p>You will need to <a href="https://www.webharvy.com/download.html">download and install WebHarvy</a> on your computer. WebHarvy allows you to select data from web pages via an easy-to-use, <a href="https://www.webharvy.com/demo.html">point-and-click user interface</a>. You can select the data which you need to scrape with simple mouse clicks. WebHarvy automatically identifies and parses repeating data (in lists or tables) displayed by web pages. </p> <h2 class="wp-block-heading">Steps to follow to scrape Google News articles</h2> <ol class="wp-block-list"> <li><a href="https://www.webharvy.com/download.html">Download</a> and install WebHarvy on your computer</li> <li>Open WebHarvy</li> <li>Load the news.google.com page from which you need to scrape data. WebHarvy’s built-in browser, which is based on Chromium, can load and navigate web pages just like any normal browser.</li> <li>Once the page displaying the data is loaded, click <a href="https://www.webharvy.com/tour.html">Start Configuration</a></li> <li>Now you can <a href="https://www.webharvy.com/tour1.html">click and select the data</a> which you need to scrape</li> <li>The news title and URL can be selected by clicking directly on the text displayed on the page and using the corresponding option from the <a href="https://www.webharvy.com/tour1.html">Capture window</a>. </li> <li>WebHarvy allows data to be scraped from multiple pages of listings. <a href="https://www.webharvy.com/tour3.html">Various pagination techniques</a> employed by websites are handled by WebHarvy. 
</li> <li>The news article page can be opened by using either the <a href="https://www.webharvy.com/tour2.html">Follow this link</a> or <a href="https://www.webharvy.com/tour1.html#OpenPopup">Open Popup</a> option.</li> <li>Once all required data is selected, you can <a href="https://www.webharvy.com/tour4.html">stop configuration</a></li> <li>Click on the <a href="https://www.webharvy.com/tour5.html">Start Mine</a> button to start scraping data </li> <li>WebHarvy allows you to <a href="https://www.webharvy.com/tour6.html">save the scraped data to a file or database.</a> </li> </ol> <h2 class="wp-block-heading">Video: Scraping Google News Articles</h2> <figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper"> <iframe loading="lazy" title="Web Scrape Google News | Title, URL, Content | WebHarvvy" width="900" height="506" src="https://www.youtube.com/embed/e2IvsmSOjuU?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe> </div></figure> <h2 class="wp-block-heading">Try it yourself</h2> <p>You may download and try the free evaluation version of WebHarvy available on our website. <a href="https://www.webharvy.com/articles/getting-started.html">Follow this link</a> to get started. </p> <p>The post <a href="https://www.webharvy.com/blog/how-to-scrape-google-news-without-writing-any-code/">How to scrape Google News without writing any code?</a> appeared first on <a href="https://www.webharvy.com/blog">WebHarvy Blog</a>.</p> ]]></content:encoded> </item> </channel> </rss>