Saving Scraped Data

  • Once WebHarvy has finished scraping, you can export (save) the captured data to a file (Excel, XML, CSV, JSON or TSV) or to a database by clicking the 'Export' button. Clicking 'Export' offers two options: export to file or export to database.

  • Export to File

  • Click the Export button and select the 'Export as File' option to export the captured data as an Excel, XML, CSV, JSON or TSV file.

  • Export to Database

  • Click the Export button and select the 'Export to Database' option to export the captured data to an SQL database. WebHarvy currently supports Microsoft SQL Server, Oracle, MySQL and PostgreSQL.

    You must provide the SQL server address, login details, database name and table name before clicking the Export button. If a table with the given name does not exist in the database, it is created automatically. If the table already exists, the exported data is appended to it.

  • Append / Overwrite / Update

    Both file and database export support three modes for saving newly mined data. The choice is relevant only when you are saving to a file or database that already contains data (from a previous mining session).

    Append

    Newly mined data is added after the existing rows of data in the file or database.

    Overwrite

    Newly mined data is saved after deleting all existing rows of data in the file or database. Previously saved data is overwritten.

    Update

    Existing rows in the file or database that have the same first-column value as a row of newly mined data are updated. The remaining rows of newly mined data are appended. For file export, this option is currently available only for Excel files.
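
The difference between the three modes can be illustrated with a small Python sketch. This is not WebHarvy's code or API: the function name `save_rows` and the sample rows are hypothetical, and rows are modelled simply as lists of strings whose first element is the first column. The sketch also assumes that, in Update mode, only the first existing row with a matching first-column value is replaced; the behaviour for duplicate values is not described above.

```python
from typing import List

Row = List[str]


def save_rows(existing: List[Row], new: List[Row], mode: str) -> List[Row]:
    """Return the rows that would be stored after saving `new` over `existing`.

    Hypothetical helper for illustration only; it models the three save
    modes on in-memory rows and does not touch any file or database.
    """
    if mode == "append":
        # Keep everything already stored and add the new rows at the end.
        return [list(r) for r in existing] + [list(r) for r in new]

    if mode == "overwrite":
        # Discard everything already stored and keep only the new rows.
        return [list(r) for r in new]

    if mode == "update":
        result = [list(r) for r in existing]
        # Map each first-column value to the position of its first row.
        index_by_key = {}
        for i, row in enumerate(result):
            index_by_key.setdefault(row[0], i)
        for row in new:
            key = row[0]
            if key in index_by_key:
                # Same first-column value: replace the stored row.
                result[index_by_key[key]] = list(row)
            else:
                # No match: append the new row.
                index_by_key[key] = len(result)
                result.append(list(row))
        return result

    raise ValueError(f"unknown mode: {mode!r}")


# Sample data (made up): product code, name, price.
existing_rows = [["A1", "Widget", "9.99"], ["B2", "Gadget", "19.99"]]
new_rows = [["B2", "Gadget", "17.49"], ["C3", "Gizmo", "4.50"]]

for mode in ("append", "overwrite", "update"):
    print(mode, save_rows(existing_rows, new_rows, mode))
```

With this sample data, Append keeps both versions of the `B2` row, Overwrite keeps only the two new rows, and Update replaces the stored `B2` row and adds `C3` at the end.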