Tools for Working with Data

Web Scraping

Web scraping is a technique for extracting information from websites.

Scraperwiki

http://scraperwiki.com

Scraperwiki has tutorials on scraping webpages for data, written for
Python and Ruby.

Scraper: a Plug-in for Chrome

Scraper is a cool Chrome plug-in I've just discovered that makes scraping web pages easy. Just follow these steps:

  1. Highlight the part of a table that you want to scrape (at least one row).
  2. Right-click on the selection and select "scrape similar" from the pop-up menu. Some reasonable scraping defaults will appear.
  3. Press the "Export to Google Docs.." button to save the scraped data to a Google Docs spreadsheet.

Google Refine

Use it to clean up messy and inconsistent data.
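
To give a feel for what "messy and inconsistent" means in practice, here is a minimal Python sketch of one cleanup idea: grouping spellings that differ only in case, punctuation, spacing, or word order. Google Refine offers clustering of similar values, but this is a simplified illustration and not Refine's actual implementation. The function names `fingerprint` and `cluster` and the sample city names are made up for the example.

```python
import re
from collections import defaultdict


def fingerprint(value):
    """Reduce a string to a key that ignores case, punctuation, and word order."""
    lowered = value.strip().lower()
    no_punct = re.sub(r"[^\w\s]", "", lowered)   # drop punctuation
    words = sorted(set(no_punct.split()))        # unique words, sorted
    return " ".join(words)


def cluster(values):
    """Group values whose fingerprints match; keep only groups with several spellings."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return {key: sorted(variants)
            for key, variants in groups.items()
            if len(variants) > 1}


# Hypothetical sample data with inconsistent spellings.
messy = ["New York", "new york ", "York, New", "Boston", "BOSTON.", "Chicago"]
clusters = cluster(messy)
for key, variants in clusters.items():
    print(key, "->", variants)
```

Each printed group is a set of values that probably mean the same thing, so you can pick one spelling and replace the rest. Values with only one spelling, like "Chicago" here, are left out.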

Chrome Developer Tools

Use them to see the DOM underlying web pages.

If there is a table of data on a web page that you want to scrape,
select it with your mouse, right-click on the selection and choose
"Inspect Element" in the pop-up menu. This should work in Safari,
Chrome, or Firefox with the Firebug plug-in.
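
Once you have seen that the data sits in ordinary `<table>`, `<tr>`, and `<td>` elements, you can pull it out with a short script. The following is a minimal sketch using only Python's standard-library `html.parser`. The class name `TableScraper`, the helper `scrape_table`, and the sample HTML are invented for illustration, and the sketch assumes a simple table without nested tables.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table rows found in an HTML string as lists of strings."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample page fragment.
SAMPLE_HTML = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Springfield</td><td>30,000</td></tr>
  <tr><td>Shelbyville</td><td>25,000</td></tr>
</table>
"""

rows = scrape_table(SAMPLE_HTML)
for row in rows:
    print(row)
```

For a real page you would first download the HTML and pass it to `scrape_table`. Pages with nested tables or broken markup will likely need a more forgiving parser.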

Data Analysis
