ScraperWiki is a hosted environment for writing automated processes that scan public websites and extract structured information from the pages they’ve published. It takes care of the boilerplate code you would normally have to write to crawl websites, gives you a simple online editor for your Ruby, Python, or PHP scripts, and automatically runs your crawler as a background process.
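
To make "extracting structured information" concrete, here is a minimal sketch of what such a scraper script might look like in Python. It deliberately uses only the standard library rather than ScraperWiki's own helper library, and everything in it (the `SAMPLE_HTML` page, the `TableParser` class, the `scrape` and `save` functions, and the `towns` table) is a hypothetical stand-in for illustration. A real scraper would download the page over HTTP instead of reading an embedded string.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would fetch this from a website.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Population</th></tr>
  <tr><td>Springfield</td><td>30000</td></tr>
  <tr><td>Shelbyville</td><td>25000</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []        # completed rows, each a list of cell strings
        self._row = None      # row currently being read, or None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            # Header rows contain only <th> cells, so they stay empty
            # and are skipped here.
            if self._row:
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row is not None:
            self._row[-1] += data.strip()


def scrape(html):
    """Turn the HTML table into a list of dictionaries."""
    parser = TableParser()
    parser.feed(html)
    return [
        {"name": row[0], "population": int(row[1])}
        for row in parser.rows
        if len(row) == 2
    ]


def save(records, conn):
    """Store the records, replacing any earlier row with the same name."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS towns "
        "(name TEXT PRIMARY KEY, population INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO towns VALUES (:name, :population)", records
    )
    conn.commit()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    save(scrape(SAMPLE_HTML), connection)
    for row in connection.execute("SELECT name, population FROM towns"):
        print(row)
```

Keying the table on `name` means the script can be rerun on a schedule without piling up duplicate rows, which is the kind of repeated background run a hosted scraper performs.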

What I really like, though, is that most of the scripts are published on the site. New users have plenty of existing examples to start from, and as websites change their structure, the community can update popular older scrapers.

