CorpusCatcher is a corpus collection toolset. It helps you build language- or topic-specific corpora from publicly available web resources. Such corpora are useful for many purposes, particularly as source data for building spell checkers.
If you are interested in CorpusCatcher or are working on spell checkers, you might also want to look at Spelt.
See the Advanced Topics section in the README for some notes on the code.