I actually wrote an article on using Python to scrape a sports (hockey) statistics site and do some simple analysis. Check it out at: http://www.arandomforest.com/?cat=4
I used lxml with XPath queries. This works well for structured sites like sports statistics, since the data usually comes in tables. One catch if you go down this path: many older sports statistics sites also use tables for page layout, so the layout tables can be intertwined with the data tables and your queries need to tell them apart.
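To give a rough idea of what that looks like, here is a minimal sketch, not the code from my article. The page markup, the player names, and the extract_stats helper are all made up for illustration. It assumes the data table is the one that has header cells and no table nested inside it, which is a heuristic that will not hold on every site.

```python
from lxml import html

# Hypothetical page: an outer layout table wrapping a nested stats table.
PAGE = """
<html><body>
<table id="layout">
  <tr>
    <td>Site navigation</td>
    <td>
      <table class="stats">
        <tr><th>Player</th><th>Goals</th><th>Assists</th></tr>
        <tr><td>A. Example</td><td>30</td><td>41</td></tr>
        <tr><td>B. Sample</td><td>22</td><td>35</td></tr>
      </table>
    </td>
  </tr>
</table>
</body></html>
"""


def extract_stats(page_source):
    """Return the rows of the first data table as a list of dicts."""
    tree = html.fromstring(page_source)

    # Heuristic (an assumption): a data table contains <th> cells and
    # has no other table nested inside it, so layout tables are skipped.
    tables = tree.xpath("//table[.//th and not(.//table)]")
    if not tables:
        return []
    table = tables[0]

    # Column names come from the header row.
    header = [th.text_content().strip() for th in table.xpath(".//tr[th]/th")]

    rows = []
    for tr in table.xpath(".//tr[td]"):
        cells = [td.text_content().strip() for td in tr.xpath("./td")]
        rows.append(dict(zip(header, cells)))
    return rows


if __name__ == "__main__":
    for row in extract_stats(PAGE):
        print(row)
```

On a real site you would fetch the page first and probably need a more specific XPath (an id or class on the stats table, if there is one) instead of the heuristic above.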
If you're interested in financial time series, the Yahoo Finance API is your best bet for raw data. There is also a great R package, quantmod, that will automatically do charting and technical indicators, and can grab data from Yahoo or Google Finance.
If you have a general interest in web scraping, I can recommend Webbots, Spiders, and Screen Scrapers by Schrenk. It offers some interesting ideas and shows how they can be implemented in PHP.
If you're looking for software libraries, there are Scrapy (as Andrew mentioned), lxml, and Beautiful Soup. From what I remember, Beautiful Soup development had stopped, but it seems to have been restarted. Scrapy is a framework for scraping, so if you're planning on doing a lot of scraping or data integration, it will probably have more useful tools. If you're simply scraping a set of pages, I find lxml simpler for prototyping.