http://code.google.com/p/boilerpipe/ looks nice. But I also needed the main image, and I needed it to work for Chinese sites, so I took goose
https://github.com/jiminoc/goose
and created my own simple Java library called snacktory:
https://github.com/karussell/snacktory
See snacktory in action on jetslide