Are there any current research projects aimed at developing a reliable system that can learn objects automatically? I've seen plenty of products that require some human interaction during the online or offline process to get a good learner. I realize that the current state of object recognition can't be deemed reliable under real-world conditions. But surely the end goal is to have machines "learn" the objects around them without the hassles we currently deal with?

What are the factors that make this so hard to achieve, besides the fact that we still can't get recognition to work perfectly?

asked Dec 09 '11 at 20:35

mugetsu


2 Answers:

This work could be interesting for you: http://jmlr.csail.mit.edu/papers/volume12/ren11a/ren11a.pdf

Most object recognition systems work with human-annotated image datasets. This is the standard procedure in computer vision. If you are given just a collection of images, I think it is impossible to determine what an object is. If you have a controlled scene, like a small object lying on a table, this might be possible. In real-world images, there is no reliable way to tell when a border in an image corresponds to an object boundary.

You say you want machines that learn the objects around them. When you say that, you implicitly state that you don't want to use an image database but instead an autonomous robot that can interact with its environment.

I agree that this would make learning new objects much easier. But I don't know of any robot that researchers would dare to let "roam free" in any natural (indoor) environment.

It is certainly possible to build 3D maps of indoor scenes. This is definitely a step forward from just having a 2D image and makes segmentation much easier. Still, I think that without real interaction it is quite hard to separate touching objects. And picking up objects is not as easy as it sounds; afaik, picking up objects of arbitrary shape is still an area of active research in robotics.
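To make the point about touching objects concrete, here is a minimal sketch, not taken from any of the linked work. It assumes you already have a height map above a fitted table plane (the function name `segment_above_plane`, the `min_height` threshold, and the toy grids are all made up for illustration) and simply labels connected regions that rise above the table. Two objects that stand apart come out as two regions; the same two objects pushed together collapse into one.

```python
from collections import deque


def segment_above_plane(height_map, min_height=0.01):
    """Label 4-connected regions of cells that rise above the table plane.

    height_map: 2D list of heights (metres) above an already-fitted plane.
    Returns (labels, count): labels is 0 for table cells, 1..count otherwise.
    """
    rows, cols = len(height_map), len(height_map[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            # Skip table cells and cells that already belong to a region.
            if height_map[r][c] <= min_height or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            # Breadth-first flood fill over neighbouring raised cells.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and height_map[ny][nx] > min_height
                            and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Hypothetical toy scenes: a 0.1 m object and a 0.2 m object on a table.
separated = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.2, 0.0],
    [0.0, 0.1, 0.0, 0.2, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
touching = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.1, 0.1, 0.2, 0.0],
    [0.0, 0.1, 0.1, 0.2, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]

_, n_separated = segment_above_plane(separated)
_, n_touching = segment_above_plane(touching)
print(n_separated, n_touching)  # prints "2 1"
```

Geometry alone cannot tell the second scene apart from a single stepped object, which is why being able to push or lift things would help.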

And would you really have your $400,000 robot play in your kitchen on its own?

I guess this is the state of the art: http://www.youtube.com/user/WillowGaragevideo#p/u/1/8AiO530iePI

This answer is marked "community wiki".

answered Dec 10 '11 at 07:25

Andreas Mueller

My bot isn't close to capable of picking up and interacting with objects. So at the moment it's impossible for me to work with 3d maps.

So basically, the major issue holding me back is that I cannot accurately separate an object in an image from its background? But what if I use a saliency map and ignore everything except the roughly correct salient object? It won't necessarily get the boundaries right, but it can capture most of the object in many cases.
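A rough sketch of what I mean, assuming the saliency map is already computed by whatever method you like. The helper names `salient_bounding_box` and `crop`, the 0.5 threshold fraction, and the toy map are just placeholders for illustration:

```python
def salient_bounding_box(saliency, fraction=0.5):
    """Bounding box (top, left, bottom, right), inclusive, of all pixels
    whose saliency is at least `fraction` of the map's maximum.

    Returns None if the map has no positive saliency.
    """
    peak = max(max(row) for row in saliency)
    if peak <= 0:
        return None
    threshold = fraction * peak
    hits = [(r, c)
            for r, row in enumerate(saliency)
            for c, value in enumerate(row)
            if value >= threshold]
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows), max(cols)


def crop(image, box):
    """Cut the inclusive box out of a 2D list image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Hypothetical 4x4 saliency map with one bright blob in the middle.
saliency = [
    [0.0, 0.1, 0.1, 0.0],
    [0.1, 0.9, 0.8, 0.1],
    [0.0, 0.7, 1.0, 0.2],
    [0.0, 0.1, 0.3, 0.0],
]
image = [[10 * r + c for c in range(4)] for r in range(4)]

box = salient_bounding_box(saliency)
patch = crop(image, box)
print(box)    # prints (1, 1, 2, 2)
print(patch)  # prints [[11, 12], [21, 22]]
```

The box is only as good as the map: background that happens to be salient gets included, and dim parts of the object get cut off.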

(Dec 12 '11 at 13:37) mugetsu

I just went through the LSBP paper, and it seems to do essentially what my saliency map does, except that it cuts out unrelated background entirely. This is certainly very useful and would give me the correct borders of the object. The saliency map I have right now is decent enough that it captures most of the object and cuts out most of the unrelated parts of the image. But seeing as they have a nice set of code written, I'll definitely try it out. I also see that they used it to find similar images with some degree of success. This is essential when it comes to classifying objects, which is normally done by humans.

I found a paper that deals directly with this. I'm not really sure which of the two approaches would work best for sorting a collection of images and determining which of them belong to the same class.

http://www.cs.utexas.edu/~grauman/papers/grauman_darrell_cvpr2006.pdf

(Dec 12 '11 at 18:14) mugetsu

For the "cutting out" part: this is really, really hard and quite unsolved. The LSBP paper is very cool, but I'm pretty sure it won't scale, and it does not completely solve the vision problem ;)

Segmentation is not the only problem. In your second comment, you talk about similar images. This is basically a clustering task. If you already have the objects of interest, it is certainly possible to see which are "the same class" using clustering. An even better approach is to use topic models. This is also an active area of research. A paper I particularly like on this topic is: http://books.nips.cc/papers/files/nips23/NIPS2010_0146.pdf
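As a minimal sketch of the clustering step mentioned above: this is plain k-means, not the method of any of the linked papers, and it assumes each cut-out object has already been turned into a numeric feature vector. The function names, the farthest-point initialisation, and the toy 2D features are illustrative assumptions only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; returns (assignment, centres)."""
    # Deterministic farthest-point initialisation: start from the first
    # point, then repeatedly add the point farthest from the chosen centres.
    centres = [list(points[0])]
    while len(centres) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centres),
        )
        centres.append(list(farthest))

    assignment = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centre.
        assignment = [
            min(range(k), key=lambda i: squared_distance(p, centres[i]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:
                new_centres.append(
                    [sum(dim) / len(members) for dim in zip(*members)]
                )
            else:
                new_centres.append(centres[i])  # keep an empty cluster's centre
        if new_centres == centres:
            break
        centres = new_centres
    return assignment, centres


# Hypothetical 2D descriptors for five cut-out objects.
features = [
    [0.0, 0.1],
    [0.1, 0.0],
    [0.9, 1.0],
    [1.0, 0.9],
    [0.05, 0.05],
]
assignment, centres = kmeans(features, k=2)
print(assignment)  # prints [0, 0, 1, 1, 0]
```

Note that k has to be chosen by hand here, which is one of the things the topic-model approaches try to handle better.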

You are probably also interested in this one, which is quite complex but might be the best shot at "finding object categories" yet: http://books.nips.cc/papers/files/nips24/NIPS2011_1163.pdf

(Dec 15 '11 at 09:35) Andreas Mueller

This might be relevant: http://www.youtube.com/watch?v=1GhNXHCQGsM

answered Dec 15 '11 at 16:23

Aurelien Lucchi


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.