There are some vast image datasets, such as ImageNet, that are labeled with object names. Are there similar datasets illustrating verbs (e.g. "running", "standing", "sitting") and adjectives (e.g. "big", "crooked", "dark", "round")? I know of a couple of "action" video datasets, but they cover only a very small set of categories.