A project I'm involved with is building software for machine learning and data analysis on temporal data. The typical data would be a set of multimedia time series (e.g. data from fifty meetings, each 10-30 minutes long). We're interested both in typical time series prediction tasks (i.e. predicting the future from the past within a single time series) and in annotation tasks (e.g. given twenty episodes where we have segmented the data into 5-second intervals and manually classified each interval, automatically classify the 5-second intervals for the other thirty episodes).

I'm interested in hearing about good interfaces or notations for setting up these problems, and software which exhibits those interfaces or notations. There's quite a bit one has to specify, particularly with respect to the windows of data from which features are extracted (which do not have to be the same as the windows that are classified).
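
To make the window distinction concrete, here is a minimal sketch of what such a task specification might look like. It is only an illustration under stated assumptions: the `AnnotationTask` class, its field names, and the 10-second context value are all hypothetical and not taken from any existing package.

```python
from dataclasses import dataclass
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


@dataclass(frozen=True)
class AnnotationTask:
    """Hypothetical task description; every field name here is illustrative."""
    label_len_s: float       # length of each interval to be classified
    context_before_s: float  # how far the feature window reaches back before the interval
    context_after_s: float   # how far the feature window reaches past the interval

    def label_intervals(self, episode_len_s: float) -> List[Interval]:
        """Tile the episode with non-overlapping intervals; a trailing partial interval is dropped."""
        n = int(episode_len_s // self.label_len_s)
        return [(i * self.label_len_s, (i + 1) * self.label_len_s) for i in range(n)]

    def feature_window(self, interval: Interval, episode_len_s: float) -> Interval:
        """Window that features are extracted from, clipped to the episode boundaries."""
        start, end = interval
        return (max(0.0, start - self.context_before_s),
                min(episode_len_s, end + self.context_after_s))


# Classify 5-second intervals using features from the interval plus the preceding 10 seconds.
task = AnnotationTask(label_len_s=5.0, context_before_s=10.0, context_after_s=0.0)
episode_len_s = 600.0  # a 10-minute meeting
intervals = task.label_intervals(episode_len_s)
windows = [task.feature_window(iv, episode_len_s) for iv in intervals]
```

In this sketch, setting `context_after_s` to zero keeps the feature window from looking past the end of the interval being classified, which is the kind of choice I'd like the interface to make explicit.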

asked Aug 26 '11 at 15:38

Dave Lewis

Hi Dave, this sounds like an interesting problem, but it would help to have more information. How many classes do the intervals fall into? Does each interval fall into only one class? How long are the sequences of intervals? Are there structures like loops or the equivalent of parenthesis balancing that you need to detect?

(Aug 26 '11 at 17:59) Daniel Mahler

Daniel - We're still very much working out what sort of tasks we want to support. But I'd say prototypical tasks would have intervals classified with respect to contrast sets containing 2 to 15 classes, multiple contrast sets being relevant to the same intervals, typical data sequences being 10 minutes to one day, and typical segment lengths varying from a few seconds to several hours. Loops are definitely not in the picture, but overlapping segments might be. But we're very much interested in being guided by what software out there supports.
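
As a rough illustration of that description, the annotations might be represented along the following lines. This is a sketch under assumptions, not a settled design: the contrast set names, their classes, and the `Segment` type are all invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical contrast sets; the names and classes are invented for illustration.
CONTRAST_SETS: Dict[str, List[str]] = {
    "activity": ["presentation", "discussion", "break"],
    "speaker_count": ["none", "one", "several"],
}


@dataclass
class Segment:
    start_s: float
    end_s: float
    labels: Dict[str, str]  # contrast set name -> class chosen from that set

    def __post_init__(self) -> None:
        if self.end_s <= self.start_s:
            raise ValueError("segment must have positive length")
        for set_name, cls in self.labels.items():
            if cls not in CONTRAST_SETS.get(set_name, []):
                raise ValueError(f"{cls!r} is not a class of contrast set {set_name!r}")


def overlaps(a: Segment, b: Segment) -> bool:
    """True if the two segments share any stretch of time."""
    return a.start_s < b.end_s and b.start_s < a.end_s


# An hour-long segment and a 30-second segment inside it, labelled against
# different combinations of contrast sets.
long_seg = Segment(0.0, 3600.0, {"activity": "presentation"})
short_seg = Segment(1800.0, 1830.0, {"activity": "discussion", "speaker_count": "several"})
```

Here one segment can carry a label from several contrast sets at once, and nothing prevents segments from overlapping.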

(Aug 26 '11 at 18:19) Dave Lewis

Learning finite state automata always seems like a reasonable bet for learning sequences. What does your training data look like? Is it just a set of sequences with some elements labelled and some not, or are there additional features for each sequence or interval?

(Aug 26 '11 at 18:36) Daniel Mahler

I'm really trying to avoid leaping ahead to particular learning algorithms, or even data representations. What I'm interested in here is how well-designed learning software allows defining the task.

(Aug 26 '11 at 18:47) Dave Lewis

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.