Hi all. Long-time reader, first-time poster. I appreciate your help!

Suppose I have two features, X and Y, for a clustering or unsupervised classification algorithm. X varies more quickly than Y and is reported at a rate of 10 Hz, while Y is reported at a rate of 1 Hz.

Traditionally, I've used classifiers where the features are in the form of tuples, as in: <timestamp_1, x_1, y_1>, <timestamp_2, x_2, y_2> ... <timestamp_n, x_n, y_n>. This is easy to feed into the algorithms, with each tuple treated as a single 2D data point.

But in the case where they are at different rates, I only see two options:

(A) Upsampling Y to match the rate of X

(B) Downsampling X to match the rate of Y

In the case of (A), there will be many repeated values of Y, which can cause issues with some techniques (biases, singularities). In the case of (B), we throw away a lot of potentially useful data in X.
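To make the two options concrete, here is a minimal sketch in plain Python. The data and the helper names `upsample_hold` and `downsample_mean` are made up for illustration, and it assumes the two streams are aligned so that exactly 10 samples of X fall within each sample of Y. Option (A) is shown as a simple repeat (zero-order hold) and option (B) as a block average; other interpolation or aggregation choices are equally possible.

```python
from statistics import mean

def upsample_hold(slow, factor):
    """Option (A): repeat each slow sample `factor` times (zero-order hold)."""
    return [v for v in slow for _ in range(factor)]

def downsample_mean(fast, factor):
    """Option (B): replace each full block of `factor` fast samples with its mean."""
    return [mean(fast[i:i + factor])
            for i in range(0, len(fast) - factor + 1, factor)]

# Hypothetical data: 2 seconds of X at 10 Hz and Y at 1 Hz.
x = [float(i) for i in range(20)]
y = [100.0, 200.0]
factor = 10  # ratio of the two rates (10 Hz / 1 Hz), assumed to be an integer

rows_a = list(zip(x, upsample_hold(y, factor)))    # 20 rows, each Y value repeated 10 times
rows_b = list(zip(downsample_mean(x, factor), y))  # 2 rows, X reduced to one mean per second
```

With (A) the feature matrix has 20 rows but only 2 distinct Y values; with (B) it has 2 rows and the within-second variation of X is gone, which is exactly the trade-off described above.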

How is this traditionally managed?

asked Jul 10 '13 at 23:18

Randall Keeler

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.