Hi,

Here is our setting: we have a stream of non-stationary data that is locally unbounded in the captured data set. We do, however, know some global bounds: the data cannot exceed a certain value, but that value is very large and occurs rarely.

We are going to apply online processing, so I was wondering what the best practices are when it comes to normalization, scaling, whitening, etc.

Since most of these tools require that you have the full set of data, I imagine that operating over small buffers should be the norm, but I'm not so sure.

Also, if this is not possible, it would mean that the learning algorithms would have to be used on unnormalized data, and that is a bit dicey even for simple regressions.

Any suggestions would be great.

asked Jul 14 '13 at 22:10

Leon Palafox ♦

You can always take a small sample of the data, compute the mean and covariance in that sample, and whiten all future data with respect to that.

(Jul 15 '13 at 13:31) Alexandre Passos ♦

Since your data is non-stationary, do this every once in a while.

(Jul 15 '13 at 13:31) Alexandre Passos ♦

I think you have to clarify what you mean by non-stationary, and then everything else will fall into place (blocks, exponentials, etc.). Unless you are clear about what regularities exist, how can you learn?

(Jul 17 '13 at 11:08) SeanV

What do you mean by that? Non-stationarity has a very clear definition, and of course we are doing feature extraction to detect stationary features.

My question is not really related to that.

(Jul 17 '13 at 19:51) Leon Palafox ♦

If you don't suffer from non-stationarity, why bother? Take a sample, calculate the expected values, and they will stay constant.

(Jul 22 '13 at 09:53) edersantana

One Answer:

Instead of using blocks, I would compute an exponentially decayed mean with the recursive formula, and similarly for the variance.
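To make that concrete, here is a minimal sketch of one common form of the recursion for a scalar stream. The class name `EWStandardizer`, the decay parameter `alpha`, and the small `eps` guard are my own choices for illustration, not something from a particular library. A larger `alpha` forgets old data faster, which is what you want if the statistics drift quickly.

```python
import math


class EWStandardizer:
    """Exponentially weighted running mean and variance for a scalar stream.

    Hypothetical helper for illustration; alpha and eps are assumed defaults.
    """

    def __init__(self, alpha=0.01, eps=1e-8):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha  # weight given to the newest sample
        self.eps = eps      # guards against division by zero
        self.mean = None    # unset until the first sample arrives
        self.var = 0.0

    def update(self, x):
        """Fold one new sample into the running statistics."""
        if self.mean is None:
            # The first sample initializes the mean; variance stays at 0.
            self.mean = float(x)
            return
        delta = x - self.mean
        self.mean += self.alpha * delta
        # Recursive update of the exponentially weighted variance.
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * delta * delta)

    def transform(self, x):
        """Standardize x with the current running statistics."""
        if self.mean is None:
            return 0.0
        return (x - self.mean) / math.sqrt(self.var + self.eps)
```

In use you would call `update(x)` on every incoming sample and feed `transform(x)` to the learner. Whether you update before or after transforming is a design choice; updating after avoids letting a sample influence its own normalization. For vector data the same recursion applies per feature, or to the full covariance if you need whitening.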

There are also online sequential estimators for quantiles. I think the original author's name is "Tierney"; search for that.
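As a rough illustration of the idea, below is a simplified stochastic-approximation quantile tracker. To be clear, this is not Tierney's procedure itself, just a bare-bones sketch of the same family with a constant step size so that it can follow drift. The name `SAQuantile` and the `step` parameter are assumptions of mine, and `step` has to be chosen on the scale of your data.

```python
class SAQuantile:
    """Track the p-th quantile of a stream with a constant-step update.

    Hypothetical helper for illustration; not a published algorithm verbatim.
    """

    def __init__(self, p, step=0.01):
        if not 0.0 < p < 1.0:
            raise ValueError("p must be in (0, 1)")
        self.p = p
        self.step = step  # constant step so the estimate can follow drift
        self.q = None     # current quantile estimate

    def update(self, x):
        """Fold one new sample into the estimate and return it."""
        if self.q is None:
            # Start from the first observation.
            self.q = float(x)
        else:
            # Move up by step * p when x is above the estimate,
            # down by step * (1 - p) when x is at or below it.
            indicator = 1.0 if x <= self.q else 0.0
            self.q += self.step * (self.p - indicator)
        return self.q
```

Running one tracker for the median and two more for the 25th and 75th percentiles gives a robust center and scale, which may suit your case better than mean and variance given the rare, very large values you describe.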

answered Jul 17 '13 at 03:54

Matt

User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.