Hi,

I have been reading white papers day and night trying to get my head around convolutional DBNs. I am new to this and am trying to learn how to represent the input.

I have simple musical data (pitch, timing) and unencoded hierarchical representations of it, generating roughly 1,000 dimensions per time slice. There is also other state data that I would like to use.
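For concreteness, this is roughly how I am flattening one time slice into a vector at the moment (a minimal sketch; the pitch range and number of timing bins are arbitrary choices of mine):

    import numpy as np

    # Hypothetical sizes: 128 MIDI pitches plus 32 quantized timing bins.
    N_PITCHES = 128
    N_TIMING_BINS = 32

    def encode_slice(pitches, timing_bin):
        """One-hot encode the sounding pitches and the timing bin of a slice."""
        v = np.zeros(N_PITCHES + N_TIMING_BINS, dtype=np.float32)
        for p in pitches:                  # MIDI note numbers active in this slice
            v[p] = 1.0
        v[N_PITCHES + timing_bin] = 1.0
        return v

    # Example: a C major triad landing on the first timing bin.
    x = encode_slice([60, 64, 67], 0)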

My understanding is that it is possible to train the initial RBM on the full set of data and then remove all inputs except the basic note data when building the DBN.
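To make that concrete, here is the pretraining step I have in mind, a minimal CD-1 sketch in numpy (the hidden size, learning rate, and the idea of slicing the weight matrix afterwards are all my own assumptions, not something I have seen confirmed):

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    # Minimal CD-1 sketch for a binary RBM. X has one row per time
    # slice, with the note features first and the extra state features
    # after them.
    def train_rbm(X, n_hidden=200, lr=0.01, epochs=10):
        n_visible = X.shape[1]
        W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        b_v = np.zeros(n_visible)
        b_h = np.zeros(n_hidden)
        for _ in range(epochs):
            for v0 in X:
                p_h0 = sigmoid(v0 @ W + b_h)               # positive phase
                h0 = (rng.random(n_hidden) < p_h0).astype(float)
                p_v1 = sigmoid(h0 @ W.T + b_v)             # reconstruction
                p_h1 = sigmoid(p_v1 @ W + b_h)             # negative phase
                W += lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
                b_v += lr * (v0 - p_v1)
                b_h += lr * (p_h0 - p_h1)
        return W, b_v, b_h

    # Afterwards I would keep only the note-input rows before stacking,
    # e.g. W_notes = W[:n_note_dims, :]  (n_note_dims is hypothetical).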

Is this the correct approach, or are there aspects of using hierarchical data like this that are problematic? Is there a preferred way to reduce the dimensionality, and is that even necessary?

Most of the audio-related uses of DBNs I have come across use spectral data, which seems like a natural candidate for input and output. I'm having trouble understanding how other kinds of input and output can be used.
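The spectral pipeline I keep seeing looks roughly like the following (a sketch; the frame and hop sizes are assumptions on my part, not from any particular paper):

    import numpy as np

    # Frame the signal, window it, and take log-magnitude FFT bins per
    # frame, yielding one feature row per time slice.
    def log_spectrogram(signal, frame=512, hop=256):
        window = np.hanning(frame)
        frames = [signal[i:i + frame] * window
                  for i in range(0, len(signal) - frame, hop)]
        mags = np.abs(np.fft.rfft(np.array(frames), axis=1))
        return np.log1p(mags)  # rows are usable directly as RBM input

What I can't picture is the analogous recipe for symbolic inputs like mine.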

asked Sep 26 '12 at 03:38 by Casey Basichis (edited Sep 26 '12 at 03:46)

One Answer:

These two threads might help:

http://metaoptimize.com/qa/questions/7865/convolutional-deep-belief-networks

http://metaoptimize.com/qa/questions/8561/deep-belief-network-for-audio-feature-extraction

answered Mar 14 '14 at 01:31 by Sharath Chandra
