Deep learning architectures such as stacked Restricted Boltzmann Machines or stached autoencoders leverage an unsupervise pre-training phase that can be understood as data-driven feature extraction. If you further know that your samples are 2D pictures, blending the afore mentioned models convolutional layers might further reduce the number of parameters to fit and bring some shift invariance.