You might want to look at PGPE (policy gradients with parameter-based exploration). The cited paper presents the main idea in the context of reinforcement learning (policy gradients in particular), but PGPE is by no means limited to that field. For example, an ICANN paper this year shows how it can be used to break a cryptographic system, much faster than standard evolutionary search. You might also give stochastic search a try, although once you have many more than 50 parameters you may find that PGPE is faster.
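
To give a feel for how PGPE works as a general black-box optimizer, here is a minimal sketch of the symmetric-sampling variant in Python/NumPy. It is an illustration under assumptions, not the paper's reference implementation: the function name `pgpe`, the hyperparameter values, and the choice to normalize rewards by their per-batch standard deviation are all mine, and the toy quadratic fitness is hypothetical.

```python
import numpy as np

def pgpe(fitness, dim, iterations=300, pairs=20,
         lr_mu=0.2, lr_sigma=0.1, sigma_init=1.0, seed=0):
    """Maximize `fitness` with a symmetric-sampling PGPE sketch.

    Keeps an independent Gaussian N(mu_i, sigma_i^2) per parameter and
    adapts mu and sigma from the rewards of mirrored perturbations.
    All default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(dim)
    sigma = np.full(dim, sigma_init)

    for _ in range(iterations):
        # One perturbation per pair; each is evaluated as mu + eps and mu - eps.
        eps = rng.normal(0.0, sigma, size=(pairs, dim))
        r_plus = np.array([fitness(mu + e) for e in eps])
        r_minus = np.array([fitness(mu - e) for e in eps])

        # Assumption: use the batch mean as baseline and the batch standard
        # deviation as reward scale (a simplification for this sketch).
        pair_mean = (r_plus + r_minus) / 2.0
        baseline = pair_mean.mean()
        scale = np.std(np.concatenate([r_plus, r_minus])) + 1e-8

        # mu follows the reward difference within each mirrored pair.
        grad_mu = ((r_plus - r_minus) / (2.0 * scale)) @ eps / pairs
        # sigma follows the pair's mean reward relative to the baseline.
        grad_sigma = ((pair_mean - baseline) / scale) @ (
            (eps ** 2 - sigma ** 2) / sigma) / pairs

        mu = mu + lr_mu * grad_mu
        # Keep sigma strictly positive so sampling stays well defined.
        sigma = np.maximum(sigma + lr_sigma * grad_sigma, 1e-3)

    return mu, sigma


# Toy usage: recover a hypothetical target vector by maximizing the
# negative squared distance to it.
target = np.array([1.0, -2.0, 0.5, 3.0, -1.0])
mu, sigma = pgpe(lambda theta: -np.sum((theta - target) ** 2), dim=5)
print(mu)
```

Note that nothing here depends on reinforcement learning: `fitness` can be any function that scores a parameter vector, which is why the method carries over to other search problems.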