
Let's say I have an unsigned long integer (32-bit).

I would like to transform this value into a standard normal (Gaussian) value. The transformation must be deterministic. Assuming I sample uniformly over unsigned long integers, the converted values should have zero mean and unit variance.

What is the correct way to do this?

asked Jun 17 '10 at 14:58

Joseph Turian ♦♦

edited Jun 17 '10 at 15:33


5 Answers:

What level of correctness are you looking for? Since you start from a discrete-valued number, you cannot get an exactly Gaussian (continuous) distribution. On the other hand, if you just want it to be correct to machine precision, then a uniform 32-bit int already has more precision than a 32-bit float chosen uniformly from [0,1], which is the basis for generating random Gaussian values with Box-Muller. (So divide by MAX_INT and use Box-Muller.)
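For reference, here is a minimal sketch of the standard Box-Muller transform applied to a pair of 32-bit integers. The function name `box_muller_pair` is made up for illustration. Dividing by 2^32 and shifting the first input by one is an assumption of this sketch (the answer above suggests dividing by MAX_INT); the shift keeps the first uniform strictly positive so the logarithm is always defined.

```python
import math

def box_muller_pair(n1: int, n2: int) -> tuple[float, float]:
    """Map two uniform 32-bit unsigned ints to two standard normal values."""
    for n in (n1, n2):
        if not 0 <= n < 2**32:
            raise ValueError("expected a 32-bit unsigned integer")
    # Assumption of this sketch: u1 lies in (0, 1] so that log(u1) is finite.
    u1 = (n1 + 1) / 2**32
    # u2 lies in [0, 1) and only sets the angle.
    u2 = n2 / 2**32
    radius = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return radius * math.cos(theta), radius * math.sin(theta)
```

The mapping is deterministic: the same pair of integers always yields the same pair of outputs.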

answered Jun 17 '10 at 15:18

Ian Goodfellow

edited Jun 17 '10 at 15:20

What I'm confused about with the Box-Muller transform is that it takes two uniform values in [0, 1) and transforms them into two normal random values.

However, I only have one uniform value. How do I apply Box-Muller over a single value?

(Jun 17 '10 at 15:31) Joseph Turian ♦♦

I guess dividing by MAX_INT only works if you are always going to generate them in pairs. If you want one Gaussian-distributed variable per int, you could use half the int as one input and the other half as the other input, and then keep only one of the two outputs. I.e., let x = the 16 most significant bits, divided by 2^16-1, and y = the 16 least significant bits, divided by 2^16-1; then z = sqrt(-2 ln x) * cos(2 pi y) will be approximately normally distributed. (x must be nonzero for ln x to be defined, so shift it away from 0, e.g. x = (bits + 1) / 2^16.)

(Jun 17 '10 at 15:40) Ian Goodfellow
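The split-bits idea from the comment above could be sketched as follows. The function name `gaussian_from_split_uint32` is made up for illustration, and the scaling (dividing by 2^16, with a +1 shift on the high half so the logarithm never sees zero) is an assumption of this sketch rather than the exact constants given in the comment. With only 16 bits per input, the result is a coarse approximation of a normal variate.

```python
import math

def gaussian_from_split_uint32(n: int) -> float:
    """Deterministically map one 32-bit unsigned int to an approximately normal value."""
    if not 0 <= n < 2**32:
        raise ValueError("expected a 32-bit unsigned integer")
    high = n >> 16        # 16 most significant bits
    low = n & 0xFFFF      # 16 least significant bits
    # Assumption of this sketch: x in (0, 1] avoids log(0); y in [0, 1) sets the angle.
    x = (high + 1) / 65536.0
    y = low / 65536.0
    # Keep only the cosine branch of Box-Muller, as suggested in the comment.
    return math.sqrt(-2.0 * math.log(x)) * math.cos(2.0 * math.pi * y)
```

Note that this mapping is not monotonic in the input integer: nearby integers can land far apart on the real line.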

P.S. If you want a topology-preserving (monotonic) mapping, then the straightforward thing is to map the integer to the open interval (-1, 1) and apply the inverse error function, scaled by sqrt(2) to get unit variance. The inverse error function is available in Python as scipy.special.erfinv.

(Jun 17 '10 at 16:05) Ian Goodfellow

Instead, I deterministically sample within [0, 1) using x1 to 32-bit precision. Here is Python code: http://github.com/turian/common/blob/master/gaussian.py

(Jun 17 '10 at 17:40) Joseph Turian ♦♦

If you have a uniform value in some open range, I guess you can use the inverse error function, if it is implemented in your environment, scaled by the appropriate constant.
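As a short worked note on the "appropriate constant": the standard normal CDF can be written in terms of the error function, and inverting that relation gives the scale factor. Here p denotes a uniform value in (0, 1) and u = 2p - 1 the same value rescaled to (-1, 1); both symbols are introduced only for this note.

```latex
\Phi(z) = \tfrac{1}{2}\left[1 + \operatorname{erf}\!\left(\tfrac{z}{\sqrt{2}}\right)\right]
\quad\Longrightarrow\quad
z = \Phi^{-1}(p) = \sqrt{2}\,\operatorname{erf}^{-1}(2p - 1) = \sqrt{2}\,\operatorname{erf}^{-1}(u),
\qquad p \in (0, 1).
```

So the constant is sqrt(2) when the uniform value has been mapped to (-1, 1).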

answered Jun 17 '10 at 16:04

Pascal Lamblin

Joseph, the standard--and quickest--way to generate standard normal variates is the Box-Muller transform, as Ian indicated; yet it requires two uniforms. If you have only one variable at a time, and your particular situation prevents any type of buffering, using the inverse CDF is, to my knowledge, the only solution.

Still, the inverse CDF is quite a bit slower. If you can tolerate slight imprecisions, I found the following "fast inverse normal CDF" code to be useful in the past: http://home.online.no/~pjacklam/notes/invnorm/

answered Jun 17 '10 at 16:46

Nicolas Chapados

I agree with Pascal that to convert a single uniformly sampled integer into a variable whose distribution is Gaussian, you can just map it to (0,1) and then apply the inverse of the cumulative distribution function of the Gaussian.
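A minimal sketch of this inverse-CDF approach, using only the Python standard library (`statistics.NormalDist.inv_cdf`, available since Python 3.8). The function name `gaussian_from_uint32` is made up for illustration, and mapping each integer to the midpoint of its bin, (n + 0.5) / 2^32, is an assumption of this sketch that keeps the value strictly inside (0, 1).

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def gaussian_from_uint32(n: int) -> float:
    """Deterministically map one 32-bit unsigned int to a standard normal value."""
    if not 0 <= n < 2**32:
        raise ValueError("expected a 32-bit unsigned integer")
    # Assumption of this sketch: use the bin midpoint so p is never exactly 0 or 1.
    p = (n + 0.5) / 2**32
    # Inverse of the standard normal CDF (the quantile function).
    return _STANDARD_NORMAL.inv_cdf(p)
```

This mapping is monotonic in n and symmetric around the middle of the integer range, so the outputs span roughly -6.3 to +6.3 for 32-bit inputs.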

answered Jun 17 '10 at 16:18

Yoshua Bengio

Two answers that you aren't going to like are:

a) generate another random number and use Box-Muller anyway (because generating an extra few bits of randomness is cheaper than the inverse method). Throw away the extra Gaussian sample or save it for a rainy day.

b) generate another 11 random numbers in the range 0 to 1 and add them to the first number (scaled to [0,1]) and subtract 6. This is the old, not particularly accurate method that was used in the dark ages. I bet (a) sounds better now since it only wasted one sample!
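Option (b) could be sketched as below. The sum of 12 independent uniforms on [0, 1) has mean 6 and variance 12 × 1/12 = 1, so subtracting 6 gives zero mean and unit variance, though the result can never fall outside [-6, 6]. The function name `gaussian_sum_of_twelve` and the bin-midpoint scaling are assumptions of this sketch.

```python
def gaussian_sum_of_twelve(uints: list[int]) -> float:
    """Approximate a standard normal value from twelve uniform 32-bit unsigned ints."""
    if len(uints) != 12:
        raise ValueError("expected exactly 12 integers")
    if any(not 0 <= n < 2**32 for n in uints):
        raise ValueError("expected 32-bit unsigned integers")
    # Scale each integer to (0, 1) via its bin midpoint, sum, and recentre.
    return sum((n + 0.5) / 2**32 for n in uints) - 6.0
```

The tails of this approximation are too light compared with a true Gaussian, which is why the answer calls it not particularly accurate.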

answered Jul 03 '10 at 03:16

Ted Dunning


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.