Simulating Normal Random Variables

Two simple ways to simulate a pair of independent standard normal distributions.

– Box-Muller Transform
– Marsaglia-Bray Transform

Box-Muller

The Box–Muller transform is a pseudo-random number sampling method for generating pairs of independent, standard, normally distributed (zero expectation, unit variance) random numbers, given a source of uniformly distributed random numbers. The method was first mentioned explicitly by Raymond E. A. C. Paley and Norbert Wiener in 1934. However, it is attributed to British statistician George E.P. Box and American computer scientist Mervin E. Muller who formalised it in their paper “A Note on the Generation of Random Normal Deviates” (1958) published in The Annals of Mathematical Statistics.

The Box–Muller transform was developed as a more computationally efficient alternative to the inverse transform sampling method. It takes two samples from the uniform distribution on the interval $[0, 1]$ and maps them to two standard, normally distributed samples. It uses the fact that the 2-dimensional distribution of two independent zero-mean Gaussian random numbers is radially symmetric if both component Gaussians have the same variance.

Method

Generate  $U_1, U_2$ independent random variables distributed uniformly on $[0,1]$ and define $R = \sqrt{-2\cdot\ln U_1}$ and $\theta = 2 \pi U_1$. Then

$$Z_1 = R*cos(\theta),$$

$$Z_2 = R*sin(\theta)$$

are independent random variables with standard normal distribution.

Visualisations

Box-Muller Transform Visualisation with sample size n=1000. The image on the top shows the uniform variables U1, U2. The image on the bottom shows the transformed variables Z1, Z2 which are normally distributed.

Let’s breakdown the transformation in steps and take a look at the transformations.

  • The first step consists of transforming $U_1$ into $R = \sqrt{-2\cdot\ln U_1}$.
  • Note that we can write $R^2= Z_1^2 + Z_2^2$. That is, $R^2$ is the sum of two independent squared normal variables. Thus $R^2$ follows a Chi-Square distribution with 2 degrees of freedom which in turn coincides with an exponential distribution of mean equal 2. We can see this in the third picture below. Note that this give us a way to generate a sample from an exponential distribution.
  • The second step is to map $U_2$ into $X = cos(2\pi U_2)$ and $Y = sin(2\pi U_2)$. These transformations together form the pairs $(X, Y) =(cos(2\pi U_2), sin(2\pi U_2))$ which indeed define the circle in the fourth picture.

  • Finally, multiplying the pairs $(X, Y)$ by $R$ we obtain $(Z1, Z2)$. Visualisations allow us to see that at the end $U1$ translates into the distance of each $(Z1, Z2)$ to the origin; while $U2$ defines the slope of the line that joins $(Z1, Z2)$ with the origin.

Python Implementation

from numpy import random, sqrt, log, sin, cos, pi

def box_muller_sample(n=1):
    
    u1 = random.rand(n)
    u2 = random.rand(n)
    
    r = sqrt(-2*log(u1))
    x = cos(2*pi*u2)
    y = sin(2*pi*u2)
    z1 = r*x
    z2 = r*y
    
    return z1, z2

Marsaglia-Bray Method

As we have seen the Box-Muller transform requires logarithm, square root, sine and cosine functions. The latter can be expensive to compute. American mathematician and computer scientist George Marsaglia developed a modification of the Box-Muller transform that avoids the usage of trigonometric functions. It was proposed in his paper “A Convenient Method for Generating Normal Variables” (1964) published by SIAM Review.

Marsaglia-Bray method is sometimes referred as Polar Method/Version of Box- Muller.

Method

Generate $W_1, W_2$ independent random variables distributed uniformly on $[-1,1]$ until $S = W1^2 + W2^2 < 1$. Define

$T = \sqrt{\frac{-2\log(S)}{S} }$, and

$Z_1 = T*W_1$, and $Z_2 = T*W_2$

Then, $Z_1, Z_2$ are independent random variables with standard normal distribution.

 

Visualisations

Marsaglia Bray Transform Visualisation with sample size n=1000. The image on the top shows the uniform variables W1, W2. The image on the bottom shows the transformed variables Z1, Z2 which are normally distributed.

Just as we did for the Box-Muller transform, let’s breakdown this transformations in steps and take a look at the visualisations.

  • We start with points $(W1, W2)$ within the square of vertices $(-1, 1), (-1, -1), (1, -1)$ and $(1,1)$. Then, we calculate their square distance to the origin, i.e. $S = W1^2 + W2^2$ and keep only those points within the unit disk.
  • Next we calculate $T = \sqrt{\frac{-2\ln( W1^2 + W2^2)}{ W1^2 + W2^2} }$ in order to map $S$ into $\mathbb{R}^{+}$. The following graph shows the mappings $W1\mapsto S$ and $W2\mapsto T$.
  • Finally, the pairs $(Z1,Z2)$ are obtained simply by multiplying $T$ by $(W1, W2)$. The following plots illustrate the mappings $W1\mapsto(Z1, Z2)$ and $W2\mapsto(Z1, Z2)$.

Python Implementation

import numpy as np
import math

def marsaglia():
  
    while True:
        
        w1 = (np.random.rand() * 2) - 1
        w2 = (np.random.rand() * 2) - 1
        s = w1 * w1 + w2 * w2
        
        if s < 1:
            t = math.sqrt((-2) * math.log(s)/s)
            z1 = w1 * t
            z2 = w2 * t
            
            return z1, z2

def marsaglia_sample(n=1):
    
    z1 = np.empty(n)
    z2 = np.empty(n)
    
    for i in range(n):
        z1[i], z2[i] = marsaglia()
        
    return z1, z2

Further Reading

The original papers and some extra reading materials

Quant Girl

I write about Mathematics, Statistics, Finance, Programming, and the relationships among them. I enjoy learning, listening to music and podcasts, practicing yoga, reading magazines, and watching documentaries.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.