Don't hesitate to ask questions and make remarks on this wiki page.

Discrete-rejection algorithm.


Tower sampling (Python program and output)


import random
import pylab, bisect
 
def cumulative(l):
    acc,  L = 0, [0]
    for x in l:
        acc += x
        L.append(acc)
    return L
 
class rand_tower:
    L = []
    def __init__(self,l):
        self.L = cumulative(l)
    def call(self):
        u = random.uniform(0.0, self.L[-1])
        return bisect.bisect_right(self.L, u) - 1
 
def test(n, r):
    l = [ 1.0/float(i + 1.)**2 for i in range(n)]
    g = rand_tower(l)
    samples = [g.call() for i in range(r)]
    pylab.hist(samples, bins=n, range=(-0.5,n - 0.5), normed=True)
    y_vec = [l[k]/sum(l)  for k in range(n)]
    pylab.plot(range(n),y_vec,'ro')
    pylab.title('Tower sampling')
    pylab.savefig('Tower_sampling.png')
    pylab.show()
 
test(10, 1000000)
 
here is output for the above Python program..
Tower_sampling.png
Comparison of Tower sampling (blue) with the theoretical result (red dots) for a discrete distribution with ten values..

Walker's method of alias

Walker's method [1] is an ingenious algorithm for sampling an arbitrary discrete distribution of N elements in constant time per sample (after an initialization of complexity O(N) ).

Walker's method (Naive Python code)

from random import uniform, seed,randint
import pylab
 
N = 5
x = [[1.0 / (k + 1), k] for k in range(N)]
x_mean = sum(y[0] for y in x) / N
for k in range(N): x[k][0] /= x_mean
for k in range(N):
     print k, x[k][0]  /N
x_plus = []
x_minus = []
for z in x:
    if z[0]> 1.0: x_plus.append(z)
    else: x_minus.append(z)
table = []
for k in range(N - 1):
    e_plus = x_plus.pop()
    e_minus= x_minus.pop()
    table.append((e_minus[0], e_minus[1], e_plus[1]))
    e_plus[0] = e_plus[0] - (1.0 - e_minus[0])
    if e_plus[0] < 1.0: x_minus.append(e_plus)
    else: x_plus.append(e_plus)
if len(x_plus) > 0 : table.append((x_plus[0][0], x_plus[0][1], x_plus[0][1]))
if len(x_minus) > 0: table.append((x_minus[0][0], x_minus[0][1], x_minus[0][1]))
print table
 
samples = []
for k in range(1000000):
    x = uniform(0.0, 1.0)
    i = randint(0, N-1)
    if x < table[i][0]: samples.append(table[i][1])
    else: samples.append(table[i][2])
pylab.figure()
pylab.hist(samples, bins=N, range=(-0.5, N-0.5), normed=True)
pylab.title("Walker's method N= "+str(N))
pylab.show()

From discrete to continuous


tower.png
Continuum limit of the tower sampling


One-dimensional integrals

We now consider two other examples for the one-dimensional sample transformation. In the first example, we produce samples x with 0<x<1, distributed according to

We write the integrals as

It follows that

is distributed as



Errors and Central limit theorem


Consider a direct sampling simulation, the outcome is usually a sequence of values

which are different realizations of a random variable ξ , of distribution π(ξ). We assume that these values are independent and identically distributed (i.i.d.).

Using these N realisations, the best estimation of the mean

of the distribution π(ξ) of ξ is

Can we determine the statistical error associated with this estimation?

The mean XN being the average of N i.i.d. random variables, the central limit theorem should apply and the result writes:


Test this prediction with the following example: Sample N random numbers

taken from a uniform distribution on [0,1] and give an estimation of the average with its associated error.

Stable distribution


Take Ntime=10,100,1000... Study the distribution of the variable

as N becomes large. Show that after a proper rescaling of this variable, the distribution converge (for large N) to a unique curve. What is this limit distribution?
 import pylab, math, numpy, random
 Nsample = 20000
 Ntime = 100
 random.seed(2)
 alpha = 3
 xmean = alpha/(alpha-1.)
 Nfactor = float(Ntime)**0.5   #(1. - 1./alpha)
 xsum = [0. for n in range(Nsample)]
 for isample in range(Nsample):
     xi = [0. for n in range(Ntime)]
     for itime in range(Ntime):
         xi[itime] = random.random()**(-1./alpha)
     xsum[isample]=(sum(xi)/float(Ntime) - xmean) * Nfactor
 pylab.hist(xsum, bins=30,range=(-10,10),normed=True)
 pylab.title('Example 3, Rescaled Distributions, alpha = 3')
 pylab.xlabel('distribution')
 pylab.ylabel('x')
 pylab.show()
[Print this page]
  1. ^ A.J. Walker, New fast method for generatlng discrete random numbers with arbitrary frequency distributions, Electronics Letters, vol. 10, pp. 127-128, 1974,
    A.J. Walker, “An efflclent method for generatlng dlscrete random varlables wlth general dlstrlbutlons,” ACM Transactions on Mathematical Software, vol. 3, pp. 253-256, 1977.