# Discrete and Continuous Distributions, Errors and fluctuations

###### Table of Contents

- Discrete-rejection algorithm
  - Tower sampling (Python program and output)
- Walker's method of alias
  - Walker's method (Naive Python code)
- From discrete to continuous
- One-dimensional integrals
- Errors and Central limit theorem
- Stable distribution
Don't hesitate to ask questions and make remarks on this wiki page.

## Discrete-rejection algorithm

### Tower sampling (Python program and output)

import random
import pylab, bisect

def cumulative(l):
    # running cumulative sums of l, starting from 0 (the "tower")
    acc, L = 0.0, [0.0]
    for x in l:
        acc += x
        L.append(acc)
    return L

class rand_tower:
    def __init__(self, l):
        self.L = cumulative(l)
    def call(self):
        # draw a uniform number on the tower, then locate its level by bisection
        u = random.uniform(0.0, self.L[-1])
        return bisect.bisect_right(self.L, u) - 1

def test(n, r):
    l = [1.0 / (i + 1.0) ** 2 for i in range(n)]
    g = rand_tower(l)
    samples = [g.call() for i in range(r)]
    pylab.hist(samples, bins=n, range=(-0.5, n - 0.5), density=True)
    y_vec = [l[k] / sum(l) for k in range(n)]
    pylab.plot(range(n), y_vec, 'ro')
    pylab.title('Tower sampling')
    pylab.savefig('Tower_sampling.png')
    pylab.show()

test(10, 1000000)

Output of the above Python program:

Figure: comparison of tower sampling (blue) with the theoretical result (red dots) for a discrete distribution with ten values.

## Walker's method of alias

Walker's method [1] is an ingenious algorithm for sampling an arbitrary discrete distribution of N elements in constant time per sample (after an initialization of complexity O(N)).

### Walker's method (Naive Python code)

from random import uniform, randint
import pylab

N = 5
# pairs [weight, index]; the weights are rescaled so that their mean is 1
x = [[1.0 / (k + 1), k] for k in range(N)]
x_mean = sum(y[0] for y in x) / N
for k in range(N):
    x[k][0] /= x_mean
for k in range(N):
    print(k, x[k][0] / N)
# split the entries into those above and below the mean weight
x_plus = []
x_minus = []
for z in x:
    if z[0] > 1.0:
        x_plus.append(z)
    else:
        x_minus.append(z)
# build the alias table: each cell holds (threshold, own index, alias index)
table = []
for k in range(N - 1):
    e_plus = x_plus.pop()
    e_minus = x_minus.pop()
    table.append((e_minus[0], e_minus[1], e_plus[1]))
    e_plus[0] = e_plus[0] - (1.0 - e_minus[0])
    if e_plus[0] < 1.0:
        x_minus.append(e_plus)
    else:
        x_plus.append(e_plus)
if len(x_plus) > 0:
    table.append((x_plus[0][0], x_plus[0][1], x_plus[0][1]))
if len(x_minus) > 0:
    table.append((x_minus[0][0], x_minus[0][1], x_minus[0][1]))
print(table)

# sampling: pick a cell uniformly, then return its own index or its alias
samples = []
for k in range(1000000):
    x = uniform(0.0, 1.0)
    i = randint(0, N - 1)
    if x < table[i][0]:
        samples.append(table[i][1])
    else:
        samples.append(table[i][2])
pylab.figure()
pylab.hist(samples, bins=N, range=(-0.5, N - 0.5), density=True)
pylab.title("Walker's method N = " + str(N))
pylab.show()

## From discrete to continuous

Figure: continuum limit of the tower sampling.
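In the continuum limit, the tower of cumulative weights becomes the cumulative distribution function, and the bisection search on the tower becomes the inversion of that function. A minimal sketch of this idea (the example distribution π(x) = 2x and the function names are illustrative, not from the original):

```python
import math, random

random.seed(0)

def sample_inverse_cdf(inv_cdf):
    # continuum analogue of tower sampling: draw Upsilon = ran(0,1)
    # and map it through the inverse cumulative distribution function
    u = random.uniform(0.0, 1.0)
    return inv_cdf(u)

# Example: pi(x) = 2x on (0,1) has cumulative F(x) = x^2, so F^{-1}(u) = sqrt(u)
samples = [sample_inverse_cdf(math.sqrt) for _ in range(100000)]
print(sum(samples) / len(samples))  # close to the exact mean 2/3
```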

## One-dimensional integrals

We now consider two other examples of the one-dimensional sample transformation. In the first example, we produce samples x with 0 < x < 1, distributed according to
$\pi(x) \propto x^\sigma$
We write the integrals as
\begin{align} \int_0^1 d\Upsilon &= \text{const} \int_0^1 dx\, x^\sigma \\ d\Upsilon &= \text{const}\, dx\, x^\sigma \\ \Upsilon &= \frac{\text{const}}{\sigma+1}\, x^{\sigma+1} \\ \text{ran}(0,1) &= x^{\sigma+1} \\ \text{ran}(0,1)^{1/(\sigma+1)} &= x \end{align}
where const = σ + 1, so that Υ ranges over (0,1) and can be identified with ran(0,1).
It follows that
$x = [\text{ran}(0,1)]^{1/(\sigma+1)}$
is distributed as
$\pi(x) \propto x^\sigma$
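A quick numerical check of this sample transformation (the value σ = 2 and the sample size are arbitrary choices for illustration):

```python
import random

random.seed(0)
sigma = 2.0
# x = ran(0,1)**(1/(sigma+1)) is distributed as pi(x) proportional to x**sigma
samples = [random.uniform(0.0, 1.0) ** (1.0 / (sigma + 1.0)) for _ in range(100000)]
# the exact mean of the normalized distribution (sigma+1) x**sigma is (sigma+1)/(sigma+2)
print(sum(samples) / len(samples))  # close to 0.75 for sigma = 2
```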

## Errors and Central limit theorem

In a direct-sampling simulation, the outcome is usually a sequence of values
$(\xi_1, \xi_2, \ldots, \xi_N)$
which are different realizations of a random variable ξ with distribution π(ξ). We assume that these values are independent and identically distributed (i.i.d.).

Using these N realizations, the best estimate of the mean
$\langle \xi\rangle$
of the distribution π(ξ) of ξ is
$X_N=\frac{1}{N}\sum_{i=1}^N \xi_i .$
Can we determine the statistical error associated with this estimation?

Since the mean XN is the average of N i.i.d. random variables, the central limit theorem applies, and the result reads:
$X_N=\langle \xi\rangle \pm \frac{\sqrt{\text{Var}(\xi_i)}}{\sqrt{N}}.$

Test this prediction with the following example: Sample N random numbers
$\{\xi_1,\xi_2,\ldots, \xi_N\}$
taken from a uniform distribution on [0,1] and give an estimation of the average with its associated error.
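A minimal sketch of this test (the sample size N and the seed are arbitrary choices):

```python
import math, random

random.seed(0)
N = 100000
xi = [random.uniform(0.0, 1.0) for i in range(N)]
mean = sum(xi) / N
# unbiased estimate of Var(xi) from the same samples
var = sum((x - mean) ** 2 for x in xi) / (N - 1.0)
error = math.sqrt(var / N)
print('estimate of <xi>: %f +/- %f' % (mean, error))
# exact values: <xi> = 1/2 and Var(xi) = 1/12, so error ~ sqrt(1/(12 N))
```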

## Stable distribution

Take Ntime = 10, 100, 1000, ... and study the distribution of the variable
$\frac{1}{N}\sum_{i=1}^N \xi_i$
as N becomes large. Show that, after a proper rescaling of this variable, the distribution converges (for large N) to a unique curve. What is this limit distribution?
import pylab, random

Nsample = 20000
Ntime = 100
random.seed(2)
alpha = 3
# mean of the Pareto-type variable ran(0,1)**(-1/alpha)
xmean = alpha / (alpha - 1.0)
# central-limit rescaling factor sqrt(Ntime)
Nfactor = float(Ntime) ** 0.5   # alternative scaling: Ntime ** (1. - 1. / alpha)
xsum = [0.0 for n in range(Nsample)]
for isample in range(Nsample):
    xi = [random.random() ** (-1.0 / alpha) for itime in range(Ntime)]
    xsum[isample] = (sum(xi) / float(Ntime) - xmean) * Nfactor
pylab.hist(xsum, bins=30, range=(-10, 10), density=True)
pylab.title('Example 3, Rescaled Distributions, alpha = 3')
pylab.xlabel('x')
pylab.ylabel('distribution')
pylab.show()