Saturday 18 May 2013

Monte Carlo Integration 3

[Interactive animation: jQuery UI sliders control $\alpha$, $N$, and $M$; the plot shows the distribution of Monte Carlo estimates and recalculates with fresh random numbers when clicked.]
The animation above does the integral $\int_0^1 dx x^\alpha$ a total of $M$ times using $N$ samples and plots the distribution of values obtained. The mean is shifted to zero and the distribution is normalized to one. You can also click the graph which will recalculate the distribution using different random numbers. For further explanation read on...

This post is about how to estimate the error in Monte Carlo integration. It will get very mathematical, but here's the answer: assume that the distribution of results is Gaussian and estimate the error from the sample variance. However, as we will see when we analyse it in detail, the Gaussian assumption does not hold for arbitrary integrals, which has implications in high finance among other fields.
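
To make that answer concrete before the maths starts, here is a minimal sketch in Python of a Monte Carlo integrator that quotes exactly this naive error bar. The function name `mc_integrate` and the use of numpy are my choices for illustration, not part of the original animation code.

```python
import numpy as np

def mc_integrate(f, n, rng=None):
    """Monte Carlo estimate of the integral of f over [0, 1].

    The quoted error is the naive one, sqrt(sample variance / n),
    which is only meaningful when the Gaussian picture derived
    below actually applies.
    """
    rng = rng or np.random.default_rng()
    fx = f(rng.random(n))                  # f evaluated at n uniform points
    return fx.mean(), fx.std(ddof=1) / np.sqrt(n)

# The integrand from the animation: x**alpha on [0, 1], exact 1/(alpha + 1).
est, err = mc_integrate(lambda x: x**0.5, 100_000)
print(f"{est:.4f} +/- {err:.4f}  (exact {1 / 1.5:.4f})")
```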

The first question is how to take the distribution of random $x$ values we use for Monte Carlo and obtain the distribution of function values. The probability that we get the value $F = f(x)$ when $x$ is chosen according to $P(x)$ is \[ \int dx\, \delta(F - f(x)) P(x) \equiv P(F), \] where we will distinguish the probability distributions for $x$ and $F$ by their arguments. The equation above simply says that the probability to get $F$ is the probability to get an $x$ such that $F = f(x)$, summed over all values of $x$ for which this holds. The characteristic function is defined as the expectation value of $e^{ikF}$, and we define the function $W(ik)$ as the log of this (for reasons that will become clear), \[ W(ik) = \ln \int dF\, P(F) e^{ikF} = \ln \int dx\, P(x) e^{ikf(x)} \equiv \ln \langle e^{ikf}\rangle. \] To get the third expression we used the definition of $P(F)$, and the integral with respect to $P(x)\,dx$ defines the angle brackets.
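
As a concrete instance of the $P(F)$ formula: with $f(x) = \sqrt{x}$ and uniform $P(x)$ on $[0,1]$, doing the delta-function integral gives $P(F) = 2F$. A quick histogram check (a sketch; numpy assumed, and the bin count is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
F = np.sqrt(rng.random(1_000_000))   # F = f(x) with x uniform on [0, 1]

# Empirical P(F) from a normalized histogram, against the analytic 2F.
counts, edges = np.histogram(F, bins=10, range=(0, 1), density=True)
centres = 0.5 * (edges[:-1] + edges[1:])
for c, p in zip(centres, counts):
    print(f"F = {c:.2f}: histogram {p:.3f}, analytic {2 * c:.3f}")
```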

Now comes an important assumption: assume $W(ik)$ has a series expansion \[ W(ik) = \sum_{n=1}^{\infty} \frac{C_n (ik)^n}{n!}. \] The coefficients $C_n$ are called cumulants, and by expanding out $\ln \langle e^{ikf}\rangle$ we can calculate them. The most important ones are the first two, \[ C_1 = \langle f \rangle, \qquad C_2 = \langle (f - \langle f \rangle)^2 \rangle. \] We recognise (if we have taken a lot of statistics courses) $C_1$ as the mean and $C_2$ as the variance.
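
As a sanity check on $C_1$ and $C_2$, the sketch below (numpy assumed) compares sample estimates of the mean and variance of $f(x) = x^\alpha$ against the exact moments $\langle f^n \rangle = 1/(\alpha n + 1)$, which follow from elementary integration:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.5
f = rng.random(1_000_000)**alpha

# Sample estimates of the first two cumulants...
c1, c2 = f.mean(), f.var(ddof=1)

# ...against the exact moments of x**alpha on [0, 1]: <f**n> = 1/(alpha*n + 1).
print(c1, 1 / (alpha + 1))                              # mean, ~2/3
print(c2, 1 / (2 * alpha + 1) - (1 / (alpha + 1))**2)   # variance, ~1/18
```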

In a Monte Carlo integration we take more than just one sample, so what is the distribution of estimates $\bar{f} = \frac{1}{N} \sum_i^N f(x_i)$? In the same way as before, the probability to get the estimate $\bar{F}$ is the probability to get $x_1 \ldots x_N$ such that $\bar{f} = \bar{F}$ (the notation is getting subtle): \[ P(\bar{F}) = \int \ldots \int dx_1 \ldots dx_N\, P(x_1) \ldots P(x_N)\, \delta\Big(\bar{F} - \frac{1}{N} \sum_i^N f(x_i)\Big). \] This has a characteristic function $\Omega(ik)$, \[ \Omega(ik) = \ln \int d\bar{F}\, P(\bar{F}) e^{ik \bar{F}}. \] Substituting $P(\bar{F})$ into this, the delta function sets $\bar{F} = \frac{1}{N} \sum_i^N f(x_i)$, so the exponential factorises, $e^{ik\bar{F}} = \prod_i e^{ikf(x_i)/N}$, and because the $x_i$ are independent the multiple integral factorises into $N$ identical one-dimensional integrals, \[ \Omega(ik) = \ln \left[ \int dx\, P(x) e^{ikf(x)/N} \right]^N. \] The log of the thing in square brackets is just $W(ik/N)$, and the magic properties of $\ln$ let us put the power in front, \[ \Omega(ik) = N W(ik/N). \] This means $\Omega$ can be expanded in terms of the same $C_n$ as before, \[ \Omega(ik) = \sum_{n=1}^{\infty} \frac{C_n (ik)^n}{n!\, N^{n-1}}. \]
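
The relation $\Omega(ik) = N W(ik/N)$ says the $n$-th cumulant of the sample mean is $C_n/N^{n-1}$. Here is a rough numerical check of the $n = 3$ case (a sketch; I picked $\alpha = 2$ for a visibly skewed $P(F)$, and the estimate gets noisy at the largest $N$):

```python
import numpy as np

rng = np.random.default_rng(4)
alpha, M = 2.0, 100_000   # alpha = 2 gives a visibly skewed P(F)

def third_cumulant(s):
    # The third cumulant equals the third central moment.
    return ((s - s.mean())**3).mean()

# Omega(ik) = N W(ik/N) predicts the n-th cumulant of the sample
# mean falls off as C_n / N**(n-1), so k3 * N**2 should stay
# roughly constant, equal to C_3 of a single sample.
for N in [1, 4, 16, 64]:
    fbar = (rng.random((M, N))**alpha).mean(axis=1)
    print(N, third_cumulant(fbar) * N**2)
```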

For $N$ very large we can truncate the series after the second term, since the $n$-th term is suppressed by a factor of $1/N^{n-1}$ (and the integrals become too difficult if $n > 2$): \[ \Omega(ik) = C_1 ik - \frac{C_2 k^2}{2 N}. \] We want $P(\bar{F})$, the distribution of answers we get from doing Monte Carlo integration with $N$ samples; $\exp(\Omega)$ is the Fourier transform of this, therefore inverse Fourier transforming $\exp(\Omega)$ gives $P(\bar{F})$. We need \[ P(\bar{F}) = \frac{1}{2\pi}\int dk \exp\left( C_1 (ik) + \frac{C_2 (ik)^2}{2 N} - ik\bar{F} \right). \] This is a Gaussian integral; you can look the answer up (or complete the square) to get \[ P(\bar{F}) = \sqrt{\frac{N}{2 \pi C_2}} \exp\left( -\frac{(\bar{F} - C_1)^2}{2 C_2 / N} \right). \] This is a Gaussian with mean $C_1 = \langle f \rangle$ and variance $C_2/N = \langle (f - \langle f \rangle)^2 \rangle/N$.

This is great news. As long as $N$ is big enough the distribution of answers we obtain from Monte Carlo is Gaussian. Answers far from the true mean are extremely unlikely: 99.7% of the measurements will be within $3\sigma$ of the mean (where $\sigma = \sqrt{C_2/N}$ is the width of the Gaussian). The largeness of $N$ is usually the assumption people worry about; however, there is another one: what if the higher terms in the expansion aren't small, e.g. if $C_3 \rightarrow \infty$? Then no matter how large $N$ is the truncation is not justified and the distribution won't be Gaussian.
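
Here is the Gaussian statement checked numerically (a sketch; the seed and the values of $\alpha$, $N$, $M$ are arbitrary choices, and the exact cumulants come from $\langle f^n \rangle = 1/(\alpha n + 1)$): the $M$ estimates pile up in a bell curve of width $\sqrt{C_2/N}$, with about 99.7% of them inside $3\sigma$.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, N, M = 0.5, 500, 10_000

# M independent Monte Carlo estimates, each from N samples.
fbar = (rng.random((M, N))**alpha).mean(axis=1)

# Exact cumulants of f(x) = x**alpha under uniform x on [0, 1].
c1 = 1 / (alpha + 1)
c2 = 1 / (2 * alpha + 1) - c1**2
sigma = np.sqrt(c2 / N)

print(fbar.mean(), c1)      # centre of the distribution -> C_1
print(fbar.std(), sigma)    # width -> sqrt(C_2 / N)
print(np.mean(np.abs(fbar - c1) < 3 * sigma))  # ~0.997 if Gaussian
```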

There is a simple example where this is the case: \[ \int_0^1 dx\, x^\alpha = \frac{1}{\alpha+1}. \] As long as $\alpha > -1$ the integral converges. What is $P(F)$? \[ P(F) = \int_\epsilon^1 dx\, \frac{\delta(F - x^\alpha)}{1-\epsilon} = \frac{F^{(1-\alpha)/\alpha}}{\alpha(1-\epsilon)}. \] The $1/(1-\epsilon)$ is a uniform probability distribution on the interval $[\epsilon,1]$ (corresponding to $P(x)$). There is no problem with taking $\epsilon$ to zero just yet, but wait. The higher terms in the expansion, $C_3 \ldots$, are sums of terms like $M_n = \int dF\, P(F) F^n$ (the usual statement is that the cumulants $C_n$ are combinations of the moments $M_n$). Let's calculate the $M_n$: \[ M_n = \int dF\, \frac{F^{(1-\alpha)/\alpha + n}}{\alpha(1-\epsilon)} = \frac{1 - \epsilon^{1+\alpha n}}{(1-\epsilon)(1 + \alpha n)}. \] If $1 + \alpha n < 0$ then the $\epsilon^{1 + \alpha n}$ in the numerator blows up as $\epsilon \rightarrow 0$, and thus $M_n$ and $C_n$ diverge.
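
You can watch this failure happen numerically. In the sketch below (numpy assumed; seed and sample sizes arbitrary), at $\alpha = -0.9$ we have $1 + 2\alpha < 0$, so $M_2$ and hence $C_2$ are infinite: the sample variance is dominated by the occasional point very close to $x = 0$, and the naive error bar never settles down even as the estimate of the integral (exact value $10$) slowly creeps towards the right answer.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha = -0.9   # exact integral: 1/(alpha + 1) = 10, but C_2 is infinite

# The naive error bar sqrt(variance/N) jumps around instead of
# shrinking smoothly, because the sample variance is dominated by
# rare samples near x = 0 and has no finite value to converge to.
for N in [10**3, 10**4, 10**5, 10**6]:
    f = rng.random(N)**alpha
    print(N, f.mean(), f.std(ddof=1) / np.sqrt(N))
```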

The animation at the top of the post lets you vary $\alpha$ from $1$, where all the moments converge, down to $-0.9$, where some diverge. The distribution obtained from doing $M$ Monte Carlo integrations changes from a Gaussian to a long-tailed one. "Rare" events are not so rare anymore and $\sigma$ is a very bad estimate of the error. So be careful, not everything in the world is a bell curve.
