p(\theta|D) = \frac{p(D|\theta) p(\theta)}{\int p(D|\theta)p(\theta) d\theta}
via Monte Carlo. The Metropolis algorithm, first described in Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller (1953), lets you sample from any distribution p(x) provided you can compute a function Cp(x) proportional to p(x). In our case, this means we can sample the posterior without ever performing the integral in the denominator: the unnormalized product p(D|\theta)p(\theta) is enough.
The algorithm
We start by defining a transition probability
M(x,x') = q(x,x')\min\left(\frac{p(x')}{p(x)}, 1\right)
where q is a symmetric proposal distribution, i.e. q(x,x') = q(x',x); the Metropolis-Hastings generalization handles asymmetric proposals by correcting the acceptance ratio. In practice, to create a chain of samples based on M we
- Draw a candidate x' by sampling q(x, \cdot)
- If p(x') \geq p(x), move to x'; otherwise move to x' with probability p(x')/p(x) and remain at x the rest of the time (see the sketch below).
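Here is a minimal sketch of these steps in Python with NumPy, assuming a scalar state and a Gaussian random walk for q. The names (metropolis, log_p, step) and the example posterior at the end are illustrative choices, not part of the original description; working with log densities avoids underflow when p is a product of many likelihood terms.

```python
import numpy as np

def metropolis(log_p, x0, n_samples, step=1.0, rng=None):
    """Metropolis sampler with a symmetric Gaussian random-walk proposal.

    log_p should return log Cp(x): the target density up to a constant,
    which is all the algorithm needs.
    """
    rng = np.random.default_rng() if rng is None else rng
    x, logp_x = x0, log_p(x0)
    samples = np.empty(n_samples)
    for i in range(n_samples):
        x_new = x + step * rng.standard_normal()  # symmetric q(x, .)
        logp_new = log_p(x_new)
        # Accept with probability min(p(x')/p(x), 1); comparing in logs
        # avoids overflow/underflow.
        if np.log(rng.uniform()) < logp_new - logp_x:
            x, logp_x = x_new, logp_new
        samples[i] = x
    return samples

# Hypothetical example: posterior of a Gaussian mean theta given data D,
# with a N(0, 10^2) prior. Only the *unnormalized* log posterior appears.
D = np.random.default_rng(0).normal(3.0, 1.0, size=50)

def log_post(theta):
    return -0.5 * np.sum((D - theta) ** 2) - 0.5 * theta**2 / 100.0

chain = metropolis(log_post, x0=0.0, n_samples=10_000, step=0.5)
```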
As the number of iterations increases, the probability of being in state x converges to p(x). Intuitively, this happens because p is an eigenfunction of M with eigenvalue 1, as the short check below makes explicit. Further details can be found in Robert and Casella (2004).
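To make the eigenvalue claim concrete (a standard check, spelled out here for completeness): assume q is symmetric and, without loss of generality, p(x') \leq p(x). Then

p(x)\,M(x,x') = p(x)\,q(x,x')\,\frac{p(x')}{p(x)} = p(x')\,q(x',x) = p(x')\,M(x',x)

where the last step uses M(x',x) = q(x',x)\min\left(\frac{p(x)}{p(x')}, 1\right) = q(x',x). This detailed balance, together with the point mass at x = x' contributed by rejected moves (which balances trivially), means one step of the chain leaves p unchanged.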
Practical issues
Although the algorithm works in principle with any non-degenerate proposal distribution q, a good choice shortens the burn-in period, i.e. the number of initial iterations we discard before considering the chain to have 'converged' to the target distribution. A common rough proxy when tuning the proposal is the acceptance rate, illustrated in the snippet below.
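Continuing the earlier sketch (this reuses metropolis and log_post from above; the burn-in length of 2,000 and the longer chain are arbitrary illustrative values, not recommendations):

```python
chain = metropolis(log_post, x0=0.0, n_samples=20_000, step=0.5)
burn_in = 2_000                    # discarded initial iterations
posterior_draws = chain[burn_in:]  # keep only post-burn-in samples
# With a continuous proposal, repeated consecutive values mark rejections,
# so the fraction of changes estimates the acceptance rate.
accept_rate = np.mean(np.diff(chain) != 0)
print(f"acceptance rate ~ {accept_rate:.2f}, "
      f"posterior mean ~ {posterior_draws.mean():.2f}")
```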