Revision


Bayes’ theorem

Definition

In probability theory and statistics, Bayes’ theorem (or Bayes’ law or Bayes’ rule) describes the probability of an event, based on prior knowledge of conditions that might be related to the event.


Formula

Bayes’ theorem states:

\[P\left(A \vert B\right) = \frac{P\left(B \vert A\right) P\left(A\right)}{P\left(B\right)}\]

It follows directly from the definition of conditional probability:

\[P\left(A \vert B\right) P\left(B\right) = P\left(A \cap B\right) = P\left(B \vert A\right) P\left(A\right)\]

Another formulation is:

\[posterior = \frac{prior \times likelihood}{evidence}\]

Where:

- \(P(A \vert B)\) is the posterior: the probability of \(A\) given \(B\)
- \(P(A)\) is the prior probability of \(A\)
- \(P(B \vert A)\) is the likelihood of \(B\) given \(A\)
- \(P(B)\) is the evidence

In a test for a disease, \(A\) may be the event “the patient has the disease” and \(B\) the event “the test is positive”: the posterior \(P(A \vert B)\) is then the probability of having the disease given a positive test.
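The disease-test case can be sketched numerically with Bayes’ theorem; the prevalence, sensitivity, and specificity below are assumed illustrative values, not figures from the source:

```python
# Assumed illustrative values for a disease test.
prevalence = 0.01    # P(A): prior probability of having the disease
sensitivity = 0.95   # P(B|A): probability of a positive test given disease
specificity = 0.90   # P(not B | not A): probability of a negative test when healthy

# Evidence P(B): total probability of a positive test (law of total probability).
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' theorem: posterior P(A|B) = P(B|A) * P(A) / P(B).
p_disease_given_positive = sensitivity * prevalence / p_positive
```

Even with a fairly accurate test, the posterior stays low (under 9% here) because the disease is rare: most positive tests come from the large healthy population.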


Examples

See:


Resources

See:


MCMC Algorithms

In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain a sample from the desired distribution by recording states from the chain. The more steps that are included, the more closely the distribution of the sample matches the actual desired distribution.


Metropolis–Hastings Algorithm

The Metropolis–Hastings algorithm is a Markov chain Monte Carlo (MCMC) method for obtaining a sequence of random samples from a probability distribution \(\pi\) from which direct sampling is difficult. This sequence can be used to approximate the distribution (e.g. to generate a histogram) or to compute an integral (e.g. an expected value).


Assumptions

Assume:

\[\pi(x) = \frac{f(x)}{\int f(y) \mathrm{d}y}\]

Here \(\frac{1}{\int f(y) \mathrm{d}y}\) is a normalization constant that is often difficult to compute. The Metropolis–Hastings algorithm makes it possible to sample from \(\pi\) without computing this normalization constant: it only uses the fact that \(f \propto \pi\).

Let \(q(\cdot \vert x)\) be any probability density that depends on \(x\), called the proposal distribution. For the algorithm to be practical, it should be easy to sample from \(q(\cdot \vert x)\).

\(q(\cdot \vert x)\) may for example be \(\mathcal{N}(x, \sigma^2)\), a Gaussian centered at the current state.


Pseudo code
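A minimal Python sketch of the algorithm, assuming a symmetric Gaussian proposal \(q(\cdot \vert x) = \mathcal{N}(x, \sigma^2)\) as suggested above (function and parameter names are illustrative):

```python
import math
import random

def metropolis_hastings(f, x0, sigma, n_steps, seed=0):
    """Sample from pi(x) proportional to f(x); the normalization
    constant of f is never needed, only ratios f(x') / f(x)."""
    rng = random.Random(seed)
    x = x0
    fx = f(x)
    samples = []
    for _ in range(n_steps):
        # Propose a move from the symmetric Gaussian proposal N(x, sigma^2).
        x_new = rng.gauss(x, sigma)
        f_new = f(x_new)
        # Symmetric proposal: the acceptance ratio reduces to f(x') / f(x).
        alpha = min(1.0, f_new / fx)
        # Accept the move with probability alpha, otherwise stay at x.
        if rng.random() < alpha:
            x, fx = x_new, f_new
        samples.append(x)
    return samples

# Unnormalized density of N(2, 1): the 1/sqrt(2*pi) factor is deliberately omitted.
f = lambda x: math.exp(-0.5 * (x - 2.0) ** 2)

samples = metropolis_hastings(f, x0=0.0, sigma=1.0, n_steps=20000)
burned = samples[5000:]                 # discard burn-in
mean = sum(burned) / len(burned)        # should be close to 2
```

Note that rejected moves still append the current state: repeating a state is what gives low-density regions their correct (low) weight in the sample.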


Interpretation

At every iteration, the algorithm tries to move in the space of possible states. This move may be accepted or rejected. The acceptance probability \(\alpha\) depends on the density of the proposed state relative to that of the current state under \(\pi\). If the algorithm tries to move to a more probable state (i.e. \(\alpha = 1\)), the move is always accepted. Otherwise, the probability of rejecting the move grows with the drop in probability density of the proposed point compared to the current point.
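In the general (possibly asymmetric) case, a move from the current state \(x\) to a proposed state \(x'\) is accepted with the standard Metropolis–Hastings probability:

\[\alpha = \min\left(1, \frac{f\left(x'\right) q\left(x \vert x'\right)}{f\left(x\right) q\left(x' \vert x\right)}\right)\]

For a symmetric proposal such as \(\mathcal{N}(x, \sigma^2)\), the \(q\) terms cancel and the ratio reduces to \(f(x') / f(x)\). Since only the ratio of \(f\) appears, the normalization constant of \(\pi\) cancels out, which is why it never needs to be computed.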


Resources

See:


Resources

See: