AdaBoost is a boosting algorithm for classification that iteratively forces the weak learners to correctly predict the samples that were misclassified in previous iterations.
It does this by assigning a weight to each sample. At initialisation every sample has the same weight, but after each iteration the weights of the misclassified samples are increased relative to those of the correctly classified samples.
Here are the steps of AdaBoost:
1. Initialise the sample weights \(\alpha_j=\frac{1}{n_{pop}}\) for \(j=1,\dots,n_{pop}\).
2. For \(k=1,\dots,n_{trees}\): fit a weak learner \(t_k\) on the training set weighted by the \(\alpha_j\), compute its weighted error \(\varepsilon_k\) and its weight \(w_k\), then update and normalise the \(\alpha_j\) (a training sketch follows the definitions below).
3. Combine the \(n_{trees}\) learners into the final weighted vote \(T_A\).
Where:
\(\varepsilon_k=\sum_{j=1}^{n_{pop}}\alpha_j[t_k(X_j) \ne Y_j]\),
\(w_k=\frac{1}{2}\log\left(\frac{1-\varepsilon_k}{\varepsilon_k}\right)\) (the negative logit of \(\varepsilon_k\) multiplied by 0.5),
\(\alpha_j\) is updated as follows: \(\alpha_j=\begin{cases} \alpha_j e^{-w_k} & \text{if } \;\; t_k(X_j)=Y_j\\ \alpha_j e^{w_k} & \text{if } \;\; t_k(X_j) \ne Y_j \end{cases}\),
the \(\alpha_j\) are then normalised: \(\alpha_j \leftarrow \frac{\alpha_j}{\sum_{l=1}^{n_{pop}}\alpha_l}\).
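Below is a minimal training sketch of these steps, assuming binary labels in \(\{0,1\}\) and scikit-learn decision stumps as the weak learners \(t_k\); the function name `fit_adaboost` and the variable names `alpha` and `w` are illustrative, not taken from the original text.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_adaboost(X, Y, n_trees=50):
    """Fit AdaBoost with decision stumps; returns the learners t_k and their weights w_k."""
    n_pop = X.shape[0]
    alpha = np.full(n_pop, 1.0 / n_pop)            # equal sample weights at initialisation
    trees, w = [], []
    for k in range(n_trees):
        t_k = DecisionTreeClassifier(max_depth=1)  # weak learner (decision stump)
        t_k.fit(X, Y, sample_weight=alpha)
        miss = t_k.predict(X) != Y                 # indicator [t_k(X_j) != Y_j]
        eps_k = np.clip(np.sum(alpha * miss), 1e-10, 1 - 1e-10)  # weighted error eps_k
        w_k = 0.5 * np.log((1 - eps_k) / eps_k)    # half the negative logit of eps_k
        alpha *= np.exp(np.where(miss, w_k, -w_k)) # increase weights of misclassified samples
        alpha /= alpha.sum()                       # normalisation of alpha_j
        trees.append(t_k)
        w.append(w_k)
    return trees, np.array(w)
```

The clipping of \(\varepsilon_k\) is only there to avoid a division by zero when a stump classifies every sample correctly (or incorrectly).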
The final prediction for an unseen sample \(x\) is:
\[T_A(x)=\left[\;\sum_{k=1}^{n_{trees}} w_k\,[t_k(x) > 0.5] \;>\; \sum_{k=1}^{n_{trees}} w_k\,[t_k(x) \leq 0.5]\;\right],\]
i.e. \(T_A(x)=1\) when the weighted votes for class 1 outweigh those for class 0. Using the exponential loss \(L(h(X), Y)=e^{-h(X)Y}\) and \(\lambda=1\), it can be shown that Gradient Boosting is almost equivalent to AdaBoost.
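Continuing the sketch above, the weighted vote can be computed as follows (the helper name `predict_adaboost` is again illustrative):

```python
def predict_adaboost(trees, w, X):
    """Weighted majority vote of the stumps trained by fit_adaboost."""
    votes = np.array([t_k.predict(X) for t_k in trees])  # shape (n_trees, n_samples)
    score_1 = np.sum(w[:, None] * (votes == 1), axis=0)  # weighted votes for class 1
    score_0 = np.sum(w[:, None] * (votes == 0), axis=0)  # weighted votes for class 0
    return (score_1 > score_0).astype(int)               # T_A(x): class with the larger score

# Example usage:
# trees, w = fit_adaboost(X, Y, n_trees=20)
# Y_hat = predict_adaboost(trees, w, X)
```

As a brief sketch of that equivalence (not spelled out in the text above, and with the symbol \(\tilde\alpha_j\) introduced only for this sketch), assume labels \(Y_j\in\{-1,+1\}\) and let \(h\) be the current ensemble; adding a new learner \(t\) with weight \(w\) gives an exponential loss of
\[\sum_{j=1}^{n_{pop}} e^{-Y_j\left(h(X_j)+w\,t(X_j)\right)}=\sum_{j=1}^{n_{pop}} \tilde\alpha_j\, e^{-Y_j\,w\,t(X_j)}, \qquad \tilde\alpha_j = e^{-Y_j h(X_j)},\]
so the current exponential loss of each sample plays exactly the role of the AdaBoost weight \(\alpha_j\) (after normalisation), and minimising the right-hand side over \(w\) yields \(w=\frac{1}{2}\log\left(\frac{1-\varepsilon}{\varepsilon}\right)\), the AdaBoost learner weight.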