Revision

Introduction

Generalized Advantage Estimation (GAE) is an actor-critic method that uses a few tricks in how the advantage is estimated to improve training.

Tricks

Lambda return

GAE uses lambda returns, a generalization of n-step returns: instead of relying on a single n-step return, the estimate combines all the n-step returns (n = 1, 2, ...). The parameter \(\lambda\) defines the weight given to each of them, and the weights decay geometrically with n.
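
As a sketch of that weighting, the lambda return can be written as follows. The notation is an assumption, not taken from the course material: \(r_{t}\) is the reward, \(\gamma\) the discount factor, \(V\) the critic's value estimate, and \(G_t^{(n)}\) the n-step return.

\[
G_t^{(n)} = \sum_{k=0}^{n-1} \gamma^{k} r_{t+k+1} + \gamma^{n} V(s_{t+n}),
\qquad
G_t^{\lambda} = (1-\lambda) \sum_{n=1}^{\infty} \lambda^{n-1} G_t^{(n)}
\]

The factor \((1-\lambda)\) makes the weights \(\lambda^{n-1}\) sum to 1, so \(G_t^{\lambda}\) is a weighted average of the n-step returns.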

\(\lambda=0\) is equivalent to the one-step TD estimate and \(\lambda=1\) is equivalent to the Monte Carlo (MC) estimate. Every \(\lambda\) in between is a mix of the two.
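
A minimal sketch of how these advantages are commonly computed in practice, using a backward recursion over the TD residuals of one trajectory. The function name `gae_advantages`, its arguments, and the default values of `gamma` and `lam` are hypothetical choices for illustration. It assumes a single trajectory with no episode boundary inside it, and `last_value` is the critic's estimate for the state after the final step (0 if that state is terminal).

```python
import numpy as np

def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Advantage estimates for one trajectory (illustrative helper, not a library API).

    rewards[t] : reward received after acting in state t
    values[t]  : critic estimate V(s_t)
    last_value : critic estimate for the state following the last step
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)

    # V(s_{t+1}) for every step, bootstrapping from last_value at the end.
    next_values = np.append(values[1:], last_value)

    # One-step TD residuals: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t).
    deltas = rewards + gamma * next_values - values

    # Backward recursion: A_t = delta_t + gamma * lam * A_{t+1}.
    advantages = np.zeros_like(deltas)
    running = 0.0
    for t in reversed(range(len(deltas))):
        running = deltas[t] + gamma * lam * running
        advantages[t] = running
    return advantages
```

With `lam=0.0` the result is just the TD residuals, and with `lam=1.0` it is the discounted Monte Carlo return minus the critic's value, which matches the two limiting cases described above.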

Resources

See:

UDRLN videos 3.4.13.