Parameter-free Step-size Adaptation

Many of the learning algorithms used in the project have parameters that must be tuned manually for good performance; we are always looking for ways that they can be set automatically. Foremost among these parameters is the step-size parameter of stochastic gradient-descent algorithms. Several methods have been proposed for automatically setting step-size parameters, but unfortunately all of them have at least one parameter of their own, and this meta-parameter must generally be tuned manually to the particular problem, thereby limiting the benefit of these methods. This year we have begun a subproject to develop a superior step-size algorithm with no parameters or meta-parameters, that is, one that can be applied with no domain knowledge other than what is needed to formulate a problem as gradient descent.
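To make the parameter in question concrete, the sketch below shows plain stochastic gradient descent on a linear prediction problem; the single global step-size alpha is the kind of parameter that currently has to be tuned by hand. The setup and names here are illustrative only and are not taken from the project's code.

```python
import numpy as np

def sgd_linear(examples, n_features, alpha=0.01):
    """Plain stochastic gradient descent for linear prediction.

    The step-size `alpha` must be chosen by hand: too small and
    learning is slow, too large and the weights can diverge.
    """
    w = np.zeros(n_features)
    for x, target in examples:
        error = target - np.dot(w, x)   # prediction error
        w += alpha * error * x          # gradient-descent step on squared error
    return w
```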

So far we have shown that all previous step-size methods involve at least one meta-parameter and that there is no single setting of the meta-parameters that produces acceptable performance on all tasks. We are focusing in particular on a family of methods related to an algorithm known as K1, previously developed by Sutton, which performs best among the existing methods but is still sensitive to its meta-parameter. We have developed a new algorithm, which we call Normalized K1, that performs well without tuning over a much wider range of problems. So far we have tested Normalized K1 on a range of artificial problems. Next we will stress it more severely by applying it to data from the Critterbot.
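The report does not spell out the Normalized K1 update, so as a rough illustration of the kind of meta-gradient step-size adaptation that K1 and its relatives perform, the sketch below follows Sutton's earlier IDBD algorithm for linear supervised learning, in which each weight gets its own step-size that is itself adapted by gradient descent. This is not the project's Normalized K1 algorithm, and the meta-step-size `theta` appearing here is exactly the kind of meta-parameter the new work aims to eliminate.

```python
import numpy as np

def idbd(examples, n_features, theta=0.01, init_alpha=0.05):
    """IDBD-style per-weight step-size adaptation (after Sutton, 1992).

    Each weight w[i] has its own step-size exp(beta[i]), which is itself
    adjusted by a meta-gradient rule using the meta-step-size `theta`.
    """
    w = np.zeros(n_features)
    beta = np.full(n_features, np.log(init_alpha))  # log per-weight step-sizes
    h = np.zeros(n_features)                        # trace of recent weight changes
    for x, target in examples:
        delta = target - np.dot(w, x)               # prediction error
        beta += theta * delta * x * h               # meta-gradient step on log step-sizes
        alpha = np.exp(beta)                        # per-weight step-sizes
        w += alpha * delta * x                      # main weight update
        h = h * np.maximum(0.0, 1.0 - alpha * x * x) + alpha * delta * x
    return w
```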