Many of the learning algorithms used in the project have parameters that must be tuned manually for good performance, and we are always looking for ways to set them automatically. Foremost among these parameters is the step-size parameter of stochastic gradient-descent algorithms. Several methods have been proposed for setting step-size parameters automatically, but unfortunately all of them have at least one parameter of their own. This meta-parameter must generally be tuned manually to the particular problem, which limits the benefit of these methods. This year we carried out a major study of algorithms for automatic step-size adaptation and ultimately produced a new algorithm, called Autostep, that has no parameters or meta-parameters. That is, it can be applied with no domain knowledge beyond what is needed to formulate a problem as gradient descent. We stress-tested this algorithm by applying it to Critterbot data, asking it to make simple short-term predictions of all of the robot’s sensory variables. Autostep was able to find step-size parameters for all of these predictions with no tuning.
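To make the meta-parameter problem concrete, the following is a minimal sketch of IDBD (Sutton, 1992), a well-known earlier step-size adaptation method for linear supervised learning. It is not Autostep, whose update rule is not given in this report. The function names, the toy prediction task, and the numeric settings are all assumptions chosen for illustration. Note that the meta step-size `theta` still has to be chosen by hand, which is the kind of residual tuning described above.

```python
import math
import random


def idbd_update(w, h, beta, x, target, theta):
    """Apply one IDBD step to a linear predictor, updating w, h, beta in place.

    w     -- weights of the linear predictor
    h     -- per-weight memory trace of recent weight changes
    beta  -- per-weight log step-sizes (step-size is exp(beta[i]))
    theta -- meta step-size; this is the hand-tuned meta-parameter
    """
    prediction = sum(wi * xi for wi, xi in zip(w, x))
    delta = target - prediction
    for i, xi in enumerate(x):
        # Raise the step-size when the current correction agrees with recent ones.
        beta[i] += theta * delta * xi * h[i]
        alpha = math.exp(beta[i])
        w[i] += alpha * delta * xi
        h[i] = h[i] * max(0.0, 1.0 - alpha * xi * xi) + alpha * delta * xi
    return delta


def run_demo(steps=5000, theta=0.01, initial_alpha=0.05, seed=0):
    """Hypothetical toy task: only input 0 is relevant, and its sign flips over time."""
    rng = random.Random(seed)
    n = 3
    w = [0.0] * n
    h = [0.0] * n
    beta = [math.log(initial_alpha)] * n
    sign = 1.0
    for t in range(steps):
        if t > 0 and t % 100 == 0:
            sign = -sign  # the target relationship changes every 100 steps
        x = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        target = sign * x[0]  # inputs 1 and 2 are irrelevant to the target
        idbd_update(w, h, beta, x, target, theta)
    return [math.exp(b) for b in beta]


if __name__ == "__main__":
    print(run_demo())
```

In this toy setting the learned step-size for the relevant input should end up larger than those for the two irrelevant inputs. How quickly that happens depends on `theta`, which a parameter-free method such as Autostep is meant to make unnecessary.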
Autostep and its predecessor algorithms are all designed for conventional supervised learning, not reinforcement learning. A goal for next year is to extend them to temporal-difference (TD) and gradient-TD (GTD) learning, and then to use them to improve nexting and learning in Horde. If this is successful, we then hope to use Autostep to evaluate the utility of representational components, such as the features used in linear function approximation. Evaluating features in this way seems a promising method for directing representation discovery.