Michael Truell
Horace Mann School, USA
Title: A universal robotic control system using reinforcement learning with limited feedback
Biography
Abstract
A mobile-robot deep reinforcement learning system is created that converges on common robotic tasks using one quarter of the feedback required by pre-existing solutions. The system achieves this leap in efficiency through context-aware action selection and aggressive online hyperparameter optimization while still maintaining performance on embedded hardware. A core algorithm of deep wire-fitted Q-learning is supplemented with an active measure of robot uncertainty, defined as the derivative of the error between expected and received reward. This uncertainty value directly scales both the temperature of the Boltzmann exploration policy and the learning rate of stochastic gradient descent. Furthermore, to provide generality across robots and tasks, the neural network topology is efficiently evolved throughout training and evaluation. Finally, experience replay is extended to changing environments and is integrated with our uncertainty value. Human operators successfully trained the system on multiple robots in a matter of minutes to perform tasks such as driving to a point with a differential drive system, following a line using holonomic Swedish wheels, or playing ping pong with a robot arm, all without any manual hyperparameter adjustment, both in simulation and on hardware.
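To make the uncertainty-driven mechanism concrete, the sketch below illustrates one way the described coupling could work: an uncertainty signal derived from the change in the error between expected and received reward is used to scale both the Boltzmann exploration temperature and the SGD learning rate. This is a minimal, assumption-laden illustration, not the author's implementation; the class name UncertaintyTracker, the exponential smoothing, and the linear scaling rules are hypothetical choices made only for the example.

```python
import numpy as np

def boltzmann_select(q_values, temperature, rng=None):
    """Sample an action index from a Boltzmann (softmax) distribution over Q-values."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(q_values, dtype=float) / max(temperature, 1e-6)
    logits -= logits.max()                      # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return int(rng.choice(len(q_values), p=probs))

class UncertaintyTracker:
    """Tracks the change in |expected - received reward| ("uncertainty") and
    uses it to scale exploration temperature and learning rate (hypothetical sketch)."""
    def __init__(self, base_temp=1.0, base_lr=0.01, smoothing=0.9):
        self.base_temp, self.base_lr, self.smoothing = base_temp, base_lr, smoothing
        self.prev_error, self.uncertainty = 0.0, 0.0

    def update(self, expected_reward, received_reward):
        error = abs(expected_reward - received_reward)
        delta = error - self.prev_error          # discrete "derivative" of the error
        self.uncertainty = (self.smoothing * self.uncertainty
                            + (1.0 - self.smoothing) * max(delta, 0.0))
        self.prev_error = error

    @property
    def temperature(self):
        # Explore more broadly when recent predictions have been getting worse.
        return self.base_temp * (1.0 + self.uncertainty)

    @property
    def learning_rate(self):
        # Learn faster when the agent's reward predictions are degrading.
        return self.base_lr * (1.0 + self.uncertainty)
```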