- Keywords: model-based control, model-based learning
- TL;DR: This paper provides a personal survey and retrospective on model-based control and learning.
- Abstract: This paper provides a personal survey and retrospective of my work on robot control and learning. It reveals an ideological agenda: Learning, and much of AI, is all about making and using models to optimize behavior. Robots should be rational agents. Reinforcement learning is optimal control. Learn models, and optimize policies to maximize reward. The view that intelligence is defined by making and using models guides this approach. This is a strong version of model-based learning, which emphasizes model-based reasoning as a form of behavior and takes into account bounded rationality, bounded information gathering, and bounded learning. It contrasts with the weak version of model-based learning, which uses models only to generate fake (imagined, dreamed) training data for model-free learning. Thinking is a physical behavior that is controlled by the agent, just as movement and any other behavior are controlled. Learning to control what to think about and when to think is another form of robot learning. That represents my career up to now. However, I have come to see that model-free approaches to learning, where policies or control laws are directly manipulated to change behavior, also play a useful role, and that we should combine model-based and model-free approaches to learning. Everyday life is complicated. We (and robots) don't know what aspects of the world are relevant to any particular problem. The dimensionality of the state is potentially the dimensionality of the universe. The number of possible actions is vast. The complexity barrier for AI is like the speed of sound for airplanes: as a field, we have to punch through it to go supersonic and develop robots that are useful in our homes, in our daily lives, and on the road [The Right Stuff, book 1979, movie 1983].
I advocate exploring life-long learning in the context of rich knowledge, such as libraries of past observed, taught, actual, and imagined behavior, rather than focusing on impoverished and isolated learning in which a single representation is adapted once, from a single training set, to perform a single behavior, independent of any larger context. I advocate working with real robots in the real world to address the complexity barrier. Working with real robots is much easier and more fun than trying to figure out how to get simulators to capture the complexity that human learning effortlessly takes advantage of.