Keywords: model-based control, model-based learning
TL;DR: This paper provides a personal survey and retrospective on model-based control and learning.
Abstract: This paper provides a personal survey and retrospective of my work on
robot control and learning. It reveals an ideological agenda:
Learning, and much of AI, is all about making and using models to
optimize behavior. Robots should be rational agents. Reinforcement
learning is optimal control. Learn models, and optimize policies to
maximize reward. The view that intelligence is defined by making and
using models guides this approach.
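To make the recipe concrete, here is a minimal sketch of model-based control, assuming a hypothetical one-dimensional point-mass task: fit a linear dynamics model from data by least squares, then optimize behavior against the learned model with a simple random-shooting planner. The task, the planner choice, and every name below are illustrative assumptions, not the paper's method.

```python
import numpy as np

# Hypothetical 1-D point-mass task: state = [position, velocity],
# action = force. All names and values here are illustrative.
rng = np.random.default_rng(0)
dt = 0.1

def true_step(x, u):
    # Dynamics unknown to the agent, used only to generate data.
    pos, vel = x
    return np.array([pos + dt * vel, vel + dt * u])

# 1. Learn a model: fit linear dynamics x' = A x + B u by least squares.
X, U, Xn = [], [], []
for _ in range(200):
    x = rng.normal(size=2); u = rng.normal()
    X.append(x); U.append([u]); Xn.append(true_step(x, u))
Phi = np.hstack([X, U])                    # regressors [x, u]
theta, *_ = np.linalg.lstsq(Phi, np.array(Xn), rcond=None)
A, B = theta[:2].T, theta[2:].T            # learned dynamics

def model_step(x, u):
    return A @ x + B @ np.array([u])

# 2. Optimize behavior against the learned model: random-shooting
#    search over action sequences, maximizing reward = -cost.
def reward(xs, us):
    return -sum(x[0] ** 2 + 0.1 * u ** 2 for x, u in zip(xs, us))

def plan(x0, horizon=20, samples=500):
    best_u, best_r = 0.0, -np.inf
    for _ in range(samples):
        us = rng.uniform(-1, 1, horizon)
        xs, x = [], x0
        for u in us:
            x = model_step(x, u); xs.append(x)
        r = reward(xs, us)
        if r > best_r:
            best_r, best_u = r, us[0]
    return best_u                          # first action of the best plan

x = np.array([1.0, 0.0])                   # start away from the origin
for t in range(30):
    x = true_step(x, plan(x))
print("final state:", x)                   # should move toward [0, 0]
```

Replanning from the current state at every step (model-predictive control) keeps the sketch tolerant of small model errors.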
This agenda is a strong version of model-based learning: it treats model-based reasoning as a form of behavior, and it takes into account bounded rationality, bounded information gathering, and bounded learning. It contrasts with the weak version of model-based learning, which uses models only to generate fake (imagined, dreamed) training data for model-free learning. Thinking is a physical behavior that is controlled by the agent, just as movement and every other behavior is controlled. Learning to control what to think about and when to think is another form of robot learning.
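To make the contrast concrete, here is a minimal sketch of the weak version in the style of Dyna-Q (Sutton's Dyna is a standard instance of this idea, though not named in the abstract): a learned model is used only to manufacture imagined transitions, which are fed to an ordinary model-free Q-learning update. The five-state chain task and all parameter values are illustrative assumptions.

```python
import numpy as np

# Minimal Dyna-Q sketch on an illustrative 5-state chain task.
rng = np.random.default_rng(0)
n_states, n_actions = 5, 2                 # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))        # model-free value estimates
model = {}                                 # learned model: (s, a) -> (r, s')
alpha, gamma = 0.5, 0.95

def env_step(s, a):
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return (1.0 if s2 == n_states - 1 else 0.0), s2   # reward at right end

s = 0
for step in range(2000):
    a = int(rng.integers(n_actions))       # explore; Q-learning is off-policy
    r, s2 = env_step(s, a)
    # Model-free update from real experience.
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    # Learn the model: here, just remember the deterministic transition.
    model[(s, a)] = (r, s2)
    # The "weak" step: replay imagined (dreamed) transitions from the
    # model through the same model-free update.
    for _ in range(10):
        (ps, pa), (pr, ps2) = list(model.items())[rng.integers(len(model))]
        Q[ps, pa] += alpha * (pr + gamma * Q[ps2].max() - Q[ps, pa])
    s = 0 if s2 == n_states - 1 else s2    # reset after reaching the goal

print(Q.argmax(axis=1))                    # states 0-3 learn "go right"
```

The strong version would go further and treat the inner imagination loop itself as behavior to be controlled, deciding what to imagine and when.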
That agenda represents my career up to now. However, I have come to see that model-free approaches to learning, in which policies or control laws are manipulated directly to change behavior, also play a useful role, and that we should combine model-based and model-free approaches to learning.
Everyday life is complicated. We (and robots) don't know what aspects of the world are relevant to any particular problem. The dimensionality of the state is potentially the dimensionality of the universe. The number of possible actions is vast. The complexity barrier for AI is like the sound barrier for airplanes: as a field, we have to punch through it to go supersonic and develop useful robots in our homes, in our daily lives, and on the road [The Right Stuff, book 1979, movie 1983].
I advocate exploring lifelong learning in the context of rich knowledge, such as libraries of past observed, taught, actual, and imagined behavior, rather than focusing on impoverished and isolated learning, where a single representation is adapted to do a single behavior once, from a single training set, independent of any larger context. I advocate working with real robots in the real world to address the complexity barrier. Working with real robots is much easier and more fun than trying to get simulators to capture the complexity that human learning effortlessly takes advantage of.