- Abstract: Contrary to most reinforcement learning studies, which emphasize on training a deep neural network to approximate its output layer to certain strategies, this paper proposes a reversed method for reinforcement learning. We call this “Deep Deducing”. In short, after adequately training a deep neural network according to a strategy-environment-to-payoff table, then we initialize randomized strategy input and propagate the error between the actual output and the desired output back to the initially-randomized strategy input in the “input layer” of the trained deep neural network gradually to perform a task similar to “human deduction”. And we view the final strategy input in the “input layer” as the fittest strategy for a neural network when confronting the observed environment input from the world outside.
- Keywords: Reinforcement Learning, Deep Feed-forward Neural Network, Recurrent Neural Network, Game Theory, Control Theory, Nash Equilibrium, Optimization
- TL;DR: FROM DEEP LEARNING TO DEEP DEDUCING