Improving Reinforcement FALCON Learning in Complex Environment with Much Delayed Evaluation via Memetic Automaton

Published: 01 Jan 2019 · Last Modified: 13 Nov 2024 · CEC 2019 · CC BY-SA 4.0
Abstract: The Fusion Architecture for Learning, COgnition, and Navigation (FALCON) is an extension of the self-organizing Adaptive Resonance Theory (ART) neural network, which has been successfully applied to many reinforcement learning tasks and has demonstrated fast and stable real-time learning capabilities. However, the learning of reinforcement FALCON relies on positive feedback obtained from the environment, which may not always be available in many real-world applications. Although TD-FALCON has been proposed in the literature to integrate the temporal difference method for estimating the payoff value when an immediate reward is not available, the accuracy of this estimation still relies on the feedback received from the environment. In complex environments with much delayed evaluation, reinforcement FALCON may therefore struggle to quickly learn the knowledge needed to adapt to the given task. To the best of our knowledge, no existing work has been conducted to improve reinforcement FALCON learning in such environments. Taking this cue, and inspired by the science of memetics, in this paper we propose to improve reinforcement FALCON learning in complex environments where positive reward is hard to achieve, via a memetic automaton. In particular, by defining a suitable representation of memes in the context of FALCON, the corresponding designs of meme selection and meme transmission for meme evolution are presented, so as to transfer knowledge memes from well-learned agents in a simple environment and improve the learning performance of FALCON agents in a complex environment. Lastly, simulations of a FALCON-based multi-agent system on the mine navigation task platform confirm the efficacy of the proposed memetic model.
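For context, TD-FALCON's value estimation follows a bounded temporal-difference (Q-learning style) update, and the memetic transfer described in the abstract amounts to passing learned knowledge structures from well-trained agents to novice agents. The sketch below illustrates both ideas; it is a minimal, hypothetical illustration, and the class names, the top-k selection heuristic, and the (state, action, value) node layout are assumptions for exposition, not the paper's actual design.

```python
import numpy as np

ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (illustrative values)

def td_falcon_update(q_sa, reward, q_next_max):
    """Bounded TD update in the style of TD-FALCON value estimation.

    The TD error is scaled by (1 - Q) so the estimated payoff stays within [0, 1].
    """
    td_err = reward + GAMMA * q_next_max - q_sa
    return q_sa + ALPHA * td_err * (1.0 - q_sa)

class FalconAgent:
    """Toy stand-in for a FALCON agent: each 'category node' stores a
    (state vector, action vector, value) triple learned from interaction."""
    def __init__(self):
        self.category_nodes = []  # list of (state_vec, action_vec, value)

    def transmit_memes(self, other, top_k=5):
        """Illustrative meme transmission: copy the highest-valued category
        nodes (treated as 'knowledge memes') from this agent into another."""
        memes = sorted(self.category_nodes, key=lambda n: n[2], reverse=True)[:top_k]
        other.category_nodes.extend(memes)

# Usage sketch: an agent trained in a simple environment seeds a novice
# agent that must learn in a complex environment with delayed rewards.
teacher, novice = FalconAgent(), FalconAgent()
teacher.category_nodes = [(np.ones(4), np.eye(5)[i], v)
                          for i, v in enumerate([0.9, 0.2, 0.7, 0.4, 0.8])]
teacher.transmit_memes(novice, top_k=3)
print(len(novice.category_nodes))        # 3 transferred memes
print(td_falcon_update(0.5, 0.0, 0.8))   # bounded TD update on one estimate
```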