COMPARATIVE STUDY OF WORLD MODELS, NVAE-BASED HIERARCHICAL MODELS, AND NOISYNET-AUGMENTED MODELS IN CARRACING-V2

Vidyavarshini Holenarasipur Jayashankar; Banafsheh Rekabdar

Published: 06 Mar 2025, Last Modified: 15 Apr 2025

Venue: ICLR 2025 Workshop World Models

License: CC BY 4.0

Keywords: Reinforcement Learning, World Models, NoisyNet, NVAE, Model-Based RL

TL;DR: This study evaluates World Models, NVAE-based hierarchical models, and NoisyNet-augmented models in CarRacing-V2, comparing rewards, training stability, and efficiency against baselines to enhance reinforcement learning for continuous control.

Abstract: In OpenAI’s CarRacing-V2, Reinforcement Learning (RL) must address both world modeling and exploration. This work focuses on efficient world modeling and exploration strategies in RL for continuous control tasks, comparing different approaches for improving performance. It presents an experimental evaluation of three approaches: (i) standard World Models, (ii) NVAE-based hierarchical World Models, and (iii) NoisyNet-augmented World Models. We compare these methods on cumulative reward, training stability, and computational efficiency. The experiments show that the NVAE-based models improve feature representation and generalization, while the NoisyNet augmentation improves adaptive exploration. The work also highlights trade-offs among these approaches, for instance computational cost versus reward performance. The results show that standard World Models achieve the highest cumulative reward, whereas the NoisyNet-augmented models reach similar performance with fewer rollouts, indicating better exploration efficiency. Finally, the work proposes that future model-based RL for autonomous driving incorporate NVAE for feature extraction and NoisyNet for exploration, as this combination could yield the best results.

Submission Number: 59
