Abstract: Large language models contain noisy general knowledge of the world, yet are hard to train or fine-tune. In contrast, cognitive architectures have excellent interpretability and are flexible to update, but require a lot of manual work to instantiate. In this work, we combine the best of both worlds by bootstrapping a cognitive-based model with the noisy knowledge encoded in large language models. Through an embodied agent doing kitchen tasks, we show that our proposed framework yields better efficiency compared to an agent entirely based on large language models. Our experiments also indicate that the cognitive agent bootstrapped using this framework can generalize to novel environments and be scaled to complex tasks.