Keywords: Multi-Modal Fusion
Abstract: We propose Dynamic Contrastive Reinforcement Learning (DCRL), an end-to-end framework for adaptive code-text alignment with multi-modal fusion. The method addresses the limitations of static fusion approaches by dynamically tuning contrastive-learning parameters according to the reinforcement learning agent's performance, so that alignment quality tracks task proficiency. Unlike conventional methods that apply a fixed margin and a fixed temperature to the contrastive loss, DCRL reparameterizes the margin and temperature as functions of the agent's cumulative reward and task-completion rate, allowing the embedding space to shift from broad exploration to precise alignment as training progresses. The framework incorporates a cross-modal transformer that fuses code and text embeddings and feeds them into a policy network for downstream tasks such as code generation and text summarization.
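The abstract's core mechanism can be illustrated with a minimal sketch. The schedule below is an assumption for illustration only (the paper does not specify the exact functional form): a proficiency score blends the agent's cumulative reward and task-completion rate, temperature decreases and margin increases with proficiency, and both feed a margin-adjusted InfoNCE-style contrastive loss. All function names (`dynamic_params`, `contrastive_loss`) and constants are hypothetical.

```python
import math

def dynamic_params(cumulative_reward, completion_rate,
                   tau_max=1.0, tau_min=0.07,
                   margin_min=0.1, margin_max=0.5):
    """Hypothetical schedule mapping agent performance to contrastive parameters."""
    # Blend reward (squashed to [0, 1] via tanh) with completion rate in [0, 1].
    proficiency = 0.5 * (math.tanh(cumulative_reward) + completion_rate)
    proficiency = min(max(proficiency, 0.0), 1.0)
    # High temperature / small margin early (broad exploration);
    # low temperature / large margin once the agent is proficient (sharp alignment).
    tau = tau_max - (tau_max - tau_min) * proficiency
    margin = margin_min + (margin_max - margin_min) * proficiency
    return tau, margin

def contrastive_loss(pos_sim, neg_sims, tau, margin):
    """Margin-adjusted InfoNCE over one positive similarity and several negatives."""
    logits = [(pos_sim - margin) / tau] + [s / tau for s in neg_sims]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_z)
```

Under this sketch, an untrained agent (low reward, low completion) sees a high temperature and small margin, encouraging a diffuse embedding space, while a proficient agent sees the opposite, tightening code-text alignment.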
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 25491