Keywords: multimodal, human behavior, nlp
TL;DR: We train a multimodal model on both code and developer interactions for the defect detection task. We found it overfits due to weak-alignment between the two data domains.
Abstract: Recent advances in natural language processing has seen the rise of language models trained on code. Of great interest is the ability of these models to find and classify defects in existing code bases. These models have been applied to defect detection but improvements between these models has been minor. Literature from cyber security highlights how developer behaviors are often the cause of these defects. In this work we propose to approach the defect detection problem in a multimodal manner using weakly-aligned code and the developer workflow data. We find that models trained on code and developer interactions tend to overfit and do not generalize because of weak-alignment between the code and developer workflow data.