Keywords: Contrastive Training, Deep Learning, Machine Learning
TL;DR: A method for contrastive training over more than two related inputs
Abstract: This paper proposes a new method for contrastive training over multiple data points, focusing on the scaling problem that arises when using in-batch negatives. Our approach compares transformer training with dual encoders against training with multiple encoders. Our method offers a feasible way to improve loss modelling as encoders scale.
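To make the in-batch-negatives setting concrete, the following is a minimal NumPy sketch of the standard InfoNCE-style contrastive loss used with dual encoders, where each example's positive sits on the diagonal of a batch similarity matrix and all other rows act as negatives. The function name, temperature value, and toy embeddings are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def in_batch_contrastive_loss(q, d, temperature=0.05):
    """InfoNCE-style loss with in-batch negatives.

    q, d: (batch, dim) L2-normalised embeddings from two encoders.
    q[i] is paired with d[i]; every other d[j] in the batch serves
    as a negative for q[i]. Cost grows with the batch-size-squared
    similarity matrix, which is the scaling issue in question.
    """
    logits = q @ d.T / temperature                    # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))               # positives on the diagonal

# Toy batch: documents are noisy copies of the queries, so the loss is small.
rng = np.random.default_rng(0)
q = rng.normal(size=(8, 16))
q /= np.linalg.norm(q, axis=1, keepdims=True)
d = q + 0.1 * rng.normal(size=(8, 16))
d /= np.linalg.norm(d, axis=1, keepdims=True)
loss = in_batch_contrastive_loss(q, d)
```

Because every off-diagonal entry of the batch similarity matrix is a negative, the effective number of negatives is tied to batch size, which is what motivates alternatives when scaling beyond dual encoders.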