Referential communication in heterogeneous communities of pre-trained visual deep networks

Published: 17 Apr 2025, Last Modified: 17 Apr 2025 · Accepted by TMLR · CC BY 4.0
Abstract: As large pre-trained image-processing neural networks are being embedded in autonomous agents such as self-driving cars or robots, the question arises of how such systems can communicate with each other about the surrounding world, despite their different architectures and training regimes. As a first step in this direction, we systematically explore the task of referential communication in a community of heterogeneous state-of-the-art pre-trained visual networks, showing that they can develop, in a self-supervised way, a shared protocol to refer to a target object among a set of candidates. This shared protocol can also be used, to some extent, to communicate about previously unseen object categories of different granularity. Moreover, a visual network that was not initially part of an existing community can learn the community's protocol with remarkable ease. Finally, we study, both qualitatively and quantitatively, the properties of the emergent protocol, providing some evidence that it is capturing high-level semantic features of objects.
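To make the setup concrete, here is a minimal sketch of a referential game of the kind the abstract describes, not the authors' exact architecture: a sender and a receiver, each sitting on top of a different frozen pre-trained backbone, are trained self-supervised to pick the target out of a set of candidates. All dimensions, module names, and the random feature stand-ins are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Sender(nn.Module):
    """Maps the frozen backbone's feature of the target image to a continuous message."""
    def __init__(self, feat_dim: int, msg_dim: int):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(feat_dim, msg_dim), nn.Tanh())

    def forward(self, target_feats):               # (batch, feat_dim)
        return self.proj(target_feats)             # (batch, msg_dim)

class Receiver(nn.Module):
    """Scores each candidate image against the received message."""
    def __init__(self, feat_dim: int, msg_dim: int):
        super().__init__()
        self.proj = nn.Linear(feat_dim, msg_dim)

    def forward(self, message, candidate_feats):   # (batch, msg_dim), (batch, n_cand, feat_dim)
        keys = self.proj(candidate_feats)          # (batch, n_cand, msg_dim)
        return torch.einsum("bd,bnd->bn", message, keys)  # per-candidate similarity scores

# Hypothetical sizes: heterogeneous backbones may expose different feature dimensions.
sender_feat_dim, receiver_feat_dim, msg_dim, n_cand, batch = 768, 2048, 16, 5, 32

sender, receiver = Sender(sender_feat_dim, msg_dim), Receiver(receiver_feat_dim, msg_dim)
optim = torch.optim.Adam(list(sender.parameters()) + list(receiver.parameters()), lr=1e-4)

# Stand-ins for features extracted by two different frozen pre-trained networks.
sender_feats = torch.randn(batch, sender_feat_dim)               # sender's view of the target
receiver_feats = torch.randn(batch, n_cand, receiver_feat_dim)   # receiver's view of all candidates
target_idx = torch.randint(0, n_cand, (batch,))                  # target's position among the candidates

scores = receiver(sender(sender_feats), receiver_feats)
loss = F.cross_entropy(scores, target_idx)   # self-supervised: the "label" is just the target's position
loss.backward()
optim.step()
```

Only the small sender/receiver heads are trained here; the pre-trained visual backbones that would produce `sender_feats` and `receiver_feats` stay frozen, which is what allows networks with different architectures and training regimes to converge on a shared protocol.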
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission:

For the camera-ready version, we have added the new experiments requested by the reviewers (communication without training; longer population training; sequential population training). We have also highlighted the main contributions of the work in the introduction and incorporated all the fixes and requests for clarification suggested by the reviewers.

We again thank the editor and the reviewers for helping us improve the paper.

Assigned Action Editor: Han-Jia Ye
Submission Number: 3816