ANAVI: Audio Noise Awareness by Visual Interaction

Published: 05 Sept 2024 · Last Modified: 05 Sept 2024 · CoRL 2024 · CC BY 4.0
Keywords: Robots, Acoustic Noise, Vision, Learning
TL;DR: We propose the Acoustic Noise Predictor (ANP), which learns how "loud" the robot's actions will be for a listener in a home or office.
Abstract: We propose Audio Noise Awareness by Visual Interaction (ANAVI) for robots. Everyone and everything makes noise, but while humans are aware of the impact of their sounds on those around them, robots are not. This ability is critical as robots become members of our households: they need to know to whisper when the baby has fallen asleep, and that it is fine to make noise during a party. The challenge of audio awareness starts with understanding the geometry and material composition of our indoor spaces. In this work, we generate data on how an 'impulse' sounds at different listener locations in typical home settings, and train our Acoustic Noise Predictor (ANP). Next, we collect acoustic profiles corresponding to different navigation actions. Unifying ANP with action acoustics, we demonstrate experiments with wheeled (Hello Robot Stretch) and legged (Unitree Go2) robots so that these robots adhere to the noise constraints of the environment. We record the decibel levels of the audio produced at the receiver's location and report the $\epsilon$-thresholded accuracy of our model against a distance-based baseline. All simulated and real-world data, code, and model checkpoints will be released.
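The evaluation metric named in the abstract, $\epsilon$-thresholded accuracy, is the fraction of predictions that fall within $\epsilon$ decibels of the measured level at the receiver. A minimal sketch of that computation follows; the function name, the choice of $\epsilon = 3$ dB, and the sample numbers are illustrative assumptions, not values from the paper:

```python
import numpy as np

def eps_thresholded_accuracy(pred_db, true_db, eps=3.0):
    """Fraction of predicted dB levels within eps decibels of the measured levels."""
    pred_db = np.asarray(pred_db, dtype=float)
    true_db = np.asarray(true_db, dtype=float)
    return float(np.mean(np.abs(pred_db - true_db) <= eps))

# Hypothetical ANP-style predictions vs. dB measured at the receiver's location.
predicted = [62.1, 55.4, 70.8, 48.9]
measured  = [60.0, 56.2, 65.0, 49.5]
print(eps_thresholded_accuracy(predicted, measured, eps=3.0))  # -> 0.75 (3 of 4 within 3 dB)
```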
Supplementary Material: zip
Submission Number: 401