Abstract: The landscape of accessibility features in video games remains inconsistent, posing challenges for gamers who seek experiences tailored to their needs. Accessibility features, such as subtitles, are widely used by players but are difficult to test manually because of the large scope of games and the variability in how subtitles can appear. In this article, we introduce an automated approach (EchoTest) that extracts subtitles and spoken audio from a gameplay video, converts them into text, and compares the two to detect discrepancies such as typos, desynchronization, and missing text. Game developers can use EchoTest to identify discrepancies between subtitles and spoken audio, enabling them to better test the accessibility of their games. In an empirical study on gameplay videos from 15 popular games, EchoTest verifies discrepancies between subtitles and audio with a precision of 98% and a recall of 89%. In addition, EchoTest performs well on a challenging generated benchmark, with a precision of 73% and a recall of 99%.
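
To make the comparison step concrete, the following is a minimal sketch of how a subtitle line and a speech transcript might be compared once both are available as text. It is not EchoTest's implementation: the function names `normalize` and `find_discrepancies` are hypothetical, the word-level diff from Python's standard `difflib` module is an assumed stand-in for whatever matching the approach actually uses, and the OCR and speech-to-text stages are assumed to have already run.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def find_discrepancies(subtitle: str, transcript: str) -> list[tuple[str, list[str], list[str]]]:
    """Return word-level differences between a subtitle and a transcript.

    Each result is (kind, subtitle_words, spoken_words), where kind is one of
    difflib's opcode tags: 'replace', 'delete', or 'insert'.
    This is an illustrative assumption, not the method described in the article.
    """
    sub_words = normalize(subtitle)
    spoken_words = normalize(transcript)
    matcher = SequenceMatcher(a=sub_words, b=spoken_words, autojunk=False)
    issues = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            issues.append((tag, sub_words[i1:i2], spoken_words[j1:j2]))
    return issues


if __name__ == "__main__":
    # A typo in the subtitle shows up as a 'replace' entry.
    print(find_discrepancies("Follow me to the brdge.", "Follow me to the bridge"))
    # Words that are spoken but never shown appear as an 'insert' entry.
    print(find_discrepancies("Follow me", "Follow me to the bridge"))
```

In this sketch a typo surfaces as a `replace` entry and missing subtitle text as an `insert` entry. Desynchronization is not covered, since detecting it would also require timestamps for both the subtitle frames and the audio segments.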