Abstract: With breakthroughs in deep learning algorithms, the
practice of manipulating audio to produce believable fakes
is expanding rapidly. This survey paper provides a com-
prehensive overview of the current state of deepfake audio
research, encompassing generation methods, online plat-
forms to generate fake audio, the latest detection tech-
niques, human perception of fake audio, and the underly-
ing security concerns. We examine different methods for
speech synthesis, audio splicing, and voice cloning, point-
ing out their advantages and disadvantages. Furthermore,
we investigate various detection algorithms, including
supervised, unsupervised, and hybrid techniques, and
assess their effectiveness in detecting audio manipulation.
We review deepfake audio’s impacts, including reputational
harm, fraud, and the spread of misinformation. We
present a concise analysis of AI versus human detection of
deepfake audio, drawing insights from existing literature
and validating them through our experiments. Finally, we
highlight future research directions and recommendations
for mitigating the societal risks associated with this power-
ful technology.