Real-time recognition of piano keys based on mobile devices

Published: 01 Jan 2024, Last Modified: 13 Nov 2024, Venue: GAIIS 2024, License: CC BY-SA 4.0
Abstract: In practical settings, piano music transcription faces several challenges: long-tailed training data, inadequate transcription accuracy in multi-key scenarios, interference from environmental noise, and the inability of mobile devices to meet the demands of real-time transcription. This study builds on the Onsets and Frames model and introduces a series of improvements to enhance its performance: simplifying and optimizing the model structure, introducing data augmentation strategies, and refining the loss function. Precise quantization of the model, together with effective mobile deployment strategies, allows the model to run efficiently on mobile devices. Experiments in a mobile environment show that the proposed improvements significantly enhance transcription accuracy across a range of scenarios and achieve a 56.98% increase in transcription speed over the original model, meeting the requirements for real-time transcription. The new model is also substantially more robust to noise: at low signal-to-noise ratios, its performance degrades considerably less than that of the original model.
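The abstract does not say how noisy conditions were produced, so the following is only a minimal sketch of one common approach: mixing a noise recording into clean audio at a chosen signal-to-noise ratio, which can serve both as data augmentation and as a way to build low-SNR test sets. The function names `mix_at_snr` and `measured_snr_db`, the 16 kHz sample rate, and the synthetic tone and Gaussian noise are all illustrative assumptions, not details from the paper.

```python
import math
import random


def mix_at_snr(signal, noise, snr_db):
    """Return signal plus noise, with the noise scaled to give the requested SNR (dB).

    Hypothetical helper for illustration; not the paper's augmentation code.
    """
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = sum(n * n for n in noise) / len(noise)
    if signal_power == 0 or noise_power == 0:
        raise ValueError("signal and noise must both be non-silent")
    # SNR_dB = 10 * log10(P_signal / P_noise); solve for the noise gain.
    target_noise_power = signal_power / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(signal, noise)]


def measured_snr_db(clean, mixed):
    """Compute the SNR (dB) of a mix, given the clean signal it was built from."""
    signal_power = sum(s * s for s in clean) / len(clean)
    noise_power = sum((m - s) ** 2 for s, m in zip(clean, mixed)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


rng = random.Random(0)  # fixed seed so the example is reproducible
sample_rate = 16000     # assumed sample rate in Hz
# One second of a 440 Hz tone stands in for a clean piano recording.
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
# White Gaussian noise stands in for recorded environmental noise.
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

noisy = mix_at_snr(tone, noise, snr_db=5.0)
print(round(measured_snr_db(tone, noisy), 2))
```

Sweeping `snr_db` over a range of values and transcribing each mix would give a degradation curve of the kind the abstract alludes to when it compares the two models at low signal-to-noise ratios.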