Abstract: In this letter, we propose a new approach to improving the performance of automatic speech segmentation techniques for concatenative text-to-speech synthesis. Instead of relying on a single automatic segmentation machine (ASM), we use multiple ASMs to determine the final boundary time marks. For each boundary, the best time mark is selected from the candidates produced by the individual ASMs according to the contextual condition. The experimental results show that our approach dramatically improves the segmentation accuracy.
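The abstract does not specify how the contextual condition is defined or how the best ASM is chosen for it. As an illustrative sketch only, the following assumes that a context is a pair of phone classes around the boundary and that the preferred ASM for a context is the one with the lowest mean absolute boundary error on manually labeled training data. The names `train_selector`, `select_boundary`, and `default_asm` are hypothetical and not taken from the letter.

```python
from collections import defaultdict


def train_selector(training_boundaries):
    """Pick, for each context, the ASM with the lowest mean absolute error.

    training_boundaries: iterable of (context, asm_times, reference_time),
    where asm_times holds one candidate time mark per ASM (in seconds) and
    reference_time is the manually labeled boundary (assumed available).
    """
    errors = defaultdict(lambda: defaultdict(list))
    for context, asm_times, reference_time in training_boundaries:
        for asm_id, time_mark in enumerate(asm_times):
            errors[context][asm_id].append(abs(time_mark - reference_time))

    selector = {}
    for context, per_asm in errors.items():
        # Mean absolute error is an assumed selection criterion.
        selector[context] = min(
            per_asm, key=lambda asm_id: sum(per_asm[asm_id]) / len(per_asm[asm_id])
        )
    return selector


def select_boundary(selector, context, asm_times, default_asm=0):
    """Return the time mark of the ASM preferred for this context.

    Contexts unseen in training fall back to default_asm (an assumption).
    """
    return asm_times[selector.get(context, default_asm)]


# Toy data: two ASMs, contexts given as (left phone class, right phone class).
training = [
    (("vowel", "stop"), [0.100, 0.120], 0.118),
    (("vowel", "stop"), [0.200, 0.215], 0.214),
    (("stop", "vowel"), [0.300, 0.330], 0.302),
]
selector = train_selector(training)
final_mark = select_boundary(selector, ("vowel", "stop"), [0.500, 0.520])
```

In this toy setup the second ASM is preferred for vowel-to-stop boundaries and the first for stop-to-vowel boundaries, so the final time mark comes from a different machine depending on the context.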
External IDs: dblp:journals/spl/ParkK06