Folder Structure Overview
=========================

This folder contains three main subfolders. Each holds audio clips for comparing MAVE against a baseline method on a specific task.

1. RealEdit: MAVE vs. FluentSpeech
   -------------------------------
   Folder: RealEdit_MAVE_VS_FluentSpeech
   Task: Speech editing
   Contents:
   - MAVE.wav                    Audio output from MAVE (our model)
   - FluentSpeech.wav           Audio output from FluentSpeech
   - orig_edited.docx           Original and edited text with modifications highlighted

2. RealEdit: MAVE vs. VoiceCraft
   -----------------------------
   Folder: RealEdit_MAVE_VS_VoiceCraft
   Task: Speech editing
   Contents:
   - MAVE.wav                    Audio output from MAVE (our model)
   - VoiceCraft.wav             Audio output from VoiceCraft
   - orig_edited.docx           Original and edited text with modifications highlighted

3. LibriTTS: MAVE vs. VoiceCraft
   -----------------------------
   Folder: TTS_LibriTTS
   Task: Zero-shot text-to-speech
   Contents:
   - MAVE.wav                    Audio output from MAVE (our model)
   - VoiceCraft.wav             Audio output from VoiceCraft
   - prompt.wav                 Reference audio (the first 3 seconds are used for both models, as discussed in the paper)
   - prompt_generated.docx      Prompt text and target generation text
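
To hear only the portion of prompt.wav that the models received, the first 3 seconds can be cut out with Python's standard library. This is a minimal sketch, not the preprocessing used in the paper: the helper name trim_wav and the output file name are hypothetical, and the code assumes prompt.wav is an uncompressed PCM WAV file.

```python
import wave


def trim_wav(src_path, dst_path, seconds=3.0):
    """Hypothetical helper: copy the first `seconds` of a PCM WAV file.

    Returns the number of audio frames written. If the source is shorter
    than `seconds`, the whole file is copied.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        n_frames = min(src.getnframes(), int(seconds * src.getframerate()))
        frames = src.readframes(n_frames)

    with wave.open(dst_path, "wb") as dst:
        # Reuse channels, sample width and sample rate from the source;
        # the frame count in the header is corrected when the file closes.
        dst.setparams(params)
        dst.writeframes(frames)
    return n_frames
```

For example, trim_wav("TTS_LibriTTS/prompt.wav", "prompt_3s.wav") would write a 3-second clip next to the current working directory, assuming the script is run from this folder.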

Note: All audio comparisons use consistent evaluation protocols as described in the associated paper.
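
The layout listed above can be checked after download with a short script. This is only a sketch: the EXPECTED table and the missing_files helper are illustrative names, and the script assumes the files sit directly inside each subfolder, exactly as named in this overview.

```python
from pathlib import Path

# Subfolder -> files it should contain, copied from the overview above.
EXPECTED = {
    "RealEdit_MAVE_VS_FluentSpeech": [
        "MAVE.wav", "FluentSpeech.wav", "orig_edited.docx",
    ],
    "RealEdit_MAVE_VS_VoiceCraft": [
        "MAVE.wav", "VoiceCraft.wav", "orig_edited.docx",
    ],
    "TTS_LibriTTS": [
        "MAVE.wav", "VoiceCraft.wav", "prompt.wav", "prompt_generated.docx",
    ],
}


def missing_files(root):
    """Hypothetical helper: list expected files that are absent under `root`."""
    root = Path(root)
    return [
        f"{folder}/{name}"
        for folder, names in EXPECTED.items()
        for name in names
        if not (root / folder / name).is_file()
    ]


if __name__ == "__main__":
    # Run from inside this folder; prints one line per missing file.
    for path in missing_files("."):
        print("missing:", path)
```

An empty result means every file named in this overview was found.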