Abstract: Highlights
• Novel framework to design tailored, unified, and parameter-efficient AVSR systems.
• First to harness the flexibility and interpretability of the Branchformer encoder.
• Experiments for English and Spanish show our AVSR framework’s effectiveness.
• Competitive state-of-the-art performance with nearly 50% fewer model parameters.
• Explainable insights into audiovisual speech processing.