Abstract: As social media communication grows, reliable indicators of multimedia quality have become a prerequisite for improving user experience. In this paper, we propose a multiscale spatiotemporal pyramid attention (SPA) block and use it to build a blind video quality assessment (VQA) method that evaluates the perceptual quality of videos. First, we extract motion information from the video frames at different temporal scales to form a feature pyramid, which provides a feature representation covering multiple visual perceptions. Second, we propose an SPA module that extracts multiscale spatiotemporal information across these temporal scales and models cross-scale dependencies. Finally, the features produced by a network of stacked spatiotemporal pyramid blocks are passed through a regression network to predict the perceived quality. Experimental results demonstrate that our method is on par with state-of-the-art approaches. The source code is available online at https://github.com/Land5cape/SPBVQA.
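The first step, building a feature pyramid from motion information at several temporal scales, can be illustrated with a minimal sketch. The abstract does not specify how motion is extracted, so this example assumes plain absolute frame differences taken at strides 1, 2, and 4. The names `frame_difference` and `temporal_motion_pyramid`, the stride values, and the toy clip are all hypothetical and are not taken from the released code.

```python
from typing import List, Sequence

Frame = List[List[float]]  # one grayscale frame as rows of pixel intensities


def frame_difference(a: Frame, b: Frame) -> Frame:
    """Absolute per-pixel difference; an assumed, crude stand-in for motion."""
    return [[abs(pb - pa) for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def temporal_motion_pyramid(frames: Sequence[Frame],
                            strides: Sequence[int] = (1, 2, 4)) -> List[List[Frame]]:
    """Return one list of motion maps per temporal stride (assumed strides)."""
    pyramid = []
    for s in strides:
        # Compare frames that are s steps apart, advancing s frames each time.
        level = [frame_difference(frames[t], frames[t + s])
                 for t in range(0, len(frames) - s, s)]
        pyramid.append(level)
    return pyramid


# Toy clip: 8 frames of 2x2 pixels whose brightness increases by 1 per frame.
clip = [[[float(t)] * 2 for _ in range(2)] for t in range(8)]
pyramid = temporal_motion_pyramid(clip)
level_sizes = [len(level) for level in pyramid]
```

In this toy setting, larger strides yield fewer but larger-magnitude motion maps, which is the sense in which each pyramid level offers a different temporal view of the same clip. A learned model would replace the raw differences with feature maps from a network.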