Abstract: Highlights
• We present a novel dynamic neural rendering approach for monocular videos, leveraging the aggregation of multi-view feature vectors to enhance the quality of novel-view rendering.
• Combining multi-frame feature vectors can lose or merge intricate details, risking the crucial characteristics of the original data. To address this, we introduce a Ray-based cross-time transformer (see the sketch after this list).
• To mitigate potential blurring effects during feature aggregation, we propose a Global Spatio-Temporal Filter.
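As a rough illustration of the cross-time aggregation idea, the following minimal sketch (not the authors' implementation) fuses per-ray feature vectors drawn from several source frames by attending over the time axis; the module name, dimensions, and learned-query design are assumptions made for the example.

```python
import torch
import torch.nn as nn

class CrossTimeRayAggregator(nn.Module):
    """Illustrative cross-time attention over per-frame ray features (assumed design)."""
    def __init__(self, feat_dim=64, num_heads=4):
        super().__init__()
        # Multi-head attention applied across the source-frame (time) dimension.
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        # Learned query representing the target ray (hypothetical choice).
        self.query = nn.Parameter(torch.randn(1, 1, feat_dim))

    def forward(self, frame_feats):
        # frame_feats: (num_rays, num_frames, feat_dim), one feature vector per
        # source frame projected onto each target ray.
        q = self.query.expand(frame_feats.shape[0], -1, -1)
        fused, _ = self.attn(q, frame_feats, frame_feats)  # attend over frames
        return fused.squeeze(1)  # (num_rays, feat_dim) fused ray feature

# Example: fuse features from 8 source frames for 1024 target rays.
rays = torch.randn(1024, 8, 64)
agg = CrossTimeRayAggregator()
print(agg(rays).shape)  # torch.Size([1024, 64])
```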