
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>4DEditPro: Progressively Editing 4D Scenes from Monocular Videos with Text Prompts</title>

  <!-- Global site tag (gtag.js) - Google Analytics -->
  <script async src="https://www.googletagmanager.com/gtag/js?id=G-PYVRSFMDRL"></script>
  <script>
    window.dataLayer = window.dataLayer || [];

    function gtag() {
      dataLayer.push(arguments);
    }

    gtag('js', new Date());

    gtag('config', 'G-PYVRSFMDRL');
  </script>
  
  <link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro"
        rel="stylesheet">

  <link rel="stylesheet" href="./static/css/bulma.min.css">
  <link rel="stylesheet" href="./static/css/bulma-carousel.min.css">
  <link rel="stylesheet" href="./static/css/bulma-slider.min.css">
  <link rel="stylesheet" href="./static/css/fontawesome.all.min.css">
  <link rel="stylesheet"
        href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
  <link rel="stylesheet" href="./static/css/index.css">
  <link rel="stylesheet" href="./static/css/result.css">
  <link rel="icon" href="./static/images/page.svg">


  <script defer src="./static/js/fontawesome.all.min.js"></script>
  <script src="./static/js/bulma-carousel.min.js"></script>
  <script src="./static/js/bulma-slider.min.js"></script>
  <script src="./static/js/index.js"></script>

</head>
<body>

<section class="hero">
  <div class="hero-body">
    <div class="container is-max-desktop">
      <div class="columns is-centered">
        <div class="column has-text-centered">
          <h1 class="title is-1 publication-title">4DEditPro: Progressively Editing 4D Scenes from Monocular Videos with Text Prompts</h1>
          
          <div class="is-size-5 publication-authors">
            <span class="footnote">Submitted to ICLR 2025</span>
          </div>

        </div>
      </div>
    </div>
  </div>
</section>

<div class="my-hr">
  <hr>
</div>

<section class="section">
  <div class="container is-max-desktop">
    <!-- Abstract. -->
    <div class="columns is-centered has-text-centered">
      <div class="column is-four-fifths">
        <h2 class="title is-3">Abstract</h2>
        <div class="content has-text-justified">
          <p>
            Editing 4D scenes using text prompts is a novel task made possible by advances in text-to-image diffusion models and differentiable scene representations. However, conventional approaches typically take multi-view images or videos with camera poses as input. When applied to monocular videos, they produce inconsistent edits because they rely on iterative per-image editing and lack multi-view supervision.
            Furthermore, these techniques usually require external Structure-from-Motion (SfM) libraries for camera pose estimation, which can be impractical for casual monocular videos. 
            To tackle these hurdles, we present <b>4DEditPro</b>, a novel framework that enables consistent 4D scene editing on casual monocular videos with text prompts. 
            In 4DEditPro, the Temporally Propagated Editing (TPE) module guides the diffusion model to keep edits temporally coherent across all input frames.
            The Spatially Propagated Editing (SPE) module introduces auxiliary novel views near the camera trajectory to enhance the spatial consistency of edited scenes.
            4DEditPro employs a pose-free 4D Gaussian Splatting (4DGS) approach for reconstructing dynamic scenes from monocular videos, which progressively recovers relative camera poses, reconstructs the scene, and facilitates scene editing.
            Extensive experiments, including quantitative evaluations and user studies, demonstrate the effectiveness of our approach.
          </p>
        </div>
      </div>
    </div>
    <!--/ Abstract. -->
    <hr>

    <!-- Teaser video-->
    <section class="hero teaser">
      <div class="container is-max-desktop">
        <div class="hero-body">
          <video poster="" id="tree" autoplay controls muted loop height="100%">
            <source src="static/videos/demo_video.mp4"
            type="video/mp4">
          </video>
          <h2 class="subtitle has-text-centered">
            The demo video of 4DEditPro.
          </h2>
        </div>
      </div>
    </section>
    <!-- End teaser video -->

    <div class="columns is-centered has-text-centered">
        <div class="column is-full-width">
          <h2 class="title is-3">Method</h2>
          <img src="./static/images/framework.png" alt="Overview of the 4DEditPro framework">
          <br>
          <br>
          <div class="content has-text-justified">
            <p>
              <b>Overview of 4DEditPro.</b> The pipeline uses the TPE module to generate a temporally consistent edited video sequence, 
              the SPE module to interpolate and refine novel views near the camera trajectory of the original monocular video, and a progressive 4D Gaussian 
              representation to estimate camera poses and reconstruct the 4D scene.
            </p>
          </div> 
          
        </div>
      </div>
    
    <hr>


    <div class="columns is-centered has-text-centered">
      <div class="column is-full-width">
        <h2 class="title is-3">Results</h2>
        

      
      

      <table>
        
        <tr>
          <td colspan="1">
            Original
          </td>
        </tr>
        
        <tr>
          <td style="width: 24%;">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/blackswan_ori_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
        </tr>

        <tr>
          <td colspan="8">
            An origami black swan is swimming over the river.
          </td>
        </tr>

            
        <tr>
          <td style="width: 24%">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/prev_blackswan_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
          <td style="width: 24%">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/prev_blackswan_depth_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
          <td></td>
          <td style="width: 24%">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/blackswan_ours_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
          <td style="width: 24%">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/blackswan_depth_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
        </tr>
        <tr>
          <td colspan="2">
            GSEditor-4D
          </td>
          
          <td></td>
          <td colspan="2">
            <b>Ours</b> 
          </td>
        </tr>
        
        <tr>
          <td colspan="1">
            Original
          </td>
        </tr>

        <tr>
          <td style="width: 24%">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/rhino_ori_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
        </tr>

        <tr>
          <td colspan="8">
            A rhino is walking at night.
          </td>
          
        </tr>

        <tr>
          <td style="width: 24%">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/prev_rhino_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
          <td style="width: 24%">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/prev_rhino_depth_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
          <td></td>
          <td style="width: 24%">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/rhino_night_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
          <td style="width: 24%">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/rhino_depth_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
        </tr>

        <tr>
          <td colspan="2">
            GSEditor-4D
          </td>
          
          <td></td>
          <td colspan="2">
            <b>Ours</b> 
          </td>
        </tr>

        <tr>
          <td colspan="1">
            Original
          </td>
        </tr>

        <tr>
          <td style="width: 24%">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/boat_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
        </tr>

        <tr>
          <td colspan="8">
            A Steampunk boat is sailing.
          </td>
        </tr>

        <tr>
          <td style="width: 24%">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/prev_boat_steampunk_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
          <td style="width: 24%">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/prev_boat_depth_steampunk_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
          <td></td>
          <td style="width: 24%">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/boat_steampunk_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
          <td style="width: 24%">
            <video poster="" id="tree" autoplay controls muted loop height="100%">
              <source src="./static/videos/boat_steampunk_depth_x264.mp4"
                      type="video/mp4">
            </video>
          </td>
        </tr>

        <tr>
          <td colspan="2">
            GSEditor-4D
          </td>
          
          <td></td>
          <td colspan="2">
            <b>Ours</b> 
          </td>
        </tr>

      </table>
      </div>
    </div>

  </div>   
</section>  

</body>
</html>