<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">

    <link rel="preconnect" href="https://fonts.googleapis.com">
    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
    <link href="https://fonts.googleapis.com/css2?family=Roboto:wght@300;400;500" rel="stylesheet">
    <link href="https://fonts.googleapis.com/css2?family=Google+Sans:wght@200;400" rel="stylesheet">
    <link rel="stylesheet" type="text/css" href="styles.css" />

    <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.6.1/jquery.min.js"></script>

    <meta property="og:site_name" content="Video Adapter: Probabilistic Adaptation of Black-Box Text-to-Video Models" />
    <meta property="og:type" content="video.other" />
    <meta property="og:title" content="Video Adapter" />
    <meta property="og:description" content="Probabilistic Adaptation of Black-Box Text-to-Video Models" />

    <title>Video Adapter</title>
    <script src="js/mobile.js"></script>
  </head>
  <body>
    <div class="video_1" id="video_gallery">
    </div>

    <div class="header" style="background-color: #202020;" id="header_box">
      <div style="position: sticky; top: 0; height: 100vh;" id="header_container">
        <div id="header_video" class="stack" style="display: flex; align-content: center; height: 100vh; overflow-y: hidden;">
          <video id="video" style="opacity: 0.0; width: 100%; object-fit: contain;" autoplay loop muted playsinline><source src="videos/anime1-1.mp4" type="video/mp4"></video>
        </div>
        <div class="stack">
          <div class="title" id="title">
            <h1>Video Adapter</h1>
            <h2>Probabilistic Adaptation of Black-Box Text-to-Video Models</h2>
          </div>
        </div>
      </div>
    </div>

    <div class="video_2" id="video_gallery2">
    </div>

    <div class="section_header">
      <h1>Probabilistic Adaptation of Black-Box Text-to-Video Models</h1>
    </div>

    <div class="abstract">
      <div style="margin: 50px auto;">
        <h1 style="font-weight: 400;">Abstract</h1>
        <p style="padding-bottom: 25px; text-align: justify; font-weight: 300;">Large text-to-video models trained on internet-scale data have demonstrated exceptional capabilities in generating high-fidelity videos from arbitrary textual descriptions. However, similar to proprietary language models, large text-to-video models are often black boxes whose weight parameters are not publicly available, posing a significant challenge to adapting these models to specific domains such as robotics, animation, and personalized stylization. Inspired by how a large language model can be prompted to perform new tasks without access to the model weights, we investigate how to adapt a black-box pretrained text-to-video model to a variety of downstream domains without weight access to the pretrained model. In answering this question, we propose Video Adapter, which leverages the score function of a large pretrained video diffusion model as a probabilistic prior to guide the generation of a task-specific small video model. Our experiments show that, by incorporating broad knowledge and fidelity of the pretrained model probabilistically, a small model with as few as 1.25% of the parameters of the pretrained model can generate high-quality yet domain-specific videos for a variety of downstream domains such as animation, egocentric modeling, and modeling of simulated and real-world robotics data. As large text-to-video models are starting to become available as a service similar to large language models, we advocate for private institutions to expose scores of video diffusion models as outputs in addition to generated videos to allow flexible adaptation of large pretrained text-to-video models by the general public.</p>
      </div>
    </div>

    <div class="section_header">
      <h1>Video Adapter Framework</h1>
      <h2>Adaptation through Score Composition</h2>
    </div>


    <div class="abstract">
      <div style="margin: 50px auto;">
      <p style="padding-bottom: 25px; text-align: justify; font-weight: 300;">Video Adapter only requires training a small domain-specific text-to-video model with orders of magnitude fewer parameters than a large video model pretrained on internet data.
During sampling, Video Adapter composes the scores of the pretrained and the domain-specific video models,
achieving high-quality and flexible video synthesis.</p>
      </div>
    </div>

    <div id="zoom_box2">
        <div style="position: sticky; top: 0; left: 0; height: calc(100vh + 100px); width: 100vw; overflow-x: hidden;">
              <video autoplay loop muted playsinline><source src="videos/framework.mp4" type="video/mp4"></video>
        </div>
    </div>

    <div class="section_header">
      <h1>Video Adapter for Animation and Robotics</h1>
    </div>

    <div class="abstract">
      <div style="margin: 50px auto;">
	<p style="padding-bottom: 25px; text-align: justify; font-weight: 300;">We can train a small video model on an animation style of a particular artist (Detective Conan). The pretrained prior can maintain the artist's style while changing the background. We can also train task-specific small edge-to-sim and edge-to-real models on robotic videos. The pretrained prior can be used to modify the styles of the videos as a form of domain randomization.
	</p>
      </div>
    </div>
    
    <div class="video_3" id="video_gallery3">
    </div>


    <div class="section_header">
      <h1>Ablation against Naive Classifier-Free Score Mix</h1>
    </div>

    <div class="abstract">
      <div style="margin: 50px auto;">
	<p style="padding-bottom: 25px; text-align: justify; font-weight: 300;">Below we show an ablation of Video Adapter (with different prior strengths) against naively combining classifier-free scores. Video Adapter modifies the style as instructed (top), whereas directly mixing two classifier-free guidance scores fails to adapt the video (bottom).

	</p>
      </div>
    </div>
    
    <div class="video_4" id="video_gallery4">
    </div>
    
    
    <script>

      function addVideoPane4() {
	  videos = [
	      {"caption": "Edges", "src": "videos/edge_condition.mp4"},
	      {"caption": "Small model", "src": "videos/base_ours.mp4"},
	      {"caption": "Video Adapter - Comic Book (weight 0.5)", "src": "videos/05_ours.mp4"},
	      {"caption": "Video Adapter - Comic Book (weight 1.0)", "src": "videos/10_ours.mp4"},
	      {"caption": "Edges", "src": "videos/edge_condition.mp4"},
	      {"caption": "Small model", "src": "videos/base_ours.mp4"},
	      {"caption": "Classifier-Free Score Mix (weight 0.5)", "src": "videos/05_cfg.mp4"},
	      {"caption": "Classifier-Free Score Mix (weight 1.0)", "src": "videos/10_cfg.mp4"},	     
 ];

 globalVideoIdx = 0;

 function updateVideoSlot(idxVideo, idxSlot) {
   slot = $("#video_gallery4 > div").eq(idxSlot);

   src = videos[idxVideo].src;
   caption = videos[idxVideo].caption;

   slot.find("div.video_container > video > source").attr("src", src);
   slot.find("div.video_container > video")[0].load();
   slot.find("div.video_container > div.caption > div").text(caption);
 }

 function updateVideoGallery() {
   numVideos = videos.length;
   idxVideo = Math.floor(Math.random() * numVideos);

   numSlots = $("#video_gallery4 > div").length;
   idxSlot = Math.floor(Math.random() * numSlots);

   globalVideoIdx = ++globalVideoIdx % numVideos;
   updateVideoSlot(globalVideoIdx, idxSlot);
 }

 function updateAllVideoSlots() {
   numVideos = videos.length;
   numSlots = $("#video_gallery4 > div").length;

   globalVideoIdx = 0;
   for (var i = 0; i < numSlots; ++i, ++globalVideoIdx) {
     globalVideoIdx = globalVideoIdx % numVideos;
     updateVideoSlot(globalVideoIdx, i);
   }
 }

 function addVideoSlot(idx) {
   if (idx < 8) {
     template = `
   <div class="video_wrapper">
     <div class="video_container">
       <video autoplay loop muted playsinline><source src="videos/anime1-1.mp4" type="video/mp4"></video>
       <div class="caption">
	 <div></div>
       </div>
     </div>
   </div>
   `;
   } else {
     template = `
   <div class="video_wrapper nomobile">
     <div class="video_container">
       <video autoplay loop muted playsinline><source src="videos/anime1-1.mp4" type="video/mp4"></video>
       <div class="caption">
	 <div></div>
       </div>
     </div>
   </div>
   `;
   }

   $("#video_gallery4").append(template);
 }

 function playVideo(id, src) {
   $(id + " video > source").attr("src", src);
   $(id + " video")[0].load();
 }

 for (var i = 0; i < 8; ++i) {
   addVideoSlot(i);
 }

    updateAllVideoSlots();
      }
      

      function addVideoPane3() {
	  videos = [
	      {"caption": "Edge Condition - Detective Conan Anime", "src": "videos/conan_edge_resize.mp4"},
	      {"caption": "Generation - Detective Conan Anime", "src": "videos/conan.mp4"},
     {"caption": "Adapted Generation - Abstract Style", "src": "videos/conan_abstract.mp4"},
     {"caption": "Adapted Generation - Dark Background", "src": "videos/conan_dark.mp4"},
     
     {"caption": "Edge Condition - Move the Red Block towards the Blue Block", "src": "videos/edge5_resize.mp4"},
     {"caption": "Generation - Simulation", "src": "videos/sim5.mp4"},
     {"caption": "Generation - Real", "src": "videos/real5.mp4"},
     {"caption": "Adapted Generation - Colorful Stylisation", "src": "videos/adapt5.mp4"},

     {"caption": "Edge Condition - Push the Green Star into the Red Circle", "src": "videos/edge9_resize.mp4"},
     {"caption": "Generation - Simulation", "src": "videos/sim9.mp4"},
     {"caption": "Generation - Real", "src": "videos/real9.mp4"},
     {"caption": "Adapted Generation - Dark Stylisation", "src": "videos/adapt9.mp4"},
 ];

 globalVideoIdx = 0;

 function updateVideoSlot(idxVideo, idxSlot) {
   slot = $("#video_gallery3 > div").eq(idxSlot);

   src = videos[idxVideo].src;
   caption = videos[idxVideo].caption;

   slot.find("div.video_container > video > source").attr("src", src);
   slot.find("div.video_container > video")[0].load();
   slot.find("div.video_container > div.caption > div").text(caption);
 }

 function updateVideoGallery() {
   numVideos = videos.length;
   idxVideo = Math.floor(Math.random() * numVideos);

   numSlots = $("#video_gallery3 > div").length;
   idxSlot = Math.floor(Math.random() * numSlots);

   globalVideoIdx = ++globalVideoIdx % numVideos;
   updateVideoSlot(globalVideoIdx, idxSlot);
 }

 function updateAllVideoSlots() {
   numVideos = videos.length;
   numSlots = $("#video_gallery3 > div").length;

   globalVideoIdx = 0;
   for (var i = 0; i < numSlots; ++i, ++globalVideoIdx) {
     globalVideoIdx = globalVideoIdx % numVideos;
     updateVideoSlot(globalVideoIdx, i);
   }
 }

 function addVideoSlot(idx) {
   if (idx < 8) {
     template = `
   <div class="video_wrapper">
     <div class="video_container">
       <video autoplay loop muted playsinline><source src="videos/anime1-1.mp4" type="video/mp4"></video>
       <div class="caption">
	 <div></div>
       </div>
     </div>
   </div>
   `;
   } else {
     template = `
   <div class="video_wrapper nomobile">
     <div class="video_container">
       <video autoplay loop muted playsinline><source src="videos/anime1-1.mp4" type="video/mp4"></video>
       <div class="caption">
	 <div></div>
       </div>
     </div>
   </div>
   `;
   }

   $("#video_gallery3").append(template);
 }

 function playVideo(id, src) {
   $(id + " video > source").attr("src", src);
   $(id + " video")[0].load();
 }

 for (var i = 0; i < 12; ++i) {
   addVideoSlot(i);
 }

    updateAllVideoSlots();
      }

      function addVideoPane2() {
 hdvideos = [
     {"caption": "Adapted Generation - Walks on the Grass", "src": "videos/nav1-crop.mp4"},
     {"caption": "Adapted Generation - Walks around", "src": "videos/nav2-crop.mp4"},
     {"caption": "Adapted Generation - Walks forward", "src": "videos/nav3-crop.mp4"},
     {"caption": "Adapted Generation - Walks a Few Steps", "src": "videos/nav4-crop.mp4"},
     {"caption": "Adapted Generation - Walks", "src": "videos/nav5-crop.mp4"},
     {"caption": "Adapted Generation - Walks towards the Shelf", "src": "videos/nav6-crop.mp4"},
     
     {"caption": "Adapted Generation - Removes Left Hand from Cable at the Bicycle's Seat", "src": "videos/gen1-crop.mp4"},
     {"caption": "Adapted Generation - Drills the Wood with the Drill", "src": "videos/gen2-crop.mp4"},
     {"caption": "Adapted Generation - Drops a Fabric", "src": "videos/gen3-crop.mp4"},
     {"caption": "Adapted Generation - Drops the Slice of Cheese on the Sink", "src": "videos/gen4-crop.mp4"},
     {"caption": "Adapted Generation - Puts Wood on Table", "src": "videos/gen5-crop.mp4"},
     {"caption": "Adapted Generation - Holds the Book with Both Hand", "src": "videos/gen6-crop.mp4"},
     {"caption": "Adapted Generation - Unties the Plant", "src": "videos/gen7-crop.mp4"},
     {"caption": "Adapted Generation - Scrubs Hands on the Sand on the Floor", "src": "videos/gen8-crop.mp4"},

     {"caption": "Adapted Generation - Put Pot or Pan in Sink", "src": "videos/combine1.mp4"},
     {"caption": "Adapted Generation - Put Corn into Bowl", "src": "videos/combine4.mp4"},

     {"caption": "Adapted Generation - Looks at the Phone Screen", "src": "videos/ego1-crop.mp4"},
     {"caption": "Adapted Generation - Wipes the Tip of the Detail Needle with Her Left Hand", "src": "videos/ego2-crop.mp4"},
     {"caption": "Adapted Generation - Paints the Piece of Wood", "src": "videos/ego3-crop.mp4"},
     {"caption": "Adapted Generation - Pats the Clay in the Mould with His Right Hand", "src": "videos/ego4-crop.mp4"},
     {"caption": "Adapted Generation - Dips Paintbrush in Paint in the Bucket", "src": "videos/ego5-crop.mp4"},
     {"caption": "Adapted Generation - Rubs the Pink Ball in Both Hand", "src": "videos/ego6-crop.mp4"},
     {"caption": "Adapted Generation - Adjusts the Jig Saw with Both Hands", "src": "videos/ego7-crop.mp4"},
     {"caption": "Adapted Generation - Walks around", "src": "videos/ego8-crop.mp4"},
     {"caption": "Adapted Generation - Dries the Greens with Kitchen Tissue", "src": "videos/ego9-crop.mp4"},
     {"caption": "Adapted Generation - Picks a Scrubber", "src": "videos/ego10-crop.mp4"},
     {"caption": "Adapted Generation - Cuts the Piece of Wood with the Miter Saw", "src": "videos/ego11-crop.mp4"},
     {"caption": "Adapted Generation - Look at the Food", "src": "videos/ego12-crop.mp4"},
     {"caption": "Adapted Generation - Place the Piece", "src": "videos/ego13-crop.mp4"},
     {"caption": "Adapted Generation - Turns over the Dough", "src": "videos/ego14-crop.mp4"},

     {"caption": "Adapted Generation - Put Eggplant in Pot", "src": "videos/combine2.mp4"},
     {"caption": "Adapted Generation - Put Pan from Drying Rack into Sink", "src": "videos/combine3.mp4"},
 ];
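	  // Shuffle the list once per page load. sort() reorders hdvideos in place,
	  // and a random comparator gives only an approximate shuffle.
	  eg4d_videos = hdvideos.sort((a, b) => 0.5 - Math.random());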

 globalVideoIdx = 0;

 function updateVideoSlot(idxVideo, idxSlot) {
   slot = $("#video_gallery2 > div").eq(idxSlot);

   src = eg4d_videos[idxVideo].src;
   caption = eg4d_videos[idxVideo].caption;

   slot.find("div.video_container > video > source").attr("src", src);
   slot.find("div.video_container > video")[0].load();
   slot.find("div.video_container > div.caption > div").text(caption);
 }

 function updateVideoGallery() {
   numVideos = eg4d_videos.length;
   idxVideo = Math.floor(Math.random() * numVideos);

   numSlots = $("#video_gallery2 > div").length;
   idxSlot = Math.floor(Math.random() * numSlots);

   globalVideoIdx = ++globalVideoIdx % numVideos;
     updateVideoSlot(globalVideoIdx, idxSlot);

     setTimeout(updateVideoGallery, 1000);
 }

 function updateAllVideoSlots() {
   numVideos = eg4d_videos.length;
   numSlots = $("#video_gallery2 > div").length;

   globalVideoIdx = Math.floor(Math.random() * numVideos);
   for (var i = 0; i < numSlots; ++i, ++globalVideoIdx) {
     globalVideoIdx = globalVideoIdx % numVideos;
     updateVideoSlot(globalVideoIdx, i);
   }
 }

 function addVideoSlot(idx) {
   if (idx < 8) {
     template = `
   <div class="video_wrapper">
     <div class="video_container">
       <video autoplay loop muted playsinline><source src="videos/anime1-1.mp4" type="video/mp4"></video>
       <div class="caption">
	 <div></div>
       </div>
     </div>
   </div>
   `;
   } else {
     template = `
   <div class="video_wrapper nomobile">
     <div class="video_container">
       <video autoplay loop muted playsinline><source src="videos/anime1-1.mp4" type="video/mp4"></video>
       <div class="caption">
	 <div></div>
       </div>
     </div>
   </div>
   `;
   }

   $("#video_gallery2").append(template);
 }

 function playVideo(id, src) {
   $(id + " video > source").attr("src", src);
   $(id + " video")[0].load();
 }

 for (var i = 0; i < 16; ++i) {
   addVideoSlot(i);
 }

	  updateAllVideoSlots();
	  setTimeout(updateVideoGallery, 1000);
      }

      function addVideoPane1() {
 videos = [
     {"caption": "Original Generation - Happy Holidays Animated Card", "src": "videos/anime1-1.mp4"},
     {"caption": "Adapted Generation - Storybook Illustration", "src": "videos/anime1-3.mp4"},
     {"caption": "Adapted Generation - Digital Art", "src": "videos/anime1-4.mp4"},
     {"caption": "Adapted Generation - Outdoors Video", "src": "videos/anime1-5.mp4"},

     {"caption": "Original Generation - Podium with Moving Clouds and Lamps", "src": "videos/anime2-1.mp4"},
     {"caption": "Adapted Generation - Arcade Style", "src": "videos/anime2-2.mp4"},
     {"caption": "Adapted Generation - Blue Sunny Day", "src": "videos/anime2-3.mp4"},
     {"caption": "Adapted Generation - Scifi", "src": "videos/anime2-4.mp4"},

     {"caption": "Original Generation - Astronaut on the Moon Surface near a US Flag", "src": "videos/scifi1-1.mp4"},
     {"caption": "Adapted Generation - Concept Art", "src": "videos/scifi1-2.mp4"},
     {"caption": "Adapted Generation - Black and White", "src": "videos/scifi1-3.mp4"},
     {"caption": "Adapted Generation - Vintage Film", "src": "videos/scifi1-4.mp4"},

     {"caption": "Original Generation - Retro Night City", "src": "videos/scifi2-1.mp4"},
     {"caption": "Adapted Generation - Dark Night Sky", "src": "videos/scifi2-2.mp4"},
     {"caption": "Adapted Generation - Red Sun over Stadium", "src": "videos/scifi2-3.mp4"},
     {"caption": "Adapted Generation - Snow Falling", "src": "videos/scifi2-4.mp4"},
 ];

 globalVideoIdx = 0;

 function updateVideoSlot(idxVideo, idxSlot) {
   slot = $("#video_gallery > div").eq(idxSlot);

   src = videos[idxVideo].src;
   caption = videos[idxVideo].caption;

   slot.find("div.video_container > video > source").attr("src", src);
   slot.find("div.video_container > video")[0].load();
   slot.find("div.video_container > div.caption > div").text(caption);
 }

 function updateVideoGallery() {
   numVideos = videos.length;
   idxVideo = Math.floor(Math.random() * numVideos);

   numSlots = $("#video_gallery > div").length;
   idxSlot = Math.floor(Math.random() * numSlots);

   globalVideoIdx = ++globalVideoIdx % numVideos;
   updateVideoSlot(globalVideoIdx, idxSlot);
 }

 function updateAllVideoSlots() {
   numVideos = videos.length;
   numSlots = $("#video_gallery > div").length;

   globalVideoIdx = 0;
   for (var i = 0; i < numSlots; ++i, ++globalVideoIdx) {
     globalVideoIdx = globalVideoIdx % numVideos;
     updateVideoSlot(globalVideoIdx, i);
   }
 }

 function addVideoSlot(idx) {
   if (idx < 8) {
     template = `
   <div class="video_wrapper">
     <div class="video_container">
       <video autoplay loop muted playsinline><source src="videos/anime1-1.mp4" type="video/mp4"></video>
       <div class="caption">
	 <div></div>
       </div>
     </div>
   </div>
   `;
   } else {
     template = `
   <div class="video_wrapper nomobile">
     <div class="video_container">
       <video autoplay loop muted playsinline><source src="videos/anime1-1.mp4" type="video/mp4"></video>
       <div class="caption">
	 <div></div>
       </div>
     </div>
   </div>
   `;
   }

   $("#video_gallery").append(template);
 }

 function playVideo(id, src) {
   $(id + " video > source").attr("src", src);
   $(id + " video")[0].load();
 }

 for (var i = 0; i < 16; ++i) {
   addVideoSlot(i);
 }

    updateAllVideoSlots();
}      

      addVideoPane1();
      addVideoPane2();
      addVideoPane3();
      addVideoPane4();

$(window).scroll(function() {
  $("#title").each(title);
});
    </script>
  </body>
</html>
