{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "N2J4-fUc43_T"
   },
   "source": [
    "# Install essentials & requirements\n",
    "\n",
    "* The BoN_MOSS folder, which is the result of unzipping the submitted .zip file, must be uploaded at Google Drive.\n",
    "* For Stable Diffusion 1.5, **T4 GPU (free-tier Colab)** can be used. For Stable Diffusion 2.1, **A100 GPU** must be used.\n",
    "* This code is built on [the official implementation code of Self-Cross guidance](https://github.com/mengtang-lab/selfcross-guidance)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "5jJils3gDJJb"
   },
   "outputs": [],
   "source": [
    "from google.colab import drive\n",
    "drive.mount('/content/drive/')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "d3YAHTPUCLmR"
   },
   "outputs": [],
   "source": [
    "!cd /content/drive/MyDrive/BoN_MOSS && pip install -r environment.yaml && pip install pytorch_metric_learning"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "CbgoqoaS5eLs"
   },
   "source": [
    "# CONFORM\n",
    "\n",
    "* The \"model\" option indicates the version of the stable diffusion model to use as the backbone.\n",
    "    * SD15 means Stable Diffusion 1.5, and SD21 means Stable Diffusion 2.1. For Stable Diffusion 2.1, A100 GPU must be used.\n",
    "\n",
    "* The \"dataset\" option indicates the prompt dataset to be used for the text-to-image task.\n",
    "    * If you select one of \"animals\", \"animals_objects\", \"objects\", or \"SSD-2\", you will use the dataset used in the paper, and you can also provide arbitrary prompt.\n",
    "        * When you provide the arbitrary prompt, only prompts in the form of \"a `object_1` and a `object_2`\", \"a `object_1` and a `attribute_2` `object_2`\", or \"a `attribute_1` `object_1` and a `attribute_2` `object_2`\" are supported.\n",
    "\n",
    "* The \"num_initial_noises\" option indicates the number of candidate initial noises.\n",
    "\n",
    "* For one input text prompt, images are generated for 10 seeds(1~10). If you want to specify seeds, edit \"seeds.txt\" file in BoN_MOSS.\n",
    "\n",
    "* The result image can be found in \"outputs\" folder in BoN_MOSS."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "wKXXumgNAmxM"
   },
   "outputs": [],
   "source": [
    "!cd /content/drive/MyDrive/BoN_MOSS && python run_conform.py --model SD15 --dataset animals --num_initial_noises 50"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "SU8YXZRmAoa-"
   },
   "source": [
    "# InitNO\n",
    "\n",
    "* The \"dataset\" option indicates the prompt dataset to be used for the text-to-image task.\n",
    "    * If you select one of \"animals\", \"animals_objects\", \"objects\", or \"SSD-2\", you will use the dataset used in the paper, and you can also provide arbitrary prompt.\n",
    "        * When you provide the arbitrary prompt, only prompts in the form of \"a `object_1` and a `object_2`\", \"a `object_1` and a `attribute_2` `object_2`\", or \"a `attribute_1` `object_1` and a `attribute_2` `object_2`\" are supported.\n",
    "\n",
    "* The \"num_initial_noises\" option indicates the number of candidate initial noises.\n",
    "\n",
    "* The \"updates_per_noise\" option indicates the number of updates per one initial noise.\n",
    "\n",
    "* For one input text prompt, images are generated for 10 seeds(1~10). If you want to specify seeds, edit \"seeds.txt\" file in BoN_MOSS.\n",
    "\n",
    "* The result image can be found in \"outputs\" folder in BoN_MOSS."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "oUgpDcNpBFI3"
   },
   "outputs": [],
   "source": [
    "!cd /content/drive/MyDrive/BoN_MOSS && python run_initno.py --dataset \"an apple and a bag\" --num_initial_noises 5 --updates_per_noise 10"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "EAft97SeBJzt"
   },
   "source": [
    "# Self-Cross guidance (without InitNO)\n",
    "\n",
    "* The \"model\" option indicates the version of the stable diffusion model to use as the backbone.\n",
    "    * SD15 means Stable Diffusion 1.5, and SD21 means Stable Diffusion 2.1. For Stable Diffusion 2.1, A100 GPU must be used.\n",
    "\n",
    "* The \"dataset\" option indicates the prompt dataset to be used for the text-to-image task.\n",
    "    * If you select one of \"animals\", \"animals_objects\", \"objects\", or \"SSD-2\", you will use the dataset used in the paper, and you can also provide arbitrary prompt.\n",
    "        * When you provide the arbitrary prompt, only prompts in the form of \"a `object_1` and a `object_2`\", \"a `object_1` and a `attribute_2` `object_2`\", or \"a `attribute_1` `object_1` and a `attribute_2` `object_2`\" are supported.\n",
    "\n",
    "* The \"num_initial_noises\" option indicates the number of candidate initial noises.\n",
    "\n",
    "* For one input text prompt, images are generated for 10 seeds(1~10). If you want to specify seeds, edit \"seeds.txt\" file in BoN_MOSS.\n",
    "\n",
    "* The result image can be found in \"outputs\" folder in BoN_MOSS."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "bJ7TcnFGBP1j"
   },
   "outputs": [],
   "source": [
    "!cd /content/drive/MyDrive/BoN_MOSS && python run_selfcross.py --model SD15 --dataset \"a red apple and a blue bag\" --num_initial_noises 50"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "3u61LMtKBR2X"
   },
   "source": [
    "# Self-Cross guidance (original)\n",
    "\n",
    "* The \"dataset\" option indicates the prompt dataset to be used for the text-to-image task.\n",
    "    * If you select one of \"animals\", \"animals_objects\", \"objects\", or \"SSD-2\", you will use the dataset used in the paper, and you can also provide arbitrary prompt.\n",
    "        * When you provide the arbitrary prompt, only prompts in the form of \"a `object_1` and a `object_2`\", \"a `object_1` and a `attribute_2` `object_2`\", or \"a `attribute_1` `object_1` and a `attribute_2` `object_2`\" are supported.\n",
    "\n",
    "* The \"num_initial_noises\" option indicates the number of candidate initial noises.\n",
    "\n",
    "* The \"updates_per_noise\" option indicates the number of updates per one initial noise.\n",
    "\n",
    "* For one input text prompt, images are generated for 10 seeds(1~10). If you want to specify seeds, edit \"seeds.txt\" file in BoN_MOSS.\n",
    "\n",
    "* The result image can be found in \"outputs\" folder in BoN_MOSS."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "gnoNBa8VBlXV"
   },
   "outputs": [],
   "source": [
    "!cd /content/drive/MyDrive/BoN_MOSS && python run_selfcross+initno.py --dataset \"a cat and a red apple\" --num_initial_noises 5 --updates_per_noise 10"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "gpuType": "A100",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "name": "python3"
  },
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
