{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5fa21d44",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Copyright (c) Meta Platforms, Inc. and affiliates."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b7c0041e",
   "metadata": {},
   "source": [
    "# Automatically generating object masks with SAM"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "289bb0b4",
   "metadata": {},
   "source": [
    "Since SAM can efficiently process prompts, masks for the entire image can be generated by sampling a large number of prompts over an image. This method was used to generate the dataset SA-1B. \n",
    "\n",
    "The class `SamAutomaticMaskGenerator` implements this capability. It works by sampling single-point input prompts in a grid over the image, from each of which SAM can predict multiple masks. Then, masks are filtered for quality and deduplicated using non-maximal suppression. Additional options allow for further improvement of mask quality and quantity, such as running prediction on multiple crops of the image or postprocessing masks to remove small disconnected regions and holes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "072e25b8",
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.display import display, HTML\n",
    "display(HTML(\n",
    "\"\"\"\n",
    "<a target=\"_blank\" href=\"https://colab.research.google.com/github/facebookresearch/segment-anything/blob/main/notebooks/automatic_mask_generator_example.ipynb\">\n",
    "  <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
    "</a>\n",
    "\"\"\"\n",
    "))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c0b71431",
   "metadata": {},
   "source": [
    "## Environment Set-up"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "47e5a78f",
   "metadata": {},
   "source": [
    "If running locally using jupyter, first install `segment_anything` in your environment using the [installation instructions](https://github.com/facebookresearch/segment-anything#installation) in the repository. If running from Google Colab, set `using_collab=True` below and run the cell. In Colab, be sure to select 'GPU' under 'Edit'->'Notebook Settings'->'Hardware accelerator'."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4fe300fb",
   "metadata": {},
   "outputs": [],
   "source": [
    "using_colab = False"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0685a2f5",
   "metadata": {},
   "outputs": [],
   "source": [
    "if using_colab:\n",
    "    import torch\n",
    "    import torchvision\n",
    "    print(\"PyTorch version:\", torch.__version__)\n",
    "    print(\"Torchvision version:\", torchvision.__version__)\n",
    "    print(\"CUDA is available:\", torch.cuda.is_available())\n",
    "    import sys\n",
    "    !{sys.executable} -m pip install opencv-python matplotlib\n",
    "    !{sys.executable} -m pip install 'git+https://github.com/facebookresearch/segment-anything.git'\n",
    "\n",
    "    !mkdir images\n",
    "    !wget -P images https://raw.githubusercontent.com/facebookresearch/segment-anything/main/notebooks/images/dog.jpg\n",
    "\n",
    "    !wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fd2bc687",
   "metadata": {},
   "source": [
    "## Set-up"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "560725a2",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import torch\n",
    "import matplotlib.pyplot as plt\n",
    "import cv2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "74b6e5f0",
   "metadata": {},
   "outputs": [],
   "source": [
    "def show_anns(anns):\n",
    "    if len(anns) == 0:\n",
    "        return\n",
    "    sorted_anns = sorted(anns, key=(lambda x: x['area']), reverse=True)\n",
    "    ax = plt.gca()\n",
    "    ax.set_autoscale_on(False)\n",
    "    polygons = []\n",
    "    color = []\n",
    "    for ann in sorted_anns:\n",
    "        m = ann['segmentation']\n",
    "        img = np.ones((m.shape[0], m.shape[1], 3))\n",
    "        color_mask = np.random.random((1, 3)).tolist()[0]\n",
    "        for i in range(3):\n",
    "            img[:,:,i] = color_mask[i]\n",
    "        ax.imshow(np.dstack((img, m*0.35)))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "27c41445",
   "metadata": {},
   "source": [
    "## Example image"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ad354922",
   "metadata": {},
   "outputs": [],
   "source": [
    "image = cv2.imread('images/dog.jpg')\n",
    "image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e0ac8c67",
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(20,20))\n",
    "plt.imshow(image)\n",
    "plt.axis('off')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b8c2824a",
   "metadata": {},
   "source": [
    "## Automatic mask generation"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d9ef74c5",
   "metadata": {},
   "source": [
    "To run automatic mask generation, provide a SAM model to the `SamAutomaticMaskGenerator` class. Set the path below to the SAM checkpoint. Running on CUDA and with the default model is recommended."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "17ade22d",
   "metadata": {},
   "outputs": [],
   "source": [
    "sam_checkpoint = \"sam_vit_h_4b8939.pth\"\n",
    "\n",
    "device = \"cuda\"\n",
    "model_type = \"default\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1848a108",
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "sys.path.append(\"..\")\n",
    "from segment_anything import sam_model_registry, SamAutomaticMaskGenerator, SamPredictor\n",
    "\n",
    "sam = sam_model_registry[model_type](checkpoint=sam_checkpoint)\n",
    "sam.to(device=device)\n",
    "\n",
    "mask_generator = SamAutomaticMaskGenerator(sam)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d6b1ea21",
   "metadata": {},
   "source": [
    "To generate masks, just run `generate` on an image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "391771c1",
   "metadata": {},
   "outputs": [],
   "source": [
    "masks = mask_generator.generate(image)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e36a1a39",
   "metadata": {},
   "source": [
    "Mask generation returns a list over masks, where each mask is a dictionary containing various data about the mask. These keys are:\n",
    "* `segmentation` : the mask\n",
    "* `area` : the area of the mask in pixels\n",
    "* `bbox` : the boundary box of the mask in XYWH format\n",
    "* `predicted_iou` : the model's own prediction for the quality of the mask\n",
    "* `point_coords` : the sampled input point that generated this mask\n",
    "* `stability_score` : an additional measure of mask quality\n",
    "* `crop_box` : the crop of the image used to generate this mask in XYWH format"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4fae8d66",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(len(masks))\n",
    "print(masks[0].keys())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "53009a1f",
   "metadata": {},
   "source": [
    "Show all the masks overlayed on the image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "77ac29c5",
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "plt.figure(figsize=(20,20))\n",
    "plt.imshow(image)\n",
    "show_anns(masks)\n",
    "plt.axis('off')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "00b3d6b2",
   "metadata": {},
   "source": [
    "## Automatic mask generation options"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "183de84e",
   "metadata": {},
   "source": [
    "There are several tunable parameters in automatic mask generation that control how densely points are sampled and what the thresholds are for removing low quality or duplicate masks. Additionally, generation can be automatically run on crops of the image to get improved performance on smaller objects, and post-processing can remove stray pixels and holes. Here is an example configuration that samples more masks:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "68364513",
   "metadata": {},
   "outputs": [],
   "source": [
    "mask_generator_2 = SamAutomaticMaskGenerator(\n",
    "    model=sam,\n",
    "    points_per_side=32,\n",
    "    pred_iou_thresh=0.86,\n",
    "    stability_score_thresh=0.92,\n",
    "    crop_n_layers=1,\n",
    "    crop_n_points_downscale_factor=2,\n",
    "    min_mask_region_area=100,  # Requires open-cv to run post-processing\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bebcdaf1",
   "metadata": {},
   "outputs": [],
   "source": [
    "masks2 = mask_generator_2.generate(image)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b8473f3c",
   "metadata": {},
   "outputs": [],
   "source": [
    "len(masks2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fb702ae3",
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(20,20))\n",
    "plt.imshow(image)\n",
    "show_anns(masks2)\n",
    "plt.axis('off')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8c937160",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
