# Mask R-CNN

> [Mask R-CNN](https://arxiv.org/abs/1703.06870)

<!-- [ALGORITHM] -->

## Abstract

We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without bells and whistles, Mask R-CNN outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners. We hope our simple and effective approach will serve as a solid baseline and help ease future research in instance-level recognition.

<div align=center>
<img src="https://user-images.githubusercontent.com/40661020/143967081-c2552bed-9af2-46c4-ae44-5b3b74e5679f.png"/>
</div>

## Results and Models

|    Backbone     |  Style  | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP |                     Config                      |                                                                                                                                                                            Download                                                                                                                                                                             |
| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :---------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
|    R-50-FPN     |  caffe  |   1x    |   4.3    |                |  38.0  |  34.4   | [config](./mask-rcnn_r50-caffe_fpn_1x_coco.py)  |   [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_1x_coco/mask_rcnn_r50_caffe_fpn_1x_coco_bbox_mAP-0.38__segm_mAP-0.344_20200504_231812-0ebd1859.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_1x_coco/mask_rcnn_r50_caffe_fpn_1x_coco_20200504_231812.log.json)    |
|    R-50-FPN     | pytorch |   1x    |   4.4    |      16.1      |  38.2  |  34.7   |    [config](./mask-rcnn_r50_fpn_1x_coco.py)     |                                  [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205_050542.log.json)                                  |
| R-50-FPN (FP16) | pytorch |   1x    |   3.6    |      24.1      |  38.1  |  34.7   |  [config](./mask-rcnn_r50_fpn_amp-1x_coco.py)   |                             [model](https://download.openmmlab.com/mmdetection/v2.0/fp16/mask_rcnn_r50_fpn_fp16_1x_coco/mask_rcnn_r50_fpn_fp16_1x_coco_20200205-59faf7e4.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/fp16/mask_rcnn_r50_fpn_fp16_1x_coco/mask_rcnn_r50_fpn_fp16_1x_coco_20200205_130539.log.json)                             |
|    R-50-FPN     | pytorch |   2x    |    -     |       -        |  39.2  |  35.4   |    [config](./mask-rcnn_r50_fpn_2x_coco.py)     |               [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_2x_coco/mask_rcnn_r50_fpn_2x_coco_bbox_mAP-0.392__segm_mAP-0.354_20200505_003907-3e542a40.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_2x_coco/mask_rcnn_r50_fpn_2x_coco_20200505_003907.log.json)               |
|    R-101-FPN    |  caffe  |   1x    |          |                |  40.4  |  36.4   | [config](./mask-rcnn_r101-caffe_fpn_1x_coco.py) |                [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_caffe_fpn_1x_coco/mask_rcnn_r101_caffe_fpn_1x_coco_20200601_095758-805e06c1.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_caffe_fpn_1x_coco/mask_rcnn_r101_caffe_fpn_1x_coco_20200601_095758.log.json)                 |
|    R-101-FPN    | pytorch |   1x    |   6.4    |      13.5      |  40.0  |  36.1   |    [config](./mask-rcnn_r101_fpn_1x_coco.py)    |                                [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_fpn_1x_coco/mask_rcnn_r101_fpn_1x_coco_20200204-1efe0ed5.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_fpn_1x_coco/mask_rcnn_r101_fpn_1x_coco_20200204_144809.log.json)                                |
|    R-101-FPN    | pytorch |   2x    |    -     |       -        |  40.8  |  36.6   |    [config](./mask-rcnn_r101_fpn_2x_coco.py)    |             [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_fpn_2x_coco/mask_rcnn_r101_fpn_2x_coco_bbox_mAP-0.408__segm_mAP-0.366_20200505_071027-14b391c7.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_fpn_2x_coco/mask_rcnn_r101_fpn_2x_coco_20200505_071027.log.json)             |
| X-101-32x4d-FPN | pytorch |   1x    |   7.6    |      11.3      |  41.9  |  37.5   | [config](./mask-rcnn_x101-32x4d_fpn_1x_coco.py) |                    [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x4d_fpn_1x_coco/mask_rcnn_x101_32x4d_fpn_1x_coco_20200205-478d0b67.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x4d_fpn_1x_coco/mask_rcnn_x101_32x4d_fpn_1x_coco_20200205_034906.log.json)                    |
| X-101-32x4d-FPN | pytorch |   2x    |    -     |       -        |  42.2  |  37.8   | [config](./mask-rcnn_x101-32x4d_fpn_2x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x4d_fpn_2x_coco/mask_rcnn_x101_32x4d_fpn_2x_coco_bbox_mAP-0.422__segm_mAP-0.378_20200506_004702-faef898c.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x4d_fpn_2x_coco/mask_rcnn_x101_32x4d_fpn_2x_coco_20200506_004702.log.json) |
| X-101-64x4d-FPN | pytorch |   1x    |   10.7   |      8.0       |  42.8  |  38.4   | [config](./mask-rcnn_x101-64x4d_fpn_1x_coco.py) |                    [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_64x4d_fpn_1x_coco/mask_rcnn_x101_64x4d_fpn_1x_coco_20200201-9352eb0d.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_64x4d_fpn_1x_coco/mask_rcnn_x101_64x4d_fpn_1x_coco_20200201_124310.log.json)                    |
| X-101-64x4d-FPN | pytorch |   2x    |    -     |       -        |  42.7  |  38.1   | [config](./mask-rcnn_x101-64x4d_fpn_2x_coco.py) |                [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_64x4d_fpn_2x_coco/mask_rcnn_x101_64x4d_fpn_2x_coco_20200509_224208-39d6f70c.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_64x4d_fpn_2x_coco/mask_rcnn_x101_64x4d_fpn_2x_coco_20200509_224208.log.json)                 |
| X-101-32x8d-FPN | pytorch |   1x    |   10.6   |       -        |  42.8  |  38.3   | [config](./mask-rcnn_x101-32x8d_fpn_1x_coco.py) |                [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x8d_fpn_1x_coco/mask_rcnn_x101_32x8d_fpn_1x_coco_20220630_173841-0aaf329e.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x8d_fpn_1x_coco/mask_rcnn_x101_32x8d_fpn_1x_coco_20220630_173841.log.json)                 |

## Pre-trained Models

We also train some models with longer schedules and multi-scale training. The users could finetune them for downstream tasks.

|                             Backbone                             |  Style  | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP |                         Config                          |                                                                                                                                                                                                    Download                                                                                                                                                                                                     |
| :--------------------------------------------------------------: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :-----------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
|     [R-50-FPN](./mask-rcnn_r50-caffe_fpn_ms-poly-2x_coco.py)     |  caffe  |   2x    |   4.3    |                |  40.3  |  36.5   | [config](./mask-rcnn_r50-caffe_fpn_ms-poly-2x_coco.py)  | [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_2x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_2x_coco_bbox_mAP-0.403__segm_mAP-0.365_20200504_231822-a75c98ce.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_2x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_2x_coco_20200504_231822.log.json) |
|     [R-50-FPN](./mask-rcnn_r50-caffe_fpn_ms-poly-3x_coco.py)     |  caffe  |   3x    |   4.3    |                |  40.8  |  37.0   | [config](./mask-rcnn_r50-caffe_fpn_ms-poly-3x_coco.py)  | [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_20200504_163245.log.json)  |
|        [R-50-FPN](./mask-rcnn_r50_fpn_ms-poly-3x_coco.py)        | pytorch |   3x    |   4.1    |                |  40.9  |  37.1   |    [config](./mask-rcnn_r50_fpn_ms-poly-3x_coco.py)     |                            [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_fpn_mstrain-poly_3x_coco_20210524_201154-21b550bb.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_fpn_mstrain-poly_3x_coco_20210524_201154.log.json)                             |
|    [R-101-FPN](./mask-rcnn_r101-caffe_fpn_ms-poly-3x_coco.py)    |  caffe  |   3x    |   5.9    |                |  42.9  |  38.5   | [config](./mask-rcnn_r101-caffe_fpn_ms-poly-3x_coco.py) |                   [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r101_caffe_fpn_mstrain-poly_3x_coco_20210526_132339-3c33ce02.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn_r101_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r101_caffe_fpn_mstrain-poly_3x_coco_20210526_132339.log.json)                    |
|       [R-101-FPN](./mask-rcnn_r101_fpn_ms-poly-3x_coco.py)       | pytorch |   3x    |   6.1    |                |  42.7  |  38.5   |    [config](./mask-rcnn_r101_fpn_ms-poly-3x_coco.py)    |                          [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_fpn_mstrain-poly_3x_coco/mask_rcnn_r101_fpn_mstrain-poly_3x_coco_20210524_200244-5675c317.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_fpn_mstrain-poly_3x_coco/mask_rcnn_r101_fpn_mstrain-poly_3x_coco_20210524_200244.log.json)                           |
| [x101-32x4d-FPN](./mask-rcnn_x101-32x4d_fpn_ms-poly-3x_coco.py)  | pytorch |   3x    |   7.3    |                |  43.6  |  39.0   | [config](./mask-rcnn_x101-32x4d_fpn_ms-poly-3x_coco.py) |              [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x4d_fpn_mstrain-poly_3x_coco/mask_rcnn_x101_32x4d_fpn_mstrain-poly_3x_coco_20210524_201410-abcd7859.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x4d_fpn_mstrain-poly_3x_coco/mask_rcnn_x101_32x4d_fpn_mstrain-poly_3x_coco_20210524_201410.log.json)               |
| [X-101-32x8d-FPN](./mask-rcnn_x101-32x8d_fpn_ms-poly-3x_coco.py) | pytorch |   1x    |   10.4   |                |  43.4  |  39.0   | [config](./mask-rcnn_x101-32x8d_fpn_ms-poly-1x_coco.py) |              [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x8d_fpn_mstrain-poly_1x_coco/mask_rcnn_x101_32x8d_fpn_mstrain-poly_1x_coco_20220630_170346-b4637974.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x8d_fpn_mstrain-poly_1x_coco/mask_rcnn_x101_32x8d_fpn_mstrain-poly_1x_coco_20220630_170346.log.json)               |
| [X-101-32x8d-FPN](./mask-rcnn_x101-32x8d_fpn_ms-poly-3x_coco.py) | pytorch |   3x    |   10.3   |                |  44.3  |  39.5   | [config](./mask-rcnn_x101-32x8d_fpn_ms-poly-3x_coco.py) |              [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x8d_fpn_mstrain-poly_3x_coco/mask_rcnn_x101_32x8d_fpn_mstrain-poly_3x_coco_20210607_161042-8bd2c639.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x8d_fpn_mstrain-poly_3x_coco/mask_rcnn_x101_32x8d_fpn_mstrain-poly_3x_coco_20210607_161042.log.json)               |
| [X-101-64x4d-FPN](./mask-rcnn_x101-64x4d_fpn_ms-poly_3x_coco.py) | pytorch |   3x    |   10.4   |                |  44.5  |  39.7   | [config](./mask-rcnn_x101-64x4d_fpn_ms-poly_3x_coco.py) |              [model](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_64x4d_fpn_mstrain-poly_3x_coco/mask_rcnn_x101_64x4d_fpn_mstrain-poly_3x_coco_20210526_120447-c376f129.pth) \| [log](https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_64x4d_fpn_mstrain-poly_3x_coco/mask_rcnn_x101_64x4d_fpn_mstrain-poly_3x_coco_20210526_120447.log.json)               |

## Citation

```latex
@article{He_2017,
   title={Mask R-CNN},
   journal={2017 IEEE International Conference on Computer Vision (ICCV)},
   publisher={IEEE},
   author={He, Kaiming and Gkioxari, Georgia and Dollar, Piotr and Girshick, Ross},
   year={2017},
   month={Oct}
}
```
