# When ControlNet Meets Inexplicit Masks: A Case Study on its Contour-following Ability



## Abstract

ControlNet excels at creating content that closely matches the precise contours in user-provided masks. However, when these masks contain noise, as frequently happens with non-expert users, the output includes unwanted artifacts. Through in-depth analysis, this paper first highlights the crucial role of controlling the impact of such inexplicit masks across diverse deterioration levels. Then, to enhance controllability with inexplicit masks, we devise **Shape-aware ControlNet**, which consists of a deterioration estimator and a shape-prior modulation block. The deterioration estimator assesses the deterioration factor of the provided mask. The modulation block then uses this factor to adaptively modulate the model's contour-following ability, which helps it dismiss the noisy parts of inexplicit masks. Extensive experiments demonstrate its effectiveness in encouraging ControlNet to interpret inaccurate spatial conditions robustly rather than blindly following the given contours. We also showcase application scenarios such as modifying shape priors and composable shape-controllable generation. Code is available.

![](./fig/network_arch_v2.png)
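To make the idea of modulating features with a deterioration factor more concrete, here is a minimal, illustrative sketch. It is not the implementation in this repository: the affine (scale and shift) form, the names `modulation_params` and `modulate`, and the hand-picked weights are all assumptions chosen for illustration. It only shows how a single scalar factor `rho` could weaken control features when a mask is judged to be noisy.

```python
# Illustrative only: a toy affine modulation driven by a scalar factor.
# The function names, the affine form, and the weights are assumptions,
# not the modulation block shipped in this repository.

def modulation_params(rho, w_scale, w_shift):
    """Map a scalar deterioration factor to per-channel (scale, shift)."""
    scales = [1.0 + w * rho for w in w_scale]  # rho = 0 gives scale 1 (identity)
    shifts = [w * rho for w in w_shift]        # rho = 0 gives shift 0
    return scales, shifts


def modulate(features, rho, w_scale, w_shift):
    """Apply the per-channel scale and shift to a list of feature channels."""
    scales, shifts = modulation_params(rho, w_scale, w_shift)
    return [[s * x + b for x in channel]
            for channel, s, b in zip(features, scales, shifts)]


# Two toy channels standing in for control features.
features = [[1.0, 2.0], [3.0, 4.0]]
w_scale = [-0.5, -0.5]  # hypothetical weights; a real model would learn them
w_shift = [0.0, 0.0]

exact = modulate(features, 0.0, w_scale, w_shift)  # precise mask: unchanged
noisy = modulate(features, 1.0, w_scale, w_shift)  # noisy mask: damped by half
```

With these toy weights, a factor of `0.0` leaves the features untouched, while a factor of `1.0` halves them, which mimics relaxing how strictly the contours are followed.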



## Performance

1. TikZ sketches and user scribbles 

   > Generation with TikZ sketches 

   ![](./fig/appendix-sketch.png)

   > Generation with user scribbles

   ![](./fig/appendix-scribble.png)

2. Modification of shape priors

   ![](./fig/appendix-prior_control_v2.png)

3. Composable shape-controllable generation

   ![](./fig/appendix-multicontrol.png)



## How to Run

#### 1. Prerequisite

Our implementation is based on `diffusers >= 0.21.0`. 

```bash
# prepare the environment with conda
conda env create -f environment.yaml
```
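After creating the environment, it can be useful to confirm that the installed `diffusers` satisfies the `>= 0.21.0` requirement. The following is a small sketch using only the standard library; the helper names `parse_release` and `meets_minimum` are hypothetical, and the version parsing is deliberately simplified (pre-release suffixes such as `rc1` or `.dev0` are ignored).

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(v):
    """Turn '0.21.4' or '0.22.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break  # stop at suffixes such as 'rc1'
        if not digits:
            break      # stop at non-numeric pieces such as 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="0.21.0"):
    """Return True if the installed release is at least the minimum."""
    return parse_release(installed) >= parse_release(minimum)


if __name__ == "__main__":
    try:
        installed = version("diffusers")
    except PackageNotFoundError:
        print("diffusers is not installed in this environment")
    else:
        status = "OK" if meets_minimum(installed) else "older than 0.21.0"
        print(f"diffusers {installed}: {status}")
```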

#### 2. Prepare the datasets

1. Download the COCO and LVIS datasets into the directory `./data/COCO_LVIS`.
2. Reformat the annotations. 

#### 3. Train

a. Copy the files under `./src/` into the installed `diffusers` package. Note that our code only adds new parameters and does not disturb the original ControlNet implementation in `diffusers`.

b. Configure the script and run the code.

```bash
sh scripts/submit_train.sh
```
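Step a above (copying the sources into `diffusers`) can be sketched in Python as follows. This is a hedged example rather than a script shipped with the repository: the helper names `copy_sources` and `diffusers_package_dir` are hypothetical, and it assumes that the layout under `./src/` mirrors the layout of the `diffusers` package so that files land in the matching subdirectories.

```python
import importlib.util
import shutil
from pathlib import Path


def copy_sources(src_dir, package_dir):
    """Copy every file under src_dir into package_dir, keeping relative paths."""
    src_dir, package_dir = Path(src_dir), Path(package_dir)
    copied = []
    for path in sorted(src_dir.rglob("*")):
        if path.is_file():
            target = package_dir / path.relative_to(src_dir)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # overwrites files with the same name
            copied.append(target)
    return copied


def diffusers_package_dir():
    """Return the directory of the installed diffusers package, or None."""
    spec = importlib.util.find_spec("diffusers")
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


def main():
    # Assumes this is run from the repository root with the environment active.
    package_dir = diffusers_package_dir()
    if package_dir is None:
        print("diffusers is not installed in this environment")
        return
    for target in copy_sources("./src", package_dir):
        print("copied", target)
```

Calling `main()` from the repository root performs the copy. Because existing files with the same name are overwritten, it may be worth backing up the installed package first.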

#### 4. Test

```bash
sh script/batch_inference.sh
```


