# CLEVR Image Generation

Images are generated by using Blender to invoke the script `render_images.py` like this:

```
blender --background --python render_images.py -- [args]
```

Any arguments following the `--` will be captured by `render_images.py`.
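
A common way for a script to pick up these arguments is to slice `sys.argv` after the `--` separator. The sketch below illustrates that pattern; the helper name `extract_args` is hypothetical and is not taken from `render_images.py`.

```python
import sys


def extract_args(argv=None):
    """Return the arguments that follow '--', or an empty list if it is absent."""
    if argv is None:
        argv = sys.argv
    if '--' in argv:
        return argv[argv.index('--') + 1:]
    return []


# Blender's own flags come before '--'; the script's flags come after it.
blender_argv = ['blender', '--background', '--python', 'render_images.py',
                '--', '--num_images', '10']
script_args = extract_args(blender_argv)
```

The resulting list can then be handed to an argument parser, so the script only ever sees its own flags.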

This command should be run from the `image_generation` directory, since by default the script will load resources from the `data` directory.

When rendering on cluster machines without audio drivers installed, you may need to add the `-noaudio` flag to the Blender invocation like this:

```
blender --background -noaudio --python render_images.py -- [args]
```

You can also run `render_images.py` as a standalone script to view help on all command line flags like this:

```
python render_images.py --help
```

## Setup
You will need to download and install [Blender](https://www.blender.org/); the code was developed and tested using Blender version 2.78c, but other versions may work as well.

Blender ships with its own bundled Python 3.5, which it uses to execute scripts. You'll need to add this directory (`image_generation`) to the Python path of Blender's bundled Python with a command like this:

```
echo $PWD >> $BLENDER/$VERSION/python/lib/python3.5/site-packages/clevr.pth
```

where `$BLENDER` is the directory where Blender is installed and `$VERSION` is your Blender version; for example on OSX you might run:

```
echo $PWD >> /Applications/blender/blender.app/Contents/Resources/2.78/python/lib/python3.5/site-packages/clevr.pth
```

## Rendering Overview
The file `data/base_scene.blend` contains a Blender scene used as the basis of all CLEVR images. This scene contains a ground plane, a camera, and several light sources. After loading the base scene, the positions of the camera and lights are randomly jittered (controlled with the `--key_light_jitter`, `--fill_light_jitter`, `--back_light_jitter`, and `--camera_jitter` flags).

After the base scene has been loaded, objects are placed one by one into the scene. The number of objects for each scene is a random integer between `--min_objects` (default 3) and `--max_objects` (default 10), and each object has a random shape, size, color, and material.

After placing all objects, we ensure that no objects are fully occluded; in particular each object must occupy at least 100 pixels in the rendered image (customizable using `--min_pixels_per_object`). To accomplish this, we assign each object a unique color and render a version of the scene with lighting and shading disabled, writing it to a temporary file; we can then count the number of pixels of each color in this pre-render to check the number of visible pixels for each object.
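
The visibility check described above amounts to counting pixels per flat color. The following is a minimal sketch of that idea, assuming the pre-render has already been read into a sequence of `(r, g, b)` tuples; the function names are hypothetical, and reading the temporary file is left out.

```python
from collections import Counter


def count_visible_pixels(pixels):
    """Count how many pixels of each flat color appear in the pre-render.

    `pixels` is an iterable of (r, g, b) tuples (an assumed representation).
    """
    return Counter(pixels)


def all_objects_visible(pixels, object_colors, min_pixels_per_object=100):
    """Return True if every object's unique color covers enough pixels."""
    counts = count_visible_pixels(pixels)
    return all(counts.get(color, 0) >= min_pixels_per_object
               for color in object_colors)


# Toy pre-render: 150 red pixels, 40 green pixels, the rest background.
RED, GREEN, BACKGROUND = (255, 0, 0), (0, 255, 0), (0, 0, 0)
toy_pixels = [RED] * 150 + [GREEN] * 40 + [BACKGROUND] * 500
scene_ok = all_objects_visible(toy_pixels, [RED, GREEN])
```

In this toy example the green object covers too few pixels, so the scene would be rejected.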

Each invocation of `render_images.py` will render `--num_images` images, and they will be numbered starting at `--start_idx` (default 0). Using non-default values for `--start_idx` allows you to distribute rendering across many workers and recombine their results later without filename conflicts.
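
As an illustration of splitting work across workers, the sketch below builds one command line per worker with non-overlapping index ranges. The helper `worker_commands` is hypothetical; only the `--start_idx` and `--num_images` flags come from the description above.

```python
def worker_commands(total_images, num_workers):
    """Build one render command per worker with disjoint image index ranges."""
    per_worker = -(-total_images // num_workers)  # ceiling division
    commands = []
    for worker in range(num_workers):
        start = worker * per_worker
        count = min(per_worker, total_images - start)
        if count <= 0:
            break  # fewer images than workers; remaining workers stay idle
        commands.append(
            'blender --background --python render_images.py -- '
            '--start_idx %d --num_images %d' % (start, count))
    return commands


for command in worker_commands(total_images=10, num_workers=3):
    print(command)
```

Because each worker starts at a different index, the rendered filenames do not collide when the outputs are merged.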

### Object Placement
Each object is positioned randomly, but before actually adding the object to the scene we ensure that its center is at least `--min_dist` units away from the centers of all other objects. We also ensure that between each pair of objects, the left/right and front/back distance along the ground plane is at least `--margin` units; this helps to minimize ambiguous spatial relationships. If after `--max_retries` attempts we are unable to find a suitable position for an object, then all objects are deleted and placed again from scratch.
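
The placement procedure can be sketched as rejection sampling with a restart. This is a simplified illustration, not the code in `render_images.py`: it assumes left/right and front/back correspond to the x and y axes, samples positions uniformly in a square of half-width `extent`, and uses illustrative parameter values rather than the script's defaults. The function names are hypothetical.

```python
import math
import random


def is_valid(x, y, positions, min_dist, margin):
    """Check a candidate position against all objects placed so far."""
    for px, py in positions:
        dx, dy = x - px, y - py
        if math.hypot(dx, dy) < min_dist:
            return False  # centers are too close
        if abs(dx) < margin or abs(dy) < margin:
            return False  # left/right or front/back relation is ambiguous
    return True


def place_objects(num_objects, min_dist=0.25, margin=0.4, max_retries=50,
                  extent=3.0, rng=random):
    """Place objects one by one; start over if any object runs out of retries."""
    while True:
        positions = []
        for _ in range(num_objects):
            for _attempt in range(max_retries):
                x = rng.uniform(-extent, extent)
                y = rng.uniform(-extent, extent)
                if is_valid(x, y, positions, min_dist, margin):
                    positions.append((x, y))
                    break
            else:
                break  # no valid position found: discard everything
        if len(positions) == num_objects:
            return positions


positions = place_objects(4, rng=random.Random(0))
```

Note that this sketch loops until it succeeds, so it does not terminate if the constraints cannot be satisfied for the requested number of objects.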

### Image Resolution
By default images are rendered at `320x240`, but the resolution can be customized using the `--height` and `--width` flags.

### GPU Acceleration
Rendering uses CPU by default, but if you have an NVIDIA GPU with CUDA installed then you can use the GPU to accelerate rendering by adding the flag `--use_gpu 1`. Blender also supports acceleration using OpenCL which allows the use of non-NVIDIA GPUs; however this is not currently supported by `render_images.py`.

### Rendering Quality
You can control the quality of rendering with the `--render_num_samples` flag; using fewer samples will run more quickly but will result in grainy images. I've found that 64 samples is a good number to use for development; all released CLEVR images were rendered using 512 samples. The `--render_min_bounces` and `--render_max_bounces` flags control the number of bounces for transparent objects; I've found the default of 8 to work well for these options.

When rendering, Blender breaks up the output image into tiles and renders tiles sequentially; the `--render_tile_size` flag controls the size of these tiles. This should not affect the output image, but may affect the speed at which it is rendered. For CPU rendering smaller tile sizes may be optimal, while for GPU rendering larger tiles may be faster.

With default settings, rendering a 320x240 image takes about 4 seconds on a Pascal Titan X. It's very likely that these rendering times could be drastically reduced by someone more familiar with Blender, but this rendering speed was acceptable for our purposes.

### Saving Blender Scene Files
You can save a Blender `.blend` file for each rendered image by adding the flag `--save_blendfiles 1`. These files can be more than 5 MB each, so they are not saved by default.

### Output Files
Rendered images are stored in the `--output_image_dir` directory, which is created if it does not exist. The filename of each rendered image is constructed from the `--filename_prefix`, the `--split`, and the image index.
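
One plausible way to build such a filename is shown below. The exact layout is an assumption: the underscore separators, the six-digit zero padding, the `.png` extension, and the helper name `image_filename` are all illustrative and may differ from what `render_images.py` produces.

```python
import os


def image_filename(output_image_dir, filename_prefix, split, index, num_digits=6):
    """Join prefix, split, and zero-padded index into an image path (assumed layout)."""
    name = '%s_%s_%0*d.png' % (filename_prefix, split, num_digits, index)
    return os.path.join(output_image_dir, name)


# Hypothetical values for illustration only.
path = image_filename('output/images', 'PREFIX', 'train', 42)
```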

A JSON file for each scene containing ground-truth object positions and attributes is saved in the `--output_scene_dir` directory, which is created if it does not exist. After all images are rendered the JSON files for each individual scene are combined into a single JSON file and written to `--output_scene_file`. This single file will also store the `--split`, `--version` (default 1.0), `--license` (default CC-BY 4.0), and `--date` (default today).
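
The combining step can be sketched as follows. This is a minimal illustration under stated assumptions: the top-level keys `info` and `scenes` and the function name `combine_scenes` are hypothetical, and the combined file is assumed to live outside the per-scene directory.

```python
import json
import os


def combine_scenes(scene_dir, output_scene_file, info):
    """Merge every per-scene JSON file in scene_dir into one JSON file.

    `info` holds dataset-level metadata such as split, version, license, and date.
    Returns the number of scenes written.
    """
    scenes = []
    for name in sorted(os.listdir(scene_dir)):
        if not name.endswith('.json'):
            continue
        with open(os.path.join(scene_dir, name)) as f:
            scenes.append(json.load(f))
    with open(output_scene_file, 'w') as f:
        json.dump({'info': info, 'scenes': scenes}, f)
    return len(scenes)
```

Since the per-scene files are written as each image finishes, this step can be rerun after a crash to collect whatever was rendered.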

When rendering large numbers of images, I have sometimes experienced random Blender crashes; saving JSON files for each scene as they are rendered ensures that you do not lose information for scenes already rendered in the event of a crash.

If you save Blender scene files for each image (`--save_blendfiles 1`), they are stored in the `--output_blend_dir` directory, which is created if it does not exist.

### Object Properties
The `--properties_json` file (default `data/properties.json`) defines the allowed shapes, sizes, colors, and materials used for objects, making it easy to extend CLEVR with new object properties.

Each shape (cube, sphere, cylinder) is stored in its own `.blend` file in the `--shape_dir` (default `data/shapes`); the file `X.blend` contains a single object named `X` centered at the origin with unit size. The `shapes` field of the JSON properties file maps human-readable shape names to `.blend` files in the `--shape_dir`.

The `colors` field of the JSON properties file maps human-readable color names to RGB values between 0 and 255; most of our colors are adapted from [Wad's Optimum 16 Color Palette](http://alumni.media.mit.edu/~wad/color/palette.html).

The `sizes` field of the JSON properties file maps human-readable size names to scaling factors used to scale the object models from the `--shape_dir`.

Each material is stored in its own `.blend` file in the `--material_dir` (default `data/materials`). The file `X.blend` should contain a single NodeTree item named `X`, and this NodeTree item must have a single `Color` input that accepts an RGBA value so that each material can be used with any color. The `materials` field of the JSON properties file maps human-readable material names to `.blend` files in the `--material_dir`.
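
To make the four fields concrete, here is a sketch that parses a small properties document and prepares colors for a material's `Color` input. The entries in `EXAMPLE_PROPERTIES` are made up for illustration and are not the contents of `data/properties.json`; scaling RGB to floats in [0, 1] and appending an alpha of 1.0 is an assumption about how an RGBA value would be built, and `load_properties` is a hypothetical helper.

```python
import json

# Illustrative content only; names and values are placeholders.
EXAMPLE_PROPERTIES = """
{
  "shapes": {"cube": "Cube", "sphere": "Sphere"},
  "colors": {"gray": [87, 87, 87], "red": [173, 35, 35]},
  "materials": {"rubber": "Rubber"},
  "sizes": {"large": 0.7, "small": 0.35}
}
"""


def load_properties(text):
    """Parse a properties document and convert 0-255 RGB colors to RGBA floats."""
    props = json.loads(text)
    color_name_to_rgba = {
        name: [channel / 255.0 for channel in rgb] + [1.0]
        for name, rgb in props['colors'].items()
    }
    return props['shapes'], color_name_to_rgba, props['materials'], props['sizes']


shapes, colors, materials, sizes = load_properties(EXAMPLE_PROPERTIES)
```

Adding a new color, size, shape, or material then comes down to adding one entry to the matching field, plus a `.blend` file for new shapes and materials.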

### Restricting Shape / Color Combinations
The optional `--shape_color_combos_json` flag can be used to restrict the colors of each shape. If provided, this should give a path to a JSON file mapping shape names to lists of allowed color names. This option can be used to render CLEVR-CoGenT images using the files `data/CoGenT_A.json` and `data/CoGenT_B.json`.
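
The restriction can be pictured as sampling a shape first and then choosing only among its allowed colors. The sketch below assumes the mapping has already been loaded from JSON into a dictionary; the example mapping and the function name `sample_shape_and_color` are hypothetical and do not reproduce the CoGenT files.

```python
import random


def sample_shape_and_color(shapes, colors, combos=None, rng=random):
    """Pick a (shape, color) pair, honoring an optional shape-to-colors mapping."""
    if combos is None:
        # No restriction: any shape may take any color.
        return rng.choice(shapes), rng.choice(colors)
    shape = rng.choice(sorted(combos))
    color = rng.choice(combos[shape])
    return shape, color


# Hypothetical mapping in the same form as the JSON file: shape -> allowed colors.
example_combos = {
    'cube': ['gray', 'blue'],
    'cylinder': ['red', 'green'],
    'sphere': ['gray', 'blue', 'red', 'green'],
}
shape, color = sample_shape_and_color([], [], combos=example_combos,
                                      rng=random.Random(0))
```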
