Conditional Image Generation with PixelCNN Decoders

NIPS 2016
@google.com

Introduction

  • Generates images pixel by pixel
  • Related Works
    • PixelRNN: better performance
    • PixelCNN: faster to train (easier to parallelize)
  • Gated PixelCNN
  • Conditional variant of the Gated PixelCNN

PixelRNN and PixelCNN

The Distribution of PixelRNNs

$p(x) = \prod_{i=1}^{n^2} p(x_i \mid x_1, \dots, x_{i-1})$

  • $x$: an $n \times n$ input image
  • $x_i$: the $i$-th pixel, taken in raster-scan order (row by row, left to right)
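
To make the factorization concrete, here is a minimal sketch that evaluates $\log p(x)$ as a sum of per-pixel conditional log-probabilities. The function `conditional_prob` is a made-up toy model for binary pixels, not the network from the paper; it only stands in for whatever produces $p(x_i \mid x_1, \dots, x_{i-1})$.

```python
import itertools
import math

def conditional_prob(pixel, previous):
    """Toy p(x_i = pixel | x_1..x_{i-1}) for binary pixels (hypothetical model).

    The probability of a 1 grows with the number of earlier pixels that are 1.
    """
    p_one = (1 + sum(previous)) / (2 + len(previous))
    return p_one if pixel == 1 else 1.0 - p_one

def log_likelihood(pixels):
    """log p(x) = sum_i log p(x_i | x_1, ..., x_{i-1}) in raster-scan order."""
    total = 0.0
    for i, pixel in enumerate(pixels):
        total += math.log(conditional_prob(pixel, pixels[:i]))
    return total

# A 2x2 image flattened row by row, so the product has n^2 = 4 factors.
image = [1, 0, 1, 1]
print(log_likelihood(image))

# Every factor is a normalized conditional, so the joint sums to 1
# over all 2^4 possible binary images.
total_mass = sum(math.exp(log_likelihood(list(x)))
                 for x in itertools.product([0, 1], repeat=4))
print(total_mass)
```

Sampling follows the same order: draw $x_1$, then $x_2$ given $x_1$, and so on, which is why generation is sequential even when training is parallel.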

Masking

Masks ensure that the CNN uses only information from pixels above and to the left of the current pixel.
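
As an illustration, the spatial part of such a mask for a single-channel kernel could be built as follows. The helper name `causal_mask` and its `include_center` flag are assumptions made for this sketch; the mask is multiplied element-wise with the kernel weights before the convolution.

```python
def causal_mask(kernel_size, include_center):
    """Return a kernel_size x kernel_size 0/1 mask for raster-scan order.

    1 marks kernel positions above the centre or to its left in the same row.
    include_center decides whether the centre position itself is kept.
    """
    center = kernel_size // 2
    mask = []
    for row in range(kernel_size):
        mask_row = []
        for col in range(kernel_size):
            above = row < center
            left_in_same_row = row == center and col < center
            is_center = row == center and col == center
            keep = above or left_in_same_row or (is_center and include_center)
            mask_row.append(1 if keep else 0)
        mask.append(mask_row)
    return mask

# 3x3 mask that hides the current pixel and everything after it.
for mask_row in causal_mask(3, include_center=False):
    print(mask_row)
```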

3 Color Channels

  • R conditioned only on the previous pixels
  • G conditioned on R
  • B conditioned on (R, G)
  • first layer: mask A; all later layers: mask B
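
The difference between the two masks shows up at the centre position of the kernel, where the colour channels of the current pixel meet. The sketch below lists which input colours each output colour may read there; the function name `center_connections` is made up for this example.

```python
CHANNELS = ["R", "G", "B"]  # prediction order within one pixel

def center_connections(mask_type):
    """Input colours each output colour may read at the current pixel."""
    allowed = {}
    for out_idx, out_name in enumerate(CHANNELS):
        if mask_type == "A":
            visible = CHANNELS[:out_idx]      # strictly earlier colours only
        elif mask_type == "B":
            visible = CHANNELS[:out_idx + 1]  # earlier colours and itself
        else:
            raise ValueError("mask_type must be 'A' or 'B'")
        allowed[out_name] = visible
    return allowed

print(center_connections("A"))
print(center_connections("B"))
```

Mask A is needed in the first layer because its input is the image itself, so a colour reading its own value would leak the target. In later layers the input is a feature map that already excludes that value, so mask B can safely allow the self-connection.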

Gated PixelCNN

Gated Convolutional Layers

Gated Activation Unit

$y = \tanh(W_{k,f} * x) \odot \sigma(W_{k,g} * x)$

  • $k$: the index of the layer
  • $\sigma$: sigmoid function
  • $\odot$: element-wise product
  • $*$: convolution operator
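
A minimal sketch of the gated activation on its own, assuming the two convolution outputs have already been computed. The arguments `filter_pre` and `gate_pre` are placeholder names for $W_{k,f} * x$ and $W_{k,g} * x$.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gated_activation(filter_pre, gate_pre):
    """y = tanh(filter_pre) * sigmoid(gate_pre), element by element."""
    if len(filter_pre) != len(gate_pre):
        raise ValueError("filter and gate inputs must have the same length")
    return [math.tanh(f) * sigmoid(g) for f, g in zip(filter_pre, gate_pre)]

# An open gate (large positive value) passes the tanh feature almost
# unchanged; a closed gate (large negative value) suppresses it.
y = gated_activation([0.0, 2.0, -2.0], [0.0, 5.0, -5.0])
print(y)
```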

Blind spot
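
The blind spot can be reproduced by tracking which input offsets a stack of masked convolutions can reach. The sketch below assumes 3x3 masked kernels and, for simplicity, treats every layer alike; only rows above the current pixel are inspected, so the mask A / mask B distinction at the centre does not affect the result. The function names are made up for this example.

```python
# Offsets (d_row, d_col) that one 3x3 masked convolution may read, relative
# to its output position: the full row above, the left neighbour, the centre.
MASKED_3X3 = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 0)]

def receptive_field(num_layers, offsets=MASKED_3X3):
    """Input offsets that can influence one output position after stacking."""
    field = {(0, 0)}
    for _ in range(num_layers):
        field = {(r + dr, c + dc) for (r, c) in field for (dr, dc) in offsets}
    return field

def blind_spot(num_layers):
    """Pixels in the rows above, within reach horizontally, that are never seen."""
    field = receptive_field(num_layers)
    candidates = {(r, c)
                  for r in range(-num_layers, 0)
                  for c in range(-num_layers, num_layers + 1)}
    return sorted(candidates - field)

# Pixels above and to the right of the current pixel stay invisible.
print(blind_spot(3))
```

Under these assumptions, each step to the right requires a step upward, so a pixel one row up and two columns to the right is never reached no matter how many layers are stacked.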

A single layer block of a Gated PixelCNN

  • Notations
    • green: convolution operations
    • red: element-wise multiplications and additions
    • blue: splits the feature maps
  • Left part: vertical stack
  • Right part: horizontal stack
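
The effect of the two stacks on the receptive field can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact layer: the vertical stack is modelled as an unmasked convolution over the current and previous row of its own features, its output is assumed to be read one row up so the current row never leaks in, and the horizontal stack is assumed to reach one more pixel to the left per layer.

```python
# Assumed offsets for one 3-wide vertical-stack convolution: the current and
# the previous row of vertical features, at any of the three columns.
VERTICAL_STEP = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 0), (0, 1)]

def vertical_field(num_layers):
    """Offsets summarized by one vertical-stack feature after num_layers."""
    field = {(0, 0)}
    for _ in range(num_layers):
        field = {(r + dr, c + dc)
                 for (r, c) in field for (dr, dc) in VERTICAL_STEP}
    return field

def context_above(num_layers):
    """Rows above the current pixel, seen through the vertical stack.

    The vertical features are read one row up (assumed shift).
    """
    return {(r - 1, c) for (r, c) in vertical_field(num_layers)}

def context_left(num_layers):
    """Pixels to the left in the current row, seen by the horizontal stack."""
    return {(0, -c) for c in range(1, num_layers + 1)}

above = context_above(3)
context = above | context_left(3)
print(sorted(context))
```

With these assumptions the rows above form a full rectangle, so the pixel one row up and two columns to the right, which a single masked stack cannot reach, is now part of the context.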

Conditional PixelCNN

$p(x \mid h) = \prod_{i=1}^{n^2} p(x_i \mid x_1, \dots, x_{i-1}, h)$

  • $h$: a latent vector that encodes a high-level description of the image

$y = \tanh(W_{k,f} * x + V_{k,f}^T h) \odot \sigma(W_{k,g} * x + V_{k,g}^T h)$

  • Applications of $h$
    • class dependent bias
      • what should be in the image
    • location dependent bias
      • where in the image it should appear
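
For the class-dependent case, a minimal sketch for a single feature map: $V^T h$ reduces to one scalar per branch, which is added at every pixel. The arguments `conv_f` and `conv_g` are placeholders for the precomputed $W_{k,f} * x$ and $W_{k,g} * x$, and the weight values are invented for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def conditional_gate(conv_f, conv_g, v_f, v_g, h):
    """y = tanh(conv_f + v_f^T h) * sigmoid(conv_g + v_g^T h) at every pixel.

    conv_f, conv_g: height x width lists for one feature map.
    v_f, v_g, h:    vectors of equal length; each dot product is a scalar.
    """
    bias_f = sum(v * hi for v, hi in zip(v_f, h))
    bias_g = sum(v * hi for v, hi in zip(v_g, h))
    return [[math.tanh(f + bias_f) * sigmoid(g + bias_g)
             for f, g in zip(row_f, row_g)]
            for row_f, row_g in zip(conv_f, conv_g)]

# One-hot h selecting class 1 of 3: the bias is the same at all positions.
zeros = [[0.0, 0.0], [0.0, 0.0]]
y = conditional_gate(zeros, zeros,
                     v_f=[0.5, -1.0, 2.0], v_g=[0.0, 0.0, 0.0],
                     h=[0, 1, 0])
print(y)
```

Because the bias does not depend on the pixel position, this form of conditioning tells the model what to draw but not where.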

Location Dependent

Map $h$ to a spatial representation $s = m(h)$, where $s$ has the same width and height as the image.

$y = \tanh(W_{k,f} * x + V_{k,f} * s) \odot \sigma(W_{k,g} * x + V_{k,g} * s)$
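
A minimal sketch of the location-dependent variant for one output feature map, treating $V * s$ as a 1x1 convolution that mixes the channels of $s$ independently at each pixel. The map `s` is supplied directly here; how $m(h)$ produces it is left out, and all names and values are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def conv1x1(weights, s):
    """1x1 convolution: weighted sum over the channels of s at every pixel.

    s is indexed as s[channel][row][col]; weights has one entry per channel.
    """
    height, width = len(s[0]), len(s[0][0])
    return [[sum(w * s[ch][r][c] for ch, w in enumerate(weights))
             for c in range(width)]
            for r in range(height)]

def location_gate(conv_f, conv_g, v_f, v_g, s):
    """y = tanh(conv_f + v_f * s) * sigmoid(conv_g + v_g * s), per pixel."""
    bias_f = conv1x1(v_f, s)
    bias_g = conv1x1(v_g, s)
    height, width = len(conv_f), len(conv_f[0])
    return [[math.tanh(conv_f[r][c] + bias_f[r][c])
             * sigmoid(conv_g[r][c] + bias_g[r][c])
             for c in range(width)]
            for r in range(height)]

# A one-channel s that is active only in the bottom-right corner.
zeros = [[0.0, 0.0], [0.0, 0.0]]
s = [[[0.0, 0.0], [0.0, 3.0]]]
y = location_gate(zeros, zeros, v_f=[1.0], v_g=[1.0], s=s)
print(y)
```

Unlike the vector form, the bias now varies across the image, so the conditioning can push a structure toward a particular location.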

PixelCNN Auto-Encoders

  • Replace the deconvolutional decoder of an auto-encoder with a conditional PixelCNN

Experiments

Unconditional Modeling with Gated PixelCNN

Performance of different models on CIFAR-10

Performance of different models on ImageNet

Conditioning on ImageNet Classes

Conditioning on Portrait Embeddings

PixelCNN Auto-Encoder

Author

Tracy Liu

Posted on

2019-08-30

Updated on

2021-03-31
