Posted 2019-08-30 | Updated 2021-03-31 | Paper Notes

Conditional Image Generation with PixelCNN Decoders

NIPS 2016
@google.com


Introduction

Generate pictures pixel by pixel
Related Works
- PixelRNN: better performance
- PixelCNN: faster to train (easier to parallelize)
Proposed in this paper
- Gated PixelCNN
- Conditional variant of the Gated PixelCNN

PixelRNN and PixelCNN

The Distribution of PixelRNNs

$p(x) = \prod_{i=1}^{n^2} p(x_i \mid x_1, \dots, x_{i-1})$

$x$: input image of $n \times n$ pixels
$x_i$: a single pixel
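
To make the factorization concrete, here is a minimal sketch that evaluates $\log p(x)$ for a tiny binary image by summing per-pixel conditional log-probabilities in raster-scan order. The conditional model `toy_conditional` is a made-up stand-in (an assumption for illustration, not the network from the paper); the point is only that each factor looks at earlier pixels and nothing else.

```python
import math

def toy_conditional(prev_pixels):
    """Hypothetical p(x_i = 1 | x_1..x_{i-1}): leans toward the previous pixel's value."""
    if not prev_pixels:
        return 0.5
    return 0.8 if prev_pixels[-1] == 1 else 0.2

def log_likelihood(image):
    """Sum log p(x_i | x_<i) over all pixels, visited row by row."""
    flat = [p for row in image for p in row]
    total = 0.0
    for i, pixel in enumerate(flat):
        p_one = toy_conditional(flat[:i])  # only earlier pixels are visible
        total += math.log(p_one if pixel == 1 else 1.0 - p_one)
    return total

image = [[1, 1],
         [0, 0]]
ll = log_likelihood(image)  # log(0.5 * 0.8 * 0.2 * 0.8)
```

Because every factor is a proper conditional distribution, the probabilities of all possible images sum to 1, whatever model is plugged in for `toy_conditional`.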

Masking

Masks make sure the CNN can only use information about pixels above and to the left of the current pixel, i.e. pixels that have already been generated.
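
A minimal sketch of such a mask for a single-channel kernel, in plain Python. The function names and the `include_center` flag are illustrative assumptions; in a real model the mask would be multiplied into the convolution weights before every forward pass.

```python
def causal_mask(kernel_size, include_center):
    """Build a kernel_size x kernel_size 0/1 mask.

    1 = weight is kept: rows above the centre, and positions left of the
    centre in the centre row. The centre is kept only if include_center is True.
    """
    c = kernel_size // 2
    mask = []
    for r in range(kernel_size):
        row = []
        for col in range(kernel_size):
            if r < c or (r == c and col < c):
                row.append(1)
            elif r == c and col == c:
                row.append(1 if include_center else 0)
            else:
                row.append(0)  # pixels to the right or below are hidden
        mask.append(row)
    return mask

def apply_mask(weights, mask):
    """Zero out the kernel weights that would look at future pixels."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]

mask = causal_mask(3, include_center=False)
# [[1, 1, 1],
#  [1, 0, 0],
#  [0, 0, 0]]
```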

3 Color Channels

B conditioned on (R, G)
G conditioned on R
first layer: mask A (no connection from a color to itself); all later layers: mask B (self-connection allowed)
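
The channel ordering can be written down as a small lookup of which input colors each output color may read at the current pixel. This sketch follows the mask definitions of the PixelRNN paper (mask A hides a color from itself, mask B does not); the function name is hypothetical.

```python
CHANNELS = ["R", "G", "B"]  # generation order within one pixel

def center_connections(mask_type):
    """Map each output color to the input colors it may read at the current pixel."""
    allowed = {}
    for out_idx, out_color in enumerate(CHANNELS):
        if mask_type == "A":
            # first layer: only colors that were already predicted
            allowed[out_color] = CHANNELS[:out_idx]
        elif mask_type == "B":
            # later layers: additionally the color's own features
            allowed[out_color] = CHANNELS[:out_idx + 1]
        else:
            raise ValueError("mask_type must be 'A' or 'B'")
    return allowed

mask_a = center_connections("A")  # {'R': [], 'G': ['R'], 'B': ['R', 'G']}
mask_b = center_connections("B")  # {'R': ['R'], 'G': ['R', 'G'], 'B': ['R', 'G', 'B']}
```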

Gated PixelCNN

Gated Convolutional Layers

Gated Activation Unit

$y = \tanh(W_{k,f} * x) \odot \sigma(W_{k,g} * x)$

$k$: the layer index
$\sigma$: sigmoid non-linearity
$\odot$: element-wise product
$*$: convolution operator
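
A minimal sketch of the gated activation unit on its own, assuming the two convolutions $W_{k,f} * x$ and $W_{k,g} * x$ have already been computed and flattened into lists (`filter_pre` and `gate_pre` are illustrative names).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gated_activation(filter_pre, gate_pre):
    """y = tanh(filter_pre) * sigmoid(gate_pre), element-wise."""
    if len(filter_pre) != len(gate_pre):
        raise ValueError("both pre-activations must have the same length")
    return [math.tanh(f) * sigmoid(g) for f, g in zip(filter_pre, gate_pre)]

# An open gate (large positive value) passes tanh(f) almost unchanged,
# a closed gate (large negative value) suppresses it.
y = gated_activation([0.0, 2.0, -2.0], [0.0, 10.0, -10.0])
```

The sigmoid branch acts as a soft switch in (0, 1), so every output stays strictly inside (-1, 1).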

A single layer block of a Gated PixelCNN

Notations
- green: convolution operations
- red: element-wise multiplications and additions
- blue: splits the feature maps
Left part: vertical stack
Right part: horizontal stack
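
The paper motivates the two stacks by a blind spot in the receptive field of the original masked PixelCNN. The sketch below tracks receptive fields as sets of (row, column) offsets relative to the current pixel. It is a simplified model under stated assumptions (3-wide kernels, and the vertical features shifted down by one row before the horizontal stack reads them), not the paper's exact layer configuration.

```python
def grow(field, offsets):
    """One conv layer: every reachable offset spreads by each kernel offset."""
    return {(r + dr, c + dc) for (r, c) in field for (dr, dc) in offsets}

def masked_cnn_field(layers):
    """Receptive field of stacked 3x3 masked convolutions (original PixelCNN)."""
    first = {(-1, -1), (-1, 0), (-1, 1), (0, -1)}  # centre excluded
    later = first | {(0, 0)}                       # centre allowed
    field = {(0, 0)}
    for i in range(layers):
        field = grow(field, first if i == 0 else later)
    return field

def two_stack_field(layers):
    """Receptive field of the horizontal stack when it also reads the vertical stack."""
    vertical_kernel = {(dr, dc) for dr in (-1, 0) for dc in (-1, 0, 1)}
    vertical = {(0, 0)}
    horizontal = {(0, 0)}
    for i in range(layers):
        vertical = grow(vertical, vertical_kernel)
        # current row, strictly to the left in the first layer
        row_kernel = {(0, -1)} if i == 0 else {(0, -1), (0, 0)}
        # vertical features are read one row up, so only rows above are visible
        from_vertical = {(r - 1, c) for (r, c) in vertical}
        horizontal = grow(horizontal, row_kernel) | from_vertical
    return horizontal

blind = (-1, 2)  # one row up, two columns to the right
in_masked = blind in masked_cnn_field(6)   # False: never reached, however deep
in_two_stack = blind in two_stack_field(2) # True: covered by the vertical stack
```

In this model both variants stay causal (no offset at or after the current pixel is ever reachable), but only the two-stack variant covers the pixels up and to the right.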

Conditional PixelCNN

$p(x \mid h) = \prod_{i=1}^{n^2} p(x_i \mid x_1, \dots, x_{i-1}, h)$

$h$: a latent vector that describes the image (e.g. a class label or an embedding)

$y = \tanh(W_{k,f} * x + V_{k,f}^T h) \odot \sigma(W_{k,g} * x + V_{k,g}^T h)$

Applications of $h$
- class-dependent bias
  - describes what should be in the image
- location-dependent bias
  - describes where it should appear
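
For the class-dependent case, $V^T h$ is the same number at every pixel of a feature map, so with a one-hot $h$ it simply selects one learned bias per class. A minimal sketch for a single feature map, assuming the convolution outputs are already given as flat lists; all names and the example weights are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conditional_gate(filter_pre, gate_pre, v_f, v_g, h):
    """Add the h-dependent biases to both branches, then apply tanh * sigmoid."""
    bias_f = dot(v_f, h)  # V_{k,f}^T h, shared by all pixel positions
    bias_g = dot(v_g, h)  # V_{k,g}^T h
    return [math.tanh(f + bias_f) / (1.0 + math.exp(-(g + bias_g)))
            for f, g in zip(filter_pre, gate_pre)]

h = [0.0, 1.0, 0.0]  # one-hot encoding of class 1 out of 3
y = conditional_gate(
    filter_pre=[0.0, 0.0],
    gate_pre=[0.0, 0.0],
    v_f=[0.1, 0.5, -0.3],  # made-up weights; class 1 picks the bias 0.5
    v_g=[0.0, 0.0, 0.0],
    h=h,
)
```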

Location Dependent

Map $h$ to a spatial representation $s = m(h)$, where $s$ has the same width and height as the image but may have an arbitrary number of feature maps.

$y = \tanh(W_{k,f} * x + V_{k,f} * s) \odot \sigma(W_{k,g} * x + V_{k,g} * s)$
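
In the paper $V_{k,f} * s$ is an unmasked $1 \times 1$ convolution, so the bias may now differ from pixel to pixel. A minimal sketch, assuming a single-channel $s$ (for which the $1 \times 1$ convolution reduces to one scalar weight) and convolution outputs given as 2D lists; the names are illustrative.

```python
import math

def location_gate(filter_pre, gate_pre, s, v_f, v_g):
    """Gated unit with a per-pixel bias v * s[r][c] added to each branch."""
    out = []
    for f_row, g_row, s_row in zip(filter_pre, gate_pre, s):
        out.append([
            math.tanh(f + v_f * sv) / (1.0 + math.exp(-(g + v_g * sv)))
            for f, g, sv in zip(f_row, g_row, s_row)
        ])
    return out

zeros = [[0.0, 0.0], [0.0, 0.0]]
s = [[0.0, 1.0],
     [0.0, 0.0]]  # the condition is "active" only at the top-right pixel
y = location_gate(zeros, zeros, s, v_f=2.0, v_g=0.0)
```

Only the top-right output is pushed away from zero, which is the difference from the location-independent bias $V^T h$.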

PixelCNN Auto-Encoders

Replace the deconvolutional decoder of an auto-encoder with a conditional PixelCNN

Experiments

Unconditional Modeling with Gated PixelCNN

Performance of different models on CIFAR-10

Performance of different models on ImageNet

Conditioning on ImageNet Classes

Conditioning on Portrait Embeddings

PixelCNN Auto-Encoder

Reference

Conditional Image Generation with PixelCNN Decoders

https://tracyliu1220.github.io/2019/08/30/2019-08-30-Conditional-Image-Generation-with-PixelCNN-Decoders/

Author

Tracy Liu

#deep learning
