Datasets

MNIST Series

MNIST

  • Images of handwritten digits
  • 60000 training examples
  • 10000 test examples
  • 28×28 grayscale images
  • 10 classes
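
As a rough illustration of what these numbers mean in practice, the sketch below prepares a batch with the listed dimensions. It assumes NumPy, uses placeholder zero arrays rather than real MNIST data, and the helper names `flatten_and_scale` and `one_hot` are hypothetical.

```python
import numpy as np

NUM_CLASSES = 10          # digits 0-9
IMAGE_SHAPE = (28, 28)    # grayscale, one channel


def flatten_and_scale(images):
    """Flatten each 28x28 image to a 784-vector and scale pixels to [0, 1]."""
    return images.reshape(len(images), -1).astype(np.float32) / 255.0


def one_hot(labels, num_classes=NUM_CLASSES):
    """Turn integer class labels into one-hot rows."""
    return np.eye(num_classes, dtype=np.float32)[labels]


# Placeholder batch of 4 images (zeros, not real data) with dummy labels.
batch_images = np.zeros((4, *IMAGE_SHAPE), dtype=np.uint8)
batch_labels = np.array([0, 3, 7, 9])

x = flatten_and_scale(batch_images)
y = one_hot(batch_labels)
print(x.shape, y.shape)
```

The full training set would have a leading dimension of 60000 and the test set 10000; a small batch is used here only to keep the example light.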

Attention is All You Need

NIPS 2017
@google.com

Introduction

  • Recurrent Models
    • sequential computation along the sequence
    • hard to parallelize
  • the Transformer
    • no recurrence
    • relying entirely on an attention mechanism
    • global dependencies
    • more parallelizable; trained on 8 GPUs
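
To make the "no recurrence, attention only" point concrete, here is a minimal NumPy sketch of scaled dot-product attention, the core operation described in the paper. It is a single-head, unbatched simplification without masking or learned projections; the function names and the toy input are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def scaled_dot_product_attention(q, k, v):
    """Compute softmax(q k^T / sqrt(d_k)) v for 2-D arrays.

    q: (n_queries, d_k), k: (n_keys, d_k), v: (n_keys, d_v).
    Returns the attended values and the attention weights.
    """
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)      # every query scores every key
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ v, weights


# Toy self-attention: 5 positions, model dimension 8 (arbitrary values).
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
out, weights = scaled_dot_product_attention(x, x, x)
print(out.shape, weights.shape)
```

Every position attends to every other position in one matrix product, which is why dependencies are global and no step has to wait on the previous one as in a recurrent model.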

Auto-Encoding Variational Bayes

ICLR 2014
@google.com

Introduction

  • Variational Bayesian (VB)
  • Approximate posterior using MLP
  • Stochastic Gradient Variational Bayes (SGVB)
  • Auto-Encoding VB (AEVB) algorithm
  • Variational auto-encoder (VAE)

Method

Problem

  • the marginal likelihood $p_\theta(x) = \int p_\theta(z)p_\theta(x|z) dz$ involves an intractable integral
  • a large dataset: parameters need to be updated using small minibatches
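
The paper works around the intractable integral by optimizing a lower bound on the log marginal likelihood instead. The decomposition below is a sketch of that bound; $q_\phi(z|x)$ denotes the approximate posterior (the MLP-based approximation mentioned in the Introduction), and the symbol $\phi$ for its parameters does not appear in the notes above.

$$\log p_\theta(x) = D_{KL}\big(q_\phi(z|x) \,\|\, p_\theta(z|x)\big) + \mathcal{L}(\theta, \phi; x)$$

$$\mathcal{L}(\theta, \phi; x) = -D_{KL}\big(q_\phi(z|x) \,\|\, p_\theta(z)\big) + \mathbb{E}_{q_\phi(z|x)}\big[\log p_\theta(x|z)\big]$$

Because the first KL term is non-negative, $\mathcal{L}(\theta, \phi; x) \le \log p_\theta(x)$, so maximizing $\mathcal{L}$ over minibatches avoids evaluating the integral directly.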

Sequence to Sequence Learning with Neural Networks

NIPS 2014
@google.com

Abstract

  • Task: an English to French translation task from the WMT-14 dataset
  • Method:
    • a Deep LSTM: maps the input sequence to a vector of a fixed dimensionality
    • another Deep LSTM: decodes the target sequence from the vector
  • Result: BLEU score 34.8
  • Additional Finding: reversing the order of the words in all source sentences (but not target sentences) improved the LSTM’s performance markedly
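
The source-reversal trick amounts to a one-line preprocessing step. The sketch below assumes sentences are already tokenized into lists of strings; the function name `reverse_source` and the toy sentence pair are hypothetical.

```python
def reverse_source(pairs):
    """Reverse the token order of each source sentence; leave targets as-is."""
    return [(src[::-1], tgt) for src, tgt in pairs]


# Toy English-French pair, already split into tokens.
pairs = [(["the", "cat", "sleeps"], ["le", "chat", "dort"])]
reversed_pairs = reverse_source(pairs)
print(reversed_pairs)
```

After reversal, the first source words end up closest to the first target words, which is the intuition usually given for why the trick helps.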