Datasets

MNIST Series

MNIST

  • Images of handwritten digits
  • 60000 training examples
  • 10000 test examples
  • 28*28 grayscale images
  • 10 classes

Fashion-MNIST

  • Images of clothing and accessories
  • 60000 training examples (same as MNIST)
  • 10000 test examples (same as MNIST)
  • 28*28 grayscale images (same as MNIST)
  • 10 classes (same as MNIST)
  • Designed to be a direct drop-in replacement for MNIST
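One way to read "drop-in replacement" is that every shape-level figure matches, so a model sized for MNIST accepts Fashion-MNIST unchanged. A minimal sketch of that check follows. The `DatasetSpec` class and `is_drop_in` helper are hypothetical names made up for illustration, and the single channel is an assumption based on the images being grayscale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetSpec:
    """Shape-level description of an image classification dataset (hypothetical helper)."""
    train_examples: int
    test_examples: int
    height: int
    width: int
    channels: int
    classes: int

    @property
    def features(self) -> int:
        # Length of one image once flattened into a vector.
        return self.height * self.width * self.channels


# Figures taken from the lists above; 1 channel assumes grayscale storage.
MNIST = DatasetSpec(60000, 10000, 28, 28, 1, 10)
FASHION_MNIST = DatasetSpec(60000, 10000, 28, 28, 1, 10)


def is_drop_in(candidate: DatasetSpec, reference: DatasetSpec) -> bool:
    """True when every shape-level figure of the two datasets matches."""
    return candidate == reference


print(is_drop_in(FASHION_MNIST, MNIST))
print(MNIST.features)
```

Matching shapes only mean the input and output layers need no changes; they say nothing about how hard the task is.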

Kuzushiji MNIST (KMNIST)

  • Images of cursive Japanese (Kuzushiji) characters; the 10 classes are Hiragana, not Kanji
  • Also a drop-in replacement for MNIST
  • Also provides Kuzushiji-49, with 49 classes
  • Also provides Kuzushiji-Kanji, which covers Kanji characters but is less complete

EMNIST

  • Images of handwritten digits and letters
  • Same format as MNIST, but with more classes
  • Designed to be more challenging than MNIST
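Assuming "same format" refers to the IDX binary layout in which MNIST is distributed (a big-endian header followed by raw unsigned bytes), reading such a file can be sketched as below. The magic number 2051 is the value used by MNIST image files. The function name `parse_idx_images` is hypothetical, and the buffer is synthetic, not real dataset content.

```python
import struct

# Assumption: magic number of an MNIST-style IDX image file
# (0x00000803: unsigned bytes, 3 dimensions).
IMAGE_MAGIC = 2051


def parse_idx_images(buf: bytes):
    """Return (count, rows, cols, images) from an IDX image buffer.

    Each image is a bytes object holding rows*cols grayscale values.
    """
    # The header is four big-endian unsigned 32-bit integers.
    magic, count, rows, cols = struct.unpack(">IIII", buf[:16])
    if magic != IMAGE_MAGIC:
        raise ValueError(f"unexpected magic number {magic}")
    size = rows * cols
    pixels = buf[16:]
    if len(pixels) != count * size:
        raise ValueError("pixel data length does not match header")
    images = [pixels[i * size:(i + 1) * size] for i in range(count)]
    return count, rows, cols, images


# Synthetic two-image buffer standing in for a real file:
# one all-black image and one all-white image.
header = struct.pack(">IIII", IMAGE_MAGIC, 2, 28, 28)
fake = header + bytes([0]) * 784 + bytes([255]) * 784

count, rows, cols, images = parse_idx_images(fake)
print(count, rows, cols, len(images[0]))
```

If the assumption holds, the same reader works for any dataset in this family, with only the image count and the label range changing.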

QMNIST

  • Recovers the test data missing from MNIST by reconstructing the full 60000-example test set

Images

COCO Datasets

  • Large-scale image dataset for object detection, segmentation, and captioning
  • Includes segmentations and bounding boxes of objects in images
  • Includes image captions

LSUN

  • Images of scenes or objects
  • 10 scene categories
  • 20 object categories
  • Labeled partly automatically
  • Very large: around 1 million images per category

ImageNet

  • Images organized according to the WordNet hierarchy (nouns only)
  • One of the largest image datasets
  • About 15 million images
  • About 22 thousand classes

CIFAR

  • Tiny 32*32 coloured images
  • CIFAR-10 : 10 classes with 6000 images each
  • CIFAR-100: 100 classes with 600 images each
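The two variants trade classes against images per class, so both come to the same total. A small sketch of that arithmetic, plus the raw uncompressed size, follows. The 3 channels and one byte per channel value are assumptions (8-bit RGB), not figures from the list above.

```python
# Class counts and per-class image counts as listed above.
CIFAR = {
    "CIFAR-10": {"classes": 10, "images_per_class": 6000},
    "CIFAR-100": {"classes": 100, "images_per_class": 600},
}

# Assumption: 8-bit RGB, so 3 channels at one byte each.
HEIGHT, WIDTH, CHANNELS = 32, 32, 3


def total_images(name: str) -> int:
    """Total number of images in the named variant."""
    spec = CIFAR[name]
    return spec["classes"] * spec["images_per_class"]


def raw_bytes(name: str) -> int:
    """Uncompressed pixel data size in bytes under the 8-bit RGB assumption."""
    return total_images(name) * HEIGHT * WIDTH * CHANNELS


for name in CIFAR:
    print(name, total_images(name), raw_bytes(name))
```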

STL-10

  • Images of 10 classes
  • Inspired by CIFAR-10 but more focused on unsupervised tasks
  • Higher resolution: 96*96 coloured images
  • Images acquired from ImageNet
  • Includes unlabeled images to help build priors before supervised training

SVHN

  • Images of street view house numbers
  • Two formats
    • Original images with character level bounding boxes
    • MNIST-like 32*32 images centered on a single character
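The two formats are related: a character-level bounding box can be turned into a square window around the character and then resized to 32*32. The sketch below shows one way to do that, by extending the shorter side of the box so the resize does not distort the aspect ratio. How the dataset actually produced its crops is not described here, so treat this as an assumption. The function names are hypothetical.

```python
def square_window(left, top, width, height):
    """Extend a box along its shorter side so it becomes square.

    The center of the box stays where it was.
    Returns (left, top, side) of the square window.
    """
    side = max(width, height)
    center_x = left + width / 2
    center_y = top + height / 2
    return center_x - side / 2, center_y - side / 2, side


def scale_to_target(side, target=32):
    """Scale factor that maps the square window onto a target*target image."""
    return target / side


# A tall, narrow character box: 12 pixels wide, 30 pixels high.
window = square_window(left=10, top=20, width=12, height=30)
print(window, scale_to_target(window[2]))
```

The window may extend past the image border for characters near the edge, so a real pipeline would also need to clip or pad it.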

SBU Captioned Photo

  • 1 million images with associated visually relevant captions
  • Resources
    • Data
      • 1 million image URLs and their captions
      • A script to crawl images
    • Search tool: search for images using text queries

The Flickr30k Dataset

Pascal VOC Data Sets

  • 20 classes
  • 11530 images
    • 27450 ROI annotated objects
    • 6929 segmentations
  • Challenges

Cityscapes Dataset

  • Urban street scenes
  • Features
    • Annotations: semantic, instance-wise, dense pixel
    • 30 Classes
    • 50 cities
    • Different seasons, daytime, and weather conditions
  • 5000 finely annotated images
  • 20000 coarsely annotated images

Semantic Boundaries Dataset

  • Mark up the pixels that lie on the boundary of the object

Videos

Kinetics-400 Dataset

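  • Up to 650000 video clip URL links
  • 400, 600, or 700 human action classes, depending on the version (400 for Kinetics-400)
  • Each action class has at least 400, 600, or 700 video clips, matching the version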

HMDB Dataset

  • About 7000 video clips of human motions in 51 action categories, which fall into five groups
    • General facial actions: smile, laugh, chew, talk
    • Facial actions with object manipulation: smoke, eat, drink
    • General body movements
    • Body movements with object interaction
    • Body movements for human interaction

UCF101 Dataset

  • An action recognition data set of realistic action videos
  • 101 action categories, grouped into 5 types
    • Human-Object interaction
    • Body-Motion only
    • Human-Human interaction
    • Playing musical instruments
    • Sports

Author

Tracy Liu

Posted on

2019-11-21

Updated on

2021-03-31