Posted 2019-11-21, updated 2021-03-31

Datasets

MNIST Series

MNIST

Images of handwritten digits
60000 training examples
10000 test examples
28*28 grayscale images
10 classes
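
The figures above map directly onto the header of the original MNIST image files. The following is a minimal sketch, assuming the IDX layout of the original distribution: four big-endian 32-bit integers (magic number 2051, image count, rows, columns) followed by one unsigned byte per pixel. The function name `read_idx_image_header` is illustrative and not part of any library, and the header bytes are built synthetically so nothing has to be downloaded.

```python
import struct
from typing import Tuple

IDX_IMAGE_MAGIC = 2051  # 0x00000803: unsigned-byte data with 3 dimensions


def read_idx_image_header(data: bytes) -> Tuple[int, int, int]:
    """Return (count, rows, cols) from the first 16 bytes of an IDX image file."""
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != IDX_IMAGE_MAGIC:
        raise ValueError(f"unexpected magic number {magic}")
    return count, rows, cols


# Synthetic header with the MNIST training-set figures listed above.
header = struct.pack(">IIII", IDX_IMAGE_MAGIC, 60000, 28, 28)
count, rows, cols = read_idx_image_header(header)
pixel_bytes = count * rows * cols  # one byte per grayscale pixel
print(count, rows, cols, pixel_bytes)
```

Under these assumptions the raw training pixels take 60000 * 28 * 28 bytes, a little over 47 MB before compression.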

Fashion-MNIST

Images of clothing and accessories
60000 training examples (same as MNIST)
10000 test examples (same as MNIST)
28*28 grayscale images (same as MNIST)
10 classes (same as MNIST)
Designed to be a direct drop-in replacement for MNIST

Kuzushiji MNIST (KMNIST)

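Images of cursive Japanese (Kuzushiji) characters; the 10 classes are Hiragana characters.
Also a drop-in replacement for MNIST.
Also provides Kuzushiji-49, with 49 classes.
Also provides Kuzushiji-Kanji, which covers Kanji characters but is heavily imbalanced across classes.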
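
"Drop-in replacement" can be made concrete with `torchvision.datasets`, where `MNIST`, `FashionMNIST`, and `KMNIST` accept the same constructor arguments (`root`, `train`, `transform`, `target_transform`, `download`). The sketch below is one way to switch between them by name. The helper `load_mnist_like`, the `MNIST_LIKE` table, and the `./data` path are assumptions made for illustration. torchvision is imported inside the function, so the block can be read and run without it until a dataset is actually requested.

```python
# Class names in torchvision.datasets for the MNIST-style datasets that
# share the same constructor arguments.
MNIST_LIKE = {
    "mnist": "MNIST",
    "fashion-mnist": "FashionMNIST",
    "kmnist": "KMNIST",
}


def load_mnist_like(name: str, root: str = "./data", train: bool = True):
    """Build one of the MNIST-style datasets by name (downloads on first use)."""
    # Imported lazily: requires torchvision to be installed.
    from torchvision import datasets, transforms

    dataset_cls = getattr(datasets, MNIST_LIKE[name])
    return dataset_cls(
        root=root,
        train=train,
        download=True,
        transform=transforms.ToTensor(),
    )


print(sorted(MNIST_LIKE))
```

With this helper, `load_mnist_like("fashion-mnist")` in place of `load_mnist_like("mnist")` is the only change needed, since each item is still a 28*28 grayscale image with one of 10 labels.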

EMNIST

Images of handwritten digits and letters
Same format as MNIST, but with more classes
Designed to be more challenging than MNIST

QMNIST

Reconstruction of MNIST that recovers the 50000 test examples missing from the original release, giving a 60000-example test set

Images

COCO Datasets

Large-scale image dataset for object detection, segmentation, and captioning
Includes segmentations and bounding boxes of objects in images
Includes image captions
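
To show how the bounding boxes are stored, here is a small sketch that groups boxes by image from a dictionary in the COCO annotation layout (`images`, `categories`, `annotations`, with each `bbox` given as `[x, y, width, height]`). The sample values are made up for illustration, and `boxes_by_image` is a hypothetical helper, not part of the COCO API.

```python
from collections import defaultdict

# Tiny hand-written sample in the COCO annotation layout (values are made up).
sample = {
    "images": [{"id": 1, "file_name": "example.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 18, "name": "dog"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 18,
         "bbox": [100.0, 50.0, 200.0, 150.0]},
    ],
}


def boxes_by_image(coco: dict) -> dict:
    """Map image id -> list of (category name, (x_min, y_min, x_max, y_max))."""
    names = {c["id"]: c["name"] for c in coco["categories"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        x, y, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        corners = (x, y, x + w, y + h)
        grouped[ann["image_id"]].append((names[ann["category_id"]], corners))
    return dict(grouped)


print(boxes_by_image(sample))
```

Converting to corner coordinates up front is a design choice here, since many detection utilities expect `(x_min, y_min, x_max, y_max)` rather than width and height.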

LSUN

Images of scenes or objects
10 scene categories
20 object categories
Labeled partially automatically, with humans in the loop
About 1 million images per category

ImageNet

Images organized following WordNet hierarchy (only nouns)
One of the largest image datasets
About 15 million images
About 22 thousand classes

CIFAR

Tiny 32*32 colour images
CIFAR-10: 10 classes with 6000 images each
CIFAR-100: 100 classes with 600 images each
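
The per-class counts above imply that both variants have the same total size. A quick arithmetic check, assuming 3 colour channels stored at 1 byte each (the byte size is an assumption of this sketch, not something the list states):

```python
# (classes, images per class) as listed above
cifar = {"CIFAR-10": (10, 6000), "CIFAR-100": (100, 600)}

BYTES_PER_IMAGE = 32 * 32 * 3  # 32*32 pixels, 3 channels, 1 byte per channel

totals = {name: classes * per_class for name, (classes, per_class) in cifar.items()}
raw_bytes = {name: count * BYTES_PER_IMAGE for name, count in totals.items()}
print(totals)
print(raw_bytes)
```

Both come out at 60000 images, so CIFAR-100 trades images per class for a larger number of classes rather than adding data.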

STL-10

Images from 10 classes
Inspired by CIFAR-10, but more focused on unsupervised learning
Higher resolution: 96*96 colour images
Images acquired from ImageNet
Includes 100000 unlabeled images to help build a useful prior

SVHN

Images of street view house numbers
Two formats
- Original images with character level bounding boxes
- MNIST-like 32*32 images centered on a single character

SBU Captioned Photo

1 million images with associated visually relevant captions
Resources
- Data
  - 1 million image URLs and their captions
  - A script to crawl images
- Search tool: search for images using text queries

The Flickr30k Dataset

Pascal VOC Data Sets

20 classes
11530 images
- 27450 ROI annotated objects
- 6929 segmentations
Challenges

Cityscapes Dataset

Urban street scenes
Features
- Annotations: semantic, instance-wise, dense pixel
- 30 Classes
- 50 cities
- Different seasons, daytime, and weather conditions
5000 finely annotated images
20000 coarsely annotated images

Semantic Boundaries Dataset

Marks the pixels that lie on object boundaries

Videos

Kinetics-400 Dataset

URL links to video clips of human actions
400 human action classes (the Kinetics family also has 600- and 700-class versions, with up to 650000 clips)
Each action class has at least 400 video clips

HMDB Dataset

About 7000 video clips of human motions in 51 action categories, grouped into 5 types
- General facial actions: smile, laugh, chew, talk
- Facial actions with object manipulation: smoke, eat, drink
- General body movements
- Body movements with object interaction
- Body movements for human interaction

UCF101 Dataset

An action recognition data set of realistic action videos
101 action categories, grouped into 5 types
- Human-Object interaction
- Body-Motion only
- Human-Human interaction
- Playing musical instruments
- Sports

Reference

torchvision.datasets

Datasets

https://tracyliu1220.github.io/2019/11/21/2019-11-21-Datasets/

Author

Tracy Liu
