CIFAR-10 and CIFAR-100 datasets

Summary
          | classes (fine labels) | superclasses (coarse labels) | images/class | training images/class           | test images/class
CIFAR-10  | 10                    | none                         | 6,000        | 5,000 (5 batches, about 1k each) | 1,000 (1 batch)
CIFAR-100 | 100                   | 20                           | 600          | 500                             | 100

image: 32×32 RGB pixels, labeled

Introduction
These datasets, collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton, are labeled subsets of the 80 million tiny images dataset. The CIFAR datasets have become some of the most common benchmark datasets in machine learning.

[Screenshot: 80 million tiny images dataset]

CIFAR-10 dataset


Each image has 32×32 RGB pixels. The CIFAR-10 dataset consists of 60k images in 10 classes (categories): airplane, automobile, …, and truck.
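
To make the 32×32 RGB layout concrete, here is a minimal sketch of reading one pixel out of a flattened image. It assumes the channel-planar layout described for the Python version on Alex Krizhevsky's dataset page (3,072 values per image: 1,024 red, then 1,024 green, then 1,024 blue, each plane stored row by row). The function name `pixel_at` and the synthetic image are made up for illustration.

```python
SIDE = 32                 # images are 32x32
PLANE = SIDE * SIDE       # 1,024 values per color channel
ROW_LEN = 3 * PLANE       # 3,072 values per image (R, G, B planes)


def pixel_at(flat_image, row, col):
    """Return the (r, g, b) tuple at (row, col) of one flattened image.

    Assumes a channel-planar layout: all red values first, then green,
    then blue, each plane stored row by row.
    """
    if len(flat_image) != ROW_LEN:
        raise ValueError(f"expected {ROW_LEN} values, got {len(flat_image)}")
    if not (0 <= row < SIDE and 0 <= col < SIDE):
        raise IndexError("row and col must be in [0, 32)")
    offset = row * SIDE + col
    return (flat_image[offset],
            flat_image[PLANE + offset],
            flat_image[2 * PLANE + offset])


# Synthetic image for illustration: red = 10, green = 20, blue = 30 everywhere.
fake_image = [10] * PLANE + [20] * PLANE + [30] * PLANE
print(pixel_at(fake_image, 0, 0))
```

If the copy of the data you use stores pixels interleaved (R, G, B per pixel) instead, the offsets above would need to change accordingly.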

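Each class has 6,000 images. The dataset is split into 5 training batches and 1 test batch of 10k images each, which works out to about 1k images per class per batch. The test batch holds exactly 1k images from each class; the training batches are in random order, so the per-class count in a single training batch can vary. In other words, 5k images per class are used for training and 1k for testing (out of 6k per class). In total, there are 50k training images and 10k test images.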
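
As a rough illustration of working with one batch, the sketch below unpickles a batch file and counts images per class. It assumes the structure of the Python version of the dataset (a pickled dict with `b"data"` and `b"labels"` keys when read with `encoding="bytes"`). The helper name `load_batch` is hypothetical, and the batch here is a tiny synthetic stand-in so the code runs without downloading anything.

```python
import os
import pickle
import tempfile
from collections import Counter


def load_batch(path):
    """Unpickle one batch file and return (data, labels).

    Assumes the batch is a pickled dict with b"data" and b"labels" keys,
    as in the Python version of CIFAR-10 read with encoding="bytes".
    """
    with open(path, "rb") as f:
        batch = pickle.load(f, encoding="bytes")
    return batch[b"data"], batch[b"labels"]


# Stand-in batch so the sketch runs offline:
# 20 tiny "images" (3 values each instead of 3,072), 2 per class.
fake_batch = {
    b"data": [[i, i, i] for i in range(20)],
    b"labels": [i % 10 for i in range(20)],
}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fake_batch")
    with open(path, "wb") as f:
        pickle.dump(fake_batch, f)
    data, labels = load_batch(path)

per_class = Counter(labels)  # number of images per class label in this batch
print(len(data), dict(per_class))
```

With the real files, the image data is stored as a NumPy array, so NumPy needs to be installed for the unpickling step to succeed.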

who is the best in CIFAR-10 ?
[Screenshot of the CIFAR-10 ranking, 2016-03-12; % is accuracy]
As of Mar. 12, 2016, the best-performing method on the CIFAR-10 dataset is Fractional Max-Pooling, with 96.53% accuracy, published on arXiv in 2015. A comprehensive list of accuracy vs. method is maintained by Rodrigo Benenson at “What is the class of this image? Discover the current state of the art in objects classification.”
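
For readers unsure what the accuracy figure measures, here is a small sketch: accuracy is the fraction of test images whose predicted class matches the true label. The function and the toy labels are illustrative only and are not taken from any of the ranked papers.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy example: 4 of 5 predictions are right.
toy_labels = [0, 1, 2, 3, 4]
toy_preds = [0, 1, 2, 3, 9]
print(f"{accuracy(toy_preds, toy_labels):.2%}")

# On the 10k-image CIFAR-10 test set, 96.53% accuracy corresponds to
# roughly this many correct and incorrect predictions.
test_images = 10_000
correct = round(0.9653 * test_images)
print(correct, test_images - correct)
```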

CIFAR-100 dataset


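Each image has 32×32 RGB pixels (same as CIFAR-10). The CIFAR-100 dataset consists of 60k images in 100 classes, which are grouped into 20 superclasses. Each image therefore carries two labels: a fine label (its class) and a coarse label (its superclass).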

Each class has 600 images (as opposed to 6,000 images for CIFAR-10). 500 images are used for training and 100 for testing (out of 600 per class). In total, there are 50k training images and 10k test images.
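
The per-class and total counts for both datasets can be checked with a few lines of arithmetic. This is only a sketch: the `DatasetSplit` class is a hypothetical helper, and the superclass calculation assumes the 100 classes are spread evenly over the 20 superclasses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetSplit:
    name: str
    classes: int
    train_per_class: int
    test_per_class: int

    @property
    def images_per_class(self):
        return self.train_per_class + self.test_per_class

    @property
    def train_total(self):
        return self.classes * self.train_per_class

    @property
    def test_total(self):
        return self.classes * self.test_per_class

    @property
    def total(self):
        return self.train_total + self.test_total


cifar10 = DatasetSplit("CIFAR-10", classes=10, train_per_class=5000, test_per_class=1000)
cifar100 = DatasetSplit("CIFAR-100", classes=100, train_per_class=500, test_per_class=100)

for ds in (cifar10, cifar100):
    print(ds.name, ds.images_per_class, ds.train_total, ds.test_total, ds.total)

# CIFAR-100 only: 100 fine classes grouped into 20 superclasses
# (assumed to be an even split).
superclasses = 20
classes_per_superclass = cifar100.classes // superclasses
images_per_superclass = classes_per_superclass * cifar100.images_per_class
print(classes_per_superclass, images_per_superclass)
```

Both datasets come out to the same totals (50k training and 10k test images); only the number of images per class differs by a factor of ten.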

who is the best in CIFAR-100 ?
[Screenshot of the CIFAR-100 ranking, 2016-03-12; % is accuracy]

Refer to Alex Krizhevsky’s page for details about:

  • the training and test image configuration,
  • how to download the Python, Matlab, and binary versions,
  • how to cite the dataset, and so on.

More on this topic
Alex Krizhevsky’s home page
– CIFAR-10 and CIFAR-100 dataset page
– Alex Krizhevsky, “Learning Multiple Layers of Features from Tiny Images”, Tech Report, 2009. [pdf]
What is the class of this image? Discover the current state of the art in objects classification.
– who is the best in CIFAR-10 ?
– who is the best in CIFAR-100 ?
Alex’s CIFAR-10 tutorial, Caffe style
Convolutional Neural Networks, TensorFlow