STL-10 dataset


The STL-10 dataset is an image recognition dataset for developing unsupervised feature learning, deep learning, self-taught learning algorithms. It is inspired by the CIFAR-10 dataset but with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided to learn image models prior to supervised training. The primary challenge is to make use of the unlabeled data (which comes from a similar but different distribution from the labeled data) to build a useful prior. We also expect that the higher resolution of this dataset (96x96) will make it a challenging benchmark for developing more scalable unsupervised learning methods.

Overview

  • 10 classes: airplane, bird, car, cat, deer, dog, horse, monkey, ship, truck.
  • Images are 96x96 pixels, color.
  • 500 training images (10 pre-defined folds), 800 test images per class.
  • 100000 unlabeled images for unsupervised learning. These examples are extracted from a similar but broader distribution of images. For instance, it contains other types of animals (bears, rabbits, etc.) and vehicles (trains, buses, etc.) in addition to the ones in the labeled set.
  • Images were acquired from labeled examples on ImageNet.

Testing Protocol

We recommend the following standardized testing protocol for reporting results:
  • Perform unsupervised training on the unlabeled.
  • Perform supervised training on the labeled data using 10 (pre-defined) folds of 100 examples from the training data. The indices of the examples to be used for each fold are provided.
  • Report average accuracy on the full test set.

Download

  • Matlab files
    There are three files:  train.mat, test.mat and  unlabeled.mat. These files contain variables:
    1. X, y: The matrix "X" contains the images for the file as a matrix with 1 example per row. In each row, the pixels are laid out in column-major order, one channel at a time. That is, the first 96*96 values are the red channel, the next 96*96 are green, and the last are blue. To convert these to a matrix of RGB images, use: reshape(X,10000,96,96,3). The vector "y" contains the labels in the range 1 to 10.
    2. class_names: Contains the text name of each class.
    3. fold_indices: In train.mat only. Contains the pre-selected indices of the examples to be used for the 10 training trials. For the i'th fold, use: X(fold_indices{i}, :) and y(fold_indices{i}) as your training set.
  • Binary files
    • The binary files are split into data and label files with suffixes: train_X.bin, train_y.bin, test_X.bin and test_y.bin. Within each, the values are stored as tightly packed arrays of uint8's. The images are stored in column-major order, one channel at a time. That is, the first 96*96 values are the red channel, the next 96*96 are green, and the last are blue. The labels are in the range 1 to 10. The unlabeled dataset, unlabeled.bin, is in the same format, but there is no "_y.bin" file.
    • class_names.txt file is included for reference, with one class name per line.
    • The file fold_indices.txt contains the (zero-based) indices of the examples to be used for each training fold. The first line contains the indices for the first fold, the second line, the second fold, and so on.

Reference

* Please cite the following reference in papers using this dataset:

Adam Coates, Honglak Lee, Andrew Y. Ng An Analysis of Single Layer Networks in Unsupervised Feature Learning AISTATS, 2011. (PDF)

* Please use http://cs.stanford.edu/~acoates/stl10 as the URL for this site when necessary.

Contact

Send questions to  Adam Coates  

Logo

华为开发者空间,是为全球开发者打造的专属开发空间,汇聚了华为优质开发资源及工具,致力于让每一位开发者拥有一台云主机,基于华为根生态开发、创新。

更多推荐