numpy_datasets.images

numpy_datasets.images.mnist.load([path]) The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples.
numpy_datasets.images.arabic_characters.load([path]) Arabic Handwritten Characters Dataset
numpy_datasets.images.arabic_digits.load([path]) Arabic Handwritten Digits Dataset
numpy_datasets.images.kmnist.load([dataset, …]) Japanese character (image) classification
numpy_datasets.images.emnist.load([option, path]) Grayscale digit/letter classification.
numpy_datasets.images.fashionmnist.load([path]) Grayscale image classification
numpy_datasets.images.face_pointing.load([path]) Head angle classification. The head pose database consists of 15 sets of images.
numpy_datasets.images.rock_paper_scissors.load([path]) Rock-paper-scissors image classification.
numpy_datasets.images.dsprites.load([path]) greyscale image classification and disentanglement
numpy_datasets.images.svhn.load([path]) Street number classification.
numpy_datasets.images.cifar10.load([path]) Image classification.
numpy_datasets.images.cifar100.load([path]) Image classification.
numpy_datasets.images.celeb.load([path]) Face images with attributes. CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations.
numpy_datasets.images.ibeans.load([path]) Plant images classification.
numpy_datasets.images.stl10.load([path]) Image classification with extra unlabeled images.
numpy_datasets.images.tinyimagenet.load([path]) Tiny Imagenet has 200 classes.

Detailed description

numpy_datasets.images.mnist.load(path=None)[source]

The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.

It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting.

Parameters:path (str (optional)) – default ($DATASET_PATH), the path to look for the data and where the data will be downloaded if not present
Returns:
  • train_images (array)
  • train_labels (array)
  • valid_images (array)
  • valid_labels (array)
  • test_images (array)
  • test_labels (array)
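
A minimal usage sketch for the loader above. It assumes the package is importable as `numpy_datasets` and that `load()` returns the six arrays as a tuple in the order of the Returns list; neither is confirmed here, so adjust the unpacking to your installed version. The helper names `load_mnist_splits` and `describe_splits` are hypothetical, and the demo uses synthetic stand-in arrays so it runs without a download.

```python
import numpy as np


def describe_splits(splits):
    """Return {name: (shape, dtype)} for a dict of named arrays."""
    return {name: (arr.shape, str(arr.dtype)) for name, arr in splits.items()}


def load_mnist_splits(path=None):
    """Call the loader and attach names to its outputs.

    Assumption: load() returns six arrays as a tuple, in the order of the
    Returns list. This is not verified against the library.
    """
    import numpy_datasets as nds  # lazy import: needs the package and a download

    names = ("train_images", "train_labels", "valid_images",
             "valid_labels", "test_images", "test_labels")
    return dict(zip(names, nds.images.mnist.load(path)))


if __name__ == "__main__":
    # Synthetic stand-ins (28x28 uint8 images) instead of the real data.
    fake = {
        "train_images": np.zeros((8, 28, 28), dtype=np.uint8),
        "train_labels": np.arange(8) % 10,
    }
    for name, info in describe_splits(fake).items():
        print(name, info)
```

Checking shapes and dtypes right after loading is a cheap way to confirm which splits a given loader actually returned.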
numpy_datasets.images.arabic_characters.load(path=None)[source]

Arabic Handwritten Characters Dataset

Abstract: Handwritten Arabic character recognition systems face several challenges, including the unlimited variation in human handwriting and large public databases. In this work, we model a deep learning architecture that can be effectively applied to recognizing Arabic handwritten characters. A Convolutional Neural Network (CNN) is a special type of feed-forward multilayer network trained in supervised mode. The CNN was trained and tested on our database, which contains 16,800 handwritten Arabic characters. In this paper, optimization methods are implemented to increase the performance of the CNN. Common machine learning methods usually apply a combination of feature extractor and trainable classifier. The use of a CNN leads to significant improvements across different machine-learning classification algorithms. Our proposed CNN gives an average 5.1% misclassification error on testing data.

Context: The motivation of this study is to use cross knowledge learned from multiple works to enhance the performance of Arabic handwritten character recognition. In recent years, Arabic handwritten character recognition has had to cope with different handwriting styles as well, making it important to find and work on a new and advanced solution for handwriting recognition. A deep learning system needs a large amount of data (images) to be able to make good decisions.

Content: The dataset is composed of 16,800 characters written by 60 participants; the age range is 19 to 40 years, and 90% of participants are right-handed. Each participant wrote each character (from ’alef’ to ’yeh’) ten times on two forms as shown in Fig. 7(a) & 7(b). The forms were scanned at a resolution of 300 dpi. Each block is segmented automatically using Matlab 2016a to determine the coordinates of each block. The database is partitioned into two sets: a training set (13,440 characters, 480 images per class) and a test set (3,360 characters, 120 images per class). Writers of the training set and test set are exclusive. The order in which writers are assigned to the test set is randomized so that the writers of the test set are not from a single institution (to ensure variability of the test set).

Parameters:path (str (optional)) – default ($DATASET_PATH), the path to look for the data and where the data will be downloaded if not present
Returns:
  • train_images (array)
  • train_labels (array)
  • valid_images (array)
  • valid_labels (array)
  • test_images (array)
  • test_labels (array)
numpy_datasets.images.arabic_digits.load(path=None)[source]

Arabic Handwritten Digits Dataset

Parameters:path (str (optional)) – default ($DATASET_PATH), the path to look for the data and where the data will be downloaded if not present
Returns:
  • train_images (array)
  • train_labels (array)
  • valid_images (array)
  • valid_labels (array)
  • test_images (array)
  • test_labels (array)
numpy_datasets.images.kmnist.load(dataset='kmnist', path=None)[source]

Japanese character (image) classification

Kuzushiji-MNIST is a drop-in replacement for the MNIST dataset (28x28 grayscale, 70,000 images), provided in the original MNIST format as well as a NumPy format. Since MNIST restricts us to 10 classes, we chose one character to represent each of the 10 rows of Hiragana when creating Kuzushiji-MNIST.

Kuzushiji-49, as the name suggests, has 49 classes (28x28 grayscale, 270,912 images) and is a much larger but imbalanced dataset containing 48 Hiragana characters and one Hiragana iteration mark.

Kuzushiji-MNIST

Kuzushiji-MNIST contains 70,000 28x28 grayscale images spanning 10 classes (one from each column of hiragana), and is perfectly balanced like the original MNIST dataset (6k/1k train/test for each class).

Files (number of examples; MNIST-format download; NumPy-format download):
  • Training images: 60,000; train-images-idx3-ubyte.gz (18MB); kmnist-train-imgs.npz (18MB)
  • Training labels: 60,000; train-labels-idx1-ubyte.gz (30KB); kmnist-train-labels.npz (30KB)
  • Testing images: 10,000; t10k-images-idx3-ubyte.gz (3MB); kmnist-test-imgs.npz (3MB)
  • Testing labels: 10,000; t10k-labels-idx1-ubyte.gz (5KB); kmnist-test-labels.npz (5KB)

Mapping from class indices to characters: kmnist_classmap.csv (1KB)

We recommend using standard top-1 accuracy on the test set for evaluating on Kuzushiji-MNIST.

Which format do I download?

If you’re looking for a drop-in replacement for the MNIST or Fashion-MNIST dataset (for tools that currently work with these datasets), download the data in MNIST format.

Otherwise, it’s recommended to download in NumPy format, which can be loaded into an array as easily as: arr = np.load(filename)['arr_0'].

Kuzushiji-49

Kuzushiji-49 contains 270,912 images spanning 49 classes, and is an extension of the Kuzushiji-MNIST dataset.

Files (number of examples; NumPy-format download):
  • Training images: 232,365; k49-train-imgs.npz (63MB)
  • Training labels: 232,365; k49-train-labels.npz (200KB)
  • Testing images: 38,547; k49-test-imgs.npz (11MB)
  • Testing labels: 38,547; k49-test-labels.npz (50KB)

Mapping from class indices to characters: k49_classmap.csv (1KB)
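
As a small sketch of reading the NumPy-format files, the helper below opens an `.npz` archive and returns the array stored under the `arr_0` key mentioned earlier. The helper name `load_npz_array` and the demo file name are hypothetical; the demo writes its own tiny archive rather than using the real downloads.

```python
import os
import tempfile

import numpy as np


def load_npz_array(filename, key="arr_0"):
    """Load one array from an .npz archive and close the file afterwards."""
    with np.load(filename) as archive:
        return archive[key]


if __name__ == "__main__":
    # Build a small stand-in archive; a real file would be e.g. k49-train-imgs.npz.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo-imgs.npz")
        # Positional arrays passed to savez_compressed are stored as arr_0, arr_1, ...
        np.savez_compressed(path, np.zeros((4, 28, 28), dtype=np.uint8))
        images = load_npz_array(path)
        print(images.shape, images.dtype)
```

Using `np.load` as a context manager closes the underlying file handle, which matters when many archives are opened in a loop.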

We recommend using balanced accuracy on the test set for evaluating on Kuzushiji-49. We use the following implementation of balanced accuracy:
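
As a minimal sketch (not necessarily the dataset authors' exact code), balanced accuracy can be computed as the mean of the per-class accuracies, so that rare classes weigh as much as frequent ones. The function name, the argument names, and the choice to skip classes absent from the ground truth are assumptions made for illustration.

```python
import numpy as np


def balanced_accuracy(y_true, y_pred, num_classes):
    """Mean of per-class accuracies, skipping classes absent from y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    per_class = []
    for cls in range(num_classes):
        mask = y_true == cls  # rows whose ground truth is this class
        if mask.any():
            per_class.append((y_pred[mask] == cls).mean())
    return float(np.mean(per_class))


if __name__ == "__main__":
    y_true = [0, 0, 0, 1]
    y_pred = [0, 0, 1, 1]
    # Class 0 has 2 of 3 correct, class 1 has 1 of 1 correct.
    print(balanced_accuracy(y_true, y_pred, num_classes=2))
```

In the toy example plain accuracy is 3/4, while balanced accuracy is (2/3 + 1)/2, which shows how the two metrics diverge on imbalanced data.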

License

Both the dataset itself and the contents of this repo are licensed under a permissive CC BY-SA 4.0 license, except where specified within some benchmark scripts. The CC BY-SA 4.0 license requires attribution, and we suggest using the following attribution to the KMNIST dataset.

“KMNIST Dataset” (created by CODH), adapted from “Kuzushiji Dataset” (created by NIJL and others), doi:10.20676/00000341

Parameters:
  • dataset (str (optional)) – “kmnist” or “k49mnist”
  • path (str (optional)) – default ($DATASET_PATH), the path to look for the data and where the data will be downloaded if not present
Returns:

  • train_images (array)
  • train_labels (array)
  • valid_images (array)
  • valid_labels (array)
  • test_images (array)
  • test_labels (array)

numpy_datasets.images.emnist.load(option='byclass', path=None)[source]

Grayscale digit/letter classification.

The EMNIST Dataset

Authors:

Gregory Cohen, Saeed Afshar, Jonathan Tapson, and Andre van Schaik

The MARCS Institute for Brain, Behaviour and Development Western Sydney University Penrith, Australia 2751

Email: g.cohen@westernsydney.edu.au

What is it?

The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 (https://www.nist.gov/srd/nist-special-database-19) and converted to a 28x28 pixel image format and dataset structure that directly matches the MNIST dataset (http://yann.lecun.com/exdb/mnist/). Further information on the dataset contents and conversion process can be found in the paper available at https://arxiv.org/abs/1702.05373v1.

Formats:

The dataset is provided in two file formats. Both versions of the dataset contain identical information, and are provided entirely for the sake of convenience. The first dataset is provided in a Matlab format that is accessible through both Matlab and Python (using the scipy.io.loadmat function). The second version of the dataset is provided in the same binary format as the original MNIST dataset as outlined in http://yann.lecun.com/exdb/mnist/
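
For the MNIST-style binary files, a reader can be sketched in a few lines. The layout assumed here is the IDX format used by MNIST: two zero bytes, one data-type byte (0x08 for unsigned byte), one byte giving the number of dimensions, then one big-endian 32-bit size per dimension, followed by the raw data. The function names `read_idx` and `write_idx` are hypothetical, and `write_idx` exists only to build a file for the demo.

```python
import gzip
import os
import struct
import tempfile

import numpy as np


def read_idx(path):
    """Read an unsigned-byte IDX file (optionally gzipped) into a NumPy array."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        zeros, dtype_code, ndim = struct.unpack(">HBB", f.read(4))
        if zeros != 0 or dtype_code != 0x08:
            raise ValueError("not an unsigned-byte IDX file")
        shape = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(shape)


def write_idx(path, array):
    """Write a uint8 array in IDX layout (used here only to build a demo file)."""
    array = np.ascontiguousarray(array, dtype=np.uint8)
    with open(path, "wb") as f:
        f.write(struct.pack(">HBB", 0, 0x08, array.ndim))
        f.write(struct.pack(">" + "I" * array.ndim, *array.shape))
        f.write(array.tobytes())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo-images-idx3-ubyte")
        write_idx(path, np.arange(2 * 28 * 28).reshape(2, 28, 28) % 256)
        print(read_idx(path).shape)
```

The Matlab-format files can instead be opened with `scipy.io.loadmat`, as noted above; the structure of the returned dictionary is not described in this page, so inspect its keys before relying on them.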

Dataset Summary:

There are six different splits provided in this dataset. A short summary of the dataset is provided below:

  • EMNIST ByClass: 814,255 characters. 62 unbalanced classes.
  • EMNIST ByMerge: 814,255 characters. 47 unbalanced classes.
  • EMNIST Balanced: 131,600 characters. 47 balanced classes.
  • EMNIST Letters: 145,600 characters. 26 balanced classes.
  • EMNIST Digits: 280,000 characters. 10 balanced classes.
  • EMNIST MNIST: 70,000 characters. 10 balanced classes.

The full complement of the NIST Special Database 19 is available in the ByClass and ByMerge splits. The EMNIST Balanced dataset contains a set of characters with an equal number of samples per class. The EMNIST Letters dataset merges a balanced set of the uppercase and lowercase letters into a single 26-class task. The EMNIST Digits and EMNIST MNIST datasets provide balanced handwritten digit datasets directly compatible with the original MNIST dataset.
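
As a worked check of the summary figures, the sketch below derives the number of samples per class for the balanced splits by dividing the total by the class count. The dictionary keys are the display names from the summary, not necessarily the strings accepted by the `option` parameter, and the helper name `samples_per_class` is hypothetical.

```python
# (total characters, number of classes, balanced?) as listed in the dataset summary.
EMNIST_SPLITS = {
    "ByClass": (814_255, 62, False),
    "ByMerge": (814_255, 47, False),
    "Balanced": (131_600, 47, True),
    "Letters": (145_600, 26, True),
    "Digits": (280_000, 10, True),
    "MNIST": (70_000, 10, True),
}


def samples_per_class(splits):
    """Per-class sample count for the balanced splits only."""
    result = {}
    for name, (total, classes, balanced) in splits.items():
        if balanced:
            # A balanced split must divide evenly across its classes.
            assert total % classes == 0, name
            result[name] = total // classes
    return result


if __name__ == "__main__":
    for name, count in samples_per_class(EMNIST_SPLITS).items():
        print(f"{name}: {count} samples per class")
```

These counts cover the training and test portions together, since the summary gives only the combined totals.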

Please refer to the EMNIST paper (available at https://arxiv.org/abs/1702.05373v1) for further details of the dataset structure.

How to cite:

Please cite the following paper when using or referencing the dataset:

Cohen, G., Afshar, S., Tapson, J., & van Schaik, A. (2017). EMNIST: an extension of MNIST to handwritten letters. Retrieved from http://arxiv.org/abs/1702.05373

numpy_datasets.images.fashionmnist.load(path=None)[source]

Grayscale image classification

Zalando's article image classification. Fashion-MNIST is a dataset of Zalando's article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. We intend Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.

numpy_datasets.images.face_pointing.load(path=None)[source]

Head angle classification.

The head pose database consists of 15 sets of images. Each set contains 2 series of 93 images of the same person at different poses. There are 15 people in the database, wearing glasses or not and having various skin colors. The pose, or head orientation, is determined by 2 angles (h, v), which vary from -90 degrees to +90 degrees. The following fields identify each image:

PersonID = {01, …, 15}: the number of the person.
Serie = {1, 2}: the number of the series.
Number = {00, 01, …, 92}: the number of the file in the directory.

VerticalAngle = {-90, -60, -30, -15, 0, +15, +30, +60, +90}

HorizontalAngle = {-90, -75, -60, -45, -30, -15, 0, +15, +30, +45, +60, +75, +90}
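
The full 9 x 13 grid of angles would give 117 poses, while each series holds 93 images. The counts agree if the two extreme vertical angles (-90 and +90) are paired only with a horizontal angle of 0. That pairing is an assumption in the sketch below, not something stated above, and `pose_grid` is a hypothetical helper.

```python
VERTICAL = [-90, -60, -30, -15, 0, 15, 30, 60, 90]
HORIZONTAL = [-90, -75, -60, -45, -30, -15, 0, 15, 30, 45, 60, 75, 90]


def pose_grid(vertical=VERTICAL, horizontal=HORIZONTAL):
    """List (vertical, horizontal) poses.

    Assumption: at vertical angles of -90 and +90 only horizontal angle 0 is used.
    """
    poses = []
    for v in vertical:
        allowed = [0] if abs(v) == 90 else horizontal
        poses.extend((v, h) for h in allowed)
    return poses


if __name__ == "__main__":
    poses = pose_grid()
    # 7 vertical angles x 13 horizontal angles + 2 extreme poses
    print(len(poses))
```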

All images have been taken using the FAME Platform of the PRIMA Team at INRIA Rhone-Alpes. To obtain different poses, we placed markers around the whole room. Each marker corresponds to a pose (h,v). Post-it notes are used as markers. The whole set of post-it notes covers a half-sphere in front of the person.

In order to obtain the face in the center of the image, the person is asked to adjust the chair to see the device in front of him. After this initialization phase, we ask the person to stare successively at 93 post-it notes, without moving his eyes. This second phase just takes a few minutes.

Parameters:path (str (optional)) – default ($DATASET_PATH), the path to look for the data and where the data will be downloaded if not present
Returns:
  • train_images (array)
  • train_labels (array)
  • valid_images (array)
  • valid_labels (array)
  • test_images (array)
  • test_labels (array)
numpy_datasets.images.rock_paper_scissors.load(path=None)[source]

Rock-paper-scissors image classification.

Parameters:path (str (optional)) – default ($DATASET_PATH), the path to look for the data and where the data will be downloaded if not present
Returns:
  • train_images (array)
  • train_labels (array)
  • valid_images (array)
  • valid_labels (array)
  • test_images (array)
  • test_labels (array)
numpy_datasets.images.dsprites.load(path=None)[source]

greyscale image classification and disentanglement

This dataset consists of 737,280 images of 2D shapes, procedurally generated from 5 ground truth independent latent factors, controlling the shape, scale, rotation and position of a sprite. This data can be used to assess the disentanglement properties of unsupervised learning methods.

dSprites is a dataset of 2D shapes procedurally generated from 6 ground truth independent latent factors. These factors are color, shape, scale, rotation, and the x and y positions of a sprite. Color takes a single value, which is why the description above counts 5 varying factors.

All possible combinations of these latents are present exactly once, generating N = 737280 total images.
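
The total of 737,280 images can be reproduced by multiplying the number of values each latent factor takes. The per-factor counts below (1 color, 3 shapes, 6 scales, 40 rotations, 32 x positions, 32 y positions) are taken from the dSprites repository rather than from this page, and the row-major ordering used by the hypothetical `latent_to_index` helper is likewise an assumption about how the images are stored.

```python
import numpy as np

# Number of values per latent factor (assumed from the dSprites repository).
LATENT_SIZES = {
    "color": 1,
    "shape": 3,
    "scale": 6,
    "rotation": 40,
    "x": 32,
    "y": 32,
}


def total_images(sizes=LATENT_SIZES):
    """Number of distinct latent combinations."""
    return int(np.prod(list(sizes.values())))


def latent_to_index(latent, sizes=LATENT_SIZES):
    """Flat index of one latent combination, with the last factor varying fastest."""
    return int(np.ravel_multi_index(tuple(latent), tuple(sizes.values())))


if __name__ == "__main__":
    print(total_images())
    print(latent_to_index((0, 2, 5, 39, 31, 31)))  # last combination
```

Because every combination appears exactly once, such an index mapping lets you pick images by their factor values instead of searching the latent array.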

https://github.com/deepmind/dsprites-dataset

Parameters:path (str (optional)) – default ($DATASET_PATH), the path to look for the data and where the data will be downloaded if not present
Returns:
  • images (array)
  • latent (array)
  • classes (array)

numpy_datasets.images.svhn.load(path=None)[source]

Street number classification.

The SVHN dataset is a real-world image dataset for developing machine learning and object recognition algorithms with minimal requirement on data preprocessing and formatting. It can be seen as similar in flavor to MNIST (e.g., the images are of small cropped digits), but incorporates an order of magnitude more labeled data (over 600,000 digit images) and comes from a significantly harder, unsolved, real world problem (recognizing digits and numbers in natural scene images). SVHN is obtained from house numbers in Google Street View images.

Parameters:path (str (optional)) – default $DATASET_PATH, the path to look for the data and where the data will be downloaded if not present
Returns:
  • train_images (array)
  • train_labels (array)
  • test_images (array)
  • test_labels (array)
numpy_datasets.images.cifar10.load(path=None)[source]

Image classification.

The CIFAR-10 dataset (https://www.cs.toronto.edu/~kriz/cifar.html) was collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. It consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class.

Parameters:path (str (optional)) – default ($DATASET_PATH), the path to look for the data and where the data will be downloaded if not present
Returns:
  • train_images (array)
  • train_labels (array)
  • test_images (array)
  • test_labels (array)
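
The per-class counts described above can be checked on any label array with a short helper. This is an illustrative sketch on synthetic labels; `class_counts` and `is_balanced` are hypothetical names, and integer class labels in the range 0 to 9 are assumed.

```python
import numpy as np


def class_counts(labels, num_classes=10):
    """Number of samples per class for integer labels in [0, num_classes)."""
    return np.bincount(np.asarray(labels).ravel(), minlength=num_classes)


def is_balanced(labels, num_classes=10):
    """True when every class has the same number of samples."""
    counts = class_counts(labels, num_classes)
    return bool((counts == counts[0]).all())


if __name__ == "__main__":
    # Synthetic stand-in for train_labels: 10 classes, 5 samples each, shuffled.
    rng = np.random.default_rng(0)
    labels = rng.permutation(np.repeat(np.arange(10), 5))
    print(class_counts(labels))
    print(is_balanced(labels))
```

Applied to the real training labels, each entry of `class_counts` should be 5000 if the loader returns the full training set.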
numpy_datasets.images.cifar100.load(path=None)[source]

Image classification.

The CIFAR-100 dataset (https://www.cs.toronto.edu/~kriz/cifar.html) is just like the CIFAR-10, except it has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class. The 100 classes in the CIFAR-100 are grouped into 20 superclasses. Each image comes with a “fine” label (the class to which it belongs) and a “coarse” label (the superclass to which it belongs).

numpy_datasets.images.celeb.load(path=None)[source]

Face images with attributes.

CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA has large diversities, large quantities, and rich annotations, including:
  • 10,177 identities,
  • 202,599 face images, and
  • 5 landmark locations and 40 binary attribute annotations per image.

The dataset can be employed as the training and test sets for the following computer vision tasks: face attribute recognition, face detection, and landmark (or facial part) localization.

Note: the CelebA dataset may contain potential bias. The Fairness Indicators case study (https://github.com/tensorflow/fairness-indicators/blob/master/fairness_indicators/documentation/examples/Fairness_Indicators_TFCO_CelebA_Case_Study.ipynb) goes into detail about several considerations to keep in mind while using the CelebA dataset.

Parameters:path (str (optional)) – default ($DATASET_PATH), the path to look for the data and where the data will be downloaded if not present
Returns:
  • train_images (array)
  • train_labels (array)
  • valid_images (array)
  • valid_labels (array)
  • test_images (array)
  • test_labels (array)
numpy_datasets.images.ibeans.load(path=None)[source]

Plant images classification.

This dataset is of leaf images taken in the field in different districts in Uganda by the Makerere AI lab in collaboration with the National Crops Resources Research Institute (NaCRRI), the national body in charge of research in agriculture in Uganda.

The goal is to build a robust machine learning model that is able to distinguish between diseases in bean plants. Beans are an important cereal food crop for Africa grown by many small-holder farmers; they are a significant source of protein for school-going children in East Africa.

The data is of leaf images representing 3 classes: the healthy class of images, and two disease classes, Angular Leaf Spot and Bean Rust. The model should be able to distinguish between these 3 classes with high accuracy. The end goal is to build a robust model that can be deployed on a mobile device and used in the field by a farmer.

The data includes leaf images taken in the field. The figure above depicts examples of the types of images per class. Images were taken in the field/garden using a basic smartphone.

The images were then annotated by experts from NaCRRI who determined for each image which disease was manifested. The experts were part of the data collection team and images were annotated directly during the data collection process in the field.

Number of examples per class:
  • Healthy class: 428
  • Angular Leaf Spot: 432
  • Bean Rust: 436
  • Total: 1,296

  • Data released: 20-January-2020
  • License: MIT
  • Credits: Makerere AI Lab

Parameters:path (str (optional)) – default ($DATASET_PATH), the path to look for the data and where the data will be downloaded if not present
Returns:
  • train_images (array)
  • train_labels (array)
  • valid_images (array)
  • valid_labels (array)
  • test_images (array)
  • test_labels (array)
numpy_datasets.images.stl10.load(path=None)[source]

Image classification with extra unlabeled images.

The STL-10 dataset is an image recognition dataset for developing unsupervised feature learning, deep learning, and self-taught learning algorithms. It is inspired by the CIFAR-10 dataset but with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided to learn image models prior to supervised training. The primary challenge is to make use of the unlabeled data (which comes from a similar but different distribution from the labeled data) to build a useful prior. We also expect that the higher resolution of this dataset (96x96) will make it a challenging benchmark for developing more scalable unsupervised learning methods.

Parameters:path (str (optional)) – the path to look for the data and where it will be downloaded if not present
Returns:
  • train_images (array) – the training images
  • train_labels (array) – the training labels
  • test_images (array) – the test images
  • test_labels (array) – the test labels
  • extra_images (array) – the unlabeled additional images
numpy_datasets.images.tinyimagenet.load(path=None)[source]

Tiny Imagenet has 200 classes. Each class has 500 training images, 50 validation images, and 50 test images. We have released the training and validation sets with images and annotations. We provide both class labels and bounding boxes as annotations; however, you are asked only to predict the class label of each image without localizing the objects. The test set is released without labels.