
A CNN trained to classify images across 5 classes, evaluated on accuracy.


ndvaughn/ImageClassification


Overview

This project uses a CNN architecture to classify images in two separate ways. The first is a traditional end-to-end convolutional neural network trained with backpropagation using the Adam optimizer. The second pairs a convolutional backend pretrained on an autoencoder task with a classifier head trained on our labeled dataset. Data was limited for this project; nevertheless, test-set accuracy of ~40% was achieved over 5 classes. (GPUs were not used.)

Image Classification Using CNN

CNNs used for image classification can often be viewed in two parts, where the first part learns a representation of the image data (an "encoding") and the second learns a mapping from this image encoding to the output space. Our CNN will follow this schema, and the output space will be the five class labels we are considering. The image encoder will be composed of convolutional layers, which maintain the spatial structure of the image data by sliding filters with learned weights across the image. Once we have our image encoding, we will use fully connected layers to learn a non-linear combination of this representation, and ideally map this representation into a linearly separable space that we can easily classify. Between layers, we will add non-linear activation functions that allow the network to learn a non-linear mapping.
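The two-part schema above can be sketched in PyTorch. The specific layer widths, the 64x64 input size, and the ReLU activations are illustrative assumptions, not the repository's actual hyperparameters; only the overall shape (convolutional encoder, then fully connected head over 5 classes) follows the text:

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    """Convolutional encoder followed by a fully connected classifier head."""

    def __init__(self, num_classes=5):
        super().__init__()
        # Part 1: convolutional layers preserve spatial structure while
        # sliding learned filters across the image.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),  # 32x32 -> 16x16
        )
        # Part 2: fully connected layers map the encoding to class scores,
        # with non-linear activations between layers.
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        return self.head(self.encoder(x))

model = SimpleCNN()
logits = model(torch.randn(8, 3, 64, 64))  # batch of 8 RGB 64x64 images
print(logits.shape)  # one score per class for each image in the batch
```

The flattened size (32 * 16 * 16) must match whatever spatial resolution survives the pooling layers, which is why the input size is fixed here; an adaptive pooling layer would remove that constraint.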

Image Classification Using Autoencoder

In many cases, we may have a large amount of unlabeled data. Such unlabeled data may still be useful if we can find a way to learn a good representation of the underlying structure. The general task of this form, called unsupervised learning, involves learning not how to predict the output label given the input feature vector, but instead the internal structure of the input relative to other data points. There are many architectures and training methods for doing unsupervised learning with neural nets; here, we'll focus on autoencoders.

We consider a good representation of our data to be a compact representation in a lower-dimensional space, since it would require our model to learn the simpler underlying structure of the data instead of memorizing specific features of the data. Specifically, if X is a space of possible food images, we want to learn a d-dimensional representation of the observed elements of X, where d is much smaller than the dimension of X. To learn such a representation, we can employ a function encoder : X → R^d, which maps input images to the d-dimensional representation. However, without a semantic representation of the important features (e.g. labels), we cannot directly train a network to learn the function encoder.

Instead of requiring labels, let's utilize what we do have: data. Instead of learning encoder : X → R^d directly, we can learn an identity function ident : X → X. Now we have both the inputs and the outputs, so we can learn this function. We seek to approximate the identity function by a functional form that compresses and then decompresses the input data. We can break this function down into two parts: ident = decoder ∘ encoder, where encoder : X → R^d and decoder : R^d → X. Our training pairs for learning this function will be of the form (x, x), where x is a food image.

Intuitively, the learned compressor (encoder) will throw away all irrelevant aspects of the data and keep only what is essential for the decompressor (decoder) to reconstruct the data. This will allow us to find a good representation of the data. A neural network that implements this idea is called an autoencoder. Note: in general, the performance of a neural network increases significantly with the amount of data. Thus, we will use both the training and test images (even though we don't have labels for the test images). This corresponds to 12000 training and 4000 test images. The remaining 4000 validation images will be used to evaluate the performance of the autoencoder.
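The ident = decoder ∘ encoder construction can be sketched in PyTorch. The layer sizes, the 64x64 input, the bottleneck width d = 32, and the use of transposed convolutions in the decoder are all illustrative assumptions; the text only specifies an encoder into R^d, a decoder back to image space, and training on (x, x) pairs:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    """Approximates the identity as decoder(encoder(x)), forcing the data
    through a d-dimensional bottleneck."""

    def __init__(self, d=32):
        super().__init__()
        # encoder : X -> R^d (strided convs downsample, then project to d dims)
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, d),
        )
        # decoder : R^d -> X (mirror of the encoder, upsampling back to 64x64)
        self.decoder = nn.Sequential(
            nn.Linear(d, 32 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (32, 16, 16)),
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1,
                               output_padding=1), nn.ReLU(),       # 16 -> 32
            nn.ConvTranspose2d(16, 3, 3, stride=2, padding=1,
                               output_padding=1), nn.Sigmoid(),    # 32 -> 64
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

ae = Autoencoder(d=32)
x = torch.rand(4, 3, 64, 64)        # a batch standing in for food images
recon = ae(x)
# Training pairs are (x, x): the reconstruction is compared to the input.
loss = nn.functional.mse_loss(recon, x)
```

Once trained, the encoder alone provides the d-dimensional representation that the classifier head in the second approach is trained on.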
