The image recognition program I developed is a machine-learning pattern classifier implemented as a neural network. The inputs to the network are the RGB values of a 32x32 pixel image (3,072 inputs total). A single hidden layer of 15 nodes sits between the input layer and the output layer. The output layer maps the inputs to one of the following ten image categories: Airplane, Automobile, Bird, Cat, Deer, Dog, Frog, Horse, Ship, Truck.
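The layer sizes above can be sketched as a forward pass. This is a minimal illustration, not the code from the repository: the `Layer` struct, the `predict` helper, and the choice of a sigmoid activation are all assumptions, since the write-up does not name an activation function.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Layer sizes taken from the description above.
constexpr std::size_t kInputs  = 32 * 32 * 3;  // 3,072 RGB values
constexpr std::size_t kHidden  = 15;
constexpr std::size_t kOutputs = 10;

// One fully connected layer: out[j] = sigmoid(bias[j] + sum_i weight[j][i] * in[i]).
// The sigmoid activation is an assumption made for this sketch.
struct Layer {
    std::vector<std::vector<double>> weights;  // [outputs][inputs]
    std::vector<double> biases;                // [outputs]

    Layer(std::size_t inputs, std::size_t outputs)
        : weights(outputs, std::vector<double>(inputs, 0.0)), biases(outputs, 0.0) {}

    std::vector<double> forward(const std::vector<double>& in) const {
        assert(!weights.empty() && in.size() == weights[0].size());
        std::vector<double> out(biases.size());
        for (std::size_t j = 0; j < out.size(); ++j) {
            double z = biases[j];
            for (std::size_t i = 0; i < in.size(); ++i) z += weights[j][i] * in[i];
            out[j] = 1.0 / (1.0 + std::exp(-z));
        }
        return out;
    }
};

// Index of the largest output activation, i.e. the predicted category.
std::size_t predict(const Layer& hidden, const Layer& output,
                    const std::vector<double>& pixels) {
    const std::vector<double> scores = output.forward(hidden.forward(pixels));
    std::size_t best = 0;
    for (std::size_t k = 1; k < scores.size(); ++k)
        if (scores[k] > scores[best]) best = k;
    return best;
}
```

Constructing `Layer hidden(kInputs, kHidden)` and `Layer output(kHidden, kOutputs)` gives the 3,072-15-10 shape; the index returned by `predict` would then be looked up in the list of ten category names.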
The CIFAR-10 training data set from http://www.cs.utoronto.ca/~kriz/cifar.html is used to train the network in 5 training batches. Each batch contains 10,000 images, roughly 1,000 per output category. For the first training run, the weights and biases of each layer are randomly initialized. Weights set how strongly each node influences the activation of the nodes in the next layer, while biases shift the threshold at which each node activates. Next, the input data is loaded and forward propagated through the network. The weights and biases are then adjusted based on the gradient of the cost function (a function that measures the difference between the network's actual output and the expected output). These steps are repeated on each subsequent run until the weights and biases converge.
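The training loop described above (random initialization, forward propagation, then a gradient-based adjustment) can be illustrated on a single layer. This is a sketch under stated assumptions rather than the project's implementation: the `DenseLayer` struct, the quadratic cost, the sigmoid activation, the normal-distribution initialization, and the learning rate parameter `rate` are all hypothetical choices. A full network would also propagate the error back through the hidden layer.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical single sigmoid layer, used only to illustrate one update step.
struct DenseLayer {
    std::size_t inputs, outputs;
    std::vector<double> w;  // row-major: w[j * inputs + i]
    std::vector<double> b;  // one bias per output node
};

// Random initialization for the first training run (standard normal draws, an assumption).
DenseLayer make_random_layer(std::size_t inputs, std::size_t outputs, unsigned seed) {
    std::mt19937 rng(seed);
    std::normal_distribution<double> dist(0.0, 1.0);
    DenseLayer layer{inputs, outputs, std::vector<double>(inputs * outputs),
                     std::vector<double>(outputs)};
    for (double& v : layer.w) v = dist(rng);
    for (double& v : layer.b) v = dist(rng);
    return layer;
}

// Forward propagation: a_j = sigmoid(b_j + sum_i w_ji * x_i).
std::vector<double> forward(const DenseLayer& l, const std::vector<double>& x) {
    assert(x.size() == l.inputs);
    std::vector<double> a(l.outputs);
    for (std::size_t j = 0; j < l.outputs; ++j) {
        double z = l.b[j];
        for (std::size_t i = 0; i < l.inputs; ++i) z += l.w[j * l.inputs + i] * x[i];
        a[j] = 1.0 / (1.0 + std::exp(-z));
    }
    return a;
}

// Quadratic cost: C = 1/2 * sum_j (a_j - y_j)^2.
double cost(const std::vector<double>& a, const std::vector<double>& y) {
    assert(a.size() == y.size());
    double c = 0.0;
    for (std::size_t j = 0; j < a.size(); ++j) c += 0.5 * (a[j] - y[j]) * (a[j] - y[j]);
    return c;
}

// One gradient-descent step on a single example.
// dC/dz_j = (a_j - y_j) * a_j * (1 - a_j); dC/dw_ji = dC/dz_j * x_i; dC/db_j = dC/dz_j.
void train_step(DenseLayer& l, const std::vector<double>& x,
                const std::vector<double>& y, double rate) {
    const std::vector<double> a = forward(l, x);
    for (std::size_t j = 0; j < l.outputs; ++j) {
        const double delta = (a[j] - y[j]) * a[j] * (1.0 - a[j]);
        for (std::size_t i = 0; i < l.inputs; ++i) l.w[j * l.inputs + i] -= rate * delta * x[i];
        l.b[j] -= rate * delta;
    }
}
```

Calling `train_step` repeatedly on the same example should drive `cost` down, which is the small-scale version of repeating the steps until the weights and biases converge.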
Finally, the network is tested on a separate batch of 10,000 images that the program has not previously processed, achieving 87% prediction accuracy. I also developed a separate image-handling module using Arash Partow’s bitmap_image.hpp, which allows the neural network to process bitmap images saved on disk. In future versions of the code, I would like to raise the prediction accuracy by increasing the size of the neural network, and also to integrate it into a web application.
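Feeding an arbitrary bitmap into the network requires reducing it to 3,072 input values. The sketch below shows one way this could be done and is not the module from the repository: the `RgbImage` struct and `to_network_input` function are hypothetical, nearest-neighbour sampling and scaling to [0, 1] are assumptions, and loading the pixels from disk (for example through bitmap_image.hpp) is left out. The output follows the CIFAR-10 layout of 1,024 red values, then 1,024 green, then 1,024 blue, each in row-major order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory image: interleaved 8-bit RGB, row-major.
struct RgbImage {
    std::size_t width, height;
    std::vector<std::uint8_t> data;  // size = width * height * 3
};

// Resample an image to 32x32 with nearest-neighbour sampling and flatten it
// into 3,072 values in [0, 1], ordered as all red, then all green, then all blue.
std::vector<double> to_network_input(const RgbImage& img) {
    constexpr std::size_t kSide = 32;
    assert(img.width > 0 && img.height > 0);
    assert(img.data.size() == img.width * img.height * 3);
    std::vector<double> input(kSide * kSide * 3);
    for (std::size_t y = 0; y < kSide; ++y) {
        const std::size_t src_y = y * img.height / kSide;  // nearest source row
        for (std::size_t x = 0; x < kSide; ++x) {
            const std::size_t src_x = x * img.width / kSide;  // nearest source column
            const std::size_t base = (src_y * img.width + src_x) * 3;
            for (std::size_t c = 0; c < 3; ++c)
                input[c * kSide * kSide + y * kSide + x] = img.data[base + c] / 255.0;
        }
    }
    return input;
}
```

Nearest-neighbour sampling keeps the sketch short; averaging each block of source pixels would likely give a smoother 32x32 image at the cost of a little more code.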
Some of the neural network’s predictions on random images from the internet are shown below. In each pair, the image on the left is the original, and the image on the right is the 32x32 version created by the image-handling module to feed into the neural network. Some of the program's guesses were quite interesting.
The program is written in C++.
Check out the code at: https://github.com/nablul/Neural-Network-Img-Recognition
Copyright © 2022 Nablul - All Rights Reserved.