Introduction to Convolutional Neural Networks –
• Convolutional Neural Networks are similar to typical neural networks.
• A CNN takes images as input.
• This allows us to solve problems in image recognition, object detection, and other computer vision applications.
The Convolutional Neural Network is a type of Deep Learning algorithm whose core operation is not the typical matrix-based operation of an ordinary network; instead, it is based on the mathematical operation called convolution.
In mathematics, convolution can be defined as the integral of the product of two functions, producing a new (third) function, after one of the two is reversed and shifted.
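For discrete signals, the "reverse and shift" definition above reduces to a sum of products at each shift. A minimal sketch using NumPy's built-in 1D convolution (the example signal and kernel values are illustrative, not from the text):

```python
import numpy as np

# Discrete convolution: one sequence is reversed and slid across the other;
# at each shift, overlapping values are multiplied and summed.
signal = np.array([1, 2, 3])
kernel = np.array([0, 1, 0.5])

result = np.convolve(signal, kernel)  # "full" mode: every overlap position
print(result)  # → [0.  1.  2.5 4.  1.5]
```

Note that deep learning libraries usually implement cross-correlation (no kernel reversal) but still call it "convolution"; for learned filters the distinction does not matter.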
Convolutional Neural Networks are mainly inspired by biology: the connectivity pattern between their neurons resembles that of the animal visual cortex. CNNs are also useful for image recognition and real-time video classification, and need smaller amounts of data to do so than other algorithmic approaches.
A CNN also has one input layer and one output layer, with a number of hidden layers in between that classify the input image.
CNNs are very similar to ordinary neural networks: they are made up of neurons that have learnable weights and biases. Each neuron receives some inputs, performs a dot product, and follows it with a non-linearity. The architecture of a CNN mainly consists of an input image, convolutional layers, pooling layers, and a fully connected layer attached at the penultimate stage to produce the output.
The whole network still expresses a single differentiable score function, and still has a loss function (e.g. SoftMax or SVM loss) on the fully connected layers; in addition, we can use one or more filters in each convolutional layer of the CNN.
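The SoftMax function mentioned above turns the raw class scores from the last fully connected layer into probabilities. A minimal sketch (the score values are made up for illustration):

```python
import numpy as np

def softmax(scores):
    # Subtract the max score for numerical stability before exponentiating,
    # then normalize so the outputs sum to 1 (a probability distribution).
    e = np.exp(scores - np.max(scores))
    return e / e.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs)  # largest score gets the largest probability; probs sum to 1
```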
Convolution Neural Networks Architecture –
The hidden layers of a Convolutional Neural Network typically consist of a series of convolutional layers that convolve the input with a set of filters using element-wise multiplication and summation (a dot product).
Convolutional Layer –
Let's take an image with only the Red channel as an example and compute its convolutional layer output by simple arithmetic, with the help of a 2×2 filter and an initially empty output matrix.
To fill the output matrix, we slide the 2×2 filter over the Red channel matrix from left to right, take the dot product of the filter with the overlapping pixel values, and store the result in the output matrix, as shown in the figure below.
We repeat the above step, moving the filter from left to right by one cell (a stride) at a time and storing each result in the output matrix, until we have covered the entire image. We then repeat this whole process for the Green and Blue channels to get a 3D volume as output. We can use one or more filters in the upcoming Conv layers (the more filters we use, the more distinct features we can extract).
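The sliding-window procedure above can be sketched as follows. The channel and filter values here are placeholders for illustration; CNNs in practice use cross-correlation (no filter flip), which is what this sketch computes:

```python
import numpy as np

def conv2d(channel, kernel, stride=1):
    """Slide a k x k filter over a single channel, taking the dot product
    of the filter with the overlapping region at each position."""
    kh, kw = kernel.shape
    out_h = (channel.shape[0] - kh) // stride + 1
    out_w = (channel.shape[1] - kw) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = channel[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)  # element-wise product, summed
    return out

# Toy "Red channel" and a 2x2 filter (illustrative values)
red = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
kernel = np.array([[1, 0],
                   [0, 1]])
print(conv2d(red, kernel))  # → [[ 6.  8.]
                            #    [12. 14.]]
```

Repeating `conv2d` for the Green and Blue channels (and summing the per-channel results per filter) yields the output volume described above.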
The CNN also includes ReLUs, which filter the output of the convolution step, passing positive values through to the output unchanged and changing negative values to 0.
The activation function is commonly the ReLU function. Together with the fully connected layers and normalization layers, these are referred to as hidden layers because their inputs and outputs are masked by the activation function and the final convolution.
ReLu Function –
f(x) = max(0, x)

where x is the input to the neuron.
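The ReLU definition above is a one-liner in code (the sample inputs are illustrative):

```python
import numpy as np

def relu(x):
    # max(0, x) element-wise: positives pass through, negatives become 0
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # → [0.  0.  0.  1.5 3. ]
```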
The next layer in the network is the pooling layer. Its main purpose is to reduce the spatial dimensions of the data propagating through the network.
Pooling Layers –
There are basically two types of pooling layers used in Convolutional Neural Networks.
• Max-pooling – In max-pooling, for each section of the image we scan, we keep only the highest value in that window.
• Average-pooling – In average-pooling, for each section we scan, we compute the average of the values in that window to produce the output.
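Both pooling types can be sketched with one function; here a non-overlapping 2×2 window is assumed, and the feature-map values are made up for illustration:

```python
import numpy as np

def pool2d(channel, size=2, mode="max"):
    """Non-overlapping size x size pooling over a single channel."""
    h, w = channel.shape
    out = np.empty((h // size, w // size))
    for i in range(h // size):
        for j in range(w // size):
            window = channel[i*size:(i+1)*size, j*size:(j+1)*size]
            # max-pooling keeps the highest value; average-pooling the mean
            out[i, j] = window.max() if mode == "max" else window.mean()
    return out

feature_map = np.array([[1, 3, 2, 0],
                        [4, 6, 1, 5],
                        [7, 2, 9, 8],
                        [0, 1, 3, 4]])
print(pool2d(feature_map, mode="max"))  # → [[6. 5.]
                                        #    [7. 9.]]
print(pool2d(feature_map, mode="avg"))  # → [[3.5 2. ]
                                        #    [2.5 6. ]]
```

Either way, a 4×4 input shrinks to 2×2, which is exactly the dimensionality reduction the pooling layer exists for.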
Fully-connected Layer –
Now, at the penultimate stage of the Convolutional Neural Network, the fully connected layer, we flatten the output and connect every single node of the current layer to every node of the next layer. This layer takes its input from the preceding layer, whether that is a convolutional, ReLU, or pooling layer, and outputs an n-dimensional vector (for example, one score per class).
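The flatten-and-connect step above can be sketched as a single matrix multiplication. The feature-volume shape, the class count of 10, and the random weights below are all placeholder assumptions; in a real network the weights are learned:

```python
import numpy as np

rng = np.random.default_rng(0)

# Suppose the last pooling layer produced a 2 x 2 x 3 feature volume.
features = rng.standard_normal((2, 2, 3))

# Flatten to a single vector, then connect every input node to every
# output node via a weight matrix (random placeholders, not trained).
flat = features.reshape(-1)               # shape (12,)
W = rng.standard_normal((10, flat.size))  # 10 output classes assumed
b = np.zeros(10)
scores = W @ flat + b                     # the n-dimensional output vector

print(scores.shape)  # → (10,)
```

Feeding `scores` through SoftMax would then give the per-class probabilities.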