Artificial Intelligence, Deep Learning and Artificial Neural Networks form the basis of many Machine Learning algorithms that can be used to simplify real-world problems. Machine Learning aims to remove the need to write long and complex programs by hand, and instead focuses on training the machine to learn from training data sets. Though human-like thinking and decision making is still far off, ML algorithms can be used in various scenarios to predict human behaviour, which would otherwise have been impossible.

Typical examples of the use of ML algorithms are shopping recommendations on Amazon or Flipkart based on your previous buying decisions, and suggestions from Netflix or Amazon Prime based on the genres of movies you have watched earlier.

New fields such as Predictive Analytics and Data Science are emerging from the transition to machine-based thinking and learning in place of human intervention. Of course, there are various algorithms to learn and measure, and various ways to optimise them, but those come later. One must first know the basics: where these algorithms came from, how the idea was formed, and how Artificial Neural Networks and Deep Learning relate to each other. In this article, we will focus on the very basics, so that someone looking to jump into this domain can form a basic idea, and a professional can use it as a reference.

What is AI

Intelligence, according to the Oxford Dictionary, is defined as follows:

“The ability to acquire and apply knowledge and skills.”

The definition of Artificial Intelligence goes like this:

“The theory and development of computer systems able to perform tasks normally requiring human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.”

AI can be used to perform specific tasks, such as medical diagnosis, remote sensing and trading. It has become very popular and is widely used in e-commerce, finance, logistics and other sectors. One example of AI is IBM’s Watson, which can understand, reason and learn from all of your data, including unstructured text, images, audio and video. Another instance is Apple’s Siri, a personal assistant which uses voice queries and a natural language user interface to answer questions, make recommendations, and perform actions by delegating requests to a set of Internet services.

AI can have various end goals, and different approaches and algorithms can be applied to reach each of them. Here we will deal with Neural Networks and give an introduction to Deep Learning.

Biological Neural Network vs Artificial Neural Network

In our brain, there is a chain of neurons which communicate with each other through axons. A neuron passes a message on to the next neuron only if the sum of its inputs crosses a threshold point.

Let’s understand the process with the help of a diagram:

[Figure: Neural Network]

The input to each neuron is a function of the weighted sum of its input signals, where the weights are (w1, w2, w3, ..., wn). Each neuron applies a transformation function to the weighted inputs, and the net input is then calculated. The net input determines whether the activation point has been reached, that is, whether the threshold has been crossed.
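The weighted sum and threshold check described above can be sketched in a few lines of Python. This is only an illustration: the function name `neuron_fires`, the sample inputs, the weights and the threshold value are all assumptions made for this example, and the sketch treats reaching the threshold exactly as crossing it.

```python
def neuron_fires(inputs, weights, threshold):
    """Return True when the weighted sum of the inputs reaches the threshold."""
    # Net input: each input signal multiplied by its weight, then summed.
    net_input = sum(x * w for x, w in zip(inputs, weights))
    return net_input >= threshold


# Hypothetical values, chosen only for illustration.
inputs = [1.0, 0.0, 1.0]
weights = [0.5, 0.9, 0.25]

# The weighted sum is 0.5 + 0.0 + 0.25 = 0.75.
print(neuron_fires(inputs, weights, threshold=0.7))  # True
print(neuron_fires(inputs, weights, threshold=0.8))  # False
```

With the same inputs, raising the threshold from 0.7 to 0.8 is enough to stop the neuron from passing the message on.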

Each neuron receives thousands of inputs per second, generated by our senses or activities. All the inputs are summed up, and if the threshold is passed, the signal reaches the brain, which takes some action in voluntary or involuntary form.

Our brain “thinks” or “takes actions” depending on the weightage of all the inputs received from the neurons. Also, learning and experience cause the neural networks in the brain to modify themselves over time.

Remember the concept of Artificial Intelligence? “The capability of a machine to think on its own and learn from experience.” Well, the very process by which our brain learns from experience is the inception of this idea.

It all might sound very complicated, but for those of you who are new to this, don’t be scared yet! We are going to make it easy for you.

Okay, so to repeat, the Artificial Neural Network is a partial replication of the Biological Neural Network. Artificial Neural Networks function through the adaptive weights between neurons. By adaptive, we mean that the weights are modified by a learning algorithm that learns from the training data set to produce a desired output.
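To make “adaptive weights” concrete, here is a minimal sketch that uses the classic perceptron learning rule. Note that this rule is swapped in for illustration because it is short; it is a simpler technique than backpropagation. The function names, the learning rate, the number of passes and the choice of the logical AND function as training data are all assumptions made for this example.

```python
def predict(weights, bias, inputs):
    """Fire (1) when the weighted sum plus bias is positive, else stay silent (0)."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(samples, learning_rate=1.0, epochs=20):
    """Adjust the weights a little after every wrong prediction."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            # error is +1, 0 or -1 depending on how the guess missed.
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical training set: the logical AND of two inputs.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(and_samples)
print([predict(weights, bias, x) for x, _ in and_samples])  # [0, 0, 0, 1]
```

The weights start at zero and are changed only when a prediction is wrong, so the learned values come entirely from the training data rather than from hand-written rules.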

When it comes to artificial neural networks, training plays an integral part. One of the most common training methods is backpropagation (also called backward propagation), which is used together with the gradient descent algorithm to train artificial neural networks. We will talk about this in the next article.

Now, let's talk about a term called the “loss function”. It is defined as a function which maps a particular event to a real value that represents some ‘cost’ associated with that event. Here, the cost refers to the extent to which the resulting values differ from the expected ones. The lower the value of the loss function, the better optimised the network is.
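As a small illustration, the sketch below uses mean squared error, one common choice of loss function. The article does not name a specific loss, so this choice, the function name and the sample values are assumptions made for the example.

```python
def mean_squared_error(predicted, expected):
    """Average of the squared differences between predicted and expected values."""
    return sum((p - e) ** 2 for p, e in zip(predicted, expected)) / len(expected)


# Hypothetical expected outputs and two sets of predictions.
expected = [1.0, 0.0, 1.0]
close_guess = [1.0, 0.0, 0.5]
poor_guess = [0.0, 1.0, 0.5]

print(mean_squared_error(close_guess, expected))  # small cost
print(mean_squared_error(poor_guess, expected))   # larger cost
```

The predictions that sit closer to the expected values produce the lower loss, which is exactly what training tries to achieve.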

ANNs work with something called the “hidden state”. The hidden states act as the intermediate stage between the input and the output. Let’s look at the following diagram:

Looks complex, right? Well, let’s break it down one by one. There is an input layer, followed by a hidden layer, concluding with the output layer. Each layer consists of one or more neurons.

We have three inputs and we want to find the probability that the output will be 1 or 2. For this prediction we need a series of hidden layers in between the input and the output. The values in the input layer determine which neurons in the hidden layer are activated. There can be multiple hidden layers, each driven by a probabilistic combination of the previous one, which together finally predict the outcome. The hidden layers allow the algorithm to learn from every prediction.
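The flow from three inputs through a hidden layer to two output probabilities can be sketched as follows. Everything specific here is an assumption made for illustration: the hidden layer of three neurons, the sigmoid and softmax functions, and all weight and bias values are hypothetical, and a real network would learn its weights from training data.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(values)
    exps = [math.exp(v - highest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def layer(inputs, weights, biases):
    """Weighted sum plus bias for every neuron in one layer."""
    return [b + sum(w * x for w, x in zip(row, inputs))
            for row, b in zip(weights, biases)]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> hidden layer -> output probabilities."""
    hidden = [sigmoid(v) for v in layer(inputs, hidden_w, hidden_b)]
    return softmax(layer(hidden, output_w, output_b))


# Hypothetical weights: 3 inputs -> 3 hidden neurons -> 2 outputs.
hidden_w = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.5], [-0.6, 0.8, 0.9]]
hidden_b = [0.0, 0.1, -0.1]
output_w = [[0.5, -0.3, 0.8], [-0.2, 0.6, 0.4]]
output_b = [0.0, 0.0]

probabilities = forward([1.0, 0.5, -1.0], hidden_w, hidden_b, output_w, output_b)
print(probabilities)  # two probabilities, one for each possible output
```

The two printed numbers add up to 1, so they can be read as the probability of each of the two possible outputs.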

Artificial Neural Networks are indeed very powerful, but they can become very complex. Problems such as over-fitting or the dimensionality problem can also occur and restrict proper data modelling.

Deep Learning

Deep learning consists of neural algorithms that take a series of input data and give the desired output after various non-linear transformations of that data. Deep learning is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms. Learning can be supervised, partially supervised (semi-supervised) or unsupervised.

According to Wikipedia, Deep learning is a class of machine learning algorithms that use a cascade of many layers of nonlinear processing units for feature extraction and transformation. Each successive layer uses the output from the previous layer as input. The algorithms are based on the (unsupervised) learning of multiple levels of features or representations of the data. Higher level features are derived from lower level features to form a hierarchical representation.
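The idea of a cascade, where each layer takes the previous layer's output as its input, can be sketched with a simple loop. This is an illustration only: the ReLU non-linearity, the function names and the tiny hand-picked weights are assumptions, not something the definition above prescribes.

```python
def relu(values):
    """A simple non-linear unit: negative values become zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Weighted sum plus bias for every neuron in one layer."""
    return [b + sum(w * x for w, x in zip(row, inputs))
            for row, b in zip(weights, biases)]


def cascade(inputs, layers):
    """Feed the output of each layer into the next one."""
    activations = inputs
    for weights, biases in layers:
        activations = relu(dense(activations, weights, biases))
    return activations


# Hypothetical two-layer stack: 2 inputs -> 2 neurons -> 1 neuron.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [-1.0]),
]

print(cascade([2.0, 1.0], layers))  # [1.5]
```

Adding more `(weights, biases)` pairs to the `layers` list makes the network deeper without changing the loop.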

One major area of deep learning is unsupervised feature extraction. This is when an algorithm can identify and derive meaningful sets of data by itself, and then use them for further learning and understanding. Good news for the data scientists indeed!

It is currently one of the most popular classes of Machine Learning algorithms. It has found uses in various fields, including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation and bioinformatics.

To conclude, AI is already transforming the world as we know it and shaping every sector. New research is published in this field every day, all aimed at reaching that higher level of human-like intelligence and decision-making capability. This article was a basic overview of the origin and the applications of ANN and Deep Learning. We will talk about the popular Machine Learning algorithms in the next article.