This article is, hopefully, the first in a series on Machine Learning. The series is based on the Stanford University ML course I’m enrolled in on Coursera. The whole point of the series, though, is to be different from the course itself: these articles will not contain any complex math, and every concept will be discussed using intuition. Let’s get started.
The term Machine Learning has become popular recently. If you look at the past 10 years of Google search trends for the phrase “Machine Learning”, you get the following.
The science of dealing with large sets of data, and trying to make some sense out of them, has been around for many decades. The concept of neural networks, on the other hand, has a different ultimate goal: Artificial Intelligence. While we are way behind schedule on creating our first true AI, people are applying neural networks to large data sets and producing very interesting results. For example, an engineer can feed millions of videos into a neural network and have it group together the videos in which a cat appears! While this may sound silly, the computer learns the concept of a “cat” without anyone explicitly giving it the properties or features of a cat. This is a classic example of a class of machine learning problems called Clustering.
A neuron is the fundamental building block of the human brain. Neurons are connected to each other and form a messy, complex network in our brains. Scientists have produced a mathematical model of this basic architecture of the human brain and are trying to emulate it on computers; they call it an Artificial Neural Network. The basics of this model are very simple. A neuron can have connections to any number (N) of other neurons. Each of these individual connections has its own strength, or weight. The neurons are arranged into layers, and neurons in the same layer are not connected to each other.
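To make the idea of weighted connections and layers concrete, here is a minimal sketch in Python. The function names, the sigmoid "squashing" step, the bias term, and all the numbers are assumptions chosen for illustration; the description above only says that connections have weights and that neurons sit in layers.

```python
import math

def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the incoming connections, squashed into the range (0, 1).

    The sigmoid squashing and the bias are assumptions, not part of the text.
    """
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def layer_output(inputs, layer_weights):
    """One layer: every neuron sees the same inputs but has its own weights.

    Neurons in the same layer never read from each other.
    """
    return [neuron_output(inputs, weights) for weights in layer_weights]

# Hypothetical numbers, chosen only for illustration.
inputs = [0.5, -1.0, 2.0]
hidden_weights = [
    [0.1, 0.4, -0.2],   # connection strengths into neuron 1
    [-0.3, 0.2, 0.5],   # connection strengths into neuron 2
]
hidden = layer_output(inputs, hidden_weights)
print(hidden)
```

Each inner list holds the connection strengths into one neuron, so changing a single number changes how strongly that neuron listens to one of its inputs.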
In very simple words, machine learning is a way to teach a computer to perform a task instead of explicitly programming it. In the past, we used programming to perform a task on a computer. We defined functions using a set of instructions the computer could understand, and the computer produced outputs by operating on the inputs. Now, what if the computer could write its own version of a program by looking at a set of inputs and the corresponding outputs? This is exactly what is called machine learning. Consider a function that estimates the price of a house.
The input of the function is the floor area of a building in square feet. Suppose that, in that particular town, the average price of one square foot is 100. We can easily calculate the price of a house by multiplying these two values. But this value might not be the real selling price, for several reasons. Say, for example, that in my town demand is high for houses smaller than 1000 square feet, because most home owners are small families. That trend cannot be captured by a simple function like this. And if we start feeding that sort of fine detail into the function, the function itself becomes very complex. There is still a chance that we might miss one important parameter or another.
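The hand-written rule described above can be sketched in a few lines of Python. The names `estimate_price` and `PRICE_PER_SQUARE_FOOT` are hypothetical; only the rate of 100 per square foot comes from the text.

```python
PRICE_PER_SQUARE_FOOT = 100  # average price in the town, as stated in the text

def estimate_price(floor_area_sqft):
    """Hand-written rule: price = floor area x average price per square foot."""
    return floor_area_sqft * PRICE_PER_SQUARE_FOOT

print(estimate_price(1500))  # 150000
```

Notice that nothing in this function knows about demand for small houses. Every such detail would have to be added by hand as another special case.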
But suppose you have access to 10 years of historical data on all real estate transactions in your town. This database includes the real selling price and the floor area of every building sold in the past 10 years. A batch of data is given below.
| FLOOR AREA | ACTUAL SELLING PRICE |
Note:- In Machine Learning (ML), the inputs are called features, and the output is called the target. In this case, the floor area is a feature, and the selling price is the target.
See, we would have missed a lot of insights without this data. Very small houses still have a decent base price, probably because the price of land also affects the house price. Also, very large houses tend to have a very high price, around 140 per square foot, probably because such large houses are built with all sorts of luxuries and are likely in a premium neighbourhood. Now, what if, based on this data, a computer could write (learn) its own function to predict the selling price of a house?
Such a learned function will usually return a better prediction than the human-written version of the same function we saw earlier. Why? Because it can pick up patterns in the data that we did not think to code by hand. This process is machine learning!
Note:- This particular kind of problem, where a set of features and targets is given and the machine is expected to predict a specific value on a continuous range of values, is called regression. In this case, it is a very simple version of regression called Linear Regression. We will discuss this in detail in another article.
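As a rough preview, here is a minimal sketch of how a computer might "learn" a straight-line price function from examples. It uses ordinary least squares, one standard way to fit a line, which may differ from the method covered later in the series. The data points are HYPOTHETICAL, invented only for illustration, and the names `fit_line` and `predict_price` are assumptions as well.

```python
def fit_line(areas, prices):
    """Ordinary least squares for one feature: price ~ slope * area + intercept."""
    n = len(areas)
    mean_x = sum(areas) / n
    mean_y = sum(prices) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(areas, prices))
    variance = sum((x - mean_x) ** 2 for x in areas)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# HYPOTHETICAL data, invented for illustration only (not real transactions).
floor_areas = [500, 1000, 1500, 2000, 2500]               # features
selling_prices = [80000, 120000, 160000, 210000, 280000]  # targets

slope, intercept = fit_line(floor_areas, selling_prices)

def predict_price(floor_area_sqft):
    """Learned rule: the slope and intercept come from the data, not from us."""
    return slope * floor_area_sqft + intercept

print(slope, intercept)
print(predict_price(1200))
```

With these made-up numbers the fitted line has a positive intercept, which plays the role of the "base price" that even a very small house carries. We never typed that rule in; it came out of the data.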
Machine learning is primarily classified into three major categories: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning differs from unsupervised learning in one respect: in unsupervised learning, the labels (the targets) are not used for training. The machine is supposed to learn from the features alone. (We will discuss how this is useful in a future session.) Reinforcement learning is outside the scope of this article.
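To see what learning from features alone can look like, here is a minimal sketch that splits houses into two groups using only their floor areas, with no prices involved. It uses a tiny one-dimensional version of k-means, a common clustering method that this article does not itself name. The data, the function name `two_means`, and the choice of starting centres are all assumptions made for illustration.

```python
def two_means(values, iterations=10):
    """Tiny 1-D k-means with k=2: group values around two centres, no labels needed."""
    centres = [min(values), max(values)]  # simple starting guess (an assumption)
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Put each value in the group whose centre is closest.
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            groups[nearest].append(v)
        # Move each centre to the average of its group (keep it if the group is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return groups

# HYPOTHETICAL floor areas; note there is no price column (no labels).
floor_areas = [450, 520, 600, 2400, 2600, 3000]
small, large = two_means(floor_areas)
print(small, large)
```

Nobody told the program what a "small" or a "large" house is. The two groups emerge from the feature values alone, which is the essence of unsupervised learning.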
I think this gives a good start into the world of machine learning. In the next article, we will discuss learning in detail by focusing on regression problems. And I would personally recommend enrolling in the machine learning course by Andrew Ng on Coursera, even if you don’t like math!