Supervised Learning is a Machine Learning task in which a function is learned from data so that it can predict an output value for a new input without being explicitly programmed to do so.
In other words, a machine is trained on a labelled dataset, and the function learned from these example input-output pairs maps an input to an output. It is a task-driven technique in which the machine predicts values for new inputs.

So, a supervised learning algorithm analyzes the training data and produces an inferred function that suits the desired task and can then be used for mapping new examples.
In supervised learning, we learn a mapping from an input to the required output. If the output we are looking for is categorical, such as whether someone buys a computer at an electronics shop or whether a tested patient has a disease, the problem is called a classification problem. If a single algorithm fails to deliver the desired output, we examine where its predictions go right or wrong, and then update the training data with newly labelled real-world examples to get better results.
In other cases, if the task is to predict a continuous quantity, such as how much rainfall to expect tomorrow, the problem is called a regression problem.
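A minimal sketch of this distinction, assuming scikit-learn is available (the income figures and targets below are invented purely for illustration), shows the same kind of input feeding either a classifier for a discrete target or a regressor for a continuous one:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = np.array([[20], [35], [50], [65], [80]])   # hypothetical feature, e.g. monthly income

# Classification: discrete target ("buys a computer": 0 = no, 1 = yes)
y_class = np.array([0, 0, 1, 1, 1])
clf = DecisionTreeClassifier().fit(X, y_class)
print(clf.predict([[55]]))                     # -> a class label (0 or 1)

# Regression: continuous target (e.g. expected amount spent)
y_reg = np.array([120.0, 250.0, 430.0, 610.0, 800.0])
reg = DecisionTreeRegressor().fit(X, y_reg)
print(reg.predict([[55]]))                     # -> a real-valued prediction
```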
To solve a given supervised learning problem, one has to follow the steps below (a short code sketch after the list walks through them):
1)- First of all, determine the type of training examples. As in the examples above of computer buyers or forecast data, decide what kind of data is to be used as the training set.
2)- Gather a training set that is representative of the real-world use of the function. Both the inputs and the corresponding outputs are collected.
3)- Determine the input feature representation of the learned function. The accuracy of the learned function depends strongly on how the input object is represented. The input object is transformed into a feature vector, which contains a number of features that describe the object.
4)- Determine the structure of the learned function and the corresponding learning algorithm, for example decision trees or support vector machines.
5)- Run the learning algorithm on the training set. Some learning algorithms require certain control parameters to be set; these parameters are adjusted by optimizing performance on a validation set held out from the training data.
6)- The last step is evaluating the accuracy of the learned function. In this step, the accuracy of the function/algorithm is tested on a separate dataset other than the training set.
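As a minimal end-to-end sketch of these six steps, assuming scikit-learn and a synthetic dataset in place of real buyer or forecast data, the workflow might look like this:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Steps 1-3: gather labelled examples and represent each input as a feature vector.
X, y = make_classification(n_samples=500, n_features=5, random_state=0)

# Hold out data: training set, validation set (for tuning), test set (for step 6).
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

# Step 4: choose a model structure and learning algorithm (here, a decision tree).
# Step 5: run the algorithm and tune a control parameter on the validation set.
best_depth, best_acc = None, 0.0
for depth in (2, 4, 8):
    model = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    acc = accuracy_score(y_val, model.predict(X_val))
    if acc > best_acc:
        best_depth, best_acc = depth, acc

# Step 6: evaluate the learned function on data it has never seen.
final = DecisionTreeClassifier(max_depth=best_depth, random_state=0).fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, final.predict(X_test)))
```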
Algorithm Choice-
There are many algorithms available for supervised learning, with a wide range of uses, and every algorithm has its own strengths and weaknesses. No single algorithm works best for every supervised learning problem (a short comparison sketch follows the lists below). There are four main issues to consider when choosing an algorithm:
- Bias-variance trade-off
- Function complexity and amount of training data
- Dimensionality of the input space
- Noise in the output values
The most widely used algorithms are:
- Naïve Bayes
- Neural Network
- Similarity Learning
- Support Vector Machine
- Linear Regression
- Logistic Regression
- Linear Discriminant Analysis
- Decision Tree
- K-Nearest Neighbor Algorithm
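As a rough illustration that no single algorithm from this list dominates, here is a hedged sketch (assuming scikit-learn and its bundled breast-cancer dataset) that cross-validates a few of them on the same data; which one scores best depends on the data at hand:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)

# Three of the algorithms listed above, compared with 5-fold cross-validation.
models = {
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=5000),
    "K-Nearest Neighbors": KNeighborsClassifier(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```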
Supervised Learning Mathematical Approach
Given:
- A set of input features X1, X2, …, Xn
- A target feature Y
- A set of training examples where the values for the input and target features are given for every single example
- A new example, where only the values for the input features are given
Predict the value of the target feature for the new example:
- Classification when Y is discrete
- Regression when Y is continuous
Classification

Example –
Credit scoring: differentiating between high-risk and low-risk customers based on their income and savings.
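A minimal sketch of this credit-scoring example, with invented income/savings figures and risk labels and scikit-learn assumed, could look like this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [income, savings]; label 1 = high-risk, 0 = low-risk (hypothetical data).
X = np.array([[25, 2], [30, 5], [45, 20], [60, 35], [80, 50], [28, 1]])
y = np.array([1, 1, 0, 0, 0, 1])

clf = LogisticRegression().fit(X, y)
print(clf.predict([[40, 10]]))   # predicted risk class for a new applicant
```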
Regression

Example –
Price of a second-hand car
X: car attributes
Y: price
y = g(x, θ), where g() is the model and θ are its parameters
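A small sketch of y = g(x, θ) where g is taken to be a linear model (scikit-learn assumed; the car attributes and prices below are invented) shows how learning fixes the parameters θ:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[2, 3], [4, 6], [6, 9], [8, 12], [10, 15]])   # car attributes x: [age in years, mileage in 10k km]
y = np.array([18000, 14500, 11000, 8000, 5500])             # observed prices y

g = LinearRegression().fit(X, y)       # fitting chooses the parameters θ
print(g.coef_, g.intercept_)           # θ: one weight per attribute plus a bias
print(g.predict([[5, 7]]))             # predicted price for a new car
```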
Real-world example-
Consider the following graphical representation of buyers, in which there are two possibilities: a person either buys the computer or does not. In this case, we have to train our algorithm so that it can classify who will or will not buy a computer from the shop. These possibilities depend on many factors, such as income, savings, budget and environmental conditions as well as other external causes.

So, to overcome this problem, we can try multiple possible classifiers and choose the one best suited to the situation and to the real-world likelihood that a customer will or will not buy the computer.
For example, suppose the first classifier simply draws a vertical line through the data: every point to the left of the line (the red points) is classified as "will not buy a computer", and every point to the right is classified as "will buy a computer". This classifier is fast, but its accuracy is very low.

Now consider a classifier in which the separating line is tilted so that the buyers and the non-buyers are separated as well as possible. In this case, the classifier is both more accurate and still efficient, making it more reliable than the classifier shown above.
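As a rough sketch of the two classifiers described above (scikit-learn assumed; the buyer data is generated for illustration), a depth-1 decision tree can only split on a single feature, i.e. a vertical or horizontal line, while logistic regression can learn a tilted boundary over both features:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic buyer data: whether someone buys depends on both income and savings.
rng = np.random.default_rng(0)
income = rng.uniform(20, 100, 200)
savings = rng.uniform(0, 50, 200)
X = np.column_stack([income, savings])
y = (income + 2 * savings > 110).astype(int)   # 1 = buys a computer (hypothetical rule)

stump = DecisionTreeClassifier(max_depth=1).fit(X, y)       # single axis-aligned split (vertical line)
tilted = LogisticRegression(max_iter=1000).fit(X, y)        # tilted linear boundary

print("vertical-line classifier accuracy:", stump.score(X, y))
print("tilted-line classifier accuracy  :", tilted.score(X, y))
```

On data like this, where buying depends on both features together, the tilted boundary separates the two groups far better than the single vertical split, which matches the comparison made above.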
