Artificial Intelligence(2) — Supervised Learning, Unsupervised Learning, and Reinforcement Learning

Kasun Dissanayake
6 min read · Feb 4


Machine learning uses a number of algorithms to solve complex problems, and each of these algorithms falls into a certain category. The different types of machine learning algorithms are:

  • Supervised Learning
  • Unsupervised Learning
  • Reinforcement Learning

Now let's look at the definitions of each of these learning techniques. Supervised learning uses labeled data to train machine learning models. Labeled data means that the correct output for each input is already known, so the model just needs to learn the mapping from inputs to outputs. An example of supervised learning is training a machine to recognize the animal in an image: a model trained on labeled pictures can then identify whether a new picture shows a cat, a dog, or a chicken.
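To make the idea of labeled data concrete, here is a minimal sketch of a supervised model in plain Python. It uses a 1-nearest-neighbor rule, and instead of real images it assumes each animal is described by two made-up measurements (height and weight). The data values and the names TRAINING_DATA and predict are assumptions for illustration only.

```python
import math

# Hypothetical labeled training data: (height_cm, weight_kg) -> animal label.
# The numbers are invented purely for illustration.
TRAINING_DATA = [
    ((25.0, 4.0), "cat"),
    ((23.0, 3.5), "cat"),
    ((60.0, 30.0), "dog"),
    ((55.0, 25.0), "dog"),
    ((40.0, 2.5), "chicken"),
    ((38.0, 2.0), "chicken"),
]


def predict(features, training_data=TRAINING_DATA):
    """Return the label of the closest training example (1-nearest neighbor)."""
    _, nearest_label = min(
        training_data,
        key=lambda example: math.dist(example[0], features),
    )
    return nearest_label


# Usage: ask the model about an animal it has not seen before.
label = predict((24.0, 3.8))
```

The key point is that every training example carries its known answer, and the model only has to map new inputs onto those known outputs.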

Unsupervised learning uses unlabeled data to train machines. Unlabeled data means that there is no fixed output variable. The model learns from the data, discovers patterns and features in it, and returns the output. An example of an unsupervised learning technique is grouping images of vehicles into buses, cars, and trucks. The model learns by identifying features of each vehicle, such as its length and width, the front and rear end covers, the roof, the types of wheels used, etc. Based on these features, the model groups similar images together, and the resulting groups correspond to buses, cars, and trucks.
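As a rough sketch of how grouping without labels can work, the snippet below runs a small k-means clustering in plain Python. It assumes each vehicle is reduced to two made-up features (length and width in meters), and it takes the starting centroids as an argument so the result is repeatable. The data and the names VEHICLES and k_means are assumptions, not part of any library.

```python
import math


def k_means(points, initial_centroids, iterations=10):
    """Cluster points around centroids; returns (centroids, assignments)."""
    centroids = list(initial_centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its points.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments


# Hypothetical (length_m, width_m) measurements; note that there are no labels.
VEHICLES = [
    (4.2, 1.7), (7.0, 2.4), (11.5, 2.5),
    (4.5, 1.8), (7.5, 2.5), (12.0, 2.55),
    (4.8, 1.9), (8.0, 2.5), (12.5, 2.6),
]

# Assumption: the first three points serve as the starting centroids.
centroids, assignments = k_means(VEHICLES, initial_centroids=VEHICLES[:3])
```

The algorithm only returns group numbers. It never learns the words "bus", "car", or "truck"; a person looks at each group afterwards and gives it a name.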

Reinforcement learning trains a machine to take suitable actions that maximize a reward in a particular situation. It uses an agent and an environment to produce actions and rewards. The agent has a start state and an end state, and there might be different paths for reaching the end state, as in a maze. In this learning technique, there is no predefined target variable. Consider the example of a dog and its owner, where the dog is the “agent”. The owner is in the garden with the dog and throws a stick. The thrown stick is the “state” for the agent, and the dog running after the stick is the “action”.

The result will be praise or food for the dog from the owner, which is the “reward” for the action. If the dog does not go after the stick and takes some alternative action instead, it may get some “punishment”. This is what Reinforcement Learning is all about.

Every Reinforcement Learning problem has some predefined components that help in representing and understanding the problem. The components are:

Agent: The agent takes actions. As mentioned earlier in our example, the dog is the agent.

Action (A): The agent has a set of actions A from which it selects the action to perform, just like the dog deciding whether to go after the stick, just look at the stick, or jump in place.

Discount Factor: The discount factor is a value between 0 and 1 by which future rewards are multiplied, so that they count for less than immediate rewards when the agent chooses an action. The lower the discount factor, the less future rewards matter and the more the agent focuses on short-term goals; the higher it is, the more weight future rewards carry.
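A small sketch can show the effect of the discount factor. It assumes the common convention that a reward received t steps in the future is weighted by the discount factor raised to the power t. The function name discounted_return, the variable gamma, and the reward values are illustrative assumptions.

```python
def discounted_return(rewards, gamma):
    """Sum the rewards, weighting the reward at step t by gamma ** t."""
    return sum(reward * gamma ** t for t, reward in enumerate(rewards))


# Hypothetical episode: nothing for two steps, then a treat worth 10.
rewards = [0, 0, 10]

patient = discounted_return(rewards, gamma=0.9)    # future treat keeps most of its value
impatient = discounted_return(rewards, gamma=0.5)  # future treat is worth much less
```

With the lower discount factor the same future treat contributes far less to the total, which is why such an agent leans towards short-term goals.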

Environment: The surroundings of the agent in which it moves. In the dog example, the environment consists of the owner and the garden in which the dog is present. The environment takes the agent’s current state and action as inputs and gives the agent its reward as an output.

State: A state is the immediate situation in which the agent finds itself in relation to other important things in the surroundings, like tools, obstacles, enemies, and prizes/rewards. Here the dog is required to

Reward (R): The reward is the output that the agent receives in response to its actions. For example, the dog (agent) receives dog food as a reward if it brings back the stick; otherwise, it receives a scolding as a punishment.

Policy: The policy is the strategy that the agent uses to determine which action to take based on the current state. Basically, the agent maps states to actions, i.e. for each state it picks the action that provides the maximum reward. In the dog example, once the dog learns that dog food will be given as a reward if it brings back the stick, it will form its own policy to reap maximum rewards.
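The components above can be put together in a toy sketch of the dog example. The state name, action names, and reward values are all assumptions made for illustration. Also note that a real reinforcement learning agent cannot read the reward function directly and has to discover it by trial and error; this sketch only shows how the pieces fit together.

```python
# Hypothetical names and reward values, chosen only to mirror the dog example.
ACTIONS = ["fetch", "watch", "jump"]


def environment(state, action):
    """The owner's response: a reward for the given state and action."""
    if state == "stick_thrown" and action == "fetch":
        return 1.0   # praise or food
    return -1.0      # scolding


def best_policy(states, actions, reward_fn):
    """Map each state to the action with the highest immediate reward."""
    return {s: max(actions, key=lambda a: reward_fn(s, a)) for s in states}


# The policy is just a lookup table from state to action.
policy = best_policy(["stick_thrown"], ACTIONS, environment)
action = policy["stick_thrown"]
```

Here the policy is a plain dictionary, which matches the idea of mapping states to actions.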

Machine Learning Algorithms

Now let's look at the machine learning algorithms that come under these learning techniques. Some of the commonly used supervised learning algorithms are:

  • Linear regression
  • Logistic regression
  • Support Vector Machine
  • K nearest neighbors
  • Decision Tree
  • Random Forest
  • Naive Bayes
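As a taste of the first algorithm in this list, here is a minimal sketch of simple linear regression with one input variable, fitted by ordinary least squares in plain Python. The function name fit_line and the month/sales numbers are made-up assumptions for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled data: month number -> units sold.
months = [1, 2, 3, 4]
sales = [3, 5, 7, 9]

slope, intercept = fit_line(months, sales)
forecast_month_5 = slope * 5 + intercept  # predict an unseen month
```

The known sales figures play the role of labels, and the fitted line is the learned mapping from input to output.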

Examples of Unsupervised Learning algorithms are:

  • k-means clustering
  • Hierarchical Clustering
  • Principal component analysis

Examples of Reinforcement Learning algorithms are:

  • Q-Learning
  • Monte Carlo
  • Deep Q Network
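To give a feel for Q-Learning, here is a minimal tabular sketch in plain Python. It assumes a tiny made-up environment: five states in a row, where the agent starts at state 0 and the end state is state 4, with a reward of 1.0 for reaching the end. The names step, train, and greedy_policy, and the values chosen for alpha and gamma, are illustrative assumptions.

```python
import random

N_STATES = 5          # states 0..4 in a row; state 4 is the end state
ACTIONS = [-1, +1]    # move left or move right


def step(state, action):
    """Hypothetical environment: reward 1.0 only when the end state is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn a table of Q-values by acting at random (trial and error)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate towards
            # reward + discounted value of the best next action.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """For every non-terminal state, pick the action with the highest Q-value."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}


q_table = train()
policy = greedy_policy(q_table)
```

The agent explores purely at random here to keep the sketch short. Practical implementations usually mix exploration with using what has already been learned.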

Now let's look at how these machine learning techniques work. Supervised learning takes labeled inputs and maps them to known outputs, which means you already know the target variable. Unsupervised learning finds patterns and trends in the data to discover the output, so the model tries to group the data based on the features of the input. Reinforcement learning follows a trial-and-error method to reach the desired solution: after accomplishing a task, the agent receives a reward. As an example, we could train a dog to catch a stick: if the dog learns to catch the stick, you give it a reward such as a biscuit.

The training process for each of Supervised, Unsupervised, and Reinforcement Learning.

Supervised learning methods need external supervision to train machine learning models, hence the name supervised. They need guidance and additional information, in the form of labeled examples, to return the result.

Unsupervised learning techniques do not need any supervision to train models. They learn on their own and predict the output.

Reinforcement learning methods do not need labeled supervision to train machine learning models either; the agent learns from the rewards and penalties it receives.

With that, let's focus on the types of problems that can be solved using these three types of machine learning techniques.

Supervised learning is generally used for classification and regression problems. Unsupervised Learning is used for clustering and association problems. Reinforcement learning is reward-based: for every step of a task that is completed correctly, the agent receives a reward, and if the task is not carried out correctly, it receives a penalty.

A few applications that use Supervised, Unsupervised, and Reinforcement Learning.

As I mentioned earlier, supervised learning is used to solve classification and regression problems. For example, you can predict the weather for a particular day based on humidity, precipitation, wind speed, and pressure values. You can use supervised learning algorithms to forecast sales for the next month or the next quarter for different products. Similarly, you can use them for stock price analysis or for identifying whether a tumor is malignant or benign.

In unsupervised learning, a common application is customer segmentation: based on customer behavior, likes, dislikes, and interests, you can cluster similar customers into groups. Unsupervised learning is also used in customer churn analysis.

Reinforcement learning algorithms are widely used in the gaming industry to build games, and they are also used to train robots to perform human tasks.

I hope you now have a better understanding of Supervised, Unsupervised, and Reinforcement Learning. See you in another tutorial.

Thank You!

If you want to know more about Artificial Intelligence, use this link to refer to the first tutorial of this series.



Kasun Dissanayake

Senior Software Engineer at IFS R & D International || Former Software Engineer at Pearson Lanka || Former Associate Software Engineer at hSenid Mobile