
Unsupervised, Supervised, Reinforcement Learning

Three main types of ML - Unsupervised, Supervised, Reinforcement Learning (Semi-supervised learning sits between the first two, mixing labeled and unlabeled data)


RL - Reinforcement Learning - like control theory with AI

Supervised and unsupervised learning work with a static dataset - RL works with a dynamic environment

Goal is not to cluster or label data but to find the best sequence of actions, the one that generates the optimal outcome (optimal = collects the most total reward)
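To make "best sequence of actions = most total reward" concrete, here is a minimal sketch. The world, the reward table REWARD_AT and the function total_reward are all made up for illustration. It tries every 3-step sequence by brute force, which a real RL algorithm would not do; the point is only that the best sequence can include a step with a negative reward.

```python
from itertools import product

# Hypothetical deterministic world: the state is an integer position, starting at 0.
# Stepping onto position 2 costs 5; stepping onto position 3 pays 10.
REWARD_AT = {2: -5, 3: 10}

def total_reward(actions, start=0):
    """Sum the rewards collected by following a sequence of -1/0/+1 moves."""
    position, total = start, 0
    for move in actions:
        position += move
        total += REWARD_AT.get(position, 0)
    return total

# Try every 3-step sequence of moves and keep the one with the highest total.
sequences = list(product([-1, 0, 1], repeat=3))
best = max(sequences, key=total_reward)
print(best, total_reward(best))
```

Standing still collects 0, but moving right three times collects -5 + 10 = 5, so accepting a short-term penalty gives the best outcome here.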


Two parts: Agent (e.g. a video game player) and Environment (e.g. a video game)

Start State (Environment) -> Agent takes Action -> End State (Environment) + Reward (for that action, can be positive or negative)

Based on the reward, agent can learn which action to take in the future

Actually, the start state can be ignored for now
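The State -> Action -> new State + Reward loop above can be sketched in a few lines. The Corridor environment, its reward values and the run_episode helper are hypothetical names invented for this sketch, not part of any library.

```python
import random

class Corridor:
    """Hypothetical toy environment: positions 0..4, goal at position 4."""
    def __init__(self):
        self.state = 0

    def step(self, action):
        # action is -1 (left) or +1 (right); walls clamp the position to 0..4.
        self.state = min(4, max(0, self.state + action))
        done = self.state == 4
        reward = 1 if done else -1  # assumption: every non-goal step costs 1
        return self.state, reward, done


def run_episode(env, choose_action, max_steps=50):
    """Run the loop: agent sees state, takes action, gets new state and reward."""
    history = []
    state = env.state
    for _ in range(max_steps):
        action = choose_action(state)
        next_state, reward, done = env.step(action)
        history.append((state, action, next_state, reward))
        state = next_state
        if done:
            break
    return history


rng = random.Random(0)
random_history = run_episode(Corridor(), lambda s: rng.choice([-1, 1]))
right_history = run_episode(Corridor(), lambda s: 1)
print(len(right_history), sum(r for *_, r in right_history))
```

The agent here does not learn anything yet; it only shows what data (state, action, new state, reward) a learning algorithm would get to work with.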


Some examples of Action, State, Reward

Action = You went to college, State = You got a job, Reward = Job pays well

Action = You looked both ways before crossing a street, State = You got to the other side of the street, Reward = You didn't get hit by a bus

Action = You stayed up late the night before the exam, State = You're tired, Reward = You failed the exam (a negative reward)



Inside agent:

Input (State) ----> Neural Network (Policy) -----> Output (Action)
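A minimal sketch of the State -> Policy -> Action pipeline, assuming the state is a short list of numbers and the actions are numbered 0, 1, ... A single linear layer with a softmax stands in for the neural network; LinearPolicy and its method names are hypothetical.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class LinearPolicy:
    """One linear layer + softmax; a stand-in for a larger neural network."""
    def __init__(self, n_state_features, n_actions, seed=0):
        rng = random.Random(seed)
        # One row of weights per action, initialised to small random values.
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_state_features)]
                        for _ in range(n_actions)]

    def action_probabilities(self, state):
        scores = [sum(w * x for w, x in zip(row, state)) for row in self.weights]
        return softmax(scores)

    def act(self, state, rng=random):
        # Sample an action index according to the policy's probabilities.
        probs = self.action_probabilities(state)
        return rng.choices(range(len(probs)), weights=probs)[0]

policy = LinearPolicy(n_state_features=3, n_actions=2)
state = [1.0, 0.5, -0.2]
probs = policy.action_probabilities(state)
action = policy.act(state)
print(probs, action)
```

Sampling from the probabilities, instead of always picking the highest one, lets the agent occasionally try actions it currently rates lower.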


Policy is updated using a reinforcement learning algorithm (RLA). The RLA takes inputs (actions, rewards and states) and updates the policy
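The notes do not say which algorithm updates the policy, so the sketch below picks one as an assumption: a REINFORCE-style policy-gradient update. To keep it short, the environment is a hypothetical one-state game with two actions (action 1 pays +1, action 0 pays -1), and the policy is just two numbers (preferences) passed through a softmax.

```python
import math
import random

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(steps=2000, learning_rate=0.1, seed=0):
    rng = random.Random(seed)
    preferences = [0.0, 0.0]  # the policy's parameters, one per action
    for _ in range(steps):
        probs = softmax(preferences)
        action = rng.choices([0, 1], weights=probs)[0]
        # Hypothetical environment: action 1 pays +1, action 0 pays -1.
        reward = 1.0 if action == 1 else -1.0
        # REINFORCE-style update: nudge each preference along
        # reward * d log pi(action) / d preference_k,
        # where the derivative is (1 if k == action else 0) - probs[k].
        for k in range(2):
            grad = (1.0 if k == action else 0.0) - probs[k]
            preferences[k] += learning_rate * reward * grad
    return softmax(preferences)

final_probs = train()
print(final_probs)
```

Rewarded actions become more likely and punished ones less likely, so after training the policy should choose action 1 almost every time. With more than one state, the same update would be applied to the weights of a network that takes the state as input.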