Four types of ML - Unsupervised, Semi-supervised, Supervised, Reinforcement Learning
RL - Reinforcement Learning - like control theory with AI
Supervised and unsupervised learning work with a static dataset - RL works with a dynamic environment
Goal is not to cluster or label data but to find the sequence of actions that generates the optimal outcome (optimal = collects the most reward)
Two parts : Agent (Ex. a video game player) and Environment (Ex. a video game)
Start State (Environment) -> Agent takes Action -> End State (Environment) + Reward (for that action, can be positive or negative)
Based on the reward, agent can learn which action to take in the future
Actually, the start state can be ignored for now
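A minimal sketch of the State -> Action -> State + Reward loop in Python. Everything here is made up for illustration and not from any library: `CorridorEnv` (walk cells 0..4, goal is cell 4), the reward values (+10 at the goal, -1 otherwise), and the `run_episode` and `always_right` helpers.

```python
class CorridorEnv:
    """Hypothetical toy environment: a corridor of cells 0..4, goal is the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position  # start state

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clamped to the corridor
        self.position = max(0, min(self.length - 1, self.position + action))
        done = self.position == self.length - 1
        reward = 10 if done else -1  # assumed values: positive at the goal, negative otherwise
        return self.position, reward, done  # end state, reward, episode finished?


def run_episode(env, policy, max_steps=100):
    """Run one episode: the agent picks an action, the environment answers with state + reward."""
    state = env.reset()
    total_reward = 0
    history = []  # (state, action, reward, next_state) tuples, the raw material for learning
    for _ in range(max_steps):
        action = policy(state)                       # Agent takes Action
        next_state, reward, done = env.step(action)  # Environment returns End State + Reward
        history.append((state, action, reward, next_state))
        total_reward += reward
        state = next_state
        if done:
            break
    return total_reward, history


def always_right(state):
    """A fixed, hand-written policy: ignore the state and always step right."""
    return +1


total, history = run_episode(CorridorEnv(), always_right)
print(total, len(history))
```

Here the policy is fixed, so nothing is learned yet; the loop only shows where states, actions and rewards come from.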
Some examples of Action, State, Reward
Action = You went to college, State = You got a job, Reward = Job pays well
Action = You looked both ways before crossing a street, State = You got to other side of the street, Reward = You didn't get hit by a bus
Action = You stayed up late during the exam, State = You're tired, Reward = You failed the exam (a negative reward)
Inside agent:
Input (State) ----> Neural Network (Policy) -----> Output (Action)
Policy is updated using a reinforcement learning algorithm (RLA); the RLA takes Actions, Rewards and States as inputs and updates the policy
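A rough sketch of that update step in Python, under loud assumptions: a small table of action preferences (`TabularPolicy`) stands in for the neural network, the update rule is REINFORCE (one possible RL algorithm, chosen here only as an example), and the task is a made-up one where the correct action equals the state. All names below are hypothetical.

```python
import math
import random


def softmax(prefs):
    """Turn raw preferences into action probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


class TabularPolicy:
    """Stand-in for the neural network: one row of action preferences per state."""

    def __init__(self, n_states, n_actions):
        self.prefs = [[0.0] * n_actions for _ in range(n_states)]

    def action_probs(self, state):
        return softmax(self.prefs[state])

    def act(self, state, rng):
        # Input (State) -> Policy -> Output (Action), sampled from the probabilities
        probs = self.action_probs(state)
        return rng.choices(range(len(probs)), weights=probs)[0]


def reinforce_update(policy, state, action, reward, lr=0.1):
    """The RLA: takes (state, action, reward) and nudges the policy.

    For a softmax policy, d log pi(action|state) / d pref[a] = 1[a == action] - pi(a|state).
    """
    probs = policy.action_probs(state)
    for a in range(len(probs)):
        grad_log_prob = (1.0 if a == action else 0.0) - probs[a]
        policy.prefs[state][a] += lr * reward * grad_log_prob


def toy_reward(state, action):
    # Assumed toy task: +1 if the action matches the state, -1 otherwise
    return 1.0 if action == state else -1.0


def train(steps=2000, seed=0):
    rng = random.Random(seed)
    policy = TabularPolicy(n_states=2, n_actions=2)
    for _ in range(steps):
        state = rng.randrange(2)
        action = policy.act(state, rng)
        reward = toy_reward(state, action)
        reinforce_update(policy, state, action, reward)
    return policy


policy = train()
print(policy.action_probs(0), policy.action_probs(1))
```

After training, the policy should put most of its probability on action 0 in state 0 and on action 1 in state 1, since rewarded actions become more likely and punished ones less likely.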