States, Actions, and Rewards
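Reinforcement learning (RL) is built on three core components: states, which describe the current situation of the environment; actions, which are the choices available to the agent; and rewards, which are the scalar feedback the agent tries to maximize over time. Q-learning learns an action-value function that estimates the expected return of taking each action in each state, while policy gradient methods optimize the policy's parameters directly.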
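To make states, actions, and rewards concrete, here is a minimal sketch of tabular Q-learning. Everything in it is an assumption made for illustration: the five-cell corridor environment, the names `step`, `train_q_table`, and `greedy_policy`, and the hyperparameter values. It covers only the action-value approach, not policy gradient methods.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells.
# State  = the agent's cell index (0..4).
# Action = 0 (move left) or 1 (move right).
# Reward = 1.0 on reaching the last cell, otherwise 0.0.
N_STATES = 5
ACTIONS = (0, 1)
GOAL = N_STATES - 1


def step(state, action):
    """Apply one action and return (next_state, reward, done)."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(GOAL, state + 1)
    done = next_state == GOAL
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train_q_table(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table q[state][action] with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when the estimates are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES)]


if __name__ == "__main__":
    q_table = train_q_table()
    print(greedy_policy(q_table)[:GOAL])  # actions for the non-terminal states
```

In this sketch the learned values for "move right" end up higher than those for "move left" in every non-terminal cell, so the greedy policy walks toward the goal. The learning rate `alpha`, discount `gamma`, and exploration rate `epsilon` are arbitrary choices for a small example and would need tuning on a real problem.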