The Q-learning is a Reinforcement Learning algorithm in which an agent tries to learn the optimal policy from its past experiences with the environment.