Guiding Autonomous Agents to Better Behaviors through Human Advice
Indiana University at Bloomington, Bloomington, United States
Inverse Reinforcement Learning (IRL) is an approach for domain-reward discovery from demonstration, where an agent mines the reward function of a Markov decision process by observing an expert acting in the domain. In the standard setting, it is assumed that the expert acts nearly optimally and that a large number of trajectories, i.e., training examples, are available for reward discovery and, consequently, for learning domain behavior. These are not practical assumptions: trajectories are often noisy, and there can be a paucity of examples.
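
To make the role of demonstration trajectories concrete, the following is a minimal sketch of one quantity that many feature-matching IRL methods estimate from expert data: the empirical discounted feature expectations. It is not taken from this report and is not claimed to be the authors' method. The names feature_expectations and one_hot, the integer state encoding, and the discount value are all assumptions made for illustration. With only a few demonstrations, a single noisy trajectory visibly shifts the estimate, which is the practical difficulty the abstract points to.

```python
from typing import Callable, List, Sequence

State = int                      # assumption: states are small integers
Trajectory = Sequence[State]     # assumption: a trajectory is a list of visited states


def feature_expectations(
    trajectories: Sequence[Trajectory],
    phi: Callable[[State], List[float]],
    gamma: float = 0.9,
) -> List[float]:
    """Average discounted feature counts over non-empty demonstration trajectories."""
    if not trajectories:
        raise ValueError("need at least one trajectory")
    dim = len(phi(trajectories[0][0]))
    mu = [0.0] * dim
    for traj in trajectories:
        for t, state in enumerate(traj):
            features = phi(state)
            for k in range(dim):
                # Each visit contributes its features, discounted by gamma**t.
                mu[k] += (gamma ** t) * features[k]
    # Divide by the number of demonstrations to get the empirical mean.
    return [m / len(trajectories) for m in mu]


def one_hot(n_states: int) -> Callable[[State], List[float]]:
    """Hypothetical feature map: one indicator feature per state."""
    def phi(state: State) -> List[float]:
        return [1.0 if i == state else 0.0 for i in range(n_states)]
    return phi


# Three short demonstrations; the third deviates from the other two,
# standing in for a noisy expert trajectory.
demos = [[0, 1, 2], [0, 1, 2], [0, 2, 2]]
mu = feature_expectations(demos, one_hot(3), gamma=0.5)
print(mu)
```

Dropping the third demonstration changes the estimate for states 1 and 2, which illustrates how sensitive such statistics are when examples are scarce.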