Foundations of Sequential Learning
Technical Report, 01 Apr 2016, 31 Aug 2017
Duke University, Durham, United States
This report summarizes the research conducted under FA8750-16-2-0173. The research advanced the understanding of bandit algorithms and of exploration in Markov decision processes (MDPs). New algorithms and theory were proposed for bandits with periodic payoff multipliers and arms with costs. Exploration and transfer learning algorithms were evaluated for MDPs.
- Statistics and Probability