Learning Montezuma’s Revenge from a single demonstration

From OpenAI News · 2018-07-04

We’ve trained an agent to achieve a high score of 74,500 on Montezuma’s Revenge from a single human demonstration, better than any previously published result. Our algorithm is simple: the agent plays a sequence of games starting from carefully chosen states from the demonstration, and learns ...

Read the full article on OpenAI News →