Proximal Policy Optimization

From OpenAI News · 2017-07-20 · Featured

We’re releasing a new class of reinforcement learning algorithms, Proximal Policy Optimization (PPO), which perform comparably to or better than state-of-the-art approaches while being much simpler to implement and tune. PPO has become the default reinforcement learning algorithm at OpenAI because of...

Read the full article on OpenAI News →