Curator

Improving mathematical reasoning with process supervision

From OpenAI News · 2023-05-31 · Featured

We've trained a model to achieve a new state-of-the-art in mathematical problem solving by rewarding each correct step of reasoning (“process supervision”) instead of simply rewarding the correct final answer (“outcome supervision”). In addition to boosting performance relative to outcome su...

Read the full article on OpenAI News →