Curator

Evaluating chain-of-thought monitorability

From OpenAI News · 2025-12-18 · Featured

OpenAI introduces a new framework and evaluation suite for chain-of-thought monitorability, covering 13 evaluations across 24 environments. Our findings show that monitoring a model’s internal reasoning is far more effective than monitoring outputs alone, offering a promising path toward scalable ...

Read the full article on OpenAI News →