Curator

Improving Model Safety Behavior with Rule-Based Rewards

From OpenAI News · 2024-07-24 · Featured

Model Alignment · AI Safety · RLHF

We've developed and applied a new method leveraging Rule-Based Rewards (RBRs) that aligns models to behave safely without extensive human data collection.
