The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

From OpenAI News · 2024-04-20 · Featured

LLM Training · Model Alignment · Prompt Engineering · AI Safety · Jailbreak Defense

Today's LLMs are susceptible to prompt injections, jailbreaks, and other attacks that allow adversaries to overwrite a model's original instructions with their own malicious prompts.

Read the full article on OpenAI News →