Root Cause
LLMs memorize statistical patterns from their training data, including PII, code secrets, and proprietary text, yet that data is often assumed to be unrecoverable once training is complete. Separately, operators place sensitive instructions and credentials in system prompts, assuming those prompts are inaccessible to users. Both assumptions are incorrect. The model's stochastic completion mechanism can reproduce memorized content verbatim, and the system prompt sits in the model's context window, where the model can be induced to repeat it like any other context.
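Because anything in the system prompt should be treated as extractable, one practical check is to scan prompts for credential-like strings before deployment. The following is a minimal sketch, not a complete scanner: the function name `find_secrets` and the three regular expressions are illustrative assumptions, and a real deployment would likely rely on a dedicated secret-scanning tool with a much broader rule set.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not an exhaustive rule set).
SECRET_PATTERNS = {
    # AWS access key IDs conventionally start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # "Bearer <token>" style authorization values.
    "bearer_token": re.compile(r"\bbearer\s+[a-z0-9._\-]{20,}", re.IGNORECASE),
    # Generic "api_key=...", "password: ..." style assignments.
    "key_assignment": re.compile(
        r"\b(?:api[_-]?key|secret|password|token)\s*[:=]\s*\S{8,}",
        re.IGNORECASE,
    ),
}


def find_secrets(system_prompt: str) -> list[tuple[str, int, int]]:
    """Return (pattern_name, start, end) for each credential-like match.

    Offsets are returned instead of the matched text so that the
    findings can be logged without copying the secret itself.
    """
    findings = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(system_prompt):
            findings.append((name, match.start(), match.end()))
    return findings
```

A prompt that yields any findings should have the credential moved to server-side configuration that the model never sees, rather than being obfuscated in place.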
Disclosure Categories
| Category | Source | Attack Method | Severity |
|---|---|---|---|
| System Prompt Contents | Operator configuration | Direct extraction, completion traps, framing | High |
| Credentials in System Prompt | Operator misconfiguration | System prompt extraction | Critical |
| Training Data PII | Pre-training corpus | Verbatim memorization probing | High |
| RAG Source Data | Knowledge base | Metadata queries, chunk boundary probing | High |
| Cross-User RAG Leakage | Multi-tenant RAG | Authorization bypass in retrieval | Critical |
| Context Window History | Session memory | Direct context dump requests | High |
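The cross-user RAG row in the table above comes down to where authorization is enforced. A minimal sketch of one mitigation follows, assuming each stored chunk carries a tenant identifier: chunks are filtered by tenant before ranking, so another tenant's text never reaches the context window. The `Chunk` type and the `retrieve_for_tenant` function are hypothetical names for illustration, and in practice the filter would usually be pushed down into the vector store query itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """A retrieved passage (hypothetical shape for this sketch)."""
    tenant_id: str
    text: str
    score: float  # similarity score; higher means more relevant


def retrieve_for_tenant(
    candidates: list[Chunk], tenant_id: str, top_k: int = 3
) -> list[Chunk]:
    """Return the top_k highest-scoring chunks owned by tenant_id.

    The tenant filter runs before ranking, so a highly relevant chunk
    from another tenant cannot displace or join the authorized results.
    """
    allowed = [c for c in candidates if c.tenant_id == tenant_id]
    return sorted(allowed, key=lambda c: c.score, reverse=True)[:top_k]
```

The `tenant_id` passed in should come from the authenticated session rather than from anything in the user's message, since prompt text is attacker-controlled.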