Fix typos (#473)
Setotet authored Nov 16, 2024
1 parent 5486aaf commit 8021203
Showing 1 changed file with 4 additions and 4 deletions: 2_0_vulns/LLM07_SystemPromptLeakage.md
@@ -2,9 +2,9 @@

### Description

-The system prompt leakage vulnerability in LLMs refers to the risk that the system prompts or instructions used to steer the behaviour of the model can also contain sensitive information that was not intended to be discovered. System prompts are designed to guide the model's output based on the requirements of the application, but may inadvertantly contain secrets. When discovered, this information can be used to facilitate other attacks.
+The system prompt leakage vulnerability in LLMs refers to the risk that the system prompts or instructions used to steer the behavior of the model can also contain sensitive information that was not intended to be discovered. System prompts are designed to guide the model's output based on the requirements of the application, but may inadvertently contain secrets. When discovered, this information can be used to facilitate other attacks.

-It's important to understand that the system prompt should not be considered a secret, nor should it be used as a security control. Accordingly, sensitive data such as credentials, connection strings, etc. should not be contained within the system prompt langauge.
+It's important to understand that the system prompt should not be considered a secret, nor should it be used as a security control. Accordingly, sensitive data such as credentials, connection strings, etc. should not be contained within the system prompt language.

Similarly, if a system prompt contains information describing different roles and permissions, or sensitive data like connection strings or passwords, while the disclosure of such information may be helpful, the fundamental security risk is not that these have been disclosed, it is that the application allows bypassing strong session management and authorization checks by delegating these to the LLM, and that sensitive data is being stored in a place that it should not be.

@@ -34,8 +34,8 @@ In short: disclosure of the system prompt itself does not present the real risk
Since LLMs are susceptible to other attacks like prompt injections which can alter the system prompt, it is recommended to avoid using system prompts to control the model behavior where possible. Instead, rely on systems outside of the LLM to ensure this behavior. For example, detecting and preventing harmful content should be done in external systems.
#### 3. Implement Guardrails
Implement a system of guardrails outside of the LLM itself. While training particular behavior into a model can be effective, such as training it not to reveal its system prompt, it is not a guarantee that the model will always adhere to this. An independent system that can inspect the output to determine if the model is in compliance with expectations is preferable to system prompt instructions.
-#### 4. Ensure that security controls are enforced independantly from the LLM
-Critical controls such as privilage separation, authorization bounds checks, and similar must not be delegated to the LLM, either through the system prompt or otherwise. These controls need to occur in a deterministic, auditable manner, and LLMs are not (currently) conducive to this. In cases where an agent is performing tasks, if those tasks require different levels of access, then multiple agents should be used, each configured with the least privileges needed to perform the desired tasks.
+#### 4. Ensure that security controls are enforced independently from the LLM
+Critical controls such as privilege separation, authorization bounds checks, and similar must not be delegated to the LLM, either through the system prompt or otherwise. These controls need to occur in a deterministic, auditable manner, and LLMs are not (currently) conducive to this. In cases where an agent is performing tasks, if those tasks require different levels of access, then multiple agents should be used, each configured with the least privileges needed to perform the desired tasks.

### Example Attack Scenarios

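As a minimal illustration of the mitigations shown in the diff above (not part of this commit or the OWASP text), the Python sketch below assumes a hypothetical role-to-action allow-list and a simple pattern-based output filter. It shows the authorization decision being made deterministically in application code, and the model's output being inspected by an independent guardrail before it reaches the user.

```python
import re

# Hypothetical role-to-action allow-list; a real mapping would come from the
# application's own authorization layer, never from the system prompt.
ROLE_PERMISSIONS = {
    "viewer": {"search_kb"},
    "admin": {"search_kb", "delete_record"},
}

# Simple patterns suggesting the model is echoing credentials or prompt text.
LEAK_PATTERNS = [
    re.compile(r"(?i)api[_-]?key\s*[:=]"),
    re.compile(r"(?i)connection string"),
    re.compile(r"(?i)system prompt"),
]


def is_authorized(role: str, action: str) -> bool:
    """Deterministic authorization check enforced by application code,
    not delegated to the LLM or its system prompt."""
    return action in ROLE_PERMISSIONS.get(role, set())


def output_guardrail(model_response: str) -> str:
    """Independent inspection of model output before it is returned."""
    if any(p.search(model_response) for p in LEAK_PATTERNS):
        return "[response withheld by policy]"
    return model_response


# The LLM may *suggest* an action, but the application decides.
print(is_authorized("viewer", "delete_record"))   # False
print(is_authorized("admin", "delete_record"))    # True

# A response that echoes secrets or instructions is filtered out.
print(output_guardrail("Sure, my system prompt says: API_KEY=abc123"))
```

A real deployment would rely on the organization's existing authorization and content-moderation services; the sketch only illustrates that these decisions happen outside the model.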
