From 8021203723f81c230d992639347e7ceb3c216f4f Mon Sep 17 00:00:00 2001
From: Setotet
Date: Fri, 15 Nov 2024 17:32:13 -0800
Subject: [PATCH] Fix typos (#473)

---
 2_0_vulns/LLM07_SystemPromptLeakage.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/2_0_vulns/LLM07_SystemPromptLeakage.md b/2_0_vulns/LLM07_SystemPromptLeakage.md
index 19aa9f03..16fe235d 100644
--- a/2_0_vulns/LLM07_SystemPromptLeakage.md
+++ b/2_0_vulns/LLM07_SystemPromptLeakage.md
@@ -2,9 +2,9 @@

 ### Description

-The system prompt leakage vulnerability in LLMs refers to the risk that the system prompts or instructions used to steer the behaviour of the model can also contain sensitive information that was not intended to be discovered. System prompts are designed to guide the model's output based on the requirements of the application, but may inadvertantly contain secrets. When discovered, this information can be used to facilitate other attacks.
+The system prompt leakage vulnerability in LLMs refers to the risk that the system prompts or instructions used to steer the behavior of the model can also contain sensitive information that was not intended to be discovered. System prompts are designed to guide the model's output based on the requirements of the application, but may inadvertently contain secrets. When discovered, this information can be used to facilitate other attacks.

-It's important to understand that the system prompt should not be considered a secret, nor should it be used as a security control. Accordingly, sensitive data such as credentials, connection strings, etc. should not be contained within the system prompt langauge.
+It's important to understand that the system prompt should not be considered a secret, nor should it be used as a security control. Accordingly, sensitive data such as credentials, connection strings, etc. should not be contained within the system prompt language.

 Similarly, if a system prompt contains information describing different roles and permissions, or sensitive data like connection strings or passwords, while the disclosure of such information may be helpful, the fundamental security risk is not that these have been disclosed, it is that the application allows bypassing strong session management and authorization checks by delegating these to the LLM, and that sensitive data is being stored in a place that it should not be.

@@ -34,8 +34,8 @@ In short: disclosure of the system prompt itself does not present the real risk
 Since LLMs are susceptible to other attacks like prompt injections which can alter the system prompt, it is recommended to avoid using system prompts to control the model behavior where possible. Instead, rely on systems outside of the LLM to ensure this behavior. For example, detecting and preventing harmful content should be done in external systems.
 #### 3. Implement Guardrails
 Implement a system of guardrails outside of the LLM itself. While training particular behavior into a model can be effective, such as training it not to reveal its system prompt, it is not a guarantee that the model will always adhere to this. An independent system that can inspect the output to determine if the model is in compliance with expectations is preferable to system prompt instructions.
-#### 4. Ensure that security controls are enforced independantly from the LLM
- Critical controls such as privilage separation, authorization bounds checks, and similar must not be delegated to the LLM, either through the system prompt or otherwise. These controls need to occur in a deterministic, auditable manner, and LLMs are not (currently) conducive to this. In cases where an agent is performing tasks, if those tasks require different levels of access, then multiple agents should be used, each configured with the least privileges needed to perform the desired tasks.
+#### 4. Ensure that security controls are enforced independently from the LLM
+ Critical controls such as privilege separation, authorization bounds checks, and similar must not be delegated to the LLM, either through the system prompt or otherwise. These controls need to occur in a deterministic, auditable manner, and LLMs are not (currently) conducive to this. In cases where an agent is performing tasks, if those tasks require different levels of access, then multiple agents should be used, each configured with the least privileges needed to perform the desired tasks.

 ### Example Attack Scenarios