From ac4289fb40df7e43799f0242a8bb7d4710634b39 Mon Sep 17 00:00:00 2001
From: "DistributedApps.AI"
Date: Sun, 8 Dec 2024 22:32:41 -0500
Subject: [PATCH] Update LLM03_SupplyChain.md

Signed-off-by: DistributedApps.AI
---
 .../translations/zh-CN/LLM03_SupplyChain.md | 204 +++++++++---------
 1 file changed, 106 insertions(+), 98 deletions(-)

diff --git a/2_0_vulns/translations/zh-CN/LLM03_SupplyChain.md b/2_0_vulns/translations/zh-CN/LLM03_SupplyChain.md
index 3b9e739c..fd4646c0 100644
--- a/2_0_vulns/translations/zh-CN/LLM03_SupplyChain.md
+++ b/2_0_vulns/translations/zh-CN/LLM03_SupplyChain.md
@@ -1,98 +1,106 @@
-## LLM03:2025 Supply Chain
-
-### Description
-
-LLM supply chains are susceptible to various vulnerabilities, which can affect the integrity of training data, models, and deployment platforms. These risks can result in biased outputs, security breaches, or system failures. While traditional software vulnerabilities focus on issues like code flaws and dependencies, in ML the risks also extend to third-party pre-trained models and data.
-
-These external elements can be manipulated through tampering or poisoning attacks.
-
-Creating LLMs is a specialized task that often depends on third-party models. The rise of open-access LLMs and new fine-tuning methods like "LoRA" (Low-Rank Adaptation) and "PEFT" (Parameter-Efficient Fine-Tuning), especially on platforms like Hugging Face, introduce new supply-chain risks. Finally, the emergence of on-device LLMs increase the attack surface and supply-chain risks for LLM applications.
-
-Some of the risks discussed here are also discussed in "LLM04 Data and Model Poisoning." This entry focuses on the supply-chain aspect of the risks.
-A simple threat model can be found [here](https://github.com/jsotiro/ThreatModels/blob/main/LLM%20Threats-LLM%20Supply%20Chain.png).
-
-### Common Examples of Risks
-
-#### 1. Traditional Third-party Package Vulnerabilities
-  Such as outdated or deprecated components, which attackers can exploit to compromise LLM applications. This is similar to "A06:2021 – Vulnerable and Outdated Components" with increased risks when components are used during model development or finetuning.
-  (Ref. link: [A06:2021 – Vulnerable and Outdated Components](https://owasp.org/Top10/A06_2021-Vulnerable_and_Outdated_Components/))
-#### 2. Licensing Risks
-  AI development often involves diverse software and dataset licenses, creating risks if not properly managed. Different open-source and proprietary licenses impose varying legal requirements. Dataset licenses may restrict usage, distribution, or commercialization.
-#### 3. Outdated or Deprecated Models
-  Using outdated or deprecated models that are no longer maintained leads to security issues.
-#### 4. Vulnerable Pre-Trained Model
-  Models are binary black boxes and unlike open source, static inspection can offer little to security assurances. Vulnerable pre-trained models can contain hidden biases, backdoors, or other malicious features that have not been identified through the safety evaluations of model repository. Vulnerable models can be created by both poisoned datasets and direct model tampering using tehcniques such as ROME also known as lobotomisation.
-#### 5. Weak Model Provenance
-  Currently there are no strong provenance assurances in published models. Model Cards and associated documentation provide model information and relied upon users, but they offer no guarantees on the origin of the model. An attacker can compromise supplier account on a model repo or create a similar one and combine it with social engineering techniques to compromise the supply-chain of an LLM application.
-#### 6. Vulnerable LoRA adapters
-  LoRA is a popular fine-tuning technique that enhances modularity by allowing pre-trained layers to be bolted onto an existing LLM. The method increases efficiency but create new risks, where a malicious LorA adapter compromises the integrity and security of the pre-trained base model. This can happen both in collaborative model merge environments but also exploiting the support for LoRA from popular inference deployment platforms such as vLMM and OpenLLM where adapters can be downloaded and applied to a deployed model.
-#### 7. Exploit Collaborative Development Processes
-  Collaborative model merge and model handling services (e.g. conversions) hosted in shared environments can be exploited to introduce vulnerabilities in shared models. Model merging is is very popular on Hugging Face with model-merged models topping the OpenLLM leaderboard and can be exploited to bypass reviews. Similarly, services such as conversation bot have been proved to be vulnerable to maniputalion and introduce malicious code in models.
-#### 8. LLM Model on Device supply-chain vulnerabilities
-  LLM models on device increase the supply attack surface with compromised manufactured processes and exploitation of device OS or fimware vulnerabilities to compromise models. Attackers can reverse engineer and re-package applications with tampered models.
-#### 9. Unclear T&Cs and Data Privacy Policies
-  Unclear T&Cs and data privacy policies of the model operators lead to the application's sensitive data being used for model training and subsequent sensitive information exposure. This may also apply to risks from using copyrighted material by the model supplier.
-
-### Prevention and Mitigation Strategies
-
-1. Carefully vet data sources and suppliers, including T&Cs and their privacy policies, only using trusted suppliers. Regularly review and audit supplier Security and Access, ensuring no changes in their security posture or T&Cs.
-2. Understand and apply the mitigations found in the OWASP Top Ten's "A06:2021 – Vulnerable and Outdated Components." This includes vulnerability scanning, management, and patching components. For development environments with access to sensitive data, apply these controls in those environments, too.
-  (Ref. link: [A06:2021 – Vulnerable and Outdated Components](https://owasp.org/Top10/A06_2021-Vulnerable_and_Outdated_Components/))
-3. Apply comprehensive AI Red Teaming and Evaluations when selecting a third party model. Decoding Trust is an example of a Trustworthy AI benchmark for LLMs but models can finetuned to by pass published benchmarks. Use extensive AI Red Teaming to evaluate the model, especially in the use cases you are planning to use the model for.
-4. Maintain an up-to-date inventory of components using a Software Bill of Materials (SBOM) to ensure you have an up-to-date, accurate, and signed inventory, preventing tampering with deployed packages. SBOMs can be used to detect and alert for new, zero-date vulnerabilities quickly. AI BOMs and ML SBOMs are an emerging area and you should evaluate options starting with OWASP CycloneDX
-5. To mitigate AI licensing risks, create an inventory of all types of licenses involved using BOMs and conduct regular audits of all software, tools, and datasets, ensuring compliance and transparency through BOMs. Use automated license management tools for real-time monitoring and train teams on licensing models. Maintain detailed licensing documentation in BOMs.
-6. Only use models from verifiable sources and use third-party model integrity checks with signing and file hashes to compensate for the lack of strong model provenance. Similarly, use code signing for externally supplied code.
-7. Implement strict monitoring and auditing practices for collaborative model development environments to prevent and quickly detect any abuse. "HuggingFace SF_Convertbot Scanner" is an example of automated scripts to use.
-  (Ref. link: [HuggingFace SF_Convertbot Scanner](https://gist.github.com/rossja/d84a93e5c6b8dd2d4a538aa010b29163))
-8. Anomaly detection and adversarial robustness tests on supplied models and data can help detect tampering and poisoning as discussed in "LLM04 Data and Model Poisoning; ideally, this should be part of MLOps and LLM pipelines; however, these are emerging techniques and may be easier to implement as part of red teaming exercises.
-9. Implement a patching policy to mitigate vulnerable or outdated components. Ensure the application relies on a maintained version of APIs and underlying model.
-10. Encrypt models deployed at AI edge with integrity checks and use vendor attestation APIs to prevent tampered apps and models and terminate applications of unrecognized firmware.
-
-### Sample Attack Scenarios
-
-#### Scenario #1: Vulnerable Python Library
-  An attacker exploits a vulnerable Python library to compromise an LLM app. This happened in the first Open AI data breach. Attacks on the PyPi package registry tricked model developers into downloading a compromised PyTorch dependency with malware in a model development environment. A more sophisticated example of this type of attack is Shadow Ray attack on the Ray AI framework used by many vendors to manage AI infrastructure. In this attack, five vulnerabilities are believed to have been exploited in the wild affecting many servers.
-#### Scenario #2: Direct Tampering
-  Direct Tampering and publishing a model to spread misinformation. This is an actual attack with PoisonGPT bypassing Hugging Face safety features by directly changing model parameters.
-#### Scenario #3: Finetuning Popular Model
-  An attacker finetunes a popular open access model to remove key safety features and perform high in a specific domain (insurance). The model is finetuned to score highly on safety benchmarks but has very targeted triggers. They deploy it on Hugging Face for victims to use it exploiting their trust on benchmark assurances.
-#### Scenario #4: Pre-Trained Models
-  An LLM system deploys pre-trained models from a widely used repository without thorough verification. A compromised model introduces malicious code, causing biased outputs in certain contexts and leading to harmful or manipulated outcomes
-#### Scenario #5: Compromised Third-Party Supplier
-  A compromised third-party supplier provides a vulnerable LorA adapter that is being merged to an LLM using model merge on Hugging Face.
-#### Scenario #6: Supplier Infiltration
-  An attacker infiltrates a third-party supplier and compromises the production of a LoRA (Low-Rank Adaptation) adapter intended for integration with an on-device LLM deployed using frameworks like vLLM or OpenLLM. The compromised LoRA adapter is subtly altered to include hidden vulnerabilities and malicious code. Once this adapter is merged with the LLM, it provides the attacker with a covert entry point into the system. The malicious code can activate during model operations, allowing the attacker to manipulate the LLM’s outputs.
-#### Scenario #7: CloudBorne and CloudJacking Attacks
-  These attacks target cloud infrastructures, leveraging shared resources and vulnerabilities in the virtualization layers. CloudBorne involves exploiting firmware vulnerabilities in shared cloud environments, compromising the physical servers hosting virtual instances. CloudJacking refers to malicious control or misuse of cloud instances, potentially leading to unauthorized access to critical LLM deployment platforms. Both attacks represent significant risks for supply chains reliant on cloud-based ML models, as compromised environments could expose sensitive data or facilitate further attacks.
-#### Scenario #8: LeftOvers (CVE-2023-4969)
-  LeftOvers exploitation of leaked GPU local memory to recover sensitive data. An attacker can use this attack to exfiltrate sensitive data in production servers and development workstations or laptops.
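-#### Scenario #9: WizardLM
-  Following the removal of WizardLM, an attacker exploits the interest in this model and publish a fake version of the model with the same name but containing malware and backdoors.
-#### Scenario #10: Model Merge/Format Conversion Service
-  An attacker stages an attack with a model merge or format conversation service to compromise a publicly available access model to inject malware. This is an actual attack published by vendor HiddenLayer.
-#### Scenario #11: Reverse-Engineer Mobile App
-  An attacker reverse-engineers an mobile app to replace the model with a tampered version that leads the user to scam sites. Users are encouraged to dowload the app directly via social engineering techniques. This is a "real attack on predictive AI" that affected 116 Google Play apps including popular security and safety-critical applications used for as cash recognition, parental control, face authentication, and financial service.
-  (Ref. link: [real attack on predictive AI](https://arxiv.org/abs/2006.08131))
-#### Scenario #12: Dataset Poisoning
-  An attacker poisons publicly available datasets to help create a back door when fine-tuning models. The back door subtly favors certain companies in different markets.
-#### Scenario #13: T&Cs and Privacy Policy
-  An LLM operator changes its T&Cs and Privacy Policy to require an explicit opt out from using application data for model training, leading to the memorization of sensitive data.
-
-### Reference Links
-
-1. [PoisonGPT: How we hid a lobotomized LLM on Hugging Face to spread fake news](https://blog.mithrilsecurity.io/poisongpt-how-we-hid-a-lobotomized-llm-on-hugging-face-to-spread-fake-news)
-2. [Large Language Models On-Device with MediaPipe and TensorFlow Lite](https://developers.googleblog.com/en/large-language-models-on-device-with-mediapipe-and-tensorflow-lite/)
-3. [Hijacking Safetensors Conversion on Hugging Face](https://hiddenlayer.com/research/silent-sabotage/)
-4. [ML Supply Chain Compromise](https://atlas.mitre.org/techniques/AML.T0010)
-5. [Using LoRA Adapters with vLLM](https://docs.vllm.ai/en/latest/models/lora.html)
-6. [Removing RLHF Protections in GPT-4 via Fine-Tuning](https://arxiv.org/pdf/2311.05553)
-7. [Model Merging with PEFT](https://huggingface.co/blog/peft_merging)
-8. [HuggingFace SF_Convertbot Scanner](https://gist.github.com/rossja/d84a93e5c6b8dd2d4a538aa010b29163)
-9. [Thousands of servers hacked due to insecurely deployed Ray AI framework](https://www.csoonline.com/article/2075540/thousands-of-servers-hacked-due-to-insecurely-deployed-ray-ai-framework.html)
-10. [LeftoverLocals: Listening to LLM responses through leaked GPU local memory](https://blog.trailofbits.com/2024/01/16/leftoverlocals-listening-to-llm-responses-through-leaked-gpu-local-memory/)
-
-### Related Frameworks and Taxonomies
-
-Refer to this section for comprehensive information, scenarios strategies relating to infrastructure deployment, applied environment controls and other best practices.
-
-- [ML Supply Chain Compromise](https://atlas.mitre.org/techniques/AML.T0010) - **MITRE ATLAS**
+### LLM03:2025 Supply Chain
+
+#### Description
+
+LLM supply chains are susceptible to a variety of vulnerabilities that can threaten the integrity of training data, models, and deployment platforms. These risks can lead to biased outputs, security breaches, or system failures. Traditional software vulnerabilities center on code flaws and dependencies, whereas in machine learning the risks also extend to third-party pre-trained models and data. These external elements can be exploited through tampering or poisoning attacks.
+
+Developing LLMs is a specialized task that often depends on third-party models. The rise of open-access LLMs and of new fine-tuning methods such as "LoRA" (Low-Rank Adaptation) and "PEFT" (Parameter-Efficient Fine-Tuning), especially their wide use on platforms like Hugging Face, introduces new supply-chain risks. In addition, the emergence of on-device LLMs increases the attack surface and supply-chain risk.
+
+This entry focuses on the supply-chain aspect of these risks, some of which overlap with "LLM04 Data and Model Poisoning." A simple threat model can be found [here](https://github.com/jsotiro/ThreatModels/blob/main/LLM%20Threats-LLM%20Supply%20Chain.png).
+
+#### Common Examples of Risks
+
+##### 1. Traditional Third-Party Component Vulnerabilities
+  Using outdated or deprecated components, which attackers can exploit to compromise LLM applications. This is similar to "OWASP A06:2021 – Vulnerable and Outdated Components," with increased risk when the components are used during model development or fine-tuning.
+  (Ref. link: [A06:2021 – Vulnerable and Outdated Components](https://owasp.org/Top10/A06_2021-Vulnerable_and_Outdated_Components/))
+
+##### 2. Licensing Risks
+  AI development often involves many software and dataset licenses. Mismanaging them can create legal and usage risks, including restrictions on use, distribution, and commercialization.
+
+##### 3. Outdated or Deprecated Models
+  Using outdated or deprecated models that are no longer maintained creates security risks.
+
+##### 4. Vulnerable Pre-Trained Models
+  Pre-trained models can contain hidden biases, backdoors, or other unidentified malicious features. Vulnerable models produced through dataset poisoning or direct model tampering (for example, with the ROME technique) are a particular risk.
+
+##### 5. Weak Model Provenance
+  Published models currently lack strong provenance assurances. Documentation such as Model Cards provides information about a model but cannot guarantee its origin, which supply-chain attackers can exploit for social engineering and model tampering.
+
+##### 6. Vulnerable LoRA Adapters
+  LoRA fine-tuning improves modularity and efficiency, but it also adds security risk, for example when a malicious adapter compromises the integrity of the model.
+
+##### 7. Exploiting Collaborative Development Processes
+  Collaborative model development processes and services, such as model merging and conversion services, can be exploited to inject vulnerabilities.
+
+##### 8. On-Device LLM Supply-Chain Vulnerabilities
+  LLMs deployed on devices face supply-chain risks such as compromised manufacturing processes and exploitation of device firmware vulnerabilities.
+
+##### 9. Unclear T&Cs and Data Privacy Policies
+  Unclear T&Cs and data privacy policies can lead to sensitive data being used for model training, increasing the risk of data exposure.
+
+#### Prevention and Mitigation Strategies
+
+1. Vet data sources and suppliers, including their T&Cs and privacy policies, and use only trusted suppliers. Regularly review and audit suppliers' security measures and any changes to them.
+2. Follow "A06:2021 – Vulnerable and Outdated Components" in the OWASP Top Ten for vulnerability scanning and management, and apply the same controls to development environments that have access to sensitive data.
+  (Ref. link: [A06:2021 – Vulnerable and Outdated Components](https://owasp.org/Top10/A06_2021-Vulnerable_and_Outdated_Components/))
+
+3. Select third-party models through AI red teaming and evaluation. Use Trustworthy AI benchmarks such as Decoding Trust, but be aware that a model can be fine-tuned to bypass them.
+
+4. Use a Software Bill of Materials (SBOM) to maintain an inventory of components and prevent tampering. Explore AI BOM and ML SBOM options, for example OWASP CycloneDX.
+
+5. To address AI licensing risks, create an inventory of licenses and audit it regularly to ensure compliance with the terms of use, using automated license management tools where needed.
+
+6. Use models from verifiable sources, combined with third-party integrity checks such as signing and file hashes, to compensate for weak provenance.
+
+7. Implement strict monitoring and auditing in collaborative development environments to prevent abuse, for example with automated tools such as HuggingFace SF_Convertbot Scanner.
+  (Ref. link: [HuggingFace SF_Convertbot Scanner](https://gist.github.com/rossja/d84a93e5c6b8dd2d4a538aa010b29163))
+
+8. Run anomaly detection and adversarial robustness tests on supplied models and data. These methods can also be implemented within MLOps and LLM pipelines.
+
+9. Implement a patch management policy and ensure that APIs and the underlying model use maintained versions.
+
+10. Encrypt models deployed on edge AI devices and use vendor attestation APIs to prevent tampered applications and models.
+
+#### Sample Attack Scenarios
+
+##### Scenario 1: Vulnerable Python Library
+  An attacker exploits a vulnerable Python library to compromise an LLM application. This happened in the first Open AI data breach.
+
+##### Scenario 2: Direct Tampering
+  A model is directly tampered with and published to spread misinformation, for example PoisonGPT bypassing Hugging Face safety mechanisms.
+
+##### Scenario 3: Fine-Tuning a Popular Model
+  An attacker fine-tunes an open model to bypass benchmarks and perform well in a specific domain (such as insurance), while hiding triggers that carry out malicious behavior.
+
+##### Scenario 4: Pre-Trained Models
+  Pre-trained models are used without thorough verification, and malicious code introduces biased outputs.
+
+##### Scenario 5: Compromised Supplier
+  A LoRA adapter provided by a third-party supplier is tampered with by an attacker and merged into an LLM.
+
+##### Scenario 6: Supply-Chain Infiltration
+  An attacker infiltrates a supplier and compromises a LoRA adapter to hide vulnerabilities and control system outputs.
+
+##### Scenario 7: Cloud Attacks
+  An attacker exploits shared resources and virtualization vulnerabilities to carry out cloud hijacking (CloudJacking), leading to unauthorized access to critical deployment platforms.
+
+#### Reference Links
+
+1. [PoisonGPT: How we hid a lobotomized LLM on Hugging Face to spread fake news](https://blog.mithrilsecurity.io/poisongpt-how-we-hid-a-lobotomized-llm-on-hugging-face-to-spread-fake-news)
+2. [Large Language Models On-Device with MediaPipe and TensorFlow Lite](https://developers.googleblog.com/en/large-language-models-on-device-with-mediapipe-and-tensorflow-lite/)
+3. [Hijacking Safetensors Conversion on Hugging Face](https://hiddenlayer.com/research/silent-sabotage/)
+4. [ML Supply Chain Compromise](https://atlas.mitre.org/techniques/AML.T0010) - **MITRE ATLAS**
+5. [Using LoRA Adapters with vLLM](https://docs.vllm.ai/en/latest/models/lora.html)
+6. [Removing RLHF Protections in GPT-4 via Fine-Tuning](https://arxiv.org/pdf/2311.05553)
+7. [Model Merging with PEFT](https://huggingface.co/blog/peft_merging)
+8. [HuggingFace SF_Convertbot Scanner](https://gist.github.com/rossja/d84a93e5c6b8dd2d4a538aa010b29163)
+9. [Thousands of servers hacked due to insecurely deployed Ray AI framework](https://www.csoonline.com/article/2075540/thousands-of-servers-hacked-due-to-insecurely-deployed-ray-ai-framework.html)
+10. [LeftoverLocals: Listening to LLM responses through leaked GPU local memory](https://blog.trailofbits.com/2024/01/16/leftoverlocals-listening-to-llm-responses-through-leaked-gpu-local-memory/)
+
+---
+
+### Related Frameworks and Taxonomies
+
+- **[AML.T0010 - ML Supply Chain Compromise](https://atlas.mitre.org/techniques/AML.T0010)** - **MITRE ATLAS**
+
+This entry details the risks, attack examples, and prevention strategies related to supply-chain security, providing baseline guidance for securely developing and deploying LLM applications. Users should apply risk mitigations appropriate to their specific use cases to strengthen the security of the entire supply chain.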