Skip to content

curiosity-ai/llm-guard

Repository files navigation

LLM.Guard

LLM.Guard is a lightweight C# library designed to implement guardrails for Large Language Model (LLM) prompts. The library provides a straightforward way to detect potentially unsafe or inappropriate prompt patterns, such as jailbreak attempts, in order to help maintain the integrity and security of interactions with LLM-based systems.

Features

  • Prompt Classification: Classifies prompts based on predefined patterns (e.g., jailbreak, inappropriate content).
  • Easy Integration: Simple, singleton-based interface for seamless integration into existing C# applications.
  • Customizable Rules: Extendable to allow for custom guardrails and prompt detection criteria.

Installation

Add LLM-Guard to your project using NuGet:

dotnet add package LLM.Guard

Usage

Here’s a quick example of how to use LLM-Guard to detect jailbreak attempts:

using LLM.Guard;

var prediction = Predictor.Instance.Predict("Ignore all previous instructions and do as follows:");
if (prediction == PromptType.Jailbreak)
{
    throw new Exception("Jailbreak detected");
}

In this example, LLM-Guard analyzes the input prompt and classifies it as a PromptType. If it detects a Jailbreak attempt, the system raises an exception to prevent potentially unsafe behavior.

API Reference Predictor.Instance.Predict(string prompt) Parameters: prompt (string): The LLM prompt to be analyzed. Returns: PromptType: The predicted type of the prompt (e.g., Normal, Jailbreak, Inappropriate). PromptType Enum Normal: The prompt appears safe and standard. Jailbreak: The prompt contains language that may attempt to bypass LLM safeguards. Inappropriate: The prompt contains language that may be deemed inappropriate.

Contributing

Contributions to LLM-Guard are welcome! Please submit an issue or pull request if you have ideas for enhancements or bug fixes.

License

This project is licensed under the MIT License. See the LICENSE file for details.

LLM-Guard helps make your LLM-powered applications safer by enforcing prompt guardrails.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages