
[Feature]: ✨ Feature Request: Integrate Logits Processors from logits-processor-zoo into vLLM #11461

Closed
dongxiaolong opened this issue Dec 24, 2024 · 2 comments

Comments

@dongxiaolong

🚀 The feature, motivation and pitch

Feature Request: Integrate Logits Processors from logits-processor-zoo into vLLM

Description

I am writing to request the integration of several advanced logits processors from the logits-processor-zoo into the vLLM framework. These processors provide enhanced control over the text generation process, enabling more sophisticated and context-aware outputs.

Proposed Logits Processors for Integration

  1. GenLengthLogitsProcessor

    • Purpose: Adjusts the likelihood of the end-of-sequence (EOS) token as the generated sequence grows, steering the model toward shorter or longer responses.
    • Use Cases: Controlling response length in chatbots, ensuring concise or detailed answers as needed.
  2. MultipleChoiceLogitsProcessor

    • Purpose: Guides the model to select from predefined multiple-choice options by boosting the logits of the specified choices.
    • Use Cases: Implementing multiple-choice questions, surveys, or structured decision-making processes within generated text.
  3. CiteFromPromptLogitsProcessor

    • Purpose: Boosts or diminishes the likelihood of tokens present in the prompt (and optionally the EOS token) to encourage the generation of similar tokens or to avoid repetition.
    • Use Cases: Enhancing relevance to the input prompt, avoiding redundancy, and maintaining context consistency.
  4. ForceLastPhraseLogitsProcessor

    • Purpose: Forces the model to generate a specified phrase before finalizing its response, useful for adding references, disclaimers, or thank-you notes.
    • Use Cases: Ensuring consistent closing statements, adding required legal disclaimers, or enhancing user interaction with personalized messages.
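As context for reviewers, the length-based EOS adjustment described in item 1 can be sketched as a plain callable in the `(token_ids, logits)` style that vLLM's `SamplingParams.logits_processors` accepts. This is a minimal illustration, not the zoo's implementation: the parameter names (`boost_factor`, `p`, `max_len`) are assumed here, and in vLLM the `logits` argument is a `torch.Tensor` rather than the list used below for self-containment.

```python
from typing import Callable, List


def make_gen_length_processor(
    eos_token_id: int,
    boost_factor: float,
    p: int = 2,
    max_len: int = 256,
) -> Callable[[List[int], List[float]], List[float]]:
    """Build a logits processor that boosts the EOS logit as generation grows.

    Hypothetical sketch: the boost scales polynomially (exponent p) with the
    fraction of max_len already generated, so longer outputs become
    increasingly likely to terminate.
    """

    def processor(token_ids: List[int], logits: List[float]) -> List[float]:
        # token_ids are the tokens generated so far for this sequence.
        boost = boost_factor * (len(token_ids) / max_len) ** p
        new_logits = list(logits)
        new_logits[eos_token_id] += boost
        return new_logits

    return processor
```

A negative `boost_factor` would instead suppress EOS, encouraging longer answers.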

Benefits of Integration

  • Enhanced Control: Provides developers with finer control over the text generation process, enabling tailored outputs for specific applications.
  • Improved User Experience: Facilitates the creation of more coherent, relevant, and contextually appropriate responses in various use cases.
  • Flexibility and Customization: Allows for the implementation of complex generation strategies without extensive modifications to the core model.

Implementation Suggestions

  • Modular Integration: Incorporate the logits processors as optional modules or plugins within vLLM, allowing users to enable or disable them based on their requirements.
  • Configuration Parameters: Provide configurable parameters for each logits processor (e.g., boost factors, choice lists, target phrases) to allow customization.
  • Documentation and Examples: Update the vLLM documentation to include usage examples and guidelines for integrating and configuring the new logits processors.
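One possible shape for the configuration-parameters suggestion, sketched as plain Python. The config keys and the `enabled_processors` helper are hypothetical illustrations of the plugin idea, not an existing vLLM or logits-processor-zoo API:

```python
# Hypothetical per-processor configuration: each processor is an optional
# plugin with its own parameters and an enable/disable switch.
config = {
    "gen_length": {"enabled": True, "boost_factor": 0.5, "p": 2},
    "multiple_choice": {"enabled": True, "choices": ["A", "B", "C", "D"], "delimiter": "."},
    "cite_from_prompt": {"enabled": False, "boost_factor": 1.0},
    "force_last_phrase": {"enabled": False, "phrase": "Thanks for asking!"},
}


def enabled_processors(config: dict) -> dict:
    """Return only the processors the user switched on, minus the flag itself."""
    return {
        name: {k: v for k, v in params.items() if k != "enabled"}
        for name, params in config.items()
        if params.get("enabled")
    }
```

The remaining parameter dicts could then be passed to the corresponding processor constructors when building `SamplingParams`.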

Example Use Case

from vllm import LLM, SamplingParams
# The zoo ships vLLM-compatible processors in its vllm submodule
from logits_processor_zoo.vllm import GenLengthLogitsProcessor, MultipleChoiceLogitsProcessor

# Initialize the vLLM engine and reuse its tokenizer
model = LLM(model="gpt2")
tokenizer = model.get_tokenizer()

# Initialize logits processors (parameter values are illustrative)
gen_length_processor = GenLengthLogitsProcessor(tokenizer, boost_factor=0.5, p=2)
multiple_choice_processor = MultipleChoiceLogitsProcessor(
    tokenizer, choices=["A", "B", "C", "D"], delimiter=".", boost_first_words=1.0
)

# Attach the processors via SamplingParams and generate
sampling_params = SamplingParams(
    max_tokens=50,
    logits_processors=[gen_length_processor, multiple_choice_processor],
)
outputs = model.generate(
    "What is the capital of France?\nA. Berlin\nB. Madrid\nC. Paris\nD. Rome",
    sampling_params,
)

print(outputs[0].outputs[0].text)


### Alternatives

_No response_

### Additional context

Additional Information
Repository Links:

[logits-processor-zoo](https://github.com/NVIDIA/logits-processor-zoo)
[vLLM Integration Directory](https://github.com/NVIDIA/logits-processor-zoo/tree/main/logits_processor_zoo/vllm)
### Before submitting a new issue...

- [X] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.
@DarkLight1337
Member

With #11150, you should be able to import those logits processors as needed.

@dongxiaolong
Author

> With #11150, you should be able to import those logits processors as needed.

Thanks @DarkLight1337
