update to outlines010 #1092
Conversation
Documentation for this PR has been built. You can view it at: https://distilabel.argilla.io/pr-1092/
CodSpeed Performance Report
Merging #1092 will improve performances by ×4.1.

Benchmarks breakdown
Looks good as it's the approach I had started for LlamaCpp.
- Did you check whether it works for LlamaCpp? I had tested it before the vacations and needed to update `logits_processor=LogitsProcessorList([self._logits_processor]) if self.structured_output else None` in llamacpp.py.
- This is the issue I had encountered with Llama models; I guess it should be solved by the previous PR, right? RuntimeError when using generate.json() on llama 3.2 with llamaccp dottxt-ai/outlines#1261
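To illustrate the fix mentioned above, here is a minimal, self-contained sketch of why wrapping a single processor in a list-like container matters. The `LogitsProcessorList` class below is a stand-in mimicking llama-cpp-python's container (a list of callables applied in order to the logits); `ban_first_token` is a toy processor invented for this example, not distilabel code.

```python
from typing import List


class LogitsProcessorList(list):
    """Minimal stand-in for llama_cpp.LogitsProcessorList:
    applies each contained processor in sequence to the logits."""

    def __call__(self, input_ids: List[int], scores: List[float]) -> List[float]:
        for processor in self:
            scores = processor(input_ids, scores)
        return scores


def ban_first_token(input_ids: List[int], scores: List[float]) -> List[float]:
    # Toy processor: mask out the token with id 0.
    scores = list(scores)
    scores[0] = float("-inf")
    return scores


# The fix from the comment: wrap the single structured-output
# processor in a list-like container rather than passing the bare
# callable, so the backend can iterate over processors uniformly.
processor = LogitsProcessorList([ban_first_token])
out = processor([1, 2], [0.5, 0.3, 0.2])
```

The real code does the same thing with `LogitsProcessorList([self._logits_processor])`, guarded by `if self.structured_output else None`.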
for more information, see https://pre-commit.ci
…hub.com/argilla-io/distilabel into feat/1081-feature-update-to-outlines010
This looks good functionally, but I found it difficult to follow. With community maintainability in mind, I think you could localise the outlines version logic in one place.
Looks good! I left some comments. Have you generated a dataset with the three integrations to check that it works?
…ct; delete unnecessary .DS_Store files from unit tests
- Introduced a helper function to check whether the `outlines` package is installed and which version it is.
- Updated the logic in `_get_logits_processor` to use the new version check, simplifying processor selection based on the outlines version.
- Adjusted the handling of tokenizers in `_get_tokenizer_from_model` to streamline the integration with different frameworks.
- Modified `prepare_guided_output` to differentiate processing based on the outlines version, ensuring compatibility with both pre-0.1.0 and post-0.1.0 versions of the outlines package.
- Replaced the `_set_logits_processor` method with direct assignment of `_logits_processor` using `_prepare_structured_output`.
- Simplified the logic for setting the logits processor in both the `load` and generation methods, enhancing code clarity and maintainability.
…sLLM
- Updated the import statement for outlines to use the new helper function `_outlines_version_below_0_1_0`.
- Simplified the logic for setting the `_logits_processor` based on the outlines version check, enhancing code clarity and maintainability.
- Renamed the helper function from `_outlines_version_below_0_1_0` to `_is_outlines_version_below_0_1_0` for clarity.
- Updated all references to the renamed function across the codebase, ensuring consistent usage in the `TransformersLLM` class and related functions.
- Enhanced code readability and maintainability by standardizing function naming conventions.
…on outlines version
- Introduced a version check for outlines in both LlamaCppLLM and TransformersLLM to determine the processor return type.
- Updated `prepare_guided_output` to handle processor initialization differently for outlines versions below and above 0.1.0.
- Enhanced tokenizer handling in `_get_tokenizer_from_model` to support multiple frameworks, ensuring compatibility and improved functionality.
…ransformersLLM
- Updated return types of `_prepare_structured_output` methods to reflect changes in processor handling.
- Changed the return type in LlamaCppLLM from `Union["LogitsProcessorList", None]` to `Union["LogitsProcessorList", "LogitsProcessor"]`.
- Modified MlxLLM and TransformersLLM to return `Union[List[Callable], Callable]` instead of `Union[Callable, None]`, ensuring consistency across implementations.
- Enhanced code clarity and maintainability by standardizing output handling in structured output preparation.
- Added support for the 'mlx' framework in the outlines processing logic.
- Updated the `prepare_guided_output` function to utilize `TransformerTokenizer` for the 'mlx' framework.
- Modified the `_get_logits_processor` and `_get_tokenizer_from_model` functions to include 'mlx' as a valid framework option, ensuring consistent handling across different frameworks.
- Improved code clarity and maintainability by standardizing framework handling in the structured output preparation process.
LGTM
- Simplified return types in LlamaCppLLM and MlxLLM by removing version checks and directly returning the processor.
- Enhanced code clarity and maintainability by standardizing the output structure across both classes.
- Updated `prepare_guided_output` usage to ensure consistent handling of structured outputs.
- Removed the `structured_output` attribute and related processing logic from MlxLLM to simplify the class structure.
- Updated the `load` and generation methods to eliminate references to structured output, enhancing clarity and maintainability.
- Adjusted imports and type hints in `outlines.py` to reflect the removal of 'mlx' framework support, streamlining the framework handling.
- Improved code readability by cleaning up unnecessary complexity in structured output preparation.
- Changed the assignment of `_logits_processor` to always use a list, ensuring consistent handling across different outlines versions.
- Removed the version check for outlines in the `load` method, simplifying the logic and enhancing maintainability.
- Updated the return type in the structured output preparation to directly return the processor, improving code clarity.
- Updated type hints for the `llm` parameter in the `_get_tokenizer_from_model` and `prepare_guided_output` functions to use `_vLLM` instead of `LLM`, enhancing code readability.
- Adjusted imports to reflect the new alias for `LLM`, streamlining the code structure.
- Updated type hint imports to include `# noqa` comments, enhancing code readability and maintaining consistency with type checking.
- No functional changes were made; this commit focuses on code structure and clarity.
- Updated the return statement in the `prepare_guided_output` function to use `model or tokenizer` instead of `llm`, improving clarity and consistency in processor assignment.
- This change enhances the function's flexibility in handling different input types while maintaining existing functionality.
- Removed the upper version limit for the `transformers` package, allowing for updates beyond version 4.47.0.
mlx-lm integration #995: did not include this because it was relatively complex to add at this stage.