Automated PR: Downstream develop rebase new changes #44
Status: Closed

Commits included in this rebase:
…#31395)
* Add llama3-llava-next-8b to the llava_next conversion script: adds support for the lmms-lab/llama3-llava-next-8b model to convert_llava_next_weights_to_hf.py, along with an example prompt generated from the llava_llama_3 conv_template in the LLaVA-NeXT repo.
* Exclude <|begin_of_text|> from the prompt example: this token is added automatically, so it should not be included in the example.
* Add llava-next-72b and llava-next-110b: adds the Qwen-based LLaVA-NeXT models to the conversion script, along with changes to load the models on multiple GPUs for inference.
* Add llama3 and qwen prompt formats to the docs.
* Use a chat prompt and left padding side for batched llama3 (see the sketch after this entry).
* Apply review suggestions to convert_llava_next_weights_to_hf.py; remove dead code; better naming.
--------- Co-authored-by: raushan, Raushan Turganbay, amyeroberts
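A minimal sketch of the left-padding setup mentioned above; the checkpoint name is only a stand-in, and any decoder-only tokenizer behaves the same way:

```python
from transformers import AutoTokenizer

# Checkpoint chosen purely for illustration.
tok = AutoTokenizer.from_pretrained("hf-internal-testing/llama-tokenizer")
tok.pad_token = tok.eos_token  # llama tokenizers ship without a pad token
tok.padding_side = "left"      # pad left so every prompt ends where generation begins
batch = tok(["short prompt", "a somewhat longer prompt"], padding=True, return_tensors="pt")
# attention_mask zeroes out the left padding, keeping batched generate() aligned
```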
* pad on right if training * docs * add tests
* [whisper integration] use parquet dataset for testing * propagate to others * more propagation * last one
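For context on the entry above: parquet-backed test sets load without any dataset script. A hedged sketch with a hypothetical file path:

```python
from datasets import load_dataset

# The local path is illustrative; parquet files need no loading script.
ds = load_dataset("parquet", data_files={"validation": "librispeech_dummy/validation.parquet"})
sample = ds["validation"][0]
```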
…ace#31749) * [whisper] remove un-necessary transpose for fa2 attention * propagate
* fix mask creation of gpt2 and gpt_neox caused by me * forgot the reshape of masks when shape > 2 * add tests for gpt neox and gpt2 * nit on a comment
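The reshape in question is the usual 2D-to-4D mask expansion; a simplified sketch (values and dtype handling abbreviated relative to the real modeling code):

```python
import torch

# Expand a 2D padding mask (batch, src_len) into the 4D
# (batch, 1, tgt_len, src_len) shape the attention layers expect.
mask_2d = torch.tensor([[1, 1, 1, 0]])                      # last position is padding
mask_4d = mask_2d[:, None, None, :].to(torch.float32)
mask_4d = (1.0 - mask_4d) * torch.finfo(torch.float32).min  # 0 keeps, -inf masks
```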
encapsulate chat template logic
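The logic being encapsulated is what `apply_chat_template` exposes publicly; a short usage sketch (checkpoint chosen only because it ships a chat template):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
messages = [{"role": "user", "content": "Write a haiku about rebases."}]
# Renders the model's own chat template; tokenize=False returns the string.
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```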
* Add YaRN and Dynamic-YaRN RoPE scaling methods. YaRN (Yet another RoPE extension method) combines the NTK-by-parts interpolation and attention scaling methods, improving upon existing RoPE interpolation methods for longer context window sizes. Fine-tuned models maintain their original performance across benchmarks while enabling efficient extrapolation and transfer learning for quicker convergence, especially in compute-limited environments. YaRN and Dynamic-YaRN are implemented for: LLaMA, Falcon, GPT-NeoX, Olmo, Persimmon, Phi, StableLM, OpenLLaMA. New unit tests assert YaRN's correct behavior on both short and long sequence inputs. For more details, see https://arxiv.org/abs/2309.00071 (a sketch of the scaling follows this entry).
* Refactor the YaRN implementation for LLaMA and remove the diff from the remaining models for increased PR modularity: merge the 'yarn_rope_scaling' and 'rope_scaling' dictionaries; remove the unnecessary 'extrapolation_factor' and 'finetuned' attributes from the YaRN classes; inherit 'forward' in the YaRN classes from the superclass; rename 'yarn' to 'compute_yarn_scaling'; extend the YaRN tests with further assertions; fix style inconsistencies.
* Refactor the tensor-building logic for YaRN: comply with the tensor-building logic introduced in huggingface#30743; add a reference to the optimized attention-factor equation; remove Dynamic-YaRN for a more agile deployment.
* Remove an unwanted file.
--------- Co-authored-by: Miguel Almeida, Miguel Monte e Freitas, mig-mfreitas, Joao Gante
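A hedged sketch of the NTK-by-parts interpolation at YaRN's core, following arXiv:2309.00071; the function and parameter names are illustrative, not the exact API that landed:

```python
import math
import torch

def yarn_inv_freq(dim=128, base=10000.0, scaling_factor=8.0,
                  beta_fast=32, beta_slow=1, original_max_position=4096):
    """Sketch of YaRN's NTK-by-parts interpolation (arXiv:2309.00071).
    Names and defaults are illustrative, not the library's exact API."""
    pos_freqs = base ** (torch.arange(0, dim, 2).float() / dim)
    inv_freq_extrapolation = 1.0 / pos_freqs                     # plain RoPE
    inv_freq_interpolation = 1.0 / (scaling_factor * pos_freqs)  # stretched RoPE

    def correction_dim(num_rotations):
        # dimension whose wavelength completes `num_rotations` turns over the
        # original context window
        return (dim * math.log(original_max_position / (num_rotations * 2 * math.pi))) \
            / (2 * math.log(base))

    low = max(math.floor(correction_dim(beta_fast)), 0)
    high = min(math.ceil(correction_dim(beta_slow)), dim // 2 - 1)
    # ramp 0 -> 1: high-frequency dims keep extrapolation, low-frequency dims
    # are interpolated, with a smooth blend in between
    ramp = torch.clamp((torch.arange(dim // 2).float() - low) / max(high - low, 1), 0, 1)
    return inv_freq_interpolation * ramp + inv_freq_extrapolation * (1 - ramp)
```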
add attribute to model Signed-off-by: Daniel Lok <[email protected]>
…huggingface#31979) * Change resize_token_embeddings so that it returns the same class that is passed to it (see the sketch after this entry) * Add an explanatory comment as requested in review * Add explanatory comments for the resizing function in lxmert * Add a comment for padding_idx and move _resize_bias in lxmert to LxmertForPreTraining --------- Co-authored-by: Prashanth Sateesh
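A usage sketch of the changed behavior; the printed class is whatever embedding type the model already used:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tok = AutoTokenizer.from_pretrained("gpt2")
tok.add_tokens(["<my_new_token>"])

new_embeddings = model.resize_token_embeddings(len(tok))
# Per the commit, the returned module now keeps the class of the model's
# original embedding layer instead of being rebuilt as a plain nn.Embedding.
print(type(new_embeddings))
```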
Co-authored-by: amyeroberts <[email protected]> Co-authored-by: Arthur <[email protected]>
* GGUF conversion forces add_prefix_space=False for llama3; this is not required and forces from_slow, which fails. Change it to None and add a test. * Fix a typo * Clean up the test
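For reference, a hedged sketch of the GGUF tokenizer path this fixes; the repo and file names are hypothetical:

```python
from transformers import AutoTokenizer

# Hypothetical repo and file names, for illustration only.
tok = AutoTokenizer.from_pretrained(
    "some-org/Llama-3-8B-GGUF", gguf_file="llama-3-8b.Q4_K_M.gguf"
)
# With add_prefix_space=None the fast tokenizer loads directly, instead of
# being forced through the (failing) slow-tokenizer conversion path.
```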
Add the lru_cache for speed
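The pattern, in miniature (the decorated function is illustrative, not the one from the PR):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def expensive_lookup(key: str) -> int:
    # computed once per distinct key, then served from the cache
    return sum(ord(c) for c in key) ** 2

expensive_lookup("tokenizer")  # computed
expensive_lookup("tokenizer")  # cache hit, no recomputation
```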
--------- Co-authored-by: Merve Noyan <[email protected]>
* Add a use_mambapy arg with a pure-PyTorch fallback via the mamba.py package (see the sketch after this entry) * tests: forward and backward ok * removed check scripts * fixed a typo in the warning * protected imports with the mambapy package * delete pscan.py and raise rather than assert * fix whitespace and unused imports * transpose before pscan; shape comment * ran make style and make fix-copies * use_mambapy=False by default --------- Co-authored-by: Arthur
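A hedged sketch of opting into the fallback; `use_mambapy` is the config flag named in the commit, and the sizes are arbitrary:

```python
from transformers import MambaConfig, MambaForCausalLM

# use_mambapy (default False, per the commit) opts into the mamba.py
# sequential pscan fallback when the fused CUDA kernels are unavailable.
config = MambaConfig(hidden_size=64, num_hidden_layers=2, use_mambapy=True)
model = MambaForCausalLM(config)
```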
* renamed phi3 rope_scaling type * fixed trailing whitespaces * fixed test * added warning * fixed format
…e#32148) Revert "Incorrect Whisper long-form decoding timestamps (huggingface#32003)" This reverts commit cd48553.
…ingface#31857)
* feat(cache): StaticCache uses index_copy_ to avoid a useless copy. index_copy_ makes the in-place change of the tensor explicit; some backends (XLA) will otherwise copy the tensor, making the code slower and using more memory. The proposed implementation ends up using less memory and on XLA results in less compilation, while the change is also quite generic, making no change whatsoever on the CUDA or CPU backends.
* feat(cache): SlidingWindowCache uses index_copy_ for the same reason.
* fix(cache): fall back when index_copy_ is not implemented.
* fix(cache): ensure tensors are on the same device in index_copy_.
* fix(cache): move cache_position to the same device in SlidingWindowCache.
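The core of the change, sketched with toy shapes:

```python
import torch

# Cache-update pattern: index_copy_ writes in place along the sequence
# dimension, which XLA can lower without materializing a copy.
k_cache = torch.zeros(1, 8, 128, 64)   # (batch, heads, max_len, head_dim)
new_keys = torch.randn(1, 8, 2, 64)    # two new positions
cache_position = torch.tensor([4, 5])

k_cache.index_copy_(2, cache_position, new_keys)  # in-place, XLA-friendly
# Equivalent advanced indexing, which some backends turn into a full copy:
# k_cache[:, :, cache_position] = new_keys
```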
…r search (huggingface#31924) Update integration_utils.py: added an additional kwarg.
…ith Position IDs (huggingface#31629) * add DataCollatorBatchFlattening, later renamed DataCollatorWithFlattening * new FA2 flow if position_ids is provided (illustrated after this entry) * add comments * minor fixes to the data collator * add test cases for the models and the data collator * remove extra code * formatting for ruff check and check_repo.py (ruff format tests src utils; custom_init_isort.py)
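What flattening with position IDs looks like on toy data (a hand-rolled sketch, not the collator's actual code):

```python
import torch

# Three variable-length samples, packed into a single row with no padding.
samples = [[5, 6, 7], [8, 9], [10, 11, 12, 13]]
input_ids = torch.tensor([t for s in samples for t in s]).unsqueeze(0)
position_ids = torch.tensor([i for s in samples for i in range(len(s))]).unsqueeze(0)
# input_ids    -> [[ 5,  6,  7,  8,  9, 10, 11, 12, 13]]
# position_ids -> [[ 0,  1,  2,  0,  1,  0,  1,  2,  3]]
# Each restart to 0 in position_ids marks a sequence boundary, which the new
# FA2 flow uses to keep attention from crossing between packed samples.
```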
* Updated the ruff version and fixed the required code according to the latest version. * Added a noqa directive to ignore one error shown by ruff
Co-authored-by: Arthur Zucker <[email protected]>
…face#32160) Fixed an if condition always evaluating to true.
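The commit doesn't show the offending line, but this bug class always has the same shape; an illustrative reconstruction, not the code from the PR:

```python
mode = "predict"
if mode == "train" or "eval":   # bug: "eval" is a truthy string, so always True
    print("always taken")
if mode in ("train", "eval"):   # the intended membership test
    print("only for train/eval")
```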
…eights in the layer (huggingface#32171) * Add extra_repr() to MambaRMSNorm to include the hidden size of the layer * Style fix with ruff
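What `extra_repr` buys you, on a simplified stand-in for MambaRMSNorm:

```python
import torch
from torch import nn

class RMSNorm(nn.Module):
    """Simplified stand-in for MambaRMSNorm to show the extra_repr change."""
    def __init__(self, hidden_size, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.variance_epsilon = eps

    def extra_repr(self):
        # surfaces the layer width in print(model) output
        return f"{self.weight.shape[0]}, eps={self.variance_epsilon}"

print(RMSNorm(768))  # -> RMSNorm(768, eps=1e-06)
```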
…than the ones present at import time. (huggingface#32153) * fix: default value reflects the runtime environment variables rather than the ones present at import time. * Fix: Change `deterministic` to None by default; use env var if None
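The before/after pattern, sketched; the environment-variable name follows the flash-attention setting this commit touches, but treat the function as illustrative:

```python
import os

def run_attention(deterministic=None):
    # Resolve the default at call time, not import time, so changes to the
    # environment variable made after import are respected.
    if deterministic is None:
        deterministic = os.environ.get("FLASH_ATTENTION_DETERMINISTIC", "0") == "1"
    return deterministic

os.environ["FLASH_ATTENTION_DETERMINISTIC"] = "1"
assert run_attention() is True  # picked up even though it was set after import
```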
* Update qwen2.md: fix the outdated description and a wrong version code; amended after review, now good to go
Remove conversation pipeline tests
Fixed WhisperModel.forward’s docstring link.
) * docs: ko: chat_templating.md * feat: nmt draft * fix: manual edits and review suggestions (anchor fixes) * fix: delete the 'default template' section --------- Co-authored-by: Sungmin Oh, SeungYoun Lee, Minki Kim
Hello!
## Pull Request overview
* Fix typo
## Details
This should speak for itself. cc @itazap @ArthurZucker
- Tom Aarsen
Update llm_tutorial.md: remove a comma (re: issue huggingface#32518)
* Change `_supports_sdpa` to True * add phi3 to sdpa support list
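With the flag flipped, SDPA can be requested explicitly; a short usage sketch:

```python
from transformers import AutoModelForCausalLM

# Request PyTorch's scaled_dot_product_attention explicitly; before this
# change, phi3 would reject attn_implementation="sdpa".
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct", attn_implementation="sdpa"
)
```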
* fix typo * uniform kwargs * make style * add comments * remove return_tensors * remove common_kwargs from the processor since it propagates * make style * set return_token_type_ids to True * revert the default image kwargs since the image processor does not accept any value * revert processing_utils.py * make style * add molbap's commit * fix typo * fix common processor * remain * Revert "add molbap's commit" This reverts commit a476c6e. * add unsync PR * revert * make CI happy * nit * import AnnotationFormat
* handle (processor_class, None) returned by ModelPatterns * handle (slow, fast) image processors in add model * handle old image processor case
* add qwen2audio
* Update check_repo.py; fix style, tests, and model size
* Qwen2AudioEncoderModel -> Qwen2AudioEncoder; add copy info
* Apply review suggestions to modeling_qwen2_audio.py
* switch the attention_mask and the feature_attention_mask
* add to PRIVATE_MODELS in check_repo.py; add to MODEL_NAMES_TO_IGNORE in check_table.py
* fix initialization; update chat_template; fix a consistency issue after copy
* add docstrings to _merge_input_ids_with_audio_features; add copied-from to prepare_inputs_for_generation
* add more details to docs; rm comment; add init_std
* Apply doc review suggestions to qwen2_audio.md; update tests; rm ignore_index; update processor; rm ffmpeg_read
* fix typo and quality; [run_slow] qwen2_audio
* add official model
--------- Co-authored-by: Yoach Lacombe, amyeroberts
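A hedged usage sketch; the checkpoint name and the processor's `audios` kwarg follow the model's public docs and may differ across versions:

```python
import numpy as np
from transformers import AutoProcessor, Qwen2AudioForConditionalGeneration

processor = AutoProcessor.from_pretrained("Qwen/Qwen2-Audio-7B-Instruct")
model = Qwen2AudioForConditionalGeneration.from_pretrained("Qwen/Qwen2-Audio-7B-Instruct")

audio = np.zeros(16000, dtype=np.float32)  # one second of silence as a stand-in
prompt = "<|audio_bos|><|AUDIO|><|audio_eos|>Describe the clip:"
inputs = processor(text=prompt, audios=audio, sampling_rate=16000, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=32)
print(processor.batch_decode(out, skip_special_tokens=True))
```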
…0954) * filter flash_attn optional imports loading remote code * improve pattern * fix code style * Update src/transformers/dynamic_module_utils.py Co-authored-by: Matt <[email protected]> --------- Co-authored-by: Matt <[email protected]>
…uggingface#32372) * docs: ko: llm_tutorial_optimization.md * feat: nmt draft * fix: manual edits and review suggestions --------- Co-authored-by: Chaewon Song, timdalxx, boyunJang
* docs: ko: ko-trainer * feat: nmt draft * fix: manual edits, glossary * Apply suggestions from code review --------- Co-authored-by: Jinuk, SeongWooChoi
* docs: ko: quantization/eetq.md * feat: nmt draft * fix: manual edits and resolve review suggestions --------- Co-authored-by: Jiwook Han
* docs: ko: fsdp.md * feat: nmt draft * fix: manual edits and resolve review suggestions --------- Co-authored-by: 김준재, Minki Kim, Steven Liu
* docs: ko: quantization/bitsandbytes.md * feat: nmt draft * fix: minor typos, manual edits, and review suggestions --------- Co-authored-by: wony617, YONGSANG, Woojun Jung, Steven Liu
* I think inputs_embeds has ndim == 3 * fix sequence length catch * add generate test * [run-slow]olmo, persimmon, gemma, gemma2, qwen2, llama * skip whisper * fix bart test * more fixes
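Why ndim == 3 matters for the entry above: embeddings fed to generate() carry a hidden dimension. A sketch on a tiny checkpoint chosen purely for illustration:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-LlamaForCausalLM")
tok = AutoTokenizer.from_pretrained("hf-internal-testing/tiny-random-LlamaForCausalLM")

ids = tok("Hello", return_tensors="pt").input_ids
embeds = model.get_input_embeddings()(ids)   # shape (batch, seq_len, hidden): ndim == 3
out = model.generate(inputs_embeds=embeds, max_new_tokens=8)
```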
…gface#32516) Workaround the export issue in torch 2.4 Co-authored-by: Guang Yang <[email protected]>
fix _update_model_kwargs_for_generation
no empty revision
…#32422) Signed-off-by: duzhanwei <[email protected]> Co-authored-by: duzhanwei <[email protected]>
* docs: ko: main_classes/agent * feat: chatgpt draft * fix: manual edits * fix: resolve suggestions * fix: resolve code line number --------- Co-authored-by: Woojun Jung, thsamaji, SeungAhSon
* Update testing_utils.py * changes * from env var * name change * debug * name change
* skip failures * navi31 skip * mi300 skips * conversational test backwards compatibility * mi300 skips
This PR was created automatically by the Fork Maintenance System to sync changes from the downstream main into downstream develop.