Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automated PR: Downstream develop rebase new changes #53

Closed
wants to merge 200 commits into from

Conversation

Cemberk
Copy link
Owner

@Cemberk Cemberk commented Aug 14, 2024

This PR was created automatically by the Fork Maintenance System to sync changes from the downstream main into downstream develop.

jamt9000 and others added 30 commits July 23, 2024 10:12
…#31395)

* Add llama3-llava-next-8b to llava_next conversion script

Adds support for the lmms-lab/llama3-llava-next-8b model to the
convert_llava_next_weights_to_hf.py script, along with an example
prompt generated from the llava_llama_3 conv_template in the LLaVA-NeXT
repo.

* Exclude <|begin_of_text|> from prompt example

This token gets added automatically, so it should not be included in the
prompt example.

* Add llava-next-72b and llava-next-110b

Adds the Qwen-based LLaVA-Next models to the conversion script, along
with changes to load the models on multiple GPUs for inference.

* Add llama3 and qwen prompt formats to docs

* Chat prompt and padding side left for llama3 batched

* update

* Update src/transformers/models/llava_next/convert_llava_next_weights_to_hf.py

Co-authored-by: amyeroberts <[email protected]>

* Update src/transformers/models/llava_next/convert_llava_next_weights_to_hf.py

Co-authored-by: amyeroberts <[email protected]>

* remove code

* better naming

---------

Co-authored-by: raushan <[email protected]>
Co-authored-by: Raushan Turganbay <[email protected]>
Co-authored-by: amyeroberts <[email protected]>
* pad on right if training

* docs

* add tests
* [whisper integration] use parquet dataset for testing

* propagate to others

* more propagation

* last one
…ace#31749)

* [whisper] remove un-necessary transpose for fa2 attention

* propagate
* fix mask creation of gpt2 and gpt_neox caused by me

* forgot the reshape of masks when shape > 2

* add tests for gpt neox and gpt2

* nit on a comment
* Add YaRN and Dynamic-YaRN RoPE Scaling Methods

YaRN (Yet another RoPE extension method) combines the NTK-By-Parts
Interpolation and Attention Scaling methods, improving upon existing
RoPE interpolation methods for longer context window sizes.

Fine-tuned models maintain their original performance across benchmarks
while enabling efficient extrapolation and transfer learning for
quicker convergence, especially in compute-limited environments.

We implement YaRN and Dynamic-YaRN for the following list of models:

 - LLaMA
 - Falcon
 - GPT-NeoX
 - Olmo
 - Persimmon
 - Phi
 - StableLM
 - OpenLLaMA

New unit tests are added to assert YaRN's correct behavior on both
short and long sequence inputs.

For more details, please refer to https://arxiv.org/abs/2309.00071.

Co-authored-by: Miguel Almeida <[email protected]>

* Refactor YaRN implementation for LLaMA

Iterate on YaRN implementation for LLaMA and remove diff from remaining
models for increased PR modularity.

This commit includes the following changes:
- Merge 'yarn_rope_scaling' and 'rope_scaling' dictionaries
- Remove unnecessary attributes ('extrapolation_factor' and 'finetuned')
  from YaRN classes
- Inherit 'forward' method in YaRN classes from superclass
- Rename 'yarn' method to 'compute_yarn_scaling'
- Extend YaRN tests with further assertions
- Fix style inconsistencies

Co-authored-by: Miguel Monte e Freitas <[email protected]>

* Refactor Tensor Building Logic for YaRN

- Comply with the the tensor building logic introduced in huggingface#30743
- Add referencing to the optimized Attention Factor equation
- Remove Dynamic YaRN for a more agile deployment

Co-authored-by: mig-mfreitas <[email protected]>

* remove unwanted file

---------

Co-authored-by: Miguel Almeida <[email protected]>
Co-authored-by: mig-mfreitas <[email protected]>
Co-authored-by: Joao Gante <[email protected]>
…huggingface#31979)

* Change resize_token_embeddings to make it return same Class that is passed to it

* Add explanatory comment as requested in review

* Add explanatory comments for add resizing function in lxmert

* Add comment for padding_idx and moving _resize_bias in lxmert to LxmertForPreTraining

---------

Co-authored-by: Prashanth Sateesh <[email protected]>
Co-authored-by: Prashanth Sateesh <[email protected]>
Co-authored-by: amyeroberts <[email protected]>
Co-authored-by: Arthur <[email protected]>
* gguf conversion forces add_prefix_space=False for llama3, this is not required and forces from_slow, which fails. changing to None + test

* typo

* clean test
* Update README.md

* tests: forward ok

* backward test done

* done testing

* removed check. scripts

* Update README.md

* added use_mambapy arg

* fixed typo in warning

* protected imports w/ mambapy package

* delete pscan.py + raise rather than assert

* Update import_utils.py

* fix whitespaces and unused import

* trailing whitespace + import block unformatted

* Update modeling_mamba.py

* transpose before pscan

* shape comment

* ran make style

* use_mambapy=False by default

Co-authored-by: Arthur <[email protected]>

* ran make fix-copies

---------

Co-authored-by: Arthur <[email protected]>
* renamed phi3 rope_scaling type

* fixed trailing whitespaces

* fixed test

* added warning

* fixed format
…e#32148)

Revert "Incorrect Whisper long-form decoding timestamps  (huggingface#32003)"

This reverts commit cd48553.
…ingface#31857)

* feat(cache): StaticCache uses index_copy_ to avoid useless copy

Using index_copy_ allows for explicit in-place change of the tensor.
Some backends (XLA) will otherwise copy the tensor, making the code
slower and using more memory.

Proposed implementation will end up using less memory and on XLA will
result in less compilation, but the change is also quite generic, making
no change whatsoever on CUDA or CPU backend.

* feat(cache): SlidingWindowCache uses index_copy_ to avoid useless copy

Applying the same change done in StaticCache.

* fix(cache): fallback of index_copy_ when not implemented

* fix(cache): in index_copy_ ensure tensors are on same device

* [run slow] llama

* fix(cache): add move of cache_position to same device in SlidingWindowCache

* Revert "[run slow] llama"

This reverts commit 02608dd.
…r search (huggingface#31924)

Update integration_utils.py

Added additional kwarg
…ith Position IDs (huggingface#31629)

* add DataCollatorBatchFlattening

* Update data_collator.py

* change name

* new FA2 flow if position_ids is provided

* add comments

* minor fix

* minor fix data collator

* add test cases for models

* add test case for data collator

* remove extra code

* formating for ruff check and check_repo.py

* ruff format

ruff format tests src utils

* custom_init_isort.py
* Updated ruff version and fixed the required code accorindg to the latest version.

* Updated ruff version and fixed the required code accorindg to the latest version.

* Added noqa directive to ignore 1 error shown by ruff
Co-authored-by: Arthur Zucker <[email protected]>
…eights in the layer (huggingface#32171)

* adds: extra_repr() to MambaRMSNorm to include the hidden size of the layer

* style fix with ruff:
…than the ones present at import time. (huggingface#32153)

* fix: default value reflects the runtime environment variables rather than the ones present at import time.

* Fix: Change `deterministic` to None by default; use env var if None
* Update qwen2.md

outdated description

* Update qwen2.md

amended

* Update qwen2.md

Update

* Update qwen2.md

fix wrong version code, now good to go
Jwaminju and others added 29 commits August 9, 2024 09:58
* docs: ko: main_classes/agent

* feat: chatgpt draft

* fix: manual edits

* �fix: resolve suggestions

Co-authored-by: Woojun Jung <[email protected]>
Co-authored-by: thsamaji <[email protected]>
Co-authored-by: SeungAhSon <[email protected]>

* fix: resolve suggestions

* fix: resolve code line number

---------

Co-authored-by: Woojun Jung <[email protected]>
Co-authored-by: thsamaji <[email protected]>
Co-authored-by: SeungAhSon <[email protected]>
* v1 - working version

* fix

* fix

* fix

* fix

* rename to correct name

* fix title

* fixup

* rename files

* fix

* add copied from on tests

* rename to `FalconMamba` everywhere and fix bugs

* fix quantization + accelerate

* fix copies

* add `torch.compile` support

* fix tests

* fix tests and add slow tests

* copies on config

* merge the latest changes

* fix tests

* add few lines about instruct

* Apply suggestions from code review

Co-authored-by: Arthur <[email protected]>

* fix

* fix tests

---------

Co-authored-by: Arthur <[email protected]>
* fix check

* add tests

* [run-slow] llama, gemma2

* oops, whisper actually runs but needed some special treatment
…ce#32522)

* fix sliding window attention (flash2) in gemma2 model

* [run-slow] gemma

* fix slicing attention_mask for flash_attn2

* fix slicing attention_mask when flash_attn is used

* add missing comment

* slice the last seq_len tokens in the key, value states

* revert code of slicing key, value states
…2581)

* Fixed conditional check for encodec model names.

* Reformatted conditional check.
…eating PR on not-owned repo (huggingface#32094)

Fix create_pr aagainst existing revision
…sion_transformer (huggingface#32569)

Bump aiohttp in /examples/research_projects/decision_transformer

Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.9.4 to 3.10.2.
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst)
- [Commits](aio-libs/aiohttp@v3.9.4...v3.10.2)

---
updated-dependencies:
- dependency-name: aiohttp
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…_bert (huggingface#32220)

Bump torch in /examples/research_projects/visual_bert

Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](pytorch/pytorch@v1.13.1...v2.2.0)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Rename "Templates for Chat Models" doc to "Chat Templates"

* Small formatting fix

* Small formatting fix

* Small formatting fix

* Cleanup tool calling docs as well

* Remove unneeded 'revision'

* Move tip to below main code example

* Little bonus section on template editing
* Update _toctree.yml

* docs: ko: deepspeed.md

* Apply suggestions from code review

Co-authored-by: wony617 <[email protected]>

* Apply suggestions from code review

Co-authored-by: wony617 <[email protected]>

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/ko/deepspeed.md

* Update docs/source/ko/deepspeed.md

Co-authored-by: SeungAhSon <[email protected]>

* Apply suggestions from code review

Co-authored-by: wony617 <[email protected]>

* Update docs/source/ko/_toctree.yml

---------

Co-authored-by: wony617 <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: SeungAhSon <[email protected]>
* fix: manual edits

* Apply suggestions from code review

Co-authored-by: SeongWooChoi <[email protected]>
Co-authored-by: Chulhwa (Evan) Han <[email protected]>

* fix:manual edits

- 잘못된 경로에 번역본 파일을 생성해서 옮김

* Delete docs/source/ko/tasks/awq.md

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <[email protected]>

---------

Co-authored-by: SeongWooChoi <[email protected]>
Co-authored-by: Chulhwa (Evan) Han <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
Fixed failing test_find_base_model_checkpoint.
…decision_transformer (huggingface#32341)

Bump tensorflow in /examples/research_projects/decision_transformer

Bumps [tensorflow](https://github.com/tensorflow/tensorflow) from 2.11.1 to 2.12.1.
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](tensorflow/tensorflow@v2.11.1...v2.12.1)

---
updated-dependencies:
- dependency-name: tensorflow
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* "to be not" -> "not to be"

* Update sam.md

* Update trainer.py

* Update modeling_utils.py

* Update test_modeling_utils.py

* Update test_modeling_utils.py
…version` argument (huggingface#32545)

* Fixed wrong argument in is_torch_mps_available() function call.

* Fixed wrong argument in is_torch_mps_available() function call.

* sorted the import.

* Fixed wrong argument in is_torch_mps_available() function call.

* Fixed wrong argument in is_torch_mps_available() function call.

* Update src/transformers/utils/import_utils.py

Co-authored-by: Arthur <[email protected]>

* removed extra space.

* Added type hint for the min_version parameter.

* Added missing import.

---------

Co-authored-by: Arthur <[email protected]>
* let it be

* draft

* should not have changed

* add warnings

* fix & add tests

* fix tests

* ipnuts embeds cannot be passed with pixels

* more updates

* paligemma ready!

* minor typos

* update blip-2

* fix tests & raise error

* docstring

* add blip2 test

* tmp

* add image seq length to config

* update docstring

* delete

* fix tests

* fix blip

* fix paligemma

* out-of-place scatter

* add llava-next-video

* Update src/transformers/models/blip_2/modeling_blip_2.py

Co-authored-by: Pablo Montalvo <[email protected]>

* remove tmp

* codestyle

* nits

* more nits

* remove overriding in tests

* comprehension when merging video

* fix-copies

* revert changes for embeds test

* fix tests after making comprehension

* Update src/transformers/models/blip_2/processing_blip_2.py

Co-authored-by: Pablo Montalvo <[email protected]>

* Update src/transformers/models/blip_2/processing_blip_2.py

Co-authored-by: Pablo Montalvo <[email protected]>

* more updates

* fix tests

---------

Co-authored-by: Pablo Montalvo <[email protected]>
)

* Automatically add `transformers` tag to the modelcard

* Specify library_name and test
* skip failing tests

* [no-filter]

* [no-filter]

* fix wording catch in FA2 test

* [no-filter]

* trigger normal CI without filtering
…face#32316)

* fix

* enable on xpu

* no manual remove

* move to device

* remove to

* add move to
* add grokadamw

* reformat

* code review feedback, unit test

* reformat

* reformat
* add checkpoint and repo names

* adapt head to support metric depth estimation

* add max_depth output scaling

* add expected logits

* improve docs

* fix docstring

* add checkpoint and repo names

* adapt head to support metric depth estimation

* add max_depth output scaling

* add expected logits

* improve docs

* fix docstring

* rename depth_estimation to depth_estimation_type

* add integration test

* Refactored tests to include metric depth model inference test
* Integration test pass when the timm backbone lines are commented (L220-L227)

* address feedback

* replace model path to use organization path

* formatting

* delete deprecated TODO

* address feedback

* [run_slow] depth_anything
…s.py` (huggingface#32601)

* Removed un-necessary expressions.

* Fixed directory path for utils folder in test_tokenization_utils.py
)

* Add padding="max_length" to tokenizer kwargs and change crop_size to size for image_processor kwargs

* remove crop_size argument in align processor tests to be coherent with base tests

* Add pad_token when loading tokenizer if needed, change test override tokenizer kwargs, remove unnecessary test overwrites in grounding dino
* Update modeling_tf_deberta.py

Corrected some codes which do not support mixed precision

* Update modeling_tf_deberta_v2.py

Corrected some codes which do not support mixed precision

* Update modeling_tf_deberta_v2.py

* Update modeling_tf_deberta.py

* Add files via upload

* Add files via upload
* add fix for recurrentgemma

* [no-filter]

* trigger-ci

* [no-filter]

* [no-filter]

* attempt to fix mysterious zip error

* [no-filter]

* fix lookup error

* [no-filter]

* remove summarization hack

* [no-filter]
* Update testing_utils.py

* changes

* from env var

* name change

* debug

* name change
* skip failures

* navi31 skip

* mi300 skips

* conversational test backwards compatability

* mi300 skips
* mi250 new skip from update

* revert change from main use_auth_token depricated
@Cemberk Cemberk closed this Aug 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.