Automated PR: Downstream develop rebase new changes #53

Closed · wants to merge 200 commits

Commits (200):
251a240
Add llama3-llava-next-8b to llava_next conversion script (#31395)
jamt9000 Jul 23, 2024
3aefb4e
LLaVaNeXT: pad on right if training (#32134)
zucchini-nlp Jul 23, 2024
f83c6f1
Remove `trust_remote_code` when loading Libri Dummy (#31748)
sanchit-gandhi Jul 23, 2024
2782aad
[modelling] remove un-necessary transpose for fa2 attention (#31749)
sanchit-gandhi Jul 23, 2024
605f324
Fix mask creations of `GPTNeoX` and `GPT2` (#31944)
vasqu Jul 23, 2024
7405c1c
Add method to retrieve used chat template (#32032)
KonradSzafer Jul 23, 2024
34b4321
Add YaRN and Dynamic-YaRN RoPE Scaling Methods (#30910)
mig-mfreitas Jul 23, 2024
1535a2c
Disable quick init for TapasPreTrainedModel (#32149)
daniellok-db Jul 23, 2024
5a4a76e
Modify resize_token_embeddings to ensure output type is same as input…
bayllama Jul 23, 2024
2e11342
Llama: RoPE refactor (#32135)
gante Jul 23, 2024
a1844a3
gguf conversion add_prefix_space=None for llama3 (#31937)
itazap Jul 23, 2024
a5b226c
Fix flash attention speed issue (#32028)
Cyrilvallez Jul 23, 2024
9ced33c
Fix video batching to videollava (#32139)
merveenoyan Jul 23, 2024
bab32d6
Added mamba.py backend (#30139)
alxndrTL Jul 23, 2024
034b477
Rename Phi-3 rope scaling type (#31436)
garg-amit Jul 23, 2024
3263b34
Revert "Incorrect Whisper long-form decoding timestamps " (#32148)
sanchit-gandhi Jul 23, 2024
a009fbd
Fix typing to be compatible with later py versions (#32155)
amyeroberts Jul 23, 2024
6370062
feat(cache): StaticCache uses index_copy_ to avoid useless copy (#31857)
tengomucho Jul 23, 2024
7d92009
Added additional kwarg for successful running of optuna hyperparamete…
DeF0017 Jul 23, 2024
9cf4f2a
Enhancing SFT Training Efficiency Using Packing and FlashAttention2 w…
RhuiDih Jul 23, 2024
d2c687b
Updated `ruff` to the latest version (#31926)
Sai-Suraj-27 Jul 23, 2024
ff0d708
Dev version: v4.44.0.dev0
LysandreJik Jul 23, 2024
d5a99df
Llama 3.1 conversion
LysandreJik Jul 23, 2024
23f6a43
fix (#32162)
gante Jul 23, 2024
bc2adb0
fix: Fixed an if condition that is always evaluating to true (#32160)
Sai-Suraj-27 Jul 23, 2024
c85510f
[docs] change temperature to a positive value (#32077)
faaany Jul 23, 2024
01be5b4
adds: extra_repr() to MambaRMSNorm to include hidden size / size of w…
rohitdwivedula Jul 24, 2024
8678879
fix: default value reflects the runtime environment variables rather …
junrae6454 Jul 24, 2024
5f4ee98
Update qwen2.md (#32108)
ArtificialZeng Jul 24, 2024
165116b
Remove conversational pipeline tests (#32099)
amyeroberts Jul 24, 2024
e0182f3
RoPE: relaxed rope validation (#32182)
gante Jul 24, 2024
8d2534c
let's not warn when someone is running a forward (#32176)
ArthurZucker Jul 24, 2024
1392a68
Fix resize embedding with Deepspeed (#32192)
zucchini-nlp Jul 24, 2024
af0e4b7
Fix float8_e4m3fn in modeling_utils (#32193)
SunMarc Jul 24, 2024
1c122a4
Support dequantizing GGUF FP16 format (#31783)
PenutChen Jul 24, 2024
edd68f4
:rotating_light: No more default chat templates (#31733)
Rocketknight1 Jul 24, 2024
85a1269
fix: Replaced deprecated `unittest method` with the correct one (#32198)
Sai-Suraj-27 Jul 24, 2024
5658e74
[whisper] fix short-form output type (#32178)
sanchit-gandhi Jul 25, 2024
f53a5de
remove unnecessary guard code related with pytorch versions 1.4.2 ~ 1…
ji-huazhong Jul 25, 2024
1ecedf1
Update question_answering.py (#32208)
avlewis Jul 25, 2024
9b9a54e
[BigBird Pegasus] set _supports_param_buffer_assignment to False (#32…
kashif Jul 25, 2024
de23188
[warnings] fix E721 warnings (#32223)
kashif Jul 25, 2024
df6eee9
Follow up for #31973 (#32025)
ydshieh Jul 25, 2024
6ed0bf1
translate philosophy.md to chinese (#32177)
ji-huazhong Jul 25, 2024
3a83ec4
Allow a specific microphone to be used by the ffmpeg audio pipeline u…
jrhe Jul 25, 2024
9d6c064
Fix code snippet for Grounding DINO (#32229)
qubvel Jul 25, 2024
4ab33c2
Generation: stop at `eos` for assisted decoding (#31301)
zucchini-nlp Jul 26, 2024
fad15fb
Llava: generate without images (#32183)
zucchini-nlp Jul 26, 2024
c46edfb
Resize embeds with DeepSpeed (#32214)
zucchini-nlp Jul 26, 2024
1c7ebf1
don't log base model architecture in wandb if log model is false (#32…
joaonadkarni Jul 26, 2024
b8e5cd5
Refactor: Removed un-necessary `object` base class (#32230)
Sai-Suraj-27 Jul 26, 2024
f9756d9
Adds: extra_repr for RMSNorm layers in most models (#32204)
rohitdwivedula Jul 26, 2024
5f841c7
Add check for `target_sizes is None` in `post_process_image_guided_de…
catalys1 Jul 26, 2024
27c7f97
[tests] fix `static` cache implementation is not compatible with `att…
faaany Jul 26, 2024
81233c0
Flash-Attn: fix generation when no attention mask or no pading (#32241)
zucchini-nlp Jul 26, 2024
8da9068
More flexible trigger condition (#32251)
ydshieh Jul 26, 2024
44f6fdd
Llama 3.1: replace for loop by tensor ops at inv_freq initialization …
gante Jul 27, 2024
f739687
🚨 Bloom support for cache class (#31445)
zucchini-nlp Jul 29, 2024
f2122cc
Upload new model failure report to Hub (#32264)
ydshieh Jul 29, 2024
5019aab
Optimize t5 tokenize logic to avoid redundant calls (#32270)
leejet Jul 29, 2024
a2ad9d5
fix: Fixed wrong argument passed to `convert_blip_checkpoint` functio…
Sai-Suraj-27 Jul 29, 2024
535fe78
Repo: remove exceptions in `check_docstrings` (#32259)
gante Jul 29, 2024
6494479
make `p_mask` a numpy array before passing to `select_starts_ends` (#…
faaany Jul 29, 2024
4992889
fix(docs): Fixed a link in docs (#32274)
Sai-Suraj-27 Jul 29, 2024
7ffe25f
Generate: end-to-end compilation (#30788)
gante Jul 29, 2024
3fbaaaa
Whisper tokenizer word level timestamps (#32197)
kamilakesbi Jul 29, 2024
7f5d644
[pipeline] fix padding for 1-d tensors (#31776)
sanchit-gandhi Jul 29, 2024
811a9ca
Make static cache compatible with torch.export (#32168)
guangy10 Jul 29, 2024
a24a9a6
Add stream messages from agent run for gradio chatbot (#32142)
aymeric-roucher Jul 29, 2024
f0bc49e
use torch 2.4 in 2 CI jobs (#32302)
ydshieh Jul 29, 2024
3e8106d
Docs: fix GaLore optimizer code example (#32249)
gil2rok Jul 30, 2024
934fe15
Fix GGUF dequantize for `gguf==0.9.1` (#32298)
Isotr0py Jul 30, 2024
20528f0
Cast epochs_trained to int when resuming training (#32286)
teddy-f-47 Jul 30, 2024
084b509
feat(ci): set `fetch-depth: 0` in trufflehog checkout step (#31663)
McPatate Jul 30, 2024
2fbbcf5
Fix M4T for ASR pipeline (#32296)
ylacombe Jul 30, 2024
e68ec18
Docs: formatting nits (#32247)
gante Jul 30, 2024
bd54ed2
Alternative agent plan (#32295)
plaggy Jul 30, 2024
1627108
fix: Added missing raise keyword for few exceptions (#32333)
Sai-Suraj-27 Jul 30, 2024
62c60a3
fixes to properly shard FSDP across cpu and meta for cpu_efficient_lo…
winglian Jul 30, 2024
516af4b
fixes #32329 : The Torch code is correct - to get an average of 10% o…
fkrasnov2 Jul 30, 2024
026a173
Repo checks: skip docstring checks if not in the diff (#32328)
gante Jul 30, 2024
6e2d04e
Fix slow GemmaTokenizer and improve SPM slow -> fast conversion proce…
xenova Jul 30, 2024
a326433
LLaVA-NeXT: fix anyres shapes (#32314)
zucchini-nlp Jul 31, 2024
7f552e2
Gemma2 and flash-attention (#32188)
zucchini-nlp Jul 31, 2024
b75ad56
Llama 3.1: Fix incorrect `inv_freq` assignment (#32330)
gante Jul 31, 2024
5f1fcc2
[Idefics2] - Fix FA2 call for Perceiver layer (#32275)
amyeroberts Jul 31, 2024
ef177a5
Gemma 2: support assisted generation (#32357)
gante Jul 31, 2024
b46bd8b
Fix error when streaming to gradio with non-string tool arguments (#3…
aymeric-roucher Jul 31, 2024
92abe60
>3-5x faster torch.compile forward compilation for autoregressive dec…
fxmarty Jul 31, 2024
53f0c9c
fix: Removed unnecessary `@staticmethod` decorator (#32361)
Sai-Suraj-27 Jul 31, 2024
14ee232
fix: warmup_steps check for training_args (#32236)
Ricardo-L-C Jul 31, 2024
453e748
LLaVa: add cache class attribute (#32278)
zucchini-nlp Aug 1, 2024
9451a38
[enc-dec cache] fix bug in indexing (#32370)
sanchit-gandhi Aug 1, 2024
e234061
[whisper] compile compatibility with long-form decoding (#31772)
sanchit-gandhi Aug 1, 2024
48ed24c
Remove size check between attn_weights and kv_seq_len for phi3 (#32339)
helunwencser Aug 1, 2024
9e28284
add missing attribute _supports_param_buffer_assignment for gpt-j. (#…
nv-guomingz Aug 1, 2024
05c1f9a
Check device map for saving tokenizer config on TPU (fix for issue #3…
ayukh Aug 1, 2024
2229ebe
update clean_up_tokenization_spaces warning (#32371)
itazap Aug 1, 2024
db8c7ca
Empty list in defaults for LLaMA special tokens during weights conver…
ViktorooReps Aug 1, 2024
b4727a1
Fix conflicting key in init kwargs in PreTrainedTokenizerBase (#31233)
OmarManzoor Aug 1, 2024
ca59d6f
Offloaded KV Cache (#31325)
n17s Aug 1, 2024
e3d8285
Docker: add `speech` dep to the consistency docker image (#32374)
gante Aug 1, 2024
51ab25e
Fixed Hybrid Cache Shape Initialization. (#32163)
OsamaS99 Aug 1, 2024
82efc53
Yell at the user if zero-3 init wasn't performed, but expected to hav…
muellerzr Aug 1, 2024
2af199c
Update docs (#32368)
zucchini-nlp Aug 2, 2024
083e13b
RoPE: Add numerical tests ✨ (#32380)
gante Aug 2, 2024
c1aa0ed
[generate] only require an attention mask for mps with torch<2.4 (#32…
sanchit-gandhi Aug 2, 2024
7c31d05
fix: (issue #32124) Exception raised when running `transformers/examp…
fshp971 Aug 3, 2024
621fb3c
MixtralFlashAttention2: put "plus 1" inside parentheses when calculat…
xenshinu Aug 3, 2024
847bb85
Bump keras from 2.8.0 to 2.13.1 in /examples/research_projects/decisi…
dependabot[bot] Aug 5, 2024
05ae3a3
fix: SeamlessM4TFeatureExtractor stride remainder (#32088)
TechInterMezzo Aug 5, 2024
3bb646a
Phi3 tests: fix typing for Python 3.8 (#32388)
zucchini-nlp Aug 5, 2024
3d7c2f9
#32184 save total_vocab_size (#32240)
itazap Aug 5, 2024
ea5da52
add values for neftune (#32399)
nbroad1881 Aug 5, 2024
f5f1e52
Fix documentation references to google/bit-50 model (#32407)
JuanFKurucz Aug 5, 2024
baf7e5c
Persist embedding type of BART and mBART models after resize (#32242)
AbdiHaryadi Aug 5, 2024
458b0cd
fix: Updated `test_embeded_special_tokens` for luke and mluke models …
Sai-Suraj-27 Aug 5, 2024
7e5d46d
Respect the config's attn_implementation if set (#32383)
amyeroberts Aug 5, 2024
13dc6b0
Fix documentation links and code reference to model llava-next (#32434)
JuanFKurucz Aug 5, 2024
37c5ca5
Cache: create docs (#32150)
zucchini-nlp Aug 6, 2024
0aa8328
Llava: fix checkpoint_doc (#32458)
RUFFY-369 Aug 6, 2024
e85d863
add the missing flash attention test marker (#32419)
faaany Aug 6, 2024
fb66ef8
Update kwargs validation for `preprocess` with decorator (#32024)
qubvel Aug 6, 2024
438d06c
Fix get large model config for Switch Transformer encoder only tester…
JuanFKurucz Aug 6, 2024
36fd35e
Dependencies: fix typo (#32389)
gante Aug 6, 2024
6a03942
Add Nemotron HF Support (#31699)
suiyoubi Aug 6, 2024
3d8bd11
Generate: fix end to end compilation (#32465)
gante Aug 6, 2024
80b90e7
Add codestral mamba2 (#32080)
molbap Aug 6, 2024
194cf1f
Migrate import checks not need accelerate, and be more clear on min v…
muellerzr Aug 6, 2024
50c3ba8
Documentation: BOS token_id deprecation change for NLLB (#32443)
christoukmaji Aug 6, 2024
26a9443
dev version 4.45.0
ArthurZucker Aug 6, 2024
4fdc702
`is_torchdynamo_compiling` -- cast a wide exception net (#32476)
gante Aug 6, 2024
ac2707e
Revert "fixes to properly shard FSDP across cpu and meta for cpu_effc…
matthewdouglas Aug 6, 2024
5301b98
🌐 [i18n-KO] Translated `mask_generation.md` to Korean (#32257)
jeongiin Aug 6, 2024
3b193c7
🌐 [i18n-KO] Translated `idefics.md` to Korean (#32258)
boyunJang Aug 6, 2024
6af0854
🌐 [i18n-KO] Translated `image_to_image.md` to Korean (#32327)
shinhyunji36 Aug 6, 2024
a30c865
Cache: new Cache format in decoder-only models (#31421)
zucchini-nlp Aug 7, 2024
7ad784a
Gemma2: add cache warning (#32279)
zucchini-nlp Aug 7, 2024
46d09af
enable xla fsdp (#32048)
hanwen-sun Aug 7, 2024
c54a6f9
Fix typo in tokenization_utils_base.py (#32484)
blubitz Aug 7, 2024
e0d8253
Agents use grammar (#31735)
aymeric-roucher Aug 7, 2024
b640103
fix broken link in docs (#32491)
jorahn Aug 7, 2024
b7fb393
Docs: alert for the possibility of manipulating logits (#32467)
gante Aug 7, 2024
1124d95
🌐 [i18n-KO] Translated `gptq.md` to Korean (#32293)
1kmmk1 Aug 7, 2024
fcc4f2a
🌐 [i18n-KO] Translated `prompting.md` to Korean (#32294)
chhaewxn Aug 7, 2024
fa59fd8
🌐 [i18n-KO] Translated `quantization/quanto.md` to Korean (#32281)
fabxoe Aug 7, 2024
cba7bcf
🌐 [i18n-KO] Translated `image_feature_extraction.md` to Korean (#32239)
mreraser Aug 7, 2024
73a59a2
Fix references to model google mt5 small (#32497)
JuanFKurucz Aug 7, 2024
543df48
Docs: Fixed WhisperModel.forward’s docstring link (#32498)
Sai-Suraj-27 Aug 7, 2024
78566db
🌐 [i18n-KO] Translated `chat_templating.md` to Korean (#32362)
enchantee00 Aug 7, 2024
f5cdbf6
Fix link to autoclass_tutorial.md in i18n.md (#32501)
JuanFKurucz Aug 7, 2024
aefd3e2
Fix typo: depracted -> deprecated (#32489)
tomaarsen Aug 8, 2024
1c944ac
Fix issue #32518: Update llm_tutorial.md (#32523)
doomdagadiggiedahdah Aug 8, 2024
e28784f
Change Phi3 `_supports_sdpa` to True (#32457)
pocca2048 Aug 8, 2024
d3b3551
Uniformize kwargs for processors - GroundingDINO (#31964)
SangbumChoi Aug 8, 2024
b51d414
Fix add-new-model-like (#31773)
molbap Aug 8, 2024
16ed064
Add Qwen2-Audio (#32137)
faychu Aug 8, 2024
cc832cb
filter flash_attn optional imports loading remote code (#30954)
eaidova Aug 8, 2024
43f3fe8
🌐 [i18n-KO] Translated `ko-llm_tutorial_optimization.md` to Korean (#…
010kim Aug 8, 2024
96ba7f0
🌐 [i18n-KO] Translated `trainer.md` to Korean (#32260)
cjfghk5697 Aug 8, 2024
e0396bd
🌐 [i18n-KO] Translated `eetq.md` to Korean (#32352)
jun048098 Aug 8, 2024
496207a
🌐 [i18n-KO] Translated `fsdp.md` to Korean (#32261)
win2dvp21 Aug 8, 2024
b01f9c4
🌐 [i18n-KO] Translated `bitsandbytes.md` to Korean (#32408)
SeungAhSon Aug 8, 2024
0442816
Fix generate with `inputs_embeds` as input (#32493)
molbap Aug 8, 2024
0164560
Fixed test `test_static_cache_exportability` with torch 2.4.0 (#32516)
guangy10 Aug 8, 2024
54ac39c
Fix code example to load bigcode starcoder2 7b (#32474)
JuanFKurucz Aug 8, 2024
85817d9
[docs] Translation guide (#32547)
stevhliu Aug 8, 2024
838d141
Gemma2: fix FA2 generation (#32553)
zucchini-nlp Aug 9, 2024
7728b78
Fix a bug in Qwen2Audio (#32552)
faychu Aug 9, 2024
e4522fe
fix slow integration gemma2 test (#32534)
ArthurZucker Aug 9, 2024
e7f4ace
fix non contiguous tensor value error in save_pretrained (#32422)
congcongke Aug 9, 2024
48101cf
🌐 [i18n-KO] Translated `agent.md` to Korean (#32351)
Jwaminju Aug 9, 2024
7c11491
Add new model (#32615)
younesbelkada Aug 12, 2024
8f2b6d5
Fix: FA2 with packed training (#32487)
zucchini-nlp Aug 12, 2024
342e3f9
Fix sliding window attention used in Gemma2FlashAttention2 (#32522)
brcps12 Aug 12, 2024
bd251e4
fix: Fixed conditional check for `encodec` model names (#32581)
Sai-Suraj-27 Aug 12, 2024
e31a7a2
Fix `.push_to_hub(..., create_pr=True, revision="my-branch")` when cr…
Wauplin Aug 12, 2024
50837f2
Bump aiohttp from 3.9.4 to 3.10.2 in /examples/research_projects/deci…
dependabot[bot] Aug 12, 2024
8a3c55e
Bump torch from 1.13.1 to 2.2.0 in /examples/research_projects/visual…
dependabot[bot] Aug 12, 2024
b7ea171
Cleanup tool calling documentation and rename doc (#32337)
Rocketknight1 Aug 12, 2024
4996990
🌐 [i18n-KO] Translated `deepspeed.md` to Korean (#32431)
4N3MONE Aug 12, 2024
7f777ab
🌐 [i18n-KO] Translated `awq.md`to Korean (#32324)
ahnjj Aug 12, 2024
ce4b288
fix: Fixed failing `test_find_base_model_checkpoint` (#32638)
Sai-Suraj-27 Aug 12, 2024
126cbdb
Bump tensorflow from 2.11.1 to 2.12.1 in /examples/research_projects/…
dependabot[bot] Aug 12, 2024
f1c8542
"to be not" -> "not to be" (#32636)
qgallouedec Aug 12, 2024
2a5a6ad
fix: Updated the `is_torch_mps_available()` function to include `min_…
Sai-Suraj-27 Aug 12, 2024
a29eabd
Expand inputs in processors for VLMs (#30962)
zucchini-nlp Aug 13, 2024
29c3a0f
Automatically add `transformers` tag to the modelcard (#32623)
LysandreJik Aug 13, 2024
a5a8291
Fix tests (#32649)
molbap Aug 13, 2024
b5016d5
fix tensors on different devices in `WhisperGenerationMixin` (#32316)
faaany Aug 13, 2024
481e156
Add support for GrokAdamW optimizer (#32521)
ehartford Aug 13, 2024
cc25757
Add Depth Anything V2 Metric models (#32126)
bt2513 Aug 13, 2024
c3cd9d8
Fix: Fixed directory path for utils folder in `test_tokenization_util…
Sai-Suraj-27 Aug 13, 2024
5bcbdff
Modify ProcessorTesterMixin for better generalization (#32637)
yonigozlan Aug 13, 2024
9d2ab88
TF_Deberta supporting mixed precision (#32618)
pinesnow72 Aug 13, 2024
c135783
Fix tests recurrent (#32651)
molbap Aug 13, 2024
fe874ae
Changes from old ROCm main
amathews-amd Jul 22, 2024
8c9bb1a
Add skip if rocm (#38)
Cemberk Jul 24, 2024
d21e068
skip failures (#39)
Cemberk Jul 29, 2024
bb48693
Debug v4.43 rocm (#40) (#42)
Cemberk Aug 1, 2024
3 changes: 2 additions & 1 deletion .circleci/config.yml
@@ -142,6 +142,7 @@ jobs:
- run: python utils/custom_init_isort.py --check_only
- run: python utils/sort_auto_mappings.py --check_only
- run: python utils/check_doc_toc.py
- run: python utils/check_docstrings.py --check_all

check_repository_consistency:
working_directory: ~/transformers
@@ -190,4 +191,4 @@ workflows:
- check_circleci_user
- check_code_quality
- check_repository_consistency
- fetch_all_tests
- fetch_all_tests
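
The new `utils/check_docstrings.py --check_all` step (added both here and in the Makefile further down) can be reproduced locally before pushing. A minimal sketch, assuming a transformers checkout at the repository root; per the "Repo checks: skip docstring checks if not in the diff (#32328)" commit above, `--check_all` appears to force the check over every docstring rather than only those touched by a diff:

# Reproduce the CI quality checks locally (sketch)
python utils/custom_init_isort.py --check_only
python utils/sort_auto_mappings.py --check_only
python utils/check_doc_toc.py
python utils/check_docstrings.py --check_all   # newly added step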
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/i18n.md
@@ -34,7 +34,7 @@ Some notes:

## Tutorial section
- [ ] [pipeline_tutorial.md](https://github.com/huggingface/transformers/blob/main/docs/source/en/pipeline_tutorial.md)
- [ ] [autoclass_tutorial.md](https://github.com/huggingface/transformers/blob/master/docs/source/autoclass_tutorial.md)
- [ ] [autoclass_tutorial.md](https://github.com/huggingface/transformers/blob/main/docs/source/en/autoclass_tutorial.md)
- [ ] [preprocessing.md](https://github.com/huggingface/transformers/blob/main/docs/source/en/preprocessing.md)
- [ ] [training.md](https://github.com/huggingface/transformers/blob/main/docs/source/en/training.md)
- [ ] [accelerate.md](https://github.com/huggingface/transformers/blob/main/docs/source/en/accelerate.md)
2 changes: 1 addition & 1 deletion .github/workflows/self-pr-slow-ci.yml
@@ -4,7 +4,7 @@ on:
pull_request:
paths:
- "src/transformers/models/*/modeling_*.py"
- "tests/models/*/test_*.py"
- "tests/**/test_*.py"

concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
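The widened glob means the slow CI now triggers on test files anywhere under `tests/`, not only under `tests/models/`. A sketch of the difference, using illustrative paths:

# Old pattern: tests/models/*/test_*.py
#   tests/models/bert/test_modeling_bert.py   -> triggered before and after this change
# New pattern: tests/**/test_*.py
#   tests/generation/test_utils.py            -> triggered only after this change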
5 changes: 4 additions & 1 deletion .github/workflows/self-push-amd-mi210-caller.yml
@@ -14,11 +14,14 @@ on:
- ".github/**"
- "templates/**"
- "utils/**"
pull_request:
types: [opened, reopened, synchronize]
branches: ["main"]

jobs:
run_amd_ci:
name: AMD mi210
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
if: (cancelled() != true) && (github.event_name != 'schedule') && (github.event_name == 'pull_request')
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi210
7 changes: 5 additions & 2 deletions .github/workflows/self-push-amd-mi250-caller.yml
@@ -13,12 +13,15 @@ on:
- "tests/**"
- ".github/**"
- "templates/**"
- "utils/**"
- "utils/**"
pull_request:
types: [opened, reopened, synchronize]
branches: ["main"]

jobs:
run_amd_ci:
name: AMD mi250
if: (cancelled() != true) && ((github.event_name == 'workflow_run') || ((github.event_name == 'push') && startsWith(github.ref_name, 'run_amd_push_ci_caller')))
if: (cancelled() != true) && (github.event_name != 'schedule') && (github.event_name == 'pull_request')
uses: ./.github/workflows/self-push-amd.yml
with:
gpu_flavor: mi250
88 changes: 30 additions & 58 deletions .github/workflows/self-push-amd.yml
@@ -18,37 +18,22 @@ env:
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}

jobs:
check_runner_status:
name: Check Runner Status
runs-on: ubuntu-22.04
steps:
- name: Checkout transformers
uses: actions/checkout@v4
with:
fetch-depth: 2

- name: Check Runner Status
run: python utils/check_self_hosted_runner.py --target_runners amd-mi210-single-gpu-ci-runner-docker --token ${{ secrets.ACCESS_REPO_INFO_TOKEN }}

check_runners:
name: Check Runners
needs: check_runner_status
strategy:
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: [self-hosted, amd-gpu, '${{ matrix.machine_type }}', '${{ inputs.gpu_flavor }}']
runs-on: rocm
container:
image: huggingface/transformers-pytorch-amd-gpu-push-ci # <--- We test only for PyTorch for now
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
options: --device /dev/kfd --device /dev/dri --env HIP_VISIBLE_DEVICES --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
- name: Show HIP environment
run: |
echo "HIP: $HIP_VISIBLE_DEVICES"
echo "ROCR: $ROCR_VISIBLE_DEVICES"

setup_gpu:
@@ -57,14 +42,21 @@ jobs:
strategy:
matrix:
machine_type: [single-gpu, multi-gpu]
runs-on: [self-hosted, amd-gpu, '${{ matrix.machine_type }}', '${{ inputs.gpu_flavor }}']
runs-on: rocm
container:
image: huggingface/transformers-pytorch-amd-gpu-push-ci # <--- We test only for PyTorch for now
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
options: --device /dev/kfd --device /dev/dri --env HIP_VISIBLE_DEVICES --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
test_map: ${{ steps.set-matrix.outputs.test_map }}
steps:
- name: Remove transformers repository (installed during docker image build)
working-directory: /
shell: bash
run: |
rm -r transformers
git clone https://github.com/ROCmSoftwarePlatform/transformers.git

# Necessary to get the correct branch name and commit SHA for `workflow_run` event
# We also take into account the `push` event (we might want to test some changes in a branch)
- name: Prepare custom environment variables
@@ -155,11 +147,23 @@ jobs:
matrix:
folders: ${{ fromJson(needs.setup_gpu.outputs.matrix) }}
machine_type: [single-gpu, multi-gpu]
runs-on: [self-hosted, amd-gpu, '${{ matrix.machine_type }}', '${{ inputs.gpu_flavor }}']
runs-on: rocm
container:
image: huggingface/transformers-pytorch-amd-gpu-push-ci # <--- We test only for PyTorch for now
options: --device /dev/kfd --device /dev/dri --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
options: --device /dev/kfd --device /dev/dri --env HIP_VISIBLE_DEVICES --env ROCR_VISIBLE_DEVICES --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:

- name: Remove transformers repository (installed during docker image build)
working-directory: /
shell: bash
run: |
rm -r transformers
git clone https://github.com/ROCmSoftwarePlatform/transformers.git

- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .

# Necessary to get the correct branch name and commit SHA for `workflow_run` event
# We also take into account the `push` event (we might want to test some changes in a branch)
- name: Prepare custom environment variables
@@ -192,10 +196,6 @@ jobs:
git checkout ${{ env.CI_SHA }}
echo "log = $(git log -n 1)"

- name: Reinstall transformers in edit mode (remove the one installed during docker image build)
working-directory: /transformers
run: python3 -m pip uninstall -y transformers && python3 -m pip install -e .

- name: Echo folder ${{ matrix.folders }}
shell: bash
# For folders like `models/bert`, set an env. var. (`matrix_folders`) to `models_bert`, which will be used to
@@ -209,13 +209,11 @@
echo "matrix_folders=$matrix_folders" >> $GITHUB_ENV

- name: ROCM-SMI
run: |
rocm-smi
- name: ROCM-INFO
run: |
rocminfo | grep "Agent" -A 14
- name: Show ROCR environment
- name: Show HIP environment
run: |
echo "HIP: $HIP_VISIBLE_DEVICES"
echo "ROCR: $ROCR_VISIBLE_DEVICES"

- name: Environment
@@ -246,10 +244,9 @@ jobs:

send_results:
name: Send results to webhook
runs-on: ubuntu-22.04
runs-on: ubuntu-latest
if: always()
needs: [
check_runner_status,
check_runners,
setup_gpu,
run_models_gpu,
@@ -261,7 +258,6 @@
shell: bash
# For the meaning of these environment variables, see the job `Setup`
run: |
echo "Runner availability: ${{ needs.check_runner_status.result }}"
echo "Setup status: ${{ needs.setup_gpu.result }}"
echo "Runner status: ${{ needs.check_runners.result }}"

@@ -303,27 +299,3 @@ jobs:
git checkout ${{ env.CI_SHA }}
echo "log = $(git log -n 1)"

- uses: actions/download-artifact@v4
- name: Send message to Slack
env:
CI_SLACK_BOT_TOKEN: ${{ secrets.CI_SLACK_BOT_TOKEN }}
CI_SLACK_CHANNEL_ID: ${{ secrets.CI_SLACK_CHANNEL_ID }}
CI_SLACK_CHANNEL_ID_DAILY: ${{ secrets.CI_SLACK_CHANNEL_ID_DAILY }}
CI_SLACK_CHANNEL_ID_AMD: ${{ secrets.CI_SLACK_CHANNEL_ID_AMD }}
CI_SLACK_CHANNEL_DUMMY_TESTS: ${{ secrets.CI_SLACK_CHANNEL_DUMMY_TESTS }}
CI_SLACK_REPORT_CHANNEL_ID: ${{ secrets.CI_SLACK_CHANNEL_ID_AMD }}
ACCESS_REPO_INFO_TOKEN: ${{ secrets.ACCESS_REPO_INFO_TOKEN }}
CI_EVENT: Push CI (AMD) - ${{ inputs.gpu_flavor }}
CI_TITLE_PUSH: ${{ github.event.head_commit.message }}
CI_TITLE_WORKFLOW_RUN: ${{ github.event.workflow_run.head_commit.message }}
CI_SHA: ${{ env.CI_SHA }}
RUNNER_STATUS: ${{ needs.check_runner_status.result }}
RUNNER_ENV_STATUS: ${{ needs.check_runners.result }}
SETUP_STATUS: ${{ needs.setup_gpu.result }}

# We pass `needs.setup_gpu.outputs.matrix` as the argument. A processing in `notification_service.py` to change
# `models/bert` to `models_bert` is required, as the artifact names use `_` instead of `/`.
run: |
pip install slack_sdk
pip show slack_sdk
python utils/notification_service.py "${{ needs.setup_gpu.outputs.matrix }}"
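
The `models/bert` → `models_bert` mapping referenced twice above (in the `Echo folder` step and in the comment before `notification_service.py`) can be sketched as standalone shell. The workflow's actual substitution command is truncated in this diff, so treat the replacement below as an assumed equivalent:

# Sketch: normalize a matrix folder into the artifact-name form
folder="models/bert"                    # value from ${{ matrix.folders }}
matrix_folders=${folder//\//_}          # '/' -> '_' yields models_bert
echo "matrix_folders=$matrix_folders"   # the workflow appends this to $GITHUB_ENV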
23 changes: 6 additions & 17 deletions .github/workflows/trufflehog.yml
@@ -10,20 +10,9 @@ jobs:
trufflehog:
runs-on: ubuntu-latest
steps:
- shell: bash
run: |
if [ "${{ github.event_name }}" == "push" ]; then
echo "depth=$(($(jq length <<< '${{ toJson(github.event.commits) }}') + 2))" >> $GITHUB_ENV
echo "branch=${{ github.ref_name }}" >> $GITHUB_ENV
fi
if [ "${{ github.event_name }}" == "pull_request" ]; then
echo "depth=$((${{ github.event.pull_request.commits }}+2))" >> $GITHUB_ENV
echo "branch=${{ github.event.pull_request.head.ref }}" >> $GITHUB_ENV
fi
- name: Checkout code
uses: actions/checkout@v4
with:
ref: ${{env.branch}}
fetch-depth: ${{env.depth}}
- name: Secret Scanning
uses: trufflesecurity/trufflehog@main
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Secret Scanning
uses: trufflesecurity/trufflehog@main
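
The rewritten job replaces the per-event `depth`/`branch` computation with `fetch-depth: 0`, which makes `actions/checkout` clone the full history so the secret scanner can walk every commit. A local equivalent, as a sketch:

# Full-history clone, matching fetch-depth: 0 (a shallow clone would hide older commits)
git clone https://github.com/huggingface/transformers.git
cd transformers
git rev-list --count HEAD   # the entire commit history is available for scanning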
1 change: 1 addition & 0 deletions Makefile
@@ -56,6 +56,7 @@ quality:
python utils/custom_init_isort.py --check_only
python utils/sort_auto_mappings.py --check_only
python utils/check_doc_toc.py
python utils/check_docstrings.py --check_all


# Format source code automatically and check is there are any problems left that need manual fixing
2 changes: 1 addition & 1 deletion docker/consistency.dockerfile
@@ -8,7 +8,7 @@ RUN pip install uv && uv venv && uv pip install --no-cache-dir -U pip setuptools
RUN uv pip install --no-cache-dir --upgrade 'torch' --index-url https://download.pytorch.org/whl/cpu
# tensorflow pin matching setup.py
RUN uv pip install --no-cache-dir "tensorflow-cpu<2.16" "tf-keras<2.16"
RUN uv pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[flax,quality,vision,testing]"
RUN uv pip install --no-cache-dir "git+https://github.com/huggingface/transformers.git@${REF}#egg=transformers[flax,quality,torch-speech,vision,testing]"
RUN git lfs install

RUN pip uninstall -y transformers
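The extras list gains `torch-speech`, in line with the "Docker: add `speech` dep to the consistency docker image (#32374)" commit above. Installing the same set outside Docker, as a sketch (PEP 508 direct-reference form; `main` stands in for `${REF}`):

# Sketch: plain-pip equivalent of the Dockerfile's uv install step
pip install "transformers[flax,quality,torch-speech,vision,testing] @ git+https://github.com/huggingface/transformers.git@main"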
2 changes: 1 addition & 1 deletion docker/transformers-all-latest-gpu/Dockerfile
@@ -9,7 +9,7 @@ SHELL ["sh", "-lc"]
# The following `ARG` are mainly used to specify the versions explicitly & directly in this docker file, and not meant
# to be used as arguments for docker build (so far).

ARG PYTORCH='2.3.0'
ARG PYTORCH='2.4.0'
# (not always a valid torch version)
ARG INTEL_TORCH_EXT='2.3.0'
# Example: `cu102`, `cu113`, etc.
2 changes: 1 addition & 1 deletion docker/transformers-pytorch-gpu/Dockerfile
@@ -11,7 +11,7 @@ ARG REF=main
RUN git clone https://github.com/huggingface/transformers && cd transformers && git checkout $REF

# If set to nothing, will install the latest version
ARG PYTORCH='2.3.0'
ARG PYTORCH='2.4.0'
ARG TORCH_VISION=''
ARG TORCH_AUDIO=''
# Example: `cu102`, `cu113`, etc.
12 changes: 11 additions & 1 deletion docs/source/en/_toctree.yml
@@ -99,6 +99,8 @@
sections:
- local: generation_strategies
title: Customize the generation strategy
- local: kv_cache
title: Best Practices for Generation with Cache
title: Generation
- isExpanded: false
sections:
@@ -118,7 +120,7 @@
- local: custom_models
title: Share a custom model
- local: chat_templating
title: Templates for chat models
title: Chat templates
- local: trainer
title: Trainer
- local: sagemaker
@@ -368,6 +370,8 @@
title: ESM
- local: model_doc/falcon
title: Falcon
- local: model_doc/falcon_mamba
title: FalconMamba
- local: model_doc/fastspeech2_conformer
title: FastSpeech2Conformer
- local: model_doc/flan-t5
@@ -436,6 +440,8 @@
title: MADLAD-400
- local: model_doc/mamba
title: Mamba
- local: model_doc/mamba2
title: mamba2
- local: model_doc/marian
title: MarianMT
- local: model_doc/markuplm
@@ -466,6 +472,8 @@
title: MT5
- local: model_doc/mvp
title: MVP
- local: model_doc/nemotron
title: Nemotron
- local: model_doc/nezha
title: NEZHA
- local: model_doc/nllb
@@ -500,6 +508,8 @@
title: QDQBert
- local: model_doc/qwen2
title: Qwen2
- local: model_doc/qwen2_audio
title: Qwen2Audio
- local: model_doc/qwen2_moe
title: Qwen2MoE
- local: model_doc/rag