
[Feature] If i want to start Fine-tuning, must the flashattention be installed? #776

Closed
snowattitude opened this issue Dec 20, 2024 · 15 comments

Comments

@snowattitude

snowattitude commented Dec 20, 2024

Motivation

GPUS=2 PER_DEVICE_BATCH_SIZE=1 sh ./****2_5_2b_lora.sh

Error

FlashAttention is not installed.
FlashAttention is not installed.
flash-attention package not found, consider installing for better performance: No module named 'flash_attn'.
Current flash-attenton does not support window_size. Either upgrade or use attn_implementation='eager'.
flash-attention package not found, consider installing for better performance: No module named 'flash_attn'.
Current flash-attenton does not support window_size. Either upgrade or use attn_implementation='eager'.

@snowattitude snowattitude changed the title [Feature] If i want to start Fine-tuning, the flashattention must be installed? [Feature] If i want to start Fine-tuning, must the flashattention be installed? Dec 20, 2024
@Weiyun1025
Collaborator

It is not required; however, installing it will significantly improve the training speed.

@snowattitude
Author

How can I turn flash-attention off? I can't find where to disable it.
[screenshot]

@snowattitude
Author

It is not required; however, installing it will significantly improve the training speed.

Does this code require an Ampere GPU?

@Weiyun1025
Collaborator

you can try to set _attn_implementation to eager in the config.
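For reference, a minimal sketch of that edit in the checkpoint's config.json (the key sits under llm_config, as in the config posted later in this thread; treating this key as the switch the training code reads is an assumption):

    "llm_config": {
        ...
        "attn_implementation": "eager",
        ...
    }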

@snowattitude
Author

you can try to set _attn_implementation to eager in the config.

My config file is the same as yours.

@snowattitude
Author

{
  "_commit_hash": null,
  "architectures": [
    "InternVLChatModel"
  ],
  "auto_map": {
    "AutoConfig": "configuration_internvl_chat.InternVLChatConfig",
    "AutoModel": "modeling_internvl_chat.InternVLChatModel",
    "AutoModelForCausalLM": "modeling_internvl_chat.InternVLChatModel"
  },
  "downsample_ratio": 0.5,
  "dynamic_image_size": true,
  "force_image_size": 448,
  "llm_config": {
    "_name_or_path": "internlm/internlm2_5-1_8b-chat",
    "add_cross_attention": false,
    "architectures": [
      "InternLM2ForCausalLM"
    ],
    "attn_implementation": "flash_attention_2",
    "auto_map": {
      "AutoConfig": "configuration_internlm2.InternLM2Config",
      "AutoModel": "modeling_internlm2.InternLM2ForCausalLM",
      "AutoModelForCausalLM": "modeling_internlm2.InternLM2ForCausalLM",
      "AutoModelForSequenceClassification": "modeling_internlm2.InternLM2ForSequenceClassification"
    },
    "bad_words_ids": null,
    "begin_suppress_tokens": null,
    "bias": false,
    "bos_token_id": 1,
    "chunk_size_feed_forward": 0,
    "cross_attention_hidden_size": null,
    "decoder_start_token_id": null,
    "diversity_penalty": 0.0,
    "do_sample": false,
    "early_stopping": false,
    "encoder_no_repeat_ngram_size": 0,
    "eos_token_id": 2,
    "exponential_decay_length_penalty": null,
    "finetuning_task": null,
    "forced_bos_token_id": null,
    "forced_eos_token_id": null,
    "hidden_act": "silu",
    "hidden_size": 2048,
    "id2label": {
      "0": "LABEL_0",
      "1": "LABEL_1"
    },
    "initializer_range": 0.02,
    "intermediate_size": 8192,
    "is_decoder": false,
    "is_encoder_decoder": false,
    "label2id": {
      "LABEL_0": 0,
      "LABEL_1": 1
    },
    "length_penalty": 1.0,
    "max_length": 20,
    "max_position_embeddings": 32768,
    "min_length": 0,
    "model_type": "internlm2",
    "no_repeat_ngram_size": 0,
    "num_attention_heads": 16,
    "num_beam_groups": 1,
    "num_beams": 1,
    "num_hidden_layers": 24,
    "num_key_value_heads": 8,
    "num_return_sequences": 1,
    "output_attentions": false,
    "output_hidden_states": false,
    "output_scores": false,
    "pad_token_id": 2,
    "prefix": null,
    "pretraining_tp": 1,
    "problem_type": null,
    "pruned_heads": {},
    "remove_invalid_values": false,
    "repetition_penalty": 1.0,
    "return_dict": true,
    "return_dict_in_generate": false,
    "rms_norm_eps": 1e-05,
    "rope_scaling": {
      "factor": 2.0,
      "type": "dynamic"
    },
    "rope_theta": 1000000,
    "sep_token_id": null,
    "suppress_tokens": null,
    "task_specific_params": null,
    "temperature": 1.0,
    "tf_legacy_loss": false,
    "tie_encoder_decoder": false,
    "tie_word_embeddings": false,
    "tokenizer_class": null,
    "top_k": 50,
    "top_p": 1.0,
    "torch_dtype": "bfloat16",
    "torchscript": false,
    "transformers_version": "4.37.2",
    "typical_p": 1.0,
    "use_bfloat16": true,
    "use_cache": true,
    "vocab_size": 92553
  },
  "max_dynamic_patch": 12,
  "min_dynamic_patch": 1,
  "model_type": "internvl_chat",
  "ps_version": "v2",
  "select_layer": -1,
  "template": "internvl2_5",
  "torch_dtype": "bfloat16",
  "use_backbone_lora": 0,
  "use_llm_lora": 0,
  "use_thumbnail": true,
  "vision_config": {
    "architectures": [
      "InternVisionModel"
    ],
    "attention_dropout": 0.0,
    "drop_path_rate": 0.0,
    "dropout": 0.0,
    "hidden_act": "gelu",
    "hidden_size": 1024,
    "image_size": 448,
    "initializer_factor": 1.0,
    "initializer_range": 0.02,
    "intermediate_size": 4096,
    "layer_norm_eps": 1e-06,
    "model_type": "intern_vit_6b",
    "norm_type": "layer_norm",
    "num_attention_heads": 16,
    "num_channels": 3,
    "num_hidden_layers": 24,
    "output_attentions": false,
    "output_hidden_states": false,
    "patch_size": 14,
    "qk_normalization": false,
    "qkv_bias": true,
    "return_dict": true,
    "torch_dtype": "bfloat16",
    "transformers_version": "4.37.2",
    "use_bfloat16": false,
    "use_flash_attn": false
  }
}

@popoyaya

popoyaya commented Jan 2, 2025

+1. How can I fine-tune the model without flash-attn?

@popoyaya

popoyaya commented Jan 2, 2025

How can I turn flash-attention off? I can't find where to disable it. [screenshot]

+1. How can I fine-tune the model without flash-attn?

@genkerizer

[screenshot]
You need to install flash_attn after removing the code (shown in the screenshot) at InternVL/internvl_chat/internvl/train/internvl_chat_finetune.py.

@popoyaya

popoyaya commented Jan 5, 2025

[screenshot] You need to install flash_attn after removing the code (shown in the screenshot) at InternVL/internvl_chat/internvl/train/internvl_chat_finetune.py.

Thanks! However, I encounter an error when installing flash-attn and I am unable to resolve it. Is there a fine-tuning method that does not require flash-attn?

@snowattitude
Author

[screenshot] You need to install flash_attn after removing the code (shown in the screenshot) at InternVL/internvl_chat/internvl/train/internvl_chat_finetune.py.

Thanks! However, I encounter an error when installing flash-attn and I am unable to resolve it. Is there a fine-tuning method that does not require flash-attn?

I can't see the screenshot in your question, but you should install flash-attn from a prebuilt wheel (a .whl file).
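Concretely, a hedged sketch of the two usual install routes (the wheel filename below is only an example; pick one matching your Python, CUDA, and torch versions from the flash-attention GitHub releases page):

# Route 1: build from PyPI (needs nvcc from a CUDA toolkit on the machine);
# 2.3.6 is the version the InternVL install docs suggest, as far as I know.
pip install flash-attn==2.3.6 --no-build-isolation

# Route 2: install a prebuilt wheel so no compilation is needed.
# The filename is hypothetical; download the matching one from
# https://github.com/Dao-AILab/flash-attention/releases
pip install flash_attn-2.3.6+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl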

@genkerizer

[screenshot] You need to install flash_attn after removing the code (shown in the screenshot) at InternVL/internvl_chat/internvl/train/internvl_chat_finetune.py.

Thanks! However, I encounter an error when installing flash-attn and I am unable to resolve it. Is there a fine-tuning method that does not require flash-attn?

I can't see the screenshot in your question, but you should install flash-attn from a prebuilt wheel (a .whl file).

[screenshot]

@genkerizer

[screenshot] You need to install flash_attn after removing the code (shown in the screenshot) at InternVL/internvl_chat/internvl/train/internvl_chat_finetune.py.

Thanks! However, I encounter an error when installing flash-attn and I am unable to resolve it. Is there a fine-tuning method that does not require flash-attn?

To install flash_attn, please pull an NVIDIA devel image; see: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/cuda/tags

Example: nvcr.io/nvidia/cuda:12.6.3-cudnn-devel-ubuntu20.04
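As a concrete sketch of that approach (assuming Docker with the NVIDIA Container Toolkit is set up; /path/to/InternVL is a placeholder for your local clone):

# Pull the devel image suggested above and run it with GPU access,
# mounting the repo into the container.
docker pull nvcr.io/nvidia/cuda:12.6.3-cudnn-devel-ubuntu20.04
docker run --gpus all -it --rm \
    -v /path/to/InternVL:/workspace/InternVL \
    nvcr.io/nvidia/cuda:12.6.3-cudnn-devel-ubuntu20.04 bash

# Inside the container: install Python/pip and a PyTorch build matching the CUDA
# version; flash-attn can then compile because the devel image ships nvcc.
pip install flash-attn==2.3.6 --no-build-isolation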

@popoyaya

[screenshot] You need to install flash_attn after removing the code (shown in the screenshot) at InternVL/internvl_chat/internvl/train/internvl_chat_finetune.py.

Thanks! However, I encounter an error when installing flash-attn and I am unable to resolve it. Is there a fine-tuning method that does not require flash-attn?

To install flash_attn, please pull an NVIDIA devel image; see: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/cuda/tags

Example: nvcr.io/nvidia/cuda:12.6.3-cudnn-devel-ubuntu20.04

Thank you so much! That helps a lot!

@popoyaya

[screenshot] You need to install flash_attn after removing the code (shown in the screenshot) at InternVL/internvl_chat/internvl/train/internvl_chat_finetune.py.

Thanks! However, I encounter an error when installing flash-attn and I am unable to resolve it. Is there a fine-tuning method that does not require flash-attn?

To install flash_attn, please pull an NVIDIA devel image; see: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/cuda/tags

Example: nvcr.io/nvidia/cuda:12.6.3-cudnn-devel-ubuntu20.04

I have another question regarding the use of LoRA. Do I only need to set --freeze_backbone False? However, after training I found "use_backbone_lora": 0 in config.json. What else should I do if I want to fine-tune the visual encoder as well?

Thank you in advance for your help!
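A hedged sketch of the idea, assuming internvl_chat_finetune.py accepts use_backbone_lora / use_llm_lora launch arguments matching the config keys shown earlier (the value being the LoRA rank, with 0 meaning disabled); the flag names and values here are an assumption to verify against the script:

# Hypothetical excerpt of the LoRA launch command; the backbone LoRA rank is what
# would enable LoRA on the vision encoder in addition to the language model.
torchrun --nproc_per_node=${GPUS} internvl/train/internvl_chat_finetune.py \
    --freeze_backbone True \
    --use_backbone_lora 16 \
    --use_llm_lora 16 \
    "${OTHER_ARGS[@]}"  # placeholder for the remaining data/output/deepspeed arguments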
