Hi,
I am trying to fine-tune llava-onevision-qwen2-0.5b-si on my own dataset. When I run inference with the fine-tuned model, the following warning appears: "Some weights of LlavaQwenForCausalLM were not initialized from the model checkpoint at /mnt/dolphinfs/ssd_pool/docker/user/hadoop-perception-zw04/baiyan02/llava_log/llava-onevision-qwen2-0.5b-si-test and are newly initialized: ['lm_head.weight']." The inference output is also garbled. However, the original llava-onevision-qwen2-0.5b-si model works normally when I test it on its own.
My training script:
META_NAME='test'
OUTPUT_DIR=llava-onevision-qwen2-0.5b-si-$META_NAME
LLM_VERSION="llava-onevision-qwen2-0.5b-si"
VISION_MODEL_VERSION="llava-data/llava/siglip"
DATA_PATH="test.json"
PROMPT_VERSION="qwen_1_5"
LLM_VERSION_CLEAN="${LLM_VERSION//\//_}"
VISION_MODEL_VERSION_CLEAN="${VISION_MODEL_VERSION//\//_}"
RUN_NAME="llava-onevision-${VISION_MODEL_VERSION_CLEAN}-${LLM_VERSION_CLEAN}-${META_NAME}"
echo "MID_RUN_NAME: ${RUN_NAME}"
torchrun --nproc_per_node=8 --nnodes=1 \
    llava/train/train_mem.py \
    --deepspeed scripts/zero3.json \
    --model_name_or_path ${LLM_VERSION} \
    --version $PROMPT_VERSION \
    --data_path ${DATA_PATH} \
    --image_folder "" \
    --mm_tunable_parts="mm_vision_tower,mm_mlp_adapter,mm_language_model" \
    --mm_vision_tower_lr=2e-6 \
    --vision_tower ${VISION_MODEL_VERSION} \
    --mm_projector_type mlp2x_gelu \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --group_by_modality_length True \
    --image_aspect_ratio anyres_max_9 \
    --image_grid_pinpoints "(1x1),...,(6x6)" \
    --mm_patch_merge_type spatial_unpad \
    --bf16 True \
    --run_name $RUN_NAME \
    --output_dir ${OUTPUT_DIR} \
    --num_train_epochs 1 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 2 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 1000 \
    --save_total_limit 1 \
    --learning_rate 1e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 32768 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --report_to none \
    --torch_compile True \
    --torch_compile_backend "inductor" \
    --dataloader_drop_last True \
    --frames_upbound 32 \
    --attn_implementation sdpa
During inference, I load the model with:
model_name = "llava_qwen"
tokenizer, model, image_processor, context_len = load_pretrained_model(model_path, None, model_name, attn_implementation="sdpa", device_map="auto", multimodal=True)
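To narrow this down, it may help to check whether lm_head.weight was actually written to the fine-tuned checkpoint, and whether its config.json marks the input and output embeddings as tied. The following is a minimal sketch and not part of the LLaVA code: the helper names are hypothetical, and it assumes the checkpoint directory holds .safetensors files (it finds nothing in .bin checkpoints) next to a config.json. It reads only the JSON header of each file, so no weights are loaded.

```python
import json
import struct
from pathlib import Path


def safetensors_keys(path):
    """Return the tensor names in a .safetensors file by reading only its JSON header."""
    with open(path, "rb") as f:
        # The file starts with an 8-byte little-endian header length,
        # followed by that many bytes of JSON keyed by tensor name.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return {name for name in header if name != "__metadata__"}


def checkpoint_keys(ckpt_dir):
    """Collect tensor names from every *.safetensors file in a checkpoint directory."""
    keys = set()
    for shard in sorted(Path(ckpt_dir).glob("*.safetensors")):
        keys |= safetensors_keys(shard)
    return keys


def diagnose_lm_head(ckpt_dir):
    """Hypothetical helper: report whether lm_head.weight was saved and whether
    config.json declares tied word embeddings (None if the key is absent)."""
    keys = checkpoint_keys(ckpt_dir)
    config = json.loads((Path(ckpt_dir) / "config.json").read_text())
    return {
        "has_lm_head": "lm_head.weight" in keys,
        "has_embed_tokens": any(k.endswith("embed_tokens.weight") for k in keys),
        "tie_word_embeddings": config.get("tie_word_embeddings"),
    }
```

Calling diagnose_lm_head(model_path) on both the fine-tuned directory and the original llava-onevision-qwen2-0.5b-si directory and comparing the two results would show whether the saved weights or the tie_word_embeddings setting differ between them.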
Thank you very much for your assistance.