Feature request
I want to use the Video-LLaVA model (https://huggingface.co/LanguageBind/Video-LLaVA-Pretrain-7B) with outlines for constrained generation.
Can Video-LLaVA be used with Transformers so that I can do this?
This is the connector I want to use for outlines: dottxt-ai/outlines#728
Motivation
Video-LLaVA is a good open-source model for video-based question answering, and compatibility with libraries such as outlines through Hugging Face would be beneficial.
Your contribution
I have tried loading Video-LLaVA with transformers, but I keep getting an error. I hope this can be implemented.
When trying to use https://huggingface.co/LanguageBind/Video-LLaVA-Pretrain-7B, I get this error:
python3 video.py
config.json: 100%|█████████████████████████| 1.12k/1.12k [00:00<00:00, 1.11MB/s]
You are using a model of type llava to instantiate a model of type llama. This is not supported for all configurations of models and can yield errors.
Traceback (most recent call last):
  File "/Users/kamakshiramamurthy/Desktop/GSoC/outline/video.py", line 5, in <module>
    pipe = pipeline("text-generation", model="LanguageBind/Video-LLaVA-Pretrain-7B")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamakshiramamurthy/miniconda3/envs/out_test/lib/python3.12/site-packages/transformers/pipelines/__init__.py", line 905, in pipeline
    framework, model = infer_framework_load_model(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamakshiramamurthy/miniconda3/envs/out_test/lib/python3.12/site-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
    raise ValueError(
ValueError: Could not load model LanguageBind/Video-LLaVA-Pretrain-7B with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>, <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'>). See the original errors:

while loading with AutoModelForCausalLM, an error is thrown:
Traceback (most recent call last):
  File "/Users/kamakshiramamurthy/miniconda3/envs/out_test/lib/python3.12/site-packages/transformers/pipelines/base.py", line 279, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamakshiramamurthy/miniconda3/envs/out_test/lib/python3.12/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.llava.configuration_llava.LlavaConfig'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BloomConfig, CamembertConfig, LlamaConfig, CodeGenConfig, CpmAntConfig, CTRLConfig, Data2VecTextConfig, ElectraConfig, ErnieConfig, FalconConfig, FuyuConfig, GemmaConfig, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, LlamaConfig, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MistralConfig, MixtralConfig, MptConfig, MusicgenConfig, MvpConfig, OpenLlamaConfig, OpenAIGPTConfig, OPTConfig, PegasusConfig, PersimmonConfig, PhiConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, Qwen2Config, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2Text2Config, StableLmConfig, TransfoXLConfig, TrOCRConfig, WhisperConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig.

while loading with LlamaForCausalLM, an error is thrown:
Traceback (most recent call last):
  File "/Users/kamakshiramamurthy/miniconda3/envs/out_test/lib/python3.12/site-packages/transformers/pipelines/base.py", line 279, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kamakshiramamurthy/miniconda3/envs/out_test/lib/python3.12/site-packages/transformers/modeling_utils.py", line 3234, in from_pretrained
    raise EnvironmentError(
OSError: LanguageBind/Video-LLaVA-Pretrain-7B does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.
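To make the first failure concrete: AutoModelForCausalLM chooses a model class by looking up the type of the checkpoint's config object in a registry, and LlavaConfig simply isn't registered for causal LM. A toy sketch of that dispatch pattern (a simplified stand-in with hypothetical names, not the actual transformers internals):

```python
# Toy sketch of how an "Auto" class maps config types to model classes.
# All names here are illustrative, not the real transformers code.

class LlamaConfig: ...
class LlavaConfig: ...

class LlamaForCausalLM:
    def __init__(self, config):
        self.config = config

# Registry of supported config classes, analogous in spirit to the
# config-to-model mapping behind AutoModelForCausalLM.
CAUSAL_LM_REGISTRY = {LlamaConfig: LlamaForCausalLM}

def auto_model_for_causal_lm(config):
    """Dispatch on the config's type; unregistered configs raise ValueError."""
    model_cls = CAUSAL_LM_REGISTRY.get(type(config))
    if model_cls is None:
        raise ValueError(
            f"Unrecognized configuration class {type(config).__name__} "
            "for this kind of AutoModel."
        )
    return model_cls(config)
```

Here `auto_model_for_causal_lm(LlamaConfig())` succeeds, while `auto_model_for_causal_lm(LlavaConfig())` raises, mirroring the first traceback. The second failure is separate: the checkpoint repo doesn't ship weights under any of the standard filenames listed in the OSError, so even the LlamaForCausalLM fallback cannot load it.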
The instructions for using the Video-LLaVA model are on the model page of the checkpoint. If these aren't working, you can open a discussion on the model page detailing the issues you're encountering.
There isn't currently a model implementation that is compatible with the transformers library, either on the Hub or in this repo (to the best of my knowledge). If you would like it to be added, you can open a new model request in this repo. I'd suggest opening a discussion on the model page on the Hub too, as the authors might be interested in contributing this.
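For background on what the outlines integration would provide once a transformers-compatible implementation exists: constrained generation works by masking the model's next-token scores so that only tokens keeping the output inside a target pattern remain eligible. A minimal toy sketch of that idea (standalone and simplified, not the outlines API):

```python
import math

def constrained_greedy(logits_fn, allowed_fn, max_steps=10):
    """Greedy decoding with a constraint mask.

    logits_fn(prefix)  -> dict mapping token -> score (stand-in for a model)
    allowed_fn(prefix) -> set of tokens permitted next (the constraint)
    """
    prefix = ""
    for _ in range(max_steps):
        allowed = allowed_fn(prefix)
        if not allowed:
            break  # constraint satisfied: no further tokens permitted
        scores = logits_fn(prefix)
        # Mask step: disallowed tokens are excluded entirely, then take
        # the highest-scoring token among those that remain.
        best = max(allowed, key=lambda t: scores.get(t, -math.inf))
        prefix += best
    return prefix
```

For example, constraining the output to exactly three digits means `allowed_fn` returns the digit set until the prefix has length 3 and an empty set afterwards; whatever the model "prefers", the result is guaranteed to match the pattern.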