[Bug]: Model inference on Windows LNL NPU for openai/clip-vit-large-patch14 is not working #28171

Open
3 tasks done
azhuvath opened this issue Dec 20, 2024 · 1 comment
Labels
bug (Something isn't working), category: NPU (OpenVINO NPU plugin), support_request

Comments

@azhuvath

OpenVINO Version

2024.6

Operating System

Ubuntu 20.04 (LTS)

Device used for inference

NPU

Framework

None

Model used

openai/clip-vit-large-patch14

Issue description

Model inference on Windows LNL NPU for openai/clip-vit-large-patch14 is not working. The error observed is as follows.

[ERROR] 05:26:28.301 [vpux-compiler] Got Diagnostic at loc(fused<{name = "__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution", type = "Convolution"}>["__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution"]) : Channels count of input tensor shape and filter shape must be the same: -9223372036854775808 != 3

loc(fused<{name = "__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution", type = "Convolution"}>["__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution"]): error: Channels count of input tensor shape and filter shape must be the same: -9223372036854775808 != 3
LLVM ERROR: Failed to infer result type(s).
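
A note on the number in the message: -9223372036854775808 is the minimum signed 64-bit integer (-2^63). MLIR-based compilers commonly use that value as a placeholder for a dimension whose size is unknown, so the diagnostic likely means the channel dimension of the convolution input is dynamic rather than literally negative. That reading is an assumption; the log itself does not confirm it. A quick check of the arithmetic:

```python
# The value printed by the compiler diagnostic.
reported_channels = -9223372036854775808

# Smallest value representable in a signed 64-bit integer.
INT64_MIN = -(2 ** 63)

# The reported "channel count" is exactly INT64_MIN, which looks like a
# sentinel for an unknown (dynamic) dimension rather than a real size.
is_sentinel = reported_channels == INT64_MIN
print(is_sentinel)
```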

Step-by-step reproduction

Create Environment

python -m venv npu_env
./npu_env/Scripts/activate
python -m pip install --upgrade pip
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install pillow scikit-learn requests transformers openvino

Code to execute. The device below is already set to NPU.

import requests
import numpy as np
import openvino as ov
from scipy.special import softmax
from PIL import Image
from pathlib import Path
from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

classes = ["a photo of a cat", "a photo of a dog"]
inputs = processor(text=classes, images=image, return_tensors="pt", padding=True)

outputs = model(**inputs)
logits_per_image = outputs.logits_per_image # this is the image-text similarity score
probs = logits_per_image.softmax(dim=1) # we can take the softmax to get the label probabilities
predicted_idx = probs.argmax().item()
print(classes[predicted_idx])

ov_model_path = "clip-vit-large-patch14-fp32.xml"
fp32_model_path = Path(ov_model_path)
model.config.torchscript = True

ov_model = ov.convert_model(model, example_input=dict(inputs))
ov.save_model(ov_model, fp32_model_path, compress_to_fp16=False)

device = 'NPU'
core = ov.Core()
compiled_model = core.compile_model(ov_model_path, device)
inputs = dict(inputs)
outputs = compiled_model(inputs)[0]
probs = softmax(outputs, axis=1)
[predicted_idx] = np.argmax(probs, axis=1)
print(classes[predicted_idx])
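
One thing that may be worth trying, assuming the failure comes from dynamic input shapes (the model is converted from an example input, and the NPU plugin generally expects static shapes): pin every input to the shape of the example tensors before compiling. This is a sketch, not a confirmed fix for this issue. `static_shapes_from_example` and `compile_with_static_shapes` are hypothetical helper names, and the code assumes the converted model's input names match the keys of `inputs` (for example `input_ids`, `pixel_values`, `attention_mask`).

```python
def static_shapes_from_example(example_inputs):
    """Map each input name to the concrete shape of its example tensor."""
    return {name: [int(d) for d in tensor.shape]
            for name, tensor in example_inputs.items()}


def compile_with_static_shapes(model_path, example_inputs, device="NPU"):
    """Read the IR, fix all input shapes, then compile for the device."""
    import openvino as ov  # imported here so the helper above has no dependency

    core = ov.Core()
    ov_model = core.read_model(model_path)

    # Print each input's partial shape to see which ones are dynamic.
    for port in ov_model.inputs:
        print(port.get_any_name(), port.get_partial_shape())

    # Assumption: input names in the IR equal the keys of example_inputs.
    ov_model.reshape(static_shapes_from_example(example_inputs))
    return core.compile_model(ov_model, device)
```

In the script above this would be called as `compile_with_static_shapes(ov_model_path, dict(inputs))` in place of the `core.compile_model(ov_model_path, device)` line.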

Relevant log output

[ERROR] 05:26:28.301 [vpux-compiler] Got Diagnostic at loc(fused<{name = "__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution", type = "Convolution"}>["__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution"]) : Channels count of input tensor shape and filter shape must be the same: -9223372036854775808 != 3
loc(fused<{name = "__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution", type = "Convolution"}>["__module.vision_model.embeddings.patch_embedding/aten::_convolution/Convolution"]): error: Channels count of input tensor shape and filter shape must be the same: -9223372036854775808 != 3
LLVM ERROR: Failed to infer result type(s).

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
azhuvath added the bug (Something isn't working) and support_request labels on Dec 20, 2024
ilya-lavrenov added the category: NPU (OpenVINO NPU plugin) label on Dec 20, 2024
@mlyashko

A new version of the Linux NPU driver is available. Please use this driver: https://github.com/intel/linux-npu-driver/releases/tag/v1.10.1
