[Bug]: text_embedding_inference. JSONDecodeError: Expecting value: line 1 column 1 (char 0) #14487
Comments
Hey @nitskoh! 👋 I'm here to help you with bugs, questions, and more while you wait for a human maintainer. Let's squash those bugs together! Here are some steps to troubleshoot and fix the issue:
Here is the relevant part of the code where the error is raised:

import httpx

def _call_api(self, texts: List[str]) -> List[List[float]]:
    headers = {"Content-Type": "application/json"}
    if self.auth_token is not None:
        if callable(self.auth_token):
            headers["Authorization"] = self.auth_token(self.base_url)
        else:
            headers["Authorization"] = self.auth_token
    json_data = {"inputs": texts, "truncate": self.truncate_text}
    with httpx.Client() as client:
        response = client.post(
            f"{self.base_url}/embed",
            headers=headers,
            json=json_data,
            timeout=self.timeout,
        )
    return response.json()

Additionally, if you are using the HuggingFaceEmbedding class:

class HuggingFaceEmbedding(BaseEmbedding):
    tokenizer_name: str = Field(description="Tokenizer name from HuggingFace.")
    max_length: int = Field(
        default=DEFAULT_HUGGINGFACE_LENGTH, description="Maximum length of input.", gt=0
    )
    pooling: Pooling = Field(default=None, description="Pooling strategy.")
    normalize: bool = Field(default=True, description="Normalize embeddings or not.")
    query_instruction: Optional[str] = Field(
        description="Instruction to prepend to query text."
    )
    text_instruction: Optional[str] = Field(
        description="Instruction to prepend to text."
    )
    cache_folder: Optional[str] = Field(
        description="Cache folder for huggingface files."
    )
    _model: Any = PrivateAttr()
    _tokenizer: Any = PrivateAttr()
    _device: str = PrivateAttr()

    def __init__(
        self,
        model_name: Optional[str] = None,
        tokenizer_name: Optional[str] = None,
        pooling: Optional[str] = None,
        max_length: Optional[int] = None,
        query_instruction: Optional[str] = None,
        text_instruction: Optional[str] = None,
        normalize: bool = True,
        model: Optional[Any] = None,
        tokenizer: Optional[Any] = None,
        embed_batch_size: int = DEFAULT_EMBED_BATCH_SIZE,
        cache_folder: Optional[str] = None,
        trust_remote_code: bool = False,
        device: Optional[str] = None,
        callback_manager: Optional[CallbackManager] = None,
    ):
        try:
            from transformers import AutoModel, AutoTokenizer
        except ImportError:
            raise ImportError(
                "HuggingFaceEmbedding requires transformers to be installed.\n"
                "Please install transformers with `pip install transformers`."
            )
        self._device = device or infer_torch_device()
        cache_folder = cache_folder or get_cache_dir()
        if model is None:  # Use model_name with AutoModel
            model_name = (
                model_name
                if model_name is not None
                else DEFAULT_HUGGINGFACE_EMBEDDING_MODEL
            )
            model = AutoModel.from_pretrained(
                model_name, cache_dir=cache_folder, trust_remote_code=trust_remote_code
            )
        elif model_name is None:  # Extract model_name from model
            model_name = model.name_or_path
        self._model = model.to(self._device)
        if tokenizer is None:  # Use tokenizer_name with AutoTokenizer
            tokenizer_name = (
                model_name or tokenizer_name or DEFAULT_HUGGINGFACE_EMBEDDING_MODEL
            )
            tokenizer = AutoTokenizer.from_pretrained(
                tokenizer_name, cache_dir=cache_folder
            )
        elif tokenizer_name is None:  # Extract tokenizer_name from model
            tokenizer_name = tokenizer.name_or_path
        self._tokenizer = tokenizer
        if max_length is None:
            try:
                max_length = int(self._model.config.max_position_embeddings)
            except AttributeError as exc:
                raise ValueError(
                    "Unable to find max_length from model config. Please specify max_length."
                ) from exc
        if not pooling:
            pooling = get_pooling_mode(model_name)
        try:
            pooling = Pooling(pooling)
        except ValueError as exc:
            raise NotImplementedError(
                f"Pooling {pooling} unsupported, please pick one in"
                f" {[p.value for p in Pooling]}."
            ) from exc
        super().__init__(
            embed_batch_size=embed_batch_size,
            callback_manager=callback_manager,
            model_name=model_name,
            tokenizer_name=tokenizer_name,
            max_length=max_length,
            pooling=pooling,
            normalize=normalize,
            query_instruction=query_instruction,
            text_instruction=text_instruction,
        )

Ensure that the model and tokenizer are correctly set up and that the ...
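For context on the error in the title: "Expecting value: line 1 column 1 (char 0)" is the message Python's JSON decoder gives when the body is empty or does not start with a JSON value, for example an HTML error page. The `_call_api` code quoted above calls `response.json()` without first checking the HTTP status, so any such reply surfaces as a `JSONDecodeError`. The sketch below shows one way the body could be decoded with a more readable failure. `parse_embed_response` is a hypothetical helper written for illustration and is not part of LlamaIndex; it assumes a successful reply is HTTP 200 with a JSON list of embedding vectors.

```python
import json
from typing import List


def parse_embed_response(status_code: int, body: str) -> List[List[float]]:
    """Decode an embedding response body, failing with a readable message.

    Hypothetical helper: assumes success is HTTP 200 with a JSON list of vectors.
    """
    if status_code != 200:
        # Show the start of the body so an HTML error page or proxy message is visible.
        raise RuntimeError(
            f"Embedding server returned HTTP {status_code}: {body[:200]!r}"
        )
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        # An empty or non-JSON body is what triggers "Expecting value: line 1 column 1".
        raise RuntimeError(
            f"Embedding server returned a non-JSON body: {body[:200]!r}"
        ) from exc
```

With an `httpx` response object this could be called as `parse_embed_response(response.status_code, response.text)`, so the log shows what the server actually sent back.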
This does not help. As I am using your example ...
It's not paid, but you need to deploy it somewhere before you can use it (local Docker, AWS, etc.)
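Before running the notebook, it may help to confirm that something is actually listening at the address the embedding client points to. The sketch below is a minimal TCP reachability check using only the standard library. `server_is_listening` is a hypothetical helper, and a successful connection only shows that some process accepts connections on that port, not that it is a working embedding server.

```python
import socket
from urllib.parse import urlparse


def server_is_listening(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the host and port in base_url succeeds."""
    parsed = urlparse(base_url)
    if parsed.hostname is None:
        raise ValueError(f"Cannot extract a host from {base_url!r}")
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS failures.
        return False
```

For example, `server_is_listening("http://127.0.0.1:8080")` (an assumed local address) would return False if no server has been deployed there yet, which would be consistent with the error reported in this issue.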
Thanks for the note, @logan-markewich. Which of the deployment methods have you used?
Bug Description
I tried running the code snippet in this example:
https://docs.llamaindex.ai/en/stable/examples/embeddings/text_embedding_inference/
I got the following error:
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Version
Using the Colab notebook provided.
Steps to Reproduce
Running the Colab notebook here:
https://docs.llamaindex.ai/en/stable/examples/embeddings/text_embedding_inference/
Relevant Logs/Tracebacks
No response