
[Bug]: text_embedding_inference. JSONDecodeError: Expecting value: line 1 column 1 (char 0) #14487

Open
nitskoh opened this issue Jul 1, 2024 · 5 comments
Labels
bug (Something isn't working) · triage (Issue needs to be triaged/prioritized)

Comments


nitskoh commented Jul 1, 2024

Bug Description

I tried running the code snippet in this example
https://docs.llamaindex.ai/en/stable/examples/embeddings/text_embedding_inference/
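
For reference, the snippet I ran is roughly the following (parameter values are the ones from the linked notebook and may differ slightly from the current docs):

from llama_index.embeddings.text_embeddings_inference import TextEmbeddingsInference

embed_model = TextEmbeddingsInference(
    model_name="BAAI/bge-large-en-v1.5",  # model expected to be served by the TEI endpoint
    timeout=60,
    embed_batch_size=10,
)

embeddings = embed_model.get_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])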

I got the following error

JSONDecodeError                           Traceback (most recent call last)
<ipython-input-26-a7c0f9477fcc> in <cell line: 13>()
     11 
     12 
---> 13 embeddings = embed_model.get_text_embedding("Hello World!")
     14 print(len(embeddings))
     15 print(embeddings[:5])

... 8 frames omitted ...
/usr/lib/python3.10/json/decoder.py in raw_decode(self, s, idx)
    353             obj, end = self.scan_once(s, idx)
    354         except StopIteration as err:
--> 355             raise JSONDecodeError("Expecting value", s, err.value) from None
    356         return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Version

Using the colab notebook provided

Steps to Reproduce

Running the colab notebook here

https://docs.llamaindex.ai/en/stable/examples/embeddings/text_embedding_inference/

Relevant Logs/Tracebacks

No response

nitskoh added the bug and triage labels on Jul 1, 2024

dosubot bot commented Jul 1, 2024

Hey @nitskoh! 👋 I'm here to help you with bugs, questions, and more while you wait for a human maintainer. Let's squash those bugs together!

To resolve the JSONDecodeError encountered when running the text embedding inference code snippet from the LlamaIndex example, ensure that the model name you are using is correct and supported by Predibase. The error suggests that the model deployment ID might not be found at Predibase.

Here are some steps to troubleshoot and fix the issue:

  1. Verify Model Name: Ensure that the model_name you are using is correct and listed in the Predibase supported models. You can refer to the Predibase documentation for the list of supported models: Predibase Models.

  2. Check API Token: Ensure that the PREDIBASE_API_TOKEN is correctly set and valid.

  3. Update Base URL: Ensure that the base_url is correctly set to the endpoint of the text embeddings service. The default URL is set to http://127.0.0.1:8080, but it might need to be updated to match your deployment configuration.
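
For example, a minimal configuration sketch (the base_url value below is a placeholder; model_name must match the model your text-embeddings-inference server was launched with):

from llama_index.embeddings.text_embeddings_inference import TextEmbeddingsInference

embed_model = TextEmbeddingsInference(
    model_name="BAAI/bge-large-en-v1.5",  # must match the model the server is serving
    base_url="http://127.0.0.1:8080",     # point this at your deployed endpoint
    auth_token=None,                      # set if your endpoint requires authentication
    timeout=60,
)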

Here is the relevant part of the code where the error is raised:

from typing import List

import httpx

def _call_api(self, texts: List[str]) -> List[List[float]]:
    # Build request headers; Authorization is only added when auth_token is set.
    headers = {"Content-Type": "application/json"}
    if self.auth_token is not None:
        if callable(self.auth_token):
            headers["Authorization"] = self.auth_token(self.base_url)
        else:
            headers["Authorization"] = self.auth_token
    json_data = {"inputs": texts, "truncate": self.truncate_text}

    # POST to the /embed endpoint of the text embeddings service.
    with httpx.Client() as client:
        response = client.post(
            f"{self.base_url}/embed",
            headers=headers,
            json=json_data,
            timeout=self.timeout,
        )

    # If the server returns an empty or non-JSON body (e.g. an error page),
    # this call raises the JSONDecodeError seen in the traceback above.
    return response.json()
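
If the error persists, a quick way to see why response.json() cannot decode the body is to inspect the raw response first, for example with a standalone debugging sketch like this (the URL is a placeholder for your own endpoint):

import httpx

with httpx.Client() as client:
    response = client.post(
        "http://127.0.0.1:8080/embed",  # replace with your base_url + /embed
        headers={"Content-Type": "application/json"},
        json={"inputs": ["Hello World!"], "truncate": True},
        timeout=60,
    )

print(response.status_code)  # e.g. 404 or 502 suggests the endpoint is not reachable
print(response.text[:500])   # the raw body that json() is failing to parse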

Additionally, if you are using the HuggingFaceEmbedding class from LlamaIndex, ensure that the model and tokenizer are correctly initialized. Here is a snippet from the HuggingFaceEmbedding class:

class HuggingFaceEmbedding(BaseEmbedding):
    tokenizer_name: str = Field(description="Tokenizer name from HuggingFace.")
    max_length: int = Field(
        default=DEFAULT_HUGGINGFACE_LENGTH, description="Maximum length of input.", gt=0
    )
    pooling: Pooling = Field(default=None, description="Pooling strategy.")
    normalize: bool = Field(default=True, description="Normalize embeddings or not.")
    query_instruction: Optional[str] = Field(
        description="Instruction to prepend to query text."
    )
    text_instruction: Optional[str] = Field(
        description="Instruction to prepend to text."
    )
    cache_folder: Optional[str] = Field(
        description="Cache folder for huggingface files."
    )

    _model: Any = PrivateAttr()
    _tokenizer: Any = PrivateAttr()
    _device: str = PrivateAttr()

    def __init__(
        self,
        model_name: Optional[str] = None,
        tokenizer_name: Optional[str] = None,
        pooling: Optional[str] = None,
        max_length: Optional[int] = None,
        query_instruction: Optional[str] = None,
        text_instruction: Optional[str] = None,
        normalize: bool = True,
        model: Optional[Any] = None,
        tokenizer: Optional[Any] = None,
        embed_batch_size: int = DEFAULT_EMBED_BATCH_SIZE,
        cache_folder: Optional[str] = None,
        trust_remote_code: bool = False,
        device: Optional[str] = None,
        callback_manager: Optional[CallbackManager] = None,
    ):
        try:
            from transformers import AutoModel, AutoTokenizer
        except ImportError:
            raise ImportError(
                "HuggingFaceEmbedding requires transformers to be installed.\n"
                "Please install transformers with `pip install transformers`."
            )

        self._device = device or infer_torch_device()

        cache_folder = cache_folder or get_cache_dir()
        if model is None:  # Use model_name with AutoModel
            model_name = (
                model_name
                if model_name is not None
                else DEFAULT_HUGGINGFACE_EMBEDDING_MODEL
            )
            model = AutoModel.from_pretrained(
                model_name, cache_dir=cache_folder, trust_remote_code=trust_remote_code
            )
        elif model_name is None:  # Extract model_name from model
            model_name = model.name_or_path
        self._model = model.to(self._device)

        if tokenizer is None:  # Use tokenizer_name with AutoTokenizer
            tokenizer_name = (
                model_name or tokenizer_name or DEFAULT_HUGGINGFACE_EMBEDDING_MODEL
            )
            tokenizer = AutoTokenizer.from_pretrained(
                tokenizer_name, cache_dir=cache_folder
            )
        elif tokenizer_name is None:  # Extract tokenizer_name from model
            tokenizer_name = tokenizer.name_or_path
        self._tokenizer = tokenizer

        if max_length is None:
            try:
                max_length = int(self._model.config.max_position_embeddings)
            except AttributeError as exc:
                raise ValueError(
                    "Unable to find max_length from model config. Please specify max_length."
                ) from exc

        if not pooling:
            pooling = get_pooling_mode(model_name)
        try:
            pooling = Pooling(pooling)
        except ValueError as exc:
            raise NotImplementedError(
                f"Pooling {pooling} unsupported, please pick one in"
                f" {[p.value for p in Pooling]}."
            ) from exc
        super().__init__(
            embed_batch_size=embed_batch_size,
            callback_manager=callback_manager,
            model_name=model_name,
            tokenizer_name=tokenizer_name,
            max_length=max_length,
            pooling=pooling,
            normalize=normalize,
            query_instruction=query_instruction,
            text_instruction=text_instruction,
        )

Ensure that the model and tokenizer are correctly set up and that the max_length and pooling parameters are properly configured. This should help in resolving the JSONDecodeError [1][2][3][4].
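
As a rough local-fallback sketch (assuming a recent llama-index release where the class is exposed from llama_index.embeddings.huggingface), these parameters can be set explicitly:

from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5",  # any HuggingFace embedding model
    max_length=512,                       # avoids the max_position_embeddings lookup
    pooling="cls",                        # or "mean", depending on the model
)
print(embed_model.get_text_embedding("Hello World!")[:5])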

To continue talking to Dosu, mention @dosu.

nitskoh (Author) commented Jul 1, 2024

This does not help, as I am using your example.

@SID262000

@nitskoh Looks like the very first step is to deploy the embedding server, which sounds like a paid option as described here.
cc: @namin @tobi

@logan-markewich (Collaborator)

It's not paid, but you need to deploy it somewhere before you can use it (local Docker, AWS, etc.).

@SID262000

Thanks for the note, @logan-markewich. Which of the deployment methods have you used?
