
[Bug]: text_embedding_inference. JSONDecodeError: Expecting value: line 1 column 1 (char 0) #14487

Open
nitskoh opened this issue Jul 1, 2024 · 5 comments
Labels
bug (Something isn't working) · triage (Issue needs to be triaged/prioritized)

Comments


nitskoh commented Jul 1, 2024

Bug Description

I tried running the code snippet in this example
https://docs.llamaindex.ai/en/stable/examples/embeddings/text_embedding_inference/
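
For reference, the snippet I ran is roughly the following (parameter values are the ones from the linked notebook and may differ slightly from the current docs):

from llama_index.embeddings.text_embeddings_inference import TextEmbeddingsInference

embed_model = TextEmbeddingsInference(
    model_name="BAAI/bge-large-en-v1.5",  # model expected to be served by the TEI endpoint
    timeout=60,
    embed_batch_size=10,
)

embeddings = embed_model.get_text_embedding("Hello World!")
print(len(embeddings))
print(embeddings[:5])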

I got the following error

JSONDecodeError                           Traceback (most recent call last)
<ipython-input-26-a7c0f9477fcc> in <cell line: 13>()
     11 
     12 
---> 13 embeddings = embed_model.get_text_embedding("Hello World!")
     14 print(len(embeddings))
     15 print(embeddings[:5])

... 8 frames omitted ...
/usr/lib/python3.10/json/decoder.py in raw_decode(self, s, idx)
    353             obj, end = self.scan_once(s, idx)
    354         except StopIteration as err:
--> 355             raise JSONDecodeError("Expecting value", s, err.value) from None
    356         return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Version

Using the colab notebook provided

Steps to Reproduce

Running the colab notebook here

https://docs.llamaindex.ai/en/stable/examples/embeddings/text_embedding_inference/

Relevant Logs/Tracebacks

No response

nitskoh added the bug and triage labels on Jul 1, 2024

dosubot bot commented Jul 1, 2024

Hey @nitskoh! 👋 I'm here to help you with bugs, questions, and more while you wait for a human maintainer. Let's squash those bugs together!

To resolve the JSONDecodeError encountered when running the text embedding inference code snippet from the LlamaIndex example, ensure that the model name you are using is correct and supported by Predibase. The error suggests that the model deployment ID might not be found at Predibase.

Here are some steps to troubleshoot and fix the issue:

  1. Verify Model Name: Ensure that the model_name you are using is correct and listed in the Predibase supported models. You can refer to the Predibase documentation for the list of supported models: Predibase Models.

  2. Check API Token: Ensure that the PREDIBASE_API_TOKEN is correctly set and valid.

  3. Update Base URL: Ensure that the base_url is correctly set to the endpoint of the text embeddings service. The default URL is set to http://127.0.0.1:8080, but it might need to be updated to match your deployment configuration.
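
For example, a minimal configuration sketch (the base_url value below is a placeholder; model_name must match the model your text-embeddings-inference server was launched with):

from llama_index.embeddings.text_embeddings_inference import TextEmbeddingsInference

embed_model = TextEmbeddingsInference(
    model_name="BAAI/bge-large-en-v1.5",  # must match the model the server is serving
    base_url="http://127.0.0.1:8080",     # point this at your deployed endpoint
    auth_token=None,                      # set if your endpoint requires authentication
    timeout=60,
)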

Here is the relevant part of the code where the error is raised:

from typing import List

import httpx

def _call_api(self, texts: List[str]) -> List[List[float]]:
    # Build request headers; Authorization is only added when auth_token is set.
    headers = {"Content-Type": "application/json"}
    if self.auth_token is not None:
        if callable(self.auth_token):
            headers["Authorization"] = self.auth_token(self.base_url)
        else:
            headers["Authorization"] = self.auth_token
    json_data = {"inputs": texts, "truncate": self.truncate_text}

    # POST to the /embed endpoint of the text embeddings service.
    with httpx.Client() as client:
        response = client.post(
            f"{self.base_url}/embed",
            headers=headers,
            json=json_data,
            timeout=self.timeout,
        )

    # If the server returns an empty or non-JSON body (e.g. an error page),
    # this call raises the JSONDecodeError seen in the traceback above.
    return response.json()
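
If the error persists, a quick way to see why response.json() cannot decode the body is to inspect the raw response first, for example with a standalone debugging sketch like this (the URL is a placeholder for your own endpoint):

import httpx

with httpx.Client() as client:
    response = client.post(
        "http://127.0.0.1:8080/embed",  # replace with your base_url + /embed
        headers={"Content-Type": "application/json"},
        json={"inputs": ["Hello World!"], "truncate": True},
        timeout=60,
    )

print(response.status_code)  # e.g. 404 or 502 suggests the endpoint is not reachable
print(response.text[:500])   # the raw body that json() is failing to parse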

Additionally, if you are using the HuggingFaceEmbedding class from LlamaIndex, ensure that the model and tokenizer are correctly initialized. Here is a snippet from the HuggingFaceEmbedding class:

class HuggingFaceEmbedding(BaseEmbedding):
    tokenizer_name: str = Field(description="Tokenizer name from HuggingFace.")
    max_length: int = Field(
        default=DEFAULT_HUGGINGFACE_LENGTH, description="Maximum length of input.", gt=0
    )
    pooling: Pooling = Field(default=None, description="Pooling strategy.")
    normalize: bool = Field(default=True, description="Normalize embeddings or not.")
    query_instruction: Optional[str] = Field(
        description="Instruction to prepend to query text."
    )
    text_instruction: Optional[str] = Field(
        description="Instruction to prepend to text."
    )
    cache_folder: Optional[str] = Field(
        description="Cache folder for huggingface files."
    )

    _model: Any = PrivateAttr()
    _tokenizer: Any = PrivateAttr()
    _device: str = PrivateAttr()

    def __init__(
        self,
        model_name: Optional[str] = None,
        tokenizer_name: Optional[str] = None,
        pooling: Optional[str] = None,
        max_length: Optional[int] = None,
        query_instruction: Optional[str] = None,
        text_instruction: Optional[str] = None,
        normalize: bool = True,
        model: Optional[Any] = None,
        tokenizer: Optional[Any] = None,
        embed_batch_size: int = DEFAULT_EMBED_BATCH_SIZE,
        cache_folder: Optional[str] = None,
        trust_remote_code: bool = False,
        device: Optional[str] = None,
        callback_manager: Optional[CallbackManager] = None,
    ):
        try:
            from transformers import AutoModel, AutoTokenizer
        except ImportError:
            raise ImportError(
                "HuggingFaceEmbedding requires transformers to be installed.\n"
                "Please install transformers with `pip install transformers`."
            )

        self._device = device or infer_torch_device()

        cache_folder = cache_folder or get_cache_dir()
        if model is None:  # Use model_name with AutoModel
            model_name = (
                model_name
                if model_name is not None
                else DEFAULT_HUGGINGFACE_EMBEDDING_MODEL
            )
            model = AutoModel.from_pretrained(
                model_name, cache_dir=cache_folder, trust_remote_code=trust_remote_code
            )
        elif model_name is None:  # Extract model_name from model
            model_name = model.name_or_path
        self._model = model.to(self._device)

        if tokenizer is None:  # Use tokenizer_name with AutoTokenizer
            tokenizer_name = (
                model_name or tokenizer_name or DEFAULT_HUGGINGFACE_EMBEDDING_MODEL
            )
            tokenizer = AutoTokenizer.from_pretrained(
                tokenizer_name, cache_dir=cache_folder
            )
        elif tokenizer_name is None:  # Extract tokenizer_name from model
            tokenizer_name = tokenizer.name_or_path
        self._tokenizer = tokenizer

        if max_length is None:
            try:
                max_length = int(self._model.config.max_position_embeddings)
            except AttributeError as exc:
                raise ValueError(
                    "Unable to find max_length from model config. Please specify max_length."
                ) from exc

        if not pooling:
            pooling = get_pooling_mode(model_name)
        try:
            pooling = Pooling(pooling)
        except ValueError as exc:
            raise NotImplementedError(
                f"Pooling {pooling} unsupported, please pick one in"
                f" {[p.value for p in Pooling]}."
            ) from exc
        super().__init__(
            embed_batch_size=embed_batch_size,
            callback_manager=callback_manager,
            model_name=model_name,
            tokenizer_name=tokenizer_name,
            max_length=max_length,
            pooling=pooling,
            normalize=normalize,
            query_instruction=query_instruction,
            text_instruction=text_instruction,
        )

Ensure that the model and tokenizer are correctly set up and that the max_length and pooling parameters are properly configured. This should help in resolving the JSONDecodeError [1][2][3][4].
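
As a rough local-fallback sketch (assuming a recent llama-index release where the class is exposed from llama_index.embeddings.huggingface), these parameters can be set explicitly:

from llama_index.embeddings.huggingface import HuggingFaceEmbedding

embed_model = HuggingFaceEmbedding(
    model_name="BAAI/bge-small-en-v1.5",  # any HuggingFace embedding model
    max_length=512,                       # avoids the max_position_embeddings lookup
    pooling="cls",                        # or "mean", depending on the model
)
print(embed_model.get_text_embedding("Hello World!")[:5])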

To continue talking to Dosu, mention @dosu.

nitskoh (Author) commented Jul 1, 2024

This does not help, as I am using your example.

@SID262000

@nitskoh Looks like the very first step is to deploy the embedding server, which sounds like a paid option as described here.
cc: @namin @tobi

@logan-markewich (Collaborator)

It's not paid, but you need to deploy it somewhere before you can use it (local Docker, AWS, etc.).

@SID262000

Thanks for the note, @logan-markewich. Which of the deployment methods have you used?
