Json Decode Error #267

Open
Aravind4525 opened this issue Nov 6, 2024 · 27 comments
Labels
bug bug!

Comments

@Aravind4525

Traceback (most recent call last):
  File "C:\Users\Admin\PycharmProjects\New_PiCourseSearch\venv\Lib\site-packages\requests\models.py", line 971, in json
    return complexjson.loads(self.text, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\json\decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 11956)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Admin\PycharmProjects\New_PiCourseSearch\venv\Lib\site-packages\requests\models.py", line 975, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Extra data: line 2 column 1 (char 11956)

On my local machine, I get this error at this line: chatbot = hugchat.ChatBot(cookies=cookies.get_dict())
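
For readers hitting the same traceback, here is a minimal sketch of how "Extra data" arises, assuming the response body contains more than one JSON document. The `body` string is a made-up stand-in, not the real server response:

```python
import json

# Hypothetical stand-in for the response body: two JSON documents, one per line.
body = '{"type": "data", "nodes": []}\n{"type": "chunk", "id": 1, "data": []}'

try:
    json.loads(body)
except json.JSONDecodeError as exc:
    # exc.pos is the offset where the unexpected second document starts.
    error_message = exc.msg
    error_line, error_column = exc.lineno, exc.colno
    # Everything before exc.pos is a complete JSON document on its own.
    first_document = json.loads(body[:exc.pos])
    print(f"{error_message}: line {error_line} column {error_column}")
```

The "line 2 column 1" in the report above fits this pattern: the first document parsed fine and the parser then found more content on the next line.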

Aravind4525 added the bug label Nov 6, 2024
@Aravind4525
Author

When will it be fixed?

@swimminsparrow

Same issue here

@ghost

ghost commented Nov 6, 2024

I have the same problem. When will this be fixed?

@SPMrik

SPMrik commented Nov 6, 2024

Same issue here—any update?

@manuthecoder
Contributor

also experiencing this

@SPMrik

SPMrik commented Nov 7, 2024

Hi all, I solved the JSON parsing issue in get_remote_llms (in hugchat.py) by implementing chunk-by-chunk parsing.

@Aravind4525
Author

Aravind4525 commented Nov 7, 2024

Hi all, I solved the JSON parsing issue in get_remote_llms (in hugchat.py) by implementing chunk-by-chunk parsing

How did you do it?

@manuthecoder
Contributor

@Aravind4525 AI-based solution:
https://chatgpt.com/share/672c644a-30e4-800e-a06c-31c3498b4381

Tested this and it works 🎉

@Savit-Raj

I'm facing the same error. It says "Extra data: line 2 column 1 (char 11877)".

@Soulter
Owner

Soulter commented Nov 7, 2024

Hello, the bug has been fixed by @manuthecoder. Please update to v0.4.12 for the solution. Thanks, @manuthecoder.

@helqasem

helqasem commented Nov 7, 2024

Upgrading to v0.4.12 didn't work for me.

There are three lines in the r.text response. When get_remote_llms loops through "lines", the first line works, but the 2nd line throws an exception because there is no "nodes" element in that chunk.
Error:
An error occurred while parsing: 'nodes'

Any ideas on how to resolve?
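
One way to tolerate chunks that lack a "nodes" element is to skip them instead of indexing into every line. A minimal sketch, assuming each line of the body is a standalone JSON document; `first_chunk_with_nodes` is a hypothetical helper and not part of hugchat:

```python
import json


def first_chunk_with_nodes(text):
    """Return the first JSON document in `text` that has a "nodes" key, else None."""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines between documents
        chunk = json.loads(line)
        # Later chunks may be objects without "nodes"; skip those.
        if isinstance(chunk, dict) and "nodes" in chunk:
            return chunk
    return None


# Made-up three-line body shaped like the one described above.
sample = (
    '{"type": "data", "nodes": [{"data": [1, 2]}]}\n'
    '{"type": "chunk", "id": 1, "data": []}\n'
    '{"type": "chunk", "id": 2, "data": []}\n'
)
main_chunk = first_chunk_with_nodes(sample)
```

Checking for the key rather than relying on line position keeps the lookup working if the order of the chunks changes.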

@helqasem

helqasem commented Nov 8, 2024

It seems that there have been some updates where some calls now return multiple non-uniform JSON objects in a single response, spread across multiple lines/chunks. This is causing the JSON decode errors.

So far, I've seen that both get_remote_llms() and get_remote_conversations() are affected.
The simple fix for get_remote_llms() in v0.4.12 is to change the loop so that it only processes the first line of the response:
change: for line in lines:
to: for line in lines[:1]:

For get_remote_conversations(), the conversation information is now passed in the second line/chunk. If anyone has the time to fork, validate and update, here is the updated function:

    def get_remote_conversations(self, replace_conversation_list=True):
        """
        Returns all the remote conversations for the active account. Returns the conversations in a list.
        """

        r = self.session.post(
            self.hf_base_url + "/chat/__data.json",
            headers=self.get_headers(ref=False),
            cookies=self.get_cookies(),
        )

        if r.status_code != 200:
            raise Exception(
                f"Failed to get remote conversations with status code: {r.status_code}"
            )

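        # The body holds several JSON documents, one per line;
        # the conversation data is in the second one.
        line2 = r.text.splitlines()[1] # New
        data = json.loads(line2) # Changed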

        conversationIndices = data['data'][0] # Changed
        conversations = []

        for index in conversationIndices:
            conversation_data = data['data'][index] # Changed
            c = Conversation(
                id=data['data'][conversation_data["id"]], # Changed 
                title=data['data'][conversation_data["title"]], # Changed
                model=data['data'][conversation_data["model"]], # Changed
            )

            conversations.append(c)

        if replace_conversation_list:
            self.conversation_list = conversations

        return conversations
        
       

@methuselah-0

I tried it, and now "/switch all" gives me "# Error: 'id:". Running just "/switch" works, though.

@digital-mine

Is there any ETA for the solution? I've tried several solutions published here, but none of them work for me.

@Soulter
Owner

Soulter commented Nov 15, 2024

Found the problem; it will be fixed within a day.

@Soulter
Owner

Soulter commented Nov 15, 2024

The problem is that https://huggingface.co/chat/__data.json doesn't return valid JSON.

UPDATE:
Funnily enough, the HF team placed 3 JSON documents into this file.
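
A body made of several JSON documents back to back can be read with the standard library's `json.JSONDecoder.raw_decode`, which returns each parsed value together with the offset where it ended. This is a minimal sketch under that assumption, not the fix that shipped in hugchat; `iter_json_documents` is a hypothetical name:

```python
import json


def iter_json_documents(text):
    """Yield every JSON document found in `text`, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        document, pos = decoder.raw_decode(text, pos)
        yield document


# Made-up body with three documents, mirroring the count mentioned above.
documents = list(iter_json_documents('{"a": 1}\n{"b": 2}\n[3]\n'))
```

Unlike splitting on newlines, this does not depend on each document sitting on its own line.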

@Soulter
Owner

Soulter commented Nov 15, 2024

v0.4.15 published and truly fixed the problem.

@Soulter
Owner

Soulter commented Nov 15, 2024

@Aravind4525 AI-based solution: https://chatgpt.com/share/672c644a-30e4-800e-a06c-31c3498b4381

Tested this and it works 🎉

It doesn't work.

@swimminsparrow

@Soulter thank you for your work!

@helqasem

@Soulter Thanks for your work on this.
I tested v0.4.15. It now allows login and creation of a chatbot; however, get_remote_conversations() still causes a JSON decode "Extra data" error.

In the case of get_remote_conversations(), the conversation data has been moved to the 2nd JSON object, not the 1st. I provided a quick workaround above:
#267 (comment)

Hope this helps.

@digital-mine

digital-mine commented Nov 15, 2024

@Soulter sorry to say, but the problem persists, the same as @helqasem described.

Soulter added a commit that referenced this issue Nov 16, 2024
@Soulter
Owner

Soulter commented Nov 16, 2024

@Soulter Thanks for your work on this. I tested v0.4.15.. it now allows login and creation of a chatbot, however, get_remote_conversations() still causes a JSON-decode "Extra Data" Error.

In the case of get_remote_conversations() the conversation data has been moved to the 2nd JSON-object not the 1st. I provided a quick workaround above: #267 (comment)

Hope this helps.

Thanks

@Soulter
Owner

Soulter commented Nov 16, 2024

v0.4.16 published and fixed the problem.

@methuselah-0

v0.4.16 published and fixed the problem.

Still "/switch all" gives me "# Error: 'id:".

@digital-mine

The id problem persists folks: "id=data['data'][conversation_data["id"]] KeyError: 'id' "

@helqasem

helqasem commented Nov 23, 2024

The issue is caused by another server-side change: HF has moved the conversation data back into the main chunk. The 2nd chunk, where they temporarily moved it, is now empty:
{"type":"chunk","id":1,"data":[[]]}

@Soulter it looks like the conversation data has been put back in its original location in the main JSON object. There are still multiple JSON objects in __data.json, but the data is now in the 1st instead of the 2nd. Your previous code for extracting the conversations works. Here is an updated get_remote_conversations():

def get_remote_conversations(self, replace_conversation_list=True):
    """
    Returns all the remote conversations for the active account. Returns the conversations in a list.
    """

    r = self.session.post(
        self.hf_base_url + "/chat/__data.json",
        headers=self.get_headers(ref=False),
        cookies=self.get_cookies(),
    )

    if r.status_code != 200:
        raise Exception(
            f"Failed to get remote conversations with status code: {r.status_code}"
        )

    # temporary workaround for #267
    line_ = r.text.splitlines()[0]
    data = json.loads(line_)["nodes"][0]["data"]

    conversationIndices = data[data[0]["conversations"]]

    conversations = []

    for index in conversationIndices:
        conversation_data = data[index]

        c = Conversation(
            id=data[conversation_data["id"]],
            title=data[conversation_data["title"]],
            model=data[conversation_data["model"]],
        )

        conversations.append(c)

    if replace_conversation_list:
        self.conversation_list = conversations

    return conversations
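
The lookups in this function follow an index-pointer layout: values in the flat `data` list are positions of other entries in the same list, not the values themselves. The sketch below illustrates that with made-up data. The layout is inferred from the code above, and `resolve_conversations` is a hypothetical helper:

```python
def resolve_conversations(data):
    """Follow the index pointers in `data` and return plain dicts."""
    # data[0] maps a field name to the position of its value in `data`.
    conversation_indices = data[data[0]["conversations"]]
    resolved = []
    for index in conversation_indices:
        record = data[index]  # e.g. {"id": 3, "title": 4, "model": 5}
        # Each value in the record is again a position in `data`.
        resolved.append({field: data[pointer] for field, pointer in record.items()})
    return resolved


# Toy list: every integer below is a position in this same list.
toy_data = [
    {"conversations": 1},               # 0: root object
    [2],                                # 1: positions of conversation records
    {"id": 3, "title": 4, "model": 5},  # 2: one record, all fields are pointers
    "abc123",                           # 3: id value
    "My chat",                          # 4: title value
    "some-model",                       # 5: model value
]
conversations = resolve_conversations(toy_data)
```

This is also why the KeyError: 'id' reported earlier shows up when the code reads a chunk with a different shape: the entry it lands on has no "id" pointer to follow.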

@digital-mine

Thanks @helqasem, it looks like it is working now!


9 participants