JSONDecodeError: Extra data: line 1 column 61 (char 60) #5

Open
al-yakubovich opened this issue Jun 2, 2023 · 4 comments

Comments

@al-yakubovich

al-yakubovich commented Jun 2, 2023

Hi. I am getting the following error:

JSONDecodeError: Extra data: line 1 column 61 (char 60)
Traceback:
File "C:\Users\AppData\Local\anaconda3\envs\py311_test\Lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
File "C:\Users\Desktop\GenAI\app\app2_test\interface.py", line 72, in <module>
    decoded_response = decode_response(response)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Desktop\GenAI\app\app2_test\interface.py", line 17, in decode_response
    return json.loads(response)
           ^^^^^^^^^^^^^^^^^^^^
File "C:\Users\AppData\Local\anaconda3\envs\py311_test\Lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\\AppData\Local\anaconda3\envs\py311_test\Lib\json\decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)

It looks like something is wrong with the decode_response function.

I changed it to:

import json

def decode_response(response: str) -> dict:
    # Workaround: parse only the first line and ignore anything after it.
    lines = response.splitlines()
    json_line = lines[0]
    return json.loads(json_line)

With this change it works for simple questions, but it still fails for most others (e.g. requests to plot something).
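For context, json.loads() reports "Extra data" when a complete JSON value is followed by more non-whitespace text, which may be why keeping only the first line helped in simple cases. A minimal sketch of a less brittle variant follows; decode_first_json is a hypothetical helper name, not part of the project. It parses the first JSON value and returns whatever text follows it, so the leftover can be inspected rather than silently dropped.

```python
import json

def decode_first_json(response: str):
    """Parse the first JSON value in `response`; return it plus any leftover text."""
    decoder = json.JSONDecoder()
    # raw_decode does not skip leading whitespace, so strip it first.
    text = response.lstrip()
    # raw_decode returns the parsed value and the index where parsing stopped.
    value, end = decoder.raw_decode(text)
    return value, text[end:].strip()

# Example: a JSON object followed by extra text on a second line.
value, leftover = decode_first_json('{"answer": "42"}\nSome trailing text')
```

Unlike splitting on lines, this also handles a JSON value that spans several lines. It still raises JSONDecodeError if the text does not start with valid JSON.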

@al-yakubovich
Author

It looks like this is all about token limits. When the response is too long, the system simply cuts it off and the response is no longer valid JSON. For example: {'key1': 'long_text_here', 'key2': 'another_long_text_her
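One way to check this hypothesis is to catch the decode error and look at its message and position instead of letting the app crash. The sketch below is an illustration under assumptions: try_decode is a hypothetical helper, and the truncated string is a made-up reply in valid JSON quoting, cut off mid-value. The reported message and position can help tell a truncated reply apart from one that has trailing text after the JSON.

```python
import json

def try_decode(response: str):
    """Return (parsed, None) on success, or (None, description) if the reply is not valid JSON."""
    try:
        return json.loads(response), None
    except json.JSONDecodeError as err:
        # err.msg is the parser's message; err.pos is the character offset of the failure.
        return None, f"{err.msg} at char {err.pos}"

# A made-up reply cut off mid-string, as a token limit might produce.
truncated = '{"key1": "long_text_here", "key2": "another_long_text_her'
parsed, problem = try_decode(truncated)
```

In a Streamlit app, the description could be shown to the user (for example with st.error) along with a prompt to ask a narrower question.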

@Sharvadze

Having the same issue. Did you manage to sort it out? Also, it's pretty slow with 10 MB CSV files.

@al-yakubovich
Author

Nope. This tool outputs a JSON structure containing all the data from the CSV, so it would always hit the token limit. The right way to do it is to change the prompt and the code so that the model outputs pandas/matplotlib code instead, and then execute that code to produce the pandas DataFrame or plot.

@Ngonie-x
Owner

Ngonie-x commented Jun 7, 2023

Hey. You're absolutely right. When the data is output as a JSON structure, it's highly likely to hit the token limit. To address this, I'm experimenting with an alternative approach: outputting a DataFrame expression instead of the data itself. For instance, instead of returning a complete string of the books with the highest rating, it will return a string like {"table": {"data": "df[['title', 'ratings_count']].head()"}}. To convert this string back to a Python dictionary, we can use json.loads().

Once the string is converted to a dictionary, we can apply it to the actual DataFrame, like this:

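import pandas as pd
import streamlit as st

df = pd.read_csv(data)

if "table" in response_dict:
    table_spec = response_dict["table"]
    # eval() runs the expression string against df; only use it on trusted output.
    table_df = eval(table_spec["data"])
    st.table(table_df)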

eval() evaluates the expression against df and returns the resulting DataFrame, which is then displayed in Streamlit with st.table().
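Because the expression string comes from the model, evaluating it with full access to the script's namespace carries some risk. A minimal sketch of a narrower variant is shown below. The function name evaluate_table_expression is hypothetical, and the approach is an assumption about how this could be wired up, not the project's implementation. It passes eval() a namespace that contains only df and an empty builtins table.

```python
import json

def evaluate_table_expression(response_text: str, df):
    """Parse the model's JSON reply and evaluate its "table" expression against df."""
    response_dict = json.loads(response_text)
    if "table" not in response_dict:
        return None
    expression = response_dict["table"]["data"]
    # Expose only `df` and hide the builtins, so names such as open or
    # __import__ are not available to the expression.
    # This narrows what the expression can reach; it is NOT a full sandbox.
    allowed_names = {"__builtins__": {}, "df": df}
    return eval(expression, allowed_names)
```

With a DataFrame loaded via pd.read_csv, a reply such as {"table": {"data": "df[['title', 'ratings_count']].head()"}} would evaluate to the first rows of those two columns, which can then be passed to st.table(). Methods on df remain reachable from the expression, so model output should still be treated as untrusted.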
