entity_relation_extractor.py fails when Node.properties having double quotes #238

Open
thimoonxy opened this issue Jan 3, 2025 · 1 comment

thimoonxy commented Jan 3, 2025

Background

  • GPT-4o works fine, but other models such as qwen2.5-72B-instruct hit this issue. It seems qwen is not able to generate a nested JSON object and returns the properties as a quoted string instead.

LLM outputs

    {
      "id": "0",
      "label": "Person",
      "properties": "{'name': 'Duke Leto Atreides'}"
    }

Expected output

    {
      "id": "0",
      "label": "Person",
      "properties": {"name": "Duke Leto Atreides"}
    }

Source file

neo4j_graphrag/experimental/components/entity_relation_extractor.py

Attempted fix

I had to strip the double quotes around the properties object in LLMEntityRelationExtractor.extract_for_chunk
for LLMs that are not able to generate a nested JSON object:

llm_result.content = llm_result.content.replace('''"{''', "{").replace('''}"''', "}")

Is there a more proper way to fix this?

  • I'm not sure how to fix it properly if the LLM cannot generate the properties JSON object correctly.
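
One possible alternative to the string replacement, sketched here as an assumption rather than as the library's behavior: parse the response first, then decode only the `properties` field when it arrives as a string. This avoids rewriting `"{` or `}"` sequences that happen to appear elsewhere in the response, for example inside a text value. The helper names `coerce_properties` and `normalize_node` are hypothetical and are not part of neo4j_graphrag.

```python
import ast
import json
from typing import Any


def coerce_properties(value: Any) -> dict[str, Any]:
    """Return `value` as a dict, decoding it first if the LLM sent a string."""
    if isinstance(value, dict):
        return value
    if isinstance(value, str):
        # Try strict JSON first, then a Python-style literal such as
        # "{'name': 'Duke Leto Atreides'}" (single quotes are not valid JSON).
        for parse in (json.loads, ast.literal_eval):
            try:
                parsed = parse(value)
            except (ValueError, SyntaxError):
                continue
            if isinstance(parsed, dict):
                return parsed
    # Unparseable string or unexpected type: fall back to no properties.
    return {}


def normalize_node(node: dict[str, Any]) -> dict[str, Any]:
    """Copy a node dict with its `properties` field coerced to a dict."""
    return {**node, "properties": coerce_properties(node.get("properties", {}))}


# The node reported in this issue, as a JSON string.
raw = """{"id": "0", "label": "Person", "properties": "{'name': 'Duke Leto Atreides'}"}"""
node = normalize_node(json.loads(raw))
print(node["properties"])  # {'name': 'Duke Leto Atreides'}
```

A string that neither parser accepts (for instance a value containing an unescaped apostrophe) falls back to an empty dict here; whether to drop, log, or raise in that case is a design choice this sketch does not settle.
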
@stellasia
Contributor

Hi @thimoonxy,

Thanks for bringing this issue to us. This is part of a broader problem: the LLM is not always able to generate JSON in the format we expect.

We're planning to improve this behavior.

Unfortunately, for now there is not much you can do except implement your own EntityRelationExtractor component.
