[Question]: How to update the default prompt of KnowledgeGraphIndex #15760
Comments
To update the default prompt of KnowledgeGraphIndex, define a custom triplet-extraction prompt and pass it to the index. Here is an example of a custom template:
from llama_index.core.prompts.base import PromptTemplate
from llama_index.core.prompts.prompt_type import PromptType
from llama_index.core.prompts.default_prompts import DEFAULT_KG_TRIPLET_EXTRACT_PROMPT
CUSTOM_KG_TRIPLET_EXTRACT_TMPL = (
"Extract up to {max_knowledge_triplets} knowledge triplets from the given text. "
"Each triplet should be in the form of (head, relation, tail) with their respective types.\n"
"---------------------\n"
"INITIAL ONTOLOGY:\n"
"Entity Types: {allowed_entity_types}\n"
"Relation Types: {allowed_relation_types}\n"
"\n"
"Use these types as a starting point, but introduce new types if necessary based on the context.\n"
"\n"
"GUIDELINES:\n"
"- Output in JSON format: [{{'head': '', 'head_type': '', 'relation': '', 'tail': '', 'tail_type': ''}}]\n"
"- Use the most complete form for entities (e.g., 'United States of America' instead of 'USA')\n"
"- Keep entities concise (3-5 words max)\n"
"- Break down complex phrases into multiple triplets\n"
"- Ensure the knowledge graph is coherent and easily understandable\n"
"---------------------\n"
"EXAMPLE:\n"
"Text: Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health. "
"UC Berkeley researchers studied the benefits of apples.\n"
"Output:\n"
"[{{'head': 'Tim Cook', 'head_type': 'PERSON', 'relation': 'CEO_OF', 'tail': 'Apple Inc.', 'tail_type': 'ORGANIZATION'}},\n"
" {{'head': 'Apple Inc.', 'head_type': 'ORGANIZATION', 'relation': 'PRODUCES', 'tail': 'Apple Watch', 'tail_type': 'PRODUCT'}},\n"
" {{'head': 'Apple Watch', 'head_type': 'PRODUCT', 'relation': 'MONITORS', 'tail': 'heart health', 'tail_type': 'HEALTH_METRIC'}},\n"
" {{'head': 'UC Berkeley', 'head_type': 'ORGANIZATION', 'relation': 'STUDIES', 'tail': 'benefits of apples', 'tail_type': 'RESEARCH_TOPIC'}}]\n"
"---------------------\n"
"Text: {text}\n"
"Output:\n"
)
CUSTOM_KG_TRIPLET_EXTRACT_PROMPT = PromptTemplate(
CUSTOM_KG_TRIPLET_EXTRACT_TMPL, prompt_type=PromptType.KNOWLEDGE_TRIPLET_EXTRACT
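)
In this custom template, you can specify the entity and relation types through the {allowed_entity_types} and {allowed_relation_types} placeholders. The prompt is then passed to KnowledgeGraphIndex:
from llama_index.core.indices.knowledge_graph.base import KnowledgeGraphIndex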
kg_index = KnowledgeGraphIndex(
kg_triplet_extract_template=CUSTOM_KG_TRIPLET_EXTRACT_PROMPT,
max_triplets_per_chunk=10,
include_embeddings=False,
show_progress=True
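)
This will ensure that the index uses the custom prompt for triplet extraction. A schema-based extractor can be configured in a similar way:
from llama_index.core.llms.llm import LLM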
from llama_index.core.indices.property_graph.transformations.schema_llm import SchemaLLMPathExtractor
llm = ...  # Placeholder: assign a concrete LLM instance here (the abstract LLM base class cannot be instantiated)
schema_extractor = SchemaLLMPathExtractor(
llm=llm,
extract_prompt=CUSTOM_KG_TRIPLET_EXTRACT_PROMPT,
possible_entities=["PERSON", "ORGANIZATION", "PRODUCT", "HEALTH_METRIC", "RESEARCH_TOPIC"],
possible_relations=["CEO_OF", "PRODUCES", "MONITORS", "STUDIES"],
strict=True,
max_triplets_per_chunk=10
)
# Use schema_extractor to process your data
This setup ensures that the extractor uses the custom prompt together with the specified entity and relation types.
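The template above follows Python format-string conventions: single braces mark placeholders, while doubled braces produce literal braces in the JSON example. The sketch below illustrates this with plain str.format on a shortened template. It assumes PromptTemplate fills placeholders in the same way; MINI_TMPL is a hypothetical name used only for illustration.

```python
# Shortened, hypothetical stand-in for the full template above.
MINI_TMPL = (
    "Extract up to {max_knowledge_triplets} triplets.\n"
    "Entity Types: {allowed_entity_types}\n"
    "Output format: [{{'head': '', 'relation': '', 'tail': ''}}]\n"
    "Text: {text}\n"
)

# Single-brace names are substituted; doubled braces become literal braces.
filled = MINI_TMPL.format(
    max_knowledge_triplets=10,
    allowed_entity_types=["PERSON", "ORGANIZATION"],
    text="Tim Cook is the CEO of Apple Inc.",
)
print(filled)
```

Lists are rendered with their Python repr, so the type list appears in the prompt exactly as it would be printed.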
How can I pass allowed_entity_types and allowed_relation_types to the template?
To pass the allowed entity and relation types, you can define them as a schema for SchemaLLMPathExtractor:
from typing import Literal
from llama_index.llms.ollama import Ollama
from llama_index.core.indices.property_graph import SchemaLLMPathExtractor
# Define possible entity types and relation types
entities = Literal["PERSON", "PLACE", "ORGANIZATION"]
relations = Literal["HAS", "PART_OF", "WORKED_ON", "WORKED_WITH", "WORKED_AT"]
# Define the validation schema for which entities can have which relations
validation_schema = [
("ORGANIZATION", "HAS", "PERSON"),
("PERSON", "WORKED_AT", "ORGANIZATION"),
("PERSON", "WORKED_WITH", "PERSON"),
("PERSON", "WORKED_ON", "ORGANIZATION"),
("PERSON", "PART_OF", "ORGANIZATION"),
("ORGANIZATION", "PART_OF", "ORGANIZATION"),
("PERSON", "WORKED_AT", "PLACE"),
]
# Create the SchemaLLMPathExtractor with the defined schema
kg_extractor = SchemaLLMPathExtractor(
llm=Ollama(model="llama3", json_mode=True, request_timeout=3600),
possible_entities=entities,
possible_relations=relations,
kg_validation_schema=validation_schema,
strict=True, # Ensures only values within the schema are allowed
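)
In this example, possible_entities and possible_relations define the allowed types, and kg_validation_schema lists which entity types may be linked by which relations.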
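To illustrate what a validation schema of (head type, relation, tail type) tuples expresses, here is a plain-Python sketch that filters candidate triplets against such a list. It is not the SchemaLLMPathExtractor implementation; the function name is_allowed and the sample candidates are hypothetical.

```python
# Allowed (head_type, relation, tail_type) combinations, in the same shape
# as the validation schema above.
validation_schema = [
    ("ORGANIZATION", "HAS", "PERSON"),
    ("PERSON", "WORKED_AT", "ORGANIZATION"),
    ("PERSON", "WORKED_WITH", "PERSON"),
]


def is_allowed(head_type, relation, tail_type, schema=validation_schema):
    """Return True if the typed triplet matches an entry in the schema."""
    return (head_type, relation, tail_type) in schema


candidates = [
    ("PERSON", "WORKED_AT", "ORGANIZATION"),
    ("ORGANIZATION", "WORKED_AT", "PERSON"),  # wrong direction, not in schema
]

# Keep only the candidates whose type combination is listed in the schema.
kept = [c for c in candidates if is_allowed(*c)]
print(kept)
```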
I mean, how can I pass allowed_entity_types and allowed_relation_types to CUSTOM_KG_TRIPLET_EXTRACT_TMPL?
To pass them, keep the {allowed_entity_types} and {allowed_relation_types} placeholders in the template:
CUSTOM_KG_TRIPLET_EXTRACT_TMPL = (
"Extract up to {max_knowledge_triplets} knowledge triplets from the given text. "
"Each triplet should be in the form of (head, relation, tail) with their respective types.\n"
"---------------------\n"
"INITIAL ONTOLOGY:\n"
"Entity Types: {allowed_entity_types}\n"
"Relation Types: {allowed_relation_types}\n"
"\n"
"Use these types as a starting point, but introduce new types if necessary based on the context.\n"
"\n"
"GUIDELINES:\n"
"- Output in JSON format: [{{'head': '', 'head_type': '', 'relation': '', 'tail': '', 'tail_type': ''}}]\n"
"- Use the most complete form for entities (e.g., 'United States of America' instead of 'USA')\n"
"- Keep entities concise (3-5 words max)\n"
"- Break down complex phrases into multiple triplets\n"
"- Ensure the knowledge graph is coherent and easily understandable\n"
"---------------------\n"
"EXAMPLE:\n"
"Text: Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health. "
"UC Berkeley researchers studied the benefits of apples.\n"
"Output:\n"
"[{{'head': 'Tim Cook', 'head_type': 'PERSON', 'relation': 'CEO_OF', 'tail': 'Apple Inc.', 'tail_type': 'COMPANY'}},\n"
" {{'head': 'Apple Inc.', 'head_type': 'COMPANY', 'relation': 'PRODUCES', 'tail': 'Apple Watch', 'tail_type': 'PRODUCT'}},\n"
" {{'head': 'Apple Watch', 'head_type': 'PRODUCT', 'relation': 'MONITORS', 'tail': 'heart health', 'tail_type': 'HEALTH_METRIC'}},\n"
" {{'head': 'UC Berkeley', 'head_type': 'UNIVERSITY', 'relation': 'STUDIES', 'tail': 'benefits of apples', 'tail_type': 'RESEARCH_TOPIC'}}]\n"
"---------------------\n"
"Text: {text}\n"
"Output:\n"
)
You can then create a PromptTemplate from this string:
from llama_index.core.prompts.base import PromptTemplate
from llama_index.core.prompts.prompt_type import PromptType
CUSTOM_KG_TRIPLET_EXTRACT_PROMPT = PromptTemplate(
CUSTOM_KG_TRIPLET_EXTRACT_TMPL,
prompt_type=PromptType.KNOWLEDGE_TRIPLET_EXTRACT
)
This template includes placeholders for the allowed entity and relation types. You can supply them through DynamicLLMPathExtractor:
from llama_index.core.indices.property_graph.transformations.dynamic_llm import DynamicLLMPathExtractor
from llama_index.core.llms.llm import LLM
llm = ...  # Placeholder: assign a concrete LLM instance here (the abstract LLM base class cannot be instantiated)
extractor = DynamicLLMPathExtractor(
llm=llm,
extract_prompt=CUSTOM_KG_TRIPLET_EXTRACT_PROMPT,
allowed_entity_types=['PERSON', 'COMPANY', 'PRODUCT', 'HEALTH_METRIC', 'UNIVERSITY', 'RESEARCH_TOPIC'],
allowed_relation_types=['CEO_OF', 'PRODUCES', 'MONITORS', 'STUDIES']
)
This setup ensures that the allowed entity and relation types are passed to the template.
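The template asks the model for a list of dicts written with single quotes, which json.loads would reject because JSON requires double quotes. One way to parse such output, assuming the model returned exactly the requested format, is ast.literal_eval. The helper parse_triplets and the sample string are hypothetical and are not part of LlamaIndex.

```python
import ast


def parse_triplets(raw):
    """Parse a Python-literal list of dicts into (head, relation, tail) tuples."""
    items = ast.literal_eval(raw.strip())
    return [(d["head"], d["relation"], d["tail"]) for d in items]


# Hypothetical model output in the format requested by the template.
sample = (
    "[{'head': 'Tim Cook', 'head_type': 'PERSON', 'relation': 'CEO_OF', "
    "'tail': 'Apple Inc.', 'tail_type': 'COMPANY'}]"
)

triplets = parse_triplets(sample)
print(triplets)
```

Real model output may deviate from the requested format, so production code would need error handling around the parse step.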
After creating CUSTOM_KG_TRIPLET_EXTRACT_PROMPT, how can I create a KnowledgeGraphIndex using it and pass allowed_entity_types and allowed_relation_types?
To create a KnowledgeGraphIndex with the custom prompt, pass the prompt to the constructor.
Here is an example:
from llama_index.core import KnowledgeGraphIndex
from llama_index.core.prompts import PromptTemplate
from llama_index.core.prompts.prompt_type import PromptType
# Define your custom prompt template
CUSTOM_KG_TRIPLET_EXTRACT_TMPL = (
"Extract up to {max_knowledge_triplets} knowledge triplets from the given text. "
"Each triplet should be in the form of (head, relation, tail) with their respective types.\n"
"---------------------\n"
"INITIAL ONTOLOGY:\n"
"Entity Types: {allowed_entity_types}\n"
"Relation Types: {allowed_relation_types}\n"
"\n"
"Use these types as a starting point, but introduce new types if necessary based on the context.\n"
"\n"
"GUIDELINES:\n"
"- Output in JSON format: [{{'head': '', 'head_type': '', 'relation': '', 'tail': '', 'tail_type': ''}}]\n"
"- Use the most complete form for entities (e.g., 'United States of America' instead of 'USA')\n"
"- Keep entities concise (3-5 words max)\n"
"- Break down complex phrases into multiple triplets\n"
"- Ensure the knowledge graph is coherent and easily understandable\n"
"---------------------\n"
"EXAMPLE:\n"
"Text: Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health. "
"UC Berkeley researchers studied the benefits of apples.\n"
"Output:\n"
"[{{'head': 'Tim Cook', 'head_type': 'PERSON', 'relation': 'CEO_OF', 'tail': 'Apple Inc.', 'tail_type': 'COMPANY'}},\n"
" {{'head': 'Apple Inc.', 'head_type': 'COMPANY', 'relation': 'PRODUCES', 'tail': 'Apple Watch', 'tail_type': 'PRODUCT'}},\n"
" {{'head': 'Apple Watch', 'head_type': 'PRODUCT', 'relation': 'MONITORS', 'tail': 'heart health', 'tail_type': 'HEALTH_METRIC'}},\n"
" {{'head': 'UC Berkeley', 'head_type': 'UNIVERSITY', 'relation': 'STUDIES', 'tail': 'benefits of apples', 'tail_type': 'RESEARCH_TOPIC'}}]\n"
"---------------------\n"
"Text: {text}\n"
"Output:\n"
)
CUSTOM_KG_TRIPLET_EXTRACT_PROMPT = PromptTemplate(
CUSTOM_KG_TRIPLET_EXTRACT_TMPL,
prompt_type=PromptType.KNOWLEDGE_TRIPLET_EXTRACT
)
# Define allowed entity types and relation types
allowed_entity_types = ["PERSON", "COMPANY", "PRODUCT", "UNIVERSITY", "HEALTH_METRIC", "RESEARCH_TOPIC"]
allowed_relation_types = ["CEO_OF", "PRODUCES", "MONITORS", "STUDIES"]
# Create the KnowledgeGraphIndex
kg_index = KnowledgeGraphIndex(
kg_triple_extract_template=CUSTOM_KG_TRIPLET_EXTRACT_PROMPT,
allowed_entity_types=allowed_entity_types,
allowed_relation_types=allowed_relation_types
)
This code sets up the KnowledgeGraphIndex with the custom prompt.
How can I pass my documents to create this kg_index?
To pass your documents to create the KnowledgeGraphIndex, call from_documents:
from llama_index.core import KnowledgeGraphIndex
from llama_index.core.prompts import PromptTemplate
from llama_index.core.prompts.prompt_type import PromptType
# Define your custom prompt template
CUSTOM_KG_TRIPLET_EXTRACT_TMPL = (
"Extract up to {max_knowledge_triplets} knowledge triplets from the given text. "
"Each triplet should be in the form of (head, relation, tail) with their respective types.\n"
"---------------------\n"
"INITIAL ONTOLOGY:\n"
"Entity Types: {allowed_entity_types}\n"
"Relation Types: {allowed_relation_types}\n"
"\n"
"Use these types as a starting point, but introduce new types if necessary based on the context.\n"
"\n"
"GUIDELINES:\n"
"- Output in JSON format: [{{'head': '', 'head_type': '', 'relation': '', 'tail': '', 'tail_type': ''}}]\n"
"- Use the most complete form for entities (e.g., 'United States of America' instead of 'USA')\n"
"- Keep entities concise (3-5 words max)\n"
"- Break down complex phrases into multiple triplets\n"
"- Ensure the knowledge graph is coherent and easily understandable\n"
"---------------------\n"
"EXAMPLE:\n"
"Text: Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health. "
"UC Berkeley researchers studied the benefits of apples.\n"
"Output:\n"
"[{{'head': 'Tim Cook', 'head_type': 'PERSON', 'relation': 'CEO_OF', 'tail': 'Apple Inc.', 'tail_type': 'COMPANY'}},\n"
" {{'head': 'Apple Inc.', 'head_type': 'COMPANY', 'relation': 'PRODUCES', 'tail': 'Apple Watch', 'tail_type': 'PRODUCT'}},\n"
" {{'head': 'Apple Watch', 'head_type': 'PRODUCT', 'relation': 'MONITORS', 'tail': 'heart health', 'tail_type': 'HEALTH_METRIC'}},\n"
" {{'head': 'UC Berkeley', 'head_type': 'UNIVERSITY', 'relation': 'STUDIES', 'tail': 'benefits of apples', 'tail_type': 'RESEARCH_TOPIC'}}]\n"
"---------------------\n"
"Text: {text}\n"
"Output:\n"
)
CUSTOM_KG_TRIPLET_EXTRACT_PROMPT = PromptTemplate(
CUSTOM_KG_TRIPLET_EXTRACT_TMPL,
prompt_type=PromptType.KNOWLEDGE_TRIPLET_EXTRACT
)
# Define allowed entity types and relation types
allowed_entity_types = ["PERSON", "COMPANY", "PRODUCT", "UNIVERSITY", "HEALTH_METRIC", "RESEARCH_TOPIC"]
allowed_relation_types = ["CEO_OF", "PRODUCES", "MONITORS", "STUDIES"]
# Create the KnowledgeGraphIndex
kg_index = KnowledgeGraphIndex(
kg_triple_extract_template=CUSTOM_KG_TRIPLET_EXTRACT_PROMPT,
allowed_entity_types=allowed_entity_types,
allowed_relation_types=allowed_relation_types
)
# Define your documents
documents = [
"Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health.",
"UC Berkeley researchers studied the benefits of apples."
]
# Create the KnowledgeGraphIndex from documents
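kg_index.from_documents(documents)
This code sets up the KnowledgeGraphIndex and builds it from your documents.
I got an error: ValueError: One of nodes, objects, or index_struct must be provided.
To resolve the ValueError, provide nodes when constructing the index:
from llama_index.core import KnowledgeGraphIndex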
from llama_index.core.prompts import PromptTemplate
from llama_index.core.prompts.prompt_type import PromptType
from llama_index.core.schema import TextNode
# Define your custom prompt template
CUSTOM_KG_TRIPLET_EXTRACT_TMPL = (
"Extract up to {max_knowledge_triplets} knowledge triplets from the given text. "
"Each triplet should be in the form of (head, relation, tail) with their respective types.\n"
"---------------------\n"
"INITIAL ONTOLOGY:\n"
"Entity Types: {allowed_entity_types}\n"
"Relation Types: {allowed_relation_types}\n"
"\n"
"Use these types as a starting point, but introduce new types if necessary based on the context.\n"
"\n"
"GUIDELINES:\n"
"- Output in JSON format: [{{'head': '', 'head_type': '', 'relation': '', 'tail': '', 'tail_type': ''}}]\n"
"- Use the most complete form for entities (e.g., 'United States of America' instead of 'USA')\n"
"- Keep entities concise (3-5 words max)\n"
"- Break down complex phrases into multiple triplets\n"
"- Ensure the knowledge graph is coherent and easily understandable\n"
"---------------------\n"
"EXAMPLE:\n"
"Text: Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health. "
"UC Berkeley researchers studied the benefits of apples.\n"
"Output:\n"
"[{{'head': 'Tim Cook', 'head_type': 'PERSON', 'relation': 'CEO_OF', 'tail': 'Apple Inc.', 'tail_type': 'COMPANY'}},\n"
" {{'head': 'Apple Inc.', 'head_type': 'COMPANY', 'relation': 'PRODUCES', 'tail': 'Apple Watch', 'tail_type': 'PRODUCT'}},\n"
" {{'head': 'Apple Watch', 'head_type': 'PRODUCT', 'relation': 'MONITORS', 'tail': 'heart health', 'tail_type': 'HEALTH_METRIC'}},\n"
" {{'head': 'UC Berkeley', 'head_type': 'UNIVERSITY', 'relation': 'STUDIES', 'tail': 'benefits of apples', 'tail_type': 'RESEARCH_TOPIC'}}]\n"
"---------------------\n"
"Text: {text}\n"
"Output:\n"
)
CUSTOM_KG_TRIPLET_EXTRACT_PROMPT = PromptTemplate(
CUSTOM_KG_TRIPLET_EXTRACT_TMPL,
prompt_type=PromptType.KNOWLEDGE_TRIPLET_EXTRACT
)
# Define allowed entity types and relation types
allowed_entity_types = ["PERSON", "COMPANY", "PRODUCT", "UNIVERSITY", "HEALTH_METRIC", "RESEARCH_TOPIC"]
allowed_relation_types = ["CEO_OF", "PRODUCES", "MONITORS", "STUDIES"]
# Define your documents
documents = [
"Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health.",
"UC Berkeley researchers studied the benefits of apples."
]
# Create BaseNode objects from documents
nodes = [TextNode(text=doc) for doc in documents]
# Create the KnowledgeGraphIndex
kg_index = KnowledgeGraphIndex(
nodes=nodes,
kg_triple_extract_template=CUSTOM_KG_TRIPLET_EXTRACT_PROMPT,
allowed_entity_types=allowed_entity_types,
allowed_relation_types=allowed_relation_types
)
This code creates TextNode objects from the documents and passes them to the KnowledgeGraphIndex.
I got TypeError: KnowledgeGraphIndex._build_index_from_nodes() got an unexpected keyword argument 'allowed_entity_types'
To resolve the TypeError, do not pass allowed_entity_types or allowed_relation_types to the constructor. Here's the relevant part of the KnowledgeGraphIndex source:
def _build_index_from_nodes(self, nodes: Sequence[BaseNode]) -> KG:
    """Build the index from nodes."""
    # do simple concatenation
    index_struct = self.index_struct_cls()
    nodes_with_progress = get_tqdm_iterable(
        nodes, self._show_progress, "Processing nodes"
    )
    for n in nodes_with_progress:
        triplets = self._extract_triplets(
            n.get_content(metadata_mode=MetadataMode.LLM)
        )
        logger.debug(f"> Extracted triplets: {triplets}")
        for triplet in triplets:
            subj, _, obj = triplet
            self.upsert_triplet(triplet)
            index_struct.add_node([subj, obj], n)
        if self.include_embeddings:
            triplet_texts = [str(t) for t in triplets]
            embed_model = self._service_context.embed_model
            embed_outputs = embed_model.get_text_embedding_batch(
                triplet_texts, show_progress=self._show_progress
            )
            for rel_text, rel_embed in zip(triplet_texts, embed_outputs):
                index_struct.add_to_embedding_dict(rel_text, rel_embed)
    return index_struct
The method does not accept allowed_entity_types or allowed_relation_types as keyword arguments.
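The error pattern can be reproduced in plain Python: when a constructor forwards unrecognised keyword arguments to a method that does not declare them, Python raises this kind of TypeError. The sketch below uses a hypothetical TinyIndex class and assumes the real index forwards extra keyword arguments in a similar way; it is not the LlamaIndex code.

```python
class TinyIndex:
    """Hypothetical stand-in that forwards extra kwargs to its build step."""

    def __init__(self, nodes, **kwargs):
        # Unknown keyword arguments are passed straight through.
        self.struct = self._build_index_from_nodes(nodes, **kwargs)

    def _build_index_from_nodes(self, nodes):
        # This method declares no extra keyword arguments.
        return list(nodes)


message = ""
try:
    TinyIndex(["a"], allowed_entity_types=["PERSON"])
except TypeError as exc:
    message = str(exc)

print(message)
```

Values that only the prompt needs therefore have to reach the template some other way than through the constructor's keyword arguments.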
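But you told me to do this: kg_index = KnowledgeGraphIndex(
To create a KnowledgeGraphIndex with the custom prompt, fill in the extra template variables with partial_format instead of passing them to the constructor. Here is the correct way to set up and use the KnowledgeGraphIndex.
Here is an example:
from llama_index.legacy.prompts.base import PromptTemplate
from llama_index.legacy.indices.knowledge_graph.base import KnowledgeGraphIndex
from llama_index.legacy.schema import TextNode
# Define your custom prompt template (PromptTemplate is the concrete class; BasePromptTemplate is abstract)
CUSTOM_KG_TRIPLET_EXTRACT_PROMPT = PromptTemplate(
template=(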
"Extract up to {max_knowledge_triplets} knowledge triplets from the given text. "
"Each triplet should be in the form of (head, relation, tail) with their respective types.\n"
"---------------------\n"
"INITIAL ONTOLOGY:\n"
"Entity Types: {allowed_entity_types}\n"
"Relation Types: {allowed_relation_types}\n"
"\n"
"Use these types as a starting point, but introduce new types if necessary based on the context.\n"
"\n"
"GUIDELINES:\n"
"- Output in JSON format: [{{'head': '', 'head_type': '', 'relation': '', 'tail': '', 'tail_type': ''}}]\n"
"- Use the most complete form for entities (e.g., 'United States of America' instead of 'USA')\n"
"- Keep entities concise (3-5 words max)\n"
"- Break down complex phrases into multiple triplets\n"
"- Ensure the knowledge graph is coherent and easily understandable\n"
"---------------------\n"
"EXAMPLE:\n"
"Text: Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health. "
"UC Berkeley researchers studied the benefits of apples.\n"
"Output:\n"
"[{{'head': 'Tim Cook', 'head_type': 'PERSON', 'relation': 'CEO_OF', 'tail': 'Apple Inc.', 'tail_type': 'COMPANY'}},\n"
" {{'head': 'Apple Inc.', 'head_type': 'COMPANY', 'relation': 'PRODUCES', 'tail': 'Apple Watch', 'tail_type': 'PRODUCT'}},\n"
" {{'head': 'Apple Watch', 'head_type': 'PRODUCT', 'relation': 'MONITORS', 'tail': 'heart health', 'tail_type': 'HEALTH_METRIC'}},\n"
" {{'head': 'UC Berkeley', 'head_type': 'UNIVERSITY', 'relation': 'STUDIES', 'tail': 'benefits of apples', 'tail_type': 'RESEARCH_TOPIC'}}]\n"
"---------------------\n"
"Text: {text}\n"
"Output:\n"
)
)
# Define allowed entity types and relation types
allowed_entity_types = ["PERSON", "COMPANY", "PRODUCT", "UNIVERSITY", "HEALTH_METRIC", "RESEARCH_TOPIC"]
allowed_relation_types = ["CEO_OF", "PRODUCES", "MONITORS", "STUDIES"]
# Define your documents
documents = [
"Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health.",
"UC Berkeley researchers studied the benefits of apples."
]
# Create TextNode objects from documents
nodes = [TextNode(text=doc) for doc in documents]
# Create the KnowledgeGraphIndex
kg_index = KnowledgeGraphIndex(
nodes=nodes,
kg_triple_extract_template=CUSTOM_KG_TRIPLET_EXTRACT_PROMPT.partial_format(
allowed_entity_types=allowed_entity_types,
allowed_relation_types=allowed_relation_types,
max_knowledge_triplets=10
)
)
This code sets up the KnowledgeGraphIndex with the custom prompt, filling in the allowed types through partial_format.
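The idea behind partial formatting is to pre-fill some template variables and leave the rest for later. A plain-Python sketch of that idea, using functools.partial over str.format, might look like this. It only illustrates the concept and is not how LlamaIndex implements partial_format; TMPL and fill_text are hypothetical names.

```python
from functools import partial

# Shortened, hypothetical template with three variables.
TMPL = "Types: {allowed_entity_types}\nMax: {max_knowledge_triplets}\nText: {text}\n"

# Pre-fill everything except {text}; the remaining variable is supplied per chunk.
fill_text = partial(
    TMPL.format,
    allowed_entity_types=["PERSON", "COMPANY"],
    max_knowledge_triplets=10,
)

prompt = fill_text(text="UC Berkeley researchers studied the benefits of apples.")
print(prompt)
```

With this split, the code that processes each text chunk only needs to supply the text.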
How to persist the graph index after building it?
To persist the graph index after building it, you can use the built-in persist methods.
Composable Graph
graph.root_index.storage_context.persist(persist_dir="<persist_dir>")
This will write all the data to disk at the specified location.
SimpleGraphStore
graph_store = SimpleGraphStore()
# ... (build your graph store)
graph_store.persist(persist_path="<persist_path>")
This will save the graph store data to the specified path.
PropertyGraphIndex
from llama_index.core import PropertyGraphIndex
# create
index = PropertyGraphIndex.from_documents(documents)
# save
index.storage_context.persist("./storage")
# load
from llama_index.core import StorageContext, load_index_from_storage
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
This will save the PropertyGraphIndex to ./storage and load it back from there.
Loading SimpleGraphStore
from llama_index.legacy.graph_stores.simple import SimpleGraphStore
# load
graph_store = SimpleGraphStore.from_persist_path(persist_path="<persist_path>")
This will load the graph store from the specified path. Additionally, you can use constants from the graph_stores.types module:
from llama_index.core.graph_stores.types import DEFAULT_PERSIST_DIR, DEFAULT_PERSIST_FNAME
# Example usage
graph_store.persist(persist_path=DEFAULT_PERSIST_DIR + "/" + DEFAULT_PERSIST_FNAME)
This ensures that you are using the default directory and file name conventions provided by the LlamaIndex framework [1][2][3][4][5].
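As a library-independent illustration of the persist-then-load round trip, the sketch below writes a list of triplets to a JSON file and reads it back. The file name and the JSON layout are hypothetical and do not reflect the on-disk format used by SimpleGraphStore.

```python
import json
import os
import tempfile

triplets = [["Tim Cook", "CEO_OF", "Apple Inc."]]

# Use a temporary directory so the sketch does not touch existing files.
persist_dir = tempfile.mkdtemp()
persist_path = os.path.join(persist_dir, "graph_store.json")  # hypothetical file name

# Persist: write the data to disk.
with open(persist_path, "w", encoding="utf-8") as f:
    json.dump({"triplets": triplets}, f)

# Load: read the data back from the same path.
with open(persist_path, encoding="utf-8") as f:
    loaded = json.load(f)["triplets"]

print(loaded)
```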
Can I use the updated template directly in the kg_index.from_documents method?
Also, how can I print out the actual prompt used to generate the response, with the actual values of the variables in the prompt template?
Yes, you can use the updated template directly in the from_documents method, and you can print the filled-in prompt by calling format on the template. Here is an example:
from llama_index.legacy.prompts.base import PromptTemplate
from llama_index.legacy.prompts.prompt_type import PromptType
from llama_index.legacy.indices.knowledge_graph.base import KnowledgeGraphIndex
from llama_index.legacy.schema import TextNode
# Define your custom prompt template
CUSTOM_KG_TRIPLET_EXTRACT_TMPL = (
"Extract up to {max_knowledge_triplets} knowledge triplets from the given text. "
"Each triplet should be in the form of (head, relation, tail) with their respective types.\n"
"---------------------\n"
"INITIAL ONTOLOGY:\n"
"Entity Types: {allowed_entity_types}\n"
"Relation Types: {allowed_relation_types}\n"
"\n"
"Use these types as a starting point, but introduce new types if necessary based on the context.\n"
"\n"
"GUIDELINES:\n"
"- Output in JSON format: [{{'head': '', 'head_type': '', 'relation': '', 'tail': '', 'tail_type': ''}}]\n"
"- Use the most complete form for entities (e.g., 'United States of America' instead of 'USA')\n"
"- Keep entities concise (3-5 words max)\n"
"- Break down complex phrases into multiple triplets\n"
"- Ensure the knowledge graph is coherent and easily understandable\n"
"---------------------\n"
"EXAMPLE:\n"
"Text: Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health. "
"UC Berkeley researchers studied the benefits of apples.\n"
"Output:\n"
"[{{'head': 'Tim Cook', 'head_type': 'PERSON', 'relation': 'CEO_OF', 'tail': 'Apple Inc.', 'tail_type': 'COMPANY'}},\n"
" {{'head': 'Apple Inc.', 'head_type': 'COMPANY', 'relation': 'PRODUCES', 'tail': 'Apple Watch', 'tail_type': 'PRODUCT'}},\n"
" {{'head': 'Apple Watch', 'head_type': 'PRODUCT', 'relation': 'MONITORS', 'tail': 'heart health', 'tail_type': 'HEALTH_METRIC'}},\n"
" {{'head': 'UC Berkeley', 'head_type': 'UNIVERSITY', 'relation': 'STUDIES', 'tail': 'benefits of apples', 'tail_type': 'RESEARCH_TOPIC'}}]\n"
"---------------------\n"
"Text: {text}\n"
"Output:\n"
)
CUSTOM_KG_TRIPLET_EXTRACT_PROMPT = PromptTemplate(
CUSTOM_KG_TRIPLET_EXTRACT_TMPL,
prompt_type=PromptType.KNOWLEDGE_TRIPLET_EXTRACT
)
# Define allowed entity types and relation types
allowed_entity_types = ["PERSON", "COMPANY", "PRODUCT", "UNIVERSITY", "HEALTH_METRIC", "RESEARCH_TOPIC"]
allowed_relation_types = ["CEO_OF", "PRODUCES", "MONITORS", "STUDIES"]
# Format the prompt with actual values
formatted_prompt = CUSTOM_KG_TRIPLET_EXTRACT_PROMPT.format(
allowed_entity_types=allowed_entity_types,
allowed_relation_types=allowed_relation_types,
max_knowledge_triplets=10,
text="Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health."
)
# Print the formatted prompt
print(formatted_prompt)
# Define your documents
documents = [
"Tim Cook, CEO of Apple Inc., announced the new Apple Watch that monitors heart health.",
"UC Berkeley researchers studied the benefits of apples."
]
# Create TextNode objects from documents
nodes = [TextNode(text=doc) for doc in documents]
# Create the KnowledgeGraphIndex
kg_index = KnowledgeGraphIndex(
nodes=nodes,
kg_triplet_extract_template=CUSTOM_KG_TRIPLET_EXTRACT_PROMPT.partial_format(
allowed_entity_types=allowed_entity_types,
allowed_relation_types=allowed_relation_types,
max_knowledge_triplets=10
)
)
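# Alternatively, you can use the from_documents method, which expects Document objects rather than plain strings
from llama_index.legacy.schema import Document
kg_index = KnowledgeGraphIndex.from_documents(
[Document(text=doc) for doc in documents],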
kg_triplet_extract_template=CUSTOM_KG_TRIPLET_EXTRACT_PROMPT.partial_format(
allowed_entity_types=allowed_entity_types,
allowed_relation_types=allowed_relation_types,
max_knowledge_triplets=10
)
)
This code demonstrates how to use the updated template directly in the from_documents method and how to print the formatted prompt.
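Before formatting a template, it can help to check which variables it expects. A small sketch using the standard library's string.Formatter, on a shortened hypothetical template, lists the placeholder names and skips the escaped braces:

```python
from string import Formatter

# Shortened, hypothetical template: two placeholders plus escaped literal braces.
TMPL = "Max: {max_knowledge_triplets}\nFormat: [{{'head': ''}}]\nText: {text}\n"

# Formatter().parse yields (literal_text, field_name, format_spec, conversion);
# field_name is None for pure literal segments, including escaped braces.
fields = [name for _, name, _, _ in Formatter().parse(TMPL) if name]
print(fields)
```

Any name in this list that is not supplied to format or pre-filled beforehand will raise a KeyError when the template is formatted with str.format.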
Question
I want to figure out how to update the default prompt of KnowledgeGraphIndex, so that I can set constraints on the entities and relations it extracts.