Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create Genie API wrapper #2

Merged
merged 12 commits into from
Oct 24, 2024
Merged

Create Genie API wrapper #2

merged 12 commits into from
Oct 24, 2024

Conversation

prithvikannan
Copy link
Collaborator

@prithvikannan prithvikannan commented Oct 17, 2024

Create the core Genie API wrapper. The wrapper API follows the Genie convseration APIs for polling.

Unit tested.

Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
pyproject.toml Outdated Show resolved Hide resolved
}

def start_conversation(self, content):
resp = self.genie._api.do(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the polling mechanisms with the start_conversation API does not work. ill asked the genie team and they recommended to use the _api for now. we can revisit when fixed.

src/databricks_ai_bridge/genie.py Show resolved Hide resolved
headers=self.headers,
)
if resp["status"] == "EXECUTING_QUERY":
sql = next(r for r in resp["attachments"] if "query" in r)["query"]["query"]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we getting the sql statements here only for the debug statement? Do end customers need this information?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eventually these should be part of the trace. currently adding extra traces with autologged traces are not supported, but this is coming soon. we'll use the SQL at that time.


return poll_result()

def ask_question(self, question):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of adding a timeout feature here, so users are not stuck waiting forever in the polling loop?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good question. i wonder if genie requests have some RPC level timeout on that side, but i think a client side timeout also makes sense. will update

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually genie has a CANCELED state that happens in the case of timeout, which will hit the else case and break the loop. i dont think we need the client side timeout.

logging.debug(f"SQL: {sql}")
return poll_query_results()
elif resp["status"] == "COMPLETED":
return next(r for r in resp["attachments"] if "text" in r)["text"]["content"]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For genie, do we only care about the first response in the list?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at this time there's only one text attachment from genie so this is a safe assumption.

{"status": "COMPLETED", "attachments": [{"text": {"content": "Answer"}}]},
]
result = genie.ask_question("What is the meaning of life?")
assert result == "Answer"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The answer cant be that simple 🤣

Copy link

@aravind-segu aravind-segu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the databricks-sdk code, the get api is exactly similar if we can use it directly. But it would be half direct api code and half databricks sdk code. Upto you on if you think that would be cleaner.

@prithvikannan
Copy link
Collaborator Author

Looking at the databricks-sdk code, the get api is exactly similar if we can use it directly. But it would be half direct api code and half databricks sdk code. Upto you on if you think that would be cleaner.

Good find. Let's do the refactor in one shot to keep it readable.

@prithvikannan prithvikannan merged commit b314f60 into main Oct 24, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants