-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create Genie API wrapper #2
Conversation
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
Signed-off-by: Prithvi Kannan <[email protected]>
} | ||
|
||
def start_conversation(self, content): | ||
resp = self.genie._api.do( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question, why are we using the raw _api
request here? Can we not use this method directly? https://databricks-sdk-py.readthedocs.io/en/stable/workspace/dashboards/genie.html#databricks.sdk.service.dashboards.GenieAPI.start_conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the polling mechanisms with the start_conversation
API does not work. ill asked the genie team and they recommended to use the _api
for now. we can revisit when fixed.
headers=self.headers, | ||
) | ||
if resp["status"] == "EXECUTING_QUERY": | ||
sql = next(r for r in resp["attachments"] if "query" in r)["query"]["query"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we getting the sql statements here only for the debug statement? Do end customers need this information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eventually these should be part of the trace. currently adding extra traces with autologged traces are not supported, but this is coming soon. we'll use the SQL at that time.
|
||
return poll_result() | ||
|
||
def ask_question(self, question): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of adding a timeout feature here, so users are not stuck waiting forever in the polling loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good question. i wonder if genie requests have some RPC level timeout on that side, but i think a client side timeout also makes sense. will update
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually genie has a CANCELED
state that happens in the case of timeout, which will hit the else case and break the loop. i dont think we need the client side timeout.
logging.debug(f"SQL: {sql}") | ||
return poll_query_results() | ||
elif resp["status"] == "COMPLETED": | ||
return next(r for r in resp["attachments"] if "text" in r)["text"]["content"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For genie, do we only care about the first response in the list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at this time there's only one text attachment from genie so this is a safe assumption.
{"status": "COMPLETED", "attachments": [{"text": {"content": "Answer"}}]}, | ||
] | ||
result = genie.ask_question("What is the meaning of life?") | ||
assert result == "Answer" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The answer cant be that simple 🤣
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the databricks-sdk code, the get api is exactly similar if we can use it directly. But it would be half direct api code and half databricks sdk code. Upto you on if you think that would be cleaner.
Good find. Let's do the refactor in one shot to keep it readable. |
Create the core Genie API wrapper. The wrapper API follows the Genie convseration APIs for polling.
Unit tested.