
fix(QueryTracker): publish query tracker results #773

Merged

3 commits merged into main on Nov 22, 2023

Conversation

Collaborator

@ArslanSaleem ArslanSaleem commented Nov 22, 2023

Summary by CodeRabbit

  • New Features

    • Implemented a query tracker to monitor and publish updates post pipeline execution.
  • Tests

    • Added tests to ensure the query tracker is properly called during the chat method execution.
    • Enhanced testing for code execution within the smart data lake pipeline context.

Contributor

coderabbitai bot commented Nov 22, 2023

Walkthrough

The changes involve enhancing a data pipeline's code execution tracking and testing mechanisms. A new query tracker is introduced to publish updates post pipeline execution, and the execute method within the pipeline now uses a tracker for code execution. Tests are updated to mock and verify these new behaviors, ensuring the tracker is called appropriately during execution and when the chat method is invoked.
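The pattern the walkthrough describes — a tracker that wraps each pipeline step and publishes a summary after execution — can be sketched as follows. This is a hypothetical, self-contained illustration: the names `QueryExecTracker`, `execute_func`, `publish`, and `run_pipeline` are assumptions based on the walkthrough, not the project's actual API.

```python
import time
from typing import Any, Callable


class QueryExecTracker:
    """Records each pipeline step so a summary can be published afterwards."""

    def __init__(self) -> None:
        self.steps: list = []

    def execute_func(self, func: Callable, *args: Any, **kwargs: Any) -> Any:
        # Run one pipeline step, timing it and recording whether it succeeded.
        start = time.time()
        try:
            result = func(*args, **kwargs)
            self.steps.append(
                {"func": func.__name__, "success": True, "seconds": time.time() - start}
            )
            return result
        except Exception:
            self.steps.append(
                {"func": func.__name__, "success": False, "seconds": time.time() - start}
            )
            raise

    def publish(self) -> dict:
        # The real tracker would presumably send this summary to a logging
        # endpoint; returning it keeps the sketch self-contained.
        return {"steps": self.steps}


def run_pipeline(tracker: QueryExecTracker, steps: list) -> dict:
    for step in steps:
        tracker.execute_func(step)
    # Publish the tracker results after pipeline execution, as this PR adds.
    return tracker.publish()
```

Routing every step through `execute_func` is what makes the final `publish()` call meaningful: the tracker has seen every step, so the published summary covers the whole run.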

Changes

  • pandasai/smart_datalake/__init__.py: Added a line to publish a query tracker after pipeline execution.
  • tests/test_smartdatalake.py: Added a test to verify query_tracker_publish is called in the chat method.
  • pandasai/pipelines/.../code_execution.py: Modified the execute method to use query_exec_tracker.execute_func.
  • tests/pipelines/.../test_code_execution.py: Updated mocks and added new tests for code execution tracking.

🐇 In the code's weave, a tracker's tale, 🍂
Through autumn's brisk air, our updates sail. 🌬️
With each line we refine, our pipeline's dance,
Ensuring each query has a fair chance. 🌟


Tips

Chat with CodeRabbit Bot (@coderabbitai)

  • If you reply to a review comment from CodeRabbit, the bot will automatically respond.
  • To engage with CodeRabbit bot directly around the specific lines of code in the PR, mention @coderabbitai in your review comment
  • Note: Review comments are made on code diffs or files, not on the PR overview.
  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Note: For conversation with the bot, please use the review comments on code diffs or files.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

Contributor

@coderabbitai coderabbitai bot left a comment


Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits: files that changed from the base of the PR, between dcc660b and e49ce0e.
Files selected for processing (2)
  • pandasai/smart_datalake/__init__.py (1 hunk)
  • tests/test_smartdatalake.py (1 hunks)
Files skipped from review due to trivial changes (1)
  • pandasai/smart_datalake/__init__.py

Comment on lines 149 to +170
"value": "There are 10 countries in the dataframe.",
}

@patch.object(
CodeManager,
"execute_code",
return_value={
"type": "string",
"value": "There are 10 countries in the dataframe.",
},
)
@patch("pandasai.helpers.query_exec_tracker.QueryExecTracker.publish")
def test_query_tracker_publish_called_in_chat_method(
self, mock_query_tracker_publish, _mocked_method, smart_datalake: SmartDatalake
):
assert smart_datalake.last_result is None

_mocked_method.__name__ = "execute_code"

smart_datalake.chat("How many countries are in the dataframe?")
mock_query_tracker_publish.assert_called()


The new test method test_query_tracker_publish_called_in_chat_method is correctly structured to test the invocation of the query_tracker_publish function. However, there are a few points to consider:

  1. The test does not verify the parameters with which mock_query_tracker_publish is called. It is important to ensure that the correct arguments are being passed to the function.
  2. The test assumes that smart_datalake.last_result is None before the chat method is called, which is good. However, it does not check the state of last_result after the chat method is called to ensure that it has been updated correctly.
  3. The test uses patching correctly, but it's not clear if the mock_query_tracker_publish is being reset after each test. This could potentially affect other tests if the mock state is not isolated.

Consider adding assertions to check the arguments passed to mock_query_tracker_publish and the state of last_result after the chat method is called. Also, ensure that mocks are reset after each test to maintain test isolation.

-        smart_datalake.chat("How many countries are in the dataframe?")
+        question = "How many countries are in the dataframe?"
+        smart_datalake.chat(question)
+        mock_query_tracker_publish.assert_called_with(question, ANY)
+        assert smart_datalake.last_result is not None

Committable suggestion

[!IMPORTANT]
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
"value": "There are 10 countries in the dataframe.",
}
@patch.object(
CodeManager,
"execute_code",
return_value={
"type": "string",
"value": "There are 10 countries in the dataframe.",
},
)
@patch("pandasai.helpers.query_exec_tracker.QueryExecTracker.publish")
def test_query_tracker_publish_called_in_chat_method(
self, mock_query_tracker_publish, _mocked_method, smart_datalake: SmartDatalake
):
assert smart_datalake.last_result is None
_mocked_method.__name__ = "execute_code"
smart_datalake.chat("How many countries are in the dataframe?")
mock_query_tracker_publish.assert_called()
@patch.object(
CodeManager,
"execute_code",
return_value={
"type": "string",
"value": "There are 10 countries in the dataframe.",
},
)
@patch("pandasai.helpers.query_exec_tracker.QueryExecTracker.publish")
def test_query_tracker_publish_called_in_chat_method(
self, mock_query_tracker_publish, _mocked_method, smart_datalake: SmartDatalake
):
assert smart_datalake.last_result is None
_mocked_method.__name__ = "execute_code"
question = "How many countries are in the dataframe?"
smart_datalake.chat(question)
mock_query_tracker_publish.assert_called_with(question, ANY)
assert smart_datalake.last_result is not None

@codecov-commenter

codecov-commenter commented Nov 22, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison: base (dcc660b) 85.09% vs. head (726cfaf) 85.09%.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #773   +/-   ##
=======================================
  Coverage   85.09%   85.09%           
=======================================
  Files          88       88           
  Lines        3824     3825    +1     
=======================================
+ Hits         3254     3255    +1     
  Misses        570      570           

☔ View full report in Codecov by Sentry.

@ArslanSaleem ArslanSaleem marked this pull request as draft November 22, 2023 09:41
@ArslanSaleem ArslanSaleem marked this pull request as ready for review November 22, 2023 11:30
Contributor

@coderabbitai coderabbitai bot left a comment


Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits: files that changed from the base of the PR, between e49ce0e and 726cfaf.
Files selected for processing (2)
  • pandasai/pipelines/smart_datalake_chat/code_execution.py (1 hunks)
  • tests/pipelines/smart_datalake/test_code_execution.py (2 hunks)
Additional comments: 2
pandasai/pipelines/smart_datalake_chat/code_execution.py (1)
  • 42-52: The change to use pipeline_context.query_exec_tracker.execute_func for executing code is a significant improvement in terms of maintainability and encapsulation. It centralizes the execution logic and potentially allows for better tracking and management of code execution within the pipeline context. Ensure that the execute_func method is thoroughly tested, especially for its error handling and retry logic.
tests/pipelines/smart_datalake/test_code_execution.py (1)
  • 100-107: The context.get_intermediate_value method is mocked to return different objects based on the key provided. However, the mock for context._query_exec_tracker and context.query_exec_tracker.execute_func is set up but not used in the test_code_execution_successful_with_no_exceptions method. Ensure that the mocks are being utilized correctly in the test case to reflect the intended behavior.
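The concern raised in the second comment — a mock that is set up but never verified — can be illustrated with a minimal, self-contained sketch. All names here (`run_code_execution`, the `context` layout) are illustrative stand-ins, not the project's actual API:

```python
from unittest.mock import Mock


def run_code_execution(context, code):
    # Stand-in for the pipeline's execute() method: it delegates the actual
    # code execution to the tracker, mirroring the change in this PR.
    return context.query_exec_tracker.execute_func(code)


# Attribute access on a Mock auto-creates child mocks, so the nested
# query_exec_tracker.execute_func path can be configured directly.
context = Mock()
context.query_exec_tracker.execute_func = Mock(return_value="result")

assert run_code_execution(context, "print(1)") == "result"
# This is the kind of check the reviewer asks for: prove the mock was
# actually exercised, and with the expected argument.
context.query_exec_tracker.execute_func.assert_called_once_with("print(1)")
```

Without the final `assert_called_once_with`, the test would pass even if the code under test bypassed the tracker entirely, which is exactly the gap the review comment flags.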

Comment on lines 159 to 181
        raise Exception("Unit test exception")
    return "Mocked Result after retry"

# Conditional return of execute_func method based on the arguments it is called with
def mock_execute_func(*args, **kwargs):
    if isinstance(args[0], Mock) and args[0].name == "execute_code":
        return mock_execute_code(*args, **kwargs)
    else:
        return [
            "Interuppted Code",
            "Exception Testing",
            "Successful after Retry",
        ]

mock_code_manager = Mock()
mock_code_manager.execute_code = Mock(side_effect=mock_execute_code)
mock_code_manager.execute_code = Mock()
mock_code_manager.execute_code.name = "execute_code"

context._query_exec_tracker = Mock()
context.query_exec_tracker.execute_func = Mock(
    return_value=[
        "Interuppted Code",
        "Exception Testing",
        "Successful after Retry",
    ]
)

context.query_exec_tracker.execute_func = Mock(side_effect=mock_execute_func)

def mock_intermediate_values(key: str):

The mock_execute_func is designed to simulate different behaviors based on the arguments it receives. However, the conditional logic within mock_execute_func relies on the name attribute of the Mock object, which is treated specially by the Mock constructor and behaves as an ordinary attribute only when assigned after creation. This could lead to unexpected behavior or test failures if the name attribute is not set elsewhere in the code. Consider using a more reliable condition to differentiate the behavior of mock_execute_func.
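A short demonstration of the `name` caveat the reviewer raises: `unittest.mock` consumes a `name` keyword passed to the `Mock` constructor to configure the mock's repr, so it does not become a string attribute; assigning `name` after creation, as the test under review does, behaves as a plain attribute.

```python
from unittest.mock import Mock

# Passing name= to the constructor names the mock for debugging output;
# accessing .name afterwards yields an auto-created child mock, not a str.
constructed = Mock(name="execute_code")
assert not isinstance(constructed.name, str)

# Assigning after creation sets an ordinary attribute, which is what makes
# the `args[0].name == "execute_code"` comparison in the test work.
assigned = Mock()
assigned.name = "execute_code"
assert assigned.name == "execute_code"
```

This is why the test's `mock_code_manager.execute_code.name = "execute_code"` line is load-bearing: drop it, and the comparison in `mock_execute_func` silently falls through to the `else` branch.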

@gventuri gventuri merged commit 431c83c into main Nov 22, 2023