Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[py] BiDi Network implementation of Intercepts and Auth in Python #14592

Open
wants to merge 35 commits into
base: trunk
Choose a base branch
from

Conversation

shbenzer
Copy link
Contributor

@shbenzer shbenzer commented Oct 13, 2024

User description

Add implementations for BiDi Network's Auth and Interception to Python

Description

Added network.py
Added bidi_network_tests.py
updated command.py

Motivation and Context

Issue #13993

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • I have read the contributing document.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

PR Type

Enhancement, Tests


Description

  • Implemented a new Network class to handle network events and requests, supporting BiDi protocol operations.
  • Added command constants and mapped them to HTTP methods for network operations.
  • Integrated the Network class into the WebDriver, allowing network interception and authentication handling.
  • Developed tests to verify the new network functionalities, including request interception and authentication.

Changes walkthrough 📝

Relevant files
Enhancement
network.py
Implement Network class for BiDi protocol support               

py/selenium/webdriver/common/bidi/network.py

  • Implemented the Network class for handling network events and
    requests.
  • Added methods for intercepting requests and handling authentication.
  • Introduced event subscription and callback handling.
  • +94/-0   
    command.py
    Add network command constants for BiDi operations               

    py/selenium/webdriver/remote/command.py

  • Added new command constants for network interception and continuation.

  • +5/-1     
    remote_connection.py
    Map network commands to HTTP endpoints                                     

    py/selenium/webdriver/remote/remote_connection.py

    • Mapped new network commands to HTTP methods and endpoints.
    +5/-0     
    webdriver.py
    Integrate Network class into WebDriver                                     

    py/selenium/webdriver/remote/webdriver.py

  • Integrated Network class into WebDriver for BiDi support.
  • Added a property to access network operations.
  • +7/-0     
    Tests
    bidi_network_tests.py
    Add tests for BiDi network functionalities                             

    py/test/selenium/webdriver/common/bidi_network_tests.py

  • Added tests for network interception and continuation functionalities.
  • Verified handling of authentication and request modifications.
  • +49/-0   

    💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

    Copy link
    Contributor

    codiumai-pr-agent-pro bot commented Oct 13, 2024

    PR Reviewer Guide 🔍

    (Review updated until commit 01f6aaf)

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 PR contains tests
    🔒 Security concerns

    Sensitive information exposure:
    The continue_with_auth method in the Network class (py/selenium/webdriver/common/bidi/network.py) handles username and password. While it's not directly exposing this information, care should be taken to ensure these credentials are not logged or stored insecurely during the authentication process.

    ⚡ Recommended focus areas for review

    Error Handling
    The Network class lacks proper error handling and exception raising. This could lead to silent failures or unexpected behavior when network operations fail.

    Type Hinting
    The Network class methods lack type hints, which could make the code harder to understand and maintain, especially for complex objects like 'conn'.

    Test Coverage
    While basic tests are implemented, there's a lack of comprehensive test cases covering edge cases, error scenarios, and all network events.

    Copy link
    Contributor

    codiumai-pr-agent-pro bot commented Oct 13, 2024

    PR Code Suggestions ✨

    Latest suggestions up to 01f6aaf
    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    Best practice
    Add input validation to network operation methods to prevent issues caused by invalid parameters

    Consider adding input validation for the parameters in methods like
    continue_response, continue_request, and add_intercept. This can help prevent issues
    caused by invalid input and improve the overall robustness of the code.

    py/selenium/webdriver/common/bidi/network.py [39-48]

     def continue_response(self, request_id, status_code, headers=None, body=None):
    +    if not isinstance(request_id, str) or not request_id:
    +        raise ValueError("request_id must be a non-empty string")
    +    if not isinstance(status_code, int) or status_code < 100 or status_code >= 600:
    +        raise ValueError("status_code must be a valid HTTP status code")
         params = {
             'requestId': request_id,
             'status': status_code
         }
         if headers is not None:
    +        if not isinstance(headers, dict):
    +            raise ValueError("headers must be a dictionary")
             params['headers'] = headers
         if body is not None:
    +        if not isinstance(body, (str, bytes)):
    +            raise ValueError("body must be a string or bytes")
             params['body'] = body
         self.conn.execute('network.continueResponse', params)
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: Input validation is crucial for preventing errors and ensuring that methods receive valid data. This suggestion enhances the robustness and reliability of the code by checking parameter types and values before proceeding with network operations.

    8
    Enhancement
    Implement robust error handling for network operations to improve reliability and debugging

    Consider using a more robust error handling mechanism for the network operations.
    Wrap the self.conn.execute() calls in try-except blocks to catch and handle
    potential exceptions, providing more informative error messages or logging.

    py/selenium/webdriver/common/bidi/network.py [48]

    -self.conn.execute('network.continueResponse', params)
    +try:
    +    self.conn.execute('network.continueResponse', params)
    +except Exception as e:
    +    logging.error(f"Failed to continue response: {str(e)}")
    +    raise
    • Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: Adding error handling around network operations can improve the robustness of the code by catching and logging exceptions, which aids in debugging and ensures that the application can handle unexpected issues gracefully.

    7
    Add a method to remove event listeners, improving flexibility in event management

    Consider implementing a method to remove all event listeners for a specific event or
    all events. This would provide more flexibility in managing event subscriptions and
    prevent potential memory leaks.

    py/selenium/webdriver/common/bidi/network.py [91-94]

     def on(self, event, callback):
         event = self.EVENTS.get(event, event)
         self.callbacks[event] = callback
         session_subscribe(self.conn, event, self.handle_event)
     
    +def off(self, event=None):
    +    if event:
    +        event = self.EVENTS.get(event, event)
    +        if event in self.callbacks:
    +            del self.callbacks[event]
    +            session_unsubscribe(self.conn, event, self.handle_event)
    +    else:
    +        for event in list(self.callbacks.keys()):
    +            session_unsubscribe(self.conn, event, self.handle_event)
    +        self.callbacks.clear()
    +
    • Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: Implementing a method to remove event listeners provides better control over event management and can prevent potential memory leaks. This enhancement increases the flexibility and maintainability of the code.

    7
    Expand test coverage to include various scenarios and edge cases for network operations

    Enhance the test coverage by adding more comprehensive test cases for different
    scenarios, including edge cases and error conditions. This will help ensure the
    robustness of the network functionality.

    py/test/selenium/webdriver/common/bidi_network_tests.py [27-32]

    -def test_continue_response(driver, network):
    +@pytest.mark.parametrize("status_code,expected_text", [
    +    (200, "authorized"),
    +    (401, "unauthorized"),
    +    (500, "server error")
    +])
    +def test_continue_response(driver, network, status_code, expected_text):
         network.add_intercept(phases=[Network.PHASES['before_request']])
    -    network.on('before_request', lambda event: network.continue_response(event['requestId'], 200))
    +    network.on('before_request', lambda event: network.continue_response(event['requestId'], status_code))
         
         driver.get(url_for("basicAuth"))
    -    assert driver.find_element_by_tag_name('h1').text == 'authorized'
    +    assert expected_text in driver.page_source.lower()
    • Apply this suggestion
    Suggestion importance[1-10]: 6

    Why: Enhancing test coverage by including different scenarios and edge cases can significantly improve the reliability of the network functionality. This suggestion is valuable for ensuring that the code behaves correctly under various conditions.

    6

    💡 Need additional feedback ? start a PR chat


    Previous suggestions

    ✅ Suggestions up to commit 14bf971
    CategorySuggestion                                                                                                                                    Score
    Possible issue
    Validate input parameters to prevent potential issues with invalid data being passed to the BiDi protocol

    Consider implementing input validation for the add_intercept method to ensure that
    the provided phases are valid according to the PHASES dictionary. This can prevent
    potential issues with invalid phase names being passed to the BiDi protocol.

    py/selenium/webdriver/common/bidi/network.py [68-76]

     def add_intercept(self, phases=None, contexts=None, url_patterns=None):
         if phases is None:
             phases = []
    +    else:
    +        valid_phases = set(self.PHASES.values())
    +        if not all(phase in valid_phases for phase in phases):
    +            raise ValueError(f"Invalid phase(s) provided. Valid phases are: {', '.join(valid_phases)}")
         params = {
             'phases': phases,
             'contexts': contexts,
             'urlPatterns': url_patterns
         }
         self.conn.execute('network.addIntercept', params)
    Suggestion importance[1-10]: 8

    Why: Implementing input validation for the add_intercept method is a valuable enhancement that prevents invalid phase names from causing issues. This suggestion addresses a potential bug and improves the method's robustness.

    8
    Enhancement
    Implement error handling for network operations to improve robustness and prevent unexpected crashes

    Consider using a more robust error handling mechanism for the network operations.
    Wrap the self.conn.execute() calls in try-except blocks to catch and handle
    potential exceptions, ensuring graceful error handling and preventing crashes.

    py/selenium/webdriver/common/bidi/network.py [43-52]

     def continue_response(self, request_id, status_code, headers=None, body=None):
         params = {
             'requestId': request_id,
             'status': status_code
         }
         if headers is not None:
             params['headers'] = headers
         if body is not None:
             params['body'] = body
    -    self.conn.execute('network.continueResponse', params)
    +    try:
    +        self.conn.execute('network.continueResponse', params)
    +    except Exception as e:
    +        logging.error(f"Error in continue_response: {str(e)}")
    +        raise
    Suggestion importance[1-10]: 7

    Why: Adding error handling to network operations can improve the robustness of the code by preventing crashes due to unhandled exceptions. This suggestion is relevant and enhances the reliability of the continue_response method.

    7
    ✅ Enhance test assertions to provide more comprehensive verification of network interception behavior
    Suggestion Impact:The commit added more comprehensive assertions to verify network interception behavior, such as checking the status of responses and the method of requests, which aligns with the suggestion to enhance test assertions.

    code diff:

    +def test_add_response_handler(response):
    +    passed = [False]
     
    -def test_continue_request(driver, network):
    -    network.add_intercept(phases=[Network.PHASES['before_request']])
    -    network.on('before_request', lambda event: network.continue_request(event['requestId'], url=url_for("basicAuth")))
    -    
    -    driver.get(url_for("basicAuth"))
    -    assert driver.find_element_by_tag_name('h1').text == 'authorized'
    +    def callback(event):
    +        if event['response']['status'] == 200:
    +            passed[0] = True
     
    -def test_continue_with_auth(driver, network):
    -    username = 'your_username'
    -    password = 'your_password'
    -    
    -    network.add_intercept(phases=[Network.PHASES['auth_required']])
    -    network.on('auth_required', lambda event: network.continue_with_auth(event['requestId'], username, password))
    -    
    -    driver.get(url_for("basicAuth"))
    -    assert driver.find_element_by_tag_name('h1').text == 'authorized'+    network_response.add_response_handler(callback)
    +    pages.load("basicAuth")
    +    assert passed[0] == True, "Callback was NOT successful"
    +
    +def test_remove_response_handler(response):
    +    passed = [False]
    +
    +    def callback(event):
    +        if event['response']['status'] == 200:
    +            passed[0] = True
    +
    +    network_response.add_response_handler(callback)
    +    network_response.remove_response_handler(callback)
    +    pages.load("basicAuth")
    +    assert passed[0] == False, "Callback was successful"
    +
    +def test_add_request_handler(request):
    +    passed = [False]
    +
    +    def callback(event):
    +        if event['request']['method'] == 'GET':
    +            passed[0] = True
    +
    +    network_request.add_request_handler(callback)
    +    pages.load("basicAuth")
    +    assert passed[0] == True, "Callback was NOT successful"
    +
    +def test_remove_request_handler(request):
    +    passed = [False]
    +
    +    def callback(event):
    +        if event['request']['method'] == 'GET':
    +            passed[0] = True
    +
    +    network_request.add_request_handler(callback)
    +    network_request.remove_request_handler(callback)
    +    pages.load("basicAuth")
    +    assert passed[0] == False, "Callback was successful"
    +
    +def test_add_authentication_handler(network):
    +    network.add_authentication_handler('test','test')
    +    pages.load("basicAuth")
    +    assert driver.find_element_by_tag_name('h1').text == 'authorized', "Authentication was NOT successful"
    +
    +def test_remove_authentication_handler(network):
    +    network.add_authentication_handler('test', 'test')
    +    network.remove_authentication_handler()
    +    pages.load("basicAuth")

    Consider adding more comprehensive assertions to verify the behavior of network
    interceptions. This could include checking the content of intercepted requests or
    responses, verifying headers, or ensuring that the correct number of interceptions
    occur.

    py/test/selenium/webdriver/common/bidi_network_tests.py [29-34]

     def test_continue_response(driver, network):
    +    intercepted_requests = []
         network.add_intercept(phases=[Network.PHASES['before_request']])
    +    network.on('before_request', lambda event: intercepted_requests.append(event))
         network.on('before_request', lambda event: network.continue_response(event['requestId'], 200))
         
         driver.get(url_for("basicAuth"))
         assert driver.find_element_by_tag_name('h1').text == 'authorized'
    +    assert len(intercepted_requests) > 0, "No requests were intercepted"
    +    assert intercepted_requests[0]['request']['url'] == url_for("basicAuth"), "Unexpected URL intercepted"
    Suggestion importance[1-10]: 7

    Why: Adding more comprehensive assertions in tests improves the verification of network interception behavior, ensuring the tests are more thorough and reliable. This suggestion enhances the quality of the test suite.

    7
    Best practice
    Use a context manager for WebDriver instances to ensure proper resource cleanup

    Consider using a context manager for the WebDriver instance to ensure proper
    cleanup, even if an exception occurs during the test. This can help prevent resource
    leaks and improve test reliability.

    py/test/selenium/webdriver/common/bidi_network_tests.py [11-16]

     @pytest.fixture
     def driver():
         options = webdriver.ChromeOptions()
         options.add_argument('--remote-debugging-port=9222')
    -    driver = webdriver.Chrome(options=options)
    -    yield driver
    -    driver.quit()
    +    with webdriver.Chrome(options=options) as driver:
    +        yield driver
    Suggestion importance[1-10]: 6

    Why: Using a context manager for the WebDriver instance ensures proper cleanup and resource management, even if exceptions occur. This suggestion follows best practices for resource handling in tests.

    6

    @shbenzer shbenzer marked this pull request as draft October 18, 2024 22:52
    @shbenzer
    Copy link
    Contributor Author

    shbenzer commented Oct 18, 2024

    was adding this to BUILD.bazel for easy testing:

    py_test(
        name = "bidi_network_tests",
        srcs = ["test/selenium/webdriver/common/bidi_network_tests.py"],
        deps = [
            ":init-tree",
            ":selenium",
            ":webserver",
        ] + TEST_DEPS,
    )
    

    so i could run bazel test //py:bidi_network_tests

    @shbenzer
    Copy link
    Contributor Author

    shbenzer commented Oct 18, 2024

    Think it's ready now... @AutomatedTester could you give it a look?

    @shbenzer shbenzer marked this pull request as ready for review October 18, 2024 23:21
    Copy link
    Contributor

    Persistent review updated to latest commit 01f6aaf

    @shbenzer shbenzer changed the title BiDi Network implementation of Intercepts and Auth in Python [py] BiDi Network implementation of Intercepts and Auth in Python Nov 1, 2024
    Copy link
    Member

    @p0deje p0deje left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    I'm not a Python expert but I ended up being the person who drafted the initial BiDi implementation. I tried to avoid using the async/await pattern because I assumed that it can't be used in a sync code that the majority of Python clients of the library do. Is there any reason to use the pattern in this PR?

    @shbenzer
    Copy link
    Contributor Author

    shbenzer commented Nov 14, 2024

    I'm not a Python expert but I ended up being the person who drafted the initial BiDi implementation. I tried to avoid using the async/await pattern because I assumed that it can't be used in a sync code that the majority of Python clients of the library do. Is there any reason to use the pattern in this PR?

    It has pros and cons, but I think you're right that it isn't needed and could be more trouble than it's worth. I'll remove it for the time being and if we decide later it's an easy shift.

    Copy link
    Member

    @p0deje p0deje left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Can we add some high-level API examples and tests per an original issue?

    • driver.network.add_request_handler(...)
    • driver.network.add_response_handler(...)

    This would allow us to surface the API usage and make sure it works nicely via lambdas.

    py/selenium/webdriver/remote/command.py Outdated Show resolved Hide resolved
    py/selenium/webdriver/remote/remote_connection.py Outdated Show resolved Hide resolved
    py/test/selenium/webdriver/common/bidi_network_tests.py Outdated Show resolved Hide resolved
    @shbenzer
    Copy link
    Contributor Author

    shbenzer commented Nov 14, 2024

    Can we add some high-level API examples and tests per an original issue?

    • driver.network.add_request_handler(...)
    • driver.network.add_response_handler(...)

    This would allow us to surface the API usage and make sure it works nicely via lambdas.

    Could you provide a bit more guidance here? What would those two tests entail that isn't already being covered? the tests I added are an expanded version of the .spec from the ruby implementation

    @p0deje
    Copy link
    Member

    p0deje commented Nov 14, 2024

    I think the current tests expose the low-level API that users are not supposed to call. The high-level API unfortunately is not implemented in any bindings so there is no example to follow. However, the proposal is to allow the add_request_handler to accept a matcher (e.g. URL filter for what requests should be handled) and a callback that received the request to handle. This callable is responsible for continuing/failing the request optionally allowing to mutate it (e.g. add headers).

    def handle_request(request):
        request.headers["X-FOO"] = "bar"
        request.continue()
    
    driver.network.add_request_handler('https://google.com', handle_request)

    This is roughly what we had considered to be a user-facing API.

    @shbenzer
    Copy link
    Contributor Author

    shbenzer commented Nov 14, 2024

    I think the current tests expose the low-level API that users are not supposed to call. The high-level API unfortunately is not implemented in any bindings so there is no example to follow. However, the proposal is to allow the add_request_handler to accept a matcher (e.g. URL filter for what requests should be handled) and a callback that received the request to handle. This callable is responsible for continuing/failing the request optionally allowing to mutate it (e.g. add headers).

    def handle_request(request):
        request.headers["X-FOO"] = "bar"
        request.continue()
    
    driver.network.add_request_handler('https://google.com', handle_request)

    This is roughly what we had considered to be a user-facing API.

    Interesting. I think I understood - what do you think about this most recent push?

    @shbenzer
    Copy link
    Contributor Author

    shbenzer commented Nov 14, 2024

    tests passing locally

    (selenium_dev_curr) (base) simonbenzer@Simons-MBP selenium % bazel test //py:bidi_network_tests                                                
    WARNING: Ignoring JAVA_HOME, because it must point to a JDK, not a JRE.
    Starting local Bazel server and connecting to it...
    WARNING: For repository 'aspect_bazel_lib', the root module requires module version [email protected], but got [email protected] in the resolved dependency graph.
    INFO: Analyzed target //py:bidi_network_tests (395 packages loaded, 10836 targets configured).
    INFO: Found 1 test target...
    Target //py:bidi_network_tests up-to-date:
      bazel-bin/py/bidi_network_tests
    INFO: Elapsed time: 8.816s, Critical Path: 1.27s
    INFO: 29 processes: 4 internal, 25 darwin-sandbox.
    INFO: Build completed successfully, 29 total actions
    //py:bidi_network_tests                                                  PASSED in 0.7s
    

    @p0deje
    Copy link
    Member

    p0deje commented Nov 14, 2024

    Interesting. I think I understood - what do you think about this most recent push?

    Much closer to what we are aiming for! Let's add some abstractions (Request/Response) objects, move continuation there, and write more tests. I've left more detailed comments in the PR review.

    @shbenzer
    Copy link
    Contributor Author

    @p0deje Is this closer?

    @p0deje
    Copy link
    Member

    p0deje commented Nov 14, 2024

    @p0deje Is this closer?

    Not quite, we need to make Request/Response be passed to the callbacks instead of plain event, see #14592 (comment) for the example.

    @shbenzer
    Copy link
    Contributor Author

    shbenzer commented Nov 14, 2024

    @p0deje Is this closer?

    Not quite, we need to make Request/Response be passed to the callbacks instead of plain event, see #14592 (comment) for the example.

    Hmm... Seems I misunderstood. Let me see if I got it now:

    1. move add_response_handler, remove_response_handler, add_request_handler, and remove_request_handler back into Network class
    2. in the callback_on_url_match() function inside each add_handler:
            def callback_on_url_match(data):
                if url_pattern in data['request']['url']:
                    callback(data)
    

    it should create a request/response object, then perform the callback on that object instead of on the event itself.
    3. The response and request classes should just be simple objects with the continue method (and mark continue public). e.g.

    class Request:
        def __init__(self, request_id, url, method, headers, body, network: Network):
            self.request_id = request_id
            self.url = url
            self.method = method
            self.headers = headers
            self.body = body
            self.network = network
    
        def continue_request(self):
          """Continue after sending a request."""
          params = {
              'requestId': self.request_id
          }
          if self.url is not None:
              params['url'] = url
          if self.method is not None:
              params['method'] = method
          if self.headers is not None:
              params['headers'] = headers
          if self.postData is not None:
              params['postData'] = postData
          self.network.conn.execute('network.continueRequest', params)
    

    @shbenzer
    Copy link
    Contributor Author

    @p0deje I did some updates based on my comment above. If I still haven't quite understood, I can just revert the commits lol

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    2 participants