
[security] Update profiler to 0.4.0 #999

Merged: 2 commits into interuss:master on Feb 21, 2024

Conversation

@BenjaminPelletier (Member) commented Feb 7, 2024

This is nominally a one-line PR that changes the profiler dependency from 0.2.0 to 0.4.0. The other changes are due to running go mod tidy -compat=1.17.
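
For reference, a bump like this typically shows up in go.mod roughly as below. This is only a sketch: the module path cloud.google.com/go/profiler is an assumption, since the exact dependency path is not quoted in this thread.

    require cloud.google.com/go/profiler v0.4.0 // previously v0.2.0

Running go mod tidy -compat=1.17 afterwards reconciles go.mod and go.sum (adding or dropping indirect requirements), which accounts for the rest of the diff.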

@BenjaminPelletier (Member, Author)

I'm not sure how to effectively test this change, so if reviewer(s) could consider carefully, that would be appreciated.

@mickmis (Contributor) left a comment

(reviewing upon request of @barroco)

Given that we can't test automatically, we can look at the changelog of the library. My impression is that:

  • there is no breaking change
  • there are a few bug fixes that don't look like they could impact us
  • mostly it is updating dependencies, the most notable one being an upgrade to protobuf-v2

Only the dependency updates could potentially raise red flags.
Now, IIUC, the only thing we do with this library is spawn a routine to generate and upload profiles, with behavior that seems to be determined by the server. Given that, this upgrade should be OK as long as the server supports the new version, which it should, since it is a hosted cloud service.
And even if the change breaks something, the impact should be limited to the profiling feature and should in theory not have a broader effect.
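
For context, here is a minimal Go sketch of how such a profiler is typically started, assuming the dependency is cloud.google.com/go/profiler; the service name and version below are placeholders and the actual DSS wiring may differ. Start spawns a background goroutine that periodically collects and uploads profiles, with the collection schedule driven by the profiling service.

    package main

    import (
        "log"

        "cloud.google.com/go/profiler"
    )

    func main() {
        // Sketch only: profiler.Start launches a background goroutine that
        // collects and uploads profiles; the cadence is controlled server-side.
        cfg := profiler.Config{
            Service:        "dss",   // placeholder service name
            ServiceVersion: "0.0.0", // placeholder version
        }
        if err := profiler.Start(cfg); err != nil {
            // A failure here only affects profiling, not the rest of the service.
            log.Printf("failed to start profiler: %v", err)
        }
        // ... rest of server startup ...
    }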

The only alternative would be to test manually, but it looks like this service is not enabled in any of the example deployment files:


prof_grpc_name: '',

IMO it is low-risk enough to go ahead with the change.

@barroco (Contributor) commented Feb 14, 2024

Successful deployment with profiler enabled:
[Screenshot 2024-02-14 at 16:25:14]

@barroco (Contributor) commented Feb 14, 2024

Tested a deployment on GCP using Terraform and Helm with a DSS image built from this PR branch.
The test was performed using the prober from image interuss/monitoring:v0.3.0 (the image currently used in DSS CI).

Note that all tests passed except test_operation_simple_heavy_traffic_concurrent, which failed due to a timeout. I expect the issue comes from the prober running on my local machine in Switzerland while the cluster is deployed in a US east zone. I will investigate further, test previous images, and file an issue if applicable.

============================= test session starts ==============================
platform linux -- Python 3.11.7, pytest-6.2.4, py-1.11.0, pluggy-0.13.1
rootdir: /app/monitoring/prober
plugins: Faker-8.1.0, mock-3.6.1
collected 315 items

aux_/test_token_validation.py ...                                        [  0%]
aux_/test_version.py .                                                   [  1%]
monitorlib/test_geo.py .                                                 [  1%]
rid/v1/test_isa_expiry.py .......                                        [  3%]
rid/v1/test_isa_simple.py ...................                            [  9%]
rid/v1/test_isa_simple_heavy_traffic_concurrent.py .....                 [ 11%]
rid/v1/test_isa_validation.py .........                                  [ 14%]
rid/v1/test_subscription_isa_interactions.py ......                      [ 16%]
rid/v1/test_subscription_isa_slightly_overlapping.py .....               [ 17%]
rid/v1/test_subscription_simple.py ............                          [ 21%]
rid/v1/test_subscription_validation.py ........                          [ 24%]
rid/v1/test_token_validation.py ......                                   [ 26%]
rid/v2/test_isa_expiry.py .......                                        [ 28%]
rid/v2/test_isa_simple.py ...................                            [ 34%]
rid/v2/test_isa_validation.py .........                                  [ 37%]
rid/v2/test_subscription_isa_interactions.py ......                      [ 39%]
rid/v2/test_subscription_simple.py ............                          [ 42%]
rid/v2/test_subscription_validation.py ........                          [ 45%]
rid/v2/test_token_validation.py ......                                   [ 47%]
scd/test_constraint_simple.py ...................                        [ 53%]
scd/test_constraints_with_subscriptions.py ...........                   [ 56%]
scd/test_operation_references_error_cases.py .......................     [ 64%]
scd/test_operation_references_state_transition.py ......                 [ 66%]
scd/test_operation_simple.py .....................                       [ 72%]
scd/test_operation_simple_heavy_traffic.py ...............               [ 77%]
scd/test_operation_simple_heavy_traffic_concurrent.py ....Fs.            [ 79%]
scd/test_operation_special_cases.py .......                              [ 81%]
scd/test_operations_simple.py ......................                     [ 88%]
scd/test_subscription_queries.py ..........                              [ 92%]
scd/test_subscription_query_time.py .                                    [ 92%]
scd/test_subscription_simple.py ...........                              [ 95%]
scd/test_subscription_update_validation.py ...........                   [ 99%]
scd/test_uss_availability.py ..                                          [100%]

=================================== FAILURES ===================================
__________________________ test_mutate_ops_concurrent __________________________

ids = <function ids.<locals>.<lambda> at 0x2aaaeb2642c0>, scd_api = '1.0.0'
scd_session = <monitoring.monitorlib.infrastructure.UTMClientSession object at 0x2aaaeaf11610>
scd_session_async = <monitoring.monitorlib.infrastructure.AsyncUTMTestSession object at 0x2aaaeb219910>

    @for_api_versions(scd.API_0_3_17)
    @default_scope(SCOPE_SC)
    @depends_on(test_create_ops_concurrent)
    def test_mutate_ops_concurrent(ids, scd_api, scd_session, scd_session_async):
        start_time = datetime.datetime.utcnow()
        op_req_map = {}
        op_resp_map = {}
        op_map = {}

        # Build mutate requests
        for idx, op_id in enumerate(map(ids, OP_TYPES)):
            op_req_map[op_id] = _build_mutate_request(
                idx, op_id, op_map, scd_session, scd_api
            )
        assert len(op_req_map) == len(OP_TYPES)

        # Mutate operations in parallel
        loop = asyncio.get_event_loop()
>       results = loop.run_until_complete(
            asyncio.gather(
                *[
                    _put_operation_async(req, op_id, scd_session_async, scd_api, False)
                    for op_id, req in op_req_map.items()
                ]
            )
        )

scd/test_operation_simple_heavy_traffic_concurrent.py:419:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/local/lib/python3.11/asyncio/base_events.py:653: in run_until_complete
    return future.result()
scd/test_operation_simple_heavy_traffic_concurrent.py:158: in _put_operation_async
    result = await scd_session_async.put(req_url, data=req), req_url, req
../monitorlib/infrastructure.py:199: in put
    async with self._client.put(url, **kwargs) as response:
/usr/local/lib/python3.11/site-packages/aiohttp/client.py:1187: in __aenter__
    self._resp = await self._coro
/usr/local/lib/python3.11/site-packages/aiohttp/client.py:601: in _request
    await resp.start(conn)
/usr/local/lib/python3.11/site-packages/aiohttp/client_reqrep.py:960: in start
    with self._timer:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <aiohttp.helpers.TimerContext object at 0x2aaaeb21b010>
exc_type = <class 'asyncio.exceptions.CancelledError'>
exc_val = CancelledError(), exc_tb = <traceback object at 0x2aaaeaf3d200>

    def __exit__(
        self,
        exc_type: Optional[Type[BaseException]],
        exc_val: Optional[BaseException],
        exc_tb: Optional[TracebackType],
    ) -> Optional[bool]:
        if self._tasks:
            self._tasks.pop()

        if exc_type is asyncio.CancelledError and self._cancelled:
>           raise asyncio.TimeoutError from None
E           TimeoutError

/usr/local/lib/python3.11/site-packages/aiohttp/helpers.py:735: TimeoutError
----------------------------- Captured stdout call -----------------------------
0 (2d): (-56.0, 178.0) 0.0-120.0 2024-02-14T17:38:42.682161Z-2024-02-14T18:38:42.682161Z
1 (2d): (-56.1, 178.0) 0.0-120.0 2024-02-14T17:38:42.817100Z-2024-02-14T18:38:42.817100Z
2 (2d): (-56.2, 178.0) 0.0-120.0 2024-02-14T17:38:42.943341Z-2024-02-14T18:38:42.943341Z
3 (2d): (-56.3, 178.0) 0.0-120.0 2024-02-14T17:38:43.076327Z-2024-02-14T18:38:43.076327Z
4 (2d): (-56.4, 178.0) 0.0-120.0 2024-02-14T17:38:43.209786Z-2024-02-14T18:38:43.209786Z
5 (2d): (-56.5, 178.0) 0.0-120.0 2024-02-14T17:38:43.345533Z-2024-02-14T18:38:43.345533Z
6 (2d): (-56.6, 178.0) 0.0-120.0 2024-02-14T17:38:43.493194Z-2024-02-14T18:38:43.493194Z
7 (altitude): (-56.0, 178.0) 140.0-159.0 2024-02-14T17:38:43.637506Z-2024-02-14T18:38:43.637506Z
8 (altitude): (-56.0, 178.0) 160.0-179.0 2024-02-14T17:38:43.782837Z-2024-02-14T18:38:43.782837Z
9 (altitude): (-56.0, 178.0) 180.0-199.0 2024-02-14T17:38:43.934621Z-2024-02-14T18:38:43.934621Z
10 (altitude): (-56.0, 178.0) 200.0-219.0 2024-02-14T17:38:44.079218Z-2024-02-14T18:38:44.079218Z
11 (altitude): (-56.0, 178.0) 220.0-239.0 2024-02-14T17:38:44.229315Z-2024-02-14T18:38:44.229315Z
12 (altitude): (-56.0, 178.0) 240.0-259.0 2024-02-14T17:38:44.384903Z-2024-02-14T18:38:44.384903Z
13 (altitude): (-56.0, 178.0) 260.0-279.0 2024-02-14T17:38:44.540394Z-2024-02-14T18:38:44.540394Z
14 (time): (-56.0, 178.0) 0.0-120.0 2024-02-14T22:18:44.698535Z-2024-02-14T22:27:44.698535Z
15 (time): (-56.0, 178.0) 0.0-120.0 2024-02-14T22:38:44.845532Z-2024-02-14T22:47:44.845532Z
16 (time): (-56.0, 178.0) 0.0-120.0 2024-02-14T22:58:44.985477Z-2024-02-14T23:07:44.985477Z
17 (time): (-56.0, 178.0) 0.0-120.0 2024-02-14T23:18:45.143788Z-2024-02-14T23:27:45.143788Z
18 (time): (-56.0, 178.0) 0.0-120.0 2024-02-14T23:38:45.300040Z-2024-02-14T23:47:45.300040Z
19 (time): (-56.0, 178.0) 0.0-120.0 2024-02-14T23:58:45.456198Z-2024-02-15T00:07:45.456198Z
=============================== warnings summary ===============================
../../../usr/local/lib/python3.11/site-packages/pvlib/tools.py:7
  /usr/local/lib/python3.11/site-packages/pvlib/tools.py:7: DeprecationWarning:
  Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
  (to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
  but was not found to be installed on your system.
  If this would cause problems for you,
  please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466

    import pandas as pd

-- Docs: https://docs.pytest.org/en/stable/warnings.html
--------------------- generated xml file: /app/test_result ---------------------
=========================== short test summary info ============================
SKIPPED [1] infrastructure.py:46: Prerequisite task did not pass
======= 1 failed, 313 passed, 1 skipped, 1 warning in 319.78s (0:05:19) ========
+ '[' '' == true ']'
+ echo 'Prober did not succeed.'
Prober did not succeed.
+ exit 1

@barroco (Contributor) commented Feb 15, 2024

Note that the problem reported above does not seem to be related to this particular change. The same check fails for the following DSS image deployments:

  • This PR with profiler enabled: fails
  • This PR without profiler: fails
  • interuss/dss:v0.10.0-rc1: fails
  • interuss/dss:v0.9.0-rc1: fails
  • interuss/dss:v0.8.0-rc2: fails
  • interuss/dss:v0.7.0: fails

An issue has been created to address this particular problem: #1002

@BenjaminPelletier (Member, Author)

Per meeting yesterday, my understanding is that we believe the concurrency latency issue is separate and should not block this PR.

@BenjaminPelletier merged commit 8557154 into interuss:master on Feb 21, 2024
6 checks passed
@BenjaminPelletier deleted the update-profiler branch on February 21, 2024 at 17:08