feat: set default `maxTimeMS` when creating an index; unhandled exc on `setup()` will exit `kytosd` #414

viniarck · 2023-09-26T17:43:25Z

Closes #413 (check out the last comments there for more information)
Partly addresses #314

Summary

See updated changelog file

Local Tests

Simulated a Mongo OperationFailure (equivalent to index creation op error), confirmed the retries also worked as expected, and confirmed the traceback was included (this will generate some extra information, but when this happens, sometimes it can be an actual bug, so the traceback will be needed for reporting a bug):

Web update - Web UI was not updated
2023-09-26 14:28:47,620 - INFO [kytos.core.db] (MainThread) Starting DB connection
2023-09-26 14:28:47,621 - INFO [kytos.core.db] (MainThread) Trying to run 'hello' command on MongoDB...
2023-09-26 14:28:47,631 - INFO [kytos.core.db] (MainThread) Ran 'hello' command on MongoDB successfully. It's ready!
2023-09-26 14:28:47,631 - WARNING [kytos.core.retry] (MainThread) Retry #1 for bootstrap_index, args: (<class 'kytos.core.db.Mongo'>, 'users', [('username', 1)]), kwargs: {'unique': Tru
e}, seconds since start: 0.00
2023-09-26 14:28:48,192 - WARNING [kytos.core.retry] (MainThread) Retry #2 for bootstrap_index, args: (<class 'kytos.core.db.Mongo'>, 'users', [('username', 1)]), kwargs: {'unique': Tru
e}, seconds since start: 0.56
Kytos couldn't start because of some forced OperationFailure Traceback (most recent call last):
  File "/home/viniarck/repos/kytos/kytos/core/controller.py", line 260, in start
    self.start_auth()
  File "/home/viniarck/repos/kytos/kytos/core/controller.py", line 165, in start_auth
    self.auth = Auth(self)
  File "/home/viniarck/repos/kytos/kytos/core/auth.py", line 202, in __init__
    self.user_controller.bootstrap_indexes()
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 324, in wrapped_f
    return self(f, *args, **kw)
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 404, in __call__
    do = self.iter(retry_state=retry_state)
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 349, in iter
    return fut.result()
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 407, in __call__
    result = fn(*args, **kwargs)
  File "/home/viniarck/repos/kytos/kytos/core/retry.py", line 25, in decorated
    return func(*args, **kwargs)
  File "/home/viniarck/repos/kytos/kytos/core/auth.py", line 96, in bootstrap_indexes
    if self.mongo.bootstrap_index(collection, keys, **kwargs):
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 324, in wrapped_f
    return self(f, *args, **kw)
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 404, in __call__
    do = self.iter(retry_state=retry_state)
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 360, in iter
    raise retry_exc.reraise()
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 193, in reraise
    raise self.last_attempt.result()
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/home/viniarck/repos/kytos/.direnv/python-3.9/lib/python3.9/site-packages/tenacity/__init__.py", line 407, in __call__
    result = fn(*args, **kwargs)
  File "/home/viniarck/repos/kytos/kytos/core/db.py", line 121, in bootstrap_index
    raise OperationFailure("some forced OperationFailure")
pymongo.errors.OperationFailure: some forced OperationFailure

Task was destroyed but it is pending!
task: <Task pending name='Task-1' coro=<start_shell_async() running at /home/viniarck/repos/kytos/kytos/core/kytosd.py:120>>
sys:1: RuntimeWarning: coroutine 'start_shell_async' was never awaited
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

Simulated a generic error, undefined variable on flow_manager, noticed that the traeback was chained correctly just so you can trace it completely:

Web update - Web UI was not updated
2023-09-26 14:35:31,877 - INFO [kytos.core.db] (MainThread) Starting DB connection
2023-09-26 14:35:31,878 - INFO [kytos.core.db] (MainThread) Trying to run 'hello' command on MongoDB...
2023-09-26 14:35:31,889 - INFO [kytos.core.db] (MainThread) Ran 'hello' command on MongoDB successfully. It's ready!
2023-09-26 14:35:31,901 - INFO [kytos.core.api_server] (MainThread) Started /api/kytos/core/auth/login/ - GET
2023-09-26 14:35:31,901 - INFO [kytos.core.api_server] (MainThread) Started /api/kytos/core/auth/users/ - GET
2023-09-26 14:35:31,901 - INFO [kytos.core.api_server] (MainThread) Started /api/kytos/core/auth/users/{username} - GET
2023-09-26 14:35:31,901 - INFO [kytos.core.api_server] (MainThread) Started /api/kytos/core/auth/users/ - POST
2023-09-26 14:35:31,901 - INFO [kytos.core.api_server] (MainThread) Started /api/kytos/core/auth/users/{username} - DELETE
2023-09-26 14:35:31,901 - INFO [kytos.core.api_server] (MainThread) Started /api/kytos/core/auth/users/{username} - PATCH
2023-09-26 14:35:31,901 - INFO [kytos.core.controller] (MainThread) /home/viniarck/repos/kytos/.direnv/python-3.9/var/run/kytos
2023-09-26 14:35:31,901 - INFO [kytos.core.controller] (MainThread) Starting Kytos - Kytos Controller
2023-09-26 14:35:31,903 - INFO [kytos.core.controller] (MainThread) Starting TCP server: <kytos.core.atcp_server.KytosServer object at 0x7f540d51d490>
2023-09-26 14:35:31,903 - INFO [kytos.core.atcp_server] (MainThread) Kytos listening at 0.0.0.0:6653
2023-09-26 14:35:31,903 - INFO [kytos.core.controller] (MainThread) Loading Kytos NApps...
2023-09-26 14:35:31,905 - INFO [kytos.core.napps.napp_dir_listener] (MainThread) NAppDirListener Started...
2023-09-26 14:35:31,907 - INFO [kytos.core.controller] (MainThread) Loading NApp kytos/flow_manager
Kytos couldn't start because of KytosNAppSetupException: NApp kytos/flow_manager exception name 'undef_var' is not defined  Traceback (most recent call last):
  File "/home/viniarck/repos/kytos/kytos/core/controller.py", line 863, in load_napp
    napp = napp_module.Main(controller=self)
  File "/home/viniarck/repos/kytos/kytos/core/napps/base.py", line 194, in __init__
    self.setup()
  File "/home/viniarck/repos/napps/napps/kytos/flow_manager/main.py", line 90, in setup
    undef_var
NameError: name 'undef_var' is not defined

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/viniarck/repos/kytos/kytos/core/controller.py", line 265, in start
    self.start_controller()
  File "/home/viniarck/repos/kytos/kytos/core/controller.py", line 359, in start_controller
    self.load_napps()
  File "/home/viniarck/repos/kytos/kytos/core/controller.py", line 898, in load_napps
    self.load_napp(napp.username, napp.name)
  File "/home/viniarck/repos/kytos/kytos/core/controller.py", line 866, in load_napp
    raise KytosNAppSetupException(msg) from exc
kytos.core.exceptions.KytosNAppSetupException: KytosNAppSetupException: NApp kytos/flow_manager exception name 'undef_var' is not defined 

Task was destroyed but it is pending!
task: <Task pending name='Task-1' coro=<start_shell_async() running at /home/viniarck/repos/kytos/kytos/core/kytosd.py:120>>
2023-09-26 14:35:32,041 - ERROR [kytos.core.controller] (MainThread) 1
2023-09-26 14:35:32,041 - INFO [kytos.core.controller] (MainThread) Shutting down Kytos...
/home/viniarck/repos/kytos/kytos/core/kytosd.py:98: RuntimeWarning: coroutine 'start_shell_async' was never awaited
  async_main(config)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

End-to-End Tests

I'll dispatch an exec on GitLab, I'll post the results here later:

============================= test session starts ==============================
platform linux -- Python 3.9.2, pytest-7.2.0, pluggy-1.3.0
rootdir: /builds/amlight/kytos-end-to-end-tester/kytos-end-to-end-tests
plugins: rerunfailures-10.2, timeout-2.1.0, anyio-3.6.2
collected 242 items
tests/test_e2e_01_kytos_startup.py ..                                    [  0%]
tests/test_e2e_05_topology.py ..................                         [  8%]
tests/test_e2e_10_mef_eline.py ..........ss.....x.....x................  [ 24%]
tests/test_e2e_11_mef_eline.py ......                                    [ 27%]
tests/test_e2e_12_mef_eline.py .....Xx.                                  [ 30%]
tests/test_e2e_13_mef_eline.py .....xs.s......xs.s.XXxX.xxxx..X......... [ 47%]
...                                                                      [ 48%]
tests/test_e2e_14_mef_eline.py x                                         [ 49%]
tests/test_e2e_15_mef_eline.py ..                                        [ 50%]
tests/test_e2e_20_flow_manager.py .....................                  [ 58%]
tests/test_e2e_21_flow_manager.py ...                                    [ 59%]
tests/test_e2e_22_flow_manager.py ...............                        [ 66%]
tests/test_e2e_23_flow_manager.py ..............                         [ 71%]
tests/test_e2e_30_of_lldp.py ....                                        [ 73%]
tests/test_e2e_31_of_lldp.py ...                                         [ 74%]
tests/test_e2e_32_of_lldp.py ...                                         [ 76%]
tests/test_e2e_40_sdntrace.py .............                              [ 81%]
tests/test_e2e_41_kytos_auth.py ........                                 [ 84%]
tests/test_e2e_50_maintenance.py ........................                [ 94%]
tests/test_e2e_60_of_multi_table.py .....                                [ 96%]
tests/test_e2e_70_kytos_stats.py ........                                [100%]
=============================== warnings summary ===============================
= 220 passed, 6 skipped, 11 xfailed, 5 xpassed, 867 warnings in 11472.15s (3:11:12) =

parametrized maxTimeMS with env var MONGO_IDX_TIMEOUTMS

italovalcy

LGTM, just left a comment regarding print (not sure it was on purpose). Once the end-to-end tests finish, hopefully with good results, you have my approval.

italovalcy · 2023-09-26T17:47:06Z

kytos/core/controller.py

+        except Exception as exc:
+            exc_fmt = traceback.format_exc(chain=True)
+            message = f"Kytos couldn't start because of {str(exc)} {exc_fmt}"
+            print(message)


Just want to double check if this should be print instead of self.log.critical

Hi @italovalcy, yes, initially it was self.log.exception (relatively similar to self.log.critical), but I had to revert it on f4432f9. In summary, at this point here, since we're using QueueListener by default and it's threaded, it might end up not ready yet, so even a self.log.listener.stop() doesn't guarantee that it'd print out correctly (so it might lose the message before emiting the event or processing it). So, print ensure it's deterministic here, similarly to other parts of the code were either the log isn't ready or setup yet https://github.com/kytos-ng/kytos/blob/master/kytos/core/controller.py#L175

viniarck · 2023-09-26T17:59:07Z

kytos/core/controller.py

+        # pylint: disable=broad-except
+        try:
+            if self.options.database:
+                self.db_conn_or_core_shutdown()


There's room to also allow both db_conn_or_core_shutdown and init_apm_or_core_shutdown to reraise as mentioned on #415, but I'll only address in the feature since it's just a chore, and that way it also minimizes the chunks of code that need to be backported.

viniarck · 2023-09-27T13:21:14Z

Thanks for your review, Italo. e2e tests have passed with this branch. I'll merge it soon.

viniarck added 8 commits September 26, 2023 13:40

feat: added KytosNAppSetupException exception

78ee275

feat: added retries for bootstrap_index

6509119

parametrized maxTimeMS with env var MONGO_IDX_TIMEOUTMS

feat: handled broad exceptions on controller.start

fb789dd

tests: augmented create_index unit test

a108f61

tests: augmented controller unit tests accordingly

6f7209c

Updated CHANGELOG.rst

34dd4d3

feat: replaced self.log.exception with print trace to be deterministic

f4432f9

tests: refactored unit test accordingly

af3568c

viniarck requested a review from italovalcy September 26, 2023 17:43

viniarck requested a review from a team as a code owner September 26, 2023 17:43

italovalcy approved these changes Sep 26, 2023

View reviewed changes

viniarck mentioned this pull request Sep 26, 2023

chore: let db_conn_or_core_shutdown and init_apm_or_core_shutdown reraise their exceptions #415

Closed

viniarck commented Sep 26, 2023

View reviewed changes

fixed typo on changelog.rst

54ad9ab

This was referenced Sep 26, 2023

[2022.3.3] feat: set default maxTimeMS when creating an index; unhandled exc on setup() will exit kytosd #416

Merged

[2023.1.1] feat: set default maxTimeMS when creating an index; unhandled exc on setup() will exit kytosd #417

Merged

viniarck merged commit 7a2daf6 into master Sep 27, 2023
1 check passed

viniarck deleted the feat/load_napps_or_shutdown branch September 27, 2023 17:58

This was referenced Sep 28, 2023

how to know if the topology was loaded successfully and what to do otherwise kytos-ng/topology#163

Closed

enhancement: try to find a more graceful way to stop QueueListener when stopping kytosd #418

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: set default `maxTimeMS` when creating an index; unhandled exc on `setup()` will exit `kytosd` #414

feat: set default `maxTimeMS` when creating an index; unhandled exc on `setup()` will exit `kytosd` #414

viniarck commented Sep 26, 2023 •

edited

Loading

italovalcy left a comment

italovalcy Sep 26, 2023

viniarck Sep 26, 2023

viniarck Sep 26, 2023

viniarck commented Sep 27, 2023

feat: set default maxTimeMS when creating an index; unhandled exc on setup() will exit kytosd #414

feat: set default maxTimeMS when creating an index; unhandled exc on setup() will exit kytosd #414

Conversation

viniarck commented Sep 26, 2023 • edited Loading

Summary

Local Tests

End-to-End Tests

italovalcy left a comment

Choose a reason for hiding this comment

italovalcy Sep 26, 2023

Choose a reason for hiding this comment

viniarck Sep 26, 2023

Choose a reason for hiding this comment

viniarck Sep 26, 2023

Choose a reason for hiding this comment

viniarck commented Sep 27, 2023

feat: set default `maxTimeMS` when creating an index; unhandled exc on `setup()` will exit `kytosd` #414

feat: set default `maxTimeMS` when creating an index; unhandled exc on `setup()` will exit `kytosd` #414

viniarck commented Sep 26, 2023 •

edited

Loading