Handle case when there is no running event loop #220

Merged
kat-wicks merged 5 commits into main on May 6, 2024

Conversation

kat-wicks
Contributor

As seen in this on-call alert (see logs below), calling get_event_loop during TLM initialization may fail when there is no running event loop. This is because get_event_loop creates a new event loop only when it is called from the main thread; from any other thread it raises a RuntimeError. We set the event loop explicitly in TLM initialization and call loop.run_until_complete throughout the class (i.e., get_event_loop is not used outside of initialization), so setting self._loop to the result of new_event_loop is fine.
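To make the failure mode concrete, here is a minimal sketch of the behavior described above. `LoopOwner`, `answer`, and `worker` are hypothetical stand-ins rather than the actual TLM code; the only assumption is that the class keeps a private loop and drives every coroutine through `run_until_complete`.

```python
import asyncio
import threading


class LoopOwner:
    """Hypothetical stand-in for a class that owns a private event loop."""

    def __init__(self) -> None:
        # new_event_loop() does not consult the thread's current loop,
        # so it succeeds in any thread, not only the main one.
        self._loop = asyncio.new_event_loop()

    def run(self, coro):
        # All async work goes through the privately owned loop.
        return self._loop.run_until_complete(coro)

    def close(self) -> None:
        self._loop.close()


async def answer() -> int:
    await asyncio.sleep(0)
    return 42


def worker(results: dict) -> None:
    # In a non-main thread with no loop set, get_event_loop() raises.
    try:
        asyncio.get_event_loop()
        results["get_event_loop"] = "ok"
    except RuntimeError as exc:
        results["get_event_loop"] = f"RuntimeError: {exc}"

    # Creating a fresh loop works regardless of the calling thread.
    owner = LoopOwner()
    try:
        results["value"] = owner.run(answer())
    finally:
        owner.close()


results: dict = {}
thread = threading.Thread(target=worker, args=(results,), name="worker-thread")
thread.start()
thread.join()
```

After the thread finishes, `results["get_event_loop"]` holds the captured RuntimeError message and `results["value"]` holds the coroutine's return value, which mirrors the error in the logs and the behavior after the fix.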

Alternatives:

  • use asyncio.run in place of loop.run_until_complete
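For comparison, the asyncio.run alternative might look roughly like the sketch below. `fetch_score` and `score_sync` are hypothetical names used only for illustration and are not part of the TLM API.

```python
import asyncio
import threading


async def fetch_score(prompt: str) -> float:
    # Hypothetical coroutine standing in for an async API call.
    await asyncio.sleep(0)
    return float(len(prompt))


def score_sync(prompt: str) -> float:
    # asyncio.run() creates a new event loop, runs the coroutine to
    # completion, and closes the loop, so no loop needs to be stored.
    return asyncio.run(fetch_score(prompt))


scores: list = []
thread = threading.Thread(target=lambda: scores.append(score_sync("hello")))
thread.start()
thread.join()
```

The trade-off is that asyncio.run builds and tears down a loop on every call, and it raises a RuntimeError if another event loop is already running in the same thread, whereas a stored loop is created once and reused.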
Datadog logs

{
"id": "AgAAAY8YgItVj7gwYwAAAAAAAAAYAAAAAEFZOFlnSTVJQUFEaVBHdFVaUjB5MWdBQQAAACQAAAAAMDE4ZjE4ODgtNjYyMi00MDA4LWJmMTMtMmQ3NTYwYWU3YzRm",
"content": {
"timestamp": "2024-04-26T03:43:48.565Z",
"tags": [
"cluster_name:cleanlab-studio-training-cluster",
"region:us-east-1",
"task_arn:cleanlab-studio-training-cluster/62f421e6d8424ebbb882dcfa812b36db",
"source:fargate_backend_container",
"task_version:452",
"task_family:cleanlabpipelinestackcleanlabstudioproductionstagecleanlabtrainingstackfargatebackendtaska425bcfa",
"container_id:62f421e6d8424ebbb882dcfa812b36db-640160366",
"container_name:fargate_backend_container",
"env:production",
"datadog.submission_auth:api_key"
],
"service": "FARGATE_BACKEND_CONTAINER",
"message": "CLI error: os: Linux\nos_release: 5.15.0-1058-aws\npython_version: 3.10.10 (main, Mar 21 2023, 18:45:11) [GCC 11.2.0]\ncli_version: 2.0.2\ndependencies: {'aiohttp': '3.9.1', 'Click': None, 'colorama': '0.4.6', 'ijson': '3.2.3', 'jsonstreams': '0.6.0', 'nest-asyncio': '1.5.8', 'openpyxl': '3.1.2', 'pandas': '2.1.4', 'Pillow': None, 'pyexcel': '0.7.0', 'pyexcel-xls': '0.7.0', 'pyexcel-xlsx': '0.6.0', 'requests': '2.31.0', 'semver': '2.13.0', 'tqdm': '4.66.1', 'typing-extensions': None, 'validators': '0.22.0'}\nfunc_name: TLM\napi_key: 27834d86d84845ecb8cfb94eb45c8fce\nstack_trace: File \n result = func(*args, **kwargs)\n File \n return trustworthy_language_model.TLM(\n File \n self._event_loop = asyncio.get_event_loop()\n File \n raise RuntimeError('There is no current event loop in thread %r.'\nRuntimeError: There is no current event loop in thread 'ScriptRunner.scriptThread'.\n\nerror_type: RuntimeError\nis_handled_error: False",
"attributes": {
"dd": {
"service": "FARGATE_BACKEND_CONTAINER",
"env": "production",
"version": ""
},
"remote_addr": "52.90.210.70",
"method": "POST",
"source": "stdout",
"request_json": {
"os": "Linux",
"api_key": "27834d86d84845ecb8cfb94eb45c8fce",
"python_version": "3.10.10 (main, Mar 21 2023, 18:45:11) [GCC 11.2.0]",
"error_type": "RuntimeError",
"is_handled_error": false,
"os_release": "5.15.0-1058-aws",
"cli_version": "2.0.2",
"stack_trace": "File \n result = func(*args, **kwargs)\n File \n return trustworthy_language_model.TLM(\n File \n self._event_loop = asyncio.get_event_loop()\n File \n raise RuntimeError('There is no current event loop in thread %r.'\nRuntimeError: There is no current event loop in thread 'ScriptRunner.scriptThread'.\n",
"dependencies": {
"validators": "0.22.0",
"ijson": "3.2.3",
"requests": "2.31.0",
"pandas": "2.1.4",
"tqdm": "4.66.1",
"pyexcel": "0.7.0",
"colorama": "0.4.6",
"nest-asyncio": "1.5.8",
"semver": "2.13.0",
"aiohttp": "3.9.1",
"jsonstreams": "0.6.0",
"pyexcel-xls": "0.7.0",
"pyexcel-xlsx": "0.6.0",
"openpyxl": "3.1.2"
},
"func_name": "TLM"
},
"filename": "cli_api.py",
"lineno": 302,
"route": "/api/cli/v0/telemetry",
"service": "FARGATE_BACKEND_CONTAINER",
"name": "backend.api.cli.cli_api",
"levelname": "ERROR",
"asctime": "2024-04-26 03:43:48,565",
"request_id": "73d8232fcd5642f696cc89cbcdb30a34",
"timestamp": 1714103028565
}
}
}

@axl1313
Collaborator

axl1313 commented Apr 30, 2024

Curious how you've tested this PR?

@kat-wicks kat-wicks requested review from axl1313 and removed request for axl1313 May 3, 2024 18:43
Collaborator

@axl1313 axl1313 left a comment


Was able to reproduce error and test fix by running TLM in AnyIO thread

@kat-wicks kat-wicks merged commit 74b7479 into main May 6, 2024
23 checks passed

2 participants