diff --git a/docs/develop/python/python-sdk-sync-vs-async.mdx b/docs/develop/python/python-sdk-sync-vs-async.mdx index 080001376f..1c1880a334 100644 --- a/docs/develop/python/python-sdk-sync-vs-async.mdx +++ b/docs/develop/python/python-sdk-sync-vs-async.mdx @@ -56,7 +56,7 @@ into a synchronous program that executes serially, defeating the entire purpose of using `asyncio`. This can also lead to potential deadlock, and unpredictable behavior that causes tasks to be unable to execute. Debugging these issues can be difficult and time consuming, as locating the source of the blocking call might not always -be immediately obvious. +be immediately evident. Due to this, Python developers must be extra careful to not make blocking calls from within an asynchronous Activity, or use an async safe library to perform @@ -67,65 +67,79 @@ asynchronous Activity would lead to blocking your event loop. If you want to mak an HTTP call from within an asynchronous Activity, you should use an async-safe HTTP library such as `aiohttp` or `httpx`. Otherwise, use a synchronous Activity. -## Implementing Asynchronous Activities +## Python SDK Worker Execution Architecture -The following code is an asynchronous Activity Definition that's similar to one -you will use during an upcoming exercise. Like the Workflow Definition -you've already run, it takes a name (`str`) as input and returns a -customized greeting (`str`) as output. However, this Activity makes -a call to a microservice, accessed through HTTP, to request this -greeting in Spanish. This activity uses the `aiohttp` library to make an async -safe HTTP request. Using the `requests` library here would have resulting in -blocking code within the async event loop, which will block the entire async -event loop. For more in-depth information about this issue, refer to the -[Python asyncio documentation](https://docs.python.org/3/library/asyncio-dev.html#running-blocking-code). +Python workers have following components for executing code: -The code below also implements the Activity Definition as a class, rather than a -function. The `aiohttp` library requires an established `Session` to perform the -HTTP request. It would be inefficient to establish a `Session` every time an -Activity is invoked, so instead this code accepts a `Session` object as an instance -parameter and makes it available to the methods. This approach will also be -beneficial when the execution is over and the `Session` needs to be closed. +- Your event loop, which runs Tasks from async Activities **plus the rest of the Temporal Worker, such as communicating with the server**. +- An executor for executing Activity Tasks from synchronous Activities. A thread pool executor is recommended. +- A thread pool executor for executing Workflow Tasks. -In this example, the Activity supplies the name in the URL and retrieves -the greeting from the body of the response. +> See Also: [docs for](https://python.temporal.io/temporalio.worker.Worker.html#__init__) `worker.__init__()` -```python -import aiohttp -import urllib.parse -from temporalio import activity +### Activities -class TranslateActivities: - def __init__(self, session: aiohttp.ClientSession): - self.session = session +- The Activities and the temporal worker SDK code both run the default asyncio event loop or whatever event loop you give the Worker. +- Synchronous Activities run in the `activity_executor`. - @activity.defn - async def greet_in_spanish(self, name: str) -> str: - greeting = await self.call_service("get-spanish-greeting", name) - return greeting +### Workflows - # Utility method for making calls to the microservices - async def call_service(self, stem: str, name: str) -> str: - base = f"http://localhost:9999/{stem}" - url = f"{base}?name={urllib.parse.quote(name)}" +Since Workflow Tasks have the following three properties, they're run in threads. - async with self.session.get(url) as response: - translation = await response.text() +- are CPU bound +- need to be timed out for deadlock detection +- need to not block other Workflow Tasks - if response.status >= 400: - raise ApplicationError( - f"HTTP Error {response.status}: {translation}", - # We want to have Temporal automatically retry 5xx but not 4xx - non_retryable=response.status < 500, - ) +The `workflow_task_executor` is the thread pool these Tasks are run on. +The fact that Workflow Tasks run in a thread pool can be confusing at first because Workflow Definitions are `async`. +The key differentiator is that the `async` in Workflow Definitions isn't referring to the standard event loop -- it's referring to the Workflow's own event loop. +Each Workflow gets its own “Workflow event loop,” which is deterministic, and described in [the Python SDK blog](https://temporal.io/blog/durable-distributed-asyncio-event-loop#temporal-workflows-are-asyncio-event-loops). +The Workflow event loop doesn't constantly loop -- it just gets cycled through during a Workflow Task to make as much progress as possible on all of its futures. +When it can no longer make progress on any of its futures, then the Workflow Task is complete. - return translation -``` +### Number of CPU cores + +The only ways to use more than one core in a python Worker (considering Python's GIL) are: + +- Run more than one Worker Process. +- Run the synchronous Activities in a process pool executor, but a thread pool executor is recommended. -## Implementing Synchronous Activities +### A Worker infrastructure option: Separate Activity and Workflow Workers -The following code is an implementation of the above Activity, but as a -synchronous Activity Definition. When making the call to the microservice, +To reduce the risk of event loops or executors getting blocked, +some users choose to deploy separate Workers for Workflow Tasks and Activity Tasks. + +## Activity Definition + +**By default, Activities should be synchronous rather than asynchronous**. +You should only make an Activity asynchronous if you are +certain that it doesn't block the event loop. + +This is because if you have blocking code in an `async def` function, +it blocks your event loop and the rest of Temporal, which can cause bugs that are +hard to diagnose, including freezing your worker and blocking Workflow progress +(because Temporal can't tell the server that Workflow Tasks are completing). +The reason synchronous Activities help is because they +run in the `activity_executor` ([docs for](https://python.temporal.io/temporalio.worker.Worker.html#__init__) `worker.__init__()`) +rather than in the global event loop, +which helps because: + +- There's no risk of accidentally blocking the global event loop. +- If you have multiple + Activity Tasks running in a thread pool rather than an event loop, one bad + Activity Task can't slow down the others; this is because the OS scheduler preemptively + switches between threads, which the event loop coordinator doesn't do. + +> See Also: +> ["Types of Activities" section of Python SDK README](https://github.com/temporalio/sdk-python#types-of-activities) + +## How to implement Synchronous Activities + +The following code is a synchronous Activity Definition. +It takes a name (`str`) as input and returns a +customized greeting (`str`) as output. + +It makes a call to a microservice, and when making this call, you'll notice that it uses the `requests` library. This is safe to do in synchronous Activities. @@ -150,11 +164,11 @@ class TranslateActivities: return response.text ``` -In the above example we chose not to share a session across the Activity, so +The preceeding example doesn't share a session across the Activity, so `__init__` was removed. While `requests` does have the ability to create sessions, -it is currently unknown if they are thread safe. Due to no longer having or needing -`__ini__`, the case could be made here to not implement the Activities as a class, -but just as decorated functions as shown below: +it's currently unknown if they're thread safe. Due to no longer having or needing +`__init__`, the case could be made here to not implement the Activities as a class, +but just as decorated functions as shown here: ```python @activity.defn @@ -171,10 +185,90 @@ def call_service(stem: str, name: str) -> str: return response.text ``` -Whether to implement Activities as class methods or functions is a design choice -choice left up to the developer when cross-activity state is not needed. Both are +Whether to implement Activities as class methods or functions is a design +choice left up to the developer when cross-activity state isn't needed. Both are equally valid implementations. +### How to run Synchronous Activities on a Worker + +When running synchronous Activities, the Worker +needs to have an `activity_executor`. Temporal +recommends using a `ThreadPoolExecutor` as shown here: + +```python +with ThreadPoolExecutor(max_workers=42) as executor: + worker = Worker( + # ... + activity_executor=executor, + # ... + ) +``` + +## How to Implement Asynchronous Activities + +The following code is an implementation of the preceeding Activity, but as an +asynchronous Activity Definition. + +It makes +a call to a microservice, accessed through HTTP, to request this +greeting in Spanish. This Activity uses the `aiohttp` library to make an async +safe HTTP request. Using the `requests` library here would have resulting in +blocking code within the async event loop, which will block the entire async +event loop. For more in-depth information about this issue, refer to the +[Python asyncio documentation](https://docs.python.org/3/library/asyncio-dev.html#running-blocking-code). + +The following code also implements the Activity Definition as a class, rather than a +function. The `aiohttp` library requires an established `Session` to perform the +HTTP request. It would be inefficient to establish a `Session` every time an +Activity is invoked, so instead this code accepts a `Session` object as an instance +parameter and makes it available to the methods. This approach will also be +beneficial when the execution is over and the `Session` needs to be closed. + +In this example, the Activity supplies the name in the URL and retrieves +the greeting from the body of the response. + +```python +import aiohttp +import urllib.parse +from temporalio import activity + +class TranslateActivities: + def __init__(self, session: aiohttp.ClientSession): + self.session = session + + @activity.defn + async def greet_in_spanish(self, name: str) -> str: + greeting = await self.call_service("get-spanish-greeting", name) + return greeting + + # Utility method for making calls to the microservices + async def call_service(self, stem: str, name: str) -> str: + base = f"http://localhost:9999/{stem}" + url = f"{base}?name={urllib.parse.quote(name)}" + + async with self.session.get(url) as response: + translation = await response.text() + + if response.status >= 400: + raise ApplicationError( + f"HTTP Error {response.status}: {translation}", + # We want to have Temporal automatically retry 5xx but not 4xx + non_retryable=response.status < 500, + ) + + return translation +``` + +### How to run synchronous code from an asynchronous activity + +If your Activity is asynchronous and you don't want to change it to synchronous, +but you need to run blocking code inside it, +then you can use python utility functions to run synchronous code +in an asynchronous function: + +- [`loop.run_in_executor()`](https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.run_in_executor), which is also mentioned in the ["running blocking code" section of the "developing with asyncio" guide](https://docs.python.org/3/library/asyncio-dev.html#running-blocking-code) +- [`asyncio.to_thread()`](https://docs.python.org/3/library/asyncio-task.html#running-in-threads) + ## When Should You Use Async Activities Asynchronous Activities have many advantages, such as potential speed up of execution.