
Be Careful Not to Hog the Event Loop in Python


I have been integrating with several libraries in Python to use web services and sockets to get information from different sources.

This information needed to come in as soon as it became available, so using websockets (where possible) made sense.

I kept running into an issue where only one of the many async functions I was using was being hit. To understand what was happening, I dove deeper into how async works in Python by reading this excellent article. Based on this, I understood how the event loop works and how different coroutines can be run in parallel.

To understand the problem I ran into, let's see how we can run async functions in parallel in Python:

Running Multiple Async Coroutines in Parallel

To run multiple async functions in parallel you have to make sure that:

  • You define these functions as coroutines.
  • You create tasks out of these.
  • You use asyncio.gather on these tasks.

Defining Functions as Coroutines

Defining a coroutine is fairly simple in Python. It looks like this:

import asyncio

async def my_async_function_1():
    # do whatever async things you need to here
    # Use await where necessary when waiting on some server call or other IO
    await asyncio.sleep(1)  # placeholder for a real awaitable IO call

async def my_async_function_2():
    # do whatever async things you need to here
    # Use await where necessary when waiting on some server call or other IO
    await asyncio.sleep(1)  # placeholder for a real awaitable IO call

Creating Tasks from Coroutines

We then turn these coroutines into tasks using asyncio.create_task:

# ...
import asyncio

# create_task() needs a running event loop: run this inside a coroutine (e.g. main())
tasks = [asyncio.create_task(my_async_function_1()), asyncio.create_task(my_async_function_2())]

Note in the above how we call the coroutine function and do not simply pass the function object.

The reason for this is that calling an async function without awaiting it returns an awaitable coroutine object and does not execute the function body. That coroutine object is what asyncio.create_task expects.
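To make that concrete, here is a small sketch. The calls list and the demo() helper are made up for illustration and are not part of my original code. It shows that calling the function only builds a coroutine object, and that the body runs once the event loop actually drives it.

```python
import asyncio


def demo():
    calls = []  # records whether the function body has executed

    async def my_async_function_1():
        calls.append("body executed")
        return "result"

    # Calling the async function only builds a coroutine object
    coro = my_async_function_1()
    is_coroutine = asyncio.iscoroutine(coro)
    calls_before = list(calls)  # snapshot taken before anything has run

    # The body runs only once the event loop drives the coroutine
    result = asyncio.run(coro)
    return is_coroutine, calls_before, calls, result


if __name__ == "__main__":
    print(demo())
```

The snapshot taken before asyncio.run is empty, which is why passing my_async_function_1() to create_task does not run the function on the spot.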

Gathering and Running the Tasks in Parallel

Finally we gather the tasks and run them together:

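# ...

async def main():
    # Create the tasks here so that an event loop is already running
    tasks = [
        asyncio.create_task(my_async_function_1()),
        asyncio.create_task(my_async_function_2()),
    ]
    await asyncio.gather(*tasks)

if __name__ == '__main__':
    asyncio.run(main())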
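Putting the three steps together, a self-contained sketch might look like the following. The fetch coroutine and its delays are hypothetical stand-ins for real IO calls. It assumes nothing beyond the standard library.

```python
import asyncio
import time


async def fetch(name, delay):
    # Hypothetical stand-in for an IO-bound call such as a websocket read
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # Tasks are created inside main() so that an event loop is already running
    tasks = [
        asyncio.create_task(fetch("first", 0.3)),
        asyncio.create_task(fetch("second", 0.2)),
    ]
    started = time.perf_counter()
    results = await asyncio.gather(*tasks)
    elapsed = time.perf_counter() - started
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results, f"{elapsed:.2f}s")
```

asyncio.gather returns the results in the order the tasks were passed in, not the order they finished. The total time should be close to the longest delay rather than the sum of both, because the two waits overlap.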

Gotchas

Trying to Await Something that is not async

One thing that caught me out is that one of the libraries I was using had websocket integration, but the implementation it used was not async. I was awaiting it, but as the underlying library was not async, my await was useless - this library ended up hogging the event loop.

When using async, a single event loop manages the execution of the different async functions. It is not parallel. Instead, when some expensive IO operation is in progress, await tells the event loop that the current function is waiting, so the loop can carry on processing another async function. When that function awaits, the loop moves on to the next async function.
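The hand-over at each await can be seen in a tiny sketch. The worker coroutine and the log list are illustrative names only. Each worker writes to a shared log and then awaits, which gives the event loop a chance to switch to the other one.

```python
import asyncio


async def worker(name, steps, log):
    for step in range(steps):
        log.append(f"{name}{step}")
        # Each await is a point where the event loop may switch coroutines
        await asyncio.sleep(0)


async def main():
    log = []
    await asyncio.gather(worker("A", 2, log), worker("B", 2, log))
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))  # ['A0', 'B0', 'A1', 'B1']
```

The entries interleave because each worker gives up control at its await. Only one of them is ever running at a time.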

Because IO operations take a long time, the CPU is not forced to sit waiting and can instead process some other async function chosen by the event loop. In my case, the issue was that my library got hold of the event loop and did not let go. As this library was not async, putting await in front of the call made no difference: the blocking code inside it never handed control back. The library ended up hogging the event loop, and nothing else could execute as a result. I fixed this by integrating with the target endpoint's websocket API myself in an async manner.
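The effect can be reproduced with a rough sketch. All of the names below are hypothetical, and time.sleep stands in for the blocking library call. A heartbeat task counts how often the event loop gives it a turn while a fetch is in progress. The threaded_fetch variant is not what I did. It shows one common workaround, asyncio.to_thread (Python 3.9+), for when rewriting the integration is not an option.

```python
import asyncio
import time


async def heartbeat(ticks, stop):
    # Records a tick every time the event loop gives this coroutine a turn
    while not stop.is_set():
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def blocking_fetch():
    # Hypothetical stand-in for a library call that looks async but blocks inside
    time.sleep(0.2)  # blocks the thread, so the event loop cannot run anything else
    return "data"


async def cooperative_fetch():
    # Same wait, but awaiting hands control back to the event loop
    await asyncio.sleep(0.2)
    return "data"


async def threaded_fetch():
    # Workaround sketch: push the blocking call onto a worker thread (Python 3.9+)
    await asyncio.to_thread(time.sleep, 0.2)
    return "data"


async def count_heartbeats(fetch):
    ticks = []
    stop = asyncio.Event()
    beat = asyncio.create_task(heartbeat(ticks, stop))
    await asyncio.sleep(0)  # give the heartbeat its first turn
    await fetch()
    stop.set()
    await beat
    return len(ticks)


if __name__ == "__main__":
    for fetch in (blocking_fetch, cooperative_fetch, threaded_fetch):
        print(fetch.__name__, asyncio.run(count_heartbeats(fetch)))
```

With the blocking version the heartbeat only gets its initial turn, while the other two versions let it keep ticking throughout the wait. That starved heartbeat is the same symptom I saw with my other async functions.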

Forgetting to Await Something that is Async

This is an easy one to do: you simply forget to await some_async_function(). Luckily this is easy to spot. When you hit this piece of code, a warning (RuntimeWarning: coroutine 'some_async_function' was never awaited) will pop up in your logs. If you use PyCharm, this will also be highlighted in yellow for you.
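As a minimal sketch of what that looks like, the helper below captures the warning instead of letting it print. The helper and the two callers are illustrative names, not code from my project. The gc.collect() call is only there so the discarded coroutine object is cleaned up at a predictable moment.

```python
import asyncio
import gc
import warnings


async def some_async_function():
    return "done"


async def caller_missing_await():
    some_async_function()  # bug: the coroutine object is created and thrown away
    return "caller finished"


async def caller_with_await():
    return await some_async_function()


def run_and_collect_warnings(coro_func):
    # Runs the coroutine function and returns its result plus any RuntimeWarning text
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = asyncio.run(coro_func())
        gc.collect()
    messages = [
        str(w.message) for w in caught if issubclass(w.category, RuntimeWarning)
    ]
    return result, messages


if __name__ == "__main__":
    print(run_and_collect_warnings(caller_missing_await))
    print(run_and_collect_warnings(caller_with_await))
```

Note that the caller with the missing await still finishes normally. Nothing fails. The awaited work just silently never happens, so the warning is the only hint.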