Playing around with await/async in Python 3.5

Mon, 25 May 2015 07:07:03 +0000

PEP-0492 was recently approved, giving Python 3.5 some special syntax for dealing with co-routines. A lot of the new functionality was available pre-3.5, but the syntax certainly wasn’t ideal, as the concepts of generators and co-routines were kind of intermingled. PEP-0492 makes an explicit distinction between generators and co-routines through the use of the async keyword.

This post aims to describe how these new mechanisms work at a rather low level. If you are mostly interested in using this functionality for high-level stuff, I recommend skipping this post and reading up on the built-in asyncio module. If you are interested in how these low-level concepts can be used to build up your own version of the asyncio module, then you might find this interesting.

For this post we’re going to totally ignore any asynchronous I/O aspect and just limit things to interleaving progress from multiple co-routines. Here are two very simple functions:

def coro1():
    print("C1: Start")
    print("C1: Stop")


def coro2():
    print("C2: Start")
    print("C2: a")
    print("C2: b")
    print("C2: c")
    print("C2: Stop")

We start with two very simple functions, coro1 and coro2. We could call these functions one after the other:

coro1()
coro2()

and we’d get the expected output:

C1: Start
C1: Stop
C2: Start
C2: a
C2: b
C2: c
C2: Stop

But, for some reason, rather than running these one after the other, we’d like to interleave the execution. We can’t just do that with normal functions, so let’s turn these into co-routines:

async def coro1():
    print("C1: Start")
    print("C1: Stop")


async def coro2():
    print("C2: Start")
    print("C2: a")
    print("C2: b")
    print("C2: c")
    print("C2: Stop")

Through the magic of the new async keyword these functions are no longer ordinary functions; they are now co-routines (or, more specifically, native co-routine functions). When you call a normal function, the function body is executed; however, when you call a co-routine function the body isn’t executed; instead you get back a co-routine object:

c1 = coro1()
c2 = coro2()
print(c1, c2)

gives:

<coroutine object coro1 at 0x10ea60990> <coroutine object coro2 at 0x10ea60a40>

(The interpreter will also print some runtime warnings that we’ll ignore for now).
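If you let the interpreter exit without ever running these co-routine objects, the warning looks something along these lines (the exact wording depends on the Python version):

RuntimeWarning: coroutine 'coro1' was never awaited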

So, what good is having a co-routine object? How do we actually execute the thing? Well, one way to execute a co-routine is through an await expression (using the new await keyword). You might think you could do something like:

await c1

but you’d be disappointed. An await expression is only valid syntax when contained within a native co-routine function. You could do something like:

async def main():
    await c1

but of course then you are left with the problem of how to force the execution of main!

The trick is to realise that co-routines are actually pretty similar to Python generators, and have the same send method. We can kick off execution of a co-routine by calling the send method:

c1.send(None)

This gets our first co-routine executing to completion; however, we also get a nasty StopIteration exception:

C1: Start
C1: Stop
Traceback (most recent call last):
  File "test3.py", line 16, in 
    c1.send(None)
StopIteration

The StopIteration exception is the mechanism used to indicate that a generator (or, in this case, a co-routine) has completed execution. Despite being an exception it is actually quite expected! We can wrap this in an appropriate try/except block to avoid the error condition. At the same time let’s start the execution of our second co-routine:

try:
    c1.send(None)
except StopIteration:
    pass
try:
    c2.send(None)
except StopIteration:
    pass

Now we get complete output, but it is disappointingly similar to our original output. So we have a bunch more code, but no actual interleaving yet! Co-routines are not dissimilar to threads, in that they allow the interleaving of multiple distinct threads of control; unlike threads, however, any switching between co-routines is explicit rather than implicit (which is, in many cases, a good thing!). So we need to put in some of these explicit switches.

Normally the send method on generators will execute until the generator yields a value (using the yield keyword), so you might think we could change coro1 to something like:

async def coro1():
    print("C1: Start")
    yield
    print("C1: Stop")

but we can’t use yield inside a co-routine. Instead we use the new await expression, which suspends execution of the co-routine until the awaitable completes. So we need something like await _something_; the question is, what is the something in this case? We can’t just await on nothing! The PEP explains which things are awaitable. One option is another native co-routine, but that doesn’t help get to the bottom of things. Another is an object defined with a special CPython API, but we want to avoid extension modules and stick to pure Python right now. That leaves two options: either a generator-based co-routine object or a special Future-like object.
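As an aside, the Future-like route isn’t much code either: roughly speaking, it just means an object whose __await__ method returns an iterator. A minimal sketch of such an object (the class name Switcher is purely illustrative; a generator is an easy way to produce the required iterator) might look like:

class Switcher:
    def __await__(self):
        # __await__ must return an iterator; making it a generator is an
        # easy way to provide one. Yielding suspends the awaiting co-routine.
        yield

Awaiting a Switcher() instance would suspend the awaiting co-routine in much the same way as the generator-based switch() helper we define next.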

So, let’s go with the generator-based co-routine object to start with. Basically, a Python generator function (i.e. something that has a yield in it) can be marked as a co-routine through the types.coroutine decorator. So a very simple example of this would be:

import types

@types.coroutine
def switch():
    yield

This defines a generator-based co-routine function. To get a generator-based co-routine object we just call the function. So, we can change our coro1 co-routine to:

async def coro1():
    print("C1: Start")
    await switch()
    print("C1: Stop")

With this in place, we hope that we can interleave our execution of coro1 with the execution of coro2. If we try it with our existing code we get the output:

C1: Start
C2: Start
C2: a
C2: b
C2: c
C2: Stop

We can see that as expected coro1 stopped executing after the first print statement, and then coro2 was able to execute. In fact, we can look at the co-routine object and see exactly where it is suspended with some code like this:

print("c1 suspended at: {}:{}".format(c1.gi_frame.f_code.co_filename, c1.gi_frame.f_lineno))

which prints out the file and line number of the await expression at which the co-routine is suspended. (Note: this gives you the outer-most await, so it is mostly just for explanatory purposes here, and not particularly useful in the general case).
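As a rough sketch of what the general case might look like, and assuming the cr_await/gi_yieldfrom and cr_frame/gi_frame attributes that Python 3.5 exposes on co-routine and generator objects, a hypothetical helper could follow the chain of awaits down to the innermost suspension point:

def suspension_point(coro):
    # Follow the chain of awaits down to wherever execution is actually
    # suspended, then report that frame's file and line number.
    while True:
        awaited = getattr(coro, "cr_await", None) or getattr(coro, "gi_yieldfrom", None)
        if awaited is None:
            break
        coro = awaited
    frame = getattr(coro, "cr_frame", None) or getattr(coro, "gi_frame", None)
    return "{}:{}".format(frame.f_code.co_filename, frame.f_lineno)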

OK, the question now is: how can we resume coro1 so that it executes to completion? We can just use send again. So we end up with some code like:

try:
    c1.send(None)
except StopIteration:
    pass
try:
    c2.send(None)
except StopIteration:
    pass
try:
    c1.send(None)
except StopIteration:
    pass

which then gives us our expected output:

C1: Start
C2: Start
C2: a
C2: b
C2: c
C2: Stop
C1: Stop

So, at this point we’re manually pushing the co-routines through to completion by explicitly calling send on each individual co-routine object. This isn’t going to work in general. What we’d really like is a function that keeps executing all our co-routines until they have all completed. In other words, we want to keep calling send on each co-routine object until that method raises the StopIteration exception.

So, let’s create a function that takes in a list of co-routines and executes them until completion. We’ll call this function run.

def run(coros):
    coros = list(coros)

    while coros:
        # Duplicate list for iteration so we can remove from original list.
        for coro in list(coros):
            try:
                coro.send(None)
            except StopIteration:
                coros.remove(coro)

This repeatedly loops over the list of co-routines, resuming each one in turn with send; when a co-routine raises the StopIteration exception it has completed, so it is removed from the list.

We can then remove the code manually calling the send method and instead do something like:

c1 = coro1()
c2 = coro2()
run([c1, c2])

And now we have a very simple run-time for executing co-routines using the new await and async features in Python 3.5. Code related to this post is available on github.
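For reference, here is how the pieces from this post fit together as a single self-contained script:

import types


@types.coroutine
def switch():
    yield


async def coro1():
    print("C1: Start")
    await switch()
    print("C1: Stop")


async def coro2():
    print("C2: Start")
    print("C2: a")
    print("C2: b")
    print("C2: c")
    print("C2: Stop")


def run(coros):
    coros = list(coros)

    while coros:
        # Duplicate list for iteration so we can remove from original list.
        for coro in list(coros):
            try:
                coro.send(None)
            except StopIteration:
                coros.remove(coro)


run([coro1(), coro2()])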
