A high-level systems programming language


Project maintained by egorelik93 Hosted on GitHub Pages — Theme by mattgraham

Coroutines and Continuations

The counterpart to variant types is much more difficult to explain, in part because their scenario is related to a rare but very useful feature in some languages called coroutines. Some programmers may be familiar with special cases of coroutines such as iterator generators. A coroutine is essentially an entirely cooperative, non-preemptive thread. Some languages implement them using a task scheduler, but others, especially for the special cases, compile them down to a state machine and run them manually. Generally they are understood as being able to yield and resume multiple times. A highly unusual perspective is that each possible yield returns a variant, the current state of the state machine. Whereas a resource of a linear variant type represents one of multiple possibilities, a coroutine will at different points in time take on all possibilties; the only question is which is when. Thus, linear types describing the multiple yields of a coroutine are actually themselves the counterpart of variant types. Traditionally, this type construct is called par, which is fairly evocative of the spirit of coroutines even though they are not true parallelism since they are cooperative.

This is all probably quite confusing. In order to explain it more properly though, we are going to need to take a detour.

Continuations

I will come right out and say it: it probably comes as no surprise that implementing coroutines and other thread-like concepts, and in particular yielding, involves saving the current state of the coroutine. If you have already heard of the term continuation, then you probably already know that they are essentially saved states of the program that can be resumed. If you have not heard of continuations before, then you should be thinking of them as being safe goto statements, or perhaps more accurately setjmp and longjmp, each carrying a label and the saved state that will be needed at that label. Continuations are perhaps the single most flexible programming language feature ever designed, but much like goto statements themselves they are perhaps too flexible. The traditional continuation implemented in a few traditional programming languages is more than just a goto statement, because like everything else in such languages the continuation must be copyable and disposable too! Copying continuations turns out to be too expensive to be practical. We do not have this requirement in a linear type system, so we are free to implement continuations quite cheaply!

This turns out to be very important, because continuations, or rather their types, are actually the defining characteristic of Classical Linear Logic, compared to other linear type systems.

First, let us discuss what continuations could look like in a Rust-like language.

Many programmers from C-syntax languages are familiar with the return statement for exiting from a subroutine and providing a result. Rust is expression-oriented, and as such the return statement is not needed at the end of a function, but Rust still uses it to support early return. In that spirit, I propose that the return keyword can be used by itself to denote a resource of type Return<T>, where T is the return type of the function. Resources of this type can be called as if they were a single-argument function of type T, except that doing so will act as an early return.

fn f() -> i32 {
    let k = return;
    k(10)
}

In fact, this resource can be passed to and called from a nested function, and it will actually act as an early return not just from the nested function but also the function it was created in.

fn f() -> i32 {
    let k = return;
    let f = || k(5);
    f()
}

This resource is an example of a continuation, specifically representing resuming the caller of the function it was created it. What is happening is that a continuation of type Return<T> stores the return information from the function it was created in, usually the caller’s return address and a pointer to the caller’s stack frame. Most function calling conventions already place these two pieces of information onto the stack whenever a function is called. Calling this continuation restores this stack frame, jumps to the return address, and provides the given value as output, just like an ordinary return statement.

If you are familiar with Rust’s borrowing logic, you might expect that this stack pointer is borrowed by Return<T>. Rust currently lacks the concept of an ‘owned’ reference, but that is probably the closest comparison. Unlike Rust’s ‘borrowed’ reference types, one cannot just drop the Return<T> and regain access to the stack frame; one must resume the continuation in order to consume it, and only inside that executing continuation is access restored. Otherwise, the continuation cannot be disposed; it is our first example of a type that must be linear and not affine. The continuation is in that regard a bit like a closure that takes ownership of its environment. However, when it comes to the role of the lifetime system in ensuring that a reference does not outlive the location it refers to, it is fair to say that Return<T> must have a lifetime restricted to the function it is in; the stack frame needs to be disposed when the calling function itself returns, and that is not possible as long as a Return<T> is pointing to it. The real Rust lifetime system might solve this with the creation of some sort of “local function” lifetime. I have a slightly different solution in mind that we will return to, but for now, let us just say that there is a trait marking types that can escape the function, and Return<T> is not one of them. As a consequence, one cannot just call another continuation while a more recent one is in scope, as the latter will not have been consumed by the time its lifetime ends, which is illegal for a non-affine type. In fact, that leads me to my next point.

A very important aspect of calling a continuation is that there will be no opportunity for resources to be disposed afterwards. Actually, this is an issue even when just using early return; it is just invisible in an affine type system because everything will automatically get cleaned up when it needs to be. A proper linear type checker would need to enforce that all resources in the current context are either explicitly or implicitly disposed before the continuation gets called. Unlike in some languages, writing code after the continuation call will be forbidden.

To elaborate on the last point a bit, many languages, Rust included, treat unreachable code as being safe. Rust actually has a type, written ! and named “never”, used to represent the return type of functions that never return, but many other languages are still able to reason about local unreachable code within a given function. The assumption is that because the code will never run, it can pretend to do whatever it wants; Rust assumes that ! can be turned into any type. That is only possible because these languages are at least affine. The danger in a linearly typed language is that unreachable code “tricks” the compiler into thinking that a resource is disposed, even though the destructor will never actually be called. If a function never returns because it jumped to some continuation, then such resources have been leaked. An affine language can make sure that everything will be disposed before that point regardless.

There actually are cases where unreachable code can be safe, for example if we wish to indicate that a function always throws an error or if it runs in an infinite loop. We can continue to use ! to represent such cases, where it has the same meaning as the empty enum. To contrast with such cases, we will denote the return type of a continuation as noreturn. In fact, a broader view of continuations is that they are precisely those functions of the special return type noreturn, with Return<T> being a particular implementation of the trait FnOnce(T) -> noreturn with a particular lifetime. Using other impementations of this trait is also an alternative to trying to return a Return<T> from a function. In traditional formulations, ! is sometimes called the Zero type, because there are no expressions of this type, and noreturn the Bottom type, for reasons that I will not get into.

There are some caveats to discuss when capturing the current continuation of a function. First, once you have chosen to use return in this way, you are obligating yourself to use continuations to return from the function. The function block will expect that the last expression is of type noreturn, and you will not be able to provide an expression of any other type to implicitly return. Second, the return continuation is still a resource; once you have captured it, you will not be able to capture it again, except as where choice is involved.

Finally, I need to remark a bit on the special nature of noreturn. In the actual Rust language, and up until now in this document, all function return types have been of resources that can actually be written onto the stack as the output of a function. All such types are forced to be Sized, even ! technically. Though, there are capabilities like alloca in the low-level intermediate languages that allow for dynamic memory allocation on the stack. Whether that could one day allow for other possibilities remains to be seen. The point though is that, at bare minimum, noreturn cannot possibly be a type that allows resources to be written onto the stack, because morally expressions of this type cannot do anything except propagate the noreturn higher up. In that regard, it needs to be understood that there are many places where any type normally would be allowed but noreturn is not, for example as the type of a function argument. In some ways it is similar to unsized types like [T], the dereferenced array slice of type T and unknown length. As we mentioned with calling a Return<T>, all other resources must have been disposed before you can start ‘returning’ a noreturn. We will have to get into the exact nature of noreturn later.

There is one rather fundamental function that continuations support called call_with_current_continuation. With the above design, it is rather easy to implement.

fn call_with_current_continuation<F, T>(k : F) -> T 
  where F : FnOnce(Return<T>) -> noreturn
{
    k(return)
}

Such a function is possible in ordinary code only because Return<T> must be the continuation called at the end of k, by virtue of being unable to escape or be consumed any other way. Had we not had such a restriction, it would be possible to store this continuation somewhere else and run some other continuation instead. This would wreak havoc on the stack under ordinary calling conventions, as the continuation will have resumed further down than it started. The only ways to support such a thing would be either an alternate calling convention that allows functions return lower on the stack than they were called, or a continuation passing transform (and in practice these seem to be the same).

Coroutines

Now, I did mention that continuations are fundamental to Classical Linear Logic. The reason is that they relate each type operators to its “opposite” in a sense. In fact, they allow us to derive the second half of the type operators from the first half. It turns out that the continuation of a variant is exactly a choice, and vice-versa:

trait MyChoice {
    fn choice1(self) -> i32;
    fn choice2(self) -> i64;
}

enum MyEnum {
    Choice1(Return<i32>),
    Choice2(Return<i64>)
}

fn choiceToCont<C : MyChoice>(c : C) -> 
    impl FnOnce(MyEnum) -> noreturn
{
    |e| match e {
        Choice1(ki32) -> ki32(c.choice1()),
        Choice2(ki64) -> ki64(c.choice2())
    }
}

fn contToChoice<K : FnOnce(MyEnum) -> noreturn>(k : K) ->
    impl MyChoice
{
    choice MyChoice {
        choice1: || {
            k(MyEnum::Choice1(return))
        },
        choice2: || {
            k(MyEnum::Choice2(return))
        }
    }
}

If you noticed where I’ve been heading, then you might have guessed by now that multi-yield coroutine types are actually the continuation of structs. You would be correct. This is the only definition of linear coroutines that I have come across.

trait Par<A, B> {
    fn execute_par<K1, K2>(self, k1 : K1, k2 : K2) -> noreturn
        where K1 : FnOnce(A) -> noreturn, 
              K2 : FnOnce(B) -> noreturn;
}

A simple syntax for building this sort of construct is much less obvious than for choice, but I will give it my best shot. Inspired by languages with syntax for iterator generator functions, let’s suppose that in place of the return type of an ordinary function, we can write:

coroutine(value1 : i32, value2 : i64)

Each of the “variable names” of this spec refers to one of the yields of the coroutine body, which is done with a new syntax yield <name>. As with return, this can either be used to immediately yield a given value of the specified type, or it can be used by itself as a continuation resource of type Return<T>. The body must eventually use all of these yields. These replace the return keyword, which will not be available for use, nor will implicit return be allowed as the function body expects to end in noreturn.

fn fork<T>() -> coroutine(send : Return<T>, receive : T) 
{
    yield receive call_with_current_continuation(yield send)
}

Obviously, this cannot be called like an ordinary function, or Return<T> wouldn’t be allowed as one of the yield types. Every call to a coroutine must be destructured by a new syntax of the existing keyword continue:

fn id(t : T) -> T 
{
    continue fork()
    {
        send => send(t),
        received => return received
    }
}

fn spawn<F : FnOnce(B) -> C>(f : F) -> 
    coroutine(send : Return<B>, receive : C)
{
    continue fork()
    {
        sendB => yield send sendB,
        receiveB => yield receive f(receiveB)
    }
}

Each branch of this syntax refers to one of the yields of the coroutine, captures the output of each yield into the given variable name, in the order specified by the coroutine spec — the names used in the coroutine body are not significant, only their order. Each branch must end with a noreturn. Unlike with match, all branches will eventually get run, so linear resources can only be used by one branch.

There is one circumstance where you do not need this continue syntax, and that is when you directly want to implement one coroutine with another.

fn myFork() -> coroutine(_ : Return<T>, _ : T)
{
    let idFn = |x| x;
    spawn(idFn)
}

As with continue syntax, the yields will be associated with each other by order in the coroutine spec, so they do not need to be named here.

Keep in mind that unlike ordinary functions, coroutines do not clean up their stack frame. Such stack frames will continue to build up until an ordinary calling function returns, so do be careful about overusing them. This, however, is why they can return Return<T>.

Let us now demonstrate the connection between coroutines and structs:

trait MyChoice {
    fn choice1(self) -> i32;
    fn choice2(self) -> i64;
}

struct MyStruct {
    field1 : Return<i32>,
    field2 : Return<i64>
}

fn contToCoroutine(k : Return<MyStruct>) -> 
    coroutine(yield1 : i32, yield2 : i64)
{
    k(MyStruct {
        field1 : yield yield1,
        field2 : yield yield2
    })
}

fn coroutineToCont<F>(f : F) 
    -> impl FnOnce(MyStruct) -> noreturn
    where F : FnOnce() -> coroutine(i32, i64)
{
    |s| {
        continue f()
        {
            yield1Value => s.field1(yield1Value),
            yield2Value => s.field2(yield2Value)
        }
    }
}

We can also apply this to show that noreturn is exactly the continuation of the empty struct. For that reason, we define that noreturn is actually just a synonym for coroutine(), and in fact is treated as a coroutine type for syntax purposes. Similarly, we can show that coroutine(return : T) can be converted to and from an actual return type T, assuming T is valid as such. The former is still considered a coroutine in both runtime characteristics and required calling syntax.

fn valueToCoroutine(i : i32) -> 
    coroutine(yield1 : i32)
{
    yield yield1 i
}

fn coroutineToValue<F>(f : F) -> i32
    where F : FnOnce() -> coroutine(i32)
{
    continue f()
    {
        yield1Value => return yield1Value
    }
}

In general though, these coroutine specs are not themselves types, but are part of a generalized function type. If you want to use these as a type, the closest is what we did above with the trait

FnOnce() -> coroutine(T1, T2, T3,...)

There is one seeming exception to this; a coroutine “type” spec, like what you see above in the trait, can also be part of a choice. Of course, that is because the types in a choice are themselves really the return types of a function. As with implementing one coroutine on top of another, you can pass in a coroutine to a choice by calling it without continue syntax.

fn myCoroutine() -> coroutine(_ : Return<T>, _ : T)
{
    fork()
}

choice{
    choice1 : 5,
    choice2 : myCoroutine()
}

It may not seem like it, but this foundation is enough to do some interesting stuff! Such as this type:

trait ListBuilderInterface<T>
{
    fn break(self) -> noreturn;
    fn next(self) -> coroutine(Return<T>, ListBuilder<T>)
}

struct ListBuilder<T>{
    item: Box<dyn ListBuilderInterface<T>>
};

This type, which is a formulation of the continuation of a linked list, actually allows us to construct a linked list using generator-like syntax:

fn generate<T, F>(generator : F) -> LinkedList<T>
    where F : FnOnce(ListBuilder<T>) -> noreturn
{
    fn break(returnResult : Return<LinkedList<T>>) -> noreturn
    {
        returnResult(LinkedList::Empty)
    }

    fn next(returnResult : Return<LinkedList<T>>) -> coroutine(
        returnNext : Return<T>,
        builder : ListBuilder<T>)
    {
        returnResult(
            LinkedList::Cons(
                call_with_current_continuation(yield returnNext),
                generate(yield builder))
        )
    }

    generator(ListBuilder { 
        item: box choice ListBuilderChoice<T> {
            break : break(return),
            next : next(return)
        }
    })
}

let myList = generate(|b| {
    let b = (|| continue b.item.next()
    {
        returnItem => returnItem(1),
        b => return b
    })();

    let b = (|| continue b.item.next()
    {
        returnItem => returnItem(2),
        b => return b
    })();

    let b = (|| continue b.item.next()
    {
        returnItem => returnItem(3),
        b => return b
    })();

    b.item.break()
})

If this seems clunky, that’s probably because b.item.next() is returning a coroutine. But, consider what this coroutine does: on one branch, we input an item into Return<i32>, which adds it to the list. On the other branch, we are just returning a new builder back to the outside. Since we are basically inputting an i32 and outputting a ListBuilder<i32>, couldn’t we instead replace this with a function FnOnce(i32) -> ListBuilder<i32>. The answer is yes! In fact, it is always true that coroutine(Return<A>, B) can be replaced with FnOnce(A) -> B; they are equivalent types.

fn fnToCoroutine<A, B, F : FnOnce(A) -> B>(f : F) -> 
    coroutine(input : Return<A>, output : B)
{
    let a = call_with_current_continuation(yield input);
    yield output f(a)
}

fn coroutineToFn<A, B, F>(f : F) 
    -> impl FnOnce(A) -> B
    where F : FnOnce() -> coroutine(Return<A>, B)
{
    |a| {
        continue f()
        {
            send => send(a),
            received => return f(received)
        }
    }
}

With some refactoring, now we could write:

let myList = generate(|b| {
    let b = b.item.next(1);
    let b = b.item.next(2);
    let b = b.item.next(3);
    b.item.break()
})

In general, it seems that you can create a similar generator for any type T by identifing a convenient Builder type equivalent to a continuation of T, and then having users write functions FnOnce(Builder) -> noreturn.

Let me try to write up the more traditional rules for coroutines and continuations. These will look strange compared to the earlier rules, because coroutines require some special machinery in the rules, and I am taking some shortcuts to get the idea across.

\[\begin{matrix} G, a : A \vdash e : \texttt{coroutine}(H) \\ \hline G \vdash \texttt{let } a = \texttt{call}\textunderscore\texttt{with}\textunderscore\texttt{current}\textunderscore\texttt{continuation} (\texttt{yield } r) ; e : \texttt{coroutine}( r : \texttt{Return<}A\texttt{>}, H) \end{matrix}\] \[\begin{matrix} G_1 \vdash f : \texttt{Return<}T\texttt{>} && G_2 \vdash e : T \\ \hline G_1, G_2 \vdash f(e) : \texttt{coroutine}() \end{matrix}\] \[\begin{matrix} G \vdash e : \texttt{coroutine}(\texttt{coroutine}(H_1), \texttt{coroutine}(H_2)) \\ \hline G \vdash \texttt{continue } e \texttt{ } \{ a \texttt{ => continue } a \{ ... \}, b \texttt{ => continue } b \{ ... \} \} : \texttt{coroutine}(H_1, H_2) \end{matrix}\] \[\begin{matrix} G_1, a : A \vdash e_1 : T_1 & G_2, b : B \vdash e_2 : T_2 \\ \hline G_1, G_2, c : \texttt{coroutine}(A, B) \vdash \texttt{continue } e \texttt{ } \{ a \texttt{ => yield } r_1 \texttt{ } e_1, b \texttt{ => yield } r_2 \texttt{ } e_2 \} & : \texttt{coroutine}(r_1 : T_1, r_2 : T_2) \end{matrix}\]

Prev Home Next