Playground Wisdom: Threads Beat Async/Await

118 points | 61 comments | 5 days ago
Const-me

The article seems specific to JavaScript, C# is different.

> you cannot await in a sync function

In C# it’s easy to block the current thread while waiting for an async task to complete; see the Task.Wait method.

> since it will never resolve, you can also never await it

In C#, awaiting things that never complete is not that bad; the standard library has the Task.WhenAny() method for that.

> let's talk about C#. Here the origin story is once again entirely different

The NT kernel was designed for SMP from the ground up, has supported asynchronous operations on handles like files and sockets from the beginning, and since NT 3.5 has included a thread pool for dispatching I/O completions: https://en.wikipedia.org/wiki/Input/output_completion_port

Overlapped I/O and especially IOCP are hard to use directly. When Microsoft designed the initial version of .NET, they implemented a thread pool and IOCP support inside the runtime and exposed higher-level APIs on top of them: stuff like Stream.BeginRead / Stream.EndRead, available since .NET 1.1 in 2003. The design pattern is called the Asynchronous Programming Model (APM).

The async/await language feature, introduced in .NET 4.5 in 2012, is a thin layer of sugar on top of these begin/end asynchronous APIs, which were always there. BTW, if you have a pair of begin/end methods, converting them to async/await takes one line of code; see TaskFactory.FromAsync.

arctek

I actually think that, of all languages, async/await makes the most sense for JavaScript.

In the first example: there is no such thing as a blocking sleep in JavaScript. What people use as sleep is just a promise wrapper around a setTimeout call. setTimeout only schedules a callback on the task queue and never blocks, so calling such a sleep inline (without awaiting it) would do nothing to halt execution.

I do agree that dangling Promises are annoying, and Promise.race is especially bad because it doesn't do what you might expect (settle with the fastest promise and cancel the others). It actually lets every promise run to completion eventually; you just only get the winner's result.
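
A quick sketch of that behavior (the sleep/fast/slow helpers are just illustrative, and the snippet assumes an ES module where top-level await is available):

```typescript
// Illustrative only: Promise.race settles with the first result, but the
// losing promise is not cancelled; it keeps running to completion.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function fast(): Promise<string> {
  await sleep(100);
  return "fast";
}

async function slow(): Promise<string> {
  await sleep(2000);
  console.log("slow() still finished, even though it lost the race");
  return "slow";
}

const winner = await Promise.race([fast(), slow()]);
console.log(winner); // "fast" -- slow() is still pending in the background
```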

Realistically, in JS you write your long-running async functions to take a wrapper around an AbortController that also provides a sleep function. In your outer loop you check that the signal isn't aborted, and the wrapper class handles calling clearTimeout on wrapped sleeps, so pending setTimeouts stop and your loop/function exits.
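
A rough sketch of that pattern, with a hypothetical AbortableSleeper wrapper (the name and shape are made up for illustration):

```typescript
// Hypothetical AbortController wrapper whose sleep() clears its pending
// setTimeout as soon as the signal is aborted.
class AbortableSleeper {
  private readonly controller = new AbortController();

  get signal(): AbortSignal {
    return this.controller.signal;
  }

  abort(): void {
    this.controller.abort();
  }

  // Resolves after `ms`, or immediately once the signal is aborted; either
  // way the pending setTimeout is cleared so nothing is left dangling.
  sleep(ms: number): Promise<void> {
    return new Promise((resolve) => {
      if (this.signal.aborted) {
        resolve();
        return;
      }
      const onAbort = () => {
        clearTimeout(timer);
        resolve();
      };
      const timer = setTimeout(() => {
        this.signal.removeEventListener("abort", onAbort);
        resolve();
      }, ms);
      this.signal.addEventListener("abort", onAbort, { once: true });
    });
  }
}

// Usage: a long-running loop that checks the signal between waits.
async function poll(sleeper: AbortableSleeper): Promise<void> {
  while (!sleeper.signal.aborted) {
    // ... do one unit of work here ...
    await sleeper.sleep(1000);
  }
}
```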

serbuvlad

As someone who has only written serious applications in single-threaded or manually threaded C/C++, and concurrent applications in Go using goroutines, channels, and all that fun stuff, I always find the discussion around async/await fascinating, especially since it seems to be so ubiquitous in modern programming outside of my sphere.

But one thing is: I don't get it. Why can't I await in a normal function? await sounds blocking. If async functions return promises, why can't I launch multiple async functions, then await on each of them, in a non-async function that does not return a promise?

I get that there are answers to my questions. I get that await means "yield if not ready", and if the function is not async, "yield" is meaningless. But I find it a very strange way of thinking nonetheless.
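
For what it's worth, here is a small TypeScript sketch of the distinction (fetchA/fetchB are hypothetical async functions): inside an async function, await suspends only that function, while a plain function cannot block on a promise without freezing the single thread that is supposed to complete it.

```typescript
// Hypothetical async functions that each return a promise.
async function fetchA(): Promise<number> { return 1; }
async function fetchB(): Promise<number> { return 2; }

// Inside an async function, "await" suspends this function only; the single
// JS thread keeps running other work in the meantime.
async function asyncCaller(): Promise<number> {
  const [a, b] = await Promise.all([fetchA(), fetchB()]);
  return a + b;
}

// A plain (sync) function cannot block until the promises settle: blocking
// the only thread would freeze the event loop that is supposed to complete
// them. All it can do is start the work and register a callback (or return
// the promise to its own caller).
function syncCaller(): void {
  Promise.all([fetchA(), fetchB()]).then(([a, b]) => console.log(a + b));
}
```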

FlyingSnake

I expected to see Swift, but it seems like most such discussions overlook it. Here’s a great discussion that goes deeper into it: https://forums.swift.org/t/concurrency-structured-concurrenc...

whoisthemachine

> Your Child Loves Actor Frameworks

It turns out Promises are actors: very simple actors that hold one and only one message, which upon resolution they dispatch to all subscribed actors [0]. So children might love Promises and async/await then?
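
A tiny sketch of that one-message-actor view (purely illustrative):

```typescript
// A promise settles exactly once, and every subscriber -- even one that
// attaches after resolution -- receives that single "message".
const message = new Promise<string>((resolve) => {
  setTimeout(() => resolve("the one and only message"), 100);
});

message.then((m) => console.log("subscriber 1:", m));
message.then((m) => console.log("subscriber 2:", m));

setTimeout(() => {
  // Late subscriber: the message was already dispatched, yet it still gets it.
  message.then((m) => console.log("late subscriber:", m));
}, 500);
```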

Personally, I've often thought the resolution to the "color" debate would be for a new language to make all public interfaces between modules "Promises" by default. Then the default assumption is "if I call this public function it could take some time to complete". Everything acting synchronously should be an implementation detail that is nice if it works out.
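
A minimal sketch of that idea, with a made-up ConfigStore module whose public surface is Promise-based even though today's implementation happens to be synchronous:

```typescript
// Made-up module boundary: the public interface promises nothing about timing.
export interface ConfigStore {
  get(key: string): Promise<string | undefined>;
}

// Callers never learn (or care) that this implementation answers from memory;
// swapping in a network-backed store later doesn't change the interface.
export class InMemoryConfigStore implements ConfigStore {
  private readonly values = new Map<string, string>();

  async get(key: string): Promise<string | undefined> {
    return this.values.get(key); // synchronous today, could await I/O tomorrow
  }
}
```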

[0] https://en.wikipedia.org/wiki/Futures_and_promises#Semantics...

manmal

I view Swift’s Tasks as a thread-like abstraction that does what the author is asking for. Not every Task provides structured concurrency in the strict sense, because cancellation has to be managed explicitly for the default Task constructor. But Tasks do have a defined runtime, cancellation, and error propagation if one uses a TaskGroup or async let, or adds some glue code. The tools to achieve this are all there.

cryptonector

Threads are definitely not _the_ answer but _an_ answer.

You can have as many threads as hardware threads, but within each thread you want continuation-passing style (CPS) or async/await (which is a lot like syntactic sugar for CPS). Why? Because threads let you smear program state over a large stack, increasing your memory footprint, while CPS / async-await forces you to make all that state explicit and compact, minimizing memory footprint. This is not a small thing. If you have thread-per-client services, each thread will need a sizeable stack, each stack with a guard page -- even with virtual memory, that's expensive, both to set up and in terms of total memory footprint.

Between memory per client, L1/L2 cache footprint per client, page faults (to grow the stack), and context switching overhead, thread-per-client is much more expensive than NPROC threads doing CPS or async-await. If you compress the program state per client you can fit more clients in the same amount of memory, and the overhead of switching from one client to another is lower, thus you can have more clients.

This is the reason that async I/O is the key to solving the "C10K" problem: it forces the programmer to compress per-client program state.
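
A toy sketch of that state-compression point (the fields and protocol here are made up): each client is a small explicit record serviced by one loop, instead of an implicit stack per client.

```typescript
// Made-up per-client state: everything the connection needs between events,
// compressed into one small record instead of smeared across a thread stack.
interface ClientState {
  id: number;
  bytesReceived: number;
  phase: "handshake" | "streaming" | "closing";
}

const clients = new Map<number, ClientState>();

// One event-loop callback services every client: no per-client stack, no
// guard page, no context switch -- just a lookup and a state update.
function onData(clientId: number, chunk: Uint8Array): void {
  const state = clients.get(clientId);
  if (!state) return;
  state.bytesReceived += chunk.length;
  if (state.phase === "handshake") state.phase = "streaming";
}
```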

But if you don't need to cater to C10K (or C10M) then thread-per-client is definitely simpler.

So IMO it's really about trade-offs. Does your service need to be C10K? How much are you paying for the hardware/cloud you're running it on? And so on. Being more efficient will be more costly in developer cycles -- that can be very expensive, and that's the reason that research into async-await is ongoing: hopefully it can make C10K dev cheaper.

But remember, rewrites cost even more than doing it right the first time.

RantyDave

Almost as an aside, the article makes an interesting point: memory accesses can block. Presumably if it blocks because it's accessing a piece of hardware, the operating system schedules another thread on that core... but what if it blocks on a 'normal' memory access? Does it stall the core entirely? Can 'hyperthreading' briefly run another thread? Does out-of-order execution make it suddenly not a problem? Surely it doesn't go all the way down to the OS?

agentkilo

People should try Janet (the programming language). Its fiber abstraction got everything right IMO.

Functions in Janet don't have "colors", since fiber scheduling is built into the runtime at a lower level. You can call "async" functions from anywhere, and Janet's event loop will handle it for you. It's so ergonomic that it almost feels like Erlang.

Janet has something akin to Erlang's supervisor pattern too, which, IMO, is a decent implementation of "structured concurrency" mentioned in the article.

mark_l_watson

Nice read, and the article got me to take a look at Java’s project Loom and then Eric Normand’s writeup on Loom and threading options for Clojure.

Good stuff.

pwdisswordfishz

> Go, for instance, gets away without most of this, and that does not make it an inferior language!

Yes, it does. Among other things.

NeutralForest

I thought that was interesting and I definitely get the frustration in some respects. I'm mostly familiar with Python, and the function "coloring" issue is so annoying because it forces you to have two APIs depending on whether you're async or not (look at SQLAlchemy, for example). The ergonomics are bad in general, and I don't really like having to deal with, for example, awaiting a result that will be needed in a sync function.

That being said, some alternatives were mentioned (structured concurrency à la Go), but I'd like to hear from people in BEAM land (Elixir) about what they think of it. Though I understand that for systems languages, handling concurrency through a VM is not an option.

andrewstuart

Feels academic, because despite the concerns raised, I only experience async/await as a good thing in the real world.

unscaled

I think most of the arguments in this essay rely on this single premise: "The second thing I want you to take away is that imperative languages are not inferior to functional ones."

There is an implied assumption that async/await is a "functional feature" that was pushed into a bunch of innocent imperative languages and polluted them. But there is one giant problem with this assumption: async/await is not a functional feature. If anything, it's the epitome of an imperative flow-control feature.

There are many kinds of functional languages out there, but I think the best common denominator for a primarily functional language nowadays is exactly this: in functional languages, control flow structures are first-class citizens, and they can be customized by the programmer. In fact, most control flow structures are basically just functions, and the ones that aren't (e.g. pattern matching in ML-like languages and monadic comprehensions in Haskell-inspired languages) are extremely generic, and their behavior depends on the types you feed into them. There are other emphasis points that you see in particular families of languages, such as pattern matching, strict data immutability or lazy computation — but none of these is a core functional concept.

The interesting point I want to make is that no primarily functional language that I know of actually has async/await. Some of them have monads, and those monads could be used for something like async/await, but that's not a very common use, and monad comprehensions can be used for other things. For instance, you could use do expressions in Haskell (or for expressions in Scala) to operate on multiple lists at once. The same behavior is possible with nested for-loops in virtually every modern imperative language, but nobody has blamed Algol for "polluting" the purity of our Fortran gotos and arithmetic ifs with this "fancy functional garbage monad from damned ivory tower Academia". That would be absurd, not only because no programming language with monadic comprehensions existed back then, but also because for loops are a very specific syntax for a very specific thing that can be done with a monadic expression. They turn a very abstract functional concept into a highly specific — and highly imperative — feature. The same is true for await. It's an imperative construct that instructs the runtime to suspend (or the compiler to turn the current function into a state machine).

So no, async/await does not have anything to do with functional-language envy and is, in fact, a feature that is quite antithetical to functional programming. If there is any theoretical paradigm behind async/await (vs. just using green threads), it's strong typing, and especially the idea of representing effects with types. This is somewhat close to fully-fledged effect systems (in languages such as Koka), but not as powerful. The general idea is that certain functions behave in a way that is "infective": if foo() calls bar(), which in turn calls doStuff(), it might be impacted by some side effect of doStuff(). To prevent unpleasant surprises, we want to mark this thing that doStuff() does in the function signature (using an extra argument, a return type wrapper, or just an extra modifier like "async").
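
A small sketch of that infectiveness, reusing the comment's foo/bar/doStuff names (the fetch URL is just a placeholder):

```typescript
// doStuff performs async I/O, so its signature carries the async marker...
async function doStuff(): Promise<string> {
  const res = await fetch("https://example.com/data"); // placeholder endpoint
  return res.text();
}

// ...which forces bar to either become async itself or juggle the raw promise...
async function bar(): Promise<string> {
  return await doStuff();
}

// ...which in turn forces foo to carry the same marker. The effect propagates
// up the call graph much like a `throws IOException` clause does in Java.
async function foo(): Promise<void> {
  console.log(await bar());
}
```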

In a pure language like Haskell, everything from I/O to mutable memory requires specifying an effect, and this is usually done through monadic return types. But even the very first version of Java (Ron Pressler's ideal, untarnished "imperative" language) has effects (or "colors") which still remain in the language: checked exceptions. They are just as infective as async I/O. If you don't handle exceptions in place, a function marked with "throws IOException" (basically almost any function that deals with I/O) can only be called by another function marked with "throws IOException". What's worse, unlike JavaScript, which only has two colors (async and non-async), Java has an infinite number of colors!

The description above sounds horrible, but it's not. Checked exceptions are widely believed to be a mistake[1], but they don't bother Java developers enough to make the language unusable. You can always just wrap them in another exception and rethrow. The ergonomics could have been made slightly better, but they're decent enough. The same can be said for async/await. If you take a language with a similar feature that is close to Java (C# or Kotlin), you'll see that asynchronous functions can still be run as blocking code from inside synchronous functions, while synchronous functions can be scheduled onto another thread from asynchronous code. The ergonomics of doing that are not any harder than wrapping exceptions.

In addition to that, the advantages of marking a function that runs asynchronous I/O (just like marking a function that throws an exception) are obvious, even if the move itself is controversial. These functions generally involve potentially slow network I/O and you don't want to call them by mistake. If you think that never happens, here is the standard Java API for constructing an InetAddress object from a string representing an IPv4 or IPv6 address: InetAddress.getByName()[2]. Unfortunately, if your IP address is invalid, this function may block while trying to resolve it as a domain name. That's plain bad API design, but APIs that can block in surprising ways are abundant, so you cannot argue that async/await doesn't introduce additional safety.

But let's face it: in most cases, choosing async/await vs. green threads for an imperative language is a matter of getting the right trade-off. Async/await schedulers are easier to implement (they don't need to deal with segmented/relocatable/growable stacks) and do not require runtime support. Async/await also exhibits more efficient memory usage, and arguably better performance in scenarios that do not involve a long call graph of async functions. Async/await schedulers also integrate more nicely with blocking native code that is used as a library (i.e. C/C++, Objective-C or Rust code). With green threads, you just cannot run this code directly from the virtual thread, and if the code is blocking, your life becomes even harder (especially if you don't have access to kernel threads). Even with full control of the runtime, you'd usually end up with a certain amount of overhead for native calls[3].

Considering these trade-offs, async/await is a perfect fit in scenarios like the ones below:

1. JavaScript had multiple implementations. Not only were most of them single-threaded, they would also need a major overhaul to support virtual threads even if a thread API was specified.

2. Rust actually tried green threads and abandoned them. The performance was abysmal for a language that seeks zero-cost abstractions, and Rust's systems programming requirements made them a deal breaker even if that hadn't been the case. Rust just had to support pluggable runtimes, and mandating dynamic stacks simply won't work inside the kernel or in soft real-time systems.

3. Swift had to interoperate with a large amount of Objective-C code that was already using callbacks for asynchronous I/O (that's what it had). In addition, it is not a garbage-collected language, and it still needed to call a lot of C and Objective-C APIs, even if they were wrapped in nice Swift classes.

4. C# already had a Promise-like Task mechanism that evolved around wrapping native Windows asynchronous I/O. If .NET were redesigned from scratch today, they could very well have gone with green threads, but given the way .NET developed, that would have just introduced a lot of compatibility issues for almost no gain.

5. Python had the GIL, as the article already mentioned. But even with patching runtime I/O functions (like greenlet — or more accurately, gevent[4] — did), there were many third party libraries relying on native code. Python just went with the more compatible approach.

6. Java did not have any established standard for asynchronous I/O. CompletableFuture was introduced in Java 8, but it wasn't as widely adopted (especially in the standard library) as the C# Task was. Java also had guaranteed full control of the runtime (unlike JavaScript and Rust), it was garbage collected (unlike Rust and Swift), and it relied less on native code than Swift, pre-.NET Core C# or Python. On the other hand, Java had a lot of crucial blocking APIs that hadn't been updated to use CompletableFuture, like JDBC and Servlets (async Servlets were cumbersome and never caught on). Introducing async/await to Java would have meant rewriting or significantly refactoring all existing frameworks in order to support it. That was not a very palatable choice, so again, Java did the correct thing and went with virtual threads.

If you look at all of these use cases, you'll see that each of these languages seems to have made the right pragmatic choice. If you are designing a new language from scratch (and that language is garbage collected and doesn't need to be compatible with another language or deal with a lot of existing native code), you can go with the ideological argument of "I want my functions to be colorless" (or, inversely, with the ideological argument of "I want all suspending functions to be marked explicitly"). In all other cases, pragmatism should win.

---

[1] Although it mostly comes down to bad composability — checked result types work very well in Rust.

[2] https://docs.oracle.com/en/java/javase/17/docs/api/java.base...

[3] See the article below for the overhead in Go. Keep in mind that the Go team has put a lot of effort into optimizing Cgo calls and reducing this overhead, but they still cannot eliminate it entirely. https://shane.ai/posts/cgo-performance-in-go1.21/

[4] https://www.gevent.org/
