Playground Wisdom: Threads Beat Async/Await

118 points | 61 comments | 5 days ago
Const-me

The article seems specific to JavaScript, C# is different.

> you cannot await in a sync function

In C# it’s easy to block the current thread while waiting for an async task to complete; see the Task.Wait method.

> since it will never resolve, you can also never await it

In C#, awaiting things that never complete is not that bad; the standard library has the Task.WhenAny() method for that.

> let's talk about C#. Here the origin story is once again entirely different

The NT kernel was designed for SMP from the ground up, has supported asynchronous operations on handles like files and sockets from the beginning, and since NT 3.5 has included a thread pool for dispatching I/O completions: https://en.wikipedia.org/wiki/Input/output_completion_port

Overlapped I/O and especially IOCP are hard to use directly. When Microsoft designed the initial version of .NET, they implemented a thread pool and IOCP support inside the runtime and exposed higher-level APIs on top of them: stuff like Stream.BeginRead / Stream.EndRead, available since .NET 1.1 in 2003. The design pattern is called the Asynchronous Programming Model (APM).

The async/await language feature, introduced in .NET 4.5 in 2012, is a thin layer of sugar on top of these begin/end asynchronous APIs, which were always there. BTW, if you have a pair of begin/end methods, converting them to async/await takes one line of code; see TaskFactory.FromAsync.

arctek

I actually think that, of all languages, async/await makes the most sense for JavaScript.

In the first example: there is no such thing as a blocking sleep in JavaScript. What people use as sleep is just a promise wrapper around a setTimeout call. setTimeout only schedules a callback on the task queue and never blocks, so calling such a sleep inline (without awaiting it) would do nothing to halt execution.

I do agree that dangling Promises are annoying, and Promise.race is especially bad because it doesn't do what you might expect (settle with the fastest promise and cancel the others). It actually lets every promise run to completion eventually; you just only get the winner's result.
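
A quick sketch of that behavior (the sleep/fast/slow helpers are just illustrative, and the snippet assumes an ES module where top-level await is available):

```typescript
// Illustrative only: Promise.race settles with the first result, but the
// losing promise is not cancelled; it keeps running to completion.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function fast(): Promise<string> {
  await sleep(100);
  return "fast";
}

async function slow(): Promise<string> {
  await sleep(2000);
  console.log("slow() still finished, even though it lost the race");
  return "slow";
}

const winner = await Promise.race([fast(), slow()]);
console.log(winner); // "fast" -- slow() is still pending in the background
```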

Realistically, in JS you write your long-running async functions to take a wrapper around an AbortController that also provides a sleep function. In your outer loop you check that the signal isn't aborted, and the wrapper class handles calling clearTimeout on wrapped sleeps, so pending setTimeouts stop and your loop/function exits.
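
A rough sketch of that pattern, with a hypothetical AbortableSleeper wrapper (the name and shape are made up for illustration):

```typescript
// Hypothetical AbortController wrapper whose sleep() clears its pending
// setTimeout as soon as the signal is aborted.
class AbortableSleeper {
  private readonly controller = new AbortController();

  get signal(): AbortSignal {
    return this.controller.signal;
  }

  abort(): void {
    this.controller.abort();
  }

  // Resolves after `ms`, or immediately once the signal is aborted; either
  // way the pending setTimeout is cleared so nothing is left dangling.
  sleep(ms: number): Promise<void> {
    return new Promise((resolve) => {
      if (this.signal.aborted) {
        resolve();
        return;
      }
      const onAbort = () => {
        clearTimeout(timer);
        resolve();
      };
      const timer = setTimeout(() => {
        this.signal.removeEventListener("abort", onAbort);
        resolve();
      }, ms);
      this.signal.addEventListener("abort", onAbort, { once: true });
    });
  }
}

// Usage: a long-running loop that checks the signal between waits.
async function poll(sleeper: AbortableSleeper): Promise<void> {
  while (!sleeper.signal.aborted) {
    // ... do one unit of work here ...
    await sleeper.sleep(1000);
  }
}
```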

serbuvlad

As someone who has only written serious applications in single-threaded or manually threaded C/C++, and concurrent applications in Go using goroutines, channels, and all that fun stuff, I always find the discussion around async/await fascinating, especially since it seems to be so ubiquitous in modern programming outside of my sphere.

But one thing is: I don't get it. Why can't I await in a normal function? await sounds blocking. If async functions return promises, why can't I launch multiple async functions, then await on each of them, in a non-async function that does not return a promise?

I get that there are answers to my questions. I get that await means "yield if not ready", and if the function is not async, "yield" is meaningless. But I find it a very strange way of thinking nonetheless.
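
For what it's worth, here is a small TypeScript sketch of the distinction (fetchA/fetchB are hypothetical async functions): inside an async function, await suspends only that function, while a plain function cannot block on a promise without freezing the single thread that is supposed to complete it.

```typescript
// Hypothetical async functions that each return a promise.
async function fetchA(): Promise<number> { return 1; }
async function fetchB(): Promise<number> { return 2; }

// Inside an async function, "await" suspends this function only; the single
// JS thread keeps running other work in the meantime.
async function asyncCaller(): Promise<number> {
  const [a, b] = await Promise.all([fetchA(), fetchB()]);
  return a + b;
}

// A plain (sync) function cannot block until the promises settle: blocking
// the only thread would freeze the event loop that is supposed to complete
// them. All it can do is start the work and register a callback (or return
// the promise to its own caller).
function syncCaller(): void {
  Promise.all([fetchA(), fetchB()]).then(([a, b]) => console.log(a + b));
}
```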

FlyingSnake

I expected to see Swift, but it seems like most such discussions overlook it. Here’s a great discussion that goes deeper into it: https://forums.swift.org/t/concurrency-structured-concurrenc...

whoisthemachine

> Your Child Loves Actor Frameworks

It turns out Promises are actors: very simple actors that hold one and only one message, which upon resolution they dispatch to all subscribed actors [0]. So children might love Promises and async/await then?
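
A tiny sketch of that one-message-actor view (purely illustrative):

```typescript
// A promise settles exactly once, and every subscriber -- even one that
// attaches after resolution -- receives that single "message".
const message = new Promise<string>((resolve) => {
  setTimeout(() => resolve("the one and only message"), 100);
});

message.then((m) => console.log("subscriber 1:", m));
message.then((m) => console.log("subscriber 2:", m));

setTimeout(() => {
  // Late subscriber: the message was already dispatched, yet it still gets it.
  message.then((m) => console.log("late subscriber:", m));
}, 500);
```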

Personally, I've often thought the resolution to the "color" debate would be for a new language to make all public interfaces between modules "Promises" by default. Then the default assumption is "if I call this public function it could take some time to complete". Everything acting synchronously should be an implementation detail that is nice if it works out.
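
A minimal sketch of that idea, with a made-up ConfigStore module whose public surface is Promise-based even though today's implementation happens to be synchronous:

```typescript
// Made-up module boundary: the public interface promises nothing about timing.
export interface ConfigStore {
  get(key: string): Promise<string | undefined>;
}

// Callers never learn (or care) that this implementation answers from memory;
// swapping in a network-backed store later doesn't change the interface.
export class InMemoryConfigStore implements ConfigStore {
  private readonly values = new Map<string, string>();

  async get(key: string): Promise<string | undefined> {
    return this.values.get(key); // synchronous today, could await I/O tomorrow
  }
}
```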

[0] https://en.wikipedia.org/wiki/Futures_and_promises#Semantics...

manmal

I view Swift’s Tasks as a thread-like abstraction that does what the author is asking for. Not every Task provides structured concurrency in the strict sense, because cancellation has to be managed explicitly for the default Task constructor. But Tasks do have a defined runtime, cancellation, and error propagation if one uses a TaskGroup or async let, or adds some glue code. The tools to achieve this are all there.

cryptonector

Threads are definitely not _the_ answer but _an_ answer.

You can have as many threads as hardware threads, but within each thread you want continuation-passing style (CPS) or async/await (which is a lot like syntactic sugar for CPS). Why? Because threads let you smear program state over a large stack, increasing your memory footprint, while CPS / async-await forces you to make all that state explicit and compact, minimizing memory footprint. This is not a small thing. If you have thread-per-client services, each thread will need a sizeable stack, each stack with a guard page -- even with virtual memory, that's expensive, both to set up and in terms of total memory footprint.

Between memory per client, L1/L2 cache footprint per client, page faults (to grow the stack), and context switching overhead, thread-per-client is much more expensive than NPROC threads doing CPS or async-await. If you compress the program state per client you can fit more clients in the same amount of memory, and the overhead of switching from one client to another is lower, thus you can have more clients.

This is the reason that async I/O is the key to solving the "C10K" problem: it forces the programmer to compress per-client program state.
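
A toy sketch of that state-compression point (the fields and protocol here are made up): each client is a small explicit record serviced by one loop, instead of an implicit stack per client.

```typescript
// Made-up per-client state: everything the connection needs between events,
// compressed into one small record instead of smeared across a thread stack.
interface ClientState {
  id: number;
  bytesReceived: number;
  phase: "handshake" | "streaming" | "closing";
}

const clients = new Map<number, ClientState>();

// One event-loop callback services every client: no per-client stack, no
// guard page, no context switch -- just a lookup and a state update.
function onData(clientId: number, chunk: Uint8Array): void {
  const state = clients.get(clientId);
  if (!state) return;
  state.bytesReceived += chunk.length;
  if (state.phase === "handshake") state.phase = "streaming";
}
```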

But if you don't need to cater to C10K (or C10M) then thread-per-client is definitely simpler.

So IMO it's really about trade-offs. Does your service need to be C10K? How much are you paying for the hardware/cloud you're running it on? And so on. Being more efficient will be more costly in developer cycles -- that can be very expensive, and that's the reason that research into async-await is ongoing: hopefully it can make C10K dev cheaper.

But remember, rewrites cost even more than doing it right the first time.

RantyDave

Almost as an aside, the article makes an interesting point: memory accesses can block. Presumably if it blocks because it's accessing a piece of hardware, the operating system schedules another thread on that core... but what if it blocks on a 'normal' memory access? Does it stall the core entirely? Can 'hyperthreading' briefly run another thread? Does out-of-order execution make it suddenly not a problem? Surely it doesn't go all the way down to the OS?

agentkilo

People should try Janet (the programming language). Its fiber abstraction got everything right IMO.

Functions in Janet don't have "colors", since fiber scheduling is built into the runtime at a lower level. You can call "async" functions from anywhere, and Janet's event loop will handle it for you. It's so ergonomic that it almost feels like Erlang.

Janet has something akin to Erlang's supervisor pattern too, which, IMO, is a decent implementation of "structured concurrency" mentioned in the article.

mark_l_watson

Nice read, and the article got me to take a look at Java’s project Loom and then Eric Normand’s writeup on Loom and threading options for Clojure.

Good stuff.

pwdisswordfishz

> Go, for instance, gets away without most of this, and that does not make it an inferior language!

Yes, it does. Among other things.

NeutralForest

I thought that was interesting and I definitely get the frustration in some respects. I'm mostly familiar with Python, and the function "coloring" issue is so annoying because it forces you to have two APIs depending on whether you're async or not (look at SQLAlchemy, for example). The ergonomics are bad in general, and I don't really like having to deal with, for example, awaiting a result that will be needed in a sync function.

That being said, some alternatives were mentioned (structured concurrency à la Go), but I'd like to hear from people in BEAM land (Elixir) about what they think of it. Though I understand that for systems languages, handling concurrency through a VM is not an option.

andrewstuart

Feels academic, because despite the concerns raised, I only experience async/await as a good thing in the real world.

unscaled

I think most of the arguments in this essay rely on this single premise: "The second thing I want you to take away is that imperative languages are not inferior to functional ones."

There is an implied assumption that async/await is a "functional feature" that was pushed into a bunch of innocent imperative languages and polluted them. But there is one giant problem with this assumption: async/await is not a functional feature. If anything, it's the epitome of an imperative flow-control feature.

There are many kinds of functional languages out there, but I think the best common denominator for a primarily functional language nowadays is exactly this: in functional languages, control flow structures are first-class citizens, and they can be customized by the programmer. In fact, most control flow structures are basically just functions, and the ones that aren't (e.g. pattern matching in ML-like languages and monadic comprehensions in Haskell-inspired languages) are extremely generic, and their behavior depends on the types you feed into them. There are other emphasis points that you see in particular families of languages, such as pattern matching, strict data immutability or lazy computation — but none of these is a core functional concept.

The interesting point I want to make is that no primarily functional language that I know of actually has async/await. Some of them have monads, and those monads could be used for something like async/await, but that's not a very common use, and monad comprehensions can be used for other things. For instance, you could use do expressions in Haskell (or for expressions in Scala) to operate on multiple lists at once. The same behavior is possible with nested for-loops in virtually every modern imperative language, but nobody has blamed Algol for "polluting" the purity of our Fortran gotos and arithmetic ifs with this "fancy functional garbage monad from damned ivory tower Academia". That would be absurd, not only because no programming language with monadic comprehensions existed back then, but also because for loops are a very specific syntax for a very specific thing that can be done with a monadic expression. They turn a very abstract functional concept into a highly specific — and highly imperative — feature. The same is true for await. It's an imperative construct that instructs the runtime to suspend (or the compiler to turn the current function into a state machine).

So no, async/await does not have anything to do with functional-language envy and is, in fact, a feature that is quite antithetical to functional programming. If there is any theoretical paradigm behind async/await (vs. just using green threads), it's strong typing, and especially the idea of representing effects with types. This is somewhat close to fully-fledged effect systems (in languages such as Koka), but not as powerful. The general idea is that certain functions behave in a way that is "infective": if foo() calls bar(), which in turn calls doStuff(), it might be impacted by some side effect of doStuff(). To prevent unpleasant surprises, we want to mark this thing that doStuff() does in the function signature (using an extra argument, a return type wrapper, or just an extra modifier like "async").
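
A small sketch of that infectiveness, reusing the comment's foo/bar/doStuff names (the fetch URL is just a placeholder):

```typescript
// doStuff performs async I/O, so its signature carries the async marker...
async function doStuff(): Promise<string> {
  const res = await fetch("https://example.com/data"); // placeholder endpoint
  return res.text();
}

// ...which forces bar to either become async itself or juggle the raw promise...
async function bar(): Promise<string> {
  return await doStuff();
}

// ...which in turn forces foo to carry the same marker. The effect propagates
// up the call graph much like a `throws IOException` clause does in Java.
async function foo(): Promise<void> {
  console.log(await bar());
}
```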

In a pure language like Haskell, everything from I/O to mutable memory requires specifying an effect, and this is usually done through monadic return types. But even the very first version of Java (Ron Pressler's ideal, untarnished "imperative" language) has effects (or "colors") which still remain in the language: checked exceptions. They are just as infective as async I/O. If you don't handle exceptions in place, a function marked with "throws IOException" (basically almost any function that deals with I/O) can only be called by another function marked with "throws IOException". What's worse, unlike JavaScript, which only has two colors (async and non-async), Java has an infinite number of colors!

The description above sounds horrible, but it's not. Checked exceptions are widely believed to be a mistake[1], but they don't bother Java developers enough to make the language unusable. You can always just wrap them in another exception and rethrow. The ergonomics could have been made slightly better, but they're decent enough. The same can be said for async/await. If you take a language with a similar feature that is close to Java (C# or Kotlin), you'll see that asynchronous functions can still be run as blocking code from inside synchronous functions, while synchronous functions can be scheduled onto another thread from asynchronous code. The ergonomics of doing that are not any harder than wrapping exceptions.

In addition to that, the advantages of marking a function that runs asynchronous I/O (just like marking a function that throws an exception) are obvious, even if the move itself is controversial. These functions generally involve potentially slow network I/O and you don't want to call them by mistake. If you think that never happens, here is the standard Java API for constructing an InetAddress object from a string representing an IPv4 or IPv6 address: InetAddress.getByName()[2]. Unfortunately, if your IP address is invalid, this function may block while trying to resolve it as a domain name. That's plain bad API design, but APIs that can block in surprising ways are abundant, so you cannot argue that async/await doesn't introduce additional safety.

But let's face it: in most cases, choosing async/await vs. green threads for an imperative language is a matter of getting the right trade-off. Async/await schedulers are easier to implement (they don't need to deal with segmented/relocatable/growable stacks) and do not require runtime support. Async/await also exhibits more efficient memory usage, and arguably better performance in scenarios that do not involve a long call graph of async functions. Async/await schedulers also integrate more nicely with blocking native code that is used as a library (i.e. C/C++, Objective-C or Rust code). With green threads, you just cannot run this code directly from the virtual thread, and if the code is blocking, your life becomes even harder (especially if you don't have access to kernel threads). Even with full control of the runtime, you'd usually end up with a certain amount of overhead for native calls[3].

Considering these trade-offs, async/await is a perfect fit in scenarios like the ones below:

1. JavaScript had multiple implementations. Not only were most of them single-threaded, they would also need a major overhaul to support virtual threads even if a thread API was specified.

2. Rust actually tried green threads and abandoned them. The performance was abysmal for a language that seeks zero-cost abstractions, and Rust's systems programming requirements made them a deal breaker even if that hadn't been the case. Rust just had to support pluggable runtimes, and mandating dynamic stacks simply won't work inside the kernel or in soft real-time systems.

3. Swift had to interoperate with a large amount of Objective-C code that was already using callbacks for asynchronous I/O (that's what it had). In addition, it is not a garbage-collected language, and it still needed to call a lot of C and Objective-C APIs, even if they were wrapped in nice Swift classes.

4. C# already had a Promise-like Task mechanism that evolved around wrapping native Windows asynchronous I/O. If .NET were redesigned from scratch today, they could very well have gone with green threads, but given the way .NET developed, that would have just introduced a lot of compatibility issues for almost no gain.

5. Python had the GIL, as the article already mentioned. But even with patching runtime I/O functions (like greenlet — or more accurately, gevent[4] — did), there were many third party libraries relying on native code. Python just went with the more compatible approach.

6. Java did not have any established standard for asynchronous I/O. CompletableFuture was introduced in Java 8, but it wasn't as widely adopted (especially in the standard library) as the C# Task was. Java also had guaranteed full control of the runtime (unlike JavaScript and Rust), it was garbage collected (unlike Rust and Swift), and it relied less on native code than Swift, pre-.NET Core C# or Python. On the other hand, Java had a lot of crucial blocking APIs that hadn't been updated to use CompletableFuture, like JDBC and Servlets (async Servlets were cumbersome and never caught on). Introducing async/await to Java would have meant rewriting or significantly refactoring all existing frameworks in order to support it. That was not a very palatable choice, so again, Java did the correct thing and went with virtual threads.

If you look at all of these use cases, you'll see that each of these languages seems to have made the right pragmatic choice. If you are designing a new language from scratch (and that language is garbage collected and doesn't need to be compatible with another language or deal with a lot of existing native code), you can go with the ideological argument of "I want my functions to be colorless" (or, inversely, with the ideological argument of "I want all suspending functions to be marked explicitly"). In all other cases, pragmatism should win.

---

[1] Although it mostly comes down to bad composability — checked result types work very well in Rust.

[2] https://docs.oracle.com/en/java/javase/17/docs/api/java.base...

[3] See the article below for the overhead in Go. Keep in mind that the Go team has put a lot of effort into optimizing Cgo calls and reducing this overhead, but they still cannot eliminate it entirely. https://shane.ai/posts/cgo-performance-in-go1.21/

[4] https://www.gevent.org/
