Chapter 2
Reasoning About Async Rust with State Machines
Async Rust can be frustrating! You write code that looks reasonable, the compiler disagrees, and the explanation does not seem connected to what you were trying to do. You change a few things, create a few new problems, and eventually get it working without really knowing why.
And even when it compiles, you may hit a runtime issue where something hangs, never wakes up, or runs in an order you did not expect, and it is not always clear where to start debugging.
This chapter starts building a model for reasoning about async Rust instead of treating it as a list of rules to memorize. We will develop that model throughout the chapters ahead, but making it instinctive will take time and practice.
Here, the goal is to take the first step: understand how async Rust works internally and begin forming an intuition for why it behaves the way it does. With practice, this understanding will help you design reliable async code, debug it systematically, and reason about performance.
Before we continue, let’s recap the takeaways from Chapter 1:
- Async work is represented by a future, and it does not run by itself.
- Each poll moves the future forward until it finishes or has to wait.
- After waiting, the future must continue from where it stopped.
Problem with Waiting
Programs spend a surprising amount of time waiting: for a database query, a network response, a timer, or a file operation. If a thread remains occupied during that wait, it cannot do anything else. Most of the time, the CPU is not doing useful work. The thread is simply waiting for an answer from somewhere else. Async code can give the thread back while it waits. The runtime can then use that same thread to move other work forward.
Consider a chat server, game server, or API. It may hold a network connection for every connected client, but most clients are not sending data at the same moment. Their connections spend most of their time waiting for the next message. Async lets a small number of threads work on whichever connections have data ready, instead of keeping one blocked thread waiting for every connected client.
This does not make CPU-heavy work faster. A large calculation still needs CPU time. Async is most useful when work repeatedly alternates between doing a little computation and waiting on something outside the CPU.
But giving the thread back creates a new problem: the waiting work must remember enough to continue later. Where did it stop? Which values does it still need? What should run when the answer arrives?
Rust solves this by recording where the work stopped and storing the values it will need when it continues. When the work is ready again, Rust uses that saved information to resume from the right place instead of starting over.
To do that, Rust turns every async fn into a state machine. Each state represents a place where the work can stop and stores what it needs to continue from there.
State Machine
A state machine describes work as a set of possible states and the events that move it between them. At any moment, the work is in one state. When something happens, it either stays there or moves to another.
Think about an ATM. It might move through states such as Idle, Card Inserted, Authenticated, and Withdrawal Selected.
Inserting a card moves the ATM from Idle to Card Inserted. Entering the correct PIN moves it to Authenticated. Choosing a withdrawal and an amount moves it forward again. The same input can mean different things in different states: pressing a number may enter a PIN in one state and choose an amount in another.
Each state also stores what the next step needs. Card Inserted keeps the card details. Withdrawal Selected keeps the account and amount. Together, the current state and its stored values tell the ATM what has already happened and what it can do next.
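The ATM description can be sketched as a Rust enum plus one transition function. This is only an illustration: the Event type, the DEMO_PIN constant, the step function, and the fields stored in each state are assumptions made for the sketch, not part of any real ATM design.

```rust
/// The states named in the text. The stored fields are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
enum Atm {
    Idle,
    CardInserted { card: String },
    Authenticated { account: String },
    WithdrawalSelected { account: String, amount: u32 },
}

/// Things that can happen to the ATM (hypothetical names).
#[derive(Debug, Clone)]
enum Event {
    InsertCard(String),
    Digits(u32),
}

/// The PIN every card accepts in this toy model (an assumption for the sketch).
const DEMO_PIN: u32 = 1234;

/// Consume the current state and an event, and return the next state.
fn step(state: Atm, event: Event) -> Atm {
    match (state, event) {
        (Atm::Idle, Event::InsertCard(card)) => Atm::CardInserted { card },
        // In CardInserted, digits are read as a PIN.
        (Atm::CardInserted { card }, Event::Digits(pin)) if pin == DEMO_PIN => {
            // The card id doubles as the account id to keep the sketch small.
            Atm::Authenticated { account: card }
        }
        // In Authenticated, the same kind of input is read as an amount.
        (Atm::Authenticated { account }, Event::Digits(amount)) => {
            Atm::WithdrawalSelected { account, amount }
        }
        // Any other combination leaves the machine where it was.
        (state, _) => state,
    }
}

fn main() {
    let events = [
        Event::InsertCard("card-42".to_string()),
        Event::Digits(1234),
        Event::Digits(60),
    ];
    let mut atm = Atm::Idle;
    for event in events {
        atm = step(atm, event);
        println!("{atm:?}");
    }
}
```

Note how Event::Digits is handled by two different arms: the current state decides whether the digits are a PIN or an amount, and each new state carries forward only the values the next step needs.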
Now let’s see what this looks like in async Rust. We will use a small function that calculates the total price of two items in a shopping cart. The prices do not arrive immediately, so the function must get the socks price, then the shoes price, and finally add them together.
async fn cart_total(socks: Receiver<i64>, shoes: Receiver<i64>) -> i64 {
    let socks_price = socks.await;
    let shoes_price = shoes.await;
    socks_price + shoes_price
}
socks and shoes are two slow price lookups. Each one is a Receiver<i64>, the same kind of oneshot receiver we built in Chapter 1, now made generic so it can also carry a number instead of only a string (see chapter 2 of the repo for the updated code).
The function does three things:
- Wait for the socks price.
- Wait for the shoes price.
- Add the two prices.
Now slow the function down in your head.
The first time this future is polled, the socks price may not be ready yet. In that case, it returns Pending.
Later, a worker thread sends the socks price. The waker fires. The runtime polls the future again.
This time socks.await finishes and gives us:
let socks_price = 12;
But we still cannot return the total. We have not loaded the shoes price yet.
So we move to the second await:
let shoes_price = shoes.await;
And if the shoes price is not ready, the future returns Pending again.
Here is the question: While the future is Pending on shoes.await, where is socks_price?
It cannot be on the normal call stack in the usual way, because the function stopped and returned Pending to its caller. It also cannot be thrown away, because the final line still needs it:
socks_price + shoes_price
So it must live inside the future itself. More specifically, it becomes part of the state machine the compiler builds for this async fn. The future stores which state it is in and the values needed to continue from that state.
We will turn this async function into a state machine by hand. The Rust compiler performs a similar transformation behind the scenes.
Defining The States
How do we define the possible states of this state machine?
We look for the places where the work can stop and later continue. In an async fn, those places are the .await points.
And after both waits finish, the future still needs a final state that says the work is over. Once the total is returned the future is done, and a state machine needs an explicit finished state so a future that has already returned Ready is never accidentally polled forward again.
Now notice two things these three states share. The future is always in exactly one of them, never two at once. And each one has to remember different values to continue from where it paused.
Rust has a type built for exactly that use case. An enum is a value that is always exactly one of a fixed set of named variants. And in Rust a variant is not just a bare label: each one can store its own data, and different variants can hold different fields, even of different types. That is precisely what we need, one variant per state, holding the values that state must remember.
So we turn each state into a variant and store inside it only the values that state still needs. We will call them Start (still waiting on the socks price), GotSocks (socks price in, shoes price not yet), and Done (the total has been returned):
pub enum CartTotal {
    Start {
        socks: Receiver<i64>,
        shoes: Receiver<i64>,
    },
    GotSocks {
        socks_price: i64,
        shoes: Receiver<i64>,
    },
    Done,
}
This type change follows the code let socks_price = socks.await;. The moment we await socks, the receiver is used up. What we keep is the price it returned, a plain number. So the next state no longer stores socks; it stores socks_price.
Why don’t we capture shoes_price and the function return value in the state machine?
Because nothing pauses after them. A value only needs a home in the state machine if it has to survive an .await. socks_price does: it is created at the first .await and still needed after the second, so it has to wait inside GotSocks. shoes_price is created at the last .await, and nothing pauses after that. The function runs straight on to add the two prices and return, all in a single poll. The total is handed back the instant it is computed. Neither value ever has to be remembered between polls, so neither is stored, and the machine simply lands in Done.
Build The State Machine
Let’s implement it. Clone this repo and use the starter project in the chapter2 folder. It has a few small changes from the Chapter 1 code: the oneshot channel is now generic, and the project structure has been updated.
Create the file state_machine.rs inside the tinyrt/src folder. We start with the imports and the enum that defines the states.
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};
use crate::oneshot::Receiver;
pub enum CartTotal {
    Start {
        socks: Receiver<i64>,
        shoes: Receiver<i64>,
    },
    GotSocks {
        socks_price: i64,
        shoes: Receiver<i64>,
    },
    Done,
}
Then add the constructor that puts the machine in its first state:
impl CartTotal {
    pub fn new(socks: Receiver<i64>, shoes: Receiver<i64>) -> Self {
        CartTotal::Start { socks, shoes }
    }
}
The enum describes what the future can store. The poll method describes how the future moves from one state to the next.
Inside poll, we need to inspect and update the state machine. So we need mutable access to the enum value itself. self.get_mut() gives us that. This is enough for the simple future we are building here. Later, we will see that some futures need extra protection before Rust lets us move or access their stored state. That is the problem Pin is designed to handle, and we will come back to it in a later chapter.
Then we enter a loop. The match tells us which state the future is currently in: Start, GotSocks, or Done. The code for that state decides what happens next.
If the state is waiting on something that is not ready, poll returns Pending. If the state can move forward, we replace the enum with the next state and let the loop check again. That is how one call to poll can move through more than one state when the values are already ready.
// CartTotal can now be polled like any other future
impl Future for CartTotal {
    type Output = i64; // when it finishes, it returns an i64
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<i64> {
        let this = self.get_mut(); // get mutable access to the enum value
        loop { // keep moving while progress is possible
            match this { // check the current state: Start, GotSocks, or Done
                // state arms go here
                // each arm decides whether to wait, move, or finish
            }
        }
    }
}
In Start, the future is sitting at the first .await:
let socks_price = socks.await;
So the first thing poll must do is poll the socks receiver.
If socks are not ready, the cart total is not ready either. We stay in Start and return Pending.
If socks are ready, we move to GotSocks. That next state stores the socks price and keeps the shoes receiver for the second .await.
loop { // context: the state arms live inside this loop
    match this { // context: this match chooses the current state
        CartTotal::Start { socks, .. } => { // first state: we are waiting for socks
            match Pin::new(socks).poll(cx) {
                Poll::Pending => { // socks are not ready, so the cart future waits too
                    println!("[cart] Start: socks price not back yet, waiting");
                    return Poll::Pending;
                }
                Poll::Ready(socks_price) => { // socks are ready, so we can move to the next state
                    println!(
                        "[cart] Start -> GotSocks: socks are ${socks_price}"
                    );
                    // take shoes by field name from the old Start; .. ignores socks
                    // put temporary Done in this, and get the old state back
                    let CartTotal::Start { shoes, .. } =
                        std::mem::replace(this, CartTotal::Done)
                    else {
                        unreachable!()
                    };
                    *this = CartTotal::GotSocks { socks_price, shoes }; // write the real next state
                }
            }
        }
    }
}
To move from Start to GotSocks, we need to carry the shoes receiver forward. The next state must contain it.
But shoes currently lives inside CartTotal::Start. To put it into GotSocks, we must take ownership of it.
That creates a small Rust problem: we cannot pull shoes out and leave the enum half-empty.
The workaround is to swap the whole state out first.
std::mem::replace(this, CartTotal::Done) puts a temporary Done state into this. That keeps the state machine holding a complete value. At the same time, it hands back the old state that used to be there.
In this branch, that old state is the Start state. So we can take it apart and keep only the shoes receiver:
let CartTotal::Start { shoes, .. } = std::mem::replace(this, CartTotal::Done)
else {
    unreachable!()
};
The pattern uses the field name shoes, so the order of fields does not matter. The .. means “ignore the rest,” which includes socks.
Now that we own shoes, we can write the real next state:
*this = CartTotal::GotSocks { socks_price, shoes };
That is the first state transition. The function has crossed the first .await, so socks_price is no longer just a temporary local. It is now saved inside the GotSocks state while the future waits for shoes.
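The swap-then-destructure move is not specific to futures. The sketch below shows the same pattern on an unrelated, hypothetical Job enum (the type, its fields, and the start function are invented for illustration), so the ownership trick can be tried without tinyrt.

```rust
use std::mem;

/// A hypothetical two-step job, used only to show the pattern.
#[derive(Debug, PartialEq)]
enum Job {
    Queued { name: String, attempts: u32 },
    Running { name: String },
    Finished,
}

/// Move a `Queued` job to `Running`, carrying the owned `name` across.
fn start(job: &mut Job) {
    if let Job::Queued { .. } = job {
        // Park a placeholder in `*job` and take the old value out by ownership.
        let Job::Queued { name, .. } = mem::replace(job, Job::Finished) else {
            unreachable!("we just checked that the job is Queued")
        };
        // Overwrite the placeholder with the real next state.
        *job = Job::Running { name };
    }
}

fn main() {
    let mut job = Job::Queued { name: "resize-image".to_string(), attempts: 0 };
    start(&mut job);
    println!("{job:?}");
}
```

The placeholder variant is never observed by callers here, because it is overwritten before start returns. In CartTotal the placeholder happens to be Done, which is also a real state, so a panic between the swap and the overwrite would leave the machine looking finished.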
Next comes the second state.
In GotSocks, the socks price is already stored, so we poll the shoes receiver. If shoes are not ready, we return Pending and keep waiting in GotSocks. If shoes are ready, we can add the two prices, move to Done, and return the total.
loop { // context: still inside the same poll loop
    match this { // context: this is another arm of the same state match
        // CartTotal::Start { ... } (previous state arm goes above this one)
        CartTotal::GotSocks { socks_price, shoes } => { // second state: socks are already saved
            match Pin::new(shoes).poll(cx) {
                Poll::Pending => { // shoes are not ready, so keep waiting in GotSocks
                    println!(
                        "[cart] GotSocks: shoes price not back yet, ${socks_price} waits as a field"
                    );
                    return Poll::Pending;
                }
                Poll::Ready(shoes_price) => { // shoes are ready, so the total can be computed
                    let total = *socks_price + shoes_price;
                    println!(
                        "[cart] GotSocks -> Done: shoes are ${shoes_price}, cart total = ${total}"
                    );
                    *this = CartTotal::Done; // mark the future as finished
                    return Poll::Ready(total); // hand the final answer back to the executor
                }
            }
        }
    }
}
That leaves the terminal state.
Done exists so the future remembers that it has already returned its answer. Once a future has returned Ready, it should not run its body again.
CartTotal::Done => {
    panic!("polled CartTotal after it already returned the total")
}
Here’s the full implementation of the state machine.
impl Future for CartTotal {
    type Output = i64;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<i64> {
        let this = self.get_mut();
        loop {
            match this {
                CartTotal::Start { socks, .. } => {
                    match Pin::new(socks).poll(cx) {
                        Poll::Pending => {
                            println!("[cart] Start: socks price not back yet, waiting");
                            return Poll::Pending;
                        }
                        Poll::Ready(socks_price) => {
                            println!(
                                "[cart] Start -> GotSocks: socks are ${socks_price}"
                            );
                            let CartTotal::Start { shoes, .. } =
                                std::mem::replace(this, CartTotal::Done)
                            else {
                                unreachable!()
                            };
                            *this = CartTotal::GotSocks { socks_price, shoes };
                        }
                    }
                },
                CartTotal::GotSocks { socks_price, shoes } => {
                    match Pin::new(shoes).poll(cx) {
                        Poll::Pending => {
                            println!(
                                "[cart] GotSocks: shoes price not back yet, ${socks_price} waits as a field"
                            );
                            return Poll::Pending;
                        }
                        Poll::Ready(shoes_price) => {
                            let total = *socks_price + shoes_price;
                            println!(
                                "[cart] GotSocks -> Done: shoes are ${shoes_price}, cart total = ${total}"
                            );
                            *this = CartTotal::Done;
                            return Poll::Ready(total);
                        }
                    }
                },
                CartTotal::Done => {
                    panic!("polled CartTotal after it already returned the total")
                }
            }
        }
    }
}
Also, expose the new module from lib.rs:
pub mod block_on;
pub mod oneshot;
pub use block_on::block_on;
pub mod state_machine;
Run The State Machine
Now we need two slow price lookups.
We will use the oneshot channel from Chapter 1 for this. Each lookup creates a Sender and a Receiver. The sender runs on a worker thread and sends the price later. The receiver is the future that CartTotal polls.
The first sender sends the socks price after 200ms. The second sender sends the shoes price after 400ms. Until those values arrive, the receivers return Pending, which gives the hand-written state machine a real chance to pause and resume.
Create ch02_state_machine.rs inside the tinyrt/examples folder.
use std::thread;
use std::time::Duration;
use tinyrt::block_on;
use tinyrt::state_machine::CartTotal;
use tinyrt::oneshot::{self, Receiver};
// Each sender runs on a separate thread and sends its price later.
fn price_lookups() -> (Receiver<i64>, Receiver<i64>) {
    let (socks_tx, socks_rx) = oneshot::channel::<i64>();
    let (shoes_tx, shoes_rx) = oneshot::channel::<i64>();
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(200)); // socks become ready first
        socks_tx.send(12); // wakes the socks receiver
    });
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(400)); // shoes become ready later
        shoes_tx.send(89); // wakes the shoes receiver
    });
    (socks_rx, shoes_rx) // return the receiving ends to CartTotal
}
Now drive CartTotal with block_on.
fn main() {
    let (socks, shoes) = price_lookups();
    let total = block_on(CartTotal::new(socks, shoes)); // poll the state machine until it returns Ready
    println!("cart total = ${total}");
}
Run it:
cargo run --example ch02_state_machine
The hand-written machine prints every step:
[cart] Start: socks price not back yet, waiting
[cart] Start -> GotSocks: socks are $12
[cart] GotSocks: shoes price not back yet, $12 waits as a field
[cart] GotSocks -> Done: shoes are $89, cart total = $101
cart total = $101
That is the state machine running.
When you write the original async fn, you do not write this enum or this poll method yourself. The compiler builds a state machine with a similar shape behind the scenes.
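To watch the compiler-generated version pause and resume without depending on tinyrt, here is a std-only sketch. SlowPrice is a hypothetical stand-in for Receiver<i64> that returns Pending on its first poll and Ready on its second, and block_on_counting is a small executor written for this example (it is not tinyrt's block_on); it also reports how many times the outer future was polled.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical slow lookup: Pending on the first poll, Ready on the second.
struct SlowPrice {
    price: i64,
    polled_once: bool,
}

impl SlowPrice {
    fn new(price: i64) -> Self {
        SlowPrice { price, polled_once: false }
    }
}

impl Future for SlowPrice {
    type Output = i64;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<i64> {
        if self.polled_once {
            return Poll::Ready(self.price);
        }
        self.polled_once = true;
        cx.waker().wake_by_ref(); // ask the executor to poll us again
        Poll::Pending
    }
}

/// Same shape as the chapter's function, but over `SlowPrice`.
async fn cart_total(socks: SlowPrice, shoes: SlowPrice) -> i64 {
    let socks_price = socks.await;
    let shoes_price = shoes.await;
    socks_price + shoes_price
}

/// Wakes the thread that is driving the future.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Drive a future to completion and report how many times it was polled.
fn block_on_counting<F: Future>(future: F) -> (F::Output, u32) {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return (output, polls),
            Poll::Pending => thread::park(),
        }
    }
}

fn main() {
    let future = cart_total(SlowPrice::new(12), SlowPrice::new(89));
    let (total, polls) = block_on_counting(future);
    println!("cart total = ${total} after {polls} polls");
}
```

The outer future is polled three times: once to stop at socks, once to pass socks and stop at shoes, and once to finish. That matches the Start, GotSocks, Done progression of the hand-written enum.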
Applying The State Machine Model
Now that we have built the state machine by hand, we can use it to reason through common async issues.
The sections below are the same state-machine idea applied to real problems: locks held while waiting, memory kept alive by parked tasks, executor threads blocked by long poll calls, and compiler errors that point at .await but are really about what the future stored.
Reason About Locks Held While Waiting
Shared state is common in async programs: a counter, a cache, a connection table, or some small piece of application state that many tasks need to update. A Mutex protects that state so only one task can change it at a time.
When you call state.lock(), Rust gives you a guard. The guard is the permission to access the protected value, and the mutex stays locked for as long as that guard exists. When the guard is dropped, the lock is released.
In this example, touch_db() stands in for an async database call or network call. It may return Pending, so the task may pause there.
This code looks ordinary:
async fn bump(state: Arc<Mutex<i32>>) {
    let mut guard = state.lock().unwrap();
    touch_db().await;
    *guard += 1;
}
But when this future is used on a multi-threaded async runtime, the compiler can reject it with an error like this:
future cannot be sent between threads safely
future is not Send as this value is used across an await
the trait Send is not implemented for MutexGuard<'_, i32>
The state-machine model explains why.
guard is created before the .await, and it is used after the .await. That means guard must survive the pause. So the suspended state has to store the MutexGuard.
That has two consequences.
First, the lock stays held while the task is waiting for touch_db(). The task is not actively using the locked data during that wait, but other tasks still cannot take the lock. In a busy server, that can create long waits, poor throughput, or deadlocks that are hard to understand from the surface code.
Second, if a task pauses on one worker thread, the runtime may resume it later on another worker thread.
For that to be safe, the future must be safe to move between threads. That is what Send means here: a value can be transferred to another thread without breaking Rust’s safety rules.
A future is Send only if the values stored inside it are Send. std::sync::MutexGuard is not Send, so the whole future is not Send, and Rust rejects using it in a place that requires a thread-safe future.
The usual fix is to finish the locked work before the .await:
async fn bump(state: Arc<Mutex<i32>>) {
    {
        let mut guard = state.lock().unwrap();
        *guard += 1;
    } // guard dropped here
    touch_db().await;
}
Now the paused state does not store the guard. The lock is released before the task waits, and the future no longer carries a non-Send guard across .await.
When the lock truly must be held across an .await, use an async-aware lock such as tokio::sync::Mutex. Its guard is designed to cross .await points, and waiting for the lock does not block an executor thread. But the lock is still held while the task is paused.
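The Send requirement can be checked at compile time without a runtime. In this std-only sketch, require_send is a hypothetical helper that stands in for a runtime's spawn bound, touch_db is an empty stub, and poll_once is a tiny helper written for the example. The scoped version of bump passes the check; moving the .await inside the guard's block should bring back the error quoted earlier.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for the chapter's async database call.
async fn touch_db() {}

/// The fixed version: the guard lives only inside the inner block.
async fn bump(state: Arc<Mutex<i32>>) {
    {
        let mut guard = state.lock().unwrap();
        *guard += 1;
    } // guard dropped here, before the await
    touch_db().await;
}

/// Compiles only if `T` is `Send`; stands in for a runtime's spawn bound.
fn require_send<T: Send>(value: T) -> T {
    value
}

/// A waker that does nothing, enough to poll a future that never waits.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Poll a future exactly once and report whether it finished.
fn poll_once<F: Future>(future: F) -> Poll<F::Output> {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    future.as_mut().poll(&mut cx)
}

fn main() {
    let state = Arc::new(Mutex::new(0));
    // This call is the check: it compiles only if the future is Send.
    let future = require_send(bump(Arc::clone(&state)));
    assert!(poll_once(future).is_ready());
    println!("counter = {}", *state.lock().unwrap());
}
```

Because touch_db never waits in this stub, a single poll is enough to finish the future. The point of the sketch is the require_send line, not the executor.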
Reason About Memory Held While Waiting
The same rule applies to ordinary data too. If a value is still needed after .await, the paused future must keep it.
With a lock guard, that meant the lock stayed held. With a large value, the consequence is memory: the value stays alive while the task waits.
Consider the following example, where the task creates a large buffer, waits on the network, and later uses only the buffer length:
async fn example() {
    let big_buffer = vec![0_u8; 1024 * 1024];
    wait_for_network().await;
    println!("{}", big_buffer.len());
}
big_buffer is used after the .await, so the future must keep it while wait_for_network() is pending. Here big_buffer stands in for something you actually produced, like a decoded image, a parsed request body, or a file you read into memory, and big_buffer.len() stands in for the small fact you needed from it.
For one task, this may not matter. But async programs often have many tasks paused at the same time: many requests, many connections, many jobs, or many timers. If each paused task keeps a large buffer alive across .await, memory usage grows with the number of paused tasks.
The future only stores the Vec handle, a few bytes, but that handle keeps the megabyte on the heap alive. So even while the task is just waiting on the network, the buffer cannot be freed. A value stored inline instead, like a big array or struct, grows the future itself. Either way, every waiting task has its future parked somewhere.
Suppose 10,000 tasks are paused at once and each holds a 1 MB buffer: together they keep roughly 10 GB alive. These numbers are kept deliberately clean so the pattern is easy to see. Each one on its own is realistic: 10,000 tasks waiting at once is ordinary for a busy server, and 1 MB buffers are common once you are decoding images or buffering response bodies. What makes this a worst case is the combination, every task holding a full buffer and all of them parked at the same moment. Real workloads are lumpier. Buffer sizes vary, and tasks finish and free memory continuously. The point is not the exact 10 GB. The point is that the cost is paid per paused task, so it scales with how many futures are parked at the same time.
So the question is: does the whole buffer need to stay alive while this future waits?
If the code really needs the buffer after the .await, keeping it can be correct. But if you only need a small piece of information from it, extract that piece before the .await and let the large value drop.
async fn better() {
    let len = {
        let big_buffer = vec![0_u8; 1024 * 1024];
        big_buffer.len()
    };
    wait_for_network().await;
    println!("{len}");
}
Now the paused state only needs to store len, not the large buffer.
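For values stored inline, the effect shows up in the size of the future itself. The sketch below uses a 4 KiB array instead of a Vec so that the bytes live inside the future, and std::future::ready as a stand-in for wait_for_network(). The function names and the checksum helper are made up for the example, and exact sizes depend on the compiler version, so only the comparison between the two futures is meaningful.

```rust
use std::future::ready;
use std::mem::size_of_val;

/// Something small we want from the buffer (hypothetical helper).
fn checksum(buffer: &[u8]) -> usize {
    buffer.iter().map(|&byte| byte as usize).sum()
}

/// Uses the array after the await, so the whole array is stored in the future.
async fn holds_buffer() -> usize {
    let buffer = [1_u8; 4096];
    ready(()).await; // stand-in for wait_for_network()
    checksum(&buffer)
}

/// Extracts the small fact first, so only a `usize` has to survive the await.
async fn holds_sum() -> usize {
    let sum = {
        let buffer = [1_u8; 4096];
        checksum(&buffer)
    }; // buffer is gone before the await
    ready(()).await;
    sum
}

/// Sizes in bytes of the two futures, measured before either is polled.
fn future_sizes() -> (usize, usize) {
    (size_of_val(&holds_buffer()), size_of_val(&holds_sum()))
}

fn main() {
    let (with_buffer, with_sum) = future_sizes();
    println!("holds_buffer future: {with_buffer} bytes");
    println!("holds_sum future:    {with_sum} bytes");
}
```

Neither future is polled here. Calling an async fn only builds the state machine, and size_of_val reports how much room that state machine reserves for its largest state.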
Reason About Performance
Async does not make every line non-blocking.
A future gives the executor control back only when poll returns Pending or Ready. Code between two .await points runs like normal synchronous Rust inside one poll, so .await is where async code gets a chance to pause and let the executor run something else.
Blocking code does not do that. It keeps running inside the current poll call.
async fn bad_task() {
    std::thread::sleep(Duration::from_secs(5));
}
Even though this function is async, there is no real async pause inside it. The generated future cannot return Pending during those five seconds. The executor only sees one long poll call that does not return.
For waiting on time in async code, use an async timer such as tokio::time::sleep. It can return Pending, so the executor can run other tasks while the timer is waiting.
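The cost of a blocking call inside an async fn can be measured by timing a single poll. This std-only sketch shortens the sleep to 50 ms; NoopWaker and time_first_poll are helpers invented for the example, not part of any runtime.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};
use std::thread;
use std::time::{Duration, Instant};

/// Same shape as the chapter's `bad_task`, with a shorter sleep.
async fn bad_task() {
    thread::sleep(Duration::from_millis(50));
}

/// A waker that does nothing; this future never asks to be woken.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Poll the future once; return whether it finished and how long the call took.
fn time_first_poll<F: Future>(future: F) -> (bool, Duration) {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let started = Instant::now();
    let finished = future.as_mut().poll(&mut cx).is_ready();
    (finished, started.elapsed())
}

fn main() {
    let (finished, elapsed) = time_first_poll(bad_task());
    println!("finished in one poll: {finished}, poll took {elapsed:?}");
}
```

The future finishes on its first poll, and that one poll takes at least the full sleep. On a real executor, the worker thread running this poll could not make progress on any other task during that time.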
The same problem appears with CPU-heavy work:
async fn handle() {
    parse_large_json();
    compress_payload();
    socket.write_all(...).await;
}
The solution is to keep the synchronous chunks between .await points small and fast. If you must call blocking code, move it out of the async worker path with tokio::task::spawn_blocking or a separate blocking thread pool. For sustained CPU-heavy work, prefer a dedicated CPU pool or Rayon, so compute-heavy jobs do not saturate threads meant for short blocking calls.
Reason About Compiler Errors
When async Rust gives a confusing compiler error, the useful question is usually not “what is wrong with this line?” but “what did this future capture or keep alive?”
We already saw one example with MutexGuard: the guard lived across .await. Another common version is a borrowed value that may not live long enough.
For example:
async fn handle(client: &Client) {
    tokio::spawn(async {
        client.request().await;
    });
}
client: &Client means handle does not own the client. It only borrowed access to a client owned somewhere else. That borrowed access is only guaranteed to be valid while handle is running.
But the async block creates another future. If that future is handed to the runtime, the runtime may keep it and poll it later, even after handle has returned. If the future stored client: &Client, it would be holding borrowed access to something that may no longer be available.
The error is not really about the call to client.request(). It is about the future storing a borrowed reference.
That is why async runtimes often require spawned futures to satisfy bounds like:
Future + Send + 'static
In this context:
- Send means the future can move between worker threads safely.
- 'static means the future does not depend on borrowed data that may disappear when the current function returns.
A common fix is to move owned data into the task:
async fn handle(client: Arc<Client>) {
    tokio::spawn(async move {
        client.request().await;
    });
}
Now the future stores an owned Arc<Client>, not a borrowed &Client. The Arc keeps the client alive as long as the task needs it.
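std::thread::spawn places the same Send + 'static requirement on what it moves to another thread, so it can stand in for a runtime's spawn in a std-only sketch. Everything here is hypothetical: the Client type, its request method, the block_on helper, and a handle variant that returns the future instead of spawning it.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical client; `request` stands in for a real network call.
struct Client {
    name: String,
}

impl Client {
    async fn request(&self) -> String {
        format!("response from {}", self.name)
    }
}

/// Builds a future that owns its `Arc<Client>` and borrows nothing from the caller.
fn handle(client: Arc<Client>) -> impl Future<Output = String> + Send + 'static {
    async move { client.request().await }
}

/// Wakes the thread that is driving the future.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal executor for the sketch: poll, park until woken, repeat.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            Poll::Pending => thread::park(),
        }
    }
}

/// Create the future here, then move it to another thread and finish it there.
fn run_on_other_thread(client: Arc<Client>) -> String {
    let future = handle(client);
    thread::spawn(move || block_on(future))
        .join()
        .expect("worker thread panicked")
}

fn main() {
    let client = Arc::new(Client { name: "db".to_string() });
    println!("{}", run_on_other_thread(client));
}
```

The future is built on one thread and completed on another, which is only allowed because it owns everything it needs. Changing handle to take &Client and capture the reference would fail the 'static bound unless the reference itself were 'static.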
When you hit async compiler errors, here are some questions to consider:
- What does this async block capture?
- Which locals are still alive across .await?
- Could this future be kept after this function returns?
- Could this future move to another worker thread?
There is much more to cover in this area, but it needs concepts we have not introduced yet, especially Pin, tasks, executors, and how a runtime schedules work.
In the next chapters, we will keep building on this same state machine idea. First we will see why Pin exists, then build the task and executor that schedule futures, then connect those pieces to Tokio, spawn, Send + 'static, real I/O wakeups, timers, channels, and the patterns used in async services.