7 Building Blocks Framework

Queues: The Building Block That Makes Systems Resilient

A queue forces one uncomfortable question: can this be done later? That question is the difference between fragile and resilient.

The Building Block AI Always Forgets

Of the 7 building blocks, the queue is the one AI coding assistants almost never suggest. Ask an AI to build a video upload feature and it will generate a service that handles the upload, processes the video, and returns a response. Clean, direct, synchronous. And fragile.

The queue is the building block that makes systems survive the real world. Not the world of demos and happy paths, but the world where a million users upload videos at the same time, where downstream services go down at 2 AM, and where a traffic spike from a viral post could take your entire platform offline.

What a Queue Actually Does

A queue holds messages. That is it. One component puts a message in. Another component takes a message out. The two components never talk to each other directly.

This sounds almost too simple to matter. But that separation is what makes systems resilient.

```python
# Producer: the upload service
def handle_upload(request):
    video_id = save_raw_video(request.file)
    queue.send({
        "video_id": video_id,
        "formats": ["1080p", "720p", "480p"],
        "uploaded_by": request.user_id
    })
    return {"status": "processing", "video_id": video_id}
```

The upload service does not encode the video. It does not even know how encoding works. It saves the raw file, drops a message on the queue, and tells the user "we are processing your video." Done in under a second.

On the other side, a worker pulls messages from the queue and does the slow, heavy encoding work. If it takes 5 minutes per video, that is fine. The user is not waiting. The queue holds the work until the worker is ready.
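The worker side can be sketched like this. It is a minimal, runnable illustration, not a real broker: the in-memory `queue.Queue` stands in for SQS, RabbitMQ, or similar, and `encode_video` is a hypothetical placeholder for the slow work.

```python
import queue

# Stand-in for the message broker; in production this would be
# SQS, RabbitMQ, Redis, etc.
message_queue = queue.Queue()

def encode_video(video_id, fmt):
    # Placeholder for the slow, heavy encoding work.
    return f"{video_id}.{fmt}.mp4"

def worker_loop(q):
    """Pull messages and do the heavy work; the producer never waits on this."""
    results = []
    while not q.empty():
        msg = q.get()
        for fmt in msg["formats"]:
            results.append(encode_video(msg["video_id"], fmt))
        q.task_done()
    return results

# The producer drops a message and moves on...
message_queue.put({"video_id": "abc123", "formats": ["1080p", "720p"]})
# ...and the worker drains it whenever it is ready.
outputs = worker_loop(message_queue)
```

The producer and worker share nothing but the message format. That is the decoupling the section describes.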

The "Done Later" Insight

The most important architectural question a queue forces you to ask is: can this be done later?

That question sounds simple. It is not. It requires you to think about user expectations, not just technical implementation.

When a user uploads a video, do they need the encoded versions immediately? No. They need to know the upload succeeded. They can wait a few minutes for processing. Done later is acceptable.

When a user submits a payment, do they need the receipt email immediately? No. They need confirmation that the payment went through. The email can arrive 30 seconds later. Done later is acceptable.

When a user requests their bank balance, can that be done later? No. They are staring at the screen waiting for a number. That is a service call, not a queue job.

The "done later" test is how you decide whether a queue belongs in your design. If the answer is yes, a queue almost always makes the system better. If the answer is no, a queue adds unnecessary complexity.
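The test can even be written down as a routing decision. This is an illustrative sketch only; the task names and the `DEFERRABLE` set are invented for the example.

```python
# The "done later" test as a routing decision (illustrative names).
DEFERRABLE = {"encode_video", "send_receipt_email", "generate_thumbnails"}

def route(task_name):
    """Queue the task if it passes the 'done later' test;
    make a direct service call if the user is actively waiting."""
    return "queue" if task_name in DEFERRABLE else "service_call"
```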

Why Queues Make Systems Resilient

Three properties make queues the backbone of resilient architecture:

1. Spike Absorption

Without a queue, your system can only handle as many requests as your workers can process simultaneously. If you have 10 workers and each takes 1 second, you handle 10 requests per second. The 11th request either waits (slow) or fails (broken).

With a queue, spikes get absorbed. A million messages can pile up during a viral moment, and your workers chew through them at a steady pace. The system never breaks. It just gets temporarily behind. There is a huge difference between "slow" and "down."
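Spike absorption is easy to demonstrate with a toy model. Assume an unbounded in-memory queue and workers that drain 10 jobs per "tick"; the numbers are illustrative.

```python
import queue

q = queue.Queue()  # unbounded: the buffer that absorbs the spike

# A viral moment: 1,000 requests arrive nearly at once.
for i in range(1000):
    q.put(f"job-{i}")  # every put succeeds; nothing is rejected

# Workers drain at their own steady pace: 10 jobs per tick.
processed = 0
ticks = 0
while not q.empty():
    for _ in range(min(10, q.qsize())):
        q.get()
        processed += 1
    ticks += 1
```

Every request was accepted instantly; the workers took 100 ticks to catch up. The system was behind, but never down.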

2. Failure Isolation

Without a queue, if the video encoding service crashes, the upload service has nowhere to send work. Uploads fail. Users see errors.

With a queue, the upload service does not care whether the encoding worker is running. It drops messages on the queue and moves on. If the worker crashes, messages accumulate on the queue. When the worker restarts, it picks up where it left off. No data lost. No user-facing errors.
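A sketch of that recovery behavior, assuming the common acknowledge-after-success pattern (real brokers implement this with acks or visibility timeouts; `flaky_encode` and the crash simulation are invented for the example):

```python
import queue

q = queue.Queue()
q.put({"video_id": "v1"})
q.put({"video_id": "v2"})

def flaky_encode(msg, fail_on):
    if msg["video_id"] in fail_on:
        raise RuntimeError("worker crashed mid-job")
    return f"encoded-{msg['video_id']}"

def process_one(q, fail_on=()):
    """Drop a message only after the work succeeds;
    on failure, put it back so a restarted worker can retry."""
    msg = q.get()
    try:
        return flaky_encode(msg, fail_on)
    except RuntimeError:
        q.put(msg)  # re-queue: no data lost
        return None

# First attempt crashes mid-job; the message goes back on the queue.
first = process_one(q, fail_on={"v1"})
# After the "restart", the worker drains everything, including the retried job.
results = [process_one(q) for _ in range(q.qsize())]
```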

This is failure isolation. The queue acts as a buffer between components, preventing failures in one part of the system from cascading to others.

3. Independent Scaling

Without a queue, your upload service and your encoding logic are tangled together. To handle more uploads, you scale both, even if uploads are fast and encoding is the bottleneck.

With a queue, you scale them independently. Getting more uploads? Add upload service instances. Encoding falling behind? Add more workers. The queue mediates between them, so each component scales based on its own needs.
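Independent scaling in miniature: the queue and the jobs stay the same, and only `num_workers` changes. This is a toy sketch using threads as stand-in worker instances.

```python
import queue
import threading

def drain(q, results, lock):
    """One worker instance: pull jobs until the queue is empty."""
    while True:
        try:
            item = q.get_nowait()
        except queue.Empty:
            return
        with lock:
            results.append(item * 2)  # stand-in for the encoding work

def run_workers(num_workers, jobs):
    q = queue.Queue()
    for j in jobs:
        q.put(j)
    results, lock = [], threading.Lock()
    threads = [threading.Thread(target=drain, args=(q, results, lock))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Encoding falling behind? Scale only the worker pool: same producer,
# same queue, just more consumers.
out = run_workers(num_workers=4, jobs=range(100))
```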

The Tradeoff: Managing Expectations

Queues are not free. They introduce a user experience challenge: you have to tell users "this will be done later."

That means designing your interface for eventual completion. A progress bar. A notification when processing finishes. A "your video is being processed" state that does not look like an error.

This is a product decision, not just a technical one. The queue makes the system resilient, but only if the product design accounts for the delay.

Instagram handles this beautifully. When you post a photo, the app shows it immediately in your feed. Behind the scenes, workers are generating thumbnails, running content moderation, and updating follower feeds. You see instant feedback. The heavy work happens later.

When Not to Use a Queue

Not everything belongs in a queue. If you fail the "done later" test, a direct service call is the right choice. Checking a bank balance, authenticating a login, running a search query, authorizing a payment: in each of these, the user is staring at the screen waiting for the answer.

The common thread: these are all cases where the user is actively waiting for a result that determines their next action. A queue would add latency with no benefit.

Why AI Defaults to Direct Calls

AI coding assistants generate the simplest path from input to output: request comes in, work happens, response goes out. A direct call works correctly. It just does not work reliably under load, under failure, or under the unpredictable conditions of production.

Knowing when to introduce a queue requires thinking about what happens when things go wrong. AI does not think about those scenarios unless you explicitly tell it to. And knowing what to tell it is your value as an engineer.

The Queue in Context

The queue connects the other building blocks. It sits between services and workers, turning synchronous requests into asynchronous work. It works alongside the storage blocks that hold the data being processed.

In most production systems, the pattern looks like this: a service receives a request, stores the raw data, places a message on a queue, and returns immediately. A worker picks up the message, does the heavy processing, and stores the result. The user checks back (or gets notified) when the work is done.
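That pattern, compressed into a runnable sketch. The dict standing in for blob storage plus a status table, and the function names, are illustrative assumptions:

```python
import queue

jobs = queue.Queue()
store = {}  # stand-in for blob storage + a status table

def handle_request(video_id, payload):
    """Service: store the raw data, enqueue a message, return immediately."""
    store[video_id] = {"raw": payload, "status": "processing"}
    jobs.put(video_id)
    return {"status": "processing", "video_id": video_id}

def worker_step():
    """Worker: pick up a message, do the heavy work, store the result."""
    video_id = jobs.get()
    store[video_id]["result"] = store[video_id]["raw"].upper()  # fake "encoding"
    store[video_id]["status"] = "done"

def check_status(video_id):
    """The user checks back (or gets notified) when the work is done."""
    return store[video_id]["status"]

resp = handle_request("v42", "raw-bytes")  # returns instantly: "processing"
worker_step()                              # the slow part, done later
```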

That pattern powers Instagram uploads, Netflix video encoding, Uber ride matching, and Slack message delivery. Different products, same building blocks, same architecture.

Test Your Understanding

Ready to decide where queues belong in a real system? The interactive challenges put you in the architect's seat. You will design Instagram, Netflix, and Uber, and the queue decisions are some of the most revealing choices you will make. 3 challenges, 5 minutes each. No signup required.

Try the Interactive Challenges

And if you want to go deeper into the complete framework, start from the beginning: The 7 Building Blocks Behind Every System.