Asynchronous Programming
Harness the full potential of FastAPI's asynchronous capabilities to build high-performance applications.
Concurrency in FastAPI — Concurrent Chaos, Tamed
“Concurrency is like juggling while riding a unicycle — thrilling until you drop the database connection.”
You already know how async/await works and you’ve absorbed the magic of asynchronous I/O. Now let’s stop treating concurrency like a mythological beast and start giving it a proper routine: what it is in FastAPI, how it interacts with the event loop, what kills your throughput (and why), and how to test it so you don’t ship surprising race conditions to production.
Why this matters (fast recap)
FastAPI sits on top of Starlette and asyncio. That means your endpoints can be either def (sync) or async def (async). You’ve seen how async/await keeps the event loop from blocking — now imagine dozens or thousands of clients doing that at once.
If your app can’t coordinate concurrent tasks well, you’ll get slow responses, blocked workers, and flaky tests. If you get it right, you get low-latency, high-throughput endpoints that don’t melt under load.
Key ideas, boiled down
- Concurrency vs Parallelism: Concurrency = dealing with multiple things at once (interleaving). Parallelism = doing multiple things literally at the same time (multiple CPU cores). asyncio gives concurrency; multiple worker processes or threads give parallelism.
- I/O-bound vs CPU-bound: Async is great for I/O-bound tasks (DB, HTTP calls, sockets). For CPU-bound tasks, use process pools or external workers — the GIL and event loop cannot help you here.
- Event loop per process: Uvicorn/Hypercorn spawn processes/workers; each one has its own event loop. So scaling horizontally is often a process-level decision.
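That process-level decision is usually made when you launch the server; for example, Uvicorn's --workers flag (the module and app names below are placeholders):

```shell
# Four worker processes, each with its own event loop;
# asyncio handles concurrency inside each worker.
uvicorn main:app --workers 4
```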
Practical patterns in FastAPI
1) Use async for I/O work; sync for blocking work
- async def endpoints should only await non-blocking coroutines (async DB clients, httpx's AsyncClient, etc.).
- For a blocking library, either run it in a threadpool or use a plain def endpoint — FastAPI runs sync endpoints in its own threadpool for you.
Example: run a CPU/blocking function in the thread pool:
```python
import asyncio

from fastapi import FastAPI

app = FastAPI()

def compute_heavy(x: int) -> int:
    # CPU-heavy / blocking work
    return sum(i * i for i in range(x))

@app.get("/heavy")
async def heavy(x: int = 10_000_000):
    loop = asyncio.get_running_loop()
    # None = the loop's default ThreadPoolExecutor
    result = await loop.run_in_executor(None, compute_heavy, x)
    return {"result": result}
```
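On Python 3.9+, asyncio.to_thread does the same thing with less ceremony — it is shorthand for run_in_executor with the default thread pool. A standalone sketch (no FastAPI needed to try it):

```python
import asyncio

def compute_heavy(x: int) -> int:
    # Same kind of blocking function as above
    return sum(i * i for i in range(x))

async def main() -> int:
    # Runs compute_heavy in the default thread pool without blocking the loop
    return await asyncio.to_thread(compute_heavy, 1_000)

result = asyncio.run(main())
print(result)  # → 332833500
```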
2) Fire-and-forget: Background tasks vs task creation
Use BackgroundTasks or dedicated async tasks with care. Creating untracked tasks (asyncio.create_task) means you must manage cancellation and exceptions.
```python
from fastapi import BackgroundTasks

def send_email(to: str) -> None:
    ...  # stub: assumes you have an email helper; could be sync or async

@app.post("/notify")
async def notify(background_tasks: BackgroundTasks):
    background_tasks.add_task(send_email, "hello@example.com")
    return {"status": "queued"}
```
BackgroundTasks are safe for short-lived background work. For long-running jobs, use a worker (Celery, RQ, or similar).
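If you do reach for asyncio.create_task, hold a strong reference and attach a done-callback so failures don't vanish silently. A minimal sketch — the spawn helper and the task set are illustrative, not a FastAPI API:

```python
import asyncio

# Strong references keep fire-and-forget tasks from being garbage-collected
# mid-flight; the done-callback surfaces exceptions and drops the reference.
background_tasks: set[asyncio.Task] = set()

def spawn(coro) -> asyncio.Task:
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    task.add_done_callback(on_done)
    return task

def on_done(task: asyncio.Task) -> None:
    background_tasks.discard(task)
    if not task.cancelled() and task.exception() is not None:
        # Swap print for real logging in an application
        print(f"background task failed: {task.exception()!r}")

async def main() -> int:
    async def work(x: int) -> int:
        await asyncio.sleep(0)
        return x * 2

    task = spawn(work(21))
    return await task

result = asyncio.run(main())
print(result)  # → 42
```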
3) Running many concurrent subtasks: gather, create_task, semaphore
If you’re calling multiple external APIs concurrently, use asyncio.gather or spawn tasks — but remember to limit concurrency.
Example: limit concurrent outbound requests with a semaphore:
```python
import asyncio

import httpx
from fastapi import Query

sem = asyncio.Semaphore(10)

async def fetch(client: httpx.AsyncClient, url: str) -> str:
    async with sem:
        r = await client.get(url)
        return r.text

@app.get("/multi")
async def multi(urls: list[str] = Query(...)):  # list query params need Query
    async with httpx.AsyncClient() as client:
        tasks = [asyncio.create_task(fetch(client, u)) for u in urls]
        return await asyncio.gather(*tasks)
```
Why? Because an unbounded gather can exhaust sockets, DB connections, or RAM.
Concurrency pitfalls and defensive patterns
- Blocking calls in async endpoints: A single blocking call can stall the entire worker's event loop. Use run_in_executor or switch to async libraries.
- Shared mutable state: Beware race conditions. Use locks (asyncio.Lock) or design around immutability.
- Connection pool saturation: DB or HTTP pools limit concurrency — tune pool size and limit request concurrency accordingly.
- Cancellation: Requests can be cancelled by clients — ensure you clean up or handle partial work. Use asyncio.shield if appropriate.
Table: Quick decision helper
| Problem | Async-friendly solution |
|---|---|
| I/O-bound (DB/HTTP) | Use async drivers (asyncpg, databases, httpx.AsyncClient) |
| CPU-bound | run_in_executor or external workers/processes |
| Too many outbound requests | Use asyncio.Semaphore or a queue |
| Long-running jobs | Background worker (Celery/RQ) or external service |
Testing concurrency (builds on your Testing FastAPI Applications knowledge)
You’ve already written functional tests. Now: simulate concurrency and race conditions.
- Use httpx.AsyncClient (with httpx.ASGITransport pointed at your app) inside async pytest tests to send concurrent requests.
- Use asyncio.gather in tests to fire parallel calls.
- Assert for race conditions and ensure idempotency where required.
Example test snippet with pytest + anyio/httpx:
```python
import asyncio

import pytest
from httpx import ASGITransport, AsyncClient

from myapp import app

@pytest.mark.anyio
async def test_concurrent_increments():
    # ASGITransport drives the app in-process (the app= shortcut was removed
    # from recent httpx versions)
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as ac:
        tasks = [ac.post("/increment") for _ in range(20)]
        results = await asyncio.gather(*tasks)
        assert all(r.status_code == 200 for r in results)
        r = await ac.get("/value")
        assert r.json()["value"] == 20
```
Make sure your tests run in a realistic environment: if you use async DB drivers and a pool, tests should run against a DB with similar pool settings to detect pool exhaustion.
When to scale with workers vs async tuning
- If adding more users doesn’t change CPU usage but increases waiting for I/O, tune async concurrency and connection pools.
- If CPU spikes (e.g., heavy image processing), scale with additional processes or offload to worker queues.
Simple rule of thumb: async = scale for I/O concurrency; processes/workers = scale for CPU parallelism.
Final checklist before you ship
- Convert blocking libraries to async or isolate them with run_in_executor.
- Limit concurrency (semaphores/queues) for external resources.
- Use connection pools carefully (DB, HTTP). Tune sizes.
- Test concurrent behavior with async tests and simulate race conditions.
- Offload long-running jobs to workers.
If your app is fast but your DB is slow, your app will be simultaneously fast and useless. Fix the DB (or throttle correctly).
Takeaways — TL;DR you can use between sips of coffee
- Concurrency in FastAPI = asyncio-powered concurrency per process + process-level parallelism for scale. Know which one you need.
- Use async def for I/O-bound tasks and avoid blocking the event loop. For CPU-heavy work, use executors or worker processes.
- Limit concurrent outbound work; manage connection pools; avoid shared mutable global state without locks.
- Test concurrency deliberately: create simultaneous requests to expose race conditions and pool exhaustion.
Go build something that feels buttery smooth under load — and when it breaks, you’ll at least know whether to blame the event loop or your database.
Version note: This builds on the Async and Await basics and Understanding Asynchronous I/O. If you’ve completed the previous testing module, apply those strategies now to concurrency tests. Curious where to go next? Learn about observability (traces and metrics) to see where concurrent work actually spends time — then you’ll be unstoppable.