Design a notification fan-out system at Twitter scale

A one-billion-user social service with celebrity accounts and ordinary accounts. The interviewer wants to see you draw the line between fanout-on-write and fanout-on-read — and defend it with numbers.

Drill #042 · Last attempted: 4 days ago · Weak on: Framing · 87 staff candidates ran this last week

The interviewer will open with

Take a minute. I'd like you to design the notification fan-out for a service at Twitter's scale. Roughly a billion users, a mix of celebrities and ordinary accounts. Start wherever you like.

What you'll be designing against

The load: one billion daily active users; a sharply power-law follower distribution (the top one thousand accounts have between five million and a hundred million followers each); and an expectation that a tweet posted by an ordinary user shows up in their followers' inboxes within five seconds at the ninety-ninth percentile. Reads outnumber writes by roughly twenty to one at the median, and by a hundred to one at the celebrity tier.

The interviewer will not feed you these numbers. They expect you to clarify and quantify the load before you draw boxes. If you draw boxes first, they will let you — and then ask for the ratio thirty seconds later, when it is already too late to anchor the rest of the answer.
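A minimal sketch of the back-of-envelope arithmetic this drill expects, written in Python. The user count, the twenty-to-one ratio, and the five-second budget come from the brief above. The posting rate, the average follower count, and the peak factor do not; they are placeholder assumptions you would replace with numbers agreed with the interviewer.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class LoadAssumptions:
    daily_active_users: int = 1_000_000_000   # from the brief
    reads_per_write: int = 20                 # from the brief (median tier)
    delivery_budget_s: float = 5.0            # from the brief (p99 target)
    posts_per_user_per_day: float = 0.5       # ASSUMPTION, not in the brief
    avg_followers: int = 200                  # ASSUMPTION, not in the brief
    peak_to_average: float = 3.0              # ASSUMPTION, not in the brief


def estimate(a: LoadAssumptions) -> dict:
    """Turn the assumptions into the rates an interviewer can audit."""
    avg_write_qps = a.daily_active_users * a.posts_per_user_per_day / SECONDS_PER_DAY
    peak_write_qps = avg_write_qps * a.peak_to_average
    return {
        "avg_write_qps": avg_write_qps,
        "peak_write_qps": peak_write_qps,
        # Fanout-on-write: every post becomes one inbox write per follower.
        "peak_inbox_writes_per_s": peak_write_qps * a.avg_followers,
        "peak_read_qps": peak_write_qps * a.reads_per_write,
    }


def celebrity_inbox_writes_per_s(followers: int, budget_s: float) -> float:
    """Inbox writes per second needed to push ONE post to every follower in time."""
    return followers / budget_s


if __name__ == "__main__":
    a = LoadAssumptions()
    for name, value in estimate(a).items():
        print(f"{name}: {value:,.0f}")
    print(f"one 100M-follower post: "
          f"{celebrity_inbox_writes_per_s(100_000_000, a.delivery_budget_s):,.0f} inbox writes/s")
```

Under these assumed inputs, ordinary traffic at peak comes to roughly 3.5 million inbox writes per second. A single post from a hundred-million-follower account would need 20 million per second if held to the same five-second budget. That gap is the kind of number that justifies treating the celebrity tier separately.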

What the rubric checks

Five dimensions. Each scored on the printed rubric — Below bar, Approaching, At bar, Above bar — with a written critique. Nothing is hidden behind a black-box score.

1. Framing. Did the candidate state a read/write ratio inside the first ninety seconds, and did they commit to a regime (fanout-on-write, fanout-on-read, or a defended hybrid) before drawing any architecture?

2. Signal. Did the candidate name the celebrity-tier problem unprompted, and did they offer a threshold-tunable hybrid with a defended cutoff? Did they discuss the partitioning key trade-off (author vs follower) explicitly?

3. Evidence. Three numbers minimum: peak write QPS, median inbox depth, p99 read latency target. Each defended with back-of-envelope arithmetic the interviewer can audit on screen.

4. Recovery. When the room pushed back on the thundering-herd failure mode or the celebrity-tier hot partition, did the candidate restate the failure in their own words and propose a bound (concurrency cap, backpressure, queue depth) before the room had to suggest one?

5. Pacing. Twenty-five to thirty-five minutes for a staff round. Time budget held in proportion: roughly five minutes on framing and load, twenty on architecture and trade-offs, and the remaining time on failure modes and follow-ups.
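One of the bounds the Recovery dimension names is a concurrency cap. A minimal Python sketch of a per-partition cap follows. The class name `PartitionLimiter`, the partition key `"celebrity-42"`, and the cap of four are hypothetical, chosen only for illustration. It covers the cap alone, not backpressure or a queue-depth limit.

```python
import asyncio
from collections import defaultdict


class PartitionLimiter:
    """Caps in-flight work per partition so one hot key cannot take every worker.

    Hypothetical helper for illustration; not taken from any particular system.
    """

    def __init__(self, max_in_flight: int):
        self._max = max_in_flight
        self._sems: dict[str, asyncio.Semaphore] = {}
        self.in_flight = defaultdict(int)
        self.peak = defaultdict(int)   # highest concurrency observed per partition

    def _sem(self, partition: str) -> asyncio.Semaphore:
        if partition not in self._sems:
            self._sems[partition] = asyncio.Semaphore(self._max)
        return self._sems[partition]

    async def run(self, partition: str, work):
        # Callers beyond the cap wait here instead of piling onto the partition.
        async with self._sem(partition):
            self.in_flight[partition] += 1
            self.peak[partition] = max(self.peak[partition], self.in_flight[partition])
            try:
                return await work()
            finally:
                self.in_flight[partition] -= 1


async def demo():
    limiter = PartitionLimiter(max_in_flight=4)

    async def read_hot_timeline():
        await asyncio.sleep(0.01)   # stand-in for a read against the hot partition
        return "ok"

    # A thundering herd: 100 simultaneous requests for the same partition.
    results = await asyncio.gather(
        *(limiter.run("celebrity-42", read_hot_timeline) for _ in range(100))
    )
    return len(results), limiter.peak["celebrity-42"]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

All hundred requests complete, but the partition never sees more than four at once. In an interview answer, the interesting follow-up is what happens to the waiters: whether they queue with a depth limit, time out, or are shed.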

What a strong answer does — and what a weak one does

At-bar candidate

States the read/write ratio in the first ninety seconds — twenty reads per write at the median, a hundred at the celebrity tier.
Commits to a hybrid before drawing: write-side fanout for ordinary accounts, read-side merge for celebrities, threshold tunable around ten thousand followers.
Quantifies inbox depth, write QPS, and read latency target on screen; the interviewer can audit the arithmetic.
Names the thundering-herd failure mode for celebrity tweets unprompted, and bounds concurrency per partition.
Closes with one clear recommendation and one named open question.
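The hybrid the at-bar candidate commits to can be sketched in a few lines of Python. This is an in-memory illustration only: the class `HybridFanout`, its dictionaries, and the tuple layout are hypothetical, and it ignores storage, partitioning, and accounts that cross the threshold over time. The ten-thousand-follower default comes from the list above.

```python
from dataclasses import dataclass, field

CELEBRITY_THRESHOLD = 10_000   # from the drill; meant to be tunable


@dataclass
class HybridFanout:
    followers: dict[str, set[str]]                        # author -> follower ids
    threshold: int = CELEBRITY_THRESHOLD
    inboxes: dict = field(default_factory=dict)           # follower -> [(ts, author, text)]
    celebrity_posts: dict = field(default_factory=dict)   # author -> [(ts, author, text)]

    def publish(self, author: str, ts: int, text: str) -> None:
        item = (ts, author, text)
        audience = self.followers.get(author, set())
        if len(audience) >= self.threshold:
            # Celebrity tier: store once, merge at read time (fanout-on-read).
            self.celebrity_posts.setdefault(author, []).append(item)
        else:
            # Ordinary tier: copy into every follower's inbox (fanout-on-write).
            for follower in audience:
                self.inboxes.setdefault(follower, []).append(item)

    def read(self, user: str, following: list[str], limit: int = 20) -> list:
        merged = list(self.inboxes.get(user, []))
        for author in following:
            # Ordinary authors have no entry here, so only celebrities contribute.
            merged.extend(self.celebrity_posts.get(author, []))
        merged.sort(key=lambda item: item[0], reverse=True)   # newest first
        return merged[:limit]


if __name__ == "__main__":
    graph = {"alice": {"carol"}, "star": {"carol", "dave", "erin"}}
    fan = HybridFanout(graph, threshold=3)   # tiny threshold so the demo hits both paths
    fan.publish("alice", 1, "hi")
    fan.publish("star", 2, "hello")
    print(fan.read("carol", ["alice", "star"]))
```

The trade-off is visible in the two branches: the ordinary path pays per follower at write time, while the celebrity path pays per followed celebrity at read time. Moving the threshold shifts cost between the two.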

Below-bar candidate

Goes architecture-first — draws Kafka, Redis, Cassandra — before any number is on the page.
Confuses fanout-on-write with fanout-on-read, or commits to one without naming the trade-off.
Hand-waves the celebrity tier as "we'd just handle them differently" without proposing a threshold or a partitioning model.
Lets the interviewer extract the read/write ratio at the eight-minute mark.
Runs out of time before failure modes; ends mid-thought without a recommendation.

Drills like this one

Design a real-time presence system for a million concurrent users

Same shape: read/write asymmetry, partitioning trade-off, celebrity-tier reasoning.

Staff · 35m

Design a feed-ranking ingest pipeline for ten billion events a day

Backpressure, batching, schema evolution, replay correctness.

Staff · 45m

Design a multi-region cache invalidation protocol

Consistency model, global purge under five seconds, the cost of stale reads.

Staff · 45m

Voice-only session. The interviewer will hold silence on pauses up to seven seconds. End the session whenever you want; the scorecard arrives by email within ninety seconds.

Session configuration · last used

Bar: Staff

Target role: Staff software engineer (Stripe-shaped)

Hold silence: On (pauses up to 7s)

Push back on weak framing: On (the room will not let you skip the read/write ratio)

Duration target: 35 minutes