Design a notification fan-out system at Twitter scale
A one-billion-user social service with celebrity accounts and ordinary accounts.
The interviewer wants to see you draw the line between fanout-on-write and fanout-on-read — and defend it with numbers.
Drill #042
Last attempted · 4 days ago
Weak on · Framing
87 staff candidates ran this last week
The interviewer will open with
Take a minute. I'd like you to design the notification fan-out for a service at Twitter's scale. Roughly a billion users, a mix of celebrities and ordinary accounts. Start wherever you like.
What you'll be designing against
One billion daily-active users, a follower distribution that is sharply power-law (the top one thousand accounts have between five million and a hundred million followers each), and an expectation that a tweet posted by an ordinary user shows up in their followers' inboxes within five seconds at the ninety-ninth percentile. The read side of the system dominates the write side by roughly twenty to one at the median, and by a hundred to one at the celebrity tier.
The interviewer will not feed you these numbers. They expect you to clarify and quantify the load before you draw boxes. If you draw boxes first, they will let you — and then ask for the ratio thirty seconds later, when it is already too late to anchor the rest of the answer.
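A rough sketch of the arithmetic behind the celebrity tier, using only the follower counts and the five-second target from the brief. It assumes, purely for illustration, that a celebrity post is fanned out on write and held to the same deadline the brief states for ordinary users; the function name is hypothetical.

```python
# Back-of-envelope: inbox writes per second needed if one celebrity post
# were fanned out on write and had to reach every follower by the deadline.
DEADLINE_S = 5  # p99 delivery target from the brief (stated there for ordinary users)


def fanout_writes_per_second(followers: int, deadline_s: float = DEADLINE_S) -> float:
    """Inbox writes per second required to reach all followers within the deadline."""
    return followers / deadline_s


low = fanout_writes_per_second(5_000_000)      # bottom of the top-thousand range
high = fanout_writes_per_second(100_000_000)   # top of the top-thousand range
print(f"{low:,.0f} to {high:,.0f} inbox writes/s for a single celebrity post")
# prints "1,000,000 to 20,000,000 inbox writes/s for a single celebrity post"
```

One to twenty million inbox writes per second for a single post is the kind of figure that makes the fanout-on-write versus fanout-on-read line worth defending out loud.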
What the rubric checks
Five dimensions. Each scored on the printed rubric — Below bar, Approaching, At bar, Above bar — with a written critique. Nothing is hidden behind a black-box score.
1. Framing
Did the candidate state a read/write ratio inside the first ninety seconds, and did they commit to a regime (fanout-on-write, fanout-on-read, or a defended hybrid) before drawing any architecture?
2. Signal
Did the candidate name the celebrity-tier problem unprompted, and did they offer a threshold-tunable hybrid with a defended cutoff? Did they discuss the partitioning key trade-off (author vs follower) explicitly?
3. Evidence
Three numbers minimum: peak write QPS, median inbox depth, p99 read latency target. Each defended with back-of-envelope arithmetic the interviewer can audit on screen.
4. Recovery
When the room pushed back on the thundering-herd failure mode or the celebrity-tier hot partition, did the candidate restate the failure in their own words and propose a bound (concurrency cap, backpressure, queue depth) before the room had to suggest one?
5. Pacing
Twenty-five to thirty-five minutes for a staff round. Time budget held in proportion: roughly five minutes on framing and load, twenty on architecture and trade-offs, and the remaining time on failure modes and follow-ups.
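The kind of bound the Recovery dimension asks for can be sketched as a per-partition concurrency cap. This is a minimal illustration, not a prescribed design: the class name, the cap of four, and the sleep standing in for an inbox write are all assumptions.

```python
import asyncio
from collections import defaultdict

MAX_IN_FLIGHT = 4  # hypothetical per-partition cap; a real value would come from measurement


class PartitionLimiter:
    """Caps concurrent inbox writes per partition so a hot partition is not swamped."""

    def __init__(self, max_in_flight):
        self._max = max_in_flight
        self._semaphores = {}
        self._in_flight = defaultdict(int)
        self.peak = defaultdict(int)  # highest concurrency observed per partition

    async def run(self, partition, write):
        if partition not in self._semaphores:
            self._semaphores[partition] = asyncio.Semaphore(self._max)
        # Callers beyond the cap are suspended here until a slot frees up.
        async with self._semaphores[partition]:
            self._in_flight[partition] += 1
            self.peak[partition] = max(self.peak[partition], self._in_flight[partition])
            try:
                await write()
            finally:
                self._in_flight[partition] -= 1


async def demo():
    limiter = PartitionLimiter(MAX_IN_FLIGHT)

    async def write_inbox():
        await asyncio.sleep(0.01)  # stand-in for a real inbox write

    # Fifty writes arrive at once for the same hot partition (a thundering herd).
    await asyncio.gather(*(limiter.run("hot", write_inbox) for _ in range(50)))
    return limiter.peak["hot"]


peak = asyncio.run(demo())
print(peak)  # never exceeds MAX_IN_FLIGHT
```

The point to make in the room is the same one the sketch makes: the herd still arrives, but the partition only ever sees a bounded number of writes at a time.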
What a strong answer does — and what a weak one does
At-bar candidate
States the read/write ratio in the first ninety seconds — twenty reads per write at the median, a hundred at the celebrity tier.
Commits to a hybrid before drawing: write-side fanout for ordinary accounts, read-side merge for celebrities, threshold tunable around ten thousand followers.
Quantifies inbox depth, write QPS, and read latency target on screen; the interviewer can audit the arithmetic.
Names the thundering-herd failure mode for celebrity tweets unprompted, and bounds concurrency per partition.
Closes with one clear recommendation and one named open question.
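The hybrid an at-bar candidate commits to can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not a production design: the class, its fields, and the method names are hypothetical, and only the ten-thousand-follower default comes from the answer above.

```python
from dataclasses import dataclass, field


@dataclass
class HybridFanout:
    threshold: int = 10_000  # tunable follower cutoff between the two regimes
    followers: dict = field(default_factory=dict)     # author -> set of followers
    following: dict = field(default_factory=dict)     # user -> set of authors
    inboxes: dict = field(default_factory=dict)       # user -> list of (ts, author, text)
    author_posts: dict = field(default_factory=dict)  # celebrity -> list of (ts, author, text)

    def follow(self, user, author):
        self.followers.setdefault(author, set()).add(user)
        self.following.setdefault(user, set()).add(author)

    def post(self, author, ts, text):
        fans = self.followers.get(author, set())
        if len(fans) < self.threshold:
            # Ordinary account: fanout-on-write, one inbox append per follower.
            for fan in fans:
                self.inboxes.setdefault(fan, []).append((ts, author, text))
        else:
            # Celebrity account: store once, merge at read time.
            self.author_posts.setdefault(author, []).append((ts, author, text))

    def timeline(self, user, limit=50):
        # Read path: precomputed inbox plus posts pulled from followed celebrities.
        merged = list(self.inboxes.get(user, []))
        for author in self.following.get(user, set()):
            merged.extend(self.author_posts.get(author, []))
        return sorted(merged, reverse=True)[:limit]


svc = HybridFanout(threshold=2)  # tiny cutoff so the example exercises both paths
svc.follow("a", "star")
svc.follow("b", "star")
svc.follow("a", "friend")
svc.post("friend", 1, "hi")    # one follower: written to a's inbox
svc.post("star", 2, "hello")   # two followers: stored once, merged on read
print(svc.timeline("a"))  # [(2, 'star', 'hello'), (1, 'friend', 'hi')]
```

Keeping the cutoff as a parameter mirrors the "threshold tunable" part of the answer: moving it trades write amplification against read-time merge cost.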
Below-bar candidate
Goes architecture-first — draws Kafka, Redis, Cassandra — before any number is on the page.
Confuses fanout-on-write with fanout-on-read, or commits to one without naming the trade-off.
Hand-waves the celebrity tier as "we'd just handle them differently" without proposing a threshold or a partitioning model.
Lets the interviewer extract the read/write ratio at the eight-minute mark.
Runs out of time before failure modes; ends mid-thought without a recommendation.
Drills like this one
Design a real-time presence system for a million concurrent users
Same shape: read/write asymmetry, partitioning trade-off, celebrity-tier reasoning.
Staff · 35m
Design a feed-ranking ingest pipeline for ten billion events a day
Consistency model, global purge under five seconds, the cost of stale reads.
Staff · 45m
Voice-only session. The interviewer will hold silence on pauses up to seven seconds. End the session whenever you want; the scorecard arrives by email within ninety seconds.