Design a Social Media News Feed — Complete System Design Walkthrough
The news feed is one of the most iconic system design interview questions. It appears at every tier of tech company — from startups to FAANG — because it touches on fan-out strategies, ranking algorithms, caching, and the tension between consistency and latency at massive scale. In this walkthrough, we will design a social media news feed from scratch, following the structured 6-stage approach that interviewers expect.
Stage 1: Requirements Gathering
Start every system design interview by clarifying scope. Spend 3-5 minutes here — it demonstrates maturity and prevents you from designing the wrong system.
Functional Requirements
- Create posts — Users can publish posts containing text, images, and videos.
- Follow/unfollow users — Asymmetric follow model (like Twitter/Instagram, not mutual friendship like Facebook).
- View personalized feed — A ranked, scrollable timeline of posts from followed users.
- Like and comment on posts — Social interactions that also influence feed ranking.
- Support multiple content types — Text-only, single image, image carousel, short video.
Non-Functional Requirements
- Scale: 500 million total users, 200 million DAU.
- Social graph density: Average user follows 500 accounts. Some celebrities have 50M+ followers.
- Read volume: 10 billion feed reads per day (~115,000 reads/second average, ~350,000 at peak).
- Write volume: 200 million posts per day (~2,300 writes/second average).
- Latency: Feed load under 200ms at p99 for cached users. New posts appear in followers' feeds within 5 seconds.
- Availability: 99.99% uptime — the feed is the core product surface.
Interview tip: Explicitly state the read-to-write ratio. Here it is roughly 50:1. This immediately signals that the system is read-heavy and that caching and fan-out strategy will be central design decisions.
Stage 2: API Design
Define the contract between client and server before diving into internals. Keep APIs RESTful and paginated.
Core Endpoints
| Method | Endpoint | Purpose |
|---|---|---|
| POST | /v1/posts | Create a new post |
| GET | /v1/feed?cursor={cursor}&limit=20 | Fetch personalized feed (paginated) |
| POST | /v1/users/{id}/follow | Follow a user |
| DELETE | /v1/users/{id}/follow | Unfollow a user |
| POST | /v1/posts/{id}/like | Like a post |
| POST | /v1/posts/{id}/comments | Add a comment |
Create Post Request
POST /v1/posts
| Field | Type | Description |
|---|---|---|
| content | string | Text body of the post |
| media_ids | string[] | Pre-uploaded media references |
| content_type | enum | text, image, video, carousel |
Feed Response
GET /v1/feed?cursor=abc123&limit=20
| Field | Type | Description |
|---|---|---|
| posts | Post[] | Ranked list of feed items with author info, engagement counts, and media URLs |
| next_cursor | string | Opaque cursor for the next page (encodes timestamp + post ID) |
| has_more | boolean | Whether more pages exist |
Design decision: We use cursor-based pagination rather than offset-based. With offset pagination, new posts inserted at the top cause items to shift, leading to duplicates or missed posts. A cursor (typically encoding the last seen post's timestamp and ID) provides a stable reference point regardless of new insertions.
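The opaque cursor can be sketched in a few lines. The field names and the base64-of-JSON encoding below are illustrative assumptions, not a prescribed wire format:

```python
import base64
import json

def encode_cursor(score: float, post_id: str) -> str:
    """Pack the last-seen score and post ID into an opaque token."""
    raw = json.dumps({"s": score, "id": post_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(cursor: str) -> tuple[float, str]:
    """Recover the (score, post_id) pair from an opaque token."""
    raw = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return raw["s"], raw["id"]

cursor = encode_cursor(1700000000.5, "post_9123")
assert decode_cursor(cursor) == (1700000000.5, "post_9123")
```

Because the client never inspects the cursor, the server is free to change the encoding (for example, adding a ranking-session ID) without breaking clients.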
Stage 3: Data Model
The data model must support two dominant access patterns: writing a post and its fan-out, and reading a user's personalized feed with low latency.
Posts Table
| Column | Type | Notes |
|---|---|---|
| post_id | Snowflake ID (PK) | Time-sortable, globally unique |
| author_id | UUID | Indexed for author profile page |
| content | text | |
| content_type | enum | text, image, video, carousel |
| media_urls | JSON array | CDN URLs for attached media |
| like_count | int | Denormalized counter |
| comment_count | int | Denormalized counter |
| created_at | timestamp | |
Store posts in a relational database (PostgreSQL) sharded by author_id. Each shard holds all posts for a subset of users, making author-profile queries efficient.
Social Graph
| Column | Type | Notes |
|---|---|---|
| follower_id | UUID | Who follows |
| followee_id | UUID | Who is followed |
| created_at | timestamp | |
Store in a dedicated graph-optimized store or a relational table with composite index on (followee_id, follower_id). The critical query is: given a user who just posted, return all their follower IDs. Secondary index on follower_id supports the reverse query: who does this user follow?
Feed Cache (Redis Sorted Set)
| Key | Value | Score |
|---|---|---|
| feed:{user_id} | post_id | Timestamp or relevance score |
Each user's feed is a Redis sorted set containing post IDs, scored by a combination of recency and relevance. We keep the most recent 800 post IDs per user. Older posts fall off the cache and are fetched from the database on demand (rare, since most users do not scroll past 200 items).
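A minimal in-memory model of the capped sorted-set semantics — in Redis this would be a ZADD followed by a ZREMRANGEBYRANK trim; the FeedCache class here is a stand-in for illustration:

```python
FEED_CAP = 800  # most recent post IDs retained per user

class FeedCache:
    """In-memory stand-in for one Redis sorted set per user."""

    def __init__(self):
        self.feeds = {}  # user_id -> {post_id: score}

    def push(self, user_id: str, post_id: str, score: float) -> None:
        feed = self.feeds.setdefault(user_id, {})
        feed[post_id] = score  # ZADD feed:{user_id} score post_id
        if len(feed) > FEED_CAP:
            # ZREMRANGEBYRANK: evict the lowest-scored (oldest) entries
            for pid, _ in sorted(feed.items(), key=lambda kv: kv[1])[: len(feed) - FEED_CAP]:
                del feed[pid]

    def page(self, user_id: str, limit: int = 20) -> list[str]:
        feed = self.feeds.get(user_id, {})  # ZREVRANGE feed:{user_id} 0 limit-1
        return sorted(feed, key=feed.get, reverse=True)[:limit]

cache = FeedCache()
for i in range(1000):
    cache.push("u1", f"post_{i}", float(i))
assert len(cache.feeds["u1"]) == FEED_CAP   # capped at 800
assert cache.page("u1", 3) == ["post_999", "post_998", "post_997"]
```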
Content Store (Separate from Metadata)
Post media (images, videos) is stored in object storage (S3) behind a CDN. The posts table holds only CDN URLs. This separation is critical — serving media directly from the application database would be catastrophic for latency and throughput.
Stage 4: High-Level Architecture
The core architectural decision for any news feed system is the fan-out strategy: when and how do we assemble each user's feed?
Fan-Out-on-Write (Push Model)
When a user publishes a post, we immediately push the post ID into the feed cache of every follower.
- Pro: Feed reads are instant — just read the pre-assembled sorted set from Redis.
- Con: A celebrity with 50 million followers triggers 50 million cache writes per post. At 2,300 posts/second globally, this creates enormous write amplification.
Fan-Out-on-Read (Pull Model)
When a user requests their feed, we fetch the latest posts from all accounts they follow and merge them in real time.
- Pro: No write amplification. Celebrity posts are written once.
- Con: Feed reads become expensive. A user following 500 accounts requires 500 queries, merged and ranked on the fly. At 115,000 reads/second, this is not viable.
Hybrid Approach (The Right Answer)
The industry-standard solution combines both strategies:
- Regular users (fewer than 10,000 followers): Fan-out-on-write. Their posts are pushed to all followers' feed caches immediately. This covers 99%+ of all users.
- Celebrity users (10,000+ followers): Fan-out-on-read. Their posts are not pushed to followers' caches. Instead, when a user loads their feed, the Feed Service fetches recent posts from the celebrities they follow, merges them with the pre-assembled cache, and ranks the combined list.
The threshold (10,000 followers) is configurable and can be tuned based on system capacity. This hybrid model caps the worst-case fan-out while keeping feed reads fast for the common case.
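The routing decision at the heart of the hybrid model can be sketched as follows; the function name and the plain-dict cache are illustrative stand-ins for the Fan-out Service and Redis:

```python
CELEB_THRESHOLD = 10_000  # configurable follower-count cutoff

def fan_out(post_id: str, author_id: str, follower_count: int,
            followers: list[str], cache: dict) -> str:
    """Decide push vs. pull for a newly created post."""
    if follower_count >= CELEB_THRESHOLD:
        # Celebrity: write once to the celebrity post cache; pulled at read time.
        cache.setdefault(f"celeb_posts:{author_id}", []).append(post_id)
        return "pull"
    # Regular user: push the post ID into every follower's feed cache.
    for follower in followers:
        cache.setdefault(f"feed:{follower}", []).append(post_id)
    return "push"

store = {}
assert fan_out("p1", "alice", 250, ["bob", "carol"], store) == "push"
assert store["feed:bob"] == ["p1"]
assert fan_out("p2", "celeb", 50_000_000, [], store) == "pull"
assert store["celeb_posts:celeb"] == ["p2"]
```

Note that the celebrity branch does constant work per post regardless of follower count — that is the entire point of the hybrid.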
Write Path
- User submits a post via POST /v1/posts.
- The Post Service validates the content, generates a Snowflake ID, resolves the pre-uploaded media_ids to CDN URLs, and writes the post to the Posts DB.
- The Post Service publishes a PostCreated event to a message queue (Kafka).
- The Fan-out Service consumes the event. It queries the Social Graph for the author's followers.
- For regular users: the Fan-out Service writes the post ID to each follower's Feed Cache (Redis ZADD).
- For celebrities: the Fan-out Service does nothing — these posts will be pulled at read time.
Read Path
- User requests their feed via GET /v1/feed.
- The Feed Service reads the pre-assembled post IDs from the user's Feed Cache.
- The Feed Service identifies which celebrities the user follows and fetches their recent post IDs from the Posts DB (or a celebrity post cache).
- The Merge & Rank layer combines both lists, applies the ranking algorithm, and returns the top N posts.
- The Feed Service hydrates the post IDs into full post objects (author info, media URLs, engagement counts) using a multi-get from the Posts DB or a post-detail cache.
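The read path above can be sketched end-to-end; the dict-based caches and (post_id, score) tuples are simplifying assumptions standing in for Redis reads:

```python
import heapq

def read_feed(user_id: str, feed_cache: dict, celeb_cache: dict,
              following_celebs: list[str], limit: int = 20) -> list[str]:
    """Merge the pre-assembled feed with pulled celebrity posts, then rank."""
    # Step 1: pre-assembled (post_id, score) pairs from the user's feed cache.
    candidates = list(feed_cache.get(user_id, []))
    # Step 2: pull recent posts from each followed celebrity's post cache.
    for celeb in following_celebs:
        candidates.extend(celeb_cache.get(celeb, []))
    # Step 3: rank the combined list and return one page of post IDs.
    return [pid for pid, _ in heapq.nlargest(limit, candidates, key=lambda c: c[1])]

feed = {"u1": [("p1", 10.0), ("p2", 8.0)]}
celebs = {"star": [("p3", 9.0)]}
assert read_feed("u1", feed, celebs, ["star"], limit=2) == ["p1", "p3"]
```

Hydration (step 4 in the list above) would then turn these IDs into full post objects via a multi-get against the post-detail cache.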
Stage 5: Deep Dive
We will go deep on two areas that interviewers probe most: the fan-out strategy and celebrity problem, and feed ranking and relevance scoring.
Deep Dive 1: Fan-Out Strategy and the Celebrity Problem
The celebrity problem is the defining challenge of news feed design. Let us quantify it.
Why Naive Fan-Out Breaks
Consider a celebrity with 50 million followers who posts 10 times per day. Fan-out-on-write would generate 500 million cache writes per day just for one user. Multiply across hundreds of celebrities, and the fan-out service would need to process billions of cache writes per day on top of regular user fan-out. This is not just expensive — it introduces unacceptable latency. Followers of that celebrity would not see the post for minutes while the fan-out queue drains.
Celebrity Detection and Classification
We maintain a celebrity registry — a lightweight table mapping user_id to is_celebrity: boolean. The classification runs as a background job that checks follower counts periodically. When a user crosses the threshold (e.g., 10,000 followers), they are flagged as a celebrity. When the Fan-out Service processes a PostCreated event, it checks this registry before deciding whether to fan out.
Optimizing Fan-Out for Regular Users
Even for regular users, fan-out must be efficient. Key optimizations:
- Batch writes: Instead of issuing one ZADD per follower, we pipeline Redis commands in batches of 1,000, cutting round-trip overhead by up to 1,000x.
- Prioritize active users: We fan out first to users who were active in the last 7 days. Inactive users' feeds are populated on demand when they return (backfill from the posts table). This reduces fan-out volume by 40-60% depending on the platform's activity ratio.
- Partitioned fan-out workers: Kafka partitions are keyed by author_id, so each worker handles fan-out for a subset of authors. Workers scale horizontally — during peak hours, we auto-scale the consumer group.
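The batching itself is simple chunking; with redis-py, each batch would be flushed as one pipeline of ZADD commands. A sketch, with hypothetical follower IDs:

```python
def chunks(items: list, size: int = 1000):
    """Split a follower list into pipeline-sized batches."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

followers = [f"user_{i}" for i in range(4500)]
batches = list(chunks(followers, 1000))
# 4,500 followers -> 5 pipeline round trips instead of 4,500 single ZADDs.
assert len(batches) == 5
assert sum(len(batch) for batch in batches) == 4500
```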
Celebrity Post Cache
Rather than mixing celebrity posts into each user's feed cache, we maintain a separate celebrity post cache. It is a simple sorted set per celebrity: celeb_posts:{user_id} containing their last 100 post IDs. At read time, the Feed Service identifies the celebrities a user follows (typically 20-50 accounts), fetches their recent posts from this cache (20-50 Redis reads), and merges them with the user's pre-assembled feed. This adds 5-15ms to the read path — well within our latency budget.
Deep Dive 2: Feed Ranking and Relevance Scoring
A purely chronological feed is simple to build but delivers a poor user experience. Users follow hundreds of accounts and cannot consume everything. Ranking ensures the most relevant posts surface first.
Ranking Signal Categories
| Signal Category | Examples | Weight |
|---|---|---|
| Engagement | Like count, comment count, share count, save count | High |
| Affinity | How often the viewer interacts with the author (likes, comments, DMs, profile visits) | High |
| Recency | Time since post creation (exponential decay) | Medium |
| Content type | User's historical preference (video vs. image vs. text) | Medium |
| Author quality | Author's overall engagement rate, post frequency | Low |
| Negative signals | User hid similar posts, unfollowed similar accounts | Penalty |
Scoring Formula
A simplified relevance score combines these signals:
score = (affinity_weight * affinity_score) + (engagement_weight * normalized_engagement) + (recency_weight * decay_function(age)) + (content_pref_weight * content_match) - (negative_signal_penalty)
In production, this formula is replaced by a machine learning model (typically a gradient-boosted tree or a lightweight neural network) that learns weights from user behavior. The model is trained offline on click-through and engagement data, then served via a low-latency prediction service.
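A runnable sketch of this formula, with assumed weights and an assumed 6-hour half-life for the recency decay (in production both would be learned, not hand-set):

```python
import math

# Illustrative weights; a production system learns these from engagement data.
WEIGHTS = {"affinity": 0.4, "engagement": 0.3, "recency": 0.2, "content": 0.1}
HALF_LIFE_HOURS = 6.0  # assumed: recency contribution halves every 6 hours

def relevance_score(affinity: float, engagement: float, age_hours: float,
                    content_match: float, penalty: float = 0.0) -> float:
    """Weighted sum of normalized signals, minus negative-signal penalty."""
    decay = math.exp(-math.log(2) * age_hours / HALF_LIFE_HOURS)
    return (WEIGHTS["affinity"] * affinity
            + WEIGHTS["engagement"] * engagement
            + WEIGHTS["recency"] * decay
            + WEIGHTS["content"] * content_match
            - penalty)

fresh = relevance_score(0.8, 0.5, age_hours=0.5, content_match=1.0)
stale = relevance_score(0.8, 0.5, age_hours=24.0, content_match=1.0)
assert fresh > stale  # identical signals, but recency decay lowers the older post
```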
Two-Pass Ranking Architecture
Running a complex ML model on every candidate post is expensive. We use a two-pass approach:
- Candidate generation (pass 1): Retrieve the top 500 post IDs from the feed cache + celebrity cache, sorted by a lightweight score (recency + raw engagement). This is fast — just sorted set reads and simple arithmetic.
- Re-ranking (pass 2): Send the top 500 candidates to the Ranking Service, which applies the full ML model to produce a final ordered list. The model evaluates ~500 candidates in under 50ms using batched inference.
Only the top 20-50 posts (one page) are returned to the client. The remaining ranked candidates are cached briefly for fast subsequent page loads.
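The two passes can be sketched with a stand-in model; the engagement-only pass-1 score and the linear pass-2 model below are illustrative assumptions:

```python
def rank_feed(candidates: list[dict], heavy_model,
              pool_size: int = 500, page_size: int = 20) -> list[dict]:
    """Pass 1: cheap sort prunes candidates; pass 2: ML re-rank on the pool."""
    # Pass 1: lightweight score (here: raw engagement) selects the pool.
    pool = sorted(candidates, key=lambda p: p["engagement"], reverse=True)[:pool_size]
    # Pass 2: the expensive model only scores pool_size posts, not the full feed.
    pool.sort(key=heavy_model, reverse=True)
    return pool[:page_size]

posts = [{"id": i, "engagement": i % 7, "affinity": (i * 13) % 5} for i in range(2000)]
model = lambda p: 0.6 * p["affinity"] + 0.4 * p["engagement"]  # stand-in for the ML model
page = rank_feed(posts, model)
assert len(page) == 20
assert page == sorted(page, key=model, reverse=True)  # final order is model order
```

The key property is cost asymmetry: pass 1 touches 2,000 candidates with arithmetic, pass 2 touches only 500 with the expensive model.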
Handling Ranking Freshness
Post engagement changes rapidly after publication. A post that had 10 likes when it was ranked might have 10,000 likes five minutes later. We handle this with:
- Engagement counters in Redis: Denormalized like/comment counts are updated in real time. The ranking model reads these counters at scoring time, so re-ranking naturally picks up new engagement.
- Feed invalidation on high-velocity posts: When a post's engagement crosses a threshold (e.g., 10x its expected rate), the system marks it as "trending" and boosts it in upcoming feed requests for relevant users.
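The trending check above reduces to a velocity comparison; the multiplier and counter names here are assumptions:

```python
TRENDING_MULTIPLIER = 10  # flag posts at 10x their expected engagement rate

def is_trending(engagement_last_hour: int, expected_per_hour: float) -> bool:
    """Flag high-velocity posts for a ranking boost in upcoming feed requests."""
    return engagement_last_hour >= TRENDING_MULTIPLIER * expected_per_hour

assert is_trending(engagement_last_hour=500, expected_per_hour=40.0)
assert not is_trending(engagement_last_hour=80, expected_per_hour=40.0)
```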
Stage 6: Scaling and Trade-Offs
Cache Layers
The feed system operates behind multiple cache layers:
- CDN edge cache: Static media (images, video thumbnails) served from edge locations. Cache-Control headers with long TTLs for immutable content-addressed URLs.
- Feed cache (Redis): Pre-assembled post ID lists per user. This is the primary data store for the read path. We run Redis in cluster mode with automatic sharding across 50+ nodes.
- Post detail cache (Redis/Memcached): Hydrated post objects (author name, avatar, media URLs, engagement counts). TTL of 5 minutes. Cache-aside pattern: on miss, read from database and populate cache.
- Celebrity post cache (Redis): Separate sorted sets for high-follower users. TTL of 24 hours with active refresh on new posts.
Pagination with Cursor
The cursor encodes the last seen post's score (not just timestamp) and post ID as a tiebreaker. This is essential for ranked feeds where order is not strictly chronological. The cursor is opaque to the client — encoded as a base64 string.
When the client requests the next page, the server uses the cursor to resume ranking from the correct position in the sorted set. Because rankings can shift between requests (new posts, engagement changes), we accept minor inconsistencies at page boundaries. Users rarely notice a post appearing twice or being skipped across pages, and the trade-off is justified by the reduction in server-side state.
Content Delivery
Media delivery accounts for 95%+ of bandwidth. Our strategy:
- Upload pipeline: Client uploads media to a pre-signed S3 URL. A background job generates thumbnails (multiple resolutions) and transcodes video into adaptive bitrate formats (HLS). The post is not published until processing completes.
- CDN distribution: All media is served through a CDN with edge locations worldwide. Content-addressed URLs (hash-based) enable aggressive caching with infinite TTLs.
- Progressive loading: The feed response includes a low-resolution blur hash for each image. The client renders the blur hash instantly, then lazy-loads the full-resolution image as the user scrolls to it.
Multi-Region Architecture
With 200 million DAU globally, single-region deployment creates unacceptable latency for distant users. We deploy across three regions: US-East, EU-West, and AP-Southeast.
- Feed cache: Replicated per region. Fan-out writes are published to a global Kafka cluster and consumed by fan-out workers in each region.
- Social graph: Primary in one region with read replicas in others. Follow/unfollow operations route to the primary, with replication lag under 1 second.
- Posts database: Sharded globally. Posts are written to the region closest to the author and replicated asynchronously. A post created in EU-West is available in AP-Southeast within 2-3 seconds.
- Request routing: DNS-based routing sends each user to the nearest region. Cross-region fallback if a region is unhealthy.
Trade-Off Summary
| Decision | Choice | Trade-Off |
|---|---|---|
| Fan-out strategy | Hybrid (write for regular, read for celebrities) | Added complexity in the read path, but avoids write amplification for high-follower users. |
| Feed storage | Redis sorted sets | Fast reads but expensive memory. Mitigated by capping feed length at 800 items and evicting inactive users. |
| Ranking | Two-pass (lightweight filter + ML re-rank) | Adds 30-50ms to the read path, but dramatically improves feed quality and engagement. |
| Pagination | Cursor-based on score + ID | Minor inconsistencies at page boundaries vs. the cost of maintaining server-side session state. |
| Consistency model | Eventual (2-5s propagation) | New posts may take a few seconds to appear in all followers' feeds. Acceptable for a social feed — not a banking system. |
| Media delivery | CDN with content-addressed URLs | Infinite cache TTLs eliminate invalidation complexity, but requires unique URL per media version. |
Scoring Tips
To maximize your score on a news feed design question:
- Name the fan-out trade-off immediately. The instant you hear "news feed," your interviewer is waiting to see if you understand fan-out-on-write vs. fan-out-on-read. State both approaches, quantify the cost of each, and propose the hybrid. This alone puts you ahead of most candidates.
- Quantify the celebrity problem. Do not just say "celebrities have lots of followers." Calculate: 50M followers times 10 posts/day = 500M cache writes/day for a single user. Numbers make your argument concrete and memorable.
- Discuss ranking as a first-class concern. Many candidates treat ranking as an afterthought. Interviewers at top companies expect you to discuss signal categories, the two-pass architecture, and the tension between ranking quality and latency.
- Show the read path end-to-end. Walk through: cache lookup, celebrity merge, ranking, hydration, response. Show that you understand every millisecond between the user's request and the rendered feed.
- Address the inactive user optimization. Mentioning that you skip fan-out for inactive users demonstrates production-level thinking. It shows you understand that not all 500M users are active, and that wasting resources on dormant feeds is poor engineering.
- Acknowledge trade-offs explicitly. Every decision has a cost. State it. "We chose eventual consistency, which means a post may take up to 5 seconds to appear in all followers' feeds. This is acceptable for a social media feed." Interviewers reward this calibration.
Practice delivering this walkthrough in under 35 minutes, leaving room for follow-up questions. The best candidates complete all six stages and still have time to go deeper on the interviewer's area of interest. Tools like Hoppers AI can help you rehearse under realistic time pressure, providing structured feedback on the depth and clarity of each stage.