Design a Social Media News Feed — Complete System Design Walkthrough
The news feed is one of the most iconic system design interview questions. It appears at every tier of tech company — from startups to FAANG — because it touches on fan-out strategies, ranking algorithms, caching, and the tension between consistency and latency at massive scale. In this walkthrough, we will design a social media news feed from scratch, following the structured 6-stage approach that interviewers expect.
Stage 1: Requirements Gathering
Start every system design interview by clarifying scope. Spend 3-5 minutes here — it demonstrates maturity and prevents you from designing the wrong system.
Functional Requirements
- Create posts — Users can publish posts containing text, images, and videos.
- Follow/unfollow users — Asymmetric follow model (like Twitter/Instagram, not mutual friendship like Facebook).
- View personalized feed — A ranked, scrollable timeline of posts from followed users.
- Like and comment on posts — Social interactions that also influence feed ranking.
- Support multiple content types — Text-only, single image, image carousel, short video.
Non-Functional Requirements
- Scale: 500 million total users, 200 million DAU.
- Social graph density: Average user follows 500 accounts. Some celebrities have 50M+ followers.
- Read volume: 10 billion feed reads per day (~115,000 reads/second average, ~350,000 at peak).
- Write volume: 200 million posts per day (~2,300 writes/second average).
- Latency: Feed load under 200ms at p99 for cached users. New posts appear in followers' feeds within 5 seconds.
- Availability: 99.99% uptime — the feed is the core product surface.
Interview tip: Explicitly state the read-to-write ratio. Here it is roughly 50:1. This immediately signals that the system is read-heavy and that caching and fan-out strategy will be central design decisions.
Stage 2: API Design
Define the contract between client and server before diving into internals. Keep APIs RESTful and paginated.
Core Endpoints
| Method | Endpoint | Purpose |
|---|---|---|
| POST | /v1/posts | Create a new post |
| GET | /v1/feed?cursor={cursor}&limit=20 | Fetch personalized feed (paginated) |
| POST | /v1/users/{id}/follow | Follow a user |
| DELETE | /v1/users/{id}/follow | Unfollow a user |
| POST | /v1/posts/{id}/like | Like a post |
| POST | /v1/posts/{id}/comments | Add a comment |
Create Post Request
POST /v1/posts
| Field | Type | Description |
|---|---|---|
| content | string | Text body of the post |
| media_ids | string[] | Pre-uploaded media references |
| content_type | enum | text, image, video, carousel |
Feed Response
GET /v1/feed?cursor=abc123&limit=20
| Field | Type | Description |
|---|---|---|
| posts | Post[] | Ranked list of feed items with author info, engagement counts, and media URLs |
| next_cursor | string | Opaque cursor for the next page (encodes timestamp + post ID) |
| has_more | boolean | Whether more pages exist |
Design decision: We use cursor-based pagination rather than offset-based. With offset pagination, new posts inserted at the top cause items to shift, leading to duplicates or missed posts. A cursor (typically encoding the last seen post's timestamp and ID) provides a stable reference point regardless of new insertions.
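The opaque cursor can be sketched in a few lines. The field names and the base64-of-JSON encoding below are illustrative assumptions, not a prescribed wire format:

```python
import base64
import json

def encode_cursor(score: float, post_id: str) -> str:
    """Pack the last-seen score and post ID into an opaque token."""
    raw = json.dumps({"s": score, "id": post_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(cursor: str) -> tuple[float, str]:
    """Recover the (score, post_id) pair from an opaque token."""
    raw = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return raw["s"], raw["id"]

cursor = encode_cursor(1700000000.5, "post_9123")
assert decode_cursor(cursor) == (1700000000.5, "post_9123")
```

Because the client never inspects the cursor, the server is free to change the encoding (for example, adding a ranking-session ID) without breaking clients.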
Stage 3: Data Model
The data model must support two dominant access patterns: writing a post and its fan-out, and reading a user's personalized feed with low latency.
Posts Table
| Column | Type | Notes |
|---|---|---|
| post_id | Snowflake ID (PK) | Time-sortable, globally unique |
| author_id | UUID | Indexed for author profile page |
| content | text | |
| content_type | enum | text, image, video, carousel |
| media_urls | JSON array | CDN URLs for attached media |
| like_count | int | Denormalized counter |
| comment_count | int | Denormalized counter |
| created_at | timestamp | |
Store posts in a relational database (PostgreSQL) sharded by author_id. Each shard holds all posts for a subset of users, making author-profile queries efficient.
Social Graph
| Column | Type | Notes |
|---|---|---|
| follower_id | UUID | Who follows |
| followee_id | UUID | Who is followed |
| created_at | timestamp | |
Store in a dedicated graph-optimized store or a relational table with composite index on (followee_id, follower_id). The critical query is: given a user who just posted, return all their follower IDs. Secondary index on follower_id supports the reverse query: who does this user follow?
Feed Cache (Redis Sorted Set)
| Key | Value | Score |
|---|---|---|
| feed:{user_id} | post_id | Timestamp or relevance score |
Each user's feed is a Redis sorted set containing post IDs, scored by a combination of recency and relevance. We keep the most recent 800 post IDs per user. Older posts fall off the cache and are fetched from the database on demand (rare, since most users do not scroll past 200 items).
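A minimal in-memory model of the capped sorted-set semantics — in Redis this would be a ZADD followed by a ZREMRANGEBYRANK trim; the FeedCache class here is a stand-in for illustration:

```python
FEED_CAP = 800  # most recent post IDs retained per user

class FeedCache:
    """In-memory stand-in for one Redis sorted set per user."""

    def __init__(self):
        self.feeds = {}  # user_id -> {post_id: score}

    def push(self, user_id: str, post_id: str, score: float) -> None:
        feed = self.feeds.setdefault(user_id, {})
        feed[post_id] = score  # ZADD feed:{user_id} score post_id
        if len(feed) > FEED_CAP:
            # ZREMRANGEBYRANK: evict the lowest-scored (oldest) entries
            for pid, _ in sorted(feed.items(), key=lambda kv: kv[1])[: len(feed) - FEED_CAP]:
                del feed[pid]

    def page(self, user_id: str, limit: int = 20) -> list[str]:
        feed = self.feeds.get(user_id, {})  # ZREVRANGE feed:{user_id} 0 limit-1
        return sorted(feed, key=feed.get, reverse=True)[:limit]

cache = FeedCache()
for i in range(1000):
    cache.push("u1", f"post_{i}", float(i))
assert len(cache.feeds["u1"]) == FEED_CAP   # capped at 800
assert cache.page("u1", 3) == ["post_999", "post_998", "post_997"]
```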
Content Store (Separate from Metadata)
Post media (images, videos) is stored in object storage (S3) behind a CDN. The posts table holds only CDN URLs. This separation is critical — serving media directly from the application database would be catastrophic for latency and throughput.
Stage 4: High-Level Architecture
The core architectural decision for any news feed system is the fan-out strategy: when and how do we assemble each user's feed?
Fan-Out-on-Write (Push Model)
When a user publishes a post, we immediately push the post ID into the feed cache of every follower.
- Pro: Feed reads are instant — just read the pre-assembled sorted set from Redis.
- Con: A celebrity with 50 million followers triggers 50 million cache writes per post. At 2,300 posts/second globally, this creates enormous write amplification.
Fan-Out-on-Read (Pull Model)
When a user requests their feed, we fetch the latest posts from all accounts they follow and merge them in real time.
- Pro: No write amplification. Celebrity posts are written once.
- Con: Feed reads become expensive. A user following 500 accounts requires 500 queries, merged and ranked on the fly. At 115,000 reads/second, this is not viable.
Hybrid Approach (The Right Answer)
The industry-standard solution combines both strategies:
- Regular users (fewer than 10,000 followers): Fan-out-on-write. Their posts are pushed to all followers' feed caches immediately. This covers 99%+ of all users.
- Celebrity users (10,000+ followers): Fan-out-on-read. Their posts are not pushed to followers' caches. Instead, when a user loads their feed, the Feed Service fetches recent posts from the celebrities they follow, merges them with the pre-assembled cache, and ranks the combined list.
The threshold (10,000 followers) is configurable and can be tuned based on system capacity. This hybrid model caps the worst-case fan-out while keeping feed reads fast for the common case.
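The routing decision at the heart of the hybrid model can be sketched as follows; the function name and the plain-dict cache are illustrative stand-ins for the Fan-out Service and Redis:

```python
CELEB_THRESHOLD = 10_000  # configurable follower-count cutoff

def fan_out(post_id: str, author_id: str, follower_count: int,
            followers: list[str], cache: dict) -> str:
    """Decide push vs. pull for a newly created post."""
    if follower_count >= CELEB_THRESHOLD:
        # Celebrity: write once to the celebrity post cache; pulled at read time.
        cache.setdefault(f"celeb_posts:{author_id}", []).append(post_id)
        return "pull"
    # Regular user: push the post ID into every follower's feed cache.
    for follower in followers:
        cache.setdefault(f"feed:{follower}", []).append(post_id)
    return "push"

store = {}
assert fan_out("p1", "alice", 250, ["bob", "carol"], store) == "push"
assert store["feed:bob"] == ["p1"]
assert fan_out("p2", "celeb", 50_000_000, [], store) == "pull"
assert store["celeb_posts:celeb"] == ["p2"]
```

Note that the celebrity branch does constant work per post regardless of follower count — that is the entire point of the hybrid.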
Write Path
- User submits a post via POST /v1/posts.
- The Post Service validates the content, generates a Snowflake ID, resolves the pre-uploaded media_ids to CDN URLs, and writes the post to the Posts DB.
- The Post Service publishes a PostCreated event to a message queue (Kafka).
- The Fan-out Service consumes the event. It queries the Social Graph for the author's followers.
- For regular users: the Fan-out Service writes the post ID to each follower's Feed Cache (Redis ZADD).
- For celebrities: the Fan-out Service does nothing — these posts will be pulled at read time.
Read Path
- User requests their feed via GET /v1/feed.
- The Feed Service reads the pre-assembled post IDs from the user's Feed Cache.
- The Feed Service identifies which celebrities the user follows and fetches their recent post IDs from the Posts DB (or a celebrity post cache).
- The Merge & Rank layer combines both lists, applies the ranking algorithm, and returns the top N posts.
- The Feed Service hydrates the post IDs into full post objects (author info, media URLs, engagement counts) using a multi-get from the Posts DB or a post-detail cache.
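The read path above can be sketched end-to-end; the dict-based caches and (post_id, score) tuples are simplifying assumptions standing in for Redis reads:

```python
import heapq

def read_feed(user_id: str, feed_cache: dict, celeb_cache: dict,
              following_celebs: list[str], limit: int = 20) -> list[str]:
    """Merge the pre-assembled feed with pulled celebrity posts, then rank."""
    # Step 1: pre-assembled (post_id, score) pairs from the user's feed cache.
    candidates = list(feed_cache.get(user_id, []))
    # Step 2: pull recent posts from each followed celebrity's post cache.
    for celeb in following_celebs:
        candidates.extend(celeb_cache.get(celeb, []))
    # Step 3: rank the combined list and return one page of post IDs.
    return [pid for pid, _ in heapq.nlargest(limit, candidates, key=lambda c: c[1])]

feed = {"u1": [("p1", 10.0), ("p2", 8.0)]}
celebs = {"star": [("p3", 9.0)]}
assert read_feed("u1", feed, celebs, ["star"], limit=2) == ["p1", "p3"]
```

Hydration (step 4 in the list above) would then turn these IDs into full post objects via a multi-get against the post-detail cache.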
Stage 5: Deep Dive
We will go deep on two areas that interviewers probe most: the fan-out strategy and celebrity problem, and feed ranking and relevance scoring.
Deep Dive 1: Fan-Out Strategy and the Celebrity Problem
The celebrity problem is the defining challenge of news feed design. Let us quantify it.
Why Naive Fan-Out Breaks
Consider a celebrity with 50 million followers who posts 10 times per day. Fan-out-on-write would generate 500 million cache writes per day just for one user. Multiply across hundreds of celebrities, and the fan-out service would need to process billions of cache writes per day on top of regular user fan-out. This is not just expensive — it introduces unacceptable latency. Followers of that celebrity would not see the post for minutes while the fan-out queue drains.
Celebrity Detection and Classification
We maintain a celebrity registry — a lightweight table mapping user_id to is_celebrity: boolean. The classification runs as a background job that checks follower counts periodically. When a user crosses the threshold (e.g., 10,000 followers), they are flagged as a celebrity. When the Fan-out Service processes a PostCreated event, it checks this registry before deciding whether to fan out.
Optimizing Fan-Out for Regular Users
Even for regular users, fan-out must be efficient. Key optimizations:
- Batch writes: Instead of issuing one ZADD per follower, we pipeline Redis commands in batches of 1,000, cutting round-trip overhead by up to 1,000x.
- Prioritize active users: We fan out first to users who were active in the last 7 days. Inactive users' feeds are populated on demand when they return (backfill from the posts table). This reduces fan-out volume by 40-60% depending on the platform's activity ratio.
- Partitioned fan-out workers: Kafka partitions are keyed by author_id, so each worker handles fan-out for a subset of authors. Workers scale horizontally — during peak hours, we auto-scale the consumer group.
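The batching itself is simple chunking; with redis-py, each batch would be flushed as one pipeline of ZADD commands. A sketch, with hypothetical follower IDs:

```python
def chunks(items: list, size: int = 1000):
    """Split a follower list into pipeline-sized batches."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

followers = [f"user_{i}" for i in range(4500)]
batches = list(chunks(followers, 1000))
# 4,500 followers -> 5 pipeline round trips instead of 4,500 single ZADDs.
assert len(batches) == 5
assert sum(len(batch) for batch in batches) == 4500
```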
Celebrity Post Cache
Rather than mixing celebrity posts into each user's feed cache, we maintain a separate celebrity post cache. It is a simple sorted set per celebrity: celeb_posts:{user_id} containing their last 100 post IDs. At read time, the Feed Service identifies the celebrities a user follows (typically 20-50 accounts), fetches their recent posts from this cache (20-50 Redis reads), and merges them with the user's pre-assembled feed. This adds 5-15ms to the read path — well within our latency budget.
Deep Dive 2: Feed Ranking and Relevance Scoring
A purely chronological feed is simple to build but delivers a poor user experience. Users follow hundreds of accounts and cannot consume everything. Ranking ensures the most relevant posts surface first.
Ranking Signal Categories
| Signal Category | Examples | Weight |
|---|---|---|
| Engagement | Like count, comment count, share count, save count | High |
| Affinity | How often the viewer interacts with the author (likes, comments, DMs, profile visits) | High |
| Recency | Time since post creation (exponential decay) | Medium |
| Content type | User's historical preference (video vs. image vs. text) | Medium |
| Author quality | Author's overall engagement rate, post frequency | Low |
| Negative signals | User hid similar posts, unfollowed similar accounts | Penalty |
Scoring Formula
A simplified relevance score combines these signals:
score = (affinity_weight * affinity_score) + (engagement_weight * normalized_engagement) + (recency_weight * decay_function(age)) + (content_pref_weight * content_match) - (negative_signal_penalty)
In production, this formula is replaced by a machine learning model (typically a gradient-boosted tree or a lightweight neural network) that learns weights from user behavior. The model is trained offline on click-through and engagement data, then served via a low-latency prediction service.
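A runnable sketch of this formula, with assumed weights and an assumed 6-hour half-life for the recency decay (in production both would be learned, not hand-set):

```python
import math

# Illustrative weights; a production system learns these from engagement data.
WEIGHTS = {"affinity": 0.4, "engagement": 0.3, "recency": 0.2, "content": 0.1}
HALF_LIFE_HOURS = 6.0  # assumed: recency contribution halves every 6 hours

def relevance_score(affinity: float, engagement: float, age_hours: float,
                    content_match: float, penalty: float = 0.0) -> float:
    """Weighted sum of normalized signals, minus negative-signal penalty."""
    decay = math.exp(-math.log(2) * age_hours / HALF_LIFE_HOURS)
    return (WEIGHTS["affinity"] * affinity
            + WEIGHTS["engagement"] * engagement
            + WEIGHTS["recency"] * decay
            + WEIGHTS["content"] * content_match
            - penalty)

fresh = relevance_score(0.8, 0.5, age_hours=0.5, content_match=1.0)
stale = relevance_score(0.8, 0.5, age_hours=24.0, content_match=1.0)
assert fresh > stale  # identical signals, but recency decay lowers the older post
```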
Two-Pass Ranking Architecture
Running a complex ML model on every candidate post is expensive. We use a two-pass approach:
- Candidate generation (pass 1): Retrieve the top 500 post IDs from the feed cache + celebrity cache, sorted by a lightweight score (recency + raw engagement). This is fast — just sorted set reads and simple arithmetic.
- Re-ranking (pass 2): Send the top 500 candidates to the Ranking Service, which applies the full ML model to produce a final ordered list. The model evaluates ~500 candidates in under 50ms using batched inference.
Only the top 20-50 posts (one page) are returned to the client. The remaining ranked candidates are cached briefly for fast subsequent page loads.
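The two passes can be sketched with a stand-in model; the engagement-only pass-1 score and the linear pass-2 model below are illustrative assumptions:

```python
def rank_feed(candidates: list[dict], heavy_model,
              pool_size: int = 500, page_size: int = 20) -> list[dict]:
    """Pass 1: cheap sort prunes candidates; pass 2: ML re-rank on the pool."""
    # Pass 1: lightweight score (here: raw engagement) selects the pool.
    pool = sorted(candidates, key=lambda p: p["engagement"], reverse=True)[:pool_size]
    # Pass 2: the expensive model only scores pool_size posts, not the full feed.
    pool.sort(key=heavy_model, reverse=True)
    return pool[:page_size]

posts = [{"id": i, "engagement": i % 7, "affinity": (i * 13) % 5} for i in range(2000)]
model = lambda p: 0.6 * p["affinity"] + 0.4 * p["engagement"]  # stand-in for the ML model
page = rank_feed(posts, model)
assert len(page) == 20
assert page == sorted(page, key=model, reverse=True)  # final order is model order
```

The key property is cost asymmetry: pass 1 touches 2,000 candidates with arithmetic, pass 2 touches only 500 with the expensive model.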
Handling Ranking Freshness
Post engagement changes rapidly after publication. A post that had 10 likes when it was ranked might have 10,000 likes five minutes later. We handle this with:
- Engagement counters in Redis: Denormalized like/comment counts are updated in real time. The ranking model reads these counters at scoring time, so re-ranking naturally picks up new engagement.
- Feed invalidation on high-velocity posts: When a post's engagement crosses a threshold (e.g., 10x its expected rate), the system marks it as "trending" and boosts it in upcoming feed requests for relevant users.
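The trending check above reduces to a velocity comparison; the multiplier and counter names here are assumptions:

```python
TRENDING_MULTIPLIER = 10  # flag posts at 10x their expected engagement rate

def is_trending(engagement_last_hour: int, expected_per_hour: float) -> bool:
    """Flag high-velocity posts for a ranking boost in upcoming feed requests."""
    return engagement_last_hour >= TRENDING_MULTIPLIER * expected_per_hour

assert is_trending(engagement_last_hour=500, expected_per_hour=40.0)
assert not is_trending(engagement_last_hour=80, expected_per_hour=40.0)
```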
Stage 6: Scaling and Trade-Offs
Cache Layers
The feed system operates behind multiple cache layers:
- CDN edge cache: Static media (images, video thumbnails) served from edge locations. Cache-Control headers with long TTLs for immutable content-addressed URLs.
- Feed cache (Redis): Pre-assembled post ID lists per user. This is the primary data store for the read path. We run Redis in cluster mode with automatic sharding across 50+ nodes.
- Post detail cache (Redis/Memcached): Hydrated post objects (author name, avatar, media URLs, engagement counts). TTL of 5 minutes. Cache-aside pattern: on miss, read from database and populate cache.
- Celebrity post cache (Redis): Separate sorted sets for high-follower users. TTL of 24 hours with active refresh on new posts.
Pagination with Cursor
The cursor encodes the last seen post's score (not just timestamp) and post ID as a tiebreaker. This is essential for ranked feeds where order is not strictly chronological. The cursor is opaque to the client — encoded as a base64 string.
When the client requests the next page, the server uses the cursor to resume ranking from the correct position in the sorted set. Because rankings can shift between requests (new posts, engagement changes), we accept minor inconsistencies at page boundaries. Users rarely notice a post appearing twice or being skipped across pages, and the trade-off is justified by the reduction in server-side state.
Content Delivery
Media delivery accounts for 95%+ of bandwidth. Our strategy:
- Upload pipeline: Client uploads media to a pre-signed S3 URL. A background job generates thumbnails (multiple resolutions) and transcodes video into adaptive bitrate formats (HLS). The post is not published until processing completes.
- CDN distribution: All media is served through a CDN with edge locations worldwide. Content-addressed URLs (hash-based) enable aggressive caching with infinite TTLs.
- Progressive loading: The feed response includes a low-resolution blur hash for each image. The client renders the blur hash instantly, then lazy-loads the full-resolution image as the user scrolls to it.
Multi-Region Architecture
With 200 million DAU globally, single-region deployment creates unacceptable latency for distant users. We deploy across three regions: US-East, EU-West, and AP-Southeast.
- Feed cache: Replicated per region. Fan-out writes are published to a global Kafka cluster and consumed by fan-out workers in each region.
- Social graph: Primary in one region with read replicas in others. Follow/unfollow operations route to the primary, with replication lag under 1 second.
- Posts database: Sharded globally. Posts are written to the region closest to the author and replicated asynchronously. A post created in EU-West is available in AP-Southeast within 2-3 seconds.
- Request routing: DNS-based routing sends each user to the nearest region. Cross-region fallback if a region is unhealthy.
Trade-Off Summary
| Decision | Choice | Trade-Off |
|---|---|---|
| Fan-out strategy | Hybrid (write for regular, read for celebrities) | Added complexity in the read path, but avoids write amplification for high-follower users. |
| Feed storage | Redis sorted sets | Fast reads but expensive memory. Mitigated by capping feed length at 800 items and evicting inactive users. |
| Ranking | Two-pass (lightweight filter + ML re-rank) | Adds 30-50ms to the read path, but dramatically improves feed quality and engagement. |
| Pagination | Cursor-based on score + ID | Minor inconsistencies at page boundaries vs. the cost of maintaining server-side session state. |
| Consistency model | Eventual (2-5s propagation) | New posts may take a few seconds to appear in all followers' feeds. Acceptable for a social feed — not a banking system. |
| Media delivery | CDN with content-addressed URLs | Infinite cache TTLs eliminate invalidation complexity, but requires unique URL per media version. |
Scoring Tips
To maximize your score on a news feed design question:
- Name the fan-out trade-off immediately. The instant you hear "news feed," your interviewer is waiting to see if you understand fan-out-on-write vs. fan-out-on-read. State both approaches, quantify the cost of each, and propose the hybrid. This alone puts you ahead of most candidates.
- Quantify the celebrity problem. Do not just say "celebrities have lots of followers." Calculate: 50M followers times 10 posts/day = 500M cache writes/day for a single user. Numbers make your argument concrete and memorable.
- Discuss ranking as a first-class concern. Many candidates treat ranking as an afterthought. Interviewers at top companies expect you to discuss signal categories, the two-pass architecture, and the tension between ranking quality and latency.
- Show the read path end-to-end. Walk through: cache lookup, celebrity merge, ranking, hydration, response. Show that you understand every millisecond between the user's request and the rendered feed.
- Address the inactive user optimization. Mentioning that you skip fan-out for inactive users demonstrates production-level thinking. It shows you understand that not all 500M users are active, and that wasting resources on dormant feeds is poor engineering.
- Acknowledge trade-offs explicitly. Every decision has a cost. State it. "We chose eventual consistency, which means a post may take up to 5 seconds to appear in all followers' feeds. This is acceptable for a social media feed." Interviewers reward this calibration.
Practice delivering this walkthrough in under 35 minutes, leaving room for follow-up questions. The best candidates complete all six stages and still have time to go deeper on the interviewer's area of interest. Tools like Hoppers AI can help you rehearse under realistic time pressure, providing structured feedback on the depth and clarity of each stage.