Every social platform has a feed. Facebook’s News Feed, Twitter’s Timeline, Instagram’s Explore, LinkedIn’s Home — they all solve the same fundamental problem: given a user, show them the most relevant recent content from people and topics they follow.
The feed looks simple from the outside. Under the hood, it is one of the hardest distributed systems problems. A user might follow 500 people. Each of those people might post multiple times per day. The feed must aggregate, rank, and serve this content in under 200ms. And it must do this for 2 billion users.
This lesson covers the two core approaches (fan-out on write vs fan-out on read), the hybrid model used in production, ranking algorithms, real-time updates, and the full architecture.
Step 1: Requirements
Functional Requirements
- Create post: A user publishes a post (text, images, video, links).
- Follow/unfollow: Users can follow other users.
- View feed: A user sees a feed of recent posts from people they follow.
- Real-time updates: New posts appear in the feed without a page refresh.
- Interactions: Like, comment, share (affect ranking).
Non-Functional Requirements
- Low latency: Feed must load in under 200ms (p99).
- High availability: 99.99% uptime.
- Scale: 500M daily active users, 1B posts per day.
- Eventual consistency: A post appearing in followers’ feeds within a few seconds is acceptable.
Estimation
Users: 500M DAU, average 300 followees each
Posts: 1B new posts per day = ~12,000 posts/second
Feed reads: 500M users * 10 feed loads/day = 5B reads/day = ~58,000 reads/second
Feed size: Top 500 posts per user feed
Storage per feed entry: ~200 bytes (post_id, user_id, timestamp, score)
Pre-computed feeds: 500M users * 500 entries * 200 bytes = 50 TB
The read-to-write ratio is about 5:1 on feed operations. But each write (new post) can trigger thousands of feed updates (one per follower), making the effective write amplification the real challenge.
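The arithmetic behind these estimates is worth checking with a few lines (a quick sketch; the per-entry size and feed length are the assumptions stated above):

```python
SECONDS_PER_DAY = 86_400

dau = 500_000_000
posts_per_day = 1_000_000_000
feed_loads_per_user = 10
feed_entries = 500
bytes_per_entry = 200

post_writes_per_sec = posts_per_day / SECONDS_PER_DAY          # ~11.6K posts/s
feed_reads_per_day = dau * feed_loads_per_user                 # 5B reads/day
feed_reads_per_sec = feed_reads_per_day / SECONDS_PER_DAY      # ~57.9K reads/s
feed_storage_tb = dau * feed_entries * bytes_per_entry / 1e12  # 50 TB

print(round(post_writes_per_sec), round(feed_reads_per_sec), feed_storage_tb)
```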
Step 2: The Fan-Out Problem
When User A publishes a post, it needs to appear in the feeds of all their followers. If User A has 10,000 followers, that single post must be “delivered” to 10,000 feeds. This is the fan-out problem.
Fan-Out on Write (Push Model)
When a user publishes a post, immediately push it to every follower’s pre-computed feed.
# Fan-out on write: push to all followers' feeds at post time
async def publish_post(author_id: str, post: dict):
    # 1. Store the post
    post_id = await db.insert("posts", post)

    # 2. Get all followers
    followers = await db.query(
        "SELECT follower_id FROM follows WHERE followee_id = %s",
        [author_id]
    )

    # 3. Push to each follower's feed (in Redis sorted set)
    pipe = redis.pipeline()
    for follower in followers:
        feed_key = f"feed:{follower['follower_id']}"
        # Score = timestamp (for chronological ordering)
        pipe.zadd(feed_key, {post_id: post["created_at"]})
        # Trim to keep only the latest 500 entries
        pipe.zremrangebyrank(feed_key, 0, -501)
    await pipe.execute()
# Reading the feed is trivial — just read from the pre-computed sorted set
async def get_feed(user_id: str, page: int = 0, page_size: int = 20):
    feed_key = f"feed:{user_id}"
    start = page * page_size
    end = start + page_size - 1

    # Get post IDs from pre-computed feed (sorted by score/timestamp)
    post_ids = await redis.zrevrange(feed_key, start, end)

    # Fetch full post objects
    posts = await db.query(
        "SELECT * FROM posts WHERE id IN %s", [tuple(post_ids)]
    )

    # Fetch authors and engagement counts
    return await enrich_posts(posts)

Tradeoffs:
| Aspect | Detail |
|---|---|
| Read latency | Excellent — feed is pre-computed, just read from Redis |
| Write latency | High — must fan out to all followers |
| Storage | High — every post is stored N times (once per follower feed) |
| Consistency | New posts appear in feeds within seconds |
| Celebrity problem | Disastrous — a celebrity with 50M followers triggers 50M writes |
Fan-Out on Read (Pull Model)
Do nothing at post time. When a user loads their feed, query the posts table for all their followees’ recent posts.
# Fan-out on read: pull from followees at read time
async def get_feed(user_id: str, page: int = 0, page_size: int = 20):
    # 1. Get the list of people this user follows
    followees = await db.query(
        "SELECT followee_id FROM follows WHERE follower_id = %s",
        [user_id]
    )
    followee_ids = [f["followee_id"] for f in followees]

    # 2. Query recent posts from all followees
    offset = page * page_size
    posts = await db.query(
        """
        SELECT * FROM posts
        WHERE author_id IN %s
        ORDER BY created_at DESC
        LIMIT %s OFFSET %s
        """,
        [tuple(followee_ids), page_size, offset]
    )
    return await enrich_posts(posts)

Tradeoffs:
| Aspect | Detail |
|---|---|
| Read latency | Slow — must query and merge from many sources at read time |
| Write latency | Fast — just store the post, no fan-out |
| Storage | Low — each post stored once |
| Consistency | Immediate — reads always get the latest data |
| Celebrity problem | None — celebrities post like anyone else |
The SQL query above looks simple, but it is a nightmare at scale. If a user follows 500 people, you are doing an IN clause with 500 IDs and sorting millions of rows. This does not scale without heavy caching and indexing.
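What the database does for that query is essentially a k-way merge across per-author post streams, repeated on every feed load. A toy illustration of that read-time work (pure Python, hypothetical in-memory data; each author's posts are kept newest-first):

```python
import heapq
from itertools import islice

# Hypothetical per-author post lists, sorted newest-first: (timestamp, post_id)
posts_by_author = {
    "alice": [(1711612900, "a2"), (1711612800, "a1")],
    "bob":   [(1711612950, "b1")],
    "carol": [(1711612850, "c1"), (1711612700, "c0")],
}

def pull_feed(followee_ids, limit=3):
    """Merge every followee's stream at read time — work grows with followees."""
    merged = heapq.merge(
        *(posts_by_author.get(a, []) for a in followee_ids),
        reverse=True,  # inputs are descending, so the merged output is too
    )
    return [post_id for _, post_id in islice(merged, limit)]

print(pull_feed(["alice", "bob", "carol"]))  # ['b1', 'a2', 'c1']
```

With 500 followees this merge touches hundreds of partitions per page load, which is why the pull model alone cannot hit the 200ms budget.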
Step 3: The Hybrid Approach
Production systems (Facebook, Twitter) use a hybrid: fan-out on write for normal users, fan-out on read for celebrities.
CELEBRITY_THRESHOLD = 100_000 # Followers above this = celebrity
async def publish_post(author_id: str, post: dict):
    post_id = await db.insert("posts", post)
    follower_count = await db.query_one(
        "SELECT COUNT(*) FROM follows WHERE followee_id = %s", [author_id]
    )
    if follower_count < CELEBRITY_THRESHOLD:
        # Normal user: fan-out on write
        await fan_out_to_followers(author_id, post_id, post["created_at"])
    else:
        # Celebrity: do NOT fan out. Store in a "celebrity posts" index.
        await redis.zadd(
            f"celebrity_posts:{author_id}",
            {post_id: post["created_at"]}
        )
        # Keep only the latest 100 celebrity posts
        await redis.zremrangebyrank(f"celebrity_posts:{author_id}", 0, -101)

async def get_feed(user_id: str, page_size: int = 20, cursor: str = None):
    # 1. Get the pre-computed feed (from fan-out on write)
    feed_key = f"feed:{user_id}"
    if cursor:
        precomputed = await redis.zrevrangebyscore(
            feed_key, f"({cursor}", "-inf", start=0, num=page_size,
            withscores=True
        )
    else:
        precomputed = await redis.zrevrange(
            feed_key, 0, page_size - 1, withscores=True
        )

    # 2. Get celebrity posts (fan-out on read)
    celebrity_followees = await get_celebrity_followees(user_id)
    celebrity_posts = []
    for celeb_id in celebrity_followees:
        posts = await redis.zrevrange(
            f"celebrity_posts:{celeb_id}", 0, 20, withscores=True
        )
        celebrity_posts.extend(posts)

    # 3. Merge and sort by score (timestamp or ranking score)
    all_posts = merge_sorted(precomputed, celebrity_posts)
    top_posts = all_posts[:page_size]

    # 4. Fetch full post objects and enrich
    return await enrich_posts(top_posts)

Why This Hybrid Works
- Normal users (99% of authors): Their posts fan out on write. Followers get instant feed reads.
- Celebrities (1% of authors but 30%+ of content consumption): Their posts are pulled at read time and merged into the pre-computed feed. The read-time merge adds a few milliseconds but avoids writing to 50M feeds.
The threshold (100K followers in our example) is tunable. Some systems use a dynamic threshold based on current system load.
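The hybrid read path leans on two helpers the snippet leaves undefined: `get_celebrity_followees` and `merge_sorted`. A minimal pure-data sketch of both (the production version of the first would take a user_id and intersect against a cached celebrity set in Redis; the shapes here are illustrative):

```python
def get_celebrity_followees(followees: set, celebrities: set) -> set:
    """Which of this user's followees are over the celebrity threshold?
    In production: an intersection against a cached set of celebrity IDs."""
    return followees & celebrities

def merge_sorted(*lists):
    """Merge lists of (post_id, score) pairs, highest score first.
    Fine for a few hundred items per request; use heapq.merge for more."""
    combined = [item for lst in lists for item in lst]
    combined.sort(key=lambda pair: pair[1], reverse=True)
    return combined

precomputed = [("p3", 300.0), ("p1", 100.0)]
celebrity = [("c9", 250.0)]
print(merge_sorted(precomputed, celebrity))
# [('p3', 300.0), ('c9', 250.0), ('p1', 100.0)]
```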
Step 4: Feed Storage
Redis Sorted Sets for Pre-Computed Feeds
Redis sorted sets are the natural data structure for feeds:
# Each feed is a sorted set: member = post_id, score = timestamp (or ranking score)
# Add a post to a user's feed
await redis.zadd("feed:user123", {"post_abc": 1711612800.0})
# Get the latest 20 posts (reverse order = newest first)
posts = await redis.zrevrange("feed:user123", 0, 19, withscores=True)
# Get posts older than a cursor (for pagination)
posts = await redis.zrevrangebyscore("feed:user123", "(1711612800.0", "-inf", start=0, num=20)
# Trim feed to keep only the latest 500 entries
await redis.zremrangebyrank("feed:user123", 0, -501)

Memory estimation:
500M users * 500 posts/feed * 16 bytes/entry (8-byte member + 8-byte score)
= 4 TB of Redis memory
With Redis overhead (~2x): ~8 TB
Using a Redis cluster with 64 nodes: ~125 GB per node (feasible)

Post Storage
Posts themselves live in a database, not in the feed. The feed only stores (post_id, score) pairs. When rendering the feed, you fetch the full post objects in a batch query.
-- Posts table (Cassandra or PostgreSQL)
CREATE TABLE posts (
    id BIGINT PRIMARY KEY,
    author_id BIGINT NOT NULL,
    content TEXT,
    media_urls TEXT[],  -- Array of image/video URLs
    created_at TIMESTAMP NOT NULL,
    like_count INT DEFAULT 0,
    comment_count INT DEFAULT 0,
    share_count INT DEFAULT 0
);

-- Social graph (Cassandra — partitioned by follower for "who do I follow" queries)
CREATE TABLE follows (
    follower_id BIGINT,
    followee_id BIGINT,
    created_at TIMESTAMP,
    PRIMARY KEY (follower_id, followee_id)
);

-- Reverse index (partitioned by followee for "who follows me" queries)
CREATE TABLE followers (
    followee_id BIGINT,
    follower_id BIGINT,
    created_at TIMESTAMP,
    PRIMARY KEY (followee_id, follower_id)
);

Two separate tables for the social graph: one partitioned by follower (for reading your feed), one by followee (for fan-out). This denormalization is necessary at scale.
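Keeping the two graph tables consistent means every follow is a dual write. A sketch of the pattern with in-memory stand-ins for the two tables (names and shapes are illustrative; in Cassandra this would be a logged batch so a partial failure does not leave the graph half-updated):

```python
import time
from collections import defaultdict

# Stand-ins for the two denormalized social-graph tables
follows = defaultdict(dict)    # follower_id -> {followee_id: created_at}
followers = defaultdict(dict)  # followee_id -> {follower_id: created_at}

def follow(follower_id: int, followee_id: int) -> None:
    """Write both directions so both access patterns stay one-partition reads."""
    now = time.time()
    follows[follower_id][followee_id] = now
    followers[followee_id][follower_id] = now

def unfollow(follower_id: int, followee_id: int) -> None:
    follows[follower_id].pop(followee_id, None)
    followers[followee_id].pop(follower_id, None)

follow(1, 2)
assert 2 in follows[1] and 1 in followers[2]
```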
Step 5: Ranking
Chronological feeds are simple but produce a poor user experience. A post from your close friend should rank higher than a post from someone you followed three years ago and never interact with. This is where ranking comes in.
Scoring Model
Each candidate post gets a score. The feed is sorted by score, not by timestamp.
def compute_feed_score(post: dict, viewer: dict, interactions: dict) -> float:
    """
    Compute a ranking score for a post relative to a specific viewer.
    Higher score = more relevant = shown higher in feed.
    """
    # 1. Recency decay — exponential decay from post creation time
    age_hours = (time.time() - post["created_at"]) / 3600
    recency_score = math.exp(-0.05 * age_hours)  # Half-life ~14 hours

    # 2. Affinity — how close is the viewer to the author?
    affinity = interactions.get(post["author_id"], {})
    affinity_score = (
        affinity.get("likes_given", 0) * 1.0 +
        affinity.get("comments_given", 0) * 2.0 +
        affinity.get("messages_sent", 0) * 3.0 +
        affinity.get("profile_views", 0) * 0.5
    )
    affinity_score = min(affinity_score / 100.0, 1.0)  # Normalize to 0-1

    # 3. Engagement — how much engagement has this post received?
    engagement_score = (
        post["like_count"] * 1.0 +
        post["comment_count"] * 2.0 +
        post["share_count"] * 3.0
    )
    engagement_score = math.log1p(engagement_score) / 10.0  # Log scale, normalized

    # 4. Content type boost — videos and images rank higher
    type_boost = 1.0
    if post.get("has_video"):
        type_boost = 1.3
    elif post.get("has_image"):
        type_boost = 1.1

    # 5. Diversity penalty — suppress multiple posts from the same author
    diversity_penalty = 1.0  # Applied externally during feed assembly

    # Final score
    score = (
        0.4 * recency_score +
        0.3 * affinity_score +
        0.2 * engagement_score +
        0.1 * type_boost
    ) * diversity_penalty
    return score

Where Ranking Happens
There are two approaches to when ranking is computed:
Rank at write time: When fan-out pushes a post to a follower’s feed, compute a coarse score and use it as the sorted set score. Fast reads, but the score is stale (engagement counts change over time).
Rank at read time: Store posts in chronological order. When the user loads their feed, fetch the top N candidates and re-rank them with fresh data. More accurate, but adds latency.
Production approach: two-pass ranking.
async def get_ranked_feed(user_id: str, page_size: int = 20):
    # Pass 1: Get candidates (broad set, coarsely ranked)
    candidates = await get_feed_candidates(user_id, count=200)

    # Pass 2: Re-rank with fresh signals
    viewer = await get_user_profile(user_id)
    interactions = await get_interaction_history(user_id)

    scored_posts = []
    for post in candidates:
        # Fetch fresh engagement counts
        fresh_counts = await get_engagement_counts(post["id"])
        post.update(fresh_counts)
        score = compute_feed_score(post, viewer, interactions)
        scored_posts.append((score, post))

    # Sort by score, take top N
    scored_posts.sort(key=lambda x: x[0], reverse=True)
    top_posts = [post for _, post in scored_posts[:page_size]]

    # Apply diversity rules (no more than 2 consecutive posts from same author)
    return apply_diversity_rules(top_posts)

ML-Based Ranking
At Facebook/Instagram scale, the scoring function is a machine learning model, not a hand-tuned formula:
- Feature extraction: For each (viewer, post) pair, extract hundreds of features — viewer demographics, post metadata, author-viewer interaction history, time of day, device type.
- Prediction: A neural network predicts the probability of engagement (like, comment, share, click, dwell time).
- Score: The predicted engagement probabilities are combined into a single score via a weighted sum.
- Filtering: Posts that violate content policies or that the user has hidden are removed.
This is computationally expensive. Facebook uses a multi-stage funnel:
All candidate posts (~2000)
  -> Lightweight model filters to ~500
  -> Full model ranks to ~200
  -> Business rules and diversity to ~50
  -> Served to user (first page = 20)

Step 6: Real-Time Feed Updates
When a user is actively viewing their feed and a new post arrives, it should appear without a page refresh.
Approach: Hybrid Push + Pull
# Server: notify connected clients of new feed items
async def notify_feed_update(user_id: str, post_id: str):
    """Called after fan-out writes a new post to a user's feed."""
    # Check if user has an active WebSocket/SSE connection
    if await connection_manager.is_connected(user_id):
        # Push a lightweight notification (not the full post)
        await connection_manager.send(user_id, {
            "type": "new_feed_item",
            "post_id": post_id,
            "message": "New posts available"
        })

// Client: listen for feed updates
const feedSource = new EventSource('/api/feed/stream');

feedSource.addEventListener('new_feed_item', (event) => {
    const data = JSON.parse(event.data);
    // Show "New posts" banner (like Twitter)
    // Do NOT auto-insert — it disrupts scrolling
    showNewPostsBanner(data.count);
});

// When user clicks the banner, fetch and prepend new posts
async function loadNewPosts() {
    const response = await fetch(`/api/feed?since=${lastPostTimestamp}`);
    const newPosts = await response.json();
    prependToFeed(newPosts);
}

An important UX decision: never auto-insert posts while the user is scrolling. It causes content to jump around. Instead, show a “N new posts” banner and let the user choose to load them. Twitter and Facebook both use this pattern.
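On the wire, SSE is a simple text protocol: each message is an optional `event:` line plus `data:` lines, terminated by a blank line. A small helper the server might use to frame its notifications (the helper name is made up for illustration):

```python
import json

def sse_frame(event: str, payload: dict) -> str:
    """Serialize one Server-Sent Events message in the EventSource wire format."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"

frame = sse_frame("new_feed_item", {"post_id": "p42"})
print(frame)
# event: new_feed_item
# data: {"post_id": "p42"}
```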
Step 7: Pagination
Offset-Based Pagination (Bad)
SELECT * FROM feed WHERE user_id = 123 ORDER BY score DESC LIMIT 20 OFFSET 40;

Problems: if new posts are inserted between page loads, you get duplicates or missed posts. Also, OFFSET is O(N) in most databases.
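The duplicate problem is easy to reproduce. Below, a new post arrives between fetching page 1 and page 2, shifting everything down by one slot (a toy in-memory simulation):

```python
feed = ["p9", "p8", "p7", "p6"]  # newest first

def page(offset, limit=2):
    return feed[offset:offset + limit]

page1 = page(0)        # ['p9', 'p8']
feed.insert(0, "p10")  # a new post lands while the user reads page 1
page2 = page(2)        # ['p8', 'p7'] — 'p8' is shown a second time

assert "p8" in page1 and "p8" in page2  # duplicate across pages
```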
Cursor-Based Pagination (Good)
Use the last item’s score (or timestamp) as the cursor:
async def get_feed_page(user_id: str, cursor: float = None, page_size: int = 20):
    feed_key = f"feed:{user_id}"
    if cursor:
        # Get items with score less than the cursor
        items = await redis.zrevrangebyscore(
            feed_key,
            f"({cursor}",  # Exclusive upper bound
            "-inf",
            start=0,
            num=page_size,
            withscores=True
        )
    else:
        items = await redis.zrevrange(
            feed_key, 0, page_size - 1, withscores=True
        )

    posts = await fetch_and_enrich(items)

    # The cursor for the next page is the score of the last item
    next_cursor = items[-1][1] if items else None
    return {
        "posts": posts,
        "next_cursor": next_cursor,
        "has_more": len(items) == page_size
    }

Cursor-based pagination is stable: inserts and deletes do not cause duplicates or gaps.
Step 8: Media Handling
Posts contain images and videos. The feed system does not store media directly — it stores references.
async def create_post_with_media(author_id: str, content: str, media_files: list):
    # 1. Upload media to object storage (S3)
    media_urls = []
    for file in media_files:
        key = f"posts/{author_id}/{uuid4()}/{file.filename}"
        await s3.upload(key, file.content)

        # Generate thumbnail for images (reset per file to avoid stale keys)
        thumb_key = None
        if file.content_type.startswith("image/"):
            thumb_key = f"thumbs/{key}"
            thumbnail = resize_image(file.content, max_width=400)
            await s3.upload(thumb_key, thumbnail)

        media_urls.append({
            "url": f"https://cdn.example.com/{key}",
            "thumbnail": f"https://cdn.example.com/{thumb_key}" if thumb_key else None,
            "type": file.content_type
        })

    # 2. Create the post with media references
    post = {
        "author_id": author_id,
        "content": content,
        "media": media_urls,
        "created_at": time.time()
    }
    return await publish_post(author_id, post)

Key decisions:
- CDN for media delivery. Never serve images from your application servers.
- Lazy loading. Only load images when the post scrolls into the viewport.
- Multiple resolutions. Store thumbnails, medium, and full-size. Serve the appropriate size based on the client’s viewport.
Step 9: Content Moderation
Every feed system needs content moderation to prevent spam, hate speech, and policy violations from appearing in feeds.
async def moderate_post(post: dict) -> dict:
    """Run moderation checks before a post enters the feed pipeline."""
    # 1. Automated checks (fast, synchronous)
    text_result = await ml_classifier.check_text(post["content"])
    if text_result["toxic_score"] > 0.9:
        post["status"] = "blocked"
        await flag_for_review(post, reason="toxic_content")
        return post

    # 2. Image/video check (async — may take seconds)
    if post.get("media"):
        await moderation_queue.enqueue({
            "post_id": post["id"],
            "media_urls": [m["url"] for m in post["media"]],
            "action": "check_media"
        })
        # Post enters feed immediately but can be pulled back
        post["moderation_status"] = "pending"

    # 3. Spam detection
    recent_posts = await get_recent_posts(post["author_id"], minutes=5)
    if len(recent_posts) > 20:  # More than 20 posts in 5 minutes = spam
        post["status"] = "rate_limited"
        return post

    post["status"] = "published"
    return post

Moderation is a tradeoff between speed and accuracy. Blocking a post for 30 seconds while you run deep ML checks creates a poor user experience. Most systems publish the post immediately, run moderation asynchronously, and retroactively remove violating content.
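Retroactive removal means a violating post must be pulled back out of every feed it was fanned out to. A sketch with an in-memory stand-in for the Redis feed store (a production version would pipeline ZREM calls across the followers' sorted sets):

```python
# Stand-in for Redis: feed key -> {post_id: score}
feed_store = {
    "feed:u1": {"p1": 100.0, "p2": 200.0},
    "feed:u2": {"p2": 200.0, "p3": 300.0},
}

def retract_post(post_id: str, follower_ids: list) -> int:
    """Remove a policy-violating post from each follower's feed.
    Returns how many feeds actually contained it."""
    removed = 0
    for follower_id in follower_ids:
        feed = feed_store.get(f"feed:{follower_id}", {})
        if feed.pop(post_id, None) is not None:
            removed += 1
    return removed

print(retract_post("p2", ["u1", "u2", "u3"]))  # 2
```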
The Complete Architecture
                  +------------------+
                  |    Mobile/Web    |
                  |     Clients      |
                  +--------+---------+
                           |
                  +--------+---------+
                  |   API Gateway    |
                  +--------+---------+
                           |
          +----------------+----------------+
          |                |                |
   +------+-----+   +------+-------+  +-----+---------+
   |    Post    |   |     Feed     |  | Notification  |
   |  Service   |   |   Service    |  |   Service     |
   +------+-----+   +------+-------+  +-----+---------+
          |                |                |
   +------+-----+   +------+-------+  +-----+---------+
   |  Fan-Out   |   |  Feed Cache  |  | Push Gateway  |
   |  Workers   |   |   (Redis)    |  |  (APNs/FCM)   |
   +------+-----+   +--------------+  +---------------+
          |
   +------+-----+
   |  Pub/Sub   |
   |  (Kafka)   |
   +------+-----+
          |
          +----------------+---------------+
          |                |               |
   +------+-----+   +------+-------+  +----+--------+
   |  Posts DB  |   | Social Graph |  |  Analytics  |
   |  (Cass.)   |   |   (Cass.)    |  |  (Click-    |
   |            |   |              |  |   House)    |
   +------------+   +--------------+  +-------------+

Write path:
- User creates a post via API Gateway.
- Post Service stores the post in the Posts DB.
- Post Service publishes a “new post” event to Kafka.
- Fan-Out Workers consume the event, check the author’s follower count.
- Below the celebrity threshold: fan-out on write to followers’ Redis feeds.
- Above the threshold: index in the celebrity posts cache.
- Notification Service sends push notifications to users with notifications enabled.
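The branching in steps 4-6 can be isolated into a pure dispatch function that a Kafka consumer would call per event (an illustrative sketch; the command tuples mirror the Redis operations used earlier):

```python
CELEBRITY_THRESHOLD = 100_000

def plan_fan_out(author_id: str, post_id: str, created_at: float,
                 follower_ids: list) -> list:
    """Decide which feed writes a 'new post' event should produce."""
    if len(follower_ids) < CELEBRITY_THRESHOLD:
        # Normal author: one write per follower feed
        return [("zadd", f"feed:{f}", post_id, created_at) for f in follower_ids]
    # Celebrity: a single write to the celebrity index, merged at read time
    return [("zadd", f"celebrity_posts:{author_id}", post_id, created_at)]

ops = plan_fan_out("a1", "p1", 1711612800.0, ["u1", "u2"])
print(ops)  # two feed writes, one per follower
```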
Read path:
- User requests their feed via API Gateway.
- Feed Service reads from the pre-computed Redis feed.
- Feed Service merges in celebrity posts (fan-out on read).
- Feed Service re-ranks the merged candidates.
- Full post objects are fetched from Posts DB (or post cache).
- Enriched feed is returned to the client.
Key Takeaways
- Fan-out on write (push model) pre-computes feeds at write time. Reads are instant, but writes are expensive and the celebrity problem (users with millions of followers) causes massive write amplification.
- Fan-out on read (pull model) computes feeds at read time. Writes are cheap, but reads are slow because you must query and merge from many sources.
- The hybrid approach is what production systems use. Fan-out on write for normal users (99%), fan-out on read for celebrities (1%). The threshold is tunable based on system load.
- Redis sorted sets are the standard data structure for pre-computed feeds. Score by timestamp for chronological feeds, or by ranking score for algorithmic feeds.
- Two-pass ranking is the practical approach: coarse ranking at write time, re-ranking with fresh signals at read time. At scale, the ranking function is an ML model.
- Cursor-based pagination is mandatory. Offset pagination breaks when the feed is being updated in real time.
- Real-time updates use a “new posts available” banner pattern, not auto-insertion. Push a lightweight notification via WebSocket/SSE; let the user choose to load.
- The social graph needs two tables: one partitioned by follower (for reading feeds), one by followee (for fan-out). Denormalization is the price of performance.
- Content moderation is async in practice. Publish first, moderate second, retract if needed. Blocking publication for moderation adds unacceptable latency.
- Media is stored in object storage (S3) and served via CDN. The feed only stores references. Generate multiple resolutions for responsive delivery.
