The URL shortener is the “Hello World” of system design interviews. It seems simple — take a long URL, return a short one — but it touches on encoding algorithms, distributed ID generation, caching strategy, database design, and analytics pipelines. It is a great problem because every component has meaningful tradeoffs.
Let us design a URL shortener from scratch: define requirements, estimate scale, choose encoding strategies, design the data model, and architect the system for millions of URLs and billions of redirects.
Step 1: Requirements
Functional Requirements
- Shorten: Given a long URL, generate a unique short URL.
- Redirect: Given a short URL, redirect to the original long URL.
- Custom aliases: Users can optionally specify a custom short code.
- Expiration: URLs can optionally have an expiration time.
- Analytics: Track click counts, referrers, geolocation, and timestamps.
Non-Functional Requirements
- Low latency: Redirects must complete in under 10ms (p99).
- High availability: The redirect path cannot go down. 99.99% uptime.
- Scale: Handle 100M URLs created per month, 10B redirects per month.
- Durability: Once created, a URL mapping must not be lost.
Out of Scope (for now)
- User accounts and authentication
- Rate limiting (covered in a separate lesson)
- Spam/phishing URL detection
Step 2: Back-of-Envelope Estimation
URL creation:
100M new URLs per month
= ~40 URLs per second (average)
= ~400 URLs per second (peak, assuming 10x)
Redirects:
10B redirects per month (100:1 read-to-write ratio)
= ~4,000 redirects per second (average)
= ~40,000 redirects per second (peak)
Storage:
Each URL mapping: ~500 bytes (short code + long URL + metadata)
100M URLs/month * 12 months * 5 years = 6B URLs
6B * 500 bytes = 3 TB total storage
Bandwidth:
40K redirects/sec * 500 bytes = 20 MB/s (modest)
Cache:
80/20 rule: 20% of URLs get 80% of traffic
Hot URL set: 6B * 20% * 500 bytes = 600 GB (fits in distributed Redis)The system is heavily read-dominant. The write path (create URL) is 100x less frequent than the read path (redirect). This shapes every design decision.
Step 3: Encoding Strategies
The core problem: how do you generate a unique short code for each URL?
Approach 1: Base62 Encoding of a Counter
The most straightforward approach. Maintain a global counter. For each new URL, increment the counter and encode the number in base62.
import string
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET) # 62
def encode_base62(num: int) -> str:
    """Convert integer to base62 string."""
    if num == 0:
        return ALPHABET[0]
    result = []
    while num > 0:
        result.append(ALPHABET[num % BASE])
        num //= BASE
    return ''.join(reversed(result))

def decode_base62(s: str) -> int:
    """Convert base62 string back to integer."""
    num = 0
    for char in s:
        num = num * BASE + ALPHABET.index(char)
    return num

# Examples:
# encode_base62(1) -> "1"
# encode_base62(1000000) -> "4c92"
# encode_base62(56800235583) -> "ZZZZZZ" (the largest 6-char code, 62^6 - 1)
Code length analysis:
| Characters | Possible codes | Coverage |
|---|---|---|
| 6 | 62^6 = 56.8 billion | Enough for decades |
| 7 | 62^7 = 3.5 trillion | Overkill for most systems |
A 7-character code gives us 3.5 trillion unique URLs. At 100M per month, that is 35,000 months (almost 3,000 years).
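The capacity math and the encoder roundtrip are easy to verify. A self-contained sanity check (redefining the encoder locally so the snippet stands alone):

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = 62

def encode_base62(num: int) -> str:
    if num == 0:
        return ALPHABET[0]
    out = []
    while num > 0:
        out.append(ALPHABET[num % BASE])
        num //= BASE
    return ''.join(reversed(out))

def decode_base62(s: str) -> int:
    num = 0
    for ch in s:
        num = num * BASE + ALPHABET.index(ch)
    return num

# 7 characters cover 62**7 ≈ 3.52 trillion codes
months_of_capacity = 62**7 // 100_000_000  # at 100M URLs/month -> 35,216 months

# Roundtrip holds for arbitrary counter values
for n in (0, 1, 61, 62, 1_000_000, 10**12):
    assert decode_base62(encode_base62(n)) == n
```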
How to maintain the counter in a distributed system:
Option A — Database auto-increment: Use a single MySQL/Postgres instance as the ID generator. Simple, but it is a single point of failure.
Option B — Range-based allocation: Each application server reserves a range of IDs (e.g., Server 1 gets 1-10000, Server 2 gets 10001-20000). A coordination service (ZooKeeper, etcd) manages range assignments.
class RangeCounter:
    """Each server gets a pre-allocated range of IDs."""
    def __init__(self, range_size=10000):
        self.range_size = range_size
        self.current = 0
        self.max = 0

    async def next_id(self) -> int:
        if self.current >= self.max:
            # Request new range from coordination service
            start = await self._allocate_range()
            self.current = start
            self.max = start + self.range_size
        self.current += 1
        return self.current - 1

    async def _allocate_range(self) -> int:
        # Atomic increment in ZooKeeper/etcd/Redis
        # Returns the start of the new range
        return await redis.incrby("url_counter", self.range_size)
Option C — Snowflake IDs: Use Twitter’s Snowflake algorithm to generate 64-bit unique IDs that are roughly time-ordered. Each worker generates IDs independently without coordination.
import time
class SnowflakeGenerator:
    """
    64-bit ID:
    1 bit unused | 41 bits timestamp | 10 bits machine ID | 12 bits sequence
    """
    EPOCH = 1609459200000  # Custom epoch (2021-01-01)

    def __init__(self, machine_id: int):
        self.machine_id = machine_id & 0x3FF  # 10 bits
        self.sequence = 0
        self.last_timestamp = -1

    def next_id(self) -> int:
        # Note: a production implementation must also handle the clock
        # moving backwards; this sketch assumes a monotonic clock.
        timestamp = int(time.time() * 1000) - self.EPOCH
        if timestamp == self.last_timestamp:
            self.sequence = (self.sequence + 1) & 0xFFF  # 12 bits
            if self.sequence == 0:
                # Wait for next millisecond
                while timestamp == self.last_timestamp:
                    timestamp = int(time.time() * 1000) - self.EPOCH
        else:
            self.sequence = 0
        self.last_timestamp = timestamp
        return (timestamp << 22) | (self.machine_id << 12) | self.sequence
The base62 encoding of a Snowflake ID will be longer (10-11 chars) because the IDs are 64-bit. If you want short codes (6-7 chars), use the range-based counter approach.
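To see why Snowflake IDs yield longer codes, and how the bit fields unpack, here is a small sketch (`unpack_snowflake` is my own illustrative helper, not part of the algorithm above):

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def encode_base62(num: int) -> str:
    out = []
    while num > 0:
        out.append(ALPHABET[num % 62])
        num //= 62
    return ''.join(reversed(out)) or ALPHABET[0]

def unpack_snowflake(snowflake_id: int) -> dict:
    """Reverse the layout: 41-bit timestamp | 10-bit machine ID | 12-bit sequence."""
    return {
        "timestamp_ms": snowflake_id >> 22,
        "machine_id": (snowflake_id >> 12) & 0x3FF,
        "sequence": snowflake_id & 0xFFF,
    }

# An ID roughly 3 years after the custom epoch, from machine 42, sequence 7
example_id = (100_000_000_000 << 22) | (42 << 12) | 7
fields = unpack_snowflake(example_id)

# Base62 of a 64-bit Snowflake ID needs 10+ characters
assert len(encode_base62(example_id)) >= 10
```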
Approach 2: Hash-Based (MD5/SHA-256)
Hash the long URL and take the first N characters.
import hashlib
def hash_url(long_url: str, length: int = 7) -> str:
    """Generate short code by hashing the URL."""
    hash_hex = hashlib.md5(long_url.encode()).hexdigest()
    # Convert hex to base62
    hash_int = int(hash_hex, 16)
    code = encode_base62(hash_int)[:length]
    return code
The collision problem: With 7-character codes, different URLs can produce the same short code. You must handle this:
async def create_short_url(long_url: str) -> str:
    code = hash_url(long_url)
    # Check for collision
    attempts = 0
    while await db.exists("urls", code):
        existing = await db.get("urls", code)
        if existing["long_url"] == long_url:
            return code  # Same URL, return existing code
        # Collision! Append a suffix and re-hash
        attempts += 1
        code = hash_url(long_url + str(attempts))
    await db.insert("urls", {"code": code, "long_url": long_url})
    return code
Hash-based approaches give you the nice property that the same long URL always gets the same short code (deduplication). But collision handling adds complexity and requires a read-before-write pattern that is harder to scale.
Approach 3: Pre-Generated Key Service
Generate all possible short codes in advance. Store them in a database. When a new URL needs a code, pop one from the pool.
import random

class KeyService:
    """Pre-generates keys and serves them from a pool."""
    def __init__(self):
        self.used_keys_table = "used_keys"
        self.available_keys_table = "available_keys"

    async def get_key(self) -> str:
        # Atomic pop from available keys
        key = await redis.spop("available_keys")
        if key is None:
            raise Exception("Key pool exhausted — generate more keys")
        await redis.sadd("used_keys", key)
        return key.decode()

    async def generate_keys(self, count: int):
        """Bulk-generate random base62 keys."""
        keys = set()
        while len(keys) < count:
            code = ''.join(random.choices(ALPHABET, k=7))
            keys.add(code)
        # Add to available pool, skipping any that are already used
        # (decode because the Redis client returns bytes)
        used = {k.decode() for k in await redis.smembers("used_keys")}
        new_keys = keys - used
        if new_keys:
            await redis.sadd("available_keys", *new_keys)
This eliminates collision handling entirely and makes URL creation a simple key lookup. The downside is that you need a separate service managing the key pool, and you must monitor pool exhaustion.
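The Redis-backed service above can be prototyped entirely in memory, which makes the pop-from-pool mechanics easy to test. A minimal sketch (plain sets stand in for the Redis keys; `InMemoryKeyPool` is my own name):

```python
import random
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

class InMemoryKeyPool:
    """In-memory stand-in for the Redis-backed key service."""
    def __init__(self):
        self.available = set()
        self.used = set()

    def generate_keys(self, count: int):
        # Keep generating until the pool holds `count` unused keys;
        # duplicates are absorbed by the set semantics
        while len(self.available) < count:
            code = ''.join(random.choices(ALPHABET, k=7))
            if code not in self.used:
                self.available.add(code)

    def get_key(self) -> str:
        if not self.available:
            raise RuntimeError("Key pool exhausted")
        key = self.available.pop()  # like Redis SPOP: removes an arbitrary member
        self.used.add(key)
        return key

pool = InMemoryKeyPool()
pool.generate_keys(1000)
keys = {pool.get_key() for _ in range(1000)}
assert len(keys) == 1000 and all(len(k) == 7 for k in keys)
```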
Which Approach to Use?
| Approach | Pros | Cons | Best For |
|---|---|---|---|
| Base62 counter | Simple, no collisions, predictable | Requires coordination for distributed counter | Most systems |
| Hash-based | Same URL gets same code (dedup) | Collisions, read-before-write | Systems needing URL dedup |
| Pre-generated keys | No collision, fast creation | Extra infrastructure, pool management | Very high write throughput |
Recommendation: Start with the range-based counter + base62 encoding. It is simple, collision-free, and scales well.
Step 4: Database Design
Schema
CREATE TABLE urls (
    code        VARCHAR(7) PRIMARY KEY,
    long_url    TEXT NOT NULL,
    created_at  TIMESTAMP DEFAULT NOW(),
    expires_at  TIMESTAMP NULL,
    user_id     BIGINT NULL,
    click_count BIGINT DEFAULT 0
);

-- Index for looking up by long URL (dedup)
CREATE INDEX idx_long_url ON urls (long_url);

-- Index for expiration cleanup
CREATE INDEX idx_expires ON urls (expires_at) WHERE expires_at IS NOT NULL;
Database Choice
The URL table has a very simple access pattern:
- Write: Insert a row (code -> long_url)
- Read: Lookup by code (primary key lookup)
This is a key-value workload. Any database works, but some choices are better than others:
- DynamoDB / Cassandra: Natural key-value stores. Partition by code. Excellent for this workload.
- PostgreSQL / MySQL: Works fine at moderate scale. Use read replicas for the heavy read path.
- Redis (as primary store): If your dataset fits in memory (~600 GB for 6B URLs at 100 bytes each, not counting long URLs). Extremely fast, but expensive and risky for durability.
For a production URL shortener, DynamoDB is the pragmatic choice: single-digit millisecond reads, automatic scaling, and no operational overhead.
Step 5: Read Path (Redirect)
The redirect path is the hot path. It must be fast and reliable.
@app.get("/{code}")
async def redirect(code: str, request: Request):
    # 1. Check cache first
    long_url = await redis_cache.get(f"url:{code}")
    if long_url is None:
        # 2. Cache miss — query database
        row = await db.get("urls", code)
        if row is None:
            raise HTTPException(status_code=404, detail="URL not found")
        if row["expires_at"] and row["expires_at"] < datetime.utcnow():
            raise HTTPException(status_code=410, detail="URL expired")
        long_url = row["long_url"]
        # 3. Populate cache (TTL of 24 hours)
        await redis_cache.setex(f"url:{code}", 86400, long_url)
    # 4. Log the click asynchronously (do not block the redirect)
    asyncio.create_task(log_click(code, request))
    # 5. Redirect
    return RedirectResponse(url=long_url, status_code=302)
301 vs 302 Redirect
This is a critical design decision:
301 (Moved Permanently): The browser caches the redirect. Subsequent visits to the short URL go directly to the long URL without hitting your server. Faster for users, but you lose analytics visibility.
302 (Found / Temporary): The browser does NOT cache the redirect. Every visit hits your server. Slightly slower, but you can track every click.
Most URL shorteners use 302 because analytics is a core feature. If you do not need analytics, 301 reduces server load dramatically.
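The decision reduces to a tiny policy helper. A hedged sketch (my own function; the header choices are one reasonable convention, not a fixed standard):

```python
def redirect_policy(analytics_enabled: bool) -> tuple:
    """Pick a redirect status and caching headers based on whether click
    analytics matter more than offloading repeat traffic to browsers."""
    if analytics_enabled:
        # 302: browsers keep coming back, so every click is observable
        return 302, {"Cache-Control": "no-store"}
    # 301: browsers cache the mapping and stop hitting the server
    return 301, {"Cache-Control": "public, max-age=86400"}

status, headers = redirect_policy(analytics_enabled=True)
```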
Caching Strategy
The 80/20 rule applies strongly: a small fraction of URLs get most of the traffic.
# Multi-tier caching approach
class URLCache:
    def __init__(self):
        # L1: In-process LRU cache (1M entries, ~100MB)
        self.local_cache = LRUCache(maxsize=1_000_000)
        # L2: Redis cluster (distributed, shared across servers)
        self.redis = redis_cluster

    async def get(self, code: str) -> Optional[str]:
        # Check L1
        url = self.local_cache.get(code)
        if url:
            return url
        # Check L2
        url = await self.redis.get(f"url:{code}")
        if url:
            self.local_cache.put(code, url)
            return url
        return None

    async def put(self, code: str, long_url: str):
        self.local_cache.put(code, long_url)
        await self.redis.setex(f"url:{code}", 86400, long_url)
With this setup, the vast majority of redirects are served from the in-process cache (sub-millisecond) without even hitting Redis.
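The `LRUCache` referenced above is not a standard-library class. A minimal implementation on top of `OrderedDict`, assuming the simple `get`/`put` interface used by `URLCache`:

```python
from collections import OrderedDict
from typing import Optional

class LRUCache:
    """Fixed-size LRU map: reads refresh recency, inserts evict the oldest."""
    def __init__(self, maxsize: int = 1_000_000):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key: str, value: str):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # evict least recently used

cache = LRUCache(maxsize=2)
cache.put("a", "1")
cache.put("b", "2")
cache.get("a")        # refresh "a"
cache.put("c", "3")   # evicts "b", the least recently used entry
assert cache.get("b") is None and cache.get("a") == "1"
```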
CDN-Level Caching
For extremely hot URLs (viral content), put a CDN in front of the redirect service:
User -> CDN (Cloudflare/CloudFront) -> Load Balancer -> App Servers -> Cache -> DB
Configure the CDN to cache 302 redirects for a short TTL (e.g., 60 seconds). This absorbs traffic spikes without touching your infrastructure. The tradeoff: analytics granularity drops to 1-minute resolution for cached URLs.
Step 6: Write Path (Create URL)
@app.post("/api/shorten")
async def shorten(request: ShortenRequest):
    long_url = request.url
    custom_alias = request.custom_alias  # Optional
    expires_in = request.expires_in      # Optional (seconds)

    # Validate URL
    if not is_valid_url(long_url):
        raise HTTPException(status_code=400, detail="Invalid URL")

    # Handle custom alias
    if custom_alias:
        if not is_valid_alias(custom_alias):
            raise HTTPException(status_code=400, detail="Invalid alias")
        if await db.exists("urls", custom_alias):
            raise HTTPException(status_code=409, detail="Alias already taken")
        code = custom_alias
    else:
        # Generate code via counter + base62
        counter_value = await counter.next_id()
        code = encode_base62(counter_value)

    # Calculate expiration
    expires_at = None
    if expires_in:
        expires_at = datetime.utcnow() + timedelta(seconds=expires_in)

    # Store
    url_record = {
        "code": code,
        "long_url": long_url,
        "created_at": datetime.utcnow(),
        "expires_at": expires_at,
        "user_id": request.user_id,
        "click_count": 0
    }
    await db.insert("urls", url_record)

    # Pre-populate cache
    await cache.put(code, long_url)

    short_url = f"https://short.io/{code}"
    return {"short_url": short_url, "code": code, "expires_at": expires_at}
Handling Duplicate URLs
Should the same long URL always get the same short code? It depends:
- bit.ly approach: Every creation generates a new short code, even for the same URL. Simpler, and different users can have different analytics for the same destination.
- TinyURL approach: Same URL returns the same short code. Requires a lookup by long URL before creation (add an index on long_url).
If you need dedup, add a lookup step:
# Check if this URL was already shortened
existing = await db.query_one(
    "SELECT code FROM urls WHERE long_url = %s", [long_url]
)
if existing:
    return {"short_url": f"https://short.io/{existing['code']}"}
Step 7: Analytics
Tracking clicks is a core feature, but it must not slow down the redirect path. Use async event logging.
async def log_click(code: str, request: Request):
    """Fire-and-forget click logging."""
    event = {
        "code": code,
        "timestamp": time.time(),
        "ip": request.client.host,
        "user_agent": request.headers.get("user-agent"),
        "referer": request.headers.get("referer"),
        "country": geoip_lookup(request.client.host),
    }
    # Send to Kafka for async processing
    await kafka_producer.send("click-events", json.dumps(event).encode())
    # Increment counter in Redis (for real-time display)
    await redis.incr(f"clicks:{code}")
The analytics pipeline:
Click Event -> Kafka -> Stream Processor (Flink/Spark) -> Analytics DB (ClickHouse)
                                      -> Update click_count in primary DB (batched)
ClickHouse is excellent for analytics queries like “clicks per day for URL X” or “top URLs by country.” It handles billions of rows with sub-second query times for aggregation workloads.
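A representative ClickHouse query, assuming the stream processor lands events in a click_events table (the table and column names here are illustrative):

```sql
-- Clicks per day for one short code, last 30 days
SELECT toDate(timestamp) AS day, count() AS clicks
FROM click_events
WHERE code = 'abc123'
  AND timestamp >= now() - INTERVAL 30 DAY
GROUP BY day
ORDER BY day;
```

Columnar storage means this scans only the timestamp and code columns, which is why such aggregations stay sub-second even over billions of rows.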
Step 8: Scaling the System
Database Sharding
At billions of URLs, a single database instance won’t cut it. Shard by the short code:
import zlib

def get_shard(code: str, num_shards: int = 16) -> int:
    """Consistent shard assignment based on short code.

    Use a stable hash: Python's built-in hash() is salted per process,
    so it would route the same code to different shards on different servers.
    """
    return zlib.crc32(code.encode()) % num_shards
Since the primary access pattern is lookup by code, sharding by code means each read hits exactly one shard. No scatter-gather queries needed.
Read Replicas
The redirect path is 100x more frequent than creation. Add read replicas to each shard:
Write path: App Server -> Primary DB (shard N)
Read path:  App Server -> Cache -> Read Replica (shard N)
Replication lag is acceptable here — a new URL being unresolvable for 100ms after creation is not a problem.
Rate Limiting
Without rate limiting, someone can exhaust your key space or use your service as a spam distribution platform:
# Per-IP rate limiting for URL creation.
# (rate_limit is an illustrative decorator; the route decorator goes on top
# so that FastAPI registers the rate-limited handler, not the bare one.)
@app.post("/api/shorten")
@rate_limit(key="ip:{request.client.host}", limit=100, window=3600)
async def shorten(request: ShortenRequest):
    ...
URL Expiration Cleanup
Expired URLs consume storage. Run a periodic cleanup job:
# Cron job — runs every hour
async def cleanup_expired_urls():
    while True:
        # Delete in batches (MySQL-style LIMIT; Postgres needs a ctid subquery)
        deleted = await db.execute(
            "DELETE FROM urls WHERE expires_at < NOW() LIMIT 10000"
        )
        if deleted < 10000:
            break  # No more expired URLs
        await asyncio.sleep(0.1)  # Avoid overloading the DB
The Complete Architecture
+---------+
| CDN |
+----+----+
|
+----+----+
| LB |
+----+----+
/ | \
+--------+ +--------+ +--------+
| App S1 | | App S2 | | App S3 |
+---+----+ +---+----+ +---+----+
| | |
+----+----------+----------+----+
| Redis Cache |
+-------------------------------+
| | |
+----+----+ +--+-----+ +-+------+
| DB Sh 0 | | DB Sh 1| | DB Sh N|
| Primary | | Primary| | Primary|
| + Read | | + Read | | + Read |
| Replica | | Replica| | Replica|
+---------+ +--------+ +--------+
|
+----+----+
| Kafka | <-- click events
+----+----+
|
+----+------+
| ClickHouse| <-- analytics queries
     +-----------+
Key Takeaways
- Start with requirements and estimation. The numbers drive the design. A URL shortener is 100x read-heavy, which means caching and read replicas matter far more than write optimization.
- Base62 encoding of a counter is the simplest and most reliable encoding strategy. Use range-based allocation for distributed counter generation.
- Hash-based encoding adds collision handling complexity but provides natural deduplication. Choose based on whether you need dedup.
- 302 redirects give you analytics visibility. 301 redirects reduce server load. Most shorteners choose 302.
- Multi-tier caching (in-process LRU + Redis + CDN) makes the redirect path extremely fast. The vast majority of redirects should be served without hitting the database.
- Analytics must be async. Never block the redirect path to log a click. Use Kafka or a similar queue for fire-and-forget event logging, and aggregate in a columnar database like ClickHouse.
- Shard by the short code because the primary access pattern is key lookup. Each read hits exactly one shard.
- Custom aliases add complexity (uniqueness checking, different length constraints) but are a common product requirement. Handle them as a special case in the write path.
- Expiration needs a background cleanup job. Do not check expiration only at read time — you also need to reclaim storage.
- The URL shortener interview question tests breadth. It touches encoding, distributed systems, caching, databases, and analytics. Walk through each layer methodically.
