Back-of-the-envelope calculation for system design

In system design interviews and real-world architecture planning, the ability to make quick, reasonably accurate estimations—often called “back-of-the-envelope” calculations—is an invaluable skill. These calculations help you determine whether a proposed design is feasible, identify potential bottlenecks, and make informed architectural decisions without getting lost in implementation details.

Why Back-of-the-Envelope Calculations Matter

Back-of-the-envelope calculations serve several critical purposes in system design:

  1. Feasibility Assessment: Quickly determine if a proposed solution is viable before investing in detailed design
  2. Resource Planning: Estimate hardware, network, and storage requirements
  3. Bottleneck Identification: Discover potential performance limitations early
  4. Cost Estimation: Approximate infrastructure and operational costs
  5. Capacity Planning: Ensure the system can handle expected load and growth

Essential Numbers to Memorize

To perform back-of-the-envelope calculations effectively, you should memorize certain key values. Here are the most important ones:

Time-Related Constants

Storage and Data Transfer

Common Web Application Numbers

The Art of Estimation: A Step-by-Step Approach

1. Define the Problem Clearly

Begin by understanding what you’re estimating. Is it storage requirements, QPS (queries per second), bandwidth, or something else?

2. Clarify Requirements and Constraints

Identify key metrics like:

  • Number of users (daily/monthly active)
  • Expected request rates
  • Data storage needs
  • Latency requirements
  • Availability expectations

3. Make Reasonable Assumptions

Document your assumptions clearly. For example:

  • Average user session duration
  • Read-to-write ratio
  • Data growth rate
  • Peak-to-average traffic ratio

4. Break Down the Problem

Divide complex estimations into smaller, manageable calculations.

5. Apply Power-of-Ten Approximations

Round numbers to simplify calculations (e.g., use 10^6 instead of 1,234,567).
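
As a rough illustration of this rounding step, the sketch below shows two ways to simplify a number before doing mental arithmetic. The helper names `round_to_power_of_ten` and `round_to_one_significant_digit` are hypothetical and chosen only for this example.

```python
import math


def round_to_power_of_ten(value: float) -> int:
    """Round a positive number to the nearest power of ten on a log scale."""
    if value <= 0:
        raise ValueError("value must be positive")
    exponent = round(math.log10(value))
    return 10 ** exponent


def round_to_one_significant_digit(value: float) -> float:
    """Keep only the leading digit, e.g. 86,400 becomes 90,000."""
    if value <= 0:
        raise ValueError("value must be positive")
    exponent = math.floor(math.log10(value))
    return round(value / 10 ** exponent) * 10 ** exponent


if __name__ == "__main__":
    print(round_to_power_of_ten(1_234_567))           # 1000000
    print(round_to_one_significant_digit(1_234_567))  # 1000000
    print(round_to_one_significant_digit(86_400))     # 90000
```

Rounding to a single significant digit usually keeps the error within a factor that does not change an architectural decision, while rounding to a bare power of ten is coarser and best kept for order-of-magnitude checks.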

6. Verify with Sanity Checks

Double-check results against known benchmarks or real-world examples.

7. Identify Potential Bottlenecks

Use your calculations to spot system limitations.

Example 1: URL Shortener System

Let’s estimate storage and QPS requirements for a URL shortener service like bit.ly.

Step 1: Define Requirements

We need to estimate:

  • Storage requirements for URLs
  • QPS the system must handle
  • Bandwidth requirements

Step 2: Make Assumptions

  1. Traffic assumptions:
    • 100 million URLs shortened per month
    • Read-to-write ratio: 10:1 (10 URL accesses for each new shortened URL)
  2. Storage assumptions:
    • Original URL average length: 100 characters (100 bytes)
    • Shortened URL length: 7 characters (7 bytes)
    • Metadata per URL: 50 bytes (timestamps, user ID, etc.)

Step 3: Calculate QPS

New shortened URLs per month: 100 million

  • Per day: 100 million / 30 ≈ 3.33 million
  • Per second: 3.33 million / 86,400 ≈ 39 URLs/second

URL redirections (reads) per second: 39 × 10 = 390 URLs/second

Total QPS: 39 + 390 = 429 QPS (writes + reads)

Peak QPS (assuming 2× peak factor): 429 × 2 ≈ 858 QPS
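
The QPS arithmetic above can be reproduced with a short script. This is a minimal sketch that uses only the assumptions stated in Step 2; the variable and function names are illustrative, and the 30-day month mirrors the rounding used in the text.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_per_second(events_per_month: float, days_per_month: int = 30) -> float:
    """Convert a monthly event count into an average per-second rate."""
    return events_per_month / days_per_month / SECONDS_PER_DAY


# Assumptions taken from Step 2 of the example.
writes_per_month = 100_000_000
read_to_write_ratio = 10
peak_factor = 2

write_qps = round(average_per_second(writes_per_month))  # 38.58 rounds to 39
read_qps = write_qps * read_to_write_ratio
total_qps = write_qps + read_qps
peak_qps = total_qps * peak_factor

if __name__ == "__main__":
    print(f"writes/s: {write_qps}")  # 39
    print(f"reads/s:  {read_qps}")   # 390
    print(f"total:    {total_qps}")  # 429
    print(f"peak:     {peak_qps}")   # 858
```

Keeping the inputs as named variables makes it easy to see how sensitive the result is to a single assumption, such as the read-to-write ratio.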

Step 4: Calculate Storage Requirements

Storage per URL: 100 bytes (original URL) + 7 bytes (shortened URL) + 50 bytes (metadata) = 157 bytes

Monthly storage growth: 100 million URLs × 157 bytes ≈ 15.7 GB/month

5-year storage requirement: 15.7 GB × 12 months × 5 years ≈ 942 GB
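
The same storage figures can be checked in a few lines. This sketch assumes decimal units (1 GB = 10^9 bytes), as the text does, and ignores indexes, replication, and database overhead; all names are illustrative.

```python
BYTES_PER_GB = 10**9  # decimal gigabytes, matching the estimate in the text

# Per-record sizes from the storage assumptions (bytes).
original_url_bytes = 100
short_url_bytes = 7
metadata_bytes = 50

urls_per_month = 100_000_000
retention_months = 12 * 5  # five years

bytes_per_url = original_url_bytes + short_url_bytes + metadata_bytes
monthly_bytes = urls_per_month * bytes_per_url
five_year_bytes = monthly_bytes * retention_months

if __name__ == "__main__":
    print(f"per URL:   {bytes_per_url} bytes")                   # 157 bytes
    print(f"per month: {monthly_bytes / BYTES_PER_GB:.1f} GB")   # 15.7 GB
    print(f"5 years:   {five_year_bytes / BYTES_PER_GB:.0f} GB") # 942 GB
```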

[Figure: URL Shortener Estimation]

Step 5: System Implications

Based on our calculations:

  • QPS requirement (858 at peak) can be handled by a single modern application server
  • Storage requirement (~1 TB) is modest and can be handled by a single database server
  • We should consider sharding if we expect significant growth beyond our estimates

Example 2: Video Streaming Platform

Let’s estimate storage and bandwidth requirements for a YouTube-like video streaming service.

Step 1: Define Requirements

We need to estimate:

  • Daily storage requirements for new videos
  • Bandwidth requirements for streaming

Step 2: Make Assumptions

  1. Traffic assumptions:
    • 100 million daily active users (DAU)
    • 5% of users upload a video daily
    • Average user watches 10 videos per day
    • Average watch time: 70% of video length
  2. Video assumptions:
    • Average video length: 5 minutes
    • Average video size: 50 MB per minute for each stored resolution
    • We store 3 resolutions of each video (HD, medium, low)

Step 3: Calculate Storage Requirements

Number of videos uploaded daily: 100 million × 0.05 = 5 million videos

Storage per video (all resolutions): 5 minutes × 50 MB/minute × 3 resolutions = 750 MB per video

Daily storage required: 5 million × 750 MB = 3,750 TB (3.75 PB) per day

Annual storage growth: 3.75 PB × 365 = 1,368.75 PB (1.37 EB) per year
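
The storage estimate can be verified with integer arithmetic. This is a sketch under the assumptions in Step 2, treating 50 MB per minute as the size of each stored resolution and using decimal units; the variable names are illustrative.

```python
MB = 10**6
TB = 10**12
PB = 10**15
EB = 10**18

# Assumptions from Step 2.
dau = 100_000_000
upload_percent = 5             # share of users uploading one video per day
video_length_minutes = 5
bytes_per_minute = 50 * MB     # per stored resolution
resolutions = 3

uploads_per_day = dau * upload_percent // 100
bytes_per_video = video_length_minutes * bytes_per_minute * resolutions
daily_bytes = uploads_per_day * bytes_per_video
yearly_bytes = daily_bytes * 365

if __name__ == "__main__":
    print(f"uploads/day: {uploads_per_day:,}")             # 5,000,000
    print(f"per video:   {bytes_per_video / MB:.0f} MB")   # 750 MB
    print(f"per day:     {daily_bytes / PB:.2f} PB")       # 3.75 PB
    print(f"per year:    {yearly_bytes / EB:.2f} EB")      # 1.37 EB
```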

Step 4: Calculate Bandwidth Requirements

Views per day: 100 million users × 10 videos = 1 billion video views

Average data per view: 5 minutes × 70% × 3 MB/minute = 10.5 MB per view (assuming average streaming quality is about 3 MB/minute)

Daily data transfer: 1 billion × 10.5 MB = 10.5 PB per day (about 120 GB/second when averaged over 86,400 seconds)

Peak concurrent users: 10% of DAU = 10 million users

Peak bandwidth (assuming every concurrent viewer streams at the 3 MB/minute average): 10 million × 3 MB/minute = 30 TB/minute = 500 GB/second (about 4 Tbps)
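
The bandwidth numbers follow the same pattern and can be sketched as below. The inputs are the assumptions from Step 2 plus the 3 MB/minute streaming rate and 10% peak concurrency used in this step; the average per-second rate and the bits-per-second conversion are derived here for convenience, and the names are illustrative.

```python
MB = 10**6
GB = 10**9
TB = 10**12
PB = 10**15
SECONDS_PER_DAY = 86_400

# Assumptions from the example.
dau = 100_000_000
videos_watched_per_user = 10
video_length_minutes = 5
watch_percent = 70                   # share of each video actually watched
streaming_bytes_per_minute = 3 * MB  # assumed average streaming quality
peak_concurrent_percent = 10         # share of DAU streaming at the same time

views_per_day = dau * videos_watched_per_user
bytes_per_view = (
    video_length_minutes * streaming_bytes_per_minute * watch_percent // 100
)
daily_transfer_bytes = views_per_day * bytes_per_view
average_bytes_per_second = daily_transfer_bytes / SECONDS_PER_DAY

peak_concurrent_viewers = dau * peak_concurrent_percent // 100
peak_bytes_per_minute = peak_concurrent_viewers * streaming_bytes_per_minute
peak_bytes_per_second = peak_bytes_per_minute // 60
peak_bits_per_second = peak_bytes_per_second * 8

if __name__ == "__main__":
    print(f"per view:   {bytes_per_view / MB:.1f} MB")               # 10.5 MB
    print(f"per day:    {daily_transfer_bytes / PB:.1f} PB")         # 10.5 PB
    print(f"average:    {average_bytes_per_second / GB:.1f} GB/s")   # 121.5 GB/s
    print(f"peak:       {peak_bytes_per_second / GB:.0f} GB/s")      # 500 GB/s
    print(f"peak (bit): {peak_bits_per_second / 10**12:.0f} Tbps")   # 4 Tbps
```

Comparing the average rate with the peak rate gives an implied peak-to-average ratio of roughly 4, which is a useful sanity check on the concurrency assumption.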

[Figure: Video Platform Requirements]

Step 5: System Implications

Based on our calculations:

  • Storage requirements are massive, requiring a distributed storage solution
  • Bandwidth requirements demand a global CDN infrastructure
  • Cost considerations suggest the need for efficient video compression and storage optimization

Common Estimation Scenarios

Here are several common estimation scenarios you might encounter in system design interviews:
