System Design Cheatsheet
System design is crucial for building scalable, robust, and efficient systems. This cheatsheet provides a quick reference to key concepts, best practices, and common patterns in system design.
1. Key Principles of System Design
Scalability:
- Vertical Scaling (Scaling up): Adding more resources (CPU, RAM) to a single machine.
- Horizontal Scaling (Scaling out): Adding more machines (nodes) to the system.
- Stateless services are easier to scale horizontally.
- Data sharding: Splitting the data into smaller, manageable pieces (shards) across multiple nodes.
Reliability:
- Redundancy: Multiple replicas of data and services to ensure availability in case of failure.
- Failover: Automatic switching to a backup system when a failure occurs.
- Consistency: Ensuring that all replicas of data are the same across the system (eventually consistent vs. strongly consistent).
Maintainability:
- Modular design: Breaking down the system into smaller, manageable components (microservices, modules).
- Separation of concerns: Ensuring that components have distinct responsibilities.
Performance:
- Caching: Store frequently accessed data in-memory (e.g., Redis, Memcached).
- Load balancing: Distribute traffic across multiple servers to avoid overloading any one server.
- Asynchronous processing: Queueing tasks for background processing (e.g., RabbitMQ, Kafka).
2. High-Level Design (HLD)
Definition: The process of creating a high-level overview of the system architecture.
Components to Include:
- Service Components: Identify core components (e.g., front-end, backend services, databases, etc.).
- User Interaction: How users interact with the system (APIs, web clients, etc.).
- External Services: Identify any external services or third-party dependencies.
- Data Flow: Visualize how data flows between components.
- Scaling Strategy: Consider horizontal/vertical scaling and load balancing.
- Caching: Where and how caching will be implemented.
- Failover/Redundancy: How the system will recover from failures (replication, backups).
Tools:
- Use UML diagrams, Block Diagrams, and Data Flow Diagrams to represent HLD.
Example: High-Level Design of a URL Shortener:
- Frontend: React web application for user interface.
- Backend: REST APIs (to generate and retrieve short URLs).
- Database: NoSQL (MongoDB) for storing URLs and metadata.
- Cache: Redis for storing frequently accessed short URLs.
- Load Balancer: Distributes traffic to multiple instances of the backend service.
- Queue: Asynchronous job queue for processing URL analytics (e.g., RabbitMQ).
3. Low-Level Design (LLD)
Definition: The process of designing detailed components, their interactions, and APIs.
Components to Include:
- Data Models: Define the data structure of entities (e.g., classes, tables, collections).
- API Design: Specify RESTful APIs with HTTP methods, parameters, and responses.
- Database Schema: Design tables, relationships, and indices.
- Algorithms: Specific algorithms for task handling (e.g., pagination, sorting).
- Error Handling: How errors and exceptions are handled in the system.
- Concurrency: How multiple processes or threads are managed (e.g., thread pools, locking mechanisms).
Tools:
- Use Class Diagrams, Entity-Relationship Diagrams (ERD), and Sequence Diagrams for LLD.
Example: Low-Level Design for URL Shortener:
- Database Schema:
- URLs table: (ID, Long URL, Short URL, Created At, Expiry Date, etc.).
- Statistics table: (Short URL, Click Count, Last Access Time, etc.).
- API Endpoints:
- POST /shorten: Accepts a long URL and returns a short URL.
- GET /{shortUrl}: Redirects to the original long URL.
- Caching Layer:
- Cache short URLs in Redis with a TTL (time-to-live) to avoid constant database lookups.
- Concurrency:
- Use a mutex, an atomic counter, or a unique constraint in the database so that simultaneous create requests never produce the same short URL. Retrieval is read-only and does not need locking.
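The low-level design above can be sketched in a few lines. This is a minimal illustration, not a production design: it assumes short codes are derived by base-62 encoding an auto-incrementing ID (one common option that the design above does not prescribe), and the names `encode_base62` and `UrlShortener` are hypothetical. A plain dictionary stands in for the URLs table.

```python
import string
import threading
from typing import Dict, Optional

# Assumed alphabet: digits, then lowercase, then uppercase (62 symbols).
ALPHABET = string.digits + string.ascii_letters


def encode_base62(n: int) -> str:
    """Encode a non-negative integer as a base-62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


class UrlShortener:
    """In-memory stand-in for the URLs table and the two endpoints."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._next_id = 1
        self._by_code: Dict[str, str] = {}

    def shorten(self, long_url: str) -> str:
        # Mirrors POST /shorten. The lock makes ID allocation atomic,
        # so two concurrent callers never receive the same code.
        with self._lock:
            code = encode_base62(self._next_id)
            self._next_id += 1
            self._by_code[code] = long_url
        return code

    def resolve(self, code: str) -> Optional[str]:
        # Mirrors GET /{shortUrl}. Returns None for an unknown code.
        return self._by_code.get(code)
```

In a real deployment the counter would more likely live in the database or a dedicated ID service, since a process-local lock only protects a single instance.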
4. Common System Design Patterns
1. Microservices Architecture:
- Description: Break the application into smaller, loosely coupled services that are independently deployable.
- Pros: Independent scalability, fault isolation, easier updates.
- Cons: Increased complexity, inter-service communication challenges.
2. Event-Driven Architecture:
- Description: Components communicate through events. This is ideal for systems with high throughput or asynchronous workloads.
- Pros: Decoupled components, scalability.
- Cons: Complexity in handling eventual consistency.
3. Layered Architecture:
- Description: Divide the system into layers like presentation, business logic, data access, etc.
- Pros: Clear separation of concerns, maintainability.
- Cons: Every request passes through each layer, which can add latency and create bottlenecks if a layer is not optimized.
4. Client-Server Architecture:
- Description: The system is divided into two components: a client and a server. The client makes requests, and the server provides responses.
- Pros: Simple, efficient.
- Cons: Server can become a single point of failure.
5. CQRS (Command Query Responsibility Segregation):
- Description: Separate read operations (queries) and write operations (commands) into different models for scalability and performance.
- Pros: Optimized for read-heavy or write-heavy systems.
- Cons: Complexity in data synchronization.
6. Proxy Pattern:
- Description: A proxy acts as an intermediary to control access to a resource (e.g., for caching, logging).
- Pros: Can provide security and logging.
- Cons: Adds overhead to each request.
5. Key Concepts in System Design
Load Balancing:
- Round Robin: Cycles through the servers in order, so each receives an equal share of requests regardless of its current load.
- Least Connections: Sends requests to the server with the least number of connections.
- IP Hash: Routes requests based on the hash of the client’s IP address.
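The three strategies above differ only in how the next server is chosen. The sketch below shows one possible selection function for each; the class and function names are hypothetical, and real load balancers add health checks, weights, and connection tracking that are omitted here.

```python
import hashlib
import itertools
from typing import Dict, List


class RoundRobin:
    """Hand out servers in a fixed repeating order."""

    def __init__(self, servers: List[str]) -> None:
        self._cycle = itertools.cycle(servers)

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnections:
    """Pick the server with the fewest active connections."""

    def __init__(self, servers: List[str]) -> None:
        self.active: Dict[str, int] = {s: 0 for s in servers}

    def acquire(self) -> str:
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        self.active[server] -= 1


def ip_hash(servers: List[str], client_ip: str) -> str:
    """Map a client IP to the same server on every call."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

Note that the modulo in `ip_hash` remaps most clients whenever the server list changes; consistent hashing is the usual way to limit that.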
Database Sharding:
- Horizontal Sharding: Splits the rows of the same dataset across different databases (e.g., based on user ID ranges).
- Vertical Sharding: Splits data based on functionality (e.g., one DB for users, another for orders).
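As a rough illustration of the routing step, the sketch below maps a record to its shard in three ways: by user ID range, by hash of the user ID, and by functional area. The range boundaries, database names, and function names are all made up for the example.

```python
import bisect
import hashlib
from typing import Dict, List

# Hypothetical exclusive upper bounds: shard 0 holds IDs below 1,000,000,
# shard 1 holds IDs below 2,000,000, and so on.
RANGE_UPPER_BOUNDS: List[int] = [1_000_000, 2_000_000, 3_000_000]


def range_shard(user_id: int) -> int:
    """Horizontal sharding by user ID range."""
    index = bisect.bisect_right(RANGE_UPPER_BOUNDS, user_id)
    if index >= len(RANGE_UPPER_BOUNDS):
        raise ValueError(f"no shard configured for user_id {user_id}")
    return index


def hash_shard(user_id: int, num_shards: int) -> int:
    """Horizontal sharding by hash, which spreads IDs more evenly."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Vertical sharding: each functional area lives in its own database.
VERTICAL_SHARDS: Dict[str, str] = {
    "users": "users_db",
    "orders": "orders_db",
}
```

Range sharding keeps neighbouring IDs together, which helps range scans but can create hot shards; hash sharding trades that locality for a more even spread.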
Caching Strategies:
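- Cache Aside (Lazy Loading): The application checks the cache first; on a miss, it reads from the database and populates the cache itself.
- Write Through: Every write goes to the cache and the database in the same operation, so cached entries stay in sync with the database.
- Read Through: The cache layer itself fetches from the database on a miss and stores the result, so the application only talks to the cache.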
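A minimal sketch of cache-aside and write-through follows. It assumes an in-process dictionary with a TTL standing in for Redis or Memcached, and a plain dictionary standing in for the database; `TTLCache`, `get_long_url`, and `save_long_url` are hypothetical names.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # Expired: treat as a miss.
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._data[key] = (value, self._clock() + self._ttl)


def get_long_url(code: str, cache: TTLCache,
                 db: Dict[str, str]) -> Optional[str]:
    """Cache-aside read: the caller checks the cache, then the database."""
    cached = cache.get(code)
    if cached is not None:
        return cached
    value = db.get(code)
    if value is not None:
        cache.set(code, value)  # Populate the cache on a miss.
    return value


def save_long_url(code: str, url: str, cache: TTLCache,
                  db: Dict[str, str]) -> None:
    """Write-through: update the database and the cache together."""
    db[code] = url
    cache.set(code, url)
```

The injectable `clock` parameter exists only to make expiry easy to test without sleeping.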
Data Replication:
- Master-Slave Replication: One master database handles writes, and multiple slave databases handle reads.
- Peer-to-Peer Replication: All databases are equal, and data is replicated between them.
Message Queues:
- Purpose: Decouple microservices and handle asynchronous processing.
- Tools: RabbitMQ, Kafka, AWS SQS.
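To show the decoupling idea without a broker, the sketch below uses Python's standard `queue.Queue` as a stand-in for RabbitMQ, Kafka, or SQS. The producer enqueues click events and returns immediately, while a background worker counts them. The function name `run_click_pipeline` and the `None` shutdown sentinel are choices made for this example only.

```python
import queue
import threading
from typing import Dict, Iterable, Optional


def run_click_pipeline(events: Iterable[str]) -> Dict[str, int]:
    """Count click events per short code using a background worker."""
    tasks: "queue.Queue[Optional[str]]" = queue.Queue()
    counts: Dict[str, int] = {}

    def worker() -> None:
        while True:
            code = tasks.get()
            if code is None:  # Sentinel: no more work.
                tasks.task_done()
                break
            counts[code] = counts.get(code, 0) + 1
            tasks.task_done()

    consumer = threading.Thread(target=worker)
    consumer.start()

    # Producer side: enqueue and move on without waiting for processing.
    for code in events:
        tasks.put(code)
    tasks.put(None)

    consumer.join()
    return counts
```

A real broker adds what this sketch lacks: persistence, delivery across processes and machines, and retry or acknowledgement semantics.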
Rate Limiting:
- Token Bucket: Allows requests in bursts but enforces an average rate.
- Leaky Bucket: Processes requests at a constant rate, discarding excess requests.
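A token bucket can be sketched in a few lines. In this illustrative version, `rate` is the number of tokens added per second and `capacity` is the largest burst allowed; both names, and the `TokenBucket` class itself, are assumptions for the example rather than any particular library's API.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity` while enforcing an average `rate`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # Start with a full bucket.
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill for the time elapsed, never exceeding capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

For example, `TokenBucket(rate=5, capacity=10)` would let a client send 10 requests at once, then about 5 per second after that. This version is not thread-safe; a shared limiter would need a lock or an atomic store such as Redis.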
6. Scalability & Availability Strategies
Scaling:
- Stateless Design: Easier to scale since no session data is stored on the server.
- Horizontal Scaling: Add more instances of the service, and use load balancing to distribute traffic.
- Vertical Scaling: Increase server resources; this is capped by the largest machine available and leaves a single point of failure.
High Availability:
- Replication: Use multiple servers or databases to ensure availability even during failures.
- Failover: Automatically switch to a backup system if the primary one fails.
- Disaster Recovery: Plan for backups and strategies to recover from major failures.
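Client-side failover can be as simple as trying replicas in order. The sketch below assumes each replica is a callable that raises `ConnectionError` when it is unreachable; `call_with_failover` is a hypothetical helper, and production systems usually pair this with health checks and timeouts.

```python
from typing import Callable, List, Sequence


def call_with_failover(replicas: Sequence[Callable[[str], str]],
                       request: str) -> str:
    """Send the request to the first replica that responds."""
    errors: List[Exception] = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            errors.append(exc)  # Remember the failure, try the next one.
    raise ConnectionError(f"all {len(errors)} replicas failed")
```

Listing the primary first and the backups after it gives the automatic switch described above without any extra coordination.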
CAP Theorem:
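- Consistency: Every read sees the most recent write, so all nodes appear to hold the same data at any given time.
- Availability: Every request gets a non-error response, whether the data is up-to-date or not.
- Partition Tolerance: The system continues to function even if network partitions occur.
- Trade-off: When a network partition occurs, a system must give up either consistency or availability.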
Consistency Models:
- Strong Consistency: Once a write completes, every subsequent read returns it, regardless of which replica serves the read.
- Eventual Consistency: Replicas may be out of sync but will eventually be consistent.
- Transactional Consistency: Achieved through ACID properties (Atomicity, Consistency, Isolation, Durability).
7. System Design Case Studies
1. Design a URL Shortener:
- Key components: Web UI, API layer, Database (NoSQL or Relational), Cache (Redis), Load balancer.
- Features:
- Shorten long URLs.
- Redirect short URLs to the original long URL.
- Track click analytics (optional).
- Challenges: Unique short URLs, efficient redirection, scalability (for high traffic).
2. Design a Chat Application:
- Key components: User authentication, message broker (Kafka/RabbitMQ), database for message history, real-time messaging.
- Features:
- Send and receive messages in real time.
- Group chat, notifications, and message persistence.
- Challenges: Handling real-time communication, scaling for large user bases, ensuring message delivery.