August 14, 2022 8:00 PM PDT
This document summarizes a mock system design interview focused on creating a real-time messaging system. The interview covered various aspects of system requirements, architecture, and technology choices, including functional and non-functional requirements, database design, message queuing, and scalability considerations.
Interview Details
- Topic: Real-Time Messaging System
- Level: L4 (Experienced Individual Contributor)
- Duration: 45 minutes
- Drawing Tool Used: Excalidraw
Requirements
Functional Requirements
- 1-1 chat functionality
- Message history maintenance
Non-Functional Requirements
- Scalability and availability
- Low latency in message delivery
System Design
External APIs
send(userId, content)
receive(userId, content)
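The two calls above can be sketched in memory to show how they fit together. This is only an illustration under stated assumptions: the class name `InMemoryChat`, the `connect`/`disconnect` methods, and the explicit `sender_id` argument are hypothetical and were not part of the interview's API. `receive` is modeled as a callback that the service invokes with the sender's ID and the content.

```python
class InMemoryChat:
    """Hypothetical stand-in for the chat service; nothing here is persisted."""

    def __init__(self):
        self._handlers = {}  # user_id -> receive(userId, content) callback
        self._pending = {}   # user_id -> [(sender_id, content)] awaiting delivery

    def connect(self, user_id, receive):
        # Register the callback and flush anything stored while the user was offline.
        self._handlers[user_id] = receive
        for sender_id, content in self._pending.pop(user_id, []):
            receive(sender_id, content)

    def disconnect(self, user_id):
        self._handlers.pop(user_id, None)

    def send(self, sender_id, user_id, content):
        # Assumption: the sender is passed explicitly; a real service would
        # more likely derive it from the authenticated session.
        receive = self._handlers.get(user_id)
        if receive is None:
            self._pending.setdefault(user_id, []).append((sender_id, content))
        else:
            receive(sender_id, content)
```

A message sent to an offline user is held and handed to `receive` once that user connects, which mirrors the "receive messages when they come online" behavior described in the Architecture notes.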
Architecture
- Database Choice: NoSQL for scalability
- Messages are sent and saved in the database.
- Users receive messages from the NoSQL database when they come online.
Communication Protocols
- Polling: The client periodically asks the server for new messages (high overhead when there are none).
- Long Polling: The server holds each request open until a message arrives or a timeout elapses, then the client issues a new request.
- WebSocket: Provides bidirectional communication with low overhead.
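The difference between polling and long polling can be shown with a small simulation. This is a rough sketch, not a network implementation: a `queue.Queue` stands in for the server-side inbox, and the function names `short_poll` and `long_poll` are made up for illustration.

```python
import queue


def short_poll(inbox):
    """Plain polling: return immediately, with None when nothing is waiting."""
    try:
        return inbox.get_nowait()
    except queue.Empty:
        return None


def long_poll(inbox, timeout_s):
    """Long polling: block until a message arrives or timeout_s elapses."""
    try:
        return inbox.get(timeout=timeout_s)
    except queue.Empty:
        return None  # timed out; the client would re-issue the request
```

With plain polling, every empty call is a wasted round trip. With long polling, an idle client costs one open request per timeout window. A WebSocket removes the request cycle entirely because either side can push at any time.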
Message Queue
- Each user has a dedicated message queue topic.
- Recommended message queue: Kafka
- Supports topic creation for each user.
- Robust failover and retry mechanisms.
Scaling the Message Queue
- Each chat server acts as both a publisher and a subscriber.
- Adding publishers and subscribers is straightforward with Kafka.
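The per-user topic idea, with every chat server acting as both publisher and subscriber, can be sketched with an in-memory broker. This is deliberately not Kafka client code: `TopicBroker`, `ChatServer`, and the `user.<id>` topic naming are hypothetical names chosen for the example.

```python
from collections import defaultdict


def user_topic(user_id):
    # Assumed naming scheme: one topic per user.
    return f"user.{user_id}"


class TopicBroker:
    """In-memory stand-in for the message queue."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in list(self._subscribers.get(topic, [])):
            callback(message)


class ChatServer:
    """Publishes for senders and subscribes on behalf of its connected users."""

    def __init__(self, name, broker):
        self.name = name
        self.broker = broker
        self.delivered = []  # (user_id, message) pairs pushed to local connections

    def attach_user(self, user_id):
        # Called when a user connects to this server.
        self.broker.subscribe(
            user_topic(user_id),
            lambda message: self.delivered.append((user_id, message)),
        )

    def send(self, to_user, content):
        self.broker.publish(user_topic(to_user), content)
```

In this sketch the sending server does not need to know which server holds the recipient's connection; it only publishes to the recipient's topic.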
WebSocket Connections
- Each user maintains a single WebSocket connection to one chat server.
- Messages are sent from the database to the message queue that the recipient is subscribed to.
Database Design
- Database Type: NoSQL (e.g., Cassandra)
- Schema:
  - Columns: messageID, content, sender, receiver, timestamp
  - Incrementing messageID for message ordering.
  - Partition key can be either senderID or receiverID.
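The schema can be illustrated with a small in-memory model. This is a sketch under assumptions, not Cassandra code: it picks the receiver as the partition key (the notes allow either side), keeps rows in each partition sorted by messageID, and uses hypothetical names `Message` and `MessageTable`.

```python
from bisect import insort
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Message:
    # Field order follows the column list; message_id comes first so that
    # ordering comparisons are decided by it.
    message_id: int
    content: str
    sender: str
    receiver: str
    timestamp: float


class MessageTable:
    """Toy table: partitioned by receiver, sorted by message_id within a partition."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def insert(self, message):
        insort(self._partitions[message.receiver], message)

    def history(self, receiver):
        return list(self._partitions.get(receiver, []))
```

Partitioning by receiver makes "load my inbox" a single-partition read. Reading a full two-way conversation would then need both users' partitions, which is one of the trade-offs behind the choice of key.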
Message Ordering
- Use incrementing messageID and timestamp for ordering.
- Handle network issues by relying on timestamps of message receipt.
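One way to read the ordering rule is that the server stamps each message on receipt and readers sort by receipt timestamp, with messageID as the tie-breaker. The sketch below assumes a single sequencer and an injectable clock. `Sequencer` and `in_order` are hypothetical names, and a distributed deployment would need a different ID scheme.

```python
import itertools
import time


class Sequencer:
    """Assigns an incrementing messageID and a server-side receipt timestamp."""

    def __init__(self, clock=time.time):
        self._ids = itertools.count(1)
        self._clock = clock

    def stamp(self, content):
        return {
            "messageID": next(self._ids),
            "timestamp": self._clock(),  # time of receipt, not client send time
            "content": content,
        }


def in_order(messages):
    # Sort by receipt time first; messageID breaks ties within the same instant.
    return sorted(messages, key=lambda m: (m["timestamp"], m["messageID"]))
```

Because the timestamp is taken on the server, a client with a skewed clock or a delayed network path cannot reorder other users' messages.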
Scalability and Availability
- Avoiding Single Points of Failure:
  - Replicate chat servers.
  - Use Kafka for message queuing.
  - Implement NoSQL with replicas in different data centers.
- Load Balancing:
  - Use consistent hashing based on user ID to distribute load evenly among servers.
  - Servers can join and leave the cluster using a heartbeat mechanism.
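Consistent hashing on user ID can be sketched as a hash ring with virtual nodes. This is a minimal illustration, not the design from the interview: the class name `HashRing`, the SHA-256 based hash, and the default of 100 virtual nodes per server are all assumptions. Heartbeat handling is left out; `add` and `remove` are what a heartbeat monitor would call.

```python
import bisect
import hashlib


def _hash(key):
    # Stable 64-bit hash, independent of Python's per-process hash seed.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, vnodes=100):
        self._vnodes = vnodes
        self._points = []  # sorted list of (hash, server) pairs

    def add(self, server):
        # Each server owns several points so load spreads more evenly.
        for i in range(self._vnodes):
            bisect.insort(self._points, (_hash(f"{server}#{i}"), server))

    def remove(self, server):
        self._points = [p for p in self._points if p[1] != server]

    def server_for(self, user_id):
        if not self._points:
            raise LookupError("no servers in ring")
        # First point clockwise from the user's hash, wrapping at the end.
        i = bisect.bisect_right(self._points, (_hash(user_id), ""))
        return self._points[i % len(self._points)][1]
```

When a server leaves the ring, only the users that were mapped to it are reassigned; everyone else keeps their server, which limits the number of WebSocket connections that must be re-established.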
Feedback and Discussion Points
- WebSocket Complexity: Challenges in managing WebSocket connections across multiple servers.
- Message Queue Choices: Discussion on the suitability of Kafka vs. RabbitMQ for real-time messaging.
- Database Considerations: Importance of maintaining message order and handling edge cases in message delivery.
Conclusion
The interview highlighted the complexities involved in designing a scalable and efficient real-time messaging system. Key considerations included the choice of technology for message queuing, database design, and ensuring message order and delivery reliability.