July 2, 2023 7:00 PM PDT
This document outlines the system design for a realtime commenting feature for a newsfeed application. The design focuses on handling a high volume of comments and ensuring low latency for users. Key considerations include scalability, data consistency, and the choice of database technologies.
Requirements
-
Functionality:
- Realtime commenting similar to Facebook.
- Comments should appear below the post.
- Support for one level of comments, with optional privileged comments.
-
User Metrics:
- Monthly Active Users (MAU): 1 billion
- Posts per minute: 100 million
- Comments per minute: 600,000
API Design
- Publish Comment:
Publish_comment(uid, ...)
- View Comments:
View_comment(post_id, ...)
- View Comments by User:
View_comment_by_user(uid, self_uid, auth_token)
Non-Functional Requirements
- High availability (99.999%)
- Resilience in comment publishing
- Scalability
- Eventual consistency (no need for linearizability or transactions)
- Real-time delivery (within 1 second)
Performance Estimates
- Queries Per Second (QPS): 600,000 comments per minute translates to approximately 10,000 QPS.
- Daily Comments: 1.08 billion comments per day.
- Data Size: Assuming 1 KB per comment, approximately 10 TB of data per day (after 50% compression, around 5 TB).
High-Level Design
-
Database Options:
- Realtime databases like leader-follower or multi-level databases.
- Quorum-based systems such as Dynamo or Cassandra.
-
Database Type:
- NoSQL is preferred (document or column store) for better locality and performance.
- Comments can be stored in a post table with JSON format for flexibility.
Conflict Resolution
- Use a version vector system to handle concurrent comments.
- Old versions can be deleted once combined.
Real-Time Delivery Mechanism
- Users maintain a socket connection with the server.
- Comments can be streamed using Kafka or similar technologies.
- Comments are published to a stream and read from an in-memory datastore for fast access.
Throughput Calculation
- For an app with 1 billion users, reading every 3 seconds:
- Estimated read throughput: 4.5 billion reads per second.
- Scale by partitioning across 1,000 to 2,000 servers.
- Each server would handle approximately 4.5 million reads per second.
Conclusion
The design focuses on achieving high throughput and low latency for a large-scale commenting system. By leveraging NoSQL databases and real-time streaming technologies, the system can efficiently handle the expected load while maintaining a responsive user experience.