May 21, 2023 7:00 PM PDT
This document outlines the system design for a video upload and viewing platform similar to YouTube. The focus is on handling high traffic, ensuring high availability, and providing a seamless user experience across multiple devices.
Requirements
- Functional Requirements:
- Support video upload and viewing.
- Support playback on multiple device types.
- Display video thumbnails and previews on hover.
- Implement a "like" feature.
- Non-Functional Requirements:
- Ensure videos are not lost.
- Provide smooth video playback.
- Achieve high availability.
- Minimize network costs.
Constraints
- User Base: 100 million daily active users.
- View Rate: 5 videos per user per day.
- View Queries per second (QPS): 5 * 100 million / 86,400 ≈ 5,800/s, rounded to 5,000/s for the estimates below.
- Upload/Download Ratio: 1:200.
- Upload QPS: 5,000 / 200 = 25/s.
- Storage Needs: 683 TB/day.
- Upload Bandwidth: roughly 7 GB/s (683 TB/day spread over 86,400 seconds is about 7.9 GB/s).
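The back-of-the-envelope numbers above can be reproduced with a short script. This is a minimal sketch, assuming that the figure of 5 videos per user per day is what drives the 5,000/s QPS estimate, that upload QPS is that rounded figure divided by 200, and that decimal units apply (1 TB = 1,000 GB). The average upload size it prints is not stated in the constraints; it is only what the 683 TB/day figure implies under those assumptions.

```python
SECONDS_PER_DAY = 86_400

daily_active_users = 100_000_000
videos_per_user_per_day = 5
upload_to_download_ratio = 200   # 1 upload per 200 downloads
storage_per_day_tb = 683         # taken from the constraints as given

# Exact QPS before rounding, and the rounded figure the constraints use.
exact_qps = daily_active_users * videos_per_user_per_day / SECONDS_PER_DAY
rounded_qps = 5_000
upload_qps = rounded_qps / upload_to_download_ratio

# Bandwidth implied by the daily storage figure (assumes 1 TB = 1,000 GB).
upload_bandwidth_gb_s = storage_per_day_tb * 1_000 / SECONDS_PER_DAY

# Average upload size implied by the storage figure and the upload QPS.
uploads_per_day = upload_qps * SECONDS_PER_DAY
implied_video_size_mb = storage_per_day_tb * 1_000_000 / uploads_per_day

print(f"QPS (exact):          {exact_qps:,.0f}/s")
print(f"Upload QPS:           {upload_qps:.0f}/s")
print(f"Upload bandwidth:     {upload_bandwidth_gb_s:.1f} GB/s")
print(f"Implied average size: {implied_video_size_mb:.0f} MB per video")
```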
External API Design
getVideo(video_id, user_id, offset) -> URL
uploadVideo(video_id, user_id, description, length, tags[], video_content)
System Design Discussion
Metadata and Upload Services
- Separation of Services:
- The metadata service is separated from the upload service so that long-running uploads do not block metadata requests. The metadata service issues a signed URL, which the client then uses to upload the video.
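To illustrate the signed-URL idea, here is a minimal sketch of how a metadata service might issue a time-limited upload URL and how an upload service might verify it. The HMAC scheme, the secret, the host name, and the function names are all assumptions made for illustration; this is not how S3 presigned URLs are computed, and a production system would normally use the storage provider's own presigning mechanism.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"                 # hypothetical shared secret
UPLOAD_HOST = "https://upload.example.com"  # hypothetical upload endpoint


def _sign(video_id, user_id, expires):
    """Return an HMAC-SHA256 signature over the fields that must not change."""
    message = f"{video_id}:{user_id}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def create_signed_upload_url(video_id, user_id, ttl_seconds=900, now=None):
    """Metadata service side: issue a URL that is valid for ttl_seconds."""
    now = int(time.time()) if now is None else now
    expires = now + ttl_seconds
    query = urlencode({
        "user_id": user_id,
        "expires": expires,
        "signature": _sign(video_id, user_id, expires),
    })
    return f"{UPLOAD_HOST}/videos/{video_id}?{query}"


def verify_signed_upload_url(url, now=None):
    """Upload service side: accept only unexpired, untampered URLs."""
    now = int(time.time()) if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        video_id = parsed.path.rsplit("/", 1)[-1]
        user_id = params["user_id"][0]
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False
    return hmac.compare_digest(signature, _sign(video_id, user_id, expires))


url = create_signed_upload_url("v1", "u1", ttl_seconds=60, now=1000)
print(verify_signed_upload_url(url, now=1030))  # within the validity window
print(verify_signed_upload_url(url, now=2000))  # after expiry
```

Because the signature covers the video ID, the user ID, and the expiry time, the upload service can validate a request without calling back into the metadata service.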
Encoding Service
- Positioning:
- The encoding service is placed after the upload service, so clients upload the original video to original storage first. The stored original can then be encoded into the various formats needed.
Message Queue Utilization
- Purpose:
- A message queue is added to trigger the encoding service asynchronously, improving efficiency by not requiring the system to wait for encoding to complete.
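The asynchronous hand-off can be sketched with an in-process queue and a worker thread. This is only an illustration: `queue.Queue` stands in for a real message broker, and the rendition names, function names, and output file naming are hypothetical.

```python
import queue
import threading

encoding_jobs = queue.Queue()  # in-process stand-in for a real message queue
encoded = {}                   # video_id -> list of produced rendition names
TARGET_RENDITIONS = ["360p", "720p", "1080p"]  # hypothetical output formats


def handle_upload_complete(video_id):
    """Upload path: enqueue the job and return without waiting for encoding."""
    encoding_jobs.put(video_id)
    return "accepted"


def encoding_worker():
    """Encoding service: consume jobs until a None sentinel arrives."""
    while True:
        video_id = encoding_jobs.get()
        if video_id is None:
            encoding_jobs.task_done()
            break
        # A real worker would transcode here; this only records the outputs.
        encoded[video_id] = [f"{video_id}_{r}.mp4" for r in TARGET_RENDITIONS]
        encoding_jobs.task_done()


worker = threading.Thread(target=encoding_worker, daemon=True)
worker.start()

status = handle_upload_complete("video-123")  # returns immediately
encoding_jobs.join()                          # demo only: wait for the worker
encoding_jobs.put(None)                       # ask the worker to stop
worker.join()

print(status, encoded["video-123"])
```

The upload handler returns as soon as the job is enqueued; the `join()` calls exist only so the demo can print the result.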
Storage Solutions
- Choice of Storage:
- S3 is preferred for video storage because it can hold very large volumes of data. Comparable object stores such as Azure Blob Storage or Google Cloud Storage can also be considered.
Download Design
- High Bandwidth Needs:
- Clients can directly access videos from S3, with a CDN implemented to cache popular videos. The CDN will fall back to S3 for less popular content.
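The fallback behavior can be sketched as a read-through cache in front of an origin store. The class names and the in-memory dictionaries are assumptions for illustration; a real CDN and S3 would sit behind network calls and apply their own eviction policies.

```python
class OriginStorage:
    """Stand-in for S3: holds every video and counts how often it is read."""

    def __init__(self, objects):
        self._objects = dict(objects)
        self.reads = 0

    def get(self, video_id):
        self.reads += 1
        return self._objects[video_id]


class CdnEdge:
    """Stand-in for a CDN edge that caches only videos marked as popular."""

    def __init__(self, origin, cacheable_ids):
        self._origin = origin
        self._cacheable_ids = set(cacheable_ids)
        self._cache = {}

    def get(self, video_id):
        if video_id in self._cache:
            return self._cache[video_id], "cdn-hit"
        # Cache miss: fall back to the origin store.
        data = self._origin.get(video_id)
        if video_id in self._cacheable_ids:
            self._cache[video_id] = data
        return data, "origin"


origin = OriginStorage({"popular": b"A", "rare": b"B"})
edge = CdnEdge(origin, cacheable_ids={"popular"})

print(edge.get("popular"))  # first request is served from the origin
print(edge.get("popular"))  # later requests are served from the edge cache
print(edge.get("rare"))     # unpopular content always goes to the origin
```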
Caching Strategies
- Popular Videos:
- A recommender/topK system will determine which videos to cache in the CDN.
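One simple way to pick cache candidates is to count views and take the K most-viewed videos. The sketch below assumes view events arrive as a plain list of video IDs; the function name is hypothetical, and a production topK system would more likely use streaming or approximate counting over a time window.

```python
from collections import Counter


def top_k_videos(view_events, k):
    """Return the k most-viewed video IDs, most popular first."""
    counts = Counter(view_events)
    return [video_id for video_id, _ in counts.most_common(k)]


events = ["a", "b", "a", "c", "a", "b", "d"]
hot = top_k_videos(events, 2)
print(hot)  # ['a', 'b']
```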
Comments Feature Design
- External API:
comment(video_id, user_id, content, comment_id)
like(video_id, comment_id)
- Database Schema:
- Comments will be stored in a NoSQL database due to the straightforward schema and the need to handle large volumes of data without complex joins.
- Caching Comments:
- Popular comments will be cached in Redis and persisted to the database every 5 minutes.
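Caching writes and persisting them periodically is a write-behind pattern. The following is a minimal sketch in which plain dictionaries stand in for Redis and the NoSQL database; the class name, method names, and the injectable clock are assumptions for illustration.

```python
import time

FLUSH_INTERVAL_SECONDS = 5 * 60  # persist every 5 minutes


class WriteBehindCommentCache:
    """Serve comments from a cache and persist dirty entries periodically."""

    def __init__(self, database, clock=time.monotonic):
        self._database = database  # dict stand-in for the NoSQL store
        self._cache = {}           # dict stand-in for Redis
        self._dirty = set()        # comment IDs written since the last flush
        self._clock = clock
        self._last_flush = clock()

    def put(self, comment_id, comment):
        self._cache[comment_id] = comment
        self._dirty.add(comment_id)
        self.flush_if_due()

    def get(self, comment_id):
        if comment_id in self._cache:
            return self._cache[comment_id]
        return self._database.get(comment_id)

    def flush_if_due(self):
        if self._clock() - self._last_flush >= FLUSH_INTERVAL_SECONDS:
            self.flush()

    def flush(self):
        """Write every dirty comment to the database and reset the timer."""
        for comment_id in self._dirty:
            self._database[comment_id] = self._cache[comment_id]
        self._dirty.clear()
        self._last_flush = self._clock()


database = {}
comments = WriteBehindCommentCache(database)
comments.put("c1", {"video_id": "v1", "content": "Nice video"})
print(comments.get("c1"), database)  # cached, but not yet in the database
comments.flush()
print(database)                      # persisted after the flush
```

The trade-off is that comments written since the last flush exist only in the cache, so a cache failure could lose up to one flush interval of writes.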
User Interaction
- Like Button Accuracy:
- User interactions such as likes must be tracked accurately. A standby Redis instance may be used to ensure reliability.
This document serves as a comprehensive guide for designing a video upload and viewing platform, addressing key technical considerations and system architecture.