December 5, 2021 7:00 PM PST
This document summarizes a mock system design interview focused on designing a video-sharing platform similar to YouTube. The interview covered functional and non-functional requirements, system design, and various technical discussions related to video upload, storage, and retrieval.
Interview Details
- Level: L5 (Senior)
- Duration: 1 hour
- Drawing Tool Used: Whimsical
Requirements
Functional Requirements
- View videos and thumbnails
- Upload videos
- Search for videos
- Display popularity and comments on videos
Non-Functional Requirements
- Ensure video integrity (no loss of videos)
- Provide smooth video playback
- Ensure high availability
- Minimize network costs
Constraints
- Daily Active Users (DAU): 100 million
- Average uploads: 5 videos per person per day
- Queries per second (QPS) for views: ~5000
- Upload and download ratios and bandwidth requirements were discussed.
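The stated constraints can be turned into a rough capacity estimate. The sketch below takes the DAU, upload, and view figures above at face value; the average video size is an assumption added purely for illustration and was not given in the interview.

```python
# Back-of-envelope estimate from the stated constraints.
SECONDS_PER_DAY = 24 * 60 * 60

dau = 100_000_000             # from the constraints
uploads_per_user_per_day = 5  # from the constraints
view_qps = 5_000              # from the constraints

uploads_per_day = dau * uploads_per_user_per_day
upload_qps = uploads_per_day / SECONDS_PER_DAY

# ASSUMPTION: average stored size per video; not stated in the interview.
avg_video_size_mb = 50
daily_storage_tb = uploads_per_day * avg_video_size_mb / 1_000_000

print(f"uploads/day: {uploads_per_day:,}")
print(f"upload QPS:  {upload_qps:,.0f}")
print(f"storage/day: {daily_storage_tb:,.0f} TB (assuming {avg_video_size_mb} MB per video)")
```

Taken literally, the upload figure works out to roughly 5,800 uploads per second, which is higher than the stated view QPS. In an interview setting, this is the kind of number worth confirming with the interviewer before designing around it.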
System Design
External APIs
- Video Retrieval
getVideo(video_id, user_id, offset)
- Offset is necessary because large videos are divided into smaller segments.
- Video Upload
uploadVideo(video_id, user_id, description, length, tags[], video_content)
- Returns a presigned URL for uploading.
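As an illustration of why getVideo takes an offset, here is a minimal sketch of mapping an offset to a stored segment. It assumes the offset is a playback position in seconds, a fixed 10-second segment length, and a hypothetical storage key layout and metadata table; none of these details were specified in the interview.

```python
from dataclasses import dataclass

SEGMENT_SECONDS = 10  # ASSUMPTION: fixed segment duration

# ASSUMPTION: stand-in for a metadata store mapping video_id -> length in seconds.
VIDEO_LENGTH_SECONDS = {"v1": 95}


@dataclass(frozen=True)
class Segment:
    index: int
    start: int  # inclusive, seconds
    end: int    # exclusive, seconds


def segment_for_offset(offset: int, length: int,
                       segment_seconds: int = SEGMENT_SECONDS) -> Segment:
    """Return the segment that contains the given playback offset."""
    if not 0 <= offset < length:
        raise ValueError("offset is outside the video")
    index = offset // segment_seconds
    start = index * segment_seconds
    end = min(start + segment_seconds, length)  # last segment may be shorter
    return Segment(index, start, end)


def get_video(video_id: str, user_id: str, offset: int) -> str:
    """Return the (hypothetical) storage key of the segment to serve.

    user_id would drive authorization checks in a real service; unused here.
    """
    length = VIDEO_LENGTH_SECONDS[video_id]
    segment = segment_for_offset(offset, length)
    return f"videos/{video_id}/segment_{segment.index:05d}.ts"
```

A client seeking to second 25 of a 95-second video would be served the third segment (index 2), and the player can then request subsequent segments in order.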
Upload Flow
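- The upload API is changed to accommodate presigned URLs: rather than sending video_content through the API server, the client receives a presigned URL and uploads the file directly to storage.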
- Discussion on whether to use Google Cloud Storage or AWS S3 versus maintaining a custom distributed file system.
- Consideration of trade-offs between pre-built solutions and self-maintained systems.
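The presigned URL idea can be sketched with a simplified HMAC-signed URL. This is not the actual signing algorithm used by S3 or Google Cloud Storage; the secret key, host name, path layout, and function names are all hypothetical and exist only to show the general mechanism of a time-limited, tamper-evident upload link.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"                   # ASSUMPTION: shared with the storage layer
STORAGE_HOST = "https://storage.example.com"  # hypothetical host


def _sign(method: str, path: str, expires: int) -> str:
    message = f"{method}\n{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def create_upload_url(video_id, user_id, ttl_seconds=900, now=None):
    """Return a URL that authorizes one PUT to a fixed path until it expires."""
    now = int(time.time()) if now is None else now
    expires = now + ttl_seconds
    path = f"/uploads/{user_id}/{video_id}"
    query = urlencode({"expires": expires,
                       "signature": _sign("PUT", path, expires)})
    return f"{STORAGE_HOST}{path}?{query}"


def verify_upload_url(url, method="PUT", now=None):
    """Storage-side check: signature matches and the URL has not expired."""
    now = int(time.time()) if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False
    expected = _sign(method, parsed.path, expires)
    return hmac.compare_digest(signature, expected)
```

With this shape, the API server only issues the URL and records metadata, while the large video payload goes straight to the storage layer.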
Encoding and Processing
- A message queue is introduced to trigger the encoding service, so that time-consuming encoding runs asynchronously after the upload instead of blocking the upload request.
- Discussion on the choice of storage solutions, emphasizing the need for scalable storage like S3.
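The queue-triggered encoding step might look roughly like the sketch below. It uses Python's in-process queue.Queue as a stand-in for a real message broker, and the encode function, resolution list, and output paths are placeholders assumed for illustration.

```python
import queue
import threading

RESOLUTIONS = ("360p", "720p", "1080p")  # ASSUMPTION: target renditions


def encode(video_id: str, resolution: str) -> str:
    # Placeholder for a real transcoder invocation.
    return f"encoded/{video_id}/{resolution}.mp4"


def encoding_worker(jobs: queue.Queue, results: dict) -> None:
    """Consume upload-complete messages and encode each video."""
    while True:
        video_id = jobs.get()
        if video_id is None:  # sentinel: shut the worker down
            jobs.task_done()
            break
        results[video_id] = [encode(video_id, r) for r in RESOLUTIONS]
        jobs.task_done()


def on_upload_complete(jobs: queue.Queue, video_id: str) -> None:
    """Called by the upload path; returns immediately without encoding."""
    jobs.put(video_id)


jobs = queue.Queue()
results = {}
worker = threading.Thread(target=encoding_worker, args=(jobs, results), daemon=True)
worker.start()

on_upload_complete(jobs, "video-123")
jobs.join()      # wait until the queued video has been processed
jobs.put(None)   # stop the worker
worker.join()
print(results["video-123"])
```

The upload handler only enqueues a message, so a slow or failed encode does not hold the user's upload request open.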
Download Flow
- Data retrieval from S3 with the addition of a CDN to cache popular videos.
- Implementation of a TopK system to identify frequently watched videos for caching.
- Optimization strategies include encoding videos at different resolutions.
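A minimal sketch of the TopK idea follows, assuming exact in-memory view counts. The class and method names are hypothetical; a production system at this scale would more likely use approximate counting or a streaming aggregation, which was not detailed in the interview.

```python
import heapq
from collections import Counter


class TopKTracker:
    """Track view counts and report the k most-watched videos."""

    def __init__(self, k: int):
        self.k = k
        self.views = Counter()

    def record_view(self, video_id: str) -> None:
        self.views[video_id] += 1

    def top_k(self):
        # nlargest avoids sorting the full set of videos.
        return heapq.nlargest(self.k, self.views.items(),
                              key=lambda item: item[1])

    def videos_to_cache(self):
        """Set of video ids that are candidates for CDN caching."""
        return {video_id for video_id, _ in self.top_k()}
```

The resulting set could be used to decide which videos to push to or keep in the CDN, while less popular videos are served from object storage.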
Comment Features
- APIs for commenting and liking were proposed:
comment(video_id, user_id, content, comment_id)
like(video_id, comment_id)
- Discussion on database schema and Redis usage for caching comments and handling failures.
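The two proposed calls could be backed by a store along the lines of the sketch below. It is an in-memory illustration only; the Comment fields and the idea of treating a client-supplied comment_id as an idempotency key are assumptions, not something confirmed in the interview.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    comment_id: str
    video_id: str
    user_id: str
    content: str
    likes: int = 0


class CommentStore:
    def __init__(self):
        self._by_video = {}  # video_id -> {comment_id: Comment}

    def comment(self, video_id, user_id, content, comment_id):
        comments = self._by_video.setdefault(video_id, {})
        # ASSUMPTION: a retried request with the same comment_id
        # must not create a duplicate comment.
        if comment_id not in comments:
            comments[comment_id] = Comment(comment_id, video_id, user_id, content)
        return comments[comment_id]

    def like(self, video_id, comment_id):
        # Raises KeyError if the video or comment does not exist.
        comment = self._by_video[video_id][comment_id]
        comment.likes += 1
        return comment.likes
```

Keying comments by video_id first mirrors the most common read pattern, which is loading all comments for one video.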
Database Schema
- Considerations for handling Redis failures and ensuring data integrity through synchronization with NoSQL databases.
- Discussion on the importance of maintaining a backup Redis instance.
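One way to picture the Redis failure handling is a cache-aside read that falls back from a primary cache to a backup cache and finally to the database. The sketch uses in-memory fakes rather than a real Redis client, and all class names are hypothetical.

```python
class CacheUnavailable(Exception):
    pass


class FakeCache:
    """In-memory stand-in for a Redis instance that can be marked as down."""

    def __init__(self):
        self.data = {}
        self.available = True

    def get(self, key):
        if not self.available:
            raise CacheUnavailable(key)
        return self.data.get(key)

    def set(self, key, value):
        if not self.available:
            raise CacheUnavailable(key)
        self.data[key] = value


class CommentReader:
    def __init__(self, primary, backup, database):
        self.caches = [primary, backup]
        self.database = database  # dict stand-in for the NoSQL store

    def get_comments(self, video_id):
        for cache in self.caches:
            try:
                cached = cache.get(video_id)
            except CacheUnavailable:
                continue  # try the next cache
            if cached is not None:
                return cached
            # Miss on a healthy cache: load from the database and populate it.
            comments = self.database.get(video_id, [])
            cache.set(video_id, comments)
            return comments
        # Every cache is down: serve directly from the source of truth.
        return self.database.get(video_id, [])
```

Because the database remains the source of truth, losing both caches degrades latency but does not lose comments.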
Additional Design Considerations
- Importance of requirement gathering and feature prioritization during the interview.
- The interviewee should drive the discussion and confirm understanding with the interviewer.
- Discussion on handling upload errors, pacing of uploads, and the scalability of the system.
- Consideration of various storage solutions and their implications on performance and cost.
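Handling upload errors is often approached by splitting the file into chunks and retrying only the chunk that failed. The sketch below assumes a caller-supplied send_chunk function and a hypothetical TransientUploadError; the chunk size is tiny only to keep the example readable.

```python
import time

CHUNK_SIZE = 4  # bytes; ASSUMPTION: tiny value for demonstration only


class TransientUploadError(Exception):
    """Hypothetical error raised by send_chunk for retryable failures."""


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def upload_with_retries(data: bytes, send_chunk, max_attempts: int = 3,
                        base_delay: float = 0.0) -> None:
    """Upload chunks in order, retrying each failed chunk with backoff."""
    for index, chunk in enumerate(split_into_chunks(data)):
        for attempt in range(1, max_attempts + 1):
            try:
                send_chunk(index, chunk)
                break  # chunk accepted; move on to the next one
            except TransientUploadError:
                if attempt == max_attempts:
                    raise  # give up after the final attempt
                # Exponential backoff also paces the retries.
                time.sleep(base_delay * 2 ** (attempt - 1))
```

Only the failed chunk is resent, so a dropped connection near the end of a large video does not force the whole upload to restart.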
Audience Feedback
- Emphasis on the need for the interviewee to propose ideas and follow up with questions.
- Discussion on the importance of covering all features within the time constraints of the interview.
- Suggestions for improving the systematic expression of knowledge during the interview.
Technical Discussions
Various technical aspects were discussed, including:
- Handling of uploads and downloads, including chunking and encoding.
- Use of message queues for asynchronous processing.
- Considerations for using NoSQL versus SQL databases.
- Strategies for caching and ensuring high availability.
This summary encapsulates the key points discussed during the mock interview, highlighting the technical depth and considerations necessary for designing a scalable video-sharing platform.