November 28, 2021 7:00 PM PST
This document summarizes a mock system design interview focused on designing a TopK system for YouTube. The interview evaluated both soft and hard skills, with an emphasis on system architecture, database design, and performance considerations.
Key Feedback
Soft Skills
- The interviewee demonstrated strong clarifying skills by pausing at key junctions to confirm requirements.
Hard Skills
- The TopK system for YouTube can be designed similarly to a time-series database with multiple layers:
- Redis: Fast distributed cache.
- Periodically aggregate Redis data into a relational database (e.g., at minute frequency).
- Periodically aggregate higher frequency data into lower frequency data (e.g., hourly or daily).
- Lower granularity database tables can be cleared out once aggregation is computed.
Interview Overview
- Target Level: L4 (Experienced Individual Contributor)
- Duration: 45 minutes
- Topic Covered: Top K for YouTube
- Drawing Tool Used: Whimsical
Requirements
Functional Requirements
- Trending systems
- TopK for the last 24 hours for 300 videos
Non-Functional Requirements
- High availability
- Low latency
- High throughput/scalability
- Fault tolerance
Traffic
- 100M users
- Average of 1 hour of video watching
- Each video averages 5 minutes
- Peak time estimates for platforms like YouTube and Bilibili
- Read-heavy operations
System Design
External APIs
getTopVideos(K, startTime, null)
System Design Diagram
- Considerations for distributed design versus single host design.
- Landing page request to calculate the most trending videos on the fly.
Database Schema
- NoSQL (Cassandra):
- Time, ListOfVideos (top K)
- Date, one day
- 5 minutes, ListOfVideos (top K videos)
- Top 200 videos for 5 minutes
Watching Video
- Daemon to calculate 80% of video views.
- Distributed message queue schema:
- User Watch Count for the last few seconds.
- Consistent hashing by UserID.
- CountMin sketch for fixed memory.
Aggregation
- Aggregate count data with a single process.
- Precompute topK and save in cache.
- Estimate the number of videos and distinguished videos.
Database Considerations
- Need to keep data in a database.
- Use of time-series databases.
- Sharding by video ID.
- Writing processes into the database using Redis cache, MapReduce, or Spark.
- Handling 15,000 QPS with Redis for short-term pre-processing.
Key Challenges and Considerations
- Lossy count and count-min sketch for batch and stream processing.
- Ensuring no data loss and fast data processing.
- Asynchronous jobs provided by the backend.
- Database scanning efficiency (N log (k)).
- Aggregation at 5-minute granularity.
- Potential use of priority queues for data management.
Additional Questions
- Considerations for Level 5 and Level 6 challenges.