August 7, 2022 8:00 PM PDT
This document summarizes a mock system design interview focused on an Ads Targeting System. The discussion covered functional and non-functional requirements, design considerations, and trade-offs between SQL and NoSQL databases. The interview aimed to evaluate the candidate's understanding of system design principles, particularly in the context of high-volume data processing and user tagging.
Requirements
Functional Requirements
- User Tagging:
  - Users need to be tagged to reflect different behaviors.
  - Hundreds of tags for millions of users.
- APIs:
  - AddTag(tag_info, user_id): Add a tag to a user.
  - tags_infos get(user_id): Retrieve tags associated with a user.
  - User_id get(tag_info): Retrieve users associated with a specific tag.
- Tag Metadata:
  - Different tags have unique metadata with no overlap.
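The three APIs above can be sketched as a small in-memory index. This is only an illustration of the call shapes, not the design proposed in the interview: the class name `TagStore`, the method names `add_tag`, `get_tags`, and `get_users`, and the sample tag names are all assumptions, and tags are reduced to plain strings with no metadata.

```python
from collections import defaultdict


class TagStore:
    """In-memory sketch of the three APIs; not a production store."""

    def __init__(self):
        # Two indexes so that both lookup directions are a single key access.
        self._tags_by_user = defaultdict(set)
        self._users_by_tag = defaultdict(set)

    def add_tag(self, tag_info, user_id):
        """AddTag(tag_info, user_id): attach a tag to a user."""
        self._tags_by_user[user_id].add(tag_info)
        self._users_by_tag[tag_info].add(user_id)

    def get_tags(self, user_id):
        """get(user_id): return a copy of the tags attached to a user."""
        return set(self._tags_by_user.get(user_id, ()))

    def get_users(self, tag_info):
        """get(tag_info): return a copy of the users carrying a tag."""
        return set(self._users_by_tag.get(tag_info, ()))


store = TagStore()
store.add_tag("sports_fan", 1)        # hypothetical tag names
store.add_tag("sports_fan", 2)
store.add_tag("frequent_buyer", 1)
```

Keeping two indexes doubles the write work but makes both read paths a direct lookup, which matches the requirement that reads be fast while writes can be slower.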
Non-Functional Requirements
- Latency Requirements:
  - Some services require millisecond response times (e.g., subscription for paid services).
  - Reads should be fast; writes can tolerate slower responses.
- Queries Per Second (QPS):
  - Reads: 5k QPS off-peak; 25k QPS at peak hours.
  - Writes: 2k QPS; a single write request may contain millions of users.
- Consistency:
  - Minimal consistency guarantees are acceptable; strong consistency is not required.
  - Message queues (MQ) and caching are necessary.
  - Cache invalidation is required when tags are added.
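A quick back-of-envelope calculation on the traffic figures above shows why writes cannot be applied synchronously. The QPS numbers come from the requirements; the figure of 1,000,000 users per large write request is an assumption chosen to represent "millions of users", and treating every write as that large gives a deliberately pessimistic upper bound.

```python
READ_QPS_OFF_PEAK = 5_000
READ_QPS_PEAK = 25_000
WRITE_QPS = 2_000
USERS_PER_LARGE_WRITE = 1_000_000  # assumption, not stated in the interview

# How much read capacity must stretch between off-peak and peak.
peak_to_off_peak = READ_QPS_PEAK / READ_QPS_OFF_PEAK

# How read-heavy the workload is at peak, counted in requests.
read_write_ratio_at_peak = READ_QPS_PEAK / WRITE_QPS

# Upper bound on individual user-tag updates per second if every
# write request carried the assumed maximum number of users.
worst_case_user_updates_per_sec = WRITE_QPS * USERS_PER_LARGE_WRITE

print(peak_to_off_peak, read_write_ratio_at_peak, worst_case_user_updates_per_sec)
```

Counted in requests the workload is read-heavy, but counted in individual user-tag updates the write side can dominate, which is one way to motivate buffering writes behind a message queue.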
System Design Considerations
Caching and Message Queues
- Cache Structure:
  - user_id -> tag_ids
  - tag_id -> user_ids
- Message Queue:
  - Necessary to absorb spikes in requests.
  - Smaller requests need faster responses, so two separate message queues may be used: one for small requests and one for large ones.
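The two-queue idea above can be sketched as a router that picks a queue by request size. This is a minimal illustration using Python's standard `queue` module in place of a real message broker; the threshold `SMALL_REQUEST_MAX_USERS`, the queue names, and the function `enqueue_add_tag` are hypothetical.

```python
import queue

SMALL_REQUEST_MAX_USERS = 1_000  # hypothetical cutoff, to be tuned

fast_queue = queue.Queue()  # small, latency-sensitive requests
bulk_queue = queue.Queue()  # large batch requests


def enqueue_add_tag(tag_info, user_ids):
    """Route an AddTag request to a queue based on how many users it carries."""
    users = list(user_ids)
    request = (tag_info, users)
    if len(users) <= SMALL_REQUEST_MAX_USERS:
        fast_queue.put(request)
        return "fast"
    bulk_queue.put(request)
    return "bulk"


enqueue_add_tag("sports_fan", [1, 2, 3])      # goes to fast_queue
enqueue_add_tag("sports_fan", range(50_000))  # goes to bulk_queue
```

Separating the queues means a single request with millions of users waits behind other bulk work rather than delaying small requests.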
Database Choices
- Relational Database:
  - Can handle high QPS and supports complex queries.
- NoSQL Database:
  - Key-value store recommended for user-tag relationships.
  - Schema: the key is a Tag_name or a User_id; the value is an array of IDs or a pointer to a blob store.
- Blob Store:
  - Used for storing large objects, such as tags with millions of users.
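The key-value schema with a blob-store fallback can be sketched as follows. Plain dictionaries stand in for the key-value store and the blob store, and the `INLINE_LIMIT` threshold, the `tag:`/`user:` key prefixes, and the function names are assumptions made for illustration.

```python
INLINE_LIMIT = 10_000  # hypothetical: larger ID lists go to the blob store

kv_store = {}    # stand-in for the key-value store
blob_store = {}  # stand-in for the blob store


def put_ids(key, ids):
    """Store an ID list inline, or store a pointer to a blob if it is large."""
    id_list = sorted(ids)
    if len(id_list) <= INLINE_LIMIT:
        kv_store[key] = {"inline": id_list}
    else:
        blob_key = f"blob/{key}"
        blob_store[blob_key] = id_list
        kv_store[key] = {"blob_ref": blob_key}


def get_ids(key):
    """Return the ID list for a key, following the blob pointer if present."""
    entry = kv_store.get(key)
    if entry is None:
        return []
    if "inline" in entry:
        return entry["inline"]
    return blob_store[entry["blob_ref"]]


put_ids("user:42", [3, 1, 2])              # small: stored inline
put_ids("tag:sports_fan", range(20_000))   # large: stored via blob pointer
```

The sketch does not clean up an old blob when a key shrinks back below the limit; a real design would need to handle that.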
Consistency and Performance
- Read and Write Operations:
  - Reads are more frequent than writes; invalidating the cache on writes keeps cached reads from serving stale tags.
  - Large write requests (those containing millions of users) can slow the system and need to be isolated from latency-sensitive traffic.
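The interaction between cached reads and invalidation on write can be sketched with a cache-aside pattern. This is a simplified model, not the interview's design: dictionaries stand in for the database and the cache, the class name `CachedTagReader` and the `db_reads` counter are hypothetical, and only the user_id -> tag_ids direction is shown (the tag_id -> user_ids entry would be invalidated the same way).

```python
class CachedTagReader:
    """Cache-aside reads with invalidation on write (user -> tags only)."""

    def __init__(self):
        self.db = {}       # stand-in for the database: user_id -> set of tag_ids
        self.cache = {}    # user_id -> frozenset of tag_ids
        self.db_reads = 0  # counts cache misses, for illustration

    def get_tags(self, user_id):
        # Serve from the cache when possible.
        if user_id in self.cache:
            return self.cache[user_id]
        # Cache miss: read from the database and populate the cache.
        self.db_reads += 1
        tags = frozenset(self.db.get(user_id, ()))
        self.cache[user_id] = tags
        return tags

    def add_tag(self, tag_id, user_id):
        # Write to the database first, then drop the stale cache entry.
        self.db.setdefault(user_id, set()).add(tag_id)
        self.cache.pop(user_id, None)


reader = CachedTagReader()
reader.add_tag("sports_fan", 1)
reader.get_tags(1)  # miss: loads from db
reader.get_tags(1)  # hit: served from cache
```

Invalidating instead of updating the cache entry keeps the write path simple; the next read pays one database lookup to repopulate it.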
Feedback and Discussion Points
- Interviewer Feedback:
  - The candidate needed to establish clearer requirements and traffic estimates.
  - Defining a clear set of APIs up front could have streamlined the discussion.
- Audience Insights:
  - SQL databases are sufficient for many-to-many relationships, while NoSQL can handle larger write traffic more easily.
  - Defining schemas explicitly helps clarify design choices.
- Trade-offs:
  - Maintaining consistency across multiple tables can be challenging.
  - NoSQL databases offer easier partitioning but may complicate consistency management.
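To make the audience's point about many-to-many relationships and explicit schemas concrete, here is one possible relational layout, sketched with Python's built-in `sqlite3` module. The table and column names (`users`, `tags`, `user_tags`), the index, and the sample rows are assumptions; the interview did not specify a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY);
CREATE TABLE tags (
    tag_id   INTEGER PRIMARY KEY,
    tag_name TEXT UNIQUE NOT NULL
);
-- Join table: one row per (user, tag) pair.
CREATE TABLE user_tags (
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    tag_id  INTEGER NOT NULL REFERENCES tags(tag_id),
    PRIMARY KEY (user_id, tag_id)
);
-- Second index so that tag -> users lookups do not scan the table.
CREATE INDEX idx_user_tags_tag ON user_tags(tag_id, user_id);
""")
conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO tags VALUES (?, ?)",
                 [(10, "sports_fan"), (11, "frequent_buyer")])
conn.executemany("INSERT INTO user_tags VALUES (?, ?)",
                 [(1, 10), (2, 10), (1, 11)])


def tags_for_user(user_id):
    """Tag names for one user, sorted alphabetically."""
    rows = conn.execute(
        "SELECT t.tag_name FROM user_tags ut "
        "JOIN tags t ON t.tag_id = ut.tag_id "
        "WHERE ut.user_id = ? ORDER BY t.tag_name", (user_id,))
    return [row[0] for row in rows]


def users_for_tag(tag_name):
    """User IDs carrying one tag, sorted ascending."""
    rows = conn.execute(
        "SELECT ut.user_id FROM user_tags ut "
        "JOIN tags t ON t.tag_id = ut.tag_id "
        "WHERE t.tag_name = ? ORDER BY ut.user_id", (tag_name,))
    return [row[0] for row in rows]
```

A single join table serves both lookup directions, which avoids keeping two separate structures consistent, at the cost of harder partitioning once the table grows.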
Conclusion
The interview highlighted the complexities involved in designing an Ads Targeting System, particularly regarding user tagging, database choices, and performance considerations. Both SQL and NoSQL databases have their advantages and trade-offs, and the choice depends on specific use cases and requirements. The discussion underscored the importance of clear requirements and effective communication in system design.