January 23, 2022 7:00 PM PST
This document summarizes a mock system design interview focused on an Ads Logging system. The interview covered functional and non-functional requirements, system design, and various technical challenges related to logging ad impressions, clicks, and conversions.
Ads impression logging study notes
Interview Overview
- Target Level: L5
- Duration: 45 minutes
- Topic Covered: Ads Logging
- Drawing Tool Used: Whimsical
Requirements
Functional Requirements
- Generic ads logging system
- Advertiser: Provides ads and aims for user conversion
- Publisher: Displays ads
- Logged data influences the ranking of the ads
- Log publisher page information and end user information
- Events to log: display, click, conversion
Non-Functional Requirements
- Scalability
- Low latency
- High availability
- Latency requirements that vary by use case (for example, real-time display counts versus hourly or daily analysis)
System Design
System Flow
- Client loads app/web page.
- Client calls the ad selector, which chooses the ad to inject into the page.
- Client retrieves page information and end user information.
User Identification
- If the user is authenticated, the system can use the device ID and email address to identify the user.
- The ads platform may be a third-party service, which complicates user tracking.
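As a rough illustration of the identification step, the sketch below prefers a logged-in identity and falls back to the device. Everything here is an assumption made for illustration and was not specified in the interview: the helper name `resolve_user_key`, the SHA-256 hashing used to avoid logging raw identifiers, and the `anon` fallback.

```python
import hashlib
from typing import Optional


def resolve_user_key(email: Optional[str], device_id: Optional[str]) -> str:
    """Return a stable, pseudonymous key for the end user (hypothetical helper)."""
    if email:
        # Normalize so "A@B.com " and "a@b.com" map to the same key.
        normalized = email.strip().lower()
        return "email:" + hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    if device_id:
        # No authenticated identity: fall back to the device.
        return "device:" + hashlib.sha256(device_id.encode("utf-8")).hexdigest()
    # Neither identifier is available (for example, a third-party context).
    return "anon"
```

When the ads platform is a third party, it may only ever see the device or anonymous branch, which is one reason tracking is harder in that setup.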
Data Persistence and Scalability
- Log data is persisted for analysis (hourly, daily, monthly).
- Real-time data retrieval for specific ads to track display counts.
- Message queues (e.g., Kafka) are used to ensure scalability under high queries-per-second (QPS) loads.
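The hourly analysis mentioned above can be pictured as grouping events into time buckets. This is a minimal sketch, assuming each event arrives as a `(unix_timestamp, ad_id, event_type)` tuple; the tuple layout and the function names `hour_bucket` and `rollup_hourly` are illustrative and not from the interview.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable, Tuple


def hour_bucket(ts: float) -> str:
    """Truncate a Unix timestamp to its UTC hour, e.g. '1970-01-01T00:00Z'."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H:00Z")


def rollup_hourly(events: Iterable[Tuple[float, str, str]]) -> Counter:
    """Count events per (hour, ad_id, event_type)."""
    counts: Counter = Counter()
    for ts, ad_id, event_type in events:
        counts[(hour_bucket(ts), ad_id, event_type)] += 1
    return counts
```

Daily and monthly views could be derived by summing the hourly buckets rather than rescanning raw logs, although the interview did not settle on a specific approach.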
Cache and Database Interaction
- Data is first sent to cache, then flushed to the database.
- If the cache fails, unflushed data can be recovered by replaying the message queue from the relevant timestamp.
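The cache-then-flush pattern and its recovery path might look roughly like the following. This is a sketch under stated assumptions: the class `CountingCache`, the function `recover`, and the in-memory `Counter` standing in for the database are all hypothetical, and timestamps are assumed to be strictly increasing. A production system would more likely checkpoint queue offsets than timestamps.

```python
from collections import Counter
from typing import Iterable, Tuple


class CountingCache:
    """In-memory per-ad counts that are periodically flushed to a database."""

    def __init__(self) -> None:
        self.pending: Counter = Counter()
        self.max_seen_ts = 0.0

    def add(self, ts: float, ad_id: str) -> None:
        self.pending[ad_id] += 1
        self.max_seen_ts = max(self.max_seen_ts, ts)

    def flush(self, database: Counter) -> float:
        """Add pending counts to the database and return the checkpoint timestamp."""
        database.update(self.pending)  # Counter.update adds counts
        self.pending.clear()
        return self.max_seen_ts


def recover(queue_log: Iterable[Tuple[float, str]], checkpoint_ts: float) -> CountingCache:
    """Rebuild a lost cache by replaying queue events newer than the checkpoint."""
    cache = CountingCache()
    for ts, ad_id in queue_log:
        if ts > checkpoint_ts:
            cache.add(ts, ad_id)
    return cache
```

The checkpoint returned by `flush` has to be stored durably alongside the database write; otherwise a replay could double count or skip events.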
Event Processing
- Event schema includes event type, page ID, ad ID, and count.
- Sharding key can be based on ad ID to enhance scalability.
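- The message queue preserves the sequence of events within a partition (Kafka guarantees ordering per partition, not across a whole topic), so events that share a sharding key are consumed in order.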
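The event schema and ad-ID sharding described above can be sketched as follows. The field types, the `AdEvent` and `partition_for` names, and the MD5-based hash are assumptions for illustration; in particular, this is not Kafka's own partitioner, only a stand-in for any stable hash of the key.

```python
import hashlib
from dataclasses import dataclass

EVENT_TYPES = {"display", "click", "conversion"}


@dataclass(frozen=True)
class AdEvent:
    event_type: str  # one of EVENT_TYPES
    page_id: str
    ad_id: str
    count: int = 1

    def __post_init__(self) -> None:
        if self.event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {self.event_type}")


def partition_for(ad_id: str, num_partitions: int) -> int:
    """Map an ad ID to a partition with a hash that is stable across processes."""
    # Python's built-in hash() is randomized per process for strings, so use a digest.
    digest = hashlib.md5(ad_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions
```

Because every event for a given ad lands on the same partition, per-ad counts can be computed by a single consumer in order. The trade-off is that a very popular ad can overload one partition, which is one implication of the sharding key choice worth raising in the interview.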
Interviewer and Audience Feedback
Key Points
- The interviewee provided a workable solution but needed to clarify requirements and assumptions.
- Importance of understanding user tracking methods and the differences between mobile and web tracking.
- Discussion on stream processing versus batch processing, emphasizing the need for low latency in certain scenarios.
- The interviewee should have asked more clarifying questions to ensure alignment with the interviewer's expectations.
Technical Challenges
- Identifying user types and logging information effectively.
- Handling high throughput with Kafka for real-time classification.
- Ensuring correctness in logging user behavior and ad delivery.
- Understanding the implications of using different sharding keys for performance.
Conclusion
The interview highlighted the complexities involved in designing an ads logging system, including user identification, data persistence, and processing methodologies. Continuous clarification of requirements and assumptions is crucial for successful system design.