July 10, 2022 7:00 PM PDT
The meeting focused on the system design of a platform similar to Yelp, discussing functional and non-functional requirements, system architecture, and potential challenges. The interviewee demonstrated their understanding of geospatial data handling, database design, and scalability considerations.
Requirements
Functional Requirements
- Design a Yelp-like platform that returns businesses near a user's current location
- Return empty results if no businesses are found
- Support radius choices: 1, 5, and 10 miles
- Allow business owners to update their information; update traffic is low in queries per second (QPS)
Non-Functional Requirements
- Support 100 million daily active users (DAU)
- Handle 200 million businesses
- Maintain low latency (results within 500 ms when the user changes the search radius)
- Reflect business updates in search results within minutes, ideally
- Ensure high scalability and durability
- Handle a peak of roughly 10,000 QPS (100 million DAU × 2 requests per user per day ≈ 2,300 QPS on average, multiplied by 5 for the peak hour)
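The QPS figure above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch; the function name `estimate_peak_qps` is made up for illustration, and the inputs are the numbers recorded in these notes.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_peak_qps(dau: int, requests_per_user_per_day: float, peak_factor: float) -> float:
    """Average QPS spread evenly over a day, scaled by a peak multiplier."""
    average_qps = dau * requests_per_user_per_day / SECONDS_PER_DAY
    return average_qps * peak_factor


# Figures from the notes: 100 million DAU, 2 requests per user per day, 5x peak.
peak_qps = estimate_peak_qps(100_000_000, 2, 5)
print(f"estimated peak QPS: {peak_qps:,.0f}")
```

The result lands a little above 11,000, which the notes round down to 10,000 for planning purposes.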
System Design
External APIs
- User API: get_biz_from(cur_location, radius)
- Business owner API (CRUD): update(biz_id, location, biz_info)
Geospatial Data Handling
- Geohashing:
- Recursively divide the map into quadrants, each identified by a 2-bit prefix (00, 01, 11, 10).
- Use varying lengths of geohash for different radius searches (e.g., length 6 for 1 mile, length 5 for 5 miles, etc.).
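To make the prefix idea concrete, here is a minimal sketch of the standard geohash encoding (interleaved longitude/latitude bits, 5 bits per base-32 character). The function name `geohash_encode` and the `RADIUS_TO_LENGTH` table are illustrative; the table holds only the two radius mappings recorded in these notes, since the 10-mile value was not written down.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

# Radius (miles) -> geohash length, as recorded in the notes.
RADIUS_TO_LENGTH = {1: 6, 5: 5}


def geohash_encode(lat: float, lon: float, length: int = 6) -> str:
    """Encode a latitude/longitude pair as a geohash of the given length."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # geohash bits alternate, starting with longitude
    while len(chars) < length:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        bits <<= 1
        if value >= mid:
            bits |= 1      # value is in the upper half of the range
            rng[0] = mid
        else:
            rng[1] = mid   # value is in the lower half of the range
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


fine = geohash_encode(57.64911, 10.40744, RADIUS_TO_LENGTH[1])
coarse = geohash_encode(57.64911, 10.40744, RADIUS_TO_LENGTH[5])
print(fine, coarse)
```

Because a shorter geohash is always a prefix of the longer one for the same point, a wider search only needs to truncate the string.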
Database Design
- Storage:
- Use Redis for caching business IDs associated with geohashes.
- Store geohashes in a relational database with a simple structure.
- Querying:
- Use a SQL query on the geohash to find nearby businesses.
- When too few results come back, retry with a shorter (broader) geohash.
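The query-then-broaden approach could look roughly like the sketch below, which uses an in-memory SQLite database so it runs on its own. The table name `biz_geohash`, its columns, the function `find_nearby`, and the `min_results` and `min_prefix_len` parameters are all assumptions for illustration; the notes do not record the actual schema.

```python
import sqlite3


def find_nearby(conn, geohash, min_results=1, min_prefix_len=4):
    """Return business IDs whose geohash starts with the given prefix.

    If too few rows match, drop the last character (a broader cell) and retry,
    stopping once the prefix reaches min_prefix_len.
    """
    prefix = geohash
    while True:
        rows = conn.execute(
            "SELECT biz_id FROM biz_geohash WHERE geohash LIKE ? ORDER BY biz_id",
            (prefix + "%",),
        ).fetchall()
        if len(rows) >= min_results or len(prefix) <= min_prefix_len:
            return [row[0] for row in rows]
        prefix = prefix[:-1]


# Hypothetical sample data: three businesses in nearby cells.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE biz_geohash (geohash TEXT, biz_id INTEGER)")
conn.executemany(
    "INSERT INTO biz_geohash VALUES (?, ?)",
    [("9q8yyk", 1), ("9q8yym", 2), ("9q8zn0", 3)],
)
# Asking for at least two results forces one broadening step, to "9q8yy".
print(find_nearby(conn, "9q8yyk", min_results=2))
```

An area with no businesses at all still returns an empty list once the shortest allowed prefix has been tried, which matches the functional requirement on empty results.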
Memory Management
- Estimate memory requirements for storing geohashes and business IDs.
- Consider sharding the storage by regions to manage memory effectively.
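A rough version of the memory estimate, under loudly stated assumptions: 8-byte business IDs, 1 byte per geohash character, one entry per business per stored precision, and no allowance for Redis or index overhead. The geohash lengths 4, 5, and 6 are also an assumption (only lengths 5 and 6 appear in the notes), and `estimate_index_bytes` is a made-up helper name.

```python
def estimate_index_bytes(num_businesses, geohash_lengths, biz_id_bytes=8):
    """Raw payload only: one (geohash, biz_id) pair per business per precision."""
    bytes_per_business = sum(length + biz_id_bytes for length in geohash_lengths)
    return num_businesses * bytes_per_business


NUM_BUSINESSES = 200_000_000  # from the non-functional requirements

# Assumed: three precisions stored per business vs. only the longest one.
all_precisions = estimate_index_bytes(NUM_BUSINESSES, (4, 5, 6))
longest_only = estimate_index_bytes(NUM_BUSINESSES, (6,))
print(f"all precisions: {all_precisions / 1e9:.1f} GB")
print(f"longest only:   {longest_only / 1e9:.1f} GB")
```

Real memory use would be higher once per-key overhead is included, which is one reason sharding by region came up in the discussion.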
Interviewer and Audience Feedback
Interviewer Feedback
- Overall performance was satisfactory, but there were areas for improvement:
- Soft skills: Need to be more concise in communication and requirement gathering.
- Technical skills: Could simplify database entries by storing only the longest geohash for each business, since every shorter geohash is a prefix of it.
- Suggested using data structures like B+ trees or hash tables for efficiency.
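The interviewer's two suggestions fit together: if only the longest geohash is stored, an ordered index can answer any shorter-prefix query with a range scan. The sketch below stands in for a B+ tree with a sorted Python list and `bisect`; the class name `GeohashIndex` and its methods are hypothetical, not something presented in the session.

```python
import bisect


class GeohashIndex:
    """Sorted (geohash, biz_id) pairs; a stand-in for a B+ tree's ordered keys."""

    def __init__(self):
        self._entries = []

    def add(self, geohash, biz_id):
        # Keep entries sorted so that prefix matches are contiguous.
        bisect.insort(self._entries, (geohash, biz_id))

    def query_prefix(self, prefix):
        # Every key starting with `prefix` sorts between (prefix,) and
        # (prefix + "\uffff",), so two binary searches bound the range.
        lo = bisect.bisect_left(self._entries, (prefix,))
        hi = bisect.bisect_left(self._entries, (prefix + "\uffff",))
        return [biz_id for _, biz_id in self._entries[lo:hi]]


index = GeohashIndex()
for geohash, biz_id in [("9q8yyk", 1), ("9q8yym", 2), ("9q8zn0", 3), ("dr5ru7", 4)]:
    index.add(geohash, biz_id)

print(index.query_prefix("9q8yy"))  # narrower cell
print(index.query_prefix("9q8"))    # broader cell, same stored data
```

A hash table, by contrast, supports only exact-key lookups, so it would need one entry per precision or a lookup per candidate cell.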
Audience Feedback
- Emphasized the importance of handling boundary conditions and the maximum number of results returned.
- Suggested using a quadtree for managing varying densities of restaurants and precomputing nearby boxes.
- Discussed Redis capabilities for geospatial queries and sorting based on overlapping bits.
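The quadtree suggestion can be illustrated with a small sketch: cells split only where businesses are dense, so sparse regions stay coarse. Everything here (the `QuadTree` class, its `capacity`, `MAX_DEPTH`, and the `limit` argument that caps the number of results) is an assumed design for illustration, and queries use a simple longitude/latitude bounding box rather than a true radius.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

MAX_DEPTH = 20  # assumed cap so identical points cannot split forever


@dataclass
class QuadTree:
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float
    capacity: int = 4
    depth: int = 0
    points: List[Tuple[float, float, int]] = field(default_factory=list)
    children: List["QuadTree"] = field(default_factory=list)

    def contains(self, lon, lat):
        return self.min_lon <= lon <= self.max_lon and self.min_lat <= lat <= self.max_lat

    def insert(self, lon, lat, biz_id):
        if not self.contains(lon, lat):
            return False
        if not self.children:
            self.points.append((lon, lat, biz_id))
            if len(self.points) > self.capacity and self.depth < MAX_DEPTH:
                self._split()
            return True
        # any() stops at the first child that accepts the point.
        return any(child.insert(lon, lat, biz_id) for child in self.children)

    def _split(self):
        mid_lon = (self.min_lon + self.max_lon) / 2
        mid_lat = (self.min_lat + self.max_lat) / 2
        boxes = [
            (self.min_lon, self.min_lat, mid_lon, mid_lat),
            (mid_lon, self.min_lat, self.max_lon, mid_lat),
            (self.min_lon, mid_lat, mid_lon, self.max_lat),
            (mid_lon, mid_lat, self.max_lon, self.max_lat),
        ]
        self.children = [
            QuadTree(*box, capacity=self.capacity, depth=self.depth + 1) for box in boxes
        ]
        pending, self.points = self.points, []
        for lon, lat, biz_id in pending:
            any(child.insert(lon, lat, biz_id) for child in self.children)

    def query(self, min_lon, min_lat, max_lon, max_lat, limit=None):
        """Return IDs inside the box, at most `limit` of them if given."""
        if (max_lon < self.min_lon or min_lon > self.max_lon
                or max_lat < self.min_lat or min_lat > self.max_lat):
            return []  # query box does not overlap this cell
        found = [
            biz_id for lon, lat, biz_id in self.points
            if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
        ]
        for child in self.children:
            found.extend(child.query(min_lon, min_lat, max_lon, max_lat, limit))
        return found if limit is None else found[:limit]


tree = QuadTree(-180.0, -90.0, 180.0, 90.0, capacity=2)
dense = [(-122.40, 37.77), (-122.41, 37.78), (-122.42, 37.76), (-122.39, 37.79), (-122.43, 37.75)]
for biz_id, (lon, lat) in enumerate(dense, start=1):
    tree.insert(lon, lat, biz_id)
tree.insert(2.35, 48.85, 6)  # a lone business far from the dense cluster
print(sorted(tree.query(-122.5, 37.7, -122.3, 37.8)))
```

Because the query walks every cell that overlaps the box, a point just across a cell boundary is still found, which addresses the boundary-condition concern without a separate neighbor lookup.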
Conclusion
The meeting provided a comprehensive overview of the design considerations for a Yelp-like platform, highlighting the importance of efficient data handling, scalability, and user experience. The feedback received will aid in refining both technical and soft skills for future interviews.