June 11, 2023 7:00 PM PDT
This document outlines the system design considerations for a Google Calendar-like application, focusing on requirements, database design, API specifications, and optimization strategies.
Requirements
- Total users: 500M
- Daily Active Users (DAU): 100M
- Queries Per Second (QPS):
  - Write: 1,000 QPS
  - Read: 10,000 QPS
API Design
Booking Event
- Request Parameters:
  - Sender
  - Invitee
  - Title
  - Time
  - Location
  - Text
- Endpoint: POST /calendar-event
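As a rough illustration, the booking request payload could be modelled as below. The field names mirror the parameter list above; the Python types, the ISO 8601 time format, and the `validate` and `to_json` helpers are assumptions made for this sketch, not part of the design notes.

```python
from dataclasses import dataclass, asdict
from datetime import datetime
import json


@dataclass
class BookingRequest:
    # Field names follow the parameter list; the types are assumed.
    sender: str
    invitee: str
    title: str
    time: str       # assumed ISO 8601, e.g. "2023-06-11T19:00:00-07:00"
    location: str
    text: str

    def validate(self) -> None:
        # Hypothetical checks: required parties present, time parseable.
        if not self.sender or not self.invitee:
            raise ValueError("sender and invitee are required")
        datetime.fromisoformat(self.time)  # raises ValueError if malformed

    def to_json(self) -> str:
        # Serialize the request as a JSON object for the request body.
        return json.dumps(asdict(self))


request = BookingRequest(
    sender="alice@example.com",
    invitee="bob@example.com",
    title="Design review",
    time="2023-06-11T19:00:00-07:00",
    location="Room 4",
    text="Agenda: calendar schema",
)
request.validate()
body = request.to_json()
```

Supporting several invitees would likely mean turning `invitee` into a list, which ties into the considerations below.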
Considerations
- Handling anonymous users
- Supporting multiple invitees (email or ID)
- Read/write throughput for a single machine
- Updating meeting times
Database Design
Relational Database Schema
- Events Table:
  - eventID
  - sender
  - invitee
  - title (string)
  - time
  - location
  - text
- Normalization: Factor event details out into a separate table so that shared fields are not repeated across rows.
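One possible shape for the normalized schema is sketched below using SQLite from Python. The table names `events` and `event_invitees`, the column types, and the Unix-timestamp encoding of `time` are assumptions; the notes only list the fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Event details live in one row per event.
CREATE TABLE events (
    event_id INTEGER PRIMARY KEY,
    sender   TEXT NOT NULL,
    title    TEXT NOT NULL,
    time     INTEGER NOT NULL,   -- assumed: Unix timestamp in seconds
    location TEXT,
    text     TEXT
);
-- One row per (event, invitee); no event details are repeated here.
CREATE TABLE event_invitees (
    event_id INTEGER NOT NULL REFERENCES events(event_id),
    invitee  TEXT NOT NULL,
    PRIMARY KEY (event_id, invitee)
);
""")

conn.execute(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)",
    (1, "alice", "Design review", 1_686_535_200, "Room 4", "Agenda"),
)
conn.executemany(
    "INSERT INTO event_invitees VALUES (?, ?)",
    [(1, "bob"), (1, "carol")],
)

# Joining the two tables reconstructs the per-invitee view.
rows = conn.execute("""
    SELECT e.title, i.invitee
    FROM events e JOIN event_invitees i ON i.event_id = e.event_id
    ORDER BY i.invitee
""").fetchall()
```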
Data Volume Calculation
- Estimated data volume:
  - 1.5 TB per year
  - 15 TB over 10 years
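The notes do not state a row size, so the arithmetic below only works backwards from the stated numbers. It assumes the 1,000 write QPS is a sustained average, that every write adds one row, and that TB means 10^12 bytes; under those assumptions the estimate implies a little under 50 bytes per write.

```python
WRITE_QPS = 1_000                  # from the requirements
SECONDS_PER_YEAR = 86_400 * 365
YEARLY_BYTES = 1.5e12              # 1.5 TB per year, decimal terabytes (assumed)

# Rows written per year if the write rate is sustained.
writes_per_year = WRITE_QPS * SECONDS_PER_YEAR

# Average bytes per write implied by the yearly estimate.
implied_bytes_per_write = YEARLY_BYTES / writes_per_year

# Ten-year volume, ignoring growth, deletes, and index overhead.
ten_year_bytes = YEARLY_BYTES * 10
```

A larger assumed row size or additional index storage would raise the yearly total proportionally.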
Primary Key Definition
- Primary Key: sender + eventID
- Index: dayID
- Composite Index: participant + dayID
Query Optimization
- Range query for meetings:
  SELECT * FROM meetings WHERE time > from_ts AND time < to_ts
- Use indices (for example on time or dayID) so that range queries avoid full table scans.
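The composite index and the range query above can be tried out with SQLite as a minimal sketch. The column names, the index names, and the definition of `day_id` as days since the Unix epoch are assumptions; the notes do not say how dayID is derived.

```python
import sqlite3

SECONDS_PER_DAY = 86_400

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE meetings (
    event_id    INTEGER NOT NULL,
    participant TEXT NOT NULL,
    day_id      INTEGER NOT NULL,  -- assumed: days since the Unix epoch
    time        INTEGER NOT NULL   -- assumed: Unix timestamp in seconds
);
CREATE INDEX idx_meetings_participant_day ON meetings (participant, day_id);
CREATE INDEX idx_meetings_time ON meetings (time);
""")


def add_meeting(event_id: int, participant: str, ts: int) -> None:
    # day_id is derived from the timestamp at write time.
    conn.execute(
        "INSERT INTO meetings VALUES (?, ?, ?, ?)",
        (event_id, participant, ts // SECONDS_PER_DAY, ts),
    )


def meetings_between(from_ts: int, to_ts: int) -> list:
    # Same predicate as the range query in the notes, with bound parameters.
    return conn.execute(
        "SELECT event_id FROM meetings WHERE time > ? AND time < ? ORDER BY time",
        (from_ts, to_ts),
    ).fetchall()


def meetings_for_day(participant: str, day_id: int) -> list:
    # Equality lookup that the (participant, day_id) index is meant to serve.
    return conn.execute(
        "SELECT event_id FROM meetings WHERE participant = ? AND day_id = ?",
        (participant, day_id),
    ).fetchall()


add_meeting(1, "alice", 1_686_535_200)
add_meeting(2, "alice", 1_686_621_600)
add_meeting(3, "bob", 1_686_538_800)
```

Note that the strict inequalities exclude a meeting that starts exactly at `from_ts`; whether that is intended depends on how callers choose the bounds.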
Handling Multiple Invitees
- An event with multiple invitees may be stored as multiple rows, one per invitee.
- Changing an event's time may then require updating every one of those rows.
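To make the cost of the one-row-per-invitee layout concrete, the sketch below reschedules an event inside a single transaction and reports how many rows were touched. The `invitee_events` table and the `reschedule` helper are hypothetical names introduced for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE invitee_events (
        event_id INTEGER NOT NULL,
        invitee  TEXT NOT NULL,
        time     INTEGER NOT NULL,  -- duplicated on every invitee row
        PRIMARY KEY (event_id, invitee)
    )
""")
conn.executemany(
    "INSERT INTO invitee_events VALUES (?, ?, ?)",
    [(1, "bob", 1000), (1, "carol", 1000), (2, "bob", 5000)],
)


def reschedule(event_id: int, new_time: int) -> int:
    # `with conn` commits on success and rolls back on error, so either
    # every invitee row sees the new time or none does.
    with conn:
        cursor = conn.execute(
            "UPDATE invitee_events SET time = ? WHERE event_id = ?",
            (new_time, event_id),
        )
    return cursor.rowcount


updated = reschedule(1, 2000)
```

If the time lived only in a separate event detail table, the same change would touch a single row.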
Email Notification System
- Send email notifications to invitees who are not registered users of the system.
- Use a meeting processor to apply event changes asynchronously.
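A meeting processor could be as simple as a worker that drains a queue of event changes and emails external invitees. The sketch below uses an in-process queue and a fake mail function; the message shape, the `send_email` stand-in, and the `None` shutdown sentinel are all assumptions, and a production system would more likely use a durable message broker.

```python
import queue
import threading

change_queue = queue.Queue()
sent_emails = []  # stand-in for a real mail gateway


def send_email(address: str, message: str) -> None:
    # Hypothetical sender; records the message instead of delivering it.
    sent_emails.append((address, message))


def meeting_processor() -> None:
    # Drain changes until the None sentinel arrives.
    while True:
        change = change_queue.get()
        try:
            if change is None:
                return
            for address in change["external_invitees"]:
                send_email(
                    address,
                    f"{change['title']} moved to {change['new_time']}",
                )
        finally:
            change_queue.task_done()


worker = threading.Thread(target=meeting_processor, daemon=True)
worker.start()

# The API handler only enqueues the change and returns immediately.
change_queue.put({
    "title": "Design review",
    "new_time": "2023-06-12T10:00",
    "external_invitees": ["dana@example.org"],
})
change_queue.put(None)   # ask the worker to stop
change_queue.join()      # wait until every queued item is processed
worker.join()
```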
Database Technology Considerations
- Relational vs. non-relational databases:
  - High throughput and data volume may necessitate sharding by time range.
  - A non-relational database may be a better fit for very large event tables.
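Sharding by time range can be sketched as a function from a timestamp to a shard name. The monthly granularity and the `events_YYYY_MM` naming are assumptions chosen for illustration; the notes do not fix a range size.

```python
from datetime import datetime, timezone


def shard_for_timestamp(ts: int) -> str:
    # Map a Unix timestamp to the (assumed) monthly shard that holds it.
    moment = datetime.fromtimestamp(ts, tz=timezone.utc)
    return f"events_{moment.year}_{moment.month:02d}"


def shards_for_range(from_ts: int, to_ts: int) -> list:
    # A range query must visit every monthly shard the window touches.
    start = datetime.fromtimestamp(from_ts, tz=timezone.utc)
    end = datetime.fromtimestamp(to_ts, tz=timezone.utc)
    year, month = start.year, start.month
    shards = []
    while (year, month) <= (end.year, end.month):
        shards.append(f"events_{year}_{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return shards
```

One trade-off of this scheme is that the shard for the current period receives most of the write traffic, while older shards become mostly read-only.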
Conflict Resolution
- Addressing conflicts when multiple users have meetings at the same time.
- Gathering requirements for supporting meeting rooms.
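Detecting conflicts on one calendar (a user's or a meeting room's) reduces to finding overlapping intervals. The sketch below assumes each meeting has an explicit start and end, which the schema in these notes does not yet include, and treats intervals as half-open so that back-to-back meetings do not conflict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meeting:
    event_id: int
    start: int  # assumed: minutes since midnight, inclusive
    end: int    # assumed: minutes since midnight, exclusive


def find_conflicts(meetings: list) -> list:
    # Sort by start time; a later meeting conflicts with the current one
    # only if it starts before the current one ends.
    ordered = sorted(meetings, key=lambda m: m.start)
    conflicts = []
    for i, current in enumerate(ordered):
        for later in ordered[i + 1:]:
            if later.start >= current.end:
                break  # everything after this starts even later
            conflicts.append((current.event_id, later.event_id))
    return conflicts


calendar = [
    Meeting(1, 540, 600),   # 09:00 to 10:00
    Meeting(2, 570, 630),   # 09:30 to 10:30
    Meeting(3, 600, 660),   # 10:00 to 11:00
]
conflicts = find_conflicts(calendar)
```

How a detected conflict is resolved (reject, warn, or allow double-booking) is a product decision that the requirements would need to settle.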
Additional Considerations
- Sharding strategies using user ID or invitee ID as partition keys.
- Microservices architecture for better scalability and maintenance.
- Strategies for syncing data to clients and indexing the database effectively.
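Using the user ID as the partition key could look like the following sketch. The shard count of 64 and the choice of SHA-256 are assumptions; the point is only that the mapping must be stable across processes, which Python's built-in `hash` is not for strings.

```python
import hashlib

NUM_SHARDS = 64  # assumed shard count


def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    # Hash the ID with a stable digest, then reduce it to a shard index.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

With this layout all of a user's events land on one shard, so reading a single calendar hits one machine, while an event with many invitees may need to be written to several shards. Simple modulo placement also remaps most users when the shard count changes.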