September 4, 2023 7:00 PM PDT
This document summarizes the discussion of the design and features of an inventory management system built on Azure Cosmos DB. The meeting covered database architecture, consistency models, and comparisons with other database solutions.
System Design Presentation - Azure Cosmos DB
Key Features
- Fully Managed: Azure operates, patches, and scales the service, so it needs minimal maintenance.
- High Resiliency: Offers a 99.999% availability SLA (for multi-region accounts).
- Low Latency: Offers less than 10ms read/write latency at P99.
- Elastic Scale:
  - Throughput: Supports from 100 requests per second (rps) to over a trillion rps.
  - Storage: Ranges from 50GB to petabytes (PB).
- Global Distribution: Capable of operating across all Azure regions.
- Tunable Consistency: Options, from weakest to strongest:
  - Eventual
  - Consistent prefix
  - Session
  - Bounded staleness
  - Strong
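The ordering of the five levels can be made concrete with a small model. The sketch below is illustrative only and does not use the Azure SDK; the names `Consistency`, `satisfies`, and `can_override` are hypothetical. It assumes that a single request may relax the account's default level but not strengthen it.

```python
from enum import IntEnum


class Consistency(IntEnum):
    """The five levels listed above, ordered weakest to strongest."""
    EVENTUAL = 1
    CONSISTENT_PREFIX = 2
    SESSION = 3
    BOUNDED_STALENESS = 4
    STRONG = 5


def satisfies(configured: Consistency, required: Consistency) -> bool:
    """True if the configured level is at least as strong as the required one."""
    return configured >= required


def can_override(account_default: Consistency, requested: Consistency) -> bool:
    """Assumption: a request may only ask for the default level or a weaker one."""
    return requested <= account_default
```

For an inventory workload, a stock-count read might require `SESSION` so that a user sees their own writes, while a dashboard could accept `EVENTUAL`.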
Architecture
- Components:
  - Physical hierarchy: Azure regions, data centers, stamps, fault domains, clusters, and machines.
  - Cosmos DB database engine, which clients reach in either direct mode or gateway mode.
  - Service Fabric, similar to Kubernetes, for orchestration.
  - MasterService for metadata management (similar to etcd).
  - ServiceService for handling real data.
Resiliency and Partitioning
- Horizontal Partitioning: Each partition has a maximum size of 50GB.
- Replica Sets: Each set contains four replicas, one of which is the leader. Writes go to the leader, which propagates the changes to the other replicas.
- Global Distribution: Peer-to-peer links can be established between all nodes.
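As a rough illustration of horizontal partitioning under the 50GB cap, the sketch below estimates how many partitions a data set needs and routes a partition key to one of them. The modulo-of-hash routing is a simplifying assumption made for this example, not a description of how Cosmos DB assigns key ranges, and the function names are hypothetical.

```python
import hashlib
import math

MAX_PARTITION_GB = 50  # per-partition size cap stated in the notes


def partitions_needed(total_gb: float) -> int:
    """Minimum number of partitions required to hold total_gb of data."""
    return max(1, math.ceil(total_gb / MAX_PARTITION_GB))


def route(partition_key: str, partition_count: int) -> int:
    """Map a key to a partition index with a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


if __name__ == "__main__":
    count = partitions_needed(120)  # 120GB of inventory data -> 3 partitions
    print(count, route("warehouse-17", count))
```

A stable hash matters here: the same partition key must always land on the same partition, which Python's built-in `hash()` does not guarantee across processes for strings.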
Data Structures and Indexing
- Unique Data Structures:
  - B+ tree: The traditional choice. It works well at low write throughput, but writes slow down when many indexes must be maintained.
  - Bw-tree: Offers lock-free updates using compare-and-swap (CAS).
Conflict Resolution
- Write Conflicts: Strong consistency can prevent write conflicts. Classic B-trees require locking, which can slow performance.
- Delta Writes: To improve throughput, each update is written as a delta on top of the base page rather than modifying the page in place. Conflicts between concurrent writers are detected when the compare-and-swap fails.
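The delta-write idea can be sketched in a few lines. This is a simplified model, not the Bw-tree as implemented in Cosmos DB: it omits page consolidation, splits, and the base page itself, and because Python has no hardware compare-and-swap, a lock inside `compare_and_swap` stands in for the atomic instruction. All class and function names are hypothetical.

```python
import threading
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Delta:
    """One update record; `next` points at the older state of the page."""
    key: str
    value: int
    next: Optional["Delta"]


class MappingTable:
    """Maps a page id to the head of that page's delta chain."""

    def __init__(self) -> None:
        self._slots: Dict[int, Optional[Delta]] = {}
        self._guard = threading.Lock()  # stand-in for an atomic CAS instruction

    def load(self, page_id: int) -> Optional[Delta]:
        return self._slots.get(page_id)

    def compare_and_swap(self, page_id: int, expected: Optional[Delta],
                         new: Delta) -> bool:
        """Install `new` only if the slot still holds `expected`."""
        with self._guard:
            if self._slots.get(page_id) is not expected:
                return False  # another writer got there first
            self._slots[page_id] = new
            return True


def write(table: MappingTable, page_id: int, key: str, value: int) -> int:
    """Prepend a delta, retrying on conflict. Returns the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        head = table.load(page_id)
        if table.compare_and_swap(page_id, head, Delta(key, value, head)):
            return attempts


def read(table: MappingTable, page_id: int, key: str) -> Optional[int]:
    """Walk the chain from newest to oldest; the first match wins."""
    node = table.load(page_id)
    while node is not None:
        if node.key == key:
            return node.value
        node = node.next
    return None
```

Writers never block each other while building their delta; a writer that loses the race simply reloads the head and tries again, which is the conflict detection the notes refer to.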
Comparison with DynamoDB
- DynamoDB Features:
  - Offers an optional sort key and supports global secondary indexes (the default quota is 20 per table).
- Cosmos DB, by comparison, supports indexing on all JSON fields and offers more query types.
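To show what "indexing on all JSON fields" means in practice, the sketch below flattens each document into (path, value) pairs and keeps an inverted index over them, so any field can be queried without declaring an index up front. It is a toy model under that assumption; the path syntax and the `PathIndex` class are hypothetical and do not reflect the Cosmos DB index format.

```python
from collections import defaultdict
from typing import Any, Dict, Iterator, Set, Tuple


def flatten(doc: Any, prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield a (path, scalar) pair for every leaf of a JSON-like document."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from flatten(value, f"{prefix}/{key}")
    elif isinstance(doc, list):
        for item in doc:
            yield from flatten(item, f"{prefix}/[]")
    else:
        yield prefix, doc


class PathIndex:
    """Inverted index from (path, value) to the ids of matching documents."""

    def __init__(self) -> None:
        self._postings: Dict[Tuple[str, Any], Set[str]] = defaultdict(set)

    def add(self, doc_id: str, doc: dict) -> None:
        for path, value in flatten(doc):
            self._postings[(path, value)].add(doc_id)

    def lookup(self, path: str, value: Any) -> Set[str]:
        # Return a copy so callers cannot mutate the index.
        return set(self._postings.get((path, value), set()))


if __name__ == "__main__":
    index = PathIndex()
    index.add("a", {"sku": "X1", "warehouse": {"city": "Seattle"},
                    "tags": ["fragile", "cold"]})
    index.add("b", {"sku": "X2", "warehouse": {"city": "Seattle"}})
    print(index.lookup("/warehouse/city", "Seattle"))
```

The trade-off is write cost: every field of every document produces an index entry, which is one reason a write-optimized index structure matters.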
Pros and Cons of NoSQL vs SQL Databases
- NoSQL:
  - Pros: Flexible schema design, horizontal scaling, performance tuning based on workloads.
  - Cons: Often favors availability over consistency, limited query capabilities (e.g., no joins).
- SQL:
  - Pros: ACID compliance, structured schema, supports complex queries (joins).
  - Cons: Performance bottlenecks with large data sets.
Use Cases
- SQL Use Cases: E-commerce, financial systems, content management systems.
- NoSQL Use Cases:
  - Social media, graph data, chat applications, big data (e.g., Cassandra for Netflix).
  - High write-throughput applications (e.g., Philips Hue using DynamoDB).
  - Hybrid cases in gaming (e.g., Redis for leaderboards) and e-commerce (personal recommendations).
This document serves as a comprehensive overview of the key discussions and technical insights shared during the meeting regarding the inventory management system and its underlying database architecture.