YouTube Design

Problems

Storage?
Scalability?
Web Server?
Cache?


Web Server
* Multiple web servers behind a load balancer
* User authentication
* Sessions
* Fetching and updating user data

Video Server
* Multiple servers behind a load balancer

Recommendation Server
* Multiple servers behind a load balancer
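
Each server pool above sits behind a load balancer. As a rough illustration, here is a minimal round-robin sketch. Round-robin is only one possible policy and is an assumption here, and the class name and server names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backend servers in a fixed rotating order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)

    def next_server(self):
        # Each call returns the next server in the rotation.
        return next(self._servers)


# Hypothetical pool of web servers.
web_pool = RoundRobinBalancer(["web-1", "web-2", "web-3"])
picks = [web_pool.next_server() for _ in range(4)]
```

A real balancer would also track server health and load; this only shows how requests get spread across the pool.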


Storage

MySQL Database:
Let's assume videos have comments and likes, and comments do not have likes.

Table: UserDetails
Fields: UserId, Name, Email, Reg Date, Profile Information (Address, Age etc.)

Table: User
Fields: UserId, Email, Username, Password, Reg Date

Table: Video
Fields: VideoId, UserId, Title, Description, Size, Video File, LikeCount, ViewCount

Table: VideoComments
Fields: VideoId, UserId, Comment, Date

Table: VideoLike
Fields: VideoId, UserId, Date
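
To make the tables above concrete, here is a sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The column types, the keys, the idea that Video File holds a storage path and the like_video helper are all assumptions, not part of the design above. UserDetails is left out for brevity.

```python
import sqlite3

SCHEMA = """
CREATE TABLE User (
    UserId   INTEGER PRIMARY KEY,
    Email    TEXT NOT NULL UNIQUE,
    Username TEXT NOT NULL UNIQUE,
    Password TEXT NOT NULL,      -- assumed to hold a hash, not plain text
    RegDate  TEXT NOT NULL
);
CREATE TABLE Video (
    VideoId     INTEGER PRIMARY KEY,
    UserId      INTEGER NOT NULL REFERENCES User(UserId),
    Title       TEXT NOT NULL,
    Description TEXT,
    Size        INTEGER,
    VideoFile   TEXT,            -- assumed: path or key of the stored file
    LikeCount   INTEGER NOT NULL DEFAULT 0,
    ViewCount   INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE VideoComments (
    VideoId INTEGER NOT NULL REFERENCES Video(VideoId),
    UserId  INTEGER NOT NULL REFERENCES User(UserId),
    Comment TEXT NOT NULL,
    Date    TEXT NOT NULL
);
CREATE TABLE VideoLike (
    VideoId INTEGER NOT NULL REFERENCES Video(VideoId),
    UserId  INTEGER NOT NULL REFERENCES User(UserId),
    Date    TEXT NOT NULL,
    PRIMARY KEY (VideoId, UserId)  -- assumed: one like per user per video
);
"""


def like_video(conn, video_id, user_id, date):
    """Record a like once per user and keep Video.LikeCount in step."""
    try:
        conn.execute(
            "INSERT INTO VideoLike (VideoId, UserId, Date) VALUES (?, ?, ?)",
            (video_id, user_id, date),
        )
    except sqlite3.IntegrityError:
        return False  # this user already liked this video
    conn.execute(
        "UPDATE Video SET LikeCount = LikeCount + 1 WHERE VideoId = ?",
        (video_id,),
    )
    return True


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO User VALUES (1, 'a@example.com', 'alice', 'hash', '2024-01-01')"
)
conn.execute("INSERT INTO Video (VideoId, UserId, Title) VALUES (10, 1, 'Intro')")
first = like_video(conn, 10, 1, "2024-01-02")
second = like_video(conn, 10, 1, "2024-01-03")
like_count = conn.execute(
    "SELECT LikeCount FROM Video WHERE VideoId = 10"
).fetchone()[0]
```

Keeping LikeCount on the Video row avoids counting VideoLike rows on every page view, at the cost of having to keep the two in sync.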

The SQL database is geo-replicated.

Video and Image Storage
Thumbnails of different sizes for different screens.
CDN: proxy servers deployed in multiple data centres, providing availability, scalability and performance.

Popular videos go on the CDN. Less popular ones go on local storage.
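
The popular-versus-less-popular split can be sketched as a simple threshold on view count. The threshold value and the function name are hypothetical, and a real system would likely use recent traffic rather than a lifetime total.

```python
# Hypothetical cutoff: videos at or above this many views are served from the CDN.
CDN_VIEW_THRESHOLD = 10_000


def storage_tier(view_count, threshold=CDN_VIEW_THRESHOLD):
    """Return "cdn" for popular videos and "local" for the rest."""
    return "cdn" if view_count >= threshold else "local"


tiers = [storage_tier(v) for v in (50, 10_000, 2_500_000)]
```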

Database Scaling
Simple DB
Master/Slave
Partitioning (sharding) based on location, so requests are served from a database close to the user
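
The last two approaches can be combined: pick a shard by the user's location, then send writes to that shard's master and reads to a replica. This is a sketch under assumptions: the region codes, host names and the Shard and route names are hypothetical, and the code says primary/replica for master/slave.

```python
import random


class Shard:
    """One regional partition: a primary for writes plus read replicas."""

    def __init__(self, primary, replicas=()):
        self.primary = primary
        self.replicas = list(replicas)

    def node_for(self, is_write):
        # Writes must go to the primary; reads can use any replica.
        if is_write or not self.replicas:
            return self.primary
        return random.choice(self.replicas)


# Hypothetical shard map keyed by region.
SHARDS = {
    "us": Shard("db-us-primary", ["db-us-replica-1"]),
    "eu": Shard("db-eu-primary", ["db-eu-replica-1"]),
}
DEFAULT_REGION = "us"  # assumed fallback for unknown regions


def route(region, is_write):
    """Pick the database node that should handle a request."""
    shard = SHARDS.get(region.lower(), SHARDS[DEFAULT_REGION])
    return shard.node_for(is_write)
```

For example, route("eu", True) picks the EU primary, while a read from an unknown region falls back to a US replica.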

The video cluster is allocated the most resources, because watching videos is the bottleneck. The general cluster gets basic resources.


Cache
Front-end and back-end caches. The videos themselves are not cached, as there are far too many of them.
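
A back-end cache would typically hold small, frequently read items such as video metadata rather than the video files. Here is a minimal least-recently-used (LRU) cache sketch. LRU eviction is an assumption (the notes do not name a policy), and the class and key names are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` items, evicting the least recently used."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None  # cache miss
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


metadata_cache = LRUCache(capacity=2)
metadata_cache.put("video:1", {"title": "Intro"})
metadata_cache.put("video:2", {"title": "Part 2"})
metadata_cache.get("video:1")           # video:1 is now the most recent
metadata_cache.put("video:3", {"title": "Part 3"})  # evicts video:2
```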


Security
Problems:
  • Artificially inflating the view count
  • Sending view requests programmatically
Remedy: validate each view on the server before counting it
  • Block abusive IP addresses
  • Cap the number of views counted per IP address
  • Also check the browser user agent and the user's past history
  • Cross-check the view count against comments, share count, view time etc.
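
The first two remedies can be sketched as a counter that ignores blocked IP addresses and stops counting once an IP reaches a cap for a given video. The cap value and the class and method names are hypothetical, and the user-agent and history checks are left out.

```python
from collections import defaultdict


class ViewCounter:
    """Counts views, ignoring blocked IPs and views beyond a per-IP cap."""

    def __init__(self, max_views_per_ip=3, blocked_ips=()):
        self.max_views_per_ip = max_views_per_ip  # hypothetical cap
        self.blocked_ips = set(blocked_ips)
        self.view_counts = defaultdict(int)       # video_id -> counted views
        self._per_ip = defaultdict(int)           # (video_id, ip) -> views

    def record_view(self, video_id, ip):
        """Return True if the view was counted, False if it was rejected."""
        if ip in self.blocked_ips:
            return False
        key = (video_id, ip)
        if self._per_ip[key] >= self.max_views_per_ip:
            return False  # cap reached for this IP on this video
        self._per_ip[key] += 1
        self.view_counts[video_id] += 1
        return True


counter = ViewCounter(max_views_per_ip=2, blocked_ips={"10.0.0.9"})
results = [
    counter.record_view("v1", "1.2.3.4"),
    counter.record_view("v1", "1.2.3.4"),
    counter.record_view("v1", "1.2.3.4"),   # over the cap
    counter.record_view("v1", "10.0.0.9"),  # blocked IP
    counter.record_view("v1", "5.6.7.8"),
]
```

In production the cap would normally apply per time window, since many legitimate users can share one IP address.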

 
