5. Scale from ZERO to MILLION Users in Detail | System design interview: Scale to 1 million users
27 Aug 2022
38,553 views
In this video, I talk about scalability and how to grow a website to handle 1 million users. I cover load balancing, content delivery networks, database master/slave replication, horizontal/vertical sharding, and more.
To appreciate the work, pls join:
/ @conceptandcoding
#cdn #sharding #hld
I had seen LLD & HLD material before but couldn't grasp it, because it jumped directly to the solution. But the way you went from 0 -> N was amazing. I have worked on a monolith application but couldn't understand how to approach changing it for the better. Hats off to you and your teaching method. More power to you!!!
Honestly, this is one of the most fine-grained system design lectures I have ever seen! I paid for Udemy courses, looked at freeCodeCamp videos and Medium blogs, but this is the finest of all gems! Thanks a lot! You deserve a lot lot lot!
thanks
Beautiful explanation
Great content! It would be helpful if you could include any references you used, such as articles, blogs, or books in descriptions
in love with all the videos !! curiously waiting for the consistent hashing topic , please drop it soon 🙂
:) sure thanks
It is really cool to see such an incremental approach. I appreciate this video. Could you please also make an extension of this video covering Elasticsearch?
Noted
This stuff is really really cool, and Thank you so so much for sharing your knowledge. Appreciated. :)
Thanks
One of the best videos on step-wise scaling!
Thanks
Beautiful explanation. Thank you!
Thanks
Thank you for your service. Great job.
Thank you
Amazing content☺ on LLD and HLD. I was finding difficulty in answering about the design patterns in the interview, but the roadmap of LLD and HLD explained here really helped.
thanks
I was from the electrical branch, so I didn't do any such project, nor did I learn about any computer science project. But now I am picking it up.
Thank you
Great content, keep it up. You have earned a new subscriber.
thank you
Liked the step-by-step scaling explanation. Pls make more such videos so that we can find them all here.
thank you
Very nice tutorial. Provided detailed information in very short time. thank you
thank you
Such a clean and sophisticated explanation
Thanks
Really enjoyed it, love this video (y). Thank you so much
Glad to know that
Amazing content + highly recommended to watch it.
Thanks
It's a great learning video. Thank you. I have seen in interviews and in other HLDs that a queue is added to store requests when the load increases too much, so that no request gets missed. Could you please make a video on this concept, explaining in detail how this is done (whether the request thread waits on the queue) and how the response is sent back?
noted Shobhit
Best video I have ever seen. Thanks so much
Thanks
You are the best tech professional.
Thanks
Thanks for the great video. What's the difference between horizontal sharding and indexing?
Nice Video to get a summary of all concepts
Thanks
Keep up the great work:)
thank you
Really good explanation. Thank you
Thanks
I think what you described as sharding here is actually partitioning, not sharding; multiple partitions can also reside in the same shard. Sharding is always at the DB level, while partitioning is on the data content (rows and columns). The different deployed DB instances are called shards. CMIIW.
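To make the distinction in the comment above concrete, here is a minimal Python sketch. It assumes the commenter's terminology (a partition is a logical slice of the data, a shard is a deployed DB instance) and is not taken from the video. The names `NUM_PARTITIONS`, `SHARDS`, `partition_for`, and `shard_for` are hypothetical.

```python
import zlib

NUM_PARTITIONS = 8            # assumed number of logical partitions
SHARDS = ["db-0", "db-1"]     # hypothetical DB instances (the "shards")

# Several logical partitions are placed on the same DB instance.
PARTITION_TO_SHARD = {p: SHARDS[p % len(SHARDS)] for p in range(NUM_PARTITIONS)}


def partition_for(key: str) -> int:
    """Map a row key to a logical partition with a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def shard_for(key: str) -> str:
    """Look up which DB instance currently hosts the key's partition."""
    return PARTITION_TO_SHARD[partition_for(key)]


print(partition_for("user-42"), shard_for("user-42"))
```

With this layout, moving a partition to another instance only changes one entry in `PARTITION_TO_SHARD`; the key-to-partition mapping stays the same.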
Thank you, really helpful video.
thanks Shipra, pls do share it with your connections
crystal clear explanation
thank you
Hey, thanks for the great content, man. I have a few doubts, though. 1 - Scaling the DB and scaling the app server are the same, right? I mean, for both we use horizontal and vertical scaling, right? The only difference is implementation-wise. 2 - Do you think master-slave is a kind of horizontal scaling, because there we have replication of data (slaves have the same data as the master), just as in the case of app servers we have replication of code, with multiple servers running it? 3 - If we have master-slave, then what is the need for sharding (as it is also a kind of horizontal scaling)? 4 - Why don't we have load balancers in front of the slaves in master-slave, like we have in front of the app servers?
Can you please make a separate detailed video on Messaging Queue? It will be really helpful.
It's in my bucket list
Brother, you are doing great work, I must tell you. I always recommend your channel to people who want to learn LLD and DP. One note: when you record the DP videos, please come up with real-world use cases, and show where we can apply these DP concepts in the HLD questions asked in interviews. For example, can we apply any DP in the parking lot problem?
Got it thanks for the feedback. I will do that .
Summary:
04:25 Scaling from 0 to 1 million users
08:50 Implementing Load Balancer
10:00 Database Replication to handle system failures
13:15 Using cache and CDN to improve performance and reduce database calls
17:40 CDN is a solution to reduce latency for global users
22:05 CDN enables advanced technology with features to optimize load on DB and increase security.
26:30 Messaging queue helps in asynchronous processing of heavy load operations.
30:55 Fanout, Topic, and Messaging Queue are important concepts in message exchange
35:14 Horizontal and vertical sharding are methods of dividing data in a scalable system.
If we shard on the basis of the primary key, will the tree sharding problem still go away?
I am watching this from New York, but there are so many ads every few minutes. It's so frustrating. I don't know if it's because of the author or the location, but it needs to be fixed. The content is good as always.
It totally depends on the region. You can use the Brave browser or an ad blocker to bypass ads.
Question - Step 4: how are the requests from the app servers farmed out to multiple DBs? Do we need an internal load balancer between the app and DB servers? I would appreciate it if you could add some explanatory notes. Great videos, love them.
I will cover this in my next HLD video; it's going to be a very interesting one.
good job young man
Thank you
Question: Suppose there is a huge amount of data in the DB and it needs to be divided into two shards. Will each shard have its own dedicated node, or will both shards live on a single node? Also, can a single node/VM contain only one shard, or can it contain multiple shards?
Hello. Question: how do data centres communicate with each other and avoid inconsistency in a system? Let's say at time t1 a write request goes to the India DC (data centre) and then goes to the US DC for replication. If this is synchronous communication, the user's write will see added latency, as sending the ack back to the user will take a while. To avoid that latency, we choose asynchronous communication between the data centres, and the following series of events happens: 1. Since we are using an async medium, the queue holds data waiting to be replicated to the US DC. 2. The India DC goes down. 3. At time t2 (t2 > t1), a read request (read1) comes to the US DC. At time t2, the data has not been replicated to the US DC yet. 4. Say read1 needs to serve a heavy query result and, supposedly, we have a cache miss. How do we handle such a case? PS: We can have multiple data centres, but the crux here is that the read comes before replication to the other data servers, and the primary/master (the server with the latest write) goes down.
Thanks Piyush, for this detailed explanation of the use case. Let me quote a paragraph from the book "Designing Data-Intensive Applications": "Often, leader-based replication is configured to be completely asynchronous. In this case, if the leader fails and is not recoverable, any writes that have not yet been replicated to followers are lost. This means that a write is not guaranteed to be durable, even if it has been confirmed to the client. However, a fully asynchronous configuration has the advantage that the leader can continue processing writes, even if all of its followers have fallen behind. Weakening durability may sound like a bad trade-off, but asynchronous replication is nevertheless widely used, especially if there are many followers or if they are geographically distributed. We will return to this issue in “Problems with Replication Lag”." So this describes the problem for a single data centre with a single leader and multiple followers, but the same applies to multiple data centres with multiple leaders too. So it's a trade-off, and it does exist :) Hope this helps.
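The trade-off in the quoted passage can be illustrated with a small in-memory simulation. This is a sketch under simplifying assumptions (one leader, one follower, a plain queue standing in for the replication channel), not how any particular database implements replication. `Leader`, `Follower`, and `replicate` are hypothetical names.

```python
from collections import deque


class Leader:
    def __init__(self):
        self.data = {}
        self.replication_queue = deque()  # changes not yet shipped to the follower

    def write(self, key, value):
        self.data[key] = value
        self.replication_queue.append((key, value))
        return "ack"  # the client is acknowledged before the follower has the change


class Follower:
    def __init__(self):
        self.data = {}

    def apply(self, change):
        key, value = change
        self.data[key] = value


def replicate(leader, follower, max_changes):
    """Ship at most max_changes queued changes to the follower (simulated lag)."""
    shipped = 0
    while leader.replication_queue and shipped < max_changes:
        follower.apply(leader.replication_queue.popleft())
        shipped += 1
    return shipped


leader, follower = Leader(), Follower()
leader.write("order:1", "placed")
leader.write("order:2", "placed")
replicate(leader, follower, max_changes=1)  # only the first change arrives in time

# If the leader crashes unrecoverably now, everything still queued is lost,
# even though both writes were acknowledged to the client.
lost = [key for key, _ in leader.replication_queue]
print(follower.data, lost)
```

A read served by the follower after the crash sees `order:1` but not `order:2`, which is the situation described in the question above.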
Hey @ConceptandCoding, how will consistent hashing help to reduce tree-structured sharding? I mean, every shard has different data, so redirecting a request to the wrong shard using consistent hashing won't help, right?
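For readers with the same question, here is a minimal Python sketch of a consistent hash ring. It is a generic illustration, not the video's implementation; `HashRing`, the shard names, and the choice of 100 virtual nodes per shard are assumptions. The same key always hashes to the same position, so a request is never sent to a "wrong" shard, and adding a shard only moves the keys that the new shard takes over.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Stable hash used to place both nodes and keys on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual points per node, to even out the load
        self._points = []      # sorted hash positions on the ring
        self._owner = {}       # hash position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def node_for(self, key):
        """Return the node owning the first ring position clockwise from the key."""
        if not self._points:
            raise ValueError("ring has no nodes")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


keys = [f"user-{i}" for i in range(1000)]
ring = HashRing(["shard-a", "shard-b", "shard-c"])
before = {k: ring.node_for(k) for k in keys}

ring.add_node("shard-d")
after = {k: ring.node_for(k) for k in keys}

# Keys whose owner changed; each of them now belongs to the new shard.
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved")
```

With plain `hash(key) % number_of_shards`, going from 3 to 4 shards would instead remap most keys, which is what forces repeated re-splitting of existing shards.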
Very Nice !!
thanks
Super
Great content 👍👍
Thank you 🙌
Why is there a need for a join query in sharding? We have the same columns in every table, so why do we have to join? Can't we simply query the respective shards?
Beautifully explained; words aren't enough for the effort you have taken, thanks! Also, I have a doubt, please reply: I wanted to know how long it takes a single software engineer to implement all the HLD concepts mentioned here in a real project, and how long it takes to scale an app. Let's assume it's a food delivery app.
Sorry I did not get the question, could you please elaborate when you say "how long it takes to implement"...
@@ConceptandCoding I'm curious about the time required for a lone software engineer to implement a high-level system design project, incorporating all the concepts mentioned in the video, in a real setting, using AWS. Additionally, I'm interested in understanding the time frame needed to scale an app, specifically considering a food delivery app as an example. Do you also take classes? Is there a webinar/course on high-level system design?
Great
Thank you
Hello Sir, I want to present my idea in Tata Imagination Challenge. Can you please guide me on the same? If possible, please make a video so that it will be helpful for all of us. Thankyou.
Nice work, buddy. Can you help me with these doubts (as soon as you can; I have an interview scheduled, thanks in advance)? Suppose I have a trading application and want to reduce latency for geographically distributed users: say one user is in the US, another in Singapore (APAC), and a third in London (EMEA). So I scale my app servers, make application replicas, and place them in different geographies. 1) Where will I place the gateway? There is generally one gateway, right? 2) For the applications to interact with the DB, where should the DB be placed (would it be better to have distributed DB replicas placed across geographies)? 3) What type of DB should I use that can sync between replicas? (It looks like consistency will take a hit, since DB sync will take time.) 4) How do I handle concurrent orders? (A trade order may have 1 unit left while all users are trying to book that 1 unit. DB isolation may not work here because they are reading different replicas. Or we could use an active-passive strategy where writes always go to the primary DB, but in that case where do we put the primary DB, since far-away users will again face latency in write operations?)
1. Instead of relying on a centralized gateway, we can use geographically distributed gateways, such as anycast gateways or a CDN. 2. Yes, we need distributed DB replicas placed across geographies. 3. It's a common scenario for distributed DBs to sync with each other; for example, Oracle uses OGG (Oracle GoldenGate) to transfer committed data to the distributed DBs. But it's true that, as even Oracle mentions in its OGG documentation, it might take up to 15 minutes in the worst case. 4. Active-passive is one solution, and yes, the latency issue will be present for far-away users. But active-active can also be used with a distributed locking mechanism (as part of this mechanism, the lock on the same DB instance is put on all active nodes).
@@ConceptandCoding Thanks for such a quick and perfect response. I will be asking many more questions this weekend :) Also, Do you suggest any cloud certification and cloud related courses which can make us pro on these topics?
Please make a separate video on database scaling .
noted
1) Who takes care of promoting a slave DB to master if the primary fails? Does the DB cluster itself do that job? I believe replication is also taken care of by the DB cluster itself. 2) In general, we give one URL in the datasource bean of a Spring Boot properties file for a relational DB. What would the URL be in a Spring Boot application if there are multiple replica DBs, and how can we change the primary in that case?
2nd was asked in an interview recently
1) When DBs are run in Docker containers, we give this configuration in files. An example is shown below:
    MongoDB shell version v5.0.8
    > db = (new Mongo('localhost:27017')).getDB('test')
    test
    > config = {
        "_id" : "my-mongo-set",
        "members" : [
          { "_id" : 0, "host" : "mongo1:27017" },
          { "_id" : 1, "host" : "mongo2:27017" },
          { "_id" : 2, "host" : "mongo3:27017" }
        ]
      }
2) For the second question, we can create these files in Spring as well. In the application.yml file we can give:
    spring:
      devtools:
        restart:
          enabled: false
        livereload:
          enabled: false
      data:
        cassandra:
          contactPoints: localhost
          protocolVersion: V4
          compression: LZ4
          keyspaceName: xyz
        jest:
          uri: localhost:9200
* jest is the Elasticsearch client, and the DB here is Cassandra.
What's the use of vertical sharding? It just splits the table by columns while the number of rows remains the same, which means we are creating multiple tables from one table. Doesn't that make the design even more complex?
A user profile can have, say, 20 fields, of which only 5 are the most sought after or frequently used. Splitting it into 2 tables, one with 5 columns and another with 15, makes the first DB lighter and faster to respond.
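The column split described in the reply above can be sketched in a few lines of Python. This is an illustration with made-up field names, using dictionaries in place of real tables; `HOT_FIELDS`, `split_profile`, and `join_profile` are hypothetical.

```python
# Assumed set of frequently read columns; the rest are treated as "cold".
HOT_FIELDS = {"user_id", "name", "email"}


def split_profile(profile):
    """Split one wide row into a small hot row and a larger cold row."""
    hot = {k: v for k, v in profile.items() if k in HOT_FIELDS}
    cold = {k: v for k, v in profile.items() if k not in HOT_FIELDS}
    cold["user_id"] = profile["user_id"]  # both halves keep the key for re-joining
    return hot, cold


def join_profile(hot, cold):
    """Rebuild the full row; needed only when a query touches both column groups."""
    if hot["user_id"] != cold["user_id"]:
        raise ValueError("rows belong to different users")
    return {**cold, **hot}


profile = {
    "user_id": 7,
    "name": "Asha",
    "email": "asha@example.com",
    "bio": "backend engineer",
    "address": "Pune",
    "preferences": "dark-mode",
}
hot, cold = split_profile(profile)
print(hot)
print(join_profile(hot, cold) == profile)
```

Most reads touch only `hot`, which is where the speed-up comes from; the cost is the extra join, via the shared `user_id`, whenever a query needs columns from both halves.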
Do you refer to Alex Xu?
There is one concept of Service Discovery also. This should have also been added into the diagrams.
Noted
Great video 😇.Hope there would have been some predefined steps to reach 1 million followers/subscribers 🥺
Yes first step is to share it with your connections :)
I had a question: at your level, like SDE3, do you also have to work on the frontend, or mostly on backend architecture only?
I am a backend engineer, never worked on frontend.
@@ConceptandCoding You are really lucky that you don't get frontend tasks. Whenever I joined a dev job, I was expected to work on the frontend, which I really don't understand. I knew Node.js at the time but had no frontend expertise. What should I do so that I get only backend-related jobs once I crack an interview?
Pls also start videos of spring boot framework.
noted Alok
Hey, can you please explain how data is replicated between databases situated in different geographic locations, for example what type of connection is used between them?
Ack
Bro, when can we expect English subtitles to this video?
So you are saying the first step should be replication, then vertical scaling, but I think it should be the reverse.
How to achieve Master slave replication in real time?
Where to learn LLD HLD from ? and how to scale it to millions of users
pls check the video Deepak.
@@ConceptandCoding I'm struggling with a CodeRanch problem. Can I email it to you?
Consecutive write and read operations on the same data will result in failure if the read is served from a replica, even if the replication lag is only 3-5 milliseconds.
:) It will not fail; generally the data is also cached. If the write and read happen almost instantly (within a gap of 3-5 milliseconds), the read will mostly be served from the cache. So no issues.
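The exchange above can be sketched as a tiny simulation. It assumes a write-through cache (the cache is updated as part of the write); that assumption is what makes the read-after-write succeed, and it is not something the reply spells out. `Store` and its methods are hypothetical names.

```python
class Store:
    def __init__(self):
        self.primary = {}
        self.replica = {}  # lags behind the primary until sync() runs
        self.cache = {}

    def write(self, key, value):
        self.primary[key] = value
        self.cache[key] = value  # write-through: cache is updated with the write

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        return self.replica.get(key)  # may be stale or missing during replication lag

    def sync(self):
        """Simulate replication catching up."""
        self.replica.update(self.primary)


store = Store()
store.write("cart:9", "2 items")
print(store.read("cart:9"))   # served from the cache although the replica is behind

store.cache.clear()           # simulate a cache miss or eviction
print(store.read("cart:9"))   # falls through to the lagging replica

store.sync()
print(store.read("cart:9"))   # replica has caught up
```

The middle read shows the commenter's concern: on a cache miss during the lag window, the replica cannot return the fresh value, so reads that must see the latest write would need to go to the primary.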
Hi can you pls share these kind of notes :) (onenote with us)
Sure, I will share them in future, as I haven't saved the earlier ones.
Started HLD today. Hope to finish hld and lld in a week. Wish me luck
good luck buddy
Please post videos in English
English would be very helpful
Yes all latest videos are in English only
Can you please make this video in English if you have free time?
Sure i will try
@@ConceptandCoding Thanks a lot.
A humble request to speak in English
Hi Radhi, yes i have switched to English for all my latest videos
Please speak in English or at least add english subtitles
Learn Hindi
Does a CDN have server instances, or only static data? Where does the client request go first: the CDN or the actual server?
Manish suthar?