Tag Archives: Architecture

Design: Twitter

In the series of exploring designs for popular systems, I will try to come up with Twitter’s system design today.

Functional Requirements

  • User should be able to Tweet
  • User should be able to follow and view tweets of others on their timeline
  • User should be able to search for tweets
  • User should be able to view current trends

Non Functional requirements

  • Availability
  • Performance
  • Eventual Consistency
  • Scale 150 million users with 500 million tweets per day or ~5500 Tweets/Second

30000 view of Architecture

30000 view of Twitter Architecture

There are one or two aspects of the above design which are very interesting. The first one we can see is the user timeline. This can be a complicated piece, whenever a user login into the app, he should see his timeline, which will show all the tweets from people he is following. The user might be following hundreds of accounts, it will not be feasible to calculate tweets from all these accounts at runtime and create timeline data. So a passive approach makes sense here, where we can keep the user timeline data in a cache beforehand.

Say user A is following user B, and user B publishes a new tweet, at that time itself, user A timeline will be updated with a new timeline getting added to user A timeline data. Similarly, if 100 users are following user B, all the timelines get updated (Fanout the tweet and update all timelines).

It can get tricky if user B has millions of followers. A different approach can be used in this case. Assuming there are not many such popular users, we can create a separate bucket for handling these popular users. Say user A is following user C (celebrity), so instead of updating the timeline for C beforehand, tweets for all such celebrity users can be pulled in real-time.

Another important aspect is hashtagging and trends exploration. For all the tweets coming in, the text can be tokenized and tokens can be analyzed for most usage. For example, when a cricket match is going on in India, many people might tweet with the term match or cricket. Again these trends might be geo-location-based as this particular trend is a country-specific one.

Design: Whatsapp

In this series of trying to understand designs for popular systems, I am taking up Whatsapp in this post. Please remember all the designs I am sharing in this series are my personal view for educational purposes and might not be the same as actual implementation.

To get started let us take a look at the requirements

Functional Requirements

  • User should be able to create and manage an account (Covered already)
  • User should be able to send a message to contact
  • User should be able to send a message to a group
  • User should be able to send a media message (image/ video)
  • Message Received and Message Read receipts to sender
  • Voice Calling (Not covering here)

Non Functional Requirements

  • Encryption
  • Scaleability
  • Availability

At a very high level, the design looks very straightforward

The first important thing that we see here is that communication is not one-way like a normal web application where the client sends a request and receives a response. Here the mobile app (client) should be able to receive live messages from the messaging server as well. This kind of communication is called Duplex-Connection where both parties can send messages. To achieve this, one can use long polling or web sockets (preferred).

Communication Management: When a user sends a message, it will be sent to the queue of messages received, from where it will be processed and sent to the queue for messages to be sent to users.

Media Management: Before sending the message for processing, media is uploaded and stored in a storage bucket, and the link is shared with users, which can be used to fetch the actual media file.

Single/ Double/ Blue Tick: When a message is received and processed by the server, the information is sent back to the user and marked single ticked. Similarly, when the message is sent successfully the receiver is marked double tick and finally, when the receiver opens the message, it is blue ticked for the sender.

Additional Perpectives

https://leetcode.com/discuss/interview-question/system-design/1490923/WhatsApp-System-Design
WhatsApp system architecture design
https://www.codekarle.com/system-design/Whatsapp-system-design.html

Design: URL Shortner

The problem we are trying to solve is to create a service that can take a large URL and return a shorter version, for example, say take https://kamalmeet.com/cloud-computing/cloud-native-application-design-12-factor-application/ as input and give me https://myurl.com/xc12B2d, a URL easy to share.

The application looks simple, but it does provide a few interesting aspects.

Database: The Main database will be used to store long URLs, short URLs, created dates, created by, last used, etc. as we can see this will be a read-heavy database and should be able to handle large datasets, a NoSQL document-based database should be good for scalability.

Data Scale:

  • Long URL – 2 KB (2048 chars)
  • Short URL – 7 bytes (7 chars)
  • Created at – 7 bytes (7 chars for epoch time)
  • last used – 7 bytes
  • created by – 16 bytes (userid)
  • Total: ~2KB

2KB * 30 million URLs per month = ~60 GB per month or 7.2 TB in 10 years

Format: The next challenge is to decide the format of the tiny URL. The decision is an easy one, Base 10 URL would give you 10^7 or 10 million combinations for a 7-character string whereas a Base 62 format will give 62^7 or 3.5 trillion combinations for 7 character string.

Short URL Generator: Another challenge to solve is how to choose a random 7 Base 62 string for each URL.

Soln 1: Use MD5 which returns a string of 20+ chars, we can take the first 7 characters. The problem here is taking the first 7 characters might lead to a collision where multiple strings have MD5 with the same first 7 characters

Soln 2: Use a counter-based approach. A counter service will generate the counter which gets converted to Base 62, making sure all requests get a unique Base 62 string. To scale it better, we will have a distributed counter generator.

Design: Dropbox

In the series of exploring design for popular systems, I will look at a file share system like dropbox.

Functional Requirements

  • User is able to upload or download files via a client application or web application
  • User is able to sync and share files
  • User is able to view the history of updates

Non Functional Requirements

  • Performance: Low latency while uploading the files
  • Availability
  • Concurrency: Multiple users are able to update the same file

Scaling Assumptions

  • Average size file – say 200 MB
  • Total user base- 500 million
  • Daily active users- 100 million
  • Daily file creations- 10 per user
  • Total files per user- 100
  • Average Ingress per day: 10 * 100 million * 200 MB = 200 petabytes per day

Services Needed

  • User management Service
  • File Handler Service
  • Notification Service
  • Synchronization Service

File Sync

When Syncing the files we will break the file into smaller chunks, so that only the chunk which has undergone updates will be sent to the server. This architecture is helpful in contrast to sending the file to the server for every update. Say a 40 MB file gets broken into 2 MB chunks each.

This architecture helps solve problems like

  • Concurrency: If two users are updating different chunks, there is no conflict
  • Latency: Faster and parallel upload
  • Bandwidth: Only chunk updated is sent
  • History Storage: New version only need a chunk of data rather than full file space
Complete-System-Design-Solution-of-Dropbox-Service
https://www.geeksforgeeks.org/design-dropbox-a-system-design-interview-question/

The most important part of this design is the client component.

  • Watcher: This component keeps an eye on a local folder for any changes. It informs Chunker and Indexer about changes.
  • Chunker: As discussed above, the chunker is responsible for breaking a file into manageable chunks
  • Indexer: On receiving an update from watcher, Indexer updates the internal database with metadata details. It also communicates with Synchrnozation service sending or receiving information on updates happening to files and syncing the latest version.
  • Internal DB: To maintain file metadata locally on the client.

Cloud Storage finally stores the files and updates. Metadata server maintains metadata and helps inform clients about any updates through synchronization service. Synchronization service adds data to the queue which is then picked by various clients based on availability (if a client is offline, it can read messages later and sync up the data). Edge store helps provide details to clients from the nearest location.

Design: Uber

In this series of popular system designs, I will take up design for a taxi aggregator service like Uber. Please note that the designs I share are based on my understanding of these systems and might not be the same as actual designs.

Before Getting on with Designs, let’s list down the requirements

Functional Requirements

  • User (Rider) should be able to create an account
  • Driver should be able to register
  • Rider should be able to view cabs availability in near proximity along with approx ETA
  • Rider should be able to request for a cab
  • Driver should be able to receive a trip request.
  • Driver can accept the request (Assumption: Multiple drivers will receive the request and one will accept)
  • Trip starts when rider boards the request.
  • When trip ends, Rider can make the payment and receive a receipt.
  • Rider can rate the Trip / Driver
  • Rider can view Trip history
  • Driver can view Trip history

Non Functional Requirements

  • Availability
  • Scalability
  • Performance
  • Consistency
  • Localization

Services to be created

  • User Management Service
  • Driver Management Service
  • Cab Search Service (Takes Riders location and finds nearby cars)
  • Ride Request Service (Rider’s request is shared with drivers for acceptance)
  • Trip Management Service
  • Trip History Service

Database Requirements

  • User Data
  • Driver and Cab Data
  • Cab Location Data
  • Trip Data (Current and Historical)

A high-level design might look like

An important question one needs to answer is about the Cab locator service works. The class way of GeoHashing should be used, where the whole area will be thought of as square boxes. Say a user is at Lat, Long position, we will try to find all the boxes around the current location within a given radius. And then find all the Drivers available in these Geo boxes.

Some interesting alternative representations

https://www.codekarle.com/system-design/Uber-system-design.html
Uber-System-Design-High-Level-Architecture
https://www.geeksforgeeks.org/system-design-of-uber-app-uber-system-architecture/

Design: Netflix

Designing or architecting a system is a complex task. One needs to think of various aspects that can impact a system. At a high level, we bucket the requirements into two parts – Functional and Non-Functional. Functional requirements, in simple words, can be thought of as functionalities one needs to build. Non-functional requirements can be complex as they usually will not be called out explicitly and as an architect, you need to figure out after discussions with various stakeholders.

Reference: https://kamalmeet.com/system-design-and-documentation/steps-for-designing-a-system-from-scratch/

In this post, I would try to look at the system design for Netflix. Of course, it is a complex system and it is difficult to cover in one post, but I will try to touch upon important aspects.

Functional Requirements:

  • Account Management: Create Account/ Login/ Manage and Delete the Account
  • Subscription Management
  • Search
  • Watch a Video: View/ Download for offline viewing
  • Recommendations: User-based/ Generic/ Top trends/ Genre
  • Device Synchronization
  • Language Selection: Audio/ Video

Non-Functional Requirements:

  • Performance: Realtime streaming performance
  • Reliability
  • Availability
  • Scalability
  • Durability

Data needed:

  • number of users
  • daily active users
  • the average number of videos watched per day/ per user
  • the average size of the video
  • number of videos total/ uploaded per day

Let me borrow the high-level architecture image

https://www.linkedin.com/pulse/system-design-netflix-narendra-l/?published=t

Microservices-based architecture: Netflix is an early adapter of microservices and helped popularize the use of microservices. Microservices help Netflix manage its critical services by keeping them stateless, secured, scalable, available, and reliable.

CDN or Content Delivery Network: In the image above we see Open Connect, which is Netflix’s CDN. For any application which has consumers across multiple geographies, CDN is an important piece. This helps deliver content like images, videos, JavaScript, and other files from a location nearest to the user helping improve performance. In addition, Netflix provides Open Connect Appliances to ISPS free of cost, which helps ISPs save bandwidth and helps Netflix Cache content for better performance.

Transcoding: Any video getting uploaded to Netflix then gets converted to videos of various resolutions. The video gets uploaded to a queue from where it is taken up by transcoder workers who after converting the video upload them to AWS S3. When a user clicks on a video to be played, the best option is chosen based on the client and bandwidth.

API Gateway: ZUUL is the API gateway used by Netflix, which provides features for gateway like security, authentication, routing, decorating requests, Beta testing (based on routing), etc.

Resiliency: It is a resiliency library by Netflix. It handles scenarios like timeout handling, failing fast by rejecting requests when the thread pool is full, circuit breaker when the error rate is heavy, fallback to default response, etc.

Cache: Netflix uses EV cache to provide performance, reduced latency, better throughput, and reduced overall cost. EV cache is a custom implementation of Memcache, which is not dependent on RAM and can use SSD.

Database: Netflix uses MySQL for data that needs ACID properties, data like user data. Read replicas are used to improve query performances. Cassandra is used for NoSQL, to keep data like browsing and watching history. Older history data can be moved to the compressed cheaper data store.

Logs Management: All log data is sent to Chukwa through Kafka. You can view logs on the dashboard. Finally, logs can be sent to S3 for further retention and usage.

Search: Elastic Search is used for indexing and searching.

Recommendations: Spark is used for data analysis. it helps rank content based on user history as well as using data from users with similar tastes. For example, if two users have given similar ratings to a movie, their tastes might be similar. Also if a user watches comedy content mostly, the recommendation engine might suggest more comedy content.

Refrences:

Cloud Native Application Design – Challenges with Microservices

In the previous post, I talked about the benefits of microservices. but the discussion will be incomplete without talking about some of the important challenges one should expect when going for microservices-based architecture.

Well, microservices provide a lot of benefits, but this definitely is not a silver bullet, and if not architected properly, the design can cause more pain than it will provide benefits. Here are some of the considerations one needs to keep in mind when designing an application with microservices.

Too few/ too many services: The first question one needs to deal with when going for microservices-based architecture is how to break the application into microservices. Too many microservices would mean you are unnecessarily complicating the system and too few would mean you are not getting the benefits of a microservice-based design.

Complex DevOps: Unlike a monolith application where you are deploying just a single application, not you are dealing with dozens of microservices. Each service needs its own compilation and deployment pipeline, which means independent management and tracking.

Monitoring: As multiple services are communicating with each other to make the application work, a single failure can impact the overall success. Hence it is important to monitor all services, which means a complex dashboarding, alerting, and monitoring system in place to keep a check on all pieces.

Multiple Tech Stacks: One of the advantages we get with microservices is the independence you get in choosing a tech stack for each piece, but too many tech stacks would mean difficult intra-team support and low expertise on technology.

Managing Data: Another challenge with microservice-based design is managing the data. As a rule of thumb, each microservice should manage its own data. But this can get tricky as sometimes microservices need to share data. If not managed properly one can run into a problem of duplicate sources of data or performance issues in fetching data from other services.

Design for Accessibility – 2

In the last post, I introduced the concept of accessibility, here I will discuss more on testing your application for accessibility. The good thing is that we have many tools that can help us with basic accessibility testing.

The first level of testing that you would like to do for accessibility would be for Keyboard navigation (does your application support tab/ arrow navigation) and Screen reader. All Operating systems come with inbuilt support for screen readers Windows has Narrator, Mac has VoiceOver, and there are readers like JAWS and NVDA.

Additionally, there are tools to help with accessibility testing, some of the common ones are following

Chromium Developer Tools: Chromium has inbuilt support for checking for accessibility and generating reports for a page. Refer https://developer.chrome.com/docs/devtools/accessibility/reference/

Accessibility insights: Another tool that can help with testing and generating a detailed report for an application. Refer https://accessibilityinsights.io/. It gives not only details but also suggestions on fixes. Here is a sample report

Report generated by Accessibility Insights
https://techblog.topdesk.com/accessibility/testing-accessibility-with-accessibility-insights/

Additional Tools Worth Exploring

Design for Accessibility – 1

As a software developer/ designer/ architect, when you think about software, there are many non-functional aspects of the software that you need to take care of. Accessibility is one such important non-functional aspect, which can get neglected if you are not paying attention.

Before getting into more details, let’s try to understand what accessibility is –

Accessibility enables people with disabilities to perceive, understand, navigate, interact with, and contribute to the web. 

https://medium.com/salesforce-ux/7-things-every-designer-needs-to-know-about-accessibility-64f105f0881b

Four core Accessibility principles – Perceivable, Operable, Understandable, and Robust.

Information about accessibility testing and compliance
https://www.pqatesting.com/2020/10/26/your-accessibility-testing-cheat-sheet/

WCAG or Web Content Accessibility Guidelines, current version 2.1 gives us a detailed idea of areas one needs to consider when working on the accessibility of a website.

Refer: https://www.w3.org/TR/WCAG21/#intro

WCAG 2.1 gives three levels of conformance A, AA, or AAA

WCAG diagram showing overall structure, with principles in the far left column, guidelines in the adjacent column, and success criteria numbers across the following three columns, A, AA and AAA.
https://openclassrooms.com/en/courses/6663451-make-your-web-content-accessible/6912083-get-to-know-the-web-content-accessibility-guidelines-wcag

WAI-ARIA: Web Accessibility Initiative – Accessible Rich Internet Applications or WAI-ARIA is a specification developed by W3C in 2008.

WAI-ARIA, the Accessible Rich Internet Applications Suite, defines a way to make Web content and Web applications more accessible to people with disabilities. It especially helps with dynamic content and advanced user interface controls developed with HTML, JavaScript, and related technologies. 

https://www.w3.org/WAI/standards-guidelines/aria/

Further Readings

Cloud Native Application Design – Benefits of Microservices

In the last post, I introduced the concept of microservices and why it is a default fit for Cloud Native Design. But an important question that should be asked is why microservice-based design has gained so much popularity in recent years. What all advantages microservices gives us over traditional application design?

Let us go over some advantages of microservices-based design here

Ease of Development: The very first thing visible in microservices-based architecture is that we have divided the application into smaller services. This also means we can organize our developers into smaller teams, focusing on one piece of deliverable, less code to understand, less code to the manager, and hence can be more agile.

Scalability: A monolith application is difficult to scale as you are looking at replicating the whole application. In the case of microservices, you can focus on pieces that are actually required to scale. For example, in an eCommerce system, a search microservice is getting a lot of requests and hence can be scaled independently of the rest of the application.

Polyglot programming: As we have divided applications into smaller pieces, we have flexibility in choosing tech stacks for different pieces as per our comfort. For example, if we know a certain Machine learning piece of application/ microservice can be developed easier in python, whereas the rest of my application is in Java, it is easy to make the choice.

Single Responsibility Services: As we have divided the application into smaller services, each service is now focusing on one responsibility only.

Testability: It is easier to test smaller pieces of applications than to test them as a whole. For example, in an e-commerce application, if a change is made in the search feature, testing can be focused on this area only.

Agile Development: As teams are dealing with smaller deployable now, it is easier and a good fit for Continuous Integration (CI) and Continuous Deployment (CD), hence helping achieve less time to market.

Stabler Applications: One advantage microservices architecture gives us is to categorize our microservices on basis of criticality. For example, in our e-commerce system, order placement and shopping cart features will be of higher criticality than say a recommendation service. This way we can focus on the availability, scalability, and maintenance of areas that can directly impact user experience.