Cloud Native Application Design – Backend For Frontend Pattern

Applications often expose their frontend in more than one medium, for example a desktop application and a mobile application. The scenario gets more complicated when the application has different mobile versions for Android and iOS, and the same APIs might also be consumed by third-party services. In short, the same set of APIs has several consumers, each of which might have different requirements.

One way to solve the problem is for the API being called to check the source of the call and reply with the data that caller needs. For example, for a GET orderdetails call, a mobile client might need just the order history listing, whereas a desktop frontend might want to show more information as it can accommodate more on the interface. At the same time, we want to expose only limited information to a third-party caller. The Backend For Frontend (BFF) pattern takes a different route: each type of frontend gets its own backend service, which returns exactly the data that frontend needs.

General Purpose API vs BFF https://medium.com/mobilepeople/backend-for-frontend-pattern-why-you-need-to-know-it-46f94ce420b0

The image above shows how the BFF pattern helps customize responses for each caller.
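Here is a minimal sketch of the idea in Java. The Order class, its fields, and the three BFF classes are hypothetical names used only for illustration; in a real system each BFF would be a separately deployed service calling shared downstream services.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical order model, for illustration only.
final class Order {
    final String id;
    final String status;
    final double total;
    final String shippingAddress;
    final String internalNotes;

    Order(String id, String status, double total, String shippingAddress, String internalNotes) {
        this.id = id;
        this.status = status;
        this.total = total;
        this.shippingAddress = shippingAddress;
        this.internalNotes = internalNotes;
    }
}

// Each frontend gets its own backend that shapes the same order differently.
interface OrderBff {
    Map<String, Object> orderDetails(Order order);
}

final class MobileOrderBff implements OrderBff {
    // Mobile screens only need a compact listing.
    public Map<String, Object> orderDetails(Order order) {
        Map<String, Object> view = new LinkedHashMap<>();
        view.put("id", order.id);
        view.put("status", order.status);
        view.put("total", order.total);
        return view;
    }
}

final class DesktopOrderBff implements OrderBff {
    // Desktop can accommodate more detail on the interface.
    public Map<String, Object> orderDetails(Order order) {
        Map<String, Object> view = new LinkedHashMap<>();
        view.put("id", order.id);
        view.put("status", order.status);
        view.put("total", order.total);
        view.put("shippingAddress", order.shippingAddress);
        return view;
    }
}

final class ThirdPartyOrderBff implements OrderBff {
    // External callers get a deliberately limited view.
    public Map<String, Object> orderDetails(Order order) {
        Map<String, Object> view = new LinkedHashMap<>();
        view.put("id", order.id);
        view.put("status", order.status);
        return view;
    }
}
```

None of the views expose internalNotes, and the generic order model stays unaware of who is calling; the caller-specific logic lives in the BFF classes.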

References:

https://medium.com/mobilepeople/backend-for-frontend-pattern-why-you-need-to-know-it-46f94ce420b0

https://learn.microsoft.com/en-us/azure/architecture/patterns/backends-for-frontends

https://samnewman.io/patterns/architectural/bff/

Cloud Native Application Design – 12 Factor application

If you look around, you will find many sets of best practices for creating web applications, microservices, cloud-native applications, and so on. Developers and teams tend to share what they learned while going through the process of application development. One needs to understand these learnings and figure out which of them apply to one's own development process.

One such list of best practices, which is very popular among developers and architects, is “The twelve-factor app”: https://12factor.net/
There are definitely some interesting points in it that we should understand and use.

I. Codebase
One codebase tracked in revision control, many deploys

Always track your application via the version control system
Each microservice should be in its own repository
One repository serves multiple deployments e.g. dev, stage, and prod

One codebase maps to many deploys
https://12factor.net/codebase

II. Dependencies
Explicitly declare and isolate dependencies

An application has different dependencies that make it work; these can be libraries (jar files), third-party services, databases, etc. You must have seen scenarios where an application behaves differently on two machines, and eventually the problem turns out to be a different version of a library on each machine.

The idea here is to declare all dependencies explicitly (using dependency managers like Maven, and property files for any third-party dependencies) and isolate them from the main code, so that any developer can start with a bare minimum setup (say, only Java installed on the machine) and get going, with all dependencies managed and downloaded separately.

III. Config
Store config in the environment

Your application has environment-specific configurations (dev, stage, and prod) like database, caching config, etc.

Keeping configurations with code can cause multiple issues:

  • You need to update the code and redeploy for any configuration change
  • As most developers have access to the code, it is difficult to control who can make changes, and the setup is prone to human error

To avoid these, we need to keep configuration in the environment.
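As a minimal sketch in Java, configuration can be read from environment variables at startup. The variable names DATABASE_URL and CACHE_TTL_SECONDS and the AppConfig class are assumptions made for this example, not names from the twelve-factor document.

```java
import java.util.Map;

final class AppConfig {
    final String databaseUrl;
    final int cacheTtlSeconds;

    private AppConfig(String databaseUrl, int cacheTtlSeconds) {
        this.databaseUrl = databaseUrl;
        this.cacheTtlSeconds = cacheTtlSeconds;
    }

    // Builds the config from a map of environment variables.
    // DATABASE_URL and CACHE_TTL_SECONDS are hypothetical names.
    static AppConfig fromEnvironment(Map<String, String> env) {
        String url = env.get("DATABASE_URL");
        if (url == null || url.isEmpty()) {
            // Fail at startup instead of running with a missing setting.
            throw new IllegalStateException("DATABASE_URL is not set");
        }
        // Optional setting with a default value.
        int ttl = Integer.parseInt(env.getOrDefault("CACHE_TTL_SECONDS", "60"));
        return new AppConfig(url, ttl);
    }
}
```

In the application you would call AppConfig.fromEnvironment(System.getenv()); taking the map as a parameter keeps the class easy to test, and the same build can then run in dev, stage, and prod with only the environment changing.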

IV. Backing services
Treat backing services as attached resources

Backing services can be databases, storage, message queues, etc.
Treating them as attached resources means that we can replace them easily, for example moving from RabbitMQ to ActiveMQ, without changing the application code.

V. Build, release, run
Strictly separate build and run stages

Clearly define and separate the stages:

Build: Produces a deployable artifact such as a war, jar, or ear file that can be deployed to an environment
Release: Combines the build artifact with environment-specific configuration to create a release
Run: Runs a release in the execution environment, e.g. a Docker image is started as a container.

Code becomes a build, which is combined with config to create a release.
https://12factor.net/build-release-run

VI. Processes
Execute the app as one or more stateless processes

Look at your application as a stateless process. Imagine the pain of maintaining state for a scalable application (sticky sessions will hinder true scalability).
So make sure to move session management out of the process, for example into an external data store.

VII. Port binding
Export services via port binding

“The twelve-factor app is completely self-contained and does not rely on runtime injection of a webserver into the execution environment to create a web-facing service. The web app exports HTTP as a service by binding to a port, and listening to requests coming in on that port.”

https://12factor.net/port-binding

VIII. Concurrency
Scale out via the process model

“The process model truly shines when it comes time to scale out. The share-nothing, horizontally partitionable nature of twelve-factor app processes means that adding more concurrency is a simple and reliable operation.”

https://12factor.net/concurrency

IX. Disposability
Maximize robustness with fast startup and graceful shutdown

“The twelve-factor app’s processes are disposable, meaning they can be started or stopped at a moment’s notice. This facilitates fast elastic scaling, rapid deployment of code or config changes, and robustness of production deploys.”

https://12factor.net/disposability
  • Processes should strive to minimize startup time.
  • Processes should shut down gracefully.
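A graceful shutdown can be sketched in Java with an ExecutorService: stop accepting new work, let in-flight jobs finish, and force-stop only after a timeout. The GracefulWorker class and its method names are hypothetical; the pool size and timeout are arbitrary choices for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class GracefulWorker {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    void submit(Runnable job) {
        pool.submit(job);
    }

    // Stops accepting new jobs, waits for running ones, and force-stops
    // after the timeout. Returns true if everything finished in time.
    boolean shutdown(long timeoutMillis) {
        pool.shutdown();
        try {
            if (pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow();
        return false;
    }

    // Registers the graceful stop so it runs when the JVM exits,
    // for example when the process receives SIGTERM.
    void installShutdownHook(long timeoutMillis) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> shutdown(timeoutMillis)));
    }
}
```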

X. Dev/prod parity
Keep development, staging, and production as similar as possible

You might have seen issues where something that was working and tested in a lower environment suddenly starts showing erroneous behavior in production. This can be due to a mismatch in tool or library versions. To avoid such situations, it is recommended to keep all environments as similar as possible.

XI. Logs
Treat logs as event streams

Logs are the lifeline of any application, and they become even more important in a distributed system (say there is an issue; you need to know where the problem is in a system which might consist of tens of applications). But log handling should not be the responsibility of the application code. Instead, all logs are streamed out to a specialized log management system like Splunk or ELK.

XII. Admin processes
Run admin/management tasks as one-off processes

There can be some one-time maintenance tasks like database migration or backup, report generation, a maintenance script, etc. The idea is to keep these tasks independent of the core application and handle them separately.

Cloud Native Application Design – Strangler Pattern

In his post, Martin Fowler talks about strangler figs:

“They seed in the upper branches of a tree and gradually work their way down the tree until they root in the soil. Over many years they grow into fantastic and beautiful shapes, meanwhile strangling and killing the tree that was their host.”

https://martinfowler.com/bliki/StranglerFigApplication.html

This gives a name to a prevalent development pattern for moving an existing monolithic application to a new microservices-based cloud-native application. You do not make the change in one go. Instead, you start small: take a part or functionality from the application, move it to a new cloud-native microservice, and then remove that piece from the existing application. Step by step, the old application is completely replaced by a fresh cloud-native, microservice-based application.

There are three phases to such a transition:

Transform: Create a parallel application built with a microservices-based, cloud-native design.

Coexist: Incrementally implement features and shift traffic from the older monolithic application to the newly built cloud-native application.

Eliminate: Completely remove the older version and only maintain the new application.

https://www.amazon.in/Cloud-Native-Applications-Jakarta-Microservices-ebook/dp/B093D2QMF8
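During the coexist phase, something has to decide whether a request goes to the monolith or to the new services. Below is a minimal sketch of such a routing facade in Java; the StranglerRouter class, the path prefixes, and the backend names "new-service" and "legacy-monolith" are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class StranglerRouter {
    private final Set<String> migratedPrefixes = new LinkedHashSet<>();

    // Marks a feature (identified by its path prefix) as moved to the new application.
    void migrate(String pathPrefix) {
        migratedPrefixes.add(pathPrefix);
    }

    // Returns the backend that should serve the given request path.
    String route(String path) {
        for (String prefix : migratedPrefixes) {
            if (path.startsWith(prefix)) {
                return "new-service";
            }
        }
        // Anything not yet migrated is still served by the old application.
        return "legacy-monolith";
    }
}
```

As more prefixes are migrated, less traffic reaches the monolith; when nothing routes to it anymore, it can be eliminated.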

Cloud Native Application Design – SAGA design pattern

When developing cloud-native applications using microservices, there might be times when you need to manage a transaction that spans a set of microservices working with each other to complete a process. For example, consider placing an order on an e-commerce website.

The Saga design pattern is a way to manage data consistency across microservices in distributed transaction scenarios. A saga is a sequence of transactions that updates each service and publishes a message or event to trigger the next transaction step. If a step fails, the saga executes compensating transactions that counteract the preceding transactions.

https://learn.microsoft.com/en-us/azure/architecture/reference-architectures/saga/saga

Choreography based SAGA: Event-driven communication across various services. Each
service publishes events to the queue, which is listened to by the interested
services. The listener services will perform actions based on the data
received.

Orchestration based SAGA:

The central role in this type of implementation is played by an orchestrator.
The orchestrator service takes control of the overall communication among
various services.
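An orchestration-based saga can be sketched in Java as follows. This is a simplified, in-process illustration: SagaStep and SagaOrchestrator are hypothetical names, a step signals failure by throwing a RuntimeException, and in a real system each action and compensation would be a call or message to another service.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// A saga step pairs an action with the compensation that undoes it.
final class SagaStep {
    final String name;
    final Runnable action;
    final Runnable compensation;

    SagaStep(String name, Runnable action, Runnable compensation) {
        this.name = name;
        this.action = action;
        this.compensation = compensation;
    }
}

final class SagaOrchestrator {
    // Runs the steps in order. If a step fails, the already completed
    // steps are compensated in reverse order and false is returned.
    static boolean run(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.action.run();
                completed.push(step);
            } catch (RuntimeException e) {
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }
}
```

For an order flow with the steps create order, charge payment, and reserve inventory, a failure in the inventory step would trigger a refund and then an order cancellation, in that order.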

Reference: https://www.amazon.in/Cloud-Native-Applications-Jakarta-Microservices-ebook/dp/B093D2QMF8

REST APIs Naming – Beyond CRUD

I am not planning to write yet another article on best practices for REST APIs, as the topic has been covered many times already. What better than Google’s reference document: https://cloud.google.com/apis/design/naming_convention

Here I would like to discuss some cases which are not straightforward. But before going there, it makes sense to revisit some basic concepts.

REST stands for Representational State Transfer. It lets one manage the state of a resource.

What is a resource?

From https://restfulapi.net/

The key abstraction of information in REST is a resource. Any information that we can name can be a resource. For example, a REST resource can be a document or image, a temporal service, a collection of other resources, or a non-virtual object (e.g., a person).

The state of the resource, at any particular time, is known as the resource representation.

The resource representations consist of:

  • the data
  • the metadata describing the data
  • and the hypermedia links that can help the clients in transition to the next desired state.

When we say REST can help manage resources (CRUD operations), it is done through the following HTTP methods:

  • POST for Create
  • GET for Read
  • PUT and PATCH for Update
  • DELETE for Delete

There are other methods like OPTIONS and HEAD, but we will focus on the core CRUD operations mentioned above.

To get started, let's take a simple use case where we have a resource Employee.

The generic URL format will look like /{baseurl}/{service or microservice}/{resource}

For example https://api.kamalmeet.com/employee-management/employees

  • Get the list of employees: GET /employees
  • Get a specific employee: GET /employees/{id}
  • Create a new employee: POST /employees
  • Update an employee: PATCH or PUT /employees/{id}
  • Delete an employee: DELETE /employees/{id}

That was the easy part.

Let us talk about some complex cases now, which are not straightforward to fit into REST naming conventions.

Fetch Related resources for the object

/employees/{id}/projects/

Controller verb for a special operation

/users/{id}/cart/checkout

Complex resources representation

Get only specific orders (dashes are acceptable)
/users/{id}/pending-orders/

Fetch only specific columns

/employees/?fields={name, department, salary}
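On the server side, the fields parameter can be applied as a simple projection over the resource. Here is a minimal sketch in Java, assuming the field names arrive as a comma-separated string without the surrounding braces; FieldSelector is a hypothetical helper, not part of any framework.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FieldSelector {
    // Keeps only the requested fields. A null or blank parameter
    // returns the full resource; unknown field names are ignored.
    static Map<String, Object> select(Map<String, Object> resource, String fieldsParam) {
        if (fieldsParam == null || fieldsParam.trim().isEmpty()) {
            return resource;
        }
        Map<String, Object> result = new LinkedHashMap<>();
        for (String field : fieldsParam.split(",")) {
            String key = field.trim();
            if (resource.containsKey(key)) {
                result.put(key, resource.get(key));
            }
        }
        return result;
    }
}
```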

Complex searches (reports)

/search/?params={} 
/reports/absentreport

Complex listing

/myorders

The above are some acceptable practices; adapt them to your needs.

Design: Netflix

Designing or architecting a system is a complex task. One needs to think of various aspects that can impact a system. At a high level, we bucket the requirements into two parts: Functional and Non-Functional. Functional requirements, in simple words, can be thought of as the functionalities one needs to build. Non-functional requirements can be complex, as they are usually not called out explicitly; as an architect, you need to figure them out through discussions with various stakeholders.

Reference: https://kamalmeet.com/system-design-and-documentation/steps-for-designing-a-system-from-scratch/

In this post, I will try to look at the system design for Netflix. Of course, it is a complex system and difficult to cover in one post, but I will try to touch upon the important aspects.

Functional Requirements:

  • Account Management: Create Account/ Login/ Manage and Delete the Account
  • Subscription Management
  • Search
  • Watch a Video: View/ Download for offline viewing
  • Recommendations: User-based/ Generic/ Top trends/ Genre
  • Device Synchronization
  • Language Selection: Audio/ Video

Non-Functional Requirements:

  • Performance: Realtime streaming performance
  • Reliability
  • Availability
  • Scalability
  • Durability

Data needed:

  • number of users
  • daily active users
  • the average number of videos watched per day/ per user
  • the average size of the video
  • number of videos total/ uploaded per day

Let me borrow the high-level architecture image

https://www.linkedin.com/pulse/system-design-netflix-narendra-l/?published=t

Microservices-based architecture: Netflix is an early adopter of microservices and helped popularize their use. Microservices help Netflix manage its critical services by keeping them stateless, secured, scalable, available, and reliable.

CDN or Content Delivery Network: In the image above we see Open Connect, which is Netflix’s CDN. For any application which has consumers across multiple geographies, a CDN is an important piece. It helps deliver content like images, videos, JavaScript, and other files from a location nearest to the user, which improves performance. In addition, Netflix provides Open Connect Appliances to ISPs free of cost, which helps ISPs save bandwidth and helps Netflix cache content for better performance.

Transcoding: Any video uploaded to Netflix is converted into versions of various resolutions. The video is placed on a queue, from where it is picked up by transcoder workers, which convert it and upload the results to AWS S3. When a user clicks on a video to be played, the best option is chosen based on the client and bandwidth.

API Gateway: Zuul is the API gateway used by Netflix. It provides gateway features like security, authentication, routing, request decoration, beta testing (based on routing), etc.

Resiliency: Hystrix is a resiliency library by Netflix. It handles scenarios like timeouts, failing fast by rejecting requests when the thread pool is full, tripping a circuit breaker when the error rate is high, falling back to a default response, etc.
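The circuit breaker and fallback behaviour mentioned above can be illustrated with a small Java sketch. This is not the API of Netflix's library; SimpleCircuitBreaker is a hypothetical, single-threaded simplification that trips after a number of consecutive failures (rather than on an error rate) and allows a trial call after a cool-down period.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical, non-thread-safe circuit breaker for illustration only.
final class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock;
    private int consecutiveFailures = 0;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    // Runs the action, or returns the fallback immediately while the circuit is open.
    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (openedAt >= 0 && clock.getAsLong() - openedAt < openMillis) {
            return fallback.get(); // fail fast without touching the dependency
        }
        try {
            T result = action.get();
            consecutiveFailures = 0;
            openedAt = -1; // a successful call closes the circuit
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = clock.getAsLong(); // trip (or re-trip) the circuit
            }
            return fallback.get();
        }
    }
}
```

In an application the clock argument would typically be System::currentTimeMillis; passing it in makes the cool-down behaviour testable without sleeping.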

Cache: Netflix uses EVCache to provide performance, reduced latency, better throughput, and reduced overall cost. EVCache is a custom implementation based on Memcached, which does not depend only on RAM and can also use SSDs.

Database: Netflix uses MySQL for data that needs ACID properties, such as user data. Read replicas are used to improve query performance. Cassandra is used as the NoSQL store for data like browsing and watching history. Older history data can be moved to a compressed, cheaper data store.

Logs Management: All log data is sent to Chukwa through Kafka. You can view logs on the dashboard. Finally, logs can be sent to S3 for further retention and usage.

Search: Elasticsearch is used for indexing and searching.

Recommendations: Spark is used for data analysis. It helps rank content based on a user's history as well as data from users with similar tastes. For example, if two users have given similar ratings to a movie, their tastes might be similar. Also, if a user mostly watches comedy content, the recommendation engine might suggest more comedy content.


Find Longest Common Prefix

Solving any problem requires one to analyze various available solutions and then look at factors like space complexity, time complexity, and ease of development.

Problem statement: Find the longest common prefix among N strings.

Example: “kamal”, “kamalmeet”, “kamaljeet” should return “kamal”

Solution 1- Vertical Scanning

Approach: Start with the first character of each string and compare for equality. Continue until we reach a position where the characters differ.

Pseudo code:

-- for i=0 to (length of the first string - 1)
---- check if the ith character is equal for all strings
------ if not, return the substring of characters 0 to i-1
-- return the first string

Code:

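    public String longestCommonPrefix(String[] strs) {
        // guard against null or empty input, for which strs[0] would fail
        if (strs == null || strs.length == 0) {
            return "";
        }
        for (int index = 0; index < strs[0].length(); index++) {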
            char ch = strs[0].charAt(index);
            for (String st : strs) {
                // one of the strings has ended or char mismatch found
                if (index >= st.length() || st.charAt(index) != ch) {
                    return strs[0].substring(0, index);
                }
            }
        }
        // if you reached here, first string is longest common prefix
        return strs[0];
    }

Time Complexity: O(S), where S is the sum of all string lengths. The worst case occurs when all strings are equal.

Space Complexity: O(1) no additional data structure

Solution 2 – Horizontal Scanning

Approach: Start by comparing the first two strings and find their longest common prefix. Use this as input and compare it with the next string, and so on.

Pseudo code:

-- longestprefix = str[0]
-- for strings in i=1 to N
---- longestprefix = findlongestprefix(longestprefix, str[i])
-- return longestprefix

Time Complexity: O(S)

Space complexity: O(1)
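Code: one possible Java version of the pseudo code above. The class name HorizontalScan and the helper commonPrefix (standing in for findlongestprefix) are names chosen for this sketch.

```java
final class HorizontalScan {
    static String longestCommonPrefix(String[] strs) {
        if (strs == null || strs.length == 0) {
            return "";
        }
        String prefix = strs[0];
        // Shrink the running prefix against each remaining string;
        // stop early once nothing is left in common.
        for (int i = 1; i < strs.length && !prefix.isEmpty(); i++) {
            prefix = commonPrefix(prefix, strs[i]);
        }
        return prefix;
    }

    // Longest common prefix of two strings.
    static String commonPrefix(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        int i = 0;
        while (i < limit && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return a.substring(0, i);
    }
}
```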

Solution 3- Divide and Conquer

Approach: Break down the array of strings into two equal parts, solve for the two subarrays, and then find the common prefix of the two results (repeat the process at each step).

Longest common prefix using divide and conquer
source: https://www.geeksforgeeks.org/longest-common-prefix-using-divide-and-conquer-algorithm/
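Code: a minimal Java sketch of the divide and conquer approach. The class and method names are chosen for this example and are not taken from the linked article.

```java
final class DivideAndConquer {
    static String longestCommonPrefix(String[] strs) {
        if (strs == null || strs.length == 0) {
            return "";
        }
        return solve(strs, 0, strs.length - 1);
    }

    // Solves the problem for strs[low..high] by splitting the range in half.
    private static String solve(String[] strs, int low, int high) {
        if (low == high) {
            return strs[low]; // a single string is its own prefix
        }
        int mid = low + (high - low) / 2;
        String left = solve(strs, low, mid);
        String right = solve(strs, mid + 1, high);
        return commonPrefix(left, right);
    }

    // Longest common prefix of two strings.
    private static String commonPrefix(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        int i = 0;
        while (i < limit && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return a.substring(0, i);
    }
}
```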

Additional Approaches

Azure: Region pairs, Billing and Subscription hierarchy

Some old notes from Azure

Screenshot of the hierarchy for objects in Azure.
https://learn.microsoft.com/

Azure region pairs

Availability zones are created by using one or more datacenters. There’s a minimum of three zones within a single region. It’s possible that a large disaster could cause an outage big enough to affect even two datacenters. That’s why Azure also creates region pairs.

What is a region pair?

Each Azure region is always paired with another region within the same geography (such as US, Europe, or Asia) at least 300 miles away. This approach allows for the replication of resources (such as VM storage) across a geography that helps reduce the likelihood of interruptions because of events such as natural disasters, civil unrest, power outages, or physical network outages that affect both regions at once. If a region in a pair was affected by a natural disaster, for instance, services would automatically failover to the other region in its region pair.

Customize billing to meet your needs

If you have multiple subscriptions, you can organize them into invoice sections. Each invoice section is a line item on the invoice that shows the charges incurred that month. For example, you might need a single invoice for your organization but want to organize charges by department, team, or project.

Depending on your needs, you can set up multiple invoices within the same billing account. To do this, create additional billing profiles. Each billing profile has its own monthly invoice and payment method.

The following diagram shows an overview of how billing is structured. If you’ve previously signed up for Azure or if your organization has an Enterprise Agreement, your billing might be set up differently.

Flowchart-style diagram showing an example of setting up a billing structure where different groups like marketing or development have their own Azure subscription that rolls up into a larger company-paid Azure billing account.
https://learn.microsoft.com/en-us/training/modules/azure-architecture-fundamentals/management-groups-subscriptions

Hierarchy of management groups and subscriptions

You can build a flexible structure of management groups and subscriptions to organize your resources into a hierarchy for unified policy and access management. The following diagram shows an example of creating a hierarchy for governance by using management groups.

Diagram showing an example of a management group hierarchy tree.
https://learn.microsoft.com/en-us/training/modules/azure-architecture-fundamentals/management-groups-subscriptions

You can create a hierarchy that applies a policy. For example, you could limit VM locations to the US West Region in a group called Production. This policy will inherit onto all the Enterprise Agreement subscriptions that are descendants of that management group and will apply to all VMs under those subscriptions. This security policy can’t be altered by the resource or subscription owner, which allows for improved governance.

Cloud Native Application Design – Challenges with Microservices

In the previous post, I talked about the benefits of microservices, but the discussion would be incomplete without talking about some of the important challenges one should expect when going for a microservices-based architecture.

Microservices provide a lot of benefits, but they are definitely not a silver bullet. If not architected properly, the design can cause more pain than benefit. Here are some of the considerations one needs to keep in mind when designing an application with microservices.

Too few/ too many services: The first question one needs to deal with when going for microservices-based architecture is how to break the application into microservices. Too many microservices would mean you are unnecessarily complicating the system and too few would mean you are not getting the benefits of a microservice-based design.

Complex DevOps: Unlike a monolithic application, where you deploy just a single application, you are now dealing with dozens of microservices. Each service needs its own build and deployment pipeline, which means independent management and tracking.

Monitoring: As multiple services are communicating with each other to make the application work, a single failure can impact the overall success. Hence it is important to monitor all services, which means a complex dashboarding, alerting, and monitoring system in place to keep a check on all pieces.

Multiple Tech Stacks: One of the advantages we get with microservices is the independence to choose a tech stack for each piece, but too many tech stacks make it difficult for teams to support each other and lead to shallow expertise in each technology.

Managing Data: Another challenge with microservice-based design is managing the data. As a rule of thumb, each microservice should manage its own data. But this can get tricky as sometimes microservices need to share data. If not managed properly one can run into a problem of duplicate sources of data or performance issues in fetching data from other services.

Design for Accessibility – 2

In the last post, I introduced the concept of accessibility. Here I will discuss testing your application for accessibility. The good thing is that we have many tools that can help us with basic accessibility testing.

The first level of accessibility testing covers keyboard navigation (does your application support tab/arrow navigation?) and screen readers. Major operating systems come with built-in screen readers: Windows has Narrator and Mac has VoiceOver. There are also third-party readers like JAWS and NVDA.

Additionally, there are tools to help with accessibility testing. Some of the common ones are the following:

Chromium Developer Tools: Chromium has built-in support for checking accessibility and generating reports for a page. Refer to https://developer.chrome.com/docs/devtools/accessibility/reference/

Accessibility Insights: Another tool that can help with testing and generating a detailed report for an application. Refer to https://accessibilityinsights.io/. It gives not only details of the issues but also suggestions on fixes. Here is a sample report:

Report generated by Accessibility Insights
https://techblog.topdesk.com/accessibility/testing-accessibility-with-accessibility-insights/

Additional Tools Worth Exploring