Tag Archives: Architecture

Open-Closed principle Revisited

Reference: http://kamalmeet.com/system-design-and-documentation/understanding-openclosed-principle/

Open closed principle states that your classes should be open for extension but closed for modification. One way to look at it is that when you provide a library or a jar file to a system, you can ofcourse use the classes or extend the classes, but you cannot get into the code and update it.

At a principle level, this means you should code in a manner that you never need to update your class once code. One major reason behind this principle is that you have a class which is reviewed and Unit tested, you would not like someone to modify and possibly corrupt the code.

How do I make sure that my class follow open closed principle?

Let’s look at a design of this MyPizza class

public class MyPizza {
public void createPizza(Pizza pizza)
//create a cheese pizza
else if(pizza.type.equals("Veg"))
//create a veg pizza

Following pizza type classes use this

class Pizza
String type;

class CheesePizza extends Pizza{

class VegPizza extends Pizza{

The above design clearly violates the open closed principle. What if I need to add a double cheese pizza here. I will have to go to MyPizza class and update it, which is not following “closed for modification” rule.

How can fix this design?

public class MyPizza {
public void createPizza(Pizza pizza)

class CheesePizza extends Pizza{

public void create()
//do the creation here

With this simple modification we are making sure that we will need not change the code in MyPizza class even when we will add new types of pizza, as actual responsibility of creation would be with the new class being created (DoubleCheese).

Reverse Engineering: MySQL WorkBench

In last post I talked about creating sequence diagrams using MaintainJ. Another important aspect you would want to understand for a Project is the database schema design. How many tables are there? How do they interact with each other? etc.

For understanding this design the best way is to look into ER or Entity Relationship diagram. Ideally one would create the ER diagram first and then implement database.

In case we do not have a ER diagram available we can create using Reverse Engineering the database to ER diagram.

For MySQL, we can use MySQL WorkBench tool to create one.

Download the installer from https://www.mysql.com/products/workbench/

Once installed, you can connect to you mysql database in workbench. Then in Database Tab at the top, select Reverse Engineer option, and select the schema you want to reverse engineer.

Reverse Engineering: MaintainJ

The best way to analyze the code with hundred of Java classes is to look into the documentation, class diagrams, sequence diagrams etc to understand the flow and usage. Unfortunately there are times when you would not be provided with any such documentation.

Reverse Engineering tools can be of help upto some level. MaintainJ is one such tool to help you with Java.

So if you have a working codebase for a web application, which you need to analyze, here are the steps to go ahead.

1. Download the MaintainJ war file from http://maintainj.com/userGuide.jsp?param=install
2. Add the war file to the server where main application (to be analyzed is available), for example if you project war file is added to tomcat – tomcat/webapps, add the MaintainJ.war
3. Now if you will visit the link to server like http://localhost:8080/MaintainJ/, it will let you provide the package to be traced and directory where output file to be added.
4. It will provide simple settings to be added to catalina.sh (or other server config),
5. Once all settings done, restart the server.
6. Go to MaintainJ link and start tracing.
7. Now browse through the actual app, MaintainJ will create sequence diagrams to the directory where you have provided the path.

You can view the ser file created by MaintainJ in eclipse by adding MaintainJ plugin to eclipse. Create a new project of MaintainJ trace type and copy generated ser files into this project in a folder.

A good overall demo is provided – http://maintainj.com/userGuide.jsp?param=overviewDemo

Shared Nothing vs Shared Everything

In database cluster implementation we can have multiple ways to make sure how different nodes will communicate with each other.

Shared nothing approach: None of the nodes will use others memory or storage. This is best suited for the solutions where inter node communication is not required, i.e. a node can come up with a solution on its own.

Shared Memory: In this approach memory is shared, i.e. each node/ processor is working with same memory. This is used when we need nodes to share solutions/ calculations done by other nodes and are available in memory.

Shared Everything: In this approach nodes share memory plus storage. This makes sense when nodes are working on problem where calculations and data created/ used by node is dependent on others.

Further Reads:


Message Oriented Middleware

In last post I talked about what is middleware, I will focus on message implementation of same today. Message oriented middleware or MOM mostly uses message queues to send and receive data between two systems.

In simple terms, a message is anything that is being sent from one system to another. Mostly MOM uses XML formats, sometimes SOAP based requests or plain texts. An example MOM system will send message to a  message queue or MQ, from where the receiver will pick up the message.

Advantages of Message Oriented Middleware

  1. Persistence: In normal client-server architecture, we will need to make sure both the systems to be available to have a successful communication. Whereas if we are using MQs, one system can still send messages even if the second is down.
  2. Support for Synchronous and Asynchronous communication: by default the communication is asynchronous but we can implement a synchronous system where a request message sender will wait for the response from other party.
  3. Messages can be consumed at will: If one system is busy when messages are received (which do not need immediate response), it can consume the messages when load is less. For example, a lot of systems are designed to consume the messages at non business hours.
  4. Reliability: As messages are persistent, threat of losing information is low even if one or more systems are not available. Additional security mechanism can be implemented in MQ layer.
  5. Decoupling of systems: Both client and server work independently, and often do not have knowledge for other end. System A creates a message and adds to message queue, without concerning who will pick it up as long as it gets the response message (if required). So one system can be written in Java and other can be in Dot Net.
  6. Scalability: As both machines involved in interaction are independent of each other, it is easier to add resources at either end without impacting the whole system.
  7. Group communication: Sender can send message to multiple queues or same queue can have multiple listeners. In addition Publisher- Subscriber approach can help broadcast a message.

Types of Messaging:

Point to Point: This is a simple messaging architecture where a sender will directly send a message to receiver through message queue.

Publisher-Subscriber (Pub-Sub): This type of communication is required when sender wants to send messages to multiple receivers. Topics are defined to which subscriber can subscribe and receive requests based on same. For example, say a core banking system can trigger messages on various events like new account open, a withdrawal is made, interest rate changed etc. For an event, multiple other systems might want that information to take an action, so say for all withdrawal events, systems like fraud detection, mobile messaging system, daily reporting system, account maintenance system subscribe. Whenever, publisher publishes the message to “Withdrawal” topic, all of these systems will receive the message and take appropriate action.

Difference between web server and application server

We  use terms application server and web server interchangeably these days. Historically a web server was supposed to serve static html pages over http protocol. Then web servers started serving scripting languages like ASP, JSP, PHP etc. For complex application which needed services connection pooling, object pooling, messaging, transaction management were supposed to use application server.

I said historically because with time the distinguishing line has narrowed down. Most of the web servers have started providing features of application servers with plugins and add-ons. Some exmples of web servers- Apache HTTP (PHP), Tomcat (Servlet container, newer versions have EJB container) etc. Application server examples are JBoss, Websphere etc.


A cluster in simple terms is group of similar things, in this case computers or servers. A more refined explanation would be that a cluster is group of computers, working together in such a way that for end user it is one single machine. This is close to what I discussed about implementation of virtualization, so yes clustering is a form of virtualization.

But when we are strictly talking about software architecture, we are actually talking about using cluster for load balancing or handling failover. For a simple web application, this would mean creating 2 or more similar server machines, which are maintained in a cluster. There is a single point of entry which dictates which server from cluster should fulfill incoming request. This is load balancing. The server at the entry point can use any algorithm like round robin or check the actual load on a server to assign a request to one of the servers in the cluster. At the same time if one of the machine goes down in cluster for some reason, other servers can share the load and end user will never know about the problem occurred at backend.



Grid Computing

What is a Grid? In definition it is nothing but some horizontal and vertical lines cutting each other. Something like


How is that related to grid computing? Not actually. But if you think that there is a computer node available at each intersection of horizontal and vertical line, we will get close.

What exactly is grid computing? Think of the fact there are many average computers connected to each other and can share their processing power, how fast it can deliver solution. It is form of implementation of distributed computing and parallel computing. It can help using many average sized computer and giving impression on a super computer processing power.