Category Archives: Design

Design before you code

Somehow, with the penetration of agile development practices, I have been observing that less and less time is spent on designing the solution. Engineers treat Agile as a license to develop without design.

What is the problem with developing without designing or architecting the system first? I still remember when I was in college, my professor drew a parallel between architecting a building and architecting a software. Would you start building a house without thinking about design? you will not just start placing bricks without considering how many rooms you need? Where all these rooms go? How large should be every room? Where will the kitchen and bathroom be? Where all the wiring and plumbing will go? You will come up with the design, put in on a paper or a software and then calculate the feasibility.

Think of the complications that can happen if you build your house without thinking about design first. One room might be so big that there is very little space for the others. Or you are not left with any space to create a bathroom. These mistakes are costly. And that brings me to the other important factor which is causes ignoring of design in software. The manager thinks that change at a later state is going to be easy. At the end it is code, we can just change it. Well, changing the code is definitely possible, but never easy or cheap. It comes with its own cost.

More than often, where developers need to make changes at a later stage, they would be more inclined to apply quick fixes or hacks rather than making a bigger correct change. Of course, you have a tried and tested code in hand, why would you make too many changes to it, even if you know that is the right thing. And then there is developer ego, I mean, let’s admit it, it is not easy to accept one’s fault, especially if that means you would need to put in extra hours to fix that. More than ego, it is actually denial at times. People would try to stick to the solution they developed initially even though it is realized at a later stage that there could have been a better solution. Reason being, you have already invested too much.

This situation can be avoided if we think about design first. We can make sure our design covers all the possible requirements. It will not be possible to anticipate all the changes that are going to come at a later stage, but we can try to keep our design flexible. The idea is, set the ground rules, understand what all component and services we are required to create? What all data needs to be persisted and how are we going to do that? How security and error handling will be done? We need not get into implementation details, but the high-level design is a good start. We can fill in the details as and when requirements are clearer and we start actual work on the given piece of requirement.

In the end, all I would like to highlight is, if you think you are saving time by not thinking about the design of the application before you start to code, you are going to end up spending more time applying patches and fixes at the end.

Technology Agnostic Design

When developing high level design for a solution, it is not a good idea to think about technology choices. We need to keep a gap between design and implementation.

You should not finalize at this point if the solution is going to be build in Java, Python, NodeJS or PHP at this time. For example, you just define that there will be a employee service to provide employee data, but which language will be used to build it, is not part of high level design.

You should take a call what kind of database is well suited, will it be a RDBMS or a document based database, but we should not take a decision which specific vendor’s database we are going to use at this point. For example, we will take a call that we will use RDBMS, but will it be Oracle, mySQL, postgres or some other vendor provided DB, we will decide when we will think about implementation details.

Similarly, all your vendor and technology decisions will not be part of high level design. The details should be filled at a later point, once you have finalized your high level design and made sure you have all the components, services and communications identified.

Why should we not think about technology choices while building high level design? Because it limits our design and solution. Because strengths and weaknesses of a technology becomes strength and weakness of our solution. Because strengths and weaknesses of technologies change over time, hence our architecture should be independent of that.

Once you bring in vendor and technology at a high level design, you commit too much to that. For example say XYZ RDBMS provider is currently the best in market as they provide fastest operations. You design your architecture around that, you use vendor specific data structures and data types. In future, if there is a vendor providing similar services at a cheaper price, we will figure out making a change is very costly as we have to change too much of code. If we would have thought of RDBMS as just a plug and play provide, we could have made this change easily. Infact, keeping our architecture technology agnostic will force us to think beyond a vendor. We will need to think ways of improving our database performance, think of better indexing, sharding, caching, making our architecture robust and independent of technology and vendors.

https://bigmedium.com/ideas/links/managing-technology-agnostic-design-systems.html

https://www.infoq.com/news/2007/09/technology-agnostic-soa

http://bradfrost.com/blog/post/managing-technology-agnostic-design-systems/

SOLID Principles for object oriented design

There are many best practices and principles figured out by developers and architects for object oriented design. Robert Martin has intelligently put a subset of these good practices together, and gave them acronym SOLID which helps easy remembrance.

Single responsibility principle: A class should handle only one single responsibility and have only one reason for change. For example a class “Employee” should not change if there a change in project or some reporting details.

Open Closed principle: Code should be open for extension but closed for modification. If you want to add a new type of report in the system, you should not be changing any existing code. More here

Liskov substitution principle: “objects in a program should be replaceable with instances of their subtypes without altering the correctness of that program.” So if we have Employee class, which is extended by Manager. We should be able to use Manager instead of Employee and all the Employee methods like calculate Salary, generate annual report etc should work without any issues. Say if there is an object like “ContractWorker” that does not support a few functions of Employee like annual report, one should be careful not to make it subtype of Employee.

Interface Segregation principle: “no client should be forced to depend on methods it does not use”. Coming back to previous example, if “ContractWorker” does not need to support annual report, we should not force it to implement an iEmployee interface. We should break the interfaces say iReport and iEmployee, iEmployee can extent iReport and iContractWorker should implement only iReport. iReport can further be divided into reporting types if required.

Dependency Inversion principle: This one seems to be one of my favorite as I have written about it here, here, here and here. This one indeed is one of the most important design patterns which can be followed to make the code loosely coupled and hence making it more maintainable (golden rule- low coupling + high cohesiveness). In traditional programming, when a high level method calls a low level method, it needs to be aware of the low level method at compile time, whereas using DI we can make high level method depend on an abstraction or interface and details of implementation will be provided at run time, hence giving us freedom to use which implementation to be used. Coming back to my previous example, I can have multiple implementations of Employee Reporting, iReport. Some implementation need and excel report, other might need a PDF reporting, which can be decided at runtime.

Generating ER diagram from database -2

Sometime back I wrote about DBvisualizer to generate schema ER design from database.

Here is another way by using schemaspy.

http://schemaspy.sourceforge.net/

This is a simple java based tool/ jar file. As per example given in link above, all you need to run the jar file providing database access details.

java -jar schemaSpy.jar -t dbType -db dbName [-s schema] -u user [-p password] -o outputDir 

You might want to give database drivers jar file path. For example, for Postgres

java -jar /home/kamal/pathto/schemaSpy_5.0.0.jar -t pgsql -db dbnamehere -s public -u dhusername -p dbpassword -host localhost -port 5432  -o /home/kamal/outputdir -dp /home/kamal/pathto/postgresql-9.3-1104.jdbc4.jar

Data Modeling at different levels

When you are designing database for an application, there can be 3 core levels at which you can design your database.

1. Conceptual Level: At this level you are only aware of high level entities and their relationships. For example you know that you have “Employee” Entity who “works for” a “Department” and “has” an “Address”. You are not worried about details.

2. Logical Level: You try to add as much details as possible, without worrying about how it will actually be converted to a physical database structure. So will provide any attributes for “Employee” i.e. Id, FirstName, LastName, AddressId, Salary and define primary and foreign key relations.

3. Physical Level: This is the actual representation of your database design with exact column names, types etc.

database

More info- http://www.1keydata.com/datawarehousing/data-modeling-levels.html

Open-Closed principle Revisited

Reference: http://kamalmeet.com/system-design-and-documentation/understanding-openclosed-principle/

Open closed principle states that your classes should be open for extension but closed for modification. One way to look at it is that when you provide a library or a jar file to a system, you can ofcourse use the classes or extend the classes, but you cannot get into the code and update it.

At a principle level, this means you should code in a manner that you never need to update your class once code. One major reason behind this principle is that you have a class which is reviewed and Unit tested, you would not like someone to modify and possibly corrupt the code.

How do I make sure that my class follow open closed principle?

Let’s look at a design of this MyPizza class

public class MyPizza {
public void createPizza(Pizza pizza)
{
if(pizza.type.equals("Cheese"))
{
//create a cheese pizza
}
else if(pizza.type.equals("Veg"))
{
//create a veg pizza
}
}
}

Following pizza type classes use this

class Pizza
{
String type;
}

class CheesePizza extends Pizza{
CheesePizza()
{
this.type=”Cheese”;
}
}

class VegPizza extends Pizza{
VegPizza()
{
this.type=”Veg”;
}
}

The above design clearly violates the open closed principle. What if I need to add a double cheese pizza here. I will have to go to MyPizza class and update it, which is not following “closed for modification” rule.

How can fix this design?

public class MyPizza {
public void createPizza(Pizza pizza)
{
pizza.create();
}
}


class CheesePizza extends Pizza{
CheesePizza()
{
this.type="Cheese";
}

public void create()
{
//do the creation here
}
}

With this simple modification we are making sure that we will need not change the code in MyPizza class even when we will add new types of pizza, as actual responsibility of creation would be with the new class being created (DoubleCheese).

Message Oriented Middleware

In last post I talked about what is middleware, I will focus on message implementation of same today. Message oriented middleware or MOM mostly uses message queues to send and receive data between two systems.

In simple terms, a message is anything that is being sent from one system to another. Mostly MOM uses XML formats, sometimes SOAP based requests or plain texts. An example MOM system will send message to a  message queue or MQ, from where the receiver will pick up the message.

Advantages of Message Oriented Middleware

  1. Persistence: In normal client-server architecture, we will need to make sure both the systems to be available to have a successful communication. Whereas if we are using MQs, one system can still send messages even if the second is down.
  2. Support for Synchronous and Asynchronous communication: by default the communication is asynchronous but we can implement a synchronous system where a request message sender will wait for the response from other party.
  3. Messages can be consumed at will: If one system is busy when messages are received (which do not need immediate response), it can consume the messages when load is less. For example, a lot of systems are designed to consume the messages at non business hours.
  4. Reliability: As messages are persistent, threat of losing information is low even if one or more systems are not available. Additional security mechanism can be implemented in MQ layer.
  5. Decoupling of systems: Both client and server work independently, and often do not have knowledge for other end. System A creates a message and adds to message queue, without concerning who will pick it up as long as it gets the response message (if required). So one system can be written in Java and other can be in Dot Net.
  6. Scalability: As both machines involved in interaction are independent of each other, it is easier to add resources at either end without impacting the whole system.
  7. Group communication: Sender can send message to multiple queues or same queue can have multiple listeners. In addition Publisher- Subscriber approach can help broadcast a message.

Types of Messaging:

Point to Point: This is a simple messaging architecture where a sender will directly send a message to receiver through message queue.

Publisher-Subscriber (Pub-Sub): This type of communication is required when sender wants to send messages to multiple receivers. Topics are defined to which subscriber can subscribe and receive requests based on same. For example, say a core banking system can trigger messages on various events like new account open, a withdrawal is made, interest rate changed etc. For an event, multiple other systems might want that information to take an action, so say for all withdrawal events, systems like fraud detection, mobile messaging system, daily reporting system, account maintenance system subscribe. Whenever, publisher publishes the message to “Withdrawal” topic, all of these systems will receive the message and take appropriate action.