Design: URL Shortner

The problem we are trying to solve is to create a service that can take a large URL and return a shorter version, for example, say take as input and give me, a URL easy to share.

The application looks simple, but it does provide a few interesting aspects.

Database: The Main database will be used to store long URLs, short URLs, created dates, created by, last used, etc. as we can see this will be a read-heavy database and should be able to handle large datasets, a NoSQL document-based database should be good for scalability.

Data Scale:

  • Long URL – 2 KB (2048 chars)
  • Short URL – 7 bytes (7 chars)
  • Created at – 7 bytes (7 chars for epoch time)
  • last used – 7 bytes
  • created by – 16 bytes (userid)
  • Total: ~2KB

2KB * 30 million URLs per month = ~60 GB per month or 7.2 TB in 10 years

Format: The next challenge is to decide the format of the tiny URL. The decision is an easy one, Base 10 URL would give you 10^7 or 10 million combinations for a 7-character string whereas a Base 62 format will give 62^7 or 3.5 trillion combinations for 7 character string.

Short URL Generator: Another challenge to solve is how to choose a random 7 Base 62 string for each URL.

Soln 1: Use MD5 which returns a string of 20+ chars, we can take the first 7 characters. The problem here is taking the first 7 characters might lead to a collision where multiple strings have MD5 with the same first 7 characters

Soln 2: Use a counter-based approach. A counter service will generate the counter which gets converted to Base 62, making sure all requests get a unique Base 62 string. To scale it better, we will have a distributed counter generator.