Highly available and scalable URL shortening service.
Shortlink is a highly scalable and available service that allows users to shorten URLs for easy sharing, in addition to receiving various metrics on incoming hits.
- Given a URL, the service should generate a link short enough to be easily copied and shared between applications and users.
- When users access a short link, the service should redirect them to the original link.
- Users should be able to access metrics about their link redirects.
- The service should also be accessible through REST APIs by other services.
- The system should be highly available, because if the service is down, all URL redirections will start failing.
- URL redirection should happen with minimal latency.
Because high availability is one of the non-functional requirements, a microservices architecture is a good fit: responsibilities are distributed among several services, each of which can be scaled horizontally, which in turn reduces downtime. The target is 99.999% availability.
![]()

The diagram above omits the authorization, metrics, and logging services.
To write a new URL to the storage system without first querying the database to check whether it already exists, each service instance is assigned its own range of integers, for example [1-10,000], from which it generates IDs. Because the ranges never overlap, collisions between URLs are eliminated. The component that hands out the ranges could itself become a single point of failure, so ZooKeeper was chosen for this role, because it is a distributed, high-performance coordination service.
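The range idea can be sketched as follows. This is a minimal illustration, not the project's implementation: `RangeSource`, `localSource`, and `Allocator` are hypothetical names, and the in-memory `localSource` stands in for the ZooKeeper-backed coordinator only so that the example runs on its own.

```go
package main

import (
	"fmt"
	"sync"
)

// RangeSource hands out disjoint, half-open ID ranges [start, end).
// In the design described above this role is played by ZooKeeper;
// the interface is a hypothetical name used for illustration.
type RangeSource interface {
	NextRange() (start, end uint64)
}

// localSource is an in-memory stand-in for the coordinator (assumption).
type localSource struct {
	mu   sync.Mutex
	next uint64
	size uint64
}

func (s *localSource) NextRange() (uint64, uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	start := s.next
	s.next += s.size
	return start, s.next
}

// Allocator serves IDs from its current range and asks the source
// for a new range only when the current one is used up.
type Allocator struct {
	mu       sync.Mutex
	src      RangeSource
	cur, end uint64
}

func NewAllocator(src RangeSource) *Allocator {
	return &Allocator{src: src}
}

func (a *Allocator) NextID() uint64 {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.cur == a.end {
		a.cur, a.end = a.src.NextRange()
	}
	id := a.cur
	a.cur++
	return id
}

func main() {
	src := &localSource{next: 1, size: 10000}
	// Two allocators model two service instances sharing one coordinator.
	a, b := NewAllocator(src), NewAllocator(src)
	fmt.Println(a.NextID(), a.NextID(), b.NextID()) // prints: 1 2 10001
}
```

Each instance contacts the coordinator only once per range, so most writes need no coordination and no existence check against the database.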
Using the base62 encoding scheme with 7 characters, it is possible to generate approximately 3.5 trillion unique URLs.
Base62 - [A-Z, a-z, 0-9], 62 characters
62^7 = 3,521,614,606,208 possibilities
Ex: shorturl.example/a58BT17
Therefore, a hash length of 7 characters is enough to cover trillions of URLs and at the same time short enough for easy sharing.
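As a rough illustration of how an integer ID could be turned into a 7-character code, here is a minimal base62 sketch. The alphabet ordering and the helper names `encodeBase62` and `padHash` are assumptions made for this example; any fixed ordering of the 62 symbols would work.

```go
package main

import (
	"fmt"
	"strings"
)

// The ordering of the 62 symbols is an assumption for this sketch.
const alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

// encodeBase62 converts a counter value into its base62 representation.
func encodeBase62(n uint64) string {
	if n == 0 {
		return string(alphabet[0])
	}
	var buf []byte
	for n > 0 {
		buf = append(buf, alphabet[n%62])
		n /= 62
	}
	// Digits were produced least-significant first, so reverse them.
	for i, j := 0, len(buf)-1; i < j; i, j = i+1, j-1 {
		buf[i], buf[j] = buf[j], buf[i]
	}
	return string(buf)
}

// padHash left-pads a code to a fixed width with the zero symbol.
func padHash(code string, width int) string {
	if len(code) >= width {
		return code
	}
	return strings.Repeat(string(alphabet[0]), width-len(code)) + code
}

func main() {
	const capacity uint64 = 62 * 62 * 62 * 62 * 62 * 62 * 62 // 62^7
	fmt.Println(capacity)                                     // prints: 3521614606208
	fmt.Println(padHash(encodeBase62(capacity-1), 7))         // prints: zzzzzzz
	fmt.Println(padHash(encodeBase62(1), 7))                  // prints: 0000001
}
```

The largest ID that still fits in 7 characters is 62^7 - 1, which encodes to seven copies of the last symbol in the alphabet.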
This service is read-heavy, that is, it has more read requests than writes, and it doesn't have many relationships between the data. Therefore, the best option for this use case is to use a non-relational (NoSQL) storage system, which allows for data storage in a distributed manner. Consequently, the database chosen was Cassandra.
Cassandra is an open source NoSQL distributed database designed to handle large amounts of data across multiple servers, providing high read and write throughput.
Caching is an efficient way to improve the performance of reads on the system, reducing the load on the database server.
Since one of the system requirements is minimal latency when redirecting URLs, a caching service is essential, because retrieving data from the database server is comparatively slow. A cache stores each hash and its URL in an in-memory data store, allowing much faster access.
sequenceDiagram
participant User
participant WebServer
participant CacheServer
participant Database
User ->>+ WebServer: HTTP GET /{hash}
WebServer ->>+ CacheServer: get value by key {hash}
CacheServer ->>- WebServer: null or matching URL
WebServer ->>+ Database: query to retrieve original url
Note over WebServer,Database: In case of null value in cache
Database ->>- WebServer: null or matching URL
WebServer ->>- User: Response
Note over WebServer,User: Redirection or Not Found
The diagram above shows the flow of a GET request through the cache layer.
Redis was chosen as the data store for this purpose, because it provides distributed key-value caching with very low latency, among other features.
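The lookup flow in the sequence diagram can be sketched as a cache-aside read. This is an illustration under assumptions, not the project's code: `Store`, `mapStore`, and `Resolve` are hypothetical names, the in-memory maps stand in for Redis and Cassandra, and writing the value back to the cache after a miss is an assumed step that the diagram does not show.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrNotFound = errors.New("short link not found")

// Store is a hypothetical lookup interface. In the design above the cache
// would be backed by Redis and the database by Cassandra.
type Store interface {
	Get(hash string) (url string, ok bool)
	Set(hash, url string)
}

// mapStore is an in-memory stand-in so the example runs on its own.
type mapStore map[string]string

func (m mapStore) Get(hash string) (string, bool) {
	url, ok := m[hash]
	return url, ok
}

func (m mapStore) Set(hash, url string) { m[hash] = url }

// Resolve tries the cache first and falls back to the database on a miss.
// Populating the cache after a database hit is an assumption of this sketch.
func Resolve(cache, db Store, hash string) (string, error) {
	if url, ok := cache.Get(hash); ok {
		return url, nil
	}
	url, ok := db.Get(hash)
	if !ok {
		return "", ErrNotFound
	}
	cache.Set(hash, url)
	return url, nil
}

func main() {
	cache := mapStore{}
	db := mapStore{"a58BT17": "https://example.com/some/long/path"}

	url, err := Resolve(cache, db, "a58BT17") // cache miss, database hit
	fmt.Println(url, err)

	_, err = Resolve(cache, db, "missing") // absent from both stores
	fmt.Println(err)
}
```

A web handler would map a successful result to a redirect response and `ErrNotFound` to a Not Found response, matching the last step of the diagram.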
Note: follow these steps only in a development environment.
- clone the repository
git clone https://github.com/hugosrc/shortlink.git
- change directory
cd shortlink
- copy the environment variables
cp .env.example .env
- start the Keycloak container
docker compose up -d keycloak
- Access in the browser: http://localhost:8080
- Access the Administration Console
- Sign in.
user: admin password: admin
- Create a Realm
- In Resource file, click on browse file and select the shortlink-realm.json file, then press the create button
- In Clients, select link-service credentials and regenerate the client secret
- start the Cassandra container
docker compose up -d cassandra
- Run the queries below on the Cassandra database server
CREATE KEYSPACE shortlink
WITH REPLICATION = {
'class' : 'SimpleStrategy',
'replication_factor' : 1
};
CREATE TABLE shortlink.url_mapping (
hash VARCHAR,
original_url VARCHAR,
user_id UUID,
creation_time TIMESTAMP,
PRIMARY KEY (hash)
);
CREATE INDEX user_idx ON shortlink.url_mapping (user_id);
- start the Redis container
docker compose up -d redis
- start the ZooKeeper container
docker compose up -d zookeeper
To set up Kafka, it is recommended to start a cluster on Confluent Cloud, using the free service available.
- go to the Confluent Cloud website
- sign in or sign up
- create a default Kafka cluster
- create a Kafka cluster API key
- set the received values in the following environment variables:
KAFKA_BOOTSTRAP_SERVERS, KAFKA_SASL_USERNAME, KAFKA_SASL_PASSWORD
- create a new topic and set its name in the environment variable
KAFKA_METRICS_PRODUCER_TOPIC_NAME
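Before starting the server, it can help to confirm that the Kafka variables listed above are actually set. The following is a small sketch, assuming the variables are exported in the process environment; `requiredKafkaVars` and `missingVars` are hypothetical helpers and are not part of the project.

```go
package main

import (
	"fmt"
	"os"
)

// requiredKafkaVars lists the variables named in the setup steps.
var requiredKafkaVars = []string{
	"KAFKA_BOOTSTRAP_SERVERS",
	"KAFKA_SASL_USERNAME",
	"KAFKA_SASL_PASSWORD",
	"KAFKA_METRICS_PRODUCER_TOPIC_NAME",
}

// missingVars returns the names whose value is unset or empty,
// using the supplied lookup function (os.Getenv in normal use).
func missingVars(names []string, getenv func(string) string) []string {
	var missing []string
	for _, name := range names {
		if getenv(name) == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	if missing := missingVars(requiredKafkaVars, os.Getenv); len(missing) > 0 {
		fmt.Println("missing environment variables:", missing)
		return
	}
	fmt.Println("Kafka configuration looks complete")
}
```

Passing the lookup function as a parameter keeps the check easy to exercise without touching the real environment.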
With everything configured, start the server:
go run cmd/api/main.go
You can reach me on my LinkedIn