WhatsApp System Design and Chat Messaging Architecture (Part 1)

Kasun Dissanayake
Published in Nerd For Tech
8 min read · Jul 24, 2020


In this tutorial, we discuss the design of the WhatsApp application. WhatsApp is a chat-based application. Once you understand its design, you can design any chat-based application and extend it with newer features.

The key features of WhatsApp are:

  • One-to-one Chat: You can chat with your friends or with any number in your contacts.
  • Group Messaging: WhatsApp has groups. Roughly 200 people can join a single group.
  • Sent + Delivered + Read receipts: Tick marks show which stage a message has reached.
  • Online / Last Seen: Shows whether a person is online now, or when they were last online.
  • Image Sharing: Images can be shared along with messages. Video sharing works the same way.
  • Temporary / Permanent Chats: This concerns whether messages are stored somewhere permanently. Temporary chats save storage, but if you delete a chat and your friend deletes the same chat, those messages are lost forever.

One to One Chat and Sent + Delivered + Read receipts

How does one person send a message to another person? That is one-to-one chat. Let's take this step by step. You have the application installed on your phone, and you connect to WhatsApp in the cloud. The place you connect to is called a Gateway. The reason for having a Gateway is that you use an external protocol when you talk to WhatsApp, while WhatsApp's internal services may talk to each other in a different protocol. Internally you do not need as much security, and you do not need the large headers that HTTP carries. The Gateway takes care of this translation.

One-to-one Connection

Once you (Person1) connect to the Gateway, let's assume you send a message to Person2. The Gateway has to forward that message to Person2. One option is to store, in each and every Gateway box, information about which user is connected to which Gateway box. In that case, you just need a mapping from UserID to Gateway Box ID.

Mapping

In this approach the Gateway service, which is itself a microservice, stores the user-to-box mapping. This is going to be expensive.

Why Expensive?

Maintaining a TCP connection takes some memory by itself, and you want to maximize the number of connections a single box can hold. If every box also has to store the full user-to-connection mapping in a cache or elsewhere, that memory is wasted. The same information is also duplicated across all 3 servers, and the coupling is high.

What to Do?

You want to keep the connection dumb. The TCP connection should be dumb in the sense that it only takes information in and passes information out; it does not know what it is doing beyond that. The Gateway holding this dumb connection is a microservice. If you are new to microservices please refer to this article.

Here we have a couple of services, and the one that handles session-related things we can call the Session Service. There are many microservices handling different concerns, but for one-to-one chat we use the Session Service. It stores who is connected to which box. Now you can see that this information is decoupled from the Gateways.
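To make the idea concrete, here is a minimal sketch of such a session registry. It is an illustration under assumptions, not WhatsApp's actual implementation: the class name `SessionService`, its method names, and the in-memory dictionary are all hypothetical, and a real service would keep this mapping in a shared store.

```python
from typing import Dict, Optional


class SessionService:
    """Hypothetical registry that keeps the user -> gateway box mapping in one place."""

    def __init__(self) -> None:
        self._user_to_gateway: Dict[str, str] = {}

    def connect(self, user_id: str, gateway_id: str) -> None:
        # Called by a gateway when a user opens a connection to it.
        self._user_to_gateway[user_id] = gateway_id

    def disconnect(self, user_id: str) -> None:
        # Called by a gateway when the user's connection closes.
        self._user_to_gateway.pop(user_id, None)

    def gateway_for(self, user_id: str) -> Optional[str]:
        # Returns None when the user is not currently connected.
        return self._user_to_gateway.get(user_id)


sessions = SessionService()
sessions.connect("person1", "gateway1")
sessions.connect("person2", "gateway2")
print(sessions.gateway_for("person2"))  # gateway2
```

Because the gateways only call `connect` and `disconnect`, they no longer need a copy of the whole mapping.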

Use Case

When Person1 sends a message to Person2, the app calls some function with parameters (e.g. Sender ID, Receiver ID, Message, Timestamp, and so on). When Gateway1 gets this message, it simply forwards it to the Session Service, which is our microservice. The Session Service effectively acts as a router. When it gets the message, it can in parallel send a response back to Gateway1 saying that it has received the message and will send it to Person2 when possible. Gateway1 passes this response to Person1, which means you get a Sent tick in your app.

Message Sent

When the Session Service gets this message, it looks up which Gateway box the Receiver ID is connected to and routes the message to that Gateway. That Gateway then sends the message on to the receiver.

Person1 sends a message to Person2 and how the message passes through the Gateway and Service
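The flow above can be sketched in a few lines. Everything here is an assumption made for illustration: the `Message` fields follow the parameters listed earlier, while `Gateway`, `SessionService`, `push`, `handle`, and the `pending` list for offline receivers are hypothetical names. The sketch also sends the Sent acknowledgment before routing rather than truly in parallel.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Message:
    sender_id: str
    receiver_id: str
    text: str
    timestamp: float


class Gateway:
    """A dumb gateway: it only passes items to the users connected to it."""

    def __init__(self, gateway_id: str) -> None:
        self.gateway_id = gateway_id
        self.outbox: Dict[str, List[object]] = {}  # user_id -> items pushed to that user

    def push(self, user_id: str, item: object) -> None:
        self.outbox.setdefault(user_id, []).append(item)


class SessionService:
    """Hypothetical router: knows which user sits behind which gateway."""

    def __init__(self, gateways: Dict[str, Gateway], user_to_gateway: Dict[str, str]) -> None:
        self.gateways = gateways
        self.user_to_gateway = user_to_gateway
        self.pending: List[Message] = []  # messages for users who are not connected

    def handle(self, msg: Message) -> None:
        # Step 1: acknowledge to the sender's gateway. This becomes the Sent tick.
        sender_gateway = self.gateways[self.user_to_gateway[msg.sender_id]]
        sender_gateway.push(msg.sender_id, ("SENT", msg.timestamp))

        # Step 2: look up the receiver's gateway and route the message there.
        receiver_gateway_id: Optional[str] = self.user_to_gateway.get(msg.receiver_id)
        if receiver_gateway_id is None:
            self.pending.append(msg)  # receiver offline: deliver when possible
            return
        self.gateways[receiver_gateway_id].push(msg.receiver_id, msg)


gateways = {"gateway1": Gateway("gateway1"), "gateway2": Gateway("gateway2")}
service = SessionService(gateways, {"person1": "gateway1", "person2": "gateway2"})
service.handle(Message("person1", "person2", "hello", 1595548800.0))
print(gateways["gateway1"].outbox["person1"])          # [('SENT', 1595548800.0)]
print(gateways["gateway2"].outbox["person2"][0].text)  # hello
```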

NOTE: This cannot be done using plain HTTP. HTTP is a client-server protocol: the client sends requests and the server sends responses. So the server cannot initiate a message to the client over HTTP; it can only respond to a request the client has already made.

But here we need to send the message from the server to the client. To overcome this deficiency, Web app developers can implement a technique called HTTP long polling, where the client polls the server requesting new information. The server holds the request open until new data is available. Once available, the server responds and sends the new information. When the client receives the new information, it immediately sends another request, and the operation is repeated. This effectively emulates a server push feature.

Long Polling
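As a rough illustration of long polling, the sketch below simulates it inside one process, with a thread standing in for the client and a blocking queue standing in for the held-open request. There is no real HTTP here, and `LongPollServer`, `publish`, `poll`, and `client_loop` are hypothetical names chosen for the example.

```python
import queue
import threading
import time
from typing import List, Optional


class LongPollServer:
    """Simulated server that holds a poll open until data arrives or a timeout passes."""

    def __init__(self) -> None:
        self._inbox: "queue.Queue[str]" = queue.Queue()

    def publish(self, message: str) -> None:
        self._inbox.put(message)

    def poll(self, timeout: float) -> Optional[str]:
        # Blocks like a held-open request; returns None if nothing arrived in time.
        try:
            return self._inbox.get(timeout=timeout)
        except queue.Empty:
            return None


def client_loop(server: LongPollServer, received: List[str], rounds: int) -> None:
    # After every response the client immediately issues the next poll.
    for _ in range(rounds):
        message = server.poll(timeout=2.0)
        if message is not None:
            received.append(message)


server = LongPollServer()
received: List[str] = []
client = threading.Thread(target=client_loop, args=(server, received, 2))
client.start()
time.sleep(0.1)            # the client is now waiting inside poll()
server.publish("hello")
server.publish("how are you?")
client.join()
print(received)  # ['hello', 'how are you?']
```

Each delivered message costs one full request and response, which is part of why this only emulates server push.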

In our example, Person2's app can keep asking the Gateway for new messages and update the application when one arrives.

But this is not truly real-time, and real-time updates are especially important for chat applications. So HTTP polling is not a good fit, and we need another protocol over TCP: WebSockets. WebSocket is a computer communications protocol, providing full-duplex communication channels over a single TCP connection.

WebSockets allow two-way communication: the server can send messages to the client, and the client can send messages to the server. This means Person1 sends a message to Person2 and Person2 receives it. When Person2 receives the message, Person2's app sends an acknowledgment back to Gateway2 over the same connection. Gateway2 forwards it to the Session Service, saying that this message has been received. This is recorded in the database (Message, To, From, Timestamp, and other details) connected to the Session Service. The Session Service then finds out which Gateway Person1 is connected to and passes the acknowledgment through it to Person1, saying that the message has been delivered to Person2.

Now Person1's message has been delivered to Person2, and Person1 has been notified of the delivery. This means you get the Delivered tick.

Message Delivered

Now you are probably wondering how Read works. The moment Person2 opens the chat, the application sends a message to its Gateway saying that this person has read the message. This read acknowledgment is passed to Person1 the same way as the delivered acknowledgment.

Message Read
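The three ticks can be modeled as a small state that only moves forward. This is a sketch under assumptions: the names `Receipt` and `MessageStatus` are hypothetical, and the rule that a late or duplicate acknowledgment never moves the state backwards is a design choice of this example, not something the article specifies.

```python
from enum import IntEnum


class Receipt(IntEnum):
    SENT = 1       # the Session Service accepted the message
    DELIVERED = 2  # the receiver's app acknowledged it
    READ = 3       # the receiver opened the chat


class MessageStatus:
    """Tracks the receipt stage of one message as seen by the sender."""

    def __init__(self) -> None:
        self.state = Receipt.SENT

    def advance(self, new_state: Receipt) -> Receipt:
        # Assumption: acknowledgments may arrive late or twice, so never go backwards.
        if new_state > self.state:
            self.state = new_state
        return self.state


status = MessageStatus()
status.advance(Receipt.DELIVERED)
status.advance(Receipt.READ)
status.advance(Receipt.DELIVERED)  # late duplicate, ignored
print(status.state.name)  # READ
```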

Online / Last Seen

Here we look at the Last Seen feature, or the status saying that a person is online right now. Think about our previous example: Person2 wants to know when Person1 was last online. This information has to be stored somewhere. As mentioned above, WhatsApp's work is split across microservices, and this part is handled by one we can call the Last Seen Service. The Last Seen Service stores the online details of each user in a cloud database table, recording each user's last online time.

How the Last Seen Request Passes to the Service

Now the question is how the last seen timestamp is maintained for a particular user. Whenever Person1 does something, such as sending a message, reading a message, or making any other request to the server, it should be logged as an activity, and the current timestamp should be persisted in the Last Seen table. Whenever Person1 does anything, he is clearly online, so the last seen timestamp should be updated. Based on this, Person2 can be told whether Person1 is online or not.

One key point here is how to present someone's status. Say Person1 was last active 3 seconds ago. Person2 could be told that Person1 was online 3 seconds ago, but it is better to simply show Online. You can set a threshold, say 10 or 20 seconds: within the threshold show Online, and once the last activity is older than the threshold show the last seen time.

So finally, the Last Seen Service tracks the activity of each and every user and keeps their last seen timestamp up to date.

Last Seen Status
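The threshold rule can be sketched as follows. The 10-second value is one of the examples from the text; the class name `LastSeenService`, its methods, the in-memory dictionary standing in for the database table, and the exact status strings are assumptions made for illustration.

```python
from typing import Dict

ONLINE_THRESHOLD_SECONDS = 10.0  # example threshold; 20 seconds would work the same way


class LastSeenService:
    """Hypothetical service that stores one last-activity timestamp per user."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, float] = {}  # stands in for the Last Seen table

    def record_activity(self, user_id: str, now: float) -> None:
        self._last_seen[user_id] = now

    def status(self, user_id: str, now: float) -> str:
        last = self._last_seen.get(user_id)
        if last is None:
            return "unknown"
        elapsed = now - last
        if elapsed <= ONLINE_THRESHOLD_SECONDS:
            return "online"  # recent enough to show Online instead of a timestamp
        return f"last seen {int(elapsed)} seconds ago"


last_seen = LastSeenService()
last_seen.record_activity("person1", now=100.0)
print(last_seen.status("person1", now=103.0))  # online
print(last_seen.status("person1", now=160.0))  # last seen 60 seconds ago
```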

Note that some requests are sent not by the user but by the application itself. For example, when you are not using the app, it may still poll for messages or send a delivery receipt so that it can notify you whenever there is a new message. So the client should be smart: it should distinguish user activity from application-generated activity. Application-generated activity should not be reported to the Last Seen Service. User activity should be sent to the Last Seen Service so that the database is updated. That way, Person2 sees an accurate online status for Person1.
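A minimal sketch of that filter is shown below. It assumes each request carries a flag saying who originated it; the `Origin` enum and the `handle_request` function are hypothetical names, and a plain dictionary stands in for the Last Seen table.

```python
from enum import Enum
from typing import Dict


class Origin(Enum):
    USER = "user"                # e.g. sending or reading a message
    APPLICATION = "application"  # e.g. background polling or a delivery receipt


def handle_request(last_seen: Dict[str, float], user_id: str, origin: Origin, now: float) -> None:
    # Only genuine user activity updates the last seen timestamp.
    if origin is Origin.USER:
        last_seen[user_id] = now


last_seen_table: Dict[str, float] = {}
handle_request(last_seen_table, "person1", Origin.USER, now=100.0)
handle_request(last_seen_table, "person1", Origin.APPLICATION, now=500.0)
print(last_seen_table["person1"])  # 100.0
```

The background request at time 500.0 leaves the timestamp untouched, so Person1 does not appear online just because the app was polling.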

I hope you now have a basic idea of how the WhatsApp application is designed and how its features work.

So this is all about One to one Chat, Sent + Delivered + Read receipts, and Online / Last Seen. We will learn about Group Messaging and Temporary/ Permanent Chats in the next tutorial.

Thank You!


Kasun Dissanayake
Nerd For Tech

Senior Software Engineer at IFS R & D International || Former Software Engineer at Pearson Lanka || Former Associate Software Engineer at hSenid Mobile