Erik Rehn



How we built a serverless real-time chat on AWS (with API Gateway and Node.js)

Sockets are notoriously difficult to scale due to their stateful nature, so deploying them in a horizontally scalable way is not an easy task.

What are WebSockets?

WebSockets is a technology that allows for real-time, two-way communication between a client and a server over a single, long-lived connection. Unlike traditional HTTP connections, which are stateless and require a new connection to be established for each request, WebSockets enable persistent, bi-directional data transfer between the client and server.

This technology has become increasingly popular for building modern web applications that require real-time updates, such as chat applications, online gaming, and collaborative document editing tools. With WebSockets, developers can build highly interactive and dynamic web applications that provide a seamless user experience.

WebSockets are supported by all major web browsers and can be implemented using a variety of programming languages, including JavaScript, Python, and Ruby. The WebSocket protocol is designed to be efficient and lightweight, making it an ideal solution for low-latency communication between clients and servers. To sum it up, WebSockets offer a powerful and flexible tool for building modern web applications that require real-time data exchange.
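To make this concrete, here is a minimal client sketch using the standard browser WebSocket API. The endpoint URL is a placeholder, and the message format (an `action` field plus sender, recipient, and text) is an illustrative convention rather than anything mandated by the protocol:

```javascript
// serializeChatMessage builds the JSON frame we send over the socket.
// The shape of this payload is our own convention, not part of WebSockets.
function serializeChatMessage(from, to, text) {
  return JSON.stringify({ action: "sendMessage", from, to, text });
}

// Guarded so the snippet only opens a socket in a browser environment.
if (typeof window !== "undefined" && typeof WebSocket !== "undefined") {
  const socket = new WebSocket("wss://example.com/chat"); // placeholder URL

  socket.addEventListener("open", () => {
    socket.send(serializeChatMessage("bob", "alice", "Hi Alice!"));
  });

  socket.addEventListener("message", (event) => {
    console.log("received:", event.data); // incoming frames arrive here
  });

  socket.addEventListener("error", () => {
    // handle connection failures (retry/backoff) in a real client
  });
}
```

The same single connection carries traffic in both directions: the client pushes frames with `send`, and the server pushes frames that arrive via the `message` event.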

The problem with horizontal scaling

Now, let's imagine you are trying to scale WebSocket servers horizontally using an ALB (Application Load Balancer). Each new connection will be assigned to one of the instances behind the ALB.

Imagine that Bob connects to the service and wants to send a real-time message to Alice. When Bob connects, he will set up a socket connection to "Instance 1" and when Alice connects, she will set up a socket connection to "Instance 2".

However, this leads to a problem: "Instance 1" has no knowledge of which users are connected to other machines, so it has no way of propagating the message in a scalable fashion. In theory, the instance could make a synchronous POST request to all other machines, but that would not scale, since every message would then have to be processed by every machine.

A common approach is to sync all messages through a topic in a pub/sub pattern. Socket.IO supports this natively with the help of a prebuilt adapter package for Redis. This approach solves the problem, but it requires an ElastiCache cluster, which means the solution is no longer serverless. Another issue is that Redis is a general-purpose database that supports a variety of data structures and operations; it can be used for caching, messaging, and other use cases in addition to pub/sub. While Redis is a great choice for pub/sub when used as part of a larger application, it might not be the best fit for a standalone pub/sub service, especially one that requires high scalability and availability. In such cases, it might be better to use a dedicated pub/sub service like AWS SNS or Google Cloud Pub/Sub.

Serverless API Gateway to the rescue

To more efficiently solve the aforementioned problem, we can use API Gateway with WebSocket support to handle socket connections. With this solution, when a user connects to our WebSocket service, we can store the connection information in a DynamoDB table. This information can include the connection ID, which is a unique identifier for the connection, and any other relevant metadata, such as the user ID or session ID.
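A `$connect` route handler for this could look roughly like the sketch below. The table name is illustrative, and the DynamoDB write is injected as a function so the handler stays easy to test; in production, `putConnection` would wrap a `PutItemCommand` on the AWS SDK v3 `DynamoDBClient`:

```javascript
// Lambda handler for the API Gateway WebSocket $connect route (a sketch).
async function handleConnect(event, putConnection) {
  // API Gateway assigns every socket a unique connection ID.
  const connectionId = event.requestContext.connectionId;

  // Passing the user ID as a query string parameter is one common pattern;
  // adapt this to however your application authenticates users.
  const userId = (event.queryStringParameters || {}).userId;

  await putConnection({
    TableName: "ChatConnections", // illustrative table name
    Item: {
      connectionId: { S: connectionId },
      ...(userId ? { userId: { S: userId } } : {}),
    },
  });

  // A non-2xx response here makes API Gateway reject the connection.
  return { statusCode: 200 };
}
```

A matching `$disconnect` handler would delete the same item, keeping the table an accurate registry of live connections.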

Once the connection information is stored in DynamoDB, we can use it to send messages to specific connections, or broadcast messages to all connected clients. To achieve this, we can use a Lambda function as the backend for our WebSocket service. This function can receive messages from the client, process them as necessary, and send responses back to the client. By using API Gateway and DynamoDB, in combination with a serverless backend, we can create a scalable and highly available WebSocket service that can handle large numbers of concurrent connections.
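Pushing a message to one connection could be sketched like this. The two dependencies are injected for testability: in production, `postToConnection` would wrap `PostToConnectionCommand` on the AWS SDK v3 `ApiGatewayManagementApiClient`, and `deleteConnection` would remove the row from DynamoDB. The `GoneException` name is the real error API Gateway returns when the client has already disconnected:

```javascript
// Send a payload to one WebSocket connection (a sketch).
async function sendToConnection(connectionId, payload, { postToConnection, deleteConnection }) {
  try {
    await postToConnection({
      ConnectionId: connectionId,
      Data: JSON.stringify(payload),
    });
    return true;
  } catch (err) {
    if (err.name === "GoneException") {
      // The client disconnected without a clean $disconnect: the stored
      // connection is stale, so remove it from the table.
      await deleteConnection(connectionId);
      return false;
    }
    throw err;
  }
}
```

Handling `GoneException` this way keeps the DynamoDB table self-healing even when clients drop off the network abruptly.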

Let's revisit our diagram. Instead of each instance handling socket connections, we let API Gateway, which has WebSocket support, handle them. When a user connects to our WebSocket endpoint, we store the connection in a DynamoDB table.

Instead of each service handling its own socket solution, we move all the real-time logic out into a separate real-time microservice. We can use an SQS queue to trigger a Lambda function that sends messages to connected users' sockets: other parts of the application publish messages to the queue, and each incoming message triggers the Lambda function, which processes it and sends the appropriate response to the connected users.

For example, imagine a chat application where users can send messages to each other. When a user sends a message, it is added to an SQS queue. The SQS message triggers a Lambda function, which processes it and determines which users should receive it. The Lambda function then sends the message to the appropriate users' sockets, using the connection information stored in DynamoDB.
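That SQS-triggered Lambda could be sketched as follows. Both dependencies are injected and illustrative: `getConnectionsForUser` would query the DynamoDB table (e.g. via a GSI on the user ID), and `postToConnection` would call the API Gateway Management API. The message body shape (`to`/`text`) is our own convention:

```javascript
// SQS-triggered Lambda that fans a chat message out to the recipient's
// live sockets (a sketch).
async function handleQueueEvent(event, { getConnectionsForUser, postToConnection }) {
  // An SQS event batch delivers one or more records.
  for (const record of event.Records) {
    const { to, text } = JSON.parse(record.body);

    // A user may have several live connections (e.g. two open browser tabs),
    // so look up every connection ID stored for them.
    const connectionIds = await getConnectionsForUser(to);

    await Promise.all(
      connectionIds.map((id) =>
        postToConnection({ ConnectionId: id, Data: JSON.stringify({ text }) })
      )
    );
  }
}
```

Because the producers only write to the queue, they never need to know how many connections exist or where they live; that knowledge stays inside this one function.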

This approach allows for a scalable and decoupled architecture, where different parts of the application can send messages to the queue without needing to know the details of how the messages are processed and sent to the users. Additionally, when using a serverless architecture with Lambda and SQS, the system can scale automatically to handle large numbers of messages and users, while only paying for what is used.
