The stateful scaling problem

HTTP is largely stateless — each request carries all the information needed to handle it, and any server can handle any request. This makes horizontal scaling straightforward: add servers behind a load balancer, distribute requests round-robin.

WebSocket connections are stateful. A connected client has an ongoing relationship with a specific server process. The server holds the socket object in memory. If a client connects to Server A and a message is published that should reach them, Server B can’t deliver it — it doesn’t have the socket.

Sticky sessions

The simplest solution is sticky sessions: configure the load balancer to always route a given client to the same server. The client’s first request establishes which server “owns” them. Subsequent requests (and the WebSocket upgrade) go to the same server.

upstream websocket_servers {
    ip_hash;  # route by client IP -- simple sticky session
    server ws1.internal:8080;
    server ws2.internal:8080;
    server ws3.internal:8080;
}

With sticky sessions, the in-memory pub/sub approach works. All clients assigned to a server communicate directly. The problem is server failures: if Server A goes down, all clients on it are disconnected. They reconnect and land on the remaining servers, which have no knowledge of their subscriptions. Clients need to re-subscribe.

Sticky sessions are appropriate for:

  • Moderate scale (thousands, not millions of connections)
  • Use cases where reconnection + re-subscription is acceptable
  • Simple deployments where operational simplicity outweighs perfect availability

Redis pub/sub for horizontal broadcast

For true horizontal scalability, use a shared message broker. All server instances publish to and subscribe from a central Redis instance. When a message needs to go to clients on any server, it’s published to Redis. All servers receive it and deliver it to their local clients.

const Redis = require('ioredis');
const { WebSocketServer } = require('ws');

const subscriber = new Redis();
const publisher = new Redis();
const wss = new WebSocketServer({ port: 8080 });

// Each server process tracks its own clients
const localClients = new Map(); // userId -> WebSocket

// Subscribe to the global channel
subscriber.subscribe('broadcast');
subscriber.on('message', (channel, message) => {
  const { targetUserId, payload } = JSON.parse(message);

  // Deliver to local clients if they're connected to this server
  if (targetUserId) {
    const ws = localClients.get(targetUserId);
    if (ws?.readyState === 1) {
      ws.send(payload);
    }
  } else {
    // Broadcast to all local clients
    localClients.forEach(ws => {
      if (ws.readyState === 1) ws.send(payload);
    });
  }
});

wss.on('connection', (ws, request) => {
  const userId = request.user.id;
  localClients.set(userId, ws);

  ws.on('close', () => localClients.delete(userId));
});

// To send a message (from anywhere in the codebase):
function sendToUser(userId, data) {
  publisher.publish('broadcast', JSON.stringify({
    targetUserId: userId,
    payload: JSON.stringify(data),
  }));
}

Now any server instance can trigger delivery to any client, regardless of which server they’re connected to.

Channel-based routing

For pub/sub with channels, extend the Redis pattern:

// Each server subscribes all its clients' channels to Redis
function subscribeToChannel(ws, channel) {
  localSubscriptions.get(channel)?.add(ws) 
    ?? localSubscriptions.set(channel, new Set([ws]));

  // Subscribe this server process to the Redis channel if not already
  if (!redisSubscriptions.has(channel)) {
    subscriber.subscribe(channel);
    redisSubscriptions.add(channel);
  }
}

// Publish to a channel
function publish(channel, data) {
  publisher.publish(channel, JSON.stringify(data));
}

When to use each approach

Sticky sessions when:

  • Single-digit server count
  • Simplicity matters more than perfect availability
  • Reconnection is acceptable and re-subscription is cheap

Redis pub/sub when:

  • More than a few servers
  • Server failure should not cause message loss
  • Clients should receive messages regardless of which server they connect to

A third option worth knowing: managed services like Ably, Pusher, and Socket.io’s hosted platform handle all of this for you. For large scale, the operational complexity of managing Redis + WebSocket servers often outweighs the cost of a managed service.