How to adopt Realtime updates in your app

Nov 23, 2023

…and why you really should!

What do I mean by "Realtime"

Realtime is where clients of your app find out about something the moment it happens, rather than having to poll/nag/ask for updates. Generally, the server pushes updates to the clients the moment they happen, rather than the clients needing to ask for them.

Realtime updates rely on two main technologies:

Websockets: A stateful, persistent, bi-directional ‘channel’ of communication.
Server-sent events (SSE): Built on top of HTTP. The client opens a long-running HTTP connection, and the server writes multiple independent messages to the response over time.

Please don't use polling

You might also think of polling or long polling as a mechanism for fetching ‘Realtime’ data from your backend. Polling is not Realtime.

Polling has a terrible trade-off based on how often your clients poll your backend. Poll too frequently, and you add lots of additional load to your servers and database. Poll too infrequently, and messages are not ‘Realtime’ anymore.

Bonus: if you ever wondered what the difference is between polling, long-polling, and server-sent events, here it is. Polling is request-response; the server responds with a message if it has one, or with nothing. Long-polling is request-response, but the server waits for exactly one message before writing the response. Server-sent events are request-response, but the server writes multiple independent messages to the response over time.
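
To make "multiple independent messages in one response" concrete, here's a minimal sketch of the text/event-stream wire format that server-sent events use. The function name and the ticket-updated event are made up for illustration; only the "field: value" lines and the blank-line terminator come from the SSE format itself.

```typescript
// Formats one message in the text/event-stream wire format used by server-sent events.
// Each message is a set of "field: value" lines terminated by a blank line.
// formatSseMessage is a hypothetical helper name, not part of any library.
function formatSseMessage(data: string, eventName?: string, id?: string): string {
  const lines: string[] = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  if (eventName !== undefined) lines.push(`event: ${eventName}`);
  // A payload containing newlines is sent as one "data:" line per line
  // (this sketch assumes the payload uses "\n" line endings).
  for (const line of data.split("\n")) {
    lines.push(`data: ${line}`);
  }
  return lines.join("\n") + "\n\n";
}

// Two independent messages written to the same long-lived response body.
const stream =
  formatSseMessage(JSON.stringify({ ticket: 42, status: "done" }), "ticket-updated", "1") +
  formatSseMessage(JSON.stringify({ ticket: 43, status: "open" }), "ticket-updated", "2");
```

With polling or long-polling, each of those messages would need its own request and response.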

Realtime is not hard (anymore)

Realtime was hard, but not anymore. It was hard because the connections you need for non-realtime features are much easier to manage than those for Realtime features.

Why is this? Well, because “stateless-ness”.

HTTP request/response is naturally stateless. Across the industry we’ve largely adopted an architecture of stateless backend services connecting to one or more databases. We push all the application state into the database, and horizontally-scale the stateless backend services.

…but Realtime connections are stateful. So when we want to use technologies like websockets or server-sent events, these stateful connections don’t fit that well with our stateless backend architectures.

We end up with a problem where we need to work out which replica of our backend service is hosting the websocket connection for a specific client so that we can send that client the right information, which sucks!

We can solve this a little by keeping state specific to the clients (as well as the application) in the database, but that also sucks because now we have state specific to clients in the database, and the backend services need to poll the database for updates to that client-specific state.

Wait, I thought you said Realtime isn’t hard (anymore)?

I did, and it’s not. There are certain infrastructure features where it’s probably not worth your time trying to run them yourself. You probably think nothing of asking your cloud provider for a load balancer to share traffic across your servers. Realtime infrastructure for publishing and subscribing to messages (over websockets, server-sent events, etc.) is exactly the same. Like all things, you could run it yourself, but adopting Realtime pub/sub from an infrastructure provider is no longer expensive, and is easy to do.

Plus any reasonably well-featured Realtime infrastructure provider will have a bunch of useful abstractions (channels) on top of raw websockets to help with the fanout, statefulness, and disconnection/reconnection problems and provide meaningful delivery guarantees.
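
To get a feel for what a channel abstraction does, here's a toy in-memory version. Everything in it (the InMemoryChannels class, the "project:42" channel name) is made up for illustration and is not any provider's API. A real provider also handles the network, reconnection, and delivery guarantees, which this sketch does not attempt.

```typescript
type Listener<T> = (message: T) => void;

// Hypothetical stand-in for a provider's channel abstraction: clients subscribe
// to a named channel, and every published message is fanned out to all of them.
class InMemoryChannels<T> {
  private listeners = new Map<string, Listener<T>[]>();

  // Returns an unsubscribe function, e.g. to call when a client disconnects.
  subscribe(channel: string, listener: Listener<T>): () => void {
    const current = this.listeners.get(channel) ?? [];
    this.listeners.set(channel, [...current, listener]);
    return () => {
      const remaining = (this.listeners.get(channel) ?? []).filter((l) => l !== listener);
      this.listeners.set(channel, remaining);
    };
  }

  // Fan-out: one publish call reaches every current subscriber of the channel.
  // Returns how many subscribers the message was delivered to.
  publish(channel: string, message: T): number {
    const targets = this.listeners.get(channel) ?? [];
    for (const listener of targets) listener(message);
    return targets.length;
  }
}

const channels = new InMemoryChannels<string>();
const received: string[] = [];
const leave = channels.subscribe("project:42", (m) => received.push(`a:${m}`));
channels.subscribe("project:42", (m) => received.push(`b:${m}`));

channels.publish("project:42", "ticket-updated"); // delivered to both subscribers
leave();                                          // subscriber "a" disconnects
channels.publish("project:42", "ticket-closed");  // delivered to "b" only
```

The publisher never needs to know which server holds which connection, which is the part that was painful to build yourself.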

Any reasonably modern app needs Realtime features

When someone says “websockets” everyone thinks:

“that’s only for chat, and I don’t have chat.”

Well, any reasonably modern app needs Realtime features. Many newer apps are disrupting their entire category by having Realtime features. Linear is a great example of that. Notion, too. You’d be surprised what people are using Realtime for; check out what Stack Overflow uses websockets for.

The thing is, these disruptors have figured out that Realtime is both required, and no-longer-hard.

Not convinced yet? Here are some examples where pushing the data to clients makes for a much more delightful (and performant) experience:

Task management app: updates to tickets
Data or hosting platform: updates to the state of jobs or servers
Survey platform: new answers received from respondents
Uber, reducing battery and network usage with server-sent events

“But I am not Uber”. Yes, and neither are Notion or Linear, but they are disrupting their industries with Realtime features!

And as more and more apps start to include multiplayer and collaboration features, Realtime updates to the UI and state are going to become the baseline for what users expect.

Patterns for adopting Realtime features

So finally, how can you actually adopt Realtime features? I promise it’s not all that hard.

I’ll run through a bunch of Realtime patterns, and touch on what to watch out for with each one.

Pattern 1: Poke/pull

Poke/Pull is the simplest and easiest pattern to get started with. It fits neatly into your existing application with minimal changes.

There are two phases (no prizes for guessing the names):

Poke: the server uses a Realtime channel to tell the subscribing clients that there’s new data.
Pull: the clients make a request to fetch the new data.

Poke/Pull is really neat, because the Pull phase can happen using your application’s existing APIs. If you can fetch an object, all you need to do is poke the clients to tell them there’s updated data, and they can re-fetch it. No more polling, just Realtime UIs.

Things to watch out for:

Thundering herd: If you poke a lot of clients at once, you will see a spike in requests as all those clients make the Pull request in response to the Poke. By-and-large you’re going to get fewer total requests than if you polled, but you might get larger spikes.
UI re-renders: If you replace the state in your application wholesale, then depending on how you do this, you might trigger a full re-render of the application’s UI rather than a partial re-render. The Frontend folks tell me that is bad!
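
Here's a minimal sketch of the client side of Poke/Pull. The Poke shape, the TicketCache class, and the fake API are all assumptions for illustration; in a real app fetchTicket would be whatever call you already use to load a ticket, and onPoke would be wired to your Realtime channel's message handler.

```typescript
interface Ticket {
  id: number;
  title: string;
  status: string;
}

// Hypothetical shape of a poke: it names what changed but carries no data.
interface Poke {
  ticketId: number;
}

// Stands in for the app's existing "fetch a ticket" API call.
type FetchTicket = (id: number) => Promise<Ticket>;

class TicketCache {
  private tickets = new Map<number, Ticket>();

  constructor(private readonly fetchTicket: FetchTicket) {}

  // Pull phase: runs once for each poke received on the Realtime channel.
  // The poke only says "ticket N changed", so the data is re-fetched via the existing API.
  async onPoke(poke: Poke): Promise<void> {
    const ticket = await this.fetchTicket(poke.ticketId);
    this.tickets.set(ticket.id, ticket);
  }

  get(id: number): Ticket | undefined {
    return this.tickets.get(id);
  }
}

// Usage with a fake API, so the sketch runs without a backend.
const fakeApi: FetchTicket = async (id) => ({ id, title: "Fix login", status: "done" });
const ticketCache = new TicketCache(fakeApi);
const pulled = ticketCache.onPoke({ ticketId: 7 }); // resolves once the pull has completed
```

Note that every poked client calls fetchTicket at roughly the same time, which is exactly the Thundering herd spike described above.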

Pattern 2: Push state

The natural extension of Poke/Pull is to skip the Pull step, and instead push the updated state. Given all the clients are about to go and Pull the state, you might as well use the Realtime channel fan-out of a websocket pub/sub system to deliver it, and save yourself from the Thundering herd problem.

So when something changes, you can send the full updated state on the Realtime channel.

Things to watch out for:

UI re-renders, again…
You lose the ‘cause’ of the update: Say you have an agile planning app, and the sprint has just ended. Maybe the tasks are moved to ‘completed’, the sprint is marked as ‘completed’, and some other UI changes happen. You can push the state to the clients, but the state alone might not make sense without knowing that the business event was sprint-completed.
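
A minimal sketch of a Push state handler, assuming a made-up BoardState shape. It also shows why the re-render warning applies: after a full replacement, even a task whose content didn't change is a brand-new object, and many UI libraries (React among them) use object identity to decide what changed.

```typescript
interface Task {
  id: number;
  state: string;
}

// Hypothetical client-side state for a planning board.
interface BoardState {
  sprintName: string;
  tasks: Task[];
}

let board: BoardState = {
  sprintName: "Sprint 12",
  tasks: [
    { id: 78163, state: "in-progress" },
    { id: 28056, state: "completed" },
  ],
};

// Push state handler: the message body is the complete new state,
// so the client simply replaces what it had.
function onStatePushed(raw: string): void {
  board = JSON.parse(raw) as BoardState;
}

const untouchedBefore = board.tasks[1];
onStatePushed(JSON.stringify({
  sprintName: "Sprint 12",
  tasks: [
    { id: 78163, state: "completed" }, // the only task that actually changed
    { id: 28056, state: "completed" },
  ],
}));
const untouchedAfter = board.tasks[1];

// Same content, different object: identity-based change detection sees a change.
const sameContent = JSON.stringify(untouchedBefore) === JSON.stringify(untouchedAfter);
const sameObject = untouchedBefore === untouchedAfter;
```

Here sameContent is true but sameObject is false, so a naive UI would redraw both tasks.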

Pattern 3: Push business events

The next pattern fixes the loss of the ‘cause’ from the previous pattern, by pushing the ‘business event’. Taking the same example, you can use your Realtime channels to push the sprint-completed event.

If you’re updating individual parts of your state based on that business event, you’re going to suffer from the UI re-render problem less, as you can get incremental re-renders.

Things to watch out for:

Implementation drift: For any meaningfully useful business event, there’s likely going to be more than one change happening at once. In the completed sprint example, we were closing the sprint, changing the status of tickets, etc. When you send just the business event, you run the risk that the implementation of that event in the UI can drift from the backend. If this happens, your UI will show out-of-date information.
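
A minimal sketch of handling a pushed business event on the client. The state and event shapes are assumptions for illustration. The comment inside marks the spot where implementation drift creeps in: the client has to re-implement what the backend did.

```typescript
interface Task {
  id: number;
  sprintId: number;
  state: "open" | "completed";
}

interface Sprint {
  id: number;
  completed: boolean;
}

interface AppState {
  sprints: Sprint[];
  tasks: Task[];
}

// Hypothetical event shape: it says what happened, not what the new state is.
interface SprintCompleted {
  event: "sprint-completed";
  sprintId: number;
}

// The client works out the consequences of the event itself. Objects that are
// not affected keep their identity, which allows incremental re-renders.
function applySprintCompleted(state: AppState, e: SprintCompleted): AppState {
  return {
    sprints: state.sprints.map((s): Sprint =>
      s.id === e.sprintId ? { ...s, completed: true } : s),
    // Drift risk: this rule has to match the backend. If the backend later starts
    // moving unfinished tasks to the next sprint instead, this line is silently wrong.
    tasks: state.tasks.map((t): Task =>
      t.sprintId === e.sprintId ? { ...t, state: "completed" } : t),
  };
}

const before: AppState = {
  sprints: [{ id: 1, completed: false }, { id: 2, completed: false }],
  tasks: [
    { id: 78163, sprintId: 1, state: "open" },
    { id: 99001, sprintId: 2, state: "open" }, // belongs to another sprint
  ],
};
const after = applySprintCompleted(before, { event: "sprint-completed", sprintId: 1 });
```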

Pattern 4: Push state diffs

This is probably my favourite pattern. In this, you push updates that represent what has changed (and why). So we’d have an update that contained the business event sprint-completed and the individual state changes that the business event caused.

For example:

{
  "event": "sprint-completed",
  "sprint": {
    "completed_date": 1700757002
  },
  "tasks": [
    {
      "id": 78163,
      "state": "completed"
    },
    {
      "id": 28056,
      "state": "completed"
    }
  ]
}

Now you can patch the fields that have changed, and get the business event that caused the changes. You also get incremental UI re-renders.
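
Here's a minimal sketch of applying a diff like that on the client. The AppState shape, the task titles, and the applyDiff helper are assumptions for illustration; the message itself is the example from above.

```typescript
interface Task {
  id: number;
  title: string;
  state: string;
}

interface Sprint {
  name: string;
  completed_date: number | null;
}

interface AppState {
  sprint: Sprint;
  tasks: Task[];
}

// Mirrors the example message: the business event plus the fields it changed.
interface StateDiff {
  event: string;
  sprint?: Partial<Sprint>;
  tasks?: Array<Partial<Task> & { id: number }>;
}

// Patches only the fields named in the diff and returns the new state.
function applyDiff(state: AppState, diff: StateDiff): AppState {
  const patches = diff.tasks ?? [];
  return {
    sprint: diff.sprint ? { ...state.sprint, ...diff.sprint } : state.sprint,
    tasks: state.tasks.map((task) => {
      const patch = patches.find((p) => p.id === task.id);
      // Unpatched tasks keep their identity, so only changed rows need re-rendering.
      return patch ? { ...task, ...patch } : task;
    }),
  };
}

const current: AppState = {
  sprint: { name: "Sprint 12", completed_date: null },
  tasks: [
    { id: 78163, title: "Fix login", state: "in-progress" },
    { id: 28056, title: "Add export", state: "in-review" },
    { id: 11111, title: "Backlog item", state: "open" }, // not in the diff
  ],
};

const message: StateDiff = JSON.parse(`{
  "event": "sprint-completed",
  "sprint": { "completed_date": 1700757002 },
  "tasks": [
    { "id": 78163, "state": "completed" },
    { "id": 28056, "state": "completed" }
  ]
}`);

const next = applyDiff(current, message);
```

The client never has to know what sprint-completed implies; it applies the patches, and the event name is still there if the UI wants to show a toast or an animation.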

If you think this is interesting, I’m working on SDKs and tooling to help reflect Realtime state changes in apps.

Realtime is the future

Realtime is no longer hard, and is becoming a differentiator between apps. It’s also not that hard to adopt. While some of these patterns aren’t perfect, they are a really strong stepping-stone to start including Realtime updates in your apps.