2024 · Full Stack Engineer · Active · Ongoing

Better-State

A shared state primitive that makes realtime sync, optimistic updates, and offline support feel like one line of state — not a new architecture.

01 — Origin

The thing that kept repeating across every product

Every time I built something collaborative, the same tax showed up:

  • a flurry of WebSocket handlers
  • state reconciliation logic scattered across the app
  • “offline later” as a lie we told ourselves
  • edge cases that only appeared when someone’s train went through a tunnel

And the worst part: the code wasn’t hard because the UI was complex. It was hard because the protocol was implicit—hidden in a thousand little decisions.

So I stopped trying to “add realtime” to apps.

I tried to build a primitive.

One that lets you write:

  • “this is state”
  • “this is how it changes”
  • “sync it everywhere”
  • “don’t break when I go offline”

…without shipping a new ecosystem or a full CRDT thesis.

Obsession

Distributed state shouldn’t be a bespoke project. It should feel like a primitive.

02 — Constraints

The walls I had to design around

Instant feel: If a button waits for a server response, the product feels broken. Optimistic updates weren’t optional.
Offline reality: People disconnect. Browsers sleep. Tabs crash. The system needed an offline queue that survives refresh and resumes cleanly.
Correctness under contention: Two clients will mutate the same key at the same time. Convergence had to be engineered, not hoped for.
Debuggability: If syncing is a black box, your app becomes superstition. I needed a story you can inspect: history, replay, subscriptions.
Drop-in adoption: The API had to be tiny. If it required rewriting the app around it, it wasn’t a primitive.

03 — Decisions

The bets I placed

Make the protocol explicit: event log → replay → broadcast.
Instead of treating state as “whatever the latest value is,” I treated it as the result of a sequence of mutations. That unlocks three things:

  1. History (what happened)
  2. Replay (how we got here)
  3. Deterministic recovery (how we resync after chaos)

store.ts

```typescript
// The API I wanted: simple, explicit, offline-ready.
const store = createStore({
  key: "room-123",
  initial: { count: 0 },

  // Optimistic updates happen instantly
  update: (state, action) => {
    if (action.type === "INCREMENT") {
      state.count += 1;
    }
  },

  // Sync happens in the background
  sync: "wss://api.better-state.dev"
});
```
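
To make "state as the result of a sequence of mutations" concrete, here is a minimal sketch of replay as a fold over a log. The `Mutation` shape, the `seq` cursor, and the `replay` helper are assumptions for illustration, not Better-State's actual types.

```typescript
// Hypothetical shapes for illustration; not Better-State's actual types.
type CounterState = { count: number };
type Mutation = { seq: number; type: "INCREMENT" | "RESET" };

// A pure reducer: the same state and mutation always give the same result.
function applyMutation(state: CounterState, m: Mutation): CounterState {
  switch (m.type) {
    case "INCREMENT":
      return { count: state.count + 1 };
    case "RESET":
      return { count: 0 };
  }
}

// Replay: fold the log (optionally only up to a cursor) over the initial state.
function replay(
  initial: CounterState,
  log: Mutation[],
  upToSeq: number = Infinity,
): CounterState {
  return log.filter((m) => m.seq <= upToSeq).reduce(applyMutation, initial);
}

const log: Mutation[] = [
  { seq: 1, type: "INCREMENT" },
  { seq: 2, type: "INCREMENT" },
  { seq: 3, type: "RESET" },
  { seq: 4, type: "INCREMENT" },
];

const current = replay({ count: 0 }, log);   // state after the whole log
const atSeq2 = replay({ count: 0 }, log, 2); // history: state as of seq 2
console.log(current, atSeq2);
```

Because the reducer is pure, replaying the same log always lands on the same state, which is what makes recovery deterministic.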

Local-first UI with authoritative convergence.
Clients apply mutations locally first (optimistic), then send the mutation to the server. The server becomes the authority that appends to the log, replays to compute the canonical state, and broadcasts back to everyone.
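
The server's append, replay, broadcast loop could look roughly like this. It is an in-memory sketch: `AuthoritativeLog`, `submit`, and the list-shaped state are hypothetical names chosen for the example, not the real server code.

```typescript
// In-memory stand-in for the server; all names here are illustrative assumptions.
type ListState = { items: string[] };
type AddItem = { clientId: string; item: string };
type Listener = (state: ListState, cursor: number) => void;

class AuthoritativeLog {
  log: AddItem[] = [];
  listeners: Listener[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  // Append first, then derive state from the log, then tell every subscriber.
  submit(mutation: AddItem): void {
    this.log.push(mutation);
    const state = this.replay();
    for (const listener of this.listeners) {
      listener(state, this.log.length);
    }
  }

  // Canonical state is always computed from the log, never stored on the side.
  replay(): ListState {
    return this.log.reduce<ListState>(
      (state, m) => ({ items: [...state.items, m.item] }),
      { items: [] },
    );
  }
}

const server = new AuthoritativeLog();
const seenByA: string[][] = [];
const seenByB: string[][] = [];
server.subscribe((s) => { seenByA.push(s.items); });
server.subscribe((s) => { seenByB.push(s.items); });

server.submit({ clientId: "a", item: "milk" });
server.submit({ clientId: "b", item: "eggs" });
console.log(seenByA[seenByA.length - 1], seenByB[seenByB.length - 1]);
```

Both subscribers receive the same state in the same order, because there is exactly one place where ordering is decided.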

Design for the common case, protect the edge case.
Most apps don’t need a novel conflict-free data type to be useful. They need a reliable “shared counter / todos / presence / polls / app flags” primitive that doesn’t implode under reconnect storms.

So I chose:

  • optimistic concurrency + resync
  • deterministic retry of pending mutations after snapshot
  • a small surface area that can be reasoned about
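
One way to read "optimistic concurrency + resync" is a version check on every write: the server accepts a mutation only if the client saw the latest version, and otherwise returns a snapshot so the client can retry. The sketch below assumes that reading; `VersionedStore`, `tryApply`, and `submitWithResync` are made-up names.

```typescript
// Hypothetical version-check sketch; not Better-State's actual protocol.
type Flags = Record<string, boolean>;
type SetFlag = { name: string; value: boolean };
type Result =
  | { ok: true; version: number }
  | { ok: false; snapshot: Flags; version: number };

class VersionedStore {
  state: Flags = {};
  version = 0;

  // Accept only when the client saw the latest version; otherwise return a snapshot.
  tryApply(baseVersion: number, m: SetFlag): Result {
    if (baseVersion !== this.version) {
      return { ok: false, snapshot: { ...this.state }, version: this.version };
    }
    this.state = { ...this.state, [m.name]: m.value };
    this.version += 1;
    return { ok: true, version: this.version };
  }
}

// Client side: on rejection, adopt the server's version and retry the same mutation.
function submitWithResync(store: VersionedStore, knownVersion: number, m: SetFlag): number {
  let base = knownVersion;
  for (let attempt = 0; attempt < 3; attempt++) {
    const result = store.tryApply(base, m);
    if (result.ok) return result.version;
    base = result.version; // resync to the server's view before retrying
  }
  throw new Error("gave up after repeated contention");
}

const store = new VersionedStore();
// Two clients both read version 0, then both write.
const vA = submitWithResync(store, 0, { name: "darkMode", value: true });
const vB = submitWithResync(store, 0, { name: "beta", value: true });
console.log(vA, vB, store.state);
```

The second writer is rejected once, resyncs, and lands on top of the first write instead of silently overwriting it.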

Make it self-hostable by default.
If shared state is infrastructure, you should be able to run it like infrastructure. So Better-State ships as a server you can start with a single command and store data locally (SQLite).

Key Decision

Treat shared state as a protocol with an event log, not as “the latest value.”

04 — The Failure

Where it broke (and why it mattered)

The first versions mostly worked.

Which is a dangerous state in realtime systems.

The bug wasn’t a crash. It was worse: subtle divergence.

It would happen when:

  • a client went offline,
  • queued mutations,
  • reconnected quickly,
  • and sent a burst while also receiving server broadcasts.

In that overlap, I had a period where the client believed it had authoritative state, while the server was replaying a slightly different sequence. The result wasn’t immediately visible—but it showed up later as “why are these two tabs disagreeing?”

The fix wasn’t “more retries.” It was making the resync path a first-class event:

  1. client reconnects
  2. client requests/receives a server snapshot + cursor
  3. client applies snapshot
  4. client replays its queued mutations deterministically (in order)
  5. server accepts/appends/broadcasts
  6. client clears queue only when acknowledged
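
The six steps above can be sketched end to end with an in-memory server. Everything here is a stand-in: `FakeServer`, `Client`, and the mutation `id` field are hypothetical, and the duplicate check in `append` is an extra assumption (so that a retried mutation is not applied twice), not something the steps themselves specify.

```typescript
type Counter = { count: number };
type Pending = { id: string; delta: number };

// Minimal server stand-in: an append-only log; the cursor is the log length.
class FakeServer {
  log: Pending[] = [];

  snapshot(): { state: Counter; cursor: number } {
    const count = this.log.reduce((sum, m) => sum + m.delta, 0);
    return { state: { count }, cursor: this.log.length };
  }

  // Assumption: a mutation id already in the log is acknowledged, not re-applied.
  append(m: Pending): { ackId: string } {
    if (!this.log.some((entry) => entry.id === m.id)) this.log.push(m);
    return { ackId: m.id };
  }
}

class Client {
  confirmed: Counter = { count: 0 };
  queue: Pending[] = [];

  // What the UI shows: confirmed state plus everything still waiting for an ack.
  view(): Counter {
    return { count: this.queue.reduce((n, m) => n + m.delta, this.confirmed.count) };
  }

  mutateOffline(m: Pending): void {
    this.queue.push(m);
  }

  resync(server: FakeServer): void {
    // Steps 1-3: reconnect, fetch snapshot + cursor, adopt it as the confirmed base.
    this.confirmed = server.snapshot().state;
    // Steps 4-6: replay queued mutations in order; drop each one only after its ack.
    for (const m of [...this.queue]) {
      const { ackId } = server.append(m);
      this.queue = this.queue.filter((q) => q.id !== ackId);
      this.confirmed = server.snapshot().state;
    }
  }
}

const server = new FakeServer();
server.append({ id: "other-1", delta: 5 }); // another client wrote while we were offline

const client = new Client();
client.mutateOffline({ id: "mine-1", delta: 1 });
client.mutateOffline({ id: "mine-2", delta: 1 });
const beforeResync = client.view().count;
client.resync(server);
console.log(beforeResync, client.view().count, client.queue.length);
```

The point of the ordering is that the client never treats its own view as authoritative: the snapshot replaces the base, and queued intent is layered back on top of it.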

That failure changed how I think about realtime:

If your recovery path isn’t explicit, your system isn’t reliable. It’s lucky.

05 — Architecture

How it actually works

Better-State is intentionally boring in the right places.

It’s a small engine that does four jobs well:

Client

  • state(key, initial) creates a state object
  • subscribe(cb) re-renders/reacts on changes
  • set(value) / update(fn) expresses mutations
  • offline queue stores pending mutations (so refresh doesn’t lose intent)
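
A rough sketch of what that client surface could look like follows. Only the names `state`, `subscribe`, `set`, and `update` come from the list above; the `storage` parameter, `get`, `pendingCount`, and the queue format are assumptions made so the example runs outside a browser.

```typescript
// Hypothetical client sketch; not the library's actual implementation.
interface KeyValueStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory storage for the demo; a browser's localStorage has the same two methods.
function memoryStorage(): KeyValueStorage {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => { data.set(key, value); },
  };
}

function state<T>(key: string, initial: T, storage: KeyValueStorage) {
  const queueKey = `pending:${key}`;
  let value = initial;
  // Restore intent left over from a previous session.
  const pending: T[] = JSON.parse(storage.getItem(queueKey) ?? "[]");
  for (const queued of pending) value = queued;
  const subscribers = new Set<(value: T) => void>();

  function set(next: T): void {
    value = next;       // optimistic: apply immediately
    pending.push(next); // keep until the server acknowledges (ack handling omitted)
    storage.setItem(queueKey, JSON.stringify(pending));
    subscribers.forEach((cb) => cb(value));
  }

  return {
    get: () => value,
    set,
    update: (fn: (current: T) => T) => set(fn(value)),
    subscribe(cb: (value: T) => void): () => void {
      subscribers.add(cb);
      return () => { subscribers.delete(cb); };
    },
    pendingCount: () => pending.length,
  };
}

const storage = memoryStorage();
const counter = state("room-123", { count: 0 }, storage);
const seen: number[] = [];
counter.subscribe((v) => { seen.push(v.count); });
counter.update((s) => ({ count: s.count + 1 }));
counter.set({ count: 10 });

// Simulate a refresh: a new state object over the same storage still has the pending writes.
const afterRefresh = state("room-123", { count: 0 }, storage);
console.log(seen, afterRefresh.get(), afterRefresh.pendingCount());
```

Persisting the queue on every write is what lets a refresh keep the user's intent rather than silently dropping it.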

Transport

  • WebSocket connection with auth (API key)
  • subscribe / mutate messages
  • reconnection with resync semantics
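
As an illustration of the transport layer, the messages could be modeled as a small tagged union. The field names, the `snapshot` and `ack` server messages, and the capped exponential backoff are all assumptions; the source names only the subscribe and mutate messages and reconnection with resync.

```typescript
// Hypothetical wire format; not Better-State's actual protocol.
type ClientMessage =
  | { type: "subscribe"; key: string; apiKey: string }
  | { type: "mutate"; key: string; mutationId: string; payload: unknown };

type ServerMessage =
  | { type: "snapshot"; key: string; state: unknown; cursor: number }
  | { type: "ack"; mutationId: string; cursor: number };

function encode(message: ClientMessage): string {
  return JSON.stringify(message);
}

// Parse and validate just enough to route the message; anything else is rejected.
function decode(raw: string): ServerMessage {
  const data = JSON.parse(raw);
  if (data.type === "snapshot" && typeof data.cursor === "number") return data;
  if (data.type === "ack" && typeof data.mutationId === "string") return data;
  throw new Error(`unknown server message: ${raw}`);
}

// Capped exponential backoff, so a reconnect storm does not hammer the server.
function reconnectDelayMs(attempt: number, baseMs = 250, capMs = 10_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const wire = encode({ type: "mutate", key: "room-123", mutationId: "m1", payload: { delta: 1 } });
const ack = decode('{"type":"ack","mutationId":"m1","cursor":4}');
console.log(wire, ack, [0, 3, 10].map((n) => reconnectDelayMs(n)));
```

Keeping the message set this small is what makes the resync path easy to reason about: after any reconnect, the first thing a client expects is a snapshot with a cursor.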

Server

  • authenticates API keys and namespaces
  • appends mutations to an event log
  • replays mutations to compute current state
  • broadcasts authoritative state to subscribers

Storage

  • SQLite persistence for durability
  • REST endpoints for health, namespaces, state listing, and history

Figure: Better State Dashboard. Real-time state visualization and history replay.

Core model: Mutation event log + replay to compute canonical state
Realtime transport: WebSocket: subscribe + mutate + broadcast
Offline behavior: Client-side mutation queue survives refresh; resync then deterministic retry
Persistence: SQLite-backed server storage (self-host friendly)
Developer experience: Minimal client API + optional React hooks + a dashboard/playground for inspection

06 — Learnings

What I’d tell myself before starting

Realtime is a protocol problem: You can’t sprinkle WebSockets on top of an app and call it collaboration. When state crosses devices, the protocol becomes the product.
Optimistic updates buy you feel, but cost you truth: Instant UI is mandatory, but it creates two realities: local intent and server authority. Your job is to make convergence explicit and observable.
Recovery paths define reliability: The real test is not the happy path. It’s reconnect storms, tab reloads, offline bursts. If resync is fuzzy, correctness becomes probabilistic.
Small APIs win adoption: A primitive only works if developers can hold it in their head. A small surface area beats a feature buffet.

07 — Future

Where this goes next

Better-State is designed to be a foundation: simple enough to adopt, structured enough to grow.

The roadmap is about expanding power without losing clarity:

  • Stronger conflict strategies for richer data shapes (beyond last-write-wins style state)
  • More inspection tooling (timeline views, mutation diffing, “why did this change?”)
  • Configurable policies (per-key access rules, rate limits, payload constraints)
  • Lightweight presence/rooms primitives built on the same engine
  • Production hardening: backups, migrations, and operational docs

But the north star stays unchanged:

Everything should still feel like one line of state.

Nathanim · Full Stack & AI Engineer

A bored developer is a dangerous developer.