Drag, drop, and the optimistic update race

Imagine a small task tracker — the kind of kanban board every team eventually builds. There are columns (Todo, In Progress, Done), cards you drag between them, an order within each column, and tasks can block other tasks. Each card has a status, a position, and maybe some blockers. You drag a card to a new column, it moves, the server is told. Simple.

I've used React Query for years, and my reflex for mutations has always been the same: fire the mutation, then invalidateQueries to get the real deal from the server. That reflex is correct — for forms. It fell apart the moment I made the board draggable, and digging into why taught me more about mutations than years of CRUD forms ever did.

The reflex that works for forms

A create/update/delete form is slow and deliberate. The user clicks submit, waits, sees the result. You want server truth — generated ids, computed fields, side effects — so you just invalidate:

    const invalidate = () => queryClient.invalidateQueries({ queryKey: ['tasks'] })

    useMutation({ mutationFn: createTask, onSuccess: invalidate })

Nothing wrong with this. The round-trip is fine because the human is already waiting on it. The mistake is assuming this is the only pattern for mutations rather than one pattern among several, suited to one kind of interaction.

Drag-and-drop breaks the reflex

Drag is the opposite of a form. It's fast, it's optimistic (the card must move now, not after a round-trip), and you can fire several before any of them settle. Drag card A to Done, then immediately drag card B to In Progress, then nudge A up a slot. Three mutations in flight at once.

With "invalidate on settle" the board started thrashing — cards visibly snapping back to where they'd been a moment ago, then forward again. Two distinct bugs were hiding in there.

Bug 1: concurrent invalidation thrash

invalidateQueries doesn't just mark data stale; by default it also triggers an immediate refetch of every matching active query. (Pausing your polling via refetchInterval doesn't prevent this; invalidation refetches regardless.) So with two drags in flight:

  1. Drag A optimistically moves the board to state n.
  2. Drag B optimistically moves it to n+1.
  3. Drag A settles first → invalidate → a GET fires while B's write hasn't committed yet → the server returns the whole board at state n (has A, not B).
  4. That whole-board response replaces the cache, wiping B's move. Snap-back.

A list GET is a global replace — a stale snapshot clobbers changes it knows nothing about. The fix isn't to stop invalidating; it's to only invalidate after the last mutation in the burst settles (more on the exact check below).
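The four steps above can be replayed without React Query at all. The sketch below is a toy model, assuming a board stored as a plain id-to-status record; every name in it is hypothetical, and the "GET" is just a copy of the server object taken at the wrong moment.

```typescript
// Toy model of the thrash, not React Query itself. All names are hypothetical.
type Status = 'todo' | 'in-progress' | 'done'
type Board = Record<string, Status>

const initialBoard: Board = { A: 'todo', B: 'todo' }

// Server state and client cache start out identical.
let serverBoard: Board = { ...initialBoard }
let cachedBoard: Board = { ...initialBoard }

// Steps 1 and 2: both drags are applied optimistically to the cache.
cachedBoard = { ...cachedBoard, A: 'done' }
cachedBoard = { ...cachedBoard, B: 'in-progress' }

// Drag A's write commits on the server; drag B's write is still in flight.
serverBoard = { ...serverBoard, A: 'done' }

// Step 3: A settles and invalidates, so a GET snapshots the server right now.
const staleSnapshot: Board = { ...serverBoard }

// Step 4: the whole-board response replaces the cache.
cachedBoard = staleSnapshot
const boardAfterEarlyRefetch: Board = { ...cachedBoard } // B is back on 'todo'

// Once B's write commits, a refetch after the burst sees both moves.
serverBoard = { ...serverBoard, B: 'in-progress' }
const boardAfterFinalRefetch: Board = { ...serverBoard }
```

The early snapshot is not wrong about A; it simply knows nothing about B, and replacing the whole board makes that ignorance visible.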

Bug 2: the same-card out-of-order race

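The subtler one. If you reconcile by writing the server response into the cache in onSuccess, two mutations on the same card can resolve out of order. Say drag 1 moves a card to Done and drag 2 then moves it to In Progress: if drag 1's response happens to land last, its onSuccess writes Done over the newer In Progress, and the card ends up somewhere the user didn't leave it.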

A per-id onSuccess patch is perfectly safe across different cards (each touches its own row). It's only the same card, mutated twice, that bites. And it bites precisely because we're treating the response as the source of truth.
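To make the same-card race concrete, here is a small replay, again a toy model rather than library code. It assumes a one-card cache and a hypothetical finalStatus helper that applies the optimistic writes in interaction order and then, optionally, the responses in arrival order.

```typescript
// Toy replay of the same-card race. All names are hypothetical.
type Status = 'todo' | 'in-progress' | 'done'
interface MoveResponse {
  id: string
  status: Status
}

function finalStatus(
  optimisticWrites: Status[], // in interaction order
  responses: MoveResponse[], // in arrival order
  applyResponses: boolean, // true models an onSuccess that patches the cache
): Status {
  const cache = new Map<string, Status>([['A', 'todo']])
  for (const status of optimisticWrites) cache.set('A', status)
  if (applyResponses) {
    for (const response of responses) cache.set(response.id, response.status)
  }
  return cache.get('A') ?? 'todo'
}

// The user drags card A to Done, then to In Progress.
const drags: Status[] = ['done', 'in-progress']

// Drag 2's response arrives first; drag 1's arrives last.
const outOfOrder: MoveResponse[] = [
  { id: 'A', status: 'in-progress' },
  { id: 'A', status: 'done' },
]

const withOnSuccessPatch = finalStatus(drags, outOfOrder, true) // 'done' (stale)
const withoutOnSuccess = finalStatus(drags, outOfOrder, false) // 'in-progress'
```

With the responses applied, the last one to arrive wins; with them ignored, the last thing the user did wins.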

The insight: the optimistic write is the truth

Here's what unlocked it. For a drag, the value I'm optimistically writing is exactly what I'm persisting. I compute the new status and position on the client, write them to the cache, and send the very same thing to the server. The server just stores it and echoes it back. The response carries no new information.

So I shouldn't apply the response at all. The optimistic writes already happen in interaction order (drag 1, then drag 2), so the latest one wins by construction. Drop onSuccess entirely and the same-card race disappears — there's nothing left to land out of order.

The exception that proves the rule: reconcile from the response only when the server mints data the client didn't have — a generated id, a server-computed field. And even then, merge it by id; don't replace the list.
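For that exception, a merge by id might look like the sketch below. It assumes the cache holds a flat Task[] and that the only server-owned field is a hypothetical updatedAt timestamp; the helper name and the field are illustrative, not taken from the original code.

```typescript
// Hypothetical task shape: updatedAt stands in for any server-computed field.
interface Task {
  id: string
  title: string
  status: string
  updatedAt?: string
}

// Only the fields the server owns; client-owned fields are deliberately absent.
interface ServerFields {
  id: string
  updatedAt: string
}

// Copy the server-owned field onto the matching row and leave every other
// row, and every client-owned field, exactly as the cache has it.
function mergeServerFields(cached: Task[], fromServer: ServerFields): Task[] {
  return cached.map((task) =>
    task.id === fromServer.id ? { ...task, updatedAt: fromServer.updatedAt } : task,
  )
}

const cachedTasks: Task[] = [
  { id: 'A', title: 'Write docs', status: 'done' },
  { id: 'B', title: 'Fix bug', status: 'in-progress' },
]

const mergedTasks = mergeServerFields(cachedTasks, {
  id: 'A',
  updatedAt: '2024-01-01T00:00:00Z',
})
```

Because status is never copied from the response, a late response cannot undo a newer optimistic move of the same card.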

The rules I landed on

Putting it together, an optimistic drag mutation looks like this:

    const KEY = ['tasks', 'status-move'] // shared across all the drag mutations

    useMutation({
      mutationKey: KEY,
      mutationFn: moveTask,
      onMutate: async (vars) => {
        // 1. cancel first: abort any in-flight GET so it can't land on top
        //    of our optimistic write. THIS is what makes optimism safe, not invalidate.
        await queryClient.cancelQueries({ queryKey: ['tasks'] })
        // 2. write the new state immediately
        queryClient.setQueryData(['tasks'], (old) => applyMove(old, vars))
        // 3. no snapshot (see onSettled)
      },
      // no onSuccess: the optimistic write is authoritative
      onSettled: () => invalidateIfLastMove(queryClient),
    })

And the gated invalidate:

    function invalidateIfLastMove(qc: QueryClient) {
      // imperative isMutating(), NOT the useIsMutating() hook (stale closure)
      if (qc.isMutating({ mutationKey: ['tasks', 'status-move'] }) === 1) {
        void qc.invalidateQueries({ queryKey: ['tasks'] })
      }
    }
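The applyMove helper passed to setQueryData is not shown in this post. A minimal sketch, assuming the cache holds a flat Task[] and that the move variables carry the card's new status and position (both shapes are assumptions), could be:

```typescript
// Assumed shapes: the real board may store more fields or nest cards by column.
type Status = 'todo' | 'in-progress' | 'done'
interface Task {
  id: string
  status: Status
  position: number
}
interface MoveVars {
  id: string
  status: Status
  position: number
}

// Returns a new array and never mutates `old`. When there is no cached list
// yet, it returns undefined so there is nothing to write.
function applyMove(old: Task[] | undefined, vars: MoveVars): Task[] | undefined {
  if (!old) return old
  return old.map((task) =>
    task.id === vars.id
      ? { ...task, status: vars.status, position: vars.position }
      : task,
  )
}

const before: Task[] = [
  { id: 'A', status: 'todo', position: 0 },
  { id: 'B', status: 'todo', position: 1 },
]
const after = applyMove(before, { id: 'A', status: 'done', position: 0 })
```

This sketch leaves the positions of the other cards alone; renumbering the source and target columns is left out to keep the example short.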

A few things worth calling out:

  • cancelQueries in onMutate is the load-bearing call. It's not the invalidate that makes optimistic updates safe — it's cancelling the in-flight read so a poll or earlier fetch can't overwrite what you just wrote.
  • === 1, not === 0. When onSettled runs, this mutation is still counted as pending — query-core dispatches success/error only after the onSettled callbacks finish. So 1 means "I'm the last one standing," which is exactly when it's safe to refetch. I only believed this after reading the source.
  • onSettled runs on success and error. So a failed drag's wrong optimistic value self-heals on the same refetch. No manual rollback.
  • That's why I don't roll back from a saved snapshot. Under concurrency, restoring a snapshot taken at this mutation's onMutate can clobber a later successful drag of the same card. A refetch gives true current state; a snapshot gives a stale guess.
  • Pause polling while moving. Gate refetchInterval on useIsMutating(...) > 0: false while a move is in flight, your normal interval otherwise. The useIsMutating hook is right here (render-time, wants reactivity); the imperative qc.isMutating() is right inside the callback.
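The === 1 rule is easy to get backwards, so here is a toy counter that models it. It is not query-core; it only assumes the ordering described above, where the onSettled check runs while the settling mutation still counts as pending. The countRefetches name and its parameters are hypothetical.

```typescript
// Toy model of the pending count during a burst. Names are hypothetical.
function countRefetches(burstSize: number, threshold: number): number {
  let pending = 0
  let refetches = 0

  // Every drag in the burst starts before any of them settles.
  for (let i = 0; i < burstSize; i++) pending += 1

  for (let i = 0; i < burstSize; i++) {
    // The onSettled-style check sees this mutation as still pending...
    if (pending === threshold) refetches += 1
    // ...and only afterwards does it stop counting.
    pending -= 1
  }
  return refetches
}

const refetchesWithOne = countRefetches(3, 1) // 1: only the last drag refetches
const refetchesWithZero = countRefetches(3, 0) // 0: the check never sees zero
```

In this model a check against 0 never fires at all, which is the failure mode the === 1 comparison avoids.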

The single post-burst refetch isn't wasted, either — it catches server-side side effects the response wouldn't include. Closing a blocker can flip its dependents from Blocked to Ready in other columns; the gated reconcile picks that up.

Where each concern belongs

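Once the bugs were gone, the bigger lesson was about placement. A useMutation runs through a fixed sequence of handlers, and knowing the order tells you where each concern goes: the hook's onMutate, then the mutationFn, then the hook's onSuccess or onError, then the hook's onSettled, and only after that does the promise returned by mutateAsync resolve or reject at the call site.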

Two things fall out of this order. By the time your await returns, the hook's onMutate and onSettled have already run — so a success toast is purely cosmetic, fired after the reconcile is already kicked off. And because onSettled runs on both paths, the reconcile (and any cleanup) belongs there, not duplicated across onSuccess/onError.
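That ordering can be modelled in a few lines. The runner below is a hypothetical stand-in for the hook, not the library's implementation; it exists only to show that the code after an await runs once onSettled has already fired.

```typescript
// Hypothetical stand-in for a mutation hook's handler sequence.
type Handlers<V> = {
  onMutate?: (vars: V) => void
  onSuccess?: (vars: V) => void
  onError?: (error: unknown, vars: V) => void
  onSettled?: (vars: V) => void
}

async function runMutationModel<V, R>(
  mutationFn: (vars: V) => Promise<R>,
  handlers: Handlers<V>,
  vars: V,
): Promise<R> {
  handlers.onMutate?.(vars)
  try {
    const result = await mutationFn(vars)
    handlers.onSuccess?.(vars)
    handlers.onSettled?.(vars)
    return result // the caller's await resumes only after this
  } catch (error) {
    handlers.onError?.(error, vars)
    handlers.onSettled?.(vars) // same cleanup on the error path
    throw error
  }
}

const log: string[] = []
const finished = runMutationModel(
  async (cardId: string) => {
    log.push('mutationFn')
    return cardId
  },
  {
    onMutate: () => { log.push('onMutate') },
    onSuccess: () => { log.push('onSuccess') },
    onSettled: () => { log.push('onSettled') },
  },
  'A',
).then(() => {
  log.push('await returns')
})
```

Once finished resolves, log reads onMutate, mutationFn, onSuccess, onSettled, await returns, in that order.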

Keep the hook UI-agnostic; toast at the call site

This is the part I'd push hardest on. The hook should own what changed (cache and state); the component should own telling the user (toasts, banners, notifications). The way you get that separation is to call the mutation with await mutateAsync(...) and handle UX in the surrounding try/catch.
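A minimal sketch of two such call sites follows. It assumes the hook exposes its mutateAsync as a function of type MoveTask and that each screen has some notify function in scope; those names, the handler names, and the message texts are all hypothetical.

```typescript
// Hypothetical shapes: MoveTask stands in for the hook's mutateAsync,
// Notify for whatever toast or banner API the screen already has.
type MoveVars = { id: string; status: string }
type MoveTask = (vars: MoveVars) => Promise<unknown>
type Notify = (message: string) => void

// Call site 1: the board. Silent on success, because the card already moved.
async function onBoardDrop(
  moveTask: MoveTask,
  notify: Notify,
  vars: MoveVars,
): Promise<void> {
  try {
    await moveTask(vars)
  } catch {
    notify("Couldn't move the card. The board will refresh.")
  }
}

// Call site 2: a detail page. Confirms success explicitly.
async function onDetailStatusChange(
  moveTask: MoveTask,
  notify: Notify,
  vars: MoveVars,
): Promise<void> {
  try {
    await moveTask(vars)
    notify(`Status changed to ${vars.status}`)
  } catch {
    notify('Status change failed. Please try again.')
  }
}
```

Neither function knows anything about the cache, and the hook knows nothing about either message.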

Same hook, two call sites, completely different feedback. Because the hook never imports a toast library, it stays portable — you can lift it into another screen, or another app, and it just works. If the toast lived inside the hook, you'd be forking it the first time a second screen needed a different message.

Prefer mutateAsync over mutate for this. await gives you a linear flow — success line, then a catch for the error — right where the component already has its notification API in scope. The one rule you can't forget: mutateAsync rejects on error, so it must be wrapped in try/catch (or a .catch), or you get an unhandled rejection. The hook's onSettled still runs either way.

When plain mutate is fine: an interaction whose only feedback is the optimistic change itself, where errors self-heal via the gated refetch — there's no toast to show, and mutate can't throw. Just don't reach for mutateAsync and then ignore the returned promise; that's the worst of both.

A bonus from the key factory: exact vs. partial matching

All of this leans on query keys, and there's one behaviour worth internalising. A hierarchical key factory:

    const taskKeys = {
      all: ['tasks'] as const,
      list: (q) => ['tasks', 'list', q] as const,
      detail: (id) => ['tasks', 'detail', id] as const,
    }

invalidateQueries, cancelQueries, and the isMutating/isFetching filters all match partially by default — a query matches if its key starts with the key you pass. That prefix behaviour is the entire point of the factory: one key, many blast radii.

    You pass                    Matches
    ['tasks']                   every tasks query (all boards and all details)
    ['tasks', 'list']           all boards, any filter
    taskKeys.list(q)            that one filtered board
    ['tasks', 'detail', id]     a single task

Two gotchas. Objects inside a key are matched partially too (a subset match, not by reference), so an under-specified object filter can hit more queries than you meant. And when a prefix would catch siblings you want to leave alone, pin it with { exact: true }. My rule of thumb: invalidate at the broadest key that's still correct. A board move uses ['tasks'] deliberately, because a move can ripple into other columns — the wide prefix is the right radius, and the gated call keeps it to a single refetch per burst.
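The prefix and subset rules are easier to trust once you can poke at them. The matcher below is a simplified model of the behaviour described above, written from scratch for illustration; it is not the library's implementation, and the names matchesKey and partiallyMatches are hypothetical.

```typescript
// Simplified model of key matching. Not the library's own code.
type KeyPart = string | number | boolean | null | KeyPart[] | { [prop: string]: KeyPart }

// True when every part of `filter` is present in `actual`:
// arrays match by prefix, objects match by subset.
function partiallyMatches(actual: KeyPart | undefined, filter: KeyPart | undefined): boolean {
  if (actual === filter) return true
  if (actual === null || filter === null) return false
  if (typeof actual !== 'object' || typeof filter !== 'object') return false
  if (Array.isArray(actual) !== Array.isArray(filter)) return false
  if (Array.isArray(actual) && Array.isArray(filter)) {
    return filter.every((part, index) => partiallyMatches(actual[index], part))
  }
  const actualObject = actual as { [prop: string]: KeyPart }
  const filterObject = filter as { [prop: string]: KeyPart }
  return Object.keys(filterObject).every((prop) =>
    partiallyMatches(actualObject[prop], filterObject[prop]),
  )
}

function matchesKey(queryKey: KeyPart[], filterKey: KeyPart[], exact = false): boolean {
  // Matching in both directions amounts to deep equality.
  if (exact) return partiallyMatches(queryKey, filterKey) && partiallyMatches(filterKey, queryKey)
  return partiallyMatches(queryKey, filterKey)
}

const filteredBoard: KeyPart[] = ['tasks', 'list', { status: 'todo', assignee: 'me' }]

const hitByRoot = matchesKey(filteredBoard, ['tasks']) // true
const hitByDetail = matchesKey(filteredBoard, ['tasks', 'detail', '42']) // false
const hitBySubset = matchesKey(filteredBoard, ['tasks', 'list', { status: 'todo' }]) // true
const hitByExactRoot = matchesKey(filteredBoard, ['tasks'], true) // false
```

The hitBySubset case is the first gotcha in miniature: a filter object that names only status also catches the board filtered by status and assignee.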

The takeaway

"Just invalidate" is a fine default until interactions get fast and concurrent. Then the questions change: cancel the in-flight read first, decide whether the optimistic value or the response is the source of truth (for a drag, it's the optimistic value), gate the reconcile to the last mutation in the burst, and let errors heal by refetching rather than by restoring a stale snapshot. Keep all of that in the hook, and keep the toast at the call site so the hook stays reusable.

A checklist I now run before shipping any mutation:

  • Form or fast-optimistic interaction? Picked the matching strategy?
  • onMutate does cancelQueries before setQueryData?
  • If optimistic: no onSuccess reconcile (unless the server returns new data)?
  • Invalidate gated with imperative qc.isMutating(...) === 1?
  • Related mutations share a mutationKey, and polling pauses on it?
  • No snapshot rollback relied on under concurrency?
  • UI feedback at the call site via await mutateAsync in a try/catch, not in the hook?

Much of this is distilled from TkDodo's excellent Concurrent Optimistic Updates in React Query and Practical React Query — both worth reading in full.