Architectural Decisions

After years working with software, I've developed a set of principles that guide my architectural decision-making. These aren't hard rules; they're lenses to look through when faced with choices that will outlast the sprint they were made in. After each section I've added some questions to ask yourself or your LLM.

Simplicity over complexity

Prefer simple APIs over complex ones. A smaller surface area and a more focused role make a system easier to understand, reason about, and eventually replace.

Everything should be as simple as it can be, but not simpler ― Einstein

If something offers a larger surface area, stay in the simpler parts of the API rather than reaching for complex features — indexes in Postgres carry less lock-in than stored procedures, for instance. Prefer proxy APIs that allow for simpler switching later.
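
As a rough illustration of a proxy API, the sketch below hides a storage backend behind a two-method interface. The names KeyValueStore, InMemoryStore, and remember_last_login are made up for this example, and the in-memory class only stands in for whatever real client you might wrap. It is one possible shape, not a prescribed design.

```python
from typing import Dict, Optional, Protocol


class KeyValueStore(Protocol):
    """The small surface the application is allowed to depend on (hypothetical)."""

    def get(self, key: str) -> Optional[str]: ...

    def set(self, key: str, value: str) -> None: ...


class InMemoryStore:
    """Stand-in backend; a database-backed class would expose the same two methods."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


def remember_last_login(store: KeyValueStore, user_id: str, timestamp: str) -> None:
    # Application code sees only the two-method interface, never a vendor client.
    store.set(f"last_login:{user_id}", timestamp)


store = InMemoryStore()
remember_last_login(store, "42", "2024-01-01T00:00:00Z")
print(store.get("last_login:42"))
```

Swapping the backend later means writing one new class with the same two methods; the calling code does not change.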

Perfection is achieved not when there is nothing more to add, but when there is nothing more to remove. — Antoine de Saint-Exupéry

Grug puts it more bluntly:

The eternal enemy: Complexity

given choice between complexity or one on one against t-rex, grug take t-rex: at least grug see t-rex

Can I explain this in an hour and master it in a day?
Does this reduce complexity, or does it introduce new abstractions?
Is the API surface area as small as possible?

There is no black and white — it's a spectrum

Everything is on a spectrum, and "correct" can mean different things depending on your goals, timelines, and constraints. This is why it's worth spending a little time upfront to save days or years of pain, and why writing down the why behind a decision matters: you can re-evaluate it later. Understand that there are sometimes different solutions for the short, medium and long term.

What are the constraints — timeline, team size, risk tolerance?
Have I written down the why so I can revisit this later?
Am I optimising for now, or for longer-term maintainability?

Always compare

Starting from the best solutions in the space gives you the chance of landing on the right one.

Make sure decisions are evaluated against the top competitors, and that you have clear criteria for why you're choosing one thing over another. Check real-world reports of each in production use, and build a proof of concept — of the candidate and its competitors. Even a mental walkthrough of how an API or tool would be used in practice is more valuable than committing to a full implementation.

Always explain why you recommend what you do and what you considered along the way.
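
One way to make "clear criteria" concrete is a weighted scoring table. The sketch below is only an illustration: the criteria, weights, scores, and candidate names are all invented, and a real comparison would use criteria you have agreed on with your team.

```python
from typing import Dict

# Hypothetical criteria and weights; the weights sum to 1.0.
WEIGHTS: Dict[str, float] = {"simplicity": 0.4, "lock_in": 0.3, "maturity": 0.3}

# Hypothetical 1-5 scores (higher is better) for two unnamed candidates.
SCORES: Dict[str, Dict[str, int]] = {
    "candidate_a": {"simplicity": 4, "lock_in": 5, "maturity": 3},
    "candidate_b": {"simplicity": 3, "lock_in": 2, "maturity": 5},
}


def weighted_score(scores: Dict[str, int], weights: Dict[str, float]) -> float:
    """Sum of score * weight over every criterion."""
    return sum(scores[name] * weight for name, weight in weights.items())


# Rank candidates from highest to lowest weighted score.
ranking = sorted(
    SCORES,
    key=lambda name: weighted_score(SCORES[name], WEIGHTS),
    reverse=True,
)

for candidate in ranking:
    print(candidate, round(weighted_score(SCORES[candidate], WEIGHTS), 2))
```

With these made-up numbers candidate_a ranks first (about 4.0 against 3.3). The number matters less than the written-down weights, which record what you were optimising for when you chose.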

Note that some people skip this step. Some of that comes down to satisficing vs. maximising: if everyone maximised, we'd all be stuck forever trying to find the optimal solution. Still, it's worth understanding why people make the choices they do, and why we ourselves might skip the comparison. LLMs are a good starting point, provided we make sure they search for the most up-to-date information rather than relying only on what they knew at their knowledge cutoff.

Have I evaluated the top alternatives with clear criteria?
Have I checked real-world reports and built (or mentally walked through) a POC of each?
What are the gaps, and how did competitors solve them?
Have people already converged on a standard here?
Is there something I've heard or read about lately that seems better?

Prefer low vendor lock-in

Favour open standards, small API surface areas, and simple libraries over entirely proprietary solutions.

Databases are boring technology — and that's usually a good thing. MySQL, Postgres, and Redis (Valkey) are typically better choices than something like FaunaDB unless there's an order-of-magnitude performance gain or a meaningful speed-to-develop advantage and an easy migration off.

Is this based on open standards?
How hard would it be to swap this out in six months? A year?
Is there a proxy or abstraction I can sit behind?

One-way vs. two-way decisions

Returning to simplicity and implementation cost: can this be walked back in a week, a month, a quarter, or a year? Can migration be automated, or does it require a dedicated team effort?

If it's a two-way decision, it's safer to try it and back out with your learnings intact. If it's a one-way decision, are you taking enough time to be happy living with it for much longer?

This talk covers the idea of reversible vs. irreversible decisions well — worth the watch.

Can this be reversed in a week / month / year?
Can migration be automated, or does it need a dedicated team?
Am I treating this as reversible when it actually isn't?

The "do nothing" option is always an option

Choosing none of the above is always valid — but be honest about its costs and trade-offs. Sometimes thinking through the "do nothing" path is exactly what helps clarify the why behind whichever direction you eventually choose.

What is the cost of waiting or choosing nothing?
Does thinking through "do nothing" sharpen the case for the chosen direction?

Supported and maintained

Give more weight to projects whose maintainers have a long history of support. In the absence of that, prefer open-source projects or APIs close enough to a competitor that switching remains feasible. Prefer funded, profitable ventures over brand-new ones, and consider whether self-hosting is possible if needed.

All else being equal, the larger, more established community will usually be the safer bet.

XKCD 2347: Dependency — modern digital infrastructure balanced precariously on a single unmaintained project

Does the maintainer have a track record of long-term support?
Is this funded and profitable, or brand-new?
Could I self-host if needed?

Do try this at home

One genuine exception to the "boring tech" guideline is personal, low-impact projects, or intentionally cutting-edge experiments. Testing bleeding-edge ideas in a low-risk environment is one of the best ways to learn — you hit real-world problems while picking up new ways of thinking, and those learnings carry back into the main work.

Did it work?
Was it fun?
What part of the API or technology was best?

Consider total cost of ownership and ROI

Development and support time, onboarding speed, LLM familiarity (is it in the training data, or does context cost thousands of tokens?), documentation quality, build-vs-buy trade-offs, upgrade burden, security, testability, observability, tooling integration, and usability all factor into total cost of ownership.

What are the uptime risks? How proven is this, and are there real production adopters?

ROI matters too. Spending $20 to save $100 is making money — if your estimates are correct. Measure before you adopt and monitor after to validate your assumptions. Spending money isn't wasting money if the alternative is burning time or compounding technical debt.
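
The arithmetic behind that example can be written out as a small helper. This is a minimal sketch assuming ROI is defined as net gain divided by cost; the function name roi and the "measured" figure are invented for illustration.

```python
def roi(cost: float, benefit: float) -> float:
    """Return on investment as a fraction: (benefit - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost


# The estimate from the text: spend $20 to save $100.
estimated = roi(cost=20, benefit=100)

# Hypothetical follow-up measurement: the tool only saved $15.
measured = roi(cost=20, benefit=15)

print(f"estimated ROI: {estimated:.0%}")
print(f"measured ROI:  {measured:.0%}")
```

The estimate works out to a 400% return, while the invented measurement comes out negative. That gap is why monitoring after adoption matters.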

What are the ongoing support, upgrade, and security costs?
How fast can new people get up to speed?
Does the spending save more than it costs? Have I measured before and will I monitor after?
What are the uptime risks, and are there real production adopters?

When is the crowd wrong, and why?

Usually when there's pure hype without substance — the tool hasn't passed the tests above, or it has the wrong abstraction for your context. It's also possible that someone hasn't measured the right things, or has weighted them differently than you would.

Actively seek out dissenting opinions to stress-test your assumptions. Sometimes two people can both be right about the same thing — it's about weighing the trade-offs and planning for the risks. Look for the most critical takes and ask honestly whether you can live with those trade-offs.

When is the crowd right? Sometimes they actually are, so cross-check whether you've missed anything. Critical mass and community knowledge have real value: shared knowledge is searchable, baked into LLMs, and means more people have already faced — and solved — the problems you're about to hit.

Is this hype, or has it passed the tests above?
Have I sought out dissenting opinions?
Can I live with the criticisms?

Evaluate critically after the fact

Check whether your assumptions held as you move forward. Watch for friction, and don't be afraid to choose a different path, but try to converge on a single solution rather than maintaining multiple approaches in parallel. The principles of one-way vs. two-way decisions and simple APIs should support you here. Does the choice actually speed things up? Keep a decision registry and look back to understand when you were right or wrong and, more importantly, why.

XKCD 927: Standards — how competing standards proliferate

Were my assumptions correct, and what could I do better next time?
Where is there friction, and what is it telling me?
Should I consolidate to a single solution?

Architectural decision records

One way to ensure that these decisions are revisited is to create an architectural decision record (ADR). This is a document that outlines the decision, the context, the criteria, the options, and the reasoning behind the decision. Make sure there's a recommendation even in the first draft, so that reviewers know your initial thinking and the discussion can begin. The interesting discussions happen when people differ on a given point. The thing to remember is that people base their positions on different past experience and on different assumptions about team, timeframe, and constraints, and that predictions about the future diverge further when they rest on gut feeling or rough estimates. ADRs are a whole topic on their own, but the main thing is to create one, even if it's just for yourself to refer to later.

Conclusion

“When the facts change, I change my mind - what do you do, sir?” ― John Maynard Keynes

Good architectural decisions help narrow down the range of possible solutions in an unpredictable world. Ideally they balance risk, cost and reward, but ultimately they're about being intentional, comparing honestly, and setting yourself up to either change course or commit with confidence. Know what you're optimising for, write down the why, and revisit when the world changes. Good luck!

Do you agree or disagree on some points? Why? What things do you value and why? Hopefully this helps you think about your own design principles.