Designing Multi-Tenant Fintech Backends with Fastify and Cloud Spanner
What a year and a half of building a payments platform taught me about tenancy, isolation, and boring reliability
For the past year and a half I've been building a multi-tenant fintech platform - Fastify on the backend, Next.js up front, Google Cloud Spanner underneath. Payments is a domain where “move fast and break things” reads like a threat, so most of what I've learned is about making systems boring in the best possible way. Here's what actually mattered.
Pick your tenancy model before it picks you
The first real decision in any multi-tenant system is isolation. Database-per-tenant gives you the strongest walls and the worst operational story: migrations multiply, connection pools fragment, and onboarding a merchant means provisioning infrastructure. We went with shared tables and a tenant_id column on every row, enforced at the query layer rather than left to each handler's good intentions.
The rule that saved us: no query runs without a tenant context. We wrapped our data access so the tenant ID comes from the authenticated request, never from the request body. A tenant ID in a POST payload is an invitation to read someone else's ledger.
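A minimal sketch of that rule, not the platform's actual code. It assumes the verified token claims are available as request.claims and that queries use a named @tenantId parameter; resolveTenant, tenantScoped, and the in-memory fakeRunQuery are hypothetical names made up for illustration.

```javascript
// Hypothetical sketch: resolveTenant, tenantScoped, and fakeRunQuery are
// illustrative names, not the platform's real API.

// Resolve the tenant from verified token claims only. The request body is
// never consulted, so a forged tenantId in a payload is ignored.
function resolveTenant(request) {
  const tenantId = request.claims && request.claims.tenantId;
  if (!tenantId) {
    throw new Error("no tenant context on authenticated request");
  }
  return { id: tenantId };
}

// Wrap a query function so it cannot run without a tenant, and so the
// tenant ID is always bound by the wrapper rather than by the caller.
function tenantScoped(runQuery, tenant) {
  if (!tenant || !tenant.id) {
    throw new Error("refusing to create data access without a tenant");
  }
  return function query(sql, params = {}) {
    if (!/@tenantId\b/.test(sql)) {
      throw new Error("query must filter on @tenantId");
    }
    // Spread first, bind last: caller params cannot override the tenant.
    return runQuery(sql, { ...params, tenantId: tenant.id });
  };
}

// In-memory stand-in for a database, only here to make the sketch runnable.
const ledgerRows = [
  { tenant_id: "t-1", amount: 100 },
  { tenant_id: "t-2", amount: 250 },
];
function fakeRunQuery(sql, params) {
  return ledgerRows.filter((row) => row.tenant_id === params.tenantId);
}

const request = {
  claims: { tenantId: "t-1" }, // from the verified JWT
  body: { tenantId: "t-2" },   // client-controlled, must not be trusted
};
const query = tenantScoped(fakeRunQuery, resolveTenant(request));
const rows = query("SELECT * FROM ledger WHERE tenant_id = @tenantId", {
  tenantId: request.body.tenantId, // attempted override has no effect
});
// rows contains only the t-1 row.
```

The regex check is deliberately crude. It will not prove a query is correctly scoped, but it turns "forgot the tenant filter" from a silent data leak into an immediate error.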
RBAC belongs in the request lifecycle, not the handler
Fastify's hook system is the quiet hero of this platform. Role checks live in a preHandler hook, declared per route, so a handler body never has to ask “is this user allowed to be here?” - by the time it runs, that question is answered.
app.get(
  "/merchants/:id/settlements",
  { preHandler: requireRole("merchant:read") },
  async (req) => {
    // req.tenant is set by an earlier hook from the verified JWT.
    // No tenant or role logic lives in the handler.
    return settlements.list(req.tenant.id, req.params.id);
  }
);
This sounds obvious until you inherit a codebase where authorization is a copy-pasted if-statement in forty handlers, thirty-eight of which are correct.
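For illustration, here is one way a requireRole factory like the one in the route above could be written. It is a sketch under assumptions: it expects an earlier hook to have placed the verified user's roles on req.user.roles (a hypothetical shape), and it uses Fastify's callback-style hook signature (request, reply, done). fakeReply is a test stand-in, not part of Fastify.

```javascript
// Hypothetical sketch of a requireRole factory. The req.user.roles shape is
// an assumption; adapt it to wherever your auth hook stores verified roles.
function requireRole(role) {
  // Callback-style Fastify hook: call done() to continue the lifecycle.
  return function preHandler(req, reply, done) {
    const roles = (req.user && req.user.roles) || [];
    if (!roles.includes(role)) {
      // Replying from the hook ends the request here. done() is not called,
      // so the route handler never runs.
      reply.code(403).send({ error: "forbidden", required: role });
      return;
    }
    done();
  };
}

// Minimal stand-in for Fastify's reply object so the sketch runs on its own.
function fakeReply() {
  return {
    statusCode: 200,
    payload: undefined,
    code(status) { this.statusCode = status; return this; },
    send(body) { this.payload = body; return this; },
  };
}

// Usage against the stand-in: a user without the role is stopped at the hook.
const deniedReply = fakeReply();
let handlerReached = false;
requireRole("merchant:read")(
  { user: { roles: ["merchant:write"] } },
  deniedReply,
  () => { handlerReached = true; }
);
// deniedReply.statusCode is 403 and handlerReached is still false.
```

Because the factory returns a plain function, the required role sits next to the route definition, which makes a missing check easy to spot in review.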
Spanner rewards thinking in hierarchies
Cloud Spanner gives you horizontal scale with real SQL semantics, but it wants you to model data the way it's accessed. Interleaving merchant-owned rows under the merchant table keeps a tenant's working set physically close, and it made our most common query shape - “everything about this merchant” - cheap. The tradeoff is that you commit to an access hierarchy early. We spent more time on schema design than on any other architectural decision, and it was worth it.
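To make the interleaving idea concrete, here is a sketch with a made-up schema. The table and column names (Merchants, Settlements, AmountMinor, and so on) are hypothetical, not the platform's real tables. The DDL is kept as strings so the example stays in one language, and the { sql, params } object is shaped the way the Node.js Spanner client's database.run expects, though nothing here talks to a real database.

```javascript
// Hypothetical schema for illustration only.
const schemaDdl = [
  `CREATE TABLE Merchants (
     TenantId   STRING(36) NOT NULL,
     MerchantId STRING(36) NOT NULL,
     Name       STRING(MAX)
   ) PRIMARY KEY (TenantId, MerchantId)`,

  // An interleaved child's primary key must begin with the parent's full
  // primary key; Spanner then stores child rows alongside their parent row.
  `CREATE TABLE Settlements (
     TenantId     STRING(36) NOT NULL,
     MerchantId   STRING(36) NOT NULL,
     SettlementId STRING(36) NOT NULL,
     AmountMinor  INT64 NOT NULL,
     CreatedAt    TIMESTAMP NOT NULL
   ) PRIMARY KEY (TenantId, MerchantId, SettlementId),
     INTERLEAVE IN PARENT Merchants ON DELETE CASCADE`,
];

// The common "everything about this merchant" shape: a key-prefix read
// over rows that are stored together.
function merchantSettlementsQuery(tenantId, merchantId) {
  return {
    sql: `SELECT SettlementId, AmountMinor, CreatedAt
          FROM Settlements
          WHERE TenantId = @tenantId AND MerchantId = @merchantId
          ORDER BY SettlementId`,
    params: { tenantId, merchantId },
  };
}

const exampleQuery = merchantSettlementsQuery("t-1", "m-9");
```

The early commitment mentioned above is visible in the DDL: the child's key prefix and its parent are part of the table definition, so changing the hierarchy later means migrating data, not just rewriting queries.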
Automation is a product feature
Merchant onboarding used to be a chain of manual steps: CRM entry, KYC checks, gateway configuration. We automated the pipeline end to end - Zoho CRM integration, document validation, gateway provisioning - and onboarding time dropped 35%. The lesson wasn't “automation is good.” It was that onboarding is the first impression of your platform, and every manual step is a place where a customer waits.
Test coverage as a payments requirement
- We hold the platform above 80% test coverage, enforced in CI. In fintech this isn't vanity - a refund path that silently regresses is a financial incident, not a bug ticket.
- Contract tests around payment gateway integrations caught more real bugs than unit tests ever did. Third parties change behavior; your tests should notice before your customers do.
- CI/CD means deploys are boring. Boring deploys mean you deploy often. Deploying often means small diffs, and small diffs are debuggable.
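As a rough illustration of the contract-test idea from the list above, the sketch below checks a gateway refund response against a small set of expectations. The field names, allowed statuses, and fixtures are all assumptions invented for the example; they do not describe any real gateway's API.

```javascript
// Hypothetical contract for a refund response. Every field and rule here is
// an illustrative assumption, not a real gateway's schema.
const refundContract = {
  id: (v) => typeof v === "string" && v.length > 0,
  status: (v) => ["pending", "succeeded", "failed"].includes(v),
  amountMinor: (v) => Number.isInteger(v) && v > 0,
  currency: (v) => typeof v === "string" && /^[A-Z]{3}$/.test(v),
};

// Collect violations instead of throwing on the first one, so a CI run can
// report every drifted field at once.
function checkContract(contract, response) {
  const violations = [];
  for (const [field, isValid] of Object.entries(contract)) {
    if (!(field in response)) {
      violations.push(`${field}: missing`);
    } else if (!isValid(response[field])) {
      violations.push(
        `${field}: unexpected value ${JSON.stringify(response[field])}`
      );
    }
  }
  return violations;
}

// Inline fixtures; in practice these would be recorded sandbox responses.
const goodResponse = {
  id: "rf_1", status: "succeeded", amountMinor: 1250, currency: "EUR",
};
const driftedResponse = {
  id: "rf_2", status: "SUCCEEDED", amount: 12.5, currency: "EUR",
};

const okViolations = checkContract(refundContract, goodResponse);
const driftViolations = checkContract(refundContract, driftedResponse);
// okViolations is empty; driftViolations flags status and amountMinor.
```

The drifted fixture shows the kind of change a unit test with a hand-written mock would miss: the third party renamed a field and changed a status casing, while your own code is unchanged.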
What I'd tell past me
Multi-tenancy is not a feature you add - it's a constraint you design under from day one. Put tenancy and authorization in the request lifecycle, model your data around real access patterns, and automate anything a customer has to wait for. None of it is glamorous. All of it is why the system holds.