Architecture · 9 min read

Multi-Tenant Isolation: Why Row-Level Security is Not Enough

Four layers of tenant isolation from API gateway to database. Defense in depth for financial infrastructure.

A developer forgets a WHERE clause. The ORM generates an unscoped query. An admin endpoint bypasses the tenant filter for "operational convenience." One bug, one query, and Tenant A sees Tenant B's transaction history.

In a SaaS product, this is a data breach. In a financial system, it is also a regulatory violation. PSD2 requires protection of payment data. GDPR fines scale with the severity of the infringement and can reach a percentage of global annual turnover. DORA requires that ICT systems prevent unauthorized access to financial data. A single cross-tenant data leak can implicate all three.

Row-Level Security (RLS) is supposed to prevent this. The database enforces tenant boundaries regardless of what the application code does. But RLS is the last line of defense, not the only one. It answers "can this database session access this row?", but someone must set the session context correctly. Someone must ensure the tenant ID flowing through the system is authentic, not caller-supplied.

Multi-tenant isolation in financial systems requires four layers. Each layer catches failures that the others miss.

Layer 1: The Gateway

The API gateway sits between the public internet and the application. Its job: authenticate the caller, determine which tenant they belong to, and inject that identity as a trusted header.

The critical property: the application never reads tenant identity from the caller's request body, query parameters, or JWT claims that the caller controls. The gateway validates the OAuth2 token (or API key), resolves the associated tenant, and sets an HTTP header, X-Tenant-ID, on the internal request. The application reads the header. The caller cannot forge it.

This prevents a category of attack that application-level filtering cannot: a compromised or malicious caller that sends a valid authentication token but manipulates the tenant context. If the application reads tenant_id from the request body, a caller with valid credentials can claim to be any tenant. If the application reads X-Tenant-ID from a gateway-injected header, the claim is verified before the request reaches application code.

Implementation: Traefik ForwardAuth middleware, Kong plugins, or a custom gateway that calls an auth service. The auth service validates the token, resolves the tenant, and returns the tenant ID as a response header. The gateway injects it. The application trusts it.
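
The auth-service step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the API of Traefik or Kong: the token table and the names `resolve_tenant` and `build_internal_headers` are hypothetical, and a real service would validate an OAuth2 token or API key against an identity provider. The sketch shows one property only: any caller-supplied X-Tenant-ID is discarded before the trusted value is set.

```python
from typing import Mapping, Optional

# Hypothetical token store mapping a credential to its tenant.
# A real auth service would verify an OAuth2 token or API key instead.
TOKEN_TO_TENANT = {
    "token-alpha": "tenant-a",
    "token-beta": "tenant-b",
}

TENANT_HEADER = "X-Tenant-ID"


def resolve_tenant(authorization: Optional[str]) -> Optional[str]:
    """Return the tenant bound to a bearer token, or None if it is unknown."""
    if not authorization or not authorization.startswith("Bearer "):
        return None
    return TOKEN_TO_TENANT.get(authorization[len("Bearer "):])


def build_internal_headers(incoming: Mapping[str, str]) -> dict:
    """Build the headers for the internal request to the application."""
    # HTTP header names are case-insensitive, so look them up in lowercase.
    lowered = {name.lower(): value for name, value in incoming.items()}
    tenant = resolve_tenant(lowered.get("authorization"))
    if tenant is None:
        raise PermissionError("unauthenticated caller")
    # Drop whatever tenant header the caller sent, then set the trusted one.
    headers = {
        name: value
        for name, value in incoming.items()
        if name.lower() != TENANT_HEADER.lower()
    }
    headers[TENANT_HEADER] = tenant
    return headers
```

A caller holding `token-alpha` who also sends `x-tenant-id: tenant-b` still reaches the application as `tenant-a`, because the forged header is removed before the resolved value is written.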

Layer 2: The Application

Every service method receives tenant context from the gateway header. The context is immutable for the duration of the request. The service cannot construct a query for a different tenant, not because it's forbidden by convention, but because the API does not accept tenant ID as a parameter. It reads it from the injected header.

This eliminates the WHERE-clause problem. The application does not add WHERE tenant_id = ? to every query manually. The tenant context is set on the database connection at the start of the request, and RLS (Layer 3) enforces it transparently.
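
One way to make the tenant context immutable and keep it out of method signatures is a frozen dataclass held in a context variable. This is a sketch under assumptions: `TenantContext`, `begin_request`, `current_tenant`, and `list_transfers` are hypothetical names, and the returned string stands in for a real query.

```python
from contextvars import ContextVar
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class TenantContext:
    """Tenant identity for one request; frozen, so it cannot be reassigned."""
    tenant_id: str


# Holds the context for the request currently being handled.
_current = ContextVar("tenant_context")


def begin_request(headers: Mapping[str, str]) -> None:
    """Read the gateway-injected header once, at the start of the request."""
    tenant_id = headers.get("X-Tenant-ID")
    if not tenant_id:
        raise PermissionError("missing gateway-injected tenant header")
    _current.set(TenantContext(tenant_id))


def current_tenant() -> TenantContext:
    """Return the request's tenant; raises LookupError outside a request."""
    return _current.get()


def list_transfers() -> str:
    # The service method takes no tenant_id parameter, so callers have
    # no way to ask for another tenant's data through this signature.
    return f"transfers for {current_tenant().tenant_id}"
```

Because `list_transfers` has no tenant argument, a code path that wants a different tenant has nothing to pass; the only source is the value read from the header.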

But the application layer adds something that the database cannot: request-scoped validation. Before any write operation, the service verifies that the resources being modified belong to the current tenant. A transfer from Account A to Account B? Both accounts must belong to the requesting tenant. A customer update? The customer must belong to the requesting tenant. These checks happen in application code, before the database is touched.

Why not rely on RLS alone? Because RLS operates at the row level. It can prevent reading another tenant's rows. It cannot prevent semantically invalid operations within a single query, for example, constructing a transfer where the debit account and credit account belong to different tenants. That validation requires application logic.
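
The ownership check for a transfer might look like the following sketch. `Account`, `CrossTenantError`, and `validate_transfer` are hypothetical names, and the tenant ID is passed explicitly only to keep the example self-contained; in a real service it would come from the request context.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    account_id: str
    tenant_id: str


class CrossTenantError(Exception):
    """Raised when a resource does not belong to the requesting tenant."""


def validate_transfer(current_tenant_id: str, debit: Account, credit: Account) -> None:
    """Reject the transfer unless both accounts belong to the requesting tenant."""
    for side, account in (("debit", debit), ("credit", credit)):
        if account.tenant_id != current_tenant_id:
            raise CrossTenantError(
                f"{side} account {account.account_id} is not owned by the requesting tenant"
            )
```

The check runs before any write is issued, so a rejected transfer never reaches the database or the ledger.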

Layer 3: The Database

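The database layer applies PostgreSQL Row-Level Security policies to every table containing tenant data. For example:

-- RLS must be enabled on the table before any policy takes effect
ALTER TABLE finance_transfer ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON finance_transfer
  USING (tenant_id = current_setting('app.current_tenant_id')::uuid);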

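The application sets app.current_tenant_id at the start of every request, on the connection it will use (from the gateway-injected header). With pooled connections, the setting should be scoped to the request or transaction so that a previous tenant's value cannot carry over. Every query is automatically filtered. Forgetting a WHERE clause is irrelevant: the database will not return rows from other tenants regardless of the query.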
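
A sketch of setting the tenant for a single transaction from Python follows. It assumes a DB-API cursor with the `%s` parameter style (as in psycopg); `set_tenant_for_transaction` is a hypothetical helper name. PostgreSQL's `set_config` with its third argument set to `true` limits the value to the current transaction, which keeps a pooled connection from carrying one tenant's setting into the next request.

```python
import uuid

# set_config(name, value, is_local): is_local = true scopes the value
# to the current transaction only.
SET_TENANT_SQL = "SELECT set_config('app.current_tenant_id', %s, true)"


def set_tenant_for_transaction(cursor, tenant_id: str) -> None:
    """Bind the tenant to the current transaction on the given cursor.

    Must be called inside an open transaction, before any tenant query.
    """
    # Parse first so a malformed value fails here, not inside the policy cast.
    canonical = str(uuid.UUID(tenant_id))
    # Pass the value as a bound parameter, never by string formatting.
    cursor.execute(SET_TENANT_SQL, (canonical,))
```

The value is passed as a bound parameter, so a malformed or hostile tenant string cannot alter the statement itself.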

The key detail: the database role used by the application cannot bypass RLS. Only the migration role (used for schema changes, never for application queries) has BYPASSRLS permission. If the application role could bypass RLS, a single SET ROLE bug or SQL injection would defeat the entire isolation model.

-- Application role: RLS enforced
CREATE ROLE finance_api NOINHERIT NOBYPASSRLS;

-- Migration role: RLS bypassed (schema changes only, never used by application)
CREATE ROLE finance_admin BYPASSRLS;
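
The role attributes above can be audited at deploy time. The sketch below assumes the application reads one row for its own role from the `pg_roles` system view; `role_can_bypass_rls` is a hypothetical helper. It covers the two role attributes only; table ownership is a separate check.

```python
# Query to fetch the attributes of one role from the pg_roles view.
ROLE_AUDIT_SQL = (
    "SELECT rolname, rolsuper, rolbypassrls FROM pg_roles WHERE rolname = %s"
)


def role_can_bypass_rls(rolsuper: bool, rolbypassrls: bool) -> bool:
    """Return True if the role's attributes let it skip row security policies."""
    # Superusers and roles with BYPASSRLS are never subject to RLS.
    return bool(rolsuper or rolbypassrls)
```

A startup check could run `ROLE_AUDIT_SQL` for the application role and refuse to start if `role_can_bypass_rls` returns true.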

Layer 4: The Ledger

Financial systems have a concern that generic SaaS platforms do not: the ledger must enforce tenant isolation independently of the relational database.

The ledger engine operates on fixed-size records with a user_data_128 field that carries encoded tenant information. A transfer between two accounts is only valid if both accounts share the same tenant encoding. The ledger rejects cross-tenant transfers at the protocol level, before the transfer reaches storage.
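
The rule can be illustrated in Python. The source does not say how the tenant is encoded in user_data_128, so this sketch assumes the simplest option, the tenant UUID stored directly as a 128-bit integer; `LedgerAccount`, `encode_tenant`, and `same_tenant` are hypothetical names. In the architecture described here the check runs inside the ledger engine, not in application code.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerAccount:
    id: int
    user_data_128: int  # assumed encoding: tenant UUID as a 128-bit integer


def encode_tenant(tenant_id: str) -> int:
    """Encode a tenant UUID as the integer stored in user_data_128."""
    return uuid.UUID(tenant_id).int


def same_tenant(debit: LedgerAccount, credit: LedgerAccount) -> bool:
    """A transfer is valid only if both accounts carry the same tenant encoding."""
    return debit.user_data_128 == credit.user_data_128
```

A UUID is exactly 128 bits, so under this assumption the encoding is lossless and two accounts compare equal only when they belong to the same tenant.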

This is not redundant. It is defense in depth. Consider the failure mode: a bug in the application layer constructs a transfer where the debit account belongs to Tenant A and the credit account belongs to Tenant B. The application-level check (Layer 2) should catch this. But if it doesn't (a missing validation, a race condition, a code path added by a new developer who didn't know about the check), the ledger engine rejects the transfer. The financial record is never corrupted.

Application: bug constructs transfer(debit=TenantA:acct1, credit=TenantB:acct2)
Layer 2, Application check: should catch it. Result: Missed (bug)
Layer 3, Database RLS: are both accounts visible to TenantA? Result: Blocked
Layer 4, Ledger engine: accounts have different tenant encoding. Result: Rejected

Two independent safety nets behind the application. Either one is sufficient to prevent the corruption. Both together make it structurally impossible.

Workflow Propagation

Multi-step workflows span multiple services. Account creation calls the finance service, then the IBAN service, then the KYC service. Each call must carry the correct tenant context.

The durable execution engine propagates the tenant ID in every workflow invocation. When a workflow calls the finance service, it injects X-Tenant-ID in the HTTP request header. The finance service validates it against the gateway-injected value. If a workflow step calls an external provider (KYC, AML screening), the tenant context is used to select the correct provider configuration, without exposing the tenant ID to the external system.
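
The two propagation rules can be sketched as follows: internal calls carry the tenant header, while external calls use the tenant only to pick a configuration. The provider table, URLs, and function names are hypothetical and stand in for whatever the workflow engine and provider integrations actually use.

```python
from typing import Mapping

# Hypothetical per-tenant provider configuration, keyed by tenant ID.
KYC_PROVIDER_CONFIG = {
    "tenant-a": {"url": "https://kyc.example.com/v1/checks", "credential_ref": "kyc-key-1"},
    "tenant-b": {"url": "https://kyc.example.org/v1/checks", "credential_ref": "kyc-key-2"},
}


def internal_headers(workflow_tenant_id: str) -> dict:
    """Headers for a call from a workflow step to an internal service."""
    return {"X-Tenant-ID": workflow_tenant_id}


def external_kyc_request(workflow_tenant_id: str, payload: Mapping[str, str]) -> dict:
    """Build a request to an external provider without exposing the tenant ID."""
    # Unknown tenants raise KeyError rather than falling back to a default.
    config = KYC_PROVIDER_CONFIG[workflow_tenant_id]
    return {
        "url": config["url"],
        "credential_ref": config["credential_ref"],
        "body": dict(payload),  # the tenant ID is deliberately not included
    }
```

Failing on an unknown tenant, instead of using a default configuration, avoids a step silently running under another tenant's provider account.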

If the workflow engine does not propagate tenant context, cross-service calls may execute in the wrong tenant scope. This is a common bug, not a hypothetical one, in multi-tenant systems that add workflow orchestration as an afterthought.

Five Questions to Assess Your Isolation

Binary answers. No partial credit.

1. Can a caller set their own tenant identity? If the application reads tenant_id from the request body or a caller-controlled JWT claim: yes. Your gateway layer is missing or incomplete.

2. Can application code construct a query for a different tenant? If any code path accepts tenant_id as a function parameter rather than reading it from the request context: yes. Your application layer has a gap.

3. Does your database role have permissions to bypass RLS? If the application connects with a role that has BYPASSRLS, is a SUPERUSER, or owns the tables (owners are exempt unless FORCE ROW LEVEL SECURITY is set on the table): yes. Your database layer is broken. RLS is enabled but meaningless.

4. Can a transfer move funds between accounts of different tenants? If the ledger does not independently verify tenant ownership of both accounts in a transfer: yes. Your ledger layer is missing.

5. Does your workflow engine propagate tenant context across service boundaries? If cross-service calls within a workflow do not carry tenant context: no. Your multi-step processes may leak scope.

Every "yes" to questions 1-4 and "no" to question 5 is a gap. In a financial system, each gap is a potential regulatory finding.

Sources:

DORA, Regulation (EU) 2022/2554, Art. 9 (Protection & Prevention of unauthorized access)
PSD2, Directive (EU) 2015/2366, Art. 94 (Protection of personal data)
GDPR, Regulation (EU) 2016/679, Art. 32 (Security of processing)
PostgreSQL documentation: Row Security Policies (https://www.postgresql.org/docs/current/ddl-rowsecurity.html)