
Why General-Purpose Databases Are the Wrong Foundation for Financial OLTP

Amdahl's Law and the limits of general-purpose databases for high-throughput settlement workloads. The case for purpose-built engines.

PostgreSQL can store financial transactions. It does so reliably for millions of companies. The question is not whether it can store them; it is whether it can enforce financial invariants at the speed and correctness level required for a core banking ledger.

The answer, for a specific and well-defined class of workload, is no. Not because PostgreSQL is flawed. Because it was designed to solve a different problem.

The PostgreSQL Ledger

The default architecture: a transfers table, an accounts table with a balance column (or a materialized view that sums postings), and application-level logic to enforce double-entry. Transfers are INSERT-only. Balance updates are triggered by application code or database triggers.
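The pattern above can be sketched with an in-memory stand-in (dicts instead of tables; names and amounts are illustrative). The point to notice is where double-entry lives: entirely in application code.

```python
# In-memory sketch of the default architecture: an INSERT-only transfers
# journal, an accounts map with a balance column, and application-level
# double-entry. All names and amounts are illustrative.

accounts = {"alice": 100_00, "bob": 0, "fees": 0}   # balances in cents
transfers = []                                       # INSERT-only journal

def post_transfer(debit_acct: str, credit_acct: str, amount: int) -> None:
    """One debit, one credit, same amount: double-entry by convention."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    if accounts[debit_acct] < amount:
        raise ValueError("insufficient funds")
    transfers.append((debit_acct, credit_acct, amount))  # journal row
    accounts[debit_acct] -= amount                       # balance update, done by
    accounts[credit_acct] += amount                      # app code or a trigger

post_transfer("alice", "bob", 75_00)
post_transfer("alice", "fees", 1_50)
assert sum(accounts.values()) == 100_00  # money conserved -- as long as no code path skips this function
```

Every check here holds only by convention: any code path that writes the journal or the balances without going through `post_transfer` can silently break the invariant.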

At 100 TPS with 10,000 accounts uniformly distributed, this works without incident. Queries complete in single-digit milliseconds. Lock contention is negligible. The system passes every test.

The ceiling appears with hot accounts. A fee collection account touched by 30% of all transactions. A settlement account that aggregates outbound payments. A platform revenue account that receives every commission split. These accounts exist in every financial system. They are structural, not exceptional.

When 50 concurrent transactions attempt to update the same account balance, they queue. PostgreSQL acquires a row-level lock (FOR UPDATE) on the balance row. One transaction proceeds; 49 wait. Under SERIALIZABLE isolation, the only level that prevents all anomalies in financial workloads, the conflict detection layer adds further overhead. Serialization failures trigger retries. Retries compound under load.
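A toy model of that queue, assuming the application issues SELECT ... FOR UPDATE on the balance row: one lock per row, so all 50 transactions serialize behind it. This simulates the queueing, not PostgreSQL itself.

```python
import threading

balances = {"fees": 0}
row_lock = threading.Lock()   # stand-in for the row-level lock on the hot row

def collect_fee(amount: int) -> None:
    with row_lock:            # one transaction proceeds; the other 49 wait
        balances["fees"] += amount

threads = [threading.Thread(target=collect_fee, args=(1,)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Correctness survives, but every update ran strictly one at a time:
# this slice of the workload is inherently serial.
assert balances["fees"] == 50
```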

The degradation is non-linear. At 200 TPS, the system is fine. At 500 TPS, p99 latency doubles. At 1,000 TPS with hot accounts, the retry rate exceeds 20% and effective throughput plateaus. Adding connections makes it worse: more contention, more retries, more wasted work.

Amdahl's Law Applied to Ledger Workloads

Gene Amdahl formalized this ceiling in 1967. If a fraction S of a workload is inherently serial (cannot be parallelized), the maximum speedup from N parallel processors is:

Speedup = 1 / (S + (1 - S) / N)

As N approaches infinity, the speedup approaches 1/S. If 5% of your transfers touch a hot account that serializes access, the maximum speedup is 20x, regardless of hardware. If 10%, the ceiling is 10x.
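The formula is easy to check numerically; a minimal sketch:

```python
def amdahl_speedup(serial_fraction: float, n: int) -> float:
    """Speedup = 1 / (S + (1 - S) / N) for serial fraction S, N processors."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)

# The ceiling as N grows without bound is 1/S:
assert round(amdahl_speedup(0.05, 10**9), 2) == 20.0  # 5% serial -> at most 20x
assert round(amdahl_speedup(0.10, 10**9), 2) == 10.0  # 10% serial -> at most 10x

# And the knee arrives early: with 5% serial work, 64 cores deliver only ~15x.
assert 15 < amdahl_speedup(0.05, 64) < 16
```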

For financial workloads with realistic account distributions (Zipfian, a small number of accounts receive a disproportionate share of traffic), the serial fraction is typically 5-15%. The ceiling is real, and it is low.
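One way to see why the serial fraction lands in that range: draw transfers from a Zipf-shaped account distribution (exponent 1; the account count and sample size are arbitrary choices for the sketch) and measure the hottest account's share of traffic.

```python
import random

random.seed(7)  # deterministic sketch
N_ACCOUNTS, N_TRANSFERS = 10_000, 100_000
weights = [1.0 / rank for rank in range(1, N_ACCOUNTS + 1)]  # Zipf, s = 1

# Which account does each transfer touch?
hits = random.choices(range(N_ACCOUNTS), weights=weights, k=N_TRANSFERS)

hot_share = hits.count(0) / N_TRANSFERS  # rank-1 account's share of traffic
assert hot_share > 0.05  # the single hottest account already serializes >5%
```

With 10,000 accounts the rank-1 account alone draws roughly a tenth of all transfers; by Amdahl's formula that caps speedup near 10x before ranks 2 and 3 are even counted.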

The workarounds are familiar to anyone who has operated a PostgreSQL-based ledger at scale:

Sharding. Split accounts across multiple databases. Cross-shard transfers (sender on shard A, receiver on shard B) require two-phase commit: slow, complex, and a new failure mode. The operational overhead of managing shard boundaries, rebalancing, and cross-shard queries often exceeds the performance gain.

Eventual consistency. Relax the isolation level. Accept that balances may be temporarily wrong. Reconcile later. For a payment system that displays balances to customers, "temporarily wrong" means "the customer sees a number that is not their actual balance." This violates PSD2 Art. 87 (value date and availability requirements) and destroys user trust.

Optimistic locking with retries. Accept the contention. Retry failed transactions. Set a retry budget. Hope the contention resolves before the budget is exhausted. At high load, this degenerates into a retry storm where most database work is wasted.
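The arithmetic behind the retry storm, assuming each attempt independently hits a serialization failure with probability p:

```python
def expected_attempts(conflict_rate: float) -> float:
    """Attempts until one commit, if each attempt fails independently with
    probability p: geometric distribution, mean 1 / (1 - p)."""
    return 1.0 / (1.0 - conflict_rate)

def wasted_fraction(conflict_rate: float) -> float:
    """Share of total database work thrown away on aborted attempts."""
    return 1.0 - 1.0 / expected_attempts(conflict_rate)

# At the 20% retry rate seen at 1,000 TPS above, a fifth of all work is
# wasted; push contention to 50% and half the cluster redoes aborted work.
assert round(wasted_fraction(0.20), 2) == 0.20
assert round(wasted_fraction(0.50), 2) == 0.50
```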

Application-level locking. Use Redis advisory locks to serialize access outside the database. Now you have two systems maintaining consistency, the lock manager and the database, with a window between lock acquisition and database commit where they can disagree. A crash in that window produces inconsistent state.

Each approach trades one problem for another. None addresses the fundamental mismatch: PostgreSQL's row-level locking model is designed for arbitrary concurrent access to arbitrary schemas. Financial settlement is not arbitrary. It is a constrained, high-frequency, contention-heavy workload that benefits from a fundamentally different execution model.

The Purpose-Built Alternative

A settlement engine designed for financial OLTP makes three choices that a general-purpose database cannot make without abandoning generality:

Fixed-size records. Every account: 128 bytes, cache-line aligned. Every transfer: 128 bytes, cache-line aligned. No variable-length fields. No TOAST. No overflow pages. The storage engine calculates any record's disk position by arithmetic, no B-tree traversal, no index lookup. Sequential access is maximized. Random I/O is eliminated.
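The arithmetic addressing this enables can be sketched directly. The field layout below is illustrative, not any particular engine's; only the fixed 128-byte size matters.

```python
import struct

RECORD_SIZE = 128  # bytes; two 64-byte cache lines, no variable-length fields

def record_offset(index: int) -> int:
    """Disk position is pure arithmetic: no B-tree traversal, no index lookup."""
    return index * RECORD_SIZE

def pack_account(account_id: int, balance: int) -> bytes:
    """Toy fixed layout: 16-byte (128-bit) id + 8-byte balance, zero-padded."""
    body = struct.pack("<16sQ", account_id.to_bytes(16, "little"), balance)
    return body.ljust(RECORD_SIZE, b"\x00")

assert len(pack_account(42, 5_00)) == RECORD_SIZE
assert record_offset(1_000_000) == 128_000_000  # one multiply finds the record
```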

Batch processing instead of row-level locking. The engine collects transfers into batches (up to 8,190 per batch) and processes them in a single pass. There are no row-level locks because there is no concurrent access to individual rows. The batch is the unit of atomicity.

The throughput model inverts: more concurrent clients mean fuller batches, and fuller batches mean better amortization of per-batch overhead (consensus round, disk flush). Under increasing load, throughput improves: the opposite of a lock-based system.
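A toy cost model makes the inversion concrete. The overhead and per-transfer costs below are assumed numbers, chosen only to show the shape of the curve:

```python
def throughput_tps(batch_size: int,
                   per_batch_overhead_us: float = 1_000.0,  # consensus + flush (assumed)
                   per_transfer_us: float = 1.0) -> float:   # per-item work (assumed)
    """Transfers/sec when fixed per-batch costs amortize across the batch."""
    batch_time_us = per_batch_overhead_us + batch_size * per_transfer_us
    return batch_size / batch_time_us * 1_000_000

# Fuller batches, better amortization: throughput rises with offered load.
assert throughput_tps(10) < throughput_tps(1_000) < throughput_tps(8_190)
```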

Engine-enforced invariants. Double-entry is not a library or a trigger. It is a protocol constraint. A transfer where sum(debits) ≠ sum(credits) is rejected by the engine before it reaches storage. A transfer that would overdraw an account (when the debits_must_not_exceed_credits flag is set) is rejected. Deduplication is protocol-level: a 128-bit transfer ID prevents replays without application-layer idempotency keys.

No application bug can bypass these invariants. They are enforced by a separate process with its own memory space, its own validation, and its own storage. The blast radius of an application-layer error is bounded by the service boundary.
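A sketch of what protocol-level enforcement looks like (illustrative structure, not any specific engine's API): the checks run in the engine, in front of storage, on every batch.

```python
# Illustrative settlement-engine sketch: dedup, balance, and overdraft
# checks all run before anything is committed to storage.

class InvariantViolation(Exception):
    pass

class LedgerEngine:
    def __init__(self) -> None:
        self.seen_ids: set[int] = set()        # 128-bit transfer IDs: protocol dedup
        self.balances: dict[str, tuple[int, int]] = {}  # acct -> (debits, credits)
        self.overdraft_guard: set[str] = set()  # debits_must_not_exceed_credits

    def apply(self, transfer_id: int, postings: list[tuple[str, int, int]]) -> None:
        """postings: (account, debit_amount, credit_amount) legs of one transfer."""
        if transfer_id in self.seen_ids:
            raise InvariantViolation("duplicate transfer ID")
        if sum(d for _, d, _ in postings) != sum(c for _, _, c in postings):
            raise InvariantViolation("sum(debits) != sum(credits)")
        for acct, d, c in postings:
            pd, pc = self.balances.get(acct, (0, 0))
            if acct in self.overdraft_guard and pd + d > pc + c:
                raise InvariantViolation(f"{acct} would be overdrawn")
        for acct, d, c in postings:   # all checks passed: commit the batch
            pd, pc = self.balances.get(acct, (0, 0))
            self.balances[acct] = (pd + d, pc + c)
        self.seen_ids.add(transfer_id)

engine = LedgerEngine()
engine.overdraft_guard.add("customer")
engine.apply(1, [("customer", 0, 100_00), ("float", 100_00, 0)])  # fund account
engine.apply(2, [("customer", 40_00, 0), ("fees", 0, 40_00)])     # balanced: ok

rejected = 0
for tid, legs in [
    (2, [("customer", 1, 0), ("fees", 0, 1)]),             # replayed transfer ID
    (3, [("customer", 5, 0), ("fees", 0, 4)]),             # debits != credits
    (4, [("customer", 500_00, 0), ("fees", 0, 500_00)]),   # would overdraw
]:
    try:
        engine.apply(tid, legs)
    except InvariantViolation:
        rejected += 1
assert rejected == 3   # every bad batch bounced before touching storage
```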

The Right Tool for Each Layer

This is not an argument against PostgreSQL. PostgreSQL is the right choice for:

Data Type              Why PostgreSQL
Customer records       Relational, queryable, variable-length fields, full-text search
KYC/KYB data           Complex nested structures, JSON support, foreign keys to customers
Configuration          Key-value with constraints, audit triggers, RLS per tenant
Jurisdiction profiles  Relational with holiday calendars, business day rules
Audit logs             Append-only with hash chains, queryable by correlation ID
Transaction metadata   ISO reason codes, external references, posting audit trail

PostgreSQL handles all of this well. SQL is the right query language for ad-hoc analysis, reporting, and operational tooling.

The argument is against using PostgreSQL for one specific thing: high-throughput, contention-free, strictly serializable financial settlement. For this workload, and only this workload, a purpose-built engine delivers properties that PostgreSQL cannot match without compromises that undermine correctness.

The architecture: two storage layers, each optimized for its workload.

PostgreSQL: Relational storage

Domain data, metadata, config: customer records, KYC/KYB data, audit logs, transaction metadata, jurisdiction profiles. Full SQL power for queries and reporting.

Settlement Engine: Purpose-built OLTP

Ledger operations only: account balances, transfer execution, double-entry enforcement, two-phase commits. Sub-millisecond latency, zero contention.

PostgreSQL does what it does best: store and query structured data with full SQL power. The settlement engine does one thing: move money between accounts with strict serializability, zero contention, and engine-enforced invariants. Neither system tries to be the other.

How to Evaluate

Three questions for any ledger storage evaluation. The answers are measurable.

1. What happens when 100 concurrent transfers hit the same account?

Measure p99 latency, not p50. P50 tells you the typical case. P99 tells you what happens under contention, which is the only scenario that matters for hot accounts. If p99 degrades by more than 10x under hot-account load, the system has a contention ceiling.
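Measuring this takes nothing more than latency samples and a percentile function; the sketch below (nearest-rank percentile, synthetic numbers) shows why p50 hides exactly the behavior that matters:

```python
def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: good enough for a load-test report."""
    s = sorted(samples)
    k = max(0, min(len(s) - 1, round(p / 100 * len(s)) - 1))
    return s[k]

# Synthetic runs: uniform load vs. hot-account load with a long tail (ms).
uniform = [1.0] * 99 + [2.0]
hot     = [1.0] * 90 + [50.0] * 10

assert percentile(uniform, 50) == percentile(hot, 50) == 1.0  # p50: identical
assert percentile(hot, 99) / percentile(uniform, 99) >= 10    # p99: 10x+ worse
```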

2. Can a bug in your application code create money?

If the double-entry invariant is enforced in application code (a library, a trigger, a middleware), the answer is yes. A bug in the enforcement layer (a missing check, a race condition, an exception that skips validation) can produce a transfer where debits ≠ credits. If the invariant is enforced by a separate engine at the protocol level, the answer is no.

3. What is your deduplication strategy?

If it's an application-layer idempotency key (a UUID stored in a table, checked before every transfer), what happens when the key generation logic has a bug? What happens when the check and the transfer are not in the same atomic operation? Protocol-level deduplication, where the engine itself rejects duplicate transfer IDs, eliminates this class of problem.




Sources:

  • Amdahl, Gene M. "Validity of the single processor approach to achieving large scale computing capabilities." AFIPS '67, 1967.
  • PostgreSQL documentation: Row-Level Locking, Serializable Isolation (https://www.postgresql.org/docs/current/transaction-iso.html)
  • PSD2, Directive 2015/2366, Art. 87 (Value date and availability of funds)
  • McSherry, Frank, Michael Isard, and Derek G. Murray. "Scalability! But at what COST?" USENIX HotOS XV, 2015. On why scale-out is not always the answer for write-heavy OLTP.