
Deterministic Settlement: Why General-Purpose Databases Fail at Scale

Lock contention, GC pauses, and deserialization overhead. What a purpose-built settlement engine does differently.


A transfer in-flight is capital locked. The sender's balance is debited. The recipient's balance is not yet credited. That delta, the gap between commitment and availability, is money that neither party can use.

At low volume, the cost is invisible. At 1,000 transfers per second with an average settlement time of 200 milliseconds, roughly 200 transfers are in-flight at any moment. At EUR 500 average value, that is EUR 100,000 perpetually locked in transit. Increase settlement latency to 2 seconds and the locked capital rises to EUR 1,000,000. The infrastructure cost of the servers is a rounding error compared to the capital cost of the latency.

Banks hold capital buffers for exactly this reason. The buffer covers the uncertainty window: the period between "we committed to send" and "we confirmed it arrived." Shrink the window, shrink the buffer. The relationship is direct.
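The arithmetic above is Little's Law (L = λW): the average number of in-flight transfers equals the arrival rate times the settlement latency. A minimal sketch of the calculation, using the figures from this article:

```python
def capital_in_flight(tps: float, settle_seconds: float, avg_value: float) -> float:
    """Little's Law: in-flight transfers L = arrival rate (tps) * latency (W).
    Locked capital is L times the average transfer value."""
    in_flight = tps * settle_seconds
    return in_flight * avg_value

# 1,000 TPS at EUR 500 average value:
print(capital_in_flight(1_000, 0.200, 500))  # 200 ms settlement -> EUR 100,000 locked
print(capital_in_flight(1_000, 2.0, 500))    #   2 s settlement  -> EUR 1,000,000 locked
```

The function is illustrative, not a regulatory buffer formula: real buffer sizing also covers peak rates and failure scenarios, but the linear relationship between latency and locked capital holds.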

Why General-Purpose Databases Hit a Wall

The default architecture for a financial ledger: PostgreSQL (or MySQL, or any relational database), a transactions table, and application-level logic to update balances. For the first 10,000 accounts and 100 TPS, this works without incident.

The ceiling appears when accounts become "hot." A fee collection account touched by every transaction. A settlement account that aggregates all outbound payments. A platform revenue account that receives every commission split. These accounts are accessed by a disproportionate share of transfers, and every access contends for the same row lock.

PostgreSQL serializes writes to the same row. Under SERIALIZABLE isolation (the only level that prevents all anomalies in financial workloads), conflict detection adds further overhead. When two concurrent transactions touch the same account, one wins and one retries. Under load, retries cascade. Throughput degrades non-linearly.

Amdahl's Law quantifies the ceiling. If 5% of your workload is serialized (because 5% of transfers touch a hot account), the maximum theoretical speedup from parallelization is 20x. Add more cores, more connections, more replicas: the ceiling does not move. It is a property of the workload, not the hardware.
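Amdahl's Law in code, using the 5% serial fraction from the example above (the figures are illustrative, not vendor benchmarks):

```python
def amdahl_speedup(serial_fraction: float, n_cores: int) -> float:
    """Amdahl's Law: speedup = 1 / (s + (1 - s) / N)."""
    s = serial_fraction
    return 1.0 / (s + (1.0 - s) / n_cores)

# 5% of transfers touch a hot (serialized) account:
for n in (8, 64, 1024):
    print(n, round(amdahl_speedup(0.05, n), 2))
# The limit as N -> infinity is 1 / 0.05 = 20x, regardless of hardware.
```

Note how quickly the curve flattens: most of the achievable speedup is gone long before the core count gets large.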

The workarounds are familiar:

  • Sharding: split the account space across databases. But cross-shard transfers (sender on shard A, receiver on shard B) require distributed transactions. Two-phase commit is slow and complex. You have traded lock contention for coordination overhead.
  • Eventual consistency: relax the isolation level, accept that balances may be temporarily inaccurate, reconcile later. In finance, "temporarily inaccurate" means "the customer sees a balance that is wrong." Unacceptable for any system that displays balances in real time.
  • Application-level locking: use Redis or advisory locks to serialize access outside the database. Now you have two systems to keep consistent, two failure modes to handle, and a lock manager that becomes its own single point of contention.

Each workaround solves the immediate problem and introduces a new one. The fundamental issue remains: a general-purpose database was designed to handle arbitrary queries against arbitrary schemas. Financial settlement is not an arbitrary workload. It is a specific, constrained, high-frequency operation that benefits from a purpose-built execution model.

What a Purpose-Built Engine Does Differently

A settlement engine designed for financial workloads makes three architectural choices that a general-purpose database cannot make:

Fixed-size records. Every account is exactly 128 bytes. Every transfer is exactly 128 bytes. Cache-line aligned. No variable-length fields, no TOAST tables, no overflow pages. The CPU can predict memory access patterns, and the storage engine can calculate any record's position by arithmetic. No index lookup required.
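A sketch of offset-based record access. The field layout below is hypothetical (the engine's actual schema is not specified here); the point is that with fixed 128-byte slots, a record's position is pure arithmetic rather than a B-tree traversal:

```python
import struct

RECORD_SIZE = 128  # bytes, cache-line aligned (two 64-byte lines)

# Hypothetical layout for illustration: 16-byte id, four 8-byte counters,
# padded to exactly 128 bytes. Not the actual engine's schema.
ACCOUNT = struct.Struct("<16s Q Q Q Q 80x")
assert ACCOUNT.size == RECORD_SIZE

def record_offset(index: int) -> int:
    """Position of record n is index * size -- no index lookup required."""
    return index * RECORD_SIZE

store = bytearray(RECORD_SIZE * 1_000)  # fixed-size region, sized up front

def write_account(index, acct_id, debits, credits):
    ACCOUNT.pack_into(store, record_offset(index), acct_id, debits, credits, 0, 0)

def read_account(index):
    return ACCOUNT.unpack_from(store, record_offset(index))

write_account(7, b"ACC-0007".ljust(16, b"\0"), 500, 300)
print(read_account(7)[1], read_account(7)[2])  # 500 300
```

Because every slot has the same size, there are no overflow pages and no variable-length surprises: the address of record 7 is always byte 896.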

Batch processing. Instead of acquiring and releasing a lock per transfer, the engine collects transfers into batches, up to 8,190 operations per batch, and processes them in a single pass. Lock contention is irrelevant because there are no locks. The batch is the unit of work. All transfers in a batch are applied atomically.

The throughput model inverts: instead of degrading under concurrency, it improves. More concurrent clients mean fuller batches; fuller batches mean better amortization of the per-batch overhead. The system gets faster as load increases, up to the physical limits of memory bandwidth and disk I/O.
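A minimal sketch of batch-applied transfers. Assumptions: a single-threaded apply loop, per-transfer validation inside an atomically committed batch, and durability/replication out of scope:

```python
MAX_BATCH = 8_190  # operations per batch, as described above

def apply_batch(balances: dict, transfers: list) -> list:
    """Single pass over the batch: the batch is the unit of work.
    Each transfer is validated against the running balances; invalid ones
    are rejected individually, valid ones applied -- no locks, no
    interleaving with other writers."""
    assert len(transfers) <= MAX_BATCH
    results = []
    for sender, receiver, amount in transfers:
        if balances.get(sender, 0) < amount:
            results.append("insufficient_funds")
            continue
        balances[sender] -= amount
        balances[receiver] = balances.get(receiver, 0) + amount
        results.append("ok")
    return results

balances = {"alice": 100, "bob": 0}
print(apply_batch(balances, [("alice", "bob", 60), ("alice", "bob", 60)]))
# ['ok', 'insufficient_funds'] -- the second transfer sees the first's debit
```

Because one thread owns the state during the pass, there is nothing to lock: correctness comes from the execution model, not from concurrency control.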

Zero allocation, zero GC. All memory is statically allocated at startup. No garbage collector runs during operation. No malloc/free cycles. No fragmentation over time. Latency is deterministic: the time to process a batch is a function of the batch size, not of the heap state.

The result: sub-millisecond per-transfer latency with predictable throughput under load. No degradation curve. No tail latency spikes from GC pauses. No surprise behavior during peak hours.

Property                    | General-Purpose DB                     | Purpose-Built Engine
Record access               | Index lookup (B-tree traversal)        | Arithmetic (offset calculation)
Concurrency model           | Row-level locks, conflict detection    | Batch processing, no locks
Throughput under contention | Degrades non-linearly                  | Improves with fuller batches
Memory management           | Dynamic allocation + GC or manual free | Static allocation, zero GC
Tail latency (p99)          | Unpredictable (GC, lock waits, vacuum) | Deterministic
Invariant enforcement       | Application code or triggers           | Engine-level (protocol rejects invalid transfers)

What Deterministic Settlement Enables

Sub-millisecond settlement is not a vanity metric. It enables four operational capabilities:

Real-time balance visibility. Internal transfers settle instantly. No "pending" state for on-platform movements. The balance the customer sees is the balance they have. This eliminates an entire category of support tickets ("Why is my balance wrong?") and removes the need for an "available balance" vs. "ledger balance" distinction for internal flows.

Smaller capital buffers. Less uncertainty means less reserve required. If settlement completes in under a millisecond, the in-flight capital at any given moment is negligible. The capital that would otherwise be locked in buffers can be deployed productively.

Faster reconciliation. When settlement and recording happen in the same atomic operation, reconciliation becomes a verification step rather than an investigation. The ledger and the settlement engine agree by construction. Discrepancies can only arise at the boundary, when external systems (banks, clearing networks) are involved.

Deterministic simulation testing. If the engine is deterministic (the same inputs always produce the same outputs, in the same order), you can replay production workloads in a test environment and get identical results. This is how you test a financial system under load: not with mocks that approximate behavior, but with deterministic replay that reproduces it exactly. Inject faults (disk corruption, network partitions, process crashes) and verify that the engine recovers correctly. Every time. Reproducibly.
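A toy illustration of the replay property: funnel all "nondeterminism" through a single seed, and two independent runs of the same workload are identical, event for event. (The workload generator below is invented for the example, not part of any real engine.)

```python
import random

def run_workload(seed: int, n: int) -> list:
    """Deterministic toy engine: same seed -> same event log, same order."""
    rng = random.Random(seed)  # the only source of randomness
    balances = {"a": 1_000, "b": 1_000}
    log = []
    for _ in range(n):
        amount = rng.randrange(1, 50)
        src, dst = ("a", "b") if rng.random() < 0.5 else ("b", "a")
        if balances[src] >= amount:  # reject rather than overdraw
            balances[src] -= amount
            balances[dst] += amount
            log.append((src, dst, amount))
    return log

# Replay: two runs with the same seed are identical, so any bug found
# in one run can be reproduced exactly in the next.
print(run_workload(42, 1_000) == run_workload(42, 1_000))
```

In a real simulator the seed would also drive fault injection (crashes, partitions, corrupt reads), so failures reproduce as reliably as successes.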

The Performance Claims Problem

Most core banking vendors publish throughput numbers. "50,000 TPS." "100,000 accounts per second." These numbers are rarely meaningful without context.

Questions that matter:

  • What kind of transaction? A balance inquiry is not a settlement. A read is not a write. "TPS" without specifying the operation is meaningless.
  • What account distribution? 1,000 TPS against 1 million uniformly distributed accounts is trivial. 1,000 TPS against 100 accounts with a Zipfian distribution (hot accounts) is a different workload entirely.
  • What consistency model? "50,000 TPS" under READ COMMITTED is a different number than under SERIALIZABLE. Financial workloads require the strongest isolation. Quote the number at that level.
  • What percentile? p50 latency tells you the typical case. p99 tells you the worst 1-in-100 case. p100 tells you the actual worst case. For financial settlement, p99 and p100 are what matter. A system that is fast 99% of the time and stalls for 2 seconds on the 100th request is not deterministic.
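Percentiles make the last point concrete. Using the nearest-rank method (one of several percentile definitions), a workload that is fast 99 times out of 100 and stalls once looks fine at p99 and terrible at p100:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the value at rank ceil(p/100 * n), 1-indexed."""
    s = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(s)))
    return s[rank - 1]

# 99 sub-millisecond requests and one 2-second stall:
latencies_ms = [0.8] * 99 + [2_000.0]
print("p50 :", percentile(latencies_ms, 50))   # 0.8
print("p99 :", percentile(latencies_ms, 99))   # 0.8
print("p100:", percentile(latencies_ms, 100))  # 2000.0
```

The median and even p99 hide the stall entirely; only p100 exposes it. This is why a latency claim without its percentile distribution says little about determinism.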

Thought Machine (a core banking vendor) publishes certified performance reports with specific test journeys, pre-loaded data volumes, documented infrastructure, and independent verification. This is the standard the industry should adopt: qualified claims with reproducible methodology.

The honest approach: publish your test methodology alongside your numbers. Describe the workload. Name the isolation level. Show the percentile distribution. Let engineers evaluate the claim against their own workload profile. Unqualified performance numbers are marketing. Qualified performance data is engineering.




Sources:

  • Amdahl, Gene M. "Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities." AFIPS '67, 1967.
  • Gray, Jim. "A Measure of Transaction Processing Power." 1985. Basis of the TPC debit/credit benchmark.
  • Thought Machine. "Vault Core Performance." Certified benchmark methodology with independent verification.
  • Huang, Peng, et al. "Gray Failure: The Achilles' Heel of Cloud-Scale Systems." Microsoft Research, 2017. On silent hardware degradation.