Leverage Domain Knowledge for Faster Distributed Transactions03 Aug 2020
Blog post about paper: Path-Sensitive Atomic Commit: Local Coordination Avoidance for Distributed Transactions @ https://doi.org/10.22152/programming-journal.org/2021/5/3
Cross-posted on ING’s Tech Blog.
TLDR: Safely optimize distributed transactions by leveraging high-level domain-specific models.
Many tools and libraries in software try to make the work of engineers easier: to speed up development, but also to close the gap between IT and business. These tools provide abstractions that focus on writing business logic. Within ING Bank this is no different. We use and create tools and abstractions that are closer to the business and abstract away implementation details: Baker, Cucumber, front-end libraries and components, query creators, cryptography primitives, security layers, our internal API SDK, and a lot more.
The premise of this blog post is no different. We want to describe high-level business logic without being bothered by low-level implementation details. However, creating a performant implementation of said logic is non-trivial. This blog describes an approach on how this high-level domain knowledge encoded in a model can be used to optimize distributed transactions. This even gives us an advantage over general purpose transaction mechanisms that cannot depend on this extra domain knowledge and can be used for optimizing transactions between micro-services.
In other words: We want to make transactions faster, automatically. Our algorithm, Path-Sensitive Atomic Commit (PSAC), provides a more performant synchronization implementation for automatically generated implementations. This enables writing of high-level business logic or functional requirements, and letting the algorithm take care of performance at run time.
PSAC’s main idea is to use the explicit domain knowledge to improve concurrency where safely possible, e.g. multiple concurrent withdrawals on a bank account are safe when there is enough balance available for both. However, determining this can be more expensive computationally.
Of course this algorithm’s performance has to be evaluated. Does it really perform better than a base-line implementation? Here is a sneak preview of the performance results. PSAC performs up to 1.8 times better than 2PL/2PC in a high-contention scenario.
Background: Distributed Transactions
Transactions are a mechanism to limit the complexities inherent to concurrent and distributed systems, such as dealing with hardware failure, application crashes, network interruptions, multiple clients writing to same resource, reading of partial updates, and data and race conditions. ACID transactions are the standard in databases. ACID stands for Atomic, Consistent, Isolated and Durable.
Atomic Commit & Two-Phase Commit (2PC): 2PC is a well-studied atomic commitment protocol. Atomic Commit requires that multiple resources agree on an action: all should do it or non should do it. This also hold in case of failure of one of the resources. Resources in this case can be distributed over multiple server nodes, or can even be different applications (see XA).
2PC works with a transaction manager and multiple transaction resources. The manager asks the resources to vote on an action. Only when it receives a vote commit from all, it tells all to globally commit and apply the decision.
Concurrency Control & Two-Phase Locking (2PL): 2PL is a concurrency control mechanism that uses locking to make sure that no concurrent changes are made to a resource.
Distributed Transactions: 2PL and 2PC can be combined to implement ACID distributed transactions. The locks are on the level of the 2PC resources. When a resource has voted, it is considered locked. Only after handling a global commit or abort it is unlocked again. This makes sure no other transactions can change the data in the mean time.
Path-sensitive Atomic Commit
Models in Rebel
Let’s first look at an example of such semantically rich models. We use Rebel (paper), a domain specific language for financial products, based on state machines. The concept of leveraging model knowledge is not limited to Rebel. Our example: A bank account system example consisting of money transfers and accounts with balances, which should never go below 0, visualized as state charts:
In the textual representation, we see different classes, with some internal data, representing the account balance and identities. On each of the states events are defined with pre- and postconditions, e.g.
Withdraw is only valid when the account has enough balance available.
MoneyTransfer class has a special construct
sync which represents an atomic synchronized event, where money is
Withdrawn from one account and
Deposited from another. Either both should happen or none.
class Account accountNumber: Iban @identity balance: Money initial init on Open(initialDeposit: Money): opened pre: initialDeposit >= €0 post: this.balance == initialDeposit opened on Withdraw(amount: Money): opened pre: amount > €0, balance - amount >= €0 post: this.balance == balance - amount on Deposit(amount: Money): opened pre: amount > €0 post: this.balance == balance + amount on Close(): closed final closed class MoneyTransfer initial init on Book(amount: Money, to: Account, from: Account): booked sync: from.Withdraw(amount) to.Deposit(amount) final booked
We can see how these kinds of models can represent different business logic on a relatively high level.
Rebel with 2PC/2PL
If we want to implement these models in a scalable system, we can represent all instances of these objects as 2PC resources. This means that they can be interacted with separately, until synchronization (using
sync) is requested. Locally each resource does 2PL, making sure that data is not changed concurrently, and 2PC is used to coordinate the sync.
The illustration above on the left describes what happens for such a resource (Account Entity). Vertically time is represented, and the arrows represent messages sent and received.
First (1) a vote request is received from a 2PC manager, preconditions are checked and the resource is locked. When another event (2) arrives it is delayed. Now when the 2PC manager signals the commit later (3), the event’s effects are applied to the resource’s internal state and the resource is unlocked. Now the delayed event can start as well. We see that in this way all events are nicely serialized for this resource and no preconditions are done on possibly invalid (partial) state. Event do have to wait on each other in this case, which can become a problem for busy resources.
When looking at the account model above, the most interesting precondition is
balance - amount >= €0 of
Withdraw, denoting that there should be enough balance available for the
Withdraw to be allowed.
2PL only allows a single
Withdraw to be in progress at the same time by locking the Account resource. When we naively allow multiple concurrent
Withdraws on an account, precondition checks could interleave each other, resulting in a balance below zero.
Enter Path-Sensitive Atomic Commit:
PSAC enables multiple concurrent events in-progress at the same time, resulting in lower latency for individual events, because of no locking and no delaying for events.
But how does it keep that safe?:
PSAC makes multiple concurrent
Withdraws safe by keeping track of all in-progress events. It effectively tracks all possible outcome states of in-progress events, and when a concurrent event arrives, the preconditions can be checked against all outcomes. If preconditions hold in all states, an event can already be accepted for processing (and the 2PC commit vote sent). Same for abort, if the preconditions fail in all states.
If the preconditions hold in some states, but not all, PSAC falls back to 2PL/2PC behavior and delays the events.
Withdraw example, multiple
Withdraws can be in progress concurrently when there is enough balance available for all.
Other examples such as
Deposits can also run concurrently, because adding money to an account is always allowed by its preconditions.
The PSAC diagram (above on the right) tries to explain in more detail how this works and represents the internal decisions of above sequence diagrams:
Withdrawarrives and since there are no events in progress, the preconditions are checked against the account state of €100. Now internally there are two possible outcome states, represented by the arrows: €100, when the
Withdrawis eventually aborted by the transaction manager, and €70 when the
-respectively representing the global commit and global abort.
- Now when another
Withdrawarrives, the possible outcomes tree is split again for the existing possible states.
- When a
Withdrawof €60 arrives, it is delayed, because in some of the outcome states its preconditions are valid and in some not.
- As -€50 commits, the outcome tree can be pruned, and the leaves where it had aborted are cut off.
- Now -€60 can be retried and can be directly rejected (
Fail), because in no possible outcome state its preconditions hold.
- When the first event commits the last open branch is pruned as well, and the state can be calculated by applying the postconditions of all events in order of original arrival.
This blog post’s goal is to intuitively sketch the algorithm. Please see the paper for more details.
Does it perform?
We implemented code generators for the Rebel specifications to both 2PL/2PC and PSAC on the Akka actor toolkit.
Experiments using the bank account system example above are scaled over an increasing number of nodes (and Cassandra database nodes) on Amazon AWS virtual machines. In this case there are as many money transfers as possible done, picked uniformly from 1000 bank accounts. This artificially increases the contention. The graph below contains a violin plots that show all captured throughput numbers and a fit through it. The transparent line is the linear scalability upper bound. In this graph we see both algorithms and their throughput numbers. PSAC outperforms 2PL/2PC, which is explained by the increased concurrency.
We thus see that there PSAC performs up to 1.8 times better than 2PL/2PC in this high-contention scenario. This promises good results in other situations with other models. We expect to improve the throughput (and latency) even more when more contention is happening, such as when a bank has to execute a lot of transactions involving a single bank account, e.g. when a tax office pays out benefits to citizens.
This shows that higher level semantically-rich models, such as Rebel, give possibilities in bridging the gap between declarative high-level models and optimized implementations.
This paper is part of my PhD project in the ongoing collaboration between ING Bank and Centrum Wiskunde & Informatica (CWI) on managing complexity of enterprise software ecosystems.