Myths about Distributed Business Transactions

Nowadays, most systems are distributed in one way or another - just accessing a database over the network makes the system distributed. Quite often there is also a message broker in the mix, which makes fertile ground for Distributed Business Transactions. There is a lot of mystery around them, and they are easily confused with ACID transactions. I'm going to address common myths and misunderstandings about Distributed Business Transactions and provide some options for handling multiple independent resources. Before we start, let's try to understand what a Distributed Business Transaction is.

Distributed Business Transactions

A Distributed Business Transaction is a business transaction that spans at least two independent transactional resources.

One example would be a database update and sending a message via a message broker (see Diagram 1). Those two resources have their own transaction management and are completely independent. Another example would be updating two different files on a file system.

Diagram 1

In order to even think about Distributed Business Transactions over n resources, at least n - 1 of them must have the option to revert (or overwrite) a transaction that was successfully committed before. We execute the Distributed Business Transaction by executing individual transactions T(1)...T(n) on the corresponding transactional resources. If the execution of transaction T(k) fails (where k ranges from 1 to n), we need to revert the already committed transactions T(1)...T(k-1). Only the last resource in the sequence can do without a revert option, because nothing is executed after it.
Another important characteristic of transactions against resources is idempotency. In order to fix certain Distributed Business Transaction issues, some solutions may choose to retry transactions that have already been committed, and such a retry must not apply the change a second time.
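
To make the idempotency requirement concrete, here is a minimal sketch in Python. The `IdempotentResource` class, its in-memory state and the `tx_id` parameter are assumptions made up for illustration; a real resource would have to persist the set of applied transaction ids together with the change itself.

```python
class IdempotentResource:
    """Hypothetical in-memory resource that ignores repeated transaction ids."""

    def __init__(self):
        self.balance = 0
        self.applied = set()  # ids of transactions that were already committed

    def commit(self, tx_id, amount):
        # A retry of an already committed transaction is a no-op.
        if tx_id in self.applied:
            return False
        self.balance += amount
        self.applied.add(tx_id)
        return True


resource = IdempotentResource()
first = resource.commit("tx-1", 100)
retry = resource.commit("tx-1", 100)  # the same transaction, retried
print(first, retry, resource.balance)  # True False 100
```

Without the check on `tx_id`, the retry would have applied the change twice.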

Do note that I'm not talking here about Distributed Transactions implemented by some database vendors that span across database nodes. These nodes are interdependent and managed by the database management system (DBMS). Although they internally suffer from the same myths described in this article, we observe them as a single resource via a DBMS.

Myth: Atomicity

Diagram 1 shows a client trying to perform a Distributed Business Transaction comprised of two independent transactions - T1: updating the database, and T2: sending a message via a message broker. Consider a case where the database update succeeds and sending the message fails. We might retry sending the message but, depending on the failure, it may never succeed. If we observe only the database, we may draw the wrong conclusion that our Distributed Business Transaction succeeded, while it didn't. Trying to revert the database in order to "roll back" the Distributed Business Transaction might also be difficult.
A Distributed Business Transaction over several independent resources will never be atomic if we judge atomicity at the level of any single resource, unless the independent resources are reasoned about as a single unit of atomicity.

Bonus Myth: Consistency

It may go without saying, but several independent resources will never have immediate consistency, only eventual consistency, unless the independent resources are reasoned about as a single unit of consistency.

Myth: Two-Phase Commit

In order to reduce the chance of a Distributed Business Transaction failing, a two-phase commit protocol can be introduced. It does exactly what its name suggests - it splits the commit into two phases: pre-commit and commit. The pre-commit phase is more of an inquiry to see whether the transaction would be accepted, and the commit phase actually executes the transaction. Only if all resources confirm the pre-commit phase is the commit phase executed (see Diagram 2).

Diagram 2

However, as you might have guessed, this only reduces the chance of a partially done commit. A lot can change between the pre-commit and commit phases. If a pre-commit was successful, it doesn't guarantee that a commit will be.

Two-phase commit gives you comfort, not a solution.

An interesting benefit is that, where the resources allow it, individual transactions can be executed in parallel.
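
The gap between the two phases can be sketched in a few lines of Python. Everything here (the `Participant` class, its flags and the `two_phase_commit` function) is a hypothetical illustration under simplified assumptions, not the API of any real transaction manager.

```python
class Participant:
    """Hypothetical resource taking part in a two-phase commit."""

    def __init__(self, name, accepts=True, fails_on_commit=False):
        self.name = name
        self.accepts = accepts
        self.fails_on_commit = fails_on_commit
        self.committed = False

    def pre_commit(self):
        # Phase 1: only an inquiry, nothing is applied yet.
        return self.accepts

    def commit(self):
        # Phase 2: may still fail even though pre_commit succeeded.
        if self.fails_on_commit:
            raise RuntimeError(f"{self.name}: commit failed")
        self.committed = True


def two_phase_commit(participants):
    # Phase 1: every participant must confirm.
    if not all(p.pre_commit() for p in participants):
        return "aborted"
    # Phase 2: commit one by one; a failure here leaves a partial commit.
    for p in participants:
        try:
            p.commit()
        except RuntimeError:
            return "partially committed"
    return "committed"


database = Participant("database")
broker = Participant("broker", fails_on_commit=True)
outcome = two_phase_commit([database, broker])
print(outcome, database.committed, broker.committed)
# partially committed True False
```

Both participants confirmed the pre-commit, yet the database ends up committed while the broker does not.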

Myth: Distributed Saga

Distributed Saga is a mechanism that helps coordinate a Distributed Business Transaction. It expects that each resource transaction can be compensated (reverted). The Distributed Business Transaction is executed step by step, by executing transactions against resources. If one of the steps fails, all previous ones get compensated (see Diagram 3).

Diagram 3

Of course, atomicity and consistency must be reasoned about at a higher level rather than at the level of a single resource. Where the steps do not depend on each other, individual transactions could be executed in parallel.
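
The step-by-step execution with compensation described above can be sketched as follows. The `run_saga` function and the sample steps are assumptions for illustration only; in particular, this sketch ignores the case where a compensation itself fails, which a real coordinator has to handle.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order; compensate on failure."""
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Undo the already committed steps, newest first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True


log = []


def send_message():
    # Simulated failure of the second resource.
    raise ConnectionError("broker unavailable")


steps = [
    (lambda: log.append("db updated"), lambda: log.append("db reverted")),
    (send_message, lambda: log.append("message retracted")),
]
succeeded = run_saga(steps)
print(succeeded, log)  # False ['db updated', 'db reverted']
```

Between "db updated" and "db reverted", anyone observing only the database sees a state that never belonged to a successful business transaction.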

Myth: Trust a single Resource

The idea behind this approach is to define the order of individual transactions against resources. Once the order is determined, the success of the transaction on the last resource defines the success of the whole Distributed Business Transaction. Usually, the resource that is the most difficult to revert is picked to be the last one, since the last transaction never has to be reverted.

Of course, this approach limits the parallelization of the whole Distributed Business Transaction. However, it simplifies reasoning about atomicity since only one resource is taken into consideration.
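
A rough sketch of this ordering idea in Python follows. The `execute_in_order` function, the numeric `revert_cost` and the sample resources are all hypothetical; how to rank resources by revert difficulty is a design decision this sketch simply assumes has been made.

```python
def execute_in_order(transactions):
    """transactions: list of (name, commit, revert, revert_cost) tuples.

    The transaction that is hardest to revert runs last, so its outcome
    is the outcome of the whole Distributed Business Transaction.
    """
    ordered = sorted(transactions, key=lambda t: t[3])
    names = [t[0] for t in ordered]
    done = []
    for _name, commit, revert, _cost in ordered:
        try:
            commit()
        except Exception:
            # Only earlier, easier-to-revert transactions get reverted.
            for undo in reversed(done):
                undo()
            return False, names
        done.append(revert)
    return True, names


state = {"db": None, "sent": []}


def update_db():
    state["db"] = "new"


def revert_db():
    state["db"] = None


def send():
    state["sent"].append("event")


def unsend():
    raise NotImplementedError("a sent message cannot be recalled")


ok, order = execute_in_order([
    ("broker", send, unsend, 10),
    ("database", update_db, revert_db, 1),
])
print(ok, order)  # True ['database', 'broker']
```

The broker is placed last because its message cannot be recalled, so `unsend` is never called.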

Conclusion

A very common misconception is to treat Distributed Business Transactions the same way as ACID transactions. Distributed Business Transactions require much more planning and design in advance. Atomicity and consistency have to be reconsidered at the level of the whole business transaction. Also, there is no silver bullet for implementing Distributed Business Transactions. Don't trust out-of-the-box solutions without fully understanding what their pros and cons are. Good luck!
