Posts

Optimistic State Machine Execution

Image
Distributed Consensus algorithms play an essential role in any Distributed System in which a cluster of nodes must agree on a single value - usually an order of user requests. Once user requests are replicated and uniquely ordered on each node (in a replication log ), a node can apply the agreed-upon request to its state machine. If each node has the same implementation of a deterministic state machine, the state on all nodes will be the same (up to the given user request). Voila, we have achieved redundancy in our system, and if some  of the nodes fail, we still have the state available  on other nodes for users to consume.  State Machine Replication Idempotency If a node crashes and restarts, it wouldn't be optimal to have the apply process start applying requests from the beginning of time. Hence, the apply process durably stores its progress in an apply_index variable. This saves us some CPU cycles. However, a node could crash during the apply process before we have durab

Lock-free Exclusive Processing

Image
The problem that I'd like to explore in this article is related to multithreaded processing, and it goes something like this:  Design a component that executes tasks one after the other - there is no overlapping in processing. Tasks could be submitted by different threads. Of course, the most obvious solution would be to synchronize the execution of tasks. Each thread would try to acquire the lock on this object if possible, otherwise, it will block and wait for the lock to be released. While waiting for this lock to be released, the thread cannot do anything else. This approach creates a lot of contention and besides that, acquiring and releasing the lock is quite an expensive operation. So, it is not performant as well. However, we must admire the simplicity of the solution. Is there a more performant solution that does not block the thread which wants to execute the task?  Firstly, let's decouple task submission and task execution - a queue sounds like a perfect fit for the

Bi-Temporal Event Sourcing, who cares?

Image
  Everybody wants to control time and change history. Even the software. Event Sourcing gives a reliable history of facts during a monotonically increasing timeline. Although people want to change history, it is not possible, despite many attempts to do so . However, in software, as well as in life, we make conclusions based on the facts we have at certain points in time. If later, we learn new facts that happened earlier, we may change our previous conclusions.  Many people wrote about temporality: Mathias Verraes , Martin Fowler , and others. There are even some database solutions that support bi-temporality: XTDB , Rails EventStore , etc. I'd definitely recommend reading these articles to get familiar with the topic. If there are so many resources, why am I writing about it? I personally think there is a lot of ambiguity about what it means for the Event Store to support bi-temporality. Will Event Store break its guarantee and rewrite history once it finds out about facts (event

Myths about Distributed Business Transactions

Image
Nowadays, most systems are distributed in one way or the other - just accessing a database via HTTP makes the system distributed. Not so rarely, there is a message broker in the mix which makes a good soil for Distributed Business Transactions. There is a lot of mystery about them, they get confused with ACID transactions easily. I'm going to address common myths and misunderstandings about Distributed Business Transactions and provide some options on how to handle multiple independent resources. Before we start, let's try to understand what a Distributed Business Transaction is. Distributed Business Transactions Distributed Business Transaction is a business transaction that spans at least two independent  transactional resources.  One example would be a database update and sending a message via a message broker (see Diagram 1 ). Those two resources have their own transaction management and are completely independent. Another example would be updating two different files on a

Beyond Total Ordering in Event-Sourced Systems

Image
  In Event-Sourced systems, relying on the total order of events on all Event Store nodes is comforting, and provides ease during the design and implementation. A total order of events on all nodes is usually achieved via single-leader Consensus Protocols, which, unfortunately, have a bottleneck - the leader (I explained this in more detail in my previous article ).  The main reason for writing this article is to provide a solution for the potential issue of having a single leader as a bottleneck. To be clear, this is the issue that not so many systems meet. However, it is not satisfactory to leave it open, and that's why I am going to dive into a possible solution - Leaderless Consensus Protocols. Leaderless Consensus Protocols There is this mystical land of leaderless consensus protocols - a land where there is no bottleneck of a single leader, but each node can be the leader for an append. In other words, such a cluster can take appends via multiple Event Store nodes. One repres

Ordering in Event-Sourced Systems

Image
In a utopian world, Event Store will write events to the disk in the exact same order in which they have happened in real life. However, computers are not so good at observing things by themselves (yet), we (still) must tell them what to do. Hence, we must provide a way to deterministically order events inside the Event Store.  A naive approach would be to attach a timestamp to each and every event and use it for ordering. The reason for its naiveness lies in the difficulty of synchronizing clocks across different machines in a distributed system. As a consequence, we mustn't rely on timestamps for ordering . Since we also cannot impose ordering of events sent by different Event Store clients among them, the only thing left is to accept the ordering of received events by the Event Store. In other words, the order in which Event Store receives events is going to be the order of events written to the disk. This means that two events could have happened in one order in reality, but t

Do we need Event Sourcing?

Image
Before we start, I would definitely recommend getting familiar with the definition and benefits of Event Sourcing. Sara did an  amazing job  in explaining what Event Sourcing is in more detail and comparisons. Let's briefly touch upon some Event Sourcing aspects that are going to be important for this article.  In order to validate a business rule and/or make any decision in our software we need to create a model representation of our domain. For the purposes of this article, I'm going to call this model the  Decision Model . There are different ways of storing this model durably.  The first one is called  state-stored,  which means that we are storing only the current state of the Decision Model.  Another approach is to store the Decision Model as a series of events that describe facts that happened in our system - we say that this model is event-sourced . Now, when we need to load an  event-sourced model from the storage, we must read all events related to it. The most suitab