(W.I.P) Local transactions and their tradeoffs
What is a transaction?
The simplest definition is that a transaction is a set of operations that must succeed or fail as a single unit, with no partial outcome, so that data integrity is preserved. In traditional relational databases, transaction semantics are usually defined in terms of ACID. As systems scale out, however, these strict guarantees are often traded for the BASE model.
The ACID principles (where consistency is more important than availability):
Atomicity
Ensures that all operations within a single database transaction are treated as a single, indivisible unit. They are either all fully completed or the database is reverted to its original state as if nothing happened.
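As a minimal sketch of atomicity, the snippet below uses Python's built-in sqlite3 module. The accounts table, the account names, and the transfer helper are assumptions invented for illustration. The transfer credits one account first and then fails on the debit, so the credit has to be undone as well.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Hypothetical helper: move `amount` from `src` to `dst` as one unit."""
    # Used as a context manager, a sqlite3 connection commits if the block
    # succeeds and rolls back if the block raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

try:
    # The credit to bob succeeds, then the debit violates the CHECK constraint.
    transfer(conn, "alice", "bob", 500)
except sqlite3.IntegrityError:
    pass  # the whole transfer was rolled back, including the credit

balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(balances)  # {'alice': 100, 'bob': 50}
```

Without the surrounding transaction, bob would have kept the 500 credit even though the matching debit never happened.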
Consistency
Guarantees that a transaction can only bring the database from one valid state to another, strictly adhering to all database schema constraints such as foreign keys, unique indexes, data types, and check constraints. Software engineers frequently mistake database consistency for application consistency: the database enforces only the rules declared in its schema, not the business invariants that live in application code. For a deeper dive into application invariants, I recommend Milan Jovanovic's blog post on what invariants are and why a domain model is the best place to enforce them.
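The difference between schema constraints and application invariants can be sketched with sqlite3. Everything here is an assumption for illustration: the customers and orders tables, the try_insert helper, and a made-up business rule that a customer's orders may not exceed 1000 cents in total. The schema rejects a duplicate email and an orphaned foreign key, but it has no way of knowing about the credit limit.

```python
import sqlite3

CREDIT_LIMIT_CENTS = 1000  # hypothetical business rule, unknown to the schema

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
)
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " total_cents INTEGER NOT NULL CHECK (total_cents >= 0))"
)
conn.execute("INSERT INTO customers (id, email) VALUES (1, 'a@example.com')")
conn.commit()

def try_insert(sql, params):
    """Run one insert in its own transaction and report what the database did."""
    try:
        with conn:
            conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

order_sql = "INSERT INTO orders (customer_id, total_cents) VALUES (?, ?)"

# Schema-level rules: the database enforces these on its own.
duplicate_email = try_insert(
    "INSERT INTO customers (email) VALUES (?)", ("a@example.com",)
)
orphan_order = try_insert(order_sql, (999, 100))

# Application-level rule: each row is valid on its own, so both are accepted.
first_order = try_insert(order_sql, (1, 800))
second_order = try_insert(order_sql, (1, 800))

total = conn.execute(
    "SELECT SUM(total_cents) FROM orders WHERE customer_id = 1"
).fetchone()[0]
limit_broken = total > CREDIT_LIMIT_CENTS
```

Both orders are committed and the database is "consistent" in the ACID sense, yet the credit limit is broken. Enforcing that rule is the application's job, for example inside the domain model.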
Isolation
Determines how visible the changes made by one ongoing transaction are to other concurrent transactions. At its strictest level, it provides the illusion that transactions are executed serially (one after another), even though they run simultaneously; weaker levels relax this guarantee in exchange for more concurrency. We'll dig deeper into isolation levels later on.
Durability
Guarantees that once a transaction is committed, its changes are permanent and will survive any subsequent system failure, such as a sudden power outage, an operating system crash, or a hardware reboot.
The BASE principles (where availability has a higher priority than consistency):
Basically Available
The system guarantees availability in terms of the CAP theorem. Instead of locking down the entire system and returning errors during a network partition or a node failure, the system will choose to remain functional. Some parts of the data might be temporarily stale or inaccessible, but the database as a whole does not go down.
Soft State
In an ACID database, data is stable and deterministic at any given point. In a BASE system, the state of the data is "soft"—it can change over time, even without direct user interaction, as updates gradually trickle through different nodes in a distributed cluster.
Eventual Consistency
The system does not guarantee that everyone will see the same data at the exact same millisecond. Instead, it guarantees that if no new updates are made to a specific piece of data, all replicas across the network will eventually converge and become identical.
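The convergence described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the Replica class and the anti_entropy function are hypothetical, and conflicts are resolved with last-write-wins on a (logical timestamp, writer id) pair, which is just one of several possible strategies and not something this article prescribes.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    """Hypothetical replica holding a single last-write-wins register."""
    value: str = ""
    stamp: tuple = (0, "")  # (logical timestamp, writer id)

    def write(self, value, timestamp, writer):
        # A local write is accepted without contacting any other replica.
        self.merge(value, (timestamp, writer))

    def merge(self, value, stamp):
        # Keep whichever version carries the larger stamp.
        if stamp > self.stamp:
            self.value, self.stamp = value, stamp

def anti_entropy(replicas):
    """One full sync round: every replica pushes its state to every other one."""
    for source in replicas:
        for target in replicas:
            target.merge(source.value, source.stamp)

a, b, c = Replica(), Replica(), Replica()
a.write("draft", timestamp=1, writer="a")
c.write("final", timestamp=2, writer="c")

# Before any sync, readers can observe three different answers.
before = [r.value for r in (a, b, c)]

# Once updates stop and the replicas exchange state, they converge.
anti_entropy([a, b, c])
after = [r.value for r in (a, b, c)]
print(before, after)
```

Between the writes and the sync round, a reader may get "draft", an empty value, or "final" depending on which replica it reaches. That window is the price paid for accepting writes without coordination.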
Isolation levels
SERIALIZABLE - truly serial execution is viable, but only within certain constraints:
- Every transaction must execute quickly, because a slow one stalls every transaction queued behind it.
- Cross-partition serializability can be implemented, but it poses challenges of its own (TODO: article on that?)
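To make the first constraint concrete, here is a minimal sketch of literal serial execution: a single worker thread drains a queue of transactions, one at a time. The SerialExecutor class and the dict-based state are assumptions for illustration, not how any particular database implements it. Each transaction runs against a shallow copy of the state, which is swapped in only on success.

```python
import queue
import threading
from concurrent.futures import Future

class SerialExecutor:
    """Hypothetical single-threaded transaction executor over a dict."""

    def __init__(self, state):
        self._state = state
        self._queue = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, txn):
        """Enqueue a transaction (a callable taking the state dict)."""
        future = Future()
        self._queue.put((txn, future))
        return future

    def _run(self):
        while True:
            txn, future = self._queue.get()
            draft = dict(self._state)  # shallow copy: fine for flat values only
            try:
                result = txn(draft)
            except Exception as exc:
                future.set_exception(exc)  # draft is discarded, i.e. rolled back
            else:
                self._state = draft  # commit by swapping in the new state
                future.set_result(result)

def increment(state):
    state["counter"] += 1
    return state["counter"]

def broken(state):
    state["counter"] = -1
    raise ValueError("simulated failure after a partial write")

executor = SerialExecutor({"counter": 0})
results = [f.result() for f in [executor.submit(increment) for _ in range(100)]]

failed = executor.submit(broken)
failure = failed.exception()  # blocks until the transaction has run
final = executor.submit(lambda state: state["counter"]).result()
```

No locks are needed because only the worker thread touches the state, but the cost is visible in the design: a transaction that sleeps or waits on I/O inside the worker blocks everything queued behind it.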