Tuesday, May 10, 2016

Look Ma, No Locks - Act II - Tunable Consistency

 The last post described why our industry started using locks excessively in database system to achieve Serializable transactions. The conflict and tension between Availability and Consistency became more serious as days passed by...
This post first talks about how the concept of database Isolation Levels is actually an early attempt to define Tunable Consistency and then describes the essence of now famous CAP theorem.
Next post will lay out necessary foundations to understand current research in this area including commutatively and example STRONG eventual consistency model called Conflict-free Replicated Data Types.

Locks everywhere in all shapes and sizes.

To increase the performance we had to start using different types of locks for different types of things and operations. There were obviously separate reader and writer locks. There are page level locks and then there are row level locks. Sometimes we use transaction level locks, but most times the locks are statement level locks. In short there were locks everywhere in all sizes and shapes. But our insatiable thirst for throughput meant that we were ready to let go of some consistency guarantees. Stage for Availability vs Consistency battle was set.

Are these Database Isolation Levels or Tunable Consistency Levels in disguise?

To increase throughput (aka availability) we started dropping the locks starting with range lock, then read locks and even write locks. But that obviously started showing up undesirable side effects. Recall that any transaction is in final analysis a sequence of reads and writes and that all reads and writes to the same data items can potentially cause inconsistent state as a result of random scheduling.
Later these bad effects were categorized as follows
  • P0 - Dirty Read – Transaction could read uncommitted data written by another transaction. In short your reads are dirty.
  • P1- Non-Repeatable Reads – If transaction read same data item twice then value may be different between two reads. You reads are not reliably repeatable
  • P2- Phantom reads – This happens when a query with a where clause is executed twice and between those two calls new data gets added or deleted that satisfies the query- the result is phantom reads. Some other names for this are motion blur, fuzzy read
  • In addition there are many other phenomenon viz. Lost Update, Cursor Lost Update, Read Skew, Write Skew etc etc.

In short, there is no free availability without reduced consistency

Based these bad side effects, the ANSI standard defines following four isolation levels. In reality these are rather Consistency Levels and you can choose the reduced consistency for better performance
  • Read Uncommitted – This is lowest level of consistency guarantee. Dirty reads are possible, so are Non-Repeatable and Phantom reads
  • Read committed – In this next higher consistency level, only the committed data is read. Although dirty reads are prevented other anomalies are still possible
  • Repeatable Reads – At this consistency level single item reads are repeatable but phantom reads are still possible
  • Serializable – This is the highest consistency guarantee. It is not possible to have dirty reads, phantom reads or non-repeatable reads
  • Snapshot – this is yet another consistency level where your transactions can see only the snapshot of the data as it existed before your transaction began.

Distributed Transactions – the basics

  1. As computer systems were connected to each other on a distributed network,  our industry started exploring Master-Backup architecture. However there is always some delay between when transactions are committed at Master and when Backup is updated - leaving possibility for In-Consistency
  2. Alternatively you could have synchronous updates - The protocols like two phase commit and three-phase commit were introduced just for handling this sort of stuff. In essence a transaction is not committed until all machines participating in the distributed atomic transactions first commit their share of the transaction. The obvious side effect is that while Consistency is increased the Availability is decreased because of all this co-ordination effort …

Enter the CAP theorem

The CAP theorem says that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
  • Consistency (all nodes see the same data at the same time)
  • Availability (a guarantee that every request receives a response about whether it succeeded or failed)
  • Partition tolerance  (the system continues to operate despite arbitrary partitioning due to network failures)

 The theorem is easy to understand and equally easy to misapply.   Many have interpreted this as a binary choice between Consistency and Availability (because partitions are always ever present).
However what the theorem REALLY demands of you is following : 
  • better partition detection , 
  • better partition management 
  • And more importantly mitigation and recovery when partition recovers.
More on this next time ...


No comments:

Post a Comment