Google recently published a paper describing Spanner, their globally distributed, Paxos-replicated database with externally consistent transactions, using specially designed hardware and a new TrueTime API.
The motivation for using Spanner rather than the massive key-value store BigTable for certain types of applications is interesting:
We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions.
To achieve this, Spanner combines the replicated ACID transactions of Megastore with the scalability and throughput of BigTable. The challenges faced and their solutions (especially the custom hardware and the TrueTime API) are described in great detail in the research paper. For more details, you can also watch the presentation “Building Spanner” by Alex Lloyd (or just download the slides).
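To give a feel for what an API built around uncertain time looks like, here is a minimal Python sketch. It is a toy simulation, not Google's implementation. The paper describes TrueTime as returning an interval (earliest, latest) rather than a single timestamp, along with `after` and `before` checks, and describes a "commit wait" that holds a transaction until its timestamp is guaranteed to be in the past. The class name `SimulatedTrueTime`, the fixed `epsilon` error bound, and the use of the local clock are all assumptions made for illustration; the real system derives its bound from GPS and atomic clocks.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    """An interval that is assumed to contain the true current time."""
    earliest: float
    latest: float


class SimulatedTrueTime:
    """Toy stand-in for TrueTime: a local clock plus a fixed error bound.

    Assumption: the uncertainty is a constant `epsilon` in seconds.
    """

    def __init__(self, epsilon: float = 0.005, clock=time.time):
        self.epsilon = epsilon
        self.clock = clock

    def now(self) -> TTInterval:
        t = self.clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t: float) -> bool:
        """True only if t has definitely passed."""
        return self.now().earliest > t

    def before(self, t: float) -> bool:
        """True only if t has definitely not arrived yet."""
        return self.now().latest < t


def commit_wait(tt: SimulatedTrueTime, sleep=time.sleep) -> float:
    """Pick a commit timestamp, then wait until it is definitely in the past."""
    s = tt.now().latest          # no earlier than any possible "true now"
    while not tt.after(s):       # wait out the uncertainty window
        sleep(tt.epsilon / 2)
    return s
```

In this sketch, `commit_wait` blocks for roughly twice `epsilon`, which hints at why keeping clock uncertainty small matters: the wider the interval, the longer each commit has to wait before its result can be made visible.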
Google has already moved F1, the backend for their advertising platform, to use Spanner instead of MySQL.
There are several reactions from around the web.
It’s interesting to see that the creators of BigTable and the early proponents of eventual consistency have invested the last 4.5 years building a system that adds back strong consistency guarantees.
If the Spanner paper is as important as BigTable, ACID may become the new goal for those building distributed systems.
It makes me sad to see how far ahead Google is compared to the rest of the world. The notion of uncertain time is ingenious.
It represents a pragmatic acceptance of developer’s reluctance to reason in the absence of immediately consistent transactions and therefore strikes a bittersweet chord. Philosophically this feels like a big step forwards for distributed systems.
It is probably too early to tell whether similar open source software will appear soon, especially given that Google relies on custom hardware to get the full benefit. We also don’t know whether Google will offer this as an external data service through its Cloud platform. Such software could nonetheless help companies take advantage of globally distributed cloud infrastructure (to avoid service disruptions when a data center goes down) while still relying on strong consistency guarantees when designing their applications. NuoDB and FoundationDB are at least two projects that seem to target such a use case, although it is unclear how they will compare without special time-handling hardware.