The name, Bigtable, is not new. Google began developing its internal data storage system back in 2004, and wrote about it in a 2006 research paper. Google’s ideas influenced a generation of developers, all of whom were searching for cheap, distributed ways to store and query data at massive scale. The United States’ National Security Agency (NSA) built Accumulo, in part using Bigtable as an inspiration. Facebook built Cassandra, Powerset built HBase, LinkedIn built Voldemort, and a range of other big names in technology also developed their own solutions to problems very like the ones Bigtable was designed to address. And, over the years, all have grown beyond the companies that developed them, becoming commercial products, open source projects, or both. All except Bigtable. Until today.
Now Google is launching the public beta for a hosted version of Bigtable, running in its cloud, backed by its engineering talent, and available to all comers: meet Google Cloud Bigtable. A decade after its ideas were fresh and new, years after some of Google’s biggest competitors launched equivalent services of their own, can Google and Bigtable still compete? Or are they too late?
Google’s new service takes the Bigtable capabilities that already underpin internal applications like Gmail and makes them available to developers who need to run NoSQL workloads rapidly and at scale. According to Google, the new service is fast enough to serve web-scale applications directly. Competing services don’t always offer the performance needed to do this, forcing developers to insert caches or additional infrastructure between operational data and back-end processing or analytics.
Bigtable fully supports the HBase API, making it relatively straightforward to integrate with existing HBase-based applications and workflows, and Google claims significant performance improvements over native HBase. Google’s own figures suggest that Google Cloud Bigtable may achieve as much as twice the performance of leading competitors at half the total cost of ownership (TCO). The specific savings and performance gains will vary from use case to use case, of course, and Bigtable will not always be the cheapest or the fastest option.
A significant piece of Google’s value proposition with Bigtable relates to the way in which the full service is comprehensively managed by the company. ‘Management’ and ‘support’ figured throughout Google’s pre-briefing session, and ZDNet’s Toby Wolpe captures it well, quoting Google’s Cory O’Connor:
“[Bigtable's capabilities are] being packaged in a product that you don’t have to manage. Even if you had a piece of technology that could live up to these data sizes, managing has always been a challenge. When we say fully managed, this is not fully deployed or managed deployment. This is essentially an API that you provision with a guaranteed amount of server processor throughput behind it and unlimited flexible storage behind that as well.”
And this comprehensive management is a trend that we’re seeing more of. Altiscale, for example, offers something very similar for Hadoop. Their argument has always been that customers shouldn’t have to care about configuring clusters, or scaling workloads. Customers should simply ask for data to be processed, and the backend infrastructure should be available to make that happen. Easier said than done, for sure, but Altiscale seem to be doing something right.
There’s a group of customers for whom this management of all the underlying infrastructure is compelling, but it’s less clear that they are natural customers for Google. Technologically, Google can do this. Everything they’ve done for a decade or more clearly shows that to be the case, and the speeds and feeds around Google Cloud Bigtable are certainly impressive. But can they build the right relationships with the right people in the right parts of the right organizations to turn this into a viable business? That’s less clear.
Google is also rather late to this particular party. Amazon’s DynamoDB, for example, launched all the way back in January of 2012, and AWS has continued to improve the product in a variety of ways since then. DynamoDB may (as Google and some of their reference customers claim) be more expensive than Bigtable for particularly I/O-intensive workloads, but it’s still a compelling product… and one that’s been securing customers and mindshare for more than three years. And for those who don’t necessarily want to run everything in Amazon’s or Google’s cloud, alternatives like Basho’s Riak database continue to improve in some of the areas that Google specifically highlights as Bigtable’s strengths. Riak’s latest release, last month, claimed to have “increased write speeds by more than 2x for write-heavy workloads.” None of Google’s competitors are standing still.
With today’s news, Google brings another interesting product to a busy market. But there’s no need for competitors to pack up and go home just yet. Bigtable has some strengths, and it has some limitations. For customers who know what problem they’re trying to solve, there’s one more viable and powerful tool to test. For some of those customers, Bigtable will be the winner. But not for all of them. Not by a long way.
This article was written by Paul Miller from Forbes and was legally licensed through the NewsCred publisher network.