Tuesday, September 9, 2014

Database Elasticity: It's more important than you think

Databases are notoriously hard to do right: to design, to administer, to make highly available and, above all, scalable.

But this discussion will not be about SQL vs. NoSQL, Big Data vs. Not-So-Big Data. This is about elastic vs. inelastic.

Workloads are not only continuing their perpetual climb skyward, they are becoming less predictable, which means ever-larger worst-case peaks to prepare for. There is an old adage among IT managers: no manager ever got fired for buying IBM. Similarly, no DBA was ever fired for over-provisioning a database server. It's not hard to understand why: paying $10,000 for an un-sexy, infrastructure-like service such as a relational database when $5,000 would have covered any peak in demand is an extra $5,000, often a trivial cost compared to the potentially disastrous consequences of the high latency, instability, or outright unavailability that an under-provisioned $2,500 service could bring on [1]. Systems tend to be at their most unpredictable when approaching peak capacity, so over-provisioning is a deeply ingrained practice.
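To make the asymmetry concrete, here is a minimal sketch of the reasoning, reusing the article's round numbers but adding a purely hypothetical outage cost and outage probability; it is meant only to show why the extra $5,000 looks trivial, not to model any real system.

```python
# A rough illustration of the over-provisioning trade-off.
# The service prices come from the article; the outage figures are invented.

COMFORTABLE = 10_000   # over-provisioned service
SUFFICIENT = 5_000     # would have covered the observed peaks
RISKY = 2_500          # under-provisioned service

OUTAGE_COST = 200_000  # hypothetical business cost of a peak-hour outage
OUTAGE_PROB = 0.10     # hypothetical chance the $2,500 service falls over at peak

extra_for_safety = COMFORTABLE - SUFFICIENT
expected_outage_loss = OUTAGE_PROB * OUTAGE_COST

print(f"extra spent to over-provision:         ${extra_for_safety:,}")
print(f"expected loss from under-provisioning: ${expected_outage_loss:,.0f}")
```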

Consider the demand and capacity curves for a typical inelastic database service here:

The workload graph above is admittedly contrived and simplified for the purposes of this discussion, but it fairly represents the peaks and valleys that characterize most workloads on, say, a daily basis. The black line is the capacity of the server/VM you paid for; it could be a physical server in your own data center or a virtual machine deployed through a database-as-a-service like Amazon RDS or OpenStack Trove. The green area below the capacity line represents wasted computing power and, given the constant price per unit time that is typical of public clouds, wasted money. Put another way, the cost per operation is highest in the valleys of the workload and lowest at the peaks. Conversely, the blue area, where demand exceeds capacity, represents the dangerous situation of an under-provisioned service.
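A minimal sketch of that cost-per-operation effect, using an invented daily demand curve and an assumed flat hourly price (none of these numbers come from a real workload or price list):

```python
# Cost per operation under fixed, inelastic capacity with a constant price per hour.

HOURLY_PRICE = 1.00       # assumed flat $/hour for the provisioned instance
CAPACITY_OPS = 10_000     # assumed capacity, in operations per hour

# Hypothetical daily demand curve (operations per hour), with a valley and a peak.
demand = [1_000, 800, 900, 2_500, 6_000, 9_500, 7_000, 3_000]

for hour, ops in enumerate(demand):
    served = min(ops, CAPACITY_OPS)        # demand above capacity goes unserved
    cost_per_op = HOURLY_PRICE / served    # you pay per hour, not per operation
    utilization = served / CAPACITY_OPS
    print(f"hour {hour}: utilization {utilization:5.0%}, "
          f"cost per operation ${cost_per_op:.6f}")
```

The cost per operation is orders of magnitude higher in the valleys than near the peak, which is exactly the green area in the graph.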

The holy grail of elastic services, databases or otherwise, is to make the flat capacity curve track the demand curve as closely as possible, thereby approaching a constant cost per operation rather than per unit of time. The state of the art in relational database systems falls short of this ideal: services like Amazon RDS with Provisioned IOPS take an indirect step in this direction by charging per I/O in the underlying block storage. Research systems, such as the MoSQL storage engine, bring elasticity a layer higher, allowing the number of servers in a cluster to be flexibly adjusted to match the workload.
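Continuing the invented numbers from the sketch above, here is a rough comparison between an inelastic deployment sized for the peak and a hypothetical elastic one whose capacity is resized each hour to match demand; the per-unit price and headroom factor are assumptions, not any provider's actual pricing.

```python
# Fixed (peak-sized) vs. hypothetically elastic provisioning over one day.

HOURLY_PRICE_PER_UNIT = 0.10   # assumed $/hour per 1,000 ops/hour of capacity
FIXED_CAPACITY = 10_000        # provisioned for the worst-case peak
HEADROOM = 1.2                 # elastic case keeps 20% headroom above demand

demand = [1_000, 800, 900, 2_500, 6_000, 9_500, 7_000, 3_000]

fixed_cost = len(demand) * (FIXED_CAPACITY / 1_000) * HOURLY_PRICE_PER_UNIT
elastic_cost = sum((ops * HEADROOM / 1_000) * HOURLY_PRICE_PER_UNIT for ops in demand)

total_ops = sum(demand)
print(f"fixed:   ${fixed_cost:.2f} total, ${fixed_cost / total_ops:.6f} per op")
print(f"elastic: ${elastic_cost:.2f} total, ${elastic_cost / total_ops:.6f} per op")
```

Under these assumptions the elastic deployment's cost per operation stays nearly constant across the day, which is the ideal the capacity curve is chasing.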

An interesting twist is that public cloud providers will not have much interest in changing this status quo. As long as customers are able and willing to pay to over-provision services in the cloud, whether databases or anything else, public clouds will be able to over-commit their underlying hardware resources and sell more VMs with less hardware.

It is not clear how long this situation will last. As services such as databases become more elastic and capacity begins to more closely match workload, users will be attracted to the cost savings and place downward pressure on the over-commitment ratios for underlying hardware. Public clouds may be forced to either raise prices or move away from an instance-hour pricing model.

(A version of this article originally appeared on LinkedIn Pulse)

[1] The numbers are, of course, arbitrary; it could be $10,000 for an actual server housed in your own data center or a $10/hour VM on a public cloud.
