The rise of key-value "NoSQL" data stores over the past decade has been a natural outgrowth of the tendency of web-based applications to sidestep the object-relational impedance mismatch by storing complex objects inside blob fields in traditional relational databases. These values sometimes even contained nested structures, much to the horror of old-school DBAs used to strict adherence to third normal form [1]. Once this pattern set in, it was only a matter of time before people began to realize that you don't really need the whole RDBMS if you're going to use such a limited subset of its capabilities.
During the growth days of the LAMP stack in the early 2000s, you had to use the tools available. A pattern I saw frequently in many applications was the use of MySQL or PostgreSQL as effectively a key-value store, with a single monster table (or a small number of them) storing serialized objects in a format such as JSON.
Surprisingly, this pattern persists even in 2016, despite the availability of mature, specialized, open-source datastores [2] and even add-ons (maybe better called remove-ons?) for MySQL and PostgreSQL [3] that allow a more streamlined interface.
One thing that is easy to forget is that stable and complex RDBMS systems like MySQL are multi-tiered applications in and of themselves, and these tiers can be decoupled. Systems like MoSQL and Vertica essentially took the front end of MySQL and Postgres and connected it to entirely different backend storage systems. The memcached interface to InnoDB does the inverse - allowing developers who have no need for a full-blown SQL front end to make use of the stable and high-performance InnoDB storage engine that is the most commonly used backend for MySQL.
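To make that concrete, here is a minimal sketch of what skipping the SQL layer looks like from client code: a Go program speaking the memcached protocol to MySQL's daemon_memcached plugin via the bradfitz/gomemcache client. This is an illustration only - the address, key and JSON value are made up, and it assumes the plugin is installed and listening on its default port with a table mapped to the default container.

```go
package main

import (
	"fmt"

	"github.com/bradfitz/gomemcache/memcache"
)

func main() {
	// Point a stock memcached client at the MySQL server's daemon_memcached
	// port; InnoDB handles the storage underneath. (Hypothetical address.)
	mc := memcache.New("127.0.0.1:11211")

	// Set and Get bypass SQL parsing, planning and optimization entirely.
	if err := mc.Set(&memcache.Item{Key: "user:42", Value: []byte(`{"name":"alice"}`)}); err != nil {
		panic(err)
	}
	it, err := mc.Get("user:42")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(it.Value))
}
```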
My previous research experience in this area was in the construction of MoSQL, a storage engine for MySQL that we connected to a transactional, in-memory k-v store, along the lines of the deprecated BerkeleyDB storage engine for MySQL. One thing we became aware of very quickly in this process was the considerable performance penalty for running simple key-value workloads through MySQL compared to hitting the backend directly through its key-value interface - though our storage engine code was probably not sufficiently optimized.
Comparing vanilla Postgres, MySQL, memcached and memcached+InnoDB
After reading the article discussing Wix's use of MySQL, I became curious about just how expensive this whole SQL layer really is, and how much we are "paying" for it relative to a more streamlined interface.

There are a number of sites exploring the performance of memcached-innodb, but I wanted to dig a bit deeper and understand the performance relative to a baseline - vanilla memcached itself - and compare to SQL-based PgSQL for good measure. In a follow-up post, I hope to include some comparisons to HStore and JSONB for Postgres.
In other words, I will intentionally compare apples to oranges and pineapples. You’ve been warned. The reasoning is that if you are going to "misuse" an apple as an orange, why not try to understand what effect this has?
The Hypothesis

My hypothesis going in was that, for key-value-like workloads, MySQL and PostgreSQL would perform comparably, and that, given its simplicity, memcached would be significantly faster - it’s not an RDBMS, after all. What was less clear to me was where memcached+InnoDB would fall in between.
The tool and tests
As I’ve been looking for excuses to get better with both Go and Docker, I decided to write a simple sysbench-inspired benchmarking tool that supports the above systems, which you can find on GitHub. Why Go? From my perspective, it’s like a happy marriage of C and Python, keeping much of the benefit of both and discarding most of what is tedious, dangerous or downright annoying.
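To give a sense of the shape of such a tool (this is a sketch of the approach, not the actual code on GitHub), the core is a small harness that runs a given operation from N concurrent clients for a fixed duration and reports aggregate throughput:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// op is one benchmark operation, e.g. a single SELECT or a memcached GET.
type op func() error

// run fires `clients` goroutines that each call fn in a tight loop until the
// deadline, then reports aggregate throughput in operations per second.
func run(clients int, d time.Duration, fn op) float64 {
	var ops int64
	var wg sync.WaitGroup
	deadline := time.Now().Add(d)

	for i := 0; i < clients; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for time.Now().Before(deadline) {
				if err := fn(); err == nil {
					atomic.AddInt64(&ops, 1)
				}
			}
		}()
	}
	wg.Wait()
	return float64(ops) / d.Seconds()
}

func main() {
	// Plug in any backend here: a database/sql query, a memcached Get, etc.
	// This stub just measures an empty operation.
	throughput := run(16, 2*time.Second, func() error { return nil })
	fmt.Printf("%.0f ops/sec\n", throughput)
}
```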
Docker Machine is an amazing tool and an underappreciated gem for the benchmarker. With a few commands I can deploy a colocated benchmarking run - my framework alongside the system I want to test - on varying types of hardware and on several different platforms. I settled on a KVM-backed VM on my own laptop and a t2.medium instance on Amazon as the two testing grounds.
The workloads I tested are random PK-based INSERT, UPDATE, SELECT and DELETE operations against a relatively small table (160k rows), plus the equivalent operations against memcached itself and against the memcached interface for InnoDB. I tried to get at least three runs of each test, and ran tests with 1-16 clients [5]. I kept the table small on purpose: big enough that the BTrees were of non-trivial height in the RDBMSs, but still comfortably within memory, since ultimately what I'm most interested in understanding is the cost of the SQL layer.
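As a rough illustration of the SQL side of these operations (the table name, schema and connection string below are invented for the example, not taken from the actual benchmark), each operation is a single PK-based statement against a small key-value style table; the Postgres version differs only in the driver and placeholder syntax:

```go
package main

import (
	"database/sql"
	"fmt"
	"math/rand"

	_ "github.com/go-sql-driver/mysql" // swap for a Postgres driver such as lib/pq
)

const tableSize = 160000 // roughly matches the ~160k-row table used in the tests

func main() {
	db, err := sql.Open("mysql", "bench:bench@tcp(127.0.0.1:3306)/bench")
	if err != nil {
		panic(err)
	}
	defer db.Close()

	id := rand.Intn(tableSize)
	payload := fmt.Sprintf(`{"k":%d}`, id)

	// The four PK-based operations, each issued one row at a time.
	db.Exec("INSERT INTO kv (id, data) VALUES (?, ?)", tableSize+id, payload)
	db.Exec("UPDATE kv SET data = ? WHERE id = ?", payload, id)
	db.QueryRow("SELECT data FROM kv WHERE id = ?", id).Scan(&payload)
	db.Exec("DELETE FROM kv WHERE id = ?", id)
}
```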
Benchmark Results
Select
Probably the most interesting result in terms of illustrating the cost of operating a SQL front end to your data is random select performance. The test is about as easy as you can get from the perspective of an RDBMS - the data is entirely cached in memory, and there are no expensive operations such as BTree modifications or disk I/O.
As would be expected, performance for MySQL and PgSQL is comparable.
Where we see the cost of the SQL front end is in the difference between MySQL and InnoDB-memcache - about a 2x difference for the same test. Also note the difference between innodb-memcache and vanilla memcache: when we "unshackle" InnoDB from the need to do all the extra things an RDBMS has to do in comparison to a simple in-memory k-v system, it runs at nearly the same speed as vanilla memcached. This is quite impressive.
I should note that this difference might be even more pronounced if not for the cost of running the benchmark itself.
Insert
The insert performance results highlight one of the dangers of using a BTree-based RDBMS table for a key-value workload, and of evaluating the taste of an orange after you have bitten into an heirloom apple [4].
Our baseline, memcached alone, has no need to maintain a BTree or worry about writing anything to disk, and so writes are not too different from selects above.
All operations in the benchmark are done as single transactions - in other words, a single begin/commit and sync to disk per operation. This is an example of an apples-to-oranges comparison, but it is exactly why it can be unfair to think of "legacy" RDBMS systems such as MySQL or PgSQL as "slow". Given enough parallel clients, and some tuning, we could bring the RDBMS results a bit closer to the memcache result. But if you naively use an RDBMS as a k-v store, this is the sort of performance you can expect.
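To make that cost model concrete, here is a minimal sketch (the kv table and payload are hypothetical, and the batching shown is not something the benchmark itself does) contrasting per-statement commits with amortizing one commit over a batch of updates:

```go
package bench

import "database/sql"

// perStatement mirrors what the benchmark does: every Exec runs as its own
// implicit transaction, so each update pays the full commit-and-sync cost.
func perStatement(db *sql.DB, payload string) error {
	for i := 0; i < 100; i++ {
		if _, err := db.Exec("UPDATE kv SET data = ? WHERE id = ?", payload, i); err != nil {
			return err
		}
	}
	return nil
}

// batched does the same work inside one explicit transaction, paying the
// commit-and-sync cost once for all 100 updates.
func batched(db *sql.DB, payload string) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	for i := 0; i < 100; i++ {
		if _, err := tx.Exec("UPDATE kv SET data = ? WHERE id = ?", payload, i); err != nil {
			tx.Rollback()
			return err
		}
	}
	return tx.Commit()
}
```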
Delete
Delete performance has the interesting result of being the fastest of the operations for vanilla memcache. My guess is that this is because not even any memory copying is necessary - a flag is updated for the key, which is then garbage collected at some point - though I am not familiar with the details of the memcache implementation.
Performance for the other three is comparable to the results for insert, given the need to maintain and periodically resize the BTree.
Update
Updates also follow a similar pattern.
AWS Performance

The results for the tests against a t2.medium instance on EC2 were largely consistent with the results I got against my laptop. However, the lower I/O performance of a standard EC2 instance, along with 2 CPU cores instead of the one available on my KVM instance, makes the differences observed above even more pronounced. Presented as one big facet_grid:
Concluding Thoughts
As always, whenever possible, use the right tool for the job. If you have good reasons for using one tool for a number of purposes, as RDBMS systems historically were made to do, expect to put in some effort to make it perform as well as a tool built specifically for that purpose.
Providing a SQL-based interface to your data can reduce performance by 2x or more, and maintaining a B+Tree is expensive. But as we can see from the innodb-memcache select results, when unshackled and compared to NoSQL systems in a more apples-to-apples fashion, legacy RDBMS systems can run quite fast.
[1] At least those old-school DBAs who were allowed anywhere near these systems by the cool kids of the day.
[2] Too numerous to name: Redis, Cassandra, Mongo, etc.
[3] innodb-memcache, and hstore and JSONB for Postgres.
[4] At some point I'd like to expand these tests to include the Hash index type for MySQL, but I wanted to use a more "out of the box" setup for MySQL.
[5] This isn't many connections, and many of these systems scale right up to 256+ connections, but on a 1- or 2-core system the extra parallelism doesn't buy much, as can be seen from the improvement already tapering off at 16 clients for most of the tests.