Database Microbenchmarks

Symas Corp., July 2012


This page follows on from Google's LevelDB benchmarks published in July 2011. (A snapshot of that document is available here for reference.) In addition to the engines tested there, we add the venerable BerkeleyDB as well as the OpenLDAP MDB database. For this test we compare LevelDB version 1.5 (git rev dd0d562b4d4fbd07db6a44f9e221f8d368fee8e4), SQLite3 (version 3.7.7.1), Kyoto Cabinet's TreeDB (version 1.2.76; a B+tree-based key-value store), Berkeley DB 5.3.21, and OpenLDAP MDB (git rev a0993354a603a970889ad5c160c289ecca316f81). We would like to acknowledge the LevelDB project for the original benchmark code.

Benchmarks were all performed on a Dell Precision M4400 laptop with a quad-core Intel(R) Core(TM)2 Extreme CPU Q9300 @ 2.53GHz, 6144 KB of L2 cache, and 8 GB of DDR2 RAM at 800 MHz. (Note that LevelDB uses at most two CPUs, since the benchmarks are single-threaded: one to run the benchmark, and one for background compactions.) The benchmarks were run on two different filesystems: tmpfs, and reiserfs on an SSD. The SSD is a relatively old model, a Samsung PM800 Series 256GB. The system had Ubuntu 12.04 installed, with kernel 3.2.0-26. Tests were all run in single-user mode to prevent variations due to other system activity. CPU frequency scaling was disabled (scaling_governor = performance) to ensure a consistent CPU clock speed for all tests. The numbers reported below are the median of three measurements. The databases are completely deleted between each of the three measurements.

Update: Additional tests were run on a Western Digital WD20EARX 2TB SATA hard drive. The HDD results start in Section 8. The results across multiple filesystems are in Section 11.

Benchmark Source Code

We wrote benchmark tools for SQLite, BerkeleyDB, MDB, and Kyoto TreeDB based on LevelDB's db_bench. The LevelDB, SQLite3, and TreeDB benchmark programs were originally provided in the LevelDB source distribution but we've made additional fixes to the versions used here. The code for each of the benchmarks resides here:

Custom Build Specifications

Compression support was disabled in the libraries that support it. No special malloc library was used in the build. All of the benchmark programs were linked to their respective static libraries, to show the actual size needed for a minimal program using each library.

1. Relative Footprint

Most database vendors claim their product is fast and lightweight. Looking at the total size of each application gives some insight into the footprint of each database implementation.

size db_bench*
   text	   data	    bss	    dec	    hex	filename
 271991	   1456	    320	 273767	  42d67	db_bench
1682579	   2288	    296	1685163	 19b6ab	db_bench_bdb
  96879	   1500	    296	  98675	  18173	db_bench_mdb
 655988	   7768	   1688	 665444	  a2764	db_bench_sqlite3
 296244	   4808	   1080	 302132	  49c34	db_bench_tree_db
The core of the MDB code is barely 32K of x86-64 object code. It fits entirely within most modern CPUs' on-chip caches. All of the other libraries are several times larger.

2. Baseline Performance

This section gives the baseline performance of all the databases. Subsequent sections show how performance changes as various parameters are varied. For the baseline:

A. Sequential Reads

LevelDB 4,566,210 ops/sec
Kyoto TreeDB 851,788 ops/sec
SQLite3 265,816 ops/sec
MDB 14,492,754 ops/sec
BerkeleyDB 834,029 ops/sec

B. Random Reads

LevelDB 186,289 ops/sec
Kyoto TreeDB 107,631 ops/sec
SQLite3 82,706 ops/sec
MDB 768,640 ops/sec
BerkeleyDB 101,647 ops/sec

C. Sequential Writes

LevelDB 562,114 ops/sec
Kyoto TreeDB 345,423 ops/sec
SQLite3 55,860 ops/sec
MDB 113,161 ops/sec
BerkeleyDB 90,531 ops/sec

D. Random Writes

LevelDB 363,504 ops/sec
Kyoto TreeDB 106,134 ops/sec
SQLite3 37,074 ops/sec
MDB 93,835 ops/sec
BerkeleyDB 47,950 ops/sec

LevelDB has the fastest write operations. MDB has the fastest read operations by a huge margin, due to its single-level-store architecture. MDB was written for OpenLDAP; LDAP directory workloads tend to be many reads/few writes, so read optimization is more critical for that workload than writes. LevelDB is oriented towards many writes/few reads, so write optimization is emphasized there.

E. Batch Writes

A batch write is a set of writes that are applied atomically to the underlying database. A single batch of N writes may be significantly faster than N individual writes. The following benchmark writes one thousand batches, where each batch contains one thousand 100-byte values. TreeDB does not support batch writes, so its baseline numbers are repeated here for reference.
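The effect is easy to reproduce outside of db_bench. The sketch below uses Python's built-in sqlite3 module (this is not the benchmark code itself; the table name, row count, and value size are illustrative) to compare N individually committed inserts against the same N inserts grouped into one atomic transaction:

```python
# Illustrative sketch of why batched writes beat individual writes:
# one commit amortizes the per-transaction (journal + fsync) cost over
# all N inserts instead of paying it N times.
import os
import sqlite3
import tempfile
import time

def timed_insert(batched, n=200):
    path = os.path.join(tempfile.mkdtemp(), "bench.db")
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v BLOB)")
    value = b"x" * 100  # 100-byte values, as in the benchmark
    start = time.perf_counter()
    if batched:
        with con:  # one transaction: a single commit covers all n writes
            con.executemany("INSERT INTO kv VALUES (?, ?)",
                            ((i, value) for i in range(n)))
    else:
        for i in range(n):
            with con:  # one transaction (and one commit) per write
                con.execute("INSERT INTO kv VALUES (?, ?)", (i, value))
    elapsed = time.perf_counter() - start
    con.close()
    return elapsed

single = timed_insert(batched=False)
batch = timed_insert(batched=True)
print(f"individual: {single:.3f}s  batched: {batch:.3f}s")
```

The row count here is smaller than the benchmark's one thousand per batch, just to keep the sketch quick; the relative ordering is the same.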

Sequential Writes

LevelDB 745,712 entries/sec (1.33x baseline)
Kyoto TreeDB 345,423 entries/sec (baseline)
SQLite3 111,161 entries/sec (1.99x baseline)
MDB 2,493,766 entries/sec (22.0x baseline)
BerkeleyDB 182,216 entries/sec (2.01x baseline)

Random Writes

LevelDB 469,263 entries/sec (1.29x baseline)
Kyoto TreeDB 106,135 entries/sec (baseline)
SQLite3 49,803 entries/sec (1.34x baseline)
MDB 155,521 entries/sec (1.66x baseline)
BerkeleyDB 61,248 entries/sec (1.28x baseline)

Because of the way LevelDB's persistent storage is organized, batches of random writes are not much slower (only a factor of 1.6x) than batches of sequential writes. MDB has a special optimization for sequential writes, which is most effective in batched operation.

F. Synchronous Writes

In the following benchmark, we enable the synchronous writing modes of all of the databases. Since this change significantly slows down the benchmark, we stop after 10,000 writes. Unfortunately the resulting numbers are not directly comparable to the async numbers, since overall database size is also a factor in write performance and the resulting databases here are much smaller than the baseline.
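For concreteness, this is what the synchronous knob looks like in SQLite, the one engine here with a binding in the Python standard library. db_bench_sqlite3 toggles the equivalent setting through the C API; the exact modes it uses are an assumption in this sketch.

```python
# SQLite's durability knob: synchronous = OFF hands writes to the OS and
# keeps going (the async baseline mode); synchronous = FULL fsyncs on
# every commit so a committed write survives a crash or power failure.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sync.db")
con = sqlite3.connect(path)

con.execute("PRAGMA synchronous = OFF")   # fast, unsafe on power loss
# ... fast bulk loading would happen here ...
con.execute("PRAGMA synchronous = FULL")  # durable synchronous commits
level = con.execute("PRAGMA synchronous").fetchone()[0]
print(level)  # 2 == FULL
con.close()
```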

Sequential Writes

LevelDB 372,024 ops/sec (0.661x baseline)
Kyoto TreeDB 6,889 ops/sec (0.0199x baseline)
SQLite3 51,970 ops/sec (0.93x baseline)
MDB 157,332 ops/sec (1.39x baseline)
BerkeleyDB 86,468 ops/sec (0.95x baseline)

Random Writes

LevelDB 349,895 ops/sec (0.96x baseline)
Kyoto TreeDB 7,080 ops/sec (0.067x baseline)
SQLite3 45,851 ops/sec (1.23x baseline)
MDB 147,776 ops/sec (1.57x baseline)
BerkeleyDB 78,296 ops/sec (1.63x baseline)

In both LevelDB and TreeDB the fact that operations are synchronous outweighs the fact that the database is much smaller than the baseline. TreeDB in particular performs extremely poorly in synchronous mode. On random writes, for SQLite3, MDB, and BerkeleyDB the smaller database size completely negates the cost of the synchronous writes.

3. Performance Using More Memory

We increased the overall cache size for each database to 128 MB. For SQLite3, we kept the page size at 1024 bytes but increased the number of cached pages to 131,072 (up from 4096). For TreeDB, we also kept the page size at 1024 bytes but increased the cache size to 128 MB (up from 4 MB). MDB has no cache of its own (it relies on the OS filesystem cache), so its numbers are simply a copy of the baseline. Both MDB and BerkeleyDB use the default system page size (4096 bytes).
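The SQLite3 configuration described above can be sketched with Python's stdlib sqlite3 module; db_bench_sqlite3 presumably issues the equivalent pragmas through the C API, and the file path here is illustrative.

```python
# 1 KB pages with a 131,072-page cache: 131,072 * 1 KB = 128 MB.
# page_size must be set before the database file is first written
# (or the file must be VACUUMed for it to take effect).
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "cache.db")
con = sqlite3.connect(path)
con.execute("PRAGMA page_size = 1024")     # keep 1 KB pages
con.execute("PRAGMA cache_size = 131072")  # pages, not bytes
con.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v BLOB)")
page = con.execute("PRAGMA page_size").fetchone()[0]
cache = con.execute("PRAGMA cache_size").fetchone()[0]
print(page, cache)  # 1024 131072
con.close()
```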

A. Sequential Reads

LevelDB 4,504,505 ops/sec (0.99x baseline)
Kyoto TreeDB 1,282,051 ops/sec (1.50x baseline)
SQLite3 339,328 ops/sec (1.27x baseline)
MDB 14,084,507 ops/sec (baseline)
BerkeleyDB 879,507 ops/sec (1.05x baseline)

B. Random Reads

LevelDB 187,196 ops/sec (1.005x baseline)
Kyoto TreeDB 218,675 ops/sec (2.03x baseline)
SQLite3 101,276 ops/sec (1.22x baseline)
MDB 765,697 ops/sec (baseline)
BerkeleyDB 173,641 ops/sec (1.70x baseline)

C. Sequential Writes

LevelDB 564,653 ops/sec (1.005x baseline)
Kyoto TreeDB 469,043 ops/sec (1.36x baseline)
SQLite3 53,642 ops/sec (0.96x baseline)
MDB 99,771 ops/sec (baseline)
BerkeleyDB 91,819 ops/sec (1.014x baseline)

D. Random Writes

LevelDB 362,450 ops/sec (0.997x baseline)
Kyoto TreeDB 227,324 ops/sec (2.14x baseline)
SQLite3 39,485 ops/sec (1.07x baseline)
MDB 87,040 ops/sec (baseline)
BerkeleyDB 72,643 ops/sec (1.51x baseline)

E. Batch Writes

Sequential Writes

LevelDB 744,602 entries/sec (1.32x non-batched)
Kyoto TreeDB 469,403 entries/sec (non-batched)
SQLite3 105,619 entries/sec (1.97x non-batched)
MDB 1,157,407 entries/sec (11.6x non-batched)
BerkeleyDB 184,502 entries/sec (2.01x non-batched)

Random Writes

LevelDB 478,240 entries/sec (1.32x non-batched)
Kyoto TreeDB 227,324 entries/sec (non-batched)
SQLite3 55,157 entries/sec (1.40x non-batched)
MDB 140,766 entries/sec (1.62x non-batched)
BerkeleyDB 115,969 entries/sec (1.59x non-batched)

F. Synchronous Writes

Sequential Writes

LevelDB 371,195 ops/sec (0.661x baseline)
Kyoto TreeDB 6,886 ops/sec (0.0199x baseline)
SQLite3 51,945 ops/sec (0.93x baseline)
MDB 125,188 ops/sec (1.25x baseline)
BerkeleyDB 86,236 ops/sec (0.95x baseline)

Random Writes

LevelDB 346,741 ops/sec (0.96x baseline)
Kyoto TreeDB 7,070 ops/sec (0.067x baseline)
SQLite3 45,880 ops/sec (1.23x baseline)
MDB 119,918 ops/sec (1.38x baseline)
BerkeleyDB 77,894 ops/sec (1.63x baseline)

4. Performance Using Large Values

For this benchmark, we use 100,000 byte values. To keep the benchmark running time reasonable, we stop after writing 1000 values. Otherwise, all of the same tests as for the Baseline are run.

A. Sequential Reads

LevelDB 194,628 ops/sec
Kyoto TreeDB 18,536 ops/sec
SQLite3 7,476 ops/sec
MDB 33,333,333 ops/sec
BerkeleyDB 9,174 ops/sec

B. Random Reads

LevelDB 17,115 ops/sec
Kyoto TreeDB 17,207 ops/sec
SQLite3 7,690 ops/sec
MDB 2,012,072 ops/sec
BerkeleyDB 9,347 ops/sec

MDB's single-level-store architecture clearly outclasses all of the other designs; the others barely register by comparison. MDB's zero-memcpy reads mean its read rate is essentially independent of the size of the data items being fetched; it is affected only by the total number of keys in the database.
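The size-independence of zero-copy reads can be demonstrated in miniature with a plain memory-mapped file. This is an analogy using only the Python standard library, not MDB code: handing out a view into the map costs the same regardless of the value size, while a copying read pays for a memcpy proportional to the value.

```python
# Compare zero-copy "reads" (a memoryview slice: pointer + length only)
# against copying reads (a bytes slice: memcpy of the whole value).
import mmap
import os
import tempfile
import time

path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(os.urandom(100_000))  # one 100,000-byte "value"

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    view = memoryview(mm)

    t0 = time.perf_counter()
    for _ in range(10_000):
        v = view[:100_000]        # zero-copy: no data is moved
    t_view = time.perf_counter() - t0

    t0 = time.perf_counter()
    for _ in range(10_000):
        b = mm[:100_000]          # copying read: 100 KB memcpy each time
    t_copy = time.perf_counter() - t0

    print(f"zero-copy: {t_view:.4f}s  copying: {t_copy:.4f}s")
    v.release()
    view.release()
    mm.close()
```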

C. Sequential Writes

LevelDB 3,422 ops/sec
Kyoto TreeDB 12,415 ops/sec
SQLite3 1,936 ops/sec
MDB 11,758 ops/sec
BerkeleyDB 1,869 ops/sec

D. Random Writes

LevelDB 2,178 ops/sec
Kyoto TreeDB 5,612 ops/sec
SQLite3 1,820 ops/sec
MDB 10,278 ops/sec
BerkeleyDB 1,543 ops/sec

E. Batch Writes

Sequential Writes

LevelDB 2,327 entries/sec
Kyoto TreeDB 12,416 entries/sec
SQLite3 1,908 entries/sec
MDB 6,828 entries/sec
BerkeleyDB 1,901 entries/sec

Random Writes

LevelDB 2,332 entries/sec
Kyoto TreeDB 5,612 entries/sec
SQLite3 1,957 entries/sec
MDB 9,032 entries/sec
BerkeleyDB 1,563 entries/sec

TreeDB has very good performance with large values using asynchronous writes, but much worse performance in synchronous mode. Batch mode appears to have no benefit with large values; the work of writing the values cancels out the efficiency gained from batching. MDB has additional features for handling large values, but the current benchmark code does not exercise them.

F. Synchronous Writes

Sequential Writes

LevelDB 1,090 ops/sec
Kyoto TreeDB 3,115 ops/sec
SQLite3 1,886 ops/sec
MDB 9,747 ops/sec
BerkeleyDB 2,167 ops/sec

Random Writes

LevelDB 1,064 ops/sec
Kyoto TreeDB 3,247 ops/sec
SQLite3 2,137 ops/sec
MDB 10,001 ops/sec
BerkeleyDB 1,882 ops/sec

5. Performance On SSD

The same tests as in Section 2 are performed again, this time using the Samsung SSD with reiserfs. This drive has been in regular use over the past several years and was not reformatted for the tests. It has very poor random write speed as a result.

A. Sequential Reads

LevelDB 4,366,812 ops/sec
Kyoto TreeDB 851,789 ops/sec
SQLite3 274,650 ops/sec
MDB 14,925,373 ops/sec
BerkeleyDB 804,505 ops/sec

B. Random Reads

LevelDB 154,321 ops/sec
Kyoto TreeDB 105,641 ops/sec
SQLite3 82,905 ops/sec
MDB 772,797 ops/sec
BerkeleyDB 103,875 ops/sec

Read performance is essentially the same as for tmpfs since all of the data is present in the filesystem cache.

C. Sequential Writes

LevelDB 414,079 ops/sec
Kyoto TreeDB 342,700 ops/sec
SQLite3 51,464 ops/sec
MDB 93,231 ops/sec
BerkeleyDB 52,048 ops/sec

D. Random Writes

LevelDB 150,399 ops/sec
Kyoto TreeDB 103,928 ops/sec
SQLite3 32,186 ops/sec
MDB 77,851 ops/sec
BerkeleyDB 15,959 ops/sec

Most of the databases perform at close to their tmpfs speeds, which is expected since these are asynchronous writes. However, BerkeleyDB shows a large reduction in throughput.

E. Batch Writes

Sequential Writes

LevelDB 509,165 entries/sec (1.23x non-batched)
Kyoto TreeDB 342,700 entries/sec (non-batched)
SQLite3 101,010 entries/sec (1.96x non-batched)
MDB 953,289 entries/sec (10.2x non-batched)
BerkeleyDB 79,618 entries/sec (1.52x non-batched)

Random Writes

LevelDB 202,799 entries/sec (1.35x non-batched)
Kyoto TreeDB 103,928 entries/sec (non-batched)
SQLite3 41,530 entries/sec (1.29x non-batched)
MDB 119,976 entries/sec (1.54x non-batched)
BerkeleyDB 15,261 entries/sec (0.96x non-batched)

F. Synchronous Writes

Here the difference between SSD and tmpfs is made obvious.

Sequential Writes

LevelDB 461 ops/sec (0.0011x asynch)
Kyoto TreeDB 60 ops/sec (0.0001x asynch)
SQLite3 357 ops/sec (0.0069x asynch)
MDB 198 ops/sec (0.0021x asynch)
BerkeleyDB 417 ops/sec (0.0080x asynch)

Random Writes

LevelDB 460 ops/sec (0.0031x asynch)
Kyoto TreeDB 67 ops/sec (0.0006x asynch)
SQLite3 361 ops/sec (0.0112x asynch)
MDB 194 ops/sec (0.0025x asynch)
BerkeleyDB 391 ops/sec (0.0245x asynch)

The slowness of the SSD overshadows any difference between sequential and random write performance here.

6. Performance Using More Memory

We increased the overall cache size for each database to 128 MB, as in Section 3. The "baseline" in these tests refers to the values from Section 5.

A. Sequential Reads

LevelDB 4,566,210 ops/sec (1.05x baseline)
Kyoto TreeDB 1,298,701 ops/sec (1.52x baseline)
SQLite3 343,289 ops/sec (1.25x baseline)
MDB 14,925,373 ops/sec (baseline)
BerkeleyDB 887,311 ops/sec (1.10x baseline)

B. Random Reads

LevelDB 154,655 ops/sec (1.002x baseline)
Kyoto TreeDB 219,154 ops/sec (2.07x baseline)
SQLite3 98,931 ops/sec (1.19x baseline)
MDB 772,798 ops/sec (baseline)
BerkeleyDB 171,644 ops/sec (1.65x baseline)

C. Sequential Writes

LevelDB 417,885 ops/sec (1.009x baseline)
Kyoto TreeDB 462,321 ops/sec (1.35x baseline)
SQLite3 51,575 ops/sec (1.002x baseline)
MDB 93,231 ops/sec (baseline)
BerkeleyDB 59,481 ops/sec (1.14x baseline)

D. Random Writes

LevelDB 150,466 ops/sec (1.000x baseline)
Kyoto TreeDB 225,073 ops/sec (2.17x baseline)
SQLite3 35,030 ops/sec (1.09x baseline)
MDB 77,851 ops/sec (baseline)
BerkeleyDB 53,502 ops/sec (3.35x baseline)

E. Batch Writes

Sequential Writes

LevelDB 505,051 entries/sec (1.21x non-batched)
Kyoto TreeDB 462,321 entries/sec (non-batched)
SQLite3 100,634 entries/sec (1.95x non-batched)
MDB 953,289 entries/sec (10.2x non-batched)
BerkeleyDB 97,714 entries/sec (1.64x non-batched)

Random Writes

LevelDB 205,212 entries/sec (1.36x non-batched)
Kyoto TreeDB 225,073 entries/sec (non-batched)
SQLite3 46,637 entries/sec (1.33x non-batched)
MDB 119,976 entries/sec (1.54x non-batched)
BerkeleyDB 77,119 entries/sec (1.44x non-batched)

F. Synchronous Writes

Sequential Writes

LevelDB 467 ops/sec (0.0011x asynch)
Kyoto TreeDB 61 ops/sec (0.0001x asynch)
SQLite3 369 ops/sec (0.0072x asynch)
MDB 199 ops/sec (0.0021x asynch)
BerkeleyDB 406 ops/sec (0.0068x asynch)

Random Writes

LevelDB 466 ops/sec (0.0031x asynch)
Kyoto TreeDB 70 ops/sec (0.0003x asynch)
SQLite3 366 ops/sec (0.0104x asynch)
MDB 194 ops/sec (0.0025x asynch)
BerkeleyDB 379 ops/sec (0.0071x asynch)

7. Performance Using Large Values

This is the same as the test in Section 4, using the SSD.

A. Sequential Reads

LevelDB 149,992 ops/sec
Kyoto TreeDB 18,776 ops/sec
SQLite3 7,845 ops/sec
MDB 32,258,064 ops/sec
BerkeleyDB 9,414 ops/sec

B. Random Reads

LevelDB 21,607 ops/sec
Kyoto TreeDB 17,390 ops/sec
SQLite3 8,033 ops/sec
MDB 1,976,285 ops/sec
BerkeleyDB 5,653 ops/sec

The read results are about the same as for tmpfs.

C. Sequential Writes

LevelDB 712 ops/sec
Kyoto TreeDB 12,425 ops/sec
SQLite3 1,184 ops/sec
MDB 4,403 ops/sec
BerkeleyDB 190 ops/sec

D. Random Writes

LevelDB 405 ops/sec
Kyoto TreeDB 5,089 ops/sec
SQLite3 1,311 ops/sec
MDB 4,165 ops/sec
BerkeleyDB 247 ops/sec

E. Batch Writes

Sequential Writes

LevelDB 2,194 entries/sec
Kyoto TreeDB 12,425 entries/sec
SQLite3 694 entries/sec
MDB 3,391 entries/sec
BerkeleyDB 306 entries/sec

Random Writes

LevelDB 2,184 entries/sec
Kyoto TreeDB 5,089 entries/sec
SQLite3 790 entries/sec
MDB 4,901 entries/sec
BerkeleyDB 291 entries/sec

F. Synchronous Writes

Sequential Writes

LevelDB 106 ops/sec
Kyoto TreeDB 32 ops/sec
SQLite3 92 ops/sec
MDB 91 ops/sec
BerkeleyDB 126 ops/sec

Random Writes

LevelDB 106 ops/sec
Kyoto TreeDB 38 ops/sec
SQLite3 104 ops/sec
MDB 88 ops/sec
BerkeleyDB 114 ops/sec

As before, TreeDB's write performance is good on asynchronous writes. BerkeleyDB's performance degrades the least in synchronous mode.

8. Performance On HDD

The same tests as in Section 2 are performed again, this time using the Western Digital WD20EARX HDD with an EXT3 filesystem. The drive was attached to the laptop's eSATA port, so interface bottlenecks are not an issue. The MDB library used here is a little newer than in the previous tests (git rev 5da67968afb599697d7557c13b65fb961ec408dd), which results in faster sequential write rates, so those numbers are not directly comparable to the earlier ones.

Note that this data does not represent the maximum performance the drive is capable of. For completeness, the tests were repeated on multiple other filesystems, including EXT2, EXT3, EXT4, JFS, XFS, NTFS, ReiserFS, BTRFS, and ZFS. Those results are covered in Section 11.

This drive uses 4KB physical sectors. The drive was partitioned into two 1TB partitions, 4KB-aligned. The first partition was formatted with NTFS; the second partition was reused for each of the other filesystems.

A. Sequential Reads

LevelDB 4,504,504 ops/sec
Kyoto TreeDB 851,789 ops/sec
SQLite3 272,554 ops/sec
MDB 14,705,882 ops/sec
BerkeleyDB 805,152 ops/sec

B. Random Reads

LevelDB 99,010 ops/sec
Kyoto TreeDB 106,315 ops/sec
SQLite3 82,034 ops/sec
MDB 772,200 ops/sec
BerkeleyDB 98,795 ops/sec

Read performance is essentially the same as the previous tests since all of the data is present in the filesystem cache. LevelDB and BerkeleyDB are slightly slower than before.

C. Sequential Writes

LevelDB 205,550 ops/sec
Kyoto TreeDB 344,828 ops/sec
SQLite3 46,164 ops/sec
MDB 78,021 ops/sec
BerkeleyDB 43,977 ops/sec

D. Random Writes

LevelDB 63,259 ops/sec
Kyoto TreeDB 101,194 ops/sec
SQLite3 28,581 ops/sec
MDB 61,335 ops/sec
BerkeleyDB 4,978 ops/sec

Kyoto Cabinet performs close to its tmpfs speed, while the other databases show more of a reduction in throughput. BerkeleyDB slows down the most.

E. Batch Writes

Sequential Writes

LevelDB 213,904 entries/sec (1.04x non-batched)
Kyoto TreeDB 344,828 entries/sec (non-batched)
SQLite3 91,291 entries/sec (1.98x non-batched)
MDB 1,602,564 entries/sec (20.5x non-batched)
BerkeleyDB 56,085 entries/sec (1.27x non-batched)

Random Writes

LevelDB 85,230 entries/sec (1.35x non-batched)
Kyoto TreeDB 101,194 entries/sec (non-batched)
SQLite3 35,791 entries/sec (1.25x non-batched)
MDB 109,866 entries/sec (1.79x non-batched)
BerkeleyDB 4,928 entries/sec (0.99x non-batched)

F. Synchronous Writes

As slow as the SSD was, the HDD results are even slower. Note, however, that further investigation showed these results are nowhere near the maximum performance of the HDD; more details are in Section 11.

Sequential Writes

LevelDB 68 ops/sec (0.0003x asynch)
Kyoto TreeDB 5 ops/sec (0.00001x asynch)
SQLite3 62 ops/sec (0.0013x asynch)
MDB 35 ops/sec (0.0004x asynch)
BerkeleyDB 60 ops/sec (0.0014x asynch)

Random Writes

LevelDB 68 ops/sec (0.0011x asynch)
Kyoto TreeDB 5 ops/sec (0.00005x asynch)
SQLite3 62 ops/sec (0.0222x asynch)
MDB 43 ops/sec (0.0007x asynch)
BerkeleyDB 60 ops/sec (0.0121x asynch)

The slowness of the HDD overshadows any difference between sequential and random write performance here. None of these systems is suitable for real-world use in this configuration, but Kyoto Cabinet is by far the worst. If an application demands full ACID transactions, Kyoto Cabinet should definitely be avoided.

9. Performance Using More Memory

We increased the overall cache size for each database to 128 MB, as in Section 3. The "baseline" in these tests refers to the values from Section 8.

A. Sequential Reads

LevelDB 4,464,286 ops/sec (0.99x baseline)
Kyoto TreeDB 1,236,094 ops/sec (1.45x baseline)
SQLite3 341,880 ops/sec (1.25x baseline)
MDB 14,705,882 ops/sec (baseline)
BerkeleyDB 548,546 ops/sec (0.68x baseline)

B. Random Reads

LevelDB 100,675 ops/sec (1.017x baseline)
Kyoto TreeDB 219,491 ops/sec (2.06x baseline)
SQLite3 98,830 ops/sec (1.20x baseline)
MDB 772,201 ops/sec (baseline)
BerkeleyDB 149,343 ops/sec (1.51x baseline)

C. Sequential Writes

LevelDB 206,228 ops/sec (1.003x baseline)
Kyoto TreeDB 320,616 ops/sec (0.93x baseline)
SQLite3 43,925 ops/sec (0.95x baseline)
MDB 78,021 ops/sec (baseline)
BerkeleyDB 49,993 ops/sec (1.14x baseline)

D. Random Writes

LevelDB 61,931 ops/sec (0.98x baseline)
Kyoto TreeDB 222,816 ops/sec (2.20x baseline)
SQLite3 29,996 ops/sec (1.05x baseline)
MDB 61,335 ops/sec (baseline)
BerkeleyDB 44,256 ops/sec (8.89x baseline)

E. Batch Writes

Sequential Writes

LevelDB 206,271 entries/sec (1.00x non-batched)
Kyoto TreeDB 320,616 entries/sec (non-batched)
SQLite3 91,458 entries/sec (1.98x non-batched)
MDB 1,602,564 entries/sec (20.5x non-batched)
BerkeleyDB 76,476 entries/sec (1.74x non-batched)

Random Writes

LevelDB 85,346 entries/sec (1.35x non-batched)
Kyoto TreeDB 222,816 entries/sec (non-batched)
SQLite3 41,658 entries/sec (1.46x non-batched)
MDB 109,866 entries/sec (1.79x non-batched)
BerkeleyDB 61,958 entries/sec (12.44x non-batched)

F. Synchronous Writes

Sequential Writes

LevelDB 67 ops/sec (0.0003x asynch)
Kyoto TreeDB 5 ops/sec (0.00001x asynch)
SQLite3 61 ops/sec (0.0013x asynch)
MDB 35 ops/sec (0.0004x asynch)
BerkeleyDB 58 ops/sec (0.0013x asynch)

Random Writes

LevelDB 67 ops/sec (0.001x asynch)
Kyoto TreeDB 5 ops/sec (0.00005x asynch)
SQLite3 61 ops/sec (0.0021x asynch)
MDB 43 ops/sec (0.0007x asynch)
BerkeleyDB 59 ops/sec (0.012x asynch)

10. Performance Using Large Values

This is the same as the test in Section 4, using the HDD.

A. Sequential Reads

LevelDB 139,276 ops/sec
Kyoto TreeDB 18,612 ops/sec
SQLite3 7,672 ops/sec
MDB 9,345,794 ops/sec
BerkeleyDB 9,273 ops/sec

B. Random Reads

LevelDB 23,064 ops/sec
Kyoto TreeDB 17,337 ops/sec
SQLite3 7,870 ops/sec
MDB 1,436,782 ops/sec
BerkeleyDB 4,423 ops/sec

Again, the read results are about the same as for tmpfs.

C. Sequential Writes

LevelDB 279 ops/sec
Kyoto TreeDB 4,861 ops/sec
SQLite3 1,343 ops/sec
MDB 5,643 ops/sec
BerkeleyDB 191 ops/sec

D. Random Writes

LevelDB 149 ops/sec
Kyoto TreeDB 5,278 ops/sec
SQLite3 1,376 ops/sec
MDB 5,237 ops/sec
BerkeleyDB 152 ops/sec

E. Batch Writes

Sequential Writes

LevelDB 2,174 entries/sec
Kyoto TreeDB 4,861 entries/sec
SQLite3 1,007 entries/sec
MDB 4,069 entries/sec
BerkeleyDB 187 entries/sec

Random Writes

LevelDB 2,166 entries/sec
Kyoto TreeDB 5,279 entries/sec
SQLite3 1,108 entries/sec
MDB 5,734 entries/sec
BerkeleyDB 142 entries/sec

F. Synchronous Writes

Sequential Writes

LevelDB 20 ops/sec
Kyoto TreeDB 3 ops/sec
SQLite3 18 ops/sec
MDB 15 ops/sec
BerkeleyDB 20 ops/sec

Random Writes

LevelDB 17 ops/sec
Kyoto TreeDB 4 ops/sec
SQLite3 18 ops/sec
MDB 15 ops/sec
BerkeleyDB 18 ops/sec

The slowness of the HDD makes most of the database implementations perform about the same. As before, Kyoto Cabinet is much slower than the rest.

11. Performance Using Different Filesystems

The baseline test was repeated on the same HDD, but using a different filesystem each time. The filesystems tested are btrfs, ext2, ext3, ext4, jfs, ntfs, reiserfs, xfs, and zfs. In addition, the journaling filesystems that support using an external journal were retested with their journal stored on a tmpfs file. These were ext3, ext4, jfs, reiserfs, and xfs. Testing in this second configuration shows how much overhead the filesystem's journaling mechanism imposes, and how much performance is lost by using the default internal journal configuration.

Note: storing the journal on tmpfs was just for the purposes of this test. In a real deployment you would need to store the journal on an actual storage device, like a separate disk, otherwise the filesystem would be lost after a reboot.

The filesystems are created fresh for each test. The tests are only run once each due to the great length of time needed to collect all of the data. (It takes several minutes just to run mkfs for some of these filesystems.) The full results are not presented in HTML here; you will have to download the Spreadsheet to view the results.

You can display the results for a specific benchmark operation across all the filesystem types using the selector in cell B23 of the sheet. Likewise, you can display the results for a specific filesystem across all the benchmark operations using the selector in cell B1, but because the results are so totally dominated by MDB read performance, this view isn't quite as informative.

To summarize: jfs with an external journal is the fastest for synchronous writes. If your workload demands fully synchronous transactions, it is clearly the best choice. Otherwise, the original ext2 filesystem is fastest for asynchronous writes.

The raw data for all of these tests is also available: tmpfs, SSD, and HDD. The results are also tabulated in an OpenOffice spreadsheet for further analysis here. The raw filesystem test results are in out.hdd.tar.gz.