Databases can be far larger than memory or even storage can hold — the largest are stored on clusters of dozens of servers. One way to improve performance with very large databases is to store the database indexes in memory. Indexes are copies of specific ranges of data, which speed searches up.

However, the larger the database, the larger the index, and it’s often the case that the indexes are too large for system memory. One alternative is to store the indexes on very fast NVMe- or PCIe-based SSDs. While not as fast as memory, they are much faster than other types of storage, yielding improved performance.

Addressing Latency in Communications

Database indexes can be thought of as subsets of the whole database, containing the columns or rows that are most often searched for, such as item name, manufacturer name or price. A large retailer may have hundreds of thousands to millions of items, and even subsets of that much data can reach many terabytes of data.

Storing the indexes in RAM speeds up the lookups, but even a cluster of a dozen database servers will have only a few terabytes of available memory. The necessity to spread the database indexes across many nodes in a cluster means even though the indexes are stored in fast memory, it takes time to move data from one node to another, or to search in memory on a different node. Therefore, professionals must take into account the time required to move across the network from node to node.

Because of the latency involved in network communications, overall latency for memory lookups on other nodes in a cluster may be longer than the time to do a lookup from local, fast PCIe-based SSD or NVMe SSD storage, even though RAM is much faster. PCIe-based SSDs or NVMe SSDs have latencies in the 100-200 microsecond range, while DDR4 DRAM has a typical latency of around 15 nanoseconds, or around 10,000 times faster, so how can performance be better with an SSD?

It’s because the network latencies, even with 10 gigabit Ethernet, are around 5-50 microseconds, and 50-125 microseconds for 1G Ethernet. Reading an index on another node in a cluster will require multiple round trips, even hundreds of round trips, depending on packet sizes. Because of this, the overall network latency can easily reach values higher than the time required to search through a local SSD.

Calculating Cluster Size

One of the biggest reasons for large numbers of nodes in a database cluster is to reach enough RAM for caching indexes and other parts of the database. Because of the cost of DRAM, most servers cannot economically be provisioned with more than two or four terabytes of RAM.

The number of CPUs and the overall computational load on the server is generally less than the need for more RAM. This means that a single server with 20-50 terabytes of PCIe-based SSD or NVMe storage could take the place of as many as a couple of dozen database servers with two or four terabytes of RAM each, and at a much lower overall cost.

With this in mind, clusters of two or four SSD-equipped servers would have the redundancy necessary for enterprise databases, and provide overall performance as good as a 20-node cluster, and be easier to manage and maintain as well.

Find the best storage solutions for your business by checking out our award-winning selection of SSDs for the enterprise.