Measuring Cloud Storage Performance

There are many excellent reasons to use cloud storage, but fast and efficient transfer of large amounts of data isn’t usually listed as a benefit. That’s one of the reasons why people use cloud storage gateways: to speed up cloud storage access. Recently, I realized we’ve never published any details on the performance gains that one should expect when using the CloudArray storage gateway, so I decided to create a simple illustrative test. In this article, I describe the results and explain some cloud storage implementation details that contribute to performance differences.

I came up with a quick test: copy one gigabyte of fully random data to the cloud, broken up into 32768 32k files. The questions are, how long would it take for a user to copy that much data to a CloudArray volume, and how soon before all of that data is safely stored in the cloud?

To set up the test, I used an old, slow laptop to run CloudArray. (That’s what I had handy in my office…) I used just the default basic configuration of a 25G cache to configure a single 100G encrypted volume, attached to Amazon S3, and mapped to a linux client. I formatted the volume with an ext3 file system and used the ‘cp’ command to copy the files. As a basis for comparison, I used an open-source file transfer utility, CyberDuck, to transfer exactly the same files to exactly the same the S3 account.

I’ve broken the CloudArray results down into two events: the data was “usable” when the ‘cp’ operation completes, and “complete” once all of the data had been copied to the cloud. At the usable point, any host attached to the CloudArray could access any of the files, as they were all stored in the local cache. In the background, the CloudArray busily pushed the data to the cloud, and when it was done, I considered the test complete. Of course, for the transfer utility, there’s no such split. The data was either copied to the cloud, or not.

The results of running the test were, well, dramatic:

Image may be NSFW.
Clik here to view. Write performance comparison

For the record, that’s 6:22 (min:sec) for the CloudArray to reach the usable state and another 3:19 to reach the complete state, while it took the file transfer 110:05 to transfer exactly the same data.

What’s the reason for the huge difference? Well, I confess that I did set up this test to highlight one of the advantages of CloudArray; because we’re block-based, we’re not sensitive to the kinds of problems that plague file-based approaches.

The caching accounts for the rapid time to usability, but the more significant part of the equation is the aggregation that CloudArray performs due to the block-level IO. We send out all data in large, cloud-optimized chunks. Cloud storage provider systems store a single 1M object faster than they store a 32k object, so sending them 1024 1M objects is easier for them to handle. That same arithmetic applies to the number of requests, so that we sent 1024 PUT requests as opposed to the 32768 PUTs that CyberDuck (and, indeed, any file-based cloud utility) must send to handle this particular workload.

In other words, regardless of the size or number of files that you store on a CloudArray volume, traffic to the cloud is optimized to give the best performance.

To test read performance, I just copied the same files back from the cloud. Using CloudArray, that once again gave two separate cases: either all of the files were already in local cache, which meant that it was roughly the same as a local file copy, or they weren’t, in which case the data had to be read back from the cloud. Again, the data was automatically read from the cloud in large chunks, giving the best performance. The copy was performed with the ‘cp’ command on the linux host, and compared to CyberDuck transferring the files back.

The results are actually pretty symmetrical with respect to the original write performance:

Image may be NSFW.
Clik here to view. Read performance comparison

Copying all the files from the local CloudArray cache took 6:05, while invalidating the cache and doing the same copy again took 11:42. On the other hand, the utility transfer took 112:24.

None of the numbers that I’ve given are all that useful for comparison with other environments. A large number of factors affect actual performance, e.g. WAN speed, LAN speed, and local disk speed, and I made no attempt to optimize any of them. A faster local disk or SSD, for example, would substantially reduce the time to usability. The importance of these results is the relative performance of CloudArray when compared to the raw transfer, and those results are impressive: 17.3x faster to usability, 11.4x faster to durability, and 9.6x faster to reload (18.4x if the data is in cache).

That’s minutes versus hours.

It’s a quick cup of coffee versus a three-martini lunch.

It’s renewing online versus waiting at the DMV.

It’s time saved.