Can your legacy SAN deliver Quality of Service (QoS)? Is popcorn a vegetable?

Tuesday, February 12, 2013 posted by Dave Wright

Think about it. If corn is a vegetable, why isn't popcorn? Likewise, if storage performance can be guaranteed, why can't any storage architecture do it?

It's a hard truth to face: legacy storage systems are simply not designed to handle the demands of multi-tenant cloud environments. More specifically, the few systems that claim storage Quality of Service (QoS) - or want to claim it on their roadmap - are really just "bolting it on" as an afterthought. And these "bolted on" methods of achieving QoS have unfortunate side effects.

Before we dive in, let's first discuss why you should care about true storage QoS as a cloud service provider. Hosting business-critical applications in the cloud represents a large revenue growth opportunity for cloud service providers, but until storage performance is predictable and guaranteed, you won't be able to reliably win this type of business from your enterprise customers. Is there a solution? Yes: storage QoS architected from the ground up with guaranteed performance in mind.

Let's take a closer look at some of the "bolt-on" methods legacy systems use to deliver something they can market as "QoS."

Prioritization

How it works - Prioritization defines applications simply as "more" or "less" important relative to one another, usually via predefined tiers such as "mission critical," "moderate," and "low."
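
To make this concrete, here is a minimal sketch of the strict-priority scheduling that typically sits behind these features. It's written in Python with made-up tier names; no vendor's actual implementation looks exactly like this.

```python
import heapq
import itertools

# Illustrative tiers only: lower rank is serviced first.
TIER_RANK = {"mission_critical": 0, "moderate": 1, "low": 2}

class PriorityScheduler:
    """Strict-priority IO dispatch: a sketch, not a real array scheduler."""

    def __init__(self):
        self._queue = []
        self._seq = itertools.count()  # FIFO tie-breaker within a tier

    def submit(self, tenant, tier, io_request):
        heapq.heappush(self._queue,
                       (TIER_RANK[tier], next(self._seq), tenant, io_request))

    def dispatch(self):
        # Always service the highest-priority request queued right now.
        # Note what is missing: nothing states how many IOPS "moderate"
        # receives; its performance is purely relative to the other queues.
        if self._queue:
            _, _, tenant, io_request = heapq.heappop(self._queue)
            return tenant, io_request
        return None
```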

Why it doesn't really offer QoS - While prioritization can give some applications higher performance relative to others, it doesn't tell you what performance to expect from any given tier, and it certainly can't guarantee performance, particularly if the problematic "noisy neighbor" sits at the same priority level. There is no way to ensure any one application gets the performance it needs, and no way for a tenant to understand what their priority designation means in relation to the other priorities on the same system. Telling a tenant they are "moderate" means nothing unless they know how moderate compares to the other categories and what system resources are dedicated to that tier. Worse, priority-based QoS can often make a "noisy neighbor" LOUDER: a higher-priority tenant is allowed more resources with which to turn up the volume.

Rate limiting

How it works - Rate limiting attempts to deal with performance requirements by setting a hard limit on an application's rate of IO or bandwidth. Customers that pay for a higher service will get a higher limit.
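
As a rough illustration, here is a minimal token-bucket sketch in Python. Token buckets are a common way to implement IO caps; the logic and names here are assumptions for illustration, not any particular array's code.

```python
import time

class IopsLimiter:
    """Token bucket capping IOPS: a ceiling, never a floor."""

    def __init__(self, iops_limit):
        self.iops_limit = iops_limit       # tokens (IOs) added per second
        self.tokens = float(iops_limit)    # start with a full bucket
        self.last_refill = time.monotonic()

    def _refill(self):
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.tokens = min(self.iops_limit,
                          self.tokens + elapsed * self.iops_limit)
        self.last_refill = now

    def allow_io(self):
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        # Denied requests must queue and retry: this stall is where a hard
        # cap injects latency into a bursty workload, even on an idle array.
        return False
```

Notice that the limiter only ever says "no." Nothing in it reserves capacity, so a tenant paying for a 5,000 IOPS limit has no assurance of ever actually reaching 5,000 IOPS.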

Why it doesn't really offer QoS - Rate limiting can help quiet noisy neighbors, but does so only by "limiting" the amount of performance an application has access to. This one-sided approach does nothing to guarantee that the set performance limit can actually be attained; it is all about protecting the storage system rather than delivering true QoS to the applications. In addition, firm rate limits set on high-performance or bursty applications can inject significant undesired latency.

Dedicated storage

How it works - IT managers attempt to deliver predictable performance by dedicating specific disks or drives to a particular application, isolating it from other applications and their noisy neighbors.
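
The mechanism is little more than a static allocation table. A minimal sketch, with hypothetical application and drive names:

```python
# Each application owns a fixed, isolated set of drives (names invented).
pools = {
    "oltp_db":  {"drives": ["d01", "d02", "d03", "d04"], "degraded": False},
    "exchange": {"drives": ["d05", "d06", "d07", "d08"], "degraded": False},
}

def fail_drive(app, drive):
    # Isolation holds between pools, but within a pool a single failure
    # forces a rebuild whose IO competes directly with the application's.
    pool = pools[app]
    pool["drives"].remove(drive)
    pool["degraded"] = True  # performance now varies until the rebuild completes
```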

Why it doesn't really offer QoS - Dedicating storage to an application goes a long way toward eliminating "noisy neighbors," yet even dedicated infrastructure cannot guarantee a level of performance. A component failure in one of these storage islands can have a massive impact on application performance as system bandwidth and IO are redirected to recovering from the failure. Isolating resources, in other words, is not the same as guaranteeing performance.

Tiered storage

How it works - Multiple tiers of different storage media (SSD, 15K rpm HDD, 7.2K rpm HDD) are combined to deliver different tiers of performance and capacity, and application performance is determined by the type of media the application's data resides on. To optimize performance, predictive algorithms are layered over the system that use historical access information to decide which data is "hot" and belongs on SSD and which is "cold" and can live on HDD.
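
A minimal sketch of this kind of frequency-based placement, with illustrative thresholds and tier names (real predictive algorithms are far more elaborate):

```python
from collections import defaultdict

access_counts = defaultdict(int)           # extent id -> accesses this epoch
tier_of = defaultdict(lambda: "hdd_7k")    # every extent starts cold

HOT_THRESHOLD = 100    # promote to SSD above this many accesses per epoch
WARM_THRESHOLD = 20    # keep on 15K HDD above this

def record_access(extent_id):
    access_counts[extent_id] += 1

def rebalance():
    # Periodically re-place data based on observed history. A noisy
    # neighbor generating heavy IO looks "hot," wins the scarce SSD slots,
    # and pushes everyone else's data down a tier.
    for extent_id, count in access_counts.items():
        if count >= HOT_THRESHOLD:
            tier_of[extent_id] = "ssd"
        elif count >= WARM_THRESHOLD:
            tier_of[extent_id] = "hdd_15k"
        else:
            tier_of[extent_id] = "hdd_7k"
        access_counts[extent_id] = 0       # history resets each epoch
```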

Why it doesn't really offer QoS - Tiering is the worst of all the "bolted on" solutions designed for delivering predictable performance; quite simply, it is unable to deliver any level of storage QoS. Tiering actually amplifies "noisy neighbors" because they appear hot and are promoted to higher performing (and scarcer) SSDs, displacing other volumes to lower performing, cold disks. Performance for every tenant varies wildly as the algorithms move data between media, and no tenant knows what to expect of their IO because they neither control the tiering algorithm nor have any insight into its effect on other tenants. Some tiering solutions try to offer QoS by pinning a particular application's data to a specific tier, but this is essentially dedicated storage (discussed above) at an even higher cost than usual.

Stay tuned to our blog to learn more about storage QoS, and about how a scale-out storage architecture designed from the ground up to deliver and guarantee consistent performance to thousands of volumes simultaneously is the ideal foundation for building performance SLAs in a multi-tenant cloud environment.

 -Adam Carter, Director of Product Management