Classroom Series - OpenStack Block Storage 101

From its inception as a collaborative mission in 2010 between NASA and Rackspace to build and manage massively scalable, feature-rich clouds with Open Source software, the OpenStack Project has evolved into a community of almost 10,000 people across 87 countries. Today, over 200 vendors are actively participating across the OpenStack ecosystem.

This video provides an introduction to OpenStack and the problems it is solving in cloud computing. We discuss how OpenStack leverages block storage, what it is and isn't, and how it works. We then take a deeper look into the block storage service, Cinder, and describe how it differs from Swift object storage.



Video Transcription

Hey there, my name is John Griffith. I am an open source engineer at SolidFire out in Boulder, Colorado. We’re building a clustered storage device that’s designed specifically for the cloud. Today I wanted to do a quick introduction into OpenStack, give you a 101 as far as what OpenStack is, how it came to be, what problem it’s trying to solve, and maybe a little bit about what it’s not, as well. I’d also like to talk a little bit about the block storage project, specifically, and give you some more information about what that is, what you can do with it, what some of the features are, and a little bit about the design.

Anyway, this is definitely a 101-type course, an introduction. Hopefully this will get you started with enough information that we can continue, move on, and dig deeper afterward. All of this pretty much started with the hypervisor. The hypervisor gave us the ability to run virtual machines all on a single box, really start to utilize our hardware better, and save us from things like multiple boxes under the desk.

If you're like me, you can remember the day when you had four or five workstations sitting under your desk, each one with a different OS, or maybe even all the same OS with different configurations. You had to compile and test your code on each one individually. It was somewhat of a pain and somewhat expensive, as well.

Thanks to the hypervisor, we gained the ability to have one single workstation run four or five different VMs, each with its own configuration. You could also save those configurations and setups, and bring them back up later as a VM, without having a server or workstation sitting around doing nothing.

Over the years, as virtualization technology continued to improve and we continued to make advancements in compute resources, we started to move more and more things into the hypervisor: eCommerce apps, web apps, web servers, DNS servers, and so on.

That was all great, and we continued to scale it, but it started to become more and more difficult to do things like deploy a VM. You may not have had direct access; you would have to go to an IT admin and request that a VM be deployed. Not to mention the difficulty added by trying to deploy more hypervisors. All of these things got more and more complex. Then you started to see things like storage performance degradation and trouble setting up networks and getting everything to talk.

We started to run into real management complexity and to lose some of the efficiencies we'd gained. On one hand, we're utilizing our hardware much, much better; on the other hand, things are getting increasingly complicated and difficult to manage. That's where the whole concept of cloud computing comes in. It's no longer just about a collection of virtual machines. It's now an orchestration or management layer on top of hypervisors and virtualization.

In addition to that, there are some key concepts I look for in terms of what a cloud and cloud computing are. The concept to keep in mind is that it's dynamic: elastic resources on demand. It's not "file a ticket and next week, or hopefully in 72 hours, I'll get a VM." It's "I need it now, I get it now."

There are some key aspects. Like I said, one is on demand. Another is that it's self-service for the end user. This means that, typically, you go to a webpage or an internal network portal, request your resources from there, and set them up yourself. You get them right away, use them, and return them when you're done. Another key thing to look for is the ability to scale dynamically, and by scale, I mean horizontally.

It's not "take down the server and put in more RAM or another drive." It's scaling horizontally: throw in another node and have it consumed and utilized as a resource. In addition to all those key concepts, the other idea is to take more and more IT infrastructure and move it into that utility consumption model. We're talking about taking things like the networking stack and moving them in as resources that you can check out, utilize on demand, and pay for what you use, as opposed to having to put three NICs in a machine.

All of this sounds fantastic, until you actually try to build it, manage it, and get it all working together. That's where OpenStack came in. Back in 2010, NASA and Rackspace were both facing these same sorts of challenges. They got together and created an open source project called OpenStack.

The basic premise of OpenStack is to deliver a platform for you to build and manage massively scalable, super elastic, and feature-rich clouds with those characteristics I talked about in the last slide. That's the key. OpenStack provides a number of data center resources, including compute, storage, and networking. There are also a significant number of other, more specific projects and services starting up. The key is they all use the same building blocks, and they all use the same management-type interface that OpenStack provides.

That sounds great, and it gives you a basic idea of what the marketing slide would say. But what's the real deal here? What's OpenStack really trying to do? On top of giving you the ability to actually build a cloud, which is huge, the biggest thing is that it abstracts out physical and virtualized resources and gives you a single, common management layer to work with them. We do more than just virtual machines. We do bare metal as well. We do storage, networking, and so on.

One of the keys about OpenStack is that it provides an easy-to-use REST API. Again, it's self-service, like we talked about. It's fully dynamic and it's designed to scale. When I say scale, I mean massively: tens of thousands of VMs running across hundreds of nodes. That's going to continue to grow, and we continue to push that limit. I've heard different talks on what the goals are, and I think it's basically: as long as somebody wants to grow, keep growing and fix whatever needs to be fixed to make it do that.
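To make that REST API idea concrete, here is a minimal sketch of building a create-volume request for the Block Storage API. The endpoint, tenant ID, and token are placeholders (in a real cloud you'd get them from Keystone), and the field names follow the v2 API as I understand it, so check your cloud's API version before relying on them.

```python
import json

# Placeholder endpoint and token; a real client obtains these from
# Keystone, OpenStack's identity service.
ENDPOINT = "http://controller:8776/v2/TENANT_ID"
TOKEN = "AUTH_TOKEN"

def build_create_volume_request(size_gb, name, volume_type=None):
    """Build the URL, headers, and JSON body for a create-volume call."""
    body = {"volume": {"size": size_gb, "name": name}}
    if volume_type is not None:
        body["volume"]["volume_type"] = volume_type
    headers = {"X-Auth-Token": TOKEN, "Content-Type": "application/json"}
    return ENDPOINT + "/volumes", headers, json.dumps(body)

url, headers, payload = build_create_volume_request(10, "my-data-volume")
print(url)
print(payload)
```

The point is the self-service workflow: one authenticated POST and the back-end provisions the volume on demand, no ticket required.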

The good thing is that the architecture, as it stands right now, will allow that and will let us continue to do it. On to some of the value propositions associated with OpenStack. One of the biggest things about OpenStack is you have choice. You have the ability to configure things the way you want and use the things that you want.

In some respects, this makes things a little more complex and a little more difficult for people to grasp and figure out how to set up. On the other hand, it's fantastic because it gives you the ability to utilize what you want, and to mix and match those things. Hypervisors are a great example. We have support for VMware, KVM, Hyper-V, Xen, and others. The key is you can pick which one of those you want to use. Not only can you pick which one, you can also have a single OpenStack install that utilizes all of them.

You can have Xen on one compute node. You could have KVM on another, and VMware on another. The idea is all of those are abstracted out into the management layer, so that from an end user's perspective, they just have to request "give me a VM." Then, depending on the characteristics they ask for beyond a basic VM, that may determine which one of those hypervisors it goes to.

The other thing about OpenStack is that it supports the whole Infrastructure-as-a-Service stack, not just the compute side. We're talking storage, networking, security, and so on. We drive significantly greater resource utilization across the entire infrastructure, and we fix some of the scaling issues you have with just compute virtualization or just the hypervisor. Those are some really big things.

The other key thing that a lot of people are focusing on and really want to drive inside of OpenStack is delivering compatibility between private and public clouds. In the public cloud space, one of the big players everybody always thinks of is Amazon. You deploy some things in Amazon and that's great; everything works fine, but you can't bring all of that same stuff back into your in-house cloud quite so easily because of differences in compatibility.

Now, however, if you go to a service provider that utilizes OpenStack and does a public cloud that way, you do have the ability to do things like have overlap between the two. You can have a private cloud running OpenStack and a public cloud that you go to a provider for running OpenStack. You can actually have things cross and meld in between, which is a huge advantage.

The key thing about OpenStack is that you're bringing the whole concept of software-defined infrastructure to life. Really similar to what we talked about with the hypervisor, a lot of the early use cases for OpenStack were test and development. Over time, though, just like we saw with the hypervisor, we started to see more and more enterprise IT, DevOps, continuous integration, and things like that move into OpenStack and into the cloud. We're starting to see more and more things like database as a service, whether that's MySQL or MongoDB or other NoSQL-type databases.

More and more of those things are moving into OpenStack, as well. We're also seeing it utilized heavily for internal web services as well as external web services. eCommerce is a huge consumer of OpenStack; you can look at folks like WebEx utilizing it. There are also situations starting with VDI: people are using OpenStack to deploy VDI (virtual desktop infrastructure) implementations.

Being an open source project, there's always some good information about the community to look at to see how healthy it is and how things are going. OpenStack has a community of almost 10,000 people across 87 countries right now, which is massive. More than 45 companies contributed to the last release of OpenStack. There are more than 40 listed case studies of public companies willing to talk about what they've done with OpenStack, how they've used it, and how it's worked out for them.

There are over 200 vendors involved in the ecosystem. By vendors, I mean companies that actually develop a product: SolidFire, where I work, NetApp, Red Hat, Canonical, and so on. The ecosystem is extremely large and really, really diverse. Let's delve a little into storage specifically. One of the first questions I typically get from people is: what's the difference between Cinder and Swift?

Swift is object storage, while Cinder is traditional block storage. We put together a little slide you can take a look at; it gives you a decoder for the different objectives of the two, in terms of the different use cases and workloads where you might target one over the other. Block storage is definitely the more traditional, high-performance type of workload: high-change content, smaller random reads/writes, bursty I/O, and so on.

Object storage is more "hey, I've got this object," whether it's an image, a file, or a video clip: a full object that I'm actually moving from one place to another. That's the big thing, and it's one of the first questions that comes up, so I wanted to touch on it. I'm not going to go too much into Swift or object storage; I do want to talk a little more about the block storage project.

OpenStack's block storage project is codenamed Cinder. It's architected to provide traditional block-level storage and abstract those block storage resources out. It presents persistent block storage in the form of volumes that can be used and consumed by Nova: a volume can be attached to a compute instance, or it can actually be used as the storage location for a compute instance. When I say instance, an instance in OpenStack is basically a VM; we call it an instance.

Cinder manages the creation, attaching, detaching, and deletion of these volumes across multiple storage back-ends. Really similar to the hypervisor choices we talked about (KVM, Xen, Hyper-V, and so on), we follow the same model inside of the block storage project. You can have different vendors' back-ends: a NetApp back-end, a SolidFire back-end, an IBM back-end, whatever it might be.

You can have those different back-ends if their drivers are included in the project and supported, and you can also have multiples. You can have multiple nodes with multiple back-ends, or even a single node with multiple back-ends attached to it. I'm not going to go into too much detail about that; I'll touch on it a little more later.
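As a rough illustration, multiple back-ends are declared in Cinder's configuration file. This is only a sketch of what a multi-back-end `cinder.conf` might look like: the section names are arbitrary labels, and the driver class paths are illustrative and vary by release, so check your vendor's documentation for the real values.

```ini
[DEFAULT]
# Each name here refers to a configuration section below.
enabled_backends = lvm-1,solidfire-1

[lvm-1]
# Illustrative driver path for the built-in LVM back-end.
volume_driver = cinder.volume.drivers.lvm.LVMISCSIDriver
volume_backend_name = LVM_iSCSI

[solidfire-1]
# Illustrative driver path for a vendor back-end.
volume_driver = cinder.volume.drivers.solidfire.SolidFireDriver
volume_backend_name = SolidFire
```

Each section gets its own volume service, and the scheduler decides which back-end a given volume lands on.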

Then there's also a built-in block storage back-end, implemented using LVM, that comes with OpenStack. As for the basic features you get with the block storage service, with Cinder, obviously we need create and delete of volumes. The nice thing is it's create and delete on demand. You need a volume, you go and create it. The call is sent down to the back-end, and the back-end creates that resource on the fly, dynamically, as you need it, and returns the information you need to connect to it and use it.

Then when you're done with it, you can delete it and have it just go away. We also have the ability to specify custom volume types and add what we call extra specs. An admin can go in and set up types and extra specs for the volume service. These do things like specify, say, a certain QoS level, or that a certain back-end should be targeted for this type, and so on.
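Conceptually, a volume type is just a name mapped to key/value extra specs that back-ends must satisfy. Here is a simplified sketch of that matching; `volume_backend_name` is the standard spec Cinder's scheduler matches against, while the QoS-style key below is a hypothetical vendor-specific spec, not a real Cinder key.

```python
# Volume types an admin might define, mapped to their extra specs.
VOLUME_TYPES = {
    "gold": {"volume_backend_name": "SolidFire", "qos:minIOPS": "1000"},
    "bronze": {"volume_backend_name": "LVM_iSCSI"},
}

def backend_satisfies(type_name, capabilities):
    """Return True if a back-end's reported capabilities match every
    extra spec of the given volume type (simplified exact matching)."""
    specs = VOLUME_TYPES[type_name]
    return all(capabilities.get(key) == value for key, value in specs.items())

# An LVM back-end satisfies "bronze" but not "gold".
print(backend_satisfies("bronze", {"volume_backend_name": "LVM_iSCSI"}))
```

The real scheduler supports richer operators than exact equality, but the idea is the same: the type a user picks steers the request toward back-ends that advertise the right capabilities.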

In addition, we also do things like cloning. You can take an existing volume and actually create a clone of it. You can copy an image out of Glance, which is the image repository where all the images used to create instances are stored. An image is somewhat equivalent to the ISO or CD-ROM file for your Ubuntu VM, for example. You can copy that image down to a volume, or you can take a bootable volume you have and send it back up to Glance as well.

That allows you to do the whole boot-from-volume thing. You can also do point-in-time copies, that is, snapshots, and then turn around and create a volume from those snapshots. Snapshots, for the most part, are, like I said, point-in-time copies. In my opinion, they're used more as a vehicle for a lot of the underlying things the block storage service does. They're a great way to get that quick, instantaneous point-in-time copy of the state of the system, and then do things with it afterwards, like a clone or a new volume, to actually replicate it.

We also recently added things like backup of volumes. You can back up to object storage, including Swift, and we recently added Ceph as well. You can transfer the ownership of a volume. The whole concept in OpenStack is that you have what are called tenants. Tenants are individual users or projects inside of OpenStack; they don't have visibility into each other's resources.

The idea is, if you have a volume for your dev group A and it's got some cool stuff on it, say a certain development environment you want to share with somebody, you can actually transfer ownership of that volume over to one of your other tenants in the cloud and give them access to it: basically, just give them the volume.

In addition, some of the really cool things we introduced in the last release include scheduling. It used to be we just had simple, really basic scheduling. What we have now is the ability to set up scheduling filters. You can, as I mentioned before, use volume types to specify which back-end a volume is deployed on. You can also do more complex things, like capacity filters: you can specify that a requested volume be deployed on the back-end with the most free capacity.

You can do things like volume counts, and so on; there are a number of options. You can also write your own custom filters, which is a huge benefit. In addition to all these things, as with pretty much everything inside of OpenStack, there are per-tenant usage quotas you can set up. You can say this tenant is only allowed to create this many volumes, or only allowed to consume this much storage, or only allowed to create these types of volumes. There are a number of things you can do there.
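The filter-and-weigh idea behind that scheduling can be sketched in a few lines: first filter out back-ends that can't hold the requested volume, then weigh the survivors by free capacity and pick the best. Cinder's real filter scheduler is pluggable and considerably more involved; this only shows the shape of the idea, with made-up back-end names.

```python
def schedule(backends, requested_gb):
    """Pick a back-end name: filter by capacity, then weigh by free space."""
    candidates = [b for b in backends if b["free_gb"] >= requested_gb]
    if not candidates:
        raise RuntimeError("no valid back-end found")
    return max(candidates, key=lambda b: b["free_gb"])["name"]

backends = [
    {"name": "lvm-1", "free_gb": 50},
    {"name": "solidfire-1", "free_gb": 400},
]
print(schedule(backends, 100))  # only solidfire-1 can hold 100 GB
```

A custom filter is just another predicate in that filtering step, which is why writing your own is straightforward.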

Again, there's a lot of configuration, a lot of options, a lot of choices you can set up. Because the whole point of the block storage service is to abstract those resources and treat them as pools, one of the things people ask is: well then, what's the advantage of using, say, a SolidFire device over the LVM back-end?

There are, of course, all the basic things that come with the device, like performance, efficiencies, and power management. On top of that, you can still expose your unique features. You can do this either through the custom volume types and extra specs I talked about, or through a mechanism OpenStack provides called extensions. Extensions can be written and plugged in to do things above and beyond what the API does on its own, to expose some of those extra features.

Again, like I said, different back-ends have different use cases. There are definitely clear choices between them just based on the characteristics of the devices themselves, and there are also features you can expose on top. You get the best of both worlds without things getting too far out of control in terms of that management interface and that API. This slide shows a little depiction of how things tie together: a really high-level architectural view of the block storage service.

You can see up there you have the user. That user is sending a REST command to the service of interest, whether that's Nova, Cinder, or whatever. One thing this doesn't show is that the user interface could be a command line client, it could be just a curl call, or it could be Horizon, the OpenStack dashboard project, which is the web UI.

It could be any one of those; they all work basically the same way. They send the request down to the service. For example, they'll send it to Cinder via the volume API. We have a set API, so the user knows what to expect and what's available. That API will then hand the request off to the scheduler. The scheduler will figure out which back-end service to use, and so on, and actually dispatch the call down to the back-end.

In the case of SolidFire, you can see we basically just plug into the volume manager via a SolidFire driver. That driver then just sends JSON-RPC commands down to the SolidFire cluster to do what we need. I touched earlier on all the different vendors involved in OpenStack. We wanted to put together a little slide here to show how large and diverse this group is. You can see we've broken it up based on their major areas of interest, in terms of where they contribute and play the most. It's a really diverse group.
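The driver model described above can be illustrated with a toy: the volume manager calls a small, common set of operations, and each vendor's driver translates them to its own protocol (JSON-RPC, in SolidFire's case). The class and method names here are illustrative, not Cinder's exact driver interface.

```python
class FakeBackendDriver:
    """Stands in for a vendor driver; stores volumes in a dict instead
    of issuing API calls to a real storage cluster."""

    def __init__(self):
        self.volumes = {}

    def create_volume(self, name, size_gb):
        # A real driver would call out to the array here, e.g. over
        # JSON-RPC or a vendor API, then report the connection info.
        self.volumes[name] = {"size_gb": size_gb, "status": "available"}
        return self.volumes[name]

    def delete_volume(self, name):
        self.volumes.pop(name, None)

driver = FakeBackendDriver()
vol = driver.create_volume("vol-1", 10)
print(vol["status"])
```

Because every back-end sits behind the same interface, the manager and scheduler never need to know which vendor's storage is actually underneath.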

On the compute side alone we've got HP, Dell, IBM, all the players. Everybody's there. The same on storage: SolidFire, EMC, NetApp, HP. The point is you have all of the major vendors, all of the major players, obviously agreeing and believing that OpenStack is a good thing and a good direction forward. They're all contributing and investing. What that also means for you, as an end user or an implementer, is that you're not only getting what, for example, SolidFire thinks is good or the best way to do it. You're getting a collaboration among all the top key players.

That's the whole point of open-sourcing the project to begin with. The idea is to take all of the best ideas and the best people from each of the different vendors and groups, put them all together, and have them all work together to build the best project possible. That's what I really think is happening with OpenStack. The model is working extremely well, and the uptake we have with all of the vendors is what makes it great and makes it what it is.

One of the key things with an open source project, and with OpenStack, is that you need to have some case studies and share them. OpenStack's doing a really good job of that, I think. There are a lot of really big household names coming out as early adopters that are sharing their story: people like Best Buy, PayPal, Comcast, Bloomberg. You look at all of those and get some information about what they're doing, and it brings all this to life. It shows that OpenStack really is succeeding, it really is working, and it's doing what it says it does.

The thing about all these early adopters and case studies is that all of these folks are deploying OpenStack, but they're all configuring and using it differently. They're using all those customization features, which, sure, adds complexity and makes it a little more difficult, but at the same time it enables them to do what they want, and that's key.

It's really cool to see these; you can check them out at the OpenStack.org website, keep an eye on them, and watch the list grow as more people come in and share what they're doing. That was a really quick rundown on the basic concepts and the ideas. SolidFire is committed to OpenStack; we believe in OpenStack. I am a dedicated resource: my actual job at SolidFire is to contribute to OpenStack. The key is not only to contribute for the benefit of our product by making our device fit as well as possible, although of course I think we have a great device and one of the best designs for cloud infrastructure.

We're really focused on the success and advancement of OpenStack in general, for everybody, and I think that's key. That's a little bit more about us and our philosophy towards OpenStack. We're also really interested in partnering with other people that have that same vision. That's why we partner with people like Rackspace and Nebula: folks that have a global community view of OpenStack and advancing OpenStack as a whole.

There are a number of related resources you can check out, not only on the OpenStack side. You can go to the SolidFire webpage, where we have an OpenStack solutions page. You can get more information there about OpenStack and, specifically, what SolidFire's doing with OpenStack, via reference architectures and configuration guides for SolidFire and OpenStack.

You can also get information about basic features and things that are new inside of the block storage project, inside of Cinder. We do screencasts: every month or so we try to put up a new screencast about a new feature, how it works, and how you can use it, with a quick demo. It's a real quick, easy-to-follow type of thing. We've also got some blogs on there, and we do recaps of each summit. Every six months OpenStack has a summit where everybody gets together.

One of the key purposes is that all the engineers, all the contributors, get together and hash out the ideas for the next cycle. We do six-month release cycles. We'll get together and spend a week in some location figuring out what the top features we want to work on are, and brainstorm. It's our chance to get together; for the other six months, we work remotely and connect via IRC and things like that. It's a great opportunity for all of us to get in the same room. It's a pretty cool event.

If you're looking at this and thinking, "hey, I've got some ideas, I want to get involved, how do I contribute to OpenStack?", it's a lot easier than you might think. The first place I tell people to start is the wiki.openstack.org webpage. There's a How to Contribute section, and it'll walk you through some of those things and get you started.

If you just want to play with OpenStack and get started, I don't have it on this slide, but go to devstack.org. You can download and deploy DevStack, which is a full, running OpenStack install and setup. You can run all of that in a VM: deploy a VM on your laptop, run DevStack on it, and have a working OpenStack setup that you can play with to get a feel for it and start experimenting. From there you can dive in and start figuring out how to do things like add different options, and, if you want, play with the code.

If you're looking at this and you still have some questions, or you want to talk or learn more, we're always happy to help. At SolidFire, you can contact me with technical ideas or block-storage-specific things; my email is there, john.griffith@solidfire. If you're interested in partnerships, like I mentioned, we like to partner with other people interested in OpenStack and advancing the OpenStack project; you can get ahold of [McClain Buggel] at solidfire.com. And of course, if you just want to learn more about SolidFire, check out sales@solidfire.com. They'll get you squared away.

Anyway, I hope that was a decent introduction and gave you a quick, high-level view of what OpenStack is and where it's going. Feel free to send me any questions you might have. Also, keep an eye, like I said, on that OpenStack solutions page. We have some webcasts and screencasts coming up, and we'll continue to add things there, so there's more to come. Thanks a lot.

Types: Videos

Categories: OpenStack
