RightScale Blog

Cloud Management Blog

Why Amazon's Elastic Block Store Matters

On the technical side, Amazon's EBS service may look like just another great new feature of the Elastic Compute Cloud, but on the business side it enables a whole slew of new customers. I won't pretend that I understand all the new uses, but I can talk about those we see and are supporting.

First, a few words about what EBS is. In short, it's a SAN (storage area network) in the cloud. You can allocate a disk volume of 1GB to 1TB in size from what is now an endless SAN in the cloud and attach it to one of your instances running in EC2. The volume is stored on redundant disks (with some form of RAID) and has a lifetime separate from any instance on which it is mounted, so you can unmount it and later remount it on another instance. You can also take a snapshot backup of a volume to S3, where it is stored with the redundancy and durability of all objects on S3. Moreover, successive snapshots are incremental, providing a powerful and efficient incremental backup capability for volumes.
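
To make "successive snapshots are incremental" concrete, here is a toy model in Python. It is only a sketch of the general idea: the `Snapshot` class, the block-list representation of a volume, and the parent-chain restore are all assumptions made for illustration, and they say nothing about how EBS actually stores snapshots internally.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Snapshot:
    """Hypothetical snapshot: only the blocks that changed since the parent."""
    changed: Dict[int, bytes]
    parent: Optional["Snapshot"] = None


def restore(snapshot: Optional[Snapshot], num_blocks: int) -> List[Optional[bytes]]:
    """Rebuild the full volume by replaying the chain, oldest snapshot first."""
    chain = []
    while snapshot is not None:
        chain.append(snapshot)
        snapshot = snapshot.parent
    blocks: List[Optional[bytes]] = [None] * num_blocks
    for snap in reversed(chain):
        for index, data in snap.changed.items():
            blocks[index] = data
    return blocks


def take_snapshot(volume: List[bytes], parent: Optional[Snapshot]) -> Snapshot:
    """Record only the blocks that differ from what the parent chain restores."""
    previous = restore(parent, len(volume))
    changed = {i: blk for i, blk in enumerate(volume) if blk != previous[i]}
    return Snapshot(changed=changed, parent=parent)
```

In this model the first snapshot of a volume records every block, while a later snapshot taken after a single block is rewritten records just that one block, yet either snapshot can still be restored to a full volume.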

All this and much more is explained in detail in another post, and there's yet more detailed EBS information on our support site. The official EBS announcement is on the EC2 detail page, Werner Vogels provides some background, and Jeff Barr's blog entry has links to many other related announcements.

The RightScale dashboard supports all the features of EBS and offers a number of additional goodies, such as configuring volumes to be attached automatically to servers when they launch and keeping track of the ancestry of a volume or snapshot.

What does EBS enable? In short, traditional processing on large datasets and reliable storage for many servers. Let's look at these areas one by one.

Large Datasets

Amazon Web Services are designed for scale. EC2, S3, SQS, and SDB are ideally suited for building large systems that process huge data volumes. The catch has been that they are geared toward modern service-oriented systems that can use storage accessed via HTTP PUTs and GETs (Amazon S3), can work using a non-relational database like Amazon SDB, and thrive on large numbers of simple servers (EC2). Users who have more traditional applications, such as relational databases, that require large datasets stored in a filesystem with a POSIX interface have had difficulty meeting all their requirements for operating in AWS. While an EC2 X-large instance comes with about 1.4TB of local disk, it is rather difficult to actually use this disk space in a production system. Populating the disk with data at boot time can take hours, and backups, replication, and restoring the data in case of an instance failure are all sore points. For up to 100GB the timescales are all workable, but beyond that it gets difficult.

With EBS, the processing of large datasets contained within a filesystem becomes easily accessible. Volumes can be up to 1TB in size; beyond that it is possible to mount multiple volumes on the same instance such that filesystems of 10TB are practical. The volumes can further be backed up to S3 using snapshots, and they can be replicated by creating new volumes from the snapshots. What is particularly nice is that a volume can be created in any availability zone (think data center) of a region from a snapshot, so copying a large volume across data centers can be offloaded to EBS efficiently.

Many Virtual Appliance Servers

EBS also enables SaaS vendors that use a single-tenant "virtual appliance" model. Many software vendors have approached us with use cases where they would like to run individual servers on behalf of their customers. Often these servers are co-managed between customer and software vendor or have other properties that make the service inappropriate for a multitenant SaaS implementation. In these use cases the end customer is storing important data on these servers and requires a robust data safeguarding architecture, in particular for database storage. While we today have a very effective MySQL replication and backup solution, it is really geared toward multiserver setups and doesn't fit the price and complexity budget of cookie-cutter single-server virtual appliances. For those use cases, EBS brings the desired performance and reliability and drops the complexity and price.

With EBS the canonical reliable single-server virtual appliance can be implemented with the following architecture: an EC2 instance whose type is chosen for the CPU and memory required, an EBS volume sized appropriately for the data set, a revolving set of frequent snapshots providing disaster recovery backups, and periodic application-level export of backups to S3 for archiving and off-cloud backups. In case of a total failure of the EC2 instance and the EBS volume (as might happen with, for example, a data center fire) a new instance and volume can be allocated in another availability zone from the last revolving snapshot.
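
A revolving set of snapshots needs a pruning rule. One rule, described in the comments on this post, is "keep the N most recent completed snapshots": find the Nth most recent 100% completed snapshot, then delete all older ones. The Python sketch below works on plain in-memory records rather than calling any AWS API; the `SnapshotInfo` class, its field names, and the `snapshots_to_delete` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class SnapshotInfo:
    """Hypothetical record describing one snapshot of a volume."""
    snapshot_id: str
    started: datetime
    progress: int  # percent complete, 0 to 100


def snapshots_to_delete(snapshots: List[SnapshotInfo], keep: int) -> List[SnapshotInfo]:
    """Return every snapshot older than the keep-th most recent completed one."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    completed = sorted(
        (s for s in snapshots if s.progress == 100),
        key=lambda s: s.started,
        reverse=True,
    )
    if len(completed) < keep:
        # Not enough completed snapshots yet, so delete nothing.
        return []
    cutoff = completed[keep - 1].started
    # Older failed or incomplete snapshots are pruned as well.
    return [s for s in snapshots if s.started < cutoff]
```

Counting only completed snapshots toward N means that a run of failed or still-pending snapshots never pushes the last good backups out of the set.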

When it's time to upgrade the virtual appliance to a new software version, it becomes relatively easy for the software vendor to spin up a second instance and volume with the upgraded software for important customers, so they can test-drive the new version on their data and train their internal users before committing to the upgrade.

Try It Out for Yourself!

We've been busy integrating support for this new storage system for months so that you can start using it immediately. Our RightScale Dashboard support for EBS is available as part of our free Developer Edition. To learn more about EBS and RightScale's support for it, check out my detailed technical review, read our EBS tutorials at wiki.rightscale.com, register for our upcoming RightScale EBS Webinar, or just drop us a line at sales@rightscale.com.

Comments

[...] posting over on the AWS evangelists blog. Also the folks at Rightscale have two detailed postings: why Amazon EBS matters and Amazon EBS [...]
Yes please, although you might as well just link to it.
Amazon Web Services expansion: Elastic Block Store... Amazon's EC2 machines can now finally have a persistent disk (of course, this was possible before via S3 ElasticDrive, but this ...
[...] processing on large datasets and reliable storage for many servers," Right Scale said in a blog posting [...]
[...] News » News News Why Amazon's Elastic Block Store Matters 2008-08-22 18:34:24 Number be attached to servers when these launch and offers a volume or ... as [...]
[...] Disks from 1 GB to 1 TB are available. Developers have already written a short guide to setting up MySQL on the EBS service, since using databases within the S3 model was previously not optimal for many developers. [...]
Tell me, may I take articles from your blog? With a link to the original source, of course. :)
So, if we want to create a cron script that rolls daily backups, it would have to: 1. Stop MySQL; 2. Freeze the XFS filesystem; 3. Run a local EC2 API tool to take the snapshot; 4. Remove snapshot number 91 (assuming a 90-day roll); 5. Unfreeze the filesystem; 6. Start MySQL; 7. Create a log entry. Anyone want to write one? :)
James: yes, you are correct. We're preparing an API call that uses the RightScale web site as a proxy so you don't need to have the EC2 credentials on the instance itself. That can be useful in certain use-cases. We will also be providing our stuff openly, so you can grab it. Finishing that is a few weeks away, unfortunately as we have too many other things on our plate we need to finish. [Yeah, can I rent competent developers in the cloud too? ;-)] Oh, one more thought, the rule we use for rolling the snaps is "keep the N most recent completed snapshots". So the algorithm is: find the Nth most recent 100% completed snapshot, then delete all older ones. This way you avoid situations where you're keeping the last 10, but they failed or haven't completed yet for whatever reason.
Good point. OK, well, I am ready to try that script out once you have it. You should have my email if you need anyone to beta test it. I was thinking that it would be bad to have the EC2 credentials on the instance itself... Your way seems somewhat better.
This is great stuff. Is there a RightScript available that installs and configures MySQL to run on an EBS volume? Thanks!
Posted by Chris (not verified)   Ι   August 31, 2008   Ι   06:56 AM
Looking forward to more information about this. Thanks for sharing. Eugene
Posted by Eugene (not verified)   Ι   October 20, 2008   Ι   10:50 AM
It's a great new service from AWS, but it's unfortunate that you can only mount a volume to one instance at a time. It seems the one major missing feature is the ability to create a shared volume that can be accessed by a cluster of servers. Creating shared storage across multiple instances still requires using S3.
Posted by Invu (not verified)   Ι   September 18, 2008   Ι   01:45 PM
Response by Aka (not verified)   Ι   January 14, 2012   Ι   10:13 AM
Absolutely right. I was also kinda hoping for a shared storage feature in this new service from Amazon.
I've implemented a very easy to use revolving backup script here: http://www.sambastream.com/blogs/dgildeh/12-03-10/implementing-revolving-backups-aws-ec2 Only 3 files, easy to install and run and allows you to automate your entire backup process on EBS. Hope this helps!
David, thanks for linking it here, looks interesting. We use our own product feature-set, which supports snapshots across striped EBS volumes and doesn't require dropping keys onto every instance. But having alternatives is always good!
