RightScale Blog

Cloud Management Blog

Amazon EC2 - A New Chapter Begins

Tonight Amazon made a milestone release introducing the ability to boot instances from an EBS volume and stop and start instances. In addition, just a few weeks after announcing plans to expand AWS to the Far East, today it has moved west and made a U.S. West Coast cloud available. (Does Amazon need a compass?) For the AWS view on all this see Werner's Blog as well as Jeff Barr's postings. But one thing at a time...

Amazon Introduces US West Coast Cloud

Almost exactly a year after EC2's first geographical expansion, to Europe, today brings the second big step: the West Coast. What is notable about the EC2 architecture is that each of these expansions constitutes a new cloud, or "region" in EC2 speak. This means that in addition to the US-EAST-1 and EU-WEST-1 regions we now have a new US-WEST-1 region. Each region operates autonomously from the others in order to provide failure isolation, which has benefits as well as downsides. A major benefit is obviously the redundancy one can get by operating in more than one region, or by placing DR in a region other than the one used for one's primary service. The downside is that sharing across regions is not as easy as one might imagine. For example, machine images (AMIs) are not shared, so for each image you're using in one region you have to copy and re-register the image in the other, where it gets a different id that you need to keep track of and reference. We didn't plan it this way, but our multi-cloud support turns out to be very helpful in managing operations in multiple EC2 regions. For example, in RightScale you can define ServerTemplates that use different images in different clouds. This means that as you update your ServerTemplate, it automatically works across clouds and thus across EC2 regions.
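To make the bookkeeping concrete, here is a minimal sketch of the kind of per-region lookup you end up maintaining once the same image has a different id in every region. The template name, the image ids, and the `image_for` helper are all made up for illustration; this is not how ServerTemplates are implemented.

```python
# Hypothetical image ids: each region assigns its own id to a copied,
# re-registered image, so one logical image maps to several ids.
IMAGE_IDS = {
    "web-server-v5": {
        "us-east-1": "ami-11111111",
        "eu-west-1": "ami-22222222",
        "us-west-1": "ami-33333333",
    },
}


def image_for(template: str, region: str) -> str:
    """Return the image id registered for `template` in `region`."""
    try:
        return IMAGE_IDS[template][region]
    except KeyError:
        raise LookupError(
            f"no image registered for {template!r} in {region!r}; "
            "copy and re-register it there first"
        ) from None


print(image_for("web-server-v5", "us-west-1"))  # ami-33333333
```

Launch code then asks for an image by logical name and region instead of hard-coding an id, and a missing entry fails loudly rather than launching the wrong image.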

For redundant operations the comparison between the cloud and DIY datacenters is becoming ever more lopsided. Who can really afford to lose the man-hours, the cap-ex, and the time-to-market, and to incur the headaches it takes to set up a datacenter from scratch, even if it's in a traditional colo? And who can afford to go through all that again to set up a second or DR site? The ease with which it is now possible to set up a DR site in the cloud that is a faithful replica of the primary site is really remarkable. Best of all, the second site can be extremely low cost because very little needs to be running there: most of it can be fired up on demand in case something happens. If you already have your own datacenter/colo setup, all hope is not lost. Setting up DR in the cloud is one of the common use cases we see.

Amazon Instances Boot from EBS

The real sea change about to occur in EC2 is booting from EBS. Tonight's release includes a ton of new features which build on the recently introduced ability to publish EBS snapshots. Here's a quick summary:

  • instances can boot from an EBS snapshot instead of a traditional AMI: EC2 creates an EBS volume from the snapshot and makes it the root partition
  • instances can also boot from an EBS volume, which means that a "boot from EBS" instance can effectively be stopped and restarted later by keeping the volume around and launching a fresh instance from the same volume
  • instances can now be stopped and restarted later, which works almost exactly as described in the bullet above except for the fact that the instance id (the i-12345678 number) remains the same
  • almost all attributes of an instance can change while stopped, including the instance size (naturally the availability zone is one thing that can't change)
  • EBS snapshots can be registered and published as images, so now we have "traditional images" as well as "EBS images" (I wonder what AWS will call these)
  • images can specify snapshots and volumes to be automatically mounted at boot, and they can specify EIPs to be attached at boot; the run-instances API call can add to or override these "image defaults"
  • instances can be "locked", which prevents their accidental termination
  • instances can be bundled into images using an API call (with shutdown or optionally without)
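The stop/start rules in the list above can be summed up in a toy state model. This is only a sketch of the behavior as described here, not the EC2 API: the `Instance` class and its method names are invented for illustration, and the instance types and zone are sample values.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Toy model of a boot-from-EBS instance (illustrative, not the EC2 API)."""
    instance_id: str
    instance_type: str
    availability_zone: str  # no method changes this: the zone is fixed
    state: str = "running"
    locked: bool = False    # when True, terminate() is refused

    def stop(self) -> None:
        if self.state != "running":
            raise RuntimeError(f"cannot stop an instance that is {self.state}")
        self.state = "stopped"

    def start(self) -> None:
        if self.state != "stopped":
            raise RuntimeError(f"cannot start an instance that is {self.state}")
        self.state = "running"  # same instance id as before the stop

    def resize(self, new_type: str) -> None:
        # Attributes such as the instance size can change only while stopped.
        if self.state != "stopped":
            raise RuntimeError("stop the instance before changing its size")
        self.instance_type = new_type

    def terminate(self) -> None:
        if self.locked:
            raise PermissionError("instance is locked against termination")
        self.state = "terminated"


server = Instance("i-12345678", "m1.small", "us-west-1a")
server.stop()
server.resize("m1.large")
server.start()
print(server.instance_id, server.instance_type, server.state)
```

The id stays the same across the stop/start cycle, the size changes only while stopped, and a locked instance refuses to terminate.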

That's a long list of features to digest! What's going on here is that AWS is responding to the needs of enterprise customers who have many 'legacy' applications that are not designed to scale out or to play nice with the operations agility enabled by the cloud. It's for the apps that sysadmins spend weeks setting up and then do their utmost not to touch again. Now they can be installed on an EBS root volume and servers can be launched and relaunched as needed without having to touch the config. Basically this enables the old-school way of managing servers to be applied to EC2.

But these new features are also of great benefit to those operating scalable arrays of servers or Web 2.0 websites. It is now much easier to make changes to a clean server image: mount the image as a volume on an extra server, edit the software/config on the image (e.g. using chroot and the native packaging system), and, when happy, create an image from the volume and boot a server. Test it out and fix any problems in the original volume. Repeat until happy. If done correctly, this results in clean images that are not polluted by repeated boots and other operations, which is one goal we've always pursued with the RightImages we publish.

Stopping and starting servers can also make development more cost effective. Developers who use dev and test servers can stop them at the end of the day and start them back up when they next need them. In fact, many servers could be set up to stop by themselves if there has been no activity for a while. (This reminds me that the three longest-running instances visible to RightScale have been running for more than a thousand days, and that the account they run in has seen no activity since then, except for credit card charges I assume. Impressive and scary at the same time!)
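There is no ready-made switch for this, so a self-stopping server needs its own check, for example a periodic job that looks for recent logins. Below is a rough sketch of just the decision logic, assuming hourly billing. The one-hour idle limit, the ten-minute window before the paid hour ends, and the `should_stop` name are assumptions for illustration; collecting the inputs and actually stopping the instance are left out.

```python
from datetime import datetime, timedelta

IDLE_LIMIT = timedelta(hours=1)       # assumed: idle this long before stopping
BILLING_PERIOD = timedelta(hours=1)   # assumed: instance is billed per hour
CHECK_WINDOW = timedelta(minutes=10)  # assumed: only act near the end of a paid hour


def should_stop(launched_at: datetime, last_activity: datetime,
                active_sessions: int, now: datetime) -> bool:
    """Decide whether an idle server should stop itself right now."""
    if active_sessions > 0:
        return False  # someone is logged in
    if now - last_activity < IDLE_LIMIT:
        return False  # used too recently
    # Time already consumed in the current billing period.
    into_period = (now - launched_at) % BILLING_PERIOD
    # The hour is paid for anyway, so wait until it is nearly used up.
    return BILLING_PERIOD - into_period <= CHECK_WINDOW


launched = datetime(2009, 12, 3, 9, 0)
print(should_stop(launched, datetime(2009, 12, 3, 16, 30), 0,
                  datetime(2009, 12, 3, 17, 55)))
```

Waiting for the end of the billing period costs nothing extra and keeps the server available in case someone comes back within the hour.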

Stopping and starting servers can also be abused. For example, it can be used to implement "dumb auto-scaling": simply stop some servers when the load drops and start them back up later. The good thing is that you don't end up with fresh servers on start, so they don't have to self-configure. The bad thing is, well, that you don't end up with fresh servers: they come up believing the world hasn't changed since they were last stopped. I think of this as abuse because it's easy to forget to update one of the stopped servers when making changes to the system, whether these are changes to the software installed on each server or changes to the rest of the system each server needs to communicate with. In other words, the danger of having a zombie come back to life and create mayhem is high. Better to maintain basic hygiene and start with fresh servers.
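If you do go down the stop/start road anyway, one guard against zombies is to record what configuration each server had when it was stopped and to refuse to restart anything that has fallen behind. The sketch below assumes you track a single integer config version per server; the `StoppedServer` type and the `partition_by_freshness` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class StoppedServer:
    name: str
    config_version: int  # software/config version baked in when it was stopped


def partition_by_freshness(
    servers: List[StoppedServer], current_version: int
) -> Tuple[List[StoppedServer], List[StoppedServer]]:
    """Split stopped servers into (safe to start, stale)."""
    fresh = [s for s in servers if s.config_version == current_version]
    stale = [s for s in servers if s.config_version != current_version]
    return fresh, stale


pool = [
    StoppedServer("web-1", 7),
    StoppedServer("web-2", 6),  # stopped before the last change
    StoppedServer("web-3", 7),
]
fresh, stale = partition_by_freshness(pool, current_version=7)
print([s.name for s in fresh])  # ['web-1', 'web-3']
print([s.name for s in stale])  # ['web-2']
```

Stale servers are better replaced with fresh ones than patched up, which is the point of the paragraph above.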

The Cloud Marches On...

It will be interesting to see how EC2 and its user base continue to evolve. With each release Amazon offers more options. That's more ways to do interesting stuff, but also more ways to shoot oneself in the foot and more stuff to 'grok' to get started. Maybe the most important point, though, is that the Boot from EBS features rank very high on the "remove sales objections" scale: not every application is ready for the former EC2 cloud, and not every sysadmin is ready for it either, not by far. I have to admit that all this leaves me with mixed feelings. EC2 used to have a simple and clean model. It required some rethinking, but that was for the better. It was clear how to deploy highly scalable, highly redundant applications with a high degree of automation. Now that there are 10 ways to skin the proverbial cat, it's much harder to stay on track and to leverage automation. Where early customers needed help figuring out how to operate in the world of EC2's disposable servers, today's customers need help just navigating all the options available in EC2 and deciding which to apply to each application or use case.

Support for the new features and the new US-WEST-1 region in RightScale will become available with our next release, currently scheduled to go live just before Christmas. Full support for booting from EBS will take a little longer as it has far-reaching implications. I'm sure that many of our customers will be operating in the new West Coast region and that it may even have some appeal to those in the Far East and South Pacific as "one step closer" to a local presence. As always, we'd love to hear your thoughts on the new features, how you're planning to use them, and how you'd like to see us support them.

Updates:

  • AWS now gives each region a little local character: US-WEST-1 is listed as "N. California", US-EAST-1 as "N. Virginia", and EU-WEST-1 as "Ireland".
  • Nice blog post on some of the mechanics of using Boot from EBS by Shlomo Swidler (but see comment below)
  • Some things you can't do with traditional AMIs: start & stop instance, create image (new way of bundling)
  • Some things you can't do with EBS-based AMIs: DevPay, protect the content of public AMIs (someone can mount the content as a data volume and pull files off it)
  • If you plan to create a public EBS-based AMI, beware of deleted files: don't just "delete" files with sensitive data on the volume, because they can be "undeleted". You have to erase the blocks or, better, not put anything sensitive there in the first place
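As a rough illustration of the difference between deleting a file and erasing its contents, here is a sketch that overwrites a file with zeros before removing it. The function names are made up, and this is no guarantee: journaling or copy-on-write filesystems may keep old blocks elsewhere, and any earlier snapshot of the volume still holds the data. Keeping sensitive data off the volume in the first place remains the safer route.

```python
import os
import tempfile

CHUNK = 1024 * 1024  # overwrite in 1 MiB pieces


def zero_fill(path: str) -> int:
    """Overwrite every byte of `path` with zeros in place; return bytes written."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        remaining = size
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(b"\0" * n)
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # push the zeros to the device, not just the cache
    return size


def scrub_and_delete(path: str) -> None:
    """Zero the file's contents, then remove its directory entry."""
    zero_fill(path)
    os.remove(path)


# Usage on a throwaway file with made-up contents.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"db_password=example")
scrub_and_delete(tmp.name)
print(os.path.exists(tmp.name))  # False
```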

Comments

[...] at the Rightscale blog there were a couple of additional takes on what boot from EBS provides. For example the ability to [...]
Whoops, thanks for correcting my mistake. I assume you'll now post a new entry updated for EBS? ;-)
Just a small correction: While it is true that my blog explains how to roll your own boot-from-EBS solution, I wrote that article almost 5 months before the AWS feature was released. The features offered by Amazon's implementation are much richer, including the ability to boot from a snapshot.
Here is a script which I used to bundle a running S3-based instance to a bootable EBS-based AMI: http://bit.ly/4Vr3Vg
Cool, thanks for leaving a link here! Looks like you should exclude a few more directories, like /tmp. Perhaps take a look at the AWS bundling script, it has a list of what to exclude (I haven't looked at it in a long time). Also, you need a copy of the whole filesystem in /mnt/tmp, which might not fit, so that's something to beware of. But it's hard to get all the excludes in at once...
There is no API for that and from the outside it's really impossible to tell whether something is going on on a server. I meant it's something that could be added, like an hourly cron job that checks whether anyone is logged in or has logged in in the last hour. You'd only want to check just before a billing hour is up, hence hourly.
Never mind. I think you were most likely referring to Auto-Scaling API from Amazon.
Posted by Ravikanth (not verified)   Ι   April 09, 2010   Ι   03:35 PM
well-written blog. Thanks for the post. I had a quick question regarding your comment about "many servers could be set-up to stop by themselves if there has been no activity for a while". Does Amazon provide any APIs to do this?
Posted by Ravikanth (not verified)   Ι   April 09, 2010   Ι   03:25 PM
[...] the new boot features visit the Amazon EC2 detail page and the posting on the AWS developer blog. RightScale’s perspective is also worth [...]
The update actually refers to EU-based instances. That is, datacenters in Europe...Finally Amazon has realized that going regional and focusing on the special needs of that location than global is the best shot in recession.
This is a great post, am not gonna lie it is a little above my knowledge base of servers. I take it though that amazon have taken this action in preparation for the expansion of the e book store we've seen with the release of the smaller version of the Kindle. (Ben - Kindle Case)
