RightScale Blog

Cloud Management Blog

RightScale ServerTemplates Explained

One of the distinguishing features of RightScale is that from day one we've focused on automating the configuration management of servers. The reason is simple: you can't get the benefits of the cloud if you have to spend a lot of manual time configuring each launched server. The cloud itself only solves part of the problem: calling EC2's runInstances API is just the beginning of a rather complex process that gets the server into full operation. After all, most launched servers are intended to go into operation/production and not just sit around idle waiting for someone to log in.

After doing a webinar on ServerTemplates, I thought it would be good to go back and write up why the core of RightScale does what it does. It's on the long side, so you may want to skim over the sections that you're already familiar with.

Why Machine Images Don't Work

The standard method of operation in the virtualization world is to work with machine images. This translates to: launch a server with an image that is "close" to what you want, then log in and install the software you need plus make any config changes, then create an image from that server ("bundling" in EC2 parlance). Later on, when you need an instance of that server, you launch the image and hopefully it comes up ready to go. While this process may sound simple, it's actually anything but that and takes quite some time. I went through the bundling treadmill back in 2006 and concluded this wasn't productive. The reasons are simple:

Images are too monolithic. Everything on a server is bundled up in one image file which makes it difficult to manage a collection of images. Change the version of one software package you commonly use and you have to recreate all images that happen to use that package. This quickly gets out of hand.

Images are opaque. From the outside it's hard to tell what's in an image. Even if you fire it up it's not convenient to poke around to figure out what's installed and how it's configured. Try determining the difference between two images: not a pleasant task.

Images are too big. They are unwieldy to work with. Take two images of different versions of your app server. More than 90% of the bits are typically identical (often more than 99%). But finding the interesting ones that differ is like finding a needle in a haystack. This is ridiculous and contributes to making images hard to work with.

Images are too static. You can't fully configure each server. When you launch the tenth app server it needs to know it's number 10 and not nine. When you launch a test app server it needs to know it's in a test deployment (e.g., don't send alert emails to the ops team), yet you want the same image to be used in test as in production, because otherwise, what are you really testing? So you need some dynamic configuration mechanism to "personalize" each server at boot time.

The bottom line is that my experience with images has been very frustrating. I know that tools exist to help manage them, but I'm not convinced this is a productive avenue. This is why late in 2006 I set out to build what we now call ServerTemplates, and I can't imagine going back.

The ServerTemplate Concept

The idea behind ServerTemplates is to boot any server from a small set of very generic images and configure the server dynamically at boot time. We noticed that the Linux package managers are very fast: running yum install httpd or apt-get install apache2 takes just a few seconds, so there is little value in baking such software into an image. There are special cases where every second of boot time counts, but those are rare, and even then it's still possible to create a more specialized image for that purpose.

Simplifying a bit, a ServerTemplate is composed of the following pieces:

  • a group of settings to define the type of server - i386 vs. x64, etc.
  • a reference to a base image that is to be booted
  • a list of scripts and Chef recipes that are to be run at boot time to install and configure all the software
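As a rough illustration of these three pieces, the sketch below models a ServerTemplate as a plain Python data structure. The class, field, script, and image names are all hypothetical and chosen only for this example; they are not RightScale's actual data model or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class BootScript:
    """One script or Chef recipe to run at boot (hypothetical model)."""
    name: str
    kind: str  # "script" or "chef_recipe"


@dataclass
class ServerTemplate:
    """Illustrative stand-in for the three pieces listed above."""
    name: str
    settings: Dict[str, str]   # server type settings, e.g. i386 vs. x64
    base_image: str            # reference to the generic image to boot
    boot_scripts: List[BootScript] = field(default_factory=list)

    def boot_order(self) -> List[str]:
        """Names of the boot-time steps, in the order they are listed."""
        return [script.name for script in self.boot_scripts]


app_server = ServerTemplate(
    name="app-server",
    settings={"arch": "x64"},
    base_image="generic-linux-base",  # placeholder image name
    boot_scripts=[
        BootScript("install_web_server", "script"),
        BootScript("configure_vhost", "chef_recipe"),
        BootScript("deploy_app", "script"),
    ],
)
print(app_server.boot_order())
```

The point of the model is that the image reference is just one small field: nearly everything that makes this an app server lives in the ordered list of boot-time steps.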

The illustration on the right shows the layers that are typically found in a ServerTemplate, starting from the bare virtual machine at the bottom, then the OS and various layers of software packages, and finally the application at the very top. On Linux we prefer to boot bare-bones images that contain just the OS, while on Windows it is often necessary to pre-install some of the larger software packages on the image and use the dynamic ServerTemplate functionality primarily to configure these apps.

What happens at boot time is the following:

  • When launching a server, RightScale passes the server its identity in the launch call using a crypto token that uniquely identifies the server to RightScale. This is important because at any one time many servers may be booting from the same image.
  • When it comes up, the server contacts RightScale with its token to obtain instructions on what to download and run.
  • RightScale also sends the server a set of variables that can be configured on the website. This is the way dynamic information is fed into the config to specify things such as the server's name, test vs. production, names/IP addresses of other servers it needs to contact, and so forth.
  • The server then typically downloads packages from a distribution mirror and runs a set of scripts to install and configure everything.
  • Throughout the process, the server sends audit entries to RightScale, both so that progress can be monitored on the website and to build a persistent audit record.
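The boot sequence above can be sketched as a small in-memory simulation. Everything here is an assumption made for illustration: `FakeControlPlane`, `boot_server`, and the input names `SERVER_NAME` and `ENV` are invented, there is no network traffic, and this is not how RightScale's agent or API is actually implemented.

```python
import secrets
from typing import Callable, Dict, List


class FakeControlPlane:
    """In-memory stand-in for the management service (not a real API)."""

    def __init__(self) -> None:
        self._servers: Dict[str, dict] = {}
        self.audit_log: List[str] = []

    def launch(self, scripts: List[str], inputs: Dict[str, str]) -> str:
        # Issue a unique token so that servers booting from the same
        # image can be told apart.
        token = secrets.token_hex(16)
        self._servers[token] = {"scripts": scripts, "inputs": inputs}
        return token

    def instructions_for(self, token: str) -> dict:
        # The server presents its token to learn what to run and
        # which variables to use.
        return self._servers[token]

    def audit(self, token: str, message: str) -> None:
        self.audit_log.append(f"{token[:8]}: {message}")


def boot_server(token: str, plane: FakeControlPlane,
                runners: Dict[str, Callable[[Dict[str, str]], None]]) -> None:
    """Simulate the server side: fetch instructions, run scripts, audit."""
    plan = plane.instructions_for(token)
    for script in plan["scripts"]:
        plane.audit(token, f"running {script}")
        runners[script](plan["inputs"])
        plane.audit(token, f"completed {script}")


configured: Dict[str, str] = {}


def set_hostname(inputs: Dict[str, str]) -> None:
    configured["hostname"] = inputs["SERVER_NAME"]


def configure_alerts(inputs: Dict[str, str]) -> None:
    # A test deployment should not email the ops team.
    configured["alerts"] = "off" if inputs["ENV"] == "test" else "on"


runners = {"set_hostname": set_hostname, "configure_alerts": configure_alerts}

plane = FakeControlPlane()
token = plane.launch(
    scripts=["set_hostname", "configure_alerts"],
    inputs={"SERVER_NAME": "app-10", "ENV": "test"},
)
boot_server(token, plane, runners)
print(configured)
```

The same scripts launched with `ENV` set to a production value would leave alerts on, which is the "personalization" that a static image cannot provide by itself.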

Using ServerTemplates

ServerTemplates provide a very modular building block approach to managing server configurations. In practice many different constituencies contribute to a ServerTemplate. Vendors, OS distributions, and RightScale provide the lower layers in the form of standard software packages. The sysadmin or operations team often provides standard configurations for fleet-wide software, such as logging, intrusion detection, user account management, network config, etc. Developers provide higher layers, such as app server install and the application code itself. The modular approach makes it easy to integrate all the pieces and especially to manage the update process.

The modularity of ServerTemplates also enables flexible software development and test practices. In our case we use the same building blocks to create a large variety of ServerTemplates that are appropriate for the different stages from development to production:

  • In production we use many specialized servers, so we end up with many ServerTemplates, for load balancers, app servers, API servers, and so on.
  • For staging we start to aggregate functions so we are running fewer servers. This saves money and also simplifies updates a bit. To achieve this, we combine scripts that are in different ServerTemplates in production onto fewer ServerTemplates.
  • For test setups we combine again so we have a number of test systems without having to launch and manage too many servers.
  • Finally, developers often use an "all-in-one" ServerTemplate for their development and testing. This ServerTemplate combines all the building blocks in a single ServerTemplate.

The beauty here is that we can reuse the exact same RightScript and Chef cookbook building blocks that we use in production for the other stages of development. This reduces set-up time and issues where developers test configurations that have little resemblance to what will go into production.
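To make the reuse concrete, here is a minimal sketch of how the same building blocks might be combined into specialized production templates and an all-in-one development template. The block and script names are hypothetical, and `compose` is an illustrative helper, not a RightScale feature.

```python
from typing import Dict, List

# Hypothetical building blocks: each name stands for one script or recipe.
BLOCKS: Dict[str, List[str]] = {
    "base": ["logging", "user_accounts"],
    "load_balancer": ["install_load_balancer"],
    "app_server": ["install_app_runtime", "deploy_app"],
    "database": ["install_db"],
}


def compose(*roles: str) -> List[str]:
    """Build a boot script list from shared blocks, skipping duplicates."""
    scripts: List[str] = []
    for role in ("base",) + roles:
        for script in BLOCKS[role]:
            if script not in scripts:
                scripts.append(script)
    return scripts


# Production: one specialized template per role.
production = {
    role: compose(role)
    for role in ("load_balancer", "app_server", "database")
}

# Development: a single "all-in-one" template built from the same blocks.
all_in_one = compose("load_balancer", "app_server", "database")

print(all_in_one)
```

Because every stage draws from the same `BLOCKS` table, a change to one block reaches development, test, staging, and production templates alike.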

Making ServerTemplates Reliable

In IT management there is often a tension between flexibility and reliability: if everything can change at any moment it's hard to lock down a reliable and reproducible configuration. We discovered this early on and spent a lot of engineering resources to provide a good reliability harness around ServerTemplates to solve the problem. Our solution has a number of aspects:

  • ServerTemplates are version controlled, so you can commit a version and come back to it at any later point. If you want to relaunch a server with last year's version of a ServerTemplate you can. Or perhaps you just want to see a diff of what has changed since.
  • We mirror the Linux distribution mirrors such that a booting server retrieves the packages from a local redundant set of fast servers.
  • We also keep a daily snapshot of the Linux distribution mirrors such that when you relaunch a server with last year's version of a ServerTemplate it can retrieve the software packages as they were at that point in time. This is under user control: you can freeze the repos to any day of your choice.
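Two of these ideas, freezing the repositories to a chosen day and diffing committed versions, can be sketched in a few lines. This is only an illustration under assumed data: the function names, snapshot dates, and script lists are invented, and the real service's storage and diff format are not described in this post.

```python
import difflib
from bisect import bisect_right
from datetime import date
from typing import List


def pick_snapshot(snapshots: List[date], freeze_date: date) -> date:
    """Return the latest daily snapshot taken on or before freeze_date."""
    ordered = sorted(snapshots)
    index = bisect_right(ordered, freeze_date)
    if index == 0:
        raise ValueError("no snapshot exists on or before the freeze date")
    return ordered[index - 1]


def template_diff(old: List[str], new: List[str]) -> List[str]:
    """Lines removed ("- ") or added ("+ ") between two committed versions."""
    return [
        line for line in difflib.ndiff(old, new)
        if line.startswith(("- ", "+ "))
    ]


# Hypothetical snapshot dates; one day is missing on purpose.
snapshots = [date(2009, 1, 1), date(2009, 1, 2), date(2009, 1, 4)]
frozen = pick_snapshot(snapshots, date(2009, 1, 3))

# Hypothetical script lists for two committed versions of a template.
version_1 = ["install_app v1", "deploy_app"]
version_2 = ["install_app v2", "deploy_app"]
changes = template_diff(version_1, version_2)

print(frozen, changes)
```

Pinning both the template version and the repository snapshot is what makes a relaunch reproducible: the scripts are the same, and so are the packages they install.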

RightScale reliably launches thousands of servers using ServerTemplates each day. Keeping this machinery reliable as both RightScale and the underlying clouds we manage evolve at breakneck pace is a top priority for us and involves a significant amount of engineering. One of the tricks we use to stay ahead of the curve is that we use ServerTemplates ourselves: we leverage RightScale to manage RightScale. The benefits in terms of automation, control and reliability have been incredible, and at this point we cannot imagine going back to a pure machine image model of operation.

Summary

ServerTemplates are a leap forward over images: they enable servers to be launched and dynamically configured at run time, with no tweaking of machine images. This is the underpinning of truly automated infrastructure. They also support launching reliable infrastructure: you know what you're launching at all times, and it is completely repeatable. Finally, they are built on reusable components, which saves untold time in creating new or similar configurations, or expanding from test and dev to production.

Comments

Sounds very like the instance roles discussed in my recent "New Roles in the Cloud" talk. One benefit not discussed is that the infrastructure can use the historical behaviour of machines in specific roles to aid in placement decisions, something the (VM size, disk image) info doesn't assist on, not if the same disk image is used in many places.
Steve, you're right, but why not drive placement from top-down? When you configure a deployment with multiple servers, why wouldn't it be better to be able to configure placement directives and pass those down through the cloud API? I sometimes want two machines next to each other for performance, but at other times want two machines "far apart" for failure resiliency. Shouldn't that be specified explicitly top-down?
Placement is tricky. It's one of the big controversies. Some people say "you don't need to know where things are", yet if you are trying to debug why some machines always fail, it's nice to know what the physical machines are. Similarly, when you want to create a cluster, you may want to say "these two close, but not same physical machine or even same PSU". However, there's no easy way to test that the supplier is meeting those requests, which is why I don't like them. No testability implies you can't rely on them.
Steve, I appreciate the intent behind your statement that "there’s no easy way to test that the supplier is meeting those requests, which is why I don’t like them", but is it at all realistic? It seems to me that most things you can't test. Do you really have two cores dedicated to your VM? Or only until the host gets oversubscribed? Do you really have 400 I/O Ops/sec to the disk or only until something fails? Do you really have 250 1Gbps between two machines or only until someone else competes with you? Are these two machines really in failure isolated datacenters or is the provider cheating? You have to trust someone somewhere...
I'm noticing similar movements in some open source projects like hugoduncan/pallet. Are you guys planning to open source your templating code and/or templates? Cheers, -Adrian jclouds
All the code running on the servers is open source. We're using a fork of nanite as well as chef and our repos are open. A good number of our templates are public and there will be more to come.
I'm a big fan of RightScale's server templates. The time invested to set these up pays itself off pretty quickly. One thing I would find helpful is to incorporate base templates and inheritance. We have several templates that don't differ very much and some changes we need to make apply to all our templates while other changes may only need to be applied to one or a few. Hope you consider it for the future. Cheers, Mike
Posted by Mike Dosik (not verified)   Ι   March 23, 2010   Ι   01:38 PM
Mike, thanks for the kind words. Your request is not new. The reason we haven't forged ahead is that things are pretty complicated as is and inheritance adds another level of complexity. As we move to Chef some of the pain can also be resolved in a different manner. In Chef a recipe is really a "run list", i.e. a list of things that need to be configured. This list often consists of primitives, like installing apache or creating a vhost config file from a template. But the list can also include other recipes. This means that you can easily have a "MyStandardConfigs" cookbook in which you have recipes that purely reference other recipes. So you could have a "public_server" recipe that runs the set of recipes that your publicly accessible servers need to have (e.g. to lock them down). You would then include that in your ServerTemplates for public servers. If you squint just a little this is essentially a mix-in, which is probably what you wanted when you asked for inheritance. We've been thinking of providing some UI for such mix-in recipes so you don't have to deal with the source code control and meta-data just to add/change a line in a recipe.
Any idea how the number of templates has grown over the years? RightScale best-practice-based vs. customer/partner (from scratch) created templates?
Posted by Ab (not verified)   Ι   July 06, 2011   Ι   06:58 PM
