The Great 2014 Cloud Reboot continues: late Friday, Rackspace announced a scheduled reboot of all Standard, Performance 1, and Performance 2 Next Generation Cloud Servers, beginning at 6:00 AM CDT (UTC-5) on Sunday, September 28, 2014. Maintenance on the U.S. regions is now complete, and Rackspace is rolling through the remaining regions, with completion expected by 9:00 AM CDT (UTC-5) on Tuesday, September 30, 2014. Note that the reboot announcement covers only Next Generation Cloud Servers; it does not mention Rackspace’s traditional hosting infrastructure.
The reason for the reboots is likely the same upcoming Xen security announcement that caused AWS to start a reboot process on Thursday. Neither Google nor Azure relies on Xen, so they may not be impacted by this particular incident. However, other cloud providers that rely on Xen may need to follow suit.
Rackspace has provided detailed information about the sequencing of the reboots on its status page, as well as recommendations on how to prepare for them. The company plans to notify customers via email and its status page at least an hour before maintenance begins in each region, and again as soon as maintenance in each region is complete.
In the case of AWS, RightScale has been able to successfully relaunch instances on already-patched hosts ahead of the maintenance window. In addition, we’ve worked with many of our customers who use AWS to move critical components (such as databases) to unimpacted instance types to avoid reboots. The strategy with Rackspace will be a bit different: Because Rackspace says all Next Generation Cloud Servers are impacted, Rackspace customers may need to move between regions if they want to avoid the reboots.
While cloud naysayers may point to this as a weakness of cloud, companies using Xen in their own data centers will need to make similar patches, albeit on a schedule of their choosing. The lesson to come out of all of this is that, yes, you really do need to follow recommended cloud architectures and build in redundancy.
Cloud users (regardless of cloud provider) who have implemented multi-data center strategies and high-availability architectures will weather the reboots more easily, although some may experience database failovers as each data center goes through the reboot process — a survivable, if painful, process for any database administrator. And yes, automation is your friend when it comes to cloud and even on-premises infrastructure, since it reduces both the manual effort and time to recover for all types of maintenance, downtime, disasters, and outages.
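As a rough illustration of the redundancy point, the Python sketch below scans an inventory for services that run in only one region and would therefore go offline while that region reboots. The inventory, service names, and region labels are hypothetical placeholders, not data from Rackspace, AWS, or RightScale; a real check would pull the server list from your cloud provider's API or management tool.

```python
from collections import defaultdict

# Hypothetical inventory of (service, region) pairs; names are illustrative only.
INVENTORY = [
    ("web", "region-a"),
    ("web", "region-b"),
    ("database", "region-a"),
    ("database", "region-b"),
    ("batch", "region-a"),
]


def regions_by_service(inventory):
    """Map each service name to the set of regions it runs in."""
    mapping = defaultdict(set)
    for service, region in inventory:
        mapping[service].add(region)
    return mapping


def single_region_services(inventory):
    """Return services deployed in exactly one region, sorted by name."""
    mapping = regions_by_service(inventory)
    return sorted(s for s, regions in mapping.items() if len(regions) == 1)


def services_down_during(inventory, rebooting_region):
    """Return services that have no server outside the rebooting region."""
    mapping = regions_by_service(inventory)
    return sorted(
        s for s, regions in mapping.items() if regions <= {rebooting_region}
    )


if __name__ == "__main__":
    print("Single-region services:", single_region_services(INVENTORY))
    print("Down during region-a reboot:", services_down_during(INVENTORY, "region-a"))
```

Running a check like this before a maintenance window gives a short list of the components worth relaunching or replicating elsewhere first.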