RightScale Blog

Cloud Management Blog

Cloud API Requirements

Does a standard cloud API make sense? I'm still of the opinion that there is too much diversity out there and that the time is not yet ripe. There are many indispensable features in Amazon EC2 that no one else has implemented at scale, or for which no alternative take exists. The whole EBS feature set is one example. Image sharing and publishing is another. Elastic IPs are a third. Eucalyptus has implementations of all of these, but I can't point to anyone operating them at scale. In any case, Tom White offers the following "Turing complete set of cloud compute services":

1. Instance lifecycle. The lifecycle defines the basic commands to provision, start, stop, and terminate instances. The bare-bones of a compute service.
2. Shared images. While it is possible to bootstrap from plain OS images created by the cloud service provider, the ability to build your own customized image, and crucially to share it with others provides a social element that helps drive adoption of a cloud platform.
3. Instance metadata. This feature allows you to inject small amounts of user-specific metadata to each instance at boot time - e.g. secret keys, or custom boot parameters - which allows another level of customization. This feature works well in conjunction with shared images: the common non-user specific code is baked into the shared image, and the user-specific code is supplied at launch time as metadata.
4. Network controls. Cloud providers need to think about the network environment that the user's instances run in. Offering such services as DNS, firewalling, VPNs (and exposing it via an API) makes it easy for developers to get started quickly without having to build this infrastructure themselves.
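To give a feel for how small this four-part surface is, here is a minimal in-memory sketch in Python. Every name in it (`ToyCloud`, `register_image`, `share_image`, `launch`, `set_state`) is hypothetical and does not correspond to any real provider's API, and network controls are reduced to a per-instance set of open ports.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Image:
    image_id: str
    owner: str
    shared_with: set = field(default_factory=set)  # accounts allowed to launch


@dataclass
class Instance:
    instance_id: str
    image_id: str
    owner: str
    user_data: dict   # 3. metadata injected at launch time
    open_ports: set   # 4. network controls, reduced here to a firewall
    state: str = "running"


class ToyCloud:
    """Hypothetical in-memory model; not any real provider's API."""

    # 1. Instance lifecycle: the legal state transitions.
    TRANSITIONS = {
        "running": {"stopped", "terminated"},
        "stopped": {"running", "terminated"},
        "terminated": set(),
    }

    def __init__(self):
        self.images = {}
        self.instances = {}
        self._ids = count(1)

    def register_image(self, owner):
        image = Image(f"img-{next(self._ids)}", owner)
        self.images[image.image_id] = image
        return image

    def share_image(self, image_id, owner, account):
        # 2. Shared images: only the owner may grant launch permission.
        image = self.images[image_id]
        if image.owner != owner:
            raise PermissionError("only the owner can share an image")
        image.shared_with.add(account)

    def launch(self, account, image_id, user_data=None, open_ports=()):
        image = self.images[image_id]
        if account != image.owner and account not in image.shared_with:
            raise PermissionError("image is not shared with this account")
        instance = Instance(f"i-{next(self._ids)}", image_id, account,
                            dict(user_data or {}), set(open_ports))
        self.instances[instance.instance_id] = instance
        return instance

    def set_state(self, instance_id, new_state):
        instance = self.instances[instance_id]
        if new_state not in self.TRANSITIONS[instance.state]:
            raise ValueError(f"cannot go from {instance.state} to {new_state}")
        instance.state = new_state
```

In this sketch the shared image carries the common code, while `user_data` carries the per-user part, which mirrors the split described in point 3.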

This is a great start and I would wholeheartedly agree that these are requirements, but I don't think they're enough. In order to operate interesting services (or acquire interesting customers) a cloud must offer more than this. I'm not yet sure what the list exactly is, but the following come to mind:

    • Security groups or VLANs: users must be able to control the network boundary around their servers, group servers into tiers, and create private communication structures. I believe there are only two differences between security groups and VLANs: a network interface can be in many security groups but only one VLAN, and VLANs can offer layer-2 multicast (and non-IP protocols) while security groups can't.
    • Private IPs and remappable public IPs: private communication structures go hand in hand with private IP addresses. Of course publicly routable addresses are required as well, and there has to be some way to remap those IPs so that the failure of a server can be masked or a quick fail-over for other purposes can be engineered. I believe that in the end NAT (as used by Amazon's Elastic IPs) is the only scalable choice, but I'm ready to learn new things and there certainly is room for improvement over EIPs.
    • Mountable storage volumes with snapshot backup: we operated for a long time without Amazon EBS, and at the time the "we need no expensive SAN" feeling was great, but after operating databases in EC2 with EBS for several years there's no way I'm going back. I need to be able to mount a storage volume on a server, operate it, take a snapshot backup, and then create a fresh volume from that snapshot on another server. I'm OK with the volume being a remote filesystem as opposed to a block device, and I'm OK with the snapshot being another volume as opposed to tertiary storage (S3 in Amazon's case). Oh, but please don't make it hard to do this across failure zones, so that the two servers above are failure-isolated.
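The three items above can be sketched together as a toy model in Python. This is only an illustration under stated assumptions: `ToyRegion`, `Server`, `Volume`, and `migrate` are hypothetical names, not any provider's API. The model assumes that volumes are tied to one failure zone, that snapshots are stored at region level, and that a public IP is just a NAT-style mapping onto a server.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_ids = count(1)


@dataclass
class Server:
    name: str
    zone: str
    vlan: Optional[str] = None                          # at most one VLAN
    security_groups: set = field(default_factory=set)   # any number of groups


@dataclass
class Volume:
    volume_id: str
    zone: str
    blocks: dict = field(default_factory=dict)
    attached_to: Optional[str] = None


class ToyRegion:
    """Hypothetical model of one region with several failure zones."""

    def __init__(self):
        self.public_ips = {}   # public IP -> server name (NAT-style mapping)
        self.snapshots = {}    # snapshot id -> frozen copy of the blocks

    def remap_ip(self, public_ip, server):
        # Remapping only rewrites the mapping; the server itself is untouched.
        self.public_ips[public_ip] = server.name

    def create_volume(self, zone, snapshot_id=None):
        blocks = dict(self.snapshots[snapshot_id]) if snapshot_id else {}
        return Volume(f"vol-{next(_ids)}", zone, blocks)

    def attach(self, volume, server):
        if volume.zone != server.zone:
            raise ValueError("a volume can only attach within its own zone")
        volume.attached_to = server.name

    def snapshot(self, volume):
        # Snapshots live at region level, so any zone can restore from them.
        snapshot_id = f"snap-{next(_ids)}"
        self.snapshots[snapshot_id] = dict(volume.blocks)
        return snapshot_id


def migrate(region, public_ip, volume, standby):
    """Planned fail-over: copy the data next to `standby`, then move the IP."""
    snapshot_id = region.snapshot(volume)
    replica = region.create_volume(standby.zone, snapshot_id)
    region.attach(replica, standby)
    region.remap_ip(public_ip, standby)
    return replica
```

Note that `migrate` takes its snapshot at fail-over time, which only works for a planned move. Masking an unplanned failure would instead restore from a snapshot taken earlier.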

In my opinion we can't be successful until we can hash out all these features with enough flexibility that providers can differentiate, yet enough uniformity that users (like the RightScale cloud management system and its pre-built ServerTemplates) can write portable systems. I have not heard the discussion reach the level of sophistication needed (and I freely admit that I haven't listened as hard as I could have), and frankly I also feel like we're all still learning new things all the time. On the RightScale end we're in the process of reworking our multi-cloud layer so we can incorporate some of the above feature sets in a standardized manner, so I hope I can make more specific contributions in the near future.

      Comments

      [...] clear illustration of what we are missing out on. I submit to you this post by Adrian Cole and the follow-up (twice) by Thorsten von Eicken. After spending two days at a face to face meeting of the DMTF Cloud [...]
      Ah, I wish I could make it to SF the 23rd... I'm sure you're going to have a good time!
      I couldn't agree more, Mitch. The boot from EBS feature set and Spot instances have started to make a mess of the API. There are so many small inconsistencies now, it's really frustrating. At the same time, while I have seen more principled API proposals, I'm not sure they end up being more usable. One thing I fear is that we're going to see two types of clouds: scalable public clouds and non-scalable private clouds. And I mean that from an API point of view, not a usability point of view. If your scope is a few hundred machines you can hook them all up to a big SAN, and you can have traditional VLANs, and you can allocate any IP to any server, and you can do a lot of things that just don't work when the scope becomes hundreds of thousands of machines. And those differences show up very clearly in the API and in how the resources can be used.
      Great list, Thorsten. The 3 requirements you mentioned are the least supported across cloud providers today, and server-side APIs such as CDMI, OCCI and vCloud could help push these across the board. In the meantime, we can do some work on the client side of cloud abstraction. For example, EBS has a similar counterpart in Azure, and EIP is similar to Terremark's public IP address service. I've volunteered that we discuss this at the cloud hackers dinner this month, and welcome you to join us: http://www.meetup.com/cloudhackers-SF/calendar/12575910/ I've also added these as issues to the jclouds project. http://code.google.com/p/jclouds/issues Thanks for your feedback! -Adrian founder jclouds
      I agree. EC2 has a number of important features that others have yet to provide. Further, I really think it would be good to give a lot of thought about how these features should be provided and what the abstractions are. The EC2 API is functional but it's also huge and at times ugly. That's probably inevitable, given the speed at which they are developing the service but at some point we need to take a deep breath and think this through more thoroughly.
      [...] should provide. These are also well summarized, and in part extended, here. And anyone who happens to be developing an API right now, or plans to do so in the future, should take a look at [...]
      Maarten, I'm not sure I understand what you are asking for. With RightScale you can run operational scripts to make run-time changes in a controlled manner. When using our new Chef integration, you can change the inputs and either reconverge or run a specific recipe.
      One thing that would be really useful in all this is an option for runtime configuration management. I have a gut feeling that it would help if we could boot an instance and add or change runtime parameters based on already running servers (not design-time or launch-time parameters). Might be a good use case for SimpleDB, or ZooKeeper (but then in a private LAN), etc.
      Posted by Maarten Koopmans (not verified)   Ι   March 01, 2010   Ι   03:40 AM
