RightScale Blog

Cloud Management Blog
The Skinny on Cloud Lock-In

The topic of cloud lock-in has been getting quite a bit of attention lately, and it definitely needs to be a primary concern for anyone planning to move business-critical applications to the cloud. (And who isn't planning on that these days?) Given all the different layers of cloud computing, the conversation can quickly become more confusing than enlightening. At Cloud Connect a few weeks ago the lock-in discussion bounced from Salesforce.com to Google App Engine and then to Amazon Web Services within a single argument, which just makes no sense. To put it simply, different layers of cloud offerings vary widely when it comes to the dangers of lock-in.

Lock-In Hypothesis

Let me state Thorsten's Lock-in Hypothesis:

The higher the cloud layer you operate in, the greater the lock-in.

[Figure: lock-in increases with the cloud layer]

This means that if you use an application in the cloud, such as an all-in-one CRM package, you have the highest chance of getting locked in. Move one level down to a platform in the cloud and you are somewhat less likely to get locked in. Google App Engine is one example: you can move a simple Python app off that platform fairly easily, but anything of substance that uses its BigTable storage and other services will end up relying on a lot of proprietary technology. This "black box" effect locks you in more than, for example, a platform like Heroku, where apps follow more of a standard Rails code base. When you move down to an infrastructure cloud, such as Amazon Web Services, it becomes even easier to see how you can move your application stack from one provider to another. After all, there's not much distinguishing the Linux box you get in EC2 from the Linux box you get at GoGrid. But even here, lock-in needs to be thought through because the system behavior - from storage persistence to networking details and on and on - is far from identical.

So where does this leave us? I've been talking about lock-in, but what does that really mean? Well, with cloud computing you outsource the operation of compute resources to a cloud vendor who "runs" your application and who "stores" your data. Lock-in occurs with this vendor to the extent it is prohibitively expensive or time-consuming to run your application elsewhere or move your data elsewhere. Whether this "elsewhere" is another vendor or whether it is your own infrastructure is not important: if you can't move, or it costs a lot or takes a long time to do so, you're locked-in. We recently asked our customers and prospects what concerned them most about lock-in. Here are the results:

[Figure: survey results on lock-in concerns]

The Layer Cake

Lock-in can actually occur at many levels in the stack, and that's why the cloud layers differ in their effective lock-in risk. The more code that is controlled "behind the curtain" by the cloud, the more you tend to lose freedom. Conversely, the more that is under your control, the easier it is to replicate it elsewhere and retain freedom. Here are a number of different layers at which you could find yourself locked-in:

  • Application: do you own the application that manages your data or do you need to find/write another one to move?
  • Web services: does your app make use of third-party web services that you would have to find or build alternatives to (e.g. storage, search, billing, accounting, ...)?
  • Development & run-time environment: does your app run in a proprietary run-time environment and/or is it coded in a proprietary development environment? Would you need to retrain programmers and rewrite your app to move to a different cloud?
  • Programming language: does your app make use of a proprietary language, or language version? Would you need to look for new programmers to rewrite your app to move?
  • Data model: is your data stored in a proprietary or hard to reproduce data model or storage system? Can you continue to use the same type of database or data storage organization if you moved or do you need to transform all your data (and the code accessing it)?
  • Data: can you actually bring your data with you and if so, in what form? Can you get everything exported raw, or only certain slices or views?
  • Log files and analytics: do you own your history and/or metrics and can you move it to a new cloud or do you have to start from scratch?
  • Operating system and system software: do your sysadmins control the operating system platform, the versions of libraries and tools so you can move the know-how and operational procedures from one cloud to another?

All these issues become pertinent when you face questions such as: "How can I move my Force.com application or my web site running in Google App Engine to my own data center?" Or "Can I get the click-stream data for my site out of the platform so I can analyze, for example, last year's traffic compared to this year's?" Or "Can I easily move an application between my datacenter and EC2?"

Altitude Increases Lock-In

The value proposition of the higher cloud layers is appealing and I predict more and more movement in that direction. But lock-in is one of the issues that really gives me pause and that has kept me in the past from adopting some of the services that otherwise have looked compelling.

Let me pick on Google App Engine for a minute. Suppose you develop your site on App Engine and you find yourself having to move away for whatever reason. I don't know of a good solution for you at that point. While there are ways to port an app from App Engine to Django, it's not clear this is really an answer if you're running a high-volume production app. It's going to be interesting to see whether we will end up with commercial or perhaps open-source App Engine clones that are "industrial strength" to the point where one can really contemplate moving a big app from one App Engine vendor to another. (Well, first Google App Engine needs to be complete enough to host the types of apps where this is a real concern.)

An example closer to home is Amazon's SimpleDB. I've been interested in SimpleDB since I first heard about it, but we have yet to use it as part of the RightScale service, and the #1 reason is lock-in. For example, we store audit entries for everything that happens with our users' servers and I'd love to get those out of the SQL database they're in. SimpleDB may be a good solution to the problem from a technical point of view, but we don't see how we'd be able to move that data out of Amazon without major headaches. In addition, we need to be able to run all pieces of the RightScale service in other clouds and we'd have to build an alternate storage solution there. By the time we do that we might as well only use this alternate solution and forgo SimpleDB altogether.
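To make the trade-off concrete, here is a minimal sketch of what keeping the storage choice open could look like in code. It is not how RightScale stores audit entries: the `AuditStore` interface, its method names and the entry fields are all hypothetical, and SQLite from the Python standard library merely stands in for "some backend." A SimpleDB-backed class would be one more subclass, which is exactly the extra work described above.

```python
import json
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, Iterator


class AuditStore(ABC):
    """Hypothetical interface; application code depends only on this."""

    @abstractmethod
    def append(self, server_id: str, action: str, timestamp: float) -> None:
        """Record one audit entry."""

    @abstractmethod
    def entries(self) -> Iterator[Dict]:
        """Yield every stored entry as a plain dict."""

    def export_jsonl(self, path: str) -> int:
        """Write all entries as one JSON object per line; return the count."""
        count = 0
        with open(path, "w", encoding="utf-8") as out:
            for entry in self.entries():
                out.write(json.dumps(entry) + "\n")
                count += 1
        return count


class SqliteAuditStore(AuditStore):
    """One possible backend, used here only because it needs no service."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS audit "
            "(server_id TEXT, action TEXT, timestamp REAL)"
        )

    def append(self, server_id: str, action: str, timestamp: float) -> None:
        self._conn.execute(
            "INSERT INTO audit VALUES (?, ?, ?)",
            (server_id, action, timestamp),
        )
        self._conn.commit()

    def entries(self) -> Iterator[Dict]:
        cursor = self._conn.execute(
            "SELECT server_id, action, timestamp FROM audit ORDER BY timestamp"
        )
        for server_id, action, timestamp in cursor:
            yield {"server_id": server_id, "action": action, "timestamp": timestamp}
```

Because `export_jsonl` is written against the interface rather than a backend, any store that can enumerate its entries can also hand them over in a neutral format. That does not remove lock-in, but it keeps the cost of leaving bounded by the cost of writing one more subclass.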

At the level of infrastructure clouds like Amazon EC2 the questions around lock-in are somewhat different but still pertinent. The cloud vendor provides what I like to think of as the "atoms of computing," namely processing, storage, and networking. You get to build your infrastructure using virtual machines (EC2), disk block devices (EBS), hashed storage buckets (S3), security groups, etc. This means that the choices of programming language, development environment, runtime environment, database storage and so forth are all yours and can all at least in principle be duplicated in another cloud, at a traditional hosting provider, or in your own datacenter. Where lock-in starts to creep in is in the system architecture and in the operations infrastructure (automation, scripts, procedures) that your sysadmins put in place to manage everything.

Maintaining Freedom of Choice

One of the principles that I've upheld in the design of the RightScale system from the beginning is transparency. Everything happening on your systems should be visible to you. This not only means that you can find out why something happened and who did it, but also that you can replicate it elsewhere. There's no magic happening behind the curtain to which you're held hostage. I love it when others can do magic for me and save me a lot of time and effort by providing a pre-built platform. But there are solid reasons - both business and technology-related - to demand the ability to look into the "secret sauce." That way, I can be enchanted by the magic but not locked in to the magician. Our users need to be able to enjoy the same capability.

A second principle we follow is to focus as much as possible on standard software, architectures and configurations. This means that our solutions can easily be replicated elsewhere, such as in your own datacenter. This can present more of a challenge when designing for a cloud environment, which is why we provide cloud-ready solutions for various types of scalability, but it also frees you from being tied to a particular cloud.

[Figure: lock-in details]

In the end, there may not exist a zero lock-in option. In fact, certain kinds and degrees of lock-in are probably unavoidable and are actually tolerable. The point is that the lock-in question is an important consideration to take into account when choosing among different cloud computing alternatives, and it's equally important to keep the differences among cloud layers in mind when you decide what you're willing to live with. All clouds are not created equal, and all clouds do not create equal lock-in. The key is to know the implications of your cloud choices.

Comments

[...] The Skinny on Cloud Lock-in « RightScale Blog (tags: cloud cloudcomputing interoperability portability) [...]
Mitch, your point about "priority of concerns" is a good one. But then you're also an early adopter that is very agile. For a lot of larger businesses that are still in the *looking* at the cloud phase it is a real concern. They don't want to invest resources to "learn the cloud" if they are going to face lock-in issues. Many also have corporate requirements for alternate vendors. We've had to do interesting things for more than one customer who had to have a DR site somewhere other than AWS to satisfy internal policies. Plus, all this is a very popular topic at conferences, which is what initially sparked this blog entry.
There is lots to digest here! I'm thinking a lot more about the cloud apps and systems I use. I never even considered the "Lock-in Hypothesis" until now. Wes Fryer just published a post "Cloud-Based Computing" http://tinyurl.com/dg3yva which I think speaks to users like me who love to play with and adopt these technologies. It seems that very high-end users who actually implement these systems on a massive scale have a lot more to consider than the typical user does. It would be great if you could check out our blog and leave some tips on what end-users should consider before adopting cloud solutions.
[...] von Eicken at RightScale makes an interesting and logically sound assessment of cloud lock-in. Thorsten argues that the risk of lock-in increases dramatically in higher layers in the stack: [...]
"If G raises the fees or makes an otherwise unattractive move, tough luck, you're stuck!" Or I just move to "AppEngine" on EC2 (http://waxy.org/2008/04/exclusive_google_app_engine_ported_to_amazons_ec2/ ). Even if that proves your point rather than mine. Regards, tamberg
Posted by tamberg (not verified)   Ι   February 20, 2009   Ι   05:18 AM
Thanks Reuven! I always enjoy peeking at http://www.elasticvapor.com/ too!
Thanks for the kind words! I wish you had copied me, I'd have been flattered! ;-)
Paul, I'm not sure I follow your argument. The point is not about breadth but about interoperability. When you build on infrastructure clouds you use more standard (interoperable) interfaces. That means you're *less* locked-in. I looked at the page you link to and I totally agree with respect to the attraction of platform clouds. Hey, who wouldn't love to just dump their app on someone else and say "you run it, you make it fast, you back it up, go!". The point I tried to make in this post is that the price you pay is lock-in. If G raises the fees or makes an otherwise unattractive move, tough luck, you're stuck! When it comes to the prediction at the end, I actually believe that Amazon is on the real inside track; if you saw where their business is headed, you'd have to agree. Where is G's cloud *business*? But coming back to platform vs. infrastructure, I realize I should write a separate post just on that. I think the future is somewhere in between 'cause, as appealing as the platform model may appear, the reality is that more transparency and control are needed in real life.
"The higher the cloud layer you operate in, the greater the lock-in" Not sure about that. Isn't an OS VM a broader "interface" than a programming language plus an application model, thereby increasing dependency? (http://www.amundsen.com/blog/archives/964 argues that only Google "get's it") Regards, tamberg
Posted by tamberg (not verified)   Ι   February 19, 2009   Ι   10:49 AM
Hi - Thanks for sharing the survey of 'what concerns you most'. Unfortunately the rest of the post doesn't prioritize the issue accordingly (my reading is that it focuses mostly on runtime or app framework issues, and less on the data, which is the largest concern). If your app has read access to all the data it writes, and can re-expose it, then by definition it is not locked in. If as many tools existed to move data to and from S3 as exist for MySQL, this wouldn't be a concern. So maybe it's a matter of time until they get done, and this is just like RDBMSs before ODBC, JDBC and the like emerged. Wait till ORM frameworks evolve to provide cloud versions, too, and apps will be easier to build, test, and port. A more insidious lock-in not mentioned at the app level is that of the authentication authority and user identity system - build your app with GAE and you are locked into gmail AuthN. Also, the emergence of open-source cloud infrastructure is desirable but the analysis is again a bit superficial. If Google open sourced all the GAE and gmail authn infrastructure and... (as an example, the argument applies to AWS and Azure and the like) it would allow others to partially replicate the environment while missing the real asset/secret sauce that makes the cloud option attractive: a high-scalability, high-availability, secure datacenter for storage, memory, CPU and networking! Until an open-source datacenter with servers, cooling, network etc. is maintained (or until a more fantastic peer-to-peer thing using spare change computing shows up), open source cloud infrastructures will only be interesting to developers and enterprises wanting to keep 'their own mini clouds'.
I totally agree on the Lock-in hypothesis. I just did an interview with Computerworld and said the exact same thing ( I didn't copy you I promise). Great post. John johnmwillis.com
Thanks for a great post. Good to see that someone in the field has addressed the 500 lb gorilla in the cloud services room.
Posted by B. Factor (not verified)   Ι   February 19, 2009   Ι   12:51 PM
Hi Thorsten - I'm late to the party, didn't see this post till now. I understand the concern about lock-in but, at the moment, I find it difficult to worry a great deal about lock-in when there are so few viable alternatives to AWS. It's kind of like Maslow's Hierarchy of Needs: lock-in is a concern in the hierarchy but it's close to the top of the pyramid, and at the moment my concerns are at a much more fundamental level within the hierarchy, so I'm not focusing a lot of my attention on that problem. And ultimately, I can get my data out of SimpleDB (and in much less time than it took to put it in there, thanks to my lop-sided connectivity). Mitch
Regarding SimpleDB lock-in, a number of people use our open-source alternative, M/DB, as a backup / local version. M/DB may help to alleviate the lock-in concerns about SimpleDB as it could run in any other cloud-provider's infrastructure. See http://www.mgateway.com/mdb.html for more information.
Decent post - a few comments: In November 2008 I presented on a panel with Mike Culver from Amazon Web Services on this very topic. I was attempting to split fact from fiction. The presentation was very well received and prompted a follow-up article that you can find here: http://www.smartertools.com/blog/archive/2008/11/20/cloud-computing-challenges-benefits-and-the-future.aspx SmarterTools is in BETA right now to bring our SmarterTrack customer service product to the SaaS model. We have long railed against the evils of the "locked-in" model, so we opened the door pretty wide. Our model allows users/subscribers to migrate out of our SaaS "cloud" and onto an installed and "owned" version of the software at any time. Further, we allow all data to be exported in conventional formats. Freedom is a good thing. Be well, Jeffrey J. Hardy
[...] slightly different take on cloud computing and security comes from this article: The Skinny on Cloud Lock-in RightScale Blog [...]
Hi Thorsten, Your article has been very insightful and I completely concur on the fact you've stated about how a SaaS application enables increased automation. We've experienced it with our own customer support tool, HappyFox (http://www.happyfox.com). It helps us categorize our tickets in any given priority and in turn raises our efficiency in getting back to customers on time. Being a hosted SaaS application, it eliminated the need for us to manage our own servers. This in turn has helped us cut down on our costs.
Posted by Cassy (not verified)   Ι   May 23, 2012   Ι   01:40 AM