RightScale Blog

Cloud Management Blog

Cloud Computing vs. Grid Computing

Recently Rich Wolski (UCSB Eucalyptus project) and I were discussing grid computing vs. cloud computing. An observation he made makes a lot of sense to me. Since he doesn't blog, let me repeat here what he said. Grid computing has been used in environments where users make few but large allocation requests. For example, a lab may have a thousand-node cluster and users make allocations for all 1,000, or 500, or 200. Only a few of these allocations can be serviced at a time; others need to be scheduled for when resources are released. This results in sophisticated batch job scheduling algorithms for parallel computations.

Cloud computing is about lots of small allocation requests. Amazon EC2 accounts are limited to 20 servers each by default, and lots and lots of users allocate up to 20 servers out of the pool of many thousands of servers at Amazon. The allocations are real-time, and there is no provision for queueing allocations until someone else releases resources. This is a completely different resource allocation paradigm, a completely different usage pattern, and all this results in a completely different method of using compute resources.
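
To make the contrast concrete, here is a minimal Python sketch of the two allocation styles. It is an illustration only: the class names `GridScheduler` and `CloudPool`, the strict first-in-first-out queue, and the pool sizes are assumptions made for this example, not a description of how any real grid scheduler or Amazon EC2 works internally.

```python
from collections import deque


class GridScheduler:
    """Batch-style allocator (hypothetical): requests that do not fit wait in a queue."""

    def __init__(self, total_nodes):
        self.free = total_nodes
        self.queue = deque()   # pending (job_id, nodes) requests, oldest first
        self.running = {}      # job_id -> nodes currently held

    def request(self, job_id, nodes):
        """Queue the request; return True if it started right away."""
        self.queue.append((job_id, nodes))
        self._dispatch()
        return job_id in self.running

    def release(self, job_id):
        """Free a job's nodes and let queued jobs start."""
        self.free += self.running.pop(job_id)
        self._dispatch()

    def _dispatch(self):
        # Strict FIFO (an assumption): start jobs from the head while they fit.
        while self.queue and self.queue[0][1] <= self.free:
            job_id, nodes = self.queue.popleft()
            self.free -= nodes
            self.running[job_id] = nodes


class CloudPool:
    """Real-time allocator (hypothetical): grant now or refuse, never queue."""

    def __init__(self, total_servers, per_account_limit=20):
        self.free = total_servers
        self.limit = per_account_limit
        self.held = {}         # account -> servers currently held

    def request(self, account, servers):
        current = self.held.get(account, 0)
        if current + servers > self.limit or servers > self.free:
            return False       # no queue: the caller is told "no" immediately
        self.free -= servers
        self.held[account] = current + servers
        return True


grid = GridScheduler(1000)
grid.request("a", 500)          # starts immediately
grid.request("b", 500)          # starts immediately, cluster now full
grid.request("c", 200)          # waits in the queue
grid.release("a")               # "c" starts only now

cloud = CloudPool(5000)
cloud.request("user1", 20)      # granted at once
cloud.request("user1", 1)       # refused: over the per-account limit
```

The grid side needs a queue and a dispatch policy because requests are large relative to the cluster; the cloud side only works if the pool is large enough that a refusal for lack of capacity is rare.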

I always come back to this distinction between cloud and grid computing when people talk about in-house clouds. It's easy to say "ah, we'll just run some cloud management software on a bunch of machines," but it's a completely different matter to uphold the premise of real-time resource availability. If you fail to provide resources when they are needed, the whole paradigm falls apart and users will start hoarding servers, allocating for peak usage instead of current usage.
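
A small worked example of the hoarding effect, in Python. The hourly demand numbers are made up for illustration, and `server_hours` is a hypothetical helper, not part of any cloud API.

```python
def server_hours(demand, hoard=False):
    """Total server-hours held over an hourly demand trace.

    demand: servers needed in each hour (illustrative numbers only).
    hoard=False: hold exactly what each hour needs (real-time allocation is trusted).
    hoard=True:  hold the peak for the whole period (allocation is not trusted).
    """
    if not demand:
        return 0
    if hoard:
        return max(demand) * len(demand)
    return sum(demand)


demand = [2, 2, 3, 10, 4, 2]                 # one spike to 10 servers
on_demand = server_hours(demand)             # 23 server-hours
hoarded = server_hours(demand, hoard=True)   # 60 server-hours
```

With this trace, users who allocate for peak hold 60 server-hours to do 23 server-hours of work, which is the capacity an in-house cloud loses once people stop trusting that resources will be there when needed.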


<strong>The difference between cloud computing and grid computing...</strong> Grid computing: users make a small number of large compute requests. For example, a lab has a 1,000-node cluster and users request...
[...] to that I was reading the Rightscale Blog and it just suddenly hit me, the biggest difference between the two concepts are how they are [...]
Hello. For an additional point of view on this issue, you may want to read the news article from the Enabling Grids for E-sciencE project at http://news.eu-egee.com/index.php?id=193&tx_ttnews%5Btt_news%5D=5&cHash=76da27657b Best regards, Manuel Delfino, Director, PIC, Barcelona, Spain.
I see cloud computing as a natural evolution of concepts in Grid computing. We/I (globus) have been working on mechanisms for lease-based resource provisioning (as opposed to job management with resource provisioning treated as a side-effect) for a long time now: this investigation was motivated by grid computing scenarios and applications and targeted at solving specific problems that Grid users experienced. There is nothing conceptually different between the workspace service developed as a result (which Sergio references above and which I lead) and EC2 although there probably is a difference in the background that motivated them. Whether resources are provided immediately/real-time, as an advance reservation, or on a best-effort preemptible basis is a property of a resource lease you choose to provide (the terms of service on a lease) but not a fundamental difference. Similarly, the size of a lease that you choose to provision is just another difference in terms. In the science clouds (http://workspace.globus.org/clouds/) we run at University of Chicago and Florida we get requests for leases of all shapes and sizes, wide and narrow, longer-term mixing happily with short-term and I would say all of our users right now come from the Grid community. And we can easily migrate them to EC2 precisely because the two services are nearly the same. Having said all that -- while I see cloud computing as an evolution of grid computing I do think it is a *significant* evolution. The focus on resource leases enabled by virtualization (what Thorsten I think refers to as "forking the server") rather than scheduling jobs is a fundamental difference. The notion that you can create a custom environment and map it onto resources in an easily relocatable fashion will refactor the roles in what are now called grid communities. Five, ten years down the road I don't think we will see any distinctions.
I think that this is a reasonable way to distinguish clouds and grids. Also, for the record, note that the 20 instance EC2 limit is just a starting point. We routinely bump up the limit (into the hundreds or even thousands) for established customers upon request.
I've been wondering about this for a while; I identify the following main differences between current Grids and current Clouds so far:
- Grid systems are designed for collaborative sharing of resources belonging to different admin domains, while Clouds at the moment expose the resources of one domain to the outside world.
- Grid systems support the execution of end-user applications as computational activities; a typical computational activity, once accepted by a Grid endpoint, is handled locally by a batch system as a batch job; Clouds are mainly used for the remote deployment of services.
  -- This is an important difference; Grids provide more domain-specific services; Clouds can sit below (the RightGrid can be a typical example of this).
  -- Grids are moving towards the adoption of virtual machine technologies, but the usage pattern will be the same (the submitted job is bound with the execution environment as a VM image).
- Grid systems support large sets of users organized in virtual organizations (credentials are typically enriched with VO-related information); Cloud systems support individual users (to my knowledge).
I would not see the size of allocation as a factor for differentiating them.
Sergio, thanks for the thoughtful comment. You are correct that the grid community has worked a lot on tying resources at multiple institutions together such that users can run computations that span resources across administrative domains. At the current stage such an effort has not even really begun in the cloud space. The players have not really been confronted with this type of request, as far as I know. I don't know that this is a fundamental difference; it seems more like a development stage where cloud vendors haven't started to coordinate how they expose resources and make them compatible with one another. I would disagree that the size of allocation isn't an important factor. As you write, the grid world revolves around batch job scheduling. This is a fundamentally different paradigm from the real-time allocation that clouds offer. The control over your system and the type of systems that can be deployed are vastly different and fundamentally affect the way the applications are written as well as the way they are managed. As I've mentioned many times, the most fundamental principle in cloud computing is being able to bring the next server from boot into full production on auto-pilot. This solves many problems, from repairing failures, to scaling up and down with load, to launching additional deployments for special purposes (staging, demo, test, etc). The notion that when you need additional resources you go and get them is different. It doesn't exist in batch processing. It doesn't even cross your mind. Programmers are used to forking a thread or a process; now they can fork a server.
Dear Thorsten, let me elaborate more on these concepts: Interoperability: yes, interoperability was the main requirement in Grid systems; it may be an indirect market request for cloud systems; clouds were born to enable flexible renting of an IT infrastructure, while Grid was born to share resources among different admin domains in order to solve problems requiring resources exceeding the individual capacity; Size of allocation: when I stated that I do not see this as an important factor, I was referring to the size of allocation in terms of the number of nodes that a single user can request; in Grid systems, the smallest allocation unit is a single job slot (which can be mapped to a single CPU); whether the user asks for 1, 10, 100 or 1000 slots does not change the meaning of using a Grid system; the Grid usage pattern is being able to request a job slot using a virtual identity (typically an X.509 certificate) and then having the Grid system map this request to a real resource in some admin domain and map the virtual identity to a local account; for a user, this allocation is typically best effort within the resources available to the virtual organization (VO) to which the user belongs. Each VO typically signs agreements with admin domains to have a certain amount of guaranteed resources and VO users compete for them. This is not the only allocation scenario. Advance reservation is also a reality, especially in supercomputing centers exposing their systems using Grid middleware. In Grid systems, you may find differentiation in the allocation strategies based on the length of jobs or group ownership of users. Several flavors exist and are appearing. Time-to-run: I agree when you say that clouds are mainly offering a real-time allocation mechanism as opposed to the best effort from Grid systems. Cloud systems concentrate on on-demand scaling up and down for service allocation and pay-as-you-go as opposed to job execution on shared resources.
The company offering a cloud system needs to over-provision based on some prediction model. If a cloud does not scale up on user demand and as written in the SLA, it fails. In Grid, there is more transparency on resource availability and the average number of resources is typically known. If Grid users wait in the queue, they do not complain, given that they get what the VO agreed to. Another thought: when Grid was conceived, VMs were not a commodity, but they were not strictly needed either (though really beneficial; the Globus project has been working on VM exploitation since 2004, http://workspace.globus.org/papers/index.html). For clouds, virtualization is vital. In conclusion, I agree with your last paragraph. Being able to "fork a server" from the programmatic viewpoint opens the door to new applications. The "resizable Grid" is an interesting application. We'll see in 5 years how they will affect each other.
Hi, These are differences based on functionality or usage. Are there any technology-level differences too?
Kiran, the use-cases do drive significant implementation differences. Grid systems tend to have sophisticated queueing and job prioritization mechanisms, clouds don't. There has been a lot of work in tying multiple grids together at the network and/or data store level; so far clouds stand more on their own. Clouds have been designed to really support interactive web sites with 24x7 availability, grids haven't gone in that direction. And the list goes on, and there's probably an exception to everything mentioned...
Is it that grids don't make use of virtualization, or are clouds just the more prominent user?
Posted by nitesh (not verified)   Ι   February 28, 2009   Ι   09:38 PM
Hi, To me Cloud and Grid represent two different things that may sometimes solve similar problems. Consider the metaphors they propose: - Grid: “Let’s join our efforts by joining our domains in order to achieve a better service”. - Cloud: “We can provide you with more computational power than you need. Just tell us what you want and we will give it to you”. Looking at these metaphors, we can immediately see that there is a clear overlap in terms of potential customers, but at the same time they are orthogonal. Take an example. In high energy physics (HEP) experiments the assumption “we have more computational power than you need” fails: such use cases will use all the CPUs they can find, so they really need a way to access many datacenters (as Sergio was saying). However, the constraints that HEP introduces are far from common, and cloud facilities can host all the services that a company may need. Grid could cover such use cases, but it is not really designed for that, so cloud would probably fit better.
[...] up to a previous post, I came across a July 2008 article by Thorsten von Eicken, CTO and founder of RightScale, which provides a front-end for managing [...]
[...] of the authors of “Right Scale”, a popular application for managing cloud services (used, among others, in [...]
[...] best description I've seen so far came from RightScale's blog, attributed to Rich Wolski of the Eucalyptus Project. Wolski describes grid computing as suitable [...]
For a further discussion on grid vs. cloud computing, and the benefits/disadvantages of both, check the article at http://ccskguide.org/2011/02/cloud-computing-vs-grid-computing/ At ccskguide.org, we take a look at the security issues around cloud computing and help prepare candidates for the CCSK Cloud Security Certification.
