There's Grid in them thar Clouds*
You’ve probably seen the recent flurry of news concerning “Cloud computing.” Business Week had a long article on it (with an amusing and pointed critique here). Nick Carr has even written a book about it. So what is it about, what is new, and what does it mean for information technology?
The basic idea seems to be that in the future, we won’t compute on local computers, we will compute in centralized facilities operated by third-party compute and storage utilities. To which I say, Hallelujah, assuming that it means no more shrink-wrapped software to unwrap and install.
Needless to say, this is not a new idea. In fact, back in 1960, computing pioneer John McCarthy predicted that “computation may someday be organized as a public utility”—and went on to speculate how this might occur.
In the mid 1990s, the term grid was coined to describe technologies that would allow consumers to obtain computing power on demand. I and others posited that by standardizing the protocols used to request computing power, we could spur the creation of a computing grid, analogous in form and utility to the electric power grid. Researchers subsequently developed these ideas in many exciting ways, producing for example large-scale federated systems (TeraGrid, Open Science Grid, caBIG, EGEE, Earth System Grid, …) that provide not just computing power, but also data and software, on demand. Standards organizations (e.g., OGF, OASIS) defined relevant standards. More prosaically, the term was also co-opted by industry as a marketing term for clusters. But no viable commercial grid computing providers emerged, at least not until recently.
So is “cloud computing” just a new name for grid? In information technology, where technology scales by an order of magnitude, and in the process reinvents itself, every five years, there is no straightforward answer to such questions.
Yes: the vision is the same—to reduce the cost of computing, increase reliability, and increase flexibility by transforming computers from something that we buy and operate ourselves to something that is operated by a third party.
But no: things are different now than they were 10 years ago. We have a new need to analyze massive data, thus motivating greatly increased demand for computing. Having realized the benefits of moving from mainframes to commodity clusters, we find that those clusters are darn expensive to operate. We have low-cost virtualization. And, above all, we have multiple billions of dollars being spent by the likes of Amazon, Google, and Microsoft to create real commercial grids containing hundreds of thousands of computers. The prospect of needing only a credit card to get on-demand access to 100,000+ computers in tens of data centers distributed throughout the world, resources that can be applied to problems with massive, potentially distributed data, is exciting! So we’re operating at a different scale, and operating at these new, more massive scales can demand fundamentally different approaches to tackling problems. It also enables (indeed is often only applicable to) entirely new problems.
Nevertheless, yes: the problems are mostly the same in cloud and grid. There is a common need to be able to manage large facilities; to define methods by which consumers discover, request, and use resources provided by the central facilities; and to implement the often highly parallel computations that execute on those resources. Details differ, but the two communities are struggling with many of the same issues.
Unfortunately, at least to date, the methods used to achieve these goals in today’s commercial clouds have not been open and general purpose, but have instead been mostly proprietary and specialized for the specific internal uses (e.g., large-scale data analysis) of the companies that developed them. The idea that we might want to enable interoperability between providers (as in the electric power grid) has not yet surfaced. Grid technologies and protocols speak precisely to these issues, and should be considered.
A final point of commonality: we seem to be seeing the same marketing. The first “cloud computing clusters”—remarkably similar to the “grid clusters” of a few years ago—are appearing. Perhaps Oracle 11c is on the horizon?
What does the future hold? I will hazard a few predictions, based on my belief that the economics of computing will look more and more like those of energy. Neither the energy nor the computing grids of tomorrow will look like yesterday’s electric power grid. Both will move towards a mix of microproduction and large utilities, with increasing numbers of small-scale producers (wind, solar, biomass, etc., for energy; for computing, local clusters and embedded processors—in shoes and walls?) co-existing with large-scale regional producers, and load being distributed among them dynamically. Yes, I know that computing isn’t really like electricity, but I do believe that we will nevertheless see parallel evolution, driven by similar forces.
In building this distributed “cloud” or “grid” (“groud”?), we will need to support on-demand provisioning and configuration of integrated “virtual systems” providing the precise capabilities needed by an end-user. We will need to define protocols that allow users and service providers to discover and hand off demands to other providers, to monitor and manage their reservations, and arrange payment. We will need tools for managing both the underlying resources and the resulting distributed computations. We will need the centralized scale of today’s cloud utilities, and the distribution and interoperability of today’s grid facilities.
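To make the hand-off idea a little more concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the `Provider` and `Reservation` classes, the cheapest-peer-first rule, and the prices are hypothetical and do not correspond to any real grid or cloud protocol. It shows only the shape of the interaction: a provider serves a request from its own capacity if it can, and otherwise passes the request on to a peer.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Reservation:
    provider: str  # name of the provider that ends up serving the request
    cores: int
    cost: float    # cores * hours * price, in arbitrary currency units


@dataclass
class Provider:
    """Hypothetical resource provider; not modeled on any real service."""
    name: str
    free_cores: int
    price_per_core_hour: float
    peers: List["Provider"] = field(default_factory=list)

    def request(self, cores: int, hours: float,
                _seen: Optional[Set[str]] = None) -> Optional[Reservation]:
        # Track visited providers so a cycle of peers cannot loop forever.
        seen = _seen if _seen is not None else set()
        seen.add(self.name)

        # Serve locally when capacity allows.
        if cores <= self.free_cores:
            self.free_cores -= cores
            cost = cores * hours * self.price_per_core_hour
            return Reservation(self.name, cores, cost)

        # Otherwise hand the demand off to peers, cheapest first (an assumed policy).
        for peer in sorted(self.peers, key=lambda p: p.price_per_core_hour):
            if peer.name in seen:
                continue
            reservation = peer.request(cores, hours, seen)
            if reservation is not None:
                return reservation
        return None  # nobody reachable could satisfy the request


# A small local producer backed by a large regional utility.
local = Provider("campus-cluster", free_cores=64, price_per_core_hour=0.02)
utility = Provider("regional-utility", free_cores=100_000, price_per_core_hour=0.10)
local.peers.append(utility)

small = local.request(cores=32, hours=1.0)    # fits on the local cluster
large = local.request(cores=1000, hours=2.0)  # too big locally, handed off
```

A real protocol would also have to cover discovery, monitoring, and payment, which this sketch leaves out entirely.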
Some of the required protocols and tools will come from the smart people at Amazon and Google. Others will come from the smart people working on grid. Others will come from those creating whatever we call this stuff after grid and cloud. It will be interesting to see to what extent these different communities manage to find common cause, or instead proceed along parallel paths.
*An obscure cultural reference: the phrase “There’s gold in them thar hills” was first uttered, according to some, by an old prospector in the 1948 movie “The Treasure of the Sierra Madre”, starring Humphrey Bogart.
