The authors of a recent OGF document, "Using Clouds to Provide Grids Higher Levels of Abstractions and Explicit Usage Modes," make several assertions to which I take exception:
1) "There is a level of agreement that computational Grids have not been able to deliver on the promise of better applications and usage scenarios."
It is fascinating to watch the Gartner hype cycle in action, if sad to see people stuck in the trough of disillusionment. But the fact is, fortunately, that there are substantial grid projects and applications that are enjoying considerable success. Ones that come immediately to mind are the Earth System Grid, the cancer Biomedical Informatics Grid, and the LIGO Scientific Collaboration, but as it was yesterday that the LHC was switched on, we should also recall the remarkable successes of the LHC Computing Grid and its partner projects such as Open Science Grid. At a different level, Globus people will be happy to talk about the millions of files moved via GridFTP every day, and Miron Livny will be happy to talk at length about how many millions of CPU hours are delivered every day via Condor.
2) To address this purported lack of success, "there is a need to expose less detail and provide functionality in a simplified way. If there is a lesson to be learned from Grids it is that the abstractions that Grids expose – to the end-user, to the deployers and to application developers – are inappropriate and they need to be higher level."
No evidence is provided for this assertion that complex interfaces are the reason for the difficulties people have with grids. I argue that the issues are more complex.
First, the interfaces themselves are not, in my view, a significant issue. We can argue whether we prefer REST or Web Services, or say Nimbus (a grid virtualization interface) or EC2 (a cloud virtualization interface), but the differences among these alternatives are not great.
On the other hand, the economic systems that apply in the two cases are extremely different:
- Amazon services are designed to support the masses; they face no political constraints on whom they can serve, and their charging model provides strong returns to scale. Thus, Amazon can focus on, and succeed in providing, modest-scale, reliable, on-demand service to many.
- TeraGrid (to use a US example) is designed to support a small number of extreme computing users, with negative returns to scale (the more users, the more work for a fixed budget). Thus, its operators are not motivated to provide virtualization solutions or to operate highly reliable remote access interfaces.
The implications of these different foci for users are tremendous. On EC2, I give my credit card and start a VM--a few seconds. On TeraGrid, I request an allocation (which may not be granted!), get an account, submit a request to run a job (they won't allow me to start a VM), and wait in the queue--a many-week process. Furthermore, I sometimes find that the remote access interfaces fail because keeping them running is not a high priority.
This alternative perspective is, I think, more revealing about the sources of the differences and the ways we might address them. If we want on-demand, high-quality compute and storage services, then we need either to create an economic system in which academic providers are motivated to provide such services or to decide to outsource to industry.
The importance of higher-level interfaces is a separate issue. Yes, tools like Hadoop and Swift for data analysis, Introduce for service authoring, and Taverna for service composition are important and necessary. Yes, we should be hoping to leverage and influence work done in the far larger corporate market to our advantage. (This is a focus of the upcoming CCA workshop.)
3) "Grids as currently designed and implemented are difficult to interoperate." The authors make a big deal of this point, but it is not clear to what purpose.
It is true that interoperation is not automatic. [If only everyone used Globus software, then all would be well :) --although of course the policy issues would remain!] But I am not sure that this is a significant problem for users, or hard to achieve when it is needed. E.g., the caBIG team recently demonstrated a gateway to TeraGrid. The LHC Computing Grid integrates resources worldwide. Etc. Most users never ask about interoperability, in my experience.