The following is the text of an article that I wrote for GridToday on Globus' 10th birthday, which we celebrated yesterday in Washington DC.
Globus Turns 10: Time for Celebration and Reflection
The GlobusWORLD conference being held
(jointly with GridWorld and the Open Grid Forum) this week in
Washington, D.C., is a significant milestone for those involved in the
development and use of the Globus open source Grid software. The reason
is that it was 10 years ago (to be precise, on Aug. 21, 1996) that Carl
Kesselman and I received our first funding for work on Globus, from
DARPA. Gary Minden and Mike St. Johns were our enlightened program
managers, followed by Gary Koob. I must also recognize the support of
Bob Aiken, Tom Kitchens and, especially, Mary Anne Scott, then all at
DoE.
Given this milestone, I will spend some time here recapping
history and reflecting on where we have come and what we have learned.
A Little History
Ten years is a long time: What on earth have we been doing over that period? Let's revisit some of the highlights.
The
emergence of high-speed networks in the 1990s led to an awareness that
the Internet could allow for more interesting applications than e-mail
and file transfer. (Len Kleinrock had envisioned this possibility back
in 1969, but it took a while to get there!) Efforts like the U.S.
Gigabit testbed project, led by Bob Kahn, and the Supercomputing'95
I-WAY effort, led by Tom DeFanti and Rick Stevens, helped build
awareness of these opportunities. This era also saw pioneering efforts
such as the NSF Metacenter, led by Charlie Catlett and Larry Smarr, and
Legion, led by Andrew Grimshaw. However, for the most part, every
application was constructed from scratch.
We (in particular,
Carl, Steve Tuecke and I) studied this situation and saw a need
for standards and software (middleware) to bridge the gap between
applications and the complexities of a distributed resource
environment. Thus, we started a research project aimed at defining this
middleware. Believing strongly that we did not necessarily know the
real problems, we started an iterative process of examining the
requirements of collaborative communities, prototyping solutions to
their problems and feeding back the resulting experiences into a next
cycle of research and development. We called this project Globus
because it built on earlier technology called "Nexus" and had global
goals.
Back in 1996, our ambitions and the needs of our users
were far greater than our resources -- a situation that persists today!
-- and so it was challenging to develop software that was sufficiently
stable and functional to allow for meaningful experiments. Fortunately,
we found wonderful application partners -- people like Ed Seidel, Paul
Messina and their colleagues, and later members of the high energy
physics community -- who were prepared to work with often imperfect
software and provide invaluable feedback.
Along the way, we
achieved milestones that helped persuade ourselves and others that we
had something useful. For example, 1998 saw Sharon Brunett, Karl
Czajkowski and others achieve a record-setting military simulation
involving 100,298 vehicles distributed over 13 supercomputers at nine
sites. Gregor von Laszewski and others demonstrated real-time analysis
of data from the Advanced Photon Source. At the SC'98 conference, we
demonstrated the "Globus Ubiquitous Supercomputing Testbed
Organization" (GUSTO) that spanned some 50 sites worldwide. NASA
launched its Information Power Grid project, under the leadership of
Bill Johnston.
By 2001, the year in which the TeraGrid was
founded, we had software we felt was ready to operate in production
environments, if only we could find friendly sites prepared to perform
the needed integration, and application scientists ready to develop the
necessary application software. In practice, we weren't as ready as we
thought we were, but nevertheless we entered a stage -- of learning via
experience about the mechanisms and policies required for operational
use -- that to some extent continues today. We also received some nice
recognition at this time: Globus Toolkit version 2 (GT2) played a key
role in a Gordon Bell prize awarded at SC'01 to an astrophysics
application that used Cactus, MPICH-G2 and Globus. The following year,
R&D Magazine recognized GT2 with an R&D 100 award and named it
the "most promising new technology" of the year.
In late 2001,
IBM followed up its dramatic open source Linux strategy announcement
with a similar announcement about the importance of Grid technologies.
We were thrilled when IBM elected to work with us to develop the OGSI
Web Services specification and the corresponding Globus implementation,
which was released in 2003 as GT3. While this first Web services
release was of only modest quality, it spurred much innovative work,
such as the video distribution system developed by the Belfast eScience
Center for the BBC (to give an idea of the scale of effort underway by
this time, BeSC applications alone totaled 1.5 million lines of GT3
code, later adapted for GT4).
2005 saw the release of Globus
Toolkit version 4 (GT4), which, thanks to the efforts of talented
developers and the able leadership of Lisa Childers, exceeded all
previous releases in terms of quality and rigor of both software and
documentation. GT4 supports the construction of stateful and secure Web
services in Java, C and Python; provides job submission, file transfer,
credential management, registry and database access services;
incorporates a powerful integrated security system; and provides many
other features besides. 2005 and 2006 also saw significant new funding
in support of the Globus science community, from the U.S. National
Science Foundation's NSF Middleware Initiative (under Kevin Thompson),
the UK eScience program (for work on OGSA-DAI) and, most recently, from the
U.S. Department of Energy's SciDAC program.
Where We Are Today
Someone
once dismissed Grid as a "funding concept" -- a witty but irritating
turn of phrase. I have not heard that expression lately: Grid is
mainstream in both science and industry, and so many people are using
Grid technology to solve real problems that it is hard to argue that it
is not successful and useful. Indeed, we can make a strong case that
Grid has had a significant impact on how people conceptualize and solve
problems in many domains.
It is particularly pleasing to see the
diversity of Globus application communities, which span, for example,
astronomy (e.g., the LIGO gravitational wave observatory, the Caltech
Montage service), bioinformatics (e.g., Natalia Maltsev's PUMA system),
cancer biology (e.g., the National Institutes of Health's caBIG cancer
bioinformatics Grid), data mining (e.g., work by Domenico Talia) and
environmental science (e.g., C3grid in Germany and Earth System Grid in
the United States). And that is just the first five letters of the
alphabet.
I am also delighted with the geographical diversity of
Globus deployments. We see substantial Globus deployments and
applications on every continent except Antarctica, and just about every
day I get e-mail from someone somewhere describing a new deployment of
which I was not previously aware. Again, we can walk through the
alphabet: Australia, Belgium, China (and Canada and Chile), Denmark,
England, France, Germany, Hungary, Ireland, Japan, Korea, Luxembourg,
Mexico, the Netherlands, ....
Another area in which we continue
to see wonderful progress is in the range of "solutions" that leverage
Globus software. Globus middleware does not address end-user
requirements directly, but a wide range of Globus-based tools now
exists for building portals (e.g., OGCE, GridPort, Jason Novotny and
Michael Russell's GridSphere); executing workflows (e.g., Ewa Deelman
and Mike Wilde's VDS, David Abramson's Nimrod, Miron Livny's Condor,
BPEL); running parallel programs (e.g., Nick Karonis' MPICH-G2);
delivering data (e.g., Ann Chervenak's DRS, Reagan Moore's SRB);
operating instruments (e.g., Rick McMullen's Common Instrument
Middleware Architecture project, GridCC in Europe); invoking remote
services (e.g., Ninf in Japan); and so on. Lee Liming has done a nice
job documenting these and other "solutions."
It is also pleasing
to see the progress being made in industry. Steve Tuecke left Argonne
in 2004 to form Univa Corp., which provides commercial support for
Globus software and is building new products using Globus (disclaimer:
I am also a Univa founder and advisor). They are discovering that the
concerns of industry are increasingly similar to those of science, as
the need to accelerate innovation processes leads to a need for dynamic
resource sharing between organizational units.
I should also
mention the progress made with standards. Globus contributors, notably
Von Welch, played major roles in the Grid Security Infrastructure
standard, which has been widely adopted. The same is true for GridFTP,
under the leadership of Bill Allcock. The Job Submission Description
Language (JSDL) and Basic Execution Service (BES) specifications, which
seem likely to see wide adoption, build heavily on GRAM. Globus project
members, notably Frank Siebenlist, have also contributed heavily to the
increasingly important WS-Security, SAML2 and XACML specifications.
It
is a nice coincidence, given our anniversary, that August saw the
release of the WS-ResourceTransfer specification by HP, IBM, Intel and
Microsoft -- perhaps signaling the end of a standards odyssey that
began in 2001 when Steve Tuecke and others defined the Open Grid
Services Infrastructure (OGSI). The goal was to codify Web services
mechanisms for representing and accessing state, a requirement that
appeared in many different contexts. Like Ulysses, we did not know we
were embarking on an Odyssey when we began. However, the release of
WS-ResourceTransfer -- remarkably similar to OGSI! -- suggests that we
may soon reach this journey's end.
Also worthy of celebration is
the tremendous growth in the size of the Globus developer community. In
the beginning, there were just three of us, plus a few partners such as
Craig Lee at the Aerospace Corp. The team grew over time, as talented
researchers and developers joined us at Argonne, the University of
Chicago and USC Information Sciences Institute, and then other
organizations partnered with us, notably the National Center for
Supercomputing Applications (Jim Basney, Von Welch and others), the
University of Edinburgh (Malcolm Atkinson, Neil Chue Hong, Mark Parsons
and others) and PDC in Sweden (Olle Mulmo and others). Most recently,
the new dev.globus development process (modeled after that of Apache
Jakarta) has partitioned Globus into dozens of independent projects,
each with its own developers, and opened the way for new projects to
join. The response has been enthusiastic: under the leadership of
Jennifer Schopf, our new incubator process already has 11 incubator
projects up and running.
Reflections
We
have learned a tremendous amount in the past 10 years. It is hard to
know where to start in terms of summarizing lessons learned, but here
are a few thoughts.
We were clearly correct in identifying
large-scale collaboration as an important problem, and in choosing
science as a good place to start identifying requirements and
experimenting with solutions. We have seen the need to federate data
and computing, orchestrate the allocation of resources to different
purposes and manage the policies that govern these activities become
increasingly important, first across science and now in industry too.
Indeed, these questions are arguably now central to the critical
question of how innovation occurs within and across organizations.
Along
the way, we have learned (and I am sure must continue to relearn) the
need to evolve the software and to reinvent ourselves as both user
requirements and the external technology environment evolve. For
example, we adopted public key security technology early: a successful
step, although the configuration tools needed for convenient use have
taken time to emerge. We adopted LDAP as a directory service
technology: less successful, and later abandoned. In 2002, we started a
major shift to Web services technology: also a positive development
overall, although we were arguably premature, given the maturity of Web
services technologies at the time. In the future, we will need to
respond to the emergence of commercial Web services, like Amazon's S3
and EC2 services, and to other developments that we have yet to
recognize.
Our decision to pursue an open source approach and a
non-viral license was also clearly correct. It was not necessarily the
obvious choice back in 1996, and required a lot of hard work to define
the necessary licenses and get the required approvals. (I realized just
how much work when a lawyer asked Steve Tuecke, who handled much of the
early work on licenses, if he had considered law school!) However, this
choice has allowed us to scale the development team and user community
in ways that would not have been possible with a proprietary solution.
Our recent move to a pure Apache license is, I hope, the
culmination of this approach.
We have struggled with numerous
issues over the years relating to the fact that any large-scale
collaboration (and thus a grid) is a system and, as such, involves a
great diversity of software, hardware, institutions and, above all,
different people: users, tool developers, application developers,
operations staff, security staff and others. The result is considerable
complexity in terms of requirements and also significant challenges in
how requirements and capabilities are communicated to different groups.
One inevitable consequence of this complexity is that Grid and
Globus are not easily characterized, and thus we have struggled to
overcome various misconceptions over the years. One is that Grid is
somehow an alternative to high-end computing -- rather than an
essential adjunct to high-end computing, enabling remote access and the
distribution of the resulting data products. Another is that a Grid is
about "free computing." A third is that Globus is a turnkey solution to
Grid problems. We have been careful to emphasize that Globus is
middleware, not application software, but we still hear complaints that
"I installed Globus, but it didn't solve my problem."
I'd also
say that we didn't internalize sufficiently at the beginning the extent
to which Grid was a policy and operations problem. Fortunately, we've
seen some wonderful people get involved with these issues, with the
result that we have become increasingly good at creating and operating
grids that work. Projects like EGEE, Open Science Grid and TeraGrid
have taught us a lot.
In a different space, I remain concerned by
the amount of redundancy and lack of interoperability that we see
across the Grid community. Given the natural human enthusiasm for
novelty (often encouraged by funding agencies and commercial
pressures), this diversity is not a surprise. However, I expect that
convergence will occur, as people come to understand the high cost of
redundant effort, and the tremendous advantages of mature, robust, open
source software.
Overall, though, the current situation and
future prospects are incredibly encouraging and positive. The
requirements that we set out to address with Globus 10 years ago have
proved to be quasi-universal. It is no longer eccentric scientists and
niche communities who use Grid technology, but mainstream science
communities and (increasingly) commercial users. We have a set of
technologies that, while certainly not a complete solution, address key
requirements. We also see convergence on standards and increasingly
broad adoption of those standards in both open source and proprietary
software. Finally, and most important, we have a vibrant, sometimes
contentious but always enthusiastic, international community of
developers and users who are committed to moving the technology
forward. We should all look forward to the 20th anniversary of Globus
-- by which time, if the Internet is any guide, Grid technology will be
ubiquitous.
In writing this document, I have tried to acknowledge
some of the many contributors to Globus software, deployments and
applications. I, of course, have omitted many more names than I have
included. I hope that those omitted will forgive me, and that other
readers will feel inspired to learn more about individual projects and
those who made them happen.
Happy 10th birthday, Globus!