May 14, 2008

Writing from GlobusWorld/OSGC in Oakland

This week I am attending the Open Source Grid and Cluster conference in Oakland, California. This event includes GlobusWorld, the Sun Grid Engine workshop, and the Rocks workshop, as well as tracks on other open source grid and cluster software.

As an organizing committee member, I always find it a little nerve-racking ahead of a meeting, wondering how it will work. We're now in the third day and I am pleased to say that the meeting is going very well. The sessions are well attended and there are lots and lots of discussion and questions, both during and between talks. (At this very moment, I am in a session that is running 15 minutes over as attendees miss the break for a demonstration of Taverna and Globus.) In addition, the Marriott in Oakland turns out to be a beautiful location.

For those not fortunate enough to be at the meeting, the talks are being made available online.

May 01, 2008

Last Chance to Register for Open Source Grid and Cluster Conference!

Due to popular demand from procrastinators, we have extended the advance registration deadline for the Open Source Grid and Cluster Conference until Monday, May 5.

If you haven't registered, now is the time to do so to get the off-site registration rate of $490.

If you haven't decided to come, take a look at http://www.opensourcegridcluster.org to see the outstanding content we've assembled, on Globus, Rocks, SGE, and other tools, and take the plunge.

April 21, 2008

10 Reasons to Attend Open Source Grid and Cluster Conference

Ok, I admit it is corny--but I assembled a list of 10 reasons why you should attend the Open Source Grid and Cluster Conference, to be held in Oakland May 12-16 (www.opensourcegridcluster.org).

1) Globus program is fantastic, including tutorials, advanced technical presentations, contributed talks, and community events on every aspect of Globus.

2) Gobs of other material on Sun Grid Engine and Rocks, and other open source grid and cluster software.

3) Gathering: A great opportunity to meet colleagues, peers, collaborators from the grid and cluster community. The only grid meeting in the US the rest of this year--the next two OGFs are in Spain (June) and Singapore (September).

4) GT4.2: You'll get to learn about the exciting new features in Globus Toolkit 4.2. New execution, data, security, information, virtualization, and core services.

5) Gratification (immediate) as you get to provide your input on future directions for Globus, Sun Grid Engine, Rocks, and other open source systems--and maybe sign up to contribute to those developments.

6) Grid solutions: You'll get to meet the people using Globus to build enterprise grid solutions in projects like caBIG, TeraGrid, Earth System Grid, MEDICUS, and LIGO, and learn about solution tools like Introduce, MPI-G, Swift, Taverna, and UniCluster.

7) Gurus: You get to grill the Globus gurus--or, if you prefer, show off your own Globus guru status.

8) Great price: $490 registration is substantially cheaper than OGF or HPDC, for example, and the hotel rate is reasonable ($149).

9) Gorgeous location: Oakland is easy to get to, with SFO (an easy BART train ride), Oakland, and San Jose airports all nearby. Just a 10-minute train ride to downtown San Francisco. A lovely time to be in the Bay Area.

10) Gorilla and guerilla free: None of the corporate marketing talks that diluted the last GridWorld conference--apart from two sponsor talks, this is pure tech, and highly useful tech at that.

We look forward to seeing you in Oakland!

Regards -- Ian.

April 11, 2008

Services for Science

I gave a talk on Services for Science (PDF, PPT) at the INGRID 2008 conference in Ischia, Italy, on Wednesday. I decided to do something different and include demonstrations. I think this worked well. I created and deployed a GT4 service using Introduce and gRAVI, and then created and ran a workflow invoking GT4 services using Taverna. Well, to be honest, there were some steps I skipped along the way (in the style of Julia Child), but nevertheless I found it impressive how much could be done interactively, in a short time. In summary, I was able to show how we can:

  1. Create a new service using the Introduce integrated development environment, defining operations and resource properties, folding in required functionality (e.g., security, notification), and selecting types from both base types and predefined libraries. (Using gRAVI we can also encapsulate executables.)
  2. Publish this service into registries (GT4 index services).
  3. Discover available services.
  4. Compose services into workflows, e.g., via the use of Taverna, which thanks to recent work by Wei Tan and Ravi Madduri (and much help from the Taverna team) can now invoke GT4 services.
  5. Deploy and publish the workflow in turn ...
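
The publish, discover, compose, and republish cycle in the steps above can be sketched in a few lines of plain Python. This is only a toy illustration of the pattern: the `Registry` class, the `compose` helper, and the service names are all hypothetical, and nothing here uses or resembles the actual Introduce, GT4 index service, or Taverna interfaces.

```python
from typing import Callable, Dict, List

# A "service" here is just a function from a number to a number (an assumption
# made to keep the sketch small; real services have much richer interfaces).
Service = Callable[[float], float]


class Registry:
    """Toy in-memory stand-in for a service index (hypothetical, not GT4)."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}

    def publish(self, name: str, service: Service) -> None:
        """Register a service under a name so that others can find it."""
        self._services[name] = service

    def discover(self, prefix: str = "") -> List[str]:
        """Return the sorted names of services whose name starts with prefix."""
        return sorted(n for n in self._services if n.startswith(prefix))

    def lookup(self, name: str) -> Service:
        """Return the callable registered under name."""
        return self._services[name]


def compose(registry: Registry, names: List[str]) -> Service:
    """Chain the named services so each output feeds the next input."""
    steps = [registry.lookup(n) for n in names]

    def workflow(value: float) -> float:
        for step in steps:
            value = step(value)
        return value

    return workflow


registry = Registry()
# Steps 1 and 2: create two services and publish them.
registry.publish("units/celsius_to_kelvin", lambda c: c + 273.15)
registry.publish("units/kelvin_to_rankine", lambda k: k * 1.8)
# Steps 3 and 4: discover what is available and compose it into a workflow.
pipeline = compose(registry, registry.discover("units/"))
# Step 5: publish the workflow in turn, so it is itself a discoverable service.
registry.publish("workflows/celsius_to_rankine", pipeline)
```

The point of the last line is that a composed workflow is registered exactly like any other service, which is what lets workflows be discovered and composed again.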

April 10, 2008

Grid Summer School

The sixth in the highly successful series of International Summer Schools on Grid Computing will be held at the Hotel Füred Conference and Congress Centre of Balatonfüred, Hungary, from 6th to 18th July 2008.

The School will include lectures, discussions, laboratory sessions, tutorials and group work delivered by leading authorities in the fields of advanced grid technology, applications of e-Science and distributed systems research. Reports from world leaders in deploying and exploiting Grids will complement lectures from research leaders shaping future e-Infrastructure.

Hands-on laboratory exercises will give students experience with widely used Grid middleware. The school will conclude with an integrating practical that will enable students, working in teams, to bring together all they have learnt on an extended exercise that simulates collaborative research using e-Infrastructures. Indeed during the school, participants will meet like-minded students from many parts of the world, working in many disciplines, and form valuable long-term working relationships.

We invite applications from enthusiastic and ambitious researchers who have recently started or are about to start working on Grid projects. Students may come from any country. We expect participants from computer science, computational science and any application discipline. The School will assume that students have diverse backgrounds and build on that diversity. However, in order to fully participate in the practical exercises you should be a confident programmer who will have fulfilled certain prerequisites.

To find further details visit the web site at: http://www.issgc.org

December 11, 2007

Inaugural Research Institute for the Science of Socio-Technical Systems

I've been talking a lot recently to colleagues in the social sciences, following a workshop on virtual organizations that I ran for NSF with Carl Kesselman, Tom Finholt, and Jonathan Cummings. If I may say so myself, it was a great meeting--certainly not the "same old" discussions (at least from my perspective). The Building Effective Virtual Organizations (BEVO) workshop is a next step in this process.

Co-organizer Tom Finholt now points me at the Inaugural Research Institute for the Science of Socio-Technical Systems, designed for graduate students and new faculty, which sounds like a lot of fun.

December 07, 2007

Visit Florida in January (and learn about grid)

Florida International Grid School (FIGS'08) -- Apply by Dec 10!

JOIN US for an exciting 3-day course in large-scale and high-performance grid computing to take place January 23-25, 2008, at Florida International University, Miami, FL.

This intensive course introduces the techniques of grid and distributed computing for science and engineering fields, with hands-on training in the use of large-scale grid computing resources. The course introduces skills that will be needed by researchers in the natural and applied sciences, engineering, and computer science to conduct and support large-scale computation and data analysis in emerging grid and distributed computing environments. The workshop will focus on enabling the use of Open Science Grid (OSG) and TeraGrid cyberinfrastructure to perform large-scale computations and data-intensive processing in various research fields. Participants will learn how to use grids of thousands of processors and will be able to continue to use these resources for their research after the course.

The workshop will cover:

* Overview of distributed computing concepts and tools
* Wide-area high speed optical networking
* Concepts, tools, and techniques of grid computing
* Discovering and using grid resources
* Grid scheduling and distributed data management
* Web service and grid service concepts
* Techniques for workflow and collaboration

Undergraduate and graduate students, researchers, educators and professionals in engineering, computer science, or any scientific, data- or computing-intensive discipline may apply. Applicants should have at least intermediate programming skills (one to two semesters' experience in C/C++, Java, Perl, and/or Python) and hands-on experience with UNIX / Linux in a networked environment.

Important Dates:

Application Deadline: Dec 10.
Notification of Acceptance will be sent by Dec 20.
Registration Deadline: Jan 10.

Application form can be found at: www.ci.uchicago.edu/figs08

For information on FIGS'08 please visit www.opensciencegrid.org/workshop or send a message to figs08@opensciencegrid.org

November 20, 2007

Workshop on Building Effective Virtual Organizations

I am involved in organizing an NSF-sponsored workshop, "Building Effective Virtual Organizations" (BEVO), to be held in Washington, DC, on January 15 and 16. As we write on the Web page:

Virtual organizations are increasingly central to the science and engineering projects funded by the National Science Foundation. Indeed, if you are a researcher or educator, chances are that if you don't already lead or participate in at least one distributed team, you will soon.

Unfortunately, chances are also that you have never been told how to establish such teams, how to make them successful, or what technologies exist that can help them function effectively. You probably haven't had many opportunities to interact with others building, or just working in, virtual organizations, either.

Our goal at this workshop is to help address this knowledge gap. We invite you to take this opportunity to come and learn what is required to make virtual organizations successful, contribute your experiences and challenges to the discussions, and to establish new connections that will help you succeed in your research and education projects in the future.

The workshop is sponsored by the National Science Foundation's Office of Cyberinfrastructure, which has identified virtual organizations as a fundamental element of its infrastructure plans. Attendees will be responsible for their own travel and hotel. There is no fee to attend the workshop, but registration is required as space is limited.

We're still finalizing some of the speakers, but this is a unique opportunity both to learn about what it means to make virtual organizations work in practice, and to meet and talk with others engaged in the "virtual organization lifestyle" ...

November 08, 2007

Globus in Reno next week

As usual, many in the grid and HPC community are migrating to the "SC" conference, this year in delightful Reno. There will be a huge number of talks and demonstrations on Globus applications, infrastructures, and technologies. Here is a partial list of talks and demonstrations. Come to the Argonne booth to see our new advance reservation service, learn about the latest in Globus technologies, hear what Earth System Grid has been up to, and see parallel programs running via Swift!

It is not too late to sign up for the Globus and GridFTP tutorials, to be held on Sunday and Monday respectively.

My colleagues from Univa UD will be there also, showing off their new Cluster Express product. Stop by their booth to say hello to Steve Tuecke and others.

Please let us know if you want to meet to discuss your use of Globus, your ideas for future Globus development, and/or technologies that you think complement Globus software.

I myself will not be attending SC this year due to other commitments--the first time in many years that I am missing the conference. And no, it is not because it is being held in Reno.

November 07, 2007

Microsoft eScience Conference

A belated report--I attended the Microsoft eScience Conference in North Carolina recently. It started on a Sunday, which offends my humanist principles, but I decided to go for the whole meeting, and thus awoke at 4am on Sunday to catch a plane. Overall the effort was worth it--I was there for a great kickoff talk by Kelvin Droegemeier, which suggested that their decade-long effort to create accurate tornado forecasts is bearing fruit, and the rest of the meeting was good also. I particularly enjoyed the opportunity to talk with many UK eScientists.

One talk I enjoyed was Carole Goble and David de Roure's double act on the use of Facebook technology for social networking among scientists ("myExperiment"). I hope that they are instrumenting this system carefully so that we can determine whether, when, how, and why (?) people choose to share information on such systems.

From Chicago, Tibi Stef-Praun presented a poster on work on computational economics and Ioan Raicu presented a talk on his work on "data diffusion."

I participated in a panel on overcoming barriers to adoption of eScience. One phrase that I particularly liked, from remarks by Alex Voss: we should not build but foster infrastructure. "Fostering infrastructure" has a good ring to it.

June 07, 2007

Nature Asia-Pacific Conference

I spoke this week at a conference in Tokyo that Nature Publishing Group organized to celebrate 20 years of publishing in Asia-Pacific. I should have been in Madison for the TeraGrid '07 meeting, but I agreed long ago to speak here. It was a fun event, with participants from all across Asia. Lots of "real scientists" as I affectionately call my non-computer scientist colleagues. Indeed, the other keynote speaker was Ryoji Noyori from RIKEN, the 2001 Chemistry Nobel laureate. A particularly large number of people praised my talk, so either (a) Asian people are very polite, or (b) it was a good talk. I estimate 80% (a) and 20% (b).

One topic that was discussed during the meeting was the relative "invisibility" of many Asian institutions and researchers, simply because they do not have good English Web sites. An example of how simple things can make a big difference.

I couldn't stay to visit any of my friends in Japan because I have to head straight back to Chicago to dress up in a funny robe and present an honorary degree to Scott Shenker. More about that later.

June 06, 2007

IBERGRID Video Online

The organizers of IBERGRID kindly posted this video of my talk on "Scaling eScience Impact" (the slides are here). The abstract follows:

Computational approaches to problem solving have proven their worth in many fields of science, allowing the collection and analysis of unprecedented quantities of data, and the exploration via simulation of previously obscure phenomena. We now face the challenge of scaling the impact of these approaches from the specialist to entire communities. I speak here about work that seeks to address this goal by rethinking science's information technology foundations in terms of service-oriented architecture. In principle, service-oriented approaches can have a transformative effect on scientific communities, allowing tools formerly accessible only to the specialist to be made available to all, and permitting previously manual data-processing and analysis tasks to be automated. However, while the potential of such "service-oriented science" has been demonstrated, its routine application across many disciplines raises challenging technical problems. One important requirement is to achieve a separation of concerns between discipline-specific content and domain-independent infrastructure; another is to streamline the formation and evolution of the "virtual organizations" that create and access content. I describe the architectural principles, software, and deployments that my colleagues and I have produced as we tackle the first of these problems, and point to future technical challenges and scientific opportunities.

In addition to covering issues discussed in my 2005 Science paper, I touched upon recent work by the OSU team and Ravi Madduri on Introduce and RAVE, and by the U.Chicago team on Swift and Falkon.

May 24, 2007

Grid middleware in Europe

Trips to Germany and Spain in the past month have helped me catch up on the latest state of grid middleware in Europe. It's an exciting but also curious scene:

  • Tens of millions of Euros are being spent on the Enabling Grids for E-sciencE (EGEE) grid infrastructure and its gLite middleware, and on applying that middleware in various domains and promoting its adoption internationally. EGEE and gLite are heavily focused on the computing requirements of the Large Hadron Collider.
  • In the Nordic countries, we have the NorduGrid infrastructure and its Advanced Resource Connector (ARC) middleware. NorduGrid and ARC are also focused on physics.
  • Unicore has been developed with German and European funding over the past eight years. Unicore is focused on enabling remote access to supercomputer centers, and sees heavy use in Germany, where much of its development occurs.
  • Finally, the UK has put much money into OMII-UK, which in addition to supporting the popular myGrid and OGSA-DAI products, has created its own distinct middleware platform.

So there are at least four distinct European middleware solutions [actually five: see below]. Is this a good thing? Officially, yes. Each is a big success, diversity is positive, and interoperability is assured.

Unofficially, users talk with frustration about being pressured to use their funding agency's middleware, software developers bemoan the need to target different middlewares, and sites complain about having to support multiple software stacks. Meanwhile, interoperability is stymied by differing versions of standards and different configurations and policies.

The only software that no-one in Europe is pressured to use or deploy is Globus, and I am personally satisfied to see how many projects use it nonetheless. Indeed, while I cannot back up the following assertion with data (for one thing, privacy-conscious Europeans tend to turn off Globus usage reporting), I'd bet good money that Globus is the most widely used grid middleware in Europe. (That's not taking into account the Globus components included in ARC and gLite.) The people that talk to me must be somewhat self selecting, but they speak with tremendous enthusiasm about what it lets them do.

It's hard to see where this will all lead. While "made in Europe" (or "made in the Nordic countries" or "made in the UK") is a powerful rallying cry, surely four different systems can't be sustained indefinitely. Nevertheless, I don't see anything changing soon. Perhaps we can just hope for incremental steps: e.g., integrating the latest Globus components into gLite (they're using code that is several years old) and achieving interoperability between Globus and Unicore (we're on the hook for that).

It also seems strange to have no EU support for European Globus users--surely that would make good scientific sense? The reason seems to be that Globus is viewed by the EU as "US software"--even though it is all open source, and its developer and user community includes many Europeans.

ADDED LATER ON May 24: I discover that there are in fact not four but five grid middleware projects in Europe--XtreemOS is the latest. They say: "The XtreemOS system will offer an alternative to the Globus toolkit, which is currently the most widespread middleware system." A noble goal!

May 21, 2007

German eScience Conference

I was at the German eScience conference last week. Germany created its "D-Grid" project a couple of years ago, under the leadership of Wolfgang Gentzsch. Wolfgang is a great guy, so I expected D-Grid to do good things. My impression is that it has done very well. They are pursuing a variety of interesting application projects.

D-Grid does not require the use of any particular software: they support Globus, gLite, and Unicore. The result is, apparently, that "11-12 of their 15 application projects are using Globus." That's the sort of endorsement I like to see.

The German eScience Conference was the first international conference at which they presented the results of this program and other related activities. It was impressive to see what they have achieved. The number of international participants was limited, which was a shame. Hopefully next year the numbers will grow. Baden-Baden is a lovely place, and the conference was excellent.

I gave the latest version of my talk on "Scaling eScience Impact." I think it's an important message to deliver and I always enjoy the opportunity to present it. (Update: Video is available.)

May 15, 2007

Gridway in Santiago de Compostela

I had the good fortune to attend the 1st Iberian Grid Infrastructure Conference this week, in beautiful Santiago de Compostela, Galicia, Spain. (Having also attended the 1st German eScience Conference two weeks before, I had to skip OGF 20 in Manchester last week.) It was an excellent meeting in every respect.

The keynote today was given by Professor Ignacio Martin Llorente from the Universidad Complutense de Madrid. He gave a nice survey of scheduling strategies for distributed grid systems and also presented the GridWay metascheduler developed in his group. This Globus-based system (and dev.globus project) enables large-scale, reliable and efficient sharing of computing resources (clusters, computing farms, servers, supercomputers...), managed by different local resource management systems, such as PBS, SGE, LSF, Condor..., within a single organization or distributed across several administrative domains.

Writing this entry, I realize that I like GridWay for five reasons:

  1. It provides powerful capabilities, including interfaces to many local resource managers, rich scheduling policies, and interfaces from many schedulers (via so-called transfer queues).
  2. It is used by many people to solve real problems--including people who are not paid to use it! (Example users: UABGrid at the University of Alabama Birmingham, AstroGrid-D in Germany, projects in China and India.)
  3. The GridWay team understands Globus deeply, and leverages Globus mechanisms to great advantage--just as the designers of those mechanisms intended.
  4. The GridWay team has embraced the dev.globus community development process.
  5. The GridWay team (like D-Grid and other brave souls) has been prepared to resist EU pressure to use only European software. Instead, they believe (correctly in my view) that we all benefit from the development and use of international software.

I also found this interesting comparison of GridWay and the EGEE workload management system, which shows GridWay in a good light. (Admittedly it was written by the GridWay team!)

May 04, 2007

Globus at OGF 20, May 7-11

May 7-11 is the Open Grid Forum meeting in Manchester (OGF 20). I can't be there myself, as I was already in Europe this week (speaking at the very nice 1st German eScience Conference--more about that later). However, many colleagues from around the world will be there.

In particular, the Globus team will be participating in a broad set of sessions. There'll be a session in the Software Forum track on Tuesday, 2pm, which will include a discussion of what's new in the core software, and more in-depth information about GridWay, the latest Globus project, and OGSA-DAI's upcoming 3.0 release. There will also be some "meet the developer" times in the Exhibit Hall, so you can chat with Globus developers informally. Additional information on Globus at OGF is available online. Or, you can contact Jennifer Schopf.

April 18, 2007

Back in the USA

I've been offline for a while due to travel to South Africa (a wonderful week's vacation, and then a week at an excellent summer school organized by Judith Bishop), then a few days back, then a week in New Zealand (attending another fine meeting, the New Zealand Computer Science Students Research Conference), two days in Australia to see foolish family members who emigrated there from New Zealand, and then back to Berkeley to attend a DOE meeting on future computing and computational science research programs. Finally home tomorrow. My head is spinning ...

March 18, 2007

Off to Africa

I leave today for South Africa, where I will participate in the 4th IFIP School in Software Technology. A week's vacation in the bush west of Johannesburg, then a week at Gordon's Bay near Cape Town. It should be a fascinating trip.

March 08, 2007

Globus at sea

Italian Globus enthusiast Raffaele Montella sent me this picture of the sailboat Sarima V that he races out of Naples. Apparently he uses grid to good effect in his racing, using an online Globus-based forecasting service to obtain up-to-date weather forecasts prior to each race. Being a sailor myself, I can only applaud (and feel jealous).

He has a paper on this work in the upcoming Grid and Pervasive Computing conference, to be held in Paris May 2-4. Title: "Development of a GT4-based Resource Broker Service: an application to on-demand weather and marine forecasting." I'll post a pointer to the paper once it is online.

February 09, 2007

News from Puerto Rico

I spent last weekend in Puerto Rico, thanks to an invitation from Wilson Rivera to speak at a Workshop on Wireless Networking, Automated Information Processing, and Web & Grid Services held in conjunction with the International Symposium on Wireless Pervasive Computing.

The weather was of course pleasant (on a telecon with colleagues in Chicago, I mentioned it was 85 F; one replied "it's just like that here--but without the 80"). But I was particularly impressed with what I learned about work being done at the University of Puerto Rico Mayaguez, e.g., in Wilson's very dynamic Parallel and Distributed Computing Lab. For example, they're doing a lot of work with Globus, and building end-to-end systems to monitor Puerto Rico ecosystems. It's certainly a place to watch.

January 18, 2007

Supercomputing reaches YouTube

The organizers of SC'06, the big U.S. supercomputing conference, created a video celebrating supercomputing and computational science. This was recently posted to YouTube and is proving popular.

December 20, 2006

Posner on Second Life

Richard Posner, US judge well known for his books on various topics (and also a lecturer at U.Chicago law school, I discover), appeared on Second Life to promote his new book, Not a Suicide Pact. (I haven't read the book, but it is supposedly "controversial.")

I signed up to participate (there are limits on how many people can be in one place in SL), but couldn't make it. However, I read the transcript. What did I learn? Not that much:

  • There is a limited selection of suits available for avatars.
  • Not that many people turned up--the event was perhaps more PR than communication.
  • There are people starting to think about the legal implications of virtual worlds.

But I'm sorry I missed it.

December 17, 2006

Travel to Hong Kong

I flew to Hong Kong today ... it is strange: while one might expect to cross my favorite ocean to get from the US to China, in practice we never flew over open water. Instead, we had some beautiful views of Canada, Siberia, Mongolia, and China.

December 07, 2006

Grid in China

I'm participating today and tomorrow in the China-America Networking Symposium (CANS), which this year has a particular focus on grid. It's good to see friends from China, although several are not here because of visa problems. (A familiar, and painful, story.)

I've had the good fortune to visit China several times in recent years. In addition, we have hosted several visitors from China, and I also have some wonderful Chinese students. So I know a little about Chinese grid activities, which include several major deployments, including:

Continue reading "Grid in China" »

November 29, 2006

Visit to Louisiana

I visited the Center for Computation and Technology at LSU in Baton Rouge on Monday. With Ed Seidel's arrival, and much funding from the state, there is a rapidly growing group of smart and interesting people (e.g., Gabrielle Allen, Thomas Sterling, Tevfik Kosar, Dan Katz, and Jon McLaren) and also a growing scientific infrastructure and collection of strong projects.

Continue reading "Visit to Louisiana" »

November 17, 2006

Comments from Supercomputing

I'm back from the annual Supercomputing (SC) conference in Tampa. As always there was a lot of cool stuff going on: despite the name, this is just a great place to go to see innovation in technology and its applications. A few things that impressed me:

  • OSU's Introduce IDE for Globus Web Services (see picture) being used to create and deploy new services in a few minutes. All those creating services manually should immediately switch to using Introduce!

Continue reading "Comments from Supercomputing" »

November 10, 2006

Globus in Tampa

Many of the Globus team will be in Tampa, Florida, next week, for the SC'06 conference. Globus technologies and solutions will be discussed and demonstrated in many booths (I think I counted 23 last year) and in many posters and workshops (TeraGrid Institute, GCE'06, and VTDC'06), and a tutorial.

It's hard to capture all of the activity, but here is a partial list, including a schedule of talks to be presented at the Argonne National Laboratory booth.

October 02, 2006

History and Theory of Infrastructure

I'm just back from a workshop on "History and Theory of Infrastructure: Lessons for New Scientific Infrastructure" in Ann Arbor, Michigan, which brought together a fascinating group of social scientists and others to discuss "what practical lessons can the history, sociology, and experience of existing infrastructures offer to the imagination, implementation, and governance of cyberinfrastructure."

One delightful aspect of the meeting was meeting wonderful scholars that I had known previously only by reputation, such as Geoff Bowker, Leigh Star, Paul Duguid, and Christine Borgman, as well as some I already knew, such as Tom Finholt, Bob Kahn, Dan Atkins, and Bill Dutton, and others that I was glad to get to know.

There were many fascinating and wide-ranging discussions. My impressions:

  • Social scientists (or at least those at the University of Michigan's School of Information) organize great meetings. The organizers had clearly put a lot of thought into how to structure the meeting to ensure useful discussion, and they also had excellent social events!
  • The mode of discussion was quite different from what I expected. There were no formal presentations and little analysis, but many compelling anecdotes. At first, I found this strange, but then realized that "stories" are a compelling way of conveying insights. That got me thinking: what "stories" should we be telling people embarking on cyberinfrastructure projects, to help them avoid mistakes and achieve success?
  • Another thought that seemed interesting, at least to me: How about designing cyberinfrastructure to collect the information that social scientists require to evaluate its utility? Large systems like TeraGrid, Open Science Grid, Earth System Grid, caBIG, or GEON, and also smaller systems, could be viewed as experimental apparatus for social scientists. What instrumentation should we include in them to that end?

Overall, I didn't come away convinced that the history of existing infrastructures can help those building cyberinfrastructure: railroads and networks are very different things. But I became yet more convinced that social scientists have a lot to contribute to our understanding of how science and its tools will, and should, evolve in the 21st Century.

September 29, 2006

New Zealand Gets Wired

Having grown up in New Zealand, I am delighted that the country finally has a high-speed research and education network, the Kiwi Advanced Research and Education Network (KAREN). Officially launched on August 31, this network links all of the major research institutions via a 10 Gbit/sec backbone.

The creation of a decent research infrastructure for New Zealand has taken a while. It's always going to be a challenge linking a country in which just 4 million people are spread over a fairly large area. However, while New Zealand has long had a high penetration of Internet technologies, things have been made worse by a lack of investment in research over the past 20 years, and by policies that have encouraged competition rather than cooperation among research universities and laboratories. Fortunately, these policies seem to be changing.

I've been thinking about these things since 2004, when I visited New Zealand and gave a series of talks to people involved in planning research infrastructure. I quoted Woody Allen: "80% of success is showing up", and pointed out that while the world is shrinking rapidly, it is not doing so uniformly. I noted that in 2004, I could send 1 terabyte (1 trillion bytes) to Geneva from Chicago in 20 minutes, but it took me four hours to download 1 megabyte (1 million bytes) from Chicago to Wellington. This difference reflects what we might call the dirty underside of exponentials: if network speeds are doubling every nine months, then a mere 10-year lag in network deployment means you are 10,000x slower than the competition. And in a world where one's ability to compete depends on access to information and colleagues, that difference can be fatal. Thus it's exciting to see that New Zealand has caught up--at least for a while.
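For anyone who wants to check the arithmetic behind that 10,000x figure, here is a minimal sketch. The function name `lag_factor` is just my own label for illustration; the only inputs are the nine-month doubling time and the 10-year lag mentioned above.

```python
def lag_factor(lag_years, doubling_months=9.0):
    """How many times slower a network is if its deployment lags by lag_years,
    assuming speeds double every doubling_months (the assumption in the text)."""
    doublings = lag_years * 12 / doubling_months  # 10 years -> about 13.3 doublings
    return 2 ** doublings

# A 10-year lag works out to a factor of roughly ten thousand.
print(f"{lag_factor(10):,.0f}x slower")
```

Ten years is a little over 13 doublings, which is where the factor of roughly 10,000 comes from.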

I also spoke during that visit of the limiting effect of what I termed "PC Science," i.e., science scaled to fit on one's personal computer. Such limited approaches constrain the questions asked and the answers obtained. They can also (I fear) limit one's ability to enlist the best students, who are looking for things that are exciting and cutting edge. Fortunately, once you have high-speed networks, it becomes far more feasible to link users with clusters, supercomputers, databases, and collections of PCs to provide access to powerful computational capabilities. Thus I am also pleased to see my alma mater, the University of Canterbury, acquire a powerful supercomputer.

August 22, 2006

VO in Prague

I'm at the XXVIth Congress of the International Astronomical Union in Prague (a wonderful place), the triennial astronomy extravaganza. While the press coverage is all about whether Pluto gets to stay a planet (it seems that it will, sort of), a lot of the conference content is about virtual observatories (VOs). (I gave an invited talk on "Grid Technology and Multidisciplinary Science," which looked at connections between Grid and the VO world.)

The astronomy community has been pioneering "service-oriented science" techniques for some time: see the nice article by Gray and Szalay for the basics. While the fact that their data is fairly simple and of no commercial value simplifies life relative to some other disciplines, it is still remarkable what they have achieved. Basically, they are developing services that provide access to a growing number of digital sky surveys at different wavelengths. Users can then access these services to look for (say) objects that are visible in the infrared but not the optical (=brown dwarfs), to stack up multiple instances of the same sort of object (e.g., quasars) to improve signal-to-noise ratios, etc., all without leaving their desks. Furthermore, someone who develops an interesting analysis technique can in turn publish that as a service.
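As a rough illustration of why stacking helps, here is a toy sketch (not any VO service's actual code). It assumes purely Gaussian noise and uses simulated numbers rather than real survey data; the function name and parameters are my own invention. Averaging N independent noisy frames should cut the noise by about the square root of N.

```python
import math
import random
import statistics

def stacked_noise(n_images, n_pixels=2000, sigma=1.0, seed=0):
    """Noise (standard deviation) remaining after averaging n_images
    simulated frames, each with independent Gaussian noise of width sigma."""
    rng = random.Random(seed)
    stacked = []
    for _ in range(n_pixels):
        samples = [rng.gauss(0.0, sigma) for _ in range(n_images)]
        stacked.append(sum(samples) / n_images)  # average the frames pixel by pixel
    return statistics.pstdev(stacked)

single = stacked_noise(1)
stack = stacked_noise(100)
print(f"noise in 1 frame:     {single:.3f}")
print(f"noise in 100 stacked: {stack:.3f} (theory: about {1 / math.sqrt(100):.3f})")
```

With 100 frames the noise drops about tenfold, so a source too faint to see in any single image can stand out in the stack.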

There are by now over a dozen VO projects around the world, and dozens of sky surveys are online. These sky surveys currently total tens of terabytes (10^12 bytes) of data; the next generation of instruments will generate petabytes (10^15 bytes) of data. These developments are rapidly transforming astronomy, and they have already led to new scientific discoveries.

What makes this all possible is a small set of relatively simple but very important conventions. The International Virtual Observatory Alliance (IVOA), formed in June 2002, has played an important role in developing these.

We should all be studying how this community works, and working to replicate their successes elsewhere.