When modifying software, or porting it from one system to
another, one often ends up checking for correctness (and attempting to diagnose
errors) by comparing the execution of a new program with a reference version
that is known to work. My Australian friend and colleague David Abramson developed a
tool called Guard to automate this process. With Guard, you specify the variables you want to
monitor, and when, and then fire up the new and reference versions of your software.
Guard monitors the specified variables, and notifies you when their values
differ.
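To make the idea concrete, here is a minimal Python sketch of that kind of comparison. It is not Guard's interface or implementation: the function names (`first_divergence`, `reference_sum`, `ported_sum`), the per-step dictionaries of variable values, and the tolerance are all assumptions made up for illustration. Each version reports the monitored variables at every step, and the comparison stops at the first step where they disagree.

```python
import math


def first_divergence(reference_trace, new_trace, tolerance=1e-9):
    """Return (step, name, ref_value, new_value) for the first mismatch, or None.

    Each trace is an iterable of dicts mapping a monitored variable name
    to its value at that step (a hypothetical format, not Guard's).
    """
    for step, (ref_vars, new_vars) in enumerate(zip(reference_trace, new_trace)):
        for name in ref_vars:
            ref_value, new_value = ref_vars[name], new_vars[name]
            if not math.isclose(ref_value, new_value,
                                rel_tol=tolerance, abs_tol=tolerance):
                return step, name, ref_value, new_value
    return None


def reference_sum(values):
    """Reference version: running total, reported after every step."""
    total = 0.0
    for v in values:
        total += v
        yield {"total": total}


def ported_sum(values):
    """'Ported' version with a deliberate bug: it skips the value at index 2."""
    total = 0.0
    for i, v in enumerate(values):
        if i != 2:
            total += v
        yield {"total": total}


data = [1.0, 2.0, 3.0, 4.0]
result = first_divergence(reference_sum(data), ported_sum(data))
print(result)
```

Knowing the step and the variable where the two versions first part ways is what makes the bug easy to find: you look at that step, not at the whole run.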
John Michalakes and I had the occasion to work with Guard back in 1996
when we applied it in a project developing a parallel mesoscale weather model,
MM5. It was spookily wonderful to see differences between the parallel and
sequential implementations become visible in real time in a 3-D visualization. (The figure shows a 2-D plot of differences, which is also useful but less beautiful.) It then became extremely easy to fix those problems. We wrote a paper
together on this work, which won a best paper award at SC'96, due I think to David's
presentation skills--and the videos.
This technology has now been licensed by Cray for use in their new Cascade program. It's exciting to see the technology making its way into mainstream use.