Disclaimer

  • The postings on this site are my own and don’t necessarily represent Microsoft's positions, strategies or opinions.

    Science

    May 07, 2009

    On Getting The “Right” Answer

    Update:  Jim Pool reminds me of the following book, which touches on this issue.

    The parallel application contains millions of lines of code, combining multiple models of physical, engineering, biological, social and/or economic processes, operating over temporal and spatial scales that span ten orders of magnitude. It was written by tens or even hundreds of graduate students, post-doctoral associates, software developers and yes, even a few professors, over a decade. It involves numerical libraries and functions from diverse research groups and companies, and a single execution requires thousands of hours on tens of thousands of processor cores. In short, it's a typical example of an extreme scale high-performance computing code.

    And The Answer Is …

    What is the probability that any execution of this code produces the "right" answer? Does a "right" answer even exist? If so, how might we know? Or, is this a nonsensical question?

    These are not simply Gedanken experiments in software engineering or hardware reliability, nor are they just epistemological questions about the philosophy of knowledge. Rather, they are very practical and real questions about the nature of extreme scale computational science. They are the essence of verification and validation (V&V) processes, and we should be much more rigorous and systematic about their application.

    The central lesson of software engineering is that regardless of the rigor of design processes, testing methodologies and boundary condition specifications, large applications, regardless of type, will contain multiple errors. We have all seen and experienced examples, from the blue screen of death on personal computers to the loss of the Mars Climate Orbiter due to mixed unit calculations. Computational science applications have no special dispensation to escape this destiny.

    Second, because the raison d'être for multidisciplinary applications is enabling researchers to gain insight into complex and often poorly understood phenomena, testing them can be problematic, as the answers are often known only for simplified, model problems or boundary conditions. What constitutes a rigorous test suite when little or no experimental data is available for independent comparison?

    Third, there's the small matter of numerical stability. The IEEE floating point standard balances range (exponent) and precision (mantissa) in a fixed number of bits, and necessarily approximates many real numbers. After large numbers of floating point operations, even a stable algorithm and a well-conditioned problem will contain some degree of error. Rarely do we bound the possible error via interval analysis.
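To make the floating point argument concrete, here is a minimal sketch in Python (3.9 or later, for `math.nextafter`). It is an illustration only, not taken from any production code: the function names `naive_sum` and `interval_sum` are hypothetical, and the interval bound is a deliberately crude one that widens each partial sum outward by one unit in the last place. It adds the double nearest to 0.1 a million times, compares the result with the correctly rounded sum from `math.fsum`, and computes an enclosure guaranteed to contain the exact sum of the stored values.

```python
import math


def naive_sum(values):
    """Left-to-right accumulation; every addition rounds to nearest."""
    total = 0.0
    for v in values:
        total += v
    return total


def interval_sum(values):
    """Crude interval analysis: push each partial sum outward by one ulp.

    The returned (lo, hi) pair encloses the exact sum of the stored doubles,
    at the cost of an interval that grows with the number of additions.
    """
    lo = hi = 0.0
    for v in values:
        lo = math.nextafter(lo + v, -math.inf)  # round the lower bound down
        hi = math.nextafter(hi + v, math.inf)   # round the upper bound up
    return lo, hi


values = [0.1] * 1_000_000
naive = naive_sum(values)
reference = math.fsum(values)  # correctly rounded sum of the stored values
lo, hi = interval_sum(values)

print("naive sum      :", repr(naive))
print("math.fsum      :", repr(reference))
print("accumulated err:", abs(naive - reference))
print("interval width :", hi - lo)
```

Even this well-conditioned sum drifts away from the correctly rounded answer, and the interval width shows how pessimistic a guaranteed bound becomes after a million operations. That pessimism is one plausible reason such bounds are so rarely computed.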

    Finally, there's the small matter of hardware errors, an increasingly common phenomenon for large memories constructed from high-capacity DRAMs and for microprocessor datapaths. We have been taught that error correcting codes correct DRAM bit errors. In truth, the standard SECDED code only corrects single bit errors and detects double bit errors. Burst DRAM errors (triple bit or more) are not corrected and are increasingly common. Some iterative algorithms can recover from bit errors, converging to the "correct" answer; others cannot.
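The limits of SECDED can be seen in a toy version. The sketch below uses an extended Hamming (8,4) code, which protects only four data bits; real memory systems typically use a wider code such as (72,64), so treat this as a small-scale illustration rather than a model of any particular DRAM controller. The function names and bit layout are assumptions made for the example.

```python
def encode(nibble):
    """Encode 4 data bits (0..15) as an 8-bit extended Hamming codeword.

    Index 0 holds the overall parity bit; indices 1..7 are the classic
    Hamming(7,4) positions, with parity at 1, 2, 4 and data at 3, 5, 6, 7.
    """
    code = [0] * 8
    code[3] = nibble & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # make the total number of ones even
    return code


def decode(received):
    """Return (status, nibble); nibble is None when the word is rejected."""
    code = list(received)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of the positions holding a one
    parity_bad = sum(code) % 2 == 1
    if syndrome == 0 and not parity_bad:
        status = "ok"
    elif parity_bad:
        # Odd number of flips: assume exactly one and flip it back.
        # A syndrome of 0 here points at the overall parity bit (index 0).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a nonzero syndrome: detect, do not correct.
        return "double-error detected", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


def flip(code, positions):
    """Return a copy of the codeword with the given bit positions inverted."""
    damaged = list(code)
    for pos in positions:
        damaged[pos] ^= 1
    return damaged


word = encode(0b1010)
single = decode(flip(word, [5]))        # one flipped bit
double = decode(flip(word, [2, 6]))     # two flipped bits
triple = decode(flip(word, [3, 5, 6]))  # three flipped bits
print(single, double, triple)
```

The single flip is repaired and the double flip is flagged. The triple flip is the troubling case: the decoder reports a successful correction while returning the wrong data, which is exactly the kind of silent error that never appears in a log.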

    Computation As Experiment

    Are you afraid? We all should be. It is time to embrace the scientific process for computational science. We must view the execution of a large, multidisciplinary code as what it is – an experiment, with all the possible error sources attendant with any physical experiment. This includes repeating the experiment (computation) to determine confidence intervals on the answer, conducting perturbation studies to determine the sensitivity of the answer to environmental (hardware and software) conditions, identifying sources of experimental bias and defining the experiment rigorously for independent verification.
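What treating a computation as an experiment might look like can be sketched with a toy model. Everything here is an assumption chosen for illustration: the `simulate` function is a stand-in (the logistic map in its chaotic regime), the perturbation size and run count are arbitrary, and the confidence interval uses a simple normal approximation. The idea is to repeat the run with slightly perturbed inputs and report the spread rather than a single number.

```python
import math
import random
import statistics


def simulate(initial, steps=60, rate=3.9):
    """Hypothetical stand-in for an expensive model: iterate the logistic map."""
    x = initial
    for _ in range(steps):
        x = rate * x * (1.0 - x)
    return x


def perturbation_study(base=0.3, relative_noise=1e-10, runs=200, seed=0):
    """Repeat the 'experiment' with tiny input perturbations and summarize it."""
    rng = random.Random(seed)  # fixed seed so the study itself is repeatable
    results = [
        simulate(base * (1.0 + rng.uniform(-relative_noise, relative_noise)))
        for _ in range(runs)
    ]
    mean = statistics.mean(results)
    # Approximate 95% confidence interval on the mean (normal approximation).
    half_width = 1.96 * statistics.stdev(results) / math.sqrt(runs)
    return {
        "mean": mean,
        "ci95": (mean - half_width, mean + half_width),
        "min": min(results),
        "max": max(results),
    }


study = perturbation_study()
print("mean of runs :", study["mean"])
print("95% CI       :", study["ci95"])
print("observed range:", (study["min"], study["max"]))
```

For a sensitive model like this one, input changes in the tenth decimal place can produce visibly different answers, so the range and interval say far more than any single run. A real study would perturb compiler options, processor counts and library versions as well as inputs.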

    In the days of analog scientific computing via slide rule, we all understood that the computation was an approximation. It's time to relearn that lesson.

    February 02, 2009

    Publication Quarks

    I write a quarterly column for the Computing Research Association (CRA)'s newsletter, Computing Research News (CRN). The following is a preview of my upcoming column, which will appear in the March 2009 issue.

    Over the past thirty years, I have accumulated the common artifact of an academic research career – bookshelves overflowing with research journals and conference proceedings. Each time I pull an old and yellowing volume from my shelves, it is simultaneously nostalgic and thought provoking to read a few randomly selected articles. Not only does this stroll down memory lane illuminate how far we have come, both technologically and theoretically, it shows how profoundly the publication culture of our field has changed.

    Conference Hegemony

    Not that many years ago, CRA published a "best practices" memorandum entitled, "Evaluating Computer Scientists and Engineers for Promotion and Tenure." At a time when many departments were struggling to make the case to their science and engineering colleagues that conference publications mattered, this memorandum demonstrated that computing conference publications were of a quality comparable to those in archival journals.

    The perception battle won, is all right with our publication world? Perhaps, but I suspect not. Our prestigious conferences have become the moral equivalent of highly selective journals in other fields. The computing conference review process is rigorous and highly selective, and polished results are required for publication. In many of our sub-disciplines, the conference paper is the final result. There is no expectation that the preliminary results will be expanded, augmented and published in a journal. Consequently, many – arguably most – of our journals have receded in significance. I believe this is a regrettable and worrisome development.

    First, it has truncated the continuum of publication options. In most disciplines, conferences are the venue where late breaking results, thought-provoking theories and controversial ideas are aired and debated. Many of these are later proven incorrect, or are validated and expanded with additional data, but the free exchange of ideas stimulates research and innovation. At the risk of sounding like an "old geezer," I encourage you to read some old conference proceedings. It is illuminating to see how many of our conferences have evolved from idea exchanges to publication venues.

    <BEGIN Old Geezer Story>

    Recently, I told a group about one of my undergraduate experiences – being caught in an unexpected thunderstorm with my compiler under my arm. That would be the box of punched cards containing my compiler. I spent most of the evening with an iron and ironing board, flattening my cards very carefully, so the card reader could process them and then punch a new, undamaged deck. My audience looked at me as if I were a walking dinosaur. Now back to our regularly scheduled programming.

    <END Old Geezer Story>

    Our emphasis on the conference cycle has also encouraged and rewarded production of publishing "quarks" – units of intellectual endeavor that can be generated, summarized and reviewed in a calendar year. We now see new faculty and research staff candidates with more publications than were once common in promotion and tenure dossiers.

    Do not misunderstand. I am not suggesting that our current conference-centric culture is all bad, merely that we should be more thoughtful regarding the timescales and range of our publication options. I would also humbly suggest that we consider how this approach shapes the types and kinds of research conducted. We all know that quality trumps quantity, and that research results have a wide range of natural sizes and time scales.

    Journal Revival

    What then becomes of our often languishing journals? Are they a hidebound and archaic notion, doomed to irrelevance by ubiquitous electronic access? To be sure, the nature of publication is in flux in both popular and professional culture, with the physical artifacts likely to disappear. However, the notions of scholarly review and archival recording of research are independent of these artifacts.

    I believe we need to restore journals to their rightful place as the lasting archives of scientific knowledge. This will require a cultural shift, making our conferences the harbingers of extended, rigorous publication in journals. Equally importantly, it will require us to review those journal submissions thoughtfully and with alacrity.

    As anyone who has ever been the editor of a computing journal knows, obtaining timely reviews is challenging. Even with gentle (and sometimes not so gentle) nagging, the weeks can stretch to months; the months sometimes turn to years. Contrast this with other technical disciplines where submissions can be reviewed and published in weeks or months. Is it any wonder that paper authors in our field eschew journals for conferences with known publication dates?

    As a discipline, we benefit from the entire continuum of venues for communicating research ideas and results, from informal workshops and conferences to research surveys and expanded publication in archival journals. Let's recognize and embrace the distinct and important roles that each plays in the free and fruitful exchange of research ideas.

    January 25, 2009

    The (Scientific) Good News

    I was ten years old when I saw the light – the scientific light. For me, it was a Road to Damascus experience, catalyzed by a single event. My grade school teacher instructed each of the students to select a single, thin volume from the science encyclopedia and begin reading quietly at his or her desk. In retrospect, I realize it was probably the desperate act of an overwhelmed teacher who simply wanted a bit of quiet time. For me, though, it was a transformative revelation, a portal on a world of rationality, cause and effect and experiment-driven understanding.

    For all those nagging questions, there was a systematic, repeatable mechanism to obtain and verify answers. The world could make sense, and the unknown was knowable. There were other people like me, and I could dream of being one of them – a scientist! It was thrilling and wondrous, and I knew without a shadow of a doubt that I had found the passion of my life.

    The Universal Passion

    Over the past 40+ years, that passion has led me to extraordinary and unexpected places. Yet across all that diversity, I have observed a universal behavioral constant, one that transcends national borders, cultures and languages. Scientists and scientific thinking are the same everywhere. They see the world through the same eyes and value the same things, a common approach to problem solving and reasoning. Above all, though, they share the passion and the curiosity, the unrelenting desire to know, to understand.

    What drives us? It's not tenure; it's not publication; it's not research funding. Those are artifacts. It's not even fame, fortune or glory, though a few scientists seek those too. Rather, it's the desire to know, to understand, to add a small piece to the varied mosaic that is our limited but expanding human knowledge of this vast and varied universe. Depending on your assessment of the Fermi Paradox, perhaps it's to be the first sentient being in this brane to understand a small bit of its workings. If knowledge is your passion, that is reward enough.

    Childlike Curiosity

    Soon after I had completed my Ph.D., I read Peter Medawar's great book, Advice to a Young Scientist, and resonated with his insightful words:

    I am often asked, "What made you become a scientist?" But I can't stand far enough away from myself to give a really satisfactory answer, for I cannot distinctly remember a time when I did not think that a scientist was the most exciting possible thing to be.

    I am no behavioral psychologist, but I suspect that all children are born with the insatiable curiosity that sustains scientific inquiry. All too often, though, I fear that our educational system punishes curiosity and rewards conformity. Only a small fraction remains sufficiently iconoclastic and self-confident to resist, asking those seemingly annoying questions that defy authority and drive discovery.

    Why? It's a simple but profound question.

    "Daddy, why is the sky blue?" It's Rayleigh scattering, of course!

    "Mommy, why is it cold in winter?" It's the axial tilt of the Earth! (Sadly, a stunning fraction of North American college graduates believe it's because the Earth is closer to the sun during the summer.)

    "Daddy, why are insects not as big as elephants?" It's about surface area, volume and energy, as Haldane explained in his delightful essay, On Being the Right Size. (Ignore the politics at the end.)

    The answers to simple questions often expose deep truths. Encourage and preserve the curiosity of children. Share the wonder; share the passion; share the good news. Scientists and children – they are more alike than different.

    May 10, 2008

    HPC and Climate Change: Senate Hearing

    On Thursday, May 8, I testified to the U.S. Senate Committee on Commerce, Science, and Transportation. The full committee hearing was on improving the "Capacity of U.S. Climate Modeling for Decision Makers and End-Users." The other members of the hearing panel were

    Jim Hack and I represented the computing and computational science issues, and the other four focused on the climate aspects. Within a few weeks, our written testimony will be posted on the Committee's hearing page, and in due time (many months), our oral testimony will appear in the Congressional Record.


    March 12, 2008

    Computing is Beauty; Computing is Truth

    In a recent invited essay for the Society for Industrial and Applied Mathematics (SIAM), which appeared in the March 2008 issue of SIAM News, I wrote about the power of computing as an intellectual amplifier and the beauty of computing as an illuminator of truth. Elements of the essay were adapted from my July 2003 and May 2004 testimony to the U.S. House Committee on Science and Technology on the status of high-performance computing.
