Disclaimer

  • The postings on this site are my own and don’t necessarily represent Microsoft's positions, strategies or opinions.

    Technology

    July 01, 2009

    IT’s Not Easy Being Green (Actually, It Is Easy If You Try)

    I can hear the groans already at the bad pun embodied in the title of this essay. (Here's the cross-cultural deconstruction. First, there's the capitalized "IT" for information technology. Then, there's the reference to energy efficient computing ("green"). However, the cultural reference is to the Muppet Show and Kermit the Frog, whose refrain often was, "It's not easy being green.")

    Contrary to the pun, actually, it is easy to be green, if one wants to do so. This is a point I tried to make in an interview that is part of a series InsideHPC has begun on green computing. In the interview, I discussed the challenges and opportunities associated with energy efficient computing at scale, whether operating large-scale data centers or petascale high-performance computing systems.

    In the interview, I pointed out that the most obvious way to reduce computing-related energy consumption is simply to power down and turn off those systems not being used -- QED. However, that is insufficient alone. After all, one presumably wants to do some computing. Thus, systems and infrastructure must be designed for energy and operational efficiency and must be managed appropriately during operation.

    As a practical matter, one really wants to maximize a ratio

    (Effective operations)/(Cost times Watts)

    Simply put, the goal is to maximize the number of effective operations relative to cost and energy consumption. This convolves many ideas, including the match of the application to the system (application execution efficiency), the system design and architecture, energy and power supply efficiency, packaging and cooling overhead, market costs for power and hardware and the costs of people and money.
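
    To make the ratio concrete, here is a minimal Python sketch. The function name, the efficiency parameter and every number in it are hypothetical, chosen only for illustration; none of them is a measurement of a real system.

```python
def figure_of_merit(peak_ops_per_sec, app_efficiency, cost_dollars, watts):
    """Return effective operations per (dollar * watt).

    app_efficiency is the fraction of peak that the application actually
    sustains (0.0 to 1.0). It is an assumed stand-in for "application
    execution efficiency" and is not defined this way in the essay.
    """
    effective_ops = peak_ops_per_sec * app_efficiency
    return effective_ops / (cost_dollars * watts)


# Two hypothetical systems; all values are illustrative assumptions.
big_fast = figure_of_merit(
    peak_ops_per_sec=1.0e15, app_efficiency=0.10,
    cost_dollars=1.0e8, watts=2.0e6)
small_lean = figure_of_merit(
    peak_ops_per_sec=1.0e13, app_efficiency=0.50,
    cost_dollars=1.0e6, watts=2.0e4)

print(small_lean > big_fast)
```

    In this made-up comparison the smaller system scores higher on the ratio. Whether that matters depends on whether it can run the intended workload at all, which is why the match of the application to the system is part of the picture.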

    Microsoft is absolutely committed to green computing, across its entire range of products and infrastructure, and a big portion of my team's research is related to developing more energy efficient computing systems at scale.

    June 20, 2009

    The Fallacy of Rankings

    N.B. I also write for the Communications of the ACM (CACM). The following essay recently appeared on the CACM blog.

    The world's tallest mountain (Everest), the biggest desert (Sahara), the richest person (Bill Gates), the fastest airplane (SR-71) and even the world's champion hotdog eater (59 hotdogs) – we are fascinated by records and rankings. The good folks who operate the Guinness World Records do a brisk business chronicling our interests in the sometimes unusual aspects of human endeavors and their rankings.

    At this juncture, you might be questioning the relationship between hotdog eating contests and the nominal topic of these essays: high-performance computing (HPC). Or, you may simply be, as I am, dumbfounded and awed that any human could eat 59 hotdogs. This is a prodigious feat of perhaps questionable value, but I digress.

    Top 500 Ranking

    The latest, semi-annual Top 500 ranking of the world's fastest supercomputers will be revealed at the International Supercomputing Conference (ISC) in June. Last year, two systems broke the petascale barrier: a hybrid cluster that pairs commodity microprocessors with Cell accelerator processors derived from game-console chips (LANL's Roadrunner) and a cluster of commodity microprocessors (ORNL's Jaguar). As always, we can expect the latest announcement to garner interest among the technological community, receive coverage in the popular press, and secure bragging rights for the organizations, vendors and countries involved. I await it eagerly myself.

    However, there are many figures of merit for high-performance computing systems, including suitability for target workloads, total cost of ownership (TCO), energy consumption and efficiency, reliability, productivity and ease of use, the richness of available software tools, extensibility and replication across markets, funding models and market viability. Many of these are difficult to quantify, and any multivariate ranking based on these may well differ from that derived from performance on a single technical computing benchmark.

    Partial Orders Matter

    Though rankings (total orders) are intuitive and easily explained – a valuable attribute in today's attention-constrained society – they rarely capture the true complexity of multidimensional comparison. Mathematically, this is simply an argument for order theory and partially ordered sets (posets), recognizing that in a multivariate (multidimensional) comparison, some pairs of elements may well be unordered (incomparable) or equivalent.

    Is an inexpensive, energy efficient computer system superior or inferior to an expensive, high-performance system? The answer, of course, depends on the intended use. A smart phone and a supercomputer both have value, but one is a poor substitute for the other.
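
    The point about unordered elements can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three metrics, the `System` type, the `dominates` and `compare` helpers and all of the numbers are hypothetical. It uses a simple dominance rule (at least as good on every metric, strictly better on one), which is one common way to define a partial order over metric vectors, not the only one.

```python
from typing import NamedTuple, Optional


class System(NamedTuple):
    name: str
    performance: float  # operations per second; higher is better
    cost: float         # dollars; lower is better
    watts: float        # power draw; lower is better


def dominates(a: System, b: System) -> bool:
    """True if a is no worse than b on every metric and better on at least one."""
    no_worse = (a.performance >= b.performance
                and a.cost <= b.cost
                and a.watts <= b.watts)
    better_somewhere = (a.performance > b.performance
                        or a.cost < b.cost
                        or a.watts < b.watts)
    return no_worse and better_somewhere


def compare(a: System, b: System) -> Optional[str]:
    """Name of the dominating system, or None if neither dominates the other."""
    if dominates(a, b):
        return a.name
    if dominates(b, a):
        return b.name
    return None  # the pair is unordered or equivalent


# Hypothetical metric vectors, for illustration only.
phone = System("smart phone", performance=1.0e10, cost=5.0e2, watts=5.0)
supercomputer = System("supercomputer", performance=1.0e15, cost=1.0e8, watts=2.0e6)
old_phone = System("older phone", performance=5.0e9, cost=6.0e2, watts=6.0)

print(compare(phone, supercomputer))  # None
print(compare(phone, old_phone))      # smart phone
```

    The phone and the supercomputer come back unordered because each wins on at least one metric, while the two phones can be ranked. Any single-number ranking has to break such ties by choosing weights, and that choice is exactly where the intended use enters.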

    As valuable as ranking the world's fastest machines is, I believe we would benefit even more from publishing a vector of metrics regarding each HPC system, and focusing less on the extremal points of the poset. Many years ago, the Perfect Club (PERFormance Evaluation by Cost-effective Transformations) benchmarks were created to facilitate one variant of such a multivariate analysis. The SPEC benchmarks are another example from the commercial space.

    These multivariate analyses are not easy. They require much more work than univariate rankings, some of the data is not easily obtained, and some information is viewed as competitive. That does not mean we should not try again to define more diverse evaluation criteria.

    Sixty-Four Hotdogs

    Despite the fascination of ultrafast computing systems, the mind cannot help but return to hotdog eating. Incredibly, not one but two individuals ate 59 hotdogs in the allotted time at this year's contest. This necessitated a five-hotdog "eat off" to determine a winner, bringing the winner's total to sixty-four. As a computer scientist, I recognize sixty-four as an interesting power of two. Hotdogs and supercomputers: both are driven by human competitiveness.

    June 14, 2009

    Microsoft Extreme Computing Group (XCG)

    Here is a small item that may be of interest: the creation of the Extreme Computing Group (XCG) at Microsoft. XCG was formed in June 2009 with the goal of developing radical new approaches to ultrascale and high-performance computing hardware and software. The group's research activities include work in computer security, cryptography, operating system design, parallel programming models, cloud software, data center architectures, specialty hardware accelerators and quantum computing.

    June 09, 2009

    HPC: Making a Small Fortune

    N.B. I also write for the Communications of the ACM (CACM). The following essay recently appeared on the CACM blog.

    There is an old joke in the high-performance computing community that begins with a question, "How do you make a small fortune in high-performance computing?" There are several variations on the joke, but they all end with the same punch line, "Start with a large fortune and ship at least one generation of product. You will be left with a small fortune." Forty years of experience, with companies large and small, has confirmed the sad truth of this statement.

    As we all know, the computing industry is extremely competitive, and new trends and technologies have repeatedly had a transformative effect. One need look no further than the regular inductees to the Dead Supercomputing Society to see the devastating effects of the ongoing attack of the killer micros on the market for custom high-performance computing system designs. The microprocessor performance increases of the past thirty years, driven by decreasing feature sizes, higher clock rates and greater architectural complexity, have repeatedly dashed the hopes of many high-performance computing entrepreneurs.

    The market lesson is that one false step inevitably leads to failure, particularly for startup companies struggling to establish a new niche in the face of commodity economics. It has never been truer than in today's economy, where potential buyers are retrenching and evaluating each purchase with a discriminating and sometimes jaundiced eye. Recently, the high-performance computing industry lost several established companies to merger and acquisition, due to weak market positions. We have also seen startup companies fail due to missteps and financial pressures.

    This reminds me of another old analogy, which compares building computer hardware and software to playing pinball – one's reward for playing well is the opportunity to keep playing via free games. The punishment for not playing well is equally clear; one must continue to insert quarters into the machine. Venture capitalists know this well, as they evaluate the pinball skills of those pitching business plans.

    Without doubt, we need a new generation of high-performance computing systems, from consumer devices to exascale platforms, to drive innovation, improve health care, manage critical infrastructure and ensure safety and defense. The question is whether the rise of multicore and manycore chips and explicit parallelism in the commodity microprocessor and GPU markets will finally change a few of the rules of the pinball game, via a combination of consumer economics pressures and technological need, the latter due to clock frequency and power limitations.

    I believe we are at an inflection point, where new approaches must both survive and flourish if we are to continue to deliver higher performance in effective and reasonable ways. It is worth remembering that Andy Grove's famous comment, "Only the paranoid survive," is but the trailing phrase in a larger, more perspicuous comment, "Success breeds complacency. Complacency breeds failure. Only the paranoid survive."

    We cannot be complacent about the future, especially now. We must continue to innovate, even if – especially if – that means adding quarters to the innovation machine.

    May 31, 2009

    Nothing Left But the Smile …

    Long ago, we in HPC recognized the inevitability of the Attack of the Killer Micros, and with few exceptions, all of today's HPC systems are based on some variant of commodity microprocessors and their commodity cousins, GPUs. A similar revolution swept the secondary storage market. After all, the I in RAID stands for inexpensive, a synonym for commodity.

    Like the Cheshire Cat in Alice's Adventures in Wonderland, it seems increasingly clear that in the high-performance, low-latency interconnection network space, we will be left with nothing but the smile. Simply put, the last remaining non-commodity component of HPC clusters – the high-performance interconnect – is in grave danger, due to the global economic downturn and the price-performance pressures of commodity Ethernet.

    Remember Coaxial Cables and BNC?

    The old geezers among us, including yours truly, remember when Ethernet meant coaxial cables and CSMA/CD at 10 megabits/second. (Here's a shout out to my colleague Chuck Thacker, one of the co-inventors of Ethernet.) Since then, we have seen many generations of advances in transport fabrics, switches, routers and speeds.

    One gigabit Ethernet is now a commodity item, found on almost all PC motherboards; 10 gigabit Ethernet is the high-bandwidth standard; and 40 gigabit and 100 gigabit Ethernet standards are under development by the IEEE. InfiniBand is one of the last major competitors to 10 gigabit Ethernet, and even the InfiniBand vendors are increasingly offering Ethernet compatibility (i.e., so-called converged fabrics).

    Cloud Data Centers and HPC

    I invite you to ponder your response to the following question. What is the difference between a megascale data center and a petascale computing system? Increasingly, the answer is "not much, but a few elements of the software stack."

    Of course, this need not be the answer, but it likely will be unless we change our research and development strategies and also our procurement models. Ironically, megascale data centers could also benefit from lower latency, higher bandwidth communication fabrics with scalable bisection bandwidth. It is time we bring these two together, for the HPC community could leverage the economics of megascale data centers and their needs.