Disclaimer

  • The postings on this site are my own and don’t necessarily represent Microsoft's positions, strategies or opinions.

    Cloud Computing

    July 01, 2009

    IT’s Not Easy Being Green (Actually, It Is Easy If You Try)

    I can already hear the groans at the bad pun embodied in the title of this essay. (Here's the cross-cultural deconstruction. First, there's the capitalized "IT" for information technology. Then, there's the reference to energy efficient computing ("green"). Finally, the cultural reference is to the Muppet Show and Kermit the Frog, whose refrain often was, "It's not easy being green.")

    Contrary to the pun, actually, it is easy to be green, if one wants to do so. This is a point I tried to make in an interview that is part of a series InsideHPC has begun on green computing. In the interview, I discussed the challenges and opportunities associated with energy efficient computing at scale, whether operating large-scale data centers or petascale high-performance computing systems.

    In the interview, I pointed out that the most obvious way to reduce computing-related energy consumption is simply to power down and turn off those systems not being used -- QED. However, that alone is insufficient. After all, one presumably wants to do some computing. Thus, systems and infrastructure must be designed for energy and operational efficiency and must be managed appropriately during operation.

    As a practical matter, one really wants to maximize a ratio

    (Effective operations)/(Cost times Watts)

    Simply put, the goal is to maximize the number of effective operations relative to cost and energy consumption. This convolves many ideas, including the match of the application to the system (application execution efficiency), the system design and architecture, energy and power supply efficiency, packaging and cooling overhead, market costs for power and hardware and the costs of people and money.
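    As a rough illustration of the ratio above, here is a minimal Python sketch that ranks two systems by effective operations per (cost times watts). The `System` class, the `efficiency` function, the system names and every number are hypothetical, chosen only to show the arithmetic; they are not measurements of any real machine.

```python
from dataclasses import dataclass


@dataclass
class System:
    name: str
    effective_ops_per_sec: float  # sustained application ops/s, not peak
    cost_dollars: float           # total cost over the period considered
    watts: float                  # average power draw, cooling included


def efficiency(system: System) -> float:
    """Effective operations / (cost * watts), the ratio from the text."""
    return system.effective_ops_per_sec / (system.cost_dollars * system.watts)


# Hypothetical numbers, for illustration only.
candidates = [
    System("big_iron", effective_ops_per_sec=4.0e14, cost_dollars=2.0e7, watts=2.0e6),
    System("commodity", effective_ops_per_sec=1.5e14, cost_dollars=5.0e6, watts=1.0e6),
]

# The system with the highest ratio wins, even if its raw throughput is lower.
best = max(candidates, key=efficiency)
```

    In this made-up comparison the slower system comes out ahead because its cost and power draw fall faster than its throughput, which is the point of dividing by both.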

    Microsoft is absolutely committed to green computing, across its entire range of products and infrastructure, and a big portion of my team's research is related to developing more energy efficient computing systems at scale.

    June 14, 2009

    Microsoft Extreme Computing Group (XCG)

    Here is a small item that may be of interest: the creation of the Extreme Computing Group (XCG) at Microsoft. XCG was formed in June 2009 with the goal of developing radical new approaches to ultrascale and high-performance computing hardware and software. The group's research activities include work in computer security, cryptography, operating system design, parallel programming models, cloud software, data center architectures, specialty hardware accelerators and quantum computing.

    May 31, 2009

    Nothing Left But the Smile …

    Long ago, we in HPC recognized the inevitability of the Attack of the Killer Micros, and with few exceptions, all of today's HPC systems are based on some variant of commodity microprocessors and their commodity cousins, GPUs. A similar revolution swept the secondary storage market. After all, the I in RAID stands for inexpensive, a synonym for commodity.

    Like the Cheshire Cat in Alice's Adventures In Wonderland, it seems increasingly clear that in the high-performance, low-latency interconnection network space, we will be left with nothing but the smile. Simply put, the last remaining non-commodity component of HPC clusters – the high-performance interconnect – is in grave danger, due to the global economic downturn and the price-performance pressures of commodity Ethernet.

    Remember Coaxial Cables and BNC?

    The old geezers among us, including yours truly, remember when Ethernet meant coaxial cables and CSMA/CD at 10 megabits/second. (Here's a shout out to my colleague Chuck Thacker, one of the co-inventors of Ethernet.) Since then, we have seen many generations of advances in transport fabrics, switches and routers and speeds.

    One gigabit Ethernet is now a commodity item, contained on almost all PC motherboards; 10 gigabit Ethernet is the high bandwidth standard; and 40 gigabit and 100 gigabit Ethernet standards are under development by the IEEE. InfiniBand is one of the last major competitors to 10 gigabit Ethernet, and even the InfiniBand vendors are increasingly offering Ethernet compatibility (i.e., so-called converged fabrics).

    Cloud Data Centers and HPC

    I invite you to ponder your response to the following question. What is the difference between a megascale data center and a petascale computing system? Increasingly, the answer is "not much, but a few elements of the software stack."

    Of course, this need not be the answer, but it likely will be unless we change our research and development strategies and also our procurement models. Ironically, megascale data centers could also benefit from lower latency, higher bandwidth communication fabrics with scalable bisection bandwidth. It is time we bring these two together, for the HPC community could leverage the economics of megascale data centers and their needs.

    March 12, 2009

    Scientific Clouds: Blowin’ in the Wind

    N.B. I recently responded to some questions from John West (HPCWire) regarding the Microsoft Cloud Computing Futures (CCF) research project. In that Q&A, I also commented on the relevance of cloud computing to computational science. What follows is an augmented subset of the Q&A, but focused on just the relevance of clouds to technical computing.

    Cirrus, stratus, altostratus, cumulus: they are the scientific names of the common clouds. They drift across the sky, reflecting the changing wind and weather. A new front is blowing into computational science, and cloud computing will soon advance scientific and engineering discovery.

    That is one of the reasons I am excited about cloud services. I believe we are at a technological transition point, just as profound as that engendered by the "attack of the killer micros." This is true whether you are enamored of Microsoft's Azure, Amazon's AWS or Google's Apps.

    Learning from History

    Let's step back and gain some perspective, starting with the "Branscomb pyramid" ("From Desktop to TeraFlop: Exploiting the U.S. Lead in High Performance Computing," Lewis Branscomb et al.) and the diverse types of technical computing that now exist. We tend to focus on the apex of the computing pyramid, now exemplified by petascale systems intended to support only a handful of applications and users. However, most science is conducted at lower levels of the pyramid, using desktop computers, laboratory clusters and university-scale computing infrastructure. By analogy, it's exciting to talk about international hypersonic transport, but most people care more about efficiently and painlessly commuting to work each day.

    Over the past decade, we successfully leveraged commodity hardware to create large clusters. What was nearly heretical when we first deployed clusters at NCSA is now commonplace. However, this scaling has not been without cost. Cluster programming remains difficult at scale, we have turned a generation of researchers into parallel programmers and system administrators, institutions are struggling with rising demands for machine space, power and cooling, and duplicated facilities make sharing expertise and data difficult. We are heavily focused on computing at a time when data analysis now dominates much of science and engineering. Like many of you, I contributed to this state of affairs, and I feel some responsibility to help us find a new path.

    Hype and Reality

    Let's separate the hype from the reality. Clouds won't magically restore your 401(k) retirement fund, cure halitosis or even help you drop twenty pounds before your upcoming high school reunion. Like all new technologies, however, they challenge some conventional computing wisdom and change some of our operating assumptions.

    Personal computing was a non sequitur when computers filled rooms. Internet search was nonsensical when there were only a handful of research web sites. Social networking services depend on inexpensive, ubiquitous broadband access and mobile devices. Hosted cloud services and software are now possible given the confluence of inexpensive but powerful multicore processors, high-capacity storage, broadband networks and the economies of scale that consolidation in cloud data centers makes possible.

    Five Reasons Clouds Matter

    First, the economies of scale from mega-data center provisioning mean capital and operating costs can be lower. When buying servers in 500,000 unit lots and designing facilities at scale, the provider does have some financial and technical leverage. This would allow universities, laboratories and federal agencies to devote a larger fraction of precious funding to research rather than infrastructure. Remember Dan's computational science corollary: it's the science, not the infrastructure, that matters.

    Second, truly large-scale data analysis, particularly multidisciplinary data fusion, can become routine. In the scientific community, we have worked hard to build workflows for access to distributed data. Consolidation and co-location enable new approaches, and we tend to forget that cloud data centers have many, many petabytes of disk storage. It really is possible to query multiple petabytes of data using intuitive, easy-to-use desktop tools – the business community does it all the time. Jim Gray proved the power of database tools on several scientific data analysis projects, including the SkyServer.

    Third, clouds facilitate time-space tradeoffs. It is just as cost-effective to run 100,000 individual jobs simultaneously as sequentially (e.g., for a parameter study), something that our batch queuing strategies strongly discourage on high-performance computing systems. In geek terms, the area is the same, whether one uses tall, skinny rectangles (lots of resources for a short interval) or short, wide rectangles (a few resources for a long interval). The elasticity of clouds, a consequence of multiplexing many users and workloads, means that the resources are always available without waiting.
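    The rectangle argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `run_plan` function, the pay-per-node-hour billing model, one node per job and the hourly rate are all hypothetical, not any provider's actual pricing.

```python
import math


def run_plan(jobs: int, hours_per_job: float, nodes: int, rate_per_node_hour: float):
    """Return (wall_clock_hours, total_cost) for independent one-node jobs.

    Assumes billing only for node-hours actually used (hypothetical model).
    """
    waves = math.ceil(jobs / nodes)        # batches needed to run every job
    wall_clock_hours = waves * hours_per_job
    node_hours = jobs * hours_per_job      # the "area" of the rectangle
    return wall_clock_hours, node_hours * rate_per_node_hour


# Tall, skinny rectangle: every job at once on 100,000 nodes.
tall = run_plan(jobs=100_000, hours_per_job=1.0, nodes=100_000, rate_per_node_hour=0.10)

# Short, wide rectangle: the same jobs on only 100 nodes.
wide = run_plan(jobs=100_000, hours_per_job=1.0, nodes=100, rate_per_node_hour=0.10)
```

    Under these assumptions the two plans cost the same, since the node-hours are identical, but the first finishes in one hour and the second in a thousand.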

    Fourth, I also believe that the cloud will offer HPC services at increasing scale, beginning with that typified by today's laboratory clusters. This is already happening, and as I/O device virtualization continues to improve, communication latencies will decrease and tightly coupled computations will be attractive at ever larger scale.

    Finally, clouds can provide seamless extension of familiar desktop tools and interfaces, allowing computing and analysis to scale within the same environment that researchers use every day. We can leverage consumer software, just as we have leveraged consumer hardware. There is no reason our computational science tools and our "every day" tools need be different.

    Shameless Microsoft Plug

    To this point, I've written about clouds in a vendor-neutral sense. With a nod to the company name now on my paycheck, if you haven't already, I encourage you to take a look at Windows Azure and its cloud computing and storage services. In addition to rich web services, there is both open source and Visual Studio programming support. In a future post, I will describe an Azure example application for computational science. Here endeth the marketing pitch.

    Insight, Not Infrastructure

    As Richard Hamming famously noted, "The purpose of computing is insight, not numbers." Dan's computational science corollary is simple, "The purpose of computational science infrastructure is scientific discovery, not big iron bragging rights." It's time to focus on what matters and embrace the future. Our graduate students and post-docs will thank us.

    February 24, 2009

    Seeding The Clouds Redux

    Update: We've been getting a bit of press, including stories in the NY Times and EE Times. Jim Larus also shows off our low power test vehicle in a brief video.