CSC Builds NASA's Supercomputer Super Fast and Super Cheap

NASA's Columbia supercomputer
 
Client: NASA

Challenge: To build NASA a world-class supercomputer in a fraction of the time and at a fraction of the cost of previous top computers, while allowing processing for existing NASA projects to continue.

Solution: CSC developed a revolutionary procedure using replication and parallel work to incrementally expand the system and speed installation.

Results: CSC helped NASA complete the world's fastest production supercomputer in a record 120 days, supporting the New Horizons mission to Pluto, the safe return of the shuttle to space, and other missions.

The National Aeronautics and Space Administration wanted CSC to help it build the world's fastest computer. The catch: NASA wanted it done in a tenth of the time and at a tenth of the cost.

In 2004, NASA knew that to accomplish upcoming missions it would need 10 times the computing power it had at the start of the year. Only a top supercomputer could perform the complex data modeling required to safely return the space shuttle to flight, build a Mars rover, track climate patterns and make long-range weather forecasts, all at the same time.

Meanwhile, the United States had fallen behind in the high-performance computing race. Japan's Earth Simulator had been the world's fastest supercomputer for three years. Intel and SGI wanted to overtake Earth Simulator and offered NASA substantial discounts on hardware, including more than 10,000 processors. In return, they wanted the supercomputer to produce competitive results in time for the University of Mannheim's November publication of its list of the world's 500 fastest supercomputers. NASA would have to top the list within four months.

To make it happen, NASA initiated Project Columbia, named in honor of the fallen space shuttle, and partnered with CSC and Advanced Management Technology, Inc. (AMTI). In July, the CSC-AMTI team started work at the NASA Advanced Supercomputing (NAS) facility at Ames Research Center in California. The team designed Columbia's facilities and architecture, then integrated the system and gave it operational support. "All I asked was a miracle a day," says Walt Brooks, then chief of NASA's supercomputer staff. "120 days later, we did it."


Setting a blistering pace
NASA faced challenges unsurpassed in supercomputing history, says Christopher Buchanan, CSC's site manager for high-end computing. Budget constraints had forced a 40 percent reduction in NAS staff and restricted Project Columbia's funding to a fraction of the cost of similar projects. While the offer by Intel and SGI made Project Columbia simpler financially, it made it more difficult technically. Top-tier supercomputers typically require years of planning and development, and several hundred million dollars. Earth Simulator, for example, took at least five years and $500 million to complete. Building Columbia in 120 days with less money and fewer people looked like an impossible task.

The immense workload compelled the team to develop a complex and demanding schedule. "Our team worked in shifts around the clock, each member putting in 60-80 hours a week," says Buchanan. "We came up with a new process, coordinating groups to work simultaneously, performing repeatable operations almost in assembly line fashion." With each of the 20 systems CSC installed, the team was able to refine and speed the process. At its peak, the team installed nine systems and moved them into production in 10 days.


A work in progress
Throughout the build, NASA's existing computing projects could not be disrupted. The usual procedure of installing all systems at once before using the computer could not be followed. The team had to maintain the production environment, minimizing disruption to users, while incrementally adding new systems. This had never been done before.

The NASA team had previously set a record by installing a single 512-processor SGI system in 30 days. It now had to install, power, cool and network 20 of these systems, in five days apiece. The team did this by replicating an existing system that had been well tested, secured and heavily used. It divided into parallel work groups, each with its own process outline and responsibilities for tasks such as facilities construction, hardware installation, networks, security and software.
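
To put the required pace in perspective, the short Python sketch below compares the team's earlier record with the Columbia target. It is only an illustration: the names are hypothetical, and it assumes the 20 systems are installed strictly one after another, which the article does not state.

```python
# Figures quoted in the article.
SYSTEMS = 20                    # 512-processor SGI systems to install
PREVIOUS_DAYS_PER_SYSTEM = 30   # earlier record for a single system
TARGET_DAYS_PER_SYSTEM = 5      # pace required for Columbia
DEADLINE_DAYS = 120             # overall project window


def serial_days(systems: int, days_per_system: int) -> int:
    """Total days if systems are installed one after another (assumption)."""
    return systems * days_per_system


old_pace = serial_days(SYSTEMS, PREVIOUS_DAYS_PER_SYSTEM)
new_pace = serial_days(SYSTEMS, TARGET_DAYS_PER_SYSTEM)
speedup = PREVIOUS_DAYS_PER_SYSTEM / TARGET_DAYS_PER_SYSTEM

print(f"At the previous record pace: {old_pace} days")
print(f"At five days apiece: {new_pace} days (deadline: {DEADLINE_DAYS} days)")
print(f"Required speedup per system: {speedup:.0f}x")
```

Under this simplification, the earlier pace would have needed 600 days, while five days apiece comes to 100 days and fits inside the 120-day window.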

Project Columbia included new SGI hardware called the Altix 3700 Bx2, which interconnected 2,048 of Columbia's 10,240 processors. This provided direct access to all 2,048 processors simultaneously, allowing scientists to run larger jobs. The Altix system had not been available for testing before delivery to NASA, and the tight schedule didn't allow for thorough testing.
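
The processor counts quoted above can be cross-checked with a few lines of Python. This is a sketch using only the article's figures; the variable names are illustrative, and expressing the interconnected group as a number of 512-processor systems is an arithmetic equivalence, not a description of the actual hardware layout.

```python
# Figures quoted in the article.
SYSTEMS = 20
PROCESSORS_PER_SYSTEM = 512
TOTAL_PROCESSORS = 10_240
INTERCONNECTED = 2_048  # processors linked by the new Altix hardware

# 20 systems of 512 processors each should account for the full machine.
computed_total = SYSTEMS * PROCESSORS_PER_SYSTEM

# Size of the interconnected group, in 512-processor units and as a share.
equivalent_systems = INTERCONNECTED // PROCESSORS_PER_SYSTEM
linked_share = INTERCONNECTED / TOTAL_PROCESSORS

print(f"Computed total: {computed_total} (quoted: {TOTAL_PROCESSORS})")
print(f"Interconnected group: {equivalent_systems} systems' worth")
print(f"Share of all processors: {linked_share:.0%}")
```

The totals agree: 20 systems of 512 processors give 10,240, and the 2,048 interconnected processors amount to one-fifth of the machine.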

As work progressed, the team faced many unforeseen challenges: the elevator to move hardware to the second floor broke; the forklift to move hardware downstairs broke; defective cables had to be replaced; and a major water main break caused a loss of cooling power. Then some of the hardware shipments were delayed by several weeks. Nine of the processor systems didn't arrive until 10 days before the system installation deadline, and they all had to be put into production simultaneously.


On time, on budget, on the mark
Project Columbia revolutionized the cost and time required to build a top-level supercomputer. Benchmarked in October at 51.9 trillion calculations per second, Columbia became the world's fastest production supercomputer and the first U.S. computer to eclipse the record-holding Earth Simulator since it debuted in Japan in 2001. With a $50 million budget, Columbia cost one-tenth as much as the Japanese system. It also took one-tenth of the time to launch.
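
The one-tenth comparisons can be checked against the figures quoted earlier for Earth Simulator. The Python sketch below is a rough check only: it assumes 365-day years and treats the article's "at least five years and $500 million" as lower bounds, and the names are illustrative.

```python
# Figures quoted in the article.
COLUMBIA_DAYS = 120
COLUMBIA_COST_USD = 50_000_000
EARTH_SIMULATOR_YEARS = 5               # "at least five years"
EARTH_SIMULATOR_COST_USD = 500_000_000  # "at least ... $500 million"

DAYS_PER_YEAR = 365  # assumption used only for the unit conversion

cost_ratio = COLUMBIA_COST_USD / EARTH_SIMULATOR_COST_USD
time_ratio = COLUMBIA_DAYS / (EARTH_SIMULATOR_YEARS * DAYS_PER_YEAR)

print(f"Cost ratio (Columbia / Earth Simulator): {cost_ratio:.2f}")
print(f"Time ratio (Columbia / Earth Simulator): {time_ratio:.3f}")
```

Under these assumptions the cost ratio is one-tenth and the time ratio comes out somewhat below one-tenth, so "one-tenth of the time" reads as a conservative round figure.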

According to Buchanan, Columbia’s power makes the NASA Ames Research Center one of the world's premier scientific resources. Columbia is accessible to scientists via the Internet from anywhere in the world. It is used by NASA missions, government agencies, major universities and industries. Current applications on Columbia include: a hurricane weather simulator; applications supporting the New Horizons mission to Pluto; Return-to-Flight, a suite of programs supporting the return of NASA’s space shuttle fleet to operation; ECCO, an ocean-modeling application; and a suite of exploration systems applications used to design the Mars Crew Exploration Vehicle. These and other applications are providing scientific results at a scale and in a time frame that was not possible before Columbia.

For their work on the supercomputer, four members of CSC's 45-person Columbia staff were selected to receive the 2006 Award for Technical Excellence, CSC’s highest honor for IT innovation. The team's recipients included Davin Chan, Ed Hook, George Myers and Herbert Yeung. "I had the difficult task to choose four to represent the larger group," said Buchanan. "Even though these four people were chosen for the award, it took the entire CSC staff working together with AMTI and NASA to make this happen."


© Copyright 2008 Computer Sciences Corporation