Officials in some localities didn’t take Sacca and his coworkers seriously; others had nothing to offer. One township, freaked out by the scruffy guys asking about high-voltage lines, worried that they were terrorists and called the Department of Homeland Security. Others busted a gut to help out. Sacca remembers that the people in Coos Bay flew them around in a helicopter to survey potential sites. “They figured the community needed it and they were going to take a chance that we were legit.”
It was in a Columbia River town seventy miles east of Portland, near the Washington border, that Sacca hit pay dirt. “It was an ugly site,” he says. Rough land. Rocks jutting from barren ground. Big power lines. The site was on the bank of the river, but not the pretty part—the view wasn’t beautiful Mount Hood but semidesert terrain. Nearby was the abandoned headquarters of a wood-chipping plant. But Sacca had retrained his eye for a different kind of beauty, and to him the adjoining power lines were as alluring as a majestic vista. As was the state of the town—sufficiently rundown and desperate to woo a massive building.
The town was The Dalles, population 12,000, described by one local reporter as “a hamburger pit stop between Portland and Pendleton.” Lewis and Clark had camped there in October 1805. French fur traders had later made it a trading post. (Dalles is French for “flagstones.”) For a few years, The Dalles was the end of the Oregon Trail. But by the early twenty-first century, the town seemed at the end of the trail in a metaphoric sense. The smoke-belching industries that had propped up its economy were gone forever. “The town was beat up, the downtown kind of abandoned,” says Sacca. “It was a big aluminum town and they lost the smelter, and that was it.”
To Sacca’s astonishment, the town government had laid a fiber-optic ring around the town. “It was visionary—this little town with no tax revenues had figured out that if you want to transform an economy from manufacturing to information, you’ve got to pull fiber,” says Sacca. The town had already won status as an enterprise zone, meaning that all sorts of enticements and tax breaks were available to any business willing to locate there.
Of course, Google had more stringent demands that would eventually require gubernatorial approval. As a potentially large employer with leverage, it wanted tax relief and other concessions. The key player in The Dalles was a Wasco County judge with a day job as a cherry farmer. He looked like a younger Wilford Brimley, complete with mustache and an appealing country drawl. The judge understood Design LLC’s goals, and once he heard that the project would bring in three hundred people to build its plant and leave the town with fifty to a hundred long-term jobs—and boost every local business, where the newcomers might spend their paychecks—he was committed. Even though the jobs in question would generally be on the level of technicians, as opposed to pampered Google engineers, they would pay around $60,000, double the average county income.
The Dalles had one more little perk: a local airport with a runway to accommodate some of the planes in the Google air force. “That wasn’t a major factor but an interesting one, since Eric is such an aviation enthusiast,” says Sacca. “It was fun to say, ‘Hey, Eric, there’s an airstrip nearby.’”
The local congressman set up a conference call and mediated between Google and the Bonneville Power Administration. Then Google worked with the state to get fifteen years of tax relief, only the second time in Oregon history that a company had received a break of that length.
On February 16, 2005, the commissioners approved the land sale to Design LLC. The cost of the land was $1.87 million for just over thirty acres, with an option to buy three more tracts, including those where the Mountain Fir Chip Mill had once stood. At a certain point, the Googlers swore the townsfolk to secrecy and revealed the entity behind the mysterious Design LLC. “They were stunned,” says Sacca. “At that time Google still really had a beautiful, angelic reputation.” Even after the local paper outed the benefactor as Google, the company still insisted that local people not make reference to that fact and had local officials sign a confidentiality agreement. When they talked about it, they used the code name Project 02. When visitors came asking, the locals clammed up like bay mussels; New York Times reporter John Markoff traveled to the site in 2006 and was stonewalled by the city manager. An official in a nearby town, free to make sour-grapes jokes at the lucky municipality across the river, said, “It’s a little bit like He-Who-Must-Not-Be-Named in Harry Potter.” Indeed, as local reporters found out when Google finally allowed them a glimpse of the compound (only the cafeteria and the public area—not the vast area where the servers resided), outside the security fence was a sign that read Voldemort Industries.
To show goodwill, Google spread a few dollars around The Dalles, including a donation toward a new Lewis and Clark Museum. The company also gave a few thousand dollars’ worth of AdWords credit to local nonprofits. More significantly, Google delivered on jobs. And best of all, Google’s data center put the township on the map.
After its construction, the building dominated the landscape, a massive shell the size of two football fields with a pair of four-story cooling towers. According to Sacca, the shell cost about $50 million, but its contents were valued at close to a billion dollars. There was more than 200,000 square feet of space for the servers and infrastructure and another 18,000-square-foot building for cooling towers. In addition, there was a 20,000-square-foot administration building that included a Google-esque cafeteria and a dormitory-style building almost as large for transient workers. The exterior gave no clue about its contents.
Not until 2009 did Google tip its hand publicly, during its first Efficient Data Center Summit. A Google engineer described a setup in one of its buildings that seemed to be part of the cluster of structures at The Dalles: forty-five containers, each holding 1,160 servers, arranged in a two-story configuration. The cold aisles in those buildings ran at 81 degrees. When news of the event hit the web, people weren’t sure whether it was a joke, since the summit took place on April Fool’s Day. Urs Hölzle added to the confusion by making an actual April Fool’s joke: Google, he said, was going to convert old oil tankers into petroleum-cooled seagoing data centers. It was tough to tease out what was true. Some people didn’t believe the claims about the cold aisles, and others asked to take a tour of the fictitious U.S.S. Sergey.
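A quick bit of arithmetic shows the scale those figures imply for a single building (a rough back-of-the-envelope, assuming all forty-five containers sat in one structure):

\[
45 \text{ containers} \times 1{,}160 \text{ servers per container} = 52{,}200 \text{ servers}
\]

A couple of buildings at that density would be consistent with the estimate in the next paragraph that a full center held well over 100,000 servers.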
Google never revealed how many servers it could pack into a center like the one at The Dalles, but it surely was more than 100,000. The company could handle such huge numbers because the system required very few human beings to keep it running. Google’s data centers didn’t have big control rooms like the one in The Simpsons where guys in short-sleeved white shirts sat in front of big displays and flipped switches. “When you have very large numbers of computers in multiple data centers, it’s probably risky to attempt to manage this with human beings at the control panel,” Google’s onetime engineering head Wayne Rosing once explained. Instead everything was monitored by a series of software scripts. The computer scientists remained in Mountain View, while a skeleton crew of local technicians was on site. When a metric deviated from the norm, the software checked out what was happening in other data centers to compare. At some point, someone in Mountain View would be alerted. “We had written enough scripts and basic infrastructure so that the data centers all over the world could be run from Mountain View,” says Jim Reese. “It didn’t matter whether you have 500 or 500,000 computers—you could run them remotely. We designed it for scale. We need physical hands only to get computers in place and replace the hard drives and motherboards when they fail. Even at the point where we had 50,000 computers, there were maybe six of us maintaining them.”
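A minimal sketch of the kind of script-driven monitoring Rosing and Reese describe, with hypothetical site names, metric values, thresholds, and an alerting hook; this is an illustration of the idea, not Google's actual tooling:

```python
# Illustrative only: a toy version of script-based fleet monitoring,
# not Google's actual infrastructure.
from statistics import mean

# Hypothetical per-data-center readings for one metric, e.g. query latency in ms.
METRIC_READINGS = {
    "the-dalles": 212.0,
    "atlanta": 118.0,
    "saint-ghislain": 121.0,
    "berkeley-county": 115.0,
}

DEVIATION_THRESHOLD = 1.5  # alert if a site runs 50% above its peers (made-up cutoff)

def check_site(site: str, readings: dict[str, float]) -> None:
    """Compare one site's metric against the other data centers and alert if it deviates."""
    peers = [value for name, value in readings.items() if name != site]
    baseline = mean(peers)
    if readings[site] > baseline * DEVIATION_THRESHOLD:
        # In practice this step would page someone in Mountain View.
        print(f"ALERT: {site} at {readings[site]:.0f} vs. peer baseline of {baseline:.0f}")

for site in METRIC_READINGS:
    check_site(site, METRIC_READINGS)
# Only 'the-dalles' trips the alert: 212 vs. a peer baseline of about 118.
```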
Even before construction at The Dalles was finished, Google gathered teams to scout out new locations. Their business cards identified them as being from Zenzu Consulting. Google had set up a website under that company name to deflect attention. It didn’t take a genius to suspect a Google hand, but the Zenzu people wouldn’t even wink at the implication. “We made it very clear that our client did not want anyone to guess who they were, and that if any of this stuff went out, our client would essentially walk away,” says Cathy Gordon, a Google business development employee who’d joined the data center group as a lark, a chance to do something different.
Gordon’s first deal was in Atlanta. The site had previously been developed for a large trucking facility, but after the pad for the building had been laid, the deal had fallen apart. It seemed like a good fit for Google, since the grading had been done, roads had been built, and permits had been issued. Gordon focused on getting state revenue bonds passed. It was a typical Google assignment, requiring a person with no experience in an esoteric field to not only keep up with experts but essentially outsmart them. Gordon remembers sitting in rooms with lawyers droning about codicils and amendments to these bizarre documents and thinking, I don’t know what you’re talking about, mister. In lots of ways the job was stressful, requiring her to travel three weeks out of four, staying in economy motels, feeling almost like she was an undercover agent. But she figured it out, and beat the experts. Google would add to its southern presence with huge data centers in Goose Creek, South Carolina, and Lenoir, North Carolina.
For a long time, Google had feared a doomsday scenario where a calamity at one of its locations could bring down a Google product or even all of Google. Its billions of dollars’ worth of investments and its failure-tolerant infrastructure now made that scenario unlikely. “We could lose an entire data center, and everything would just spill over to the other data centers and we’d still have excess capacity,” says Jim Reese.
Google also made it a priority to build centers overseas. Not long after she found the location in Atlanta, Cathy Gordon went to Europe, where Google wanted to build a giant data center similar to the ones in the United States. Google had studied the laws and business practices of every country and narrowed the field to a few that might be able to provide the power and water required, as well as a friendly governmental hand. Some of the proposed locations were predictable—Switzerland, Belgium, France—but a couple were not.
One of those was Latvia, which Gordon had never visited before. The Google team flew into a ramshackle little airport and met the economic development committee, a cadre of what seemed to be stereotypical Soviet bureaucrats, only now they were Latvian bureaucrats. Their hosts escorted them to the potential data center site, an abandoned Soviet minibus factory. The building was cavernous and gloomy. In the center of the building was a giant pit, filled with some acidic liquid, and Gordon couldn’t help but wonder whether any bodies were quietly decomposing in the stew. The group went to the area where the power facilities were located, and it looked to Gordon like they were on an old horror movie set, a Gulag Archipelago version of Dr. Frankenstein’s lab. One of the hosts leaned over and spoke in a confidential whisper, heavy with a Slavic accent. “Don’t get too near those things,” he said. “Basically we don’t know if they could kill you.”
“We eventually ended up doing a deal in Belgium,” says Gordon.
The center in Saint-Ghislain, seventy-five kilometers from Brussels, was a test bed for some new ideas about data center energy conservation. Even though Google had always attempted to minimize its power consumption, its centers gobbled up many, many megawatts, a humiliating flouting of Page and Brin’s vision of a cleaner planet. A study funded by the chip company AMD (and vetted by other firms including Intel, HP, IBM, and Dell) estimated that in 2005, data centers accounted for 1.2 percent of all power consumption in the United States. More than twenty states used less power than the nation’s data centers. That was double the amount of power that data centers had used five years earlier, and the rate of growth was increasing. Since no one had more data centers than Google, the company was one of the world’s greediest power hogs. “We use a fair amount of energy,” says Bill Weihl, a computer science PhD who came to Google in 2005 to become its conservation czar. “Some people say ‘massive amounts.’ I try to avoid ‘massive.’ But it’s a lot.”
He would not put a number on it. “The fact that we’re not transparent about it causes us embarrassment,” he says, explaining that “competitive reasons” justify the reticence. By not knowing what Google is spending, Microsoft CEO Steve Ballmer, for instance, will have no target to aim at when apportioning his own cost estimates for infrastructure. “If I’m Ballmer, I’m probably going to pick a number that’s too high, in which case it bankrupts Microsoft—and that’s good for Google,” says Weihl. “Or he’ll pick a number that’s too low, in which case it can’t really compete. And that’s good for Google.”
Among the most power-intensive components of the operation are the huge chillers that refrigerate water to keep the temperature in the building no higher than around 80 degrees Fahrenheit. Google augmented these chillers with much more efficient systems that take in fresh air when outside temperatures are cool. The data center in Saint-Ghislain, completed in 2008, actually eliminated chillers entirely. The average summer temperature in Brussels is around 70 degrees, but during the occasional scorcher, Google would shift the information load to other data centers. “Most data centers run chillers a lot, but we use free cooling, for the most part,” says Eric Teetzel, who works on Google infrastructure.
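A simplified sketch of that free-cooling decision; the temperature cutoff and function name are illustrative assumptions, not Google's actual control system:

```python
# Illustrative only: a toy version of the free-cooling decision, with a made-up
# outside-air threshold; this is not Google's actual control logic.
FREE_COOLING_CUTOFF_F = 75   # hypothetical cutoff below which outside air suffices

def cooling_action(outside_temp_f: float) -> str:
    """Decide how a chiller-less site like Saint-Ghislain might stay cool."""
    if outside_temp_f <= FREE_COOLING_CUTOFF_F:
        # Cool enough outside: use outside air and evaporatively cooled water.
        return "free-cool"
    # The occasional scorcher: shift work to other data centers instead of chilling.
    return "shift-load"

print(cooling_action(68))  # a typical Brussels summer day -> 'free-cool'
print(cooling_action(95))  # a heat wave -> 'shift-load'
```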
The Belgium center was the first where Google didn’t need access to relatively clean water; it had discovered ways to use more readily available tainted water. In Belgium the water is drawn from a polluted canal. “We literally build treatment plants and run the water through our evaporative cooling towers,” says Teetzel. “That’s the beauty of energy efficiency—it will save you money.”
The operation in Saint-Ghislain was a milestone for another reason: it was the first data center that Google publicly acknowledged upon completion. In June 2009 King Albert II made an official visit. He wasn’t allowed in to see the servers.
Organizing Google’s hundreds of thousands of computers was one of those “hard problems” that make PhDs want to work at Google. It was definitely the lure for Luiz Barroso. He had been yet another colleague of Jeff Dean and Sanjay Ghemawat at Digital Equipment Corporation’s Western Research Lab. Born in Brazil, Barroso had a PhD in computer architecture and had worked at DEC on multicore processors, which put the “brains” of several computers onto a single chip. (Radical then, this technique later became the dominant design of virtually all PCs.) When Dean urged him to come to Google in 2001, he worried that as a “hardware guy” he’d be out of place in a situation where he’d be working on software system designs. But because of his hardware expertise, a couple of years after he arrived, Urs Hölzle asked him to help design Google’s data centers.
Barroso realized that in order to meet the demands of search, handle the constant experiments the company ran, and accommodate the rapidly growing number of projects at Google other than search, the company had to basically reinvent the computer. “Suddenly, you have a program that doesn’t run on anything smaller than a thousand machines,” he says. “So you can’t look at that as a thousand machines, but instead look at those thousand machines as a single computer. I look at a data center as a computer.”
Indeed, a 2009 publication by Barroso and Urs Hölzle that described Google’s approach (without giving away too many of the family jewels) was called The Datacenter as a Computer. It explained the advent of “warehouse-scale machines” and the Google philosophy of tolerating frequent failure of components. It outlined the organizational hierarchy of its machines, each server situated in a rack of eighty, with about thirty of those racks in a cluster. The document explained that Google works like one machine, an omnivorous collector of information, a hyperencyclopedic vault of human knowledge, an unerring auctioneer, an eerily skillful student of languages, behavior, and desires.
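Those approximate figures work out to a few thousand machines per cluster (a rough count, since the publication gives only round numbers):

\[
80 \text{ servers per rack} \times 30 \text{ racks per cluster} \approx 2{,}400 \text{ servers per cluster}
\]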
What it didn’t say was what outside observers had already concluded: that by perfecting its software, owning its own fiber, and innovating in conservation techniques, Google was able to run its computers spending only a third of what its competitors paid. “Our true advantage was actually the fact that we had this massive parallelized redundant computer network, probably more than anyone in the world, including governments,” says Jim Reese. “And we realized that maybe it’s not in our best interests to let our competitors know.”
One reason Sanjay Ghemawat loved Google was that when researchers were looking to solve problems a year out, Larry Page demanded that they work on problems that might be a decade out, or maybe even a problem that would come up only in a science fiction novel. Page’s point of view seemed to be, If you are ridiculously premature, how can people catch up to you?
Spurred by Page’s ambition, Ghemawat and Jeff Dean came up with a dramatic improvement in handling massive amounts of information spread over multiple data centers. It split tasks among many machines so that a programmer performing an operation on a large collection of data could spread the work across them without worrying about how to apportion it. The program worked in two steps: first by mapping the system (figuring out how the information was spread out and duplicated in various locations, basically an indexing process) and then by reducing the information to the transformed data requested. The key was that the programmers could control a massive number of machines, swapping and sharing their contents (a cluster’s worth or more) as if they were a single desktop computer. Ghemawat and Dean called their project MapReduce.
“The engineers only have to think about the data,” says Christophe Bisciglia, a Google engineer who became an evangelist for cloud computing. “The system takes care of the parallelization. You don’t have to think about what machine the data is stored on or how to synchronize what happens when the machine fails or if there’s a bad record or any of that. I just think about the data and how I want to explore or transform the data, so I write code for that, and the system takes care of everything else.” What’s more, with MapReduce Google could easily build out its system—adding thousands more machines, allowing for much more storage and much faster results—without having to change the original code.
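To make the pattern concrete, here is a minimal, single-machine sketch of map-then-reduce applied to a toy word count. The names map_fn, reduce_fn, and mapreduce are illustrative, not Google's API, and in the real system the framework runs the map and reduce phases in parallel across thousands of machines while handling data placement and failures itself:

```python
# Toy, single-machine illustration of the map-then-reduce pattern.
# Not Google's implementation; names and structure are for illustration only.
from collections import defaultdict
from typing import Iterable

def map_fn(document: str) -> Iterable[tuple[str, int]]:
    """Emit an intermediate (key, value) pair for every word in a document."""
    for word in document.split():
        yield (word.lower(), 1)

def reduce_fn(key: str, values: list[int]) -> tuple[str, int]:
    """Combine all values emitted for one key into a single result."""
    return (key, sum(values))

def mapreduce(documents: list[str]) -> dict[str, int]:
    # Map phase: in production this would run in parallel across many machines.
    intermediate: dict[str, list[int]] = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            intermediate[key].append(value)  # the "shuffle": group values by key
    # Reduce phase: also parallelized by key in the real system.
    return dict(reduce_fn(key, values) for key, values in intermediate.items())

print(mapreduce(["the data center as a computer", "the datacenter is the computer"]))
# {'the': 3, 'data': 1, 'center': 1, 'as': 1, 'a': 1, 'computer': 2, 'datacenter': 1, 'is': 1}
```

Even in the toy version the point of the design is visible: the programmer writes only the two small functions that describe the data transformation, and everything about distributing and scaling the work lives in the framework.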