By Jeremy Geelan | September 2, 2009 09:15 AM EDT
"We've turned our full attention to helping ensure this kind of event doesn't happen again," wrote Ben Treynor, self-described 'VP Engineering and Site Reliability Czar' at Google, in an official blog post last night explaining yesterday's 100-minute Gmail outage, which Treynor conceded "was a Big Deal, and we're treating it as such."
He then described in detail the events that conspired to bring about the outage:
"Here's what happened: This morning (Pacific Time) we took a small fraction of Gmail's servers offline to perform routine upgrades. This isn't in itself a problem — we do this all the time, and Gmail's web interface runs in many locations and just sends traffic to other locations when one is offline.
However, as we now know, we had slightly underestimated the load which some recent changes (ironically, some designed to improve service availability) placed on the request routers — servers which direct web queries to the appropriate Gmail server for response. At about 12:30 pm Pacific a few of the request routers became overloaded and in effect told the rest of the system "stop sending us traffic, we're too slow!". This transferred the load onto the remaining request routers, causing a few more of them to also become overloaded, and within minutes nearly all of the request routers were overloaded. As a result, people couldn't access Gmail via the web interface because their requests couldn't be routed to a Gmail server. IMAP/POP access and mail processing continued to work normally because these requests don't use the same routers.
The Gmail engineering team was alerted to the failures within seconds (we take monitoring very seriously). After establishing that the core problem was insufficient available capacity, the team brought a LOT of additional request routers online (flexible capacity is one of the advantages of Google's architecture), distributed the traffic across the request routers, and the Gmail web interface came back online."
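The cascade Treynor describes can be illustrated with a toy model. The sketch below is not Google's routing logic. It assumes, purely for illustration, that load is split evenly across healthy routers and that any router whose share exceeds its capacity stops accepting traffic; the function name, capacities, and load figures are all hypothetical.

```python
def simulate_cascade(capacities, total_load):
    """Return the count of healthy routers after each redistribution round.

    Assumption (illustrative only): load is shared evenly, and a router
    whose share exceeds its capacity drops out and sheds its load.
    """
    healthy = list(capacities)
    history = [len(healthy)]
    while healthy:
        share = total_load / len(healthy)
        # Routers that cannot carry their share refuse further traffic.
        survivors = [c for c in healthy if c >= share]
        if len(survivors) == len(healthy):
            break  # stable: every remaining router can carry its share
        healthy = survivors
        history.append(len(healthy))
    return history


# Hypothetical capacities in requests per second.
routers = [100, 105, 110, 120, 130, 140]

print(simulate_cascade(routers, 620))  # [6, 5, 2, 0]
print(simulate_cascade(routers, 590))  # [6]
```

In this model a load only slightly above what the weakest router can carry takes down the whole pool within three rounds, while a slightly lower load leaves every router serving. That mirrors the narrow margin described in the post.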
Treynor's post ended with a detailed explanation of Google's plans to prevent a repeat of the same problem:
"What's next ... Some of the actions are straightforward and are already done — for example, increasing request router capacity well beyond peak demand to provide headroom. Some of the actions are more subtle — for example, we have concluded that request routers don't have sufficient failure isolation (i.e. if there's a problem in one datacenter, it shouldn't affect servers in another datacenter) and do not degrade gracefully (e.g. if many request routers are overloaded simultaneously, they all should just get slower instead of refusing to accept traffic and shifting their load). We'll be hard at work over the next few weeks implementing these and other Gmail reliability improvements."
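The two subtler fixes, graceful degradation and failure isolation, can be sketched in the same toy style. This is a simplified model under stated assumptions, not a description of Google's implementation: slowdown is approximated as the ratio of a router's share to its capacity, and each datacenter's load is assumed to stay within that datacenter. All names and numbers are hypothetical.

```python
def slowdown_factors(capacities, load):
    """Graceful degradation: every router keeps its even share of the load.

    Returns a slowdown factor per router (1.0 means full speed). The
    share/capacity ratio is an illustrative stand-in for real latency.
    """
    if not capacities:
        raise ValueError("at least one router is required")
    share = load / len(capacities)
    return [max(1.0, share / c) for c in capacities]


def isolated_slowdowns(datacenters):
    """Failure isolation: load never crosses datacenter boundaries.

    `datacenters` maps a name to (router capacities, local load).
    """
    return {
        name: slowdown_factors(caps, load)
        for name, (caps, load) in datacenters.items()
    }


# Hypothetical example: dc-a is overloaded, dc-b is not.
result = isolated_slowdowns({
    "dc-a": ([100, 100], 300),
    "dc-b": ([100, 100], 150),
})
print(result)  # {'dc-a': [1.5, 1.5], 'dc-b': [1.0, 1.0]}
```

Here the overloaded datacenter gets slower but keeps serving, and the other datacenter is unaffected, which is the behaviour the post says the request routers lacked.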
He concluded: "Gmail remains more than 99.9% available to all users, and we're committed to keeping events like today's notable for their rarity."
Copyright © 2009 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Jeremy Geelan
Jeremy Geelan is Sr. Vice-President of SYS-CON Media & Events. He is Conference Chair of the worldwide Cloud Expo series, of the Virtualization Conference series, and of the upcoming UlitzerLIVE! event. He's founder of Cloud Computing Journal, Web 2.0 Journal, AJAX & RIA Journal and other leading SYS-CON titles. From 2000 to 2006, as first editorial director and then group publisher of SYS-CON Media, he was responsible for the development of all new titles and i-Technology portals for the firm. Today he has complete responsibility for the content of SYS-CON's entire portfolio of Events. He regularly represents SYS-CON Media & Events at conferences and trade shows, speaking to technology audiences both in North America and overseas. He is executive producer and presenter of "Power Panels with Jeremy Geelan" on SYS-CON.TV.