Archive for May 31st, 2008

NEW YORK (MarketWatch) - The controversial president of failed mortgage lender Countrywide Financial Corp. will retire after the company is acquired by Bank of America Corp. later this year, Bank of America said Wednesday in a company statement. David Sambol had been tapped to helm the combined mortgage operations of both companies, but regulatory and shareholder outrage over […] For more visit Source:www.investment-blog.net

Comments No Comments »

LONDON (MarketWatch) - Citigroup, WestLB, HBOS, JP Morgan Chase & Co., and UBS have reported significantly lower borrowing costs for the London interbank offered rate, or Libor, than another market measure suggests they should be, The Wall Street Journal reported Thursday. That’s led to Libor acting as if the banking system was in better shape […] For more visit Source:www.investment-blog.net

SAN FRANCISCO–The inner workings of Google just became a little less secret.

The search colossus has shed only occasional light on its data center operations, but on Wednesday, Google fellow Jeff Dean turned a spotlight on some parts of the operation. Speaking to an overflowing crowd at the Google I/O conference here, Dean managed simultaneously to demystify Google a bit while also showing just how exotic the company’s infrastructure really is.

Google fellow Jeff Dean

(Credit: Stephen Shankland/CNET News.com)

On the one hand, Google uses more-or-less ordinary servers. Processors, hard drives, memory–you know the drill.

On the other hand, Dean seemingly thinks clusters of 1,800 servers are pretty routine, if not exactly ho-hum. And the software the company runs on top of that hardware, enabling a sub-half-second response to an ordinary Google search query that involves 700 to 1,000 servers, is another matter altogether.

Google doesn’t reveal exactly how many servers it has, but I’d estimate it’s easily in the hundreds of thousands. It puts 40 servers in each rack, Dean said, and by one reckoning, Google has 36 data centers across the globe. With 150 racks per data center, that would mean Google has more than 200,000 servers, and I’d guess it’s far beyond that and growing every day.
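The estimate above can be checked with a few lines of arithmetic. This is only a sketch of the article's own back-of-the-envelope reckoning: the 40 servers per rack come from Dean, while the 36 data centers and 150 racks per data center are the assumptions stated in the paragraph, not confirmed figures.

```python
# Back-of-the-envelope server count using the figures quoted in the article.
SERVERS_PER_RACK = 40         # stated by Dean
RACKS_PER_DATA_CENTER = 150   # the article's assumption
DATA_CENTERS = 36             # "by one reckoning"


def estimate_servers(servers_per_rack, racks_per_data_center, data_centers):
    """Multiply out the three factors to get a total server estimate."""
    return servers_per_rack * racks_per_data_center * data_centers


total_servers = estimate_servers(
    SERVERS_PER_RACK, RACKS_PER_DATA_CENTER, DATA_CENTERS
)
print(f"Estimated servers: {total_servers:,}")
```

Under those assumptions the product is 216,000, which is consistent with "more than 200,000 servers."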

Regardless of the true numbers, it’s fascinating what Google has accomplished, in part by largely ignoring much of the conventional computing industry. Where even huge data centers such as the New York Stock Exchange or airline reservation systems use a lot of mainstream servers and software, Google largely builds its own technology.

I’m sure a number of server companies are sour about it, but Google clearly believes its technological destiny is best left in its own hands. Co-founder Larry Page encourages a “healthy disrespect for the impossible” at Google, according to Marissa Mayer, vice president of search products and user experience, in a speech Thursday.

To operate on Google’s scale requires the company to treat each machine as expendable. Server makers pride themselves on their high-end machines’ ability to withstand failures, but Google likes to invest its money in fault-tolerant software.

“Our view is it’s better to have twice as much hardware that’s not as reliable than half as much that’s more reliable,” Dean said. “You have to provide reliability on a software level. If you’re running 10,000 machines, something is going to die each day.”

Breaking in is hard to do
Bringing a new cluster online shows just how fallible hardware is, Dean said.

In each cluster’s first year, it’s typical that 1,000 individual machine failures will occur; thousands of hard drive failures will occur; one power distribution unit will fail, bringing down 500 to 1,000 machines for about 6 hours; 20 racks will fail, each time causing 40 to 80 machines to vanish from the network; 5 racks will “go wonky,” with half their network packets missing in action; and the cluster will have to be rewired once, affecting 5 percent of the machines at any given moment over a 2-day span, Dean stated. And there’s about a 50 percent chance that the cluster will overheat, taking down most of the servers in less than 5 minutes and taking 1 to 2 days to recover.

A look at a custom-made Google rack with 40 servers from a modern data center. Infrastructure guru Jeff Dean showed the snapshot at the Google I/O conference.

(Credit: Stephen Shankland-CNET News.com/Jeff Dean-Google)

While Google uses ordinary hardware components for its servers, it doesn’t use conventional packaging. Google required Intel to create custom circuit boards. And, Dean said, the company currently puts a case around each 40-server rack, an in-house design, rather than using the conventional case around each server.

The company has a small number of server configurations, some with a lot of hard drives and some with few, Dean said. And there are some differences at the larger scale, too: “We have heterogeneity across different data centers but not within data centers,” he said.

As to the servers themselves, Google likes multicore chips, those with many processing engines on each slice of silicon. Many software companies, accustomed to better performance from ever-faster chip clock speeds, are struggling to adapt to the multicore approach, but it suits Google just fine. The company already had to adapt its technology to an architecture that spanned thousands of computers, so it has already made the jump to parallelism.

“We really, really like multicore machines,” Dean stated. “To us, multicore machines look like lots of little machines with really good interconnects. They’re relatively simple for us to use.”

Even though Google requires a fast response for search and other services, its parallelism can produce that even if a single sequence of instructions, called a thread, is relatively slow. That’s music to the ears of processor designers focusing on multicore and multithreaded models.

“Single-thread performance doesn’t matter to us really at all,” Dean said. “We have lots of parallelizable problems.”

The secret sauce
So how does Google get around all these earthly hardware concerns? With software–and this is where you might consider dusting off your computer science degree.

A Google data center, circa 2000. Note the fan on the floor to cool servers.

(Credit: Stephen Shankland-CNET News.com/Jeff Dean-Google)

Dean described three core elements of Google’s software: GFS (the Google File System), BigTable, and the MapReduce algorithm. And even though Google contributes to many of the open-source software projects that helped the company get its start, these packages remain proprietary except in general terms.

GFS, at the lowest level of the three, stores data across many servers and runs on almost all machines, Dean stated. Some incarnations of GFS are file systems “many petabytes in size”–a petabyte being a million gigabytes. There are more than 200 clusters running GFS, and many of these clusters consist of thousands of machines.

GFS stores each chunk of data, typically 64MB in size, on at least three machines called chunkservers; master servers are responsible for backing up data to a new area if a chunkserver failure occurs. “Machine failures are handled entirely by the GFS system, at least at the storage level,” Dean said.
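The replication idea described above can be illustrated with a toy model. This is a minimal sketch, not Google's actual design or API: the `ToyMaster` class, its method names, and the server names are all hypothetical, and a real system would copy data over the network rather than update a dictionary.

```python
import random

REPLICATION_FACTOR = 3  # the article's "at least three machines"


class ToyMaster:
    """Hypothetical stand-in for a master server that tracks chunk replicas."""

    def __init__(self, chunkservers, seed=0):
        self.live = set(chunkservers)
        self.locations = {}  # chunk id -> set of chunkserver names
        self.rng = random.Random(seed)

    def store_chunk(self, chunk_id):
        # Place the chunk on REPLICATION_FACTOR distinct live chunkservers.
        chosen = self.rng.sample(sorted(self.live), REPLICATION_FACTOR)
        self.locations[chunk_id] = set(chosen)

    def handle_failure(self, server):
        # Forget the dead server, then give every chunk it held a new replica
        # on a live server that does not already hold that chunk.
        self.live.discard(server)
        for holders in self.locations.values():
            if server in holders:
                holders.discard(server)
                candidates = sorted(self.live - holders)
                if candidates:
                    holders.add(self.rng.choice(candidates))


master = ToyMaster([f"cs{i}" for i in range(6)])
for chunk in ("chunk-a", "chunk-b", "chunk-c"):
    master.store_chunk(chunk)

master.handle_failure("cs0")
for chunk, holders in master.locations.items():
    print(chunk, sorted(holders))
```

After the simulated failure, every chunk is back to three replicas and none of them lives on the failed machine, which is the behavior Dean attributes to the storage layer.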

To provide some structure to all that data, Google uses BigTable. Commercial databases from companies such as Oracle and IBM don’t cut the mustard here. For one thing, they don’t operate at the scale Google demands, and if they did, they’d be too costly, Dean said.

BigTable, which Google began designing in 2004, is used in more than 70 Google projects, including Google Maps, Google Earth, Blogger, Google Print, Orkut, and the core search index. The largest BigTable instance manages about 6 petabytes of data spread across thousands of machines, Dean stated.

MapReduce, the first version of which Google wrote in 2003, gives the company a way to actually make something useful of its data. For example, MapReduce can find how many times a particular word appears in Google’s search index; a list of the Web pages on which a word appears; and the list of all Web sites that link to a particular Web site.

With MapReduce, Google can build an index that shows which Web pages all have the terms “new,” “york,” and “restaurants”–relatively quickly. “You need to be able to run across thousands of machines in order for it to finish in a reasonable amount of time,” Dean said.
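The word-count and index examples above follow the general map-and-reduce pattern, which can be sketched on a single machine. This is an illustrative toy, not Google's proprietary MapReduce: the `map_reduce` helper, the mapper and reducer names, and the three sample pages are all made up for the example, and the real system spreads the work across thousands of machines.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run a mapper over every record, group values by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # the "shuffle" step: group by key
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical miniature "Web": URL -> page text.
pages = {
    "a.example": "new york restaurants",
    "b.example": "new restaurants in york",
    "c.example": "york news",
}


def count_mapper(item):
    _url, text = item
    for word in text.split():
        yield word, 1


def count_reducer(_word, ones):
    return sum(ones)


def index_mapper(item):
    url, text = item
    for word in set(text.split()):
        yield word, url


def index_reducer(_word, urls):
    return sorted(urls)


# How many times each word appears, and which pages each word appears on.
word_counts = map_reduce(pages.items(), count_mapper, count_reducer)
inverted_index = map_reduce(pages.items(), index_mapper, index_reducer)

# Pages that contain all three search terms.
terms = ("new", "york", "restaurants")
matching_pages = sorted(
    set.intersection(*(set(inverted_index[t]) for t in terms))
)
print(word_counts["york"], inverted_index["new"], matching_pages)
```

The same two-function shape covers both jobs; only the mapper and reducer change, which is part of why the pattern parallelizes well.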

The MapReduce software is seeing increasing use within Google. It ran 29,000 jobs in August 2004 and 2.2 million in September 2007. Over that period, the average time to finish a job has dropped from 634 seconds to 395 seconds, while the output of MapReduce tasks has risen from 193 terabytes to 14,018 terabytes, Dean said.

On any given day, Google runs about 100,000 MapReduce jobs; each occupies about 400 servers and takes about 5 to 10 minutes to complete, Dean said.

That’s a basis for some interesting math. Assuming the servers do nothing but MapReduce, that each server works on only one job at a time, and that they work around the clock, that means MapReduce occupies about 139,000 servers if the jobs take 5 minutes each. For 7.5-minute jobs, the number increases to 208,000 servers; if the jobs take 10 minutes, it’s 278,000 servers.
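The arithmetic behind those figures is easy to reproduce. The sketch below uses only the numbers quoted above (100,000 jobs a day, 400 servers per job) plus the stated simplifying assumptions that each server runs one job at a time, around the clock; the function name is just a label for this example.

```python
JOBS_PER_DAY = 100_000
SERVERS_PER_JOB = 400
MINUTES_PER_DAY = 24 * 60


def servers_occupied(job_minutes):
    """Server-minutes needed per day divided by minutes each server offers."""
    server_minutes_needed = JOBS_PER_DAY * SERVERS_PER_JOB * job_minutes
    return server_minutes_needed / MINUTES_PER_DAY


estimates = {minutes: servers_occupied(minutes) for minutes in (5, 7.5, 10)}
for minutes, servers in estimates.items():
    print(f"{minutes}-minute jobs: about {servers:,.0f} servers")
```

Rounded to the nearest thousand, the results match the 139,000, 208,000, and 278,000 figures in the text.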

My calculations could be off base, but even qualitatively, that’s enough computing horsepower to make the mind boggle.

Fault-tolerant software
MapReduce, like GFS, is explicitly designed to sidestep server problems.

“When a machine fails, the master knows what task that machine was assigned and will direct the other machines to take up the map task,” Dean stated. “You can end up losing 100 map tasks, but can have 100 machines pick up those tasks.”
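The recovery behavior Dean describes can be sketched as a small bookkeeping routine. This is a simplified illustration under assumed names (`reassign_failed_tasks`, the machine and task labels), not the actual MapReduce master: it only shows the idea of tracking which tasks a machine held and handing them to other machines.

```python
def reassign_failed_tasks(assignments, failed_machine, idle_machines):
    """Return new assignments with the failed machine's tasks redistributed.

    assignments: dict mapping machine name -> list of task ids.
    idle_machines: machines available to pick up work; if empty, the
    surviving busy machines share the orphaned tasks instead.
    """
    updated = {
        machine: list(tasks)
        for machine, tasks in assignments.items()
        if machine != failed_machine
    }
    orphaned = list(assignments.get(failed_machine, []))
    targets = list(idle_machines) or list(updated)
    if orphaned and not targets:
        raise RuntimeError("no machines left to run the orphaned tasks")

    # Hand the orphaned tasks out round-robin across the target machines.
    for position, task in enumerate(orphaned):
        machine = targets[position % len(targets)]
        updated.setdefault(machine, []).append(task)
    return updated


before = {"m1": ["map-0", "map-1"], "m2": ["map-2"], "m3": []}
after = reassign_failed_tasks(before, "m1", idle_machines=["m3", "m4"])
print(after)
```

Because the master knows exactly which tasks were lost, no work is dropped: the two tasks from the failed machine simply reappear on other machines.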

MapReduce’s reliability was severely tested once during a maintenance operation on one cluster with 1,800 servers. Workers unplugged groups of 80 machines at a time, during which the other 1,720 machines would pick up the slack. “It ran a little slowly, but it all completed,” Dean said.

And in a 2004 presentation, Dean said, one system withstood a failure of 1,600 servers in a 1,800-unit cluster.

Next-generation data center to-do list
So all is going swimmingly at Google, right? Perhaps, but the company isn’t satisfied and has a long to-do list.

Most companies are trying to figure out how to move jobs gracefully from one server to another, but Google is a few orders of magnitude above that challenge. It wants to be able to move jobs from one data center to another–automatically, at that.

“We want our next-generation infrastructure to be a system that runs across a large fraction of our machines rather than separate instances,” Dean said.

Right now some large file systems have different names–GFS/Oregon and GFS/Atlanta, for example–but they’re meant to be copies of each other. “We want a single namespace,” he said.

These are tough challenges indeed considering Google’s scale. No doubt many smaller companies look enviously upon them.

For more visit Source: [webware]

Are Twitter’s performance problems due to flimsy engineering or the choice of Ruby on Rails to build the application?

In the Twitter developer blog on Thursday, an engineer said that Ruby on Rails still rocks as a Web development platform. The service’s woes are due more to a creaky architecture, he said.

Twitter’s performance problems have brought heaps of scorn from the busy Web 2.0 digerati. That has prompted the company to disclose more technical details, as in today’s Q&A-format blog post.

Many people have questioned whether choosing to write the application using Ruby on Rails was a smart move and whether Twitter should shift to a different Web development technology.

Ruby is a scripting, or dynamic, language, which means that it can be slower than Java or C for some applications. The trade-off is that in general it’s faster to write code with. Rails, meanwhile, is a Web development framework optimized for speed.

Ruby still makes sense for much of what Twitter does–essentially sending messages around the Web–but the company has left the door open to using other languages. The Twitter developer blog states this:

We’ve got a ton of code in Ruby, and we’ll continue to develop in Ruby with Rails for our front-end work for some time. There’s plenty to do in our system that Ruby is a great fit for, and other places where different languages and technologies are a better fit. Our key problems have been primarily architectural and growing our infrastructure to keep up with our growth. Working in Ruby has been, in our experience, a trade-off between developer speed/productivity and VM speed/instrumentation/visibility.

The outages and slow performance are due to “popular” members of Twitter with many followers who “tweet” a lot all at once, according to Twitter. Because of that, the company says it will put some limits on what some users can do, but they should not be noticeable.

We’ve some limits, and we’re adding more. Legitimate users should never notice them, but these new limits should help mitigate the worst case failures and attacks.

For more visit Source: [webware]

Glam Media has always been adamant that it’s not just another ad network, something it reiterated when it announced a revenue-sharing video platform for its member sites this week.

But apparently, some large company really thinks that Glam’s something special. Amid the gossipy deal making of the D6 …

Source [The social]

As rumored earlier, Facebook will indeed be announcing an open-source project for its developer platform. The social network released a statement Tuesday to clarify the gossip–while still not offering much in the way of detail.

“We’re working on an open-source initiative that is meant to help application developers better …

Source [The social]

Sam Newman is an analyst with the Energy and Resources Team at Rocky Mountain Institute.

For years, Hollywood has sold us images of futuristic houses filled with “smart” appliances. Think of the coffee machine that can make as many drinks as a Starbucks barista, the refrigerator that tells you when you’re out of milk, or the clothes dryer that can talk.

Real attempts at such devices have long been constrained to trade shows and demonstration homes. These devices have been portrayed as artificially intelligent, user-friendly, and capable of two-way communication with us and other appliances.

Today’s smart appliances have a new benefit that goes far beyond novelty and will finally bring them to the shelves of Home Depot: energy efficiency. Their adoption will be part of a response to the urgent need to modernize the ways that we buy and consume electricity.

Appliances and electricity use

More than a third of electricity generated in the United States is used in households. Air conditioners use 16% of that electricity; refrigerators use another 14%. Hot water heaters and other home appliances, including clothes dryers and dishwashers, consume even more: 29%. Using existing technology, each of these machines can be made “smarter,” lessening our environmental impact.

Each time your air conditioner kicks on during a hot summer afternoon, it contributes to a bigger problem. When many air conditioners turn on at the same time, they force up the demand for power from the local utility, putting stress on the system.

To meet this demand, utilities rely on peak generating plants, which might only be used on the hottest days of the year. Power from these plants is carbon-intensive and expensive to generate.

The benefit of smart appliances

Smart appliances will respond to price signals from the grid to lessen these peak loads. Under a “real-time pricing” system, energy used during peak hours will cost more than energy used at night, when demand is low. This price structure allows residential energy users to optimize their energy usage habits to save energy and reduce emissions.

Imagine setting your air conditioner to save money by remaining off during weekday afternoon hours when power is pricey. It would turn on in the late afternoon, so the home would still be cool when you returned from work.

Similarly, a clothes dryer could be programmed to an “economy” setting which would turn its heating element on and off to take advantage of the cheapest power rates. The dry cycle would take a bit more time, but it would allow the household to respond to variations in electricity supply.

For instance, if a cloud passed in front of the sun, reducing the output of a solar power array, the price of power would increase, signaling the dryer to turn off until the cloud moved away.
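The “economy” setting described above amounts to a simple rule: heat only while the price signal is at or below a threshold. The sketch below is purely illustrative. The function name, the price values, and the price cap are hypothetical, and a real appliance would read prices from the utility rather than from a list.

```python
def run_economy_cycle(prices, price_cap, heat_intervals_needed):
    """Decide, interval by interval, whether the dryer's heater is on.

    prices: price signal for each time interval (hypothetical cents per kWh).
    price_cap: the heater stays off whenever the price exceeds this value.
    heat_intervals_needed: how many heated intervals the load requires.
    Returns (schedule, remaining), where schedule is a list of booleans and
    remaining is the number of heated intervals still owed at the end.
    """
    schedule = []
    remaining = heat_intervals_needed
    for price in prices:
        heater_on = remaining > 0 and price <= price_cap
        schedule.append(heater_on)
        if heater_on:
            remaining -= 1
    return schedule, remaining


# A cloud passes during intervals 3 and 4, pushing the price above the cap.
prices = [8, 8, 15, 15, 8, 8]
schedule, remaining = run_economy_cycle(
    prices, price_cap=10, heat_intervals_needed=4
)
print(schedule, remaining)
```

In this example the load needs four heated intervals but takes six to finish, because the heater pauses for the two expensive intervals. That is the trade-off the article describes: a slightly longer cycle in exchange for cheaper power.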

Studies have shown that consumers conserve energy when provided with real-time feedback and improved control systems via a computer or appliance smart meters. Just as vehicle owners drive more efficiently when provided with real-time fuel economy data, residences with smart meters use less electricity.

In a recent study in Washington state, overall energy usage fell 10% following the implementation of smart water heaters and dryers. If used nationwide, these technologies could save $70 billion and avoid the construction of 30 new coal-fired power plants over 20 years.

Smart appliances in the real world

The next step toward getting smart appliances in each of our homes is taking these pilot programs to scale.

In March, Xcel Energy, one of the United States’ largest utilities, chose Boulder, Colorado, for an innovative smart city project.

Residences will be fitted with smart appliances, and the utility infrastructure will be upgraded to enable real-time demand response and power pricing. Predicted benefits include lower peak demand on summer afternoons, reduced overall carbon emissions, and improved system reliability.

Appliances that can talk back to you are unlikely outside of Hollywood fantasies any time soon. But smart appliances that save money and reduce carbon emissions are not science fiction. These technologies offer a market-based approach to energy efficiency that will help reduce your environmental impact.

For more visit Source:[green.yahoo]

The Huffington Post, the news aggregation and commentary site founded by political pundit Arianna Huffington and former AOL exec Ken Lerer, is finally jumping on the post-Al-Gore bandwagon.

The company announced Wednesday that it will be launching HuffPost Green, a site division specific to “green” content through a content partnership …

Source [The social]
