Monday, September 8, 2008

5 9s is not easy; it can be done, but you have to know what you're doing

Continuing the coverage of production outages that make the news, it seems the London Stock Exchange had a serious outage today, most likely due to the high trading volume resulting from the US Government takeover of Fannie Mae and Freddie Mac.

We at IBM often put together high-volume environments that have high availability requirements. To do this, and do it well, you have to build in resiliency, provide enough capacity, and plan disaster recovery for business continuity. I've worked with a number of household-name companies worldwide on providing just such capabilities. It is disastrous when an e-commerce retailer is unable to sell product during the holiday shopping season. Things can get particularly bad for financial institutions when money is on the line.
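To put a number on the "5 9s" in the title, here is a minimal sketch of the arithmetic. It assumes a 365-day year and counts every second of downtime against the target; the function name is mine, purely for illustration, and real service-level agreements often define the measurement window and planned maintenance differently.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def allowed_downtime_seconds(nines: int) -> float:
    """Maximum yearly downtime for an availability target of `nines` nines."""
    unavailability = 10 ** -nines  # e.g. 5 nines (99.999%) -> 0.00001
    return SECONDS_PER_YEAR * unavailability


# Compare the yearly downtime budget for two through five nines.
for n in range(2, 6):
    minutes = allowed_downtime_seconds(n) / 60
    print(f"{n} nines: about {minutes:.1f} minutes of downtime per year")
```

At five nines the budget works out to roughly 5.3 minutes a year, which is why a single multi-hour outage is so damaging against that kind of target.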

I can't say what they did or didn't do, but it certainly seems like people want answers. Reassurances are going to be hard to come by until they do a lot more groundwork.

London Stock Exchange crippled by system outage | Reuters
The exchange would not say whether volume was the issue and declined to give details on what had caused the problem. But angry customers were demanding an explanation.

"We want answers as to how this happened in the first place and reassurances that it will not happen again," said Angus Rigby, chief executive of brokerage TD Waterhouse.

