The following article is a good example of why companies that write software need to hire subject matter experts to test their applications, particularly for what is commonly called "negative testing" in performance speak. This is where we, as subject matter experts on performance testing, purposely cause a negative event to occur. For example, I routinely disable the Network Interface Card (NIC), also known as your Ethernet card, while running load/stress tests just to see how the application environment handles the event. If the application breaks, it fails the test, a defect is written up against the application, and back to development it goes. It is easy enough to disable a NIC in Unix environments, but if worst comes to worst I'll pull the Ethernet cable out of the jack. Crude, but it works just as well.
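To make the NIC test above concrete, here is a minimal sketch of how it could be scripted on a Linux box. It assumes the iproute2 `ip link set dev <interface> down` and `up` commands and root privileges. The interface name, the `runner` callable, and the `health_check` callable are hypothetical placeholders for whatever your own environment provides, so treat this as an illustration and not as the way any particular shop does it.

```python
import time
from typing import Callable, Dict, List


def nic_commands(interface: str) -> Dict[str, List[str]]:
    """Build the Linux iproute2 commands that drop and restore an interface."""
    return {
        "down": ["ip", "link", "set", "dev", interface, "down"],
        "up": ["ip", "link", "set", "dev", interface, "up"],
    }


def run_nic_outage(
    interface: str,
    runner: Callable[[List[str]], object],
    health_check: Callable[[], bool],
    outage_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Disable the NIC while the load test is running, then restore it.

    runner       -- executes a command list (assumption: something like
                    subprocess.check_call, run as root).
    health_check -- hypothetical probe that returns True when the
                    application is healthy again.
    Returns True if the application recovered, False if it broke.
    """
    cmds = nic_commands(interface)
    runner(cmds["down"])          # the negative event
    try:
        sleep(outage_seconds)     # let the load test run into the outage
    finally:
        runner(cmds["up"])        # always bring the interface back
    return health_check()         # False means a defect gets written up
```

In a real run you might pass `subprocess.check_call` as `runner` while the load generator is already driving traffic. Pulling the cable remains the fallback when you can't script it.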
Irish Examiner | Airport radar meltdown due to 'faulty' component
The malfunctioning network card, a component that allows computers to communicate with each other, was also blamed for previous glitches in the Dublin system.
It is unfortunate that the people who put that airport radar system in place didn't conduct negative testing, because a problem like the one that occurred could have been avoided completely.
Likewise, while they are adding more monitoring, I'm dubious that it will help them. If they haven't tested for this negative event, what other negative events haven't they thought of that could occur? For example, some of the others I routinely test for are lost packets in the network, total network failure, network lag, 100%+ CPU, low memory, too many airplanes in the radar, duplicate radar images, and the list goes on and on.
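A few of the network events in that list (lost packets, lag, total failure) can also be simulated in software before anyone touches real hardware. The sketch below is one hedged way to do it. `FaultyLink` and `send_with_retry` are hypothetical names invented for this illustration, and the retry logic stands in for whatever the application under test actually does.

```python
import random
import time
from typing import Callable, Optional


class FaultyLink:
    """Wraps a send function and injects loss, lag, or a total outage."""

    def __init__(
        self,
        send: Callable[[bytes], None],
        loss_rate: float = 0.0,      # probability a packet is silently dropped
        lag_seconds: float = 0.0,    # artificial delay before delivery
        down: bool = False,          # True simulates total network failure
        rng: Optional[random.Random] = None,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._send = send
        self.loss_rate = loss_rate
        self.lag_seconds = lag_seconds
        self.down = down
        self._rng = rng or random.Random()
        self._sleep = sleep

    def send(self, packet: bytes) -> bool:
        """Return True if the packet was delivered, False if it was lost."""
        if self.down:
            raise ConnectionError("link down")
        if self._rng.random() < self.loss_rate:
            return False
        self._sleep(self.lag_seconds)
        self._send(packet)
        return True


def send_with_retry(link: FaultyLink, packet: bytes, attempts: int = 3) -> bool:
    """Stand-in for the application: retry a few times, then give up cleanly."""
    for _ in range(attempts):
        try:
            if link.send(packet):
                return True
        except ConnectionError:
            continue
    return False
```

The point of the test is the same as with the NIC: under each injected fault the application should either get the data through or fail in a controlled way, never hang or crash.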
All they need is for a different negative event to occur and they could (and probably will) suffer another outage. What they need to do is get a subject matter expert to teach them how to test their code.
BTW, notice the sentence about "delays were still being experienced at peak times"? Seems someone hasn't done stress testing either...