Delta Identifies Cause Of Computer Network Crash


Delta Air Lines said Tuesday that an internal problem, not the loss of power from a local utility, was to blame for the disruption that caused hundreds of flight cancellations and delayed tens of thousands of travelers Monday.

Delta initially pointed to a loss of electricity from Georgia Power, which serves its Atlanta hub, when its worldwide computer network crashed at 2:30 a.m. Monday. Georgia Power questioned that premise, saying that no other customers in the area of Delta’s headquarters had lost power.

“It has nothing to do with Georgia Power,” Delta spokeswoman Sarah Lora said after the airline further investigated the outage, which resulted in the cancellation of 300 additional flights Tuesday.

What happened, in fact, was that the Delta computers that control everything from reservations and boarding passes to crew and gate assignments toppled like a row of dominoes when one thing went wrong early Monday.

A power control module malfunctioned, causing a surge that cut off power to the airline’s main computer network. When that happens, the system is designed to switch in the blink of an eye to backup computer systems. On Monday, however, some of the backups did not kick in.

“When this happened, critical systems and network equipment didn’t switch over to backups,” Delta Chief Operating Officer Gil West said in a statement. “Other systems did. And now we’re seeing instability in these systems.”

West said that getting the computer systems, planes and air crews back into service was complicating Delta’s operations for a second day Tuesday.

“We’re seeing slowness in a system that airport customer service agents use to process check-ins, conduct boarding and dispatch aircraft,” West said. “Delta agents today are using the original interface we designed for this system while we continue with our resetting efforts.”

Delta spokeswoman Susan Hayes elaborated: “We are actually fully operational, it’s just that we’re not able to use that newer interface.”

The Delta computer meltdown was at least the third occasion in little more than a year when airline computer malfunctions have caused flights to be canceled. Southwest Airlines passengers were delayed last month by computer problems, and United Airlines experienced similar woes last summer.

Aviation analysts on Monday said such problems often result from the multiple mergers of the past 15 years, which have left airlines relying on a patchwork of computer networks inherited from the carriers they absorbed.

Hayes, however, said that Delta’s merger with Northwest Airlines, finalized in 2010, did not result in a hybrid system.

“The passenger service system that we’re currently using is original to Delta,” she said.

Part of the problem that caused Delta cancellations and delays Tuesday was akin to what happens when airports are closed after a massive snowstorm or hurricane. Had things gone according to plan Monday, planes would have reached certain destinations and been in position to fly from those airports early Tuesday.

But many of those planes were out of place for the flights they were intended to make Tuesday morning. Flight crews also were in the wrong places.

“Flight crews – pilots and flight attendants – carry out their responsibilities in a rotation, a schedule of flights and hotel reservations, that is usually three or four days,” West said. “As cancellations occur, rotations become invalid. Multiplied across tens of thousands of pilots and flight attendants and thousands of scheduled flights, rebuilding rotations is a time-consuming process.”

(c) 2016, The Washington Post · Ashley Halsey III 

{Matzav.com}

1 COMMENT

  1. I’ve been in the computer business for more than 52 years. I’ve worked on Wall Street, for wireless carriers, for health insurance companies. At all of them, the concept of downtime because of hardware malfunctions does not exist, and at all of them there are at least yearly tests (usually over New Year’s weekend) where we actually power down an entire data center and make sure stuff keeps working via a backup location. At all of them we also test same-location backup scenarios, so that things don’t need to transfer to another location in the case of a malfunction in critical hardware. It seems airlines in general do not spend enough on business continuity.
