BA Outage, Power Issues or Infrastructure Woes?

Brian Baglow

29 May 2017, 06.25pm

British Airways IT Failure Caused Online Outage

British Airways suffered a major IT failure on Saturday 27 May, which brought much of the airline to a halt, causing huge – and ongoing – delays for thousands of passengers worldwide.

The IT failure was widespread, affecting multiple operational elements of BA’s service including flight planning, logistics, bookings, check-in and customer service. This left the company unable to process the huge numbers of travellers arriving at airports over the bank holiday weekend, and left those travellers with little or no information on the situation beyond the company’s Twitter account.

While BA has stated that the IT failure was caused by a ‘power supply issue’, no specific details have been released. The fact that the failure was not immediately dealt with by backup systems, but instead brought the company’s entire operational capability to a grinding halt, has many airline and IT experts puzzled.

IT Failure

“No organisation should have a single point of failure,” said Colin Cochrane, director of Scottish infrastructure specialists Cosnadh. “Even the most basic device has two power supplies.”

Multiple reports in the media ask whether BA’s decision to outsource much of its IT support to India was a significant factor in the failure. There is also speculation that a lack of training in switching to backup systems and a shortage of experienced staff may have compounded the problem.

The fact that a power failure, one of the most easily anticipated problems for an IT centre, caused such chaos has left many asking how a company so dependent upon access to accurate real-time data allowed its digital infrastructure to collapse so completely.

Business Continuity

Business continuity processes are practised by millions of companies around the world, especially those that handle sensitive data or are entirely reliant upon secure and robust access to online services.

“The use of multiple data centres, multi-vendor power suppliers, geographically diverse sites, UPS batteries, generators, these are standard practices for any business, let alone one which handles such critical data,” says Cochrane.

“Many organisations utilise high availability services, or ‘automatic failover’ to meet and minimise the recovery point and return to operations objectives. Key to this is application dependency, you need to know what parts of the system depend upon other services, how they connect and what order to bring them online. This takes time to test, time-stamp and certify the results.”
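
The point about application dependency can be illustrated with a minimal sketch. The service names and the dependency map below are hypothetical, chosen only for illustration, and do not describe BA's actual systems. Assuming each service lists what must already be running before it can start, a topological sort gives one valid order in which to bring everything back online:

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical service names, for illustration only. Each service maps to
# the set of services that must already be running before it can start.
DEPENDENCIES = {
    "power": set(),
    "network": {"power"},
    "database": {"power", "network"},
    "booking": {"database"},
    "check_in": {"database", "booking"},
    "customer_service": {"booking"},
}


def startup_order(dependencies):
    """Return one valid order in which to bring services online."""
    try:
        # static_order() yields every service after all of its prerequisites.
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # A cycle means no safe order exists; report the services involved.
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc


if __name__ == "__main__":
    for step, service in enumerate(startup_order(DEPENDENCIES), start=1):
        print(f"{step}. {service}")
```

A real recovery plan would also record how long each step takes and check that each service is healthy before starting the next, which is the testing and time-stamping Cochrane describes.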

The company’s procedures for dealing with a critical digital services failure may well be where the failure occurred, says Cochrane. “Any business, regardless of whether they’re in-sourcing or out-sourcing should have processes in place for business continuity, which should be documented and tested as part of the audit and insurance governance.”

Not The First Instance

There have been similar failures at other airlines around the world in the recent past. In 2016, US airline Delta lost an estimated $100 million after a fire in a data centre caused similar levels of delays and cancellations. The fact that BA has encountered an almost identical situation may well point to a fundamental failure in the company’s business continuity plans.

BA now faces compensation claims estimated at £100 million as passengers begin the process of claiming for cancelled and delayed flights. In addition, despite recent cost-cutting measures in IT, including the controversial outsourcing decision, the company may have to set aside further funds to address the failure of its IT infrastructure and business resilience strategy.

“It would be money well spent when brand reputation and business revenue has been put at such risk,” says Cochrane.

At the time of writing, the British Airways website is back online and states that the company is ‘closer to full operational capacity’. Passengers are advised to remain at home unless they have a confirmed booking for today and know that their flight is operating.

Tags: BA, British Airways, business continuity, business resilience, data centre, Gatwick, Heathrow, infrastructure, IT systems failure, transport

Brian Baglow

Editor
