

When Complex Systems Fail

Robert Plant & Neil F. Johnson 

“Failure is not an option.”

Gene Kranz, Lead Flight Director, Apollo 13

The Deepwater Horizon rig explosion, the Challenger space shuttle, the power outage that affected the northeastern United States in 2003, Apollo 13, and Air France Flight 447 are all disasters that share a common basis: each was a complex system that, in the terminology of complexity theory, became disordered, or, more simply put, failed.

In our work with organizations, we are frequently asked about complex systems and risk management. We begin by defining for executives what complexity science is: the study of the phenomena that emerge from a collection of interacting objects. These objects can be companies, managers, or customers, typically competing for a limited resource. For example, a group of users vying for bandwidth on a network with limited capacity is a complex system. Under certain conditions, such as periods of extremely high bandwidth demand or the failure of a critical number of routers, that system can produce a network-wide disruption. Exactly such an incident occurred at the healthcare provider CareGroup, forcing medical staff to revert to manual procedures to keep the facility operational. Alternatively, the resource may be stocks. In May 2010, the market was placed under extreme stress when a trader reportedly mistyped the size of a sell order for P&G stock. The result was a global interaction between automated program-trading platforms that drove the Dow down 600 points in seven minutes.

To identify and understand their exposure to complex systems, executives must first understand the systems' basic composition. Most complex systems have the following ingredients:

First, the system contains a collection of many interacting objects. These can, for example, be stock traders and buy/sell orders in a financial system, or computer- and human-generated inputs into a power plant’s control system.

Second, the system has a feedback mechanism. In the case of financial markets, these include institutional mechanisms such as the Securities and Exchange Commission in the USA and the Financial Services Authority in the UK. For power-generation systems, the Nuclear Regulatory Commission actively assesses operational data and human performance around the system’s objects (the people, the technology, and other resources) as part of its State-of-the-Art Reactor Consequence Analyses (SOARCA) program, refining performance parameters in the next cycle of events and creating an adaptive operational system.

Third, the system must be ‘open’ to influence by external events rather than closed. Monitoring disorderly events and situations occurring at other companies with similar environments and complex systems is therefore a necessary and integral part of the feedback cycle.

Having helped executives define the complex system within which the company operates, the next step is to clarify the system’s behavioral properties.

First, executives need to be aware that the resulting systems often appear to be “alive.” This is because the systems have evolved in highly non-trivial and often complicated ways, driven by an ecology of agents who interact and adapt under the influence of feedback. For example, financial analysts often talk as though the stock market were a living, breathing object, assigning it words such as pessimistic or bearish, and confident or bullish.

Second, executives need to understand that these systems often exhibit “emergent phenomena”: events that generally come as a surprise because they cannot be predicted even from perfect knowledge of the system’s constituent parts. Emergent behavior is a direct consequence of individual objects acting within a system context, and the results can be extreme. This was the case with the O-ring failure on the Challenger.

This fundamentally means that anything can happen and, if you wait long enough, generally will. For example, all stock and property markets will eventually show some kind of crash. Such phenomena are generally unexpected in terms of when they will arise; studying an individual element of the system, say the rating of a single bond by a rating agency, gives observers no predictive power over the behavior of a bond portfolio as a whole.

As such, complex systems should be regarded as more than the sum of their parts. This often surprises executives, since emergent phenomena can arise in the absence of any central controller or “invisible hand,” which is what makes such events so difficult to predict and to plan for in advance.
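The gap between component-level knowledge and system-level outcomes can be illustrated with a toy simulation. The model below is hypothetical and not drawn from any of the incidents above: statistically identical components follow identical local rules, and when one fails its load is spread over the survivors. The same shock then produces wildly different system-wide outcomes depending only on how the interactions unfold.

```python
import random

def cascade_size(capacities, shock):
    """Count how many components ultimately fail when every component
    carries the same initial load and the load of each failed component
    is spread evenly over the survivors."""
    n = len(capacities)
    alive = set(range(n))
    while True:
        load = n * shock / len(alive)          # survivors share the total load
        failed = [i for i in alive if load > capacities[i]]
        if not failed:                          # cascade has halted
            return n - len(alive)
        alive -= set(failed)
        if not alive:                           # total collapse
            return n

random.seed(42)
sizes = [cascade_size([random.uniform(0.8, 4.0) for _ in range(100)], shock=1.0)
         for _ in range(500)]

# Same shock, same rules, statistically identical parts -- yet the
# outcomes span a wide range of cascade sizes.
print(min(sizes), max(sizes))
```

Knowing each component’s capacity distribution perfectly does not tell you which runs end in a handful of failures and which spiral much further; that spread is the emergent, interaction-driven part.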

Finally, we stress that executives should not watch emergent phenomena dissipate by themselves and be lulled into a false sense of security. Complex systems show a complicated mix of ordered and emergent phenomena. Just as traffic jams can appear to drivers to arise for no reason, so can problems in industrial systems; this is the very nature of the complex interactions on which the system is based. Executives need to be aware that dissipation does not mean the behavior will not return, potentially in another form.

Having gained clarity on what a complex system is, executives and managers then need to understand and manage the risks associated with their own systems.

Our research in financial market complexity[1] indicates that the major mistake companies make is to assess the risk of failure of a particular process or object in isolation. This is typically done in periods of tranquility, when things are going well. However, as we have shown, complex systems are more than the sum of their parts, and failure can arise from a variety of stress combinations. Executives should therefore assess risks across a portfolio of system interactions. This is important because risks in complex systems combine in non-linear ways that do not follow a normal distribution. The result can be sudden, extreme, and infrequent failures, as trouble in one dimension spills over and couples with failures in other modules. Such events surprise companies because they are considered atypical, and little if any planning has been undertaken for them.

We therefore advise executives to first look back at prior crises and problematic situations for insights into the behavioral characteristics of their systems. We emphasize that preparation for storms should not be done only in calm weather; during and directly after stressful events is a more productive time for future planning.

Second, we stress that companies should develop portfolios of risk-management solutions, which we term an adaptive, context-dependent response strategy: hedging the risk of the risk, rather than holding a single plan of action for a single event scenario. Emergent events in complex systems cannot usually be resolved by blindly applying a risk model developed under ‘normal’ operating conditions.

Finally, invest in corporate response training that forces participants to be creative, adaptive, and innovative. This style of training is more akin to that given to Special Forces such as the SAS and the Navy SEALs, who prepare for the unexpected within a climate of change and resource constraints, than to the basic drill-based training of regular forces.

Perhaps incidents such as BP’s will lead others to understand that their operating environment is not just a set of individual components or assets but a complex, interconnected environment that needs systemic consideration to prevent future failures, and that now is the time to put those plans in place.

[1] Johnson, N.F., Jefferies, P., and Hui, P.M., Financial Market Complexity: What Physics Can Tell Us About Market Behaviour, Oxford University Press, 2003.
